The Eulogikon Dataset
The surviving literary works of ancient Greece — Homer through late antiquity — as a free, public-domain dataset. Clean Unicode Greek in PDF, Markdown, and plain text per work, with machine-readable manifests and an llms.txt entry point. The reading and search experience lives across this site; this page is the download and citation hub.
Get the data
GitHub
Clone or browse the full corpus and manifests. Files are flat and predictably named — no build step.
Hugging Face
Parquet dataset with works and authors configs, ready for datasets.load_dataset.
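As a rough illustration of the `datasets.load_dataset` route, the sketch below wraps the call in a small helper. The helper name `load_eulogikon` is hypothetical, and the repository id is left as a parameter because this page does not state it. Only the two config names (`works`, `authors`) come from the description above.

```python
# Sketch only: load_eulogikon is an illustrative helper, not part of the dataset.
CONFIGS = ("works", "authors")  # config names as described on this page


def load_eulogikon(repo_id: str, config: str = "works"):
    """Load one config of the Parquet dataset from the Hugging Face Hub.

    repo_id is the "<namespace>/<dataset>" id shown on the Hub page;
    it is deliberately not hard-coded here.
    """
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    # Imported lazily so the helper can be defined without the library installed.
    from datasets import load_dataset  # pip install datasets

    return load_dataset(repo_id, config)
```

Calling `load_eulogikon(repo_id, "authors")` with the id copied from the Hub page should return the authors config. Column names are not assumed here; inspect the returned object to see them.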
Zenodo
Versioned archival snapshot with a citable DOI. The concept DOI always resolves to the latest release.
Kaggle
The corpus and manifests packaged as a Kaggle dataset for notebooks and quick exploration.
Formats
| Surface | Formats | What it is |
|---|---|---|
| Greek texts (per work) | Markdown, plain text, PDF | The full Greek text; Markdown is the richest reading surface. |
| Author pages | Markdown, plain text, PDF | Period, dialect, domain, affiliation, and biography per author. |
| Indexes | JSON, CSV | manifest.json / manifest.csv and compact author/work lookups. |
| AI entry point | llms.txt | Curated pointer to the compact indexes for assistants and crawlers. |
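For working with the indexes directly, a minimal sketch of reading `manifest.csv` with the Python standard library might look like the following. The column names in the sample (`work_id`, `author`, `title`) are assumptions made for illustration; check the header row of the real file before relying on any of them.

```python
import csv
import io


def read_manifest(fp):
    """Return (column_names, rows) from an open manifest.csv file object."""
    reader = csv.DictReader(fp)
    rows = list(reader)
    return list(reader.fieldnames or []), rows


def filter_rows(rows, column, value):
    """Keep rows whose `column` equals `value`; a missing column matches nothing."""
    return [row for row in rows if row.get(column) == value]


# Hypothetical sample; the real manifest.csv columns may differ.
SAMPLE = (
    "work_id,author,title\n"
    "w1,Homer,Iliad\n"
    "w2,Homer,Odyssey\n"
    "w3,Hesiod,Theogony\n"
)

columns, rows = read_manifest(io.StringIO(SAMPLE))
homer = filter_rows(rows, "author", "Homer")
print(columns)
print([row["title"] for row in homer])
```

With a local clone, the same helper can be pointed at the real file via `open("manifest.csv", newline="", encoding="utf-8")`. Printing `columns` first shows the actual schema, so nothing is assumed about it.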
Cite it
Machine-readable citation metadata: CITATION.cff · dataset.jsonld.
Licence
The ancient Greek texts carry the Public Domain Mark 1.0: they are in the public domain, with no restrictions. Eulogikon's own scaffolding and metadata (manifests, metadata shape) are dedicated to the public domain under CC0 1.0. See LICENSE and NOTICE for the full rights statement. Use for research, teaching, indexing, and AI training is explicitly permitted, with no opt-out and no attribution requirement.