The Eulogikon Dataset

The surviving literary works of ancient Greece — Homer through late antiquity — as a free, public-domain dataset. Clean Unicode Greek in PDF, Markdown, and plain text per work, with machine-readable manifests and an llms.txt entry point. The reading and search experience lives across this site; this page is the download and citation hub.

Get the data

GitHub

Clone or browse the full corpus and manifests. Files are flat and predictably named — no build step.

Open repository

Hugging Face

Parquet dataset with works and authors configs, ready for datasets.load_dataset.
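
A minimal sketch of loading both configs with the Hugging Face datasets library. The config names works and authors come from this page; the repository ID below is a placeholder, since the real one is only reachable through the link, and the helper name load_configs and its loader parameter are illustrative rather than part of the dataset.

```python
from typing import Any, Callable, Dict, Optional

# Placeholder repository ID: this page does not state the real one.
REPO_ID = "your-namespace/eulogikon"

# Config names as given on this page.
CONFIGS = ("works", "authors")


def load_configs(
    repo_id: str = REPO_ID,
    loader: Optional[Callable[..., Any]] = None,
) -> Dict[str, Any]:
    """Load every config in CONFIGS and return them keyed by config name.

    `loader` defaults to datasets.load_dataset; it can be swapped for a
    stub so the function is usable without network access.
    """
    if loader is None:
        # Imported lazily so this module imports even without `datasets`.
        from datasets import load_dataset

        loader = load_dataset
    return {name: loader(repo_id, name=name) for name in CONFIGS}
```

With the datasets package installed and the real repository ID substituted, load_configs()["works"] should return whatever load_dataset returns for that config. Split names are not stated here, so inspect the returned object before indexing into it.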

Open dataset

Zenodo

Versioned archival snapshot with a citable DOI. The concept DOI always resolves to the latest release.

Open archive

Kaggle

The corpus and manifests packaged as a Kaggle dataset for notebooks and quick exploration.

Open on Kaggle

Formats

| Surface | Formats | What it is |
| --- | --- | --- |
| Greek texts (per work) | Markdown, plain text, PDF | The full Greek text; Markdown is the richest reading surface. |
| Author pages | Markdown, plain text, PDF | Period, dialect, domain, affiliation, and biography per author. |
| Indexes | JSON, CSV | manifest.json / manifest.csv and compact author/work lookups. |
| AI entry point | llms.txt | Curated pointer to the compact indexes for assistants and crawlers. |
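
As a rough illustration of working with the CSV index, the sketch below parses a manifest and groups its rows by one column. The manifest's actual column names are not listed on this page, so the author and title columns and the inline sample rows are assumptions made for the example; check the header of the real manifest.csv and adjust.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List, TextIO

Row = Dict[str, str]


def read_manifest(handle: TextIO) -> List[Row]:
    """Parse a CSV manifest with a header row into a list of row dicts."""
    return list(csv.DictReader(handle))


def group_by(rows: Iterable[Row], column: str) -> Dict[str, List[Row]]:
    """Group rows by the value found in `column`, preserving row order."""
    grouped: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        grouped[row[column]].append(row)
    return dict(grouped)


# Made-up sample standing in for manifest.csv; the real columns may differ.
SAMPLE = "author,title\nHomer,Iliad\nHomer,Odyssey\nPlato,Republic\n"

rows = read_manifest(io.StringIO(SAMPLE))
by_author = group_by(rows, "author")
print({author: len(works) for author, works in by_author.items()})
```

For the real file, pass a handle opened with open("manifest.csv", newline="", encoding="utf-8") in place of the in-memory sample.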

Cite it

Eulogikon (2026). Eulogikon: Ancient Greek Texts Corpus [Data set]. Zenodo. https://doi.org/10.5281/zenodo.20335421

Machine-readable citation metadata: CITATION.cff · dataset.jsonld.
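
For reference managers that expect BibTeX, the citation above can be turned into an entry with a few lines of Python. This is a sketch only: the field values are copied from the citation on this page, while the citation key eulogikon2026, the choice of the @misc entry type, and the helper name to_bibtex are assumptions. CITATION.cff remains the authoritative metadata.

```python
from typing import Dict


def to_bibtex(key: str, fields: Dict[str, str]) -> str:
    """Render a @misc BibTeX entry from an ordered mapping of fields."""
    lines = [f"@misc{{{key},"]
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)


# Values taken from the citation on this page; the key is hypothetical.
entry = to_bibtex(
    "eulogikon2026",
    {
        # Extra braces keep BibTeX from parsing the name as "Last, First".
        "author": "{Eulogikon}",
        "title": "Eulogikon: Ancient Greek Texts Corpus",
        "year": "2026",
        "publisher": "Zenodo",
        "doi": "10.5281/zenodo.20335421",
        "note": "Data set",
    },
)
print(entry)
```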

Licence

The ancient Greek texts carry the Public Domain Mark 1.0: public domain, no restrictions. Eulogikon's own scaffolding and metadata (manifests, metadata shape) are dedicated to the public domain under CC0 1.0. See LICENSE and NOTICE for the full rights statement. Use for research, teaching, indexing, and AI training is explicitly permitted, with no opt-out and no attribution requirement.