Latin Contextual String Embeddings (Backward)

This model provides contextual string embeddings for Latin, trained as part of the projects "Embedding the Past" (LOEWE-Exploration, TU Darmstadt) and Burchards Dekret Digital (Langzeitvorhaben, Akademie der Wissenschaften und der Literatur | Mainz).

It is optimized for Medieval and Early Modern legal texts, but it can also be used as a general-purpose embedding for Medieval Latin.

Model Description

  • Architecture: Character-level LSTM (Flair Language Model)
  • Direction: Backward
  • Data Source: ~500,000 paragraphs including:
  • Training Epochs: 30
  • Final Perplexity: 2.7164
  • Final Validation Loss: 0.9993
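The two final figures appear to describe the same quantity on different scales. The short check below assumes the validation loss is the mean cross-entropy per character in nats and that perplexity is its exponential, which is the usual convention for character-level language models. The variable names are illustrative, not part of the model or of Flair.

```python
import math

# Values copied from the model description above
final_validation_loss = 0.9993
reported_perplexity = 2.7164

# Assumption: perplexity = exp(cross-entropy loss in nats per character)
perplexity = math.exp(final_validation_loss)

# The same loss expressed in bits per character
bits_per_char = final_validation_loss / math.log(2)

print(f"perplexity from loss: {perplexity:.4f}")
print(f"bits per character:   {bits_per_char:.2f}")
```

Under that assumption the exponential of the loss matches the reported perplexity to four decimal places, so the model is on average about as uncertain as a uniform choice among roughly 2.7 characters at each step.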

Usage

To use this model in Flair, install the library (pip install flair) and load the model directly from the Hub. Note that backward models are typically used in combination with forward models.

from flair.data import Sentence
from flair.embeddings import FlairEmbeddings, StackedEmbeddings

# Load the backward model from the Hub
backward_embeddings = FlairEmbeddings('mschonhardt/latin-legal-backward')

# Load the matching forward model for a bidirectional setup
forward_embeddings = FlairEmbeddings('mschonhardt/latin-legal-forward')

# Stack both directions for best performance in downstream tasks (NER, POS)
stacked_embeddings = StackedEmbeddings([forward_embeddings, backward_embeddings])

# Embed an example sentence; each token then carries the concatenated vector
sentence = Sentence("Quod infames uocentur qui ex consanguineis nascuntur")
stacked_embeddings.embed(sentence)

# Inspect the embedding attached to each token
for token in sentence:
    print(token.text, token.embedding.shape)
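Why the two directions are usually combined can be illustrated without loading any model. The sketch below follows the general idea of contextual string embeddings: the forward model has read everything up to the end of a token, while the backward model has read the sentence from its end back to the start of the token. It is a simplified illustration in plain Python, not Flair's implementation, and the helper visible_context is hypothetical.

```python
def visible_context(text: str, start: int, end: int) -> dict:
    """Return the characters each direction has read for the token text[start:end].

    Simplified illustration only; the real extraction offsets are handled
    inside Flair and may differ in detail.
    """
    return {
        # Forward model: left context plus the token itself
        "forward": text[:end],
        # Backward model: the token plus its right context, read right to left
        "backward": text[start:][::-1],
    }


text = "Quod infames uocentur qui ex consanguineis nascuntur"
token = "uocentur"
start = text.index(token)
end = start + len(token)

ctx = visible_context(text, start, end)
print("forward sees: ", ctx["forward"])
print("backward sees:", ctx["backward"])
```

Each direction alone covers only one side of the token, so a backward-only embedding carries no information about the words preceding it. Stacking both gives the downstream tagger the full sentence context.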

Citation

If you use this model, please cite the original research paper as well as the model.

@inproceedings{akbik-etal-2018-contextual,
    title = "Contextual String Embeddings for Sequence Labeling",
    author = "Akbik, Alan  and
      Blythe, Duncan  and
      Vollgraf, Roland",
    editor = "Bender, Emily M.  and
      Derczynski, Leon  and
      Isabelle, Pierre",
    booktitle = "Proceedings of the 27th International Conference on Computational Linguistics",
    month = aug,
    year = "2018",
    address = "Santa Fe, New Mexico, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/C18-1139/",
    pages = "1638--1649"
}
@software{schonhardt_michael_2026_latin_flair,
  author       = "Schonhardt, Michael",
  title        = "Latin Contextual String Embeddings (Backward): Trained on 
                   Medieval and Early Modern Legal Corpora",
  year         = 2026,
  publisher    = "Zenodo",
  version      = "1.0.0",
  doi          = "10.5281/zenodo.18388814",
  url          = "https://doi.org/10.5281/zenodo.18388814"
}

Mirror of https://huggingface.co/mschonhardt/latin-legal-backward
