This Model2Vec model was created with Tokenlearn, using nomic-embed-text-v2-moe as the base model.
The output dimension is 768.
The evaluation in the model card was run on this distilled model, not on the original.
This model was not produced by a simple Model2Vec distillation. The process involved generating embeddings for 23M triplets (MS MARCO) with the original model, then training the Tokenlearn model on those embeddings, with the nomic model as the base.
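To make the first step of that process more concrete, here is a minimal sketch of how texts from a triplet dataset could be paired with teacher embeddings. This is not Tokenlearn's actual code and not necessarily what was done for this model: the names `unique_texts` and `featurize`, the `encode` callable, the deduplication of repeated texts, and the batch size are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

Triplet = Tuple[str, str, str]  # (query, positive passage, negative passage)
Vector = Sequence[float]


def unique_texts(triplets: Iterable[Triplet]) -> Iterator[str]:
    """Yield each distinct text from the triplets once, in first-seen order.

    Deduplication is an assumption of this sketch, not a documented step.
    """
    seen = set()
    for triplet in triplets:
        for text in triplet:
            if text not in seen:
                seen.add(text)
                yield text


def featurize(
    triplets: Iterable[Triplet],
    encode: Callable[[List[str]], Sequence[Vector]],  # hypothetical teacher wrapper
    batch_size: int = 256,
) -> Iterator[Tuple[str, Vector]]:
    """Pair every distinct text with the teacher embedding of that text."""
    batch: List[str] = []
    for text in unique_texts(triplets):
        batch.append(text)
        if len(batch) == batch_size:
            yield from zip(batch, encode(batch))
            batch = []
    if batch:  # flush the final, partially filled batch
        yield from zip(batch, encode(batch))
```

In a real run, `encode` would wrap the original model's encoding call, and the resulting (text, embedding) pairs would serve as training targets for the static model.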
Load this model using the model2vec library:
from model2vec import StaticModel
model = StaticModel.from_pretrained("cnmoro/nomic-embed-text-v2-moe-distilled-high-quality")
# Compute text embeddings
embeddings = model.encode(["Example sentence"])
Or using the sentence-transformers library:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('cnmoro/nomic-embed-text-v2-moe-distilled-high-quality')
# Compute text embeddings
embeddings = model.encode(["Example sentence"])
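Once embeddings are computed, a common next step is comparing them. The following is a small sketch of cosine similarity in plain Python. The helper name `cosine_similarity` is an assumption for illustration and is not part of model2vec or sentence-transformers.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)
```

With two sentences encoded in one call, `cosine_similarity(embeddings[0], embeddings[1])` would give a score where values closer to 1 indicate more similar texts.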
Base model
FacebookAI/xlm-roberta-base