This Model2Vec model was created using Tokenlearn, with nomic-embed-text-v2-moe as the base model.

The output dimension is 768.

The evaluation in this model card was performed using this distilled model, not the original.

This model was not produced by a simple model2vec distillation. Instead, embeddings were generated for 23M triplets (msmarco) with the original model, and a Tokenlearn model was then trained on them, using the nomic model as a base.

Usage

Load this model using the model2vec library:

from model2vec import StaticModel

model = StaticModel.from_pretrained("cnmoro/nomic-embed-text-v2-moe-distilled-high-quality")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])

Or using the sentence-transformers library:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("cnmoro/nomic-embed-text-v2-moe-distilled-high-quality")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])
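Static embeddings like these are typically compared with cosine similarity to rank documents against a query. A minimal sketch, using toy vectors in place of the 768-dimensional output of `model.encode` (the helper and vectors below are illustrative, not part of either library):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: dot product over norms
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for embeddings returned by model.encode(...)
query_emb = [1.0, 0.0, 1.0]
doc_embs = [[1.0, 0.0, 1.0],   # identical direction -> similarity 1.0
            [0.0, 1.0, 0.0]]   # orthogonal -> similarity 0.0

scores = [cosine_similarity(query_emb, d) for d in doc_embs]
best = int(np.argmax(scores))  # index of the most similar document
```

With real embeddings, `query_emb` and `doc_embs` would come from `model.encode(...)`, and the same ranking logic applies unchanged.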