This Model2Vec model was created with Tokenlearn, using nomic-embed-text-v2-moe as the base model.
The output dimension is 384.
The evaluation in the model card was executed using this distilled model, not the original.
This model was trained in streaming mode over large precomputed feature shards with incremental PCA (384d), vocabulary quantization capped at 32k effective tokens, and fine-tuning optimizations for large-scale data.
This is a smaller yet better-performing model than cnmoro/nomic-embed-text-v2-moe-distilled-high-quality.
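The incremental-PCA step described above can be sketched in plain NumPy: stream the mean and second moment over feature shards, then eigendecompose the covariance and keep the top components. This is a minimal illustration with toy dimensions, not the actual training code; all names and sizes below are assumptions for the sketch.

```python
import numpy as np

# Hedged sketch of PCA computed incrementally over feature shards:
# accumulate mean and second moment in a streaming fashion, then
# project onto the top-k eigenvectors. Toy sizes stand in for the
# real dimensions (the released model reduces to 384d).
rng = np.random.default_rng(0)
d_in, d_out = 16, 4            # stand-ins for e.g. 768 -> 384
n_shards, shard_size = 5, 200

count = 0
sum_x = np.zeros(d_in)
sum_xxT = np.zeros((d_in, d_in))

for _ in range(n_shards):
    # In practice each shard would be loaded from disk; here it is random.
    shard = rng.normal(size=(shard_size, d_in))
    count += shard.shape[0]
    sum_x += shard.sum(axis=0)
    sum_xxT += shard.T @ shard

mean = sum_x / count
cov = sum_xxT / count - np.outer(mean, mean)

# eigh returns eigenvalues in ascending order; take the top d_out.
eigvals, eigvecs = np.linalg.eigh(cov)
components = eigvecs[:, ::-1][:, :d_out]   # shape (d_in, d_out)

# Project a new batch of embeddings down to d_out dimensions.
batch = rng.normal(size=(3, d_in))
reduced = (batch - mean) @ components
print(reduced.shape)  # (3, 4)
```

Streaming the covariance this way keeps memory proportional to `d_in**2` regardless of how many shards are processed, which is what makes the approach viable for large-scale data.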
## Usage

Load this model using the `model2vec` library:

```python
from model2vec import StaticModel

model = StaticModel.from_pretrained("cnmoro/static-nomic-384-pten")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])
```

Or using the `sentence-transformers` library:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("cnmoro/static-nomic-384-pten")

# Compute text embeddings
embeddings = model.encode(["Example sentence"])
```
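A common next step after encoding is ranking candidate texts by cosine similarity. The sketch below uses random 384-d vectors as stand-ins for `model.encode(...)` output, so it runs without downloading the model; the helper name and shapes are assumptions for illustration.

```python
import numpy as np

def cosine_sim(a, b):
    # Normalize rows, then a plain matrix product gives cosine similarities.
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

rng = np.random.default_rng(42)
query = rng.normal(size=(1, 384))   # stand-in for model.encode(["query"])
docs = rng.normal(size=(5, 384))    # stand-in for encoded candidate texts

scores = cosine_sim(query, docs)[0]  # one score per candidate
best = int(np.argmax(scores))        # index of the most similar candidate
print(scores.shape)  # (5,)
```

With the real model, replacing the random arrays with `model.encode([...])` outputs gives semantic rankings; static embeddings like this one make the encode step extremely fast since it is a simple token-vector lookup and average.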
## Model tree for cnmoro/static-nomic-384-pten

Base model lineage: FacebookAI/xlm-roberta-base → finetuned as nomic-ai/nomic-xlm-2048 → finetuned as nomic-ai/nomic-embed-text-v2-moe
## Evaluation results

All scores are self-reported on MTEB test sets (default split).

**Assin2STS**
- pearson: 64.862
- spearman: 60.229
- cosine_pearson: 64.862
- cosine_spearman: 60.229
- manhattan_pearson: 63.422
- manhattan_spearman: 61.033
- euclidean_pearson: 62.569
- euclidean_spearman: 60.229
- main_score: 60.229

**BIOSSES**
- pearson: 69.126
- spearman: 68.588
- cosine_pearson: 69.126
- cosine_spearman: 68.588
- manhattan_pearson: 66.910
- manhattan_spearman: 67.602
- euclidean_pearson: 68.059
- euclidean_spearman: 68.588
- main_score: 68.588

**SICK-BR-STS**
- pearson: 68.883
- spearman: 62.925