CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2

This is a Cross Encoder model fine-tuned from cross-encoder/ms-marco-MiniLM-L6-v2 using the sentence-transformers library. It computes a score for each pair of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

Model Sources

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("legaliza-bit/cross-encoder-dbpedia")
# Get scores for pairs of texts
pairs = [
    ['Mehmet Ali Ağca (Turkish pronunciation: [ˈaːdʒa]; born January 9, 1958) is a Turkish assassin of Kurdish origin who murdered left-wing journalist Abdi İpekçi on February 1, 1979, and later shot and wounded Pope John Paul II on May 13, 1981, after escaping from a Turkish prison. After serving 19 years of imprisonment in Italy where he was visited by the Pope, he was deported to Turkey, where he ser', 'christian bishop. A DBpedia category.'],
    ['Sucking lice (Anoplura, formerly known as Siphunculata) have around 500 species and represent the smaller of the two traditional suborders of lice. As opposed to the paraphyletic chewing lice, which are now divided among three suborders, the sucking lice are monophyletic. The Anoplura are all blood-feeding ectoparasites of mammals. They only occur on about 20% of all placentalian mammal species, a', 'eukaryote. A DBpedia category.'],
    ['William Ralph \\"Bill\\" Blass (June 22, 1922 – June 12, 2002) was an American fashion designer, born in Fort Wayne, Indiana. He was the recipient of many fashion awards, including seven Coty Awards and the Fashion Institute of Technology\'s Lifetime Achievement Award (1999).', 'artist. A DBpedia category.'],
    ['Secretariat (March 30, 1970 – October 4, 1989) was an American Thoroughbred racehorse who, in 1973, became the first Triple Crown winner in 25 years. His record-breaking win in the Belmont Stakes, where he left the field 31 lengths behind him, is widely regarded as one of the greatest races of all time. During his racing career, he won five Eclipse Awards, including Horse of the Year honors at age', 'stadium. A DBpedia category.'],
    ["The Hunsecker's Mill Covered Bridge is a covered bridge located in Lancaster County, Pennsylvania, United States. The bridge has a single span, wooden, double Burr arch trusses design. The bridge, which spans the Conestoga River, is 180 feet (55 m) long, making it the longest single span covered bridge in the county. The bridge's WGCB Number is 38-36-06. Unlike most historic covered bridges in the", 'baronet. A DBpedia category.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'Mehmet Ali Ağca (Turkish pronunciation: [ˈaːdʒa]; born January 9, 1958) is a Turkish assassin of Kurdish origin who murdered left-wing journalist Abdi İpekçi on February 1, 1979, and later shot and wounded Pope John Paul II on May 13, 1981, after escaping from a Turkish prison. After serving 19 years of imprisonment in Italy where he was visited by the Pope, he was deported to Turkey, where he ser',
    [
        'christian bishop. A DBpedia category.',
        'eukaryote. A DBpedia category.',
        'artist. A DBpedia category.',
        'stadium. A DBpedia category.',
        'baronet. A DBpedia category.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
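The dictionaries returned by rank() carry only a corpus_id and a score, so it is often convenient to map them back to the candidate texts. The helper below is a minimal sketch of that step. The name label_ranked_results and the scores in example_ranks are hypothetical and chosen for illustration; they are not outputs of this model.

```python
from typing import Dict, List, Sequence, Tuple


def label_ranked_results(
    ranks: Sequence[Dict[str, float]],
    candidates: Sequence[str],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Map rank()-style dicts back to candidate texts, best score first."""
    # Sort defensively so the helper also works on unsorted input.
    ordered = sorted(ranks, key=lambda r: r["score"], reverse=True)
    return [(candidates[int(r["corpus_id"])], float(r["score"])) for r in ordered[:top_k]]


candidates = [
    "christian bishop. A DBpedia category.",
    "eukaryote. A DBpedia category.",
    "artist. A DBpedia category.",
]
# Hypothetical scores for illustration only; these are NOT outputs of this model.
example_ranks = [
    {"corpus_id": 2, "score": 0.10},
    {"corpus_id": 0, "score": 0.90},
    {"corpus_id": 1, "score": 0.50},
]
for text, score in label_ranked_results(example_ranks, candidates, top_k=2):
    print(f"{score:.2f}  {text}")
```

In real use, pass the list returned by model.rank(...) as ranks and the same candidate list that was given to rank() as candidates.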

Training Details

Training Dataset

Unnamed Dataset

  • Size: 110,400 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 35 characters, mean 351.7 characters, max 400 characters
    • sentence2 (string): min 24 characters, mean 33.2 characters, max 69 characters
    • label (float): min 0.0, mean 0.16, max 1.0
  • Samples:
    • Sample 1
      sentence1: Mehmet Ali Ağca (Turkish pronunciation: [ˈaːdʒa]; born January 9, 1958) is a Turkish assassin of Kurdish origin who murdered left-wing journalist Abdi İpekçi on February 1, 1979, and later shot and wounded Pope John Paul II on May 13, 1981, after escaping from a Turkish prison. After serving 19 years of imprisonment in Italy where he was visited by the Pope, he was deported to Turkey, where he ser
      sentence2: christian bishop. A DBpedia category.
      label: 0.0
    • Sample 2
      sentence1: Sucking lice (Anoplura, formerly known as Siphunculata) have around 500 species and represent the smaller of the two traditional suborders of lice. As opposed to the paraphyletic chewing lice, which are now divided among three suborders, the sucking lice are monophyletic. The Anoplura are all blood-feeding ectoparasites of mammals. They only occur on about 20% of all placentalian mammal species, a
      sentence2: eukaryote. A DBpedia category.
      label: 0.0
    • Sample 3
      sentence1: William Ralph "Bill" Blass (June 22, 1922 – June 12, 2002) was an American fashion designer, born in Fort Wayne, Indiana. He was the recipient of many fashion awards, including seven Coty Awards and the Fashion Institute of Technology's Lifetime Achievement Award (1999).
      sentence2: artist. A DBpedia category.
      label: 0.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
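To make the loss configuration concrete, here is a pure-Python sketch of a per-example binary cross-entropy computed on raw logits. It assumes the Identity activation means the loss sees unnormalized logits, and that a pos_weight of null means positive examples are not reweighted. The function names are illustrative, and this is not the library's implementation.

```python
import math
from typing import Optional


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bce_with_logits(logit: float, label: float, pos_weight: Optional[float] = None) -> float:
    """Binary cross-entropy for one example, given a raw logit and a label in [0, 1]."""
    # pos_weight=None mirrors "pos_weight": null, i.e. no extra weight on positives.
    w = 1.0 if pos_weight is None else pos_weight
    # -log(sigmoid(x)) == softplus(-x) and -log(1 - sigmoid(x)) == softplus(x)
    return w * label * softplus(-logit) + (1.0 - label) * softplus(logit)


# A logit of 0 corresponds to a predicted probability of 0.5.
print(bce_with_logits(0.0, 1.0))
# A confident, correct prediction yields a much smaller loss than a confident, wrong one.
print(bce_with_logits(6.0, 1.0), bce_with_logits(-6.0, 1.0))
```

With a mean label of about 0.16 in the sampled statistics, positives are the minority class; setting pos_weight above 1 would be one way to weight them more heavily, although this training run did not do so.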
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • fp16: True

All Hyperparameters

  • per_device_train_batch_size: 32
  • num_train_epochs: 3
  • max_steps: -1
  • learning_rate: 2e-05
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_steps: 0
  • optim: adamw_torch_fused
  • optim_args: None
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • optim_target_modules: None
  • gradient_accumulation_steps: 1
  • average_tokens_across_devices: True
  • max_grad_norm: 1.0
  • label_smoothing_factor: 0.0
  • bf16: False
  • fp16: True
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • use_liger_kernel: False
  • liger_kernel_config: None
  • use_cache: False
  • neftune_noise_alpha: None
  • torch_empty_cache_steps: None
  • auto_find_batch_size: False
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • include_num_input_tokens_seen: no
  • log_level: passive
  • log_level_replica: warning
  • disable_tqdm: False
  • project: huggingface
  • trackio_space_id: trackio
  • eval_strategy: no
  • per_device_eval_batch_size: 8
  • prediction_loss_only: True
  • eval_on_start: False
  • eval_do_concat_batches: True
  • eval_use_gather_object: False
  • eval_accumulation_steps: None
  • include_for_metrics: []
  • batch_eval_metrics: False
  • save_only_model: False
  • save_on_each_node: False
  • enable_jit_checkpoint: False
  • push_to_hub: False
  • hub_private_repo: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_always_push: False
  • hub_revision: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • restore_callback_states_from_checkpoint: False
  • full_determinism: False
  • seed: 42
  • data_seed: None
  • use_cpu: False
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • dataloader_prefetch_factor: None
  • remove_unused_columns: True
  • label_names: None
  • train_sampling_strategy: random
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • ddp_backend: None
  • ddp_timeout: 1800
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • deepspeed: None
  • debug: []
  • skip_memory_metrics: True
  • do_predict: False
  • resume_from_checkpoint: None
  • warmup_ratio: None
  • local_rank: -1
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.0145 50 0.4739
0.0290 100 0.3323
0.0435 150 0.3062
0.0580 200 0.3014
0.0725 250 0.2991
0.0870 300 0.2877
0.1014 350 0.2622
0.1159 400 0.2598
0.1304 450 0.2628
0.1449 500 0.2878
0.1594 550 0.2537
0.1739 600 0.2494
0.1884 650 0.2288
0.2029 700 0.2291
0.2174 750 0.2108
0.2319 800 0.2191
0.2464 850 0.2124
0.2609 900 0.2385
0.2754 950 0.2240
0.2899 1000 0.2017
0.3043 1050 0.2115
0.3188 1100 0.2130
0.3333 1150 0.2063
0.3478 1200 0.2098
0.3623 1250 0.2169
0.3768 1300 0.2096
0.3913 1350 0.2459
0.4058 1400 0.1833
0.4203 1450 0.1974
0.4348 1500 0.1961
0.4493 1550 0.1894
0.4638 1600 0.1935
0.4783 1650 0.1863
0.4928 1700 0.1959
0.5072 1750 0.1929
0.5217 1800 0.1886
0.5362 1850 0.1940
0.5507 1900 0.1786
0.5652 1950 0.1687
0.5797 2000 0.1944
0.5942 2050 0.1943
0.6087 2100 0.1735
0.6232 2150 0.1597
0.6377 2200 0.1954
0.6522 2250 0.1952
0.6667 2300 0.1790
0.6812 2350 0.1647
0.6957 2400 0.1792
0.7101 2450 0.1605
0.7246 2500 0.1872
0.7391 2550 0.1657
0.7536 2600 0.1871
0.7681 2650 0.1845
0.7826 2700 0.1919
0.7971 2750 0.1480
0.8116 2800 0.1701
0.8261 2850 0.1744
0.8406 2900 0.1596
0.8551 2950 0.1726
0.8696 3000 0.1710
0.8841 3050 0.1611
0.8986 3100 0.1520
0.9130 3150 0.1588
0.9275 3200 0.1614
0.9420 3250 0.1407
0.9565 3300 0.1698
0.9710 3350 0.1729
0.9855 3400 0.1493
1.0 3450 0.1587
1.0145 3500 0.1426
1.0290 3550 0.1179
1.0435 3600 0.1272
1.0580 3650 0.1422
1.0725 3700 0.1325
1.0870 3750 0.1398
1.1014 3800 0.1252
1.1159 3850 0.1164
1.1304 3900 0.1290
1.1449 3950 0.1195
1.1594 4000 0.1347
1.1739 4050 0.1665
1.1884 4100 0.1453
1.2029 4150 0.1287
1.2174 4200 0.1335
1.2319 4250 0.1410
1.2464 4300 0.1309
1.2609 4350 0.1425
1.2754 4400 0.1386
1.2899 4450 0.1334
1.3043 4500 0.1194
1.3188 4550 0.1490
1.3333 4600 0.1347
1.3478 4650 0.1531
1.3623 4700 0.1279
1.3768 4750 0.1483
1.3913 4800 0.1407
1.4058 4850 0.1222
1.4203 4900 0.1462
1.4348 4950 0.1217
1.4493 5000 0.1083
1.4638 5050 0.1377
1.4783 5100 0.1404
1.4928 5150 0.1389
1.5072 5200 0.1290
1.5217 5250 0.1354
1.5362 5300 0.1219
1.5507 5350 0.1442
1.5652 5400 0.1462
1.5797 5450 0.1440
1.5942 5500 0.1313
1.6087 5550 0.1163
1.6232 5600 0.1383
1.6377 5650 0.1357
1.6522 5700 0.1286
1.6667 5750 0.1131
1.6812 5800 0.1436
1.6957 5850 0.1233
1.7101 5900 0.1305
1.7246 5950 0.1124
1.7391 6000 0.1398
1.7536 6050 0.1325
1.7681 6100 0.1371
1.7826 6150 0.1200
1.7971 6200 0.1181
1.8116 6250 0.1107
1.8261 6300 0.1168
1.8406 6350 0.1310
1.8551 6400 0.1671
1.8696 6450 0.1257
1.8841 6500 0.1189
1.8986 6550 0.1453
1.9130 6600 0.1567
1.9275 6650 0.1047
1.9420 6700 0.1261
1.9565 6750 0.1244
1.9710 6800 0.1321
1.9855 6850 0.1275
2.0 6900 0.1103
2.0145 6950 0.1219
2.0290 7000 0.1087
2.0435 7050 0.1108
2.0580 7100 0.0927
2.0725 7150 0.1161
2.0870 7200 0.1114
2.1014 7250 0.1102
2.1159 7300 0.1106
2.1304 7350 0.1103
2.1449 7400 0.1328
2.1594 7450 0.1159
2.1739 7500 0.1107
2.1884 7550 0.1042
2.2029 7600 0.1117
2.2174 7650 0.1247
2.2319 7700 0.0953
2.2464 7750 0.1079
2.2609 7800 0.1005
2.2754 7850 0.1103
2.2899 7900 0.1095
2.3043 7950 0.1162
2.3188 8000 0.1318
2.3333 8050 0.0836
2.3478 8100 0.1214
2.3623 8150 0.1283
2.3768 8200 0.1274
2.3913 8250 0.0972
2.4058 8300 0.1121
2.4203 8350 0.1052
2.4348 8400 0.1084
2.4493 8450 0.0962
2.4638 8500 0.1065
2.4783 8550 0.0863
2.4928 8600 0.1146
2.5072 8650 0.0895
2.5217 8700 0.1141
2.5362 8750 0.1070
2.5507 8800 0.1118
2.5652 8850 0.1056
2.5797 8900 0.0831
2.5942 8950 0.1073
2.6087 9000 0.1204
2.6232 9050 0.1157
2.6377 9100 0.1119
2.6522 9150 0.0872
2.6667 9200 0.1059
2.6812 9250 0.0837
2.6957 9300 0.0872
2.7101 9350 0.0981
2.7246 9400 0.1140
2.7391 9450 0.0963
2.7536 9500 0.1141
2.7681 9550 0.1097
2.7826 9600 0.1109
2.7971 9650 0.1202
2.8116 9700 0.1067
2.8261 9750 0.1134
2.8406 9800 0.1108
2.8551 9850 0.0912
2.8696 9900 0.1160
2.8841 9950 0.1002
2.8986 10000 0.0844
2.9130 10050 0.1100
2.9275 10100 0.0930
2.9420 10150 0.1085
2.9565 10200 0.1306
2.9710 10250 0.1232
2.9855 10300 0.1019
3.0 10350 0.1212
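The step counts in the log can be cross-checked against the dataset size and batch size reported above. The sketch below assumes a single device, which is consistent with gradient_accumulation_steps: 1 and with the logged step totals; steps_per_epoch is an illustrative helper.

```python
import math


def steps_per_epoch(
    num_samples: int,
    per_device_batch_size: int,
    num_devices: int = 1,  # assumption: single-device training
    grad_accum_steps: int = 1,
) -> int:
    """Optimizer steps per epoch when the last partial batch is kept (dataloader_drop_last: False)."""
    batches = math.ceil(num_samples / (per_device_batch_size * num_devices))
    return math.ceil(batches / grad_accum_steps)


per_epoch = steps_per_epoch(110_400, 32)
total_steps = per_epoch * 3  # num_train_epochs: 3
print(per_epoch, total_steps)
# Fractional epoch for the first logged step (step 50).
print(round(50 / per_epoch, 4))
```

Under these assumptions the helper gives 3450 steps per epoch and 10350 steps in total, which matches the rows logged at epochs 1.0 and 3.0.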

Framework Versions

  • Python: 3.11.15
  • Sentence Transformers: 5.3.0
  • Transformers: 5.5.0
  • PyTorch: 2.11.0+cu126
  • Accelerate: 1.13.0
  • Datasets: 4.8.4
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
Model size: 22.7M params (Safetensors, tensor type F32)