This is a sentence-transformers model finetuned from intfloat/e5-large-v2. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
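The Pooling and Normalize modules listed above amount to a masked mean over token vectors followed by L2 normalization. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function names `mean_pool` and `l2_normalize` and the toy 2-dimensional vectors are illustrative assumptions (the real model produces 1024-dimensional vectors).

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length so dot products equal cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]

# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 0.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the output is unit length, cosine similarity and dot product give the same ranking for embeddings from this model.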
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
"query: Apparently, the air a bit higher up isn't warming as much as the ground level, according to satellite data. 🤔 #climate #science",
'passage: A chronic difficulty in obtaining reliable climate records from satellites has been changes in instruments, platforms, equator-crossing times, and algorithms. The microwave sounding unit (MSU) tropospheric temperature record has overcome some of these problems, but evidence is presented that it too contains unreliable trends over a 17-yr period (1979–95) because of transitions involving different satellites and complications arising from nonatmospheric signals associated with the surface. The two primary MSU measures of tropospheric temperature contain different error characteristics and trends. The MSU channel 2 record exhibits a slight warming trend since 1979. Its broad vertical weighting function means that the temperature signal originates from throughout the troposphere and part of the lower stratosphere; intersatellite comparisons reveal low noise levels. Off-nadir channel 2 data are combined to provide an adjusted weighting function (called MSU 2R) without the stratospheric signal, but at a cost of an increased influence of surface emissions. Land surface microwave emissions, which account for about 20% of the total signal, depend on ground temperature and soil moisture and are subject to large variations associated with the diurnal cycle. The result is that MSU 2R noise levels are a factor of 3 larger than for MSU 2 and are sufficient to corrupt trends when several satellite records are merged. After allowing for physical differences between the satellite and surface records, large differences remain in temperature trends over the Tropics where there is a strong and deterministic coupling with the surface. The authors use linear regression with observed sea surface temperatures (SSTs) and an atmospheric general circulation model to relate the tropical MSU and surface datasets. 
These and alternative analyses of the MSU data, radiosonde data, and comparisons between the MSU 2R and channel 2 records, with estimates of their noise, are used to show that the downward trend in tropical MSU 2R temperatures is very likely spurious. Tropical radiosonde records are of limited use in resolving the discrepancies because of artificial trends arising from changes in instruments or sensors;however, comparisons with Australian radiosondes show a spurious downward jump in MSU 2R in mid-1991, which is not evident in MSU 2. Evaluation of reanalyzed tropical temperatures from the National Centers for Environmental Prediction and the European Centre for Medium-Range Weather Forecasts shows that they contain very different and false trends, as the analyses are only as good as the input database. Statistical analysis of the MSU 2R record objectively identifies two stepwise downward discontinuities that coincide with satellite transitions. The first is in mid-1981, prior to which only one satellite was in operation for much of the time so the diurnal cycle was not well sampled. Tropical SST anomalies over these years were small, in agreement with the Southern Oscillation index, yet the MSU 2R values were anomalously warm by ∼0.25°C. The second transition from NOAA-10 to NOAA-12 in mid-1991 did not involve an overlap except with NOAA-11, which suffered from a large drift in its equator-crossing times. MSU 2R anomalies have remained anomalously cold since mid-1991 by ∼0.1°C. Adding the two stepwise discontinuities to the tropical MSU 2R record allows it to be completely reconciled with the SST record within expected noise levels. The statistical results also make physical sense as the tropical satellite anomalies are magnified relative to SST anomalies by a factor of ∼1.3, which is the amplification expected following the saturated adiabatic lapse rate to the level of the peak weighting function of MSU 2R.',
'passage: During the Holocene (last 12,000 years) nine cold relapses were observed mainly in the North Atlantic Ocean area and its surroundings. Based on the pioneering studies by Bond et al. (1997, 2001) these events are called Bond Cycles and thought to be the Holocene equivalents of the Pleistocene Dansgaard-Oeschger cycles. The first event was the Younger Dryas (~12,000 BP; Broecker 2006), the last one was the Little Ice Age (AD 1350-1860; Grove 1988). A number of trigger mechanisms is discussed (see Table 1), but a theory for the Bond Cycles does not exist. Based on spectral analyses of both, forcing factors and climatological time series, we argue that one single process did likely not cause the Holocene cooling events. It is conceivable that the early Holocene coolings were triggered by meltwater pulses. However, the late Holocene events (e.g., the Little Ice Age) were rather caused by a combination of different trigger mechanisms. In every case it has to be taken in mind that natural variability was also playing a decisive role.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, 0.5938, -0.0103],
# [ 0.5938, 1.0000, 0.0422],
# [-0.0103, 0.0422, 1.0000]])
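The example inputs carry the `query: ` and `passage: ` prefixes used by the E5 base model, so new inputs should probably be prefixed the same way before encoding. Here is a small sketch of that step plus ranking passages by score; the helper names `as_query`, `as_passage`, and `rank_passages` are hypothetical and not part of the sentence-transformers API.

```python
def as_query(text):
    """Prefix a claim or question the way the example queries are prefixed."""
    return f"query: {text}"

def as_passage(text):
    """Prefix a document or abstract the way the example passages are prefixed."""
    return f"passage: {text}"

def rank_passages(query_scores):
    """Return passage indices ordered from most to least similar to the query."""
    return sorted(range(len(query_scores)), key=lambda i: query_scores[i], reverse=True)

# Scores taken from the first row of the similarity matrix in the usage example,
# leaving out the query's similarity with itself.
scores = [0.5938, -0.0103]
print(rank_passages(scores))
```

In practice the scores would come from `model.similarity(query_embeddings, passage_embeddings)` rather than being typed in by hand.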
Triplet evaluation on claims-abstracts-dev, computed with TripletEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy | 0.9667 |
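As a rough illustration of what a triplet cosine accuracy measures, the sketch below counts the fraction of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative. It is a simplified reading of the metric under that assumption, not the evaluator's code, and the toy vectors are made up.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_accuracy(triplets):
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine(anchor, positive) > cosine(anchor, negative)
    )
    return correct / len(triplets)

# Toy triplets of 2-dimensional embeddings: the first two are ranked correctly,
# the third is not.
toy_triplets = [
    ([1.0, 0.0], [1.0, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [0.2, 1.0], [1.0, 0.0]),
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),
]
print(cosine_accuracy(toy_triplets))
```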
Columns: anchor, positive, and negative

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | list |

Samples:

| anchor | positive | negative |
|---|---|---|
query: there is doubt that the survival of polar bears as a species is doomed |
passage: Polar bears (Ursus maritimus) live throughout the ice-covered waters of the circumpolar Arctic, particularly in near shore annual ice over the continental shelf where biological productivity is highest. However, to a large degree under scenarios predicted by climate change models, these preferred sea ice habitats will be substantially altered. Spatial and temporal sea ice changes will lead to shifts in trophic interactions involving polar bears through reduced availability and abundance of their main prey: seals. In the short term, climatic warming may improve bear and seal habitats in higher latitudes over continental shelves if currently thick multiyear ice is replaced by annual ice with more leads, making it more suitable for seals. A cascade of impacts beginning with reduced sea ice will be manifested in reduced adipose stores leading to lowered reproductive rates because females will have less fat to invest in cubs during the winter fast. Non-pregnant bears may have to fa... |
['passage: Polar bears depend on sea ice for survival. Climate warming in the Arctic has caused significant declines in total cover and thickness of sea ice in the polar basin and progressively earlier breakup in some areas. Inuit hunters in the areas of four polar bear populations in the eastern Canadian Arctic (including Western Hudson Bay) have reported seeing more bears near settlements during the open-water period in recent years. In a fifth ecologically similar population, no changes have yet been reported by Inuit hunters. These observations, interpreted as evidence of increasing population size, have resulted in increases in hunting quotas. However, long-term data on the population size and body condition of polar bears in Western Hudson Bay, as well as population and harvest data from Baffin Bay, make it clear that those two populations at least are more likely to be declining, not increasing. While the ecological details vary in the regions occupied by the five different populations discussed in this paper, analysis of passive-microwave satellite imagery beginning in the late 1970s indicates that the sea ice is breaking up at progressively earlier dates, so that bears must fast for longer periods during the open-water season. Thus, at least part of the explanation for the appearance of more bears near coastal communities and hunting camps is likely that they are searching for alternative food sources in years when their stored body fat depots may be depleted before freeze-up, when they can return to the sea ice to hunt seals again. We hypothesize that, if the climate continues to warm as projected by the Intergovernmental Panel on Climate Change (IPCC), then polar bears in all five populations discussed in this paper will be increasingly food-stressed, and their numbers are likely to decline eventually, probably significantly so. 
As these populations decline, problem interactions between bears and humans will likely continue, and possibly increase, as the bears seek alternative food sources. Taken together, the data reported in this paper suggest that a precautionary approach be taken to the harvesting of polar bears and that the potential effects of climate warming be incorporated into planning for the management and conservation of this species throughout the Arctic.', 'passage: Loss of Arctic sea ice owing to climate change is the primary threat to polar bears throughout their range. We evaluated the potential response of polar bears to sea-ice declines by (i) calculating generation length (GL) for the species, which determines the timeframe for conservation assessments; (ii) developing a standardized sea-ice metric representing important habitat; and (iii) using statistical models and computer simulation to project changes in the global population under three approaches relating polar bear abundance to sea ice. Mean GL was 11.5 years. Ice-covered days declined in all subpopulation areas during 1979–2014 (median −1.26 days year −1 ). The estimated probabilities that reductions in the mean global population size of polar bears will be greater than 30%, 50% and 80% over three generations (35–41 years) were 0.71 (range 0.20–0.95), 0.07 (range 0–0.35) and less than 0.01 (range 0–0.02), respectively. According to IUCN Red List reduction thresholds, which provide a common measure of extinction risk across taxa, these results are consistent with listing the species as vulnerable. Our findings support the potential for large declines in polar bear numbers owing to sea-ice loss, and highlight near-term uncertainty in statistical projections as well as the sensitivity of projections to different plausible assumptions.', 'passage: Loss of Arctic sea ice owing to climate change is the primary threat to polar bears throughout their range. 
We evaluated the potential response of polar bears to sea-ice declines by (i) calculating generation length (GL) for the species, which determines the timeframe for conservation assessments; (ii) developing a standardized sea-ice metric representing important habitat; and (iii) using statistical models and computer simulation to project changes in the global population under three approaches relating polar bear abundance to sea ice. Mean GL was 11.5 years. Ice-covered days declined in all subpopulation areas during 1979–2014 (median −1.26 days year −1 ). The estimated probabilities that reductions in the mean global population size of polar bears will be greater than 30%, 50% and 80% over three generations (35–41 years) were 0.71 (range 0.20–0.95), 0.07 (range 0–0.35) and less than 0.01 (range 0–0.02), respectively. According to IUCN Red List reduction thresholds, which provide a common measure of extinction risk across taxa, these results are consistent with listing the species as vulnerable. Our findings support the potential for large declines in polar bear numbers owing to sea-ice loss, and highlight near-term uncertainty in statistical projections as well as the sensitivity of projections to different plausible assumptions.'] |
query: Other factors than CO2 like water vapor play a bigger role in determining the Earth's climate. |
passage: Carbon dioxide (CO2) and methane (CH4) are important greenhouse gases in the atmosphere and have large impacts on Earth's radiative forcing and climate. Their natural and anthropogenic emissions have often been in focus, while the role of human metabolic emissions has received less attention. In this study, exhaled, dermal and whole-body CO2 and CH4 emission rates from a total of 20 volunteers were quantified under various controlled environmental conditions in a climate chamber. The whole-body CO2 emissions increased with temperature. Individual differences were the most important factor for the whole-body CH4 emissions. Dermal emissions of CO2 and CH4 only contributed ~3.5% and ~5.5% to the whole-body emissions, respectively. Breath measurements conducted on 24 volunteers in a companion study identified one third of the volunteers as CH4 producers (exhaled CH4 exceeded 1 ppm above ambient level). The exhaled CH4 emission rate of these CH4 producers (4.03 ± 0.71 mg/h/person, ... |
['passage: Significance The fact that water vapor is the most dominant greenhouse gas underscores the need for an accurate understanding of the changes in its distribution over space and time. Although satellite observations have revealed a moistening trend in the upper troposphere, it has been unclear whether the observed moistening is a facet of natural variability or a direct result of human activities. Here, we use a set of coordinated model experiments to confirm that the satellite-observed increase in upper-tropospheric water vapor over the last three decades is primarily attributable to human activities. This attribution has significant implications for climate sciences because it corroborates the presence of the largest positive feedback in the climate system.', 'passage: Significance The fact that water vapor is the most dominant greenhouse gas underscores the need for an accurate understanding of the changes in its distribution over space and time. Although satellite observations have revealed a moistening trend in the upper troposphere, it has been unclear whether the observed moistening is a facet of natural variability or a direct result of human activities. Here, we use a set of coordinated model experiments to confirm that the satellite-observed increase in upper-tropospheric water vapor over the last three decades is primarily attributable to human activities. This attribution has significant implications for climate sciences because it corroborates the presence of the largest positive feedback in the climate system.', 'passage: Significance The fact that water vapor is the most dominant greenhouse gas underscores the need for an accurate understanding of the changes in its distribution over space and time. Although satellite observations have revealed a moistening trend in the upper troposphere, it has been unclear whether the observed moistening is a facet of natural variability or a direct result of human activities. 
Here, we use a set of coordinated model experiments to confirm that the satellite-observed increase in upper-tropospheric water vapor over the last three decades is primarily attributable to human activities. This attribution has significant implications for climate sciences because it corroborates the presence of the largest positive feedback in the climate system.'] |
query: Climate change is a long-term process. Even if it's entirely human-caused, it's happening so slowly that we won't see its full effects in our lifetime. That's why action is needed NOW. |
passage: ‘Global warming’ may be a familiar term, but it is seriously misleading. Human actions are causing a massive disruption to the planet's climate that is severe, rapid, very variable over space and time, and highly complex. The biosphere itself is complex and its responses to even simple changes are difficult to predict in detail. One can likely only be certain that many changes will be unexpected and some unfortunate. Even the simple, slow warming of the climate will produce complex consequences to species numbers and distributions because of how species depend on each other. An alternative approach to worrying about details is to concentrate on understanding the most significant ecological changes, ones that are irreversible — so-called ‘tipping points’. Once such a point has been passed, even if society managed to restore historical climatic conditions, it might not restore the historical ecological patterns. Nowhere is this more obvious than in the loss of species, for we ca... |
["passage: ‘Global warming’ may be a familiar term, but it is seriously misleading. Human actions are causing a massive disruption to the planet's climate that is severe, rapid, very variable over space and time, and highly complex. The biosphere itself is complex and its responses to even simple changes are difficult to predict in detail. One can likely only be certain that many changes will be unexpected and some unfortunate. Even the simple, slow warming of the climate will produce complex consequences to species numbers and distributions because of how species depend on each other. An alternative approach to worrying about details is to concentrate on understanding the most significant ecological changes, ones that are irreversible — so-called ‘tipping points’. Once such a point has been passed, even if society managed to restore historical climatic conditions, it might not restore the historical ecological patterns. Nowhere is this more obvious than in the loss of species, for we cannot recreate them. Climate disruptions may cause the loss of a large fraction of the planet's biodiversity, even if the only mechanism were to be species ranges moving uphill as temperatures rise. ‘Global warming’ may be a familiar term, but it is seriously misleading. Human actions are causing a massive disruption to the planet's climate that is severe, rapid, very variable over space and time, and highly complex. The biosphere itself is complex and its responses to even simple changes are difficult to predict in detail. One can likely only be certain that many changes will be unexpected and some unfortunate. Even the simple, slow warming of the climate will produce complex consequences to species numbers and distributions because of how species depend on each other. An alternative approach to worrying about details is to concentrate on understanding the most significant ecological changes, ones that are irreversible — so-called ‘tipping points’. 
Once such a point has been passed, even if society managed to restore historical climatic conditions, it might not restore the historical ecological patterns. Nowhere is this more obvious than in the loss of species, for we cannot recreate them. Climate disruptions may cause the loss of a large fraction of the planet's biodiversity, even if the only mechanism were to be species ranges moving uphill as temperatures rise.", 'passage: Climate is changing in an accelerating pace. Climate change occurs as a result of an imbalance between incoming and outgoing radiation in the atmosphere. The global mean temperatures may increase up to 5.4°C by 2100. Climate change is mainly caused by humans, especially through increased greenhouse gas emissions. Climate change is recognized as a serious threat to ecosystem, biodiversity, and health. It is associated with alterations in the physical environment of the planet Earth. Climate change affects life around the globe. It impacts plants and animals, with consequences for the survival of the species. In humans, climate change has multiple deleterious consequences. Climate change creates water and food insecurity, increased morbidity/mortality, and population movement. Vulnerable populations (e.g., children, elderly, indigenous, and poor) are disproportionately affected. Personalized adaptation to the consequences of climate change and preventive measures are key challenges for the society. Policymakers must implement the appropriate strategies, especially in the vulnerable populations.', 'passage: Climate is changing in an accelerating pace. Climate change occurs as a result of an imbalance between incoming and outgoing radiation in the atmosphere. The global mean temperatures may increase up to 5.4°C by 2100. Climate change is mainly caused by humans, especially through increased greenhouse gas emissions. Climate change is recognized as a serious threat to ecosystem, biodiversity, and health. 
It is associated with alterations in the physical environment of the planet Earth. Climate change affects life around the globe. It impacts plants and animals, with consequences for the survival of the species. In humans, climate change has multiple deleterious consequences. Climate change creates water and food insecurity, increased morbidity/mortality, and population movement. Vulnerable populations (e.g., children, elderly, indigenous, and poor) are disproportionately affected. Personalized adaptation to the consequences of climate change and preventive measures are key challenges for the society. Policymakers must implement the appropriate strategies, especially in the vulnerable populations.'] |
Loss: MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false
}
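To make the `scale` and `cos_sim` parameters concrete, here is a minimal pure-Python sketch of an in-batch ranking loss of this kind: each anchor's cosine similarities to all candidates are multiplied by the scale and passed through a softmax cross-entropy whose target is the anchor's own positive. This is an approximation for illustration, assuming explicit negatives are simply appended to the candidate list; the function names are hypothetical and the library's implementation may differ in detail.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def ranking_loss(anchors, positives, negatives=None, scale=20.0):
    """Mean cross-entropy over anchors; the positive at the same index is the target."""
    candidates = list(positives) + list(negatives or [])
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in candidates]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch: well-aligned pairs give a loss near zero, swapped pairs a large one.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

A larger scale sharpens the softmax, so small differences in cosine similarity translate into larger differences in loss.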
Columns: anchor, positive, and negative

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |

Samples:

| anchor | positive | negative |
|---|---|---|
query: While CO2 gets a lot of attention, it's actually water vapor that plays the biggest role in trapping heat in our atmosphere #ClimateAction #ClimateAwareness |
passage: Carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O) are the greenhouse gases largely responsible for anthropogenic climate change. Natural plant and microbial metabolic processes play a major role in the global atmospheric budget of each. We have been studying ecosystem-atmosphere trace gas exchange at a sub-boreal forest in the northeastern United States for over two decades. Historically our emphasis was on turbulent fluxes of CO2 and water vapor. In 2012 we embarked on an expanded campaign to also measure CH4 and N2O. Here we present continuous tower-based measurements of the ecosystem-atmosphere exchange of CO2 and CH4, recorded over the period 2012-2018 and reported at a 30-minute time step. Additionally, we describe a five-year (2012-2016) dataset of chamber-based measurements of soil fluxes of CO2, CH4, and N2O (2013-2016 only), conducted each year from May to November. These data can be used for process studies, for biogeochemical and land surface model valida... |
passage: Summary |
query: The wealthy create disproportionately large carbon footprints. |
passage: Shrinking household size is a key challenge for sustainability, simultaneously decreasing sharing and increasing resource consumption. We use the Danish Household Budget Survey and carbon intensities from EXIOBASE to characterise small households in socio-demographic cohorts along the carbon footprint spectrum. Single and dual occupant households represent 77% of the Danish carbon footprint and 73% of the sample, making these households highly relevant for climate and social policy. We identify high carbon footprint cohorts to determine potential intervention targets such as wealthy males living alone and couples in suburban areas. To add emotional depth to these characteristics we provide three stories to our results. Illuminating characteristics of high impact households provides a foundation from which to design and implement interventions to reduce the carbon consequences of the growing trend towards living alone. We also characterise low carbon footprint cohorts, with spe... |
passage: We modelled the financial and environmental costs of two commonly used anaesthetic plastic drug trays. We proposed that, compared with single-use trays, reusable trays are less expensive, consume less water and produce less carbon dioxide, and that routinely adding cotton and paper increases financial and environmental costs. We used life cycle assessment to model the financial and environmental costs of reusable and single-use trays. From our life cycle assessment modelling, the reusable tray cost (Australian dollars) $0.23 (95% confidence interval [CI] $0.21 to $0.25) while the single-use tray alone cost $0.47 (price range of $0.42 to $0.52) and the single-use tray with cotton and gauze added was $0.90 (no price range in Melbourne). Production of CO2 was 110 g CO2 (95% CI 98 to 122 g CO2) for the reusable tray, 126 g (95% CI 104 to 151 g) for single-use trays alone (mean difference of 16 g, 95% CI -8 to 40 g) and 204 g CO2 (95% CI 166 to 268 g CO2) for the single-use trays w... |
query: Turns out, sea levels haven't been rising any faster in the last 120 years. #ClimateAction #Sustainability |
passage: Abstract. Alteration of natural environment in the wake of global warming is one of the most serious issues, which is being discussed across the world. Over the last 100 years, global sea level rose by 1.0–2.5 mm/y. Present estimates of future sea-level rise induced by climate change range from 28 to 98 cm for the year 2100. It has been estimated that a 1-m rise in sea-level could displace nearly 7 million people from their homes in India. The climate change and associated sea level rise is proclaimed to be a serious threat especially to the low lying coastal areas. Thus, study of long term effects on an estuarine region not only gives opportunity for identifying the vulnerable areas but also gives a clue to the periods where the sea level rise was significant and verifies climate change impact on sea level rise. Multi-temporal remote sensing data and GIS tools are often used to study the pattern of erosion/ accretion in an area and to predict the future coast lines. The prese... |
passage: The 2015 Paris Agreement on Climate Change implicitly requires phasing out fossil fuels; such a phase out may cost hundreds of trillions of dollars and induce widespread socio-ecological ramifications. The COVID-19 'pancession' (pandemic + recession) has rattled global economies, possibly accelerating the fossil fuel phase out. This raises the question: What opportunities has COVID-19 presented to phase out fossil fuels, and subsequently, how can transformative recovery efforts be designed to utilize these opportunities and promote social, ecological and relational inclusiveness? We find that: (a) the COVID-19 pancession provides a unique opportunity to accelerate climate action, as it has devalued financial assets, stunned fossil fuel production and paralyzed relevant infrastructure, thus easing the pathway towards stranding global fossil fuel resources and assets; (b) four possible post-pancession recovery scenarios may unravel, of which only one is ecologically, socially an... |
Loss: MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false
}
Non-default hyperparameters:
eval_strategy: epoch
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
learning_rate: 2e-05
warmup_ratio: 0.1
fp16: True
load_best_model_at_end: True
batch_sampler: no_duplicates

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: epoch
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 3
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss | Validation Loss | claims-abstracts-dev_cosine_accuracy |
|---|---|---|---|---|
| -1 | -1 | - | - | 0.9333 |
| 0.5319 | 100 | 1.1308 | - | - |
| 1.0 | 188 | - | 0.2670 | 0.9667 |
| 1.0638 | 200 | 0.3978 | - | - |
| 1.5957 | 300 | 0.2429 | - | - |
| 2.0 | 376 | - | 0.1914 | 0.9667 |
| 2.1277 | 400 | 0.2063 | - | - |
| 2.6596 | 500 | 0.1328 | - | - |
| 3.0 | 564 | - | 0.1797 | 0.9667 |
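The log ends at step 564, and the hyperparameters specify a linear scheduler with a learning rate of 2e-05 and a warmup ratio of 0.1. The sketch below shows roughly how the learning rate would evolve under those settings; it assumes the warmup step count is the ratio times the total steps rounded up, which may not match the trainer exactly, and `linear_schedule_lr` is a hypothetical helper name.

```python
import math

def linear_schedule_lr(step, total_steps=564, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    # Assumption: warmup length is rounded up to a whole number of steps.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)

# Learning rate at the steps where the training loss was logged.
for logged_step in (100, 200, 300, 400, 500):
    print(logged_step, linear_schedule_lr(logged_step))
```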
Carbon emissions were measured using CodeCarbon.
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}