SentenceTransformer based on unsloth/embeddinggemma-300m

This is a sentence-transformers model finetuned from unsloth/embeddinggemma-300m on the generator dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: unsloth/embeddinggemma-300m
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Supported Modality: Text
  • Training Dataset:
    • generator

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'PeftModelForFeatureExtraction'})
  (1): Pooling({'embedding_dimension': 768, 'pooling_mode': 'mean', 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity', 'module_input_name': 'sentence_embedding', 'module_output_name': 'sentence_embedding'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity', 'module_input_name': 'sentence_embedding', 'module_output_name': 'sentence_embedding'})
  (4): Normalize({})
)
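
The module stack above reads as a short pipeline: mean-pool the token embeddings, apply two bias-free linear projections with identity activation (768 → 3072 → 768), then L2-normalize. The following is a minimal pure-Python sketch of that data flow, not the library's implementation. The function names, the toy sizes (4 → 8 → 4 instead of 768 → 3072 → 768) and all weights and inputs are made up for illustration.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def dense_no_bias(vector, weight):
    """Bias-free linear layer with identity activation; weight is [out][in]."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]

def l2_normalize(vector):
    """Scale the vector to unit length, as the Normalize module does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def embed(token_embeddings, attention_mask, w_up, w_down):
    """Pooling -> Dense (up) -> Dense (down) -> Normalize."""
    pooled = mean_pool(token_embeddings, attention_mask)
    expanded = dense_no_bias(pooled, w_up)
    projected = dense_no_bias(expanded, w_down)
    return l2_normalize(projected)

# Toy stand-ins for the real sizes and learned weights (all hypothetical).
HIDDEN, WIDE = 4, 8
w_up = [[0.1 * ((r + c) % 3 + 1) for c in range(HIDDEN)] for r in range(WIDE)]
w_down = [[0.05 * ((r * c) % 4 + 1) for c in range(WIDE)] for r in range(HIDDEN)]
tokens = [[1.0, 0.0, 2.0, 1.0], [3.0, 2.0, 0.0, 1.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]  # the third token is padding and is excluded from the mean

sentence_embedding = embed(tokens, mask, w_up, w_down)
```

Because the last module normalizes to unit length, the cosine similarity of two embeddings equals their dot product.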

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("borntobeignored/embeddinggemma_lora")
# Run inference
queries = [
    'What was the debt-to-equity ratio for Republic Services in 2017 based on the actual asset allocation of 70% debt securities and 30% equity securities as of December 31, 2017?',
]
documents = [
    'republic services , inc .\nnotes to consolidated financial statements 2014 ( continued ) we determine the discount rate used in the measurement of our obligations based on a model that matches the timing and amount of expected benefit payments to maturities of high quality bonds priced as of the plan measurement date .\nwhen that timing does not correspond to a published high-quality bond rate , our model uses an expected yield curve to determine an appropriate current discount rate .\nthe yields on the bonds are used to derive a discount rate for the liability .\nthe term of our obligation , based on the expected retirement dates of our workforce , is approximately seven years .\nin developing our expected rate of return assumption , we have evaluated the actual historical performance and long-term return projections of the plan assets , which give consideration to the asset mix and the anticipated timing of the plan outflows .\nwe employ a total return investment approach whereby a mix of equity and fixed income investments are used to maximize the long-term return of plan assets for what we consider a prudent level of risk .\nthe intent of this strategy is to minimize plan expenses by outperforming plan liabilities over the long run .\nrisk tolerance is established through careful consideration of plan liabilities , plan funded status and our financial condition .\nthe investment portfolio contains a diversified blend of equity and fixed income investments .\nfurthermore , equity investments are diversified across u.s .\nand non-u.s .\nstocks as well as growth , value , and small and large capitalizations .\nderivatives may be used to gain market exposure in an efficient and timely manner ; however , derivatives may not be used to leverage the portfolio beyond the market value of the underlying investments .\ninvestment risk is measured and monitored on an ongoing basis through annual liability measurements , periodic asset and liability studies , and 
quarterly investment portfolio reviews .\nthe following table summarizes our target asset allocation for 2017 and actual asset allocation as of december 31 , 2017 and 2016 for our plan : target allocation actual allocation actual allocation ._|    |                   | targetassetallocation   | 2017actualassetallocation   | 2016actualassetallocation   |\n|---:|:------------------|:------------------------|:----------------------------|:----------------------------|\n|  0 | debt securities   | 72% ( 72 % )            | 70% ( 70 % )                | 72% ( 72 % )                |\n|  1 | equity securities | 28                      | 30                          | 28                          |\n|  2 | total             | 100% ( 100 % )          | 100% ( 100 % )              | 100% ( 100 % )              |_for 2018 , the investment strategy for pension plan assets is to maintain a broadly diversified portfolio designed to achieve our target of an average long-term rate of return of 5.36% ( 5.36 % ) .\nwhile we believe we can achieve a long- term average return of 5.36% ( 5.36 % ) , we cannot be certain that the portfolio will perform to our expectations .\nassets are strategically allocated among debt and equity portfolios to achieve a diversification level that reduces fluctuations in investment returns .\nasset allocation target ranges and strategies are reviewed periodically with the assistance of an independent external consulting firm. .',
    'note 17 .\naccumulated other comprehensive losses : pmi\'s accumulated other comprehensive losses , net of taxes , consisted of the following: ._|    | ( losses ) earnings ( in millions )          | ( losses ) earnings 2015   | ( losses ) earnings 2014   | 2013             |\n|---:|:---------------------------------------------|:---------------------------|:---------------------------|:-----------------|\n|  0 | currency translation adjustments             | $ -6129 ( 6129 )           | $ -3929 ( 3929 )           | $ -2207 ( 2207 ) |\n|  1 | pension and other benefits                   | -3332 ( 3332 )             | -3020 ( 3020 )             | -2046 ( 2046 )   |\n|  2 | derivatives accounted for as hedges          | 59                         | 123                        | 63               |\n|  3 | total accumulated other comprehensive losses | $ -9402 ( 9402 )           | $ -6826 ( 6826 )           | $ -4190 ( 4190 ) |_reclassifications from other comprehensive earnings the movements in accumulated other comprehensive losses and the related tax impact , for each of the components above , that are due to current period activity and reclassifications to the income statement are shown on the consolidated statements of comprehensive earnings for the years ended december 31 , 2015 , 2014 , and 2013 .\nthe movement in currency translation adjustments for the year ended december 31 , 2013 , was also impacted by the purchase of the remaining shares of the mexican tobacco business .\nin addition , $ 1 million , $ 5 million and $ 12 million of net currency translation adjustment gains were transferred from other comprehensive earnings to marketing , administration and research costs in the consolidated statements of earnings for the years ended december 31 , 2015 , 2014 and 2013 , respectively , upon liquidation of subsidiaries .\nfor additional information , see note 13 .\nbenefit plans and note 15 .\nfinancial instruments for disclosures related to pmi\'s pension 
and other benefits and derivative financial instruments .\nnote 18 .\ncolombian investment and cooperation agreement : on june 19 , 2009 , pmi announced that it had signed an agreement with the republic of colombia , together with the departments of colombia and the capital district of bogota , to promote investment and cooperation with respect to the colombian tobacco market and to fight counterfeit and contraband tobacco products .\nthe investment and cooperation agreement provides $ 200 million in funding to the colombian governments over a 20-year period to address issues of mutual interest , such as combating the illegal cigarette trade , including the threat of counterfeit tobacco products , and increasing the quality and quantity of locally grown tobacco .\nas a result of the investment and cooperation agreement , pmi recorded a pre-tax charge of $ 135 million in the operating results of the latin america & canada segment during the second quarter of 2009 .\nat december 31 , 2015 and 2014 , pmi had $ 73 million and $ 71 million , respectively , of discounted liabilities associated with the colombian investment and cooperation agreement .\nthese discounted liabilities are primarily reflected in other long-term liabilities on the consolidated balance sheets and are expected to be paid through 2028 .\nnote 19 .\nrbh legal settlement : on july 31 , 2008 , rothmans inc .\n( "rothmans" ) announced the finalization of a cad 550 million settlement ( or approximately $ 540 million , based on the prevailing exchange rate at that time ) between itself and rothmans , benson & hedges inc .\n( "rbh" ) , on the one hand , and the government of canada and all 10 provinces , on the other hand .\nthe settlement resolved the royal canadian mounted police\'s investigation relating to products exported from canada by rbh during the 1989-1996 period .\nrothmans\' sole holding was a 60% ( 60 % ) interest in rbh .\nthe remaining 40% ( 40 % ) interest in rbh was owned by pmi. .',
    "the aes corporation notes to consolidated financial statements december 31 , 2016 , 2015 , and 2014 the following table summarizes the company's redeemable stock of subsidiaries balances as of the periods indicated ( in millions ) : ._|    | december 31,                           | 2016   | 2015   |\n|---:|:---------------------------------------|:-------|:-------|\n|  0 | ipalco common stock                    | $ 618  | $ 460  |\n|  1 | colon quotas ( 1 )                     | 100    | 2014   |\n|  2 | ipl preferred stock                    | 60     | 60     |\n|  3 | other common stock                     | 4      | 2014   |\n|  4 | dpl preferred stock                    | 2014   | 18     |\n|  5 | total redeemable stock of subsidiaries | $ 782  | $ 538  |______________________________ ( 1 ) characteristics of quotas are similar to common stock .\ncolon 2014 during the year ended december 31 , 2016 , our partner in colon increased their ownership from 25% ( 25 % ) to 49.9% ( 49.9 % ) and made capital contributions of $ 106 million .\nany subsequent adjustments to allocate earnings and dividends to our partner , or measure the investment at fair value , will be classified as temporary equity each reporting period as it is probable that the shares will become redeemable .\nipl 2014 ipl had $ 60 million of cumulative preferred stock outstanding at december 31 , 2016 and 2015 , which represented five series of preferred stock .\nthe total annual dividend requirements were approximately $ 3 million at december 31 , 2016 and 2015 .\ncertain series of the preferred stock were redeemable solely at the option of the issuer at prices between $ 100 and $ 118 per share .\nholders of the preferred stock are entitled to elect a majority of ipl's board of directors if ipl has not paid dividends to its preferred stockholders for four consecutive quarters .\nbased on the preferred stockholders' ability to elect a majority of ipl's board of directors in this circumstance , 
the redemption of the preferred shares is considered to be not solely within the control of the issuer and the preferred stock is considered temporary equity .\ndpl 2014 dpl had $ 18 million of cumulative preferred stock outstanding as of december 31 , 2015 , which represented three series of preferred stock issued by dp&l , a wholly-owned subsidiary of dpl .\nthe dp&l preferred stock was redeemable at dp&l's option as determined by its board of directors at per-share redemption prices between $ 101 and $ 103 per share , plus cumulative preferred dividends .\nin addition , dp&l's amended articles of incorporation contained provisions that permitted preferred stockholders to elect members of the dp&l board of directors in the event that cumulative dividends on the preferred stock are in arrears in an aggregate amount equivalent to at least four full quarterly dividends .\nbased on the preferred stockholders' ability to elect members of dp&l's board of directors in this circumstance , the redemption of the preferred shares was considered to be not solely within the control of the issuer and the preferred stock was considered temporary equity .\nin september 2016 , it became probable that the preferred shares would become redeemable .\nas such , the company recorded an adjustment of $ 5 million to retained earnings to adjust the preferred shares to their redemption value of $ 23 million .\nin october 2016 , dp&l redeemed all of its preferred shares .\nupon redemption , the preferred shares were no longer outstanding and all rights of the holders thereof as shareholders of dp&l ceased to exist .\nipalco 2014 in february 2015 , cdpq purchased 15% ( 15 % ) of aes us investment , inc. 
, a wholly-owned subsidiary that owns 100% ( 100 % ) of ipalco , for $ 247 million , with an option to invest an additional $ 349 million in ipalco through 2016 in exchange for a 17.65% ( 17.65 % ) equity stake .\nin april 2015 , cdpq invested an additional $ 214 million in ipalco , which resulted in cdpq's combined direct and indirect interest in ipalco of 24.90% ( 24.90 % ) .\nas a result of these transactions , $ 84 million in taxes and transaction costs were recognized as a net decrease to equity .\nthe company also recognized an increase to additional paid-in capital and a reduction to retained earnings of 377 million for the excess of the fair value of the shares over their book value .\nno gain or loss was recognized in net income as the transaction was not considered to be a sale of in-substance real estate .\nin march 2016 , cdpq exercised its remaining option by investing $ 134 million in ipalco , which resulted in cdpq's combined direct and indirect interest in ipalco of 30% ( 30 % ) .\nthe company also recognized an increase to additional paid-in capital and a reduction to retained earnings of $ 84 million for the excess of the fair value of the shares over their book value .\nin june 2016 , cdpq contributed an additional $ 24 million to ipalco , with no impact to the ownership structure of the investment .\nany subsequent adjustments to allocate earnings and dividends to cdpq will be classified as nci within permanent equity as it is not probable that the shares will become redeemable. .",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.5397, -0.0544,  0.0456]])
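
For retrieval, the similarity row is usually turned into a ranking. The sketch below does that for the scores printed above, copied in as plain floats so it runs without downloading the model. The helper name `rank_documents` is hypothetical.

```python
# Scores copied from the example output above (one query, three documents).
similarity_row = [0.5397, -0.0544, 0.0456]

def rank_documents(scores):
    """Return document indices sorted from most to least similar."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

ranking = rank_documents(similarity_row)
best_index = ranking[0]
print(ranking)  # [0, 2, 1]
```

The Republic Services passage (index 0) ranks first for the Republic Services query, well ahead of the two unrelated filings.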

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.2412
cosine_accuracy@3 0.6206
cosine_accuracy@5 0.7973
cosine_accuracy@10 0.9049
cosine_precision@5 0.1595
cosine_precision@10 0.0905
cosine_recall@5 0.7973
cosine_recall@10 0.9049
cosine_ndcg@10 0.5702
cosine_mrr@10 0.4624
cosine_map@100 0.4671
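
The table is consistent with each query having exactly one relevant document: recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k (0.7973 / 5 ≈ 0.1595, 0.9049 / 10 ≈ 0.0905). Under that assumption, the metrics can be sketched from the rank of the positive document alone. This is an illustration, not the evaluator used to produce the numbers above; the function name and the example ranks are made up.

```python
def retrieval_metrics(positive_ranks, k):
    """Metrics for queries that each have exactly one relevant document.

    positive_ranks holds the 1-based rank of the relevant document for
    each query, or None when it was not retrieved at all.
    """
    n = len(positive_ranks)
    hits = [r is not None and r <= k for r in positive_ranks]
    accuracy = sum(hits) / n
    reciprocal_ranks = [1.0 / r for r, hit in zip(positive_ranks, hits) if hit]
    return {
        "accuracy": accuracy,        # share of queries with a hit in the top k
        "precision": accuracy / k,   # one relevant document out of k retrieved
        "recall": accuracy,          # the single relevant document was found or not
        "mrr": sum(reciprocal_ranks) / n,
    }

# Hypothetical ranks for five queries; the fourth query misses the top 10.
example_ranks = [1, 3, 2, 12, 1]
metrics_at_10 = retrieval_metrics(example_ranks, k=10)
```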

Training Details

Training Dataset

generator

  • Dataset: generator
  • Size: 6,251 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 100 samples:
                anchor          positive
    type        string          string
    modality    text            text
    min         23 tokens       339 tokens
    mean        46.41 tokens    1030.87 tokens
    max         81 tokens       3198 tokens
  • Samples:
    anchor positive
    What was Analog Devices reported interest expense for fiscal year 2009, as listed in its annual report? interest rate to a variable interest rate based on the three-month libor plus 2.05% ( 2.05 % ) ( 2.34% ( 2.34 % ) as of october 31 , 2009 ) .
    if libor changes by 100 basis points , our annual interest expense would change by $ 3.8 million .
    foreign currency exposure as more fully described in note 2i .
    in the notes to consolidated financial statements contained in item 8 of this annual report on form 10-k , we regularly hedge our non-u.s .
    dollar-based exposures by entering into forward foreign currency exchange contracts .
    the terms of these contracts are for periods matching the duration of the underlying exposure and generally range from one month to twelve months .
    currently , our largest foreign currency exposure is the euro , primarily because our european operations have the highest proportion of our local currency denominated expenses .
    relative to foreign currency exposures existing at october 31 , 2009 and november 1 , 2008 , a 10% ( 10 % ) unfavorable movement in foreign cur...
    During the fiscal year ended March 31, 2012, for Abiomed, Inc., did the stock-based compensation expense of $3.3 million for equity awards where prescribed performance milestones were achieved or deemed probable exceed the total fair value of restricted stock and restricted stock units vested during that year? abiomed , inc .
    and subsidiaries notes to consolidated financial statements 2014 ( continued ) note 8 .
    stock award plans and stock-based compensation ( continued ) restricted stock and restricted stock units the following table summarizes restricted stock and restricted stock unit activity for the fiscal year ended march 31 , 2012 : number of shares ( in thousands ) weighted average grant date fair value ( per share ) ._| | | number of shares ( in thousands ) | weighted average grant date fair value ( per share ) |
    |---:|:-----------------------------------------------------------------|:------------------------------------|:-------------------------------------------------------|
    | 0 | restricted stock and restricted stock units at beginning of year | 407 | $ 9.84 |
    | 1 | granted ...
    What were the total operating expenses for American Airlines Group in 2018, as reflected in the table detailing annual aircraft fuel consumption and costs? the following table shows annual aircraft fuel consumption and costs , including taxes , for our mainline and regional operations for 2018 , 2017 and 2016 ( gallons and aircraft fuel expense in millions ) .
    year gallons average price per gallon aircraft fuel expense percent of total operating expenses ._| | year | gallons | average priceper gallon | aircraft fuelexpense | percent of totaloperating expenses |
    |---:|-------:|----------:|:--------------------------|:-----------------------|:-------------------------------------|
    | 0 | 2018 | 4447 | $ 2.23 | $ 9896 | 23.6% ( 23.6 % ) |
    | 1 | 2017 | 4352 | 1.73 | 7510 | 19.6% ( 19.6 % ) |
    | 2 | 2016 | 4347 | 1.42 | 6180 | 17.6% ( 17.6 % ) |_as of december 31 , 2018 , we did not have any fuel hedging contracts outstanding to hedge our ...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false,
        "directions": [
            "query_to_doc"
        ],
        "partition_mode": "joint",
        "hardness_mode": null,
        "hardness_strength": 0.0
    }
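
With "scale": 20.0, "similarity_fct": "cos_sim" and the "query_to_doc" direction, the loss treats every other positive in the batch as a negative for a given anchor and applies cross-entropy over the scaled cosine similarities. The following is a minimal pure-Python sketch of that idea, not the Sentence Transformers implementation. It ignores the partition and hardness settings, and the function names and toy vectors are made up.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch-negatives cross-entropy, query-to-document direction only."""
    losses = []
    for i, anchor in enumerate(anchors):
        # One logit per document in the batch; document i is the true match.
        logits = [scale * cos_sim(anchor, doc) for doc in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two pairs (hypothetical 2-dimensional embeddings).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.2, 0.8]]
swapped = [aligned[1], aligned[0]]

loss_aligned = mnr_loss(anchors, aligned)  # positives match their anchors
loss_swapped = mnr_loss(anchors, swapped)  # positives paired with the wrong anchor
```

In this sketch, a batch of size n provides n - 1 negatives per anchor.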
    

Evaluation Dataset

generator

  • Dataset: generator
  • Size: 883 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 100 samples:
                anchor          positive
    type        string          string
    modality    text            text
    min         25 tokens       333 tokens
    mean        44.78 tokens    1101.61 tokens
    max         83 tokens       2180 tokens
  • Samples:
    anchor positive
    What was the average payment volume per transaction for American Express in 2007, based on its reported payments volume and total number of transactions? largest operators of open-loop and closed-loop retail electronic payments networks the largest operators of open-loop and closed-loop retail electronic payments networks are visa , mastercard , american express , discover , jcb and diners club .
    with the exception of discover , which primarily operates in the united states , all of the other network operators can be considered multi- national or global providers of payments network services .
    based on payments volume , total volume , number of transactions and number of cards in circulation , visa is the largest retail electronic payments network in the world .
    the following chart compares our network with those of our major competitors for calendar year 2007 : company payments volume volume transactions cards ( billions ) ( billions ) ( billions ) ( millions ) visa inc. ( 1 ) . . . . . $ 2457 $ 3822 50.3 1592 ._| | company | payments volume ( billions ) ...
    What was the percentage cumulative total return for Citi's common stock over the five-year period ended December 31, 2017, as reflected in the performance graph comparison? performance graph comparison of five-year cumulative total return the following graph and table compare the cumulative total return on citi 2019s common stock , which is listed on the nyse under the ticker symbol 201cc 201d and held by 65691 common stockholders of record as of january 31 , 2018 , with the cumulative total return of the s&p 500 index and the s&p financial index over the five-year period through december 31 , 2017 .
    the graph and table assume that $ 100 was invested on december 31 , 2012 in citi 2019s common stock , the s&p 500 index and the s&p financial index , and that all dividends were reinvested .
    comparison of five-year cumulative total return for the years ended date citi s&p 500 financials ._| | date | citi | s&p 500 | s&p financials |
    |---:|:------------|-------:|----------:|-----------------:|
    | 0 | 31-dec-2012 | 100 | 100 | 100 |
    | 1 | 31-dec-2013 | 131.8 | 132.4 | 135.6 |
    | 2 | 31-dec-2014 | 137 | ...
    What percentage of Devon Energy's estimated total oil and gas production in MMBOE for 2008 comes from Canadian operations, given that the total estimated production is 243 MMBOE and Canadian operations are estimated to produce 60 MMBOE? the acquisition date is on or after the beginning of the first annual reporting period beginning on or after december 15 , 2008 .
    we will evaluate how the new requirements of statement no .
    141 ( r ) would impact any business combinations completed in 2009 or thereafter .
    in december 2007 , the fasb also issued statement of financial accounting standards no .
    160 , noncontrolling interests in consolidated financial statements 2014an amendment of accounting research bulletin no .
    51 .
    a noncontrolling interest , sometimes called a minority interest , is the portion of equity in a subsidiary not attributable , directly or indirectly , to a parent .
    statement no .
    160 establishes accounting and reporting standards for the noncontrolling interest in a subsidiary and for the deconsolidation of a subsidiary .
    under statement no .
    160 , noncontrolling interests in a subsidiary must be reported as a component of consolidated equity separate from the parent 2019s equity .
    additionally , the amo...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false,
        "directions": [
            "query_to_doc"
        ],
        "partition_mode": "joint",
        "hardness_mode": null,
        "hardness_strength": 0.0
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 4
  • gradient_accumulation_steps: 8
  • learning_rate: 0.0002
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • bf16: True
  • gradient_checkpointing: unsloth
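
The batch size and gradient accumulation settings combine into an effective batch size, which in turn fixes how many optimizer updates one epoch takes. The sketch below works through that arithmetic. It assumes a single training device, which the card does not state, and all variable names are illustrative. Under that assumption, the computed epoch fractions match the Epoch column in the Training Logs section (0.2559, 0.5118, 0.7678 at steps 50, 100, 150).

```python
import math

train_samples = 6251     # from the Training Dataset section
per_device_batch = 4     # per_device_train_batch_size
grad_accum_steps = 8     # gradient_accumulation_steps
num_devices = 1          # assumption: not stated in the card

# Samples contributing to each optimizer update.
effective_batch = per_device_batch * grad_accum_steps * num_devices

# Forward/backward batches in one pass over the data.
batches_per_epoch = math.ceil(train_samples / (per_device_batch * num_devices))

# Optimizer updates per epoch, kept fractional to compute epoch progress.
optimizer_steps_per_epoch = batches_per_epoch / grad_accum_steps

def epoch_at(step):
    """Fraction of the epoch completed after `step` optimizer updates."""
    return round(step / optimizer_steps_per_epoch, 4)

print(effective_batch)                          # 32
print(epoch_at(50), epoch_at(100), epoch_at(150))
```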

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 8
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0002
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: unsloth
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss cosine_ndcg@10
-1 -1 - 0.1117
0.2559 50 0.1109 -
0.5118 100 0.0218 -
0.7678 150 0.0112 -
-1 -1 - 0.5702

Training Time

  • Training: 18.2 minutes

Framework Versions

  • Python: 3.11.15
  • Sentence Transformers: 5.5.1
  • Transformers: 4.56.2
  • PyTorch: 2.11.0+cu130
  • Accelerate: 1.14.0
  • Datasets: 4.3.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{oord2019representationlearningcontrastivepredictive,
      title={Representation Learning with Contrastive Predictive Coding},
      author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
      year={2019},
      eprint={1807.03748},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/1807.03748},
}