Skip to content

models sshleifer distilbart cnn 12 6

github-actions[bot] edited this page Oct 27, 2023 · 20 revisions

sshleifer-distilbart-cnn-12-6

Overview

The RoBERTa Large model is a large transformer-based language model that was developed by the Hugging Face team. It is pre-trained on masked language modeling and can be used for tasks such as sequence classification, token classification, or question answering. Its primary usage is as a fine-tuning tool and is case-sensitive. Additionally, there are metrics provided for DistilBART models, including the number of parameters, inference time, speedup, Rouge 2, and Rouge-L. The distilbart-xsum-12-6 model is recommended with 306 million parameters, 137 milliseconds inference time, 1.68 speedup, 22.12 Rouge 2, and 36.99 Rouge-L.

The above summary was generated using ChatGPT. Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.

Inference samples

Inference type Python sample (Notebook) CLI with YAML
Real time summarization-online-endpoint.ipynb summarization-online-endpoint.sh
Batch summarization-batch-endpoint.ipynb coming soon

Finetuning samples

Task Use case Dataset Python sample (Notebook) CLI with YAML
Summarization News Summary CNN DailyMail news-summary.ipynb news-summary.sh
Translation Translate English to Romanian WMT16 translate-english-to-romanian.ipynb translate-english-to-romanian.sh

Model Evaluation

Task Use case Dataset Python sample (Notebook) CLI with YAML
Summarization Summarization cnn_dailymail evaluate-model-summarization.ipynb evaluate-model-summarization.yml

Sample inputs and outputs (for real-time inference)

Sample input

{
    "inputs": {
        "input_string": ["The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct.", "Paris is the capital and most populous city of France, with an estimated population of 2,175,601 residents as of 2018, in an area of more than 105 square kilometres (41 square miles). The City of Paris is the centre and seat of government of the region and province of Île-de-France, or Paris Region, which has an estimated population of 12,174,880, or about 18 percent of the population of France as of 2017."]
    }
}

Sample output

[
    {
        "0": " The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building . It was the first structure to reach a height of 300 metres . Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct ."
    },
    {
        "0": " Paris is the capital and most populous city of France, with an estimated population of 2,175,601 residents as of 2018 . City of Paris is centre and seat of government of the region and province of Île-de-France, or Paris Region, which has an estimated 12,174,880, or about 18 percent"
    }
]

Version: 9

Tags

Preview computes_allow_list : ['Standard_NV12s_v3', 'Standard_NV24s_v3', 'Standard_NV48s_v3', 'Standard_NC6s_v3', 'Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_NC24rs_v3', 'Standard_NC6s_v2', 'Standard_NC12s_v2', 'Standard_NC24s_v2', 'Standard_NC24rs_v2', 'Standard_NC4as_T4_v3', 'Standard_NC8as_T4_v3', 'Standard_NC16as_T4_v3', 'Standard_NC64as_T4_v3', 'Standard_ND6s', 'Standard_ND12s', 'Standard_ND24s', 'Standard_ND24rs', 'Standard_ND40rs_v2', 'Standard_ND96asr_v4'] license : apache-2.0 model_specific_defaults : ordereddict({'apply_deepspeed': 'true', 'apply_lora': 'true', 'apply_ort': 'true'}) task : text-summarization

View in Studio: https://ml.azure.com/registries/azureml/models/sshleifer-distilbart-cnn-12-6/version/9

License: apache-2.0

Properties

SHA: a4f8f3ea906ed274767e9906dbaede7531d660ff

datasets: cnn_dailymail, xsum

evaluation-min-sku-spec: 8|0|28|56

evaluation-recommended-sku: Standard_DS4_v2

finetune-min-sku-spec: 4|1|28|176

finetune-recommended-sku: Standard_NC24rs_v3

finetuning-tasks: summarization, translation

inference-min-sku-spec: 2|0|7|14

inference-recommended-sku: Standard_DS2_v2, Standard_D2a_v4, Standard_D2as_v4, Standard_DS3_v2, Standard_D4a_v4, Standard_D4as_v4, Standard_DS4_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_DS5_v2, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_F4s_v2, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E2s_v3, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4, Standard_ND40rs_v2

languages: en

Clone this wiki locally