models mistralai Mistral 7B v01
The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks tested.
For full details of this model, please read the paper and the release blog post.
Mistral-7B-v0.1 is a transformer model, with the following architecture choices:
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer
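The three architecture choices listed above can be illustrated with a small pure-Python sketch. This is not Mistral's implementation: the function names are hypothetical, the window size and head counts in the usage note are illustrative values chosen for readability, and the byte-fallback function works on single characters rather than real BPE merges.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Causal sliding-window mask.

    mask[i][j] is True when query position i may attend to key position j:
    j must not be in the future (j <= i) and must lie within the last
    `window` positions (i - j < window).
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Grouped-query attention: consecutive query heads share one key/value head."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


def byte_fallback_tokens(text: str, vocab: set[str]) -> list[str]:
    """Simplified byte fallback (character level, no BPE merges).

    Characters found in `vocab` are kept as tokens; any other character is
    replaced by one token per UTF-8 byte, written as <0xNN>.
    """
    tokens: list[str] = []
    for ch in text:
        if ch in vocab:
            tokens.append(ch)
        else:
            tokens.extend(f"<0x{b:02X}>" for b in ch.encode("utf-8"))
    return tokens
```

With a window of 2, position 2 in `sliding_window_mask(4, 2)` can attend to positions 1 and 2 but no longer to position 0. With 32 query heads and 8 key/value heads, `kv_head_for_query_head` assigns query heads 4 to 7 to key/value head 1. A character missing from the vocabulary is never mapped to an unknown token, because it can always be spelled out as bytes.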
Beyond surpassing Llama 2 13B on every evaluated benchmark, Mistral 7B v0.1 also outperforms Llama 1 34B in reasoning, mathematics, and code generation tasks.
Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.
Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
---|---|---|---|---|
Text Generation | question-answering | truthful_qa | abstractive_qna_with_text_gen.ipynb | text-generation.sh |
Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
---|---|---|---|---|
Text generation | Text generation | cnn_dailymail | evaluate-model-text-generation.ipynb | evaluate-model-text-generation.yml |
Inference type | Python sample (Notebook) | CLI with YAML |
---|---|---|
Real time | text-generation-online-endpoint.ipynb | text-generation-online-endpoint.sh |
Batch | text-generation-batch-endpoint.ipynb | coming soon |
{
  "input_data": {
    "input_string": [
      "What is your favourite condiment?",
      "Do you have mayonnaise recipes?"
    ],
    "parameters": {
      "max_new_tokens": 100,
      "do_sample": true,
      "return_full_text": false
    }
  }
}
[
  {
    "0": "\n\nMayonnaise - can't be beat.\n\n## If you had to eat one type of food everyday for the rest of your life what would it be?\n\nMango. I'm an avid fruit and vegetable eater.\n\n## What is your favourite fruit and/or vegetable?\n\nMango! I eat an acre of these a year, which is almost two pounds a day.\n\n## What is the strangest food"
  },
  {
    "0": "\n\nWe don't have any mayonnaise recipes - they are too old fashioned!\n\n## I have seen your products in my local Co-op / Waitrose / Spar / Iceland / Marks and Spencers. Where can I buy more?\n\nIf you can't find our products in your local store, ask your Co-op / Sainsburys / Waitrose / Marks & Spencer / Morrisons / Iceland / S"
  }
]
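The request and response shown above can be produced and consumed with a few lines of Python. The following is a minimal sketch using only the standard library. The helper names (`build_payload`, `extract_generations`, `score`) are hypothetical, and the endpoint URL, API key, and bearer-token header passed to `score` are assumptions about how a deployed real-time endpoint is called; they are not taken from this page.

```python
import json
import urllib.request


def build_payload(prompts, max_new_tokens=100, do_sample=True, return_full_text=False):
    """Build a request body in the same shape as the sample input."""
    return {
        "input_data": {
            "input_string": list(prompts),
            "parameters": {
                "max_new_tokens": max_new_tokens,
                "do_sample": do_sample,
                "return_full_text": return_full_text,
            },
        }
    }


def extract_generations(response_body: str) -> list[str]:
    """Pull the generated text out of a response shaped like the sample output.

    The sample output is a JSON list with one object per prompt, each holding
    the generated text under the key "0".
    """
    return [item["0"] for item in json.loads(response_body)]


def score(endpoint_url: str, api_key: str, payload: dict) -> str:
    """POST the payload to a scoring URL and return the raw response body.

    Assumption: the endpoint accepts JSON and key-based bearer authentication.
    """
    request = urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Because `do_sample` is `true` in the sample request, repeated calls with the same prompts will generally return different text, so the sample output should be read as one possible completion rather than a reference answer.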
Version: 17
Featured
Preview
SharedComputeCapacityEnabled
hiddenlayerscanned
huggingface_model_id : mistralai/Mistral-7B-v0.1
evaluation_compute_allow_list : ['Standard_NC6s_v3', 'Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_NC24rs_v3', 'Standard_ND40rs_v2', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96amsr_A100_v4', 'Standard_ND96asr_v4']
batch_compute_allow_list : ['Standard_ND40rs_v2', 'Standard_NC24ads_A100_v4', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96amsr_A100_v4', 'Standard_ND96asr_v4']
inference_compute_allow_list : ['Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_ND40rs_v2', 'Standard_NC24ads_A100_v4', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96amsr_A100_v4', 'Standard_ND96asr_v4']
finetune_compute_allow_list : ['Standard_ND40rs_v2', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96amsr_A100_v4', 'Standard_ND96asr_v4']
model_specific_defaults : ordereddict({'precision': '16', 'deepspeed_stage': '2', 'apply_deepspeed': 'true', 'apply_ort': 'true', 'apply_lora': 'true', 'ignore_mismatched_sizes': 'false'})
inference_supported_envs : ['vllm', 'ds_mii']
license : apache-2.0
task : text-generation
author : Mistral
View in Studio: https://ml.azure.com/registries/azureml/models/mistralai-Mistral-7B-v01/version/17
License: apache-2.0
SharedComputeCapacityEnabled: True
SHA: 26bca36bde8333b5d7f72e9ed20ccda6a618af24
inference-min-sku-spec: 12|1|220|64
inference-recommended-sku: Standard_NC12s_v3, Standard_NC24s_v3, Standard_ND40rs_v2, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96amsr_A100_v4, Standard_ND96asr_v4
evaluation-min-sku-spec: 6|1|112|128
evaluation-recommended-sku: Standard_NC6s_v3, Standard_NC12s_v3, Standard_NC24s_v3, Standard_NC24rs_v3, Standard_ND40rs_v2, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96amsr_A100_v4, Standard_ND96asr_v4
finetune-min-sku-spec: 40|2|440|128
finetune-recommended-sku: Standard_ND40rs_v2, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96amsr_A100_v4, Standard_ND96asr_v4
finetuning-tasks: text-generation, text-classification
languages: EN
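The `*-min-sku-spec` values above are pipe-separated integers. A small parser makes them easier to compare against candidate compute SKUs. This is a sketch under a stated assumption: the field order (CPU cores, GPUs, memory in GB, storage in GB) is inferred from the recommended SKU lists and is not documented on this page, so the field names in `MinSkuSpec` are hypothetical.

```python
from typing import NamedTuple


class MinSkuSpec(NamedTuple):
    # Assumed field order; this page does not define it.
    cpu_cores: int
    gpus: int
    memory_gb: int
    storage_gb: int


def parse_min_sku_spec(spec: str) -> MinSkuSpec:
    """Parse a value such as "12|1|220|64" into four named integers."""
    parts = spec.split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 pipe-separated fields, got {len(parts)}: {spec!r}")
    return MinSkuSpec(*(int(part) for part in parts))
```

For example, `parse_min_sku_spec("40|2|440|128").gpus` evaluates to `2` for the fine-tuning spec listed above.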