roberta-large-openai-detector

Overview

RoBERTa Large OpenAI Detector is a transformer-based sequence classifier developed by OpenAI to detect text generated by GPT-2 models. It is based on RoBERTa Large and fine-tuned on outputs of the 1.5B-parameter GPT-2 model. The model reaches approximately 95% accuracy at detecting text generated by 1.5B GPT-2, but the developers note that accuracy may decrease as model sizes increase. It is evaluated on test data consisting of 5,000 samples from the WebText dataset and 5,000 samples generated by a GPT-2 model. The model can be used in research related to synthetic text generation; it should not be used to intentionally harm others or to support efforts to evade detection. Its limitations and biases, including disturbing stereotypes and harmful biases, are discussed further in the associated paper.

The above summary was generated using ChatGPT. Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.

Inference samples

Inference type	Python sample (Notebook)	CLI with YAML
Real time	text-classification-online-endpoint.ipynb	text-classification-online-endpoint.sh
Batch	entailment-contradiction-batch.ipynb	coming soon

Model Evaluation

Task	Use case	Dataset	Python sample (Notebook)	CLI with YAML
Text Classification	Detecting GPT2 Output	GPT2-Outputs	evaluate-model-text-classification.ipynb	evaluate-model-text-classification.yml

Finetuning samples

Task	Use case	Dataset	Python sample (Notebook)	CLI with YAML
Text Classification	Emotion Detection	Emotion	emotion-detection.ipynb	emotion-detection.sh
Token Classification	Named Entity Recognition	Conll2003	named-entity-recognition.ipynb	named-entity-recognition.sh
Question Answering	Extractive Q&A	SQUAD (Wikipedia)	extractive-qa.ipynb	extractive-qa.sh

Sample inputs and outputs (for real-time inference)

Sample input

{
    "input_data": {
        "input_string": ["Today was an amazing day!", "It was an unfortunate series of events."]
    }
}

Sample output

[
    {
        "0": "LABEL_0"
    },
    {
        "0": "LABEL_0"
    }
]
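
The request and response shapes above can be handled with a few lines of Python. This is a minimal sketch based only on the sample input and output shown here: the helper names `build_request` and `parse_response` are hypothetical, no endpoint is called, and the meaning of `LABEL_0` is not stated on this page, so the code returns the labels unchanged.

```python
import json


def build_request(texts):
    """Wrap a list of strings in the payload shape shown in the sample input."""
    return json.dumps({"input_data": {"input_string": list(texts)}})


def parse_response(body):
    """Return one label per input, read from the "0" key of each record.

    Assumes the response is a JSON list of objects shaped like the sample output.
    """
    records = json.loads(body)
    return [record["0"] for record in records]


payload = build_request(
    ["Today was an amazing day!", "It was an unfortunate series of events."]
)

# Stand-in for the body an endpoint would return; copied from the sample output.
sample_response = '[{"0": "LABEL_0"}, {"0": "LABEL_0"}]'
labels = parse_response(sample_response)

print(payload)
print(labels)
```

The payload string is what would be sent as the request body; how it is sent (URL, authentication headers) depends on the deployed endpoint and is covered by the inference notebooks listed earlier.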

Version: 10

Tags

Preview

computes_allow_list: ['Standard_NV12s_v3', 'Standard_NV24s_v3', 'Standard_NV48s_v3', 'Standard_NC6s_v3', 'Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_NC24rs_v3', 'Standard_NC6s_v2', 'Standard_NC12s_v2', 'Standard_NC24s_v2', 'Standard_NC24rs_v2', 'Standard_NC4as_T4_v3', 'Standard_NC8as_T4_v3', 'Standard_NC16as_T4_v3', 'Standard_NC64as_T4_v3', 'Standard_ND6s', 'Standard_ND12s', 'Standard_ND24s', 'Standard_ND24rs', 'Standard_ND40rs_v2', 'Standard_ND96asr_v4']

license: mit

model_specific_defaults: ordereddict([('apply_deepspeed', 'true'), ('apply_lora', 'true'), ('apply_ort', 'true')])

task: text-classification

View in Studio: https://ml.azure.com/registries/azureml/models/roberta-large-openai-detector/version/10

License: mit

Properties

SHA: 5002d695ecf610d8bbfb1fa0d14f1575185b4915

datasets: bookcorpus, wikipedia

evaluation-min-sku-spec: 8|0|28|56

evaluation-recommended-sku: Standard_DS4_v2

finetune-min-sku-spec: 4|1|28|176

finetune-recommended-sku: Standard_NC24rs_v3

finetuning-tasks: text-classification, token-classification, question-answering

inference-min-sku-spec: 2|0|7|14

inference-recommended-sku: Standard_DS2_v2, Standard_D2a_v4, Standard_D2as_v4, Standard_DS3_v2, Standard_D4a_v4, Standard_D4as_v4, Standard_DS4_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_DS5_v2, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_F4s_v2, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E2s_v3, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4, Standard_ND40rs_v2

languages: en
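
The `*-min-sku-spec` properties above are pipe-separated integers. The sketch below splits them into named fields. The field order (CPU cores, GPUs, memory in GB, storage in GB) is an assumption that this page does not state; it is inferred from the values lining up with the recommended SKUs. The names `SkuSpec` and `parse_min_sku_spec` are hypothetical.

```python
from typing import NamedTuple


class SkuSpec(NamedTuple):
    # Field order is an assumption, not documented on this page.
    cpu_cores: int
    gpus: int
    memory_gb: int
    storage_gb: int


def parse_min_sku_spec(spec: str) -> SkuSpec:
    """Parse a string such as "4|1|28|176" into a SkuSpec."""
    parts = spec.split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 pipe-separated fields, got {len(parts)}: {spec!r}")
    return SkuSpec(*(int(part) for part in parts))


finetune = parse_min_sku_spec("4|1|28|176")
inference = parse_min_sku_spec("2|0|7|14")

# Under the assumed field order, fine-tuning needs a GPU and inference does not.
print(finetune)
print(inference.gpus == 0)
```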
