
Kaggle LLM Science Exam #52

Open
1 task done
manisnesan opened this issue Aug 9, 2023 · 8 comments
Comments

@manisnesan

manisnesan commented Aug 9, 2023

  • Jeremy Twitter thread

  • Training set 200 science multiple choice questions autogenerated using GPT 3.5

  • RAG pattern

  • No retriever: use the LM alone. Pass the question directly to GPT 3.5 using the llm library.

  • GPT 3.5 got the wrong answer for the following:

Which of the following statements accurately describes the origin and significance of the triskeles symbol?

  • Dive deeper using Bing
  • Give the model a chance to "think about the answer" by first asking the question without the multiple-choice options. Instead ask:

Please accurately describe the origin and significance of the triskeles symbol

First go through each of the 5 options, explaining why it is or isn't a good description, and then finally say which you think is most accurate

followed by the multiple-choice options.

  • This two-step approach is a way to get better results
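The two-step flow can be sketched with the legacy openai Python API used elsewhere in this thread; the prompt wording and helper names here are my own, not from the competition:

```python
def two_step_messages(question, options):
    """Build the two user turns: an open-ended pass, then the options."""
    step1 = f"Please accurately describe the answer to: {question}"
    step2 = ("First go through each of the options, explaining why it is or "
             "isn't a good description, then say which is most accurate.\n"
             + "\n".join(f"{k}: {v}" for k, v in options.items()))
    return step1, step2

def ask_two_step(question, options, model="gpt-3.5-turbo"):
    import openai  # legacy 0.x API, as in the snippets below
    step1, step2 = two_step_messages(question, options)
    history = [{"role": "user", "content": step1}]
    first = openai.ChatCompletion.create(model=model, messages=history)
    history.append(first["choices"][0]["message"])  # keep the model's reasoning in context
    history.append({"role": "user", "content": step2})
    second = openai.ChatCompletion.create(model=model, messages=history)
    return second["choices"][0]["message"]["content"]
```

The point is that the model spends tokens reasoning about the open question before it ever sees the options.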

  • Enable the Page Context feature in Bing

Tricks

  • Open the page in Edge and use the Bing sidebar (you must first enable page context in Edge settings). It even works with PDFs!

To enable it, in Microsoft Edge go to Settings, type "sidebar" in the settings search, scroll down to "App specific settings", and click "Discover" > "Allow Web page access".

  • Allow the LLM to think. GPTs are autoregressive models: they produce tokens in sequence, one by one, and the more of this computation they perform, the better they do. So steer the chat toward your question's intent before asking for the final answer.

Example

Precisely and fully describe the answer to the following question: what is resistivity?

I will ask a multiple choice question, with 5 answers A-E.
First, output 'Options: ' followed by going through each of the 5 options, explaining why it is or isn't a good description.
Then, output 'Summary: ' followed by a description of which you think is most accurate, and why.
Finally, output 'Answers: ' followed by the 5 answers A-E sorted from best answer to worst. E.g 'Answers: B C E A D'.
Reminder: it's VERY IMPORTANT the final line of your response is the text 'Answers: ' followed by the sorted list of answers A-E.

Question: {r.prompt}
A: {r.A}
B: {r.B}
C: {r.C}
D: {r.D}
E: {r.E}

Also use the LLM to rewrite the query.
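Query rewriting can be sketched as a small prompt wrapper; the prompt wording below is an assumption, not from the thread:

```python
def rewrite_query_prompt(question):
    # Ask the model to restate the question as a self-contained search query.
    return ("Rewrite the following question as a concise, self-contained "
            "search query, keeping all technical terms:\n" + question)

def rewrite_query(question, model="gpt-3.5-turbo"):
    import openai  # legacy 0.x API, as used elsewhere in this thread
    resp = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": rewrite_query_prompt(question)}],
    )
    return resp["choices"][0]["message"]["content"].strip()
```

The rewritten query then feeds the retriever instead of the raw question text.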

@manisnesan

manisnesan commented Aug 9, 2023

  • Jonathan Whitaker P1 YouTube video

  • Intro to the Kaggle competition

    • Multiple Choice Question Answering (A-E) with 3 guesses allowed. Metric: MAP@3
  • Benchmarking with GPT3.5

    • Load the data
    • Create our prompts
      • system -> """Answer the following multiple-choice question by providing your top 3 guesses in order from most to least likely, using the following format: 'A C D' (just the letters separated by spaces)."""
      • user -> Question: $QUESTION. Answers: A:$A B:$B C:$C D:$D E:$E
    • Call the chat completions API with gpt-3.5-turbo model and messages as input
```python
openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ],
)
```
  • Using the OpenAI function calling API to enforce structure on answers

      • Instead of relying on the model to respond in free form, we can get a structured output.
```python
# Define the function(s) the model will be able to use (in this case, only one)
functions = [
    {
        "name": "answer_question",
        "description": "Answers the provided question",
        "parameters": {
            "type": "object",
            "properties": {
                "reasoning": {
                    "type": "string",
                    "description": "Reasoning for what the answer could be. Keep it short.",
                },
                "answers": {
                    "type": "array",
                    "items": {
                        "type": "string",
                        "enum": ["A", "B", "C", "D", "E"],
                    },
                    "description": "Your top 3 guesses, from most to least likely. e.g. ['A', 'D', 'C']",
                },
            },
            "required": ["reasoning", "answers"],
        },
    }
]
```
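A sketch of calling the API with this schema and parsing the result; forcing `function_call` to the named function is my assumption about how the notebook enforces structure:

```python
import json

def parse_function_args(message):
    """Extract the structured arguments from a function_call response message."""
    return json.loads(message["function_call"]["arguments"])

def top3_via_function_call(system_message, user_message, functions):
    import openai  # legacy 0.x API, as in the snippets above
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "system", "content": system_message},
                  {"role": "user", "content": user_message}],
        functions=functions,
        function_call={"name": "answer_question"},  # force the structured call
    )
    return parse_function_args(resp["choices"][0]["message"])["answers"][:3]
```

Because the schema constrains `answers` to the enum A-E, parsing failures are far rarer than with free-form text.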
  • Using Llama2 as a classifier by examining the logits (next token predictions)
  • Using perplexity to evaluate question-answer pairs
    • Refer: LLM Perplexity Ranking Ensemble Kaggle notebook
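The logits-as-classifier idea can be sketched as follows: instead of generating text, score only the answer-letter tokens from the next-token logits. The ranking helper is pure Python; the model/tokenizer usage assumes a HuggingFace transformers causal LM (e.g. Llama-2) and is illustrative:

```python
def rank_letters(logits, letter_token_ids):
    """Rank answer letters by their next-token logit, best first."""
    scored = {letter: logits[tid] for letter, tid in letter_token_ids.items()}
    return sorted(scored, key=scored.get, reverse=True)

def llama2_top3(prompt, model, tokenizer):
    import torch  # assumes a transformers causal LM on the prompt's device
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]  # logits for the next token only
    letter_ids = {c: tokenizer(c, add_special_tokens=False).input_ids[-1]
                  for c in "ABCDE"}
    return rank_letters(logits, letter_ids)[:3]
```

This gives a full ranking over A-E in a single forward pass, which maps directly onto the MAP@3 metric.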

@manisnesan

Full playlist

@manisnesan

Differential Learning Rates and LoRA - notebook by Wayde

@manisnesan

RAG with additional dataset from Chris Deotte

@manisnesan

manisnesan commented Sep 23, 2023

? quantized to 8 bits
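The note above likely refers to loading model weights in int8. A minimal sketch via transformers with bitsandbytes, as was common in this competition (the model name is illustrative):

```python
def int8_load_kwargs():
    # Keyword arguments for AutoModelForCausalLM.from_pretrained:
    # shard across available devices and load weights as int8.
    return {"device_map": "auto", "load_in_8bit": True}

def load_int8(model_name="meta-llama/Llama-2-7b-hf"):
    from transformers import AutoModelForCausalLM  # requires bitsandbytes installed
    return AutoModelForCausalLM.from_pretrained(model_name, **int8_load_kwargs())
```

8-bit weights roughly halve memory versus fp16, which is how large models were squeezed onto Kaggle's T4 GPUs.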

@manisnesan

Perplexity
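The perplexity-ranking idea mentioned earlier (score each question+answer pair, pick the least surprising) reduces to exp of the mean negative log-likelihood. A minimal sketch, assuming you already have per-token log-probabilities from the LM:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood), natural-log inputs."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def best_answer(options_logprobs):
    """Pick the option whose question+answer text has the lowest perplexity."""
    return min(options_logprobs, key=lambda k: perplexity(options_logprobs[k]))
```

For MAP@3, sort all five options by perplexity instead of taking only the minimum.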

@manisnesan

manisnesan commented Sep 24, 2023

Transformers - Primer by aman.ai

  • Mathematical background (Vectors, Matrix Multiplication, Dot Product, Masking, Sampling)
  • Attention (Additive/Multiplicative/Dot Product Attention, Self/Cross-Attention, Multihead Attention)
  • Core components of the Transformer Architecture (Embeddings, Positional Encoding, Skip Connections, Layer Normalization, Softmax)
  • Top-level Transformer Architecture (Encoder and Decoder stack)
  • Implementation details (Byte-Pair Encoding, Teacher Forcing, Label Smoothing)
  • Lessons learned (What are Transformers learning? Why is training them so hard?)
  • Pros/cons of Transformers relative to CNNs/RNNs
  • Relation between Transformers and Graph Neural Networks
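As a small worked example of the attention building block listed above, here is scaled dot-product attention in NumPy (my own sketch, not code from the primer):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights
```

With a zero query, the weights are uniform and the output is just the mean of the value rows, which makes the "weighted average of values" intuition concrete.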

🔹 GPT: http://gpt.aman.ai

  • Background: Generative Pre-Training, Transformer Decoder
  • GPT-1: Improving Language Understanding by Generative Pre-Training
  • GPT-2: Language Models are Unsupervised Multitask Learners
  • GPT-3: Language Models are Few-Shot Learners
  • GPT-4

🔹 BERT: http://bert.aman.ai

  • Background: Pre-Training, Transformer Encoder
  • Contextualized Embeddings
  • Masked Language Modeling (MLM)
  • Next Sentence Prediction (NSP)
  • BERT’s Encoder Architecture vs. Other Decoder Architectures
  • The Strength of Bidirectionality
  • Supervised Fine-Tuning

@manisnesan

https://www.kaggle.com/competitions/kaggle-llm-science-exam

Check the solution posts

  • How to fine-tune LLMs
  • How to properly use RAG techniques for augmenting LLMs
  • RAG chunking, embedding, similarity search, and other related techniques
  • Synthetic data generation for training models
  • How to optimize inference code for optimal runtime on limited HW resources.
  • How to fit large LLMs in small GPUs. People even managed to run 70B Llama2 on 2xT4.

https://www.kaggle.com/competitions/kaggle-llm-science-exam/discussion/446414
