This project demonstrates how to fine-tune the LLaMA-3.1-8B model with LoRA adapters, apply chat templates, and save the resulting model for inference. The model is trained on a local dataset using parameter-efficient techniques and deployed to the Hugging Face Hub.
- Model: LLaMA-3.1-8B with 4-bit quantization for efficient memory usage (see the loading sketch after this list).
- Techniques: Fine-tuning using LoRA (Low-Rank Adaptation) adapters, gradient checkpointing, and chat templates.
- Data: Custom local CSV file used for training.
- Goal: Train and deploy a chatbot model capable of handling user input in conversation-based formats.
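As a rough illustration of the setup described above, the snippet below loads the base model in 4-bit precision with Unsloth's `FastLanguageModel`. The checkpoint name and sequence length are placeholder choices, not values taken from this project:

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit precision (via bitsandbytes) so it fits in
# consumer-GPU memory. model_name and max_seq_length are illustrative choices.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B",  # assumed checkpoint; swap in your own
    max_seq_length=2048,
    load_in_4bit=True,
)
```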
- LoRA Adapters: Parameter-efficient fine-tuning (see the sketch after this list).
- Quantization: Efficient memory usage with 4-bit precision.
- Chat Templates: Format multi-turn conversations into the prompt structure the tokenizer and model expect.
- Model Deployment: Saving models and pushing them to the Hugging Face Hub in different quantization formats (see the end-to-end sketch at the end of this section).
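A minimal sketch of how the LoRA adapters and chat template might be attached, assuming Unsloth's `get_peft_model` and `get_chat_template` helpers. The rank, alpha, and target modules below are common defaults rather than this project's exact configuration:

```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Attach LoRA adapters: only these small low-rank matrices are trained,
# while the 4-bit base weights stay frozen.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                      # LoRA rank (illustrative)
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",  # trade compute for memory
)

# Wrap the tokenizer with the Llama-3.1 chat template so conversations are
# rendered into the prompt format the model expects.
tokenizer = get_chat_template(tokenizer, chat_template="llama-3.1")

messages = [{"role": "user", "content": "Hello, who are you?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
```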
To run this project, install the required packages, either in Google Colab or in your local environment:
```bash
pip install torch transformers datasets pandas unsloth trl
```
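To tie the pieces together, here is one hedged end-to-end sketch: load the local CSV with `datasets`, train with TRL's `SFTTrainer`, then save the adapters and push them to the Hub. File names, hyperparameters, and the repo ID are placeholders, and the exact `SFTTrainer` signature varies across `trl` versions:

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Load the local CSV; assumes a "text" column of pre-formatted chat examples.
dataset = load_dataset("csv", data_files="train.csv", split="train")

trainer = SFTTrainer(
    model=model,                   # the LoRA-wrapped model from the steps above
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,              # short run for illustration
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()

# Save the adapters locally, then push them to the Hub (repo ID is a placeholder).
model.save_pretrained("lora_model")
tokenizer.save_pretrained("lora_model")
model.push_to_hub("your-username/llama-3.1-8b-lora")

# Unsloth can also export merged GGUF weights in other quantization formats,
# e.g. model.save_pretrained_gguf("model", tokenizer, quantization_method="q4_k_m").
```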