AI-Maker-Space/PromptEngineer-Colab

E2E - Inference Endpoints Hugging Face

In today's event, we'll create an E2E Application through Hugging Face Inference Endpoints!

There are two main sections to this event:

Deploy an LLM and an Embedding Model to a SageMaker Endpoint through Hugging Face Inference Endpoints

Select "Inference Endpoint" from the "Solutions" button in Hugging Face:


Click "+ New Endpoint" on the Inference Endpoints dashboard.


Select the ai-maker-space/gen-z-translate-llama-3-instruct-v1 model repository, give your endpoint an appropriate name, and select N. Virginia (us-east-1) as your region.

Select the following settings for your Advanced Configuration.


Create a Protected endpoint.


If you were successful, you should see the following screen:


You'll repeat the same process for your embedding model!
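Once an endpoint shows as running, you can sanity-check it from Python before wiring up the UI. The sketch below assumes a TGI-style text-generation endpoint; `HF_ENDPOINT_URL` and `HF_TOKEN` are placeholders for the URL shown on the endpoint dashboard and a Hugging Face access token (required because the endpoint is Protected).

```python
# Minimal sanity check for a Protected Inference Endpoint, assuming a
# TGI-style text-generation API. HF_ENDPOINT_URL and HF_TOKEN are
# placeholders copied from the dashboard and your account settings.
import os

import requests


def build_request(prompt: str, token: str) -> dict:
    """Assemble headers and JSON payload for a TGI-style endpoint."""
    return {
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "json": {"inputs": prompt, "parameters": {"max_new_tokens": 128}},
    }


def query_endpoint(endpoint_url: str, token: str, prompt: str) -> str:
    """POST a prompt to the endpoint and return the generated text."""
    req = build_request(prompt, token)
    resp = requests.post(
        endpoint_url, headers=req["headers"], json=req["json"], timeout=60
    )
    resp.raise_for_status()
    return resp.json()[0]["generated_text"]


if __name__ == "__main__" and "HF_ENDPOINT_URL" in os.environ:
    print(query_endpoint(os.environ["HF_ENDPOINT_URL"], os.environ["HF_TOKEN"], "Hello!"))
```

The same pattern works for the embedding endpoint; only the payload shape of the response differs.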

NOTE: PLEASE SHUT DOWN YOUR INSTANCES WHEN YOU HAVE COMPLETED THE ASSIGNMENT TO PREVENT UNNECESSARY CHARGES.

Create a Simple Chat Application leveraging the new endpoint!

First, we fine-tune Llama 3 8B Instruct for a specific task, in this case a translation task!

Then, we create a Docker-based Hugging Face Space powering a Chainlit UI - code available here
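The Space uses Chainlit for the UI; as a stand-in, the sketch below shows the core chat logic as a plain terminal loop against the endpoint. The Llama 3 Instruct chat template is reproduced by hand here for illustration; in practice a tokenizer's chat template (or the endpoint itself) may handle this formatting, so treat the template string as an assumption.

```python
# Terminal-loop sketch of the chat logic behind the Chainlit UI, assuming a
# TGI-style endpoint and the Llama 3 Instruct chat template. HF_ENDPOINT_URL
# and HF_TOKEN are placeholders from the endpoint dashboard.
import os

import requests


def format_llama3_prompt(history: list[tuple[str, str]], user_msg: str) -> str:
    """Render (role, content) turns plus a new user message into the
    Llama 3 Instruct template, ending with an open assistant header."""
    parts = ["<|begin_of_text|>"]
    for role, content in history + [("user", user_msg)]:
        parts.append(
            f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"
        )
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


def chat(endpoint_url: str, token: str) -> None:
    """Simple REPL: send the running conversation to the endpoint each turn."""
    history: list[tuple[str, str]] = []
    while (user_msg := input("you> ")) != "quit":
        resp = requests.post(
            endpoint_url,
            headers={"Authorization": f"Bearer {token}"},
            json={
                "inputs": format_llama3_prompt(history, user_msg),
                "parameters": {"max_new_tokens": 256},
            },
            timeout=60,
        )
        resp.raise_for_status()
        reply = resp.json()[0]["generated_text"]
        print("bot>", reply)
        history += [("user", user_msg), ("assistant", reply)]


if __name__ == "__main__" and "HF_ENDPOINT_URL" in os.environ:
    chat(os.environ["HF_ENDPOINT_URL"], os.environ["HF_TOKEN"])
```

In the actual Space, the loop body lives inside a Chainlit message handler instead of `input()`/`print()`, but the prompt-formatting and request logic is the same.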

Terminating Your Resources

Please go to each endpoint's settings and select Delete Endpoint. To confirm the deletion, you will need to type the endpoint's name.
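The same cleanup can also be scripted with `huggingface_hub` (assuming a recent version with Inference Endpoints support and `HF_TOKEN` set). The endpoint names below are hypothetical; use the names you chose on the dashboard.

```python
# Scripted cleanup sketch, assuming huggingface_hub's Inference Endpoints
# API is available and HF_TOKEN is set. Endpoint names are hypothetical.
import os


def confirm_deletion(typed_name: str, endpoint_name: str) -> bool:
    """Mirror the dashboard's safety check: the typed name must match exactly."""
    return typed_name == endpoint_name


if __name__ == "__main__" and os.environ.get("HF_TOKEN"):
    from huggingface_hub import get_inference_endpoint

    for name in ("my-llm-endpoint", "my-embedding-endpoint"):  # hypothetical names
        if confirm_deletion(input(f"Type '{name}' to confirm deletion: "), name):
            get_inference_endpoint(name).delete()
```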
