# E2E - Inference Endpoints Hugging Face

In today's event, we'll create an E2E application using Hugging Face Inference Endpoints!

There are 2 main sections to this event:

## Deploy LLM and Embedding Model to SageMaker Endpoint Through Hugging Face Inference Endpoints

Select "Inference Endpoint" from the "Solutions" button in Hugging Face:


Create a "+ New Endpoint" from the Inference Endpoints dashboard.


Select the `ai-maker-space/gen-z-translate-llama-3-instruct-v1` model repository, give your endpoint an appropriate name, and select N. Virginia (us-east-1) as your region.

Adjust the Advanced Configuration settings for your endpoint as needed.


Create a Protected endpoint (calling it will require a valid Hugging Face token).


If you were successful, you should see your new endpoint listed on the Inference Endpoints dashboard.

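If you prefer to script this step instead of clicking through the UI, the `huggingface_hub` library offers a `create_inference_endpoint` helper. The sketch below is a rough equivalent of the walkthrough above, not the exact configuration from the event: the endpoint name, accelerator, and instance values are assumptions and must correspond to an instance type that is actually available to your account.

```python
from huggingface_hub import create_inference_endpoint

# Rough scripted equivalent of the UI steps above (requires `huggingface-cli login` or a token).
endpoint = create_inference_endpoint(
    "gen-z-translate-llama-3",  # endpoint name -- choose your own
    repository="ai-maker-space/gen-z-translate-llama-3-instruct-v1",
    framework="pytorch",
    task="text-generation",
    vendor="aws",
    region="us-east-1",          # N. Virginia, as selected in the walkthrough
    type="protected",            # a valid Hugging Face token is required to call it
    accelerator="gpu",
    instance_size="x1",          # assumption -- pick a size/type valid for your account
    instance_type="nvidia-a10g",
)

endpoint.wait()        # block until the endpoint reports it is running
print(endpoint.url)    # the base URL your application will call
```

The same helper can be pointed at your embedding model's repository to create the second endpoint.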

You'll repeat the same process for your embedding model!
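Once both endpoints are running, it's worth a quick sanity check from Python. Because the endpoints are protected, every request must carry a Hugging Face token; the sketch below uses `huggingface_hub.InferenceClient`, and the endpoint URLs plus the `HF_TOKEN` environment variable are placeholders for your own values.

```python
import os

from huggingface_hub import InferenceClient

hf_token = os.environ["HF_TOKEN"]  # placeholder: your Hugging Face access token

# LLM endpoint: point the client directly at the URL shown on the endpoint's page.
llm = InferenceClient(model="https://<your-llm-endpoint>.endpoints.huggingface.cloud", token=hf_token)
print(llm.text_generation("Hello there!", max_new_tokens=64))

# Embedding endpoint: same idea, but ask for an embedding instead of generated text.
emb = InferenceClient(model="https://<your-embedding-endpoint>.endpoints.huggingface.cloud", token=hf_token)
vector = emb.feature_extraction("Hello there!")
print(vector.shape)  # embedding dimensions (exact shape depends on the deployed model)
```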

NOTE: PLEASE SHUT DOWN YOUR INSTANCES WHEN YOU HAVE COMPLETED THE ASSIGNMENT TO PREVENT UNNECESSARY CHARGES.
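One convenient way to do this is to pause the endpoints from code; a paused endpoint stops billing but keeps its configuration so it can be resumed later. A minimal sketch, assuming the endpoint names used earlier:

```python
from huggingface_hub import get_inference_endpoint

# Pause both endpoints when you're done (names are placeholders for the ones you created).
for name in ["gen-z-translate-llama-3", "my-embedding-endpoint"]:
    get_inference_endpoint(name).pause()  # stops billing; bring it back later with .resume()
```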

## Create a Simple Chat Application leveraging the new endpoint!

First, we fine-tune Llama 3 8B Instruct for a specific task, in this case a translation task!

Then, we create a Docker-based Hugging Face Space powering a Chainlit UI (code available here).
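The linked Space isn't reproduced here, but a minimal Chainlit app wired to the new LLM endpoint might look roughly like the sketch below. The `HF_LLM_ENDPOINT_URL` and `HF_TOKEN` environment variables are assumptions, and a production app would typically apply the model's chat template rather than sending the raw message.

```python
# app.py -- minimal Chainlit sketch (assumed env vars: HF_LLM_ENDPOINT_URL, HF_TOKEN)
import os

import chainlit as cl
from huggingface_hub import AsyncInferenceClient

client = AsyncInferenceClient(
    model=os.environ["HF_LLM_ENDPOINT_URL"],  # URL of the deployed LLM endpoint
    token=os.environ["HF_TOKEN"],             # required because the endpoint is protected
)


@cl.on_message
async def on_message(message: cl.Message):
    # Forward the user's message to the endpoint and send the reply back to the UI.
    reply = await client.text_generation(message.content, max_new_tokens=256)
    await cl.Message(content=reply).send()
```

Run it locally with `chainlit run app.py -w`; in a Docker Space, the same command typically serves as the container's entrypoint.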

## Terminating Your Resources

Please go to each endpoint's settings and select Delete Endpoint. To confirm the deletion, you will need to type the endpoint's name.
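If you'd rather clean up from code, the same `huggingface_hub` helpers can delete the endpoints; note that deletion is permanent, and the names below are placeholders for the ones you created.

```python
from huggingface_hub import get_inference_endpoint, list_inference_endpoints

# Permanently delete both endpoints (placeholder names).
for name in ["gen-z-translate-llama-3", "my-embedding-endpoint"]:
    get_inference_endpoint(name).delete()

# Confirm nothing is left running (and billing).
print([ep.name for ep in list_inference_endpoints()])
```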