Created the first RAG pipeline
- Create a new Google account.
- Credits: Received $300 in credits.
- Enable the Vertex AI APIs.
- Create a bucket in Google Cloud Storage (GCS).
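A minimal sketch of the bucket creation in Python, assuming a hypothetical project ID, bucket name, and region (the Cloud Console or `gsutil mb` works just as well):

```python
from google.cloud import storage

# Placeholder values -- replace with your own project, bucket, and region.
PROJECT_ID = "my-rag-project"
BUCKET_NAME = "my-rag-embeddings-bucket"
REGION = "us-central1"

client = storage.Client(project=PROJECT_ID)
# Create the bucket in the same region where the Vertex AI resources will live.
bucket = client.create_bucket(BUCKET_NAME, location=REGION)
print(f"Created bucket gs://{bucket.name}")
```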
- Use Vertex AI Workbench:
- Alternatively, you can use a Dataproc cluster with JupyterLab as an installed application.
- Create a Jupyter Notebook in Vertex AI Workbench using a machine with the required configuration.
Note: It takes about 7-8 minutes to bring the Jupyter Notebook up.
- This notebook generates embeddings for the statements in the uploaded PDFs.
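A rough sketch of what the embedding step could look like, assuming the PDF is read with `pypdf`, embeddings come from a Vertex AI text-embedding model (`text-embedding-004` here is an assumption), and the output is written in the JSONL format Vector Search ingests; the file names are placeholders:

```python
import json
import uuid

import vertexai
from pypdf import PdfReader
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="my-rag-project", location="us-central1")  # placeholder project/region
model = TextEmbeddingModel.from_pretrained("text-embedding-004")  # assumed embedding model

# Split the uploaded PDF into sentences (very naive split, for illustration only).
reader = PdfReader("statements.pdf")
text = " ".join(page.extract_text() or "" for page in reader.pages)
sentences = [s.strip() for s in text.split(".") if s.strip()]

id_to_sentence = {}
with open("embeddings.json", "w") as out:
    for sentence in sentences:
        sentence_id = str(uuid.uuid4())
        id_to_sentence[sentence_id] = sentence
        vector = model.get_embeddings([sentence])[0].values
        # One JSON object per line: the format Vector Search expects for ingestion.
        out.write(json.dumps({"id": sentence_id, "embedding": vector}) + "\n")

# Keep the UUID -> sentence mapping for the lookup step later on.
with open("sentences.json", "w") as f:
    json.dump(id_to_sentence, f)
```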
- The embeddings file is uploaded to GCS at the specified file path.
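The upload could be done with the Storage client (placeholder project, bucket, and path):

```python
from google.cloud import storage

client = storage.Client(project="my-rag-project")   # placeholder project
bucket = client.bucket("my-rag-embeddings-bucket")  # placeholder bucket
# Vector Search reads every file under the given GCS prefix, so upload into a dedicated folder.
bucket.blob("embeddings/embeddings.json").upload_from_filename("embeddings.json")
```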
- A vector search index is created using the URI of the embeddings file created in the previous step.
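A sketch of index creation with the `google-cloud-aiplatform` SDK; the display name, GCS prefix, and dimensions are assumptions, and the dimensions must match the embedding model's output size:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-rag-project", location="us-central1")  # placeholder project/region

index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
    display_name="rag-sentences-index",
    # GCS prefix that contains the embeddings JSONL file uploaded above.
    contents_delta_uri="gs://my-rag-embeddings-bucket/embeddings/",
    dimensions=768,                 # must equal the embedding vector length
    approximate_neighbors_count=10,
)
```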
- A Matching Index Endpoint is created and the index is deployed to it. Deployment takes around 20-25 minutes for a small PDF, but this can vary with the machine type used and the size of the data.
Important: Make sure to pass the `machine_type`, `min_replica_count`, and `max_replica_count` parameters to the `deploy_index` function, or the deployment will not succeed.
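A sketch of endpoint creation and deployment with the three required arguments; the display name, deployed index ID, and machine type are placeholder values:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-rag-project", location="us-central1")  # placeholder project/region

# "index" is the MatchingEngineIndex created in the previous step
# (it can also be re-loaded by resource name via aiplatform.MatchingEngineIndex(...)).
endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
    display_name="rag-sentences-endpoint",
    public_endpoint_enabled=True,  # assumes a public endpoint rather than VPC peering
)

endpoint.deploy_index(
    index=index,
    deployed_index_id="rag_sentences_deployed",
    # The three arguments the note above says must be passed.
    machine_type="e2-standard-2",
    min_replica_count=1,
    max_replica_count=1,
)
```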
- Initialize the vector search index.
- Generate embeddings for user input.
- Find nearest neighbors for the user input embeddings from the Matching Index Endpoint.
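The three query-time steps could look roughly like this, assuming a public endpoint (`find_neighbors` is the public-endpoint call; private endpoints use `match` instead) and the same placeholder names as before:

```python
import vertexai
from google.cloud import aiplatform
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="my-rag-project", location="us-central1")
aiplatform.init(project="my-rag-project", location="us-central1")

# Initialize the deployed index endpoint (placeholder resource name).
endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name="projects/123/locations/us-central1/indexEndpoints/456"
)

# Embed the user's question with the same model used for the documents.
embedding_model = TextEmbeddingModel.from_pretrained("text-embedding-004")
user_input = "What does the statement say about late fees?"  # example question
query_vector = embedding_model.get_embeddings([user_input])[0].values

# Retrieve the 10 nearest neighbors; each result carries the UUID assigned at indexing time.
response = endpoint.find_neighbors(
    deployed_index_id="rag_sentences_deployed",
    queries=[query_vector],
    num_neighbors=10,
)
neighbor_ids = [neighbor.id for neighbor in response[0]]
```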
- For example, if you get the IDs of the 10 nearest neighbors:
- Look up these 10 UUIDs in the `sentences.json` file to retrieve the corresponding sentences for context creation.
- The retrieved 10 sentences are now referred to as the context.
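Building the context from the neighbor IDs is then a plain dictionary lookup, assuming `sentences.json` holds the UUID-to-sentence mapping written at indexing time:

```python
import json

# neighbor_ids comes from the find_neighbors call in the previous step.
with open("sentences.json") as f:
    id_to_sentence = json.load(f)

# Concatenate the retrieved sentences into the context block for the prompt.
context = "\n".join(id_to_sentence[neighbor_id] for neighbor_id in neighbor_ids)
```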
- Create a prompt that injects the context created above and invoke the model to get a response.
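Finally, a sketch of the prompt and the model call, here using a Gemini model via the Vertex AI SDK; the model ID and prompt wording are assumptions:

```python
from vertexai.generative_models import GenerativeModel

# Assumes vertexai.init(...) has already run, and that "context" and "user_input"
# come from the previous steps.
model = GenerativeModel("gemini-1.5-flash")  # assumed generation model

prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {user_input}\n"
)

response = model.generate_content(prompt)
print(response.text)
```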