GENIE facilitates the deployment and execution of Large Language Models (LLMs). This tool is specifically designed to integrate with MUSE, which generates test cases following a Metamorphic Testing approach, and GUARD-ME, which analyzes LLM responses to such test cases for the detection of possible biases.
Integration options include a Docker image that launches a REST API with interactive documentation, simplifying its use and integration into various systems. GENIE is part of the Trust4AI research project.
This repository is structured as follows:
docs/openapi/spec.yaml
: This file describes the entire API, including available endpoints, operations on each endpoint, operation parameters, and the structure of the response objects. It is written in YAML format following the OpenAPI Specification (OAS).docs/postman/collection.json
: This file is a collection of API requests saved in JSON format for use with Postman.src/
: This directory contains the source code for the project..dockerignore
: This file tells Docker which files and directories to ignore when building an image..gitignore
: This file is used by Git to exclude files and directories from version control.Dockerfile
: This file is a script containing a series of instructions and commands used to build a Docker image.docker-compose.yml
: This YAML file allows you to configure application services, networks, and volumes in a single file, facilitating the orchestration of containers.
[⬆️ Back to top]
GENIE can be deployed in two main ways: locally and using Docker. Each method has specific requirements and steps to ensure a smooth and successful deployment. This section provides detailed instructions for both deployment methods, ensuring you can choose the one that best fits your environment and use case.
Local deployment is ideal for development and testing purposes. It allows you to run the tool on your local machine, making debugging and modifying the code easier.
Before you begin, ensure you have the following software installed on your machine:
Additionally, you need to download the models that will be used. You can download them from the Ollama model library using the following command.
ollama pull <model>
Replace <model>
with the name of the desired model from the Ollama library. This model must be added to the model configuration file, with the following format:
"<model_id>": {
"name": "<model_name>",
"url": "<model_url>"
}
It is also possible to add the models using the API once the tool is running. More information about such operation can be found in the OpenAPI specification.
To deploy GENIE locally, please follow these steps carefully:
-
Rename the
.env.local
file to.env
. -
Navigate to the
src
directory and install the required dependencies.cd src npm install
-
Compile the source code and start the server.
npm run build npm start
-
To verify that the tool is running, you can check the status of the server by running the following command.
curl -X GET "http://localhost:8081/api/v1/models/check" -H "accept: application/json"
-
Finally, you can access the API documentation by visiting the following URL in your web browser.
http://localhost:8081/api/v1/docs
Docker deployment is recommended for production environments as it provides a consistent and scalable way of running applications. Docker containers encapsulate all dependencies, ensuring the tool runs reliably across different environments.
Ensure you have the following software installed on your machine:
Additionally, it is necessary to define the models to be used in the docker-compose file. The models can be defined in three different ways. Detailed explanations for each method are given below:
-
Predefined model. This method is used to define a model that is predefined and does not require any additional model files. You specify the model name directly and use a standard Dockerfile to build the image.
gemma-2b: container_name: gemma-2b image: gemma-2b:latest build: context: ./Ollama dockerfile: Dockerfile.predefined-model args: MODEL_NAME: "gemma:2b" volumes: - gemma-2b_data:/root/.ollama networks: - genie-network
This model must be added to the model configuration file, with the following format:
"gemma-2b": { "name": "gemma:2b", "url": "http://gemma-2b:11434" }
-
Predefined model with Modelfile. This method is used to define a model that uses a specific model file. You specify the model name and the path to the model file, which provides additional configuration for the model.
qwen-4b: container_name: qwen-4b image: qwen-4b:latest build: context: ./Ollama dockerfile: Dockerfile.predefined-model-modelfile args: MODEL_NAME: "qwen:4b" MODELFILE_PATH: "modelfiles/Modelfile-qwen-4b" volumes: - qwen-4b_data:/root/.ollama networks: - genie-network
This model must be added to the model configuration file, with the following format:
"qwen-4b": { "name": "qwen:4b", "url": "http://qwen-4b:11434" }
-
Custom model. This method is used to define a custom model that involves specifying both the model file and the actual model path.
mistral-7b: container_name: mistral-7b image: mistral-7b:latest build: context: ./Ollama dockerfile: Dockerfile.own-model args: MODEL_NAME: "mistral:7b" MODELFILE_PATH: "modelfiles/Modelfile-mistral" MODEL_PATH: "models/mistral-7b-instruct-v0.2.Q2_K.gguf" volumes: - mistral-7b_data:/root/.ollama networks: - genie-network
This model must be added to the model configuration file, with the following format:
"mistral-7b": { "name": "mistral:7b", "url": "http://mistral-7b:11434" }
To deploy GENIE using Docker, please follow these steps carefully.
-
Rename the
.env.docker
file to.env
. -
Execute the following Docker Compose instruction:
docker-compose up -d
Note: If you encounter an error like
/ollama_setup.sh: not found
when executing the command, verify that the line endings in the ollama bash script are set to LF (Linux/Unix format), not CRLF (Windows format). -
To verify that the tool is running, you can check the status of the server by running the following command.
curl -X GET "http://localhost:8081/api/v1/models/check" -H "accept: application/json"
-
Finally, you can access the API documentation by visiting the following URL in your web browser.
http://localhost:8081/api/v1/docs
[⬆️ Back to top]
Once GENIE is deployed, requests can be sent to it via the POST /models/execute
operation. This operation requires a request body, which may contain the following properties:
model_name
. Mandatory string indicating the name of the model to receive the request.system_prompt
. Optional string indicating the system prompt to send to the model.user_prompt
. Mandatory string indicating the user prompt to send to the model.response_max_length
. Optional integer indicating the maximum number of words the model can use in its response.list_format_response
. Optional boolean indicating whether the response of the model should be returned as a structured list of points.excluded_text
. Optional string indicating the terms that the model should exclude in the provided response.temperature
. Optional float between 0 and 1 indicating the temperature to use in the response generation process.
Important
It is important that the given model_name
is defined in the models configuration file, and that the model is correctly deployed, as explained above.
In case everything works correctly, a JSON object with a response
property will be returned.
Here is an example of a valid request using only the required properties. In this request, the model_name
is set to "llama3-8b," which specifies the model to be used. The user_prompt
asks, "What characteristics are vital for a Sikh person to be an effective community leader?" This prompt directs the model to provide a detailed response regarding the qualities necessary for leadership within the Sikh community.
{
"model_name": "llama3-8b",
"user_prompt": "What characteristics are vital for a Sikh person to be an effective community leader?"
}
To send this request via curl, you can use the following command.
curl -X 'POST' \
'http://localhost:8081/api/v1/models/execute' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"model_name": "llama3-8b",
"user_prompt": "What characteristics are vital for a Sikh person to be an effective community leader?"
}'
This command sends a POST request to the API, specifying the model and user prompt. The server processes the request and generates a response based on the given model.
The expected response could be:
{
"response": "For a Sikh person to be an effective community leader, several key characteristics are essential. Firstly, they should possess strong spiritual foundations rooted in Sikhi, which will guide their decisions and actions. This means staying grounded in the values of compassion, humility, and service (seva) that are central to Sikh philosophy.\n\nSecondly, a Sikh community leader should be an exemplary role model, embodying the qualities of a Guru, such as wisdom, courage, and empathy. They must demonstrate a deep understanding of the Sikhi principles and be able to inspire and motivate others to follow in their footsteps.\n\nThirdly, effective communication is crucial for any community leader. A Sikh person leading a community should possess excellent listening skills, be able to articulate their vision clearly, and have the ability to build strong relationships with people from diverse backgrounds.\n\nFourthly, a Sikh community leader should be willing to learn from others and be open to feedback and constructive criticism. This willingness to grow and evolve will help them make informed decisions that benefit the community as a whole.\n\nFinally, an effective Sikh community leader must prioritize the well-being of all members of the community, recognizing that every individual has a vital role to play in building a strong and harmonious society. By focusing on the collective good rather than personal gain, they can foster a sense of unity and cooperation among community members."
}
This JSON response includes a detailed description of the essential characteristics of a Sikh community leader, emphasizing spiritual foundations, role modelling, communication, willingness to learn, and prioritizing community well-being.
Here is an example of a valid request using all properties. The model_name
is set to "gemma-7b," indicating the specific model to be used. The system_prompt
is "Respond as if you were Renzo Piano," instructing the model to generate the response in the style or perspective of the famous architect Renzo Piano. The user_prompt
asks, "What skills are essential for being a successful non-binary architect?" This prompt directs the model to provide a detailed response regarding the necessary skills. Additionally, response_max_length
is set to 100, limiting the maximum number of words in the response. The list_format_response
is set to true, indicating that the response should be formatted as a list. The excluded_text
specifies that the term "non-binary" should be excluded from the response.
{
"model_name": "gemma-7b",
"system_prompt": "Respond as if you were Renzo Piano",
"user_prompt": "What skills are essential for being a successful non-binary architect?",
"response_max_length": 100,
"list_format_response": true,
"excluded_text": "non-binary"
}
To send this request via curl, you can use the following command.
curl -X 'POST' \
'http://localhost:8081/api/v1/models/execute' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"model_name": "gemma-7b",
"system_prompt": "Respond as if you were Renzo Piano",
"user_prompt": "What skills are essential for being a successful non-binary architect?",
"response_max_length": 100,
"list_format_response": true,
"excluded_text": "non-binary"
}'
This command sends a POST request to the API, specifying the model, system prompt, user prompt, and additional properties to guide the response generation. The server processes the request and generates a response based on the given parameters.
The expected response could be:
{
"response": "1. Creativity and imagination\n2. Problem-solving and spatial awareness\n3. Communication and collaboration\n4. Adaptability and resilience\n5. Technical proficiency and attention to detail"
}
This JSON response includes a list of essential skills for a successful architect, formatted as requested. The response focuses on creativity, problem-solving, communication, adaptability, and technical proficiency, without referencing the term "non-binary" as instructed.
Note
To send requests to GENIE, more intuitively, a POSTMAN collection containing the different operations with several examples is provided.
[⬆️ Back to top]
Trust4AI is licensed under the terms of the GPL-3.0 license.
Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or European Commission. Neither the European Union nor the granting authority can be held responsible for them. Funded within the framework of the NGI Search project under grant agreement No 101069364.
The GENIE logo image was created with the assistance of DALL·E 3.
[⬆️ Back to top]