Welcome to the MySQL Query Generation Project with CrewAI and GroqAI!
This project serves as a comprehensive guide, offering testing and implementation insights for efficiently generating MySQL queries using CrewAI and GroqAI. This project is the 3rd version in a series of similar projects.
This repository contains my implementation of CrewAI, an LLM framework, and GroqAI for MySQL Query Generation based User Specifications. Originally, it was planned for the database schemas to be a constant but I have changed it to reading the database schemas using files, it was later improved to reading from files and convert them into JSON format before processing them.
This project's functionalities include:
- MySQL Generation based on User Specifications.
- Question Recommendations based on User Database Schemas.
I recommend running this project on Python 3.10+. This project was originally running on Python 3.11.9.
A virtual environment should be setup for this project. You can use any of yours preferable virtual environment, I will use Anaconda/Miniconda as the Virtual Environment for this project.
To get started, you need to download this project from Github and navigate to the project's folder.
cd crewai-mysql-gen-cli/
Dowloading the project's dependencies from requirements.txt
file.
pip install -r requirements.txt
I also have a link to support you in this process.
This step is important! Create an .env
file to store your API KEY(s). These are the API KEY(s) you will need. Currently, in this project, I am using GroqAI API but you can change it to any LLM(s) you prefer.
GROQ_API=""
The main functions are located in main.py
and there are databases provided in the repository. Run the project using the following command(s).
python main.py
Inside main.py
, there are 2 crews, these are for the 2 main functions of this project, you can comment out the function you don't want to run.
print(">>> Question Suggest Program starts!")
output = query_suggest_crew.kickoff()
print(">>> Question Suggest Answer:")
print(output)
# print(">>> Program 2 starts!")
# output = mysql_crew.kickoff()
# print(">>> Answer:")
# print(output)
# print(">>> MySQL code block only:")
# print(mysql_generate_task.output.raw_output)
You can provide your own databases in the mysql_db_schemas/
directory. You can load them in main.py
by changing the file path.
filepath = "./mysql_db_schemas/{your-database-name}.sql"
Order from newest to oldest.
- Add feature: MySQL explainer
- Edit feature: MYSQL explainer's prompt clean up
- Initialize: README documentation
- Add feature: read database schemas from file
- Add feature: convert database schemas to JSON format (using hand code)
- Edit feature: convert database schemas to JSON format using AI
- Add feature: convert JSON format database schemas to markdown JSON format
- Add feature: MySQL generation
- Edit feature: MySQL generation's prompt edit
- Edit feature: MySQL generation's prompt clean up
- Initialize: database utils
- Testing:
raw
vsraw_output
- Testing: modular design with CrewAI
- Testing: using
raw_output
with modular design
-
CrewAI is built upon LangChain, so CrewAI can use LangChain functions and tools (at least some of it).
-
In CrewAI, there 4 components that are important, they are
Agent()
,Task()
,Process()
andCrew()
. However, onlyAgent()
,Task()
andProcess()
are worth keeping an eye on.-
Process()
: There are 2 types of process being sequential and hierarchy. However, it's sequential by defaultt. -
Agent()
: Initialize and declare your LLMs to use in the task. -
Task()
: This is the most important part.Task()
is where most of the process' tasks are done. The output of 1 task is automatically used as the input of the next task but you can use thecontext
keyword to force this flow. Thecontext
keyword is also used to dictate which tasks will be used as input of another task.- The output of 1 task is automatically used as the input of the next one. However, you can specify which task(s)
should be used in another task, this could lead to 1 task using multiple tasks for its context. An
Agent
can be used by multipleTask
(s) but thoseTask
(s) must be performing the same operation. For example,analyze_agent
can be used by bothbusiness_analyze
andmarket_analyze
because they perform the same operation, butimage_generate
cannot.
- The output of 1 task is automatically used as the input of the next one. However, you can specify which task(s)
should be used in another task, this could lead to 1 task using multiple tasks for its context. An
-
Crew()
: This is where you will assemble your agents and tasks (list which agents and tasks are used in the process). The order in which the tasks are placed inCrew()
will be the order that the tasks will perform (because it's sequential by default). Because the important components are the tasks not the agents, this means you can have a dispoportion for the agents and the tasks where an agent are used to perform for 2 tasks.Crew()
is not that important, it's just a collection that house agents and tasks.
-
-
In the CrewAI docs, these are the sections you need to view:
-
From
Core Concepts
: Agent, Task, Process, Crew and Tool (maybe) -
From
How to Guides
: Sequential Process Overview -> Implementing Sequential Process and Hierarchical Process Overview (maybe) -
From
Examples
: Look over some example code and you will see some a crew that have multiple task but fewer agent
-
-
Initialize: the 2nd initializattion of the project
-
Rebuild the project based on new layouts:
- Initialize: the project
- Add few-shot examples