- Introduction
- Why NovaDB?
- Key Features
- Architecture
- Technologies Used
- Installation
- Usage
- Testing
- Future Enhancements
NovaDB is a lightweight database management system with a command-line interface (CLI), designed for simplicity and efficiency. It supports a subset of SQL for data definition (DDL) and data manipulation (DML), along with basic transaction management. It aims to provide an educational yet production-grade experience in understanding how databases work from the ground up, including parsing, indexing, transaction management, and SQL execution.
NovaDB was created with the following objectives:
- To build a C++-based database engine with a custom storage engine and SQL parser.
- To support a command-line interface (CLI) for direct interaction.
- To offer SQL compatibility for common operations (CREATE, INSERT, SELECT, UPDATE, DELETE).
- To focus on simplicity, modularity, and performance.
- To provide clear documentation and extensibility for learning and research purposes.
It targets:
- Students and developers learning how databases work internally.
- Developers building lightweight embedded systems.
- Researchers interested in experimenting with custom query optimizations or storage models.
- CLI Interface: Interact with the database using a familiar command-line prompt.
- SQL Support: Execute `CREATE TABLE`, `INSERT`, `SELECT`, `UPDATE`, and `DELETE` statements.
- Transaction Management: Supports `BEGIN TRANSACTION`, `COMMIT TRANSACTION`, and `ROLLBACK TRANSACTION` for data integrity.
- Page-based Storage: Data is stored and managed in fixed-size pages on disk.
- Write-Ahead Log (WAL): Ensures data durability and supports transaction recovery.
- B-Tree Indexing: Provides basic indexing for efficient data retrieval.
- Multi-threaded Query Execution: Utilizes a thread pool for parallel processing of SELECT queries.
- Extensibility: Modular design allows for future additions of indexes, foreign keys, query optimizers, and plug-in style storage engines.
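To make the page-based storage feature listed above concrete, here is a minimal sketch of fixed-size page I/O. It is not NovaDB's actual pager: the 4096-byte page size and the names `kPageSize`, `Page`, `page_offset`, `write_page`, and `read_page` are all assumptions chosen for illustration. The idea is only that page `n` lives at byte offset `n * page size`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <sstream>

// Assumed page size; the real value used by NovaDB is not stated in this README.
constexpr std::size_t kPageSize = 4096;

// Hypothetical in-memory representation of one page.
using Page = std::array<char, kPageSize>;

// Byte offset of a page inside the database file: page_no * page size.
inline std::streamoff page_offset(std::uint32_t page_no) {
    return static_cast<std::streamoff>(page_no) *
           static_cast<std::streamoff>(kPageSize);
}

// Writes one full page at its offset. Returns false if the stream reports an error.
inline bool write_page(std::iostream& io, std::uint32_t page_no, const Page& page) {
    io.clear();                       // reset any earlier EOF/fail state
    io.seekp(page_offset(page_no));
    io.write(page.data(), static_cast<std::streamsize>(page.size()));
    io.flush();
    return static_cast<bool>(io);
}

// Reads one full page from its offset. Returns false on a short or failed read.
inline bool read_page(std::iostream& io, std::uint32_t page_no, Page& page) {
    io.clear();
    io.seekg(page_offset(page_no));
    io.read(page.data(), static_cast<std::streamsize>(page.size()));
    return static_cast<bool>(io) &&
           io.gcount() == static_cast<std::streamsize>(kPageSize);
}
```

Because the functions take a `std::iostream&`, the same sketch works with a `std::fstream` opened in binary mode on a `.db` file or with a `std::stringstream` for experiments.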
NovaDB has a modular architecture that separates concerns into distinct components for clarity and maintainability. The core components reside in the `src/core/` directory, while the user interface lives in `src/cli/`.
- Pager (`pager.h`, `pager.cpp`): The lowest level of the storage engine, responsible for managing interactions with the underlying `.db` file on disk. It handles reading and writing fixed-size data pages, schema storage, and WAL integration.
- Parser (`parser.h`, `parser.cpp`): Responsible for interpreting SQL statements, transforming raw SQL strings into a structured internal representation (`Statement` objects).
- Serializer (`serializer.h`, `serializer.cpp`): Provides utility functions for converting in-memory data structures into a byte stream suitable for storage on disk, and vice versa.
- Index (`index.h`, `index.cpp`): Implements a B-tree data structure to facilitate efficient data retrieval, mapping primary key values to page numbers.
- Write-Ahead Log (WAL) (`wal.h`, `wal.cpp`): Ensures the durability and atomicity of transactions by logging all changes before applying them to the main database file.
- Thread Pool (`thread_pool.h`, `thread_pool.cpp`): Used to manage and execute tasks concurrently, primarily for parallelizing full table scans during `SELECT` operations.
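As an illustration of what the Serializer component does, the sketch below converts a row to bytes and back. The `Row` struct, the function names, and the byte layout (a 4-byte little-endian integer, a 4-byte length, then the raw text) are assumptions made for this example; the format NovaDB actually uses may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical row with one INT column and one TEXT column.
struct Row {
    std::int32_t id;
    std::string name;
};

// Assumed layout: [id: 4 bytes LE][name length: 4 bytes LE][name bytes].
inline std::vector<std::uint8_t> serialize(const Row& row) {
    std::vector<std::uint8_t> out;
    auto put_u32 = [&out](std::uint32_t v) {
        for (int shift = 0; shift < 32; shift += 8)
            out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFFu));
    };
    std::uint32_t raw_id = 0;
    std::memcpy(&raw_id, &row.id, sizeof raw_id);  // reinterpret the signed bits
    put_u32(raw_id);
    put_u32(static_cast<std::uint32_t>(row.name.size()));
    out.insert(out.end(), row.name.begin(), row.name.end());
    return out;
}

// Returns false if the buffer is too short to hold a complete row.
inline bool deserialize(const std::vector<std::uint8_t>& in, Row& row) {
    auto get_u32 = [&in](std::size_t pos) {
        std::uint32_t v = 0;
        for (std::size_t i = 0; i < 4; ++i)
            v |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
        return v;
    };
    if (in.size() < 8) return false;            // header incomplete
    const std::uint32_t len = get_u32(4);
    if (in.size() - 8 < len) return false;      // text truncated
    const std::uint32_t raw_id = get_u32(0);
    std::memcpy(&row.id, &raw_id, sizeof raw_id);
    const auto text_begin = in.begin() + 8;
    row.name.assign(text_begin, text_begin + static_cast<std::ptrdiff_t>(len));
    return true;
}
```

Writing the length before the text lets the reader know how many bytes belong to the `TEXT` value without needing a terminator, which is one common choice for variable-length fields.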
The main.cpp file serves as the entry point for the interactive command-line application, orchestrating the interaction between the user and the core database engine through a Read-Eval-Print Loop (REPL).
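The REPL described above can be sketched roughly as follows. This is not the code in `main.cpp`: the `nova> ` prompt, the `InputKind` enum, and the functions `trim`, `classify`, and `run_repl` are hypothetical, and the sketch only echoes what it would dispatch instead of calling a real parser or pager.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

enum class InputKind { Empty, DotCommand, Sql };

// Strips leading and trailing whitespace.
inline std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t\r\n");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r\n");
    return s.substr(first, last - first + 1);
}

// Dot commands start with '.', everything else is treated as SQL.
inline InputKind classify(const std::string& trimmed) {
    if (trimmed.empty()) return InputKind::Empty;
    return trimmed.front() == '.' ? InputKind::DotCommand : InputKind::Sql;
}

// Read-eval-print loop over arbitrary streams.
// Returns the number of SQL statements seen before ".exit" or end of input.
inline int run_repl(std::istream& in, std::ostream& out) {
    int sql_seen = 0;
    std::string line;
    for (;;) {
        out << "nova> ";                        // assumed prompt text
        if (!std::getline(in, line)) break;     // end of input
        const std::string input = trim(line);
        switch (classify(input)) {
        case InputKind::Empty:
            break;
        case InputKind::DotCommand:
            if (input == ".exit") return sql_seen;
            out << "dot command: " << input << '\n';   // placeholder dispatch
            break;
        case InputKind::Sql:
            ++sql_seen;
            out << "sql: " << input << '\n';           // placeholder dispatch
            break;
        }
    }
    return sql_seen;
}
```

Calling `run_repl(std::cin, std::cout)` gives an interactive loop, and passing string streams makes the same loop easy to exercise in a test.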
- Language: C++ (core engine)
- SQL parser: Custom
- CLI: C++ (standard input/output based REPL)
- File extension: `.db`
- Build system: CMake
- Operating systems: Linux, Windows
Before you begin, ensure you have the following installed on your system:
- C++ Compiler: A C++17 compatible compiler (e.g., GCC, Clang).
- CMake: Version 3.10 or higher.
- Git: For cloning the repository.
- Docker (Optional): If you plan to use the Dockerized environment.
- Clone the Repository: `git clone https://github.com/akram-dris/nova-db.git`, then `cd nova-db`
- Create a Build Directory: `mkdir cmake-build-debug`, then `cd cmake-build-debug`
- Configure the Project with CMake: `cmake ..`
- Build the Project: `cmake --build .`
- Verify Installation: The executable `nova_db` will be located in the `cmake-build-debug/` directory. Run it with `./nova_db`
- Build the Docker Image: `docker build -t novadb-image .`
- Run the Docker Container: `docker run -it novadb-image`
Start the CLI with `./cmake-build-debug/nova_db`. The following dot commands are available:
- `.open <filename.db>`: Opens an existing database file or creates a new one.
- `.tables`: Lists all tables in the open database.
- `.schema [table_name]`: Displays the schema for a specified table or all tables.
- `.help`: Displays a list of all available CLI dot commands and supported SQL commands.
- `.exit`: Exits the NovaDB CLI application.
- `CREATE TABLE <table_name> (<col1> <type1>, <col2> <type2>, ...);`: Creates a new table. Supported types: `INT`, `TEXT`.
- `INSERT INTO <table_name> VALUES (<value1>, <value2>, ...);`: Inserts a new row.
- `SELECT * FROM <table_name> [WHERE <condition>];`: Retrieves all columns from the specified table with optional filtering.
- `UPDATE <table_name> SET <col1> = <val1>, <col2> = <val2>, ... [WHERE <condition>];`: Modifies existing rows with optional filtering.
- `DELETE FROM <table_name> [WHERE <condition>];`: Deletes rows with optional filtering.
- `BEGIN TRANSACTION;`: Starts a new transaction.
- `COMMIT TRANSACTION;`: Saves all changes within the current transaction.
- `ROLLBACK TRANSACTION;`: Undoes all changes within the current transaction.
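The transaction commands above can be illustrated with a small in-memory sketch of the log-before-apply idea behind a write-ahead log. Everything here is an assumption for illustration: the `TinyStore` class and its methods are hypothetical, the log lives in memory rather than in a file that is flushed to disk, and changes are applied only at commit time, which may not match how NovaDB's WAL is organized.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical key-value store with a pending-change log.
class TinyStore {
public:
    // BEGIN TRANSACTION: start collecting changes in the log.
    void begin() {
        in_txn_ = true;
        log_.clear();
    }

    // Inside a transaction the change is only logged; outside it is applied directly.
    void put(const std::string& key, const std::string& value) {
        if (in_txn_) {
            log_.emplace_back(key, value);
        } else {
            data_[key] = value;
        }
    }

    // COMMIT TRANSACTION: replay the logged changes into the main data, in order.
    void commit() {
        for (const auto& record : log_) data_[record.first] = record.second;
        log_.clear();
        in_txn_ = false;
    }

    // ROLLBACK TRANSACTION: drop the logged changes; the main data is untouched.
    void rollback() {
        log_.clear();
        in_txn_ = false;
    }

    bool contains(const std::string& key) const { return data_.count(key) != 0; }

    // Returns the committed value, or an empty string if the key is absent.
    std::string get(const std::string& key) const {
        const auto it = data_.find(key);
        return it == data_.end() ? std::string() : it->second;
    }

private:
    bool in_txn_ = false;
    std::vector<std::pair<std::string, std::string>> log_;  // pending changes
    std::map<std::string, std::string> data_;               // committed state
};
```

In a durable implementation the log records would be written and flushed to disk before the main database file is modified, so that a crash between the two steps can be recovered by replaying or discarding the log.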
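Commands can also be piped into the CLI non-interactively. `printf` is used here because `echo` does not expand `\n` in every shell: `printf '.open my_database.db\nSELECT * FROM users;\n.exit\n' | ./cmake-build-debug/nova_db`
All test scripts are located in the `tests/` directory.
To run the entire test suite, execute `./tests/run_tests.sh`
Test outputs are stored in `tests/output/`.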
- Integrate NovaDB with the Nova Language as a built-in database layer.
- Create Nova Studio, a lightweight GUI for managing `.db` files.

