Large Scale Distributed Systems Final Project
This repository features a local-first list application that demonstrates principles of distributed systems, including data sharding, replication, and conflict-free resolution (CRDT) for large-scale environments. The app prioritizes offline functionality while ensuring consistency through robust sync mechanisms.
Key Features:
- Local-First Approach: Data is stored and managed locally for fast, reliable access.
- Periodic Syncing: Automatic batch updates with server syncs to ensure data consistency.
- Conflict-Free Resolution: Resolves changes intelligently across multiple devices.
- Data Sharding: Horizontally scales data across multiple servers using consistent hashing for even distribution.
- Replication: Redundant copies (N=3) ensure fault tolerance and reliability.
- Read/Write Quorum: Guarantees data consistency and integrity with majority-based operations (W=R=2).
This project highlights modern techniques like consistent hashing, quorum reads/writes, eventual consistency, and redundancy to handle distributed, large-scale data efficiently
Made in Collaboration with:
src
- The source codedoc
- Documentation: includes the slides used in the presentation and the initial requirements
- Please make sure you have npm installed. Visit nodejs.org to download and install it.
- The Slides used in the presentation are in the doc folder
- The Demo video
- A video denoting the use cases
⚠️ To replicate the project, please ensure you have:
- A valid connection string to a PostgreSQL server
- A Redis connection for Bull jobs (you can create a free database at redis.io)
Refer to the example .env files in the the app and the api
- Move into the
api
folder
cd api
- Install dependencies, globally
npm install -g yarn
- Ensure
.env
is set up correctlyshould have
DATABASE_URL
set - Install dependencies
yarn
- Apply prisma shards migrations and generate the shards Prisma client (script for windows shell)
.\migrate_shards.ps1
For linux, you can try running
migrate_shards.sh
. This has not been tested yet, so please report any issues.
- Start the server
npm run start:dev
- Move into the
app
folder
cd app # assuming you are in the root folder
- Install dependencies
npm install
- Start the development server
npm run dev
- In order to use ZeroMQ to its full extent, we establish a PUB/SUB relationship between the API and a server, that we call "the Bridge".
- Then, using Socket.IO, the bridge will forward the messages to the web app.
- Move into the
bridge
folder
cd bridge
- Install dependencies
npm install
- Start the bridge
npm start
- test_user
- E3%hjEzN@WULHM
Prisma is the ORM used to interact with the database.
The schema is the source of truth for the database. When you modify the schema, you need to generate the Prisma client to reflect the changes (it will create an .sql
file in the **prisma**/migrations
folder).
- Make changes to
prisma/schema.prisma
- Run
yarn prisma generate
to update the Prisma client
Run with
--name <name>
to name your migration, otherwise you will be prompted to name it after
- Run
yarn prisma migrate dev
to update the database
- Run
yarn prisma studio
to view the database in the browser
- Run
yarn prisma migrate dev --create-only
to create an empty migration
Note that it does not apply the migration to the database, it just generates the SQL file
- Write the SQL to insert the data you want into the
.sql
file in theprisma/migrations
folder - Apply the migration to the database by running
yarn prisma migrate dev
It will prompt you to give the migration a name, if you haven't changed the name from the default, just press enter to accept it