A robust, fault-tolerant distributed file system implementation with master-slave architecture, file replication, and concurrent access control.
- Master Server (master_server.py)
  - Central metadata management
  - Chunk server registration and coordination
  - Primary server selection and updates
  - Client request routing
- Chunk Servers (3 instances)
  - chunk_server1.py: Port 6001, ID 1
  - chunk_server2.py: Port 6002, ID 2
  - chunk_server3.py: Port 6003, ID 3
  - Local file storage and management
  - File operations (create, read, write, delete)
  - Concurrent access control with file locking
- Clients (2 instances)
  - client1.py: Client application 1
  - client2.py: Client application 2
  - File operation requests to chunk servers
- Supporting Components
  - master_server_heartbeat.py: Master server health monitoring
  - file_server_heartbeat.py: Chunk server health monitoring
  - node_failure.py: Failure detection and recovery
  - metadata.json: System metadata storage
- File Operations: Create, read, write, and delete files
- Concurrent Access: Thread-safe file operations with locking
- Fault Tolerance: Replication and failure detection
- Load Balancing: Distributed file storage across multiple servers
- Metadata Management: Centralized file location tracking
- File Locking: Prevents concurrent write conflicts
- Performance Monitoring: Request timing and logging
- Health Monitoring: Heartbeat mechanisms for server health
- Error Handling: Comprehensive timeout and exception management
- Backup System: Local file copies for data protection
- Python 3.7+
- Socket programming support
- Threading support
- File system access
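Threading support matters because the chunk servers serve several clients at once while guarding each file with a lock. The sketch below is one way to picture that, not the project's actual code: `FileLockRegistry` and `locked_write` are hypothetical names, and only the `FILE_LOCKED_ERROR` and `FILE_WRITTEN` replies come from this README.

```python
import threading
from collections import defaultdict


class FileLockRegistry:
    """Hypothetical helper: keeps one non-blocking lock per file path."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the dictionary itself
        self._locks = defaultdict(threading.Lock)

    def try_acquire(self, path):
        """Return True if the caller now holds the lock for path."""
        with self._guard:
            lock = self._locks[path]
        return lock.acquire(blocking=False)

    def release(self, path):
        """Release a lock previously taken with try_acquire."""
        with self._guard:
            lock = self._locks[path]
        lock.release()


def locked_write(registry, path, content):
    """Write content to path, or report FILE_LOCKED_ERROR if it is busy."""
    if not registry.try_acquire(path):
        return "FILE_LOCKED_ERROR"
    try:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(content)
        return "FILE_WRITTEN"
    finally:
        registry.release(path)
```

Failing fast with `FILE_LOCKED_ERROR` instead of blocking is an assumption made here; a server could just as well wait for the lock with a timeout.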
git clone https://github.com/Jay2704/distributed_file_system.git
cd distributed_file_system
python master_server.py

The master server will start on 127.0.0.1:5011.
Open separate terminal windows for each chunk server:
# Terminal 1 - Chunk Server 1
python chunk_server1.py
# Terminal 2 - Chunk Server 2
python chunk_server2.py
# Terminal 3 - Chunk Server 3
python chunk_server3.py
# Terminal 4 - Client 1
python client1.py
# Terminal 5 - Client 2
python client2.py

The system supports the following file operations:
# Client sends: CREATE_FILE:filename
# Response: FILE_CREATED
# Client sends: WRITE_FILE:filename:content
# Response: FILE_WRITTEN
# Client sends: READ_FILE:filename
# Response: FILE_CONTENT:content or FILE_NOT_FOUND
# Client sends: DELETE_FILE:filename
# Response: FILE_DELETED or FILE_NOT_FOUND
# Connect to chunk server
# Send file operations
# Receive responses
- Master Server: 5011
- Chunk Server 1: 6001
- Chunk Server 2: 6002
- Chunk Server 3: 6003
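As a rough illustration of the request format and ports above, a client could build and send messages along these lines. This is a sketch under stated assumptions rather than the code in client1.py: the helper names are hypothetical, and the UTF-8 encoding, one request per connection, the 4096-byte reply limit, and the 127.0.0.1 host for chunk servers are all assumed.

```python
import socket

# Chunk server ID -> port, taken from the port list in this README.
CHUNK_SERVER_PORTS = {1: 6001, 2: 6002, 3: 6003}


def build_request(operation, filename, content=None):
    """Format a request such as WRITE_FILE:filename:content."""
    parts = [operation, filename]
    if content is not None:
        parts.append(content)
    return ":".join(parts)


def parse_response(raw):
    """Split a reply into (status, payload); payload is None if absent."""
    status, sep, payload = raw.partition(":")
    return status, (payload if sep else None)


def send_request(server_id, message, host="127.0.0.1", timeout=5.0):
    """Connect to a chunk server, send one request, return the reply text.

    Assumed, not confirmed by the README: UTF-8 text, one request per
    connection, and replies that fit in a single 4096-byte read.
    """
    port = CHUNK_SERVER_PORTS[server_id]
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(message.encode("utf-8"))
        return conn.recv(4096).decode("utf-8")
```

With chunk server 1 running, `send_request(1, build_request("CREATE_FILE", "notes.txt"))` would be expected to return `FILE_CREATED`.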
distributed_file_system/
├── chunk_server_1_directory/
├── chunk_server_2_directory/
├── chunk_server_3_directory/
├── master_server.py
├── chunk_server1.py
├── chunk_server2.py
├── chunk_server3.py
├── client1.py
├── client2.py
└── README.md
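Each chunk server keeps its files in its own directory (for example chunk_server_1_directory/). The following sketch shows how a server might map the documented requests onto that directory. It is an assumption-laden illustration, not the contents of chunk_server1.py: `handle_request` is a hypothetical function, writes are assumed to overwrite rather than append, and the `UNKNOWN_OPERATION` reply is invented for the example.

```python
import os


def handle_request(storage_dir, message):
    """Apply one request to files under storage_dir and return the reply."""
    operation, _, rest = message.partition(":")
    if operation == "WRITE_FILE":
        # Only the first colon separates the name, so content may contain ':'.
        filename, _, content = rest.partition(":")
    else:
        filename, content = rest, None
    # basename() keeps requests inside storage_dir (an added safety assumption).
    path = os.path.join(storage_dir, os.path.basename(filename))

    if operation == "CREATE_FILE":
        open(path, "a", encoding="utf-8").close()  # create if missing
        return "FILE_CREATED"
    if operation == "WRITE_FILE":
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(content)
        return "FILE_WRITTEN"
    if operation == "READ_FILE":
        if not os.path.isfile(path):
            return "FILE_NOT_FOUND"
        with open(path, encoding="utf-8") as handle:
            return "FILE_CONTENT:" + handle.read()
    if operation == "DELETE_FILE":
        if not os.path.isfile(path):
            return "FILE_NOT_FOUND"
        os.remove(path)
        return "FILE_DELETED"
    return "UNKNOWN_OPERATION"  # hypothetical reply, not defined in this README
```

Keeping the handler free of socket code makes it easy to exercise against a temporary directory before wiring it into a server loop.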
- Master Server: Central coordinator managing metadata and chunk server registration
- Chunk Servers: Distributed storage nodes handling file operations
- Clients: Applications requesting file operations
- Prevents concurrent write conflicts
- Ensures data consistency
- Supports multiple client connections
- Primary server selection for redundancy
- Backup file creation during write operations
- Fault tolerance through multiple chunk servers
- Heartbeat mechanisms for server health checks
- Failure detection and recovery
- Performance metrics logging
- Request timing and performance metrics
- Error handling and exception logging
- File operation status tracking
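The heartbeat and failure-detection items above can be pictured with a small tracker like the one below. It is a minimal sketch, not the logic in master_server_heartbeat.py or node_failure.py: `HeartbeatMonitor`, the 10-second timeout, and the rule of choosing the lowest live server ID as primary are all assumptions made for illustration.

```python
import time


class HeartbeatMonitor:
    """Hypothetical tracker: a server counts as alive if it reported recently."""

    def __init__(self, timeout_seconds=10.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # injectable so tests can control time
        self._last_seen = {}

    def record(self, server_id):
        """Note that a heartbeat from server_id just arrived."""
        self._last_seen[server_id] = self._clock()

    def failed_servers(self):
        """Return the sorted IDs whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(
            server_id
            for server_id, seen in self._last_seen.items()
            if now - seen > self.timeout_seconds
        )

    def pick_primary(self):
        """Return the lowest-ID live server, or None if none are alive."""
        failed = set(self.failed_servers())
        alive = [s for s in sorted(self._last_seen) if s not in failed]
        return alive[0] if alive else None
```

Using a monotonic clock avoids false failures when the system clock is adjusted; passing a fake clock makes the timeout behaviour easy to test.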
- FILE_LOCKED_ERROR: File is currently locked by another client
- FILE_NOT_FOUND: Requested file doesn't exist
- TIMEOUT_ERROR: Operation exceeded timeout limit
- COPY_ERROR: Backup creation failed
- Automatic retry mechanisms
- Graceful degradation on server failures
- Data consistency checks
- File access control through locking
- Concurrent access prevention
- Data integrity through backup mechanisms
- Error isolation and recovery
- About 2N storage for N bytes of file data, since each file is kept with a backup copy (O(N) asymptotically)
- Efficient metadata management
- Optimized file locking mechanisms
- O(1) for most file operations
- Thread-safe concurrent access
- Minimal latency for distributed operations
- Fork the repository
- Create a feature branch
- Make your changes
- Add comprehensive comments
- Test thoroughly
- Submit a pull request
This project is open source and available under the MIT License.
Jay2704 - Distributed Systems Implementation
GitHub: https://github.com/Jay2704/distributed_file_system
This distributed file system demonstrates key concepts in distributed computing, fault tolerance, and concurrent programming.