A Python tool for synchronizing data between MySQL databases, with support for parallel processing, retry mechanisms, and structured logging.
sync-data/
├── main.py                              # Main entry point
├── requirements.txt                     # Python dependencies
├── config.yaml                          # Configuration file
├── test_config.yaml                     # Test configuration
├── src/                                 # Source code
│   └── sync_data/                       # Main package
│       ├── main.py                      # Main application logic
│       ├── config_manager.py            # Configuration management
│       ├── database_operations.py       # Database operations
│       ├── log_manager.py               # Logging system
│       ├── password_manager.py          # Password encryption
│       └── sync_engine.py               # Synchronization engine
├── tests/                               # Test suite
│   ├── test_main.py                     # Main application tests
│   ├── test_config_manager.py           # Configuration tests
│   ├── test_database_operations.py      # Database operation tests
│   ├── test_log_manager.py              # Logging tests
│   ├── test_password_manager.py         # Password manager tests
│   ├── test_sync_engine.py              # Sync engine tests
│   └── test_integration.py              # Integration tests
├── docs/                                # Documentation
│   ├── README.md                        # This file
│   └── README_TESTING.md                # Testing documentation
└── scripts/                             # Utility scripts
    └── run_tests.py                     # Test runner
- Clone the repository:
  git clone <repository-url>
  cd sync-data
- Install dependencies:
  pip install -r requirements.txt
- Configure your database settings in config.yaml
- Run the synchronization:
  python main.py --config config.yaml
- Object-Oriented Design: Clean, modular architecture using Python classes
- Atomic Data Import: Safe data import using temporary tables and atomic table swapping
- Parallel Processing: Support for both sequential and parallel data synchronization
- Retry Mechanism: Automatic retry with configurable attempts and delays
- Comprehensive Logging: Structured logging with rotation and colored output
- Password Security: Encrypted storage of database credentials
- Health Checks: Database connectivity and status monitoring
- Data Validation: Row count verification after synchronization
- Error Handling: Robust error handling with detailed logging
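The retry behaviour listed above can be pictured with a minimal sketch. It assumes a fixed delay between attempts; the function name `run_with_retry` and its parameters are hypothetical and are not taken from the project's code, which may use a different strategy (for example, exponential backoff).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `max_attempts` is reached.

    Hypothetical helper for illustration; not the project's actual API.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay_seconds)  # fixed delay before the next attempt
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.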
The tool uses YAML configuration files. See config.yaml for a complete example.
- database: Source and target database connection settings
- sync: Synchronization parameters (tables, parallel mode, timeout)
- paths: Directory paths for temporary files, exports, and logs
- retry: Retry configuration for failed operations
- tools: External tool paths (mo-dump, mysql)
- security: Password encryption settings
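As an illustration of how the top-level sections above might be checked once the YAML file has been parsed into a dictionary, here is a small sketch. The helper `missing_sections` is hypothetical, and treating all six sections as required is an assumption; the actual validation in config_manager.py may differ.

```python
from typing import Any, List, Mapping

# Section names as listed in this README. Treating every one of them as
# required is an assumption made for this sketch.
EXPECTED_SECTIONS = ("database", "sync", "paths", "retry", "tools", "security")


def missing_sections(config: Mapping[str, Any]) -> List[str]:
    """Return the expected top-level sections absent from `config`."""
    return [name for name in EXPECTED_SECTIONS if name not in config]
```

A caller could refuse to start the sync when the returned list is non-empty and report the missing names in the log.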
Run the complete test suite:
python scripts/run_tests.py
Or run individual test modules:
python -m unittest tests.test_main
python -m unittest tests.test_sync_engine
python -m unittest tests.test_integration
- DatabaseSyncer: Main synchronization orchestrator
- ConfigurationManager: Configuration loading and validation
- DatabaseConnection: Database connectivity management
- SyncEngine: Data export/import coordination with atomic operations
- DataExporter: Table export using mo-dump tool
- DataImporter: Atomic table import with temporary table swapping
- LogManager: Structured logging system
- PasswordManager: Secure credential management
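The temporary-table swap attributed to DataImporter can be sketched as follows. This is one common MySQL pattern, shown under assumptions: the `_tmp` and `_old` suffixes and both function names are hypothetical, and the statements the project actually issues are not documented here. The pattern relies on MySQL's `RENAME TABLE` applying a multi-table rename as a single atomic operation.

```python
import re
from typing import List

# Restrict table names to a conservative character set, since the name is
# interpolated into SQL text.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def _check(table: str) -> None:
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unsafe table name: {table!r}")


def prepare_statements(table: str) -> List[str]:
    """Statements that create an empty staging copy of `table`."""
    _check(table)
    return [
        f"DROP TABLE IF EXISTS `{table}_tmp`",
        f"DROP TABLE IF EXISTS `{table}_old`",
        f"CREATE TABLE `{table}_tmp` LIKE `{table}`",
    ]


def swap_statements(table: str) -> List[str]:
    """Statements that swap the loaded staging table into place."""
    _check(table)
    return [
        # Both renames happen in one atomic step, so readers see either
        # the old table or the new one, never a missing table.
        f"RENAME TABLE `{table}` TO `{table}_old`, `{table}_tmp` TO `{table}`",
        f"DROP TABLE `{table}_old`",
    ]
```

In this sketch the data would be loaded into the `_tmp` table between the two sets of statements, so a failed import leaves the live table untouched.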
- Create a feature branch:
  git checkout -b feature/new-feature
- Implement changes in the appropriate modules under src/sync_data/
- Add tests in the tests/ directory
- Update documentation
- Submit a pull request
- Follow PEP 8 guidelines
- Use type hints where appropriate
- Include docstrings for all public methods
- Maintain test coverage above 90%
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.