Golang Server to Periodically Create & Upload MongoDB database Backups of any Size to AWS S3 as CSV with Circular Buffer Functionality

paulmuenzner/backupserver

Golang Backup Server

Circular Buffer Backups For MongoDB Using AWS S3

Table of Contents
  1. About The Project
  2. Getting Started
  3. Roadmap
  4. Contributing
  5. License
  6. Contact
  7. Acknowledgments

About The Project

This Golang-based server is designed to automate recurring backups of MongoDB databases, offering flexibility and ease of use. The server converts each collection into CSV file format, providing human-readable backups that are easily managed. Backups are then uploaded to AWS S3, ensuring secure and scalable storage.

Features

  • Recurring Backups: Define automatic, scheduled backups using cron jobs.
  • Circular Buffer: Implement a circular buffer to manage and optimize backup storage.
  • CSV Format: Each MongoDB collection is saved as a CSV file, offering simplicity and human readability.
  • Configuration Flexibility: Adjust parameters such as the cron schedule and the number of retained backups through a flexible configuration system.
  • Limitless Sizes: Create MongoDB backups of any size and upload them to AWS S3. Large collections are split automatically into multiple numbered files so that no file exceeds S3's maximum permitted upload size.
  • Limitless Nested Levels: This backup server can handle any level of nested values inside MongoDB's BSON documents.
  • Dependency Injection (DI) Setup: The server is built around a dedicated Dependency Injection (DI) setup for flexibility, reduced coupling and testability. The core functionalities (database communication, AWS operations and email notifications) are integrated as modular components.
  • AWS S3 Integration: Backups are securely uploaded to AWS S3 for reliable and scalable storage. Multipart uploads are applied automatically for large CSV files, improving throughput by uploading several parts in parallel.
  • S3 Pagination: Pagination implemented to handle large object lists with AWS S3.
  • Local Backups: Optionally store backups on your local machine as well, with the same circular buffer functionality.
  • Robust Error Handling Mechanism: Any encountered errors are diligently logged to the designated log folder and simultaneously dispatched via email notifications.

Advantage CSV Backups

Choosing to make backups in CSV format offers simplicity, portability, and human readability. CSV files are plain text, making them easy to understand, edit, and share across various platforms. They don't rely on database-specific tools (e.g. MongoDB Compass), providing independence and ease of use. Additionally, CSV allows for straightforward analysis, version control, and transparency into data structure. This format is database-agnostic, facilitating compatibility and reducing dependencies. While native MongoDB backups have their advantages, CSV backups are often preferred for their versatility and accessibility.
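
Nested BSON values have to be mapped onto flat CSV columns somehow. The sketch below shows one common approach, dot-separated column names. This is an assumption for illustration, not necessarily the column scheme this project uses. A plain map[string]any stands in for a BSON document, and Flatten and join are hypothetical helper names.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
	"sort"
)

// Flatten walks a nested document and stores every leaf value in out,
// keyed by its dot-separated path (e.g. "address.city" or "tags.0").
func Flatten(prefix string, value any, out map[string]string) {
	switch v := value.(type) {
	case map[string]any:
		for key, child := range v {
			Flatten(join(prefix, key), child, out)
		}
	case []any:
		for i, child := range v {
			Flatten(join(prefix, fmt.Sprint(i)), child, out)
		}
	default:
		out[prefix] = fmt.Sprint(v)
	}
}

// join appends key to prefix with a dot, unless prefix is empty.
func join(prefix, key string) string {
	if prefix == "" {
		return key
	}
	return prefix + "." + key
}

func main() {
	doc := map[string]any{
		"name":    "Ada",
		"address": map[string]any{"city": "London", "geo": map[string]any{"lat": 51.5}},
		"tags":    []any{"admin", "dev"},
	}
	flat := map[string]string{}
	Flatten("", doc, flat)

	// Sort the column names so the header order is deterministic.
	header := make([]string, 0, len(flat))
	for key := range flat {
		header = append(header, key)
	}
	sort.Strings(header)
	row := make([]string, len(header))
	for i, key := range header {
		row[i] = flat[key]
	}

	// WriteAll writes both records and flushes the writer.
	w := csv.NewWriter(os.Stdout)
	if err := w.WriteAll([][]string{header, row}); err != nil {
		fmt.Fprintln(os.Stderr, "csv write failed:", err)
	}
}
```

Because the recursion follows the document structure, the depth of nesting does not matter; each leaf simply gets a longer column name.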

Tech Stack

This project is built with and for:

  • AWS
  • Golang
  • MongoDB

Getting Started

Before launching the program, clone the repo, install the Go dependencies and make sure all configurations are set.

Prerequisites

  • Make sure MongoDB is installed and available.
  • Make sure a properly configured AWS S3 Bucket is ready.

Installation

  • Clone the repo
git clone https://github.com/paulmuenzner/backupserver.git
  • Install Go dependencies by running
go get

Environment file (.env)

Before running the program, you need to set up the required environment variables by creating a .env file in the root directory of the project. This file holds sensitive information and configurations needed for the proper functioning of the application.

Mandatory Environment Variables

AWS S3 & MongoDB Configuration:

The program needs access to both AWS S3 and MongoDB, so you must provide the following key-value pairs in the .env file:

  • AWS_S3_BUCKET_NAME: The name of your AWS S3 bucket.
  • AWS_REGION: The AWS region where your S3 bucket is located.
  • AWS_ACCESS_KEY_ID: Your AWS access key ID.
  • AWS_SECRET_ACCESS_KEY: Your AWS secret access key.
  • MONGODB_SCHEME: MongoDB scheme (most likely mongodb).
  • MONGODB_HOST: MongoDB host (localhost if self-hosted and running locally). Read more on mongodb.com.
  • MONGODB_PORT: MongoDB port, e.g. 27018 or the standard port 27017. Read more on mongodb.com.
  • MONGODB_DATABASE_NAME: Name of the MongoDB database you want to back up.

Optional Environment Variables

Email Notification Configuration:

If you intend to use email notifications (configured with SendEmailNotifications in the config file), include the following additional variables in your .env file:

  • EMAIL_PROVIDER_PASSWORD: Password for the email provider.
  • EMAIL_PROVIDER_USERNAME: Username for the email provider.
  • EMAIL_PROVIDER_SMTP_PORT: SMTP port for the email provider.
  • EMAIL_PROVIDER_HOST: Hostname of the email provider.
  • EMAIL_ADDRESS_SENDER_BACKUP: Sender email address for backup notifications.
  • EMAIL_ADDRESS_RECEIVER_BACKUP: Receiver email address for backup notifications.

MongoDB Authentication:

If your MongoDB instance requires authentication, also include:

  • MONGODB_USERNAME: Username as part of your MongoDB connection string, if needed. Read more on mongodb.com.
  • MONGODB_PASSWORD: Password as part of your MongoDB connection string, if needed. Read more on mongodb.com.

Important Note

Make sure to keep your '.env' file secure and do not share it publicly.

The program relies on these configurations to run successfully. Without the correct values in the .env file, certain features may not work as expected.
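
A startup check can report missing values early instead of failing halfway through a backup. The sketch below assumes the .env file has already been loaded into the process environment (for example by a .env loader library). MissingEnv and requiredEnv are hypothetical names, not part of this repository.

```go
package main

import (
	"fmt"
	"os"
)

// requiredEnv lists the mandatory keys described above.
var requiredEnv = []string{
	"AWS_S3_BUCKET_NAME",
	"AWS_REGION",
	"AWS_ACCESS_KEY_ID",
	"AWS_SECRET_ACCESS_KEY",
	"MONGODB_SCHEME",
	"MONGODB_HOST",
	"MONGODB_PORT",
	"MONGODB_DATABASE_NAME",
}

// MissingEnv returns every name that is unset or empty according to lookup.
// Passing the lookup function in keeps the check easy to test.
func MissingEnv(names []string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range names {
		if value, ok := lookup(name); !ok || value == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	missing := MissingEnv(requiredEnv, os.LookupEnv)
	if len(missing) > 0 {
		fmt.Println("missing required environment variables:", missing)
		return
	}
	fmt.Println("all required environment variables are set")
}
```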

Template

Here's an example .env template. Replace the "your-..." placeholders with your actual values.

# AWS S3 Configuration
AWS_S3_BUCKET_NAME=your-s3-bucket-name
AWS_REGION=your-aws-region
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key

# Database Configuration 
MONGODB_SCHEME=mongodb
MONGODB_USERNAME=your-mongodb-username # Optional
MONGODB_PASSWORD=your-mongodb-password # Optional
MONGODB_HOST=localhost
MONGODB_PORT=your-mongodb-port # Likely 27017
MONGODB_DATABASE_NAME=your-mongodb-database-name

# Email Notification Configuration (Optional)
EMAIL_PROVIDER_PASSWORD=your-email-provider-password
EMAIL_PROVIDER_USERNAME=your-email-provider-username
EMAIL_PROVIDER_SMTP_PORT=your-smtp-port
EMAIL_PROVIDER_HOST=your-email-provider-host
EMAIL_ADDRESS_SENDER_BACKUP=your-sender-email-address
EMAIL_ADDRESS_RECEIVER_BACKUP=your-receiver-email-address

Configuration

The following configurations can be modified in the config file located at /config/base_config.go

| Key | Description | Type | Example |
| --- | --- | --- | --- |
| DeleteLogsAfterDays | Errors are logged to the 'log/' folder, with log file names assigned based on the day. All logs generated within a day are consolidated into that day's log file. This parameter determines the number of days after which log files are deleted automatically. | int | 5 |
| NameDatabase | Name of the database you want to back up. It must be 100% identical to the MongoDB database name. | string | "MyProjectDB" |
| FolderNameBackup | Name of the folder inside the S3 bucket where your backup is stored. | string | "mydbbackup" |
| FileNameMetaData | File name of the metadata file containing information on each created backup file. | string | "meta_data.csv" |
| IntervalBackup | Cron-like syntax defining the recurring schedule of your automatic backup. | string | "@every 6h" |
| MaxFileSizeInBytes | The maximum size of a backup file. Be aware of the maximum upload size permitted by AWS S3. If the configured file size is not sufficient, a new backup file is created with the same name plus a sequential number appended at the end. | int64 | 2 * 1024 * 1024 * 1024 |
| SendEmailNotifications | Decide whether email notifications are sent. Emails are sent both on errors and on successfully completed backups. | bool | false |
| EmailProviderUserNameEnv | Name of the .env key. The value behind this key is placed in your .env file. Needed if you want to send transactional email notifications. Ask your provider for this value. | string | "EMAIL_PROVIDER_USERNAME" |
| EmailProviderPasswordEnv | Name of the .env key. The value behind this key is placed in your .env file. Needed if you want to send transactional email notifications. Ask your provider for this value. | string | "EMAIL_PROVIDER_PASSWORD" |
| EmailProviderSmtpPortEnv | Name of the .env key. The value behind this key is placed in your .env file. Needed if you want to send transactional email notifications. Ask your provider for this value. | string | "EMAIL_PROVIDER_SMTP_PORT" |
| EmailProviderHostEnv | Name of the .env key. The value behind this key is placed in your .env file. Needed if you want to send transactional email notifications. Ask your provider for this value. | string | "EMAIL_PROVIDER_HOST" |
| EmailAddressSenderEnv | Name of the .env key. The value behind this key is placed in your .env file. Needed if you want to send transactional email notifications. Ask your provider for this value. | string | "EMAIL_ADDRESS_SENDER_BACKUP" |
| EmailAddressReceiverEnv | Name of the .env key. The value behind this key is placed in your .env file. Needed if you want to send transactional email notifications. Ask your provider for this value. | string | "EMAIL_ADDRESS_RECEIVER_BACKUP" |
| IsCircularBufferActivatedS3 | Decide whether to use a circular buffer on S3. If 'false', all backups on S3 are kept without deleting any, which might increase costs depending on backup interval and database size. | bool | true |
| MaxBackupsS3 | Configuration for the circular buffer. If IsCircularBufferActivatedS3 is set to true, the circular buffer deletes all backups on S3 older than the newest MaxBackupsS3 backups. In this example, only the 12 newest backups are kept on S3; older backups are deleted. | int | 12 |
| UseLocalBackupStorage | Decide whether backups are also stored on your local machine (where this program is running). | bool | true |
| IsCircularBufferActivatedLocally | Same function as 'IsCircularBufferActivatedS3' but for local backup storage. | bool | true |
| MaxBackupsLocally | Same as 'MaxBackupsS3' but for local backup storage. If IsCircularBufferActivatedLocally is set to true, the circular buffer deletes all local backups older than the newest MaxBackupsLocally backups. | int | 10 |
| S3BucketEnv | Name of the .env key for the bucket name. The value behind this key is placed in your .env file. Needed to configure AWS S3. Check your AWS S3 dashboard for this value. A bucket with exactly this name must exist in your AWS account. | string | "AWS_S3_BUCKET_NAME" |
| S3RegionEnv | Name of the .env key for the S3 region. The value behind this key is placed in your .env file. The region name is shown in your AWS account. | string | "AWS_REGION" |
| S3AccessKeyEnv | Name of the .env key for the S3 access key. The value behind this key is placed in your .env file. The access key is available in your AWS account. | string | "AWS_ACCESS_KEY_ID" |
| S3SecretKeyEnv | Name of the .env key for the S3 secret key. The value behind this key is placed in your .env file. The secret key is available in your AWS account. | string | "AWS_SECRET_ACCESS_KEY" |
| MongoDatabaseSchemeEnv | Name of the .env key for the MongoDB scheme. The value behind this key is placed in your .env file. | string | "MONGODB_SCHEME" |
| MongoDatabaseUsernameEnv | Name of the .env key for the MongoDB user name, if needed. The value behind this key is placed in your .env file. | string | "MONGODB_USERNAME" |
| MongoDatabasePasswordEnv | Name of the .env key for the MongoDB password, if needed. The value behind this key is placed in your .env file. | string | "MONGODB_PASSWORD" |
| MongoDatabaseHostdEnv | Name of the .env key for the MongoDB host. The value behind this key is placed in your .env file. | string | "MONGODB_HOST" |
| MongoDatabasePortEnv | Name of the .env key for the MongoDB port number. The value behind this key is placed in your .env file. | string | "MONGODB_PORT" |
| MongoDatabaseNameEnv | Name of the .env key for the MongoDB database name. The value behind this key is placed in your .env file. | string | "MONGODB_DATABASE_NAME" |
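
To illustrate what MaxFileSizeInBytes controls, here is a minimal sketch of size-based splitting with numbered file names. PartName and SplitRows are hypothetical helpers, and the "_1", "_2" suffix format is an assumption; the exact naming used by this project is not shown here. The sketch also keeps all rows in memory, which a backup of arbitrary size would avoid by streaming.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// PartName returns the file name for a part index. Part 0 keeps the original
// name; later parts get a sequential number before the extension.
func PartName(fileName string, part int) string {
	if part == 0 {
		return fileName
	}
	ext := filepath.Ext(fileName)
	return fmt.Sprintf("%s_%d%s", strings.TrimSuffix(fileName, ext), part, ext)
}

// SplitRows groups CSV lines into parts so that no part exceeds maxBytes.
// A single line larger than maxBytes still ends up in a part of its own.
func SplitRows(rows []string, maxBytes int64) [][]string {
	var parts [][]string
	var current []string
	var size int64
	for _, row := range rows {
		rowSize := int64(len(row)) + 1 // +1 for the trailing newline
		if len(current) > 0 && size+rowSize > maxBytes {
			// The current part is full: close it and start a new one.
			parts = append(parts, current)
			current, size = nil, 0
		}
		current = append(current, row)
		size += rowSize
	}
	if len(current) > 0 {
		parts = append(parts, current)
	}
	return parts
}

func main() {
	rows := []string{"id,name", "1,Ada", "2,Linus", "3,Grace"}
	// A tiny limit of 16 bytes forces a split for demonstration purposes.
	for i, part := range SplitRows(rows, 16) {
		fmt.Println(PartName("users.csv", i), part)
	}
}
```

In this simplified version the header row appears only in the first part; whether later parts should repeat it is a design choice left open here.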

Run program

Run the program with go run main.go, or use a live reloader such as air by running air.

Roadmap

  • ✅ Add optional circular buffer feature for S3
  • ✅ Add optional circular buffer feature for local storage
  • ✅ Add optional email notification feature
  • ⬜️ Add gzip compression feature for entire backup files
  • ⬜️ Add a configurable threshold for the memory used to temporarily hold data during each collection loop iteration, to keep RAM usage in check
  • ⬜️ Expand testing
  • ⬜️ Address more nuanced linting issues
  • ⬜️ Include the option to perform backups for SQL databases in addition to MongoDB
  • ⬜️ Add option to back up multiple databases
  • ⬜️ Add option to upload backups to MS Azure

See the open issues to report bugs or request features.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

See CONTRIBUTING.md for more info.

License

Distributed under the GNU General Public License v2.0. See LICENSE for more information.

Contact

Paul Münzner: https://paulmuenzner.com

Project Link: https://github.com/paulmuenzner/backupserver

Acknowledgments

Use this space to list resources you find helpful and would like to give credit to. I've included a few of my favorites to kick things off!
