FLoC is a set of command line tools that can be used together to implement backups.
Main features:
- Simple: specialized replaceable processes using clean interfaces.
- Distributed: client-server architecture.
- Performant: optimized for data safety and speed.
- Deduplicating: data and metadata are deduplicated.
- Incremental: backups are made incrementally, but the data is stored non-incrementally.
Not ready for production yet.
Clients read the files to back up, write the files to restore, and send data to and receive data from the Servers.
Each Server offers the same JSON-RPC v1.0 protocol to its Clients through named sockets. There is one Server per backend system that keeps the data. Planned backends:
floc-leveldb
: storage resides in LevelDB.

floc-boltdb
: storage resides in BoltDB.
Each Server includes a Client for configuration purposes:
floc-leveldb-admin
: configures a floc-leveldb Server.

floc-boltdb-admin
: configures a floc-boltdb Server.
The files and metadata of a single backup are called an Archive. A group of Archives is called a Vault. Vaults are identified by a string. Archives belong to one Vault and are identified by that Vault and a timestamp.
A backup is performed incrementally, but the Archive stores the full view, effectively being a full backup. Backups can be frequent without sacrificing space because of the deduplication.
If a backup is interrupted then its Archive is flagged as incomplete.
A Catalog is a stream or file of JSON documents that contain file metadata (name, type, ownership, permissions, timestamps, and path on disk) but not file contents or extended attributes.
The Clients that perform the actual backup and restore on any Server are:
floc-read
: reads the file system and generates a Catalog.

floc-upload
: reads a Catalog, partitions the contents and extended attributes of each file into chunks, deduplicates the chunks, and sends only the new chunks to a Server, effectively creating a new Archive in the Server.

floc-vault
: browses the Vaults of a Server.

floc-archive
: browses the Archives of a Server.

floc-download
: receives from a Server a Catalog extended with the ids of the contents and extended attributes of the files.

floc-write
: reads a Catalog, downloads the contents of the files, and writes them to the file system.

floc-copy
: copies Archives between Servers.

floc-prunable
: lists Archives that may be obsolete according to some policy.

floc-catalog
: reads a Catalog and returns a possibly different one after applying filters and transformations to the file metadata.
Data streams are split into chunks of variable size using a simple and fast rolling hash. Chunks are identified and deduplicated by their SHA-256 hash, and are stored along with their 32-bit FNV-1a checksum and Reed-Solomon erasure code metadata.
If a backend allows the removal of an Archive or a Vault then it must support a garbage collection mechanism to free disk storage in a way that chunks are retained only when they are 'reachable' from the remaining Archives. If a backend does not allow removals then a combination of floc-prunable and floc-copy may be used.
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-leveldb
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-leveldb-admin
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-boltdb
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-boltdb-admin
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-read
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-upload
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-vault
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-archive
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-download
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-write
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-copy
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-prunable
go get github.com/daniel-fanjul-alcuten/floc/cmd/floc-catalog