(work in progress 🚧) A sql-based database for secure scuttlebutt, written in rust, with bindings to js
This module parses the flume append-only log file and inserts each message into a sql database.
This is conceptually very similar to a flume-view, but it doesn't plug into the rest of flume. @mmckegg wrote a post about his dream of a "consensus free scuttlestack." This module intentionally isn't a flume view because we want clients to be able to take ownership of the indexes they need, without needing consensus about how they modify the .ssb folder.
- Features
- Example
- Schema
- API
- More Example Queries
- Hints For Developing Queries
- Performance
- Development
- Acknowledgments
- See Also
- Code of Conduct
- Easy to use
- models all the common relationships between messages, enabling powerful queries that are generally hard to do
- uses knex for powerful async queries
- Fast + flexible
- Advanced features
- query the view at any time - for when waiting for the view to be in sync isn't relevant
- supports multiple identities (WIP) (multiple keypairs), with decryption done with Rust
- supports multiple clients running different versions of this view, by giving you control over where this view is saved
- Friendly. We have a code of conduct and are committed to abiding by it.
- Found something missing from the docs? Can't understand the code? Found an un-tested edge case? Spotted a poorly named variable? Raise an issue! We'd love to help.
- Well tested
- Well documented
const SqlView = require('ssb-flume-follower-sql')
const ssbKeys = require('ssb-keys')
const Path = require('path')

const config = require('ssb-config') // load ssb config
const keys = ssbKeys.loadOrCreateSync(Path.join(config.path, 'secret')) // load ssb keys

const logPath = Path.join(config.path, 'flume', 'log.offset')
const secret = ssbKeys.ssbSecretKeyToPrivateBoxSecret(keys)

// Constructing a new sqlView doesn't do much by itself. A new sqlite db is
// created if it doesn't exist, but no indexing happens until you call process.
const sqlView = SqlView(logPath, '/tmp/patchwork.sqlite3', secret, keys.id)
// The sql view has the knex instance
var { knex } = sqlView
// Query for the content of the 20 most recent messages.
knex
  .select(['content'])
  .from('messages_raw')
  .join('authors', 'messages_raw.author_id', 'authors.id')
  .orderBy('flume_seq', 'desc')
  .limit(20)
  .asCallback(function (err, result) {
    console.log(result)
  })
// Process data from the offset log into the db as it comes in, but throttled so it's not too cpu hungry. (Assumes you can use `requestIdleCallback`)
window.requestIdleCallback(function processMore (deadline) {
  window.requestIdleCallback(processMore)

  // Use the latest sequence from flumedb to check if there are new entries in the offset log.
  const sbotLatest = api.sbot.obs.latestSequence()()
  if (sqlView.getLatest() === sbotLatest) return

  // Process chunks until we've caught up or used up the available free time.
  while (deadline.timeRemaining() > 0 && sqlView.getLatest() < sbotLatest) {
    sqlView.process({ chunkSize: 250 })
  }
})
See more example queries below
The db provides sql views that do the joins you're likely to need.

The query that generates one such view:
SELECT
  flume_seq,
  key_id,
  seq,
  received_time,
  asserted_time,
  root_id,
  fork_id,
  author_id,
  content,
  content_type,
  is_decrypted,
  keys.key AS key,
  root_keys.key AS root,
  fork_keys.key AS fork,
  authors.author AS author
FROM messages_raw
JOIN keys ON keys.id = messages_raw.key_id
LEFT JOIN keys AS root_keys ON root_keys.id = messages_raw.root_id
LEFT JOIN keys AS fork_keys ON fork_keys.id = messages_raw.fork_id
JOIN authors ON authors.id = messages_raw.author_id
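As a sketch of what such a view buys you (the view name `messages` here is an assumption; check the actual schema of your build), a "20 most recent messages" query against the view needs no joins at all, because the view already joins `keys` and `authors`:

```sql
-- Hypothetical: assumes the query above is exposed as a view named `messages`.
SELECT author, content
FROM messages
ORDER BY flume_seq DESC
LIMIT 20;
```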
var SqlView = require('ssb-flume-follower-sql')
var sqlView = SqlView('/path/to/log.offset', '/path/to/view.sqlite', <pubKey>, <secretKey>)
sqlView.process(opts)

`opts` is mandatory and has one optional field:

- `opts.chunkSize` (optional) - sets the maximum number of items to process. If this is omitted it will process all entries, bringing the view up to date with the log.

Note that processing will block this thread while executing. If you want to limit the resource use of processing, use something like `requestIdleCallback` as in the example. Also be careful not to make `opts.chunkSize` too large. As a starting point, my machine processes 10000 entries in 140ms.
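The time-budgeted chunking described above can be sketched without any ssb machinery. This is a minimal sketch: `fakeView` is a hypothetical stand-in for the real `sqlView` (it only counts entries), and a plain millisecond deadline replaces `requestIdleCallback`'s `deadline` object:

```javascript
// Stand-in for the real view: it just counts how many entries it has "processed".
const TOTAL_ENTRIES = 1000
const fakeView = {
  processed: 0,
  process: function (opts) {
    const chunkSize = (opts && opts.chunkSize) || TOTAL_ENTRIES
    this.processed = Math.min(this.processed + chunkSize, TOTAL_ENTRIES)
  },
  getLatest: function () { return this.processed }
}

// Process in chunks until the view has caught up with `latest`
// or the time budget (in milliseconds) is used up.
function processUntilDeadline (view, latest, budgetMs) {
  const deadline = Date.now() + budgetMs
  while (view.getLatest() < latest && Date.now() < deadline) {
    view.process({ chunkSize: 250 })
  }
}

processUntilDeadline(fakeView, TOTAL_ENTRIES, 10)
console.log(fakeView.getLatest()) // logs 1000 here: four chunks of 250 finish well within the budget
```

Keeping `chunkSize` small means each loop iteration re-checks the deadline often, which is what stops a long catch-up from hogging the thread.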
sqlView.getLatest()

Gets the latest flume sequence value processed by the db.
sqlView.knex

A knex instance ready to do read-only queries on the db.
- TBD if I can get knex writes working. Sqlite theoretically supports multiple db connections.
- TBD if you should mess with the db.
TBD if these are a good idea. They are syntactic sugar for common queries using knex modify. I'll know more once I write a whole lot of queries for patchwork.
The 20 most recent messages posted by me:

db.knex
  .select(['content', 'author'])
  .from('messages_raw')
  .join('authors', 'messages_raw.author_id', 'authors.id')
  .where('authors.is_me', 1)
  .orderBy('flume_seq', 'desc')
  .limit(20)
Paging through the same query with an offset:

db.knex
  .select(['content', 'author'])
  .from('messages_raw')
  .join('authors', 'messages_raw.author_id', 'authors.id')
  .where('authors.is_me', 1)
  .orderBy('flume_seq', 'desc')
  .limit(20)
  .offset(40)
Authors I block:

db.knex
  .select(['author'])
  .from('contacts_raw')
  .join('authors', 'contacts_raw.author_id', 'authors.id')
  .where('authors.is_me', 1)
  .where('state', -1)
Authors I follow:

db.knex
  .select(['author'])
  .from('contacts_raw')
  .join('authors', 'contacts_raw.author_id', 'authors.id')
  .where('authors.is_me', 1)
  .where('state', 1)
Count mentions of me since a given flume sequence:

var last_seq_num = 12345

db.knex('mentions_raw')
  .count('id')
  .join('authors', 'mentions_raw.link_to_author_id', 'authors.id')
  .join('messages_raw', 'mentions_raw.link_from_key_id', 'messages_raw.key_id')
  .where('authors.is_me', 1)
  .where('messages_raw.flume_seq', '>', last_seq_num)
The content of about messages that refer to me:

db.knex
  .select(['content'])
  .from('abouts_raw')
  .join('authors', 'abouts_raw.link_to_author_id', 'authors.id')
  .join('messages_raw', 'abouts_raw.link_from_key_id', 'messages_raw.key_id')
  .where('authors.is_me', 1)
TODO
TODO
TODO
TODO
- Install `sqlite3` on your system.
- In the sqlite command prompt, run `sqlite> .timer on`. The timer lets you see how long your queries are taking. There are often multiple ways to write a query; try them out to see which one is faster. If something is slow, pay attention to which columns are indexed (marked with IDX in the schema diagram).
- Run `sqlite> .headers on` to get column header names at the top of your results.
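A tuning session might look like the transcript below (the query is illustrative). sqlite's `EXPLAIN QUERY PLAN` is also worth knowing: it reports whether a query will use an index or scan the whole table, which is usually the difference between the fast and slow versions of a query:

```
sqlite> .timer on
sqlite> .headers on
sqlite> EXPLAIN QUERY PLAN
   ...> SELECT content FROM messages_raw ORDER BY flume_seq DESC LIMIT 20;
```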
Roughly 10x faster!
Sqlite db rebuild as fast as possible: 40s
Sqlite db rebuild, chunks of 250, running on patchwork's main thread without lagging the ui: 64s.
Flume rebuild of indexes used by patchwork: 404s
NB: It's hard to make an exact comparison here. Expect these numbers to change.
WIP.
From rust (not using knex), querying for all the authors I block on a db of 100k messages takes 54 microseconds.
- My sqlite db is 379MB: roughly 65% of the offset log, and roughly 50% of the existing flume indexes.
- My offset log is 598MB.
- My flumedb indexes are 771MB.
With npm installed, run
$ npm install ssb-flume-follower-sql
Dev Dependencies:
- rust + cargo
- cargo-make
- cmake
$ ./build-native.sh
Dev Dependencies
- same as above
- cross
- dockcross
Cross compiling is still a work in progress. Still todo:
- split out scripts to cross in build-native.sh
- add script to tar up cross compiled binding with correct prebuild compatible filename
- windows builds don't work
- @ahdinosaur for orchestrating the sunrise choir.
- @mmckegg for the infectious enthusiasm.
- @mixmix for awesome feedback on the readme.
- @dominictarr for making it work before making it nice.
- @AljoschaMeyer for making it nice before making it work.
- @noffle for common-readme.
All the lovely people on scuttlebutt who make it the place it is.
LGPL-3.0