
neutrino+query+rescan: improve rescan speed #236

Closed

Conversation

ellemouton
Member

@ellemouton ellemouton commented Nov 11, 2021

This PR should probably be split into 3 PRs. Doing it in 1 for now just to get some initial feedback and to better show the progression. The three sections are as follows:

  1. The first 2 commits alter the GetBlock and GetCFilter functions to use the work dispatcher for their queries instead of using the old queryPeers function, which is now removed, bringing us one step closer to removing all query logic from the main package.
  2. The 3rd commit isolates the bottleneck of the GetCFilter function, which is persisting filters to the DB. In this commit, that operation is spun off into a goroutine, allowing GetCFilter to return as soon as all the filters are written to the cache (see the sketch after this list).
  3. Finally, the 4th commit ensures that rescan can make use of batch filter fetching by waiting until the header chain is either current or ahead of the specified end height. Before this commit, if rescan was started before the chain was current, the filters were fetched one by one, which is what made things super slow. For example, on local regtest: before this commit it would take 7 seconds to sync 3000 blocks, but with this commit it takes 900ms. (A full testnet rescan from a local testnet bitcoind took about 5 minutes.)
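A minimal sketch of the commit-3 idea, using toy names rather than the actual neutrino API (`filter`, `getCFilterBatch`, and `writeFilterToDB` are all illustrative): return as soon as the cache holds the filters, and push the slow DB writes into a goroutine.

```go
type filter struct{ data []byte }

// writeFilterToDB stands in for the real (slow) database write.
func writeFilterToDB(f *filter) {}

// getCFilterBatch caches each filter and returns immediately; the DB
// writes happen in the background, so the caller (and its mutex) is
// released without waiting on disk.
func getCFilterBatch(filters []*filter, cache map[int]*filter) {
	for i, f := range filters {
		cache[i] = f // readers are served from the cache right away
	}

	go func() {
		for _, f := range filters {
			writeFilterToDB(f)
		}
	}()
}
```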

Use the work dispatcher interface from the query package to make getdata
requests instead of using the old queryPeers function.
Use the work dispatcher query interface instead of the old queryPeers
method for making getcfilter requests. This ensures that the queries are
made to the most responsive peers. With this PR we can also remove the
queryPeers function.
Let the GetCFilter function return, and hence unlock the mutex, as soon
as it is done writing filters to the cache, and then let the writing to
the DB happen in a separate goroutine. This greatly improves the speed
at which filters can be downloaded since the bottleneck in this
operation is writing the filters to the db.
Let the rescan function wait until the filter headers have either caught
up to the back-end chain or until they have caught up to the specified
rescan end block. This lets the rescan operation take advantage of
batch filter fetching, making the operation a lot faster since filters
can be fetched in batches of 1000 instead of one at a time.
case <-s.quit:
return
}

Member Author


One worry with this is that we (in the worst case, syncing from genesis) spin up something like 700 goroutines on mainnet. So another option to rate limit this a bit is to have a single outer goroutine that listens on a channel with a large buffer. Then we rate limit ourselves with that buffered channel.
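A sketch of that alternative (names hypothetical, with `filter`/`writeFilterToDB` as in the earlier sketch): one long-lived goroutine owns all the DB writes, and the buffered channel bounds the backlog; once the buffer fills, producers block instead of spawning more goroutines.

```go
// startPersister launches the single writer goroutine and returns the
// channel that producers send filters on.
func startPersister(quit <-chan struct{}) chan<- *filter {
	persistChan := make(chan *filter, 1000) // buffer size illustrative

	go func() {
		for {
			select {
			case f := <-persistChan:
				writeFilterToDB(f)
			case <-quit:
				return
			}
		}
	}()

	return persistChan
}
```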

Member Author


also gonna look into doing batch writes to the db instead of one by one

Member


Batch writes would be great, and shouldn't be too difficult to add in.

Re the number of goroutines: we can instead make a basic worker pool here. Or we only allow so many of them to be active at one time via a semaphore.
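The semaphore variant could look something like this (a sketch with hypothetical names, reusing the toy `filter`/`writeFilterToDB` from above; the limit of 8 is arbitrary):

```go
// sem caps how many persist goroutines may run at once.
var sem = make(chan struct{}, 8)

func persistAsync(batch []*filter) {
	sem <- struct{}{} // blocks while 8 writers are already running

	go func() {
		defer func() { <-sem }()
		for _, f := range batch {
			writeFilterToDB(f)
		}
	}()
}
```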

return hash == ro.endBlock.Hash
}); err != nil {
return err
}
Member Author


could maybe change this a bit so that if we are lagging by at least 1000 blocks, we start preemptively fetching filters (similar to what is done for filter header syncing)
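Roughly, the idea would be something like this (all names hypothetical; a real version would need to track in-flight batches):

```go
const batchSize = 1000

// maybePrefetch kicks off a batch fetch once the rescan height trails
// the filter-header tip by a full batch.
func maybePrefetch(curHeight, headerTip uint32) {
	if headerTip >= curHeight+batchSize {
		go fetchFilterBatch(curHeight+1, curHeight+batchSize)
	}
}

// fetchFilterBatch stands in for the real batched filter query.
func fetchFilterBatch(start, end uint32) {}
```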


@positiveblue positiveblue left a comment


First time diving into the neutrino repository, so I may be missing something. Better to have an extra couple of eyes check this, but everything made sense to me. A couple of nits:

  • Maybe some of the functions defined inside functions can be moved outside or made struct methods so our routines do not get that long.

  • Maybe we can only create the persistChan and write into it if s.persistToDisk (see the sketch below).

For example, on local regtest: before this commit it would take 7 seconds to sync 3000 blocks but with this commit, it takes 900ms

That's a big improvement 🔥 🔥 🔥 great work!
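The second nit could look something like this (a sketch with toy names, reusing `filter`/`writeFilterToDB` from the earlier sketches): the channel is only created, and the writer only started, when persistence is enabled; send sites guard against the nil channel.

```go
// newPersistChan returns nil when persistence is disabled.
func newPersistChan(persistToDisk bool, quit <-chan struct{}) chan *filter {
	if !persistToDisk {
		return nil // no channel, no writer goroutine
	}

	ch := make(chan *filter, 1000)
	go func() {
		for {
			select {
			case f := <-ch:
				writeFilterToDB(f)
			case <-quit:
				return
			}
		}
	}()

	return ch
}

// At the send site, a nil check keeps us from blocking forever on a
// nil channel:
//
//	if persistChan != nil {
//		persistChan <- f
//	}
```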

log.Warnf("Invalid block for %s "+
"received from %s -- ",
blockHash, peer)
fmt.Println(err)


Maybe it's worth including the error in the log.Warnf.
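For instance (a sketch; the exact wording is up to the author):

```go
log.Warnf("Invalid block for %s received from %s -- %v",
	blockHash, peer, err)
```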

Member Author


oh haha yeah that was left over from debugging 🙈 will fix. thanks!

@ellemouton
Member Author

Thanks for taking a look @positiveblue!!! Will address your comments soon 👍

@lightninglabs-deploy

@ellemouton, remember to re-request review from reviewers for your latest update

@ellemouton
Member Author

!lightninglabs-deploy mute

@ellemouton ellemouton marked this pull request as draft December 8, 2021 07:03
@chappjc
Contributor

chappjc commented Dec 10, 2021

Gave it a test on mainnet and there were no more single cfilter requests, just big chunks of blocks, which is great. Not always 1000, but never just 1.

@ellemouton
Member Author

awesome! Thanks for testing @chappjc!

@losh11

losh11 commented Jan 14, 2023

hey guys, is anyone at lightninglabs looking into this?
Would really appreciate it! Resyncing takes forever and makes the user experience really, really bad.

log.Warnf("Invalid block for %s "+
"received from %s -- ",
blockHash, peer)
fmt.Println(err)
Member


Prior logic would disconnect here, but now we'll continue...

Seems worthy of a future spin-off to propagate a "ban-worthy" error back up to the main scheduler.

@@ -575,11 +575,6 @@ type ChainService struct { // nolint:maligned
FilterCache *lru.Cache
BlockCache *lru.Cache

// queryPeers will be called to send messages to one or more peers,
// expecting a response.
queryPeers func(wire.Message, func(*ServerPeer, wire.Message,
Member


🙌

}
}

// If the request filter type doesn't match the type we were expecting,
Member


A few lines here and below look to run over the line-length limit.


// At this point the filter matches what we know about it, and we
// declare it sane. We send it into a channel to be processed elsewhere.
q.filterChan <- &filterResponse{
Member


Should select on quit both here and below.
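Something like this, reusing the names from the quoted snippet (struct fields elided): make the send abortable on shutdown instead of potentially blocking forever on a full channel.

```go
select {
case q.filterChan <- &filterResponse{
	// ...same fields as the original send...
}:
case <-s.quit:
	return
}
```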

return
for {
select {
case resp, ok = <-q.filterChan:
Member


Hmm, I think it's better not to have the select assignment override the variable above, especially given that ok isn't used in the scope below. So we can use an intermediate variable.
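i.e. something along these lines (a sketch; `handleFilter` is a stand-in for the processing below):

```go
for {
	select {
	case filterResp, ok := <-q.filterChan:
		if !ok {
			return // channel closed, all responses handled
		}
		handleFilter(filterResp)
	case <-s.quit:
		return
	}
}
```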


for {
select {
case resp, ok = <-persistChan:
Member


I think it would be cleaner to send the filter to persist over the channel, instead of relying on the chan close above as a signal to check a local closure variable.
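A sketch of that shape (toy names, with `filter`/`writeFilterToDB` as above): the channel carries the filters themselves, and a close from the sender signals completion, so the receiver needs no captured done flag.

```go
func persistAll(persistChan <-chan *filter, quit <-chan struct{}) {
	for {
		select {
		case f, ok := <-persistChan:
			if !ok {
				return // sender closed: everything was delivered
			}
			writeFilterToDB(f)
		case <-quit:
			return
		}
	}
}
```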

}
// waitFor is a helper closure that can be used to wait on block
// notifications until the given predicate returns true.
waitFor := func(predicate func(hash chainhash.Hash,
Member


Perhaps move this into a new function to cut down on the vertical distance a bit here?

@ellemouton
Member Author

ellemouton commented May 5, 2023

Closing - gonna open 3 follow-up PRs 🤓

replaced by #273, #274 and #275
