This PR includes two memory reduction changes:

1. Reduce the memory footprint of `collect_peer_intervals`.
2. Reduce the peak memory footprint during node initialization.

This PR also adds a new `process_memory` metric which samples the memory footprint of all active Erlang processes every 5 seconds. The relatively slow sample rate for this metric and the related `process_functions` metric keeps the performance impact of the sampling negligible.

## collect_peer_intervals

Prior to this change the `collect_peer_intervals` process could reserve roughly 500MB of memory per storage_module. This change limits the footprint to about 1MB.

### Root cause

`collect_peer_intervals` queries the syncing intervals advertised by a basket of peers, intersects those intervals with the ranges sought by the node, and then queues up "work" in the form of specific chunks to be queried from each peer. Prior to this change `collect_peer_intervals` would try to queue all the chunks to be queried for a complete storage_module. In an extreme case - with 3.6TB of unsynced data - this resulted in about 15,000,000 chunks worth of tasks being queued at once. The `gb_sets` data structure holding this queue could balloon to 500MB.

### Fix

Now `collect_peer_intervals` only collects one sync bucket worth of work at a time. This means only 10GB of chunks (not 3.6TB) are queued at any one time.

Considerations when managing the queue length:

1. Periodically `sync_intervals` will pull from the queue and send work to `ar_data_sync_worker_master`. We need to make sure the queue is long enough that we never starve `ar_data_sync_worker_master` of work.
2. On the flip side, we don't want the queue to get so long that it triggers an out-of-memory condition. In the extreme case we could collect and enqueue all the chunks in a full 3.6TB storage_module. A queue of that length would have a roughly 500MB memory footprint per storage_module. For a node that is syncing multiple storage modules, this adds up fast.
3. We also want to use the most up-to-date information we can. Every time we add a task to the queue we lock in a specific view of peer data availability. If that peer goes offline before we get to the task, the result is wasted work or a syncing stall. A shorter queue helps ensure we are always syncing from the "best" peers at any point in time.

With all that in mind, we pause collection once the queue holds roughly a sync bucket worth of chunks (sketched below). This threshold is slightly arbitrary and we should feel free to adjust it as necessary.

Considerations when pausing:

1. When we pause and resume, we keep track of the last range collected so we can pick up where we left off.
2. While we work through the storage_module range, we maintain a cache mapping peers to their advertised intervals. This ensures we don't query each peer's /data_sync_record endpoint too often. Once we've made a full pass through the range, we clear the cache so that the next pass picks up any changes to the peers' intervals.
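To make the pause/resume mechanics concrete, here is a minimal sketch of the bounded-queue loop. The module name, `?QUEUE_LIMIT` arithmetic, state fields, and the `collect_bucket/1` helper are illustrative stand-ins, not the actual implementation:

```erlang
%% Illustrative sketch of pausing collection once the task queue holds
%% roughly one sync bucket (~10GB) worth of chunk tasks.
-module(collect_sketch).
-export([maybe_collect_more/1]).

-define(CHUNK_SIZE, 262144).                             %% 256 KiB
-define(SYNC_BUCKET_SIZE, 10737418240).                  %% 10 GiB
-define(QUEUE_LIMIT, ?SYNC_BUCKET_SIZE div ?CHUNK_SIZE). %% ~40,960 tasks

-record(state, { task_queue = gb_sets:new(), last_offset = 0 }).

maybe_collect_more(#state{ task_queue = Q, last_offset = Offset } = State) ->
	case gb_sets:size(Q) >= ?QUEUE_LIMIT of
		true ->
			%% Enough work is queued; pause, and check again once
			%% sync_intervals has drained some of it.
			erlang:send_after(1000, self(), collect_peer_intervals),
			State;
		false ->
			%% Queue one more sync bucket of chunk tasks, resuming
			%% from where the previous pass left off.
			Tasks = collect_bucket(Offset),
			Q2 = gb_sets:union(Q, gb_sets:from_list(Tasks)),
			State#state{ task_queue = Q2,
					last_offset = Offset + ?SYNC_BUCKET_SIZE }
	end.

%% Stub standing in for the real peer-interval intersection logic.
collect_bucket(_Offset) ->
	[].
```

The memory bound falls directly out of the queue length: roughly 41,000 queued tasks per storage_module instead of up to 15,000,000.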
## Peak memory during initialization

Previously `ar_node_worker` used `ar_events:send` to cast the Blocks list. The Blocks list can get big (~175MB), and any process that subscribed to `node_state` would receive the list in its mailbox - even if it did not handle the message. Having dozens of processes all loading 175MB at the same time could sometimes exceed the node's resident memory capacity and trigger an OOM.

The fix is to have `ar_node_worker` cast an empty `initializing` message; any process that cares about the Blocks list can then load it from `ets` (sketched below).
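A minimal sketch of the pattern, assuming subscribers receive events as `{event, Channel, Payload}` messages; the ets table name, key, and helper are illustrative, not the actual identifiers:

```erlang
%% Illustrative sketch: the sender casts a payload-free event, and only
%% the subscribers that actually need the Blocks list pay to load it.
-module(init_event_sketch).
-export([announce_initializing/0, handle_info/2]).

%% Before, something like ar_events:send(node_state, {initializing, Blocks})
%% copied the ~175MB list into every subscriber's mailbox (message shape
%% illustrative). After, the event carries no payload.
announce_initializing() ->
	ar_events:send(node_state, initializing).

%% A subscriber that cares about the Blocks list reads it on demand.
handle_info({event, node_state, initializing}, State) ->
	[{blocks, Blocks}] = ets:lookup(node_state, blocks), %% illustrative key
	{noreply, initialize_from_blocks(Blocks, State)};
handle_info(_Message, State) ->
	{noreply, State}.

initialize_from_blocks(_Blocks, State) ->
	State. %% stub
```

Reading from ets still copies the term onto the caller's heap, but only the processes that actually need the list pay that cost, and nothing sits unread in dozens of mailboxes at once.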
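For completeness, the new `process_memory` metric boils down to walking `erlang:processes()` on a timer with stock `erlang:process_info/2`. A rough sketch; the `prometheus_gauge` wiring and label scheme here are assumptions, not a copy of the actual code:

```erlang
%% Illustrative sketch of a 5-second process-memory sampler.
-module(process_memory_sketch).
-export([start/0, loop/0]).

start() ->
	spawn(fun loop/0).

loop() ->
	lists:foreach(
		fun(Pid) ->
			case erlang:process_info(Pid, [memory, registered_name]) of
				undefined ->
					%% The process exited while we were iterating.
					ok;
				Info ->
					{memory, Memory} = lists:keyfind(memory, 1, Info),
					{registered_name, Name} =
						lists:keyfind(registered_name, 1, Info),
					%% Assumes a gauge declared elsewhere with a single
					%% label for the process name.
					prometheus_gauge:set(process_memory, [label(Name)], Memory)
			end
		end, erlang:processes()),
	timer:sleep(5000),
	loop().

%% Unregistered processes get a catch-all label.
label([]) -> unregistered;
label(Name) -> Name.
```

Sampling every 5 seconds rather than continuously is what keeps the overhead negligible even on nodes running many thousands of processes.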