[BBPBGLIB-556] Full estimate of memory consumption (#32)
Introduce full memory estimation capabilities for dry run as detailed in
https://bbpteam.epfl.ch/project/issues/browse/BBPBGLIB-556.
When running in dry run mode, neurodamus will now provide a full
estimate of the memory usage for both cells and synapses.
The synapse workflow is mostly untouched from the previous version,
with just a small bug fix.
For the cells estimate we now perform a full calculation with the
following workflow:

1. Find all unique METype combinations and take at most the first
50 elements of each combination.
2. Instantiate these combinations; each set of at most 50 elements is
allocated on a different rank.
3. Get the total memory consumption and average it per cell.
4. Use the above average to extrapolate the total memory usage of any
combination with more than 50 elements.
5. Combine the cells memory total with the synapse one to get the full
estimate.

The MR also introduces exporting the cells memory estimate to a `json`
file. Since this phase is the most time-consuming one, by default the
dry run workflow will automatically export the results for cell memory
usage to a `memory_usage.json` file. It will also try to load this file
on any subsequent run and only perform instantiation for METype
combinations that are not already present in the file.

Furthermore, in dry run we also attempt to estimate the overhead
memory used by every rank, normally needed to load libraries
and data structures. This amount is added to the grand total at
the end of the execution.

---------

Co-authored-by: Fernando Pereira <fernando.pereira@epfl.ch>
st4rl3ss and ferdonline authored Oct 5, 2023
1 parent 0d0bc7c commit 55b958d
Showing 12 changed files with 261 additions and 62 deletions.
4 changes: 2 additions & 2 deletions README.rst
@@ -9,7 +9,7 @@ Neurodamus

Neurodamus is a BBP Simulation Control application for Neuron.

The Python implementation offers a comprehensive Python API for fine tunning of the simulation, initially defined by a BlueConfig file.
The Python implementation offers a comprehensive Python API for fine tuning of the simulation, initially defined by a BlueConfig file.


Description
@@ -81,7 +81,7 @@ An example of a full installation with a simulation run can be found in the work

Docker container
================
Alternaltively, you can start directly a neurodamus docker container where all the packages are built.
Alternatively, you can start directly a neurodamus docker container where all the packages are built.
With the container, you can build your mod files and run simulations.
See instructions in `docker/README.md <https://github.com/BlueBrain/neurodamus/blob/main/docker/README.md>`_.

51 changes: 51 additions & 0 deletions docs/architecture.rst
@@ -317,6 +317,57 @@ Indeed public API represents exactly these 3 cases:
cell_manager.finalize()
conn_manager.create_connections()
Dry Run
-------

A dry run mode was introduced to help users understand how many nodes and tasks are
necessary to run a specific circuit. In the future this mode will also be used to improve
load balancing.

By running a dry run, using the `--dry-run` flag, the user will NOT run an actual simulation but
will get a summary of the estimated memory used for cells and synapses, including the overhead
memory necessary to load libraries and neurodamus data structures.
A grand total is provided to the user, as well as a per-cell-type and per-synapse-type breakdown.

This section describes in more detail how the estimation is done.

Below you can see the workflow of the dry run mode:

.. image:: ./img/neurodamus_dry_run.png

First of all, since the memory usage of cells is strongly connected to their metypes, we create a dictionary
of all the gids corresponding to a certain metype combination. This dictionary is then cross-checked
against the one imported from the external `memory_usage.json` file, which contains the memory usage
of metype combinations coming from a previous dry run execution on this or any other circuit.
As long as the `memory_usage.json` file is present in the working directory, it will be loaded.

If a metype combination is not present in the external file, we compute its memory usage
by instantiating a group of at most 50 cells for that combination and then
measuring memory usage before and after the instantiation. The memory usage is then averaged over
the number of cells instantiated, and the results are saved internally and added to the external
`memory_usage.json` file. Any combination already present in the external file is simply imported
and not instantiated again, in order to speed up the execution. One can simply delete the `memory_usage.json`
file (or any relevant lines) in order to force the re-evaluation of all (or some) metype
combinations.
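
A minimal sketch of this import-and-filter step, assuming — purely for illustration — that the json keys are ``etype,mtype`` strings; the real file layout may differ:

```python
import json
import os

def load_known_usage(path="memory_usage.json"):
    # Load per-metype memory figures from a previous dry run, if present
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        raw = json.load(f)
    # Assumed key format: "etype,mtype" -> average MB per cell
    return {tuple(key.split(",")): mb for key, mb in raw.items()}

def combos_to_instantiate(all_combos, known):
    # Only combinations missing from the imported file get re-evaluated
    return [combo for combo in all_combos if combo not in known]
```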

The memory usage of synapses is instead estimated using a pre-computed lookup table, which is
hardcoded in the `SynapseMemoryUsage` class. The values in this lookup table were computed by using an external script
to instantiate 1M synapses of each type, each with 1K connections, and then measuring the memory
usage before and after the instantiation. The memory usage is then averaged over the number of
synapses instantiated. The script used to perform this operation, `synstat.py`, is available to the user
and is archived in the `_benchmarks` folder of this repository.

Having these pre-computed values allows us to simply count the number of synapses of each type
and multiply it by the corresponding memory usage value.
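
In code, the count-and-multiply step reduces to a dictionary lookup. The per-synapse figures below are made-up placeholders, not the values actually hardcoded in `SynapseMemoryUsage`:

```python
# Hypothetical per-synapse memory cost in KB, standing in for the
# pre-computed lookup table in the SynapseMemoryUsage class.
SYNAPSE_MEMORY_KB = {"ProbAMPANMDA": 1.7, "ProbGABAAB": 2.0}

def estimate_synapse_memory_kb(counts_per_type):
    # Multiply the synapse count of each type by its per-synapse cost
    return sum(SYNAPSE_MEMORY_KB[stype] * n for stype, n in counts_per_type.items())
```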

Apart from cells and synapses, we also need to take into account the memory usage of neurodamus
itself, e.g. data structures, loaded libraries and so on. This is done by measuring the RSS of the neurodamus
process before any of the actual instantiation is done. This value is averaged over all ranks that take
part in the execution and then multiplied by the number of ranks used in the execution.
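
As a sketch, the overhead contribution is just the mean baseline RSS scaled by the rank count (an illustrative helper, not the actual neurodamus code):

```python
def estimate_overhead_mb(baseline_rss_mb_per_rank, n_ranks):
    # Average the RSS measured on each rank before instantiation,
    # then scale by the total number of ranks in the execution
    average = sum(baseline_rss_mb_per_rank) / len(baseline_rss_mb_per_rank)
    return average * n_ranks
```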

The final result is then printed to the user in a human-readable format.


Development
------------

25 changes: 16 additions & 9 deletions docs/examples.rst
@@ -87,21 +87,28 @@ In order to obtain a more accurate estimation of the resources needed for a simu
users can also run Neurodamus in dry run mode. This functionality is only available
for libsonata circuits. MVD3 circuits are not supported.

This mode will instantiate all the cells but won't run the actual simulation.
The user can then check the memory usage of the simulation as it's printed on
the terminal and decide how to proceed.

The mode also provides detailed information on the memory usage of each cell type
and the total memory usage of the simulation.
This mode will partially instantiate cells and synapses to get a statistical overview
of the memory used, but won't run the actual simulation.
The user can then check the estimated memory usage of the simulation, which is printed on
the terminal at the end of the execution. In a future update we will also integrate
indications and suggestions on the number of tasks and nodes to use for that circuit,
based on the amount of memory used during the dry run.

The mode also provides detailed information on the memory usage of each cell metype and
synapse type, as well as the total estimated memory usage of the simulation, including the
memory overhead dictated by the loading of libraries and data structures.

The information on cell memory usage is also automatically saved in a file called
``memory_usage.json`` in the working directory. This json file contains a
dictionary with the memory usage of each cell metype in the circuit and is automatically
loaded in any further execution of Neurodamus in dry run mode, in order to speed up the execution.
In the future we plan to also use this file to improve the load balance of actual simulations.

To run Neurodamus in dry run mode, the user can use the ``--dry-run`` flag when launching
Neurodamus. For example:

``neurodamus --configFile=BlueConfig --dry-run``

At the moment dry run mode only supports memory estimation for cell instantiation. Evaluation
of other resources (e.g. connections) will be added in the future.


Neurodamus for Developers
-------------------------
Binary file added docs/img/neurodamus_dry_run.png
59 changes: 41 additions & 18 deletions neurodamus/cell_distributor.py
@@ -142,6 +142,7 @@ def __init__(self, circuit_conf, target_manager, _run_conf=None, **_kw):
self._binfo = None
self._pc = Nd.pc
self._conn_managers_per_src_pop = weakref.WeakValueDictionary()
self._metype_counts = None

if type(circuit_conf.CircuitPath) is str:
self._init_config(circuit_conf, self._target_spec.population or '')
@@ -163,6 +164,7 @@ def __init__(self, circuit_conf, target_manager, _run_conf=None, **_kw):
is_default = property(lambda self: self._circuit_name is None)
is_virtual = property(lambda self: False)
connection_managers = property(lambda self: self._conn_managers_per_src_pop)
metype_counts = property(lambda self: self._metype_counts)

def is_initialized(self):
return self._local_nodes is not None
@@ -219,6 +221,8 @@ def load_nodes(self, load_balancer=None, *, _loader=None, loader_opts=None):
else:
gidvec, me_infos, *cell_counts = self._load_nodes_balance(loader_f, load_balancer)
self._local_nodes.add_gids(gidvec, me_infos)
if SimConfig.dry_run:
self._metype_counts = me_infos.counts
self._total_cells = cell_counts[0]
logging.info(" => Loaded info about %d target cells (out of %d)", *cell_counts)

@@ -253,7 +257,7 @@ def _load_nodes_balance(self, loader_f, load_balancer):
return gidvec, me_infos, total_cells, full_size

# -
def finalize(self, *_):
def finalize(self, imported_memory_dict=None, *_):
"""Instantiates cells and initializes the network in the simulator.
Note: it should be called after all cell distributors have done load_nodes()
@@ -262,11 +266,13 @@ def finalize(self, *_):
if self._local_nodes is None:
return
logging.info("Finalizing cells... Gid offset: %d", self._local_nodes.offset)
self._instantiate_cells()
memory_dict = self._instantiate_cells(imported_memory_dict)
self._update_targets_local_gids()
self._init_cell_network()
self._local_nodes.clear_cell_info()

return memory_dict

@mpi_no_errors
def _instantiate_cells(self, _CellType=None):
CellType = _CellType or self.CellType
@@ -286,52 +292,69 @@ def _instantiate_cells(self, _CellType=None):
self._store_cell(gid + cell_offset, cell)

@mpi_no_errors
def _instantiate_cells_dry(self, _CellType=None):
def _instantiate_cells_dry(self, _CellType=None, imported_memory_dict=None):
CellType = _CellType or self.CellType
assert CellType is not None, "Undefined CellType in Manager"
Nd.execute("xopen_broadcast_ = 0")

logging.info(" > Dry run on cells... (%d in Rank 0)", len(self._local_nodes))
logging.info("Memory usage for metype combinations:")
logging.info("Memory usage for newly instantiated metype combinations:")
cell_offset = self._local_nodes.offset

gid_info_items = self._local_nodes.items()

prev_emodel = None
prev_etype = None
prev_mtype = None
start_memory = get_mem_usage()
n_cells = 0
memory_dict = {}

for gid, cell_info in gid_info_items:
filtered_gid_info_items = self._filter_memory_dict(imported_memory_dict, gid_info_items)

for gid, cell_info in filtered_gid_info_items:
diff_mtype = prev_mtype != cell_info.mtype
diff_emodel = prev_emodel != cell_info.emodel
first = prev_emodel is None and prev_mtype is None
if (diff_mtype or diff_emodel) and not first:
diff_etype = prev_etype != cell_info.etype
first = prev_etype is None and prev_mtype is None
if (diff_mtype or diff_etype) and not first:
end_memory = get_mem_usage()
memory_allocated = end_memory - start_memory
log_all(logging.INFO, " * %s %s: %f MB averaged over %d cells",
prev_emodel, prev_mtype, memory_allocated/n_cells, n_cells)
memory_dict[(prev_emodel, prev_mtype)] = memory_allocated/n_cells
log_all(logging.INFO, " * %s %s: %.2f MB averaged over %d cells",
prev_etype, prev_mtype, memory_allocated/n_cells, n_cells)
memory_dict[(prev_etype, prev_mtype)] = memory_allocated/n_cells
start_memory = end_memory
n_cells = 0

cell = CellType(gid, cell_info, self._circuit_conf)
self._store_cell(gid + cell_offset, cell)

prev_emodel = cell_info.emodel
prev_etype = cell_info.etype
prev_mtype = cell_info.mtype
n_cells += 1

if prev_emodel is not None and prev_mtype is not None:
if prev_etype is not None and prev_mtype is not None:
end_memory = get_mem_usage()
memory_allocated = end_memory - start_memory
log_all(logging.INFO, " * %s %s: %f MB averaged over %d cells",
prev_emodel, prev_mtype, memory_allocated/n_cells, n_cells)
memory_dict[(prev_emodel, prev_mtype)] = memory_allocated/n_cells
prev_etype, prev_mtype, memory_allocated/n_cells, n_cells)
memory_dict[(prev_etype, prev_mtype)] = memory_allocated/n_cells

if imported_memory_dict is not None:
memory_dict.update(imported_memory_dict)

return memory_dict

def _filter_memory_dict(self, imported_memory_dict, gid_info_items):
if imported_memory_dict is not None:
filtered_gid_info_items = (
(gid, cell_info)
for gid, cell_info in gid_info_items
if (cell_info.etype, cell_info.mtype) not in imported_memory_dict
)
else:
filtered_gid_info_items = gid_info_items

return filtered_gid_info_items

def _update_targets_local_gids(self):
logging.info(" > Updating targets")
cell_offset = self._local_nodes.offset
@@ -559,7 +582,7 @@ def load_nodes(self, load_balancer=None, **kw):
log_verbose("Nodes Format: %s, Loader: %s", self._node_format, loader.__name__)
return super().load_nodes(load_balancer, _loader=loader, loader_opts=loader_opts)

def _instantiate_cells(self, *_):
def _instantiate_cells(self, imported_memory_dict, *_):
if self.CellType is not NotImplemented:
return super()._instantiate_cells(self.CellType)
conf = self._circuit_conf
@@ -570,7 +593,7 @@ def _instantiate_cells(self, *_):
log_verbose("Loading '%s' morphologies from: %s",
CellType.morpho_extension, conf.MorphologyPath)
if SimConfig.dry_run:
super()._instantiate_cells_dry(CellType)
return super()._instantiate_cells_dry(CellType, imported_memory_dict)
else:
super()._instantiate_cells(CellType)

2 changes: 1 addition & 1 deletion neurodamus/core/nodeset.py
@@ -313,7 +313,7 @@ def intersection(self, other: _NodeSetBase, raw_gids=False, _quick_check=False):
# Like that we could still keep ranges internally and have PROPER API to get raw ids
return numpy.add(intersect, 1, dtype=intersect.dtype)
return numpy.add(intersect, self.offset + 1, dtype=intersect.dtype)
return []
return numpy.array([], dtype="uint32")

def intersects(self, other):
return self.intersection(other, _quick_check=True)
33 changes: 23 additions & 10 deletions neurodamus/io/cell_readers.py
@@ -248,6 +248,8 @@ def fetch_MEinfo(node_reader, gidvec, combo_file, meinfo):
mtypes = node_reader.mtypes(indexes)
emodels = node_reader.emodels(indexes) \
if combo_file else None # Rare but we may not need emodels (ngv)
etypes = node_reader.etypes(indexes) \
if combo_file else None
exc_mini_freqs = node_reader.exc_mini_frequencies(indexes) \
if node_reader.hasMiniFrequencies() else None
inh_mini_freqs = node_reader.inh_mini_frequencies(indexes) \
@@ -259,8 +261,8 @@
positions = node_reader.positions(indexes)
rotations = node_reader.rotations(indexes) if node_reader.rotated else None

meinfo.load_infoNP(gidvec, morpho_names, emodels, mtypes, threshold_currents, holding_currents,
exc_mini_freqs, inh_mini_freqs, positions, rotations)
meinfo.load_infoNP(gidvec, morpho_names, emodels, mtypes, etypes, threshold_currents,
holding_currents, exc_mini_freqs, inh_mini_freqs, positions, rotations)


def load_sonata(circuit_conf, all_gids, stride=1, stride_offset=0, *,
@@ -279,7 +281,7 @@ def load_nodes_base_info():
total_cells = node_pop.size
if SimConfig.dry_run:
logging.info("Sonata dry run mode: looking for unique metype instances")
gid_metype_bundle = _retrieve_unique_metypes(node_pop, all_gids)
gid_metype_bundle, count_per_metype = _retrieve_unique_metypes(node_pop, all_gids)
gidvec = dry_run_distribution(gid_metype_bundle, stride, stride_offset, total_cells)
else:
gidvec = split_round_robin(all_gids, stride, stride_offset, total_cells)
@@ -289,8 +291,13 @@
node_sel = libsonata.Selection(gidvec - 1) # 0-based node indices
morpho_names = node_pop.get_attribute("morphology", node_sel)
mtypes = node_pop.get_attribute("mtype", node_sel)
emodels = [emodel.removeprefix("hoc:")
for emodel in node_pop.get_attribute("model_template", node_sel)]
try:
etypes = node_pop.get_attribute("etype", node_sel)
except libsonata.SonataError:
logging.warning("etype not found in node population, setting to None")
etypes = None
_model_templates = node_pop.get_attribute("model_template", node_sel)
emodel_templates = [emodel.removeprefix("hoc:") for emodel in _model_templates]
if set(["exc_mini_frequency", "inh_mini_frequency"]).issubset(attr_names):
exc_mini_freqs = node_pop.get_attribute("exc_mini_frequency", node_sel)
inh_mini_freqs = node_pop.get_attribute("inh_mini_frequency", node_sel)
Expand All @@ -309,13 +316,17 @@ def load_nodes_base_info():
rotations = _get_rotations(node_pop, node_sel)

# For Sonata and new emodel hoc template, we need additional attributes for building metype
# TODO: validate it's really the emodel_templates var we should pass here, or etype
add_params_list = None if not has_extra_data \
else _getNeededAttributes(node_pop, circuit_conf.METypePath, emodels, gidvec-1)
else _getNeededAttributes(node_pop, circuit_conf.METypePath, emodel_templates, gidvec-1)

meinfos = METypeManager()
meinfos.load_infoNP(gidvec, morpho_names, emodels, mtypes, threshold_currents,
holding_currents, exc_mini_freqs, inh_mini_freqs, positions,
rotations, add_params_list)
meinfos.load_infoNP(gidvec, morpho_names, emodel_templates, mtypes, etypes,
threshold_currents, holding_currents,
exc_mini_freqs, inh_mini_freqs, positions, rotations,
add_params_list)
if SimConfig.dry_run:
meinfos.counts = count_per_metype
return gidvec, meinfos, total_cells

# If dynamic properties are not specified simply return early
@@ -480,8 +491,10 @@ def _retrieve_unique_metypes(node_reader, all_gids) -> dict:
raise Exception(f"Reader type {type(node_reader)} incompatible with dry run.")

unique_metypes = defaultdict(list)
count_per_metype = defaultdict(int)
for gid, emodel, mtype in zip(gidvec, emodels, mtypes):
unique_metypes[(emodel, mtype)].append(gid)
count_per_metype[(emodel, mtype)] += 1

logging.info("Out of %d cells, found %d unique mtype+emodel combination",
len(gidvec), len(unique_metypes))
@@ -498,4 +511,4 @@ def _retrieve_unique_metypes(node_reader, all_gids) -> dict:
else:
gid_metype_bundle.append(unique_metypes[key])

return gid_metype_bundle
return gid_metype_bundle, count_per_metype
