Exporting simulation data
valery edited this page Mar 21, 2023 · 2 revisions
Once you have run a simulation, you will have a folder containing all the simulation data (e.g. Experiment1). To export this data to a tabular format, you will need to follow the steps outlined below:
- Reorganize the file structure: Use the script /ABM/abm/data/metaprotocol/experiments/scripts/organize_distributed_experiment.py to reorganize the file structure. Replace the path in the script with the path to the outer Experiment1 folder that contains all the individual hashed folders. This will result in a folder structure with folders named batch_XX inside.
- Load the simulation data: Use ExperimentLoader to load the simulation data.
- Select the required data: Select the necessary data from the agent_summary dictionary. The data shape will be (batch, simulation parameter 1 ... simulation parameter n, agent number, timestep).
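To make the axis order concrete, here is a minimal sketch of selecting one condition from an array with that layout. It uses a synthetic NumPy array as a stand-in for an agent_summary entry, and the helper name select_condition is hypothetical (it is not part of the abm package). It assumes the leading axes are the batch followed by one axis per varying parameter, as described above.

```python
import numpy as np


def select_condition(summary, batch, param_indices):
    """Return the (agent, timestep) slice for one batch and one parameter combination.

    Hypothetical helper for illustration; not part of the abm package.
    """
    summary = np.asarray(summary)
    # Leading axes: batch first, then one axis per varying simulation parameter.
    index = (batch, *param_indices)
    if len(index) != summary.ndim - 2:
        raise ValueError("expected one batch index plus one index per parameter axis")
    # The two trailing axes (agent number, timestep) are left untouched.
    return summary[index]


# Synthetic stand-in: 2 batches, 3 values of parameter 1, 4 values of parameter 2,
# 5 agents, 10 timesteps.
pos_x = np.zeros((2, 3, 4, 5, 10))
trajectories = select_condition(pos_x, batch=1, param_indices=(2, 0))
print(trajectories.shape)  # (5, 10)
```

With two varying parameters this is equivalent to pos_x[batch, i, j], which is the indexing used in the export script.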
Here's an example script you can use to export the data to a tabular format:
import os
import uuid
from pathlib import Path

import numpy as np
import pandas as pd

from abm.loader.data_loader import ExperimentLoader

# path to the folder with all the simulation data
data_folder = ".../Experiment1"
simulation_params = ['PARAMETER_1', 'PARAMETER_2']

experiment = ExperimentLoader(data_folder)

# data shape is (batch, simulation parameter 1 ... simulation parameter n, agent number, timestep)
pos_x = experiment.agent_summary['posx']
pos_y = experiment.agent_summary['posy']
num_batches = experiment.num_batches

# convert zarr to the tabular format with columns: time, id, x, y
for i, s in enumerate(experiment.varying_params[simulation_params[0]]):
    for j, v in enumerate(experiment.varying_params[simulation_params[1]]):
        folder_name = f"param1_{s}_param2_{v}"
        # create a folder with the condition name (parameter values) using pathlib
        condition_folder = Path(data_folder) / folder_name
        condition_folder.mkdir(parents=True, exist_ok=True)
        time = np.arange(0, experiment.chunksize)

        for batch in range(num_batches):
            data = []
            ids = []
            for a in range(pos_x.shape[-2]):
                # create a unique ID for each agent (used only in the output file name)
                agent_id = uuid.uuid4().hex
                data.append(pd.DataFrame({
                    'time': time,
                    'id': [a + 1] * len(time),
                    'x': pos_x[batch, i, j, a, :],
                    'y': pos_y[batch, i, j, a, :],
                }))
                ids.append(agent_id)
            data = pd.concat(data)
            # sort by time and agent ID
            data = data.sort_values(by=['time', 'id'], ignore_index=True)
            data.to_csv(os.path.join(condition_folder, f"{'_'.join(ids)}.csv"), index=False)
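
To check an exported file, a small reader like the one below may be useful. This is only a sketch: the function name load_trajectories is hypothetical, and it assumes the CSV has the columns time, id, x and y written by the script above, with integer time and id values.

```python
import csv
from collections import defaultdict


def load_trajectories(csv_path):
    """Group the rows of one exported CSV by agent.

    Returns a dict mapping agent id -> list of (time, x, y) tuples ordered by time.
    Hypothetical helper for illustration; assumes columns time, id, x, y.
    """
    trajectories = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            trajectories[int(row["id"])].append(
                (int(row["time"]), float(row["x"]), float(row["y"]))
            )
    # The export sorts rows by time, but sort again so the result does not rely on that.
    for points in trajectories.values():
        points.sort(key=lambda p: p[0])
    return dict(trajectories)
```

Calling load_trajectories on one of the files in a condition folder should return one entry per agent, each with as many points as there are timesteps.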