-
Notifications
You must be signed in to change notification settings - Fork 939
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enhance AgentSet.do to accept a callable #2210
Conversation
picks up on an idea from projectmesa#1944, see projectmesa#1944 (comment)
d89bd77
to
a617d04
Compare
for more information, see https://pre-commit.ci
Performance benchmarks:
|
Sounds useful! Since the “agent” and “agentset” applies are two fully separate code paths, I’m thinking about making it two separate methods. Something |
for more information, see https://pre-commit.ci
Not sure about this. The current design is analogous to pandas DataFrame.apply. This is a familiar API making it easier for users new to MESA and easier for more experienced users to remember. |
A few thoughts:
I think it helps if we create a list of possible use cases and write some possible API examples for them. Would be nice if this could take some of the heavy lifting of the datacollector (and we have to be careful not to do things duplicate). Edit: One more:
|
I agree and this is easily done by changing the default to be
I am inclined to not do this. It overlaps with AgentSet.do (as you anticipate) and it is also not allowed by DataFrame.apply
The main use case for AgentSet.apply seems to be gathering data over the agentset. An example would be calculating gini. Of course, the question then becomes whether this is redundant with
How would that differ from agentset.apply?
yes. I am also not sure whether axis is the best name for the keyword argument, but I needed to start somewhere. |
creating agentsets is expensive, so this makes it possible to avoid creating them
for more information, see https://pre-commit.ci
It's a bit more narrowly scoped I imagined, where apply is very flexible aggregate is more focussed. total_wealth = agents.agg("Wealth", "sum")
mean_energy = agents.agg("Energy", np.mean)
Of course, Cunningham's Law in full force! |
Ok, but that would give us 2 ways to achieve roughly the same thing: np.sum(agentset.get("wealth"))
agentset.agg("wealth", np.sum) Note that doing this through |
I'm going to try to compile a large list of use cases tomorrow, so we can test our ideas against a test set. |
Given that the |
I'm not against that idea, but it will be a huge overhaul and I’m not sure we have the maintenance capacity to facilitate that. Maybe look more serious into Rust for Mesa 4.0. |
I had said on Matrix chat, but will point out again here there is a critical performance issue: the time elapsed of the Boltzmann wealth model steps is quadratic in the number of agents: projectmesa/mesa-frames#25. mesa-frames addresses this by caching the |
I concur with @EwoutH. I don't think we presently have the capacity to reimplement AgentSet. I also strongly doubt it is desirable to implement it on top of some data frame style data structure because it will break the object-oriented nature of the Agent class itself. I would very much, at some future point, like to port the core of MESA to rust for performance reasons as discussed before. But in the meantime, why not flesh out the API first? Also, looking at @rht's graph here, I don't see quadratic scaling of MESA itself. Am i missing something? |
But it turns out that AgentSet has quite a number of DF-like operations. Polars is already based on Rust, and can be optimized further by using its native expression.
This is the current Mesa's performance: . The graph mentioned is for when I replaced the AgentSet with just a list of agents. |
Yes, the API has various operations inspired by the pandas API. From this, however,it does not follow that the best data structure for implementation is a data frame. Again, I contend it breaks OO of the underlying Agent class, and we don't have the capacity at present anyway. I would prefer to keep this PR focussed on adding functionality to the current AgentSet class rather than having it turn into a discussion about reimplementing the entire class itself. Happy to have this conversation, but preferably as a discussion/ideas topic.
Ok, so there is still a remaining performance overhead of the agent set class that should be addressed. Which version o the wealth model was used for this graph? |
I'd like to add my 2 cents on mesa's performance. TLDR: If Python remains the target language, I'm not sure if reimplementing the backend in Rust makes sense. For performance improvements, it might be better to focus on mesa-frames for now and potentially consider rewriting mesa in Mojo when it becomes available. In the meantime, adding DataFrame-like operations to AgentSet is a good idea, as it will facilitate compatibility with mesa-frames and possibly enable "linting" the model to vectorize it if operations are DataFrame-like.
I used the Boltzmann wealth model from the old mesa-examples: projectmesa/mesa-frames@a87adda. |
I completely agree with this. However, from this, it does not follow that we should not try to make the core of MESA as fast as possible while maintaining the current ease of use. That is, the bottleneck in terms of performance should not be in MESA itself. I don't see MESA as a workhorse library for heavy number-crunching ABMs. Rather, I see its niche as being an alternative to NetLogo for training students while being able to build small to medium-sized ABMs that are quick to develop and can be run within a reasonable amount of time. |
mesa/agent.py
Outdated
# TODO:: this is a good idea, but tricky because you don't know all column names | ||
return [func(agent, *args, **kwargs) for agent in self] | ||
elif axis == "agentset": | ||
return func(self, *args, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EwoutH is there a way to turn off the codecov warning comments? It's harder to review this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add require_changes: true
to the comment
part of the YML. I think that might help? Or is that only about the main PR comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
require_changes: true
seems to be better, so that the comments don't show up in everyone's browsers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it turns out that whenever I refresh the browser tab, the annotation is back on. So require_changes: true
is the permanent solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mesa/agent.py
Outdated
else: | ||
raise ValueError(f"axis should be `agent` or `agentset` not {axis}") | ||
|
||
def group_by( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the naming choice of Polars as well.
I agree with the concept and API of this PR. No strong opinion on the Regarding with performance, at the very least, the model step elapsed time shouldn't be quadratic in the number of agents. This fix can happen on the Python layer, without Polars gimmick. For speeding up Mesa further with Rust/Cython, I disagree with @quaquel and @Corvince (they said something similar, but I couldn't find the comment). Mojo is a good example that it is possible to have something that is both simple to write and fast. A reminder that rewriting Mesa backend in Rust would allow it to be cross-language. If one of the concerns is not enough time to develop the features, then we can at least apply for NumFOCUS grant for funding (@adamamer20 has shown that he could write production code in his spare time lol, but maybe you prefer to do AI stuff instead, who knows). We should continue the discussion at #2042 (Rust) or #1610 (Cython). |
I am investigating this issue as we speak and will open a separate PR/issue once I complete my diagnosis. As indicated by @Tortar, the problem is the copying done in the scheduler. That is,
Yes, we discussed this before. I, however, don't think we really disagree. User-written pure Python for the present time will remain a bottleneck for the performance of MESA models. Of course, if the user moves part of their model code to, say, Cython or Rust or C or whatever, their models will run faster. The key is, and I think here we agree, to make the core of MESA not the bottleneck. |
Some use cases:
I think the discussion from #348 (comment) already describes most cases, together with #1944 of course. The biggest question is: What do we want to handle in the AgentSet and what in the DataCollector? |
So we have a few capabilities needed for the datacollector:
|
For some related thoughts on the DataCollector, see #348 (comment) |
Okay so what if we make that whole thing a AgentSet function? That return a (tuple of) attributes and a (tuple of) aggegates. Then you can either save those or chain further. |
for more information, see https://pre-commit.ci
@EwoutH and I had a good discussion about this PR. In short, Also, I'll expand the # in e.g., the eppstein model
self.model.agents.groupby("condition").apply(len) This will return a dictionary with the number of agents for each condition (e.g., arrested, quiet, protesting). # another API example, to be fine tuned, basically this is random activation by type
groups = self.model.agents.groupby(type, as_agentset=True)
for group_name, group in groups.items() :
group.shuffle().do("step") I'll separate the update to |
Thanks! Can you split of the (I can also do all of this stuff, if preferred) |
I have split them so this is closed |
This PR enhances AgentSet.do to take a callable or str. Currently,
AgentSet.do
takes a str which maps to a method on the agents in the set. This PR makes it possible to use a callable instead. This callable will be called with the agent as the first argument.