Copied from https://github.com/bin123apple/MACM (paper: https://arxiv.org/abs/2404.04735). I want to explore how it works.
io -> I believe the io folder was used to run the GPT-4 baseline. It looks like it uses an assistant with the code interpreter tool, to get a more accurate baseline.
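
For reference, here is a rough sketch of what an "assistant + code interpreter" setup could look like with the OpenAI Python SDK. This is my guess at the general shape, not the actual code in the io folder. The helper names, the assistant name, the instructions text, and the default model string are all made up for illustration, and the Assistants API lives under `client.beta`, so it may differ between SDK versions.

```python
from typing import Any


def build_baseline_assistant_config(model: str = "gpt-4") -> dict[str, Any]:
    """Build keyword arguments for an assistant that can run code.

    Hypothetical helper: the name, instructions, and default model are
    placeholders, not values taken from the MACM repo.
    """
    return {
        "name": "Math baseline",
        "instructions": "Solve the math problem. Write and run code when it helps.",
        # Enabling this tool lets the assistant execute Python while answering.
        "tools": [{"type": "code_interpreter"}],
        "model": model,
    }


def create_baseline_assistant(config: dict[str, Any]):
    """Create the assistant via the OpenAI SDK (needs OPENAI_API_KEY set)."""
    # Imported lazily so the config helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()
    return client.beta.assistants.create(**config)


config = build_baseline_assistant_config()
print(config["tools"])
```

Only `build_baseline_assistant_config` runs offline. `create_baseline_assistant(config)` makes a real API call, so it needs the `openai` package and an API key.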