Description
Hi,
First of all, thank you for this great project!
Observation
I noticed that when Python REPL mode is enabled, files generated with to_excel() (and potentially other pandas export methods) don't seem to be captured
in the API response. The files array comes back empty even though the file is successfully created in /mnt/data/.
Interestingly, the same workflow with to_csv() works perfectly.
How to Reproduce
1. REPL mode enabled (REPL_ENABLED=true)
2. Execute via /exec: {"lang":"py","code":"import pandas as pd\ndf = pd.DataFrame({\"City\": [\"Tokyo\"], \"Pop\": [37400000]})\ndf.to_excel(\"/mnt/data/test.xlsx\", index=False)\nprint(\"done\")"}
3. Response: "files":[] (file not captured)
4. Same test with to_csv works as expected
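The quoting in the request body is easy to get wrong by hand, so here is a minimal sketch that builds the same payload with json.dumps. It only constructs the body; how it is sent to /exec (host, port, authentication) is not covered in this issue and is left out.

```python
import json

# Code to run inside the interpreter (same snippet as in step 2).
code = (
    "import pandas as pd\n"
    'df = pd.DataFrame({"City": ["Tokyo"], "Pop": [37400000]})\n'
    'df.to_excel("/mnt/data/test.xlsx", index=False)\n'
    'print("done")'
)

# json.dumps takes care of escaping the quotes and newlines.
payload = json.dumps({"lang": "py", "code": code})
print(payload)
```

Swapping to_excel for to_csv (and the file extension) in the code string gives the variant from step 4.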
What I Found
Looking at src/services/execution/runner.py (around line 224), it seems like the REPL mode uses a keyword list to decide whether to scan for generated
files:
for kw in ["open(", "savefig", "to_csv", "write(", ".save("]
It looks like to_excel and a few other common pandas export methods might be missing from this list.
Possible Fix
Adding the missing keywords seems to resolve the issue on our side:
for kw in ["open(", "savefig", "to_csv", "to_excel", "to_json", "to_parquet", "to_html", "to_xml", "to_feather", "to_pickle", "write(", ".save(", "dump("]
dump( would also cover json.dump(), pickle.dump(), yaml.dump() patterns.
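To illustrate the difference, here is a small standalone sketch of a substring check using both lists. The helper name may_write_files is hypothetical and the real logic in runner.py may be structured differently; only the two keyword lists are taken from above.

```python
from typing import Iterable

# Keyword list currently in runner.py (as quoted above).
OLD_KEYWORDS = ["open(", "savefig", "to_csv", "write(", ".save("]

# Extended list from the proposed fix.
NEW_KEYWORDS = [
    "open(", "savefig", "to_csv", "to_excel", "to_json", "to_parquet",
    "to_html", "to_xml", "to_feather", "to_pickle", "write(", ".save(", "dump(",
]


def may_write_files(code: str, keywords: Iterable[str]) -> bool:
    """Hypothetical helper: True if any keyword occurs in the submitted code."""
    return any(kw in code for kw in keywords)


excel_snippet = 'df.to_excel("/mnt/data/test.xlsx", index=False)'
dump_snippet = "json.dump(data, fh)"

print(may_write_files(excel_snippet, OLD_KEYWORDS))  # False
print(may_write_files(excel_snippet, NEW_KEYWORDS))  # True
print(may_write_files(dump_snippet, OLD_KEYWORDS))   # False
print(may_write_files(dump_snippet, NEW_KEYWORDS))   # True
```

With the old list, the to_excel call never triggers the file scan, which matches the empty files array we see.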
Of course, you may have a better approach in mind — just sharing what worked for us. Happy to submit a PR if that would be helpful.
Environment
- LibreCodeInterpreter: latest (GHCR)
- Python execution image: latest (GHCR)
- REPL_ENABLED=true
- STATE_PERSISTENCE_ENABLED=true