Open
Labels
enhancement (New feature or request)
Description
Initially we wanted to use serialized JSON as input for the LLM calls, but found that even heavily pruned JSON files were far too large to fit in the token window of the selected model. See commit 497378f for the function that was used before it was deleted.
To improve the stability of proChariot, we should look into alternative serialization methods, or ways to structure a JSON or other serialized data file, that keep it small enough to feed to the LLM.
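As a rough illustration of why the structure matters, the sketch below serializes the same rows three ways and compares their sizes. Everything in it is an assumption made for illustration: the field names, the example rows, the helper names, and the use of character count as a stand-in for token count (the real limit depends on the selected model's tokenizer). It is not the function removed in commit 497378f.

```python
import csv
import io
import json

# Hypothetical annotation rows; field names and values are illustrative only.
rows = [
    {"contig": "contig_1", "start": 100, "end": 400, "strand": "+", "product": "hypothetical protein"},
    {"contig": "contig_1", "start": 450, "end": 900, "strand": "-", "product": "DNA polymerase"},
    {"contig": "contig_2", "start": 10, "end": 310, "strand": "+", "product": "transposase"},
]


def to_json_rows(rows):
    """One JSON object per row: every key is repeated in every row."""
    return json.dumps(rows, separators=(",", ":"))


def to_columnar_json(rows):
    """Field names stated once, followed by one list of values per row."""
    fields = list(rows[0])
    payload = {"fields": fields, "rows": [[r[f] for f in fields] for r in rows]}
    return json.dumps(payload, separators=(",", ":"))


def to_tsv(rows):
    """Header line plus one tab-separated line per row."""
    fields = list(rows[0])
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)
    for r in rows:
        writer.writerow([r[f] for f in fields])
    return buf.getvalue()


def fits_budget(text, max_chars):
    """Crude size check; characters are only a proxy for tokens."""
    return len(text) <= max_chars


if __name__ == "__main__":
    for name, fn in [("json rows", to_json_rows), ("columnar json", to_columnar_json), ("tsv", to_tsv)]:
        print(name, len(fn(rows)))
```

On these example rows the per-row JSON is the largest and the TSV the smallest, because the repeated keys and quoting are the main overhead. Whether that ordering holds on real proChariot data would need to be measured.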
Tasks
- Find the minimal set of data to keep
- Is it worth it to do batching into contigs or other structures?
- Explore different possible data structures
  - Each row as an entry
  - Data prefixes
  - Alternatives to JSON
- Check size against a maximum
- Test thoroughly and weigh against the `.tsv` approach
  - Using big and small genomes to make sure the size is kept low enough
- Update system prompt to improve fidelity
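
To make the batching and size-check tasks more concrete, here is one possible sketch of splitting rows into per-contig batches that each stay under a maximum size. The `contig` field name, the `render_tsv` and `batch_by_contig` helpers, and the character-based budget are all hypothetical; a real implementation would likely count tokens with the selected model's tokenizer instead.

```python
from itertools import groupby
from operator import itemgetter


def render_tsv(rows):
    """Render rows as a header line plus one tab-separated line per row."""
    fields = list(rows[0])
    lines = ["\t".join(fields)]
    lines += ["\t".join(str(r[f]) for f in fields) for r in rows]
    return "\n".join(lines)


def batch_by_contig(rows, max_chars, render=render_tsv):
    """Split rows into batches that never mix contigs.

    A batch is closed as soon as adding the next row would push its
    rendered size past max_chars. A single row that is larger than the
    budget on its own is still emitted as a one-row batch.
    Re-rendering on every row is quadratic; fine for a sketch only.
    """
    batches = []
    ordered = sorted(rows, key=itemgetter("contig"))
    for _, group in groupby(ordered, key=itemgetter("contig")):
        current = []
        for row in group:
            if current and len(render(current + [row])) > max_chars:
                batches.append(current)
                current = []
            current.append(row)
        if current:
            batches.append(current)
    return batches
```

Each batch could then be sent in its own LLM call. Whether splitting by contig loses context the model needs is exactly the open question in the task list, so this only shows the mechanics.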