[WIP] Implement support for Batch APIs to gather evidence #687

Draft
maykcaldas wants to merge 3 commits into main

Conversation

@maykcaldas (Collaborator) commented Nov 14, 2024

Due to the parallel nature of gathering evidence and summarizing all candidate papers, we plan to use the batch API when possible.

Task list

  • Create a class to make batch calls to OpenAI (see the sketch after this list)
  • Create a class to make batch calls to Anthropic
  • Integrate the OpenAI class into the get_evidence method
  • Integrate the Anthropic class into the get_evidence method
  • Update get_summary_llm to decide which provider to use given the llm in the config
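For reference, a minimal sketch of what the OpenAI half of the first task could look like (the function name, model, and the assumed shape of `data` are placeholders, not the PR's actual implementation; the JSONL request format follows OpenAI's Batch API docs):

```python
import json
import tempfile

from openai import OpenAI


def submit_openai_batch(data: list[dict[str, str]], model: str = "gpt-4o-mini"):
    """Write one chat-completion request per entry and submit them as a single batch."""
    client = OpenAI()

    # One JSONL line per request; assumes each entry in `data` is a single chat message.
    with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
        for i, message in enumerate(data):
            f.write(json.dumps({
                "custom_id": f"request-{i}",
                "method": "POST",
                "url": "/v1/chat/completions",
                "body": {"model": model, "messages": [message]},
            }) + "\n")
        input_path = f.name

    # Upload the JSONL file, then create the batch job against the chat completions endpoint.
    batch_file = client.files.create(file=open(input_path, "rb"), purpose="batch")
    return client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
```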

Mayk Caldas added 2 commits November 14, 2024 14:26
This class is used to submit batch calls to the OpenAI batch API
@maykcaldas self-assigned this Nov 14, 2024
data: list[dict[str, str]],
callbacks: list[Callable] | None = None,
name: str | None = None,
skip_system: bool = False,

Collaborator:

I refactored out skip_system in #680, can you propagate that change to here?

@@ -609,6 +618,10 @@ class Settings(BaseSettings):
" router_kwargs key with router kwargs as values."
),
)
use_batch_in_summary: bool = Field(
default=False,
description="Whether to use batch API for LLMs in summarization",

Collaborator:

Can you add a few words on how the batches are actually formed?

Collaborator:

Perhaps you can say something like:

Whether to use batch API for LLMs in summarization, which means multiple messages are sent in one API request.

@maykcaldas (author):

It was updated to:

"Whether to use batch API for LLMs in summarization, "
"which means multiple messages are sent in one API request "
"to the LLM provider's batch API."
"This option is only available for Claude(https://docs.anthropic.com/en/api/creating-message-batches)"
"and OpenAI (https://platform.openai.com/docs/guides/batch) chat models."

}
)

while batch.status != "completed":

Collaborator:

We probably want "completed" and "failed" to be OpenAI enums here rather than free strings.
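The openai SDK types `Batch.status` as plain string literals rather than an enum, so one option is a small local enum whose values mirror the API's strings (a sketch, not part of this PR):

```python
from enum import StrEnum  # Python 3.11+


class OpenAIBatchStatus(StrEnum):
    """Statuses we poll for; values match the strings returned by the Batch API."""

    COMPLETED = "completed"
    FAILED = "failed"
    EXPIRED = "expired"
    CANCELLED = "cancelled"


# The loop condition then becomes, e.g.:
#   while batch.status != OpenAIBatchStatus.COMPLETED: ...
```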

batch = client.batches.retrieve(batch.id)
if batch.status == "failed":
    raise Exception(
        "Batch failed. \n\nReason: \n"
        + "\n".join([k.message for k in batch.errors.data])
    )
await asyncio.sleep(5)

Collaborator:

Let's parameterize this waiting, and maybe make the default longer, like 30-second or 1-minute polling?

We should probably add some debug/info logs here to track progress along with maybe a max-timeout which users can set.
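Something along these lines could cover all three points (the function name, defaults, and use of an async client are assumptions for the sketch, not the PR's implementation):

```python
import asyncio
import logging

from openai import AsyncOpenAI

logger = logging.getLogger(__name__)


async def wait_for_batch(
    client: AsyncOpenAI,
    batch_id: str,
    poll_interval: float = 30.0,  # seconds between status checks
    timeout: float = 24 * 3600,  # give up after this many seconds
):
    """Poll a batch until it completes, logging progress and enforcing a max wait."""
    elapsed = 0.0
    while elapsed < timeout:
        batch = await client.batches.retrieve(batch_id)
        logger.info("Batch %s status: %s", batch_id, batch.status)
        if batch.status == "completed":
            return batch
        if batch.status == "failed":
            errors = batch.errors.data or [] if batch.errors else []
            raise RuntimeError(
                "Batch failed.\n\nReason:\n"
                + "\n".join(e.message or "" for e in errors)
            )
        await asyncio.sleep(poll_interval)
        elapsed += poll_interval
    raise TimeoutError(f"Batch {batch_id} did not complete within {timeout} s")
```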
