
Exercise Chat: Implement native function calling agent #154

Open · wants to merge 21 commits into base: main

Conversation

@kaancayli (Contributor) commented Sep 5, 2024

This PR implements the exercise chat as a native tool-calling agent. For now, only OpenAI tool calling is supported.

More details will be added.

Summary by CodeRabbit

  • New Features

    • Enhanced message conversion functionality to support new message types and roles, including tool messages.
    • Introduced the ExerciseChatAgentPipeline for improved interactions regarding exercise-related queries.
    • Added flexibility in the chat pipeline to handle different variants and improve response handling.
    • Implemented new methods for binding tools within language models, enhancing interaction capabilities.
    • Updated prompts for improved interaction with students during exercise submissions and feedback.
  • Bug Fixes

    • Improved robustness in message handling and state management to prevent invalid transitions.
  • Documentation

    • Updated prompts and guidelines for the AI tutor to enhance educational interactions.
  • Chores

    • Added new dependencies to the project for enhanced language processing capabilities.
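Since details are still to be added, here is a minimal sketch of the native OpenAI tool-calling round trip this kind of agent builds on. Assumptions: the OpenAI Python SDK v1 client; the tool schema is illustrative (only the name get_submission_details is taken from this PR's tool list).

from openai import OpenAI

client = OpenAI()

# Illustrative schema; the PR wires up several such tools.
tools = [{
    "type": "function",
    "function": {
        "name": "get_submission_details",
        "description": "Fetch details about the student's latest submission.",
        "parameters": {
            "type": "object",
            "properties": {"exercise_id": {"type": "integer"}},
            "required": ["exercise_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Why does my build fail?"}],
    tools=tools,
)

# The model either answers directly or requests one or more tool calls,
# whose results are sent back as role="tool" messages in the next turn.
for tool_call in response.choices[0].message.tool_calls or []:
    print(tool_call.function.name, tool_call.function.arguments)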

coderabbitai bot (Contributor) commented Sep 5, 2024

Walkthrough

The changes involve updates to multiple files, enhancing message handling and pipeline functionality. Key modifications include the addition of new message types and subclasses, adjustments to method signatures for tool binding, and the introduction of new parameters in pipeline execution methods. The .gitignore file is also updated to ignore .DS_Store files. Overall, the changes aim to improve the flexibility and clarity of the message processing and interaction pipelines.

Changes

File: Change summary

  • .gitignore: Added entry for ignoring .DS_Store files.
  • app/common/pyris_message.py: Updated IrisMessageRole enum to include TOOL, modified PyrisMessage contents to allow a union type, and added PyrisAIMessage and PyrisToolMessage subclasses.
  • app/llm/external/model.py: Changed chat method return type from PyrisMessage to ChatCompletionMessage and added bind_tools method.
  • app/llm/external/ollama.py: Added bind_tools method to OllamaModel.
  • app/llm/external/openai_chat.py: Enhanced message conversion functions and added bind_tools methods to various chat model classes.
  • app/llm/langchain/iris_langchain_chat_model.py: Added bind_tools method to allow dynamic tool binding.
  • app/llm/request_handler/basic_request_handler.py: Introduced bind_tools method for binding tools to the language model.
  • app/llm/request_handler/capability_request_handler.py: Added bind_tools method for tool integration.
  • app/llm/request_handler/request_handler_interface.py: Introduced abstract bind_tools method to the request handler interface.
  • app/pipeline/chat/course_chat_pipeline.py: Enhanced CourseChatPipeline to handle new event types and updated logic for competency calculations.
  • app/pipeline/chat/exercise_chat_agent_pipeline.py: Introduced ExerciseChatAgentPipeline class for interactive chat sessions focused on student exercises.
  • app/pipeline/chat/exercise_chat_pipeline.py: Updated ExerciseChatPipeline to accept a variant parameter for flexible behavior.
  • app/pipeline/chat/interaction_suggestion_pipeline.py: Simplified prompt handling logic and improved error handling for chat history.
  • app/pipeline/prompts/iris_course_chat_prompts.py: Updated prompts for clarity on response formatting and originality.
  • app/web/routers/pipelines.py: Updated pipeline execution methods to accept new parameters for enhanced functionality.
  • app/web/status/status_update.py: Replaced serialization method and improved state management in the StatusCallback class.

Possibly related PRs

  • Pipeline: Ingestion pipeline #96: The changes to the .gitignore file in the main PR are related to the modifications in the ingestion pipeline, which also includes updates to the .gitignore file to manage Docker-related artifacts.
  • Tutor Chat Pipeline with Lecture content.  #104: The Tutor Chat Pipeline with Lecture content PR may relate to the main PR as it involves enhancements to the chat functionality, which could include managing ignored files for better organization.
  • Track token usage of iris requests #165: The introduction of token tracking in the Iris requests may connect to the main PR's changes in the .gitignore, as both involve improvements in managing resources and tracking usage effectively.


@kaancayli self-assigned this Sep 5, 2024
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 11

Outside diff range, codebase verification and nitpick comments (3)
app/domain/data/programming_exercise_dto.py (1)

34-38: recent_changes field lacks integration.

The recent_changes field is defined in programming_exercise_dto.py but is not currently integrated with version control or change tracking systems. Consider implementing functionality to utilize this field effectively for tracking Git diffs and changes.

  • Location: app/domain/data/programming_exercise_dto.py
Analysis chain

Addition of recent_changes enhances transparency.

The new recent_changes field, which captures the Git diff of recent changes, is a valuable addition for tracking updates. Ensure that this field is properly integrated with version control and change tracking systems.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify integration of `recent_changes` with version control systems.

# Test: Search for handling of `recent_changes`. Expect: Proper integration with version control systems.
rg --type python -A 5 $'recent_changes'

Length of output: 69


Script:

#!/bin/bash
# Description: Verify integration of `recent_changes` with version control systems.

# Test: Search for handling of `recent_changes`. Expect: Proper integration with version control systems.
rg --type py -A 5 $'recent_changes'

Length of output: 410


Script:

#!/bin/bash
# Description: Search for references to `recent_changes` across the codebase to verify its integration.

# Test: Find all occurrences of `recent_changes` to check for integration with version control systems.
rg 'recent_changes'

Length of output: 106

app/domain/data/exercise_with_submissions_dto.py (1)

37-37: Field addition approved; consider reviewing the alias usage.

The addition of the url field with a default value of None is well-implemented and ensures backward compatibility. However, the use of an alias that directly matches the field name might be unnecessary unless there are specific serialization requirements that necessitate this approach.
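For illustration, the redundant alias versus the plain field in Pydantic v2 (a sketch, not the full DTO):

from typing import Optional
from pydantic import BaseModel, Field

class ExerciseWithSubmissionsDTO(BaseModel):
    # Redundant: the alias repeats the field name.
    url: Optional[str] = Field(default=None, alias="url")

    # Equivalent and simpler, unless serialization settings require an
    # explicit alias:
    # url: Optional[str] = None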

app/common/message_converters.py (1)

Line range hint 28-47: Refactor suggestion for handling PyrisAIMessage and AIMessage conversions.

The function convert_iris_message_to_langchain_message has been updated to handle PyrisAIMessage specifically, which includes processing tool calls. This is a significant enhancement for supporting interactive tool functionalities. However, the current implementation could be improved for clarity and maintainability:

  • Extract Tool Call Conversion: The conversion logic for tool calls (lines 39-46) could be extracted into a separate function; a sketch follows after this list. This would make the convert_iris_message_to_langchain_message function cleaner and more focused on its primary responsibility.
  • Error Handling: Consider adding more specific error messages or custom exceptions for better debugging and user feedback.
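A hypothetical shape for that extraction (the helper name and langchain's dict-style tool calls are assumptions, not code from this PR):

from typing import Any, Dict, List

def convert_iris_tool_calls_to_langchain(tool_calls: List["ToolCallDTO"]) -> List[Dict[str, Any]]:
    # Field access mirrors the ToolCallDTO/FunctionDTO shapes shown later
    # in this review; Json[...] fields arrive already decoded.
    return [
        {
            "name": tc.function.name,
            "args": tc.function.arguments,
            "id": tc.id,
        }
        for tc in tool_calls
    ]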



class ExerciseChatPipelineExecutionDTO(ChatPipelineExecutionDTO):
    submission: Optional[ProgrammingSubmissionDTO] = None
    exercise: ProgrammingExerciseDTO
    course: CourseDTO
    event_payload: Optional[PyrisEventDTO[Any]] = Field(None, alias="eventPayload")

New attribute event_payload is well-integrated.

The addition of event_payload with an alias for serialization purposes is well-implemented. It would be beneficial to add documentation to explain its usage and relevance in the context of the application.

Would you like me to draft the documentation for this new attribute?

Comment on lines 37 to 40
class PyrisToolMessage(PyrisMessage):
    model_config = ConfigDict(populate_by_name=True)
    sender: IrisMessageRole = IrisMessageRole.TOOL
    contents: List[ToolMessageContentDTO] = Field(default=[])

Specialization in PyrisToolMessage is appropriate.

The setting of sender to TOOL and the specific structure for contents are well-suited for messages from tools. It would be beneficial to add documentation detailing the specific use cases and expected content structure for tool messages.

Would you like me to draft the documentation for this class?

@@ -30,3 +30,9 @@ class ProgrammingExerciseDTO(BaseModel):
     problem_statement: str = Field(alias="problemStatement", default=None)
     start_date: Optional[datetime] = Field(alias="startDate", default=None)
     end_date: Optional[datetime] = Field(alias="endDate", default=None)
+    max_points: Optional[float] = Field(alias="maxPoints", default=None)

Addition of max_points is beneficial.

The new max_points field adds valuable metrics for grading. Consider adding validation to ensure that the value is non-negative.

# Suggested validation to ensure non-negative values
if self.max_points is not None and self.max_points < 0:
    raise ValueError("max_points must be non-negative")
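In Pydantic v2 terms, the same check could live in a field validator (a sketch; only the relevant field is shown):

from typing import Optional
from pydantic import BaseModel, Field, field_validator

class ProgrammingExerciseDTO(BaseModel):
    max_points: Optional[float] = Field(alias="maxPoints", default=None)

    @field_validator("max_points")
    @classmethod
    def _non_negative(cls, v: Optional[float]) -> Optional[float]:
        if v is not None and v < 0:
            raise ValueError("max_points must be non-negative")
        return v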

Comment on lines 131 to 178
tools: Sequence[
    Union[Dict[str, Any], Type[LegacyBaseModel], Callable, BaseTool]
] = []
_client: OpenAI

def chat(
    self, messages: list[PyrisMessage], arguments: CompletionArguments
) -> PyrisMessage:
    print("Sending messages to OpenAI", messages)
    # noinspection PyTypeChecker
    retries = 10
    backoff_factor = 2
    initial_delay = 1

    for attempt in range(retries):
        try:
            if arguments.response_format == "JSON":
                response = self._client.chat.completions.create(
                    model=self.model,
                    messages=convert_to_open_ai_messages(messages),
                    temperature=arguments.temperature,
                    max_tokens=arguments.max_tokens,
                    response_format=ResponseFormatJSONObject(type="json_object"),
                )
                if self.tools:
                    response = self._client.chat.completions.create(
                        model=self.model,
                        messages=convert_to_open_ai_messages(messages),
                        temperature=arguments.temperature,
                        max_tokens=arguments.max_tokens,
                        response_format=ResponseFormat(type="json_object"),
                        tools=self.tools,
                    )
                else:
                    response = self._client.chat.completions.create(
                        model=self.model,
                        messages=convert_to_open_ai_messages(messages),
                        temperature=arguments.temperature,
                        max_tokens=arguments.max_tokens,
                        response_format=ResponseFormat(type="json_object"),
                    )
            else:
                response = self._client.chat.completions.create(
                    model=self.model,
                    messages=convert_to_open_ai_messages(messages),
                    temperature=arguments.temperature,
                    max_tokens=arguments.max_tokens,
                )
                if self.tools:
                    response = self._client.chat.completions.create(
                        model=self.model,
                        messages=convert_to_open_ai_messages(messages),
                        temperature=arguments.temperature,
                        max_tokens=arguments.max_tokens,
                        tools=self.tools,
                    )
                else:
                    response = self._client.chat.completions.create(
                        model=self.model,
                        messages=convert_to_open_ai_messages(messages),
                        temperature=arguments.temperature,
                        max_tokens=arguments.max_tokens,
                    )

Complex logic in chat method of OpenAIChatModel needs attention.

The chat method in OpenAIChatModel has been updated to handle tool integrations. The method is complex and handles multiple scenarios:

  • Refactor Suggestion: Consider refactoring to reduce complexity. Splitting the method into smaller parts or using helper functions for handling retries and response construction could improve clarity; a retry-helper sketch follows after this list.
  • Error Handling: The method includes error handling and retry logic, which is good. However, ensure that all potential error paths are covered, especially with the new tool integration.
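A sketch of such a retry helper (the constants mirror the snippet above; the helper name is hypothetical):

import time

def _retry_with_backoff(make_request, retries=10, backoff_factor=2, initial_delay=1):
    # Retry the request with exponential backoff, re-raising on the last attempt.
    delay = initial_delay
    for attempt in range(retries):
        try:
            return make_request()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(delay)
            delay *= backoff_factor

# Usage (hypothetical):
# response = _retry_with_backoff(lambda: self._client.chat.completions.create(**kwargs))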

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 8

Outside diff range, codebase verification and nitpick comments (9)
app/domain/chat/exercise_chat/exercise_chat_pipeline_execution_dto.py (1)

16-16: Well-implemented attribute addition.

The event_payload attribute is correctly implemented with an alias for serialization purposes. Consider adding a comment explaining the purpose of this attribute for future maintainability.

app/domain/pyris_message.py (2)

31-34: Well-designed subclass for AI messages.

PyrisAIMessage is appropriately designed for AI-generated messages. Consider adding documentation for the tool_calls attribute to clarify its usage.


37-40: Clear and functional design for tool messages.

PyrisToolMessage is well-designed for messages from tools. Adding documentation for the contents attribute could enhance clarity and maintainability.

app/domain/data/programming_exercise_dto.py (1)

33-38: Useful additions to the DTO.

The new fields max_points and recent_changes are well-implemented and enhance the functionality of the ProgrammingExerciseDTO. Consider adding validation for the max_points field to ensure it remains within a sensible range.

app/llm/request_handler/capability_request_handler.py (1)

74-81: Well-implemented method for tool binding.

The bind_tools method is correctly implemented and integrates well with the existing class structure. The use of type annotations is commendable as it enhances code readability and maintainability.

Consider adding error handling or logging within the bind_tools method to manage potential issues during the tool binding process, such as when the selected model does not support tool binding or when an invalid tool type is passed.

app/llm/external/model.py (1)

53-63: New abstract method bind_tools added.

The addition of the bind_tools method is a positive enhancement, increasing the extensibility of the LanguageModel class. It allows for more complex interactions with various tools.

Consider providing more detailed documentation or examples on how to implement this method in subclasses, especially given the variety of tool types that can be bound.

app/web/routers/pipelines.py (1)

28-37: Enhanced flexibility in run_exercise_chat_pipeline_worker.

The addition of the variant parameter and the use of ExerciseChatAgentPipeline enhance the flexibility and adaptability of the pipeline. This change allows for variant-specific processing, which could lead to improved performance in handling different scenarios.

Consider adding more detailed logging or error handling specific to the variant parameter to ensure that any issues related to variant-specific configurations are captured and addressed promptly.

app/llm/external/ollama.py (2)

Line range hint 83-135: Approve existing methods and suggest documentation for bind_tools.

The existing methods in the OllamaModel class are correctly implemented. However, the bind_tools method is marked as not implemented. It's important to document its intended future functionality to guide development and integration.

Would you like help drafting the documentation for the bind_tools method?


Potential Issue with convert_to_iris_message Function

The convert_to_iris_message function in app/llm/external/ollama.py correctly maps the role and content fields from a Message to a PyrisMessage. However, it uses datetime.now() for the send_at field, which may not preserve the original timestamp from the Message. Additionally, other fields in the Message are not handled, which could affect data integrity if those fields are important.

  • Consider preserving the original timestamp if it is significant (a sketch follows below).
  • Review if additional fields from the Message should be included in the PyrisMessage.
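A sketch of the first bullet, assuming the caller can supply the original timestamp. The extra parameter and map_str_to_role (the assumed inverse of map_role_to_str) are not in the current code:

from datetime import datetime
from typing import Optional

def convert_to_iris_message(message: Message, sent_at: Optional[datetime] = None) -> PyrisMessage:
    # Prefer the caller-provided timestamp; fall back to now() as before.
    return PyrisMessage(
        sender=map_str_to_role(message["role"]),
        contents=[TextMessageContentDTO(textContent=message["content"])],
        send_at=sent_at or datetime.now(),
    )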
Analysis chain

Line range hint 43-66: Verify correct mapping and data integrity in convert_to_iris_message.

Ensure that the convert_to_iris_message function correctly maps all fields and maintains data integrity during the conversion.

Run the following script to verify the correct mapping and data integrity:

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correct mapping and data integrity in `convert_to_iris_message`.

# Test: Search for the function usage. Expect: Correct mapping and data integrity.
rg --type python -A 5 $'convert_to_iris_message'

Length of output: 78


Script:

#!/bin/bash
# Description: Verify the correct mapping and data integrity in `convert_to_iris_message`.

# Test: Search for the function definition and usage. Expect: Correct mapping and data integrity.
rg --type py -A 5 $'def convert_to_iris_message'

Length of output: 827


Script:

#!/bin/bash
# Description: Extract the complete implementation of `convert_to_iris_message` from `app/llm/external/ollama.py`.

# Extract the function implementation to verify correct mapping and data integrity.
ast-grep --lang python --pattern $'def convert_to_iris_message($_) {\n  $$$\n}' app/llm/external/ollama.py

Length of output: 569


Script:

#!/bin/bash
# Description: Extract the complete implementation of `convert_to_iris_message` from `app/llm/external/ollama.py`.

# Extract the function implementation to verify correct mapping and data integrity.
rg --type py -A 20 $'def convert_to_iris_message' app/llm/external/ollama.py

Length of output: 545

Comment on lines +8 to +10
class PyrisEventDTO(BaseModel, Generic[T]):
    event_type: str = Field(None, alias="eventType")
    event: T = Field(None, alias="event")

Suggest adding explicit default values and improving type annotations.

The PyrisEventDTO class correctly uses Pydantic models and generics. However, it would benefit from the following improvements:

  • Specify explicit default values for event_type and event to enhance clarity and ensure that these fields are optional.
  • Improve type annotations by specifying more precise types for event_type and event.

Consider applying these changes:

-    event_type: str = Field(None, alias="eventType")
-    event: T = Field(None, alias="event")
+    event_type: Optional[str] = Field(default=None, alias="eventType")
+    event: Optional[T] = Field(default=None, alias="event")

Comment on lines +6 to +11
class ToolMessageContentDTO(BaseModel):

    model_config = ConfigDict(populate_by_name=True)
    name: Optional[str] = Field(alias="toolName", default="")
    tool_content: str = Field(alias="toolContent")
    tool_call_id: str = Field(alias="toolCallId")

Note on model_config usage.

The ToolMessageContentDTO class uses Pydantic models effectively. In Pydantic v2, model_config is defined as a class attribute by design: it carries model configuration (here, populate_by_name=True) rather than per-instance state, so the class-level declaration is the idiomatic pattern and does not introduce shared mutable state.

Comment on lines +6 to +9
class FunctionDTO(BaseModel):
    name: str = Field(..., alias="name")
    arguments: Json[Any] = Field(..., alias="arguments")


Suggest enhancing type safety for arguments.

While Json[Any] provides flexibility, it may compromise type safety. Consider specifying more detailed types if possible, or validate the structure of arguments at runtime to prevent errors.
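One way to narrow the type while keeping JSON-string input, as a sketch:

from typing import Any, Dict
from pydantic import BaseModel, Field, Json

class FunctionDTO(BaseModel):
    name: str = Field(..., alias="name")
    # Json[Dict[str, Any]] still allows arbitrary keys, but rejects
    # top-level arrays and scalars that Json[Any] would accept.
    arguments: Json[Dict[str, Any]] = Field(..., alias="arguments")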

Comment on lines +11 to +16
class ToolCallDTO(BaseModel):

    model_config = ConfigDict(populate_by_name=True)
    id: str = Field(alias="id")
    type: Literal["function"] = "function"
    function: FunctionDTO = Field(alias="function")

Consider future extensibility for ToolCallDTO.

The type field is currently restricted to Literal["function"]. Consider allowing more types if future extensions are planned, to avoid limiting the DTO's usability.

Comment on lines +37 to +96
if message.sender == "TOOL":
    match content:
        case ToolMessageContentDTO():
            openai_messages.append(
                {
                    "role": "tool",
                    "content": content.tool_content,
                    "tool_call_id": content.tool_call_id,
                }
            )
        case _:
            pass
else:
    match content:
        case ImageMessageContentDTO():
            openai_content.append(
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{content.base64}",
                        "detail": "high",
                    },
                }
            )
        case TextMessageContentDTO():
            openai_content.append(
                {"type": "text", "text": content.text_content}
            )
        case JsonMessageContentDTO():
            openai_content.append(
                {
                    "type": "json_object",
                    "json_object": content.json_content,
                }
            )
        case _:
            pass

if isinstance(message, PyrisAIMessage) and message.tool_calls:
    openai_message = {
        "role": map_role_to_str(message.sender),
        "content": openai_content,
        "tool_calls": [
            {
                "id": tool.id,
                "type": tool.type,
                "function": {
                    "name": tool.function.name,
                    "arguments": json.dumps(tool.function.arguments),
                },
            }
            for tool in message.tool_calls
        ],
    }
else:
    openai_message = {
        "role": map_role_to_str(message.sender),
        "content": openai_content,
    }
openai_messages.append(openai_message)

Refactor: Enhance message conversion logic in convert_to_open_ai_messages.

The updated logic to handle messages from tools separately is a significant improvement. It allows for the inclusion of tool-specific content in the OpenAI message format, which is crucial for integrating tool interactions into the chat. Ensure that all new branches and conditions are covered by unit tests to verify correct behavior.

Consider refactoring the method to separate concerns more clearly, possibly by extracting the tool message handling into a separate function.
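One possible extraction, using only fields from the branch above (the helper name is hypothetical):

def _tool_message_to_openai(content: ToolMessageContentDTO) -> dict:
    # Maps a tool result onto the OpenAI role="tool" message shape.
    return {
        "role": "tool",
        "content": content.tool_content,
        "tool_call_id": content.tool_call_id,
    }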

Comment on lines +155 to +157
query: PyrisMessage | None = None
if history:
    query = dto.chat_history[-1]

Updated logic in _run_exercise_chat_pipeline for handling variants:

The method now includes logic to append different system prompts based on the variant value. This is a robust design choice that tailors the chatbot's responses to specific scenarios like "progress_stalled" or "build_failed". It's crucial to ensure that these variants are well-documented and tested to prevent any unexpected behavior.

Consider adding unit tests to cover these new branches in the logic to ensure they behave as expected under different conditions.

Also applies to: 248-259

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 12

Outside diff range, codebase verification and nitpick comments (8)
app/domain/data/exercise_with_submissions_dto.py (1)

37-37: Field definition for url is correct but consider simplifying the alias.

The implementation of the url field in ExerciseWithSubmissionsDTO is correct. However, the alias "url" is redundant since it matches the field name. Consider removing the alias for simplicity unless there is a specific reason for its inclusion.

app/llm/llm_manager.py (1)

42-42: Good shift towards validation in load_llms method.

The change to use model_validate instead of parse_obj in the load_llms method is a positive step towards ensuring data integrity. Consider adding error handling or logging to monitor the impact of this change on the system.
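For reference, the v1-to-v2 rename looks like this (illustrative model, not the repo's actual LLM config class):

from pydantic import BaseModel

class LlmEntry(BaseModel):
    id: str
    description: str = ""

raw = {"id": "azure-gpt-4o", "description": "Azure OpenAI deployment"}

# Pydantic v1: LlmEntry.parse_obj(raw)  -- deprecated in v2
entry = LlmEntry.model_validate(raw)    # v2 equivalent, with full validation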

app/llm/request_handler/capability_request_handler.py (1)

74-81: Lack of Validation for tools Parameter in bind_tools Method

The bind_tools method in capability_request_handler.py does not perform any validation or security checks on the tools parameter before binding them to the language model. This could lead to potential security vulnerabilities. It is recommended to implement validation checks to ensure that only valid and safe tools are bound to the model.

  • File: app/llm/request_handler/capability_request_handler.py
  • Lines: 74-81
Analysis chain

Review the bind_tools method implementation.

The implementation of bind_tools appears correct and aligns with the PR objectives. However, consider adding security checks or validations for the tools parameter to ensure that only valid tools are bound to the model.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify security checks for the `tools` parameter.

# Test: Search for security checks related to `tools`. Expect: Security checks or validations.
rg --type python -A 5 $'tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]]'

Length of output: 132


Script:

#!/bin/bash
# Description: Search for occurrences of `tools` and any associated validation or security checks in the codebase.

# Search for occurrences of `tools` in the codebase to identify any validation or security checks.
rg 'tools' -A 5

Length of output: 27543

app/llm/external/model.py (1)

47-47: Potential issues with return type change in chat method.

The return type of the chat method in app/llm/external/model.py has been changed to ChatCompletionMessage, but other implementations and usages still expect PyrisMessage. This discrepancy could lead to compatibility issues. Ensure that all usages of the chat method are updated to handle the new return type correctly.

  • Files to check:
    • app/pipeline/lecture_ingestion_pipeline.py
    • app/pipeline/competency_extraction_pipeline.py
    • app/llm/external/openai_chat.py
    • app/llm/external/ollama.py
    • Various request handler files
Analysis chain

Verify the impact of the return type change in the chat method.

The change in return type from PyrisMessage to ChatCompletionMessage could affect downstream processing. Verify that all existing usages of the chat method have been updated to handle the new return type.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the impact of the return type change on existing code.

# Test: Search for usages of the `chat` method. Expect: Updated handling for `ChatCompletionMessage`.
rg --type python -A 5 $'chat\('

Length of output: 63


Script:

#!/bin/bash
# Description: Verify the impact of the return type change on existing code.

# Test: Search for usages of the `chat` method. Expect: Updated handling for `ChatCompletionMessage`.
rg --type py -A 5 $'chat\('

Length of output: 5840

app/llm/external/ollama.py (1)

Line range hint 83-135: Review: New method bind_tools and class structure.

The class OllamaModel is well-structured and inherits appropriately from related classes. The new method bind_tools is designed to raise a NotImplementedError, indicating it's a placeholder for future functionality. This approach is acceptable as it clearly communicates the current limitations and future intentions.

However, consider adding more detailed documentation to the bind_tools method to explain its future purpose and expected integration.

Would you like help drafting the additional documentation for this method?

app/pipeline/chat/interaction_suggestion_pipeline.py (1)

Line range hint 115-145: Approved: Streamlined prompt handling and improved error handling.

The modifications to the __call__ method in the InteractionSuggestionPipeline class simplify the handling of chat history and prompts. The removal of specific prompts related to chat history reduces redundancy and potential confusion. The addition of a ValueError when no last message is provided enhances the robustness of the method by ensuring necessary data is present.

Consider adding unit tests to cover these changes, especially the new error handling paths, to ensure they work as expected under various scenarios.

Would you like assistance in creating these unit tests?

app/common/message_converters.py (1)

Line range hint 22-47: Approved: Expanded message conversion functionality.

The function convert_iris_message_to_langchain_message has been effectively expanded to handle new message types such as ToolMessage and PyrisToolMessage. The robust error handling for empty messages and type mismatches ensures reliable operation.

Consider adding detailed documentation to describe the handling of these new message types, which will aid in maintenance and future enhancements.

Would you like help with drafting this documentation?

app/pipeline/chat/course_chat_pipeline.py (1)

Line range hint 585-591: Suggest refining exception handling.

The broad exception handling in the __call__ method could be refined to catch specific exceptions. This would allow for more targeted error responses and recovery actions, improving the robustness and maintainability of the code.

Consider handling specific exceptions like ValueError or TypeError separately to provide more informative error messages and recovery options.



 class PyrisMessage(BaseModel):
     model_config = ConfigDict(populate_by_name=True)

     sent_at: datetime | None = Field(alias="sentAt", default=None)
     sender: IrisMessageRole
-    contents: List[MessageContentDTO] = []
+    contents: Union[str, List[MessageContentDTO]] = ""

Suggest additional validation for the contents attribute.

Given the change to a union type, ensure that the contents attribute has adequate validation to handle both str and List[MessageContentDTO] types effectively.

Consider implementing a method in the PyrisMessage class to validate the contents based on its type before processing.
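A sketch of such a validator, to be placed inside PyrisMessage (DTO names as used elsewhere in this PR; whether to wrap bare strings is a design choice, not current behavior):

from pydantic import field_validator

@field_validator("contents", mode="before")
@classmethod
def _coerce_contents(cls, v):
    # Normalize a bare string into the list form so downstream code
    # only ever handles List[MessageContentDTO].
    if isinstance(v, str) and v:
        return [TextMessageContentDTO(text_content=v)]
    return v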

Comment on lines +41 to +47
def bind_tools(
    self,
    tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
) -> LanguageModel:
    llm = self.llm_manager.get_llm_by_id(self.model_id)
    llm.bind_tools(tools)
    return llm

Approve the bind_tools method with a suggestion.

The implementation of bind_tools is correct and aligns with the described functionality. However, it would benefit from a detailed docstring explaining the parameters and the expected behavior.

Consider adding a detailed docstring to the bind_tools method:

def bind_tools(
    self,
    tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
) -> LanguageModel:
    """
    Binds a sequence of tools to the language model.

    Args:
        tools (Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]]): A sequence of tools to be bound.

    Returns:
        LanguageModel: The language model with tools bound.
    """
    llm = self.llm_manager.get_llm_by_id(self.model_id)
    llm.bind_tools(tools)
    return llm

Comment on lines +46 to +52
@abstractmethod
def bind_tools(
    self,
    tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
) -> LanguageModel:
    """Bind tools"""
    raise NotImplementedError

Approve the addition of bind_tools method with a suggestion.

The addition of the bind_tools method to the RequestHandler interface is crucial for ensuring consistency across implementations. It would benefit from a detailed docstring explaining the parameters and the expected behavior.

Consider adding a detailed docstring to the bind_tools method:

def bind_tools(
    self,
    tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
) -> LanguageModel:
    """
    Abstract method to bind tools to the language model.

    Args:
        tools (Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]]): A sequence of tools to be bound.

    Returns:
        LanguageModel: The language model with tools bound.
    """
    raise NotImplementedError

Comment on lines +77 to +89
def convert_langchain_tool_calls_to_iris_tool_calls(
    tool_calls: List[ToolCall],
) -> List[ToolCallDTO]:
    return [
        ToolCallDTO(
            function=FunctionDTO(
                name=tc["name"],
                arguments=json.dumps(tc["args"]),
            ),
            id=tc["id"],
        )
        for tc in tool_calls
    ]

Approved: Effective tool call conversion with a note on JSON serialization.

The function convert_langchain_tool_calls_to_iris_tool_calls effectively converts tool calls for integration, using JSON serialization for arguments. Ensure that this serialization process is thoroughly tested to maintain data integrity and prevent issues in downstream processing.

Would you like assistance in setting up tests for the JSON serialization process?
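For instance, a minimal round-trip check could look like this (the import path is assumed; file_lookup is one of this PR's tool names):

import json

from app.common.pyris_message import FunctionDTO, ToolCallDTO  # path assumed

def test_tool_call_arguments_round_trip():
    args = {"file": "src/Main.java", "line": 42}
    dto = ToolCallDTO(
        id="call_1",
        function=FunctionDTO(name="file_lookup", arguments=json.dumps(args)),
    )
    # Json[...] fields are decoded during validation, so the parsed value
    # should equal the original mapping.
    assert dto.function.arguments == args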

Comment on lines +91 to +134

 def convert_langchain_message_to_iris_message(
     base_message: BaseMessage,
 ) -> PyrisMessage:
-    match base_message.type:
-        case "human":
-            role = IrisMessageRole.USER
-        case "ai":
-            role = IrisMessageRole.ASSISTANT
-        case "system":
-            role = IrisMessageRole.SYSTEM
-        case _:
-            raise ValueError(f"Unknown message type: {base_message.type}")
-    contents = [TextMessageContentDTO(textContent=base_message.content)]
+    type_to_role = {
+        "human": IrisMessageRole.USER,
+        "ai": IrisMessageRole.ASSISTANT,
+        "system": IrisMessageRole.SYSTEM,
+        "tool": IrisMessageRole.TOOL,
+    }
+
+    role = type_to_role.get(base_message.type)
+    if role is None:
+        raise ValueError(f"Unknown message type: {base_message.type}")
+
+    if isinstance(base_message, (HumanMessage, SystemMessage)):
+        contents = [TextMessageContentDTO(textContent=base_message.content)]
+    elif isinstance(base_message, AIMessage):
+        if base_message.tool_calls:
+            contents = [TextMessageContentDTO(textContent=base_message.content)]
+            tool_calls = convert_langchain_tool_calls_to_iris_tool_calls(
+                base_message.tool_calls
+            )
+            return PyrisAIMessage(
+                contents=contents,
+                tool_calls=tool_calls,
+                send_at=datetime.now(),
+            )
+        else:
+            contents = [TextMessageContentDTO(textContent=base_message.content)]
+    elif isinstance(base_message, ToolMessage):
+        contents = [
+            ToolMessageContentDTO(
+                toolContent=base_message.content,
+                toolName=base_message.additional_kwargs["name"],
+                toolCallId=base_message.tool_call_id,
+            )
+        ]
+        return PyrisToolMessage(
+            contents=contents,
+            send_at=datetime.now(),
+        )
+    else:
+        raise ValueError(f"Unknown message type: {type(base_message)}")

Approved: Comprehensive message conversion with robust error handling.

The function convert_langchain_message_to_iris_message handles a wide range of message types and roles, with robust error handling for unknown types. This comprehensive approach enhances the system's flexibility and reliability.

Consider adding detailed documentation to explain the handling of different message types and roles, which will aid in maintenance and future enhancements.

Would you like help with drafting this documentation?

Comment on lines 151 to 591
                        HumanMessage(
                            "Consider the student's newest and latest input:"
                        ),
                        convert_iris_message_to_langchain_human_message(query),
                        ("placeholder", "{agent_scratchpad}"),
                    ]
                )
            else:
                self.prompt = ChatPromptTemplate.from_messages(
                    [
                        SystemMessage(
                            initial_prompt_with_date
                            + "\n"
                            + add_exercise_context_to_prompt(
                                exercise_title,
                                problem_statement,
                                programming_language,
                            )
                            + agent_prompt
                            + "\n"
                            + format_reminder_prompt,
                        ),
                        ("placeholder", "{agent_scratchpad}"),
                    ]
                )

            tools = generate_structured_tools_from_functions(
                [
                    get_submission_details,
                    get_additional_exercise_details,
                    get_build_logs_analysis_tool,
                    get_feedbacks,
                    repository_files,
                    file_lookup,
                ]
            )
            agent = create_tool_calling_agent(
                llm=self.llm, tools=tools, prompt=self.prompt
            )
            agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False)
            out = None
            self.callback.in_progress()
            invocation_result = agent_executor.invoke(params)
            if invocation_result.get("output", None):
                out = invocation_result["output"]

            try:
                self.callback.in_progress("Refining response...")
                self.prompt = ChatPromptTemplate.from_messages(
                    [
                        SystemMessagePromptTemplate.from_template(guide_system_prompt),
                    ]
                )

                guide_response = (self.prompt | self.llm | StrOutputParser()).invoke(
                    {
                        "response": out,
                    }
                )
                if "!ok!" in guide_response:
                    print("Response is ok and not rewritten!!!")
                else:
                    out = guide_response
                    print("Response is rewritten.")

                self.callback.done("Response created", final_result=out)
            except Exception as e:
                logger.error(
                    "An error occurred while running the course chat interaction suggestion pipeline",
                    exc_info=e,
                )
                traceback.print_exc()
                self.callback.error("Error in refining response")
            try:
                if out:
                    suggestion_dto = InteractionSuggestionPipelineExecutionDTO()
                    suggestion_dto.chat_history = dto.chat_history
                    suggestion_dto.last_message = out
                    suggestions = self.suggestion_pipeline(suggestion_dto)
                    self.callback.done(final_result=None, suggestions=suggestions)
                else:
                    # This should never happen but whatever
                    self.callback.skip(
                        "Skipping suggestion generation as no output was generated."
                    )
            except Exception as e:
                logger.error(
                    "An error occurred while running the course chat interaction suggestion pipeline",
                    exc_info=e,
                )
                traceback.print_exc()
                self.callback.error("Generating interaction suggestions failed.")
        except Exception as e:
            logger.error(
                "An error occurred while running the course chat pipeline", exc_info=e
            )
            traceback.print_exc()
            self.callback.error(
                "An error occurred while running the course chat pipeline."
            )

Method: __call__

This method orchestrates the execution of the chat agent pipeline. It is complex and handles multiple steps including logging, error handling, and invoking various tools. The method could benefit from further decomposition into smaller methods to improve readability and maintainability.

Refactor the __call__ method by extracting parts of its functionality into separate methods, such as handle_query, setup_prompt, and execute_tools, to make the code cleaner and easier to manage.

- def __call__(self, dto: ExerciseChatPipelineExecutionDTO):
+ def handle_query(self, dto):
+     # Logic to handle query extraction and validation
+
+ def setup_prompt(self, dto, query):
+     # Logic to setup the chat prompt based on the query and dto
+
+ def execute_tools(self, dto, params):
+     # Logic to execute tools and handle outputs
+
+ def __call__(self, dto: ExerciseChatPipelineExecutionDTO):
+     query = self.handle_query(dto)
+     self.setup_prompt(dto, query)
+     params = {}
+     self.execute_tools(dto, params)


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 7

🧹 Outside diff range and nitpick comments (2)
app/config.py (1)

Line range hint 1-50: Consider enhancing security and flexibility of configuration handling.

While the current implementation is functional, consider the following suggestions to improve security and flexibility:

  1. Secret Management: For sensitive data like API keys, consider using a secure secret management solution instead of storing them in a YAML file.

  2. Configuration Validation: Add more detailed validation for the configuration values, such as checking for empty strings or invalid formats.

  3. Default Values: Provide default values where appropriate to make the configuration more robust.

  4. Configuration Reloading: Implement a method to reload the configuration at runtime without restarting the application.

Here's a sample implementation for adding default values and more detailed validation:

from pydantic import BaseModel, Field, HttpUrl

class APIKeyConfig(BaseModel):
    token: str = Field(..., min_length=1)

class WeaviateSettings(BaseModel):
    host: HttpUrl
    port: int = Field(default=8080, ge=1, le=65535)
    grpc_port: int = Field(default=50051, ge=1, le=65535)

class Settings(BaseModel):
    api_keys: list[APIKeyConfig] = Field(default_factory=list)
    env_vars: dict[str, str] = Field(default_factory=dict)
    weaviate: WeaviateSettings

    # ... rest of the class implementation

This example adds default values, uses Field for more detailed validation, and uses HttpUrl for validating the Weaviate host.

app/llm/external/openai_chat.py (1)

Line range hint 1-236: Add Unit Tests for New Tool Integration Features

The addition of tool integrations and the associated logic changes introduce complexity that should be thoroughly tested. Ensure that unit tests cover:

  • Messages with and without tools.
  • Different content types (text, image, JSON).
  • Error handling when unexpected content types are encountered.
  • The behavior of bind_tools in both the base class and subclasses.

Would you like assistance in creating unit tests for these new functionalities?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 4c653ed and d6b8fa6.

📒 Files selected for processing (3)
  • app/config.py (1 hunks)
  • app/llm/external/openai_chat.py (4 hunks)
  • app/pipeline/chat/exercise_chat_agent_pipeline.py (1 hunks)
🧰 Additional context used
🔇 Additional comments (5)
app/config.py (1)

35-35: Update to Pydantic v2 syntax approved, but verify compatibility.

The change from cls.parse_obj(settings_file) to cls.model_validate(settings_file) aligns with Pydantic v2's new API. This update is correct and follows the latest best practices for Pydantic.

To ensure full compatibility, please run the following verification script:

This script will help ensure that:

  1. The project is using Pydantic v2.
  2. There are no remaining instances of parse_obj that need to be updated.
  3. The new model_validate method is being used consistently across the codebase.
✅ Verification successful

Pydantic update verified successfully. No issues found.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify Pydantic version and usage across the codebase

# Check Pydantic version
echo "Checking Pydantic version:"
rg -i "pydantic[=><]" requirements.txt

# Check for any remaining usage of parse_obj
echo "Checking for remaining usage of parse_obj:"
rg --type python "parse_obj"

# Check for usage of model_validate
echo "Checking for usage of model_validate:"
rg --type python "model_validate"

Length of output: 424


Script:

#!/bin/bash
# Description: Corrected verification of Pydantic version and usage across the codebase

# Check Pydantic version
echo "Checking Pydantic version:"
rg -i "pydantic[=><]" requirements.txt

# Check for any remaining usage of parse_obj
echo "Checking for remaining usage of parse_obj:"
rg --type py "parse_obj"

# Check for usage of model_validate
echo "Checking for usage of model_validate:"
rg --type py "model_validate"

Length of output: 766

app/llm/external/openai_chat.py (2)

105-126: Ensure Correct Handling in convert_to_iris_message Function

In the convert_to_iris_message function, when constructing a PyrisAIMessage with tool_calls, the contents field is set to [TextMessageContentDTO(textContent="")]. This might not correctly represent the message content if the message.content is not empty. Verify if this is the intended behavior or if the actual content should be included.


132-134: 🛠️ Refactor suggestion

Review Type Annotations for tools Attribute

The tools attribute in OpenAIChatModel is annotated with a complex Union type. Ensure that all possible types are necessary and compatible. Simplifying the type annotations or creating a separate type alias might improve readability and maintainability.

app/pipeline/chat/exercise_chat_agent_pipeline.py (2)

109-153: Consider refactoring 'ExerciseChatAgentPipeline' for single responsibility

The ExerciseChatAgentPipeline class handles multiple responsibilities, including LLM interactions, tool management, prompt construction, and callback handling. This may violate the Single Responsibility Principle, making the code harder to maintain.


153-604: Refactor the __call__ method to improve readability

The __call__ method is extensive and contains nested functions and complex logic. Breaking it down into smaller, well-named methods can enhance readability and maintainability.

Comment on lines +38 to +49
if message.sender == "TOOL":
    match content:
        case ToolMessageContentDTO():
            openai_messages.append(
                {
                    "role": "tool",
                    "content": content.tool_content,
                    "tool_call_id": content.tool_call_id,
                }
            )
        case _:
            pass

⚠️ Potential issue

Handle Unhandled Content Types Explicitly

In the convert_to_open_ai_messages function, when matching content for messages where message.sender == "TOOL", the default case is a pass statement. This could lead to silent failures if unexpected content types are encountered. It's advisable to handle unanticipated content types explicitly, possibly by raising an exception or logging a warning.

Comment on lines +76 to +91
if isinstance(message, PyrisAIMessage) and message.tool_calls:
    openai_message = {
        "role": map_role_to_str(message.sender),
        "content": openai_content,
        "tool_calls": [
            {
                "id": tool.id,
                "type": tool.type,
                "function": {
                    "name": tool.function.name,
                    "arguments": json.dumps(tool.function.arguments),
                },
            }
            for tool in message.tool_calls
        ],
    }

🛠️ Refactor suggestion

Refactor Conditional Logic to Avoid Code Duplication

The construction of openai_message has two branches: one for messages with tool_calls and one without. Both branches share common keys like "role" and "content". Consider refactoring this logic to reduce duplication by initializing the common parts of openai_message and conditionally adding the "tool_calls" key if it exists.
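One possible shape of that refactor, building the common keys once and attaching tool_calls only when present (same names as the snippet above):

openai_message = {
    "role": map_role_to_str(message.sender),
    "content": openai_content,
}
if isinstance(message, PyrisAIMessage) and message.tool_calls:
    openai_message["tool_calls"] = [
        {
            "id": tool.id,
            "type": tool.type,
            "function": {
                "name": tool.function.name,
                "arguments": json.dumps(tool.function.arguments),
            },
        }
        for tool in message.tool_calls
    ]
openai_messages.append(openai_message)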

Comment on lines 147 to 179
        if self.tools:
            response = self._client.chat.completions.create(
                model=self.model,
                messages=convert_to_open_ai_messages(messages),
                temperature=arguments.temperature,
                max_tokens=arguments.max_tokens,
                response_format=ResponseFormat(type="json_object"),
                tools=self.tools,
            )
        else:
            response = self._client.chat.completions.create(
                model=self.model,
                messages=convert_to_open_ai_messages(messages),
                temperature=arguments.temperature,
                max_tokens=arguments.max_tokens,
                response_format=ResponseFormat(type="json_object"),
            )
    else:
        response = self._client.chat.completions.create(
            model=self.model,
            messages=convert_to_open_ai_messages(messages),
            temperature=arguments.temperature,
            max_tokens=arguments.max_tokens,
        )
        if self.tools:
            response = self._client.chat.completions.create(
                model=self.model,
                messages=convert_to_open_ai_messages(messages),
                temperature=arguments.temperature,
                max_tokens=arguments.max_tokens,
                tools=self.tools,
            )
        else:
            response = self._client.chat.completions.create(
                model=self.model,
                messages=convert_to_open_ai_messages(messages),
                temperature=arguments.temperature,
                max_tokens=arguments.max_tokens,
            )

🛠️ Refactor suggestion

Reduce Code Duplication in API Calls

The chat method contains duplicated code for making API calls to OpenAI, differing only in the presence of the tools parameter and response_format. Consider abstracting the common parameters and constructing the API call dynamically based on conditions to enhance maintainability.
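A sketch of that dynamic construction (names taken from the snippet above; behavior preserved apart from removing the redundant first call):

completion_kwargs = {
    "model": self.model,
    "messages": convert_to_open_ai_messages(messages),
    "temperature": arguments.temperature,
    "max_tokens": arguments.max_tokens,
}
if arguments.response_format == "JSON":
    completion_kwargs["response_format"] = ResponseFormat(type="json_object")
if self.tools:
    completion_kwargs["tools"] = self.tools
response = self._client.chat.completions.create(**completion_kwargs)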

yassinsws previously approved these changes Oct 13, 2024
yassinsws (Contributor) left a comment

Code looks good; only one minor nitpick.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 20

🧹 Outside diff range and nitpick comments (16)
app/common/pyris_message.py (2)

17-17: Consider adding docstring for the new TOOL role.

While the addition of the TOOL role is appropriate, adding documentation would help clarify its purpose and usage in the context of native function calling.

-    TOOL = "TOOL"
+    TOOL = "TOOL"  # Represents messages from tool/function executions in native function calling

34-37: Add class-level documentation for PyrisAIMessage.

The implementation looks good, but adding a docstring would help explain its role in the tool calling system.

 class PyrisAIMessage(PyrisMessage):
+    """
+    Represents an AI assistant message that may include tool/function calls.
+    Used in the context of native function calling to request tool executions.
+    """
     model_config = ConfigDict(populate_by_name=True)
app/llm/request_handler/request_handler_interface.py (1)

46-52: Consider the implications of making tool binding mandatory.

Making bind_tools an abstract method requires all request handlers to implement tool binding, which might not be necessary for simpler handlers. Consider:

  1. Moving tool binding to a separate interface (e.g., ToolBindingCapable)
  2. Or making it an optional capability through a default implementation that returns self

This would provide more flexibility in implementing request handlers that don't need tool binding capabilities; a sketch of option 1 follows below.
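A sketch of option 1 (the protocol and its name are hypothetical; the quoted annotations refer to the repo's own classes):

from typing import Any, Callable, Dict, Protocol, Sequence, Type, Union

class ToolBindingCapable(Protocol):
    # Handlers that support tools implement this protocol; simple request
    # handlers stay free of bind_tools entirely.
    def bind_tools(
        self,
        tools: Sequence[Union[Dict[str, Any], Type["BaseModel"], Callable, "BaseTool"]],
    ) -> "LanguageModel": ...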

app/llm/request_handler/capability_request_handler.py (1)

77-84: Consider implementing a broader tool management interface

The current implementation only supports binding tools, but a more comprehensive tool management interface might be needed for:

  • Unbinding tools
  • Listing bound tools
  • Checking tool compatibility
  • Tool validation and verification

Consider creating a dedicated tool management interface that this class can implement.

app/pipeline/chat/interaction_suggestion_pipeline.py (2)

119-122: Consider making the chat history limit configurable.

The hardcoded limit of 4 messages could be made configurable to allow adjustment based on token limits or specific use cases.

+    MAX_HISTORY_MESSAGES = 4  # Class constant for easy configuration
     
     chat_history_messages = [
         convert_iris_message_to_langchain_message(message)
-        for message in history[-4:]
+        for message in history[-self.MAX_HISTORY_MESSAGES:]
     ]

141-143: Consider validating the problem statement format.

While a default value is provided for missing problem statements, there's no validation of the format or content when it exists.

-    prob_st_val = dto.problem_statement or "No problem statement provided."
+    def validate_problem_statement(ps: Optional[str]) -> str:
+        if not ps:
+            return "No problem statement provided."
+        if len(ps.strip()) == 0:
+            return "No problem statement provided."
+        return ps
+
+    prob_st_val = validate_problem_statement(dto.problem_statement)
app/web/routers/pipelines.py (1)

34-45: Document the new parameters for ExerciseChatAgentPipeline

The implementation looks good, but please add docstrings to explain:

  • The purpose and expected values for the variant parameter
  • The significance and possible values for the optional event parameter

Example docstring:

def run_exercise_chat_pipeline(
    variant: str,
    event: str | None = Query(None, description="Event query parameter"),
    dto: ExerciseChatPipelineExecutionDTO = Body(
        description="Exercise Chat Pipeline Execution DTO"
    ),
):
    """Execute the exercise chat pipeline with the specified variant and event.
    
    Args:
        variant: The pipeline variant to use (e.g., "default", "experimental")
        event: Optional event identifier to customize pipeline behavior
        dto: The pipeline execution configuration
    """

Also applies to: 65-74

app/pipeline/prompts/iris_course_chat_prompts.py (1)

83-85: Add examples to clarify the expected format.

The formatting requirements for resource links are clear, but adding examples would help ensure consistent implementation. Consider adding examples like:

 If you link a resource, DO NOT FORGET to include a markdown link. Use markdown format: [Resource title](Resource URL).
The resource title should be the title of the lecture, exercise, or any other course material and should be descriptive in case no title is provided. Do not use "here" as a link text
-The resource URL should only be the relative path to the course website, not the full URL.
+The resource URL should only be the relative path to the course website, not the full URL. For example:
+- Good: [Introduction to Python](/lectures/intro-python)
+- Bad: [click here](/lectures/intro-python)
+- Bad: [Introduction to Python](https://example.com/lectures/intro-python)
app/pipeline/chat/exercise_chat_pipeline.py (3)

Line range hint 60-75: Add type validation for variant parameter.

Consider validating the variant parameter against a predefined set of allowed values to prevent runtime errors from invalid variants.

+from enum import Enum, auto
+
+class ChatVariant(Enum):
+    DEFAULT = auto()
+    PROGRESS_STALLED = auto()
+    BUILD_FAILED = auto()
+
 class ExerciseChatPipeline(Pipeline):
     """Exercise chat pipeline that answers exercises related questions from students."""
 
     llm: IrisLangchainChatModel
     pipeline: Runnable
     callback: ExerciseChatStatusCallback
     suggestion_pipeline: InteractionSuggestionPipeline
     code_feedback_pipeline: CodeFeedbackPipeline
     prompt: ChatPromptTemplate
-    variant: str
+    variant: ChatVariant
 
     def __init__(self, callback: ExerciseChatStatusCallback, variant: str = "default"):
         super().__init__(implementation_id="exercise_chat_pipeline")
+        try:
+            self.variant = ChatVariant[variant.upper()]
+        except KeyError:
+            raise ValueError(f"Invalid variant: {variant}. Must be one of {[v.name.lower() for v in ChatVariant]}")

171-173: Improve clarity of query extraction logic.

The current implementation could be more explicit about extracting the last message from chat history.

-query: PyrisMessage | None = None
-if history:
-    query = dto.chat_history[-1]
+query: PyrisMessage | None = dto.chat_history[-1] if history else None

354-362: Add documentation for chat history handling logic.

The new chat history handling logic would benefit from docstring updates to explain the different scenarios and their implications.

 def _add_conversation_to_prompt(
     self,
     chat_history: List[PyrisMessage],
     user_question: PyrisMessage,
 ):
     """
     Adds the chat history and user question to the prompt
-        :param chat_history: The chat history
-        :param user_question: The user question
-        :return: The prompt with the chat history
+        :param chat_history: The chat history. If empty, a special no-history prompt is used
+        :param user_question: The user question. If None, no user question is added to the prompt
+        :return: None
+
+        The method handles three scenarios:
+        1. With chat history: Adds the last 4 messages from history
+        2. Without chat history: Adds a special system prompt
+        3. With user question: Adds it as the latest input to consider
     """
app/llm/external/openai_chat.py (4)

20-20: Remove unused import ResponseFormat.

The import ResponseFormat at line 20 is unused in the code and can be safely removed to clean up the imports.

Apply this diff to remove the unused import:

-from openai.types.chat.completion_create_params import ResponseFormat
🧰 Tools
🪛 Ruff

20-20: openai.types.chat.completion_create_params.ResponseFormat imported but unused

Remove unused import: openai.types.chat.completion_create_params.ResponseFormat

(F401)


47-58: Handle unanticipated content types explicitly in convert_to_open_ai_messages.

In the match statement for messages where message.sender == "TOOL", the default case uses a pass statement. This could lead to silent failures if unexpected content types are encountered. It's advisable to handle unanticipated content types explicitly, possibly by raising an exception or logging a warning.

Apply this diff to handle unexpected content types:

                 case _:
-                    pass
+                    logging.warning(f"Unhandled content type: {type(content)} in TOOL message.")

85-100: Refactor to reduce duplication in message construction.

The construction of openai_message in both branches shares common keys "role" and "content". Consider initializing openai_message with the common fields and conditionally adding the "tool_calls" key if it exists to reduce code duplication.

Apply this diff to refactor the code:

                 openai_message = {
                     "role": map_role_to_str(message.sender),
                     "content": openai_content,
                 }
-                if isinstance(message, PyrisAIMessage) and message.tool_calls:
+                if getattr(message, "tool_calls", None):
                     openai_message["tool_calls"] = [
                         {
                             "id": tool.id,
                             "type": tool.type,
                             "function": {
                                 "name": tool.function.name,
                                 "arguments": json.dumps(tool.function.arguments),
                             },
                         }
                         for tool in message.tool_calls
                     ]

174-206: Simplify API call logic to reduce duplication.

The chat method has duplicated code for making API calls to OpenAI, differing only in the presence of tools and response_format. Consider refactoring to construct the kwargs dynamically based on conditions.

Apply this diff as a starting point to refactor the API calls:

         for attempt in range(retries):
             try:
                 kwargs = {
                     "model": self.model,
                     "messages": messages,
                     "temperature": arguments.temperature,
                     "max_tokens": arguments.max_tokens,
                 }
                 if self.tools:
                     kwargs["tools"] = self.tools
                 if arguments.response_format == "JSON":
-                    if self.tools:
-                        response = self._client.chat.completions.create(
-                            response_format=ResponseFormatJSONObject(type="json_object"),
-                            **kwargs,
-                        )
-                    else:
-                        response = self._client.chat.completions.create(
-                            response_format=ResponseFormatJSONObject(type="json_object"),
-                            **kwargs,
-                        )
+                    kwargs["response_format"] = ResponseFormatJSONObject(type="json_object")
                 response = self._client.chat.completions.create(**kwargs)
                 choice = response.choices[0]
                 usage = response.usage
                 model = response.model
                 # Existing error handling...

Ensure that after refactoring, all necessary parameters are included, and the logic remains correct.

🧰 Tools
🪛 Ruff

180-180: Undefined name ResponseFormatJSONObject

(F821)


189-189: Undefined name ResponseFormatJSONObject

(F821)

app/pipeline/chat/exercise_chat_agent_pipeline.py (1)

430-607: Catch specific exceptions instead of using a broad except Exception

Throughout the __call__ method, general exceptions are caught using except Exception as e:. Catching all exceptions can make debugging difficult and may suppress important error information. It's advisable to catch specific exceptions to improve error handling and code robustness.
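For example, the agent loop could separate expected failure modes from truly unexpected ones. The exception types and the callback.error call below are assumptions about what the pipeline steps can raise and expose:

import logging

logger = logging.getLogger(__name__)


def run_agent_step(agent, params, callback):
    # Sketch: narrow exception handling instead of a blanket except.
    try:
        return agent.invoke(params)
    except (KeyError, ValueError) as e:
        # Expected, recoverable problems in the step's inputs or outputs.
        logger.error("Agent step failed on malformed data: %s", e)
        callback.error("An error occurred while running the exercise chat agent.")
        return None
    except Exception:
        # Truly unexpected: log with traceback and re-raise for visibility.
        logger.exception("Unexpected error in exercise chat agent")
        raise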

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between d6b8fa6 and a9997a5.

📒 Files selected for processing (16)
  • .gitignore (1 hunks)
  • app/common/pyris_message.py (2 hunks)
  • app/llm/external/model.py (2 hunks)
  • app/llm/external/ollama.py (3 hunks)
  • app/llm/external/openai_chat.py (8 hunks)
  • app/llm/langchain/iris_langchain_chat_model.py (2 hunks)
  • app/llm/request_handler/basic_request_handler.py (2 hunks)
  • app/llm/request_handler/capability_request_handler.py (2 hunks)
  • app/llm/request_handler/request_handler_interface.py (2 hunks)
  • app/pipeline/chat/course_chat_pipeline.py (7 hunks)
  • app/pipeline/chat/exercise_chat_agent_pipeline.py (1 hunks)
  • app/pipeline/chat/exercise_chat_pipeline.py (6 hunks)
  • app/pipeline/chat/interaction_suggestion_pipeline.py (2 hunks)
  • app/pipeline/prompts/iris_course_chat_prompts.py (1 hunks)
  • app/web/routers/pipelines.py (4 hunks)
  • app/web/status/status_update.py (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🧰 Additional context used
🪛 Ruff
app/llm/external/openai_chat.py

20-20: openai.types.chat.completion_create_params.ResponseFormat imported but unused

Remove unused import: openai.types.chat.completion_create_params.ResponseFormat

(F401)


180-180: Undefined name ResponseFormatJSONObject

(F821)


189-189: Undefined name ResponseFormatJSONObject

(F821)

app/web/routers/pipelines.py

68-70: Do not perform function call Body in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


108-110: Do not perform function call Body in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

🔇 Additional comments (24)
app/common/pyris_message.py (3)

3-3: LGTM: Import changes are appropriate.

The new imports support the tool-related functionality and follow proper import conventions.

Also applies to: 9-10


40-43: Add documentation and verify tool message handling.

The implementation is correct, but would benefit from documentation explaining its purpose in the tool calling workflow.

Let's verify the tool message handling implementation:

#!/bin/bash
# Check for ToolMessageContentDTO usage and implementation
ast-grep --pattern 'class $name implements ToolMessageContentDTO {
  $$$
}'

# Look for tool message processing logic
rg -l 'PyrisToolMessage' --type py

27-28: Verify compatibility with existing message handlers.

The change from List[MessageContentDTO] to Union[str, List[MessageContentDTO]] is a significant modification that could affect existing code. Please ensure all message handlers can properly handle both string and list content types.

Let's check for potential compatibility issues:

✅ Verification successful

Based on the search results, I can now provide a final response as there's clear evidence of how the contents field is used throughout the codebase:

The type change is safe but requires careful testing of message conversion logic

The change from List[MessageContentDTO] to Union[str, List[MessageContentDTO]] appears safe because:

  1. All existing code consistently treats messages as having a list of contents and accesses the first element:

    • Message converters validate non-empty contents: if len(iris_message.contents) == 0
    • Content access pattern is always: message.contents[0].text_content
    • No direct list operations (append/extend/remove) are performed on contents
  2. The conversion logic in message_converters.py properly handles the contents field:

    • Creates new messages with list contents: contents=[TextMessageContentDTO(...)]
    • Validates content type before access: isinstance(message, TextMessageContentDTO)

However, recommend testing these specific areas:

  • Message conversion functions in app/common/message_converters.py
  • OpenAI and Ollama message conversion in their respective adapters
  • Pipeline message handling in text/exercise/lecture chat pipelines
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for direct access to message contents that might assume list type
rg -l 'contents\[' --type py

# Search for iterations over contents that might break with string type
rg -l 'for .+ in .+contents' --type py

# Search for list-specific method calls on contents
rg -l 'contents\.(append|extend|pop|remove)' --type py

Length of output: 305


Script:

#!/bin/bash
# Let's try a more comprehensive search for contents usage patterns
# Search for any reference to contents with surrounding context
rg -A 3 -B 3 'contents' --type py

# Search for message handling or processing methods
rg -A 5 'def .*message' --type py

# Search for any class that might be handling PyrisMessage
rg -A 10 'class.*Message' --type py

Length of output: 29376

app/llm/request_handler/basic_request_handler.py (3)

1-4: LGTM: Import statements are well-organized and necessary.

The new imports are correctly added and properly used for type hints in the bind_tools method.


41-47: LGTM: Tool binding implementation is correct and type-safe.

The implementation properly integrates with the LLM manager and follows the established pattern for request handlers. The comprehensive type hints ensure type safety when binding various tool types.


41-47: Verify LLM compatibility with tool binding.

Since tool binding is provider-specific functionality, we should verify that the underlying LLM implementation supports it.

✅ Verification successful

Let me gather more information about the LLM implementations to verify tool binding support.


Let me check the actual implementation of bind_tools in OpenAI models since we found multiple declarations.


Tool binding is properly implemented in supported models

The verification shows that tool binding is well-implemented:

  • OpenAI models (both Direct and Azure) properly implement bind_tools with tool conversion
  • Base models have appropriate NotImplementedError for unsupported cases
  • The request handler correctly delegates tool binding to the underlying LLM implementation
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if LLM implementations support tool binding
# Expected: Find implementations of bind_tools in LLM classes

# Look for LLM implementations with bind_tools method
echo "Searching for bind_tools implementations in LLM classes..."
ast-grep --pattern 'class $CLASS implements LanguageModel {
  $$$
  bind_tools($$$) {
    $$$
  }
  $$$
}'

# Look for potential incompatibility markers
echo "Checking for potential incompatibility markers..."
rg -i "not(\s+)?(implemented|supported)|todo|fixme" -g "app/llm/**/*.py" -A 2 -B 2

Length of output: 4582


Script:

#!/bin/bash
# Search for concrete LLM implementations and their bind_tools methods
echo "Searching for LLM class implementations..."
rg "class.*\(.*LanguageModel.*\)" -g "app/llm/**/*.py" -A 10

# Search specifically for bind_tools implementations
echo "Searching for bind_tools method implementations..."
rg "def bind_tools" -g "app/llm/**/*.py" -A 5

Length of output: 6901


Script:

#!/bin/bash
# Check the OpenAI chat model implementations
echo "Checking OpenAI chat model bind_tools implementations..."
rg "def bind_tools" -g "app/llm/external/openai_chat.py" -A 15

Length of output: 2599

app/llm/request_handler/request_handler_interface.py (3)

2-4: LGTM: Imports are well-organized and necessary.

The new imports are properly organized and provide the required type hints for the tool binding functionality.


46-52: LGTM: Well-typed method signature for tool binding.

The method signature correctly handles various tool types through Union typing, which provides flexibility for different tool implementations.


46-52: Enhance the docstring as previously suggested.

The current docstring "Bind tools" is too minimal for an interface method.

app/llm/langchain/iris_langchain_chat_model.py (2)

2-2: LGTM: Import additions support tool binding functionality.

The new imports appropriately support the implementation of native function calling, with proper type hints and necessary tool-related classes.

Also applies to: 5-5, 10-14


43-49: Consider enhancing tool binding architecture.

Since this implementation is part of a larger function calling feature, consider these architectural improvements:

  1. Add model capability checking (e.g., does the model support function calling?); see the sketch after this list
  2. Add tool compatibility verification
  3. Consider adding configuration options for tool execution (e.g., temperature, tool choice strategy)
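A sketch for the first point, as a variant of the bind_tools method from this file; the supports_function_calling attribute on the request handler is hypothetical:

def bind_tools(
    self,
    tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
    **kwargs: Any,
) -> Runnable[LanguageModelInput, BaseMessage]:
    if not tools:
        raise ValueError("At least one tool must be provided")
    # Hypothetical capability gate before delegating to the handler.
    if not getattr(self.request_handler, "supports_function_calling", False):
        raise NotImplementedError(
            "The configured model does not support native function calling"
        )
    self.request_handler.bind_tools(tools)
    return self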

Let's verify whether there are any model capability checks in the codebase.

app/llm/request_handler/capability_request_handler.py (1)

2-5: LGTM: Well-organized imports

The new imports are properly organized and necessary for the added functionality.

app/llm/external/model.py (2)

2-6: LGTM: Import changes are well-structured and necessary.

The new imports properly support tool binding functionality and chat completion types, while maintaining compatibility with both pydantic v1 and v2 through appropriate aliasing.


47-47: Verify the impact of return type change across implementations.

Changing the return type from PyrisMessage to ChatCompletionMessage is a breaking change that might affect existing implementations. While it aligns with OpenAI's types, it could limit flexibility with other providers.

Consider:

  1. Adding a conversion layer to maintain compatibility with different providers (see the sketch after this list)
  2. Documenting this breaking change in the changelog
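For the first point, a thin conversion layer could keep provider-specific types out of the rest of the codebase. A sketch; field names follow their usage in this PR, and any additional required PyrisMessage fields would need to be filled in as well:

def chat_completion_to_pyris(message: ChatCompletionMessage) -> PyrisMessage:
    # Map the OpenAI-typed response back onto the internal message type.
    return PyrisMessage(
        sender=IrisMessageRole.ASSISTANT,
        contents=[TextMessageContentDTO(text_content=message.content or "")],
    )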
app/llm/external/ollama.py (1)

3-8: LGTM: Imports are well-organized and support tool binding functionality.

The new imports from typing and langchain_core are appropriate for implementing the tool binding interface.

app/pipeline/chat/interaction_suggestion_pipeline.py (2)

145-150: ⚠️ Potential issue

Improve error handling in pipeline execution.

The current implementation has good error handling for a missing last message, but the catch-all exception handler silently returns an empty list, which might hide issues.

-            return []
+            logger.error(f"Pipeline execution failed: {str(e)}")
+            raise RuntimeError(f"Failed to generate interaction suggestions: {str(e)}") from e

Likely invalid or redundant comment.


124-126: Consider handling message content validation.

The current implementation assumes the message content is valid and only handles brace escaping. Consider adding validation for message content and proper error handling.
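A hedged sketch of such validation; the helper name is hypothetical:

def sanitized_text(message_text: str) -> str:
    # Validate the chat message before escaping braces for the prompt template.
    if not isinstance(message_text, str) or not message_text.strip():
        raise ValueError("Chat message content must be a non-empty string")
    return message_text.replace("{", "{{").replace("}", "}}")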

app/web/routers/pipelines.py (1)

78-85: Verify intended CourseChatPipeline changes

The course chat pipeline has been updated with similar parameters (variant and event) as the exercise chat pipeline. Please confirm if this is intentional, as the PR objective only mentions changes to the exercise chat functionality.

Also applies to: 105-112

✅ Verification successful

Based on the git history and code changes shown, I can now generate a final response:

CourseChatPipeline changes are intentional and part of systematic updates

The changes to CourseChatPipeline are intentional and part of a larger refactoring effort. The git history shows:

  • The variant parameter was added to support different chat modes (e.g., "jol")
  • The event parameter was added to handle specific event types like "jol" (judgment of learning)
  • These changes align with the codebase's evolution, particularly:
    • Addition of competency and JOL-related functionality
    • Improved event handling for different chat scenarios
    • Consistent parameter patterns across pipeline implementations
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check CourseChatPipeline implementation for event parameter usage
ast-grep --pattern 'class CourseChatPipeline {
  $$$
  def __init__($_, $event, $_) {
    $$$
  }
  $$$
}'

Length of output: 108


Script:

#!/bin/bash
# Search for CourseChatPipeline class definition and its usage
rg "class CourseChatPipeline" -A 10

# Search for run_course_chat_pipeline_worker usage
rg "run_course_chat_pipeline_worker" -A 5

# Search for any recent changes related to CourseChatPipeline
git log -p | rg "CourseChatPipeline" -A 5 -B 5

Length of output: 101872

app/web/status/status_update.py (3)

136-136: LGTM: Proper cleanup of suggestions during error state.

The addition of self.status.suggestions = None ensures a clean error state by preventing stale suggestions from being displayed to users.


167-167: LGTM: Fixed suggestion cleanup scope in skip method.

The change correctly moves suggestion cleanup from stage scope to status scope, maintaining consistency with error handling and the status object being the source of truth.


Line range hint 209-214: Review stage configuration for tool calling agent.

The stage name change to "Checking available information" aligns with the new tool calling functionality. However, please consider:

  1. Should the stage weight (30) be adjusted based on the expected duration of tool calling operations?
  2. Does the "Creating suggestions" stage need updates to handle tool-specific suggestions?
#!/bin/bash
# Description: Check for any tool-related configurations or similar stage patterns
# that might inform the weight adjustment

echo "Checking for similar stage configurations..."
rg -B 2 -A 2 "weight.*=.*30.*state.*=.*StageStateEnum"
app/pipeline/chat/exercise_chat_pipeline.py (1)

Line range hint 1-424: Verify variant handling across the codebase.

Let's verify that all variant values are consistently handled and that there are no unhandled variants in the codebase.

✅ Verification successful

Variant handling is properly implemented and consistent

The verification shows that variant handling is well-implemented across the codebase:

  • All variants ("default", "progress_stalled", "build_failed") are consistently defined and used
  • Each variant has its corresponding system prompt template defined in iris_exercise_chat_prompts.py
  • The variant logic in exercise_chat_pipeline.py correctly handles all cases:
    • Default case uses final_system_prompt
    • "progress_stalled" case uses progress_stalled_system_prompt
    • "build_failed" case uses build_failed_system_prompt
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for consistency in variant handling

# Search for variant-related string literals
echo "Checking for variant string literals:"
rg -i "variant.*=.*['\"](?:default|progress_stalled|build_failed)['\"]" --type py

# Search for direct string comparisons with variants
echo -e "\nChecking for variant string comparisons:"
rg -i "variant.*==.*['\"](?:default|progress_stalled|build_failed)['\"]" --type py

# Look for potential variant definitions or enums
echo -e "\nChecking for variant type definitions:"
rg -i "class.*variant" --type py

Length of output: 1241


Script:

#!/bin/bash
# Let's check for all the system prompts referenced in the variant handling

echo "Checking for system prompt definitions:"
rg -l "progress_stalled_system_prompt|build_failed_system_prompt|final_system_prompt" --type py

echo -e "\nChecking the content of the prompts file:"
fd "iris_exercise_chat_prompts.py" -x cat {}

Length of output: 12559

app/pipeline/chat/exercise_chat_agent_pipeline.py (2)

219-261: Add null checks for dto.exercise and dto.exercise.end_date

As previously noted, there is a potential AttributeError when dto.exercise or dto.exercise.end_date is None in the get_additional_exercise_details function. Ensure that both are not None before performing the comparison to prevent runtime errors.
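A minimal guard, mirroring the fields in the code quoted later in this review (sketch):

"due_date_over": (
    dto.exercise.end_date < current_time
    if dto.exercise and dto.exercise.end_date
    else "No end date provided"
),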


336-353: Ensure latest_result is not None before accessing feedbacks

This issue was previously raised and is still applicable. In the get_feedbacks function, check if dto.submission.latest_result is not None before accessing its feedbacks attribute to avoid an AttributeError.
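The guard could be as small as this sketch:

if not dto.submission or not dto.submission.latest_result:
    return "No feedbacks available."
feedbacks = dto.submission.latest_result.feedbacks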

Comment on lines +43 to +49
def bind_tools(
self,
tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
**kwargs: Any,
) -> Runnable[LanguageModelInput, BaseMessage]:
self.request_handler.bind_tools(tools)
return self

🛠️ Refactor suggestion

Add validation and documentation to bind_tools.

While the implementation is clean, consider these improvements:

  1. Add docstring explaining:
    • Purpose of the method
    • Expected tool types
    • Return value significance
  2. Add validation for:
    • Empty tool sequence
    • Invalid tool types
    • Request handler readiness

Here's a suggested implementation:

     def bind_tools(
         self,
         tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
         **kwargs: Any,
     ) -> Runnable[LanguageModelInput, BaseMessage]:
+        """Bind a sequence of tools to the request handler for function calling support.
+
+        Args:
+            tools: Sequence of tools that can be one of:
+                  - Dict describing the tool
+                  - Pydantic BaseModel
+                  - Callable function
+                  - BaseTool instance
+            **kwargs: Additional arguments passed to the request handler
+
+        Returns:
+            self: Returns this instance as a Runnable
+
+        Raises:
+            ValueError: If tools sequence is empty or contains invalid tool types
+        """
+        if not tools:
+            raise ValueError("At least one tool must be provided")
+
         self.request_handler.bind_tools(tools)
         return self

Comment on lines +77 to +84
def bind_tools(
self,
tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
) -> LanguageModel:
"""Bind the tools to the models"""
llm = self._select_model(ChatModel)
llm.bind_tools(tools)
return llm

🛠️ Refactor suggestion

Add comprehensive error handling and documentation

The bind_tools method needs additional safeguards and documentation:

  1. Add parameter validation for the tools sequence
  2. Add error handling for unsupported tool types
  3. Enhance the docstring with parameter and return value details

Here's a suggested improvement:

     def bind_tools(
         self,
         tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
     ) -> LanguageModel:
-        """Bind the tools to the models"""
+        """Bind the provided tools to the selected ChatModel.
+        
+        Args:
+            tools: A sequence of tools to bind. Can be one of:
+                - Dict[str, Any]: Tool configuration dictionary
+                - Type[BaseModel]: Pydantic model class
+                - Callable: Function to be used as a tool
+                - BaseTool: LangChain tool instance
+        
+        Returns:
+            LanguageModel: The selected chat model with tools bound
+            
+        Raises:
+            ValueError: If tools sequence is empty or contains unsupported tool types
+            TypeError: If selected model doesn't support tool binding
+        """
+        if not tools:
+            raise ValueError("Tools sequence cannot be empty")
+            
         llm = self._select_model(ChatModel)
+        if not hasattr(llm, 'bind_tools'):
+            raise TypeError(f"Selected model {llm.description} doesn't support tool binding")
+            
         llm.bind_tools(tools)
         return llm

💡 Codebase verification

Tool binding support varies across model implementations

The assumption that tool binding is only supported by ChatModel is incorrect. The codebase shows:

  • OpenAI models (both Direct and Azure) support tool binding with proper tool conversion
  • Ollama model explicitly does not support tool binding (raises NotImplementedError)
  • LangChain chat model delegates tool binding to its request handler

The current implementation in capability_request_handler.py may need to handle these variations appropriately.

🔗 Analysis chain

Verify tool binding support across models

The method assumes tool binding is only supported by ChatModel. We should verify this assumption and document any limitations.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other model implementations that might support tool binding
ast-grep --pattern 'class $MODEL implements LanguageModel {
  $$$
  bind_tools($$$) {
    $$$
  }
  $$$
}'

Length of output: 108


Script:

#!/bin/bash
# Let's try a different approach to find model implementations and tool binding
# First, find all Python files that might contain model implementations
fd -e py | xargs rg -l "class.*Model"

# Then specifically look for bind_tools method implementations
fd -e py | xargs rg "def bind_tools"

# Also search for LanguageModel interface/base class
fd -e py | xargs rg "class LanguageModel"

Length of output: 3121


Script:

#!/bin/bash
# Let's examine the LanguageModel interface and its bind_tools implementation
rg -A 5 "class LanguageModel" app/llm/external/model.py

# Let's also check the implementation in ChatModel and other models
rg -B 2 -A 10 "def bind_tools" app/llm/external/openai_chat.py
rg -B 2 -A 10 "def bind_tools" app/llm/external/ollama.py
rg -B 2 -A 10 "def bind_tools" app/llm/langchain/iris_langchain_chat_model.py

Length of output: 2128

Comment on lines +53 to +63
@abstractmethod
def bind_tools(
self,
tools: Sequence[
Union[Dict[str, Any], Type[LegacyBaseModel], Callable, BaseTool]
],
):
"""Bind tools"""
raise NotImplementedError(
f"The LLM {self.__str__()} does not support binding tools"
)

🛠️ Refactor suggestion

Add docstring to explain tool binding behavior.

While the method signature is flexible and supports multiple tool types, it lacks documentation about:

  • Expected behavior and usage
  • Requirements for each tool type
  • Impact on chat completions

Consider adding a comprehensive docstring:

 def bind_tools(
     self,
     tools: Sequence[
         Union[Dict[str, Any], Type[LegacyBaseModel], Callable, BaseTool]
     ],
 ):
-    """Bind tools"""
+    """Bind tools to the language model for function calling capabilities.
+    
+    Args:
+        tools: A sequence of tools that can be one of:
+            - Dict[str, Any]: Raw function schema
+            - Type[LegacyBaseModel]: Pydantic v1 model
+            - Callable: Python function
+            - BaseTool: Langchain tool
+    
+    Note:
+        The bound tools will be available during chat completions,
+        allowing the model to call them when needed.
+    """
     raise NotImplementedError(
         f"The LLM {self.__str__()} does not support binding tools"
     )

Also consider using a more specific type for the dictionary schema:

from typing import TypedDict

FunctionSchema = TypedDict('FunctionSchema', {
    'name': str,
    'description': str,
    'parameters': dict
})

Comment on lines +140 to +148
def bind_tools(
self,
tools: Sequence[
Union[Dict[str, Any], Type[LegacyBaseModel], Callable, BaseTool]
],
) -> Runnable[LanguageModelInput, BaseMessage]:
raise NotImplementedError(
f"The LLM {self.__str__()} does not support binding tools"
)

🛠️ Refactor suggestion

Add documentation about tool binding support status.

While the method signature aligns with the tool binding interface, the implementation needs more clarity:

  1. Add a docstring explaining that tool binding is not supported in Ollama yet
  2. Add a TODO comment indicating plans for future implementation
  3. Consider adding a warning in the class docstring about unsupported features

Here's a suggested improvement:

+    """Bind tools to the language model for function calling capabilities.
+    
+    Note: Tool binding is currently not supported in Ollama models. This feature
+    is only available for OpenAI models.
+    
+    Args:
+        tools: A sequence of tools to bind to the model.
+    
+    Returns:
+        A runnable that can process inputs with the bound tools.
+    
+    Raises:
+        NotImplementedError: Always raised as Ollama does not support tool binding.
+    """
+    # TODO: Implement tool binding support for Ollama models
     def bind_tools(
         self,
         tools: Sequence[
             Union[Dict[str, Any], Type[LegacyBaseModel], Callable, BaseTool]
         ],
     ) -> Runnable[LanguageModelInput, BaseMessage]:
         raise NotImplementedError(
             f"The LLM {self.__str__()} does not support binding tools"
         )

Comment on lines +68 to +70
dto: ExerciseChatPipelineExecutionDTO = Body(
description="Exercise Chat Pipeline Execution DTO"
),

🛠️ Refactor suggestion

Fix Body() usage in parameter defaults

Using Body() in parameter defaults can lead to unexpected behavior. Instead, use FastAPI's parameter declaration pattern.

Apply this pattern to both endpoints:

-    dto: ExerciseChatPipelineExecutionDTO = Body(
-        description="Exercise Chat Pipeline Execution DTO"
-    ),
+    dto: ExerciseChatPipelineExecutionDTO = Body(...,
+        description="Exercise Chat Pipeline Execution DTO"
+    ),
-    dto: CourseChatPipelineExecutionDTO = Body(
-        description="Course Chat Pipeline Execution DTO"
-    ),
+    dto: CourseChatPipelineExecutionDTO = Body(...,
+        description="Course Chat Pipeline Execution DTO"
+    ),

Also applies to: 108-110

🧰 Tools
🪛 Ruff

68-70: Do not perform function call Body in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

Comment on lines +85 to +90
def __init__(
self,
callback: CourseChatStatusCallback,
variant: str = "default",
event: str | None = None,
):

⚠️ Potential issue

Update method signature to use Optional for event parameter.

Similar to the class attribute, the event parameter in the __init__ method should use Optional[str] for compatibility with Python versions earlier than 3.10.

Apply this diff to fix the method signature:

 def __init__(
     self,
     callback: CourseChatStatusCallback,
     variant: str = "default",
-    event: str | None = None,
+    event: Optional[str] = None,
 ):

Comment on lines +256 to +264
weight = 2.0 / 3.0
return [
{
"info": competency_metrics.competency_information.get(comp, None),
"exercise_ids": competency_metrics.exercises.get(comp, []),
"progress": competency_metrics.progress.get(comp, 0),
"mastery": get_mastery(
competency_metrics.progress.get(comp, 0),
competency_metrics.confidence.get(comp, 0),
"mastery": (
(1 - weight) * competency_metrics.progress.get(comp, 0)
+ weight * competency_metrics.confidence.get(comp, 0)

🛠️ Refactor suggestion

Ensure consistent mastery calculation across the codebase.

The calculation of mastery in get_competency_list differs from the get_mastery function, which may lead to inconsistent results. Consider using get_mastery here for consistency.

Apply this diff to unify the mastery calculation:

 weight = 2.0 / 3.0
 return [
     {
         "info": competency_metrics.competency_information.get(comp, None),
         "exercise_ids": competency_metrics.exercises.get(comp, []),
         "progress": competency_metrics.progress.get(comp, 0),
-        "mastery": (
-            (1 - weight) * competency_metrics.progress.get(comp, 0)
-            + weight * competency_metrics.confidence.get(comp, 0)
-        ),
+        "mastery": get_mastery(
+            competency_metrics.progress.get(comp, 0),
+            competency_metrics.confidence.get(comp, 0),
+        ),
         "judgment_of_learning": (
             competency_metrics.jol_values.get[comp].json()
             if competency_metrics.jol_values
             and comp in competency_metrics.jol_values
             else None
         ),
     }
     for comp in competency_metrics.competency_information
 ]

Committable suggestion skipped: line range outside the PR's diff.

@@ -123,6 +131,7 @@
:param dto: The pipeline execution data transfer object
:param kwargs: The keyword arguments
"""
logger.debug(dto.model_dump_json(indent=4))

⚠️ Potential issue

Ensure logging does not expose sensitive information.

Logging the entire DTO using dto.model_dump_json() may inadvertently expose sensitive user data. Consider limiting the logged information or masking sensitive fields to protect user privacy.

Apply this diff to sanitize the logged output:

- logger.debug(dto.model_dump_json(indent=4))
+ logger.debug(f"Received DTO with user ID: {dto.user.id} and course ID: {dto.course.id}")

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +335 to +357
if self.event == "jol":
event_payload = CompetencyJolDTO.model_validate(dto.event_payload.event)
logger.debug("Event Payload:", event_payload)
comp = next(
(
c
for c in dto.course.competencies
if c.id == dto.competency_jol.competency_id
if c.id == event_payload.competency_id
),
None,
)
agent_prompt = begin_agent_jol_prompt
params = {
"jol": json.dumps(
{
"value": dto.competency_jol.jol_value,
"value": event_payload.jol_value,
"competency_mastery": get_mastery(
dto.competency_jol.competency_progress,
dto.competency_jol.competency_confidence,
event_payload.competency_progress,
event_payload.competency_confidence,
),
}
),
"competency": comp.json(),
"competency": comp.model_dump_json(),

⚠️ Potential issue

Handle potential None value for comp to avoid exceptions.

If no matching competency is found, comp will be None, leading to an AttributeError when calling comp.model_dump_json(). Ensure that comp is not None before using it.

Apply this diff to add a check for comp:

 if self.event == "jol":
     event_payload = CompetencyJolDTO.model_validate(dto.event_payload.event)
     logger.debug(f"Event Payload: {event_payload}")
     comp = next(
         (c for c in dto.course.competencies if c.id == event_payload.competency_id),
         None,
     )
+    if comp is None:
+        logger.error(f"Competency with ID {event_payload.competency_id} not found.")
+        return  # or handle appropriately
     agent_prompt = begin_agent_jol_prompt
     params = {
         "jol": json.dumps(
             {
                 "value": event_payload.jol_value,
                 "competency_mastery": get_mastery(
                     event_payload.competency_progress,
                     event_payload.competency_confidence,
                 ),
             }
         ),
         "competency": comp.model_dump_json(),
     }

Comment on lines +160 to +353
),
"due_date_over": (
dto.exercise.end_date < current_time
if dto.exercise.end_date
else "No end date provided"
),
}

def get_build_logs_analysis_tool() -> str:
"""
# Build Logs Analysis Tool

## Purpose
Analyze CI/CD build logs for debugging and code quality feedback.

## Retrieved Information
- Build status (successful or failed)
- If failed:
- Error messages
- Warning messages
- Timestamps for log entries

## Usage Guidelines
1. Use when code fails to compile or to assess code quality.
2. If build successful, no further analysis needed.
3. For failed builds:
a. Identify specific error messages causing compilation failure.
b. Note warnings for style and potential issues.
c. Use timestamps to understand error sequence.
4. Relate log information to specific parts of student's code.
5. Provide targeted feedback on compilation errors and code quality issues.

## Key Points
- Critical for addressing compilation failures.
- Useful for identifying and explaining code quality concerns.
- Combine with code content analysis for comprehensive feedback.
- Consider student's skill level when interpreting and explaining logs.


"""
self.callback.in_progress("Analyzing build logs ...")
if not dto.submission:
return "No build logs available."
build_failed = dto.submission.build_failed
build_logs = dto.submission.build_log_entries
logs = (
"The build was successful."
if not build_failed
else (
"\n".join(
str(log) for log in build_logs if "~~~~~~~~~" not in log.message
)
)
)
return logs

def get_feedbacks() -> str:
"""
# Get Feedbacks Tool
## Purpose
Retrieve and analyze automated test feedback from the CI/CD pipeline.

## Retrieved Information
For each feedback item:
- Test case name
- Credits awarded
- Text feedback

## Usage Guidelines
1. Use when automated tests fail to understand specific issues.
2. Analyze feedback to identify logic errors in student's code.
3. Use test case names to pinpoint problematic areas of the implementation.
4. Utilize text feedback to provide specific improvement suggestions.

## Key Points
- Essential for detailed, test-based code evaluation.
- Helps in providing targeted, actionable feedback to students.
- If "No feedbacks" returned, consider if this indicates all tests passed or a system issue.
- Combine with code content analysis for comprehensive review.


"""
self.callback.in_progress("Analyzing feedbacks ...")
if not dto.submission:
return "No feedbacks available."
feedbacks = dto.submission.latest_result.feedbacks
feedback_list = (
"\n".join(
[
"Case: {}. Credits: {}. Info: {}".format(
feedback.test_case_name, feedback.credits, feedback.text
)
for feedback in feedbacks
]
)
if feedbacks
else "No feedbacks."
)
return feedback_list


🛠️ Refactor suggestion

Refactor helper functions out of the __call__ method

Defining multiple helper functions (get_submission_details, get_additional_exercise_details, get_build_logs_analysis_tool, get_feedbacks, repository_files, file_lookup) within the __call__ method can make the code harder to read and test. Consider moving these functions outside the __call__ method as instance methods or standalone functions to improve maintainability and readability.
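One possible shape for the refactor, sketched with abbreviated logic: the tool docstrings stay on thin closures inside __call__ (so the agent still sees them), while the actual logic moves to testable instance methods. Class and method names follow this PR; the tool-registration step is elided:

class ExerciseChatAgentPipeline:

    def _get_feedbacks(self, dto) -> str:
        # Logic moved out of __call__ so it can be unit-tested directly.
        if not dto.submission or not dto.submission.latest_result:
            return "No feedbacks available."
        feedbacks = dto.submission.latest_result.feedbacks
        return "\n".join(
            f"Case: {f.test_case_name}. Credits: {f.credits}. Info: {f.text}"
            for f in feedbacks
        ) or "No feedbacks."

    def __call__(self, dto, **kwargs):
        def get_feedbacks() -> str:
            """# Get Feedbacks Tool ..."""
            return self._get_feedbacks(dto)
        # ... register get_feedbacks (and the other thin wrappers) as tools ...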
