Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added attempt_complete tool, giving agent ability to signal completion #406

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ErikBjare
Copy link
Owner

@ErikBjare ErikBjare commented Jan 16, 2025

Played with this a couple of days ago, idea was to mimic a similar tool I saw in one of the SWE-Bench submissions.

Could be a more reliable stopping-condition (and a good place to run appropriate checks) than waiting for a non-tooluse message.

Not sure if this should be merged here or in gptme-contrib, but since it required some changes to the Python tool it'll go here for now.


Important

Add attempt_complete tool to signal task completion, integrated into Python execution flow in python.py.

  • New Tool:
    • Added attempt_complete.py with Completed class to signal task completion.
    • attempt_complete() function attempts task completion and returns Completed.
    • check_completion() placeholder for task completion checks.
  • Integration:
    • In python.py, import Completed from attempt_complete.py.
    • Update execute_python() to handle Completed by exiting process.
  • Misc:
    • attempt_complete tool is disabled by default in ToolSpec.

This description was created by Ellipsis for 70e5b91. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to 70e5b91 in 1 minute and 9 seconds

More details
  • Looked at 69 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. gptme/tools/python.py:145
  • Draft comment:
    Using sys.exit(0) in a library function can be problematic as it abruptly terminates the program. Consider raising an exception or handling this more gracefully.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The use of sys.exit(0) here seems intentional as part of handling a specific completion signal. This is a tool for executing Python code interactively, and the Completed signal appears to be a deliberate way to terminate the session. The comment's suggestion might make the code more library-friendly, but it would complicate the simple and clear termination logic currently in place.
    I might be overlooking potential use cases where this code needs to be used as a library without terminating the entire program. The comment could be raising a valid architectural concern.
    While the concern is theoretically valid, the code appears to be part of an interactive tool where immediate termination on completion is the desired behavior. The current implementation is clear and serves its purpose.
    Delete the comment. The use of sys.exit(0) appears intentional for this interactive tool's use case, and the suggested change would complicate the code without clear benefit.
2. gptme/tools/attempt_complete.py:27
  • Draft comment:
    check_completion always returns True, which means it doesn't perform any actual checks. Implement meaningful checks here.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is technically correct that check_completion() always returns True. However, the code already has a TODO comment explaining this is temporary and listing what real checks will be added. The automated comment is therefore redundant and doesn't add new information or insight beyond what's already documented in the code.
    Maybe the TODO comment isn't sufficient and having an additional review comment would put more pressure on fixing this soon?
    The TODO comment is actually more specific and helpful than the review comment, as it suggests concrete examples of what checks to add. The review comment doesn't add meaningful pressure beyond the TODO.
    Delete this comment as it's redundant with the existing TODO comment that already acknowledges and better explains the placeholder implementation.

Workflow ID: wflow_UmZ3ogCrZ4bZ4gyR


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 50.00000% with 9 lines in your changes missing coverage. Please review.

Project coverage is 69.86%. Comparing base (d27da5e) to head (70e5b91).

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
gptme/tools/attempt_complete.py 46.15% 7 Missing ⚠️
gptme/tools/python.py 60.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #406      +/-   ##
==========================================
- Coverage   70.10%   69.86%   -0.25%     
==========================================
  Files          70       71       +1     
  Lines        5761     5778      +17     
==========================================
- Hits         4039     4037       -2     
- Misses       1722     1741      +19     
Flag Coverage Δ
anthropic/claude-3-haiku-20240307 68.76% <50.00%> (-0.16%) ⬇️
openai/gpt-4o-mini 67.68% <50.00%> (-0.29%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants