Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make check for isElementOnTop more stable #85

Merged
merged 35 commits into from
Oct 3, 2024
Merged

Conversation

navidkpr
Copy link
Collaborator

@navidkpr navidkpr commented Oct 1, 2024

why

Screenshot 2024-10-02 at 3 17 56 PM

Before this PR:

Extracted data: {
  name: 'Asim Shrestha',
  content: "I'm a software engineer co-founding Reworkd",
  page_links: [
    { title: 'Reworkd', url: 'https://reworkd.ai/' },
    { title: 'Writing', url: 'https://asim-shrestha.com/writing' },
    { title: 'Quotes', url: 'https://asim-shrestha.com/quotes' },
    { title: 'Reading', url: 'https://asim-shrestha.com/reading-list' },
    { title: 'Contact', url: 'https://asim-shrestha.com/contact' }
  ]
}

After this PR:

Extracted data: {
  name: 'Asim Shrestha',
  content: 'Asim Shrestha is a software engineer co-founding Reworkd, a YC S23 company that focuses on extracting structured data from the web. They are known for developing AgentGPT 🤖.',
  page_links: [
    { title: 'Reworkd', url: 'https://reworkd.ai/' },
    { title: 'Writing', url: 'https://asim-shrestha.com/writing' },
    { title: 'Quotes', url: 'https://asim-shrestha.com/quotes' },
    { title: 'Reading', url: 'https://asim-shrestha.com/reading-list' },
    { title: 'Contact', url: 'https://asim-shrestha.com/contact' }
  ]
}

what changed

Make isTopElement check if the element is top in multiple points in the bounding triangle instead of just the center.

The test case is still failing but because we have an extra link extract ({ title: 'Reworkd', url: 'https://reworkd.ai/' }). But this seems reasonable enough to me / doesn't seem like something we want to add to the prompt (since the user at times might want to get the current page url)

test plan

@navidkpr navidkpr marked this pull request as draft October 1, 2024 23:59
@navidkpr navidkpr marked this pull request as ready for review October 3, 2024 00:02
@pkiv
Copy link
Contributor

pkiv commented Oct 3, 2024

Can you run the evals a few times to ensure it's stable? When I run on my local it doesn't match your results.

  • extract_collaborators_from_github_repository: Fails on my local, passes on yours, fails on main
  • extract_last_twenty_github_commits: **Fails on my local, passes on main, fails on yours. **
  • google_jobs: Passed on my local, passes on main, fails on yours
  • bananalyzer_1: Fails on my local, fails on main, passed on your run.

https://www.braintrust.dev/app/Browserbase/p/stagehand/experiments/npour%2Ffix-bananalyzer-1-1727926914?c=npour/fix-bananalyzer-3-1727921350,main-1727927090&diff=between_experiments

@navidkpr navidkpr changed the title fix banalayzer-1 eval Make check for isElementOnTop more stable Oct 3, 2024
@navidkpr navidkpr changed the title Make check for isElementOnTop more stable make check for isElementOnTop more stable Oct 3, 2024
@navidkpr navidkpr merged commit 8c1bba2 into main Oct 3, 2024
1 check passed
@filip-michalsky filip-michalsky deleted the npour/fix-bananalyzer-1 branch October 6, 2024 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants