feat(browser): implement persistent browser context for session management #128
base: main
Greptile Overview
Greptile Summary
Replaces manual
Key Changes:
Benefits:
Migration:
Confidence Score: 4.5/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User
participant CLI as cli_main.py
participant Browser as browser.py
participant Persistent as PersistentBrowserManager
participant Playwright
User->>CLI: Start server
CLI->>CLI: Check needs_migration()
alt Legacy session exists
CLI->>Browser: migrate_from_legacy_session()
Browser->>Browser: Load legacy BrowserManager
Browser->>Browser: Extract storage state
Browser->>Persistent: Create new context
Browser->>Persistent: Transfer state
Browser->>Browser: Verify login
Browser-->>CLI: Migration successful
end
CLI->>Browser: get_or_create_browser()
Browser->>Persistent: Initialize with user_data_dir
Browser->>Persistent: start()
Persistent->>Playwright: Start playwright
Persistent->>Playwright: Launch persistent context
Note over Playwright: State persists automatically
Playwright-->>Persistent: BrowserContext with Page
Persistent-->>Browser: PersistentBrowserManager
Browser->>Browser: Navigate to LinkedIn
Browser->>Browser: Verify authentication
Browser-->>CLI: Authenticated browser
CLI->>CLI: Start FastMCP server
Note over CLI: Tools use singleton browser
User->>CLI: Shutdown
CLI->>Browser: close_browser()
Browser->>Persistent: close()
Persistent->>Playwright: Close context
Persistent->>Playwright: Stop playwright
Note over Persistent: Session persisted
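The startup and shutdown path in the diagram relies on a single shared browser instance. Below is a minimal sketch of that singleton pattern. `get_or_create_browser()` and `close_browser()` are named in the diagram, but their signatures here are assumptions, and `FakeManager` is a hypothetical stand-in for `PersistentBrowserManager` so the example runs without Playwright.

```python
import asyncio
from typing import Optional


class FakeManager:
    """Hypothetical stand-in for PersistentBrowserManager (no Playwright needed)."""

    def __init__(self, user_data_dir: str) -> None:
        self.user_data_dir = user_data_dir
        self.started = False

    async def start(self) -> None:
        # The real manager would start Playwright and launch the persistent context.
        self.started = True

    async def close(self) -> None:
        # The real manager would close the context and stop Playwright.
        self.started = False


# Module-level singleton shared by all tools.
_browser: Optional[FakeManager] = None


async def get_or_create_browser(user_data_dir: str) -> FakeManager:
    """Return the shared manager, starting it on first use (assumed signature)."""
    global _browser
    if _browser is None:
        manager = FakeManager(user_data_dir)
        await manager.start()
        _browser = manager
    return _browser


async def close_browser() -> None:
    """Close the shared manager, if any, and clear the singleton."""
    global _browser
    if _browser is not None:
        await _browser.close()
        _browser = None
```

Later calls return the existing instance and ignore a different `user_data_dir`. A real implementation might prefer to raise in that case.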
1 file reviewed, 1 comment
await persistent.start()

# Copy cookies from old session to new persistent context
storage_state = await temp_browser.context.storage_state()
Verify `BrowserManager.context` property exists - this relies on an undocumented interface from `linkedin_scraper`
Path: linkedin_mcp_server/drivers/browser.py
Line: 266
fixed
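For reference, one way to guard against this kind of undocumented-interface dependency is to fail loudly when the attribute is missing. This is only a sketch and not necessarily the fix that was applied. The helper name `get_browser_context` is hypothetical, and the only thing assumed about `linkedin_scraper` is the `context` attribute already used in the snippet under review.

```python
from typing import Any


def get_browser_context(manager: Any) -> Any:
    """Return manager.context, raising a descriptive error if it is unavailable.

    `manager` is expected to be a linkedin_scraper BrowserManager. The
    `context` attribute is undocumented there, so it is looked up defensively.
    """
    context = getattr(manager, "context", None)
    if context is None:
        raise AttributeError(
            f"{type(manager).__name__} exposes no usable 'context' attribute; "
            "the linkedin_scraper interface may have changed"
        )
    return context
```

The migration code could then call `await get_browser_context(temp_browser).storage_state()` and surface a clear message instead of an opaque failure if the upstream attribute is renamed.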
…ement

Replace manual session.json file management with Playwright's persistent browser context. Sessions now persist automatically in the browser profile directory, eliminating the need for manual save/load cycles.

**Major Changes:**
- Add PersistentBrowserManager using launch_persistent_context()
- Change session storage: session.json file → browser-profile/ directory
- Add automatic migration for existing session.json users
- Update configuration with --user-data-dir option
- Fix CLI default path (session.json → browser-profile)

**Breaking Changes:**
- Session location changed from ~/.linkedin-mcp/session.json to ~/.linkedin-mcp/browser-profile/
- Automatic migration provided for existing users
- Version bumped to 3.0.0

**Benefits:**
- More reliable cookie persistence (behaves like a real browser)
- No manual save/load cycles needed
- Better Docker support with standard volume mount pattern
- More LinkedIn-friendly (reduces CAPTCHA triggers)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Hey, can you explain the re-authenticating issues you had? The session management is implemented in the upstream scraper; maybe create an issue there suggesting the use of Playwright's persistent browser context.

Hey Daniel, thanks for the quick response! Maybe I got caught in a weird moment: I was getting bitten by the is_logged_in() issue and kept having to re-authenticate every time, and it was getting rather tedious. But that had me thinking: the session IDs won't last forever, and the is_logged_in() detection is bound to break in the future because it is inherently fragile. So why not make it a little bit easier on myself and others by reusing the same browser session, rather than a fresh one every time? I do agree this could be more elegantly implemented in the upstream scraper, but I saw that project had a really long queue of unreviewed PRs, plus I wanted to verify this was even the right solution, so I implemented it here. Totally understand if you'd rather see it upstreamed, and if so I can work on that, but it'll be a much more circuitous route.

I see where you're coming from, but I think the upstream PR backlog is mostly stale v2 code. My recent issues there were resolved quite fast.

My main constraint is avoiding the maintenance burden of custom session management within this repository.

Fair point, and totally understandable. If I refactored this so that the persistent context stuff went into the scraper library, would you accept a PR to utilize that?

Yes, absolutely.

Upstream PR: joeyism/linkedin_scraper#270

Summary
Replaces manual session.json file management with Playwright's persistent browser context for more reliable LinkedIn authentication and session persistence.
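To make the approach concrete, here is a rough sketch of launching a persistent context from a profile directory using Playwright's async API. The default path and the `--user-data-dir` option come from this PR. The function names, the `headless` default, and the returned tuple are illustrative assumptions, not the code in this branch.

```python
import argparse
from pathlib import Path

# Default profile location described in this PR.
DEFAULT_PROFILE_DIR = Path.home() / ".linkedin-mcp" / "browser-profile"


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the profile location (hypothetical helper; only the flag is from the PR)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--user-data-dir", type=Path, default=DEFAULT_PROFILE_DIR)
    return parser.parse_args(argv)


async def open_profile(user_data_dir: Path, headless: bool = True):
    """Launch Chromium with a persistent profile and return (playwright, context, page)."""
    # Imported lazily so argument parsing works without Playwright installed.
    from playwright.async_api import async_playwright

    user_data_dir.mkdir(parents=True, exist_ok=True)
    playwright = await async_playwright().start()
    # Cookies and local storage are written to user_data_dir by the browser
    # itself, so no explicit save/load step is needed.
    context = await playwright.chromium.launch_persistent_context(
        str(user_data_dir), headless=headless
    )
    # A persistent context usually opens with one page; create one otherwise.
    page = context.pages[0] if context.pages else await context.new_page()
    return playwright, context, page
```

Callers are responsible for `await context.close()` followed by `await playwright.stop()` on shutdown, which matches the close sequence in the diagram.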
Motivation
Changes
Core Implementation
- `PersistentBrowserManager` class using `launch_persistent_context()`
- … `~/.linkedin-mcp/browser-profile/` directory
Migration
- … `session.json` on first run
- … `session.json.backup`
Configuration
- `--user-data-dir` CLI option for custom profile locations
- … `session.json`, now `browser-profile`)
Breaking Changes
This is a breaking change (v3.0.0):
- … `~/.linkedin-mcp/session.json` → `~/.linkedin-mcp/browser-profile/`
- … `--get-session` to re-authenticate if migration fails
Benefits
Testing
Verification Checklist
- `--get-session` creates profile
- `--session-info` reports correct status
- `--clear-session` removes profile
Migration Guide for Users
Existing users (v2.x → v3.0):
Migration is automatic! On first run with v3.0, the server will:
- … `session.json`
- … `session.json.backup`

Btw, I'm happy to go with whatever version numbering you want here.
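
For anyone curious what the automatic migration in the guide above amounts to, here is a simplified sketch. It assumes the legacy `session.json` is in Playwright's storage-state format (a JSON object with a `cookies` list) and that the old file is renamed to `session.json.backup` afterwards. Both are assumptions for illustration. The signature of `needs_migration()` and the `migrate_session()` helper are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def needs_migration(session_file: Path, profile_dir: Path) -> bool:
    """True if a legacy session file exists and the profile is missing or empty."""
    profile_has_data = profile_dir.is_dir() and any(profile_dir.iterdir())
    return session_file.is_file() and not profile_has_data


def load_legacy_cookies(session_file: Path) -> List[Dict[str, Any]]:
    """Read the cookie list from a storage-state style JSON file (assumed format)."""
    state = json.loads(session_file.read_text(encoding="utf-8"))
    return state.get("cookies", [])


async def migrate_session(session_file: Path, context: Any) -> None:
    """Copy legacy cookies into a persistent context, then back up the old file.

    `context` is expected to be a Playwright BrowserContext; add_cookies()
    is part of its public API.
    """
    cookies = load_legacy_cookies(session_file)
    if cookies:
        await context.add_cookies(cookies)
    # Keep the old file around under a new name instead of deleting it.
    session_file.rename(session_file.with_name(session_file.name + ".backup"))
```

A real migration would also verify login after the transfer, as the sequence diagram shows, and fall back to `--get-session` if that check fails.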