Fix broken profile scrapers due to upstream site changes #90

Open
robmoss344 wants to merge 1 commit into Lucksi:master from robmoss344:master

Conversation

@robmoss344

Summary

  • Instagram: Picuki.com now redirects to a Cloudflare-protected TikTok viewer; replaced the broken HTML scraping with a clear unavailable message
  • Twitter: nitter.net now returns empty responses — replaced with unavailable message
  • TikTok: updated URLebird selectors (h2.text-dark, img.u-image, div.stats-header) to match redesigned layout; correctly extracts username, followers, likes, following, and profile pic
  • GitHub: fixed silent error handler to expose actual exception message
  • GitLab: added empty-list guard so missing users exit cleanly instead of crashing with IndexError
  • Chess.com: added HTTP status check; exits gracefully on non-200 responses
  • Gravatar: added 404 handling and switched optional keys to .get() to prevent KeyError on minimal profiles
  • Tellonym: Cloudflare 403 blocks all requests; replaced with unavailable message
  • Joinroll: public API now requires authentication (HTTP 500); replaced with unavailable message
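
For reviewers, the defensive patterns listed above (status check, empty-list guard, `.get()` on optional keys) can be sketched roughly as follows. This is a minimal illustration, not the code in `Scraper.py`: the helper names `status_message`, `first_result`, and `gravatar_summary`, the message wording, and the Gravatar field names are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical message template; the real wording in Scraper.py may differ.
UNAVAILABLE = "{site}: profile lookup currently unavailable (HTTP {status})"


def status_message(site: str, status: int) -> Optional[str]:
    """Return an informative message for non-200 responses, else None."""
    if status == 200:
        return None
    if status == 404:
        return f"{site}: user not found"
    return UNAVAILABLE.format(site=site, status=status)


def first_result(results: list) -> Optional[dict]:
    """Empty-list guard: avoids the IndexError that results[0] raises on []."""
    return results[0] if results else None


def gravatar_summary(entry: dict) -> dict:
    """Read optional keys with .get() so minimal profiles do not raise KeyError.

    The key names below are assumed for illustration only.
    """
    return {
        "username": entry.get("preferredUsername", "unknown"),
        "name": entry.get("displayName", "not set"),
        "location": entry.get("currentLocation", "not set"),
    }
```

With helpers like these, a scraper would check `status_message(...)` first and print the returned text instead of parsing the body, and would treat a `None` from `first_result(...)` as "user not found" rather than indexing into an empty list.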

Test plan

  • Tested profile scraping against username torvalds
  • TikTok: returns username, followers, following, hearts, profile pic
  • GitHub: returns name, repos, followers, company, location
  • NGL.link: returns profile pic and click count
  • Chess.com: returns player ID, status, league, country
  • GitLab: gracefully reports user not found
  • Gravatar: gracefully reports user not found
  • Instagram/Twitter/Tellonym/Joinroll: clear, informative unavailable messages instead of cryptic NoneType crashes
  • Syntax verified: python3 -m py_compile Core/Support/Username/Scraper.py
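
The syntax check in the last item can also be run from Python, which may be handy in a test script. The sketch below uses the standard-library `py_compile` module; the wrapper name `compiles_cleanly` is hypothetical and not part of this PR.

```python
import py_compile


def compiles_cleanly(path: str) -> bool:
    """Return True if the file byte-compiles, False on a syntax error."""
    try:
        # doraise=True raises PyCompileError instead of printing to stderr.
        py_compile.compile(path, doraise=True)
        return True
    except py_compile.PyCompileError:
        return False
```

For example, `compiles_cleanly("Core/Support/Username/Scraper.py")` mirrors the command-line check. Like the CLI, it only confirms the file parses and says nothing about runtime behaviour.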

🤖 Generated with Claude Code

- Instagram: Picuki.com redirected to Cloudflare-protected TikTok viewer;
  replaced broken HTML scraping with clear unavailable message
- Twitter: nitter.net returns empty responses; replaced with unavailable message
- TikTok: updated URLebird selectors (h2.text-dark, img.u-image, div.stats-header)
  to match redesigned layout; now correctly extracts username, followers, and stats
- GitHub: fixed silent error handler to expose actual exception message
- GitLab: added empty-list guard so missing users exit cleanly instead of IndexError
- Chess.com: added HTTP status check; exits gracefully on non-200 responses
- Gravatar: added 404 handling and switched optional keys to .get() to prevent
  KeyError on minimal profiles
- Tellonym: Cloudflare 403 blocks all requests; replaced with unavailable message
- Joinroll: public API requires auth (HTTP 500); replaced with unavailable message

All scrapers now either return data or fail with an informative message.
Tested against username 'torvalds': TikTok, GitHub, NGL, and Chess.com confirmed working.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>