Apify

We're making the web more programmable.

Pinned

  1. crawlee-python Public

    Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…

    Python · 4.5k stars · 312 forks

  2. crawlee Public

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…

    TypeScript · 15.5k stars · 665 forks

  3. proxy-chain Public

    Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.

    JavaScript · 850 stars · 143 forks

  4. apify-sdk-js Public

    Apify SDK monorepo

    TypeScript · 123 stars · 35 forks

  5. got-scraping Public

    An HTTP client made for scraping, based on got.

    TypeScript · 551 stars · 44 forks

  6. fingerprint-suite Public

    Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

    TypeScript · 961 stars · 101 forks

Repositories

Showing 10 of 128 repositories
  • apify-cli Public

    The Apify command-line interface (CLI) helps you create, develop, build, and run Apify Actors, and manage the Apify cloud platform.

    TypeScript · 122 stars · 18 forks · 34 issues (1 issue needs help) · 5 pull requests · Updated Nov 10, 2024
  • apify-shared-js Public

    Utilities and constants shared across Apify projects.

    TypeScript · 12 stars · Apache-2.0 · 11 forks · 4 issues · 0 pull requests · Updated Nov 10, 2024
  • openapi Public

    An OpenAPI specification for the Apify API.

    JavaScript · 2 stars · MIT · 0 forks · 17 issues · 2 pull requests · Updated Nov 10, 2024
  • crawlee Public

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

    TypeScript · 15,521 stars · Apache-2.0 · 665 forks · 116 issues (1 issue needs help) · 14 pull requests · Updated Nov 10, 2024
  • docusaurus-plugin-typedoc-api Public Forked from milesj/docusaurus-plugin-typedoc-api

    Apify's fork of `docusaurus-plugin-typedoc-api`, customized for our Python documentation.

    TypeScript · 0 stars · 29 forks · 0 issues · 1 pull request · Updated Nov 9, 2024
  • rag-web-browser Public

    RAG Web Browser is a tool to provide your RAG pipelines with up-to-date information from the web.

    TypeScript · 2 stars · 0 forks · 0 issues · 1 pull request · Updated Nov 9, 2024
  • actor-whitepaper Public

    This whitepaper describes a new concept for building serverless microapps called Actors, which are easy to develop, share, integrate, and build upon. Actors are a reincarnation of the UNIX philosophy for programs running in the cloud.

    2 stars · Apache-2.0 · 0 forks · 7 issues · 5 pull requests · Updated Nov 8, 2024
  • crawlee-python Public

    Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

    Python · 4,501 stars · Apache-2.0 · 312 forks · 73 issues · 11 pull requests · Updated Nov 8, 2024
  • actor-beautifulsoup-scraper Public
    Python · 3 stars · Apache-2.0 · 0 forks · 2 issues · 0 pull requests · Updated Nov 8, 2024
  • apify-client-python Public

    Apify API client for Python

    Python · 47 stars · Apache-2.0 · 11 forks · 8 issues · 2 pull requests · Updated Nov 8, 2024