Skip to content

Conversation

TristanDonze
Copy link

@TristanDonze TristanDonze commented Sep 1, 2025

Context

While crawling startup websites and taking full-page screenshots, I encountered an issue where images within the screenshots (such as product dashboards) had poor quality that made them unusable for further processing. After investigating, I discovered that increasing the device_scale_factor from 1.0 to 2.0 significantly improved screenshot quality when using Playwright directly.

However, when trying to apply this solution using this library, I found that the device_scale_factor was hardcoded to 1.0 in browser_manager.py and there was no way to configure it through BrowserConfig. Attempts to use Chromium extra args like --device-scale-factor=2 were unsuccessful.

Summary

This PR makes the device_scale_factor configurable through BrowserConfig instead of being hardcoded to 1.0. This allows users to improve screenshot quality when needed, particularly for capturing detailed images or dashboards on websites.

The changes maintain backward compatibility by keeping the default value at 1.0, so existing code will continue to work unchanged.

List of files changed and why

  • async_configs.py: Added device_scale_factor parameter to the BrowserConfig constructor, from_kwargs(), and to_dict() methods following the same pattern as other existing parameters
  • browser_manager.py: Modified to retrieve the device_scale_factor value from the BrowserConfig instance instead of using the hardcoded 1.0 value
  • deploy/docker/c4ai-code-context.md: Updated documentation to reflect the new configurable parameter

How Has This Been Tested?

  • Tested with a personal script that takes screenshots and verified that modifying the device_scale_factor successfully improved screenshot quality
  • Ran pytest before and after the modifications with no errors or warnings introduced by the changes
  • The changes follow the existing code patterns, so the risk of breaking existing functionality is minimal

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added/updated unit tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Note: I'm not very experienced with open source contributions, so I may have missed some important conventions or additional tests that should be included. Please let me know if there's anything else that should be addressed.

Summary by CodeRabbit

  • New Features

    • Added a configurable device scale factor in browser settings to control rendering scale (e.g., sharper screenshots and UI rendering). Defaults to 1.0 and can be adjusted via configuration.
    • Browser contexts now honor the configured device scale factor instead of a fixed value.
  • Documentation

    • Updated usage docs to include the new device scale factor option and configuration guidance.

Copy link
Contributor

coderabbitai bot commented Sep 1, 2025

Walkthrough

Adds a configurable device_scale_factor to BrowserConfig and propagates it to Playwright browser context creation. Updates serialization/deserialization (from_kwargs, to_dict) and documentation. Adjusts browser_manager to use the config’s device_scale_factor instead of a hard-coded value. Updates related deployment documentation.

Changes

Cohort / File(s) Summary
Config enhancement
crawl4ai/async_configs.py
Added device_scale_factor to BrowserConfig (init, attribute, docstring). Updated from_kwargs to accept it and to_dict to serialize it.
Browser context usage
crawl4ai/browser_manager.py
create_browser_context now uses self.config.device_scale_factor instead of a hard-coded 1.0.
Docs update
deploy/docker/c4ai-code-context.md
Documentation updated to reflect configurable device_scale_factor sourced from BrowserConfig.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant BrowserConfig
  participant BrowserManager
  participant PlaywrightContext as Playwright Context

  User->>BrowserConfig: Construct with device_scale_factor (default 1.0 or provided)
  User->>BrowserManager: Instantiate with BrowserConfig
  BrowserManager->>PlaywrightContext: create_browser_context(device_scale_factor = config.device_scale_factor)
  PlaywrightContext-->>BrowserManager: Context created
  BrowserManager-->>User: Ready context
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I tuned my screens with gentle care,
A pixel’s hop, a sharper glare.
Scale set right, I browse with cheer,
Tiny paws, big pages near.
From config burrow to browser glade—
Crisp carrots on display, upgrade! 🥕🖥️

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deploy/docker/c4ai-code-context.md (1)

7693-7704: Support device_scale_factor in BrowserConfig.from_kwargs and to_dict
BrowserConfig.init defines device_scale_factor: float = 1.0, but its static from_kwargs and to_dict omit this field—configs loaded from or serialized to dict will drop any custom DPI setting. In crawl4ai/async_configs.py, update:

  • BrowserConfig.from_kwargs to include device_scale_factor=kwargs.get("device_scale_factor", 1.0)
  • BrowserConfig.to_dict to add "device_scale_factor": self.device_scale_factor
🧹 Nitpick comments (9)
crawl4ai/async_configs.py (3)

368-375: Docstring is clear; add brief guidance on sensible ranges.

Consider noting typical values (e.g., 1.0–3.0) and that higher values increase memory/CPU to help users choose safely.


421-473: Validate and coerce device_scale_factor.

Guard against non-numeric, non-finite, or non-positive inputs to avoid surprising runtime behavior.

Apply this diff to replace the assignment with validation/coercion:

-        self.device_scale_factor = device_scale_factor
+        # Validate device_scale_factor
+        try:
+            dsf = float(device_scale_factor)
+        except (TypeError, ValueError):
+            raise ValueError("device_scale_factor must be numeric")
+        if dsf <= 0 or dsf != dsf or dsf in (float("inf"), float("-inf")):
+            raise ValueError("device_scale_factor must be a positive, finite number")
+        self.device_scale_factor = dsf

Additionally, add this import outside the selected range:

import math  # if you prefer math.isfinite, replace the finiteness checks accordingly

546-547: Coerce from kwargs to float to match constructor type.

This avoids silent string propagation from env/CLI sources.

-            device_scale_factor=kwargs.get("device_scale_factor", 1.0),
+            device_scale_factor=float(kwargs.get("device_scale_factor", 1.0) or 1.0),
crawl4ai/browser_manager.py (2)

663-673: Managed-browser path may not apply device_scale_factor to existing contexts.

When connecting over CDP and reusing an existing context (contexts[0]), immutable options like device_scale_factor won’t be updated. Result: the new setting is ignored in managed/CDP mode unless a fresh context is created.

Option: when contexts already exist, create a new context with config settings, clone runtime state, and use it as default:

-            contexts = self.browser.contexts
-            if contexts:
-                self.default_context = contexts[0]
-            else:
-                self.default_context = await self.create_browser_context()
-            await self.setup_context(self.default_context)
+            contexts = self.browser.contexts
+            if contexts:
+                old_ctx = contexts[0]
+                new_ctx = await self.create_browser_context()
+                # copy cookies/localStorage, then apply headers/etc. in setup_context
+                await clone_runtime_state(old_ctx, new_ctx, None, self.config)
+                self.default_context = new_ctx
+            else:
+                self.default_context = await self.create_browser_context()
+            await self.setup_context(self.default_context)

Please verify in managed mode by taking a full-page screenshot and checking the output resolution equals viewport * device_scale_factor.


974-1009: Context reuse key ignores BrowserConfig changes.

contexts_by_config is keyed only by CrawlerRunConfig, so changing BrowserConfig (e.g., device_scale_factor) mid-run won’t trigger a new context in non-managed mode. If runtime mutation of BrowserConfig is expected, consider including a hash of relevant BrowserConfig fields in the signature or making BrowserConfig immutable post-init.

deploy/docker/c4ai-code-context.md (4)

7693-7704: CDP reuse may bypass the new scale factor.

When connecting over CDP, you reuse the first existing context if any. That context won’t inherit the new device_scale_factor. Consider documenting this or creating a fresh context when scale differs from the existing one.


7693-7704: Add basic validation for device_scale_factor.

Recommend enforcing > 0 and a reasonable upper bound (e.g., ≤ 4) in BrowserConfig to prevent excessive memory/CPU during screenshots.

Apply in BrowserConfig.init:

+        if hasattr(self, "device_scale_factor"):
+            if not (self.device_scale_factor and self.device_scale_factor > 0):
+                raise ValueError("device_scale_factor must be > 0")
+            if self.device_scale_factor > 4:
+                self.device_scale_factor = 4  # clamp to avoid runaway memory usage

7693-7704: Document how to set it (code and CLI).

  • Python:
browser_cfg = BrowserConfig(device_scale_factor=2.0)
  • CLI:
crwl https://example.com -b "device_scale_factor=2.0,screenshot=true"

Also add a note about higher memory use with larger scale factors.


7693-7704: Add a regression test for scaling effect.

Capture a known page at scale 1.0 vs 2.0 and assert the PNG dimensions (in pixels) scale accordingly for the same viewport.

I can draft a minimal async pytest using Playwright to validate this if helpful.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e651e04 and b54c200.

📒 Files selected for processing (3)
  • crawl4ai/async_configs.py (5 hunks)
  • crawl4ai/browser_manager.py (1 hunks)
  • deploy/docker/c4ai-code-context.md (1 hunks)
🔇 Additional comments (3)
crawl4ai/async_configs.py (1)

584-585: Serialization looks good; ensure dumps omit defaults if intended.

to_dict includes device_scale_factor explicitly. If you want dump() to omit defaults, it already does via to_serializable_dict; no change needed.

crawl4ai/browser_manager.py (1)

922-924: Correctly plumbs device_scale_factor into new_context.

This is the key change to make the feature effective.

deploy/docker/c4ai-code-context.md (1)

7693-7704: Playwright context option name is correct.

Passing device_scale_factor to new_context is supported and should improve screenshot DPI as intended.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants