-
-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Add configurable device_scale_factor for improved screenshot quality #1463
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add configurable device_scale_factor for improved screenshot quality #1463
Conversation
WalkthroughAdds a configurable device_scale_factor to BrowserConfig and propagates it to Playwright browser context creation. Updates serialization/deserialization (from_kwargs, to_dict) and documentation. Adjusts browser_manager to use the config’s device_scale_factor instead of a hard-coded value. Updates related deployment documentation. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant BrowserConfig
participant BrowserManager
participant PlaywrightContext as Playwright Context
User->>BrowserConfig: Construct with device_scale_factor (default 1.0 or provided)
User->>BrowserManager: Instantiate with BrowserConfig
BrowserManager->>PlaywrightContext: create_browser_context(device_scale_factor = config.device_scale_factor)
PlaywrightContext-->>BrowserManager: Context created
BrowserManager-->>User: Ready context
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
deploy/docker/c4ai-code-context.md (1)
7693-7704
: Support device_scale_factor in BrowserConfig.from_kwargs and to_dict
BrowserConfig.init definesdevice_scale_factor: float = 1.0
, but its staticfrom_kwargs
andto_dict
omit this field—configs loaded from or serialized to dict will drop any custom DPI setting. In crawl4ai/async_configs.py, update:
BrowserConfig.from_kwargs
to includedevice_scale_factor=kwargs.get("device_scale_factor", 1.0)
BrowserConfig.to_dict
to add"device_scale_factor": self.device_scale_factor
🧹 Nitpick comments (9)
crawl4ai/async_configs.py (3)
368-375
: Docstring is clear; add brief guidance on sensible ranges.Consider noting typical values (e.g., 1.0–3.0) and that higher values increase memory/CPU to help users choose safely.
421-473
: Validate and coerce device_scale_factor.Guard against non-numeric, non-finite, or non-positive inputs to avoid surprising runtime behavior.
Apply this diff to replace the assignment with validation/coercion:
- self.device_scale_factor = device_scale_factor + # Validate device_scale_factor + try: + dsf = float(device_scale_factor) + except (TypeError, ValueError): + raise ValueError("device_scale_factor must be numeric") + if dsf <= 0 or dsf != dsf or dsf in (float("inf"), float("-inf")): + raise ValueError("device_scale_factor must be a positive, finite number") + self.device_scale_factor = dsfAdditionally, add this import outside the selected range:
import math # if you prefer math.isfinite, replace the finiteness checks accordingly
546-547
: Coerce from kwargs to float to match constructor type.This avoids silent string propagation from env/CLI sources.
- device_scale_factor=kwargs.get("device_scale_factor", 1.0), + device_scale_factor=float(kwargs.get("device_scale_factor", 1.0) or 1.0),crawl4ai/browser_manager.py (2)
663-673
: Managed-browser path may not apply device_scale_factor to existing contexts.When connecting over CDP and reusing an existing context (contexts[0]), immutable options like device_scale_factor won’t be updated. Result: the new setting is ignored in managed/CDP mode unless a fresh context is created.
Option: when contexts already exist, create a new context with config settings, clone runtime state, and use it as default:
- contexts = self.browser.contexts - if contexts: - self.default_context = contexts[0] - else: - self.default_context = await self.create_browser_context() - await self.setup_context(self.default_context) + contexts = self.browser.contexts + if contexts: + old_ctx = contexts[0] + new_ctx = await self.create_browser_context() + # copy cookies/localStorage, then apply headers/etc. in setup_context + await clone_runtime_state(old_ctx, new_ctx, None, self.config) + self.default_context = new_ctx + else: + self.default_context = await self.create_browser_context() + await self.setup_context(self.default_context)Please verify in managed mode by taking a full-page screenshot and checking the output resolution equals viewport * device_scale_factor.
974-1009
: Context reuse key ignores BrowserConfig changes.contexts_by_config is keyed only by CrawlerRunConfig, so changing BrowserConfig (e.g., device_scale_factor) mid-run won’t trigger a new context in non-managed mode. If runtime mutation of BrowserConfig is expected, consider including a hash of relevant BrowserConfig fields in the signature or making BrowserConfig immutable post-init.
deploy/docker/c4ai-code-context.md (4)
7693-7704
: CDP reuse may bypass the new scale factor.When connecting over CDP, you reuse the first existing context if any. That context won’t inherit the new device_scale_factor. Consider documenting this or creating a fresh context when scale differs from the existing one.
7693-7704
: Add basic validation for device_scale_factor.Recommend enforcing > 0 and a reasonable upper bound (e.g., ≤ 4) in BrowserConfig to prevent excessive memory/CPU during screenshots.
Apply in BrowserConfig.init:
+ if hasattr(self, "device_scale_factor"): + if not (self.device_scale_factor and self.device_scale_factor > 0): + raise ValueError("device_scale_factor must be > 0") + if self.device_scale_factor > 4: + self.device_scale_factor = 4 # clamp to avoid runaway memory usage
7693-7704
: Document how to set it (code and CLI).
- Python:
browser_cfg = BrowserConfig(device_scale_factor=2.0)
- CLI:
crwl https://example.com -b "device_scale_factor=2.0,screenshot=true"
Also add a note about higher memory use with larger scale factors.
7693-7704
: Add a regression test for scaling effect.Capture a known page at scale 1.0 vs 2.0 and assert the PNG dimensions (in pixels) scale accordingly for the same viewport.
I can draft a minimal async pytest using Playwright to validate this if helpful.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
crawl4ai/async_configs.py
(5 hunks)crawl4ai/browser_manager.py
(1 hunks)deploy/docker/c4ai-code-context.md
(1 hunks)
🔇 Additional comments (3)
crawl4ai/async_configs.py (1)
584-585
: Serialization looks good; ensure dumps omit defaults if intended.to_dict includes device_scale_factor explicitly. If you want dump() to omit defaults, it already does via to_serializable_dict; no change needed.
crawl4ai/browser_manager.py (1)
922-924
: Correctly plumbs device_scale_factor into new_context.This is the key change to make the feature effective.
deploy/docker/c4ai-code-context.md (1)
7693-7704
: Playwright context option name is correct.Passing device_scale_factor to new_context is supported and should improve screenshot DPI as intended.
Context
While crawling startup websites and taking full-page screenshots, I encountered an issue where images within the screenshots (such as product dashboards) had poor quality that made them unusable for further processing. After investigating, I discovered that increasing the
device_scale_factor
from 1.0 to 2.0 significantly improved screenshot quality when using Playwright directly.However, when trying to apply this solution using this library, I found that the
device_scale_factor
was hardcoded to 1.0 inbrowser_manager.py
and there was no way to configure it throughBrowserConfig
. Attempts to use Chromium extra args like--device-scale-factor=2
were unsuccessful.Summary
This PR makes the
device_scale_factor
configurable throughBrowserConfig
instead of being hardcoded to 1.0. This allows users to improve screenshot quality when needed, particularly for capturing detailed images or dashboards on websites.The changes maintain backward compatibility by keeping the default value at 1.0, so existing code will continue to work unchanged.
List of files changed and why
async_configs.py
: Addeddevice_scale_factor
parameter to theBrowserConfig
constructor,from_kwargs()
, andto_dict()
methods following the same pattern as other existing parametersbrowser_manager.py
: Modified to retrieve thedevice_scale_factor
value from theBrowserConfig
instance instead of using the hardcoded 1.0 valuedeploy/docker/c4ai-code-context.md
: Updated documentation to reflect the new configurable parameterHow Has This Been Tested?
device_scale_factor
successfully improved screenshot qualitypytest
before and after the modifications with no errors or warnings introduced by the changesChecklist:
Note: I'm not very experienced with open source contributions, so I may have missed some important conventions or additional tests that should be included. Please let me know if there's anything else that should be addressed.
Summary by CodeRabbit
New Features
Documentation