Firefox does not work with proxy. #320
Comments
I can not reproduce with mitmproxy:
Slightly adapted sample spider:

import scrapy


class ExampleSpider(scrapy.Spider):
    name = "ex"
    start_urls = ["https://httpbin.org/get"]
    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
            "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
        },
        "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
        "PLAYWRIGHT_BROWSER_TYPE": "firefox",
        "PLAYWRIGHT_LAUNCH_OPTIONS": {
            "headless": False,
            "timeout": 20 * 1000,
            "proxy": {
                "server": "127.0.0.1:8080",
                "username": "user",
                "password": "pass",
            },
        },
    }

    def start_requests(self):
        yield scrapy.Request(
            url=self.start_urls[0],
            callback=self.parse_detail,
            meta=dict(
                playwright=True,
                playwright_include_page=True,
                playwright_context_kwargs=dict(
                    java_script_enabled=True,
                    ignore_https_errors=True,
                ),
            ),
        )

    async def parse_detail(self, response):
        print(f"Received response from {response.url}")
        page = response.meta["playwright_page"]
        await page.close()

Which proxy are you using? Perhaps this is an interaction with that specific provider.
I have some thoughts
In my case, Scrapy received a 407 response and then marked the request as failed. I use https://scrapoxy.io to manage proxies.
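For context, HTTP 407 means "Proxy Authentication Required": the proxy rejected or did not receive valid credentials. One way to rule out the browser entirely is to send a request through the proxy with only the standard library. The sketch below assumes the proxy accepts HTTP Basic authentication and listens on a plain HTTP port; the helper names `proxy_auth_header` and `probe_proxy` are hypothetical and not part of Scrapy, scrapy-playwright, or Scrapoxy.

```python
import base64
import http.client


def proxy_auth_header(username: str, password: str) -> str:
    """Build the value of a Proxy-Authorization header for Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return f"Basic {token.decode('ascii')}"


def probe_proxy(
    host: str,
    port: int,
    username: str,
    password: str,
    target: str = "http://httpbin.org/get",  # assumed test URL, plain HTTP
    timeout: float = 10.0,
) -> int:
    """Send one GET through the proxy and return the HTTP status code.

    A 407 here suggests the credentials (or the auth scheme) are wrong,
    independently of any browser.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # A forward proxy expects the absolute URL in the request line.
        conn.request(
            "GET",
            target,
            headers={"Proxy-Authorization": proxy_auth_header(username, password)},
        )
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling `probe_proxy("127.0.0.1", 8080, "user", "pass")` against the proxy from the spider settings would show whether the 407 also occurs outside the browser.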
All requests were routed through Playwright (note the "scrapy-playwright" logger name):
The provided spider works correctly with Scrapoxy. I've started it as indicated in their docs and I'm getting the following logs. There is a failure downloading the response, but that's reasonable because I did not add an actual proxy provider in the Scrapoxy configuration site.
However, if I pass incorrect credentials I do get the reported message:
I also experienced Update: With Chromium I get
I just created an example spider.
Chromium works well, but the setup below raises NS_ERROR_PROXY_CONNECTION_REFUSED:
playwright._impl._errors.Error: Page.goto: NS_ERROR_PROXY_CONNECTION_REFUSED
I debugged into ScrapyPlaywrightDownloadHandler._maybe_launch_browser and captured the launch_options.
I then copied them into a plain Playwright script to test, and it works.
example_spider.py
test_with_playwright.py
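The attached scripts are not reproduced in this thread. As a rough illustration of the comparison described above, a standalone check might look like the following sketch. It assumes the same launch options as the sample spider (proxy at 127.0.0.1:8080 with user/pass); `build_launch_options`, `check_browser`, and `main` are hypothetical names, and this is not the reporter's actual test_with_playwright.py.

```python
import asyncio


def build_launch_options(server, username, password, headless=False, timeout_ms=20_000):
    """Return launch options shaped like PLAYWRIGHT_LAUNCH_OPTIONS in the spider."""
    return {
        "headless": headless,
        "timeout": timeout_ms,
        "proxy": {"server": server, "username": username, "password": password},
    }


async def check_browser(browser_name, url, launch_options):
    """Launch the named browser through the proxy and return the response status."""
    # Imported lazily so the helper above can be used without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser_type = getattr(p, browser_name)  # "firefox" or "chromium"
        browser = await browser_type.launch(**launch_options)
        try:
            page = await browser.new_page(ignore_https_errors=True)
            response = await page.goto(url)
            return response.status if response else None
        finally:
            await browser.close()


def main():
    # Assumed values, copied from the sample spider in this thread.
    options = build_launch_options("127.0.0.1:8080", "user", "pass")
    for name in ("firefox", "chromium"):
        status = asyncio.run(check_browser(name, "https://httpbin.org/get", options))
        print(f"{name}: HTTP {status}")
```

Running `main()` with the proxy up would show whether Firefox and Chromium behave differently outside of Scrapy when given identical launch options.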