Feature request: Limit the maximum number of bytes to parse.
A maximum file size may be enforced per crawler. Content which is after the maximum file size may be ignored. Google currently enforces a size limit of 500 kilobytes (KB).
When forming the robots.txt file, you should keep in mind that the robot places a reasonable limit on its size. If the file size exceeds 32 KB, the robot assumes it allows everything.
At the moment, it's possible to generate large (fake or valid) robots.txt files with the aim of trapping the robots.txt crawler, slowing it down, or even causing it to hang or crash.
Depending on the setup, it's also possible to trap the crawler in an infinite retry loop if the external code using this library doesn't handle repeated fatal errors correctly.
Source: Google
Source: Yandex
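A minimal sketch of the idea (in Python, not this library's API; `MAX_ROBOTS_BYTES`, `fetch_robots_txt`, and the parser call are hypothetical names used for illustration): read at most a fixed number of bytes from the response and hand only that slice to the parser, so an oversized or endlessly streaming robots.txt can't exhaust memory or stall parsing.

```python
import urllib.request

# Hypothetical limit constant; Google documents 500 KB, Yandex 32 KB.
MAX_ROBOTS_BYTES = 500 * 1024

def fetch_robots_txt(url: str, limit: int = MAX_ROBOTS_BYTES) -> bytes:
    """Fetch robots.txt, reading at most `limit` bytes of the body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        # read(limit) returns after `limit` bytes even if the server keeps
        # streaming data, so a deliberately huge robots.txt is simply truncated.
        return response.read(limit)

content = fetch_robots_txt("https://example.com/robots.txt")
# parser.parse(content.decode("utf-8", errors="replace"))  # hand the capped content to the parser
```

Whether the parser then follows Google (ignore content past the limit) or Yandex (treat an oversized file as allow-all) could be made a configurable option.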