Ensure api requests adhere to 5000 / hr limit #29
Comments
Hey abpoll, I added an asyncio throttle in the new release (2.1.0). To use it, specify the number of calls and the period when initializing the firststreet object. Ex 1 - (1 call per 1 second): Ex 2 - (5000 calls per hour): Let me know if there are any issues. |
Thanks! |
Will 2.1.0 be available through pip install soon? |
Hey abpoll, Version 2.1.0 should already be up on PyPI: https://pypi.org/project/fsf-api-access-python/#description I did notice in a previous issue (#28) that you are on Python 3.6. Due to a change in how we handle the asynchronous call returns, we had to move to Python 3.7 and drop support for 3.6. You will need to update your Python version to 3.7/3.8 to get the version 2 updates. |
Thanks very much!
|
I tested the new functionality for county_fips 25017 (~400k parcels). I set the limit to 4000 per minute, and the throttle stably avoided Error 429 for 8 hours of runtime. One issue is that Timeout Errors occur constantly, so a job that could hypothetically take 2 hours doesn't finish. At around the 8-hour mark, a 429 Rate Limit exceeded error was thrown. The call to the API is fs = firststreet.FirstStreet(api_key, rate_limit=4000, rate_period=60). Is that correct? Here is the log until the first Timeout Error was thrown: 0%| | 0/420263 [00:00<?, ?it/s] Here is the log at the bottom before the 429 error was raised: 94%|█████████▎| 393061/420263 [8:14:02<02:12, 204.90it/s] |
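As a rough sanity check on the 2-hour estimate above, the ideal runtime at a given throttle can be worked out directly. This is a minimal sketch, assuming every call succeeds on its first attempt (no timeouts or retries) and that each parcel costs exactly one call; the function name is hypothetical and not part of the firststreet library.

```python
def ideal_runtime_hours(n_calls, rate_limit, rate_period):
    """Lower bound on runtime when `rate_limit` calls complete every `rate_period` seconds.

    Hypothetical helper for illustration only; it ignores timeouts and retries.
    """
    seconds = n_calls / rate_limit * rate_period
    return seconds / 3600


# 420263 parcels at 4000 calls per 60 seconds
print(round(ideal_runtime_hours(420263, 4000, 60), 2))    # 1.75
# The same job at 5000 calls per hour (3600 seconds)
print(round(ideal_runtime_hours(420263, 5000, 3600), 1))  # 84.1
```

The first figure matches the roughly 2 hours quoted above. The second shows how different the job looks if the throttle is set per hour instead of per minute.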
Hey @abpoll, So for the two errors: The timeout error you're receiving looks like it's from the probability products. The next update to the First Street API (v1.2) should include an update to the probability products, which should make them respond faster and fix the timeout errors from the server. The second error I'm having a hard time reproducing. rate_limit=4000, rate_period=60 is correct; it will run 4000 connections every 60 seconds. Just to check, you want 4000 / 60 seconds, and not 5000 / hour, right? |
Hi @Lyetenth - thanks for the help. Sorry for the confusing title of this issue. The specific call I made was to get 4000 / 60 seconds (trying to avoid hitting that 5000 limit). Looking forward to the update. I'll wait for v1.2 before making another big request like at the county level. |
@Lyetenth Joining this discussion as I'm facing the same errors when trying to make larger calls. Is there a recommended solution for big requests with FSIDs? Is there a rate limit that will definitely avoid timeout issues? |
Hey @bradleyswilson, Just to confirm: And what value are you providing to the Also what version of the library are you currently on? There have been a few updates since this issue was closed, so it might be a new issue I need to take a look at -- |
@Lyetenth Yes, I've only been working with the probability product. My situation is very similar to the previous one (i.e. timeout errors that were retrying successfully, but then a 429 rate limit error at some point during the run). I've played around with rate limit and rate period. rate_limit=4000, rate_period=60 were my initial specs. Upping the rate_period to 120 seems to have fixed the problem for some calls, but not others. I have about 200 unique lists of FSIDs for portions of major US cities. They range from 2,400 to 760,000 FSIDs. I believe it's version 2.1, downloaded on October 6th. I can upgrade to the most recent if that would make a difference? |
Hey @bradleyswilson, try updating to the most recent version ( Since version Assuming you're getting a lot of timeouts (more than the 5 I got; likely tens or hundreds), my thinking on why this might fix it is that during a timeout, the library attempts to re-call the API, which counts as an additional call against your API key. When too many timeouts happen, you get a rateLimit error even though the limit is explicitly set in the FirstStreet object. In case even the 5 timeouts are unlucky enough to trip the rateLimit, I use a slightly lower limit of 4990 per 60-second period. Try the update, and let me know if it is successful! |
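The hypothesis above (retries after a timeout count against the key's quota) can be illustrated with a small sliding-window limiter in which every attempt, including retries, takes a slot. This is only a sketch under stated assumptions: `SlidingWindowLimiter`, `call_with_retries`, and `flaky_request` are hypothetical names, and this is not how the firststreet library implements its throttle.

```python
import asyncio
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `rate_limit` acquisitions in any `rate_period`-second window.

    Hypothetical class for illustration; not the library's implementation.
    """

    def __init__(self, rate_limit, rate_period):
        self.rate_limit = rate_limit
        self.rate_period = rate_period
        self.stamps = deque()  # monotonic timestamps of recent acquisitions

    async def acquire(self):
        while True:
            now = time.monotonic()
            # Drop timestamps that have left the window.
            while self.stamps and now - self.stamps[0] >= self.rate_period:
                self.stamps.popleft()
            if len(self.stamps) < self.rate_limit:
                self.stamps.append(now)
                return
            # Window is full: wait until the oldest slot expires, then re-check.
            await asyncio.sleep(self.rate_period - (now - self.stamps[0]))


async def call_with_retries(limiter, request, max_attempts=3):
    """Await `request()`; each attempt, retries included, consumes a limiter slot."""
    for attempt in range(1, max_attempts + 1):
        await limiter.acquire()
        try:
            return await request()
        except asyncio.TimeoutError:
            if attempt == max_attempts:
                raise


async def demo():
    limiter = SlidingWindowLimiter(rate_limit=4990, rate_period=60)
    attempts = {"count": 0}

    async def flaky_request():
        # Stand-in for an API call that times out twice before succeeding.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise asyncio.TimeoutError
        return "ok"

    result = await call_with_retries(limiter, flaky_request)
    return result, len(limiter.stamps)


if __name__ == "__main__":
    print(asyncio.run(demo()))  # one logical call, three slots used
```

A throttle that counted only logical calls would record one call here while the server saw three, which is one way a run with many timeouts could exceed the limit despite an explicit `rate_limit`.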
@Lyetenth No luck, still getting tons of timeouts on large lists, eventually triggering a 429. Splitting into a smaller batch of 200k doesn't seem to reduce the number of timeout errors either, although it hasn't thrown a rateLimit error yet (still running). If the new updates are coming soon, I can probably wait to do more testing then, or work around by batching the files into smaller lists. Most of the cities I'm working with are much smaller in size. |
@bradleyswilson, that's weird hmmm. I'm currently running a test on With my script:
And I haven't had a timeout error yet. If you have your script, could you post it as well? Or try what I have above (replacing the API key and file, of course). Are you also using -- My concern is that I think these are client-side issues rather than server-side ones. If so, the bug would persist even with the 1.2 update to the API. |
Was running from command line, but script is basically identical. I'll test it through the script though too. Using I can send/post my list of FSIDs somewhere if that would be helpful on trying on your end? |
Okay, I just started a run with python -m firststreet -p probability.get_depth -s .\test_files\sample_property.txt --connection_limit 100 -rate_limit 5000 -rate_period 60 Thus far it's looking okay. Do you have an estimate on when the timeouts begin to occur? I'll take the list of FSIDs as well to see if it might be specific properties that may have an issue. If you can post it here: https://drive.google.com/drive/folders/1Eaa1aux_5PW0jNXLxOFKWsbUx_yMPZXy?usp=sharing |
Just dropped the file that I've been testing with in there. It starts throwing timeouts really early (like 1-2% for me). |
I've been able to process The next thing I can think of is checking the versions of the minor libraries Can you post a screenshot of running |
@Lyetenth Python 3.7.4 |
The Re: libraries, the only difference I see in the libraries is I'm using Python I do think the many timeOut errors are causing the rateLimit crash. You can try reducing the number of concurrent connections by changing the ie: -- |
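The idea of reducing the number of concurrent connections can be illustrated generically with an `asyncio.Semaphore`. This is a minimal sketch, not the library's code: `run_bounded` and `worker` are hypothetical names, and `connection_limit` is borrowed from the `--connection_limit` flag used earlier in the thread purely as a label.

```python
import asyncio


async def run_bounded(items, worker, connection_limit):
    """Run `worker(item)` for every item with at most `connection_limit` in flight.

    Hypothetical helper for illustration only.
    """
    semaphore = asyncio.Semaphore(connection_limit)

    async def guarded(item):
        # Only `connection_limit` coroutines can hold the semaphore at once.
        async with semaphore:
            return await worker(item)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(item) for item in items))


async def demo():
    async def worker(fsid):
        await asyncio.sleep(0.01)  # stand-in for a network request
        return fsid * 2

    return await run_bounded(range(10), worker, connection_limit=3)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Fewer simultaneous requests puts less load on each connection, which may reduce timeouts at the cost of throughput. Whether it helps in this case is untested here.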
Weird, I get timeout errors before it even gets to 1% completion. Thanks for trying to troubleshoot with me, I'll keep playing around with different settings and see if I can get something to work, otherwise I'll hope for v1.2 or another team member having some luck. |
Question:
How can I adhere to the 5000 / hr limit with an API request?
I've tried implementing the ratelimit (https://pypi.org/project/ratelimit/) and backoff (https://pypi.org/project/backoff/) packages but get a 429 error.
Here is what the function that calls the API looks like:
I also tried setting calls=500, but the 429 error is still thrown.
I'm guessing the limitation is not being implemented in the call.
Do I have to break out fsids into chunks of <= 5000, get them in fs.probability.get_depths(fsids_5000), time the call, and ensure that a minute passes before the next call?
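The manual approach described in the question (chunk the FSIDs, time each batch, wait out the rest of the period) could be sketched as follows. Everything here is an assumption for illustration: `chunked` and `fetch_in_batches` are hypothetical helpers, `fetch` stands in for a call such as fs.probability.get_depths, each FSID is assumed to cost one API call, and the batch size and period are placeholders to be set from the actual limit.

```python
import time


def chunked(seq, size):
    """Yield consecutive slices of `seq` with at most `size` items each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def fetch_in_batches(fsids, fetch, batch_size=5000, period=60.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call `fetch(batch)` per batch, spacing batch starts at least `period` seconds apart.

    `fetch` is a hypothetical stand-in for the real API call.
    `sleep` and `clock` are injectable so the pacing can be tested without waiting.
    """
    results = []
    batches = list(chunked(fsids, batch_size))
    for index, batch in enumerate(batches):
        started = clock()
        results.extend(fetch(batch))
        if index < len(batches) - 1:
            # Wait out whatever is left of the period before the next batch.
            remaining = period - (clock() - started)
            if remaining > 0:
                sleep(remaining)
    return results
```

This does not account for retries after timeouts, which would add calls beyond the batch size. The rate_limit and rate_period arguments discussed in the comments are intended to make this kind of manual pacing unnecessary.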