Catch API Limitations/Blockers Earlier in the call chain #89

@joshdunnlime

There are several limitations on API calls:

  1. 10 or 10,000 locations*.
  2. 10 weather parameters per query.
  3. GET requests have a limit on URL length.
  4. Time series length (16.8M rows when **stacked).
  5. Timeout on API call.

The combination of these can create real issues and blockers for developers trying to make optimal or efficient calls to Meteomatics. Online documentation is either limited, hard to find (not in one place) or nonexistent. Having some pre-call checks in the Meteomatics Python API would at least document these limits in one place. Better still, adding them to function docstrings would be an even bigger improvement.
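To make the idea of pre-call checks concrete, here is a minimal sketch of keeping the count limits in one place and checking them before any request is sent. The names `QueryLimits`, `QueryLimitError` and `check_counts` are hypothetical and not part of the Meteomatics package; the default values are the ones listed above, and the location limit depends on the service agreement.

```python
from dataclasses import dataclass


class QueryLimitError(ValueError):
    """Raised locally, before any HTTP request, when a known limit is exceeded."""


@dataclass(frozen=True)
class QueryLimits:
    # Hypothetical container; values come from the list in this issue.
    max_locations: int = 10   # 10 or 10,000 depending on the service agreement
    max_parameters: int = 10  # weather parameters per query


def check_counts(locations, parameters, limits=QueryLimits()):
    """Raise QueryLimitError if the query has too many locations or parameters."""
    if len(locations) > limits.max_locations:
        raise QueryLimitError(
            f"{len(locations)} locations requested, limit is {limits.max_locations}"
        )
    if len(parameters) > limits.max_parameters:
        raise QueryLimitError(
            f"{len(parameters)} parameters requested, limit is {limits.max_parameters}"
        )
```

Because the error is raised in Python rather than parsed from a response, a unit test could assert on `QueryLimitError` directly without mocking the API.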

The main theme here is catching errors earlier in the call stack, avoiding errors being raised from responses, which can obscure or complicate debugging. It also allows for more specific error messaging. Finally, it allows users to better unit test their code without mocking API calls. If something is changed server side, for example an error message, HTTP status or otherwise, users will no longer need to change their unit tests.

Some simple but very effective improvements might be as follows:

  1. Add {"number of locations": 10} to the query_user_limits or query_user_features function. This could be used to catch errors earlier in the call stack, either by users or in the Meteomatics Python package.

  2. Validate the number of parameters before making the API call. Additionally, add an option to batch >10 parameters into multiple calls and then combine the results into a single dataframe.

query_time_series(
    ...,
    parameters=[f'wind_speed_{h}m' for h in range(0, 200, 10)],
    batch_parameters=True,
)
# returns a single dataframe from two calls to the API

  3. The URL length error appears to be returned only after the call to Meteomatics has been made. This could be caught before the call is made and raised directly in Python. A helpful error message could also recommend that the user switch to POST requests.

  4. Compute the size of the returned data/dataframe and raise an error early. When a user calls the API, the time series length (f(startdate, enddate, interval)), the number of parameters and the number of locations are all known, so their product can be computed and the error raised there.
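The batching, URL-length and size checks suggested in the list above could be sketched roughly as follows. All function names here are hypothetical and not part of the Meteomatics package. The sketch assumes both `startdate` and `enddate` are included in the time series, takes the 16.8M row limit from this issue as an approximate default, and leaves the URL length limit as a required argument because its actual value is not stated here.

```python
from datetime import datetime, timedelta


def chunk_parameters(parameters, max_per_call=10):
    """Split a parameter list into batches of at most max_per_call items."""
    return [parameters[i:i + max_per_call]
            for i in range(0, len(parameters), max_per_call)]


def count_timestamps(startdate, enddate, interval):
    """Number of time steps, assuming both endpoints are included."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if enddate < startdate:
        raise ValueError("enddate must not precede startdate")
    return (enddate - startdate) // interval + 1


def check_stacked_rows(n_timestamps, n_locations, n_parameters,
                       max_rows=16_800_000):
    """Return the stacked row count, raising early if it exceeds max_rows."""
    rows = n_timestamps * n_locations * n_parameters
    if rows > max_rows:
        raise ValueError(f"query would return {rows} rows, limit is {max_rows}")
    return rows


def check_url_length(url, max_length):
    """Raise before sending a GET request whose URL is too long."""
    if len(url) > max_length:
        raise ValueError(
            f"URL is {len(url)} characters (limit {max_length}); "
            "consider switching to a POST request"
        )


# Usage: the 20 parameters from the example above become two batches of ten.
params = [f"wind_speed_{h}m" for h in range(0, 200, 10)]
batches = chunk_parameters(params)
steps = count_timestamps(datetime(2024, 1, 1), datetime(2024, 1, 2),
                         timedelta(hours=1))
rows_per_call = check_stacked_rows(steps, n_locations=10,
                                   n_parameters=len(batches[0]))
```

Each batch would then be passed to a separate query and the resulting dataframes joined on their shared index, which is one way the proposed `batch_parameters=True` option might behave.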

I have no simple suggestions for 5). It is mainly there to show that the compounding of all these errors/limits makes the API trickier to use than it needs to be. Fixing the four other issues would drastically help users get to a place where they can optimise their queries for request timeouts. I am a big fan of the Meteomatics API and am a regular user, but these are often pain points. Addressing even some of these issues would make it amazing to work with. Thanks!

*10 or 10,000 locations, depending on the API service agreement.

**stacked refers to the pandas-specific operation of stacking or pivoting to make the dataframe long and thin. In this case, it means stacking such that timestamps, locations and parameters are all levels of the MultiIndex.
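As a small illustration of what stacking does to the row count, the sketch below builds a dummy wide dataframe and stacks it. The index layout (`lat`, `lon`, `validdate`) and the parameter names are assumptions made for the example, not a statement of what the Meteomatics package returns.

```python
import pandas as pd

# Dummy inputs: 3 timestamps, 2 locations, 2 parameters.
timestamps = pd.date_range("2024-01-01", periods=3, freq=pd.Timedelta(hours=1))
locations = [(47.4, 9.4), (51.5, -0.1)]
parameters = ["t_2m:C", "wind_speed_10m:ms"]

# Wide layout: one row per (location, timestamp), one column per parameter.
index = pd.MultiIndex.from_tuples(
    [(lat, lon, ts) for lat, lon in locations for ts in timestamps],
    names=["lat", "lon", "validdate"],
)
wide = pd.DataFrame(0.0, index=index, columns=parameters)
wide.columns.name = "parameter"

# Stacking moves the parameter columns into the index, so the row count
# becomes timestamps x locations x parameters.
stacked = wide.stack()
n_rows = len(stacked)
```

Here the wide frame has 6 rows and the stacked one has 12, which is the quantity the row limit applies to.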
