-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get dataframe #15
base: master
Are you sure you want to change the base?
Get dataframe #15
Conversation
Removed usage of sys.stderr in mock exception which caused the stream to be closed. This in turn failed the next test. Simplified test_search to avoid the zip and the loop.
Arguments realtime_start and realtime_end in get_series() now cause a pandas.DataFrame to be returned with pandas.MultiIndex for realtime data. Added simple test for the new feature and documentation. Added __init__.py in fredapi.tests so it's correctly interpreted as a package. Now we could revert to python setup.py test in .travis.yml. Fixed test_invalid_kwarg_in_get_series() as we sometimes get a TypeError and sometimes a ValueError. Seems that pandas passes through whatever exception it gets, might be a good reason for this so we follow the same policy. Simplified comparison of dataframe output in tests.
9999-12-31 cannot be converted to pandas.Timestamp because it's too big. Reason it's prefereable to use pandas.Timestamp than datetime.datetime is that the former can be used as an index whereas the second cannot.
Retrieve multiple series in one call with get_dataframe. Reorganized test mock calls.
@elmotec one high level comment is that it seems odd to have a function called the reason it also seems like it can potentially be quite tricky to try to fit different series data into the same index of a DataFrame. Maybe it's best for the user to decide on how to deal with different frequencies |
Fair enough. I guess we could call it With respect to the join of multiple series, I have not pushed the concept much to be honest (it fits my smallish purpose) but pandas can handle quite a bit of data even if the join were to create a lot of rows (memory is usually the bottleneck when I had problems). I've added a frequency argument to force say a quarterly observation for series with a lot of data points. Also, note that One last thing: I can see a challenge to backward compatibility as 9999-12-31 is now interpreted as None rather than literally. That was mostly to play nice with |
pandas has a datetime null value called |
Right, right. Somehow I missed it even though pandas actually turns None into NaT. Thank you for pointing it. What's more, checking it out made me realize that all the comparisons to NaT return false and that actually may not be what I want because things like |
Changed the translation of 9999-12-31 in Fred API to a customizable variable Fred.latest_time_stamp which defaults to pandas.Timestamp.max Renamed get_dataframe to get_multi_series to reflect what it does rather than what it returns. Made Fred.freq_map an internal variable (Fred.__freq_map). Documentation improvements.
@mortada, So I made the replacement of 9999-12-31 a class variable. For now, I think it should default to pd.Timestamp.max to allow selection on the resulting dataframe like |
Hey @mortada,
This is a cumulative PR for the fixes to test_search, get_series changes, and the new get_dataframe. Don't know if you are interested and if so whether you'd like all in one commit (that's quite a bit) or in 3 commits as it is. The normal order of the pull requests are:
1- fix-test_search
2- get_series_with_realtime
3- get_dataframe
Cheers