KeyError: 'the label [2000-01-03 00:00:00+00:00] is not in the [index]' #1957
Comments
Hi @kerwinxu, I'm currently looking at #1953, #1950, #1949, and #1947, and it looks like these are all the same problem: they're coming from Google no longer giving us enough benchmark data. Before, we could get up to 4000 days of data for
I think until it is fixed, an idea might be to copy https://github.com/quantopian/zipline/blob/master/zipline/resources/market_data/SPY_benchmark.csv to your
Thanks for the pointers to benchmarks. I found the code doing this, and it looks like Google (not Yahoo) is returning just the last year's worth of data, no matter what dates you pass it. I see other people have since commented on the same thing. The latest pandas-datareader version shows the same behavior. I modified the benchmarks.py code to use Yahoo and print the data to STDOUT, fetched the data as a one-off, and saved it into SPY_benchmark.csv. I tried leaving Yahoo in there permanently, but it comes back with errors, which I think is caused by rate limiting on connections. So doing a one-off grab, saving it into the CSV, and then changing the code back to Google worked for me. Thanks for the help, everyone.
As I mentioned in #1950, copying a prepared SPY_benchmark.csv that is not up to date does not work, because zipline compares the latest date in the file and downloads from Google again. I think the better workaround for now is to use the Yahoo data with a Yahoo fix patch for pandas-datareader; here is the reference, and see the comment by @edmunch. It does work for me.
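For anyone trying the one-off approach described above, here is a minimal sketch of the "compute returns, save to CSV, read back" part. It assumes the download has already produced a series of closing prices; the prices below are made up for illustration, and the column names ("date", "return") are assumptions rather than the layout zipline expects, so check the shipped SPY_benchmark.csv before copying anything over it.

```python
import io

import pandas as pd

# Made-up closing prices standing in for a one-off download; not real SPY data.
closes = pd.Series(
    [100.0, 101.0, 99.99],
    index=pd.to_datetime(["2011-01-03", "2011-01-04", "2011-01-05"]),
    name="close",
)

# Daily simple returns; the first row has no previous close, so drop it.
returns = closes.pct_change().dropna().rename("return")

# Write to an in-memory buffer here; a real run would pass a file path instead.
buf = io.StringIO()
returns.to_csv(buf, header=True, index_label="date")

# Read the cached data back and confirm it survived the round trip.
buf.seek(0)
cached = pd.read_csv(buf, index_col="date", parse_dates=True)["return"]
print(cached)
```

Keeping the fetch separate from the save means the rate-limited source only has to be contacted once.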
This patch seems OK.
edmunch commented on 30 Jun (edited):
install pandas_datareader
patch Benchmarks.py with:
import pandas as pd
from six.moves.urllib_parse import urlencode
import pandas_datareader as pdr  # NEW
def get_benchmark_returns(symbol, start_date, end_date):
No, even if I put the correct SPY_benchmark.csv in place, the call to TradingAlgorithm overwrites it with the wrong version! Please reopen the issue.
@zxweed |
@ezfine I have not used the Yahoo download because Yahoo shut it down a couple of months ago. I have used Quandl as a source instead.
Yes, Yahoo changed its API several months ago, and that's why we need a patch for pandas-datareader. I didn't try Quandl data with zipline because it doesn't provide adjusted close data.
@ezfine @zxweed The original issue at the top of this thread (to be clear, the one with the warning message "WARNING: Loader: Refusing to download new benchmark data because a download succeeded at 2017-09-20 02:19:52.057758+00:00.") has nothing to do with recent changes to the Google API. @kerwinxu Please read the previous paragraph. Perhaps an optional flag to force downloads despite the cooldown would be a good idea? Would you like me to PR this?
This happens because Google now limits users to about 251 days' worth of data per request, so you can't run backtests longer than a year. There is a fix currently being worked on. There are duplicates of this issue, so I'm just going to direct everyone to this issue: #1965. I'll comment there when there is a fix on master.
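To make the cooldown idea concrete, here is a rough sketch of such a check with a force flag. This is not zipline's loader code: the function name `should_download`, the `force` parameter, and the one-hour cooldown are all assumptions for illustration. The timestamps are taken from the warning in the original report.

```python
from datetime import datetime, timedelta, timezone

# Assumed cooldown length; the value zipline actually uses may differ.
COOLDOWN = timedelta(hours=1)


def should_download(last_success, now, force=False, cooldown=COOLDOWN):
    """Decide whether to fetch benchmark data again (hypothetical helper).

    last_success: time of the last successful download, or None if never.
    force: skip the cooldown check entirely.
    """
    if force or last_success is None:
        return True
    return now - last_success >= cooldown


# Times from the log: download succeeded at 02:19:52, run attempted at 02:40:15.
last_success = datetime(2017, 9, 20, 2, 19, 52, tzinfo=timezone.utc)
now = datetime(2017, 9, 20, 2, 40, 15, tzinfo=timezone.utc)

print(should_download(last_success, now))              # refused by cooldown
print(should_download(last_success, now, force=True))  # forced through
```

Under these assumptions the second run, about 20 minutes after the first, is refused unless the flag is set, which matches the warning shown in the report.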
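A small sketch of why a short benchmark series ends in the KeyError from the title: looking up a date by label with `.loc` fails when the index does not contain it. The series below is synthetic (251 business days ending 2017-09-19, all zeros) and the `covers` helper is hypothetical, not part of zipline; it only shows a pre-check that could be run before a backtest.

```python
import pandas as pd

# Synthetic stand-in for a benchmark that only reaches back about one year.
index = pd.date_range(end="2017-09-19", periods=251, freq="B", tz="UTC")
benchmark = pd.Series(0.0, index=index)


def covers(series, start, end):
    """Return True if the series index spans the range [start, end]."""
    if series.empty:
        return False
    return series.index[0] <= start and series.index[-1] >= end


# The label from the traceback in this issue.
label = pd.Timestamp("2000-01-03", tz="UTC")

# Label-based lookup raises KeyError for a date outside the index.
try:
    benchmark.loc[label]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(lookup_failed)
print(covers(benchmark, label, pd.Timestamp("2017-09-19", tz="UTC")))
```

Checking coverage up front gives a clearer message than the nested traceback from the indexing code.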
Dear Zipline Maintainers,
Before I tell you about my issue, let me describe my environment:
Environment
$ python -c 'import math, sys;print(int(math.log(sys.maxsize + 1, 2) + 1))'
$ pip freeze
or
$ conda list
Now that you know a little about me, let me tell you about the issue I am
having:
Description of Issue
Here is how you can reproduce this issue on your machine:
Reproduction Steps
1. Install zipline: conda install -n python35 -c Quantopian zipline
2. zipline ingest
3. zipline run -f dual_moving_average.py --start 2011-1-1 --end 2012-1-1 -o dma.pickle
4. Error:
[2017-09-20 02:40:15.276265] WARNING: Loader: Refusing to download new benchmark data because a download succeeded at 2017-09-20 02:19:52.057758+00:00.
Traceback (most recent call last):
File "d:\Anaconda3\envs\python35\lib\site-packages\pandas\core\indexing.py", line 1395, in _has_valid_type
error()
File "d:\Anaconda3\envs\python35\lib\site-packages\pandas\core\indexing.py", line 1390, in error
(key, self.obj._get_axis_name(axis)))
KeyError: 'the label [2000-01-03 00:00:00+00:00] is not in the [index]'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "d:\Anaconda3\envs\python35\Scripts\zipline-script.py", line 11, in <module>
load_entry_point('zipline==1.1.1', 'console_scripts', 'zipline')()
File "d:\Anaconda3\envs\python35\lib\site-packages\click\core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "d:\Anaconda3\envs\python35\lib\site-packages\click\core.py", line 697, in main
rv = self.invoke(ctx)
File "d:\Anaconda3\envs\python35\lib\site-packages\click\core.py", line 1066, in invoke
return process_result(sub_ctx.command.invoke(sub_ctx))
File "d:\Anaconda3\envs\python35\lib\site-packages\click\core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "d:\Anaconda3\envs\python35\lib\site-packages\click\core.py", line 535, in invoke
return callback(*args, **kwargs)
File "d:\Anaconda3\envs\python35\lib\site-packages\zipline\__main__.py", line 97, in _
return f(*args, **kwargs)
File "d:\Anaconda3\envs\python35\lib\site-packages\click\decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "d:\Anaconda3\envs\python35\lib\site-packages\zipline\__main__.py", line 240, in run
environ=os.environ,
File "d:\Anaconda3\envs\python35\lib\site-packages\zipline\utils\run_algo.py", line 179, in _run
overwrite_sim_params=False,
File "d:\Anaconda3\envs\python35\lib\site-packages\zipline\algorithm.py", line 709, in run
for perf in self.get_generator():
File "d:\Anaconda3\envs\python35\lib\site-packages\zipline\gens\tradesimulation.py", line 230, in transform
handle_benchmark(normalize_date(dt))
File "d:\Anaconda3\envs\python35\lib\site-packages\zipline\gens\tradesimulation.py", line 190, in handle_benchmark
benchmark_source.get_value(date)
File "d:\Anaconda3\envs\python35\lib\site-packages\zipline\sources\benchmark_source.py", line 75, in get_value
return self._precalculated_series.loc[dt]
File "d:\Anaconda3\envs\python35\lib\site-packages\pandas\core\indexing.py", line 1296, in __getitem__
return self._getitem_axis(key, axis=0)
File "d:\Anaconda3\envs\python35\lib\site-packages\pandas\core\indexing.py", line 1466, in _getitem_axis
self._has_valid_type(key, axis)
File "d:\Anaconda3\envs\python35\lib\site-packages\pandas\core\indexing.py", line 1403, in _has_valid_type
error()
File "d:\Anaconda3\envs\python35\lib\site-packages\pandas\core\indexing.py", line 1390, in error
(key, self.obj._get_axis_name(axis)))
KeyError: 'the label [2000-01-03 00:00:00+00:00] is not in the [index]'
...
What steps have you taken to resolve this already?
...
Anything else?
...
Sincerely,
$ whoami