CP standardizer : TradingSymbol, SecurityExchangeName #12
Comments
Hi @ebadi
If we want "text" data, we would need to use the SEC financial statement and notes data set (https://www.sec.gov/data-research/financial-statement-notes-data-sets). However, that one is about 10 times larger. I might write a version 2 of the library at some point in the future that uses the larger dataset, but I haven't decided yet.

So, bottom line: there is no textual information in the SEC financial statement data set. However, the SEC does maintain a list with a CIK-to-symbol mapping:
There is also a Python library addressing this, but the last release was over 2 years ago:

About your footnote: if you see an "adsh" number in the "version" column, it means that the company is not using the official US-GAAP definition of the tag. It might have the same name as the official tag, but the company might use a somewhat different "definition" of it. That should be mentioned somewhere in the notes of a report. This is actually what the

Fun fact as a side note: the SEC actually tracks the "usage" of custom tags on a special page:
Thank you for another comprehensive response. I am currently using sec-cik-mapper, but it misses many records, e.g. unlisted companies (jadchaar/sec-cik-mapper#5). The current mapping includes 7,975 entries. I will try the list provided by the SEC, which appears to be more comprehensive with 12,084 lines. This might resolve my issue.
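For anyone following along, here is a minimal sketch of turning such a ticker list into a CIK lookup. It assumes the list is plain text with one `ticker<TAB>cik` pair per line; that layout, the function name `load_cik_to_tickers`, and the inline sample are assumptions for illustration, so check the real file before relying on it.

```python
import csv
import io
from collections import defaultdict

# Illustrative sample in the assumed "ticker<TAB>cik" layout; not the real file.
SAMPLE = "aapl\t320193\nmsft\t789019\ngoogl\t1652044\ngoog\t1652044\n"


def load_cik_to_tickers(handle):
    """Build a cik -> [TICKER, ...] lookup from tab-separated ticker/cik lines."""
    mapping = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank or malformed lines instead of failing on them.
        if len(row) != 2 or not row[1].strip().isdigit():
            continue
        ticker, cik = row[0].strip().upper(), int(row[1])
        mapping[cik].append(ticker)
    return dict(mapping)


cik_to_tickers = load_cik_to_tickers(io.StringIO(SAMPLE))
print(cik_to_tickers.get(1652044, []))
```

A list per CIK is used because one company can have several share classes, and therefore several symbols. To go from adsh to symbol, one would first map adsh to cik (the submissions data carries both) and then apply this lookup.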
Hello,
I tried using the code in the medium_intro_secfsds.ipynb notebook to extract the TradingSymbol and SecurityExchangeName from the cover page (CP) statements.

As you can see in my Jupyter notebook, these tags are present in pre_df. However, when I try to merge it with num_df to get their values, I am not successful. From the CP statements, I also find other tags, such as EntityCommonStockSharesOutstanding and EntityAddressStateOrProvince, interesting. I really hope that I won't need to write a new CP standardizer.

What is the simplest way to do a cik (or even adsh) to TradingSymbol, SecurityExchangeName lookup?
In general, it would be great if there were a function (similar to ZipCollector) that, without any standardization, returned the columns of data (tags) when we pass
Footnote
I also noticed that the version column in num_df has adsh values!
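As the reply in this thread explains, an adsh in the version column marks a tag that does not follow the official taxonomy definition. A minimal sketch of that check on plain rows is shown here; the sample rows, their placeholder adsh, and the helper name `is_custom_tag` are hypothetical, and only the `adsh`, `tag`, and `version` column names come from the data set.

```python
def is_custom_tag(row):
    """Return True when the row's version is the filing's own adsh."""
    return row["version"] == row["adsh"]


# Hypothetical rows for illustration only; the adsh is a placeholder value.
rows = [
    {"adsh": "0000000000-00-000000", "tag": "Assets", "version": "us-gaap/2023"},
    {"adsh": "0000000000-00-000000", "tag": "MyCustomTag", "version": "0000000000-00-000000"},
]

custom_tags = [row["tag"] for row in rows if is_custom_tag(row)]
print(custom_tags)
```

On a pandas frame with the same columns, the equivalent filter would be a comparison of the two columns, e.g. `num_df[num_df["version"] == num_df["adsh"]]`.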