sec-parser-test-data

This repository complements the sec-parser project by storing raw SEC EDGAR documents for various quality assurance tasks. These tasks include end-to-end testing and evaluation tests. For guidelines on contributing, please consult the Contribution Guide.

How This Repository Helps

- End-to-End Testing: Maintains manually reviewed snapshots of parser outputs. These snapshots serve as a benchmark for validating the parser's latest outputs, ensuring that it accurately processes a selected subset of documents.
- Generalization Testing: Keeps a wide range of SEC documents on hand. Unlike tests that use only a few hand-picked documents, these tests run the parser on a much larger and more diverse set of reports to check whether it handles different types of documents effectively. For extra confidence in the test results, they may also be compared with data from third-party services.
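
To illustrate the snapshot idea behind end-to-end testing, here is a minimal sketch in Python. It is not the sec-parser test harness: the function name diff_against_snapshot, the JSON storage format, and the element fields used in the usage note are all assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def diff_against_snapshot(
    actual: List[Dict[str, Any]], snapshot_path: Path
) -> List[str]:
    """Compare fresh parser output with a reviewed snapshot.

    Assumes (hypothetically) that the snapshot is a JSON array of
    objects, one per parsed element. Returns a list of human-readable
    mismatch descriptions; an empty list means the output matches.
    """
    expected = json.loads(snapshot_path.read_text(encoding="utf-8"))
    problems: List[str] = []

    # A different element count is reported once, up front.
    if len(actual) != len(expected):
        problems.append(
            f"element count differs: expected {len(expected)}, got {len(actual)}"
        )

    # Compare the elements that both sides have, position by position.
    for index, (exp, act) in enumerate(zip(expected, actual)):
        if exp != act:
            problems.append(
                f"element {index} differs: expected {exp!r}, got {act!r}"
            )
    return problems
```

A test would call diff_against_snapshot with the parser's current output and the path of a reviewed snapshot, then fail if the returned list is not empty. Returning descriptions instead of a single boolean makes it easier to see what changed when a snapshot needs to be reviewed again.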

Storing this data separately keeps the main code repository clean and manageable, which makes it easier to maintain.

How To Add New Items

Option A. Auto-download filings

Add new lines to 00_report-list.csv, then run 00_download-reports-from-report-list.ipynb.
- Values in the first CSV column (comment) are ignored.
- The query column value must be either a ticker such as AAPL, which downloads the latest 10-Q report (this works only if the folder doesn't already exist), or a ticker followed by an accession number, such as AAPL/0000320193-23-000077, which downloads a specific report.
Add the new reports to the .yaml files in the sec-parser repository (e.g. e2e_test_data.yaml) to include them in the tests.
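
The two accepted query formats can be sketched in Python as follows. This is an illustrative sketch, not the code in the download notebook: the names ReportQuery, parse_query, and read_queries are hypothetical, and the presence of a header row naming the comment and query columns is an assumption. The accession number pattern (10, 2, and 6 digits separated by hyphens) follows the example given above.

```python
import csv
import io
import re
from typing import List, NamedTuple, Optional


class ReportQuery(NamedTuple):
    ticker: str
    # None means "download the latest 10-Q report".
    accession_number: Optional[str]


# Matches accession numbers shaped like 0000320193-23-000077.
_ACCESSION_RE = re.compile(r"\d{10}-\d{2}-\d{6}")


def parse_query(query: str) -> ReportQuery:
    """Parse 'AAPL' or 'AAPL/0000320193-23-000077' into its parts."""
    ticker, separator, accession = query.strip().partition("/")
    if not ticker:
        raise ValueError(f"missing ticker in query: {query!r}")
    if not separator:
        # Ticker only: no specific report was requested.
        return ReportQuery(ticker, None)
    if not _ACCESSION_RE.fullmatch(accession):
        raise ValueError(f"malformed accession number in query: {query!r}")
    return ReportQuery(ticker, accession)


def read_queries(csv_text: str) -> List[ReportQuery]:
    """Read the query column; the comment column is simply not used.

    Assumes (hypothetically) a header row with 'comment' and 'query'.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [parse_query(row["query"]) for row in reader]


if __name__ == "__main__":
    sample = (
        "comment,query\n"
        "latest Apple 10-Q,AAPL\n"
        "pinned filing,AAPL/0000320193-23-000077\n"
    )
    for parsed in read_queries(sample):
        print(parsed)
```

Validating the query before any download starts means a typo in the CSV is caught early with a clear error message.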

Option B. Copy paste filings

Download the filings yourself and copy them into the repo, keeping the correct format.

Contributing

When submitting changes, please include the git hash of the sec-parser repository in your commit message or pull request for version tracking.

Feedback

For questions or discussions, use Discussions. For issues, open an issue.

