Sorry, but I am not strongly motivated to maintain this package because I no longer use these data formats. I don't recommend it for practical use because it has known problems such as memory leaks and segfaults.
Smuggler lets you load data sets from statistics packages as pandas DataFrames, using the ReadStat C library written by Evan Miller.
The supported formats are:
- Stata dta file
- SPSS sav file
- SPSS por file
- SAS sas7bdat file
pandas already supports some of these formats, so I want to compare Smuggler's performance with that of the pandas readers.
The SAS catalog format (.sas7bcat) is not supported, although the ReadStat library can read it.
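One rough way to make such a comparison is to time each reader on the same file. The sketch below is only an illustration: `time_reader` is a hypothetical helper that is not part of Smuggler, and it measures wall-clock time only, not memory use.

```python
import time


def time_reader(reader, path, repeats=3):
    """Return the best wall-clock time (in seconds) of reader(path).

    `reader` is any callable that takes a file path, for example
    pandas.read_stata or Smuggler's read_dta.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        reader(path)  # the loaded DataFrame is discarded; only time matters
        best = min(best, time.perf_counter() - start)
    return best


# Possible usage, assuming both packages are installed and that the
# Smuggler module is importable as `smuggler` (an assumption):
#
#   import pandas as pd
#   import smuggler
#   print(time_reader(pd.read_stata, "path/to/file"))
#   print(time_reader(smuggler.read_dta, "path/to/file"))
```

Taking the best of several runs reduces the influence of disk caching and other noise, but the first run on a cold cache may still differ considerably.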
$ python setup.py build install
- pandas
- numpy
- Stata dta file: read_dta("path/to/file")
- SPSS sav file: read_sav("path/to/file")
- SPSS por file: read_por("path/to/file")
- SAS sas7bdat file: read_sas7bdat("path/to/file")
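
As a minimal sketch of how the four readers above might be used together, the helper below picks a reader from the file extension. The function names `read_dta`, `read_sav`, `read_por` and `read_sas7bdat` come from the list above; the module name `smuggler` and the helpers `reader_name` and `load` are assumptions made for this example.

```python
import importlib
from pathlib import Path

# Map each supported file extension to the reader function listed above.
READERS = {
    ".dta": "read_dta",
    ".sav": "read_sav",
    ".por": "read_por",
    ".sas7bdat": "read_sas7bdat",
}


def reader_name(path):
    """Return the name of the reader function matching the file extension."""
    suffix = Path(path).suffix.lower()
    try:
        return READERS[suffix]
    except KeyError:
        raise ValueError("unsupported file format: %r" % suffix)


def load(path, module_name="smuggler"):
    """Load `path` as a pandas DataFrame with the matching reader.

    Assumption: the readers are exposed at the top level of a module
    named `smuggler`; adjust `module_name` if the package differs.
    """
    module = importlib.import_module(module_name)
    return getattr(module, reader_name(path))(str(path))
```

Calling `load("path/to/file.sav")` would then be equivalent to calling `read_sav("path/to/file.sav")` directly. A `.sas7bcat` file raises `ValueError`, since that format is not supported.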
- Write a test suite (so far I have only checked a few files).
- Support reading RDS and RData
- Support writing dta and sav
- Create a wheel for Windows