tsfresh

This repository contains the TSFRESH Python package. The abbreviation stands for

"Time Series Feature extraction based on scalable hypothesis tests".

The package contains many feature extraction methods and a robust feature selection algorithm.

Spend less time on feature engineering

Data scientists often spend most of their time either cleaning data or building features. While we cannot change the former, the latter can be automated. TSFRESH frees up the time you would otherwise spend building features by extracting them automatically. Hence, you have more time to study the newest deep learning paper, read Hacker News or build better models.

Automatic extraction of 100s of features

TSFRESH automatically extracts 100s of features from time series. Those features describe basic characteristics of the time series, such as the number of peaks or the average or maximal value, as well as more complex characteristics, such as the time reversal symmetry statistic.

The set of features can then be used to construct statistical or machine learning models on the time series, for example for regression or classification tasks.
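To make the idea of a feature concrete, here is a minimal sketch in plain Python that maps each time series to a handful of named numbers. It does not use TSFRESH: the function names, the toy data and the exact formula chosen for the time reversal statistic are assumptions made for illustration, and the package's own feature calculators may be defined differently.

```python
from statistics import mean


def number_of_peaks(series):
    """Count strict local maxima: points larger than both direct neighbours."""
    return sum(
        1
        for prev, cur, nxt in zip(series, series[1:], series[2:])
        if cur > prev and cur > nxt
    )


def time_reversal_asymmetry(series, lag=1):
    """Mean of x[i+2*lag]**2 * x[i+lag] - x[i+lag] * x[i]**2 over all valid i.

    Assumed definition for this sketch; it is zero for a constant series.
    """
    n = len(series) - 2 * lag
    if n <= 0:
        return 0.0
    return mean(
        series[i + 2 * lag] ** 2 * series[i + lag] - series[i + lag] * series[i] ** 2
        for i in range(n)
    )


def extract_basic_features(series):
    """Map one time series (a list of numbers) to a dict of named features."""
    return {
        "mean": mean(series),
        "maximum": max(series),
        "number_of_peaks": number_of_peaks(series),
        "time_reversal_asymmetry": time_reversal_asymmetry(series),
    }


# Toy input: two short series keyed by a made-up identifier.
toy_series = {
    "sensor_a": [1, 3, 2, 5, 4, 6],
    "sensor_b": [2, 2, 2, 2, 2, 2],
}

# One row of features per series, ready to feed into a model.
feature_table = {name: extract_basic_features(s) for name, s in toy_series.items()}
```

Each entry of `feature_table` is one row of a feature matrix. TSFRESH produces the same kind of table, only with far more columns.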

Forget irrelevant features

Time series often contain noise, redundancies or irrelevant information. As a result, most of the extracted features will not be useful for the machine learning task at hand.

To avoid extracting irrelevant features, the TSFRESH package has a built-in filtering procedure. This filtering procedure evaluates the explanatory power and importance of each feature for the regression or classification task at hand.

It is based on the well-developed theory of hypothesis testing and uses a multiple testing procedure. As a result, the filtering process mathematically controls the percentage of irrelevant extracted features.
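The following sketch shows how a multiple testing procedure can turn one p-value per feature into a filtered feature set. The text above does not name the procedure, so the Benjamini-Yekutieli step-up rule used here is an assumption, as are the function name, the `fdr_level` parameter and the made-up p-values. In practice the p-values would come from testing each feature against the target.

```python
def benjamini_yekutieli(p_values, fdr_level=0.05):
    """Return the names of the features kept at the given false discovery rate.

    p_values maps a feature name to the p-value of its individual relevance test.
    """
    m = len(p_values)
    if m == 0:
        return set()
    # Correction factor c(m) = 1 + 1/2 + ... + 1/m, valid under arbitrary dependence.
    c_m = sum(1.0 / k for k in range(1, m + 1))
    # Sort features from most to least significant.
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    # Find the largest rank k whose p-value is at most k / (m * c(m)) * fdr_level.
    cutoff_rank = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= rank / (m * c_m) * fdr_level:
            cutoff_rank = rank
    # Keep every feature up to and including that rank.
    return {name for name, _ in ranked[:cutoff_rank]}


# Made-up p-values for illustration only.
toy_p_values = {
    "mean": 0.001,
    "maximum": 0.004,
    "number_of_peaks": 0.03,
    "variance": 0.2,
    "time_reversal_asymmetry": 0.6,
}

relevant = benjamini_yekutieli(toy_p_values, fdr_level=0.05)
```

Note that the threshold depends on the rank of each p-value, so a feature is judged relative to all other candidates rather than against a fixed cut-off.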

The algorithm is described in the following paper:

Christ, M., Kempa-Liehr, A.W. and Feindt, M. (2016).
Distributed and parallel time series feature extraction for industrial big data applications.
ArXiv e-print 1610.07717, https://arxiv.org/abs/1610.07717.

Advantages of tsfresh

TSFRESH has several selling points, for example:

it is field tested
it is unit tested
the filtering process is statistically/mathematically correct
it has comprehensive documentation
it is compatible with sklearn, pandas and numpy
it allows anyone to easily add their own favorite features

Next steps

If you are interested in the technical workings, see our comprehensive Read-The-Docs documentation at http://tsfresh.readthedocs.io.

The algorithm, especially the filtering part, is also described in the paper mentioned above.

If you have any questions or feedback, you can find the developers in the gitter chatroom.
