Trade Classification With Python

Documentation ✒️: https://karelze.github.io/tclf/

Source Code 🐍: https://github.com/KarelZe/tclf

tclf is a scikit-learn-compatible implementation of trade classification algorithms that classify financial market transactions into buyer- and seller-initiated trades.

The key features are:

Easy: Easy to use and learn.
Sklearn-compatible: Compatible with the sklearn API. Use sklearn metrics and visualizations.
Feature complete: Wide range of supported algorithms. Use the algorithms individually or stack them like LEGO blocks.

Installation

pip

pip install tclf

uv⚡

uv add tclf

Supported Algorithms

(Rev.) CLNV rule¹
(Rev.) EMO rule²
(Rev.) LR algorithm³
(Rev.) Tick test⁴
Depth rule⁵
Quote rule⁶
Tradesize rule⁵

For a primer on trade classification rules visit the rules section 🆕 in our docs.
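To give a feel for what one of the rules listed above does, here is a minimal sketch of the tick test in plain NumPy. It is an illustration only and not the tclf implementation: the function name `tick_test` and the sample prices are assumptions made for this example. A trade on an uptick is labelled a buy (1), a trade on a downtick a sell (-1), and a trade at an unchanged price inherits the direction of the last price change.

```python
import numpy as np


def tick_test(trade_price):
    """Label trades by the direction of the most recent price change.

    Hypothetical helper for illustration; not part of tclf.
    Returns 1 (buy), -1 (sell), or np.nan where no prior change exists.
    """
    trade_price = np.asarray(trade_price, dtype=float)
    labels = np.full(len(trade_price), np.nan)
    last_direction = np.nan
    for i in range(1, len(trade_price)):
        change = trade_price[i] - trade_price[i - 1]
        # A zero change carries no information, so keep the last direction.
        if change != 0:
            last_direction = np.sign(change)
        labels[i] = last_direction
    return labels


# Made-up prices: uptick, zero tick, downtick, zero tick, uptick.
labels = tick_test([10.0, 10.5, 10.5, 10.0, 10.0, 11.0])
```

The first trade stays unclassified because there is no earlier price to compare against.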

Minimal Example

Let's start simple: classify all trades by the quote rule, and randomly classify any trades that the quote rule cannot resolve.

Create a main.py with:

import numpy as np
import pandas as pd

from tclf.classical_classifier import ClassicalClassifier

X = pd.DataFrame(
    [
        [1.5, 1, 3],
        [2.5, 1, 3],
        [1.5, 3, 1],
        [2.5, 3, 1],
        [1, np.nan, 1],
        [3, np.nan, np.nan],
    ],
    columns=["trade_price", "bid_ex", "ask_ex"],
)

clf = ClassicalClassifier(layers=[("quote", "ex")], strategy="random")
clf.fit(X)
probs = clf.predict_proba(X)
print(probs)

Run your script with:

$ python main.py

In this example, input data is available as a pd.DataFrame with columns conforming to our naming conventions.

The parameter layers=[("quote", "ex")] sets the quote rule at the exchange level and strategy="random" specifies the fallback strategy for unclassified trades.
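To make the quote rule concrete, the following sketch reimplements its core idea in plain NumPy on the data from the example above. It is an assumption-laden illustration rather than the tclf code: `quote_rule` is a hypothetical helper. A trade above the quote midpoint is labelled a buy (1), a trade below it a sell (-1), and anything else, including trades with missing quotes, stays unclassified.

```python
import numpy as np


def quote_rule(trade_price, bid, ask):
    """Compare the trade price with the quote midpoint.

    Hypothetical helper for illustration; not part of tclf.
    Returns 1 (buy), -1 (sell), or np.nan where the rule cannot decide.
    """
    trade_price = np.asarray(trade_price, dtype=float)
    mid = 0.5 * (np.asarray(bid, dtype=float) + np.asarray(ask, dtype=float))
    # Comparisons against NaN are False, so trades with missing quotes
    # (and trades exactly at the midpoint) remain np.nan.
    return np.where(trade_price > mid, 1.0, np.where(trade_price < mid, -1.0, np.nan))


# Same values as the DataFrame in the minimal example.
trade_price = [1.5, 2.5, 1.5, 2.5, 1.0, 3.0]
bid_ex = [1, 1, 3, 3, np.nan, np.nan]
ask_ex = [3, 3, 1, 1, 1, np.nan]

labels = quote_rule(trade_price, bid_ex, ask_ex)
```

The last two trades have incomplete quotes and come back as `np.nan`. These are the trades a fallback such as strategy="random" would have to handle.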

Advanced Example

Often it is desirable to classify on both exchange-level data and NBBO data. Also, data might only be available as a NumPy array. So let's extend the previous example by applying the quote rule at the exchange level, then at the NBBO level, and classifying all remaining trades randomly.

import numpy as np
from sklearn.metrics import accuracy_score

from tclf.classical_classifier import ClassicalClassifier

X = np.array(
    [
        [1.5, 1, 3, 2, 2.5],
        [2.5, 1, 3, 1, 3],
        [1.5, 3, 1, 1, 3],
        [2.5, 3, 1, 1, 3],
        [1, np.nan, 1, 1, 3],
        [3, np.nan, np.nan, 1, 3],
    ]
)
y_true = np.array([-1, 1, 1, -1, -1, 1])
features = ["trade_price", "bid_ex", "ask_ex", "bid_best", "ask_best"]

clf = ClassicalClassifier(
    layers=[("quote", "ex"), ("quote", "best")], strategy="random", features=features
)
clf.fit(X)
acc = accuracy_score(y_true, clf.predict(X))

In this example, input data is available as a NumPy array with both exchange ("ex") and NBBO ("best") data. We set the layers parameter to layers=[("quote", "ex"), ("quote", "best")] to classify trades first on subset "ex" and then the remaining trades on subset "best". Additionally, we have to set ClassicalClassifier(..., features=features) to pass column information to the classifier.

Like before, column/feature names must follow our naming conventions.
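The layering idea can be sketched without the library: each layer only touches trades that earlier layers left open, and a random draw fills whatever remains. The code below is a simplified illustration under that assumption, not the tclf implementation. `quote_rule`, `classify_layered`, and the `seed` argument are hypothetical names introduced for this example.

```python
import numpy as np


def quote_rule(trade_price, bid, ask):
    """Return 1 above the midpoint, -1 below it, np.nan otherwise."""
    mid = 0.5 * (bid + ask)
    return np.where(trade_price > mid, 1.0, np.where(trade_price < mid, -1.0, np.nan))


def classify_layered(trade_price, quote_layers, seed=None):
    """Apply the quote rule layer by layer, then fill the rest randomly.

    Hypothetical helper for illustration; not part of tclf.
    """
    labels = np.full(trade_price.shape, np.nan)
    for bid, ask in quote_layers:
        open_mask = np.isnan(labels)
        # Only trades left unclassified by earlier layers are updated.
        labels[open_mask] = quote_rule(trade_price, bid, ask)[open_mask]
    # Random fallback for trades that no layer could classify.
    open_mask = np.isnan(labels)
    rng = np.random.default_rng(seed)
    labels[open_mask] = rng.choice([-1.0, 1.0], size=int(open_mask.sum()))
    return labels.astype(int)


# Same array as in the advanced example; columns follow `features`.
X = np.array(
    [
        [1.5, 1, 3, 2, 2.5],
        [2.5, 1, 3, 1, 3],
        [1.5, 3, 1, 1, 3],
        [2.5, 3, 1, 1, 3],
        [1, np.nan, 1, 1, 3],
        [3, np.nan, np.nan, 1, 3],
    ]
)
trade_price, bid_ex, ask_ex, bid_best, ask_best = X.T
labels = classify_layered(
    trade_price, [(bid_ex, ask_ex), (bid_best, ask_best)], seed=0
)
```

In this sketch the last two trades, which lack complete exchange quotes, are resolved by the second layer, so the random fallback never has to fire for this data.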

Other Examples

For more practical examples, see our examples section.

Development

We are using tox with uv for development.

tox -e lint
tox -e format
tox -e test
tox -e build

Citation

If you are using the package in publications, please cite as:

@software{bilz_tclf_2023,
    author = {Bilz, Markus},
    license = {BSD 3},
    month = nov,
    title = {{tclf} -- trade classification with python},
    url = {https://github.com/KarelZe/tclf},
    version = {0.0.1},
    year = {2023}
}

Footnotes

¹ Chakrabarty, B., Li, B., Nguyen, V., & Van Ness, R. A. (2007). Trade classification algorithms for electronic communications network trades. Journal of Banking & Finance, 31(12), 3806–3821. https://doi.org/10.1016/j.jbankfin.2007.03.003
² Ellis, K., Michaely, R., & O’Hara, M. (2000). The accuracy of trade classification rules: Evidence from Nasdaq. The Journal of Financial and Quantitative Analysis, 35(4), 529–551. https://doi.org/10.2307/2676254
³ Lee, C., & Ready, M. J. (1991). Inferring trade direction from intraday data. The Journal of Finance, 46(2), 733–746. https://doi.org/10.1111/j.1540-6261.1991.tb02683.x
⁴ Hasbrouck, J. (2009). Trading costs and returns for U.S. equities: Estimating effective costs from daily data. The Journal of Finance, 64(3), 1445–1477. https://doi.org/10.1111/j.1540-6261.2009.01469.x
⁵ Grauer, C., Schuster, P., & Uhrig-Homburg, M. (2023). Option trade classification. https://doi.org/10.2139/ssrn.4098475
⁶ Harris, L. (1989). A day-end transaction price anomaly. The Journal of Financial and Quantitative Analysis, 24(1), 29. https://doi.org/10.2307/2330746
