pat-cli

Overview

pat-cli is a tool for clustering logs based on their textual content. The tool uses a two-step process to achieve this:

Vectorization: In this step, each log statement is converted into a vector in an n-dimensional space. This way, we can treat logs just like points on a 2D graph and try to cluster them. The only difference is the dimension of the space, which is typically much larger than two.

A few algorithms can be used for this vectorization. The available algorithms are listed on the help page of the tool; one example is TF-IDF.

Usually, vectorization involves a sub-step called tokenization. In this step, the logs are broken down into a set of tokens. For example, a log like "Writing output to file" can be broken down into the following tokens: "writing", "output", "to", "file". This allows the tool to understand the textual content of the logs. For example, in the TF-IDF algorithm, the frequencies of the words (tokens) making up each log statement are used to determine how important each word in the log is.
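To make tokenization and TF-IDF weighting concrete, here is a minimal sketch using only the Python standard library. It is not pat-cli's actual implementation: the function names `tokenize` and `tfidf_vectors`, the regular expression used to split tokens, and the plain `tf * log(N / df)` weighting are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(log):
    """Lowercase a log line and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", log.lower())


def tfidf_vectors(logs):
    """Return (vocabulary, vectors) using a plain TF-IDF weighting."""
    tokenized = [tokenize(log) for log in logs]
    # One vector dimension per distinct token across all logs.
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of logs that contain each token.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid dividing by zero on empty logs
        vectors.append([
            (counts[tok] / total) * math.log(n_docs / doc_freq[tok])
            for tok in vocabulary
        ])
    return vocabulary, vectors


logs = [
    "Writing output to file",
    "Writing output to socket",
    "Connection refused by host",
]
vocabulary, vectors = tfidf_vectors(logs)
```

In this sketch, a token that appears in only one log (such as "file") receives a larger weight than one shared by several logs (such as "writing"), which is the intuition behind using token frequencies to judge importance.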

Clustering: In this step, a clustering algorithm such as K-Means or Birch groups the vectorized logs into clusters whose members are likely to be similar to each other.
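The clustering step can be illustrated with a bare-bones K-Means loop over already-vectorized logs. This is a sketch under stated assumptions, not the code pat-cli runs: the `kmeans` function, its `iterations` parameter, and the choice to seed centroids with the first `k` points (for determinism) are all hypothetical, and the 2D sample points stand in for real log vectors.

```python
import math


def kmeans(points, k, iterations=10):
    """Cluster points with a basic K-Means loop; returns one label per point."""
    # Assumption: seed the centroids with the first k points so the
    # sketch is deterministic. Real implementations seed more carefully.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Two visibly separate groups of 2D points standing in for log vectors.
points = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
labels = kmeans(points, k=2)
```

Points that share a label belong to the same cluster; with real log vectors, each cluster would collect log statements with similar wording.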

Usage

pat-cli is available on PyPI, so you can install it with pip:

pip install pat-cli

The pat command will then be available in your environment. To see how to use it, run:

pat --help

Issue

If you run into any issues, feel free to create a GitHub Issue in this repository and I will try to address it or respond as soon as possible.

Contribution

Contributions are welcome. If you have an interesting addition to the tool, be it another vectorization or clustering algorithm, feel free to open a PR.
