Plai

Plai is a domain-specific language (DSL) for building data manipulation pipelines, with a focus on data treatment, validation, and a simpler syntax. It uses pandas as its data manipulation engine, so it is intended for small datasets (pandas operates on data held in memory).

Installation

The package can be installed using pip:

pip install plai

Examples

Example of a pipeline with basic data manipulation using Plai:

df = read_file('issues.csv')

pipeline(df) as 'gh_pct_issues_by_language.csv':
    $.groupby(.name, as_index=False).sum()
    (.count/.count.sum()) * 100 as pct
    {.name, .count, .pct}
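
For readers who know pandas, the pipeline above can be read roughly as the sketch below. This is an interpretation, not Plai's actual implementation: it assumes `$` refers to the dataframe, `.name` refers to a column, `as pct` assigns a new column, and `{...}` selects columns. The sample data and the function name `pct_by_name` are hypothetical, and writing the result to the CSV file named in the pipeline header is left out.

```python
import pandas as pd


def pct_by_name(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical pandas reading of the Plai pipeline above."""
    # $.groupby(.name, as_index=False).sum()
    grouped = df.groupby("name", as_index=False).sum()
    # (.count/.count.sum()) * 100 as pct
    grouped["pct"] = (grouped["count"] / grouped["count"].sum()) * 100
    # {.name, .count, .pct}
    return grouped[["name", "count", "pct"]]


# Made-up rows standing in for issues.csv
sample = pd.DataFrame({
    "name": ["Python", "Python", "Go", "Rust"],
    "year": [2020, 2021, 2020, 2020],
    "quarter": [1, 1, 1, 1],
    "count": [30, 20, 25, 25],
})

result = pct_by_name(sample)
print(result)
```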

To validate the dataframes being manipulated, define dictionaries that map each column to a type, then apply them to a dataframe or a pipeline. When applied to a dataframe, Plai validates its schema against the dictionary, that is, it checks that each column is present and has the declared data type. When applied to a pipeline, the resulting dataframe is validated. For example:

input_type = {
    'name': 'str',
    'year': 'int',
    'quarter': 'int',
    'count': 'int'
}

output_type = {
    'name': 'str',
    'count': 'int',
    'pct': 'float'
}

input_type::df = read_file('issues.csv')

output_type::pipeline(df) as 'gh_pct_issues_by_language.csv':
    $.groupby(.name, as_index=False).sum()
    (.count/.count.sum()) * 100 as pct
    {.name, .count, .pct}
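
To make the idea of schema validation concrete, here is a minimal sketch of a column presence and data type check written directly against pandas. It is not how Plai implements the `::` operator; the function `validate_schema`, the `CHECKS` table, and the sample data are hypothetical, and the sketch assumes that columns not listed in the dictionary are ignored.

```python
import pandas as pd
from pandas.api import types as pdt

# Map the type names used in the dictionaries to pandas dtype checks
CHECKS = {
    "str": pdt.is_string_dtype,
    "int": pdt.is_integer_dtype,
    "float": pdt.is_float_dtype,
}


def validate_schema(df: pd.DataFrame, schema: dict) -> list:
    """Return a list of problems; an empty list means the schema matches."""
    problems = []
    for column, type_name in schema.items():
        if column not in df.columns:
            problems.append(f"missing column: {column}")
        elif not CHECKS[type_name](df[column]):
            problems.append(
                f"column {column}: expected {type_name}, got {df[column].dtype}"
            )
    return problems


output_type = {"name": "str", "count": "int", "pct": "float"}

# Made-up result dataframe
good = pd.DataFrame({"name": ["Go"], "count": [25], "pct": [100.0]})
bad = good.drop(columns=["pct"])

print(validate_schema(good, output_type))
print(validate_schema(bad, output_type))
```

Returning a list of problems instead of raising on the first one is a design choice of this sketch, so that all mismatches can be reported together.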

Development

Install the dependencies by running the following command in the root folder of the project:

pip install -r requirements-dev.txt

To run all the tests, execute:

pytest tests

To run a specific test, execute:

# For a specific test file
pytest tests/test_grammar.py

# For a specific test class
pytest tests/test_grammar.py::TestBasicTokens

# For a specific test method
pytest tests/test_grammar.py::TestBasicTokens::test_token_number

To run the interactive terminal, execute the following in the root folder:

python -m plai

To execute the code from a file:

python -m plai file.plai
