tap-parquet

tap-parquet is a Singer tap for Parquet.

Built with the Meltano Tap SDK for Singer Taps.

This is a fork of the ae-nv variant, rebuilt on v0.39.1 of the Meltano SDK cookiecutter template.

About Parquet

Parquet is a portable, type-aware, columnar, compressed, splittable, and cloud-friendly format.

For more information on why Parquet is increasingly used in big data applications, see this comparison.

Configuration

Accepted Config Options

| Setting | Required | Default | Description               |
|---------|----------|---------|---------------------------|
| paths   | True     | None    | Paths to Parquet Datasets |

A full list of supported settings and capabilities for this tap is available by running:

tap-parquet --about
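
As an illustration of the `paths` setting, a config file could be created as sketched below. This is an assumption-laden example: the README does not state the type of `paths`, so a JSON array of strings is assumed here, and both the file name `config.json` and the dataset path are hypothetical placeholders. The output of `tap-parquet --about` is the authoritative description of the setting.

```shell
# Write a minimal config file for the tap.
# ASSUMPTION: `paths` accepts a JSON array of strings; verify the real type
# with `tap-parquet --about`.
# The dataset path is a hypothetical placeholder, not a file shipped with the tap.
cat > config.json <<'EOF'
{
  "paths": ["./data/example.parquet"]
}
EOF
```

The resulting file can then be passed wherever a command expects a config file, for example as the value of `--config`.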

Configure using environment variables

If `--config=ENV` is provided, this Singer tap automatically imports any environment variables defined in the `.env` file in the working directory. A config value is picked up whenever a matching environment variable is set, either in the terminal context or in the `.env` file.
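
A minimal sketch of the `.env` approach follows. It assumes the usual Meltano SDK naming convention, in which a setting maps to a variable named after the plugin and the setting in upper case (so `paths` would become `TAP_PARQUET_PATHS`); that name and the dataset path are assumptions, not values confirmed by this README.

```shell
# Create a .env file in the working directory.
# NOTE: this overwrites any existing .env file in the current directory.
# ASSUMPTION: the SDK reads the `paths` setting from TAP_PARQUET_PATHS.
# The dataset path is a hypothetical placeholder.
cat > .env <<'EOF'
TAP_PARQUET_PATHS=./data/example.parquet
EOF

# With the file in place, the tap can be told to read its settings from the
# environment instead of a JSON config file:
# tap-parquet --config=ENV --discover > ./catalog.json
```

The same variable could instead be exported in the terminal session, which avoids writing a file at all.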

Source Authentication and Authorization

Usage

You can easily run tap-parquet by itself or in a pipeline using Meltano.

Executing the Tap Directly

tap-parquet --version
tap-parquet --help
tap-parquet --config CONFIG --discover > ./catalog.json
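
Once a catalog has been discovered, a full sync can be sketched by piping the tap into a Singer target. The example below is an illustration only: it assumes a config file named `config.json` and that `target-jsonl` (the target used in the Meltano example in this README) is installed on the `PATH`. The commands run only if both executables are found.

```shell
CONFIG=config.json     # hypothetical config file containing the `paths` setting
CATALOG=catalog.json   # catalog written by the --discover command

# Run the sync only when both the tap and the target are available.
if command -v tap-parquet >/dev/null 2>&1 && command -v target-jsonl >/dev/null 2>&1; then
  # The tap writes Singer messages to stdout; the target reads them from stdin.
  tap-parquet --config "$CONFIG" --catalog "$CATALOG" | target-jsonl
else
  echo "tap-parquet or target-jsonl not found on PATH; skipping sync" >&2
fi
```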

Developer Resources

Follow these instructions to contribute to this project.

Initialize your Development Environment

pipx install poetry
poetry install

Create and Run Tests

Create tests within the tests subfolder and then run:

poetry run pytest

You can also test the tap-parquet CLI directly using poetry run:

poetry run tap-parquet --help

Testing with Meltano

Note: This tap will work in any Singer environment and does not require Meltano. Examples here are for convenience and to streamline end-to-end orchestration scenarios.

Next, install Meltano (if you haven't already) and any needed plugins:

# Install meltano
pipx install meltano
# Initialize meltano within this directory
cd tap-parquet
meltano install

Now you can test and orchestrate using Meltano:

# Test invocation:
meltano invoke tap-parquet --version
# OR run a test `elt` pipeline:
meltano elt tap-parquet target-jsonl

SDK Dev Guide

See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.