Parquet files have many advantages over text-based formats like CSV, but one drawback: they are not human-readable.
This image packages the parquet-cli command-line tool to help fill this gap.
Please refer to the parquet-cli documentation for available commands and flags.
First, pull the image from GHCR:
apptainer pull oras://ghcr.io/imageomics/parquet-cli:latest
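A published Docker image is coming soon. Until then, the docker commands in this document assume an image tagged parquet-cli:latest that you have built locally.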
Move the image to your preferred location:
mv parquet-cli_latest.sif /preferred/path/parquet-cli_latest.sif
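The pull and move steps can also be combined, because apptainer pull accepts an optional output file before the URI. A minimal sketch, assuming a hypothetical helper named pull_parquet_sif (the function name and its single argument are illustrative and not part of the image):

```shell
# Hypothetical helper: pull the image straight into a chosen directory.
# The function name and its argument are illustrative assumptions.
pull_parquet_sif() {
  dest_dir="$1"
  # Create the destination directory if it does not exist yet.
  mkdir -p "$dest_dir" || return 1
  # apptainer pull takes an optional output file before the URI.
  apptainer pull "$dest_dir/parquet-cli_latest.sif" \
    oras://ghcr.io/imageomics/parquet-cli:latest
}
```

Calling pull_parquet_sif /preferred/path would leave the image at /preferred/path/parquet-cli_latest.sif, so the separate mv step is not needed.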
Run the container to inspect your Parquet file:
apptainer run /preferred/path/parquet-cli_latest.sif <flags-and-commands> /path/to/file.parquet
or
docker run --rm -v "$(pwd)":/data -w /data parquet-cli:latest <flags-and-commands> file.parquet # path is relative to the current directory, which is mounted at /data
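As a concrete illustration of the run step, the sketch below prints a file's schema, its footer metadata, and its first few records. It assumes the image wraps the Apache parquet-cli tool, whose subcommands include schema, meta, and head; check the parquet-cli documentation if the commands differ. PARQUET_RUN and inspect_parquet are hypothetical names introduced only for this example.

```shell
# Command prefix used to start the container. PARQUET_RUN is a hypothetical
# variable for this sketch; override it to use docker instead of apptainer.
PARQUET_RUN="${PARQUET_RUN:-apptainer run /preferred/path/parquet-cli_latest.sif}"

# Hypothetical helper: run three common inspections on one Parquet file.
# Assumes the Apache parquet-cli subcommands schema, meta, and head.
inspect_parquet() {
  file="$1"
  # $PARQUET_RUN is intentionally unquoted so it splits into command words.
  $PARQUET_RUN schema "$file"      # column names and types
  $PARQUET_RUN meta "$file"        # row groups, encodings, sizes
  $PARQUET_RUN head -n 5 "$file"   # first five records
}
```

With apptainer, inspect_parquet /path/to/file.parquet should work as is. With the docker form, the file has to live under the mounted directory and be given as a path relative to it.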
For simpler usage, you can add an alias to your ~/.bashrc file:
alias parquet='apptainer --silent run --cleanenv /preferred/path/parquet-cli_latest.sif'
or
alias parquet='docker run --rm -v "$(pwd)":/data -w /data parquet-cli:latest'
Then you can use the parquet command to inspect Parquet files:
parquet <flags-and-commands> /path/to/file.parquet # with either apptainer or docker
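If the same ~/.bashrc is shared between machines that have apptainer and machines that only have docker, a shell function can stand in for the alias and choose the runtime at call time. This is a sketch under assumptions: PARQUET_SIF and parquet_runtime are hypothetical names, and the docker branch expects a locally available parquet-cli:latest image.

```shell
# Location of the Apptainer image. PARQUET_SIF is a hypothetical variable
# introduced for this sketch.
PARQUET_SIF="${PARQUET_SIF:-/preferred/path/parquet-cli_latest.sif}"

# Print the runtime to use: apptainer when the .sif file exists and
# apptainer is installed, otherwise docker.
parquet_runtime() {
  if [ -f "$PARQUET_SIF" ] && command -v apptainer >/dev/null 2>&1; then
    echo apptainer
  else
    echo docker
  fi
}

# Drop-in replacement for the alias; all arguments go to parquet-cli.
parquet() {
  case "$(parquet_runtime)" in
    apptainer)
      apptainer --silent run --cleanenv "$PARQUET_SIF" "$@" ;;
    docker)
      docker run --rm -v "$(pwd)":/data -w /data parquet-cli:latest "$@" ;;
  esac
}
```

A function is used here rather than an alias because it can branch on the environment and passes its arguments through with "$@", which keeps paths containing spaces intact.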
To build the image from the definition file, run the following command:
apptainer build <my-image>.sif parquet-cli.def
or
docker build -t parquet-cli:latest . # assumes a Dockerfile in the current directory