parquet-lite

Lightweight Java library for reading and writing Apache Parquet files without Hadoop dependencies.

Features

Read and write Parquet files with a simple API
No Hadoop dependency tree: the Hadoop classes Parquet needs are included as minimal stubs
Write to File, OutputStream, or any OutputFile implementation
Configurable compression codecs

Supported types

Primitive	Logical Type	Java Type	Notes
INT32	—	`int`
INT64	—	`long`
INT64	TIMESTAMP(NANOS, UTC)	`long`	Nanos since epoch, compatible with QuestDB `now_ns()`
FLOAT	—	`float`
DOUBLE	—	`double`
BOOLEAN	—	`boolean`
BINARY	STRING	`String`
BINARY	JSON	`String`
BINARY	ENUM	`String`
FIXED_LEN_BYTE_ARRAY(16)	UUID	`java.util.UUID`
FIXED_LEN_BYTE_ARRAY	DECIMAL	`java.math.BigDecimal`	Configurable precision/scale
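
The two FIXED_LEN_BYTE_ARRAY rows follow the Parquet format's layout: a UUID is stored as 16 big-endian bytes, and a DECIMAL as the unscaled value in big-endian two's complement, sign-extended to the column's fixed length. The sketch below shows that layout using only the JDK. The class and method names are hypothetical and are not part of parquet-lite; whether the library performs the conversion exactly this way internally is an assumption.

```java
import java.math.BigDecimal;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.UUID;

// Hypothetical helper, not part of parquet-lite: illustrates the byte layout
// the Parquet format defines for UUID and fixed-length DECIMAL values.
public final class FixedLenEncoding {
    private FixedLenEncoding() {}

    /** UUID as 16 bytes: most significant long first (ByteBuffer is big-endian by default). */
    public static byte[] uuidToBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    /** Unscaled two's-complement value, sign-extended on the left to {@code length} bytes. */
    public static byte[] decimalToBytes(BigDecimal value, int scale, int length) {
        // setScale without a rounding mode throws ArithmeticException if digits would be lost.
        byte[] unscaled = value.setScale(scale).unscaledValue().toByteArray();
        if (unscaled.length > length) {
            throw new IllegalArgumentException("value does not fit in " + length + " bytes");
        }
        byte[] out = new byte[length];
        // Negative values are padded with 0xFF, non-negative values with 0x00.
        byte pad = (byte) (unscaled[0] < 0 ? 0xFF : 0x00);
        Arrays.fill(out, 0, length - unscaled.length, pad);
        System.arraycopy(unscaled, 0, out, length - unscaled.length, unscaled.length);
        return out;
    }
}
```

A 16-byte column holds any decimal of up to 38 digits of precision, so it comfortably covers a precision of 18.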

Supported compression codecs

Codec	Status	Library
UNCOMPRESSED	Supported	Built-in
SNAPPY	Supported	xerial-snappy (no Hadoop)
ZSTD	Supported	zstd-jni (no Hadoop)
GZIP	Not supported	Requires hadoop-common
LZ4	Not supported	Requires hadoop-common

Usage

Writing

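// ParquetWriter and Dehydrator come from this library; the other types use their usual Parquet and Hadoop package names.
import org.apache.hadoop.conf.Configuration;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName;
import org.apache.parquet.schema.Types;

MessageType schema = new MessageType("ticker",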
    Types.required(PrimitiveTypeName.INT64).named("t"),
    Types.required(PrimitiveTypeName.DOUBLE).named("cls"));

Dehydrator<Tick> dehydrator = (tick, writer) -> {
    writer.write("t", tick.timestamp());
    writer.write("cls", tick.close());
};

// Default codec (SNAPPY)
try (ParquetWriter<Tick> writer = ParquetWriter.writeFile(schema, file, dehydrator)) {
    writer.write(tick);
}

// Explicit codec
try (ParquetWriter<Tick> writer = ParquetWriter.writeFile(schema, file, dehydrator,
        CompressionCodecName.ZSTD)) {
    writer.write(tick);
}

// Write to OutputStream
try (ParquetWriter<Tick> writer = ParquetWriter.writeOutputStream(schema, outputStream,
        dehydrator, CompressionCodecName.ZSTD)) {
    writer.write(tick);
}

// Write with custom compression level (e.g. ZSTD max)
Configuration conf = new Configuration(false);
conf.setInt("parquet.compression.codec.zstd.level", 22);
try (ParquetWriter<Tick> writer = ParquetWriter.writeOutputStream(schema, outputStream,
        dehydrator, CompressionCodecName.ZSTD, conf)) {
    writer.write(tick);
}

Writing with extended types

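// Assumes: import static org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.*;
//          import org.apache.parquet.schema.LogicalTypeAnnotation;
MessageType schema = new MessageType("trades",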
    Types.required(INT64)
        .as(LogicalTypeAnnotation.timestampType(true, LogicalTypeAnnotation.TimeUnit.NANOS))
        .named("ts_ns"),
    Types.required(FIXED_LEN_BYTE_ARRAY).length(16)
        .as(LogicalTypeAnnotation.uuidType()).named("trade_id"),
    Types.required(FIXED_LEN_BYTE_ARRAY).length(16)
        // decimalType takes (scale, precision): scale 2, precision 18
        .as(LogicalTypeAnnotation.decimalType(2, 18)).named("price"),
    Types.required(BINARY)
        .as(LogicalTypeAnnotation.enumType()).named("exchange"));

Dehydrator<Trade> dehydrator = (trade, writer) -> {
    writer.write("ts_ns", trade.timestampNanos());
    writer.write("trade_id", trade.id());        // UUID
    writer.write("price", trade.price());         // BigDecimal
    writer.write("exchange", trade.exchange());   // String
};
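
The dehydrator above expects a `Trade` whose timestamp is already expressed as nanoseconds since the epoch. A minimal sketch of such a value type follows, with a helper that derives the nanosecond value from a `java.time.Instant`. The `Trade` record, the `TradeExample` class, and the sample values are assumptions made for illustration (Java 16+ for records); only the accessor names are taken from the snippet above.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.UUID;

// Hypothetical example types, not part of parquet-lite.
public final class TradeExample {
    private TradeExample() {}

    /** Accessor names match the calls made by the dehydrator above. */
    public record Trade(long timestampNanos, UUID id, BigDecimal price, String exchange) {}

    /** Nanoseconds since the Unix epoch; throws ArithmeticException if the result overflows a long. */
    public static long toEpochNanos(Instant instant) {
        long secondsAsNanos = Math.multiplyExact(instant.getEpochSecond(), 1_000_000_000L);
        return Math.addExact(secondsAsNanos, (long) instant.getNano());
    }

    /** Builds a sample trade; the price has scale 2 to match decimalType(2, 18). */
    public static Trade sample(Instant when) {
        return new Trade(toEpochNanos(when), UUID.randomUUID(), new BigDecimal("101.25"), "NYSE");
    }
}
```

Using the exact arithmetic helpers makes an out-of-range instant fail loudly instead of silently wrapping; a signed 64-bit nanosecond count covers roughly the years 1678 to 2262.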

Reading

Hydrator<Map<String, Object>, Map<String, Object>> hydrator = new Hydrator<>() {
    public Map<String, Object> start() { return new HashMap<>(); }
    public Map<String, Object> add(Map<String, Object> target, String heading, Object value) {
        target.put(heading, value);
        return target;
    }
    public Map<String, Object> finish(Map<String, Object> target) { return target; }
};

try (Stream<Map<String, Object>> rows = ParquetReader.streamContent(file,
        HydratorSupplier.constantly(hydrator))) {
    rows.forEach(row -> System.out.println(row));
}
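
A hydrator does not have to produce a map. The sketch below builds a typed `Tick` from the `t` and `cls` columns of the "ticker" schema in the Writing section. To keep it compilable on its own, it declares a local stand-in `Hydrator` interface that mirrors the three methods used above; in real code you would use the library's interface instead. The `Tick` record, the `TickBuilder` class, and the assumption that INT64 and DOUBLE values arrive as `Number` instances are all illustrative rather than confirmed library behaviour.

```java
// Hypothetical sketch, not part of parquet-lite.
public final class TickHydratorSketch {
    private TickHydratorSketch() {}

    /** Local stand-in mirroring the start/add/finish shape shown above. */
    public interface Hydrator<U, S> {
        U start();
        U add(U target, String heading, Object value);
        S finish(U target);
    }

    /** Immutable result type (Java 16+). */
    public record Tick(long timestamp, double close) {}

    /** Mutable accumulator filled one column at a time. */
    public static final class TickBuilder {
        long t;
        double cls;
    }

    public static final Hydrator<TickBuilder, Tick> TICK_HYDRATOR = new Hydrator<>() {
        @Override
        public TickBuilder start() {
            return new TickBuilder();
        }

        @Override
        public TickBuilder add(TickBuilder target, String heading, Object value) {
            // Assumption: numeric columns are delivered as boxed Number values.
            switch (heading) {
                case "t":
                    target.t = ((Number) value).longValue();
                    break;
                case "cls":
                    target.cls = ((Number) value).doubleValue();
                    break;
                default:
                    break; // ignore columns this type does not model
            }
            return target;
        }

        @Override
        public Tick finish(TickBuilder target) {
            return new Tick(target.t, target.cls);
        }
    };
}
```

Separating the mutable builder from the immutable result keeps the per-row accumulation cheap while still handing callers a value they cannot modify.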

Dependency (JitPack)

Maven

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

<dependency>
    <groupId>com.github.qtsurfer</groupId>
    <artifactId>parquet-lite</artifactId>
    <version>2.1.0</version>
</dependency>

Gradle

repositories {
    maven { url 'https://jitpack.io' }
}

dependencies {
    implementation 'com.github.qtsurfer:parquet-lite:2.1.0'
}

License

Apache License 2.0 — see LICENSE.

Attribution

Original work: Copyright Strategic Blue Ltd — strategicblue/parquet-floor
Fork maintenance: Copyright Wualabs LTD — wualabs.com

See NOTICE for details.
