Pipelines

This module uses Apache Beam as a unified programming model to define and execute data processing pipelines.

Module structure:

  • export-gbif-hbase - Pipeline that exports verbatim data from the GBIF HBase tables and saves it as ExtendedRecord Avro files
  • ingest-gbif-beam - Main GBIF pipelines for ingestion of biodiversity data
  • ingest-gbif-fragmenter - Writes raw archive data to the HBase store
  • ingest-gbif-java - Main GBIF pipelines for ingestion of biodiversity data, Java version