[DOP-21387] Rebuild documentation structure
dolfinus committed Nov 15, 2024
1 parent 68a1ac6 commit da6eb4e
Showing 44 changed files with 1,194 additions and 453 deletions.
9 changes: 6 additions & 3 deletions .env.docker
@@ -2,7 +2,7 @@
POSTGRES_DB=data_rentgen
POSTGRES_USER=data_rentgen
POSTGRES_PASSWORD=changeme
POSTGRES_INITDB_ARGS: '--encoding=UTF-8 --lc-collate=C --lc-ctype=C'
POSTGRES_INITDB_ARGS=--encoding=UTF-8 --lc-collate=C --lc-ctype=C

# Init Kafka
KAFKA_CFG_NODE_ID=0
@@ -20,15 +20,18 @@ KAFKA_CLIENT_PASSWORDS=changeme
KAFKA_CFG_SASL_ENABLED_MECHANISMS=PLAIN,SCRAM-SHA-256

# Common backend config
DATA_RENTGEN__LOGGING__PRESET=colored
DATA_RENTGEN__DATABASE__URL=postgresql+asyncpg://data_rentgen:changeme@db:5432/data_rentgen
DATA_RENTGEN__LOGGING__PRESET=colored

# See Backend -> Server -> Configuration documentation
DATA_RENTGEN__SERVER__DEBUG=false

# See Backend -> Consumer -> Configuration documentation
DATA_RENTGEN__KAFKA__BOOTSTRAP_SERVERS=kafka:9092
DATA_RENTGEN__KAFKA__BOOTSTRAP_SERVERS=broker:9092
DATA_RENTGEN__KAFKA__SECURITY__TYPE=scram-sha256
DATA_RENTGEN__KAFKA__SECURITY__USER=data_rentgen
DATA_RENTGEN__KAFKA__SECURITY__PASSWORD=changeme
DATA_RENTGEN__KAFKA__COMPRESSION=zstd

# See Frontend -> UI
DATA_RENTGEN__UI__API_BROWSER_URL=http://localhost:8000
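The `DATA_RENTGEN__...` names above follow a prefix-plus-double-underscore convention that mirrors the nested settings classes touched later in this commit (`database`, `kafka`, `kafka.security`). Below is a minimal stdlib-only sketch of that mapping, assuming the prefix is `DATA_RENTGEN__` and the nesting delimiter is `__`. The helper `nest_env` is hypothetical; the project itself appears to rely on pydantic `BaseSettings` rather than code like this.

```python
def nest_env(env, prefix="DATA_RENTGEN__", delimiter="__"):
    """Group flat PREFIX__SECTION__KEY=value pairs into nested dicts."""
    result = {}
    for name, value in env.items():
        if not name.startswith(prefix):
            # POSTGRES_* and KAFKA_CFG_* are read by other containers, skip them
            continue
        path = name[len(prefix):].lower().split(delimiter)
        node = result
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value  # values stay strings; type coercion is out of scope
    return result


# A few variables copied from the file above
sample = {
    "POSTGRES_DB": "data_rentgen",
    "DATA_RENTGEN__SERVER__DEBUG": "false",
    "DATA_RENTGEN__KAFKA__BOOTSTRAP_SERVERS": "broker:9092",
    "DATA_RENTGEN__KAFKA__SECURITY__TYPE": "scram-sha256",
    "DATA_RENTGEN__KAFKA__COMPRESSION": "zstd",
}
settings = nest_env(sample)
print(settings["kafka"]["security"]["type"])
```

Each extra `__` adds one nesting level, so `KAFKA__SECURITY__TYPE` lands under `kafka.security.type`.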
2 changes: 2 additions & 0 deletions .env.local
@@ -9,3 +9,5 @@ export DATA_RENTGEN__KAFKA__SECURITY__PASSWORD=changeme
export DATA_RENTGEN__KAFKA__COMPRESSION=zstd

export DATA_RENTGEN__SERVER__DEBUG=true

export DATA_RENTGEN__UI__API_BROWSER_URL=http://localhost:8000
18 changes: 10 additions & 8 deletions README.rst
@@ -31,24 +31,26 @@
What is Data.Rentgen?
---------------------

Data.Rentgen is a DataLineage service compatible with `OpenLineage <https://openlineage.io/>`_ specification.
Data.Rentgen is a Data Motion Lineage service, compatible with `OpenLineage <https://openlineage.io/>`_ specification.

**Note**: service is under active development, and is not ready to use.
**Note**: service is under active development, and is not ready to use yet.

Goals
-----

* Collect lineage events produced by OpenLineage clients & integrations (Spark, Airflow, Flink, custom ones).
* Collect lineage events produced by OpenLineage clients & integrations (Spark, Airflow).
* Support consuming large amounts of lineage events, by using Kafka as event buffer and storing data in tables partitioned by event timestamp.
* Store operation-grained events (unlike job-grained `Marquez <https://marquezproject.ai/>`_), for more detailed lineage.
* Provide API for run ↔ dataset lineage, as well as parent run → children run lineage.
* Support handling large amounts of lineage events, using Kafka as event buffer and storing data in tables partitioned by event timestamp.
* Provide API for building run ↔ dataset lineage, as well as parent run → children run lineage.
* Ability to build lineage graph with specific time boundaries (unlike Marquez, where lineage is built only for the last job run).
* Ability to build lineage graph with different granularity, e.g. merge all individual Spark operations into Spark applicationId or Spark applicationName.

Non-goals
---------

* This is **not** a data catalog. Use `Datahub <https://datahubproject.io/>`_ or `OpenMetadata <https://open-metadata.org/>`_ instead.
* Static dataset → dataset lineage (like view → table) is not supported.
* Currently column-level lineage is not supported.
* This is **not** a Data Catalog. Use `Datahub <https://datahubproject.io/>`_ or `OpenMetadata <https://open-metadata.org/>`_ instead.
* Static Data Lineage like view → table is not supported.
* Currently column-level lineage is collected by OpenLineage, but not yet consumed by Data.Rentgen.

.. documentation
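The README goals mention building lineage within explicit time boundaries and at a coarser granularity than individual operations. Below is a purely illustrative sketch of that idea, not Data.Rentgen's API or storage model. The record layout, the dataset names, and the function `application_lineage` are all hypothetical.

```python
from datetime import datetime

# Hypothetical operation-level records:
# (application, operation, input dataset, output dataset, finished at)
OPERATIONS = [
    ("etl_app", "op1", "hive.raw", "hive.clean", datetime(2024, 11, 14, 10)),
    ("etl_app", "op2", "hive.clean", "pg.report", datetime(2024, 11, 14, 11)),
    ("etl_app", "op3", "hive.raw", "hive.clean", datetime(2024, 11, 15, 10)),
]


def application_lineage(operations, since, until):
    """Collapse operation-level edges into application-level edges
    for operations that finished inside [since, until)."""
    edges = set()
    for app, _operation, source, target, finished_at in operations:
        if since <= finished_at < until:
            edges.add((source, app))  # dataset -> application
            edges.add((app, target))  # application -> dataset
    return sorted(edges)


edges = application_lineage(
    OPERATIONS, datetime(2024, 11, 14), datetime(2024, 11, 15)
)
for source, target in edges:
    print(f"{source} -> {target}")
```

Dropping the operation name from each edge is what merges individual operations into one application node. The time filter keeps the graph limited to the requested window instead of the latest run only.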
2 changes: 1 addition & 1 deletion data_rentgen/consumer/settings/__init__.py
@@ -37,7 +37,7 @@ class ConsumerApplicationSettings(BaseSettings):
DATA_RENTGEN__LOGGING__PRESET=json
"""

database: DatabaseSettings = Field(description=":ref:`Database settings <configuration-consumer-database>`")
database: DatabaseSettings = Field(description=":ref:`Database settings <configuration-database>`")
kafka: KafkaSettings = Field(
description=":ref:`Kafka settings <configuration-consumer-kafka>`",
)
2 changes: 1 addition & 1 deletion data_rentgen/consumer/settings/kafka.py
@@ -25,7 +25,7 @@ class KafkaSettings(BaseModel):
)
security: KafkaSecuritySettings = Field(
default_factory=KafkaSecuritySettings,
description=":ref:`Kafka security settings <configuration-consumer-kafka-security>`",
description="Kafka security settings",
)
compression: KafkaCompression | None = Field(
default=None,
144 changes: 0 additions & 144 deletions data_rentgen/db/models/README.rst

This file was deleted.

2 changes: 1 addition & 1 deletion data_rentgen/server/settings/__init__.py
@@ -37,7 +37,7 @@ class ServerApplicationSettings(BaseSettings):
DATA_RENTGEN__SERVER__DEBUG=True
"""

database: DatabaseSettings = Field(description=":ref:`Database settings <configuration-server-database>`")
database: DatabaseSettings = Field(description=":ref:`Database settings <configuration-database>`")
logging: LoggingSettings = Field(
default_factory=LoggingSettings,
description=":ref:`Logging settings <configuration-server-logging>`",
20 changes: 18 additions & 2 deletions docker-compose.yml
@@ -40,7 +40,7 @@ services:
db-migration:
condition: service_completed_successfully

kafka:
broker:
image: bitnami/kafka:3.7
restart: unless-stopped
env_file: .env.docker
@@ -60,7 +60,7 @@ services:
restart: unless-stopped
env_file: .env.docker
depends_on:
kafka:
broker:
condition: service_healthy
db-migration:
condition: service_completed_successfully
@@ -74,6 +74,22 @@ services:
db:
condition: service_healthy

frontend:
image: mtsrus/data-rentgen-ui:develop
restart: unless-stopped
env_file: .env.docker
ports:
- 3000:3000
healthcheck:
test: [CMD-SHELL, "curl -f http://localhost:3000/"]
interval: 30s
timeout: 5s
retries: 3
start_period: 5s
depends_on:
server:
condition: service_healthy

volumes:
postgres_data:
kafka_data:
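The `healthcheck.test` key accepts either a string or a list starting with `NONE`, `CMD`, or `CMD-SHELL`. Below is a rough Python model of how each form resolves to the command that gets executed. It assumes the container shell is `/bin/sh -c` and that list items after `CMD-SHELL` are appended to the shell invocation unchanged, which reflects my understanding of Docker's behavior rather than anything stated in this commit. The function `healthcheck_argv` is hypothetical.

```python
def healthcheck_argv(test, shell=("/bin/sh", "-c")):
    """Return the argv a healthcheck `test` value would run, or None if disabled."""
    if isinstance(test, str):
        # A plain string is treated like [CMD-SHELL, <string>]
        return [*shell, test]
    if not test:
        raise ValueError("healthcheck test must not be empty")
    kind, *args = test
    if kind == "NONE":
        return None  # healthcheck disabled
    if kind == "CMD":
        return list(args)  # executed directly, no shell involved
    if kind == "CMD-SHELL":
        return [*shell, *args]  # assumption: items are passed through as-is
    raise ValueError(f"unknown healthcheck form: {kind!r}")


url = "http://localhost:3000/"
print(healthcheck_argv(["CMD", "curl", "-f", url]))
print(healthcheck_argv(["CMD-SHELL", f"curl -f {url}"]))
```

If this model holds, a `CMD-SHELL` list whose command is split into several items hands only the first item to `sh -c` as the command string, so the single-string form (or `CMD` with separate items) is the safer spelling.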
31 changes: 0 additions & 31 deletions docs/backend/architecture.rst

This file was deleted.

6 changes: 0 additions & 6 deletions docs/backend/consumer/configuration/database.rst

This file was deleted.

11 changes: 0 additions & 11 deletions docs/backend/consumer/index.rst

This file was deleted.
