dt-political-debates

Digital Twin on Political Debates

graph TB
    ExternalUNDatabase[External UN Database] --> UNScrapper[ODTP UN Scrapper]

    subgraph ODTP
    direction TB
        UNScrapper --> UNMediaDataloader[UN Media Dataloader]

        UNMediaDataloader --> ODTPPyannoteWhisper[ODTP Pyannote Whisper]
        ODTPPyannoteWhisper --> ODTPTranslation[ODTP Translation]

        ODTPTranslation --> S3Dataloader[S3 Dataloader]
    end
    S3Dataloader --> UNS3[Political Debates S3]

    subgraph Webplatform
    direction TB
        UNS3 --> UNMongoDB

        UNMongoDB <--> Backend["Backend (API and Tools)"]
        UNMongoDB --> UNSolr[UN Solr]
        UNSolr --> Backend["Backend (API and Tools)"]
        Backend <--> Frontend[Political Debates GUI]
    end
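
As a reading aid, the ODTP part of the diagram can be written down as a list of edges from which the execution order is derived. This is only a sketch: the edge list is copied from the diagram above, while `EDGES` and `execution_order` are hypothetical names that do not exist in this repository, and the bidirectional web platform links are left out on purpose because they do not form a one-way chain.

```python
from graphlib import TopologicalSorter

# Edges copied from the diagram (source -> target), ODTP data flow only.
# The names EDGES and execution_order are illustrative, not part of the project.
EDGES = [
    ("External UN Database", "ODTP UN Scrapper"),
    ("ODTP UN Scrapper", "UN Media Dataloader"),
    ("UN Media Dataloader", "ODTP Pyannote Whisper"),
    ("ODTP Pyannote Whisper", "ODTP Translation"),
    ("ODTP Translation", "S3 Dataloader"),
    ("S3 Dataloader", "Political Debates S3"),
]


def execution_order(edges):
    """Return the nodes in an order where every source precedes its target."""
    sorter = TopologicalSorter()
    for source, target in edges:
        # TopologicalSorter.add(node, *predecessors): target depends on source.
        sorter.add(target, source)
    return list(sorter.static_order())


if __name__ == "__main__":
    for step, name in enumerate(execution_order(EDGES), start=1):
        print(f"{step}. {name}")
```

Because the ODTP chain is linear, the derived order is simply the sequence of boxes in the diagram, from the external database to the S3 bucket.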

List of components

The following parts of the project are not ODTP components.

How to run this pipeline?

How to create an initial json file?

To run this pipeline end to end, you need to start from a valid metadata JSON file. You can obtain one by fetching the data with the scrapper, or you can create a synthetic one manually. The synthetic example below can serve as a template for your own file. It should validate against schemas/unogDigitalRecordingMetadataMinimalSchema.json.

{
    "$schema": "https://raw.githubusercontent.com/sdsc-ordes/dt-political-debates/refs/heads/main/schemas/unogDigitalRecordingMetadataMinimalSchema.json",
    "version": "1.0",
    "metadata": {
      "title": "HRC_20220929T0000",
      "date": "2022-09-29",
      "time": "00:00",
      "url": "http://example.com",
      "tags": ["HRC_20220929T0000"],
      "summary": "",
      "labels": {}
    },
    "channels": [
      {
        "id": "video",
        "type": "video",
        "name": "Main Video Channel",
        "data": "HRC_20220929T0000.mp4",
        "tags": ["main", "video"]
      },
      {
        "id": "original",
        "type": "audio",
        "name": "Original Audio Channel",
        "data": "HRC_20220929T0000-original.wav",
        "tags": ["original", "audio"]
      }
    ],
    "annotations": []
}
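
If you need several synthetic files, the example above can be generated rather than edited by hand. The following is a minimal sketch, not a script shipped with this repository: `build_metadata` is a hypothetical helper, and it assumes that the media file names follow the pattern seen in the example (`<title>.mp4` and `<title>-original.wav`). It does not validate the result, so the output should still be checked against the schema.

```python
import json

# Schema URL copied from the example above.
SCHEMA_URL = (
    "https://raw.githubusercontent.com/sdsc-ordes/dt-political-debates/"
    "refs/heads/main/schemas/unogDigitalRecordingMetadataMinimalSchema.json"
)


def build_metadata(title, date, time, url="http://example.com"):
    """Return a synthetic metadata document shaped like the example above.

    Assumption: media file names are derived from the title, as in the example.
    """
    return {
        "$schema": SCHEMA_URL,
        "version": "1.0",
        "metadata": {
            "title": title,
            "date": date,
            "time": time,
            "url": url,
            "tags": [title],
            "summary": "",
            "labels": {},
        },
        "channels": [
            {
                "id": "video",
                "type": "video",
                "name": "Main Video Channel",
                "data": f"{title}.mp4",
                "tags": ["main", "video"],
            },
            {
                "id": "original",
                "type": "audio",
                "name": "Original Audio Channel",
                "data": f"{title}-original.wav",
                "tags": ["original", "audio"],
            },
        ],
        "annotations": [],
    }


if __name__ == "__main__":
    document = build_metadata("HRC_20220929T0000", "2022-09-29", "00:00")
    print(json.dumps(document, indent=2))
```

Running the module prints a document with the same fields as the hand-written example, which you can redirect into a `.json` file.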

TBD

Tutorial to run the pipeline in ODTP

TBD

How to run the pipeline with Docker Compose

TBD

Changelog

  • v1.0.0
    • Basic project structure, schemas, and scripts.

Roadmap

  • odtp-trascription2pdf component
  • data-downloader component
  • datauploader component
  • face identifier component
  • docker-compose
  • odtp compatibility
  • documentation
