👔 End-to-End ETL Project: Automated crawler, XML parsing pipeline and BI Dashboard tracking Brazilian Federal Personnel Acts. Built with R, Shiny, and Regex.
andeliton/temometro_D.O.U.
# 📉 Termômetro DOU (Official Gazette Thermometer)

---

### 🏛️ About the Project

**Termômetro DOU** is a Business Intelligence dashboard that monitors the pulse of the Brazilian Federal Administration. It tracks personnel acts (nominations vs. exonerations) published in the **Official Gazette (Diário Oficial da União, DOU)**.

### ⚙️ How it Works (The Pipeline)

1. **Crawler (`01_crawler_dou.R`):** Scrapes the official government portal (`in.gov.br`), identifies the dynamic download links for "Section 2", and downloads the monthly ZIP files.
2. **Processing (`02_processamento_xml.R`):** Unpacks the ZIPs, parses thousands of XML files, and uses **Regex** to classify each act.
3. **Visualization (`app.R`):** Displays the "bureaucratic churn" and the net balance of the federal workforce.

### 🛠️ Tech Stack

* **Core:** R, Shiny, `bslib`
* **Data Engineering:** `rvest` (crawling), `xml2` (parsing), `arrow` (Parquet storage)

---

*Developed by Andéliton Soares*
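
To make the classification and net-balance steps of the pipeline more concrete, here is a minimal sketch in base R. It is not the code from `02_processamento_xml.R` or `app.R`: the sample texts, the `classify_act()` helper, the column names, and the two regex patterns (`nomear`, `exonerar`) are all assumptions made for illustration, and the real scripts may use different patterns and data structures.

```r
# Hypothetical sample of act texts. In the real pipeline these would come
# from the XML files extracted from the monthly ZIPs.
acts <- data.frame(
  month = c("2024-01", "2024-01", "2024-01", "2024-02", "2024-02"),
  text = c(
    "O MINISTRO DE ESTADO resolve: NOMEAR Maria Silva para exercer o cargo de Coordenadora.",
    "O MINISTRO DE ESTADO resolve: EXONERAR, a pedido, Joao Souza do cargo de Assessor.",
    "O SECRETARIO resolve: Nomear Ana Lima para o cargo de Diretora.",
    "O SECRETARIO resolve: EXONERAR Pedro Costa do cargo de Chefe de Divisao.",
    "O DIRETOR resolve: CONCEDER aposentadoria a Carlos Dias."
  ),
  stringsAsFactors = FALSE
)

# Classify each act with case-insensitive regex matching.
# The patterns are assumptions; a text matching both is labeled "nomination".
classify_act <- function(text) {
  is_nomination  <- grepl("\\bnomear\\b",   text, ignore.case = TRUE, perl = TRUE)
  is_exoneration <- grepl("\\bexonerar\\b", text, ignore.case = TRUE, perl = TRUE)
  ifelse(is_nomination, "nomination",
         ifelse(is_exoneration, "exoneration", "other"))
}

acts$type <- classify_act(acts$text)

# Count acts per month and type, keeping all three levels even when a count is 0.
counts <- table(
  factor(acts$month),
  factor(acts$type, levels = c("nomination", "exoneration", "other"))
)

# Net balance = nominations minus exonerations, per month.
balance <- data.frame(
  month        = rownames(counts),
  nominations  = as.integer(counts[, "nomination"]),
  exonerations = as.integer(counts[, "exoneration"]),
  stringsAsFactors = FALSE
)
balance$net <- balance$nominations - balance$exonerations

print(balance)
```

With this sample, January has two nominations and one exoneration (net +1), while February has only one exoneration (net -1). A table shaped like `balance` is the kind of input a Shiny chart could plot, and it could be written to Parquet with `arrow` between the processing and visualization steps.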