This dbt project transforms Gmail data from Airbyte into analytics-ready tables for email insights.
# Create virtual environment
python3 -m venv venv
# Activate virtual environment (macOS/Linux)
source venv/bin/activate
# Activate virtual environment (Windows)
# venv\Scripts\activate
# Install dbt
pip install -r requirements.txt

The profiles.yml file is already configured for BigQuery with:
- Project: nao-production
- Development dataset: dev
- Location: EU
- Authentication: OAuth

Make sure you're authenticated with Google Cloud:
gcloud auth application-default login

Then verify the connection:
dbt debug

# Install dependencies (if any)
dbt deps
# Run all models
dbt run
# Run tests
dbt test
# Generate documentation
dbt docs generate
dbt docs serve

models/
├── staging/
│   ├── _sources.yml       # Source definitions
│   └── stg_messages.sql   # Staging model for messages
└── marts/
    ├── schema.yml         # Model documentation and tests
    └── fct_messages.sql   # Fact table for email analytics

stg_messages: Cleaned and standardized message data from Gmail sources
fct_messages: Analytics-ready fact table with:
- Message metadata (ID, thread, timestamp)
- Sender/recipient information
- Time-based dimensions (hour, day type, time category)
- Message characteristics (size, direction, snippet length)
The fct_messages table enables analysis of:
- Email volume trends over time
- Communication patterns by time of day/week
- Sender/recipient analysis
- Message size and content analysis
- Work-life balance insights (business hours vs off-hours)

Typical development workflow:
- Make changes to models
- Test specific models: dbt run --select model_name
- Validate data quality: dbt test
- Commit changes to version control
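
The build-then-test loop for a single model can be wrapped in a small shell helper. The sketch below is only an illustration: `check_model` is a hypothetical function name that does not exist in this project, and it assumes `dbt` is on the PATH inside the activated virtual environment.

```shell
#!/usr/bin/env bash
# check_model is a hypothetical helper, not part of this project.
# It builds one model and then runs only that model's tests.
check_model() {
  local model="${1:-}"
  if [ -z "$model" ]; then
    echo "usage: check_model <model_name>" >&2
    return 2
  fi
  # Skip the tests if the build fails.
  dbt run --select "$model" || return 1
  dbt test --select "$model"
}
```

After sourcing the file, `check_model fct_messages` would run `dbt run --select fct_messages` followed by `dbt test --select fct_messages`. Scoping the tests to the same model keeps the feedback loop short; a full `dbt test` before committing still covers the rest of the project.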