SQL journey from foundations to production: window functions, CTEs, stored procedures, and real-world data cleaning projects
Practical SQL from foundations to production patterns. This repository documents my SQL learning journey and showcases production-grade data cleaning and analysis projects.
sql-portfolio/
├── foundations/ # Core SQL concepts
├── intermediate/ # Joins, subqueries, string functions
├── advanced/ # CTEs, window functions, stored procedures
└── projects/
└── layoffs_2023/ # 🌟 Featured Project: Data cleaning + EDA
Complete data pipeline: Raw data → Production-ready dataset → Insights
Skills Demonstrated:
- Data cleaning with CTEs and window functions
- Exploratory data analysis with complex aggregations
- Documentation and data quality standards
- SELECT, WHERE, GROUP BY, ORDER BY
- LIMIT, ALIAS, HAVING vs WHERE
- All types of JOINs (INNER, LEFT, RIGHT, FULL)
- String functions and manipulation
- Subqueries and CASE statements
- Window functions (RANK, ROW_NUMBER, DENSE_RANK)
- Common Table Expressions (CTEs)
- Temporary tables
- Stored procedures
- Triggers and events
Prerequisites:
- PostgreSQL, MySQL, or BigQuery access
- Basic SQL knowledge
Run the layoffs project:
- Load
layoffs.csvto your database - Execute
01_data_cleaning.sql - Execute
02_eda.sql - Review insights in comments
Dense rank vs row_number:
-- See advanced/14_window_functions_rank_rownum.sql
ROW_NUMBER() -- Sequential numbering, no ties
RANK() -- Gaps after ties
DENSE_RANK() -- No gaps after tiesCTE chains for auditable cleaning:
-- See projects/layoffs_2023/01_data_cleaning.sql
WITH duplicates_flagged AS (...),
cleaned_companies AS (...),
standardized_industries AS (...)
SELECT * FROM standardized_industries;- Add performance tuning notes (EXPLAIN plans)
- Implement data quality unit tests
- Add real-time data pipeline examples
- Build dbt transformation models
Feel free to open an issue or reach out via LinkedIn
⭐️ Found this helpful? Star this repo!