This repository contains a comprehensive SQL-driven analysis of Major League Baseball (MLB) data, created as part of the Advanced SQL Querying course taught by Maven Analytics.
Course: Advanced SQL Querying
Instructor: Maven Analytics
Link: https://mavenanalytics.io
Course Description:
SQL is a powerful tool for managing and analyzing relational databases, and a skill that every analyst, data scientist, and BI professional should master.
In this course, we moved beyond the basic "Big 6" clauses (SELECT
, FROM
, WHERE
, GROUP BY
, HAVING
, and ORDER BY
) and covered advanced querying techniques including:
- JOIN types
- Subqueries & Common Table Expressions (CTEs)
- Window functions
- Advanced functions for numeric, datetime, string, and NULL handling
Finally, we applied these techniques to real-world business problems and analytics use cases, including pivoting with CASE statements, calculating running totals, and more.
This project leverages advanced SQL queries, window functions, and aggregations to uncover insights from historical MLB data.
- Identifies which schools produced the most MLB players over time
- Highlights top contributing schools per decade
- Evaluates team spending trends
- Identifies the top 20% of teams by average payroll and cumulative expenditures
- Tracks when each franchise surpassed $1 billion in spending
- Examines career trajectories, including:
- Debut and retirement ages
- Career lengths
- Starting and ending teams
- Highlights players with 10+ year careers who stayed with a single team
- Analyzes:
- Shared birthdays among players
- Batting hand distributions across teams
- Trends in player physiques (average height & weight at debut, decade by decade)
- Advanced SQL Querying
- Window Functions & Aggregations
- Subqueries & CTEs
- Pivoting & Conditional Logic with CASE
- Real-world Data Analysis & Storytelling
Maven Analytics helps analysts build job-ready skills through hands-on, project-based learning. This project was completed as part of their Advanced SQL Querying course.