I built this project to push my SQL skills beyond the basics - not just writing correct queries, but writing efficient and scalable ones. Using a dataset of over 20,000 tracks, I explored a series of progressively complex data questions ranging from simple aggregations to advanced window functions and CTE-driven transformations.
Rather than treating this as a “one-query-per-result” exercise, I structured the project the way real analysts work:
I organized tasks by difficulty - Easy, Medium, and Hard with each stored in dedicated
.sql
filesI approached each question as a use case, often asking: “How would a business team interpret this?”
To dig deeper, I focused on query performance optimization. I compared:
A raw SQL query with calculations inside
AVG()
Versus a refactored version that used a
TEMP TABLE
and pre-filtered logic
💡 Result? I reduced execution time by 55%! Optimization isn't just about indexing, it's about how you structure your logic.
I also wrapped the project with:
Query benchmark screenshots using
EXPLAIN ANALYZE
Visual EDA with Pandas (coming soon!)
A Streamlit dashboard (coming soon!) for dynamic filtering and visual insight
If you're curious about practical SQL use cases, query tuning, or performance benchmarking — this is a project worth checking out.
Skills Demonstrated
Advanced SQL (Window Functions, CTEs, TEMP Tables)
Query Optimization & Benchmarking
EDA with Pandas & Jupyter
Structured Problem-Solving
Data Communication & Documentation