Practical SQL Guide an Overview of Where to Start With SQL
Updated: March 27, 2026
TL;DR
SQL remains essential for every developer in 2026—start with SELECT, JOIN, and GROUP BY, then advance to CTEs and window functions. Modern databases like PostgreSQL 18, SQLite, and DuckDB offer different tradeoffs; cloud platforms (Supabase, Neon, PlanetScale) simplify hosting. Practice with interactive tools like SQLZoo and LeetCode's database problems to build confidence quickly.
SQL isn't dead—it's more relevant than ever. Data lives in databases, and SQL is how you ask questions of that data. Whether you're building a backend API, analyzing business metrics, or competing in data engineering roles, SQL fluency is non-negotiable. This guide cuts through the noise: start here, learn in this order, and practice on real databases. You don't need to be a DBA; you need to think in sets and understand the query planning.
Part 1: SQL Fundamentals (Start Here)
SELECT and WHERE
The foundation: fetch rows matching a condition.
SELECT id, name, email FROM users WHERE created_at > '2026-01-01';
Key concept: SQL returns a set of rows. Think in sets, not loops.
JOIN: Combining Tables
Almost every real query combines data from multiple tables.
SELECT u.name, p.title, p.published_at
FROM users u
INNER JOIN posts p ON u.id = p.user_id
WHERE p.published_at > '2026-01-01';
JOIN types:
INNER JOIN: Only matching rows (most common)LEFT JOIN: All rows from left table, matching rows from rightFULL OUTER JOIN: All rows from both tablesCROSS JOIN: Cartesian product (rarely needed)
Common mistake: Forgetting the ON clause creates a cross join.
GROUP BY and Aggregation
Summarize data across groups.
SELECT category, COUNT(*) as post_count, AVG(views) as avg_views
FROM posts
GROUP BY category
HAVING COUNT(*) > 5;
Key functions: COUNT, SUM, AVG, MIN, MAX
HAVING vs WHERE:
- WHERE filters rows before grouping
- HAVING filters groups after aggregation
Part 2: Intermediate SQL
CTEs (Common Table Expressions)
Break complex queries into readable steps using WITH clauses.
WITH recent_posts AS (
SELECT id, title, author_id FROM posts WHERE created_at > '2026-01-01'
),
top_authors AS (
SELECT author_id, COUNT(*) as post_count
FROM recent_posts
GROUP BY author_id
ORDER BY post_count DESC
LIMIT 5
)
SELECT u.name, ta.post_count
FROM top_authors ta
JOIN users u ON u.id = ta.author_id;
Advantage over subqueries: More readable, easier to debug, often better performance.
Window Functions
Perform calculations across "windows" of rows without collapsing groups.
SELECT
category,
title,
views,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY views DESC) as rank_in_category
FROM posts;
Common window functions: ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, SUM() OVER
Window functions avoid self-joins and make analytics queries efficient.
Subqueries
Nested SELECT statements (less elegant than CTEs, but important to recognize).
SELECT * FROM posts WHERE author_id IN (
SELECT id FROM users WHERE signup_date > '2025-01-01'
);
Modern Databases to Know
PostgreSQL 18 (current stable as of March 2026)
Why: A widely-adopted general-purpose relational database with full SQL support, JSONB for semi-structured data, and features like array types, full-text search, and (new in 18) async I/O for faster sequential scans, virtual generated columns, and uuidv7(). PostgreSQL 18 was released September 25, 2025; PG 17 is also still supported if your platform hasn't upgraded yet.
-- JSONB querying example: extract a field, and filter by a nested key
SELECT id, metadata->>'country' AS country
FROM users
WHERE metadata ? 'signup_date'; -- returns rows where the JSON has a top-level 'signup_date' key
Hosting: Self-managed, or cloud platforms (Supabase, Neon, AWS RDS, DigitalOcean)
SQLite
Why: Embedded database used everywhere—browsers, mobile apps, embedded systems. Zero setup, file-based.
sqlite3 mydb.db "SELECT * FROM users LIMIT 5;"
Tradeoff: Single-writer limitation makes it unfit for high-concurrency backends, but perfect for local development and small apps.
DuckDB
Why: Embedded, columnar analytical database optimized for OLAP (reporting, analytics), not OLTP (transactional). Designed to be fast for aggregations and scans on local files (Parquet, CSV, JSON). It runs in-process like SQLite but with an analytics-focused storage and execution engine, and supports both on-disk and in-memory databases.
# In-memory mode (data goes away when the process exits)
duckdb :memory: "SELECT COUNT(*) FROM 'large_dataset.parquet';"
# Or persistent: writes to a single file on disk
duckdb mydata.duckdb "SELECT COUNT(*) FROM 'large_dataset.parquet';"
Use case: Local data analysis, analytics tools, data pipelines.
Cloud Database Platforms
Supabase
PostgreSQL + auto-generated REST API + real-time subscriptions. Great for rapid prototyping.
const { data } = await supabase
.from('posts')
.select('*')
.eq('author_id', userId);
Neon
Serverless PostgreSQL with auto-scaling. Pay for what you use.
psql postgres://neon-user:password@ep-example.us-east-1.neon.tech/dbname
PlanetScale
A managed database platform built on Vitess (the MySQL sharding layer originally developed at YouTube). PlanetScale offers both MySQL and Postgres options — Postgres reached general availability in September 2025. It is best known for branching workflows for schema changes (database branches that mirror Git branches). Pricing has shifted over time (the original Hobby free tier was deprecated in March 2024 and entry plans now start at a low monthly fee) — check planetscale.com/pricing for the current entry tier.
pscale connect mydb main
SQL in Data Engineering and Analytics
Modern data pipelines heavily use SQL:
- dbt (data build tool): Write SQL models, dbt handles dependencies and testing
- Apache Airflow: DAGs executing SQL queries on schedules
- Looker, Tableau: Built on top of SQL to query data warehouses
- Apache Spark SQL: Distributed SQL over large datasets
If you're pursuing data engineering, SQL is your first language.
Practice Resources (a personal pick, in roughly the order I'd suggest tackling them)
| Resource | Best For | Time | Price (verify on the site — plans change) |
|---|---|---|---|
| SQLZoo | Fundamentals interactively | 5-10 hours | Free |
| LeetCode Database | Interview prep, medium+ difficulty | 20+ hours | Free tier available |
| Mode Analytics | Real analytics scenarios | 10-15 hours | Free |
| DataCamp SQL Track | Structured curriculum | 20+ hours | Paid; published pricing has ranged from roughly $13/mo (annual) to ~$42/mo (monthly billing) — check current rates |
| Stratascratch | Real company data problems | Variable | Paid; ~$29/mo monthly at the time of writing, with annual and lifetime plans |
⚠ Prices change frequently. The values above are for illustration only and may be out of date. Always verify current pricing directly with the provider before making cost decisions: Anthropic · OpenAI · Google Gemini · Google Vertex AI · AWS Bedrock · Azure OpenAI · Mistral · Cohere · Together AI · DeepSeek · Groq · Fireworks AI · Perplexity · xAI · Cursor · GitHub Copilot · Windsurf.
Recommended approach: Start SQLZoo (1 week), then LeetCode Medium problems (2-3 weeks), then attempt one real project (4+ weeks).
Learning Path
- Week 1: Master SELECT, WHERE, JOIN (SQLZoo)
- Week 2: GROUP BY, HAVING, ORDER BY
- Week 3: Subqueries and CTEs
- Week 4: Window functions and analytical queries
- Week 5+: Set up a cloud database and run real queries against real data
The jump from exercises to real data is crucial. Practice platforms are sanitized; production data is messier.
Common Pitfalls
- N+1 queries: Fetching a list, then looping to fetch details for each item. Use JOIN instead.
- Missing indexes: Queries work fine on small tables but crawl on 10M+ rows. Add indexes on frequently filtered columns.
- Avoiding NULL: NULL comparisons use IS NULL, not = NULL.
- Ignoring execution plans: Use EXPLAIN to see if your query scans the entire table or uses an index.
Conclusion
SQL is learnable in weeks, not months. Start with fundamentals, practice on real problems, then pick a database platform that fits your use case. Whether you're a backend developer integrating databases, a data analyst, or a data engineer, SQL proficiency compounds. Spend time here—it's one of the highest-ROI skills in 2026 development.