Lesson 3: Basic SQL Syntax

Learn basic SQL techniques to answer complex business questions

Estimated Read Time: 2 - 3 Hours

Learning Goals

In this lesson, you will learn to:

  • Write SELECT queries to retrieve specific columns and rows from database tables
  • Apply filtering conditions using WHERE clauses with multiple operators
  • Sort and limit result sets to focus on relevant data
  • Combine multiple SQL techniques to answer complex business questions

In the previous lesson, you explored the structure of relational databases, understood the Northwind schema through its ERD, and learned three methods for importing data into PostgreSQL. You’re now equipped with a working database containing real-world business data – orders, customers, products, suppliers, and employees – all interconnected through carefully designed relationships.

Now comes the exciting part: extracting insights from this data. This is where SQL transforms from a database management tool into your primary instrument for data analysis. In this lesson, you’ll master the fundamental SQL operations that form the foundation of every analytical query you’ll write in your career.

Think of SQL as a conversation with your database. You ask questions, and the database responds with answers. The art lies in asking the right questions in the right way. By the end of this lesson, you’ll be able to query data with precision, filter it according to business requirements, aggregate it to reveal patterns, and transform it to meet analytical needs.

Important Note: It’s possible that some queries in this lesson will result in different results when you try them out. Please take the results with a grain of salt.

 

1. Basic SQL Syntax

1.1. The SELECT Statement: Asking Your First Questions

Every SQL query begins with a simple question: “What data do I want to see?” The SELECT statement is your tool for answering this question.

Basic Syntax:

				
					SELECT column1, column2, column3
FROM table_name;
				
			

Business Context: Imagine you’re a sales analyst at Northwind Traders, and your manager asks, “Can you give me a list of all our products with their prices?”

Query:

				
					SELECT product_name, unit_price
FROM products;
				
			

This query returns exactly what was asked for—nothing more, nothing less. Notice how we specify only the columns we need. This is more efficient than retrieving everything.

Selecting All Columns:

When you need all information from a table, use the asterisk (*) wildcard:

				
					SELECT *
FROM products;
				
			
Important:

While SELECT * is convenient during exploration, avoid it in production queries. It’s less efficient and can cause problems if the table structure changes. Always specify the columns you need.

1.2. The FROM Clause: Specifying Your Data Source

Every SELECT statement must include a FROM clause that tells SQL where to find the data. In relational databases, data lives in tables, and you must specify which table you’re querying.

Business Context: The marketing team wants to analyze customer demographics. They need to know where customers are located.

Query:

				
					SELECT company_name, city, country
FROM customers;
				
			

Understanding Table References:

Tables in your database are like filing cabinets—each contains organized information about a specific entity. The Northwind database has tables for:

  • customers: Information about companies that buy from Northwind
  • products: Details about items Northwind sells
  • orders: Records of customer purchases
  • employees: Staff information
  • suppliers: Companies that supply products to Northwind
 
1.3. The WHERE Clause: Filtering Your Results

Retrieving all rows from a table is rarely useful in business analysis. You typically want to focus on specific subsets of data that meet certain criteria. The WHERE clause is your filtering mechanism.

Basic Syntax:

				
					SELECT column1, column2
FROM table_name
WHERE condition;
				
			

Business Context: The inventory manager asks, “Which products are we selling for more than $50? I need to review our premium offerings.”

Query:

				
					SELECT product_name, unit_price, units_in_stock
FROM products
WHERE unit_price > 50;
				
			

Common Comparison Operators:

Operator Meaning Example
= Equal to WHERE country = ‘USA’
!= or <> Not equal to WHERE country != ‘USA’
> Greater than WHERE unit_price > 50
< Less than WHERE units_in_stock < 10
>= Greater than or equal WHERE unit_price >= 50
<= Less than or equal WHERE units_in_stock <= 10

 

Text Comparison Example:

Business Context: “Show me all customers in Germany.”

				
					SELECT company_name, contact_name, city
FROM customers
WHERE country = 'Germany';
				
			
Critical Note:

When comparing text values:

  • Always enclose them in single quotes: ‘Germany’
  • SQL is case-sensitive in some databases for text comparisons
  • In PostgreSQL, ‘Germany’ and ‘germany’ are different values
1.4. The ORDER BY Clause: Sorting Your Results

Data returned from a database has no guaranteed order unless you explicitly specify one. The ORDER BY clause sorts your results to make them more meaningful.

Basic Syntax:

				
					SELECT column1, column2
FROM table_name
ORDER BY column1 ASC;  -- ASC for ascending (default)
-- OR
ORDER BY column1 DESC; -- DESC for descending
				
			

Business Context: The sales director asks, “Show me our product list sorted by price, from most to least expensive.”

Query:

				
					SELECT product_name, unit_price
FROM products
ORDER BY unit_price DESC;
				
			

Sorting by Multiple Columns:

Business Context: “I need a customer list organized by country, and within each country, sorted by company name alphabetically.”

				
					SELECT company_name, contact_name, country, city
FROM customers
ORDER BY country ASC, company_name ASC;
				
			

How This Works:

  1. SQL first sorts all customers by country (A to Z)
  2. Within each country group, it sorts by company name (A to Z)
Practical Tip

When sorting by multiple columns, the order matters. The first column specified is the primary sort, the second is the secondary sort (used when the first column has duplicate values), and so on.

1.5. Column Aliases with AS: Making Results Readable

When you write queries, column names in the output should be clear and professional, especially when sharing results with non-technical stakeholders. The AS keyword lets you rename columns in your result set.

Basic Syntax:

				
					SELECT column_name AS alias_name
FROM table_name;
				
			

Business Context: You’re preparing a report for the executive team. They don’t understand database column names like “unit_price”—they expect “Product Price”.

Query:

				
					SELECT 
    product_name AS "Product Name",
    unit_price AS "Price (USD)",
    units_in_stock AS "Stock Level"
FROM products
WHERE unit_price > 50
ORDER BY unit_price DESC;
				
			

Important Notes About Aliases:

  • The AS keyword is optional, but using it makes your code more readable
  • If your alias contains spaces or special characters, enclose it in double quotes
  • Aliases only affect the output – they don’t change the actual column names in the database
  • You cannot use aliases in WHERE clauses (they don’t exist yet at that stage of query execution)

 

1.6. The LIMIT Clause: Controlling Result Set Size

When exploring data or when you only need to see a sample, LIMIT restricts the number of rows returned. This is especially important with large datasets.

Basic Syntax:

				
					SELECT column1, column2
FROM table_name
LIMIT number_of_rows;
				
			

Business Context: “Show me the top 10 most expensive products in our catalog.”

Query:

				
					SELECT product_name, unit_price
FROM products
ORDER BY unit_price DESC
LIMIT 10;
				
			

Why LIMIT Matters:

  1. Performance: Returning 1 million rows when you only need 10 wastes system resources
  2. Quick Exploration: When learning about a new table, LIMIT 100 lets you see sample data quickly
  3. Top-N Analysis: Combined with ORDER BY, it answers “top 10” or “bottom 5” questions
Important Order of Execution

SQL processes clauses in a specific order (not the order you write them):

  1. FROM – Identifies the table
  2. WHERE – Filters rows
  3. ORDER BY – Sorts the results
  4. LIMIT – Restricts the number of rows returned

 

This means LIMIT is always applied last, after sorting.

 

Why This Matters:

  1. You can’t use SELECT aliases in WHERE (they don’t exist yet)
  2. You CAN use SELECT aliases in ORDER BY (they exist by then)

 

Example:

				
					-- This works
SELECT unit_price * 1.2 AS price_with_tax
FROM products
ORDER BY price_with_tax;  -- Alias exists here

-- This doesn't work
SELECT unit_price * 1.2 AS price_with_tax
FROM products
WHERE price_with_tax > 50;  -- X Alias doesn't exist yet

-- Correct version
SELECT unit_price * 1.2 AS price_with_tax
FROM products
WHERE unit_price * 1.2 > 50;  -- ✓ Use the calculation
				
			

2. Writing Readable SQL

2.1. Use Consistent Formatting:
				
					-- Good
SELECT 
    product_name,
    unit_price,
    units_in_stock
FROM products
WHERE category_id = 1
ORDER BY product_name;

-- Difficult to read
SELECT product_name,unit_price,units_in_stock FROM products WHERE category_id=1 ORDER BY product_name;
				
			

 

2.2. Use Meaningful Aliases:
				
					-- Good
SELECT COUNT(*) AS total_products
FROM products;

-- Poor
SELECT COUNT(*) AS cnt
FROM products;
				
			

 

2.3. Comment Complex Logic:
				
					-- Calculate inventory value for active products only
-- Excludes discontinued items from the calculation
SELECT 
    SUM(units_in_stock * unit_price) AS total_inventory_value
FROM products
WHERE discontinued = 0;
				
			

 

2.4. Break Complex Queries into Steps:

If a query becomes too complex, use CTEs (Common Table Expressions) to break it into logical steps. You’ll learn these in advanced lessons.

 

3. Performance Considerations

3.1. Use LIMIT During Development:

When testing queries on large tables, always use LIMIT:

				
					-- During development
SELECT *
FROM orders
LIMIT 100;  -- Just to see the structure
				
			

 

3.2. Be Specific in SELECT:
				
					-- Good
SELECT product_name, unit_price
FROM products;

-- Wasteful
SELECT *
FROM products;  -- Returns all columns you might not need
				
			

Summary

You’ve completed a comprehensive journey through Core SQL Operations:

  • You can retrieve data using SELECT, FROM, WHERE, ORDER BY, and LIMIT

Suggested References

Exercise: SQL Fundamentals for Data Analysts

Using the NYC Yellow Taxi Trip Dataset

 

Exercise Set Overview

These exercises use the NYC Taxi and Limousine Commission’s Yellow Taxi trip data. You’ll be analyzing real-world transportation data to answer business questions that taxi fleet managers, city planners, and operations teams face daily.

Database Schema Reference:

  • fact_yellow_trips_raw_2025 – Main trip records table
  • dim_taxi_zones – Location information (pickup/dropoff zones)
  • dim_rate_codes – Rate code descriptions
  • dim_payment_types – Payment method descriptions
  • dim_vendors – Vendor/provider information
  • dim_date – Date dimension for time-based analysis
 
Important:

For each of the tasks below:

  1. write the appropriate query to extract information from the database
  2. generate insights and answer the business question 
 
Part 1: Basic Data Retrieval 
Exercise 1.1: Initial Data Exploration

Business Context: You’re new to the taxi operations team and need to familiarize yourself with the trip data structure.

Task: Write a query to display the first 20 trip records, showing:

  • Pickup datetime
  • Dropoff datetime
  • Passenger count
  • Trip distance
  • Total amount

Learning Objective: Practice SELECT with specific columns and LIMIT

 

Exercise 1.2: Understanding Trip Patterns

Business Context: The operations manager wants to understand typical trip characteristics.

Task: Retrieve 50 trips showing:

  • Trip distance
  • Fare amount
  • Tip amount
  • Total amount
  • Sort by trip distance from longest to shortest

Learning Objective: Practice ORDER BY with DESC

 

Exercise 1.3: Payment Method Analysis

Business Context: The finance team needs to understand which payment types are used.

Task: Write a query to find all unique payment types in the dataset. Display only distinct payment_type values, sorted numerically.

Learning Objective: Practice DISTINCT and basic sorting

 

Exercise 1.4: Vendor Identification

Business Context: You need to identify which vendors (taxi technology providers) operate in the system.

Task: List all unique VendorID values from the trips table, along with a count of how many appear in your sample. Use LIMIT 100 for your sample.

Learning Objective: Practice DISTINCT with basic exploration

 

Part 2: Filtering Data 
Exercise 2.1: High-Value Trip Analysis

Business Context: Management wants to focus on premium trips to understand high-value customer behavior.

Task: Find all trips where the total amount exceeded $100. Display:

  • Pickup datetime
  • Trip distance
  • Fare amount
  • Tip amount
  • Total amount
  • Sort by total amount (highest first)
  • Limit to 25 records

Learning Objective: Practice WHERE with numeric comparison and ORDER BY

 

Exercise 2.2: Short Trip Identification

Business Context: The pricing team is reviewing short-distance trips to ensure fare structures are appropriate.

Task: Identify trips that are:

  • Less than 2 miles in distance
  • Show trip distance, fare amount, and total amount
  • Sort by distance (shortest first)

Learning Objective: Practice WHERE with less-than comparison

 

Exercise 2.3: Cash Payment Analysis

Business Context: The finance department needs to analyze cash transactions (payment_type = 2) for reconciliation purposes.

Task: Find all cash-paid trips showing:

  • Trip distance
  • Fare amount
  • Total amount
  • Payment type (to verify)
  • Limit to first 30 results
  • Sort by fare amount descending

Learning Objective: Practice WHERE with equality comparison

 

Exercise 2.4: Multi-Passenger Trip Analysis

Business Context: Operations wants to understand group travel patterns for vehicle allocation planning.

Task: Find trips with 3 or more passengers, showing:

  • Passenger count
  • Trip distance
  • Total amount
  • Sort by passenger count (highest first), then by total amount (highest first)

Learning Objective: Practice WHERE with >= and multiple ORDER BY columns

 

Exercise Submission Guidelines

For each exercise:

  1. Write the SQL query
  2. Execute it in pgAdmin
  3. Copy the query and take a screenshot of the first 10-20 rows of results
  4. For aggregate exercises, include the complete result set (usually just a few summary rows)
  5. Write 1-2 sentences explaining what business insight the results reveal

 

Example Format:

Exercise 2.1: High-Value Trip Analysis

Query:
[Your SQL code here]

Results:
[Screenshot]

Business Insight:
The top 25 trips range from $100 to $350, with the highest-value trips covering distances of 15-25 miles, suggesting airport or long-distance suburban routes are the most lucrative for drivers.

 

 

Submission & Resubmission Guidelines

1. When submitting your exercise, use this naming format:

YourName_Submission_Lesson#.xxx

If you need to revise and resubmit, add a version suffix:

YourName_Submission_Lesson#_v2.xxx
YourName_Submission_Lesson#_v3.xxx

2. Do not overwrite or change the original evaluation entries in the rubric.

Instead, enter your updated responses or corrections in a new “v2” (or later) column in the rubric for mentor review.

 

Evaluation Rubric

Criteria Meets Expectations Needs Improvement Incomplete/Off-Track
Query Execution All 8 queries execute without errors and return correct results 6-7 queries work correctly; 1-2 have minor syntax errors Fewer than 6 queries execute successfully
SQL Fundamentals Correctly uses SELECT, FROM, WHERE, ORDER BY, LIMIT with appropriate operators and column selection Demonstrates understanding but makes mistakes in 2-3 queries (wrong operators, missing clauses, incorrect sorting) Frequent errors across multiple queries showing gaps in basic SQL concepts
Business Insights Provides relevant 1-2 sentence insights for each exercise showing understanding of results Insights present but generic or missing for 1-2 exercises Most insights missing, incorrect, or show no understanding of data

 

Contact

Talk to us

Have questions or feedback about Lumen? We’d love to hear from you.