Lesson 1 - Setting up and Getting started with SQL

Learn how to setup your SQL environment and get started working with databases.

Estimated Read Time: 1,5 - 2 Hours

Learning Goals

In this lesson, you will learn to:

Technical & Analytical Skills:
  • Install and configure PostgreSQL and pgAdmin.

  • Understand how a relational database works at a basic architectural level.

  • Connect pgAdmin to your PostgreSQL server and prepare your workspace for analysis.

Business Acumen
  • Understand why organisations rely on relational databases for reliable, structured, traceable analytics.

1. Introduction: Why This Matters

You’re about to learn SQL—but before you write a single query, you need to understand what problem SQL actually solves.

Imagine you’re a data analyst at a mid-sized logistics company. Your manager asks: “Which delivery routes had the highest failure rate last quarter?”

You have 2.3 million delivery records. Each record contains: route ID, driver ID, customer location, delivery timestamp, success status, and 15 other fields.

 

If this data is in Excel:

  • Opening the file takes 3-5 minutes (if it opens at all)
  • You filter by quarter → Excel freezes
  • You try to pivot → your laptop fan screams
  • Your coworker needs the same data → now you’re emailing 500MB files
  • You finally get an answer → 45 minutes later

 

If this data is in a database:

				
					SELECT route_id, 
       COUNT(*) as total_deliveries,
       SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) as failures,
       ROUND(100.0 * SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) / COUNT(*), 2) as failure_rate
FROM deliveries
WHERE delivery_date >= '2024-10-01' AND delivery_date < '2025-01-01'
GROUP BY route_id
ORDER BY failure_rate DESC
LIMIT 10;
				
			

Result: 2 seconds. Ten routes. Failure rates. Done.

Your coworker needs the same data? They run the same query against the same database. No file sharing. No version conflicts. No duplicated storage.

This is why databases exist. Not because they’re theoretically better – because at scale, they’re the only thing that works.

This lesson builds the technical foundation for everything that follows. You’ll install PostgreSQL (the database engine) and pgAdmin (your interface to that engine). More importantly, you’ll understand why these tools are structured the way they are, and what business problems they solve.

 

2. The Problem: Why Files Don’t Scale?

Let’s start with first principles.

When you save an Excel file, you’re creating a file-based data system:

  • Data lives in a single file on your computer
  • To access data, you open the entire file
  • One person can edit at a time (unless you use cloud tools with limitations)
  • Calculations happen in your computer’s memory
  • Sharing means copying files

 

This works perfectly fine for:

  • Personal budgets
  • Small project tracking (< 10,000 rows)
  • Ad-hoc calculations
  • Quick exploratory analysis
  • One-off reports

 

It breaks down when:

1. Data Volume Exceeds Memory

Your laptop has 8-16GB of RAM. A dataset with 5 million rows and 30 columns consumes roughly 4-6GB when loaded into Excel. Add pivot tables, formulas, and background processes—your system is at capacity. Excel becomes unusable.

 
2. Multiple Users Need Concurrent Access

Your sales team needs live dashboard data. Your finance team needs the same data for forecasting. Your operations team needs it for capacity planning. If everyone opens the same Excel file, you get:

  • File locking (only one editor at a time)
  • Version chaos (“Wait, are you working on v3_final or v3_final_ACTUAL?”)
  • Manual refresh cycles (someone has to update and re-share the file)

 

3. Query Patterns Become Complex

“Show me customers who bought Product A but not Product B, grouped by region, excluding returns, for customers acquired in Q4 2024.”

In Excel: 20 minutes of filtering, helper columns, VLOOKUPs, and pivot tables. In SQL: One query, 5 seconds.

 

4. Data Integrity Requirements Increase

You need to ensure:

  • Customer IDs are always unique
  • Order totals always match line item sums
  • Dates are always in valid format
  • Foreign key relationships remain consistent

 

Excel has no built-in enforcement. You rely on manual validation. At scale, manual validation fails.

 

3. The Solution: Server-Based Data Systems

A database is fundamentally different from a file. Instead of storing data in a file that must be opened in full, a database server:

  1. Runs continuously as a background service, always ready to respond to requests
  2. Stores data in optimized structures (tables, indexes) designed for fast retrieval
  3. Accepts queries from multiple users simultaneously, returning only requested data
  4. Enforces rules (constraints) that maintain data integrity automatically
  5. Manages concurrent access so multiple users can read and write without conflicts

 

The key insight: You don’t load the database. You query it. You ask specific questions and get specific answers.

 

4. PostgreSQL: Your Database Engine

PostgreSQL is a relational database management system (RDBMS)—software that stores, manages, and serves data in table format with defined relationships between tables.

 
4.1. Why PostgreSQL?

1. Industry Legitimacy PostgreSQL powers production systems at:

  • Apple (internal data infrastructure)
  • Instagram (main database for years during explosive growth)
  • Spotify (user data and music metadata)
  • Reddit (comment and post storage)
  • Uber (geospatial data and route optimization)

 

When you list PostgreSQL on your resume, employers recognize it as enterprise-grade experience.

 

2. Open Source with No Strings

  • Free to use commercially
  • No licensing restrictions
  • No vendor lock-in
  • Extensive community documentation
  • Active development (new features every year)

 

3. SQL Standards Compliance PostgreSQL adheres closely to ANSI SQL standards. Skills you develop here transfer directly to Oracle, MySQL, Microsoft SQL Server, and other systems.

 

4. Advanced Feature Set While you’re learning fundamentals, PostgreSQL supports:

  • Complex queries with subqueries and CTEs
  • Window functions for advanced analytics
  • JSON data types (bridging relational and document storage)
  • Full-text search
  • Custom functions and extensions

 

You won’t outgrow PostgreSQL as your skills advance.

 

4.2. How PostgreSQL Works?

When you install PostgreSQL, you’re installing a database server – a program that runs continuously in the background, listening for incoming requests.

Think of it like this:

  • PostgreSQL = A library with millions of books
  • You = Someone who wants specific information
  • SQL = The language you use to ask the librarian for exactly what you need
  • The librarian = PostgreSQL’s query engine, which finds and returns your results

 

You never enter the library and read every book. You ask precise questions and get precise answers.

 

5. pgAdmin: Your Window into PostgreSQL

PostgreSQL runs as a server process. It has no graphical interface. You communicate with it by sending SQL commands through a connection.

pgAdmin is a graphical user interface (GUI) that:

  • Connects to PostgreSQL servers
  • Provides a visual browser for databases, tables, and data
  • Offers a query editor for writing and executing SQL
  • Displays query results in table format
  • Includes tools for database management and design

 

Important distinction:

  • PostgreSQL = The engine (stores data, executes queries)
  • pgAdmin = The dashboard (your interface to the engine)

 

You’ll write SQL in pgAdmin’s query editor. pgAdmin sends that SQL to PostgreSQL. PostgreSQL executes the query and returns results. pgAdmin displays those results.

 

6. Installation: Building Your Environment

You’re about to install the. data infrastructure. Follow the setup video carefully.

Video 1 – Setting up postgres and pgadmin

6.1. Installation Checklist

Phase 1: Download

  • Navigate to postgresql.org/download
  • Select your operating system (Windows/macOS)
  • Download the latest stable version

 

Phase 2: Install PostgreSQL

  • Run the installer
  • Accept default installation path
  • Critical step: Set a superuser password (write this down – you’ll need it constantly)
  • Default port: 5432 (keep this unless you have conflicts)
  • Install all components (including pgAdmin 4)

 

Phase 2b: Download and Install pgAdmin

 

Phase 3: Verify PostgreSQL is Running

  • Windows: Open Services → Find “postgresql-x64-[version]” → Status should be “Running”
  • macOS: PostgreSQL runs automatically; check Activity Monitor if unsure

 

Phase 4: Launch pgAdmin

  • Open pgAdmin 4 from your applications
  • Interface opens in your default web browser (this is normal—pgAdmin is a web application)

 

Phase 5: Register Your Server in pgAdmin

  • Right-click “Servers” in the left panel
  • Select “Register” → “Server”
  • Enter connection details:
    • Name: LocalPostgres (or any descriptive name)
    • Host: localhost
    • Port: 5432
    • Username: postgres
    • Password: [your superuser password from installation]
  • Click “Save”

 

If successful, your server appears in the left panel. Expand it to see the default “postgres” database.

 

6.2. Common Installation Issues

Issue: “Could not connect to server”

  • Solution: Verify PostgreSQL service is running. Restart it if necessary.

 

Issue: “Password authentication failed”

  • Solution: Re-enter password carefully. If forgotten, you’ll need to reset it through PostgreSQL configuration.

 

Issue: “Port 5432 is already in use”

  • Solution: Another application is using that port. Either stop that application or configure PostgreSQL to use a different port during installation.

 

Issue: pgAdmin won’t open/shows blank page

  • Solution: Clear browser cache, or try a different browser. pgAdmin is browser-based.

 

7. Your First SQL Query: Verification

Let’s confirm everything works by running a simple query.

 
Step 1: Open the Query Tool
  • In pgAdmin, expand your server
  • Expand “Databases”
  • Right-click the “postgres” database
  • Select “Query Tool”

 

A new tab opens with a blank query editor.

 

Step 2: Write Your First Query

Type this SQL exactly as shown:

				
					SELECT version();
				
			

This query asks PostgreSQL to report its version number.

 
Step 3: Execute the Query
  • Click the ▶ (Execute) button in the toolbar, or press F5

 

You should see output like:

PostgreSQL 16.2 on x86_64-apple-darwin, compiled by Apple clang version 14.0.0, 64-bit

 

Congratulations. You’ve just executed SQL against a production database system.

 

Step 4: Try a Calculation

Clear the editor and run:

				
					SELECT
'Data Analytics by Lumen' as program,
2026 as year,
24 * 35 as calculation,
CURRENT_TIMESTAMP as query_time;
				
			

PostgreSQL returns:

program year calculation query_time
Data Analytics by Lumen 2026 840 2025-11-30 14:23:15.428

 

Notice: You didn’t create a table. SQL can perform calculations and return results without stored data.

 

8. Understanding What Just Happened

When you clicked Execute, here’s what occurred:

  1. pgAdmin sent your SQL to the PostgreSQL server over the network connection (localhost in this case)
  2. PostgreSQL’s query parser validated your SQL syntax
  3. PostgreSQL’s query engine executed the SELECT statement
  4. PostgreSQL returned results to pgAdmin
  5. pgAdmin displayed results in table format

 

This is the request-response cycle you’ll use for every query in this course.

Key point: The database server is separate from the interface. You could connect to the same PostgreSQL server from:

  • pgAdmin (what you’re using)
  • Python scripts (using psycopg2 library)
  • Tableau (direct database connection)
  • Excel (via ODBC driver)
  • Web applications (via application code)

 

One database, many possible interfaces.

Summary

Databases solve problems that spreadsheets cannot:

  • They handle data volumes that exceed computer memory
  • They support concurrent access by multiple users
  • They enable complex queries that return only necessary data
  • They enforce data integrity through automated constraints

 

PostgreSQL is an enterprise-grade RDBMS:

  • Used by major technology companies
  • Open source and freely available
  • Standards-compliant (skills transfer across systems)
  • Feature-rich enough to support advanced analytics

 

pgAdmin provides a graphical interface:

  • Connects to PostgreSQL servers
  • Offers visual database navigation
  • Includes a query editor for SQL execution
  • Displays results in readable format

 

You’ve established your working environment and confirmed functionality with your first SQL queries.

Suggested References

Official Documentation:

 

Industry Context:

Exercise

Estimated Time to Complete: 1-2 hours

Exercise: Environment Validation and First Analysis
Objective

Verify your database environment is properly configured.

  1. Confirm PostgreSQL is running

    • Take a screenshot showing PostgreSQL service/process active
  2. Verify pgAdmin connection

    • Execute the query in the Query Tool to identify your SQL version.
    • Take a screenshot of the results
 
Submission Format

Submit a single PDF containing two screenshots (PostgreSQL running, query results)

 

Submission & Resubmission Guidelines

1. When submitting your exercise, use this naming format:

YourName_Submission_Lesson#.xxx

If you need to revise and resubmit, add a version suffix:

YourName_Submission_Lesson#_v2.xxx
YourName_Submission_Lesson#_v3.xxx

2. Do not overwrite or change the original evaluation entries in the rubric.

Instead, enter your updated responses or corrections in a new “v2” (or later) column in the rubric for mentor review.

Evaluation Rubric

Criterion Meets Expectation Needs Improvement Incomplete/Off-Track
Technical Verification

• Both screenshots clearly show required elements

• PostgreSQL service/process is demonstrably running

• Query executes successfully showing PostgreSQL version

• Screenshots are unclear or partially incomplete

• One screenshot is missing

• Query results are present but screenshot quality makes details difficult to verify

• Installation is incomplete or pgAdmin cannot connect to PostgreSQL

• Screenshots are missing or don’t show required verification elements

• Query did not execute or shows errors

Student Submissions

Check out recently submitted work by other students to get an idea of what’s required for this Exercise:

Approved on Jan 5th

 

Got Feedback?

Drop us a line here.

Contact

Talk to us

Have questions or feedback about Lumen? We’d love to hear from you.