NULLs in PostgreSQL

PostgreSQL Queries

The Ultimate PostGIS course

Get full access to this course and all its content.

₹3399.00 ₹1299.00
Buy Course Now
Chapter Info
Course The Ultimate PostGIS course
Module PostgreSQL Queries
Chapter NULLs in PostgreSQL

Chapter Content

NULL values are one of the most misunderstood concepts in database systems. In PostgreSQL, understanding how to properly handle NULLs is crucial for writing correct queries and avoiding common pitfalls. This comprehensive guide covers everything you need to know about working with NULL values in PostgreSQL.

What is NULL?

In PostgreSQL, NULL represents unknown, missing, or not applicable data. It's important to understand that NULL is not the same as:

NULL is NOT Example Explanation
Zero (0) 0 Zero is a specific numeric value
Empty string '' Empty string is a string with zero length
FALSE false FALSE is a specific boolean value
Space character ' ' Space is a printable character

NULL is a special marker that indicates the absence of any value. It represents uncertainty—we don't know what the value is, or it doesn't exist.

Setting Up Sample Data

Let's create a projects table with some NULL values to demonstrate various concepts:

CREATE TABLE projects (
  id SERIAL PRIMARY KEY,
  name TEXT,
  budget NUMERIC,
  deadline DATE,
  status TEXT,
  team_size INTEGER
);

INSERT INTO projects (name, budget, deadline, status, team_size) VALUES
('Website Redesign', 10000, '2024-12-31', 'Active', 5),
('Mobile App', NULL, '2024-10-15', 'Planning', 3),
('Data Cleanup', 5000, NULL, 'Active', NULL),
('Market Research', NULL, NULL, 'On Hold', 2),
('API Development', 15000, '2024-11-30', NULL, 4);

Identifying NULL Values

Using IS NULL and IS NOT NULL

To find NULL values, you must use IS NULL or IS NOT NULL:

-- Find projects with no budget assigned
SELECT * FROM projects WHERE budget IS NULL;

-- Find projects with known deadlines
SELECT * FROM projects WHERE deadline IS NOT NULL;

-- Find projects with both budget and deadline
SELECT * FROM projects 
WHERE budget IS NOT NULL AND deadline IS NOT NULL;

Common Mistake: Using = or != with NULL

-- These DON'T work with NULL:
SELECT * FROM projects WHERE budget = NULL;    -- Wrong! Returns no rows
SELECT * FROM projects WHERE budget != NULL;   -- Wrong! Returns no rows

-- Always use IS NULL or IS NOT NULL:
SELECT * FROM projects WHERE budget IS NULL;     -- Correct!
SELECT * FROM projects WHERE budget IS NOT NULL; -- Correct!
Operator Works with NULL? Example Result
= No NULL = NULL NULL (unknown)
!= No NULL != 5 NULL (unknown)
IS NULL Yes NULL IS NULL TRUE
IS NOT NULL Yes 5 IS NOT NULL TRUE

NULL Behavior in Comparisons and Expressions

Arithmetic Operations with NULL

When you perform arithmetic operations involving NULL, the result is always NULL:

-- Arithmetic with NULL always returns NULL
SELECT 
  name,
  budget,
  budget + 1000 AS increased_budget,
  budget * 1.1 AS budget_with_tax
FROM projects;

-- Example results:
-- 'Website Redesign' | 10000 | 11000 | 11000
-- 'Mobile App'       | NULL  | NULL  | NULL

Comparison Operations with NULL

-- Comparisons with NULL return NULL (treated as FALSE in WHERE)
SELECT name, budget FROM projects WHERE budget > 5000;
-- This will NOT include rows where budget is NULL

-- To include NULLs in comparisons, handle them explicitly
SELECT name, budget FROM projects 
WHERE budget > 5000 OR budget IS NULL;

Three-Valued Logic

PostgreSQL uses three-valued logic: TRUE, FALSE, and NULL (unknown). This affects how logical operations work:

Expression Result Explanation
TRUE AND NULL NULL Unknown because we don't know the NULL value
FALSE AND NULL FALSE FALSE regardless of the unknown value
TRUE OR NULL TRUE TRUE regardless of the unknown value
FALSE OR NULL NULL Unknown because we don't know the NULL value
NOT NULL NULL Negation of unknown is still unknown

Handling NULL Values

COALESCE Function

COALESCE returns the first non-NULL value from a list of expressions:

-- Replace NULL budget with 0
SELECT 
  name, 
  COALESCE(budget, 0) AS safe_budget,
  COALESCE(team_size, 1) AS safe_team_size
FROM projects;

-- Multiple fallback values
SELECT 
  name,
  COALESCE(status, 'Unknown', 'Not Set') AS project_status
FROM projects;

-- Use in calculations
SELECT 
  name,
  COALESCE(budget, 0) * COALESCE(team_size, 1) AS budget_per_person
FROM projects;

NULLIF Function

NULLIF returns NULL if two values are equal, otherwise returns the first value:

-- Convert 0 to NULL (treat 0 as "no budget")
SELECT 
  name,
  NULLIF(budget, 0) AS budget_or_null
FROM projects;

-- Convert empty strings to NULL
SELECT 
  name,
  NULLIF(TRIM(status), '') AS clean_status
FROM projects;

-- Avoid division by zero
SELECT 
  name,
  budget / NULLIF(team_size, 0) AS budget_per_person
FROM projects;

CASE Statements with NULL

-- Complex NULL handling with CASE
SELECT 
  name,
  CASE 
    WHEN budget IS NULL THEN 'Budget not set'
    WHEN budget = 0 THEN 'No budget allocated'
    WHEN budget < 5000 THEN 'Small budget'
    WHEN budget < 15000 THEN 'Medium budget'
    ELSE 'Large budget'
  END AS budget_category
FROM projects;

NULL in Aggregate Functions

Most aggregate functions ignore NULL values:

-- Aggregate functions ignore NULLs
SELECT 
  COUNT(*) AS total_projects,           -- Counts all rows including NULLs
  COUNT(budget) AS projects_with_budget, -- Counts only non-NULL budgets
  AVG(budget) AS average_budget,        -- Calculates average ignoring NULLs
  SUM(budget) AS total_budget,          -- Sums only non-NULL values
  MIN(budget) AS min_budget,            -- Minimum non-NULL value
  MAX(budget) AS max_budget             -- Maximum non-NULL value
FROM projects;
Function NULL Handling Example Result
COUNT(*) Counts all rows 5 (includes NULL rows)
COUNT(column) Counts non-NULL values 3 (excludes NULL budgets)
AVG(column) Ignores NULLs Average of non-NULL values only
SUM(column) Ignores NULLs Sum of non-NULL values only

NULL in Sorting (ORDER BY)

PostgreSQL allows you to control where NULL values appear in sorted results:

-- Default behavior (NULLs last for ASC, first for DESC)
SELECT name, budget FROM projects ORDER BY budget;

-- Explicitly control NULL positioning
SELECT name, budget FROM projects ORDER BY budget NULLS FIRST;
SELECT name, budget FROM projects ORDER BY budget NULLS LAST;

-- Multiple columns with NULL handling
SELECT name, budget, deadline 
FROM projects 
ORDER BY budget NULLS LAST, deadline NULLS FIRST;
Sort Direction Default NULL Position Override Options
ASC NULLS LAST NULLS FIRST
DESC NULLS FIRST NULLS LAST

NULL in Joins

NULL values in join conditions require special attention:

-- Create related table
CREATE TABLE project_managers (
  project_id INTEGER,
  manager_name TEXT
);

INSERT INTO project_managers VALUES
(1, 'Alice'),
(2, 'Bob'),
(NULL, 'Charlie');  -- Manager not assigned to specific project

-- Standard join - NULL project_id won't match
SELECT p.name, pm.manager_name
FROM projects p
LEFT JOIN project_managers pm ON p.id = pm.project_id;

-- To include unassigned managers, handle NULLs explicitly
SELECT 
  COALESCE(p.name, 'Unassigned') AS project_name,
  pm.manager_name
FROM projects p
FULL OUTER JOIN project_managers pm ON p.id = pm.project_id;

Advanced NULL Handling Techniques

Conditional Aggregation with NULLs

-- Count different types of NULL scenarios
SELECT 
  COUNT(CASE WHEN budget IS NULL THEN 1 END) AS projects_no_budget,
  COUNT(CASE WHEN deadline IS NULL THEN 1 END) AS projects_no_deadline,
  COUNT(CASE WHEN budget IS NULL AND deadline IS NULL THEN 1 END) AS projects_no_budget_or_deadline
FROM projects;

NULL-Safe Equality Comparison

-- PostgreSQL doesn't have NULL-safe equality, but you can create it
SELECT 
  p1.name AS project1,
  p2.name AS project2
FROM projects p1, projects p2
WHERE p1.id < p2.id
  AND (
    (p1.budget = p2.budget) OR 
    (p1.budget IS NULL AND p2.budget IS NULL)
  );

-- Or use COALESCE for comparison
SELECT 
  p1.name AS project1,
  p2.name AS project2
FROM projects p1, projects p2
WHERE p1.id < p2.id
  AND COALESCE(p1.budget, -1) = COALESCE(p2.budget, -1);

Filtering with Complex NULL Logic

-- Projects that are either active or have unknown status
SELECT * FROM projects 
WHERE status = 'Active' OR status IS NULL;

-- Projects with budget issues (NULL or zero)
SELECT * FROM projects 
WHERE budget IS NULL OR budget = 0;

-- Projects ready for review (have budget and deadline)
SELECT * FROM projects 
WHERE budget IS NOT NULL 
  AND deadline IS NOT NULL 
  AND status IS NOT NULL;

Data Quality and NULL Analysis

Analyzing NULL Patterns

-- NULL analysis report
SELECT 
  'budget' AS column_name,
  COUNT(*) AS total_rows,
  COUNT(budget) AS non_null_count,
  COUNT(*) - COUNT(budget) AS null_count,
  ROUND(100.0 * (COUNT(*) - COUNT(budget)) / COUNT(*), 2) AS null_percentage
FROM projects

UNION ALL

SELECT 
  'deadline',
  COUNT(*),
  COUNT(deadline),
  COUNT(*) - COUNT(deadline),
  ROUND(100.0 * (COUNT(*) - COUNT(deadline)) / COUNT(*), 2)
FROM projects

UNION ALL

SELECT 
  'status',
  COUNT(*),
  COUNT(status),
  COUNT(*) - COUNT(status),
  ROUND(100.0 * (COUNT(*) - COUNT(status)) / COUNT(*), 2)
FROM projects;

Data Completeness Scoring

-- Calculate completeness score for each project
SELECT 
  name,
  (
    CASE WHEN budget IS NOT NULL THEN 1 ELSE 0 END +
    CASE WHEN deadline IS NOT NULL THEN 1 ELSE 0 END +
    CASE WHEN status IS NOT NULL THEN 1 ELSE 0 END +
    CASE WHEN team_size IS NOT NULL THEN 1 ELSE 0 END
  ) * 100.0 / 4 AS completeness_percentage
FROM projects
ORDER BY completeness_percentage DESC;

Best Practices for NULL Handling

  1. Always use IS NULL/IS NOT NULL: Never use = or != with NULL values
  2. Handle NULLs explicitly: Use COALESCE, CASE, or NULLIF to manage NULL values
  3. Consider three-valued logic: Remember that NULL comparisons return NULL, not FALSE
  4. Document NULL meanings: Clearly define what NULL represents in each column
  5. Use constraints wisely: Add NOT NULL constraints where appropriate
  6. Test with NULL data: Always test your queries with NULL values present
  7. Be consistent: Handle similar NULL scenarios consistently across your application

Common Pitfalls and Solutions

Pitfall Problem Solution
Using = NULL WHERE column = NULL never matches Use WHERE column IS NULL
Ignoring NULLs in calculations Unexpected NULL results Use COALESCE(column, default_value)
Assuming COUNT(*) = COUNT(column) Incorrect row counts Understand the difference between the two
Not handling NULLs in joins Missing data in results Use appropriate join types and NULL handling

Practice Exercises

Try these exercises with the sample data:

  1. List all projects where the budget is NULL
  2. List all projects with known deadlines
  3. Show projects with a 'safe_budget' column that replaces NULL with 0
  4. Calculate the average budget across all projects (how does NULL affect this?)
  5. Show how budget + 500 behaves when budget is NULL vs non-NULL
  6. Find projects that have either no budget OR no deadline
  7. Create a completeness report showing which projects have all required fields

Exercise Solutions

-- 1. Projects with NULL budget
SELECT * FROM projects WHERE budget IS NULL;

-- 2. Projects with known deadlines
SELECT * FROM projects WHERE deadline IS NOT NULL;

-- 3. Safe budget column
SELECT name, COALESCE(budget, 0) AS safe_budget FROM projects;

-- 4. Average budget (NULLs are ignored)
SELECT AVG(budget) AS average_budget FROM projects;

-- 5. Arithmetic with NULL
SELECT name, budget, budget + 500 AS budget_plus_500 FROM projects;

-- 6. Projects missing budget OR deadline
SELECT * FROM projects WHERE budget IS NULL OR deadline IS NULL;

-- 7. Completeness report
SELECT 
  name,
  budget IS NOT NULL AS has_budget,
  deadline IS NOT NULL AS has_deadline,
  status IS NOT NULL AS has_status,
  team_size IS NOT NULL AS has_team_size
FROM projects;