NULL handling and common filtering mistakes Interview Questions

Overview

NULL in SQL means a value is missing, unknown, not applicable, or not yet supplied. It is not the same as zero, an empty string, or a false Boolean value. Because SQL uses three-valued logic, expressions involving NULL can evaluate to UNKNOWN, not just TRUE or FALSE.

This topic matters because many SQL bugs are caused by filtering as if NULL were an ordinary value. Common mistakes include writing Column = NULL, using <> and accidentally excluding null rows, using NOT IN with nullable subqueries, filtering a LEFT JOIN in the wrong place, replacing nulls too early, and misunderstanding how aggregates treat null values.

For interviews, strong candidates can explain IS NULL, IS NOT NULL, UNKNOWN, COUNT(*) versus COUNT(column), NOT IN versus NOT EXISTS, null behavior in joins, and how to write filters that match the business meaning of missing data.

The practical goal is to make missing data explicit. Decide what NULL means in the model, write predicates that handle it intentionally, and avoid hiding null-related bugs with broad replacement functions.

Core Concepts

What NULL Means

NULL means the database does not have a value for that column in that row.

Possible meanings:

Unknown: the customer's middle name has not been collected.
Not applicable: an order has no cancellation reason because it was not cancelled.
Not yet happened: ShippedAtUtc is null because the order has not shipped.
Optional: a secondary phone number is not provided.

These meanings are different. A good schema and query should make the intended meaning clear.

Example:

Code

CREATE TABLE dbo.Orders
(
    OrderId BIGINT NOT NULL PRIMARY KEY,
    Status NVARCHAR(20) NOT NULL,
    OrderedAtUtc DATETIME2(3) NOT NULL,
    ShippedAtUtc DATETIME2(3) NULL,
    CancelledAtUtc DATETIME2(3) NULL
);

Here ShippedAtUtc IS NULL can mean the order has not shipped yet. CancelledAtUtc IS NULL can mean the order has not been cancelled. Those are meaningful business states.

NULL Is Not Zero or Empty String

NULL, 0, and '' are different.

Code

SELECT *
FROM dbo.Products
WHERE Weight IS NULL;

This finds rows where weight is missing.

Code

SELECT *
FROM dbo.Products
WHERE Weight = 0;

This finds rows where weight is known to be zero.

For strings:

Code

WHERE MiddleName IS NULL

means there is no stored value.

Code

WHERE MiddleName = N''

means there is a stored empty string. Whether that should be allowed is a data modeling decision.

Three-Valued Logic

SQL predicates can evaluate to:

TRUE
FALSE
UNKNOWN

The WHERE clause returns rows where the predicate is TRUE. It does not return rows where the predicate is FALSE or UNKNOWN.

Example:

Code

SELECT *
FROM dbo.Customers
WHERE MiddleName = N'Lee';

For rows where MiddleName is NULL, the comparison is UNKNOWN, not FALSE. Those rows are not returned.

This is why NULL requires explicit handling.

Use IS NULL and IS NOT NULL

Use IS NULL and IS NOT NULL to test nullness.

Bad:

Code

SELECT *
FROM dbo.Orders
WHERE ShippedAtUtc = NULL;

Correct:

Code

SELECT *
FROM dbo.Orders
WHERE ShippedAtUtc IS NULL;

For non-null values:

Code

SELECT *
FROM dbo.Orders
WHERE ShippedAtUtc IS NOT NULL;

Do not depend on old settings that change null comparison behavior. Modern SQL Server uses ANSI null behavior, and new code should use IS NULL and IS NOT NULL.

Equality and Inequality With NULL

Equality and inequality comparisons with NULL do not behave like comparisons with ordinary values.

Code

-- Does not return rows where Status is NULL.
SELECT *
FROM dbo.Orders
WHERE Status <> N'Cancelled';

If the business rule is "all orders that are not cancelled, including rows with no status yet," write that explicitly:

Code

SELECT *
FROM dbo.Orders
WHERE Status <> N'Cancelled'
   OR Status IS NULL;

If Status should never be missing, the better fix is schema enforcement:

Code

Status NVARCHAR(20) NOT NULL

and a check constraint for allowed values.

NOT IN With NULL

NOT IN can produce surprising results when the list or subquery contains NULL.

Problem:

Code

SELECT c.CustomerId
FROM dbo.Customers AS c
WHERE c.CustomerId NOT IN
(
    SELECT o.CustomerId
    FROM dbo.Orders AS o
);

If Orders.CustomerId contains a NULL, the NOT IN comparison can become UNKNOWN for every customer, returning no rows.

Safer anti-join:

Code

SELECT c.CustomerId
FROM dbo.Customers AS c
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.Orders AS o
    WHERE o.CustomerId = c.CustomerId
);

If using NOT IN, filter nulls inside the subquery:

Code

WHERE c.CustomerId NOT IN
(
    SELECT o.CustomerId
    FROM dbo.Orders AS o
    WHERE o.CustomerId IS NOT NULL
);

NOT EXISTS is often clearer for "find rows with no matching row."

IN With Nullable Values

IN is usually safe for positive matching, but it still does not match NULL using normal equality.

Code

SELECT *
FROM dbo.Products
WHERE Color IN (N'Red', N'Blue', NULL);

This does not find rows where Color IS NULL in the way many beginners expect.

Write:

Code

SELECT *
FROM dbo.Products
WHERE Color IN (N'Red', N'Blue')
   OR Color IS NULL;

Be explicit when nulls are part of the intended result.

LEFT JOIN Filtering Mistakes

A LEFT JOIN keeps rows from the left table even when no matching right-side row exists. But a WHERE condition on the right table can accidentally turn it into an inner join.

Bad:

Code

SELECT
    c.CustomerId,
    o.OrderId
FROM dbo.Customers AS c
LEFT JOIN dbo.Orders AS o
    ON o.CustomerId = c.CustomerId
WHERE o.Status = N'Completed';

Rows with no order have o.Status as NULL, so the WHERE predicate removes them.

If the goal is "all customers, with completed orders if they exist," put the right-side filter in the join condition:

Code

SELECT
    c.CustomerId,
    o.OrderId
FROM dbo.Customers AS c
LEFT JOIN dbo.Orders AS o
    ON o.CustomerId = c.CustomerId
   AND o.Status = N'Completed';

If the goal is "customers with completed orders," use an inner join or EXISTS.

Finding Missing Matches

To find rows with no match, use NOT EXISTS or a careful LEFT JOIN ... IS NULL pattern.

Using NOT EXISTS:

Code

SELECT c.CustomerId
FROM dbo.Customers AS c
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.Orders AS o
    WHERE o.CustomerId = c.CustomerId
);

Using LEFT JOIN:

Code

SELECT c.CustomerId
FROM dbo.Customers AS c
LEFT JOIN dbo.Orders AS o
    ON o.CustomerId = c.CustomerId
WHERE o.OrderId IS NULL;

Use a non-nullable key from the right table in the IS NULL check. Checking a nullable non-key column can produce false positives.

COALESCE and ISNULL

COALESCE returns the first non-null expression.

Code

SELECT
    CustomerId,
    COALESCE(DisplayName, Email, N'Unknown customer') AS CustomerLabel
FROM dbo.Customers;

ISNULL replaces NULL with a specified value in SQL Server.

Code

SELECT
    ProductId,
    ISNULL(Color, N'Unknown') AS ColorName
FROM dbo.Products;

Use these functions for display values and intentional replacement. Be careful when using them in filters because wrapping a column in a function can make indexes harder to use and can change business meaning.

Problem:

Code

WHERE ISNULL(Status, N'') <> N'Cancelled'

Clearer:

Code

WHERE Status <> N'Cancelled'
   OR Status IS NULL;

NULL and Aggregates

Most aggregate functions ignore NULL. COUNT(*) counts rows.

Code

SELECT
    COUNT(*) AS RowCount,
    COUNT(ShippedAtUtc) AS ShippedCount,
    AVG(DiscountAmount) AS AverageDiscountForRowsWithDiscount,
    AVG(COALESCE(DiscountAmount, 0)) AS AverageDiscountAcrossAllRows
FROM dbo.Orders;

AVG(DiscountAmount) and AVG(COALESCE(DiscountAmount, 0)) answer different questions. Do not replace nulls with zero unless zero is the intended business value.

NULL and GROUP BY

When a grouping column contains NULL, SQL Server puts all null values into one group.

Code

SELECT
    Region,
    COUNT(*) AS CustomerCount
FROM dbo.Customers
GROUP BY Region;

Rows with no region appear in a single NULL group.

For reporting display:

Code

SELECT
    COALESCE(Region, N'Unknown') AS RegionName,
    COUNT(*) AS CustomerCount
FROM dbo.Customers
GROUP BY COALESCE(Region, N'Unknown');

This is fine for display if Unknown is not also a legitimate region value.

Optional Parameter Filtering

A common stored procedure pattern uses nullable parameters:

Code

WHERE (@Status IS NULL OR Status = @Status)
  AND (@CustomerId IS NULL OR CustomerId = @CustomerId)

This means:

If @Status is null, do not filter by status.
If @Status has a value, filter by status.

This pattern is convenient but can produce poor plans for complex searches because one query shape must handle many selectivity patterns. For important search endpoints, consider dynamic SQL with parameters, separate query paths, or carefully designed indexes.

Do not confuse a null parameter meaning "no filter" with a search for rows where the column itself is null. Those are different requirements.

NULL in CHECK Constraints

Check constraints and nullability should be designed together.

Code

Quantity INT NOT NULL
    CONSTRAINT CK_OrderLines_Quantity CHECK (Quantity > 0)

NOT NULL says the value is required. CHECK says the value must be positive.

If a column is nullable, test what your constraint allows. Nullable values can interact with logical expressions in ways that surprise developers. Use explicit NOT NULL when presence is required.

Sargability and NULL Filters

A predicate is sargable when the optimizer can efficiently use an index seek or range seek. Wrapping an indexed column in a function often makes filtering less efficient.

Less ideal:

Code

WHERE COALESCE(Status, N'Unknown') = N'Completed'

Better:

Code

WHERE Status = N'Completed'

For a null-aware filter:

Code

WHERE Status = N'Completed'
   OR Status IS NULL;

Depending on the data and workload, filtered indexes can help:

Code

CREATE INDEX IX_Orders_Unshipped
ON dbo.Orders(OrderDate)
WHERE ShippedAtUtc IS NULL;

This supports queries that frequently find unshipped orders.

Common Filtering Mistakes

Common mistakes include:

Writing Column = NULL or Column <> NULL.
Forgetting that WHERE returns only TRUE, not UNKNOWN.
Using <> and accidentally excluding null rows.
Using NOT IN with nullable subqueries.
Adding right-table filters in WHERE after a LEFT JOIN.
Counting COUNT(column) when COUNT(*) was intended.
Replacing nulls with zero before understanding the business meaning.
Using COALESCE or ISNULL in predicates and hurting index usage.
Treating empty string, zero, and null as interchangeable.
Using nullable flags where NOT NULL DEFAULT 0 would be clearer.

Best Practices

Best practices include:

Decide what NULL means for each nullable column.
Use NOT NULL for required values.
Use IS NULL and IS NOT NULL.
Write null-inclusive filters explicitly.
Prefer NOT EXISTS for anti-joins when nulls may appear.
Keep right-side filters in the ON clause when preserving left rows.
Use COUNT(*) for rows and COUNT(column) for non-null values.
Use replacement functions for display, not as a default filtering habit.
Add filtered indexes for common null-state queries when justified.
Test queries with nulls, no matches, and duplicate matches.

NULL handling and common filtering mistakes

Overview

Core Concepts

What NULL Means

NULL Is Not Zero or Empty String

Three-Valued Logic

Use IS NULL and IS NOT NULL

Equality and Inequality With NULL

NOT IN With NULL

IN With Nullable Values

LEFT JOIN Filtering Mistakes

Finding Missing Matches

COALESCE and ISNULL

NULL and Aggregates

NULL and GROUP BY

Optional Parameter Filtering

NULL in CHECK Constraints

Sargability and NULL Filters

Common Filtering Mistakes

Best Practices

Interview Practice

Beginner Interview Practice

Intermediate Interview Practice

Advanced Interview Practice