DEV_NET_CORE
GET_STARTED
SQLRelational modeling and normalization

INNER, LEFT, and other join patterns

Overview

SQL joins combine rows from two or more tables based on related columns or logical conditions. They are central to relational databases because normalized models intentionally split data into separate tables such as customers, orders, products, payments, users, roles, and audit records. Joins let you reconstruct meaningful business results from those related tables.

The most common join patterns are INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN, CROSS JOIN, and self joins. In practical SQL, you will also see semi-join and anti-join patterns using EXISTS, NOT EXISTS, and LEFT JOIN ... IS NULL.

This topic is important for interviews because joins reveal whether a candidate understands relational modeling, primary keys, foreign keys, optional relationships, cardinality, null behavior, aggregation, and query correctness. Many real production SQL bugs are caused by choosing the wrong join type, placing filters in the wrong clause, accidentally multiplying rows, or using DISTINCT to hide a bad join.

A strong interview answer should explain not only the syntax, but also the business meaning of each join type. For example, INNER JOIN means “only rows with a match,” while LEFT JOIN means “keep all rows from the left side, even when the optional related data is missing.”

Core Concepts

Basic sample schema

The examples below use a simple sales schema:

Code
CREATE TABLE Customers
(
    CustomerId int PRIMARY KEY,
    CustomerName varchar(100) NOT NULL
);

CREATE TABLE Orders
(
    OrderId int PRIMARY KEY,
    CustomerId int NOT NULL,
    OrderDate date NOT NULL,
    Status varchar(20) NOT NULL,
    CONSTRAINT FK_Orders_Customers
        FOREIGN KEY (CustomerId) REFERENCES Customers(CustomerId)
);

CREATE TABLE Products
(
    ProductId int PRIMARY KEY,
    ProductName varchar(100) NOT NULL
);

CREATE TABLE OrderItems
(
    OrderItemId int PRIMARY KEY,
    OrderId int NOT NULL,
    ProductId int NOT NULL,
    Quantity int NOT NULL,
    UnitPrice decimal(10, 2) NOT NULL,
    CONSTRAINT FK_OrderItems_Orders
        FOREIGN KEY (OrderId) REFERENCES Orders(OrderId),
    CONSTRAINT FK_OrderItems_Products
        FOREIGN KEY (ProductId) REFERENCES Products(ProductId)
);

CREATE TABLE Payments
(
    PaymentId int PRIMARY KEY,
    OrderId int NOT NULL,
    PaidAmount decimal(10, 2) NOT NULL,
    PaidAt datetime2 NOT NULL,
    CONSTRAINT FK_Payments_Orders
        FOREIGN KEY (OrderId) REFERENCES Orders(OrderId)
);

Relationship examples:

Code
Customers -> Orders      : one-to-many
Orders -> OrderItems     : one-to-many
Products -> OrderItems   : one-to-many
Orders -> Payments       : one-to-zero/many depending on business rules

Before writing a join, always understand the relationship between the tables. Join correctness depends on business meaning, not only syntax.

What a join does

A join combines rows using a predicate, usually matching a foreign key to a primary key.

Code
SELECT
    c.CustomerId,
    c.CustomerName,
    o.OrderId,
    o.OrderDate
FROM Customers AS c
INNER JOIN Orders AS o
    ON o.CustomerId = c.CustomerId;

The condition:

Code
o.CustomerId = c.CustomerId

means “match each order to its customer.”

INNER JOIN

INNER JOIN returns only rows where both sides match.

Code
SELECT
    c.CustomerName,
    o.OrderId,
    o.OrderDate
FROM Customers AS c
INNER JOIN Orders AS o
    ON o.CustomerId = c.CustomerId;

Business meaning:

Code
Show customers that have matching orders.

Customers without orders do not appear in the result.

Use INNER JOIN when:

  • A related row is required.
  • You only want rows that exist in both tables.
  • Missing related data should exclude the row.
  • You need columns from both tables for matched rows.

Common examples:

  • Orders with their customer.
  • Order items with their product.
  • Payments with their order.
  • Employees with a mandatory department.

LEFT JOIN

LEFT JOIN returns all rows from the left table and matching rows from the right table. If no right-side match exists, right-side columns are returned as NULL.

Code
SELECT
    c.CustomerId,
    c.CustomerName,
    o.OrderId,
    o.OrderDate
FROM Customers AS c
LEFT JOIN Orders AS o
    ON o.CustomerId = c.CustomerId;

Business meaning:

Code
Show all customers, including customers with no orders.

Use LEFT JOIN when:

  • The left-side row must be preserved.
  • Related data is optional.
  • You need to show missing related data.
  • You are building a report that includes all parent records.

Common examples:

  • All customers and their orders, including customers with no orders.
  • All products and sales, including products never sold.
  • All employees and managers, including top-level employees.
  • All users and last login, including users who never logged in.

RIGHT JOIN

RIGHT JOIN returns all rows from the right table and matching rows from the left table. If no left-side match exists, left-side columns are NULL.

Code
SELECT
    c.CustomerName,
    o.OrderId,
    o.OrderDate
FROM Customers AS c
RIGHT JOIN Orders AS o
    ON o.CustomerId = c.CustomerId;

In practice, many teams avoid RIGHT JOIN because the same logic can usually be written more readably as a LEFT JOIN by swapping table order:

Code
SELECT
    c.CustomerName,
    o.OrderId,
    o.OrderDate
FROM Orders AS o
LEFT JOIN Customers AS c
    ON c.CustomerId = o.CustomerId;

Use RIGHT JOIN only when it genuinely improves readability. Most production SQL codebases prefer LEFT JOIN for preserved-side logic.

FULL OUTER JOIN

FULL OUTER JOIN returns matched rows and unmatched rows from both sides.

Code
SELECT
    c.CustomerId,
    c.CustomerName,
    o.OrderId,
    o.OrderDate
FROM Customers AS c
FULL OUTER JOIN Orders AS o
    ON o.CustomerId = c.CustomerId;

Business meaning:

Code
Show all customers and all orders, whether or not a match exists.

This is useful for reconciliation, migration validation, and comparing two data sets.

Example: compare customers from two systems:

Code
SELECT
    old.CustomerId AS OldCustomerId,
    new.CustomerId AS NewCustomerId,
    old.CustomerName AS OldCustomerName,
    new.CustomerName AS NewCustomerName
FROM OldSystemCustomers AS old
FULL OUTER JOIN NewSystemCustomers AS new
    ON new.CustomerId = old.CustomerId
WHERE old.CustomerId IS NULL
   OR new.CustomerId IS NULL
   OR old.CustomerName <> new.CustomerName;

This finds records missing from either system or records with changed values.

CROSS JOIN

CROSS JOIN returns every combination of rows from both tables. It has no ON condition.

Code
SELECT
    c.CustomerName,
    p.ProductName
FROM Customers AS c
CROSS JOIN Products AS p;

If there are 100 customers and 50 products, the result contains:

Code
100 * 50 = 5,000 rows

Use CROSS JOIN when every combination is intentional, such as:

  • Building product-by-month reporting grids.
  • Creating date/customer/product combinations.
  • Generating test data.
  • Combining small lookup sets.

Common mistake: accidentally creating a cross join by forgetting the join condition.

SELF JOIN

A self join joins a table to itself. This is useful when rows in the same table relate to other rows in that same table.

Example employee hierarchy:

Code
CREATE TABLE Employees
(
    EmployeeId int PRIMARY KEY,
    EmployeeName varchar(100) NOT NULL,
    ManagerId int NULL,
    CONSTRAINT FK_Employees_Manager
        FOREIGN KEY (ManagerId) REFERENCES Employees(EmployeeId)
);

Query employees and their managers:

Code
SELECT
    e.EmployeeName AS EmployeeName,
    m.EmployeeName AS ManagerName
FROM Employees AS e
LEFT JOIN Employees AS m
    ON m.EmployeeId = e.ManagerId;

A LEFT JOIN is used because top-level employees may not have a manager.

Self joins are common for:

  • Employee-manager hierarchy.
  • Category-parent category hierarchy.
  • Related products.
  • Previous/next records.
  • Comparing rows in the same table.

Joining more than two tables

Real queries often join several tables.

Code
SELECT
    c.CustomerName,
    o.OrderId,
    o.OrderDate,
    p.ProductName,
    oi.Quantity,
    oi.UnitPrice,
    oi.Quantity * oi.UnitPrice AS LineTotal
FROM Orders AS o
INNER JOIN Customers AS c
    ON c.CustomerId = o.CustomerId
INNER JOIN OrderItems AS oi
    ON oi.OrderId = o.OrderId
INNER JOIN Products AS p
    ON p.ProductId = oi.ProductId;

This returns one row per order item, with customer and product details.

Important point: joining one-to-many tables increases row count. One order with five items becomes five result rows.

Join cardinality

Cardinality describes how many rows from one table can relate to rows in another table.

RelationshipMeaningJoin impact
One-to-oneOne row relates to at most one rowUsually does not multiply rows
One-to-manyOne parent has many childrenParent rows can repeat
Many-to-manyMany rows relate through a junction tableCan multiply rows significantly
Optional relationshipRelated row may not existOften requires LEFT JOIN

Example one-to-many multiplication:

Code
SELECT
    o.OrderId,
    oi.OrderItemId
FROM Orders AS o
INNER JOIN OrderItems AS oi
    ON oi.OrderId = o.OrderId;

If one order has three items, that order appears three times. This is expected, not automatically a duplicate bug.

Many-to-many join pattern

A many-to-many relationship is usually modeled with a junction table.

Code
CREATE TABLE Students
(
    StudentId int PRIMARY KEY,
    StudentName varchar(100) NOT NULL
);

CREATE TABLE Courses
(
    CourseId int PRIMARY KEY,
    CourseName varchar(100) NOT NULL
);

CREATE TABLE StudentCourses
(
    StudentId int NOT NULL,
    CourseId int NOT NULL,
    PRIMARY KEY (StudentId, CourseId),
    FOREIGN KEY (StudentId) REFERENCES Students(StudentId),
    FOREIGN KEY (CourseId) REFERENCES Courses(CourseId)
);

Query students and their courses:

Code
SELECT
    s.StudentName,
    c.CourseName
FROM Students AS s
INNER JOIN StudentCourses AS sc
    ON sc.StudentId = s.StudentId
INNER JOIN Courses AS c
    ON c.CourseId = sc.CourseId;

The junction table stores the relationship. The query joins through it.

LEFT JOIN for missing data

A common anti-join pattern is LEFT JOIN plus IS NULL.

Customers with no orders:

Code
SELECT
    c.CustomerId,
    c.CustomerName
FROM Customers AS c
LEFT JOIN Orders AS o
    ON o.CustomerId = c.CustomerId
WHERE o.OrderId IS NULL;

Business meaning:

Code
Find customers that do not have any orders.

This works because customers without matching orders get NULL right-side columns.

NOT EXISTS anti-join pattern

NOT EXISTS is another common way to find rows without a match.

Code
SELECT
    c.CustomerId,
    c.CustomerName
FROM Customers AS c
WHERE NOT EXISTS
(
    SELECT 1
    FROM Orders AS o
    WHERE o.CustomerId = c.CustomerId
);

NOT EXISTS is often preferred because it clearly expresses the intent and avoids null-related pitfalls that can occur with NOT IN.

EXISTS semi-join pattern

Use EXISTS when you only need to know whether a related row exists.

Customers with at least one paid order:

Code
SELECT
    c.CustomerId,
    c.CustomerName
FROM Customers AS c
WHERE EXISTS
(
    SELECT 1
    FROM Orders AS o
    WHERE o.CustomerId = c.CustomerId
      AND o.Status = 'Paid'
);

This avoids accidental row multiplication. If a customer has five paid orders, the customer still appears once.

Compare with an inner join:

Code
SELECT
    c.CustomerId,
    c.CustomerName
FROM Customers AS c
INNER JOIN Orders AS o
    ON o.CustomerId = c.CustomerId
WHERE o.Status = 'Paid';

This returns one row per paid order unless DISTINCT or grouping is added.

INNER JOIN vs EXISTS

Use INNER JOIN when:

  • You need columns from both tables.
  • You want one row per matching combination.
  • Row multiplication is expected and meaningful.

Use EXISTS when:

  • You only need to test whether a match exists.
  • You only return columns from the outer table.
  • You want to avoid duplicate parent rows.
  • The related table is only used as a filter.

LEFT JOIN filter placement: ON vs WHERE

A very common SQL interview trap is filtering the right table of a LEFT JOIN in the WHERE clause.

Requirement:

Code
Show all customers and their paid orders if they have any.

Correct:

Code
SELECT
    c.CustomerName,
    o.OrderId,
    o.Status
FROM Customers AS c
LEFT JOIN Orders AS o
    ON o.CustomerId = c.CustomerId
   AND o.Status = 'Paid';

This keeps all customers. Customers with no paid orders still appear with NULL order columns.

Incorrect for that requirement:

Code
SELECT
    c.CustomerName,
    o.OrderId,
    o.Status
FROM Customers AS c
LEFT JOIN Orders AS o
    ON o.CustomerId = c.CustomerId
WHERE o.Status = 'Paid';

This removes rows where o.Status is NULL, so the query behaves like an INNER JOIN for paid orders.

Rule of thumb:

Code
Filters that decide which right-side rows match usually belong in ON.
Filters that decide which final result rows to keep usually belong in WHERE.

Filtering the preserved side of a LEFT JOIN

Filtering the preserved left table in WHERE is usually safe.

Code
SELECT
    c.CustomerName,
    o.OrderId
FROM Customers AS c
LEFT JOIN Orders AS o
    ON o.CustomerId = c.CustomerId
WHERE c.CustomerName LIKE 'A%';

This means:

Code
Show customers whose names start with A, and include their orders if any.

The filter applies to customers, not optional order rows.

Null handling in joins

SQL NULL means unknown or missing. In normal SQL equality comparisons:

Code
NULL = NULL

does not evaluate as true.

This means nullable join columns do not match each other using = when both sides are NULL.

Code
SELECT
    a.Id,
    b.Id
FROM TableA AS a
INNER JOIN TableB AS b
    ON a.OptionalCode = b.OptionalCode;

Rows where both OptionalCode values are NULL will not match.

If the business rule requires nulls to match, you must express that explicitly:

Code
SELECT
    a.Id,
    b.Id
FROM TableA AS a
INNER JOIN TableB AS b
    ON a.OptionalCode = b.OptionalCode
    OR (a.OptionalCode IS NULL AND b.OptionalCode IS NULL);

Be careful because this can affect performance. A better model may avoid nullable join keys when the relationship is mandatory.

NULL values produced by outer joins

Outer joins produce NULL values for missing related rows.

Code
SELECT
    c.CustomerName,
    o.OrderId
FROM Customers AS c
LEFT JOIN Orders AS o
    ON o.CustomerId = c.CustomerId;

If a customer has no orders, o.OrderId is NULL.

For display, you can use COALESCE:

Code
SELECT
    c.CustomerName,
    COALESCE(CAST(o.OrderId AS varchar(20)), 'No order') AS OrderDisplay
FROM Customers AS c
LEFT JOIN Orders AS o
    ON o.CustomerId = c.CustomerId;

Use display replacement separately from join logic. Replacing nulls inside join predicates can hide data quality problems or reduce index usage.

Joining with multiple conditions

A join can use multiple conditions, especially with composite keys.

Code
SELECT
    s.Sku,
    s.WarehouseId,
    i.QuantityOnHand
FROM ShipmentLines AS s
INNER JOIN Inventory AS i
    ON i.Sku = s.Sku
   AND i.WarehouseId = s.WarehouseId;

Best practices:

  • Join on all parts of a composite key.
  • Do not join only on one column if the relationship requires multiple columns.
  • Use clear aliases.
  • Keep join column data types consistent.

Non-key and range joins

Most joins use key relationships, but SQL can join on ranges or other predicates.

Code
SELECT
    o.OrderId,
    o.OrderDate,
    p.PromotionName
FROM Orders AS o
INNER JOIN Promotions AS p
    ON o.OrderDate >= p.StartDate
   AND o.OrderDate < p.EndDate;

Business meaning:

Code
Match each order to the promotion active on the order date.

Non-key joins are common in reporting, temporal records, price ranges, and data warehouses. They require careful testing because they can easily produce multiple matches per row.

Aliases and readability

Good aliases make joins easier to read.

Code
SELECT
    c.CustomerName,
    o.OrderId
FROM Customers AS c
INNER JOIN Orders AS o
    ON o.CustomerId = c.CustomerId;

Use short but meaningful aliases:

  • c for Customers.
  • o for Orders.
  • oi for OrderItems.
  • p for Products.

Avoid unclear aliases in large queries.

Explicit JOIN syntax vs old-style joins

Old-style comma joins put tables in the FROM clause and conditions in WHERE.

Code
SELECT
    c.CustomerName,
    o.OrderId
FROM Customers AS c, Orders AS o
WHERE o.CustomerId = c.CustomerId;

Modern explicit join syntax is preferred:

Code
SELECT
    c.CustomerName,
    o.OrderId
FROM Customers AS c
INNER JOIN Orders AS o
    ON o.CustomerId = c.CustomerId;

Reasons to prefer explicit JOIN syntax:

  • It separates join conditions from filters.
  • It improves readability.
  • It reduces accidental cross joins.
  • It handles outer joins clearly.
  • It is the standard style in modern SQL codebases.

Joining and aggregation

Joining one-to-many tables before aggregating can multiply rows. This is expected, but it must be handled correctly.

Total sales by customer:

Code
SELECT
    c.CustomerId,
    c.CustomerName,
    SUM(oi.Quantity * oi.UnitPrice) AS TotalSales
FROM Customers AS c
INNER JOIN Orders AS o
    ON o.CustomerId = c.CustomerId
INNER JOIN OrderItems AS oi
    ON oi.OrderId = o.OrderId
GROUP BY
    c.CustomerId,
    c.CustomerName;

This is correct because the total is calculated at the order-item level.

However, joining multiple child tables can cause double counting.

Problem example:

Code
SELECT
    o.OrderId,
    SUM(oi.Quantity * oi.UnitPrice) AS ItemTotal,
    SUM(p.PaidAmount) AS PaidTotal
FROM Orders AS o
LEFT JOIN OrderItems AS oi
    ON oi.OrderId = o.OrderId
LEFT JOIN Payments AS p
    ON p.OrderId = o.OrderId
GROUP BY
    o.OrderId;

If an order has three items and two payments, the join produces six rows. Both totals may be inflated.

Better approach: aggregate each child table first, then join the aggregated results.

Code
WITH ItemTotals AS
(
    SELECT
        OrderId,
        SUM(Quantity * UnitPrice) AS ItemTotal
    FROM OrderItems
    GROUP BY OrderId
),
PaymentTotals AS
(
    SELECT
        OrderId,
        SUM(PaidAmount) AS PaidTotal
    FROM Payments
    GROUP BY OrderId
)
SELECT
    o.OrderId,
    COALESCE(i.ItemTotal, 0) AS ItemTotal,
    COALESCE(p.PaidTotal, 0) AS PaidTotal
FROM Orders AS o
LEFT JOIN ItemTotals AS i
    ON i.OrderId = o.OrderId
LEFT JOIN PaymentTotals AS p
    ON p.OrderId = o.OrderId;

DISTINCT is not a join fix

Developers sometimes add DISTINCT when a join returns more rows than expected.

Code
SELECT DISTINCT
    c.CustomerId,
    c.CustomerName
FROM Customers AS c
INNER JOIN Orders AS o
    ON o.CustomerId = c.CustomerId;

This may produce a list of customers with orders, but it hides the reason rows multiplied. A clearer query is:

Code
SELECT
    c.CustomerId,
    c.CustomerName
FROM Customers AS c
WHERE EXISTS
(
    SELECT 1
    FROM Orders AS o
    WHERE o.CustomerId = c.CustomerId
);

Use DISTINCT when the requirement is truly to remove duplicate rows, not as a quick fix for a misunderstood join.

Logical query processing and join filters

A simplified logical order is:

Code
FROM and JOIN
ON
WHERE
GROUP BY
HAVING
SELECT
ORDER BY

This matters because joins happen before WHERE filtering. For outer joins, WHERE filters can remove rows that the outer join preserved.

Physical join algorithms

SQL join syntax describes logical result behavior. The database optimizer chooses a physical algorithm to execute it.

Common physical join algorithms:

  • Nested loops join: often good when one input is small and the other has an efficient index lookup.
  • Hash join: often good for large unsorted inputs.
  • Merge join: often good when both inputs are sorted by the join key.

Interview point:

Code
INNER JOIN describes result semantics.
Nested loops, hash, and merge describe execution strategy.

Do not confuse logical join type with physical join algorithm.

Join performance basics

Join performance depends on:

  • Indexes on join keys.
  • Correct statistics.
  • Row counts and data distribution.
  • Join type.
  • Predicate selectivity.
  • Projection size.
  • Sargability.
  • Avoiding unnecessary joins.
  • Avoiding functions on indexed join columns.
  • Matching data types on both sides of the join.
  • Avoiding accidental many-to-many multiplication.

Useful indexes:

Code
CREATE INDEX IX_Orders_CustomerId
ON Orders(CustomerId);

CREATE INDEX IX_OrderItems_OrderId
ON OrderItems(OrderId);

CREATE INDEX IX_OrderItems_ProductId
ON OrderItems(ProductId);

Foreign key columns are often important index candidates because they are frequently used in joins.

Data type mismatch in joins

Joining columns with different data types can cause implicit conversions and poor performance.

Problem:

Code
-- Customers.CustomerId is int
-- Orders.CustomerIdText is varchar
SELECT
    c.CustomerName,
    o.OrderId
FROM Customers AS c
INNER JOIN Orders AS o
    ON o.CustomerIdText = c.CustomerId;

Better practices:

  • Use consistent data types in schema design.
  • Fix schema mismatches where possible.
  • Avoid joining numeric IDs to text columns.
  • Avoid conversion functions on indexed join keys in high-volume queries.

Joining on functions

Joining on expressions can reduce index usage.

Problem:

Code
SELECT
    a.Id,
    b.Id
FROM TableA AS a
INNER JOIN TableB AS b
    ON LOWER(a.Email) = LOWER(b.Email);

Better options:

  • Store normalized email values.
  • Use an appropriate collation.
  • Use computed indexed columns where appropriate.
  • Clean data before storing it.

OUTER APPLY and CROSS APPLY

In SQL Server, APPLY is useful when the right side depends on each row from the left side.

Example: latest order for each customer:

Code
SELECT
    c.CustomerId,
    c.CustomerName,
    latest.OrderId,
    latest.OrderDate
FROM Customers AS c
OUTER APPLY
(
    SELECT TOP (1)
        o.OrderId,
        o.OrderDate
    FROM Orders AS o
    WHERE o.CustomerId = c.CustomerId
    ORDER BY o.OrderDate DESC, o.OrderId DESC
) AS latest;

OUTER APPLY keeps customers without orders. CROSS APPLY removes left-side rows when the right-side query returns no rows.

Use APPLY when:

  • The right-side query depends on each left-side row.
  • You need top 1 or top N related rows per parent.
  • You are calling table-valued functions.
  • A normal join would be less readable.

Latest row per group pattern

A common interview task is “get each customer’s latest order.”

Using OUTER APPLY:

Code
SELECT
    c.CustomerId,
    c.CustomerName,
    latest.OrderId,
    latest.OrderDate
FROM Customers AS c
OUTER APPLY
(
    SELECT TOP (1)
        o.OrderId,
        o.OrderDate
    FROM Orders AS o
    WHERE o.CustomerId = c.CustomerId
    ORDER BY o.OrderDate DESC, o.OrderId DESC
) AS latest;

Using ROW_NUMBER:

Code
WITH RankedOrders AS
(
    SELECT
        o.OrderId,
        o.CustomerId,
        o.OrderDate,
        ROW_NUMBER() OVER
        (
            PARTITION BY o.CustomerId
            ORDER BY o.OrderDate DESC, o.OrderId DESC
        ) AS RowNumber
    FROM Orders AS o
)
SELECT
    c.CustomerId,
    c.CustomerName,
    r.OrderId,
    r.OrderDate
FROM Customers AS c
LEFT JOIN RankedOrders AS r
    ON r.CustomerId = c.CustomerId
   AND r.RowNumber = 1;

Both can be valid. The better option depends on indexing, data volume, and readability.

Reporting grid pattern

Sometimes a report must show rows even when data is missing. A common pattern is:

  1. Create the complete grid with CROSS JOIN.
  2. LEFT JOIN actual facts.
  3. Use COALESCE to show zero when no data exists.
Code
WITH Months AS
(
    SELECT DATEFROMPARTS(2026, 1, 1) AS MonthStart
    UNION ALL
    SELECT DATEFROMPARTS(2026, 2, 1)
    UNION ALL
    SELECT DATEFROMPARTS(2026, 3, 1)
),
SalesByProductMonth AS
(
    SELECT
        oi.ProductId,
        DATEFROMPARTS(YEAR(o.OrderDate), MONTH(o.OrderDate), 1) AS MonthStart,
        SUM(oi.Quantity * oi.UnitPrice) AS SalesAmount
    FROM Orders AS o
    INNER JOIN OrderItems AS oi
        ON oi.OrderId = o.OrderId
    GROUP BY
        oi.ProductId,
        DATEFROMPARTS(YEAR(o.OrderDate), MONTH(o.OrderDate), 1)
)
SELECT
    p.ProductName,
    m.MonthStart,
    COALESCE(s.SalesAmount, 0) AS SalesAmount
FROM Products AS p
CROSS JOIN Months AS m
LEFT JOIN SalesByProductMonth AS s
    ON s.ProductId = p.ProductId
   AND s.MonthStart = m.MonthStart;

This shows every product-month combination, even if sales are zero.

Common join mistakes

Common mistakes include:

  • Using INNER JOIN when optional rows should be preserved.
  • Filtering the right side of a LEFT JOIN in WHERE and accidentally removing unmatched rows.
  • Forgetting the join condition and creating a cross join.
  • Joining on the wrong key, such as joining by name instead of ID.
  • Ignoring one-to-many row multiplication.
  • Using DISTINCT to hide a bad join.
  • Using NOT IN with nullable data.
  • Selecting too many columns.
  • Not indexing join keys.
  • Joining columns with mismatched data types.
  • Using functions on join columns.
  • Mixing old-style comma joins with explicit joins.

Best practices

Good SQL join habits include:

  • Use explicit JOIN ... ON syntax.
  • Use meaningful table aliases.
  • Join on keys, preferably primary key to foreign key.
  • Understand the table relationship before writing the query.
  • Choose INNER JOIN when a match is required.
  • Choose LEFT JOIN when the left row must be preserved.
  • Prefer LEFT JOIN over RIGHT JOIN for readability in most cases.
  • Use FULL OUTER JOIN for reconciliation and data comparison.
  • Use EXISTS when only existence matters.
  • Use NOT EXISTS for anti-join logic.
  • Put right-side filters for a LEFT JOIN in the ON clause when left rows must remain.
  • Aggregate child tables before joining when joining multiple one-to-many relationships.
  • Avoid DISTINCT as a quick fix for row multiplication.
  • Index foreign key columns and common join keys.
  • Keep join column data types consistent.
  • Review execution plans for expensive queries.
  • Test queries with rows that have no match.
  • Test queries with multiple child rows.
  • Test queries with nulls when nullable columns are involved.

Practical join selection guide

Code
Do I need only rows that match on both sides?
  -> Use INNER JOIN.

Do I need all rows from the left table, even without a match?
  -> Use LEFT JOIN.

Do I need all rows from both tables, matched and unmatched?
  -> Use FULL OUTER JOIN.

Do I need every combination of two sets?
  -> Use CROSS JOIN.

Do I need to join a table to itself?
  -> Use SELF JOIN.

Do I only need to know whether a match exists?
  -> Use EXISTS.

Do I need rows where no match exists?
  -> Use NOT EXISTS or LEFT JOIN ... IS NULL.

Do I need the latest or top related row per parent?
  -> Consider OUTER APPLY or a window function.

Interview Practice

PreviousData types, nullability, and business-rule enforcementNext UpNormalization and when denormalization is justified