MERGE and upsert patterns, including concurrency cautions Interview Questions

Overview

An upsert is an operation that updates a row when it already exists and inserts it when it does not. SQL Server supports several upsert patterns, including MERGE, explicit UPDATE then INSERT, INSERT then handle duplicate-key errors, and stored-procedure patterns wrapped in transactions.

MERGE is a single statement that can compare a source row set with a target table and then run INSERT, UPDATE, or DELETE actions based on match conditions. It is powerful for synchronization and data-loading scenarios, but it is also easy to misuse. A correct MERGE statement needs a stable match key, duplicate-safe source data, correct predicates, and careful concurrency thinking.

This topic matters because upsert logic appears in APIs, import jobs, ETL pipelines, configuration management, identity/profile updates, inventory updates, and synchronization tasks. The wrong pattern can create duplicates, overwrite newer data, deadlock under load, update the same row twice, or silently change too many rows.

For interviews, strong candidates can explain what MERGE does, when separate statements are safer, why unique constraints are still required, and how transaction isolation, locks, error handling, and retries affect correctness under concurrency.

Core Concepts

What MERGE Does

MERGE compares a source row set with a target table and applies actions based on whether rows match.

Basic shape:

Code

MERGE dbo.CustomerProfile AS target
USING
(
    SELECT
        @Email AS Email,
        @DisplayName AS DisplayName
) AS source
ON target.Email = source.Email
WHEN MATCHED THEN
    UPDATE SET
        DisplayName = source.DisplayName,
        UpdatedAt = SYSUTCDATETIME()
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Email, DisplayName, CreatedAt, UpdatedAt)
    VALUES (source.Email, source.DisplayName, SYSUTCDATETIME(), SYSUTCDATETIME());

The ON clause defines how source rows match target rows. WHEN MATCHED handles existing rows. WHEN NOT MATCHED BY TARGET handles source rows that do not exist in the target.

MERGE can also support WHEN NOT MATCHED BY SOURCE, commonly used for synchronization cleanup, but that clause is dangerous if the source is only a partial feed.

What Upsert Means

An upsert combines two intents:

Update the existing row for a business key.
Insert a new row when no row exists for that business key.

The business key might be:

Email
ExternalCustomerId
Sku
TenantId plus Name
Provider plus ProviderUserId

Example table:

Code

CREATE TABLE dbo.Users
(
    UserId BIGINT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    TenantId INT NOT NULL,
    Email NVARCHAR(320) NOT NULL,
    DisplayName NVARCHAR(200) NOT NULL,
    CreatedAt DATETIME2 NOT NULL,
    UpdatedAt DATETIME2 NOT NULL,

    CONSTRAINT UX_Users_Tenant_Email UNIQUE (TenantId, Email)
);

The unique constraint is not optional. It is the database-level protection that prevents duplicate users for the same tenant and email.

MERGE For A Simple Upsert

Example:

Code

MERGE dbo.Users WITH (HOLDLOCK) AS target
USING
(
    SELECT
        @TenantId AS TenantId,
        @Email AS Email,
        @DisplayName AS DisplayName
) AS source
ON target.TenantId = source.TenantId
AND target.Email = source.Email
WHEN MATCHED THEN
    UPDATE SET
        DisplayName = source.DisplayName,
        UpdatedAt = SYSUTCDATETIME()
WHEN NOT MATCHED BY TARGET THEN
    INSERT (TenantId, Email, DisplayName, CreatedAt, UpdatedAt)
    VALUES (source.TenantId, source.Email, source.DisplayName, SYSUTCDATETIME(), SYSUTCDATETIME())
OUTPUT
    $action AS MergeAction,
    inserted.UserId,
    inserted.Email;

This pattern uses:

A unique business key.
HOLDLOCK to reduce race conditions around the target key range.
OUTPUT to return whether the row was inserted or updated.
A clear ON clause based only on identity or business-key matching.

Even with this pattern, many teams prefer separate statements for high-concurrency OLTP upserts because the behavior is easier to reason about and test.

Separate UPDATE Then INSERT Pattern

One common alternative is to update first, then insert when no row was affected.

Example:

Code

BEGIN TRANSACTION;

UPDATE dbo.Users WITH (UPDLOCK, HOLDLOCK)
SET
    DisplayName = @DisplayName,
    UpdatedAt = SYSUTCDATETIME()
WHERE TenantId = @TenantId
  AND Email = @Email;

IF @@ROWCOUNT = 0
BEGIN
    INSERT dbo.Users (TenantId, Email, DisplayName, CreatedAt, UpdatedAt)
    VALUES (@TenantId, @Email, @DisplayName, SYSUTCDATETIME(), SYSUTCDATETIME());
END;

COMMIT TRANSACTION;

This is often easier to review than MERGE. The update path and insert path are explicit, and locking hints can protect the searched key range.

However, the unique constraint is still required. Locks are part of the correctness strategy, but constraints are the final guardrail.

INSERT Then Handle Duplicate Key Pattern

Another pattern inserts first and handles duplicate-key errors by updating.

Example:

Code

BEGIN TRY
    INSERT dbo.Users (TenantId, Email, DisplayName, CreatedAt, UpdatedAt)
    VALUES (@TenantId, @Email, @DisplayName, SYSUTCDATETIME(), SYSUTCDATETIME());
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() IN (2601, 2627)
    BEGIN
        UPDATE dbo.Users
        SET
            DisplayName = @DisplayName,
            UpdatedAt = SYSUTCDATETIME()
        WHERE TenantId = @TenantId
          AND Email = @Email;
    END
    ELSE
    BEGIN
        THROW;
    END;
END CATCH;

This can work well when inserts are the common case and duplicate conflicts are rare. It relies on a unique index or unique constraint to detect races. It must be used carefully so the update path does not hide unrelated constraint errors.

Concurrency Risks In Upserts

The classic upsert race looks like this:

Session A checks for a row and sees none.
Session B checks for the same row and sees none.
Both sessions try to insert.
Without a unique constraint, duplicates are created.
With a unique constraint, one session succeeds and the other gets a duplicate-key error.

That is why interview answers should include both transaction design and database constraints.

Concurrency tools include:

Unique constraints or unique indexes.
Explicit transactions.
Correct isolation level.
Lock hints such as UPDLOCK and HOLDLOCK.
Retry logic for deadlocks or duplicate-key races.
Idempotent application behavior.

No SQL syntax removes the need to model the key correctly.

HOLDLOCK, UPDLOCK, And Serializable Range Protection

HOLDLOCK is equivalent to serializable behavior for the table reference. It can protect the searched key range until the transaction completes, which helps prevent another transaction from inserting a matching row into the gap.

UPDLOCK asks SQL Server to take update locks when reading rows that may later be updated. This can reduce conversion deadlocks and coordinate writers.

Example:

Code

UPDATE dbo.Users WITH (UPDLOCK, HOLDLOCK)
SET
    DisplayName = @DisplayName,
    UpdatedAt = SYSUTCDATETIME()
WHERE TenantId = @TenantId
  AND Email = @Email;

These hints are not decoration. They communicate the concurrency intent: "I am checking this key because I may insert or update it, and other writers should not slip through the same key range."

Trade-offs:

Stronger locks reduce race conditions.
Stronger locks can reduce concurrency.
Poor indexes can cause broader locking than intended.
Locking strategy must be tested under concurrent load.

Source Duplicates And MERGE Failures

MERGE expects the source-to-target match to be well-defined. If multiple source rows match the same target row and the action tries to update or delete that target row, SQL Server can fail because the same target row cannot be updated more than once by one MERGE statement.

Problem:

Code

MERGE dbo.Products AS target
USING @ImportedProducts AS source
ON target.Sku = source.Sku
WHEN MATCHED THEN
    UPDATE SET Name = source.Name;

If @ImportedProducts contains the same Sku twice, the result is ambiguous.

Better:

Code

WITH DedupedSource AS
(
    SELECT
        Sku,
        MAX(Name) AS Name
    FROM @ImportedProducts
    GROUP BY Sku
)
MERGE dbo.Products AS target
USING DedupedSource AS source
ON target.Sku = source.Sku
WHEN MATCHED THEN
    UPDATE SET Name = source.Name;

In real systems, do not hide source duplicates with MAX unless that rule is correct. Often, duplicates should be rejected and reported as import errors.

The ON Clause Must Define Matching Only

A common MERGE mistake is putting target filters in the ON clause that are not truly part of the key.

Risky:

Code

MERGE dbo.Users AS target
USING @Rows AS source
ON target.TenantId = source.TenantId
AND target.Email = source.Email
AND target.IsDeleted = 0
WHEN NOT MATCHED BY TARGET THEN
    INSERT (TenantId, Email, DisplayName)
    VALUES (source.TenantId, source.Email, source.DisplayName);

If a soft-deleted row exists with the same tenant and email, the ON clause does not match it, so the NOT MATCHED branch may try to insert a duplicate. That might fail due to a unique constraint or create a logical duplicate if the constraint does not cover the right key.

Better approach:

Keep the ON clause focused on the match key.
Put action-specific conditions in WHEN MATCHED AND ....
Decide explicitly how soft-deleted rows should be handled.

WHEN NOT MATCHED BY SOURCE Caution

WHEN NOT MATCHED BY SOURCE means the target row did not appear in the source row set. It is useful for full synchronization jobs.

Example:

Code

WHEN NOT MATCHED BY SOURCE THEN
    UPDATE SET IsActive = 0;

This is dangerous when the source is only a partial feed. If today's import file contains only changed rows, then almost every target row is "not matched by source" and could be deactivated or deleted incorrectly.

Interview answer: only use WHEN NOT MATCHED BY SOURCE when the source represents the complete desired state for the target scope, such as all products for one tenant or all assignments for one role.

OUTPUT With MERGE

MERGE supports OUTPUT, including the special $action value.

Example:

Code

DECLARE @Changes TABLE
(
    ActionName NVARCHAR(10) NOT NULL,
    ProductId BIGINT NOT NULL,
    Sku NVARCHAR(50) NOT NULL
);

MERGE dbo.Products AS target
USING @Source AS source
ON target.Sku = source.Sku
WHEN MATCHED THEN
    UPDATE SET Name = source.Name
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Sku, Name)
    VALUES (source.Sku, source.Name)
OUTPUT
    $action,
    inserted.ProductId,
    inserted.Sku
INTO @Changes;

This is useful for auditing, returning changed keys to the application, or triggering downstream work. The output should be treated as part of the data-change contract, not just debug information.

MERGE Vs Separate Statements

MERGE can be a good fit when:

You are synchronizing a clean source set with a target table.
You need multiple actions in one statement.
You need an output stream of inserted, updated, and deleted rows.
The source has been deduplicated and validated.
Concurrency requirements are understood and tested.

Separate statements can be a better fit when:

The operation is a simple single-row API upsert.
The code must be easy to review and debug.
You need very explicit locking and error handling.
You have high write concurrency.
Different branches need different business rules.

Many senior SQL developers are cautious with MERGE in OLTP code. That does not mean MERGE is never useful. It means the pattern must be justified and tested.

Common Mistakes

Common mistakes include:

Using MERGE without a unique constraint on the business key.
Allowing duplicate source rows.
Putting non-key filters in the ON clause.
Using WHEN NOT MATCHED BY SOURCE with a partial source feed.
Assuming MERGE automatically solves concurrency.
Ignoring duplicate-key and deadlock retry behavior.
Forgetting to capture changed rows when downstream logic needs them.
Running upsert logic without a transaction.
Using broad locks because the target key is not indexed.
Updating columns unnecessarily, causing extra writes and row-version churn.

Best Practices

Best practices:

Define and enforce the business key with a unique constraint or unique index.
Validate or deduplicate the source before MERGE.
Keep the ON clause focused on matching keys.
Use WHEN MATCHED AND ... for action-specific filters.
Be very cautious with WHEN NOT MATCHED BY SOURCE.
Use transactions and concurrency-appropriate locks.
Add retry handling for deadlocks and expected duplicate-key races.
Test with concurrent sessions, not just one user.
Prefer separate statements when they are clearer and safer.
Use OUTPUT when the caller or audit process needs changed keys.

MERGE and upsert patterns, including concurrency cautions

Overview

Core Concepts

What MERGE Does

What Upsert Means

MERGE For A Simple Upsert

Separate UPDATE Then INSERT Pattern

INSERT Then Handle Duplicate Key Pattern

Concurrency Risks In Upserts

HOLDLOCK, UPDLOCK, And Serializable Range Protection

Source Duplicates And MERGE Failures

The ON Clause Must Define Matching Only

WHEN NOT MATCHED BY SOURCE Caution

OUTPUT With MERGE

MERGE Vs Separate Statements

Common Mistakes

Best Practices

Interview Practice

Beginner Interview Practice

Intermediate Interview Practice

Advanced Interview Practice