Overview
Azure Blob Storage applications commonly combine three architectural concerns:
- .NET services use Azure SDK clients to authorize and manage blob operations.
- Browsers upload large files directly to Blob Storage using narrowly scoped, short-lived delegated access.
- A database stores authoritative business metadata while Blob Storage stores binary content.
Separating these responsibilities avoids routing every byte through an API, keeps searchable and transactional data in a database, and lets Blob Storage handle durable object transfer. It also creates a distributed workflow: the database and Blob Storage do not share one transaction, so the application must handle incomplete uploads, retries, duplicate completion calls, malware scanning, and orphan cleanup.
For interviews, candidates should explain the SDK client hierarchy, managed identity, dependency injection, SAS security, CORS, metadata modeling, idempotency, consistency, and secure download authorization.
Core Concepts
Azure Blob Client Hierarchy
The primary .NET clients are:
- BlobServiceClient: account-level operations and client creation.
- BlobContainerClient: operations within one container.
- BlobClient: operations on a specific blob.
- BlockBlobClient: explicit block staging and block-list commits.
- AppendBlobClient: append-only blob operations.
- PageBlobClient: page-range operations.
Create narrower clients from broader clients:
BlobContainerClient container =
    serviceClient.GetBlobContainerClient("documents");
BlobClient blob =
    container.GetBlobClient("tenant-42/8f7c/report.pdf");
Use a specialized client only when its specialized operations are needed.
Client Lifetime and Thread Safety
Azure SDK clients are thread-safe and designed for reuse. Register them as singletons or otherwise create and retain a small number of clients.
Do not create a new BlobServiceClient for every request. Reuse provides:
- Shared HTTP connection pools.
- Lower allocation and connection overhead.
- Centralized configuration.
- Consistent diagnostics and retry behavior.
Thread safety covers the clients, not the objects passed to them: do not share one stream or mutable options object across concurrent operations.
Authentication with DefaultAzureCredential
Prefer Microsoft Entra authentication over storage account keys.
using Azure.Identity;
using Azure.Storage.Blobs;

var serviceClient = new BlobServiceClient(
    new Uri(configuration["Storage:BlobServiceUri"]!),
    new DefaultAzureCredential());
DefaultAzureCredential supports local developer credentials and deployed workload identities. In Azure, use a system-assigned or user-assigned managed identity and grant only the required data-plane role.
Common roles include:
- Storage Blob Data Reader.
- Storage Blob Data Contributor.
- Storage Blob Data Owner.
Management-plane roles such as Contributor do not automatically grant blob data access.
Dependency Injection
A simple registration:
builder.Services.AddSingleton(sp =>
{
    var configuration = sp.GetRequiredService<IConfiguration>();
    var serviceUri = new Uri(
        configuration["Storage:BlobServiceUri"]!);
    return new BlobServiceClient(
        serviceUri,
        new DefaultAzureCredential());
});
A domain-facing service should hide storage-specific details from controllers:
public interface IDocumentContentStore
{
    Task<UploadTarget> CreateUploadTargetAsync(
        CreateUploadRequest request,
        CancellationToken cancellationToken);

    Task<Stream> OpenReadAsync(
        string blobName,
        CancellationToken cancellationToken);
}
Do not expose account keys or unrestricted BlobClient instances to unrelated application layers.
SDK Retry Configuration
Azure SDK clients include retry behavior for transient failures. Configure retries at client creation rather than layering arbitrary retries around every operation.
var options = new BlobClientOptions
{
    Retry =
    {
        Mode = RetryMode.Exponential,
        MaxRetries = 5,
        Delay = TimeSpan.FromSeconds(0.8),
        MaxDelay = TimeSpan.FromSeconds(8),
        NetworkTimeout = TimeSpan.FromMinutes(2)
    }
};
var client = new BlobServiceClient(
    serviceUri,
    new DefaultAzureCredential(),
    options);
Retries must respect cancellation and an overall operation deadline. A retry can repeat a request whose outcome is unknown, so application workflows still need idempotency and reconciliation.
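One way to enforce an overall deadline is to link the caller's cancellation token to a timer and pass the linked token into the storage call, so that SDK retries stop when either fires. The sketch below is illustrative only: `DeadlineRunner` and its `RunAsync` method are hypothetical names, and the deadline value is chosen by the caller.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DeadlineRunner
{
    // Hypothetical helper: runs an operation under the caller's token
    // plus an overall deadline that also bounds any retries inside it.
    public static async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan overallDeadline,
        CancellationToken callerToken)
    {
        // The linked source cancels when the caller cancels or the deadline passes.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(callerToken);
        cts.CancelAfter(overallDeadline);

        return await operation(cts.Token).ConfigureAwait(false);
    }
}
```

A call site might pass `ct => blobClient.GetPropertiesAsync(cancellationToken: ct)` as the operation. Cancellation only stops waiting; it does not reveal whether the last attempt reached the service, which is why the workflow still needs idempotency.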
Streaming Upload and Download
Avoid loading an entire file into memory.
await blobClient.UploadAsync(
    contentStream,
    new BlobUploadOptions
    {
        HttpHeaders = new BlobHttpHeaders
        {
            ContentType = validatedContentType
        },
        TransferOptions = new StorageTransferOptions
        {
            MaximumConcurrency = 4,
            MaximumTransferSize = 8 * 1024 * 1024
        }
    },
    cancellationToken);
For downloads, either:
- Stream through the API when inspection, transformation, or strict network control is required.
- Return a short-lived read SAS after the API authorizes the user.
Do not infer authorization from knowledge of a blob name.
Direct Browser Upload Architecture
A secure direct-upload flow is:
- The browser authenticates to the application API.
- The API authorizes the user and validates file intent, size, and allowed type.
- The API creates a pending upload record and a server-generated blob name.
- The API returns a short-lived write-only SAS for that blob.
- The browser uploads directly to Blob Storage.
- The browser calls a completion endpoint with the upload ID.
- The API checks blob properties, expected size, checksum, and ownership.
- A worker scans and processes the file in quarantine.
- The database state changes to Available only after validation succeeds.
This keeps large payloads away from the API while preserving server control over identity, authorization, naming, and publication.
User Delegation SAS
When possible, generate a user delegation SAS using Microsoft Entra credentials instead of signing SAS tokens with an account key.
A direct-upload SAS should normally be:
- Scoped to one blob.
- Limited to create and write permissions.
- Valid for a short period.
- Served only over HTTPS.
- Bound to a server-generated path.
- Issued only after application authorization.
Do not grant list, delete, or container-wide permissions unless the workflow proves they are necessary.
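The constraints above can be sketched with the Azure.Storage.Blobs SDK roughly as follows. This is a minimal sketch, not a definitive implementation: `UploadSasFactory` and `CreateUploadUriAsync` are hypothetical names, the lifetime is supplied by the caller, and the calling identity is assumed to hold a role that permits requesting a user delegation key.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Sas;

public static class UploadSasFactory
{
    // Hypothetical helper: returns a blob URI carrying a short-lived,
    // write-only user delegation SAS scoped to exactly one blob.
    public static async Task<Uri> CreateUploadUriAsync(
        BlobServiceClient serviceClient,
        string containerName,
        string blobName,          // server-generated, never taken from the browser
        TimeSpan lifetime,
        CancellationToken cancellationToken)
    {
        DateTimeOffset now = DateTimeOffset.UtcNow;
        DateTimeOffset expiresOn = now.Add(lifetime);

        // The delegation key is requested with the client's Microsoft Entra
        // credential, so no storage account key is involved.
        UserDelegationKey key = (await serviceClient.GetUserDelegationKeyAsync(
            now, expiresOn, cancellationToken)).Value;

        var sasBuilder = new BlobSasBuilder
        {
            BlobContainerName = containerName,
            BlobName = blobName,
            Resource = "b",                  // "b" = a single blob, not the container
            ExpiresOn = expiresOn,
            Protocol = SasProtocol.Https     // reject plain HTTP
        };
        sasBuilder.SetPermissions(
            BlobSasPermissions.Create | BlobSasPermissions.Write);

        BlobClient blobClient = serviceClient
            .GetBlobContainerClient(containerName)
            .GetBlobClient(blobName);

        var uriBuilder = new BlobUriBuilder(blobClient.Uri)
        {
            Sas = sasBuilder.ToSasQueryParameters(key, serviceClient.AccountName)
        };
        return uriBuilder.ToUri();
    }
}
```

The returned URI is the bearer credential itself, so it should go only to the authorized caller and never into logs.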
Treat a SAS as a bearer credential:
- Do not log it.
- Do not place it in analytics events.
- Avoid storing it in application state longer than needed.
- Redact query strings from diagnostics.
- Use a renewal endpoint rather than issuing a long-lived token.
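Redacting query strings can be as simple as logging only the part of the URI before the query. The helper below is a small sketch with a hypothetical name (`SasRedactor.ForLogging`); it assumes an absolute URI and deliberately drops the whole query rather than trying to pick out individual SAS parameters.

```csharp
using System;

public static class SasRedactor
{
    // Returns the URI without its query string so a SAS signature
    // never reaches logs or diagnostics. Expects an absolute URI.
    public static string ForLogging(Uri uri)
    {
        string withoutQuery = uri.GetLeftPart(UriPartial.Path);
        return string.IsNullOrEmpty(uri.Query)
            ? withoutQuery
            : withoutQuery + "?[redacted]";
    }
}
```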
CORS Is Not Authorization
Blob Storage cross-origin resource sharing configuration controls whether a browser is allowed to make a cross-origin request. It does not decide whether the caller can read or write a blob.
Direct browser uploads require both:
- A CORS rule that permits the trusted web origin, methods, and headers.
- A valid authorization mechanism such as a SAS.
Avoid wildcard origins for credential-bearing production workflows when a known origin can be specified.
Server-Controlled Naming
Use opaque storage names:
quarantine/{tenant-id}/{upload-id}
Store the user's display filename separately. This prevents:
- Path traversal assumptions.
- Cross-tenant overwrite.
- Guessable object identifiers.
- Unsafe characters leaking into headers.
- Document renames forcing a physical blob move.
Blob names are object keys, not operating-system file paths, even though slash-delimited names appear as virtual folders.
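A name in that shape can be built entirely from server-generated identifiers. The sketch below assumes both IDs are GUIDs, which the text does not require; `BlobNaming.QuarantineName` is a hypothetical helper name.

```csharp
using System;

public static class BlobNaming
{
    // Builds an opaque quarantine name from server-generated identifiers only.
    // The "N" format writes each GUID as 32 lowercase hex digits with no hyphens.
    public static string QuarantineName(Guid tenantId, Guid uploadId) =>
        $"quarantine/{tenantId:N}/{uploadId:N}";
}
```

Because no user-supplied text enters the name, the display filename can change freely in the database without touching the blob.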
Quarantine and Publication
An uploaded blob should not become downloadable merely because the upload request succeeded.
Keep new content in quarantine until the application verifies:
- Expected size.
- Allowed extension and detected media type.
- Whole-file checksum where required.
- Malware scan result.
- File-format validity.
- Tenant and upload-session ownership.
Use a state machine:
Pending -> Uploading -> Uploaded -> Scanning -> Available
| |
v v
Rejected Deleted
Only Available content should be returned by normal download endpoints.
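A state machine like this can be enforced with an explicit transition table. The sketch below uses the state names from the diagram, but the exact set of allowed edges (for example, which states may move to Rejected or Deleted) is an assumption made for illustration, and `UploadStateMachine` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public enum UploadStatus
{
    Pending, Uploading, Uploaded, Scanning, Available, Rejected, Deleted
}

public static class UploadStateMachine
{
    // Assumed transition table; adjust the edges to the real workflow.
    private static readonly Dictionary<UploadStatus, UploadStatus[]> Allowed = new()
    {
        [UploadStatus.Pending]   = new[] { UploadStatus.Uploading, UploadStatus.Deleted },
        [UploadStatus.Uploading] = new[] { UploadStatus.Uploaded, UploadStatus.Deleted },
        [UploadStatus.Uploaded]  = new[] { UploadStatus.Scanning, UploadStatus.Rejected },
        [UploadStatus.Scanning]  = new[] { UploadStatus.Available, UploadStatus.Rejected },
        [UploadStatus.Available] = new[] { UploadStatus.Deleted },
        [UploadStatus.Rejected]  = new[] { UploadStatus.Deleted },
        [UploadStatus.Deleted]   = Array.Empty<UploadStatus>()
    };

    public static bool CanTransition(UploadStatus from, UploadStatus to) =>
        Array.IndexOf(Allowed[from], to) >= 0;

    // Returns the new state or throws when the edge is not in the table.
    public static UploadStatus Transition(UploadStatus from, UploadStatus to) =>
        CanTransition(from, to)
            ? to
            : throw new InvalidOperationException(
                $"Illegal transition {from} -> {to}.");
}
```

With the table in one place, a record cannot jump from Pending straight to Available, which is the publication guarantee the quarantine design depends on.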
Store Binary Content and Business Metadata Separately
Blob Storage is optimized for object content. A relational or document database is better for business metadata that requires transactions, joins, rich filtering, ownership rules, and workflow state.
An example document record:
DocumentId
TenantId
OwnerUserId
BlobContainer
BlobName
BlobVersionId
OriginalFileName
ValidatedContentType
ExpectedLength
ActualLength
Checksum
ChecksumAlgorithm
Status
RetentionClass
CreatedAt
UploadedAt
PublishedAt
RowVersion
The API should address a document by DocumentId, authorize the business record, and resolve the internal blob location. Clients should not construct blob paths from business IDs.
Blob Metadata and Index Tags
Blob metadata is suitable for small technical key-value properties that should travel with the object. Blob index tags support server-side filtering and lifecycle rules.
Good uses include:
- Schema or pipeline version.
- Technical classification.
- Processing correlation ID.
- Lifecycle tag.
Avoid treating blob metadata as the authoritative business database because:
- Query capabilities are limited.
- Multi-record transactions are unavailable.
- Rich constraints and relationships are difficult.
- Metadata updates can create concurrency and versioning side effects.
- Sensitive values may be exposed to identities that can read blob properties.
Store only the minimum duplicated metadata needed for storage operations.
No Distributed Transaction
Azure SQL and Blob Storage do not share a normal atomic transaction. Failures can occur between:
- Creating the database record.
- Uploading the blob.
- Marking the record complete.
- Publishing a processing event.
Model this explicitly instead of pretending the operations are atomic.
Pending-First Workflow
Create the database record first with a unique upload ID and Pending state. Then upload to the generated blob name.
Benefits:
- Every authorized upload has a business owner.
- Repeated create requests can be idempotent.
- Cleanup can find expired sessions.
- Completion can validate the expected target.
If the upload never happens, a scheduled cleanup marks the record expired.
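The pending-first idea can be illustrated with an in-memory stand-in for the database. Everything here is an assumption made for the sketch: the client-supplied idempotency key, the `UploadSession` shape, and the `InMemoryUploadSessions` class; a real system would rely on a unique constraint in the transactional store instead.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public sealed record UploadSession(
    Guid UploadId, Guid TenantId, string BlobName, DateTimeOffset ExpiresAt);

public sealed class InMemoryUploadSessions
{
    private readonly ConcurrentDictionary<(Guid TenantId, string Key), UploadSession>
        _sessions = new();

    // A repeated create request with the same key returns the existing
    // session instead of creating a second record and blob name.
    public UploadSession CreatePending(
        Guid tenantId, string idempotencyKey, DateTimeOffset now, TimeSpan lifetime) =>
        _sessions.GetOrAdd((tenantId, idempotencyKey), _ =>
        {
            Guid uploadId = Guid.NewGuid();
            return new UploadSession(
                uploadId,
                tenantId,
                $"quarantine/{tenantId:N}/{uploadId:N}",
                now.Add(lifetime));
        });

    // Sessions past their upload window, for a scheduled cleanup job to expire.
    public IReadOnlyList<UploadSession> FindExpired(DateTimeOffset now) =>
        _sessions.Values.Where(s => s.ExpiresAt <= now).ToList();
}
```

Under a race the factory delegate may run more than once, but only one session is stored and returned to every caller, which mirrors what a unique index provides in a database.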
Completion Endpoint
The completion endpoint should be idempotent:
POST /api/uploads/{uploadId}/complete
On every call it should:
- Authorize the upload owner.
- Read the expected record.
- Fetch blob properties.
- Verify length, checksum, and target name.
- Transition state conditionally.
- Return the already completed result when appropriate.
Do not trust size, content type, or completion status supplied only by the browser.
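The decision logic behind such an endpoint can be kept as a pure function, which makes idempotency easy to test. In this sketch the record and blob shapes, the error codes, and the `UploadCompletion.Evaluate` name are all hypothetical; loading the record, fetching blob properties, and persisting the conditional state change are left to the caller.

```csharp
using System;

public enum UploadState { Pending, Uploaded }

public sealed record UploadRecord(
    Guid OwnerUserId, string BlobName, long ExpectedLength,
    string ExpectedChecksum, UploadState State);

// Facts read from Blob Storage by the server, never supplied by the browser.
public sealed record BlobFacts(string BlobName, long Length, string Checksum);

public sealed record CompletionResult(bool Succeeded, bool AlreadyCompleted, string Error);

public static class UploadCompletion
{
    public static CompletionResult Evaluate(
        UploadRecord upload, BlobFacts blob, Guid callerUserId)
    {
        if (callerUserId != upload.OwnerUserId)
            return new CompletionResult(false, false, "forbidden");
        if (!string.Equals(blob.BlobName, upload.BlobName, StringComparison.Ordinal))
            return new CompletionResult(false, false, "unexpected-target");
        if (blob.Length != upload.ExpectedLength)
            return new CompletionResult(false, false, "length-mismatch");
        // Assumes hex-encoded checksums, so the comparison ignores case.
        if (!string.Equals(blob.Checksum, upload.ExpectedChecksum,
                StringComparison.OrdinalIgnoreCase))
            return new CompletionResult(false, false, "checksum-mismatch");

        // A repeated call after success reports the same outcome instead of failing.
        bool alreadyCompleted = upload.State == UploadState.Uploaded;
        return new CompletionResult(true, alreadyCompleted, "");
    }
}
```

When `AlreadyCompleted` is false, the caller would persist the change with a conditional update (for example, only where the stored state is still Pending) so that two concurrent completion calls cannot both perform the transition.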
Outbox and Event Processing
When downstream processing starts after a database state change, use an outbox pattern:
- Update the upload record.
- Insert an outbox message in the same database transaction.
- Publish the message asynchronously.
- Mark the outbox record dispatched.
Consumers should be idempotent because a message can be delivered more than once.
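Consumer idempotency is often implemented by remembering which message IDs have already been handled. The sketch below keeps that set in memory purely for illustration; `OutboxMessage` and `IdempotentConsumer` are hypothetical types, and a production consumer would record processed IDs durably alongside its side effects.

```csharp
using System;
using System.Collections.Generic;

public sealed record OutboxMessage(Guid MessageId, Guid UploadId, string Type);

public sealed class IdempotentConsumer
{
    private readonly HashSet<Guid> _processed = new();
    private readonly Action<OutboxMessage> _handler;

    public IdempotentConsumer(Action<OutboxMessage> handler) => _handler = handler;

    // Returns true when the handler ran, false for a duplicate delivery.
    public bool Handle(OutboxMessage message)
    {
        if (_processed.Contains(message.MessageId))
            return false;

        _handler(message);

        // Recorded only after the handler succeeds, so a failed attempt
        // can be redelivered and retried.
        _processed.Add(message.MessageId);
        return true;
    }
}
```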
Reconciliation and Orphan Cleanup
Run reconciliation jobs for:
- Pending records whose SAS expired.
- Blobs with no database record.
- Records marked uploaded whose blob is missing.
- Scans that exceeded their expected duration.
- Duplicate or superseded blobs.
Deletion should consider soft delete, legal retention, and audit requirements. Prefer writing orphan candidates to a report or state table before destructive cleanup.
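At its core, reconciliation is a set comparison between the blob names the database expects and the names actually present in storage. The sketch below shows only that comparison and produces a report rather than deleting anything; `Reconciler` and `ReconciliationReport` are hypothetical names, and both inputs are assumed to have been listed already.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record ReconciliationReport(
    IReadOnlyList<string> OrphanBlobs,    // in storage, no database record
    IReadOnlyList<string> MissingBlobs);  // in the database, no blob

public static class Reconciler
{
    public static ReconciliationReport Compare(
        IEnumerable<string> recordedBlobNames,
        IEnumerable<string> storedBlobNames)
    {
        // Blob names are case-sensitive, so compare ordinally.
        var recorded = new HashSet<string>(recordedBlobNames, StringComparer.Ordinal);
        var stored = new HashSet<string>(storedBlobNames, StringComparer.Ordinal);

        List<string> orphans = stored.Except(recorded)
            .OrderBy(n => n, StringComparer.Ordinal).ToList();
        List<string> missing = recorded.Except(stored)
            .OrderBy(n => n, StringComparer.Ordinal).ToList();

        return new ReconciliationReport(orphans, missing);
    }
}
```

Orphan candidates from the report can then be reviewed against retention and soft-delete policy before any destructive step.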
Secure Download Pattern
A download flow should:
- Authenticate the user.
- Load the document record by opaque ID and tenant.
- Enforce object-level authorization.
- Require Available state.
- Resolve the exact blob name and version if necessary.
- Stream content or issue a short-lived read-only SAS.
- Set a safe Content-Disposition filename.
- Log the business access event without logging credentials.
For sensitive content, consider streaming through the API or a controlled gateway rather than issuing a reusable URL.
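Two of the steps above, the tenant-scoped lookup and the safe filename, can be sketched without any storage calls. The record shape, the string-valued status, and the `DownloadResolver` name are assumptions for illustration, and the filename cleaning rules are a conservative example rather than a complete policy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http.Headers;

public sealed record DocumentRecord(
    Guid DocumentId, Guid TenantId, string BlobName,
    string OriginalFileName, string Status);

public static class DownloadResolver
{
    // Succeeds only for a record in the caller's tenant that is published.
    public static bool TryResolve(
        IEnumerable<DocumentRecord> records, Guid tenantId, Guid documentId,
        out DocumentRecord document)
    {
        document = records.FirstOrDefault(r =>
            r.TenantId == tenantId &&
            r.DocumentId == documentId &&
            r.Status == "Available");
        return document != null;
    }

    // Drops control characters, path separators, and quotes from the display name.
    public static string SafeFileName(string originalFileName)
    {
        char[] kept = originalFileName
            .Where(c => !char.IsControl(c) && c != '/' && c != '\\' && c != '"')
            .ToArray();
        string cleaned = new string(kept).Trim();
        return cleaned.Length == 0 ? "download" : cleaned;
    }

    // FileNameStar makes the framework encode non-ASCII names in the header value.
    public static string ContentDisposition(string originalFileName) =>
        new ContentDispositionHeaderValue("attachment")
        {
            FileNameStar = SafeFileName(originalFileName)
        }.ToString();
}
```

Object-level authorization for the specific user would still run between the lookup and the response; the tenant filter alone is not a permission check.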
Optimistic Concurrency
Blob operations return ETags. Use conditional requests when concurrent writers could overwrite one another:
var conditions = new BlobRequestConditions
{
    IfMatch = expectedETag
};
The metadata database should also use optimistic concurrency, such as a SQL rowversion. These tokens protect different resources; the workflow still needs reconciliation across them.
Multi-Tenant Isolation
For a multi-tenant system:
- Include the tenant in every metadata query.
- Generate blob names on the server.
- Scope SAS tokens to one authorized object.
- Avoid container listing from the browser.
- Separate accounts or containers when compliance requires stronger boundaries.
- Use distinct encryption scopes only when a key boundary is required.
- Test cross-tenant access as a security requirement.
Naming conventions improve organization but do not create authorization by themselves.
Common Mistakes
- Creating SDK clients per request.
- Deploying storage account keys in application settings.
- Issuing container-wide or long-lived SAS tokens.
- Treating CORS as a security boundary.
- Letting the browser choose the final blob name.
- Trusting browser-provided content type or completion status.
- Publishing files before scanning.
- Storing searchable business state only in blob metadata.
- Assuming the database and blob write are atomic.
- Deleting orphans without accounting for retention controls.
- Authorizing downloads only by a predictable blob path.
Best Practices
- Reuse thread-safe SDK clients.
- Use managed identity and data-plane RBAC.
- Pass cancellation tokens through all storage operations.
- Keep upload SAS tokens short-lived and object-scoped.
- Use quarantine, validation, and explicit publication states.
- Keep authoritative metadata in a transactional store.
- Make create, complete, scan, and cleanup operations idempotent.
- Use an outbox for reliable downstream events.
- Reconcile database and storage state regularly.
- Monitor request failures, throttling, scan latency, orphan counts, and SAS issuance.