Building a secure file upload pipeline with virus scanning and presigned URLs

A practical, production focused guide to designing a secure file upload pipeline using object storage, presigned URLs, and server side virus scanning.

File uploads look harmless at first glance. A user drags a file into your app, a progress bar fills, and somewhere a product manager celebrates. In reality, a file upload endpoint is one of the highest risk surfaces in your entire system, and a sloppy implementation can quietly turn your nice SaaS into free malware hosting. Treating uploads as a UX problem instead of a security boundary is how teams get surprised in postmortems.

In this guide, we will design a secure file upload pipeline built on object storage, presigned URLs, and server side virus scanning. We will look at it the way a practical senior engineer would explain it to a mid level teammate who already knows the basics of S3 and HTTP, but wants to ship something that would survive a security review. If you like thinking in systems, you will see echoes of patterns from Error Handling Patterns for Distributed Systems and Clean Architecture for Fullstack throughout.

TLDR

  • Move upload bandwidth away from your API: Use presigned URLs so clients talk directly to object storage.
  • Never trust raw uploads: Store new files in a raw area, scan them, then promote to safe or quarantine.
  • Scan and validate asynchronously: Use a worker or serverless function to run virus scans and policy checks.
  • Expose only safe files: Frontend and API should only serve objects that passed scanning and authorization checks.
  • Instrument the pipeline: Track statuses, errors, and infected files so incidents are observable and debuggable.

If you want to see how this ties into the rest of your platform, pair this piece with Using Edge Functions and Serverless Compute Effectively in 2025 and Next.js SEO Best Practices.

Why uploads are uniquely dangerous

Every file upload is untrusted input that can be arbitrarily large, binary, and opaque. Unlike JSON payloads, you cannot simply log and eyeball them. Attackers can:

  • Upload malware that other users download.
  • Attempt to exploit parsers in your image processing or PDF libraries.
  • Abuse storage to inflate your costs.
  • Use your service as a distribution point that earns you a place on security blacklists.

The main mindset shift is simple but important: an uploaded file is guilty until proven safe. Your pipeline exists to move files from untrusted to trusted states in a controlled way. That state transition needs to be explicit in your architecture and data model, not hidden in ad hoc flags.

High level architecture for a secure upload pipeline

Let us start with a zoomed out view. We will assume S3 compatible storage, but the same pattern applies to any cloud provider.

Client                API / Backend            Object Storage         Scanner / Worker
  |                        |                         |                        |
  | 1) Request upload ---> |                         |                        |
  |    permissions         |                         |                        |
  |                        | 2) AuthN/AuthZ         |                        |
  |                        |    generate presigned  |                        |
  |                        |    URL + object key    |                        |
  |                        |----------------------->|                        |
  | 3) Receive URL + key   |                         |                        |
  |                        |                         |                        |
  | 4) PUT file ------------------------------------>|                        |
  |                        |                         | 5) Object created      |
  |                        |                         |    event to queue      |
  |                        |<------------------------------------------------>|
  |                        |                         | 6) Worker downloads,   |
  |                        |                         |    scans, updates DB   |
  |                        |                         |    and moves object    |
  |                        |                         |    raw -> safe/quar    |
  | 7) Client polls or     |                         |                        |
  |    subscribes to       |                         |                        |
  |    upload status       |                         |                        |

Key ideas:

  • Your application server never handles the file bytes directly, so large or slow uploads cannot exhaust its memory, bandwidth, or request workers.
  • Storage is logically partitioned into raw, safe, and quarantine zones.
  • A scanner worker is responsible for promotion from raw to safe, or diverting to quarantine.
  • Your API and UI only generate links to safe objects.

If you already think in terms of ingestion pipelines for RAG or analytics workloads, this will feel familiar. The same pattern shows up in Retrieval Augmented Generation (RAG) Guide where raw documents move through cleaning and indexing before being trusted in production search.

Designing your storage layout and object lifecycle

Before writing any code, you should define how objects move through your system. A simple and effective layout in S3 might look like this:

my-app-uploads
  uploads/
    raw/
      user-123/
        f6fdc0b2-raw-avatar.jpg
    safe/
      user-123/
        f6fdc0b2-avatar.jpg
    quarantine/
      user-123/
        f6fdc0b2-infected.jpg
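
A minimal sketch of how keys in this layout could be built and remapped, assuming the three prefixes shown above. The helper names buildRawKey and toZoneKey are hypothetical and only illustrate the idea that every safe or quarantine key is derived from a raw key, never supplied by a client.

```typescript
// Hypothetical helpers for the raw / safe / quarantine layout shown above.
type StorageZone = "raw" | "safe" | "quarantine";

const UPLOADS_ROOT = "uploads";

// Build the only kind of key a client may upload to: one inside the raw zone.
const buildRawKey = (userId: string, uploadId: string): string => {
  // Reject path separators and ".." so a crafted id cannot escape its folder.
  const unsafe = /[\/\\]|\.\./;
  if (unsafe.test(userId) || unsafe.test(uploadId)) {
    throw new Error("Identifiers must not contain path separators");
  }
  return `${UPLOADS_ROOT}/raw/${userId}/${uploadId}`;
};

// Map a raw key to the same relative path in the safe or quarantine zone.
const toZoneKey = (
  rawKey: string,
  zone: Exclude<StorageZone, "raw">,
): string => {
  const rawPrefix = `${UPLOADS_ROOT}/raw/`;
  if (!rawKey.startsWith(rawPrefix)) {
    throw new Error(`Not a raw key: ${rawKey}`);
  }
  return `${UPLOADS_ROOT}/${zone}/${rawKey.slice(rawPrefix.length)}`;
};
```

Deriving keys in one place keeps the prefix convention out of feature code, which makes it harder to link to a raw object by accident.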

The scanner is the only component that moves objects between these prefixes. Typical lifecycle:

  1. Client uploads to uploads/raw/... using a presigned URL.
  2. Object storage emits a creation event into a queue or event bus.
  3. A worker picks the job, downloads the object, and runs:
    • Virus scanning.
    • File type sniffing and extension validation.
    • Policy checks such as max resolution, allowed content type, or tenant quotas.
  4. If everything is clean, it moves or copies the object into uploads/safe/... and updates a database row.
  5. If scanning fails or detects malware, it moves the object into uploads/quarantine/... and records details for audit.
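
The file type sniffing in step 3 can be sketched as a check of the leading bytes against well known signatures. This is a minimal illustration for the three image types used later in this guide, not a complete detector; sniffImageType and matchesDeclaredType are hypothetical names.

```typescript
// Sketch: detect a few image formats from their leading "magic" bytes.
type SniffedType = "image/jpeg" | "image/png" | "image/webp" | "unknown";

const startsWithBytes = (
  data: Uint8Array,
  signature: number[],
  offset = 0,
): boolean =>
  data.length >= offset + signature.length &&
  signature.every((byte, i) => data[offset + i] === byte);

const sniffImageType = (data: Uint8Array): SniffedType => {
  // JPEG files start with FF D8 FF.
  if (startsWithBytes(data, [0xff, 0xd8, 0xff])) return "image/jpeg";
  // PNG files start with an 8 byte signature.
  if (startsWithBytes(data, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) {
    return "image/png";
  }
  // WebP is a RIFF container: "RIFF", a 4 byte size, then "WEBP".
  if (
    startsWithBytes(data, [0x52, 0x49, 0x46, 0x46]) &&
    startsWithBytes(data, [0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return "image/webp";
  }
  return "unknown";
};

// True only when the bytes are recognized and agree with the declared type.
const matchesDeclaredType = (data: Uint8Array, declared: string): boolean => {
  const sniffed = sniffImageType(data);
  return sniffed !== "unknown" && sniffed === declared;
};
```

A mismatch between the declared content type and the sniffed one is a reasonable trigger for quarantine, since it usually means the client lied or the file is corrupt.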

You can add lifecycle rules to automatically delete quarantined objects after a retention period, to prune stale raw objects that were never processed, and to abort multipart uploads that were never completed.
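
As a rough illustration, the rules could look like the following object, which follows the shape of an S3 lifecycle configuration. The rule IDs and retention periods are assumptions chosen for the example, and the local LifecycleRule type is declared here only to keep the snippet self contained.

```typescript
// Sketch of lifecycle rules for the quarantine and raw prefixes.
// Retention periods below are illustrative assumptions, not recommendations.
type LifecycleRule = {
  ID: string;
  Status: "Enabled" | "Disabled";
  Filter: { Prefix: string };
  Expiration?: { Days: number };
  AbortIncompleteMultipartUpload?: { DaysAfterInitiation: number };
};

const uploadLifecycleRules: LifecycleRule[] = [
  {
    // Keep quarantined objects long enough for forensics, then delete them.
    ID: "expire-quarantine",
    Status: "Enabled",
    Filter: { Prefix: "uploads/quarantine/" },
    Expiration: { Days: 30 },
  },
  {
    // Anything still in raw after this long was never scanned and promoted.
    ID: "expire-stale-raw",
    Status: "Enabled",
    Filter: { Prefix: "uploads/raw/" },
    Expiration: { Days: 2 },
    // Also clean up multipart uploads that were started but never completed.
    AbortIncompleteMultipartUpload: { DaysAfterInitiation: 2 },
  },
];
```

With the AWS SDK, an array like this would typically be passed as the Rules of a bucket lifecycle configuration; other S3 compatible providers may support only a subset of these fields.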

Tracking uploads in your database

To keep the system observable, treat an upload as a first class record in your database, not just a side effect in object storage. A minimal schema might look like this:

CREATE TABLE uploads (
  id            uuid PRIMARY KEY,
  user_id       uuid NOT NULL,
  object_key    text NOT NULL,
  status        text NOT NULL, -- 'pending' | 'safe' | 'infected' | 'scan-error'
  content_type  text,
  size_bytes    bigint,
  created_at    timestamptz NOT NULL DEFAULT now(),
  scanned_at    timestamptz,
  infection_reason text
);

When your API issues a presigned upload URL, it should also create an uploads row with status = 'pending'. The scanner will later update that row to safe, infected, or scan-error. Any feature that wants to display or download the file will check this table instead of guessing from raw object keys.
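
One way to keep those status changes explicit is a small transition table that every writer consults. This is a sketch under the assumption that safe, infected, and scan-error are terminal states; canTransition and assertTransition are hypothetical helpers.

```typescript
// Sketch: allowed status transitions for an uploads row.
type UploadStatus = "pending" | "safe" | "infected" | "scan-error";

// Assumption: only pending rows may change; the other states are terminal.
const ALLOWED_TRANSITIONS: Record<UploadStatus, readonly UploadStatus[]> = {
  pending: ["safe", "infected", "scan-error"],
  safe: [],
  infected: [],
  "scan-error": [],
};

const canTransition = (from: UploadStatus, to: UploadStatus): boolean =>
  ALLOWED_TRANSITIONS[from].includes(to);

const assertTransition = (from: UploadStatus, to: UploadStatus): void => {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal upload status transition: ${from} -> ${to}`);
  }
};

// The same rule can be enforced in SQL so concurrent workers cannot race:
//   UPDATE uploads
//      SET status = $1, scanned_at = now()
//    WHERE id = $2 AND status = 'pending';
```

Guarding the update with the current status also makes duplicate scan events harmless, because the second update simply matches zero rows.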

This is essentially the same durability pattern you use for long running jobs, webhooks, or RAG ingestion tasks, and it fits nicely with the reliability themes from AI Summarized Dashboards and AI Maintain Code Quality and Reduce Bugs.

Generating presigned URLs securely

Presigned URLs give clients temporary permission to PUT a specific object into your bucket without giving them credentials. Generating the URL is the easy part; the tricky part is enforcing a clear policy around what can be uploaded.

Your presign endpoint should:

  • Authenticate the caller.
  • Authorize the upload type for that user or tenant.
  • Decide on:
    • Target prefix (uploads/raw/profile-images/... for example).
    • Allowed MIME type or set of types.
    • Maximum size in bytes.
  • Create a database record for the upload with status pending.
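
The decisions in this list can be captured as a per upload type policy table that the presign endpoint consults before signing anything. This is a sketch: the document upload kind, its limits, and the checkUploadRequest helper are assumptions added for illustration.

```typescript
// Sketch: one policy per upload kind, checked before a URL is signed.
type UploadKind = "profile-image" | "document";

type UploadPolicy = {
  prefix: string;
  allowedTypes: readonly string[];
  maxBytes: number;
};

const UPLOAD_POLICIES: Record<UploadKind, UploadPolicy> = {
  "profile-image": {
    prefix: "uploads/raw/profile-images",
    allowedTypes: ["image/jpeg", "image/png", "image/webp"],
    maxBytes: 5 * 1024 * 1024,
  },
  // Hypothetical second kind, to show how the table grows.
  document: {
    prefix: "uploads/raw/documents",
    allowedTypes: ["application/pdf"],
    maxBytes: 25 * 1024 * 1024,
  },
};

type UploadRequest = {
  kind: UploadKind;
  contentType: string;
  declaredBytes: number; // size claimed by the client, re-checked after upload
};

type PolicyDecision =
  | { ok: true; policy: UploadPolicy }
  | { ok: false; reason: string };

const checkUploadRequest = (request: UploadRequest): PolicyDecision => {
  const policy = UPLOAD_POLICIES[request.kind];
  if (!policy.allowedTypes.includes(request.contentType)) {
    return { ok: false, reason: `Content type ${request.contentType} is not allowed` };
  }
  if (!Number.isInteger(request.declaredBytes) || request.declaredBytes <= 0) {
    return { ok: false, reason: "Declared size must be a positive integer" };
  }
  if (request.declaredBytes > policy.maxBytes) {
    return { ok: false, reason: `Declared size exceeds ${policy.maxBytes} bytes` };
  }
  return { ok: true, policy };
};
```

The declared size is only a claim from the client, so the scanner should compare it with the size of the stored object before promoting the file.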

Here is a simplified TypeScript example using AWS SDK v3:

import { randomUUID } from "node:crypto";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const BUCKET_NAME = "my-app-uploads";
const REGION = "us-east-1";
const PROFILE_IMAGE_MAX_BYTES = 5 * 1024 * 1024; // 5 MB

const s3 = new S3Client({ region: REGION });

type CreateProfileImageUploadUrlParams = {
  userId: string;
  contentType: string;
};

export const createProfileImageUploadUrl = async ({
  userId,
  contentType,
}: CreateProfileImageUploadUrlParams) => {
  const allowedTypes = ["image/jpeg", "image/png", "image/webp"];

  if (!allowedTypes.includes(contentType)) {
    throw new Error("Unsupported content type for profile images");
  }

  const uploadId = randomUUID();
  const objectKey = `uploads/raw/profile-images/${userId}/${uploadId}`;

  // Insert into your uploads table with status = 'pending'.
  // `db` is assumed to be a Kysely-style query builder configured elsewhere.
  await db
    .insertInto("uploads")
    .values({
      id: uploadId,
      user_id: userId,
      object_key: objectKey,
      status: "pending",
      content_type: contentType,
    })
    .execute();

  const command = new PutObjectCommand({
    Bucket: BUCKET_NAME,
    Key: objectKey,
    ContentType: contentType,
    Metadata: {
      "x-upload-type": "profile-image",
      "x-user-id": userId,
    },
  });

  const uploadUrl = await getSignedUrl(s3, command, {
    expiresIn: 60 * 5, // 5 minutes
  });

  return {
    uploadId,
    uploadUrl,
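    // Advisory only: this presigned PUT does not cap the upload size,
    // so the scanner must re-check the size of the stored object.
    maxBytes: PROFILE_IMAGE_MAX_BYTES,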
  };
};

Expose this from a Next.js route handler such as /api/uploads/profile-image. The client calls it, gets back an uploadUrl and uploadId, then uses fetch or XMLHttpRequest to PUT the file bytes directly to storage.

Client side upload with progress

On the frontend, uploading to a presigned URL is straightforward. The main concerns are UX and error handling, not security. You still enforce basic checks like size limits and rough MIME type before attempting the upload, so you can fail fast and give clear messages.

type UploadToPresignedUrlParams = {
  uploadUrl: string;
  file: File;
};

export const uploadToPresignedUrl = async ({
  uploadUrl,
  file,
}: UploadToPresignedUrlParams) => {
  const response = await fetch(uploadUrl, {
    method: "PUT",
    headers: {
      // Must match the content type the URL was signed with, or storage rejects the request
      "Content-Type": file.type || "application/octet-stream",
    },
    body: file,
  });

  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
};
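
The fetch call above does not report upload progress. If you need a progress bar, one option is XMLHttpRequest, which exposes upload progress events in browsers. The following is a browser only sketch; uploadWithProgress and toPercent are hypothetical names.

```typescript
// Sketch: presigned PUT with progress reporting (browser only).
type UploadWithProgressParams = {
  uploadUrl: string;
  file: Blob;
  contentType: string; // should match the type the URL was signed with
  onProgress: (percent: number) => void;
};

// Convert loaded/total byte counts into a whole percentage clamped to 0..100.
const toPercent = (loaded: number, total: number): number => {
  if (total <= 0) return 0;
  return Math.min(100, Math.max(0, Math.round((loaded / total) * 100)));
};

const uploadWithProgress = ({
  uploadUrl,
  file,
  contentType,
  onProgress,
}: UploadWithProgressParams): Promise<void> =>
  new Promise<void>((resolve, reject) => {
    const xhr = new XMLHttpRequest();
    xhr.open("PUT", uploadUrl);
    xhr.setRequestHeader("Content-Type", contentType);

    // Progress events for the request body fire on xhr.upload.
    xhr.upload.onprogress = (event) => {
      if (event.lengthComputable) {
        onProgress(toPercent(event.loaded, event.total));
      }
    };

    xhr.onload = () => {
      if (xhr.status >= 200 && xhr.status < 300) {
        resolve();
      } else {
        reject(new Error(`Upload failed with status ${xhr.status}`));
      }
    };
    xhr.onerror = () => reject(new Error("Network error during upload"));

    xhr.send(file);
  });
```

A finished upload only means the bytes reached storage. The UI should still show the file as pending until the scanner marks it safe.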

In a Next.js app you might wrap this into a custom hook that tracks upload state and later polls an /api/uploads/status?id=... endpoint until the scanner marks the upload as safe. That polling pattern will feel familiar if you have built long running operations such as Deploying Next.js on a VPS where background tasks do the heavy lifting after an initial request.
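
The polling part of such a hook can be sketched independently of React. The version below assumes a caller supplied fetchStatus function that wraps the status endpoint, and uses capped exponential backoff so a slow scan does not turn into a request storm; the delays and attempt limit are illustrative.

```typescript
// Sketch: poll an upload's status with capped exponential backoff.
type UploadStatus = "pending" | "safe" | "infected" | "scan-error";

// Delay before the next poll: base * 2^attempt, capped.
const pollDelayMs = (attempt: number, baseMs = 1000, capMs = 15000): number =>
  Math.min(capMs, baseMs * 2 ** attempt);

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });

// Resolves with the first non-pending status, or "pending" if attempts run out.
const pollUploadStatus = async (
  fetchStatus: () => Promise<UploadStatus>,
  maxAttempts = 10,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<UploadStatus> => {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const current = await fetchStatus();
    if (current !== "pending") {
      return current;
    }
    await sleep(pollDelayMs(attempt));
  }
  return "pending";
};
```

Returning "pending" after the last attempt lets the UI show a "still scanning" message instead of treating a slow scan as a failure.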

Building the virus scanning worker

The scanning component is where the real security work happens. You have a few implementation options:

  • A containerized worker service running ClamAV or a commercial scanner, consuming events from a queue.
  • A serverless function triggered by object storage events, which calls out to a scanning API.
  • A hybrid approach where the function streams the object into a dedicated scanning microservice.
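
Whichever option you pick, the worker first has to turn a storage event into scan jobs. The sketch below assumes the S3 event notification shape, where each record carries the bucket name and a URL encoded object key; toScanJobs is a hypothetical helper, and the prefix filter is there so that copies into safe or quarantine do not trigger another scan.

```typescript
// Sketch: convert an S3-style "object created" notification into scan jobs.
type ScanJob = {
  bucket: string;
  key: string;
};

type StorageEventRecord = {
  eventName: string; // for example "ObjectCreated:Put"
  s3: {
    bucket: { name: string };
    object: { key: string; size?: number };
  };
};

type StorageEvent = { Records: StorageEventRecord[] };

const RAW_PREFIX = "uploads/raw/";

// Keys in S3 notifications are URL encoded, with spaces sent as "+".
const decodeObjectKey = (encoded: string): string =>
  decodeURIComponent(encoded.replace(/\+/g, " "));

const toScanJobs = (event: StorageEvent): ScanJob[] =>
  event.Records
    // Ignore deletions and other event types.
    .filter((record) => record.eventName.startsWith("ObjectCreated:"))
    .map((record) => ({
      bucket: record.s3.bucket.name,
      key: decodeObjectKey(record.s3.object.key),
    }))
    // Only raw uploads are scanned; promoted copies must not loop back in.
    .filter((job) => job.key.startsWith(RAW_PREFIX));
```

Filtering in code is a second line of defense. Restricting the event subscription itself to the raw prefix avoids paying for the extra invocations at all.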

For many teams, a small container with ClamAV is a sensible default. Here is a conceptual TypeScript worker that consumes messages describing new uploads:

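import { GetObjectCommand } from "@aws-sdk/client-s3";

// Assumed to exist elsewhere: `s3` (the S3Client from the presign example),
// `db`, `clamd` (a wrapper around your scanner), and the move/update helpers.

type ScanJob = {
  bucket: string;
  key: string;
};

type ScanOutcome = "safe" | "infected" | "error";

const scanBufferWithClamAV = async (buffer: Buffer): Promise<ScanOutcome> => {
  try {
    const result = await clamd.scanBuffer(buffer);
    if (result.isInfected) {
      return "infected";
    }
    return "safe";
  } catch (error) {
    // A failed scan is reported as "error", never as "safe"
    console.error("Virus scan failed", error);
    return "error";
  }
};

export const handleScanJob = async (job: ScanJob) => {
  // The v3 S3Client dispatches commands through send()
  const object = await s3.send(
    new GetObjectCommand({ Bucket: job.bucket, Key: job.key }),
  );
  const bytes = await object.Body?.transformToByteArray();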

  if (!bytes) {
    throw new Error("Could not read object body for scanning");
  }

  const buffer = Buffer.from(bytes);
  const outcome = await scanBufferWithClamAV(buffer);

  const upload = await db
    .selectFrom("uploads")
    .selectAll()
    .where("object_key", "=", job.key)
    .executeTakeFirst();

  if (!upload) {
    console.warn("Scan completed for unknown object key", job.key);
    return;
  }

  if (outcome === "safe") {
    await moveToSafePrefix(job.bucket, job.key);
    await updateUploadStatus(upload.id, "safe", null);
  } else if (outcome === "infected") {
    await moveToQuarantinePrefix(job.bucket, job.key);
    await updateUploadStatus(upload.id, "infected", "Detected by ClamAV");
  } else {
    // Treat scanner errors as conservative blocks
    await moveToQuarantinePrefix(job.bucket, job.key);
    await updateUploadStatus(upload.id, "scan-error", "Scan failed");
  }
};

A few design decisions are worth calling out:

  • Scanner failures are not silent. If the scanner cannot process a file, the upload is blocked and shunted into quarantine by default.
  • Object movement is encapsulated in helpers like moveToSafePrefix, which handle S3 copy plus delete, retry, and metrics.
  • Database is the source of truth for client facing status, not object tags alone.

This is where solid error handling patterns from Error Handling Patterns for Distributed Systems really pay off: retries, idempotency, and dead letter queues turn a fragile scanner into a robust subsystem.
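
As a sketch of what idempotency and dead lettering can mean for this worker, the decision of whether to scan at all can be made from the current uploads row and the number of times the message has been delivered. The decideJob helper and the limit of five deliveries are assumptions; many queues expose a delivery or receive count that can feed the second argument.

```typescript
// Sketch: decide what to do with a scan job before touching the file.
type UploadStatus = "pending" | "safe" | "infected" | "scan-error";

type UploadRow = {
  id: string;
  status: UploadStatus;
};

type JobDecision =
  | "scan"
  | "skip-unknown-key"
  | "skip-already-processed"
  | "dead-letter";

const decideJob = (
  row: UploadRow | undefined,
  deliveryCount: number,
  maxDeliveries = 5,
): JobDecision => {
  // No matching row: the key was not issued by the presign endpoint.
  if (!row) return "skip-unknown-key";
  // Duplicate or late event: a previous run already recorded a result.
  if (row.status !== "pending") return "skip-already-processed";
  // Repeated failures: stop retrying and park the message for a human.
  if (deliveryCount > maxDeliveries) return "dead-letter";
  return "scan";
};
```

Because the check is based on stored state rather than in-memory flags, replaying the same event after a crash is safe.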

Serving only safe files

Now that you are tracking upload state and scanning results, your download paths can become strict about what they expose. A simple pattern is:

  1. Client requests a download link for an upload ID.
  2. API checks:
    • The upload exists.
    • The requester is authorized to access it.
    • The status is safe.
  3. API generates a short lived presigned GET URL for the underlying safe object key.
  4. Client either redirects to that URL or uses it directly.

Here is a sketch of such a function in TypeScript:

import { GetObjectCommand } from "@aws-sdk/client-s3";

type GetDownloadUrlParams = {
  uploadId: string;
  requesterUserId: string;
};

export const getSafeDownloadUrl = async ({
  uploadId,
  requesterUserId,
}: GetDownloadUrlParams) => {
  const upload = await db
    .selectFrom("uploads")
    .selectAll()
    .where("id", "=", uploadId)
    .executeTakeFirst();

  if (!upload) {
    throw new Error("Upload not found");
  }

  if (upload.user_id !== requesterUserId) {
    throw new Error("Not authorized to access this file");
  }

  if (upload.status !== "safe") {
    throw new Error("File is not available for download");
  }

  const command = new GetObjectCommand({
    Bucket: BUCKET_NAME,
    // object_key holds the original raw key; the scanner moved the bytes to the matching safe key
    Key: upload.object_key.replace("/raw/", "/safe/"),
  });

  const url = await getSignedUrl(s3, command, { expiresIn: 60 * 10 });
  return url;
};

By centralizing access through this function, you ensure that no UI component ever constructs bucket URLs by hand or accidentally links to raw prefixes.

Handling large files and timeouts

So far we have mostly thought about small to medium size files. Once users start pushing multi gigabyte uploads, additional concerns show up:

  • Scanner performance and memory: Loading large objects into RAM just to scan them is not ideal. For very large media types you may:
    • Stream the object through the scanner instead of buffering.
    • Treat certain types as lower risk and scan metadata only, with clear documentation.
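  • Multipart uploads: S3 style multipart uploads only produce a usable object once the client completes the upload after all parts arrive. Your scanner should react to the completion event (on S3, s3:ObjectCreated:CompleteMultipartUpload), not to individual part uploads.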
  • Timeouts and SLAs: Some scans will take longer than your usual UX tolerance. Your UI should reflect that scanning is a background process, not a blocking step on the main thread of the user journey.
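
To make the streaming option concrete: ClamAV's clamd daemon accepts data over a socket through its INSTREAM command, where each chunk is prefixed with its length as a 4 byte big-endian integer and a zero length chunk ends the stream. The sketch below shows only the framing; opening the socket, sending the command, and reading the reply are left out, and frameChunk and frameAll are hypothetical names.

```typescript
// Sketch: frame chunks for clamd's INSTREAM protocol.
// Sent once before the first chunk (null-terminated command form).
const INSTREAM_COMMAND = "zINSTREAM\0";

// Prefix a chunk with its length as a 4 byte big-endian unsigned integer.
const frameChunk = (chunk: Uint8Array): Uint8Array => {
  const framed = new Uint8Array(4 + chunk.length);
  new DataView(framed.buffer).setUint32(0, chunk.length, false);
  framed.set(chunk, 4);
  return framed;
};

// A zero length chunk tells clamd that the stream is complete.
const END_OF_STREAM = new Uint8Array(4);

// Frame every non-empty chunk and append the terminator.
// A real worker would write each frame to the socket as it arrives
// from storage instead of collecting them in an array.
const frameAll = (chunks: Uint8Array[]): Uint8Array[] => {
  const frames = chunks
    .filter((chunk) => chunk.length > 0)
    .map((chunk) => frameChunk(chunk));
  frames.push(END_OF_STREAM);
  return frames;
};
```

With this framing the worker holds only one chunk in memory at a time. Note that clamd enforces its own maximum stream length, so very large files may still need a different policy.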

From a product perspective, communicate clearly that large uploads might take a bit before they are available. Think of it like a cooking show where something goes into the oven offscreen; you want to reassure the viewer that the food will come back later, fully cooked and safe to eat.

Security hardening checklist

Once your basic pipeline is working in a staging environment, walk through this checklist with your team:

  • Transport and credentials
    • All presign endpoints require authentication and authorization.
    • Presigned URLs have short lifetimes and minimal permissions.
    • Buckets are private and not listable anonymously.
  • Validation
    • Client side checks for size and rough content type.
    • Server side enforcement of allowed MIME types and size ceilings.
    • Scanner validates true content type, not just the extension.
  • Scanning
    • Scanner runs in an isolated environment, not on your main API nodes.
    • Failures are logged and conservatively treated as blocked uploads.
    • Quarantine storage has lifecycle rules and restricted access.
  • Access control
    • Download endpoints verify both upload status and user or tenant ownership.
    • Links are always signed and short lived.
    • Raw prefixes are never exposed to end users.
  • Observability
    • Dashboards track counts of pending, safe, infected, and scan error uploads.
    • Alerts fire on spikes in infected or failed scans.
    • Logs capture scanner signatures and decisions for later forensics.
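
The observability items can start very small. As a sketch, the counts behind a dashboard and a simple alert condition can be computed from upload statuses alone; the thresholds here are placeholder assumptions to tune against your own traffic.

```typescript
// Sketch: status counts for a dashboard and a basic alert rule.
type UploadStatus = "pending" | "safe" | "infected" | "scan-error";

type StatusCounts = Record<UploadStatus, number>;

const countByStatus = (statuses: readonly UploadStatus[]): StatusCounts => {
  const counts: StatusCounts = {
    pending: 0,
    safe: 0,
    infected: 0,
    "scan-error": 0,
  };
  for (const current of statuses) {
    counts[current] += 1;
  }
  return counts;
};

type AlertThresholds = {
  maxInfectedRate: number;
  maxScanErrorRate: number;
};

// Placeholder thresholds: 5% infected or 10% failed scans in the window.
const DEFAULT_THRESHOLDS: AlertThresholds = {
  maxInfectedRate: 0.05,
  maxScanErrorRate: 0.1,
};

// Rates are computed over finished scans only, so a backlog of pending
// uploads does not hide a spike.
const shouldAlert = (
  counts: StatusCounts,
  thresholds: AlertThresholds = DEFAULT_THRESHOLDS,
): boolean => {
  const scanned = counts.safe + counts.infected + counts["scan-error"];
  if (scanned === 0) return false;
  return (
    counts.infected / scanned > thresholds.maxInfectedRate ||
    counts["scan-error"] / scanned > thresholds.maxScanErrorRate
  );
};
```

In practice the statuses would come from a query over a recent time window of the uploads table, and a growing pending count deserves its own alert because it usually means the scanner has stalled.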

This list pairs well with periodic security reviews and more general platform hygiene work such as API Versioning and Backward Compatibility or Ethics of AI Generated Code in Production.

Pulling it together in a Next.js app

If you are building on Next.js with the App Router, the pieces fit into patterns you already know:

  • An app/api/uploads/ route that issues presigned PUT URLs and creates uploads rows.
  • A client component or hook that calls this route, performs the upload, and then polls or subscribes to status updates.
  • A background worker or serverless function that reacts to object storage events, performs scanning, and updates the database.
  • A download route that validates status and permissions before redirecting to a presigned GET URL.

You can run the worker as a small Node service alongside your other jobs, or containerize it and schedule it using the same orchestration platform you use for cron workloads. If you are already comfortable with Edge Functions and Serverless Compute, you might even experiment with storage triggered functions where latency and scale are handled for you.

Actionable next steps

To make this concrete, here is a pragmatic path you can follow over the next couple of sprints:

  1. Move uploads to object storage with presigned URLs
    Replace any API endpoints that proxy file bytes with a presign plus direct PUT flow.
  2. Introduce an uploads table with statuses
    Treat uploads as first class records and show their state explicitly in your UI.
  3. Add a basic virus scanning worker
    Start with ClamAV or a managed service and wire it into storage events.
  4. Lock down serving paths
    Only expose files with status = 'safe', and ensure links are signed and short lived.
  5. Iterate on policies and observability
    Tighten allowed types, improve error messages, and add dashboards and alerts.

Follow these steps and your file upload pipeline will move from “it works in dev” to “it is a well understood, observable, and secure subsystem.” Your users still see a simple upload button. Under the hood, you have built a small conveyor belt that only lets safe, validated files roll onto the production floor, while the suspicious ones take a quiet detour to quarantine where they cannot hurt anyone.