Designing Scalable File Upload Systems: A System Design Guide for Backend Engineers

Aman Sahni
2026 · 12 min read

Here's a question that shows up in system design interviews more often than you'd expect: "Design a file upload system." Sounds simple. Accept a file, store it somewhere, return a URL. You could probably build that in an hour.

But then the interviewer starts adding constraints. What if the file is 2GB? What if 10,000 users upload simultaneously? What about virus scanning? What about serving files across 5 regions? What happens when the upload fails at 90%?

Suddenly your simple upload endpoint is a distributed system with storage, queues, workers, security gates, and CDN layers. And that's exactly why companies like Google and Amazon love this question. It starts easy and scales into every hard problem in backend engineering.

Who this is for — Backend engineers who want to understand how file upload systems work at scale. Whether you're preparing for a system design interview or building one at work, this post covers the architecture decisions that matter.

The naive approach (and why it breaks)

Let's start with what most developers build first. It's the approach that works perfectly on your laptop and falls apart the moment real traffic hits.

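@PostMapping("/upload")
public ResponseEntity<String> upload(@RequestParam MultipartFile file) throws IOException {
    // File goes through YOUR server
    String key = UUID.randomUUID() + "_" + file.getOriginalFilename();

    // Tell S3 the size and type up front (AWS SDK v1 ObjectMetadata)
    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(file.getSize());
    metadata.setContentType(file.getContentType());

    s3Client.putObject(bucket, key, file.getInputStream(), metadata);
    return ResponseEntity.ok("uploaded: " + key);
}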

This works. But here's what's actually happening: the client sends the entire file to your Spring Boot server. Your server holds it in memory (or writes it to a temp file), then re-uploads it to S3. Your server is a middleman doing nothing but shuffling bytes.

Why this breaks at scale: A 500MB file ties up one of your server's threads for the entire upload duration. Tomcat's default pool tops out at 200 worker threads, so a burst of large uploads over slow connections can hold most of them for minutes at a time. Once the pool is saturated, your API stops responding to all other requests, not just uploads.

The production architecture

A well-designed file upload system has one core principle: your backend never touches the file bytes. It manages permissions, metadata, and orchestration. The actual file transfer happens directly between the client and object storage.

Production file upload architecture — the backend never touches file bytes

Notice step 3 in the diagram. The file goes directly from the client to S3. Your Spring Boot backend only handles two things: generating the presigned URL (a signed permission slip that says "this user is allowed to upload this file to this location for the next 5 minutes") and managing the metadata in your database.

Let's walk through each part of this architecture and the decisions behind them.

Presigned URLs: the key to everything

A presigned URL is a temporary, signed URL that gives the client permission to upload directly to S3 without needing AWS credentials. Your backend generates it, the client uses it, and your server never sees the file.

@PostMapping("/api/uploads/request")
public UploadPermission requestUpload(
        @RequestBody UploadRequest request,
        @AuthenticationPrincipal User user) {

    // Validate: file type, size, user quota
    validateUploadRequest(request, user);

    // Generate unique key
    String key = String.format("uploads/%s/%s/%s",
        user.getId(),
        LocalDate.now(),
        UUID.randomUUID() + "_" + sanitize(request.getFileName()));

    // Generate presigned URL (expires in 5 minutes)
    PutObjectPresignRequest presignRequest = PutObjectPresignRequest.builder()
        .signatureDuration(Duration.ofMinutes(5))
        .putObjectRequest(b -> b
            .bucket(uploadBucket)
            .key(key)
            .contentType(request.getContentType())
            .contentLength(request.getFileSize()))
        .build();

    PresignedPutObjectRequest presigned = s3Presigner.presignPutObject(presignRequest);

    // Save metadata with status PENDING
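    FileMetadata metadata = FileMetadata.builder()
        .key(key)
        .userId(user.getId())
        .fileName(request.getFileName())
        .fileSize(request.getFileSize())
        // Stored so the processing worker can compare it against the magic bytes
        .contentType(request.getContentType())
        .status(FileStatus.PENDING)
        .createdAt(Instant.now())
        .build();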
    fileMetadataRepository.save(metadata);

    return new UploadPermission(presigned.url().toString(), key, metadata.getId());
}
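
The handler above calls a sanitize helper that isn't shown. Here is a minimal sketch of what it could look like, assuming the goal is a key segment that is safe to embed in an S3 key and a URL. The class name FileNames, the allow-list, and the 100-character cap are all assumptions for illustration, not part of the original code.

```java
// Hypothetical helper; the original post does not show its sanitize() implementation.
final class FileNames {

    private static final int MAX_LENGTH = 100; // assumed cap, tune to your needs

    static String sanitize(String fileName) {
        if (fileName == null) {
            return "file";
        }
        // Keep only the last path segment, for both "/" and "\" separators.
        int lastSeparator = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        String base = fileName.substring(lastSeparator + 1);

        // Replace everything outside a conservative allow-list with "_".
        String cleaned = base.replaceAll("[^A-Za-z0-9._-]", "_");

        // Collapse runs of dots so ".." never survives into the key.
        cleaned = cleaned.replaceAll("\\.{2,}", ".");

        // Keep the tail when truncating so the extension is preserved.
        if (cleaned.length() > MAX_LENGTH) {
            cleaned = cleaned.substring(cleaned.length() - MAX_LENGTH);
        }
        return cleaned.isEmpty() ? "file" : cleaned;
    }
}
```

With this sketch, "../../etc/passwd" becomes "passwd" and "my photo (1).jpg" becomes "my_photo__1_.jpg". The UUID prefix in the key still does the real work of guaranteeing uniqueness; the sanitized name is only there for readability.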

The client gets back a URL and uploads directly to S3. Your server handled the auth check and metadata in milliseconds. The heavy lifting happens between the client and AWS, not through your infrastructure.
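
To make the client side concrete, here is a rough sketch of the request a Java client could build from the returned URL, using the JDK's built-in java.net.http API. PresignedUploadClient and buildPut are hypothetical names. The sketch assumes the backend signed the content type (as the handler above does), in which case the client has to send the same Content-Type value or S3 rejects the signature.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper: builds the PUT that sends bytes straight to S3.
final class PresignedUploadClient {

    // presignedUrl is the URL returned by /api/uploads/request.
    // contentType should be the same value that was sent in the upload request.
    static HttpRequest buildPut(String presignedUrl, String contentType, byte[] body) {
        return HttpRequest.newBuilder(URI.create(presignedUrl))
            .header("Content-Type", contentType)
            // Content-Length is derived from the body by the HTTP client.
            .PUT(HttpRequest.BodyPublishers.ofByteArray(body))
            .build();
    }
}
```

Sending it is one call: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding()). No AWS SDK and no AWS credentials are needed on the client; the signature in the query string is the credential.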

Why 5 minutes? — Presigned URLs are temporary credentials. The shorter the expiration, the smaller the attack window if the URL leaks. Five minutes is long enough for any reasonable upload to start, short enough that a leaked URL becomes useless quickly.

Never trust the file extension

This is one of those lessons that sounds obvious when you hear it but most teams learn through a security incident. We had one where a user renamed an executable to .jpg and our system accepted it because we were checking only the extension.

File extensions are just a naming convention. They carry zero information about what's actually inside the file. To know what a file really is, you need to read its content.

Most binary file formats start with a fixed signature in their first few bytes, called magic bytes:

// Magic bytes for common file types
private static final Map<String, byte[]> MAGIC_BYTES = Map.of(
    "image/jpeg",  new byte[]{(byte)0xFF, (byte)0xD8, (byte)0xFF},
    "image/png",   new byte[]{(byte)0x89, 0x50, 0x4E, 0x47},
    "application/pdf", new byte[]{0x25, 0x50, 0x44, 0x46}
);

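public boolean isValidFileType(byte[] fileHeader, String declaredType) {
    // Map.of(...) maps reject null keys, so guard before the lookup
    if (declaredType == null || fileHeader == null) return false;

    byte[] expected = MAGIC_BYTES.get(declaredType);
    // Unknown type, or a header shorter than the signature, can never match
    if (expected == null || fileHeader.length < expected.length) return false;

    for (int i = 0; i < expected.length; i++) {
        if (fileHeader[i] != expected[i]) return false;
    }
    return true;
}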

This validation runs in your processing worker after the file lands in S3, not during the upload itself. If the magic bytes don't match the declared content type, quarantine the file and alert your security team.
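
Here is a small self-contained sketch of the renamed-executable case, assuming the worker has the object's bytes available as an InputStream. MagicByteCheck, readHeader, and startsWith are hypothetical names. The JPEG signature is the same one used in the map above, and the executable header is the "MZ" prefix that Windows executables start with.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper showing the header check in isolation.
final class MagicByteCheck {

    static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    // Reads at most n bytes; returns fewer if the stream is shorter.
    static byte[] readHeader(InputStream in, int n) {
        try {
            return in.readNBytes(n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True only if the header is long enough and begins with the signature.
    static boolean startsWith(byte[] header, byte[] signature) {
        return header.length >= signature.length
            && Arrays.equals(header, 0, signature.length, signature, 0, signature.length);
    }
}
```

A file named photo.jpg whose first bytes are 0x4D 0x5A fails startsWith(header, JPEG) no matter what its extension or declared content type says. Keep in mind that a matching signature only proves the first bytes look right, which is why the virus scan still runs afterwards.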

Chunked and resumable uploads

Uploading a 500MB video as a single HTTP request is fragile. Network drops at 90% and the user starts over. On mobile networks, this is almost guaranteed to happen for large files.

The solution is multipart uploads, where the file is split into chunks (typically 5-10MB each) and uploaded independently. S3 supports this natively.

Chunked uploads with resume capability — only retry failed chunks

The key insight: your backend tracks which chunks have been uploaded. When the client reconnects, it asks "which parts are done?" and only uploads the remaining ones. The user experience goes from "start over from zero" to "continue from 90%." For mobile users on unreliable networks, this is the difference between a usable product and an unusable one.
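
The bookkeeping behind resume is mostly arithmetic. Here is a minimal sketch, assuming the backend stores the set of completed part numbers per upload. PartPlanner, Part, plan, and remaining are hypothetical names; the two constants reflect S3's multipart limits (every part except the last must be at least 5 MiB, and an upload can have at most 10,000 parts).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical planner: splits a file into numbered parts and works out what is left to send.
final class PartPlanner {

    record Part(int partNumber, long offset, long length) {}

    static final long MIN_PART_SIZE = 5L * 1024 * 1024; // S3 minimum for all but the last part
    static final int MAX_PARTS = 10_000;                 // S3 maximum parts per upload

    static List<Part> plan(long fileSize, long partSize) {
        if (fileSize <= 0) throw new IllegalArgumentException("fileSize must be positive");
        if (partSize < MIN_PART_SIZE) throw new IllegalArgumentException("partSize below 5 MiB");

        long count = (fileSize + partSize - 1) / partSize; // ceiling division
        if (count > MAX_PARTS) throw new IllegalArgumentException("too many parts, use a larger partSize");

        List<Part> parts = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            long offset = i * partSize;
            // S3 part numbers start at 1; the last part may be shorter.
            parts.add(new Part(i + 1, offset, Math.min(partSize, fileSize - offset)));
        }
        return parts;
    }

    // Parts the client still has to upload after a reconnect.
    static List<Part> remaining(List<Part> all, Set<Integer> completedPartNumbers) {
        List<Part> left = new ArrayList<>();
        for (Part p : all) {
            if (!completedPartNumbers.contains(p.partNumber())) left.add(p);
        }
        return left;
    }
}
```

A 500 MiB file with 8 MiB parts yields 63 parts, the last one 4 MiB long. If parts 1 through 56 were acknowledged before the connection dropped, remaining returns parts 57 through 63, so the client resends about 52 MiB instead of the whole file.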

The security gate: scan before you serve

This is the one that catches most teams off guard. The file finishes uploading and it's immediately accessible. No scanning, no validation, just raw access. Someone uploads a malicious PDF and it's served to every user who requests it.

A production system needs a gate between upload and access:

Files move through a status lifecycle — never served until they pass the security gate

Every file starts as PENDING. A worker picks it up, runs a virus scan, and moves it to READY or QUARANTINED. Only READY files can be served. Your download endpoint checks the status before generating a signed URL:

@GetMapping("/api/files/{fileId}/download")
public DownloadResponse download(
        @PathVariable Long fileId,
        @AuthenticationPrincipal User user) {

    FileMetadata file = fileMetadataRepository.findById(fileId)
        .orElseThrow(() -> new FileNotFoundException(fileId));

    // Permission check
    if (!file.getUserId().equals(user.getId())) {
        throw new AccessDeniedException("Not your file");
    }

    // Status check — only serve READY files
    if (file.getStatus() != FileStatus.READY) {
        throw new FileNotReadyException("File is still processing");
    }

    // Generate signed download URL (expires in 15 minutes)
    GetObjectPresignRequest presignRequest = GetObjectPresignRequest.builder()
        .signatureDuration(Duration.ofMinutes(15))
        .getObjectRequest(b -> b.bucket(bucket).key(file.getKey()))
        .build();

    String signedUrl = s3Presigner.presignGetObject(presignRequest).url().toString();
    return new DownloadResponse(signedUrl);
}

Notice the download URL expires in 15 minutes. Never expose raw S3 paths. A raw S3 URL only works if the object is publicly readable, and then it gives permanent access to anyone who has it. A signed URL is time-bound, specific to one file, and issued only after your permission check. If someone shares the link, it stops working once the 15 minutes are up.

Async processing: don't block the upload

After the file lands in S3, there's work to do: virus scanning, image compression, thumbnail generation, video transcoding, indexing for search. None of this should happen during the upload request.

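S3 can emit event notifications when objects are created. It delivers them natively to SQS, SNS, Lambda, or EventBridge; to consume them from Kafka, as the listener here does, you bridge one of those targets into a topic. Workers then consume the events asynchronously: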

@KafkaListener(topics = "file-uploads")
public void processUpload(S3EventNotification event) {
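    // S3 URL-encodes object keys in event notifications (a space arrives as "+"),
    // so decode before looking the key up
    String key = URLDecoder.decode(
        event.getRecords().get(0).getS3().getObject().getKey(), StandardCharsets.UTF_8);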

    FileMetadata file = fileMetadataRepository.findByKey(key)
        .orElseThrow();

    try {
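        // Step 1: Validate magic bytes (ranged GET for the first 16 bytes only;
        // assumes an AWS SDK v2 S3Client)
        byte[] header = s3Client.getObjectAsBytes(b -> b
            .bucket(bucket).key(key).range("bytes=0-15")).asByteArray();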
        if (!isValidFileType(header, file.getContentType())) {
            file.setStatus(FileStatus.QUARANTINED);
            file.setRejectionReason("Content type mismatch");
            fileMetadataRepository.save(file);
            return;
        }

        // Step 2: Virus scan
        ScanResult scan = virusScanner.scan(bucket, key);
        if (!scan.isClean()) {
            file.setStatus(FileStatus.QUARANTINED);
            file.setRejectionReason("Virus detected: " + scan.getThreatName());
            fileMetadataRepository.save(file);
            return;
        }

        // Step 3: Process (thumbnails, compression)
        if (file.isImage()) {
            thumbnailService.generate(bucket, key);
            compressionService.compress(bucket, key);
        }

        // Step 4: Mark ready
        file.setStatus(FileStatus.READY);
        file.setProcessedAt(Instant.now());
        fileMetadataRepository.save(file);

    } catch (Exception e) {
        file.setStatus(FileStatus.FAILED);
        file.setRejectionReason(e.getMessage());
        fileMetadataRepository.save(file);
        log.error("Processing failed for {}", key, e);
    }
}

The upload finishes as soon as S3 accepts the bytes; none of this processing sits on the request path. The user sees "Processing..." for a few seconds, then the file is ready. The processing pipeline can scale independently of your API. Heavy upload day? Add more workers. Your API servers don't care.
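
The statuses the worker and the download endpoint rely on can be made explicit, so a bug can't quietly flip a QUARANTINED file to READY. Here is a minimal sketch, assuming every non-PENDING status is terminal. UploadStatus is a hypothetical stand-in for the FileStatus enum used above, and a retry path such as FAILED back to PENDING is left out on purpose.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for FileStatus with the allowed transitions spelled out.
enum UploadStatus {
    PENDING, READY, QUARANTINED, FAILED;

    // Assumption: only PENDING files may move; every other status is terminal.
    Set<UploadStatus> allowedNext() {
        return this == PENDING
            ? EnumSet.of(READY, QUARANTINED, FAILED)
            : EnumSet.noneOf(UploadStatus.class);
    }

    boolean canTransitionTo(UploadStatus next) {
        return allowedNext().contains(next);
    }

    // The download endpoint's gate: only READY files are ever served.
    boolean isServable() {
        return this == READY;
    }
}
```

A worker would call canTransitionTo before each setStatus and refuse the update otherwise. That also makes redelivered queue messages harmless: a file that is already READY or QUARANTINED simply can't be moved again.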

Rate limiting and size limits

Without limits, your system is an open target. A bot can upload 10,000 garbage files in an hour and your storage bill becomes a boardroom conversation.

Set constraints at multiple levels:

Per file: Maximum file size (e.g., 100MB for images, 2GB for videos). Reject before the presigned URL is even generated.
Per user: Maximum total storage (e.g., 5GB per free user, 50GB per premium). Check quota before allowing upload.
Per time window: Maximum uploads per hour per user (e.g., 50 uploads/hour). Prevents automated abuse.
At the gateway: Request size limits at Nginx/API Gateway level. Don't let oversized requests reach your application at all.

private void validateUploadRequest(UploadRequest request, User user) {
    // File size limit
    if (request.getFileSize() > MAX_FILE_SIZE) {
        throw new FileTooLargeException(
            "Max file size is " + MAX_FILE_SIZE / 1_000_000 + "MB");
    }

    // User quota check
    long currentUsage = fileMetadataRepository.sumFileSizeByUserId(user.getId());
    if (currentUsage + request.getFileSize() > user.getStorageQuota()) {
        throw new StorageQuotaExceededException("Storage quota exceeded");
    }

    // Rate limit check
    long recentUploads = fileMetadataRepository
        .countByUserIdAndCreatedAtAfter(user.getId(), Instant.now().minus(1, ChronoUnit.HOURS));
    if (recentUploads >= MAX_UPLOADS_PER_HOUR) {
        throw new RateLimitExceededException("Upload limit exceeded. Try again later.");
    }
}

All of these checks happen before the presigned URL is generated. Reject early. Don't let the upload happen and then reject.
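
The hourly limit above is enforced with a database count on every request. If that query ever becomes a hotspot, an in-memory sliding window is one alternative. The sketch below is a simplified illustration, not a drop-in replacement: SlidingWindowLimiter is a hypothetical class, and it assumes a single API instance. With several instances you would need a shared store such as Redis instead.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-user sliding-window limiter (single-process only).
final class SlidingWindowLimiter {

    private final int maxEvents;
    private final Duration window;
    private final Map<String, Deque<Instant>> eventsByUser = new HashMap<>();

    SlidingWindowLimiter(int maxEvents, Duration window) {
        this.maxEvents = maxEvents;
        this.window = window;
    }

    // Returns true and records the event if the user is under the limit at `now`.
    synchronized boolean tryAcquire(String userId, Instant now) {
        Deque<Instant> events = eventsByUser.computeIfAbsent(userId, k -> new ArrayDeque<>());

        // Drop timestamps that have fallen out of the window.
        Instant cutoff = now.minus(window);
        while (!events.isEmpty() && !events.peekFirst().isAfter(cutoff)) {
            events.pollFirst();
        }

        if (events.size() >= maxEvents) {
            return false;
        }
        events.addLast(now);
        return true;
    }
}
```

Passing the clock in as a parameter keeps the limiter easy to test. For the 50 uploads/hour example you would create it as new SlidingWindowLimiter(50, Duration.ofHours(1)) and call tryAcquire(userId, Instant.now()) before generating the presigned URL.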

Common interview pitfalls

If you're asked this question in a system design interview, here's what separates a strong answer from an average one:

Average Answer

"Client uploads file to my server, server stores in database, return URL."

Routes files through backend. No presigned URLs. No async processing. No security scanning. No size limits.

Strong Answer

"Client gets a presigned URL, uploads directly to S3. Event triggers worker pipeline for scanning and processing. Files served through signed download URLs with expiration."

Explains trade-offs at each step. Mentions failure handling, resume, rate limiting.

The key mistakes candidates make:

Designing synchronous pipelines — processing during the upload request blocks everything.
Ignoring failure scenarios — what happens when the upload fails at 90%? When the worker crashes mid-scan?
Skipping security layers — no virus scanning, no content validation, raw S3 URLs exposed.
Overcomplicating too early — jumping to multi-region replication before the basic flow is solid.
Not clarifying requirements — file types? sizes? users? region? Always ask before designing.

Start simple. Get the basic presigned URL flow right. Then layer in scanning, processing, chunking, and CDN. Explain trade-offs at each step. That's what the interviewer wants to see: not a perfect design, but a structured, evolving one.

Quick reference: the complete flow

Here's the entire system in one summary for easy reference:

Client requests upload — sends file name, size, type to your API.
Backend validates — checks auth, file type, size limits, user quota, rate limits.
Backend generates presigned URL — valid for 5 minutes, restricted to specific file and bucket location.
Client uploads directly to S3 — for large files, uses multipart upload with resume capability.
S3 emits event — triggers Kafka/SQS message with file details.
Worker picks up — validates magic bytes, runs virus scan, generates thumbnails, compresses.
File marked READY — status updated in database. File is now servable.
Client requests download — backend checks permissions, generates signed download URL (15 min expiry).
Client downloads from S3/CDN — backend never touches file bytes in either direction.

Your backend manages permissions and metadata. Object storage handles the files. Workers handle the processing. Each layer scales independently. That's the architecture.
