March 15, 202611 min readRishi

Designing a URL Shortener: System Design from Scratch

URL shorteners look simple on the surface. You take a long URL, generate a short alias, and redirect anyone who visits that alias back to the original. But behind that simplicity hides a system that must handle billions of reads, guarantee uniqueness across distributed nodes, and resolve redirects in single-digit milliseconds. This is why "design a URL shortener" is one of the most popular system design interview questions — it touches on hashing, databases, caching, analytics, and horizontal scaling.

Let us build one from scratch.

Step 1: Requirements Gathering

Before writing any code, we need to pin down what we are actually building. This is the most important part of any system design exercise.

Functional Requirements

Shorten: Given a long URL, generate a unique short URL
Redirect: Given a short URL, redirect the user to the original long URL
Custom aliases: Optionally allow users to pick their own short code
Expiration: URLs can have an optional TTL (time to live)
Analytics: Track click counts, referrers, geolocation, and timestamps

Non-Functional Requirements

High availability: The redirect service must have 99.99% uptime
Low latency: Redirects should complete in under 10ms at the p99
Read-heavy: The read-to-write ratio will be roughly 100:1
Scale: Support 100M new URLs per month and 10B redirects per month

Back-of-the-Envelope Estimates

Let us do the math:

Writes: 100M URLs/month = ~40 URLs/second
Reads: 10B redirects/month = ~3,800 redirects/second
Storage per record: ~500 bytes (short code + long URL + metadata)
Storage for 5 years: 100M x 12 x 5 x 500 bytes = ~3 TB
Cache: If we cache the top 20% of URLs, that is ~600 GB

These numbers tell us something important: this is a read-dominated system. Our architecture should optimize aggressively for reads.

Step 2: High-Level Architecture

Here is the overall flow:

Client
  |
  v
API Gateway (rate limiting, auth)
  |
  v
Application Server (stateless)
  |
  +---> Cache (Redis) ---> hit? return long URL
  |         |
  |         v (miss)
  +---> Database (primary storage)
  |
  v
Analytics Pipeline (async, Kafka)

The key components are:

API Gateway — handles rate limiting, authentication, and SSL termination
Application servers — stateless service that handles shortening and redirecting
Cache layer — Redis or Memcached for hot URLs
Database — stores the URL mappings
Analytics pipeline — asynchronous event processing for click tracking

Step 3: URL Encoding Strategy

This is the core algorithmic question. How do we generate a short, unique string for each URL?

Approach 1: Base62 Encoding

We use the characters [a-zA-Z0-9] which gives us 62 possible characters per position. A 7-character code gives us:

62^7 = 3,521,614,606,208 (~3.5 trillion unique codes)

That is far more than we need. Here is how it works:

import string

ALPHABET = string.ascii_letters + string.digits  # 62 chars

def base62_encode(num: int) -> str:
    if num == 0:
        return ALPHABET[0]
    result = []
    while num > 0:
        result.append(ALPHABET[num % 62])
        num //= 62
    return ''.join(reversed(result))

def base62_decode(code: str) -> int:
    num = 0
    for char in code:
        num = num * 62 + ALPHABET.index(char)
    return num

# Example
print(base62_encode(123456789))  # "8M0kX"
print(base62_decode("8M0kX"))    # 123456789

Approach 2: MD5/SHA-256 Hash + Truncation

We hash the long URL and take the first 7 characters:

import hashlib

def hash_url(long_url: str) -> str:
    hash_hex = hashlib.md5(long_url.encode()).hexdigest()
    # Take first 43 bits worth (7 base62 chars)
    num = int(hash_hex[:11], 16)
    return base62_encode(num)[:7]

Problem: This creates hash collisions. Two different URLs can produce the same short code. We need collision resolution.

Approach 3: Pre-Generated Key Service (Recommended)

The cleanest solution is to pre-generate unique keys in a separate service:

class KeyGenerationService:
    """
    Pre-generates unique 7-character keys and stores them
    in a database. Application servers request keys in batches.
    """

    def __init__(self, db):
        self.db = db

    def generate_keys(self, batch_size=10000):
        """Generate a batch of unique keys and store them."""
        keys = set()
        while len(keys) < batch_size:
            key = base62_encode(random.randint(0, 62**7 - 1))
            key = key.ljust(7, '0')[:7]  # Ensure 7 chars
            keys.add(key)
        self.db.insert_unused_keys(list(keys))

    def get_key(self) -> str:
        """Atomically fetch and mark a key as used."""
        return self.db.fetch_and_mark_used()

This approach eliminates collisions entirely because every key is unique before it is ever assigned to a URL. Application servers request keys in batches (say 1000 at a time) and cache them locally, so the key service is not a bottleneck.

Step 4: Handling Hash Collisions

If you go with the hashing approach, you must handle collisions. Here is a robust strategy:

def create_short_url(long_url: str, db) -> str:
    short_code = hash_url(long_url)

    # Check for collision
    existing = db.get(short_code)

    if existing is None:
        # No collision — insert and return
        db.insert(short_code, long_url)
        return short_code
    elif existing.long_url == long_url:
        # Same URL already shortened — return existing code
        return short_code
    else:
        # Collision with a different URL
        # Append a counter and re-hash
        for i in range(10):
            new_code = hash_url(long_url + str(i))
            if db.get(new_code) is None:
                db.insert(new_code, long_url)
                return new_code
        raise Exception("Unable to resolve collision")

In practice, collisions are rare with 7-character base62 codes (3.5 trillion possibilities), but you must handle them to guarantee correctness.

Step 5: Database Schema

URL Mappings Table

CREATE TABLE url_mappings (
    short_code  VARCHAR(7)   PRIMARY KEY,
    long_url    TEXT         NOT NULL,
    user_id     BIGINT       DEFAULT NULL,
    created_at  TIMESTAMP    NOT NULL DEFAULT NOW(),
    expires_at  TIMESTAMP    DEFAULT NULL,
    click_count BIGINT       DEFAULT 0
);

-- Index for reverse lookups (check if URL already shortened)
CREATE INDEX idx_long_url ON url_mappings (long_url);

-- Index for expiration cleanup
CREATE INDEX idx_expires ON url_mappings (expires_at)
    WHERE expires_at IS NOT NULL;

Analytics Events Table

CREATE TABLE click_events (
    id          BIGSERIAL    PRIMARY KEY,
    short_code  VARCHAR(7)   NOT NULL,
    clicked_at  TIMESTAMP    NOT NULL DEFAULT NOW(),
    referrer    TEXT,
    user_agent  TEXT,
    ip_address  INET,
    country     VARCHAR(2),
    city        VARCHAR(100)
);

-- Partition by month for efficient querying
-- In production, use TimescaleDB or ClickHouse for this
CREATE INDEX idx_click_code_time
    ON click_events (short_code, clicked_at);

SQL vs. NoSQL Decision

For URL mappings, both work. Here is the tradeoff:

Factor	SQL (PostgreSQL)	NoSQL (DynamoDB)
Lookups by short_code	Fast (primary key)	Fast (partition key)
Reverse lookups	Index on long_url	Requires GSI
Transactions	Full ACID	Limited
Scaling	Harder (sharding)	Automatic
Cost at scale	Lower	Higher

For most teams, PostgreSQL with read replicas is the right starting point. Switch to DynamoDB if you need automatic scaling beyond what read replicas provide.

Step 6: Read-Heavy Optimization

Since our read-to-write ratio is 100:1, we must make redirects blazing fast.

Caching Strategy

We use a write-around cache with Redis:

class URLRedirectService:
    def __init__(self, redis_client, db):
        self.redis = redis_client
        self.db = db
        self.CACHE_TTL = 86400  # 24 hours

    async def resolve(self, short_code: str) -> str | None:
        # Step 1: Check cache
        long_url = await self.redis.get(f"url:{short_code}")
        if long_url:
            return long_url.decode()

        # Step 2: Cache miss — query database
        record = await self.db.get_url(short_code)
        if record is None:
            # Cache the miss too (prevent cache penetration)
            await self.redis.set(
                f"url:{short_code}", "__NULL__",
                ex=300  # Short TTL for misses
            )
            return None

        # Step 3: Check expiration
        if record.expires_at and record.expires_at < datetime.utcnow():
            return None

        # Step 4: Populate cache and return
        await self.redis.set(
            f"url:{short_code}", record.long_url,
            ex=self.CACHE_TTL
        )
        return record.long_url

Cache Sizing

Hot URLs follow a Zipf distribution — a small percentage of URLs get most of the traffic
Caching the top 20% of URLs should give us a 90%+ cache hit rate
With 6B total URLs and 500 bytes each: 6B * 0.2 * 500 = 600 GB
This fits in a Redis cluster with 10-15 nodes (64 GB each)

Database Read Replicas

For cache misses, we route reads to read replicas and writes to the primary:

Writes --> Primary DB (1 instance)
Reads  --> Read Replicas (3-5 instances, async replication)

The slight replication lag (typically under 100ms) is acceptable because a newly created short URL is extremely unlikely to receive a redirect request within that window.

Step 7: Analytics Tracking

We do not want analytics to slow down redirects. The solution is asynchronous event processing:

from kafka import KafkaProducer
import json

class AnalyticsTracker:
    def __init__(self):
        self.producer = KafkaProducer(
            bootstrap_servers=['kafka:9092'],
            value_serializer=lambda v: json.dumps(v).encode()
        )

    def track_click(self, short_code: str, request):
        event = {
            "short_code": short_code,
            "timestamp": datetime.utcnow().isoformat(),
            "referrer": request.headers.get("Referer"),
            "user_agent": request.headers.get("User-Agent"),
            "ip": request.client.host,
        }
        # Fire and forget — do not await
        self.producer.send("click-events", value=event)

The analytics pipeline looks like this:

Redirect Service --> Kafka Topic --> Consumer Group --> ClickHouse/TimescaleDB
                                        |
                                        +--> Real-time dashboard (WebSocket)
                                        +--> Batch aggregation (hourly/daily)

Counter Updates

For the click_count in the URL mappings table, do not update it on every click. Instead:

Accumulate counts in Redis: INCR click:{short_code}
Flush to the database every 5 minutes via a batch job
This reduces database write load by 99%

Step 8: The Redirect Flow

The actual redirect is a 301 or 302 HTTP response. The choice matters:

301 (Permanent Redirect): Browser caches the redirect. Faster for users, but you lose analytics visibility because subsequent visits never hit your server.
302 (Temporary Redirect): Browser does not cache. Every visit hits your server, so you can track every click.

Most URL shorteners use 302 because analytics tracking is a key feature:

from fastapi import FastAPI, Response
from fastapi.responses import RedirectResponse

app = FastAPI()

@app.get("/{short_code}")
async def redirect(short_code: str, request: Request):
    long_url = await redirect_service.resolve(short_code)

    if long_url is None:
        return Response(status_code=404, content="URL not found")

    # Track the click asynchronously
    analytics.track_click(short_code, request)

    # 302 redirect to preserve analytics
    return RedirectResponse(url=long_url, status_code=302)

Step 9: Scaling Considerations

Horizontal Scaling

Every component in our architecture is horizontally scalable:

Application servers: Stateless — add more behind the load balancer
Redis cache: Shard by short_code hash across a cluster
Database: Read replicas for reads, sharding by short_code for writes
Kafka: Partition the click-events topic for parallel consumers

Database Sharding

When a single database can no longer handle the write volume, shard by the first character of the short_code:

Shard 0: a-f  (covers short codes starting with a through f)
Shard 1: g-m
Shard 2: n-s
Shard 3: t-z
Shard 4: A-M
Shard 5: N-Z
Shard 6: 0-9

This gives you 7 shards with roughly equal distribution. Each shard is an independent PostgreSQL instance with its own read replicas.

Rate Limiting

Protect the URL creation endpoint from abuse:

Per-user: 100 URLs per hour for free tier, 10,000 for paid
Per-IP: 50 URLs per hour for anonymous users
Global: Circuit breaker at 10,000 URLs/second to protect the database

URL Cleanup

Expired URLs consume storage. Run a background job to clean them up:

async def cleanup_expired_urls():
    """Run every hour via cron."""
    deleted = await db.execute("""
        DELETE FROM url_mappings
        WHERE expires_at IS NOT NULL
          AND expires_at < NOW()
        LIMIT 10000
    """)
    # Also invalidate cache entries
    for code in deleted:
        await redis.delete(f"url:{code}")

Step 10: Security Considerations

A URL shortener can be abused for phishing. Mitigate this with:

URL validation: Reject malformed URLs and known malicious domains
Google Safe Browsing API: Check every submitted URL against Google's malware/phishing database
Preview pages: Show users where a short URL leads before redirecting (like bit.ly's + suffix)
Abuse reporting: Allow users to flag malicious short URLs
Rate limiting: Prevent automated bulk URL creation

Complete API Design

POST   /api/v1/urls          Create a short URL
GET    /api/v1/urls/{code}   Get URL metadata (not redirect)
DELETE /api/v1/urls/{code}   Delete a short URL
GET    /api/v1/urls/{code}/stats   Get analytics for a URL
GET    /{code}               Redirect to long URL (302)

Create URL Request/Response

// POST /api/v1/urls
// Request
{
    "long_url": "https://example.com/very/long/path?with=params",
    "custom_alias": "my-link",    // optional
    "expires_in": 86400           // optional, seconds
}

// Response (201 Created)
{
    "short_code": "my-link",
    "short_url": "https://sho.rt/my-link",
    "long_url": "https://example.com/very/long/path?with=params",
    "expires_at": "2026-03-16T00:00:00Z",
    "created_at": "2026-03-15T00:00:00Z"
}

Summary

Component	Technology	Why
Key Generation	Pre-generated base62 keys	No collisions, O(1) assignment
Primary Storage	PostgreSQL (sharded)	ACID, mature, cost-effective
Cache	Redis Cluster	Sub-millisecond lookups
Analytics Queue	Kafka	Decouple analytics from redirects
Analytics Store	ClickHouse	Columnar, fast aggregations
Load Balancer	Nginx / ALB	SSL termination, routing

The URL shortener is a deceptively deep system design problem. It tests your ability to think about encoding, storage, caching, async processing, and scaling — all in a system that appears trivially simple. The key insight is that this is a read-dominated system, and every design decision should optimize for redirect latency first.

SharePost Share

Keep reading

Jul 13, 20265 min read

Designing Ad Click Aggregation: Exactly-Once Counting at Scale

Billions of clicks, billed to the cent: streaming aggregation with watermarks, dedupe, idempotent sinks, and lambda-style reconciliation.

system-design

Jul 13, 20265 min read

Designing a CDN: Cache Hierarchy, Invalidation, and Request Routing

How a CDN actually works: edge PoPs, origin shields, consistent-hash cache keys, purge fan-out, and the anycast vs DNS routing decision.

system-design

Jul 13, 20265 min read

Designing a Distributed Cache Service: Redis Cluster Internals and Hot Keys

Build the cache, not just use it: slot-based sharding, gossip and failover, eviction under memory pressure, and the hot-key problem that shards can't solve.

Designing a URL Shortener: System Design from Scratch

Step 1: Requirements Gathering

Functional Requirements

Non-Functional Requirements

Back-of-the-Envelope Estimates

Step 2: High-Level Architecture

Step 3: URL Encoding Strategy

Approach 1: Base62 Encoding

Approach 2: MD5/SHA-256 Hash + Truncation

Approach 3: Pre-Generated Key Service (Recommended)

Step 4: Handling Hash Collisions

Step 5: Database Schema

URL Mappings Table

Analytics Events Table

SQL vs. NoSQL Decision

Step 6: Read-Heavy Optimization

Caching Strategy

Cache Sizing

Database Read Replicas

Step 7: Analytics Tracking

Counter Updates

Step 8: The Redirect Flow

Step 9: Scaling Considerations

Horizontal Scaling

Database Sharding

Rate Limiting

URL Cleanup

Step 10: Security Considerations

Complete API Design

Create URL Request/Response

Summary

Keep reading

Designing Ad Click Aggregation: Exactly-Once Counting at Scale

Designing a CDN: Cache Hierarchy, Invalidation, and Request Routing

Designing a Distributed Cache Service: Redis Cluster Internals and Hot Keys

New posts, straight to your inbox

Comments