Rishi · 11 min read

Designing a URL Shortener: System Design from Scratch
URL shorteners look simple on the surface. You take a long URL, generate a short alias, and redirect anyone who visits that alias back to the original. But behind that simplicity hides a system that must handle billions of reads, guarantee uniqueness across distributed nodes, and resolve redirects in single-digit milliseconds. This is why "design a URL shortener" is one of the most popular system design interview questions — it touches on hashing, databases, caching, analytics, and horizontal scaling.

Let us build one from scratch.

Step 1: Requirements Gathering

Before writing any code, we need to pin down what we are actually building. This is the most important part of any system design exercise.

Functional Requirements

  • Shorten: Given a long URL, generate a unique short URL
  • Redirect: Given a short URL, redirect the user to the original long URL
  • Custom aliases: Optionally allow users to pick their own short code
  • Expiration: URLs can have an optional TTL (time to live)
  • Analytics: Track click counts, referrers, geolocation, and timestamps

Non-Functional Requirements

  • High availability: The redirect service must have 99.99% uptime
  • Low latency: Redirects should complete in under 10ms at the p99
  • Read-heavy: The read-to-write ratio will be roughly 100:1
  • Scale: Support 100M new URLs per month and 10B redirects per month

Back-of-the-Envelope Estimates

Let us do the math:

  • Writes: 100M URLs/month = ~40 URLs/second
  • Reads: 10B redirects/month = ~3,800 redirects/second
  • Storage per record: ~500 bytes (short code + long URL + metadata)
  • Storage for 5 years: 100M x 12 x 5 x 500 bytes = ~3 TB
  • Cache: If we cache the top 20% of URLs, that is ~600 GB

These numbers tell us something important: this is a read-dominated system. Our architecture should optimize aggressively for reads.
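The arithmetic above is easy to sanity-check in code (assuming a 30-day month):

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # ~2.6M seconds

writes_per_sec = 100_000_000 / SECONDS_PER_MONTH    # ~39, i.e. the ~40/s above
reads_per_sec = 10_000_000_000 / SECONDS_PER_MONTH  # ~3,858

RECORD_BYTES = 500
records_5yr = 100_000_000 * 12 * 5                  # 6 billion records
storage_tb = records_5yr * RECORD_BYTES / 1e12      # 3.0 TB
cache_gb = records_5yr * 0.2 * RECORD_BYTES / 1e9   # 600 GB

print(f"{writes_per_sec:.0f} writes/s, {reads_per_sec:.0f} reads/s")
print(f"{storage_tb:.1f} TB storage, {cache_gb:.0f} GB cache")
```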

Step 2: High-Level Architecture

Here is the overall flow:

Client
  |
  v
API Gateway (rate limiting, auth)
  |
  v
Application Server (stateless)
  |
  +---> Cache (Redis) ---> hit? return long URL
  |         |
  |         v (miss)
  +---> Database (primary storage)
  |
  v
Analytics Pipeline (async, Kafka)

The key components are:

  1. API Gateway — handles rate limiting, authentication, and SSL termination
  2. Application servers — stateless service that handles shortening and redirecting
  3. Cache layer — Redis or Memcached for hot URLs
  4. Database — stores the URL mappings
  5. Analytics pipeline — asynchronous event processing for click tracking

Step 3: URL Encoding Strategy

This is the core algorithmic question. How do we generate a short, unique string for each URL?

Approach 1: Base62 Encoding

We use the characters [a-zA-Z0-9] which gives us 62 possible characters per position. A 7-character code gives us:

62^7 = 3,521,614,606,208 (~3.5 trillion unique codes)

That is far more than we need. Here is how it works:

import string

ALPHABET = string.ascii_letters + string.digits  # 62 chars

def base62_encode(num: int) -> str:
    if num == 0:
        return ALPHABET[0]
    result = []
    while num > 0:
        result.append(ALPHABET[num % 62])
        num //= 62
    return ''.join(reversed(result))

def base62_decode(code: str) -> int:
    num = 0
    for char in code:
        num = num * 62 + ALPHABET.index(char)
    return num

# Example
print(base62_encode(123456789))  # "iwaUH"
print(base62_decode("iwaUH"))    # 123456789

Approach 2: MD5/SHA-256 Hash + Truncation

We hash the long URL and take the first 7 characters:

import hashlib

def hash_url(long_url: str) -> str:
    hash_hex = hashlib.md5(long_url.encode()).hexdigest()
    # Take the first 11 hex digits (44 bits), then truncate
    # the base62 encoding to 7 characters
    num = int(hash_hex[:11], 16)
    return base62_encode(num)[:7]

Problem: This creates hash collisions. Two different URLs can produce the same short code. We need collision resolution.

The cleanest solution is to pre-generate unique keys in a separate service:

import random

class KeyGenerationService:
    """
    Pre-generates unique 7-character keys and stores them
    in a database. Application servers request keys in batches.
    """

    def __init__(self, db):
        self.db = db

    def generate_keys(self, batch_size=10000):
        """Generate a batch of unique keys and store them."""
        keys = set()
        while len(keys) < batch_size:
            key = base62_encode(random.randint(0, 62**7 - 1))
            key = key.ljust(7, '0')[:7]  # Ensure 7 chars
            keys.add(key)
        self.db.insert_unused_keys(list(keys))

    def get_key(self) -> str:
        """Atomically fetch and mark a key as used."""
        return self.db.fetch_and_mark_used()

This approach eliminates collisions entirely because every key is unique before it is ever assigned to a URL. Application servers request keys in batches (say 1000 at a time) and cache them locally, so the key service is not a bottleneck.
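The client side of that batching can be sketched as a local key buffer. This assumes the key service exposes a batch `get_keys(n)` endpoint (a hypothetical extension of the single-key `get_key` above):

```python
from collections import deque

class KeyClient:
    """Client-side key buffer: fetches keys from the key generation
    service in batches and hands them out locally, so most URL
    creations never touch the key database."""

    def __init__(self, key_service, batch_size=1000):
        self.key_service = key_service
        self.batch_size = batch_size
        self.buffer = deque()

    def next_key(self) -> str:
        if not self.buffer:
            # One round trip refills the whole buffer
            self.buffer.extend(self.key_service.get_keys(self.batch_size))
        return self.buffer.popleft()
```

With a batch size of 1000, only one in a thousand shorten requests pays the round trip to the key service.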

Step 4: Handling Hash Collisions

If you go with the hashing approach, you must handle collisions. Here is a robust strategy:

def create_short_url(long_url: str, db) -> str:
    short_code = hash_url(long_url)

    # Check for collision
    existing = db.get(short_code)

    if existing is None:
        # No collision — insert and return
        db.insert(short_code, long_url)
        return short_code
    elif existing.long_url == long_url:
        # Same URL already shortened — return existing code
        return short_code
    else:
        # Collision with a different URL
        # Append a counter and re-hash
        for i in range(10):
            new_code = hash_url(long_url + str(i))
            if db.get(new_code) is None:
                db.insert(new_code, long_url)
                return new_code
        raise Exception("Unable to resolve collision")

In practice, collisions are rare with 7-character base62 codes (3.5 trillion possibilities), but you must handle them to guarantee correctness.
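We can put a number on "rare": with n codes already assigned out of N possibilities, a fresh hash collides with probability roughly n/N. A quick check at our 5-year scale:

```python
N = 62 ** 7          # ~3.5 trillion possible 7-char codes
n = 6_000_000_000    # URLs after 5 years at 100M/month

p_collision = n / N  # chance a new hash hits an existing code
print(f"{p_collision:.4%}")  # ~0.17% per insert
```

At ~0.17% per attempt, the 10-retry loop above essentially never exhausts its retries.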

Step 5: Database Schema

URL Mappings Table

CREATE TABLE url_mappings (
    short_code  VARCHAR(7)   PRIMARY KEY,
    long_url    TEXT         NOT NULL,
    user_id     BIGINT       DEFAULT NULL,
    created_at  TIMESTAMP    NOT NULL DEFAULT NOW(),
    expires_at  TIMESTAMP    DEFAULT NULL,
    click_count BIGINT       DEFAULT 0
);

-- Index for reverse lookups (check if URL already shortened).
-- Long URLs can exceed the btree row-size limit, so index a
-- hash of the URL rather than the raw text:
CREATE INDEX idx_long_url ON url_mappings (md5(long_url));

-- Index for expiration cleanup
CREATE INDEX idx_expires ON url_mappings (expires_at)
    WHERE expires_at IS NOT NULL;

Analytics Events Table

CREATE TABLE click_events (
    id          BIGSERIAL    PRIMARY KEY,
    short_code  VARCHAR(7)   NOT NULL,
    clicked_at  TIMESTAMP    NOT NULL DEFAULT NOW(),
    referrer    TEXT,
    user_agent  TEXT,
    ip_address  INET,
    country     VARCHAR(2),
    city        VARCHAR(100)
);

-- Partition by month for efficient querying
-- In production, use TimescaleDB or ClickHouse for this
CREATE INDEX idx_click_code_time
    ON click_events (short_code, clicked_at);

SQL vs. NoSQL Decision

For URL mappings, both work. Here is the tradeoff:

Factor                 SQL (PostgreSQL)     NoSQL (DynamoDB)
---------------------  -------------------  --------------------
Lookups by short_code  Fast (primary key)   Fast (partition key)
Reverse lookups        Index on long_url    Requires GSI
Transactions           Full ACID            Limited
Scaling                Harder (sharding)    Automatic
Cost at scale          Lower                Higher

For most teams, PostgreSQL with read replicas is the right starting point. Switch to DynamoDB if you need automatic scaling beyond what read replicas provide.

Step 6: Read-Heavy Optimization

Since our read-to-write ratio is 100:1, we must make redirects blazing fast.

Caching Strategy

We use a write-around cache with Redis:

from datetime import datetime

NULL_SENTINEL = "__NULL__"

class URLRedirectService:
    def __init__(self, redis_client, db):
        self.redis = redis_client
        self.db = db
        self.CACHE_TTL = 86400  # 24 hours

    async def resolve(self, short_code: str) -> str | None:
        # Step 1: Check cache
        cached = await self.redis.get(f"url:{short_code}")
        if cached:
            cached = cached.decode()
            # A cached sentinel means we already know this code
            # does not exist — return None without hitting the DB
            return None if cached == NULL_SENTINEL else cached

        # Step 2: Cache miss — query database
        record = await self.db.get_url(short_code)
        if record is None:
            # Cache the miss too (prevent cache penetration)
            await self.redis.set(
                f"url:{short_code}", NULL_SENTINEL,
                ex=300  # Short TTL for misses
            )
            return None

        # Step 3: Check expiration
        if record.expires_at and record.expires_at < datetime.utcnow():
            return None

        # Step 4: Populate cache and return
        await self.redis.set(
            f"url:{short_code}", record.long_url,
            ex=self.CACHE_TTL
        )
        return record.long_url

Cache Sizing

  • Hot URLs follow a Zipf distribution — a small percentage of URLs get most of the traffic
  • Caching the top 20% of URLs should give us a 90%+ cache hit rate
  • With 6B total URLs and 500 bytes each: 6B * 0.2 * 500 = 600 GB
  • This fits in a Redis cluster with 10-15 nodes (64 GB each)

Database Read Replicas

For cache misses, we route reads to read replicas and writes to the primary:

Writes --> Primary DB (1 instance)
Reads  --> Read Replicas (3-5 instances, async replication)

The slight replication lag (typically under 100ms) is acceptable because a newly created short URL is extremely unlikely to receive a redirect request within that window.

Step 7: Analytics Tracking

We do not want analytics to slow down redirects. The solution is asynchronous event processing:

from datetime import datetime
from kafka import KafkaProducer
import json

class AnalyticsTracker:
    def __init__(self):
        self.producer = KafkaProducer(
            bootstrap_servers=['kafka:9092'],
            value_serializer=lambda v: json.dumps(v).encode()
        )

    def track_click(self, short_code: str, request):
        event = {
            "short_code": short_code,
            "timestamp": datetime.utcnow().isoformat(),
            "referrer": request.headers.get("Referer"),
            "user_agent": request.headers.get("User-Agent"),
            "ip": request.client.host,
        }
        # Fire and forget — do not await
        self.producer.send("click-events", value=event)

The analytics pipeline looks like this:

Redirect Service --> Kafka Topic --> Consumer Group --> ClickHouse/TimescaleDB
                                        |
                                        +--> Real-time dashboard (WebSocket)
                                        +--> Batch aggregation (hourly/daily)

Counter Updates

For the click_count in the URL mappings table, do not update it on every click. Instead:

  1. Accumulate counts in Redis: INCR click:{short_code}
  2. Flush to the database every 5 minutes via a batch job
  3. This reduces database write load by 99%
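A sketch of that accumulate-and-flush pattern, assuming a redis-py-compatible client and a hypothetical `db.add_clicks` helper:

```python
def track_click_count(r, short_code: str) -> None:
    """Hot path: one Redis INCR per click, no database write."""
    r.incr(f"click:{short_code}")

def flush_click_counts(r, db) -> None:
    """Batch job (run every ~5 minutes): drain the Redis counters
    into the database. GETDEL (Redis >= 6.2) reads and clears each
    counter atomically, so clicks that arrive mid-flush land in the
    next run rather than being lost."""
    for key in list(r.scan_iter(match="click:*")):
        count = r.getdel(key)
        if count:
            name = key.decode() if isinstance(key, bytes) else key
            short_code = name.split(":", 1)[1]
            # e.g. UPDATE url_mappings
            #      SET click_count = click_count + %s
            #      WHERE short_code = %s
            db.add_clicks(short_code, int(count))
```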

Step 8: The Redirect Flow

The actual redirect is a 301 or 302 HTTP response. The choice matters:

  • 301 (Moved Permanently): Browser caches the redirect. Faster for users, but you lose analytics visibility because subsequent visits never hit your server.
  • 302 (Found, a temporary redirect): Browser does not cache. Every visit hits your server, so you can track every click.

Most URL shorteners use 302 because analytics tracking is a key feature:

from fastapi import FastAPI, Request, Response
from fastapi.responses import RedirectResponse

app = FastAPI()

@app.get("/{short_code}")
async def redirect(short_code: str, request: Request):
    long_url = await redirect_service.resolve(short_code)

    if long_url is None:
        return Response(status_code=404, content="URL not found")

    # Track the click asynchronously
    analytics.track_click(short_code, request)

    # 302 redirect to preserve analytics
    return RedirectResponse(url=long_url, status_code=302)

Step 9: Scaling Considerations

Horizontal Scaling

Every component in our architecture is horizontally scalable:

  • Application servers: Stateless — add more behind the load balancer
  • Redis cache: Shard by short_code hash across a cluster
  • Database: Read replicas for reads, sharding by short_code for writes
  • Kafka: Partition the click-events topic for parallel consumers

Database Sharding

When a single database can no longer handle the write volume, shard by the first character of the short_code:

Shard 0: a-f  (covers short codes starting with a through f)
Shard 1: g-m
Shard 2: n-s
Shard 3: t-z
Shard 4: A-M
Shard 5: N-Z
Shard 6: 0-9

This gives you 7 shards, but note that the ranges cover unequal numbers of characters (6 for a-f versus 13 for A-M), so with uniformly random codes the load will be skewed; hashing the short code and taking it modulo the shard count distributes writes more evenly. Each shard is an independent PostgreSQL instance with its own read replicas.
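A routing function for the character-range scheme above might look like this sketch:

```python
# Shard ranges mirroring the layout above: (low_char, high_char) per shard
SHARD_RANGES = [
    ("a", "f"), ("g", "m"), ("n", "s"), ("t", "z"),
    ("A", "M"), ("N", "Z"), ("0", "9"),
]

def shard_for(short_code: str) -> int:
    """Route a short code to its shard by first character."""
    first = short_code[0]
    for i, (lo, hi) in enumerate(SHARD_RANGES):
        if lo <= first <= hi:
            return i
    raise ValueError(f"invalid short code: {short_code!r}")
```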

Rate Limiting

Protect the URL creation endpoint from abuse:

  • Per-user: 100 URLs per hour for free tier, 10,000 for paid
  • Per-IP: 50 URLs per hour for anonymous users
  • Global: Circuit breaker at 10,000 URLs/second to protect the database
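The per-user and per-IP rules can start as a fixed-window counter. This is a single-process sketch; in production you would typically back it with Redis (or use sliding windows / token buckets) so the limits hold across all app servers:

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """In-process fixed-window rate limiter: at most `limit`
    actions per `window_seconds` per key (user ID or IP)."""

    def __init__(self, limit: int, window_seconds: int = 3600):
        self.limit = limit
        self.window = window_seconds
        self.counts = defaultdict(int)  # (key, window_index) -> count

    def allow(self, key: str) -> bool:
        window_index = int(time.time()) // self.window
        bucket = (key, window_index)
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True
```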

URL Cleanup

Expired URLs consume storage. Run a background job to clean them up:

async def cleanup_expired_urls():
    """Run every hour via cron. PostgreSQL has no DELETE ... LIMIT,
    so bound the batch with a subquery, and use RETURNING to get
    the deleted codes back for cache invalidation."""
    deleted = await db.fetch("""
        DELETE FROM url_mappings
        WHERE short_code IN (
            SELECT short_code FROM url_mappings
            WHERE expires_at IS NOT NULL
              AND expires_at < NOW()
            LIMIT 10000
        )
        RETURNING short_code
    """)
    # Also invalidate cache entries
    for row in deleted:
        await redis.delete(f"url:{row['short_code']}")

Step 10: Security Considerations

A URL shortener can be abused for phishing. Mitigate this with:

  • URL validation: Reject malformed URLs and known malicious domains
  • Google Safe Browsing API: Check every submitted URL against Google's malware/phishing database
  • Preview pages: Show users where a short URL leads before redirecting (like bit.ly's + suffix)
  • Abuse reporting: Allow users to flag malicious short URLs
  • Rate limiting: Prevent automated bulk URL creation
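The first bullet can start as simple scheme and hostname checks. A minimal sketch — the blocklist is a placeholder, and a Safe Browsing API call would slot in after these cheap local checks:

```python
from urllib.parse import urlparse

BLOCKED_DOMAINS = {"evil.example"}  # placeholder blocklist

def validate_url(long_url: str) -> bool:
    """Reject URLs that are malformed, use non-HTTP schemes,
    or point at known-bad domains."""
    try:
        parsed = urlparse(long_url)
    except ValueError:
        return False
    if parsed.scheme not in ("http", "https"):
        return False  # blocks javascript:, data:, file:, etc.
    if not parsed.hostname:
        return False
    if parsed.hostname.lower() in BLOCKED_DOMAINS:
        return False
    return True
```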

Complete API Design

POST   /api/v1/urls          Create a short URL
GET    /api/v1/urls/{code}   Get URL metadata (not redirect)
DELETE /api/v1/urls/{code}   Delete a short URL
GET    /api/v1/urls/{code}/stats   Get analytics for a URL
GET    /{code}               Redirect to long URL (302)

Create URL Request/Response

// POST /api/v1/urls
// Request
{
    "long_url": "https://example.com/very/long/path?with=params",
    "custom_alias": "my-link",    // optional
    "expires_in": 86400           // optional, seconds
}

// Response (201 Created)
{
    "short_code": "my-link",
    "short_url": "https://sho.rt/my-link",
    "long_url": "https://example.com/very/long/path?with=params",
    "expires_at": "2026-03-16T00:00:00Z",
    "created_at": "2026-03-15T00:00:00Z"
}

Summary

Component        Technology                 Why
---------------  -------------------------  ---------------------------------
Key Generation   Pre-generated base62 keys  No collisions, O(1) assignment
Primary Storage  PostgreSQL (sharded)       ACID, mature, cost-effective
Cache            Redis Cluster              Sub-millisecond lookups
Analytics Queue  Kafka                      Decouple analytics from redirects
Analytics Store  ClickHouse                 Columnar, fast aggregations
Load Balancer    Nginx / ALB                SSL termination, routing

The URL shortener is a deceptively deep system design problem. It tests your ability to think about encoding, storage, caching, async processing, and scaling — all in a system that appears trivially simple. The key insight is that this is a read-dominated system, and every design decision should optimize for redirect latency first.