Designing a URL Shortener: System Design from Scratch
URL shorteners look simple on the surface. You take a long URL, generate a short alias, and redirect anyone who visits that alias back to the original. But behind that simplicity hides a system that must handle billions of reads, guarantee uniqueness across distributed nodes, and resolve redirects in single-digit milliseconds. This is why "design a URL shortener" is one of the most popular system design interview questions — it touches on hashing, databases, caching, analytics, and horizontal scaling.
Let us build one from scratch.
Step 1: Requirements Gathering
Before writing any code, we need to pin down what we are actually building. This is the most important part of any system design exercise.
Functional Requirements
- Shorten: Given a long URL, generate a unique short URL
- Redirect: Given a short URL, redirect the user to the original long URL
- Custom aliases: Optionally allow users to pick their own short code
- Expiration: URLs can have an optional TTL (time to live)
- Analytics: Track click counts, referrers, geolocation, and timestamps
Non-Functional Requirements
- High availability: The redirect service must have 99.99% uptime
- Low latency: Redirects should complete in under 10ms at the p99
- Read-heavy: The read-to-write ratio will be roughly 100:1
- Scale: Support 100M new URLs per month and 10B redirects per month
Back-of-the-Envelope Estimates
Let us do the math:
- Writes: 100M URLs/month = ~40 URLs/second
- Reads: 10B redirects/month = ~3,800 redirects/second
- Storage per record: ~500 bytes (short code + long URL + metadata)
- Storage for 5 years: 100M x 12 x 5 x 500 bytes = ~3 TB
- Cache: If we cache the top 20% of URLs, that is ~600 GB
These numbers tell us something important: this is a read-dominated system. Our architecture should optimize aggressively for reads.
Step 2: High-Level Architecture
Here is the overall flow:
Client
|
v
API Gateway (rate limiting, auth)
|
v
Application Server (stateless)
|
+---> Cache (Redis) ---> hit? return long URL
| |
| v (miss)
+---> Database (primary storage)
|
v
Analytics Pipeline (async, Kafka)
The key components are:
- API Gateway — handles rate limiting, authentication, and SSL termination
- Application servers — stateless service that handles shortening and redirecting
- Cache layer — Redis or Memcached for hot URLs
- Database — stores the URL mappings
- Analytics pipeline — asynchronous event processing for click tracking
Step 3: URL Encoding Strategy
This is the core algorithmic question. How do we generate a short, unique string for each URL?
Approach 1: Base62 Encoding
We use the characters [a-zA-Z0-9] which gives us 62 possible characters per position. A 7-character code gives us:
62^7 = 3,521,614,606,208 (~3.5 trillion unique codes)
That is far more than we need. Here is how it works:
import string
ALPHABET = string.ascii_letters + string.digits # 62 chars
def base62_encode(num: int) -> str:
if num == 0:
return ALPHABET[0]
result = []
while num > 0:
result.append(ALPHABET[num % 62])
num //= 62
return ''.join(reversed(result))
def base62_decode(code: str) -> int:
num = 0
for char in code:
num = num * 62 + ALPHABET.index(char)
return num
# Example
print(base62_encode(123456789)) # "8M0kX"
print(base62_decode("8M0kX")) # 123456789
Approach 2: MD5/SHA-256 Hash + Truncation
We hash the long URL and take the first 7 characters:
import hashlib
def hash_url(long_url: str) -> str:
hash_hex = hashlib.md5(long_url.encode()).hexdigest()
# Take first 43 bits worth (7 base62 chars)
num = int(hash_hex[:11], 16)
return base62_encode(num)[:7]
Problem: This creates hash collisions. Two different URLs can produce the same short code. We need collision resolution.
Approach 3: Pre-Generated Key Service (Recommended)
The cleanest solution is to pre-generate unique keys in a separate service:
class KeyGenerationService:
"""
Pre-generates unique 7-character keys and stores them
in a database. Application servers request keys in batches.
"""
def __init__(self, db):
self.db = db
def generate_keys(self, batch_size=10000):
"""Generate a batch of unique keys and store them."""
keys = set()
while len(keys) < batch_size:
key = base62_encode(random.randint(0, 62**7 - 1))
key = key.ljust(7, '0')[:7] # Ensure 7 chars
keys.add(key)
self.db.insert_unused_keys(list(keys))
def get_key(self) -> str:
"""Atomically fetch and mark a key as used."""
return self.db.fetch_and_mark_used()
This approach eliminates collisions entirely because every key is unique before it is ever assigned to a URL. Application servers request keys in batches (say 1000 at a time) and cache them locally, so the key service is not a bottleneck.
Step 4: Handling Hash Collisions
If you go with the hashing approach, you must handle collisions. Here is a robust strategy:
def create_short_url(long_url: str, db) -> str:
short_code = hash_url(long_url)
# Check for collision
existing = db.get(short_code)
if existing is None:
# No collision — insert and return
db.insert(short_code, long_url)
return short_code
elif existing.long_url == long_url:
# Same URL already shortened — return existing code
return short_code
else:
# Collision with a different URL
# Append a counter and re-hash
for i in range(10):
new_code = hash_url(long_url + str(i))
if db.get(new_code) is None:
db.insert(new_code, long_url)
return new_code
raise Exception("Unable to resolve collision")
In practice, collisions are rare with 7-character base62 codes (3.5 trillion possibilities), but you must handle them to guarantee correctness.
Step 5: Database Schema
URL Mappings Table
CREATE TABLE url_mappings (
short_code VARCHAR(7) PRIMARY KEY,
long_url TEXT NOT NULL,
user_id BIGINT DEFAULT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
expires_at TIMESTAMP DEFAULT NULL,
click_count BIGINT DEFAULT 0
);
-- Index for reverse lookups (check if URL already shortened)
CREATE INDEX idx_long_url ON url_mappings (long_url);
-- Index for expiration cleanup
CREATE INDEX idx_expires ON url_mappings (expires_at)
WHERE expires_at IS NOT NULL;
Analytics Events Table
CREATE TABLE click_events (
id BIGSERIAL PRIMARY KEY,
short_code VARCHAR(7) NOT NULL,
clicked_at TIMESTAMP NOT NULL DEFAULT NOW(),
referrer TEXT,
user_agent TEXT,
ip_address INET,
country VARCHAR(2),
city VARCHAR(100)
);
-- Partition by month for efficient querying
-- In production, use TimescaleDB or ClickHouse for this
CREATE INDEX idx_click_code_time
ON click_events (short_code, clicked_at);
SQL vs. NoSQL Decision
For URL mappings, both work. Here is the tradeoff:
| Factor | SQL (PostgreSQL) | NoSQL (DynamoDB) |
|---|---|---|
| Lookups by short_code | Fast (primary key) | Fast (partition key) |
| Reverse lookups | Index on long_url | Requires GSI |
| Transactions | Full ACID | Limited |
| Scaling | Harder (sharding) | Automatic |
| Cost at scale | Lower | Higher |
For most teams, PostgreSQL with read replicas is the right starting point. Switch to DynamoDB if you need automatic scaling beyond what read replicas provide.
Step 6: Read-Heavy Optimization
Since our read-to-write ratio is 100:1, we must make redirects blazing fast.
Caching Strategy
We use a write-around cache with Redis:
class URLRedirectService:
def __init__(self, redis_client, db):
self.redis = redis_client
self.db = db
self.CACHE_TTL = 86400 # 24 hours
async def resolve(self, short_code: str) -> str | None:
# Step 1: Check cache
long_url = await self.redis.get(f"url:{short_code}")
if long_url:
return long_url.decode()
# Step 2: Cache miss — query database
record = await self.db.get_url(short_code)
if record is None:
# Cache the miss too (prevent cache penetration)
await self.redis.set(
f"url:{short_code}", "__NULL__",
ex=300 # Short TTL for misses
)
return None
# Step 3: Check expiration
if record.expires_at and record.expires_at < datetime.utcnow():
return None
# Step 4: Populate cache and return
await self.redis.set(
f"url:{short_code}", record.long_url,
ex=self.CACHE_TTL
)
return record.long_url
Cache Sizing
- Hot URLs follow a Zipf distribution — a small percentage of URLs get most of the traffic
- Caching the top 20% of URLs should give us a 90%+ cache hit rate
- With 6B total URLs and 500 bytes each:
6B * 0.2 * 500 = 600 GB - This fits in a Redis cluster with 10-15 nodes (64 GB each)
Database Read Replicas
For cache misses, we route reads to read replicas and writes to the primary:
Writes --> Primary DB (1 instance)
Reads --> Read Replicas (3-5 instances, async replication)
The slight replication lag (typically under 100ms) is acceptable because a newly created short URL is extremely unlikely to receive a redirect request within that window.
Step 7: Analytics Tracking
We do not want analytics to slow down redirects. The solution is asynchronous event processing:
from kafka import KafkaProducer
import json
class AnalyticsTracker:
def __init__(self):
self.producer = KafkaProducer(
bootstrap_servers=['kafka:9092'],
value_serializer=lambda v: json.dumps(v).encode()
)
def track_click(self, short_code: str, request):
event = {
"short_code": short_code,
"timestamp": datetime.utcnow().isoformat(),
"referrer": request.headers.get("Referer"),
"user_agent": request.headers.get("User-Agent"),
"ip": request.client.host,
}
# Fire and forget — do not await
self.producer.send("click-events", value=event)
The analytics pipeline looks like this:
Redirect Service --> Kafka Topic --> Consumer Group --> ClickHouse/TimescaleDB
|
+--> Real-time dashboard (WebSocket)
+--> Batch aggregation (hourly/daily)
Counter Updates
For the click_count in the URL mappings table, do not update it on every click. Instead:
- Accumulate counts in Redis:
INCR click:{short_code} - Flush to the database every 5 minutes via a batch job
- This reduces database write load by 99%
Step 8: The Redirect Flow
The actual redirect is a 301 or 302 HTTP response. The choice matters:
- 301 (Permanent Redirect): Browser caches the redirect. Faster for users, but you lose analytics visibility because subsequent visits never hit your server.
- 302 (Temporary Redirect): Browser does not cache. Every visit hits your server, so you can track every click.
Most URL shorteners use 302 because analytics tracking is a key feature:
from fastapi import FastAPI, Response
from fastapi.responses import RedirectResponse
app = FastAPI()
@app.get("/{short_code}")
async def redirect(short_code: str, request: Request):
long_url = await redirect_service.resolve(short_code)
if long_url is None:
return Response(status_code=404, content="URL not found")
# Track the click asynchronously
analytics.track_click(short_code, request)
# 302 redirect to preserve analytics
return RedirectResponse(url=long_url, status_code=302)
Step 9: Scaling Considerations
Horizontal Scaling
Every component in our architecture is horizontally scalable:
- Application servers: Stateless — add more behind the load balancer
- Redis cache: Shard by short_code hash across a cluster
- Database: Read replicas for reads, sharding by short_code for writes
- Kafka: Partition the click-events topic for parallel consumers
Database Sharding
When a single database can no longer handle the write volume, shard by the first character of the short_code:
Shard 0: a-f (covers short codes starting with a through f)
Shard 1: g-m
Shard 2: n-s
Shard 3: t-z
Shard 4: A-M
Shard 5: N-Z
Shard 6: 0-9
This gives you 7 shards with roughly equal distribution. Each shard is an independent PostgreSQL instance with its own read replicas.
Rate Limiting
Protect the URL creation endpoint from abuse:
- Per-user: 100 URLs per hour for free tier, 10,000 for paid
- Per-IP: 50 URLs per hour for anonymous users
- Global: Circuit breaker at 10,000 URLs/second to protect the database
URL Cleanup
Expired URLs consume storage. Run a background job to clean them up:
async def cleanup_expired_urls():
"""Run every hour via cron."""
deleted = await db.execute("""
DELETE FROM url_mappings
WHERE expires_at IS NOT NULL
AND expires_at < NOW()
LIMIT 10000
""")
# Also invalidate cache entries
for code in deleted:
await redis.delete(f"url:{code}")
Step 10: Security Considerations
A URL shortener can be abused for phishing. Mitigate this with:
- URL validation: Reject malformed URLs and known malicious domains
- Google Safe Browsing API: Check every submitted URL against Google's malware/phishing database
- Preview pages: Show users where a short URL leads before redirecting (like bit.ly's + suffix)
- Abuse reporting: Allow users to flag malicious short URLs
- Rate limiting: Prevent automated bulk URL creation
Complete API Design
POST /api/v1/urls Create a short URL
GET /api/v1/urls/{code} Get URL metadata (not redirect)
DELETE /api/v1/urls/{code} Delete a short URL
GET /api/v1/urls/{code}/stats Get analytics for a URL
GET /{code} Redirect to long URL (302)
Create URL Request/Response
// POST /api/v1/urls
// Request
{
"long_url": "https://example.com/very/long/path?with=params",
"custom_alias": "my-link", // optional
"expires_in": 86400 // optional, seconds
}
// Response (201 Created)
{
"short_code": "my-link",
"short_url": "https://sho.rt/my-link",
"long_url": "https://example.com/very/long/path?with=params",
"expires_at": "2026-03-16T00:00:00Z",
"created_at": "2026-03-15T00:00:00Z"
}
Summary
| Component | Technology | Why |
|---|---|---|
| Key Generation | Pre-generated base62 keys | No collisions, O(1) assignment |
| Primary Storage | PostgreSQL (sharded) | ACID, mature, cost-effective |
| Cache | Redis Cluster | Sub-millisecond lookups |
| Analytics Queue | Kafka | Decouple analytics from redirects |
| Analytics Store | ClickHouse | Columnar, fast aggregations |
| Load Balancer | Nginx / ALB | SSL termination, routing |
The URL shortener is a deceptively deep system design problem. It tests your ability to think about encoding, storage, caching, async processing, and scaling — all in a system that appears trivially simple. The key insight is that this is a read-dominated system, and every design decision should optimize for redirect latency first.
Keep reading
Event-Driven System Design: The Decisions That Bite You Later
A practical guide to the design decisions that determine whether an event-driven system stays maintainable or quietly rots — delivery guarantees, ordering, idempotency, schema evolution, and the outbox.
Designing a Scalable Search Engine: From Inverted Index to Hybrid Vector Search
Walk through the architecture of a production search system — inverted indexes, BM25 ranking, vector embeddings, and hybrid retrieval for 50M documents at 10K QPS.
Designing a Scalable Notification System
A system design deep dive into building a notification platform that handles push, email, SMS, and in-app notifications at scale — covering architecture, priority queues, fan-out strategies, rate limiting, and delivery tracking.
Newsletter
New posts, straight to your inbox
One email per post. No spam, no tracking pixels, unsubscribe anytime.
Comments
No comments yet. Be the first.