Designing a URL Shortener: System Design from Scratch
URL shorteners look simple on the surface. You take a long URL, generate a short alias, and redirect anyone who visits that alias back to the original. But behind that simplicity hides a system that must handle billions of reads, guarantee uniqueness across distributed nodes, and resolve redirects in single-digit milliseconds. This is why "design a URL shortener" is one of the most popular system design interview questions — it touches on hashing, databases, caching, analytics, and horizontal scaling.
Let us build one from scratch.
Step 1: Requirements Gathering
Before writing any code, we need to pin down what we are actually building. This is the most important part of any system design exercise.
Functional Requirements
- Shorten: Given a long URL, generate a unique short URL
- Redirect: Given a short URL, redirect the user to the original long URL
- Custom aliases: Optionally allow users to pick their own short code
- Expiration: URLs can have an optional TTL (time to live)
- Analytics: Track click counts, referrers, geolocation, and timestamps
Non-Functional Requirements
- High availability: The redirect service must have 99.99% uptime
- Low latency: Redirects should complete in under 10ms at the p99
- Read-heavy: The read-to-write ratio will be roughly 100:1
- Scale: Support 100M new URLs per month and 10B redirects per month
Back-of-the-Envelope Estimates
Let us do the math:
- Writes: 100M URLs/month = ~40 URLs/second
- Reads: 10B redirects/month = ~3,800 redirects/second
- Storage per record: ~500 bytes (short code + long URL + metadata)
- Storage for 5 years: 100M x 12 x 5 x 500 bytes = ~3 TB
- Cache: If we cache the top 20% of URLs, that is ~600 GB
These numbers tell us something important: this is a read-dominated system. Our architecture should optimize aggressively for reads.
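These estimates are easy to sanity-check with a few lines of arithmetic (the inputs are just the assumptions listed above):

```python
# Sanity-check the back-of-the-envelope numbers above.
SECONDS_PER_MONTH = 30 * 24 * 3600  # ~2.6M

urls_per_month = 100_000_000
redirects_per_month = 10_000_000_000
bytes_per_record = 500
total_urls_5y = urls_per_month * 12 * 5  # 6B

writes_per_sec = urls_per_month / SECONDS_PER_MONTH       # ~39
reads_per_sec = redirects_per_month / SECONDS_PER_MONTH   # ~3,858
storage_tb = total_urls_5y * bytes_per_record / 1e12      # 3.0 TB
cache_gb = total_urls_5y * 0.2 * bytes_per_record / 1e9   # 600 GB

print(f"{writes_per_sec:.0f} writes/s, {reads_per_sec:.0f} reads/s, "
      f"{storage_tb:.1f} TB storage, {cache_gb:.0f} GB cache")
```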
Step 2: High-Level Architecture
Here is the overall flow:
Client
  |
  v
API Gateway (rate limiting, auth)
  |
  v
Application Server (stateless)
  |
  +---> Cache (Redis) ---> hit? return long URL
  |       |
  |       v (miss)
  +---> Database (primary storage)
  |
  v
Analytics Pipeline (async, Kafka)
The key components are:
- API Gateway — handles rate limiting, authentication, and SSL termination
- Application servers — stateless services that handle shortening and redirecting
- Cache layer — Redis or Memcached for hot URLs
- Database — stores the URL mappings
- Analytics pipeline — asynchronous event processing for click tracking
Step 3: URL Encoding Strategy
This is the core algorithmic question. How do we generate a short, unique string for each URL?
Approach 1: Base62 Encoding
We use the characters [a-zA-Z0-9] which gives us 62 possible characters per position. A 7-character code gives us:
62^7 = 3,521,614,606,208 (~3.5 trillion unique codes)
That is far more than we need. Here is how it works:
import string

ALPHABET = string.ascii_letters + string.digits  # 62 chars

def base62_encode(num: int) -> str:
    if num == 0:
        return ALPHABET[0]
    result = []
    while num > 0:
        result.append(ALPHABET[num % 62])
        num //= 62
    return ''.join(reversed(result))

def base62_decode(code: str) -> int:
    num = 0
    for char in code:
        num = num * 62 + ALPHABET.index(char)
    return num

# Example
print(base62_encode(123456789))  # "iwaUH"
print(base62_decode("iwaUH"))    # 123456789
Approach 2: MD5/SHA-256 Hash + Truncation
We hash the long URL and take the first 7 characters:
import hashlib

def hash_url(long_url: str) -> str:
    hash_hex = hashlib.md5(long_url.encode()).hexdigest()
    # First 11 hex chars = 44 bits, enough for 7 base62 chars (~41.7 bits)
    num = int(hash_hex[:11], 16)
    return base62_encode(num)[:7]
Problem: This creates hash collisions. Two different URLs can produce the same short code. We need collision resolution.
Approach 3: Pre-Generated Key Service (Recommended)
The cleanest solution is to pre-generate unique keys in a separate service:
import random

class KeyGenerationService:
    """
    Pre-generates unique 7-character keys and stores them
    in a database. Application servers request keys in batches.
    """
    def __init__(self, db):
        self.db = db

    def generate_keys(self, batch_size=10000):
        """Generate a batch of unique keys and store them."""
        keys = set()
        while len(keys) < batch_size:
            key = base62_encode(random.randint(0, 62**7 - 1))
            key = key.ljust(7, '0')[:7]  # Pad short codes to exactly 7 chars
            keys.add(key)
        self.db.insert_unused_keys(list(keys))

    def get_key(self) -> str:
        """Atomically fetch and mark a key as used."""
        return self.db.fetch_and_mark_used()
This approach eliminates collisions entirely because every key is unique before it is ever assigned to a URL. Application servers request keys in batches (say 1000 at a time) and cache them locally, so the key service is not a bottleneck.
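On the application-server side, that batching might look like the sketch below (the `get_keys` batch method on the key service is an assumed interface, not part of the service shown above):

```python
class KeyBuffer:
    """Client-side buffer of pre-generated keys.

    Refills from the key service in batches, so most shorten
    requests never make a network round trip for a key.
    NOTE: get_keys() is an assumed batch API on the key service.
    """
    def __init__(self, key_service, batch_size=1000):
        self.key_service = key_service
        self.batch_size = batch_size
        self._keys = []

    def next_key(self) -> str:
        if not self._keys:
            # One round trip refills the whole buffer.
            self._keys = list(self.key_service.get_keys(self.batch_size))
        return self._keys.pop()
```

Keys sitting in a buffer are lost if the server crashes, which is fine: unused keys are cheap, and abandoning a few thousand out of 3.5 trillion costs nothing.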
Step 4: Handling Hash Collisions
If you go with the hashing approach, you must handle collisions. Here is a robust strategy:
def create_short_url(long_url: str, db) -> str:
    short_code = hash_url(long_url)
    # Check for collision
    existing = db.get(short_code)
    if existing is None:
        # No collision — insert and return
        db.insert(short_code, long_url)
        return short_code
    elif existing.long_url == long_url:
        # Same URL already shortened — return existing code
        return short_code
    else:
        # Collision with a different URL:
        # append a counter and re-hash
        for i in range(10):
            new_code = hash_url(long_url + str(i))
            if db.get(new_code) is None:
                db.insert(new_code, long_url)
                return new_code
        raise Exception("Unable to resolve collision")
With 7-character base62 codes (3.5 trillion possibilities), any single insert is unlikely to collide: at 6B stored URLs the per-insert probability is roughly 6B / 3.5T, or about 0.17%. At 40 writes per second, though, that still works out to a few collisions per minute, so you must handle them to guarantee correctness.
Step 5: Database Schema
URL Mappings Table
CREATE TABLE url_mappings (
    short_code  VARCHAR(7) PRIMARY KEY,
    long_url    TEXT NOT NULL,
    user_id     BIGINT DEFAULT NULL,
    created_at  TIMESTAMP NOT NULL DEFAULT NOW(),
    expires_at  TIMESTAMP DEFAULT NULL,
    click_count BIGINT DEFAULT 0
);

-- Index for reverse lookups (check if URL already shortened).
-- For very long URLs, index md5(long_url) instead to stay
-- within btree index size limits.
CREATE INDEX idx_long_url ON url_mappings (long_url);

-- Index for expiration cleanup
CREATE INDEX idx_expires ON url_mappings (expires_at)
    WHERE expires_at IS NOT NULL;
Analytics Events Table
CREATE TABLE click_events (
    id          BIGSERIAL PRIMARY KEY,
    short_code  VARCHAR(7) NOT NULL,
    clicked_at  TIMESTAMP NOT NULL DEFAULT NOW(),
    referrer    TEXT,
    user_agent  TEXT,
    ip_address  INET,
    country     VARCHAR(2),
    city        VARCHAR(100)
);

-- Partition by month for efficient querying
-- In production, use TimescaleDB or ClickHouse for this
CREATE INDEX idx_click_code_time
    ON click_events (short_code, clicked_at);
SQL vs. NoSQL Decision
For URL mappings, both work. Here is the tradeoff:
| Factor | SQL (PostgreSQL) | NoSQL (DynamoDB) |
|---|---|---|
| Lookups by short_code | Fast (primary key) | Fast (partition key) |
| Reverse lookups | Index on long_url | Requires GSI |
| Transactions | Full ACID | Limited |
| Scaling | Harder (sharding) | Automatic |
| Cost at scale | Lower | Higher |
For most teams, PostgreSQL with read replicas is the right starting point. Switch to DynamoDB if you need automatic scaling beyond what read replicas provide.
Step 6: Read-Heavy Optimization
Since our read-to-write ratio is 100:1, we must make redirects blazing fast.
Caching Strategy
We use Redis as a cache-aside (lazy-loading) layer: redirects check the cache first and populate it on a miss, while new URLs are written straight to the database (write-around):
from datetime import datetime

class URLRedirectService:
    def __init__(self, redis_client, db):
        self.redis = redis_client
        self.db = db
        self.CACHE_TTL = 86400  # 24 hours

    async def resolve(self, short_code: str) -> str | None:
        # Step 1: Check cache
        cached = await self.redis.get(f"url:{short_code}")
        if cached:
            cached = cached.decode()
            # A cached sentinel means "known missing"
            return None if cached == "__NULL__" else cached
        # Step 2: Cache miss — query database
        record = await self.db.get_url(short_code)
        if record is None:
            # Cache the miss too (prevent cache penetration)
            await self.redis.set(
                f"url:{short_code}", "__NULL__",
                ex=300  # Short TTL for misses
            )
            return None
        # Step 3: Check expiration
        if record.expires_at and record.expires_at < datetime.utcnow():
            return None
        # Step 4: Populate cache and return
        await self.redis.set(
            f"url:{short_code}", record.long_url,
            ex=self.CACHE_TTL
        )
        return record.long_url
Cache Sizing
- Hot URLs follow a Zipf distribution — a small percentage of URLs get most of the traffic
- Caching the top 20% of URLs should give us a 90%+ cache hit rate
- With ~6B total URLs (5 years of writes) at 500 bytes each: 6B x 0.2 x 500 bytes = ~600 GB
- This fits in a Redis cluster with 10-15 nodes (64 GB each)
Database Read Replicas
For cache misses, we route reads to read replicas and writes to the primary:
Writes --> Primary DB (1 instance)
Reads --> Read Replicas (3-5 instances, async replication)
The slight replication lag (typically under 100ms) is acceptable because a newly created short URL is extremely unlikely to receive a redirect request within that window.
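A minimal sketch of that split, assuming generic connection objects that expose an `execute` method (the class and method names here are illustrative):

```python
import random

class RoutedDB:
    """Send writes to the primary, reads to a random replica.

    `primary` and `replicas` are placeholder connection objects;
    round-robin or least-connections would work just as well.
    """
    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas

    def execute_write(self, sql, *args):
        return self.primary.execute(sql, *args)

    def execute_read(self, sql, *args):
        # Sub-100ms replication lag is acceptable for redirects.
        return random.choice(self.replicas).execute(sql, *args)
```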
Step 7: Analytics Tracking
We do not want analytics to slow down redirects. The solution is asynchronous event processing:
from datetime import datetime
from kafka import KafkaProducer
import json

class AnalyticsTracker:
    def __init__(self):
        self.producer = KafkaProducer(
            bootstrap_servers=['kafka:9092'],
            value_serializer=lambda v: json.dumps(v).encode()
        )

    def track_click(self, short_code: str, request):
        event = {
            "short_code": short_code,
            "timestamp": datetime.utcnow().isoformat(),
            "referrer": request.headers.get("Referer"),
            "user_agent": request.headers.get("User-Agent"),
            "ip": request.client.host,
        }
        # Fire and forget — do not await
        self.producer.send("click-events", value=event)
The analytics pipeline looks like this:
Redirect Service --> Kafka Topic --> Consumer Group --> ClickHouse/TimescaleDB
                                          |
                                          +--> Real-time dashboard (WebSocket)
                                          +--> Batch aggregation (hourly/daily)
Counter Updates
For the click_count in the URL mappings table, do not update it on every click. Instead:
- Accumulate counts in Redis: INCR click:{short_code}
- Flush to the database every 5 minutes via a batch job
- This reduces database write load by 99%
Step 8: The Redirect Flow
The actual redirect is a 301 or 302 HTTP response. The choice matters:
- 301 (Moved Permanently): The browser caches the redirect. Faster for repeat visitors, but you lose analytics visibility because subsequent visits never hit your server.
- 302 (Found, i.e. a temporary redirect): The browser does not cache it. Every visit hits your server, so you can track every click.
Most URL shorteners use 302 because analytics tracking is a key feature:
from fastapi import FastAPI, Request, Response
from fastapi.responses import RedirectResponse

app = FastAPI()

@app.get("/{short_code}")
async def redirect(short_code: str, request: Request):
    long_url = await redirect_service.resolve(short_code)
    if long_url is None:
        return Response(status_code=404, content="URL not found")
    # Track the click asynchronously
    analytics.track_click(short_code, request)
    # 302 redirect to preserve analytics
    return RedirectResponse(url=long_url, status_code=302)
Step 9: Scaling Considerations
Horizontal Scaling
Every component in our architecture is horizontally scalable:
- Application servers: Stateless — add more behind the load balancer
- Redis cache: Shard by short_code hash across a cluster
- Database: Read replicas for reads, sharding by short_code for writes
- Kafka: Partition the click-events topic for parallel consumers
Database Sharding
When a single database can no longer handle the write volume, shard by the first character of the short_code:
Shard 0: a-f (covers short codes starting with a through f)
Shard 1: g-m
Shard 2: n-s
Shard 3: t-z
Shard 4: A-M
Shard 5: N-Z
Shard 6: 0-9
This gives you seven shards, though note the ranges are not equal in size (6 to 13 characters each), so load will be proportionally uneven; sharding on a hash of the full short_code spreads writes more evenly if that matters. Each shard is an independent PostgreSQL instance with its own read replicas.
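A lookup can then be routed with a small mapping over those character ranges (a sketch of the scheme above):

```python
# Shard ranges from the layout above, keyed by first character.
SHARD_RANGES = [
    ("a", "f"), ("g", "m"), ("n", "s"), ("t", "z"),
    ("A", "M"), ("N", "Z"), ("0", "9"),
]

def shard_for(short_code: str) -> int:
    """Map a short code to its shard by first character."""
    first = short_code[0]
    for shard_id, (lo, hi) in enumerate(SHARD_RANGES):
        if lo <= first <= hi:
            return shard_id
    raise ValueError(f"invalid short code: {short_code!r}")
```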
Rate Limiting
Protect the URL creation endpoint from abuse:
- Per-user: 100 URLs per hour for free tier, 10,000 for paid
- Per-IP: 50 URLs per hour for anonymous users
- Global: Circuit breaker at 10,000 URLs/second to protect the database
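The per-user and per-IP limits can be enforced with a fixed-window counter against a Redis-style client (a sketch; the `rl:` key scheme and identity strings are illustrative):

```python
import time

class FixedWindowRateLimiter:
    """Fixed-window limiter using Redis-style INCR + EXPIRE.

    Simple and good enough for URL-creation limits; a sliding
    window or token bucket smooths bursts at window boundaries.
    """
    def __init__(self, redis_client, limit: int, window_secs: int = 3600):
        self.redis = redis_client
        self.limit = limit
        self.window = window_secs

    def allow(self, identity: str) -> bool:
        window_id = int(time.time()) // self.window
        key = f"rl:{identity}:{window_id}"
        count = self.redis.incr(key)
        if count == 1:
            # First hit in this window: expire the key with the window.
            self.redis.expire(key, self.window)
        return count <= self.limit
```

One instance with `limit=100` keyed by user ID covers the free tier above; a second instance keyed by IP covers anonymous users.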
URL Cleanup
Expired URLs consume storage. Run a background job to clean them up:
async def cleanup_expired_urls():
    """Run every hour via cron."""
    # PostgreSQL's DELETE has no LIMIT clause, so bound the batch
    # with a subquery; RETURNING hands back the deleted codes.
    deleted = await db.fetch("""
        DELETE FROM url_mappings
        WHERE short_code IN (
            SELECT short_code FROM url_mappings
            WHERE expires_at IS NOT NULL
              AND expires_at < NOW()
            LIMIT 10000
        )
        RETURNING short_code
    """)
    # Also invalidate cache entries
    for row in deleted:
        await redis.delete(f"url:{row['short_code']}")
Step 10: Security Considerations
A URL shortener can be abused for phishing. Mitigate this with:
- URL validation: Reject malformed URLs and known malicious domains
- Google Safe Browsing API: Check every submitted URL against Google's malware/phishing database
- Preview pages: Show users where a short URL leads before redirecting (like bit.ly's + suffix)
- Abuse reporting: Allow users to flag malicious short URLs
- Rate limiting: Prevent automated bulk URL creation
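A first validation pass might look like this sketch (the blocklist entries are placeholders; a real service would also consult Safe Browsing):

```python
from urllib.parse import urlparse

# Illustrative blocklist; in production this comes from a threat feed.
BLOCKED_DOMAINS = {"evil.example", "phish.example"}

def validate_url(long_url: str, max_length: int = 2048) -> bool:
    """Reject malformed, oversized, or blocklisted URLs."""
    if len(long_url) > max_length:
        return False
    parsed = urlparse(long_url)
    if parsed.scheme not in ("http", "https"):
        return False  # rejects javascript:, data:, file:, etc.
    if not parsed.netloc:
        return False  # no host at all
    if (parsed.hostname or "") in BLOCKED_DOMAINS:
        return False
    return True
```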
Complete API Design
POST /api/v1/urls Create a short URL
GET /api/v1/urls/{code} Get URL metadata (not redirect)
DELETE /api/v1/urls/{code} Delete a short URL
GET /api/v1/urls/{code}/stats Get analytics for a URL
GET /{code} Redirect to long URL (302)
Create URL Request/Response
// POST /api/v1/urls
// Request
{
  "long_url": "https://example.com/very/long/path?with=params",
  "custom_alias": "my-link",  // optional
  "expires_in": 86400         // optional, seconds
}

// Response (201 Created)
{
  "short_code": "my-link",
  "short_url": "https://sho.rt/my-link",
  "long_url": "https://example.com/very/long/path?with=params",
  "expires_at": "2026-03-16T00:00:00Z",
  "created_at": "2026-03-15T00:00:00Z"
}
Summary
| Component | Technology | Why |
|---|---|---|
| Key Generation | Pre-generated base62 keys | No collisions, O(1) assignment |
| Primary Storage | PostgreSQL (sharded) | ACID, mature, cost-effective |
| Cache | Redis Cluster | Sub-millisecond lookups |
| Analytics Queue | Kafka | Decouple analytics from redirects |
| Analytics Store | ClickHouse | Columnar, fast aggregations |
| Load Balancer | Nginx / ALB | SSL termination, routing |
The URL shortener is a deceptively deep system design problem. It tests your ability to think about encoding, storage, caching, async processing, and scaling — all in a system that appears trivially simple. The key insight is that this is a read-dominated system, and every design decision should optimize for redirect latency first.