API Design Best Practices: REST, GraphQL, and gRPC Compared
APIs are the contracts that hold distributed systems together. Choose the wrong paradigm or ignore fundamental design principles and you will spend years fighting your own architecture. This post breaks down REST, GraphQL, and gRPC in practical terms — how each works, where each shines, and the design patterns that matter regardless of which one you pick.
REST: The Workhorse of the Web
REST (Representational State Transfer) is not a protocol. It is an architectural style built on top of HTTP. When done properly, REST APIs are predictable, cacheable, and easy to reason about.
Core Principles
Resources, not actions. URLs should represent nouns, not verbs:
# Good — resource-oriented
GET /api/users/42
POST /api/users
PUT /api/users/42
DELETE /api/users/42
# Bad — action-oriented (this is RPC over HTTP, not REST)
POST /api/getUser
POST /api/createUser
POST /api/deleteUser
HTTP verbs convey intent. Each method has defined semantics:
| Method | Purpose | Idempotent | Safe |
|---|---|---|---|
| GET | Read a resource | Yes | Yes |
| POST | Create a resource | No | No |
| PUT | Replace a resource | Yes | No |
| PATCH | Partial update | No* | No |
| DELETE | Remove a resource | Yes | No |
*PATCH can be made idempotent with proper design, but is not required to be.
Status codes tell the client what happened:
200 OK — Request succeeded
201 Created — Resource created (include Location header)
204 No Content — Success with no response body (common for DELETE)
400 Bad Request — Client sent invalid data
401 Unauthorized — Authentication required
403 Forbidden — Authenticated but not authorized
404 Not Found — Resource does not exist
409 Conflict — Resource state conflict (e.g., duplicate email)
422 Unprocessable Entity — Validation errors (well-formed request, but semantically invalid)
429 Too Many Requests — Rate limit exceeded
500 Internal Server Error — Server bug
REST Pagination: Cursor vs Offset
Pagination is where many REST APIs get it wrong. There are two main approaches.
Offset-based pagination is simple but breaks under mutation:
GET /api/posts?page=3&per_page=20
{
"data": [...],
"meta": {
"page": 3,
"per_page": 20,
"total": 1847,
"total_pages": 93
}
}
The problem: if someone inserts a post while you are paginating, you will either skip an item or see a duplicate. This is acceptable for admin dashboards but terrible for feeds.
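The skip-or-duplicate failure is easy to reproduce. A minimal Python simulation, with a plain list standing in for the database and a made-up `page` helper standing in for the query:

```python
# Simulate offset pagination over a list that mutates between page fetches.
posts = list(range(1, 101))  # post ids 1..100

def page(items, page_num, per_page=20):
    start = (page_num - 1) * per_page
    return items[start:start + per_page]

page1 = page(posts, 1)   # ids 1..20
posts.insert(0, 0)       # a new post lands at the front mid-pagination
page2 = page(posts, 2)   # the offset has shifted: ids 20..39

assert page1[-1] in page2  # post 20 shows up on both pages — a duplicate
```

Deleting a row instead of inserting one produces the mirror-image failure: an item is skipped rather than duplicated.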
Cursor-based pagination is stable under concurrent writes:
GET /api/posts?limit=20&cursor=eyJpZCI6MTAwfQ==
{
"data": [...],
"meta": {
"next_cursor": "eyJpZCI6MTIwfQ==",
"has_more": true
}
}
The cursor is typically a base64-encoded reference to the last item's sort key. The server uses it to query WHERE id > :cursor ORDER BY id ASC LIMIT 20. No skips, no duplicates, and the database can use an index scan.
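To make the cursor concrete, here is a sketch of encoding and decoding one in Python. The `{"id": ...}` JSON shape follows the example above; the helper names are made up:

```python
import base64
import json

def encode_cursor(last_id):
    # Wrap the last item's sort key in JSON, then base64url-encode it
    return base64.urlsafe_b64encode(json.dumps({"id": last_id}).encode()).decode()

def decode_cursor(cursor):
    # Reverse the encoding to recover the sort key for the WHERE clause
    return json.loads(base64.urlsafe_b64decode(cursor))["id"]
```

Keeping the cursor opaque also leaves you free to change the underlying sort key later without breaking clients.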
Rule of thumb: use cursor pagination for anything user-facing or high-throughput. Use offset pagination only when you need random page access (admin tables, search results with page numbers).
Versioning Strategies
APIs evolve. You need a strategy for breaking changes.
URL versioning is the most explicit:
GET /api/v1/users/42
GET /api/v2/users/42
Pros: obvious, easy to route, easy to deprecate. Cons: URL pollution, clients must update URLs.
Header versioning keeps URLs clean:
GET /api/users/42
Accept: application/vnd.myapi.v2+json
Pros: cleaner URLs. Cons: harder to test in a browser, less discoverable.
Query parameter versioning is a middle ground:
GET /api/users/42?version=2
My recommendation: use URL versioning for public APIs (it is the most obvious) and header versioning for internal APIs (where you control both client and server).
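With header versioning, the server has to pull the version out of the Accept header itself. A minimal sketch — the `vnd.myapi` vendor type follows the example above, and the regex and default are assumptions:

```python
import re

def api_version(accept_header, default=1):
    """Extract the version from a media type like application/vnd.myapi.v2+json."""
    match = re.search(r"vnd\.myapi\.v(\d+)\+json", accept_header or "")
    return int(match.group(1)) if match else default
```

Falling back to a default version (rather than rejecting the request) is a design choice: it keeps plain `Accept: application/json` clients working, at the cost of silently pinning them to the oldest behavior.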
Idempotency in REST APIs
Idempotency means that making the same request multiple times has the same effect as making it once. GET, PUT, and DELETE are idempotent by definition. POST is not — if a client retries a failed POST, you might create duplicate resources.
The solution is an idempotency key:
POST /api/payments
Idempotency-Key: 8a3b4c5d-6e7f-8a9b-0c1d-2e3f4a5b6c7d
Content-Type: application/json
{
"amount": 5000,
"currency": "USD",
"recipient": "user_123"
}
Server-side implementation:
import json

# Assumes `redis` is a connected client (e.g., redis.Redis()) and
# `payment_service` is the application's payment layer.
def create_payment(request):
    idempotency_key = request.headers.get("Idempotency-Key")
    # Check if we already processed this key
    existing = redis.get(f"idempotency:{idempotency_key}")
    if existing:
        return json.loads(existing)  # Return cached response
    # Process the payment
    # (Production code would reserve the key atomically, e.g. SET NX,
    # so two concurrent retries cannot both reach this point.)
    result = payment_service.charge(request.body)
    # Cache the result with a TTL (e.g., 24 hours)
    redis.setex(
        f"idempotency:{idempotency_key}",
        86400,
        json.dumps(result),
    )
    return result
Stripe popularized this pattern, and it is now considered a best practice for any API that handles money or state-changing operations.
GraphQL: Flexibility at a Cost
GraphQL lets clients request exactly the data they need in a single query. No over-fetching, no under-fetching. Facebook created it to solve the problem of mobile clients needing different data shapes than web clients.
Where GraphQL Shines
Flexible queries eliminate over-fetching:
# Client asks for exactly what it needs
query {
user(id: "42") {
name
email
posts(first: 5) {
title
createdAt
}
}
}
Compare this to REST, where you might need /api/users/42 and then /api/users/42/posts?limit=5 — two round trips, and the user endpoint returns 30 fields when you only need two.
Strongly typed schema serves as documentation:
type User {
id: ID!
name: String!
email: String!
posts(first: Int, after: String): PostConnection!
role: UserRole!
}
enum UserRole {
ADMIN
EDITOR
VIEWER
}
Single endpoint simplifies API surface. Everything goes through POST /graphql. No more debating URL structure.
Where GraphQL Hurts
The N+1 problem is GraphQL's biggest trap. Consider this query:
query {
posts(first: 20) {
title
author {
name
}
}
}
A naive resolver fetches the 20 posts in one query, then runs a separate query for each post's author — 21 queries instead of 2. The fix is a DataLoader — a batching and caching utility:
const authorLoader = new DataLoader(async (authorIds) => {
// Single query: SELECT * FROM users WHERE id IN (...)
const authors = await db.users.findMany({
where: { id: { in: authorIds } }
});
// Return in same order as input IDs
const authorMap = new Map(authors.map(a => [a.id, a]));
return authorIds.map(id => authorMap.get(id));
});
// Resolver
const resolvers = {
Post: {
author: (post) => authorLoader.load(post.authorId)
}
};
Caching is harder. REST responses can be cached by URL. GraphQL queries are POST requests with unique bodies — HTTP caching does not work out of the box. You need:
- Persisted queries (hash the query, send the hash)
- Normalized client-side caches (Apollo Client, urql)
- CDN-level caching with query whitelisting
Query complexity can be weaponized. A malicious client can craft deeply nested queries that overwhelm your server:
query {
user(id: "1") {
friends {
friends {
friends {
friends {
posts { comments { author { friends { ... } } } }
}
}
}
}
}
}
You must implement query depth limiting and query cost analysis to prevent this.
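A production server should enforce this by walking the parsed query AST (e.g., as a validation rule in graphql-js or graphql-core), but the idea behind depth limiting can be sketched with simple brace counting — a deliberate simplification:

```python
MAX_DEPTH = 6  # arbitrary limit for illustration

def query_depth(query):
    """Approximate query depth by brace nesting (real servers walk the AST)."""
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth

def reject_if_too_deep(query):
    if query_depth(query) > MAX_DEPTH:
        raise ValueError(f"query depth {query_depth(query)} exceeds {MAX_DEPTH}")
```

Cost analysis extends the same idea: instead of counting levels, assign each field a weight (list fields multiplied by their requested size) and reject queries whose total exceeds a budget.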
gRPC: Performance for Internal Services
gRPC uses HTTP/2 and Protocol Buffers for high-performance, strongly-typed communication between services. It is not a replacement for REST or GraphQL — it occupies a different niche.
Protocol Buffers
You define your API in .proto files, and the compiler generates client and server code in any supported language:
syntax = "proto3";
service UserService {
rpc GetUser(GetUserRequest) returns (User);
rpc ListUsers(ListUsersRequest) returns (stream User);
rpc CreateUser(CreateUserRequest) returns (User);
}
message GetUserRequest {
string user_id = 1;
}
message User {
string id = 1;
string name = 2;
string email = 3;
UserRole role = 4;
}
enum UserRole {
VIEWER = 0;
EDITOR = 1;
ADMIN = 2;
}
Why gRPC Wins for Internal Services
- Binary serialization — Protocol Buffers are 3-10x smaller and faster to parse than JSON
- HTTP/2 multiplexing — multiple concurrent requests over a single TCP connection, no head-of-line blocking at the HTTP layer
- Streaming — supports server streaming, client streaming, and bidirectional streaming
- Code generation — type-safe clients in any language, no manual HTTP client code
- Deadlines and cancellation — built-in support for request timeouts that propagate through the call chain
Why gRPC Is Wrong for Public APIs
- Not browser-friendly (requires gRPC-Web proxy)
- Binary protocol is not human-readable or debuggable with curl
- Steeper learning curve for API consumers
- No native support in most API gateways and tooling
When to Use Which
| Criteria | REST | GraphQL | gRPC |
|---|---|---|---|
| Public API | Best choice | Good choice | Avoid |
| Internal services | OK | Overhead | Best choice |
| Mobile clients | OK | Best choice | Via gRPC-Web |
| Real-time streaming | SSE/WebSocket | Subscriptions | Best choice |
| Browser support | Native | Via client lib | Needs proxy |
| Caching | Easy (HTTP) | Complex | Manual |
| File uploads | Easy | Awkward | Streaming |
| Learning curve | Low | Medium | High |
The pragmatic answer: most teams should use REST for public-facing APIs, GraphQL for complex client applications that consume data from multiple sources, and gRPC for performance-critical service-to-service communication.
Cross-Cutting Concerns
These patterns matter regardless of which API paradigm you choose.
Rate Limiting
Always communicate rate limits via standard headers:
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 947
X-RateLimit-Reset: 1619472000
Retry-After: 30
Common rate limiting algorithms:
- Token bucket — allows bursts, smooths out over time
- Sliding window — more precise than fixed windows, prevents boundary bursts
- Leaky bucket — constant outflow rate, queues excess requests
Implement rate limiting at the API gateway level, keyed by API key or IP address. Use Redis for distributed state.
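Of the three algorithms, the token bucket is the easiest to reason about. A single-process sketch (a distributed deployment would keep the same counters in Redis, as noted above):

```python
import time

class TokenBucket:
    """Minimal in-process token bucket: bursts up to `capacity`,
    refills continuously at `rate` tokens per second."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill based on elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A client can burst through `capacity` requests at once, then is throttled to the steady `rate` — exactly the "allows bursts, smooths out over time" behavior described above.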
Authentication Patterns
API Keys — simple, suitable for server-to-server communication:
GET /api/data
Authorization: Bearer sk_live_abc123def456
OAuth 2.0 — the standard for delegated authorization. Use the Authorization Code flow with PKCE for user-facing applications:
1. Client redirects to authorization server
2. User authenticates and consents
3. Auth server redirects back with authorization code
4. Client exchanges code for access token (with PKCE verifier)
5. Client uses access token to call API
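The PKCE verifier used in step 4 is cheap to generate. A sketch of the S256 method from RFC 7636 — the client sends the challenge in step 1 and the verifier in step 4, and the auth server recomputes the hash to confirm the same client is finishing the flow:

```python
import base64
import hashlib
import secrets

def make_pkce_pair():
    """Return (code_verifier, code_challenge) using the S256 method (RFC 7636)."""
    # 32 random bytes, base64url-encoded without padding -> 43-char verifier
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # challenge = BASE64URL(SHA256(verifier)), also without padding
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```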
JWTs (JSON Web Tokens) — for stateless authentication in microservices. Short-lived access tokens (15 minutes) paired with longer-lived refresh tokens:
{
"sub": "user_42",
"iat": 1619472000,
"exp": 1619472900,
"scope": "read:posts write:posts",
"role": "editor"
}
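The claims above give the token a 900-second (15-minute) lifetime. A sketch of checking expiry on a decoded payload — signature verification (e.g., with a JWT library such as PyJWT) must happen before any of these claims are trusted:

```python
def claims_valid(claims, now):
    """Check expiry on an already-verified JWT payload."""
    return claims.get("exp", 0) > now

claims = {
    "sub": "user_42",
    "iat": 1619472000,
    "exp": 1619472900,
    "scope": "read:posts write:posts",
    "role": "editor",
}
assert claims["exp"] - claims["iat"] == 900  # 15-minute access token
```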
Rule: API keys for machine clients, OAuth for user delegation, JWTs for service-to-service within your own infrastructure.
API Documentation with OpenAPI
Good documentation is not optional. OpenAPI (formerly Swagger) is the standard for REST APIs:
openapi: 3.0.3
info:
title: User API
version: 2.0.0
paths:
/users/{id}:
get:
summary: Get a user by ID
parameters:
- name: id
in: path
required: true
schema:
type: string
responses:
'200':
description: User found
content:
application/json:
schema:
$ref: '#/components/schemas/User'
'404':
description: User not found
For GraphQL, the schema is self-documenting via introspection. For gRPC, the .proto file is the documentation — but consider generating human-readable docs from it.
Backward Compatibility
Breaking your API contract breaks your consumers. Follow these rules:
- Never remove a field — mark it deprecated, keep returning it
- Never change a field's type — add a new field instead
- Never change the meaning of a status code for an existing endpoint
- Additive changes are safe — new optional fields, new endpoints, new enum values
- Use sunset headers to communicate deprecation timelines:
Sunset: Sat, 01 Jan 2028 00:00:00 GMT
Deprecation: true
Link: </api/v3/users>; rel="successor-version"
Summary
API design is not about picking the "best" technology. It is about understanding the tradeoffs of each paradigm and matching them to your constraints. REST gives you simplicity and universal tooling. GraphQL gives you flexibility and client control. gRPC gives you performance and type safety. The best systems often use all three — REST for the public API, GraphQL for the frontend BFF, and gRPC between backend services.
Design your APIs as if someone who has never spoken to you will be integrating with them at 2 AM. Be consistent, be predictable, document everything, and never ship a breaking change without a migration path.