API Design Best Practices: REST, GraphQL, and gRPC Compared
APIs are the contracts that hold distributed systems together. Choose the wrong paradigm or ignore fundamental design principles and you will spend years fighting your own architecture. This post breaks down REST, GraphQL, and gRPC in practical terms — how each works, where each shines, and the design patterns that matter regardless of which one you pick.
REST: The Workhorse of the Web
REST (Representational State Transfer) is not a protocol. It is an architectural style built on top of HTTP. When done properly, REST APIs are predictable, cacheable, and easy to reason about.
Core Principles
Resources, not actions. URLs should represent nouns, not verbs:
# Good — resource-oriented
GET /api/users/42
POST /api/users
PUT /api/users/42
DELETE /api/users/42
# Bad — action-oriented (this is RPC over HTTP, not REST)
POST /api/getUser
POST /api/createUser
POST /api/deleteUser
HTTP verbs convey intent. Each method has defined semantics:
| Method | Purpose | Idempotent | Safe |
|---|---|---|---|
| GET | Read a resource | Yes | Yes |
| POST | Create a resource | No | No |
| PUT | Replace a resource | Yes | No |
| PATCH | Partial update | No* | No |
| DELETE | Remove a resource | Yes | No |
*PATCH can be made idempotent with proper design, but is not required to be.
Status codes tell the client what happened:
200 OK — Request succeeded
201 Created — Resource created (include Location header)
204 No Content — Success with no response body (common for DELETE)
400 Bad Request — Client sent invalid data
401 Unauthorized — Authentication required
403 Forbidden — Authenticated but not authorized
404 Not Found — Resource does not exist
409 Conflict — Resource state conflict (e.g., duplicate email)
422 Unprocessable Entity — Validation errors (well-formed request, but semantically invalid)
429 Too Many Requests — Rate limit exceeded
500 Internal Server Error — Server bug
REST Pagination: Cursor vs Offset
Pagination is where many REST APIs get it wrong. There are two main approaches.
Offset-based pagination is simple but breaks under mutation:
GET /api/posts?page=3&per_page=20
{
"data": [...],
"meta": {
"page": 3,
"per_page": 20,
"total": 1847,
"total_pages": 93
}
}
The problem: if someone inserts a post while you are paginating, you will either skip an item or see a duplicate. This is acceptable for admin dashboards but terrible for feeds.
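The skip-or-duplicate failure is easy to reproduce. A minimal Python simulation, with a plain list standing in for the database and a made-up `page` helper standing in for the query:

```python
# Simulate offset pagination over a list that mutates between page fetches.
posts = list(range(1, 101))  # post ids 1..100

def page(items, page_num, per_page=20):
    start = (page_num - 1) * per_page
    return items[start:start + per_page]

page1 = page(posts, 1)   # ids 1..20
posts.insert(0, 0)       # a new post lands at the front mid-pagination
page2 = page(posts, 2)   # the offset has shifted: ids 20..39

assert page1[-1] in page2  # post 20 shows up on both pages — a duplicate
```

Deleting a row instead of inserting one produces the mirror-image failure: an item is skipped rather than duplicated.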
Cursor-based pagination is stable under concurrent writes:
GET /api/posts?limit=20&cursor=eyJpZCI6MTAwfQ==
{
"data": [...],
"meta": {
"next_cursor": "eyJpZCI6MTIwfQ==",
"has_more": true
}
}
The cursor is typically a base64-encoded reference to the last item's sort key. The server uses it to query WHERE id > :cursor ORDER BY id ASC LIMIT 20. No skips, no duplicates, and the database can use an index scan.
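To make the cursor concrete, here is a sketch of encoding and decoding one in Python. The `{"id": ...}` JSON shape follows the example above; the helper names are made up:

```python
import base64
import json

def encode_cursor(last_id):
    # Wrap the last item's sort key in JSON, then base64url-encode it
    return base64.urlsafe_b64encode(json.dumps({"id": last_id}).encode()).decode()

def decode_cursor(cursor):
    # Reverse the encoding to recover the sort key for the WHERE clause
    return json.loads(base64.urlsafe_b64decode(cursor))["id"]
```

Keeping the cursor opaque also leaves you free to change the underlying sort key later without breaking clients.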
Rule of thumb: use cursor pagination for anything user-facing or high-throughput. Use offset pagination only when you need random page access (admin tables, search results with page numbers).
Versioning Strategies
APIs evolve. You need a strategy for breaking changes.
URL versioning is the most explicit:
GET /api/v1/users/42
GET /api/v2/users/42
Pros: obvious, easy to route, easy to deprecate. Cons: URL pollution, clients must update URLs.
Header versioning keeps URLs clean:
GET /api/users/42
Accept: application/vnd.myapi.v2+json
Pros: cleaner URLs. Cons: harder to test in a browser, less discoverable.
Query parameter versioning is a middle ground:
GET /api/users/42?version=2
My recommendation: use URL versioning for public APIs (it is the most obvious) and header versioning for internal APIs (where you control both client and server).
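With header versioning, the server has to pull the version out of the Accept header itself. A minimal sketch — the `vnd.myapi` vendor type follows the example above, and the regex and default are assumptions:

```python
import re

def api_version(accept_header, default=1):
    """Extract the version from a media type like application/vnd.myapi.v2+json."""
    match = re.search(r"vnd\.myapi\.v(\d+)\+json", accept_header or "")
    return int(match.group(1)) if match else default
```

Falling back to a default version (rather than rejecting the request) is a design choice: it keeps plain `Accept: application/json` clients working, at the cost of silently pinning them to the oldest behavior.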
Idempotency in REST APIs
Idempotency means that making the same request multiple times has the same effect as making it once. GET, PUT, and DELETE are idempotent by definition. POST is not — if a client retries a failed POST, you might create duplicate resources.
The solution is an idempotency key:
POST /api/payments
Idempotency-Key: 8a3b4c5d-6e7f-8a9b-0c1d-2e3f4a5b6c7d
Content-Type: application/json
{
"amount": 5000,
"currency": "USD",
"recipient": "user_123"
}
Server-side implementation:
import json

# Assumes `redis` is a connected client (e.g., redis.Redis()) and
# `payment_service` is the application's payment layer.
def create_payment(request):
    idempotency_key = request.headers.get("Idempotency-Key")
    # Check if we already processed this key
    existing = redis.get(f"idempotency:{idempotency_key}")
    if existing:
        return json.loads(existing)  # Return cached response
    # Process the payment
    # (Production code would reserve the key atomically, e.g. SET NX,
    # so two concurrent retries cannot both reach this point.)
    result = payment_service.charge(request.body)
    # Cache the result with a TTL (e.g., 24 hours)
    redis.setex(
        f"idempotency:{idempotency_key}",
        86400,
        json.dumps(result),
    )
    return result
Stripe popularized this pattern, and it is now considered a best practice for any API that handles money or state-changing operations.
GraphQL: Flexibility at a Cost
GraphQL lets clients request exactly the data they need in a single query. No over-fetching, no under-fetching. Facebook created it to solve the problem of mobile clients needing different data shapes than web clients.
Where GraphQL Shines
Flexible queries eliminate over-fetching:
# Client asks for exactly what it needs
query {
user(id: "42") {
name
email
posts(first: 5) {
title
createdAt
}
}
}
Compare this to REST, where you might need /api/users/42 and then /api/users/42/posts?limit=5 — two round trips, and the user endpoint returns 30 fields when you only need two.
Strongly typed schema serves as documentation:
type User {
id: ID!
name: String!
email: String!
posts(first: Int, after: String): PostConnection!
role: UserRole!
}
enum UserRole {
ADMIN
EDITOR
VIEWER
}
Single endpoint simplifies API surface. Everything goes through POST /graphql. No more debating URL structure.
Where GraphQL Hurts
The N+1 problem is GraphQL's biggest trap. Consider this query:
query {
posts(first: 20) {
title
author {
name
}
}
}
A naive resolver fetches the 20 posts in one query, then runs a separate query for each post's author — 21 queries instead of 2. The fix is a DataLoader — a batching and caching utility:
const authorLoader = new DataLoader(async (authorIds) => {
// Single query: SELECT * FROM users WHERE id IN (...)
const authors = await db.users.findMany({
where: { id: { in: authorIds } }
});
// Return in same order as input IDs
const authorMap = new Map(authors.map(a => [a.id, a]));
return authorIds.map(id => authorMap.get(id));
});
// Resolver
const resolvers = {
Post: {
author: (post) => authorLoader.load(post.authorId)
}
};
Caching is harder. REST responses can be cached by URL. GraphQL queries are POST requests with unique bodies — HTTP caching does not work out of the box. You need:
- Persisted queries (hash the query, send the hash)
- Normalized client-side caches (Apollo Client, urql)
- CDN-level caching with query whitelisting
Query complexity can be weaponized. A malicious client can craft deeply nested queries that overwhelm your server:
query {
user(id: "1") {
friends {
friends {
friends {
friends {
posts { comments { author { friends { ... } } } }
}
}
}
}
}
}
You must implement query depth limiting and query cost analysis to prevent this.
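A production server should enforce this by walking the parsed query AST (e.g., as a validation rule in graphql-js or graphql-core), but the idea behind depth limiting can be sketched with simple brace counting — a deliberate simplification:

```python
MAX_DEPTH = 6  # arbitrary limit for illustration

def query_depth(query):
    """Approximate query depth by brace nesting (real servers walk the AST)."""
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth

def reject_if_too_deep(query):
    if query_depth(query) > MAX_DEPTH:
        raise ValueError(f"query depth {query_depth(query)} exceeds {MAX_DEPTH}")
```

Cost analysis extends the same idea: instead of counting levels, assign each field a weight (list fields multiplied by their requested size) and reject queries whose total exceeds a budget.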
gRPC: Performance for Internal Services
gRPC uses HTTP/2 and Protocol Buffers for high-performance, strongly-typed communication between services. It is not a replacement for REST or GraphQL — it occupies a different niche.
Protocol Buffers
You define your API in .proto files, and the compiler generates client and server code in any supported language:
syntax = "proto3";
service UserService {
rpc GetUser(GetUserRequest) returns (User);
rpc ListUsers(ListUsersRequest) returns (stream User);
rpc CreateUser(CreateUserRequest) returns (User);
}
message GetUserRequest {
string user_id = 1;
}
message User {
string id = 1;
string name = 2;
string email = 3;
UserRole role = 4;
}
enum UserRole {
VIEWER = 0;
EDITOR = 1;
ADMIN = 2;
}
Why gRPC Wins for Internal Services
- Binary serialization — Protocol Buffers are 3-10x smaller and faster to parse than JSON
- HTTP/2 multiplexing — multiple concurrent requests over a single TCP connection, no head-of-line blocking at the HTTP layer
- Streaming — supports server streaming, client streaming, and bidirectional streaming
- Code generation — type-safe clients in any language, no manual HTTP client code
- Deadlines and cancellation — built-in support for request timeouts that propagate through the call chain
Why gRPC Is Wrong for Public APIs
- Not browser-friendly (requires gRPC-Web proxy)
- Binary protocol is not human-readable or debuggable with curl
- Steeper learning curve for API consumers
- No native support in most API gateways and tooling
When to Use Which
| Criteria | REST | GraphQL | gRPC |
|---|---|---|---|
| Public API | Best choice | Good choice | Avoid |
| Internal services | OK | Overhead | Best choice |
| Mobile clients | OK | Best choice | Via gRPC-Web |
| Real-time streaming | SSE/WebSocket | Subscriptions | Best choice |
| Browser support | Native | Via client lib | Needs proxy |
| Caching | Easy (HTTP) | Complex | Manual |
| File uploads | Easy | Awkward | Streaming |
| Learning curve | Low | Medium | High |
The pragmatic answer: most teams should use REST for public-facing APIs, GraphQL for complex client applications that consume data from multiple sources, and gRPC for performance-critical service-to-service communication.
Cross-Cutting Concerns
These patterns matter regardless of which API paradigm you choose.
Rate Limiting
Always communicate rate limits via standard headers:
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 947
X-RateLimit-Reset: 1619472000
Retry-After: 30
Common rate limiting algorithms:
- Token bucket — allows bursts, smooths out over time
- Sliding window — more precise than fixed windows, prevents boundary bursts
- Leaky bucket — constant outflow rate, queues excess requests
Implement rate limiting at the API gateway level, keyed by API key or IP address. Use Redis for distributed state.
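Of the three algorithms, the token bucket is the easiest to reason about. A single-process sketch (a distributed deployment would keep the same counters in Redis, as noted above):

```python
import time

class TokenBucket:
    """Minimal in-process token bucket: bursts up to `capacity`,
    refills continuously at `rate` tokens per second."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill based on elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A client can burst through `capacity` requests at once, then is throttled to the steady `rate` — exactly the "allows bursts, smooths out over time" behavior described above.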
Authentication Patterns
API Keys — simple, suitable for server-to-server communication:
GET /api/data
Authorization: Bearer sk_live_abc123def456
OAuth 2.0 — the standard for delegated authorization. Use the Authorization Code flow with PKCE for user-facing applications:
1. Client redirects to authorization server
2. User authenticates and consents
3. Auth server redirects back with authorization code
4. Client exchanges code for access token (with PKCE verifier)
5. Client uses access token to call API
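The PKCE verifier used in step 4 is cheap to generate. A sketch of the S256 method from RFC 7636 — the client sends the challenge in step 1 and the verifier in step 4, and the auth server recomputes the hash to confirm the same client is finishing the flow:

```python
import base64
import hashlib
import secrets

def make_pkce_pair():
    """Return (code_verifier, code_challenge) using the S256 method (RFC 7636)."""
    # 32 random bytes, base64url-encoded without padding -> 43-char verifier
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # challenge = BASE64URL(SHA256(verifier)), also without padding
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```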
JWTs (JSON Web Tokens) — for stateless authentication in microservices. Short-lived access tokens (15 minutes) paired with longer-lived refresh tokens:
{
"sub": "user_42",
"iat": 1619472000,
"exp": 1619472900,
"scope": "read:posts write:posts",
"role": "editor"
}
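The claims above give the token a 900-second (15-minute) lifetime. A sketch of checking expiry on a decoded payload — signature verification (e.g., with a JWT library such as PyJWT) must happen before any of these claims are trusted:

```python
def claims_valid(claims, now):
    """Check expiry on an already-verified JWT payload."""
    return claims.get("exp", 0) > now

claims = {
    "sub": "user_42",
    "iat": 1619472000,
    "exp": 1619472900,
    "scope": "read:posts write:posts",
    "role": "editor",
}
assert claims["exp"] - claims["iat"] == 900  # 15-minute access token
```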
Rule: API keys for machine clients, OAuth for user delegation, JWTs for service-to-service within your own infrastructure.
API Documentation with OpenAPI
Good documentation is not optional. OpenAPI (formerly Swagger) is the standard for REST APIs:
openapi: 3.0.3
info:
title: User API
version: 2.0.0
paths:
/users/{id}:
get:
summary: Get a user by ID
parameters:
- name: id
in: path
required: true
schema:
type: string
responses:
'200':
description: User found
content:
application/json:
schema:
$ref: '#/components/schemas/User'
'404':
description: User not found
For GraphQL, the schema is self-documenting via introspection. For gRPC, the .proto file is the documentation — but consider generating human-readable docs from it.
Backward Compatibility
Breaking your API contract breaks your consumers. Follow these rules:
- Never remove a field — mark it deprecated, keep returning it
- Never change a field's type — add a new field instead
- Never change the meaning of a status code for an existing endpoint
- Additive changes are safe — new optional fields, new endpoints, new enum values
- Use sunset headers to communicate deprecation timelines:
Sunset: Sat, 01 Jan 2028 00:00:00 GMT
Deprecation: true
Link: </api/v3/users>; rel="successor-version"
Summary
API design is not about picking the "best" technology. It is about understanding the tradeoffs of each paradigm and matching them to your constraints. REST gives you simplicity and universal tooling. GraphQL gives you flexibility and client control. gRPC gives you performance and type safety. The best systems often use all three — REST for the public API, GraphQL for the frontend BFF, and gRPC between backend services.
Design your APIs as if someone who has never spoken to you will be integrating with them at 2 AM. Be consistent, be predictable, document everything, and never ship a breaking change without a migration path.