How I Design REST APIs That Scale: Building Production-Ready Apps from Scratch Using Node.js
Beginners ask, "What endpoint should I create?"
Production engineers ask, "What workflow am I protecting?"
That difference is everything. I've seen plenty of APIs start clean — a neat server.ts, a handful of routes, maybe some validation. Then features pile on. Users multiply. Errors get awkward. The database starts choking. Before long, that tidy API is a ball of mud held together by desperation and TODO comments.
This article is about building it right from the start. Not with over-engineering, but with deliberate structure. We'll build an Order Management API as we go — because every production engineer has dealt with orders, payments, and status transitions.
Start With the Workflow, Not the Endpoints
Most tutorials teach APIs backwards. They show you how to define a route, wire a controller, query a database, and call it done. That works for a todo app. It falls apart for anything real.
Instead, start with the user's journey:
- A user creates an account and browses products.
- They place an order — but only if they have sufficient funds.
- They pay for the order. That payment must be idempotent (no double-charging).
- An admin reviews and updates the order status.
- Every state change is logged for compliance.
From this flow, endpoints emerge naturally:
POST /api/v1/orders
GET /api/v1/orders/:id
POST /api/v1/orders/:id/pay
GET /api/v1/admin/orders
PATCH /api/v1/admin/orders/:id/status
We didn't start with endpoints. We started with what the business needs to do.
Choose a Simple Production-Ready Stack
For this project, I reach for the same stack every time:
- Runtime: Node.js + TypeScript
- Framework: Express (or NestJS if the domain is complex)
- Database: PostgreSQL
- ORM: Prisma (type-safe queries, great migrations)
- Cache: Redis (rate limiting, sessions, caching)
- Validation: Zod (lightweight, TypeScript-first)
- Containers: Docker + docker-compose
Nothing exotic. Most teams already know these tools. Docker Compose ties them together locally, so onboarding a new developer is a single docker compose up command. When something goes wrong at 2 AM, your on-call engineer can read the stack without context-switching.
Design the API Contract First
Before writing a single route handler, decide how every response looks. Consistency is the cheapest UX improvement you can make.
Response envelope:
interface ApiResponse<T> {
  success: boolean
  data: T | null
  error: {
    code: string
    message: string
    details?: unknown
  } | null
  requestId: string
  timestamp: string
}
Error codes are not HTTP statuses. The status is transport; the code is semantics.
INSUFFICIENT_FUNDS → 402
ORDER_NOT_FOUND → 404
ORDER_STATUS_INVALID → 422
RATE_LIMITED → 429
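The helpers successResponse and errorResponse are called later in this article but never defined. Here is a minimal sketch of how they might build the envelope. Two things are assumptions made for illustration: the request ID is passed in explicitly (the article's calls take a single argument), and ERROR_STATUS and statusFor are hypothetical names for the code-to-status mapping listed above.

```typescript
// Envelope shape from the article, repeated so the sketch is self-contained.
interface ApiError {
  code: string
  message: string
  details?: unknown
}

interface ApiResponse<T> {
  success: boolean
  data: T | null
  error: ApiError | null
  requestId: string
  timestamp: string
}

// Hypothetical lookup: semantic error code -> HTTP status (the transport detail).
const ERROR_STATUS: Record<string, number> = {
  INSUFFICIENT_FUNDS: 402,
  ORDER_NOT_FOUND: 404,
  ORDER_STATUS_INVALID: 422,
  RATE_LIMITED: 429,
}

function successResponse<T>(data: T, requestId: string): ApiResponse<T> {
  return { success: true, data, error: null, requestId, timestamp: new Date().toISOString() }
}

function errorResponse(error: ApiError, requestId: string): ApiResponse<never> {
  return { success: false, data: null, error, requestId, timestamp: new Date().toISOString() }
}

// Unknown codes fall back to 500 so a typo never produces a misleading status.
function statusFor(code: string): number {
  return ERROR_STATUS[code] ?? 500
}
```

Keeping the mapping in one table means a client can branch on error.code while proxies and monitoring still see a meaningful HTTP status.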
Pagination follows a consistent shape:
interface PaginatedResponse<T> {
  data: T[]
  meta: {
    page: number
    perPage: number
    total: number
    totalPages: number
  }
}
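The meta block is easy to get subtly wrong when page and perPage come straight from the query string. The sketch below shows one way to clamp those values and derive totalPages. The names paginate and clampInt and the limits of 20 and 100 are hypothetical, not something the article prescribes.

```typescript
interface PageMeta {
  page: number
  perPage: number
  total: number
  totalPages: number
}

// Hypothetical defaults; tune the cap to what the database can serve cheaply.
const DEFAULT_PER_PAGE = 20
const MAX_PER_PAGE = 100

// Fall back when the value is not an integer, otherwise clamp it into [min, max].
function clampInt(value: unknown, fallback: number, min: number, max: number): number {
  if (typeof value !== "number" || !Number.isInteger(value)) return fallback
  return Math.min(Math.max(value, min), max)
}

// Returns the response meta plus the skip/take pair a query layer would need.
function paginate(total: number, rawPage: unknown, rawPerPage: unknown) {
  const perPage = clampInt(rawPerPage, DEFAULT_PER_PAGE, 1, MAX_PER_PAGE)
  const page = clampInt(rawPage, 1, 1, Number.MAX_SAFE_INTEGER)
  const meta: PageMeta = { page, perPage, total, totalPages: Math.ceil(total / perPage) }
  return { meta, skip: (page - 1) * perPage, take: perPage }
}
```

With 45 orders, paginate(45, 2, 20) yields page 2 of 3 with skip 20 and take 20. An oversized perPage is capped rather than rejected, which is a design choice; returning a validation error is equally defensible.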
Versioning lives in the URL path (/api/v1/). It's not elegant, but it's obvious. Anyone can look at a log line and know exactly which contract is in play.
Separate Responsibilities
A common mistake is putting everything in the route handler. That works until you need to call the same logic from a background job or another controller.
The layering I use:
Route → Controller → Service → Repository → Database
- Route: Defines the path, middleware, and HTTP method. Nothing else.
- Controller: Parses the request, calls the service, formats the response.
- Service: Contains business logic. Orchestrates calls to repositories and external APIs.
- Repository: Abstracts database access. If I switch from Prisma to raw SQL, only this layer changes.
Example of a controller that stays thin:
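// Express 4 does not forward rejected promises, so pass errors to next().
async function createOrder(req: Request, res: Response, next: NextFunction) {
  try {
    const dto = createOrderSchema.parse(req.body)
    const result = await orderService.createOrder(req.user.id, dto)
    res.status(201).json(successResponse(result))
  } catch (err) {
    next(err)
  }
}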
The service handles the transaction, the business rules, and the insufficient-funds check. The controller just glues it together.
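To make the layering concrete, here is a small sketch in which the service depends only on a repository interface. Everything here is illustrative: OrderRepository, InMemoryOrderRepository, and the per-order quantity limit are hypothetical, and a Prisma-backed repository would implement the same interface.

```typescript
import { randomUUID } from "node:crypto"

interface Order {
  id: string
  userId: string
  productId: string
  quantity: number
  status: "PENDING"
}

// Repository: the only layer that knows how orders are stored.
interface OrderRepository {
  insert(order: Order): Promise<Order>
  findById(id: string): Promise<Order | null>
}

// In-memory stand-in, useful for tests and for this sketch.
class InMemoryOrderRepository implements OrderRepository {
  private readonly rows = new Map<string, Order>()

  async insert(order: Order): Promise<Order> {
    this.rows.set(order.id, order)
    return order
  }

  async findById(id: string): Promise<Order | null> {
    return this.rows.get(id) ?? null
  }
}

// Service: business rules only, no HTTP objects and no SQL.
class OrderService {
  private readonly orders: OrderRepository

  constructor(orders: OrderRepository) {
    this.orders = orders
  }

  async createOrder(userId: string, dto: { productId: string; quantity: number }): Promise<Order> {
    if (dto.quantity > 100) {
      throw new Error("Quantity exceeds the per-order limit") // hypothetical rule
    }
    return this.orders.insert({ id: randomUUID(), userId, ...dto, status: "PENDING" })
  }
}
```

Because OrderService never touches req or res, a controller, a background job, and a unit test can all call createOrder the same way.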
Validate Everything at the Boundary
Never trust req.body. Never trust req.params. Never trust req.headers. Never trust external APIs.
Validation lives at the boundary — the moment data enters your system. I use Zod schemas that double as TypeScript types:
const createOrderSchema = z.object({
  productId: z.string().uuid(),
  quantity: z.number().int().positive(),
  couponCode: z.string().optional(),
})
type CreateOrderDTO = z.infer<typeof createOrderSchema>
If validation fails, reject early with a clear error. Don't let bad data propagate to the service layer where it causes confusing failures.
// Generic over the schema's output type, so callers get a typed result instead of any.
function validate<T>(schema: z.ZodType<T>, data: unknown): T {
  const result = schema.safeParse(data)
  if (!result.success) {
    throw new ValidationError(result.error.flatten())
  }
  return result.data
}
Think About the Database Early
Production APIs live and die by their database design. Three things matter most.
Indexes. Index the columns your hot queries filter, sort, and join on. Every index also slows writes, so let real query patterns decide rather than indexing everything. For the Order Management API:
model Order {
  id        String      @id @default(uuid())
  userId    String
  status    OrderStatus
  createdAt DateTime
  @@index([userId])
  @@index([status])
  @@index([createdAt])
}
Transactions. When an order involves multiple entities — deducting wallet balance, creating a payment record, updating order status — wrap it in a transaction. Partial failures are not an option.
async function payOrder(orderId: string, userId: string) {
  return prisma.$transaction(async (tx) => {
    // Scope the lookup to the caller so one user cannot pay for another user's order.
    const order = await tx.order.findFirstOrThrow({ where: { id: orderId, userId } })
    // Check and debit in one conditional statement, so two concurrent payments
    // cannot both pass the balance check and overdraw the wallet.
    const debit = await tx.wallet.updateMany({
      where: { userId, balance: { gte: order.total } },
      data: { balance: { decrement: order.total } },
    })
    // count is 0 when the balance is too low (a missing wallet also lands here).
    if (debit.count === 0) {
      throw new InsufficientFundsError()
    }
    await tx.payment.create({
      data: { orderId, amount: order.total, status: 'COMPLETED' },
    })
    return tx.order.update({
      where: { id: orderId },
      data: { status: 'PAID' },
    })
  })
}
Migrations. Never modify the database by hand. Every schema change goes through a migration that is code-reviewed and reversible.
Handle Errors Like a Product
Errors are part of your API's UX. A cryptic 500 with a stack trace leaking to the client is not acceptable.
Custom error classes keep things organized:
class AppError extends Error {
  constructor(
    public statusCode: number,
    public code: string,
    message: string,
    public details?: unknown
  ) {
    super(message)
  }
}
class InsufficientFundsError extends AppError {
  constructor() {
    super(402, 'INSUFFICIENT_FUNDS', 'Insufficient wallet balance')
  }
}
A global error handler catches everything:
// Express recognizes error middleware by its four-argument signature,
// so next must be declared even though it is unused.
function errorHandler(err: Error, req: Request, res: Response, _next: NextFunction) {
  if (err instanceof AppError) {
    logger.warn({ err, requestId: req.requestId }, 'Application error')
    return res.status(err.statusCode).json(errorResponse(err))
  }
  logger.error({ err, requestId: req.requestId }, 'Unhandled error')
  res.status(500).json(errorResponse(new InternalError()))
}
The client gets a safe, structured message. The engineering team gets structured logs with a request ID they can trace.
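The split between what the client sees and what the logs keep can be sketched as a pure function. toClientError is a hypothetical helper, and the INTERNAL_ERROR code and its message are assumptions. The point is only that an unknown error's own message never leaves the server.

```typescript
// Same shape as the article's AppError, written with explicit fields.
class AppError extends Error {
  readonly statusCode: number
  readonly code: string
  readonly details?: unknown

  constructor(statusCode: number, code: string, message: string, details?: unknown) {
    super(message)
    this.name = new.target.name
    this.statusCode = statusCode
    this.code = code
    this.details = details
  }
}

interface ClientError {
  status: number
  body: { code: string; message: string; details?: unknown }
}

function toClientError(err: unknown): ClientError {
  if (err instanceof AppError) {
    // Known, intentional failure: safe to show as written.
    return {
      status: err.statusCode,
      body: { code: err.code, message: err.message, details: err.details },
    }
  }
  // Unknown failure: never echo err.message or the stack, they may leak internals.
  return { status: 500, body: { code: "INTERNAL_ERROR", message: "Something went wrong" } }
}
```

A database driver error such as a refused connection would be logged in full on the server, while the client only ever receives the generic 500 body.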
Add Security From Day One
Security is not a phase. Build it in from the start.
Authentication and RBAC. Every protected route checks identity and role. Middleware is the right place for this:
function requireRole(...roles: string[]) {
  return (req: Request, res: Response, next: NextFunction) => {
    // Guard against a missing user in case authenticate did not run first.
    if (!req.user || !roles.includes(req.user.role)) {
      throw new ForbiddenError()
    }
    next()
  }
}
router.patch('/api/v1/admin/orders/:id/status',
  authenticate,
  requireRole('ADMIN'),
  adminOrderController.updateStatus
)
Rate limiting on every public endpoint. Redis-backed, per-user or per-IP.
CORS locked down to known origins. No wildcards in production.
Input sanitization for any user-generated content that might reach a frontend.
Audit logs for every sensitive operation — status changes, payment attempts, admin actions.
await tx.auditLog.create({
  data: {
    action: 'ORDER_STATUS_CHANGED',
    entityType: 'Order',
    entityId: orderId,
    actorId: userId,
    metadata: { from: 'PENDING', to: 'CONFIRMED' },
  },
})
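The rate limiting mentioned above can be illustrated with a fixed-window counter. This sketch keeps counters in an in-memory Map purely so it runs on its own; in production the same logic would sit on Redis so that all instances share one count. FixedWindowLimiter and its parameters are hypothetical names.

```typescript
interface RateLimitResult {
  allowed: boolean
  remaining: number
  retryAfterMs: number
}

class FixedWindowLimiter {
  // Entries are never evicted here; a Redis key with an expiry would handle that.
  private readonly windows = new Map<string, { count: number; resetAt: number }>()
  private readonly limit: number
  private readonly windowMs: number
  private readonly now: () => number

  // The clock is injectable so the limiter can be tested without real waiting.
  constructor(limit: number, windowMs: number, now: () => number = Date.now) {
    this.limit = limit
    this.windowMs = windowMs
    this.now = now
  }

  // key is a user ID or an IP address, depending on the endpoint.
  hit(key: string): RateLimitResult {
    const t = this.now()
    let current = this.windows.get(key)
    if (!current || t >= current.resetAt) {
      // First request, or the previous window has ended: start a new one.
      current = { count: 0, resetAt: t + this.windowMs }
      this.windows.set(key, current)
    }
    current.count += 1
    const allowed = current.count <= this.limit
    return {
      allowed,
      remaining: Math.max(0, this.limit - current.count),
      retryAfterMs: allowed ? 0 : current.resetAt - t,
    }
  }
}
```

When allowed is false, the middleware would respond with 429 and the RATE_LIMITED code, using retryAfterMs to fill a Retry-After header. Fixed windows allow short bursts at window edges; a sliding window or token bucket smooths that out at the cost of more state.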
Make the API Observable
You can't fix what you can't see. Every request gets a unique ID, propagated through the entire system:
app.use((req, _res, next) => {
  req.requestId = uuidv4()
  next()
})
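Attaching the ID to req covers the HTTP layer, but services and repositories do not see req. One way to propagate the ID further, sketched here as an assumption rather than the article's method, is Node's built-in AsyncLocalStorage. The helper names and the rule for accepting an upstream ID are hypothetical.

```typescript
import { AsyncLocalStorage } from "node:async_hooks"
import { randomUUID } from "node:crypto"

interface RequestContext {
  requestId: string
}

const requestContext = new AsyncLocalStorage<RequestContext>()

// Accept an upstream ID only if it looks sane; otherwise mint a new one.
function resolveRequestId(incoming: string | undefined): string {
  return incoming && /^[A-Za-z0-9._-]{8,128}$/.test(incoming) ? incoming : randomUUID()
}

// Everything called inside fn, including async continuations, sees the same ID.
function runWithRequestId<T>(incoming: string | undefined, fn: () => T): T {
  return requestContext.run({ requestId: resolveRequestId(incoming) }, fn)
}

// Safe to call from any layer; returns undefined outside a request.
function currentRequestId(): string | undefined {
  return requestContext.getStore()?.requestId
}
```

In an Express middleware this would look like runWithRequestId(req.get('X-Request-Id'), () => next()), after which a logger or repository can call currentRequestId() without the ID being threaded through every function signature.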
Structured logging (JSON) goes to stdout. No file logging in containers — the platform handles log shipping.
const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  transport: process.env.NODE_ENV === 'development'
    ? { target: 'pino-pretty' }
    : undefined,
})
Health checks at GET /health — returns database connectivity, Redis connectivity, and uptime. Your orchestrator needs this.
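A health endpoint is only useful if a hung dependency cannot hang the check itself. The following is a rough sketch of the aggregation logic, with each dependency represented by a function that resolves when it is reachable. runHealthChecks, withTimeout, and the one-second default are assumptions for illustration.

```typescript
type Check = () => Promise<void>
type CheckStatus = "up" | "down"

interface HealthReport {
  status: "ok" | "degraded"
  uptimeSeconds: number
  checks: Record<string, CheckStatus>
}

// Reject if the dependency does not answer within timeoutMs.
function withTimeout(check: Check, timeoutMs: number): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("health check timed out")), timeoutMs)
    Promise.resolve()
      .then(check)
      .then(resolve, reject)
      .finally(() => clearTimeout(timer))
  })
}

// Runs all checks in parallel; one failing check never hides the others.
async function runHealthChecks(
  checks: Record<string, Check>,
  timeoutMs = 1000,
): Promise<HealthReport> {
  const results = await Promise.all(
    Object.entries(checks).map(async ([name, check]): Promise<[string, CheckStatus]> => {
      try {
        await withTimeout(check, timeoutMs)
        return [name, "up"]
      } catch {
        return [name, "down"]
      }
    }),
  )
  const allUp = results.every(([, status]) => status === "up")
  return {
    status: allUp ? "ok" : "degraded",
    uptimeSeconds: Math.round(process.uptime()),
    checks: Object.fromEntries(results),
  }
}
```

The route handler would typically return 200 when status is ok and 503 otherwise, so the orchestrator can act on the status code alone.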
Metrics expose request rates, error rates, and latency percentiles via Prometheus. A dashboard showing p50/p95/p99 response times tells you more than hours of log spelunking. Set alerts on the metrics that matter: 5xx spike, p99 latency breach, payment failure rate.
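To see why percentiles say more than an average, here is a small nearest-rank calculation over raw latency samples. This is only an illustration of what p50/p95/p99 mean: Prometheus itself estimates quantiles from histogram buckets rather than from stored samples, and the sample values below are made up.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it. p must be in (0, 100].
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new RangeError("no samples")
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]")
  const sorted = [...samples].sort((a, b) => a - b)
  // Multiply before dividing to avoid floating-point drift in the rank.
  const rank = Math.ceil((p * sorted.length) / 100)
  return sorted[rank - 1]
}

// Made-up latencies in milliseconds: mostly fast, two slow outliers.
const latenciesMs = [12, 15, 11, 240, 14, 13, 16, 18, 17, 900]

const mean = latenciesMs.reduce((sum, v) => sum + v, 0) / latenciesMs.length
const p50 = percentile(latenciesMs, 50)
const p95 = percentile(latenciesMs, 95)

console.log({ mean, p50, p95 }) // { mean: 125.6, p50: 15, p95: 900 }
```

The mean of 125.6 ms describes no real request: the typical one takes about 15 ms and the worst take close to a second. That gap is what an alert on p99 latency is meant to catch.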
Test the Flows That Matter
Not every function needs a unit test. Every critical flow does.
Unit tests cover service logic — status transitions, validation rules, edge cases.
describe('OrderService.createOrder', () => {
  it('throws when wallet balance is insufficient', async () => {
    const dto = { productId: uuid(), quantity: 1 }
    jest.spyOn(walletRepo, 'findByUserId').mockResolvedValue({ balance: 0 })
    await expect(
      orderService.createOrder(userId, dto)
    ).rejects.toThrow(InsufficientFundsError)
  })
})
Integration tests cover the full request-response cycle. Spin up a test database, seed data, hit the real route.
describe('POST /api/v1/orders', () => {
  it('creates an order and returns 201', async () => {
    const res = await request(app)
      .post('/api/v1/orders')
      .set('Authorization', `Bearer ${token}`)
      .send({ productId: seedProduct.id, quantity: 2 })
    expect(res.status).toBe(201)
    expect(res.body.success).toBe(true)
    expect(res.body.data.status).toBe('PENDING')
  })
})
The goal is confidence. If you can deploy without sweating, you've tested enough.
Prepare for Scale Before Scale Arrives
Scaling is not about throwing servers at a problem. It's about making each request cheap and reliable.
Caching. Read-heavy endpoints (product catalog) get cached in Redis with a sensible TTL. Cache-aside pattern: check cache first, fall back to database, populate cache on miss.
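The cache-aside steps can be sketched in a few lines. An in-memory Map stands in for Redis here so the example runs on its own; CacheAside, its TTL handling, and the loader callback are hypothetical names, not a specific library's API.

```typescript
interface CacheEntry<V> {
  value: V
  expiresAt: number
}

class CacheAside<V> {
  private readonly store = new Map<string, CacheEntry<V>>()
  private readonly ttlMs: number
  private readonly now: () => number

  // The clock is injectable so expiry can be tested without real waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs
    this.now = now
  }

  async get(key: string, load: () => Promise<V>): Promise<V> {
    const hit = this.store.get(key)
    if (hit && hit.expiresAt > this.now()) {
      return hit.value // 1. cache hit: the database is not touched
    }
    const value = await load() // 2. miss: fall back to the source of truth
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs }) // 3. populate
    return value
  }

  // Call this after a write so readers do not see stale data until the TTL ends.
  invalidate(key: string): void {
    this.store.delete(key)
  }
}
```

A product lookup would then read as cache.get(`product:${id}`, () => productRepo.findById(id)), where productRepo is again a placeholder. Note that concurrent misses on the same key each call the loader; deduplicating them is a further refinement this sketch leaves out.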
Idempotency. Payment endpoints must survive retries. The client sends an Idempotency-Key header. The server checks if it has seen that key before and returns the stored result instead of processing again.
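async function payOrder(req: Request) {
  const idempotencyKey = req.get('Idempotency-Key')
  if (!idempotencyKey) {
    // Illustrative error code; without a key a retry could charge twice.
    throw new AppError(400, 'IDEMPOTENCY_KEY_REQUIRED', 'Idempotency-Key header is required')
  }
  // Scope the key to the caller so two users cannot collide on the same key.
  const cacheKey = `${req.user.id}:${idempotencyKey}`
  const existing = await idempotencyCache.get(cacheKey)
  if (existing) return existing
  const result = await orderService.payOrder(req.params.id, req.user.id)
  // 86,400 s = 24 h, assuming the cache takes its ttl in seconds.
  await idempotencyCache.set(cacheKey, result, { ttl: 86_400 })
  return result
}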
Background jobs. Sending confirmation emails, generating invoices, syncing with external systems — none of this belongs in the request lifecycle. Push it to a queue (BullMQ with Redis). The queue handles retries, dead letters, and concurrency. Your API stays fast because it only does what it must.
Async workflows. If an order requires external approval, model it as a workflow with explicit state transitions and event handlers. Don't block the HTTP response waiting for an external system.
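Explicit state transitions can be as simple as a lookup table. The sketch below is one possible shape: the CANCELLED status and the specific allowed edges are assumptions made for illustration, since the article only mentions PENDING, CONFIRMED, and PAID.

```typescript
type OrderStatus = "PENDING" | "CONFIRMED" | "PAID" | "CANCELLED"

// Hypothetical transition table: each status lists the statuses it may move to.
const TRANSITIONS: Record<OrderStatus, readonly OrderStatus[]> = {
  PENDING: ["CONFIRMED", "CANCELLED"],
  CONFIRMED: ["PAID", "CANCELLED"],
  PAID: [], // terminal in this sketch
  CANCELLED: [], // terminal
}

function canTransition(from: OrderStatus, to: OrderStatus): boolean {
  return TRANSITIONS[from].includes(to)
}

// Returns the new status, or throws so the caller can map it to ORDER_STATUS_INVALID.
function transition(from: OrderStatus, to: OrderStatus): OrderStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid order status transition: ${from} -> ${to}`)
  }
  return to
}
```

An event handler for an external approval would call transition(order.status, 'CONFIRMED') and persist the result, so an out-of-order or duplicate event fails loudly instead of silently corrupting the order.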
Horizontal scaling. Stateless application servers behind a load balancer. Sessions in Redis. File uploads to S3. If any instance can die without notice and no user loses state, your architecture is resilient.
Conclusion
Building a production-ready API is not about knowing the hottest framework or the most advanced patterns. It's about the boring stuff done well: consistent contracts, clear layering, database discipline, thoughtful errors, observability, and testing.
The difference between a toy API and a production API is not complexity. It's intentionality.
Beginners ask, "What framework should I use?"
Production engineers ask, "What happens when this fails?"
The Order Management API we walked through is small — maybe 15 endpoints. But every decision we made is the same one you'd make for a system handling a million requests a day. The tools change. The principles don't.
Go build something that survives production.
