The traditional testing pyramid -- many unit tests, fewer integration tests, even fewer end-to-end tests -- was designed for monolithic applications where all code runs in a single process and all data lives in a single database. In distributed systems, this model breaks down. A unit test that mocks every external dependency tells you that your business logic is correct in isolation, but it says nothing about whether your service can actually communicate with the services it depends on. And the failures that take down production in a microservices architecture are almost never logic bugs caught by unit tests. They are network timeouts, message serialization mismatches, database connection pool exhaustion, and cascading failures when a downstream service becomes slow.
Testing distributed systems requires a fundamentally different strategy: one that validates contracts between services, verifies behavior under realistic infrastructure conditions, and deliberately injects failures to confirm that resilience mechanisms actually work.
Rethinking the Testing Pyramid
The classic pyramid suggests that unit tests should form the broad base of your testing strategy. In a distributed system, this remains partially true -- unit tests are fast, cheap, and effective for validating business logic within a single service. But the ratio shifts significantly.
In a monolith, the integration boundary is usually the database. In a microservices architecture, every service has multiple integration boundaries: databases, message brokers, caches, other services via HTTP or gRPC, and external APIs. The bugs that matter most live at these boundaries. A unit test with a mocked HTTP client cannot catch a breaking change in a downstream service's response format. A unit test with a mocked database cannot catch a query that performs acceptably in development but times out against a production-scale dataset.
A more effective model for distributed systems is the testing honeycomb, which emphasizes integration tests as the largest layer. The structure looks like this:
- Unit tests -- Validate pure business logic, domain models, algorithms, and utility functions. Mock external dependencies. Run in milliseconds.
- Integration tests -- Validate interactions with real infrastructure: databases, message brokers, caches. Use lightweight, containerized instances of real dependencies. These are the most valuable tests in a distributed system.
- Contract tests -- Validate that services agree on the format and semantics of their communication. Run independently of other services.
- End-to-end tests -- Validate complete user journeys across multiple services. Run sparingly due to cost and fragility.
- Chaos tests -- Validate that the system behaves correctly under failure conditions: network partitions, service crashes, resource exhaustion.
Unit Testing in Microservices
Unit tests in microservices focus on the same things they always have: business logic correctness. The difference is scope. In a well-designed microservice aligned to a bounded context, the domain model and its rules are self-contained.
import { describe, it, expect } from "vitest";
import { Order, Money } from "../domain/order";

// Shipping address fixture shared by the tests below
const validAddress = {
  street: "123 Main St",
  city: "Springfield",
  postalCode: "12345",
  country: "US",
};

describe("Order aggregate", () => {
  it("calculates total correctly with multiple line items", () => {
    const order = Order.create("order-1", "customer-1");
    order.addLineItem("product-a", 2, Money.of(25.0, "USD"));
    order.addLineItem("product-b", 1, Money.of(49.99, "USD"));

    expect(order.total).toEqual(Money.of(99.99, "USD"));
  });

  it("rejects submission of an empty order", () => {
    const order = Order.create("order-1", "customer-1");

    expect(() => order.submit()).toThrow("Cannot submit an empty order");
  });

  it("produces OrderSubmitted event on successful submission", () => {
    const order = Order.create("order-1", "customer-1");
    order.addLineItem("product-a", 1, Money.of(25.0, "USD"));
    order.setShippingAddress(validAddress);
    order.submit();

    const events = order.uncommittedEvents;
    expect(events).toHaveLength(1);
    expect(events[0].type).toBe("OrderSubmitted");
    expect(events[0].data.totalAmount).toBe(25.0);
  });

  it("prevents modification after submission", () => {
    const order = Order.create("order-1", "customer-1");
    order.addLineItem("product-a", 1, Money.of(25.0, "USD"));
    order.setShippingAddress(validAddress);
    order.submit();

    expect(() =>
      order.addLineItem("product-b", 1, Money.of(10.0, "USD"))
    ).toThrow("Cannot modify a submitted order");
  });
});
These tests are fast, deterministic, and validate core domain rules. They do not require a database, a message broker, or any network access. Keep them focused on business logic and avoid testing framework wiring, HTTP routing, or serialization -- those belong in integration tests.
The key heuristic: if a unit test requires more than two or three mocks to set up, the code under test has too many dependencies for unit testing and should be validated through integration or contract tests instead.
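The cleanest way to stay under that threshold is to extract decision logic into pure functions that need no mocks at all. A minimal sketch of the idea -- the names here are illustrative, not from the Order aggregate above:

```typescript
// Hypothetical example: pricing logic extracted into a pure function so it
// can be unit tested with zero mocks. Prices are in minor units (cents)
// to avoid floating-point drift.
interface PricedItem {
  unitPrice: number;
  quantity: number;
}

// Pure function: no repository, no HTTP client, no clock.
export function calculateOrderTotal(
  items: PricedItem[],
  discountRate = 0
): number {
  const subtotal = items.reduce(
    (sum, item) => sum + item.unitPrice * item.quantity,
    0
  );
  return Math.round(subtotal * (1 - discountRate));
}

// A zero-mock unit test becomes trivial:
const total = calculateOrderTotal(
  [
    { unitPrice: 2500, quantity: 2 },
    { unitPrice: 4999, quantity: 1 },
  ],
  0.1
);
console.log(total); // 8999 (9999 minus 10%)
```

Code that orchestrates I/O around such functions is then thin enough that integration tests cover it naturally.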
Integration Testing with Testcontainers
Testcontainers is a library that programmatically manages Docker containers during test execution. Instead of mocking your database or message broker, you spin up a real instance, run your tests against it, and tear it down when the test suite completes. This gives you confidence that your code works with real infrastructure without the fragility of shared test environments.
import { describe, it, expect, beforeAll, afterAll } from "vitest";
import { PostgreSqlContainer, StartedPostgreSqlContainer } from "@testcontainers/postgresql";
import { Pool } from "pg";
import { Order, Money } from "../domain/order";
import { OrderRepository } from "../repositories/order-repository";

describe("OrderRepository integration tests", () => {
  let pgContainer: StartedPostgreSqlContainer;
  let pool: Pool;
  let repository: OrderRepository;

  beforeAll(async () => {
    // Start a real PostgreSQL instance in a container
    pgContainer = await new PostgreSqlContainer("postgres:16")
      .withDatabase("test_orders")
      .start();

    pool = new Pool({
      connectionString: pgContainer.getConnectionUri(),
    });

    // Run migrations against the test database
    // (runMigrations is the project's own migration helper, not shown here)
    await runMigrations(pool);
    repository = new OrderRepository(pool);
  }, 60000);

  afterAll(async () => {
    await pool.end();
    await pgContainer.stop();
  });

  it("persists and retrieves an order with line items", async () => {
    const order = Order.create("order-1", "customer-1");
    order.addLineItem("product-a", 2, Money.of(25.0, "USD"));
    order.addLineItem("product-b", 1, Money.of(49.99, "USD"));

    await repository.save(order);
    const retrieved = await repository.findById("order-1");

    expect(retrieved).not.toBeNull();
    expect(retrieved.total).toEqual(Money.of(99.99, "USD"));
    expect(retrieved.lineItems).toHaveLength(2);
  });

  it("handles concurrent saves with optimistic locking", async () => {
    const order = Order.create("order-2", "customer-1");
    order.addLineItem("product-c", 1, Money.of(10.0, "USD"));
    await repository.save(order);

    // Simulate two concurrent modifications
    const instance1 = await repository.findById("order-2");
    const instance2 = await repository.findById("order-2");

    instance1.addLineItem("product-d", 1, Money.of(15.0, "USD"));
    await repository.save(instance1);

    instance2.addLineItem("product-e", 1, Money.of(20.0, "USD"));
    await expect(repository.save(instance2)).rejects.toThrow("ConcurrencyError");
  });
});
Testcontainers is particularly valuable for testing message broker interactions:
import { describe, it, expect, beforeAll, afterAll } from "vitest";
import { GenericContainer, StartedTestContainer, Wait } from "testcontainers";
import amqp, { Connection } from "amqplib";

describe("Order event publishing", () => {
  let rabbitContainer: StartedTestContainer;
  let connection: Connection;

  beforeAll(async () => {
    rabbitContainer = await new GenericContainer("rabbitmq:3.13")
      .withExposedPorts(5672)
      .withWaitStrategy(Wait.forLogMessage("Server startup complete"))
      .start();

    const port = rabbitContainer.getMappedPort(5672);
    connection = await amqp.connect(`amqp://localhost:${port}`);
  }, 60000);

  afterAll(async () => {
    await connection.close();
    await rabbitContainer.stop();
  });

  it("publishes OrderSubmitted event to the exchange", async () => {
    const channel = await connection.createChannel();
    const exchange = "events.orders";
    await channel.assertExchange(exchange, "topic", { durable: false });
    const { queue } = await channel.assertQueue("", { exclusive: true });
    await channel.bindQueue(queue, exchange, "order.submitted");

    const receivedMessages: Buffer[] = [];
    await channel.consume(queue, (msg) => {
      if (msg) receivedMessages.push(msg.content);
    });

    // Execute the operation that should publish the event
    const publisher = new OrderEventPublisher(connection);
    const order = Order.create("order-1", "customer-1");
    order.addLineItem("product-a", 1, Money.of(25.0, "USD"));
    order.setShippingAddress(validAddress);
    order.submit();
    await publisher.publishAll(order.uncommittedEvents);

    // Allow time for message delivery
    await new Promise((resolve) => setTimeout(resolve, 500));

    expect(receivedMessages).toHaveLength(1);
    const event = JSON.parse(receivedMessages[0].toString());
    expect(event.type).toBe("OrderSubmitted");
    expect(event.data.orderId).toBe("order-1");
  });
});
These tests are slower than unit tests -- typically 10-30 seconds for the initial container startup, then milliseconds per test. The startup cost amortizes across the test suite. In CI pipelines, containers start fresh for each run, ensuring test isolation.
Contract Testing with Pact
Integration tests verify that your code works with real infrastructure, but they do not verify that two services agree on their communication format. If the Order service expects the Payment service to return { "status": "confirmed" } but the Payment service actually returns { "paymentStatus": "CONFIRMED" }, both services pass their individual integration tests but fail when deployed together.
Contract testing with Pact solves this by defining and verifying contracts between service consumers and providers. The consumer defines what it expects from the provider. The provider verifies that it can fulfill those expectations. Neither side needs the other to be running during the test.
Consumer side (the Order service that calls the Payment service):
import { PactV4, MatchersV3 } from "@pact-foundation/pact";

const { string, integer } = MatchersV3;

const provider = new PactV4({
  consumer: "OrderService",
  provider: "PaymentService",
});

describe("Payment Service contract", () => {
  it("processes a payment successfully", async () => {
    await provider
      .addInteraction()
      .given("a valid payment method exists")
      .uponReceiving("a request to process a payment")
      .withRequest("POST", "/api/payments", (builder) => {
        builder
          .headers({ "Content-Type": "application/json" })
          .jsonBody({
            orderId: string("order-123"),
            amount: integer(9999),
            currency: string("USD"),
            paymentMethodId: string("pm-456"),
          });
      })
      .willRespondWith(201, (builder) => {
        builder.jsonBody({
          paymentId: string("pay-789"),
          status: string("confirmed"),
          processedAt: string("2025-07-28T10:00:00Z"),
          amount: integer(9999),
        });
      })
      .executeTest(async (mockServer) => {
        const client = new PaymentClient(mockServer.url);
        const result = await client.processPayment({
          orderId: "order-123",
          amount: 9999,
          currency: "USD",
          paymentMethodId: "pm-456",
        });

        expect(result.status).toBe("confirmed");
        expect(result.paymentId).toBeDefined();
      });
  });

  it("handles payment failure", async () => {
    await provider
      .addInteraction()
      .given("the payment method has insufficient funds")
      .uponReceiving("a request to process a payment with insufficient funds")
      .withRequest("POST", "/api/payments", (builder) => {
        builder
          .headers({ "Content-Type": "application/json" })
          .jsonBody({
            orderId: string("order-456"),
            amount: integer(999999),
            currency: string("USD"),
            paymentMethodId: string("pm-789"),
          });
      })
      .willRespondWith(422, (builder) => {
        builder.jsonBody({
          error: string("insufficient_funds"),
          message: string("Payment method has insufficient funds"),
        });
      })
      .executeTest(async (mockServer) => {
        const client = new PaymentClient(mockServer.url);
        await expect(
          client.processPayment({
            orderId: "order-456",
            amount: 999999,
            currency: "USD",
            paymentMethodId: "pm-789",
          })
        ).rejects.toThrow("insufficient_funds");
      });
  });
});
The consumer test generates a pact file -- a JSON contract describing the expected interactions. This file is shared with the provider (typically through a Pact Broker).
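Stripped of matcher metadata, the generated pact file looks roughly like this (abbreviated and simplified; exact layout depends on the Pact specification version in use):

```json
{
  "consumer": { "name": "OrderService" },
  "provider": { "name": "PaymentService" },
  "interactions": [
    {
      "description": "a request to process a payment",
      "providerStates": [{ "name": "a valid payment method exists" }],
      "request": {
        "method": "POST",
        "path": "/api/payments",
        "headers": { "Content-Type": "application/json" },
        "body": { "orderId": "order-123", "amount": 9999, "currency": "USD" }
      },
      "response": {
        "status": 201,
        "body": { "paymentId": "pay-789", "status": "confirmed" }
      }
    }
  ],
  "metadata": { "pactSpecification": { "version": "3.0.0" } }
}
```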
Provider side (the Payment service verifies it meets the contract):
import { Verifier } from "@pact-foundation/pact";

describe("Payment Service provider verification", () => {
  it("fulfills the OrderService contract", async () => {
    const verifier = new Verifier({
      providerBaseUrl: "http://localhost:3001",
      pactBrokerUrl: "https://pact-broker.internal.example.com",
      provider: "PaymentService",
      providerVersion: process.env.GIT_SHA,
      publishVerificationResult: true,
      stateHandlers: {
        "a valid payment method exists": async () => {
          // Set up test data so the provider can fulfill the interaction
          await testDb.insert("payment_methods", {
            id: "pm-456",
            type: "card",
            last4: "4242",
            status: "active",
          });
        },
        "the payment method has insufficient funds": async () => {
          await testDb.insert("payment_methods", {
            id: "pm-789",
            type: "card",
            last4: "0002",
            status: "active",
            balance: 0,
          });
        },
      },
    });

    await verifier.verifyProvider();
  });
});
The provider verification runs the actual provider service and replays the interactions defined by the consumer. If the provider's responses do not match the consumer's expectations, the test fails. This catches breaking changes before deployment.
Consumer-driven contracts are especially powerful in CI/CD pipelines. Before deploying a new version of the Payment service, the pipeline verifies it against all consumer contracts. If a change breaks the Order service's expectations, deployment is blocked.
End-to-End Testing Strategies
End-to-end tests validate complete workflows across multiple services. In distributed systems, they are expensive to write, slow to run, and fragile to maintain. But they catch integration issues that no other test level can: misconfigured service discovery, incorrect environment variables, broken deployment artifacts, and timing-dependent failures.
The key is to run end-to-end tests sparingly and focus them on critical business paths:
describe("Order placement end-to-end", () => {
  it("completes the full order lifecycle", async () => {
    // This test runs against a deployed staging environment
    const api = new ApiClient(process.env.STAGING_URL);

    // Step 1: Create a customer session
    const session = await api.authenticate(testCredentials);

    // Step 2: Add items to cart
    await api.post("/api/cart/items", {
      productId: "test-product-1",
      quantity: 2,
    });

    // Step 3: Submit order
    const order = await api.post("/api/orders", {
      shippingAddress: testAddress,
      paymentMethodId: "test-card-1",
    });
    expect(order.status).toBe("submitted");

    // Step 4: Wait for async processing to complete
    const finalOrder = await pollUntil(
      () => api.get(`/api/orders/${order.id}`),
      (o) => o.status === "paid",
      { timeout: 30000, interval: 1000 }
    );
    expect(finalOrder.status).toBe("paid");

    // Step 5: Verify inventory was updated
    const inventory = await api.get(`/api/inventory/test-product-1`);
    expect(inventory.available).toBeLessThan(inventory.previousAvailable);
  }, 60000);
});
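The pollUntil helper used above is not a library function; a minimal sketch of what it might look like:

```typescript
// Minimal sketch of a pollUntil helper: repeatedly invoke an async fetch
// until the predicate passes or the time budget runs out. Illustrative,
// not a standard library function.
interface PollOptions {
  timeout: number;  // total time budget in ms
  interval: number; // delay between attempts in ms
}

export async function pollUntil<T>(
  fetch: () => Promise<T>,
  predicate: (value: T) => boolean,
  { timeout, interval }: PollOptions
): Promise<T> {
  const deadline = Date.now() + timeout;
  let last: T = await fetch();
  while (!predicate(last)) {
    if (Date.now() >= deadline) {
      throw new Error(`pollUntil: condition not met within ${timeout}ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, interval));
    last = await fetch();
  }
  return last;
}
```

Polling against an explicit deadline keeps the test's intent clear even when asynchronous processing time varies between runs.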
Limit end-to-end tests to 5-10 critical paths. Run them in a dedicated staging environment that mirrors production. Accept that they will occasionally flake due to timing or infrastructure issues -- investigate flakes but do not let them block every deployment. Many teams run end-to-end tests on a schedule (nightly or hourly) rather than on every commit.
Chaos Engineering
All the testing strategies above validate that the system works when things go right. Chaos engineering validates that the system behaves acceptably when things go wrong. It is the practice of deliberately injecting failures into a running system to discover weaknesses before they cause outages.
The foundational tool is Netflix's Chaos Monkey, which randomly terminates production instances to ensure services handle instance failures gracefully. Modern chaos engineering has expanded far beyond random termination.
Common chaos experiments:
- Network latency injection. Add 500ms-2s latency to calls between two services. Does the caller time out gracefully? Do circuit breakers trigger? Does the user experience degrade gracefully or fail catastrophically?
- Network partition. Sever communication between two services entirely. Does the system detect the partition? Do queued messages replay correctly when connectivity is restored?
- Resource exhaustion. Fill disk space, exhaust database connection pools, or consume all available memory. Does the service shed load gracefully? Do health checks report the degraded state?
- Dependency failure. Kill a downstream service entirely. Does the upstream service return cached data, a degraded response, or an error? Does it recover when the dependency comes back?
Litmus is a popular open-source chaos engineering platform for Kubernetes:
# Litmus chaos experiment: inject network latency
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: payment-service-latency
spec:
  appinfo:
    appns: production
    applabel: "app=payment-service"
    appkind: deployment
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-network-latency
      spec:
        components:
          env:
            - name: NETWORK_INTERFACE
              value: eth0
            - name: TARGET_CONTAINER
              value: payment-service
            - name: NETWORK_LATENCY
              value: "2000" # 2 seconds
            - name: TOTAL_CHAOS_DURATION
              value: "300" # 5 minutes
            - name: DESTINATION_IPS
              value: "10.0.1.50" # Database IP
Start chaos engineering in non-production environments. Run controlled experiments with clear hypotheses: "We believe the Order service will return cached payment status when the Payment service has 2-second latency." Measure the actual behavior against the hypothesis. When you have confidence in the experiment design, graduate it to production -- during business hours, with the team watching, and with a kill switch ready.
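The fallback behavior a chaos experiment exercises can usually be validated in an ordinary test first, before the experiment ever runs. A minimal sketch, assuming a hypothetical client that races the dependency against a timeout and serves a last-known value on failure (all names invented for illustration):

```typescript
// Illustrative sketch: a client that degrades to its last known value when
// the downstream dependency is too slow. Names are hypothetical.
type Fetcher = () => Promise<string>;

class PaymentStatusClient {
  private cached: string | null = null;

  constructor(private fetcher: Fetcher, private timeoutMs: number) {}

  async getStatus(): Promise<string> {
    const timeout = new Promise<never>((_, reject) =>
      setTimeout(() => reject(new Error("timeout")), this.timeoutMs)
    );
    try {
      const status = await Promise.race([this.fetcher(), timeout]);
      this.cached = status; // refresh the cache on success
      return status;
    } catch {
      if (this.cached !== null) return this.cached; // degraded response
      throw new Error("payment status unavailable");
    }
  }
}
```

A test can then assert the degraded path directly: call once against a fast fake to warm the cache, swap in a slow fake, and verify the cached value is served instead of an error. The chaos experiment later confirms the same behavior holds with real network latency.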
Load Testing and Performance Validation
Distributed systems often behave differently under load than they do under normal conditions. Connection pools exhaust, garbage collection pauses become noticeable, message broker queues back up, and database queries that complete in 5ms at low concurrency take 500ms under contention.
Load testing should be part of your regular testing cadence, not a one-time event before launch. Tools like k6 make it straightforward to define load test scenarios:
import http from "k6/http";
import { check, sleep } from "k6";

export const options = {
  stages: [
    { duration: "2m", target: 100 }, // Ramp up to 100 users
    { duration: "5m", target: 100 }, // Hold at 100 users
    { duration: "2m", target: 500 }, // Ramp up to 500 users
    { duration: "5m", target: 500 }, // Hold at 500 users
    { duration: "2m", target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ["p(95)<500", "p(99)<1000"],
    http_req_failed: ["rate<0.01"],
  },
};

export default function () {
  // Simulate a realistic user journey
  const products = http.get("http://staging.example.com/api/products?limit=20");
  check(products, {
    "product list returns 200": (r) => r.status === 200,
    "product list returns items": (r) => JSON.parse(r.body).data.length > 0,
  });
  sleep(1);

  const productId = JSON.parse(products.body).data[0].id;
  const detail = http.get(`http://staging.example.com/api/products/${productId}`);
  check(detail, {
    "product detail returns 200": (r) => r.status === 200,
  });
  sleep(2);
}
Run load tests against a staging environment that matches production in configuration (even if not in scale). Track metrics over time: if p95 latency creeps up across releases, you have a regression even if no individual test fails. Integrate load test results into your CI dashboard so the team sees performance trends alongside functional test results.
Observability-Driven Testing
Traditional tests verify that the system produces the correct output for a given input. In distributed systems, you also need to verify that the system produces the correct telemetry -- because when production issues occur, your ability to diagnose them depends entirely on the quality of your logs, metrics, and traces.
Observability-driven testing adds assertions about telemetry to your test suite:
- Verify that a completed request produces a distributed trace spanning all involved services.
- Verify that error conditions emit structured log entries with correlation IDs.
- Verify that business metrics (orders placed, payments processed) are recorded correctly.
- Verify that health check endpoints accurately report service state after a failure scenario.
This is not about testing your monitoring tools. It is about ensuring that the instrumentation your team depends on for production operations is present and correct. A service that processes orders correctly but does not emit traces is effectively impossible to diagnose in production.
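None of this requires a telemetry backend in the test suite: point the instrumentation at an in-memory sink and assert on what was captured. A minimal sketch with a hypothetical structured logger (the logger and handler names are invented for illustration, not a specific library):

```typescript
// Illustrative sketch: assert that an error path emits a structured log
// entry carrying the request's correlation ID.
interface LogEntry {
  level: "info" | "error";
  message: string;
  correlationId: string;
}

// In-memory sink standing in for the real logging pipeline during tests.
class InMemoryLogger {
  entries: LogEntry[] = [];
  log(entry: LogEntry) {
    this.entries.push(entry);
  }
}

// Code under test: a handler that must log failures with the correlation
// ID so the request can be traced across services in production.
async function handlePayment(
  logger: InMemoryLogger,
  correlationId: string,
  charge: () => Promise<void>
): Promise<boolean> {
  try {
    await charge();
    return true;
  } catch (err) {
    logger.log({
      level: "error",
      message: `payment failed: ${(err as Error).message}`,
      correlationId,
    });
    return false;
  }
}
```

The test then asserts on `logger.entries` exactly as it would assert on a return value: a failing charge must produce one error-level entry with the original correlation ID. The same pattern works for traces via an in-memory span exporter.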
Building a Testing Strategy That Works
Testing distributed systems is more nuanced than testing monoliths, but the principles are clear. Unit tests validate business logic. Integration tests with Testcontainers validate infrastructure interactions. Contract tests with Pact validate inter-service agreements. End-to-end tests validate critical paths. Chaos engineering validates resilience. Load testing validates performance characteristics. And observability-driven testing validates your ability to operate the system in production.
No team implements all of these on day one. Start with unit tests and integration tests for each service. Add contract tests when you have more than two services communicating. Introduce chaos engineering once your services have circuit breakers, retries, and fallbacks worth validating. Each layer builds on the confidence provided by the layers below it.
If your team is building or scaling a distributed system and needs help establishing a testing strategy that gives you confidence in your deployments, Maranatha Technologies can help. We work with teams to design testing pipelines, implement contract testing, and establish chaos engineering practices. Visit our software architecture services or contact us to get started.