A payment system is a software and infrastructure solution that allows businesses to accept, process, and manage digital payments.

Payment System

Written by

Senior Staff at Google

Last revisited

July 12, 2025

💡

What is Stripe?

Stripe is a modern payment processing platform that enables businesses to accept online payments quickly and securely. It abstracts the complexities of payment infrastructure—such as integrating with card networks, handling authorization and settlement, and complying with financial regulations—into a set of developer-friendly APIs.

Basic Payment Workflow in Stripe

Payment Initiation: A user submits a payment request (e.g., credit card purchase) through a merchant's website or app.
Authorization: Stripe forwards the request to an external payment provider (e.g., Visa, Mastercard, or a bank) to authorize the payment. If approved, the funds are held (not immediately transferred).
Status Tracking: The merchant or user can check the status of the payment (e.g., authorized, declined, pending).
Batch Settlement: Once per day, Stripe aggregates authorized payments and submits them to the payment network for actual fund transfer and final settlement.

Functional Requirements

1. Merchant can submit a payment request

The system should allow a merchant to initiate a payment with amount, currency, and customer details.

2. Customer can complete the payment using a payment method

The system should accept and process a customer's payment method and trigger authorization with the external provider.

3. Merchant can view the current status of any payment

The system should return real-time or persisted status updates of a payment, including authorization outcome and settlement state.

4. System should settle all authorized payments once daily

The system should batch authorized payments at the end of each day and submit them for final settlement with the external provider.

Non-Functional Requirements

1. Security — No confidential data leak

The system must protect all sensitive payment data by enforcing industry-standard security measures: encrypting all traffic with TLS 1.2 or higher, encrypting stored data with AES-256, and enforcing role-based access control and audit logging. This is essential to prevent data breaches, meet regulatory requirements, and build trust with users and financial partners.

2. Strong Consistency — Exactly-once processing, no duplicate settlement

The system must ensure that each payment is processed exactly once and moves through a well-defined state machine (e.g., pending → authorized → settled) without duplication or loss. Inaccurate payment states can lead to financial loss, double charges, or regulatory issues, which are unacceptable in a production payment system.

3. Durability — ≥ 99.9999999% (nine 9s) data durability

All accepted payments must be written to a durable, replicated data store before acknowledging the client. A payment system cannot lose data due to crashes, network failures, or region outages; persistence guarantees are foundational to user trust and financial correctness.

4. High Scalability — ≥ 10,000 QPS sustained throughput

The system must be able to ingest and process at least 10,000 payments per second at steady load, with the ability to scale horizontally across stateless services and background workers. This ensures the system can handle real-world traffic spikes, global usage, and future growth without becoming a bottleneck.

Core Entities

1. Merchant

Represents a business integrating with the payment platform to accept payments.

2. Payment

Represents a payment request initiated by a merchant and fulfilled by a customer.

Field	Type	Description
`payment_id`	UUID	Unique ID for the payment
`merchant_id`	UUID	FK to the merchant receiving the payment
`amount`	Decimal	Payment amount
`currency`	String	ISO code (e.g., USD)
`status`	Enum	`PENDING`, `AUTHORIZED`, `DECLINED`, `SETTLED`
`customer_info`	Object	Optional customer metadata (e.g., ID, email, session token)
`created_at`	Timestamp	When the payment was initiated
`payment_info`	Object	Raw payment data, including credit card number, CVC, etc.

💡

[Common Pitfall]

Many candidates include fields like this in their Payment entity:

"payment_info": {
  "card_number": "4242 4242 4242 4242",
  "exp_month": 12,
  "exp_year": 2026,
  "cvc": "123"
}

This instantly implies that your backend:

Receives raw card data (puts your backend in full PCI DSS scope)
Might store or log that data somewhere (a major compliance breach)
Doesn’t properly separate data handling responsibilities

[Great Answer]

We use tokenization to avoid handling raw payment data. The customer submits their card directly to Stripe (or another vault provider), and our backend receives a token like pm_12345abc, which we store in the payments table. That ensures we never touch sensitive data directly. (See Deep Dive 1 for the updated schema.)

3. Transaction

Represents a discrete step taken to process a payment — such as authorization or settlement.

Field	Type	Description
`transaction_id`	UUID	Unique ID for the transaction
`payment_id`	UUID	FK to the payment this transaction belongs to
`type`	Enum	`AUTHORIZATION`, `SETTLEMENT`, `REFUND`
`status`	Enum	`SUCCESS`, `FAILED`, `RETRYING`
`provider_id`	String	External payment provider reference (e.g., auth code)
`actor_type`	Enum	`CUSTOMER`, `MERCHANT`, `SYSTEM` — who triggered the action
`response_code`	String	Provider result code or message
`processed_at`	Timestamp	Timestamp of when this transaction was processed

💡

Payment Terminology

Many candidates have no experience building payment systems, so here is a quick primer on the core concepts, using Stripe as a real-world analogy.

Payment = The "PaymentIntent" in Stripe — a complete business unit initiated by a merchant.
Transaction = A processing event like an authorization, a capture, or a refund attempt.

Relationship

Each Payment can have many Transactions (1 to N)
Transactions are always tied to a specific payment_id
Transactions allow retries, multiple steps, and eventual settlement tracking without modifying the original Payment intent
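
The 1-to-N relationship above can be sketched with two small record types. This is only an illustration: the class layout and the `record` helper are assumptions, not part of any real Stripe SDK, and only a subset of the entity fields is shown.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List


@dataclass
class Transaction:
    transaction_id: str
    payment_id: str  # every transaction is tied to exactly one payment
    type: str        # AUTHORIZATION, SETTLEMENT, or REFUND
    status: str      # SUCCESS, FAILED, or RETRYING


@dataclass
class Payment:
    payment_id: str
    merchant_id: str
    amount: Decimal
    currency: str
    status: str = "PENDING"
    transactions: List[Transaction] = field(default_factory=list)

    def record(self, txn: Transaction) -> None:
        """Append a processing step without touching the payment's own fields."""
        if txn.payment_id != self.payment_id:
            raise ValueError("transaction belongs to a different payment")
        self.transactions.append(txn)
```

Keeping transactions as an append-only list means a retry or a later settlement adds a row instead of rewriting the original intent.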

Comparison

Category	Payment	Transaction
Definition	A business-level intent to transfer funds	A processing-level record of an action taken on a payment
Granularity	One payment per customer/merchant event	One or more transactions per payment
Examples	"Charge $120 to customer for an order"	"Authorize $120", "Settle $120", "Refund $120"
Lifespan	Created once per user request	May have multiple entries (e.g., retries, stages, failures)
Status	`PENDING`, `AUTHORIZED`, `DECLINED`, `SETTLED`	`SUCCESS`, `FAILED`, `RETRYING`
Purpose	Represents the logical flow the merchant/user cares about	Captures the actual steps taken to fulfill the payment
Who cares?	Exposed to external clients (e.g., merchants, UIs, APIs)	Mostly internal (for processing, retries, audit, reconciliation)

API Design

1. Merchant submits a payment request

Endpoint:

POST /payments

Request Body:

{
  "merchant_id": "uuid",
  "amount": 120.50,
  "currency": "USD",
  "customer_info": {
    "customer_id": "cust_001",
    "email": "alice@example.com"
  }
}
    

Response:

{
  "payment_id": "uuid",
  "status": "PENDING",
  "created_at": "timestamp"
}
    

This endpoint registers a Payment intent. No funds are authorized at this stage — it simply prepares the system for a future customer confirmation.
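
A minimal sketch of what the handler behind this endpoint might do, assuming the amount arrives as a decimal string and that only a small illustrative set of currencies is accepted. The function name `create_payment` and the `SUPPORTED_CURRENCIES` set are hypothetical.

```python
import uuid
from datetime import datetime, timezone
from decimal import Decimal
from typing import Optional

SUPPORTED_CURRENCIES = {"USD", "EUR"}  # assumption: illustrative subset only


def create_payment(merchant_id: str, amount: str, currency: str,
                   customer_info: Optional[dict] = None) -> dict:
    """Validate a POST /payments body and build a new PENDING payment record."""
    value = Decimal(amount)  # parse from a string to avoid binary float rounding
    if value <= 0:
        raise ValueError("amount must be positive")
    if currency not in SUPPORTED_CURRENCIES:
        raise ValueError(f"unsupported currency: {currency}")
    return {
        "payment_id": str(uuid.uuid4()),
        "merchant_id": merchant_id,
        "amount": value,
        "currency": currency,
        "status": "PENDING",  # no funds are authorized at this stage
        "customer_info": customer_info or {},
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
```

Parsing the amount as `Decimal` rather than a float is a deliberate choice here: monetary values should not pick up binary rounding error on the way in.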

2. Customer completes the payment using a payment method

Endpoint:

POST /payments/{payment_id}/confirm

Request Body:

{
  "payment_method": {
    "type": "card",
    "card_number": "4242424242424242",
    "exp_month": 12,
    "exp_year": 2026,
    "cvc": "123"
  }
}
    

Response:

{
  "payment_id": "uuid",
  "status": "AUTHORIZED",
  "transaction_id": "uuid",
  "processed_at": "timestamp"
}
    

This endpoint triggers authorization with the external provider. It creates a Transaction of type AUTHORIZATION, updates the Payment status, and holds funds on success.

3. Merchant views the current status of any payment

Endpoint:

GET /payments/{payment_id}

Response:

{
  "payment_id": "uuid",
  "merchant_id": "uuid",
  "amount": 120.50,
  "currency": "USD",
  "status": "AUTHORIZED",
  "customer_info": {
    "customer_id": "cust_001",
    "email": "alice@example.com"
  },
  "created_at": "timestamp",
  "transactions": [
    {
      "transaction_id": "uuid",
      "type": "AUTHORIZATION",
      "status": "SUCCESS",
      "actor_type": "CUSTOMER",
      "processed_at": "timestamp"
    }
  ]
}
    

Returns the current state of a payment and its related transactions, useful for merchant dashboards and auditing.

4. System settles all authorized payments once daily

Internal Endpoint (scheduled job):

POST /settlement_batches

Request Body:

{
  "scheduled_time": "2025-07-08T00:00:00Z"
}
    

Response:

{
  "batch_id": "uuid",
  "payment_count": 12458,
  "submitted_at": "timestamp",
  "status": "SUBMITTED"
}
    

Triggered by a scheduled task, this endpoint collects all AUTHORIZED payments, submits them for final settlement with the external provider, and creates SETTLEMENT transactions. Payment statuses are updated to SETTLED.

State Machine

We use a state machine to model the business lifecycle of a payment and a transaction, from creation to completion or failure.

💡

A state machine is a powerful tool for managing how a payment moves through its lifecycle. Think of it as a clearly defined map of allowed states (such as pending, authorized, settled, or failed) and the valid transitions between them. Each event, such as a successful authorization or a failed settlement, triggers a state change.

Why do we highlight this in interviews? Because modeling your system with an explicit state machine reduces ambiguity, prevents invalid transitions—like settling an unapproved payment—and makes the system easier to debug and extend. It’s a sign that the candidate understands not just how to move data around, but how to design for correctness, resilience, and traceability at scale.

Payment States

State	Description
`CREATED`	Payment is created by merchant, pending customer confirmation
`AUTHORIZED`	Funds are held after external provider approves authorization
`SETTLED`	Funds are captured and transferred during batch settlement
`REFUNDED`	Full refund successfully processed
`PARTIALLY_REFUNDED`	Partial refund completed; payment is not fully returned
`REFUND_FAILED`	Refund attempt failed at external provider
`CANCELLED`	Payment was cancelled (manually or via timeout) before authorization
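
One way to make these states enforceable is an explicit transition table. The sketch below is an assumption: the source lists the states but not the allowed edges, so the edges here are a plausible reading rather than a specification. The `CREATED` state appears to correspond to the `PENDING` status used in the API examples.

```python
# Assumed transition map: the edges are illustrative, not taken from a spec.
ALLOWED_TRANSITIONS = {
    "CREATED": {"AUTHORIZED", "CANCELLED"},
    "AUTHORIZED": {"SETTLED"},
    "SETTLED": {"REFUNDED", "PARTIALLY_REFUNDED", "REFUND_FAILED"},
    "PARTIALLY_REFUNDED": {"REFUNDED", "PARTIALLY_REFUNDED", "REFUND_FAILED"},
    "REFUND_FAILED": {"REFUNDED", "PARTIALLY_REFUNDED"},
    "REFUNDED": set(),   # terminal
    "CANCELLED": set(),  # terminal
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in the transition map."""


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target
```

With this guard in place, an attempt such as `transition("CREATED", "SETTLED")` raises instead of silently settling an unapproved payment.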

Transaction States

State	Description
`INITIATED`	Transaction has been constructed and queued
`SUCCESS`	External provider processed the action successfully
`FAILED`	Attempt failed (e.g., network error, provider timeout); may be retried
`RETRYING`	System is re-attempting the failed transaction
`FAILED_FINAL`	All retry attempts exhausted or failure is terminal

Retry Policy Notes:

Retryable errors: timeouts, network failures, rate limits
Non-retryable errors: invalid card, expired method, fraud flags
Use exponential backoff and dead-letter queue for robust retry strategy
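
The retry policy can be sketched as two small functions: one that decides the next transaction state from an error code, and one that computes a jittered backoff delay. The error code strings, the attempt limit, and the delay parameters are assumptions chosen for illustration.

```python
import random

# Hypothetical error codes mirroring the retryable categories listed above.
RETRYABLE = {"timeout", "network_error", "rate_limited"}


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter: a random delay up to min(cap, base * 2**attempt)."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))


def next_transaction_state(error_code: str, attempt: int, max_attempts: int = 5) -> str:
    """Retry only retryable errors, and only while attempts remain."""
    if error_code in RETRYABLE and attempt < max_attempts:
        return "RETRYING"
    # Terminal errors and exhausted retries end here; a real system would
    # typically also route the message to a dead-letter queue.
    return "FAILED_FINAL"
```

Jitter spreads retries out so that many workers failing at the same moment do not all hit the provider again at the same instant.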

💡 Common Pitfall:

Many candidates describe high-level flows (e.g., “the payment is authorized, then settled”) without explicitly defining the state transitions or handling edge cases like retries, timeouts, or partial failures. In real-world systems — especially financial platforms like Stripe — understanding and clearly modeling state machines for entities like Payment and Transaction is essential. It demonstrates that you can build resilient, auditable systems with predictable behavior under both success and failure scenarios. Without this clarity, systems often suffer from hidden bugs like double settlements, lost refunds, or inconsistent status reporting — all of which are deal-breakers in production-grade payment infrastructure.

High Level Design

The system is composed of stateless API services, asynchronous background workers, persistent storage, and external integrations with third-party payment providers. Each functional requirement maps to a specific request flow that spans multiple components.

1. Merchant submits a payment request

When a merchant initiates a payment, the system validates the request, persists it as a Payment in the database, and prepares it for later customer confirmation.

Flow Description

Merchant sends a POST /payments request.
API Gateway authenticates the merchant via API key.
Payment Service:
- Validates merchant status and request data
- Persists a new Payment record with status PENDING
Returns a payment_id for the customer to use in the next step.

Flow Chart

2. Customer completes the payment using a payment method

Once the customer is ready to pay, they confirm the payment using the payment_id. The system triggers authorization by calling an external provider, updates state based on the outcome, and records a Transaction.

Flow Description

Customer submits POST /payments/{payment_id}/confirm with card or wallet details.
API validates the payment and forwards the request to an async processing queue.
Authorization Worker:
- Fetches the payment
- Calls the external provider for authorization
- Records the outcome as a Transaction (type AUTHORIZATION)
- Updates the Payment status to AUTHORIZED or DECLINED
Result is eventually available via GET /payments/{id}.
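
The worker steps above might look roughly like this. The `authorize` callable stands in for an external provider client and its return shape (`approved`, `code`) is an assumption; records are plain dicts to keep the sketch self-contained.

```python
import uuid
from datetime import datetime, timezone
from typing import Callable, List


def handle_authorization(payment: dict, transactions: List[dict],
                         authorize: Callable[[dict], dict]) -> dict:
    """Process one queued confirm request for a payment."""
    if payment["status"] != "PENDING":
        return payment  # already processed; ignore a duplicate delivery
    # Hypothetical provider call returning {"approved": bool, "code": str}.
    result = authorize(payment)
    transactions.append({
        "transaction_id": str(uuid.uuid4()),
        "payment_id": payment["payment_id"],
        "type": "AUTHORIZATION",
        "status": "SUCCESS" if result["approved"] else "FAILED",
        "response_code": result["code"],
        "actor_type": "CUSTOMER",
        "processed_at": datetime.now(timezone.utc).isoformat(),
    })
    payment["status"] = "AUTHORIZED" if result["approved"] else "DECLINED"
    return payment
```

The early return on a non-PENDING payment is a simple guard against the queue delivering the same message twice.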

3. Merchant views the current status of any payment

Merchants can check the real-time status of any payment, including authorization state and transaction history.

Flow Description

Merchant sends GET /payments/{payment_id}.
Payment Service fetches:
- The Payment record
- All related Transaction records (AUTHORIZATION, SETTLEMENT, etc.)
Response is assembled and returned.

4. System settles all authorized payments once daily

At the end of each day, a scheduled job aggregates all authorized payments into a batch and sends them to the external provider for final settlement.

Flow Description

Batch Processor triggers at scheduled time (e.g., midnight UTC).
Payment Service queries all AUTHORIZED payments.
Creates a settlement batch and sends it to the external provider.
For each settled payment:
- Records a Transaction (type SETTLEMENT)
- Updates the Payment status to SETTLED

Flow Chart

💡 Database Selection Strategy

1. Relational Database for Payments and Transactions

Use a relational database (e.g., PostgreSQL, MySQL, or Google Spanner) to store Payment and Transaction records.

Why:

Strong consistency for financial data
Referential integrity between Payment ↔ Merchant and Payment ↔ Transaction
Easy to query by status, merchant, or time for dashboards and settlement batches

Design Tips:

Index on (merchant_id, status, created_at) for efficient lookups
Partition by merchant_id or region for horizontal scalability
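
As a rough illustration of the schema and index, the sketch below uses SQLite purely as an in-process stand-in for PostgreSQL, MySQL, or Spanner. Column types, the index name, and the helper names are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE payments (
    payment_id  TEXT PRIMARY KEY,
    merchant_id TEXT NOT NULL,
    amount      TEXT NOT NULL,   -- decimal kept as text in this sketch
    currency    TEXT NOT NULL,
    status      TEXT NOT NULL,
    created_at  TEXT NOT NULL
);
CREATE TABLE transactions (
    transaction_id TEXT PRIMARY KEY,
    payment_id     TEXT NOT NULL REFERENCES payments(payment_id),
    type           TEXT NOT NULL,
    status         TEXT NOT NULL,
    processed_at   TEXT
);
CREATE INDEX idx_payments_merchant_status_created
    ON payments (merchant_id, status, created_at);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def authorized_for_merchant(conn: sqlite3.Connection, merchant_id: str) -> list:
    """Dashboard-style lookup that the composite index is meant to serve."""
    rows = conn.execute(
        "SELECT payment_id FROM payments "
        "WHERE merchant_id = ? AND status = 'AUTHORIZED' ORDER BY created_at",
        (merchant_id,),
    )
    return [row[0] for row in rows]
```

The column order in the index matches the query: equality filters on `merchant_id` and `status` first, then `created_at` for ordering.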

2. Kafka (or Pub/Sub) for Asynchronous Workflows

Use Kafka to decouple the synchronous API layer from background workflows like authorization and settlement.

Why:

Smooths out traffic spikes and retries
Allows workers to process at their own pace
Enables observability and dead-letter queueing for failure handling

Where it's used:

After POST /payments/{payment_id}/confirm (to trigger authorization)
Before daily settlement (to schedule batches)

Deep Dives

Deep Dive 1 - How Does the System Stay Secure?

In the high-level design, we describe a system where merchants submit payments, customers confirm them with payment methods, and backend workers authorize and settle payments using external providers. At a functional level, this works.

However, it’s not sufficient from a security standpoint. The current architecture leaves open two major risks that would be unacceptable in a production-grade payment platform like Stripe:

Handling raw payment information, which is tightly governed by PCI DSS, and
Allowing internal services to interact without proper authentication, which can lead to privilege escalation or lateral movement if compromised.

Risk 1: Raw Card Data Can Flow Through Application Infrastructure

In the current flow, the customer submits a card to POST /payments/{id}/confirm. This implies the payment method (e.g., card number, CVC) flows through your public API, service layer, queue, and worker — even if only briefly.

This is a problem because handling raw card data directly brings every service that touches it into PCI DSS scope, which means costly audits, strict isolation, and compliance checks. Even worse, if card details accidentally end up in logs, error dumps, or unprotected Kafka topics due to a misconfiguration, the result could be a serious regulatory breach that exposes sensitive information and puts the company at legal and financial risk.

✅ Solution: Vault-Based Tokenization (Don’t Handle Cards Directly)

Use tokenization: offload card handling to a vaulted, PCI-compliant provider like Stripe, Braintree, or a custom HSM-backed vault.

💡 Tokenization in Payment Systems

Tokenization is the process of replacing sensitive payment data (like credit card numbers) with a non-sensitive, unique reference token that can be safely stored and transmitted across your system.

Why it matters:

By never storing or even transmitting raw card data through your backend, you dramatically reduce PCI DSS scope and lower the risk of security breaches.

How it works in practice:

The customer enters their card details using a frontend SDK (e.g., Stripe.js, Braintree Hosted Fields).
These SDKs send the card data directly to the payment provider’s vault — bypassing your backend entirely.
The vault returns a payment method token (e.g., pm_12345abc).
This token is sent to your backend, stored in your payments table, and used in subsequent API calls.

Example:

Instead of processing 4242 4242 4242 4242, your backend only sees:

"payment_method": "pm_1HxYzwFbNK12345abc"

This token can be safely passed through databases, logs, and queues without violating PCI compliance.
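
To make the idea concrete, here is a toy in-memory vault. It is only a stand-in for a real PCI-compliant provider: the `TokenVault` class and its methods are hypothetical, and a real vault would encrypt stored data and run in an isolated environment.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a PCI-compliant vault (illustration only)."""

    def __init__(self) -> None:
        self._cards = {}  # token -> card number; a real vault encrypts this

    def tokenize(self, card_number: str) -> str:
        """Swap a card number for a random token that reveals nothing about it."""
        token = "pm_" + secrets.token_hex(12)
        self._cards[token] = card_number.replace(" ", "")
        return token

    def last4(self, token: str) -> str:
        """Expose only the non-sensitive last four digits, e.g. for receipts."""
        return self._cards[token][-4:]
```

Because the token is random rather than derived from the card number, it cannot be reversed outside the vault.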

Updated Payment Authorization Flow with Tokenization

Customer enters card details via frontend SDK
- Example: Stripe.js, Braintree Hosted Fields
- The SDK securely collects raw card details in the browser (never hitting your backend)
SDK sends the card details directly to the payment provider’s vault
- This bypasses your backend infrastructure entirely
- Stripe or another PCI-compliant vault handles encryption, validation, and token issuance
Vault returns a payment_method_token (e.g., pm_abc123)
- This token is a safe reference to the actual payment method stored in the provider’s secure system
Browser sends the token to your backend
- POST /payments/{payment_id}/confirm with body containing only the token
- No raw payment data touches your application infrastructure
Your backend uses the token to authorize payment via provider APIs
- For example, Stripe's POST /v1/payment_intents/{id}/confirm endpoint

Production Tip: Even when using your own vault, isolate the tokenization service in a separate VPC, restrict access by service identity, and log only token references.

Risk 2: Internal Services Trust Each Other Implicitly

In the high-level architecture, background workers and internal services communicate with each other and external providers (e.g., auth-worker → provider, cron job → settlement API) but there is no mention of internal authentication or authorization boundaries. These services operate as if they fully trust one another.

In a real-world microservices environment, trust boundaries must be enforced. If one internal service is compromised — either due to a software bug or a security incident — it could impersonate another service, call sensitive internal APIs, or write directly to shared databases.

Without internal authentication and access control:

Any service could issue sensitive operations like refunds or settlements.
A misconfigured worker could write corrupted data into payments or transactions.
Attackers exploiting one service could move laterally across your infrastructure.

✅ Solution: Mutual TLS and Service Identity-Based Authorization

Introduce mutual TLS (mTLS) and per-service identity propagation to harden internal service communication.

How It Works in Practice

Every internal service (like auth-worker, payment-api, or batch-processor) is given a unique identity, similar to a verified name tag.

When one service wants to talk to another (e.g., auth-worker wants to update a payment), it must:

Prove who it is using a secure certificate — this is like showing an official badge.
The receiving service (e.g., payment-db or settlement-api) checks the badge and verifies:
- Is this service really who it says it is?
- Is it allowed to perform this action?

This process is enforced by the system automatically using mutual TLS (mTLS) — where both sides of the connection authenticate each other, and all traffic is encrypted.

For example:

auth-worker shows its identity: auth-worker.prod.internal
payment-db only allows writes from trusted services like auth-worker
If an unknown or untrusted service tries to access it, the connection is blocked

This setup prevents internal services from impersonating each other and stops any compromised part of the system from moving laterally or accessing sensitive data.
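
A minimal sketch of the two halves of this setup, using Python's standard `ssl` module: a server-side TLS context that requires a client certificate, plus an identity-to-action allow-list. The `POLICY` contents, the action names, and the `.prod.internal` identities other than auth-worker's are assumptions. In practice a service mesh or sidecar often handles this layer.

```python
import ssl
from typing import Optional

# Hypothetical policy: which service identity may perform which action.
POLICY = {
    "auth-worker.prod.internal": {"payments:write"},
    "batch-processor.prod.internal": {"payments:write", "settlements:create"},
    "payment-api.prod.internal": {"payments:read", "payments:write"},
}


def is_allowed(service_identity: str, action: str) -> bool:
    """Authorization check applied after the peer certificate has been verified."""
    return action in POLICY.get(service_identity, set())


def server_mtls_context(certfile: Optional[str] = None,
                        keyfile: Optional[str] = None,
                        cafile: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side context that demands a valid client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # clients without a valid cert are rejected
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)  # this service's own identity
    if cafile:
        ctx.load_verify_locations(cafile)       # internal CA that signs service certs
    return ctx
```

Authentication (the certificate check) and authorization (the policy lookup) are kept separate on purpose: a valid certificate proves who the caller is, not what it may do.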

Updated Payment Entity Table

Recall that the “Core Entities” section introduced the Payment entity and flagged it for further discussion. We can now update it accordingly:

Field	Type	Description
`payment_id`	UUID	Unique ID for the payment
`merchant_id`	UUID	Foreign key to the merchant receiving the payment
`amount`	Decimal	Total amount of the payment
`currency`	String	ISO 4217 currency code (e.g., `USD`, `EUR`)
`status`	Enum	Payment status: `PENDING`, `AUTHORIZED`, `DECLINED`, `SETTLED`, etc.
`payment_method_token`	String	Tokenized reference to the customer's payment method (e.g., `pm_abc123`)
`customer_info`	Object	Optional customer metadata (e.g., customer ID, email, session ID)
`created_at`	Timestamp	Timestamp when the payment was initiated

✅ Added payment_method_token: this replaces raw card data
❌ Removed payment_info (card number, CVC, etc.): not safe to store and a common interview red flag

Updated Diagram

Deep Dive 2 – How Does the System Guarantee Exactly-Once Processing?

In the high-level design, we allow merchants to initiate payments, customers to confirm them, and workers to interact with external payment providers for authorization and settlement. But that design — while functionally complete — falls short when it comes to correctness guarantees, especially around exactly-once execution.

This is critical in payment systems. Unlike many domains, where duplicate writes or retries are acceptable, double authorizing or settling a payment is a catastrophic bug.

Where the Current Design Falls Short

In a naive system, retries, crashes, or race conditions can cause:

Duplicate authorization charges to a customer card
Payments marked as AUTHORIZED even when the external provider call failed
A payment settled twice because the batch processor was re-run
Conflicting state updates due to concurrent workers

None of this is acceptable. Stripe, PayPal, and similar systems spend enormous engineering effort ensuring that every payment moves through a strictly valid state machine — and only once per transition.

Why This Is a Problem

Modern systems use retries and asynchronous processing heavily. That’s good for availability, but without safeguards, retries can re-trigger real-world actions like charging a card. The following issues arise:

Stateless APIs may re-authorize a card if the response is lost or retried.
Worker crashes before persisting a Transaction may leave the system in limbo.
Settlement workers scanning the same AUTHORIZED rows may run twice due to timeouts, leading to double settlement.

These bugs are hard to detect, cause financial loss, and undermine user trust.

✅ Solution: Idempotency, Transactional Outbox, and State Enforcement

1. Use Idempotency Keys at API Boundaries

Each request to confirm a payment (e.g., POST /payments/{id}/confirm) should include an idempotency key — a unique identifier for this intent.

Store the key with a hash of the input parameters and the final result.
If the same key is reused, return the cached result — do not reprocess.
Stripe does this with a client-provided Idempotency-Key header.
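
The key-plus-hash bookkeeping can be sketched with an in-memory store. The class and exception names are hypothetical. A production version would persist records in the database and make the check-and-store step atomic, for example with a unique constraint on the key.

```python
import hashlib
import json
from typing import Callable


class IdempotencyConflict(Exception):
    """The same key was reused with different request parameters."""


class IdempotencyStore:
    def __init__(self) -> None:
        self._records = {}  # key -> (request_hash, cached result)

    @staticmethod
    def _hash(params: dict) -> str:
        # Canonical JSON so that key order does not change the hash.
        return hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()

    def execute(self, key: str, params: dict, handler: Callable[[dict], dict]) -> dict:
        """Run handler at most once per key; replay the cached result on retries."""
        request_hash = self._hash(params)
        if key in self._records:
            stored_hash, result = self._records[key]
            if stored_hash != request_hash:
                raise IdempotencyConflict("key reused with different parameters")
            return result
        result = handler(params)
        self._records[key] = (request_hash, result)
        return result
```

Storing the parameter hash alongside the result lets the server reject a reused key that carries a different request, instead of returning a misleading cached response.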

2. Implement Transactional Outbox Pattern

Don’t directly call the external provider from your DB transaction. Instead:

Write a message (e.g., "authorize this payment") into an outbox table.
Commit this message in the same transaction that updates payment status.
A worker polls this table and processes it asynchronously.

This avoids “write succeeded, provider call failed” and vice versa — both of which break consistency.
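
A small sketch of the pattern, again with SQLite standing in for the production database. The table layout is simplified, and the function names and the choice to record the payment method token in the same transaction are assumptions.

```python
import json
import sqlite3
import uuid
from typing import Callable


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE payments (
            payment_id TEXT PRIMARY KEY,
            status TEXT NOT NULL,
            payment_method_token TEXT);
        CREATE TABLE outbox (
            outbox_id TEXT PRIMARY KEY,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'PENDING');
    """)
    return conn


def confirm_payment(conn: sqlite3.Connection, payment_id: str, token: str) -> None:
    """Write the outbox message and the payment update in one transaction."""
    with conn:  # commits on success, rolls back both writes on any exception
        conn.execute(
            "INSERT INTO outbox (outbox_id, event_type, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "AUTHORIZE_PAYMENT", json.dumps({"payment_id": payment_id})),
        )
        cur = conn.execute(
            "UPDATE payments SET payment_method_token = ? "
            "WHERE payment_id = ? AND status = 'PENDING'",
            (token, payment_id),
        )
        if cur.rowcount != 1:
            raise LookupError(f"no PENDING payment {payment_id}")


def drain_outbox(conn: sqlite3.Connection, publish: Callable[[str, dict], None]) -> int:
    """Poll pending messages and publish each one (at-least-once delivery)."""
    rows = conn.execute(
        "SELECT outbox_id, event_type, payload FROM outbox WHERE status = 'PENDING'"
    ).fetchall()
    for outbox_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))  # if this raises, the row stays PENDING
        with conn:
            conn.execute("UPDATE outbox SET status = 'PUBLISHED' WHERE outbox_id = ?",
                         (outbox_id,))
    return len(rows)
```

If the worker crashes between publishing and marking the row, the message is published again on the next poll, which is why the downstream consumer still needs to be idempotent.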

3. Enforce State Transitions at the Database Level

Add guards around updates to ensure only valid transitions occur:

From PENDING → AUTHORIZED
From AUTHORIZED → SETTLED
Never allow reverting or skipping intermediate states

Use optimistic locking (e.g., version numbers or timestamps) to reject concurrent conflicting updates.
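
A hedged sketch of a guarded update with a version column, using SQLite as a stand-in. The `version` column, the `VALID_TRANSITIONS` set (including the PENDING to DECLINED edge), and the function name are assumptions.

```python
import sqlite3

# Assumed set of legal edges; PENDING -> DECLINED is inferred from the status enum.
VALID_TRANSITIONS = {
    ("PENDING", "AUTHORIZED"),
    ("PENDING", "DECLINED"),
    ("AUTHORIZED", "SETTLED"),
}

SCHEMA = """
CREATE TABLE payments (
    payment_id TEXT PRIMARY KEY,
    status     TEXT NOT NULL,
    version    INTEGER NOT NULL DEFAULT 0
)
"""


def transition(conn: sqlite3.Connection, payment_id: str,
               from_status: str, to_status: str, expected_version: int) -> bool:
    """Apply a transition only if the row is still in the state the caller read."""
    if (from_status, to_status) not in VALID_TRANSITIONS:
        raise ValueError(f"{from_status} -> {to_status} is not a valid transition")
    with conn:
        cur = conn.execute(
            "UPDATE payments SET status = ?, version = version + 1 "
            "WHERE payment_id = ? AND status = ? AND version = ?",
            (to_status, payment_id, from_status, expected_version),
        )
    # rowcount is 0 when another worker changed the row first.
    return cur.rowcount == 1
```

A caller that gets `False` back should re-read the row and decide whether its work is still needed, rather than retrying blindly.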

4. De-duplicate Settlements with Batch Token

Every settlement batch should have a unique batch ID and each payment should record which batch settled it. This prevents re-processing if the batch is re-run or partially failed.
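
One possible shape for this, sketched with SQLite: a batch first claims unclaimed AUTHORIZED payments by stamping its batch ID on them, and only claimed rows are later marked SETTLED. The two-step split and the function names are assumptions.

```python
import sqlite3
from typing import List


def claim_batch(conn: sqlite3.Connection, batch_id: str) -> List[str]:
    """Stamp batch_id on AUTHORIZED payments that no batch has claimed yet."""
    with conn:
        conn.execute(
            "UPDATE payments SET batch_id = ? "
            "WHERE status = 'AUTHORIZED' AND batch_id IS NULL",
            (batch_id,),
        )
        rows = conn.execute(
            "SELECT payment_id FROM payments "
            "WHERE batch_id = ? AND status = 'AUTHORIZED'",
            (batch_id,),
        ).fetchall()
    return [row[0] for row in rows]


def mark_settled(conn: sqlite3.Connection, batch_id: str) -> int:
    """After the provider accepts the batch, settle exactly the claimed rows."""
    with conn:
        cur = conn.execute(
            "UPDATE payments SET status = 'SETTLED' "
            "WHERE batch_id = ? AND status = 'AUTHORIZED'",
            (batch_id,),
        )
    return cur.rowcount
```

If the job is re-run under a new batch ID, `claim_batch` finds nothing left to claim, so no payment is submitted twice. Re-running with the same batch ID returns the same claimed set, which lets a partially failed batch be resumed.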

The table below summarizes each design gap and its fix.

Design Gap	Risk	Fix	Outcome
Payment authorized twice on retry	Double charge	Idempotency keys	Safe retry of payment confirmation
Worker crashes after provider success	Payment state out of sync	Transactional outbox	State change and side effects always consistent
Settlement job runs twice	Double settlement	Batch token + status check	Only settled once per payment
Concurrent workers update same record	Race conditions, invalid states	Optimistic concurrency + DB-level state guard	Predictable, auditable state transitions

Updated Payment Entity Table

Field	Type	Description
`payment_id`	UUID	Unique ID for the payment
`merchant_id`	UUID	Foreign key to the merchant
`amount`	Decimal	Payment amount
`currency`	String	ISO 4217 code
`status`	Enum	`PENDING`, `AUTHORIZED`, `DECLINED`, `SETTLED`
`payment_method_token`	String	Tokenized reference to payment method
`customer_info`	Object	Customer metadata
`batch_id`	UUID (nullable)	Assigned during daily settlement; used for exactly-once batch tracking
`created_at`	Timestamp	Creation time

✅ Added batch_id to track settlement inclusion and prevent reprocessing in repeated batch runs.

Updated Transaction Entity Table

Field	Type	Description
`transaction_id`	UUID	Unique ID
`payment_id`	UUID	Foreign key
`type`	Enum	`AUTHORIZATION`, `SETTLEMENT`, `REFUND`
`status`	Enum	`INITIATED`, `SUCCESS`, `FAILED`, `RETRYING`, `FAILED_FINAL`
`provider_id`	String	External reference (e.g., Stripe auth ID)
`actor_type`	Enum	`CUSTOMER`, `MERCHANT`, `SYSTEM`
`response_code`	String	External result info
`processed_at`	Timestamp	Time of transaction
`idempotency_key`	String	🆕 Used to detect duplicate requests

✅ Added idempotency_key to allow the system to safely deduplicate API retries.

New Table: Outbox

Field	Type	Description
`outbox_id`	UUID	Unique record ID
`event_type`	String	e.g., `AUTHORIZE_PAYMENT`, `SETTLE_PAYMENT`
`payload`	JSON	Serialized message body
`status`	Enum	`PENDING`, `PUBLISHED`, `FAILED`
`created_at`	Timestamp	When the event was created
`published_at`	Timestamp (nullable)	When the event was dispatched

✅ Added to decouple DB writes from external side effects, ensuring at-least-once delivery and enabling retries without duplication.

Updated Diagram

Deep Dive 3 - Payment System with Webhook

Modern payment systems like Stripe support webhooks to enable event-driven integrations for merchants. While APIs allow merchants to poll payment status, this is inefficient for near-real-time use cases like triggering order fulfillment or updating accounting systems. Webhooks solve this by pushing structured event notifications to merchants as critical state changes occur in the payment lifecycle.

💡 Why We Don't Use Distributed Transactions — And Why That's OK

Some payment system designs — especially in monoliths or enterprise environments — introduce distributed transactions (e.g., using 2PC or XA protocols) to ensure atomicity between local database writes and external payment provider calls (like Stripe). Their goal is to guarantee "either everything commits or nothing does."

But in modern cloud-native architectures (like ours), this approach is rarely necessary and often harmful. Instead, we follow a more scalable and resilient pattern:

Use a transactional outbox to persist internal state and enqueue a downstream task in a single DB transaction.
Process external actions (e.g., authorization, settlement, webhook delivery) asynchronously from the outbox — with retries, idempotency, and state machine guards.

This design decouples internal and external systems without sacrificing correctness. It works seamlessly with Stripe's idempotent APIs. And it avoids the complexity, brittleness, and tight coupling of distributed transactions.

For our existing payment system here, webhook publishing is triggered after core payment events (like authorization or settlement). Rather than synchronously notifying merchants during payment processing, the system uses an event outbox model for durability and decoupling:

After a critical event (e.g., status = SETTLED), the Payment Service writes a webhook_event record into a persistent table.
A Webhook Dispatcher worker reads undelivered events and sends signed POST requests to the merchant’s registered webhook URL.
Success responses (2xx) mark the event as DELIVERED. Failures are retried with exponential backoff.
If delivery fails persistently, events are sent to a dead-letter queue or marked FAILED.

We use the Transactional Outbox Pattern to ensure state updates (e.g., updating payment to SETTLED) and webhook event creation happen atomically — guaranteeing exactly-once webhook generation, even if the system crashes.

Updated webhook_events Table

Field	Type	Description
`event_id`	UUID	Unique identifier
`merchant_id`	UUID	FK to the merchant receiving this event
`event_type`	String	e.g., `payment.settled`, `payment.authorized`
`payload`	JSON	Structured event body (e.g., includes `payment_id`, status, amount)
`status`	Enum	`PENDING`, `DELIVERED`, `FAILED`
`attempts`	Integer	Retry attempt count
`last_attempt_at`	Timestamp	Last retry timestamp
`created_at`	Timestamp	Event creation time

Optional: add a merchant_webhook_config table to store endpoint URL, secret for HMAC signing, etc.
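
The HMAC signing mentioned above could be sketched as follows, using only the standard library. The function names and the choice of SHA-256 over a canonical JSON body are assumptions; a real scheme typically also signs a timestamp to limit replay.

```python
import hashlib
import hmac
import json
from typing import Tuple


def sign_payload(secret: bytes, payload: dict) -> Tuple[bytes, str]:
    """Serialize the event once and sign those exact bytes with the merchant's secret."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body, signature


def verify_signature(secret: bytes, body: bytes, signature: str) -> bool:
    """Merchant-side check; compare_digest avoids leaking timing information."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

The dispatcher would send `body` as the POST payload and the signature in a header; the merchant recomputes the HMAC over the raw bytes it received before trusting the event.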

Updated Diagram

Final Thoughts – Why Many Candidates Fail This System Design Interview

Designing a Stripe-like payment system is not just about drawing boxes and arrows — it’s about building trust, ensuring correctness, and handling failure gracefully in a financial environment where mistakes cost real money.

Many candidates fail this interview because they stop at the happy path: a merchant submits a payment, a customer confirms it, and everything settles. But what interviewers look for is how you handle edge cases, enforce exactly-once guarantees, protect sensitive data, and deliver webhook events reliably. Weak answers skip security (e.g., raw card handling), ignore retries and duplicates, or fail to define clear state transitions and database integrity. Others forget that real-world systems need observability, failure recovery, and scalability beyond 10K QPS. What separates a great candidate is not just technical knowledge, but their ability to design a system that is robust, auditable, and production-ready — end to end.
