Load Testing

Part of: MPAC SmartPOS Cloud Platform - Product Requirements
Version: 2.0 | Last Updated: 2026-01-28


Overview

This document defines load testing scenarios, performance targets, and infrastructure requirements for the MPAC platform. Load testing validates the system's ability to handle production-scale traffic (15,000 RPS sustained, 400,000+ concurrent devices) and identifies performance bottlenecks before they impact users. Tests simulate realistic traffic patterns including authentication storms, peak transaction loads, payment processing, and settlement operations using tools like JMeter or Gatling.

Table of Contents

  • Load Testing Scenarios
  • Load Test Infrastructure
  • Performance Targets

Load Testing Scenarios

Purpose: Simulate production-scale loads to validate performance and identify bottlenecks.

1. Device Authentication Storm

Scenario: Mass device authentication during morning startup period.

Test Parameters:

  • Concurrent Devices: 10,000 devices authenticate simultaneously
  • Duration: All authentications must complete within 30 seconds
  • Target Metric: Token generation latency P95 < 2 seconds

Test Script (JMeter pseudocode):

```
Thread Group: 10,000 threads
Ramp-up: 10 seconds (1,000 devices/second)

HTTP Request: POST /auth/device/token
Body: {
  "grant_type": "client_credentials",
  "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
  "client_assertion": "${JWT_ASSERTION}"
}

Assertions:
- Response code: 200
- Response time P95 < 2000ms
- Token format valid
- All requests complete within 30s
```

Success Criteria:

  • ✅ All 10,000 devices receive valid tokens
  • ✅ P95 latency < 2 seconds
  • ✅ P99 latency < 5 seconds
  • ✅ Zero authentication failures
  • ✅ CPU usage < 80% on auth service instances
  • ✅ Database connection pool healthy (<80% utilization)
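The `${JWT_ASSERTION}` variable in the test script must be pre-populated with a distinct, freshly signed JWT per device. A stdlib-only sketch of that pre-generation step is below; it signs with HS256 purely for illustration (production device keys would normally use RS256 via a JWT library), and the issuer and audience values are assumptions, not the real endpoint contract.

```python
# Pre-generate per-device client assertions for the ${JWT_ASSERTION} variable.
# HS256 and the iss/aud values here are illustrative assumptions.
import base64
import hashlib
import hmac
import json
import time
import uuid

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def build_client_assertion(device_id: str, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    now = int(time.time())
    claims = {
        "iss": device_id,              # the device's client_id
        "sub": device_id,
        "aud": "https://api.mpac-cloud-dev.com/auth/device/token",
        "iat": now,
        "exp": now + 300,              # short-lived, per RFC 7523
        "jti": str(uuid.uuid4()),      # unique ID so replays can be rejected
    }
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())
    sig = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)

token = build_client_assertion("device-0001", b"test-secret")
print(token.count("."))  # 2: header.claims.signature
```

Generating the 10,000 assertions ahead of the run (rather than per-thread at runtime) keeps load-generator CPU out of the measured latency.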

2. Peak Hour Transaction Load

Scenario: Sustained high-volume transaction processing during lunch rush.

Test Parameters:

  • Request Rate: 15,000 RPS sustained for 1 hour
  • Traffic Mix:
    • 40% Order creation (POST /orders)
    • 30% Bill generation (POST /bills)
    • 20% Payment creation (POST /payments)
    • 10% Query operations (GET /orders/*, GET /bills/*)
  • Target Metric: P95 latency < 500ms for all endpoints

Test Script (Gatling pseudocode):

```scala
val scn = scenario("Peak Hour Traffic")
  .randomSwitch(
    40.0 -> exec(http("Create Order")
      .post("/orders")
      .body(StringBody(orderPayload))
      .check(status.is(201), responseTimeInMillis.lt(500))),

    30.0 -> exec(http("Generate Bill")
      .post("/bills")
      .body(StringBody(billPayload))
      .check(status.is(201), responseTimeInMillis.lt(500))),

    20.0 -> exec(http("Create Payment")
      .post("/payments")
      .body(StringBody(paymentPayload))
      .check(status.is(201), responseTimeInMillis.lt(500))),

    10.0 -> exec(http("Query Order")
      .get("/orders/${orderId}")
      .check(status.is(200), responseTimeInMillis.lt(200)))
  )

// Each virtual user performs a single weighted request, so injecting
// 15,000 users/second for one hour sustains 15,000 RPS.
setUp(
  scn.inject(constantUsersPerSec(15000).during(1.hour))
).protocols(http.baseUrl("https://api.mpac-cloud-dev.com"))
```

Success Criteria:

  • ✅ 15,000 RPS maintained for full hour (54 million requests total)
  • ✅ P95 latency < 500ms across all endpoints
  • ✅ P99 latency < 1000ms
  • ✅ Error rate < 0.1% (< 54,000 failed requests)
  • ✅ Database write throughput stable
  • ✅ No connection pool exhaustion
  • ✅ Auto-scaling triggers appropriately (if load exceeds 70% capacity)
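The totals quoted in the success criteria follow directly from the rate and duration, and are worth asserting in the test harness itself:

```python
# Sanity-check the request total and error budget quoted above.
rps = 15_000
duration_s = 60 * 60                          # 1 hour
total_requests = rps * duration_s             # 54,000,000
error_budget = int(total_requests * 0.001)    # 0.1% error rate -> 54,000
```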

3. Payment Gateway Load

Scenario: High-volume concurrent payment processing.

Test Parameters:

  • Concurrent Payments: 5,000 simultaneous payment requests
  • Payment Mix:
    • 50% QR code payments
    • 30% Card payments
    • 20% Cash payments
  • Target Metric: P95 latency < 200ms for payment creation

Test Script (JMeter pseudocode):

```
Thread Group: 5,000 threads
Ramp-up: 30 seconds

Random Controller:
  50% - QR Payment Request:
    POST /pgw/v1/payments
    Body: {"method": "qr", "amount": ${amount}, "bill_id": "${billId}"}

  30% - Card Payment Request:
    POST /pgw/v1/payments
    Body: {"method": "card", "amount": ${amount}, "bill_id": "${billId}"}

  20% - Cash Payment Request:
    POST /pgw/v1/payments
    Body: {"method": "cash", "amount": ${amount}, "bill_id": "${billId}"}

Assertions:
- Response code: 201
- Response time P95 < 200ms
- Payment ID returned
- Idempotency key handled correctly
```

Success Criteria:

  • ✅ All 5,000 payments processed successfully
  • ✅ P95 latency < 200ms
  • ✅ P99 latency < 500ms
  • ✅ Rate limiting enforced (1000/min per merchant, 10000/min global)
  • ✅ Idempotency working (duplicate requests handled correctly)
  • ✅ HMAC authentication validated on all requests
  • ✅ Redis cache hit rate > 80% for payment lookups
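Load generators must produce valid HMAC signatures and idempotency keys, or the gateway rejects the traffic before it exercises the payment path. The sketch below shows one way to build those headers; the header names, canonical-string layout, and key scheme are assumptions for illustration, not the actual mpac-pgw contract.

```python
# Illustrative HMAC request signing for the payment endpoints.
# Header names and canonical-string layout are assumptions.
import hashlib
import hmac
import json
import time
import uuid

def sign_payment_request(secret: bytes, method: str, path: str, body: dict) -> dict:
    payload = json.dumps(body, separators=(",", ":"), sort_keys=True)
    timestamp = str(int(time.time()))
    # Canonical string: METHOD \n PATH \n TIMESTAMP \n BODY
    message = "\n".join([method, path, timestamp, payload])
    signature = hmac.new(secret, message.encode(), hashlib.sha256).hexdigest()
    return {
        "Content-Type": "application/json",
        "X-Timestamp": timestamp,
        "X-Signature": signature,
        # Reusing the same key on retry should return the original
        # payment rather than creating a duplicate.
        "Idempotency-Key": str(uuid.uuid4()),
    }

headers = sign_payment_request(
    b"merchant-secret", "POST", "/pgw/v1/payments",
    {"method": "qr", "amount": 45.00, "bill_id": "bill-123"},
)
```

To verify the idempotency criterion, the test should deliberately replay a fixed `Idempotency-Key` and assert that exactly one payment record results.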

4. Settlement Spike

Scenario: End-of-day settlement generation across all stores.

Test Parameters:

  • Concurrent Settlements: 1,000 stores generate settlements simultaneously
  • Target Duration: All settlements complete within 5 minutes
  • Data Volume: Each settlement aggregates 100-500 transactions

Test Script (JMeter pseudocode):

```
Thread Group: 1,000 threads
Ramp-up: 60 seconds (stores trigger settlement over 1 minute)

HTTP Request: POST /settlements
Body: {
  "store_id": "${storeId}",
  "date": "2026-01-28",
  "auto_close": true
}

Assertions:
- Response code: 201
- Settlement ID returned
- Completion within 5 minutes
- Transaction totals match expected values
```

Success Criteria:

  • ✅ All 1,000 settlements complete within 5 minutes
  • ✅ No data inconsistencies (sum of payments matches settlement total)
  • ✅ P95 settlement generation time < 3 minutes
  • ✅ Database maintains consistency under heavy aggregation load
  • ✅ No deadlocks or lock timeouts
  • ✅ Background job queue processes settlements efficiently
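The "no data inconsistencies" criterion can be checked mechanically after the run: each settlement total must equal the sum of its underlying captured payments. A minimal sketch, with assumed record shapes and amounts in integer minor units (cents) to avoid float drift:

```python
# Post-test consistency check: settlement total == sum of captured payments.
# Record shapes and the "captured" status value are assumptions.
def settlement_is_consistent(settlement: dict, payments: list[dict]) -> bool:
    expected = sum(
        p["amount"]
        for p in payments
        if p["settlement_id"] == settlement["id"] and p["status"] == "captured"
    )
    return settlement["total_amount"] == expected

payments = [
    {"settlement_id": "s-1", "status": "captured", "amount": 4_500},
    {"settlement_id": "s-1", "status": "captured", "amount": 1_200},
    {"settlement_id": "s-1", "status": "voided",   "amount": 9_900},  # excluded
]
ok = settlement_is_consistent({"id": "s-1", "total_amount": 5_700}, payments)
print(ok)  # True
```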

Load Test Infrastructure

Purpose: Ensure load tests run in realistic, production-like environments.

Infrastructure Requirements

```yaml
Environment: Staging (Production-scale)
- Application Servers: Same instance types as production (e.g., AWS ECS Fargate)
- Database: Production-scale RDS instance (same specs)
- Cache: Production-scale Redis cluster
- Load Balancer: Application Load Balancer with production configuration
- Network: Same VPC and security group setup

Load Generation:
- JMeter/Gatling cluster: 10+ load generators
- Distributed across multiple AZs
- Realistic geographic distribution (if applicable)
```

Synthetic Data Generation

```python
# Generate realistic test data
def generate_realistic_load_data():
    """
    Create synthetic data that mirrors production patterns:
    - Peak hours: 9-11am, 12-2pm, 6-8pm
    - Transaction value distribution: Normal(mean=$45, std=$20)
    - Product mix: Follows actual sales data distribution
    - Device ID pool: 10,000 unique devices
    - Merchant/store hierarchy: Matches production distribution
    """
    # The generate_* helpers below are illustrative stubs, not shipped code.
    return {
        "devices": generate_devices(count=10000),
        "merchants": generate_merchants(count=500),
        "stores": generate_stores(count=1000),
        "products": generate_products(count=5000),
        "traffic_pattern": generate_traffic_curve(peak_rps=15000)
    }
```
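One of the distributions named above can be implemented directly. This sketch samples transaction amounts from Normal(mean=$45, std=$20), clamped to a floor so the generator never emits zero or negative values; the $1 floor is an assumption.

```python
# Sample transaction amounts from Normal(mean=45, std=20), clamped at a floor.
# The $1.00 floor is an illustrative assumption.
import random

def sample_transaction_amounts(count: int, mean: float = 45.0,
                               std: float = 20.0, floor: float = 1.0) -> list[float]:
    return [round(max(random.gauss(mean, std), floor), 2) for _ in range(count)]

amounts = sample_transaction_amounts(10_000)
```

Note that clamping shifts the sample mean slightly above $45; if exact moments matter, a truncated-normal sampler is the better choice.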

Monitoring During Tests

Real-time Dashboards (Grafana):

  • Request rate (RPS) by endpoint
  • Response latency (P50, P95, P99) by endpoint
  • Error rate percentage
  • Database connection pool utilization
  • Redis cache hit/miss rates
  • CPU and memory usage per service
  • Network throughput
  • Active WebSocket connections

Alerting Thresholds:

  • P95 latency > 500ms for 2 minutes
  • Error rate > 1% for 1 minute
  • CPU usage > 85% sustained for 3 minutes
  • Database connection pool > 90% for 2 minutes

Performance Regression Tracking

```bash
# Compare current test with baseline
compare_load_test_results \
  --baseline=results/2026-01-01/peak-load.json \
  --current=results/2026-01-28/peak-load.json \
  --threshold=10%  # Alert if degradation > 10%

# Output:
# ✅ P95 latency: 450ms (baseline 420ms) - +7.1%
# ⚠️  Throughput: 13,500 RPS (baseline 15,000 RPS) - -10.0%
# ❌ Error rate: 0.5% (baseline 0.1%) - +400%
```
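The comparison logic reduces to computing each metric's relative change and flagging it against a directional threshold. A minimal sketch, assuming a flat metric-to-value JSON layout and the higher/lower-is-worse rules shown:

```python
# Flag metrics that regressed past the threshold relative to baseline.
# The metric names and JSON layout are assumptions for the sketch.
HIGHER_IS_WORSE = {"p95_latency_ms", "error_rate"}

def find_regressions(baseline: dict, current: dict, threshold: float = 0.10) -> dict:
    regressions = {}
    for metric, base in baseline.items():
        delta = (current[metric] - base) / base        # relative change
        worse = delta > threshold if metric in HIGHER_IS_WORSE else -delta > threshold
        if worse:
            regressions[metric] = round(delta * 100, 1)  # percent change
    return regressions

bad = find_regressions(
    {"p95_latency_ms": 420, "throughput_rps": 15_000, "error_rate": 0.001},
    {"p95_latency_ms": 450, "throughput_rps": 13_500, "error_rate": 0.005},
)
print(bad)  # {'error_rate': 400.0} -- P95 +7.1% passes; throughput sits exactly at -10%
```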

Performance Targets

System-wide SLAs:

| Metric | Target | Critical Threshold |
| --- | --- | --- |
| Throughput | 15,000 RPS sustained | < 12,000 RPS |
| Latency (P95) | < 500ms | > 1000ms |
| Latency (P99) | < 1000ms | > 2000ms |
| Error Rate | < 0.1% | > 1% |
| Availability | 99.95% (4.38h/year downtime) | < 99.9% |

Service-specific Targets:

| Service | Operation | P95 Latency Target | P99 Latency Target |
| --- | --- | --- | --- |
| svc-portal | User authentication | < 500ms | < 1000ms |
| svc-portal | Device authentication | < 2000ms | < 5000ms |
| svc-smarttab | Order creation | < 300ms | < 600ms |
| svc-smarttab | Bill generation | < 400ms | < 800ms |
| mpac-pgw | Payment creation | < 200ms | < 500ms |
| mpac-pgw | Payment confirmation | < 300ms | < 600ms |


MPAC — MP-Solution Advanced Cloud Service