Use MPS Logging SDK for Payment App Logging
Status: Accepted
Date: 2025-10-29
Authors: CloudSky Architecture Team
Version: 1.0
1. Context
Card Service Co. requires a method to output logs from their Android payment application for troubleshooting purposes. The logging solution faces several constraints:
- Security & Compliance: The application handles sensitive payment data (PAN, CVV) and must adhere strictly to PCI-DSS requirements.
- Reliability: Troubleshooting often requires logs from the exact moment of failure, even if the network was unstable.
- Crash Visibility: We need to capture unhandled exceptions (crashes) automatically, including stack traces and device state, to diagnose production stability issues.
- Development Efficiency: Card Service Co. wants to minimize the effort required to integrate and maintain the logging mechanism, so the solution should not require them to build complex authentication, buffering, or encryption logic.
2. Decision
We will provide a pre-built MPS Logging SDK to Card Service Co., which they must integrate into their Android application.
This SDK will serve as the exclusive mechanism for shipping application logs to MPAC Obs (the logging system).
Rationale
We selected the SDK Integration approach over File Output or Custom IPC for the following reasons:
- Encapsulated Complexity: The SDK abstracts away the complexities of authentication (JWT refresh), network resilience (exponential backoff), and buffering.
- Defense-in-Depth Security: The SDK includes client-side sanitization that automatically redacts PCI sensitive data (PAN, CVV) before it leaves the device. This provides a safety net against accidental developer logging.
- Standardization: The SDK wraps OpenTelemetry, ensuring logs are emitted in a standard format compatible with our backend observability stack (Loki/Grafana).
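To make the "exponential backoff" item concrete, the following is a minimal sketch of how retry delays of this kind are commonly computed. The function names, the 1-second base delay, and the 60-second cap are illustrative assumptions, not values taken from the SDK.

```kotlin
import kotlin.math.min
import kotlin.random.Random

/**
 * Delay before retry number [attempt] (0-based): the base delay doubles on
 * every attempt and is capped at [maxDelayMs]. All defaults are hypothetical.
 */
fun backoffDelayMs(attempt: Int, baseDelayMs: Long = 1_000, maxDelayMs: Long = 60_000): Long {
    require(attempt >= 0) { "attempt must be non-negative" }
    // Clamp the shift so the multiplication stays well inside the Long range.
    val factor = 1L shl min(attempt, 20)
    return min(baseDelayMs * factor, maxDelayMs)
}

/** "Full jitter": a random delay between 0 and the backoff value, so devices do not retry in lockstep. */
fun jitteredDelayMs(attempt: Int, random: Random = Random.Default): Long =
    random.nextLong(backoffDelayMs(attempt) + 1)
```

With these defaults the un-jittered delays run 1 s, 2 s, 4 s, 8 s and so on until they reach the 60 s cap. Adding jitter is a common choice when many terminals may lose and regain connectivity at the same moment.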
3. Technical Specification
3.1 Architecture Overview
The "Remote Logging System" data flow is as follows: Android App -> MPS Logging SDK -> [HTTPS/OTLP] -> mpac-obs (observability) Gateway -> Loki -> Grafana
3.2 Integration Guide
3.2.1 SDK Compatibility Requirements
| Requirement | Minimum Version | Recommended |
|---|---|---|
| Android SDK | API 24 (Android 7.0) | API 26+ |
| Kotlin | 1.8.0 | 1.9.0+ |
| Gradle Plugin | 7.4.0 | 8.0+ |
| Java | 11 | 17 |
Dependencies:
- OpenTelemetry SDK: 1.32.0
- OkHttp: 4.12.0
- Protobuf: 3.25.0
3.2.2 SDK Dependency (Gradle)
Gradle
dependencies {
    implementation 'com.mpac:logging-sdk:1.0.0'
}
3.2.3 SDK Initialization (Application onCreate)
Kotlin
import android.app.Application
import android.util.Base64
import android.util.Log
import com.mpac.logging.*
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock
import org.json.JSONObject

class PaymentApplication : Application() {
    // Thread-safe JWT token management
    private val tokenLock = ReentrantLock()
    @Volatile private var cachedToken: String? = null
    private val tmsClient by lazy { TmsApiClient(this) }

    override fun onCreate() {
        super.onCreate()
        // Initialize MPS Logging SDK with production configuration
        MpsLogger.init(
            context = this,
            config = LogConfig(
                endpoint = "https://otlp.obs.mpac-cloud.com/v1/logs",
                minLevel = LogLevel.INFO,
                // JWT Token Provider (CRITICAL for 24+ hour uptime)
                // Called from SDK's background thread - safe for network calls
                tokenProvider = object : JwtTokenProvider {
                    override fun getToken(): String = tokenLock.withLock {
                        // Reuse the cached token while it is valid; otherwise refresh and cache
                        cachedToken?.takeIf { !isTokenExpired(it) }
                            ?: tmsClient.refreshDeviceToken().also { cachedToken = it }
                    }
                },
                // LZ4 compression (30-50% faster than gzip)
                compressionAlgorithm = CompressionAlgorithm.LZ4,
                // Error callback for programmatic error handling
                errorCallback = object : LoggerCallback {
                    override fun onError(errorCode: LoggerErrorCode, message: String, throwable: Throwable?) {
                        when (errorCode) {
                            LoggerErrorCode.AUTHENTICATION_FAILED -> {
                                // Token expired, provider will refresh automatically
                            }
                            LoggerErrorCode.RATE_LIMITED -> {
                                // Reduce log frequency
                            }
                            else -> {
                                Log.e("Mps", "Error: $message", throwable)
                            }
                        }
                    }
                }
            )
        )
    }

    // Token expiration helper: decodes the JWT payload and checks the exp claim
    private fun isTokenExpired(token: String): Boolean {
        return try {
            val payload = Base64.decode(token.split(".")[1], Base64.URL_SAFE)
            val json = JSONObject(String(payload, Charsets.UTF_8))
            val exp = json.getLong("exp")
            System.currentTimeMillis() / 1000 >= exp - 60 // 1-min buffer
        } catch (e: Exception) {
            true // Expired if unparseable
        }
    }
}
Key Features:
- Automatic JWT Refresh: No manual token management, no expiration failures
- Crash Reporting: Automatic capture of uncaught exceptions with stack traces
- LZ4 Compression: Fast, battery-efficient (default)
- Error Handling: Programmatic error handling with 11 error codes
- PCI-DSS Safe: 8 sanitization patterns applied automatically
3.2.4 Thread Safety Guarantees
The MpsLogger class is fully thread-safe. You can safely call logging methods from:
- Main/UI thread
- Background threads
- Coroutine dispatchers (IO, Default)
- WorkManager workers
Internal Threading Model:
- Logging calls are non-blocking and return immediately
- Logs are queued to a dedicated background thread
- Token refresh (JwtTokenProvider.getToken()) is called from a background thread
- Network uploads occur on a separate I/O thread pool
Best Practices:
- Avoid logging in tight loops (< 1ms intervals)
- For high-frequency events, use sampling or aggregation
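As an illustration of the sampling recommendation, here is a minimal sketch of a counter-based sampler that lets through one event in every N. `EventSampler` is a hypothetical helper written for this example and is not part of the SDK.

```kotlin
import java.util.concurrent.atomic.AtomicLong

/** Lets through the first call and every [every]-th call after it; safe to call from any thread. */
class EventSampler(private val every: Long) {
    init {
        require(every > 0) { "every must be positive" }
    }

    private val counter = AtomicLong(0)

    /** Returns true for calls number 0, every, 2 * every, ... */
    fun shouldLog(): Boolean = counter.getAndIncrement() % every == 0L
}
```

A caller would keep one sampler per high-frequency event and guard the log call with it, for example `if (sampler.shouldLog()) MpsLogger.debug("sensor_tick")`. Aggregation (counting events and logging a periodic summary) is the alternative when every occurrence matters but individual records do not.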
3.2.5 Crash Reporting (Automatic)
The SDK automatically registers an UncaughtExceptionHandler to capture application crashes.
Captured Crash Data:
- Exception: Type, Message, and full Stack Trace.
- Context: Thread name, Device state (Battery, RAM, Orientation), and App state (Foreground/Background).
- Correlation: Last breadcrumb/log events leading up to the crash.
All crash reports are sent exclusively to MPAC Obs - no third-party crash reporting services are required or recommended.
Technical Note (Crash Handler Chaining):
The SDK preserves any previously registered UncaughtExceptionHandler by chaining to it after capturing the crash. This ensures the app's existing behavior is not disrupted. In typical deployments, Card Service Co. should NOT use any other crash reporters - MPS Logger handles everything.
Note: ProGuard/R8 mappings must be uploaded to MPS Build Pipeline during the build process to de-obfuscate stack traces.
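The handler chaining described in the technical note above follows a common JVM pattern, sketched below. `ChainingCrashHandler` and `installCrashHandler` are hypothetical names for illustration; the SDK's internal implementation is not documented here and may differ.

```kotlin
/**
 * Captures the crash via [capture], then delegates to the handler that was
 * installed before it, so existing behavior (including process termination
 * by the platform default handler) is preserved.
 */
class ChainingCrashHandler(
    private val previous: Thread.UncaughtExceptionHandler?,
    private val capture: (Thread, Throwable) -> Unit,
) : Thread.UncaughtExceptionHandler {
    override fun uncaughtException(thread: Thread, error: Throwable) {
        try {
            capture(thread, error)
        } catch (ignored: Throwable) {
            // Never let the reporter itself mask the original crash.
        } finally {
            previous?.uncaughtException(thread, error)
        }
    }
}

/** Installs the chaining handler, keeping whatever handler was registered before. */
fun installCrashHandler(capture: (Thread, Throwable) -> Unit) {
    val previous = Thread.getDefaultUncaughtExceptionHandler()
    Thread.setDefaultUncaughtExceptionHandler(ChainingCrashHandler(previous, capture))
}
```

Delegating in a `finally` block means the previous handler still runs even if capturing the crash fails.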
3.2.6 Logging API Usage
Kotlin
// Simple logging API (similar to Android Log)
MpsLogger.debug("Payment initialized", "amount" to 1000, "currency" to "JPY")
MpsLogger.info("PayPay QR displayed", "merchantId" to "M12345")
MpsLogger.warn("Payment timeout", "duration" to 30000)
MpsLogger.error("Payment failed", throwable, "errorCode" to "E501")
// Distributed tracing (optional, for advanced troubleshooting)
MpsLogger.startTrace("payment_flow") { trace ->
    trace.addEvent("qr_displayed")
    processPayment()
    trace.addEvent("qr_scanned")
}
3.2.7 Log Format (Automatic)
The SDK automatically structures logs in OpenTelemetry format:
JSON
{
  "timestamp": "2025-10-29T10:15:30.123Z",
  "severity": "INFO",
  "body": "PayPay QR displayed",
  "attributes": {
    "device_id": "ST-00001",
    "merchant_id": "M12345",
    "app_version": "2.1.0"
  },
  "corr_id": "a1b2c3d4e5f6...",
  "span_id": "1234567890abcdef"
}
3.2.8 PCI-DSS Compliance (Automatic)
What Card Service Co. Must NOT Log:
- ❌ Full card numbers (PAN)
- ❌ CVV/CVC codes
- ❌ Magnetic stripe data
- ❌ PIN codes
SDK Protection:
The SDK automatically sanitizes logs client-side:
- PAN detected → Masked to 411111******1111 (first 6 + last 4)
- CVV detected → Replaced with [REDACTED]
- Patterns matching PCI data → Removed
Recommendation: Card Service Co. developers should still avoid logging sensitive data; the SDK's sanitization is a defense-in-depth safety net, not a substitute for careful logging.
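The PAN masking rule above (keep the first 6 and last 4 digits) can be sketched as follows. This is an illustrative approximation that assumes PANs appear as unbroken runs of 13 to 19 digits and uses a Luhn check to limit false positives. The SDK's actual sanitization patterns are not specified in this document, and `maskPans` and `passesLuhn` are hypothetical names.

```kotlin
// Candidate PANs: 13 to 19 consecutive digits, not embedded in a longer digit run.
val panCandidate = Regex("""(?<!\d)\d{13,19}(?!\d)""")

/** Luhn checksum, used to avoid masking arbitrary long numbers such as order IDs. */
fun passesLuhn(digits: String): Boolean {
    var sum = 0
    digits.reversed().forEachIndexed { index, ch ->
        var d = ch - '0'
        if (index % 2 == 1) { // double every second digit, counting from the right
            d *= 2
            if (d > 9) d -= 9
        }
        sum += d
    }
    return sum % 10 == 0
}

/** Masks every Luhn-valid PAN in [message], keeping the first 6 and last 4 digits. */
fun maskPans(message: String): String =
    panCandidate.replace(message) { match ->
        val pan = match.value
        if (passesLuhn(pan)) pan.take(6) + "*".repeat(pan.length - 10) + pan.takeLast(4) else pan
    }
```

A sketch like this does not catch PANs written with spaces or dashes, which is one reason developers should still avoid logging card data in the first place.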
3.2.9 Offline Behavior (Automatic)
Network Unavailable:
- SDK buffers logs to encrypted local storage (max 50 MB)
- Automatic upload when network restored
- No action required by Card Service Co.
Storage Full:
- SDK drops oldest logs (FIFO)
- No app crashes or errors
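The FIFO drop policy can be pictured with a small in-memory sketch: a buffer with a byte cap that evicts the oldest records first. `BoundedLogBuffer` is a hypothetical class for illustration only. It leaves out the encryption and on-disk persistence that the SDK is described as providing.

```kotlin
/** Keeps at most [maxBytes] of encoded log records, evicting the oldest first. */
class BoundedLogBuffer(private val maxBytes: Long) {
    private val records = ArrayDeque<ByteArray>()

    var sizeBytes = 0L
        private set

    @Synchronized
    fun add(record: ByteArray) {
        if (record.size > maxBytes) return // a single oversized record can never fit
        records.addLast(record)
        sizeBytes += record.size
        while (sizeBytes > maxBytes) { // drop oldest until back under the cap
            sizeBytes -= records.removeFirst().size
        }
    }

    /** Oldest-first copy of the buffered records. */
    @Synchronized
    fun snapshot(): List<ByteArray> = records.toList()
}
```

Evicting the oldest records keeps the most recent context, which is usually what matters when diagnosing the failure that ended an outage.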
3.2.10 Advanced Configuration (Optional)
For apps requiring custom behavior, the SDK provides advanced configuration options:
Kotlin
val config = LogConfig(
    // Production endpoint; for local development, point this at a local OTLP collector
    endpoint = "https://obs.mpac-cloud.com/api/v1/telemetry",
    deviceId = getDeviceId(),
    // Static token; terminals that run for 24+ hours should use tokenProvider instead
    jwtToken = getJwtToken(),
    // Log filtering
    minLevel = LogLevel.INFO, // Drop VERBOSE and DEBUG in production
    // Offline buffering
    enableOfflineBuffer = true,
    maxOfflineBufferBytes = 50 * 1024 * 1024, // 50 MB
    // Batching (network optimization)
    batchIntervalMs = 10_000, // Send every 10 seconds
    maxBatchSize = 512, // Or when 512 logs accumulated
    // Battery optimization
    enableBatteryOptimization = true, // Adaptive batching based on battery
    // Crash reporting
    enableCrashReporting = true, // Catch uncaught exceptions
    // PCI-DSS (MUST be enabled for payment apps)
    enableSanitization = true,
    // Distributed tracing
    enableTracing = true, // Correlate logs with payment transactions
    // Network constraints
    networkPolicy = NetworkPolicy(
        allowMeteredNetwork = true, // Upload on cellular
        allowRoaming = false, // Block on roaming
        respectDataSaver = true, // Honor Android Data Saver
        wifiOnlyForLargeBatches = true // >1MB batches wait for WiFi
    ),
    // Custom metadata
    globalAttributes = mapOf(
        "app_name" to "card_service_payment",
        "payment_provider" to "paypay"
    )
)
MpsLogger.init(this, config)
Pre-configured Profiles:
Kotlin
// Development: Verbose logging, fast batching
MpsLogger.init(this, LogConfig.development())
// Production: Optimized for battery and network
MpsLogger.init(this, LogConfig.production())
3.2.11 SDK Performance Characteristics
| Metric | Value | Notes |
|---|---|---|
| APK Size Increase | ~4.5 MB | OpenTelemetry dependencies |
| Memory Footprint | ~5 MB | Runtime allocation |
| Battery Impact | <1% per day | With battery optimization enabled |
| Network Usage | ~2-5 KB per 100 logs | After compression |
| Storage Impact | Max 50 MB | Offline buffer (auto-cleanup) |
| CPU Usage | <0.1% | Background processing |
Performance Recommendations:
- Enable battery optimization for production builds
- Use INFO or WARN minimum log level in production
- Avoid logging in tight loops (<1ms intervals)
3.2.12 Lifecycle Integration
The SDK automatically detects app lifecycle state using ProcessLifecycleOwner.
Captured State in Crash Reports:
- app_state: FOREGROUND or BACKGROUND
- last_activity: Name of the last visible Activity
- session_duration_ms: Time since app moved to foreground
Manual Lifecycle Hooks (Optional):
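// For apps with complex lifecycle requirements
MpsLogger.onActivityResumed(activity)
MpsLogger.onActivityPaused(activity)
WorkManager Integration:
import android.content.Context
import androidx.work.CoroutineWorker
import androidx.work.WorkerParameters

// CoroutineWorker has no no-arg constructor; pass context and parameters through
class SyncWorker(context: Context, params: WorkerParameters) : CoroutineWorker(context, params) {
    override suspend fun doWork(): Result {
        MpsLogger.setContext("worker_name" to "SyncWorker")
        // ... work
        return Result.success()
    }
}
3.2.13 Graceful Shutdown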
For terminals that may be remotely restarted, ensure logs are flushed before shutdown:
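// In your shutdown handler. Launch this before onDestroy: lifecycleScope is
// cancelled when the lifecycle reaches DESTROYED, so a flush launched from
// onDestroy may never run.
lifecycleScope.launch {
    val flushed = MpsLogger.flushAndShutdown(timeoutMs = 30_000)
    if (!flushed) {
        Log.w("App", "Some logs may not have been uploaded")
    }
}
Shutdown Behavior: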
- Pending logs are uploaded synchronously (up to timeout)
- Logs that fail to upload are persisted to offline buffer
- SDK releases all resources (threads, file handles)
- Calling MpsLogger.info() after shutdown throws IllegalStateException
3.3 Testing and Validation
3.3.1 Unit Testing (Card Service Co. Dev Environment)
Kotlin
// Test SDK initialization
@Test
fun testSdkInitialization() {
    MpsLogger.init(context, LogConfig.development())
    // Should not throw exception
    MpsLogger.info("Test log")
}

// Test PCI-DSS sanitization
@Test
fun testPCIDSSSanitization() {
    MpsLogger.info("Card: 4111111111111111 CVV:123")
    // Verify in logcat (with debug mode):
    // Expected: "Card: 411111******1111 [REDACTED]"
}
3.3.2 Integration Testing (Staging Environment)
Test Scenario 1: Online Logging
- Connect device to WiFi
- Log test messages: MpsLogger.info("Test at ${System.currentTimeMillis()}")
- Wait 15 seconds for batch upload
- Verify that the logs appear in the MPAC Obs Grafana dashboard
Test Scenario 2: Offline Buffering
- Enable airplane mode
- Generate 10 test logs
- Verify files created in /data/data/com.cardservice.payment/files/mps-logs/
- Disable airplane mode
- Wait 30 seconds for automatic upload
- Verify files deleted after successful upload
Test Scenario 3: Crash Reporting
- Trigger a test crash: throw RuntimeException("Test Crash")
- Restart the application.
- Verify the crash report appears in Grafana/Loki within 60 seconds.
- Confirm stack trace is legible (if mapped) or raw (if unmapped).
Test Scenario 4: Payment Flow Tracing
Kotlin
MpsLogger.startTrace("payment_flow") { trace ->
    trace.addEvent("qr_generated")
    displayPayPayQR()
    trace.addEvent("waiting_for_scan")
    waitForPayment()
    trace.addEvent("payment_confirmed")
}
Verify in Grafana that all events are correlated under the same corr_id.
3.3.3 Production Validation Checklist
Before going live:
- [ ] SDK initialized in Application.onCreate()
- [ ] Device token provisioned by MPS Backend
- [ ] Logs visible in Grafana (verify with MPS support)
- [ ] PCI-DSS sanitization working (no card numbers in logs)
- [ ] Offline buffering functional (network disconnect test)
- [ ] No ANRs or crashes introduced
- [ ] Payment transactions traceable end-to-end
3.3.4 ProGuard/R8 Configuration
Add to proguard-rules.pro:
# MPS Logging SDK - Public API only
-keep class com.mpac.logging.MpsLogger { *; }
-keep class com.mpac.logging.LogConfig { *; }
-keep class com.mpac.logging.LogConfig$Builder { *; }
-keep class com.mpac.logging.LogLevel { *; }
-keep interface com.mpac.logging.JwtTokenProvider { *; }
-keep interface com.mpac.logging.LoggerCallback { *; }
-keep enum com.mpac.logging.LoggerErrorCode { *; }
# OpenTelemetry - Required for OTLP
-keep class io.opentelemetry.exporter.otlp.** { *; }
-dontwarn io.opentelemetry.**
# Protobuf - Lite runtime only
-keep class * extends com.google.protobuf.GeneratedMessageLite { *; }
-dontwarn com.google.protobuf.**
3.4 Troubleshooting Guide
Common Issues and Solutions
Issue 1: SDK Initialization Error
Symptom: IllegalStateException: MpsLogger not initialized
Solution:
Kotlin
// Ensure init() is called in Application.onCreate()
class PaymentApplication : Application() {
    override fun onCreate() {
        super.onCreate()
        MpsLogger.init(this) // ← Must be first
    }
}
// Register in AndroidManifest.xml:
// <application android:name=".PaymentApplication" ...>
Issue 2: Logs Not Appearing in Grafana
Possible Causes:
- Device not activated by MPS TMS (no JWT token)
- Network firewall blocking HTTPS to tms.mpscloud.com
- Incorrect endpoint configuration
Debug Steps:
Kotlin
// Enable debug mode
if (BuildConfig.DEBUG) {
    MpsLogger.init(this, LogConfig.development())
}
// Check logcat for errors
// adb logcat | grep Error
Issue 3: High Battery Drain
Solution:
Kotlin
val config = LogConfig.production().copy(
    enableBatteryOptimization = true,
    batchIntervalMs = 30_000, // Longer interval = less battery
    minLevel = LogLevel.INFO // Drop DEBUG logs
)
Issue 4: Offline Logs Not Uploading
Solution:
Kotlin
// Check offline storage
val logDir = File(context.filesDir, "mps-logs")
Log.d("Debug", "Buffered logs: ${logDir.listFiles()?.size ?: 0}")
// Force flush
val success = MpsLogger.flush(timeoutMs = 60_000)
Issue 5: Understanding Flush Behavior
MpsLogger.flush() Semantics:
| Return Value | Meaning |
|---|---|
| true | All pending logs uploaded successfully |
| false | Timeout reached, some logs still pending |
On Failure:
- Logs are not lost - they remain in the offline buffer
- Will retry on next batch interval or app restart
Force Upload with Retry:
import kotlinx.coroutines.delay

// Try up to three flushes, pausing between attempts
suspend fun ensureLogsUploaded(): Boolean {
    repeat(3) {
        if (MpsLogger.flush(timeoutMs = 20_000)) return true
        delay(5_000) // Wait before retry
    }
    return false
}
3.5 Notes
Further detailed technical specifications will be provided:
- Full Integration Guide
- SDK Technical Specification
- Security & PCI-DSS Compliance
- System Architecture
3.6 Demo Application
We request the inclusion of a demo application in the same folder as the SDK to verify integration.
Example Structure:
./<root>
+ app/: demo/example app to test integration of the logging SDK
+ mpac-logging-sdk
4. Consequences
Positive
- Simplified Integration: Card Service Co. only needs to add a dependency and a few lines of initialization code.
- Compliance Assurance: Automatic sanitization reduces the risk of PCI-DSS violations.
- Operational Visibility: We gain structured logs and traces that can be correlated with backend transaction logs.
Negative
- App Size: The SDK adds approximately 4.5 MB to the final APK size due to OpenTelemetry dependencies.
- Dependency Management: Card Service Co. must update the SDK version periodically to receive security patches.
Risks and Mitigation
- Battery Drain: Risk of process-heavy operations. Mitigation: The SDK implements adaptive batching and battery-aware scheduling.
- Data Loss: Risk of log loss during long outages. Mitigation: Local encrypted buffering (up to 50 MB) with FIFO rotation.
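As a rough illustration of the adaptive batching mitigation, the sketch below stretches the upload interval as the battery drains. The function name, the thresholds (50% and 20%), and the multipliers are assumptions chosen for the example. They are not the SDK's actual policy.

```kotlin
/**
 * Picks the upload interval from the battery state: the base interval while
 * charging or well charged, progressively longer as the battery drains.
 * Thresholds and multipliers are hypothetical.
 */
fun batchIntervalMs(batteryPercent: Int, isCharging: Boolean, baseIntervalMs: Long = 10_000): Long {
    require(batteryPercent in 0..100) { "batteryPercent must be 0..100" }
    return when {
        isCharging -> baseIntervalMs
        batteryPercent > 50 -> baseIntervalMs
        batteryPercent > 20 -> baseIntervalMs * 3
        else -> baseIntervalMs * 6
    }
}
```

Longer intervals mean fewer radio wake-ups at the cost of later log delivery, so a policy like this trades troubleshooting latency for battery life only when the device is running low.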