Caricash Nova Platform

Authentication & Authorization

A complete, regulator-grade implementation blueprint


0) Executive summary

This document specifies an end-to-end, enterprise-grade AuthN/Z stack for a regulated, multi-tenant payments system. It emphasizes defense-in-depth, tenant isolation, provable policy correctness, auditor-ready evidence, and latency budgets compatible with high-throughput ledgers and payment rails.

Core pillars

  1. Identity & Sessions: WebAuthn/AAL2+, short-lived audience-scoped JWTs, DPoP/mTLS binding, opaque rotating refresh tokens, continuous session evaluation.
  2. Authorization: Externalized policy-as-code (OPA) + Zanzibar-style ReBAC, plus database Row-Level Security (RLS) as an isolation layer that application bugs cannot bypass. Envoy ext_authz at the edge, partial evaluation for p99 < 5ms.
  3. Enterprise Integrations: OIDC/SAML federation per tenant, SCIM user provisioning, per-tenant JWKS, step-up MFA with bound challenges, purpose-based access control (PBAC).
  4. Operational Integrity: Hash-chained audit logs to WORM storage, policy supply-chain signing, blue/green policy rollouts, mutation & property-based tests, formal invariants for crown-jewel actions.
  5. Compliance & Privacy: PCI DSS, PSD2/SCA, GDPR, Kenya NDPA mapping; data minimization, field-level encryption, selective disclosure; documented control evidence.

1) Design principles

  • Fail-closed everywhere; never allow on error or cache miss.
  • Separation of concerns: app enforces, OPA decides, DB constrains.
  • Least privilege by default, time-boxed elevation with explicit approvals.
  • Deterministic request identity: stable headers/claims across all layers.
  • Provability over intuition: mutation tests, property-based tests, formal specs.
  • Operator ergonomics: clear dashboards, kill switches, and reversible rollouts.
  • Privacy by design: minimize, mask, encrypt, and justify access via PBAC.

2) Identity model

2.1 Principal types

  • User: human (employee, tenant admin, merchant operator, auditor).
  • Service: machine/workload identities (SPIFFE/SPIRE SVIDs).
  • API Client: external integrators (mTLS + PrivateKeyJWT or HMAC legacy).

2.2 Tenancy

  • Tenant (e.g., Merchant of Record, Institution, Region).
  • Optional sub-tenants for regions/brands.
  • Data residency (e.g., KE/NG/BB) attached to tenant; influences routing, KMS keys, and policy.

2.3 Claims & headers (canonical contract)

  • Headers: x-req-id, x-tenant-id, x-principal-id, x-session-id, x-client-ip, x-device-id.

  • JWT claims:

    • sub, tid, jti, iat/exp, aud, iss,
    • aal (Authenticator Assurance Level), amr[] (methods),
    • scope[], org_roles[], tenant_roles[{tenant, roles[]}],
    • cnf (confirmation / key binding for DPoP or mTLS).
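
One way to keep this contract deterministic is to check, at every hop, that the identity headers agree with the verified token claims. The sketch below is illustrative only: the function name `checkIdentityContract`, the choice of which fields to compare, and the returned message strings are assumptions, not part of the platform.

```typescript
// Hypothetical shapes; field names follow the header and claim lists above.
type IdentityHeaders = {
  "x-req-id": string;
  "x-tenant-id": string;
  "x-principal-id": string;
  "x-session-id": string;
};

type TokenClaims = {
  sub: string;
  tid: string;
  jti: string;
  iat: number; // seconds since epoch
  exp: number; // seconds since epoch
  aud: string;
  aal: number;
};

// Returns the list of contract violations; an empty list means headers and claims agree.
function checkIdentityContract(
  headers: Partial<IdentityHeaders>,
  claims: TokenClaims,
  expectedAud: string,
  nowSec: number
): string[] {
  const problems: string[] = [];
  if (!headers["x-req-id"]) problems.push("missing x-req-id");
  if (headers["x-tenant-id"] !== claims.tid) problems.push("x-tenant-id does not match tid");
  if (headers["x-principal-id"] !== claims.sub) problems.push("x-principal-id does not match sub");
  if (claims.aud !== expectedAud) problems.push("token audience is for another service");
  if (claims.exp <= nowSec) problems.push("token expired");
  return problems;
}
```

A caller following the fail-closed principle would reject the request whenever the returned list is non-empty.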

3) Authentication (AuthN)

3.1 Factors and assurance

  • Primary: WebAuthn (passkeys) → AAL2/3; TOTP fallback; SMS discouraged (risk/lawful intercept).
  • Step-up: Bound challenge tokens for high-risk actions; enforce aal>=2.

3.2 Sessions & tokens

  • Access tokens: JWT, 5–10 min, audience-scoped (per service).

  • Refresh tokens: opaque, rotating, stored server-side. Each token is single-use, so a stolen token is detected when either party presents it a second time.

  • Binding:

    • Browser: DPoP (proof of possession) + device context.
    • Service→Service: mTLS with SPIFFE/SPIRE; tokens issued via token exchange to bind to workload identity.
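
Single-use refresh rotation with replay detection could look roughly like the following. This is a minimal in-memory sketch: the class `RefreshTokenStore`, the notion of a token "family" per login, and the reason strings are assumptions for illustration, and a real deployment would persist records server-side.

```typescript
import { createHash, randomBytes } from "node:crypto";

type RefreshRecord = { familyId: string; used: boolean; revoked: boolean };

// In-memory stand-in for the server-side store; keyed by a hash so raw tokens are never stored.
class RefreshTokenStore {
  private records = new Map<string, RefreshRecord>();

  private static digest(token: string): string {
    return createHash("sha256").update(token).digest("hex");
  }

  // Issue a token in the given family (assumption: one family per login).
  issue(familyId: string): string {
    const token = randomBytes(32).toString("base64url");
    this.records.set(RefreshTokenStore.digest(token), { familyId, used: false, revoked: false });
    return token;
  }

  // Exchange a token for its successor. Presenting a used token revokes the whole family.
  rotate(token: string): { ok: true; next: string } | { ok: false; reason: string } {
    const rec = this.records.get(RefreshTokenStore.digest(token));
    if (!rec || rec.revoked) return { ok: false, reason: "invalid_or_revoked" };
    if (rec.used) {
      const family = rec.familyId;
      this.records.forEach((r) => {
        if (r.familyId === family) r.revoked = true;
      });
      return { ok: false, reason: "replay_detected" };
    }
    rec.used = true;
    return { ok: true, next: this.issue(rec.familyId) };
  }
}
```

Revoking the whole family on reuse means neither the thief nor the legitimate client keeps a working token, which forces a fresh login.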

Bound step-up (prevents replay across endpoints)

// On a protected action with insufficient AAL, bind the challenge to the original request:
const origHash = base64url(sha256(`${method}|${path}|${normalizedBody}`));
return res.status(403).json({
  error: "MFA_REQUIRED",
  challenge: signJWT({ origHash, tid, sub, exp: now + 300 }) // short-lived (5 min)
});

// POST /auth/mfa/complete validates the factor and the challenge:
const challenge = verify(challengeJWT); // rejects a bad signature or an expired challenge
if (challenge.origHash !== recompute()) throw forbidden("challenge_mismatch");
return accessToken({ aal: 2, cnf: { orig: challenge.origHash }, aud, exp: now + 10 * 60 });
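
For readers who want something executable, here is a self-contained sketch of the same binding idea using only Node's `crypto` module. It swaps the JWT for a compact HMAC-signed `payload.signature` token to avoid a library dependency; the names `issueChallenge` and `verifyChallenge` and the token format are assumptions, not the platform's API.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

type RequestShape = { method: string; path: string; body: string };
type ChallengePayload = { origHash: string; tid: string; sub: string; exp: number };

// Hash of the request the challenge is bound to (method, path, normalized body).
function originHash(req: RequestShape): string {
  return createHash("sha256").update(`${req.method}|${req.path}|${req.body}`).digest("base64url");
}

function sign(data: string, key: Buffer): string {
  return createHmac("sha256", key).update(data).digest("base64url");
}

// Compact "payload.signature" token; a real deployment would likely use a JWT library.
function issueChallenge(req: RequestShape, tid: string, sub: string, nowSec: number, key: Buffer): string {
  const payload: ChallengePayload = { origHash: originHash(req), tid, sub, exp: nowSec + 300 };
  const encoded = Buffer.from(JSON.stringify(payload)).toString("base64url");
  return `${encoded}.${sign(encoded, key)}`;
}

// True only if the signature is valid, the challenge is unexpired,
// and the retried request is the same one that was challenged.
function verifyChallenge(token: string, retried: RequestShape, nowSec: number, key: Buffer): boolean {
  const [encoded, mac] = token.split(".");
  if (!encoded || !mac) return false;
  const expected = Buffer.from(sign(encoded, key));
  const given = Buffer.from(mac);
  if (expected.length !== given.length || !timingSafeEqual(expected, given)) return false;
  const payload = JSON.parse(Buffer.from(encoded, "base64url").toString("utf8")) as ChallengePayload;
  return payload.exp > nowSec && payload.origHash === originHash(retried);
}
```

Because the hash covers method, path, and body, a challenge completed for one endpoint cannot be replayed to authorize a different one.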

3.3 Federation & provisioning

  • OIDC/SAML per tenant:

    • Store: saml_idp_metadata_url, oidc_issuer, client_id, redirect_uris.
    • Validate signatures, cache JWKS with TTL.
  • SCIM 2.0: /Users, /Groups → map to internal principals and ReBAC tuples.

  • Attribute mapping: IdP groups → roles; time-boxed caveats for contractors.

3.4 Continuous Access Evaluation (CAE)

  • Reevaluate mid-session on impossible travel, device posture change, IP drift.
  • If risk escalates → downgrade AAL, force step-up, or revoke.
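
The mid-session checks above can be expressed as a small pure function. The sketch below is only one possible mapping from signals to actions: the one-hour "impossible travel" window, the signal fields, and the order of precedence are all assumptions chosen for illustration.

```typescript
type SessionState = {
  aal: number;
  lastIp: string;
  lastCountry: string;
  lastSeenMs: number;
  deviceTrusted: boolean;
};
type Observation = { ip: string; country: string; atMs: number; deviceTrusted: boolean };
type CaeAction = "continue" | "step_up" | "downgrade_aal" | "revoke";

// Assumption: a country change within one hour is treated as impossible travel.
const IMPOSSIBLE_TRAVEL_WINDOW_MS = 60 * 60 * 1000;

// Most severe signal wins: impossible travel, then device posture change, then IP drift.
function evaluateSession(session: SessionState, obs: Observation): CaeAction {
  const countryChanged = obs.country !== session.lastCountry;
  const elapsedMs = obs.atMs - session.lastSeenMs;
  if (countryChanged && elapsedMs < IMPOSSIBLE_TRAVEL_WINDOW_MS) return "revoke";
  if (session.deviceTrusted && !obs.deviceTrusted) return "downgrade_aal";
  if (obs.ip !== session.lastIp) return "step_up";
  return "continue";
}
```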

4) Authorization (AuthZ)

4.1 Strategy

  • OPA policy server (centralized behind Envoy ext_authz) for all HTTP/gRPC requests.
  • ReBAC graph (Zanzibar-style tuples) as the source of truth for relationships.
  • Database RLS as a hard backstop.

4.2 Data model — relationship tuples

CREATE TABLE rel_tuples (
  subject_ns text,  -- user | role | service | tenant
  subject_id text,
  relation   text,  -- owner | admin | member | editor | viewer | parent | ...
  object_ns  text,  -- tenant | merchant | account | payout | ...
  object_id  text,
  caveat     jsonb, -- { "expires_at": "...", "hours":[9,17], "ip_ranges":["203.0.113.0/24"] }
  PRIMARY KEY (subject_ns, subject_id, relation, object_ns, object_id)
);
CREATE INDEX ON rel_tuples (object_ns, object_id, relation);
CREATE INDEX ON rel_tuples USING gin (caveat);
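
To make the tuple semantics concrete, here is a small in-memory relationship check over rows shaped like `rel_tuples`. It is a sketch under stated assumptions: a `parent` tuple is read as "subject is the parent of object" (for example, tenant acme is the parent of merchant m_789), only the `expires_at` caveat is evaluated, and traversal depth is capped at an arbitrary 5.

```typescript
type Tuple = {
  subjectNs: string;
  subjectId: string;
  relation: string;
  objectNs: string;
  objectId: string;
  caveat?: { expires_at?: string };
};
type Ref = { ns: string; id: string };

// A tuple with no expiry is live; an unparseable expiry yields NaN and is treated as expired.
function isLive(t: Tuple, now: Date): boolean {
  const exp = t.caveat?.expires_at;
  return !exp || new Date(exp).getTime() > now.getTime();
}

// True if the subject holds `relation` on the object directly,
// or on an ancestor reached by following "parent" tuples upward.
function check(tuples: Tuple[], subject: Ref, relation: string, object: Ref, now: Date, depth = 5): boolean {
  if (depth < 0) return false;
  const direct = tuples.some(
    (t) =>
      isLive(t, now) &&
      t.subjectNs === subject.ns &&
      t.subjectId === subject.id &&
      t.relation === relation &&
      t.objectNs === object.ns &&
      t.objectId === object.id
  );
  if (direct) return true;
  return tuples
    .filter((t) => isLive(t, now) && t.relation === "parent" && t.objectNs === object.ns && t.objectId === object.id)
    .some((t) => check(tuples, subject, relation, { ns: t.subjectNs, id: t.subjectId }, now, depth - 1));
}
```

The depth cap keeps a cyclic or very deep graph from causing unbounded recursion; exceeding it denies, in line with the fail-closed principle.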

4.3 OPA input contract (pin this in a shared package)

{
  "tenant": { "id": "acme", "security": {"mfa_required": true} },
  "subject": { "id": "user_123", "type": "user", "roles": ["ops"], "aal": 1 },
  "resource": { "type": "merchant", "id": "m_789", "tenant_id": "acme", "attrs": {"region":"KE"} },
  "action": "settlement.update",
  "purpose": "reconciliation",
  "context": { "ip":"203.0.113.5", "ua":"...", "risk":"high", "time":"2025-09-22T09:10:00Z" }
}

4.4 Rego (ABAC + ReBAC + PBAC + step-up)

package payments.authz
default allow := false
default step_up_required := false

same_tenant { input.resource.tenant_id == input.tenant.id }

# Purpose-based access control (PBAC)
purpose_allowed {
  some p
  data.purposes[p].name == input.purpose
  data.purposes[p].resources[_] == input.resource.type
  data.purposes[p].actions[_] == input.action
}

# ReBAC: subject directly related to object (parent traversal is not implemented in this rule)
related(subject, object, rel) {
  some i
  data.relationships[i].subject_ns == subject.type
  data.relationships[i].subject_id == subject.id
  data.relationships[i].relation   == rel
  data.relationships[i].object_ns  == object.type
  data.relationships[i].object_id  == object.id
  not expired(data.relationships[i])
}

expired(t) { t.caveat.expires_at != "" ; time.now_ns() > time.parse_rfc3339_ns(t.caveat.expires_at) }

# High-risk signals
step_up_required { input.action == "settlement.update" } else { input.context.risk == "high" }

# Allow if:
allow {
  same_tenant
  purpose_allowed
  related({"id": input.subject.id, "type": "user"},
          {"id": input.resource.id, "type": input.resource.type},
          "editor")
  not step_up_required
}

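# Allow with AAL2 if step-up is needed (the relationship check still applies):
allow {
  same_tenant
  purpose_allowed
  related({"id": input.subject.id, "type": "user"},
          {"id": input.resource.id, "type": input.resource.type},
          "editor")
  step_up_required
  input.subject.aal >= 2
}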
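
On the application side, a client of this policy might query OPA's Data API and treat every failure as a deny. The sketch below assumes OPA is reachable at a hypothetical `http://localhost:8181` and injects the HTTP call as a `post` function (an assumption made so the logic can be exercised without a network); the `Decision` shape is likewise illustrative.

```typescript
type Decision = { allow: boolean; stepUpRequired: boolean };

// Injected transport (assumption): returns the HTTP status and raw response body.
type HttpPost = (url: string, body: string) => Promise<{ status: number; text: string }>;

const DENY: Decision = { allow: false, stepUpRequired: false };

// Queries the payments.authz package and fails closed on any error,
// non-200 status, or missing result (OPA omits "result" when nothing is defined).
async function decide(input: unknown, post: HttpPost, opaUrl = "http://localhost:8181"): Promise<Decision> {
  try {
    const res = await post(`${opaUrl}/v1/data/payments/authz`, JSON.stringify({ input }));
    if (res.status !== 200) return DENY;
    const result = JSON.parse(res.text).result;
    if (!result || typeof result.allow !== "boolean") return DENY;
    return { allow: result.allow, stepUpRequired: result.step_up_required === true };
  } catch {
    return DENY;
  }
}
```

When `allow` is false and `stepUpRequired` is true, the caller can respond with the `MFA_REQUIRED` challenge rather than a plain denial.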

4.5 Performance

  • Partial evaluation & compiled bundles for hot paths.
  • In-cluster OPA, memory-resident data (relationships, purpose registry).
  • p99 decision < 5ms target.

4.6 Policy lifecycle

  • Git monorepo for policies; code review required; cosign signed bundles.
  • Blue/green policy clusters; Envoy routes canary % → green; auto-promote/rollback based on error budgets.
  • Decision logs shipped to secure sink with redaction.

5) Database-level isolation (RLS & crypto)

5.1 RLS

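-- Set per request, inside the request's transaction. SET LOCAL is reset at
-- COMMIT/ROLLBACK, so a pooled connection never carries a previous tenant:
-- SET LOCAL app.current_tenant = '<tid>';
ALTER TABLE merchants ENABLE ROW LEVEL SECURITY;
-- Table owners bypass RLS unless it is forced:
ALTER TABLE merchants FORCE ROW LEVEL SECURITY;

-- If the setting is missing, current_setting(..., true) returns NULL and no rows match (fail-closed).
CREATE POLICY tenant_isolation ON merchants
  USING (tenant_id = current_setting('app.current_tenant', true));

-- Optionally: fine-grained role mirror (if needed)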

5.2 Security definer & views

  • Use SECURITY DEFINER functions with strict parameter validation for complex reads.
  • Expose views that already filter by tenant to reduce foot-guns.

5.3 Field-level encryption

  • Per-tenant DEKs (enveloped by region KMS keys).
  • Decrypt only the columns required for the action; propagate “crypto shields” so logs never contain plaintext PII/PAN.
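
Envelope encryption of a single field might be sketched as follows. This is not the platform's implementation: a locally held 32-byte key stands in for the region KMS key, AES-256-GCM is an assumed cipher choice, and binding the tenant ID as additional authenticated data is an assumption added so that ciphertext cannot be decrypted under another tenant's context.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

type Sealed = { iv: Buffer; tag: Buffer; data: Buffer };

// AES-256-GCM with the tenant ID as additional authenticated data (AAD).
function seal(key: Buffer, plaintext: Buffer, aad: string): Sealed {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  cipher.setAAD(Buffer.from(aad));
  const data = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), data };
}

// Throws if the key, AAD, or ciphertext does not match what was sealed.
function open(key: Buffer, sealed: Sealed, aad: string): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", key, sealed.iv);
  decipher.setAAD(Buffer.from(aad));
  decipher.setAuthTag(sealed.tag);
  return Buffer.concat([decipher.update(sealed.data), decipher.final()]);
}

// Create a per-tenant DEK and wrap it under the region key (local stand-in for KMS).
function newWrappedDek(regionKey: Buffer, tenantId: string): Sealed {
  return seal(regionKey, randomBytes(32), tenantId);
}

function encryptField(regionKey: Buffer, wrappedDek: Sealed, tenantId: string, value: string): Sealed {
  const dek = open(regionKey, wrappedDek, tenantId);
  return seal(dek, Buffer.from(value, "utf8"), tenantId);
}

function decryptField(regionKey: Buffer, wrappedDek: Sealed, tenantId: string, field: Sealed): string {
  const dek = open(regionKey, wrappedDek, tenantId);
  return open(dek, field, tenantId).toString("utf8");
}
```

Only the wrapped DEK and sealed fields are stored; the plaintext DEK exists in memory just long enough to process the columns an action actually needs.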

6) API keys & machine auth

6.1 Key lifecycle & storage

  • Keys have public ID (ak_live_xxx) and secret (shown once).
  • Store an argon2id hash of the secret; keys are rotatable; per-key metadata: scopes, budget, rate plan, allowed IPs, expiry.

6.2 HMAC signed requests (legacy/compat)

function hmacSignature({method, path, body, date, nonce}: any, secret: Buffer) {
  const payload = [method.toUpperCase(), path, sha256(body), date, nonce].join("\n");
  return base64url(hmacSha256(secret, payload));
}

  • Replay cache (nonce TTL 5–10 min).
  • Enforce budgets (amount/day, txn/min) and purpose. Deny with BUDGET_EXCEEDED.
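
A runnable version of the signing scheme together with the nonce replay cache could look like this. It is a sketch with several assumptions: the body hash is hex-encoded, the allowed clock skew equals the nonce TTL, and `NonceCache` is an in-memory stand-in for whatever shared cache a deployment would use.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

type SignedRequest = { method: string; path: string; body: string; date: string; nonce: string };
type VerifyResult = "ok" | "stale_date" | "bad_signature" | "replayed_nonce";

// Same canonical payload as the signing function above: method, path, body hash, date, nonce.
function signRequest(req: SignedRequest, secret: Buffer): string {
  const bodyHash = createHash("sha256").update(req.body).digest("hex");
  const payload = [req.method.toUpperCase(), req.path, bodyHash, req.date, req.nonce].join("\n");
  return createHmac("sha256", secret).update(payload).digest("base64url");
}

// Remembers nonces until they expire; a nonce seen twice within the TTL is a replay.
class NonceCache {
  readonly ttlMs: number;
  private seen = new Map<string, number>(); // nonce -> expiry time (ms)

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  remember(nonce: string, nowMs: number): boolean {
    this.seen.forEach((expiry, n) => {
      if (expiry <= nowMs) this.seen.delete(n);
    });
    if (this.seen.has(nonce)) return false;
    this.seen.set(nonce, nowMs + this.ttlMs);
    return true;
  }
}

function verifyRequest(req: SignedRequest, signature: string, secret: Buffer, cache: NonceCache, nowMs: number): VerifyResult {
  const sentAt = Date.parse(req.date);
  if (Number.isNaN(sentAt) || Math.abs(nowMs - sentAt) > cache.ttlMs) return "stale_date";
  const expected = Buffer.from(signRequest(req, secret));
  const given = Buffer.from(signature);
  if (expected.length !== given.length || !timingSafeEqual(expected, given)) return "bad_signature";
  return cache.remember(req.nonce, nowMs) ? "ok" : "replayed_nonce";
}
```

The signature is checked before the nonce is recorded, so unauthenticated traffic cannot fill the cache with junk nonces.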

6.3 Preferred for new clients

  • OAuth 2.1 confidential clients: mTLS + PrivateKeyJWT; GNAP-ready facade for future migration.

7) Risk engine & adaptive controls

  • Signals: IP reputation, device posture, geo/residency, impossible travel, velocity, amount, time-of-day.
  • UEBA profiles per role & tenant; deviations → step-up or JIT requirement.
  • Risk decision included in OPA input; explainable components logged.

8) Auditing & evidence

8.1 Tamper-evident log (hash chain)

CREATE TABLE audit_log (
  id BIGSERIAL PRIMARY KEY,
  ts timestamptz NOT NULL DEFAULT now(),
  tenant_id text NOT NULL,
  actor jsonb NOT NULL,       -- {id,type,aal,session_id}
  action text NOT NULL,       -- "settlement.update"
  target jsonb NOT NULL,      -- {type,id}
  decision jsonb NOT NULL,    -- {allow, policy_version, trace_id}
  attrs jsonb,                -- IP, UA, risk, purpose, mfa_used
  prev_hash bytea,
  row_hash bytea
);

  • Daily export to WORM S3 with bucket retention; KMS-signed manifests.
  • Coverage SLO: 100% of mutating calls have audit entries; alert when coverage drops below 100%.
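
The `prev_hash` / `row_hash` columns can be populated and verified along the following lines. This is a sketch only: the reduced entry shape, the JSON-array canonicalization, the all-zero genesis value, and SHA-256 are assumptions, since the schema above does not fix how the hash is computed.

```typescript
import { createHash } from "node:crypto";

type AuditEntry = { ts: string; tenantId: string; actor: string; action: string; target: string };
type ChainedEntry = AuditEntry & { prevHash: string; rowHash: string };

// Assumed starting value for the first row's prevHash.
const GENESIS = "0".repeat(64);

// Hash of the previous row's hash plus a fixed-order serialization of this row.
function rowHash(prevHash: string, e: AuditEntry): string {
  const canonical = JSON.stringify([e.ts, e.tenantId, e.actor, e.action, e.target]);
  return createHash("sha256").update(prevHash).update(canonical).digest("hex");
}

function append(chain: ChainedEntry[], e: AuditEntry): ChainedEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.rowHash : GENESIS;
  return [...chain, { ...e, prevHash, rowHash: rowHash(prevHash, e) }];
}

// Returns the index of the first row whose hashes do not verify, or -1 if the chain is intact.
function firstBrokenRow(chain: ChainedEntry[]): number {
  let prev = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const row = chain[i];
    if (row.prevHash !== prev || row.rowHash !== rowHash(prev, row)) return i;
    prev = row.rowHash;
  }
  return -1;
}
```

A verification job can run `firstBrokenRow` over each daily export; editing or deleting any row changes every subsequent hash, so tampering surfaces at the first affected index.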

8.2 Redaction & minimization

  • Centralized redaction rules (mask PANs, tokens).
  • No decrypted fields in logs; store token handles not secrets.
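
A centralized redaction helper might resemble the sketch below. The list of sensitive key names, the `[REDACTED]` marker, and the first-six/last-four masking format are assumptions for illustration; the Luhn check is used only to avoid masking arbitrary long numbers that are not card numbers.

```typescript
// Luhn checksum over a string of digits.
function luhnValid(digits: string): boolean {
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

// Masks 13 to 19 digit runs that pass Luhn, keeping the first six and last four digits.
function maskPans(text: string): string {
  return text.replace(/\b\d{13,19}\b/g, (m) =>
    luhnValid(m) ? `${m.slice(0, 6)}${"*".repeat(m.length - 10)}${m.slice(-4)}` : m
  );
}

// Assumed list of key names whose values are never logged.
const SENSITIVE_KEYS = ["password", "secret", "token", "authorization"];

// Recursively redacts a log payload before it is written.
function redact(value: unknown): unknown {
  if (typeof value === "string") return maskPans(value);
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    Object.keys(value).forEach((k) => {
      const v = (value as Record<string, unknown>)[k];
      out[k] = SENSITIVE_KEYS.indexOf(k.toLowerCase()) >= 0 ? "[REDACTED]" : redact(v);
    });
    return out;
  }
  return value;
}
```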

9) Compliance mapping (excerpt)

Domain       | Control            | Mechanism
PCI DSS 7.x  | Access control     | ReBAC+OPA, RLS, least privilege, step-up
PCI DSS 10.x | Logging            | Hash-chain audit, WORM export
PSD2/SCA     | Strong auth        | WebAuthn/TOTP step-up; AAL2 thresholds
GDPR/NDPA    | Purpose limitation | PBAC + purpose registry; minimization; field-level crypto
ISO 27001    | Policy management  | Signed bundles, reviews, change control
SOC 2        | Evidence           | Automated reports (MFA rates, revocation SLAs, decision latency)

10) Edge & service mesh

  • Envoy ext_authz calls centralized OPA for allow/deny before routing.
  • mTLS everywhere, workload identity via SPIFFE/SPIRE.
  • Per-tenant JWKS (rotation states: active|grace|retired, max 24h overlap).
  • Secrets via Vault/Secrets Manager with rotation jobs.

Envoy excerpt

http_filters:
- name: envoy.filters.http.ext_authz
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
    grpc_service: { envoy_grpc: { cluster_name: authz } }
    failure_mode_allow: false  # fail closed if the authz service is unreachable (Envoy's default, made explicit)
    with_request_body: { max_request_bytes: 8192, allow_partial_message: true }
    include_peer_certificate: true

11) Observability & SLOs

11.1 Metrics (dashboards)

  • AuthN: login success/fail by reason, AAL distribution, step-up prompts & acceptance, session churn.
  • AuthZ: allow/deny by rule, top deny reasons, decision latency p50/p95/p99, policy version adoption.
  • ReBAC: tuple counts, delta lag to OPA, cache hit rate.
  • API keys: usage by scope/geo, HMAC failures, nonce replays.
  • Audit: hash verification success, coverage %, WORM exports.

11.2 Latency budgets

  • JWT verify + Redis session check: p99 < 2ms
  • OPA decision: p99 < 5ms
  • AuthN full login with step-up: p95 < 1.5s

12) Testing & assurance

12.1 Automated tests

  • Unit & integration for AuthN flows (token rotation, DPoP binding, step-up).
  • Rego tests with examples mirroring prod traffic.
  • Property-based: fuzz tenant graphs, roles, caveats.
  • Mutation testing for Rego: flip or remove rules and require that at least one test fails for each mutant; report the share of mutants caught as the Policy Mutation Score.

12.2 Formal invariants (crown-jewels)

  • Model settlement approval invariants in TLA+/Alloy: no principal external to tenant T can transition the settlement state of T without AAL2, the requisite relation, and an allowed purpose. Check small state spaces on every CI run.

12.3 Game days & chaos

  • Kill OPA cluster, corrupt tuple stream, spike step-up; verify fail-closed and graceful degradation paths.
  • Drill break-glass account procedures (YubiKey+passkey; dual approvals; time-boxed; extra audit stream).

13) Policy & data delivery

13.1 Bundles

  • Build: opa build --optimize=1 -e payments/authz/allow -o bundle.tar.gz policy/ data/ (optimization applies partial evaluation to the named entrypoint)
  • Sign with cosign; store in OCI registry or S3; OPA verifies signature before load.
  • OPAL for near-real-time tuple deltas with read-your-writes gating for admin UI.

13.2 Versioning & simulation

  • POST /authz/simulate?policy_ref=refs/heads/next accepts an input doc and returns allow/deny plus an explain trace.
  • Policy drift detection: shadow-eval requests against “next” and compare decisions before rollout.
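
Shadow evaluation reduces to running the same inputs through both policy versions and counting disagreements. The sketch below models each policy as a plain function for illustration (an assumption; in practice both would be OPA queries against different bundles), and treats a policy that throws as a deny.

```typescript
type PolicyInput = Record<string, unknown>;
type Policy = (input: PolicyInput) => boolean;
type DriftReport = { total: number; drifted: number; newlyAllowed: number; newlyDenied: number };

// A policy that throws is treated as a deny, matching the fail-closed principle.
function safeEval(policy: Policy, input: PolicyInput): boolean {
  try {
    return policy(input) === true;
  } catch {
    return false;
  }
}

// Compares decisions of the live policy and the candidate over recorded inputs.
function shadowCompare(inputs: PolicyInput[], current: Policy, next: Policy): DriftReport {
  const report: DriftReport = { total: inputs.length, drifted: 0, newlyAllowed: 0, newlyDenied: 0 };
  inputs.forEach((input) => {
    const before = safeEval(current, input);
    const after = safeEval(next, input);
    if (before === after) return;
    report.drifted += 1;
    if (after) report.newlyAllowed += 1;
    else report.newlyDenied += 1;
  });
  return report;
}
```

Newly allowed decisions usually deserve closer review than newly denied ones, since they widen access; a rollout gate could block promotion whenever `newlyAllowed` is non-zero without an explicit sign-off.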

14) Administrative UX & governance

  • Security settings per tenant (security_settings JSONB): mfa_required_for_all_users, session_timeout_minutes, allowed_ip_ranges, resident_regions, require_dpop, require_mtls_for_m2m.
  • JIT access workflows: request → approval → short-lived credential → auto-expire; link every action back to ticket.
  • Kill switches for disbursement/settlement by tenant/region; flipped via change-controlled UI; propagate within seconds.
  • Evidence exports: click-to-download PDF/CSV summaries for auditors with cryptographic attestations.

15) Reference schemas & endpoints

15.1 Tables (core)

sessions (Redis JSON)

{
  "jti":"sess_abc", "sub":"user_123", "tid":"acme",
  "aal":1, "device":"browser:chrome", "ip":"203.0.113.5",
  "created_at":"2025-09-22T09:00:00Z", "last_seen":"2025-09-22T09:05:12Z",
  "revoked_at":null
}

api_keys (SQL)

CREATE TABLE api_keys (
  id text PRIMARY KEY,             -- ak_live_xxx
  tenant_id text NOT NULL,
  secret_hash text NOT NULL,       -- argon2id
  scopes text[] NOT NULL,
  budget jsonb,                    -- {amount_daily: 1000000, currency:"KES"}
  ip_allowlist cidr[],
  expires_at timestamptz,
  created_by text, created_at timestamptz default now(), revoked_at timestamptz
);

tenants.security_settings (JSONB)

{
  "mfa_required_for_all_users": true,
  "session_timeout_minutes": 30,
  "allowed_ip_ranges": ["196.201.0.0/16"],
  "resident_regions": ["KE","UG"],
  "require_dpop": true,
  "require_mtls_for_m2m": true
}

15.2 Key endpoints (sketch)

  • POST /auth/login → WebAuthn begin/finish or OIDC callback → access & refresh tokens.
  • POST /auth/token → refresh (rotate), DPoP proof validation.
  • POST /auth/mfa/challenge → produce bound challenge on 403.
  • POST /auth/mfa/complete → exchange for AAL2 token (bound).
  • GET /sessions?userId=... / DELETE /sessions/{id} → admin visibility/revocation.
  • POST /api-keys / DELETE /api-keys/{id} → create/rotate/revoke; show secret once.
  • POST /authz/decision (internal) → Envoy ext_authz integration.
  • POST /authz/simulate?policy_ref=... → dry-run with explain.
  • POST /scim/v2/Users, PATCH /scim/v2/Users/{id}, DELETE /scim/v2/Users/{id}.

16) Security operations runbook (high level)

  1. Key compromise suspected:

    • Revoke sessions (sess:*), rotate signing keys (per-tenant JWKS), invalidate refresh tokens, notify tenants.
  2. Deny storm:

    • Inspect policy rollout; compare canary vs stable; auto-rollback if drift detected; check tuple stream lag.
  3. MFA provider outage:

    • Allow only already-AAL2 sessions; pause new step-ups; enable break-glass for ops with dual approvals.
  4. Region isolation:

    • Enforce residency via PBAC; kill switch high-risk actions in region; fail service-to-service to local region only.

17) Delivery plan (90 days)

Weeks 0–2

  • Canonical input schema (TypeScript types + Zod).
  • Envoy ext_authz + centralized OPA; decision logging to secure sink.
  • WebAuthn + step-up bound challenges; Redis sessions & revocation.

Weeks 3–6

  • ReBAC tuples + OPAL deltas; DB RLS; partial-eval bundles for hot paths.
  • Per-tenant federation setup & SCIM MVP; per-tenant JWKS rotation flow.
  • HMAC legacy support with nonce replay cache; OAuth 2.1 mTLS for new clients.

Weeks 7–9

  • Hash-chained audit + WORM export; PBAC registry; purpose on all requests.
  • Mutation testing for Rego; property-based authz tests; policy coverage dashboard.

Weeks 10–12

  • Blue/green policy deploys with signed bundles; drift detection + canaries.
  • Formal invariants (TLA+) for settlement approval; game day #1; compliance evidence exports.

18) Minimal code snippets to bootstrap

TypeScript: request identity extractor

export type AuthzInput = {
  tenant: { id: string; security: Record<string, unknown> };
  subject: { id: string; type: "user"|"service"; roles: string[]; aal: number };
  resource: { type: string; id: string; tenant_id: string; attrs?: Record<string, unknown> };
  action: string;
  purpose: string;
  context: { ip: string; ua: string; risk: string; time: string };
};

export function toAuthzInput(req: any): AuthzInput {
  return {
    tenant: { id: req.headers["x-tenant-id"], security: req.tenantSecurity },
    subject: { id: req.user.sub, type: req.user.typ, roles: req.user.roles ?? [], aal: req.user.aal ?? 1 },
    resource: req.resourceDescriptor,         // set by router/resource middleware
    action: req.action,                       // e.g., "settlement.update"
    purpose: req.headers["x-purpose"] ?? "operational",
    context: { ip: req.ip, ua: req.headers["user-agent"], risk: req.risk, time: new Date().toISOString() }
  };
}

Node: API key verification with budgets

async function verifyApiKey(req) {
  const id = req.get("x-api-key-id");
  const sig = req.get("x-signature");
  const nonce = req.get("x-nonce");
  const date = req.get("date");
  // Fail closed when any signing header is missing.
  if (!id || !sig || !nonce || !date) throw forbidden("missing_auth_headers");
  const rec = await db.api_keys.findByPk(id);
  if (!rec || rec.revoked_at) throw forbidden("invalid_key");
  assertWithin(rec.ip_allowlist, req.ip);   // throws if the caller IP is outside the allowlist
  assertNotExpired(rec.expires_at);         // throws if the key has expired
  const ok = await argon2Verify(rec.secret_hash, req.get("x-api-key-secret") ?? ""); // or use HMAC only
  if (!ok) throw forbidden("bad_secret");
  // verifyHmac is expected to throw on a signature mismatch or a replayed nonce.
  verifyHmac(sig, {method:req.method,path:req.path,body:req.rawBody,date,nonce}, rec.key_bytes);
  await assertBudget(rec, req); // amount/day, txn/min
}

PostgreSQL: per-request tenant scoping

-- At request start:
-- SELECT set_config('app.current_tenant', $1, true);
-- SELECT set_config('app.principal_roles', $2, true); -- optional JSON of role list
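
From application code, the scoping calls above are typically wrapped around a transaction. The following is a minimal sketch: `SqlClient` is a hypothetical interface (node-postgres clients expose a compatible `query(text, values)` method), and `withTenant` is an illustrative helper name.

```typescript
// Minimal client shape (assumption); only the query method is needed here.
interface SqlClient {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Runs `work` inside a transaction whose RLS context is pinned to one tenant.
async function withTenant<T>(
  client: SqlClient,
  tenantId: string,
  work: (c: SqlClient) => Promise<T>
): Promise<T> {
  if (!tenantId) throw new Error("tenant id is required"); // fail closed: never run unscoped
  await client.query("BEGIN");
  try {
    // is_local = true scopes the setting to this transaction,
    // so a pooled connection does not leak it to the next request.
    await client.query("SELECT set_config('app.current_tenant', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

Every query issued inside `work` then sees only rows that the `tenant_isolation` policy admits for that tenant.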

19) Risk register (selected)

  • Policy drift (fallback logic diverges) → Block fallbacks; enforce ext_authz mandatory; test drift in canaries.
  • Tuple staleness → OPAL deltas + admin UI waits for ingestion ACK; display “effective policy time”.
  • JWKS misconfiguration per tenant → rotation states + alarms on stale keys; tests on federation setup.
  • Audit PII leakage → central redaction, e2e tests asserting absence of sensitive fields.
  • Deny storms → dashboards, rate-of-change alerts, kill-switches for specific actions/tenants.

20) Final checklist (go-live)

  • WebAuthn active; AAL2 enforced for high-risk actions.
  • Envoy ext_authz → OPA live; no bypass routes.
  • RLS enabled on all multi-tenant tables.
  • ReBAC tuples populated; OPAL streaming deltas OK.
  • Policy bundles signed, blue/green switch tested.
  • Hash-chained audits exported to WORM; verification job green.
  • Per-tenant SSO+SCIM tested; JWKS rotation alarms in place.
  • Observability dashboards live; SLOs tracked.
  • Incident runbooks rehearsed; break-glass sealed.
  • Compliance evidence exports produce correct control mapping.

TL;DR

This blueprint yields regulator-ready, breach-resilient AuthN/Z: passkeys and bound step-ups, DPoP/mTLS-bound short-lived tokens, centralized OPA with ReBAC and PBAC, plus DB-level RLS as a backstop. Everything is signed, measured, tested, and reversible, with clean audit trails and fast failure modes—exactly what a payments platform needs to scale cross-tenant and cross-region without sacrificing safety or speed.

Modified at 2025-09-22 09:56:40