The SoM Stack

Affective compute infrastructure, written down.

RTScale builds a cryptographic consent layer for consequential moments. The primitive is the State of Mind Signature — a hardware-rooted, tamper-evident artifact that documents observable affective indicators at the instant of action. This page covers the technical model: four signature classes, on-device capture architecture, and the decomposed-indicator format that makes the artifact both portable and privacy-preserving.

§01

Signature classes

Every SoM Signature is declared at one of four named classes. The class selected at session start is a shorthand for an entire configuration — which inputs are present, how identity is bound, what capture protocol runs, and what fidelity floor applies.

A

Adjudication

Legal-effect authorization. Dispute-defensible.

When
High-value wire transfers above jurisdiction threshold. APP-scam-prone transfers (first-time payee, urgency markers). Insurance claim affidavits. Any transaction requiring a full cryptographic record of consent under potential adversarial review.
Inputs bound
All five inputs: facial emotion (FE), vocal emotion (VE), microexpression events (ME), identity (I), and transcript (T). ME minimum: 5 events; default: 6 captured.
Identity
Continuous internal biometric — voice and facial biometric both required. Strength floor ≥ 0.90.
Latency
Agent-driven Q&A capture. Deep Scan finalized within 1.5 s of last utterance. Async high-fidelity reprocessing available for dispute review.
B

Authorization

The workhorse tier for real-time payments.

When
Standard RTP / FedNow retail payments. Card step-up authentication (moderate value). Account opening as a KYC adjunct. Workforce privileged-action authorization.
Inputs bound
All five inputs: FE, VE, ME, I, T. ME minimum: 3 events; default: 4 captured. ME policy may be parameterized above the class minimum.
Identity
Continuous internal biometric — voice or facial biometric (either one). Strength floor ≥ 0.80.
Latency
Agent-driven Q&A or rich scripted-phrase capture with embedded keywords. Deep Scan finalized within 1.5 s of last utterance.
C

Confirmation

Friction-sensitive re-confirmation within an established session.

When
In-session re-confirmation for low-value repeats. Consumer-facing web-SDK flows — Confirmation is the highest class routinely supportable in-browser. Contact-center agent identity continuity. Login / session continuity checks.
Inputs bound
Facial emotion (FE), vocal emotion (VE), and transcript (T). ME optional — minimum 1 event, default 2 when elicited.
Identity
Start-only internal biometric acceptable, OR continuous external verification. Strength floor ≥ 0.65.
Latency
Scripted-phrase repetition (Deep-Scan protocol) or Quick Scan when friction budget is tight. Quick Scan resolves within ~200 ms p95 of capture-complete.
D

Presence

Non-gating affective signal. No scripted phrase. No identity claim.

When
Analytics and consented research signals. Login and session continuity checks where no financial gate is required. Passive ambient engagement signals during sustained interactions.
Inputs bound
Facial emotion (FE), vocal emotion (VE), and transcript (T). No ME requirement. Anonymous SoM Sig emitted — no identity binding on the signature.
Identity
External verification or none. Identity binding is optional by class design.
Latency
Passive capture during natural interaction — no scripted phrase. Signal available within ~200 ms p95. No escalation path; Presence does not gate any action.
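
The four class definitions above can be read as a policy table. The following is a minimal sketch of that reading, not RTScale's SDK: the names `ClassPolicy`, `POLICIES`, and `check_compliance` are hypothetical, and the Adjudication and Authorization ME minimums (5 and 3) assume the first number quoted for each class is the floor.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class ClassPolicy:
    name: str
    required_inputs: FrozenSet[str]
    me_required: bool                 # False: ME is optional or not required
    me_minimum: int                   # floor applied whenever ME is in play
    identity_floor: Optional[float]   # None: no identity strength floor


FULL = frozenset({"FE", "VE", "ME", "I", "T"})
LIGHT = frozenset({"FE", "VE", "T"})

# Values transcribed from the class descriptions; the layout is an assumption.
POLICIES = {
    "A": ClassPolicy("Adjudication", FULL, True, 5, 0.90),
    "B": ClassPolicy("Authorization", FULL, True, 3, 0.80),
    "C": ClassPolicy("Confirmation", LIGHT, False, 1, 0.65),
    "D": ClassPolicy("Presence", LIGHT, False, 0, None),
}


def check_compliance(
    class_id: str,
    inputs_present: Iterable[str],
    me_events: int = 0,
    identity_strength: Optional[float] = None,
    me_minimum_override: Optional[int] = None,
) -> List[str]:
    """Return a list of compliance problems; an empty list means compliant."""
    policy = POLICIES[class_id]
    present = set(inputs_present)
    problems: List[str] = []

    missing = policy.required_inputs - present
    if missing:
        problems.append("missing inputs: " + ", ".join(sorted(missing)))

    # ME policy may be raised above the class minimum, never lowered below it.
    me_min = policy.me_minimum
    if me_minimum_override is not None:
        me_min = max(me_min, me_minimum_override)
    if (policy.me_required or "ME" in present) and me_events < me_min:
        problems.append(f"ME events {me_events} below minimum {me_min}")

    if policy.identity_floor is not None:
        if identity_strength is None or identity_strength < policy.identity_floor:
            problems.append(
                f"identity strength below floor {policy.identity_floor:.2f}"
            )
    return problems
```

Under these assumptions, a capture with four ME events and identity strength 0.85 passes Authorization but fails Adjudication on both the ME minimum and the identity floor.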

§02

On-device capture model

Capture runs on-device — in the Secure Enclave, StrongBox, or TPM 2.0 execution context of the signer's hardware. The cloud receives the signed artifact; raw camera frames and audio buffers stay on the device.

iOS · iPadOS

Secure Enclave

Hardware-isolated execution context on every iPhone since iPhone 5s. The SoMM inference runs within an enclave-attested process; the root token and session keys are generated inside the enclave and never exported in plaintext. The SoM Sig token is hardware-bound to the device.

  • Full-tier fidelity
  • Adjudication-capable
  • Face ID continuity

Android

StrongBox Keymaster

Tamper-resistant hardware security module. StrongBox provides the same key-isolation guarantee as Secure Enclave on qualifying Android devices. Devices without StrongBox fall back to the TEE (Trusted Execution Environment) with a documented fidelity downgrade and class ceiling.

  • Full-tier on qualifying hardware
  • Standard-tier on TEE fallback
  • Class ceiling per device probe

Windows · macOS

TPM 2.0

Trusted Platform Module 2.0 provides the hardware attestation root on desktop. The RTScale Desktop SDK probes for TPM availability at install time and configures the class ceiling accordingly. Native desktop is Adjudication-capable on all machines meeting the Windows 11 TPM 2.0 requirement.

  • Full-tier on TPM 2.0
  • Adjudication-capable
  • Agent Q&A capture

Browser · Web SDK

WebAuthn + camera API

The Web SDK runs camera and microphone capture in-browser against the WebAuthn platform authenticator. Class ceiling is Confirmation — the highest class routinely supportable in-browser without a native runtime. Presence and Confirmation are fully supported; Authorization and Adjudication require native.

  • Confirmation ceiling (class C)
  • Presence supported
  • No native install required
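
The class ceilings described for each runtime can be sketched as a simple ordering check. This is illustrative only: the platform keys and the `require_class` helper are hypothetical, and the Android ceiling is deliberately left to a caller-supplied probe result because the text above says it is set per device probe without stating a fixed value.

```python
from typing import Optional

# Classes ordered from least to most demanding.
CLASS_RANK = {"D": 0, "C": 1, "B": 2, "A": 3}

# Only ceilings stated in the text; key names are made up for this sketch.
STATED_CEILINGS = {
    "ios_secure_enclave": "A",
    "desktop_tpm2": "A",
    "web_sdk": "C",
}


def supports(ceiling: str, requested: str) -> bool:
    """True when the requested class is at or below the ceiling."""
    return CLASS_RANK[requested] <= CLASS_RANK[ceiling]


def require_class(
    platform: str, requested: str, probed_ceiling: Optional[str] = None
) -> str:
    """Return the requested class, or raise if the runtime cannot support it."""
    ceiling = probed_ceiling or STATED_CEILINGS.get(platform)
    if ceiling is None:
        raise LookupError(f"no ceiling known for {platform}; run a device probe")
    if not supports(ceiling, requested):
        raise ValueError(
            f"class {requested} exceeds the {platform} ceiling ({ceiling})"
        )
    return requested
```

Raising instead of silently downgrading keeps the declared class honest: a web flow that asks for Authorization fails loudly and can be routed to a native runtime.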

Why on-device matters

  • Raw data minimization. Camera frames and audio buffers never leave the device. The cloud receives the signed artifact — not the biometric source material.
  • Hardware attestation chain. The signature token is bound to the device's root of trust. A signature produced on device A cannot be replayed from device B — the attestation chain fails.
  • Provenance chain integrity. The SoM Provenance Chain runs from the per-subject hardware root token through ephemeral session keys to the SoM Sig token. Cascade revocation from the root token renders all downstream signatures cryptographically unverifiable.
  • Latency. On-device inference avoids a round-trip to the cloud for the capture-phase ML. Quick Scan resolves within ~200 ms p95 of capture-complete. Deep Scan finalizes within 1.5 s of the last utterance.
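
The provenance chain and cascade revocation described above can be illustrated with a toy model. This sketch uses HMAC-SHA256 from the Python standard library as a stand-in for whatever hardware-backed scheme the real chain uses. The `RootRegistry` class, the key-derivation labels, and the in-memory storage are all assumptions made for illustration; in the model described on this page the root token lives in device hardware, not in a dictionary.

```python
import hashlib
import hmac
import os
from typing import Dict


def derive_session_key(root_token: bytes, session_id: bytes) -> bytes:
    """Derive an ephemeral session key from the per-subject root token."""
    return hmac.new(root_token, b"session:" + session_id, hashlib.sha256).digest()


def sign_payload(session_key: bytes, payload: bytes) -> bytes:
    """Produce a signature token over the payload under the session key."""
    return hmac.new(session_key, payload, hashlib.sha256).digest()


class RootRegistry:
    """Toy holder of per-subject root tokens (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self._roots: Dict[str, bytes] = {}

    def enroll(self, subject_id: str) -> None:
        self._roots[subject_id] = os.urandom(32)

    def revoke(self, subject_id: str) -> None:
        # Destroying the root breaks every key derived from it.
        self._roots.pop(subject_id, None)

    def sign(self, subject_id: str, session_id: bytes, payload: bytes) -> bytes:
        key = derive_session_key(self._roots[subject_id], session_id)
        return sign_payload(key, payload)

    def verify(
        self, subject_id: str, session_id: bytes, payload: bytes, token: bytes
    ) -> bool:
        root = self._roots.get(subject_id)
        if root is None:
            return False  # root revoked: the chain can no longer be rebuilt
        expected = sign_payload(derive_session_key(root, session_id), payload)
        return hmac.compare_digest(expected, token)
```

In this model the signed artifact itself is never deleted. Once the root is revoked, no party can rebuild the session key, so every token derived from that root fails verification.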

§03

Decomposed indicators model

The SoM Signature carries decomposed affective indicators — not raw biometric data. The distinction is architecturally and legally load-bearing.

What the signature contains

  • Facial Emotion (FE) stream. Per-frame Action Unit (AU) activations and a fused embedding vector over the capture window. FACS-coded. Dimension-fixed per pipeline version.
  • Vocal Emotion (VE) stream. Per-window prosody features derived from self-supervised speech representations (HuBERT / WavLM). No speaker-identification embedding is included: these are vocal features, not a voiceprint.
  • Microexpression (ME) events. Discrete keyword-bound events, not a continuous stream: onset timestamp, AU pattern, onset–apex–offset timing. Each event is linked to the keyword articulated when the microexpression fired.
  • Structural labels. Class declared, inputs present, identity binding method and temporality, ME event count, class compliance status, pipeline version. Non-negotiable: every signature carries them.
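
One way to picture the contents listed above is as a typed record. This is a sketch under stated assumptions, not the wire format: every field name, the millisecond timing units, and the JSON serialization are hypothetical. Only the categories of data come from the list.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class MEEvent:
    keyword: str            # keyword articulated when the event fired
    onset_ms: int
    apex_ms: int
    offset_ms: int
    au_pattern: List[int]   # FACS Action Unit numbers involved


@dataclass
class StructuralLabels:
    class_declared: str
    inputs_present: List[str]
    identity_binding_method: str
    identity_temporality: str
    me_event_count: int
    class_compliant: bool
    pipeline_version: str


@dataclass
class SoMPayload:
    labels: StructuralLabels
    fe_au_frames: List[List[float]] = field(default_factory=list)  # per-frame AUs
    fe_embedding: List[float] = field(default_factory=list)        # fused vector
    ve_windows: List[List[float]] = field(default_factory=list)    # prosody features
    me_events: List[MEEvent] = field(default_factory=list)

    def to_json(self) -> str:
        # The structural labels must agree with the data they describe.
        if self.labels.me_event_count != len(self.me_events):
            raise ValueError("ME event count label does not match events")
        return json.dumps(asdict(self), sort_keys=True)


example = SoMPayload(
    labels=StructuralLabels("B", ["FE", "VE", "ME", "I", "T"],
                            "facial", "continuous", 1, False, "0.0-example"),
    fe_au_frames=[[0.1, 0.0, 0.7]],
    fe_embedding=[0.2, 0.4],
    ve_windows=[[0.3, 0.5]],
    me_events=[MEEvent("confirm", 1200, 1290, 1400, [4, 7])],
)
```

The record has no field for camera frames, audio buffers, or identity embeddings, so the exclusions are visible in the type itself.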

What the signature does not contain

  • Raw video or audio. Camera frames and audio buffers are consumed by the on-device inference pipeline and not retained after the SoM payload is emitted. The signature cannot be reversed into a video of the signer.
  • A facial recognition embedding. The FE stream is an AU-coded affect vector, not a face recognition embedding. It is not designed to let a downstream system identify the signer's face in a second image, and it does not function as such.
  • A voiceprint or speaker embedding. VE features are derived from prosodic structure, not speaker identity. They support vocal emotion inference, not speaker identification.
  • A clinical or psychiatric assertion. The signature documents observable affective and behavioral indicators. It does not classify "true" internal emotional states in a ground-truth sense and does not make medical or psychiatric assertions.

Why this matters for privacy regulation

GDPR Article 17 (right to erasure) is satisfied by cryptographic erasure: revoking the per-subject root token renders all downstream SoM Sig tokens cryptographically unverifiable without deleting the signed artifact itself. A relying party's record-retention obligations, such as a title company's E&O retention requirement or a state probate record obligation, are preserved; the underlying biometric provenance is rendered inaccessible.

CCPA, LGPD, and PIPL biometric-data provisions are addressed by the decomposed-indicator format: the signature carries affect vectors, not facial-recognition embeddings or voiceprints, which are the categories those statutes target most narrowly. Jurisdictions with the broadest biometric definitions (Illinois BIPA) are addressed in the compliance library index.

Read the protocol spec. Schedule a working session.

The developer brief covers the full pipeline in ~2,500 words — signature format, token lifecycle, SDK integration surface, and trust model. The working session is one hour with a solutions engineer and, where relevant, a compliance counsel co-pilot.