SGA Dental Partners · Briefing for Ron

What Data Goes Where
When We Run Claude.

A high-level walkthrough of how the OM Daily brief moves data today, and four ways to make that flow safer — from a do-it-ourselves decoder ring to a hardware-locked Azure enclave.
Prepared by: Scott Guest
Audience: Ron · CIO
Date: May 2026
Read time: ~6 minutes
Diagram 1 · Where we are today

Today: Power BI → Script → Claude

INSIDE SGA
  Power BI Semantic Model (production data) · SOURCE
    Database query runs directly, no AI involved.
  Our Script Layer · NO TOKENS USED HERE
    Pulls query results. Builds the JSON payload.
    Python / Node serializes the results into JSON.
OUTSIDE SGA
  Claude API · Anthropic hosted model
    Processes the JSON payload. JSON is the entire input the model gets.
FLOW
  Power BI → (query) → Script Layer → (JSON over HTTPS: this is what the model sees) → Claude API → (response text) → Script Layer
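A minimal sketch of the script-layer step in the diagram above, assuming the query results arrive as a list of row dicts. The function `build_payload` and the field names (`practice`, `production_usd`, `open_chairs`) are hypothetical and not taken from the OM Daily codebase. The point it illustrates: whatever lands in this JSON string is everything the model sees.

```python
import json


def build_payload(rows, report_date):
    """Serialize query results into the JSON string sent to the model."""
    payload = {
        "report_date": report_date,
        "row_count": len(rows),
        "rows": rows,  # every key in every row is visible to the model
    }
    # sort_keys keeps the output stable, which makes payloads easy to diff
    return json.dumps(payload, sort_keys=True)


# Hypothetical rows standing in for a Power BI query result
rows = [
    {"practice": "Practice A", "production_usd": 18250.0, "open_chairs": 2},
    {"practice": "Practice B", "production_usd": 9410.5, "open_chairs": 0},
]
body = build_payload(rows, "2026-05-01")
```

Because the payload is a plain string at this point, it is also the natural place to hang any later scrubbing or logging step.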
The Controls Inventory

What's Already Locked Down — and What's Next

IN PLACE TODAY

Baseline — what we already have

BAA with Anthropic
Business Associate Agreement on file. Contractual HIPAA boundary already established with the model provider.
Zero Data Retention (ZDR)
Anthropic deletes the JSON payload the moment the response is returned. No logs, no storage, nothing sitting on their servers.
Encryption in transit to Anthropic
Every call moves over TLS 1.2+. The payload is unreadable on the wire between our environment and the API.
Data at rest in Azure (where it lives there)
Azure SQL, Cosmos, and Data Lake Storage encrypt by default. Anything we've already moved into the tenant is already encrypted on disk.
! Scattered databases outside the tenant
Local SQLite / DuckDB files spun up by agents today live wherever the agent ran. Not centrally governed, not consistently encrypted, not audited.
! No PHI scrubbing before send
The JSON payload to Claude carries whatever fields the query returned. We rely entirely on contract + ZDR, not on technical de-ID.
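For the encryption-in-transit item in the baseline list, here is a small sketch of how a Python script could refuse anything older than TLS 1.2 on its own side of the connection. This deck does not show how the OM Daily script actually makes its HTTPS calls, so `make_tls_context` is a hypothetical helper; if the script uses an SDK or another HTTP client, the equivalent setting would have to be applied through that client.

```python
import ssl


def make_tls_context():
    """Build a client-side TLS context that rejects TLS 1.1 and older."""
    ctx = ssl.create_default_context()  # verifies certificates and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls_context()
```

The default context already checks the server certificate and hostname; the only change here is pinning the protocol floor explicitly so it does not depend on the machine's OpenSSL defaults.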
FUTURE STATE

What we add — layered, not all-or-nothing

+ De-identification layer (homegrown or commercial)
Strip PHI before it leaves SGA. Decoder ring stays local. Even if everything else failed, the payload would contain only opaque tokens.
+ Consolidate all data into Azure tenant
Stop creating databases outside our walls. Every agent-spawned SQL / vector store / cache lives in the Azure sandbox under one governance model.
+ Encryption at rest, everywhere
Once everything lives in Azure, the at-rest control is universal: no more files on local disks that escape the policy.
+ Auditable AI gateway (optional)
Every Claude call routed through a logged proxy so we can prove what left and what came back, which is useful for compliance reviews.
+ Confidential computing enclave (heavyweight)
Data stays encrypted even during processing. Reduces our dependence on Anthropic's retention promises: clear-text data never leaves the enclave, and only tokenized payloads would ever reach the Claude API.
+ Documented audit trail per request
Who triggered it, what fields went out, what came back, when. Phase B-ready before PHI ever crosses the line.
ANTHROPIC SIDE: BAA · ZDR · TLS in transit
SGA SIDE: Azure encryption at rest · tenant consolidation
STILL TO ADD: De-ID layer · confidential compute
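The per-request audit trail in the future-state list (who triggered it, what fields went out, what came back, when) could be sketched as below. This assumes a payload shaped like `{"rows": [ {...}, ... ]}`, which is a guess at the OM Daily format; `make_audit_record` and its key names are hypothetical. The record stores hashes rather than content so the audit log does not become a second copy of the data.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(text):
    """Hex SHA-256 of a string, used as a content fingerprint."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_audit_record(triggered_by, payload, response_text):
    """Describe one model call without storing the data itself."""
    body = json.dumps(payload, sort_keys=True)
    # Assumed shape: payload["rows"] is a list of flat dicts
    fields = sorted({key for row in payload.get("rows", []) for key in row})
    return {
        "triggered_by": triggered_by,
        "sent_at": datetime.now(timezone.utc).isoformat(),
        "fields_sent": fields,
        "payload_sha256": sha256_hex(body),
        "response_sha256": sha256_hex(response_text),
    }


record = make_audit_record(
    "scott",
    {"rows": [{"practice": "Practice A", "production_usd": 18250.0}]},
    "Practice A is ahead of plan.",
)
```

In the consolidated tenant described later in this deck, records like this would plausibly land in Log Analytics, but that wiring is not shown here.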
Diagram 2 · Option A — Build it ourselves

Manual De-Identification — The Decoder Ring

INSIDE SGA
  Power BI: real data
  1. Strip: names, MRNs, DOBs replaced with tokens (e.g. Patient_47A8)
  2. Decoder Ring: lookup table, stays local only (47A8 → Jane Smith)
  4. Re-identify: reverse-map tokens to real values (Patient_47A8 → Jane)
  5. OM Brief: to the right manager, with full context
OUTSIDE SGA
  3. Claude API (AI processing): sees ONLY tokens, never real PHI
FLOW
  tokenized JSON out → Claude API → tokenized response back
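The five steps above can be sketched in a few lines of Python. This is an illustration under assumptions, not the planned implementation: the `DecoderRing` class, the `PHI_FIELDS` set, and the token format are all hypothetical, and the secret is hard-coded only for the demo (in practice it would presumably live in something like Key Vault). Tokens are derived with an HMAC so the same value always maps to the same token.

```python
import hashlib
import hmac

PHI_FIELDS = {"patient_name", "mrn", "dob"}  # hypothetical field names


class DecoderRing:
    """Local-only lookup between opaque tokens and real values."""

    def __init__(self, secret):
        self._secret = secret
        self._token_to_value = {}

    def tokenize(self, field, value):
        # Deterministic: same field and value always yield the same token.
        # Eight hex characters keeps the demo readable; use more in practice.
        digest = hmac.new(
            self._secret, f"{field}:{value}".encode("utf-8"), hashlib.sha256
        ).hexdigest()
        token = f"{field.upper()}_{digest[:8].upper()}"
        self._token_to_value[token] = value
        return token

    def strip(self, record):
        """Steps 1 and 2: replace PHI fields with tokens, remember the mapping."""
        return {
            key: self.tokenize(key, str(value)) if key in PHI_FIELDS else value
            for key, value in record.items()
        }

    def reidentify(self, text):
        """Step 4: put real values back into the model's response."""
        for token, value in self._token_to_value.items():
            text = text.replace(token, value)
        return text


ring = DecoderRing(secret=b"demo-only-secret")
safe = ring.strip({"patient_name": "Jane Smith", "mrn": "000123", "balance_usd": 120.0})
reply = f"{safe['patient_name']} has an open balance."  # stand-in for step 3
restored = ring.reidentify(reply)
```

Only `safe` would cross the wall; the mapping inside `ring` never leaves the process.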
Diagram 3 · Option B — Buy a guardrail

Drop-in Commercial Gateway

INSIDE SGA
  Power BI: real data
  Our Script: builds JSON
  Privacy Gateway (off-the-shelf product)
    Detects PHI in payload
    Tokenizes · logs · audits
    Re-IDs on return
OUTSIDE SGA
  Claude API: sees tokens only
FLOW
  Script → Gateway → (tokenized) → Claude API → (tokenized response) → Gateway → (re-identified) → brief
All de-identification, mapping, and audit happens in the gateway. We don't write the regex; we configure the policy.
The foundation move · before any confidential compute

Pull Everything Into the Azure Moat

TODAY: SCATTERED (across laptops, VPSes, Cloudflare, drives)
  SQLite · local laptop
  DuckDB · data-inbox
  CSV files · shared drive
  Cloudflare D1 · edge cache
  Random VPS · DigitalOcean
  Agent cache · /tmp/
  Vector store · unmanaged
  XLSX snapshots · OneDrive
  JSON dumps · scripts/.tmp
  Notebook caches · .ipynb_checkpoints
        ↓ consolidate
FUTURE: ONE TENANT
SGA AZURE TENANT (encrypted at rest · audited · access-controlled)
  Azure SQL DB · operational data
  Cosmos DB · agent state
  Data Lake · raw + processed
  Key Vault · secrets · keys
  Functions · agent runners
  AI Search · vector + index
  Blob Storage · files + snapshots
  Container Apps · long-running agents
  Log Analytics · audit trail
ENTRA ID · RBAC · DEFENDER · POLICY: one identity, one audit log, one access policy for everything above
THE MOAT
WHY FIRST
Stops the bleeding — no more new databases landing outside the tenant.
HOW FAST
Minutes to provision a subscription. Lead time = governance approval, not tech.
WHO MIGRATES
Claude Code agents, given the right scoped credentials, can do the SQLite→Azure SQL lift themselves.
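Before any SQLite→Azure SQL lift, someone has to know what is out there. Below is a rough sketch of that inventory step using only the standard library: it walks a directory for database-looking files and, for SQLite files, lists tables and row counts. The suffix list and both function names are assumptions, and the migration itself (the write into Azure SQL) is deliberately not shown.

```python
import os
import sqlite3
from pathlib import Path

DB_SUFFIXES = (".sqlite", ".sqlite3", ".db", ".duckdb")  # assumed naming


def find_database_files(root):
    """Return every file under root whose name looks like a local database."""
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(DB_SUFFIXES):
                hits.append(os.path.join(folder, name))
    return sorted(hits)


def sqlite_table_counts(path):
    """Return {table: row_count} for one SQLite file, opened read-only.

    Raises sqlite3.DatabaseError if the file is not actually SQLite
    (for example a DuckDB file, or an unrelated .db file).
    """
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
            )
        ]
        return {
            t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
            for t in tables
        }
    finally:
        conn.close()
```

Running `find_database_files` over a laptop home directory or an agent workspace gives a first list of what needs a home inside the tenant; the read-only open avoids touching the files being inventoried.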
Diagram 4 · Option C — Lockbox in our cloud

Azure Confidential Computing — The Hardware Lockbox

INSIDE SGA AZURE TENANT
  Power BI: semantic model
  Azure SQL: encrypted at rest
  Agent Sandbox: all jobs run here
  CONFIDENTIAL ENCLAVE · Hardware-Locked Memory (AMD SEV-SNP · Intel TDX)
    Data decrypted ONLY inside a sealed CPU region
    • Host OS cannot see it
    • Hypervisor cannot see it
    • Even Azure admins cannot
OUTSIDE (OPTIONAL)
  Claude API, called from inside the enclave only, or a self-hosted model
FLOW
  enclave → (tokenized only) → Claude API → (tokenized response) → enclave (re-identified inside enclave)
Re-identification happens INSIDE the lockbox. Clear-text patient context never crosses the wall.
Summary · The Ladder

Four Steps. We Don't Have to Pick All of Them.

STEP 0 · TODAY

Current state

BAA + TLS + (negotiate) zero data retention. Defensible, but no in-house guardrail on what PHI fields go out.

STEP 1 · QUICK WIN

De-ID layer (homegrown)

Strip PHI in our script, keep the decoder ring local, re-identify on return. Weeks, not months.

STEP 2 · HARDENED

Commercial gateway

Skyflow / Protecto / John Snow Labs — battle-tested NER, audit logs, less DIY risk. License cost.

STEP 3 · SOVEREIGN

Azure confidential compute

Consolidate everything into our Azure tenant. Hardware-locked enclave. Data never visible outside SGA.


First thing I'll verify when I'm back at the codebase

Find the JSON serialization step in the OM Daily pipeline. That tells us exactly which fields the model sees today — and that drives where Step 1 needs to scrub.
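Once that serialization step is found, a quick way to answer "which fields does the model see" is to walk one real payload and list every field path. The sketch below does that and flags names that look like identifiers. The hint list and the sample payload are hypothetical, and name matching will produce false positives (a field like `practice_name` would be flagged), so it is a starting point for review, not a PHI detector.

```python
SUSPECT_HINTS = ("name", "dob", "mrn", "birth", "phone", "email", "address")


def field_paths(value, prefix=""):
    """Return the set of dotted paths for every dict key in a JSON value."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.add(path)
            paths |= field_paths(child, path)
    elif isinstance(value, list):
        for child in value:
            paths |= field_paths(child, f"{prefix}[]")
    return paths


def flag_suspects(paths):
    """Return paths whose final segment contains an identifier-like hint."""
    return sorted(
        p for p in paths
        if any(hint in p.rsplit(".", 1)[-1].lower() for hint in SUSPECT_HINTS)
    )


# Hypothetical payload; the real one comes from the OM Daily pipeline
payload = {
    "report_date": "2026-05-01",
    "rows": [
        {"practice": "Practice A", "patient_name": "Jane Smith", "production_usd": 18250.0},
    ],
}
paths = field_paths(payload)
suspects = flag_suspects(paths)
```

The flagged paths are the candidate list for where Step 1 needs to scrub.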

My recommendation

Start with Step 1 (homegrown de-ID) while we evaluate Step 2 vendors. In parallel, move toward Step 3 by giving me an Azure sandbox so new databases stop landing in scattered environments.

Product Deep-Dive · 1 of 7

Skyflow

Tokenization Vault · LLM Privacy Vault

Skyflow

Third-party vault that proxies PHI through deterministic tokens before LLM calls.
DECISION SCORE: 4/10 · Partial Fit

WHAT IT DOES

Send sensitive fields (names, MRNs, DOBs) to Skyflow → get back deterministic tokens → ship tokenized payload to Claude → Skyflow re-identifies on the return. Their LLM Privacy Vault wraps Anthropic/OpenAI calls directly.

WILL IT SOLVE THE PROBLEM?

Capability fit is real — this is exactly the in/out pattern we want. But scale is wrong: priced for millions of records/day. Our 265 calls/day is a rounding error in their target market. Phase B PHI is still months away.

ESTIMATED COST (USD/YR)

$120K–$200K/yr. Vendr transactions cluster near $195K. AWS Marketplace = “contact sales / custom annual.” No meaningful free tier — sandbox only via sales.

TIME TO IMPLEMENT

POC: 1–2 weeks (Python SDK + one vault + ~10 PHI schemas). Production: 4–8 weeks including Skyflow BAA execution, role/policy modeling, observability, DR config.

WHAT WE NEED

Separate BAA with Skyflow (3rd party in PHI chain). Python & Node SDKs — both fit our stack. Not Azure-native — calls go to their AWS-hosted vault. Schema lock-in: vault schemas defined upfront, migration required to change.

RED FLAGS

Cost/value mismatch at our volume is severe. Third BAA = third audit surface. Vendr sample size tiny. Microsoft Presidio (free OSS, Azure-native) covers 80% of the value for ~3 days of engineering.

SGA VERDICT
Build a Presidio-based de-ID layer in-house for the OM Daily Agent now. Revisit Skyflow only if Phase B brings multi-system PHI tokenization across WS1 + WS2 + WS3, where $195K amortizes across many workloads.
Product Deep-Dive · 2 of 7

Protecto.ai

API-First Tokenization · LLM Focused

Protecto.ai

REST /mask + /unmask + Python SDK. Entropy-based tokens preserve semantic meaning.
DECISION SCORE: 5/10 · Partial Fit

WHAT IT DOES

NER scans text for PII/PHI → replaces values with deterministic, entropy-based tokens that preserve meaning (e.g. “George Williams” → “wRePE302Qx vUc7DruuWm”) → LLM processes → SDK unmasks on return. Two REST endpoints, simple Python wrap.

WILL IT SOLVE THE PROBLEM?

Strongest technical match for the pattern. Reversible deterministic tokens preserve narrative consistency per practice. But our payload is structured Power BI JSON aggregates — almost no free-text PHI. NER on numeric KPIs is wasted.

ESTIMATED COST (USD/YR)

$3K–$120K/yr. Self-serve startup tier $250/mo. Enterprise list ~$10K/mo per SoftwareSuggest. Enterprise BAA + VPC deploy almost certainly $30K–$120K range. 14-day free trial available.

TIME TO IMPLEMENT

Fastest of the lot. POC: 1–3 days — REST + Python SDK (protecto-ai on PyPI). Production: 2–4 weeks incl. BAA, VPC deploy if needed, audit logging. Inovalon case study: <1 wk POC, <30 days prod.

WHAT WE NEED

Python SDK ready. BAA available (must execute). No Azure marketplace — SaaS or custom Azure VPC. Tokens only reversible inside their vault (outage = no re-ID). Anthropic Claude is a supported LLM.

RED FLAGS

Solves a problem we don’t have today (aggregated metrics ≠ unstructured PHI). Thin third-party validation — few independent reviews. Pricing opacity ($250 vs $10K spread = sales-led). Anthropic BAA already covers most exposure.

SGA VERDICT
Cleanest pattern match of any commercial vendor and fastest to stand up. Reasonable to revisit in Phase B when patient records, call recordings, or treatment plans enter the pipeline. For today’s OM Daily Agent, homegrown Presidio is cheaper and equivalent.
Protecto in Practice · concrete fit at SGA

One Vault, Many Data Sources, One Safe Exit

INSIDE SGA
  DATA SOURCES
    Power BI · low PHI, mostly aggregated
    Neurality · PHI: patient demographics, DOB
    Pure Logic · PHI: clinical + financial
    Go high level · lead management · contact data
    Dental Intel · PHI: practice + patient
    Other Systems · PHI & Non-PHI
  COMMERCIAL VAULT · Protecto.ai
    1. Detect PHI · Tokenize: NER scans every field · entropy-based tokens ("Jane Smith" → "wRePE302Qx")
    2. Decoder Ring (stays in vault): deterministic map · consistent across calls · never crosses our wall
    3. Re-identify on return · audit: reverses tokens for downstream consumers · every call logged · HIPAA-ready
    Mask / Unmask SDK · REST API · ~265 calls/day · +2 hops
  Re-identified Output: OM Briefs · Daybreak · dashboards · agent context (full patient context restored, only inside SGA)
OUTSIDE SGA
  ANTHROPIC HOSTED · Claude API: sees tokens only, never real PHI · BAA · ZDR · TLS
FLOW
  data sources → vault → (tokenized JSON) → Claude API → (tokenized response) → vault → re-identified output
THE KEY IDEA
Every PHI-bearing system in SGA fans into one vault. The vault — not each pipeline — owns the decoder ring. Claude only ever sees tokens.
WHY PROTECTO
Highest-scored commercial vendor (5/10). REST API + Python SDK. Anthropic Claude is a documented supported LLM. POC = 1–3 days.
WHAT IT COSTS
Self-serve $3K/yr for proof-of-life; enterprise BAA tier likely $30K–$120K/yr. 14-day free trial available before any commitment.
Product Deep-Dive · 3 of 7

John Snow Labs

Healthcare NLP Library · Apache Spark

John Snow Labs

Spark NLP for Healthcare — 2,500+ pre-trained clinical models, 96% F1 on PHI detection.
DECISION SCORE: 3/10 · Poor Fit

WHAT IT DOES

Python library of 2,500+ pre-trained clinical/medical NLP models built on Apache Spark. De-ID pipeline detects 23+ PHI entity types and either masks or substitutes realistic surrogates while preserving relational consistency. Hospital-grade: Providence de-identified 2B notes with 0% re-ID rate.

WILL IT SOLVE THE PROBLEM?

Overkill for the data shape. Their edge is unstructured clinical text (notes, discharges, pathology). SGA’s payload is structured JSON aggregates — no clinical notes. Dental NER doesn’t exist in their corpus (trained on MIMIC, i2b2). Spark JVM is mandatory — heavy dependency for 265 calls/day.

ESTIMATED COST (USD/YR)

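$50K–$120K/yr floating license. AWS Marketplace hourly pricing on m5.2xlarge is quoted here as ~$47/hr and as $34K/yr always-on; those two figures do not reconcile ($47/hr × 8,760 hr ≈ $412K/yr, while $34K/yr ≈ $3.88/hr), so one of them needs rechecking against the listing. 24x7 support pushes to $150K–$250K. Per-server licensing, with no volume discount for our scale.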

TIME TO IMPLEMENT

POC: 2–3 weeks incl. JVM/Spark setup, model download (5–15GB), license activation, integration. Production: 6–10 weeks with re-ID mapping, audit logging, security review.

WHAT WE NEED

Spark runtime mandatory (even single-node). Adds JVM to deployment, ~2GB container size. License JSON file. Python only for orchestration. Azure-compatible (Marketplace + Databricks). 8 cores / 64GB recommended.

RED FLAGS

Hospital-system tooling, not dental DSO. Six-figure spend with no published pricing = vendor lock-in. Spark dependency balloons a Python script into a JVM service. Healthcare-NER advantage is wasted on aggregated metric JSON. Microsoft Presidio (free, Azure-native, ships MedicalNER 2025) + dental regex = 90%+ value for $0.

SGA VERDICT
Strong product, wrong problem. Skip for OM Daily Agent. Could become relevant only if Phase B brings clinical notes, treatment plans, or call recordings into the AI pipeline at high volume.
Product Deep-Dive · 4 of 7

Datavant

Healthcare Data Linkage · HIPAA Expert Determination

Datavant

Privacy Hub + Connect — tokenization for cross-dataset linkage (clinical trials, pharma).
DECISION SCORE: 2/10 · Poor Fit

WHAT IT DOES

Privacy Hub: HIPAA Expert Determination — statisticians certify a dataset is de-identified for re-release. Connect: tokenization software for cross-dataset patient linkage. Built for static dataset preparation, not real-time API scrubbing.

WILL IT SOLVE THE PROBLEM?

Wrong use case. Datavant solves “share a dataset with a research partner provably de-identified.” SGA needs “scrub a JSON blob in flight before Anthropic.” Connect tokenizes patient identifiers for linkage — doesn’t redact free-text PHI in arbitrary payloads. Different problem class entirely.

ESTIMATED COST (USD/YR)

$130K–$350K/yr. AWS Marketplace lists Connect at $300K/yr (12-mo commit). Vendr median: $130K. Privacy Hub Expert Determination: $50K–$150K per engagement. No SMB tier exists.

TIME TO IMPLEMENT

6–9 months total. Enterprise sales cycle: 3–6 months. POC: 4–8 weeks after contracting. Production integration: 8–16 weeks. Far slower than every other option in this deck.

WHAT WE NEED

AWS-only deployment — Datavant Connect ships as Docker on ECS/EKS, reads/writes S3, integrates with AWS Clean Rooms. No documented Azure support — major architectural mismatch for SGA. Requires PII fields present to tokenize.

RED FLAGS

Product-problem mismatch is the deal-killer. 6-figure minimum commit for what Presidio + AWS Comprehend Medical solves natively. Azure incompatible. 3–6 month enterprise sales cycle. Zero dental/DSO references — we’d be a net-new vertical for them.

SGA VERDICT
Skip. Datavant solves a fundamentally different problem (data-linkage for research) than ours (LLM payload scrubbing). Recommend evaluating Tonic Textual, Private AI, AWS Comprehend Medical, or Presidio instead.
Product Deep-Dive · 5 of 7

Granica

AI Safe Room · Series A startup

Granica Screen

VPC-deployed PII/PHI masking agent for LLM prompts — AWS & GCP only.
DECISION SCORE: 3/10 · Poor Fit

WHAT IT DOES

PII/PHI detect-and-mask service deployed as a lightweight agent inside the customer’s VPC. Scans text (LLM prompts, RAG context, tabular data) with NER models, replaces sensitive entities with synthetic surrogates or tokens before the prompt leaves the environment, re-identifies on return. Marketed as “AI Safe Room.”

WILL IT SOLVE THE PROBLEM?

Azure NOT supported — AWS (EKS) and GCP (GKE) only. Fatal for SGA’s stack. Claude support unconfirmed (docs return 404). Re-identification flow documentation is thin. Built for petabyte data-lake scale — massive overkill for 265 calls/day.

ESTIMATED COST (USD/YR)

$60K–$250K/yr enterprise. Pricing page literally “under construction.” Sales-led only. Reference customers are Fortune 500 with petabyte data lakes; SGA is several orders of magnitude smaller than ICP.

TIME TO IMPLEMENT

If Azure were supported: POC 2–4 weeks, production 6–10 weeks. Real answer: cannot deploy on Azure today. Would require AWS migration or cross-cloud network path.

WHAT WE NEED

Kubernetes cluster (EKS or GKE only). Custom Python wrapper around Anthropic SDK to call Screen API for redact/rehydrate. BAA not publicly advertised — must negotiate. Docs gate (404s on Screen overview pages) signals product immaturity.

RED FLAGS

Series A, ~40 employees (Feb 2026). Screen still “early access preview.” Zero healthcare-specific case studies. No G2/independent reviews. Pricing page under construction. Roadmap pivoting (Screen + Chronicle both early access).

SGA VERDICT
Skip. Azure incompatibility alone disqualifies. Even setting that aside — Series A maturity, zero healthcare references, and enterprise pricing for 265 calls/day make this a no. Better fits: Presidio (Azure-native, free), LiteLLM proxy + Presidio guardrail.
Product Deep-Dive · 6 of 7

hoop.dev

Identity-Aware Access Gateway · Hybrid OSS/SaaS

hoop.dev

Built for human→DB access with masking. LLM API masking is marketing extension, not core product.
DECISION SCORE: 3/10 · Poor Fit

WHAT IT DOES

Identity-aware access gateway in front of databases (Postgres, MySQL, Mongo, MSSQL), SSH/RDP/K8s, and MCP servers. Parses wire protocols, masks PII in responses, blocks dangerous commands, records sessions. AI/LLM masking is a marketing extension of the same gateway, mostly targeted at MCP clients — not server-side Python scripts calling Anthropic.

WILL IT SOLVE THE PROBLEM?

Wrong dataflow direction. OM Daily Agent = outbound HTTPS to api.anthropic.com from Python. hoop.dev intercepts database wire protocols + MCP traffic, NOT arbitrary outbound REST. Their docs show no first-class “Anthropic API proxy” path. Forcing it would require an undocumented custom HTTP plugin.

ESTIMATED COST (USD/YR)

$5K–$15K/yr cloud (per-identity, ~$30–$60/identity/month). $0 self-host MIT core — but loses commercial AI masking, web UI, IdP sync, and support. Cheapest option in this deck if you only need basic access control.

TIME TO IMPLEMENT

POC: 2–4 weeks — and only if abusing the OSS HTTP/gRPC plugin to wrap outbound LLM calls (unsupported pattern). Production: 6–10 weeks including Azure deploy (Docker/K8s), Entra wiring, custom masking rules, audit logging, BAA negotiation.

WHAT WE NEED

Azure VM or AKS for self-host. Gateway in Go, control plane in Clojure. OIDC/SAML IdP (Entra OK). Custom Go plugin work to handle Anthropic HTTPS as a “connection” — not documented. No Python SDK that wraps the LLM call natively.

RED FLAGS

AI masking story is aspirational, not mature. G2 minimal review volume. No healthcare/HIPAA customers cited. Featured case studies are SRE/database access, not PHI-to-LLM. Stateful proxy + Azure infra dependency for 265 calls/day = massive infra tax for tiny workload.

SGA VERDICT
Skip for this use case. For 265 calls/day of script→Anthropic with an existing BAA, in-process Python de-ID (Presidio + dental dictionary) is dramatically lighter and exactly fits the data direction. Reconsider hoop.dev only if SGA needs human→production-DB access governance.
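To make "in-process Python de-ID" concrete, here is a pattern-based scrub written with the standard library only. It is a stand-in for the idea, not Presidio's API: the three patterns, the `<LABEL>` placeholders, and the function names are all assumptions, and a real layer would add name detection (NER or a dictionary), which regexes like these cannot do. Unlike a decoder ring, this masking is one-way.

```python
import re

# Hypothetical patterns; a production list would be broader and tested
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:#]?\s*\d{4,10}\b", re.IGNORECASE),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def scrub_text(text):
    """Replace each pattern match with a placeholder such as <DATE>."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


def scrub_payload(value):
    """Walk a JSON-like value and scrub every string; leave numbers alone."""
    if isinstance(value, dict):
        return {key: scrub_payload(child) for key, child in value.items()}
    if isinstance(value, list):
        return [scrub_payload(child) for child in value]
    if isinstance(value, str):
        return scrub_text(value)
    return value


clean = scrub_payload(
    {
        "note": "Pt MRN: 123456 born 04/12/1988, call 555-123-4567",
        "production_usd": 18250.0,
    }
)
```

Because it runs inside the same Python process that builds the payload, it adds no network hop and no extra infrastructure, which is the property the verdict above is pointing at.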
Product Deep-Dive · 7 of 7

Fortanix

Confidential Computing · Hardware Enclaves

Fortanix

DSM (HSM) + CCM (enclave orchestration) + Armet AI (turnkey GenAI in TEE).
DECISION SCORE: 3/10 · Poor Fit

WHAT IT DOES

Data Security Manager: FIPS 140-2 Level 3 key/HSM manager. Confidential Computing Manager: orchestration + remote attestation for apps inside hardware enclaves (Intel SGX/TDX, AMD SEV-SNP, NVIDIA H100 confidential GPU). Armet AI: turnkey GenAI stack (vector DB + inference + DLP + guardrails) where every stage runs inside a TEE.

WILL IT SOLVE THE PROBLEM?

FATAL BLOCKER: cannot put Claude inside a Fortanix enclave. Claude weights are not licensed for self-hosting. Anthropic’s own Confidential Inference via Trusted VMs (Nov 2024) is research-stage, not a product. Fortanix presumes you own the model (Llama, Mistral). Once a request leaves your enclave to hit api.anthropic.com, the TEE guarantee is broken.

ESTIMATED COST (USD/YR)

$200K–$600K/yr all-in. DSM: $30K–$80K entry, $100K–$250K+ HA production. CCM: $50K–$150K add-on. Armet AI: $250K–$500K+. Plus Azure CVM compute ($0.50–$2/hr per node). No public price list.

TIME TO IMPLEMENT

POC: 6–10 weeks (Azure CVM provisioning, attestation wiring, one workload). Production: 6–9 months with a dedicated security engineer + Fortanix professional services engagement. Not weeks.

WHAT WE NEED

Azure CVM quota (DCasv5/ECasv5 SEV-SNP or DCesv5 TDX). Azure Attestation Service. BAA with Microsoft (have) + Fortanix (separate). Self-hostable model (Llama 3, Mistral, Phi) — not Claude. Python/Go/Java SDKs. Fortanix REST APIs + CCM node agents.

RED FLAGS

Claude unsupported — period. Massive overkill for Phase A non-PHI data. Operational drag: attestation failures, CVM patching, H100 GPU availability constrained. Better Phase A alt: Azure OpenAI + BAA + private endpoint + CMK covers 95% of the threat model at 1% of the cost.

SGA VERDICT
Revisit only when (a) Phase B PHI is in active scope AND either (b) we’re willing to self-host an open model in place of Claude or (c) Anthropic productizes confidential inference. Today, none of those apply.
The Decision · Side-by-Side

Vendor Decision Matrix — And What I’d Recommend

Product | Category | Verdict | Score | Cost / yr | Time to Prod | Killer Issue
Skyflow · LLM Privacy Vault | Tokenization vault | Partial | 4 | $120K–$200K | 4–8 weeks | Enterprise pricing for an enterprise problem we don’t have yet. AWS-hosted, not Azure-native.
Protecto.ai · REST tokenization API | API tokenization | Partial | 5 | $3K–$120K | 2–4 weeks | Best technical pattern match. Solves a Phase B problem we don’t have today.
John Snow Labs · Healthcare NLP | NLP library | Poor | 3 | $50K–$120K | 6–10 weeks | Hospital-system tooling. No dental NER. Spark JVM dependency.
Datavant · Privacy Hub + Connect | Data linkage | Poor | 2 | $130K–$350K | 6–9 months | Wrong problem class (cross-dataset linkage). AWS-only. No Azure support.
Granica · Screen / AI Safe Room | LLM masking proxy | Poor | 3 | $60K–$250K | 6–10 weeks | Azure NOT supported. Series A, early access, zero healthcare references.
hoop.dev · Access gateway | DB / MCP gateway | Poor | 3 | $5K–$15K | 6–10 weeks | Wrong dataflow direction. Built for human→DB, not script→LLM.
Fortanix · Confidential compute | Hardware enclaves | Poor | 3 | $200K–$600K | 6–9 months | Cannot put Claude inside any enclave. Anthropic doesn’t license self-hosting.
Homegrown (Presidio) · Microsoft OSS + dental regex | In-process Python lib | Strong | 8 | $0 license | 1–2 weeks | None. Right-sized for Phase A. Buy commercial only when Phase B PHI volume justifies it.
RECOMMENDED PATH: Homegrown + Presidio
Every commercial vendor is either wrong-tool, wrong-cloud, or 10–100× overpriced for the OM Daily Agent’s Phase A scope. Microsoft Presidio is Azure-native, open-source, and runs in-process — 1–2 weeks of engineering, $0 license. Re-evaluate Protecto.ai or Skyflow the moment Phase B activates and PHI volume + cross-system tokenization needs make a commercial tool earn its keep.
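One way to see the pricing mismatch is to turn each vendor's annual range into a cost per Claude call at the 265 calls/day volume cited throughout this deck. The sketch below assumes calls run all 365 days of the year, which is a guess; fewer active days would push the per-call figures higher. The ranges are copied from the matrix above, and the variable names are illustrative.

```python
CALLS_PER_DAY = 265
CALLS_PER_YEAR = CALLS_PER_DAY * 365  # assumes calls every day of the year

# (low, high) annual cost in USD, taken from the decision matrix
ANNUAL_COST_USD = {
    "Skyflow": (120_000, 200_000),
    "Protecto.ai": (3_000, 120_000),
    "John Snow Labs": (50_000, 120_000),
    "Datavant": (130_000, 350_000),
    "Granica": (60_000, 250_000),
    "hoop.dev": (5_000, 15_000),
    "Fortanix": (200_000, 600_000),
}


def cost_per_call(annual_usd):
    """Annual licence cost spread evenly over a year of calls."""
    return annual_usd / CALLS_PER_YEAR


per_call = {
    vendor: (round(cost_per_call(low), 2), round(cost_per_call(high), 2))
    for vendor, (low, high) in ANNUAL_COST_USD.items()
}

for vendor, (low, high) in per_call.items():
    print(f"{vendor}: ${low:.2f} to ${high:.2f} per call")
```

At this volume Skyflow's range works out to roughly $1.24 to $2.07 of licence cost per call, before any engineering time or API usage is counted.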