BAA + TLS + (negotiate) zero data retention. Defensible, but no in-house guardrail on what PHI fields go out.
Strip PHI in our script, keep the decoder ring local, re-identify on return. Weeks, not months.
Skyflow / Protecto / John Snow Labs — battle-tested NER, audit logs, less DIY risk. License cost.
Consolidate everything into our Azure tenant. Hardware-locked enclave. Data never visible outside SGA.
Find the JSON serialization step in the OM Daily pipeline. That tells us exactly which fields the model sees today — and that drives where Step 1 needs to scrub.
Start with Step 1 (homegrown de-ID) while we evaluate Step 2 vendors. In parallel, move toward Step 3 by giving me an Azure sandbox so new databases stop landing in scattered environments.