How to Run AI on Sensitive Data Without Violating HIPAA or GDPR
Surveys of enterprise technology leaders put the share of AI deployments blocked by 'no sensitive data in LLMs' policies at 97%. This guide covers exactly how to unlock AI on PII, PHI, and financial data using hardware-isolated enclaves — with cryptographic proof your data never left the boundary.

Your AI pilot worked. The model surfaced insights in minutes that used to take analysts days. The business case was obvious. Then legal reviewed it and the answer was no.
"We can't send patient records to OpenAI." "Customer transaction data can't leave our perimeter." "This violates our GDPR data processing obligations."
This isn't a rare edge case. According to surveys of enterprise technology leaders, 97% of AI deployments in regulated industries hit a wall the moment sensitive data is in scope. The same capabilities that make AI transformative — vast context windows, cross-document reasoning, real-time pattern matching — require the model to actually see the data. And seeing the data means sending it somewhere you can't fully control.
The traditional response is to mask, redact, or anonymize data before sending it to an AI. This works for some tasks. For the ones that matter most — detecting fraud in transaction flows, surfacing risk in patient records, catching anomalies in financial filings — removing the sensitive data removes the signal.
This guide covers the actual solutions: what they are technically, what each one does and doesn't guarantee, and how to implement them in production.
Why "Just Anonymize It" Doesn't Always Work
The conventional advice is to strip PII before sending data to an AI model. It sounds simple. In practice it has real limits:
Re-identification risk. Research consistently shows that even heavily anonymized datasets can be re-identified. An AI model operating on "anonymized" records can correlate age, zip code, diagnosis codes, and visit timestamps to infer identity with high probability. For medical or financial data, this is not a theoretical risk — regulators have brought enforcement actions over it.
Context destruction. The fields you remove are often the ones the model needs. If you're asking an AI to detect fraudulent transactions, removing the account holder's name, address, and behavioral history makes the problem unsolvable. If you're asking it to flag medication interactions, removing the patient's age and prior diagnoses reduces accuracy to noise.
Compliance doesn't end at the input. GDPR Article 25 (Data Protection by Design), HIPAA's Technical Safeguards, and the EU AI Act all require that you demonstrate ongoing protection of personal data throughout its lifecycle — not just at the point it enters a system. Stripping data before inference doesn't address what happens to intermediate computations, logs, or outputs.
Outputs can contain PII. Language models are generative. They can reproduce or extrapolate personal information from context even when the input was nominally anonymized. Controlling the input doesn't control the output.
None of this means anonymization is useless. It means it's not sufficient as the only control — and it actively destroys value in the use cases where AI provides the most leverage.
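The re-identification point is easy to demonstrate. Here is a toy sketch in TypeScript (fabricated records and a hypothetical `reidentify` helper, purely illustrative): once names are stripped, the remaining quasi-identifiers can still isolate a single person.

```typescript
// Toy "anonymized" dataset: names removed, quasi-identifiers intact.
// All records are fabricated for illustration.
interface PatientRow {
  zip: string;
  birthYear: number;
  visitDate: string;
  diagnosis: string;
}

const anonymized: PatientRow[] = [
  { zip: '02139', birthYear: 1958, visitDate: '2026-01-04', diagnosis: 'E11.9' },
  { zip: '02139', birthYear: 1991, visitDate: '2026-01-04', diagnosis: 'J45.909' },
  { zip: '60601', birthYear: 1958, visitDate: '2026-01-06', diagnosis: 'I10' },
];

// An attacker who knows a target's zip code and birth year (voter rolls,
// social media) filters the "anonymous" data down to a single row.
function reidentify(rows: PatientRow[], zip: string, birthYear: number): PatientRow[] {
  return rows.filter(r => r.zip === zip && r.birthYear === birthYear);
}

const matches = reidentify(anonymized, '02139', 1958);
console.log(matches.length);       // 1: a unique match
console.log(matches[0].diagnosis); // the target's diagnosis, recovered
```

With just two quasi-identifiers, one of three records is already unique; real datasets with dozens of fields are far easier to pin down.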
The Four Technical Approaches
Approach 1: On-Premises Deployment
Deploy the AI model on infrastructure you own and operate. Data never leaves your network perimeter.
What this actually gives you:
- Data stays inside your network
- No third-party access, contractual or otherwise
- Works with any open-source model
What it costs:
- GPU hardware is expensive. A single A100 costs $10,000–$30,000. A useful inference cluster for production workloads starts at $100K+.
- You own maintenance, driver updates, CUDA compatibility, and hardware failure
- You still can't produce cryptographic proof that a specific computation ran correctly. That kind of attestation-based audit evidence is valuable for audit trails and isn't available on bare metal
- Scaling requires buying more hardware
Best fit: Organizations with existing data center infrastructure, GPU hardware already amortized, and workloads with steady, predictable volume.
Approach 2: Enterprise API Agreements
The major AI providers — OpenAI, Anthropic, Google, Azure — offer enterprise tiers with contractual privacy terms: no training on inputs, no data retention, dedicated infrastructure, BAAs for HIPAA.
What this actually gives you:
- Access to frontier model quality (GPT-4o, Claude 3.5, Gemini)
- Legal (contractual) guarantee that data won't be used for training
- Provider's existing compliance certifications (SOC 2, ISO 27001)
What it doesn't give you:
- Technical verifiability. The provider agrees not to look at your data. You trust they comply. A misconfigured logging pipeline, an insider threat, or a government subpoena can still expose your data. You have no cryptographic way to verify the promise was kept.
- HIPAA's technical safeguards are harder to satisfy with contractual-only controls. Covered entities have been cited for relying solely on BAAs without implementing access controls and audit logs.
- GDPR data transfer restrictions. If you're processing EU residents' data on US infrastructure, the legal basis matters even with a contractual DPA.
Best fit: Large enterprises where legal and procurement teams are comfortable with contractual risk mitigation, and where the use case doesn't involve the most sensitive data categories.
Approach 3: Confidential AI Inference in Trusted Execution Environments
A Trusted Execution Environment (TEE) is a hardware-isolated enclave where computation runs in memory that is encrypted at the silicon level. The host operating system cannot read it. The hypervisor cannot read it. The cloud provider — even with root access to the physical machine — cannot read it.
This changes the trust model from contractual to cryptographic.
How TEE-based AI inference works:
- The AI model is packaged as a Docker container and deployed into a hardware enclave (e.g., AWS Nitro Enclave, Azure Confidential VM, NVIDIA H100 Confidential Mode)
- Your data is sent to the enclave over a mutually authenticated TLS channel
- The data is decrypted only inside the hardware boundary — in memory that the CPU encrypts automatically with keys held in silicon
- The model runs inference inside the enclave
- The output is encrypted and returned to you
- The enclave produces an attestation document: a hardware-signed cryptographic proof of exactly what code ran, that the environment was unmodified, and that data was processed in isolation
What the attestation document proves:
The attestation includes Platform Configuration Registers (PCRs) — hashes of the enclave image, kernel, and application. These are signed by the CPU manufacturer's root of trust (Intel, AMD, AWS Nitro). An auditor, regulator, or counterparty can independently verify:
- The exact Docker image that processed the data
- That no modifications were made to the enclave at runtime
- The timestamp and nonce of the attestation (proving it's not replayed)
- That the hardware isolation was active during computation
This is audit evidence that satisfies HIPAA's Technical Safeguards (§164.312), GDPR Article 25 data protection by design, and SOC 2 CC6.1 controls around logical access.
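As a sketch of what "independently verify" means, here is simplified TypeScript for the core checks. The `AttestationDoc` shape is hypothetical, and a real verifier must also parse the COSE-signed CBOR document and validate the certificate chain up to the hardware vendor's root of trust; this shows only the PCR, nonce, and freshness comparisons.

```typescript
import { timingSafeEqual } from 'node:crypto';

// Hypothetical, simplified shape of a parsed attestation document.
// A real verifier first parses the COSE-signed CBOR payload and checks
// the signature chain against the hardware vendor's root certificate.
interface AttestationDoc {
  pcrs: { [index: string]: string }; // hex-encoded PCR measurements
  nonce: string;                     // echoed from the verification request
  timestamp: number;                 // ms since epoch, signed into the doc
}

function checkAttestation(
  doc: AttestationDoc,
  expectedPcr0: string,  // hash of the enclave image you built
  expectedNonce: string, // the nonce you sent with this request
  maxAgeMs: number,
): boolean {
  const measured = Buffer.from(doc.pcrs['0'], 'hex');
  const expected = Buffer.from(expectedPcr0, 'hex');
  // Constant-time compare of the measured image hash against your build
  const pcrOk = measured.length === expected.length && timingSafeEqual(measured, expected);
  // The nonce binds the document to this request; the timestamp bounds replay
  const nonceOk = doc.nonce === expectedNonce;
  const freshOk = Date.now() - doc.timestamp < maxAgeMs;
  return pcrOk && nonceOk && freshOk;
}
```

An auditor holding your build's expected PCR0 can run this check without trusting you or the cloud provider — only the CPU vendor's signing key.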
What it costs:
- Slightly higher latency for the first request (enclave initialization + TLS setup). Subsequent requests run near-native speed.
- You're constrained to open-source or custom models that fit in the enclave's memory allocation. Large 70B models require more memory headroom; quantized 7B–13B models run comfortably on standard configurations.
- Requires a platform that manages the enclave lifecycle (Treza handles this so you don't deploy to raw Nitro).
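To reason about whether a given model fits, a back-of-envelope estimate helps: weight memory is roughly parameter count times bytes per parameter, plus headroom for the KV cache and runtime. A sketch with rough, assumed figures (not Treza-specific limits):

```typescript
// Rough weight-memory estimate: params x bytes per parameter, in GiB.
// Ignores KV cache, activations, and runtime overhead, which need headroom.
function weightMemoryGiB(paramCount: number, bytesPerParam: number): number {
  return (paramCount * bytesPerParam) / 1024 ** 3;
}

// A 7B model at 4-bit quantization (~0.5 bytes/param) needs ~3.3 GiB of
// weights and fits an 8 GiB enclave allocation with headroom.
console.log(weightMemoryGiB(7e9, 0.5).toFixed(1)); // "3.3"

// A 70B model at fp16 (2 bytes/param) needs ~130 GiB of weights alone.
console.log(weightMemoryGiB(70e9, 2).toFixed(1)); // "130.4"
```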
Best fit: Healthcare AI on PHI, financial AI on proprietary trading signals or customer data, legal document analysis, any use case where you need both model quality and verifiable privacy guarantees.
Approach 4: Federated Learning
Instead of sending data to the model, train the model on data where it lives. Each node trains on its local data and sends only gradient updates to a central aggregator. The raw data never moves.
This works well for training and fine-tuning. It doesn't apply to inference — if you want to run a query against live patient records, the model still has to see the data at query time. Federated learning is a training-time privacy technique, not an inference-time one.
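The aggregation step can be sketched in a few lines (toy numbers, one round of federated averaging; production systems layer secure aggregation and differential privacy on top):

```typescript
// One round of federated averaging (FedAvg): the server combines locally
// computed updates, weighted by each site's sample count. Raw records
// never leave the sites; only the numeric deltas travel.
interface SiteUpdate {
  delta: number[]; // locally computed weight update
  n: number;       // number of local samples behind it
}

function fedAvg(updates: SiteUpdate[]): number[] {
  const total = updates.reduce((sum, u) => sum + u.n, 0);
  const merged = new Array(updates[0].delta.length).fill(0);
  for (const u of updates) {
    for (let i = 0; i < merged.length; i++) {
      merged[i] += (u.n / total) * u.delta[i];
    }
  }
  return merged;
}

// Three hospitals contribute updates; the 300-sample site counts 3x.
const merged = fedAvg([
  { delta: [0.2, -0.1], n: 100 },
  { delta: [0.4, 0.0], n: 300 },
  { delta: [0.1, 0.2], n: 100 },
]);
console.log(merged); // weighted average of the three deltas
```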
Implementing Confidential AI Inference with Treza
Here's what putting this into production actually looks like with the Treza SDK.
Step 1: Deploy Your Model into an Enclave
Package any open-source model as a Docker container. Treza wraps the container in a Nitro Enclave automatically.
```typescript
import { TrezaClient } from '@treza/sdk';

const treza = new TrezaClient({
  baseUrl: 'https://app.trezalabs.com',
  apiKey: process.env.TREZA_API_KEY,
});

const enclave = await treza.createEnclave({
  name: 'hipaa-inference',
  region: 'us-east-1',
  walletAddress: process.env.WALLET_ADDRESS,
  providerId: 'aws-nitro',
  providerConfig: {
    // Any Docker image — Llama 3, Mistral, your fine-tuned model
    dockerImage: 'myorg/clinical-llm:v2.1',
    cpuCount: 4,
    memoryMiB: 8192,
  },
});

console.log('Enclave ID:', enclave.id);
console.log('Status:', enclave.status); // 'running'
```

Step 2: Verify the Enclave Before Sending Data
Before sending any sensitive data, cryptographically verify the enclave is running exactly your model image — not a tampered version.
```typescript
const verification = await treza.verifyAttestation(enclave.id, {
  // Nonce binds this attestation to this specific verification request
  // and prevents replay attacks
  nonce: `audit-${Date.now()}-${crypto.randomUUID()}`,
});

if (!verification.isValid) {
  throw new Error(`Enclave verification failed: ${verification.error}`);
}

// These PCR values are hardware-signed hashes of your Docker image.
// They match the values you get from `nitro-cli describe-enclaves`
console.log('PCR0 (image hash):', verification.pcrs['0']);
console.log('PCR1 (kernel hash):', verification.pcrs['1']);

// Compliance checks are derived from the attestation evidence
console.log('HIPAA compliant:', verification.complianceChecks.hipaa);
console.log('SOC2 compliant:', verification.complianceChecks.soc2);
console.log('GDPR compliant:', verification.complianceChecks.gdpr);
```

Store the attestation document. This is your audit evidence — the cryptographic proof that sensitive data was processed in a compliant enclave at a specific timestamp.
Step 3: Send Sensitive Data to the Enclave
With attestation verified, send your data. The enclave's TLS channel ensures data is encrypted in transit and only decrypted inside the hardware boundary.
```typescript
// Example: HIPAA use case — summarize patient visit notes
const response = await treza.callEnclave(enclave.id, {
  endpoint: '/v1/summarize',
  method: 'POST',
  body: {
    // PHI stays encrypted until it's inside the enclave boundary
    text: patientVisitNotes,
    task: 'clinical_summary',
    maxTokens: 512,
  },
});

const summary = response.data.summary;
// PHI never left the enclave in plaintext.
// The cloud provider saw only encrypted bytes.
```

Step 4: Generate Compliance Reports
For audits, you can retrieve the full attestation history for an enclave — every computation backed by a hardware-signed proof.
```typescript
const report = await treza.getComplianceReport(enclave.id, {
  startDate: '2026-01-01',
  endDate: '2026-03-31',
  format: 'pdf', // or 'json' for programmatic use
});

// The report maps directly to:
// - HIPAA 45 CFR §164.312(a)(1) Access Control
// - HIPAA 45 CFR §164.312(b) Audit Controls
// - SOC 2 CC6.1 Logical and Physical Access Controls
// - GDPR Article 25 Data Protection by Design
```

Compliance Mapping
Here's how TEE-based inference maps to the most common regulatory requirements:
| Regulation | Requirement | How TEE Attestation Satisfies It |
|---|---|---|
| HIPAA | Technical safeguards for PHI (§164.312) | Hardware isolation prevents unauthorized access. Attestation proves the correct model ran. Audit log is cryptographically signed. |
| GDPR | Data protection by design (Art. 25) | Encryption in memory is a technical measure built into the processing architecture, not bolted on after. Attestation documents satisfy Art. 5(2) accountability. |
| SOC 2 Type II | CC6.1 Logical access controls | Access to data in the enclave is physically impossible without the hardware keys. PCR measurements prove software integrity. |
| MiCA / DORA | ICT risk management for financial data | Cryptographic proof of execution satisfies operational resilience and data integrity requirements. |
| EU AI Act | Art. 10: Data governance for high-risk AI | Attestation documents provide auditable evidence of data handling practices for high-risk AI systems processing personal data. |
Practical Decision Framework
Use this to decide which approach is right for your use case:
Use on-premises deployment if:
- You have existing GPU infrastructure
- Your workloads are steady and predictable
- Regulatory requirements mandate fully air-gapped processing
- You don't need a cryptographic audit trail
Use enterprise API agreements if:
- You need frontier model quality (GPT-4o, Claude 3.5)
- Your legal team is comfortable with contractual controls
- Your use case doesn't involve the most sensitive data categories (e.g., PHI or classified information)
- You're at a large organization with established vendor management
Use TEE-based inference (Treza) if:
- You process PHI, PII, financial, or legally privileged data
- Auditors or regulators require technical controls beyond contractual guarantees
- You need a verifiable audit trail for compliance evidence
- You want the capability to use any model — including proprietary fine-tunes — without exposing it to the cloud provider
Or combine approaches:
- Use TEE inference for sensitive production workloads
- Use enterprise APIs for tasks that don't touch regulated data
- Use local models for development and prototyping
The Bigger Shift
Regulatory pressure on AI data processing is not easing. The EU AI Act is in force. FTC guidance on AI and consumer data is tightening. HIPAA enforcement around AI-assisted clinical tools is becoming more specific. These aren't future risks — they're present ones.
The organizations moving fastest aren't waiting for the regulation to fully crystallize before building compliant infrastructure. They're adopting confidential computing now, building muscle around attestation-based audit trails, and deploying AI on sensitive data in ways that produce evidence regulators can actually verify.
The 97% statistic — AI deployments blocked by "no sensitive data in LLMs" policies — represents a backlog of business value waiting to be unlocked. The technical tools to unlock it now exist. Hardware TEEs are generally available across AWS, Azure, and GCP. The performance overhead is minimal. The compliance mapping is clear.
The question is no longer whether you can run AI on sensitive data. It's whether you have the infrastructure to do it in a way that satisfies the people who are paid to say no.
Further Reading
- Private AI: How to Run Models Without Exposing Your Data — Full comparison of local models, enterprise APIs, and TEE inference
- What Is a TEE? Complete Guide — How hardware-isolated enclaves work
- HIPAA Compliance with Secure Enclaves — HIPAA-specific implementation guide
- MPC vs TEE vs FHE — Comparing privacy-preserving computation approaches
- Treza SDK on GitHub
- AWS Nitro Enclaves
- Azure Confidential Computing
- EU AI Act Full Text
Treza is a confidential compute platform that makes compliance automatic. Deploy any workload into a hardware-protected enclave — data encrypted in memory, inaccessible to the cloud operator, every execution backed by cryptographic proof. Get started.