How to Run AI on Sensitive Data Without Violating HIPAA or GDPR
Surveys of enterprise technology leaders put the share of AI deployments blocked by 'no sensitive data in LLMs' policies at 97%. This guide covers exactly how to unlock AI on PII, PHI, and financial data using hardware-isolated enclaves — with cryptographic proof your data never left the boundary.

Your AI pilot worked. The model surfaced insights in minutes that used to take analysts days. The business case was obvious. Then legal reviewed it and the answer was no.
"We can't send patient records to OpenAI." "Customer transaction data can't leave our perimeter." "This violates our GDPR data processing obligations."
This isn't a rare edge case. According to surveys of enterprise technology leaders, 97% of AI deployments in regulated industries hit a wall the moment sensitive data is in scope. The same capabilities that make AI transformative — vast context windows, cross-document reasoning, real-time pattern matching — require the model to actually see the data. And seeing the data means sending it somewhere you can't fully control.
The traditional response is to mask, redact, or anonymize data before sending it to an AI. This works for some tasks. For the ones that matter most — detecting fraud in transaction flows, surfacing risk in patient records, catching anomalies in financial filings — removing the sensitive data removes the signal.
This guide covers the actual solutions: what they are technically, what each one does and doesn't guarantee, and how to implement them in production.
Why "Just Anonymize It" Doesn't Always Work
The conventional advice is to strip PII before sending data to an AI model. It sounds simple. In practice it has real limits:
Re-identification risk. Research consistently shows that even heavily anonymized datasets can be re-identified. An AI model operating on "anonymized" records can correlate age, zip code, diagnosis codes, and visit timestamps to infer identity with high probability. For medical or financial data, this is not a theoretical risk — regulators have brought enforcement actions over it.
Context destruction. The fields you remove are often the ones the model needs. If you're asking an AI to detect fraudulent transactions, removing the account holder's name, address, and behavioral history makes the problem unsolvable. If you're asking it to flag medication interactions, removing the patient's age and prior diagnoses reduces accuracy to noise.
Compliance doesn't end at the input. GDPR Article 25 (Data Protection by Design), HIPAA's Technical Safeguards, and the EU AI Act all require that you demonstrate ongoing protection of personal data throughout its lifecycle — not just at the point it enters a system. Stripping data before inference doesn't address what happens to intermediate computations, logs, or outputs.
Outputs can contain PII. Language models are generative. They can reproduce or extrapolate personal information from context even when the input was nominally anonymized. Controlling the input doesn't control the output.
None of this means anonymization is useless. It means it's not sufficient as the only control — and it actively destroys value in the use cases where AI provides the most leverage.
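The re-identification point is easy to demonstrate. Here is a toy sketch in TypeScript (fabricated records and a hypothetical `reidentify` helper, purely illustrative): once names are stripped, the remaining quasi-identifiers can still isolate a single person.

```typescript
// Toy "anonymized" dataset: names removed, quasi-identifiers intact.
// All records are fabricated for illustration.
interface PatientRow {
  zip: string;
  birthYear: number;
  visitDate: string;
  diagnosis: string;
}

const anonymized: PatientRow[] = [
  { zip: '02139', birthYear: 1958, visitDate: '2026-01-04', diagnosis: 'E11.9' },
  { zip: '02139', birthYear: 1991, visitDate: '2026-01-04', diagnosis: 'J45.909' },
  { zip: '60601', birthYear: 1958, visitDate: '2026-01-06', diagnosis: 'I10' },
];

// An attacker who knows a target's zip code and birth year (voter rolls,
// social media) filters the "anonymous" data down to a single row.
function reidentify(rows: PatientRow[], zip: string, birthYear: number): PatientRow[] {
  return rows.filter(r => r.zip === zip && r.birthYear === birthYear);
}

const matches = reidentify(anonymized, '02139', 1958);
console.log(matches.length);       // 1: a unique match
console.log(matches[0].diagnosis); // the target's diagnosis, recovered
```

With just two quasi-identifiers, one of three records is already unique; real datasets with dozens of fields are far easier to pin down.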
The Four Technical Approaches
Approach 1: On-Premises Deployment
Deploy the AI model on infrastructure you own and operate. Data never leaves your network perimeter.
What this actually gives you:
- Data stays inside your network
- No third-party access, contractual or otherwise
- Works with any open-source model
What it costs:
- GPU hardware is expensive. A single A100 costs $10,000–$30,000. A useful inference cluster for production workloads starts at $100K+.
- You own maintenance, driver updates, CUDA compatibility, and hardware failure
- You still can't produce cryptographic proof that a specific computation ran correctly. That kind of attestation-based audit evidence is valuable for audit trails and isn't available on bare metal
- Scaling requires buying more hardware
Best fit: Organizations with existing data center infrastructure, GPU hardware already amortized, and workloads with steady, predictable volume.
Approach 2: Enterprise API Agreements
The major AI providers — OpenAI, Anthropic, Google, Azure — offer enterprise tiers with contractual privacy terms: no training on inputs, no data retention, dedicated infrastructure, BAAs for HIPAA.
What this actually gives you:
- Access to frontier model quality (GPT-4o, Claude 3.5, Gemini)
- Legal (contractual) guarantee that data won't be used for training
- Provider's existing compliance certifications (SOC 2, ISO 27001)
What it doesn't give you:
- Technical verifiability. The provider agrees not to look at your data. You trust they comply. A misconfigured logging pipeline, an insider threat, or a government subpoena can still expose your data. You have no cryptographic way to verify the promise was kept.
- HIPAA's technical safeguards are harder to satisfy with contractual-only controls. Covered entities have been cited for relying solely on BAAs without implementing access controls and audit logs.
- GDPR data transfer restrictions. If you're processing EU residents' data on US infrastructure, the legal basis matters even with a contractual DPA.
Best fit: Large enterprises where legal and procurement teams are comfortable with contractual risk mitigation, and where the use case doesn't involve the most sensitive data categories.
Approach 3: Confidential AI Inference in Trusted Execution Environments
A Trusted Execution Environment (TEE) is a hardware-isolated enclave where computation runs in memory that is encrypted at the silicon level. The host operating system cannot read it. The hypervisor cannot read it. The cloud provider — even with root access to the physical machine — cannot read it.
This changes the trust model from contractual to cryptographic.
How TEE-based AI inference works:
- The AI model is packaged as a Docker container and deployed into a hardware enclave (e.g., AWS Nitro Enclave, Azure Confidential VM, NVIDIA H100 Confidential Mode)
- Your data is sent to the enclave over a mutually authenticated TLS channel
- The data is decrypted only inside the hardware boundary — in memory that the CPU encrypts automatically with keys held in silicon
- The model runs inference inside the enclave
- The output is encrypted and returned to you
- The enclave produces an attestation document: a hardware-signed cryptographic proof of exactly what code ran, that the environment was unmodified, and that data was processed in isolation
What the attestation document proves:
The attestation includes Platform Configuration Registers (PCRs) — hashes of the enclave image, kernel, and application. These are signed by the CPU manufacturer's root of trust (Intel, AMD, AWS Nitro). An auditor, regulator, or counterparty can independently verify:
- The exact Docker image that processed the data
- That no modifications were made to the enclave at runtime
- The timestamp and nonce of the attestation (proving it's not replayed)
- That the hardware isolation was active during computation
This is audit evidence that satisfies HIPAA's Technical Safeguards (§164.312), GDPR Article 25 data protection by design, and SOC 2 CC6.1 controls around logical access.
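As a sketch of what "independently verify" means, here is simplified TypeScript for the core checks. The `AttestationDoc` shape is hypothetical, and a real verifier must also parse the COSE-signed CBOR document and validate the certificate chain up to the hardware vendor's root of trust; this shows only the PCR, nonce, and freshness comparisons.

```typescript
import { timingSafeEqual } from 'node:crypto';

// Hypothetical, simplified shape of a parsed attestation document.
// A real verifier first parses the COSE-signed CBOR payload and checks
// the signature chain against the hardware vendor's root certificate.
interface AttestationDoc {
  pcrs: { [index: string]: string }; // hex-encoded PCR measurements
  nonce: string;                     // echoed from the verification request
  timestamp: number;                 // ms since epoch, signed into the doc
}

function checkAttestation(
  doc: AttestationDoc,
  expectedPcr0: string,  // hash of the enclave image you built
  expectedNonce: string, // the nonce you sent with this request
  maxAgeMs: number,
): boolean {
  const measured = Buffer.from(doc.pcrs['0'], 'hex');
  const expected = Buffer.from(expectedPcr0, 'hex');
  // Constant-time compare of the measured image hash against your build
  const pcrOk = measured.length === expected.length && timingSafeEqual(measured, expected);
  // The nonce binds the document to this request; the timestamp bounds replay
  const nonceOk = doc.nonce === expectedNonce;
  const freshOk = Date.now() - doc.timestamp < maxAgeMs;
  return pcrOk && nonceOk && freshOk;
}
```

An auditor holding your build's expected PCR0 can run this check without trusting you or the cloud provider — only the CPU vendor's signing key.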
What it costs:
- Slightly higher latency for the first request (enclave initialization + TLS setup). Subsequent requests run near-native speed.
- You're constrained to open-source or custom models that fit in the enclave's memory allocation. Large 70B models require more memory headroom; quantized 7B–13B models run comfortably on standard configurations.
- Requires a platform that manages the enclave lifecycle (Treza handles this so you don't deploy to raw Nitro).
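To reason about whether a given model fits, a back-of-envelope estimate helps: weight memory is roughly parameter count times bytes per parameter, plus headroom for the KV cache and runtime. A sketch with rough, assumed figures (not Treza-specific limits):

```typescript
// Rough weight-memory estimate: params x bytes per parameter, in GiB.
// Ignores KV cache, activations, and runtime overhead, which need headroom.
function weightMemoryGiB(paramCount: number, bytesPerParam: number): number {
  return (paramCount * bytesPerParam) / 1024 ** 3;
}

// A 7B model at 4-bit quantization (~0.5 bytes/param) needs ~3.3 GiB of
// weights and fits an 8 GiB enclave allocation with headroom.
console.log(weightMemoryGiB(7e9, 0.5).toFixed(1)); // "3.3"

// A 70B model at fp16 (2 bytes/param) needs ~130 GiB of weights alone.
console.log(weightMemoryGiB(70e9, 2).toFixed(1)); // "130.4"
```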
Best fit: Healthcare AI on PHI, financial AI on proprietary trading signals or customer data, legal document analysis, any use case where you need both model quality and verifiable privacy guarantees.
Approach 4: Federated Learning
Instead of sending data to the model, train the model on data where it lives. Each node trains on its local data and sends only gradient updates to a central aggregator. The raw data never moves.
This works well for training and fine-tuning. It doesn't apply to inference — if you want to run a query against live patient records, the model still has to see the data at query time. Federated learning is a training-time privacy technique, not an inference-time one.
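The aggregation step can be sketched in a few lines (toy numbers, one round of federated averaging; production systems layer secure aggregation and differential privacy on top):

```typescript
// One round of federated averaging (FedAvg): the server combines locally
// computed updates, weighted by each site's sample count. Raw records
// never leave the sites; only the numeric deltas travel.
interface SiteUpdate {
  delta: number[]; // locally computed weight update
  n: number;       // number of local samples behind it
}

function fedAvg(updates: SiteUpdate[]): number[] {
  const total = updates.reduce((sum, u) => sum + u.n, 0);
  const merged = new Array(updates[0].delta.length).fill(0);
  for (const u of updates) {
    for (let i = 0; i < merged.length; i++) {
      merged[i] += (u.n / total) * u.delta[i];
    }
  }
  return merged;
}

// Three hospitals contribute updates; the 300-sample site counts 3x.
const merged = fedAvg([
  { delta: [0.2, -0.1], n: 100 },
  { delta: [0.4, 0.0], n: 300 },
  { delta: [0.1, 0.2], n: 100 },
]);
console.log(merged); // weighted average of the three deltas
```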
Implementing Confidential AI Inference with Treza
Here's what putting this into production actually looks like with the Treza SDK.
Step 1: Deploy Your Model into an Enclave
Package any open-source model as a Docker container. Treza wraps the container in a Nitro Enclave automatically.
```typescript
import { TrezaClient } from '@treza/sdk';

const treza = new TrezaClient({
  baseUrl: 'https://app.trezalabs.com',
  apiKey: process.env.TREZA_API_KEY,
});

const enclave = await treza.createEnclave({
  name: 'hipaa-inference',
  region: 'us-east-1',
  walletAddress: process.env.WALLET_ADDRESS,
  providerId: 'aws-nitro',
  providerConfig: {
    // Any Docker image — Llama 3, Mistral, your fine-tuned model
    dockerImage: 'myorg/clinical-llm:v2.1',
    cpuCount: 4,
    memoryMiB: 8192,
  },
});

console.log('Enclave ID:', enclave.id);
console.log('Status:', enclave.status); // 'running'
```

Step 2: Verify the Enclave Before Sending Data
Before sending any sensitive data, cryptographically verify the enclave is running exactly your model image — not a tampered version.
```typescript
const verification = await treza.verifyAttestation(enclave.id, {
  // Nonce binds this attestation to this specific verification request
  // and prevents replay attacks
  nonce: `audit-${Date.now()}-${crypto.randomUUID()}`,
});

if (!verification.isValid) {
  throw new Error(`Enclave verification failed: ${verification.error}`);
}

// These PCR values are hardware-signed hashes of your Docker image.
// They match the values you get from `nitro-cli describe-enclaves`
console.log('PCR0 (image hash):', verification.pcrs['0']);
console.log('PCR1 (kernel hash):', verification.pcrs['1']);

// Compliance checks are derived from the attestation evidence
console.log('HIPAA compliant:', verification.complianceChecks.hipaa);
console.log('SOC2 compliant:', verification.complianceChecks.soc2);
console.log('GDPR compliant:', verification.complianceChecks.gdpr);
```

Store the attestation document. This is your audit evidence — the cryptographic proof that sensitive data was processed in a compliant enclave at a specific timestamp.
Step 3: Send Sensitive Data to the Enclave
With attestation verified, send your data. The enclave's TLS channel ensures data is encrypted in transit and only decrypted inside the hardware boundary.
```typescript
// Example: HIPAA use case — summarize patient visit notes
const response = await treza.callEnclave(enclave.id, {
  endpoint: '/v1/summarize',
  method: 'POST',
  body: {
    // PHI stays encrypted until it's inside the enclave boundary
    text: patientVisitNotes,
    task: 'clinical_summary',
    maxTokens: 512,
  },
});

const summary = response.data.summary;
// PHI never left the enclave in plaintext.
// The cloud provider saw only encrypted bytes.
```

Step 4: Generate Compliance Reports
For audits, you can retrieve the full attestation history for an enclave — every computation backed by a hardware-signed proof.
```typescript
const report = await treza.getComplianceReport(enclave.id, {
  startDate: '2026-01-01',
  endDate: '2026-03-31',
  format: 'pdf', // or 'json' for programmatic use
});

// The report maps directly to:
// - HIPAA 45 CFR §164.312(a)(1) Access Control
// - HIPAA 45 CFR §164.312(b) Audit Controls
// - SOC 2 CC6.1 Logical and Physical Access Controls
// - GDPR Article 25 Data Protection by Design
```

Compliance Mapping
Here's how TEE-based inference maps to the most common regulatory requirements:
| Regulation | Requirement | How TEE Attestation Satisfies It |
|---|---|---|
| HIPAA | Technical safeguards for PHI (§164.312) | Hardware isolation prevents unauthorized access. Attestation proves the correct model ran. Audit log is cryptographically signed. |
| GDPR | Data protection by design (Art. 25) | Encryption in memory is a technical measure built into the processing architecture, not bolted on after. Attestation documents satisfy Art. 5(2) accountability. |
| SOC 2 Type II | CC6.1 Logical access controls | Access to data in the enclave is physically impossible without the hardware keys. PCR measurements prove software integrity. |
| MiCA / DORA | ICT risk management for financial data | Cryptographic proof of execution satisfies operational resilience and data integrity requirements. |
| EU AI Act | Art. 10: Data governance for high-risk AI | Attestation documents provide auditable evidence of data handling practices for high-risk AI systems processing personal data. |
Practical Decision Framework
Use this to decide which approach is right for your use case:
Use on-premises deployment if:
- You have existing GPU infrastructure
- Your workloads are steady and predictable
- Regulatory requirements mandate fully air-gapped processing
- You don't need a cryptographic audit trail
Use enterprise API agreements if:
- You need frontier model quality (GPT-4o, Claude 3.5)
- Your legal team is comfortable with contractual controls
- Your use case doesn't involve the most sensitive data categories (e.g., PHI or classified information)
- You're at a large organization with established vendor management
Use TEE-based inference (Treza) if:
- You process PHI, PII, financial, or legally privileged data
- Auditors or regulators require technical controls beyond contractual guarantees
- You need a verifiable audit trail for compliance evidence
- You want the capability to use any model — including proprietary fine-tunes — without exposing it to the cloud provider
Or combine approaches:
- Use TEE inference for sensitive production workloads
- Use enterprise APIs for tasks that don't touch regulated data
- Use local models for development and prototyping
The Bigger Shift
Regulatory pressure on AI data processing is not easing. The EU AI Act is in force. FTC guidance on AI and consumer data is tightening. HIPAA enforcement around AI-assisted clinical tools is becoming more specific. These aren't future risks — they're present ones.
The organizations moving fastest aren't waiting for the regulation to fully crystallize before building compliant infrastructure. They're adopting confidential computing now, building muscle around attestation-based audit trails, and deploying AI on sensitive data in ways that produce evidence regulators can actually verify.
The 97% statistic — AI deployments blocked by "no sensitive data in LLMs" policies — represents a backlog of business value waiting to be unlocked. The technical tools to unlock it now exist. Hardware TEEs are generally available across AWS, Azure, and GCP. The performance overhead is minimal. The compliance mapping is clear.
The question is no longer whether you can run AI on sensitive data. It's whether you have the infrastructure to do it in a way that satisfies the people who are paid to say no.
Further Reading
- Private AI: How to Run Models Without Exposing Your Data — Full comparison of local models, enterprise APIs, and TEE inference
- What Is a TEE? Complete Guide — How hardware-isolated enclaves work
- HIPAA Compliance with Secure Enclaves — HIPAA-specific implementation guide
- MPC vs TEE vs FHE — Comparing privacy-preserving computation approaches
- Treza SDK on GitHub
- AWS Nitro Enclaves
- Azure Confidential Computing
- EU AI Act Full Text
Treza is a confidential compute platform that makes compliance automatic. Deploy any workload into a hardware-protected enclave — data encrypted in memory, inaccessible to the cloud operator, every execution backed by cryptographic proof. Get started.