OpenAI Releases Privacy Filter: 1.5B-Parameter On-Premises PII Detection Model with 128K Context

TL;DR

OpenAI has released Privacy Filter, a 1.5B-parameter bidirectional token classification model that detects and masks personally identifiable information in text. The model processes sequences of up to 128,000 tokens in a single pass and is available under the Apache 2.0 license for on-premises deployment.

April 22, 2026 · 5:07 PM · 2 min read

OpenAI Privacy Filter — Quick Specs

Context window: 128K tokens

OpenAI has released Privacy Filter, a 1.5B-parameter bidirectional token classification model that detects and masks personally identifiable information (PII) in text. The model is available under the Apache 2.0 license and can run on-premises, including in web browsers.

Technical Architecture

Privacy Filter has 1.5 billion total parameters, of which 50 million are active per token thanks to a sparse mixture-of-experts architecture. According to OpenAI, the model was first pretrained autoregressively, in a manner similar to GPT-OSS, and then converted into a bidirectional token classifier through supervised classification training.
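The gap between total and active parameters comes from routing: each token is sent to only a few experts. The sketch below illustrates generic top-k routing using the 128-expert, top-4 figures from the architecture list. The router scores are made up, and normalizing weights with a softmax over the selected experts is an assumption, not a confirmed detail of Privacy Filter.

```python
import math

NUM_EXPERTS = 128  # experts per feed-forward block (figure from the article)
TOP_K = 4          # experts activated per token (figure from the article)


def route_token(router_logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only. This is one common
    convention; whether Privacy Filter normalizes this way is an assumption.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    peak = max(router_logits[i] for i in top)  # subtract for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical router scores for one token.
logits = [float((i * 37) % NUM_EXPERTS) for i in range(NUM_EXPERTS)]
chosen = route_token(logits)
print(len(chosen), "of", NUM_EXPERTS, "experts run for this token")
```

Only the chosen experts' feed-forward weights are evaluated for that token, which is why the active parameter count is far smaller than the total.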

The architecture includes:

8 transformer blocks with grouped-query attention (14 query heads, 2 KV heads)
Sparse mixture-of-experts feed-forward blocks (128 experts total, top-4 routing)
Bidirectional banded attention with band size 128 (effective 257-token attention window)
640-dimensional residual stream width
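The 257-token figure follows from the band size: each token sees 128 positions to its left, 128 to its right, and itself. A minimal sketch of that rule, assuming the band is symmetric and simply clipped at the sequence edges (function names here are illustrative):

```python
BAND = 128  # band size from the article


def can_attend(query_pos, key_pos, band=BAND):
    """True when key_pos lies within `band` tokens on either side of query_pos."""
    return abs(query_pos - key_pos) <= band


def visible_positions(query_pos, seq_len, band=BAND):
    """Inclusive (first, last) key positions visible to query_pos, clipped to the sequence."""
    return max(0, query_pos - band), min(seq_len - 1, query_pos + band)


first, last = visible_positions(query_pos=1000, seq_len=128_000)
print(last - first + 1)  # 257 = 128 left + 128 right + the token itself
```

Because every token attends to a fixed-size window, attention cost per layer grows linearly with sequence length rather than quadratically.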

Unlike autoregressive models that generate text token-by-token, Privacy Filter labels entire input sequences in one forward pass, then decodes coherent spans using a constrained Viterbi procedure.
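To illustrate why constrained decoding matters, here is a minimal sketch of Viterbi decoding over BIOES tags with hard legality rules only. It is not OpenAI's decoder: the label names and scores are hypothetical, and no tunable transition biases are modeled.

```python
NEG_INF = float("-inf")


def allowed(prev, nxt):
    """BIOES legality: an open span must continue (I) or close (E) in the same category."""
    p_tag, _, p_cat = prev.partition("-")
    n_tag, _, n_cat = nxt.partition("-")
    if p_tag in ("B", "I"):
        return n_tag in ("I", "E") and n_cat == p_cat
    return n_tag in ("O", "B", "S")  # after O, E, or S a new span may start, or not


def viterbi(scores, labels):
    """scores: one dict per token mapping label -> log-score. Returns the best legal path."""
    # A sequence may not start in the middle of a span.
    best = {l: (scores[0][l] if l[0] in "OBS" else NEG_INF) for l in labels}
    back = []
    for tok in scores[1:]:
        step, ptr = {}, {}
        for nxt in labels:
            top, arg = max((best[p], p) for p in labels if allowed(p, nxt))
            step[nxt], ptr[nxt] = top + tok[nxt], arg
        best = step
        back.append(ptr)
    # A sequence may not end with a span still open.
    last = max((l for l in labels if l[0] in "OES"), key=lambda l: best[l])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


LABELS = ["O", "B-email", "I-email", "E-email", "S-email"]


def row(**high):
    """Hypothetical per-token scores: -5.0 unless a tag letter is given a higher value."""
    return {l: high.get(l[0], -5.0) for l in LABELS}


# Per-token argmax would give O, B, I, I, O, which never closes the span.
scores = [row(O=0.0), row(B=0.0), row(I=0.0), row(I=0.0, E=-1.0), row(O=0.0)]
print(viterbi(scores, LABELS))  # ['O', 'B-email', 'I-email', 'E-email', 'O']
```

The decoder accepts a slightly lower score on the fourth token (E instead of I) because that is the best path that forms a well-formed span.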

Detection Capabilities

The model detects 8 privacy categories:

Account numbers
Private addresses
Private emails
Private persons (names)
Private phone numbers
Private URLs
Private dates
Secrets

For token-level classification, each category expands into four BIOES boundary tags (Begin, Inside, End, Single), and all categories share a single Outside tag, producing 8 × 4 + 1 = 33 output classes per token.
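The class count can be checked by enumerating the label set. The category identifiers below are hypothetical stand-ins for the eight categories listed above, not the model's actual label strings:

```python
# Hypothetical identifiers for the eight categories named in the article.
CATEGORIES = [
    "account_number", "private_address", "private_email", "private_person",
    "private_phone", "private_url", "private_date", "secret",
]

# One shared Outside tag, plus Begin/Inside/End/Single for every category.
LABELS = ["O"] + [f"{tag}-{cat}" for cat in CATEGORIES for tag in "BIES"]

print(len(LABELS))  # 33
```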

Context and Performance

Privacy Filter supports a 128,000-token context window, enabling processing of long documents without chunking. The model includes runtime controls for configuring precision-recall tradeoffs through adjustable operating points that modify span detection aggressiveness.

The sequence decoder uses six transition-bias parameters controlling background persistence, span entry, continuation, closure, and boundary handoff to produce coherent span boundaries rather than per-token independent predictions.

Deployment Options

OpenAI states the model can run in web browsers via WebGPU using Transformers.js with quantization (q4), or on laptops and on-premises infrastructure. The model is available through Hugging Face Transformers with standard pipeline API support.
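Once spans are detected, masking is a simple string operation. The sketch below assumes spans arrive in the shape the Transformers token-classification pipeline returns when an aggregation strategy is set (dicts with "start", "end", and "entity_group"). The category names and example spans are hypothetical, and no model is loaded here.

```python
def mask_spans(text, spans):
    """Replace each detected span with a [CATEGORY] placeholder."""
    out, cursor = [], 0
    for span in sorted(spans, key=lambda s: s["start"]):
        if span["start"] < cursor:
            continue  # skip a span that overlaps one already masked
        out.append(text[cursor:span["start"]])
        out.append(f"[{span['entity_group'].upper()}]")
        cursor = span["end"]
    out.append(text[cursor:])
    return "".join(out)


def span_for(text, needle, group):
    """Build a hypothetical detection result for the first occurrence of `needle`."""
    start = text.index(needle)
    return {"entity_group": group, "start": start, "end": start + len(needle)}


text = "Email jane@example.com or call 555-0100."
spans = [
    span_for(text, "555-0100", "private_phone"),
    span_for(text, "jane@example.com", "private_email"),
]
print(mask_spans(text, spans))  # Email [PRIVATE_EMAIL] or call [PRIVATE_PHONE].
```

Working from character offsets rather than token strings keeps the untouched text byte-for-byte identical to the input.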

The Apache 2.0 license permits commercial deployment and fine-tuning on specific data distributions.

Limitations Disclosed

OpenAI explicitly states Privacy Filter is "not an anonymization, compliance, or a safety guarantee" and warns against over-reliance. The model only identifies PII matching its trained taxonomy of 8 categories, which may not cover all privacy use cases or regulatory requirements. OpenAI recommends using it as one layer in a broader privacy-by-design approach rather than a standalone solution.

What This Means

This release addresses a specific enterprise need: fast, on-premises PII detection for data sanitization workflows where cloud APIs are unsuitable due to data residency or throughput requirements. The 128K context window and single-pass labeling design prioritize throughput over the iterative accuracy of larger models. The Apache 2.0 license and small parameter count make it accessible for fine-tuning on domain-specific PII patterns, though organizations must validate it meets their specific privacy requirements rather than treating it as a compliance checkbox.

Source: huggingface.co
