Nvidia Releases Free 4B-Parameter Nemotron 3.5 Content Safety Model with 128K Context
Nvidia has released Nemotron 3.5 Content Safety, a 4-billion parameter multimodal guardrail model fine-tuned from Google Gemma-3-4B. The model is available for free, supports 128K token context windows, and moderates content across 12 languages.
Nemotron 3.5 Content Safety — Quick Specs
Nvidia Releases Free 4B-Parameter Nemotron 3.5 Content Safety Model with 128K Context
Nvidia has released Nemotron 3.5 Content Safety, a 4-billion parameter multimodal guardrail model now available for free through OpenRouter. The model is fine-tuned from Google's Gemma-3-4B and designed specifically for content moderation in AI applications.
Technical Specifications
The model accepts both text and image inputs and returns text output including:
- Safe/unsafe classification for user prompts and responses
- Safety category labels
- Optional reasoning trace when toggled
Nemotron 3.5 Content Safety supports a 128K token context window and covers 12 languages, though the specific languages have not been disclosed by Nvidia.
Use Cases
According to Nvidia, the model is designed for:
- Prompt and response moderation for LLMs and VLMs
- Content classification
- Safety pipelines
- Enterprise AI guardrails with policy enforcement
The model includes a toggleable reasoning mode that provides explanations for its safety decisions, allowing developers to understand why content was flagged.
Model Family
Nemotron 3.5 Content Safety is part of Nvidia's broader Nemotron family of open models for agentic AI. The compact 4B parameter size makes it suitable for deployment in resource-constrained environments while maintaining multimodal capabilities.
Availability
The model is currently available through OpenRouter at no cost. OpenRouter automatically routes requests to providers capable of handling the prompt size and parameters, with fallbacks to maximize uptime.
Nvidia has made model weights available, though specific hosting details and direct API access information have not been disclosed.
What This Means
The release of a free, compact multimodal guardrail model addresses a critical need in AI deployment: content safety at scale without licensing costs. At 4B parameters, it's small enough for practical deployment while the 128K context window allows it to evaluate longer conversations and documents. The multimodal capability is particularly significant, as few free safety models can process both text and images. However, without published benchmark scores or safety evaluation metrics, enterprises will need to test the model against their specific use cases to validate its effectiveness compared to existing solutions.
Related Articles
Nvidia Releases Nemotron 3 Ultra: 550B Parameter MoE Model with 1M Token Context Window
Nvidia has released Nemotron 3 Ultra, a 550B parameter mixture-of-experts model with 55B active parameters and a 1M token context window. The model uses a hybrid Transformer-Mamba architecture and is available for free through OpenRouter, targeting agentic workflows and multi-step reasoning tasks.
NVIDIA Releases Nemotron 3.5 ASR: 600M-Parameter Streaming Speech Model for 40 Languages
NVIDIA released Nemotron 3.5 ASR, a 600M-parameter speech-to-text model supporting 40 language-locales from a single checkpoint. The model achieves 0.07 seconds to final transcript after speech ends and ranks 2nd in latency among streaming ASR models according to Artificial Analysis benchmarks.
NVIDIA Shows Task-Seeded Synthetic Data Boosts Nemotron-3 Nano by +11.1 on GPQA
NVIDIA demonstrated that task-seeded synthetic Q&A data improves model performance across multiple benchmarks in a 100B-token continuation experiment on Nemotron-3 Nano. The approach improved GPQA scores by +11.1 points, MMLU-Pro by +1.8, average code by +1.9, and commonsense understanding by +1.6.
NVIDIA Releases Cosmos3-Super-Text2Image: 64B Parameter Model for Physical AI Applications
NVIDIA released Cosmos3-Super-Text2Image, a 64-billion parameter text-to-image generation model as part of its Cosmos3 collection of omnimodal world models. The model uses a Mixture-of-Transformers architecture combining autoregressive and diffusion transformers, designed for Physical AI applications including robotics and autonomous vehicles.
Comments
Loading...