model releaseNVIDIA

NVIDIA Releases Nemotron 3.5 Content Safety: 4B-Parameter Multimodal Model with Custom Policy Enforcement and 140-Langua

TL;DR

NVIDIA has released Nemotron 3.5 Content Safety, a 4B-parameter model built on Google Gemma 3 4B IT that provides multimodal safety classification across approximately 140 languages. The model includes a 128K context window, custom enterprise policy enforcement, auditable reasoning traces, and is releasing its training dataset.

3 min read
0

NVIDIA Releases Nemotron 3.5 Content Safety: 4B-Parameter Multimodal Model with Custom Policy Enforcement and 140-Language Coverage

NVIDIA has released Nemotron 3.5 Content Safety, a 4B-parameter model built on Google Gemma 3 4B IT that provides multimodal safety classification across approximately 140 languages with a 128K context window.

Core Specifications

Nemotron 3.5 Content Safety uses a LoRA adapter fine-tuned on Google's Gemma 3 4B IT base model. The model processes text, images, and combined inputs in a single inference call and runs on GPUs with 8GB+ VRAM. NVIDIA has not disclosed pricing for API access.

The model provides explicit training coverage for 12 languages—English, French, Spanish, German, Chinese, Japanese, Korean, Arabic, Hindi, Russian, Portuguese, and Italian—while inheriting zero-shot generalization across approximately 140 languages from the Gemma 3 base model.

Custom Policy Enforcement

The primary architectural addition in version 3.5 is custom policy enforcement. According to NVIDIA, the model accepts custom policy specifications alongside input and reasons over those policies when producing verdicts, rather than relying solely on its built-in taxonomy.

This addresses enterprise deployments where different applications require different risk profiles. For example, a healthcare platform's safety requirements differ from those of a developer tools IDE or children's education app. The model can suppress irrelevant categories or inject proprietary risk categories specific to organizational policies.

Three Output Modes

Nemotron 3.5 operates in three modes:

  1. Low-latency binary verdict: Returns safe/unsafe labels for user input and assistant responses
  2. Binary verdict with categories: Adds violated category labels from the 13-category Aegis 2.0 taxonomy
  3. THINK mode: Provides step-by-step reasoning traces before final verdicts

The THINK mode outputs auditable reasoning traces that document why specific verdicts were reached, which NVIDIA states is necessary for compliance logging and human review in regulated industries. When latency is critical, THINK mode can be disabled for faster binary verdicts.

Multimodal Integration

The model evaluates user prompts, images, and assistant responses as a single unified context rather than scoring each independently. According to NVIDIA, this approach catches policy violations that only emerge from interactions between text and images or between requests and responses.

The safety taxonomy follows the Aegis 2.0 framework with 13 core categories aligned with the MLCommons safety taxonomy, plus 10 fine-grained subcategories.

Dataset Release

NVIDIA is releasing the Nemotron 3.5 Content Safety Dataset, which includes multimodal and multilingual training data with safety reasoning traces. The reasoning traces were generated using a two-step process: first using larger models like Qwen 397B to generate chain-of-thought reasoning with ground-truth labels, then condensing those traces using Qwen 80B to fit within three sentences for efficiency.

NVIDIA states that most open-source safety models do not provide training or evaluation sets, and the problem is more severe for multimodal datasets where licensing restrictions often apply to image and video artifacts.

What This Means

Nemotron 3.5 represents NVIDIA's attempt to consolidate multiple safety capabilities—multimodal input, multilingual coverage, custom policies, and explainable reasoning—into a single 4B-parameter model that can run on modest GPU hardware. The custom policy enforcement feature addresses a genuine enterprise need: most production AI systems cannot operate under universal safety rules and require domain-specific risk profiles.

The dataset release is significant for reproducibility in AI safety research, though NVIDIA has not yet disclosed benchmark scores comparing Nemotron 3.5 to competing safety classifiers like OpenAI's Moderation API or Anthropic's content filtering. The model's practical utility will depend on how its accuracy compares to existing solutions and whether the reasoning traces genuinely improve human review efficiency in production deployments.

Related Articles

model release

Nvidia Releases Free 4B-Parameter Nemotron 3.5 Content Safety Model with 128K Context

Nvidia has released Nemotron 3.5 Content Safety, a 4-billion parameter multimodal guardrail model fine-tuned from Google Gemma-3-4B. The model is available for free, supports 128K token context windows, and moderates content across 12 languages.

model release

NVIDIA Releases Nemotron 3.5 ASR: 600M-Parameter Streaming Speech Model for 40 Languages

NVIDIA released Nemotron 3.5 ASR, a 600M-parameter speech-to-text model supporting 40 language-locales from a single checkpoint. The model achieves 0.07 seconds to final transcript after speech ends and ranks 2nd in latency among streaming ASR models according to Artificial Analysis benchmarks.

model release

NVIDIA Releases Cosmos3-Super-Text2Image: 64B Parameter Model for Physical AI Applications

NVIDIA released Cosmos3-Super-Text2Image, a 64-billion parameter text-to-image generation model as part of its Cosmos3 collection of omnimodal world models. The model uses a Mixture-of-Transformers architecture combining autoregressive and diffusion transformers, designed for Physical AI applications including robotics and autonomous vehicles.

model release

NVIDIA Nemotron 3 Ultra launches on AWS SageMaker with 550B parameters, 1M token context window

NVIDIA Nemotron 3 Ultra is now available on Amazon SageMaker JumpStart with 550 billion total parameters and 55 billion active parameters. The model features a hybrid Transformer-Mamba Mixture-of-Experts architecture and supports context windows up to 1 million tokens, targeting agentic AI workloads.

Comments

Loading...