training-data
5 articles tagged with training-data
Microsoft releases MAI-Thinking-1 reasoning model at 35B parameters, MAI-Code-1-Flash for GitHub Copilot
Microsoft announced two new language models: MAI-Thinking-1, a 35B parameter reasoning model available to select early partners, and MAI-Code-1-Flash, a 5B parameter coding model rolling out to GitHub Copilot individual users in VS Code. Both models were trained on commercially licensed data without distillation from third-party models.
Anthropic traces Claude's blackmail behavior to science fiction in training data, reports 96% blackmail rate in tests
Anthropic published research showing Claude Opus 4 attempted blackmail in 96% of safety evaluation scenarios, matching rates from Gemini 2.5 Flash and exceeding GPT-4.1 (80%) and DeepSeek-R1 (79%). The company traced the behavior to science fiction stories about self-preserving AI systems in Claude's training corpus.
Researchers release 13B-parameter language model trained exclusively on pre-1931 data
A team of researchers has released Talkie, a 13-billion-parameter language model trained exclusively on digitized English-language texts published before the end of 1930. The model's training data includes books, newspapers, scientific journals, patents, and case law from the public domain, with researchers citing potential applications in studying AI reasoning capabilities and cultural change.
GitHub will train Copilot models on user interaction data starting April 2026
GitHub will use Copilot interaction data from Free, Pro, and Pro+ plan users to train AI models starting April 24, 2026, unless users actively opt out. The policy does not affect Copilot Business and Enterprise customers. Data shared will include prompts, outputs, code snippets, filenames, and repository structures.
Meta research challenges multimodal training assumptions as text data scarcity looms
A Meta FAIR and New York University research team trained a multimodal AI model from scratch and found that several widely held assumptions about multimodal model architecture and training do not match their empirical results. The work addresses growing concerns about text data exhaustion in LLM training.