training-data
2 articles tagged with training-data
March 26, 2026
product update, GitHub
GitHub will train Copilot models on user interaction data starting April 2026
GitHub will use Copilot interaction data from Free, Pro, and Pro+ plan users to train AI models starting April 24, 2026, unless users actively opt out. The policy does not affect Copilot Business and Enterprise customers. Data shared will include prompts, outputs, code snippets, filenames, and repository structures.
March 8, 2026
research
Meta research challenges multimodal training assumptions as text data scarcity looms
A Meta FAIR and New York University research team trained a multimodal AI model from scratch and found that several widely held assumptions about multimodal model architecture and training do not match their empirical results. The work addresses growing concerns about text data exhaustion in LLM training.