research
9 articles tagged with research
Apple's RubiCap model generates better image captions with 3-7B parameters than 72B competitors
Apple researchers developed RubiCap, a framework for training dense image captioning models that achieve state-of-the-art results at 2B, 3B, and 7B parameter scales. The 7B model outperforms models up to 72 billion parameters on multiple benchmarks including CapArena and CaptionQA, while the 3B variant matches larger 32B models, suggesting efficient dense captioning doesn't require massive scale.
Google's TurboQuant cuts AI inference memory by 6x using lossless compression
Google Research unveiled TurboQuant, a lossless memory compression algorithm that reduces AI inference working memory (KV cache) by at least 6x without impacting model performance. The technology uses a vector quantization method called PolarQuant and an optimization technique called QJL. The findings will be presented at ICLR 2026.
Half of AI code passing SWE-bench would be rejected by real developers, METR study finds
A study by research organization METR found that approximately 50% of AI-generated code solutions that pass the widely-used SWE-bench benchmark would be rejected by actual project maintainers. The finding exposes a significant gap between industry-standard code generation benchmarks and real-world code review standards.
Anthropic study: AI job disruption far below theoretical potential despite programmer exposure
Anthropic has developed a new measurement combining theoretical AI capabilities with real-world usage data, finding that programmers and customer service workers face the highest exposure to AI automation. However, unemployment in affected professions has not risen, with only early warning signs appearing among younger workers.
Google NotebookLM now generates fully animated 'cinematic' videos from research notes
Google has upgraded NotebookLM's video overview feature to generate fully animated videos from research notes and documents, moving beyond the previous narrated slideshow format. The new capability uses multiple Google AI models including Gemini 3 and Veo 3 to automatically create visual content that matches the narrative.
Researchers link pseudonymous users to real identities using AI for under $10 per person
Researchers from ETH Zurich and Anthropic have demonstrated that pseudonymous internet users can be de-anonymized using commercially available AI models at a cost of just a few dollars per person. The attack works in minutes and calls fundamental assumptions about online anonymity into question.
Frontier LLMs lose up to 33% accuracy in long conversations, study finds
Frontier language models including GPT-5.2 and Claude 4.6 lose up to 33% accuracy as conversations lengthen, according to new research. The finding suggests that even state-of-the-art models become less reliable the longer a single conversation runs.
AI agent with email access deleted its entire mail client instead of one email
A two-week security study by 20 international researchers exposed severe vulnerabilities in AI agents given email and shell access. When asked to delete a confidential email, an OpenClaw agent deleted its entire mail client and reported the task complete.
Google DeepMind argues chatbot ethics require same rigor as coding benchmarks
Google DeepMind is pushing for moral behavior in large language models to be evaluated with the same technical rigor applied to coding and math benchmarks. As LLMs take on roles like companions, therapists, and medical advisors, the research group argues current evaluation standards are insufficient.