Large Language Models

LLM

Together AI Open-Sources OSCAR: An Attention-Aware 2-Bit KV Cache Quantization System for Long-Context LLM Serving

Together AI has open-sourced OSCAR, an INT2 KV cache quantization method that reduces memory usage by 8× and improves...

MarkTechPost
LLM

AI models often give the right answers but point to the wrong sources

Leading AI models like GPT and Gemini often cite incorrect source passages to support their answers, even when the...

The Decoder
LLM, OpenAI

OpenAI, Grupo Folha and Grupo UOL announce strategic content partnership

OpenAI has entered a strategic partnership with Brazilian media groups Grupo Folha and Grupo UOL to integrate their...

OpenAI Blog
LLM, OpenAI

Build a Complete Langfuse Observability and Evaluation Pipeline for Tracing, Prompt Management, Scoring, and Experiments

This tutorial demonstrates how to build a complete Langfuse observability pipeline for LLM engineering, covering...

MarkTechPost
LLM

StepFun Releases StepAudio 2.5 Realtime: An End-to-End Voice Model with Roleplay-Specific RLHF and Paralinguistic Comprehension

StepFun released StepAudio 2.5 Realtime, an end-to-end real-time speech language model with customizable persona...

MarkTechPost
LLM

ByteDance study finds that asking LMMs questions beats making them transcribe text for long-document training

ByteDance Seed demonstrates that a 7B parameter model can effectively answer questions on long, image-heavy documents...

The Decoder
LLM, DeepSeek

DeepSeek makes its 75 percent discount permanent, pricing output tokens at least 34x below GPT-5.5

DeepSeek has made its 75% discount on its V4-Pro model permanent, pricing output tokens at least 34 times cheaper than...

The Decoder
LLM, Google

Google’s new anything-to-anything AI model is wild

The article discusses Google's Gemini AI model's capabilities for generating realistic videos, using the author's...

The Verge AI
LLM

Cloudflare CEO Prince says builders and sellers are safe but AI is coming for the measurers

Cloudflare CEO Matthew Prince laid off over 20% of the workforce, attributing the cuts to AI replacing middle...

The Decoder
LLM, Google

Google’s AI search is so broken it can ‘disregard’ what you’re looking for

Google's AI Overviews feature is malfunctioning when users search for the term 'disregard,' returning generic chatbot...

The Verge AI
LLM

Google checks websites for llms.txt in new agentic browsing audit

Google is testing how well websites handle AI agents through a new experimental category called "Agentic Browsing" in...

The Decoder
LLM

Cohere open-sources its strongest model yet

Cohere, a Canadian AI company, has released Command A+, its most powerful language model to date, as open source under...

The Decoder
LLM, Mistral

SAP taps Mistral AI to help customers migrate legacy software

SAP has partnered with Mistral AI to use its language models to help customers migrate legacy software to...

The Decoder
LLM

US government takes $2 billion equity stake in nine quantum computing firms

Beneficiaries include startup backed by firm with links to the Trump family.

Ars Technica AI
LLM

One Model, Three Modalities: ByteDance Releases Lance for Image and Video Understanding, Generation, and Editing

ByteDance's Intelligent Creation Lab released Lance, an open-source multimodal model that performs image and video...

MarkTechPost
LLM

We’re announcing new community investments in Missouri.

We’re helping build the state’s next-generation workforce and investing in energy programs.

Google AI Blog
LLM, OpenAI

OpenAI claims it solved an 80-year-old math problem — for real this time

OpenAI's reasoning model reportedly disproved a geometry conjecture that has remained unsolved since 1946. The claim is...

TechCrunch AI
LLM

Google publishes exploit code threatening millions of Chromium users

Google publishes exploit code before the vulnerability, reported 42 months earlier, is fixed.

Ars Technica AI
LLM

Stability AI releases a new audio model that can create 6-minute songs

Stability AI has released Stability Audio 3.0 small model, capable of generating two-minute audio tracks that can run...

TechCrunch AI