LLM · Other Companies

Chaos erupts as cyberattack disrupts learning platform Canvas amid finals
LLM

Across the country, schools and colleges postpone year-end tests.

Ars Technica AI
See what happens when creative legends use AI to make ads for small businesses.
LLM

The Small Brief is an initiative partnering three advertising industry leaders to create AI-generated advertisements...

Google AI Blog
Zyphra Releases ZAYA1-8B: A Reasoning MoE Trained on AMD Hardware That Punches Far Above Its Weight Class
LLM

Zyphra releases ZAYA1-8B, a reasoning Mixture of Experts model with only 760M active parameters that outperforms much...

MarkTechPost
Ars Asks: Share your shell and show us your tricked-out terminals!
LLM

A celebration of the tweaks and customizations that make life easier at the CLI.

Ars Technica AI
Widely used Daemon Tools disk app backdoored in monthlong supply-chain attack
LLM

Daemon Tools users: It's time to check your machines for stealthy infections, stat.

Ars Technica AI
What an AI-designed car looks like
LLM

Car manufacturers are exploring AI and large language models to accelerate vehicle design and development processes,...

The Verge AI
Amazon brings agentic fine-tuning to SageMaker with support for Llama, Qwen, DeepSeek, and Nova
LLM

Amazon SageMaker AI has introduced agentic fine-tuning capabilities to help developers customize language models, with...

The Decoder
GameStop offers $56 billion for eBay, struggles to explain how it'll pay for it
LLM

Amid falling revenue and store closures, GameStop wants to buy the much larger eBay.

Ars Technica AI
A Developer’s Guide to Systematic Prompting: Mastering Negative Constraints, Structured JSON Outputs, and Multi-Hypothesis Verbalized Sampling
LLM

This article provides a developer's guide to systematic prompting techniques for large language models, focusing on...

MarkTechPost
Xiaomi's open-weight MiMo-V2.5-Pro takes aim at Claude Opus with hours-long autonomous coding
LLM

Xiaomi released MiMo-V2.5-Pro, an open-weight model that nearly matches Anthropic's Claude Opus 4.6 on coding...

The Decoder
What is Tokenization Drift and How to Fix It?
LLM

The article discusses tokenization drift, a subtle but critical issue where AI models can degrade in performance due to...

MarkTechPost
Even the latest AI models make three systematic reasoning errors, ARC-AGI-3 analysis shows
LLM

Analysis of OpenAI's GPT-5.5 and Anthropic's Opus 4.7 on the ARC-AGI-3 benchmark reveals three systematic reasoning...

The Decoder
A Coding Guide on LLM Post Training with TRL from Supervised Fine Tuning to DPO and GRPO Reasoning
LLM

A hands-on tutorial guide for post-training large language models using the TRL library, covering four key techniques:...

MarkTechPost
Qwen AI Releases Qwen-Scope: An Open-Source Sparse AutoEncoders (SAE) Suite That Turns LLM Internal Features into Practical Development Tools
LLM

Qwen AI has released Qwen-Scope, an open-source Sparse Autoencoder suite that provides tools to interpret and utilize...

MarkTechPost
Moonshot AI Open-Sources FlashKDA: CUTLASS Kernels for Kimi Delta Attention with Variable-Length Batching and H20 Benchmarks
LLM

Moonshot AI has open-sourced FlashKDA, a high-performance implementation of Kimi Delta Attention optimized for the...

MarkTechPost
Tencent's 440 MB AI model translates 33 languages offline on your phone
LLM

Tencent released a compact 440 MB AI translation model as an open-weight model that supports 33 languages and runs...

The Decoder
IBM Releases Two Granite Speech 4.1 2B Models: Autoregressive ASR with Translation and Non-Autoregressive Editing for Fast Inference
LLM

IBM has released two versions of the Granite Speech 4.1 2B model for automatic speech recognition (ASR), featuring both...

MarkTechPost
Qwen Team Releases FlashQLA: a High-Performance Linear Attention Kernel Library That Achieves Up to 3× Speedup on NVIDIA Hopper GPUs
LLM

The QwenLM team released FlashQLA, a high-performance kernel library that accelerates linear attention mechanisms,...

MarkTechPost
Sanctioned Chinese AI Firm SenseTime Releases Image Model Built for Speed
LLM

SenseTime, a sanctioned Chinese AI firm, has released a new image generation model optimized to run on Chinese-made...

Wired AI