14 articles tagged with this topic
Google Lets Chrome Run AI Models Directly — The Browser is Becoming the New OS
Google opens Prompt API: web apps call built-in Gemini Nano in Chrome—no servers or API keys. It shifts inference on-device, making AI a native browser capability.
Google Multi-Agent Speeds Code Migration 6x: From Functions to Engineering
Google's multi-agent AI accelerates TensorFlow-to-JAX migration 6x, proving AI can handle systemic engineering tasks that would take months of manual labor.
Chrome Silently Installs 4GB AI Model: Google Races Ahead in Local AI via Browser
Chrome silently installs a ~4GB local AI model without consent. Browsers are becoming AI runtimes—distribution rights now matter more than the models.
Google Doubles Gemma 4 Speed — Speculative Decoding Goes Mainstream
Google's Gemma 4 MTP models use speculative decoding for up to 2x speed with zero quality loss, boosting local LLM practicality and lowering compute barriers.
Google Gemma 4 Fixes Chat Template — Local LLM Usability Inches Forward
Google fixed Gemma 4's chat template bug; community quantized versions updated. Not major news, but proof that local AI usability inches up via detail refinements.
7 Years of Transformer Dominance: LLM Architecture Awaits the Next Reshuffle
Transformer underpins LLMs via self-attention, fixing older architectures' parallelism and long-context flaws. Grasping it reveals LLM capability limits and bottlenecks.
Gemma 4 Per-Layer Embeds: Knowledge-Reasoning Split, Hope or Hype
Gemma 4's per-layer embeddings spark debate: Can knowledge and reasoning scale separately? If so, 2B models could hold 20B knowledge, redefining local AI.
Transformer: 7 Years, 120K Citations—Key to the LLM Race
Google's 2017 Transformer is the LLM bedrock, replacing RNNs with parallel attention. Grasping it reveals who takes shortcuts in the LLM race.
Gemma 4 Hits HuggingFace — Open Source Outpaces Official Toolchain
gemma-4-31B-it-DFlash on HuggingFace lacks llama.cpp support. We see models outpacing toolchains—having models you can't run is the new paradox.
Decade of Seq2Seq: The True Technical Starting Point of LLMs
Google's 2014 Seq2Seq architecture is the shared technical foundation of LLMs like GPT and BERT. Understanding its encoder-decoder division and information bottleneck reveals the true technical starting point of LLMs.
Google Lets AI Recompose Your Photos After the Shot
Google Research demos AI that reframes photos post-capture — shifting the "framing decision" from photographer to algorithm.
Google Engineers Want One Ruleset for Production-Ready AI Code — Harder Than It Sounds
Google engineers are tackling why AI-generated code rarely ships to production, and the fix is more complex than expected.
Gemma 4 Has Hidden MTP Heads Disabled by Google at Launch
A developer found multi-token prediction weights inside Gemma 4's LiteRT files; Google confirmed MTP exists but was intentionally disabled.
Gemma 4 llama.cpp Issues Resolved With Recent Fixes
Google Gemma 4 models now run correctly in llama.cpp after critical fixes for output quality and crashes.