Hardware & Infra · Google unveils two new TPUs designed for the "agentic era" · Google split its next-generation Tensor chip into two specialized processors: one optimized for inference, the other for training. The move signals the company's bet on agentic AI workloads as a distinct infrastructure category. · Ars Technica — AI · Apr 22 · 81
Research · Efficient Multi-Cohort Inference for Long-Term Effects and Lifetime Value in A/B Testing with User Learning · Researchers propose a method to measure long-term treatment effects and lifetime value changes in A/B tests for streaming platforms, addressing the gap between short-term metrics and actual user retention. The approach uses inverse-variance weighting across multiple cohorts to detect interventions that appear beneficial initially but erode value through churn. · arXiv cs.LG · Apr 22 · 52
Research · Relative Entropy Estimation in Function Space: Theory and Applications to Trajectory Inference · Researchers developed a scalable method to estimate KL divergence between probability distributions in function space, addressing a key evaluation bottleneck in trajectory inference from snapshot data. The technique enables better assessment of models reconstructing latent dynamics in fields like single-cell genomics where destructive measurements prevent direct path observation. · arXiv cs.LG · Apr 22 · 52
Products & Apps, Business & Funding · Google makes an interesting choice with its new agent building tool for enterprises · Google launched Gemini Enterprise Agent Platform, positioning it specifically for technical and IT teams rather than business users. The move signals a shift in how major AI vendors are segmenting the enterprise agent market. · TechCrunch — AI · Apr 22 · 58
Policy & Regulation, Products & Apps · Anthropic’s Mythos rollout has missed America’s cybersecurity agency · CISA, the US government's central cybersecurity agency, lacks access to Anthropic's Mythos Preview despite other federal agencies adopting the vulnerability-detection model. The exclusion raises questions about coordination gaps in federal AI procurement for critical infrastructure defense. · The Verge — AI · Apr 22 · 65
Research, Models & Releases · Personalized electric vehicle energy consumption estimation framework that integrates driver behavior with map data · Researchers developed a personalized EV energy consumption model combining LSTM-based driver behavior prediction with physics-based battery simulation and map data. The framework estimates real-time state-of-charge across varied terrain by learning individual driving patterns rather than assuming generic driver profiles. · arXiv cs.LG · Apr 22 · 52
Research · Coverage, Not Averages: Semantic Stratification for Trustworthy Retrieval Evaluation · Researchers formalize retrieval evaluation as a statistical problem and propose semantic stratification, a method that organizes documents into entity-based clusters to systematically test RAG systems across missing query categories. The approach provides formal coverage guarantees and interpretable failure-mode visibility, addressing a core bottleneck in retrieval-augmented generation accuracy. · arXiv cs.LG · Apr 22 · 58
Products & Apps · AI Overviews are coming to your Gmail at work · Google is rolling out AI Overviews to Gmail's enterprise tier, enabling users to generate instant email summaries across multiple messages. The feature brings Google's LLM-powered summarization directly into workplace productivity workflows. · TechCrunch — AI · Apr 22 · 58
Models & Releases · Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model · Alibaba's Qwen3.6-27B achieves coding performance matching its 397B predecessor while shrinking model size from 807GB to 55.6GB, demonstrating major efficiency gains in open-weight model design. · Simon Willison · Apr 22 · 89
Research, Models & Releases · V-tableR1: Process-Supervised Multimodal Table Reasoning with Critic-Guided Policy Optimization · Researchers introduce V-tableR1, a reinforcement learning framework that trains multimodal LLMs to reason step-by-step through visual table tasks using critic feedback. The approach addresses a core weakness in current vision-language models: treating visual reasoning as pattern matching rather than rigorous multi-step inference. · arXiv cs.LG · Apr 22 · 58
Products & Apps · Google Meet will take AI notes for in-person meetings too · Google's Gemini now generates meeting notes and transcripts across in-person gatherings, Zoom, and Microsoft Teams, expanding beyond its original Google Meet-only scope. The feature graduates from Android-only alpha testing to broader availability, positioning Gemini as a cross-platform meeting intelligence layer. · The Verge — AI · Apr 22 · 65
Research · Lifecycle-Aware Federated Continual Learning in Mobile Autonomous Systems · Researchers propose a federated continual learning framework that lets distributed autonomous fleets learn collaboratively while mitigating catastrophic forgetting across mission lifecycles. The approach addresses layer-specific forgetting sensitivity and long-term drift accumulation, moving beyond simulation-only validation toward real-world fleet heterogeneity. · arXiv cs.LG · Apr 22 · 52
Research, Tools & Code · AAC: Admissible-by-Architecture Differentiable Landmark Compression for ALT · Researchers introduce AAC, a differentiable neural module that learns to compress landmark sets for shortest-path heuristics while mathematically guaranteeing admissibility without post-hoc calibration. The technique bridges classical algorithmic search with end-to-end neural training, enabling learned graph compression that preserves formal guarantees. · arXiv cs.LG · Apr 22 · 52
Research, Models & Releases · RespondeoQA: a Benchmark for Bilingual Latin-English Question Answering · Researchers released RespondeoQA, the first question-answering benchmark for Latin-English bilingual tasks with 7,800 QA pairs sourced from historical pedagogical materials. Testing LLaMa 3, Qwen QwQ, and OpenAI's o3-mini revealed all models struggle with skill-oriented questions, suggesting reasoning capabilities remain limited on specialized language tasks. · arXiv cs.CL · Apr 22 · 42
Research, Tools & Code · F²LP-AP: Fast & Flexible Label Propagation with Adaptive Propagation Kernel · Researchers propose F²LP-AP, a training-free graph neural network method that classifies nodes without expensive iterative training by adapting propagation parameters to local graph structure. The approach uses geometric medians and clustering coefficients to handle both homophilous and heterophilous graphs, addressing a key GNN limitation. · arXiv cs.LG · Apr 22 · 52
Research, Tools & Code · Fast Bayesian equipment condition monitoring via simulation based inference: applications to heat exchanger health · Researchers propose a neural-network-based alternative to MCMC for real-time industrial equipment diagnostics, using simulation-based inference to map sensor data directly to degradation parameters without expensive likelihood computations. The method targets heat exchanger monitoring but generalizes to any complex failure-mode diagnosis under uncertainty. · arXiv cs.LG · Apr 22 · 52
Research · Near-Future Policy Optimization · Researchers propose Near-Future Policy Optimization (NPO), a reinforcement learning technique that balances high-quality external trajectories with accessible training data by optimizing the ratio of value gain to absorption cost, addressing a key bottleneck in post-training RL systems. · arXiv cs.LG · Apr 22 · 58
Research · Anchor-and-Resume Concession Under Dynamic Pricing for LLM-Augmented Freight Negotiation · Researchers propose a two-index framework for LLM-powered freight negotiation that adapts concession strategies to dynamic pricing without violating offer monotonicity, addressing vulnerabilities in current AI broker systems. · arXiv cs.CL · Apr 22 · 42
Research, Tools & Code · Supplement Generation Training for Enhancing Agentic Task Performance · Researchers propose Supplement Generation Training, a method where smaller LLMs generate task-specific prompts that boost larger foundation models' performance without retraining them. The approach decouples optimization from massive models, reducing computational overhead and enabling faster adaptation to new domains. · arXiv cs.LG · Apr 22 · 58
Research · Exploiting LLM-as-a-Judge Disposition on Free Text Legal QA via Prompt Optimization · Researchers tested automatic prompt optimization on legal QA evaluation, finding that AI judges trained with lenient feedback criteria outperform strict baselines and generalize better across different judge models. The ProTeGi method consistently beat human-designed prompts on the LEXam benchmark using Qwen3 and DeepSeek judges. · arXiv cs.CL · Apr 22 · 52
Research · Tokenised Flow Matching for Hierarchical Simulation Based Inference · Researchers propose Tokenised Flow Matching for Posterior Estimation (TFMPE), a technique that cuts simulator costs in hierarchical inference by training neural surrogates on single-site data rather than multi-site batches, then assembling synthetic observations to amortise full posterior inference. · arXiv cs.LG · Apr 22 · 52
Research, Tools & Code · COMPASS: COntinual Multilingual PEFT with Adaptive Semantic Sampling · Researchers propose COMPASS, a parameter-efficient fine-tuning framework that uses semantic clustering to selectively sample multilingual training data, reducing negative cross-lingual interference when adapting LLMs to new languages. · arXiv cs.CL · Apr 22 · 58
Policy & Regulation · AI Tools Are Helping Mediocre North Korean Hackers Steal Millions · North Korean threat actors leveraged AI to automate malware development and social engineering, stealing up to $12 million in a three-month campaign. The incident underscores how AI commoditizes attack sophistication for lower-skilled adversaries, expanding the threat surface beyond well-resourced nation-states. · WIRED — AI · Apr 22 · 65
Research · Generative Flow Networks for Model Adaptation in Digital Twins of Natural Systems · Researchers propose using Generative Flow Networks to calibrate digital twin simulators of natural systems when observations are sparse and indirect. The approach frames model adaptation as a generative problem, allowing multiple plausible parameter configurations to be sampled by likelihood rather than forcing a single optimal fit. · arXiv cs.LG · Apr 22 · 52
Research, Tools & Code · Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing · Researchers synthesized nine years of adversarial robustness literature and released Auto-ART, an open-source framework with 50+ attacks and gradient-masking detection that maps to NIST, OWASP, and EU AI Act standards. The work addresses fragmented evaluation protocols that have hindered trustworthy ML deployment claims. · arXiv cs.LG · Apr 22 · 62
Models & Releases, Hardware & Infra · Gemma 4 VLA Demo on Jetson Orin Nano Super · Google's Gemma 4 VLA (vision-language-action model) now runs on Nvidia's Jetson Orin Nano Super, bringing multimodal inference to edge devices. This expands accessible on-device AI capabilities for robotics and embedded applications. · Hugging Face · Apr 22 · 72
Research, Models & Releases · Storm Surge Modeling, Bias Correction, Graph Neural Networks, Graph Convolution Networks · Researchers introduced StormNet, a graph neural network combining convolutional and attention mechanisms with LSTMs to correct bias in storm surge forecasts from traditional models like ADCIRC. The spatio-temporal approach captures dependencies across water-level monitoring stations to improve tropical cyclone impact predictions. · arXiv cs.LG · Apr 22 · 52
Hardware & Infra, Products & Apps · Google unveils 8th-gen TPUs, agent platform, and Workspace AI layer at Cloud Next '26 · Google rolled out eighth-generation TPUs alongside a new agent platform and Workspace AI layer at Cloud Next '26, consolidating its infrastructure and enterprise software under an 'Agentic Enterprise' strategy. The moves signal Google's push to compete in both AI compute and agent-driven productivity tools. · The Decoder · Apr 22 · 85
Research · MGDA-Decoupled: Geometry-Aware Multi-Objective Optimisation for DPO-based LLM Alignment · Researchers propose MGDA-Decoupled, a geometry-based multi-objective optimization method that balances competing alignment goals in LLM training without relying on reinforcement learning or explicit reward models. The technique addresses fairness issues in existing DPO pipelines by preventing systematic under-weighting of harder-to-optimize objectives like truthfulness or harmlessness. · arXiv cs.LG · Apr 22 · 58
Research · Variance Is Not Importance: Structural Analysis of Transformer Compressibility Across Model Scales · Researchers systematically tested compression techniques across GPT-2 and Mistral 7B, discovering that high-variance activations don't correlate with model importance and that transformer blocks behave linearly only under specific input distributions. The findings challenge conventional assumptions about which components matter for efficient inference. · arXiv cs.LG · Apr 22 · 62