Modelwire

A curated feed of what matters in AI. Independent, ad-supported, built in Denver, Colorado.

© 2026 Modelwire. All article links go to the original publishers. Summaries generated by Modelwire. We don’t republish full articles.

Earlier stories

The full Modelwire feed, ordered by publish time.

Hardware & Infra Products & Apps

Marked-up Mac minis flood eBay amid shortages driven by AI

Mac mini stock depletion is driving secondary-market markups as developers and AI enthusiasts rush to acquire the machines for running local language models and inference workloads. The shortage signals growing demand for affordable on-device AI compute outside cloud infrastructure.

TechCrunch — AI·Apr 24

58

Research Tools & Code

CRAFT: Clustered Regression for Adaptive Filtering of Training data

Researchers introduce CRAFT, a data selection method that identifies high-quality training subsets for sequence-to-sequence models by clustering source data and matching target distributions. The technique reduces fine-tuning costs on massive corpora while maintaining model performance.

arXiv cs.CL·Apr 24

52

BERAG: Bayesian Ensemble Retrieval-Augmented Generation for Knowledge-based Visual Question Answering

Researchers propose BERAG, a Bayesian ensemble method for retrieval-augmented generation that addresses the lost-in-the-middle problem and computational scaling issues in visual question answering by avoiding document concatenation and improving attribution.

arXiv cs.CL·Apr 24

52

Operational Feature Fingerprints of Graph Datasets via a White-Box Signal-Subspace Probe

Researchers introduce WG-SRC, a white-box diagnostic tool that decodes what graph neural networks learn during node classification by decomposing message passing into interpretable signal components. The method replaces opaque learned representations with fixed graph-signal dictionaries, enabling practitioners to diagnose which mechanisms a dataset actually requires.

arXiv cs.LG·Apr 24

52

Research Tools & Code

Iterative Model-Learning Scheme via Gaussian Processes for Nonlinear Model Predictive Control of (Semi-)Batch Processes

Researchers propose embedding Gaussian Processes into nonlinear model predictive control for batch chemical processes, learning dynamics iteratively from each production run rather than requiring upfront mechanistic models. The approach uses uncertainty quantification to enforce safety constraints while improving control performance batch-by-batch.

arXiv cs.LG·Apr 24

42

Rethinking XAI Evaluation: A Human-Centered Audit of Shapley Benchmarks in High-Stakes Settings

Researchers tested eight competing Shapley value variants across fraud detection and risk workflows with 3,735 professional analyst reviews, finding that standard quantitative metrics for explainability don't correlate with what actually helps humans make decisions in high-stakes settings.

arXiv cs.LG·Apr 24

62

Can QPP Choose the Right Query Variant? Evaluating Query Variant Selection for RAG Pipelines

Researchers investigate whether Query Performance Prediction can identify the best query reformulation before running expensive retrieval and generation steps in RAG pipelines. The work shifts QPP focus from estimating query difficulty across topics to selecting optimal variants within a single information need, tested at scale across retrieval and end-to-end RAG systems.

arXiv cs.CL·Apr 24

52

Models & Releases

GPT-5.5 tops benchmarks but still hallucinates frequently and costs 20 percent more over the API

OpenAI's GPT-5.5 reclaims top benchmark performance but remains prone to hallucinations while raising API costs by 20 percent. The tradeoff leaves it competitive on value among proprietary models despite the price increase.

The Decoder·Apr 24

73

Quality-Driven Selective Mutation for Deep Learning

Researchers propose a probabilistic framework to measure mutant quality in deep learning testing, balancing two criteria: resistance (how hard mutants are to kill) and realism (how well they simulate actual bugs). The work addresses a gap in DL testing methodology by unifying metrics that guide test improvement and fault simulation.

arXiv cs.LG·Apr 24

42

Adversarial Malware Generation in Linux ELF Binaries via Semantic-Preserving Transformations

Researchers developed an adversarial malware generator targeting Linux ELF binaries that achieved a 67.74% evasion rate against MalConv, a deep learning classifier. The work highlights a significant gap in adversarial ML research, which has focused heavily on Windows PE files while leaving Linux systems understudied.

arXiv cs.LG·Apr 24

52

Research Models & Releases

CLVAE: A Variational Autoencoder for Long-Term Customer Revenue Forecasting

Researchers propose CLVAE, a variational autoencoder that combines probabilistic customer modeling with flexible neural networks to forecast long-term revenue from sparse transaction data. The hybrid approach aims to balance the interpretability of traditional attrition models with the adaptability of modern machine learning, addressing a core challenge in marketing resource allocation.

arXiv cs.LG·Apr 24

52

Hardware & Infra Business & Funding

The AI Race Is Becoming an Infrastructure Contest

Major AI vendors are racing to build compute, power, and datacenter capacity at scale, betting billions on infrastructure before market demand is fully proven. The shift signals that hardware and energy constraints, not model innovation alone, now define competitive advantage in AI.

AI Business·Apr 24

66

Research Models & Releases

Mixed Membership sub-Gaussian Models

Researchers propose mixed membership sub-Gaussian models that extend classical Gaussian mixture models to allow observations to belong to multiple components simultaneously. The approach preserves interpretability while gaining flexibility for applications like genetics and text mining where partial membership is natural.

arXiv cs.LG·Apr 24

52

Identifying and typifying demographic unfairness in phoneme-level embeddings of self-supervised speech recognition models

Researchers mapped two distinct failure modes in speech recognition embeddings, random variance and systematic bias. They found that phoneme classifiers trained on underperforming speaker groups sometimes generalize better than those trained on high-performing groups, suggesting a path toward fairer ASR systems.

arXiv cs.CL·Apr 24

58

Detecting Concept Drift in Evolving Malware Families Using Rule-Based Classifier Representations

Researchers developed a rule-based method to detect concept drift in malware classifiers by tracking changes in decision tree rulesets across time windows. Testing on six malware families showed that fixed two-month intervals combined with feature correlation metrics most reliably flagged when models degrade, offering a practical approach for maintaining classifier performance in adversarial settings.

arXiv cs.LG·Apr 24

52

Beyond Patient Invariance: Learning Cardiac Dynamics via Action-Conditioned JEPAs

Researchers propose action-conditioned world models for cardiac diagnosis that learn disease progression as state transitions rather than static labels, addressing a fundamental misalignment in self-supervised healthcare AI where invariance objectives suppress the pathological changes clinicians need to detect.

arXiv cs.LG·Apr 24

58

Policy & Regulation Business & Funding

China moves to block tech firms from taking US money without government approval

China is requiring government approval before domestic tech firms can accept US capital, tightening control over foreign investment in the sector. The move signals escalating tech nationalism and directly impacts AI startups seeking US funding for model development and infrastructure.

The Decoder·Apr 24

73

Dharma, Data and Deception: An LLM-Powered Rhetorical Analysis of Cow-Urine Health Claims on YouTube

Researchers used GPT-4, GPT-4o, Gemini 2.5 Pro, and Mistral Medium 3 to annotate 100 YouTube transcripts promoting cow urine as medicine, building a 14-category taxonomy of persuasive tactics. The study maps how LLMs can systematically detect rhetorical manipulation in health misinformation across culturally specific contexts.

arXiv cs.CL·Apr 24

52

Research Models & Releases

Adaptive Head Budgeting for Efficient Multi-Head Attention

Researchers propose BudgetFormer, a Transformer variant that dynamically allocates attention heads based on input complexity rather than activating all heads uniformly. The approach targets efficiency gains in tasks like text classification where full head diversity is unnecessary, addressing a fundamental mismatch between fixed architecture design and variable computational needs.

arXiv cs.LG·Apr 24

58

Hardware & Infra Business & Funding

Meta buys tens of millions of AWS Graviton 5 processor cores from Amazon

Meta is committing to tens of millions of AWS Graviton 5 cores, positioning itself as a top-tier customer for Amazon's custom silicon. The deal signals major cloud players are diversifying away from Nvidia and betting on custom chips for AI workload economics.

The Decoder·Apr 24

85

Explanation of Dynamic Physical Field Predictions using WassersteinGrad: Application to Autoregressive Weather Forecasting

Researchers propose WassersteinGrad, a gradient-based method to explain predictions from autoregressive neural networks on dynamic physical fields like weather forecasting. The technique adapts existing attribution methods to handle high-dimensional spatiotemporal data, addressing the operational need for interpretability in AI systems deployed in safety-critical domains.

arXiv cs.LG·Apr 24

52

Useful nonrobust features are ubiquitous in biomedical images

Researchers found that medical imaging models rely heavily on adversarially vulnerable features to achieve high accuracy on standard benchmarks, but these shortcuts collapse under distribution shifts. The work quantifies a robustness-accuracy tradeoff across five MedMNIST tasks, suggesting practitioners must choose between in-distribution performance and real-world reliability.

arXiv cs.LG·Apr 24

58

Research Tools & Code

QuantClaw: Precision Where It Matters for OpenClaw

Researchers propose QuantClaw, a dynamic precision routing system that cuts inference costs for OpenClaw agent systems by assigning lower precision to simpler tasks while preserving accuracy where needed. The work demonstrates that quantization sensitivity varies significantly across agent workflows, offering a practical plug-and-play optimization for cost-prohibitive long-context reasoning.

arXiv cs.CL·Apr 24

58

Models & Releases Research

SpikingBrain2.0: Brain-Inspired Foundation Models for Efficient Long-Context and Cross-Platform Inference

SpikingBrain2.0, a 5B parameter model, combines sparse attention mechanisms across layers to cut inference costs on long-context tasks while maintaining performance. The architecture pairs sparse softmax and linear attention variants with dual quantization paths, targeting efficiency gains for deployment across platforms.

arXiv cs.LG·Apr 24

52

Adversarial Co-Evolution of Malware and Detection Models: A Bilevel Optimization Perspective

Researchers propose a bilevel optimization framework to defend ML-based malware detectors against adaptive adversarial attacks, demonstrating significant improvements over standard adversarial training on three malware families: Mokes, Strab, and DCRat.

arXiv cs.LG·Apr 24

58

Learning Evidence Highlighting for Frozen LLMs

Researchers propose HiLight, a reinforcement learning method that trains a lightweight module to tag important evidence spans in long contexts, letting frozen LLMs reason more effectively without modifying the underlying model or requiring labeled data.

arXiv cs.CL·Apr 24

52

Research Tools & Code

Data-Free Contribution Estimation in Federated Learning using Gradient von Neumann Entropy

Researchers propose using spectral entropy of gradient updates to estimate client contribution in federated learning without server-side validation data. The approach, tested on CIFAR-10/100 and FEMNIST, enables privacy-preserving reward allocation and prevents manipulation in distributed training systems.

arXiv cs.LG·Apr 24

52

Business & Funding Hardware & Infra

The AI Compute Crunch Is Here (and It's Affecting the Entire Economy)

Venture capital's subsidy model for cheap AI is hitting limits as compute demand strains labor markets, hardware supply, and power grids. The infrastructure crunch is now rippling across the broader economy, forcing real trade-offs beyond the AI industry itself.

404 Media·Apr 24

81

Research Tools & Code

SOLAR-RL: Semi-Online Long-horizon Assignment Reinforcement Learning

Researchers propose SOLAR-RL, a hybrid reinforcement learning framework that combines offline trajectory data with selective online interactions to train GUI agents powered by multimodal LLMs. The method aims to reduce expensive real-time environment interactions while preserving long-horizon task semantics that static datasets miss.

arXiv cs.LG·Apr 24

52

Are Natural-Domain Foundation Models Effective for Accelerated Cardiac MRI Reconstruction?

Researchers tested whether general-purpose vision foundation models like CLIP and DINOv2 can reconstruct accelerated cardiac MRI scans when frozen into unrolled reconstruction pipelines, comparing their effectiveness against biomedical-specific alternatives like BiomedCLIP.

arXiv cs.LG·Apr 24

52
