Modelwire

ComPASS: Towards Personalized Agentic Social Support via Tool-Augmented Companionship


Researchers introduce ComPASS, a framework that equips conversational agents with external tools to deliver personalized emotional support across multiple modalities. The work grounds agentic companionship in psychological social support theory and includes ComPASS-Bench, a new benchmark for evaluating personalized support interactions.

Modelwire context

Explainer

The meaningful distinction in ComPASS is not that it uses tools but that it grounds tool selection in established social support theory. The agent is supposed to know when to search for resources, when to reflect emotion back, and when to do nothing, rather than defaulting to a single response mode. ComPASS-Bench is the part worth scrutinizing: benchmarks for emotional support are notoriously hard to validate because ground truth is contested even among clinicians.
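To make the idea of choosing among response modes concrete, here is a minimal, hypothetical sketch of a router that picks between searching for resources, reflecting emotion, and holding back. It is not the ComPASS implementation: the `Action` names, the cue lists, and the keyword matching are all assumptions made for illustration, and a real system would presumably rely on a learned model rather than string matching.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical response modes, named after the three behaviors in the text."""
    SEARCH_RESOURCES = "search_resources"  # call an external tool to find resources
    REFLECT_EMOTION = "reflect_emotion"    # mirror the user's feelings, no tool call
    HOLD_BACK = "hold_back"                # deliberately do nothing further


# Illustrative cue lists (assumptions, not taken from the paper).
INFO_CUES = ("how do i", "where can i", "recommend", "find me")
EMOTION_CUES = ("i feel", "lonely", "overwhelmed", "sad")


def choose_action(message: str) -> Action:
    """Pick a response mode from simple keyword cues in the user's message."""
    text = message.lower()
    if any(cue in text for cue in INFO_CUES):
        return Action.SEARCH_RESOURCES
    if any(cue in text for cue in EMOTION_CUES):
        return Action.REFLECT_EMOTION
    # No clear signal: default to not intervening.
    return Action.HOLD_BACK
```

The point of the sketch is only the shape of the decision: tool use is one branch among several, and "do nothing" is an explicit outcome rather than an omission.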

The agentic framing here connects directly to the broader agent benchmarking wave we covered around mid-April. CoopEval (covered April 16) raised a pointed question about whether LLM agents behave appropriately in socially consequential situations, finding that models defect rather than cooperate under pressure. ComPASS is asking a related but distinct question: can agents calibrate behavior to individual emotional needs over time? The social stakes in companionship are arguably higher than in game-theoretic dilemmas, yet the evaluation infrastructure is less mature. The Codex and MM-WebAgent coverage from the same period shows tool use becoming a baseline expectation for agents, which makes the psychological grounding in ComPASS the actual differentiator worth examining.

Watch whether ComPASS-Bench gets adopted by any third-party research group within six months. If it does, the benchmark has legs; if citations cluster only around the original authors, the evaluation criteria likely reflect design choices too specific to generalize.

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: ComPASS · ComPASS-Bench


Modelwire Editorial

This synthesis and analysis was prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.

Modelwire summarizes; we don’t republish. The full content lives on arxiv.org. If you’re a publisher and want a different summarization policy for your work, see our takedown page.
