Advances in language agents that can follow instructions and use tools have renewed interest in autonomous agents and multi-agent systems. Like previous generations of agents, language agents are designed for specific tasks, highlighting the need for open networks of agents that complement each other's abilities to tackle more complex problems. New protocols are rapidly emerging to allow agents to discover and use tools, or to discover and interact with other agents. Some of these protocols build on Web standards to promote interoperability, but their alignments, misalignments, and overlaps are unclear. This report synthesizes the large body of research on autonomous agents and multi-agent systems (MAS) to define a conceptual model for understanding Web-based MAS. We use this conceptual model to classify existing technologies and frameworks, to identify relevant standards within the W3C, and to discover standardization gaps (if any).

Introduction

Terminology

Agent
An entity situated in an environment that perceives its environment and acts on it, over time, in pursuit of its goals. For a detailed discussion of agent definitions, see [[FRANKLIN96]].
Agent Interaction Protocol
A specification of communication among two or more agents that states who can say what to whom and when. Such a specification can be expressed, for example, as message sequence diagrams [[AUML]] or information flows [[BSPL]].
Artifact or Tool
A resource [[WEBARCH]] that can be shared and used by agents to support their activities. In some multi-agent systems, agents can construct artifacts to instrument their environments [[JACAMO]].
Augmented Language Model
A language model augmented with abilities such as reasoning, tool use, information retrieval, or storing context across interactions. Unlike an agent, an augmented language model does not actively pursue goals and is not situated in an environment. See also [[TMLR23]] and [[ANTHROPIC24]].
Multi-Agent System (MAS)
A system composed of agents that are situated in a shared environment and interact with one another to achieve individual or collective goals. Agents can work in collaboration, cooperation, and/or competition. A MAS can be either an open or a closed system. This report is primarily concerned with open MAS.
Situatedness
The ability of an agent to interact with its environment directly through perception and action, and to respond in a timely fashion to sensory input.
[Term]
[To be added]

Agents on the Web

Visions of Agents on the Web

The vision of intelligent agents on the Web is almost as old as the Web itself: in a keynote at WWW'94, Sir Tim Berners-Lee noted that documents on the Web describe real objects and the relationships among them, and that if the semantics of these objects are represented explicitly, then machines can browse through and manipulate reality. This vision was published in 2001 as the Semantic Web [Berners-Lee et al., 2001], and is now closer to its realization through the standardization of the Web of Things (WoT) at the W3C and the IETF.

In the AI community, the vision of a world-wide open network of intelligent agents can be traced back to the late '90s. In 2002, the AgentCities initiative reported a network of 41 agent platforms deployed in 21 countries [Willmott et al., 2002], with up to 60 registered platforms reported in 2003 [Dale et al., 2003] and 160 platforms in 2005 [Bellifemine et al., 2005]. The network was based on the standards produced by the Foundation for Intelligent Physical Agents (FIPA), but quickly faded after the mid-2000s as industry attention shifted to Web services. Another prominent initiative was the DARPA Control of Agent-Based Systems (CoABS) research program [TODO], which investigated the control, coordination, and management of large systems of autonomous software agents in military applications. Central to this program was the CoABS Grid, a middleware that integrated heterogeneous agent-based systems, object-based applications, and legacy systems using remote method invocation as a client-server style for network-based interaction.

The DARPA CoABS program demonstrated the use of agent technology in large-scale practical applications, but also raised a number of challenges, such as enabling software agents to dynamically identify and understand information sources [TODO]. To address these challenges, DARPA launched the Agent Markup Language (DAML) research program, which built on top of existing Web standards and paved the way for the Web Ontology Language (OWL), Semantic Markup for Web Services (OWL-S), and other cornerstones of the Semantic Web. The DAML program thus advanced the original vision of the Web as an information space not only for people but also for intelligent agents, and promoted a shift from custom-built middleware for MAS (such as CoABS Grid or FIPA implementations) to offloading many of those responsibilities to the existing Web infrastructure. Web-based MAS have received significant attention over the years, especially with the advent of service-oriented computing in the early 2000s [Singh and Huhns, 2006].

Recent years have brought renewed interest in Web-based MAS, as evidenced by Dagstuhl Seminar 21072 (Feb. 2021) and Dagstuhl Seminar 23081 (Feb. 2023) on "Agents on the Web", which led to the creation of the W3C Autonomous Agents on the Web (WebAgents) Community Group. One key development is the Web of Things (WoT) [TODO], which unlocks new practical use cases for agents on the Web and implements several visionary ideas expressed in the motivating scenarios of the original Semantic Web paper [Berners-Lee et al., 2001]. Another key development is the recent progress in language agents that can follow instructions and use tools: just like previous generations of agents, language agents are designed for specific tasks, highlighting the need for open networks of agents that complement each other's abilities to tackle more complex problems. New protocols and frameworks are rapidly emerging to allow agents to discover and use tools, or to discover and interact with other agents, and many of these initiatives build on Web standards to promote interoperability (e.g., see the Model Context Protocol, Agent2Agent Protocol, Agent Network Protocol, Eclipse LMOS).

State of Web-based Multi-Agent Systems

| Framework | Relevant Concepts | Agent Interaction | Tool Use | Identifiers | Descriptions | Discovery Mechanisms | Architectural Style |
|---|---|---|---|---|---|---|---|
| MCP | Tool, Resource, Prompt | N/A | Function calling | Strings (Tools and Prompts), URIs (Resources) | Tool definition, Resource descriptions, Prompt definitions (JSON) | Directories (via */list) | Client-Server with streaming RPC connectors (JSON-RPC 2.0, HTTP+SSE) |
| A2A | Agent Card, Task | Task invocation | N/A | Strings? | Agent Card, Task description (JSON) | Well-known URIs, Directories | Async. Client-Server with streaming RPC connectors and webhooks (JSON-RPC 2.0, HTTP+SSE) |
| ANP | Agent, Agent Description, Communication Protocol | Communication protocols with protocol negotiation | N/A | W3C DID with custom Web-based Agent DID Method | Agent Description (RDF/JSON-LD) | Directories | Peer-to-Peer? (WebSocket subprotocol) |
| LMOS | Agent, Agent Group, Tool, Agent Description, Tool Description | Message passing? (in principle: TD interaction affordances) | Property Affordances, Event Affordances, Action Affordances (W3C WoT TD) | Uniform identifiers (IRIs, W3C DIDs) | Agent Description, Tool Description (W3C WoT TD; JSON, RDF/JSON-LD) | DNS-SD/mDNS, Well-known URIs, Directories (W3C WoT Discovery) | W3C WoT Arch.? with protocol bindings for HTTP and WebSocket subprotocol |
| FIPA | Agent, Agent Directory, Service Directory, Agent Communication Language, Interaction Protocol | FIPA Agent Communication Language, FIPA Agent Interaction Protocols | N/A | FIPA Agent Name | FIPA Agent Identifier Description | Directories | TODO |
| hMAS | Agent, Artifact, Agent Body, Workspace, Signifier, Role, Group, Organization, Resource Profile | Message passing, Signifiers for agent body affordances | Signifiers (W3C WoT TD, hMAS ontology) | Uniform identifiers (IRIs, W3C DIDs) | Resource Profile (W3C WoT TD or hMAS ontology; RDF/Turtle) | Hypermedia crawling, Search engines, Directories | Async. Client-Server with REST connectors (HTTP) and brokered pub/sub (W3C WebSub) |
| Multi-Agent MicroServices (MAMS) | Agent, Agent Body, Resource, Microservices | FIPA ACL (over HTTP), REST, HTTP API, JMS | REST, HTTP API, JMS, W3C WoT TD | URIs (Agents, Agent Bodies, Resources) | Agent Bodies (JSON, JSON-LD (inc. W3C WoT Hypermedia Controls Ontology), HAL) | Service Registries (Netflix Eureka), Link Crawling, Link Sharing | Microservices Architecture, Event Driven Architecture, REST |
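
To make the "Directories (via */list)" entry in the MCP row more concrete, the following minimal sketch builds the JSON-RPC 2.0 requests that a client could send to enumerate a server's tools, resources, and prompts. The helper name `build_list_request` and the request ids are hypothetical and chosen only for illustration. The sketch constructs the messages without opening a transport, and the MCP specification remains the authoritative source for the exact message shapes.

```python
import json

# MCP capability kinds that expose a "<kind>/list" method, matching the
# "Directories (via */list)" entry in the table above.
LISTABLE_KINDS = ("tools", "resources", "prompts")


def build_list_request(kind: str, request_id: int) -> dict:
    """Build a JSON-RPC 2.0 request object for the "<kind>/list" method.

    Hypothetical helper for illustration; it is not part of any MCP SDK.
    """
    if kind not in LISTABLE_KINDS:
        raise ValueError(f"unsupported kind: {kind!r}")
    return {"jsonrpc": "2.0", "id": request_id, "method": f"{kind}/list"}


if __name__ == "__main__":
    # Print one request per listable kind, with illustrative request ids.
    for request_id, kind in enumerate(LISTABLE_KINDS, start=1):
        print(json.dumps(build_list_request(kind, request_id)))
```

Because discovery here is a method call on an already known server, the client must first learn the server's address by other means, which is one way MCP differs from the well-known URI and crawling mechanisms listed for other rows.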

Agents and Web Services

Agents and the Decentralized Social Web

The decentralized Social Web emerged as an alternative to centralized social media on the Web. As early as 2009, Sir Tim Berners-Lee, inventor of the World Wide Web, criticized centralized cloud services for creating closed silos of data that are tightly coupled to the applications that can use the data. These silos remain under the control of cloud storage providers and prevent users from controlling their own data. He proposed instead that applications should be decoupled from data [[CLOUDSTORAGE]], using the Web architecture combined with the Linked Data principles [[LINKEDDATA]].

In 2016, this proposal was realized by the Solid project [[SOLID]]. Solid is a protocol that enables each user to own one or more pods that contain their data. Each pod is structured as a Linked Data Platform [[LDP]]. The data stored in each pod can be accessed by different applications, depending on the authorizations they are given to read or write data in that pod. Solid therefore satisfies the principles developed by Tim Berners-Lee in [[CLOUDSTORAGE]] as well as the principles for the decentralized Web developed in [[DECENTRALIZEDWEB]]. Solid is built using Semantic Web technologies, which were themselves developed to enable machines, such as software agents, to interact meaningfully with Web resources [[SEMWEB01]]. Software agents have been integrated into Solid applications. In particular, LLM agents combined with Solid have been described as enabling the original vision of the Semantic Web [[CHARLIE24]]. Tim Berners-Lee asserts that the development of LLMs, combined with data wallets (such as Solid pods) that store personal information, enables personal assistants that access data shared with the user's consent in order to provide better results [[CHARLIEWORKS]].

Another contribution to decentralized social media that also relies on Semantic Web technologies is the Fediverse, a set of platforms that implement the ActivityPub protocol. This protocol enables different social media platforms to interoperate. It relies on the ActivityStreams format, which is compatible with Semantic Web technologies such as JSON-LD, to exchange information across platforms. While ActivityPub and Solid have different objectives (interoperable social media vs. decentralized data storage), their grounding in Semantic Web technologies and their emphasis on decentralization make them complementary, as seen in projects like ActivityPods, which enables the creation of social media apps that are integrated within the Fediverse and built over Solid.

Another approach to decentralized social media relies on cryptographic technologies such as blockchains. Together, these technologies form what has been called "Web3", which includes decentralized applications built on blockchains such as Ethereum, and the InterPlanetary File System (IPFS) for distributed storage of information, which makes censorship more difficult because it requires blocking or shutting down each server that holds the information. While such initiatives often evolve in a different ecosystem than Solid and ActivityPub, some connections exist. One example is the W3C standard for Decentralized Identifiers (DIDs), which are URIs that can be resolved to a JSON-LD document providing information to identify users (including humans and software agents). While DIDs have an explicit connection to the Semantic Web through the use of JSON-LD, they also have a connection to blockchains, which can serve as storage for DID documents.
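
As a rough illustration of how a DID is structured as a URI, the sketch below splits an identifier into its method name and method-specific identifier, following the general `did:<method>:<method-specific-id>` shape. The function `parse_did`, the `ParsedDid` type, and the sample identifiers are hypothetical, and the pattern is an assumption that is deliberately looser than the full grammar in the W3C DID specification. The sketch checks syntax only and does not resolve the DID to a DID document.

```python
import re
from typing import NamedTuple

# Simplified pattern for "did:<method>:<method-specific-id>".
# Assumption: looser than the full ABNF in the W3C DID specification
# (for example, percent-encoded sequences are not validated).
DID_PATTERN = re.compile(r"^did:([a-z0-9]+):([A-Za-z0-9._%:-]*[A-Za-z0-9._%-])$")


class ParsedDid(NamedTuple):
    method: str
    method_specific_id: str


def parse_did(did: str) -> ParsedDid:
    """Split a DID into its method and method-specific identifier.

    Hypothetical helper for illustration; it performs no resolution.
    """
    match = DID_PATTERN.match(did)
    if match is None:
        raise ValueError(f"not a syntactically valid DID: {did!r}")
    return ParsedDid(method=match.group(1), method_specific_id=match.group(2))


if __name__ == "__main__":
    # Illustrative identifier only; it does not refer to a real subject.
    print(parse_did("did:example:123456789abcdefghi"))
```

The method name determines how the identifier is resolved, which is why some DID methods anchor their documents in a blockchain while others rely on existing Web infrastructure.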

Agentic AI

Conceptual Overview and Modeling Dimensions

Modelling Dimensions for Multi-Agent Systems

Modelling Dimensions for Engineering Multi-Agent Systems [Demazeau, 1995]

Architectural Considerations

Identification

Relevant Standards and Initiatives

Agent Identification

Tool Identification

Discussion

Profiles

Relevant Standards and Initiatives

Agent Profiles

Tool Profiles

Discussion

Verifiable Credentials

Relevant Standards

Discussion

Discovery

Relevant Standards and Initiatives

Agent Discovery

Tool Discovery

Discussion

Agent-to-Agent Interaction

Relevant Standards and Initiatives

Agents and People

Discussion

Agent-Environment Interaction

Relevant Standards and Initiatives

Tool Use

Discussion

Norms, Policies, and Organizations

Relevant Standards and Initiatives

Discussion

Security and Privacy

Relevant Standards

Authentication and Authorization

Discussion

Conclusions: A Strategy for Agents on the Web

Acknowledgements