We proceed from a relatively weak notion of agency in cooperating software system components to describe the data collection activities of the Lycos Internet information retrieval services. The characteristics of agency of interest include autonomy, social ability, reactivity, and proactivity.
The original and simple Lycos spiders, once implemented in Perl, have evolved into a true multiagent system of cooperating components that can visit and analyze more than 10,000,000 Web pages each day. There are three kinds of cooperating components:
Spiders are independent software agents that crawl the Web to gather information. In the Lycos data collection system, the multiple independent spider processes are also multithreaded. Each spider's individual threads communicate directly with another component called the Update Server (US). They get their marching orders (the set of URLs they are to visit) from the US, and they pass back any discovered hyperlinks, which the US can then parcel out to be visited in turn.
The Update Server (US) manages which servers and pages are to be visited by the spiders. Its job is to give each spider a list of URLs to visit and to receive from each spider the data it collects about links it discovers as it travels the Web. By controlling the rate at which spiders work and the environs in which they operate, it provides command and control for the spiders. One Update Server manages the work of numerous spiders.
The Catalog Update Server (CUS) receives data from the spiders, prepares it for indexing, and stores it in a repository.
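The division of labor among the three components can be illustrated with a small sketch. This is not the Lycos implementation: the class and method names, the `batch_size` throttle, the in-memory toy web, and the single-threaded loop are all assumptions made for illustration. It shows only the flow described above, in which the US hands out URLs, a spider visits them, discovered links go back to the US, and page data goes to the CUS.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class UpdateServer:
    """Hypothetical US: hands out URLs and absorbs links reported by spiders."""
    frontier: deque = field(default_factory=deque)
    seen: set = field(default_factory=set)
    batch_size: int = 2  # assumed knob standing in for rate control

    def add_links(self, urls):
        # Queue only URLs that have never been queued before.
        for url in urls:
            if url not in self.seen:
                self.seen.add(url)
                self.frontier.append(url)

    def next_batch(self):
        # Give a spider at most batch_size URLs as its "marching orders".
        batch = []
        while self.frontier and len(batch) < self.batch_size:
            batch.append(self.frontier.popleft())
        return batch


@dataclass
class CatalogUpdateServer:
    """Hypothetical CUS: prepares page data and stores it in a repository."""
    repository: dict = field(default_factory=dict)

    def receive(self, url, text):
        # "Preparing for indexing" is reduced here to lowercased word splitting.
        self.repository[url] = text.lower().split()


class Spider:
    def __init__(self, fetch, us, cus):
        self.fetch = fetch  # callable url -> (text, links); stands in for HTTP
        self.us = us
        self.cus = cus

    def run_once(self):
        # Ask the US for work, visit each URL, and report back to both servers.
        batch = self.us.next_batch()
        for url in batch:
            text, links = self.fetch(url)
            self.cus.receive(url, text)
            self.us.add_links(links)
        return len(batch)


# A three-page toy web used instead of real network access.
TOY_WEB = {
    "a": ("Page A", ["b", "c"]),
    "b": ("Page B", ["a", "c"]),
    "c": ("Page C", []),
}


def crawl(seed, web=TOY_WEB):
    us = UpdateServer()
    cus = CatalogUpdateServer()
    us.add_links([seed])
    spider = Spider(lambda url: web[url], us, cus)
    while spider.run_once():  # stop when the US has no more URLs to hand out
        pass
    return cus.repository
```

Calling `crawl("a")` visits all three toy pages exactly once. Because the spider never decides what to visit next, policy such as throttling or restricting the crawl to certain servers could be changed in the US alone, which is one plausible reading of the "command and control" role described above.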
The three independent components of this system were developed as individual programs and communicate as needed to gather and analyze specific kinds of Web-based hyperlinked documents. The Lycos data collection system is clearly a distributed computing system residing and operating within an internetworked environment. It can also be viewed as a multiagent system because its components act as independent agents interoperating with one another to achieve greater reach.
There are advantages to building the Lycos data collection system as a multiagent system. We could have chosen to build all of the capabilities of the spider-US-CUS complex into the spider. Indeed, the original Lycos spider did perform all of these functions itself. There are several reasons why choosing a multiagent design made sense for Lycos.
The Lycos data collection system is truly a good example of multiagent systems in large-scale information environments.
©1999 ACM 0002-0782/99/0300 $5.00
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.