The Research archive provides access to all Research articles published in past issues of Communications of the ACM.
The task of estimating the number of distinct values (DVs) in a large dataset arises in a wide variety of settings in computer science and elsewhere. We provide DV estimation techniques for the case in which the dataset of interest…
Many data generation processes can be modeled as data streams. While this data may be archived and indexed within a data warehouse, it is also important to process the data "as it happens," to provide up-to-the-minute analysis…
The database and systems communities have made great progress in developing database systems that allow us to store and query huge amounts of data. Real-time analysis is becoming mandatory. Here is where data stream processing…
Relational systems have made it possible to query large collections of data in a declarative style through languages such as SQL. There is a key component that is needed to support this declarative style of programming and that…