Communications of the ACM

Digital government

Bistro: A Scalable and Secure Data Transfer Service for Digital Government Applications


Communications of the ACM
Volume 46, Number 1 (2003), Pages 50-51
Leana Golubchik, William C. Cheng, Cheng-Fu Chou, Samir Khuller, Hanan Samet, Y.C. Justin Wan

Government at all levels is a major collector and provider of data.

Our project focuses on the collection of data over wide-area networks (WANs) and addresses the scalability issues that arise in the context of Internet-based massive data collection applications. Furthermore, security, due to the need for privacy and integrity of the data, is a central issue for data collection applications that use a public infrastructure such as the Internet. Numerous digital government applications require data collection over WANs [5].

One compelling example of such an application is the Internal Revenue Service's electronic submission of income tax forms. Other digital government applications include collecting census data, federal statistics, and surveys; gathering and tallying electronic votes; collecting crime data for the U.S. Justice Department; collecting data from sensors for disaster response applications; collecting data from geological surveys; collecting electronic filings of patents, permits, and securities (for the SEC); and accepting submissions of grant proposals and contract bids. All these applications have scalability and security needs in common.

The poor performance that current digital government users may experience with existing technology (as in Figure 1a) is largely due to how independent data transfers using TCP/IP work over the Internet. TCP/IP is good at sharing bandwidth equally among data streams, which in large-scale applications can lead to poor performance for individual clients, as each receives only a very small share of this bandwidth. Given that TCP/IP is here to stay for the foreseeable future, what is needed is a scalable yet cost-effective solution that can be easily deployed over existing Internet technology.
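To make the bandwidth-sharing argument concrete, the following is a minimal sketch of the arithmetic under an idealized assumption: every stream crossing a single bottleneck link receives exactly an equal share. Real TCP behavior is more complicated, and the function names and the numbers (a 100 Mbit/s bottleneck, a 1 MB form per client) are hypothetical and chosen only for illustration.

```python
def fair_share_bps(bottleneck_bps: float, n_streams: int) -> float:
    """Idealized per-stream rate when n_streams split a bottleneck equally."""
    if n_streams <= 0:
        raise ValueError("n_streams must be positive")
    return bottleneck_bps / n_streams


def upload_seconds(file_bytes: float, bottleneck_bps: float, n_streams: int) -> float:
    """Time for one client to finish if every stream keeps an equal share."""
    return file_bytes * 8 / fair_share_bps(bottleneck_bps, n_streams)


if __name__ == "__main__":
    # Hypothetical numbers: 100 Mbit/s bottleneck, 1 MB form per client.
    for n in (1, 100, 10_000):
        print(f"{n} concurrent clients: {upload_seconds(1_000_000, 100e6, n)} s each")
```

Under these assumptions, a transfer that takes a fraction of a second for a single client stretches to several minutes when ten thousand clients submit at once, even though the link is fully used the whole time.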

We are designing and developing a system called Bistro, which addresses the scalability needs of digital government data collection applications while allowing them to share the same infrastructure and resources efficiently, cost-effectively, and securely [1]. Bistro's basic approach is to introduce intermediate hosts—bistros—which allow replacement of a traditionally "synchronized client push" approach with a "nonsynchronized combination of client-push and server-pull" approach (as depicted in Figure 1b). This in turn allows spreading of the workload on the destination server and the network over time, with subsequent elimination of hot spots as well as significant improvements in performance for both clients and servers. Our ongoing research [2, 4] indicates that orders of magnitude of improvement can be achieved with the Bistro architecture and the corresponding data collection algorithms it affords.
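The difference between the two approaches can be illustrated with a toy fluid model. This is only a sketch under simplifying assumptions, not the scheduling algorithms of [2, 4]: all clients start at the same instant, each host shares its inbound capacity equally among its current uploads, and clients are spread evenly over the intermediate hosts. The `Scenario` fields and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    n_clients: int      # clients that all start uploading at the same time
    file_bytes: float   # bytes submitted by each client
    server_bps: float   # destination server's inbound capacity (bits/s)
    n_bistros: int      # number of intermediate hosts (hypothetical)
    bistro_bps: float   # inbound capacity of each intermediate host (bits/s)


def direct_push_client_time(s: Scenario) -> float:
    """Every client pushes to the destination at once and gets an equal share."""
    per_client_bps = s.server_bps / s.n_clients
    return s.file_bytes * 8 / per_client_bps


def staged_client_time(s: Scenario) -> float:
    """Clients are spread evenly over bistros; the busiest bistro sets the worst case."""
    busiest = -(-s.n_clients // s.n_bistros)  # ceiling division
    per_client_bps = s.bistro_bps / busiest
    return s.file_bytes * 8 / per_client_bps


def server_pull_time(s: Scenario) -> float:
    """Time for the destination to pull all data later at its full capacity."""
    return s.n_clients * s.file_bytes * 8 / s.server_bps


if __name__ == "__main__":
    s = Scenario(n_clients=10_000, file_bytes=1e6, server_bps=100e6,
                 n_bistros=50, bistro_bps=100e6)
    print("client wait, direct push:", direct_push_client_time(s), "s")
    print("client wait, via bistros:", staged_client_time(s), "s")
    print("server pull, scheduled later:", server_pull_time(s), "s")
```

In this toy model the total volume the destination must receive is unchanged. What changes is that each client finishes sooner, and the destination can pull the data from the intermediate hosts on its own schedule rather than absorbing every upload at once.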

Bistro's design allows for a gradual deployment and experimentation over the Internet (by simply downloading Bistro server software and installing it on public servers). Bistro's security protocol and trust structure [3] are designed such that only encrypted data travels through (not necessarily trusted) bistros. This means a government agency does not need to trust bistros installed by other agencies or commercial institutions. At the same time, these (untrusted) bistros can significantly improve the agency's data collection performance. Each application (within each agency) can have its own scalability, security, fault tolerance, and other data collection needs, and these applications and agencies can still share available resources, if so desired, across all Bistro servers.
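The idea that an agency need not trust an intermediate host can be sketched with an end-to-end integrity check. The example below is not Bistro's protocol from [3]; it is a stand-in that uses an HMAC from the Python standard library, with a key assumed to be shared only by the client and the destination. The payload handed to the relay is assumed to be ciphertext already produced by a vetted encryption library, which is not shown. All function names are hypothetical.

```python
import hashlib
import hmac


def client_prepare(payload: bytes, key: bytes) -> tuple[bytes, bytes]:
    """Return (payload, tag); the tag lets the destination detect tampering."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload, tag


def untrusted_relay(payload: bytes, tamper: bool = False) -> bytes:
    """Stand-in for an intermediate host: forwards bytes and may misbehave."""
    return payload + b"!" if tamper else payload


def destination_accepts(payload: bytes, tag: bytes, key: bytes) -> bool:
    """Recompute the tag over what arrived and compare in constant time."""
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"\x01" * 32  # hypothetical key known only to client and destination
    payload, tag = client_prepare(b"ciphertext-bytes", key)
    print(destination_accepts(untrusted_relay(payload), tag, key))
    print(destination_accepts(untrusted_relay(payload, tamper=True), tag, key))
```

Because the relay never holds the key, it cannot produce a valid tag for altered data, so the destination can accept data from hosts it does not control and still detect modification.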

We believe an appropriately designed single infrastructure such as Bistro can address all digital government wide-area data collection needs in a scalable, secure, and cost-effective manner. (For more information, see bourbon.usc.edu/iml/bistro/.)

References

1. Bhattacharjee, S., Cheng, W.C., Chou, C-F., Golubchik, L., and Khuller, S. Bistro: A platform for building scalable wide-area upload applications. ACM SIGMETRICS Performance Evaluation Review 28, 2 (Sept. 2000), 29–35. (Also presented at the Workshop on Performance and Architecture of Web Servers, June 2000.)

2. Cheng, W.C., Chou, C-F., and Golubchik, L. Performance of online batch-based digital signatures. Submitted for publication.

3. Cheng, W.C., Chou, C-F., Golubchik, L., and Khuller, S. A secure and scalable wide-area upload service. In Proceedings of the 2nd International Conference on Internet Computing 2 (June 2001), 733–739.

4. Cheng, W.C., Chou, C-F., Golubchik, L., Khuller, S., and Wan, Y.C. On a graph-theoretic approach to scheduling large-scale data transfers. Submitted for publication.

5. Cheng, W.C., Chou, C-F., Golubchik, L., Khuller, S., and Samet, H. Scalable data collection for Internet-based digital government applications. In Proceedings of the 1st National Conference on Digital Government Research (Los Angeles, CA, May 2001), 108–113.

Authors

Leana Golubchik ([email protected]) is an associate professor in the computer science department at the University of Southern California, Los Angeles. (This work was partly done while she was at the University of Maryland.)

William C. Cheng ([email protected]) is president of TeleGIF in Marina del Rey, CA. (This work was partly done while he was at the University of Maryland.)

Cheng-Fu Chou ([email protected]) is an assistant professor of computer science and information engineering at National Taiwan University, Taipei, Taiwan. (Some of this work was done while he was at the University of Maryland, College Park, MD.)

Samir Khuller ([email protected]) is an associate professor in the Department of Computer Science and UMIACS at the University of Maryland, College Park, MD.

Hanan Samet ([email protected]) is a professor of computer science, Center for Automation Research, Institute for Advanced Studies at the University of Maryland, College Park, MD.

Y.C. Justin Wan ([email protected]) is a Ph.D. student in the Department of Computer Science and UMIACS at the University of Maryland, College Park, MD.

Footnotes

This work is supported in part by the NSF Digital Government Grant #0091474.

Figures

Figure 1. Data collection for digital government applications.


©2003 ACM  0002-0782/03/0100  $5.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2003 ACM, Inc.


 
