The reviewing process for most computer science conferences originated in the pre-Internet era. In this process, authors submit papers that are anonymously reviewed by program committee (PC) members and their delegates. Reviews are typically single-blind: reviewers know the identity of the authors of a paper, but not vice versa. At the end of the review process, authors are informed of paper acceptance or rejection and are also given reviewer feedback and (usually) scores. Authors of accepted papers use the reviews to improve the paper for the final copy, and authors of rejected papers use them to revise the paper and resubmit it elsewhere, or to withdraw it altogether.
Some conferences within the broader computer science community modify this process in one of three ways. With double-blind reviewing, reviewers do not know (or, at least, pretend not to know) the authors. With shepherding, a PC member ensures that authors of accepted papers with minor flaws make the revisions required by the PC. And, with rollover, papers that could not be accepted in one conference are automatically resubmitted to another, related conference.
Surprisingly, the advent of the Internet has scarcely changed this process. Everything proceeds as before, except that papers and reviews are submitted online or by email, and the paper discussion and selection process is conducted, in whole or in part, online. A naive observer, seeing the essential structure of the reviewing process preserved with such verisimilitude, may come to the conclusion that the process has achieved perfection, and that is why the Internet has had so little impact on it. Such an observer would be, sadly, rather mistaken.
We believe the paper review process suffers from at least five problems:
These problems are interrelated. The increase in the number of papers leads, at least partly, both to a decline in paper quality and a decline in the quality of reviews. It also leads to an ever-increasing variance in paper quality. Similarly, as the acceptance rate of a conference declines, there is a greater incentive for reviewers to write overly negative reviews and favor their friends.
Paper reviewing and publishing can be viewed as a game. There are three players in this game, who are assumed to be rational, in the usual economic sense, and who have the following incentives:
Interestingly, the problems outlined here arise because the existing paper reviewing process does not explicitly address these contradictory incentives. There is no explicit incentive for authors to become reviewers, to limit the number of papers they submit, or to submit only good-quality papers. There is no check on reviewers who write skimpy reviews,[a] are overly negative, or play favorites. No wonder the system barely works!
Our goals, illustrated in the table here, involve designing mechanisms such that it is incentive-compatible to do the right thing. Here, we describe some mechanisms to achieve these goals (keyed to the A1, A2, R1, R2, R3 labeling scheme established in the table). Some of our proposals are steps that have been tried by some brave conference PC chairs. Others are novel and would need experimentation and experience.
Author Incentives. Our first mechanism addresses A1 using peer pressure. It requires the conference to publish not only the list of accepted papers, but also, for each author, the author's acceptance rate for that conference. For example, if an author were to submit two papers and neither were accepted, the conference would report an acceptance rate of 0; if one were accepted, the author would have an acceptance rate of 0.5. Because no author would like to be perceived as having a low acceptance rate, we think this peer pressure will enforce A1.
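The bookkeeping involved is trivial; here is a minimal sketch in Python, where the flat list of (author, accepted) records and all names are hypothetical stand-ins for a conference's submission database.

```python
# A minimal sketch of the per-author acceptance-rate report.
from collections import defaultdict

def acceptance_rates(submissions):
    """submissions: iterable of (author, accepted) pairs, one per paper."""
    submitted = defaultdict(int)
    accepted = defaultdict(int)
    for author, was_accepted in submissions:
        submitted[author] += 1
        accepted[author] += int(was_accepted)
    return {a: accepted[a] / submitted[a] for a in submitted}

# The worked example from the text: two submissions, one accepted -> 0.5.
print(acceptance_rates([("alice", False), ("alice", True)]))  # {'alice': 0.5}
```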
Our second mechanism addresses A2 by raising the prestige of reviewing. For example, conferences can have a best-reviewer award for the reviewer with the best review score[b] or give that reviewer a discount on the registration fee.
A more radical step would be to solve A1 and A2 simultaneously by means of a virtual economy, where tokens are paid for reviews and spent to allow submission of papers.[c] Specifically, assuming each paper requires three reviews on average, reviewers are granted one token per review, independent of the conference, and the authors of a paper together pay three tokens to submit each paper. We recognize that this assumes all conferences expect the same level of reviewing: one could pervert this scheme by an appropriate choice of reviewing venues. We ignore this fact for now, in the interests of simplicity. Continuing with our scheme, authors of accepted papers would be refunded one, two, or all three of their tokens, depending on their review score. Authors of the top papers would therefore incur no cost, whereas authors of rejected papers would have spent all three of their tokens. Clearly, this scheme forces authors to become reviewers, and to be careful in using the tokens thus earned, solving A1 and A2.
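As a thought experiment, these accounting rules fit in a few lines of code. The sketch below uses invented names, treats the refund schedule as a simple lookup, and simplifies to single-author papers (the text has co-authors split the three-token cost).

```python
# A minimal sketch of the proposed token economy.
from collections import defaultdict

REVIEWS_PER_PAPER = 3  # the assumed average number of reviews per paper

class TokenBank:
    """Tokens are earned one per review and spent three per submission."""

    def __init__(self):
        self.balance = defaultdict(int)

    def credit_review(self, reviewer):
        self.balance[reviewer] += 1          # one token per review, any venue

    def submit_paper(self, author):
        if self.balance[author] < REVIEWS_PER_PAPER:
            raise ValueError("not enough tokens: review more papers first")
        self.balance[author] -= REVIEWS_PER_PAPER

    def refund(self, author, review_score):
        # Accepted papers recover 1-3 tokens depending on review score;
        # rejected papers get nothing back.
        self.balance[author] += {"top": 3, "good": 2, "marginal": 1}[review_score]

bank = TokenBank()
for _ in range(3):
    bank.credit_review("alice")   # three reviews fund one submission
bank.submit_paper("alice")
bank.refund("alice", "top")       # a top paper costs Alice nothing net
print(bank.balance["alice"])      # -> 3, the same as before she submitted
```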
We note that we obviously need to make tokens non-forgeable, non-replicable, and perhaps transferable. E-cash systems for achieving these goals are well known;[d] they merely need to be adapted to a non-traditional purpose. We recognize that regulating the economy is not trivial. Over-damping the system would lead to conferences with too few papers, or too few reviewers. Setting the value of tokens too low would only slightly mitigate the current problems, while adding a lot of expensive overhead in the form of these mechanisms. Moreover, it is not clear how this system could be implemented; even if it were, it would not be obvious how it could be bootstrapped, or whether it would have unintended consequences. One possible technique would be to start by publishing signed reviews and rely on technologies such as CiteSeer and Google Scholar, as we describe here in more detail.
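To give a flavor of the adaptation, a blind signature in the style of Chaum[d] would let a conference "mint" certify review tokens without being able to link a token back to the reviewer who earned it. The following is a toy sketch only, with placeholder key sizes and token names, not a hardened design.

```python
# A minimal RSA blind-signature sketch for non-forgeable review tokens.
import hashlib
import secrets
from math import gcd

# Hypothetical mint key pair (toy primes; a real deployment needs
# full-size keys and proper padding).
p, q = 10007, 10009
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))          # private signing exponent

def token_digest(token_id):
    """Hash a token identifier into the RSA message space."""
    return int.from_bytes(hashlib.sha256(token_id.encode()).digest(), "big") % n

m = token_digest("one-review-credit")       # the token to be certified

# The reviewer blinds the token so the mint cannot link it to them later.
r = secrets.randbelow(n - 2) + 2
while gcd(r, n) != 1:
    r = secrets.randbelow(n - 2) + 2
blinded = (m * pow(r, e, n)) % n

blind_sig = pow(blinded, d, n)              # the mint signs without seeing m

sig = (blind_sig * pow(r, -1, n)) % n       # the reviewer unblinds the signature

assert pow(sig, e, n) == m                  # anyone can verify with (n, e)
print("token verifies:", pow(sig, e, n) == m)
```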
Reviewer Incentives. We first discuss dealing with R1 and R3. We propose that authors should rate the reviews they receive for their papers, while preserving reviewer confidentiality. Average (non-anonymized) reviewer scores would then be circulated among the PC. No PC member wants to look bad in front of his or her peers, so peer pressure should enforce R1 and R3 (PC collusion will damage the conference reputation). Note that we expect most authors to rate detailed but unfavorable reviews highly.
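A minimal sketch of the aggregation, with hypothetical names and an assumed 1-5 rating scale: the PC sees only each reviewer's average, never which author gave which rating.

```python
# Collapse per-paper author ratings into one average per reviewer.
from statistics import mean

def pc_report(ratings_by_reviewer):
    """ratings_by_reviewer: reviewer name -> list of author ratings (1-5)."""
    return {reviewer: round(mean(scores), 2)
            for reviewer, scores in ratings_by_reviewer.items() if scores}

print(pc_report({
    "pc_member_1": [4.5, 3.0, 5.0],   # detailed (even if unfavorable) reviews
    "pc_member_2": [1.0, 2.0],        # skimpy or unfair reviews stand out
}))
```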
An even more radical alternative is for reviews to be openly published with the name of the reviewer. The idea is that reviewers who are not willing to publish a review of a paper are perhaps inherently conflicted and therefore should not be reviewing that paper. Of course, there is a danger that public reviews will be too polite, but this will no doubt sort itself out over time. The advantage of using true identities (verinyms) is that it handles R1, R2, and R3. Alternatively, reviews could be signed with pseudonyms, which could persist across conferences. Single-use (nonce) pseudonyms would protect the nervous, but would prevent reviewers from building a reputation. There is a fundamental tension between anonymity and credibility that we cannot hope to resolve completely.
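As an illustration of persistent pseudonyms, a reviewer could hold a long-lived signing key and publish only the public key: the public key is the pseudonym, and it can accumulate a reputation across conferences without exposing the reviewer's identity. The sketch below assumes the third-party Python cryptography package (pip install cryptography); all names and the review text are invented.

```python
# A minimal sketch of persistent pseudonymous review signing.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives import serialization

signing_key = Ed25519PrivateKey.generate()   # kept by the reviewer across venues
pseudonym = signing_key.public_key().public_bytes(
    serialization.Encoding.Raw, serialization.PublicFormat.Raw).hex()

review = b"Accept with revisions: the mechanism is sound, the evaluation thin."
signature = signing_key.sign(review)

# Anyone holding the pseudonym (public key) can check the review is genuine;
# a forged or altered review raises InvalidSignature.
signing_key.public_key().verify(signature, review)
print(f"review verifiably signed by pseudonym {pseudonym[:16]}...")
```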
A deeper examination of the incentive structure suggests that perhaps the real problem is that too much of the work of submitting and selecting papers is hidden. What if the entire process were made open, transparent, and centralized? The goal would be to have a standard way for members of the community to review and rank papers and authors both before and after publication, in a sense adding eBay-style reputations to Google Scholar or arXiv. All papers and reviews would be public and signed, with either pseudonyms or verinyms. This system would, in one fell swoop, achieve many simultaneous goals:
We believe the academic community as a whole desires such a system. However, we also realize that such a system can be subverted. As with e-cash, the hardening of reputation systems to resist collusion and other attacks is well understood, and we merely need to import the appropriate machinery and techniques.
We have identified the underlying incentive structure in the paper publishing process and shown where these incentives lead to poor outcomes. These insights allow us to propose several mechanisms that give incentives to authors, reviewers, and the community to do the "right thing." We accept that there has been much altruism in the past, but in today's resource-scarce world, it may not be fair to rely on this any longer. We recognize our work is preliminary and leaves out many important details but nevertheless hope these ideas will serve as the foundation of a fundamental rethinking of the process. We hope at least some of our proposals will make their way into future conferences, workshops, and publications.
[a] Other than a slight risk of embarrassment at the PC meeting.
[b] See the subsection Reviewer Incentives for details on review scoring.
[c] We have been informed that this scheme was first suggested by Jim Gray, though we cannot find a citation to this work.
[d] For example, David Chaum's seminal work "Blind signatures for untraceable payments," Advances in Cryptology: Proceedings of Crypto '82, Springer-Verlag (1983), 199-203.
An earlier version of this material was published in Proceedings of the Workshop on Organizing Workshops, Conferences, and Symposia for Computer Systems (WOWCS 2008).
DOI: http://doi.acm.org/10.1145/1435417.1435430
©2009 ACM 0001-0782/09/0100 $5.00