
Communications of the ACM

BLOG@CACM

What Is Your Research Culture? Part 3: The Web of Science


Bertrand Meyer

I have procrastinated writing this third and last article in my Research Culture series. (Please read the other two parts first [1] [2].) The reason for procrastination is that one cannot expect pleasant results from talking about the ISI (Thomson Reuters) Web of Science and its undue influence. Anyone competent in the evaluation of computer science research and researchers knows that the role of WoS for computer science lies somewhere between irrelevant and harmful. Reactions to the mention of WoS in professional meetings about research policies tend to be sighs, or, in writing, their emoticon equivalents: we know it's awful, but nothing can be done.

Still, pleasant or not, the topic has to be brought up, even for the umpteenth time, because every day a new university springs up somewhere, or an old university decides to modernize and become part of what it perceives as the modern research world, and management starts screaming: This is taxpayers' and sponsors' money you are spending! We need to assess researchers using objective metrics! From now on we will apply the Web of Science metrics!

The worst part is that when people object, pointing out that study after study has shown WoS as inadequate and misleading for computer science, it is all too easy to dismiss them as sore losers. After all, aren't physicists content enough with WoS? Well, they may be, but we are not. And yes, we are sore losers. A football team assessed according to the rules of hockey would be a sore loser too.

The problem does not arise in any significant way in first-class universities. The current president of Stanford University is a well-known computer scientist (more precisely, an expert in computer architecture). I have no inside information, but I very much doubt that when he sees an application for professor of computer science he rushes to his browser to look up the person's profile on WoS. No top institution I know takes these indicators seriously. The problem is with institutions whose management does not understand the research culture of computer science.

So let us belabor the obvious once more. WoS may be relevant for other disciplines, but for computer science, computer engineering, and other IT disciplines it is largely worthless. The reasons include the under-representation of conferences (a special version of WoS includes conferences, but not the basic indicators that most institutions use), the absence of books, and, in general, the opaque process behind these tools, which deprives them of the community-based scrutiny that would permit continuous improvement.

It is tedious to go over again what has been explored by so many studies. So let us just pick an example or two. The quintessential computer scientist is Donald E. Knuth. Let us search for "Knuth D". (The links given here and below are to the search results I get, so that if I messed up you can provide a correction. This one seems to work whether or not your institution has a WoS subscription.) A few papers come up, but not what makes Knuth an icon in our field: his universally acclaimed multi-volume The Art of Computer Programming book series. ("Dear Dr. Knuth: thank you for your Associate Professor application at Nouveau-Riche University. I regret to inform you that your Web of Science h-index of 34 puts you far behind other candidates, some of whom even have publications in the South Kentucky Journal of Information Technology for Marine Entomology. Yours sincerely, ...") On Google Scholar, The Art of Computer Programming gets over thirty-seven thousand citations, and two of his other books get around 1600 and 1200. Microsoft Academic Search similarly recognizes the value of the books.

Whatever the reasons besides not counting books, many WoS results make little sense. The Thomson Reuters "list of Highly Cited Researchers" includes 117 people under "Computer Science". A quick look at it reveals that only a few of the field's leaders, the ones whose names would come up immediately if you asked a typical computer scientist, appear in the list. I did not spot a single recipient of the discipline's top prize, the Turing Award. (What would you think of a list of top physicists that included no Nobel prize winner?) Now there is every reason to believe that the people listed are excellent scientists; they are just not, in most cases, computer scientists. They often belong to departments having nothing to do with CS, such as the Institute of Enzymology of a country's Academy of Sciences.

One cannot just retort that any such compendium will be by nature arbitrary. Not true. Look at the corresponding list in Microsoft Academic Search. If you are a computer scientist, a few of the names might surprise you, but most will be familiar. I checked around: other computer scientists, like me, know almost every name among the first 100 in the Microsoft list, and fewer than a tenth of those in the Thomson Reuters list. Why WoS classifies them under computer science is somewhat mystifying, but a clue may be that many seem to use computational techniques in other disciplines such as chemistry or biology. WoS is comfortable with these disciplines (if not with computer science) and so captures these authors. Work applying computational techniques to high-energy physics may be brilliant, but it is not necessarily relevant as computer science work.

An accumulation of such examples of inadequacy does not constitute a scientific survey. But the scientific surveys have been made; see the references in the Informatics Europe article [3] and the associated report [4] on the evaluation of computer science researchers. Although the statistics there are from some five years ago (I'd be glad to talk to anyone who has the time to update them), the general picture has not changed. If someone in your institution brings up WoS for evaluating CS people, please point to these articles. Any institution that applies the Web of Science to computer science is behaving improperly. (Scopus, by the way, is just marginally better.)

Highlighting this inadequacy does not mean rejecting bibliometric indicators. The role of such indicators is the subject of considerable debate; but that is a separate discussion. Quantitative indicators, when and if applied, should be meaningful: not hockey rules for football games. Good computer scientists apply rational principles in their work, and expect those who evaluate them to do the same.

People will, of course, say that they need measurements, and ask which ones they can use. Almost anything is better than WoS. "Try shoe size," said a colleague dejectedly, but those who ask the question seriously deserve a serious answer.

First — sorry again for belaboring what to experienced readers will be obvious — you should remind your colleagues that if they count anything about publications it should be citations, not the number of publications. Many institutions, believe it or not, still rely on publication counts, which reward not quality but mere prolificacy. Citation counts too are only an imperfect proxy for quality, but much less imperfect than publication counts.
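To make the distinction concrete, here is a minimal sketch in Python, with entirely invented citation numbers for two hypothetical researchers, showing how the two measures can point in opposite directions:

    # Hypothetical per-paper citation counts; the numbers are invented for illustration.
    citations_a = [310, 120, 85, 40, 12]                # few papers, widely cited
    citations_b = [3, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # many papers, barely cited

    for name, cites in [("A", citations_a), ("B", citations_b)]:
        print(f"Researcher {name}: {len(cites)} publications, {sum(cites)} citations")

    # A publication count ranks B ahead of A; a citation count ranks A far ahead of B.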

For citations, good institutions typically rely on Google Scholar or Microsoft Academic Search. Both have their limits (see note [7] about the latter); they can be gamed a bit, for example through self-citations, but the effect is marginal. Some authors have also succeeded in demonstrating the possibility of outright fraud with Google Scholar, but the techniques are so extreme that it is hard to imagine a serious institution falling for them. (Any organization that blindly relies on automated metrics as the basic evaluation technique, without human filtering, is begging for disaster anyway.) In fact these sources, based on the simple and objective criterion of counting references on the Web, are more robust than the competition. It is no accident that tools like Anne-Wil Harzing's Publish or Perish [5] and Jens Palsberg's "h Index for Computer Science" at UCLA [6] rely on Google Scholar, complemented in the first case by Microsoft Academic Search. An imperfect tool is better, especially when its imperfections are clear and can be factored into the discussion, than a bogus indicator dressed up in the trappings of respectability.
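For readers unfamiliar with the h-index these tools report, here is a minimal sketch of the standard definition (the largest h such that h of an author's papers have at least h citations each); the citation counts are again invented for illustration:

    def h_index(citations):
        # Largest h such that at least h papers have >= h citations each.
        counts = sorted(citations, reverse=True)
        h = 0
        for rank, c in enumerate(counts, start=1):
            if c >= rank:
                h = rank
            else:
                break
        return h

    print(h_index([310, 120, 85, 40, 12]))          # 5
    print(h_index([3, 2, 2, 1, 1, 1, 0, 0, 0, 0]))  # 2

The result depends entirely on which citations the underlying database sees, which is precisely why the choice of database matters.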

In the years since we published the Informatics Europe article and report, we have received countless notes of thanks from academics who successfully used them to dissuade their naïve management from applying inadequate criteria. (Naïve but honest: many administrators, or colleagues from other disciplines, simply assume that what works for other fields works for computer science, and are open to being enlightened, especially if you present them with credible alternative solutions.) As new institutions get into IT and new people take on leadership roles, the battle must, whether we like it or not, be fought again and again.

References

[1] Bertrand Meyer: What is Your Research Culture, Part 1: The Questionnaire, in this blog, 2 November 2014, see here.

[2] Bertrand Meyer: What is Your Research Culture, Part 2: Background, in this blog, 2 November 2014, see here.

[3] Bertrand Meyer, Christine Choppy, Jørgen Staunstrup and Jan van Leeuwen: Research Evaluation for Computer Science, in Communications of the ACM, vol. 52, no. 4, April 2009, pages 31-34, text available here on the ACM site.

[4] Same authors as [3], same title, extended version (full report) on the Informatics Europe site, available here.

[5] Publish or Perish site, see here.

[6] Jens Palsberg, The h Index for Computer Science, see here.

[7] Microsoft Academic Search seems to index about half as many articles as Google Scholar, and a blog article in Nature last year questioned its future.


Comments


Amin Alipour

Once again, Bertrand Meyer has written an insightful article, verbalizing the culture of research in computer science and its differences from other fields. He emphasizes that in computer science the value of articles in top conferences is much higher than that of journal articles. Moreover, he acknowledges that top schools know this fact, and that researchers there need not worry about their research evaluation. The article implicitly suggests that the problem exists in smaller universities, which need to change their procedures; if we exclude the few top schools, there are thousands of them all over the world.

Meyer's article does not elaborate on the ramifications of this "different" research culture for smaller universities and the CS community, which are: (1) smaller schools are reluctant to tolerate this different culture, which causes a lot of trouble in the hiring and promotion of CS faculty members and researchers; (2) top conferences (and consequently the CS community) are losing good researchers.

First, it is impractical to expect that a small-to-medium department has the energy and power to ask for an exception for CS promotions. It is like changing the rules of the game and making exceptions for particular players. Let's look at the social factors working against this change: (1) other departments perceive you as a slacker, lazy and unable to publish in "indexed" journals; (2) the administration is reluctant to accept your justification, because other departments could ask for similar changes; (3) it is a battle on multiple fronts: the administration, other departments and colleges, and often the reluctance of older members of the department; (4) even if the administration is receptive, it would need an objective ranking of conferences (the "South Kentucky Symposium of Information Technology" had better be low in that ranking); (5) it requires much time, energy, and political capital.
Some may say: "Send Meyer's article to the university president and the other departments!" But the immediate question from the chair of forestry would be: "What is the impact factor of CACM anyway?"

This risky investment in pushing for change in individual universities dissuades many from targeting top CS conferences, and they comply with the university's demands, which means journals. It hurts the entire CS community. For instance, look at the affiliations of authors in top conferences, say ICSE or FSE. How geographically diverse are they? Very much North America and Western Europe. Is this the diversity we are looking for?
It is hard to imagine that a smart student can publish multiple papers in top conferences during her studies at a good university, but that once she graduates and joins a smaller department in another country her creativity stops and she only publishes in mediocre journals!
Rationally, that student has two choices: (1) try hard to meet the standards of a top conference, and spend a couple of thousand dollars on visas and travel to attend it, knowing that the paper will not help her with promotion or funding; or (2) publish a couple of mediocre papers without spending anything, while being sure that they will count toward her career promotion. The former choice requires a selflessness and devotion that are rare.

ACM already has the capability to solve these problems, and the solution seems straightforward: publish the papers of a select set of conferences under a journal title, say "The Proceedings of the ACM". Suppose that all the citations ICSE, FSE, PLDI, POPL, OOPSLA, and similar conferences receive were directed to a single journal: what would be the impact factor of that journal? Certainly higher than that of most of the journals ACM already publishes. It would also relieve the current journals from accepting extended versions of conference papers, which in most cases convey few new ideas. Some may argue that the quality of journal reviews is better thanks to flexible time constraints, but we know that even top ACM journals push reviewers for "timely" reviews. Even comparing the h5-index and h5-median of journals and conferences on Google Scholar Metrics shows that our community follows and trusts conferences more than journals.
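(For readers unfamiliar with these metrics, here is a minimal sketch assuming the standard Google Scholar Metrics definitions: h5 is the h-index computed over articles published in the last five complete years, and h5-median is the median citation count of the articles in that h5 core. The per-article citation counts below are invented for a hypothetical venue.)

    from statistics import median

    def h5_metrics(citations_last_5_years):
        # h5: largest h such that h articles from the last five complete years
        # have at least h citations each; h5-median: median citations of those h articles.
        counts = sorted(citations_last_5_years, reverse=True)
        h5 = 0
        for rank, c in enumerate(counts, start=1):
            if c >= rank:
                h5 = rank
            else:
                break
        return h5, (median(counts[:h5]) if h5 > 0 else 0)

    # Invented citation counts for a hypothetical conference's recent articles.
    print(h5_metrics([120, 90, 75, 60, 44, 30, 12, 8, 5, 2]))  # (8, 52.0)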

So, the question is: is it easier to push for change in a couple of thousand schools and funding agencies, or to ask ACM for one additional journal?


