Research assessments as fetish objects

The indicator mania

d'Lëtzebuerger Land, 17.11.2017

Let’s start with this question: what’s the scientific quality of the University of Luxembourg? According to the latest ranking by the Times Higher Education, it currently ranks 179th in the world. This number does not tell us much unless some more context is provided: for instance, that there are over 10 000 universities across the world and that Luxembourg’s university was created in 2003.

In the category “young university”, the university currently ranks 11th. For a small and young university, being present in such an international ranking is undoubtedly an achievement. Yet the Times Higher Education is only one of several actors that produce university rankings. In the two other often-cited university rankings, the University of Luxembourg is less visible: it is absent from the so-called Shanghai ranking, and in the field of computer science it is ranked 351st in the QS ranking.

In 2016, the university’s publications received over 9 000 citations. And the set goal is to publish on average two publications per year per researcher – the current average being 1.62 per year [1]. The answer to the question “what’s the scientific quality of the University of Luxembourg?” inevitably depends on who measures and what is being measured.

Measuring the impact and evaluating the quality of research and innovation are today common in the academic world. In many circles, indicators are used and discussed: various institutions produce rankings and measurements, researchers use them to make judgments and recruit people, governments mobilise them to distribute funding, and committees rely on them to produce expertise… But behind the idea of objective and “clean” indicators and numbers lies a more complex reality. Indicators cannot fully capture innovation, creativity and research impact; they have negative side effects; and they can never be truly objective.

Internationally, indicators have been increasingly used since the 1980s and 1990s. Today, all kinds of things are measured and evaluated: the impact factor of scientific journals, the number of times that individual researchers and their papers are cited, the number of patents, the productivity of teams, departments and universities, etc. Terms such as “audit society” and “academic capitalism” have been developed to analyse this drive towards evaluations and assessments and its effects.

On the one hand, it makes sense to measure and assess scientific and technological activities. If funders want “return on investment”, if budgets are limited, if institutions want “accountability” and to make sure that science has positive effects, it is reasonable to try to measure the quality of scientific actors, from individual researchers to large institutions.

Indicators might increase scientific productivity and efficiency, and assessments can provide a basis for informed decisions on how best to spend research money. As a recent report titled The Metric Tide states, the “wider use of quantitative indicators, and the emergence of alternative metrics for societal impact, could support the transition to a more open, accountable and outward-facing research system” [2].

On the other hand, there are a number of problems and limitations with indicators. Counting citations, for instance, is biased towards Anglo-Saxon and established journals; the impact factors of journals can be manipulated; and they are dependent on the disciplines and size of communities that researchers work within. With university rankings, there are inevitable methodological shortcomings. It is difficult to compare and rank universities across countries, because they have emerged out of different cultures, histories, and traditions. Also, rankings are not evenly distributed around the globe: some countries rank more than others and the “indicator mania” is not present everywhere.

Ranking universities also runs the risk of homogenising and standardising them. The diversity and richness of universities can decrease if they all want to compete for the same measurable activities. This makes them less reactive to their local contexts, less attentive to teaching activities and less sensitive to social and organisational concerns. Rankings and evaluations can reinforce a questionable hypothesis: that international activities are necessarily excellent, while national and local activities are not. Another problem is a linguistic one: words such as “excellence” are used by so many universities these days that they have lost their significance.

Rankings and competition between universities and researchers are often seen as a manifestation of neoliberalism, which introduces competition and market forces into all kinds of activities. But there is an inevitable paradox: research is not only about competition but also about collaboration, and while some scientific activities can lead to marketable goods, a vast part of scientific research is a “public good”.

The tension between market forces and the public nature of research is particularly visible in controversies about access to publications and their cost. The publication of journals and the production of metrics have become a real business, with only a few enterprises such as Elsevier and Springer dominating the market. These enterprises now charge expensive subscription fees for access. The main criticism, raised by researchers and librarians, is that research paid for by public funding is being distributed and sold by a few private enterprises and then read and used again by public researchers. Reactions against Elsevier have been particularly strong, with editorial boards resigning, librarians boycotting, and universities cancelling contracts.

Somewhere in the middle between blindly playing the “indicator game” and rejecting or resisting indicators – and the minimum that researchers and research funders should do – is a balanced and reflexive attitude towards indicators [3]. We can follow the authors of The Metric Tide report, who state: “Metrics should support, not supplant, expert judgment”. Peer review, although it is not perfect, is still an important and significant practice for assessing research. “It is sensible to think in terms of research qualities, rather than striving for a single definition or measure of quality”, The Metric Tide further states.

There are three lessons to draw from the uses and fetishisation of indicators. First, multiplicity: assessments should never be based upon one indicator only, and indicators should never be used on their own, but coupled with other, qualitative evaluations. Research can have impacts on social, cultural, environmental, economic and public policy issues – all of which call for a variety of measurements. Research can have diverse values and a variety of impacts, some easier and some harder to assess.

Second, the contexts and disciplinary diversity of research must be recognised. And third, humility and reflexivity are important. We must recognise that indicators can capture only a portion of reality, that they are based on methods that inevitably have their shortcomings, and that they have positive and negative effects on behaviours, institutions, and diversity. Numerical quantification of research must go hand in hand with expert qualification.

Morgan Meyer is director of research at the CNRS and works at the Centre for the Sociology of Innovation at Mines ParisTech.

[1] University of Luxembourg, Report 2016. Key Performance Indicators, p. 3.

[2] Wilsdon et al. (2015) The Metric Tide: Report of the Independent Review of the Role of Metrics in Research Assessment and Management. HEFCE.

[3] For a good debate on the indicator game, see the 2017 issue of Engaging Science, Technology, and Society (http://estsjournal.org).

© 2017 d’Lëtzebuerger Land