Mary Phillips has worked as an academic in biomedical sciences at Oxford University, UK, as a funder with the Wellcome Trust in London, and as director of research planning for University College London. She is the author of a report on research assessment, published in 2012 by the League of European Research Universities (LERU). She currently works as a senior academic advisor to a US-based consultancy called Academic Analytics. In this exclusive interview with the EuroScientist, she gives her perspective on the limitations of existing evaluation systems, for academic institutions and individual scientists alike, and shares the lessons learned from her various positions related to academia.
1- What are the main limitations of research evaluation at institutional level?
In my experience, it is largely practical issues that limit the validity of evaluation techniques at institutional level. It depends on which staff you include in your list for a particular evaluation. HR lists exist for separate purposes. The needs of people working on research strategy are different, in terms of who should be included in and excluded from such lists.
One example is that there are people working at universities, sometimes very senior, who are recipients of a fellowship but may not be included on the university's permanent faculty lists. Postdoctoral students can also contribute significant work, but may not be counted as part of the institution. Different universities have different inclusion and exclusion criteria. They sometimes give honorary titles. This makes it difficult to compare apples with apples, that is, universities with one another. And once you try to compare with other countries, it is even more complicated.
2- What are the possible ways of remedying these shortcomings?
It is a matter of getting HR departments to be consistent in the way they categorise academic staff and in the description of research grants. As part of my role in research strategy at UCL, I was involved in setting up a kind of meta-database of all research activity, a system called IRIS, where we had some degree of success. It would be helpful if grant information was also held in a consistent manner.
There was also an attempt to establish consistent terminology for UK institutions, organised by Elsevier in consultation with a number of universities, through a system called Snowball Metrics. The EU, too, has been trying to evaluate various universities based on empirical data through U-Multirank. It is similar to STAR METRICS, the joint US initiative to measure the Effect of Research on Innovation, Competitiveness and Science, from the National Science Foundation (NSF), the National Institutes of Health (NIH) and the White House Office of Science and Technology Policy (OSTP). If you are going to undertake such exercises, you need to have all the individual universities and the higher education funding body contributing to make the results meaningful.
3- What are the current limitations in evaluating scientists’ worth with instruments such as peer review and bibliometric indicators?
It ultimately boils down to peer review, be it for the allocation of grants, progress in researchers’ careers or publication. The current peer-review system is the best we have got at the moment.
There is a degree of selection bias, either deliberate or inadvertent, in how you get a peer-review panel together. It is due to the fact that people are competing for the same funding in the same field. This can be corrected somewhat. In my experience as a funder at the Wellcome Trust, we tried to get a good mix of gender, age, and people from a variety of universities.
Open peer review makes the system more transparent. The problem with open peer review is that people are so concerned about litigation that they are less likely to be fully truthful in their evaluation.
Peer review is not totally satisfactory. People can be conservative, especially when money is short. For some areas of interdisciplinary research, it is difficult to judge projects that fall between established themes. To improve peer review, it would be useful to use technology to increase the pool of reviewers and internationalise the peer-review panels.
4- Do you believe that novel measurement methods, so-called altmetrics, offer a suitable alternative?
It is one of the fields growing exponentially. We would, however, need to have some kind of evaluation of the effectiveness of these altmetrics. We would need to evaluate evaluation techniques.
People in a certain demographic, young people, are using those techniques. There are more mature academics who would never dream of using them. Sometimes, people spend too much time using social media tools instead of focusing on their research. We need to test altmetrics first, before we use them as additional evaluation resources.
All in all, you need to find good people, give them funding, and let them run with it. A person-centric approach could be a solution. You need to apply a lot of intelligence in the selection process to validate more objective techniques.
5- How do you see the evaluation of researchers evolving in the future?
The one area that is going to be important—and I don’t know how it will affect evaluation techniques—is the issue of open access. All UK funders now recommend gold standard open access. It is likely to have an effect on citations. How it will affect them is quite difficult to predict; whether it will enhance or decrease the citation rate is unknown.
Interview by Sabine Louët.
Photo credit: selfie by Mary Phillips.