Scientific integrity starts with integrity at the data gathering stage
Scientific integrity has become a major issue in scientific research. Academies of science and national research institutions have published recommendations to raise awareness among scientists. The debate about scientific fraud, plagiarism, and other forms of scientific misconduct has its origin in some highly publicised cases of eminent scientists accused of publishing fake data. It is fuelled by the increasing number of scientific results which cannot be replicated. In addition, anonymous surveys of researchers have revealed an unexpectedly high frequency of misconduct among early career and mid-career scientists.
A common form of such misconduct is deliberately dropping “aberrant” points from a curve. This is the same kind of misconduct as eliminating an entire experiment because its results are too different from those of previous experiments, or discarding cells exhibiting an “aberrant” pattern of labelling from a photograph of immunostained cells.
This practice was acknowledged by 15% of the scientists participating in the surveys mentioned above, when referring to their behaviour in the three years prior to the surveys. This misconduct clearly originates in a misunderstanding of what scientific knowledge is, and how it is progressively constructed.
These issues raise many questions. Is this phenomenon new? Is it the result of recent transformations in the organisation of research? Or of an excessively high degree of competition or of increased links between academic research institutions and pharmaceutical companies? An answer to each of these questions would require an extensive study.
Before Photoshop was invented, Sydney Brenner described a strange phenomenon that he had frequently observed in photographs illustrating published articles: a visible thumbprint. The most intriguing thing was that it was possible to make out the aberrant points concealed by the thumb.
This behaviour is very common among scientists. The British geneticist Ronald Fisher accused Gregor Mendel, in hindsight, of such deception: the probability that the results of Mendel’s crossing experiments would fit the laws that he had discovered so closely was very low. The explanation of this improbability was provided by Mendel himself: he wrote in his article that he had dismissed some of his experiments because the values that he obtained were too far from those predicted by the laws that he had discovered and considered to be true. He had a ready-made explanation: he performed artificial pollination, and the aberrant results were probably due to natural pollination by intruding insects.
The reasons given by Mendel are no different from those any scientist may be tempted to provide when confronted by “aberrant results”. There are so many possible explanations why an experiment may give aberrant results. Particularly in biology where reproducibility is often so difficult to achieve! When replicating an experiment to obtain a publishable curve or histogram, it is difficult not to have the gut feeling that the point that is not on the curve is aberrant!
Misconception of scientific knowledge
What is intriguing is that scientists do not spontaneously consider “dropping points” to be serious misconduct. They feel they have acquired the right to discard aberrant points, because they assume that they now know what the laws and mechanisms of nature are. This is the core of the issue. How do they know that they have discovered the laws and mechanisms operating in nature?
This feeling often originates in the simplicity and beauty of the mechanisms and laws that they have discovered. But why does nature have to be simple and beautiful, and to possess laws? Insidiously, scientists have left the firm ground of scientific research to enter into metaphysics and ontology–a transition that the German philosopher Immanuel Kant condemned more than two centuries ago!
We only know that the laws–or mechanisms–are what they are and that they faithfully account for natural phenomena because they fit experimental data in a reproducible way. Each time experimental results are discarded because of a “gut feeling”, that is, for the wrong reasons, these laws are weakened.
Our only sure understanding of reality is through experiments and the models derived from them. We have no other way of checking the truth of our proposed models, laws, and mechanisms.
Moreover, history tells us that the elimination of aberrant points can be an obstacle to discovery. One example in biology was the occurrence of sigmoid binding curves for ligands described in the 1950s for some proteins and enzymes. They appeared to be “ugly” exceptions to the normal hyperbolic curves found in other ligands. We will never know how many researchers discarded their own observations of these sigmoid curves, delaying the proper characterisation of regulatory enzymes.
Scientists, and scientists alone, are responsible for the construction of an adequate and precise representation of nature, resulting in what is called scientific knowledge. This responsibility is often clearly compromised by an action that might appear to be of little significance: the elimination of points considered to be aberrant.
Michel Morange is Professor of Biology at the University Paris 6 and at the École normale supérieure, France. He is also the director of the Centre Cavaillès for History and Philosophy of Sciences at the École normale supérieure (USR 3608, CNRS). His main interest is the history of the transformations of biology during the 20th century, in particular the rise of molecular biology, outlined in a book called A History of Molecular Biology (1998).
Featured image credit: Josh Calabrese via Unsplash
2 thoughts on “Scientific integrity: dropping points”
I wonder if Professor Morange wrote this tongue-in-cheek.
After all, his own Paris IV (now Sorbonne University) is at the epicentre of what might be the biggest research misconduct travesty of recent history, not just in France…
Ancel Keys famously discarded a large number of data points that didn’t fit his theory that increased fat consumption in a country was related to rates of cardiovascular disease. Arguably, the linkage of saturated fat consumption to heart disease, enshrined in medical advice since the mid-20th century, was based on this example of selectively dropping data.