You go on a trip. All the necessities of life crammed into your backpack: Phone, tablet, laptop, mini satellite phone and a foldable communications satellite. All you need to stay connected. You make your way through the streets with unfamiliar signs and the flocks of incomprehensibly chatting locals, all trying to help you out in a language you don’t understand. You hike over glaciers and through deserts, aided only by the sun – and the map-app on your smartphone. At last, you arrive at your destination, battery almost out. You dive head-first into a sofa, set down you backpack, extract your charger and … You have forgotten the adapter for it. At this moment, your phone makes a beep – and dies.
You have noticed an infrastructure problem. The standards of the electrical systems are invisible – until you cross the border into an area with a different standard. On your way here you crossed deserts and glaciers and enjoyed the freedom of the road. You never noticed the road being there until you stumbled into a hole. Infrastructure at its best is not noticed. Infrastructure is only noticed when it doesn’t work.
Research also requires infrastructure. A structure for traceability, accuracy, accountability and acknowledgement. But how easy is it really to exactly replicate a scientific experiment by just reading the published result? A scientific method is like a recipe, with a list of ingredients such as reagents, formulas, software, images, samples or data. Once we know what components to use, we can start describing how they are put together. But even in the first instance, the list of ingredients, the scientist often fails. Software used is not declared with enough detail, chemical substances and samples are declared only by common name, references points to URls that are removed or moved. Data is referenced to a larger set than was actually used in the experiment, simply because a proper way to cite a subset is lacking. Millions are being wasted each year on scientific experimentation which is not repeatable, simply because the infrastructure of how to make it repeatable is not there.
Internet should be the given infrastructure to share research. It was built as a collaborative project for sharing text and documentation. For publishing, it undeniably works. Lately it has also been extended to images and video, however somewhat unstructured. The true power of sharing is when we can share more than texts and images. If we can share references to components of data used in research with sufficient accuracy, then research may actually be repeatable.
The playing field for a researcher today is much different from when I began my meandering path into science. Back then computers were the new cool thing because you could use them to play solitaire on school breaks. Internet was something strange and unknown. Searching for something in the early days would turn up about a dozen results. And among them would always be what you were looking for. You could easily browse every page on a particular subject in a day. Today you may get thousands of results, none of them being what you are looking for. Today internet opens a door to a world of knowledge. At least for anyone who know how to filter their searches.
Imagine what could happen if we could use the web as the gigantic ocean of knowledge that it can be: If we all could reach astronomical, meteorological, chemical, medical and solid states data with a tap on a keyboard. If text could turn into data entities that can be understood by a computer.
This is what is sometimes referred to as the semantic web – and it is not a dream. It is already here. Identifiers, metadata, and linkages are used to make the internet searchable by computers. The power plugs are in place. But it requires of us humans that we can agree on what a power plug for online data and online knowledge should look like. It’s just that sometimes, some of us uses a different power plug, and here and there someone doesn’t use plugs at all, but a knot. Different organisations promote and prefer different identifiers, different metadata standards and different formats by which to expose and display their data. It is possible for all of those power plugs to exist, if we can get over the phase were we wonder why everybody else just can’t start using OUR power plug. Like we do when we use an adapter, it is possible to move from one system standard to another, if we decide to collaborate to figure out the adapters. Internet is a collaboration. Let us collaborate to make it better.