What do Twitter tweets, Higgs bosons and DNA have in common? They are all subject to big data analysis. In 2001, business analyst Doug Laney, now at Gartner Research in Chicago, USA, described the ongoing growth of data as having three dimensions: volume, velocity and variety. The term Big Data itself entered business jargon to describe petabytes of data, generated by computer records, whose sheer volume makes them impractical to analyse with commonly used software tools.
The capacity to analyse and manage giant amounts of information has traditionally been an integral part of many fields of science, such as particle physics or climate science, and has also long been used in the world of finance and economics. Today, thanks to supercomputers and highly specialised analytical tools, big data has become common in almost all scientific domains, humanities included. This silent revolution, which initially went unnoticed by the general public, has changed scientific practice in the 21st century.
European scientists are pioneers in using big data tools, contributing to scientific discoveries on an unprecedented scale. The most striking example is the renowned ATLAS experiment at the Large Hadron Collider at the particle physics laboratory CERN, in Geneva, Switzerland, which led to the discovery of the famous Higgs boson. “Big data are essential for particle physics,” the director of the ATLAS project, Dario Barberi, told EuroScientist. Although this field of science has always depended on statistical data analysis, it was the use of big data tools that opened new perspectives in particle physics.
No less impressive is the contribution of big data to European discoveries in the life sciences. For example, on 23 January 2013, the European Bioinformatics Institute (EBI) in Cambridge, UK, announced that its researchers had succeeded in encoding data directly into DNA. According to molecular biologist Nick Goldman from the EBI, this technology could allow the storage of huge amounts of data which “will last in the right conditions for 10 000 years, or possibly longer.” This is because DNA is a much more reliable way to store data than any existing storage method.
However, it is in the social sciences that the use of big data seems to have brought a true revolution. For example, an EU-funded project called Data without Boundaries aims to create, by 2015, a data infrastructure allowing easy access to official statistics and micro-data from countries of the European Research Area. Its goal is not only to help social scientists but also to provide information to policy and decision makers. However, chartered statistician Harvey Goldstein, professor of social statistics at the University of Bristol, warns against simplistic assumptions about the validity of data found in large governmental statistics databases. The information gathered may easily be misinterpreted by people who do not understand the mechanisms of data gathering and analysis, especially “given the volume of data being produced and its take-up by the media,” he explains.
Who, then, should play the role of knowledge guardian in the age of Big Data? For Harvey Goldstein the answer is clear: experts in statistics, who have “a special responsibility to provide leadership,” he tells EuroScientist. “It may mean confronting data providers with questions and criticisms and involve condemning what is trivial, biased or misleading.” Becoming aware of the limitations and challenges of big data analysis in scientific research is key. It matters even more when the outcome of big data analysis is used for decision making in society. As the world of finance has already experienced, interpretation errors can too easily creep in.
Featured image credit: infocus Technologies via Flickr
- Big Data – a silent revolution? - 27 March, 2013