
Text and data mining freedom at stake in copyright reform

The work of European scientists could be hampered by proposed copyright law

Scientists digging in the haystack of research publications for patterns, trends or other information increasingly rely on text and data mining (TDM) techniques. Without these tools, it is impossible to systematically analyse the rapidly growing volume of scientific knowledge. Yet European scientists cannot apply TDM as widely as they wish. Because of legal restrictions, they fear that Europe may fall behind other countries with respect to TDM. As the European Commission is expected to propose a reform of existing regulations later this year, the debate on TDM and copyright law is gaining momentum. Scientists hope a reform will harmonise copyright regulations across the EU member states in favour of TDM. But traditional scientific publishers continue to support existing regulations.

Last year, two reports commissioned by the European Commission already stressed the need to revise EU copyright law. But the issue is tricky. The EU's so-called InfoSoc Directive, on which the debate focuses, makes no reference to TDM at all. However, scientists applying TDM tools often need to store electronic copies of the publications they analyse, which in turn affects the reproduction right of rights holders. Based on the right of reproduction, scientific publishers claim the right to restrict access to their publications for TDM purposes. This is what upsets many researchers. “If you have access to the paper, you should also be able to text and data mine it,” says Jan Velterop, an advocate for the open access movement, with extensive experience in science publishing.

Indeed, “for scientists, it is important to have a regulation that is simple and easy to handle. Currently, they do not have that,” says Christoph Bruch, senior policy advisor on strategy at the Open Science Coordination Office of the Helmholtz Association, Germany’s largest research organisation, in Potsdam, Germany. Bruch edited a recently published briefing paper by Science Europe, which calls upon European academics to advocate for a “science-friendly” copyright law. Velterop agrees. “The copyright legislation as it is now is mainly protecting publishers. It is not protecting science. That has to be turned around. Publishers are the service for science. I would like to see that reflected in copyright law,” he says.

Details of licenses

Currently, scientists depend on the contractual licenses many publishers offer. These licenses give subscribers the additional right to use the content for TDM. But this arrangement also means that publishers can control the mining activities. “Copyright means control. That is problematic because the consequence of that control in the hands of publishers is almost always a limitation of further use. For science, that is terrible,” Velterop says. Bruch stresses that “EU copyright law should be formulated to assure that content accessed legally can be mined freely without the necessity of a license. Contradicting contractual agreements should be void.”

At the beginning of May, universities, libraries, research organisations and open access advocates signed The Hague Declaration on Knowledge Discovery in the Digital Age. Among other things, the declaration finds it “unacceptable” that licences and contract terms “regulate and restrict how individuals may analyse and use facts, data and ideas,” as in the case of content mining. If nothing changes, scientists fear that TDM expertise will mainly develop in countries with more permissive regulations, such as the USA, the UK, Japan or South Korea. “It would not be for the first time, that the legal framework leads to a relocation of activities,” Bruch says.

In Europe, only the UK has already changed its copyright law. UK researchers now have the right to mine copyright material without further permission, as long as they have the right to read it. The exception applies solely to TDM in the context of non-commercial research. “What Europe should do is look at the British legislation and have something similar across the EU,” says Velterop. In Bruch’s view, the European legislation should go even further and not be limited to certain purposes. From the perspective of the research community, this is important because research projects often include both non-commercial and commercial partners.

Resistance from publishers

But publishers oppose this. “An exception in copyright law is the wrong solution to a non-existing problem,” says Carlo Scollo Lavizzari, a copyright lawyer with the firm Lenz and Caemmerer, based in Basel, Switzerland, and a member of the legal affairs committee at the International Association of Scientific, Technical and Medical Publishers (STM). Lavizzari emphasises that publishers already allow for TDM under license agreements at no additional cost. He also points to tools such as the TDM service offered by CrossRef. This project, which involves many publishers from across the globe, allows scientists to analyse publications across many sources based on a researcher’s subscription rights.


In Lavizzari’s view, the major issue concerning TDM is about getting access “to large volumes of content in a convenient way.” Indeed, he stresses, publishers “are more interested in how to do it, rather than arguing for the next three or four years over an exception in copyright law.” From the publishers’ perspective, another important point is that licenses are necessary to prevent larger quantities of data from being downloaded and then re-sold for entirely different purposes, Lavizzari says.

The issue is also about “jobs that are created through the development of new TDM services. Copyright regulation has a wide impact on who will be able to offer such services,” Bruch says. Currently, it is the rights holders who aim to control the downstream business, he argues. But an exception for TDM in copyright regulations would help to build a level playing field and to develop and offer TDM services, Bruch points out. “If these are attractive, they are going to be used,” he adds. Lavizzari disagrees. “Allocating market power from the publishers to technology companies is economically wrong. They will essentially become the gatekeepers of the mine and they will not pay those who provide the content,” he says.

Meanwhile, advocates on both sides of the fence are watching the EU negotiation procedure carefully. At the beginning of July, the European Parliament is going to vote on the so-called Reda report, an evaluation of the current copyright directive. The initial report called for an update and harmonisation of current legislation, but it has been amended since the draft was released in January 2015. In October, Commissioner Oettinger is expected to present the proposal for a copyright reform. When launching its digital single market strategy early in May 2015, the Commission recognised that copyright regulation with respect to TDM should be harmonised across Europe. In Bruch’s view, this is promising. But it is unclear “what this will look like,” he says.

However, if the Commission decides to add an exception in favour of TDM in current copyright legislation, Bruch concludes, it has to be mandatory and phrased in such a way “that contractual licenses cannot undermine it.”

Should the copyright reform pass, do you think it would represent a change in favour of scientists?


Featured image credit: Dizain via Shutterstock


Constanze Böttcher
