Why Text Mining is often not Legal, but how it could be in the Future
Hi there, I’m Lucie Guibault, Associate Professor at the Institute for Information Law of the University of Amsterdam.
Over the past few years, I have become increasingly aware of text and data mining (TDM) as a research method in all fields of science and the humanities. With growing computational capacity, more born-digital information and the digitisation of collections, the use of TDM in research is on its way to delivering tremendous societal and economic benefits. Think of all the new insights and cost savings that would otherwise not be possible. This means more scientific breakthroughs and a greater understanding of society.
TDM is simply the research method of the future!
At the same time, I have also observed the difficulties researchers encounter when trying to access and use databases for TDM. These barriers sometimes cause researchers not to use TDM at all. This is unacceptable to me. I firmly believe that the public interest in allowing TDM for research purposes outweighs the potentially negative impact on the rights of database owners.
Lawfulness of TDM is uncertain
Assuming that TDM does fall within the scope of copyright protection, the European copyright framework is highly fragmented: not only are the vast majority of copyright exceptions optional, but Member States have also implemented them in many different ways.
That’s why there is a lot of legal uncertainty regarding the lawfulness of TDM for research purposes, especially on a cross-border basis. Moreover, rights owners often make access to and use of their databases subject to license conditions, which tend to be very restrictive with respect to TDM activities. Although open access licensing is certainly one part of the solution, the content licensed under open access conditions is at this stage not sufficient to offset the negative impact of traditional licensing practices on research.
Another area of uncertainty is the application of data protection rules to TDM activities, because TDM technologies are applied not only to publications and other texts but also to ‘raw’ data, which might include personal data. It is not always clear how miners can comply with data protection law when personal data is involved, even when they are determined to do so. Nor is it always clear when researchers may benefit from the ‘lighter’ regime for “scientific research” purposes. A review of the European legal framework would therefore need to address the rules on the processing of personal data in the context of scientific research.
TDM has huge potential!
My feeling is that most people who are involved in or potentially affected by TDM are ready to accept the huge potential expected from the use of this method in research. They cannot deny the societal and economic benefits TDM will bring. Only a small percentage of them are reluctant to permit TDM to take place without prior authorization, because they see TDM either as a potential source of extra income or as a risk to their competitive interests.
The challenge is to convince companies, publishers, research institutions and policy makers that the public interest in allowing TDM for research purposes prevails over individual gains and that the perceived risks are more theoretical than actual.
We need the voice of the research community
The European Union is now working on a copyright reform, which also touches on TDM. The process of producing this reform is highly delicate and prone to capture by stakeholder groups. It is very good that TDM has caught the attention of European Commission officials and members of the European Parliament. But it is essential that the voice of the research community continues to be heard, to remind all interested parties of the tremendous benefits of TDM for society as a whole. It would be a tragedy if the wind turned and the copyright reform failed to improve the situation.
Until next time,