Text & Data Mining


LIBER believes that libraries and their users should be empowered to contribute to an innovative and competitive Europe.

That is why we are actively advocating for a more flexible copyright system that will allow text and data mining to be used to its full potential.


What is text and data mining?

“Text and data mining (TDM) is the process of deriving information from machine-read material. It works by copying large quantities of material, extracting the data, and recombining it to identify patterns.” – UK Government


There are four stages to the TDM process. First, potentially relevant documents are identified (Stage 1). These documents are then turned into a machine-readable format so that structured data can be extracted (Stage 2). The useful information is extracted (Stage 3) and then mined (Stage 4) to discover new knowledge, test hypotheses, and identify new relationships.

Image credit: <a href="http://www.jisc.ac.uk/reports/value-and-benefits-of-text-mining">JISC / Value and Benefits of Text Mining (2012)</a>
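
To make the four stages more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular TDM tool: the toy corpus, the term list and all function names are hypothetical, and counting which terms appear together in a sentence merely stands in for the far richer mining techniques used in practice.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus; a real project would draw on a large body of articles.
CORPUS = {
    "doc1": "<p>Aspirin reduces inflammation. Aspirin may lower cancer risk.</p>",
    "doc2": "<p>Library opening hours for the summer term.</p>",
    "doc3": "<p>Metformin treats diabetes. Metformin is studied for cancer.</p>",
}

# Illustrative vocabulary of terms to look for (an assumption, not a real ontology).
TERMS = {"aspirin", "metformin", "cancer", "diabetes", "inflammation"}


def identify(corpus, keywords):
    """Stage 1: keep documents that mention at least one keyword."""
    return {
        doc_id: raw
        for doc_id, raw in corpus.items()
        if any(keyword in raw.lower() for keyword in keywords)
    }


def to_machine_readable(raw):
    """Stage 2: strip markup and split the text into lower-cased sentences."""
    text = re.sub(r"<[^>]+>", " ", raw)
    return [s.strip().lower() for s in re.split(r"[.!?]", text) if s.strip()]


def extract(sentences, terms):
    """Stage 3: record which known terms appear in each sentence."""
    return [sorted(terms & set(re.findall(r"[a-z]+", s))) for s in sentences]


def mine(term_lists):
    """Stage 4: count pairs of terms that occur in the same sentence."""
    pairs = Counter()
    for found in term_lists:
        pairs.update(combinations(found, 2))
    return pairs


def run_pipeline(corpus, keywords, terms):
    """Run all four stages and return the combined co-occurrence counts."""
    pairs = Counter()
    for raw in identify(corpus, keywords).values():
        pairs.update(mine(extract(to_machine_readable(raw), terms)))
    return pairs


if __name__ == "__main__":
    results = run_pipeline(CORPUS, {"cancer", "diabetes"}, TERMS)
    for pair, count in results.most_common():
        print(pair, count)
```

Even in this toy version, Stage 2 works on a copy of each document's text, which is why copyright rules have a direct bearing on whether TDM can be carried out.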

Why is it important?

TDM can dramatically accelerate the progress of science. It has the potential to facilitate the discovery of cures for diseases such as cancer and Parkinson’s, and it has already been used to discover how existing drugs can be used to treat other conditions. It will also act as a foundation for innovation and new industry.

For libraries, which provide access to a growing amount of scientific content, it means that the researchers we support will be able to fully realise the value of the content we hold. This will, in turn, ensure a more rigorous approach to research, including more thorough reviews of the literature.


How is LIBER helping?

We are collaborating with a number of influential organisations and individuals to ensure that the European Union hears our call for legislative change that will better enable text and data mining.

We clearly expressed our views in our response to the EU’s 2014 copyright review. We participated in, and then withdrew from, the Licences for Europe process when it became clear that there were serious concerns about the scope, composition and transparency of the process.
Our withdrawal was preceded by a letter of concern sent to the European Commission, expressing the need for a more flexible copyright system. This letter was supported by over 60 influential organisations and individuals, representing researchers, science organisations and industry (read the EC’s response).

In London, we organised a workshop, The Perfect Swell, to highlight the future potential of TDM and the challenges related to it. We also produced a Factsheet on TDM, to make it easier for our members to understand the importance of text and data mining and to explain it to their professional and user communities.