The International Workshop on Mining Scientific Publications featured speakers from OpenMinTeD, as well as speakers who presented their text and data mining research results. LIBER, as the lead on the dissemination and stakeholder engagement activities of the OpenMinTeD project, helped organise the workshop. The Open University, as lead organiser, has run the workshop for five years in a row and has helped to make it a true brand name. This was the first year the workshop was organised as part of the OpenMinTeD project. With participants coming from all over the world, the word ‘international’ in the title of the 2016 workshop was particularly appropriate. The workshop started with Petr Knoth from the Open University welcoming everyone to Newark.
The first keynote was given by Yuxiao Dong (University of Notre Dame). Yuxiao Dong talked about his work on the AMiner system, a network and database of author profiles. The researchers working on it have been developing ways to link, bridge, connect and compare profiles, publications and other research entities. One of their main research goals is to figure out how to extract and integrate semantics from different sources.
The second keynote talk was given by Michael J. Kurtz of the Harvard-Smithsonian Center for Astrophysics, and addressed the work around the Smithsonian/NASA Astrophysics Data System (ADS), which is one of the oldest web-based scholarly information systems in the world. Its roots go back almost a quarter of a century. Today it contains metadata on more than 11 million articles, and the full text of 5 million articles, including nearly every refereed article in physics, astrophysics, or geophysics. The talk highlighted the technological and infrastructure gap between some scientific communities. Connecting datasets to research papers is, according to Kurtz, still an issue in the social sciences, whereas ADS has provided this functionality since the mid-1990s. Kurtz also discussed how ADS benefits from applications of text and data mining, including the mining of usage logs; the development and implementation of new bibliometric measures for papers, people, and organisations; semantic tagging and the creation of links to external data sources; machine learning and text classification; recommender systems; real-time network analysis; and various related user interface issues.
Throughout the two days of the workshop, a large number of interesting papers were presented. A long talk was given by Shubhanshu Mishra (University of Illinois), who carried out a data mining study on novelty in biomedical literature. He found that researchers publish fewer new concepts as they age. However, researchers might publish their most novel text at any time in their career.
Another detailed paper was presented by Drahomira Herrmannova (Open University), who presented her analysis of the strengths and limitations of the Microsoft Academic Graph, which contains over 120 million papers. She hopes her research will be valuable to those deciding whether to use the Graph in their (data mining) research. This was followed by Robert Patton (Oak Ridge National Laboratory), who talked about alternative ways of measuring the scientific impact of a research article. He emphasized the need for better metrics that leverage full content analysis of publications.
Invited talks were given by Stelios Piperidis (Athena Research Center) and Peter Mutschke (GESIS), who laid out the challenges and potential of the OpenMinTeD project.
After the presentations, a discussion followed on the barriers to text and data mining in Europe. The participants talked about the difficulty of getting access to datasets and the need for better and clearer copyright law. They were also looking for support within the research community and agreed that research libraries can play a valuable role in guiding researchers who want to do text and data mining.
During the workshop, the participants and OpenMinTeD partners were active on Twitter, resulting in lots of interaction with people not attending the workshop.
For a full list of speakers and their papers, please go to https://wosp.core.ac.uk/jcdl2016/
Unpublished versions of the papers can be found at https://drive.google.com/folderview?id=0Bz6QWs4w8jPUSWpGM1gtbDhvNTg&usp=sharing
Many of the speakers also participated in video interviews about their text and data mining research. These interviews will be published with the launch of the OpenMinTeD platform. So stay tuned to hear more about exciting text and data mining research from all over the world! For more information, go to www.openminted.eu
This blog post was written by Hege van Dijke (LIBER Europe) and Petr Knoth (Open University).