The Introduction of Data Science in Libraries Working Group (Organiser: Data Science Working Group)

The LIBER Data Science Working Group was founded in the spring of 2021. Its main aim is to connect practitioners of the field in research libraries so that they can share experiences and ideas. The group focuses on three main topics:

  1. Landscape analysis: an overview of activities and developments in the field of AI and machine learning (e.g. text analysis, data mining, exploratory data analysis, visualisation, etc.) in the domain of cultural heritage. We are interested in data sources, methods and tools, infrastructure, outputs and their publication, and whether each use case is in the research or production phase. What are the scientific and societal impacts? What about the gap between ideas and daily practice? We set up a Zotero library to collect relevant papers and websites.
  2. Services to external users: identifying existing guidelines and/or developing new ones. How we deal with metadata – for example related to CRIS/FAIR/grants. Modelling and curating (FAIR) data as fuel for AI techniques. Assisting researchers (library as a competence centre).
  3. Skill development within libraries: what are the skills and tools needed for Data Science activities, and which are potentially already available in libraries (for example Python/R)? We concentrate on three aspects: i) resources: grants (and already funded projects as good examples), open datasets, parallel initiatives (AI4LAM workshops, relevant Carpentry lessons), surveys, bibliographic data science activities; ii) Data Science tasks in cultural heritage: DOI verification, ISBN hyphenation, record linkage, metadata extraction, text summarization, scaling up pilots to full coverage of collections; iii) skills: upskilling librarians via training and mentoring, building Data Science teams within the organisation, helping people anticipate costs (compute, storage) and overhead for accessing Data Science services. We are also investigating how to support and improve existing library workflows with the help of Data Science.
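To give a flavour of the small Data Science tasks listed under the third topic, the sketch below shows two identifier checks in Python. It is an illustration only, not a tool of the working group: the function names are hypothetical, the DOI check is purely syntactic (it assumes the common modern shape `10.<4-9 digit prefix>/<suffix>` and does not ask a resolver whether the DOI exists), and the ISBN function validates the ISBN-13 check digit rather than performing hyphenation, which would require the official range tables.

```python
import re

# Assumption: a DOI starts with "10.", a 4-9 digit registrant prefix, "/",
# and a non-empty suffix without whitespace. This checks shape only.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(value: str) -> bool:
    """Return True if the string has the general shape of a DOI."""
    return bool(DOI_PATTERN.match(value.strip()))


def is_valid_isbn13(value: str) -> bool:
    """Validate the ISBN-13 check digit; hyphens and spaces are ignored."""
    cleaned = value.replace("-", "").replace(" ", "")
    if len(cleaned) != 13 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    # Digits are weighted alternately 1 and 3; the sum must be divisible by 10.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(cleaned))
    return total % 10 == 0


if __name__ == "__main__":
    print(looks_like_doi("10.1000/182"))
    print(is_valid_isbn13("978-0-306-40615-7"))
```

Checks like these are cheap to run over a whole catalogue export, which makes them a reasonable first exercise before moving on to heavier tasks such as record linkage or metadata extraction.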