Introduction to Spark for Machine Learning – Workshop Series on Applying and deploying AI in GLAMs

This workshop is part of the series Applying and deploying AI in GLAMs, organised by AI4LAM and co-hosted by LIBER and the BnF. Read more about the workshop series here.

This workshop will introduce participants to Apache Spark and its uses for machine learning. Through a combination of mini-lectures and hands-on training, the trainers will explore:

• the basic concepts behind Spark and cluster computing;
• using the Spark Machine Learning pipeline;
• applications of Spark in a GLAM setting;
• and deployment to an AWS cloud cluster.

Apache Spark is a popular open-source engine for large-scale data processing that includes a machine learning library (MLlib), and it has many relevant applications for the GLAM community. With a local deployment and a modestly sized dataset, it can be used as a teaching and training tool for machine learning. It can also be used as part of a production system or research project with very large datasets, and can be deployed economically to cloud clusters. This workshop will demonstrate Spark’s machine learning capabilities and help participants determine whether it would be a good fit for their projects.



  • Basic coding skills



Audrey Altman is a software developer at the Digital Public Library of America. Michael Della Bitta is the director of technology at the Digital Public Library of America.

Space in the workshop is limited. To secure a place, please register early, before maximum capacity is reached. Please let us know if you are unable to attend so that a participant from the waiting list can take your place.

This workshop is organised by AI4LAM and co-hosted by the BnF as part of the workshop series “Applying and deploying AI in GLAMs”.

Register Here. 




Apr 02 2021


15:00 CET Time Zone
3:00 pm - 5:00 pm