Data visualisation and data labelling for machine learning applications - Workshop with Nesta

Workshop

Tuesday 20 May 2025, 09:30 — 16:30

King's College London, Strand campus, 30 Aldwych, London, WC2B 4BG

Nesta’s data science team will deliver a one-day workshop providing hands-on experience of data visualisation and data science methods. This will take place on 20 May 2025 at King’s College London, before ESCoE’s Conference on Economic Measurement.

The workshop is open to anyone familiar with Python, and will be designed with economic measurement applications in mind.

The workshop will cover data visualisation and interactive dashboards using Streamlit, data labelling using Argilla, and how to use labelled data to train a machine learning model.

The workshop aims to help participants:

  • Increase their knowledge of and confidence in using Python (although prior knowledge of Python is required);
  • Increase their knowledge of and confidence in visualising data in an interactive dashboard;
  • Increase their knowledge of and confidence in labelling training data for machine learning models.

Workshop content

Streamlit is a powerful and user-friendly Python library used for building interactive data applications with minimal code. It enables data visualisation enthusiasts coding in Python to quickly create and deploy data dashboards.

In this session we will cover how to visualise data for decision making and how to communicate insights effectively. We will also cover key functionalities in Streamlit such as graphics interactivity, user selection and theme customisation, enabling participants to build engaging and user-friendly dashboards.
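
To give a flavour of the user selection and interactive graphics mentioned above, here is a minimal sketch of a Streamlit dashboard. It is not taken from the workshop materials: the data values, column names and function names (`SAMPLE_ROWS`, `list_sectors`, `filter_by_sector`, `render_dashboard`) are made up for illustration, and the file is assumed to be saved as `app.py`.

```python
"""Minimal Streamlit dashboard sketch. Launch with: streamlit run app.py"""

# Made-up illustrative figures, not real statistics.
SAMPLE_ROWS = [
    {"sector": "Manufacturing", "year": 2021, "output_index": 98.0},
    {"sector": "Manufacturing", "year": 2022, "output_index": 101.5},
    {"sector": "Manufacturing", "year": 2023, "output_index": 103.2},
    {"sector": "Services", "year": 2021, "output_index": 95.4},
    {"sector": "Services", "year": 2022, "output_index": 100.1},
    {"sector": "Services", "year": 2023, "output_index": 104.8},
]


def list_sectors(rows):
    """Return the distinct sector names in alphabetical order."""
    return sorted({row["sector"] for row in rows})


def filter_by_sector(rows, sector):
    """Keep only the rows belonging to the chosen sector."""
    return [row for row in rows if row["sector"] == sector]


def render_dashboard():
    # Imported here so the helpers above can be reused without Streamlit.
    import pandas as pd
    import streamlit as st

    st.title("Output index by sector (illustrative data)")

    # User selection: Streamlit reruns the script when the choice changes.
    sector = st.selectbox("Sector", list_sectors(SAMPLE_ROWS))

    selected = pd.DataFrame(filter_by_sector(SAMPLE_ROWS, sector))

    # Built-in chart and table elements, redrawn for the selected sector.
    st.line_chart(selected, x="year", y="output_index")
    st.dataframe(selected)


if __name__ == "__main__":
    render_dashboard()
```

Theme customisation is handled separately from the script, in a `[theme]` section of the app's `.streamlit/config.toml` file.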

Suitable data isn’t always available to train a machine learning model, and data scientists often need to label their own. Sometimes this can be done in a rudimentary way, for example by recording binary classifications in a spreadsheet column. However, for more complex labelling tasks, such as selecting the parts of a text that correspond to people’s names, labelling software is essential.

In this session we will use Argilla, an open-source labelling tool, to set up a labelling task. We will spend some time labelling data and using it to create a machine learning model. We will touch on:
  • The practicalities of setting up a labelling task
  • The importance of thinking about the design of a labelling task
  • Dealing with multiple labellers and measuring consensus
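
As a rough illustration of measuring consensus between labellers, the sketch below computes Cohen's kappa for two labellers who have labelled the same items. This is one common agreement measure, chosen here as an assumption; the workshop may use a different one. The function name `cohen_kappa` and the example labels are hypothetical, and the code uses only the Python standard library rather than Argilla itself.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Agreement between two labellers, corrected for chance agreement."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both labellers must label the same non-empty set of items")

    n = len(labels_a)

    # Share of items on which the two labellers chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, given how often each labeller uses each label.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both labellers used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for four text spans.
labeller_1 = ["name", "name", "other", "other"]
labeller_2 = ["name", "other", "other", "other"]
print(cohen_kappa(labeller_1, labeller_2))  # 0.5
```

A value of 1 means perfect agreement and a value near 0 means agreement no better than chance, which can be a prompt to revisit the labelling guidelines.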

More details

The workshop is open to anyone who is familiar with Python (including variable types, if/else statements, functions, for loops, and importing and using packages) and who is interested in learning about data visualisation and data labelling. Participants should bring their own laptop on the day. Set-up instructions will be sent to participants one week before the workshop.

The workshop will take place on 20 May at King’s College London and run from 09:30 to 16:30. Lunch will be provided. The event is just before ESCoE’s conference, but you do not have to be attending the conference to join the workshop.