Mate Timar - From Passive to Active: Exploring the Benefits of Active Learning in Data Science
PyData

Published on Feb 23, 2024

www.pydata.org

Active Learning is a technique in data science that makes efficient use of labelling resources by choosing which examples are worth annotating. In this 90-minute hands-on tutorial, we will provide a step-by-step guide to applying basic Active Learning techniques to a document classification problem.

The tutorial will begin with an introduction to Active Learning, followed by a brief discussion of the cost and time savings it offers. Next, we will implement clustering to select the first batch of training data. Then, we will train a document classification model and analyse fundamental Active Learning concepts such as diversity, isolation, and model uncertainty. We will compare different metrics for selecting the best points for annotation.
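
As a rough illustration of the model-uncertainty part of this step, the sketch below compares three commonly used uncertainty metrics (least confidence, margin, and entropy) on made-up predicted class probabilities. It is not the tutorial's own code: the function names, the `pool_probs` values, and the batch size are assumptions chosen for illustration, and the tutorial may use different metrics.

```python
import math


def least_confidence(probs):
    """1 minus the top class probability; higher means more uncertain."""
    return 1.0 - max(probs)


def margin_score(probs):
    """1 minus the gap between the two most likely classes; higher means more uncertain."""
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])


def entropy_score(probs):
    """Shannon entropy of the predicted distribution; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(pool_probs, score_fn, batch_size):
    """Return the indices of the batch_size most uncertain pool items."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: score_fn(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical class probabilities a classifier might predict for four
# unlabelled documents (three classes). These numbers are invented.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # spread over all classes
    [0.50, 0.49, 0.01],  # two classes nearly tied
    [0.70, 0.20, 0.10],  # moderately confident
]

for name, fn in [
    ("least confidence", least_confidence),
    ("margin", margin_score),
    ("entropy", entropy_score),
]:
    print(name, select_for_annotation(pool_probs, fn, batch_size=2))
```

On this toy pool the three metrics pick different batches, which is one reason comparing them on real data can be worthwhile.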

Finally, we will evaluate the model's performance and compare the results of Active Learning with those of random annotation. Throughout the tutorial, attendees will have the opportunity to work on their own implementations and receive assistance.
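
To make the comparison with random annotation concrete, here is a deliberately tiny sketch. It swaps the document classifier for a one-dimensional threshold classifier so that it stays dependency-free. Everything in it (the synthetic pool, the threshold of 0.37, the budget of 20 labels, and all function names) is an assumption for illustration, not the tutorial's setup.

```python
import random


def make_pool(n, threshold, rng):
    """Synthetic pool: scores in [0, 1), labelled 1 when score >= threshold."""
    xs = sorted(rng.random() for _ in range(n))
    return xs, [int(x >= threshold) for x in xs]


def fit_threshold(labelled):
    """Midpoint between the largest labelled 0 and the smallest labelled 1."""
    zeros = [x for x, y in labelled if y == 0]
    ones = [x for x, y in labelled if y == 1]
    lo = max(zeros) if zeros else 0.0
    hi = min(ones) if ones else 1.0
    return (lo + hi) / 2


def accuracy(xs, ys, t):
    """Share of pool items whose thresholded prediction matches the label."""
    return sum(int(x >= t) == y for x, y in zip(xs, ys)) / len(xs)


def random_annotation(xs, ys, budget, rng):
    """Baseline: label a uniformly random subset of the pool."""
    idx = rng.sample(range(len(xs)), budget)
    return fit_threshold([(xs[i], ys[i]) for i in idx])


def active_annotation(xs, ys, budget):
    """Uncertainty sampling: label the item closest to the current boundary."""
    labelled, unlabelled = [], set(range(len(xs)))
    for _ in range(budget):
        t = fit_threshold(labelled)
        i = min(unlabelled, key=lambda j: abs(xs[j] - t))
        unlabelled.remove(i)
        labelled.append((xs[i], ys[i]))  # simulated annotator reveals the label
    return fit_threshold(labelled)


rng = random.Random(0)
xs, ys = make_pool(1000, threshold=0.37, rng=rng)
budget = 20  # same number of labels for both strategies

random_acc = accuracy(xs, ys, random_annotation(xs, ys, budget, rng))
active_acc = accuracy(xs, ys, active_annotation(xs, ys, budget))
print(f"random: {random_acc:.3f}  active: {active_acc:.3f}")
```

The point of the design is that both strategies spend the same labelling budget, so any gap in accuracy comes from which items were labelled. A real evaluation would use a held-out test set and repeat the experiment over several random seeds.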

By the end of this tutorial, attendees will better understand the principles of Active Learning and how to apply them to their own supervised learning problems, enabling them to make more efficient use of their labelling resources.

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
