This item is available under a Creative Commons License for non-commercial use only
Acquiring labels for large datasets can be a costly and time-consuming process. This has motivated the development of the semi-supervised learning problem domain, which makes use of unlabelled data, in conjunction with a small amount of labelled data, to infer the correct labels of a partially labelled dataset. Active learning is one of the most successful approaches to semi-supervised learning, and has been shown to reduce the cost and time taken to produce a fully labelled dataset. In this paper we present Activist, a free, online, state-of-the-art platform which leverages active learning techniques to improve the efficiency of dataset labelling. Using a simulated crowd-sourced label gathering scenario on a number of datasets, we show that the Activist software can speed up label acquisition and ultimately reduce its cost.
O'Neill, J., Delany, S. J. and MacNamee, B. (2016) Activist: A New Framework for Dataset Labelling. 24th Irish Conference on Artificial Intelligence and Cognitive Science 2016, Dublin, Ireland. doi:10.21427/D7QK8M