A Comparison of Ensemble and Case-Base Maintenance Techniques for Handling Concept Drift in Spam Filtering
This item is available under a Creative Commons License for non-commercial use only
The problem of concept drift has recently received considerable attention in machine learning research. One important practical problem where concept drift needs to be addressed is spam filtering. The literature on concept drift shows that ensembles are among the most promising approaches, and a variety of techniques for ensemble construction has been proposed. In this paper we compare the ensemble approach to an alternative lazy learning approach to concept drift, whereby a single case-based classifier for spam filtering keeps itself up-to-date through a case-base maintenance protocol. We present an evaluation that shows that the case-base maintenance approach is more effective than a selection of ensemble techniques. The evaluation is complicated by the overriding importance of False Positives (FPs) in spam filtering. The ensemble approaches can have very good performance on FPs because it is possible to bias an ensemble more strongly away from FPs than it is to bias the single classifier. However, this comes at considerable cost to the overall accuracy.
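The bias away from FPs mentioned above can be illustrated with a voting threshold. The following is a minimal sketch, not the method evaluated in the paper: it assumes each ensemble member casts a boolean spam vote and that a message is flagged only when the fraction of spam votes exceeds a tunable threshold. The function name `ensemble_is_spam` and the parameter `spam_threshold` are hypothetical.

```python
from typing import Sequence


def ensemble_is_spam(votes: Sequence[bool], spam_threshold: float = 0.5) -> bool:
    """Flag a message as spam only if the fraction of spam votes exceeds the threshold.

    Hypothetical helper for illustration; a higher threshold biases the
    ensemble away from False Positives (legitimate mail flagged as spam).
    """
    if not votes:
        raise ValueError("need at least one vote")
    spam_fraction = sum(votes) / len(votes)
    return spam_fraction > spam_threshold


# Three of five members vote spam (fraction 0.6).
votes = [True, True, True, False, False]
majority = ensemble_is_spam(votes)                      # simple majority vote
cautious = ensemble_is_spam(votes, spam_threshold=0.8)  # biased away from FPs
```

Under these assumptions, raising the threshold means fewer legitimate messages are flagged, but more spam slips through, which is one way the cost to overall accuracy can arise.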
Sarah Jane Delany, Pádraig Cunningham & Alexey Tsymbal (2006) A Comparison of Ensemble and Case-Base Maintenance Techniques for Handling Concept Drift in Spam Filtering, In: G. Sutcliffe and R. Goebel (eds.), Proc. 19th Int. Conf. on Artificial Intelligence FLAIRS'2006, AAAI Press, pp. 340-345.