Abstract
Systems that learn from examples often express the learned concept in the form of a disjunctive description. Disjuncts that correctly classify few training examples are known as small disjuncts and are interesting to machine learning researchers because they have a much higher error rate than large disjuncts. Previous research has investigated this phenomenon by performing ad hoc analyses of a small number of datasets. In this paper we present a quantitative measure for evaluating the effect of small disjuncts on learning and use it to analyze 30 benchmark datasets. We investigate the relationship between small disjuncts and pruning, training set size and noise, and come up with several interesting results.
Rights
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).