Overview
Decision tree types
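Decision trees used in data mining are of two main types:
Classification tree analysis is when the predicted outcome is the class to which the data belongs.
Regression tree analysis is when the predicted outcome can be considered a real number (e.g. the price of a house, or a patient's length of stay in a hospital).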
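The difference between the two types shows up most clearly at the leaves. The following is a minimal sketch, not a full tree learner: it uses a single hand-picked split (a "stump"), and the data, the threshold of 70, and all function names are hypothetical and chosen only for illustration. A classification leaf returns the most common class among its training examples, while a regression leaf returns a real number, here assumed to be the mean of its training targets.

```python
from collections import Counter
from statistics import mean

def split(rows, threshold):
    """Partition (feature, target) pairs by feature <= threshold; return the targets."""
    left = [t for x, t in rows if x <= threshold]
    right = [t for x, t in rows if x > threshold]
    return left, right

def classification_leaf(targets):
    """Classification: predict the most common class in the leaf."""
    return Counter(targets).most_common(1)[0][0]

def regression_leaf(targets):
    """Regression: predict a real number, here the mean of the leaf's targets."""
    return mean(targets)

def stump_predict(rows, threshold, leaf_fn, x):
    """Predict for x with a one-split tree built from rows."""
    left, right = split(rows, threshold)
    return leaf_fn(left if x <= threshold else right)

# Hypothetical toy data: (floor area, target).
house_class = [(40, "cheap"), (55, "cheap"), (90, "expensive"), (120, "expensive")]
house_price = [(40, 100.0), (55, 130.0), (90, 250.0), (120, 310.0)]

label = stump_predict(house_class, 70, classification_leaf, 100)  # a class label
price = stump_predict(house_price, 70, regression_leaf, 100)      # a real number
```

The split logic is identical in both calls; only the leaf function, and therefore the type of the prediction, changes.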
There are many specific decision-tree algorithms. Notable ones include:
ID3 (Iterative Dichotomiser 3)
C4.5 (successor of ID3)
CART (Classification And Regression Tree)
CHAID (CHi-squared Automatic Interaction Detector). Performs multi-level splits when computing classification trees.[11]
MARS (Multivariate Adaptive Regression Splines): extends decision trees to handle numerical data better.
Conditional Inference Trees. Statistics-based approach that uses non-parametric tests as splitting criteria, corrected for multiple testing to avoid overfitting. This approach results in unbiased predictor selection and does not require pruning.[12][13]
ID3 and CART were invented independently at around the same time (between 1970 and 1980)[citation needed], yet they follow a similar approach for learning a decision tree from training tuples.
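The shared approach can be illustrated as a greedy search for the split that most reduces an impurity measure. The sketch below rests on assumptions not stated in the text above: ID3 is usually described with entropy-based information gain and CART (for classification) with Gini impurity, and a single numeric feature with a threshold split is used here for brevity, whereas ID3 as originally described splits on categorical attributes. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy in bits (the measure usually associated with ID3)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity (the measure usually associated with CART)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def impurity_decrease(parent, children, impurity):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = sum(len(ch) / n * impurity(ch) for ch in children)
    return impurity(parent) - weighted

def best_threshold(rows, impurity):
    """Greedily pick the threshold on one numeric feature with the largest decrease."""
    labels = [y for _, y in rows]
    best = None
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted({x for x, _ in rows})[:-1]:
        left = [y for x, y in rows if x <= t]
        right = [y for x, y in rows if x > t]
        gain = impurity_decrease(labels, [left, right], impurity)
        if best is None or gain > best[1]:
            best = (t, gain)
    return best

# Hypothetical training tuples: (feature value, class label).
rows = [(1, "a"), (2, "a"), (3, "b"), (4, "b")]
split_entropy = best_threshold(rows, entropy)
split_gini = best_threshold(rows, gini)
```

On this toy data both criteria select the same threshold, since splitting at 2 separates the classes perfectly. A full learner would apply the same search recursively to each resulting subset.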
ID3 [1] Quinlan, J. R. (1986). Induction of Decision Trees. Machine Learning 1: 81-106. Kluwer Academic Publishers.
C4.5 [2] Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers.
CART [3] Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984). Classification and Regression Trees. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software. ISBN 978-0-412-04841-8.
