Induction of Decision Trees Using RELIEFF



I. Kononenko and E. Simec, University of Ljubljana, Ljubljana, Slovenia

Abstract. In the context of machine learning from examples this paper deals with the problem of estimating the quality of attributes with and without dependencies between them. Greedy search prevents current inductive machine learning algorithms from detecting significant dependencies between the attributes. Recently, Kira and Rendell developed the RELIEF algorithm for estimating the quality of attributes, which is able to detect dependencies between attributes. We show a strong relation between RELIEF's estimates and impurity functions, which are usually used for heuristic guidance of inductive learning algorithms. We propose to use RELIEFF, an extended version of RELIEF, instead of myopic impurity functions. We have reimplemented Assistant, a system for top-down induction of decision trees, using RELIEFF as an estimator of attributes at each selection step. The algorithm is tested on several artificial and several real-world problems. Results show the advantage of the presented approach.
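To make the abstract's central tool concrete, the following is a minimal sketch of a RELIEF-style attribute estimator in the spirit of Kira and Rendell's algorithm (RELIEFF further extends it to multi-class problems, noisy data, and missing values, which this sketch omits). A two-class, one-nearest-hit/miss version; the function name, normalisation details, and sampling scheme are ours, not the paper's:

```python
import numpy as np

def relief(X, y, n_iter=None, rng=None):
    """RELIEF-style attribute quality estimates (two-class, 1 nearest hit/miss).

    W[A] grows when attribute A separates an instance from its nearest
    miss (different class) and shrinks when A differs between the
    instance and its nearest hit (same class)."""
    rng = np.random.default_rng(rng)
    n, m = X.shape
    n_iter = n_iter or n
    # normalise diff() to [0, 1] per attribute
    span = (X.max(axis=0) - X.min(axis=0)).astype(float)
    span[span == 0] = 1.0
    w = np.zeros(m)
    for _ in range(n_iter):
        i = rng.integers(n)
        d = np.abs(X - X[i]) / span        # per-attribute diffs to all instances
        dist = d.sum(axis=1)
        dist[i] = np.inf                   # never select the instance itself
        hit = np.where(y == y[i], dist, np.inf).argmin()
        miss = np.where(y != y[i], dist, np.inf).argmin()
        w += (d[miss] - d[hit]) / n_iter
    return w
```

Because the nearest hit tends to agree, and the nearest miss to differ, on the significant attributes even in parity problems, the estimate is non-myopic, unlike impurity functions computed one attribute at a time.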

PAR2: parity problem with two significant binary attributes and ten random binary attributes; …% of randomly selected …
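The PAR2 description above can be made concrete with a small generator. This is an illustrative sketch only: the function name and the `noise` parameter are ours, and the noise fraction used in the paper is illegible in the source, so it is left as an argument rather than filled in:

```python
import numpy as np

def make_par2(n, n_random=10, noise=0.0, seed=0):
    """Parity data set: class = XOR of two significant binary attributes,
    plus n_random irrelevant binary attributes; `noise` is the fraction
    of instances whose class label is flipped."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n, 2 + n_random))
    y = X[:, 0] ^ X[:, 1]                    # parity of the two significant attributes
    flip = rng.random(n) < noise             # corrupt a fraction of class labels
    return X, np.where(flip, 1 - y, y)
```

On such data, each significant attribute is useless in isolation (both values are equally likely in each class), which is exactly why myopic impurity measures fail here while RELIEF-style estimates do not.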

domain    #val/att.   #instances   maj. class (%)   entropy (bit)
PRIM      2.2         339          25               3.89
BREA      –           288          80               0.73
LYMP      3.3         148          55               1.28
RHEU      9.1         355          66               1.73
HEPA      3.8         155          79               0.74
SOYB      2.9         630          15               3.62
IRIS      6.0         150          33               1.59
MESH3     7.0         278          26               3.02
MESH15    7.1         278          26               3.02
VOTE      2.0         435          61               0.96
CRIME     4.5         723          65               1.34
ELPI      16.6        500          50               1.00

Table 4: Basic characteristics of the medical and non-medical real-world data sets

domain    Assistant-I   Assistant-R   naive Bayes
PRIM      42.2±6.1      40.1±5.8      51.0±3.8
BREA      78.1±5.1      78.5±4.5      79.2±5.2
LYMP      77.2±5.0      75.8±5.3      84.2±2.9
RHEU      67.0±4.7      66.2±4.3      67.2±5.0
HEPA      77.3±6.0      83.0±3.5      86.6±3.5
SOYB      91.1±4.2      89.0±3.2      89.1±2.1
IRIS      95.1±2.2      95.1±2.6      96.8±1.3
MESH3     32.7%         32.0%         34.5%
MESH15    41.0%         42.4%         34.5%
VOTE      96.3±1.6      95.3±1.3      90.3±2.0
CRIME     56.8±1.6      57.4±3.2      61.2±2.8
ELPI      81.4±3.1      81.1±2.2      84.7±3.0

Table 5: Classification accuracy of the learning systems on medical and non-medical real-world data sets


domain    Assistant-I   Assistant-R   naive Bayes
PRIM      1.19±0.13     1.08±0.10     1.64±0.25
BREA      0.06±0.07     0.05±0.07     0.08±0.06
LYMP      0.61±0.08     0.59±0.10     0.77±0.09
RHEU      0.44±0.09     0.41±0.09     0.54±0.07
HEPA      0.15±0.09     0.26±0.10     0.39±0.13
SOYB      3.10±0.10     3.01±0.10     3.26±0.10
IRIS      1.41±0.06     1.43±0.06     1.50±0.04
MESH3     0.57 bit      0.56 bit      0.65 bit
MESH15    0.74 bit      0.70 bit      0.56 bit
VOTE      0.84±0.03     0.83±0.03     0.75±0.04
CRIME     0.09±0.03     0.08±0.04     0.09±0.03
ELPI      0.63±0.05     0.57±0.03     0.70±0.05

Table 6: Average information score of the learning systems on medical and non-medical real-world data sets
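The average information score reported in Table 6 is, to our understanding, the Kononenko-Bratko measure: a prediction earns bits for raising the probability of the true class above its prior and is penalised for lowering it. A hedged sketch under that assumption (function and argument names are ours):

```python
import numpy as np

def avg_information_score(prior, posterior, true_class):
    """Average information score (bits). prior[c] is P(C=c) estimated on
    the training set; posterior[i, c] is the classifier's probability of
    class c for test instance i; true_class[i] is the correct class."""
    scores = []
    for i, c in enumerate(true_class):
        p, q = prior[c], posterior[i, c]
        if q >= p:
            # useful answer: information gained about the true class
            scores.append(np.log2(q) - np.log2(p))
        else:
            # misleading answer: negative score
            scores.append(np.log2(1 - p) - np.log2(1 - q))
    return float(np.mean(scores))
```

Unlike raw accuracy, this score is not inflated by skewed class distributions: always predicting the majority class with its prior probability scores exactly zero bits.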

The basic characteristics of the real-world data sets are given in Table 4. The results of experiments on these data sets are provided in Tables 5 and 6. In medical data sets, attributes are typically independent. Therefore, it is not surprising that the naive Bayesian classifier shows clear
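The paragraph above turns on the naive Bayesian classifier's conditional-independence assumption. A minimal sketch of such a classifier for discrete attributes, with Laplace smoothing; implementation details are ours, not necessarily those used in the experiments:

```python
import numpy as np

def naive_bayes_fit(X, y, n_values):
    """Estimate priors and per-attribute class-conditional probabilities,
    assuming attributes are conditionally independent given the class."""
    classes = np.unique(y)
    prior = np.array([(y == c).mean() for c in classes])
    cond = []  # cond[a][ci, v] = P(attribute a = v | class classes[ci])
    for a in range(X.shape[1]):
        t = np.ones((len(classes), n_values[a]))   # Laplace counts
        for ci, c in enumerate(classes):
            vals, cnt = np.unique(X[y == c, a], return_counts=True)
            t[ci, vals.astype(int)] += cnt
        cond.append(t / t.sum(axis=1, keepdims=True))
    return classes, prior, cond

def naive_bayes_predict(x, classes, prior, cond):
    """Pick the class maximising log P(c) + sum_a log P(x_a | c)."""
    logp = np.log(prior)
    for a, v in enumerate(x):
        logp += np.log([cond[a][ci, int(v)] for ci in range(len(classes))])
    return classes[int(np.argmax(logp))]
```

When attributes really are independent given the class, this factorised model uses all attributes at once with very few parameters, which is consistent with its strong showing on the medical data sets in Tables 5 and 6.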