Library: xclust
See also: cartsplit cartsplitopt leafnum maketr pred prederr prune prunecv pruneseq prunetot ssr kuva

Macro: cartcv
Description: Performs cross-validation for CART. The data are divided in a given number of ways into an estimation set and a test set. For each division, a regression tree is grown on the estimation set and a sequence of subtrees is pruned from this initial tree. For each tree in the sequence, the test set is used to calculate the prediction error.

Usage: cross = cartcv (x, y, type, opt, wv)
Input:
x n x p matrix: data matrix of regression variables
y n x 1 vector: contains the values of the response variable
type p x 1 vector: contains the types of the regression variables; 1 means that the corresponding variable is continuous, 0 that it is categorical
opt list of scalars: determines when the growing of the tree is stopped. Consists of opt.mincut, opt.minsize, opt.mindev. See cartsplit for the description of these parameters.
wv integer >= 2: the number of folds. wv-fold cross-validation is performed, that is, the data are divided in wv ways into an estimation set and a test set. The division is random.
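
The random division described for wv can be illustrated outside XploRe. The following Python sketch shows one common way to form a random wv-fold division. It is only an assumption about the general technique, not the code used by cartcv; the function name split_folds and the seed argument are hypothetical.

```python
import random

def split_folds(n, wv, seed=None):
    """Randomly partition the indices 0..n-1 into wv folds of near-equal size.

    Hypothetical helper for illustration; cartcv's own division may differ.
    """
    if wv < 2 or wv > n:
        raise ValueError("wv must satisfy 2 <= wv <= n")
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    # Every wv-th shuffled index goes to the same fold.
    return [sorted(idx[k::wv]) for k in range(wv)]

n, wv = 10, 3
folds = split_folds(n, wv, seed=0)
for k, test in enumerate(folds):
    # Each fold serves once as the test set; the rest is the estimation set.
    estimation = sorted(set(range(n)) - set(test))
    print("fold", k, "test:", test, "estimation:", estimation)
```

Each observation lands in exactly one test set, so every observation is used once for testing and wv - 1 times for estimation.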
Output:
cross list of vectors: consists of cross.alfa, cross.lnumber, cross.cv and cross.cvstd. Each vector has as many elements as there are trees in the sequence of pruned subtrees of the tree grown with the data x and y. The vector cross.alfa contains the values of the complexity parameter alfa. The vector cross.lnumber contains the numbers of leaves of the trees in the sequence of pruned subtrees. The vector cross.cv contains the estimates of the expected value of the mean of squared residuals. The vector cross.cvstd contains the estimates of the standard deviation of the estimator of the expected value of the mean of squared residuals.
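
To make the roles of cross.cv and cross.cvstd concrete, the Python sketch below combines per-fold test errors for one subtree into a mean and a sample standard deviation. This is an assumed aggregation, not a formula confirmed by this help page; the function name summarize_cv and the per-fold values are hypothetical.

```python
import statistics

def summarize_cv(fold_errors):
    """Return (mean, sample standard deviation) of per-fold test errors.

    Assumed aggregation for illustration; not taken from the cartcv source.
    """
    cv = statistics.mean(fold_errors)
    cvstd = statistics.stdev(fold_errors)  # divides by (number of folds - 1)
    return cv, cvstd

# Hypothetical mean squared residuals on the three test sets for one subtree.
fold_errors = [0.0, 0.0, 100.0]
cv, cvstd = summarize_cv(fold_errors)
print(round(cv, 6), round(cvstd, 6))  # prints 33.333333 57.735027
```

These hypothetical values happen to match the first elements of cross.cv and cross.cvstd in the example result on this page, which is consistent with such an aggregation but does not prove that cartcv computes it this way.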

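Note: Because the division into estimation and test sets is random, the output of the example below may differ between runs.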

Example:

;load the library xclust
library ("xclust")
;let us make some deterministic data
x1=#(0,0,0,0,1,1,1,1,1,2)
x2=#(0,0,0,0,0,0,0,1,1,1)
x=x1~x2
y=#(0,0,0,0,100,100,100,120,120,120)
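;set the stopping rule of the tree growing (see cartsplit)
opt=cartsplitopt("minsize",1,"mindev",0,"mincut",1)
;3-fold cross-validation; x1 is declared categorical (0), x2 continuous (1)
cross=cartcv(x,y,#(0,1),opt,3)
;display the result
cross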
Result:

Contents of cross.alfa
[1,] 0.000000 
[2,] 60.000000 
[3,] 2904.000000 
Contents of cross.lnumber
[1,] 3.000000 
[2,] 2.000000 
[3,] 1.000000 
Contents of cross.cv
[1,] 33.333333 
[2,] 71.555556 
[3,] 1582.086168 
Contents of cross.cvstd
[1,] 57.735027 
[2,] 26.342474 
[3,] 2595.969560  


Author: Jussi Klemelä, 980323
(C) MD*TECH Method and Data Technologies, 17.8.2000