Dear R Geniuses:
I'm a C++ and Perl consultant, not an R consultant, but a client wants me
to see if R can help him predict whether daily sales for some auto
parts stores will be less than, greater than, or equal to the median daily
sales ("equal to" is defined as within 2%; otherwise there would never be
that category). He has 27 variables to predict the 3 classes, everything
from the month and the weather to the number of clerks on duty, and so on.
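
A minimal sketch of how the three classes could be derived, assuming "within 2%" means within 2% of the median itself and that the median is positive. The function name label_sales, the tol argument, and the sample figures are hypothetical and only illustrate the rule described above.

```r
# Hypothetical helper: label each day's sales relative to the median,
# treating anything within +/- tol (2%) of the median as "equal".
# Assumes the median is positive.
label_sales <- function(sales, tol = 0.02) {
  med <- median(sales)
  rel <- (sales - med) / med            # relative deviation from the median
  cls <- ifelse(abs(rel) <= tol, "equal",
                ifelse(rel < 0, "less", "greater"))
  factor(cls, levels = c("less", "equal", "greater"))
}

sales <- c(900, 990, 1000, 1015, 1200)  # made-up daily sales figures
labels <- label_sales(sales)
print(labels)
```

Returning a factor with fixed levels keeps all three categories present even if one of them happens to be empty in a given sample.
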
I'm using this function call:
    P = knn(train[ , vars], test[ , vars], cl, k = 1, l = 0, prob = TRUE)
train and test are data frames of 1200 and 200 vectors (rows),
respectively. The cl values are present with "test" (at this point as
variable 28).
vars = c( 5, 11, 23), for example. If I use more than 3 variables I
get severe over-fitting.
The problem is with the printing: I want to print the results in a
table that shows, for the test data:
    cl    prob    P
    actual values for vector 1
    ...
    actual values for vector 200
(cl is the actual class from test; P is the returned value from knn.)
I'm using R from a terminal command line, not a GUI. I've tried
numerous ways of generating the table, and none work.
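
A minimal sketch of one way such a table might be built from the terminal, under several assumptions: knn comes from the class package, the data below are synthetic stand-ins (the make_data helper is hypothetical), and train also carries its class in column 28. Note that knn expects its cl argument to hold the classes of the training rows; the actual classes of the test rows are only needed for the comparison table. The vote proportion is stored as the "prob" attribute of the returned factor.

```r
library(class)   # provides knn(); a recommended package shipped with R

set.seed(1)
# Synthetic stand-in data: 27 predictors plus the class in column 28.
make_data <- function(n) {
  d <- as.data.frame(matrix(rnorm(n * 27), nrow = n))
  d$cl <- factor(sample(c("less", "equal", "greater"), n, replace = TRUE),
                 levels = c("less", "equal", "greater"))
  d
}
train <- make_data(1200)
test  <- make_data(200)
vars  <- c(5, 11, 23)

# cl must be the classes of the *training* rows (assumed here to be column 28).
P <- knn(train[, vars], test[, vars], cl = train[, 28],
         k = 1, l = 0, prob = TRUE)

# One row per test vector: actual class, vote proportion, predicted class.
result <- data.frame(cl = test[, 28], prob = attr(P, "prob"), P = P)
print(result)
```

With k = 1 the prob column will typically be 1, since it is the proportion of votes for the winning class; a larger k makes it more informative. A cross-tabulation such as table(result$cl, result$P) gives a compact summary of hits and misses.
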