Below are some examples demonstrating unsupervised learning with NNS clustering and nonlinear regression using the resulting clusters. As always, for a more thorough description and definition, please view the References.
NNS.part
NNS.part is both a partitional and hierarchical clustering method. NNS iteratively partitions the joint distribution into partial moment quadrants, and then assigns a quadrant identification (1:4) at each partition.
NNS.part returns a data.table of observations along with their final quadrant identification. It also returns the regression points, which are the quadrant means used in NNS.reg.
x = seq(-5, 5, .05); y = x ^ 3
for(i in 1 : 4){NNS.part(x, y, order = i, min.obs.stop = FALSE, Voronoi = TRUE, obs.req = 0)}
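To make the quadrant idea concrete, here is a minimal base R sketch of a single partitioning step: it splits at the means of x and y, labels each observation 1 to 4, and takes the quadrant means as candidate regression points. The function name quadrant_id, the toy data, and the numbering convention are assumptions made for illustration and are not taken from the NNS source; NNS.part itself repeats the split within each quadrant and applies stopping rules that this sketch omits.

```r
# Illustrative sketch only: one mean-split step, not the NNS.part implementation.
quadrant_id <- function(x, y) {
  lo_x <- x <= mean(x)   # TRUE when at or left of the x mean
  lo_y <- y <= mean(y)   # TRUE when at or below the y mean
  1 + lo_x + 2 * lo_y    # assumed labelling: 1 = high/high, ..., 4 = low/low
}

# Hypothetical toy data
toy_x <- c(1, 2, 3, 4, 5, 6)
toy_y <- c(2, 1, 4, 3, 6, 5)
q <- quadrant_id(toy_x, toy_y)

# Quadrant means play the role of regression points
reg_points <- data.frame(
  quadrant = sort(unique(q)),
  x = as.numeric(tapply(toy_x, q, mean)),
  y = as.numeric(tapply(toy_y, q, mean))
)
print(reg_points)
```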
NNS.part offers a partitioning based on \(x\) values only, NNS.part(x, y, type = "XONLY", ...), using the entire bandwidth in its regression point derivation, and shares the same limit condition as partitioning via both \(x\) and \(y\) values.
Note the partition identifications are limited to 1’s and 2’s (left and right of the partition respectively), not the 4 values per the \(x\) and \(y\) partitioning.
## $order
## [1] 4
##
## $dt
## x y quadrant prior.quadrant
## 1: -5.00 -125.0000 q1111 q111
## 2: -4.95 -121.2874 q1111 q111
## 3: -4.90 -117.6490 q1111 q111
## 4: -4.85 -114.0841 q1111 q111
## 5: -4.80 -110.5920 q1111 q111
## ---
## 197: 4.80 110.5920 q2222 q222
## 198: 4.85 114.0841 q2222 q222
## 199: 4.90 117.6490 q2222 q222
## 200: 4.95 121.2874 q2222 q222
## 201: 5.00 125.0000 q2222 q222
##
## $regression.points
## quadrant x y
## 1: q111 -4.3742412 -79.8807307
## 2: q112 -3.0992681 -28.0828202
## 3: q121 -1.8507319 -5.8599732
## 4: q122 -0.5992681 -0.2594580
## 5: q211 0.6507319 0.3130212
## 6: q212 1.9007319 6.3553668
## 7: q221 3.1507319 29.4685900
## 8: q222 4.3992681 81.4792796
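The identifiers in the output above can be read as a path of left (1) and right (2) splits. A minimal base R sketch of that labelling, assuming each cell is split at the mean of its own x values, follows. The helper xonly_partition is hypothetical, and it ignores the stopping conditions and the regression point derivation that NNS.part applies.

```r
# Illustrative sketch only: repeated x-only mean splits, not the NNS.part implementation.
xonly_partition <- function(x, order) {
  x <- as.numeric(x)
  id <- rep("q", length(x))
  for (i in seq_len(order)) {
    cell_mean <- ave(x, id, FUN = mean)             # mean of x within each current cell
    id <- paste0(id, ifelse(x <= cell_mean, 1, 2))  # 1 = left of the split, 2 = right
  }
  id
}

ids <- xonly_partition(1:8, order = 2)
print(ids)
```

Each additional order appends one more digit, so the identifier length tracks the depth of the partition.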
The right column of plots shows the corresponding regression for the order of NNS partitioning.
NNS.reg
NNS.reg can fit any \(f(x)\), for both uni- and multivariate cases. NNS.reg returns a self-explanatory list of values, shown below.
## $R2
## [1] 0.99997
##
## $SE
## [1] 0.2639869
##
## $Prediction.Accuracy
## NULL
##
## $equation
## NULL
##
## $x.star
## NULL
##
## $derivative
## Coefficient X.Lower.Range X.Upper.Range
## 1: 68.96153650 -5.0000000 -4.8497995
## 2: 78.14881892 -4.8497995 -4.6995990
## 3: 57.50334118 -4.6995990 -4.0504010
## 4: 41.41858676 -4.0504010 -3.3995990
## 5: 28.41727255 -3.3995990 -2.7746274
## 6: 18.27062076 -2.7746274 -2.1746274
## 7: 10.36054241 -2.1746274 -1.5495990
## 8: 4.40162406 -1.5495990 -0.8995990
## 9: 0.92746374 -0.8995990 -0.2746274
## 10: -0.04989832 -0.2746274 0.3504010
## 11: 1.23635058 0.3504010 0.9753726
## 12: 4.84664559 0.9753726 1.6004010
## 13: 10.94199241 1.6004010 2.2253726
## 14: 19.09637078 2.2253726 2.8504010
## 15: 30.06030289 2.8504010 3.4753726
## 16: 42.73145759 3.4753726 4.0753726
## 17: 57.14549806 4.0753726 4.7004010
## 18: 77.87331475 4.7004010 4.8502005
## 19: 70.92987791 4.8502005 5.0000000
##
## $Point.est
## NULL
##
## $regression.points
## x y
## 1: -5.0000000 -125.00000000
## 2: -4.8497995 -114.64194196
## 3: -4.6995990 -102.90394940
## 4: -4.0504010 -65.57289791
## 5: -3.3995990 -38.61759694
## 6: -2.7746274 -20.85761067
## 7: -2.1746274 -9.89523822
## 8: -1.5495990 -3.41960423
## 9: -0.8995990 -0.55854859
## 10: -0.2746274 0.02108984
## 11: 0.3504010 -0.01009803
## 12: 0.9753726 0.76258588
## 13: 1.6004010 3.79187737
## 14: 2.2253726 10.63031109
## 15: 2.8504010 22.56608652
## 16: 3.4753726 41.35291997
## 17: 4.0753726 66.99179452
## 18: 4.7004010 102.70935782
## 19: 4.8502005 114.37474056
## 20: 5.0000000 125.00000000
##
## $Fitted.xy
## x y y.hat NNS.ID gradient residuals
## 1: -5.00 -125.0000 -125.0000 q11111 68.96154 0.0000000
## 2: -4.95 -121.2874 -121.5519 q11111 68.96154 -0.2645482
## 3: -4.90 -117.6490 -118.1038 q11111 68.96154 -0.4548464
## 4: -4.85 -114.0841 -114.6558 q11111 68.96154 -0.5716445
## 5: -4.80 -110.5920 -110.7502 q11111 78.14882 -0.1581707
## ---
## 197: 4.80 110.5920 110.4655 q22222 77.87331 -0.1265397
## 198: 4.85 114.0841 114.3591 q22222 77.87331 0.2750011
## 199: 4.90 117.6490 117.9070 q22222 70.92988 0.2580122
## 200: 4.95 121.2874 121.4535 q22222 70.92988 0.1661311
## 201: 5.00 125.0000 125.0000 q22222 70.92988 0.0000000
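The $derivative and $Fitted.xy values above are consistent with straight line segments joining consecutive regression points. A short base R check, using the first three regression points copied from the output, recovers the first two coefficients and the fitted values at x = -4.95 and x = -4.90. The variable names are illustrative, and approx() is used here as a stand-in for linear interpolation rather than as the routine NNS.reg calls internally.

```r
# First three rows of $regression.points, copied from the output
rp_x <- c(-5.0000000, -4.8497995, -4.6995990)
rp_y <- c(-125.00000000, -114.64194196, -102.90394940)

# Slope of each segment between consecutive regression points
slopes <- diff(rp_y) / diff(rp_x)
print(slopes)

# Fitted values by linear interpolation along the first segment
y_hat <- approx(rp_x, rp_y, xout = c(-4.95, -4.90))$y
print(y_hat)
```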
Multivariate regressions return a plot of \(y\) and \(\hat{y}\), as well as the regression points ($RPM) and partitions ($rhs.partitions) for each regressor.
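# Target surface in two regressors
f = function(x, y) x ^ 3 + 3 * y - y ^ 3 - 3 * x
# Reuse the x grid for the second regressor and build every (x, y) pair
y = x ; z = expand.grid(x, y)
g = f(z[ , 1], z[ , 2])
NNS.reg(z, g, order = "max", ncores = 1)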
## $R2
## [1] 1
##
## $rhs.partitions
## Var1 Var2
## 1: -5.00 -5
## 2: -4.95 -5
## 3: -4.90 -5
## 4: -4.85 -5
## 5: -4.80 -5
## ---
## 40397: 4.80 5
## 40398: 4.85 5
## 40399: 4.90 5
## 40400: 4.95 5
## 40401: 5.00 5
##
## $RPM
## Var1 Var2 y.hat
## 1: -4.8 -4.80 -7.105427e-15
## 2: -4.8 -2.55 -8.726063e+01
## 3: -4.8 -2.50 -8.806700e+01
## 4: -4.8 -2.45 -8.883587e+01
## 5: -4.8 -2.40 -8.956800e+01
## ---
## 40397: -2.6 -2.80 3.776000e+00
## 40398: -2.6 -2.75 2.770875e+00
## 40399: -2.6 -2.70 1.807000e+00
## 40400: -2.6 -2.65 8.836250e-01
## 40401: -2.6 -2.60 1.776357e-15
##
## $Point.est
## NULL
##
## $Fitted.xy
## Var1 Var2 y y.hat NNS.ID residuals
## 1: -5.00 -5 0.000000 0.000000 201.201 0
## 2: -4.95 -5 3.562625 3.562625 402.201 0
## 3: -4.90 -5 7.051000 7.051000 603.201 0
## 4: -4.85 -5 10.465875 10.465875 804.201 0
## 5: -4.80 -5 13.808000 13.808000 1005.201 0
## ---
## 40397: 4.80 5 -13.808000 -13.808000 39597.40401 0
## 40398: 4.85 5 -10.465875 -10.465875 39798.40401 0
## 40399: 4.90 5 -7.051000 -7.051000 39999.40401 0
## 40400: 4.95 5 -3.562625 -3.562625 40200.40401 0
## 40401: 5.00 5 0.000000 0.000000 40401.40401 0
NNS.reg can inter- or extrapolate any point of interest. The point.est parameter, NNS.reg(x, y, point.est = ...), accepts data of any length with the same dimensions as \(x\); the estimates are returned in $Point.est.
NNS.reg also provides a dimension reduction regression via the parameter NNS.reg(x, y, dim.red.method = "cor", ...). This reduces all regressors to a single dimension using the returned equation, $equation.
## Variable Coefficient
## 1: Sepal.Length 0.7980781
## 2: Sepal.Width -0.4402896
## 3: Petal.Length 0.9354305
## 4: Petal.Width 0.9381792
## 5: DENOMINATOR 4.0000000
Thus, our model for this regression would be: \[Species = \frac{0.798*Sepal.Length -0.44*Sepal.Width +0.935*Petal.Length +0.938*Petal.Width}{4} \]
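Applying the equation is a weighted sum followed by a division. The sketch below does this in base R for the built-in iris data, with the coefficients copied from the $equation output above. The names coefs and x_star are illustrative, and the sketch only evaluates the printed equation; it does not reproduce how NNS.reg derives the coefficients.

```r
# Coefficients and denominator copied from the $equation output
coefs <- c(Sepal.Length = 0.7980781, Sepal.Width = -0.4402896,
           Petal.Length = 0.9354305, Petal.Width = 0.9381792)
denominator <- 4

# Weighted sum of the four regressors, one value per observation
x_star <- as.vector(as.matrix(iris[, names(coefs)]) %*% coefs) / denominator
print(head(x_star))
```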
NNS.reg(x, y, dim.red.method = "cor", threshold = ...) offers a method of reducing regressors further by controlling the absolute value of required correlation.
## Variable Coefficient
## 1: Sepal.Length 0.7980781
## 2: Sepal.Width 0.0000000
## 3: Petal.Length 0.9354305
## 4: Petal.Width 0.9381792
## 5: DENOMINATOR 3.0000000
Thus, our model for this further reduced dimension regression would be: \[Species = \frac{\: 0.798*Sepal.Length - 0*Sepal.Width +0.935*Petal.Length +0.938*Petal.Width}{3} \]
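The effect of the threshold on the equation can be mimicked in a few lines of base R. This sketch assumes that a coefficient is kept when its absolute value is at least the threshold and that the denominator counts the surviving coefficients, which is consistent with the output above. The threshold value 0.75 is a hypothetical choice, and the exact comparison rule inside NNS.reg is not confirmed here.

```r
# Coefficients copied from the unrestricted $equation output
coefs <- c(Sepal.Length = 0.7980781, Sepal.Width = -0.4402896,
           Petal.Length = 0.9354305, Petal.Width = 0.9381792)

threshold <- 0.75  # hypothetical value chosen for illustration

# Zero out coefficients whose absolute value falls below the threshold (assumed rule)
kept <- ifelse(abs(coefs) >= threshold, coefs, 0)
denominator <- sum(kept != 0)

print(kept)
print(denominator)
```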
The point.est = (...) parameter operates in the same manner as in the full regression above, and its estimates are again called with $Point.est.
## [1] 1 1 1 1 1 1 1 1 1 1
For a classification problem, we simply set NNS.reg(x, y, type = "CLASS", ...).
## [1] 1 1 1 1 1 1 1 1 1 1
NNS.stack
The NNS.stack() routine cross-validates, for a given objective function, the n.best parameter in the multivariate NNS.reg function as well as the threshold parameter in the dimension reduction NNS.reg version. NNS.stack can be used for classification, NNS.stack(..., type = "CLASS", ...), or for continuous dependent variables, NNS.stack(..., type = NULL, ...).
Any objective function obj.fn can be called using expression() with the terms predicted and actual.
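As a rough illustration of how such an expression behaves, the base R sketch below evaluates two objective functions against hypothetical predicted and actual vectors using eval(). The helper score and the data are assumptions made for illustration; how NNS.stack evaluates obj.fn internally may differ.

```r
# Objective functions written in terms of `predicted` and `actual`
accuracy <- expression(mean(round(predicted) == actual))
rmse     <- expression(sqrt(mean((predicted - actual) ^ 2)))

# Hypothetical helper: evaluate an expression with the two terms supplied
score <- function(obj_fn, predicted, actual) {
  eval(obj_fn[[1]], list(predicted = predicted, actual = actual))
}

# Hypothetical predictions and observed values
pred <- c(1.2, 1.9, 3.4)
act  <- c(1, 2, 2)

acc_value  <- score(accuracy, pred, act)
rmse_value <- score(rmse, pred, act)
print(acc_value)
print(rmse_value)
```

An accuracy-style function is one to maximize, as with objective = "max" in the call shown in this section, whereas an error measure such as RMSE would be one to minimize.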
Note: For mixed data type regressors / features, it is suggested to use NNS.stack(..., order = "max", ...).
NNS.stack(IVs.train = iris[ , 1 : 4],
          DV.train = iris[ , 5],
          IVs.test = iris[1:10, 1 : 4],
          obj.fn = expression( mean(round(predicted) == actual) ),
          objective = "max",
          type = "CLASS", folds = 1, ncores = 1)
## $OBJfn.reg
## [1] 1
##
## $NNS.reg.n.best
## [1] 3
##
## $OBJfn.dim.red
## [1] 0.9807692
##
## $NNS.dim.red.threshold
## [1] 0.785
##
## $reg
## [1] 1 1 1 1 1 1 1 1 1 1
##
## $dim.red
## [1] 1 1 1 1 1 1 1 1 1 1
##
## $stack
## [1] 1 1 1 1 1 1 1 1 1 1
If the user is so motivated, detailed arguments and further examples are provided within the following: