Unified (formula-based) interface to the naive Bayes algorithm provided by e1071::naiveBayes().
Usage
mlNaiveBayes(train, ...)
ml_naive_bayes(train, ...)
# S3 method for formula
mlNaiveBayes(formula, data, laplace = 0, ..., subset, na.action)
# S3 method for default
mlNaiveBayes(train, response, laplace = 0, ...)
# S3 method for mlNaiveBayes
predict(
object,
newdata,
type = c("class", "membership", "both"),
method = c("direct", "cv"),
na.action = na.exclude,
threshold = 0.001,
eps = 0,
...
)
Arguments
- train
a matrix or data frame with predictors.
- ...
further arguments passed to the classification method or its predict() method (not used here for now).
- formula
a formula with the left term being the factor variable to predict and the right term the list of independent, predictive variables, separated by plus signs. If the data frame provided contains only the dependent and independent variables, one can use the short version class ~ . (which is strongly encouraged). Variables prefixed with a minus sign are eliminated. Calculations on variables are possible according to the usual formula conventions (possibly protected by I()).
- data
a data.frame to use as a training set.
- laplace
positive number controlling Laplace smoothing for the naive Bayes classifier. The default (0) disables Laplace smoothing.
- subset
index vector with the cases that define the training set in use (this argument must be named, if provided).
- na.action
function specifying the action to take if NAs are found. For ml_naive_bayes(), na.fail is used by default: the calculation is stopped if there is any NA in the data. Another option is na.omit, where cases with missing values on any required variable are dropped (this argument must be named, if provided). For the predict() method, the default, and most suitable, option is na.exclude: rows with NAs in newdata= are excluded from the prediction, but reinjected into the final results so that the number of items stays the same (and in the same order as newdata=).
- response
a vector of factor with the classes.
- object
an mlNaiveBayes object.
- newdata
a new dataset with the same conformation as the training set (same variables, except possibly the class for classification or the dependent variable for regression). Usually a test set, or a new dataset to be predicted.
- type
the type of prediction to return: "class" (the default) returns the predicted classes, "membership" returns the posterior probabilities, and "both" returns both classes and memberships.
- method
"direct" (default) or "cv". "direct" predicts the new cases in newdata= if that argument is provided, or the cases in the training set otherwise. Take care: not providing newdata= means you only calculate the self-consistency of the classifier and cannot use the metrics derived from these results to assess its performance. Either use a different dataset in newdata=, or use the alternative cross-validation ("cv") technique. If you specify method = "cv", cvpredict() is used, and you cannot provide newdata= in that case.
- threshold
value replacing cells with probabilities within the eps range.
- eps
number specifying an epsilon range for applying Laplace smoothing at prediction time (zero or close-to-zero probabilities are replaced by threshold).
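The na.fail / na.omit / na.exclude behaviors described above are the standard base R missing-value actions; a minimal sketch, using lm() rather than mlNaiveBayes() purely for illustration, shows how they differ:

```r
# Standard base R NA actions, illustrated on a small lm() fit:
d <- data.frame(x = c(1, 2, NA, 4), y = c(2, 4, 6, 8))

# na.fail: stop immediately when an NA is present
res <- try(lm(y ~ x, data = d, na.action = na.fail), silent = TRUE)
inherits(res, "try-error")       # TRUE: the computation is aborted

# na.omit: incomplete cases are simply dropped
fit_omit <- lm(y ~ x, data = d, na.action = na.omit)
length(fitted(fit_omit))         # 3 fitted values, one case dropped

# na.exclude: dropped for fitting, but NAs are reinjected in the results
fit_excl <- lm(y ~ x, data = d, na.action = na.exclude)
length(fitted(fit_excl))         # 4 values, with NA in third position
```

The na.exclude behavior is what makes predictions line up row-for-row with newdata=, which is why it is the default for the predict() method.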
Value
ml_naive_bayes()/mlNaiveBayes() creates an mlNaiveBayes, mlearning object containing the classifier and a lot of additional metadata used by the functions and methods you can apply to it, like predict() or cvpredict(). In case you want to program new functions or extract specific components, inspect the "unclassed" object using unclass().
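unclass() strips the S3 class attribute so that the underlying list components become visible; a generic base R sketch (shown on an lm object rather than an mlNaiveBayes one, since the principle is identical):

```r
# unclass() removes the class attribute, exposing the raw list underneath,
# which can then be explored with names() or str():
fit <- lm(dist ~ speed, data = cars)
class(fit)                  # "lm"
obj <- unclass(fit)
class(obj)                  # "list": methods no longer dispatch on it
names(obj)[1:3]             # "coefficients" "residuals" "effects"
```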
See also
mlearning(), cvpredict(), confusion(); also e1071::naiveBayes(), which actually does the classification.
Examples
# Prepare data: split into training set (2/3) and test set (1/3)
data("iris", package = "datasets")
train <- c(1:34, 51:83, 101:133)
iris_train <- iris[train, ]
iris_test <- iris[-train, ]
# One case with missing data in the training set, and another in the test set
iris_train[1, 1] <- NA
iris_test[25, 2] <- NA
iris_nb <- ml_naive_bayes(Species ~ ., data = iris_train)
summary(iris_nb)
#> A mlearning object of class mlNaiveBayes (naive Bayes classifier):
#> Initial call: mlNaiveBayes.formula(formula = Species ~ ., data = iris_train)
#>
#> Naive Bayes Classifier for Discrete Predictors
#>
#> Call:
#> naiveBayes.default(x = train, y = response, laplace = laplace,
#> .args. = ..1)
#>
#> A-priori probabilities:
#> response
#> setosa versicolor virginica
#> 0.3333333 0.3333333 0.3333333
#>
#> Conditional probabilities:
#> Sepal.Length
#> response [,1] [,2]
#> setosa 5.048485 0.3725933
#> versicolor 6.027273 0.5392545
#> virginica 6.642424 0.7088857
#>
#> Sepal.Width
#> response [,1] [,2]
#> setosa 3.478788 0.3805897
#> versicolor 2.763636 0.3267436
#> virginica 2.951515 0.3545430
#>
#> Petal.Length
#> response [,1] [,2]
#> setosa 1.478788 0.1781109
#> versicolor 4.284848 0.4651083
#> virginica 5.642424 0.6179757
#>
#> Petal.Width
#> response [,1] [,2]
#> setosa 0.2454545 0.1033529
#> versicolor 1.3303030 0.2157615
#> virginica 2.0090909 0.2466825
#>
predict(iris_nb) # Default type is class
#> [1] setosa setosa setosa setosa setosa setosa
#> [7] setosa setosa setosa setosa setosa setosa
#> [13] setosa setosa setosa setosa setosa setosa
#> [19] setosa setosa setosa setosa setosa setosa
#> [25] setosa setosa setosa setosa setosa setosa
#> [31] setosa setosa setosa versicolor versicolor versicolor
#> [37] versicolor versicolor versicolor versicolor versicolor versicolor
#> [43] versicolor versicolor versicolor versicolor versicolor versicolor
#> [49] versicolor versicolor versicolor versicolor versicolor virginica
#> [55] versicolor versicolor versicolor versicolor versicolor versicolor
#> [61] virginica versicolor versicolor versicolor versicolor versicolor
#> [67] virginica virginica virginica virginica virginica virginica
#> [73] versicolor virginica virginica virginica virginica virginica
#> [79] virginica virginica virginica virginica virginica virginica
#> [85] virginica versicolor virginica virginica virginica virginica
#> [91] virginica virginica virginica virginica virginica virginica
#> [97] virginica virginica virginica
#> Levels: setosa versicolor virginica
predict(iris_nb, type = "membership")
#> setosa versicolor virginica
#> [1,] 1.000000e+00 1.803036e-16 1.167697e-24
#> [2,] 1.000000e+00 1.352715e-17 1.831338e-25
#> [3,] 1.000000e+00 1.772241e-16 1.353873e-24
#> [4,] 1.000000e+00 5.692804e-18 1.376409e-25
#> [5,] 1.000000e+00 6.093062e-14 6.531376e-21
#> [6,] 1.000000e+00 8.826228e-17 2.837739e-24
#> [7,] 1.000000e+00 7.437194e-17 8.757620e-25
#> [8,] 1.000000e+00 1.216747e-16 9.720500e-25
#> [9,] 1.000000e+00 6.724341e-17 2.531533e-25
#> [10,] 1.000000e+00 4.566759e-17 1.186716e-24
#> [11,] 1.000000e+00 1.879913e-16 1.981529e-24
#> [12,] 1.000000e+00 2.686953e-17 1.085926e-25
#> [13,] 1.000000e+00 1.650638e-18 2.251328e-26
#> [14,] 1.000000e+00 2.368402e-18 3.908159e-25
#> [15,] 1.000000e+00 1.761543e-16 1.792568e-22
#> [16,] 1.000000e+00 2.706070e-16 6.524103e-23
#> [17,] 1.000000e+00 1.684941e-16 5.447395e-24
#> [18,] 1.000000e+00 2.886607e-14 1.441825e-21
#> [19,] 1.000000e+00 6.739186e-17 4.013941e-24
#> [20,] 1.000000e+00 9.227945e-15 7.619346e-23
#> [21,] 1.000000e+00 3.076912e-15 2.554200e-22
#> [22,] 1.000000e+00 2.412671e-19 2.082673e-26
#> [23,] 1.000000e+00 5.291974e-11 2.164940e-18
#> [24,] 1.000000e+00 8.216570e-14 5.480528e-22
#> [25,] 1.000000e+00 3.650319e-15 1.524151e-23
#> [26,] 1.000000e+00 7.652882e-14 2.445210e-21
#> [27,] 1.000000e+00 7.812216e-17 1.149116e-24
#> [28,] 1.000000e+00 4.476095e-17 6.365960e-25
#> [29,] 1.000000e+00 5.252216e-16 3.756987e-24
#> [30,] 1.000000e+00 1.184327e-15 6.488869e-24
#> [31,] 1.000000e+00 8.338509e-14 3.263362e-21
#> [32,] 1.000000e+00 1.690882e-19 8.721554e-27
#> [33,] 1.000000e+00 4.316419e-19 7.033715e-26
#> [34,] 7.689655e-103 9.159085e-01 8.409150e-02
#> [35,] 1.096673e-96 9.667123e-01 3.328772e-02
#> [36,] 1.930514e-116 6.802159e-01 3.197841e-01
#> [37,] 1.102081e-67 9.999412e-01 5.879818e-05
#> [38,] 3.602267e-102 9.709066e-01 2.909345e-02
#> [39,] 3.335613e-86 9.993121e-01 6.878536e-04
#> [40,] 1.129911e-109 7.783585e-01 2.216415e-01
#> [41,] 2.982269e-33 9.999994e-01 5.938056e-07
#> [42,] 4.546204e-93 9.957187e-01 4.281285e-03
#> [43,] 2.245740e-67 9.998022e-01 1.977854e-04
#> [44,] 1.295311e-39 9.999994e-01 6.198654e-07
#> [45,] 8.639091e-84 9.962009e-01 3.799095e-03
#> [46,] 1.306052e-57 9.999962e-01 3.800063e-06
#> [47,] 5.129110e-100 9.913456e-01 8.654381e-03
#> [48,] 1.247004e-53 9.999529e-01 4.710380e-05
#> [49,] 2.751640e-89 9.894678e-01 1.053225e-02
#> [50,] 9.204587e-95 9.910781e-01 8.921933e-03
#> [51,] 1.056753e-59 9.999929e-01 7.109959e-06
#> [52,] 4.759450e-98 9.939246e-01 6.075414e-03
#> [53,] 4.172627e-56 9.999931e-01 6.930611e-06
#> [54,] 1.147499e-124 2.598343e-01 7.401657e-01
#> [55,] 2.430851e-68 9.998174e-01 1.826315e-04
#> [56,] 3.305367e-115 9.485689e-01 5.143106e-02
#> [57,] 1.322349e-91 9.991780e-01 8.219604e-04
#> [58,] 3.383621e-80 9.990860e-01 9.139928e-04
#> [59,] 4.018503e-89 9.929331e-01 7.066888e-03
#> [60,] 2.885304e-107 9.596243e-01 4.037565e-02
#> [61,] 1.500417e-131 1.709929e-01 8.290071e-01
#> [62,] 4.494759e-96 9.893143e-01 1.068568e-02
#> [63,] 5.408043e-40 9.999988e-01 1.216458e-06
#> [64,] 1.002758e-52 9.999955e-01 4.522882e-06
#> [65,] 5.628947e-46 9.999987e-01 1.332260e-06
#> [66,] 4.602589e-60 9.999711e-01 2.887912e-05
#> [67,] 4.079871e-244 3.576071e-09 1.000000e+00
#> [68,] 3.689196e-146 5.139596e-02 9.486040e-01
#> [69,] 4.834773e-210 1.388701e-06 9.999986e-01
#> [70,] 1.096501e-167 4.868244e-03 9.951318e-01
#> [71,] 1.614505e-208 2.397766e-06 9.999976e-01
#> [72,] 1.416791e-258 1.522784e-09 1.000000e+00
#> [73,] 4.215784e-105 9.531740e-01 4.682597e-02
#> [74,] 1.247937e-215 3.770575e-06 9.999962e-01
#> [75,] 1.109378e-181 1.112506e-03 9.988875e-01
#> [76,] 1.716419e-254 1.088139e-10 1.000000e+00
#> [77,] 4.973153e-155 2.017456e-03 9.979825e-01
#> [78,] 1.101493e-158 8.236383e-03 9.917636e-01
#> [79,] 3.008552e-185 3.859199e-05 9.999614e-01
#> [80,] 3.744500e-148 2.931546e-02 9.706845e-01
#> [81,] 3.347013e-184 2.234062e-05 9.999777e-01
#> [82,] 8.606627e-188 8.413290e-06 9.999916e-01
#> [83,] 7.840619e-163 5.273864e-03 9.947261e-01
#> [84,] 1.759992e-272 1.484798e-11 1.000000e+00
#> [85,] 8.472771e-297 6.900804e-12 1.000000e+00
#> [86,] 2.360925e-119 9.577011e-01 4.229886e-02
#> [87,] 2.229069e-212 2.646252e-07 9.999997e-01
#> [88,] 1.285230e-142 3.410444e-02 9.658956e-01
#> [89,] 2.856722e-259 2.856759e-09 1.000000e+00
#> [90,] 1.186200e-131 2.371171e-01 7.628829e-01
#> [91,] 1.619220e-195 7.614926e-06 9.999924e-01
#> [92,] 1.704706e-195 2.566160e-05 9.999743e-01
#> [93,] 1.903352e-126 3.334420e-01 6.665580e-01
#> [94,] 2.236975e-130 2.044628e-01 7.955372e-01
#> [95,] 3.459428e-189 6.688505e-05 9.999331e-01
#> [96,] 1.646362e-171 1.997026e-03 9.980030e-01
#> [97,] 2.796148e-210 3.580998e-06 9.999964e-01
#> [98,] 1.557094e-238 1.596656e-09 1.000000e+00
#> [99,] 7.874332e-197 1.449503e-05 9.999855e-01
predict(iris_nb, type = "both")
#> $class
#> [1] setosa setosa setosa setosa setosa setosa
#> [7] setosa setosa setosa setosa setosa setosa
#> [13] setosa setosa setosa setosa setosa setosa
#> [19] setosa setosa setosa setosa setosa setosa
#> [25] setosa setosa setosa setosa setosa setosa
#> [31] setosa setosa setosa versicolor versicolor versicolor
#> [37] versicolor versicolor versicolor versicolor versicolor versicolor
#> [43] versicolor versicolor versicolor versicolor versicolor versicolor
#> [49] versicolor versicolor versicolor versicolor versicolor virginica
#> [55] versicolor versicolor versicolor versicolor versicolor versicolor
#> [61] virginica versicolor versicolor versicolor versicolor versicolor
#> [67] virginica virginica virginica virginica virginica virginica
#> [73] versicolor virginica virginica virginica virginica virginica
#> [79] virginica virginica virginica virginica virginica virginica
#> [85] virginica versicolor virginica virginica virginica virginica
#> [91] virginica virginica virginica virginica virginica virginica
#> [97] virginica virginica virginica
#> Levels: setosa versicolor virginica
#>
#> $membership
#> setosa versicolor virginica
#> [1,] 1.000000e+00 1.803036e-16 1.167697e-24
#> [2,] 1.000000e+00 1.352715e-17 1.831338e-25
#> [3,] 1.000000e+00 1.772241e-16 1.353873e-24
#> [4,] 1.000000e+00 5.692804e-18 1.376409e-25
#> [5,] 1.000000e+00 6.093062e-14 6.531376e-21
#> [6,] 1.000000e+00 8.826228e-17 2.837739e-24
#> [7,] 1.000000e+00 7.437194e-17 8.757620e-25
#> [8,] 1.000000e+00 1.216747e-16 9.720500e-25
#> [9,] 1.000000e+00 6.724341e-17 2.531533e-25
#> [10,] 1.000000e+00 4.566759e-17 1.186716e-24
#> [11,] 1.000000e+00 1.879913e-16 1.981529e-24
#> [12,] 1.000000e+00 2.686953e-17 1.085926e-25
#> [13,] 1.000000e+00 1.650638e-18 2.251328e-26
#> [14,] 1.000000e+00 2.368402e-18 3.908159e-25
#> [15,] 1.000000e+00 1.761543e-16 1.792568e-22
#> [16,] 1.000000e+00 2.706070e-16 6.524103e-23
#> [17,] 1.000000e+00 1.684941e-16 5.447395e-24
#> [18,] 1.000000e+00 2.886607e-14 1.441825e-21
#> [19,] 1.000000e+00 6.739186e-17 4.013941e-24
#> [20,] 1.000000e+00 9.227945e-15 7.619346e-23
#> [21,] 1.000000e+00 3.076912e-15 2.554200e-22
#> [22,] 1.000000e+00 2.412671e-19 2.082673e-26
#> [23,] 1.000000e+00 5.291974e-11 2.164940e-18
#> [24,] 1.000000e+00 8.216570e-14 5.480528e-22
#> [25,] 1.000000e+00 3.650319e-15 1.524151e-23
#> [26,] 1.000000e+00 7.652882e-14 2.445210e-21
#> [27,] 1.000000e+00 7.812216e-17 1.149116e-24
#> [28,] 1.000000e+00 4.476095e-17 6.365960e-25
#> [29,] 1.000000e+00 5.252216e-16 3.756987e-24
#> [30,] 1.000000e+00 1.184327e-15 6.488869e-24
#> [31,] 1.000000e+00 8.338509e-14 3.263362e-21
#> [32,] 1.000000e+00 1.690882e-19 8.721554e-27
#> [33,] 1.000000e+00 4.316419e-19 7.033715e-26
#> [34,] 7.689655e-103 9.159085e-01 8.409150e-02
#> [35,] 1.096673e-96 9.667123e-01 3.328772e-02
#> [36,] 1.930514e-116 6.802159e-01 3.197841e-01
#> [37,] 1.102081e-67 9.999412e-01 5.879818e-05
#> [38,] 3.602267e-102 9.709066e-01 2.909345e-02
#> [39,] 3.335613e-86 9.993121e-01 6.878536e-04
#> [40,] 1.129911e-109 7.783585e-01 2.216415e-01
#> [41,] 2.982269e-33 9.999994e-01 5.938056e-07
#> [42,] 4.546204e-93 9.957187e-01 4.281285e-03
#> [43,] 2.245740e-67 9.998022e-01 1.977854e-04
#> [44,] 1.295311e-39 9.999994e-01 6.198654e-07
#> [45,] 8.639091e-84 9.962009e-01 3.799095e-03
#> [46,] 1.306052e-57 9.999962e-01 3.800063e-06
#> [47,] 5.129110e-100 9.913456e-01 8.654381e-03
#> [48,] 1.247004e-53 9.999529e-01 4.710380e-05
#> [49,] 2.751640e-89 9.894678e-01 1.053225e-02
#> [50,] 9.204587e-95 9.910781e-01 8.921933e-03
#> [51,] 1.056753e-59 9.999929e-01 7.109959e-06
#> [52,] 4.759450e-98 9.939246e-01 6.075414e-03
#> [53,] 4.172627e-56 9.999931e-01 6.930611e-06
#> [54,] 1.147499e-124 2.598343e-01 7.401657e-01
#> [55,] 2.430851e-68 9.998174e-01 1.826315e-04
#> [56,] 3.305367e-115 9.485689e-01 5.143106e-02
#> [57,] 1.322349e-91 9.991780e-01 8.219604e-04
#> [58,] 3.383621e-80 9.990860e-01 9.139928e-04
#> [59,] 4.018503e-89 9.929331e-01 7.066888e-03
#> [60,] 2.885304e-107 9.596243e-01 4.037565e-02
#> [61,] 1.500417e-131 1.709929e-01 8.290071e-01
#> [62,] 4.494759e-96 9.893143e-01 1.068568e-02
#> [63,] 5.408043e-40 9.999988e-01 1.216458e-06
#> [64,] 1.002758e-52 9.999955e-01 4.522882e-06
#> [65,] 5.628947e-46 9.999987e-01 1.332260e-06
#> [66,] 4.602589e-60 9.999711e-01 2.887912e-05
#> [67,] 4.079871e-244 3.576071e-09 1.000000e+00
#> [68,] 3.689196e-146 5.139596e-02 9.486040e-01
#> [69,] 4.834773e-210 1.388701e-06 9.999986e-01
#> [70,] 1.096501e-167 4.868244e-03 9.951318e-01
#> [71,] 1.614505e-208 2.397766e-06 9.999976e-01
#> [72,] 1.416791e-258 1.522784e-09 1.000000e+00
#> [73,] 4.215784e-105 9.531740e-01 4.682597e-02
#> [74,] 1.247937e-215 3.770575e-06 9.999962e-01
#> [75,] 1.109378e-181 1.112506e-03 9.988875e-01
#> [76,] 1.716419e-254 1.088139e-10 1.000000e+00
#> [77,] 4.973153e-155 2.017456e-03 9.979825e-01
#> [78,] 1.101493e-158 8.236383e-03 9.917636e-01
#> [79,] 3.008552e-185 3.859199e-05 9.999614e-01
#> [80,] 3.744500e-148 2.931546e-02 9.706845e-01
#> [81,] 3.347013e-184 2.234062e-05 9.999777e-01
#> [82,] 8.606627e-188 8.413290e-06 9.999916e-01
#> [83,] 7.840619e-163 5.273864e-03 9.947261e-01
#> [84,] 1.759992e-272 1.484798e-11 1.000000e+00
#> [85,] 8.472771e-297 6.900804e-12 1.000000e+00
#> [86,] 2.360925e-119 9.577011e-01 4.229886e-02
#> [87,] 2.229069e-212 2.646252e-07 9.999997e-01
#> [88,] 1.285230e-142 3.410444e-02 9.658956e-01
#> [89,] 2.856722e-259 2.856759e-09 1.000000e+00
#> [90,] 1.186200e-131 2.371171e-01 7.628829e-01
#> [91,] 1.619220e-195 7.614926e-06 9.999924e-01
#> [92,] 1.704706e-195 2.566160e-05 9.999743e-01
#> [93,] 1.903352e-126 3.334420e-01 6.665580e-01
#> [94,] 2.236975e-130 2.044628e-01 7.955372e-01
#> [95,] 3.459428e-189 6.688505e-05 9.999331e-01
#> [96,] 1.646362e-171 1.997026e-03 9.980030e-01
#> [97,] 2.796148e-210 3.580998e-06 9.999964e-01
#> [98,] 1.557094e-238 1.596656e-09 1.000000e+00
#> [99,] 7.874332e-197 1.449503e-05 9.999855e-01
#>
# Self-consistency, do not use for assessing classifier performances!
confusion(iris_nb)
#> 99 items classified with 95 true positives (error rate = 4%)
#> Predicted
#> Actual 01 02 03 (sum) (FNR%)
#> 01 setosa 33 0 0 33 0
#> 02 versicolor 0 31 2 33 6
#> 03 virginica 0 2 31 33 6
#> (sum) 33 33 33 99 4
# Use an independent test set instead
confusion(predict(iris_nb, newdata = iris_test), iris_test$Species)
#> 50 items classified with 47 true positives (error rate = 6%)
#> Predicted
#> Actual 01 02 03 04 (sum) (FNR%)
#> 01 setosa 16 0 0 0 16 0
#> 02 NA 0 0 0 0 0
#> 03 versicolor 0 1 16 0 17 6
#> 04 virginica 0 0 2 15 17 12
#> (sum) 16 1 18 15 50 6
# Another dataset
data("HouseVotes84", package = "mlbench")
house_nb <- ml_naive_bayes(Class ~ ., data = HouseVotes84,
  na.action = na.omit)
summary(house_nb)
#> A mlearning object of class mlNaiveBayes (naive Bayes classifier):
#> Initial call: mlNaiveBayes.formula(formula = Class ~ ., data = HouseVotes84, na.action = na.omit)
#>
#> Naive Bayes Classifier for Discrete Predictors
#>
#> Call:
#> naiveBayes.default(x = train, y = response, laplace = laplace,
#> .args. = ..1)
#>
#> A-priori probabilities:
#> response
#> democrat republican
#> 0.5344828 0.4655172
#>
#> Conditional probabilities:
#> V1
#> response n y
#> democrat 0.4112903 0.5887097
#> republican 0.7870370 0.2129630
#>
#> V2
#> response n y
#> democrat 0.5483871 0.4516129
#> republican 0.5277778 0.4722222
#>
#> V3
#> response n y
#> democrat 0.1451613 0.8548387
#> republican 0.8425926 0.1574074
#>
#> V4
#> response n y
#> democrat 0.951612903 0.048387097
#> republican 0.009259259 0.990740741
#>
#> V5
#> response n y
#> democrat 0.7983871 0.2016129
#> republican 0.0462963 0.9537037
#>
#> V6
#> response n y
#> democrat 0.5564516 0.4435484
#> republican 0.1296296 0.8703704
#>
#> V7
#> response n y
#> democrat 0.2338710 0.7661290
#> republican 0.7314815 0.2685185
#>
#> V8
#> response n y
#> democrat 0.1693548 0.8306452
#> republican 0.8518519 0.1481481
#>
#> V9
#> response n y
#> democrat 0.2096774 0.7903226
#> republican 0.8611111 0.1388889
#>
#> V10
#> response n y
#> democrat 0.4677419 0.5322581
#> republican 0.4259259 0.5740741
#>
#> V11
#> response n y
#> democrat 0.4919355 0.5080645
#> republican 0.8425926 0.1574074
#>
#> V12
#> response n y
#> democrat 0.8709677 0.1290323
#> republican 0.1481481 0.8518519
#>
#> V13
#> response n y
#> democrat 0.7096774 0.2903226
#> republican 0.1574074 0.8425926
#>
#> V14
#> response n y
#> democrat 0.65322581 0.34677419
#> republican 0.01851852 0.98148148
#>
#> V15
#> response n y
#> democrat 0.4032258 0.5967742
#> republican 0.8888889 0.1111111
#>
#> V16
#> response n y
#> democrat 0.05645161 0.94354839
#> republican 0.33333333 0.66666667
#>
confusion(house_nb) # Self-consistency
#> Warning: Type mismatch between training and new data for variable 'V1'. Did you use factors with numeric labels for training, and numeric values for new data?
#> (the same warning is repeated for variables 'V2' through 'V16')
#> 232 items classified with 218 true positives (error rate = 6%)
#> Predicted
#> Actual 01 02 (sum) (FNR%)
#> 01 democrat 111 13 124 10
#> 02 republican 1 107 108 1
#> (sum) 112 120 232 6
confusion(cvpredict(house_nb), na.omit(HouseVotes84)$Class)
#> Warning: Type mismatch between training and new data for variable 'V1'. Did you use factors with numeric labels for training, and numeric values for new data?
#> (the same warning is repeated for variables 'V2' through 'V16', once per cross-validation fold)
#> Warning: Type mismatch between training and new data for variable 'V8'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V9'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V10'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V11'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V12'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V13'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V14'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V15'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V16'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V1'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V2'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V3'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V4'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V5'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V6'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V7'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V8'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V9'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V10'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V11'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V12'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V13'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V14'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V15'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V16'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V1'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V2'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V3'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V4'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V5'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V6'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V7'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V8'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V9'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V10'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V11'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V12'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V13'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V14'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V15'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V16'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V1'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V2'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V3'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V4'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V5'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V6'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V7'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V8'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V9'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V10'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V11'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V12'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V13'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V14'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V15'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V16'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V1'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V2'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V3'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V4'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V5'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V6'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V7'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V8'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V9'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V10'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V11'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V12'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V13'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V14'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V15'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V16'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V1'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V2'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V3'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V4'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V5'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V6'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V7'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V8'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V9'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V10'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V11'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V12'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V13'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V14'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V15'. Did you use factors with numeric labels for training, and numeric values for new data?
#> Warning: Type mismatch between training and new data for variable 'V16'. Did you use factors with numeric labels for training, and numeric values for new data?
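The warnings above point at a common pitfall with naive Bayes: a predictor stored as a factor at training time is treated as categorical, so supplying the same column as plain numbers in `newdata=` triggers a type mismatch. A minimal sketch of the problem and its fix (the data here is made up for illustration; only `V1` and the `e1071::naiveBayes()` / `predict()` calls are real API):

```r
library(e1071)

# Training data where V1 is a factor with numeric labels
train_set <- data.frame(
  V1    = factor(c(1, 1, 2, 2)),
  class = factor(c("a", "a", "b", "b"))
)
model <- naiveBayes(class ~ V1, data = train_set)

# Numeric V1 in new data: the model expected a factor, hence the warning
new_set <- data.frame(V1 = c(1, 2))

# Fix: convert back to a factor with the SAME levels as in training
new_set$V1 <- factor(new_set$V1, levels = levels(train_set$V1))
predict(model, new_set)
```

Keeping the levels identical between training and prediction is the important part; a factor rebuilt with different or reordered levels would silently map observations to the wrong categories.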
#> 232 items classified with 217 true positives (error rate = 6.5%)
#> Predicted
#> Actual 01 02 (sum) (FNR%)
#> 01 democrat 110 14 124 11
#> 02 republican 1 107 108 1
#> (sum) 111 121 232 6