Title: | Machine Learning Performance Evaluation on Steroids |
---|---|
Description: | Performance evaluation metrics for supervised and unsupervised machine learning, statistical learning and artificial intelligence applications. Core computations are implemented in 'C++' for scalability and efficiency. |
Authors: | Serkan Korkmaz [cre, aut, cph] |
Maintainer: | Serkan Korkmaz <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.3-4 |
Built: | 2025-03-28 17:34:20 UTC |
Source: | https://github.com/serkor1/slmetrics |
A generic function for the (normalized) accuracy in classification tasks. Use weighted.accuracy() for the weighted accuracy.
## S3 method for class 'factor'
accuracy(actual, predicted, ...)

## S3 method for class 'factor'
weighted.accuracy(actual, predicted, w, ...)

## S3 method for class 'cmatrix'
accuracy(x, ...)

## Generic S3 method
accuracy(...)

## Generic S3 method
weighted.accuracy(..., w)
actual | A vector of <factor> values with the actual classes. |
predicted | A vector of <factor> values with the predicted classes. |
... | Arguments passed into other methods |
w | A <numeric>-vector of sample weights, the same length as actual. |
x | A confusion matrix created with cmatrix(). |
A <numeric>-vector of length 1
Let $\hat{\alpha}$ be the proportion of correctly predicted classes. The accuracy of the classifier is calculated as,

$$\hat{\alpha} = \frac{\#TP + \#TN}{\#TP + \#TN + \#FP + \#FN}$$

Where:

$\#TP$ is the number of true positives,
$\#TN$ is the number of true negatives,
$\#FP$ is the number of false positives, and
$\#FN$ is the number of false negatives.
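As a minimal numeric illustration of the formula above (a standalone Python sketch with hypothetical counts, not part of the package):

```python
# Hypothetical confusion counts, following the definitions above
TP, TN, FP, FN = 50, 35, 10, 5

# Accuracy = (#TP + #TN) / (#TP + #TN + #FP + #FN)
accuracy = (TP + TN) / (TP + TN + FP + FN)
print(accuracy)  # 0.85
```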
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:

## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C

Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:

## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C

In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris to binary
# classification problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data = iris,
  family = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(predict(model, type = "response") > 0.5),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate model
# performance
cat(
  "Accuracy", accuracy(
    actual = actual,
    predicted = predicted
  ),
  "Accuracy (weighted)", weighted.accuracy(
    actual = actual,
    predicted = predicted,
    w = iris$Petal.Length / mean(iris$Petal.Length)
  ),
  sep = "\n"
)
The auc()-function calculates the area under the curve.
## S3 method for class 'numeric'
auc(y, x, method = 0L, presorted = TRUE, ...)

## Generic S3 method
auc(y, x, method = 0, presorted = TRUE, ...)
y | A <numeric> vector of y-values. |
x | A <numeric> vector of x-values, the same length as y. |
method | A <numeric> value (default: 0). Defines the method for calculating the area: 0 for the trapezoidal rule, 1 for the step-function (rectangular) method. |
presorted | A <logical>-value of length 1 (default: TRUE). If TRUE the input will not be sorted by threshold; if FALSE it is sorted before the area is calculated. |
... | Arguments passed into other methods. |
A <numeric> vector of length 1
Trapezoidal rule

The trapezoidal rule approximates the integral of a function $f$ between $a$ and $b$ using trapezoids formed between consecutive points. If we have points $x_1 < x_2 < \dots < x_n$ (with $a = x_1$ and $b = x_n$) and corresponding function values $y_i = f(x_i)$, the area under the curve is approximated by:

$$\int_a^b f(x)\,dx \approx \sum_{i=1}^{n-1} \frac{(x_{i+1} - x_i)\,(y_i + y_{i+1})}{2}$$
Step-function method

The step-function (rectangular) method uses the value of the function at one endpoint of each subinterval to form rectangles. With the same partition $x_1 < x_2 < \dots < x_n$, the rectangular approximation can be written as:

$$\int_a^b f(x)\,dx \approx \sum_{i=1}^{n-1} (x_{i+1} - x_i)\, y_i$$
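Both approximations can be checked numerically. The following standalone Python sketch (not the package's auc() implementation) integrates sin(x) over [0, π], whose exact area is 2:

```python
import math

# Sorted partition x_1 < ... < x_n and function values y_i = f(x_i)
xs = [i * math.pi / 199 for i in range(200)]
ys = [math.sin(x) for x in xs]

# Trapezoidal rule: sum of (x_{i+1} - x_i) * (y_i + y_{i+1}) / 2
auc_trapezoid = sum(
    (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2 for i in range(199)
)

# Step-function (rectangular) rule: left-endpoint rectangles
auc_step = sum((xs[i + 1] - xs[i]) * ys[i] for i in range(199))
```

With 200 points both estimates land close to 2; the trapezoidal rule converges faster on smooth curves.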
Other Tools:
cov.wt.matrix(), preorder(), presort()
## 1) Ordered x and y pair
x <- seq(0, pi, length.out = 200)
y <- sin(x)

## 1.1) calculate area
ordered_auc <- auc(y = y, x = x)

## 2) Unordered x and y pair
x <- sample(seq(0, pi, length.out = 200))
y <- sin(x)

## 2.1) calculate area
unordered_auc <- auc(y = y, x = x)

## 2.2) calculate area with explicit
## ordering
unordered_auc_flag <- auc(
  y = y,
  x = x,
  presorted = FALSE
)

## 3) display result
cat(
  "AUC (ordered x and y pair)", ordered_auc,
  "AUC (unordered x and y pair)", unordered_auc,
  "AUC (unordered x and y pair, with unordered flag)", unordered_auc_flag,
  sep = "\n"
)
A generic function for the (normalized) balanced accuracy. Use weighted.baccuracy() for the weighted balanced accuracy.
## S3 method for class 'factor'
baccuracy(actual, predicted, adjust = FALSE, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.baccuracy(actual, predicted, w, adjust = FALSE, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
baccuracy(x, adjust = FALSE, na.rm = TRUE, ...)

## Generic S3 method
baccuracy(..., adjust = FALSE, na.rm = TRUE)

## Generic S3 method
weighted.baccuracy(..., w, adjust = FALSE, na.rm = TRUE)
actual | A vector of <factor> values with the actual classes. |
predicted | A vector of <factor> values with the predicted classes. |
adjust | A <logical> value (default: FALSE). If TRUE the metric is adjusted for random chance. |
na.rm | A <logical> value (default: TRUE). If TRUE the calculation of the metric is based on valid classes only. |
... | Arguments passed into other methods |
w | A <numeric>-vector of sample weights, the same length as actual. |
x | A confusion matrix created with cmatrix(). |
Let $\overline{acc}$ be the proportion of correctly predicted classes. If adjust == FALSE, the balanced accuracy of the classifier is calculated as,

$$\overline{acc} = \frac{\text{sensitivity} + \text{specificity}}{2}$$

otherwise,

$$\overline{acc}_{adjusted} = \frac{\overline{acc} - \frac{1}{k}}{1 - \frac{1}{k}}$$

Where:

$k$ is the number of classes,
$\text{sensitivity}$ is the overall sensitivity, and
$\text{specificity}$ is the overall specificity.
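A minimal numeric illustration of the two formulas above (a standalone Python sketch with hypothetical counts, not the package's implementation):

```python
# Hypothetical binary counts; sensitivity = TP/(TP+FN), specificity = TN/(TN+FP)
TP, TN, FP, FN = 40, 30, 20, 10

sensitivity = TP / (TP + FN)                          # 0.8
specificity = TN / (TN + FP)                          # 0.6
balanced_accuracy = (sensitivity + specificity) / 2   # 0.7

# Chance-adjusted version with k = 2 classes; random guessing scores 0
k = 2
adjusted = (balanced_accuracy - 1 / k) / (1 - 1 / k)  # 0.4
```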
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:

## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C

Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:

## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C

In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), accuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris to binary
# classification problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data = iris,
  family = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(predict(model, type = "response") > 0.5),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate the
# model
cat(
  "Balanced accuracy", baccuracy(
    actual = actual,
    predicted = predicted
  ),
  "Balanced accuracy (weighted)", weighted.baccuracy(
    actual = actual,
    predicted = predicted,
    w = iris$Petal.Length / mean(iris$Petal.Length)
  ),
  sep = "\n"
)
This dataset contains features extracted from the wavelet transform of banknote images, which are used to classify banknotes as authentic or inauthentic. The data originates from the UCI Machine Learning Repository.
data(banknote)
A list with two components:

A data frame with 4 variables: variance, skewness, curtosis, and entropy.

A factor with levels "inauthentic" and "authentic" representing the banknote's authenticity.
The data is provided as a list with two components:

A data frame containing the following variables:

variance: Variance of the wavelet transformed image.
skewness: Skewness of the wavelet transformed image.
curtosis: Curtosis of the wavelet transformed image.
entropy: Entropy of the image.

A factor indicating the authenticity of the banknote. The factor has two levels:

inauthentic: Indicates the banknote is not genuine.
authentic: Indicates the banknote is genuine.
https://archive.ics.uci.edu/dataset/267/banknote+authentication
A generic function for the concordance correlation coefficient. Use weighted.ccc() for the weighted concordance correlation coefficient.
## S3 method for class 'numeric'
ccc(actual, predicted, correction = FALSE, ...)

## S3 method for class 'numeric'
weighted.ccc(actual, predicted, w, correction = FALSE, ...)

ccc(..., correction = FALSE)

weighted.ccc(..., w, correction = FALSE)
actual | A <numeric>-vector of length n. The actual values. |
predicted | A <numeric>-vector of length n. The predicted values. |
correction | A <logical> vector of length 1 (default: FALSE). If TRUE the standard deviations are adjusted for bias. |
... | Arguments passed into other methods. |
w | A <numeric>-vector of length n. The sample weights. |
A <numeric> vector of length 1.
Let $\rho_c$ measure the agreement between the actual values $y$ and the predicted values $\hat{y}$. The classifier agreement is calculated as,

$$\rho_c = \frac{2 \rho \sigma_y \sigma_{\hat{y}}}{\sigma_y^2 + \sigma_{\hat{y}}^2 + (\mu_y - \mu_{\hat{y}})^2}$$

Where:

$\rho$ is the Pearson correlation coefficient,
$\sigma_y$ is the unbiased standard deviation of $y$,
$\sigma_{\hat{y}}$ is the unbiased standard deviation of $\hat{y}$,
$\mu_y$ is the mean of $y$, and
$\mu_{\hat{y}}$ is the mean of $\hat{y}$.

If correction == TRUE, each $\sigma$ is adjusted by $1 - \frac{1}{n}$.
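A minimal numeric illustration of the formula above (a standalone Python sketch with hypothetical vectors, not the package's implementation; note that $\rho \sigma_y \sigma_{\hat{y}}$ equals the covariance, which is used directly here):

```python
# Hypothetical actual/predicted values
actual    = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.2, 1.9, 3.1, 4.2, 4.8]

n = len(actual)
mu_a = sum(actual) / n
mu_p = sum(predicted) / n

# Unbiased (n - 1) variances and covariance
var_a = sum((a - mu_a) ** 2 for a in actual) / (n - 1)
var_p = sum((p - mu_p) ** 2 for p in predicted) / (n - 1)
cov   = sum((a - mu_a) * (p - mu_p) for a, p in zip(actual, predicted)) / (n - 1)

# CCC = 2 * rho * sd_a * sd_p / (var_a + var_p + (mu_a - mu_p)^2)
ccc = 2 * cov / (var_a + var_p + (mu_a - mu_p) ** 2)
```

For these nearly identical vectors the coefficient is close to its maximum of 1.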
Other Regression:
huberloss.numeric(), mae.numeric(), mape.numeric(), mpe.numeric(), mse.numeric(), pinball.numeric(), rae.numeric(), rmse.numeric(), rmsle.numeric(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance
cat(
  "Concordance Correlation Coefficient", ccc(
    actual = actual,
    predicted = predicted,
    correction = FALSE
  ),
  "Concordance Correlation Coefficient (corrected)", ccc(
    actual = actual,
    predicted = predicted,
    correction = TRUE
  ),
  "Concordance Correlation Coefficient (weighted)", weighted.ccc(
    actual = actual,
    predicted = predicted,
    w = mtcars$mpg / mean(mtcars$mpg),
    correction = FALSE
  ),
  sep = "\n"
)
A generic function for Cohen's $\kappa$-statistic. Use weighted.ckappa() for the weighted $\kappa$-statistic.
## S3 method for class 'factor'
ckappa(actual, predicted, beta = 0, ...)

## S3 method for class 'factor'
weighted.ckappa(actual, predicted, w, beta = 0, ...)

## S3 method for class 'cmatrix'
ckappa(x, beta = 0, ...)

ckappa(..., beta = 0)

weighted.ckappa(..., w, beta = 0)
actual | A vector of <factor> values with the actual classes. |
predicted | A vector of <factor> values with the predicted classes. |
beta | A <numeric> value of length 1 (default: 0). If different from 0, the off-diagonals of the confusion matrix are penalized before the statistic is calculated. |
... | Arguments passed into other methods |
w | A <numeric>-vector of sample weights, the same length as actual. |
x | A confusion matrix created with cmatrix(). |
A <numeric>-vector of length 1
Let $\kappa$ be the inter-rater (intra-rater) reliability. The inter-rater (intra-rater) reliability is calculated as,

$$\kappa = \frac{\rho_p - \rho_e}{1 - \rho_e}$$

Where:

$\rho_p$ is the empirical probability of agreement between predicted and actual values, and
$\rho_e$ is the expected probability of agreement under random chance.

If $\beta \neq 0$, the off-diagonals in the confusion matrix are penalized before $\kappa$ is calculated. More formally,

$$X = X \circ Y^{\beta}$$

Where:

$X$ is the confusion matrix,
$Y$ is the penalizing matrix, and
$\beta$ is the penalizing factor.
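A minimal numeric illustration of the unpenalized $\kappa$ formula above (a standalone Python sketch with a hypothetical confusion matrix, not the package's implementation):

```python
# Hypothetical 2x2 confusion matrix (rows = actual, cols = predicted)
cm = [[40, 10],
      [5, 45]]

n = sum(map(sum, cm))                                        # total observations
p_observed = sum(cm[i][i] for i in range(2)) / n             # empirical agreement
row_totals = [sum(cm[i]) for i in range(2)]
col_totals = [cm[0][j] + cm[1][j] for j in range(2)]
p_expected = sum(row_totals[i] * col_totals[i] for i in range(2)) / n ** 2

# kappa = (p_o - p_e) / (1 - p_e)
kappa = (p_observed - p_expected) / (1 - p_expected)
```

Here the observed agreement is 0.85 against a chance agreement of 0.5, giving $\kappa = 0.7$.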
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:

## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C

Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:

## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C

In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris to binary
# classification problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data = iris,
  family = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(predict(model, type = "response") > 0.5),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate model performance with
# Cohen's Kappa statistic
cat(
  "Kappa", ckappa(
    actual = actual,
    predicted = predicted
  ),
  "Kappa (penalized)", ckappa(
    actual = actual,
    predicted = predicted,
    beta = 2
  ),
  "Kappa (weighted)", weighted.ckappa(
    actual = actual,
    predicted = predicted,
    w = iris$Petal.Length / mean(iris$Petal.Length)
  ),
  sep = "\n"
)
The cmatrix()-function uses cross-classifying factors to build a confusion matrix of the counts at each combination of the factor levels. Each row of the matrix represents the actual factor levels, while each column represents the predicted factor levels.
## S3 method for class 'factor'
cmatrix(actual, predicted, ...)

## S3 method for class 'factor'
weighted.cmatrix(actual, predicted, w, ...)

## Generic S3 method
cmatrix(actual, predicted, ...)

## Generic S3 method
weighted.cmatrix(actual, predicted, w, ...)
actual | A vector of <factor> values with the actual classes. |
predicted | A vector of <factor> values with the predicted classes. |
... | Arguments passed into other methods. |
w | A <numeric>-vector of length n. The sample weights. |
A named k x k <matrix>

There is no robust defensive measure against mis-specifying the confusion matrix. If the arguments are correctly specified, the resulting confusion matrix is of the form:
A (Predicted) | B (Predicted) | |
A (Actual) | Value | Value |
B (Actual) | Value | Value |
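The cross-tabulation behind this layout can be sketched as follows (a standalone Python illustration with hypothetical labels, not the package's C++ implementation):

```python
# Cross-tabulate actual vs predicted labels into a confusion matrix
labels = ["A", "B"]
actual    = ["A", "A", "B", "B", "A"]
predicted = ["A", "B", "B", "B", "A"]

index = {label: i for i, label in enumerate(labels)}
cm = [[0] * len(labels) for _ in labels]
for a, p in zip(actual, predicted):
    cm[index[a]][index[p]] += 1   # rows: actual, columns: predicted
```

Each cell cm[i][j] counts how often class i was observed while class j was predicted.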
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:

## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C

Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:

## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C

In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris to binary
# classification problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data = iris,
  family = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(predict(model, type = "response") > 0.5),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) summarise performance
# in a confusion matrix

# 4.1) unweighted matrix
confusion_matrix <- cmatrix(
  actual = actual,
  predicted = predicted
)

# 4.1.1) summarise matrix
summary(confusion_matrix)

# 4.1.2) plot confusion
# matrix
plot(confusion_matrix)

# 4.2) weighted matrix
confusion_matrix <- weighted.cmatrix(
  actual = actual,
  predicted = predicted,
  w = iris$Petal.Length / mean(iris$Petal.Length)
)

# 4.2.1) summarise matrix
summary(confusion_matrix)

# 4.2.2) plot confusion
# matrix
plot(confusion_matrix)
A generic function for the diagnostic odds ratio in classification tasks. Use weighted.dor() for the weighted diagnostic odds ratio.
## S3 method for class 'factor'
dor(actual, predicted, ...)

## S3 method for class 'factor'
weighted.dor(actual, predicted, w, ...)

## S3 method for class 'cmatrix'
dor(x, ...)

## Generic S3 method
dor(...)

## Generic S3 method
weighted.dor(..., w)
actual | A vector of <factor> values with the actual classes. |
predicted | A vector of <factor> values with the predicted classes. |
... | Arguments passed into other methods |
w | A <numeric>-vector of sample weights, the same length as actual. |
x | A confusion matrix created with cmatrix(). |
A <numeric>-vector of length 1
Let $DOR$ be the effectiveness of the classifier. The diagnostic odds ratio of the classifier is calculated as,

$$DOR = \frac{\#TP \times \#TN}{\#FP \times \#FN}$$

Where:

$\#TP$ is the number of true positives,
$\#TN$ is the number of true negatives,
$\#FP$ is the number of false positives, and
$\#FN$ is the number of false negatives.
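A minimal numeric illustration of the formula above (a standalone Python sketch with hypothetical counts, not the package's implementation):

```python
# Hypothetical counts; DOR = (TP/FP) / (FN/TN), which equals (TP * TN) / (FP * FN)
TP, TN, FP, FN = 40, 30, 10, 20

dor = (TP * TN) / (FP * FN)
print(dor)  # 6.0
```

A value above 1 indicates that a positive prediction raises the odds of the condition; here a positive test multiplies the odds by 6.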
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:

## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C

Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:

## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C

In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris to binary
# classification problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data = iris,
  family = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(predict(model, type = "response") > 0.5),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate model performance
# with Diagnostic Odds Ratio
cat("Diagnostic Odds Ratio", sep = "\n")
dor(
  actual = actual,
  predicted = predicted
)

cat("Diagnostic Odds Ratio (weighted)", sep = "\n")
weighted.dor(
  actual = actual,
  predicted = predicted,
  w = iris$Petal.Length / mean(iris$Petal.Length)
)
The entropy()-function calculates the entropy of given probability distributions.
## S3 method for class 'matrix' entropy(pk, dim = 0L, base = -1, ...) ## S3 method for class 'matrix' relative.entropy(pk, qk, dim = 0L, base = -1, ...) ## S3 method for class 'matrix' cross.entropy(pk, qk, dim = 0L, base = -1, ...) ## Generic S3 method entropy( pk, dim = 0, base = -1, ... ) ## Generic S3 method relative.entropy( pk, qk, dim = 0, base = -1, ... ) ## Generic S3 method cross.entropy( pk, qk, dim = 0, base = -1, ... )
pk |
A |
dim |
An <integer> value of length 1 (Default: 0). Defines the dimension along which to calculate the entropy (0: total, 1: row-wise, 2: column-wise). |
base |
A <numeric> value of length 1 (Default: -1). The logarithmic base to use. Default value specifies natural logarithms. |
... |
Arguments passed into other methods |
qk |
A |
A <numeric> value or vector:

A single <numeric> value (length 1) if dim == 0.

A <numeric> vector with length equal to the number of rows if dim == 1.

A <numeric> vector with length equal to the number of columns if dim == 2.
Entropy: H(pk) = -sum_i pk_i * log(pk_i)

Cross Entropy: H(pk, qk) = -sum_i pk_i * log(qk_i)

Relative Entropy (Kullback-Leibler divergence): D_KL(pk || qk) = sum_i pk_i * log(pk_i / qk_i)
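The three quantities can be sanity-checked in a few lines of base R (a sketch; the pk/qk values mirror the example below, and the package's entropy(), cross.entropy() and relative.entropy() should agree up to floating-point error):

```r
# actual and estimated probability distributions
pk <- c(1/2, 1/2)
qk <- c(9/10, 1/10)

# Shannon entropy: H(pk) = -sum(pk * log(pk))
H <- -sum(pk * log(pk))

# cross entropy: H(pk, qk) = -sum(pk * log(qk))
CE <- -sum(pk * log(qk))

# relative entropy (KL divergence): D(pk || qk) = sum(pk * log(pk / qk))
KL <- sum(pk * log(pk / qk))

# the KL divergence is exactly the gap between
# cross entropy and entropy
all.equal(KL, CE - H)
```

Note the identity used in the last line: D_KL(pk || qk) = H(pk, qk) - H(pk).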
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) Define actual
# and observed probabilities

# 1.1) actual probabilities
pk <- matrix(
  cbind(1/2, 1/2),
  ncol = 2
)

# 1.2) observed (estimated) probabilities
qk <- matrix(
  cbind(9/10, 1/10),
  ncol = 2
)

# 2) calculate
# Entropy
cat(
  "Entropy", entropy(pk),
  "Relative Entropy", relative.entropy(pk, qk),
  "Cross Entropy", cross.entropy(pk, qk),
  sep = "\n"
)
A generic function for the F-beta score. Use weighted.fbeta() for the weighted F-beta score.
## S3 method for class 'factor'
fbeta(actual, predicted, beta = 1, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.fbeta(actual, predicted, w, beta = 1, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
fbeta(x, beta = 1, micro = NULL, na.rm = TRUE, ...)

## Generic S3 method
fbeta(..., beta = 1, micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.fbeta(..., w, beta = 1, micro = NULL, na.rm = TRUE)
actual |
|
predicted |
|
beta |
|
micro |
A <logical>-value of length |
na.rm |
A <logical> value of length |
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
If micro is NULL (the default), a named <numeric>-vector of length k.

If micro is TRUE or FALSE, a <numeric>-vector of length 1.
Let F_beta be the F-beta score, which is a weighted harmonic mean of precision and recall. The F-beta score of the classifier is calculated as,

F_beta = (1 + beta^2) * (Precision * Recall) / (beta^2 * Precision + Recall)

Substituting Precision = TP / (TP + FP) and Recall = TP / (TP + FN) yields:

F_beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP)

Where:

TP is the number of true positives,

FP is the number of false positives,

FN is the number of false negatives, and

beta is a non-negative real number that determines the relative importance of precision vs. recall in the score.
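The equivalence of the harmonic-mean form and the count form is easy to verify in base R (the confusion counts below are hypothetical, chosen only for illustration):

```r
# hypothetical confusion counts for one class
tp <- 20; fp <- 5; fn <- 10
beta <- 1

precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)

# harmonic-mean form
f_hm <- (1 + beta^2) * precision * recall / (beta^2 * precision + recall)

# count form, after substitution
f_ct <- (1 + beta^2) * tp / ((1 + beta^2) * tp + beta^2 * fn + fp)

all.equal(f_hm, f_ct)
```

With beta = 1 both forms reduce to the familiar F1-score.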
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:

## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C

Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:

## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C

In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate class-wise performance
# using F1-score

# 4.1) unweighted F1-score
fbeta(
  actual    = actual,
  predicted = predicted,
  beta      = 1
)

# 4.2) weighted F1-score
weighted.fbeta(
  actual    = actual,
  predicted = predicted,
  w         = iris$Petal.Length / mean(iris$Petal.Length),
  beta      = 1
)

# 5) evaluate overall performance
# using micro-averaged F1-score
cat(
  "Micro-averaged F1-score",
  fbeta(
    actual    = actual,
    predicted = predicted,
    beta      = 1,
    micro     = TRUE
  ),
  "Micro-averaged F1-score (weighted)",
  weighted.fbeta(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length / mean(iris$Petal.Length),
    beta      = 1,
    micro     = TRUE
  ),
  sep = "\n"
)
A generic function for the False Discovery Rate. Use weighted.fdr()
for the weighted False Discovery Rate.
## S3 method for class 'factor'
fdr(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.fdr(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
fdr(x, micro = NULL, na.rm = TRUE, ...)

## Generic S3 method
fdr(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.fdr(..., w, micro = NULL, na.rm = TRUE)
actual |
|
predicted |
|
micro |
A <logical>-value of length |
na.rm |
A <logical> value of length |
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
If micro is NULL (the default), a named <numeric>-vector of length k.

If micro is TRUE or FALSE, a <numeric>-vector of length 1.
Let FDR be the proportion of false positives among the predicted positives. The false discovery rate of the classifier is calculated as,

FDR = FP / (TP + FP)

Where:

TP is the number of true positives, and

FP is the number of false positives.
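Because FDR is the complement of precision, the definition can be checked directly (the counts below are hypothetical, chosen only for illustration):

```r
# hypothetical confusion counts for one class
tp <- 20; fp <- 5

fdr_value <- fp / (tp + fp)   # false discovery rate
precision <- tp / (tp + fp)   # positive predictive value

# FDR = 1 - Precision
all.equal(fdr_value, 1 - precision)
```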
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:

## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C

Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:

## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C

In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate class-wise performance
# using False Discovery Rate

# 4.1) unweighted False Discovery Rate
fdr(
  actual    = actual,
  predicted = predicted
)

# 4.2) weighted False Discovery Rate
weighted.fdr(
  actual    = actual,
  predicted = predicted,
  w         = iris$Petal.Length / mean(iris$Petal.Length)
)

# 5) evaluate overall performance
# using micro-averaged False Discovery Rate
cat(
  "Micro-averaged False Discovery Rate",
  fdr(
    actual    = actual,
    predicted = predicted,
    micro     = TRUE
  ),
  "Micro-averaged False Discovery Rate (weighted)",
  weighted.fdr(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length / mean(iris$Petal.Length),
    micro     = TRUE
  ),
  sep = "\n"
)
A generic function for the false omission rate. Use weighted.fer() for the weighted false omission rate.
## S3 method for class 'factor'
fer(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.fer(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
fer(x, micro = NULL, na.rm = TRUE, ...)

## Generic S3 method
fer(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.fer(..., w, micro = NULL, na.rm = TRUE)
actual |
|
predicted |
|
micro |
A <logical>-value of length |
na.rm |
A <logical> value of length |
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
If micro is NULL (the default), a named <numeric>-vector of length k.

If micro is TRUE or FALSE, a <numeric>-vector of length 1.
Let FOR be the proportion of false negatives among the predicted negatives. The false omission rate of the classifier is calculated as,

FOR = FN / (TN + FN)

Where:

TN is the number of true negatives, and

FN is the number of false negatives.
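The false omission rate is the complement of the negative predictive value, which can be checked directly (the counts below are hypothetical, chosen only for illustration):

```r
# hypothetical confusion counts for one class
tn <- 30; fn <- 10

fer_value <- fn / (tn + fn)   # false omission rate
npv_value <- tn / (tn + fn)   # negative predictive value

# FOR = 1 - NPV
all.equal(fer_value, 1 - npv_value)
```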
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:

## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C

Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:

## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C

In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate class-wise performance
# using False Omission Rate

# 4.1) unweighted False Omission Rate
fer(
  actual    = actual,
  predicted = predicted
)

# 4.2) weighted False Omission Rate
weighted.fer(
  actual    = actual,
  predicted = predicted,
  w         = iris$Petal.Length / mean(iris$Petal.Length)
)

# 5) evaluate overall performance
# using micro-averaged False Omission Rate
cat(
  "Micro-averaged False Omission Rate",
  fer(
    actual    = actual,
    predicted = predicted,
    micro     = TRUE
  ),
  "Micro-averaged False Omission Rate (weighted)",
  weighted.fer(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length / mean(iris$Petal.Length),
    micro     = TRUE
  ),
  sep = "\n"
)
The fmi()-function computes the Fowlkes-Mallows Index (FMI), a measure of the similarity between two sets of clusterings, between two vectors of predicted and observed factor() values.
## S3 method for class 'factor'
fmi(actual, predicted, ...)

## S3 method for class 'cmatrix'
fmi(x, ...)

## Generic S3 method
fmi(...)
actual |
|
predicted |
|
... |
Arguments passed into other methods |
x |
A confusion matrix created |
A <numeric>-vector of length 1
The metric is calculated for each class k as follows,

FMI_k = TP_k / sqrt((TP_k + FP_k) * (TP_k + FN_k))

Where TP_k, FP_k, and FN_k represent the number of true positives, false positives, and false negatives for each class k, respectively.
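The FMI for a class is the geometric mean of its precision and recall, which the definition above makes easy to verify (the counts below are hypothetical, chosen only for illustration):

```r
# hypothetical confusion counts for one class
tp <- 20; fp <- 5; fn <- 10

fmi_value <- tp / sqrt((tp + fp) * (tp + fn))

precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)

# FMI = sqrt(Precision * Recall)
all.equal(fmi_value, sqrt(precision * recall))
```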
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:

## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C

Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:

## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C

In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate model performance
# using Fowlkes Mallows Index
cat(
  "Fowlkes Mallows Index",
  fmi(
    actual    = actual,
    predicted = predicted
  ),
  sep = "\n"
)
A generic function for the False Positive Rate, also known as Fallout. Use weighted.fpr() for the weighted False Positive Rate.
## S3 method for class 'factor'
fpr(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.fpr(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
fpr(x, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
fallout(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.fallout(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
fallout(x, micro = NULL, na.rm = TRUE, ...)

## Generic S3 method
fpr(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
fallout(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.fpr(..., w, micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.fallout(..., w, micro = NULL, na.rm = TRUE)
actual |
|
predicted |
|
micro |
A <logical>-value of length |
na.rm |
A <logical> value of length |
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
If micro is NULL (the default), a named <numeric>-vector of length k.

If micro is TRUE or FALSE, a <numeric>-vector of length 1.
Let FPR be the proportion of false positives among the actual negatives. The false positive rate of the classifier is calculated as,

FPR = FP / (TN + FP)

Where:

TN is the number of true negatives, and

FP is the number of false positives.
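The false positive rate is the complement of specificity (the true negative rate), which follows directly from the definition (the counts below are hypothetical, chosen only for illustration):

```r
# hypothetical confusion counts for one class
tn <- 30; fp <- 5

fpr_value         <- fp / (tn + fp)   # false positive rate
specificity_value <- tn / (tn + fp)   # true negative rate

# FPR = 1 - Specificity
all.equal(fpr_value, 1 - specificity_value)
```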
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:

## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C

Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:

## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C

In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate class-wise performance
# using False Positive Rate

# 4.1) unweighted False Positive Rate
fpr(
  actual    = actual,
  predicted = predicted
)

# 4.2) weighted False Positive Rate
weighted.fpr(
  actual    = actual,
  predicted = predicted,
  w         = iris$Petal.Length / mean(iris$Petal.Length)
)

# 5) evaluate overall performance
# using micro-averaged False Positive Rate
cat(
  "Micro-averaged False Positive Rate",
  fpr(
    actual    = actual,
    predicted = predicted,
    micro     = TRUE
  ),
  "Micro-averaged False Positive Rate (weighted)",
  weighted.fpr(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length / mean(iris$Petal.Length),
    micro     = TRUE
  ),
  sep = "\n"
)
The huberloss()-function computes the Huber Loss between the predicted and observed <numeric> vectors. The weighted.huberloss()-function computes the weighted Huber Loss.
## S3 method for class 'numeric'
huberloss(actual, predicted, delta = 1, ...)

## S3 method for class 'numeric'
weighted.huberloss(actual, predicted, w, delta = 1, ...)

## Generic S3 method
huberloss(actual, predicted, delta = 1, ...)

## Generic S3 method
weighted.huberloss(actual, predicted, w, delta = 1, ...)
actual |
A <numeric>-vector of length |
predicted |
A <numeric>-vector of length |
delta |
A <numeric>-vector of length |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length |
A <numeric> vector of length 1.
The metric is calculated as follows,

0.5 * (actual_i - predicted_i)^2 if |actual_i - predicted_i| <= delta

and

delta * (|actual_i - predicted_i| - 0.5 * delta) otherwise,

where actual_i and predicted_i are the actual and predicted values respectively. If w is not NULL, then all values are aggregated using the weights.
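The piecewise definition above can be sketched in a few lines of base R (a reference sketch, not the package's C++ routine; the unweighted package function should agree up to floating-point error):

```r
# reference Huber loss: quadratic near zero, linear in the tails
huber_ref <- function(actual, predicted, delta = 1) {
  a <- abs(actual - predicted)
  loss <- ifelse(
    a <= delta,
    0.5 * a^2,                    # quadratic regime
    delta * (a - 0.5 * delta)     # linear regime
  )
  mean(loss)
}

# small residual (0.5) is squared, large residual (3) is linearised
huber_ref(actual = c(0, 0), predicted = c(0.5, 3), delta = 1)
```

The linear regime is what makes the Huber loss less sensitive to outliers than the squared error, while remaining smooth at the threshold delta.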
Other Regression:
ccc.numeric(), mae.numeric(), mape.numeric(), mpe.numeric(), mse.numeric(), pinball.numeric(), rae.numeric(), rmse.numeric(), rmsle.numeric(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual <- mtcars$mpg
predicted <- fitted(model)

# 2) calculate the metric
# with delta 0.5
huberloss(
  actual = actual,
  predicted = predicted,
  delta = 0.5
)

# 3) calculate the weighted
# metric using arbitrary weights
w <- rbeta(
  n = length(actual),
  shape1 = 10,
  shape2 = 2
)

weighted.huberloss(
  actual = actual,
  predicted = predicted,
  w = w,
  delta = 0.5
)
The jaccard()-function computes the Jaccard Index, also known as the Intersection over Union, between two vectors of predicted and observed factor() values. The weighted.jaccard() function computes the weighted Jaccard Index.
## S3 method for class 'factor'
jaccard(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.jaccard(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
jaccard(x, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
csi(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.csi(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
csi(x, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
tscore(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.tscore(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
tscore(x, micro = NULL, na.rm = TRUE, ...)

## Generic S3 method
jaccard(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
csi(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
tscore(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.jaccard(..., w, micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.csi(..., w, micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.tscore(..., w, micro = NULL, na.rm = TRUE)
actual |
A vector of <factor> values of length n, with k levels |
predicted |
A vector of <factor> values of length n, with k levels |
micro |
A <logical>-value of length 1 (default: NULL) |
na.rm |
A <logical> value of length 1 (default: TRUE) |
... |
Arguments passed into other methods |
w |
A <numeric>-vector of length n |
x |
A confusion matrix created with cmatrix() |
If micro is NULL (the default), a named <numeric>-vector of length k. If micro is TRUE or FALSE, a <numeric>-vector of length 1.
The metric is calculated for each class k as follows,

Jaccard_k = TP_k / (TP_k + FP_k + FN_k)

Where TP_k, FP_k, and FN_k represent the number of true positives, false positives, and false negatives for each class k, respectively.
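As a quick illustrative sketch (not the package implementation), the per-class formula can be read off a confusion table in base R; jaccard_sketch is a hypothetical helper:

```r
# Hypothetical base-R sketch of the per-class Jaccard Index:
# TP / (TP + FP + FN), computed from a confusion table.
jaccard_sketch <- function(actual, predicted) {
  cm <- table(actual, predicted)  # rows: actual, columns: predicted
  tp <- diag(cm)
  fp <- colSums(cm) - tp
  fn <- rowSums(cm) - tp
  tp / (tp + fp + fn)
}
```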
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the levels argument.
Other Classification: ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning: ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data = iris,
  family = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate class-wise performance
# using Jaccard Index

# 4.1) unweighted Jaccard Index
jaccard(
  actual = actual,
  predicted = predicted
)

# 4.2) weighted Jaccard Index
weighted.jaccard(
  actual = actual,
  predicted = predicted,
  w = iris$Petal.Length / mean(iris$Petal.Length)
)

# 5) evaluate overall performance
# using micro-averaged Jaccard Index
cat(
  "Micro-averaged Jaccard Index",
  jaccard(
    actual = actual,
    predicted = predicted,
    micro = TRUE
  ),
  "Micro-averaged Jaccard Index (weighted)",
  weighted.jaccard(
    actual = actual,
    predicted = predicted,
    w = iris$Petal.Length / mean(iris$Petal.Length),
    micro = TRUE
  ),
  sep = "\n"
)
The logloss()
function computes the Log Loss between observed classes (as a <factor>) and their predicted probability distributions (a <numeric> matrix). The weighted.logloss()
function is the weighted version, applying observation-specific weights.
## S3 method for class 'factor'
logloss(actual, response, normalize = TRUE, ...)

## S3 method for class 'factor'
weighted.logloss(actual, response, w, normalize = TRUE, ...)

## S3 method for class 'integer'
logloss(actual, response, normalize = TRUE, ...)

## S3 method for class 'integer'
weighted.logloss(actual, response, w, normalize = TRUE, ...)

## Generic S3 method
logloss(actual, response, normalize = TRUE, ...)

## Generic S3 method
weighted.logloss(actual, response, w, normalize = TRUE, ...)
actual |
A vector of <factor> values of length n, with k levels |
response |
A <numeric> matrix of dimension n x k of predicted class probabilities |
normalize |
A <logical>-value (default: TRUE). If TRUE, the mean cross-entropy across all observations is returned; otherwise, the sum of cross-entropies is returned. |
... |
Arguments passed into other methods |
w |
A <numeric>-vector of length n |

A <numeric> vector of length 1.

The metric is calculated as,

Log Loss = -(1/N) * Σ_i Σ_j y_ij * log(p_ij)

(the sum over observations is returned instead of the mean when normalize = FALSE), where:

y_ij encodes the actual values, with y_ij = 1 if the i-th sample belongs to class j, and 0 otherwise.

p_ij is the estimated probability for the i-th sample belonging to class j.
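As an illustrative sketch (not the package implementation), the formula can be mirrored in base R; logloss_sketch is a hypothetical helper and assumes the columns of response follow the levels of actual:

```r
# Hypothetical base-R sketch of Log Loss: pick the predicted probability
# of each observation's true class, then average (or sum) -log(p).
logloss_sketch <- function(actual, response, normalize = TRUE) {
  p <- response[cbind(seq_along(actual), as.integer(actual))]
  if (normalize) mean(-log(p)) else sum(-log(p))
}
```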
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the levels argument.
Other Classification: ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning: ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) Recode the iris data set to a binary classification problem.
# Here, the positive class ("Virginica") is coded as 1,
# and the rest ("Others") is coded as 0.
iris$species_num <- as.numeric(iris$Species == "virginica")

# 2) Fit a logistic regression model predicting species_num
# from Sepal.Length & Sepal.Width
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data = iris,
  family = binomial(link = "logit")
)

# 3) Generate predicted classes: "Virginica" vs. "Others"
predicted <- factor(
  as.numeric(predict(model, type = "response") > 0.5),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) Generate actual classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# For Log Loss, we need predicted probabilities for each class.
# Since it's a binary model, we create a 2-column matrix:
#   1st column = P("Virginica")
#   2nd column = P("Others") = 1 - P("Virginica")
predicted_probs <- predict(model, type = "response")
response_matrix <- cbind(predicted_probs, 1 - predicted_probs)

# 4) Evaluate unweighted Log Loss.
# 'logloss' takes (actual, response, normalize = TRUE/FALSE).
# The factor 'actual' must have the positive class (Virginica) as its first level.
unweighted_LogLoss <- logloss(
  actual = actual,            # factor
  response = response_matrix, # numeric matrix of probabilities
  normalize = TRUE
)

# 5) Evaluate weighted Log Loss.
# We introduce a weight vector, for example:
weights <- iris$Petal.Length / mean(iris$Petal.Length)

weighted_LogLoss <- weighted.logloss(
  actual = actual,
  response = response_matrix,
  w = weights,
  normalize = TRUE
)

# 6) Print Results
cat(
  "Unweighted Log Loss:",
  unweighted_LogLoss,
  "Weighted Log Loss:",
  weighted_LogLoss,
  sep = "\n"
)
The mae()-function computes the mean absolute error between the observed and predicted <numeric> vectors. The weighted.mae() function computes the weighted mean absolute error.
## S3 method for class 'numeric'
mae(actual, predicted, ...)

## S3 method for class 'numeric'
weighted.mae(actual, predicted, w, ...)

## Generic S3 method
mae(actual, predicted, ...)

## Generic S3 method
weighted.mae(actual, predicted, w, ...)
actual |
A <numeric>-vector of length n |
predicted |
A <numeric>-vector of length n |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length n |
A <numeric> vector of length 1.
The metric is calculated as follows,

MAE = (1/n) * Σ |y_i - ŷ_i|

where y_i and ŷ_i are the actual and predicted values respectively.
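The formula corresponds to a one-line computation in base R (illustrative sketch; mae_sketch is a hypothetical name):

```r
# Hypothetical base-R sketch of the Mean Absolute Error.
mae_sketch <- function(actual, predicted) mean(abs(actual - predicted))
```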
Other Regression: ccc.numeric(), huberloss.numeric(), mape.numeric(), mpe.numeric(), mse.numeric(), pinball.numeric(), rae.numeric(), rmse.numeric(), rmsle.numeric(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric()
Other Supervised Learning: ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Mean Absolute Error (MAE)
cat(
  "Mean Absolute Error",
  mae(
    actual = actual,
    predicted = predicted
  ),
  "Mean Absolute Error (weighted)",
  weighted.mae(
    actual = actual,
    predicted = predicted,
    w = mtcars$mpg / mean(mtcars$mpg)
  ),
  sep = "\n"
)
The mape()-function computes the mean absolute percentage error between the observed and predicted <numeric> vectors. The weighted.mape() function computes the weighted mean absolute percentage error.
## S3 method for class 'numeric'
mape(actual, predicted, ...)

## S3 method for class 'numeric'
weighted.mape(actual, predicted, w, ...)

## Generic S3 method
mape(actual, predicted, ...)

## Generic S3 method
weighted.mape(actual, predicted, w, ...)
actual |
A <numeric>-vector of length n |
predicted |
A <numeric>-vector of length n |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length n |
A <numeric> vector of length 1.
The metric is calculated as,

MAPE = (1/n) * Σ |y_i - ŷ_i| / |y_i|

where y_i and ŷ_i are the actual and predicted values respectively.
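The formula corresponds to a one-line computation in base R (illustrative sketch; mape_sketch is a hypothetical name):

```r
# Hypothetical base-R sketch of the Mean Absolute Percentage Error.
mape_sketch <- function(actual, predicted) mean(abs((actual - predicted) / actual))
```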
Other Regression: ccc.numeric(), huberloss.numeric(), mae.numeric(), mpe.numeric(), mse.numeric(), pinball.numeric(), rae.numeric(), rmse.numeric(), rmsle.numeric(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric()
Other Supervised Learning: ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Mean Absolute Percentage Error (MAPE)
cat(
  "Mean Absolute Percentage Error",
  mape(
    actual = actual,
    predicted = predicted
  ),
  "Mean Absolute Percentage Error (weighted)",
  weighted.mape(
    actual = actual,
    predicted = predicted,
    w = mtcars$mpg / mean(mtcars$mpg)
  ),
  sep = "\n"
)
The mcc()-function computes the Matthews Correlation Coefficient (MCC), also known as the φ-coefficient, between two vectors of predicted and observed factor() values. The weighted.mcc() function computes the weighted Matthews Correlation Coefficient.
## S3 method for class 'factor'
mcc(actual, predicted, ...)

## S3 method for class 'factor'
weighted.mcc(actual, predicted, w, ...)

## S3 method for class 'cmatrix'
mcc(x, ...)

## S3 method for class 'factor'
phi(actual, predicted, ...)

## S3 method for class 'factor'
weighted.phi(actual, predicted, w, ...)

## S3 method for class 'cmatrix'
phi(x, ...)

## Generic S3 method
mcc(...)

## Generic S3 method
weighted.mcc(..., w)

## Generic S3 method
phi(...)

## Generic S3 method
weighted.phi(..., w)
actual |
A vector of <factor> values of length n, with k levels |
predicted |
A vector of <factor> values of length n, with k levels |
... |
Arguments passed into other methods |
w |
A <numeric>-vector of length n |
x |
A confusion matrix created with cmatrix() |
A <numeric>-vector of length 1
The metric is calculated as follows,

MCC = (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

Where TP, TN, FP, and FN are the number of true positives, true negatives, false positives, and false negatives respectively.
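For the binary case, the formula can be sketched in base R (illustrative; mcc_sketch is hypothetical and assumes the positive class is the first factor level):

```r
# Hypothetical base-R sketch of the binary Matthews Correlation Coefficient.
mcc_sketch <- function(actual, predicted) {
  cm <- table(actual, predicted)  # 2 x 2, positive class first
  tp <- cm[1, 1]; fn <- cm[1, 2]; fp <- cm[2, 1]; tn <- cm[2, 2]
  (tp * tn - fp * fn) /
    sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
}
```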
Consider a classification problem with three classes: A, B, and C. The actual vector of factor() values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A, B, and C, respectively. Now, suppose your model does not predict any B's. The predicted vector of factor() values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the levels argument.
Other Classification: ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning: ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data = iris,
  family = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate performance
# using Matthews Correlation Coefficient
cat(
  "Matthews Correlation Coefficient",
  mcc(
    actual = actual,
    predicted = predicted
  ),
  "Matthews Correlation Coefficient (weighted)",
  weighted.mcc(
    actual = actual,
    predicted = predicted,
    w = iris$Petal.Length / mean(iris$Petal.Length)
  ),
  sep = "\n"
)
The mpe()-function computes the mean percentage error between the observed and predicted <numeric> vectors. The weighted.mpe() function computes the weighted mean percentage error.
## S3 method for class 'numeric'
mpe(actual, predicted, ...)

## S3 method for class 'numeric'
weighted.mpe(actual, predicted, w, ...)

## Generic S3 method
mpe(actual, predicted, ...)

## Generic S3 method
weighted.mpe(actual, predicted, w, ...)
actual |
A <numeric>-vector of length n |
predicted |
A <numeric>-vector of length n |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length n |
A <numeric> vector of length 1.
The metric is calculated as,

MPE = (1/n) * Σ (y_i - ŷ_i) / y_i

Where y_i and ŷ_i are the actual and predicted values respectively.
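The formula corresponds to a one-line computation in base R (illustrative sketch; mpe_sketch is a hypothetical name):

```r
# Hypothetical base-R sketch of the Mean Percentage Error.
mpe_sketch <- function(actual, predicted) mean((actual - predicted) / actual)
```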
Other Regression: ccc.numeric(), huberloss.numeric(), mae.numeric(), mape.numeric(), mse.numeric(), pinball.numeric(), rae.numeric(), rmse.numeric(), rmsle.numeric(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric()
Other Supervised Learning: ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Mean Percentage Error (MPE)
cat(
  "Mean Percentage Error",
  mpe(
    actual = actual,
    predicted = predicted
  ),
  "Mean Percentage Error (weighted)",
  weighted.mpe(
    actual = actual,
    predicted = predicted,
    w = mtcars$mpg / mean(mtcars$mpg)
  ),
  sep = "\n"
)
The mse()-function computes the mean squared error between the observed and predicted <numeric> vectors. The weighted.mse() function computes the weighted mean squared error.
## S3 method for class 'numeric'
mse(actual, predicted, ...)

## S3 method for class 'numeric'
weighted.mse(actual, predicted, w, ...)

## Generic S3 method
mse(actual, predicted, ...)

## Generic S3 method
weighted.mse(actual, predicted, w, ...)
actual |
A <numeric>-vector of length n |
predicted |
A <numeric>-vector of length n |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length n |
A <numeric> vector of length 1.
The metric is calculated as,

MSE = (1/n) * Σ (y_i - ŷ_i)^2

Where y_i and ŷ_i are the actual and predicted values respectively.
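The formula corresponds to a one-line computation in base R (illustrative sketch; mse_sketch is a hypothetical name):

```r
# Hypothetical base-R sketch of the Mean Squared Error.
mse_sketch <- function(actual, predicted) mean((actual - predicted)^2)
```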
Other Regression: ccc.numeric(), huberloss.numeric(), mae.numeric(), mape.numeric(), mpe.numeric(), pinball.numeric(), rae.numeric(), rmse.numeric(), rmsle.numeric(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric()
Other Supervised Learning: ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Mean Squared Error (MSE)
cat(
  "Mean Squared Error",
  mse(
    actual = actual,
    predicted = predicted
  ),
  "Mean Squared Error (weighted)",
  weighted.mse(
    actual = actual,
    predicted = predicted,
    w = mtcars$mpg / mean(mtcars$mpg)
  ),
  sep = "\n"
)
A generic function for the negative likelihood ratio in classification tasks. Use weighted.nlr()
for the weighted negative likelihood ratio.
## S3 method for class 'factor'
nlr(actual, predicted, ...)

## S3 method for class 'factor'
weighted.nlr(actual, predicted, w, ...)

## S3 method for class 'cmatrix'
nlr(x, ...)

## Generic S3 method
nlr(...)

## Generic S3 method
weighted.nlr(..., w)
actual |
|
predicted |
|
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
If micro
is NULL (the default), a named <numeric>-vector of length k
If micro
is TRUE or FALSE, a <numeric>-vector of length 1
Let LR^- be the likelihood of a negative outcome. The negative likelihood ratio of the classifier is calculated as,

LR^- = \frac{1 - \text{Sensitivity}}{\text{Specificity}}

Where:

Sensitivity is the true positive rate, and

Specificity is the true negative rate.
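The ratio can be computed by hand from a binary confusion matrix. The counts below are invented for illustration:

```r
# hypothetical binary confusion matrix counts
TP <- 40; FN <- 10   # actual positives
TN <- 45; FP <- 5    # actual negatives

sensitivity <- TP / (TP + FN)   # 0.8
specificity <- TN / (TN + FP)   # 0.9

# negative likelihood ratio: (1 - sensitivity) / specificity
nlr_by_hand <- (1 - sensitivity) / specificity
nlr_by_hand
```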
Consider a classification problem with three classes: A
, B
, and C
. The actual vector of factor()
values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A
, B
, and C
, respectively. Now, suppose your model does not predict any B
's. The predicted vector of factor()
values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the levels
argument.
The plr()
-function for the Positive Likelihood Ratio (LR+)
Other Classification:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fmi.factor()
,
fpr.factor()
,
jaccard.factor()
,
logloss.factor()
,
mcc.factor()
,
npv.factor()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
recall.factor()
,
roc.auc.matrix()
,
specificity.factor()
,
zerooneloss.factor()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
npv.factor()
,
pinball.numeric()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate model performance
# with class-wise negative likelihood ratios
cat("Negative Likelihood Ratio", sep = "\n")
nlr(
  actual    = actual,
  predicted = predicted
)

cat("Negative Likelihood Ratio (weighted)", sep = "\n")
weighted.nlr(
  actual    = actual,
  predicted = predicted,
  w = iris$Petal.Length/mean(iris$Petal.Length)
)
The npv()
-function computes the negative predictive value, also known as the True Negative Predictive Value, between
two vectors of predicted and observed factor()
values. The weighted.npv()
function computes the weighted negative predictive value.
## S3 method for class 'factor'
npv(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.npv(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
npv(x, micro = NULL, na.rm = TRUE, ...)

npv(...)

weighted.npv(...)
actual |
|
predicted |
|
micro |
A <logical>-value of length |
na.rm |
A <logical> value of length |
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
If micro
is NULL (the default), a named <numeric>-vector of length k
If micro
is TRUE or FALSE, a <numeric>-vector of length 1
The metric is calculated for each class k as follows,

NPV_k = \frac{TN_k}{TN_k + FN_k}

Where TN_k and FN_k are the number of true negatives and false negatives, respectively, for each class k.
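As a hand computation of the formula for a single class, with invented counts for illustration:

```r
# hypothetical counts for one class (illustration only)
TN <- 45  # true negatives
FN <- 10  # false negatives

npv_by_hand <- TN / (TN + FN)
npv_by_hand  # 45/55
```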
Consider a classification problem with three classes: A
, B
, and C
. The actual vector of factor()
values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A
, B
, and C
, respectively. Now, suppose your model does not predict any B
's. The predicted vector of factor()
values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the levels
argument.
Other Classification:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fmi.factor()
,
fpr.factor()
,
jaccard.factor()
,
logloss.factor()
,
mcc.factor()
,
nlr.factor()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
recall.factor()
,
roc.auc.matrix()
,
specificity.factor()
,
zerooneloss.factor()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
pinball.numeric()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate class-wise performance
# using Negative Predictive Value

# 4.1) unweighted Negative Predictive Value
npv(
  actual    = actual,
  predicted = predicted
)

# 4.2) weighted Negative Predictive Value
weighted.npv(
  actual    = actual,
  predicted = predicted,
  w = iris$Petal.Length/mean(iris$Petal.Length)
)

# 5) evaluate overall performance
# using micro-averaged Negative Predictive Value
cat(
  "Micro-averaged Negative Predictive Value", npv(
    actual    = actual,
    predicted = predicted,
    micro     = TRUE
  ),
  "Micro-averaged Negative Predictive Value (weighted)", weighted.npv(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length/mean(iris$Petal.Length),
    micro     = TRUE
  ),
  sep = "\n"
)
This dataset is used to estimate obesity levels based on eating habits and physical condition. The data originates from the UCI Machine Learning Repository and has been preprocessed to include both predictors and a target variable.
data(obesity)
A list with two components:
A data frame containing various predictors related to eating habits, physical condition, and lifestyle.
A list with two elements: regression
(weight in kilograms) and class
(obesity level classification).
The dataset is provided as a list with two components:
A data frame containing various predictors related to lifestyle, eating habits, and physical condition. The variables include:
The age of the individual in years.
The height of the individual in meters.
Binary variable indicating whether the individual has a family history of overweight (1 = yes, 0 = no).
Binary variable indicating whether the individual frequently consumes high-calorie foods (1 = yes, 0 = no).
The frequency of consumption of vegetables in meals.
The number of main meals consumed per day.
Categorical variable indicating the frequency of consumption of food between meals.
Typical levels include "no"
, "sometimes"
, "frequently"
, and "always"
.
Binary variable indicating whether the individual smokes (1 = yes, 0 = no).
Daily water consumption (typically in liters).
Binary variable indicating whether the individual monitors calorie consumption (1 = yes, 0 = no).
The frequency of physical activity.
The time spent using electronic devices (e.g., screen time in hours).
Categorical variable indicating the frequency of alcohol consumption.
Typical levels include "no"
, "sometimes"
, "frequently"
, and "always"
.
Binary variable indicating the gender of the individual (1 = male, 0 = female).
A list containing two elements:
A numeric vector representing the weight of the individual (used as the regression target).
A factor indicating the obesity level classification. The levels are derived from the original nobeyesdad
variable in the dataset.
This function allows you to enable or disable the use of OpenMP for parallelizing computations.
## enable OpenMP openmp.on() ## disable OpenMP openmp.off() ## set number of threads openmp.threads(threads)
## enable OpenMP openmp.on() ## disable OpenMP openmp.off() ## set number of threads openmp.threads(threads)
threads |
A positive <integer>-value (Default: None). If |
If OpenMP is unavailable, the function returns NULL.
## Not run:

## enable OpenMP
SLmetrics::openmp.on()

## disable OpenMP
SLmetrics::openmp.off()

## available threads
SLmetrics::openmp.threads()

## set number of threads
SLmetrics::openmp.threads(2)

## End(Not run)
The pinball()
-function computes the pinball loss between
the observed and predicted <numeric> vectors. The weighted.pinball()
function computes the weighted Pinball Loss.
## S3 method for class 'numeric'
pinball(actual, predicted, alpha = 0.5, deviance = FALSE, ...)

## S3 method for class 'numeric'
weighted.pinball(actual, predicted, w, alpha = 0.5, deviance = FALSE, ...)

## Generic S3 method
pinball(
  actual,
  predicted,
  alpha = 0.5,
  deviance = FALSE,
  ...
)

## Generic S3 method
weighted.pinball(
  actual,
  predicted,
  w,
  alpha = 0.5,
  deviance = FALSE,
  ...
)
actual |
A <numeric>-vector of length |
predicted |
A <numeric>-vector of length |
alpha |
A <numeric>-value of length |
deviance |
A <logical>-value of length 1 (default: FALSE). If TRUE the function returns the |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length |
A <numeric> vector of length 1.
The metric is calculated as,

\text{PinballLoss} = \frac{1}{n} \sum_{i=1}^{n} \left[ \alpha \max(y_i - \hat{y}_i, 0) + (1 - \alpha) \max(\hat{y}_i - y_i, 0) \right]

where y_i is the actual value, \hat{y}_i is the predicted value and \alpha is the quantile level.
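The formula can be sketched by hand in base R. The vectors below are made-up values; note that at \alpha = 0.5 the pinball loss reduces to half the mean absolute error:

```r
# toy data (made-up values, for illustration only)
alpha     <- 0.5
actual    <- c(3.0, 2.5, 4.0)
predicted <- c(2.8, 2.7, 3.5)

# alpha-weighted under- and over-prediction penalties
loss_i <- ifelse(
  actual >= predicted,
  alpha       * (actual - predicted),
  (1 - alpha) * (predicted - actual)
)

pinball_by_hand <- mean(loss_i)
pinball_by_hand  # 0.15, i.e. half the MAE of 0.3
```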
Other Regression:
ccc.numeric()
,
huberloss.numeric()
,
mae.numeric()
,
mape.numeric()
,
mpe.numeric()
,
mse.numeric()
,
rae.numeric()
,
rmse.numeric()
,
rmsle.numeric()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Pinball Loss
cat(
  "Pinball Loss", pinball(
    actual    = actual,
    predicted = predicted
  ),
  "Pinball Loss (weighted)", weighted.pinball(
    actual    = actual,
    predicted = predicted,
    w = mtcars$mpg/mean(mtcars$mpg)
  ),
  sep = "\n"
)
A generic function for the positive likelihood ratio in classification tasks. Use weighted.plr()
for the weighted positive likelihood ratio.
## S3 method for class 'factor'
plr(actual, predicted, ...)

## S3 method for class 'factor'
weighted.plr(actual, predicted, w, ...)

## S3 method for class 'cmatrix'
plr(x, ...)

## Generic S3 method
plr(...)

## Generic S3 method
weighted.plr(..., w)
actual |
|
predicted |
|
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
If micro
is NULL (the default), a named <numeric>-vector of length k
If micro
is TRUE or FALSE, a <numeric>-vector of length 1
Let LR^+ be the likelihood of a positive outcome. The positive likelihood ratio of the classifier is calculated as,

LR^+ = \frac{\text{Sensitivity}}{1 - \text{Specificity}}

Where:

Sensitivity is the true positive rate, and

Specificity is the true negative rate.
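As with its negative counterpart, the ratio can be computed by hand from a binary confusion matrix. The counts below are invented for illustration:

```r
# hypothetical binary confusion matrix counts
TP <- 40; FN <- 10   # actual positives
TN <- 45; FP <- 5    # actual negatives

sensitivity <- TP / (TP + FN)   # 0.8
specificity <- TN / (TN + FP)   # 0.9

# positive likelihood ratio: sensitivity / (1 - specificity)
plr_by_hand <- sensitivity / (1 - specificity)
plr_by_hand
```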
Consider a classification problem with three classes: A
, B
, and C
. The actual vector of factor()
values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A
, B
, and C
, respectively. Now, suppose your model does not predict any B
's. The predicted vector of factor()
values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the levels
argument.
The nlr()
-function for the Negative Likelihood Ratio (LR-)
Other Classification:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fmi.factor()
,
fpr.factor()
,
jaccard.factor()
,
logloss.factor()
,
mcc.factor()
,
nlr.factor()
,
npv.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
recall.factor()
,
roc.auc.matrix()
,
specificity.factor()
,
zerooneloss.factor()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
pinball.numeric()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate model performance
# with class-wise positive likelihood ratios
cat("Positive Likelihood Ratio", sep = "\n")
plr(
  actual    = actual,
  predicted = predicted
)

cat("Positive Likelihood Ratio (weighted)", sep = "\n")
weighted.plr(
  actual    = actual,
  predicted = predicted,
  w = iris$Petal.Length/mean(iris$Petal.Length)
)
A generic function for the area under the Precision-Recall Curve. Use weighted.pr.auc()
for the weighted area under the Precision-Recall Curve.
## S3 method for class 'matrix'
pr.auc(actual, response, micro = NULL, method = 0L, ...)

## S3 method for class 'matrix'
weighted.pr.auc(actual, response, w, micro = NULL, method = 0L, ...)

## Generic S3 method
pr.auc(
  actual,
  response,
  micro = NULL,
  method = 0,
  ...
)

## Generic S3 method
weighted.pr.auc(
  actual,
  response,
  w,
  micro = NULL,
  method = 0,
  ...
)
actual |
|
response |
A |
micro |
A <logical>-value of length |
method |
A <numeric> value (default: |
... |
Arguments passed into other methods. |
w |
A <numeric> vector of length 1
Trapezoidal rule

The trapezoidal rule approximates the integral of a function f between
a and b using trapezoids formed between consecutive points. If
we have points x_0 < x_1 < \cdots < x_n (with a = x_0 and b = x_n)
and corresponding function values f(x_0), f(x_1), \ldots, f(x_n), the area under
the curve is approximated by:

\int_a^b f(x) \, dx \approx \sum_{i=1}^{n} \frac{x_i - x_{i-1}}{2} \left[ f(x_{i-1}) + f(x_i) \right]

Step-function method

The step-function (rectangular) method uses the value of the function at one
endpoint of each subinterval to form rectangles. With the same partition
x_0 < x_1 < \cdots < x_n, the rectangular approximation
can be written as:

\int_a^b f(x) \, dx \approx \sum_{i=1}^{n} (x_i - x_{i-1}) \, f(x_{i-1})
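Both rules are easy to check numerically in base R. The sketch below integrates f(x) = x^2 over [0, 1] (an invented test function; the exact integral is 1/3):

```r
# partition of [0, 1] and function values (illustration only)
x <- seq(0, 1, length.out = 101)
y <- x^2

# trapezoidal rule: average the endpoints of each subinterval
trapezoid <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# step-function rule: use the left endpoint of each subinterval
step <- sum(diff(x) * head(y, -1))

# both approximate the exact value 1/3; the trapezoidal
# rule is markedly closer on this smooth function
c(trapezoid = trapezoid, step = step)
```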
Other Classification:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fmi.factor()
,
fpr.factor()
,
jaccard.factor()
,
logloss.factor()
,
mcc.factor()
,
nlr.factor()
,
npv.factor()
,
plr.factor()
,
prROC.factor()
,
precision.factor()
,
recall.factor()
,
roc.auc.matrix()
,
specificity.factor()
,
zerooneloss.factor()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
pinball.numeric()
,
plr.factor()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
response <- predict(model, type = "response")

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) generate precision-recall
# data

# 4.1) calculate residual
# probability and store as matrix
response <- matrix(
  data = cbind(response, 1 - response),
  nrow = length(actual)
)

# 4.2) calculate class-wise
# area under the curve
pr.auc(
  actual   = actual,
  response = response
)

# 4.3) calculate class-wise
# weighted area under the curve
weighted.pr.auc(
  actual   = actual,
  response = response,
  w = iris$Petal.Length/mean(iris$Petal.Length)
)

# 5) evaluate overall area under
# the curve
cat(
  "Micro-averaged area under the precision-recall curve", pr.auc(
    actual   = actual,
    response = response,
    micro    = TRUE
  ),
  "Micro-averaged area under the precision-recall curve (weighted)", weighted.pr.auc(
    actual   = actual,
    response = response,
    w        = iris$Petal.Length/mean(iris$Petal.Length),
    micro    = TRUE
  ),
  sep = "\n"
)
A generic function for the precision. Use weighted.precision()
for the weighted precision.
Positive Predictive Value
## S3 method for class 'factor'
precision(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.precision(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
precision(x, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
ppv(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.ppv(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
ppv(x, micro = NULL, na.rm = TRUE, ...)

## Generic S3 method
precision(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.precision(..., w, micro = NULL, na.rm = TRUE)

## Generic S3 method
ppv(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.ppv(..., w, micro = NULL, na.rm = TRUE)
actual |
|
predicted |
|
micro |
A <logical>-value of length |
na.rm |
A <logical> value of length |
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
If micro
is NULL (the default), a named <numeric>-vector of length k
If micro
is TRUE or FALSE, a <numeric>-vector of length 1
Let PPV be the proportion of true positives among the predicted positives. The precision of the classifier is calculated as,

PPV = \frac{TP}{TP + FP}

Where:

TP is the number of true positives, and

FP is the number of false positives.
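A hand computation of the formula for a single class, with invented counts for illustration:

```r
# hypothetical counts for one class (illustration only)
TP <- 40  # true positives
FP <- 5   # false positives

precision_by_hand <- TP / (TP + FP)
precision_by_hand  # 40/45
```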
Consider a classification problem with three classes: A
, B
, and C
. The actual vector of factor()
values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A
, B
, and C
, respectively. Now, suppose your model does not predict any B
's. The predicted vector of factor()
values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the levels
argument.
Other Classification:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fmi.factor()
,
fpr.factor()
,
jaccard.factor()
,
logloss.factor()
,
mcc.factor()
,
nlr.factor()
,
npv.factor()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
recall.factor()
,
roc.auc.matrix()
,
specificity.factor()
,
zerooneloss.factor()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
pinball.numeric()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate class-wise performance
# using Precision

# 4.1) unweighted Precision
precision(
  actual    = actual,
  predicted = predicted
)

# 4.2) weighted Precision
weighted.precision(
  actual    = actual,
  predicted = predicted,
  w = iris$Petal.Length/mean(iris$Petal.Length)
)

# 5) evaluate overall performance
# using micro-averaged Precision
cat(
  "Micro-averaged Precision", precision(
    actual    = actual,
    predicted = predicted,
    micro     = TRUE
  ),
  "Micro-averaged Precision (weighted)", weighted.precision(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length/mean(iris$Petal.Length),
    micro     = TRUE
  ),
  sep = "\n"
)
# 1) recode Iris # to binary classification # problem iris$species_num <- as.numeric( iris$Species == "virginica" ) # 2) fit the logistic # regression model <- glm( formula = species_num ~ Sepal.Length + Sepal.Width, data = iris, family = binomial( link = "logit" ) ) # 3) generate predicted # classes predicted <- factor( as.numeric( predict(model, type = "response") > 0.5 ), levels = c(1,0), labels = c("Virginica", "Others") ) # 3.1) generate actual # classes actual <- factor( x = iris$species_num, levels = c(1,0), labels = c("Virginica", "Others") ) # 4) evaluate class-wise performance # using Precision # 4.1) unweighted Precision precision( actual = actual, predicted = predicted ) # 4.2) weighted Precision weighted.precision( actual = actual, predicted = predicted, w = iris$Petal.Length/mean(iris$Petal.Length) ) # 5) evaluate overall performance # using micro-averaged Precision cat( "Micro-averaged Precision", precision( actual = actual, predicted = predicted, micro = TRUE ), "Micro-averaged Precision (weighted)", weighted.precision( actual = actual, predicted = predicted, w = iris$Petal.Length/mean(iris$Petal.Length), micro = TRUE ), sep = "\n" )
This function computes a column-wise ordering permutation of a numeric or integer matrix.
preorder(x, decreasing = FALSE, ...)
x |
|
decreasing |
a logical value of length 1 (default: FALSE). If TRUE the matrix is returned in descending order. |
... |
Arguments passed into other methods. |
A matrix with indices to the ordered values.
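For intuition, the same ordering permutation can be reproduced with base R (a minimal sketch; preorder() itself is the package's C++ implementation):

```r
# a reproducible matrix of shuffled values
set.seed(1903)
X <- matrix(sample(16), nrow = 4)

# column-wise ordering permutation,
# analogous to what preorder() returns
idx <- apply(X, 2, order)

# applying the permutation column-by-column
# recovers the column-sorted matrix
X_sorted <- sapply(seq_len(ncol(X)), function(j) X[idx[, j], j])
```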
Other Tools:
auc.numeric(), cov.wt.matrix(), presort()
# 1) generate a 4x4 matrix
# with random values to be sorted
set.seed(1903)
X <- matrix(
  data = sample(16:1),
  nrow = 4
)

# 2) sort the matrix
# column-wise (ascending by default)
presort(X)

# 3) get the indices
# of the sorted matrix
preorder(X)
This generic function sorts a numeric or integer matrix column-wise.
presort(x, decreasing = FALSE, ...)
x |
|
decreasing |
a logical value of length 1 (default: FALSE). If TRUE the matrix is returned in descending order. |
... |
Arguments passed into other methods. |
A matrix with sorted columns.
Other Tools:
auc.numeric(), cov.wt.matrix(), preorder()
# 1) generate a 4x4 matrix
# with random values to be sorted
set.seed(1903)
X <- matrix(
  data = sample(16:1),
  nrow = 4
)

# 2) sort the matrix
# column-wise (ascending by default)
presort(X)

# 3) get the indices
# of the sorted matrix
preorder(X)
The prROC()-function computes the precision() and recall() at the thresholds provided by the thresholds- or response-vector. The function constructs a data.frame() grouped by the k classes, where each class is treated as a binary classification problem.
## S3 method for class 'factor'
prROC(actual, response, thresholds = NULL, presorted = FALSE, ...)

## S3 method for class 'factor'
weighted.prROC(actual, response, w, thresholds = NULL, presorted = FALSE, ...)

## Generic S3 method
prROC(actual, response, thresholds = NULL, presorted = FALSE, ...)

## Generic S3 method
weighted.prROC(actual, response, w, thresholds = NULL, presorted = FALSE, ...)
actual |
|
response |
A |
thresholds |
|
presorted |
A <logical>-value of length 1 (default: FALSE). If TRUE the input will not be sorted by threshold. |
... |
Arguments passed into other methods. |
w |
A data.frame on the following form,
threshold |
<numeric> Thresholds used to determine |
level |
|
label |
|
recall |
<numeric> The recall |
precision |
<numeric> The precision |
Consider a classification problem with three classes: A
, B
, and C
. The actual vector of factor()
values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A
, B
, and C
, respectively. Now, suppose your model does not predict any B
's. The predicted vector of factor()
values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the levels argument.
Let TNR be the proportion of true negatives among the actual negatives. The specificity of the classifier is calculated as,

TNR = TN / (TN + FP)

Where:

TN is the number of true negatives, and

FP is the number of false positives.
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), precision.factor(), recall.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(link = "logit")
)

# 3) generate predicted
# probabilities
response <- predict(model, type = "response")

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) generate precision-recall
# data

# 4.1) calculate residual
# probability and store as matrix
response <- matrix(
  data = cbind(response, 1 - response),
  nrow = length(actual)
)

# 4.2) generate precision-recall
# data
roc <- prROC(
  actual   = actual,
  response = response
)

# 5) plot by species
plot(roc)

# 5.1) summarise
summary(roc)

# 6) provide custom
# thresholds
roc <- prROC(
  actual     = actual,
  response   = response,
  thresholds = seq(1, 0, length.out = 20)
)

# 6.1) plot by species
plot(roc)
The rae()
-function calculates the normalized relative absolute error between
the predicted and observed <numeric> vectors. The weighted.rae()
function computes the weighted relative absolute error.
## S3 method for class 'numeric'
rae(actual, predicted, ...)

## S3 method for class 'numeric'
weighted.rae(actual, predicted, w, ...)

## Generic S3 method
rae(actual, predicted, ...)

## Generic S3 method
weighted.rae(actual, predicted, w, ...)
actual |
A <numeric>-vector of length |
predicted |
A <numeric>-vector of length |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length |
A <numeric> vector of length 1.
The Relative Absolute Error (RAE) is calculated as:
RAE = sum_i |y_i - yhat_i| / sum_i |y_i - ybar|

Where y_i are the actual values, yhat_i are the predicted values, and ybar is the mean of the actual values.
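As a sanity check, the ratio can be computed directly from the definition with base R (a minimal sketch on illustrative numbers; rae() itself is the package's C++ implementation):

```r
# illustrative actual and predicted values
actual    <- c(3, 5, 2.5, 7)
predicted <- c(2.5, 5, 4, 8)

# RAE: total absolute error relative to the
# total absolute deviation from the mean
rae_manual <- sum(abs(actual - predicted)) /
  sum(abs(actual - mean(actual)))
```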
Other Regression:
ccc.numeric(), huberloss.numeric(), mae.numeric(), mape.numeric(), mpe.numeric(), mse.numeric(), pinball.numeric(), rmse.numeric(), rmsle.numeric(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual    <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Relative Absolute Error (RAE)
cat(
  "Relative Absolute Error",
  rae(
    actual    = actual,
    predicted = predicted
  ),
  "Relative Absolute Error (weighted)",
  weighted.rae(
    actual    = actual,
    predicted = predicted,
    w         = mtcars$mpg / mean(mtcars$mpg)
  ),
  sep = "\n"
)
A generic function for the Recall. Use weighted.recall()
for the weighted Recall.
Sensitivity, True Positive Rate
## S3 method for class 'factor'
recall(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.recall(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
recall(x, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
sensitivity(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.sensitivity(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
sensitivity(x, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
tpr(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.tpr(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
tpr(x, micro = NULL, na.rm = TRUE, ...)

## Generic S3 method
recall(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
sensitivity(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
tpr(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.recall(..., w, micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.sensitivity(..., w, micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.tpr(..., w, micro = NULL, na.rm = TRUE)
actual |
|
predicted |
|
micro |
A <logical>-value of length |
na.rm |
A <logical> value of length |
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
If micro
is NULL (the default), a named <numeric>-vector of length k
If micro
is TRUE or FALSE, a <numeric>-vector of length 1
Let TPR be the proportion of true positives among the actual positives. The recall of the classifier is calculated as,

TPR = TP / (TP + FN)

Where:

TP is the number of true positives, and

FN is the number of false negatives.
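The definition can be checked against raw counts with base R (a minimal sketch; the binary outcomes below are illustrative):

```r
# illustrative binary outcomes with
# positive class "1": TP = 4, FN = 1
actual    <- factor(c(1, 1, 1, 1, 1, 0, 0, 0), levels = c(1, 0))
predicted <- factor(c(1, 1, 1, 1, 0, 0, 1, 0), levels = c(1, 0))

tp <- sum(actual == "1" & predicted == "1")
fn <- sum(actual == "1" & predicted == "0")

# recall = TP / (TP + FN)
recall_manual <- tp / (tp + fn)
```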
Consider a classification problem with three classes: A
, B
, and C
. The actual vector of factor()
values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A
, B
, and C
, respectively. Now, suppose your model does not predict any B
's. The predicted vector of factor()
values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the levels argument.
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), roc.auc.matrix(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), rmse.numeric(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(link = "logit")
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(predict(model, type = "response") > 0.5),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate class-wise performance
# using Recall

# 4.1) unweighted Recall
recall(
  actual    = actual,
  predicted = predicted
)

# 4.2) weighted Recall
weighted.recall(
  actual    = actual,
  predicted = predicted,
  w         = iris$Petal.Length / mean(iris$Petal.Length)
)

# 5) evaluate overall performance
# using micro-averaged Recall
cat(
  "Micro-averaged Recall",
  recall(actual = actual, predicted = predicted, micro = TRUE),
  "Micro-averaged Recall (weighted)",
  weighted.recall(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length / mean(iris$Petal.Length),
    micro     = TRUE
  ),
  sep = "\n"
)
The rmse()
-function computes the root mean squared error between
the observed and predicted <numeric> vectors. The weighted.rmse()
function computes the weighted root mean squared error.
## S3 method for class 'numeric'
rmse(actual, predicted, ...)

## S3 method for class 'numeric'
weighted.rmse(actual, predicted, w, ...)

## Generic S3 method
rmse(actual, predicted, ...)

## Generic S3 method
weighted.rmse(actual, predicted, w, ...)
actual |
A <numeric>-vector of length |
predicted |
A <numeric>-vector of length |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length |
A <numeric> vector of length 1.
The metric is calculated as,

RMSE = sqrt( (1 / n) * sum_i (y_i - yhat_i)^2 )

Where y_i and yhat_i are the actual and predicted values respectively.
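The same quantity can be computed directly from the definition with base R (a minimal sketch on illustrative numbers; rmse() itself is the package's C++ implementation):

```r
# illustrative actual and predicted values
actual    <- c(3, 5, 2.5, 7)
predicted <- c(2.5, 5, 4, 8)

# RMSE: square root of the mean squared error
rmse_manual <- sqrt(mean((actual - predicted)^2))
```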
Other Regression:
ccc.numeric(), huberloss.numeric(), mae.numeric(), mape.numeric(), mpe.numeric(), mse.numeric(), pinball.numeric(), rae.numeric(), rmsle.numeric(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmsle.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual    <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Root Mean Squared Error (RMSE)
cat(
  "Root Mean Squared Error",
  rmse(
    actual    = actual,
    predicted = predicted
  ),
  "Root Mean Squared Error (weighted)",
  weighted.rmse(
    actual    = actual,
    predicted = predicted,
    w         = mtcars$mpg / mean(mtcars$mpg)
  ),
  sep = "\n"
)
The rmsle()
-function computes the root mean squared logarithmic error between the observed and predicted <numeric> vectors. The weighted.rmsle()
function computes the weighted root mean squared logarithmic error.
## S3 method for class 'numeric'
rmsle(actual, predicted, ...)

## S3 method for class 'numeric'
weighted.rmsle(actual, predicted, w, ...)

## Generic S3 method
rmsle(actual, predicted, ...)

## Generic S3 method
weighted.rmsle(actual, predicted, w, ...)
actual |
A <numeric>-vector of length |
predicted |
A <numeric>-vector of length |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length |
A <numeric> vector of length 1.
The metric is calculated as,

RMSLE = sqrt( (1 / n) * sum_i (log(1 + y_i) - log(1 + yhat_i))^2 )

Where y_i and yhat_i are the actual and predicted values respectively.
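The same quantity can be computed from the definition with base R (a minimal sketch on illustrative non-negative numbers; rmsle() itself is the package's C++ implementation):

```r
# illustrative non-negative actual and predicted values
actual    <- c(3, 5, 2.5, 7)
predicted <- c(2.5, 5, 4, 8)

# RMSLE: RMSE on log(1 + x)-transformed values;
# log1p() handles zero-valued observations safely
rmsle_manual <- sqrt(mean((log1p(actual) - log1p(predicted))^2))
```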
Other Regression:
ccc.numeric(), huberloss.numeric(), mae.numeric(), mape.numeric(), mpe.numeric(), mse.numeric(), pinball.numeric(), rae.numeric(), rmse.numeric(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), roc.auc.matrix(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual    <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Root Mean Squared Logarithmic Error (RMSLE)
cat(
  "Root Mean Squared Logarithmic Error",
  rmsle(
    actual    = actual,
    predicted = predicted
  ),
  "Root Mean Squared Logarithmic Error (weighted)",
  weighted.rmsle(
    actual    = actual,
    predicted = predicted,
    w         = mtcars$mpg / mean(mtcars$mpg)
  ),
  sep = "\n"
)
A generic function for the area under the Receiver Operator Characteristics Curve. Use weighted.roc.auc()
for the weighted area under the Receiver Operator Characteristics Curve.
## S3 method for class 'matrix'
roc.auc(actual, response, micro = NULL, method = 0L, ...)

## S3 method for class 'matrix'
weighted.roc.auc(actual, response, w, micro = NULL, method = 0L, ...)

## Generic S3 method
roc.auc(actual, response, micro = NULL, method = 0, ...)

## Generic S3 method
weighted.roc.auc(actual, response, w, micro = NULL, method = 0, ...)
actual |
|
response |
A |
micro |
A <logical>-value of length |
method |
A <numeric> value (default: |
... |
Arguments passed into other methods. |
w |
A <numeric> vector of length 1
Trapezoidal rule
The trapezoidal rule approximates the integral of a function f between a and b using trapezoids formed between consecutive points. If we have points x_0, x_1, ..., x_n (with a = x_0 < x_1 < ... < x_n = b) and corresponding function values f(x_0), f(x_1), ..., f(x_n), the area under the curve is approximated by:

AUC = sum over i = 1..n of (x_i - x_{i-1}) * (f(x_{i-1}) + f(x_i)) / 2
Step-function method
The step-function (rectangular) method uses the value of the function at one
endpoint of each subinterval to form rectangles. With the same partition
x_0 < x_1 < ... < x_n, the rectangular approximation
can be written as:

AUC = sum over i = 1..n of (x_i - x_{i-1}) * f(x_{i-1})
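Both rules can be illustrated with base R on a known integral (a minimal sketch; f(x) = x^2 on [0, 1] integrates to 1/3, and the mapping of these rules to the method argument is described above):

```r
# partition [0, 1] and evaluate f(x) = x^2
x <- seq(0, 1, length.out = 101)
y <- x^2

# trapezoidal rule: average the heights at the
# endpoints of each subinterval
auc_trap <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# step-function rule: use the left endpoint
# of each subinterval
auc_step <- sum(diff(x) * head(y, -1))
```

For an increasing function such as this one, the step-function (left-endpoint) rule underestimates the area, while the trapezoidal rule is accurate to second order in the grid spacing.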
Other Classification:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fmi.factor(), fpr.factor(), jaccard.factor(), logloss.factor(), mcc.factor(), nlr.factor(), npv.factor(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), recall.factor(), specificity.factor(), zerooneloss.factor()
Other Supervised Learning:
ROC.factor(), accuracy.factor(), baccuracy.factor(), ccc.numeric(), ckappa.factor(), cmatrix.factor(), dor.factor(), entropy.matrix(), fbeta.factor(), fdr.factor(), fer.factor(), fpr.factor(), huberloss.numeric(), jaccard.factor(), logloss.factor(), mae.numeric(), mape.numeric(), mcc.factor(), mpe.numeric(), mse.numeric(), nlr.factor(), npv.factor(), pinball.numeric(), plr.factor(), pr.auc.matrix(), prROC.factor(), precision.factor(), rae.numeric(), recall.factor(), rmse.numeric(), rmsle.numeric(), rrmse.numeric(), rrse.numeric(), rsq.numeric(), smape.numeric(), specificity.factor(), zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(link = "logit")
)

# 3) generate predicted
# probabilities
response <- predict(model, type = "response")

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) generate receiver operator characteristics
# data

# 4.1) calculate residual
# probability and store as matrix
response <- matrix(
  data = cbind(response, 1 - response),
  nrow = length(actual)
)

# 4.2) calculate class-wise
# area under the curve
roc.auc(
  actual   = actual,
  response = response
)

# 4.3) calculate class-wise
# weighted area under the curve
weighted.roc.auc(
  actual   = actual,
  response = response,
  w        = iris$Petal.Length / mean(iris$Petal.Length)
)

# 5) evaluate overall area under
# the curve
cat(
  "Micro-averaged area under the ROC curve",
  roc.auc(actual = actual, response = response, micro = TRUE),
  "Micro-averaged area under the ROC curve (weighted)",
  weighted.roc.auc(
    actual   = actual,
    response = response,
    w        = iris$Petal.Length / mean(iris$Petal.Length),
    micro    = TRUE
  ),
  sep = "\n"
)
The ROC()-function computes the tpr() and fpr() at the thresholds provided by the thresholds- or response-vector. The function constructs a data.frame() grouped by the k classes, where each class is treated as a binary classification problem.
## S3 method for class 'factor'
ROC(actual, response, thresholds = NULL, presorted = FALSE, ...)

## S3 method for class 'factor'
weighted.ROC(actual, response, w, thresholds = NULL, presorted = FALSE, ...)

## Generic S3 method
ROC(actual, response, thresholds = NULL, presorted = FALSE, ...)

## Generic S3 method
weighted.ROC(actual, response, w, thresholds = NULL, presorted = FALSE, ...)
actual |
|
response |
A |
thresholds |
|
presorted |
A <logical>-value of length 1 (default: FALSE). If TRUE the input will not be sorted by threshold. |
... |
Arguments passed into other methods. |
w |
A data.frame on the following form,
threshold |
|
level |
|
label |
|
fpr |
<numeric> The false positive rate |
tpr |
<numeric> The true positive rate |
Consider a classification problem with three classes: A
, B
, and C
. The actual vector of factor()
values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A
, B
, and C
, respectively. Now, suppose your model does not predict any B
's. The predicted vector of factor()
values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the
levels
argument.
Let specificity be the proportion of true negatives among the actual negatives. The specificity of the classifier is calculated as,

Specificity = TN / (TN + FP)

Where:

TN is the number of true negatives, and

FP is the number of false positives.
Other Classification:
accuracy.factor()
,
baccuracy.factor()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fmi.factor()
,
fpr.factor()
,
jaccard.factor()
,
logloss.factor()
,
mcc.factor()
,
nlr.factor()
,
npv.factor()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
recall.factor()
,
roc.auc.matrix()
,
specificity.factor()
,
zerooneloss.factor()
Other Supervised Learning:
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
pinball.numeric()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
response <- predict(model, type = "response")

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) generate receiver operating
# characteristic data

# 4.1) calculate residual
# probability and store as matrix
response <- matrix(
  data = cbind(response, 1 - response),
  nrow = length(actual)
)

# 4.2) construct
# data.frame
roc <- ROC(
  actual   = actual,
  response = response
)

# 5) plot by species
plot(roc)

# 5.1) summarise
summary(roc)

# 6) provide custom
# thresholds
roc <- ROC(
  actual     = actual,
  response   = response,
  thresholds = seq(
    1, 0,
    length.out = 20
  )
)

# 7) plot by species
plot(roc)
The rrmse()
-function computes the Relative Root Mean Squared Error between
the observed and predicted <numeric> vectors. The weighted.rrmse()
function computes the weighted Relative Root Mean Squared Error.
## S3 method for class 'numeric'
rrmse(actual, predicted, normalization = 1L, ...)

## S3 method for class 'numeric'
weighted.rrmse(actual, predicted, w, normalization = 1L, ...)

## Generic S3 method
rrmse(
  actual,
  predicted,
  normalization = 1,
  ...
)

## Generic S3 method
weighted.rrmse(
  actual,
  predicted,
  w,
  normalization = 1,
  ...
)
actual |
A <numeric>-vector of length |
predicted |
A <numeric>-vector of length |
normalization |
A <numeric>-value of length |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length |
A <numeric> vector of length 1.
The metric is calculated as,

RRMSE = sqrt( (1/n) Σ (yᵢ − ŷᵢ)² ) / γ

Where γ is the normalization factor.
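As a plain-R illustration of the formula — a sketch only: the name manual_rrmse is illustrative, and the mapping of the normalization argument to mean, range and IQR is inferred from the IQR example on this page and should be verified against the package documentation:

```r
# illustrative RRMSE: RMSE divided by a normalization factor gamma
manual_rrmse <- function(actual, predicted, normalization = 1) {
  rmse  <- sqrt(mean((actual - predicted)^2))
  gamma <- switch(
    as.character(normalization),
    "0" = mean(actual),        # mean-normalized (assumed)
    "1" = diff(range(actual)), # range-normalized (assumed default)
    "2" = IQR(actual)          # IQR-normalized, as in the example below
  )
  rmse / gamma
}
```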
Other Regression:
ccc.numeric()
,
huberloss.numeric()
,
mae.numeric()
,
mape.numeric()
,
mpe.numeric()
,
mse.numeric()
,
pinball.numeric()
,
rae.numeric()
,
rmse.numeric()
,
rmsle.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
pinball.numeric()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual    <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Relative Root Mean Squared Error (RRMSE)
cat(
  "IQR Relative Root Mean Squared Error",
  rrmse(
    actual        = actual,
    predicted     = predicted,
    normalization = 2
  ),
  "IQR Relative Root Mean Squared Error (weighted)",
  weighted.rrmse(
    actual        = actual,
    predicted     = predicted,
    w             = mtcars$mpg / mean(mtcars$mpg),
    normalization = 2
  ),
  sep = "\n"
)
The rrse()
-function calculates the root relative squared error between
the predicted and observed <numeric> vectors. The weighted.rrse()
function computes the weighted root relative squared error.
## S3 method for class 'numeric'
rrse(actual, predicted, ...)

## S3 method for class 'numeric'
weighted.rrse(actual, predicted, w, ...)

## Generic S3 method
rrse(
  actual,
  predicted,
  ...
)

## Generic S3 method
weighted.rrse(
  actual,
  predicted,
  w,
  ...
)
actual |
A <numeric>-vector of length |
predicted |
A <numeric>-vector of length |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length |
A <numeric> vector of length 1.
The metric is calculated as,

RRSE = sqrt( Σ (yᵢ − ŷᵢ)² / Σ (yᵢ − ȳ)² )

Where yᵢ are the
actual
values, ŷᵢ are the
predicted
values,
and ȳ is the mean of the
actual
values.
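The formula translates directly into base R; a minimal sketch (manual_rrse is an illustrative name, not a package function):

```r
# illustrative RRSE: squared error relative to a mean-only baseline
manual_rrse <- function(actual, predicted) {
  sqrt(
    sum((actual - predicted)^2) /
    sum((actual - mean(actual))^2)
  )
}
```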
Other Regression:
ccc.numeric()
,
huberloss.numeric()
,
mae.numeric()
,
mape.numeric()
,
mpe.numeric()
,
mse.numeric()
,
pinball.numeric()
,
rae.numeric()
,
rmse.numeric()
,
rmsle.numeric()
,
rrmse.numeric()
,
rsq.numeric()
,
smape.numeric()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
pinball.numeric()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual    <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Relative Root Squared Error (RRSE)
cat(
  "Relative Root Squared Error",
  rrse(
    actual    = actual,
    predicted = predicted
  ),
  "Relative Root Squared Error (weighted)",
  weighted.rrse(
    actual    = actual,
    predicted = predicted,
    w         = mtcars$mpg / mean(mtcars$mpg)
  ),
  sep = "\n"
)
A generic function for the coefficient of determination, R². The unadjusted
R² is returned by default.
Use
weighted.rsq()
for the weighted R².
## S3 method for class 'numeric'
rsq(actual, predicted, k = 0, ...)

## S3 method for class 'numeric'
weighted.rsq(actual, predicted, w, k = 0, ...)

## Generic S3 method
rsq(
  ...,
  k = 0
)

## Generic S3 method
weighted.rsq(
  ...,
  w,
  k = 0
)
actual |
A <numeric>-vector of length |
predicted |
A <numeric>-vector of length |
k |
A <numeric>-vector of length 1 (default: 0). For adjusted |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length |
A <numeric> vector of length 1.
Let R² be the explained variation. The R² is calculated as,

R² = 1 − (SSE / SST) × (n − 1) / (n − k − 1)

Where:

n is the number of observations,

k is the number of features,

y is the actual values,

ŷ is the predicted values,

SSE = Σ (yᵢ − ŷᵢ)² is the sum of squared errors and,

SST = Σ (yᵢ − ȳ)² is the total sum of squared errors
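A base-R sketch of the formula (manual_rsq is illustrative; with k = 0 the adjustment term (n − 1)/(n − k − 1) equals 1 and the unadjusted R² is returned):

```r
# illustrative (adjusted) R-squared; k = 0 yields the unadjusted value
manual_rsq <- function(actual, predicted, k = 0) {
  n   <- length(actual)
  sse <- sum((actual - predicted)^2)     # sum of squared errors
  sst <- sum((actual - mean(actual))^2)  # total sum of squares
  1 - (sse / sst) * ((n - 1) / (n - k - 1))
}
```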
Other Regression:
ccc.numeric()
,
huberloss.numeric()
,
mae.numeric()
,
mape.numeric()
,
mpe.numeric()
,
mse.numeric()
,
pinball.numeric()
,
rae.numeric()
,
rmse.numeric()
,
rmsle.numeric()
,
rrmse.numeric()
,
rrse.numeric()
,
smape.numeric()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
pinball.numeric()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
smape.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure in-sample performance
actual    <- mtcars$mpg
predicted <- fitted(model)

# 2) calculate performance
# using R squared adjusted and
# unadjusted for features
cat(
  "Rsq",
  rsq(
    actual    = actual,
    predicted = fitted(model)
  ),
  "Rsq (Adjusted)",
  rsq(
    actual    = actual,
    predicted = fitted(model),
    k         = ncol(model.matrix(model)) - 1
  ),
  sep = "\n"
)
The smape()
-function computes the symmetric mean absolute percentage error between
the observed and predicted <numeric> vectors. The weighted.smape()
function computes the weighted symmetric mean absolute percentage error.
## S3 method for class 'numeric'
smape(actual, predicted, ...)

## S3 method for class 'numeric'
weighted.smape(actual, predicted, w, ...)

## Generic S3 method
smape(
  actual,
  predicted,
  ...
)

## Generic S3 method
weighted.smape(
  actual,
  predicted,
  w,
  ...
)
actual |
A <numeric>-vector of length |
predicted |
A <numeric>-vector of length |
... |
Arguments passed into other methods. |
w |
A <numeric>-vector of length |
A <numeric> vector of length 1.
The metric is calculated as follows,

SMAPE = (1/n) Σ |yᵢ − ŷᵢ| / ((|yᵢ| + |ŷᵢ|) / 2)

where yᵢ and ŷᵢ are the
actual
and predicted
values respectively.
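In base R the formula reads as below (manual_smape is an illustrative name, not a package function):

```r
# illustrative SMAPE: absolute error scaled by the mean absolute magnitude
manual_smape <- function(actual, predicted) {
  mean(
    abs(actual - predicted) /
    ((abs(actual) + abs(predicted)) / 2)
  )
}
```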
Other Regression:
ccc.numeric()
,
huberloss.numeric()
,
mae.numeric()
,
mape.numeric()
,
mpe.numeric()
,
mse.numeric()
,
pinball.numeric()
,
rae.numeric()
,
rmse.numeric()
,
rmsle.numeric()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
pinball.numeric()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
specificity.factor()
,
zerooneloss.factor()
# 1) fit a linear
# regression
model <- lm(
  mpg ~ .,
  data = mtcars
)

# 1.1) define actual
# and predicted values
# to measure performance
actual    <- mtcars$mpg
predicted <- fitted(model)

# 2) evaluate in-sample model
# performance using Symmetric Mean Absolute Percentage Error (SMAPE)
cat(
  "Symmetric Mean Absolute Percentage Error",
  smape(
    actual    = actual,
    predicted = predicted
  ),
  "Symmetric Mean Absolute Percentage Error (weighted)",
  weighted.smape(
    actual    = actual,
    predicted = predicted,
    w         = mtcars$mpg / mean(mtcars$mpg)
  ),
  sep = "\n"
)
A generic function for the Specificity. Use weighted.specificity()
for the weighted Specificity.
True Negative Rate, Selectivity
## S3 method for class 'factor'
specificity(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.specificity(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
specificity(x, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
tnr(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.tnr(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
tnr(x, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
selectivity(actual, predicted, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'factor'
weighted.selectivity(actual, predicted, w, micro = NULL, na.rm = TRUE, ...)

## S3 method for class 'cmatrix'
selectivity(x, micro = NULL, na.rm = TRUE, ...)

## Generic S3 method
specificity(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
tnr(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
selectivity(..., micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.specificity(..., w, micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.tnr(..., w, micro = NULL, na.rm = TRUE)

## Generic S3 method
weighted.selectivity(..., w, micro = NULL, na.rm = TRUE)
actual |
|
predicted |
|
micro |
A <logical>-value of length |
na.rm |
A <logical> value of length |
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
If micro
is NULL (the default), a named <numeric>-vector of length k
If micro
is TRUE or FALSE, a <numeric>-vector of length 1
Consider a classification problem with three classes: A
, B
, and C
. The actual vector of factor()
values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A
, B
, and C
, respectively. Now, suppose your model does not predict any B
's. The predicted vector of factor()
values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the
levels
argument.
Let specificity be the proportion of true negatives among the actual negatives. The specificity of the classifier is calculated as,

Specificity = TN / (TN + FP)

Where:

TN is the number of true negatives, and

FP is the number of false positives.
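Class-wise specificity can be sketched from a plain confusion table; the helper below is illustrative only, not the package's C++ implementation:

```r
# illustrative class-wise specificity from a confusion matrix
manual_specificity <- function(actual, predicted) {
  cm <- table(actual, predicted)
  vapply(seq_len(nrow(cm)), function(i) {
    fp <- sum(cm[-i, i])   # predicted class i, actual class differs
    tn <- sum(cm[-i, -i])  # neither actual nor predicted is class i
    tn / (tn + fp)
  }, numeric(1))
}
```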
Other Classification:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fmi.factor()
,
fpr.factor()
,
jaccard.factor()
,
logloss.factor()
,
mcc.factor()
,
nlr.factor()
,
npv.factor()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
recall.factor()
,
roc.auc.matrix()
,
zerooneloss.factor()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
pinball.numeric()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
zerooneloss.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate class-wise performance
# using Specificity

# 4.1) unweighted Specificity
specificity(
  actual    = actual,
  predicted = predicted
)

# 4.2) weighted Specificity
weighted.specificity(
  actual    = actual,
  predicted = predicted,
  w         = iris$Petal.Length / mean(iris$Petal.Length)
)

# 5) evaluate overall performance
# using micro-averaged Specificity
cat(
  "Micro-averaged Specificity",
  specificity(
    actual    = actual,
    predicted = predicted,
    micro     = TRUE
  ),
  "Micro-averaged Specificity (weighted)",
  weighted.specificity(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length / mean(iris$Petal.Length),
    micro     = TRUE
  ),
  sep = "\n"
)
This dataset contains measurements of various chemical properties of white wines along with their quality ratings and a quality classification. The dataset was obtained from the UCI Machine Learning Repository.
data(wine_quality)
A list with two components:
A data frame with 11 chemical property variables.
A list with two elements: regression
(wine quality scores) and class
(quality classification).
The data is provided as a list with two components:
A data frame containing the chemical properties of the wines. The variables include:
Fixed acidity (g/L).
Volatile acidity (g/L), mainly due to acetic acid.
Citric acid (g/L).
Residual sugar (g/L).
Chloride concentration (g/L).
Free sulfur dioxide (mg/L).
Total sulfur dioxide (mg/L).
Density of the wine (g/cm³).
pH value of the wine.
Sulphates (g/L).
Alcohol content (% by volume).
A list containing two elements:
A numeric vector representing the wine quality scores (used as the regression target).
A factor with levels "High Quality"
, "Medium Quality"
, and "Low Quality"
,
where classification is determined as follows:
"High Quality": quality ≥ 7.

"Low Quality": quality ≤ 4.

"Medium Quality": all other quality scores.
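The recoding rule above can be sketched with cut() — an illustrative helper only; the dataset already ships with the classification precomputed:

```r
# illustrative quality-to-class recoding for integer quality scores
quality_class <- function(quality) {
  cut(
    quality,
    breaks = c(-Inf, 4, 6, Inf),  # <= 4, 5-6, >= 7
    labels = c("Low Quality", "Medium Quality", "High Quality")
  )
}
```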
https://archive.ics.uci.edu/dataset/186/wine+quality
The zerooneloss()
-function computes the zero-one loss, a classification loss function that calculates the proportion of misclassified instances between
two vectors of predicted and observed factor()
values. The weighted.zerooneloss()
function computes the weighted zero-one loss.
## S3 method for class 'factor'
zerooneloss(actual, predicted, ...)

## S3 method for class 'factor'
weighted.zerooneloss(actual, predicted, w, ...)

## S3 method for class 'cmatrix'
zerooneloss(x, ...)

## Generic S3 method
zerooneloss(...)

## Generic S3 method
weighted.zerooneloss(
  ...,
  w
)
actual |
|
predicted |
|
... |
Arguments passed into other methods |
w |
|
x |
A confusion matrix created |
A <numeric>-vector of length 1
The metric is calculated as follows,

Zero-One Loss = (FP + FN) / (TP + TN + FP + FN)

Where TP, TN, FP, and FN
represent the true positives, true negatives, false positives, and false negatives, respectively.
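Because the denominator is the total number of observations, the zero-one loss is simply the misclassification rate; in base R (manual_zerooneloss is an illustrative name, not a package function):

```r
# illustrative zero-one loss: proportion of misclassified observations
manual_zerooneloss <- function(actual, predicted) {
  mean(actual != predicted)
}
```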
Consider a classification problem with three classes: A
, B
, and C
. The actual vector of factor()
values is defined as follows:
## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] B A B B A C B C C A
#> Levels: A B C
Here, the values 1, 2, and 3 are mapped to A
, B
, and C
, respectively. Now, suppose your model does not predict any B
's. The predicted vector of factor()
values would be defined as follows:
## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#> [1] C A C C C C C C A C
#> Levels: A B C
In both cases, k = 3, determined indirectly by the
levels
argument.
Other Classification:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fmi.factor()
,
fpr.factor()
,
jaccard.factor()
,
logloss.factor()
,
mcc.factor()
,
nlr.factor()
,
npv.factor()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
recall.factor()
,
roc.auc.matrix()
,
specificity.factor()
Other Supervised Learning:
ROC.factor()
,
accuracy.factor()
,
baccuracy.factor()
,
ccc.numeric()
,
ckappa.factor()
,
cmatrix.factor()
,
dor.factor()
,
entropy.matrix()
,
fbeta.factor()
,
fdr.factor()
,
fer.factor()
,
fpr.factor()
,
huberloss.numeric()
,
jaccard.factor()
,
logloss.factor()
,
mae.numeric()
,
mape.numeric()
,
mcc.factor()
,
mpe.numeric()
,
mse.numeric()
,
nlr.factor()
,
npv.factor()
,
pinball.numeric()
,
plr.factor()
,
pr.auc.matrix()
,
prROC.factor()
,
precision.factor()
,
rae.numeric()
,
recall.factor()
,
rmse.numeric()
,
rmsle.numeric()
,
roc.auc.matrix()
,
rrmse.numeric()
,
rrse.numeric()
,
rsq.numeric()
,
smape.numeric()
,
specificity.factor()
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x      = iris$species_num,
  levels = c(1, 0),
  labels = c("Virginica", "Others")
)

# 4) evaluate model
# performance using Zero-One Loss
cat(
  "Zero-One Loss",
  zerooneloss(
    actual    = actual,
    predicted = predicted
  ),
  "Zero-One Loss (weighted)",
  weighted.zerooneloss(
    actual    = actual,
    predicted = predicted,
    w         = iris$Petal.Length / mean(iris$Petal.Length)
  ),
  sep = "\n"
)