
Classification based on the Multinomial Naive Bayes model.

Usage

# S3 method for multinomial_naive_bayes
predict(object, newdata = NULL, type = c("class","prob"), ...)

Arguments

object

object of class inheriting from "multinomial_naive_bayes".

newdata

numeric matrix of non-negative integer predictors (only a numeric matrix is accepted).

type

if "class", new data points are classified according to the highest posterior probabilities. If "prob", the posterior probabilities for each class are returned.

...

not used.

Value

predict.multinomial_naive_bayes returns either a factor of class labels, each corresponding to the highest posterior probability, or a matrix of class-specific posterior probabilities.
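The relationship between the two return types can be illustrated in base R. The sketch below copies the posterior probabilities shown in the Examples section of this page and picks the most probable class per row. It is only an illustration of the "highest posterior" rule, not the package's internal code, and the object names `prob` and `cls` are chosen for this example.

```r
# Posterior probabilities copied from the example output on this page
prob <- cbind(
  classA = c(0.17623129, 0.41834308, 0.09334622,
             0.19324231, 0.56491024, 0.23376937),
  classB = c(0.8237687, 0.5816569, 0.9066538,
             0.8067577, 0.4350898, 0.7662306)
)

# For each row, take the column (class) with the largest posterior
cls <- factor(colnames(prob)[max.col(prob, ties.method = "first")],
              levels = colnames(prob))
cls
#> [1] classB classB classB classB classA classB
#> Levels: classA classB
```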

Details

This is a specialized version of the Naive Bayes classifier, where the features represent the frequencies with which events have been generated by a multinomial distribution.
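Under this event model, the posterior of a class is proportional to its prior times the product of per-feature probabilities raised to the observed counts. A minimal base R sketch of that calculation follows. It is not the package's implementation: the helper `mnb_posterior_sketch`, its argument layout, and the toy parameters are assumptions made purely for illustration.

```r
# Hypothetical helper (not part of the package).
# x         : numeric matrix of counts (rows = observations, cols = features)
# log_prior : numeric vector of log class priors, one per class
# log_theta : features x classes matrix of log feature probabilities
mnb_posterior_sketch <- function(x, log_prior, log_theta) {
  # Unnormalised log posterior: log P(c) + sum_j x_j * log(theta_jc)
  log_post <- sweep(x %*% log_theta, 2, log_prior, "+")
  # Subtract the row maximum before exponentiating (log-sum-exp trick)
  row_max <- apply(log_post, 1, max)
  post <- exp(log_post - row_max)
  # Normalise so that each row sums to one
  post / rowSums(post)
}

# Toy parameters for two classes and three features (made-up numbers)
theta <- cbind(classA = c(0.5, 0.3, 0.2), classB = c(0.2, 0.3, 0.5))
prior <- c(classA = 0.3, classB = 0.7)
x <- rbind(c(3, 1, 0), c(0, 1, 4))

post <- mnb_posterior_sketch(x, log(prior), log(theta))
rowSums(post)  # each row sums to 1
```

Working on the log scale avoids numerical underflow when counts are large or the number of features is high.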

The Multinomial Naive Bayes is not available through the naive_bayes function.

NAs in newdata are excluded from the calculation of posterior probabilities; if any are present, an informative warning is given.
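One way to exclude missing counts is to set them to zero before the log-likelihood is accumulated, since a zero count contributes nothing to the sum of count times log-probability terms. The base R sketch below illustrates that idea. It is an assumption about how such exclusion could be done, not a copy of the package source, and `drop_na_counts` is a hypothetical helper.

```r
# Hypothetical helper (not part of the package): neutralise NA counts
drop_na_counts <- function(newdata) {
  na_idx <- is.na(newdata)
  if (any(na_idx)) {
    warning(sum(na_idx),
            " missing value(s) in newdata were excluded from the calculation.")
    # A zero count adds 0 * log(theta) = 0 to the log-likelihood
    newdata[na_idx] <- 0
  }
  newdata
}

x <- rbind(c(3, NA, 0), c(0, 1, 4))
x_clean <- suppressWarnings(drop_na_counts(x))
anyNA(x_clean)
#> [1] FALSE
```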

References

McCallum, Andrew; Nigam, Kamal (1998). A comparison of event models for Naive Bayes text classification. AAAI-98 workshop on learning for text categorization. 752. http://www.cs.cmu.edu/~knigam/papers/multinomial-aaaiws98.pdf

Author

Michal Majka, michalmajka@hotmail.com

Examples

library(naivebayes)

### Simulate the data:
cols <- 10 ; rows <- 100
M <- matrix(sample(0:5, rows * cols,  TRUE), nrow = rows, ncol = cols)
y <- factor(sample(paste0("class", LETTERS[1:2]), rows, TRUE, prob = c(0.3,0.7)))
colnames(M) <- paste0("V", seq_len(ncol(M)))
laplace <- 1

### Train the Multinomial Naive Bayes
mnb <- multinomial_naive_bayes(x = M, y = y, laplace = laplace)

# Classification
head(predict(mnb, newdata = M, type = "class"))
#> [1] classB classB classB classB classA classB
#> Levels: classA classB
head(mnb %class% M)
#> [1] classB classB classB classB classA classB
#> Levels: classA classB

# Posterior probabilities
head(predict(mnb, newdata = M, type = "prob"))
#>          classA    classB
#> [1,] 0.17623129 0.8237687
#> [2,] 0.41834308 0.5816569
#> [3,] 0.09334622 0.9066538
#> [4,] 0.19324231 0.8067577
#> [5,] 0.56491024 0.4350898
#> [6,] 0.23376937 0.7662306
head(mnb %prob% M)
#>          classA    classB
#> [1,] 0.17623129 0.8237687
#> [2,] 0.41834308 0.5816569
#> [3,] 0.09334622 0.9066538
#> [4,] 0.19324231 0.8067577
#> [5,] 0.56491024 0.4350898
#> [6,] 0.23376937 0.7662306