r – Error when running a naive Bayes model on text data using caret

I am attempting to train a naive Bayes model on text data with caret. I have predetermined the folds (so that the results can be compared with other models) and am using adaptive resampling for hyperparameter tuning. However, the following error appears:

Error in if (tmps < .Machine$double.eps^0.5) 0 else tmpm/tmps :
missing value where TRUE/FALSE needed

I know there are other methods, such as those provided by the quanteda package. However, I want to stay with caret so that I can compare other models using the same data.

Any help would be much appreciated.

My code is below:


```r
library(quanteda)
library(quanteda.textmodels)  # assumed source of data_corpus_moviereviews
library(caret)

corp <- data_corpus_moviereviews

# sample 1500 document names for the training set
id_train <- sample(docnames(corp), size = 1500, replace = FALSE)

# get training set
training_dfm <- corpus_subset(corp, docnames(corp) %in% id_train) %>%
  dfm(stem = TRUE, tolower = TRUE, remove = stopwords("en"), remove_symbols = TRUE)

# get test set (documents not in id_train, make features equal)
test_dfm <- corpus_subset(corp, !docnames(corp) %in% id_train) %>%
  dfm(stem = TRUE, tolower = TRUE, remove_symbols = TRUE, remove = stopwords("en")) %>%
  dfm_select(pattern = training_dfm,
             selection = "keep")

# convert the sparse dfm objects to dense matrices for caret
training_m <- convert(training_dfm, to = "matrix")
test_m <- convert(test_dfm, to = "matrix")

# predefined folds, intended to be reused across models
myFolds <- createFolds(training_m, k = 5)

# adaptive resampling with random search over the tuning parameters
myControl <- trainControl(
  summaryFunction = twoClassSummary,
  classProbs = TRUE,
  verboseIter = TRUE,
  index = myFolds,
  adaptive = list(min = 2,
                  alpha = 0.05,
                  method = "gls",
                  complete = TRUE),
  search = "random")

nb_caret <- train(x = training_m,
                  y = as.factor(docvars(training_dfm, "sentiment")),
                  method = "naive_bayes",
                  trControl = myControl,
                  tuneLength = 3,
                  verbose = TRUE,
                  metric = "ROC")
```
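
While debugging, one thing I can check is the fold indices themselves. The sketch below is only a toy illustration, not my real data: `toy_m` and `toy_y` are made-up stand-ins for `training_m` and the sentiment labels. It relies on two points from the caret documentation: `createFolds()` takes the outcome vector `y`, and `trainControl(index = )` expects the row indices used for training in each resample (which `createFolds(..., returnTrain = TRUE)` returns). It compares folds built from a matrix with folds built from the labels. I have not confirmed whether this is related to the error above.

```r
library(caret)

set.seed(1)  # arbitrary seed so the toy example is repeatable

# Hypothetical stand-ins for training_m and the sentiment labels
toy_m <- matrix(rpois(20 * 5, lambda = 2), nrow = 20, ncol = 5)
toy_y <- factor(rep(c("neg", "pos"), each = 10))

# Folds built from the matrix: it is handled as one long vector of its 100 cells
folds_m <- createFolds(toy_m, k = 5)

# Folds built from the outcome, returning the training rows for each fold
folds_y <- createFolds(toy_y, k = 5, returnTrain = TRUE)

# Compare the index ranges with the number of rows available
range_m <- range(unlist(folds_m))
range_y <- range(unlist(folds_y))
print(nrow(toy_m))
print(range_m)
print(range_y)
```

If the matrix-based folds contain indices larger than `nrow(toy_m)`, the same would presumably hold for `myFolds` built from `training_m`, which might be worth ruling out before looking at the adaptive resampling settings.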