I'm trying to write a function that edits workbooks imported via loadWorkbook and have a problem where blank cells get ignored rather than treated as NAs. The following code demonstrates this issue.
df <- data.frame(x = c(1, 2, 3, 4, NA, 6), y = c(7, 8, 9, 10, NA, 12))
xlsx::write.xlsx(df, file = 'testdf.xlsx', showNA = FALSE, row.names = FALSE)
testwb <- loadWorkbook(file = 'testdf.xlsx')
sheetnames <- c('Sheet1')
sheets <- lapply(sheetnames, function(name) getSheets(testwb)[[name]])
rows <- getRows(sheets[[1]], rowIndex=1:7)
cells <- getCells(rows, colIndex=1:2)
values <- sapply(unlist(cells), getCellValue)
values
length(values)
When I run it as is I get what I expected, with NAs included:
1.1 1.2 2.1 2.2 3.1 3.2 4.1 4.2 5.1 5.2 6.1 6.2 7.1 7.2
"x" "y" "1" "7" "2" "8" "3" "9" "4" "10" NA NA "6" "12"
However, if I open the excel file and manually delete the first row of data and then run it again, I get this:
1.1 1.2 3.1 3.2 4.1 4.2 5.1 5.2 7.1 7.2
"x" "y" "2" "8" "3" "9" "4" "10" "6" "12"
This causes problems for me because when doing comparisons between two sheets they end up with different numbers of values if one has more blank cells than the other. How do I force it to include NA values?
I tried manually rebuilding a copy of the workbook so I could use the keepNA argument in writeData like this:
testwb2 <- openxlsx::createWorkbook()
addWorksheet(testwb2, 'Sheet1')
writeData(testwb2, 'Sheet1', read_xlsx('testdf.xlsx', sheet = 1), keepNA = TRUE)
sheetnames <- c('Sheet1')
sheets <- lapply(sheetnames, function(name) getSheets(testwb2)[[name]])
rows <- getRows(sheets[[1]], rowIndex=1:7)
cells <- getCells(rows, colIndex=1:2)
values <- sapply(unlist(cells), getCellValue)
values
length(values)
but then I get the error
Error in envRefInferField(x, what, getClass(class(x)), selfEnv) : ‘getNumberOfSheets’ is not a valid field or method name for reference class “Workbook”
which I also don't understand. Is there a simple way to handle this?