R Reticulate does work with for loop but not purrr::map2

Question

I want to remove passwords from several Excel files, so I set up a data frame with paths and passwords and wrote a function that runs some Python code via reticulate.

When I call that function in a for loop, it works perfectly fine. However, when using map2, I get an error:

Error in !trace_length(trace) : invalid argument type

Here's my code (NOTE: you need to change the path to your desired path and need to make sure you have two respective Excel files with "test" and "test2" as their passwords):

library(tidyverse)
library(reticulate)

pw_dat <- data.frame(path = c("C:/Users/USERNAME/Downloads/file1.xlsx",
                              "C:/Users/USERNAME/Downloads/file2.xlsx"),
                     pw   = c("test", "test2"))

# function with python code
unlock_excel <- function(file, password)
{
  
  output_folder <- "C:/Users/USERNAME/Downloads/test/"
  file <- normalizePath(file)
  
  py_code <- sprintf("
import pathlib
import msoffcrypto

def unlock(filename, passwd, output_folder):
    temp = open(filename, 'rb')
    excel = msoffcrypto.OfficeFile(temp)
    excel.load_key(passwd)
    out_path = pathlib.Path(output_folder)
    
    if not out_path.exists():
        out_path.mkdir(parents=True, exist_ok=True)

    with open(str(out_path / pathlib.Path(filename).name), 'wb') as f:
        excel.decrypt(f)
    
    temp.close()

unlock('%s', '%s', '%s')
", file_path, password, output_folder)
  
  # run python script
  py_run_string(py_code)
}


## This works
for (i in 1:nrow(pw_dat)) {
  file_path <- pw_dat[i, "path"]
  password <- pw_dat[i, "pw"]
  
  unlock_excel(file_path, password)
}


## This doesn't work
map2(.x = pw_dat$path,
     .y = pw_dat$pw,
     .f = ~unlock_excel(file = .x, password = .y))

UPDATE: I think it has something to do with the normalizePath call. Without it, map2 works.

UPDATE 2: Here's the full traceback:

Error in !trace_length(trace) : invalid argument type
10.
stop(structure(list(message = " File \"<string>\", line 20\n unlock('C:\\Users\\USERNAME\\Downloads\\file1.xlsx', 'test', 'C:/Users/MartinDegen/Downloads/test/')\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\nSyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape\n\033[90mRun \033]8;;rstudio:run:reticulate::py_last_error()\a`reticulate::py_last_error()`\033]8;;\a for details.\033[39m",
call = py_run_string_impl(code, local, convert)), class = c("python.builtin.SyntaxError",
"python.builtin.Exception", "python.builtin.BaseException", "python.builtin.object",
"error", "condition"), py_object = <environment>))
9.
py_run_string_impl(code, local, convert)
8.
py_run_string(py_code)
7.
unlock_excel(file = .x, password = .y)
6.
.f(.x[[i]], .y[[i]], ...)
5.
call_with_cleanup(map2_impl, environment(), .type, .progress,
n, names, i)
4.
withCallingHandlers(expr, error = function(cnd) {
if (i == 0L) {
}
else { ...
3.
with_indexed_errors(i = i, names = names, error_call = .purrr_error_call,
call_with_cleanup(map2_impl, environment(), .type, .progress,
n, names, i))
2.
map2_("list", .x, .y, .f, ..., .progress = .progress)
1.
map2(.x = pw_dat$path, .y = pw_dat$pw, .f = ~unlock_excel(file = .x,
password = .y))
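
The SyntaxError in the traceback can be reproduced outside of R. Below is a minimal sketch, assuming only that the generated Python source ends up containing a backslash path inside an ordinary (non-raw) string literal, as the message in the traceback suggests. The path is the placeholder from the traceback, `'out/'` is a made-up output folder, and `try_compile` is a hypothetical helper, not part of reticulate.

```python
# Source text as it would look after interpolating a backslash path.
# The raw string (r"...") only keeps the backslashes intact in this demo.
bad_source = r"unlock('C:\Users\USERNAME\Downloads\file1.xlsx', 'test', 'out/')"
good_source = "unlock('C:/Users/USERNAME/Downloads/file1.xlsx', 'test', 'out/')"


def try_compile(source):
    """Return None if the source parses, otherwise the SyntaxError message."""
    try:
        # compile() only parses the code; unlock() is never called.
        compile(source, "<string>", "exec")
        return None
    except SyntaxError as err:
        return str(err)


bad_error = try_compile(bad_source)    # "\U" starts an 8-digit unicode escape
good_error = try_compile(good_source)  # forward slashes need no escaping

print(bad_error)
print(good_error)
```

Inside a normal Python string literal, `\U` must be followed by eight hex digits, so `'C:\Users...'` fails at parse time, before any file is opened. Forward slashes avoid the problem.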

UPDATE 3: Value of py_code:

[1] "\nimport pathlib\nimport msoffcrypto\n\ndef unlock(filename, passwd, output_folder):\n    temp = open(filename, 'rb')\n    excel = msoffcrypto.OfficeFile(temp)\n    excel.load_key(passwd)\n    out_path = pathlib.Path(output_folder)\n    \n    if not out_path.exists():\n        out_path.mkdir(parents=True, exist_ok=True)\n\n    with open(str(out_path / pathlib.Path(filename).name), 'wb') as f:\n        excel.decrypt(f)\n    \n    temp.close()\n\n# Call the function\nunlock('C:/Users/USERNAME/Downloads/file1.xlsx', 'test', 'C:/Users/USERNAME/Downloads/test/')\n"

I think the error is referencing rlang::trace_length, but I don't see that call in purrr, and I cannot seem to get it invoked locally. When you get the error, does .Traceback or traceback() give you any better context?
– r2evans, Commented Jul 8 at 12:50
@r2evans updated my post with the traceback. I think it really has to do with normalizePath. Deleting that line of code makes the function run. Maybe it's because Python uses a different syntax for writing paths? I don't know.
– deschen, Commented Jul 8 at 13:16
Try normalizePath(..., winslash="/"). On non-Windows systems this is a no-op, and on Windows it still works in R just fine (and I suspect in Python as well).
– r2evans, Commented Jul 8 at 13:18
I don't know exactly how reticulate passes the strings to Python; it might be that the backslashes R passes to Python confuse it. (Sorry, I don't have Windows up to test this theory.)
– r2evans, Commented Jul 8 at 13:20
Can you print the value of py_code before running it so we can see exactly how the string is interpolated when the error is generated?
– MrFlick, Commented Jul 8 at 14:17

margusl · Accepted Answer · 2025-07-09 04:54:40Z

In your sprintf() call you are using file_path (global object, updated in a loop) instead of file (function arg & local object, never passed to your Python code):

unlock_excel <- function(file, password){
  
  output_folder <- "C:/Users/USERNAME/Downloads/test/"
  file <- normalizePath(file)
  
  py_code <- sprintf("...
unlock('%s', '%s', '%s')
", file_path, password, output_folder)
  
  # run python script
  py_run_string(py_code)
}

It could be just a typo, but it would be much easier to spot if global objects were avoided in functions. Also, there's no need to re-import modules and re-declare the Python function on every call, nor to use an R wrapper function. After executing the Python code that defines unlock(), you can call it directly with py$unlock() from R.
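
The global-versus-argument mix-up is not specific to R. Here is a rough Python analogue, offered as a sketch only: `describe` is a hypothetical stand-in for unlock_excel(), and the file names are shortened placeholders. It shows why a loop that happens to assign the global can hide the bug, while a mapping function exposes it.

```python
def describe(file, password):
    # Bug on purpose: reads the global `file_path`, not the `file` argument.
    return f"unlock({file_path!r}, {password!r})"


paths = ["file1.xlsx", "file2.xlsx"]
passwords = ["test", "test2"]

# The loop variable is itself named `file_path`, so the global is updated
# on every iteration and the function appears to work.
loop_results = []
for file_path, pw in zip(paths, passwords):
    loop_results.append(describe(file_path, pw))

# map() passes the values as arguments only; the global keeps whatever
# value the loop left behind ("file2.xlsx").
map_results = list(map(describe, paths, passwords))

print(loop_results)
print(map_results)
```

With map(), every call sees the stale global, so the first file is never processed under its own name. If the global had never been assigned, the call would fail with a NameError instead.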

As unlock() is called only for side effects and does not have a (non-None) return value, purrr::walk2() would be preferred here over purrr::map2().

library(purrr)
library(reticulate)
py_require("msoffcrypto-tool")

# test files:
fs::dir_tree("excel")
#> excel
#> ├── file1.xlsx
#> └── file2.xlsx
pw_dat <- 
  data.frame(path = fs::path("excel") / c("file1.xlsx", "file2.xlsx"),
             pw   = c("test", "test2"))
pw_dat
#>               path    pw
#> 1 excel/file1.xlsx  test
#> 2 excel/file2.xlsx test2

py_run_string("
import pathlib
import msoffcrypto

def unlock(filename, passwd, output_folder):
    temp = open(filename, 'rb')
    excel = msoffcrypto.OfficeFile(temp)
    excel.load_key(passwd)
    out_path = pathlib.Path(output_folder)
    
    if not out_path.exists():
        out_path.mkdir(parents=True, exist_ok=True)

    with open(str(out_path / pathlib.Path(filename).name), 'wb') as f:
        excel.decrypt(f)
    
    temp.close()
")

fs::dir_create("excel/out")
walk2(
  .x = pw_dat$path, 
  .y = pw_dat$pw, 
  .f = \(x, y) py$unlock(fs::path_abs(x), y, fs::path_abs("excel/out"))
)

Resulting dir tree:

fs::dir_tree("excel")
#> excel
#> ├── file1.xlsx
#> ├── file2.xlsx
#> └── out
#>     ├── file1.xlsx
#>     └── file2.xlsx
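
A side benefit of defining the function once and passing paths as arguments, as done above, is that no path is pasted into Python source text, so backslashes never need escaping. The sketch below illustrates this in plain Python, without reticulate. `file_name` is a hypothetical helper that mimics only the `pathlib.Path(filename).name` step of unlock(), and the path is a placeholder.

```python
import pathlib


def file_name(filename):
    # PureWindowsPath parses Windows-style paths on any operating system.
    return pathlib.PureWindowsPath(filename).name


# A Windows path held as a value; "\\" in this literal is one backslash.
windows_path = "C:\\Users\\USERNAME\\Downloads\\file1.xlsx"

# Passing the value as an argument: nothing is parsed as source code.
direct = file_name(windows_path)

# If source text must be generated anyway, repr() (the !r conversion)
# produces a correctly escaped string literal.
generated = f"file_name({windows_path!r})"
via_source = eval(generated)

print(direct)
print(generated)
print(via_source)
```

Both routes return the bare file name. The first is closer to calling py$unlock() with R values. The second is only worth considering when building source text cannot be avoided.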

Thanks for this code. I will investigate this a bit more. To be honest, I only have a limited idea of what the Python code does. I'm not familiar with Python and only used it because I have to handle password-protected Excel files in R (for which there is no R package or function). The code is 50% Google, 50% ChatGPT.
