Hi, I make a lot of graphs and was wondering whether I'm doing it in a fast, efficient way. I commonly need to produce visualizations that are ~2000 pages long (one page for each unit of analysis).

I tried to make a toy example here that writes to a temp directory.

library(ggplot2)
library(furrr)
library(dplyr)

plan("multisession")

# define graphing function
plot_graph <- function(dt) {
  
  dt |> ggplot(aes(x = x, y = y)) +
    geom_point()
}

# create dataset
a <- 2e3
b <- 1e3

dt <- data.frame(
  id = rep(1:a, each = b),
  x = runif(a*b),
  y = runif(a*b)
)

# create list of plots
list_plots <- dt |>
  split(f = dt$id) |>  # split on the id column; f = "id" would put every row in one group
  purrr::map(plot_graph)

# set up temp dir
dir_out_tmp <- tempdir()
filename_out <- "temp"

# save graphs to temp dir
furrr::future_iwalk(
  list_plots,
  ~withr::with_pdf(
    new = fs::path(dir_out_tmp, paste(filename_out, .y, sep = "-"), ext = "pdf"),
    width = 15,
    height = 8,
    code = plot(.x)
  )
)

# check for files
files_temp <- fs::path(dir_out_tmp, paste(filename_out, names(list_plots), sep = "-"), ext = "pdf")
stopifnot(all(fs::file_exists(files_temp)))

# combine graphs
qpdf::pdf_combine(
  input = files_temp,
  output = fs::path(dir_out_tmp, filename_out, ext = "pdf")
)


Basically, the code saves each plot as its own PDF in the temp directory and then combines them into one file; in real applications, the combined file is saved to the actual output folder.

This is pretty fast, but I'm wondering if there are tools to save a multipage PDF directly from the list of plots?

EDIT: I've tried ggsave and it is much slower

  • It might be worth putting the code into an Rmd/qmd file and then calling purrr::walk(list_plots, print) in one chunk. Rendering to PDF should then create all your graphs and combine them into the final file. Commented Sep 26, 2024 at 21:21
  • I'm curious who is going to actually look through a 2000-page document ... (maybe there's an index and people are interested in looking at particular plots/pages?) Commented Sep 26, 2024 at 23:02
  • A side note on archiving that many files: after you create all those files in their directories, you can install a 7-Zip CLI tool and run a .zip archival script to compress them, so they take up less drive space when saved for later. Commented Sep 27, 2024 at 1:01
  • Isn't this the perfect use case for Quarto? You can create a chunk that produces plots using a loop, and then you will get an .html or .pdf file with a dynamic index or sidebar, allowing you to browse all plots. Commented Sep 29, 2024 at 19:44
  • @BastiánOleaHerrera @AndyBaxter This sounds like it might be the answer, thanks! Do you know how quick it is to render? I know that before we had this current code we tried ggsave() but it was just so slow Commented Sep 30, 2024 at 21:42
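
The walk-and-print idea from the comments can also be tried without a render step. This is a minimal sketch, not a benchmarked replacement for the furrr/qpdf pipeline in the question: it opens a single base R PDF device and prints each plot into it, so every print() starts a new page in the same file. The small data sizes and the output file name all-plots.pdf are assumptions made for illustration, and the loop runs sequentially, so whether it is faster on ~2000 plots would need to be measured.

```r
library(ggplot2)

# Hypothetical stand-in data, far smaller than the question's 2e3 x 1e3
set.seed(1)
n_ids <- 5
n_obs <- 100
dt <- data.frame(
  id = rep(seq_len(n_ids), each = n_obs),
  x = runif(n_ids * n_obs),
  y = runif(n_ids * n_obs)
)

# Same plotting function as in the question
plot_graph <- function(d) {
  ggplot(d, aes(x = x, y = y)) +
    geom_point()
}

# One plot per id
list_plots <- lapply(split(dt, dt$id), plot_graph)

# Open a single PDF device; onefile = TRUE keeps all pages in one file
file_out <- file.path(tempdir(), "all-plots.pdf")  # assumed file name
grDevices::pdf(file_out, width = 15, height = 8, onefile = TRUE)

# Each print() draws one plot on a new page of the open device
for (p in list_plots) print(p)

# Close the device so the file is flushed to disk
invisible(grDevices::dev.off())
```

Because only one device is open at a time, this approach does not parallelize the way future_iwalk() does; the trade-off is that there are no intermediate files to combine afterwards.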
