
loo_subsample slower than loo when many lookups are needed in complex models #1782


Open
rubenarslan opened this issue May 27, 2025 · 1 comment


@rubenarslan

Based on this forum thread

loo_subsample is meant to be a faster alternative to loo and is most likely to be used for complex models.

Unfortunately, the implementation in brms (unlike loo_subsample in the loo package itself) seems to incur overhead when looking up parameters, and in my simulations that overhead costs more than the subsampling saves.

When I profile the function (documented in the forum thread), most of the time is spent in p, which is called by predictor_re. Apparently, according to @jgabry you could get rid of the r_eff call but not log_lik. So, based on my limited understanding, I see two ways to solve this: either p would need to become faster (probably not possible), or the extraction of observations and draws would need to happen once, vectorised.
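To make the second option concrete, here is a minimal base-R sketch of the difference between one lookup per observation and a single vectorised extraction. It is a toy illustration only and does not use or reproduce brms internals: the names `r_draws`, `group`, `obs`, `ll_loop`, and `ll_vec` are hypothetical, and the model is assumed to be a Gaussian with one group-level intercept and a fixed `sigma`.

```r
# Toy setup (all names hypothetical, not brms internals):
# S posterior draws, J groups, N observations.
set.seed(1)
S <- 500; J <- 50; N <- 1000
group   <- sample.int(J, N, replace = TRUE)   # group index of each observation
y       <- rnorm(N)                           # simulated responses
r_draws <- matrix(rnorm(S * J), S, J)         # draws of group-level intercepts
sigma   <- 1                                  # fixed residual SD for simplicity

obs <- sample.int(N, 100)                     # the subsampled observations

# (a) One lookup per observation: the draws are indexed again for every i.
ll_loop <- sapply(obs, function(i) {
  dnorm(y[i], mean = r_draws[, group[i]], sd = sigma, log = TRUE)
})

# (b) One vectorised extraction: index all needed columns at once,
# then evaluate the log-likelihood on the whole S x length(obs) matrix.
mu     <- r_draws[, group[obs], drop = FALSE]
ll_vec <- dnorm(rep(y[obs], each = S), mean = mu, sd = sigma, log = TRUE)
dim(ll_vec) <- dim(mu)

# Both approaches give the same draws-by-observations log-likelihood matrix.
stopifnot(isTRUE(all.equal(ll_loop, ll_vec)))
```

Whether the same restructuring is feasible inside brms depends on how the subsampling code requests log-likelihood values, which this sketch does not address.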

My reprex:

library(brms)

options(mc.cores = 4, brms.backend="cmdstanr")
max_s_id <- max(inhaler$subject, na.rm = TRUE)

enlarged_inhaler <- purrr::map_dfr(
  .x = 0:9, # Creates 10 versions (0 = original, 1..9 = new copies with shifted subject IDs)
  .f = ~ dplyr::mutate(inhaler, subject = subject + (.x * max_s_id))
)

fit1 <- brm(rating ~ treat + period + carry,
            data = enlarged_inhaler)

fit2 <- brm(rating ~ treat + (1|period) + carry + (1|subject),
            data = enlarged_inhaler)


options(mc.cores = 1, brms.backend="cmdstanr")

# Time full LOO for both models
system.time({
  fit1l <- add_criterion(fit1, criterion = "loo", overwrite = TRUE, ndraws = 500)
  fit2l <- add_criterion(fit2, criterion = "loo", overwrite = TRUE, ndraws = 500)
})

# Time subsampled LOO; fit2 is given the loo_subsample result of fit1 as `observations`
system.time({
  fit1ls <- add_criterion(fit1, criterion = "loo_subsample", observations = 100, ndraws = 500, overwrite = TRUE)
  fit2ls <- add_criterion(fit2, criterion = "loo_subsample", observations = fit1ls$criteria$loo_subsample, ndraws = 500, overwrite = TRUE)
})

# Profile the slow case (the model with group-level effects)
profvis::profvis({
  fit2ls <- add_criterion(fit2, criterion = "loo_subsample", observations = fit1ls$criteria$loo_subsample, ndraws = 500, overwrite = TRUE)
})
@paul-buerkner paul-buerkner added this to the brms 2.23.0 milestone May 27, 2025
@jgabry

jgabry commented May 27, 2025

> Apparently, according to @jgabry you could get rid of the r_eff call but not log_lik.

Right, the r_eff calculation is nice to have but often not essential. In many of the packages that call the loo package (e.g. rstanarm, brms, cmdstanr) we were always computing r_eff. But @avehtari pointed out that it's usually not necessary, and we should probably default to not computing it unless the user requests it.
