
eblondel commented Mar 16, 2025

Hi @pvictor, here's a PR that aims to make create_tree faster.

With this PR, I managed to significantly decrease the time required to build a tree for a large list of levels.

The context that led to this PR is an R Shiny app, r-geoflow/geoflow-shiny, which serves as the UI for the R package r-geoflow/geoflow. In it, I'm building a metadata editor that connects to vocabulary servers (powered by RDF/SKOS vocabularies). Some of these vocabularies can be very large, with many collections of concepts.

See below a script that summarizes the benchmarking I did on a large vocabulary (131,689 records). I attach the vocabulary mentioned in this script as CSV, in case you want to reproduce it. The code runs about 25x faster with this PR.

#create_tree benchmarking

require(shinyWidgets)
require(readr)
require(waldo)

# Data queried through geoflow package
# provided as CSV for convenience

# if(!require(geoflow)){
#   remotes::install_github("r-geoflow/geoflow")
#   require(geoflow)
# }
# vocab = geoflow::list_vocabularies(raw = TRUE)[[3]]
# concepts = vocab$list_concepts()
concepts <- readr::read_csv("concepts.csv")

#build tree from above concepts
system.time(
  tree1 <- shinyWidgets::create_tree(
    concepts,
    levels = c("collectionLabel", "prefLabel"), 
    levels_id = c("collection", "concept")
  )
)
#-------------------------------------------
## => master
## user    system  elapsed
## 35.79   0.58    41.10
#-------------------------------------------

#run after PR shinyWidgets install
system.time(
  tree2 <- shinyWidgets::create_tree(
    concepts,
    levels = c("collectionLabel", "prefLabel"), 
    levels_id = c("collection", "concept")
  )
)
#-------------------------------------------
## => PR
## user    system  elapsed
## 1.46    0.12    1.65
#-------------------------------------------

#qa with waldo
waldo::compare(tree1, tree2)
#-------------------------------------------
## ✔ No differences
#-------------------------------------------

Member

pvictor commented Mar 20, 2025

Hello,
Thank you for this. I tried it quickly and didn't see the same performance gain, so I'll find some time to look into it further.

@eblondel
Author

concepts.zip
Here's the file I used in the above test. I thought it was attached, but the file was apparently too large.
