`dbplyr_uncount()` on Redshift tables fails with an opaque error

## Description

When using `dbplyr_uncount()` on a table within a ~Postgres or~ AWS Redshift database, we always fail with an error which mentions the `seq2()` function. It's not immediately obvious why this happens.


## Minimal example

The first few lines will need to be adjusted to create a valid connection/table.

``` r
# Establish a connection to a Redshift database
con <- RPostgres::dbConnect(...)

# Create a simple demo query - I'm using a "calendar" table I happen to have available here
q <- dplyr::tbl(con, I("lookups.calendar")) |> 
  dplyr::select(date) |> 
  dplyr::filter(date >= "2025-04-01", date <= "2025-04-03")

# Show current results
dplyr::collect(q)
#> # A tibble: 3 × 1
#>   date      
#>   <date>    
#> 1 2025-04-01
#> 2 2025-04-02
#> 3 2025-04-03

# All fine so far!

# Now suppose we want to duplicate every row...
dbplyr::dbplyr_uncount(q, 2)
#> Error in `seq2()`:
#> ! `to` must be length one.
```

<sup>Created on 2025-04-01 with [reprex v2.1.1](https://reprex.tidyverse.org)</sup>

<details style="margin-bottom:10px;">
<summary>
Session info
</summary>

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.2 (2023-10-31)
#>  os       Amazon Linux 2023.6.20250123
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  C.UTF-8
#>  ctype    C.UTF-8
#>  tz       Europe/London
#>  date     2025-04-01
#>  pandoc   3.2 @ /usr/lib/rstudio-server/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  bit           4.5.0.1 2024-12-03 [2] CRAN (R 4.3.2)
#>  bit64         4.6.0-1 2025-01-16 [2] CRAN (R 4.3.2)
#>  blob          1.2.4   2023-03-17 [2] CRAN (R 4.3.2)
#>  box           1.2.0   2024-02-06 [2] CRAN (R 4.3.2)
#>  class         7.3-23  2025-01-01 [3] CRAN (R 4.3.2)
#>  classInt      0.4-11  2025-01-08 [2] CRAN (R 4.3.2)
#>  cli           3.6.3   2024-06-21 [2] CRAN (R 4.3.2)
#>  DBI           1.2.3   2024-06-02 [2] CRAN (R 4.3.2)
#>  dbplyr        2.5.0   2024-03-19 [2] CRAN (R 4.3.2)
#>  digest        0.6.37  2024-08-19 [2] CRAN (R 4.3.2)
#>  dplyr         1.1.4   2023-11-17 [2] CRAN (R 4.3.2)
#>  e1071         1.7-16  2024-09-16 [2] CRAN (R 4.3.2)
#>  evaluate      1.0.3   2025-01-10 [2] CRAN (R 4.3.2)
#>  fastmap       1.2.0   2024-05-15 [2] CRAN (R 4.3.2)
#>  fs            1.6.5   2024-10-30 [2] CRAN (R 4.3.2)
#>  generics      0.1.3   2022-07-05 [2] CRAN (R 4.3.2)
#>  glue          1.8.0   2024-09-30 [2] CRAN (R 4.3.2)
#>  hms           1.1.3   2023-03-21 [2] CRAN (R 4.3.2)
#>  htmltools     0.5.8.1 2024-04-04 [2] CRAN (R 4.3.2)
#>  KernSmooth    2.23-26 2025-01-01 [3] CRAN (R 4.3.2)
#>  knitr         1.49    2024-11-08 [2] CRAN (R 4.3.2)
#>  lifecycle     1.0.4   2023-11-07 [2] CRAN (R 4.3.2)
#>  lubridate     1.9.4   2024-12-08 [2] CRAN (R 4.3.2)
#>  magrittr      2.0.3   2022-03-30 [2] CRAN (R 4.3.2)
#>  pillar        1.10.1  2025-01-07 [2] CRAN (R 4.3.2)
#>  pkgconfig     2.0.3   2019-09-22 [2] CRAN (R 4.3.2)
#>  proxy         0.4-27  2022-06-09 [2] CRAN (R 4.3.2)
#>  purrr         1.0.2   2023-08-10 [2] CRAN (R 4.3.2)
#>  R6            2.5.1   2021-08-19 [2] CRAN (R 4.3.2)
#>  Rcpp          1.0.14  2025-01-12 [2] CRAN (R 4.3.2)
#>  reprex        2.1.1   2024-07-06 [2] CRAN (R 4.3.2)
#>  rlang         1.1.5   2025-01-17 [2] CRAN (R 4.3.2)
#>  rmarkdown     2.29    2024-11-04 [2] CRAN (R 4.3.2)
#>  RPostgres     1.4.7   2024-05-27 [2] CRAN (R 4.3.2)
#>  rstudioapi    0.17.1  2024-10-22 [2] CRAN (R 4.3.2)
#>  sessioninfo   1.2.2   2021-12-06 [2] CRAN (R 4.3.2)
#>  sf            1.0-19  2024-11-05 [2] CRAN (R 4.3.2)
#>  tibble        3.2.1   2023-03-20 [2] CRAN (R 4.3.2)
#>  tidyselect    1.2.1   2024-03-11 [2] CRAN (R 4.3.2)
#>  timechange    0.3.0   2024-01-18 [2] CRAN (R 4.3.2)
#>  units         0.8-5   2023-11-28 [2] CRAN (R 4.3.2)
#>  vctrs         0.6.5   2023-12-01 [2] CRAN (R 4.3.2)
#>  withr         3.0.2   2024-10-28 [2] CRAN (R 4.3.2)
#>  xfun          0.50    2025-01-07 [2] CRAN (R 4.3.2)
#>  yaml          2.3.10  2024-07-26 [2] CRAN (R 4.3.2)
#> 
#>  [1] /home/owen10004/efs/R/x86_64-amazon-linux-gnu-library/4.3
#>  [2] /usr/lib64/R/site-library
#>  [3] /usr/lib64/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
```

</details>


This example uses a constant `weights` value, but we see the same error if we pass a column name for `weights`.


## Cause

~Postgres converts all (unquoted) column names to lowercase.~ Redshift converts all column names in results, always, to lowercase. Within the query that's executed in order to calculate `n_max`, this conversion leads to a name mismatch, resulting in an unexpected `NULL` result.


## Details

Early on in `dbplyr::dbplyr_uncount()`, a query is constructed calculate the max number of repetitions, and is then executed implicitly via `dplyr::pull()`:

https://github.com/tidyverse/dbplyr/blob/572693098d0098018b6bb2d1e88b650d2e1e1580/R/verb-uncount.R#L50

Note that the argument provided to the `summarise()` query isn't named, so it's given a name using the expression itself (I think [via `set_names()`?](https://github.com/tidyverse/dbplyr/blob/572693098d0098018b6bb2d1e88b650d2e1e1580/R/verb-summarise.R#L155)) - which looks like:

```
max(..., na.rm = TRUE)
```


The call is to `pull()` is dispatched to `dbplyr::pull.tbl_sql()`, which:

1. Derives the name of the variable to extract:
   
    https://github.com/tidyverse/dbplyr/blob/572693098d0098018b6bb2d1e88b650d2e1e1580/R/verb-pull.R#L20

    As we just saw, that variable is called something like `max(..., na.rm = TRUE)`.

2. Collects query results from the database:
   
   https://github.com/tidyverse/dbplyr/blob/572693098d0098018b6bb2d1e88b650d2e1e1580/R/verb-pull.R#L29
   
   However, **~Postgres converts all (unquoted) column names to lowercase~ Redshift converts all column names, always, in results to lowercase:**

   > There are two types of identifiers, standard identifiers and quoted or delimited identifiers. Identifiers must consist of only UTF-8 printable characters. ASCII letters in standard and delimited identifiers are case-insensitive and are folded to lowercase in the database. In query results, column names are returned as lowercase by default. -- [Redshift docs](https://docs.aws.amazon.com/redshift/latest/dg/r_names.html)

   This means that the collected results contain a column called something like `max(..., na.rm = true)`.

3. Uses the name to extract the appropriate column from the collected results:
   
   https://github.com/tidyverse/dbplyr/blob/572693098d0098018b6bb2d1e88b650d2e1e1580/R/verb-pull.R#L30

   Due to the mismatch in names, we end up with `out <- NULL`


So we end up with `n_max <- NULL`, hence the error in the `seq2()` call a little later:

https://github.com/tidyverse/dbplyr/blob/572693098d0098018b6bb2d1e88b650d2e1e1580/R/verb-uncount.R#L63



<details><summary>Output from `debug()`</summary>

```
> debugonce(dbplyr:::pull.tbl_sql)
> dbplyr::dbplyr_uncount(q, 2)
debugging in: pull.tbl_sql(summarise(ungroup(data), max(!!sym(weights_col), 
    na.rm = TRUE)))
debug: {
    vars <- tbl_vars(.data)
    var <- tidyselect::vars_pull(vars, !!enquo(var), error_arg = "var")
    name_quo <- enquo(name)
    if (!quo_is_null(name_quo)) {
        name <- tidyselect::vars_pull(vars, !!name_quo, error_arg = "name")
    }
    .data <- ungroup(.data)
    .data <- select(.data, all_of(c(var, name)))
    .data <- collect(.data)
    out <- .data[[var]]
    if (!is.null(name)) {
        out <- set_names(out, nm = .data[[name]])
    }
    out
}
Browse[2]> n
debug: vars <- tbl_vars(.data)
Browse[2]> n
debug: var <- tidyselect::vars_pull(vars, !!enquo(var), error_arg = "var")
Browse[2]> n
debug: name_quo <- enquo(name)
Browse[2]> n
debug: if (!quo_is_null(name_quo)) {
    name <- tidyselect::vars_pull(vars, !!name_quo, error_arg = "name")
}
Browse[2]> n
debug: .data <- ungroup(.data)
Browse[2]> n
debug: .data <- select(.data, all_of(c(var, name)))
Browse[2]> n
debug: .data <- collect(.data)
Browse[2]> n
debug: out <- .data[[var]]
Browse[2]> n
debug: if (!is.null(name)) {
    out <- set_names(out, nm = .data[[name]])
}
Browse[2]> .data
# A tibble: 1 × 1
  `max(..dbplyr_weight_col, na.rm = true)`
                                     <dbl>
1                                        2
Browse[2]> var
[1] "max(..dbplyr_weight_col, na.rm = TRUE)"
Browse[2]> out
NULL
```

</details>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

`dbplyr_uncount()` on Redshift tables fails with an opaque error #1601

Description

Minimal example

Cause

Details

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

dbplyr_uncount() on Redshift tables fails with an opaque error #1601

Description

Description

Minimal example

Cause

Details

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`dbplyr_uncount()` on Redshift tables fails with an opaque error #1601