Commit cb51247

DavisVaughan authored and hadley committed
Improve pivot_wider() performance (#790)
Use `vec_group_id()` and `vec_unique_loc()` to improve performance
1 parent 666e62d commit cb51247

3 files changed: +10 -4 lines changed

DESCRIPTION

Lines changed: 3 additions & 1 deletion
@@ -37,7 +37,7 @@ Imports:
     tibble (>= 2.1.1),
     tidyselect (>= 0.2.5),
     utils,
-    vctrs (>= 0.2.0),
+    vctrs (>= 0.2.0.9007),
     lifecycle
 Suggests:
     covr,
@@ -55,3 +55,5 @@ Encoding: UTF-8
 LazyData: true
 Roxygen: list(markdown = TRUE)
 RoxygenNote: 7.0.0
+Remotes:
+    r-lib/vctrs

NEWS.md

Lines changed: 3 additions & 0 deletions
@@ -1,5 +1,8 @@
 # tidyr (development version)
 
+* `pivot_wider()` and `pivot_longer()` are considerably more performant, thanks
+  largely to improvements in the underlying vctrs code (#790, @DavisVaughan)
+
 * `unnest_wider()` and `unnest_longer()` can now unnest `list_of` columns. This
   is important for unnesting columns created from `nest()`, which are always
   `list_of` columns, and for usage after `pivot_wider()`, which, by default,

R/pivot-wide.R

Lines changed: 4 additions & 3 deletions
@@ -131,10 +131,12 @@ pivot_wider_spec <- function(data,
   df_rows <- data[key_vars]
   if (ncol(df_rows) == 0) {
     rows <- tibble(.rows = 1)
+    nrow <- 1L
     row_id <- rep(1L, nrow(df_rows))
   } else {
-    rows <- vec_unique(df_rows)
-    row_id <- vec_match(df_rows, rows)
+    row_id <- vec_group_id(df_rows)
+    nrow <- attr(row_id, "n")
+    rows <- vec_slice(df_rows, vec_unique_loc(row_id))
   }
 
   value_specs <- unname(split(spec, spec$.value))
@@ -158,7 +160,6 @@ pivot_wider_spec <- function(data,
     val_id <- dedup$key
     val <- dedup$val
 
-    nrow <- nrow(rows)
     ncol <- nrow(spec_i)
 
     fill <- values_fill[[value]]
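
As a rough illustration of the change in R/pivot-wide.R, the sketch below runs the old `vec_unique()` + `vec_match()` pair and the new `vec_group_id()` + `vec_unique_loc()` approach side by side on a small data frame. The toy data and the `*_old` names are made up for this example; the vctrs calls are the ones named in the diff. It assumes a vctrs version that exports `vec_group_id()` (the DESCRIPTION change above requires >= 0.2.0.9007).

```r
library(vctrs)

# Hypothetical key columns; `df_rows` mirrors the variable name in the diff.
df_rows <- data.frame(
  id  = c("a", "b", "a", "c", "b"),
  grp = c(1L, 2L, 1L, 1L, 2L)
)

# Previous approach: deduplicate the rows, then match every row back
# against the deduplicated set.
rows_old   <- vec_unique(df_rows)
row_id_old <- vec_match(df_rows, rows_old)

# Approach in this commit: compute the group ids once and reuse them.
row_id <- vec_group_id(df_rows)      # integer id per row, in order of first appearance
nrow   <- attr(row_id, "n")          # number of distinct key rows
rows   <- vec_slice(df_rows, vec_unique_loc(row_id))  # first row of each group
```

Both paths should yield the same row ids and the same set of unique key rows. The second derives `rows` from the integer ids, so the key columns themselves are only grouped once, and the group count comes from the `"n"` attribute with no extra call.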
