R/speedy_functions.R
speedy_functions.Rd
These functions are deprecated in favor of the functions whose names
end with an underscore _ (e.g., sselect() -> svTidy::select_()) in the
svTidy package.
The Tidyverse defines a coherent set of tools to manipulate
data frames that use non-standard evaluation and sometimes require extra
care. These functions, like dplyr::mutate() or dplyr::summarise(), are
defined in the {dplyr} and {tidyr} packages. The {collapse} package
provides a number of functions with a similar interface, but with different
and much faster code.
For instance, collapse::fselect() is similar to dplyr::select(), and
collapse::fsummarise() is similar to dplyr::summarise(). However, not all
functions are implemented, arguments and argument names differ, and the
behavior may be very different: collapse::frename() uses
old_name = new_name, while dplyr::rename() uses new_name = old_name!
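The similarity and the reversed renaming order described above can be checked with a minimal sketch. It assumes that the {dplyr} and {collapse} packages are installed; the data frame `df` and its column names are made up for illustration.

```r
# Toy data frame (hypothetical, for illustration only).
df <- data.frame(a = 1:3, b = c("u", "v", "w"), d = c(1.5, 2.5, 3.5))

if (requireNamespace("dplyr", quietly = TRUE) &&
    requireNamespace("collapse", quietly = TRUE)) {
  # Selecting columns: the call pattern is the same in both packages.
  sel_tidy <- dplyr::select(df, a, d)
  sel_fast <- collapse::fselect(df, a, d)
  print(identical(names(sel_tidy), names(sel_fast)))

  # Renaming column `a` to `x`: the argument order is reversed.
  ren_tidy <- dplyr::rename(df, x = a)      # new_name = old_name
  ren_fast <- collapse::frename(df, a = x)  # old_name = new_name
  print(identical(names(ren_tidy), names(ren_fast)))
}
```

Both comparisons look only at the resulting column names, so they say nothing about the class or attributes of the returned objects, which may differ between the two packages.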
The speedy functions are all prefixed with an "s", like smutate(), and
build on the work initiated in {collapse} to propose a series of functions
paired with the tidy ones. So, smutate() and dplyr::mutate() are
"speedy" and "tidy" counterparts, and they are used in a very
similar, if not identical, way. The "s" prefix is there to
draw attention to their particularities. Their classes are function
and speedy_fn. Avoid mixing tidy, speedy and non-tidy/speedy functions in
the same pipeline.
This is a global page that presents all the speedy functions in one place.
It is not meant to be a clear and detailed help page for each individual "s"
function. Please refer to the corresponding help page of the non-"s" paired
function for more details! You can use the .?smutate syntax from {svMisc} to
go to the help page of the non-"s" function, with a message.
list_speedy_functions()
sgroup_by(.data, ...)
sungroup(.data, ...)
srename(.data, ...)
srename_with(.data, .fn, .cols = everything(), ...)
sfilter(.data, ...)
sfilter_ungroup(.data, ...)
sselect(.data, ...)
smutate(.data, ..., .keep = "all")
smutate_ungroup(.data, ..., .keep = "all")
stransmute(.data, ...)
stransmute_ungroup(.data, ...)
ssummarise(.data, ...)
sfull_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)
sleft_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)
sright_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)
sinner_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)
sbind_rows(..., .id = NULL)
scount(
x,
...,
wt = NULL,
sort = FALSE,
name = NULL,
.drop = dplyr::group_by_drop_default(x),
sort_cat = TRUE,
decreasing = FALSE
)
stally(
x,
wt = NULL,
sort = FALSE,
name = NULL,
sort_cat = TRUE,
decreasing = FALSE
)
sadd_count(
x,
...,
wt = NULL,
sort = FALSE,
name = NULL,
.drop = NULL,
sort_cat = TRUE,
decreasing = FALSE
)
sadd_tally(
x,
wt = NULL,
sort = FALSE,
name = NULL,
sort_cat = TRUE,
decreasing = FALSE
)
sbind_cols(
...,
.name_repair = c("unique", "universal", "check_unique", "minimal")
)
sarrange(.data, ..., .by_group = FALSE)
spull(.data, var = -1, name = NULL, ...)
sdistinct(.data, ..., .keep_all = FALSE)
sdrop_na(data, ...)
sreplace_na(data, replace, ...)
spivot_longer(data, cols, names_to = "name", values_to = "value", ...)
spivot_wider(data, names_from = name, values_from = value, ...)
suncount(data, weights, .remove = TRUE, .id = NULL)
sunite(data, col, ..., sep = "_", remove = TRUE, na.rm = FALSE)
sseparate(
data,
col,
into,
sep = "[^[:alnum:]]+",
remove = TRUE,
convert = FALSE,
...
)
sseparate_rows(data, ..., sep = "[^[:alnum:].]+", convert = FALSE)
sfill(data, ..., .direction = c("down", "up", "downup", "updown"))
sextract(
data,
col,
into,
regex = "([[:alnum:]]+)",
remove = TRUE,
convert = FALSE,
...
)
A data frame (data.frame, data.table or tibble's tbl_df)
Arguments that depend on the context of the function and that are, most of the time, not evaluated in a standard way (cf. the tidyverse approach).
A function to use.
The list of columns to which the transformation is applied. For
the moment, only all existing columns, that is, .cols = everything(),
is implemented.
Which columns to keep. The default is "all"; other possible values
are "used", "unused", or "none" (see dplyr::mutate()).
A data frame (data.frame, data.table or tibble's tbl_df).
A second data frame.
A list of names of the columns to use for joining the two data frames.
The suffix to the column names to use to differentiate the
columns that come from the first or the second data frame. By default it is
c(".x", ".y")
.
This argument is there for compatibility with the "t" matching functions, but it is not used here.
The name of the column that identifies the origin of each row: either the names, if all other arguments are named, or numbers.
Frequency weights. Can be NULL
or a variable. Use data masking.
If TRUE, the largest groups are shown on top.
The name of the new column in the output (n by default; no
existing column may have this name, or an error is generated).
Are levels with no observations dropped (TRUE by default)?
Are levels sorted (TRUE by default)?
Is sorting done in decreasing order (FALSE
by default)?
How should the name be "repaired" to avoid duplicate
column names? See dplyr::bind_cols()
for more details.
Logical. If TRUE, rows are first arranged by the grouping
variables, if any. FALSE by default.
A variable specified as a name, a positive integer, or a negative integer
(counting from the end). The default is -1 and returns the last variable.
If TRUE
keep all variables in .data
.
A data frame, or for replace_na()
a vector or a data frame.
If data
is a vector, a unique value to replace NA
s,
otherwise, a list of values, one per column of the data frame.
A selection of the columns using tidy-select syntax,
see tidyr::pivot_longer().
A character vector with the name or names of the columns for the names.
A string with the name of the column that receives the values.
The column or columns containing the names (use tidy selection and do not quote the names).
Idem for the column or columns that contain the values.
A vector of weights to use to "uncount" data.
If TRUE
, and weights
is the name of a column, that column
is removed from data
.
The name, quoted or not, of the new column with the united variables.
Separator to use between values for united or separated columns.
If TRUE, the initial columns that are separated are also
removed from data.
If TRUE
, NA
s are eliminated before uniting the values.
Names of the new columns that receive the separated variables. Use NA for
items to drop.
If TRUE, resulting values are converted into numeric,
integer or logical.
Direction in which to fill missing data: "down"
(by
default), "up"
, or "downup"
(first down, then up), "updown"
(the opposite).
A regular expression used to extract the desired values (use one
group with (
and )
for each element of into
).
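A few of the arguments described above can be illustrated with their tidy counterparts, since the "s" functions are meant to be used in the same way. This is only a sketch: it assumes that {dplyr} and {tidyr} are installed, it calls the tidy functions rather than the deprecated "s" versions, and the data frame `df` is made up for the example.

```r
if (requireNamespace("dplyr", quietly = TRUE) &&
    requireNamespace("tidyr", quietly = TRUE)) {
  # Toy data frame (hypothetical, for illustration only).
  df <- data.frame(x = c(NA, 1, NA, 2, NA), y = 1:5, z = letters[1:5])

  # .keep: which columns remain after creating `w` from `y`.
  keep_used   <- names(dplyr::mutate(df, w = y * 2, .keep = "used"))
  keep_unused <- names(dplyr::mutate(df, w = y * 2, .keep = "unused"))
  keep_none   <- names(dplyr::mutate(df, w = y * 2, .keep = "none"))
  print(keep_used)    # columns used in the expressions, plus the new one
  print(keep_unused)  # columns not used in the expressions, plus the new one
  print(keep_none)    # only the new column

  # var = -1: a negative integer counts from the end, so the last
  # column (`z`) is extracted as a vector.
  last_col <- dplyr::pull(df, var = -1)
  print(last_col)

  # .direction = "downup": fill missing values downwards first, then
  # upwards for what is still missing at the top.
  filled <- tidyr::fill(df, x, .direction = "downup")$x
  print(filled)
}
```

With `.direction = "updown"`, the order of the two passes is reversed, which changes the result whenever a missing value sits between two different non-missing values.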
See corresponding "non-s" function for the full help page with indication of the return values.
Unlike dplyr::summarise(), the ssummarise() function does not support n().
You can use fn() instead, but then you must give a
variable name as argument. The fn() alternative can also be used in
dplyr::summarise() for a homogeneous syntax between the two.
From {dplyr}, the dplyr::slice()
and slice_xxx()
functions are not
added yet because they are not available for {dbplyr}. Also
dplyr::anti_join()
, dplyr::semi_join()
and dplyr::nest_join()
are not
implemented yet. From {tidyr} tidyr::expand()
, tidyr::chop()
,
tidyr::unchop()
, tidyr::nest()
, tidyr::unnest()
,
tidyr::unnest_longer()
, tidyr::unnest_wider()
, tidyr::hoist()
,
tidyr::pack()
and tidyr::unpack()
are not implemented yet.
# TODO...