Read and return an R object from data on disk, from a URL, or from packages.
Usage
read(
file,
type = NULL,
header = "#",
header.max = 50L,
skip = 0L,
locale = default_locale(),
lang = getOption("data.io_lang", "en"),
lang_encoding = "UTF-8",
as_dataframe = FALSE,
as_labelled = FALSE,
comments = NULL,
package = NULL,
sidecar_file = TRUE,
fun_list = NULL,
hfun = NULL,
fun = NULL,
data,
cache_file = NULL,
method = "auto",
quiet = FALSE,
force = FALSE,
...
)
type_from_extension(file, full = FALSE)
hread_text(file, header.max, skip = 0L, locale = default_locale(), ...)
hread_xls(file, header.max, skip = 0L, locale = default_locale(), ...)
hread_xlsx(file, header.max, skip = 0L, locale = default_locale(), ...)
# S3 method for subsettable_type
$(x, name)
# S3 method for read_function_subset
.DollarNames(x, pattern = "")
Arguments
- file
The path to the file to read, or the name of the dataset to get from an R package (in that case, you must provide the package= argument).
- type
The type (format) of data to read.
- header
The character to use for the header and other comments.
- header.max
The maximum number of lines to consider for the header.
- skip
The number of lines to skip at the beginning of the file.
- locale
A readr locale object with all the information required to correctly interpret country-related items. The default value matches R defaults (US English with UTF-8 encoding), and it is advisable to use it whenever possible.
- lang
The language to use (mainly for comments, labels and units), but also for factor levels or other character strings if a translation exists and if the language is spelled in uppercase characters (e.g., "FR"). The default value can be set with, e.g., options(data.io_lang = "fr") for French.
- lang_encoding
Encoding used by R scripts for translation. They should all be encoded as UTF-8, which is the default. However, this argument allows you to specify a different encoding if needed.
- as_dataframe
Deprecated: now use options(SciViews.as_dtx = as_XXX) to specify if you want a data.frame (as_dtf), a data.table (as_dtt, by default), or a tibble (as_dtbl). Formerly: should the resulting object be converted into a dataframe (inheriting from data.frame, tbl and tbl_df, alias tibble)? If FALSE, no conversion is attempted. Note that, as part of the deprecation, this argument is now always treated as FALSE, whatever value you indicate.
- as_labelled
Are variables converted into 'labelled' objects? This keeps labels and units when the vector is manipulated, but it can lead to incompatibilities with some R code (hence, it is FALSE by default).
- comments
Comments to add in the created object.
- package
The package where to look for the dataset. If file= is not provided, a list of available datasets in the package is displayed.
- sidecar_file
If TRUE and a file with the same name as file= + .R is found in the same directory, it is considered as code to import these data and it is sourced with local = TRUE, chdir = TRUE and verbose = FALSE. That script must create an object named dataset, which is the result that is returned by the function. It is advised to encode this script in UTF-8, which is the default value, but it is possible to specify a different encoding through the lang_encoding= parameter.
- fun_list
The table giving the correspondence between the types and their read and write functions.
- hfun
The function to read the header (lines starting with a special mark, usually '#', at the beginning of the file). This function must have the same arguments as hread_text() and should return a character string with the first header.max lines.
- fun
The function to delegate reading of the data. If NULL (default), the function is chosen from fun_list.
- data
A synonym to file= (the name makes more sense when the dataset is loaded from a package). You cannot use data= and file= at the same time.
- cache_file
The path to a local file to use as a cache when the file is downloaded (http://, https://, ftp://, or file:// protocols). If cache_file already exists, data are read from this cache, except if force = TRUE (see below). Otherwise, data are saved in it before being used. If cache_file = NULL (the default), a temporary file is used and data are read from the Internet every time. This cache mechanism is particularly useful to provide data associated with a git repository. Put cache_file in .gitignore and use cache_file= in the code (and force = FALSE). That way, the data are downloaded once in a freshly cloned repository, and they are not included in the versioning system (useful for large datasets).
- method
The downloading method used ("auto" by default), see utils::download.file().
- quiet
In case we have to download files, do it silently (TRUE) or provide feedback and a progress bar (FALSE, by default)?
- force
If TRUE and a URL is provided for file= and a path for cache_file=, then the content is downloaded every time, even if the cache file already exists (it is overwritten). By default, it is FALSE, which is the most useful setting to make good use of the cache mechanism.
- ...
Further arguments passed to the function fun=.
- full
Do we return the full extension, like csv.tar.gz (TRUE), or only the main extension, like csv (FALSE, by default)?
- x
A subsettable_type function.
- name
The value to use for the type= argument.
- pattern
A regular expression to list matching names.
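The caching rules described for cache_file= and force= can be sketched in base R as follows. This is only an illustration under stated assumptions, not the data.io implementation: read_cached(), its download= and reader= parameters, fake_download() and the example.org URL are hypothetical names introduced here so that the logic can be run without network access.

```r
# Hypothetical helper, NOT part of data.io: it only mirrors the documented
# behaviour of cache_file= and force=.
read_cached <- function(url, cache_file = NULL, force = FALSE,
                        download = utils::download.file, reader = readLines) {
  if (is.null(cache_file)) {
    # No cache: use a temporary file and download every time
    target <- tempfile()
    on.exit(unlink(target))
    download(url, target)
  } else {
    target <- cache_file
    # Download only when the cache is missing, or when force = TRUE
    if (force || !file.exists(target)) download(url, target)
  }
  reader(target)
}

# Fake downloader (assumption): counts calls and writes a tiny CSV file
n_downloads <- 0L
fake_download <- function(url, destfile) {
  n_downloads <<- n_downloads + 1L
  writeLines(c("x,y", "1,2"), destfile)
  invisible(0L)
}

cache <- tempfile(fileext = ".csv")
url <- "https://example.org/data.csv"  # placeholder URL, never contacted
first  <- read_cached(url, cache, download = fake_download)  # downloads
second <- read_cached(url, cache, download = fake_download)  # uses the cache
third  <- read_cached(url, cache, force = TRUE,
                      download = fake_download)              # downloads again
```

With a real URL, the equivalent call would presumably look like read(url, cache_file = "data/local_copy.csv"), with the cache path listed in .gitignore as suggested above.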
Details
read() provides a single entry point to read various kinds of data, but it delegates the actual work to various other functions dispatched across several R packages. See getOption("read_write").
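The delegation idea can be illustrated with a small base R sketch: the type is guessed from the file extension, then looked up in a table of reader functions. Everything here is an assumption made for illustration (ext_sketch(), readers and read_sketch() are hypothetical names, and the real table returned by getOption("read_write") may be organized differently). The extension logic only mimics what the full= argument of type_from_extension() is documented to return, and it assumes that the base file name contains no dots other than those of the extension.

```r
# Hypothetical sketch, NOT data.io code: extract the extension in the way
# the full= argument of type_from_extension() is described.
ext_sketch <- function(file, full = FALSE) {
  name <- basename(file)
  if (!grepl(".", name, fixed = TRUE)) return("")
  # Everything after the first dot, e.g. "csv.tar.gz"
  ext <- sub("^[^.]*\\.", "", name)
  # Keep only the part before the next dot when full = FALSE, e.g. "csv"
  if (full) ext else sub("\\..*$", "", ext)
}

# Hypothetical correspondence between types and base R reader functions
readers <- list(
  csv = function(file, ...) utils::read.csv(file, ...),
  tsv = function(file, ...) utils::read.delim(file, ...),
  rds = function(file, ...) readRDS(file, ...)
)

# Single entry point that delegates the work to the matching reader
read_sketch <- function(file, type = NULL, ...) {
  if (is.null(type)) type <- ext_sketch(file)
  fun <- readers[[type]]
  if (is.null(fun)) stop("no reader registered for type '", type, "'")
  fun(file, ...)
}

# Usage: write a small CSV file, then read it back through the dispatcher
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3, y = c("a", "b", "c")), tmp, row.names = FALSE)
df <- read_sketch(tmp)
```

Passing type= explicitly skips the guess from the extension, which is the role played by the read$csv(...) form when the file name carries no usable extension.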
Author
Philippe Grosjean phgrosjean@sciviews.org
Examples
# Use read() as a more flexible substitute for data() (the dataset name can
# be changed, and the syntax is the same for R datasets and for files)
read() # List all available datasets in your installed version of R
# List datasets in one particular package
read(package = "data.io")
# Read one dataset from this package, possibly changing its name
(urchin <- read("urchin_bio", package = "data.io"))
#> origin diameter1 diameter2 height buoyant_weight weight solid_parts
#> <fctr> <num> <num> <num> <num> <num> <num>
#> 1: Fishery 9.9 10.2 5.0 NA 0.5215 0.4777
#> 2: Fishery 10.5 10.6 5.7 NA 0.6418 0.5891
#> 3: Fishery 10.8 10.8 5.2 NA 0.7336 0.6770
#> 4: Fishery 9.6 9.3 4.6 NA 0.3697 0.3438
#> 5: Fishery 10.4 10.7 4.8 NA 0.6097 0.5587
#> ---
#> 417: Farm 16.7 17.2 8.5 0.5674 2.4300 2.2900
#> 418: Farm 16.5 16.5 7.9 0.5472 2.3200 2.1800
#> 419: Farm 16.8 16.7 8.2 0.4864 2.2200 2.1300
#> 420: Farm 17.3 17.2 8.5 0.4864 2.5200 2.3400
#> 421: Farm 17.0 16.6 7.9 0.4357 2.0500 1.9800
#> integuments dry_integuments digestive_tract dry_digestive_tract gonads
#> <num> <num> <num> <num> <num>
#> 1: 0.3658 NA 0.0525 0.0079 0.0000
#> 2: 0.4447 NA 0.0482 0.0090 0.0000
#> 3: 0.5326 NA 0.0758 0.0134 0.0000
#> 4: 0.2661 NA 0.0442 0.0064 0.0000
#> 5: 0.4058 NA 0.0743 0.0117 0.0000
#> ---
#> 417: 1.8400 1.02 0.1661 0.0229 0.0215
#> 418: 1.8000 1.01 0.0977 0.0147 0.0253
#> 419: 1.6300 0.88 0.1704 0.0208 0.0154
#> 420: 1.7200 0.89 0.1444 0.0167 0.0237
#> 421: 1.4300 0.83 0.1462 0.0212 0.0266
#> dry_gonads skeleton lantern test spines maturity sex
#> <num> <num> <num> <num> <num> <int> <fctr>
#> 1: 0.0000 0.1793 0.0211 0.0587 0.0995 0 <NA>
#> 2: 0.0000 0.1880 0.0205 0.0622 0.1053 0 <NA>
#> 3: 0.0000 0.2354 0.0254 0.0836 0.1263 0 <NA>
#> 4: 0.0000 0.0630 0.0167 0.0180 0.0283 0 <NA>
#> 5: 0.0000 NA NA NA NA 0 <NA>
#> ---
#> 417: 0.0034 0.9046 0.0750 0.3399 0.4896 0 <NA>
#> 418: 0.0051 0.8965 0.0908 0.3189 0.4868 0 <NA>
#> 419: 0.0020 0.7714 0.0877 0.2961 0.3876 0 <NA>
#> 420: 0.0032 0.7938 0.0772 0.3077 0.4090 0 <NA>
#> 421: 0.0051 0.7421 0.0723 0.2689 0.4009 0 <NA>
# Same, but using labels in French
(urchin <- read("urchin_bio", package = "data.io", lang = "fr"))
#> origin diameter1 diameter2 height buoyant_weight weight solid_parts
#> <fctr> <num> <num> <num> <num> <num> <num>
#> 1: Fishery 9.9 10.2 5.0 NA 0.5215 0.4777
#> 2: Fishery 10.5 10.6 5.7 NA 0.6418 0.5891
#> 3: Fishery 10.8 10.8 5.2 NA 0.7336 0.6770
#> 4: Fishery 9.6 9.3 4.6 NA 0.3697 0.3438
#> 5: Fishery 10.4 10.7 4.8 NA 0.6097 0.5587
#> ---
#> 417: Farm 16.7 17.2 8.5 0.5674 2.4300 2.2900
#> 418: Farm 16.5 16.5 7.9 0.5472 2.3200 2.1800
#> 419: Farm 16.8 16.7 8.2 0.4864 2.2200 2.1300
#> 420: Farm 17.3 17.2 8.5 0.4864 2.5200 2.3400
#> 421: Farm 17.0 16.6 7.9 0.4357 2.0500 1.9800
#> integuments dry_integuments digestive_tract dry_digestive_tract gonads
#> <num> <num> <num> <num> <num>
#> 1: 0.3658 NA 0.0525 0.0079 0.0000
#> 2: 0.4447 NA 0.0482 0.0090 0.0000
#> 3: 0.5326 NA 0.0758 0.0134 0.0000
#> 4: 0.2661 NA 0.0442 0.0064 0.0000
#> 5: 0.4058 NA 0.0743 0.0117 0.0000
#> ---
#> 417: 1.8400 1.02 0.1661 0.0229 0.0215
#> 418: 1.8000 1.01 0.0977 0.0147 0.0253
#> 419: 1.6300 0.88 0.1704 0.0208 0.0154
#> 420: 1.7200 0.89 0.1444 0.0167 0.0237
#> 421: 1.4300 0.83 0.1462 0.0212 0.0266
#> dry_gonads skeleton lantern test spines maturity sex
#> <num> <num> <num> <num> <num> <int> <fctr>
#> 1: 0.0000 0.1793 0.0211 0.0587 0.0995 0 <NA>
#> 2: 0.0000 0.1880 0.0205 0.0622 0.1053 0 <NA>
#> 3: 0.0000 0.2354 0.0254 0.0836 0.1263 0 <NA>
#> 4: 0.0000 0.0630 0.0167 0.0180 0.0283 0 <NA>
#> 5: 0.0000 NA NA NA NA 0 <NA>
#> ---
#> 417: 0.0034 0.9046 0.0750 0.3399 0.4896 0 <NA>
#> 418: 0.0051 0.8965 0.0908 0.3189 0.4868 0 <NA>
#> 419: 0.0020 0.7714 0.0877 0.2961 0.3876 0 <NA>
#> 420: 0.0032 0.7938 0.0772 0.3077 0.4090 0 <NA>
#> 421: 0.0051 0.7421 0.0723 0.2689 0.4009 0 <NA>
# ... and also the levels of factors in French (note: uppercase FR)
(urchin <- read("urchin_bio", package = "data.io", lang = "FR"))
#> origin diameter1 diameter2 height buoyant_weight weight solid_parts
#> <fctr> <num> <num> <num> <num> <num> <num>
#> 1: Pêcherie 9.9 10.2 5.0 NA 0.5215 0.4777
#> 2: Pêcherie 10.5 10.6 5.7 NA 0.6418 0.5891
#> 3: Pêcherie 10.8 10.8 5.2 NA 0.7336 0.6770
#> 4: Pêcherie 9.6 9.3 4.6 NA 0.3697 0.3438
#> 5: Pêcherie 10.4 10.7 4.8 NA 0.6097 0.5587
#> ---
#> 417: Culture 16.7 17.2 8.5 0.5674 2.4300 2.2900
#> 418: Culture 16.5 16.5 7.9 0.5472 2.3200 2.1800
#> 419: Culture 16.8 16.7 8.2 0.4864 2.2200 2.1300
#> 420: Culture 17.3 17.2 8.5 0.4864 2.5200 2.3400
#> 421: Culture 17.0 16.6 7.9 0.4357 2.0500 1.9800
#> integuments dry_integuments digestive_tract dry_digestive_tract gonads
#> <num> <num> <num> <num> <num>
#> 1: 0.3658 NA 0.0525 0.0079 0.0000
#> 2: 0.4447 NA 0.0482 0.0090 0.0000
#> 3: 0.5326 NA 0.0758 0.0134 0.0000
#> 4: 0.2661 NA 0.0442 0.0064 0.0000
#> 5: 0.4058 NA 0.0743 0.0117 0.0000
#> ---
#> 417: 1.8400 1.02 0.1661 0.0229 0.0215
#> 418: 1.8000 1.01 0.0977 0.0147 0.0253
#> 419: 1.6300 0.88 0.1704 0.0208 0.0154
#> 420: 1.7200 0.89 0.1444 0.0167 0.0237
#> 421: 1.4300 0.83 0.1462 0.0212 0.0266
#> dry_gonads skeleton lantern test spines maturity sex
#> <num> <num> <num> <num> <num> <int> <fctr>
#> 1: 0.0000 0.1793 0.0211 0.0587 0.0995 0 <NA>
#> 2: 0.0000 0.1880 0.0205 0.0622 0.1053 0 <NA>
#> 3: 0.0000 0.2354 0.0254 0.0836 0.1263 0 <NA>
#> 4: 0.0000 0.0630 0.0167 0.0180 0.0283 0 <NA>
#> 5: 0.0000 NA NA NA NA 0 <NA>
#> ---
#> 417: 0.0034 0.9046 0.0750 0.3399 0.4896 0 <NA>
#> 418: 0.0051 0.8965 0.0908 0.3189 0.4868 0 <NA>
#> 419: 0.0020 0.7714 0.0877 0.2961 0.3876 0 <NA>
#> 420: 0.0032 0.7938 0.0772 0.3077 0.4090 0 <NA>
#> 421: 0.0051 0.7421 0.0723 0.2689 0.4009 0 <NA>
# Read one dataset from another package, but with labels and comments
data(iris) # The R way: you get the initial dataset
# Same result, using read()
ir2 <- read("iris", package = "datasets", lang = NULL)
# ir2 records that it comes from datasets::iris
attr(comment(ir2), "src")
#> [1] "datasets::iris"
# otherwise, it is identical to iris, except it may be a data.table or a
# tibble, depending on user preferences
comment(ir2) <- NULL
# Force coercion into a data.frame
ir2 <- svBase::as_dtf(ir2)
identical(iris, ir2)
#> [1] TRUE
# More interesting: you can get an enhanced version of iris with read():
# (note that variable names are in snake_case now!)
(ir3 <- read("iris", package = "datasets"))
#> sepal_length sepal_width petal_length petal_width species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
class(ir3)
#> [1] "data.table" "data.frame"
comment(ir3)
#> [1] "The 'iris' from 'datasets', but with variables names in snake_case"
#> [2] "(Sepal.Length -> sepal_length, Species -> species)."
#> attr(,"lang")
#> [1] "en"
#> attr(,"lang_encoding")
#> [1] "UTF-8"
#> attr(,"src")
#> [1] "datasets::iris"
ir3$sepal_length
#> [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
#> [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
#> [37] 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5
#> [55] 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1
#> [73] 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5
#> [91] 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3
#> [109] 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
#> [127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8
#> [145] 6.7 6.7 6.3 6.5 6.2 5.9
#> attr(,"label")
#> [1] "Length of the sepals"
#> attr(,"units")
#> [1] "cm"
# ... and you can get it in French too!
(ir_fr <- read("iris", package = "datasets", lang = "fr"))
#> sepal_length sepal_width petal_length petal_width species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
class(ir_fr)
#> [1] "data.table" "data.frame"
comment(ir_fr)
#> [1] "Jeu de données 'iris' de 'datasets', mais avec noms de variables modifiées"
#> [2] "(Sepal.Length -> sepal_length, Species -> species)."
#> attr(,"lang")
#> [1] "fr"
#> attr(,"lang_encoding")
#> [1] "UTF-8"
#> attr(,"src")
#> [1] "datasets::iris"
ir_fr$sepal_length
#> [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
#> [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
#> [37] 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5
#> [55] 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1
#> [73] 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5
#> [91] 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3
#> [109] 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
#> [127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8
#> [145] 6.7 6.7 6.3 6.5 6.2 5.9
#> attr(,"label")
#> [1] "Longueur des sépales"
#> attr(,"units")
#> [1] "cm"
# Sometimes, datasets are more deeply reworked. For instance, trees has
# variables in imperial units (in, ft, and cubic ft), but it is automatically
# reworked by read() into metric variables (m or m^3):
data(trees)
head(trees)
#> Girth Height Volume
#> 1 8.3 70 10.3
#> 2 8.6 65 10.3
#> 3 8.8 63 10.2
#> 4 10.5 72 16.4
#> 5 10.7 81 18.8
#> 6 10.8 83 19.7
(trees2 <- read("trees", package = "datasets"))
#> diameter height volume
#> <num> <num> <num>
#> 1: 0.211 21.3 0.292
#> 2: 0.218 19.8 0.292
#> 3: 0.224 19.2 0.289
#> 4: 0.267 21.9 0.464
#> 5: 0.272 24.7 0.532
#> 6: 0.274 25.3 0.558
#> 7: 0.279 20.1 0.442
#> 8: 0.279 22.9 0.515
#> 9: 0.282 24.4 0.640
#> 10: 0.284 22.9 0.563
#> 11: 0.287 24.1 0.685
#> 12: 0.290 23.2 0.595
#> 13: 0.290 23.2 0.606
#> 14: 0.297 21.0 0.603
#> 15: 0.305 22.9 0.541
#> 16: 0.328 22.6 0.629
#> 17: 0.328 25.9 0.957
#> 18: 0.338 26.2 0.776
#> 19: 0.348 21.6 0.728
#> 20: 0.351 19.5 0.705
#> 21: 0.356 23.8 0.977
#> 22: 0.361 24.4 0.898
#> 23: 0.368 22.6 1.028
#> 24: 0.406 21.9 1.085
#> 25: 0.414 23.5 1.206
#> 26: 0.439 24.7 1.569
#> 27: 0.444 25.0 1.577
#> 28: 0.455 24.4 1.651
#> 29: 0.457 24.4 1.458
#> 30: 0.457 24.4 1.444
#> 31: 0.523 26.5 2.180
#> diameter height volume
comment(trees2)
#> [1] "The 'trees' from 'datasets' but with variables renamed and in m or m^3"
#> [2] "(Girth [in] -> diameter [m], Height [ft] -> height [m],"
#> [3] "Volume [ft^3] -> volume [m^3])."
#> attr(,"lang")
#> [1] "en"
#> attr(,"lang_encoding")
#> [1] "UTF-8"
#> attr(,"src")
#> [1] "datasets::trees"
trees2$volume
#> [1] 0.292 0.292 0.289 0.464 0.532 0.558 0.442 0.515 0.640 0.563 0.685 0.595
#> [13] 0.606 0.603 0.541 0.629 0.957 0.776 0.728 0.705 0.977 0.898 1.028 1.085
#> [25] 1.206 1.569 1.577 1.651 1.458 1.444 2.180
#> attr(,"label")
#> [1] "Volume of timber"
#> attr(,"units")
#> [1] "m^3"
# \donttest{
# Read from a GitHub Gist (need to specify the type here!)
# (ble <- read$csv("http://tinyurl.com/Biostat-Ble"))
# Various versions of the famous iris dataset
(iris <- read(data_example("iris.csv")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <char>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris <- read(data_example("iris.csv.zip")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <char>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris <- read(data_example("iris.csv.gz")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <char>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris <- read(data_example("iris.csv.bz2")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <char>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris <- read(data_example("iris.tsv")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <char>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris <- read(data_example("iris.xls")))
#> New names:
#> • `` -> `...1`
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <char>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris <- read(data_example("iris.xlsx")))
#> New names:
#> • `` -> `...1`
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <char>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris <- read(data_example("iris.rds"))) # Does not transform into tibble!
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
#(iris <- read(data_example("iris.syd"))) ##
#(iris <- read(data_example("iris.csvy"))) ##
#(iris <- read(data_example("iris.csvy.zip"))) ##
# A file with a header both in English (default) and in French
(iris <- read(data_example("iris_short_header.csv")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <labelled> <labelled> <labelled> <labelled> <labelled>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris_fr <- read(data_example("iris_short_header.csv"), lang = "fr"))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <labelled> <labelled> <labelled> <labelled> <labelled>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
# Headers are also recognized in xls/xlsx files
(iris_fr <- read(data_example("iris_short_header.xls"), lang = "fr"))
#> New names:
#> • `` -> `...1`
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <labelled> <labelled> <labelled> <labelled> <labelled>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
# Read a file with a sidecar file (same name + '.R')
(iris <- read(data_example("iris_sidecar.csv"))) # lang = "en" by default
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Warning: number of items to replace is not a multiple of replacement length
#> sepal_length sepal_width petal_length petal_width species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris <- read(data_example("iris_sidecar.csv"), lang = "EN")) # Full lang
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Warning: number of items to replace is not a multiple of replacement length
#> sepal_length sepal_width petal_length petal_width species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 I. setosa
#> 2: 4.9 3.0 1.4 0.2 I. setosa
#> 3: 4.7 3.2 1.3 0.2 I. setosa
#> 4: 4.6 3.1 1.5 0.2 I. setosa
#> 5: 5.0 3.6 1.4 0.2 I. setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 I. virginica
#> 147: 6.3 2.5 5.0 1.9 I. virginica
#> 148: 6.5 3.0 5.2 2.0 I. virginica
#> 149: 6.2 3.4 5.4 2.3 I. virginica
#> 150: 5.9 3.0 5.1 1.8 I. virginica
(iris <- read(data_example("iris_sidecar.csv"), lang = "en_us")) # US (in)
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Warning: number of items to replace is not a multiple of replacement length
#> sepal_length sepal_width petal_length petal_width species
#> <num> <num> <num> <num> <fctr>
#> 1: 2.007874 1.377953 0.5511811 0.07874016 setosa
#> 2: 1.929134 1.181102 0.5511811 0.07874016 setosa
#> 3: 1.850394 1.259843 0.5118110 0.07874016 setosa
#> 4: 1.811024 1.220472 0.5905512 0.07874016 setosa
#> 5: 1.968504 1.417323 0.5511811 0.07874016 setosa
#> ---
#> 146: 2.637795 1.181102 2.0472441 0.90551181 virginica
#> 147: 2.480315 0.984252 1.9685039 0.74803150 virginica
#> 148: 2.559055 1.181102 2.0472441 0.78740157 virginica
#> 149: 2.440945 1.338583 2.1259843 0.90551181 virginica
#> 150: 2.322835 1.181102 2.0078740 0.70866142 virginica
(iris <- read(data_example("iris_sidecar.csv"), lang = "fr")) # French
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Warning: number of items to replace is not a multiple of replacement length
#> sepal_length sepal_width petal_length petal_width species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris <- read(data_example("iris_sidecar.csv"), lang = "FR_BE")) # Belgian
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Warning: number of items to replace is not a multiple of replacement length
#> sepal_length sepal_width petal_length petal_width species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 I. setosa
#> 2: 4.9 3.0 1.4 0.2 I. setosa
#> 3: 4.7 3.2 1.3 0.2 I. setosa
#> 4: 4.6 3.1 1.5 0.2 I. setosa
#> 5: 5.0 3.6 1.4 0.2 I. setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 I. virginica
#> 147: 6.3 2.5 5.0 1.9 I. virginica
#> 148: 6.5 3.0 5.2 2.0 I. virginica
#> 149: 6.2 3.4 5.4 2.3 I. virginica
#> 150: 5.9 3.0 5.1 1.8 I. virginica
(iris <- read(data_example("iris_sidecar.csv"), lang = NULL)) # No labels
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
# Require the feather package
#(iris <- read(data_example("iris.feather"))) # Not available for all Win
# Challenging datasets from the readr package
library(readr)
(mtcars <- read(readr_example("mtcars.csv")))
#> Rows: 32 Columns: 11
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (11): mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#> 2: 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#> 3: 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#> 4: 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
#> 5: 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#> 6: 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
#> 7: 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
#> 8: 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#> 9: 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
#> 10: 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
#> 11: 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
#> 12: 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
#> 13: 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
#> 14: 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
#> 15: 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
#> 16: 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
#> 17: 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
#> 18: 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> 19: 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> 20: 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> 21: 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
#> 22: 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
#> 23: 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
#> 24: 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
#> 25: 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
#> 26: 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> 27: 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
#> 28: 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> 29: 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
#> 30: 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
#> 31: 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
#> 32: 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
#> mpg cyl disp hp drat wt qsec vs am gear carb
(mtcars <- read(readr_example("mtcars.csv.zip")))
#> Rows: 32 Columns: 11
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (11): mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#> 2: 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#> 3: 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#> 4: 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
#> 5: 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#> 6: 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
#> 7: 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
#> 8: 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#> 9: 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
#> 10: 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
#> 11: 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
#> 12: 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
#> 13: 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
#> 14: 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
#> 15: 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
#> 16: 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
#> 17: 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
#> 18: 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> 19: 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> 20: 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> 21: 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
#> 22: 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
#> 23: 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
#> 24: 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
#> 25: 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
#> 26: 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> 27: 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
#> 28: 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> 29: 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
#> 30: 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
#> 31: 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
#> 32: 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
#> mpg cyl disp hp drat wt qsec vs am gear carb
(mtcars <- read(readr_example("mtcars.csv.bz2")))
#> Rows: 32 Columns: 11
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (11): mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#> 2: 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#> 3: 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#> 4: 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
#> 5: 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#> 6: 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
#> 7: 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
#> 8: 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#> 9: 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
#> 10: 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
#> 11: 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
#> 12: 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
#> 13: 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
#> 14: 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
#> 15: 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
#> 16: 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
#> 17: 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
#> 18: 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> 19: 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> 20: 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> 21: 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
#> 22: 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
#> 23: 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
#> 24: 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
#> 25: 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
#> 26: 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> 27: 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
#> 28: 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> 29: 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
#> 30: 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
#> 31: 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
#> 32: 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
#> mpg cyl disp hp drat wt qsec vs am gear carb
(challenge <- read(readr_example("challenge.csv"), guess_max = 1001))
#> Rows: 2000 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (1): x
#> date (1): y
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> x y
#> <num> <Date>
#> 1: 404.0000000 <NA>
#> 2: 4172.0000000 <NA>
#> 3: 3004.0000000 <NA>
#> 4: 787.0000000 <NA>
#> 5: 37.0000000 <NA>
#> ---
#> 1996: 0.1635163 2018-03-29
#> 1997: 0.4719390 2014-08-04
#> 1998: 0.7183186 2015-08-16
#> 1999: 0.2698786 2020-02-04
#> 2000: 0.6082372 2019-01-06
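# A minimal sketch of an alternative to raising guess_max: declare the column
# types explicitly. This calls readr directly; whether read() forwards
# col_types through ... for CSV files is an assumption not verified here.
# (challenge_typed is an illustrative name.)
challenge_typed <- readr::read_csv(
  readr::readr_example("challenge.csv"),
  col_types = readr::cols(x = readr::col_double(), y = readr::col_date())
)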
(massey <- read(readr_example("massey-rating.txt")))
#> [1] "UCC PAY LAZ KPK RT COF BIH DII ENG ACU Rank Team Conf\n 1 1 1 1 1 1 1 1 1 1 1 Ohio St B10 \n 2 2 2 2 2 2 2 2 4 2 2 Oregon P12 \n 3 4 3 4 3 4 3 4 2 3 3 Alabama SEC \n 4 3 4 3 4 3 5 3 3 4 4 TCU B12 \n 6 6 6 5 5 7 6 5 6 11 5 Michigan St B10 \n 7 7 7 6 7 6 11 8 7 8 6 Georgia SEC \n 5 5 5 7 6 8 4 6 5 5 7 Florida St ACC \n 8 8 9 9 10 5 7 7 10 7 8 Baylor B12 \n 9 11 8 13 11 11 12 9 14 9 9 Georgia Tech ACC \n 13 10 13 11 8 9 10 11 9 10 10 Mississippi SEC \n"
# The type cannot be guessed from the .txt extension, so by default the file
# is read as a single character string
# This is a space-separated values (ssv) file
(massey <- read(readr_example("massey-rating.txt"), type = "ssv"))
#>
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#> UCC = col_double(),
#> PAY = col_double(),
#> LAZ = col_double(),
#> KPK = col_double(),
#> RT = col_double(),
#> COF = col_double(),
#> BIH = col_double(),
#> DII = col_double(),
#> ENG = col_double(),
#> ACU = col_double(),
#> Rank = col_double(),
#> Team = col_character(),
#> Conf = col_character()
#> )
#> Warning: 10 parsing failures.
#> row col expected actual file
#> 1 -- 13 columns 15 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> 2 -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> 3 -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> 4 -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> 5 -- 13 columns 15 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> ... ... .......... .......... .................................................................
#> See problems(...) for more details.
#> UCC PAY LAZ KPK RT COF BIH DII ENG ACU Rank
#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 1 1 1 1 1 1 1 1 1 1 1
#> 2: 2 2 2 2 2 2 2 2 4 2 2
#> 3: 3 4 3 4 3 4 3 4 2 3 3
#> 4: 4 3 4 3 4 3 5 3 3 4 4
#> 5: 6 6 6 5 5 7 6 5 6 11 5
#> 6: 7 7 7 6 7 6 11 8 7 8 6
#> 7: 5 5 5 7 6 8 4 6 5 5 7
#> 8: 8 8 9 9 10 5 7 7 10 7 8
#> 9: 9 11 8 13 11 11 12 9 14 9 9
#> 10: 13 10 13 11 8 9 10 11 9 10 10
#> Team Conf
#> <char> <char>
#> 1: Ohio St
#> 2: Oregon P12
#> 3: Alabama SEC
#> 4: TCU B12
#> 5: Michigan St
#> 6: Georgia SEC
#> 7: Florida St
#> 8: Baylor B12
#> 9: Georgia Tech
#> 10: Mississippi SEC
# Or, equivalently, with the read$ssv() shortcut
(massey <- read$ssv(readr_example("massey-rating.txt")))
#>
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#> UCC = col_double(),
#> PAY = col_double(),
#> LAZ = col_double(),
#> KPK = col_double(),
#> RT = col_double(),
#> COF = col_double(),
#> BIH = col_double(),
#> DII = col_double(),
#> ENG = col_double(),
#> ACU = col_double(),
#> Rank = col_double(),
#> Team = col_character(),
#> Conf = col_character()
#> )
#> Warning: 10 parsing failures.
#> row col expected actual file
#> 1 -- 13 columns 15 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> 2 -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> 3 -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> 4 -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> 5 -- 13 columns 15 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> ... ... .......... .......... .................................................................
#> See problems(...) for more details.
#> UCC PAY LAZ KPK RT COF BIH DII ENG ACU Rank
#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 1 1 1 1 1 1 1 1 1 1 1
#> 2: 2 2 2 2 2 2 2 2 4 2 2
#> 3: 3 4 3 4 3 4 3 4 2 3 3
#> 4: 4 3 4 3 4 3 5 3 3 4 4
#> 5: 6 6 6 5 5 7 6 5 6 11 5
#> 6: 7 7 7 6 7 6 11 8 7 8 6
#> 7: 5 5 5 7 6 8 4 6 5 5 7
#> 8: 8 8 9 9 10 5 7 7 10 7 8
#> 9: 9 11 8 13 11 11 12 9 14 9 9
#> 10: 13 10 13 11 8 9 10 11 9 10 10
#> Team Conf
#> <char> <char>
#> 1: Ohio St
#> 2: Oregon P12
#> 3: Alabama SEC
#> 4: TCU B12
#> 5: Michigan St
#> 6: Georgia SEC
#> 7: Florida St
#> 8: Baylor B12
#> 9: Georgia Tech
#> 10: Mississippi SEC
(epa <- read$ssv(readr_example("epa78.txt"), col_names = FALSE))
#>
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#> X1 = col_character(),
#> X2 = col_character(),
#> X3 = col_character(),
#> X4 = col_character(),
#> X5 = col_double()
#> )
#> Warning: 17 parsing failures.
#> row col expected actual file
#> 2 -- 5 columns 10 columns '/home/runner/work/_temp/Library/readr/extdata/epa78.txt'
#> 3 -- 5 columns 6 columns '/home/runner/work/_temp/Library/readr/extdata/epa78.txt'
#> 4 -- 5 columns 3 columns '/home/runner/work/_temp/Library/readr/extdata/epa78.txt'
#> 5 -- 5 columns 8 columns '/home/runner/work/_temp/Library/readr/extdata/epa78.txt'
#> 6 -- 5 columns 8 columns '/home/runner/work/_temp/Library/readr/extdata/epa78.txt'
#> ... ... ......... .......... .........................................................
#> See problems(...) for more details.
#> X1 X2 X3 X4 X5
#> <char> <char> <char> <char> <num>
#> 1: ALFA ROMEO ALFA ROMEO 78010003
#> 2: ALFETTA 03 81 8 74
#> 3: SPIDER 2000 01 SPIDER 2000
#> 4: AMC AMC 78020002 <NA> NA
#> 5: GREMLIN 03 79 9 79
#> 6: PACER 04 89 11 89
#> 7: PACER WAGON 07 90 26
#> 8: CONCORD 04 88 12 90
#> 9: CONCORD WAGON 07 91 30
#> 10: MATADOR COUPE 05 97 14
#> 11: MATADOR SEDAN 06 110 20
#> 12: MATADOR WAGON 09 112 50
#> 13: ASTON MARTIN ASTON MARTIN 78040002
#> 14: ASTON MARTIN ASTON MARTIN 78040053
#> 15: AUDI AUDI 78050002 <NA> NA
#> 16: FOX 03 84 11 84
#> 17: FOX WAGON 07 83 40
#> 18: 5000 04 90 15 90
#> 19: AVANTI AVANTI 78065002 <NA> NA
#> 20: AVANTI II 02 75 8
#> X1 X2 X3 X4 X5
(example_log <- read(readr_example("example.log")))
#>
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#> X1 = col_character(),
#> X2 = col_logical(),
#> X3 = col_character(),
#> X4 = col_character(),
#> X5 = col_character(),
#> X6 = col_double(),
#> X7 = col_double()
#> )
#> X1 X2 X3 X4
#> <char> <lgcl> <char> <char>
#> 1: 172.21.13.45 NA Microsoft\\JohnDoe 08/Apr/2001:17:39:04 -0800
#> 2: 127.0.0.1 NA frank 10/Oct/2000:13:55:36 -0700
#> X5 X6 X7
#> <char> <num> <num>
#> 1: GET /scripts/iisadmin/ism.dll?http/serv HTTP/1.0 200 3401
#> 2: GET /apache_pb.gif HTTP/1.0 200 2326
# There are different ways to specify columns for fixed-width files (fwf)
# See ?read_fwf in package readr
(fwf_sample <- read$fwf(readr_example("fwf-sample.txt"),
col_positions = fwf_cols(name = 20, state = 10, ssn = 12)))
#> Rows: 3 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#>
#> chr (3): name, state, ssn
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> name state ssn
#> <char> <char> <char>
#> 1: John Smith WA 418-Y11-4111
#> 2: Mary Hartford CA 319-Z19-4341
#> 3: Evan Nolan IL 219-532-c301
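# Two of the other column specifications documented in ?read_fwf, sketched
# here with readr called directly (object names are illustrative only)
# Widths given as a vector, with the names supplied separately
fwf_by_width <- readr::read_fwf(
  readr::readr_example("fwf-sample.txt"),
  col_positions = readr::fwf_widths(c(20, 10, 12), c("name", "state", "ssn"))
)
# Only the wanted fields, given by their start and end positions
fwf_by_position <- readr::read_fwf(
  readr::readr_example("fwf-sample.txt"),
  col_positions = readr::fwf_positions(c(1, 30), c(20, 42), c("name", "ssn"))
)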
# Various examples of Excel datasets from readxl
library(readxl)
(xl <- read(readxl_example("datasets.xls")))
#> New names:
#> • `` -> `...1`
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <char>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(xl <- read(readxl_example("datasets.xlsx"), sheet = "mtcars"))
#> New names:
#> • `` -> `...1`
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#> 2: 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#> 3: 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#> 4: 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
#> 5: 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#> 6: 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
#> 7: 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
#> 8: 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#> 9: 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
#> 10: 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
#> 11: 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
#> 12: 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
#> 13: 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
#> 14: 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
#> 15: 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
#> 16: 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
#> 17: 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
#> 18: 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> 19: 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> 20: 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> 21: 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
#> 22: 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
#> 23: 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
#> 24: 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
#> 25: 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
#> 26: 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> 27: 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
#> 28: 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> 29: 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
#> 30: 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
#> 31: 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
#> 32: 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
#> mpg cyl disp hp drat wt qsec vs am gear carb
(xl <- read(readxl_example("datasets.xlsx"), sheet = 3))
#> New names:
#> • `` -> `...1`
#> weight feed
#> <num> <char>
#> 1: 179 horsebean
#> 2: 160 horsebean
#> 3: 136 horsebean
#> 4: 227 horsebean
#> 5: 217 horsebean
#> 6: 168 horsebean
#> 7: 108 horsebean
#> 8: 124 horsebean
#> 9: 143 horsebean
#> 10: 140 horsebean
#> 11: 309 linseed
#> 12: 229 linseed
#> 13: 181 linseed
#> 14: 141 linseed
#> 15: 260 linseed
#> 16: 203 linseed
#> 17: 148 linseed
#> 18: 169 linseed
#> 19: 213 linseed
#> 20: 257 linseed
#> 21: 244 linseed
#> 22: 271 linseed
#> 23: 243 soybean
#> 24: 230 soybean
#> 25: 248 soybean
#> 26: 327 soybean
#> 27: 329 soybean
#> 28: 250 soybean
#> 29: 193 soybean
#> 30: 271 soybean
#> 31: 316 soybean
#> 32: 267 soybean
#> 33: 199 soybean
#> 34: 171 soybean
#> 35: 158 soybean
#> 36: 248 soybean
#> 37: 423 sunflower
#> 38: 340 sunflower
#> 39: 392 sunflower
#> 40: 339 sunflower
#> 41: 341 sunflower
#> 42: 226 sunflower
#> 43: 320 sunflower
#> 44: 295 sunflower
#> 45: 334 sunflower
#> 46: 322 sunflower
#> 47: 297 sunflower
#> 48: 318 sunflower
#> 49: 325 meatmeal
#> 50: 257 meatmeal
#> 51: 303 meatmeal
#> 52: 315 meatmeal
#> 53: 380 meatmeal
#> 54: 153 meatmeal
#> 55: 263 meatmeal
#> 56: 242 meatmeal
#> 57: 206 meatmeal
#> 58: 344 meatmeal
#> 59: 258 meatmeal
#> 60: 368 casein
#> 61: 390 casein
#> 62: 379 casein
#> 63: 260 casein
#> 64: 404 casein
#> 65: 318 casein
#> 66: 352 casein
#> 67: 359 casein
#> 68: 216 casein
#> 69: 222 casein
#> 70: 283 casein
#> 71: 332 casein
#> weight feed
# Accommodate a column with disparate types via col_types = "list"
(clip <- read(readxl_example("clippy.xls"), col_types = c("text", "list")))
#> New names:
#> • `` -> `...1`
#> name value
#> <char> <list>
#> 1: Name Clippy
#> 2: Species paperclip
#> 3: Approx date of death 2007-01-01
#> 4: Weight in grams 0.9
(clip <- read(readxl_example("clippy.xlsx"), col_types = c("text", "list")))
#> New names:
#> • `` -> `...1`
#> name value
#> <char> <list>
#> 1: Name Clippy
#> 2: Species paperclip
#> 3: Approx date of death 2007-01-01
#> 4: Weight in grams 0.9
tibble::deframe(clip)
#> $Name
#> [1] "Clippy"
#>
#> $Species
#> [1] "paperclip"
#>
#> $`Approx date of death`
#> [1] "2007-01-01 UTC"
#>
#> $`Weight in grams`
#> [1] 0.9
#>
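# The list column keeps each cell's own type. A quick check, sketched with
# readxl called directly (clip_raw is an illustrative name):
clip_raw <- readxl::read_excel(
  readxl::readxl_example("clippy.xlsx"),
  col_types = c("text", "list")
)
# First class of each element of the list column
vapply(clip_raw$value, function(v) class(v)[1], character(1))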
# Read from a specific range in a sheet
(xl <- read(readxl_example("datasets.xlsx"), range = "mtcars!B1:D5"))
#> New names:
#> • `` -> `...1`
#> cyl disp hp
#> <num> <num> <num>
#> 1: 6 160 110
#> 2: 6 160 110
#> 3: 4 108 93
#> 4: 6 258 110
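# A range can also be limited to whole columns. A sketch with readxl called
# directly, using its cell_cols() helper (xl_cols is an illustrative name):
xl_cols <- readxl::read_excel(
  readxl::readxl_example("datasets.xlsx"),
  sheet = "mtcars",
  range = readxl::cell_cols("B:D")
)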
(deaths <- read(readxl_example("deaths.xls"), range = cell_rows(5:15)))
#> New names:
#> • `` -> `...1`
#> Name Profession Age Has kids Date of birth Date of death
#> <char> <char> <num> <lgcl> <POSc> <POSc>
#> 1: David Bowie musician 69 TRUE 1947-01-08 2016-01-10
#> 2: Carrie Fisher actor 60 TRUE 1956-10-21 2016-12-27
#> 3: Chuck Berry musician 90 TRUE 1926-10-18 2017-03-18
#> 4: Bill Paxton actor 61 TRUE 1955-05-17 2017-02-25
#> 5: Prince musician 57 TRUE 1958-06-07 2016-04-21
#> 6: Alan Rickman actor 69 FALSE 1946-02-21 2016-01-14
#> 7: Florence Henderson actor 82 TRUE 1934-02-14 2016-11-24
#> 8: Harper Lee author 89 FALSE 1926-04-28 2016-02-19
#> 9: Zsa Zsa Gábor actor 99 TRUE 1917-02-06 2016-12-18
#> 10: George Michael musician 53 FALSE 1963-06-25 2016-12-25
(deaths <- read(readxl_example("deaths.xlsx"), range = cell_rows(5:15)))
#> New names:
#> • `` -> `...1`
#> Name Profession Age Has kids Date of birth Date of death
#> <char> <char> <num> <lgcl> <POSc> <POSc>
#> 1: David Bowie musician 69 TRUE 1947-01-08 2016-01-10
#> 2: Carrie Fisher actor 60 TRUE 1956-10-21 2016-12-27
#> 3: Chuck Berry musician 90 TRUE 1926-10-18 2017-03-18
#> 4: Bill Paxton actor 61 TRUE 1955-05-17 2017-02-25
#> 5: Prince musician 57 TRUE 1958-06-07 2016-04-21
#> 6: Alan Rickman actor 69 FALSE 1946-02-21 2016-01-14
#> 7: Florence Henderson actor 82 TRUE 1934-02-14 2016-11-24
#> 8: Harper Lee author 89 FALSE 1926-04-28 2016-02-19
#> 9: Zsa Zsa Gábor actor 99 TRUE 1917-02-06 2016-12-18
#> 10: George Michael musician 53 FALSE 1963-06-25 2016-12-25
(type_me <- read(readxl_example("type-me.xls"), sheet = "logical_coercion",
col_types = c("logical", "text")))
#> New names:
#> • `` -> `...1`
#> Warning: Expecting logical in A5 / R5C1: got a date
#> Warning: Expecting logical in A8 / R8C1: got 'cabbage'
#> maybe boolean? description
#> <lgcl> <char>
#> 1: NA empty
#> 2: FALSE 0 (numeric)
#> 3: TRUE 1 (numeric)
#> 4: NA datetime
#> 5: TRUE boolean true
#> 6: FALSE boolean false
#> 7: NA "cabbage"
#> 8: TRUE the string "true"
#> 9: FALSE the letter "F"
#> 10: FALSE "False" preceded by single quote
(type_me <- read(readxl_example("type-me.xlsx"), sheet = "numeric_coercion",
col_types = c("numeric", "text")))
#> New names:
#> • `` -> `...1`
#> Warning: Coercing boolean to numeric in A3 / R3C1
#> Warning: Coercing boolean to numeric in A4 / R4C1
#> Warning: Expecting numeric in A5 / R5C1: got a date
#> Warning: Coercing text to numeric in A6 / R6C1: '123456'
#> Warning: Expecting numeric in A8 / R8C1: got 'cabbage'
#> maybe numeric? explanation
#> <num> <char>
#> 1: NA empty
#> 2: 1 boolean true
#> 3: 0 boolean false
#> 4: 40534 datetime
#> 5: 123456 the string "123456"
#> 6: 123456 the number 123456
#> 7: NA "cabbage"
(type_me <- read(readxl_example("type-me.xls"), sheet = "date_coercion",
col_types = c("date", "text")))
#> New names:
#> • `` -> `...1`
#> Warning: Expecting date in A5 / R5C1: got boolean
#> Warning: Expecting date in A6 / R6C1: got 'cabbage'
#> Warning: Coercing numeric to date in A7 / R7C1
#> Warning: Coercing numeric to date in A8 / R8C1
#> maybe a datetime? explanation
#> <POSc> <char>
#> 1: <NA> empty
#> 2: 2016-05-23 00:00:00 date only format
#> 3: 2016-04-28 11:30:00 date and time format
#> 4: <NA> boolean true
#> 5: <NA> "cabbage"
#> 6: 1904-01-05 07:12:00 4.3 (numeric)
#> 7: 2012-01-02 00:00:00 another numeric
(type_me <- read(readxl_example("type-me.xlsx"), sheet = "text_coercion",
col_types = c("text", "text")))
#> New names:
#> • `` -> `...1`
#> text explanation
#> <char> <char>
#> 1: <NA> empty
#> 2: cabbage "cabbage"
#> 3: TRUE boolean true
#> 4: 1.3 numeric
#> 5: 41175 datetime
#> 6: 36436153 another numeric
(xl <- read(readxl_example("geometry.xls"), col_names = FALSE))
#> New names:
#> • `` -> `...1`
#> • `` -> `...2`
#> • `` -> `...3`
#> ...1 ...2 ...3
#> <char> <char> <char>
#> 1: B3 C3 D3
#> 2: B4 C4 D4
#> 3: B5 C5 D5
#> 4: B6 C6 D6
(xl <- read(readxl_example("geometry.xlsx"), range = cell_rows(4:8)))
#> B4 C4 D4
#> <char> <char> <char>
#> 1: B5 C5 D5
#> 2: B6 C6 D6
#> 3: <NA> <NA> <NA>
#> 4: <NA> <NA> <NA>
# Various examples from haven
library(haven)
haven_example <- function(path)
system.file("examples", path, package = "haven", mustWork = TRUE)
(iris2 <- read(haven_example("iris.dta"))) # Stata v. 8-14
#> sepallength sepalwidth petallength petalwidth species
#> <num> <num> <num> <num> <char>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virginica
#> 147: 6.3 2.5 5.0 1.9 virginica
#> 148: 6.5 3.0 5.2 2.0 virginica
#> 149: 6.2 3.4 5.4 2.3 virginica
#> 150: 5.9 3.0 5.1 1.8 virginica
(iris2 <- read(haven_example("iris.sav"))) # SPSS, TODO: labelled -> factor?
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <haven_labelled>
#> 1: 5.1 3.5 1.4 0.2 1
#> 2: 4.9 3.0 1.4 0.2 1
#> 3: 4.7 3.2 1.3 0.2 1
#> 4: 4.6 3.1 1.5 0.2 1
#> 5: 5.0 3.6 1.4 0.2 1
#> ---
#> 146: 6.7 3.0 5.2 2.3 3
#> 147: 6.3 2.5 5.0 1.9 3
#> 148: 6.5 3.0 5.2 2.0 3
#> 149: 6.2 3.4 5.4 2.3 3
#> 150: 5.9 3.0 5.1 1.8 3
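# One possible answer to the TODO above, as a sketch only: haven::as_factor()
# turns a labelled vector into a factor using its value labels. This calls
# haven directly and is not necessarily what read() does or will do.
# (iris_sav is an illustrative name.)
iris_sav <- haven::read_sav(
  system.file("examples", "iris.sav", package = "haven", mustWork = TRUE)
)
iris_sav$Species <- haven::as_factor(iris_sav$Species)
levels(iris_sav$Species)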
(pbc <- read(data_example("pbc.por"))) # SPSS, POR format
#> AGE ALB ALKPHOS ASCITES BILI CHOL EDEMA EDTRT HEPMEG TIME
#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 58.7652 2.60 1718.0 1 14.5 261 1 1.0 1 400
#> 2: 56.4463 4.14 7394.8 0 1.1 302 0 0.0 1 4500
#> 3: 70.0726 3.48 516.0 0 1.4 176 1 0.5 0 1012
#> 4: 54.7406 2.54 6121.8 0 1.8 244 1 0.5 1 1925
#> 5: 38.1054 3.53 671.0 0 3.4 279 0 0.0 1 1504
#> ---
#> 414: 67.0000 2.96 -9.0 -9 1.2 -9 0 0.0 -9 681
#> 415: 39.0000 3.83 -9.0 -9 0.9 -9 0 0.0 -9 1103
#> 416: 57.0000 3.42 -9.0 -9 1.6 -9 0 0.0 -9 1055
#> 417: 58.0000 3.75 -9.0 -9 0.8 -9 0 0.0 -9 691
#> 418: 53.0000 3.29 -9.0 -9 0.7 -9 0 0.0 -9 976
#> PLATELET PROTIME SEX SGOT SPIDERS STAGE STATUS TRT TRIG COPPER
#> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#> 1: 190 12.2 1 137.95 1 4 1 1 172 156
#> 2: 221 10.6 1 113.52 1 3 0 1 88 54
#> 3: 151 12.0 0 96.10 0 4 1 1 55 210
#> 4: 183 10.3 1 60.63 1 4 1 1 92 64
#> 5: 136 10.9 1 113.15 1 3 0 2 72 143
#> ---
#> 414: 174 10.9 -9 -9.00 -9 -9 1 -9 -9 -9
#> 415: 180 11.2 -9 -9.00 -9 -9 0 -9 -9 -9
#> 416: 143 9.9 -9 -9.00 -9 -9 0 -9 -9 -9
#> 417: 269 10.4 -9 -9.00 -9 -9 0 -9 -9 -9
#> 418: 350 10.6 -9 -9.00 -9 -9 0 -9 -9 -9
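# The -9 values above look like missing-value codes, although that is an
# assumption not documented here. If so, they could be recoded to NA after
# reading. A sketch on a small hypothetical data frame (toy is not pbc):
toy <- data.frame(TIME = c(400, 681), CHOL = c(261, -9), SGOT = c(137.95, -9))
toy[toy == -9] <- NA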
(iris2 <- read$sas(haven_example("iris.sas7bdat"))) # SAS file
#> Sepal_Length Sepal_Width Petal_Length Petal_Width Species
#> <num> <num> <num> <num> <char>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> ---
#> 146: 6.7 3.0 5.2 2.3 virgin
#> 147: 6.3 2.5 5.0 1.9 virgin
#> 148: 6.5 3.0 5.2 2.0 virgin
#> 149: 6.2 3.4 5.4 2.3 virgin
#> 150: 5.9 3.0 5.1 1.8 virgin
(afalfa <- read(data_example("afalfa.xpt"))) # SAS transport file
#> POP SAMPLE REP SEEDWT HARV1 HARV2
#> <char> <num> <num> <num> <num> <num>
#> 1: min 0 1 64 171.7 180.3
#> 2: min 1 1 54 138.2 150.7
#> 3: min 2 1 40 145.6 129.1
#> 4: min 3 1 45 170.4 191.2
#> 5: min 4 1 64 124.8 172.6
#> 6: MAX 5 1 75 179.0 235.3
#> 7: MAX 6 1 45 166.3 173.9
#> 8: MAX 7 1 63 169.7 155.8
#> 9: MAX 8 1 65 192.9 177.6
#> 10: MAX 9 1 59 185.8 179.2
#> 11: min 0 2 59 158.8 139.7
#> 12: min 1 2 46 163.7 150.0
#> 13: min 2 2 42 120.6 131.1
#> 14: min 3 2 38 193.1 195.4
#> 15: min 4 2 54 171.5 167.6
#> 16: MAX 5 2 59 181.4 152.9
#> 17: MAX 6 2 60 165.3 167.5
#> 18: MAX 7 2 63 163.9 158.0
#> 19: MAX 8 2 70 152.5 150.2
#> 20: MAX 9 2 62 173.5 190.7
#> 21: min 0 3 60 147.9 164.9
#> 22: min 1 3 42 181.3 151.5
#> 23: min 2 3 35 124.3 134.4
#> 24: min 3 3 47 174.8 200.8
#> 25: min 4 3 59 167.8 178.3
#> 26: MAX 5 3 57 193.4 183.5
#> 27: MAX 6 3 60 150.7 147.1
#> 28: MAX 7 3 59 142.5 148.7
#> 29: MAX 8 3 59 176.4 204.8
#> 30: MAX 9 3 70 144.2 143.8
#> 31: min 0 4 61 148.4 168.8
#> 32: min 1 4 52 164.9 158.6
#> 33: min 2 4 43 141.2 158.1
#> 34: min 3 4 49 176.5 208.3
#> 35: min 4 4 60 177.5 137.1
#> 36: MAX 5 4 59 174.1 160.2
#> 37: MAX 6 4 48 155.5 185.8
#> 38: MAX 7 4 61 186.7 157.7
#> 39: MAX 8 4 64 162.4 179.4
#> 40: MAX 9 4 71 141.0 161.5
#> POP SAMPLE REP SEEDWT HARV1 HARV2
# Note that, where completion is available, typing read$<tab> displays the
# list of available file formats
# }