Read data into R from different formats

Read and return an R object from data on disk, from a URL, or from packages.

Usage

read(
  file,
  type = NULL,
  header = "#",
  header.max = 50L,
  skip = 0L,
  locale = default_locale(),
  lang = getOption("data.io_lang", "en"),
  lang_encoding = "UTF-8",
  as_dataframe = FALSE,
  as_labelled = FALSE,
  comments = NULL,
  package = NULL,
  sidecar_file = TRUE,
  fun_list = NULL,
  hfun = NULL,
  fun = NULL,
  data,
  cache_file = NULL,
  method = "auto",
  quiet = FALSE,
  force = FALSE,
  ...
)

type_from_extension(file, full = FALSE)

hread_text(file, header.max, skip = 0L, locale = default_locale(), ...)

hread_xls(file, header.max, skip = 0L, locale = default_locale(), ...)

hread_xlsx(file, header.max, skip = 0L, locale = default_locale(), ...)

# S3 method for subsettable_type
$(x, name)

# S3 method for read_function_subset
.DollarNames(x, pattern = "")

Arguments

file: The path to the file to read, or the name of the dataset to get from an R package (in that case, you must provide the package= argument).
type: The type (format) of data to read.
header: The character to use for the header and other comments.
header.max: The maximum number of lines to consider for the header.
skip: The number of lines to skip at the beginning of the file.
locale: A readr locale object with all the information required to correctly interpret country-related items. The default value matches R defaults (US English + UTF-8 encoding), and it is advised to use it as much as possible.
lang: The language to use (mainly for comments, labels and units), but also for factor levels or other character strings if a translation exists and if the language is spelled with uppercase characters (e.g., "FR"). The default value can be set with, e.g., options(data.io_lang = "fr") for French.
lang_encoding: Encoding used by R scripts for translation. They should all be encoded as UTF-8, which is the default. However, this argument allows you to specify a different encoding if needed.
as_dataframe: Deprecated: use options(SciViews.as_dtx = as_XXX) instead to specify whether you want a data.frame (as_dtf), a data.table (as_dtt, the default), or a tibble (as_dtbl). Formerly: should the resulting object be converted into a data frame (inheriting from data.frame, tbl and tbl_df, i.e., a tibble)? If FALSE, no conversion is attempted. As part of the deprecation, the value is now always treated as FALSE, whatever you indicate.
as_labelled: Should variables be converted into 'labelled' objects? This keeps labels and units when the vector is manipulated, but it can lead to incompatibilities with some R code (hence, it is FALSE by default).
comments: Comments to add in the created object.
package: The package where to look for the dataset. If file= is not provided, a list of available datasets in the package is displayed.
sidecar_file: If TRUE and a file with same name as file= + .R is found in the same directory, it is considered as code to import these data and it is sourced with local = TRUE, chdir = TRUE and verbose = FALSE. That script must create an object named dataset, which is the result that is returned by the function. It is advised to encode this script in UTF-8, which is the default value, but it is possible to specify a different encoding through the lang_encoding= parameter.
fun_list: The table that maps each type to its read and write functions.
hfun: The function to read the header (lines starting with a special mark, usually '#' at the beginning of the file). This function must have the same arguments as hread_text() and should return a character string with the first header.max lines.
fun: The function to which reading of the data is delegated. If NULL (default), the function is chosen from fun_list.
data: A synonym for file= (the name makes more sense when the dataset is loaded from a package). You cannot use data= and file= at the same time.
cache_file: The path to a local file to use as a cache when file is downloaded (http://, https://, ftp://, or file:// protocols). If cache_file already exists, data are read from this cache, except if force = TRUE (see below). Otherwise, data are saved in it before being used. If cache_file = NULL (the default), a temporary file is used and data are read from the Internet every time. This cache mechanism is particularly useful to provide data associated with a git repository. Put cache_file in .gitignore and use cache_file= in the code (and force = FALSE). That way, the data are downloaded once in a freshly cloned repository, and they are not included in the versioning system (useful for large datasets).
method: The downloading method used ("auto" by default), see utils::download.file().
quiet: When files have to be downloaded, should it be done silently (TRUE), or with feedback and a progress bar (FALSE, by default)?
force: If TRUE and a URL is provided for file= and a path for cache_file=, then the content is downloaded every time, even if the cache file already exists (it is overwritten). By default, it is FALSE, which is the most useful setting to make good use of the cache mechanism.
...: Further arguments passed to the function fun=.
full: Should the full extension be returned, like csv.tar.gz (TRUE), or only the main extension, like csv (FALSE, by default)?
x: A subsettable_type function.
name: The value to use for the type= argument.
pattern: A regular expression to list matching names.
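
The interplay between cache_file= and force= described above can be sketched in base R. The helper fetch_with_cache() below is hypothetical and is not part of data.io; it relies only on utils::download.file() and mimics the documented behaviour: reuse the cache when it exists, download when it is missing or when force = TRUE, and use a temporary file when cache_file = NULL. The demo uses a local file:// URL (Unix-like paths assumed) so that it runs without network access.

```r
# Hypothetical helper (not data.io code): mimics the documented cache logic
fetch_with_cache <- function(url, cache_file = NULL, force = FALSE,
    method = "auto", quiet = TRUE) {
  if (is.null(cache_file)) {
    # No cache requested: download into a temporary file on every call
    dest <- tempfile()
    utils::download.file(url, dest, method = method, quiet = quiet,
      mode = "wb")
    return(dest)
  }
  if (isTRUE(force) || !file.exists(cache_file)) {
    # Cache missing, or refresh forced: (re)download and overwrite the cache
    utils::download.file(url, cache_file, method = method, quiet = quiet,
      mode = "wb")
  }
  # Otherwise, the existing cache is reused untouched
  cache_file
}

# Demo with a local file:// URL standing in for a remote dataset
src <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2"), src)
url <- paste0("file://", normalizePath(src, winslash = "/"))
cache <- tempfile(fileext = ".csv")

first <- readLines(fetch_with_cache(url, cache_file = cache))
writeLines(c("x,y", "3,4"), src)  # The "remote" data change
cached <- readLines(fetch_with_cache(url, cache_file = cache))
forced <- readLines(fetch_with_cache(url, cache_file = cache, force = TRUE))
```

In this sketch, cached still holds the first version of the data because the cache file already exists, while forced reflects the updated source.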

Value

An R object with the data (its class depends on the data being read).

Details

read() provides a single entry point to read various kinds of data, but it delegates the actual work to other functions spread across several R packages. See getOption("read_write").
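
The delegation idea can be illustrated with a small lookup table. The table layout, the read_sketch() function and the three registered types below are assumptions made for illustration only; they do not reproduce the actual content of getOption("read_write") or the internals of read().

```r
# Hypothetical dispatch table: type -> "package::function" (an assumption,
# not the actual content of getOption("read_write"))
fun_table <- data.frame(
  type     = c("csv", "tsv", "rds"),
  read_fun = c("utils::read.csv", "utils::read.delim", "base::readRDS"),
  stringsAsFactors = FALSE
)

# Hypothetical single entry point: guess the type from the file extension,
# look up the matching function and delegate the reading to it
read_sketch <- function(file, type = tools::file_ext(file),
    fun_list = fun_table, ...) {
  spec <- fun_list$read_fun[match(type, fun_list$type)]
  if (is.na(spec))
    stop("no read function registered for type '", type, "'")
  # Split "pkg::fun" and fetch the exported function from its namespace
  parts <- strsplit(spec, "::", fixed = TRUE)[[1]]
  fun <- getExportedValue(parts[1], parts[2])
  fun(file, ...)
}

csv_file <- tempfile(fileext = ".csv")
write.csv(head(datasets::iris), csv_file, row.names = FALSE)
ir_csv <- read_sketch(csv_file)  # Delegated to utils::read.csv()
unknown <- try(read_sketch("data.xyz"), silent = TRUE)  # Unregistered type
```

Keeping the correspondence in a table rather than in a chain of conditions means that a new format can be supported by adding one row, which is presumably why read() accepts a fun_list= argument.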

Author

Philippe Grosjean phgrosjean@sciviews.org

Examples

# read() as a more flexible substitute for data(): you can rename the dataset,
# and the syntax is the same for R datasets and for datasets read from files
read() # List all available datasets in your installed version of R
# List datasets in one particular package
read(package = "data.io")

# Read one dataset from this package, possibly changing its name
(urchin <- read("urchin_bio", package = "data.io"))
#>       origin diameter1 diameter2 height buoyant_weight weight solid_parts
#>       <fctr>     <num>     <num>  <num>          <num>  <num>       <num>
#>   1: Fishery       9.9      10.2    5.0             NA 0.5215      0.4777
#>   2: Fishery      10.5      10.6    5.7             NA 0.6418      0.5891
#>   3: Fishery      10.8      10.8    5.2             NA 0.7336      0.6770
#>   4: Fishery       9.6       9.3    4.6             NA 0.3697      0.3438
#>   5: Fishery      10.4      10.7    4.8             NA 0.6097      0.5587
#>  ---                                                                     
#> 417:    Farm      16.7      17.2    8.5         0.5674 2.4300      2.2900
#> 418:    Farm      16.5      16.5    7.9         0.5472 2.3200      2.1800
#> 419:    Farm      16.8      16.7    8.2         0.4864 2.2200      2.1300
#> 420:    Farm      17.3      17.2    8.5         0.4864 2.5200      2.3400
#> 421:    Farm      17.0      16.6    7.9         0.4357 2.0500      1.9800
#>      integuments dry_integuments digestive_tract dry_digestive_tract gonads
#>            <num>           <num>           <num>               <num>  <num>
#>   1:      0.3658              NA          0.0525              0.0079 0.0000
#>   2:      0.4447              NA          0.0482              0.0090 0.0000
#>   3:      0.5326              NA          0.0758              0.0134 0.0000
#>   4:      0.2661              NA          0.0442              0.0064 0.0000
#>   5:      0.4058              NA          0.0743              0.0117 0.0000
#>  ---                                                                       
#> 417:      1.8400            1.02          0.1661              0.0229 0.0215
#> 418:      1.8000            1.01          0.0977              0.0147 0.0253
#> 419:      1.6300            0.88          0.1704              0.0208 0.0154
#> 420:      1.7200            0.89          0.1444              0.0167 0.0237
#> 421:      1.4300            0.83          0.1462              0.0212 0.0266
#>      dry_gonads skeleton lantern   test spines maturity    sex
#>           <num>    <num>   <num>  <num>  <num>    <int> <fctr>
#>   1:     0.0000   0.1793  0.0211 0.0587 0.0995        0   <NA>
#>   2:     0.0000   0.1880  0.0205 0.0622 0.1053        0   <NA>
#>   3:     0.0000   0.2354  0.0254 0.0836 0.1263        0   <NA>
#>   4:     0.0000   0.0630  0.0167 0.0180 0.0283        0   <NA>
#>   5:     0.0000       NA      NA     NA     NA        0   <NA>
#>  ---                                                          
#> 417:     0.0034   0.9046  0.0750 0.3399 0.4896        0   <NA>
#> 418:     0.0051   0.8965  0.0908 0.3189 0.4868        0   <NA>
#> 419:     0.0020   0.7714  0.0877 0.2961 0.3876        0   <NA>
#> 420:     0.0032   0.7938  0.0772 0.3077 0.4090        0   <NA>
#> 421:     0.0051   0.7421  0.0723 0.2689 0.4009        0   <NA>
# Same, but using labels in French
(urchin <- read("urchin_bio", package = "data.io", lang = "fr"))
#>       origin diameter1 diameter2 height buoyant_weight weight solid_parts
#>       <fctr>     <num>     <num>  <num>          <num>  <num>       <num>
#>   1: Fishery       9.9      10.2    5.0             NA 0.5215      0.4777
#>   2: Fishery      10.5      10.6    5.7             NA 0.6418      0.5891
#>   3: Fishery      10.8      10.8    5.2             NA 0.7336      0.6770
#>   4: Fishery       9.6       9.3    4.6             NA 0.3697      0.3438
#>   5: Fishery      10.4      10.7    4.8             NA 0.6097      0.5587
#>  ---                                                                     
#> 417:    Farm      16.7      17.2    8.5         0.5674 2.4300      2.2900
#> 418:    Farm      16.5      16.5    7.9         0.5472 2.3200      2.1800
#> 419:    Farm      16.8      16.7    8.2         0.4864 2.2200      2.1300
#> 420:    Farm      17.3      17.2    8.5         0.4864 2.5200      2.3400
#> 421:    Farm      17.0      16.6    7.9         0.4357 2.0500      1.9800
#>      integuments dry_integuments digestive_tract dry_digestive_tract gonads
#>            <num>           <num>           <num>               <num>  <num>
#>   1:      0.3658              NA          0.0525              0.0079 0.0000
#>   2:      0.4447              NA          0.0482              0.0090 0.0000
#>   3:      0.5326              NA          0.0758              0.0134 0.0000
#>   4:      0.2661              NA          0.0442              0.0064 0.0000
#>   5:      0.4058              NA          0.0743              0.0117 0.0000
#>  ---                                                                       
#> 417:      1.8400            1.02          0.1661              0.0229 0.0215
#> 418:      1.8000            1.01          0.0977              0.0147 0.0253
#> 419:      1.6300            0.88          0.1704              0.0208 0.0154
#> 420:      1.7200            0.89          0.1444              0.0167 0.0237
#> 421:      1.4300            0.83          0.1462              0.0212 0.0266
#>      dry_gonads skeleton lantern   test spines maturity    sex
#>           <num>    <num>   <num>  <num>  <num>    <int> <fctr>
#>   1:     0.0000   0.1793  0.0211 0.0587 0.0995        0   <NA>
#>   2:     0.0000   0.1880  0.0205 0.0622 0.1053        0   <NA>
#>   3:     0.0000   0.2354  0.0254 0.0836 0.1263        0   <NA>
#>   4:     0.0000   0.0630  0.0167 0.0180 0.0283        0   <NA>
#>   5:     0.0000       NA      NA     NA     NA        0   <NA>
#>  ---                                                          
#> 417:     0.0034   0.9046  0.0750 0.3399 0.4896        0   <NA>
#> 418:     0.0051   0.8965  0.0908 0.3189 0.4868        0   <NA>
#> 419:     0.0020   0.7714  0.0877 0.2961 0.3876        0   <NA>
#> 420:     0.0032   0.7938  0.0772 0.3077 0.4090        0   <NA>
#> 421:     0.0051   0.7421  0.0723 0.2689 0.4009        0   <NA>
# ... and also the levels of factors in French (note: uppercase FR)
(urchin <- read("urchin_bio", package = "data.io", lang = "FR"))
#>        origin diameter1 diameter2 height buoyant_weight weight solid_parts
#>        <fctr>     <num>     <num>  <num>          <num>  <num>       <num>
#>   1: Pêcherie       9.9      10.2    5.0             NA 0.5215      0.4777
#>   2: Pêcherie      10.5      10.6    5.7             NA 0.6418      0.5891
#>   3: Pêcherie      10.8      10.8    5.2             NA 0.7336      0.6770
#>   4: Pêcherie       9.6       9.3    4.6             NA 0.3697      0.3438
#>   5: Pêcherie      10.4      10.7    4.8             NA 0.6097      0.5587
#>  ---                                                                      
#> 417:  Culture      16.7      17.2    8.5         0.5674 2.4300      2.2900
#> 418:  Culture      16.5      16.5    7.9         0.5472 2.3200      2.1800
#> 419:  Culture      16.8      16.7    8.2         0.4864 2.2200      2.1300
#> 420:  Culture      17.3      17.2    8.5         0.4864 2.5200      2.3400
#> 421:  Culture      17.0      16.6    7.9         0.4357 2.0500      1.9800
#>      integuments dry_integuments digestive_tract dry_digestive_tract gonads
#>            <num>           <num>           <num>               <num>  <num>
#>   1:      0.3658              NA          0.0525              0.0079 0.0000
#>   2:      0.4447              NA          0.0482              0.0090 0.0000
#>   3:      0.5326              NA          0.0758              0.0134 0.0000
#>   4:      0.2661              NA          0.0442              0.0064 0.0000
#>   5:      0.4058              NA          0.0743              0.0117 0.0000
#>  ---                                                                       
#> 417:      1.8400            1.02          0.1661              0.0229 0.0215
#> 418:      1.8000            1.01          0.0977              0.0147 0.0253
#> 419:      1.6300            0.88          0.1704              0.0208 0.0154
#> 420:      1.7200            0.89          0.1444              0.0167 0.0237
#> 421:      1.4300            0.83          0.1462              0.0212 0.0266
#>      dry_gonads skeleton lantern   test spines maturity    sex
#>           <num>    <num>   <num>  <num>  <num>    <int> <fctr>
#>   1:     0.0000   0.1793  0.0211 0.0587 0.0995        0   <NA>
#>   2:     0.0000   0.1880  0.0205 0.0622 0.1053        0   <NA>
#>   3:     0.0000   0.2354  0.0254 0.0836 0.1263        0   <NA>
#>   4:     0.0000   0.0630  0.0167 0.0180 0.0283        0   <NA>
#>   5:     0.0000       NA      NA     NA     NA        0   <NA>
#>  ---                                                          
#> 417:     0.0034   0.9046  0.0750 0.3399 0.4896        0   <NA>
#> 418:     0.0051   0.8965  0.0908 0.3189 0.4868        0   <NA>
#> 419:     0.0020   0.7714  0.0877 0.2961 0.3876        0   <NA>
#> 420:     0.0032   0.7938  0.0772 0.3077 0.4090        0   <NA>
#> 421:     0.0051   0.7421  0.0723 0.2689 0.4009        0   <NA>

# Read one dataset from another package, but with labels and comments
data(iris) # The R way: you get the original dataset
# Same result, using read()
ir2 <- read("iris", package = "datasets", lang = NULL)
# ir2 records that it comes from datasets::iris
attr(comment(ir2), "src")
#> [1] "datasets::iris"
# otherwise, it is identical to iris, except it may be a data.table or a
# tibble, depending on user preferences
comment(ir2) <- NULL
# Force coercion into a data.frame
ir2 <- svBase::as_dtf(ir2)
identical(iris, ir2)
#> [1] TRUE
# More interesting: you can get an enhanced version of iris with read():
# (note that variable names are in snake_case now!)
(ir3 <- read("iris", package = "datasets"))
#>      sepal_length sepal_width petal_length petal_width   species
#>             <num>       <num>        <num>       <num>    <fctr>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
class(ir3)
#> [1] "data.table" "data.frame"
comment(ir3)
#> [1] "The 'iris' from 'datasets', but with variables names in snake_case"
#> [2] "(Sepal.Length -> sepal_length, Species -> species)."               
#> attr(,"lang")
#> [1] "en"
#> attr(,"lang_encoding")
#> [1] "UTF-8"
#> attr(,"src")
#> [1] "datasets::iris"
ir3$sepal_length
#>   [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
#>  [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
#>  [37] 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5
#>  [55] 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1
#>  [73] 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5
#>  [91] 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3
#> [109] 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
#> [127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8
#> [145] 6.7 6.7 6.3 6.5 6.2 5.9
#> attr(,"label")
#> [1] "Length of the sepals"
#> attr(,"units")
#> [1] "cm"
# ... and you can get it in French too!
(ir_fr <- read("iris", package = "datasets", lang = "fr"))
#>      sepal_length sepal_width petal_length petal_width   species
#>             <num>       <num>        <num>       <num>    <fctr>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
class(ir_fr)
#> [1] "data.table" "data.frame"
comment(ir_fr)
#> [1] "Jeu de données 'iris' de 'datasets', mais avec noms de variables modifiées"
#> [2] "(Sepal.Length -> sepal_length, Species -> species)."                       
#> attr(,"lang")
#> [1] "fr"
#> attr(,"lang_encoding")
#> [1] "UTF-8"
#> attr(,"src")
#> [1] "datasets::iris"
ir_fr$sepal_length
#>   [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
#>  [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
#>  [37] 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5
#>  [55] 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1
#>  [73] 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5
#>  [91] 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3
#> [109] 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
#> [127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8
#> [145] 6.7 6.7 6.3 6.5 6.2 5.9
#> attr(,"label")
#> [1] "Longueur des sépales"
#> attr(,"units")
#> [1] "cm"

# Sometimes, datasets are more deeply reworked. For instance, trees has
# variables in imperial units (in, ft, and cubic ft), but it is automatically
# reworked by read() into metric variables (m or m^3):
data(trees)
head(trees)
#>   Girth Height Volume
#> 1   8.3     70   10.3
#> 2   8.6     65   10.3
#> 3   8.8     63   10.2
#> 4  10.5     72   16.4
#> 5  10.7     81   18.8
#> 6  10.8     83   19.7
(trees2 <- read("trees", package = "datasets"))
#>     diameter height volume
#>        <num>  <num>  <num>
#>  1:    0.211   21.3  0.292
#>  2:    0.218   19.8  0.292
#>  3:    0.224   19.2  0.289
#>  4:    0.267   21.9  0.464
#>  5:    0.272   24.7  0.532
#>  6:    0.274   25.3  0.558
#>  7:    0.279   20.1  0.442
#>  8:    0.279   22.9  0.515
#>  9:    0.282   24.4  0.640
#> 10:    0.284   22.9  0.563
#> 11:    0.287   24.1  0.685
#> 12:    0.290   23.2  0.595
#> 13:    0.290   23.2  0.606
#> 14:    0.297   21.0  0.603
#> 15:    0.305   22.9  0.541
#> 16:    0.328   22.6  0.629
#> 17:    0.328   25.9  0.957
#> 18:    0.338   26.2  0.776
#> 19:    0.348   21.6  0.728
#> 20:    0.351   19.5  0.705
#> 21:    0.356   23.8  0.977
#> 22:    0.361   24.4  0.898
#> 23:    0.368   22.6  1.028
#> 24:    0.406   21.9  1.085
#> 25:    0.414   23.5  1.206
#> 26:    0.439   24.7  1.569
#> 27:    0.444   25.0  1.577
#> 28:    0.455   24.4  1.651
#> 29:    0.457   24.4  1.458
#> 30:    0.457   24.4  1.444
#> 31:    0.523   26.5  2.180
#>     diameter height volume
comment(trees2)
#> [1] "The 'trees' from 'datasets' but with variables renamed and in m or m^3"
#> [2] "(Girth [in] -> diameter [m], Height [ft] -> height [m],"               
#> [3] "Volume [ft^3] -> volume [m^3])."                                       
#> attr(,"lang")
#> [1] "en"
#> attr(,"lang_encoding")
#> [1] "UTF-8"
#> attr(,"src")
#> [1] "datasets::trees"
trees2$volume
#>  [1] 0.292 0.292 0.289 0.464 0.532 0.558 0.442 0.515 0.640 0.563 0.685 0.595
#> [13] 0.606 0.603 0.541 0.629 0.957 0.776 0.728 0.705 0.977 0.898 1.028 1.085
#> [25] 1.206 1.569 1.577 1.651 1.458 1.444 2.180
#> attr(,"label")
#> [1] "Volume of timber"
#> attr(,"units")
#> [1] "m^3"
# \donttest{
# Read from a GitHub Gist (you need to specify the type here!)
# (ble <- read$csv("http://tinyurl.com/Biostat-Ble"))

# Various versions of the famous iris dataset
(iris <- read(data_example("iris.csv")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <char>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
(iris <- read(data_example("iris.csv.zip")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <char>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
(iris <- read(data_example("iris.csv.gz")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <char>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
(iris <- read(data_example("iris.csv.bz2")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <char>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
(iris <- read(data_example("iris.tsv")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <char>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
(iris <- read(data_example("iris.xls")))
#> New names:
#> • `` -> `...1`
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <char>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
(iris <- read(data_example("iris.xlsx")))
#> New names:
#> • `` -> `...1`
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <char>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
(iris <- read(data_example("iris.rds"))) # Does not transform into tibble!
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <fctr>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
#(iris <- read(data_example("iris.syd"))) ##
#(iris <- read(data_example("iris.csvy"))) ##
#(iris <- read(data_example("iris.csvy.zip"))) ##

# A file with a header both in English (default) and in French
(iris <- read(data_example("iris_short_header.csv")))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#>        <labelled>  <labelled>   <labelled>  <labelled> <labelled>
#>   1:          5.1         3.5          1.4         0.2     setosa
#>   2:          4.9         3.0          1.4         0.2     setosa
#>   3:          4.7         3.2          1.3         0.2     setosa
#>   4:          4.6         3.1          1.5         0.2     setosa
#>   5:          5.0         3.6          1.4         0.2     setosa
#>  ---                                                             
#> 146:          6.7         3.0          5.2         2.3  virginica
#> 147:          6.3         2.5          5.0         1.9  virginica
#> 148:          6.5         3.0          5.2         2.0  virginica
#> 149:          6.2         3.4          5.4         2.3  virginica
#> 150:          5.9         3.0          5.1         1.8  virginica
(iris_fr <- read(data_example("iris_short_header.csv"), lang = "fr"))
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#>        <labelled>  <labelled>   <labelled>  <labelled> <labelled>
#>   1:          5.1         3.5          1.4         0.2     setosa
#>   2:          4.9         3.0          1.4         0.2     setosa
#>   3:          4.7         3.2          1.3         0.2     setosa
#>   4:          4.6         3.1          1.5         0.2     setosa
#>   5:          5.0         3.6          1.4         0.2     setosa
#>  ---                                                             
#> 146:          6.7         3.0          5.2         2.3  virginica
#> 147:          6.3         2.5          5.0         1.9  virginica
#> 148:          6.5         3.0          5.2         2.0  virginica
#> 149:          6.2         3.4          5.4         2.3  virginica
#> 150:          5.9         3.0          5.1         1.8  virginica
# Headers are also recognized in xls/xlsx files
(iris_fr <- read(data_example("iris_short_header.xls"), lang = "fr"))
#> New names:
#> • `` -> `...1`
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#>        <labelled>  <labelled>   <labelled>  <labelled> <labelled>
#>   1:          5.1         3.5          1.4         0.2     setosa
#>   2:          4.9         3.0          1.4         0.2     setosa
#>   3:          4.7         3.2          1.3         0.2     setosa
#>   4:          4.6         3.1          1.5         0.2     setosa
#>   5:          5.0         3.6          1.4         0.2     setosa
#>  ---                                                             
#> 146:          6.7         3.0          5.2         2.3  virginica
#> 147:          6.3         2.5          5.0         1.9  virginica
#> 148:          6.5         3.0          5.2         2.0  virginica
#> 149:          6.2         3.4          5.4         2.3  virginica
#> 150:          5.9         3.0          5.1         1.8  virginica
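# A sketch for inspecting what the header added (output not shown here).
# The <labelled> column type printed above suggests that metadata is attached
# to each column; attributes() lists it without assuming any attribute name.
attributes(iris_fr$Sepal.Length)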

# Read a file with a sidecar file (same name + '.R')
(iris <- read(data_example("iris_sidecar.csv"))) # lang = "en" by default
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Warning: number of items to replace is not a multiple of replacement length
#>      sepal_length sepal_width petal_length petal_width   species
#>             <num>       <num>        <num>       <num>    <fctr>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
(iris <- read(data_example("iris_sidecar.csv"), lang = "EN")) # Full lang
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Warning: number of items to replace is not a multiple of replacement length
#>      sepal_length sepal_width petal_length petal_width      species
#>             <num>       <num>        <num>       <num>       <fctr>
#>   1:          5.1         3.5          1.4         0.2    I. setosa
#>   2:          4.9         3.0          1.4         0.2    I. setosa
#>   3:          4.7         3.2          1.3         0.2    I. setosa
#>   4:          4.6         3.1          1.5         0.2    I. setosa
#>   5:          5.0         3.6          1.4         0.2    I. setosa
#>  ---                                                               
#> 146:          6.7         3.0          5.2         2.3 I. virginica
#> 147:          6.3         2.5          5.0         1.9 I. virginica
#> 148:          6.5         3.0          5.2         2.0 I. virginica
#> 149:          6.2         3.4          5.4         2.3 I. virginica
#> 150:          5.9         3.0          5.1         1.8 I. virginica
(iris <- read(data_example("iris_sidecar.csv"), lang = "en_us")) # US (inches)
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Warning: number of items to replace is not a multiple of replacement length
#>      sepal_length sepal_width petal_length petal_width   species
#>             <num>       <num>        <num>       <num>    <fctr>
#>   1:     2.007874    1.377953    0.5511811  0.07874016    setosa
#>   2:     1.929134    1.181102    0.5511811  0.07874016    setosa
#>   3:     1.850394    1.259843    0.5118110  0.07874016    setosa
#>   4:     1.811024    1.220472    0.5905512  0.07874016    setosa
#>   5:     1.968504    1.417323    0.5511811  0.07874016    setosa
#>  ---                                                            
#> 146:     2.637795    1.181102    2.0472441  0.90551181 virginica
#> 147:     2.480315    0.984252    1.9685039  0.74803150 virginica
#> 148:     2.559055    1.181102    2.0472441  0.78740157 virginica
#> 149:     2.440945    1.338583    2.1259843  0.90551181 virginica
#> 150:     2.322835    1.181102    2.0078740  0.70866142 virginica
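# A quick check in base R, not part of data.io: the values printed for
# lang = "en_us" look like the original measurements divided by 2.54, i.e.
# centimetres converted to inches. This is inferred from the output above.
cm <- c(5.1, 3.5, 1.4, 0.2) # first row of the untranslated data
inches <- cm / 2.54
signif(inches, 7) # compare with row 1 of the en_us table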
(iris <- read(data_example("iris_sidecar.csv"), lang = "fr")) # French
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Warning: number of items to replace is not a multiple of replacement length
#>      sepal_length sepal_width petal_length petal_width   species
#>             <num>       <num>        <num>       <num>    <fctr>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
(iris <- read(data_example("iris_sidecar.csv"), lang = "FR_BE")) # Belgian
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Warning: number of items to replace is not a multiple of replacement length
#>      sepal_length sepal_width petal_length petal_width      species
#>             <num>       <num>        <num>       <num>       <fctr>
#>   1:          5.1         3.5          1.4         0.2    I. setosa
#>   2:          4.9         3.0          1.4         0.2    I. setosa
#>   3:          4.7         3.2          1.3         0.2    I. setosa
#>   4:          4.6         3.1          1.5         0.2    I. setosa
#>   5:          5.0         3.6          1.4         0.2    I. setosa
#>  ---                                                               
#> 146:          6.7         3.0          5.2         2.3 I. virginica
#> 147:          6.3         2.5          5.0         1.9 I. virginica
#> 148:          6.5         3.0          5.2         2.0 I. virginica
#> 149:          6.2         3.4          5.4         2.3 I. virginica
#> 150:          5.9         3.0          5.1         1.8 I. virginica
(iris <- read(data_example("iris_sidecar.csv"), lang = NULL)) # No labels
#> Rows: 150 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Species
#> dbl (4): Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <fctr>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica

# Requires the feather package
#(iris <- read(data_example("iris.feather"))) # Not available on all Windows systems

# Challenging datasets from the readr package
library(readr)
(mtcars <- read(readr_example("mtcars.csv")))
#> Rows: 32 Columns: 11
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (11): mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>       mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>     <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#>  1:  21.0     6 160.0   110  3.90 2.620 16.46     0     1     4     4
#>  2:  21.0     6 160.0   110  3.90 2.875 17.02     0     1     4     4
#>  3:  22.8     4 108.0    93  3.85 2.320 18.61     1     1     4     1
#>  4:  21.4     6 258.0   110  3.08 3.215 19.44     1     0     3     1
#>  5:  18.7     8 360.0   175  3.15 3.440 17.02     0     0     3     2
#>  6:  18.1     6 225.0   105  2.76 3.460 20.22     1     0     3     1
#>  7:  14.3     8 360.0   245  3.21 3.570 15.84     0     0     3     4
#>  8:  24.4     4 146.7    62  3.69 3.190 20.00     1     0     4     2
#>  9:  22.8     4 140.8    95  3.92 3.150 22.90     1     0     4     2
#> 10:  19.2     6 167.6   123  3.92 3.440 18.30     1     0     4     4
#> 11:  17.8     6 167.6   123  3.92 3.440 18.90     1     0     4     4
#> 12:  16.4     8 275.8   180  3.07 4.070 17.40     0     0     3     3
#> 13:  17.3     8 275.8   180  3.07 3.730 17.60     0     0     3     3
#> 14:  15.2     8 275.8   180  3.07 3.780 18.00     0     0     3     3
#> 15:  10.4     8 472.0   205  2.93 5.250 17.98     0     0     3     4
#> 16:  10.4     8 460.0   215  3.00 5.424 17.82     0     0     3     4
#> 17:  14.7     8 440.0   230  3.23 5.345 17.42     0     0     3     4
#> 18:  32.4     4  78.7    66  4.08 2.200 19.47     1     1     4     1
#> 19:  30.4     4  75.7    52  4.93 1.615 18.52     1     1     4     2
#> 20:  33.9     4  71.1    65  4.22 1.835 19.90     1     1     4     1
#> 21:  21.5     4 120.1    97  3.70 2.465 20.01     1     0     3     1
#> 22:  15.5     8 318.0   150  2.76 3.520 16.87     0     0     3     2
#> 23:  15.2     8 304.0   150  3.15 3.435 17.30     0     0     3     2
#> 24:  13.3     8 350.0   245  3.73 3.840 15.41     0     0     3     4
#> 25:  19.2     8 400.0   175  3.08 3.845 17.05     0     0     3     2
#> 26:  27.3     4  79.0    66  4.08 1.935 18.90     1     1     4     1
#> 27:  26.0     4 120.3    91  4.43 2.140 16.70     0     1     5     2
#> 28:  30.4     4  95.1   113  3.77 1.513 16.90     1     1     5     2
#> 29:  15.8     8 351.0   264  4.22 3.170 14.50     0     1     5     4
#> 30:  19.7     6 145.0   175  3.62 2.770 15.50     0     1     5     6
#> 31:  15.0     8 301.0   335  3.54 3.570 14.60     0     1     5     8
#> 32:  21.4     4 121.0   109  4.11 2.780 18.60     1     1     4     2
#>       mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
(mtcars <- read(readr_example("mtcars.csv.zip")))
#> Rows: 32 Columns: 11
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (11): mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>       mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>     <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#>  1:  21.0     6 160.0   110  3.90 2.620 16.46     0     1     4     4
#>  2:  21.0     6 160.0   110  3.90 2.875 17.02     0     1     4     4
#>  3:  22.8     4 108.0    93  3.85 2.320 18.61     1     1     4     1
#>  4:  21.4     6 258.0   110  3.08 3.215 19.44     1     0     3     1
#>  5:  18.7     8 360.0   175  3.15 3.440 17.02     0     0     3     2
#>  6:  18.1     6 225.0   105  2.76 3.460 20.22     1     0     3     1
#>  7:  14.3     8 360.0   245  3.21 3.570 15.84     0     0     3     4
#>  8:  24.4     4 146.7    62  3.69 3.190 20.00     1     0     4     2
#>  9:  22.8     4 140.8    95  3.92 3.150 22.90     1     0     4     2
#> 10:  19.2     6 167.6   123  3.92 3.440 18.30     1     0     4     4
#> 11:  17.8     6 167.6   123  3.92 3.440 18.90     1     0     4     4
#> 12:  16.4     8 275.8   180  3.07 4.070 17.40     0     0     3     3
#> 13:  17.3     8 275.8   180  3.07 3.730 17.60     0     0     3     3
#> 14:  15.2     8 275.8   180  3.07 3.780 18.00     0     0     3     3
#> 15:  10.4     8 472.0   205  2.93 5.250 17.98     0     0     3     4
#> 16:  10.4     8 460.0   215  3.00 5.424 17.82     0     0     3     4
#> 17:  14.7     8 440.0   230  3.23 5.345 17.42     0     0     3     4
#> 18:  32.4     4  78.7    66  4.08 2.200 19.47     1     1     4     1
#> 19:  30.4     4  75.7    52  4.93 1.615 18.52     1     1     4     2
#> 20:  33.9     4  71.1    65  4.22 1.835 19.90     1     1     4     1
#> 21:  21.5     4 120.1    97  3.70 2.465 20.01     1     0     3     1
#> 22:  15.5     8 318.0   150  2.76 3.520 16.87     0     0     3     2
#> 23:  15.2     8 304.0   150  3.15 3.435 17.30     0     0     3     2
#> 24:  13.3     8 350.0   245  3.73 3.840 15.41     0     0     3     4
#> 25:  19.2     8 400.0   175  3.08 3.845 17.05     0     0     3     2
#> 26:  27.3     4  79.0    66  4.08 1.935 18.90     1     1     4     1
#> 27:  26.0     4 120.3    91  4.43 2.140 16.70     0     1     5     2
#> 28:  30.4     4  95.1   113  3.77 1.513 16.90     1     1     5     2
#> 29:  15.8     8 351.0   264  4.22 3.170 14.50     0     1     5     4
#> 30:  19.7     6 145.0   175  3.62 2.770 15.50     0     1     5     6
#> 31:  15.0     8 301.0   335  3.54 3.570 14.60     0     1     5     8
#> 32:  21.4     4 121.0   109  4.11 2.780 18.60     1     1     4     2
#>       mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
(mtcars <- read(readr_example("mtcars.csv.bz2")))
#> Rows: 32 Columns: 11
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (11): mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>       mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>     <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#>  1:  21.0     6 160.0   110  3.90 2.620 16.46     0     1     4     4
#>  2:  21.0     6 160.0   110  3.90 2.875 17.02     0     1     4     4
#>  3:  22.8     4 108.0    93  3.85 2.320 18.61     1     1     4     1
#>  4:  21.4     6 258.0   110  3.08 3.215 19.44     1     0     3     1
#>  5:  18.7     8 360.0   175  3.15 3.440 17.02     0     0     3     2
#>  6:  18.1     6 225.0   105  2.76 3.460 20.22     1     0     3     1
#>  7:  14.3     8 360.0   245  3.21 3.570 15.84     0     0     3     4
#>  8:  24.4     4 146.7    62  3.69 3.190 20.00     1     0     4     2
#>  9:  22.8     4 140.8    95  3.92 3.150 22.90     1     0     4     2
#> 10:  19.2     6 167.6   123  3.92 3.440 18.30     1     0     4     4
#> 11:  17.8     6 167.6   123  3.92 3.440 18.90     1     0     4     4
#> 12:  16.4     8 275.8   180  3.07 4.070 17.40     0     0     3     3
#> 13:  17.3     8 275.8   180  3.07 3.730 17.60     0     0     3     3
#> 14:  15.2     8 275.8   180  3.07 3.780 18.00     0     0     3     3
#> 15:  10.4     8 472.0   205  2.93 5.250 17.98     0     0     3     4
#> 16:  10.4     8 460.0   215  3.00 5.424 17.82     0     0     3     4
#> 17:  14.7     8 440.0   230  3.23 5.345 17.42     0     0     3     4
#> 18:  32.4     4  78.7    66  4.08 2.200 19.47     1     1     4     1
#> 19:  30.4     4  75.7    52  4.93 1.615 18.52     1     1     4     2
#> 20:  33.9     4  71.1    65  4.22 1.835 19.90     1     1     4     1
#> 21:  21.5     4 120.1    97  3.70 2.465 20.01     1     0     3     1
#> 22:  15.5     8 318.0   150  2.76 3.520 16.87     0     0     3     2
#> 23:  15.2     8 304.0   150  3.15 3.435 17.30     0     0     3     2
#> 24:  13.3     8 350.0   245  3.73 3.840 15.41     0     0     3     4
#> 25:  19.2     8 400.0   175  3.08 3.845 17.05     0     0     3     2
#> 26:  27.3     4  79.0    66  4.08 1.935 18.90     1     1     4     1
#> 27:  26.0     4 120.3    91  4.43 2.140 16.70     0     1     5     2
#> 28:  30.4     4  95.1   113  3.77 1.513 16.90     1     1     5     2
#> 29:  15.8     8 351.0   264  4.22 3.170 14.50     0     1     5     4
#> 30:  19.7     6 145.0   175  3.62 2.770 15.50     0     1     5     6
#> 31:  15.0     8 301.0   335  3.54 3.570 14.60     0     1     5     8
#> 32:  21.4     4 121.0   109  4.11 2.780 18.60     1     1     4     2
#>       mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
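# The column-specification message above comes from readr. A sketch for
# silencing it, assuming read() forwards extra arguments in ... to the readr
# reader (as guess_max = and col_names = are in other examples on this page):
mtcars_quiet <- read(readr_example("mtcars.csv"), show_col_types = FALSE)
# Base R alternative that does not rely on that assumption:
mtcars_quiet <- suppressMessages(read(readr_example("mtcars.csv")))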
(challenge <- read(readr_example("challenge.csv"), guess_max = 1001))
#> Rows: 2000 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl  (1): x
#> date (1): y
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>                  x          y
#>              <num>     <Date>
#>    1:  404.0000000       <NA>
#>    2: 4172.0000000       <NA>
#>    3: 3004.0000000       <NA>
#>    4:  787.0000000       <NA>
#>    5:   37.0000000       <NA>
#>   ---                        
#> 1996:    0.1635163 2018-03-29
#> 1997:    0.4719390 2014-08-04
#> 1998:    0.7183186 2015-08-16
#> 1999:    0.2698786 2020-02-04
#> 2000:    0.6082372 2019-01-06
(massey <- read(readr_example("massey-rating.txt")))
#> [1] "UCC PAY LAZ KPK  RT   COF BIH DII ENG ACU Rank Team            Conf\n  1   1   1   1   1     1   1   1   1   1    1 Ohio St          B10 \n  2   2   2   2   2     2   2   2   4   2    2 Oregon           P12 \n  3   4   3   4   3     4   3   4   2   3    3 Alabama          SEC \n  4   3   4   3   4     3   5   3   3   4    4 TCU              B12 \n  6   6   6   5   5     7   6   5   6  11    5 Michigan St      B10 \n  7   7   7   6   7     6  11   8   7   8    6 Georgia          SEC \n  5   5   5   7   6     8   4   6   5   5    7 Florida St       ACC \n  8   8   9   9  10     5   7   7  10   7    8 Baylor           B12 \n  9  11   8  13  11    11  12   9  14   9    9 Georgia Tech     ACC \n 13  10  13  11   8     9  10  11   9  10   10 Mississippi      SEC \n"
# By default, the type cannot be guessed from the extension
# This is a space-separated values file (ssv)
(massey <- read(readr_example("massey-rating.txt"), type = "ssv"))
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   UCC = col_double(),
#>   PAY = col_double(),
#>   LAZ = col_double(),
#>   KPK = col_double(),
#>   RT = col_double(),
#>   COF = col_double(),
#>   BIH = col_double(),
#>   DII = col_double(),
#>   ENG = col_double(),
#>   ACU = col_double(),
#>   Rank = col_double(),
#>   Team = col_character(),
#>   Conf = col_character()
#> )
#> Warning: 10 parsing failures.
#> row col   expected     actual                                                              file
#>   1  -- 13 columns 15 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#>   2  -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#>   3  -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#>   4  -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#>   5  -- 13 columns 15 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> ... ... .......... .......... .................................................................
#> See problems(...) for more details.
#>       UCC   PAY   LAZ   KPK    RT   COF   BIH   DII   ENG   ACU  Rank
#>     <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#>  1:     1     1     1     1     1     1     1     1     1     1     1
#>  2:     2     2     2     2     2     2     2     2     4     2     2
#>  3:     3     4     3     4     3     4     3     4     2     3     3
#>  4:     4     3     4     3     4     3     5     3     3     4     4
#>  5:     6     6     6     5     5     7     6     5     6    11     5
#>  6:     7     7     7     6     7     6    11     8     7     8     6
#>  7:     5     5     5     7     6     8     4     6     5     5     7
#>  8:     8     8     9     9    10     5     7     7    10     7     8
#>  9:     9    11     8    13    11    11    12     9    14     9     9
#> 10:    13    10    13    11     8     9    10    11     9    10    10
#>            Team   Conf
#>          <char> <char>
#>  1:        Ohio     St
#>  2:      Oregon    P12
#>  3:     Alabama    SEC
#>  4:         TCU    B12
#>  5:    Michigan     St
#>  6:     Georgia    SEC
#>  7:     Florida     St
#>  8:      Baylor    B12
#>  9:     Georgia   Tech
#> 10: Mississippi    SEC
# or ...
(massey <- read$ssv(readr_example("massey-rating.txt")))
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   UCC = col_double(),
#>   PAY = col_double(),
#>   LAZ = col_double(),
#>   KPK = col_double(),
#>   RT = col_double(),
#>   COF = col_double(),
#>   BIH = col_double(),
#>   DII = col_double(),
#>   ENG = col_double(),
#>   ACU = col_double(),
#>   Rank = col_double(),
#>   Team = col_character(),
#>   Conf = col_character()
#> )
#> Warning: 10 parsing failures.
#> row col   expected     actual                                                              file
#>   1  -- 13 columns 15 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#>   2  -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#>   3  -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#>   4  -- 13 columns 14 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#>   5  -- 13 columns 15 columns '/home/runner/work/_temp/Library/readr/extdata/massey-rating.txt'
#> ... ... .......... .......... .................................................................
#> See problems(...) for more details.
#>       UCC   PAY   LAZ   KPK    RT   COF   BIH   DII   ENG   ACU  Rank
#>     <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#>  1:     1     1     1     1     1     1     1     1     1     1     1
#>  2:     2     2     2     2     2     2     2     2     4     2     2
#>  3:     3     4     3     4     3     4     3     4     2     3     3
#>  4:     4     3     4     3     4     3     5     3     3     4     4
#>  5:     6     6     6     5     5     7     6     5     6    11     5
#>  6:     7     7     7     6     7     6    11     8     7     8     6
#>  7:     5     5     5     7     6     8     4     6     5     5     7
#>  8:     8     8     9     9    10     5     7     7    10     7     8
#>  9:     9    11     8    13    11    11    12     9    14     9     9
#> 10:    13    10    13    11     8     9    10    11     9    10    10
#>            Team   Conf
#>          <char> <char>
#>  1:        Ohio     St
#>  2:      Oregon    P12
#>  3:     Alabama    SEC
#>  4:         TCU    B12
#>  5:    Michigan     St
#>  6:     Georgia    SEC
#>  7:     Florida     St
#>  8:      Baylor    B12
#>  9:     Georgia   Tech
#> 10: Mississippi    SEC
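# A small sketch (output not shown): type_from_extension(), listed in Usage,
# can be called directly to see what read() has to work with when no type=
# is given. Assumption: it accepts the same path as read(); what it returns
# for a ".txt" file is not documented on this page.
massey_file <- readr_example("massey-rating.txt")
type_from_extension(massey_file)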
(epa <- read$ssv(readr_example("epa78.txt"), col_names = FALSE))
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   X1 = col_character(),
#>   X2 = col_character(),
#>   X3 = col_character(),
#>   X4 = col_character(),
#>   X5 = col_double()
#> )
#> Warning: 17 parsing failures.
#> row col  expected     actual                                                      file
#>   2  -- 5 columns 10 columns '/home/runner/work/_temp/Library/readr/extdata/epa78.txt'
#>   3  -- 5 columns 6 columns  '/home/runner/work/_temp/Library/readr/extdata/epa78.txt'
#>   4  -- 5 columns 3 columns  '/home/runner/work/_temp/Library/readr/extdata/epa78.txt'
#>   5  -- 5 columns 8 columns  '/home/runner/work/_temp/Library/readr/extdata/epa78.txt'
#>   6  -- 5 columns 8 columns  '/home/runner/work/_temp/Library/readr/extdata/epa78.txt'
#> ... ... ......... .......... .........................................................
#> See problems(...) for more details.
#>          X1     X2       X3     X4       X5
#>      <char> <char>   <char> <char>    <num>
#>  1:    ALFA  ROMEO     ALFA  ROMEO 78010003
#>  2: ALFETTA     03       81      8       74
#>  3:  SPIDER   2000       01 SPIDER     2000
#>  4:     AMC    AMC 78020002   <NA>       NA
#>  5: GREMLIN     03       79      9       79
#>  6:   PACER     04       89     11       89
#>  7:   PACER  WAGON       07     90       26
#>  8: CONCORD     04       88     12       90
#>  9: CONCORD  WAGON       07     91       30
#> 10: MATADOR  COUPE       05     97       14
#> 11: MATADOR  SEDAN       06    110       20
#> 12: MATADOR  WAGON       09    112       50
#> 13:   ASTON MARTIN    ASTON MARTIN 78040002
#> 14:   ASTON MARTIN    ASTON MARTIN 78040053
#> 15:    AUDI   AUDI 78050002   <NA>       NA
#> 16:     FOX     03       84     11       84
#> 17:     FOX  WAGON       07     83       40
#> 18:    5000     04       90     15       90
#> 19:  AVANTI AVANTI 78065002   <NA>       NA
#> 20:  AVANTI     II       02     75        8
#>          X1     X2       X3     X4       X5
(example_log <- read(readr_example("example.log")))
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   X1 = col_character(),
#>   X2 = col_logical(),
#>   X3 = col_character(),
#>   X4 = col_character(),
#>   X5 = col_character(),
#>   X6 = col_double(),
#>   X7 = col_double()
#> )
#>              X1     X2                 X3                         X4
#>          <char> <lgcl>             <char>                     <char>
#> 1: 172.21.13.45     NA Microsoft\\JohnDoe 08/Apr/2001:17:39:04 -0800
#> 2:    127.0.0.1     NA              frank 10/Oct/2000:13:55:36 -0700
#>                                                  X5    X6    X7
#>                                              <char> <num> <num>
#> 1: GET /scripts/iisadmin/ism.dll?http/serv HTTP/1.0   200  3401
#> 2:                      GET /apache_pb.gif HTTP/1.0   200  2326
# There are different ways to specify columns for fixed-width files (fwf)
# See ?read_fwf in package readr
(fwf_sample <- read$fwf(readr_example("fwf-sample.txt"),
   col_positions =  fwf_cols(name = 20, state = 10, ssn = 12)))
#> Rows: 3 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> 
#> chr (3): name, state, ssn
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>             name  state          ssn
#>           <char> <char>       <char>
#> 1:    John Smith     WA 418-Y11-4111
#> 2: Mary Hartford     CA 319-Z19-4341
#> 3:    Evan Nolan     IL 219-532-c301
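# Two of the other column specifications, as a sketch: fwf_widths() and
# fwf_positions() are readr helpers documented in ?read_fwf. It is assumed
# here that read$fwf() passes col_positions = through to readr unchanged,
# as it does for fwf_cols() above.
(fwf_sample2 <- read$fwf(readr_example("fwf-sample.txt"),
   col_positions = fwf_widths(c(20, 10, 12), c("name", "state", "ssn"))))
# Only the first and last fields, given by start and end positions
(fwf_sample3 <- read$fwf(readr_example("fwf-sample.txt"),
   col_positions = fwf_positions(c(1, 30), c(20, 42), c("name", "ssn"))))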

# Various examples of Excel datasets from readxl
library(readxl)
(xl <- read(readxl_example("datasets.xls")))
#> New names:
#> • `` -> `...1`
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>             <num>       <num>        <num>       <num>    <char>
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
(xl <- read(readxl_example("datasets.xlsx"), sheet = "mtcars"))
#> New names:
#> • `` -> `...1`
#>       mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>     <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
#>  1:  21.0     6 160.0   110  3.90 2.620 16.46     0     1     4     4
#>  2:  21.0     6 160.0   110  3.90 2.875 17.02     0     1     4     4
#>  3:  22.8     4 108.0    93  3.85 2.320 18.61     1     1     4     1
#>  4:  21.4     6 258.0   110  3.08 3.215 19.44     1     0     3     1
#>  5:  18.7     8 360.0   175  3.15 3.440 17.02     0     0     3     2
#>  6:  18.1     6 225.0   105  2.76 3.460 20.22     1     0     3     1
#>  7:  14.3     8 360.0   245  3.21 3.570 15.84     0     0     3     4
#>  8:  24.4     4 146.7    62  3.69 3.190 20.00     1     0     4     2
#>  9:  22.8     4 140.8    95  3.92 3.150 22.90     1     0     4     2
#> 10:  19.2     6 167.6   123  3.92 3.440 18.30     1     0     4     4
#> 11:  17.8     6 167.6   123  3.92 3.440 18.90     1     0     4     4
#> 12:  16.4     8 275.8   180  3.07 4.070 17.40     0     0     3     3
#> 13:  17.3     8 275.8   180  3.07 3.730 17.60     0     0     3     3
#> 14:  15.2     8 275.8   180  3.07 3.780 18.00     0     0     3     3
#> 15:  10.4     8 472.0   205  2.93 5.250 17.98     0     0     3     4
#> 16:  10.4     8 460.0   215  3.00 5.424 17.82     0     0     3     4
#> 17:  14.7     8 440.0   230  3.23 5.345 17.42     0     0     3     4
#> 18:  32.4     4  78.7    66  4.08 2.200 19.47     1     1     4     1
#> 19:  30.4     4  75.7    52  4.93 1.615 18.52     1     1     4     2
#> 20:  33.9     4  71.1    65  4.22 1.835 19.90     1     1     4     1
#> 21:  21.5     4 120.1    97  3.70 2.465 20.01     1     0     3     1
#> 22:  15.5     8 318.0   150  2.76 3.520 16.87     0     0     3     2
#> 23:  15.2     8 304.0   150  3.15 3.435 17.30     0     0     3     2
#> 24:  13.3     8 350.0   245  3.73 3.840 15.41     0     0     3     4
#> 25:  19.2     8 400.0   175  3.08 3.845 17.05     0     0     3     2
#> 26:  27.3     4  79.0    66  4.08 1.935 18.90     1     1     4     1
#> 27:  26.0     4 120.3    91  4.43 2.140 16.70     0     1     5     2
#> 28:  30.4     4  95.1   113  3.77 1.513 16.90     1     1     5     2
#> 29:  15.8     8 351.0   264  4.22 3.170 14.50     0     1     5     4
#> 30:  19.7     6 145.0   175  3.62 2.770 15.50     0     1     5     6
#> 31:  15.0     8 301.0   335  3.54 3.570 14.60     0     1     5     8
#> 32:  21.4     4 121.0   109  4.11 2.780 18.60     1     1     4     2
#>       mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
(xl <- read(readxl_example("datasets.xlsx"), sheet = 3))
#> New names:
#> • `` -> `...1`
#>     weight      feed
#>      <num>    <char>
#>  1:    179 horsebean
#>  2:    160 horsebean
#>  3:    136 horsebean
#>  4:    227 horsebean
#>  5:    217 horsebean
#>  6:    168 horsebean
#>  7:    108 horsebean
#>  8:    124 horsebean
#>  9:    143 horsebean
#> 10:    140 horsebean
#> 11:    309   linseed
#> 12:    229   linseed
#> 13:    181   linseed
#> 14:    141   linseed
#> 15:    260   linseed
#> 16:    203   linseed
#> 17:    148   linseed
#> 18:    169   linseed
#> 19:    213   linseed
#> 20:    257   linseed
#> 21:    244   linseed
#> 22:    271   linseed
#> 23:    243   soybean
#> 24:    230   soybean
#> 25:    248   soybean
#> 26:    327   soybean
#> 27:    329   soybean
#> 28:    250   soybean
#> 29:    193   soybean
#> 30:    271   soybean
#> 31:    316   soybean
#> 32:    267   soybean
#> 33:    199   soybean
#> 34:    171   soybean
#> 35:    158   soybean
#> 36:    248   soybean
#> 37:    423 sunflower
#> 38:    340 sunflower
#> 39:    392 sunflower
#> 40:    339 sunflower
#> 41:    341 sunflower
#> 42:    226 sunflower
#> 43:    320 sunflower
#> 44:    295 sunflower
#> 45:    334 sunflower
#> 46:    322 sunflower
#> 47:    297 sunflower
#> 48:    318 sunflower
#> 49:    325  meatmeal
#> 50:    257  meatmeal
#> 51:    303  meatmeal
#> 52:    315  meatmeal
#> 53:    380  meatmeal
#> 54:    153  meatmeal
#> 55:    263  meatmeal
#> 56:    242  meatmeal
#> 57:    206  meatmeal
#> 58:    344  meatmeal
#> 59:    258  meatmeal
#> 60:    368    casein
#> 61:    390    casein
#> 62:    379    casein
#> 63:    260    casein
#> 64:    404    casein
#> 65:    318    casein
#> 66:    352    casein
#> 67:    359    casein
#> 68:    216    casein
#> 69:    222    casein
#> 70:    283    casein
#> 71:    332    casein
#>     weight      feed
# Accommodate a column with disparate types via col_types = "list"
(clip <- read(readxl_example("clippy.xls"), col_types = c("text", "list")))
#> New names:
#> • `` -> `...1`
#>                    name      value
#>                  <char>     <list>
#> 1:                 Name     Clippy
#> 2:              Species  paperclip
#> 3: Approx date of death 2007-01-01
#> 4:      Weight in grams        0.9
(clip <- read(readxl_example("clippy.xlsx"), col_types = c("text", "list")))
#> New names:
#> • `` -> `...1`
#>                    name      value
#>                  <char>     <list>
#> 1:                 Name     Clippy
#> 2:              Species  paperclip
#> 3: Approx date of death 2007-01-01
#> 4:      Weight in grams        0.9
tibble::deframe(clip)
#> $Name
#> [1] "Clippy"
#> 
#> $Species
#> [1] "paperclip"
#> 
#> $`Approx date of death`
#> [1] "2007-01-01 UTC"
#> 
#> $`Weight in grams`
#> [1] 0.9
#> 
# Read from a specific range in a sheet
(xl <- read(readxl_example("datasets.xlsx"), range = "mtcars!B1:D5"))
#> New names:
#> • `` -> `...1`
#>      cyl  disp    hp
#>    <num> <num> <num>
#> 1:     6   160   110
#> 2:     6   160   110
#> 3:     4   108    93
#> 4:     6   258   110
(deaths <- read(readxl_example("deaths.xls"), range = cell_rows(5:15)))
#> New names:
#> • `` -> `...1`
#>                   Name Profession   Age Has kids Date of birth Date of death
#>                 <char>     <char> <num>   <lgcl>        <POSc>        <POSc>
#>  1:        David Bowie   musician    69     TRUE    1947-01-08    2016-01-10
#>  2:      Carrie Fisher      actor    60     TRUE    1956-10-21    2016-12-27
#>  3:        Chuck Berry   musician    90     TRUE    1926-10-18    2017-03-18
#>  4:        Bill Paxton      actor    61     TRUE    1955-05-17    2017-02-25
#>  5:             Prince   musician    57     TRUE    1958-06-07    2016-04-21
#>  6:       Alan Rickman      actor    69    FALSE    1946-02-21    2016-01-14
#>  7: Florence Henderson      actor    82     TRUE    1934-02-14    2016-11-24
#>  8:         Harper Lee     author    89    FALSE    1926-04-28    2016-02-19
#>  9:      Zsa Zsa Gábor      actor    99     TRUE    1917-02-06    2016-12-18
#> 10:     George Michael   musician    53    FALSE    1963-06-25    2016-12-25
(deaths <- read(readxl_example("deaths.xlsx"), range = cell_rows(5:15)))
#> New names:
#> • `` -> `...1`
#>                   Name Profession   Age Has kids Date of birth Date of death
#>                 <char>     <char> <num>   <lgcl>        <POSc>        <POSc>
#>  1:        David Bowie   musician    69     TRUE    1947-01-08    2016-01-10
#>  2:      Carrie Fisher      actor    60     TRUE    1956-10-21    2016-12-27
#>  3:        Chuck Berry   musician    90     TRUE    1926-10-18    2017-03-18
#>  4:        Bill Paxton      actor    61     TRUE    1955-05-17    2017-02-25
#>  5:             Prince   musician    57     TRUE    1958-06-07    2016-04-21
#>  6:       Alan Rickman      actor    69    FALSE    1946-02-21    2016-01-14
#>  7: Florence Henderson      actor    82     TRUE    1934-02-14    2016-11-24
#>  8:         Harper Lee     author    89    FALSE    1926-04-28    2016-02-19
#>  9:      Zsa Zsa Gábor      actor    99     TRUE    1917-02-06    2016-12-18
#> 10:     George Michael   musician    53    FALSE    1963-06-25    2016-12-25
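# Besides cell_rows(), readxl re-exports other range helpers. This is a sketch
# calling readxl::read_excel() directly; that read() forwards range= in the
# same way is an assumption based on the calls above.
# - whole columns B to D of the first sheet
xl_cols <- readxl::read_excel(readxl::readxl_example("datasets.xlsx"),
  range = readxl::cell_cols("B:D"))
# - a rectangle of 3 rows by 2 columns anchored at cell C4
xl_rect <- readxl::read_excel(readxl::readxl_example("geometry.xlsx"),
  range = readxl::anchored("C4", dim = c(3, 2)), col_names = FALSE)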
(type_me <- read(readxl_example("type-me.xls"), sheet = "logical_coercion",
  col_types = c("logical", "text")))
#> New names:
#> • `` -> `...1`
#> Warning: Expecting logical in A5 / R5C1: got a date
#> Warning: Expecting logical in A8 / R8C1: got 'cabbage'
#>     maybe boolean?                      description
#>             <lgcl>                           <char>
#>  1:             NA                            empty
#>  2:          FALSE                      0 (numeric)
#>  3:           TRUE                      1 (numeric)
#>  4:             NA                         datetime
#>  5:           TRUE                     boolean true
#>  6:          FALSE                    boolean false
#>  7:             NA                        "cabbage"
#>  8:           TRUE                the string "true"
#>  9:          FALSE                   the letter "F"
#> 10:          FALSE "False" preceded by single quote
(type_me <- read(readxl_example("type-me.xlsx"), sheet = "numeric_coercion",
  col_types = c("numeric", "text")))
#> New names:
#> • `` -> `...1`
#> Warning: Coercing boolean to numeric in A3 / R3C1
#> Warning: Coercing boolean to numeric in A4 / R4C1
#> Warning: Expecting numeric in A5 / R5C1: got a date
#> Warning: Coercing text to numeric in A6 / R6C1: '123456'
#> Warning: Expecting numeric in A8 / R8C1: got 'cabbage'
#>    maybe numeric?         explanation
#>             <num>              <char>
#> 1:             NA               empty
#> 2:              1        boolean true
#> 3:              0       boolean false
#> 4:          40534            datetime
#> 5:         123456 the string "123456"
#> 6:         123456   the number 123456
#> 7:             NA           "cabbage"
(type_me <- read(readxl_example("type-me.xls"), sheet = "date_coercion",
  col_types = c("date", "text")))
#> New names:
#> • `` -> `...1`
#> Warning: Expecting date in A5 / R5C1: got boolean
#> Warning: Expecting date in A6 / R6C1: got 'cabbage'
#> Warning: Coercing numeric to date in A7 / R7C1
#> Warning: Coercing numeric to date in A8 / R8C1
#>      maybe a datetime?          explanation
#>                 <POSc>               <char>
#> 1:                <NA>                empty
#> 2: 2016-05-23 00:00:00     date only format
#> 3: 2016-04-28 11:30:00 date and time format
#> 4:                <NA>         boolean true
#> 5:                <NA>            "cabbage"
#> 6: 1904-01-05 07:12:00        4.3 (numeric)
#> 7: 2012-01-02 00:00:00      another numeric
(type_me <- read(readxl_example("type-me.xlsx"), sheet = "text_coercion",
  col_types = c("text", "text")))
#> New names:
#> • `` -> `...1`
#>        text     explanation
#>      <char>          <char>
#> 1:     <NA>           empty
#> 2:  cabbage       "cabbage"
#> 3:     TRUE    boolean true
#> 4:      1.3         numeric
#> 5:    41175        datetime
#> 6: 36436153 another numeric
(xl <- read(readxl_example("geometry.xls"), col_names = FALSE))
#> New names:
#> • `` -> `...1`
#> • `` -> `...2`
#> • `` -> `...3`
#>      ...1   ...2   ...3
#>    <char> <char> <char>
#> 1:     B3     C3     D3
#> 2:     B4     C4     D4
#> 3:     B5     C5     D5
#> 4:     B6     C6     D6
(xl <- read(readxl_example("geometry.xlsx"), range = cell_rows(4:8)))
#>        B4     C4     D4
#>    <char> <char> <char>
#> 1:     B5     C5     D5
#> 2:     B6     C6     D6
#> 3:   <NA>   <NA>   <NA>
#> 4:   <NA>   <NA>   <NA>

# Various examples from haven
library(haven)
haven_example <- function(path)
  system.file("examples", path, package = "haven", mustWork = TRUE)
(iris2 <- read(haven_example("iris.dta"))) # Stata v. 8-14
#>      sepallength sepalwidth petallength petalwidth   species
#>            <num>      <num>       <num>      <num>    <char>
#>   1:         5.1        3.5         1.4        0.2    setosa
#>   2:         4.9        3.0         1.4        0.2    setosa
#>   3:         4.7        3.2         1.3        0.2    setosa
#>   4:         4.6        3.1         1.5        0.2    setosa
#>   5:         5.0        3.6         1.4        0.2    setosa
#>  ---                                                        
#> 146:         6.7        3.0         5.2        2.3 virginica
#> 147:         6.3        2.5         5.0        1.9 virginica
#> 148:         6.5        3.0         5.2        2.0 virginica
#> 149:         6.2        3.4         5.4        2.3 virginica
#> 150:         5.9        3.0         5.1        1.8 virginica
(iris2 <- read(haven_example("iris.sav"))) # SPSS, TODO: labelled -> factor?
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width          Species
#>             <num>       <num>        <num>       <num> <haven_labelled>
#>   1:          5.1         3.5          1.4         0.2                1
#>   2:          4.9         3.0          1.4         0.2                1
#>   3:          4.7         3.2          1.3         0.2                1
#>   4:          4.6         3.1          1.5         0.2                1
#>   5:          5.0         3.6          1.4         0.2                1
#>  ---                                                                   
#> 146:          6.7         3.0          5.2         2.3                3
#> 147:          6.3         2.5          5.0         1.9                3
#> 148:          6.5         3.0          5.2         2.0                3
#> 149:          6.2         3.4          5.4         2.3                3
#> 150:          5.9         3.0          5.1         1.8                3
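# One possible way to address the TODO above (a sketch only, not necessarily
# what read() will do in the future): convert the labelled column into a
# factor with haven::as_factor(), which uses the value labels as levels.
# The file is read here with haven directly to keep the sketch self-contained.
iris_sav <- haven::read_sav(
  system.file("examples", "iris.sav", package = "haven", mustWork = TRUE))
species <- haven::as_factor(iris_sav$Species)
levels(species)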
(pbc <- read(data_example("pbc.por"))) # SPSS, POR format
#>          AGE   ALB ALKPHOS ASCITES  BILI  CHOL EDEMA EDTRT HEPMEG  TIME
#>        <num> <num>   <num>   <num> <num> <num> <num> <num>  <num> <num>
#>   1: 58.7652  2.60  1718.0       1  14.5   261     1   1.0      1   400
#>   2: 56.4463  4.14  7394.8       0   1.1   302     0   0.0      1  4500
#>   3: 70.0726  3.48   516.0       0   1.4   176     1   0.5      0  1012
#>   4: 54.7406  2.54  6121.8       0   1.8   244     1   0.5      1  1925
#>   5: 38.1054  3.53   671.0       0   3.4   279     0   0.0      1  1504
#>  ---                                                                   
#> 414: 67.0000  2.96    -9.0      -9   1.2    -9     0   0.0     -9   681
#> 415: 39.0000  3.83    -9.0      -9   0.9    -9     0   0.0     -9  1103
#> 416: 57.0000  3.42    -9.0      -9   1.6    -9     0   0.0     -9  1055
#> 417: 58.0000  3.75    -9.0      -9   0.8    -9     0   0.0     -9   691
#> 418: 53.0000  3.29    -9.0      -9   0.7    -9     0   0.0     -9   976
#>      PLATELET PROTIME   SEX   SGOT SPIDERS STAGE STATUS   TRT  TRIG COPPER
#>         <num>   <num> <num>  <num>   <num> <num>  <num> <num> <num>  <num>
#>   1:      190    12.2     1 137.95       1     4      1     1   172    156
#>   2:      221    10.6     1 113.52       1     3      0     1    88     54
#>   3:      151    12.0     0  96.10       0     4      1     1    55    210
#>   4:      183    10.3     1  60.63       1     4      1     1    92     64
#>   5:      136    10.9     1 113.15       1     3      0     2    72    143
#>  ---                                                                      
#> 414:      174    10.9    -9  -9.00      -9    -9      1    -9    -9     -9
#> 415:      180    11.2    -9  -9.00      -9    -9      0    -9    -9     -9
#> 416:      143     9.9    -9  -9.00      -9    -9      0    -9    -9     -9
#> 417:      269    10.4    -9  -9.00      -9    -9      0    -9    -9     -9
#> 418:      350    10.6    -9  -9.00      -9    -9      0    -9    -9     -9
(iris2 <- read$sas(haven_example("iris.sas7bdat"))) # SAS file
#>      Sepal_Length Sepal_Width Petal_Length Petal_Width Species
#>             <num>       <num>        <num>       <num>  <char>
#>   1:          5.1         3.5          1.4         0.2  setosa
#>   2:          4.9         3.0          1.4         0.2  setosa
#>   3:          4.7         3.2          1.3         0.2  setosa
#>   4:          4.6         3.1          1.5         0.2  setosa
#>   5:          5.0         3.6          1.4         0.2  setosa
#>  ---                                                          
#> 146:          6.7         3.0          5.2         2.3  virgin
#> 147:          6.3         2.5          5.0         1.9  virgin
#> 148:          6.5         3.0          5.2         2.0  virgin
#> 149:          6.2         3.4          5.4         2.3  virgin
#> 150:          5.9         3.0          5.1         1.8  virgin
(afalfa <- read(data_example("afalfa.xpt"))) # SAS transport file
#>        POP SAMPLE   REP SEEDWT HARV1 HARV2
#>     <char>  <num> <num>  <num> <num> <num>
#>  1:    min      0     1     64 171.7 180.3
#>  2:    min      1     1     54 138.2 150.7
#>  3:    min      2     1     40 145.6 129.1
#>  4:    min      3     1     45 170.4 191.2
#>  5:    min      4     1     64 124.8 172.6
#>  6:    MAX      5     1     75 179.0 235.3
#>  7:    MAX      6     1     45 166.3 173.9
#>  8:    MAX      7     1     63 169.7 155.8
#>  9:    MAX      8     1     65 192.9 177.6
#> 10:    MAX      9     1     59 185.8 179.2
#> 11:    min      0     2     59 158.8 139.7
#> 12:    min      1     2     46 163.7 150.0
#> 13:    min      2     2     42 120.6 131.1
#> 14:    min      3     2     38 193.1 195.4
#> 15:    min      4     2     54 171.5 167.6
#> 16:    MAX      5     2     59 181.4 152.9
#> 17:    MAX      6     2     60 165.3 167.5
#> 18:    MAX      7     2     63 163.9 158.0
#> 19:    MAX      8     2     70 152.5 150.2
#> 20:    MAX      9     2     62 173.5 190.7
#> 21:    min      0     3     60 147.9 164.9
#> 22:    min      1     3     42 181.3 151.5
#> 23:    min      2     3     35 124.3 134.4
#> 24:    min      3     3     47 174.8 200.8
#> 25:    min      4     3     59 167.8 178.3
#> 26:    MAX      5     3     57 193.4 183.5
#> 27:    MAX      6     3     60 150.7 147.1
#> 28:    MAX      7     3     59 142.5 148.7
#> 29:    MAX      8     3     59 176.4 204.8
#> 30:    MAX      9     3     70 144.2 143.8
#> 31:    min      0     4     61 148.4 168.8
#> 32:    min      1     4     52 164.9 158.6
#> 33:    min      2     4     43 141.2 158.1
#> 34:    min      3     4     49 176.5 208.3
#> 35:    min      4     4     60 177.5 137.1
#> 36:    MAX      5     4     59 174.1 160.2
#> 37:    MAX      6     4     48 155.5 185.8
#> 38:    MAX      7     4     61 186.7 157.7
#> 39:    MAX      8     4     64 162.4 179.4
#> 40:    MAX      9     4     71 141.0 161.5
#>        POP SAMPLE   REP SEEDWT HARV1 HARV2

# Note that, where completion is available, a list of the supported file
# formats is displayed after typing read$<tab>
# }