When should data go in /data and when should it go in /inst/extdata?

From the Writing R Extensions manual:

The data subdirectory is for data files, either to be made available via lazy-loading or for loading using data(). (The choice is made by the "LazyData" field in the DESCRIPTION file: the default is not to do so.) It should not be used for other data files needed by the package, and the convention has grown up to use directory inst/extdata for such files.

But it is still not clear to me which data counts as "needed" by the package. I would like to use data for the following (not always mutually exclusive) purposes:

  • documentation
    • function examples
    • functional tests
    • vignettes
  • to provide access to the original dataset
  • to make data available to functions within the package (e.g. lookup/dictionary tables)

However, it is not clear which of these should go in the data folder and which should go in inst/extdata. Are there any conditions under which the data must be placed in one location rather than the other?

Related questions: previous questions (for example, "inst and extdata folders in R Packaging" and "Using inst/extdata with vignette during package checking R 2.14.0") give some usage instructions, but they don't tell me how to decide which directory to use. Another question, "R - where should I put the RDA file - /R, /data, /inst/extdata?", comes closest, but it seems to focus on .rda and .RData files.

1 answer

The data directory provides data for the data() function and is expected to follow certain conventions in terms of file formats and extensions.
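
To make the data() route concrete, here is a small sketch of loading a dataset that a package ships in its data directory. It uses the datasets package and its mtcars dataset, both of which come with base R; with your own package you would substitute your package and dataset names. Loading into a separate environment `e` is just my choice to avoid cluttering the workspace.

```r
# Sketch: load a dataset that a package provides through its data/ directory.
# "datasets" and "mtcars" ship with base R; substitute your own names.
e <- new.env()
data("mtcars", package = "datasets", envir = e)
str(e$mtcars)

# List what a package exposes through data(): one row per dataset.
available <- data(package = "datasets")$results
head(available[, c("Item", "Title")])
```

If the package sets LazyData in its DESCRIPTION file, the dataset is also available by name once the package is attached, without calling data() first.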

The inst/extdata directory becomes extdata/ upon installation and is more of a wild west: you can do whatever you want there, and it is expected that you write your own accessors.
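
As a rough illustration of what "write your own accessors" can look like, here is a minimal sketch built on system.file(), which returns an empty string when the file is not found. The function name `extdata_path`, the file name "lookup.csv" and the package name "mypkg" are hypothetical and only there for illustration.

```r
# Hypothetical accessor: return the installed path of a file under extdata/.
# system.file() returns "" when the file does not exist, so fail loudly then.
extdata_path <- function(file, package) {
  path <- system.file("extdata", file, package = package)
  if (!nzchar(path)) {
    stop("no file '", file, "' in extdata/ of package '", package, "'")
  }
  path
}

# Intended use (hypothetical file and package names):
#   lookup <- read.csv(extdata_path("lookup.csv", "mypkg"))

# A missing file raises an error instead of silently returning "".
res <- tryCatch(
  extdata_path("no-such-file.csv", "stats"),
  error = function(e) "missing"
)
print(res)
```

Inside a package you would typically wrap this in an exported function that also does the reading, so users never need to know where the file lives.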

It may be helpful to look at the empirics. On my machine, among some 240 installed packages, a full 77 (not quite a third) have data/, but only 4 (including one of mine) have extdata/.
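
If you want to repeat that count on your own machine, something along these lines should work. This is a sketch and not the command originally used for the numbers above; the helper name `has_dir` is mine, and the counts will of course differ from one installation to another.

```r
# Sketch: count installed packages that ship a data/ or extdata/ directory.
pkgs <- unique(rownames(installed.packages()))

# TRUE for each package whose installed tree contains the given directory.
has_dir <- function(dir) {
  vapply(pkgs, function(p) nzchar(system.file(dir, package = p)), logical(1))
}

counts <- c(
  total   = length(pkgs),
  data    = sum(has_dir("data")),
  extdata = sum(has_dir("extdata"))
)
print(counts)
```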
