Exclude datasets from an R package build

I am developing an R package that has several large .rda data files in the "data" folder.

When I build the package (with R CMD build, which creates a .tar.gz file), the data files are included as well. Since they are really large, this makes the build (and the check) very slow, and the final package needlessly large.

The data is loaded from a database by a function in the package, so the goal is not to ship data with the package but to let users fill the data folder from their own database. The data I currently have is only for testing, and it makes no sense to include it in the package.

To summarize my question: is it possible to keep data in the "data" folder, but exclude it from the built package?

Edit

OK, I found a first solution: create a file called .Rbuildignore that contains the line:

 ^data/.+$ 

However, the problem remains for R CMD INSTALL and R CMD check, which do not take the .Rbuildignore file into account.

Any suggestions for excluding the folder from the install and check steps as well?

1 answer

If you use .Rbuildignore, you must first build and then check your package (it is a build-ignore file, not a check-ignore file). Here are a few tests on a Debian system with a random package:

 l@np350v5c:~/src/yapomif/pkg$ ls
 data  DESCRIPTION  man  NAMESPACE  R
 l@np350v5c:~/src/yapomif/pkg$ R
 > save(Formaldehyde, file = "data/formal.rda")
 l@np350v5c:~/src/yapomif/pkg$ ls -l
 totale 20
 drwxr-xr-x 2 l l 4096 mag  1 01:31 data
 -rw-r--r-- 1 l l  349 apr 25 00:35 DESCRIPTION
 drwxr-xr-x 2 l l 4096 apr 25 01:10 man
 -rw-r--r-- 1 l l 1189 apr 25 00:33 NAMESPACE
 drwxr-xr-x 2 l l 4096 apr 25 01:02 R
 l@np350v5c:~/src/yapomif/pkg$ ls -l data/
 totale 4
 -rw-r--r-- 1 l l  229 mag  1 01:31 formal.rda

Now I create exactly your .Rbuildignore:

 l@np350v5c:~/src/yapomif/pkg$ em .Rbuildignore
 l@np350v5c:~/src/yapomif/pkg$ cat .Rbuildignore
 ^data/.+$
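
As a rough way to see which paths that pattern covers: as far as I know, R CMD build treats each line of .Rbuildignore as a Perl-style regular expression and matches it case-insensitively against paths relative to the package root. The sketch below only approximates that with grep -iE, and the list of paths is made up for illustration.

```shell
# Hypothetical paths, relative to the package root.
paths='data/formal.rda
data/big/table.rda
R/load.R
inst/extdata/data/x.csv
data'

# -i approximates the case-insensitive matching used for .Rbuildignore;
# -E is enough here because the pattern uses no Perl-only syntax.
ignored=$(printf '%s\n' "$paths" | grep -iE '^data/.+$')

# Print the paths the pattern would exclude from the tarball.
printf '%s\n' "$ignored"
```

Only the two files under the top-level data/ match. The bare "data" entry does not, which fits the build log reporting the directory as empty and removing it, rather than ignoring it.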

OK, let's build:

 l@np350v5c:~/src/yapomif/pkg$ cd ..
 l@np350v5c:~/src/yapomif$ R CMD build pkg
 > tools:::.build_packages()
 * checking for file 'pkg/DESCRIPTION' ... OK
 * preparing 'yapomif':
 * checking DESCRIPTION meta-information ... OK
 * checking for LF line-endings in source and make files
 * checking for empty or unneeded directories
 Removed empty directory 'yapomif/data'
 * building 'yapomif_0.8.tar.gz'

Good (note the message about 'yapomif/data'). Now check the package:

 l@np350v5c:~/src/yapomif$ R CMD check yapomif_0.8.tar.gz
 > tools:::.check_packages()
 * using log directory '/home/l/.src/yapomif/yapomif.Rcheck'
 * using R version 3.1.0 (2014-04-10)
 * using platform: x86_64-pc-linux-gnu (64-bit)
 ...

... everything is as usual

Now let's inspect the tarball (moved to my home directory to keep my development directory clean):

 l@np350v5c:~/src/yapomif$ mv yapomif_0.8.tar.gz ~
 l@np350v5c:~/src/yapomif$ cd
 l@np350v5c:~$ tar xvzf yapomif_0.8.tar.gz
 l@np350v5c:~$ ls yapomif
 DESCRIPTION  man  NAMESPACE  R

So there is no data directory in the built package.
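
If you would rather not unpack the tarball, listing its contents should give the same answer. This is only a sketch: has_data_dir is a helper name I made up, and the archive here is a stand-in built with plain tar so the snippet runs without R. With a real package you would pass the file produced by R CMD build.

```shell
# has_data_dir TARBALL: succeed if the archive holds anything under <pkg>/data/.
# (Hypothetical helper, not part of R.)
has_data_dir() {
    tar -tzf "$1" | grep -Eq '^[^/]+/data/'
}

# Stand-in for a built package without a data directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/yapomif/R" "$tmp/yapomif/man"
touch "$tmp/yapomif/DESCRIPTION" "$tmp/yapomif/NAMESPACE"
tar -czf "$tmp/yapomif_0.8.tar.gz" -C "$tmp" yapomif

if has_data_dir "$tmp/yapomif_0.8.tar.gz"; then
    result="data present"
else
    result="no data directory"
fi
echo "$result"
```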

But if you check the source directory directly instead of the tarball:

 l@np350v5c:~/src/yapomif$ R CMD check pkg
 ...
 Undocumented data sets:
   'Formaldehyde'

So, as indicated, first build, then check.
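
The same order can be wrapped in a small script, sketched here under a few assumptions: tarball_name and build_then_check are names I invented, the tarball name is taken to be <Package>_<Version>.tar.gz read from DESCRIPTION, and R CMD INSTALL is run on the tarball so that the install step also sees the package without its data directory.

```shell
# tarball_name DIR: print the file name R CMD build is expected to produce
# for the package in DIR, i.e. <Package>_<Version>.tar.gz.
# (Hypothetical helper; assumes simple one-line Package/Version fields.)
tarball_name() {
    name=$(sed -n 's/^Package:[[:space:]]*//p' "$1/DESCRIPTION")
    version=$(sed -n 's/^Version:[[:space:]]*//p' "$1/DESCRIPTION")
    printf '%s_%s.tar.gz\n' "$name" "$version"
}

# build_then_check DIR: build first so .Rbuildignore is applied, then
# check and install the tarball rather than the source directory.
build_then_check() {
    tarball=$(tarball_name "$1")
    R CMD build "$1" &&
        R CMD check "$tarball" &&
        R CMD INSTALL "$tarball"
}
```

With the layout above you would run build_then_check pkg from ~/src/yapomif.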

HTH, Luca
