Importing Excel files into R, xlsx or xls

Can someone please help me with the best way to import an Excel 2007 (.xlsx) file into R? I have tried several methods and none of them work. I have upgraded to R 2.13.1 on Windows XP, with xlsx 0.3.0, and I don't know why the error keeps coming up. I tried:

AB<-read.xlsx("C:/AB_DNA_Tag_Numbers.xlsx","DNA_Tag_Numbers") 

or

 AB<-read.xlsx("C:/AB_DNA_Tag_Numbers.xlsx",1) 

but I get the error:

  Error in .jnew("java/io/FileInputStream", file) : java.io.FileNotFoundException: C:\AB_DNA_Tag_Numbers.xlsx (The system cannot find the file specified) 

Thanks.

+84
r xls xlsx
Aug 13 '11
14 answers

For a solution that is free of fiddly external dependencies*, there is now readxl:

The readxl package makes it easy to get data out of Excel and into R. Compared to many of the existing packages (for example, gdata, xlsx, xlsReadWrite), readxl has no external dependencies, so it is easy to install and use on all operating systems. It is designed to work with tabular data stored in a single sheet.

readxl supports both the legacy .xls format and the modern XML-based .xlsx format. .xls support is made possible by the libxls C library, which abstracts away many of the complexities of the underlying binary format. To parse .xlsx, we use the RapidXML C++ library.

It can be installed as follows:

 install.packages("readxl") # CRAN version 

or

 devtools::install_github("hadley/readxl") # development version 

Usage

    library(readxl)

    # read_excel reads both xls and xlsx files
    read_excel("my-old-spreadsheet.xls")
    read_excel("my-new-spreadsheet.xlsx")

    # Specify sheet with a number or name
    read_excel("my-spreadsheet.xls", sheet = "data")
    read_excel("my-spreadsheet.xls", sheet = 2)

    # If NAs are represented by something other than blank cells,
    # set the na argument
    read_excel("my-spreadsheet.xls", na = "NA")
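If the workbook has several sheets, one way to pull them all in at once might look like the sketch below. It assumes readxl is installed and uses the sample workbook that ships with the package (via readxl_example()) in place of your own file; the variable names are just for illustration.

```r
library(readxl)

# readxl_example() returns the path to a sample workbook bundled with readxl;
# substitute the path to your own file here
path <- readxl_example("datasets.xlsx")

# List the sheet names, then read every sheet into a named list of data frames
sheets <- excel_sheets(path)
all_data <- lapply(sheets, function(s) read_excel(path, sheet = s))
names(all_data) <- sheets

# Show one line per sheet
str(all_data, max.level = 1)
```

Each element of all_data is then an ordinary data frame (tibble) that can be accessed by sheet name.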

* Not strictly true: it requires the Rcpp package, which in turn requires Rtools (for Windows) or Xcode (for OS X), and these are dependencies external to R. But they don't require any fiddling with paths and so on, so that is an advantage over the Java and Perl dependencies.

Update: There is now the rexcel package. It promises to get Excel formatting, functions, and many other kinds of information out of an Excel file and into R.

+95
Mar 19 '15 at 7:11

You can also try the XLConnect package. I've had better luck with it than with xlsx (plus it can read .xls files too).

    library(XLConnect)
    theData <- readWorksheet(loadWorkbook("C:/AB_DNA_Tag_Numbers.xlsx"), sheet = 1)

Also, if you are having trouble with your file not being found, try selecting it with file.choose().
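Since the error in the question is a FileNotFoundException, it may be worth checking the path in plain R before handing it to any Excel reader. Here is a rough sketch using only base R; check_excel_path is a made-up helper name, and a temporary file stands in for the real workbook.

```r
# Hypothetical helper: stop early with a readable message if the path is wrong
check_excel_path <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path,
         " (working directory is ", getwd(), ")")
  }
  normalizePath(path)
}

# Demonstration with an empty temporary file standing in for the real workbook
tmp <- tempfile(fileext = ".xlsx")
file.create(tmp)
checked <- check_excel_path(tmp)

# A path that does not exist gives a plain R error instead of a Java exception
result <- tryCatch(
  check_excel_path(file.path(tempdir(), "no_such_file.xlsx")),
  error = function(e) conditionMessage(e)
)
print(result)
```

If the check fails for your real path, the problem is the location or name of the file rather than the package you are reading it with.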

+35
Apr 09 '12

I would definitely try the read.xls function in the gdata package, which is considerably more mature than the xlsx package. It may require Perl, though ...

+22
Aug 14 '11

Update

As the answer below is now somewhat out of date, I'd just draw attention to the readxl package. If the Excel sheet is well formatted/laid out, I would now use readxl to read from the workbook. If the sheets are poorly formatted/laid out, I would still export to CSV and then handle the problems in R, either via read.csv() or plain old readLines().

Original

My preferred way is to save individual Excel worksheets as comma-separated value (CSV) files. On Windows, these files are associated with Excel, so you don't lose the ability to double-click and open them in Excel.

CSV files can be read into R using read.csv() or, if you are in a locale or using a computer set up with some European settings (where , is used as the decimal separator), using read.csv2().

These functions have sensible defaults that make reading appropriately formatted files simple. Just keep any labels for samples or variables in the first row or column.

An added benefit of storing files as CSV is that, as the files are plain text, they can be passed around very easily and you can be confident they will open anywhere; you don't need Excel to view or edit the data.
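To make the read.csv() / read.csv2() difference concrete, here is a small base R sketch. The file contents are invented for illustration: a semicolon-separated file with decimal commas, which is roughly what Excel writes under many European regional settings.

```r
# Write a small semicolon-separated file that uses decimal commas
tmp <- tempfile(fileext = ".csv")
writeLines(c("sample;value", "a;1,5", "b;2,25"), tmp)

# read.csv2() defaults to sep = ";" and dec = ","
european <- read.csv2(tmp)

# The same result with read.csv() and the separators given explicitly
explicit <- read.csv(tmp, sep = ";", dec = ",")

str(european)
```

The value column comes back as numeric in both cases, so no manual replacement of commas is needed.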

+20
Aug 13 '11 at 8:17

Example 2012:

    library("xlsx")
    FirstTable <- read.xlsx("MyExcelFile.xlsx", 1, stringsAsFactors = FALSE)
    SecondTable <- read.xlsx("MyExcelFile.xlsx", 2, stringsAsFactors = FALSE)

  • I would give the "xlsx" package a try, because it is easy to handle and seems mature enough.
  • It worked fine for me and didn't need any additional add-ons like Perl or anything else.

Example 2015:

    library("readxl")
    FirstTable <- read_excel("MyExcelFile.xlsx", 1)
    SecondTable <- read_excel("MyExcelFile.xlsx", 2)

  • Nowadays I use readxl and have had good experience with it.
  • No extra material required.
  • Good performance.
+18
Sep 25 '12 at 10:31

This new package looks nice: http://cran.r-project.org/web/packages/openxlsx/openxlsx.pdf It doesn't require rJava and uses "Rcpp" for speed.

+14
Apr 22 '14 at 3:37

If you run into the same problem and R gives you an error saying it could not find the function ".jnew", just install the rJava library. Or, if you already have it, just run the line library(rJava). That should fix the problem.

Also, it should be clear to everyone that csv and txt files are easier to work with, but life is not easy, and sometimes you just have to open an xlsx.

+5
Apr 09 '12 at 19:35

I recently discovered Schaun Wheeler's function for importing Excel files into R, after realising that the xlsx package had not been updated for R 3.1.0.

https://gist.github.com/schaunwheeler/5825002

The file name needs to have the ".xlsx" extension, and the file must not be open when you run the function.

This function is really useful for accessing other people's work. The main advantages over using the read.csv function are when:

  • Importing multiple Excel files
  • Importing large files
  • Importing files that are updated regularly

Using the read.csv function requires manually opening and saving each Excel document, which is time consuming and very boring. So using Schaun's function to automate the workflow is a massive help.

Big props to Schaun for this solution.
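For illustration, the "import many files without opening each one" idea can be sketched as follows. Note that this is not Schaun's function: it swaps in the readxl package (assumed to be installed), read_excel_folder is a made-up helper name, and the folder of sample workbooks bundled with readxl stands in for your own directory.

```r
library(readxl)

# Hypothetical helper: read the first sheet of every .xlsx file in a folder
read_excel_folder <- function(dir) {
  files <- list.files(dir, pattern = "\\.xlsx$", full.names = TRUE)
  tables <- lapply(files, read_excel)
  # Name each table after its file, without the extension
  names(tables) <- tools::file_path_sans_ext(basename(files))
  tables
}

# The sample workbooks shipped with readxl stand in for a real folder
example_dir <- dirname(readxl_example("datasets.xlsx"))
tables <- read_excel_folder(example_dir)

names(tables)
```

Re-running the script picks up files that have been updated since the last run, with no manual export step.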

+4
May 28 '14 at 6:45

For me, the openxlsx package worked in the easiest way.

    install.packages("openxlsx")
    library(openxlsx)
    rawData <- read.xlsx("your.xlsx")
+4
Jun 26 '15 at 13:16

What is your operating system? Which version of R are you using: 32-bit or 64-bit? What version of Java did you install?

I ran into a similar error when I first started using the read.xlsx() function, and found that my issue (which may or may not be related to yours; at the very least, treat this answer as "try this, too") was related to an incompatibility of the xlsx package with 64-bit Java. I'm fairly certain that the xlsx package requires 32-bit Java.

Use 32-bit R and make sure that 32-bit Java is installed. This may solve your problem.
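If you are not sure which build of R you are running, a quick base R check along these lines should tell you (it says nothing about the Java installation, which has to be checked separately):

```r
# Pointer size in bytes times 8 gives the word size of the running R build
bits <- .Machine$sizeof.pointer * 8
cat("R is running as a", bits, "bit process on", R.version$arch, "\n")
```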

+2
Sep 06 '13 at 18:49

Have you checked that R can actually find the file, e.g. file.exists("C:/AB_DNA_Tag_Numbers.xlsx")? - Ben Bolker Aug 14, 2011 at 23:05

The comment above should solve your problem:

    require("xlsx")
    read.xlsx("filepath/filename.xlsx", 1)

should work fine after that.

+2
Jan 07 '15 at 12:46

You may be able to keep multiple tabs and more formatting information if you export to an OpenDocument Spreadsheet (.ods) file or an older Excel format, and import it with an ODS reader or the Excel reader you mentioned above.

+1
Aug 13 '11 at 8:22

As many here have already said, I am writing the same thing, but with an additional point!

First we need to make sure that these two packages are installed in our RStudio:

  • "readxl"
  • "XLConnect"

To install the packages and load them into R, you can use the following:

    install.packages(c("readxl", "XLConnect"))
    library(readxl)
    library(XLConnect)
    search()

search() displays the list of packages currently attached in your R session.

Now there is another catch: even with these two packages you may still run into a problem reading the "xlsx" file, with an error like "error: more columns than column names".

To solve this problem, you can just save your Excel "xlsx" sheet as

"CSV (Comma Separated)"

and your life will be very easy ....

Good luck

+1
May 27 '16 at 8:30

I tried hard with all the answers above. However, they didn't really help, because I was using a Mac. The rio library has an import function that can import basically any data file into RStudio, even files that use languages other than English!

Try the code below:

    library(rio)
    AB <- import("C:/AB_DNA_Tag_Numbers.xlsx")
    AB <- AB[, 1]

Hope this helps. For more detailed reference: https://cran.r-project.org/web/packages/rio/vignettes/rio.html

+1
Apr 2 '18 at 18:51


