How do I convert an MS Excel date from a float to a date in Ruby?

I am trying to parse an xlsx file using the roo gem in a Ruby script.

In Excel, dates are stored as floats or integers in the format DDDDD.ttttt, counting from 1900-01-00 (00, not 01). Therefore, to convert a date such as 40396, you would take 1900-01-00 + 40396 and should get 2010-10-15, but I get 2010-08-08.

I use active_support/time to do the calculation as follows:

 Time.new("1900-01-01") + 40396.days 

Am I doing my calculation wrong, or is there a bug in active_support?

I am running ruby 1.9.3-mri on Windows 7 with the latest active_support gem (3.2.1).

EDIT

I was looking at an old copy of the file in Excel, with the wrong data, while my script/console was pulling the correct data, hence my confusion. I did everything right except look at the right file! Damn it!

Thanks to everyone who answered. I will leave this question here in case someone needs information on how to convert dates from Excel using Ruby.

Also, for anyone working with spreadsheets in Ruby: the spreadsheet gem does NOT correctly support reading XLSX files at the moment (v 0.7.1), so I use roo for reading and axlsx for writing.

+7
3 answers

You have an off-by-one error in your day numbering, due to a bug in Lotus 1-2-3 that Excel and other spreadsheet programs have carefully maintained compatibility with for over 30 years.

Originally, day 1 was meant to be January 1, 1900 (which, as you said, would make day 0 equal to December 31, 1899). But Lotus mistakenly treated 1900 as a leap year, so if you take the Lotus numbers for present-day dates and count backwards with a calendar that correctly treats 1900 as a common year, the day numbers for everything before March 1, 1900 are one too high. Day 1 becomes December 31, 1899, and day 0 shifts back to the 30th. Thus the epoch for date arithmetic in Lotus-based spreadsheets is really Saturday, December 30, 1899. (Modern Excel and some other spreadsheets extend their Lotus bug-compatibility far enough to keep labeling that date as December 31 while agreeing that it was a Saturday, but other Lotus-based spreadsheets do not, and Ruby certainly does not.)

Even allowing for this error, your stated example is incorrect: Lotus day 40,396 is August 6, 2010, not October 15. I have confirmed this correspondence in Excel, LibreOffice and Google Sheets, all of which agree. You must have mixed up your examples somewhere.

Here is one way to do the conversion:

 Time.utc(1899,12,30) + 40396.days #=> 2010-08-06 00:00:00 UTC 

Alternatively, you can use another well-known correspondence. The zero point for Ruby (and POSIX systems in general) is midnight GMT on January 1, 1970. January 1, 1970 is Lotus day 25,569. As long as you remember to do your calculations in UTC, you can also do this:

 Time.at( (40396 - 25569).days ).utc # => 2010-08-06 00:00:00 UTC 

In either case, you probably want to declare a symbolic constant for the epoch (either a Time object representing 1899-12-30, or the Lotus day number of the POSIX "day 0", 25,569).

You can replace those calls to .days with multiplication by 86,400 if you don't need active_support/core_ext/integer/time for anything else and don't want to load it just for that.
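As a rough sketch of both suggestions together (a named epoch constant and plain multiplication instead of .days), something like the following should work without active_support. The constant and method names are made up for illustration and are not part of any library.

```ruby
# Illustrative names only; nothing here comes from a gem.
EXCEL_EPOCH       = Time.utc(1899, 12, 30) # effective "day 0" of the 1900 date system
UNIX_EPOCH_SERIAL = 25_569                 # serial number of 1970-01-01
SECONDS_PER_DAY   = 86_400

# Approach 1: add whole days (as seconds) to the spreadsheet epoch.
def serial_to_time(serial)
  EXCEL_EPOCH + serial * SECONDS_PER_DAY
end

# Approach 2: shift the serial onto the POSIX epoch, then convert to UTC.
def serial_to_time_via_unix(serial)
  Time.at((serial - UNIX_EPOCH_SERIAL) * SECONDS_PER_DAY).utc
end

puts serial_to_time(40396)          # 2010-08-06 00:00:00 UTC
puts serial_to_time_via_unix(40396) # 2010-08-06 00:00:00 UTC
```

Both helpers stay in UTC throughout, so daylight saving transitions in the local zone cannot shift the result.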

+24

Your calculation is wrong. How did you arrive at the expected result of 2010-10-15?

In Excel, 40396 is 2010-08-06 (when not using the 1904 date system, of course). To demonstrate this, enter 40396 in an Excel cell and set the format to yyyy-mm-dd.

Alternatively, as a rough estimate:

 40396 / 365.2422 = 110.6 (years -- 1900 + 110 = 2010)
 0.6 * 12 = 7.2 (months -- January = 1; 1 + 7 = 8; 8 = August)
 0.2 * 30 = 6 (days)

Excel's calendar incorrectly includes 1900-02-29, which accounts for one day of the difference from your 2010-08-08 result; I am not sure about the reason for the second day of the difference.
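For what it is worth, the two-day gap can be reproduced with plain Date arithmetic. The sketch below suggests that the second day comes from starting the count at 1900-01-01 rather than at the 1900-01-00 (that is, 1899-12-31) described in the question; the variable name is only illustrative.

```ruby
require "date"

serial = 40396

# Base date used in the question's code: two days late.
puts Date.new(1900, 1, 1) + serial   # 2010-08-08

# "1900-01-00", i.e. 1899-12-31: still one day late because of the phantom 1900-02-29.
puts Date.new(1899, 12, 31) + serial # 2010-08-07

# Effective epoch once the phantom leap day is accounted for.
puts Date.new(1899, 12, 30) + serial # 2010-08-06
```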

+3

"Excel stores dates and times as a number representing the number of days since 1900-Jan-0, plus a fractional portion of a 24 hour day: ddddd.tttttt. This is called a serial date, or serial date-time." ( http://www.cpearson.com/excel/datetime.htm )

If your column contains a date-time, and not just a date, the following code is useful:

  dt = DateTime.new(1899, 12, 30) + excel_value.to_f 

Also keep in mind that an Excel workbook can use one of two date modes, 1900-based and 1904-based; the latter is usually enabled by default for spreadsheets created on a Mac. If you consistently find your dates off by 4 years, you should use a different base date:

  dt = DateTime.new(1904, 1, 1) + excel_value.to_f 

You can enable or disable the 1904 date mode for any spreadsheet, but the dates in the spreadsheet will be off by 4 years if you change the setting after adding data. In general, you should use the 1900 date mode, since most users in the wild are Windows-based.
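One way to keep the two base dates from getting mixed up is to wrap them in a small helper. This is only a sketch: the constant names, the method name and the date1904 flag are hypothetical, and the caller is assumed to know which mode the workbook uses.

```ruby
require "date"

# Illustrative names; not part of roo or any other gem.
EPOCH_1900 = DateTime.new(1899, 12, 30) # effective day 0 of the 1900 date system
EPOCH_1904 = DateTime.new(1904, 1, 1)   # day 0 of the 1904 date system

# Convert a serial date-time, picking the base date from the workbook's mode.
def excel_to_datetime(value, date1904 = false)
  (date1904 ? EPOCH_1904 : EPOCH_1900) + value.to_f
end

puts excel_to_datetime(40396.5)        # noon on 2010-08-06
puts excel_to_datetime(38934.5, true)  # the same instant in a 1904-mode workbook
```

The two systems differ by 1,462 days, which is why the same moment has serial 40396.5 in one mode and 38934.5 in the other.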

Note: a gotcha with this method is that rounding errors of +/- 1 second can appear. For me, the dates I import are "close enough", but it is something to keep in mind. A better solution might round to the nearest second to solve this problem.
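A possible way to do that rounding, sketched here with plain Time rather than DateTime, is to convert the serial value to a whole number of seconds before adding it to the epoch. The names are illustrative, and the 1900 date mode is assumed.

```ruby
# Illustrative names; assumes the 1900 date mode.
EXCEL_DAY_ZERO = Time.utc(1899, 12, 30)

# Convert to seconds first and round, so float noise in the
# fractional day cannot leave the result a second off.
def excel_to_time_rounded(value)
  EXCEL_DAY_ZERO + (value.to_f * 86_400).round
end

puts excel_to_time_rounded(40396.5)       # 2010-08-06 12:00:00 UTC
puts excel_to_time_rounded(40396.999999)  # 2010-08-07 00:00:00 UTC
```

Rounding to the nearest second means a value a fraction of a second short of midnight lands on the next day, which is usually what you want for values that were entered as whole seconds in the spreadsheet.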

+3