Paste from Excel into a C# application while retaining full precision

I have data in an Excel spreadsheet with values similar to these:

  • 0.69491375
  • 0.31220394

Cells are formatted as Percentage and are set to display two decimal places. Therefore, they are displayed in Excel as:

  • 69.49%
  • 31.22%

I have a C# program that reads this data from the Clipboard:

    var dataObj = Clipboard.GetDataObject();
    var format = DataFormats.CommaSeparatedValue;
    if (dataObj != null && dataObj.GetDataPresent(format))
    {
        var csvData = dataObj.GetData(format);
        // do something
    }

The problem is that csvData contains the displayed values from Excel, i.e. '69.49%' and '31.22%'. It does not contain the full precision of the extra decimal places.

I tried using different DataFormats values, but the data only ever contains the displayed value from Excel, for example:

  • DataFormats.Dif
  • DataFormats.Rtf
  • DataFormats.UnicodeText
  • and so on.

As a test, I installed LibreOffice Calc and copied and pasted the same cells from Excel into Calc. Calc retains the full precision of the source data.

So Excel clearly puts this data somewhere that other programs can access. How can I access it from my C# application?

Edit - next steps.

I downloaded the LibreOffice Calc source code and will have a look to see whether I can find out how it gets the full content of the copied data from Excel.

I also called GetFormats() on the data object returned from the clipboard and got a list of 24 different data formats, some of which are not listed in the DataFormats class. They include formats such as Biff12, Biff8, Biff5 and Format129, among others that are unfamiliar to me, so I will investigate them and report back if I make any discoveries...
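For anyone who wants to reproduce that list, here is a minimal sketch of how the formats could be dumped. The class and method names (ClipboardFormatDump, DescribeFormats) are made up for illustration; only Clipboard.GetDataObject(), IDataObject.GetFormats() and IDataObject.GetData() come from Windows Forms. It assumes a Windows Forms reference and an STA thread, and the list you see will depend on the Excel version.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

public static class ClipboardFormatDump
{
    // Hypothetical helper: returns one line per format offered by the data object,
    // together with the .NET type that GetData() hands back for it.
    public static IEnumerable<string> DescribeFormats(IDataObject dataObj)
    {
        foreach (string name in dataObj.GetFormats())
        {
            string typeName;
            try
            {
                object data = dataObj.GetData(name);
                typeName = data == null ? "(null)" : data.GetType().Name;
            }
            catch (Exception ex)
            {
                // Some formats cannot be materialised by .NET; report instead of failing.
                typeName = "(error: " + ex.GetType().Name + ")";
            }
            yield return name + " -> " + typeName;
        }
    }

    // Clipboard access requires a single-threaded apartment.
    [STAThread]
    public static void Main()
    {
        IDataObject dataObj = Clipboard.GetDataObject();
        if (dataObj == null)
        {
            Console.WriteLine("Clipboard is empty.");
            return;
        }
        foreach (string line in DescribeFormats(dataObj))
        {
            Console.WriteLine(line);
        }
    }
}
```

Formats that come back as a MemoryStream (the Biff ones did for me) are raw bytes that need a format-specific parser.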

2 answers

Also not a complete answer, but some additional information about the problem:

When you copy a single Excel cell, what ends up on the clipboard is a complete Excel workbook containing a single worksheet, which in turn contains one cell:

    var dataObject = Clipboard.GetDataObject();
    var mstream = (MemoryStream)dataObject.GetData("XML Spreadsheet");

    // Note: For some reason we need to ignore the last byte, otherwise
    // an exception will occur...
    mstream.SetLength(mstream.Length - 1);

    var xml = XElement.Load(mstream);

If you now dump the contents of the XElement to the console, you can see that you really do get a full Excel workbook. The XML Spreadsheet format also contains the internal representation of the numbers stored in the cells, so I assume you can use LINQ to XML or something similar to extract the required data:

    XNamespace ssNs = "urn:schemas-microsoft-com:office:spreadsheet";
    var numbers = xml.Descendants(ssNs + "Data")
        .Where(e => (string)e.Attribute(ssNs + "Type") == "Number")
        .Select(e => (double)e);

I also tried reading the Biff formats using Excel Data Reader, but the resulting DataSets always came out empty...
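Putting the two snippets together, a self-contained sketch might look like the following. The class and method names (XmlSpreadsheetNumbers, Extract) are hypothetical. It also assumes that the byte which has to be dropped is a trailing NUL terminator, which I have not verified, so it strips trailing zero bytes instead of blindly cutting off one byte.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class XmlSpreadsheetNumbers
{
    static readonly XNamespace Ss = "urn:schemas-microsoft-com:office:spreadsheet";

    // Hypothetical helper: parses an "XML Spreadsheet" clipboard payload and
    // returns the unformatted value of every numeric cell, in document order.
    public static List<double> Extract(byte[] payload)
    {
        // Assumption: the payload ends with one or more NUL bytes that upset
        // the XML parser, so they are excluded from the parsed range.
        int length = payload.Length;
        while (length > 0 && payload[length - 1] == 0)
        {
            length--;
        }

        using (var stream = new MemoryStream(payload, 0, length))
        {
            XElement workbook = XElement.Load(stream);
            return workbook.Descendants(Ss + "Data")
                .Where(e => (string)e.Attribute(Ss + "Type") == "Number")
                // The XElement-to-double conversion is culture-invariant.
                .Select(e => (double)e)
                .ToList();
        }
    }
}

// Possible usage with the clipboard (Windows Forms, STA thread):
//   var ms = (MemoryStream)Clipboard.GetDataObject().GetData("XML Spreadsheet");
//   List<double> values = XmlSpreadsheetNumbers.Extract(ms.ToArray());
```

Taking a byte array rather than the clipboard stream keeps the parsing step independent of Windows Forms, so it can be tried out on a saved payload.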


The BIFF formats are an open specification from Microsoft. (Note that I say specification, not standard.) Give this a read to get an idea of what is going on.

The BIFFs you see correspond to particular Excel file formats: BIFF5 is XLS from Excel 5.0 and 95, BIFF8 is XLS from Excel 97 to 2003, and BIFF12 is XLSB, the binary workbook format introduced with Excel 2007 (I think Excel 2010 produces it too). There is some documentation here as well as here (from OpenOffice) that can help you make sense of the binary...

In any case, work has been done in the past to parse these documents in C++, Java, VB and, in your case, C#. For example, there are the BIFF12 Reader, the NExcel project and ExcelLibrary, to name a few.

In particular, NExcel lets you pass in a stream, which you can create from the clipboard data, and then query NExcel to retrieve the data. If you are going to look at the source code, I think ExcelLibrary is much more readable.

You can get the stream as follows:

    var dataobject = System.Windows.Forms.Clipboard.GetDataObject();
    var stream = (System.IO.Stream)dataobject.GetData(format);

Reading that stream with NExcel would then look something like this:

    var wb = getWorkbook(stream);
    var sheet = wb.Sheets[0];
    var somedata = sheet.getCell(0, 0).Contents;

I think the actual Microsoft Office libraries will work as well.

I know this is not the whole story; please share how it goes. I will try it myself if I get a chance.
