Determining the Signedness of HDF5 Variables in NetCDF

My team has been given HDF5 files to read. They contain structured data with unsigned variables. We were very pleased to find the NetCDF library, which lets you read plain HDF5 files in pure Java, although it uses the NetCDF data model.

No problem, we thought: we would simply translate from the NetCDF data model to whatever model we wanted, as long as we could get at the data. Then we tried to read a 32-bit unsigned integer from the HDF5 file. We can open the file in HDFView 2.9 and see that the variable is an unsigned 32-bit integer. But it turns out that NetCDF-3 does not support unsigned values!

To add insult to injury, NetCDF-3's recommendation is to "widen the data type" or to use an _Unsigned = "true" attribute (I kid you not) to indicate that the 32 bits should be treated as an unsigned value.

Well, maybe these kludges will be effective if I create NetCDF data from scratch, but how can I detect with NetCDF that the 32-bit value in an existing HDF5 file should be interpreted as unsigned?

Update: Apparently NetCDF-4 does support unsigned data types. So the question becomes: how can I determine whether a value read through the Java NetCDF library is signed or unsigned? I do not see any unsigned types in ucar.ma2.DataType.

unsigned hdf5 netcdf
Apr 30 '13 at 22:28
3 answers

Yes, you can look for the _Unsigned = "true" attribute, or you can call Variable.isUnsigned().

Since Java does not support unsigned types, this was a difficult design decision. In the end, we decided not to widen the type automatically, for efficiency. The application must therefore check for unsigned data and do the right thing itself. Take a look at the helper methods ucar.ma2.DataType.unsignedXXX().
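To make "do the right thing" concrete, here is a minimal sketch of the widening involved, written in plain Java. It is not the library's code: the class UnsignedWiden and its method names are hypothetical, chosen only to illustrate the masking that such helpers would presumably perform.

```java
// Hypothetical helper class for illustration; not part of netCDF-Java.
final class UnsignedWiden {
    private UnsignedWiden() {}

    /** Widens a byte holding an unsigned 8-bit value to a short in 0..255. */
    static short byteToShort(byte b) {
        return (short) (b & 0xFF);
    }

    /** Widens a short holding an unsigned 16-bit value to an int in 0..65535. */
    static int shortToInt(short s) {
        return s & 0xFFFF;
    }

    /** Widens an int holding an unsigned 32-bit value to a long in 0..4294967295. */
    static long intToLong(int i) {
        // The long mask clears the sign-extended upper 32 bits.
        return i & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        int raw = 0xFFFFFFFF; // bit pattern of the largest unsigned 32-bit value
        System.out.println(raw);            // prints -1 (signed view)
        System.out.println(intToLong(raw)); // prints 4294967295 (unsigned view)
    }
}
```

On Java 8 and later, the standard methods Byte.toUnsignedInt, Short.toUnsignedInt and Integer.toUnsignedLong perform the same conversions.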

When you read the data, you get an Array object, and you can call Array.isUnsigned() on it. Extractor methods such as Array.getDouble() will also convert unsigned values correctly.

The netCDF-Java library supports an extended data model called the "Common Data Model" (CDM) to abstract away the differences between file formats. We are therefore not bound by the limitations of the netCDF-3 file format or data model. But we are still in Java.

John

May 01 '13 at 14:05

Given that Java does not have unsigned types, I think the only options are: 1) automatically widen unsigned data (turn bytes into shorts, shorts into ints, ints into longs), or 2) represent both signed and unsigned integers with the available Java data types, and let the user decide whether and when a value should be widened.

Perhaps the main use of unsigned data is to represent bit fields, in which case the conversion would be a waste, since you just mask and test the bits.
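A short sketch of why no widening is needed for bit tests: the mask-and-test result is the same whether the 32 bits are viewed as signed or unsigned. The class BitFlags and the sample bit pattern are made up for illustration.

```java
// Hypothetical example class; the flag word below is invented sample data.
final class BitFlags {
    private BitFlags() {}

    /** Returns true if the given bit (0 = least significant, 31 = most significant) is set. */
    static boolean isSet(int flags, int bit) {
        // Masking looks only at the raw bit pattern, so the Java sign is irrelevant.
        return (flags & (1 << bit)) != 0;
    }

    public static void main(String[] args) {
        int flags = 0x80000001; // as a signed int this is negative
        System.out.println(isSet(flags, 31)); // prints true
        System.out.println(isSet(flags, 0));  // prints true
        System.out.println(isSet(flags, 5));  // prints false
    }
}
```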

Another main use is, for example, satellite data, which often uses unsigned bytes, and again I think that automatic widening is not the right choice. Instead, you simply widen the value right where you use it.
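Widening at the point of use might look like the following sketch, which averages unsigned 8-bit samples stored in a Java byte array without first copying them into a wider array. The class UnsignedByteStats and the sample values are hypothetical.

```java
// Hypothetical example class; not part of netCDF-Java.
final class UnsignedByteStats {
    private UnsignedByteStats() {}

    /** Computes the mean of samples that are stored as bytes but represent values 0..255. */
    static double mean(byte[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long sum = 0;
        for (byte b : samples) {
            sum += b & 0xFF; // widen to 0..255 only here, where the value is used
        }
        return (double) sum / samples.length;
    }

    public static void main(String[] args) {
        // 200 and 250 do not fit in a signed byte, so they are stored as negative numbers.
        byte[] samples = {(byte) 200, (byte) 250, 0};
        System.out.println(mean(samples)); // prints 150.0
    }
}
```

The stored array stays at one byte per sample; only the running sum uses a wider type.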

May 01 '13 at 18:34

It appears that when CDM data types are mapped to Java, NetCDF automatically adds the _Unsigned = "true" attribute to the variable. I therefore assume that checking for this attribute will tell me whether the value is unsigned. This may be exactly what I was looking for; I will check tomorrow that it works.

Update: I tried this and it works. In addition, as John Caron pointed out in the accepted answer, the NetCDF Array class has an isUnsigned() method that checks the _Unsigned attribute.

Apr 30 '13 at 23:27


