The good news: nothing is actually wrong.
Looking at DER ASN.1 Encoding
Take a look at the start of both of the following DER encodings:
C#: 308201AE... Java: 3080...
The C# encoding uses definite lengths: 30 indicates a SEQUENCE, 82 says that the length is encoded in the two bytes that follow, and 01AE is the actual length value, 430. The 430 bytes that follow, plus the 4 bytes already read, add up to 434 bytes in total.
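To make the length arithmetic concrete, here is a small Java sketch (my own illustration, not from the original article) that decodes such a definite-length header by hand, using the first four bytes of the C# output:

    public class DerLengthDemo {
        public static void main(String[] args) {
            // The first four bytes of the C# encoding: 30 82 01 AE
            byte[] header = { 0x30, (byte) 0x82, 0x01, (byte) 0xAE };

            int tag = header[0] & 0xFF;        // 0x30: constructed SEQUENCE
            int first = header[1] & 0xFF;      // 0x82: long form, length follows in 2 bytes
            int numLengthBytes = first & 0x7F; // 2

            int length = 0;
            for (int i = 0; i < numLengthBytes; i++) {
                length = (length << 8) | (header[2 + i] & 0xFF);
            }

            System.out.println(tag);                         // 48 (0x30)
            System.out.println(length);                      // 430 (0x01AE)
            System.out.println(2 + numLengthBytes + length); // 434 = 4 header bytes + 430 value bytes
        }
    }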
The Java encoding, on the other hand, announces an indefinite length ( 80 ). Strictly speaking this is no longer DER but BER encoding. It means that no explicit length is given for the element; instead the element is terminated by a special END OF CONTENTS marker, which is encoded as 0000 . You will notice quite a few of those at the end of the Java encoding. You can read more about this in a BER/DER guide.
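To see both forms side by side, here is a short sketch using Bouncy Castle's Java ASN.1 classes (my own example; the SEQUENCE content is made up) that encodes the same value once with indefinite and once with definite lengths:

    import org.bouncycastle.asn1.ASN1Encoding;
    import org.bouncycastle.asn1.BERSequence;
    import org.bouncycastle.asn1.DEROctetString;
    import org.bouncycastle.util.encoders.Hex;

    public class IndefiniteLengthDemo {
        public static void main(String[] args) throws Exception {
            // The same logical value: a SEQUENCE containing one OCTET STRING.
            BERSequence seq = new BERSequence(new DEROctetString(new byte[] { 1, 2, 3 }));

            // BER, indefinite length: starts with 3080 and ends with
            // the END OF CONTENTS marker 0000.
            System.out.println(Hex.toHexString(seq.getEncoded(ASN1Encoding.BER)));
            // expected: 308004030102030000

            // DER, definite length: the header itself carries the length (05).
            System.out.println(Hex.toHexString(seq.getEncoded(ASN1Encoding.DER)));
            // expected: 30050403010203
        }
    }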
The rest of the two structures is exactly identical, right down to the signature value itself. It is just that the Java version uses indefinite lengths, while the C# version uses definite lengths. If the relying party understands both BER and DER encodings, the two signatures are identical once decoded, and the encoding plays no role in signature verification. Here is what the CMS RFC says:
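One way to convince yourself of this is to parse both blobs and normalize them to DER before comparing. A sketch of mine, assuming csharpEncoding and javaEncoding hold the two byte arrays from above:

    import java.util.Arrays;

    import org.bouncycastle.asn1.ASN1Encoding;
    import org.bouncycastle.asn1.ASN1Primitive;

    public class CompareEncodings {
        /**
         * Returns true if the two encodings describe the same ASN.1 value,
         * regardless of whether definite (DER) or indefinite (BER) lengths
         * were used on the wire.
         */
        static boolean sameValue(byte[] csharpEncoding, byte[] javaEncoding) throws Exception {
            byte[] a = ASN1Primitive.fromByteArray(csharpEncoding).getEncoded(ASN1Encoding.DER);
            byte[] b = ASN1Primitive.fromByteArray(javaEncoding).getEncoded(ASN1Encoding.DER);
            return Arrays.equals(a, b);
        }
    }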
With signedAttrs present:
Specifically, the initial input is the encapContentInfo eContent OCTET STRING to which the signing process is applied. Only the octets comprising the value of the eContent OCTET STRING are input to the message digest algorithm, not the tag or the length octets.
Without signedAttrs:
When the signedAttrs field is absent, only the octets comprising the value of the SignedData encapContentInfo eContent OCTET STRING (e.g., the contents of a file) are input to the message digest calculation. This has the advantage that the length of the content being signed does not need to be known in advance of the signature generation process.
In other words: only the bytes that make up the actual eContent value are hashed, and nothing else. Neither its tag nor its length octets, nor the tags and lengths of its chunks (in the case of an indefinite-length encoding), go into the hash. Admittedly, there are implementations that get this wrong, and it clearly is a somewhat subtle problem.
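With Bouncy Castle's Java ASN.1 model this boils down to digesting getOctets() rather than getEncoded(). A minimal sketch (my own; SHA-256 is just an example digest, and eContent is assumed to have been extracted from the parsed SignedData structure):

    import java.security.MessageDigest;

    import org.bouncycastle.asn1.ASN1OctetString;

    public class EContentDigest {
        /** Digests exactly the value octets of eContent, nothing more. */
        static byte[] digestEContent(ASN1OctetString eContent) throws Exception {
            MessageDigest md = MessageDigest.getInstance("SHA-256");

            // Correct: hash only the value octets of the OCTET STRING.
            // For an indefinite-length (constructed) OCTET STRING, getOctets()
            // already concatenates the chunks without their tags and lengths.
            return md.digest(eContent.getOctets());

            // Wrong (a common bug): md.digest(eContent.getEncoded()) would also
            // hash the tag and length octets and break interoperability.
        }
    }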
Why use indefinite lengths in CMS SignedData?
Even though it adds quite a bit of complexity and interoperability trouble, there is one good reason for it (besides saving a few bytes): if you create "attached signatures" (those where the source document is embedded in the EncapsulatedContentInfo element), choosing indefinite lengths lets you create and verify the signature in a streaming fashion, reading or writing one chunk at a time. With definite lengths you have to read or write everything at once, because you need to know the length up front in order to produce the final tag-length-value layout that DER requires. Being able to do streaming I/O is very powerful in this context: imagine creating an attached signature over a log file that is several gigabytes in size; any non-streaming approach will quickly run out of memory.
The Java version of Bouncy Castle added streaming support to its CMS code some time ago, and chances are it will not take too long for the C# version to pick it up.
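For reference, a rough sketch of the Java streaming API (not authoritative; it assumes an RSA key, a registered BC provider, and the hypothetical variables bigInput, sigOutput, privateKey and cert), producing an attached, indefinite-length-encoded signature chunk by chunk:

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.security.PrivateKey;
    import java.security.Security;
    import java.security.cert.X509Certificate;
    import java.util.Collections;

    import org.bouncycastle.cert.jcajce.JcaCertStore;
    import org.bouncycastle.cms.CMSSignedDataStreamGenerator;
    import org.bouncycastle.cms.jcajce.JcaSignerInfoGeneratorBuilder;
    import org.bouncycastle.jce.provider.BouncyCastleProvider;
    import org.bouncycastle.operator.ContentSigner;
    import org.bouncycastle.operator.jcajce.JcaContentSignerBuilder;
    import org.bouncycastle.operator.jcajce.JcaDigestCalculatorProviderBuilder;

    public class StreamingAttachedSignature {

        /** Streams an attached CMS signature; the content never has to fit in memory. */
        static void sign(InputStream bigInput, OutputStream sigOutput,
                         PrivateKey privateKey, X509Certificate cert) throws Exception {
            Security.addProvider(new BouncyCastleProvider());

            ContentSigner signer = new JcaContentSignerBuilder("SHA256withRSA")
                    .setProvider("BC").build(privateKey);

            CMSSignedDataStreamGenerator gen = new CMSSignedDataStreamGenerator();
            gen.addSignerInfoGenerator(new JcaSignerInfoGeneratorBuilder(
                    new JcaDigestCalculatorProviderBuilder().setProvider("BC").build())
                    .build(signer, cert));
            gen.addCertificates(new JcaCertStore(Collections.singletonList(cert)));

            // "true" embeds the content (attached signature); the content is piped
            // through chunk by chunk, which is why indefinite lengths are used.
            try (OutputStream cmsOut = gen.open(sigOutput, true)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = bigInput.read(buf)) >= 0) {
                    cmsOut.write(buf, 0, n);
                }
            }
        }
    }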