Is there a dataset for fully checking a base64 encoder/decoder?

I see that there are many base64 implementations in open source, and I found several internal implementations in the product that I support.

I am trying to eliminate the duplicates, but I am not 100% sure that all these implementations produce identical results. So I need a data set that checks all the relevant input combinations.

Is such a data set available somewhere? A Google search did not really turn anything up.

I saw a similar question on Stack Overflow, but it doesn't fully answer this: it only asks for a single ASCII phrase that exercises all 64 characters. That doesn't cover padding with =, for example, so a single test string certainly can't count as 100% coverage.
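One way to build such a data set yourself is to enumerate short inputs exhaustively, since inputs of length 0 through 3 already exercise every padding case (no padding, "=", and "=="). A minimal sketch, assuming Python's stdlib base64 as the reference encoder (the sample byte values are an arbitrary choice to keep the set small; use range(256) per position for full coverage):

```python
import base64
import itertools

# Lengths 0-3 cover every padding case: lengths 0 and 3 produce no
# padding, length 1 produces "==", length 2 produces "=".
# A few representative byte values keep the set small.
SAMPLE_BYTES = [0x00, 0x3F, 0x7F, 0xAB, 0xFF]

def padding_test_vectors():
    """Yield (input, expected_base64) pairs for lengths 0 through 3."""
    for length in range(4):
        for combo in itertools.product(SAMPLE_BYTES, repeat=length):
            data = bytes(combo)
            yield data, base64.b64encode(data).decode("ascii")

vectors = list(padding_test_vectors())
print(len(vectors))  # 1 + 5 + 25 + 125 = 156 vectors
```

Each pair can then be fed to every in-house implementation and the outputs compared against the reference.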

1 answer

Perhaps something like Base64Test in Bouncy Castle will do what you want? The tricky part of base64 is handling the padding correctly. It is certainly important to cover that, as you mentioned. Accordingly, RFC 4648 defines these test vectors:

BASE64("") = ""
BASE64("f") = "Zg=="
BASE64("fo") = "Zm8="
BASE64("foo") = "Zm9v"
BASE64("foob") = "Zm9vYg=="
BASE64("fooba") = "Zm9vYmE="
BASE64("foobar") = "Zm9vYmFy"
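These vectors can be checked directly against any implementation in a few lines. A sketch using Python's stdlib base64 as the implementation under test:

```python
import base64

# RFC 4648 section 10 test vectors, as quoted above.
RFC4648_VECTORS = {
    b"": b"",
    b"f": b"Zg==",
    b"fo": b"Zm8=",
    b"foo": b"Zm9v",
    b"foob": b"Zm9vYg==",
    b"fooba": b"Zm9vYmE=",
    b"foobar": b"Zm9vYmFy",
}

for plain, encoded in RFC4648_VECTORS.items():
    # Both directions must round-trip for an implementation to pass.
    assert base64.b64encode(plain) == encoded
    assert base64.b64decode(encoded) == plain
print("all RFC 4648 vectors pass")
```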

Some of your implementations may produce base64 output that differs only in whether they insert line breaks, where they insert them, and which line ending they use. You will need to do additional testing to determine whether you can safely replace an implementation that uses one style with one that uses another. In particular, a decoder may make assumptions about line length or line termination.

