Find an algorithm to calculate a specific result from a specific input

This problem is more related to the mathematical side. I gave a list of 4-byte UIDs in hexadecimal and a list of corresponding double-byte codes - let them hash.

It looks like this:

7D04E214 --> 4A49 7D048DC3 --> A0E7 7D04DB2E --> 4191 ... 

I have something like 50 of these tuples, so I assume that if I find an algorithm that computes the correct hash for all UIDs, I can be sure that it is correct.

Here is my problem: I do not know how to start. I am not a mathematician and have no experience with such problems. I suspect some bitwise algorithm. It looks like it might be CRC16, but I already faked it. I do not think this is any popular algorithm. I also think (or rather hope) that the algorithm is not too complicated.

I know that the general problem of finding a function that calculates a specific output from a specific input is unsolvable. But assuming the algorithm is simple, what are my options? Are there any tools that could help me? Are there any comments you can suggest for writing my own tool? I think of some brute force, but how to do it systematically?

Thanks in advance!

Update:. Since there is some ambiguity in my question: I really need to find one algorithm that was used to create hashes from the UID in the first place - or at least one that behaves the same way for all possible UIDs (e.g. 4 byte numbers ) Since it was pointed out that there are an infinite number of possible functions, I think I need to find the simplest ones and test them against more UID values. As I said, I actually assume that the algorithm is simple and not filled with obscure keys. If I am mistaken, I am doomed, as you have noticed. But if not, maybe I have a chance with a trial error.

+4
source share
3 answers

As others commented / answered, you have an incorrect problem along with very little-known information about an unknown function (well, in the end, it is unknown :). Although you can try to guess a function using Genetic Programming, you cannot rely on the assurance that it actually represents an unknown function, not just 50 inputs -> .

But, as a dummy experiment, I played with Genetic Programming and found the following program for your three given examples:

 def guess(a, key=0xbeef): # The parameter 'a' is an input value. temp = (a % (-14)) << 3 if temp == 0: temp = -4 temp = ((a ^ (-2 * key)) - temp) >> 2 res = (temp + a + (a % (-15))) % key return res 

Which gives the following results:

 Input Output (guess) Actual output Diff 0x7d04e214 0x4a49 0x4a49 0 0x7d048dc3 0xa0e7 0xa0e7 0 0x7d04db2e 0x4191 0x4191 0 

Thus, the resulting program has a total error of 0 units for these inputs, so the function is correct for these examples, but it does not mean anything. To create a program that did not give errors for examples, it took several runs, thousands of generations, etc. Now, the immediate problem that needs to be noted here is that I assumed that the unknown function takes the key parameter along with the input - which may or may not be. Also, I just guessed that the key could be 0xbeef mainly because it is a good hex value. The consequence of these decisions is that the program will try to create a program to accommodate these options, which may be completely wrong with what the unknown function does. This means that you need to somehow make this unknown function more famous than it is now, in order to expect any relevant results.

+5
source

You should try to figure out exactly what you are trying to achieve.

If you only want to map something like 50 FIXED input values ​​to some 50 o FIXED output values, as it has already been suggested to create some mapping table from input to output values.

If, on the other hand, several 50 input values ​​and their corresponding 50 output values ​​are given and you want to be able to correctly CONDUCT the corresponding output value for ANY other input value, at least from a mathematical point of view, your problem is unsolvable, as indicated by ANY fixed the number of displayed values ​​of the output values ​​is still in the number of INFINITE, which display ALL input values ​​visible so far, to the same values ​​that have been received so far, and all slit calculated a different result for any value that has not been seen until now.

+1
source

This is an impossible quest if you cannot find out more information or compile a comparison of all the possible inputs and their outputs so that you can experiment comprehensively.

0
source

All Articles