People play a bit fast and loose with big-O notation. In the case of GCD, they usually do it in one of two ways:
1) You're right: algorithmic complexity, and hence the big-O bound, should be stated in terms of the size of the input in bits, not in terms of the input values. That's how P, NP, and so on are defined. Assuming binary input and arbitrarily large numbers (e.g. a bignum representation), with N the number of input bits, a subtraction-based GCD requires no more than 2^N subtractions, each of which takes O(N) time to march over the digits of the numbers being subtracted. So it's O(N * 2^N). Of course, GCD can be made much faster by using division rather than subtraction: O(N^2).
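To make the contrast concrete, here is a minimal sketch (my own illustration, not code from the original discussion; the function names gcd_subtraction and gcd_division are mine) of the two variants using Python's arbitrary-precision integers. For N-bit inputs the subtraction loop can run on the order of 2^N times, for example gcd(2^N - 1, 1), while the division loop runs only O(N) times:

```python
def gcd_subtraction(a: int, b: int) -> int:
    """GCD by repeated subtraction (a, b >= 1); up to ~2^N steps for N-bit inputs."""
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a
    return a

def gcd_division(a: int, b: int) -> int:
    """Euclid's algorithm with remainders; O(N) iterations for N-bit inputs."""
    while b:
        a, b = b, a % b
    return a

# Worst-ish case for the subtraction version: about a million subtractions here,
# while the division version finishes in a single remainder step.
print(gcd_subtraction(2**20 - 1, 1))  # 1, after 2**20 - 2 subtractions
print(gcd_division(2**20 - 1, 1))     # 1, after one remainder step
```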
So when we say that primality testing was shown to be in polynomial time in 2002, that is the technical definition of complexity: we mean polynomial in the number of digits (which is the hard part), not polynomial in the input number itself (which is trivially easy to achieve in "sub-linear time" using trial division).
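For contrast with that 2002 polynomial-time result, here is a minimal trial-division sketch (my own example, not part of the original answer). Its loop runs about sqrt(n) times, which is "sub-linear" in the value n but exponential in the number of digits: for a D-digit number, sqrt(n) is roughly 10^(D/2).

```python
def is_prime_trial(n: int) -> bool:
    """Trial division: ~sqrt(n) iterations, i.e. exponential in the digit count."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

print(is_prime_trial(104729))  # True (104729 is the 10000th prime)
```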
But in practice, for algorithms that take a fixed number of integer inputs, it's often more convenient to talk about complexity as if N were the input value itself rather than the size of the input. That does no harm as long as you're clear about which one you mean in cases that could be ambiguous.
2) In practice, integer inputs are often of fixed size, 32 bits or whatever, and operations on them such as addition, multiplication, and division take O(1) time. We use these facts selectively in our order analysis. Technically, if your GCD program accepts inputs up to 2^32 - 1, then it is O(1): it has a fixed upper bound on its running time. End of analysis.
While technically correct, that's not a very useful analysis. Almost everything you do on a real computer is O(1) by the same reasoning, since the problem size is bounded by the hardware.
It's usually more convenient to accept that addition is O(1) because the numbers are of fixed size, but to ignore the fact that GCD is also O(1), pretend that its behavior over the range [1, 2^32) continues to infinity, and analyze it on that basis. Then, with N the maximum value of the two inputs, we get O(N): O(N) subtractions, each taking constant time.
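A quick instrumented sketch (my own, purely illustrative) of that second reading: counting subtractions with N taken as the larger input value shows the worst case is roughly N steps, each treated as constant time for fixed-width integers.

```python
def gcd_subtraction_count(a, b):
    """Return (gcd, number of subtractions performed); assumes a, b >= 1."""
    steps = 0
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a
        steps += 1
    return a, steps

print(gcd_subtraction_count(1000, 1))  # (1, 999): roughly N subtractions in the worst case
```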
Again, none of this is ambiguous as long as you know what the terms of reference are. But beware of wrongly comparing the first analysis I gave of the Euclidean division algorithm, O(N^2), against this analysis of the subtraction algorithm, O(N). N is not the same thing in the two, and subtraction is not the faster method ;-)