Well, here we go... my apologies to anyone who was expecting a faster solution. It turns out my teacher was having a little fun with me, and I completely missed the point of what he was saying.
I should start by clarifying what I meant when I wrote:
he hinted that there is an even faster way to do this
The gist of our conversation was this: he said that my XOR approach was interesting, and we talked for a while about how I arrived at my solution. He asked me whether I thought my solution was optimal. I said that I did (for the reasons mentioned in my question). Then he asked me, "Are you sure?" with a look on his face I can only describe as "smug." I hesitated, but said yes. He asked me if I could think of a better way to do it. I was pretty much like, "You mean a faster way?" but instead of giving me a straight answer, he told me to think about it. I said I would.
So I thought about it, sure that my teacher knew something I didn't. And after coming up with nothing for a day, I came here.
What my teacher actually wanted me to do was defend my solution as optimal, not try to find a better one. As he put it: creating a nice algorithm is the easy part; the hard part is proving that it works (and that it is the best). He thought it was quite funny that I spent so much time in Find-A-Better-Way Land instead of working out a simple proof of O(n) that would have taken considerably less time (we ended up doing so, see below if you're interested).
So I guess I learned a big lesson here. I will accept Shashank Gupta's answer, because I think it really does answer the original question, even though the question was flawed.
I'll leave you guys with a neat little Python one-liner I found while typing up the proof. It's not any more efficient, but I like it:
from functools import reduce  # reduce is only a builtin on Python 2
def getUniqueElement(a, b): return reduce(lambda x, y: x^y, a + b)  # XOR cancels every pair
Very informal "proof"
Let's start with the original two arrays from the question, a and b:
int[] a = {6, 5, 6, 3, 4, 2};
int[] b = {5, 7, 6, 6, 2, 3, 4};
We'll say here that the shorter array has length n, so the longer array must have length n + 1. The first step in proving linear complexity is to append the two arrays together into a third, larger array (we'll call it c):
int[] c = {6, 5, 6, 3, 4, 2, 5, 7, 6, 6, 2, 3, 4};
which has length 2n + 1. Why do this? Well, now we have another problem entirely: finding the element that occurs an odd number of times in c (from here on, "odd number of times" and "unique" are taken to mean the same thing). This is actually a pretty popular interview question, and it is apparently where my teacher got the idea for his problem, so now my question has some practical significance. Hooray!
Suppose there is an algorithm faster than O(n), such as O(log n). That means it would only access some of the elements of c. For example, an O(log n) algorithm might only have to check log(13) ~ 4 of the elements in our example array to determine the unique element. Our question is: is this possible?
First, let's see if we can get away with removing any of the elements (by "removing" I mean not having to access them). How about if we remove 2 elements, so that our algorithm only checks a subarray of c with length 2n - 1? This is still linear complexity, but if we can do that, then maybe we can improve on it even further.
So, let's choose two elements of c completely at random to remove. There are actually several things that could happen here, which I'll summarize in the following cases:
Case 1: the two removed elements are equal to each other.
Case 2: one removed element is the unique element 7 and the other is a 5.
Case 3: the two removed elements are different, and neither of them is the unique element.
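To make the reduction concrete, here is a minimal Python sketch that builds c from the two example arrays and finds the element that occurs an odd number of times. The helper names odd_one_out and odd_count_values are mine, not from the question; the second one is only a cross-check using a counter.

```python
from collections import Counter
from functools import reduce
from operator import xor

a = [6, 5, 6, 3, 4, 2]
b = [5, 7, 6, 6, 2, 3, 4]
c = a + b  # combined array, length 2n + 1


def odd_one_out(values):
    """XOR everything together: equal pairs cancel, the odd-count value remains."""
    return reduce(xor, values, 0)


def odd_count_values(values):
    """Cross-check: every value that occurs an odd number of times, sorted."""
    return sorted(v for v, k in Counter(values).items() if k % 2 == 1)


print(len(c), odd_one_out(c), odd_count_values(c))  # prints: 13 7 [7]
```

Both helpers read every element of c once, which is the linear behaviour the rest of the argument tries to show cannot be beaten.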
What does our array look like now? In the first case, 7 is still the unique element. In the second case, there is a new unique element, 5. And in the third case, there are now 3 unique elements... yeah, it's a complete mess there.
Now our question becomes: can we determine the unique element of c just by looking at this subarray? In the first case, we see that 7 is the unique element of the subarray, but we can't be sure it is also the unique element of c; the two removed elements could just as well have been 7 and 1, in which case 1 would be the unique element of c. A similar argument applies to the second case. In case 3, with 3 unique elements, we have no way of telling which two are not unique in c.
It becomes clear that even with 2n - 1 accesses, there is just not enough information to solve the problem. And so the optimal solution is a linear one.
Of course, a real proof would use induction rather than proof by example, but I'll leave that to someone else :)
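The three outcomes can be reproduced with a short Python sketch. Which elements get removed in the first and third case is my own choice for illustration (two 6s, and a 3 plus a 4); the second case removes 7 and a 5 so that 5 becomes the new unique element, as described above. The helper names are hypothetical.

```python
from collections import Counter

c = [6, 5, 6, 3, 4, 2, 5, 7, 6, 6, 2, 3, 4]


def odd_count_values(values):
    """Every value that occurs an odd number of times, sorted."""
    return sorted(v for v, k in Counter(values).items() if k % 2 == 1)


def without(values, *removed):
    """Copy of values with one occurrence of each removed value dropped."""
    rest = list(values)
    for v in removed:
        rest.remove(v)
    return rest


case1 = without(c, 6, 6)  # two equal elements (illustrative choice)
case2 = without(c, 7, 5)  # the unique element plus one other
case3 = without(c, 3, 4)  # two different non-unique elements (illustrative choice)

for name, sub in [("case 1", case1), ("case 2", case2), ("case 3", case3)]:
    print(name, len(sub), odd_count_values(sub))
```

Each subarray has length 2n - 1 = 11, and the "unique" elements seen in the subarray are [7], [5] and [3, 4, 7] respectively.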
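The ambiguity can be checked directly. In this sketch (names are mine), seen is the subarray an algorithm would have read if it skipped two positions of c, here the first two 6s. Two different pairs of unread values are both consistent with seen, yet they lead to different answers, so an algorithm that only reads seen has to be wrong on one of them.

```python
from functools import reduce
from operator import xor

# c = {6, 5, 6, 3, 4, 2, 5, 7, 6, 6, 2, 3, 4} with its first two 6s left unread
seen = [5, 3, 4, 2, 5, 7, 6, 6, 2, 3, 4]


def odd_one_out(values):
    """XOR everything together: equal pairs cancel, the odd-count value remains."""
    return reduce(xor, values, 0)


# Two possible fillings for the unread positions, each giving a valid input
answers = {}
for hidden in [(6, 6), (7, 1)]:
    full = seen + list(hidden)
    answers[hidden] = odd_one_out(full)
    print(hidden, answers[hidden])
```

With (6, 6) hidden the answer is 7; with (7, 1) hidden the answer is 1, even though the 11 values that were read are identical.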