The technical term for a block in code coverage is the basic block. To quote directly from the Wikipedia entry:
The code in a basic block has one entry point, meaning no code within it is the destination of a jump instruction anywhere in the program, and it has one exit point, meaning only the last instruction can cause the program to begin executing code in a different basic block. Under these circumstances, whenever the first instruction in a basic block is executed, the rest of the instructions are necessarily executed exactly once, in order.
Basic blocks are important for code coverage because we can insert a probe at the beginning of each basic block. When this probe is hit, we know that all of the following instructions in that basic block will be executed (due to the properties of a basic block).
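To make the probe idea concrete, here is a minimal source-level sketch. Real coverage tools insert probes into the compiled machine code rather than the source, and the names `g_hits`, `probe`, `isPositive`, and `blockCoverage` are hypothetical, chosen only for this illustration.

```cpp
#include <array>
#include <cstddef>

// Hypothetical probe table: one hit counter per basic block.
constexpr std::size_t kBlockCount = 3;
std::array<unsigned, kBlockCount> g_hits{};

// The "probe": records that the block with the given index was entered.
inline void probe(std::size_t block) { ++g_hits[block]; }

// A small function instrumented by hand, one probe per source-level block.
int isPositive(int input) {
    probe(0);            // block 0: function entry and the test
    if (input > 0) {
        probe(1);        // block 1: true branch
        return 1;
    } else {
        probe(2);        // block 2: false branch
        return 0;
    }
}

// Fraction of blocks whose probe was hit at least once.
double blockCoverage() {
    std::size_t covered = 0;
    for (unsigned hits : g_hits) {
        if (hits != 0) ++covered;
    }
    return static_cast<double>(covered) / kBlockCount;
}
```

Calling `isPositive(1)` alone hits blocks 0 and 1, so two of the three blocks are covered; adding a call with a non-positive argument covers the remaining block.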
Unfortunately, with compilers (and especially with optimizations enabled), it is not always obvious how the source code maps to basic blocks. The easiest way to find out is to look at the generated assembly. For example, let's look at your original main and testfunction.
For main, I see the assembly below (interleaved with the source code). Just like Peter did here, I have marked where the basic blocks begin.
    int main()
    {
    013B2D20  push        ebp                         <--- Block 0 (initial)
    013B2D21  mov         ebp,esp
    013B2D23  sub         esp,40h
    013B2D26  push        ebx
    013B2D27  push        esi
    013B2D28  push        edi
        testfunction(-1);
    013B2D29  push        0FFFFFFFFh
    013B2D2B  call        testfunction (013B10CDh)
    013B2D30  add         esp,4                       <--- Block 1 (due to call)
        testfunction(1);
    013B2D33  push        1
    013B2D35  call        testfunction (013B10CDh)
    013B2D3A  add         esp,4                       <--- Block 2 (due to call)
    }
    013B2D3D  xor         eax,eax
    013B2D3F  pop         edi
    013B2D40  pop         esi
    013B2D41  pop         ebx
    013B2D42  mov         esp,ebp
    013B2D44  pop         ebp
    013B2D45  ret
We see that main has three basic blocks: one initial block and two others due to the function calls. Looking at the code, this seems reasonable. testfunction is a little trickier. Just looking at the source, there appear to be three blocks:
- Function entry and the logical test (input > 0)
- True branch of the condition (return 1)
- False branch of the condition (return 0)
However, in the actual generated assembly there are four blocks. I assume that you built your code with optimizations disabled. When I build with VS2010 in the Debug configuration (optimizations disabled), I see the following disassembly for testfunction:
    int testfunction(int input)
    {
    013B2CF0  push        ebp                         <--- Block 0 (initial)
    013B2CF1  mov         ebp,esp
    013B2CF3  sub         esp,40h
    013B2CF6  push        ebx
    013B2CF7  push        esi
    013B2CF8  push        edi
        if (input > 0) {
    013B2CF9  cmp         dword ptr [input],0
    013B2CFD  jle         testfunction+18h (013B2D08h)
            return 1;
    013B2CFF  mov         eax,1                       <--- Block 1 (due to jle branch)
    013B2D04  jmp         testfunction+1Ah (013B2D0Ah)
        } else {
    013B2D06  jmp         testfunction+1Ah (013B2D0Ah)  <--- Not a block (unreachable code)
            return 0;
    013B2D08  xor         eax,eax                     <--- Block 2 (due to jmp branch @ 013B2D04)
        }
    }
    013B2D0A  pop         edi                         <--- Block 3 (due to being jump target from 013B2D04)
    013B2D0B  pop         esi
    013B2D0C  pop         ebx
    013B2D0D  mov         esp,ebp
    013B2D0F  pop         ebp
    013B2D10  ret
Here we have four blocks:
- Function entry
- True branch of the condition
- False branch of the condition
- Shared function epilog (stack cleanup and return)
If the compiler had duplicated the function epilog in both the true and false branches of the condition, you would see only three blocks. It is also interesting that the compiler inserted a spurious jmp instruction at 013B2D06. Since this is unreachable code, it is not counted as a basic block.
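The block counts read off the two listings can also be reproduced mechanically. The sketch below is not how any particular coverage tool works. It applies the textbook "leader" rule (a block starts at the entry point, at every jump target, and after every jump, call, or ret) and then discards blocks that cannot be reached from the entry. The types `Kind` and `Insn` and the functions `reachableBlocks`, `testfunctionListing`, and `mainListing` are hypothetical. It assumes every jump target lies inside the listing and, to match the counts above, that a call ends a block.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

enum class Kind { Plain, Call, CondJump, Jump, Ret };

struct Insn {
    std::uint32_t addr;
    Kind kind;
    std::uint32_t target;  // meaningful only for CondJump and Jump
};

// Returns the start addresses of the basic blocks reachable from the entry.
std::set<std::uint32_t> reachableBlocks(const std::vector<Insn>& code) {
    std::set<std::uint32_t> leaders, reached;
    if (code.empty()) return reached;
    std::map<std::uint32_t, std::size_t> index;
    leaders.insert(code.front().addr);  // the entry point starts a block
    for (std::size_t i = 0; i < code.size(); ++i) {
        index[code[i].addr] = i;
        const Insn& in = code[i];
        if (in.kind == Kind::Plain) continue;
        if (in.kind == Kind::CondJump || in.kind == Kind::Jump)
            leaders.insert(in.target);  // a jump target starts a block
        if (i + 1 < code.size())
            leaders.insert(code[i + 1].addr);  // so does whatever follows
    }
    std::vector<std::uint32_t> work{code.front().addr};
    while (!work.empty()) {
        std::uint32_t start = work.back();
        work.pop_back();
        if (!reached.insert(start).second) continue;  // already visited
        std::size_t i = index.at(start);
        // Advance to the last instruction of this block.
        while (i + 1 < code.size() && code[i].kind == Kind::Plain &&
               leaders.count(code[i + 1].addr) == 0)
            ++i;
        const Insn& last = code[i];
        if (last.kind == Kind::CondJump || last.kind == Kind::Jump)
            work.push_back(last.target);
        // Everything except an unconditional jmp or a ret falls through.
        if (last.kind != Kind::Jump && last.kind != Kind::Ret &&
            i + 1 < code.size())
            work.push_back(code[i + 1].addr);
    }
    return reached;
}

// Instruction kinds transcribed from the testfunction disassembly.
std::vector<Insn> testfunctionListing() {
    const Kind P = Kind::Plain;
    return {
        {0x013B2CF0, P, 0}, {0x013B2CF1, P, 0}, {0x013B2CF3, P, 0},
        {0x013B2CF6, P, 0}, {0x013B2CF7, P, 0}, {0x013B2CF8, P, 0},
        {0x013B2CF9, P, 0},
        {0x013B2CFD, Kind::CondJump, 0x013B2D08},  // jle
        {0x013B2CFF, P, 0},
        {0x013B2D04, Kind::Jump, 0x013B2D0A},      // jmp
        {0x013B2D06, Kind::Jump, 0x013B2D0A},      // unreachable jmp
        {0x013B2D08, P, 0},
        {0x013B2D0A, P, 0}, {0x013B2D0B, P, 0}, {0x013B2D0C, P, 0},
        {0x013B2D0D, P, 0}, {0x013B2D0F, P, 0},
        {0x013B2D10, Kind::Ret, 0},
    };
}

// Instruction kinds transcribed from the main disassembly.
std::vector<Insn> mainListing() {
    const Kind P = Kind::Plain;
    return {
        {0x013B2D20, P, 0}, {0x013B2D21, P, 0}, {0x013B2D23, P, 0},
        {0x013B2D26, P, 0}, {0x013B2D27, P, 0}, {0x013B2D28, P, 0},
        {0x013B2D29, P, 0}, {0x013B2D2B, Kind::Call, 0},
        {0x013B2D30, P, 0}, {0x013B2D33, P, 0},
        {0x013B2D35, Kind::Call, 0},
        {0x013B2D3A, P, 0}, {0x013B2D3D, P, 0}, {0x013B2D3F, P, 0},
        {0x013B2D40, P, 0}, {0x013B2D41, P, 0}, {0x013B2D42, P, 0},
        {0x013B2D44, P, 0}, {0x013B2D45, Kind::Ret, 0},
    };
}
```

With these listings, `reachableBlocks(testfunctionListing())` contains the four block starts marked above, and the jmp at 013B2D06 is excluded because nothing jumps or falls through to it. `reachableBlocks(mainListing())` contains the three block starts in main.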
In general, all this analysis is overkill, since the overall code coverage metric will tell you what you need to know. This answer was simply meant to highlight why the number of blocks is not always obvious or what you would expect.