I think the line of reasoning works as follows: the ZeroR classifier simply assigns every instance to the most common class (found by studying the training data). This means that if your data is 55% class A, 10% class B, 5% class C, etc., then ZeroR will get 55% of the predictions right. If your data is 33% class A, 31% class B, 28% class C, etc., then ZeroR will get only 33% right.
Apart from picking a class at random, this is pretty much the dumbest classifier you can get, so you can judge other classifiers by how much better they do than this minimum level of performance. Given a specific dataset, you can use ZeroR to find out what minimal performance you should expect.
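To make the baseline idea concrete, here is a minimal sketch of ZeroR in plain Python. The function names (`zero_r_fit`, `zero_r_accuracy`) and the label list are made up for illustration, and the classes D and E stand in for the "etc." in the first example; this is not how any particular toolkit implements it.

```python
from collections import Counter


def zero_r_fit(train_labels):
    """Return the most common class label in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


def zero_r_accuracy(train_labels, test_labels):
    """Fraction of test labels that equal the training majority class."""
    majority = zero_r_fit(train_labels)
    hits = sum(1 for label in test_labels if label == majority)
    return hits / len(test_labels)


# Hypothetical data: 55% A, 10% B, 5% C, and the rest split over D and E.
labels = ["A"] * 55 + ["B"] * 10 + ["C"] * 5 + ["D"] * 15 + ["E"] * 15

majority_class = zero_r_fit(labels)
baseline = zero_r_accuracy(labels, labels)
print(majority_class, baseline)
```

Whatever accuracy another classifier reaches on the same data, it only counts as useful if it clearly beats `baseline`.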
Michael Clerx