How much null checking is enough?

What are some guidelines for when it is not necessary to check for null?

Many of the legacy codebases I have worked on lately check for null ad nauseam: null checks in trivial functions, null checks on API calls that are documented to return non-null, and so on. In some cases the null checks are reasonable, but in many places null is not a reasonable expectation.

I have heard several arguments, ranging from "You can't trust other code" to "ALWAYS program defensively" to "Until the language guarantees me a non-null value, I will always check." I agree with many of these principles up to a point, but I have found that excessive null checking causes other problems that usually violate those same principles. Is vigilant null checking really worth it?

Often I have observed that code with redundant null checking is actually of lower quality, not higher. Much of the code seems so focused on null checks that the developer has lost sight of other important qualities, such as readability, correctness, or exception handling. In particular, I see a lot of code that ignores the std::bad_alloc exception but does a null check on the result of new.

In C++ I understand this to some extent, given the unpredictable behavior of dereferencing a null pointer; null dereferences are handled more gracefully in Java, C#, Python, etc. Have I just seen bad examples of vigilant null checking, or is there really something to it?

This question is intended to be language agnostic, although I am mainly interested in C++, Java, and C#.




Some examples of null checking that I have seen that seem excessive include the following:




This example seems to cater to non-compliant compilers, since the C++ specification says that a failed new throws an exception. Unless you are explicitly supporting non-compliant compilers, does it make sense? Does it make any sense in a managed language like Java or C# (or even C++/CLI)?

    try {
        MyObject* obj = new MyObject();
        if (obj != NULL) {
            // do something
        } else {
            // ??? most code I see here logs it and moves on,
            // or repeats what is in the exception handler
        }
    } catch (const std::bad_alloc&) {
        // Do something? Normally this code is wrong, as it allocates
        // more memory and will likely fail -- e.g. writing to a log file.
    }



Another example is working with internal code. Particularly on a small team that can define its own development practices, this seems unnecessary. On some projects, or in legacy code, trusting the documentation may not be practical... but is it really necessary for new code that you or your team control?

If a method that you can see and can update (or yell at the developer who is responsible for it) has a contract, do you still need to check for nulls?

    // x is non-negative.
    // Returns an object or throws an exception.
    MyObject* create(int x) {
        if (x < 0) throw std::invalid_argument("x must be non-negative");
        return new MyObject();
    }

    try {
        MyObject* x = create(unknownVar);
        if (x != NULL) {
            // is this null check really necessary?
        }
    } catch (...) {
        // do something
    }



When developing a private or otherwise internal function, is it really necessary to explicitly handle null when the contract calls for non-null values only? Why is a null check preferable to an assertion?

(Obviously in your public API null checks are vital, since it is considered impolite to yell at your users for misusing the API.)

    // Internal use only -- non-public, not part of public API.
    // input must be non-null.
    // Returns a non-negative value, or -1 on failure.
    int ParseType(String input) {
        if (input == null) return -1;
        // do something magic
        return value;
    }

Compared with:

    // Internal use only -- non-public, not part of public API.
    // input must be non-null.
    // Returns a non-negative value.
    int ParseType(String input) {
        assert input != null : "Input must be non-null.";
        // do something magic
        return value;
    }
+62
java c++ c# design-by-contract
Nov 19 '08 at 17:39
18 answers

First of all, note that this is a special case of contract checking: code that does nothing but verify at runtime that a documented contract is being honored. A failure means that some code somewhere is defective.

I am always a little dubious about implementing special cases of a more generally useful concept. Contract checking is useful because it catches programming errors the first time they cross an API boundary. What is so special about nulls that makes them the only part of the contract you bother to check? Still:

Regarding input validation:

null is special in Java: many Java APIs are written such that null is the only invalid value that it is even possible to pass to a given method call. In such cases a null check "fully validates" the input, so the full argument in favor of contract checking applies.

In C++, on the other hand, NULL is just one of almost 2^32 (2^64 on newer architectures) invalid values a pointer parameter could have, since almost all addresses do not point to objects of the correct type. You cannot "fully validate" your input unless you have a list of all objects of that type.

The question then becomes: is NULL a common enough invalid input to deserve special treatment that (foo*)(-1) does not get?

Unlike Java, fields are not automatically initialized to NULL, so an uninitialized garbage value is just as plausible as NULL. But sometimes C++ objects do have pointer members that are explicitly NULL, meaning "I don't have one yet". If your callers do that, then there is a significant class of programming errors that can be diagnosed by a NULL check. An exception may be easier for them to debug than a page fault inside a library for which they have no source. So if you don't mind the code bloat, it might be useful. But it is your callers you should be thinking about, not yourself; this is not defensive coding, because it only "defends" against NULL, not against (foo*)(-1).
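As an illustration of the "explicitly NULL until set" member described above, here is a minimal sketch (the Config and describe names are invented for this example): the NULL check turns a caller's wiring mistake into an exception that names the problem, rather than a page fault inside the library.

```cpp
#include <stdexcept>
#include <string>

// Caller-side struct whose pointer member is explicitly NULL
// until it is wired up ("I don't have one yet").
struct Config {
    const std::string* name = nullptr;  // not set yet
};

// Library-side function: the NULL check here diagnoses the
// caller's mistake with a traceable exception.
std::string describe(const Config& cfg) {
    if (cfg.name == nullptr)
        throw std::invalid_argument("Config::name is NULL: caller never set it");
    return "config: " + *cfg.name;
}
```

Note that, exactly as the answer says, this only defends against NULL: a Config whose name points at garbage sails straight through the check.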

If NULL is not a valid input, you could consider taking the parameter by reference rather than pointer, although many coding styles disapprove of non-const reference parameters. And if the caller passes you *fooptr, where fooptr is NULL, it has done nobody any good. What you are trying to do is squeeze a little extra documentation into the function signature, in the hope that your caller is more likely to think "hmm, might foo be null here?" at the point where they have to explicitly dereference it than if they just passed the pointer straight through to you. It only goes so far, but as far as it goes, it might help.
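A small sketch of that signature trade-off (the function names are invented, not from the question's code): the pointer version carries a defensive NULL check, while the reference version moves "no null here" into the signature and onto the caller.

```cpp
#include <cstddef>
#include <string>

// Pointer parameter: null is representable, so a defensive
// check (or a documented precondition) is needed.
std::size_t lengthOrZero(const std::string* s) {
    return s ? s->size() : 0;   // runtime NULL check
}

// Reference parameter: "never null" is documented in the
// signature itself; the caller must dereference explicitly.
std::size_t length(const std::string& s) {
    return s.size();            // no check required
}
```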

I don't know C#, but I understand that, like Java, it guarantees that references have valid values (at least in safe code), but that unlike Java, not all types are nullable. So I would guess that null checks there are rarely worth it: if you are in safe code, don't use a nullable type unless null is a valid input, and if you are in unsafe code, the same considerations apply as in C++.

Regarding checking of return values:

A similar issue arises: in Java you can "fully validate" the return by knowing its type and that the value is non-null. In C++ you cannot "fully validate" the return with a NULL check; for all you know, the function returned a pointer to an object on its own stack, which has just been unwound. But if NULL is a common invalid return, because of the constructs commonly used by the author of the calling code, then checking for it will help.

In all cases:

Use assertions rather than "real code" to check contracts wherever possible. Once your app is working, you probably don't want the code bloat of every callee checking all its inputs and every caller checking all its return values.
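A minimal sketch of a contract check written as an assertion (the function name is invented): it is active in debug builds and compiled out entirely when NDEBUG is defined, so release builds carry none of the bloat.

```cpp
#include <cassert>
#include <cstddef>

// Precondition checked by assertion: values must be non-null.
// In release builds (-DNDEBUG) the assert vanishes completely.
int sumFirst(const int* values, std::size_t n) {
    assert(values != nullptr && "contract: values must be non-null");
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += values[i];
    return total;
}
```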

For writing code that is portable to non-standard C++ implementations, instead of the code in the question that checks for null and also catches the exception, I would probably have a function like this:

    template <typename T>
    static inline void nullcheck(T* ptr) {
    #if PLATFORM_TRAITS_NEW_RETURNS_NULL
        if (ptr == NULL) throw std::bad_alloc();
    #endif
    }

Then, as one of the list of things you do when porting to a new system, you define PLATFORM_TRAITS_NEW_RETURNS_NULL (and maybe some other PLATFORM_TRAITS) correctly. Obviously you can write a header that does this for all the compilers you know about. If someone takes your code and compiles it on a non-standard C++ implementation you know nothing about, they are fundamentally on their own for bigger reasons than this, so they will have to do it themselves.
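Assuming such a header defines the trait, a call site might look like the following sketch (MyObject and makeObject are invented; here the trait is hard-coded to 0, i.e. a conforming compiler, so the check compiles away to nothing):

```cpp
#include <new>

// Assumption for this sketch: we are on a conforming compiler,
// so the trait is 0 and nullcheck() is an empty function.
#define PLATFORM_TRAITS_NEW_RETURNS_NULL 0

template <typename T>
static inline void nullcheck(T* ptr) {
#if PLATFORM_TRAITS_NEW_RETURNS_NULL
    if (ptr == NULL) throw std::bad_alloc();
#endif
    (void)ptr;  // silence unused-parameter warnings when compiled out
}

struct MyObject { int value = 42; };

// Call sites stay identical whichever way the platform behaves:
MyObject* makeObject() {
    MyObject* obj = new MyObject();
    nullcheck(obj);  // no-op here; throws bad_alloc on NULL-returning platforms
    return obj;
}
```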

+17
Nov 21 '08 at 12:27

One thing to remember is that the code you write today, while it may belong to a small team with good documentation, will turn into legacy code that someone else has to maintain. I use the following rules:

  • If I am writing a public API that will be exposed to others, I do null checks on all reference parameters.

  • If I am writing an internal component of my application, I write null checks when I need to do something special when a null exists, or when I want to make the failure very clear. Otherwise, I don't mind getting the null reference exception, since it is also fairly clear what is happening.

  • When working with return data from other people's frameworks, I check for null only when a null return is possible and valid. If their contract says it does not return nulls, I won't do the check.

+34
Nov 19 '08 at 17:53

If you write code and its contract, you are responsible for using it according to that contract and for ensuring the contract is correct. If you say "returns a non-null x", then the caller should not check for null. If a null pointer exception then occurs with that reference, it is your contract that is at fault.

Null checking should be reserved for libraries that are untrusted or that do not have a proper contract. If it is your own team's code, emphasize that contracts must not be broken, and track down the person who misuses a contract when errors do occur.

+10
Nov 19 '08 at 17:48

This partly depends on how the code is used: whether it is a method available only within a project, or a public API. Error checking in an API demands something stronger than an assertion.

So while this is fine within a project, where it is backed by unit tests and so on:

    internal void DoThis(Something thing)
    {
        Debug.Assert(thing != null, "Arg [thing] cannot be null.");
        //...
    }

in a method where you have no control over who calls it, something like this might be better:

    public void DoThis(Something thing)
    {
        if (thing == null)
        {
            throw new ArgumentException("Arg [thing] cannot be null.");
        }
        //...
    }
+7
Nov 19 '08 at 17:56

It depends on the situation. The rest of my answer assumes C++.

  • I never check the return value of new, since all the implementations I use throw bad_alloc on failure. If I see a legacy test for new returning null in any code I am working on, I cut it out and don't bother replacing it with anything.
  • Unless coding standards forbid it, I document preconditions with assertions. Code that breaks a published contract should break immediately and dramatically.
  • If a null arises from a runtime failure that is not caused by broken code, I throw. fopen failures and malloc failures (though I rarely, if ever, use them in C++) fall into this category.
  • I do not try to recover from allocation failure; bad_alloc gets caught in main().
  • If I find myself testing for null on an object that is a collaborator of my class, I rewrite the code to take it by reference.
  • If a collaborator genuinely might not exist, I use the Null Object design pattern to create a placeholder for that case.
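A minimal sketch of those last two points combined (the Logger and Service names are invented): the collaborator is taken by reference so it can never be null, and the "absent" collaborator is a real Null Object that harmlessly does nothing, so no call site ever checks.

```cpp
#include <string>

struct Logger {
    virtual ~Logger() = default;
    virtual void log(const std::string& msg) = 0;
};

// Null Object: a placeholder that deliberately does nothing.
struct NullLogger : Logger {
    void log(const std::string&) override {}
};

// A real implementation, used here to observe the calls.
struct RecordingLogger : Logger {
    std::string last;
    void log(const std::string& msg) override { last = msg; }
};

// Collaborator taken by reference: it cannot be null, so the
// class body contains no null checks at all.
struct Service {
    explicit Service(Logger& log) : log_(log) {}
    int work(int x) {
        log_.log("working");
        return x * 2;
    }
    Logger& log_;
};
```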
+7
Nov 19 '08 at 20:10

Pervasive NULL checking is generally evil, because it adds a small tax on the testability of the code. With NULL checks everywhere you cannot use the "pass null" technique, and that will bite you in unit testing. It is better to have a unit test for the method than a null check.

Check out a decent presentation on this issue, and on unit testing in general, by Misko Hevery at http://www.youtube.com/watch?v=wEhu57pih5w&feature=channel

+4
Nov 19 '08 at 18:31

Old versions of Microsoft C++ (and probably others) did not throw an exception on a failed allocation with new, but returned NULL instead. Code that had to work in both standard and older versions would have the redundancy shown in the first example.

It would be cleaner to make all failed allocations follow the same path:

    if (obj == NULL) throw std::bad_alloc();
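On a conforming compiler, one way to exercise that same unified path is nothrow new, which returns NULL on failure by design; this sketch (invented names, with the failure simulated by a flag rather than by real memory exhaustion) converts the NULL back into the single bad_alloc path:

```cpp
#include <new>

struct MyObject { int x = 1; };

// All failures funnel into bad_alloc, whether the allocation
// mechanism throws or returns NULL.
MyObject* makeChecked(bool simulateFailure) {
    MyObject* obj = simulateFailure
        ? nullptr                         // stand-in for a failed allocation
        : new (std::nothrow) MyObject();  // returns NULL on failure by design
    if (obj == nullptr) throw std::bad_alloc();
    return obj;
}
```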
+3
Nov 19 '08 at 20:23

It is well known that there are process-oriented people (focused on doing things the right way) and results-oriented people (focused on getting the right answer). Most of us sit somewhere in between. It sounds like you have found an outlier on the process-oriented end. Such people would say "anything can happen if you don't understand things fully, so prepare for everything." For them, what you see is done right; and if you change it, they will worry, because the ducks are no longer all lined up.

When I work on someone else's code, I try to make sure I know two things:
1. What the programmer intended.
2. Why they wrote the code the way they did.

For keeping up with type A programmers, perhaps this helps.

So "how much is enough" ends up being a social question, not a technical one: there is no agreed way to measure it.

(It gets to me too.)

+2
Nov 19 '08 at 17:50

When you can specify which compiler is being used, null checks on system functions such as new are a bug in the code: they mean you duplicate the error handling, and duplicated code is a frequent source of errors, because one copy gets changed and the other does not. If you cannot specify the compiler or compiler version, you should be more defensive.

As for internal functions, you should specify the contract and make sure the contract is enforced with unit tests. We had a problem in our code a while back where we would either throw an exception or return null when an object was missing from our database. This just confused our callers, so we went through the code base, made it consistent throughout, and deleted the duplicated checks.

The important thing (IMHO) is not to have duplicated error logic where one branch can never be taken. If code can never be called, you cannot test it, and you will never know whether or not it is broken.

+2
Nov 19 '08 at 17:55

Personally, I believe null testing is unnecessary in most cases. If new or malloc fails, you have bigger problems, and the chance of recovering is practically zero unless you are writing a memory checker! Null testing also tends to hide errors during development, since the "null" branches are often just empty and do nothing.

+2
Nov 19 '08 at 18:37

I would say it depends a little on your language, but I use ReSharper with C#, and it basically goes out of its way to tell me "this reference may be null", in which case I add a check; when it tells me "this will always be true" for "if (null != oMyThing && ....)", I listen to it and skip the null check.

+2
Nov 19 '08 at 21:30

Whether to check for null or not depends very much on the circumstances.

For example, in our store, we check the parameters for the methods that we create for null inside the method. The simple reason is that, as a source programmer, I have a good idea of ​​what this method should do. I understand the context, even if the documentation and requirements are incomplete or less satisfactory. A later programmer entrusted with maintenance may not understand the context and may erroneously assume that passing null is harmless. If I know that null will be harmful, and I can anticipate that someone might pass null, I have to take a simple step to make sure that the method reacts gracefully.

    public MyObject MyMethod(object foo)
    {
        if (foo == null)
        {
            throw new ArgumentNullException("foo");
        }
        // do whatever if foo was non-null
    }
+2
Nov 20 '08 at 15:13

I check for NULL only when I know what to do upon seeing NULL. "Knowing what to do" here means "knowing how to avoid a crash" or "knowing what to tell the user besides the location of the crash". For example, if malloc() returns NULL, I usually have no option but to abort the program. On the other hand, if fopen() returns NULL, I can tell the user the name of the file that could not be opened, and maybe errno. And if find() returns end(), I usually know how to continue without failing.
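A sketch of that distinction (openOrExplain is an invented name): fopen()'s NULL is actionable, since we can report the file name and errno to the user, so it is worth checking.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// fopen()'s NULL return is worth checking because we know what
// to do with it: tell the user which file failed and why.
std::string openOrExplain(const char* path) {
    std::FILE* f = std::fopen(path, "r");
    if (f == nullptr)
        return std::string("cannot open ") + path + ": " + std::strerror(errno);
    std::fclose(f);
    return "ok";
}
```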

+2
Nov 21 '08 at 19:54

Lower-level code should check how it is used by higher-level code. Usually that means checking arguments, but it can also mean checking return values from upcalls. Arguments to upcalls need not be checked.

The goal is to catch bugs immediately, and also to document the contract in code, which does not lie (unlike comments).

+1
Nov 19 '08 at 18:05

I don't think it is bad code. A fair number of Windows/Linux API calls return NULL on failure, so of course I check for failure in whatever way the API indicates. I usually funnel control flow to an error-handling module of some kind rather than duplicating the error handling code.

+1
Nov 19 '08 at 21:24

If I receive a pointer that the language does not guarantee to be non-null, and I am going to dereference it in a way where a null would crash me, or pass it on from my function after saying I would not produce NULL, then I check for NULL.

It is not only about NULL: a function should check its preconditions and postconditions where possible.

It does not matter that the contract of the function that handed me the pointer says it will never produce null. We all make mistakes.

+1
Nov 20 '08 at 14:40




