Function format for a C program

I am writing some functions that manipulate strings in C and return excerpts from a string.

What are your thoughts on good styles for returning values from functions?

Referring to Steve McConnell's Code Complete (section 5.8 in the 1993 edition), he suggests using the following format:

void my_function ( char *p_in_string, char *p_out_string, int *status ) 

The alternatives that I am considering are as follows:

Returning the result of the function (option 2) using:

 char* my_function ( char *p_in_string, int *status ) 

Returning the status of the function (option 3) using:

 int my_function ( char *p_in_string, char *p_out_string ) 

In option 2 above, I would return the address of a local variable from my_function, but my calling function would use the value immediately, so I consider this to be OK and assume the memory location has not been reused (correct me if I am wrong).

Does it depend on personal style and preference, or should I consider other issues?

+6
c function
10 answers

Option 3 is pretty much the unspoken (?) industry standard. If an I/O-based C function that returns an integer returns a nonzero value, this almost always means that the I/O operation failed. You might want to refer to the Wikibooks section on return values in C/C++.

The reason people use 0 for success is that there is only one way to succeed. If the function returns a nonzero value, you then look up what that value means in terms of errors. Perhaps 1 means that memory could not be allocated, 2 means that an argument was invalid, and 3 means that some I/O error occurred. In practice you typically would not return a bare 1; you would return XXX_ERR_COULD_NOT_MALLOC or something similar.

Also, never return the address of a local variable. Unless you malloc'ed the memory yourself, there are no guarantees about that variable's address once the function has returned. Read the link for more information.
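A minimal sketch of that convention follows. The `my_status` enum and the `copy_upper` function are hypothetical names chosen for illustration, not part of the question; the only point is that zero is the single success value and each nonzero value names one kind of failure.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes: 0 is the only success value. */
enum my_status {
    MY_OK = 0,
    MY_ERR_INVALID_ARG = 1,      /* a required pointer was NULL */
    MY_ERR_BUFFER_TOO_SMALL = 2  /* output would not fit */
};

/* Copies p_in_string into p_out_string in upper case.
   out_size is the capacity of p_out_string in bytes, terminator included. */
int copy_upper(const char *p_in_string, char *p_out_string, size_t out_size)
{
    size_t len, i;

    if (p_in_string == NULL || p_out_string == NULL)
        return MY_ERR_INVALID_ARG;

    len = strlen(p_in_string);
    if (len + 1 > out_size)
        return MY_ERR_BUFFER_TOO_SMALL;

    for (i = 0; i < len; i++)
        p_out_string[i] = (char)toupper((unsigned char)p_in_string[i]);
    p_out_string[len] = '\0';
    return MY_OK;
}

/* Usage: compare the result against MY_OK (or test for nonzero) to detect errors. */
void status_usage(void)
{
    char buf[8];

    assert(copy_upper("abc", buf, sizeof buf) == MY_OK);
    assert(strcmp(buf, "ABC") == 0);
    assert(copy_upper("much too long", buf, sizeof buf) == MY_ERR_BUFFER_TOO_SMALL);
}
```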

+6

In option 2 above, I would return the address of a local variable from my_function, but my calling function would use the value immediately, so I consider this to be OK and assume the memory location has not been reused (correct me if I am wrong).

Sorry, but you are wrong. Go with Steve McConnell's method or the last method (by the way, in the first method, "int status" should be "int *status").

You could be forgiven for thinking you are right, and it may well work the first 99,999 times you run the program, but the 100,000th time is the kicker. In a multi-threaded or even multi-processor architecture, you cannot rely on nothing else taking that piece of memory and using it before you get to it.

Better to be safe than sorry.

+2

The second option is problematic because you have to obtain memory for the result string. You either use a static buffer (which can cause several problems of its own) or allocate memory, which in turn can easily lead to memory leaks, because the calling function is then responsible for freeing it after use, and that is easily forgotten.

There is also option 4,

 char* my_function ( char *p_in_string, char* p_out_string ) 

which just returns p_out_string for convenience.
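As an illustration of why returning the output buffer is convenient, here is a small sketch. The `remove_spaces` function is a hypothetical stand-in for my_function, and it assumes the caller supplies a buffer of at least strlen(p_in_string) + 1 bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option-4 function: copies p_in_string into p_out_string
   without its space characters and returns p_out_string.
   Assumes the output buffer holds at least strlen(p_in_string) + 1 bytes. */
char *remove_spaces(const char *p_in_string, char *p_out_string)
{
    char *dst = p_out_string;

    for (; *p_in_string != '\0'; p_in_string++) {
        if (*p_in_string != ' ')
            *dst++ = *p_in_string;
    }
    *dst = '\0';
    return p_out_string;
}

void option4_usage(void)
{
    char buf[16];

    /* The returned pointer is the buffer itself, so the call nests inside strlen. */
    assert(strlen(remove_spaces("a b c", buf)) == 3);
    assert(strcmp(buf, "abc") == 0);
}
```

This is the same shape as strcpy. The trade-off is that the return value is no longer available for a status code.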

+1

A safer way:

 int my_function(const char* p_in_string, char* p_out_string, unsigned int max_out_length); 

The function returns a status, so that it can be checked immediately, as in

 if( my_function(....) ) 

and the caller allocates the memory for the output, because

  • the caller will have to free it, and that is best done at the same level.
  • the caller, not the function, knows how memory allocation is handled in the program as a whole.
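A sketch of this pattern, under stated assumptions: `first_word` is a hypothetical function with the same parameter list as the signature above, and it returns 0 on success and nonzero on failure (the answer does not say which value means success). The caller allocates and frees the buffer at the same level.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: copies the first space-delimited word of
   p_in_string into p_out_string. Returns 0 on success (an assumption). */
int first_word(const char *p_in_string, char *p_out_string,
               unsigned int max_out_length)
{
    size_t len;

    if (p_in_string == NULL || p_out_string == NULL)
        return 1;                     /* invalid argument */

    len = strcspn(p_in_string, " ");  /* length of the first word */
    if (len + 1 > max_out_length)
        return 2;                     /* would overflow the caller's buffer */

    memcpy(p_out_string, p_in_string, len);
    p_out_string[len] = '\0';
    return 0;
}

/* The caller owns the buffer: malloc and free sit in the same function. */
int caller_example(void)
{
    unsigned int size = 32;
    char *out = malloc(size);
    int status;

    if (out == NULL)
        return 1;

    status = first_word("hello world", out, size);
    if (status == 0)
        assert(strcmp(out, "hello") == 0);

    free(out);
    return status;
}
```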
+1
  • void my_function ( char *p_in_string, char *p_out_string, int *status )
  • char* my_function ( char *p_in_string, int *status )
  • int my_function ( char *p_in_string, char *p_out_string )

In all cases, the input string should be const, unless my_function is explicitly allowed to write to it, for example to place temporary null terminators or markers in the input string.

The second form is only valid if my_function calls malloc or some other allocator to obtain the buffer. It is not safe in any C/C++ implementation to return pointers to local (stack) variables. Of course, when my_function calls malloc itself, the question becomes how the allocated buffer gets freed.

In some cases the caller is given the responsibility of freeing the buffer, either by calling free() or, to allow different layers to use different allocators, through a my_free_buffer(void*) function that you publish. Another common pattern is to return a pointer to a static buffer maintained by my_function, with the proviso that the caller should not expect the buffer to remain valid after the next call to my_function.

In all cases where a pointer to an output buffer is passed in, it should be paired with the size of the buffer.
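The caller-frees variant could look roughly like this. It is a sketch only: `dup_excerpt` is a hypothetical stand-in for the second form of my_function, its status values are made up, and it assumes `status` is never NULL. Only the name my_free_buffer comes from the text above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical second-form function: the result lives on the heap, not on
   the stack. Returns a copy of p_in_string, or NULL with *status nonzero. */
char *dup_excerpt(const char *p_in_string, int *status)
{
    size_t size;
    char *result;

    if (p_in_string == NULL) {
        *status = 1;              /* invalid argument */
        return NULL;
    }
    size = strlen(p_in_string) + 1;
    result = malloc(size);
    if (result == NULL) {
        *status = 2;              /* allocation failed */
        return NULL;
    }
    memcpy(result, p_in_string, size);
    *status = 0;
    return result;
}

/* Published release function: callers never need to know which
   allocator produced the buffer. */
void my_free_buffer(void *p_buffer)
{
    free(p_buffer);
}

void ownership_usage(void)
{
    int status;
    char *copy = dup_excerpt("excerpt", &status);

    if (status == 0) {
        assert(strcmp(copy, "excerpt") == 0);
        my_free_buffer(copy);     /* the caller releases what it was given */
    }
}
```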

The most preferred form is

int my_function(char const* pInput, char* pOutput,int cchOutput);

This returns 0 on failure, or the number of characters copied into pOutput on success. cchOutput is the size of pOutput, which prevents my_function from overrunning the pOutput buffer. If pOutput is NULL, it returns the number of characters that pOutput needs to hold exactly, including space for the null terminator, of course.

 // This is one easy way to call my_function if you know the output is <1024 characters
 char szFixed[1024];
 int cch1 = my_function(pInput,szFixed,sizeof(szFixed)/sizeof(char));

 // Otherwise you can call it like this in two passes to find out how much to alloc
 int cch2 = my_function(pInput,NULL,0);
 char* pBuf = malloc(cch2);
 my_function(pInput,pBuf,cch2);
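For completeness, here is one way the function side of that contract might be written. The body is an assumption made up for illustration (it writes the input reversed), and it assumes the input length fits in an int; only the signature and the return-value contract come from the answer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical implementation of the contract described above: writes the
   reverse of pInput to pOutput. Returns the number of chars written,
   including the null terminator, or 0 on failure. With pOutput == NULL it
   only reports the size required. Assumes strlen(pInput) fits in an int. */
int my_function(char const *pInput, char *pOutput, int cchOutput)
{
    size_t len, i;
    int needed;

    if (pInput == NULL)
        return 0;

    len = strlen(pInput);
    needed = (int)len + 1;

    if (pOutput == NULL)
        return needed;            /* size query: nothing is written */
    if (cchOutput < needed)
        return 0;                 /* buffer too small */

    for (i = 0; i < len; i++)
        pOutput[i] = pInput[len - 1 - i];
    pOutput[len] = '\0';
    return needed;
}

void two_pass_usage(void)
{
    int cch = my_function("abc", NULL, 0);            /* pass 1: ask for the size */
    char *pBuf = malloc((size_t)cch);

    if (pBuf != NULL) {
        int written = my_function("abc", pBuf, cch);  /* pass 2: fill the buffer */
        assert(written == 4);
        assert(strcmp(pBuf, "cba") == 0);
        free(pBuf);
    }
}
```

Unlike the calling snippet above, the usage function checks the result of malloc before the second pass.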
+1

Regarding the second style:

Do not assume that the memory will not be reused. Other threads may overwrite it, and you will be left with nothing but garbage.

0

I prefer option 3. It means I can do the error checking inline with the function call, i.e. in if statements. It also gives me room to add an extra parameter for the length of the string, should that be needed.

 int my_function(char *p_in_string, char **p_out_string, int *p_out_string_len) 
0

As for your option 2:
If you return a pointer to a local variable that has been allocated on the stack, the behavior is undefined.
If you return a pointer to a piece of memory that you allocated yourself ( malloc , calloc , ...), it is safe (but ugly, since you may forget to free() it).

I will vote for option 3:
It allows you to manage memory outside of my_function(...) , and you can also return some status code.

0

I would say option 3 is the best way to avoid memory management issues. You can also do error checking using the integer status.

0

It also makes sense to consider whether your function is time critical. On most architectures, using a return value is faster than using a pointer parameter. I had a case where, by using the function's return value, I could avoid a memory access in an inner loop, whereas with the pointer parameter the value was always written out to memory (the compiler does not know whether the value will be accessed through another pointer somewhere else).

With some compilers you can even apply attributes to a function's return value that cannot be expressed with pointers. For example, for a function such as strlen, some compilers know that if the pointed-to data has not changed between calls, the same value will be returned, and so they avoid calling the function again. In GNU C you can give the function the pure or even const attribute (where appropriate), which is not possible with the pointer parameter.
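A small sketch of the attribute mentioned above. `__attribute__((pure))` is a GCC extension (Clang accepts it too); the `MY_PURE` macro and the `count_char` functions are hypothetical names used only for illustration. Whether a given compiler actually merges the two calls depends on the optimization level, so treat this as a description of what the attribute permits, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* pure is a GNU C extension; compile it away on other compilers. */
#if defined(__GNUC__)
#define MY_PURE __attribute__((pure))
#else
#define MY_PURE
#endif

/* Hypothetical helper: counts occurrences of ch in text. It has no side
   effects and its result depends only on its arguments and the memory
   they point to, which is what pure promises to the compiler. */
MY_PURE size_t count_char(const char *text, char ch);

size_t count_char(const char *text, char ch)
{
    size_t n = 0;

    for (; *text != '\0'; text++) {
        if (*text == ch)
            n++;
    }
    return n;
}

/* Pointer-parameter variant: the result is stored through p_count,
   so the function writes to memory and cannot be declared pure. */
void count_char_out(const char *text, char ch, size_t *p_count)
{
    *p_count = count_char(text, ch);
}

void pure_usage(void)
{
    const char *s = "banana";
    /* With pure, the compiler is allowed to evaluate count_char(s, 'a') once. */
    size_t twice = count_char(s, 'a') + count_char(s, 'a');

    assert(twice == 6);
}
```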

0