What should the string-receiving interface look like?

Question

This is a follow-up to this question. Suppose I am writing a C++ interface that accepts or returns a constant string. I can use a null-terminated const char*:

void f(const char* str); // (1)

Another way is to use std::string:

 void f(const string& str); // (2)

You can also write overloads and accept both:

 void f(const char* str); // (3)
 void f(const string& str);

Or even a template in combination with string formatting algorithms:

 template<class Range> void f(const Range& str); // (4)

My thoughts:

(1) is not very C++-ish and may be less efficient when subsequent operations need to know the length of the string.
(2) is bad because f("long very long C string"); now constructs a std::string, which may involve a heap allocation. If f only uses the string to pass it to some low-level interface that expects a C string (like fopen), this is just a waste of resources.
(3) causes code duplication, although one f can call the other, depending on which implementation is most efficient. However, we cannot overload on the return type, as in the case of std::exception::what(), which returns const char*.
(4) does not work with separate compilation and can lead to even more code bloat.
Choosing between (1) and (2) based on what the implementation needs leaks an implementation detail into the interface.

The question is, what is the preferred way? Is there any one recommendation I can follow? What is your experience?

Edit: There is also a fifth option:

 void f(boost::iterator_range<const char*> str); // (5)

which has the advantages of both (1) (no need to construct a string object) and (2) (the size of the string is explicitly passed to the function).
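A minimal sketch of what option (5) buys, assuming a hand-rolled CharRange struct in place of boost::iterator_range<const char*> so that the example needs only the standard library. The names CharRange, from_cstr, from_string and length_of are hypothetical and are not part of Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for boost::iterator_range<const char*>:
// a non-owning view over the half-open range [first, last).
struct CharRange {
    const char* first;
    const char* last;
};

// View over a null-terminated C string: one strlen, no allocation.
inline CharRange from_cstr(const char* s) {
    assert(s != nullptr);
    return CharRange{s, s + std::strlen(s)};
}

// View over a std::string: no copy, but the string must outlive the view.
inline CharRange from_string(const std::string& s) {
    return CharRange{s.data(), s.data() + s.size()};
}

// Example callee: the length is known without scanning for '\0'.
inline std::size_t length_of(CharRange str) {
    return static_cast<std::size_t>(str.last - str.first);
}
```

A caller would write length_of(from_cstr("abc")) or length_of(from_string(s)); in neither case is a std::string constructed just to make the call.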

+3

c++

ybungalobill Jan 9 '11 at 17:43

8 answers

If you are dealing with a pure C++ code base, I would go with (2) and not worry about callers that do not pass a std::string until a problem arises. As always, don't worry about optimization unless you actually have a problem. Make your code clean, easy to read, and easy to extend.

+7

Mark loeser Jan 9 '11 at 17:45

There is one guideline you can follow: use (2) unless you have a very good reason not to.

A const char* str parameter does not make explicit which operations may be performed on str. How many times can it be incremented before it segfaults? Is it a pointer to a single char, an array of char, or a C string (i.e. a null-terminated array of char)?

+4

Oswald Jan 9 '11 at 17:50

I have no particular preference. Depending on the circumstances, I alternate between most of your examples.

Another option that I sometimes use is similar to your Range example, but uses plain old iterator ranges:

 template <typename Iter> void f(Iter first, Iter last);

which has the nice property that it works easily with both C-style strings (and lets the callee determine the length of the string in constant time) and std::string.

If templates are problematic (perhaps because I don't want the function to be defined in a header), I sometimes do the same thing but use const char* as the iterator type:

 void f(const char* first, const char* last);

Again, it can be used trivially with both C strings and C++ std::string (as I recall, C++03 does not explicitly require strings to be contiguous, but every implementation I know of uses contiguous storage, and I believe C++0x will explicitly require it).

Thus, these versions let me pass more information than a plain C-style const char* parameter (which loses the length of the string and cannot handle embedded nulls), in addition to supporting both major string types (and probably any other string class you can think of) in an idiomatic way.

The disadvantage is, of course, that you have an additional parameter.
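As a rough sketch of the non-template variant, here is a function with the f(const char* first, const char* last) shape together with two call sites. The names count_char, count_in_cstr and count_in_string are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Callee in the style of f(const char* first, const char* last):
// counts occurrences of c in the half-open range [first, last).
std::size_t count_char(const char* first, const char* last, char c) {
    assert(first <= last);
    std::size_t n = 0;
    for (const char* p = first; p != last; ++p) {
        if (*p == c) ++n;
    }
    return n;
}

// Call site for a C string: the end pointer is computed once with strlen.
std::size_t count_in_cstr(const char* s, char c) {
    return count_char(s, s + std::strlen(s), c);
}

// Call site for a std::string: uses size(), so embedded nulls are preserved.
std::size_t count_in_string(const std::string& s, char c) {
    return count_char(s.data(), s.data() + s.size(), c);
}
```

Because the range carries its own end, a std::string holding "a\0a" (three characters) is processed in full, which a lone const char* parameter could not do.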

Unfortunately, string handling is not C++'s greatest strength, so I don't think there is one "best" approach. But an iterator pair is one of several approaches that I tend to use.

+3

jalf Jan 9 '11 at 21:08

"You can also write overloads and accept both:"

void f(const string& str) already accepts both, thanks to the implicit conversion from const char* to std::string. So (3) has little advantage over (2).
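A small sketch of that implicit conversion, with f given a return value purely so the calls can be checked; demo_calls is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Option (2): a single function taking const std::string&.
std::size_t f(const std::string& str) {
    return str.size();
}

void demo_calls() {
    std::string s = "abc";
    assert(f(s) == 3);      // std::string argument binds directly
    assert(f("abc") == 3);  // const char* converts to a temporary std::string
}
```

The second call compiles without an overload, but it does construct a temporary std::string, which is the cost the question raises under (2).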

0

dan04 Jan 9 '11 at 17:51

The answer should depend on what you intend to do in f. If you need to do some complex processing on the string, approach (2) makes sense. If you simply pass it on to other functions, choose based on those functions (say, for the sake of argument, that you are opening a file: what would make the most sense? ;) )

0

Nim Jan 9 '11 at 17:51

I would choose void f(const string& str) if the function body does not do character-level analysis, meaning it does not work with the char* of str.

0

Nawaz Jan 9 '11 at 17:51

Use (2).

The first stated problem with it is not an issue, because the string has to be created at some point regardless.

Fretting over the second point smells of premature optimization. Unless you have a specific circumstance where the heap allocation is problematic, such as repeated calls with string literals that cannot be changed, clarity is better than avoiding this pitfall. Then, and only then, might you consider option (3).

(2) clearly communicates what the function accepts and has the right restrictions.

Of course, all 5 are improvements over foo(char*), which I have come across more often than I would care to mention.

0

Johnmcg Jan 9 '11 at 19:10

CB Bailey · Accepted Answer · 2011-01-09T17:52:02+0000

For taking a parameter, I would go with whatever is simplest, and often that is const char*. It works with string literals at zero cost, and retrieving a const char* from something stored in a std::string is typically very cheap as well.

Personally, I wouldn't bother with the overloads. In all but the simplest cases you will want to merge the two code paths, either by having one call the other at some point or by having both call a common function. It could be argued that overloading hides whether one form is converted to the other, and which path has the higher cost.
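For illustration only, this is roughly what merging the two code paths into a common function could look like. The helper detail::count_spaces and the choice of counting spaces are assumptions made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

namespace detail {
// Hypothetical shared implementation: takes pointer + length so that
// neither overload has to construct a std::string.
inline std::size_t count_spaces(const char* data, std::size_t len) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (data[i] == ' ') ++n;
    }
    return n;
}
} // namespace detail

// Overload for C strings: pays for one strlen.
std::size_t f(const char* str) {
    assert(str != nullptr);
    return detail::count_spaces(str, std::strlen(str));
}

// Overload for std::string: the length is already known.
std::size_t f(const std::string& str) {
    return detail::count_spaces(str.data(), str.size());
}
```

A call such as f("a b c") selects the const char* overload, so no temporary std::string is created. The reader of the call site cannot see that, though, which is the hidden-cost point made above.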

Only if I actually wanted to use the const member functions of std::string inside the function would I put const std::string& in the interface itself, and I'm not sure that just using size() would be enough of a justification.

In many projects, for better or worse, alternative string classes are in use. Many of these, like std::string, give cheap access to a null-terminated const char*, whereas converting to a std::string requires a copy. Requiring const std::string& in the interface dictates a storage strategy even when the internals of the function do not need one. I consider this undesirable, in the same way that taking const shared_ptr<X>& dictates a storage strategy, whereas taking X&, where possible, lets the caller use any storage strategy for the object being passed.

The disadvantages of const char* are that, purely from an interface standpoint, it does not enforce non-nullness (although some interfaces occasionally rely on the difference between a null parameter and an empty string, which cannot be expressed with std::string), and that a const char* might be the address of just a single character. In practice, though, using const char* to pass a string is so common that I would consider citing this as a negative to be a fairly trivial concern. Other concerns, such as whether the character encoding is specified in the interface documentation (which applies to both std::string and const char*), are much more important and likely to cause far more work.
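To make the null-versus-empty distinction concrete, here is a small sketch. The function describe_title and its return strings are invented for the example and do not come from any real interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface: a null pointer means "no title supplied",
// while an empty string means "an explicitly empty title".
std::string describe_title(const char* title) {
    if (title == nullptr) return "(default title)";
    if (*title == '\0')   return "(empty title)";
    return title;
}

void demo_titles() {
    assert(describe_title(nullptr) == "(default title)");
    assert(describe_title("") == "(empty title)");
    assert(describe_title("Report") == "Report");
}
```

With a const std::string& parameter the first case has no equivalent: the caller can only pass some string, so "absent" would need a separate flag or overload.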
