Developed an alternative to strtok

I wrote my own version of strtok, just to practice using pointers.

Can anyone see any limitations in it, or any way I could improve it?

    void stvstrtok(const char *source, char *dest, const char token)
    {
        /* Search for the token. */
        int i = 0;
        while(*source)
        {
            *dest++ = *source++;
            if(*source == token)
            {
                source++;
            }
        }
        *dest++ = '\0';
    }

    int main(void)
    {
        char *long_name = "dog,sat ,on ,the,rug,in ,front,of,the,fire";
        char buffer[sizeof(long_name)/sizeof(*long_name)];
        stvstrtok(long_name, buffer, ',');
        printf("buffer: %s\n", buffer);

        getchar();
        return 0;
    }
+4
7 answers

Note: the word "token" usually describes the parts of the string that are returned, while "delimiter" (or separator) describes what separates the tokens. To make the code easier to follow, you should rename token to delimiter and dest to token_dest.

Differences between your function and strtok:

There are several differences between your function and strtok.

  • Your function simply removes the delimiters.
  • You call your function only once to process all parts of the string. With strtok, you call it repeatedly, once for each part of the string (passing NULL as the first parameter on subsequent calls).
  • strtok also modifies the original string, while your code writes to its own buffer (I think using your own buffer, as you did, is better).
  • strtok saves the position of the next token after each call. A subsequent call with NULL as the first parameter resumes from that saved position. This makes strtok not thread-safe, whereas your function is.
  • strtok accepts several different delimiters, while your code handles only one.

With that said, I will give suggestions on how to make a better function, not one that is closer to strtok's implementation.

How to improve your function (without emulating strtok):

I think it would be better to make the following changes:

  • Have your function simply return the "next" token.
  • Break out of the loop when *source is '\0' or *source == delimiter.
  • Return a pointer to the first character of the source string that contains the next token. This pointer can be used for subsequent calls.
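
The suggestions above could be sketched roughly as follows. This is only one possible shape, not a definitive implementation: the name next_token and the dest_size parameter are my own assumptions and are not in the original code. Note that, unlike strtok, this sketch reports an empty token between two adjacent delimiters.

```c
#include <stddef.h>

/*
 * Hypothetical helper (name and dest_size parameter are assumptions).
 * Copies the next token from source into token_dest, writing at most
 * dest_size - 1 characters plus a terminating '\0'. Returns a pointer to
 * the start of the following token, or NULL once the end of source has
 * been reached.
 */
const char *next_token(const char *source, char *token_dest,
                       size_t dest_size, char delimiter)
{
    size_t n = 0;

    /* Stop at the end of the string or at the delimiter. */
    while (*source != '\0' && *source != delimiter) {
        if (n + 1 < dest_size) {
            token_dest[n++] = *source;  /* over-long tokens are truncated */
        }
        source++;
    }
    if (dest_size > 0) {
        token_dest[n] = '\0';
    }

    /* If we stopped on a delimiter, the next token starts right after it. */
    return (*source != '\0') ? source + 1 : NULL;
}
```

A caller would start with a pointer to the whole string and keep passing the returned pointer back in until the function returns NULL, reading one token from token_dest after each call.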
+7

This code does not work at all like strtok(). What exactly were you trying to do? As for improvements, your code has a serious bug: nothing checks the size of dest, so if the length of source, minus the number of token characters that get skipped, plus one for the terminating '\0', is greater than the size of dest, you have a very classic stack buffer overflow, which seems somewhat ironic to me at the moment. The main you posted already triggers it, since buffer is sized with sizeof on a pointer and ends up far smaller than the string, and using the function elsewhere will inevitably lead you down a path of uncertainty and despair.
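
One way to close that hole is to pass the size of dest and refuse to write past it. The following is a minimal sketch under my own assumptions: the name stvstrtok_bounded, the dest_size parameter and the 0 / -1 return convention are all made up for illustration. It also drops every delimiter character, which is slightly different from what the posted loop does.

```c
#include <stddef.h>

/*
 * Hypothetical bounded variant (name, dest_size and return convention are
 * assumptions). Copies source into dest without the delimiter characters.
 * Returns 0 on success, or -1 if dest is too small, in which case dest
 * holds a truncated but still terminated string.
 */
int stvstrtok_bounded(const char *source, char *dest, size_t dest_size,
                      char delimiter)
{
    size_t n = 0;

    if (dest_size == 0) {
        return -1;              /* no room even for the terminator */
    }
    for (; *source != '\0'; source++) {
        if (*source == delimiter) {
            continue;           /* skip delimiters */
        }
        if (n + 1 >= dest_size) {
            dest[n] = '\0';     /* out of space: terminate and report */
            return -1;
        }
        dest[n++] = *source;
    }
    dest[n] = '\0';
    return 0;
}
```

The caller passes sizeof buffer for a real array, and can check the return value instead of hoping the destination was big enough.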

+3

strtok allows you to iterate through all the tokens. It does this by assuming that the source string is writable and inserting NUL characters into it where tokens end. The token it returns is a pointer to a character offset within the source buffer. You can use this fact to find out when you have reached the end, and also to keep "state" between calls.

strtok is not a good function to use, as it destroys the original string. It is also not re-entrant.
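
To see both points in practice, here is a small sketch of the usual strtok loop. The helper name print_tokens is made up for illustration; strtok itself is the standard function from string.h. The string passed in must be a writable array, not a string literal.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: prints each token found by strtok and returns how
 * many there were. text must be writable, because strtok overwrites the
 * delimiters it finds with '\0'.
 */
int print_tokens(char *text, const char *delimiters)
{
    int count = 0;

    /* First call passes the string, later calls pass NULL to continue. */
    for (char *tok = strtok(text, delimiters); tok != NULL;
         tok = strtok(NULL, delimiters)) {
        printf("token %d: %s\n", count, tok);
        count++;
    }
    return count;
}
```

After calling this on a buffer holding "dog,sat,on", the buffer itself reads as just "dog", because the first comma has been replaced by a NUL. That is the destruction of the original string mentioned above.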

+1

strtok() saves some state, so you can call it several times to get multiple tokens. In addition, strtok() "splits" the original string, so that you get multiple destination strings, each of which is one token.

All your code does, from what I can see, is skip any input char that is equal to the token separator and keep copying until it reaches the terminating zero of the source.

Edit: also, consider two consecutive separators: the first will be ignored by your function and the second will be written to the destination, while strtok() treats a sequence of two or more separators as a single separator (man page: http://man.cx/?page=strtok).
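
A small sketch makes the difference visible. It reuses the loop from the question (minus the unused variable) next to a counting helper built on strtok; the name count_strtok_tokens is made up for illustration.

```c
#include <string.h>

/* The loop from the question, with the same behavior. */
void stvstrtok(const char *source, char *dest, const char token)
{
    while (*source) {
        *dest++ = *source++;
        if (*source == token) {
            source++;           /* skips only ONE separator */
        }
    }
    *dest = '\0';
}

/* Hypothetical helper: counts the tokens strtok finds in writable text. */
int count_strtok_tokens(char *text, const char *separators)
{
    int count = 0;

    for (char *tok = strtok(text, separators); tok != NULL;
         tok = strtok(NULL, separators)) {
        count++;
    }
    return count;
}
```

With the input "a,,b", stvstrtok writes "a,b" to the destination (the second comma survives), while strtok sees exactly two tokens, "a" and "b".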

+1

strtok overwrites the delimiters in the input string with NUL characters, which makes it hostile to any caller that still needs the original string.

You should also consider the case of "xyz, pdq": how many tokens will strtok pull from this string if "," is the delimiter?

What do you want your function to do in this case?

+1

In addition, strtok(...) supports multiple delimiter characters. Look at the definitions of strspn(...) and strcspn(...), as they can be used to reimplement strtok(...).
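
As a rough illustration of that idea, here is a sketch of a tokenizer built from strspn and strcspn. The name my_strtok_r and the saveptr parameter are my own choices (modeled on the POSIX strtok_r interface), not something from the question, and this is not how any particular C library implements it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical re-entrant tokenizer (name and interface are assumptions).
 * Pass the string on the first call and NULL afterwards; *saveptr keeps
 * the position between calls instead of hidden static state.
 * Like strtok, it modifies str by writing '\0' over delimiters.
 */
char *my_strtok_r(char *str, const char *delims, char **saveptr)
{
    char *end;

    if (str == NULL) {
        str = *saveptr;             /* continue where we left off */
    }
    if (str == NULL) {
        return NULL;                /* nothing left to scan */
    }

    str += strspn(str, delims);     /* skip leading delimiters */
    if (*str == '\0') {
        *saveptr = NULL;
        return NULL;
    }

    end = str + strcspn(str, delims);   /* find the end of the token */
    if (*end != '\0') {
        *end = '\0';                /* terminate the token in place */
        *saveptr = end + 1;
    } else {
        *saveptr = NULL;            /* token ran to the end of the string */
    }
    return str;
}
```

Because the position lives in the caller's saveptr variable, two strings can be tokenized at the same time without interfering with each other.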

+1

By the way, long_name is a pointer to char, so sizeof(long_name) is sizeof(char *), not the size of what long_name points to.
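
A short sketch of the difference, using the string from the question. The variable and function names here are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
static const char *name_ptr   = "dog,sat ,on ,the,rug,in ,front,of,the,fire";
static const char  name_arr[] = "dog,sat ,on ,the,rug,in ,front,of,the,fire";

/* Size of the pointer itself (commonly 4 or 8), not of the string. */
size_t size_via_pointer(void)
{
    return sizeof(name_ptr);
}

/* Size of the whole array, including the terminating '\0'. */
size_t size_via_array(void)
{
    return sizeof(name_arr);
}

/* For a pointer, the length has to be measured at run time. */
size_t size_via_strlen(void)
{
    return strlen(name_ptr) + 1;
}
```

So to size the buffer correctly, either declare long_name as an array and keep using sizeof, or keep the pointer and use strlen(long_name) + 1.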

+1