To test my skills, I'm trying to write my own version of a few standard library functions. I wrote a replacement for strlen(), strlength():
int strlength(const char *c){
int len = 0;
while (*c != '\0') {
c++;
len++;
}
return len;
}
which doesn't include the null-terminator, and I am trying to write a function to reverse a string. This:
char *reverse(const char *s){
char *str = (char *)malloc(sizeof(char) * strlength(s));
int i = 0;
while (i < strlength(s)) {
str[i] = s[(strlength(s) - 1) - i];
i++;
}
str[strlength(s)] = '\0';
return str;
}
works for every string except one with 32 characters (not including the null-terminator), like foofoofoofoofoofoofoofoofoofoofo. It hangs in the reverse() function's while-loop. For all other lengths, it works. Why is this happening?
Your buffer for str is off by 1. Your writes are overflowing into the rest of your heap.
As for why it works for values other than 32, my guess is that it has to do with how the heap aligns and rounds allocations. The allocator (not the compiler) typically rounds smaller requests up, leaving a few spare bytes after the buffer, but 32 bytes is already nicely aligned (it's a power of 2, a multiple of 8, etc.), so there is no slack and the overflow lands on something that matters. Try some other multiples of 8 and you'll probably get the same behavior.
char *reverse(const char *s){
here you allocate N characters (where N is the length of s without \0):
char *str = (char *)malloc(sizeof(char) * strlength(s));
then you iterate N times over all characters of s
int i = 0;
while (i < strlength(s)) {
str[i] = s[(strlength(s) - 1) - i];
i++;
}
and finally you add \0 at N+1 characters
str[strlength(s)] = '\0';
return str;
}
so you should do instead:
char *str = malloc(sizeof(*str) * (strlength(s) + 1)); // +1 for final `\0`
The funny thing is that I just tested your code, and it works fine for me (off-by-one error included) with your 32-character string. As @JoachimPileborg says, "That's the fun thing about undefined behavior"!
As suggested by others, the problem is most likely due to memory alignment: when your data exactly fills the aligned block, the extra byte overflows into adjacent memory, whereas when it does not, the extra byte only overwrites padding.
You asked:
But this works for every other string length. Why won't it for 32?
Most likely because the runtime allocates memory in blocks of 32 bytes. So a 1-character buffer overrun when the buffer size is, say, 22 bytes, isn't a problem. But when you allocate 32 bytes and try to write to the 33rd byte, the problem shows up.
I suspect you'd see the same error with a string of 64 characters, 96, 128, etc . . .
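To make the rounding argument concrete, here is a toy model. It is not how any particular allocator works: the 32-byte block size is just the guess from this answer, and the function names are made up for illustration. It only shows why a 22-byte request would have spare bytes after it while a 32-byte request would have none. Writing past the requested size is undefined behaviour either way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block size, taken from the guess above; real allocators differ. */
#define BLOCK_SIZE 32u

/* Round a request up to the next multiple of BLOCK_SIZE (toy model only). */
size_t rounded_size(size_t request)
{
    assert(request <= (size_t)-1 - BLOCK_SIZE);   /* avoid overflow */
    return (request + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;
}

/* Spare bytes between the requested size and the rounded block size. */
size_t slack_bytes(size_t request)
{
    return rounded_size(request) - request;
}
```

In this model a 22-byte request leaves 10 spare bytes, so a one-byte overrun goes unnoticed, while requests of 32, 64 or 96 bytes leave none.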
Replace
malloc(sizeof(char) * strlength(s))
by
malloc(sizeof(char) * (1+strlength(s)));
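For reference, a minimal sketch of the whole function with that fix applied. I've used the standard strlen() instead of your strlength(), renamed the function to reverse_copy, computed the length once, and added a malloc check; those are my choices, not something the fix requires.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated reversed copy of s, or NULL if malloc fails.
   The caller is responsible for calling free() on the result. */
char *reverse_copy(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);          /* compute the length once */
    char *str = malloc(len + 1);     /* +1 for the terminating '\0' */
    if (str == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        str[i] = s[len - 1 - i];
    str[len] = '\0';                 /* this write is now inside the allocation */
    return str;
}
```

With the extra byte, the terminator no longer lands outside the buffer, so the 32-character case behaves like every other length.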
The line:
str[strlength(s)] = '\0';
often works because the malloc library routine reserves a block aligned to a word boundary (often rounded up, e.g. to a power of 2) and hands out only as much of it as the call requested, so a one-byte overrun usually lands in unused space. When the write goes beyond the reserved block, though, the behavior depends on your toolchain and target, and examining the disassembly in a debugger is the best way to understand it. Since that line follows the while loop rather than sitting inside it, there is no way to predict, without the disassembly, how the loop turns into an infinite one.
What everybody else said about buffer overflow and the vagaries of your runtime's memory allocation implementation/strategy. The size of a C-string is 1 more than its length, due to the NUL-termination octet.
Something like this ought to do you (a little cleaner and easier to read):
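#define NUL ((char)0)
char *reverse( const char *s )
{
size_t len = strlen(s) ;
char *tgt = malloc( 1+len ) ;
if ( tgt == NULL ) return NULL ;
tgt += len ; // point at the slot for the terminator
const char *src = s ;
*tgt = NUL ;
while ( *src )
{
*(--tgt) = *(src++) ; // fill the buffer from the back
}
return tgt; // tgt is back at the start of the buffer
}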
Your strlen() implementation is more complicated than it needs to be, too. This is about all you need:
int string_length( const char *s )
{
const char *p = s ; // keep the const qualifier of s
while ( *p ) ++p ;
return (int)(p - s) ;
}
Background:
I'm trying to create a program that takes a user name(assuming that input is clean), and prints out the initials of the name.
Objective:
Trying my hand out at C programming with CS50
Getting myself familiar with malloc & realloc
Code:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
string prompt(void);
char *getInitials(string input);
char *appendArray(char *output,char c,int count);
//Tracks # of initials
int counter = 0;
int main(void){
string input = prompt();
char *output = getInitials(input);
for(int i = 0; i < counter ; i++){
printf("%c",toupper(output[i]));
}
}
string prompt(void){
string input;
do{
printf("Please enter your name: ");
input = get_string();
}while(input == NULL);
return input;
}
char *getInitials(string input){
bool initials = true;
char *output;
output = malloc(sizeof(char) * counter);
for(int i = 0, n = strlen(input); i < n ; i++){
//32 -> ASCII code for spacebar
//9 -> ASCII code for tab
if(input[i] == 32 || input[i] == 9 ){
//Next char after spaces/tab will be initial
initials = true;
}else{//Not space/tab
if(initials == true){
counter++;
output = appendArray(output,input[i],counter);
initials = false;
}
}
// eprintf("Input[i] is : %c\n",input[i]);
// eprintf("Counter is : %i\n",counter);
// eprintf("i is : %i\n",i);
// eprintf("n is : %i\n",n);
}
return output;
}
char *appendArray(char *output,char c,int count){
// allocate an array of some initial (fairly small) size;
// read into this array, keeping track of how many elements you've read;
// once the array is full, reallocate it, doubling the size and preserving (i.e. copying) the contents;
// repeat until done.
//pointer to memory
char *data = malloc(0);
//Increase array size by 1
data = realloc(output,sizeof(char) * count);
//append the latest initial
strcat(data,&c);
printf("Value of c is :%c\n",c);
printf("Value of &c is :%s\n",&c);
for(int i = 0; i< count ; i++){
printf("Output: %c\n",data[i]);
}
return data;
}
Problem:
The output is not what I expected: a mysterious P appears in it.
E.g. when I enter the name Barack Obama, instead of getting the result BO, I get BP. The same happens for whatever name I enter, with the last initial always being P.
Output:
Please enter your name: Barack Obama
Value of c is :B
Value of &c is :BP
Output: B
Value of c is :O
Value of &c is :OP
Output: B
Output: P
BP
What I've done:
I've traced the problem to the appendArray function, and more specifically to the value of &c (address of c), though I have no idea what's causing the P to appear, what it means, or how I can get rid of it.
The P shows up no matter what I input.
Insights as to why it's happening and what I can do to solve it will be much appreciated.
Thanks!
Several issues, in decreasing order of importance...
First issue - c in appendArray is not a string - it is not a sequence of character values terminated by a 0. c is a single char object, storing a single char value.
When you try to print c as a string, as in
printf("Value of &c is :%s\n",&c);
printf writes out the sequence of character values starting at the address of c until it sees a 0-valued byte. For whatever reason, the byte immediately following c contains the value 80, which is the ASCII (or UTF-8) code for the character 'P'. The next byte contains a 0 (or there's a sequence of bytes containing non-printable characters, followed by a 0-valued byte).
Similarly, using &c as the argument to strcat is inappropriate, since c is not a string. Instead, you should do something like
data[count-1] = c;
Secondly, if you want to treat the data array as a string, you must make sure to size it at least 1 more than the number of initials and write a 0 to the element after the last initial:
data[count] = 0; // after all initials have been stored; data must hold at least count + 1 bytes
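Putting those first two points together, a rough sketch of an append helper could look like the following. The name append_initial and its signature are made up for illustration; it stores the character by index instead of calling strcat, and it keeps the buffer terminated after every call. It still grows one byte at a time, which keeps the example short.

```c
#include <assert.h>
#include <stdlib.h>

/* Appends c to the string held in *buf, which currently has `count` characters
   (not counting the terminator). *buf may be NULL when count is 0.
   Returns 1 on success, 0 if realloc fails (the old buffer is left intact). */
int append_initial(char **buf, size_t count, char c)
{
    assert(buf != NULL);
    char *tmp = realloc(*buf, count + 2);  /* old chars + new char + '\0' */
    if (tmp == NULL)
        return 0;
    tmp[count] = c;          /* store the new character by index */
    tmp[count + 1] = '\0';   /* keep the buffer a valid string */
    *buf = tmp;
    return 1;
}
```

Because the buffer is always terminated, it can be printed with %s directly, and no stray byte after the last initial can show up in the output.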
Third,
char *data = malloc(0);
serves no purpose, the behavior is implementation-defined, and you immediately overwrite the result of malloc(0) with a call to realloc:
data = realloc(output,sizeof(char) * count);
So, get rid of the malloc(0) call altogether; either just initialize data to NULL, or initialize it with the realloc call:
char *data = realloc( output, sizeof(char) * count );
Fourth, avoid using "magic numbers" - numeric constants with meaning beyond their immediate, literal value. When you want to compare against character values, use character constants. IOW, change
if(input[i] == 32 || input[i] == 9 ){
to
if ( input[i] == ' ' || input[i] == '\t' )
That way you don't have to worry about whether the character encoding is ASCII, UTF-8, EBCDIC, or some other system. ' ' means space everywhere, '\t' means tab everywhere.
Finally...
I know part of your motivation for this exercise is to get familiar with malloc and realloc, but I want to caution you about some things:
realloc is potentially an expensive operation, it may move data to a new location, and it may fail. You really don't want to realloc a buffer a byte at a time. Instead, it's better to realloc in chunks. A typical strategy is to multiply the current buffer size by some factor > 1 (typically doubling):
char *tmp = realloc( data, current_size * 2 );
if ( tmp )
{
current_size *= 2;
data = tmp;
}
You should always check the result of a malloc, calloc, or realloc call to make sure it succeeded before attempting to access that memory.
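As a slightly fuller sketch of that doubling strategy, assuming a small struct to carry the pointer, length and capacity together (the struct, its field names and the starting capacity of 8 are my own choices for illustration):

```c
#include <assert.h>
#include <stdlib.h>

/* A growable character buffer; the struct and its field names are illustrative. */
struct char_buf {
    char  *data;   /* NULL until the first push */
    size_t len;    /* characters stored, not counting '\0' */
    size_t cap;    /* bytes allocated */
};

/* Appends c, doubling the capacity when full. Returns 1 on success, 0 on failure. */
int char_buf_push(struct char_buf *b, char c)
{
    assert(b != NULL);
    if (b->len + 1 >= b->cap) {                 /* need room for c and '\0' */
        size_t new_cap = b->cap ? b->cap * 2 : 8;
        char *tmp = realloc(b->data, new_cap);
        if (tmp == NULL)
            return 0;                           /* b->data is still valid */
        b->data = tmp;
        b->cap = new_cap;
    }
    b->data[b->len++] = c;
    b->data[b->len] = '\0';                     /* always keep it terminated */
    return 1;
}
```

Start from a zero-initialised struct (struct char_buf b = {0};) and free(b.data) when done. Appending 100 characters this way calls realloc only five times instead of 100.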
Minor stylistic notes:
Avoid global variables where you can. There's no reason counter should be global, especially since you pass it as an argument to appendArray. Declare it local to main and pass it as an argument (by reference) to getInitials:
int main( void )
{
int counter = 0;
...
char *output = getInitials( input, &counter );
for(int i = 0; i < counter ; i++)
{
printf("%c",toupper(output[i]));
}
...
}
/**
* The "string" typedef is an abomination that *will* lead you astray,
* and I want to have words with whoever created the CS50 header.
*
* They're trying to abstract away the concept of a "string" in C, but
* they've done it in such a way that the abstraction is "leaky" -
* in order to use and access the input object correctly, you *need to know*
* the representation behind the typedef, which in this case is `char *`.
*
* Secondly, not every `char *` object points to the beginning of a
* *string*.
*
* Hiding pointer types behind typedefs is almost always bad juju.
*/
char *getInitials( const char *input, int *counter )
{
...
(*counter)++; // parens are necessary here
output = appendArray(output,input[i],*counter); // need leading * here
...
}
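For comparison, here is one way the whole function could look with the count passed through a pointer. This is only a sketch under my own assumptions: it is named get_initials, it sidesteps realloc entirely by allocating strlen(input) + 1 bytes up front (there can never be more initials than input characters), and it upper-cases while copying.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd, NUL-terminated string of upper-cased initials and
   stores their number in *count. Returns NULL if allocation fails. */
char *get_initials(const char *input, size_t *count)
{
    assert(input != NULL && count != NULL);
    char *out = malloc(strlen(input) + 1);   /* upper bound plus terminator */
    if (out == NULL)
        return NULL;

    size_t n = 0;
    int at_word_start = 1;
    for (const char *p = input; *p != '\0'; p++) {
        if (*p == ' ' || *p == '\t') {
            at_word_start = 1;               /* next non-blank starts a word */
        } else if (at_word_start) {
            out[n++] = (char)toupper((unsigned char)*p);
            at_word_start = 0;
        }
    }
    out[n] = '\0';
    *count = n;
    return out;
}
```

The caller can print the result with %s and must free() it afterwards.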
I am new to the C language. I need to concatenate a char array and a char. In Java we can use the '+' operator, but in C that is not allowed. strcat and strcpy are also not working for me. How can I achieve this? My code is as follows:
void myFunc(char prefix[], struct Tree *root) {
char tempPrefix[30];
strcpy(tempPrefix, prefix);
char label = root->label;
//I want to concat tempPrefix and label
My problem differs from concatenate char array in C as it concat char array with another but mine is a char array with a char
Rather simple really. The main concern is that tempPrefix should have enough space for the prefix plus the extra character. Since C strings must be null terminated, your function shouldn't copy more than 28 characters of the prefix: that's 30 (the size of the buffer) - 1 (the root label character) - 1 (the terminating null character). Fortunately the standard library has strncpy:
size_t const buffer_size = sizeof tempPrefix; // Only because tempPrefix is declared an array of characters in scope.
strncpy(tempPrefix, prefix, buffer_size - 2); // copy at most 28 characters into indices 0..27
tempPrefix[buffer_size - 2] = root->label;
tempPrefix[buffer_size - 1] = '\0';
It's also worthwhile not to hard code the buffer size in the function calls, thus allowing you to increase its size with minimum changes.
If the prefix doesn't exactly fill the buffer, some more legwork is needed. The approach is pretty much the same as before, but a call to strchr is required to complete the picture.
size_t const buffer_size = sizeof tempPrefix; // Only because tempPrefix is declared an array of characters in scope.
strncpy(tempPrefix, prefix, buffer_size - 2); // copy at most 28 characters into indices 0..27
tempPrefix[buffer_size - 2] = tempPrefix[buffer_size - 1] = '\0';
*strchr(tempPrefix, '\0') = root->label;
We again copy no more than 28 characters, but explicitly pad the end with NUL bytes. Since strncpy fills the destination with NUL bytes up to count when the string being copied is shorter, in effect everything after the copied prefix is now \0. This is why I dereference the result of strchr right away: it is guaranteed to point at a valid character, the first free position to be exact.
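Wrapped up as a function, the second approach might look like this. It is a sketch, with prefix_plus_char as a made-up name and the buffer capacity passed in explicitly, because sizeof only gives the array size where the array itself is in scope.

```c
#include <assert.h>
#include <string.h>

/* Copies as much of prefix as fits into dst (capacity cap, at least 2),
   then appends c and keeps the result terminated. Returns the new length.
   Assumes c is not '\0'. */
size_t prefix_plus_char(char *dst, size_t cap, const char *prefix, char c)
{
    assert(cap >= 2);
    /* strncpy copies at most cap - 2 characters and, when prefix is shorter,
       fills the rest of those cap - 2 bytes with '\0'. */
    strncpy(dst, prefix, cap - 2);
    dst[cap - 2] = '\0';
    dst[cap - 1] = '\0';
    char *end = strchr(dst, '\0');   /* first free position, at most cap - 2 */
    *end = c;                        /* the byte after it is already '\0' */
    return (size_t)(end - dst) + 1;
}
```

A prefix that is too long is silently truncated to cap - 2 characters before the label is added.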
strXXX() family of functions mostly operate on strings (except the searching related ones), so you will not be able to use the library functions directly.
You can find the position of the existing null-terminator, replace it with the char value you want to concatenate, and add a null-terminator after that. However, you need to make sure the destination array has enough room left to hold the concatenated string.
Something like this (not tested)
#define SIZ 30
//function
char tempPrefix[SIZ] = {0}; //initialize
strcpy(tempPrefix, prefix); //copy the string
char label = root->label; //take the char value
if (strlen(tempPrefix) < (SIZ -1)) //Check: Do we have room left?
{
size_t res = strlen(tempPrefix); // index of the current null (strchr would return a pointer, not an index)
tempPrefix[res] = label; //replace with the value
tempPrefix[res + 1] = '\0'; //add a null to next index
}
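A different technique that avoids the manual bookkeeping is to let snprintf do the concatenation and the bounds check in one go. This is a sketch with a made-up helper name (concat_char); snprintf itself is standard since C99 and returns the length the full result would have had.

```c
#include <assert.h>
#include <stdio.h>

/* Writes prefix followed by label into dst (capacity cap).
   Returns 1 if everything fit, 0 if the result was truncated or an error occurred. */
int concat_char(char *dst, size_t cap, const char *prefix, char label)
{
    assert(dst != NULL && cap > 0);
    int n = snprintf(dst, cap, "%s%c", prefix, label);
    return n >= 0 && (size_t)n < cap;
}
```

In the question's function this would be called as concat_char(tempPrefix, sizeof tempPrefix, prefix, root->label). On truncation dst still holds a terminated string, just a shortened one.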
I am in the process of teaching myself C. I have the following code that prints a string char by char forwards and backwards:
#include<stdio.h>
#include<string.h>
main(){
char *str;
fgets(str, 100, stdin);
//printf("%i", strlen(str));
int i;
for(i = 0; i < strlen(str) - 1; i++){
printf("%c", str[i]);
}
for(i = strlen(str); i > -1; i--){
printf("%c", str[i]);
}
}
When run, it gives me the following output (assuming I typed "hello"):
cello
ollec
In addition, if I uncomment the 7th line of code, I get the following output (assuming I typed "hello"):
6 ♠
For the life of me, I cannot figure out what I am doing that is causing the first character in the output to change. In the second example, I know that the string length would be 6 because 'h' + 'e' + 'l' + 'l' + 'o' + '\0' = 6. That is fine, but where is the spade symbol coming from? Why is it only printing one of them?
It is pretty obvious to me that I have some kind of fundamental misunderstanding of what is happening under the hood here and I cant find any examples of this elsewhere. Can anyone explain what is going wrong here?
You never allocate memory for the string. Instead of
char *str;
use
char str[100];
so that you have enough space for the up to 100 characters you read in there with the fgets call.
In this code:
char *str;
fgets(str, 100, stdin);
str points to an effectively random location. Then you tell fgets to read characters and put them where str is pointing. This causes undefined behaviour; the symptoms you are seeing probably occur because str happened to point to some memory whose first byte was being used for other purposes, while the following bytes weren't.
Instead you need to allocate memory:
char str[100]; // allocate 100 bytes
fgets(str, 100, stdin);
Pointers only point at memory which already is allocated somewhere; they do not "own" or "contain" any memory.
You should preallocate space for your string, otherwise you are writing to who knows where, which is bad.
char str[100]; //I must be big enough to hold anything fgets might pass me
You should also be sure to only access parts of the string which contain characters:
for(i = strlen(str)-1; i > -1; i--){
printf("%c", str[i]);
}
Note that the character at strlen(str) is \0, the string-terminating null character. So you can access this space, but trying to print it or otherwise treating it like a standard letter is going to lead to issues at some point.
Your str is a pointer to char, but you don't have any actual character buffer for it to point to. You need a character array instead:
char str[100];
Only then can fgets have somewhere to store the data it reads.
Then on your reverse-printing loop, your indices are wrong:
for(i = strlen(str); i > -1; i--){
With the above, you try to print str[i] for i = strlen(str), but that's one past the end of the valid string data. Change to:
for(i = strlen(str) - 1; i > -1; i--){
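If you'd rather keep the index unsigned (strlen returns a size_t), the test i > -1 no longer works, and a common idiom is to decrement inside the condition. Below is a small sketch with two made-up helpers: one removes the newline that fgets leaves in the buffer, the other copies the characters in reverse order into a second buffer that the caller provides.

```c
#include <assert.h>
#include <string.h>

/* Removes the trailing '\n' that fgets stores, if there is one. */
void strip_newline(char *s)
{
    assert(s != NULL);
    s[strcspn(s, "\n")] = '\0';
}

/* Copies src into dst in reverse order. dst must hold strlen(src) + 1 bytes. */
void copy_reversed(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t j = 0;
    for (size_t i = strlen(src); i-- > 0; )   /* visits len-1, len-2, ..., 0 */
        dst[j++] = src[i];
    dst[j] = '\0';
}
```

After fgets(str, sizeof str, stdin) succeeds, calling strip_newline(str) and then copy_reversed(out, str) with a second 100-byte array out gives a string that can be printed with %s.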
The issue is that you are not allocating your
char *str
what you need to do is either
1)
char *str = malloc(sizeof(char) * 100);
and then when you are no longer using it:
free(str);
2)
char str[100];
I'm coding a program that takes some files as parameters and prints all lines reversed. The problem is that I get unexpected results:
If I apply it to a file containing the following lines
one
two
three
four
I get the expected result, but if the file contains
september
november
december
It returns
rebmetpes
rebmevons
rebmeceds
And I don't understand why it adds a "s" at the end
Here is my code
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void reverse(char *word);
int main(int argc, char *argv[], char*envp[]) {
/* No arguments */
if (argc == 1) {
return (0);
}
FILE *fp;
int i;
for (i = 1; i < argc; i++) {
fp = fopen(argv[i],"r"); // read mode
if( fp == NULL )
{
fprintf(stderr, "Error, no file");
}
else
{
char line [2048];
/*read line and reverse it. the function reverse it prints it*/
while ( fgets(line, sizeof line, fp) != NULL )
reverse(line);
}
fclose(fp);
}
return (0);
}
void reverse(char *word)
{
char *aux;
aux = word;
/* Store the length of the word passed as parameter */
int longitud;
longitud = (int) strlen(aux);
/* Allocate memory enough ??? */
char *res = malloc( longitud * sizeof(char) );
int i;
// in this loop I copy the string reversed into a new one
for (i = 0; i < longitud-1; i++)
{
res[i] = word[longitud - 2 - i];
}
fprintf(stdout, "%s\n", res);
free(res);
}
(NOTE: some code has been deleted for clarity but it should compile)
You forgot to terminate your string with a \0 character. Your loop copies only the visible characters (it skips the trailing newline), so nothing ever writes a terminator into res. First allocate memory for one more character than you currently do:
char *res = malloc( longitud * sizeof(char) + 1);
And then try this:
for (i = 0; i < longitud-1; i++)
{
res[i] = word[longitud - 2 - i];
}
res[i] = '\0'; // Terminating string with '\0'
I think I know the problem, and it's a bit of a weird issue.
Strings in C are zero terminated. This means that the string "Hi!" in memory is actually represented as 'H','i','!','\0'. The way strlen etc then know the length of the string is by counting the number of characters, starting from the first character, before the zero terminator. Similarly, when printing a string, fprintf will print all the characters until it hits the zero terminator.
The problem is, your reverse function never bothers to set the zero terminator at the end, which it needs to since you're copying characters into the buffer character by character. This means fprintf runs off the end of the characters you wrote and into uninitialised memory, which just happened to be zero when you hit it (malloc makes no promises about the contents of the buffer you allocate, just that it's big enough). You would probably get different behaviour from a Windows debug build, since I believe the debug heap there fills freshly allocated buffers with a non-zero pattern (0xCD bytes) instead.
So, what's happening is you copy september, reversed, into res. This works as you see, because it just so happens that there's a zero at the end.
You then free res, then malloc it again. Again, by chance (and because of some smartness in malloc) you get the same buffer back, which already contains "rebmetpes". You then put "november" in, reversed, which is slightly shorter, hence your buffer now contains "rebmevons".
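You can't rely on malloc handing back the same block, so the sketch below replays the sequence on one fixed array that stands in for that block. The function names and the 16-byte size are made up for the demonstration; the point is only what is left in memory when no terminator is written.

```c
#include <assert.h>
#include <string.h>

/* Copies the first n characters of src in reverse order into buf and,
   like the code in the question, does NOT write a terminator. */
void reverse_without_terminator(char *buf, const char *src, size_t n)
{
    assert(buf != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        buf[i] = src[n - 1 - i];
}

/* Replays the two calls on one zero-filled array that stands in for the
   block malloc happened to hand back twice. result needs at least 16 bytes. */
void replay_stale_buffer(char *result)
{
    char block[16] = {0};                               /* simulated heap block */
    reverse_without_terminator(block, "september", 9);  /* block holds "rebmetpes" */
    reverse_without_terminator(block, "november", 8);   /* overwrites only 8 bytes */
    memcpy(result, block, sizeof block);
}
```

After the second call the ninth byte still holds the 's' left over from "rebmetpes", so printing the buffer as a string gives "rebmevons".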
So, the fix? Allocate another character too; this will hold your zero terminator (char *res = malloc( longitud * sizeof(char) + 1);). After you reverse the string, set the zero terminator right after the last character you copied. Your loop copies longitud - 1 characters (it skips the newline), so that is res[longitud - 1] = '\0';.
There are two errors there. The first one is that you need one more char allocated (all chars for the string + 1 for the terminator):
char *res = malloc( (longitud+1) * sizeof(char) );
The second one is that you have to terminate the string. Your loop fills res[0] to res[longitud-2], so the terminator goes right after:
res[longitud-1]='\0';
You can terminate the string before entering the loop because you already know the size of the destination string.
Note that if you use calloc instead of malloc you will not need to terminate the string, as the memory is already zero-initialised.
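A sketch of the reversal with both fixes applied is below. It differs from the original in a few ways that are my own choices: it returns the reversed string instead of printing it, it is named reverse_line, and it finds the end of the text with strcspn so that a last line without a trailing newline is handled too.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd reversed copy of line without its trailing newline,
   or NULL if allocation fails. The caller frees the result. */
char *reverse_line(const char *line)
{
    assert(line != NULL);
    size_t len = strcspn(line, "\n");   /* length up to, not including, '\n' */
    char *res = malloc(len + 1);        /* +1 for the terminator */
    if (res == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        res[i] = line[len - 1 - i];
    res[len] = '\0';                    /* terminate right after the last copied char */
    return res;
}
```

With the terminator in place, whatever a recycled buffer contained beyond that point no longer matters.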
Thanks, it solved my problem. I had read something about the "\0" in strings but it wasn't very clear to me; it is now, after reading all the answers (all are pretty good). Thank you all for the help.
May someone please help me understand these lines of code in the program below?
According to the writer, this program takes the string hello world, and a function in it reverses the string to world hello. My question is: what does this code do?
char * p_divs = divs; //what does divs do
char tmp;
while(tmp = *p_divs++)
if (tmp == c) return 1;
also this code in the void function
*dest = '\0';//what does this pointer do?
int source_len = strlen(source); //what is source
if (source_len == 0) return;
char * p_source = source + source_len - 1;
char * p_dest = dest;
while(p_source >= source){
while((p_source >= source) && (inDiv(*p_source, divs))) p_source--;
this is the main program
#include <stdio.h>
#include <string.h>
int inDiv(char c, char * divs){
char * p_divs = divs;
char tmp;
while(tmp = *p_divs++)
if (tmp == c) return 1;
return 0;
}
void reverse(char * source, char * dest, char * divs){
*dest = '\0';
int source_len = strlen(source);
if (source_len == 0) return;
char * p_source = source + source_len - 1;
char * p_dest = dest;
while(p_source >= source){
while((p_source >= source) && (inDiv(*p_source, divs))) p_source--;
if (p_source < source) break;
char * w_end = p_source;
while((p_source >= source) && (!inDiv(*p_source, divs))) p_source--;
char * w_beg = p_source + 1;
for(char * p = w_beg; p <= w_end; p++) *p_dest++ = *p;
*p_dest++ = ' ';
}
*p_dest = '\0';
}
#define MAS_SIZE 100
int main(){
char source[MAS_SIZE], dest[MAS_SIZE], divs[MAS_SIZE];
printf("String : "); gets(source);
printf("Dividers : "); gets(divs);
reverse(source, dest, divs);
printf("Reversed string : %s", dest);
return 0;
}
Here, inDiv can be called to search for the character c in the string divs, for example:
inDiv('x', "is there an x character in here somewhere?") will return 1
inDiv('x', "ahhh... not this time") will return 0
Working through it:
int inDiv(char c, char * divs)
{
char * p_divs = divs; // remember which character we're considering
char tmp;
while(tmp = *p_divs++) // copy that character into tmp, and move p_divs to the next character
// but if tmp is then 0/false, break out of the while loop
if (tmp == c) return 1; // if tmp is the character we're searching for, return "1" meaning found
return 0; // must be here because tmp == 0 indicating end-of-string - return "0" meaning not-found
}
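For what it's worth, the same search can be written with the standard strchr. This is a sketch with a made-up name (in_div); the extra test is there because strchr treats the terminator as part of the string, whereas the loop above never reports a match for '\0'.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if c occurs in divs, 0 otherwise. */
int in_div(char c, const char *divs)
{
    assert(divs != NULL);
    /* strchr would also match the terminator, so rule out c == '\0' first. */
    return c != '\0' && strchr(divs, c) != NULL;
}
```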
We can infer things about reverse by looking at the call site:
int main()
{
char source[MAS_SIZE], dest[MAS_SIZE], divs[MAS_SIZE];
printf("String : ");
gets(source);
printf("Dividers : ");
gets(divs);
reverse(source, dest, divs);
printf("Reversed string : %s", dest);
We can see gets() called to read from standard input into character arrays source and divs -> those inputs are then provided to reverse(). The way dest is printed, it's clearly meant to be a destination for the reversal of the string in source. At this stage, there's no insight into the relevance of divs.
Let's look at the source...
void reverse(char * source, char * dest, char * divs)
{
*dest = '\0'; //what does this pointer do?
int source_len = strlen(source); //what is source
if (source_len == 0) return;
char* p_source = source + source_len - 1;
char* p_dest = dest;
while(p_source >= source)
{
while((p_source >= source) && (inDiv(*p_source, divs))) p_source--;
Here, *dest = '\0' writes a NUL character into the character array dest - that's the normal sentinel value encoding the end-of-string position - putting it in at the first character *dest implies we want the destination to be cleared out. We know source is the textual input that we'll be reversing - strlen() will set source_len to the number of characters therein. If there are no characters, then return as there's no work to do and the output is already terminated with NUL. Otherwise, a new pointer p_source is created and initialised to source + source_len - 1 -> that means it's pointing at the last non-NUL character in source. p_dest points at the NUL character at the start of the destination buffer.
Then the loop says: while (p_source >= source) - for this to do anything p_source must initially be >= source - that makes sense as p_source points at the last character and source is the first character address in the buffer; the comparison implies we'll be moving one or both towards the other until they would cross over - doing some work each time. Which brings us to:
while((p_source >= source) && (inDiv(*p_source, divs))) p_source--;
This is the same test we've just seen - but this time we're only moving p_source backwards towards the start of the string while inDiv(*p_source, divs) is also true... that means that the character at *p_source is one of the characters in the divs string. What it means is basically: move backwards until you've gone past the start of the string (though this test has undefined behaviour as Michael Burr points out in comments, and really might not work if the string happens to be allocated at address 0 - even if relative to some specific data segment, as the pointer could go from 0 to something like FFFFFFFF hex without ever seeming to be less than 0) or until you find a character that's not one of the "divider" characters.
Here we get some real insight into what the code's doing... dividing the input into "words" separated by any of a set of characters in the divs input, then writing them in reverse order with space delimiters into the destination buffer. That's getting ahead of ourselves a bit - but let's track it through:
The next line is...
if (p_source < source) break;
...which means that if the divider-skipping loop exited because it backed past the front of the source string, we break out of the outer loop (looking ahead, the code then just puts a new NUL on the end of the already-generated output and returns). This only happens when nothing but divider characters remained, for example when the input starts with dividers. If we'd been backing through the "hello" in "hello world", that is done by the second inner loop further down, after which the word is still copied to the output, so no word is lost.
Otherwise:
char* w_end = p_source; // remember where the non-divider character "word" ends
// move backwards until there are no more characters (p_source < source) or you find a divider character
while((p_source >= source) && (!inDiv(*p_source, divs))) p_source--;
// either way that loop exited, the "word" begins at p_source + 1
char * w_beg = p_source + 1;
// append the word between w_beg and w_end to the destination buffer
for(char* p = w_beg; p <= w_end; p++) *p_dest++ = *p;
// also add a space...
*p_dest++ = ' ';
This keeps happening for each "word" in the input, then the final line adds a NUL terminator to the destination.
*p_dest = '\0';
Now, you said:
according [to] the writer it writes a string of hello world then there is a function in it that also reverses the string to world hello
Well, given inputs "hello world" and divider characters including a space (but none of the other characters in the input), then the output would be "world hello " (note the space at the end).
For what it's worth - this code isn't that bad... it's pretty normal for C handling of ASCIIZ buffers, though the assumptions about the length of the input are dangerous (gets() cannot limit how much it writes) and the output always ends with a stray space.
How to fix the undefined behaviour
Regarding the undefined behaviour - the smallest change to address that is to change the loops so they terminate when at the start of the buffer, and have the next line explicitly check why it terminated and work out what behaviour is required. That will be a bit ugly, but isn't rocket science....
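One possible shape for that change, as a sketch and not the original author's code: track positions as indices that count down to 0 and always look at the character before the index, so nothing ever points before the start of the buffer. The name reverse_words is mine, and strchr stands in for inDiv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes the words of source into dest in reverse order, each followed by a
   space, as the original does. dest must hold at least strlen(source) + 2 bytes. */
void reverse_words(const char *source, char *dest, const char *divs)
{
    assert(source != NULL && dest != NULL && divs != NULL);
    size_t end = strlen(source);   /* index one past the unprocessed part */
    while (end > 0) {
        /* skip dividers; the index stops at 0 and never goes before the start */
        while (end > 0 && strchr(divs, source[end - 1]) != NULL)
            end--;
        if (end == 0)
            break;                 /* only dividers were left */
        size_t beg = end;
        while (beg > 0 && strchr(divs, source[beg - 1]) == NULL)
            beg--;
        memcpy(dest, source + beg, end - beg);   /* copy one word */
        dest += end - beg;
        *dest++ = ' ';
        end = beg;
    }
    *dest = '\0';
}
```

The explicit end == 0 check plays the role of the original if (p_source < source) break; line.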
char * p_divs = divs; //what does divs do
char tmp;
while(tmp = *p_divs++)
if (tmp == c) return 1;
divs is a pointer to a char array (almost certainly a string). p_divs just points to the same string, and within the while loop a single character is extracted and written to tmp, and then the pointer is incremented, meaning that the next character will be extracted on the next iteration. If tmp matches c the function returns 1.
Edit: You should learn more about pointers, have a look at Pointer Arithmetic.
As I pointed out in the comments, I don't think C is really the ideal tool for this task (given a choice, I'd use C++ without a second thought).
However, I suppose if I'm going to talk about how horrible the code is, the counter-comment really was right: I should post something better. Contrary to the comment in question, however, I don't think this represents a compromise in elegance, concision, or performance.
The only part that might be open to real argument is elegance, but I think this is enough simpler and more straightforward that there's little real question in that respect. It's clearly more concise -- using roughly the same formatting convention as the original, my rev_words is 14 lines long instead of 17. As most people would format them, mine is 17 lines and his is 21.
For performance, I'd expect the two to be about equivalent under most circumstances. Mine avoids running off the beginning of the array, which saves a tiny bit of time. The original contains an early exit, which will save a tiny bit of time on reversing an empty string. I'd consider both insignificant though.
I think one more point is far more important though: I'm reasonably certain mine doesn't use/invoke/depend upon undefined behavior like the original does. I suppose some people might consider that justified if it provided a huge advantage in another area, but given that it's roughly tied or inferior in the other areas, I can't imagine why anybody would consider it (even close to) justified in this case.
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
int contains(char const *input, char val) {
while (*input != val && *input != '\0')
++input;
return *input == val;
}
void rev_words(char *dest, size_t max_len, char const *input, char const *delims) {
char const *end = input + strlen(input);
char const *start;
char const *pos;
do {
for (; end>input && contains(delims, end[-1]); --end);
for (start=end; start>input && !contains(delims,start[-1]); --start);
for (pos=start; pos<end && max_len>1; --max_len)
*dest++=*pos++;
if (max_len > 1) { --max_len; *dest++ = ' '; }
end=start;
} while (max_len > 1 && start > input);
*dest++ = '\0';
}
int main(){
char reversed[100];
rev_words(reversed, sizeof(reversed), "This is an\tinput\nstring with\tseveral words in\n it.", " \t\n.");
printf("%s\n", reversed);
return 0;
}
Edit: The:
if (max_len > 1) { --max_len; *dest++ = ' '; }
should really be:
if (max_len > 1 && end-start > 0) { --max_len; *dest++ = ' '; }
If you want to allow for max_len < 1, you can change:
*dest++ = '\0';
to:
if (max_len > 0) *dest++ = '\0';
If the buffer length could somehow be set via input from a (possibly hostile) user, that would probably be worthwhile. For many purposes it's sufficient to simply require a positive buffer size.