Anything wrong with this trim() method in C

This method trims a string in C by deleting spaces from the beginning and end.
BTW, I'm not that good with C and am actually facing some issues dealing with strings.
char* trim(char s[])
{
    int i, j;
    int size = strlen(s);
    //index i will point to first char, while j will point to the last char
    for(i = 0; s[i] == ' '; i++);
    for(j = size - 1; s[j] == ' '; j--);
    if(i > 0)
        s[i - 1] = '\0';
    if(j < size - 1)
        s[j + 1] = '\0';
    s = &s[i];
    return s;
}

This loop
for(j = size - 1; s[j] == ' '; j--);
will access an out-of-bounds index when:
the input string consists entirely of spaces (e.g., " "), in which case there is nothing stopping j from reaching -1 and beyond, or
the input string is the empty string (""), where j starts at -1.
You need to guard against this in some way:
for (j = size - 1; j >= 0 && s[j] == ' '; j--);
The other thing to consider is that you are both: modifying the contents of the original string buffer, and potentially returning an address that does not match the beginning of the buffer.
This is somewhat awkward, as the user must take care to keep the original value of s, as its memory contains the trimmed string (and might be the base pointer for allocated memory), but can only safely access the trimmed string from the returned pointer.
Some things to consider:
moving the trimmed string to the start of the original buffer.
returning a new, dynamically allocated string.
Some people have warned that you cannot pass a string literal to this function, which is true, but passing a string literal to a non-const argument is a terrible idea, in general.
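As a rough illustration of the first suggestion (moving the trimmed string to the start of the original buffer), here is a minimal sketch. The name trim_in_place is hypothetical and not from the question; it assumes the argument is a writable, null-terminated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: trims leading and trailing spaces in place and
   keeps the result at the start of the caller's buffer. */
char *trim_in_place(char *s)
{
    assert(s != NULL);

    size_t len = strlen(s);
    size_t start = 0;

    /* Skip leading spaces; the loop stops at the terminator at the latest. */
    while (s[start] == ' ')
        start++;

    /* Walk back over trailing spaces without going below start. */
    size_t end = len;
    while (end > start && s[end - 1] == ' ')
        end--;

    /* Shift the kept characters to the front; memmove handles overlap. */
    memmove(s, s + start, end - start);
    s[end - start] = '\0';
    return s;
}
```

Because the result always begins at s, the caller can keep using (and later free) the original pointer, and the all-spaces and empty-string cases need no special handling.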

Related

Two null termination signs at the end of a string in C

I'm learning C via "The C Programming Language" book. During one of the exercises, where it's needed to concatenate two strings, I found that there are two null terminating signs (\0) at the end of a resulting string. Is this normal?
Code for the function:
void
copy_str_to_end(char *target, char *destination)
{
    while (*destination != '\0')
        ++destination;
    while ((*destination++ = *target++) != '\0')
        ;
}
Output:
This is a destination. This is a target.
Here everything seems to be OK, but if I run this function to test:
void
prcesc(char *arr)
{
    int i;
    for (i = 0; i <= strlen(arr) + 1; i++)
        if (arr[i] != '\0')
            printf("%c", arr[i]);
        else
            printf("E");
    printf("\n");
}
The problem becomes visible: This is a destination. This is a target.EE (E means \0)
So, should I worry about this or not? And if yes, what's the reason for this thing to happen?
The problem is basically caused by the use of the <= operator instead of the < operator inside of the for loop condition:
i <= strlen(arr) + 1
strlen(arr) + 1 is the number of characters the string occupies, including its terminating '\0', so the valid indices run from 0 to strlen(arr).
When you use i <= strlen(arr) + 1, the loop iterates one time more than expected, and at the last iteration you access the element after the terminator with
if (arr[i] != '\0')
since index counting starts at 0, not 1.
If that element lies beyond the bounds of the array, accessing it invokes undefined behavior; if it is still inside a larger buffer, it holds whatever happens to be stored there.
The extra E in the output appears because the for loop in the prcesc function also runs for i = strlen(arr) + 1. strlen returns the length of the string, say n. So arr[n-1] is the last character of the string and arr[n] is the terminating '\0'. arr[n+1] is past the end of the string; its value is not guaranteed, and here it happened to be '\0' as well.
Because you are iterating over both arr[n] and arr[n+1], you are getting two null characters.
The following function is what you need:
void
prcesc(char *arr)
{
    int i;
    for (i = 0; i <= strlen(arr); i++)
        if (arr[i] != '\0')
            printf("%c", arr[i]);
        else
            printf("E");
    printf("\n");
}
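To make the index arithmetic easier to check, here is a small sketch that writes the same rendering into a buffer instead of printing it. The name show_terminator is hypothetical, and the caller is assumed to provide room for strlen(arr) + 2 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies arr into out, writing 'E' in place of the
   single terminating '\0'. out must have room for strlen(arr) + 2 bytes. */
void show_terminator(const char *arr, char *out)
{
    assert(arr != NULL && out != NULL);

    size_t n = strlen(arr);            /* characters before the terminator */
    for (size_t i = 0; i <= n; i++)    /* valid indices are 0..n */
        out[i] = (arr[i] != '\0') ? arr[i] : 'E';
    out[n + 1] = '\0';
}
```

A string of length n therefore produces exactly n characters followed by a single E, since index n is the last one that belongs to the string.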

Writing a function to split a string

I'm trying to write a function to split a string (not use strtok) to learn how it works. I've come up with the following so far:
char ** split_string(char * string, char sep) {
    // Allow single separators only for now
    // get length of the split string array
    int array_length = 0;
    char c;
    for (int i=0; (c=string[i]) != 0; i++)
        if (c == sep) array_length ++;
    // allocate the array
    char * array[array_length + 1];
    array[array_length] = '\0';
    // add the strings to the array
    for (int i=0, word=0; (c=string[i]) != 0;) {
        if (c == sep) {
            i=0;
            word ++;
        } else {
            array[i][word] = c;
            i++;
        }
    }
    return array;
}
This is my first time working with a pointer to a pointer (a list of strings), so I'm a bit unclear how to do this, as you can probably tell from the above function.
How would this be properly done? Specifically, is the return type correct? How would you add the \0 to the end of the array?
The main mistake is that you never allocate space for the words to be copied. You must explicitly allocate space for each word in the destination array before copying. Returning array is also invalid, because a local array ceases to exist when the function returns; the array of pointers has to be allocated on the heap as well. The following program achieves what's intended. To know the number of words, array_length is declared as a global variable, so that you can use it in the function that called split_string.
#include <stdlib.h>
#include <string.h>

int array_length=0; // counts the separators; there are at most array_length+1 words
char** split_string(char* str, char sep){
    array_length = 0; // reset in case the function is called more than once
    for(int i = 0;str[i] != '\0';++i){
        if(str[i] == sep) ++array_length;
    }
    char** str_arr = (char**)malloc(sizeof(char*) * (array_length+1));
    for(int i=0, j, k = 0; str[i] != '\0'; ++k){ // k is used to index the destination array for the extracted word
        for(j = i; str[j] != sep && str[j] != '\0'; ++j); // from the current character, find the position of the next separator
        str_arr[k] = (char*)malloc((j-i+2)*sizeof(char)); // Allocate as many chars in the heap and make str_arr[k] pointer point to it
        strncpy(str_arr[k], str+i, j-i); // copy the word to the allocated space
        str_arr[k][j-i] = '\0'; // strncpy does not add a terminator here, so add it explicitly
        i = (str[j] == sep) ? j+1 : j; // skip the separator, but never step past the terminating '\0'
    }
    return str_arr;
}
If you don't want to use malloc explicitly, you can also use the POSIX library function strndup, which takes a pointer to the first character of the source and the number of characters to copy, allocates the memory, copies the word with a terminating '\0', and returns a pointer to the allocated space. So these two lines in the function
str_arr[k] = (char*)malloc((j-i+2)*sizeof(char)); // Allocate as many chars in the heap and make str_arr[k] pointer point to it
strncpy(str_arr[k], str+i, j-i);
can be replaced by a single line:
str_arr[k] = strndup(str+i, j-i);
But I would recommend using the first method as a beginner for better understanding and debugging.
Note: The above program works only for single delimiter between words, if there is an occurrence of multiple consecutive delimiters between words, you will have to tweak the program a bit in order to get it working.
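One possible tweak for consecutive delimiters, which also answers the question about marking the end of the array, is sketched below. The names split_words and free_words are hypothetical. The sketch assumes empty fields should be skipped and that the caller wants a NULL pointer as the end marker instead of a global count.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits str on sep, skipping empty fields, and returns
   a NULL-terminated array of newly allocated words (or NULL on allocation
   failure). The caller releases it with free_words(). */
char **split_words(const char *str, char sep)
{
    assert(str != NULL && sep != '\0');

    /* Upper bound on the word count: one more than the number of separators. */
    size_t max_words = 1;
    for (const char *p = str; *p != '\0'; p++)
        if (*p == sep)
            max_words++;

    char **words = malloc((max_words + 1) * sizeof *words);
    if (words == NULL)
        return NULL;

    size_t count = 0;
    const char *p = str;
    while (*p != '\0') {
        while (*p == sep)                 /* skip a run of separators */
            p++;
        const char *start = p;
        while (*p != '\0' && *p != sep)   /* find the end of the word */
            p++;
        if (p == start)                   /* only separators were left */
            break;

        size_t len = (size_t)(p - start);
        char *word = malloc(len + 1);
        if (word == NULL) {               /* undo everything on failure */
            for (size_t i = 0; i < count; i++)
                free(words[i]);
            free(words);
            return NULL;
        }
        memcpy(word, start, len);
        word[len] = '\0';
        words[count++] = word;
    }
    words[count] = NULL;                  /* end marker for the caller */
    return words;
}

/* Frees an array returned by split_words(). */
void free_words(char **words)
{
    if (words == NULL)
        return;
    for (size_t i = 0; words[i] != NULL; i++)
        free(words[i]);
    free(words);
}
```

A caller can then loop with for (size_t i = 0; words[i] != NULL; i++) and does not need to know the word count in advance.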

2D string array is storing '\0' when it encounters a word with more than one space or digit

I am pretty new to C programming. My program is supposed to take a string and move it into a 2D array, with the words separated by either white-space or a digit. This works perfectly fine if there is one space or digit separating them. However, as soon as there is more than one, it starts adding '\0' to my array.
//Move the string into a 2D array
for(i = 0; i < total + 1; i++)
{
    if(isalpha( *(tempString + i) ))
    {
        sortingArray[n][j++] = tempString[i];
        input++;
    }
    else
    {
        sortingArray[n][j++] = '\0';
        n++;
        j = 0;
    }
    if(tempString[i] == '\0')
        break;
}
This is a sample of what happens (n = number of rows placed)
./a.out "one more way"
5 inputs
before
one
more
way
After
one
more
way
You need to skip consecutive delimiters:
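for(i = 0; i < total; i++)
{
    if(isalpha((unsigned char)tempString[i]))
    {
        sortingArray[n][j] = tempString[i];
        ++j;
        ++input;
    }
    else
    {
        // skip consecutive delimiters, stopping on the last one
        // so that the loop's i++ lands on the next letter
        while (i + 1 < total && !isalpha((unsigned char)tempString[i + 1]))
            ++i;
        if (j > 0) // only close a word if one was started
        {
            sortingArray[n][j] = '\0';
            ++n;
            j = 0;
        }
    }
}
// the loop stops before the terminating '\0', so close the last word here
if (j > 0)
{
    sortingArray[n][j] = '\0';
    ++n;
}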
Disclaimer: not verified by a compiler. Use caution!
I also took the liberty of making some improvements to your original code:
There is no point in checking for \0 if you have the length of the string.
Changed *(tempString + i) to the clearer tempString[i].
Moved the increments out of the larger expressions into their own full expressions. It is clearer this way.
It's a simple logic failure of the kind a debugger is ideal for identifying.
Imagine you have the string "hello  world", with two spaces between the words.
It stores "hello" into sortingArray[0] easily enough. When it gets to the first space it terminates the row, increments n and starts looking for the next word. But the next character it finds is another space, so it increments n again, leaving a row that contains only '\0'.
A slight change is required to your logic:
if(isalpha( *(tempString + i) ))
{
    sortingArray[n][j++] = tempString[i];
    input++;
}
else if(j>0)
{
    sortingArray[n][j++] = '\0';
    n++;
    j = 0;
}
Now the code will only increment n if the previous character was a letter (by virtue of j being more than 0). Otherwise it doesn't care and keeps going.
You should also check to see if j is non-zero after the loop as that means there is a new entry in sortingArray that needs a NUL added.
One thing also to note is that the way you're doing the for loop is a little odd. You have this
for(i = 0; i < total + 1; i++)
but also this inside the loop
if(tempString[i] == '\0')
break;
Typically, the way to terminate the for loop would be to write it like this
for(i = 0; tempString[i]!='\0'; i++)
as that way you don't need to know the length of the string, and the loop finishes when it hits the NUL character.
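Putting these suggestions together (the j > 0 check, the check after the loop, and looping until the NUL character), a self-contained sketch might look like the following. The name fill_rows and the sizes MAX_WORDS and MAX_LEN are assumptions for illustration, not values from the original program.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_WORDS 16   /* assumed capacity, not from the original program */
#define MAX_LEN   32   /* assumed maximum word length including '\0' */

/* Hypothetical helper: copies each run of letters in text into its own row
   of rows and returns the number of rows filled. Anything that is not a
   letter acts as a delimiter; runs of delimiters produce no empty rows. */
size_t fill_rows(const char *text, char rows[MAX_WORDS][MAX_LEN])
{
    assert(text != NULL && rows != NULL);

    size_t n = 0;   /* current row */
    size_t j = 0;   /* next free column in the current row */

    for (size_t i = 0; text[i] != '\0' && n < MAX_WORDS; i++) {
        if (isalpha((unsigned char)text[i])) {
            if (j < MAX_LEN - 1)          /* drop letters that do not fit */
                rows[n][j++] = text[i];
        } else if (j > 0) {               /* a word just ended */
            rows[n][j] = '\0';
            n++;
            j = 0;
        }
    }
    if (j > 0) {                          /* close the last word */
        rows[n][j] = '\0';
        n++;
    }
    return n;
}
```

The cast to unsigned char is there because passing a negative char value to isalpha is undefined behavior.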

Rearranging string letters

I was writing a program to copy all the words of a string except its first 2 words and put an x at the end.
However, I can't put the x at the end. Please help!
Below is my code.
#include<stdio.h>
#include<string.h>
int main()
{
    char a[25], b[25];
    int i, j, count = 0, l, k;
    scanf("%[^\n]s", a);
    i = strlen(a);
    if (i > 20)
        printf("Given Sentence is too long.");
    else
    {/* checking for first 2 words and counting 2 spaces*/
        for (j = 0; j < i; j++)
        {
            if (a[j] == ' ')
                count = count + 1;
            if (count == 2)
            {
                k = j;
                break;
            }
        }
        /* copying remaining string into new one*/
        for (j = 0; j < i - k; j++)
        {
            b[j] = a[j + k];
        }
        b[j + 1] = 'x';
        printf("%s", b);
    }
}
If you want to remove the first two indexes (characters) rather than the first two words, the offset should be the constant 2, not k = j, because j only holds the position where the search loop stopped. After the copy loop, b[j] is the first free slot, so the x goes there, followed by the terminator. See the code below:
/* copying remaining string into new one*/
for (j = 0; j < i - 2; j++)
{
    b[j] = a[j + 2];
}
b[j] = 'x';
b[j + 1] = '\0';
printf("%s", b);
Your index is off by one. After your second loop, the condition j < i-k was false, so j now is i-k. Therefore, the character after the end of what you copied is b[j], not b[j+1]. The correct line would therefore be b[j] = 'x';.
Just changing this would leave you with something that is not a string. A string is defined as a sequence of char, ending with a '\0' char. So you have to add b[j+1] = 0; as well.
After these changes, your code does what you intended, but still has undefined behavior.
One problem is that your scanf() will happily overflow your buffer -- use a field width here: scanf("%24[^\n]", a);. And by the way, the s at the end doesn't make any sense; you use either the s conversion or the [] conversion.
A somewhat sensible implementation would use functions suited for the job, like e.g. this:
#include<stdio.h>
#include<string.h>
int main(void)
{
    // memory is *cheap* nowadays, these buffers are still somewhat tiny:
    char a[256];
    char b[256];
    // read a line
    if (!fgets(a, 256, stdin)) return 1;
    // and strip the newline character if present
    a[strcspn(a, "\n")] = 0;
    // find first space
    char *space = strchr(a, ' ');
    // find second space
    if (space) space = strchr(space+1, ' ');
    if (space)
    {
        // have two spaces, copy the rest
        strcpy(b, space+1);
        // and append 'x':
        strcat(b, "x");
    }
    else
    {
        // empty string:
        b[0] = 0;
    }
    printf("%s",b);
    return 0;
}
For functions you don't know, google for man <function>.
In C, strings are arrays of chars, as you know, and C recognizes the end of a string by the '\0' character. In your example the problem is in the last few lines:
/* copying remaining string into new one*/
for(j=0;j<i-k;j++)
{
    b[j]=a[j+k];
}
b[j+1]='x';
printf("%s",b);
After the loop ends, j has already been incremented one past the last copied character, so b[j] is the first free slot.
Suppose the string before x is "test": the loop fills b[0] to b[3] with 't', 'e', 's', 't' and leaves j at 4. Writing to b[j+1] skips b[4], which stays uninitialized; if it happens to hold '\0', the string ends there and your x is never printed, because characters after '\0' don't matter. Simply change it to the following, which also terminates the string:
b[j]='x';
b[j+1]='\0';
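The index reasoning in these answers can be checked with a small function that keeps the manual copy loop but places the x and the terminator at the right positions. The name drop_two_words and its return convention are hypothetical, chosen only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies everything after the second space of src into
   dst and appends 'x'. Returns 0 on success, -1 if src has fewer than two
   spaces or dst (dst_size bytes) is too small. */
int drop_two_words(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);

    const char *p = strchr(src, ' ');     /* first space */
    if (p != NULL)
        p = strchr(p + 1, ' ');           /* second space */
    if (p == NULL)
        return -1;
    p++;                                  /* first character to keep */

    size_t len = strlen(p);
    if (len + 2 > dst_size)               /* rest + 'x' + '\0' */
        return -1;

    size_t j;
    for (j = 0; j < len; j++)
        dst[j] = p[j];
    /* After the loop j == len, so dst[j] is the first free slot. */
    dst[j] = 'x';
    dst[j + 1] = '\0';
    return 0;
}
```

Returning an error for inputs with fewer than two spaces also avoids using an unset offset, which is what happens to k in the original program for such inputs.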

Garbage after string

I've written code to copy a string into another string but with a space between each character. When I run the code there is "garbage" after the string. However, if the for loop at the end is uncommented, there is no garbage after. Anyone know why this is happening?
#include<stdio.h>
#include<string.h>
#define MAX_SIZE 20
main ()
{
    char name[MAX_SIZE+ 1];
    char cpy[(MAX_SIZE * 2) + 1];
    gets(name);
    int i = 0;
    while (name[i] != '\0' && i < MAX_SIZE)
    {
        cpy[(i * 2)] = name[i];
        cpy[(i * 2) + 1] = ' ';
        i++;
    }
    cpy[strlen(cpy)] = '\0';
    printf("%s\n", cpy);
    //for (i = 0; i < strlen(cpy); ++i) {
    //    printf("%c", cpy[i]);
    //}
}
The line
cpy[strlen(cpy)] = '\0';
won't work since cpy isn't null terminated, so strlen will read beyond the end of cpy until it either crashes or finds a zero byte of memory. You can fix this by changing that line to
cpy[i*2] = '\0';
If uncommenting the for loop at the end of your function appears to fix things, I can only guess that i gets reset to 0 before your printf call, meaning that printf finds a null terminator on the stack immediately after cpy. If this is what's happening, it's very much undefined behaviour, so it cannot be relied upon.
while (name[i] != '\0' && i < MAX_SIZE)
{
    cpy[(i * 2)] = name[i];
    cpy[(i * 2) + 1] = ' ';
    i++;
}
cpy[(i * 2)] = 0x0;
You have to null terminate the string.
Because you know that you are working with a string, it's a good idea to initialize your cpy array with null characters:
char cpy[(MAX_SIZE * 2) + 1] = "\0";
Otherwise, I agree with simonc's answer.
For the sake of completeness:
char* pcpy = cpy;
for (char const* p = fgets(name,sizeof(name)/sizeof(*name),stdin); p && *p; ++p) {
    *pcpy++ = *p;
    *pcpy++ = ' ';
}
*pcpy = 0;
You should use fgets, not gets, to prevent your stack from being corrupted by data overruns. Note that fgets keeps the trailing newline, so the loop above copies it as well. Second, you must manually terminate the string stored in the cpy array, since strlen simply counts the number of characters until the very first zero. Hence, if you haven't already terminated cpy, the behavior of strlen(cpy) is undefined and it may well crash your program.
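The same idea can be wrapped in a function that takes the destination size and always terminates the result at a known index. This is only a sketch; the name space_out and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes src into dst with a space after every
   character and always terminates dst. Returns the number of characters of
   src that were copied; dst_size is the capacity of dst in bytes. */
size_t space_out(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL && dst_size > 0);

    size_t i = 0;
    /* Each source character needs two bytes, plus one for the terminator. */
    while (src[i] != '\0' && 2 * i + 2 < dst_size) {
        dst[2 * i] = src[i];
        dst[2 * i + 1] = ' ';
        i++;
    }
    dst[2 * i] = '\0';   /* terminate at the known position; no strlen needed */
    return i;
}
```

Since the terminator is written at index 2 * i, which the loop already tracks, there is no need to call strlen on a buffer that is not yet a valid string.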

Resources