Why does assigning int(0) to a string index end the loop? - c

This is a puzzle I encountered.
I get no errors when I compile it:
$gcc -Wall -Wextra -pedantic -std=c99
Can you please explain why this happens in C?
Expected:
Print the index and the character at each str[i], replacing every space with 0.
Output:
When I assign 0 to str[i], it acts like a break for the for loop.
char str[] = "this is only a test";
for(int i = 0; i < (int)strlen(str); i++) {
printf("str[%d] = %c\n", i, str[i]);
if(str[i] == ' ') {
str[i] = 0;
}
}

Your program works as written (assuming you have included string.h and stdio.h). When it finds a space character ' ', it sets str[i] to \0. The loop condition calls strlen again before the next iteration, and strlen now stops at that new terminator and returns a smaller length (4 here), so i < (int)strlen(str) is false and the loop terminates.
Here is the output on GCC without any warning/error:
str[0] = t
str[1] = h
str[2] = i
str[3] = s
str[4] =
The only catch is that it will not print the entire string: once a space has been replaced with a null terminator, the strlen condition fails on the next check and the loop exits.
To print the entire string, store the length in a variable once and use that variable in the condition:
size_t length = strlen(str);
for(size_t i = 0; i < length; i++)
With i declared as size_t (and printed with %zu), you also don't need to cast strlen's return value to int.

Since you're modifying the string and calling strlen on it repeatedly, you're getting into trouble here. Remember that the for termination condition is evaluated each time through the loop, not once.
To fix this:
size_t l = strlen(str); // Save the length once
for (size_t i = 0; i < l; i++) {
// ...
}
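A minimal sketch of the saved-length approach applied to the loop from the question. The name split_at_spaces is hypothetical (it does not appear in the original post), and the sketch assumes s is a modifiable, null-terminated array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: replaces every space in s with '\0' and returns
 * how many spaces were replaced. The length is read once, before the
 * string is modified, so the loop still visits every original character. */
size_t split_at_spaces(char *s)
{
    size_t len = strlen(s);   /* saved before any '\0' is written */
    size_t replaced = 0;

    for (size_t i = 0; i < len; i++) {
        if (s[i] == ' ') {
            s[i] = '\0';
            replaced++;
        }
    }
    return replaced;
}
```

Called on char str[] = "this is only a test";, this replaces all four spaces; afterwards str reads as "this" and str + 5 as "is".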

Related

Can't set second element of an empty string in C

I have this code, which doubles every letter in nabucco:
#include<stdio.h>
#include<string.h>
int main(){
char string1[20] = "nabucco";
char string2[40];
int j=-1;
for(int i=0; i<strlen(string1);i++){
j++;
string2[j] = string1[i];
j++;
string2[j] = string1[i];
}
printf("string1=\t%s\n",string1);
printf("string2=\t%s\n",string2);
}
string2 is set to and prints nnaabbuuccccoo.
However, when I try to set j=0; string2 prints (and is presumably set to) nothing.
If j=0 before going into the loop, then j is incremented to 1 in the for loop, which starts off setting the second element of the empty string2. Is there something behind the scenes preventing any element other than the first from being set? I do not understand what is causing this behavior.
The program as posted has undefined behavior since you don't put a null terminator in string2.
If you start with j=0, you never set the value of string2[0] since you increase j before using it, so that position remains indeterminate.
Possible fix:
#include<stdio.h>
#include<string.h>
int main(){
char string1[20] = "nabucco";
char string2[40];
size_t j=0;
for(size_t i=0, len = strlen(string1); i<len; i++) {
string2[j++] = string1[i]; // increase j "after" using the current value
string2[j++] = string1[i];
}
string2[j] = '\0'; // add null terminator
printf("string1=\t%s\n",string1);
printf("string2=\t%s\n",string2);
}
An alternative could be to initialize string2 when you create it:
#include<stdio.h>
#include<string.h>
int main(){
char string1[20] = "nabucco";
char string2[40] = {0}; // zero initialized
for(size_t i=0, j=0, len = strlen(string1); i<len; i++) {
string2[j++] = string1[i];
string2[j++] = string1[i];
}
printf("string1=\t%s\n",string1);
printf("string2=\t%s\n",string2);
}
You increment the counter before using it the first time, so if you set j=0 you do j++ before using it first, which means you start with string2[1].
You can change it like so:
int j=0;
for(int i=0; i<strlen(string1);i++){
string2[j] = string1[i];
j++;
string2[j] = string1[i];
j++;
}
string2[j] = '\0'; // terminate the string before printing it
When you do this
printf("string2=\t%s\n",string2);
and string2 was only filled in from the second character onward, the behaviour is undefined. string2[0] was never initialised and could contain anything (from sensitive data to something that other functions can't parse; here it is a single character, but in other circumstances it could be more than that, for example if you started with j=20). This is why undefined behaviour is to be avoided at all costs.
In this case, the quirks of platform, compiler and process happened to leave a zero in the first character. Since C strings are zero-terminated and string2 begins with a zero, printf prints nothing. If you printed something like this, where "." stands for a 0 character (ASCII 0),
"nnaa.bbuuccccoo."
you would get only "nnaa". By the same token, printf'ing ".nnaabbuuccccoo" gets you nothing.
If you were to print string2 starting at the second character, you'd get different results:
printf("string2=\t%s\n",string2 + 1);
After the for loop, the character array string2 does not contain a string because you forgot to append the terminating zero character '\0'. So this call of printf
printf("string2=\t%s\n",string2);
invokes undefined behavior.
The simplest way to resolve the problem is to initialize the array in its declaration. For example
char string2[40] = "";
As for this statement:
"However, when I try to set j=0; string2 prints (and is presumably set to) nothing."
Within the for loop, the variable j is incremented immediately, before its first use.
int j = 0;
for(int i=0; i<strlen(string1);i++){
j++;
string2[j] = string1[i];
//...
So the first character of the array string2 is not set.
In this case it is better to write the for loop the following way
size_t j = 0;
for ( size_t i = 0, n = strlen( string1 ); i < n; i++ ){
string2[j++] = string1[i];
string2[j++] = string1[i];
}
string2[j] = '\0';
This is because strings in C are null terminated. When you set j=0 outside the loop, you skip setting the first character, which stays uninitialized; here it happened to be null, so printf sees an empty string.
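The answers above agree on two points: advance j only after each write, and terminate the destination explicitly. A minimal sketch that packages both, with a bounds check added as an extra assumption; double_chars and its parameters are hypothetical names, not from the original posts.

```c
#include <stddef.h>

/* Hypothetical helper: writes every character of src twice into dst and
 * always null-terminates dst. Stops early if another pair plus the
 * terminator would not fit in dst_size bytes. Returns the number of
 * characters written, not counting the terminator. */
size_t double_chars(const char *src, char *dst, size_t dst_size)
{
    size_t j = 0;

    if (dst_size == 0)
        return 0;                 /* no room even for the terminator */

    for (size_t i = 0; src[i] != '\0' && j + 2 < dst_size; i++) {
        dst[j++] = src[i];        /* use j, then advance it */
        dst[j++] = src[i];
    }
    dst[j] = '\0';                /* dst is now a valid C string */
    return j;
}
```

With char string2[40]; a call such as double_chars("nabucco", string2, sizeof string2) leaves "nnaabbuuccccoo" in string2 without relying on the array's initial contents.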

Why does my empty character array start with a length of 6?

When I print out the length of the temp string, it starts at a random number. The goal of this for loop is to filter out everything that's not a letter, and it works for the most part, but when I print out the filtered string it returns the filtered string but with some extra random characters before and after the string.
#define yes 1000
...
char stringed[yes] = "teststring";
int len = strlen(stringed);
char filt[yes];
for (int i = 0; i < len; i++) {
if (isalpha(stringed[i])) {
filt[strlen(filt)] = tolower(stringed[i]);
}
}
There are at least two problems with the line:
temp[strlen(temp)] = "\0";
The compiler should be shrieking about converting a pointer to an integer. You need '\0' and not "\0". (This might account for some of the odd characters; the least-significant byte of the address is probably stored over the null byte, making it and random other characters visible until the string printing comes across another null byte somewhere.)
With that fixed, the code carefully writes a null byte over the null byte that marks the end of the string.
You should probably not be using strlen() at this point (or at a number of other points where you use it in the loop).
You should be using i more in the loop. If your goal is to eliminate non-alpha characters, you probably need two indexes, one for 'next character to check' and one for 'next position to overwrite'. After the loop, you need to write over the 'next position to overwrite' with the null byte.
int j = 0; // Next position to overwrite
for (int i = 0; i < length; i++)
{
if (isalpha(text[i]))
temp[j++] = text[i];
}
temp[j] = '\0';
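The two-index idea can be sketched as a standalone function. The name filter_letters and the dst_size parameter are assumptions made for illustration; the lowercase conversion follows the tolower call in the question.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies only the letters of src into dst, converted
 * to lowercase, and null-terminates dst. i is the next character to check,
 * j is the next position to overwrite. Returns the number of letters kept. */
size_t filter_letters(const char *src, char *dst, size_t dst_size)
{
    size_t j = 0;

    if (dst_size == 0)
        return 0;

    for (size_t i = 0; src[i] != '\0' && j + 1 < dst_size; i++) {
        /* <ctype.h> functions expect an unsigned char value (or EOF) */
        unsigned char c = (unsigned char)src[i];
        if (isalpha(c))
            dst[j++] = (char)tolower(c);
    }
    dst[j] = '\0';   /* written once, after the loop */
    return j;
}
```

Because the write position is tracked in j, strlen is never called on the destination, so its initial contents do not matter.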
For starters the character array
char temp[MAX];
is not initialized. It has indeterminate values.
So these statements
printf("NUM:[%i] CHAR:[%c] TEMP:[%c] TEMPSTRLEN:[%i]\n", i, text[i], temp[strlen(temp)], strlen(temp));
temp[strlen(temp)] = tolower(text[i]);
have undefined behavior because you may not apply the standard function strlen to an uninitialized character array.
This statement
temp[strlen(temp)] = "\0";
is also invalid.
On the right side of the assignment statement is the string literal "\0", which is implicitly converted to a pointer to its first character, so a pointer is being assigned to a char.
So these statements
length = strlen(temp);
printf("[%s]\n", temp);
do not make sense.
It seems what you mean is the following
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define MAX 1000
int main(void)
{
char text[MAX] = "teststring";
size_t length = strlen(text);
char temp[MAX] = { '\0' };
// or
//char temp[MAX] = "";
for ( size_t i = 0; i < length; i++)
{
if (isalpha( ( unsigned char )text[i] ) )
{
printf("NUM:[%zu] CHAR:[%c] TEMP:[%c] TEMPSTRLEN:[%zu]\n", i, text[i], temp[strlen(temp)], strlen(temp));
temp[strlen(temp)] = tolower( ( unsigned char )text[i] );
temp[i+1] = '\0';
}
}
length = strlen(temp);
printf( "[%s]\n", temp );
return 0;
}
The program output is
NUM:[0] CHAR:[t] TEMP:[] TEMPSTRLEN:[0]
NUM:[1] CHAR:[e] TEMP:[] TEMPSTRLEN:[1]
NUM:[2] CHAR:[s] TEMP:[] TEMPSTRLEN:[2]
NUM:[3] CHAR:[t] TEMP:[] TEMPSTRLEN:[3]
NUM:[4] CHAR:[s] TEMP:[] TEMPSTRLEN:[4]
NUM:[5] CHAR:[t] TEMP:[] TEMPSTRLEN:[5]
NUM:[6] CHAR:[r] TEMP:[] TEMPSTRLEN:[6]
NUM:[7] CHAR:[i] TEMP:[] TEMPSTRLEN:[7]
NUM:[8] CHAR:[n] TEMP:[] TEMPSTRLEN:[8]
NUM:[9] CHAR:[g] TEMP:[] TEMPSTRLEN:[9]
[teststring]
Edit: next time, do not change your question so drastically, because this can confuse readers of the question.

printf() prints whole char matrix

As marked in the code, the first printf() correctly prints only the i-th line of the matrix. But outside the loop, both printf() and strcat() act on the whole matrix from the i-th line on, as if it were a single-line string. This means that
printf("%s\n",m_cfr[0])
will print the whole matrix, while m_cfr[i] will print the matrix from the i-th line on. char* string is a single-line string with no spaces.
trasp(char* string)
{
int row = strlen(string) / 5;
char m[row][5];
char m_cfr[row][5];
char cfr[row*5];
memset(cfr, 0, row * 5);
int key[5] = {3, 1, 2, 0, 4};
int k = 0;
for (i = 0 ; i < row ; i++)
{
strncpy(m[i], string + k, 5);
m[i][5] = '\0';
k += 5;
}
for (i = 0 ; i < row ; i++)
{
for (j = 0 ; j < 5 ; j++)
{
m_cfr[i][key[j]] = m[i][j];
}
m_cfr[i][5] = '\0';
printf("%s\n", m_cfr[i]); //--->prints only line i
}
printf("%s\n", m_cfr[0]); //prints whole matrix
strcat(cfr, m_cfr[0]); //concatenates whole matrix
printf("%s\n", cfr);
}
In your code, your array definition is
char m_cfr[row][5];
while you're accessing
m_cfr[i][5] = '\0';
/* ^
|
there is no 6th element
*/
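You're facing an off-by-one error. Out-of-bounds memory access causes undefined behaviour. In practice, for every row but the last, m_cfr[i][5] is the same address as m_cfr[i+1][0], so the terminator is overwritten as soon as the next row is filled in; that is why the rows run together once the loop has finished.
Maybe you want to change the null-terminating statement to
m_cfr[i][4] = '\0'; //last one is null
Note that this overwrites the fifth character of each row; declaring the rows as char m_cfr[row][6] avoids that.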
%s expects a char* and prints everything until it encounters a \0. So,
printf("%s\n", m_cfr[i]);
printf("%s\n",m_cfr[0]);
strcat(cfr,m_cfr[0]);
all depend on a \0 being present at the end of the row. The argument types are fine: m_cfr[i] is a char[5], which decays to a char* pointing at the first character of row i. The problem is that a row of 5 elements holding 5 characters has no room for a terminator, so reading continues into the rows that follow. Also, as SouravGhosh points out, using
m_cfr[i][5] = '\0';
and
m[i][5] = '\0';
is wrong, because index 5 is out of bounds for a row of 5 elements.
To fix both issues, give every row a sixth element for the terminator:
char m[row][6];
char m_cfr[row][6];
The writes to index 5 are then in bounds and each row is a string of its own. cfr likewise needs row * 5 + 1 elements if all rows are to be concatenated into it.
If you wanted to print just single chars, use %c with one element:
printf("%c\n", m_cfr[i][0]);
Note that m and m_cfr are variable-length arrays, which in C99 and C11 cannot have an initializer such as ={{0}}; use memset if you need them zeroed.
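A minimal sketch of rows that each carry their own terminator, so that %s and strcat stop at the end of one row. The names split_rows and COLS, and the fixed max_rows limit, are assumptions made for illustration rather than part of the original code.

```c
#include <stddef.h>
#include <string.h>

#define COLS 5   /* characters per row, as in the question */

/* Hypothetical helper: copies src into rows of COLS characters each.
 * Every row has COLS + 1 elements, so index COLS is a valid place for
 * the '\0'. Only complete rows are copied. Returns the number of rows. */
size_t split_rows(const char *src, char rows[][COLS + 1], size_t max_rows)
{
    size_t n = strlen(src) / COLS;

    if (n > max_rows)
        n = max_rows;

    for (size_t i = 0; i < n; i++) {
        memcpy(rows[i], src + i * COLS, COLS);
        rows[i][COLS] = '\0';   /* in bounds: each row has COLS + 1 elements */
    }
    return n;
}
```

After char rows[4][COLS + 1]; and split_rows("abcdefghij", rows, 4), printing rows[0] with %s gives only the first five characters, because the terminator is stored inside that row.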

Why is both my input array and output array corrupted when I try to implement a function to reverse the array?

I'm Programming in C, using Linux GCC Compiler. I'm very much new to programming.
I'm confused as to why my in[] char array would be changed at all in the function. Doesn't the code simply count the amount of subscripts in in[] and then copy its contents into out[] but backwards? How is it being changed in the function?
/*reverse in to out*/
void reverse(char in[], char out[]) {
int i, l,b;
b = i = l = 0;
while (in[i] != '\0')
++i;
for(l=i;l > 0; l--) {
in[l] = out[b];
++b;
}
return;
}
Your assignment statement is backwards.
in[l] = out[b];
Means "Assign the value in array out at index b to array in index l". This line should instead be
out[b] = in[l];
And BTW, you don't need an empty return statement, you can simply omit this in a void function.
for(l=i-1;l >= 0; l--) {
out[b] = in[l];
++b;
}
out[b] = '\0';
Four problems fixed: start at i-1, test for l>=0, reverse the assignment, and terminate out with a null character.
Also, a good idea is to use const when a function argument will not be changed. In this case, const char in[] would let you spot the assignment error because the compiler would give you a compile time error.
You can use strlen:
for(l=(int)strlen(in)-1, b=0; l>=0; l--) // start at the last character, not at the '\0'
out[b++] = in[l];
// ensure null terminated so strlen(out) works
out[b] = '\0';
This assumes out is large enough to hold in AND in is null terminated.
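Putting the corrections from these answers together, a minimal sketch of the whole function might look like this. The name reverse_copy is hypothetical; the const qualifier on in follows the suggestion above, and out is assumed to have room for strlen(in) + 1 characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the characters of in into out in reverse
 * order and null-terminates out. in is const, so an accidental
 * assignment to in[...] would be rejected by the compiler. */
void reverse_copy(const char *in, char *out)
{
    size_t len = strlen(in);

    for (size_t b = 0; b < len; b++)
        out[b] = in[len - 1 - b];   /* last character first */

    out[len] = '\0';
}
```

Counting b upward with an unsigned index avoids the l >= 0 test, which would never become false for a size_t.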

Reverse Array of C-Strings

I have a few questions regarding array of strings in C.
I have a char *string, and I split it every 4 characters into an array of strings called sep_str. So for example if char *string = "The sum";, then char **sep_str is:
0: |_| --> "The "
1: |_| --> "sum"
My first question is, in an array of strings in C (so array of array of chars), will there be a null terminating character at the end of each sep_str[i], or just at the last position of sep_str? Here is how I copy string into an array of strings:
for (int i = 0; i < str_length; i++) {
sep_str[i/4][i%4] = *ptr;
ptr++;
}
My second question is, how would I reverse the elements of each string in sep_str? Here's how I did it, but I feel like it is stepping out of the array of the substring. (so out of the element of the sep_str):
// Reverse each element in the array
char temp;
for (int i = 0; i < num_strs; i++) {
for (int j = 0, k = 4; j < k; j++, k--) {
temp = sep_str[i][j];
sep_str[i][j] = sep_str[i][k];
sep_str[i][k] = temp;
}
}
The copying of the strings looks fine to me. Since each full chunk always has 4 chars, you can do without the null terminator \0 as long as you never pass the chunks to string functions. Alternatively, you need to declare sep_str as a 5x(length/4) matrix, to store the \0 char at the end of each string.
To reverse a string you need to iterate from the start to the middle of the string, swapping the i-th char with the (length-i-1)-th. In the inner for loop you need to change k=4 to k=3.
You also need to take care of the last string, since the length might not be a multiple of four.
char temp;
for (int i = 0; i < (num_strs - 1); i++) {
for (int j = 0, k = 3; j < k; j++, k--) {
temp = sep_str[i][j];
sep_str[i][j] = sep_str[i][k];
sep_str[i][k] = temp;
}
}
if (num_strs > 0) {
int last = num_strs - 1; // i is out of scope here, so index the last chunk explicitly
// strlen requires the last chunk to be null terminated
for (int j = 0, k = (int)strlen(sep_str[last]) - 1; j < k; j++, k--) {
temp = sep_str[last][j];
sep_str[last][j] = sep_str[last][k];
sep_str[last][k] = temp;
}
}
In a C string, there will be only one termination character. But if you need to tokenize the strings, then each string must be null terminated.
But before that -
char *string = "The sum"; // should be const char* string = "The sum";
String literal in the above case resides in read only location and cannot be modified. If you need to modify, then
char string[] = "The sum";
If you don't have the terminating character in your strings then yes, you will be outside the bounds of the array since you are accessing sep_str[i][4], which is not a valid location:
sep_str[0][0] = 'T'
sep_str[0][1] = 'h'
sep_str[0][2] = 'e'
sep_str[0][3] = ' '
However, I doubt that you want to have the null character at the beginning of your string, so you need k=3 in your for loop, not k=4.
My first question is, in an array of strings in C (so array of array of chars), will there be a null terminating character at the end of each sep_str[i], or just at the last position of sep_str?
Only at the end, but if you want to treat each individual chunk as its own string, you'll need to add the \0 yourself.
My second question is, how would I reverse the elements of each string in sep_str?
You could do it with pointers...
char temp;
// Point to start of string, `str` will decay to first memory position.
char *start = str;
// Point to the end of the string. You will need to `#include <string.h>`
// for `strlen()`. Otherwise, write a `while` loop that goes until `\0` to find
// the last position.
char *end = &str[strlen(str) - 1];
// Do until we hit the middle of the string.
while (start < end) {
// Need a temp char, no parallel assignment in C.
temp = *start;
// Swap chars. start and end are pointers, so dereference them.
*start++ = *end;
*end-- = temp;
}
Assuming str is your string, and that it is modifiable and not empty.
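If the chunks are kept inside one string rather than in separate arrays, each group of up to four characters can be reversed in place without any per-chunk terminator. A minimal sketch under that assumption; reverse_n and reverse_chunks are hypothetical names, not from the original posts.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reverses exactly n characters starting at s.
 * The length is passed in, so s does not need its own terminator. */
static void reverse_n(char *s, size_t n)
{
    if (n < 2)
        return;   /* nothing to swap; also avoids n - 1 wrapping around */

    for (size_t j = 0, k = n - 1; j < k; j++, k--) {
        char temp = s[j];
        s[j] = s[k];
        s[k] = temp;
    }
}

/* Hypothetical helper: reverses each run of chunk characters in s.
 * The last run may be shorter when strlen(s) is not a multiple of chunk. */
void reverse_chunks(char *s, size_t chunk)
{
    size_t len = strlen(s);

    if (chunk == 0)
        return;

    for (size_t start = 0; start < len; start += chunk) {
        size_t n = (len - start < chunk) ? len - start : chunk;
        reverse_n(s + start, n);
    }
}
```

With char string[] = "The sum"; and reverse_chunks(string, 4), the chunks "The " and "sum" become " ehT" and "mus". The array form is needed here because a string literal reached through char * cannot be modified.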
