Can't set second element of an empty string in C

I have this code, which doubles every letter in nabucco:
#include<stdio.h>
#include<string.h>
int main(){
char string1[20] = "nabucco";
char string2[40];
int j=-1;
for(int i=0; i<strlen(string1);i++){
j++;
string2[j] = string1[i];
j++;
string2[j] = string1[i];
}
printf("string1=\t%s\n",string1);
printf("string2=\t%s\n",string2);
}
string2 is set to and prints nnaabbuuccccoo.
However, when I try to set j=0; string2 prints (and is presumably set to) nothing.
If j=0 before going into the loop, then j is incremented to 1 in the for loop, which starts off setting the second element of the empty string2. Is there something behind the scenes preventing any element other than the first from being set? I do not understand what is causing this behavior.

The program as posted has undefined behavior since you don't put a null terminator in string2.
If you start with j=0, you never set the value of string2[0] since you increase j before using it, so that position remains indeterminate.
Possible fix:
#include<stdio.h>
#include<string.h>
int main(){
char string1[20] = "nabucco";
char string2[40];
size_t j=0;
for(size_t i=0, len = strlen(string1); i<len; i++) {
string2[j++] = string1[i]; // increase j "after" using the current value
string2[j++] = string1[i];
}
string2[j] = '\0'; // add null terminator
printf("string1=\t%s\n",string1);
printf("string2=\t%s\n",string2);
}
An alternative could be to initialize string2 when you create it:
#include<stdio.h>
#include<string.h>
int main(){
char string1[20] = "nabucco";
char string2[40] = {0}; // zero initialized
for(size_t i=0, j=0, len = strlen(string1); i<len; i++) {
string2[j++] = string1[i];
string2[j++] = string1[i];
}
printf("string1=\t%s\n",string1);
printf("string2=\t%s\n",string2);
}

You increment the counter before using it the first time, so if you set j=0 you do j++ before using it first, which means you start with string2[1].
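You can change it like so (and add the null terminator after the loop, since string2 is not initialized):
int j=0;
for(int i=0; i<strlen(string1);i++){
string2[j] = string1[i];
j++;
string2[j] = string1[i];
j++;
}
string2[j] = '\0'; // terminate the string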

When you do this
printf("string2=\t%s\n",string2);
and string2 was only filled from the second character onward, the behaviour is undefined (see also this answer). string2[0] was never initialised and could contain anything, from sensitive data to something that other functions can't parse. Here it is a single character; in other circumstances it could be more than that, for example if you started with j=20. This is why undefined behaviour is to be avoided at all costs.
In this case, the quirks of platform, compiler and process made it so the first character is a zero. Since C strings are zero-terminated, and string2 begins with a zero, printf prints nothing. If you printed something like this, where "." is a 0 character (ASCII 0),
"nnaa.bbuuccoo."
you would get only "nnaa". By the same token, printf'ing ".nnaabbuuccoo" gets you nothing.
If you were to print string2 starting at the second character, you'd get different results:
printf("string2=\t%s\n",string2 + 1);

After the for loop the character array string2 does not contain a string, because the terminating zero character '\0' was never appended to it. So this call of printf
printf("string2=\t%s\n",string2);
invokes undefined behavior.
The simplest approach to resolve the problem is to initialize the array in its declaration. For example
char string2[40] = "";
As for this statement
"However, when I try to set j=0; string2 prints (and is presumably set to) nothing."
within the for loop the variable j is incremented immediately, before it is first used as an index:
int j = 0;
for(int i=0; i<strlen(string1);i++){
j++;
string2[j] = string1[i];
//...
So the first character of the array string2 is not set.
In this case it is better to write the for loop the following way
size_t j = 0;
for ( size_t i = 0, n = strlen( string1 ); i < n; i++ ){
string2[j++] = string1[i];
string2[j++] = string1[i];
}
string2[j] = '\0';

It is because strings in C are null terminated. When you set j=0 outside of the loop, you skip setting the first character. It is left uninitialized and, in your run, happened to be null, so printf sees an empty string.

Related

Why does my empty character array start with a length of 6?

When I print out the length of the temp string, it starts at a random number. The goal of this for loop is to filter out everything that's not a letter, and it works for the most part, but when I print out the filtered string it returns the filtered string but with some extra random characters before and after the string.
#define yes 1000
...
char stringed[yes] = "teststring";
int len = strlen(text);
char filt[yes];
for (int i = 0; i < len; i++) {
if (isalpha(stringed[i])) {
filt[strlen(filt)] = tolower(stringed[i]);
}
}
There are at least two problems with the line:
temp[strlen(temp)] = "\0";
The compiler should be shrieking about converting a pointer to an integer. You need '\0' and not "\0". (This might account for some of the odd characters; the least-significant byte of the address is probably stored over the null byte, making it and random other characters visible until the string printing comes across another null byte somewhere.)
With that fixed, the code carefully writes a null byte over the null byte that marks the end of the string.
You should probably not be using strlen() at this point (or at a number of other points where you use it in the loop).
You should be using i more in the loop. If your goal is to eliminate non-alpha characters, you probably need two indexes, one for 'next character to check' and one for 'next position to overwrite'. After the loop, you need to write over the 'next position to overwrite' with the null byte.
int j = 0; // Next position to overwrite
for (int i = 0; i < length; i++)
{
if (isalpha(text[i]))
temp[j++] = text[i];
}
temp[j] = '\0';
For starters the character array
char temp[MAX];
is not initialized. It has indeterminate values.
So these statements
printf("NUM:[%i] CHAR:[%c] TEMP:[%c] TEMPSTRLEN:[%i]\n", i, text[i], temp[strlen(temp)], strlen(temp));
temp[strlen(temp)] = tolower(text[i]);
have undefined behavior, because you may not apply the standard function strlen to an uninitialized character array.
This statement
temp[strlen(temp)] = "\0";
is also invalid.
On the right side of the assignment, the string literal "\0" is used, which is implicitly converted to a pointer to its first character. Assigning that pointer to a char is not what you want; the character constant '\0' is.
So these statements
length = strlen(temp);
printf("[%s]\n", temp);
do not make sense.
It seems what you mean is the following
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define MAX 1000
int main(void)
{
char text[MAX] = "teststring";
size_t length = strlen(text);
char temp[MAX] = { '\0' };
// or
//char temp[MAX] = "";
for ( size_t i = 0; i < length; i++)
{
if (isalpha( ( unsigned char )text[i] ) )
{
printf("NUM:[%zu] CHAR:[%c] TEMP:[%c] TEMPSTRLEN:[%zu]\n", i, text[i], temp[strlen(temp)], strlen(temp));
temp[strlen(temp)] = tolower(text[i]);
temp[i+1] = '\0';
}
}
length = strlen(temp);
printf( "[%s]\n", temp );
return 0;
}
The program output is
NUM:[0] CHAR:[t] TEMP:[] TEMPSTRLEN:[0]
NUM:[1] CHAR:[e] TEMP:[] TEMPSTRLEN:[1]
NUM:[2] CHAR:[s] TEMP:[] TEMPSTRLEN:[2]
NUM:[3] CHAR:[t] TEMP:[] TEMPSTRLEN:[3]
NUM:[4] CHAR:[s] TEMP:[] TEMPSTRLEN:[4]
NUM:[5] CHAR:[t] TEMP:[] TEMPSTRLEN:[5]
NUM:[6] CHAR:[r] TEMP:[] TEMPSTRLEN:[6]
NUM:[7] CHAR:[i] TEMP:[] TEMPSTRLEN:[7]
NUM:[8] CHAR:[n] TEMP:[] TEMPSTRLEN:[8]
NUM:[9] CHAR:[g] TEMP:[] TEMPSTRLEN:[9]
[teststring]
Edit: next time, do not change your question so drastically, because this can confuse readers of the question.

Why does assigning int(0) to a string index end the loop?

This is a puzzle I encountered.
I get no errors when I compile it:
$gcc -Wall -Wextra -pedantic -std=c99
Can you please explain why this happens in C?
Expected:
Print out the index and char at the str[i] and replace
Output:
If I try to reassign str[i] to 0, it acts as a break for the for-loop.
char str[] = "this is only a test";
for(int i = 0; i < (int)strlen(str); i++) {
printf("str[%d] = %c\n", i, str[i]);
if(str[i] == ' ') {
str[i] = 0;
}
}
Your program works as written (assuming you have included string.h). When it finds a space ' ', it sets str[i] to \0. The next time the loop condition is evaluated, strlen stops at that new terminator and returns a shorter length, so i < (int)strlen(str) is false and the loop terminates.
Here is the output on GCC without any warning/error:
str[0] = t
str[1] = h
str[2] = i
str[3] = s
str[4] =
The only catch is that it will not walk the entire string: once a space has been replaced with a null terminator, the strlen condition fails on the next check and the loop exits.
To walk the entire string, store the length in a variable before the loop and use that variable in the condition:
size_t length = strlen(str);
for(size_t i = 0; i < length; i++)
Plus, with a size_t index you don't need to cast strlen's return value to int (print the index with %zu instead of %d).
Since you're manipulating the string and calling strlen on it repeatedly you're getting into trouble here. Remember that the for termination condition is evaluated each time through the loop, not once.
To fix this:
size_t l = strlen(str); // Save the length once
for (size_t i = 0; i < l; i++) {
// ...
}

printf() prints whole char matrix

As marked in the code, the first printf() correctly prints only the i-th line of the matrix. But outside the loop, both printf() and strcat() act on the whole matrix from the i-th line on, as if it were a single-line string. This means that
printf("%s\n",m_cfr[0])
will print the whole matrix, and m_cfr[i] will print the matrix from the i-th line on. The char* string argument is a single-line string with no spaces.
trasp(char* string)
{
int row = strlen(string) / 5;
char m[row][5];
char m_cfr[row][5];
char cfr[row*5];
memset(cfr, 0, row * 5);
int key[5] = {3, 1, 2, 0, 4};
int k = 0;
for (i = 0 ; i < row ; i++)
{
strncpy(m[i], string + k, 5);
m[i][5] = '\0';
k += 5;
}
for (i = 0 ; i < row ; i++)
{
for (j = 0 ; j < 5 ; j++)
{
m_cfr[i][key[j]] = m[i][j];
}
m_cfr[i][5] = '\0';
printf("%s\n", m_cfr[i]); //--->prints only line i
}
printf("%s\n", m_cfr[0]); //prints whole matrix
strcat(cfr, m_cfr[0]); //concatenates whole matrix
printf("%s\n", cfr);
}
In your code, your array definition is
char m_cfr[row][5];
while you're accessing
m_cfr[i][5] = '\0';
/* ^
|
there is no 6th element
*/
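You're facing an off-by-one error. Out-of-bounds memory access causes undefined behaviour.
Maybe you want to change the null-terminating statement to
m_cfr[i][4] = '\0'; //last one is null
Note that this overwrites the fifth character of the row. If all five characters are needed, declare the rows with six elements instead so that index 5 is valid.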
%s expects a char* and prints everything until it encounters a \0. So,
printf("%s\n", m_cfr[i]);
printf("%s\n",m_cfr[0]);
strcat(cfr,m_cfr[0]);
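All exhibit undefined behavior, though not because of the argument types: m_cfr[i] and m_cfr[0] are char[5] arrays, which decay to char* when passed to printf or strcat, exactly what %s and strcat expect. The problem is that no row has room for a \0 inside its 5 elements, so %s and strcat keep reading past the end of the row into the rows that follow. Also, as SouravGhosh points out, using
m_cfr[i][5] = '\0';
And
m[i][5] = '\0';
Are wrong. In practice, index 5 of row i lands on the same memory as index 0 of row i+1, which the next loop iteration overwrites. That is why the printf inside the loop appears to work while the calls after the loop print the whole matrix.
To fix both issues, give every row a sixth element for the terminator
char m[row][6];
char m_cfr[row][6];
and make cfr one byte larger (extending the memset to match) so the concatenated result can be terminated as well
char cfr[row*5 + 1];
Note that variable-length arrays such as these cannot be given an initializer such as ={{0}}; use memset if you need them zero-filled.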

realloc() seems to affect already allocated memory

I am experiencing an issue where the invocation of realloc seems to modify the contents of another string, keyfile.
It's supposed to run through a null-terminated char* (keyfile), which contains just above 500 characters. The problem, however, is that the reallocation I perform in the while-loop seems to modify the contents of the keyfile.
I tried removing the dynamic reallocation with realloc and instead initialize the pointers in the for-loop with a size of 200*sizeof(int) instead. The problem remains, the keyfile string is modified during the (re)allocation of memory, and I have no idea why. I have confirmed this by printing the keyfile-string before and after both the malloc and realloc statements.
Note: The keyfile only contains the characters a-z: no digits, spaces, line breaks or uppercase. It is text made up only of the 26 lowercase letters.
int **getCharMap(const char *keyfile) {
char *alphabet = "abcdefghijklmnopqrstuvwxyz";
int **charmap = malloc(26*sizeof(int));
for (int i = 0; i < 26; i++) {
charmap[(int) alphabet[i]] = malloc(sizeof(int));
charmap[(int) alphabet[i]][0] = 0; // place a counter at index 0
}
int letter;
int count = 0;
unsigned char c = keyfile[count];
while (c != '\0') {
int arr_count = charmap[c][0];
arr_count++;
charmap[c] = realloc(charmap[c], (arr_count+1)*sizeof(int));
charmap[c][0] = arr_count;
charmap[c][arr_count] = count;
c = keyfile[++count];
}
// Just inspecting the results for debugging
printf("\nCHARMAP\n");
for (int i = 0; i < 26; i++) {
letter = (int) alphabet[i];
printf("%c: ", (char) letter);
int count = charmap[letter][0];
printf("%d", charmap[letter][0]);
if (count > 0) {
for (int j = 1; j < count+1; j++) {
printf(",%d", charmap[letter][j]);
}
}
printf("\n");
}
exit(0);
return charmap;
}
charmap[(int) alphabet[i]] = malloc(sizeof(int));
charmap[(int) alphabet[i]][0] = 0; // place a counter at index 0
You are writing beyond the end of your charmap array. So, you are invoking undefined behaviour and it's not surprising that you are seeing weird effects.
You are using the character codes as an index into the array, but they do not start at 0! They start at the character code for 'a', which is 97 in ASCII.
You should use alphabet[i] - 'a' as your array index.
The following piece of code is a source of troubles:
int **charmap = malloc(26*sizeof(int));
for (int i = 0; i < 26; i++)
charmap[...] = ...;
If sizeof(int) < sizeof(int*), then it will be performing illegal memory access operations.
For example, on 64-bit platforms, the case is usually sizeof(int) == 4 < 8 == sizeof(int*).
Under that scenario, by writing into charmap[13...25], you will be accessing unallocated memory.
Change this:
int **charmap = malloc(26*sizeof(int));
To this:
int **charmap = malloc(26*sizeof(int*));

Different outputs

Why does the first code give a different output from the second code, even though they intend to do the same thing?
while(s[i++]==t[j++]);
while(s[i]==t[j])
{
i++;
j++;
}
The first code increments i and j even when s[i] != t[j], while the second doesn't.
For example, with:
char s[] = "hello";
char t[] = "world";
int i = 0, j = 0;
The first code will have both i and j equal to 1 after the loop, but the second code will have i and j equal to 0.
