C - How can I concatenate an array of strings into a buffer? - c

I am trying to concatenate a random number of lines from the song "Twinkle, Twinkle" into a buffer before sending it out, because I need to count the size of the buffer.
My code:
char temp_buffer[10000];
char lyrics_buffer[10000];
char *twinkle[20];
int arr_num;
int i;
twinkle[0] = "Twinkle, twinkle, little star,";
twinkle[1] = "How I wonder what you are!";
twinkle[2] = "Up above the world so high,";
twinkle[3] = "Like a diamond in the sky.";
twinkle[4] = "When the blazing sun is gone,";
twinkle[5] = "When he nothing shines upon,";
srand(time(NULL));
arr_num = rand() % 5;
for (i=0; i<arr_num; i++);
{
sprintf(temp_buffer, "%s\n", twinkle[i]);
strcat(lyrics_buffer, temp_buffer);
}
printf("%s%d\n", lyrics_buffer, arr_num);
My current code only prints 1 line even when I get a number greater than 0.

There are two problems. The first was found by BLUEPIXY: the stray semicolon after the closing parenthesis of the for statement is an empty statement that becomes the entire loop body, so the block in braces runs only once, after the loop has finished. You would have found this out very easily if you had used a debugger to step through the code (please do that first in the future).
The second problem is that the contents of non-static local variables (like your lyrics_buffer) are indeterminate. Using such variables without initialization leads to undefined behavior. This happens because the strcat function looks for the end of the destination string, and it does that by looking for the terminating '\0' character. If the contents of the destination string are indeterminate, they will seem random, and the terminator may not be anywhere in the array.
To initialize the array you simply do e.g.
char lyrics_buffer[10000] = { 0 };
That will make the compiler initialize it all to zero, which is what '\0' is.
This initialization is not needed for temp_buffer because sprintf unconditionally starts to write at the first location, it doesn't examine the content in any way. It does, in other words, initialize the buffer.
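Putting both fixes together, here is a minimal sketch, not the poster's actual code. The helper name append_lines and its parameters are made up for illustration; it starts from an empty string so that strcat has a terminator to find, and it also checks that the destination is large enough.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends the first `count` strings of `lines` to `dst`,
 * one per line. Returns 0 on success, -1 if `dst` (capacity `cap`) is too small. */
int append_lines(char *dst, size_t cap, const char *lines[], int count)
{
    if (cap == 0)
        return -1;
    dst[0] = '\0';                          /* empty string: strcat can now find the end */
    size_t used = 0;
    for (int i = 0; i < count; i++) {       /* no semicolon after the parenthesis */
        size_t need = strlen(lines[i]) + 1; /* the text plus '\n' */
        if (used + need + 1 > cap)          /* + 1 for the final '\0' */
            return -1;
        strcat(dst, lines[i]);
        strcat(dst, "\n");
        used += need;
    }
    return 0;
}
```

With twinkle declared as const char *twinkle[20], a call could look like append_lines(lyrics_buffer, sizeof lyrics_buffer, twinkle, arr_num).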

Initialize the buffer with 0, then write each line at the current end of the buffer.
char temp_buffer[10000] = {0};
for (i=0; i<arr_num; i++) //removed semicolon from here
{
sprintf(temp_buffer + strlen(temp_buffer), "%s\n", twinkle[i]);
}
temp_buffer then contains the final output. Make sure the buffer is large enough.
You don't need strcat.
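A variation on this idea, offered as a sketch under assumptions: keep a running offset instead of calling strlen on every pass, and use snprintf so the write can never run past the buffer. The name format_lines is hypothetical. The return value is the number of bytes written, which is the size the question wants to count.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes `count` lines into `dst` and returns the number
 * of bytes written (not counting the '\0'), or -1 if `cap` is too small. */
int format_lines(char *dst, size_t cap, const char *lines[], int count)
{
    size_t off = 0;                          /* where the next line starts */
    if (cap == 0)
        return -1;
    dst[0] = '\0';
    for (int i = 0; i < count; i++) {
        int n = snprintf(dst + off, cap - off, "%s\n", lines[i]);
        if (n < 0 || (size_t)n >= cap - off) /* error, or output was truncated */
            return -1;
        off += (size_t)n;
    }
    return (int)off;
}
```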

Related

string gets filled with garbage

I have a string and a scanf that reads from input until it finds a *, which is the character I picked for the end of the text. After the * all the remaining cells get filled with random characters.
I know that a string, if not filled completely up to the last cell, will fill all the remaining empty cells after the \0 character with \0. Why is this not the case, and how can I make it so that after the last letter given in input all the remaining cells have the same value?
char string1 [100];
scanf("%[^*]s", string1);
for (int i = 0; i < 100; ++i) {
printf("\n %d=%d",i,string1[i]);
}
if i try to input something like hello*, here's the output:
0=104
1=101
2=108
3=108
4=111
5=0
6=0
7=0
8=92
9=0
10=68
You have an uninitialized array:
char string1 [100];
that has indeterminate values. You could initialize the array like
char string1 [100] = { 0 };
or
char string1 [100] = "";
In this call
scanf("%[^*]s", string1);
you need to remove the trailing character s, because %[] and %s are distinct format specifiers. There is no %[]s format specifier. It should look like this:
scanf("%[^*]", string1);
The array contains a string terminated by the zero character '\0'.
So to output the string you should write for example
for ( int i = 0; string1[i] != '\0'; ++i) {
printf( "%c", string1[i] ); // or putchar( string1[i] );
}
putchar( '\n' );
or like
for ( int i = 0; string1[i] != '\0'; ++i) {
printf("\n %d=%c",i,string1[i]);
}
putchar( '\n' );
or just
puts( string1 );
As for your statement
printf("\n %d=%d",i,string1[i]);
then it outputs each character (including uninitialized ones) as an integer, because it uses the conversion specifier d instead of c. That is, the function outputs the internal numeric codes of the characters (ASCII codes on most systems).
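The points above can be combined into a small sketch. The helper name read_until_star is hypothetical, and it reads from a string with sscanf rather than from the keyboard so that it is easy to try out. Two details go beyond the answer and are my own additions: the field width 99 keeps the conversion inside a 100-char array, and the return value is checked so an empty match is not mistaken for data.

```c
#include <stdio.h>

/* Hypothetical helper: copies the text before the first '*' into `out`.
 * Returns 1 if something was stored, 0 otherwise (then `out` is empty). */
int read_until_star(const char *input, char out[100])
{
    out[0] = '\0';                        /* defined content even if nothing matches */
    if (sscanf(input, "%99[^*]", out) != 1)
        return 0;                         /* nothing before the '*', or no input */
    return 1;
}
```

Only the bytes up to and including the terminator are written; the rest of out keeps whatever it held before the call.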
I know that a string after the \0 character if not filled completly
until the last cell will fill all the remaining empty ones with \0
No, that's not true.
It couldn't be true: a string carries no length. Neither the compiler nor any function can know the size of the string. Only you do. So, no, strings don't autofill with '\0'.
Keep in mind that there is no string type in C, just pointers to chars (sometimes the string lives in an array, but functions still receive only a pointer to its first element). We know where they start, but there is no way (other than deciding it and being consistent while coding) to know where they stop.
Sure, most of the time there is an obvious answer that makes the size of the allocated memory clear to any reader of the code.
For example, when you code
char string1[20];
sprintf(string1, "hello");
it is quite obvious to a reader of that code that the allocated memory is 20 bytes. So you may think that the compiler should know, when sprintf writes into it or sscanf reads into it, that the unused part of the 20 bytes should be filled with 0. But, first of all, the compiler is no longer there when sscanf or sprintf runs. That happens at run time, whereas the compiler works at compile time. At run time, there is no trace of that 20.
Plus, it can be more complicated than that
void fillString(char *p){
sprintf(p, "hello");
}
int main(){
char string1[20];
string1[0]='O';
string1[1]='t';
fillString(&(string1[2]));
}
How, in this case, is sprintf supposed to know that it must fill 18 bytes with the string and then '\0'?
And that is for normal usage. I haven't started yet with convoluted but legal usages, such as using char buffer[1000]; as an array of 50 length-20 strings (buffer, buffer+20, buffer+40, ...) or things like
union {
char str[40];
struct {
char substr1[20];
char substr2[20];
} s;
};
So, no, strings are not filled up with '\0'. That is not the case. It is not the habit in C to have implicit things happening under the hood. And that could not be the case, even if we wanted it to.
Your "star-terminated string" behaves exactly as a "null-terminated string" does. Sometimes the rest of the allocated memory is full of 0, sometimes it is not. scanf won't touch anything other than what is strictly needed. The rest of the allocated memory remains untouched. If that memory happened to be full of '\0' before the call to scanf, then it remains so. Otherwise not. Which leads me to my last remark: you seem to believe that it is scanf that fills the memory with non-null chars. It is not. Those chars were already there before. If you had the feeling that some other functions fill the rest of the memory with '\0', that was just an impression (a natural one, since most of the time newly allocated memory is 0. Not because a rule says so, but because that is the most frequent byte to be found in a random area of memory. That is why uninitialized-variable bugs are so painful: they show up only from time to time, because very often uninitialized variables are 0 just by chance, yet the bug is still there).
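One way to see this for yourself is to fill the buffer with a known byte first and then look at what a write leaves behind. The function below is a sketch with a made-up name; it assumes the buffer has room for "hello" and its terminator (at least 6 bytes).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical demo: pre-fills `buf` with 'x', writes "hello" into it, and
 * counts how many bytes after the terminator still hold 'x'. */
size_t untouched_after_hello(char *buf, size_t n)
{
    memset(buf, 'x', n);        /* stand-in for whatever was in memory before */
    sprintf(buf, "hello");      /* writes 'h','e','l','l','o','\0' and nothing more */
    size_t untouched = 0;
    for (size_t i = strlen(buf) + 1; i < n; i++)
        if (buf[i] == 'x')
            untouched++;
    return untouched;
}
```

For a 20-byte buffer, every byte after the terminator is still 'x': sprintf stored six bytes and left the other 14 alone.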
The easiest way to create a zeroed array is to use calloc.
Try replacing
char string1 [100];
with
char *string1=calloc(1,100);
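A short sketch of the calloc route, with a hypothetical helper name. Unlike a local array, memory obtained from calloc lives on the heap: the call can fail, and the caller has to release the memory with free when done.

```c
#include <stdlib.h>

/* Hypothetical helper: returns `size` bytes that are all '\0', or NULL if the
 * allocation fails. The caller owns the memory and must free() it. */
char *new_zeroed_string(size_t size)
{
    return calloc(size, sizeof(char));
}
```

For an ordinary local array, char string1[100] = { 0 }; gives the same all-zero start without any allocation.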

How to read in the entire word, and not just the first character?

I am writing a method in C in which I have a list of words from a file that I am redirecting from stdin. However, when I attempt to read in the words into the array, my code will only output the first character. I understand that this is because of a casting issue with char and char *.
While I am challenging myself to not use any of the functions from string.h, I have tried iterating through and am thinking of writing my own strcpy function, but I am confused because my input is coming from a file that I am redirecting from standard input. The variable numwords is inputted by the user in the main method (not shown).
I am trying to debug this issue via dumpwptrs to show me what the output is. I am not sure what in the code is causing me to get the wrong output - whether it is how I read in words to the chunk array, or if I am pointing to it incorrectly with wptrs?
//A huge chunk of memory that stores the null-terminated words contiguously
char chunk[MEMSIZE];
//Points to words that reside inside of chunk
char *wptrs[MAX_WORDS];
/** Total number of words in the dictionary */
int numwords;
.
.
.
void readwords()
{
//Read in words and store them in chunk array
for (int i = 0; i < numwords; i++) {
//When you use scanf with '%s', it will read until it hits
//a whitespace
scanf("%s", &chunk[i]);
//Each entry in wptrs array should point to the next word
//stored in chunk
wptrs[i] = &chunk[i]; //Assign address of entry
}
}
Do not re-use char chunk[MEMSIZE]; used for prior words.
Instead use the next unused memory.
char chunk[MEMSIZE];
char *pool = chunk; // location of unassigned memory pool
// scanf("%s", &chunk[i]);
// wptrs[i] = &chunk[i];
scanf("%s", pool);
wptrs[i] = pool;
pool += strlen(pool) + 1; // Beginning of next unassigned memory
Robust code would check the return value of scanf() and ensure that i and the pool pointer do not exceed their limits.
I'd go for a fgets() solution as long as words are entered a line at a time.
char chunk[MEMSIZE];
char *pool = chunk;
// return word count
int readwords2() {
int word_count;
// limit words to MAX_WORDS
for (word_count = 0; word_count < MAX_WORDS; word_count++) {
intptr_t remaining = &chunk[MEMSIZE] - pool;
if (remaining < 2) {
break; // out of useful pool memory
}
if (fgets(pool, remaining, stdin) == NULL) {
break; // end-of-file/error
}
pool[strcspn(pool, "\n")] = '\0'; // lop off potential \n
wptrs[word_count] = pool;
pool += strlen(pool) + 1;
}
return word_count;
}
While I am challenging myself to not use any of the functions from string.h, ...
The best way to challenge yourself to not use any of the functions from string.h is to write them yourself and then use them.
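As a sketch of how the pool idea and the "write it yourself" advice might fit together, here is a version that uses no string.h functions. The names pool_store and read_words, the 63-character word limit, and the temporary word buffer are assumptions made for the example, not part of the answer above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies `word` (including its '\0') to *pool if it fits
 * before `end`, advances *pool past it, and returns the stored copy.
 * Returns NULL when the pool is out of space. */
char *pool_store(char **pool, const char *end, const char *word)
{
    size_t len = 0;
    while (word[len] != '\0')               /* hand-written strlen */
        len++;
    if ((size_t)(end - *pool) < len + 1)
        return NULL;
    char *start = *pool;
    for (size_t k = 0; k <= len; k++)       /* hand-written strcpy, copies '\0' too */
        start[k] = word[k];
    *pool += len + 1;                       /* next word starts after the terminator */
    return start;
}

/* Reads whitespace-separated words from `in` until EOF, until `max_words`
 * is reached, or until the pool is full. Returns the number of words stored. */
int read_words(FILE *in, char *chunk, size_t memsize, char *wptrs[], int max_words)
{
    char word[64];                          /* assumed maximum word length: 63 */
    char *pool = chunk;
    int count = 0;
    while (count < max_words && fscanf(in, "%63s", word) == 1) {
        char *stored = pool_store(&pool, chunk + memsize, word);
        if (stored == NULL)
            break;
        wptrs[count++] = stored;
    }
    return count;
}
```

With the declarations from the question, the call would be numwords = read_words(stdin, chunk, MEMSIZE, wptrs, MAX_WORDS);.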
Your program reads each word into the i-th position of the buffer chunk, so you end up with the first letter of each word (as long as i doesn't go beyond the size of chunk): each time you read, you overwrite the second and following chars of the previous word with the word just read. Then you make every pointer in wptrs point to these places, which makes it impossible to tell where one string ends and the next begins (you overwrote all the null terminators, leaving only the last). So the first string holds the first letters of all your words except the last, which is complete. The second holds the same string, but beginning at its second character, then the third, and so on.
Build your own version of strdup(3) and use chunk to store the string temporarily. Then make a dynamically allocated copy of the string with your version of strdup(3) and make the pointer point to it, and so on.
Finally, when you are finished, just free all the allocated strings and voilà!!
Also, this is very important: read How to create a Minimal, Complete, and Verifiable example. Very often the posted code no longer contains the error, because it was trimmed away when the code was cut down for posting (you don't normally know where the error is, or you would have corrected it and there would be no question here, right?).
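A hand-written strdup along the lines suggested here could look like the sketch below. The name my_strdup is made up for the example, and the code avoids string.h on purpose.

```c
#include <stdlib.h>

/* Hypothetical replacement for strdup(3): returns a heap copy of `src`,
 * or NULL if the allocation fails. The caller must free() the result. */
char *my_strdup(const char *src)
{
    size_t len = 0;
    while (src[len] != '\0')            /* length without string.h */
        len++;
    char *copy = malloc(len + 1);       /* + 1 for the terminator */
    if (copy == NULL)
        return NULL;
    for (size_t k = 0; k <= len; k++)   /* copies the '\0' as well */
        copy[k] = src[k];
    return copy;
}
```

Each word read into the temporary buffer can then be stored with wptrs[i] = my_strdup(chunk);, so later reads no longer overwrite earlier words.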

Why does the program print out an "#" when I enter nothing?

I wrote a program in order to reverse a string. However, when I enter nothing but press the "enter" key, it prints out an "#".
The code is as follows:
#include <stdio.h>
#include <string.h>
int main(void)
{
int i, j, temp;
char str[80];
scanf("%[^\n]s", str);
i = strlen(str);
//printf("%d\n", i);
//printf("%d\n", sizeof(str));
for (j=0; j<i/2; j++) {
temp=str[i-j-1];
str[i-j-1]=str[j];
str[j]=temp;
}
for(i = 0; str[i] != 0; i++)
putchar(str[i]);
}
I tried to use printf() function to see what happens when I press the "Enter" key.
However, after adding printf("%d\n", i);, the output became "3 #".
After adding printf("%d\n", sizeof(str));, the output became "0 80".
It seemed as if "sizeof" had automatically "fixed" the problem.
My roommate said that the problem may result from initialization. I tried to change the code char str[80] to char str[80] = {0}, and everything works well. But I still don't understand what "sizeof" does when it exists in the code. If it really results in the initialization, why will such thing happen when the program runs line by line?
When you declare an array without initializing any part of the array, you receive a pointer to a memory location that has not been initialized. That memory location could contain anything. In fact, you're lucky it stopped just at the #.
By specifying char str[80] = {0} you are effectively saying:
char str[80] = {0, 0, 0, /* 77 more times */ };
Thereby initializing the string to all null values. This is because the compiler automatically pads arrays with nulls if it is partially initialized. (However, this is not the case when you allocate memory from the heap, just a warning).
To understand why everything was happening, let's follow through your code.
When you set i to the value returned by strlen(str), strlen iterates over the location starting at the memory location pointed to by str. Since your memory is not initialized, it finds a # at location 0 and then 0 at location 1, so it correctly returns 1.
What happens with the loops when you don't enter anything? With i set to 1 and j set to 0, i/2 is 0, so the condition j<i/2 evaluates to 0<0, which is false, and execution moves on to the second loop. The second loop's condition only tests whether the current location in the array is null. Coincidentally, you were given a memory location where the first char is #. It prints it, and luckily the next value is null.
When you use the sizeof operator, you receive the size of the entire array that was allocated on the stack (this is important, as you may run into this issue later if you start using pointers). If you used strlen, you would have received 1 instead.
Suggestions
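Instead of relying on i = strlen(str); alone, I would suggest checking the return value of scanf("%[^\n]", str) (note that the trailing s is not part of the %[ specifier). scanf returns the number of input items successfully assigned, not the number of chars read: when you just press Enter, nothing matches %[^\n], so scanf returns 0 and leaves str untouched, and you should treat the string as empty in that case. Also, try to use more descriptive variable names; it makes reading code so much easier.
Do a memset of str; then it will contain nothing instead of garbage.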
char str[80];
memset(str,0,80);
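To tie the answers together, here is a sketch that checks the scanf return value instead of depending on the initial contents of the array. The function names are hypothetical, and the input comes from a string via sscanf so the empty-line case is easy to reproduce; the width 79 is an added safeguard for the 80-char array.

```c
#include <stdio.h>
#include <string.h>

/* Reverses the string in place. */
void reverse_in_place(char *s)
{
    size_t len = strlen(s);
    for (size_t j = 0; j < len / 2; j++) {
        char tmp = s[len - j - 1];
        s[len - j - 1] = s[j];
        s[j] = tmp;
    }
}

/* Hypothetical helper: reads one line from `input` into `str`.
 * Returns 1 if text was read, 0 if the line was empty (str is then ""). */
int read_line(const char *input, char str[80])
{
    if (sscanf(input, "%79[^\n]", str) != 1) {
        str[0] = '\0';      /* nothing was stored: make the string empty ourselves */
        return 0;
    }
    return 1;
}
```

With this check in place, an empty line yields an empty string whatever the array held before.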

Combine characters from a two dimensional array into a string in C

I'm still new to programming but lets say I have a two dimensional char array with one letter in each array. Now I'm trying to combine each of these letters in the array into one array to create a word.
So grid[2][4]:
0|1|2|3
0 g|o|o|d
1 o|d|d|s
And copy grid[0][0], grid[0][1], grid[0][2], grid[0][3] into a single array destination[4] so it reads 'good'. I have something like
char destination[4];
strcpy(destination, grid[0][1]);
for(i=0; i<4; i++)
strcat(destination, grid[0][i]);
but it simply crashes..
Any step in the right direction is appreciated.
In C, the runtime library functions strcpy and strcat require zero terminated strings. What you're handing to them are not zero terminated, and so these functions will crash due to their dependency on that terminating zero to indicate when they should stop. They are running through RAM until they read a zero, which could be anywhere in RAM, including protected RAM outside your program, causing a crash. In modern work we consider functions like strcpy and strcat to be unsafe. Any kind of mistake in handing them pointers causes this problem.
Versions of strcpy and strcat exist, with slightly different names, which require an integer or size_t indicating their maximum valid size. strncat, for example, has the signature:
char * strncat( char *destination, const char *source, size_t num );
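If, in your case, you had used strncat, providing 4 for the last parameter, it would have appended at most 4 characters. Note that strncat still needs a zero terminated destination to start from, still expects a pointer rather than a single char as its source, and writes a terminating zero after the appended characters, so the destination needs room for it.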
However, an alternative exists you may prefer to explore. You can simply use indexing, as in:
char destination[5]; // I like room for a zero terminator here
for(i=0; i<4; i++)
destination[i] = grid[0][i];
This does not handle the zero terminator, which you might append with:
destination[4] = 0;
Now, let's assume you wanted to continue, putting both words into a single output string. You might do:
char destination[11]; // 2 words * (4 letters + 1 space) + room for a zero terminator
int d=0;
for(r=0; r<2; ++r ) // I prefer the habit of prefix instead of postfix
{
for( i=0; i<4; ++i )
destination[d++] = grid[r][i];
destination[d++] = ' ';// append a space between words
}
Following whatever processing is required on what might be an ever larger declaration for destination, append a zero terminator with
destination[ d ] = 0;
strcpy copies strings, not chars. A string in C is a series of chars, followed by a \0. These are called "null-terminated" strings. So your calls to strcpy and strcat aren't giving them the right kind of parameters.
strcpy copies character after character until it hits a \0; it doesn't just copy the one char you're giving it a pointer to.
If you want to copy a character, you can just assign it.
char destination[5];
for(i = 0; i < 4; i++)
destination[i] = grid[0][i];
destination[i] = '\0';
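The same assignment approach extends to several rows. The sketch below uses a hypothetical function join_rows and a COLS constant of my own; it puts a space between words only, so there is no trailing space, and the caller must supply a destination of at least rows * (COLS + 1) bytes (for rows >= 1).

```c
#include <stddef.h>

#define COLS 4   /* letters per row, as in the question's grid[2][4] */

/* Hypothetical helper: copies every row of `grid` into `dst`, separated by
 * single spaces, and terminates the result with '\0'. */
void join_rows(char *dst, char grid[][COLS], int rows)
{
    size_t d = 0;
    for (int r = 0; r < rows; ++r) {
        if (r > 0)
            dst[d++] = ' ';             /* separator between words only */
        for (int i = 0; i < COLS; ++i)
            dst[d++] = grid[r][i];      /* plain char assignment, no strcpy */
    }
    dst[d] = '\0';
}
```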

Tokenizing a phone number in C

I'm trying to tokenize a phone number and split it into two arrays. It starts out in a string in the form of "(515) 555-5555". I'm looking to tokenize the area code, the first 3 digits, and the last 4 digits. The area code I would store in one array, and the other 7 digits in another one. Both arrays are to hold just the numbers themselves.
My code seems to work... sort of. The issue is when I print the two storage arrays, I find some quirks;
My array aCode; it stores the first 3 digits as I ask it to, but then it also prints some garbage values notched at the end. I walked through it in the debugger, and the array only stores what I'm asking it to store- the 515. So how come it's printing those garbage values? What gives?
My array aNum; I can append the tokens I need to the end of it, the only problem is I end up with an extra space at the front (which makes sense; I'm adding on to an empty array, i.e. adding on to empty space). I modify the code to only hold 7 variables just to mess around, I step into the debugger, and it tells me that the array holds an empty space and 6 of the digits I need; there's no room for the last one. Yet when I print it, the space AND all 7 digits are printed. How does that happen?
And how could I set up my strtok function so that it first copies the 3 digits before the "-", then appends to that the last 4 I need? All examples of tokenization I've seen utilize a while loop, which would mean I'd have to choose either strcat or strcpy to complete my task. I can set up an "if" statement to check for the size of the current token each time, but that seems too crude to me and I feel like there's a simpler method to this. Thanks all!
int main() {
char phoneNum[]= "(515) 555-5555";
char aCode[3];
char aNum[7];
char *numPtr;
numPtr = strtok(phoneNum, " ");
strncpy(aCode, &numPtr[1], 3);
printf("%s\n", aCode);
numPtr = strtok(&phoneNum[6], "-");
while (numPtr != NULL) {
strcat(aNum, numPtr);
numPtr = strtok(NULL, "-");
}
printf("%s", aNum);
}
I can primarily see two errors,
Being an array of 3 chars, aCode is not null-terminated here. Using it as an argument to the %s format specifier in printf() invokes undefined behaviour. The same applies, in a different way, to aNum.
strcat() expects a null-terminated array for both the arguments. aNum is not null-terminated, when used for the first time, will result in UB, too. Always initialize your local variables.
Also, see other answers for a complete bug-free code.
The biggest problem in your code is undefined behavior: since you are reading a three-character constant into a three-character array, you have left no space for null terminator.
Since you are tokenizing a value in a very specific format of fixed length, you could get away with a very concise implementation that employs sscanf:
char *phoneNum = "(515) 555-5555";
char aCode[3+1];
char aNum[7+1];
sscanf(phoneNum, "(%3[0-9]) %3[0-9]-%4[0-9]", aCode, aNum, &aNum[3]);
printf("%s %s", aCode, aNum);
This solution passes the format (###) ###-#### directly to sscanf, and tells the function where each value needs to be placed. The only "trick" used above is passing &aNum[3] for the last argument, instructing sscanf to place data for the third segment into the same storage as the second segment, but starting at position 3.
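If the input might not always be well formed, the sscanf call can be wrapped in a check. This is a sketch under assumptions: the function name parse_phone is made up, and the length test is my own addition, needed because sscanf also reports three fields when a segment has fewer digits than expected.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parses "(###) ###-####" into `area` (3 digits) and
 * `number` (7 digits). Returns 1 on success, 0 if the input does not match. */
int parse_phone(const char *s, char area[4], char number[8])
{
    int fields = sscanf(s, "(%3[0-9]) %3[0-9]-%4[0-9]", area, number, number + 3);
    if (fields != 3)
        return 0;                     /* some segment was missing entirely */
    /* A short middle segment leaves a '\0' before number + 3, so the
     * combined length would come out wrong. */
    return strlen(area) == 3 && strlen(number) == 7;
}
```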
Your code has multiple issues
You allocate the wrong size for aCode; you should add 1 for the nul terminator byte and initialize the whole array to '\0' to ensure the string is always terminated.
char aCode[4] = {'\0'};
You don't check if strtok() returns NULL.
numPtr = strtok(phoneNum, " ");
strncpy(aCode, &numPtr[1], 3);
Point 1, applies to aNum in strcat(aNum, numPtr) which will also fail because aNum is not yet initialized at the first call.
Subsequent calls to strtok() must have NULL as the first parameter, hence
numPtr = strtok(&phoneNum[6], "-");
is wrong, it should be
numPtr = strtok(NULL, "-");
Other answers have already mentioned the major issue, which is insufficient space in aCode and aNum for the terminating NUL character. The sscanf answer is also the cleanest for solving the problem, but given the restriction of using strtok, here's one possible solution to consider:
char phone_number[]= "(515) 555-1234";
char area[3+1] = "";
char digits[7+1] = "";
const char *separators = " (-)";
char *p = strtok(phone_number, separators);
if (p) {
int len = 0;
(void) snprintf(area, sizeof(area), "%s", p);
while (len < sizeof(digits) && (p = strtok(NULL, separators))) {
len += snprintf(digits + len, sizeof(digits) - len, "%s", p);
}
}
(void) printf("(%s) %s\n", area, digits);

Resources