How to read and concatenate strings repeatedly in C

The program needs to do the following:
initialise the combined string variable.
read a string (input1) of no more than 256 characters then:
Until the first character of the read string is '#'
allocate enough space for a new combined string variable to hold the current combined string variable and the new read-string (input1).
then copy the contents of the old combined variable to the new, bigger, combined variable string.
then concatenate the newly read-string (input1) to the end of the new combined string.
deallocate the old combined string.
read a string into input 1 again.
After the user has typed a string with '#' print out the new combined string.
test input: where I am #here
expected output: whereIam
actually output: Segmentation fault (core dumped)
Note that the spaces above separate the strings. Write two more test cases.
And that is my own code:
#include<stdio.h>
#include<stdlib.h> // for malloc
#include<string.h> // for string funs
int main(void) {
char input1[257];
printf("Please enter a string here: ");
scanf("%256s",input1);
int input1_index = 0;
while (input1[input1_index] != '#') {
input1_index++;
}
char *combined;
combined = malloc(sizeof(char)*(input1_index+1+1));
combined[0] = '\0';
strcpy(combined,input1);
printf("%s\n",combined);
return 0;
}
How to modify my code? What is the Segmentation fault (core dumped)?
Thank you all.

With scanf, the specifier %Ns consumes and discards all leading whitespace before reading non-whitespace characters - stopping when trailing whitespace is encountered or N non-whitespace characters are read.
With the input
where I am #here
this
char input1[257];
scanf("%256s", input1);
will result in the input1 buffer containing the string "where".
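To see this tokenization in isolation, here is a minimal sketch that applies the same %256s conversion to a fixed string with sscanf instead of reading stdin. The helper name first_token is made up for illustration and is not part of the original program.

```c
#include <stdio.h>

/* Hypothetical helper: stores the first whitespace-delimited token of `line`
 * in `out` (which must hold at least 257 bytes), the same way
 * scanf("%256s", out) would when reading from stdin.
 * Returns 1 if a token was read, 0 otherwise. */
int first_token(const char *line, char *out)
{
    return sscanf(line, "%256s", out) == 1;
}
```

Called with "where I am #here", the buffer ends up holding only "where"; the rest of the line stays unread until the next conversion.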
This while loop
while (input1[input1_index] != '#') {
input1_index++;
}
does not handle the specified "[read strings...] Until the first character of the read string is '#'". It simply searches the one string you have read for the '#' character, and can easily exceed the valid indices of the string if one does not exist (as in the case of "where"). This will lead to Undefined Behaviour, and is surely the cause of your SIGSEGV.
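If a search for '#' inside one string were really what is wanted, a bounded version could look like the sketch below. It relies on strchr, which stops at the terminator; the function name find_hash is an assumption made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the index of the first '#' in `s`,
 * or -1 if the string contains none. strchr() never reads past
 * the terminating '\0', so the search cannot run off the buffer. */
ptrdiff_t find_hash(const char *s)
{
    const char *p = strchr(s, '#');
    return p != NULL ? p - s : -1;
}
```

For the token "where" this returns -1 instead of walking past the end of the array.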
Your code does not follow the instructions detailed.
A loop is required to read two or more whitespace-delimited strings using scanf and %s. This loop should end when scanf fails, or when the first character of the read string is '#'.
Inside this loop you should get the string length of the read string (strlen) and add that to the existing string length. You should allocate a new buffer of this length plus one (malloc), and exit the loop if this allocation fails.
You should then copy the existing string to this new buffer (strcpy).
You should then concatenate the read string to this new buffer (strcat).
You should then deallocate the existing string buffer (free).
You should then copy the pointer value of the new string buffer to the existing string buffer variable (=).
Then repeat the loop.
In pseudocode, this roughly looks like:
input := static [257]
combined := allocate [1] as ""
size := 0
exit program if combined is null
print "Enter strings: "
while read input does not begin with '#' do
add length of input to size
temporary := allocate [size + 1]
stop loop if temporary is null
copy combined to temporary
concatenate input to temporary
deallocate combined
combined := temporary
end
print combined
deallocate combined
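The body of that loop can be sketched as a single C function. This is only one possible reading of the pseudocode; the name append_copy and the choice to return NULL on failure are assumptions made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds a new, bigger string holding `combined`
 * followed by `input`, frees the old `combined`, and returns the new one.
 * On allocation failure it returns NULL and leaves `combined` untouched,
 * so the caller can still print or free it. */
char *append_copy(char *combined, const char *input)
{
    char *bigger = malloc(strlen(combined) + strlen(input) + 1);
    if (bigger == NULL) {
        return NULL;
    }
    strcpy(bigger, combined);   /* copy the old contents */
    strcat(bigger, input);      /* concatenate the new string */
    free(combined);             /* deallocate the old buffer */
    return bigger;
}
```

The caller starts with a one-byte buffer holding "" and replaces its pointer with the returned value after each successful call.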

Others have already explained what you should do. But my additional suggestion is to take those detailed instructions and make them comments. Then for each one of them translate into C code:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
// initialise the combined string variable.
char *combined = malloc(1);
combined[0] = 0;
while (true) {
// read a string (input1) of no more than 256 characters
char input1[257];
if (scanf("%256s", input1) != 1) {
break;
}
// then: Until the first character of the read string is '#'
if (input1[0] == '#') {
break;
}
// allocate enough space for a new combined string variable to hold the current combined string variable and the new read-string (input1).
char *tmp = malloc(strlen(combined) + strlen(input1) + 1);
// then copy the contents of the old combined variable to the new, bigger, combined variable string.
strcpy(tmp, combined);
// then concatenate the newly read-string (input1) to the end of the new combined string.
strcat(tmp, input1);
// deallocate the old combined string.
free(combined);
combined = tmp; // <-- My interpretation of what the aim is
// read a string into input 1 again. <-- I guess they mean to loop
}
// After the user has typed a string with '#' print out the new combined string.
printf("%s", combined);
free(combined);
return 0;
}
Notice how "initialize", means "make it a proper C string which can be extended later".
When you read anything remember to check if the read was successful.
The first character of input1 is just input1[0]. It's always safe to check it (if the read was successful), because it must be either the first character or the string terminator. No risk of false detection.
The allocation must add up the length of combined, the length of input1, and one byte for the string terminator.
Copy and concatenation are available in <string.h>.
Notice that the instructions are missing a line stating that the new combined string should become the current combined string, but it's pretty obvious from the intent.
I strongly dislike the suggested (implied) implementation which calls for a sequence such as:
read
while (check) {
do_stuff
read
}
My preference goes to the non repeated read:
while (true) {
read
if (!check) break;
do_stuff
}
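As a sketch of this loop shape on something testable, the function below counts the tokens that come before the first token starting with '#'. It uses sscanf on a string in place of scanf on stdin, and the name count_tokens_until_hash is invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: counts the tokens in `text` that come before the
 * first token starting with '#'. %n reports how many characters the
 * conversion consumed, so the next call continues after the token. */
size_t count_tokens_until_hash(const char *text)
{
    size_t count = 0;
    while (true) {
        char token[257];
        int used = 0;
        if (sscanf(text, "%256s%n", token, &used) != 1) {
            break;              /* read failed: no more tokens */
        }
        if (token[0] == '#') {
            break;              /* check failed: sentinel reached */
        }
        count++;                /* do_stuff */
        text += used;
    }
    return count;
}
```

The read appears exactly once in the source, and both exit conditions sit right next to it.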
Finally, free your memory. Always. Don't be lazy!
Just for completeness, another option is to store the size of your combined string, to avoid calling strlen on something you already know. You could also leverage the calloc and realloc functions, which have been available for a long time:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
size_t capacity = 1;
char *combined = calloc(capacity, 1);
while (true) {
char input1[257];
if (scanf("%256s", input1) != 1 || input1[0] == '#') {
break;
}
capacity += strlen(input1);
combined = realloc(combined, capacity);
strcat(combined, input1);
}
printf("%s", combined);
free(combined);
return 0;
}
Finally for the additional test strings:
these are not the droids you're looking for #dummy
a# b## c### d#### e##### #f###### u#######(explicit)
Useless addition
You could also limit the number of reallocations by using a size/capacity couple and doubling the capacity each time you need more. Also (not particularly useful here) checking the return value of memory allocation function is a must. My checking for realloc is needlessly complicated here, because memory is freed at program termination, but, nevertheless, free your memory. Always. Don't be lazy! 😁
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
size_t capacity = 257, size = 0;
char *combined = calloc(capacity, 1);
if (combined == NULL) {
exit(EXIT_FAILURE);
}
while (true) {
char input1[257];
if (scanf("%256s", input1) != 1 || input1[0] == '#') {
break;
}
size += strlen(input1);
if (size >= capacity) {
char *tmp = realloc(combined, capacity *= 2);
if (tmp == NULL) {
free(combined);
exit(EXIT_FAILURE);
}
combined = tmp;
}
strcat(combined, input1);
}
printf("%s", combined);
free(combined);
return 0;
}
Uncalled-for optimization
David C. Rankin pointed out another important thing to take care of when using library functions, that is not to assume they are O(1). Repeatedly calling strcat() is not the smartest move, since it always scans from the beginning. So here I replaced the strcat() with a lighter strcpy(), storing the length of the string before adding the new one.
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
size_t capacity = 257, size = 0;
char *combined = calloc(capacity, 1);
if (combined == NULL) {
exit(EXIT_FAILURE);
}
while (true) {
char input1[257];
if (scanf("%256s", input1) != 1 || input1[0] == '#') {
break;
}
size_t oldsize = size;
size += strlen(input1);
if (size >= capacity) {
char *tmp = realloc(combined, capacity *= 2);
if (tmp == NULL) {
free(combined);
exit(EXIT_FAILURE);
}
combined = tmp;
}
strcpy(combined + oldsize, input1);
}
printf("%s", combined);
free(combined);
return 0;
}
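The same size/capacity idea can be packaged so that it also works for appended strings longer than the current capacity. The sketch below is an illustration under that assumption, not the answer's code: struct strbuf, strbuf_init and strbuf_append are hypothetical names, and overflow of the size arithmetic is not handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string; all names here are illustrative. */
struct strbuf {
    char *data;      /* always terminated once initialised */
    size_t size;     /* string length, without the terminator */
    size_t capacity; /* bytes allocated for data */
};

/* Returns 1 on success, 0 if the allocation fails. */
int strbuf_init(struct strbuf *sb)
{
    sb->capacity = 16;
    sb->size = 0;
    sb->data = calloc(sb->capacity, 1);
    return sb->data != NULL;
}

/* Appends `text`; returns 1 on success, 0 (buffer unchanged) on failure. */
int strbuf_append(struct strbuf *sb, const char *text)
{
    size_t len = strlen(text);
    size_t needed = sb->size + len + 1;
    if (needed > sb->capacity) {
        size_t new_capacity = sb->capacity;
        while (new_capacity < needed) {
            new_capacity *= 2;          /* double until it fits */
        }
        char *tmp = realloc(sb->data, new_capacity);
        if (tmp == NULL) {
            return 0;
        }
        sb->data = tmp;
        sb->capacity = new_capacity;
    }
    /* Copy at the known end, terminator included: no rescan as in strcat. */
    memcpy(sb->data + sb->size, text, len + 1);
    sb->size += len;
    return 1;
}
```

The caller frees sb.data when done.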

In the code below, the idea that strcpy() will copy only up to the allocated memory is incorrect. strcpy() keeps copying until it reaches the terminator of the source, even beyond the allocated memory, which is why it is unsafe and may be the reason for the Segmentation fault (core dumped).
strcpy() in your code copies all characters from input1 to combined, so a problem occurs whenever combined is smaller than input1.
Instead, you can use strncpy() to copy at most n characters and avoid writing beyond the allocated space.
You should also make sure that combined is properly terminated with '\0', because strncpy() does not add a terminator when it stops at n characters.
Also, combined[0]='\0' does not initialize all the allocated memory; it only assigns '\0' to the first byte. To initialize all the allocated memory to '\0' you need to use memset() after malloc() (or allocate with calloc()):
combined = malloc(sizeof(char)*(input1_index+1+1));
combined[0] = '\0'; //use memset() here
strcpy(combined,input1); //Try using strncpy(combined, input1, input1_index) here
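A bounded copy along these lines might look like the following sketch. The helper copy_prefix and its parameters are assumptions made for this example; the point is only that the terminator is written explicitly after strncpy().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most `n` characters of `src` into `dst`,
 * never writing more than `dst_size` bytes, and always terminates `dst`. */
void copy_prefix(char *dst, size_t dst_size, const char *src, size_t n)
{
    if (dst_size == 0) {
        return;                 /* no room even for the terminator */
    }
    if (n > dst_size - 1) {
        n = dst_size - 1;       /* leave one byte for '\0' */
    }
    strncpy(dst, src, n);       /* may stop without writing a terminator */
    dst[n] = '\0';              /* so add it explicitly */
}
```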

Related

Displaying lines with given number of characters

I have written a program that is supposed to return the lines containing at least 11 characters and 4 digits. I guess I messed something up with the variable types, but I can't figure out how to fix it.
#include <stdio.h>
#include <ctype.h>
#include <string.h>
int main()
{
char line[200];
char *temp[200];
int i = 0, k=0;
printf("Enter a string: \n");
while(fgets(line, sizeof(line),stdin))
{
int numberAlpha = 0;
int numberDigit = 0;
int i;
for(i=0; i<strlen(line); i++){
if(isalpha(line[i])) numberAlpha++;
else if(isdigit(line[i])) numberDigit++;
}
if(numberAlpha+numberDigit>10 && numberDigit>3){
temp[i]=line;
i++;
}
}
while(temp[k]!='\0'){
printf("%s", temp[k]);
k++;
}
return 0;
}
You're reusing the same buffer each time, and you're storing a pointer to that buffer in your temp array. What you're going to end up with is a bunch of the same pointer in that array, with that pointer pointing at the last line in the file.
What you can do instead is to rewrite your temp[i]=line statement to the following:
temp[i] = malloc(sizeof(line));
memcpy(temp[i], line, sizeof(line));
In so doing, you'll be creating a new array with the contents of the matching line, which won't get overwritten when you come around and read the next line out of the file.
Note that, because you're allocating that on the heap, at the end of your function you'll want to free it:
while (temp[k] != '\0') {
printf(...);
free(temp[k]);
k++;
}
As said before, one issue is with the assignment
temp[i]=line;
This can be solved by making a new heap allocation and copying the line into it with memcpy.
The other issue I can see is with the variable i. The i declared inside the while loop shadows the outer one, so after the for loop it equals strlen(line), and temp is always assigned at that index. You probably intended to fill the temp array starting from index 0, which is not what happens.
This can be solved with a separate index:
int start_index=0;
while(...){
if(numberAlpha+numberDigit>10 && numberDigit>3){
temp[start_index]=line;
start_index++;
}
}
The problem is you are assigning the same address here:
temp[i]=line;
and line is used in the loop to read as well. That means it's overwritten in every iteration.
Instead, you can use strdup() (POSIX function):
temp[i] = strdup(line);
to copy the lines you are interested in. If strdup() is not available, you can use malloc() + strcpy() to do the same. Plus, free() them later.
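For platforms without strdup(), a fallback built from malloc() and a copy could be sketched as follows. The name dup_string is made up here to avoid clashing with a library strdup.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): returns a heap copy of `s`,
 * or NULL if the allocation fails. The caller must free() the result. */
char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;       /* include the terminator */
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}
```

Because the copy lives in its own allocation, overwriting line on the next fgets() call no longer changes the stored text.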
In addition, be aware that:
fgets() will read in the newline character if there's room in the buffer which may not be what you want. So, you need to trim it out. You can do it with:
line[strcspn(line, "\n")] = 0; /* trim the trailing newline, if any */
The arguments to isalpha() and isdigit() should be cast to unsigned char to avoid potential undefined behaviour i.e. these two lines:
if(isalpha(line[i])) numberAlpha++;
else if(isdigit(line[i])) numberDigit++;
should be
if(isalpha((unsigned char)line[i])) numberAlpha++;
else if(isdigit((unsigned char)line[i])) numberDigit++;
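Putting both notes together, the per-line check could be sketched like this. The function names trim_newline and line_qualifies are assumptions for this example; the thresholds are the ones from the question.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: cuts the line at the first newline, if any. */
void trim_newline(char *line)
{
    line[strcspn(line, "\n")] = '\0';
}

/* Hypothetical helper: true if the line has more than 10 letters and
 * digits in total, and more than 3 digits. */
bool line_qualifies(const char *line)
{
    int letters = 0, digits = 0;
    for (size_t i = 0; line[i] != '\0'; i++) {
        /* Convert once so the <ctype.h> functions never see a negative value. */
        unsigned char c = (unsigned char)line[i];
        if (isalpha(c)) {
            letters++;
        } else if (isdigit(c)) {
            digits++;
        }
    }
    return letters + digits > 10 && digits > 3;
}
```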

How to correctly input a string in C

I am currently learning C, and so I wanted to make a program that asks the user to input a string and to output the number of characters that were entered, the code compiles fine, when I enter just 1 character it does fine, but when I enter 2 or more characters, no matter what number of character I enter, it will always say there is just one character and crashes after that. This is my code and I can't figure out what is wrong.
int main(void)
{
int siz;
char i[] = "";
printf("Enter a string.\n");
scanf("%s", i);
siz = sizeof(i)/sizeof(char);
printf("%d", siz);
getch();
return 0;
}
I am currently learning to program, so if there is a way to do it using the same scanf() function I will appreciate that since I haven't learned how to use any other function and probably won't understand how it works.
Please, FORGET that scanf exists. The problem you are running into, whilst caused mostly by your understandable inexperience, will continue to BITE you even when you have experience - until you stop.
Here is why:
scanf will read the input, and put the result in the char buffer you provided. However, it will make no check to make sure there is enough space. If it needs more space than you provided, it will overwrite other memory locations - often with disastrous consequences.
A safer method uses fgets - this is a function that does broadly the same thing as scanf, but it will only read in as many characters as you created space for (or: as you say you created space for).
Other observation: sizeof can only evaluate the size known at compile time : the number of bytes taken by a primitive type (int, double, etc) or size of a fixed array (like int i[100];). It cannot be used to determine the size during the program (if the "size" is a thing that changes).
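The difference between the two can be shown with a small sketch. The two function names are invented for the example; both use the same 100-byte array holding the text "hello".

```c
#include <stddef.h>
#include <string.h>

#define BUFLEN 100

/* sizeof reports the size of the array object, fixed at compile time. */
size_t array_bytes(void)
{
    char buf[BUFLEN] = "hello";
    return sizeof buf;
}

/* strlen counts characters up to the terminator, at run time. */
size_t text_length(void)
{
    char buf[BUFLEN] = "hello";
    return strlen(buf);
}
```

The first function returns BUFLEN regardless of the text stored; the second returns the number of characters actually in the string.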
Your program would look like this:
#include <stdio.h>
#include <string.h>
#define BUFLEN 100 // your buffer length
int main(void) // <<< for correctness, include 'void'
{
int siz;
char i[BUFLEN]; // <<< now you have space for a 99 character string plus the '\0'
printf("Enter a string.\n");
fgets(i, BUFLEN, stdin); // read the input, store at most BUFLEN-1 characters in i
siz = sizeof(i)/sizeof(char); // it turns out that this will give you the answer BUFLEN
// probably not what you wanted. 'sizeof' gives size of array in
// this case, not size of string
// use strlen instead:
siz = strlen(i) - 1; // strlen is a function that is declared in string.h
// it produces the string length
// subtract 1 if you don't want to count \n
printf("The string length is %d\n", siz); // don't just print the number, say what it is
// and end with a newline: \n
printf("hit <return> to exit program\n"); // tell user what to do next!
getc(stdin);
return 0;
}
I hope this helps.
Update: you asked the reasonable follow-up question: "how do I know the string was too long".
See this code snippet for inspiration:
#include <stdio.h>
#include <string.h>
#define N 50
int main(void) {
char a[N];
char *b;
printf("enter a string:\n");
b = fgets(a, N, stdin);
if(b == NULL) {
printf("an error occurred reading input!\n"); // can't think how this would happen...
return 0;
}
if (strlen(a) == N-1 && a[N-2] != '\n') { // used all space, didn't get to end of line
printf("string is too long!\n");
}
else {
printf("The string is %s which is %zu characters long\n", a, strlen(a)-1); // all went according to plan
}
}
Remember that when you have space for N characters, the last character (at location N-1) must be a '\0' and since fgets includes the '\n' the largest string you can input is really N-2 characters long.
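The "too long" test can be isolated in a small predicate that inspects the buffer after fgets() has filled it. This is a sketch with a hypothetical name, line_was_truncated. Note one assumption: a final line that fills the buffer exactly and has no newline is also reported as truncated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: `buf` was filled by fgets(buf, size, stream).
 * Returns true if the buffer is full and the last stored character is
 * not a newline, meaning the rest of the line is still unread. */
bool line_was_truncated(const char *buf, size_t size)
{
    size_t len = strlen(buf);
    return size >= 2 && len == size - 1 && buf[len - 1] != '\n';
}
```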
This line:
char i[] = "";
is equivalent to:
char i[1] = {'\0'};
The array i has only one element, the program crashes because of buffer overflow.
I suggest using fgets() to replace scanf() like this:
#include <stdio.h>
#include <string.h>
#define MAX_LEN 1024
int main(void)
{
char line[MAX_LEN];
if (fgets(line, sizeof(line), stdin) != NULL)
printf("%zu\n", strlen(line) - 1);
return 0;
}
The length is decremented by 1 because fgets() would store the new line character at the end.
The problem is here:
char i[] = "";
You are essentially creating a char array with a size of 1 due to setting it equal to "";
Instead, use a buffer with a larger size:
char i[128]; /* You can also malloc space if you desire. */
scanf("%s", i);
See the link below to a similar question if you want to include spaces in your input string. There is also some good input there regarding scanf alternatives.
How do you allow spaces to be entered using scanf?
That's because char i[] = ""; is actually a one-element array.
Strings in C are stored as text that ends with \0 (a char of value 0). You should use a bigger buffer, as others said, for example:
char i[100];
scanf("%s", i);
Then, when calculating length of this string you need to search for the \0 char.
int length = 0;
while (i[length] != '\0')
{
length++;
}
After running this code length contains length of the specified input.
You need to allocate space where scanf will put the input data. In your program, you can allocate space like:
char i[100];
which will be OK as long as the input is shorter than the buffer. Using malloc is another option. Check out the man pages.

Print a string reversed in C

I'm coding a program that takes some files as parameters and prints all lines reversed. The problem is that I get unexpected results:
If I apply it to a file containing the following lines
one
two
three
four
I get the expected result, but if the file contains
september
november
december
It returns
rebmetpes
rebmevons
rebmeceds
And I don't understand why it adds a "s" at the end
Here is my code
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void reverse(char *word);
int main(int argc, char *argv[], char*envp[]) {
/* No arguments */
if (argc == 1) {
return (0);
}
FILE *fp;
int i;
for (i = 1; i < argc; i++) {
fp = fopen(argv[i],"r"); // read mode
if( fp == NULL )
{
fprintf(stderr, "Error, no file");
}
else
{
char line [2048];
/*read line and reverse it. the function reverse it prints it*/
while ( fgets(line, sizeof line, fp) != NULL )
reverse(line);
}
fclose(fp);
}
return (0);
}
void reverse(char *word)
{
char *aux;
aux = word;
/* Store the length of the word passed as parameter */
int longitud;
longitud = (int) strlen(aux);
/* Allocate memory enough ??? */
char *res = malloc( longitud * sizeof(char) );
int i;
/* in this loop I copy the string reversed into a new one */
for (i = 0; i < longitud-1; i++)
{
res[i] = word[longitud - 2 - i];
}
fprintf(stdout, "%s\n", res);
free(res);
}
(NOTE: some code has been deleted for clarity but it should compile)
You forgot to terminate your string with a '\0' character. The loop copies only the visible characters, so nothing ever writes the terminator into res. First allocate memory for one more character than you allocated
char *res = malloc( longitud * sizeof(char) + 1);
And then try this
for (i = 0; i < longitud-1; i++)
{
res[i] = word[longitud - 2 - i];
}
res[i] = '\0'; // Terminating string with '\0'
I think I know the problem, and it's a bit of a weird issue.
Strings in C are zero terminated. This means that the string "Hi!" in memory is actually represented as 'H','i','!','\0'. The way strlen etc then know the length of the string is by counting the number of characters, starting from the first character, before the zero terminator. Similarly, when printing a string, fprintf will print all the characters until it hits the zero terminator.
The problem is, your reverse function never bothers to set the zero terminator at the end, which it needs to since you're copying characters into the buffer character by character. This means fprintf runs off the end of your allocated res buffer, and into undefined memory, which just happened to be zero when you hit it (malloc makes no promises of the contents of the buffer you allocate, just that it's big enough). You would probably see different behaviour with a debug build on Windows, since I believe the MSVC debug heap fills newly allocated buffers with 0xCD bytes rather than zeros.
So, what's happening is you copy september, reversed, into res. This works as you see, because it just so happens that there's a zero at the end.
You then free res, then malloc it again. Again, by chance (and because of some smartness in malloc) you get the same buffer back, which already contains "rebmetpes". You then put "november" in, reversed, which is slightly shorter, hence your buffer now contains "rebmevons".
So, the fix? Allocate another character too, this will hold your zero terminator (char *res = malloc( longitud * sizeof(char) + 1);). After you reverse the string, set the zero terminator right after the last copied character (res[longitud - 1] = '\0';, because the loop copies longitud - 1 characters and skips the trailing newline).
There are two errors there. The first one is that you need one more char allocated (all chars for the string + 1 for the terminator):
char *res = malloc( (longitud+1) * sizeof(char) );
The second one is that you have to terminate the string, right after the longitud-1 characters the loop copies:
res[longitud-1]='\0';
You can terminate the string before entering the loop because you already know the size of the destination string.
Note that by using calloc instead of malloc you do not need to terminate the string, as the memory is already zero-initialised.
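Combining the points from these answers, a version that returns the reversed line instead of printing it might look like the sketch below. The name reversed_line is an assumption, and using strcspn to drop the newline is a swapped-in choice rather than the original longitud - 2 arithmetic.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated, reversed copy of `line`
 * without its trailing newline, or NULL if the allocation fails.
 * The caller must free() the result. */
char *reversed_line(const char *line)
{
    size_t len = strcspn(line, "\n");   /* length up to the newline, if any */
    char *res = malloc(len + 1);        /* characters plus the terminator */
    if (res == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        res[i] = line[len - 1 - i];
    }
    res[len] = '\0';                    /* terminate right after the copy */
    return res;
}
```

Because the terminator is always written, the output no longer depends on what the previous allocation left in memory.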
Thanks, it solved my problem. I read something about the "\0" in strings but wasn't very clear, which is now after reading all the answers (all are pretty good). Thank you all for the help.

Need help finding bug, if string input is composed all of same character one output character is corrupt

reverser() reverses a cstring (not in place). 99% of the time it works but some input corrupts it for example it appears if aStr2[] is assigned a string made up of the same character it will have an error.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* reverser(const char *str);
int main()
{
char aStr[] = "aaa";
char aStr2[] = "cccccc";
printf("%s %s", aStr, aStr2);
char* tmp = reverser(aStr2);//tmp now has garbage
printf("\n%s", tmp);
printf(" %s", aStr2);
return 0;
}
char* reverser(const char *str)
{
char* revStr = (char*)malloc(strlen(str));
int i;
for(i = strlen(str)-1; i >= 0; i--)
{
revStr[strlen(str)-1-i] = str[i];
}
return revStr;
}
Gives
aaa cccccc
cccccc9 cccccc
Process returned 0 (0x0) execution time : 0.068 s
Press any key to continue
Notice the 9 that shouldn't be there.
Change this malloc to strlen(str) + 1 , plus 1 for '\0'
char* revStr = (char*)malloc(strlen(str) + 1);
and after the for loop
revStr[strlen(str)] = '\0';
Your problem is that you don't put the string terminator in your reversed string. All strings in C actually have one extra character that isn't reported by strlen, and that is the character '\0' (or plain and simple, a zero). This tells all C functions when the string ends.
Therefore you need to allocate space for this extra terminator character in your malloc call, and add it after the last character in the string.
There are also a couple of other problems with your code. The first is that you should not cast the return of malloc (or any other function returning void *). Another is that you have a memory leak, in that you do not free the memory you allocate. This last point doesn't matter in a small program like the one you have here, but will be an issue in larger and longer running programs.
You haven't null-terminated your reversed string. You need to set the final index of revStr[] to 0.
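When the caller owns a writable copy of the string, reversing in place sidesteps the allocation and the missing terminator altogether, because the existing '\0' never moves. This is a different technique from the question's reverser(), shown only as a sketch; the name reverse_in_place is made up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reverses `s` in place by swapping characters
 * from both ends. The terminator stays where it is. */
void reverse_in_place(char *s)
{
    size_t i = 0;
    size_t j = strlen(s);       /* one past the last character */
    while (i + 1 < j) {         /* at least two characters left to swap */
        j--;
        char tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
        i++;
    }
}
```

It must not be called on a string literal, since literals are not modifiable.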

How to get the start of an email address

I have two strings, one with an email address, and the other is empty.
If the email address is e.g. "abc123#gmail.com", I need to pass the start of the email address, just before the #, into the second string. For example:
first string: "abc123#gmail.com"
second string: "abc123"
I've written a loop, but it doesn't work:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char email[256] = "abc123#gmail.com";
char temp[256];
int i = 0;
while (email[i] != '#')
{
temp = strcat(temp, email[i]);
i++;
}
printf ("%s\n", temp);
system ("PAUSE");
return 0;
}
Basically, I took every time one char from the email address, and added it into the new string. For example if the new string has a on it, now I'll put b with it too using strcat....
Pointers. Firstly, strcat() returns a char pointer, and temp is an array: C does not allow assigning to an array, so temp = strcat(...) cannot compile. Secondly, the second argument to strcat() is supposed to be a char pointer (a string), not a single char.
Replacing temp = strcat(temp, email[i]); with temp[i] = email[i]; should do the trick.
Also, after the loop ends, terminate the string with a null character.
temp[i] = '\0';
(After the loop ends, i is equal to the length of your extracted string, so temp[i] is where the terminator should go.)
There are better ways to solve this problem (e.g. by finding the index of the # (by strcspn or otherwise) and doing a memcpy), but your method is very close to working, so we can just make a few small adjustments.
As others have identified, the problem is with this line:
temp = strcat(temp, email[i]);
Presumably, you are attempting to copy the character at the ith position of email into the corresponding position of temp. However, strcat is not the correct way to do so: strcat copies data from one char* to another char*, that is, it copies strings. You just want to copy a single character, which is exactly what = does.
Looking at it from a higher level (so that I don't just tell you the answer), you want to set the appropriate character of temp to the appropriate character of email (you will need to use i to index both email and temp).
Also, remember that strings in C have to be terminated by '\0', so you have to set the next character of temp to '\0' after you have finished copying the string. (On this line of thought, you should consider what happens if your email string doesn't have an # in it, your while loop will keep going past the end of the string email: remember that you can tell if you are at the end of a string by character == '\0' or just using character as a condition.)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char email[256] = "abc123#gmail.com";
char temp[256];
size_t i = 0;
#if 0
for (i=0; email[i] && email[i] != '#'; i++) {;}
/* at the end of the loop, i is either the index of the first '#'
** or that of the terminating '\0' (i.e. strlen(email))
*/
#else
i = strcspn(email, "#" );
/* the return value for strcspn() is either the index of the first '#'
* or of the terminating '\0'
*/
#endif
memcpy (temp, email, i);
temp[i] = 0;
printf ("%s\n", temp);
system ("PAUSE");
return 0;
}
UPDATE: a totally different approach would be to do the copying inside the loop (I guess this was the OP's intention):
for (i=0; temp[i] = (email[i] == '#' ? '\0' : email[i]) ; i++) {;}
You may want to try using strtok()
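A sketch of the strtok() route follows, with the caveats that strtok() modifies the string it scans, keeps hidden state between calls, and skips leading delimiters (so "#x" would yield "x", unlike the strcspn approach). The helper name prefix_with_strtok and the 256-byte working copy are assumptions for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the text before the first '#' of `email`
 * into `out`. Works on a local copy because strtok() overwrites the
 * delimiter it finds with '\0'. */
void prefix_with_strtok(const char *email, char *out, size_t out_size)
{
    char copy[256];
    snprintf(copy, sizeof copy, "%s", email);   /* bounded, always terminated */
    char *token = strtok(copy, "#");            /* NULL if there is no token */
    snprintf(out, out_size, "%s", token != NULL ? token : "");
}
```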
