Storing a string in char* in C - c

In the code below, I hope you can see that I have a char* variable and that I want to read in a string from a file. I then want to pass this string back from the function. I'm rather confused by pointers so I'm not too sure what I'm supposed to do really.
The purpose of this is to then pass the array to another function to be searched for a name.
Unfortunately the program crashes as a result and I've no idea why.
char* ObtainName(FILE *fp)
{
char* temp;
int i = 0;
temp = fgetc(fp);
while(temp != '\n')
{
temp = fgetc(fp);
i++;
}
printf("%s", temp);
return temp;
}
Any help would be vastly appreciated.

fgetc returns an int, not a char*. This int is a character from the stream, or EOF if you reach the end of the file.
You're implicitly casting the int to a char*, i.e., interpreting it as an address (turn your warnings on.) When you call printf it reads that address and continues to read a character at a time looking for the null terminator which ends the string, but that address is almost certainly invalid. This is undefined behavior.

I've taken some liberties with what you wanted to accomplish. Rather that deal with pointers, you can just use a fixed sized array as long as you can set a maximum length. I've also included several checks so that you don't run off the end of the buffer or the end of the file. Also important is to make sure that you have a null termination '\0' at the end of the string.
#define MAX_LEN 100
char* ObtainName(FILE *fp)
{
static char temp[MAX_LEN];
int i = 0;
while(i < MAX_LEN-1)
{
if (feof(fp))
{
break;
}
temp[i] = fgetc(fp);
if (temp[i] == '\n')
{
break;
}
i++;
}
temp[i] = '\0';
printf("%s", temp);
return temp;
}

So, there are several problems here:
You're not setting aside any storage for the string contents;
You're not storing the string contents correctly;
You're attempting to read memory that doesn't belong to you;
The way you're attempting to return the string is going to give you heartburn.
1. You're not setting aside storage for the string contents
The line
char *temp;
declares temp as a pointer to char; its value will be the address of a single character value. Since it's declared at local scope without the static keyword, its initial value will be indeterminate, and that value may not correspond to a valid memory address.
It does not set aside any storage for the string contents read from fp; that would have to be done as a separate step, which I'll get to below.
2. You're not storing the string contents correctly
The line
temp = fgetc(fp);
reads the next character from fp and assigns it to temp. First of all, this means you're only storing the last character read from the stream, not the whole string. Secondly, and more importantly, you're assigning the result of fgetc() (which returns a value of type int) to an object of type char * (which is treated as an address). You're basically saying "I want to treat the letter 'a' as an address into memory." This brings us to...
3. You're attempting to read memory that doesn't belong to you
In the line
printf("%s", temp);
you're attempting to print out the string beginning at the address stored in temp. Since the last thing you wrote to temp was most likely a character whose value is < 127, you're telling printf to start at a very low and most likely not accessible address, hence the crash.
4. The way you're attempting to return the string is guaranteed to give you heartburn
Since you've defined the function to return a char *, you're going to need to do one of the following:
Allocate memory dynamically to store the string contents, and then pass the responsibility of freeing that memory on to the function calling this one;
Declare an array with the static keyword so that the array doesn't "go away" after the function exits; however, this approach has serious drawbacks;
Change the function definition;
Allocate memory dynamically
You could use dynamic memory allocation routines to set aside a region of storage for the string contents, like so:
char *temp = malloc( MAX_STRING_LENGTH * sizeof *temp );
or
char *temp = calloc( MAX_STRING_LENGTH, sizeof *temp );
and then return temp as you've written.
Both malloc and calloc set aside the number of bytes you specify; calloc will initialize all those bytes to 0, which takes a little more time, but can save your bacon, especially when dealing with text.
The problem is that somebody has to deallocate this memory when its no longer needed; since you return the pointer, whoever calls this function now has the responsibility to call free() when it's done with that string, something like:
void Caller( FILE *fp )
{
...
char *name = ObtainName( fo );
...
free( name );
...
}
This spreads the responsibility for memory management around the program, increasing the chances that somebody will forget to release that memory, leading to memory leaks. Ideally, you'd like to have the same function that allocates the memory free it.
Use a static array
You could declare temp as an array of char and use the static keyword:
static char temp[MAX_STRING_SIZE];
This will set aside MAX_STRING_SIZE characters in the array when the program starts up, and it will be preserved between calls to ObtainName. No need to call free when you're done.
The problem with this approach is that by creating a static buffer, the code is not re-entrant; if ObtainName called another function which in turn called ObtainName again, that new call will clobber whatever was in the buffer before.
Why not just declare temp as
char temp[MAX_STRING_SIZE];
without the static keyword? The problem is that when ObtainName exits, the temp array ceases to exist (or rather, the memory it was using is available for someone else to use). That pointer you return is no longer valid, and the contents of the array may be overwritten before you can access it again.
Change the function definition
Ideally, you'd like for ObtainName to not have to worry about the memory it has to write to. The best way to achieve that is for the caller to pass target buffer as a parameter, along with the buffer's size:
int ObtainName( FILE *fp, char *buffer, size_t bufferSize )
{
...
}
This way, ObtainName writes data into the location that the caller specifies (useful if you want to obtain multiple names for different purposes). The function will return an integer value, which can be a simple success or failure, or an error code indicating why the function failed, etc.
Note that if you're reading text, you don't have to read character by character; you can use functions like fgets() or fscanf() to read an entire string at a time.
Use fscanf if you want to read whitespace-delimited strings (i.e., if the input file contains "This is a test", fscanf( fp, "%s", temp); will only read "This"). If you want to read an entire line (delimited by a newline character), use fgets().
Assuming you want to read an individual string at a time, you'd use something like the following (assumes C99):
#define FMT_SIZE 20
...
int ObtainName( FILE *fp, char *buffer, size_t bufsize )
{
int result = 1; // assume success
int scanfResult = 0;
char fmt[FMT_SIZE];
sprintf( fmt, "%%%zus", bufsize - 1 );
scanfResult = fscanf( fp, fmt, buffer );
if ( scanfResult == EOF )
{
// hit end-of-file before reading any text
result = 0;
}
else if ( scanfResult == 0 )
{
// did not read anything from input stream
result = 0;
}
else
{
result = 1;
}
return result;
}
So what's this noise
char fmt[FMT_SIZE];
sprintf( fmt, "%%%zus", bufsize - 1 );
about? There is a very nasty security hole in fscanf() when you use the %s or %[ conversion specifiers without a maximum length specifier. The %s conversion specifier tells fscanf to read characters until it sees a whitespace character; if there are more non-whitespace characters in the stream than the buffer is sized to hold, fscanf will store those extra characters past the end of the buffer, clobbering whatever memory is following it. This is a common malware exploit. So we want to specify a maximum length for the input; for example, %20s says to read no more than 20 characters from the stream and store them to the buffer.
Unfortunately, since the buffer length is passed in as an argument, we can't write something like %20s, and fscanf doesn't give us a way to specify the length as an argument the way fprintf does. So we have to create a separate format string, which we store in fmt. If the input buffer length is 10, then the format string will be %10s. If the input buffer length is 1000, then the format string will be %1000s.

The following code expands on that in your question, and returns the string in allocated storage:
char* ObtainName(FILE *fp)
{
int temp;
int i = 1;
char *string = malloc(i);
if(NULL == string)
{
fprintf(stderr, "malloc() failed\n");
goto CLEANUP;
}
*string = '\0';
temp = fgetc(fp);
while(temp != '\n')
{
char *newMem;
++i;
newMem=realloc(string, i);
if(NULL==newMem)
{
fprintf(stderr, "realloc() failed.\n");
goto CLEANUP;
}
string=newMem;
string[i-1] = temp;
string[i] = '\0';
temp = fgetc(fp);
}
CLEANUP:
printf("%s", string);
return(string);
}
Take care to 'free()' the string returned by this function, or a memory leak will occur.

Related

Reading string input from file in C

I have a text file called "graphics" which contains the words "deoxyribonucleic acid".
When I run this code it works and it returns the first character. "d"
int main(){
FILE *fileptr;
fileptr = fopen("graphics.txt", "r");
char name;
if(fileptr != NULL){ printf("hey \n"); }
else{ printf("Error"); }
fscanf( fileptr, "%c", &name);
printf("%c\n", name);
fclose( fileptr );
return 0;
}
When I am using the fscanf function the parameters I am sending are the name of the FILE object, the type of data the function will read, and the name of the object it is going to store said data, correct? Also, why is it that I have to put an & in front of name in fscanf but not in printf?
Now, I want to have the program read the file and grab the first word and store it in name.
I understand that this will have to be a string (An array of characters).
So what I did was this:
I made name into an array of characters that can store 20 elements.
char name[20];
And changed the parameters in fscanf and printf to this, respectively:
fscanf( fileptr, "%s", name);
printf("%s\n", name);
Doing so produces no errors from the compiler but the program crashes and I don't understand why. I am letting fscanf know that I want it to read a string and I am also letting printf know that it will output a string. Where did I go wrong? How would I accomplish said task?
This is a very common problem. fscanf reads data and copies it into a location you provide; so first of all, you need the & because you provide the address of the variable (not the value) - that way fscanf knows where to copy TO.
But you really want to use functions that copy "only as many characters as I have space". This is for example fgets(), which includes a "max byte count" parameter:
char * fgets ( char * str, int num, FILE * stream );
Now, if you know that you only allocated 20 bytes to str, you can prevent reading more than 20 bytes and overwriting other memory.
Very important concept!
A couple of other points. A variable declaration like
char myString[20];
results in myString being a pointer to 20 bytes of memory where you can put a string (remember to leave space for the terminating '\0'!). So you can usemyStringas thechar *argument infscanforfgets`. But when you try to read a single character, and that characters was declared as
char myChar;
You must create the pointer to the memory location "manually", which is why you end up with &myChar.
Note - if you want to read up to white space, fscanf is the better function to use; but it will be a problem if you don't make sure you have the right amount of space allocated. As was suggested in a comment, you could do something like this:
char myBuffer[20];
int count = fscanf(fileptr, "%19s ", myBuffer);
if(count != 1) {
printf("failed to read a string - maybe the name is too long?\n");
}
Here you are using the return value of fscanf (the number of arguments correctly converted). You are expecting to convert exactly one; if that doesn't work, it will print the message (and obviously you will want to do more than print a messageā€¦)
Not answer of your question but;
for more efficient memory usage use malloc instead of a static declaration.
char *myName // declara as pointer
myName = malloc(20) // same as char[20] above on your example, but it is dynamic allocation
... // do your stuff
free(myName) // lastly free up your allocated memory for myName

Same structure objects memory overlap?

struct integer
{
int len;
char* str;
int* arr;
}int1, int2;
int main(void) {
printf("Please enter 1st number\n");
int1.str= str_input();
int1.len=chars_read-1;
int1.arr= func2(int1.len, int1.str);
printf(("\%c\n"), *int1.str);
printf("Please enter 2nd number\n");
int2.str = str_input();
int2.len=chars_read-1;
printf(("\n %c\n"), *int1.str );
int2.arr= func2(int2.len, int2.str);
if the input is 4363 and 78596 , the output is 4 and 7 respectively.
The output is not 4 and 4. Given that both are different objects, shouldn't both have different memory allocation?
Please note: this is NOT a typographical error. I have used the same *int1.str both times. the problem is that although I have made no changes in it, its value is changing. How?
I do not think that str_input() can make a difference.
char* str_input(void) {
char cur_char;
char* input_ptr = (char*)malloc(LINE_LEN * sizeof(char));
char input_string[LINE_LEN];
//while ((cur_char = getchar()) != '\n' && cur_char<='9' && cur_char>='0' && chars_read < 10000)
for(chars_read=1; chars_read<10000; chars_read++)
{
scanf("%c", &cur_char);
if(cur_char!='\n' && cur_char<='9' && cur_char>='0')
{
input_string[chars_read-1]= cur_char;
printf("%c\n", input_string[chars_read-1]);
}
else{
break;
}
}
input_string[chars_read] = '\n';
input_ptr = &input_string[0]; /* sets pointer to address of 0th index */
return input_ptr;
}
//chars_read is a global variable.
Thanks in advance.
you have printed the same variable, *int1.str
It will be helpful have the source code of str_input(), but it's probably that it returns a pointer to the same buffer in each call, so the second call to str_input() updates also the target of int1.str (beacuse it's pointing to the same char* than int2.str)
As noted elsewhere, both of the printf calls in your question pass *int1.str to printf.
However, if that is merely a typographical error in your question, and the second printf call passes *int2.str, then most likely the problem is that str_input returns the address of a fixed buffer (with static or, worse, automatic storage duration). Instead, str_input should use malloc or strdup to allocate new memory for each string and should return a pointer to that. Then the caller should free the memory.
Alternatively, str_input may be changed to accept a buffer and size passed to it by the caller, and the caller will have the responsibility of providing a different buffer for each call.
About the newly posted code
The code for str_input contains this line:
char* input_ptr = (char*)malloc(LINE_LEN * sizeof(char));
That declares input_ptr to be a char * and calls malloc to get space. Then input_ptr is set to contain the address of that space. Later, str_input contains this line:
input_ptr = &input_string[0];
That line completely ignores the prior value of input_ptr and overwrites it with the address of input_string[0]. So the address returned by malloc is gone. The str_input function returns the address of input_string[0] each time it is called. This is wrong. str_input must return the address of the allocated space each time.
Typically, a routine like this would use input_ptr throughout, doing its work in the char array at that address. It would not use a separate array, input_string, for its work. So delete the definition of input_string and change str_input to do all its work in the space pointed to by input_ptr.
Also, do not set the size of the buffer to LINE_LEN in one place but limit the number of characters in it with chars_read < 10000. Use the same limit in all places. Also allow one byte for a null character at the end (unless you are very careful never to perform any operation that requires a null byte at the end).

Custom getLine() function for c

I need a function/method that will take in a char array and set it to a string read from stdin. It needs to return the last character read as its return type, so I can determine if it reached the end of a line or the end of file marker.
here is what I have so far, and I kind of based it off of code from here
UPDATE: I changed it, but now it just crashes upon hitting enter after text. I know this way is inefficient, and char is not the best for EOF check, but for now I am just trying to get it to return the string. I need it to do it in this fashion and no other fashion. I need the string to be the exact length of the line, and to return a value that is either the newline or EOF int which I believe can still be used in a char value.
This program is in C not C++
char getLine(char **line);
int main(int argc, char *argv[])
{
char *line;
char returnVal = 0;
returnVal = getLine(&line);
printf("%s", line);
free(line);
system("pause");
return 0;
}
char getLine(char **line) {
unsigned int lengthAdder = 1, counter = 0, size = 0;
char charRead = 0;
*line = malloc(lengthAdder);
while((charRead = getc(stdin)) != EOF && charRead != '\n')
{
*line[counter++] = charRead;
*line = realloc(*line, counter);
}
*line[counter] = '\0';
return charRead;
}
Thank you for any help in advance!
You're assigning the result of malloc() to a local copy of line, so after the getLine() function returns it's not modified (albeit you think it is). What you have to do is either return it (as opposed to use an output parameter) or pass its address (pass it 'by reference'):
void getLine(char **line)
{
*line = malloc(length);
// etc.
}
and call it like this:
char *line;
getLine(&line);
Your key problem is that line pointer value does not propagate out of the getLine() function. The solution is to pass pointer to the line pointer to the function as a parameter instead - calling it like getLine(&line); while the function would be defined as taking parameter char **line. In the function, on all places where you now work with line, you would work with *line instead, i.e. dereferencing the pointer to a pointer and working with the value of the variable in main() where the pointer leads. Hope this is not too confusing. :-) Try to draw it on a piece of paper.
(A tricky part - you must change line[counter] to (*line)[counter] because you first need to dereference the pointer to the string, and only then to access a specific character in the string.)
There is a couple of other problems with your code:
You use char as the type for charRead. However, the EOF constant cannot be represented using char, you need to use int - both as the type of charRead and return value of getLine(), so that you can actually distringuish between a newline and end of file.
You forgot to return the last char read from your getLine() function. :-)
You are reallocating the buffer after each character addition. This is not terribly efficient and therefore is a rather ugly programming practice. It is not too difficult to use another variable to track the amount of space allocated and then (i) start with allocating a reasonable chunk of memory, e.g. 64 bytes, so that ideally you will never reallocate (ii) enlarge the allocation only if you need to based on comparing the counter and your allocation size tracker. Two reallocation strategies are common - either doubling the size of the allocation or increasing the allocation by a fixed step.
The way you use realloc is not correct. If it returns NULL then the memory block will be lost.
It is better to use realloc in this way:
char *tmp;
...
tmp = realloc(line, counter);
if(tmp == NULL)
ERROR, TRY TO SOLVE IT
line = tmp;

Can't free an allocated string memory after appending character

I am trying to append a character to a string... that works fine unfortunately I can't free the mem of the string afterwards which causes that the string gets longer and longer.... as it reads a file every linie will be added to the string which obviously shouldn't happen
char* append_char(char* string, char character)
{
int length = strlen(string);
string[length] = character;
string[length+1] = '\0';
return string;
}
I allocated mem for string like
char *read_string = (char *)malloc(sizeof(char)*500);
call the function append_char(read_string,buffer[0]); and free it after the whole string is build free(read_string);
I presume that once I call the append_char() , the mem allocation is going to be changed, which cause that I can't get hold of it.
Edited:
here is the function which uses the append_char()
char *read_log_file_row(char *result,int t)
{
filepath ="//home/,,,,,/mmm.txt";
int max = sizeof(char)*2;
char buffer[max];
char *return_fgets;
char *read_string = malloc(sizeof(char)*500);
file_pointer = fopen(filepath,"r");
if(file_pointer == NULL)
{
printf("Didn't work....");
return NULL;
}
int i = 0;
while(i<=t)
{
while(return_fgets = (fgets(buffer, max, file_pointer)))
{
if(buffer[0] == '\n')
{
++i;
break;
}
if(i==t)
{
append_char(read_string,buffer[0]);
}
}
if(return_fgets == NULL)
{
free(read_string);
return NULL;
/* return "\0";*/
}
if(buffer[0] != '\n')
append_char(read_string,buffer[0]);
}
fclose(file_pointer);
strcpy(result,read_string);
free(read_string);
return result;
}
Dont cast the return value of malloc() in C.
Make sure you initialize read_string to an empty string before you try to append to it, by setting read_string[0] = '\0';.
Make sure you track the current length, so you don't try to build a string that won't fit in the buffer. 500 chars allocated means max string length is 499 characters.
Not sure what you expect should happen when you do free(read_string). It sounds (from your comment to #Steve Jessop's answer) that you do something like this:
char *read_string = malloc(500);
read_string[0] = '\0'; /* Let's assume you do this. */
append_char(read_string, 'a'); /* Or whatever, many of these. */
free(read_string);
print("%c\n", *read_string); /* This invokes UNDEFINED BEHAVIOR. */
This might print an a, but that proves nothing since by doing this (accessing memory that has been free():d) your program is invoking undefined behavior, which means that anything could happen. You cannot draw conclusions from this, since the "test" is not valid. You can't free memory and then access it. If you do it, and get some "reasonable"/"correct" result, you still cannot say that the free():ing "didn't work".
No, the memory allocation is not changed in any way by append_char. All it does is change the contents of the allocation -- by moving the nul terminator one byte along, you now care about the contents of one more of your 500 bytes than you did before.
If the string gets longer than 500 bytes (including terminator), then you have undefined behavior. If you call strlen on something that isn't a nul-terminated string, for example if you pass it a pointer to uninitialized memory straight from malloc, then you have undefined behavior.
Undefined behavior is bad[*]: feel free to read up on it, but "X has undefined behavior" is in effect a way of saying "you must not do X".
[*] To be precise: it's not guaranteed not to be bad...
Have you ever initialized the string? Try *read_string=0 after allocating it. Or use calloc. Also, have your string grown beyond the allocated memory?

How to read the standard input into string variable until EOF in C?

I am getting "Bus Error" trying to read stdin into a char* variable.
I just want to read whole stuff coming over stdin and put it first into a variable, then continue working on the variable.
My Code is as follows:
char* content;
char* c;
while( scanf( "%c", c)) {
strcat( content, c);
}
fprintf( stdout, "Size: %d", strlen( content));
But somehow I always get "Bus error" returned by calling cat test.txt | myapp, where myapp is the compiled code above.
My question is how do i read stdin until EOF into a variable? As you see in the code, I just want to print the size of input coming over stdin, in this case it should be equal to the size of the file test.txt.
I thought just using scanf would be enough, maybe buffered way to read stdin?
First, you're passing uninitialized pointers, which means scanf and strcat will write memory you don't own. Second, strcat expects two null-terminated strings, while c is just a character. This will again cause it to read memory you don't own. You don't need scanf, because you're not doing any real processing. Finally, reading one character at a time is needlessly slow. Here's the beginning of a solution, using a resizable buffer for the final string, and a fixed buffer for the fgets call
#define BUF_SIZE 1024
char buffer[BUF_SIZE];
size_t contentSize = 1; // includes NULL
/* Preallocate space. We could just allocate one char here,
but that wouldn't be efficient. */
char *content = malloc(sizeof(char) * BUF_SIZE);
if(content == NULL)
{
perror("Failed to allocate content");
exit(1);
}
content[0] = '\0'; // make null-terminated
while(fgets(buffer, BUF_SIZE, stdin))
{
char *old = content;
contentSize += strlen(buffer);
content = realloc(content, contentSize);
if(content == NULL)
{
perror("Failed to reallocate content");
free(old);
exit(2);
}
strcat(content, buffer);
}
if(ferror(stdin))
{
free(content);
perror("Error reading from stdin.");
exit(3);
}
EDIT: As Wolfer alluded to, a NULL in your input will cause the string to be terminated prematurely when using fgets. getline is a better choice if available, since it handles memory allocation and does not have issues with NUL input.
Since you don't care about the actual content, why bother building a string? I'd also use getchar():
int c;
size_t s = 0;
while ((c = getchar()) != EOF)
{
s++;
}
printf("Size: %z\n", s);
This code will correctly handle cases where your file has '\0' characters in it.
Your problem is that you've never allocated c and content, so they're not pointing anywhere defined -- they're likely pointing to some unallocated memory, or something that doesn't exist at all. And then you're putting data into them. You need to allocate them first. (That's what a bus error typically means; you've tried to do a memory access that's not valid.)
(Alternately, since c is always holding just a single character, you can declare it as char c and pass &c to scanf. No need to declare a string of characters when one will do.)
Once you do that, you'll run into the issue of making sure that content is long enough to hold all the input. Either you need to have a guess of how much input you expect and allocate it at least that long (and then error out if you exceed that), or you need a strategy to reallocate it in a larger size if it's not long enough.
Oh, and you'll also run into the problem that strcat expects a string, not a single character. Even if you leave c as a char*, the scanf call doesn't make it a string. A single-character string is (in memory) a character followed by a null character to indicate the end of the string. scanf, when scanning for a single character, isn't going to put in the null character after it. As a result, strcpy isn't going to know where the end of the string is, and will go wandering off through memory looking for the null character.
The problem here is that you are referencing a pointer variable that no memory allocated via malloc, hence the results would be undefined, and not alone that, by using strcat on a undefined pointer that could be pointing to anything, you ended up with a bus error!
This would be the fixed code required....
char* content = malloc (100 * sizeof(char));
char c;
if (content != NULL){
content[0] = '\0'; // Thanks David!
while ((c = getchar()) != EOF)
{
if (strlen(content) < 100){
strcat(content, c);
content[strlen(content)-1] = '\0';
}
}
}
/* When done with the variable */
free(content);
The code highlights the programmer's responsibility to manage the memory - for every malloc there's a free if not, you have a memory leak!
Edit: Thanks to David Gelhar for his point-out at my glitch! I have fixed up the code above to reflect the fixes...of course in a real-life situation, perhaps the fixed value of 100 could be changed to perhaps a #define to make it easy to expand the buffer by doubling over the amount of memory via realloc and trim it to size...
Assuming that you want to get (shorter than MAXL-1 chars) strings and not to process your file char by char, I did as follows:
#include <stdio.h>
#include <string.h>
#define MAXL 256
main(){
char s[MAXL];
s[0]=0;
scanf("%s",s);
while(strlen(s)>0){
printf("Size of %s : %d\n",s,strlen(s));
s[0]=0;
scanf("%s",s);
};
}

Resources