C: Writing and Reading a string to and from a binary file

I want to store strings in a binary file, along with a lot of other data. I'm using the code below (when I use it for real the strings will be malloc'd). I can write to the file; I've looked at it in a hex editor. I'm not sure I'm writing the null terminator correctly (or if I need to). When I read back out I get the same string length that I stored, but not the string. What am I doing wrong?
FILE *fp = fopen("mybinfile.ttt", "wb");
char drumCString[6] = "Hello\0";
printf("%s\n", drumCString);
//the string length + 1 for the null terminator
unsigned short sizeOfString = strlen(drumCString) + 1;
fwrite(&sizeOfString, sizeof(unsigned short), 1, fp);
//write the string
fwrite(drumCString, sizeof(char), sizeOfString, fp);
fclose(fp);
fp = fopen("mybinfile.ttt", "rb");
unsigned short stringLength = 0;
fread(&stringLength, sizeof(unsigned short), 1, fp);
char *drumReadString = malloc(sizeof(char) * stringLength);
int count = fread(&drumReadString, sizeof(char), stringLength, fp);
//CRASH POINT
printf("%s\n", drumReadString);
fclose(fp);

The mistake is in the read.
You put & in front of the pointer variable, which is why you get a segmentation fault.
When I removed it, the code worked fine and printed Hello correctly:
int count = fread(drumReadString, sizeof(char), stringLength, fp);

I see a couple of issues, some problematic, some stylistic.
You should really test the return values from malloc, fread and fwrite since it's possible that the allocation can fail, and no data may be read or written.
sizeof(char) is always 1, there's no need to multiply by it.
The character array "Hello\0" is actually 7 bytes long. You don't need to add a superfluous null terminator.
I prefer the idiom char x[] = "xxx"; rather than specifying a definite length (unless you want an array longer than the string of course).
When you fread(&drumReadString ..., you're actually overwriting the pointer, not the memory it points to. This is the cause of your crash. It should be fread(drumReadString ....
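The points above can be put together in a minimal sketch of the round trip, with every return value checked. The function names write_string and read_string are hypothetical, and the format (an unsigned short length followed by the bytes, terminator included) is the one from the question, so it shares that format's portability limits.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write a length-prefixed string (terminator included). Returns 0 on success. */
int write_string(const char *path, const char *s)
{
    size_t n = strlen(s) + 1;            /* +1 for the null terminator */
    if (n > 65535)
        return -1;                       /* must fit in the unsigned short prefix */
    unsigned short len = (unsigned short)n;

    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    int ok = fwrite(&len, sizeof len, 1, fp) == 1
          && fwrite(s, 1, len, fp) == len;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the string back. Returns a malloc'd buffer, or NULL on failure. */
char *read_string(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    unsigned short len = 0;
    char *buf = NULL;
    if (fread(&len, sizeof len, 1, fp) == 1 && len > 0) {
        buf = malloc(len);
        /* Pass buf, not &buf: fread must fill the buffer, not the pointer. */
        if (buf != NULL && fread(buf, 1, len, fp) != len) {
            free(buf);
            buf = NULL;
        }
        if (buf != NULL)
            buf[len - 1] = '\0';         /* do not trust the file to terminate it */
    }
    fclose(fp);
    return buf;
}
```

A caller would use write_string("mybinfile.ttt", "Hello"), then read_string("mybinfile.ttt"), and free the returned buffer when done.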

A couple of tips:
1
A terminating \0 is implicit in any double-quoted string literal, so by adding another one at the end you end up with two. The following two initializations are identical:
char str1[] = "Hello\0";
char str2[] = { 'H', 'e', 'l', 'l', 'o', '\0', '\0' };
So
char drumCString[] = "Hello";
is enough. Specifying the size of the array is optional when it is initialized like this; the compiler will figure out the required size (6 bytes).
2
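When writing a string, you might just as well write it as a single item of sizeOfString bytes (instead of sizeOfString items of one byte each). The bytes written are the same; only the return value changes (1 on success instead of sizeOfString):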
fwrite(drumCString, sizeOfString, 1, fp);
3
Even though it is not so common in a normal desktop PC scenario, malloc can return NULL, and you will benefit from developing a habit of always checking the result, because in embedded environments getting NULL is not an unlikely outcome.
char *drumReadString = malloc(sizeof(char) * stringLength);
if (drumReadString == NULL) {
fprintf(stderr, "drumReadString allocation failed\n");
return;
}

You don't need to write the terminating NUL, but if you leave it out you have to remember to add it when reading, i.e. malloc stringLength + 1 chars, read stringLength chars and add a \0 at the end of what has been read.
Now the usual warning: if you are writing binary file the way you are doing here, you have lots of unstated assumptions which make your format difficult to port, sometimes even to another version of the same compiler -- I've seen default alignment in struct changes between compiler versions.
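One way to reduce those unstated assumptions is to write the length byte by byte in a fixed order instead of dumping an unsigned short as it sits in memory. The sketch below assumes a 16-bit, little-endian length prefix and stores the characters without the terminator, adding it back on read as described above. The names put_string and get_string are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write a 16-bit length as two bytes, least significant first, followed by
   the characters without the terminator. Returns 0 on success. */
int put_string(FILE *fp, const char *s)
{
    size_t n = strlen(s);
    if (n > 0xFFFF)
        return -1;                       /* does not fit in the prefix */
    unsigned char hdr[2] = { (unsigned char)(n & 0xFF),
                             (unsigned char)((n >> 8) & 0xFF) };
    if (fwrite(hdr, 1, 2, fp) != 2)
        return -1;
    if (n > 0 && fwrite(s, 1, n, fp) != n)
        return -1;
    return 0;
}

/* Read one string. Returns a malloc'd, null-terminated buffer, or NULL on
   end of file, read error, or allocation failure. */
char *get_string(FILE *fp)
{
    unsigned char hdr[2];
    if (fread(hdr, 1, 2, fp) != 2)
        return NULL;
    size_t n = (size_t)hdr[0] | ((size_t)hdr[1] << 8);

    char *buf = malloc(n + 1);           /* +1 for the terminator we add */
    if (buf == NULL)
        return NULL;
    if (n > 0 && fread(buf, 1, n, fp) != n) {
        free(buf);
        return NULL;
    }
    buf[n] = '\0';
    return buf;
}
```

Because the length is assembled from individual bytes, the file reads back the same way regardless of the machine's byte order or the size of unsigned short.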

Some more to add to paxdiablo and AProgrammer: if you are going to use malloc in the future, just do it from the get-go. It's better form and means you won't have to debug when you switch over.
Additionally, I don't fully see the point of the unsigned short. If you are planning on writing a binary file, consider that the unsigned char type is always exactly one byte, making it very convenient for that purpose.

Just replace &drumReadString with drumReadString in the fread call, as ganesh mentioned. drumReadString is already a pointer to the allocated memory, so passing it directly gives fread the address of the buffer; &drumReadString is the address of the pointer variable itself.

Related

How to get the length of a string in c, if it has integers in it

I am familiar with the sizeof operation in C, but when I use it for the string "1234abcd" it only returns 4, which I am assuming is accounting for the last 4 characters.
So how would I get this to be a string of size 8?
specific code is as follows:
FILE *in_file;
in_file = fopen(filename, "r");
if (in_file == NULL) {
printf("File does not exist\n");
return 1;
}
int val_to_inspect = 0;
fscanf(in_file, "%x", &val_to_inspect);
while (val_to_inspect != 0) {
printf("%x", val_to_inspect);
int length = sizeof val_to_inspect;
printf("%d", length);
Again, the string that is being read from the file is "1234abcd", just to clarify.
There're a couple of issues here:
sizeof operator returns the size of the object. In this case it returns the size of val_to_inspect, which is an int.
http://en.cppreference.com/w/cpp/language/sizeof
fscanf reads from a stream and interprets it. You are only scanning an integer ("%x"), not a string.
http://en.cppreference.com/w/cpp/io/c/fscanf
Lastly, if you actually had a null-terminated string, to get its length you could use strlen().
TL;DR, to get the length of a string, you need to use strlen().
That said, be a little cautious while using sizeof, it operates on the data type. So, if you pass a pointer to it, it will return you the size of the pointer variable, not the length of the string it points to.
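As a minimal sketch of that advice, the token can be read as text rather than converted to a number, and then measured with strlen(). The function name token_length and the 63-character limit are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Read one whitespace-delimited token of at most 63 characters from fp
   and return its length, or -1 if nothing could be read. */
int token_length(FILE *fp)
{
    char token[64];
    /* The width in %63s keeps fscanf from overrunning the buffer. */
    if (fscanf(fp, "%63s", token) != 1)
        return -1;
    return (int)strlen(token);
}
```

For a file containing 1234abcd this returns 8, because the characters are kept as a string instead of being folded into an int.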
In several important ways, only some of which have anything to do with sizeof, you are mistaken about what your code actually does.
FILE *in_file;
in_file = fopen(filename, "r");
if (in_file == NULL)
{
printf("File does not exist\n");
return 1;
}
Kudos for actually checking whether fopen succeeded; lots of people forget to do that when they are starting out in C. However, there are many reasons why fopen might fail; the file not existing is just one of them. Whenever an I/O operation fails, make sure to print strerror(errno) so you know the actual reason. Also, error messages should be sent to stderr, not stdout, and should include the name of the affected file(s) if any. Corrected code looks like
if (in_file == NULL)
{
fprintf(stderr, "Error opening %s: %s\n", filename, strerror(errno));
return 1;
}
(You will need to add includes of string.h and errno.h to the top of the file if they aren't already there.)
int val_to_inspect = 0;
fscanf(in_file,"%x", &val_to_inspect);
This code does not read a string from the file. It skips any leading whitespace and then reads a sequence of hexadecimal digits from the file, stopping as soon as it encounters a non-digit, and immediately converts them to a machine number which is stored in val_to_inspect. With the file containing 1234abcd, it will indeed read eight characters from the file, but with other file contents it might read more or fewer.
(Technically, with the %x conversion specifier you should be using an unsigned int, but most implementations will let you get away with using a signed int.)
(When you get more practice in C you will learn that scanf is broken-as-specified and also very difficult to use robustly, but for right now don't worry about that.)
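One commonly used alternative to fscanf for this kind of input is to read a whole line with fgets and convert it with strtoul, which reports both "no digits found" and overflow. The sketch below is only an illustration; the function name read_hex_line and the 64-byte line buffer are assumptions.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one line of hexadecimal text into *out. Returns 0 on success and
   -1 on end of file, read error, overflow, or a line without hex digits. */
int read_hex_line(FILE *fp, unsigned long *out)
{
    char line[64];
    if (fgets(line, sizeof line, fp) == NULL)
        return -1;

    char *end;
    errno = 0;
    unsigned long v = strtoul(line, &end, 16);
    if (end == line || errno == ERANGE)
        return -1;                       /* nothing converted, or out of range */
    *out = v;
    return 0;
}
```

Unlike the fscanf call, this stores the result in an unsigned type, which matches what a hexadecimal conversion produces.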
while (val_to_inspect != 0) {
printf("%x", val_to_inspect);
int length = sizeof val_to_inspect;
printf("%d", length);
}
You are not applying sizeof to a string, you are applying it to an int. The size of an int, on your computer, is 4 chars, and that is true no matter what the value is.
Moreover, sizeof applied to an actual C string (that is, a char * variable pointing to a NUL-terminated sequence of characters) does not compute the length of the string. It will instead tell you the size of the pointer to the string, which will be a constant (usually either 4 or 8, depending on the computer) independent of the length of the string. To compute the length of a string, use the library function strlen (declared in string.h).
You will sometimes see clever code apply sizeof to a string literal, which does return a number related to (but not equal to!) its length. Exercise for you: figure out what that number is, and why sizeof does this for string literals but not for strings in general. (Hint: sizeof s will return a number related to s's string length when s was declared as char s[] = "string";, but not when it was declared as char *s = "string";.)
As a final note, it doesn't matter in the grand scheme of things whether you like your opening braces on their own lines or not, but pick one style and stick to it throughout the entire file. Don't put some if opening braces on their own lines and others at the end of the if line.
You can also keep your own counter and find the length of "1234abcd" by reading the file character by character. (Note that a trailing newline in the file is counted too.)
FILE *in_file;
int ch; /* int, not char, so that EOF can be told apart from a real character */
int length = 0;
in_file = fopen("filename.txt", "r");
if (in_file == NULL)
{
printf("File does not exist\n");
return 1;
}
/* test for EOF before printing or counting the character */
while ((ch = fgetc(in_file)) != EOF) {
putchar(ch);
length++;
}
fclose(in_file);
printf("\n%d", length);
Everyone, thank you for all the feedback. I realize I made a lot of mistakes with the original post, but im just switching to c from c++, so a lot of the things I'm used to cant really be applied the same way. This is all tremendously helpful, it's good to have a place to go to.
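If the string is held in a char array that is exactly as large as its contents (for example char s[] = "1234abcd";), you can also compute Len = sizeof(s) / sizeof(char) - 1. This does not work for a char * pointer.
The -1 accounts for the terminating null character.
If you want the length from a specific starting index, just use Len - index.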

Storing strings in array in C

I have read a lot of questions on this, and using them I have altered my code and have created code which I thought would work.
I think it's my understanding of C, which is failing me here as I can't see where I'm going wrong.
I get no compilation errors, but when I run i receive 'FileReader.exe has stopped working' from the command prompt.
My code is :
void storeFile(){
int i = 0;
char allWords [45440][25];
FILE *fp = fopen("fileToOpen.txt", "r");
while (i <= 45440){
char buffer[25];
fgets(buffer, 25, fp);
printf("The word read into buffer is : %s",buffer);
strcpy(allWords[i], buffer);
printf("The word in allWords[%d] is : %s", i, allWords[i]);
//allWords[i][strlen(allWords[i])-1] = '\0';
i = i + 1;
}
fclose(fp);
}
There are 45440 lines in the file, and no words longer than 25 char's in length. I'm trying to read each word into a char array named buffer, then store that buffer in an array of char arrays named allWords.
I am trying to get this part working, before I refactor to return the array to the main method (which I feel won't be a fun experience).
You are trying to allocate more than a megabyte (45440*25) worth of data in automatic storage. On many architectures this results in stack overflow before your file-reading code even gets to run.
You can work around this problem by allocating allWords statically, like this
static char allWords [45440][25];
or dynamically, like this:
char (*allWords)[25] = malloc(45440 * sizeof(*allWords));
Note that using buffer in the call to fgets is not required, because allWords[i] can be used instead, without strcpy:
fgets(allWords[i], sizeof(*allWords), fp);
Also note that an assumption about file size is unnecessary: you can continue calling fgets until it returns NULL; this indicates that the end of the file has been reached, so you can exit the loop using break.
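Putting those suggestions together, a minimal sketch might look like the following. The name load_words, the Word typedef and the 32-byte slot size are assumptions for this example; the slot must be large enough for the longest word plus the newline and the terminator.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define WORD_SIZE 32          /* assumed: longest word + '\n' + '\0' fits */
typedef char Word[WORD_SIZE]; /* one fixed-size slot per line */

/* Read up to max_words lines from fp into a heap-allocated table.
   Stores the number of lines read in *count.
   Returns NULL if the allocation fails; the caller frees the table. */
Word *load_words(FILE *fp, size_t max_words, size_t *count)
{
    Word *words = malloc(max_words * sizeof *words);
    if (words == NULL)
        return NULL;

    size_t n = 0;
    /* Stop when the table is full or fgets reports end of file. */
    while (n < max_words && fgets(words[n], sizeof words[n], fp) != NULL) {
        words[n][strcspn(words[n], "\n")] = '\0';   /* drop the newline */
        n++;
    }
    *count = n;
    return words;
}
```

Because the table lives on the heap, it can be returned to the caller directly, which also addresses the plan to hand the array back to main.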

C program printing weird characters

I have a program that reads the content of a file and saves it into buf. After reading the content it is supposed to copy two by two chars to an array. This code works fine if I'm not trying to read from a file but if I try to read it from a file the printf from buffer prints the two chars that I want but adds weird characters. I've confirmed and it's saving correctly into buf, no weird characters there. I can't figure out what's wrong... Here's the code:
char *buffer = (char*)malloc(2*sizeof(char));
char *dst = buffer;
char *src = buf;
char *end = buf + strlen(buf);
char *baby = '\0';
while (src<= end)
{
strncpy(dst, src, 2);
src+= 2;
printf("%s\n", buffer);
}
Change (char*)malloc(2*sizeof(char)); to malloc(3*sizeof*buffer);. You need an additional byte to store the terminating null character, which marks the end of the string. Also, do not cast the return value of malloc(). Thanks to unwind.
In your case you have passed n as 2 to strncpy(), which leaves no room for the terminating null byte. Without the terminating null, printf() does not know where to stop. With 3 bytes of memory, you can keep strncpy(dst, src, 2) and then set buffer[2] = '\0' so that the string is properly terminated.
strncpy() does not add the terminating null itself when there is no null byte among the first n bytes of the source, so you need to take care of it programmatically.
Check the man pages for strncpy() and strcpy().
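A minimal sketch of copying two characters at a time with explicit termination follows. The helper name copy_pair is hypothetical; it copies by hand rather than with strncpy() so that the terminator is always written.

```c
#include <string.h>

/* Copy up to two characters starting at src[index] into out, which must
   have room for 3 bytes, and terminate it.
   Returns the number of characters copied (0, 1 or 2). */
size_t copy_pair(const char *src, size_t index, char out[3])
{
    size_t len = strlen(src);
    size_t n = 0;
    while (n < 2 && index + n < len) {
        out[n] = src[index + n];
        n++;
    }
    out[n] = '\0';                       /* always terminate, even for 0 or 1 chars */
    return n;
}
```

Calling it with index 0, 2, 4, and so on while index is less than strlen(src) prints each pair cleanly with printf("%s\n", out), including a final single character when the length is odd.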

Why is fgets() and strncmp() not working in this C code for string comparison?

This is a very fun problem I am running into. I did a lot of searching on Stack Overflow and found others had some similar problems. So I wrote my code accordingly. I originally had fscanf() and strcmp(), but that completely bombed on me. So other posts suggested fgets() and strncmp() and using the length to compare them.
I tried to debug what I was doing by printing out the size of my two strings. I thought, maybe they have \n floating in there or something and messing it up (another post talked about that, but I don't think that is happening here). So if the size is the same, the limit for strncmp() should be the same. Right? Just to make sure they are supposedly being compared right. Now, I know that strncmp() returns 0 if the strings are the same and a nonzero value otherwise. But it's not working.
Here is the output I am getting:
perk
repk
Enter your guess: perk
Word size: 8 and Guess size: 8
Your guess is wrong
Enter your guess:
Here is my code:
void guess(char *word, char *jumbleWord)
{
size_t wordLen = strlen(word);
size_t guessLen;
printf("word is: %s\n",word);
printf("jumble is: %s\n", jumbleWord);
char *guess = malloc(sizeof(char) * (MAX_WORD_LENGTH + 1));
do
{
printf("Enter your guess: ");
fgets(guess, MAX_WORD_LENGTH, stdin);
printf("\nword: -%s- and guess: -%s-", word, guess);
guessLen = strlen(guess);
//int size1 = strlen(word);
//int size2 = strlen(guess);
//printf("Word size: %d and Guess size: %d\n",size1,size2);
if(strncmp(guess,word,wordLen) == 0)
{
printf("Your guess is correct\n");
break;
}
}while(1);
}
I updated it from suggestions below. Especially after learning the difference between char * as a pointer and referring to something as a string. However, it's still giving me the same error.
Please note that MAX_WORD_LENGTH is a define statement used at the top of my program as
#define MAX_WORD_LENGTH 25
Use strlen, not sizeof. Also, you shouldn't use strncmp here, if your guess is a prefix of the word it will mistakenly report a match. Use strcmp.
sizeof(guess) is returning the size of a char * not the length of the string guess. Your problem is that you're using sizeof to manage string lengths. C has a function for string length: strlen.
sizeof is used to determine the size of data types and arrays. sizeof only works for strings in one very specific case - I won't go into that here - but even then, always use strlen to work with string lengths.
You'll want to decide how many characters you'll allow for your words. This is a property of your game, e.g. words in the game are never more than 11 characters long.
So:
// define this somewhere, a header, or near top of your file
#define MAX_WORD_LENGTH 11
// ...
size_t wordlen = strlen(word);
size_t guessLen;
// MAX_WORD_LENGTH + 1, 1 more for the null-terminator:
char *guess = malloc(sizeof(char) * (MAX_WORD_LENGTH + 1));
printf("Enter your guess: ");
fgets(guess, MAX_WORD_LENGTH, stdin);
guessLen = strlen(guess);
Also review the docs for fgets and note that the newline character is retained in the input, so you'll need to account for that if you want to compare the two words. One quick fix for this is to only compare up to the length of word, and not the length of guess, so: if( strncmp(guess, word, wordLen) == 0). The problem with this quick fix is that it will pass invalid inputs, i.e. if word is eject, and guess is ejection, the comparison will pass.
Finally, there's no reason to allocate memory for a new guess in each iteration of the loop, just use the string that you've already allocated. You could change your function setup to:
char guess(char *word, char *jumbledWord)
{
int exit;
size_t wordLen = strlen(word);
size_t guessLen;
char *guess = malloc(sizeof(char) * (MAX_WORD_LENGTH + 1));
do
{
printf("Enter your guess: ");
// ...
As everyone else has stated, use strlen not sizeof. The reason this is happening though, is a fundamental concept of C that is different from Java.
Java does not give you access to pointers. Not only does C have pointers, but they are fundamental to the design of the language. If you don't understand and use pointers properly in C then things won't make sense, and you will have quite a bit of trouble.
So, in this case, sizeof is returning the size of the char * pointer, which is (usually) 4 or 8 bytes. What you want is the length of the data structure "at the other end" of the pointer. This is what strlen encapsulates for you.
If you didn't have strlen, you would need to dereference the pointer, then walk the string until you find the null byte marking the end.
size_t i = 0;
const char *p = guess; /* walk a copy so that guess itself is not advanced */
while (*p++) { i++; }
Afterwards, i will hold the length of your string.
Update:
Your code is fine, except for one minor detail. The docs for fgets note that it will keep the trailing newline char.
To fix this, add the following code in between the fgets and strncmp sections:
if ( guessLen > 0 && guess[guessLen-1] == '\n' ) {
guess[guessLen-1] = '\0';
}
That way the trailing newline, if any, gets removed and you are no longer off by one.
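The same fix can be sketched as a small helper that reads a line, strips the newline with strcspn() and then compares the whole string with strcmp(), so a prefix such as "per" no longer matches "perk". The name guess_matches and the 64-byte buffer are assumptions for this example.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from in, strip the trailing newline, and compare it with
   word. Returns 1 on a match, 0 on a mismatch, -1 on EOF or read error. */
int guess_matches(FILE *in, const char *word)
{
    char guess[64];
    if (fgets(guess, sizeof guess, in) == NULL)
        return -1;
    /* strcspn gives the index of the first '\n', or the string length if
       there is none, so this is safe for empty and unterminated lines. */
    guess[strcspn(guess, "\n")] = '\0';
    return strcmp(guess, word) == 0;
}
```

In the game loop this would be called as guess_matches(stdin, word) until it returns 1, treating -1 as a reason to stop asking.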
A list of problems and pieces of advice for your code, much too long to fit in a comment:
Your function returns a char, which is strange. I don't see the logic and, more importantly, you never actually return a value. Don't do that; it will bring you trouble.
Look into other control structures in C; in particular, don't do your exit thing. First, exit in C is a function which does what it says: it exits the program. Second, there is a break statement to leave a loop.
A common idiom is
do {
if (something) break;
} while(1);
You allocate a buffer in each iteration, but you never free it. This will give you big memory leaks: buffers that are wasted and inaccessible to your code.
Your strncmp approach is only correct if the strings have the same length, so you'd have to test that first.

Reading strings and integers from one binary text in C

I'm using C and I want to read from a binary file.
I know that it contains strings laid out as follows: the length of a string, the string itself, the length of the next string, that string itself, and so on.
I want to count the number of times which the string Str appears in the binary file.
So I want to do something like this:
int N;
while (!feof(file)){
if (fread(&N, sizeof(int), 1, file)==1)
...
Now I need to get the string itself. I know it's length. Should I do a 'for'
loop and get with fgetc char by char? I know I'm not allowed to use fscanf since
it's not a text file, but can I use fgetc? And would I get what I'm expecting for
my string? (To use dynamic allocation for char* for it with the size of the length
and use strcpy to add it to the current string?)
You could allocate some memory with malloc then fread into that buffer:
char *str;
/* ... */
if (fread(&N, sizeof(int), 1, file)==1)
{
/* check that N > 0 */
str = malloc(N+1);
if (fread(str, sizeof(char), N, file) == N)
{
str[N] = '\0'; /* terminate str */
printf("Read %d chars: %s\n", N, str);
}
free(str);
}
You should probably loop on:
while (fread(&N, sizeof(int), 1, file) == 1)
{
// Check N for sanity
char *buffer = malloc(N+1);
// Check malloc succeeded
if (fread(buffer, N, 1, file) != 1)
...process error...
buffer[N] = '\0'; // Null terminate for sanity's sake
...store buffer (the pointer) for later processing so you aren't leaking...
...or free it if you won't need it later...
}
You could use getc() or fgetc() in a loop; that would work. However, the direct fread() is much simpler (and is coded as if it uses getc() in a loop).
You might want to do some sanity checking on N before blindly using it with malloc(). In particular, negative values are likely to lead to much unhappiness.
The file format as written is tied to one class of machine — either big-endian or little-endian, and with the fixed size of int (probably 32-bits). Writing more portable data is slightly fiddlier, but eminently doable — but probably not relevant to you just yet.
Using feof() is seldom the correct way to test for whether to continue with a loop. Indeed, there is not often a need to use feof() in code. When it is used, it is because an I/O operation 'failed' and you need to disambiguate between 'it was not an error — just EOF' and 'there was some sort of error on the device'.
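For the original goal of counting how often a given string appears, the loop-on-fread pattern above can be sketched as follows. The name count_matches and the MAX_RECORD sanity limit are assumptions, and the file is assumed to hold records written as a native int length followed by that many bytes with no terminator.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RECORD 4096   /* assumed upper bound on one record's length */

/* Count the records equal to target in a file of [int length][bytes]
   records. Returns the count, or -1 on a malformed record or an
   allocation failure. */
long count_matches(FILE *file, const char *target)
{
    size_t target_len = strlen(target);
    long matches = 0;
    int n;

    /* The loop ends when a length can no longer be read, so feof() is
       not needed to control it. */
    while (fread(&n, sizeof n, 1, file) == 1) {
        if (n < 0 || n > MAX_RECORD)
            return -1;                       /* sanity check before malloc */
        char *buf = malloc((size_t)n + 1);   /* +1 avoids a zero-size request */
        if (buf == NULL)
            return -1;
        if (n > 0 && fread(buf, 1, (size_t)n, file) != (size_t)n) {
            free(buf);
            return -1;                       /* truncated record */
        }
        if ((size_t)n == target_len && memcmp(buf, target, target_len) == 0)
            matches++;
        free(buf);
    }
    return matches;
}
```

Comparing the lengths first and then using memcmp avoids relying on a terminator that the file does not store.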
