Reading non-fixed length from binary file (C) - c

I'm new here, and I need some help. :)
I am working on a program that has to write and read a binary file. I have to add lectures to it, they look like:
COURSECODE;COURSENAME;MAXAPPLICANTS;ACTUALAPPLICANTS;
I could write that in a file without any problems using char*.
My question is: how do I read that back in a struct if the records are non-fixed size? (e.g.: coursename can be Linear Algebra or Analysis -> length is non-determined) I also need to modify the actual applicants number, how do I find the character position of it, and the current line?
I'd be happy with ideas, and I would appreciate any source code as well, I was programming in C++ and C is a hard step-back for me.
Thank you in advance!

Your structure looks like
struct student {
char *coursecode;
char *coursename;
char *max_applicants;
char *actual_applicants;
};
Just add another member to your structure, say int size, which stores the total size of the record.
Every time you read from the binary file, first read those 4 bytes to get the complete size of the record, then read that many characters and tokenize the string by ; to recover the fields.
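A minimal sketch of the length-prefix idea, assuming each record is stored as a 4-byte length followed by that many bytes of COURSECODE;COURSENAME;MAXAPPLICANTS;ACTUALAPPLICANTS; text. The names write_record and read_record are made up for illustration, and the length is written in host byte order, so a file written this way is not portable between machines with different endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write one record as a 4-byte length followed by the text (no terminator). */
static int write_record(FILE *f, const char *text)
{
    assert(f != NULL && text != NULL);
    uint32_t len = (uint32_t)strlen(text);
    if (fwrite(&len, sizeof len, 1, f) != 1)
        return -1;
    if (len > 0 && fwrite(text, 1, len, f) != len)
        return -1;
    return 0;
}

/* Read one record. Returns a malloc'd, NUL-terminated string,
 * or NULL at end of file or on error. The caller frees the result. */
static char *read_record(FILE *f)
{
    assert(f != NULL);
    uint32_t len;
    if (fread(&len, sizeof len, 1, f) != 1)
        return NULL;
    char *text = malloc((size_t)len + 1);
    if (text == NULL)
        return NULL;
    if (len > 0 && fread(text, 1, len, f) != len) {
        free(text);
        return NULL;
    }
    text[len] = '\0';
    return text;
}
```

Each string returned by read_record can then be split on ; to fill the struct.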

Without termination characters it is impossible.
If you dedicate some character to splitting the data apart, then it's possible.
For instance, 3 strings can be told apart by their \0, so you read until \0 three times.

You could read the file into a char* buffer, then replace every ; with \0 (the string termination character), and finally store pointers to the start of each field in your struct:
struct student {
char *coursecode;
char *coursename;
char *max_applicants;
char *actual_applicants;
};
You might want to parse numeric fields with atoi first.
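A sketch of that in-place split, assuming one record of the form CODE;NAME;MAX;ACTUAL; is already in a writable buffer. The struct course layout and the function name parse_course are illustrative, not taken from the question; the numeric fields are converted with atoi as suggested above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct course {
    char *code;
    char *name;
    int max_applicants;
    int actual_applicants;
};

/* Split "CODE;NAME;MAX;ACTUAL;" in place. The struct points into buf,
 * so buf must outlive it. Returns 0 on success, -1 on a malformed record. */
static int parse_course(char *buf, struct course *out)
{
    assert(buf != NULL && out != NULL);
    char *field[4];
    char *p = buf;
    for (int i = 0; i < 4; i++) {
        char *sep = strchr(p, ';');
        if (sep == NULL)
            return -1;
        *sep = '\0';        /* terminate this field where the ';' was */
        field[i] = p;
        p = sep + 1;
    }
    out->code = field[0];
    out->name = field[1];
    out->max_applicants = atoi(field[2]);
    out->actual_applicants = atoi(field[3]);
    return 0;
}
```

Because the struct only holds pointers into the buffer, nothing needs to be freed per field, but the buffer must not be reused while the struct is still in use.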

Piece of advice #1: if you're Hindu and are ever reborn, start by learning C, and only then transition to C++.
Piece of advice #2: so if I understand correctly, you have four strings in a row, separated by semi-colons. Then you can use strtok_r() to split up each line and put the contents of the file in an array of structs (all error checking omitted for clarity, but you should definitely have some):
typedef struct {
char *code;
char *name;
int max_appl;
int cur_appl;
} Course;
char buf[1024];
FILE *f = fopen("courses.txt", "r");
size_t size = 0;
size_t allocsize = 8;
Course *c = malloc(allocsize * sizeof(*c));
char *end;
while (fgets(buf, sizeof(buf), f) != NULL) {
if (size >= allocsize) {
allocsize <<= 1;
c = realloc(c, allocsize * sizeof(*c));
}
c[size].code = strdup(strtok_r(buf, ";", &end));
c[size].name = strdup(strtok_r(NULL, ";", &end));
c[size].max_appl = strtol(strtok_r(NULL, ";", &end), NULL, 10);
c[size].cur_appl = strtol(strtok_r(NULL, "\n", &end), NULL, 10);
size++;
}
int i;
for (i = 0; i < size; i++) {
Course *p = c + i;
printf("%s\n%s\n%d\n%d\n\n", p->code, p->name, p->max_appl, p->cur_appl);
free(p->code);
free(p->name);
}
free(c);
fclose(f);

Related

all array elements are the same fgets in C?

So currently my program uses a hard-coded array like this:
char *array[] = {"array","ofran","domle","tters", "squar"};
Basically n strings of length n: an n*n grid. I then treat the values like a 2D array, so I access array[y][x] and do comparison operations and math using the corresponding ASCII values.
I wanted to allow text files of various sizes (n*n, up to 32) to be read by my program instead of hard-coding the grid, but I am having issues with using fgets.
My current function for getting and storing the file information looks like this:
char *array[32];
char buffer[32];
FILE *fp = fopen("textfile.txt","r");
int n = 0;
while(fgets(buffer, 32, fp)){
array[i] = buffer;
n++;
}
fclose(fp);
but all values of "array" are the same (they are the last string). So with the example values above. If I printed array[0] to array [4] I get
values from my code
squar
squar
squar
squar
squar
expected values:
array
ofran
domle
tters
squar
array[i] = buffer just assigns the very same pointer to all elements of array. You need dynamic memory allocation here:
char *array[32];
char buffer[32];
FILE *fp = fopen("textfile.txt","r");
int n = 0;
while(fgets(buffer, 32, fp)){
array[n] = strdup(buffer); // allocate memory for a new string
// containing a copy of the string in buffer
n++;
}
fclose(fp);
No error checking is done here for brevity. Also if the input file contains more than 32 lines you'll run into trouble.
if strdup does not exist on your platform:
char *strdup(const char *str)
{
char *newstring = malloc(strlen(str) + 1); // + 1 for the NUL terminator
if ( newstring )
strcpy(newstring, str);
return(newstring);
}
Again no error checking is done here for brevity.
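A sketch with the missing checks added, assuming lines shorter than 63 characters. The helper name read_lines and the buffer size are assumptions for illustration. It stops once the array is full, strips the trailing newline, and makes its own copy with malloc and strcpy so it does not depend on strdup.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read up to max lines from fp into array; returns the number stored.
 * Each stored line is a separate malloc'd copy without its newline. */
static size_t read_lines(FILE *fp, char *array[], size_t max)
{
    assert(fp != NULL && array != NULL);
    char buffer[64];
    size_t n = 0;
    while (n < max && fgets(buffer, sizeof buffer, fp) != NULL) {
        buffer[strcspn(buffer, "\n")] = '\0';   /* drop the trailing newline */
        char *copy = malloc(strlen(buffer) + 1);
        if (copy == NULL)
            break;                              /* out of memory: stop early */
        strcpy(copy, buffer);
        array[n++] = copy;
    }
    return n;
}
```

The caller is responsible for calling free on each of the returned strings.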
Given this code:
char buffer[32];
How many buffer variables are there?
One.
So this code
array[i] = buffer;
points every char * element of array at the ONE buffer.
(One fix is to do what @Jabberwocky posted in his answer: use strdup().)
char *array[32];
char buffer[32];
....
while(fgets(buffer, 32, fp)){
array[i] = buffer;
....
Look at your variables: the first one is an array of 32 char* pointers, the second one is a 32-char array.
In the while loop, you assign every element of the array to the same buffer, while fgets just keeps overwriting that buffer with the latest data.

Import data to ragged array

As an exercise, I have built a simple program that, given a text file of N lowercase words and whitespace, populates a ragged array char *en[N].
It works without great problems, apart from one: it populates the ragged array with only the last word of the input.
#include<stdio.h>
#include<ctype.h>
int main(int argc, char *argv[]){
int i = 0, j = 0;
char *en[100];
char temp[20];
FILE *p = fopen(argv[1], "r");
char single;
while((single = fgetc(p)) != EOF){
if(!isspace(single)) /* Temporary store a single word */
temp[i++] = single;
else{
temp[i] = '\0';
en[j++] = temp; /* Save stored word in ragged array */
i = 0;
}
}
printf("%s\n", en[0]); /* Return the same than en[1] and en[99] */
printf("%s\n", en[1]);
printf("%s\n", en[99]);
return 0;
}
I cannot understand why it goes down to the end of the input file. I am unable to detect any major issue that would suggest a wrong approach.
Edit:
The reasoning behind my approach was that an array of *char can be initialized with this form:
p[0] = "abc";
reasoning that I wrongly tried to translate into the error above, which @coderredoc brilliantly caught. As far as the dimensions of single words and inputs are concerned, I admit I did not pay much attention to them. The exercise is centered on a different topic. In any case, thanks a lot for your valuable suggestions!
The pointers in your array all point to the same char array, and the content of that array ends up being the last word. So you get only the last word.
A possible solution
en[j++] = temp;
to
en[j++] = strdup(temp);
Then you will achieve the behavior you want your program to have.
You just found out the awesomeness of pointers, congratulations!
Seriously, char *en[100] is an array of pointers. en[j++] = temp; assigns the pointer to the first value of temp to a pointer at en[j++]. And you do this over and over again. No surprise that you end up with an array of pointers, all of which point to the same array temp, which holds the contents of the last word.
What to learn from this: a pointer merely points to some memory, and no memory copying happens when you do en[j++] = temp;. You have to allocate the memory yourself and copy temp to that new memory yourself.

Getting 'Segmentation Fault' when running simple string manipulation program

I'm trying to learn some C and am having a little bit of trouble with manipulating strings. In trying to learn them, I've decided to make a simple Spanish verb conjugator, but I'm getting stuck. Right now I'm just trying to drop the last 2 non-'\0' characters of the string and then add an 'o' to it. (For example, for an input of "hablar" I want it to output "hablo".) Here's my code. I've tried to be overly detailed in my comments to hopefully aid in figuring out what I'm missing conceptually.
#include <stdio.h>
#include <string.h>
/* Reimplemented the length function of a string for practice */
int len(char *);
void conjugatePresentAr(char *, char *);
int len(char *arr){
int l = 0;
while (*arr++ != '\0'){
l++;
}
return l;
}
void conjugatePresentAr(char *verb, char *output){
output = verb;
int i = len(verb);
while (output < (verb + i -2)){
*output = *verb;
output++;
verb++;
}
*output = 'o';
output++;
*output = '\0';
}
int main(){
char input[20];
scanf("%s", input);
printf("%s\n",input);
char conjugated[20];
conjugatePresentAr(input, conjugated);
printf("%s\n", conjugated);
return 0;
}
For any input I get Segmentation Fault: 11. I've spent a decent amount of time looking around here and reading through books on pointers but can't quite seem to figure out what I'm messing up. I appreciate your help!
In conjugatePresentAr() you have changed the argument *output, possibly because you thought that copies the string.
output = verb;
so the function doesn't write anything to the string you supplied. Then when you print it, it's still an uninitialised variable.
int i = len(verb);
while (output < (verb + i -2)){
*output = *verb;
output++;
verb++;
}
will keep going forever: you're chasing (verb + i - 2) as it recedes into the distance (you increment verb inside the loop).
Try something like:
char *end = verb + strlen(verb) - 2;
while (output < end) {
...
verb++; /* this doesn't change end */
}
(and also fix the bug Weather Vane spotted which I entirely missed).
Note: in general, string processing is hard to do well in C, because the built-in facilities are so low-level. It's actually much easier to use C++ with its string and stringstream facilities.
If you're sticking to C, explicitly tracking length and allocated capacity alongside the char pointer (as the C++ string does for you) is good practice. Oh, and there's no obvious benefit to re-writing strlen.
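To illustrate tracking length and capacity alongside the pointer, here is a minimal sketch. The struct strbuf type and the strbuf_append function are hypothetical names, not a standard API; the buffer starts zero-initialized and doubles its allocation when an append would not fit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct strbuf {
    char *data;     /* NUL-terminated once allocated */
    size_t len;     /* characters in use, excluding the terminator */
    size_t cap;     /* bytes allocated */
};

/* Append src, growing the allocation as needed. Returns 0 on success. */
static int strbuf_append(struct strbuf *sb, const char *src)
{
    assert(sb != NULL && src != NULL);
    size_t add = strlen(src);
    if (sb->len + add + 1 > sb->cap) {
        size_t newcap = sb->cap ? sb->cap : 16;
        while (newcap < sb->len + add + 1)
            newcap *= 2;
        char *p = realloc(sb->data, newcap);
        if (p == NULL)
            return -1;
        sb->data = p;
        sb->cap = newcap;
    }
    memcpy(sb->data + sb->len, src, add + 1);   /* copies the '\0' too */
    sb->len += add;
    return 0;
}
```

Start from struct strbuf sb = {0}; and release sb.data with free when done. Since the length is stored, appending never has to rescan the existing contents.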
You can't copy strings (char *) by assignment, like you did here:
output = verb;
What you do here is just change output to point at the input string, so any changes made to one of the strings will also apply to the other one - since they both point to the same memory.
You need to explicitly call a function that copies the memory, such as strcpy (make sure to supply a null-terminated string) or memcpy.
And, regarding your logic, since you don't really check the string for 'ar' in the end, and just assume there is, why not use something a little simpler like this:
void conjugatePresentAr(char *verb, char *output)
{
strcpy(output,verb);
int len = strlen(verb);
output[len - 2] = 'o';
output[len - 1] = '\0';
}
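If you do want to check for the "ar" ending, a variant could look like the sketch below. The outsize parameter and the int return value are additions that are not in the original function; they are there so the caller can detect bad input or an output buffer that is too small.

```c
#include <assert.h>
#include <string.h>

/* Writes the first-person present form of an -ar verb into output.
 * Returns 0 on success, -1 if verb does not end in "ar" (with a
 * non-empty stem) or if output is too small. */
static int conjugate_present_ar(const char *verb, char *output, size_t outsize)
{
    assert(verb != NULL && output != NULL);
    size_t len = strlen(verb);
    if (len < 3 || strcmp(verb + len - 2, "ar") != 0)
        return -1;
    if (len > outsize)          /* result is len - 1 chars plus the '\0' */
        return -1;
    memcpy(output, verb, len - 2);      /* copy the stem */
    output[len - 2] = 'o';
    output[len - 1] = '\0';
    return 0;
}
```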
In the function conjugatePresentAr() you have altered the argument output:
output = verb;
This assigns an address, not a value.
You should reread how pointers are defined.

typedef memory allocation

VARIABLES AREN'T SET IN STONE YET! Excuse the lack of indentation; I am new to this site. Anyway, I have a text document with a list of games in five different categories, and I need some help with memory allocation via typedef. How would one do it? So far, this is what I have:
/*
Example of text document
2012 DotA PC 0.00 10
2011 Gran Turismo 5 PS3 60.00 12
list continues in similar fashion...
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//function prototype here
char **readFile(char *file);
char *allocateString(char temp[]);
typedef struct
{
int year;
char name[100];
char system[10];
float price;
int players;
}game;
int main(void)
{
char **list;
system("pause");
return 0;
}
//function defined here
char **readFile(char *file) //reads file and and allocates
{
FILE* fpIn;
int i, total=0;
fpIn = fopen("list.txt", "r");
if (!fpIn)
{
printf("File does not exist");
exit(101);
}
/*
allocate memory by row here VIA for loop with the total++ to keep track of the
number of games
*/
/*
allocate memory individually for each item VIA "allocateString by using going
to set list[i] = allocateStrng(tmpList) using for loop the for loop will have
for (i=0; i<total; i++)
*/
return;
}
//allocateString here
char *allocateString(char temp[]);
{
char *s;
s = (char*)calloc(strlen(temp+1), sizeof(char)));
strcpy(s, temp);
return s;
}
Usually you'd allocate a decent amount of memory up front, detect situations where that amount is not enough, and enlarge the allocation in those cases using realloc (or malloc followed by memcpy and free). This advice holds for both the buffer into which you read the current line (to be passed as temp to allocateString) and the array to hold the sequence of all lines.
You can detect an insufficient buffer size for the line buffer when after calling fgets(buf, bufsize, fpIn) the strlen(buf) == bufsize - 1 but still buf[bufsize - 2] != '\n'. In other words, when reading filled the whole buffer, but still didn't reach a newline. In that case, the next read will continue the current line. You might want an inner loop to extend the buffer and read again for as long as it takes.
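The inner loop described above could be sketched as follows. The function name read_full_line and the starting capacity of 16 are arbitrary choices for illustration. Instead of testing buf[bufsize - 2] directly, it checks whether the last character read is a newline, which amounts to the same condition when the buffer is full.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one full line of any length. Returns a malloc'd string without the
 * newline, or NULL at end of file or on allocation failure. */
static char *read_full_line(FILE *fp)
{
    assert(fp != NULL);
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {  /* reached the end of the line */
            buf[len - 1] = '\0';
            return buf;
        }
        if (len + 1 < cap)      /* buffer not full: input ended without a newline */
            return buf;
        char *bigger = realloc(buf, cap * 2);   /* full but no newline yet: grow */
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
    if (len == 0) {             /* nothing was read: end of file */
        free(buf);
        return NULL;
    }
    return buf;
}
```

The same doubling strategy works for the array that holds all the lines.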
Note that your allocateString pretty much duplicates strdup, so you might want to use that instead.
The links in the above text mainly come from the manual of the GNU C library. cppreference.com is another good source of C function documentation. As are the Linux man pages.
s = (char*)calloc(strlen(temp+1), sizeof(char)));
//the name of the array is a pointer, so temp+1 is pointer arithmetic:
//strlen(temp+1) measures the string starting at its second character.
//I think you want calloc(strlen(temp)+1, sizeof(char)):
//the length of the string temp points to, plus one for the terminating '\0'.
//There is also one closing parenthesis too many on that line.
//strcpy(s, temp);

Simple C string manipulation

I'm trying to do some very basic string processing in C (e.g. given a filename, chop off the file extension, manipulate the filename and then add the extension back on). I'm rather rusty on C and am getting segmentation faults.
char* fname;
char* fname_base;
char* outdir;
char* new_fname;
.....
fname = argv[1];
outdir = argv[2];
fname_len = strlen(fname);
strncpy(fname_base, fname, (fname_len-4)); // weird characters at the end of the truncation?
strcpy(new_fname, outdir); // getting a segmentation on this I think
strcat(new_fname, "/");
strcat(new_fname, fname_base);
strcat(new_fname, "_test");
strcat(new_fname, ".jpg");
printf("string=%s",new_fname);
Any suggestions or pointers welcome.
Many thanks and apologies for such a basic question
You need to allocate memory for new_fname and fname_base. Here is how you would do it for new_fname; the buffer has to hold everything you later append to it, not just outdir:
new_fname = malloc(strlen(outdir) + 1 + strlen(fname) + strlen("_test.jpg") + 1);
The first +1 is for the "/" and the final +1 is for the terminating null character '\0'.
In addition to what others are indicating, I would be careful with
strncpy(fname_base, fname, (fname_len-4));
You are assuming you want to chop off the last 4 characters (.???). If there is no file extension or it is not 3 characters, this will not do what you want. The following should give you an idea of what might be needed (I assume that the last '.' indicates the file extension). Note that my 'C' is very rusty (warning!)
char *s;
s = (char *) strrchr (fname, '.');
if (s == 0)
{
strcpy (fname_base, fname);
}
else
{
strncpy (fname_base, fname, strlen(fname)-strlen(s));
fname_base[strlen(fname)-strlen(s)] = 0;
}
You have to malloc fname_base and new_fname, I believe.
ie:
fname_base = (char *)(malloc(sizeof(char)*(fname_len+1)));
fname_base[fname_len] = 0; //to stick in the null termination
and similarly for new_fname and outdir
You're using uninitialized pointers as targets for strcpy-like functions: fname_base and new_fname: you need to allocate memory areas to work on, or declare them as char array e.g.
char fname_base[FILENAME_MAX];
char new_fname[FILENAME_MAX];
You could combine the malloc that has been suggested with the string manipulations in one statement:
if ( asprintf(&new_fname,"%s/%s_test.jpg",outdir,fname_base) >= 0 )
// success, else failed
then at some point, free(new_fname) to release the memory.
(note this is a GNU extension which is also available in *BSD)
Cleaner code:
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
const char *extra = "_test.jpg";
int main(int argc, char** argv)
{
    if (argc < 3)
        return 1;
    char *fname = strdup(argv[1]); /* duplicate, we need to truncate at the dot */
    char *outdir = argv[2];
    char *dotpos;
    /* ... */
    size_t new_size = strlen(fname)+strlen(extra)+1; /* +1 for the '\0' */
    char *new_fname;
    dotpos = strrchr(fname, '.'); /* the last dot starts the extension */
    if(dotpos)
        *dotpos = '\0'; /* truncate at the dot */
    new_fname = malloc(new_size);
    snprintf(new_fname, new_size, "%s%s", fname, extra);
    printf("%s\n", new_fname);
    free(new_fname);
    free(fname);
    return 0;
}
In the following code I do not call malloc.
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
/* Change this to '\\' if you are doing this on MS-windows or something like it. */
#define DIR_SYM '/'
#define EXT_SYM '.'
#define NEW_EXT "jpg"
int main(int argc, char * argv[] ) {
char * fname;
char * outdir;
if (argc < 3) {
fprintf(stderr, "I want more command line arguments\n");
return 1;
}
fname = argv[1];
outdir = argv[2];
char * fname_base_begin = strrchr(fname, DIR_SYM); /* last occurrence of DIR_SYM */
if (!fname_base_begin) {
fname_base_begin = fname; // No directory symbol means that there's nothing
// to chop off of the front.
}
char * fname_base_end = strrchr(fname_base_begin, EXT_SYM);
/* NOTE: No need to search for EXT_SYM in part of the fname that we have cut off
* the front and then have to deal with finding the last EXT_SYM before the last
* DIR_SYM */
if (!fname_base_end) {
fprintf(stderr, "I don't know what you want to do when there is no extension\n");
return 1;
}
*fname_base_end = '\0'; /* Makes this an end of string instead of EXT_SYM */
/* NOTE: In this code I actually changed the string passed in with the previous
* line. This is often not what you want to do, but in this case it should be ok.
*/
// This line should get you the results I think you were trying for in your example
printf("string=%s%c%s_test%c%s\n", outdir, DIR_SYM, fname_base_begin, EXT_SYM, NEW_EXT);
// This line should just append _test before the extension, but leave the extension
// as it was before.
printf("string=%s%c%s_test%c%s\n", outdir, DIR_SYM, fname_base_begin, EXT_SYM, fname_base_end+1);
return 0;
}
I was able to get away with not allocating memory to build the string in because I let printf actually worry about building it, and took advantage of knowing that the original fname string would not be needed in the future.
I could have allocated the space for the string by calculating how long it would need to be based on the parts and then used sprintf to form the string for me.
Also, if you don't want to alter the contents of the fname string you could also have used:
printf("string=%s%c%.*s_test%c%s\n", outdir, DIR_SYM, (int)(fname_base_end - fname_base_begin), fname_base_begin, EXT_SYM, fname_base_end+1);
To make printf only use part of the string.
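The alternative mentioned above, computing the required length from the parts and then letting a printf-family function build the string, could be sketched like this. The function name make_output_name is made up for illustration, and snprintf is used in place of sprintf so the write is bounded by the allocated size.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<outdir>/<base>_test.jpg" in freshly allocated memory.
 * Returns NULL if the allocation fails; the caller frees the result. */
static char *make_output_name(const char *outdir, const char *base)
{
    assert(outdir != NULL && base != NULL);
    const char *suffix = "_test.jpg";
    /* outdir + '/' + base + suffix + terminating '\0' */
    size_t size = strlen(outdir) + 1 + strlen(base) + strlen(suffix) + 1;
    char *name = malloc(size);
    if (name == NULL)
        return NULL;
    snprintf(name, size, "%s/%s%s", outdir, base, suffix);
    return name;
}
```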
The basis of any C string manipulation is that you must write into (and read from, unless...) memory you "own". Declaring something as a pointer (type *x) reserves space for the pointer, not for the pointee, whose size of course can't be known by magic, so you have to malloc (or similar) or provide a local buffer with something like char buf[size].
And you should always be aware of buffer overflow.
As suggested, using sprintf (with a correctly allocated destination buffer) or similar could be a good idea. Anyway, if you want to keep your current strcat approach, remember that to concatenate strings, strcat always has to "walk" through the current string from its beginning. So if you don't need buffer overflow checks of any kind (oops!), appending chars "by hand" is a bit faster: when you have finished appending a string, you know where the new end is, and the next append can start from there.
But strcat doesn't tell you the address of the last char appended, and using strlen would nullify the effort. So a possible solution could be
size_t l = strlen(new_fname);
new_fname[l++] = '/';
for(i = 0; fname_base[i] != 0; i++, l++) new_fname[l] = fname_base[i];
for(i = 0; testjpgstring[i] != 0; i++, l++) new_fname[l] = testjpgstring[i];
new_fname[l] = 0; // terminate the string...
and you can continue using l... (testjpgstring = "_test.jpg")
However, if your program is full of string manipulations, I suggest using a library for strings (out of laziness I often use glib).
