fread'ing and fwrite'ing a structure that contains pointers [duplicate] - c

gcc (GCC) 4.7.0
c89
Hello,
I have the following structure that I am trying to fwrite and fread.
However, because my device and resource members are pointers, fwrite will write the pointer values and not the data they point to. I cannot use an array for device or resource; they must be pointers, as they have to be dynamically allocated.
I allocate all memory for the structure elements before I write. Not shown here as I want to keep the snippet short. Nor is free'ing.
In my fread function, I allocate the memory for the device and resource so that the fread will read into these memory locations. However, this will not work.
What is the best way to do this?
Many thanks for any advice,
struct data {
int id;
int set;
char *device;
char *resource;
};
struct database {
struct data **db_data;
size_t database_rows;
size_t database_data_size;
};
int database_write(FILE *fp, const struct database *db)
{
rewind(fp);
if(fwrite(*db->db_data, sizeof(struct data), 1, fp) == -1) {
return DATABASE_ERROR;
}
return 0;
}
struct database* database_read(FILE *fp, size_t db_rows, size_t db_data_size)
{
struct database *db = NULL;
size_t i = 0;
db = malloc(sizeof(struct database));
db->database_rows = db_rows;
db->database_data_size = db_data_size;
db->db_data = malloc(sizeof(struct data) * db_rows);
for(i = 0; i < db_rows; i++) {
db->db_data[i] = malloc(sizeof(struct data));
db->db_data[i]->device = malloc(db_data_size);
db->db_data[i]->resource = malloc(db_data_size);
}
rewind(fp);
if(fread(*db->db_data, sizeof(struct data), 1, fp) == -1) {
return NULL;
}
return db;
}

You seem to have answered your own question: fread and fwrite just look at what's in memory and put that in the file. This works great if you're writing things that don't have pointers (e.g. big arrays of numbers). It's not designed to write structs with pointers.
If this file has a format, you need to do what the format says. If you're making up a format as you go, then you should write each member one by one into the file. You will need some sort of buffer to read into (you may need to resize this if you don't have a maximum length specification). Also, your database_write function will need to be changed quite a bit as well.

If device and resource can have variable length you should write down the size of device and then the data. Do the same for resource.
When you read them back you can read the size, then allocate memory and finally read the value.
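The size-then-data approach could look roughly like the sketch below. It reuses struct data from the question; the helper names (write_string, read_string, data_write, data_read) are made up for illustration, and the length is stored as a raw size_t, so the file is only readable on a machine with the same size_t width and byte order. A real reader should also sanity-check the length before allocating.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct data {
    int id;
    int set;
    char *device;
    char *resource;
};

/* Write a string as a length prefix followed by its bytes (no terminator). */
static int write_string(FILE *fp, const char *s)
{
    size_t len = strlen(s);

    if (fwrite(&len, sizeof len, 1, fp) != 1)
        return -1;
    if (len > 0 && fwrite(s, 1, len, fp) != len)
        return -1;
    return 0;
}

/* Read a string written by write_string; the caller frees the result. */
static char *read_string(FILE *fp)
{
    size_t len;
    char *s;

    if (fread(&len, sizeof len, 1, fp) != 1)
        return NULL;
    s = malloc(len + 1);
    if (s == NULL)
        return NULL;
    if (len > 0 && fread(s, 1, len, fp) != len) {
        free(s);
        return NULL;
    }
    s[len] = '\0';
    return s;
}

/* Write the members one by one instead of dumping the whole struct. */
int data_write(FILE *fp, const struct data *d)
{
    if (fwrite(&d->id, sizeof d->id, 1, fp) != 1 ||
        fwrite(&d->set, sizeof d->set, 1, fp) != 1)
        return -1;
    if (write_string(fp, d->device) != 0 || write_string(fp, d->resource) != 0)
        return -1;
    return 0;
}

/* Read one record; on success d->device and d->resource are malloc'd. */
int data_read(FILE *fp, struct data *d)
{
    if (fread(&d->id, sizeof d->id, 1, fp) != 1 ||
        fread(&d->set, sizeof d->set, 1, fp) != 1)
        return -1;
    d->device = read_string(fp);
    if (d->device == NULL)
        return -1;
    d->resource = read_string(fp);
    if (d->resource == NULL) {
        free(d->device);
        d->device = NULL;
        return -1;
    }
    return 0;
}
```

With this layout the reader no longer needs to know db_data_size up front, because every string carries its own length.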

You have described your problem yourself: fwrite will write the address and not the value.
Maybe you can add fields for the lengths of device and resource to your struct data.
Create a wrapper for fread() and fwrite() that reads/writes this length.
In this wrapper you can memcpy device and resource into a temporary buffer and call fwrite() on it.
This is a simple and very basic solution.
When sending packets over a network, you will often see structures containing char pointers handled this way: the first 4 or 8 bytes store the length of the data and the remaining bytes contain the actual data.
The receiver first reads the leading 4 or 8 bytes and, depending on that value, issues a read() call to fetch the remaining data.
You may refer to:
Is the "struct hack" technically undefined behavior?

Related

C unknown number of structures in file

This is similar to this question, but what if MAX_BOOKS were unknown as well?
I want to get number of structures from a file.
My structure:
typedef struct material {
int mat_cislo;
char oznaceni[MAX_TEXT];
char mat_dodavatel[MAX_TEXT];
char dodavatel[MAX_TEXT];
float cena;
int mat_kusovnik;
} MATERIAL;
My code:
void nacist_material() {
FILE* pSoubor;
MATERIAL materialy_pocitadlo;
int i;
int b;
if((pSoubor = fopen(SOUBOR_MATERIAL, "rb")) == NULL ) {
printf("\nError reading file");
return;
}
pocet_zaznamu_materialu = 3;
printf("\n\n===>%d", pocet_zaznamu_materialu);
if(pocet_zaznamu_materialu > 0) {
printf("\nThere are %d materials", pocet_zaznamu_materialu);
free(pMaterialy);
pMaterialy = (MATERIAL *) malloc(pocet_zaznamu_materialu * sizeof(MATERIAL));
for(i = 0; i < pocet_zaznamu_materialu; i++) {
b = fread(&pMaterialy[i], sizeof(MATERIAL), 1, pSoubor);
}
printf("\n otrava %d", b);
}
else {
printf("\nNo previous material record exists");
}
fclose(pSoubor);
return;
}
Right now pocet_zaznamu_materialu is hard-coded to 3, because there are 3 structures in the file, and it all works correctly. But what if the number of structures in the file changes?
Problem: I need to know the number of structures in the file. How do I do that?
Thanks, and sorry for my English.
If the file is composed of nothing but a list of your desired struct stored contiguously, then the file's size, in bytes, will be a multiple of the size of your struct, and you can obtain the file size and then the number of structs in the file like so:
size_t len_file, num_structs;
fseek(fp, 0, SEEK_END);
len_file = ftell(fp);
rewind(fp);
num_structs = len_file/sizeof(MYSTRUCT);
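Here is the same idea wrapped in a function with the error checks filled in, as a rough sketch. The name count_records is made up; it assumes the stream was opened in binary mode on a regular file, where seeking to the end works on the common platforms.

```c
#include <stdio.h>

/* Returns the number of whole records in fp, or -1 on error or if the
   file size is not an exact multiple of record_size. Leaves the stream
   positioned at the start of the file. */
long count_records(FILE *fp, size_t record_size)
{
    long len;

    if (record_size == 0 || fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    len = ftell(fp);            /* ftell returns -1L on failure */
    if (len < 0)
        return -1;
    rewind(fp);
    if ((unsigned long)len % record_size != 0)
        return -1;              /* trailing partial record */
    return (long)((unsigned long)len / record_size);
}
```

For the question's code that would be something like pocet_zaznamu_materialu = count_records(pSoubor, sizeof(MATERIAL)), followed by a check for -1.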
Relying on the file size can be a real problem when you read from a dynamic file (another program appends to the file while you read it), a pipe, or a network socket. In that case, you really have no way to know the number of structs in advance.
A common idiom there is to use a dynamically allocated array of structs of some arbitrary initial size and grow it with realloc each time the currently allocated array is full. You could, for example, make the new size twice the previous one.
That is how C++ vectors typically manage their underlying array under the hood.
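A minimal sketch of that grow-by-doubling idiom, reading fixed-size records until fread stops returning a full one. The function name read_all_records and the starting capacity of 4 are arbitrary choices, and record_size is assumed to be nonzero.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads records of record_size bytes until no full record is left.
   Returns a malloc'd array and stores the record count in *out_count.
   Returns NULL with *out_count == 0 if nothing was read or if memory
   ran out. */
void *read_all_records(FILE *fp, size_t record_size, size_t *out_count)
{
    size_t cap = 0, count = 0;
    char *items = NULL;

    *out_count = 0;
    for (;;) {
        if (count == cap) {
            /* array is full: double it (or start with 4 slots) */
            size_t new_cap = cap ? cap * 2 : 4;
            char *tmp = realloc(items, new_cap * record_size);
            if (tmp == NULL) {
                free(items);
                return NULL;
            }
            items = tmp;
            cap = new_cap;
        }
        if (fread(items + count * record_size, record_size, 1, fp) != 1)
            break;              /* EOF, short record, or read error */
        count++;
    }
    if (count == 0) {
        free(items);
        return NULL;
    }
    *out_count = count;
    return items;
}
```

Doubling keeps the number of realloc calls logarithmic in the number of records, at the cost of up to half the array being unused at the end.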
Have you considered adding a header to the file?
That is, place a special structure at the start of the file that tells you some information about the file. Something like ...
struct file_header {
char id[32]; /* Let this contain a special identifying string */
uint32_t version; /* version number in case the file structure changes */
uint32_t num_material; /* number of material structures in file */
};
Not only does this give you a relatively quick way to determine how many material structures you have in your file, it is also extensible. Perhaps you will want to store other structures in this file, and you want to know how many of each are in there--just add a new field and update the version.
If you want, you can even throw in some error checking.
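As a rough sketch, writing and validating such a header might look like this. The magic string, the version value, and the function names header_write and header_read are placeholders, and the header is written as a raw struct, so it is tied to one machine's layout and byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define FILE_MAGIC   "MATERIAL-DB"  /* hypothetical identifying string */
#define FILE_VERSION 1u             /* hypothetical format version */

struct file_header {
    char id[32];
    uint32_t version;
    uint32_t num_material;
};

/* Writes the header at the start of the file; returns 0 on success. */
int header_write(FILE *fp, uint32_t num_material)
{
    struct file_header h;

    memset(&h, 0, sizeof h);    /* zero the padding and the unused id bytes */
    strncpy(h.id, FILE_MAGIC, sizeof h.id - 1);
    h.version = FILE_VERSION;
    h.num_material = num_material;
    rewind(fp);
    return fwrite(&h, sizeof h, 1, fp) == 1 ? 0 : -1;
}

/* Reads and checks the header; stores the record count on success. */
int header_read(FILE *fp, uint32_t *num_material)
{
    struct file_header h;

    rewind(fp);
    if (fread(&h, sizeof h, 1, fp) != 1)
        return -1;
    /* sizeof FILE_MAGIC includes the '\0', so the terminator is compared too */
    if (memcmp(h.id, FILE_MAGIC, sizeof FILE_MAGIC) != 0)
        return -1;
    if (h.version != FILE_VERSION)
        return -1;
    *num_material = h.num_material;
    return 0;
}
```

The writer has to call header_write again whenever the number of records changes, otherwise the count in the header goes stale.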

C - varying size text

For my assignment I have to write a program that will consist of agents and a central server daemon. It will be a distributed shell: every command issued from the server will also be performed on every agent (the output will be sent back from every agent to the central server).
I will have to deal with the output of commands (like ls -la /home/user/dir1), and on each agent the output may vary in size. The output of "find /" will also be BIG, but I have to take into account that something like that can happen. What is the preferred way of handling varying-size outputs in C and operating on them (saving to a variable, sending over a socket)?
The way to deal with data of arbitrary size is to use dynamic allocation, i.e. the functions malloc(), realloc() and free(). You allocate and possibly grow the memory needed to store the command output.
Reading command output (assuming a Unix-like OS) is best done with popen().
Read the manuals of each of the mentioned functions for details.
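Putting popen() and realloc() together could look something like the sketch below. It assumes a POSIX system; the function name run_command and the 4096-byte starting size are arbitrary.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen() and pclose() */
#include <stdio.h>
#include <stdlib.h>

/* Runs cmd through the shell and returns its standard output as a
   NUL-terminated, malloc'd string (length in *out_len), or NULL on
   failure. The caller frees the result. */
char *run_command(const char *cmd, size_t *out_len)
{
    size_t cap = 4096, len = 0, n;
    char *buf, *tmp;
    FILE *pipe_fp = popen(cmd, "r");

    if (pipe_fp == NULL)
        return NULL;
    buf = malloc(cap);
    if (buf == NULL) {
        pclose(pipe_fp);
        return NULL;
    }
    /* Always keep one byte free for the terminator. */
    while ((n = fread(buf + len, 1, cap - len - 1, pipe_fp)) > 0) {
        len += n;
        if (cap - len - 1 == 0) {       /* buffer full: double it */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                pclose(pipe_fp);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    pclose(pipe_fp);
    buf[len] = '\0';
    *out_len = len;
    return buf;
}
```

For something like "find /" the whole output ends up in memory at once; if that is too much, the agent could forward each chunk to the server as it is read instead of collecting it.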
Dynamic Memory Allocation
To hold your "variable length" strings, you should use dynamic memory allocation: the malloc family of functions.
#include <stdlib.h>
void *malloc(size_t size);
void free(void *ptr);
void *calloc(size_t nmemb, size_t size);
void *realloc(void *ptr, size_t size);
So, suppose you have your data stored in a variable char *ag_str. I suggest you malloc and then realloc the size of the buffer in blocks. Calling malloc and then realloc a thousand times to readjust the block size after each character is very costly.
So, you might do something like this:
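#include <stdlib.h>
#include <string.h>
#include <unistd.h> /* read(), ssize_t */
#define BLOCK_SIZE 4096
struct mem_block {
    size_t current_block_size; /* bytes allocated for ag_str */
    size_t current_str_size;   /* bytes used, not counting the '\0' */
    char *ag_str;
};
struct mem_block *new_chunk(void)
{
    struct mem_block *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->ag_str = malloc(BLOCK_SIZE);
    if (p->ag_str == NULL) {
        free(p);
        return NULL;
    }
    p->ag_str[0] = '\0'; /* start with an empty string */
    p->current_block_size = BLOCK_SIZE;
    p->current_str_size = 0;
    return p;
}
int realloc_chunk(struct mem_block *chunk)
{
    size_t ns = chunk->current_block_size + BLOCK_SIZE;
    char *tmp = realloc(chunk->ag_str, ns);
    if (tmp == NULL)
        return -1; /* the old buffer is still valid */
    chunk->ag_str = tmp;
    chunk->current_block_size = ns;
    return 0;
}
int cat_ag_str(struct mem_block *chunk, const char *ag_str, size_t ag_len)
{
    /* grow until the new data plus the terminating '\0' fit */
    while (chunk->current_str_size + ag_len + 1 > chunk->current_block_size)
        if (realloc_chunk(chunk) != 0)
            return -1;
    /* memcpy rather than strncat: the read buffer is not NUL-terminated */
    memcpy(chunk->ag_str + chunk->current_str_size, ag_str, ag_len);
    chunk->current_str_size += ag_len;
    chunk->ag_str[chunk->current_str_size] = '\0';
    return 0;
}
struct mem_block *receive_from_agent(int fd)
{
    struct mem_block *chunk = new_chunk();
    ssize_t c; /* return value of read() or recv() */
    char buff[BLOCK_SIZE];
    if (chunk == NULL)
        return NULL;
    while ((c = read(fd, buff, sizeof buff)) > 0) { /* or recv() on a socket */
        if (cat_ag_str(chunk, buff, (size_t)c) != 0)
            break; /* out of memory: return what was collected so far */
    }
    /* c == 0 means end of stream; c < 0 means a read error (check errno) */
    return chunk;
}
Note that this code is only a sketch of the idea and has not been tested against a real agent.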
struct mem_block: This will keep information about your current memory block.
new_chunk: function to create a new chunk handler for you.
realloc_chunk: anytime the amount of characters that must be written exceeds the amount of characters available in the chunk, we get one more block.
cat_ag_str: this will append what you just read to the memory block you have, effectively transforming chunks of data into one coherent big buffer.
receive_from_agent: this is the entry point of your receiving loop. You may use read or recv, I don't know which you use, but both return the amount of bytes read, which you'll use to pass to cat_ag_str.
It's important to note that you're reading in the same sized blocks as you realloc. (You can read in smaller chunks too, but never bigger).
You can do roughly the same for sending, but you don't need all that workaround for memory. You can just use a fixed sized buffer and copy data from your big string to it in fixed sizes, then you send the fixed-sized buffer.
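For the sending side, a sketch along those lines is below. It assumes a POSIX file descriptor; send_all and SEND_CHUNK are made-up names, and it writes straight from the big string in fixed-size slices, so the intermediate copy is not strictly needed. Handling of EINTR is left out.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define SEND_CHUNK 4096

/* Sends len bytes from data over fd in slices of at most SEND_CHUNK
   bytes. Returns 0 once everything was written, -1 on a write error. */
int send_all(int fd, const char *data, size_t len)
{
    size_t sent = 0;

    while (sent < len) {
        size_t want = len - sent;
        ssize_t n;

        if (want > SEND_CHUNK)
            want = SEND_CHUNK;
        n = write(fd, data + sent, want);   /* or send() on a socket */
        if (n < 0)
            return -1;
        sent += (size_t)n;                  /* write may be partial */
    }
    return 0;
}
```

Since the receiver cannot tell where one command's output ends, you would probably send the total length first, or close the connection after each reply.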

Easiest way to allocate "blank" block of data to .dat file

Looking for a quick way to allocate a block of data to be managed from disk. I'm allocating a block of 50 structs, and while most of the memory allocates fine, when I read it all back I get junk messages returned in some of the fields that should be blank. I assume this is me allocating the space incorrectly somehow that allows some junk from memory to leak in there.
if ((fpBin = fopen(BINARYFILE, "w+b")) == NULL)
{
printf("Could not open binary file %s.\n", BINARYFILE);
return;
}
fwrite(fpBin, sizeof(struct student), 50, fpBin); //Write entire hash table to disk
struct definition
typedef struct student
{
char firstName[20]; //name
char lastName[20];
double amount; //amount owed
char stuID[5]; //4 digit code
}student;
This is how I was taught, yet I'm still getting some junk in my data instead of it being a clean slate. So, the question: how do I set all fields to blank?
Answer:
student tempStu[50] = {0};
fwrite(tempStu, sizeof(struct student), BUCKETSIZE, fpBin); //Write entire hash table to disk
fwrite(fpBin, sizeof(struct student), 50, fpBin);
You're writing your file pointer, not your student structs, to disk. That first fpBin should instead be a pointer to your data. That data can be an array of 50 student structs initialized to 0, perhaps with calloc or by defining it at file scope, but it has to be somewhere. Instead, you are writing 50*sizeof(struct student) bytes from your fpBin pointer, which is undefined behavior -- you'll either crash with an access violation or you'll write junk to disk. That junk is what you're getting when you read it back.
Also, using a constant like 50 is bad practice ... it should be a variable (or manifest constant) that holds the number of students that you're writing out.
BTW, on Linux and other POSIX systems, you could allocate a block of zeroes on disk just by writing the last byte (or in some other way making the file that large).
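A small sketch of the calloc route, with the count passed in instead of a literal 50. The function name write_blank_students is made up, count is assumed to be greater than zero, and it relies on all-zero bytes reading back as 0.0 for the double, which holds on the usual IEEE 754 platforms.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct student {
    char firstName[20];
    char lastName[20];
    double amount;
    char stuID[5];
} student;

/* Writes count all-zero student records to fp; returns 0 on success. */
int write_blank_students(FILE *fp, size_t count)
{
    /* calloc zero-fills the whole block, padding bytes included */
    student *blank = calloc(count, sizeof *blank);
    size_t written;

    if (blank == NULL)
        return -1;
    written = fwrite(blank, sizeof *blank, count, fp);
    free(blank);
    return written == count ? 0 : -1;
}
```

Zeroing the padding matters here: a stack array filled in field by field can still carry junk in the gaps between members, and fwrite copies those gaps to disk too.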

copy_to_user a struct that contains an array (pointer)

Disclosure: I'm fairly new to C. If you could explain any answers verbosely, I would appreciate it.
I am writing a linux kernel module, and in one of the functions I am writing I need to copy a structure to userspace that looks like this:
typedef struct
{
uint32_t someProperty;
uint32_t numOfFruits;
uint32_t *arrayOfFruits;
} ObjectCapabilities;
The API I'm implementing has documentation that describes the arrayOfFruits member as "an array of size numOfFruits where each element is a FRUIT_TYPE constant." I am confused how to do this, given that the arrayOfFruits is a pointer. When I copy_to_user the ObjectCapabilities structure, it will only copy the pointer arrayOfFruits to userspace.
How can userspace continuously access the elements of the array? Here is my attempt:
ObjectCapabilities caps;
caps.someProperty = 1024;
caps.numOfFruits = 3;
uint32_t localArray[] = {
FRUIT_TYPE_APPLE,
FRUIT_TYPE_ORANGE,
FRUIT_TYPE_BANANA
};
caps.arrayOfFruits = localArray;
And then for the copy... can I just do this?
copy_to_user((void *)destination, &caps, (sizeof(caps) + (sizeof(localArray) / sizeof((localArray)[0]))));
The user needs to provide enough space for all the data being copied out. Ideally he'll tell you how much space he provided, and you check that everything fits.
The copied-out data should (in general) not include any pointers, since they're "local" to a different "process" (the kernel can be viewed as a separate process, as it were, and kernel / user interactions involve process-to-process IPC, similar to sending stuff over local or even Internet-connected sockets).
Since the kernel has pretty intimate knowledge of a process, you can skirt these rules somewhat, e.g., you could compute what the user's pointer will be, and copy out a copy of the original data, with the pointer modified appropriately. But that's kind of wasteful. Or, you can copy a kernel pointer and just not use it in the user code, but now you're "leaking data" that "bad guys" can sometimes leverage in various ways. In security-people-speak you've left a wide-open "covert channel".
In the end, then, the "right" way to do this tends to be something like this:
struct user_interface_version_of_struct {
int property;
int count;
int data[]; /* of size "count" */
};
The user code mallocs (or otherwise arranges to have sufficient space) the "user interface version" and makes some system call to the kernel (read, recv, recvmsg, ioctl, whatever, as long as it involves doing a "read"-type operation) and tells the kernel: "here's the memory holding the struct, and here's how big it is" (in bytes, or the maximum count value, or whatever: user and kernel simply need to agree on the protocol). The kernel-side code then verifies the user's values in some appropriate manner, and either does the copy-out however is most convenient, or returns an error.
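On the userspace side, sizing and allocating such a buffer might look like this sketch. It is plain C99 with a flexible array member; the struct and function names are hypothetical, and fixed-width types are used so that both sides agree on the layout.

```c
#include <stdint.h>
#include <stdlib.h>

struct fruit_reply {
    uint32_t property;
    uint32_t count;
    uint32_t data[];    /* "count" elements follow the header */
};

/* Number of bytes needed for a reply holding up to max_count elements. */
size_t fruit_reply_size(uint32_t max_count)
{
    return sizeof(struct fruit_reply) + (size_t)max_count * sizeof(uint32_t);
}

/* Allocates a zeroed reply buffer; the caller frees it. */
struct fruit_reply *fruit_reply_alloc(uint32_t max_count)
{
    return calloc(1, fruit_reply_size(max_count));
}
```

The program would then hand the pointer and fruit_reply_size(max_count) to whatever call the driver exposes, and read count and data[] from the same block afterwards.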
"Most convenient" is sometimes two separate copy ops, or some put_user calls, e.g., if the kernel side has the data structure you showed, you might do:
/* let's say ulen is the user supplied length in bytes,
and uaddr is the user-supplied address */
struct user_interface_version_of_struct __user *p;
size_t needed;
int error;
needed = sizeof(*p) + 3 * sizeof(int);
if (needed > ulen)
return -ENOMEM; /* user did not supply enough space */
p = uaddr;
error = put_user(1024, &p->property);
if (error == 0)
error = put_user(3, &p->count);
/* copy_to_user returns the number of bytes it could not copy */
if (error == 0 && copy_to_user(p->data, localArray, 3 * sizeof(int)))
error = -EFAULT;
You may have a situation where you must conform to some not-very-nice interface, though.
Edit: if you're adding your own system call (rather than tying in to read or ioctl for instance), you can separate the header and data, as in Adam Rosenfield's answer.
You can't copy raw pointers, since a pointer into kernel space is meaningless to userspace (and will segfault if dereferenced).
The typical way of doing something like this is to ask the userspace code to allocate the memory and pass a pointer to that memory into a system call. If the program doesn't pass in a large enough buffer, then fail with an error (e.g. EFAULT). If there's no way for the program to know in advance how much memory it will need, then typically you'd return the amount of data needed when passed a NULL pointer.
Example usage from userspace:
// Fixed-size data
typedef struct
{
uint32_t someProperty;
uint32_t numOfFruits;
} ObjectCapabilities;
// First query the number of fruits we need
ObjectCapabilities caps;
int r = sys_get_fruit(&caps, NULL, 0);
if (r != 0) { /* Handle error */ }
// Now allocate memory and query the fruit
uint32_t *arrayOfFruits = malloc(caps.numOfFruits * sizeof(uint32_t));
r = sys_get_fruit(&caps, arrayOfFruits, caps.numOfFruits);
if (r != 0) { /* Handle error */ }
And here's how the corresponding code would look in kernel space on the other side of the system call:
int sys_get_fruit(ObjectCapabilities __user *userCaps, uint32_t __user *userFruit, uint32_t numFruits)
{
ObjectCapabilities caps;
uint32_t localArray[] = {
FRUIT_TYPE_APPLE,
FRUIT_TYPE_ORANGE,
FRUIT_TYPE_BANANA
};
caps.someProperty = 1024;
caps.numOfFruits = 3;
// Copy out fixed-size data. copy_to_user returns the number of bytes
// it could NOT copy, so any nonzero result is reported as -EFAULT.
if (copy_to_user(userCaps, &caps, sizeof(caps)))
return -EFAULT;
// A NULL array pointer means the caller only asked for the count.
if (userFruit == NULL)
return 0;
// Attempt to copy variable-sized data. Check the size first.
if (numFruits * sizeof(uint32_t) < sizeof(localArray))
return -EFAULT;
if (copy_to_user(userFruit, localArray, sizeof(localArray)))
return -EFAULT;
return 0;
}
With copy_to_user you would do two separate copies to user space.
//copy the struct
copy_to_user((void *)destination, &caps, sizeof(caps));
//copy the array.
copy_to_user((void *)destination->array, localArray, sizeof(localArray));

Reading file and populating struct

I have a structure with the following definition:
typedef struct myStruct{
int a;
char* c;
int f;
} OBJECT;
I am able to populate this object and write it to a file. However I am not able to read the char* c value in it...while trying to read it, it gives me a segmentation fault error. Is there anything wrong with my code:
//writensave.c
#include "mystruct.h"
#include <stdio.h>
#include <string.h>
#define p(x) printf(x)
int main()
{
p("Creating file to write...\n");
FILE* file = fopen("struct.dat", "w");
if(file == NULL)
{
printf("Error opening file\n");
return -1;
}
p("creating structure\n");
OBJECT* myObj = (OBJECT*)malloc(sizeof(OBJECT));
myObj->a = 20;
myObj->f = 45;
myObj->c = (char*)calloc(30, sizeof(char));
strcpy(myObj->c,
"This is a test");
p("Writing object to file...\n");
fwrite(myObj, sizeof(OBJECT), 1, file);
p("Close file\n");
fclose(file);
p("End of program\n");
return 0;
}
Here is how I am trying to read it:
//readnprint.c
#include "mystruct.h"
#include <stdio.h>
#define p(x) printf(x)
int main()
{
FILE* file = fopen("struct.dat", "r");
char* buffer;
buffer = (char*) malloc(sizeof(OBJECT));
if(file == NULL)
{
p("Error opening file");
return -1;
}
fread((void *)buffer, sizeof(OBJECT), 1, file);
OBJECT* obj = (OBJECT*)buffer;
printf("obj->a = %d\nobj->f = %d \nobj->c = %s",
obj->a,
obj->f,
obj->c);
fclose(file);
return 0;
}
When you write your object, you're writing the pointer value to the file instead of the pointed-to information.
What you need to do is not just fwrite/fread your whole structure, but rather do it a field at a time. fwrite the a and the f as you're doing with the object, but then you need to do something special with the string. Try fwrite/fread of the length (not represented in your data structure, that's fine) and then fwrite/fread the character buffer. On read you'll need to allocate that, of course.
Your first code sample seems to assume that the strings are going to be no larger than 30 characters. If this is the case, then the easiest fix is probably to re-define your structure like this:
typedef struct myStruct{
int a;
char c[30];
int f;
} OBJECT;
Otherwise, you're just storing a pointer to dynamically-allocated memory that will be destroyed when your program exits (so when you retrieve this pointer later, the address is worthless and most likely illegal to access).
You're saving a pointer to a char, not the string itself. When you try to reload the file you're running in a new process with a different address space and that pointer is no longer valid. You need to save the string by value instead.
I would like to add a note about a potential portability issue, which may or may not exist depending upon the planned use of the data file.
If the data file is to be shared between computers of different endian-ness, you will need to configure file-to-host and host-to-file converters for non-char types (int, short, long, long long, ...). Furthermore, it could be prudent to use the types from stdint.h (int16_t, int32_t, ...) instead to guarantee the size you want.
However, if the data file will not be moving around anywhere, then ignore these two points.
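If the file does need to travel, one way to sidestep endianness is to write each integer byte by byte in a fixed order. The sketch below picks big-endian arbitrarily; the function names write_u32_be and read_u32_be are made up.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes v as 4 bytes, most significant byte first. */
int write_u32_be(FILE *fp, uint32_t v)
{
    unsigned char b[4];

    b[0] = (unsigned char)(v >> 24);
    b[1] = (unsigned char)(v >> 16);
    b[2] = (unsigned char)(v >> 8);
    b[3] = (unsigned char)v;
    return fwrite(b, 1, 4, fp) == 4 ? 0 : -1;
}

/* Reads 4 bytes written by write_u32_be back into a host integer. */
int read_u32_be(FILE *fp, uint32_t *v)
{
    unsigned char b[4];

    if (fread(b, 1, 4, fp) != 4)
        return -1;
    *v = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
         ((uint32_t)b[2] << 8) | (uint32_t)b[3];
    return 0;
}
```

Because the shifts work on values rather than on memory layout, the same code produces the same file on little-endian and big-endian hosts.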
The char * field of your structure is known as a variable length field. When you write this field, you will need a method for determining the length of the text. Two popular methods are:
1. Writing Size First
2. Writing terminal character
Writing Size First
In this method, the size of the text data is written first, followed immediately by the data.
Advantages: Text can load quicker by block reads.
Disadvantages: Two reads required, extra space required for the length data.
Example code fragment:
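#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct My_Struct
{
    char * text_field;
};
void Write_Text_Field(const struct My_Struct * p_struct, FILE * output)
{
    size_t text_length = strlen(p_struct->text_field);
    /* C89 has no format specifier for size_t, so cast to unsigned long */
    fprintf(output, "%lu\n", (unsigned long) text_length);
    fprintf(output, "%s", p_struct->text_field);
}
void Read_Text_Field(struct My_Struct * p_struct, FILE * input)
{
    unsigned long text_length = 0;
    char * p_text = NULL;
    if (fscanf(input, "%lu", &text_length) != 1)
        return;
    /* consume exactly the one newline written after the length */
    if (fgetc(input) != '\n')
        return;
    p_text = (char *) malloc(text_length + 1);
    if (p_text)
    {
        if (fread(p_text, 1, text_length, input) != text_length)
        {
            free(p_text);
            return;
        }
        p_text[text_length] = '\0';
        p_struct->text_field = p_text; /* the caller now owns the buffer */
    }
}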
Writing terminal character
In this method the text data is written followed by a "terminal" character. Very similar to a C language string.
Advantages: Requires less space than Size First.
Disadvantages: Text must be read one byte at a time so terminal character is not missed.
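A minimal sketch of the terminal-character method, using '\0' as the terminator and a buffer that doubles when it fills up. The function names write_terminated and read_terminated and the initial capacity of 16 are arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Writes s including its terminating '\0'. */
int write_terminated(FILE *fp, const char *s)
{
    size_t n = strlen(s) + 1;

    return fwrite(s, 1, n, fp) == n ? 0 : -1;
}

/* Reads bytes up to and including the next '\0'. Returns a malloc'd
   string, or NULL on EOF before a terminator or on allocation failure. */
char *read_terminated(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap), *tmp;
    int ch;

    if (buf == NULL)
        return NULL;
    while ((ch = fgetc(fp)) != EOF) {
        if (len == cap) {               /* buffer full: double it */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
        if (ch == '\0')
            return buf;                 /* terminator has been stored */
    }
    free(buf);                          /* hit EOF without a terminator */
    return NULL;
}
```

fgetc goes through the stdio buffer, so reading one character at a time costs less than it might seem, but it is still slower than a single block read of known length.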
Fixed size field
Instead of using a char* as a member, use a char [N], where N is the maximum size of the field.
Advantages: Fixed sized records can be read as blocks.
Makes random access in files easier.
Disadvantages: Waste of space if all the field space is not used.
Problems when the field size is too small.
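To illustrate the random-access point, here is a sketch that reads and writes record number index directly with fseek. It uses the fixed-size OBJECT layout suggested earlier; the function names record_write and record_read are made up, and the file is assumed to be open in binary update mode.

```c
#include <stdio.h>

typedef struct myStruct {
    int a;
    char c[30];
    int f;
} OBJECT;

/* Overwrites (or appends) the record at position index. */
int record_write(FILE *fp, long index, const OBJECT *obj)
{
    if (fseek(fp, index * (long)sizeof *obj, SEEK_SET) != 0)
        return -1;
    return fwrite(obj, sizeof *obj, 1, fp) == 1 ? 0 : -1;
}

/* Loads the record at position index; fails if it does not exist. */
int record_read(FILE *fp, long index, OBJECT *obj)
{
    if (fseek(fp, index * (long)sizeof *obj, SEEK_SET) != 0)
        return -1;
    return fread(obj, sizeof *obj, 1, fp) == 1 ? 0 : -1;
}
```

Every record sits at offset index * sizeof(OBJECT), which is exactly what variable-length fields take away.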
When writing data structures to a file, you should consider using a database. There are small ones such as SQLite and bigger ones such as MySQL. Don't waste time writing and debugging permanent storage routines for your data when they have already been written and tested.
