Saving a C struct with a char* string into a file - c

I'm trying to save a struct with a char* string into a file.
struct d_object {
int flags;
int time;
int offset;
char *filename;
};
The problem is that when doing that I will obviously only save the address held in the pointer rather than the string itself. So what I've done is simply use a character array, but that forces me to set a maximum size for the string. This works fine; however, I was wondering if there is any way of storing the struct with a char* (that I malloc at some point) in a file and then retrieving it. I can save the string and the struct separately and then retrieve them, but it's quite a mess. It would be preferable if I could load and save the entire struct (the one above) into the file. Thanks!
The code with the char array is below:
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h> /* write, read, close */
struct d_object {
int flags;
int time;
int offset;
char filename[255];
};
int main(int argc, char **argv) {
struct d_object fcb;
fcb.flags=5;
fcb.time=100000;
fcb.offset=220;
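strncpy(fcb.filename,"myfile",sizeof(fcb.filename)-1);
fcb.filename[sizeof(fcb.filename)-1]='\0'; /* strncpy does not terminate on truncation */
int fd=open("testfile",O_RDWR|O_CREAT|O_TRUNC,0644); /* create the file if it is missing */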
write(fd,&fcb,sizeof(fcb));
close(fd);
int fd2 = open("testfile",O_RDONLY);
struct d_object new_fcb;
read(fd2,&new_fcb,sizeof(new_fcb));
printf("read from file testfile: %s\n",new_fcb.filename);
return 0;
}
P.S.: I'm not using the STREAM functions simply because this is actually meant to be run on an embedded OS that doesn't have them. I've just adapted the code for *BSD/Linux so it makes more sense when asking the question.

I understand that portability is not an issue, since you are working on an embedded system. Otherwise, you should use a portable format such as XML.
You can change your struct back to:
struct d_object {
int flags;
int time;
int offset;
char * filename;
};
And then save each piece of data individually:
write( fd, &record.flags, sizeof( int ) );
write( fd, &record.time, sizeof( int ) );
write( fd, &record.offset, sizeof( int ) );
int filename_length = strlen( record.filename );
write( fd, &filename_length, sizeof( int ) );
write( fd, record.filename, filename_length );
For reading, you'll have to read each item separately, and then the filename:
int filename_length;
read( fd, &emptyRecord.flags, sizeof( int ) );
read( fd, &emptyRecord.time, sizeof( int ) );
read( fd, &emptyRecord.offset, sizeof( int ) );
read( fd, &filename_length, sizeof( int ) );
emptyRecord.filename = (char *) malloc( sizeof( char ) * ( filename_length +1) );
read( fd, emptyRecord.filename, filename_length );
*( emptyRecord.filename + filename_length ) = 0;
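The field-by-field approach above can be wrapped into a pair of helpers. The following is only a sketch under some assumptions: the names save_object, load_object, write_all and read_all are made up for illustration, the length is stored as a uint32_t, and the file is read back on the same kind of machine that wrote it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct d_object {
    int flags;
    int time;
    int offset;
    char *filename;
};

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; returns 0 on success, -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t len) {
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Layout on disk: flags, time, offset, name length, name bytes (no terminator). */
int save_object(int fd, const struct d_object *obj) {
    assert(obj != NULL && obj->filename != NULL);
    uint32_t len = (uint32_t)strlen(obj->filename);
    if (write_all(fd, &obj->flags, sizeof obj->flags) ||
        write_all(fd, &obj->time, sizeof obj->time) ||
        write_all(fd, &obj->offset, sizeof obj->offset) ||
        write_all(fd, &len, sizeof len) ||
        write_all(fd, obj->filename, len))
        return -1;
    return 0;
}

/* On success obj->filename is malloc'd and must be freed by the caller. */
int load_object(int fd, struct d_object *obj) {
    uint32_t len;
    assert(obj != NULL);
    if (read_all(fd, &obj->flags, sizeof obj->flags) ||
        read_all(fd, &obj->time, sizeof obj->time) ||
        read_all(fd, &obj->offset, sizeof obj->offset) ||
        read_all(fd, &len, sizeof len))
        return -1;
    obj->filename = malloc((size_t)len + 1);
    if (obj->filename == NULL) return -1;
    if (read_all(fd, obj->filename, len)) {
        free(obj->filename);
        obj->filename = NULL;
        return -1;
    }
    obj->filename[len] = '\0';
    return 0;
}
```

Checking every read and write matters here, because a short read would otherwise leave the length or the string half-filled.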

Serialization is never pretty. How about storing the length of the string in the pointer, and letting the string follow the struct in the file? Something like this (warning, brain-compiled code):
void write_object(struct d_object *s, int fd) {
struct d_object copy = *s;
copy.filename = (char*)strlen(s->filename);
write(fd, &copy, sizeof(copy));
write(fd, s->filename, (size_t)copy.filename);
}
void read_object(struct d_object *s, int fd) {
read(fd, s, sizeof(struct d_object));
char *filename = malloc(((size_t)s->filename) + 1);
read(fd, filename, (size_t)s->filename);
filename[(size_t)s->filename] = '\0';
s->filename = filename;
}

Now that I know the nature of your problem, why not try flexible arrays? Instead of using char *filename;, use char filename[1] and malloc(sizeof(struct d_object) + filename_len) to allocate your structs. Add a size member, and you can easily write the object to disk with a single call to write and load it from disk with two calls (the first to read the size member, the second to read the rest of the object after allocating it).
Note that the "official" way to do flexible arrays in C99 is [] rather than [1]. The [1] form (the so-called struct hack) works in practice on C89 compilers too, although strictly speaking the standard does not guarantee it. [1] will also waste a few bytes, so if your compiler supports [] you should prefer it.
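As a rough sketch of the flexible-array idea, assuming a C99 compiler: the helper names d_object_new, d_object_write and d_object_read are invented for this example, and the size member is placed first so that it can be read before the rest of the object.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct d_object {
    size_t size;      /* total bytes in this object, filename included */
    int flags;
    int time;
    int offset;
    char filename[];  /* C99 flexible array member */
};

/* Allocate one block holding the fixed fields and the name. */
struct d_object *d_object_new(int flags, int time, int offset, const char *name) {
    assert(name != NULL);
    size_t size = sizeof(struct d_object) + strlen(name) + 1;
    struct d_object *obj = calloc(1, size);  /* zeroed, so padding is defined */
    if (obj == NULL) return NULL;
    obj->size = size;
    obj->flags = flags;
    obj->time = time;
    obj->offset = offset;
    strcpy(obj->filename, name);
    return obj;
}

/* One write() call saves the whole object. */
int d_object_write(int fd, const struct d_object *obj) {
    return write(fd, obj, obj->size) == (ssize_t)obj->size ? 0 : -1;
}

/* Two read() calls: the size first, then the remainder. */
struct d_object *d_object_read(int fd) {
    size_t size;
    if (read(fd, &size, sizeof size) != (ssize_t)sizeof size) return NULL;
    /* A real loader should also reject absurdly large sizes. */
    if (size <= sizeof(struct d_object)) return NULL;
    struct d_object *obj = malloc(size);
    if (obj == NULL) return NULL;
    obj->size = size;
    size_t rest = size - sizeof size;
    if (read(fd, (char *)obj + sizeof size, rest) != (ssize_t)rest) {
        free(obj);
        return NULL;
    }
    ((char *)obj)[size - 1] = '\0';  /* never trust the file to be terminated */
    return obj;
}
```

The file produced this way is still tied to the host's int size, padding and byte order, which is acceptable only when the same system reads it back.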

Your problem here is really a symptom of a much larger issue: you shouldn't be reading/writing binary data structures between memory and files. Not only is there no clear way to read/write structures with pointers to other data (this applies not only to strings but to nested data, linked lists, etc.) but the format of the data on disk will depend on your host machine and C implementation and will not be portable to other environments.
Instead, you should design a format for your data on disk and write functions to save and load the data, or use an existing format and find library code for using it. This is usually referred to as "serialization".
If your data is hierarchical, JSON, XML, or EBML may be appropriate. If it's fairly simple, flat text files or a homebrew binary format (written byte-by-byte so it's portable) may be appropriate.
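To make "written byte-by-byte" concrete, here is a minimal sketch of how a homebrew format might store a 32-bit integer so that the bytes on disk do not depend on the host's byte order. The function names are hypothetical and little-endian order is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v as four bytes, least significant first, whatever the host order is. */
void put_u32_le(unsigned char *out, uint32_t v) {
    assert(out != NULL);
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Rebuild the value from the same four bytes. */
uint32_t get_u32_le(const unsigned char *in) {
    assert(in != NULL);
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Each field of a record would be passed through helpers like these before being written, instead of dumping the struct's memory directly.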
Since you seem to be unfamiliar with these issues, it might be worthwhile to write some code for loading/saving a few simple binary file formats (like .tga or .wav) as an exercise before you try to design something for your own data.

Nope, there's no magic built directly into the language to do this for you.
What you need to do is to write a couple of functions to convert your data into writable form, and to read it back in from file. This is called "serialization" / "deserialization" in languages that make more of a fuss about it.
For your particular structure, you could write the binary fields to the file straight from the struct, and then follow them with the contents of the character buffer. You could make things easier for yourself at read time if you precede the character data with an int specifying its length.
When you read that stuff back in, you'll want to malloc/calloc yourself a chunk of memory to hold the char data in; if you stored the size you'll know just how big to make that malloc. Read the binary data into the struct, read the char data into the malloc'd memory, store the pointer to the malloc chunk into the struct, and you've re-created the original.
No magic. Just data and a bit of elbow grease.
EDIT
While I wrote about doing this, Thomas coded an example. I think our answers complement each other very well; together they should tell you everything you need to know.

Related

How do I write a C function that returns a variable-length string?

I need to be able to check in a kernel module whether or not a file descriptor, dentry, or inode falls under a certain path. To do this, I am going to have to write a function that when given a dentry or a file descriptor (not sure which, yet), will return said object's full path name.
What is the way to write a function that returns variable-length strings?
You can try like this:
char *myFunction(void)
{
char *word;
word = malloc (some_random_length); /* a byte count, not sizeof of a count */
//add some random characters
return word;
}
You can also refer to the related thread: best practice for returning a variable length string in c
The typical way to do this in C, is not to return anything at all:
void func (char* buf, size_t buf_size, size_t* length);
Where buf is a pointer to the buffer which will hold the string, allocated by the caller. buf_size is the size of that buffer. And length is how much of that buffer that the function used.
You could return a pointer to buf as done by for example strcpy. But this doesn't make much sense, since the same pointer already exists in one of the parameters. It adds nothing but confusion.
(Don't use strcpy, strcat etc functions as some role model for how to write functions. Many C standard library functions have obscure prototypes, because they are so terribly old, from a time when good programming practice wasn't invented, or at least not known by Dennis Ritchie.)
There are two common approaches:
One is to have a fixed size buffer to store the result:
int makeFullPath(char *buffer,size_t max_size,...)
{
int actual_size = snprintf(buffer,max_size,...);
return actual_size;
}
Examples of standard functions which use this approach are strncpy() and snprintf(). This approach has the advantage that no dynamic memory allocation is needed, which will give better performance for time-critical functions. The downside is that it puts more responsibility on the caller to be able to determine the largest possible result size in advance or be ready to reallocate if a larger size is necessary.
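One way the caller can tell whether the fixed buffer was large enough is to compare snprintf's return value with the buffer size. This is a user-space sketch; make_full_path and path_fits are hypothetical names, and joining a directory and a name with "/" is just a stand-in for the real path logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes "dir/name" into buffer. Returns the length the full result needs
   (not counting the terminator), even when it had to be truncated. */
int make_full_path(char *buffer, size_t max_size, const char *dir, const char *name) {
    assert(dir != NULL && name != NULL);
    return snprintf(buffer, max_size, "%s/%s", dir, name);
}

/* Returns 1 if a result of the given length fit in max_size bytes, 0 otherwise. */
int path_fits(int needed, size_t max_size) {
    return needed >= 0 && (size_t)needed < max_size;
}
```

If path_fits reports 0, the caller knows from the returned length exactly how large a buffer to retry with.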
The second common approach is to calculate how big of a buffer to use and allocate that many bytes internally:
// Caller eventually needs to free() the result.
char* makeFullPath(...)
{
size_t max_size = calculateFullPathSize(...);
char *buffer = malloc(max_size);
if (!buffer) return NULL;
int actual_size = snprintf(buffer,max_size,...);
assert(actual_size<max_size);
return buffer;
}
An example of a standard function that uses this approach is strdup(). The advantage is that the caller no longer needs to worry about the size, but they now need to make sure that they free the result. For a kernel module, you would use kmalloc() and kfree() instead of malloc() and free().
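In user space, one possible way to "calculate how big of a buffer to use" is to call snprintf twice, first with a NULL buffer to measure. This sketch assumes a C99 snprintf; make_full_path_alloc is a hypothetical name, and kernel code would need kmalloc() and kfree() instead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns a malloc'd "dir/name"; the caller must free() it. NULL on failure. */
char *make_full_path_alloc(const char *dir, const char *name) {
    assert(dir != NULL && name != NULL);
    /* First pass: a NULL buffer of size 0 only reports the length needed. */
    int needed = snprintf(NULL, 0, "%s/%s", dir, name);
    if (needed < 0) return NULL;
    char *buffer = malloc((size_t)needed + 1);
    if (buffer == NULL) return NULL;
    /* Second pass: the buffer is now exactly large enough. */
    snprintf(buffer, (size_t)needed + 1, "%s/%s", dir, name);
    return buffer;
}
```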
A less common approach is to have a static buffer:
const char *makeFullPath(...)
{
static char buffer[MAX_PATH];
int actual_size = snprintf(buffer,MAX_PATH,...);
return buffer;
}
This avoids the caller having to worry about the size or freeing the result, and it is also efficient, but it has the downside that the caller now has to make sure that they don't call the function a second time while the result of the first call is still being used.
const char *result1 = makeFullPath(...);
const char *result2 = makeFullPath(...);
printf("%s",result1);
printf("%s",result2); /* oops! */
Here, the caller probably meant to print two separate strings, but they'll actually just get the second string twice. This is also problematic in multi-threaded code, and probably unusable for kernel code.
For example:
char * fn( int file_id )
{
static char res[MAX_PATH];
// fill res[]
return res;
}
/*
let do it the BSTR way (BasicString of VB)
*/
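char * CopyString(const char *str){
unsigned short len;
char *buff;
len=(unsigned short)strlen(str); /* strlen instead of the Windows-only lstrlen */
buff=malloc(sizeof(short)+len+1);
if(buff){
((short*)buff)[0]=len+1; /* length prefix, counting the terminator */
buff=(char*)&((short*)buff)[1]; /* step past the prefix */
strcpy(buff,str);
}
return buff;
}
#define len_of_string(s) (((short*)(s))[-1])
#define free_string(s) free(&((short*)(s))[-1])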
int main(){
char *buff=CopyString("full_path_name");
if(buff){
printf("len of string= %d\n",len_of_string(buff));
free_string(buff);
}else{
printf("Error: malloc failed\n");
}
return 0;
}
/*
now you can imagine how to reallocate the string to a new size
*/

Prepend to unsigned char pointer in C?

In C I am reading binary data from a file into a var data like this:
unsigned char *data;
data = malloc(size);
int read_size = fread(data, 1, size, fp);
I want to prepend the var data with <filename><size> of the file. How can I achieve this?
It's not a legal C string because it's binary data with null bytes potentially all over the place.
I know to make sure I allocate it with enough memory, I just can't figure out how to actually prepend it.
1. Allocate enough memory for data to hold both the prefix and the file contents.
2. Copy the prefix into it.
3. Get a pointer to just past what was copied in step 2.
4. Pass this pointer to fread().
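The four steps above can be sketched as a single function. The name read_with_prefix and its parameters are invented for this example; the prefix is treated as raw bytes, so it may itself contain null bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd buffer holding <prefix><up to size bytes read from fp>.
   *out_len receives the total number of valid bytes. Caller frees the result. */
unsigned char *read_with_prefix(FILE *fp, size_t size,
                                const void *prefix, size_t prefix_len,
                                size_t *out_len) {
    assert(fp != NULL && prefix != NULL && out_len != NULL);
    unsigned char *data = malloc(prefix_len + size);  /* step 1 */
    if (data == NULL) return NULL;
    memcpy(data, prefix, prefix_len);                 /* step 2 */
    unsigned char *body = data + prefix_len;          /* step 3 */
    size_t read_size = fread(body, 1, size, fp);      /* step 4 */
    *out_len = prefix_len + read_size;
    return data;
}
```

memcpy is used rather than strcpy so that nothing in the buffer is treated as a C string.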
Define your own data format for storage:
<uint64_t datalength><string name><char[datalength] contents>
Or for easier in-app use:
struct named_file {
char* contents;
uint64_t datasize;
char name[]; // contents begin directly after the name.
};
Allocate the struct with enough space: sizeof(struct named_file)+strlen(_name)+1+_datasize
strcpy(name, _name)
contents = name+strlen(name)+1
Save the data through the contents pointer: memcpy(), direct reading, whatever.
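Put together, the allocation described above might look like the following sketch. named_file_new is a hypothetical constructor, and copying the data with memcpy is only one of the options mentioned; reading straight from a file into contents would work the same way.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct named_file {
    char *contents;     /* points into this same allocation, just past name */
    uint64_t datasize;
    char name[];        /* name, then its terminator, then the contents */
};

/* One malloc holds the header, the name and the data; one free() releases it. */
struct named_file *named_file_new(const char *name, const void *data, uint64_t datasize) {
    assert(name != NULL);
    size_t name_len = strlen(name) + 1;  /* include the terminator */
    struct named_file *f = malloc(sizeof *f + name_len + (size_t)datasize);
    if (f == NULL) return NULL;
    strcpy(f->name, name);
    f->contents = f->name + name_len;
    f->datasize = datasize;
    if (datasize > 0) memcpy(f->contents, data, (size_t)datasize);
    return f;
}
```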

Storing data into a structure full of pointers

I have a program that is reading a file, but not saving into the structure. Once the data is read, it should be saved within the structure in order for the program to be able to use said data later. I'm having a heck of a time figuring out how to get this done.
structure
typedef struct friends_contact {
char *First_Name;
char *Last_Name;
char *home_phone;
char *cell_phone;
} fr;
Reading of the file
void ReadFile(fr *friends, int *counter, char buffer[], FILE *read) {
fseek(read, 0, SEEK_SET);
while (fscanf(read, "%s", buffer) != EOF) {
friends[*counter].First_Name = malloc(BUFFSIZE * strlen(buffer));
strcpy(friends[*counter].First_Name, buffer);
}
}
More information can be provided as needed. I just want to figure out why the information isn't saving within the structure so that it can be called on later.
What is "friends"? A global variable?
What is "contacts"? It is not used in the function.
Maybe you mixed them up?
BUFFSIZE * strlen(buffer): what do you mean by this? You allocate strlen(buffer) bytes BUFFSIZE times over.
It should probably be sizeof(char) * (strlen(buffer) + 1), with the extra byte for the null terminator.
I also think you should check the length of "buffer" after the fscanf call.
The code you use for allocating space for the char array and then copying into it makes sense, but one of two things could be happening in your while() loop. Either the condition evaluates to false immediately, so nothing is copied, or you iterate over the whole file and every word overwrites friends[*counter].First_Name, so only the last one survives (and the earlier allocations leak). Shouldn't you increment *counter in the body of the while()?
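A possible corrected loop, reduced to the First_Name field only, is sketched below. The function name read_first_names, the max_friends parameter and the 63-character word limit are assumptions made for the example, not part of the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct friends_contact {
    char *First_Name;
    char *Last_Name;
    char *home_phone;
    char *cell_phone;
} fr;

/* Stores each whitespace-separated word as one First_Name.
   Returns how many entries were filled in. */
int read_first_names(fr *friends, int max_friends, FILE *read) {
    char buffer[64];
    int count = 0;
    assert(friends != NULL && read != NULL);
    rewind(read);
    /* %63s bounds the read; == 1 also stops on a matching failure, not just EOF. */
    while (count < max_friends && fscanf(read, "%63s", buffer) == 1) {
        friends[count].First_Name = malloc(strlen(buffer) + 1);
        if (friends[count].First_Name == NULL) break;
        strcpy(friends[count].First_Name, buffer);
        count++;  /* move on to the next slot instead of overwriting this one */
    }
    return count;
}
```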

Using a C string like a FILE*

I have a C function that reads a stream of characters from a FILE*.
How might I create a FILE* from a string in this situation?
Edit:
I think my original post may have been misleading. I want to create a FILE* from a literal string value, so that the resulting FILE* would behave as though there really was a file somewhere that contains the string without actually creating a file.
The following is what I would like to do:
void parse(FILE* f, Element* result);
int main(int argc, char** argv){
FILE* f = mysteryFunc("hello world!");
Element result;
parse(f,&result);
}
Standard C provides no such facility, but POSIX defines the fmemopen() function that does exactly what you want.
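For illustration, the mysteryFunc from the question could be a thin wrapper around fmemopen on a POSIX.1-2008 system. The wrapper name open_string is made up; the feature-test macro is there to request the POSIX declarations.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Opens a read-only stream over the bytes of s (terminator excluded).
   The string must stay alive until the stream is closed with fclose(). */
FILE *open_string(const char *s) {
    assert(s != NULL);
    /* fmemopen takes a non-const pointer, but mode "r" never writes to it. */
    return fmemopen((void *)s, strlen(s), "r");
}
```

The returned FILE* can then be handed to parse() like any other stream.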
Unfortunately, C's standard library doesn't provide this functionality; but there are a few ways to get around it:
Create a temporary file, write your string to it, then open it for reading. If you've got POSIX, mkstemp will create a uniquely named file for you.
The other option (again for POSIX only) is to fork a new process, whose job will be to write the string to a pipe, while you fdopen the other end to obtain a FILE* for your function.
As @KeithThompson pointed out, fmemopen does exactly what you want, so if you have POSIX, use that. On any other platform (unless you can find a platform equivalent), you'll need a temporary file.
Last time I had this kind of problem I actually created a pipe, launched a thread, and used the thread to write the data into the pipe... you would have to look into operating system calls, though.
There are probably other ways, like creating a memory mapped file, but I was looking for something that just worked without a lot of work and research.
EDIT: you can, of course, change the problem to "how do I find a nice temporary filename". Then you could write the data to a file, and read it back in :-)
pid_t pid;
int pipeIDs[2];
if (pipe (pipeIDs)) {
fprintf (stderr, "ERROR, cannot create pipe.\n");
return EXIT_FAILURE;
}
pid = fork ();
if (pid == (pid_t) 0) {
/* Child process: write to the pipe */
close (pipeIDs[0]);
FILE * file = fdopen (pipeIDs[1], "w");
fprintf (file, "Hello world");
fclose (file);
_exit (EXIT_SUCCESS);
} else if (pid < (pid_t) 0) {
fprintf (stderr, "ERROR, cannot fork.\n");
return EXIT_FAILURE;
}
/* Parent: close the write end so reads see EOF once the child is done */
close (pipeIDs[1]);
FILE* myFile = fdopen (pipeIDs[0], "r");
// DONE! You can read the string from myFile
.... .....
Maybe you can change the code a little bit to receive a custom handle.
void parse(struct my_handle *h, Element *result)
{
// read from handle and process
// call h->read instead of fread
}
and define the handle like this:
struct my_handle
{
// wrapper for fread or something
int (*read)(struct my_handle *h, char *buffer, int readsize);
// maybe some more methods you need
};
implement your FILE* wrapper
struct my_file_handle
{
struct my_handle base;
FILE *fp;
};
int read_from_file(struct my_handle *h, char *buffer, int readsize)
{
return (int)fread(buffer, 1, readsize, ((struct my_file_handle*)h)->fp);
}
// function to init the FILE* wrapper
void init_my_file_handle(struct my_file_handle *h, FILE *fp)
{
h->base.read = read_from_file;
h->fp = fp;
}
Now, implement your string reader
struct my_string_handle
{
struct my_handle base;
// string buffer, size, and current position
const char *buffer;
int size;
int position;
};
// string reader
int read_from_string(struct my_handle *h, char *buffer, int readsize)
{
// implement it yourself. It's easy.
}
// create string reader handle
void init_my_string_handle(struct my_string_handle *h, const char *str, int strsize)
{
// i think you know how to init it now.
}
//////////////////////////////////////////////////
And now, you can simply send a handle to your parse function. The function doesn't care where the data comes from, it can even read data from network!
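The two functions left as an exercise could be filled in along these lines. This is one possible sketch, reusing the struct definitions from the answer; it returns 0 at end of data, mirroring what fread does at end of file.

```c
#include <assert.h>
#include <string.h>

struct my_handle {
    int (*read)(struct my_handle *h, char *buffer, int readsize);
};

struct my_string_handle {
    struct my_handle base;  /* must stay the first member for the cast below */
    const char *buffer;
    int size;
    int position;
};

/* Copies up to readsize bytes from the current position; returns the count copied. */
int read_from_string(struct my_handle *h, char *buffer, int readsize) {
    struct my_string_handle *sh = (struct my_string_handle *)h;
    int remaining = sh->size - sh->position;
    int n = readsize < remaining ? readsize : remaining;
    if (n <= 0) return 0;
    memcpy(buffer, sh->buffer + sh->position, (size_t)n);
    sh->position += n;
    return n;
}

/* Points the handle at str; the string is not copied and must outlive the handle. */
void init_my_string_handle(struct my_string_handle *h, const char *str, int strsize) {
    assert(h != NULL && str != NULL && strsize >= 0);
    h->base.read = read_from_string;
    h->buffer = str;
    h->size = strsize;
    h->position = 0;
}
```

A parser would call h->read(h, buf, n) without knowing which kind of handle it was given.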
This is an old question, but deserves a better answer.
C has always had the ability to read and write strings using the formatted I/O functions. You just need to keep track of where you are in the string!
Reading a string
To read a string you need the %n format string specifier, which returns the number of bytes read each time we use sscanf(). Here is a simple example with a loop:
#include <stdio.h>
int main(void)
{
const char * s = "2 3 5 7";
int n = 0;
int value;
while (sscanf( s+=n, "%d%n", &value, &n ) == 1)
{
printf( "value = %d\n", value );
}
}
Another way to have written that loop would be:
for (int value, n; sscanf( s, "%d%n", &value, &n ) == 1; s += n)
Whichever floats your boat best.
The loop is not important.
What is important is that we increment the value of s after every read.
Notice how we don’t bother to remember the original value of s in this example? If it matters, use a temporary, as we do in our next example.
It is also important that we stop reading when sscanf fails. This is the normal usage for the scanf family of functions.
Writing a string
In this case sprintf() helps us by directly returning the number of bytes written. Here’s a simple example of building a string using several formatted outputs:
#include <stdio.h>
#include <string.h> /* strlen */
int main(void)
{
char s[100] = {0};
char * p = s;
p += sprintf( p, "%d %s", 3, "three" );
p += sprintf( p, "; " );
p += sprintf( p, "%.2f %s", 3.141592, "pi" );
*p = '\0'; // redundant but harmless: sprintf() already terminated the string
printf( "s = \"%s\"\n", s );
printf( "number of bytes written = %td = %zu\n", p-s, strlen(s) );
}
The important points:
This time we do not want to clobber s (and in this particular example couldn’t even if we wanted to), so we use a helper p.
sprintf() writes a null terminator after every call, and its return value does not count it, so p always points at the terminator and the next call overwrites it. (Terminating manually is only needed if you build the string byte by byte.)
BUFFER OVERFLOW IS POSSIBLE
That last point is significant, and a usual concern when building strings in C. As always, whether using strcat() or sprintf(), always make sure you have enough room to append everything you intend to write to your string!
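One way to guard against the overflow is to route every append through a bounded helper built on vsnprintf. The sketch below is only an illustration: append_fmt is a hypothetical name, and discarding a piece that does not fit is a design choice, not the only option.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Appends formatted text at offset *pos inside buf, which holds cap bytes.
   Returns 0 on success. Returns -1 if the text did not fit, in which case
   the string is left exactly as it was before the call. */
int append_fmt(char *buf, size_t cap, size_t *pos, const char *fmt, ...) {
    assert(buf != NULL && pos != NULL && fmt != NULL && *pos < cap);
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf + *pos, cap - *pos, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= cap - *pos) {
        buf[*pos] = '\0';  /* discard the partial write */
        return -1;
    }
    *pos += (size_t)n;
    return 0;
}
```

The caller keeps a size_t position, starting at 0 with buf[0] set to '\0', in place of the raw pointer p used earlier.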
Don’t use %n when writing
We could have used the %n specifier as well, but then we hit a cross-platform issue with MSVC: Microsoft treats %n in the printf() family of functions as a security risk and disables it by default. Whether or not you accept Microsoft’s reasoning you must live with the way things are.
If you are undeterred, you can add a little platform-specific code and use it anyway:
#ifdef _WIN32
_set_printf_count_output( 1 );
#endif
int n;
printf( "Hello%n world!", &n );
True FILE * I/O
Notice that we aren’t touching actual FILE * I/O functions, like fgetc()? If you need that, then you need an actual file.
As mentioned above, use tmpfile() to open a temporary read/write file and use the usual FILE * I/O functions on it. Our read-a-string example could be re-written as:
#include <stdio.h>
int main(void)
{
FILE * f = tmpfile();
if (!f) return 1;
fprintf( f, "2 3 5 7" );
rewind( f );
int value;
while (fscanf( f, "%d", &value ) == 1)
{
printf( "value = %d\n", value );
}
fclose( f );
}
This works just fine. Remember that tmpfile() might not give you an actual file on disk with a filename. You don’t need that anyway. In other words, it may very well be an in-memory buffer provided by the OS... which is kind of what this thread is about anyway, right?
Hopefully these options will give a deeper insight into the C standard I/O functions and their use. Next time you need to read or build a formatted string in parts, you will have a better grasp of the tools already provided for you.

Passing values to a function in C

I am new to C and have been working with it for two months. I have the structure shown below:
struct profile_t
{
unsigned char length;
unsigned char type;
unsigned char *data;
};
typedef struct profile_datagram_t
{
unsigned char *src;
unsigned char *dst;
unsigned char ver;
unsigned char n;
struct profile_t profiles[MAXPROFILES];
} header;
header outObj;
Now the values inside the elements of the structure are read as outObj.src[i], outObj.dst[i], and outObj.profiles[i].type.
Now I want to call a function and pass the values read by me to a function which is actually a Berkeley DB.
int main(void)
{
struct pearson_record {
unsigned char src[6];
unsigned char dst[6];
unsigned char type;
};
DB *dbp;
int ret;
if ((ret = db_create(&dbp, dbenv, 0)) !=0)
handle_error(ret);
if ((ret = dbp->open(dbp, NULL, "pearson.db", NULL, DB_BTREE, DB_CREATE, 0600)) !=0)
handle_error(ret);
const DBT *pkey;
const DBT *pdata;
struct pearson_record p;
DBT data, key;
memset(&key, 0, sizeof(DBT));
memset(&data, 0, sizeof(DBT));
memset(&p, 0, sizeof(struct pearson_record));
The above code was written by following an example from the DB reference guide, but I don't understand what const DBT is. They also filled in the structure using memcpy, which I understand is the right way, but now I want to memcpy the values mentioned above into the structure pearson_record. How should I go about this? Any help would be appreciated.
Please post the complete code. You mention "they memcopy" (which I assume you refer to memcpy), but all I see is a bunch of memset(*,0). Hope you're not confusing them.
Also, "they have added the value inside structure using memcopy which I know is the right way" is not entirely true. It's not necessarily wrong, but char* is usually interpreted as a C string, that is, an array of bytes representing characters which MUST be null-terminated (the last character must be 0, equivalent to '\0'). The proper way to copy strings is strcpy() (or strcpy_s on Windows); memcpy is faster and is meant for other situations, such as fixed-size binary buffers.
unsigned char* is less common for text (at least I have not seen it until now); it is mostly used for raw bytes. As a note, read about char, unsigned char, signed char, char[] and char* (not that it changes your code in any way, but just to make sure you understand the differences).
As for copying data, I assume you mean src, dst and type from pearson_record to header, correct ? If so, for the sake of simplicity I wanted to suggest memcpy but you say that each element is accessed as [i]. Does that mean header.src is an array of more than one pearson_record.src or does header.src[i] correspond to pearson_record.src[i] ? This is slightly unclear to me.
There is a difference between char* src and char* *src.
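If the intent is to copy one src, one dst and one profile type into a pearson_record, a sketch could look like this. It rests on assumptions the question does not confirm: that src and dst each point to at least 6 bytes, that MAXPROFILES has the value shown, and that the helper name fill_record is free to use.

```c
#include <assert.h>
#include <string.h>

#define MAXPROFILES 4  /* assumed value; the question does not show it */

struct profile_t {
    unsigned char length;
    unsigned char type;
    unsigned char *data;
};

typedef struct profile_datagram_t {
    unsigned char *src;
    unsigned char *dst;
    unsigned char ver;
    unsigned char n;
    struct profile_t profiles[MAXPROFILES];
} header;

struct pearson_record {
    unsigned char src[6];
    unsigned char dst[6];
    unsigned char type;
};

/* Assumes in->src and in->dst each point to at least 6 readable bytes. */
void fill_record(struct pearson_record *out, const header *in, unsigned char profile_index) {
    assert(out != NULL && in != NULL && in->src != NULL && in->dst != NULL);
    assert(profile_index < in->n && profile_index < MAXPROFILES);
    memset(out, 0, sizeof *out);
    /* These are fixed-size binary fields, not strings, so memcpy is the right tool. */
    memcpy(out->src, in->src, sizeof out->src);
    memcpy(out->dst, in->dst, sizeof out->dst);
    out->type = in->profiles[profile_index].type;
}
```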
