Extra zeroes when writing structure to file in C - c

I'm writing an RLE compressor. The idea is to take the runs in a file (consecutive repetitions of the same byte) and convert each run into a packet. For example, if the file contains:
0xFF 0xFF 0xFF 0xFF 0xBB 0xBB 0xBB
each run becomes a packet holding a length and the data byte, giving 0x04FF 0x03BB.
I use a structure to define the packet (byte is a typedef for an unsigned 8-bit type):
typedef struct /* Each packet has a length and a data byte */
{
byte length; /* how many elements of data there is */
byte data; /* whatever byte is being repeated */
} PACKET;
and I have my writeData function:
void writeData(FILE* f, int offset, PACKET p)
{
fseek(f,offset,SEEK_SET); /* move to the offset to write */
fwrite(&p,sizeof(PACKET),sizeof(PACKET),f); /* write the packet to the given offset */
}
And here is the code that calls the function
offsetCounter = 0x0; /* reset offset counter */
for(i = 0; i < nPackets; i++) /* nPackets is the total amount of packets created */
{
writeData(fDest,offsetCounter,packet[i]);
printf("Wrote %d:0x%X to 0x%X\n",packet[i].length,packet[i].data,offsetCounter);
offsetCounter += 0x02; /* skip 2 bytes to write the next packet */
}
When I run the program, the packets are written to the file correctly, except that two extra 0x00 bytes are appended at the very end of the file.
However, when I write the writeData function like this:
void writeData(FILE* f, int offset, PACKET p)
{
fseek(f,offset,SEEK_SET);
fwrite(&p.length,sizeof(byte),sizeof(byte),f);
fwrite(&p.data,sizeof(byte),sizeof(byte),f);
}
Written this way, with each member of the structure written separately, it no longer adds the 2 extra bytes to the end of the file.
I'm really confused: writing the whole structure works but adds 2 extra bytes at the end, while writing each member separately doesn't.
Could somebody help me figure this out?

This line is incorrect:
fwrite(&p,sizeof(PACKET),sizeof(PACKET),f);
fwrite() takes an element size and an element count and writes size * count bytes. The structure size is 2, so you are asking for 2 elements of 2 bytes each: 4 bytes per call, the last 2 of which are read from memory past the end of p. That is where your two extra bytes come from. It should be
fwrite(&p,sizeof(PACKET),1,f);
The extra bytes only show up at the end of the file because you use fseek() to position the file pointer, which advances correctly by 2, so each packet overwrites the 2 surplus bytes of the previous call and only the surplus of the last call survives. Note that when you are writing sequentially to a file, you don't need fseek() at all. Also, you have hard-coded offsetCounter += 0x02; instead of using offsetCounter += sizeof(PACKET).
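As a rough illustration of the fix, here is a minimal sketch that writes the packets sequentially with a count of 1 and no fseek(). The function name writePackets is made up for this example, and the byte typedef is assumed to be unsigned char.

```c
#include <assert.h>
#include <stdio.h>

typedef unsigned char byte; /* assumed definition of the question's byte type */

typedef struct {
    byte length; /* number of repetitions */
    byte data;   /* the repeated byte */
} PACKET;

/* Hypothetical helper: write nPackets packets back to back.
   Returns the number of packets actually written. */
size_t writePackets(FILE *f, const PACKET *packets, size_t nPackets)
{
    size_t written = 0;

    assert(f != NULL && packets != NULL);
    for (size_t i = 0; i < nPackets; i++) {
        /* size = sizeof(PACKET), count = 1: exactly one packet per call */
        if (fwrite(&packets[i], sizeof(PACKET), 1, f) != 1)
            break;
        written++;
    }
    return written;
}
```

Calling writePackets(fDest, packet, nPackets) once would replace the whole loop in the question; comparing the return value with nPackets tells you whether every packet was written.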

Related

Processing binary data containing different length

So I have some binary data that I read, process and need to "split" into different variables, like this:
int *buffer;
buffer = malloc(size);
fread(buffer,size,1,file);
buffer = foo(buffer);
The result looks something like this in my debugger:
01326A18 5F4E8E19 5F0A0000
I want the first byte ( 01 ) to be int a.
The following 4 bytes are the first timestamp b (should be 5F186A32)
The following 4 bytes are the second timestamp c (should be 5F198E4E)
The 0A is supposed to be int d.
My problem is that I can put the 1 into a with (*buffer) & 0xff;, but I can't read the first timestamp correctly, since it spans the second to fifth bytes and is not aligned with the int elements of the buffer.
If I print *(buffer +1) it gives me the second int and prints "198E4E5F".
It would be better if I were able to read n bytes from any position in my data.
Thanks in advance.
Something like this will work on most little-endian platforms with GCC or Clang. Just fread the same way.
struct __attribute__((packed)) { /* the attribute must apply to the type, not the variable */
uint8_t a;
uint32_t timeStamp1;
uint32_t timeStamp2;
uint8_t d;
} buffer;
assert(sizeof buffer == 10); /* check packing */
If you declare your buffer as char* (or unsigned char*), pointer arithmetic works in 1-byte steps: buffer+2 points at the byte at offset 2, whereas with an int pointer (and a 4-byte int) buffer+2 points at offset 8. Also make sure the size you pass to malloc is a byte count, since the buffer is now addressed in 1-byte elements.
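A portable alternative to both suggestions is to treat the data as bytes and assemble each field explicitly. The sketch below assumes the 10-byte layout described in the question (1 byte, two little-endian 32-bit timestamps, 1 byte); struct record, read_u32_le and parse_record are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of the record from the question. */
struct record {
    uint8_t  a;
    uint32_t b; /* first timestamp */
    uint32_t c; /* second timestamp */
    uint8_t  d;
};

/* Decode 4 bytes at p as a little-endian 32-bit value. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Fill *out from the first 10 bytes of buf.
   Returns 0 on success, -1 if fewer than 10 bytes are available. */
int parse_record(const unsigned char *buf, size_t len, struct record *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 10)
        return -1;
    out->a = buf[0];
    out->b = read_u32_le(buf + 1);
    out->c = read_u32_le(buf + 5);
    out->d = buf[9];
    return 0;
}
```

Because the fields are built from individual bytes, this does not depend on struct packing, alignment, or the byte order of the host.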

How to create a linked list from a binary input file? With first few bytes being an int and the next few a char and so on

I want to create a linked list from an input binary file. The first sizeof(int) bytes is an int and the next sizeof(char) bytes is a char and it keeps going like that.
What I want to do is create a linked list from this file where each node in the linked list contains a character and a tree node that contains this int value.
I am stuck when it comes to creating a linked list out of this file. If it was just a regular file with ints and no binary and no chars, I would have used fscanf to read the file and stored its contents into an array, and then I would have traversed through the array and made the nodes. However, I am confused when these chars are present in the file. Could anyone please help me and tell me if there is a way to create a linked list?
Edits ->
ListNode *head = malloc(sizeof(ListNode)*sizeoffile);
//how do i find the size of the file.
//if it was a file with just integers, I would have done something like this
// int value;
// int count = 0;
//while(fscanf(fptr, "%d", &value)==1)
//{
// count++;
//}
//But now that there is chars, I am really confused how I would determine
//the size of the file.
while(!feof(fptr))
{
fread(head, sizeof(int)+sizeof(char), 1, fptr);
}
I know this is not right. ^
Step 1: Assume all data from an external source (e.g. from a file) is potentially malicious and/or corrupted and/or came from a different computer (with different sizeof(int) and different endianness).
Step 2: Define your file format properly (taking step 1 into account). E.g. maybe it's supposed to be a value in the range 123 to 123456 that's stored as 4 consecutive bytes in little endian order (it should never be an int); and maybe it's a byte containing an ASCII character (it should never be a "random whatever character set the compiler felt like using char").
Step 3: Write some code to load data from the file into an array of bytes. If the file is expected to be small you can use realloc() to increase the size of the buffer if the buffer wasn't big enough (but make sure there's a "max. file size" so that a malicious attacker can't trick you into consuming all available RAM and crashing due to "out of memory"). If the file is expected to be larger; look into functions like mmap(). Alternatively, you can have a "read next part of file; parse next part of file" loop that recycles a fixed sized buffer.
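Step 4: Write code to parse the "array of bytes" data and check that it actually complies with the file format specs in every way possible. For example, unsigned long value = buffer[0] | ((unsigned long)buffer[1] << 8) | ((unsigned long)buffer[2] << 16) | ((unsigned long)buffer[3] << 24) followed by if( (value < 123) || (value > 123456) ) { /* Data is malformed */ }. The casts matter: without them each byte is promoted to int, and shifting a set bit into the sign bit is undefined behaviour.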
Step 5: Once you've parsed the data (and written code to handle every conceivable error condition in an appropriate manner, and know for a fact that it must be valid data), you can store the data in a structure and add that structure to a linked list. E.g.
// Parse and check it
if(bufferSize < position + 5) {
return "File ends in the middle of a record";
}
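/* buffer is assumed to be an array of unsigned char; the casts keep the shifts out of signed int */
unsigned long value = buffer[position] | ((unsigned long)buffer[position+1] << 8) | ((unsigned long)buffer[position+2] << 16) | ((unsigned long)buffer[position+3] << 24);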
if( (value < 123) || (value > 123456) ) {
return "Data was malformed (integer out of range)";
}
if( (buffer[position+4] < 0x20) || (buffer[position+4] >= 0x7F) ) {
return "Data was malformed (character not printable ASCII)";
}
// Create a structure
myStructureType * myStruct = malloc(sizeof(myStructureType));
if(myStruct == NULL) {
return "Failed to allocate memory for structure";
}
myStruct->value = value;
myStruct->character = buffer[position+4];
position += 5;
// Add structure to singly linked list
myStruct->next = NULL;
if(listFirst == NULL) {
listFirst = myStruct;
} else {
listLast->next = myStruct;
}
listLast = myStruct;
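Step 3 above (loading the file into an array of bytes with realloc() and a size cap) could be sketched roughly as follows. The function name loadFile, the 4096-byte starting size and the 1 MiB cap are all assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FILE_SIZE (1024u * 1024u) /* assumed cap; pick one that suits the format */

/* Hypothetical helper: read everything from f into a malloc'd buffer.
   On success returns the buffer (caller frees) and stores its length in *outLen.
   Returns NULL on read error, allocation failure, or if the input reaches the cap. */
unsigned char *loadFile(FILE *f, size_t *outLen)
{
    size_t cap = 4096, len = 0;
    unsigned char *buf;

    assert(f != NULL && outLen != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        size_t n;
        if (len == cap) {               /* buffer full: grow it or give up */
            unsigned char *tmp;
            if (cap >= MAX_FILE_SIZE) { /* refuse oversized input */
                free(buf);
                return NULL;
            }
            cap *= 2;
            if (cap > MAX_FILE_SIZE)
                cap = MAX_FILE_SIZE;
            tmp = realloc(buf, cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        n = fread(buf + len, 1, cap - len, f);
        if (n == 0)
            break;                      /* end of file or read error */
        len += n;
    }
    if (ferror(f)) {
        free(buf);
        return NULL;
    }
    *outLen = len;
    return buf;
}
```

The returned buffer and length are what the parsing code above calls buffer and bufferSize; the number of records never has to be known in advance, because the list grows one node per parsed record.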
Ok, so I suggest that you forget about linked lists. Just stick to the first problem: reading data from a binary file.
The text of the problem is unclear about the size of the objects, so let's assume that it says: "There is a binary file which contains widgets composed by a 32 bit integer (little endian) and an 8 bit number representing an ASCII character. Dump all the widgets to stdout one per line representing the integer in base 10 followed by a space and then the character".
Let's assume that your int is 32-bit little endian and your char is an 8-bit byte, which is true on the vast majority of current machines.
Now you have to read the widgets, that is an int and a char. There are usually two functions you have to choose from when reading: fscanf and fread. The first reads data formatted for humans, while the second reads bytes from a file exactly as they are stored. Which one do you need here? The second one.
In your code you write
while (!feof(fptr))
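This is almost always wrong: feof() only becomes true after a read has already failed, so the loop body runs one extra time with stale data. The reliable pattern for reading a file is: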
while (1) {
// Read
// Check
// Use
}
Later you can find a way to read and check in the while condition, but believe me: write it this way the first time.
So let's populate the above template. To check whether fread succeeded, you need to check that it returned the number of elements you asked for.
while (1) {
int i;
char c;
// Read
int ok1 = fread(&i, 4, 1, fptr);
int ok2 = fread(&c, 1, 1, fptr);
// Check
if (ok1 != 1 || ok2 != 1)
break;
// Use
printf("%d %c\n", i, c);
}
Of course you can pack this in the while condition, but I don't see a reason for that.
Now I'd test this with your input and with a good debugger and check if all the data in the file gets printed out. If everything is ok, you can move on to the rest of the problem, that is putting these widgets in a linked list.
Here I assumed that you didn't learn structs yet. If this is not the case, you can work with them:
struct widget {
int i;
char c;
};
[...]
while (1) {
struct widget w;
// Read
int ok1 = fread(&w.i, 4, 1, fptr);
int ok2 = fread(&w.c, 1, 1, fptr);
// Check
if (ok1 != 1 || ok2 != 1)
break;
// Use
printf("%d %c\n", w.i, w.c);
}
Don't be misled by the fact that the widget has the same layout as your data on file. You cannot trust that
fread(&w, 5, 1, fptr); // No! Don't do this
will read your data correctly. When laying out a struct, the compiler can insert padding between fields and after the last one, so I wouldn't be surprised if sizeof(struct widget) returned 8.
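To see what your own compiler does, a small sketch like the one below can help. The function names are invented for this example, and the numbers it prints depend on the platform, so none are shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct widget {
    int i;
    char c;
};

/* Bytes that belong to fields, as opposed to padding. */
size_t widgetPayloadSize(void)
{
    return sizeof(int) + sizeof(char);
}

/* Padding bytes the compiler added to struct widget (often 3, but not guaranteed). */
size_t widgetPaddingSize(void)
{
    assert(sizeof(struct widget) >= widgetPayloadSize());
    return sizeof(struct widget) - widgetPayloadSize();
}

/* Print the layout chosen by this compiler for struct widget. */
void printWidgetLayout(void)
{
    printf("sizeof(struct widget) = %zu\n", sizeof(struct widget));
    printf("offsetof(i) = %zu, offsetof(c) = %zu\n",
           offsetof(struct widget, i), offsetof(struct widget, c));
    printf("padding bytes = %zu\n", widgetPaddingSize());
}
```

If the padding is non-zero, reading sizeof(struct widget) bytes per record would consume more than one 5-byte record from the file, which is why the fields are read one at a time above.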
Disclaimer: I wrote the code directly on the browser and didn't check it!
I think you are getting a little too caught up in something that isn't really a fundamental problem. If you need to create a linked list from a file, you can use fscanf() or fread() or whatever you like to read the file into a buffer and manipulate that buffer as you wish. The same logic you would use for parsing an array of ints read from a file can be applied to parsing a buffer read from a binary file (you say the file holds sizeof(int) and sizeof(char) items consecutively, so I'm going to assume it can be read into a buffer).
You say
"If it was just a regular file with ints and no binary and no chars, I
would have used fscanf to read the file and stored its contents into
an array, and then I would have traversed through the array and made
the nodes"
you can traverse a string, or list of strings (however you decide to parse your buffer) using the same logic to make nodes. That's the beauty of a data structure, or struct if you will, in C.

Convert blob of data to align with structure

char buf[512] = { 0 };
int ret = recv(gSock, buf, 512, 0);
typedef struct _STRUCT {
int package;
int version;
char string[512];
} STRUCT, *PSTRUCT;
PSTRUCT ok;
ok = (PSTRUCT)buf;
I am trying to accept a buffer from a socket (Code not here, but it is working). It accepts it and places it into buf. I then want to cast this buf as a structure STRUCT. I want the first 4 bytes to go into the first member, second 4 bytes into the second member, and then the remaining data to go into the last member. However this is not working like I expected. I am getting weird large numbers that are not what I am receiving.
I entered 1111111111 (10) and the results I got back were;
package = 825307441
version = 825307441
string = 11\n
I did a decimal-to-hex conversion on the package number and it comes back as '31313131', which is my first four 1's. So I am not too sure why it is going from integer to hex and back to an integer. I want exactly what is sent to go into the structure.
You have to review the following functions:
htonl, htons, ntohl, ntohs - convert values between host and network byte order.
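As a sketch of how those functions fit in: the code below assumes the sender transmits two 32-bit integers in network byte order followed by text, which is an assumption about the protocol, not something the question states. Note that typing 1111111111 as text sends the ASCII character '1' (0x31) ten times, which is why the question sees 0x31313131 = 825307441 in both integers; byte-order conversion alone will not change that. The names struct message and parseMessage are invented for this example.

```c
#include <arpa/inet.h> /* ntohl (POSIX); on Windows it comes from <winsock2.h> */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded form of the message. */
struct message {
    uint32_t package;
    uint32_t version;
    char     string[512];
};

/* Decode buf[0..len) into *out.
   Returns 0 on success, -1 if the 8-byte header is incomplete. */
int parseMessage(const char *buf, size_t len, struct message *out)
{
    uint32_t raw;
    size_t textLen;

    assert(buf != NULL && out != NULL);
    if (len < 8)
        return -1;

    memcpy(&raw, buf, 4);      /* memcpy avoids unaligned access through a cast */
    out->package = ntohl(raw); /* network (big-endian) to host order */
    memcpy(&raw, buf + 4, 4);
    out->version = ntohl(raw);

    textLen = len - 8;         /* whatever follows the header is text */
    if (textLen > sizeof out->string - 1)
        textLen = sizeof out->string - 1;
    memcpy(out->string, buf + 8, textLen);
    out->string[textLen] = '\0';
    return 0;
}
```

With the question's code this would be called as parseMessage(buf, (size_t)ret, &msg) after checking that recv returned a positive value.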

How do you read to a struct via fread given specific byte format?

I'm having trouble reading from a binary file into a struct given a specific format. As in, given a byte-by-byte definition (Offset: 0 Length: 2 Value: 0xFF 0xD8, Offset: 2 Length: 2 Value: 0xFF 0xE1, etc), I'm not sure how to define my structure nor utilize file operations like fread(); to correctly take in the information I'm looking for.
Currently, my struct(s) are as follows:
struct header{
char start; //0xFF 0xD8 required.
char app1_marker; //0xFF 0xE1 required. Make sure app0 marker (0xFF 0xE0) isn't before.
char size_of_app1_block; //big endian
char exif_string; //"EXIF" required
char NULL_bytes; //0x00 0x00 required
char endianness; //II or MM (if not II break file)
char version_number; //42 constant
char offset; //4 blank bytes
};
and
struct tag{
char tag_identifier;
char data_type;
char size_of_data;
char data;
};
What data types should I be using if each attribute of the structure has a different (odd) byte length? Some require 2 bytes of space, others 4, and even others are variable/dynamic length. I was thinking of using char arrays, since a char is always one byte in C. Is this a good idea?
Also, what would be the proper way to use fread if I'm trying to read the whole structure in at once? Would I leave it at:
fread(&struct_type, sizeof(char), num_of_bytes, FILE*);
Any guidance to help me move past this wall would be greatly appreciated. I already understand basic structure declarations and construction; the two key issues I'm having are the varying and odd byte sizes of the fields and the proper way to read variable byte sizes into the struct in one fread statement.
Here is the project link:
http://people.cs.pitt.edu/~jmisurda/teaching/cs449/2141/cs0449-2141-project1.htm
I usually see fread() on structures specify the size as the structure size, and the number of elements as 1, and the expectation that fread() returns 1:
size_t result = fread(&record, num_of_bytes, 1, infile);
I am uncertain how you figure out whether your endianness field is II or MM, but I guess the idea is that you decide whether or not to fix up the field values based on whether the file endianness matches the host endianness.
The actual data seems to be the tag structures, and the last field, data, is actually just a placeholder for variable-length data whose length is given in the size_of_data field. So I guess you would first read sizeof(struct tag) - 1 bytes, and then read size_of_data more bytes.
struct tag taghdr;
size_t result = fread(&taghdr, sizeof(taghdr) - 1, 1, infile);
if (result != 1) { /* ...handle error */ }
struct tag *tagdata = malloc(sizeof(taghdr) + taghdr.size_of_data - 1);
if (tagdata == 0) { /*...no more memory */ }
memcpy(tagdata, &taghdr, sizeof(taghdr) - 1);
if (taghdr.size_of_data > 0) {
result = fread(&tagdata->data, taghdr.size_of_data, 1, infile);
if (result != 1) { /*...handle error */ }
}
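For the fixed-size part of the format, one option is to give each field an integer type of the right width and read it field by field, decoding the byte order explicitly. This is only a sketch: struct header here is a cut-down, hypothetical version of the one in the question, and readU16BE and readHeader are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Read a 2-byte big-endian value. Returns 1 on success, 0 on a short read. */
int readU16BE(FILE *f, uint16_t *out)
{
    unsigned char b[2];

    assert(f != NULL && out != NULL);
    if (fread(b, 1, sizeof b, f) != sizeof b)
        return 0;
    *out = (uint16_t)((b[0] << 8) | b[1]);
    return 1;
}

/* Hypothetical subset of the header from the question. */
struct header {
    uint16_t start;       /* must be 0xFFD8 */
    uint16_t app1_marker; /* must be 0xFFE1 */
    uint16_t app1_size;   /* big-endian length of the APP1 block */
};

/* Returns 0 on success, -1 on a short read or an unexpected marker. */
int readHeader(FILE *f, struct header *h)
{
    assert(h != NULL);
    if (!readU16BE(f, &h->start) || h->start != 0xFFD8)
        return -1;
    if (!readU16BE(f, &h->app1_marker) || h->app1_marker != 0xFFE1)
        return -1;
    if (!readU16BE(f, &h->app1_size))
        return -1;
    return 0;
}
```

Reading field by field costs a few more calls than one fread over the whole struct, but it is independent of struct padding and host byte order, and each required value can be validated as soon as it is read.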

sprintf is outputting some strange data

I am working on an embedded project which involves reading/writing a struct into EEPROM. I am using sprintf to make it easy to display some debugging information.
There are two problems with this code. First, sprintf is printing very strange output: when I print addr as it is incremented, it follows the pattern '0, 1, 2, 3, 4, 32, ...', which doesn't make sense.
void ee_read(char * buf, unsigned int addr, unsigned int len) {
unsigned int i;
sprintf(bf1, "Starting EEPROM read of %u bytes.\r\n", len); // Debug output
debugr(bf1);
IdleI2C1();
StartI2C1();
WriteI2C1(EE_ADDR | EE_W);
IdleI2C1();
WriteI2C1((unsigned char)addr>>8); // Address to start reading data from
IdleI2C1();
WriteI2C1((unsigned char)addr&0xFF);
IdleI2C1();
RestartI2C1();
WriteI2C1(EE_ADDR | EE_R);
IdleI2C1();
for (i=0; i<len; i++) {
buf[i] = ReadI2C1(); // Read a byte from EEPROM
sprintf(bf1, "Addr: %u Byte: %c\r\n", addr, buf[i]); // Display the read byte and the address
debugr(bf1);
addr++; // Increment address
IdleI2C1();
if (i == len-1) { // This makes sure the last byte gets 'nAcked'
NotAckI2C1();
} else {
AckI2C1();
}
}
StopI2C1();
}
The output from the above is here: https://gist.github.com/3803316 Please note that the above output was taken with %x for the address value (so addr is in hex).
The second problem, which you may have noticed with the output, is that it doesn't stop when i > len. It continues further than the output I have supplied, and doesn't stop until the microcontroller's watch dog restarts.
Edit:
Calling the function
Location loc;
ee_read(&loc, 0, sizeof(Location));
Declarations:
struct location_struct {
char lat[12]; // ddmm.mmmmmm
char latd[2]; // n/s
char lon[13]; // dddmm.mmmmmm
char lond[2]; // e/w
char utc[11]; // hhmmss.sss
char fix[2]; // a/v
};
typedef struct location_struct Location;
char bf1[BUFFER_SIZE];
I don't think it's a race condition. I disable the interrupts which use bf1. Even then, it would corrupt the whole debug string if that happened, and it certainly wouldn't be so repeatable.
Edit
The value of addr starts as zero, which can be seen here: https://gist.github.com/3803411
Edit
What this is supposed to do is copy the location structure byte by byte into the EEPROM, and then recall it when it is needed.
Closure
So I never did solve this problem. The project moved away from the EEPROM, and I have since changed OS, compiler and IDE. It's unlikely I will replicate this problem.
I'll tell you one thing wrong with your code, this line:
(unsigned char)addr>>8
doesn't do what you seem to need.
It converts the value in addr into an unsigned char which (assuming 8-bit char and either 16-bit int or only using the lower 16 bits of a wider int) will always give you the lower eight bits.
If you then right shift that by eight bits, you'll always end up with zero.
If your intent is to get the upper eight bits of the address, you need to use:
(unsigned char)(addr>>8)
so that the shift is done first.
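The difference is easy to check on a desktop compiler. In the sketch below the function names are invented for the example, and unsigned char is assumed to be 8 bits wide.

```c
#include <assert.h>
#include <stddef.h>

/* The cast binds tighter than >>, so addr is truncated to 8 bits before the shift. */
unsigned char highByteWrong(unsigned int addr)
{
    return (unsigned char)addr >> 8; /* always 0 when unsigned char is 8 bits */
}

/* Shift first, then truncate: yields bits 8..15 of addr. */
unsigned char highByteRight(unsigned int addr)
{
    return (unsigned char)(addr >> 8);
}

/* Low byte; the mask already limits the value to 8 bits. */
unsigned char lowByte(unsigned int addr)
{
    return (unsigned char)(addr & 0xFF);
}

/* Split a 16-bit address into the two bytes sent to the EEPROM. */
void splitAddress(unsigned int addr, unsigned char *hi, unsigned char *lo)
{
    assert(hi != NULL && lo != NULL);
    *hi = highByteRight(addr);
    *lo = lowByte(addr);
}
```

For addr = 0x1234, highByteWrong returns 0 while highByteRight returns 0x12 and lowByte returns 0x34.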
