Using fseek() to update a binary file in C

I have looked and looked online for help on using fseek() efficiently, but no matter what I do, I am still not getting the right results. Basically I am reading from a file of animals that have an "age" parameter. If the age is -1, then when adding to this binary file, I should use fseek() to find the first -1 in the file and overwrite that entire record with new information that the user inputs. I have an array that traverses and finds all of the holes at the beginning of the file, and it is working correctly. My issue is that the code updates the new animal and puts each one in the next empty slot with age -1, but when I refresh my file, all of the animals are appended to the end, even though their ids are the ids of the once-empty slots. Here is my code:
void addingAnimal(FILE *file, struct animal ani, int *availableHoles) {
    int i;
    int offset = ((sizeof(int) + sizeof(ani)) * ani.id - 1);
    if (availableHoles[0] != 0) {
        fseek(file, offset, SEEK_SET);
        ani.id = availableHoles[0];
        fwrite(&ani, sizeof(ani), 1, file);
        for (i = 0; i < sizeof(availableHoles) - 1; i++) {
            availableHoles[i] = availableHoles[i + 1];
        }
    }
}
The very beginning of the file has an integer that tells us the number of holes within the file, so the offset accounts for that, and once I print it, it prints everything correctly. Then I check whether there are holes in the helper array I created; if there are, I want the animal's id to be the hole's id, and I am trying to seek to the record with the first -1 age to put my updated animal's information there, and then write it to the file. The last for-loop just shifts the available holes up. As for opening the file, I am using "r+b" for reading and writing. Thank you in advance!

You cannot use sizeof(availableHoles) to iterate over the array. You are in a function that receives availableHoles as a pointer, so its size says nothing about the number of holes.
Pass the number of elements of this array as a separate argument.
Using FILE streams in read/write mode is tricky: do you systematically call fseek() when switching between read accesses and write accesses?
Post the calling code, the function addingAnimal alone is not enough to investigate your problem.
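Putting those points together, here is a minimal sketch of what the write path might look like. It is only a sketch: the record layout is assumed from the description above (a leading int holding the hole count, followed by fixed-size records), numHoles is the new length argument, and the corrected offset formula is a guess at the intended layout:
void addingAnimal(FILE *file, struct animal ani, int *availableHoles,
                  int numHoles) {
    int i;
    if (numHoles > 0 && availableHoles[0] != 0) {
        ani.id = availableHoles[0];     /* take the hole's id BEFORE computing the offset */
        long offset = sizeof(int) + (long)sizeof(ani) * (ani.id - 1);
        fseek(file, offset, SEEK_SET);  /* also required when switching from reading to writing */
        fwrite(&ani, sizeof(ani), 1, file);
        fflush(file);                   /* flush before the stream is read again */
        for (i = 0; i < numHoles - 1; i++)   /* shift the remaining holes up */
            availableHoles[i] = availableHoles[i + 1];
    }
}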

Related

Efficient way of writing large variable number of integers from array to text file

I have a program that results in an integer array of variable size and I need to write the elements to a text file. The elements need to be on the same line in the file.
This is a minimal example of what I am currently doing. I'm using the approach in this post https://stackoverflow.com/a/30234430/10163981
FILE *file = fopen("file.txt","w");
int nLines = 1000;
char breakstr[] = "\n";
for (; ix < N; ix++) {      /* ix is initialized earlier, see below */
    char s[nLines * 13];
    int index = 0;          /* write position within s; must reset for each row */
    for (int jx = 0; jx < nLines; jx++) {
        index += sprintf(&s[index], "%03i %03i %03i ", array[ix][jx], array[ix][jx], array[ix][jx]);
        // I need the jx:th element in repeats of three, and may need to modify it externally for printing
    }
    fwrite(s, sizeof(char), strlen(s), file);
    fwrite(breakstr, sizeof(char), strlen(breakstr), file);
}
fclose(file);
I am formatting the array contents as a string and using fwrite, as this method has been given to me as a requirement. My problem is that this implementation is way too slow. I have also tried using shorter strings and writing on each iteration, but that is even slower. There is not much I can do about the outer ix loop, as the initial value of ix is variable; I included it for the sake of completeness.
nLines is expected to reach 10000 at most.
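If stdio buffering turns out to be part of the bottleneck, one cheap experiment (an assumption, not a verified fix for this case) is to enlarge the stream's own buffer with setvbuf before the first write, so the C library flushes far less often:
#include <stdio.h>

static char iobuf[1 << 20];   /* 1 MiB buffer for the stream */

int main(void) {
    FILE *file = fopen("file.txt", "w");
    if (file == NULL) return 1;
    /* setvbuf must be called after fopen() and before any other I/O */
    setvbuf(file, iobuf, _IOFBF, sizeof iobuf);
    /* ... formatting loop from the question goes here ... */
    fclose(file);
    return 0;
}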

Remove the last two bytes from a file or ignore the last two bytes of a file in C

Here I am implementing CRC-16 for file verification.
I append a 2-byte CRC at the end of the file. When the file is received on the target device, I have to calculate the CRC of this file without the last two bytes.
Here is my data after appending the CRC at the end of the file:
test123
wU
Now when I calculate the CRC again on the target device, I want to ignore the last two bytes.
I have one common function in which I open the file in read mode and calculate the CRC, and I want to use the same function this time.
One solution is to make another function just like the previous one that goes up to filesize - 2, but I don't want to duplicate the function; I want to delete the last two bytes.
Does anybody have a suggestion or solution for this?
In addition, do you need help truncating two bytes off a file?
What kind of API is on the target?
On POSIX you can open the file, then use off_t pos = lseek(fd, 0, SEEK_END) to seek to the end, which returns the position. If pos == (off_t)-1 then the call failed. If the call succeeded, you can just ftruncate(fd, pos - 2) (provided that pos >= 2).
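A minimal sketch of that POSIX sequence; the function name is illustrative and error handling is kept to the bare minimum:
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

int strip_crc(const char *path) {
    int fd = open(path, O_RDWR);          /* ftruncate needs write access */
    if (fd == -1) return -1;
    off_t pos = lseek(fd, 0, SEEK_END);   /* returns the file size */
    if (pos == (off_t)-1 || pos < 2) { close(fd); return -1; }
    int rc = ftruncate(fd, pos - 2);      /* drop the two CRC bytes */
    close(fd);
    return rc;
}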
Have your function take a parameter to ignore the last n bytes. Pass in 0 for normal use and 2 for that case.
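For example, a sketch of the parameterized version; crc16_update() here is a hypothetical stand-in for your existing per-byte CRC step, not a real function in your code:
unsigned short crc_of_file(FILE *f, long ignore_tail) {
    unsigned short crc = 0;           /* or your CRC-16 initial value */
    long len, i;
    fseek(f, 0, SEEK_END);
    len = ftell(f) - ignore_tail;     /* number of bytes to actually process */
    fseek(f, 0, SEEK_SET);
    for (i = 0; i < len; i++)
        crc = crc16_update(crc, fgetc(f));   /* hypothetical CRC step */
    return crc;                       /* call with ignore_tail = 0 or 2 */
}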

How to handle a huge string correctly?

This may be a newbie question, but I want to avoid buffer overflows. I read a lot of data from the registry, which will be uploaded to an SQL database. I read the data in a loop, and the data is inserted after each iteration. My problem is that this way, if I read 20 keys and the values under them (the number of keys is different on every computer), then I have to connect to the SQL database 20 times.
However, I found out that there is a way to create a stored procedure and pass the whole data to it, so the SQL server will deal with the data and I have to connect only once to the SQL server.
Unfortunately I don't know how to handle such a big string so as to avoid any unexpected errors, like buffer overflow. So my question is: how should I declare this string?
Should I just make a string like char string[15000]; and concatenate the values? Or is there a simpler way to do this?
Thanks!
STL strings should do a much better job than the approach you have described.
You'll also need to build in some thresholds. For example, if your string grows beyond a megabyte, it is worth considering making separate SQL connections, since your transaction will be too long.
You may read (key, value) pairs from the registry and store them in a preallocated buffer while there is sufficient space there.
Maintain a "write" position within the buffer. You can use it to check whether there is enough space for a new (key, value) pair in the buffer.
When there is no space left for a new (key, value) pair, execute the stored procedure and reset the "write" position within the buffer.
At the end of the "read key, value pairs" loop, check the buffer's "write" position and execute the stored procedure if it is greater than 0.
This way you will minimize the number of times you execute the stored procedure on the server.
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

#define MAX_BUFFER_SIZE 15000

char buffer[MAX_BUFFER_SIZE];
int buffer_pos = 0; /* "write" position within the buffer */
...
/* Retrieve key, value pairs and push them into the buffer. */
while (get_next_key_value(key, value)) {
    post(key, value); /* returns false if a single pair cannot fit at all */
}
/* Execute the stored procedure if the buffer is not empty. */
if (buffer_pos > 0) {
    exec_stored_procedure(buffer);
}
...
bool post(const char *key, const char *value)
{
    /* +2 for the "=" separator and ";" terminator; this format is an
       assumption — use whatever your stored procedure expects. */
    int len = strlen(key) + strlen(value) + 2;
    /* Execute the stored procedure if there is no space for the new pair. */
    if (len + buffer_pos >= MAX_BUFFER_SIZE) {
        exec_stored_procedure(buffer);
        buffer_pos = 0; /* Reset the "write" position. */
    }
    /* Copy the key, value pair into the buffer if there is enough space. */
    if (len + buffer_pos < MAX_BUFFER_SIZE) {
        buffer_pos += sprintf(buffer + buffer_pos, "%s=%s;", key, value);
        return true;
    }
    return false;
}
bool exec_stored_procedure(const char *buf)
{
    /* connect to the SQL database and execute the stored procedure */
    return true;
}
To do this properly in C you need to allocate the memory dynamically, using malloc or one of the operating system equivalents. The idea here is to figure out how much memory you actually need and then allocate the correct amount. The registry functions provide various ways you can determine how much memory you need for each read.
It gets a bit trickier if you're reading multiple values and concatenating them. One approach would be to read each value into a separately allocated memory block, then concatenate them to a new memory block once you've got them all.
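For the single-value case, the size-then-allocate pattern might look like the sketch below; hKey and valueName are assumed to come from surrounding code, and error handling is minimal:
#include <windows.h>
#include <stdlib.h>

/* Query the required size first, then allocate exactly that much. */
char *read_reg_value(HKEY hKey, const char *valueName, DWORD *outSize) {
    DWORD size = 0;
    /* passing NULL for the data buffer makes the call report the size */
    if (RegQueryValueExA(hKey, valueName, NULL, NULL, NULL, &size) != ERROR_SUCCESS)
        return NULL;
    char *data = malloc(size);
    if (data == NULL)
        return NULL;
    if (RegQueryValueExA(hKey, valueName, NULL, NULL, (LPBYTE)data, &size) != ERROR_SUCCESS) {
        free(data);
        return NULL;
    }
    *outSize = size;
    return data;
}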
However, it may not be necessary to go to this much trouble. If you can say "if the data is more than X bytes the program will fail" then you can just create a static buffer as you suggest. Just make sure that you provide the registry and/or string concatenation functions with the correct size for the remaining part of the buffer, and check for errors, so that if it does fail it fails properly rather than crashing.
One more note: char buf[15000]; is OK provided the declaration is in program scope, but if it appears in a function you should add the static specifier. Implicitly allocated memory in a function is by default taken from the stack, so a large allocation is likely to fail and crash your program. (Fifteen thousand bytes should be OK but it's not a good habit to get into.)
Also, it is preferable to define a macro for the size of your buffer, and use it consistently:
#define BUFFER_SIZE 15000
char buf[BUFFER_SIZE];
so that you can easily increase the size of the buffer later on by modifying a single line.

File Descriptors and System Calls

I am merging k sorted streams using read/write system calls.
After reading the first integer from each file and sorting them, the file whose head is the smallest element should be read again.
I am not sure how to do this. I thought I could use a structure like:
struct filePointer {
    int ptr;
    int num;
} fptr[5];
Can someone tell me how to do this?
Thanks
Although reading integers one by one is not an efficient way of doing this, I will try to describe the solution that you are looking for. However, this is not a real implementation, just the idea.
Your structure should be like this:
struct filePointer {
    FILE *fp;
    int num;
} fptr[k]; /* I assume k is constant, known at compile time */
You need a priority queue ( http://en.wikipedia.org/wiki/Priority_queue ), and priorities are determined according to num.
First read the first number from each file and insert them into the priority queue (pq).
Then, while the pq is not empty, pop the first element, which holds the smallest integer of all the elements in the pq.
Write the integer that element holds to the output file.
Using its file pointer (fp), try to read a new integer from the corresponding input file.
If EOF (end of file), do nothing;
else insert a new element into the pq with num set to the value just read.
When the loop is finished, close all the files and you will have a new file that contains all the elements from the input files, sorted. A rough sketch follows.
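Here is a minimal sketch of that loop in C, using a linear scan for the minimum instead of a real priority queue (fine for small k). The alive field and the function names are my additions, and the sketch assumes the streams are already open and contain whitespace-separated integers:
#include <stdio.h>

#define K 5

struct filePointer {
    FILE *fp;
    int num;
    int alive;                /* 0 once this stream hits EOF */
} fptr[K];

void merge(FILE *out) {
    int i, min;
    /* prime each stream with its first integer */
    for (i = 0; i < K; i++)
        fptr[i].alive = (fscanf(fptr[i].fp, "%d", &fptr[i].num) == 1);
    for (;;) {
        min = -1;
        for (i = 0; i < K; i++)   /* find the smallest live head */
            if (fptr[i].alive && (min == -1 || fptr[i].num < fptr[min].num))
                min = i;
        if (min == -1) break;     /* every stream is exhausted */
        fprintf(out, "%d\n", fptr[min].num);
        /* refill from the stream we just consumed */
        fptr[min].alive = (fscanf(fptr[min].fp, "%d", &fptr[min].num) == 1);
    }
}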
I hope this helps.

Comparing strings in two files in C

I'm new to C, so I'll appreciate every bit of help :D
I need to compare the given words in the first file ("Albert\n Martin\n Bob") with the words in the second file ("Albert\n Randy\n Martin\n Ohio").
Whenever a word appears in both files I need to put the word "Language" in the output file, and print every word that has no match in the second file as-is.
Something like that:
Language
Language
Bob
needs to be in my third file.
I tried to come up with some ideas, but they don't work.
Thanks for every answer in advance.
First, you need to open streams to read the files.
If you need to do this in C, then you may use the strcmp function, which compares two strings.
Its prototype is:
int strcmp(const char *s1, const char *s2);
I'd open all three files to begin with (both input files and the output file). If you can't open all of them then you can't do anything useful (other than display an error message or something), and there's no point wasting CPU time only to find out that, for example, you can't open the output file later. This can also help to reduce race conditions (e.g. the second file changing while you're processing the first file).
Next, start processing the first file. Break it into words/tokens as you read it, and for each word/token calculate a hash value. Then use the hash value and the word/token itself to check if the new word/token is a duplicate of a previous (already known) word/token. If it's not a duplicate, allocate some memory and create a new entry for the word/token and insert the entry onto the linked list that corresponds to the hash.
Finally, process the second file. This is similar to how you processed the first file (break it into words/tokens, calculate the hash, use the hash to find out if the word/token is known), except if the word/token isn't known you write it to the output file, and if it is known you write " language" to the output file instead.
If you're not familiar with hash tables, they're fairly easy. For a simple method (not necessarily the best method) of calculating the hash value for ASCII/text you could do something like:
hash = 0;
while (*src != 0) {
    hash = hash ^ (hash << 5) ^ *src;
    src++;
}
hash = hash % HASH_SIZE;
Then you have an array of linked lists, like "INDEX_ENTRY *index[HASH_SIZE]" that contains a pointer to the first entry for each linked list (or NULL if the linked list for the hash is empty).
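For concreteness, the entry type those snippets assume might look like this (the names come from the surrounding code; the exact layout is a guess):
#define HASH_SIZE 1024

typedef struct index_entry {
    struct index_entry *next;   /* next entry in the same bucket */
    char *word;                 /* the stored word/token */
} INDEX_ENTRY;

INDEX_ENTRY *index[HASH_SIZE];  /* one linked list head per hash value */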
To search, use the hash to find the first entry of the correct linked list then do "strcmp()" on each entry in the linked list. An example might look something like this:
INDEX_ENTRY *find_entry(uint32_t hash, char *new_word) {
    INDEX_ENTRY *entry;

    entry = index[hash];
    while (entry != NULL) {
        if (strcmp(new_word, entry->word) == 0) return entry;
        entry = entry->next;
    }
    return NULL;
}
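The matching insert step described earlier might look like this sketch (malloc from <stdlib.h>, strdup from <string.h>; error handling omitted for brevity):
void add_entry(uint32_t hash, const char *new_word) {
    INDEX_ENTRY *entry = malloc(sizeof *entry);
    entry->word = strdup(new_word);   /* keep a private copy of the word */
    entry->next = index[hash];        /* push onto the front of the bucket */
    index[hash] = entry;
}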
The idea of all this is to improve performance. For example, if both files have 1024 words then (without a hash table) you'd need to do "strcmp()" 1024*1024 times; but if you use a hash table with "#define HASH_SIZE 1024" you'll probably reduce that to about 2000 times (and end up with much faster code). Larger values of HASH_SIZE increase the amount of memory you use a little (and reduce the chance of different words having the same hash).
Don't forget to close your files when you're finished with them. Freeing the memory you used is a good idea if you do something else after this (but if you don't do anything after this then it's faster and easier to "exit()" and let the OS cleanup).
