Array of Linked lists on disk - c

I am trying to find out how to store and process (search, add, remove) an array of linked lists on disk. For example, in memory it would look like
typedef struct list {
    int a;
    struct list *next;
} LIST;
LIST *array[ARRAY_SIZE];
int main(void) {
    ...
    LIST *foo = array[pointer];
    /* Search */
    while (foo != NULL) {
        ...
        foo = foo->next;
    }
}
I believe that by using fseek() I can point to a specific element/structure of the array in the file. But I cannot understand if all previous elements need to have been written or not.
Can this be done with dynamic allocation on disk?
How will I link one element to another one in the linked list?
Any example would certainly help!

Okay, as Amardeep says, this sounds like the sort of thing that would best be done in practice using some kind of database, such as Berkeley DB. But let's answer the questions anyway.
You can indeed use fseek (or the underlying system call, lseek) to determine where something is on the disk and find it later. If you use some method of computing the offset, you're implementing what we used to call a direct file; otherwise you might store an index, which leads you off toward the indexed sequential method.
Whether it can be done with dynamic allocation depends on the file system. Many UNIX file systems support sparse allocation, which means that if you allocate block 365 it doesn't have to allocate blocks 0 through 364. Some don't.
Let's say that you have a structure with a length k that looks more or less like this:
struct blk {
// some stuff declared here
long next; // the block number of the next item
};
You create the first item; set its block number to 0. Set next to some distinguished value, say -1.
// Warning, this is off the cuff code, not compiled.
struct blk * b = malloc(sizeof(struct blk));
// also, you should really consider the case where malloc returns null.
// do some stuff with it, including setting the block next to 1.
lseek(file, 0, SEEK_SET); // set the pointer at the front of the file
write(file, b, sizeof(struct blk)); // write it (arguments are fd, buffer, byte count).
free(b);
// make a new block item
b = malloc(sizeof(struct blk));
// Do some stuff with it, and set next to 2.
lseek(file, 0, SEEK_CUR); // leave the pointer where it was, end of item 0
write(file, b, sizeof(struct blk)); // write it.
free(b);
You now have two items on disk. Keep this up and you'll eventually have a thousand items on disk. Now, to find item #513, you just
lseek(file, (sizeof(struct blk)*513), SEEK_SET);
You need a buffer; since we freed the previous ones we'll make another
b = malloc(sizeof(struct blk));
Read that many bytes
read(file, b, sizeof(struct blk));
And poof record 513 is in memory pointed to by b. Get the record following with
lseek(file, (sizeof(struct blk)*b->next), SEEK_SET);
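To tie the steps above together, here is a minimal sketch of the same idea using stdio (fseek/fread/fwrite) instead of the raw system calls. The payload field, the NO_BLOCK marker and the helper names blk_write, blk_read and blk_sum are all made up for illustration; the answer above only specifies the next field.

```c
#include <assert.h>
#include <stdio.h>

#define NO_BLOCK (-1L)   /* hypothetical "end of list" marker */

/* Hypothetical on-disk record: a payload plus the number of the next record. */
struct blk {
    int  value;
    long next;           /* record number of the next item, or NO_BLOCK */
};

/* Write record number n; returns 0 on success, -1 on failure. */
int blk_write(FILE *f, long n, const struct blk *b)
{
    assert(n >= 0);
    if (fseek(f, n * (long)sizeof *b, SEEK_SET) != 0)
        return -1;
    return fwrite(b, sizeof *b, 1, f) == 1 ? 0 : -1;
}

/* Read record number n; returns 0 on success, -1 on failure. */
int blk_read(FILE *f, long n, struct blk *b)
{
    assert(n >= 0);
    if (fseek(f, n * (long)sizeof *b, SEEK_SET) != 0)
        return -1;
    return fread(b, sizeof *b, 1, f) == 1 ? 0 : -1;
}

/* Follow the chain that starts at record `head` and add up the payloads.
   Returns -1 if a record cannot be read. */
long blk_sum(FILE *f, long head)
{
    long sum = 0;
    struct blk b;
    for (long n = head; n != NO_BLOCK; n = b.next) {
        if (blk_read(f, n, &b) != 0)
            return -1;
        sum += b.value;
    }
    return sum;
}
```

The records do not have to be chained in file order: record 0 can point at record 2, which points at record 1. For an array of lists, one option (an assumption on my part, not something from the answer) is to keep the ARRAY_SIZE head record numbers in a small header and store the records after it.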

Related

Easiest way to allocate "blank" block of data to .dat file

Looking for a quick way to allocate a block of data to be managed from disk. I'm allocating a block of 50 structs, and while most of the memory allocates fine, when I read it all back I get junk messages returned in some of the fields that should be blank. I assume this is me allocating the space incorrectly somehow that allows some junk from memory to leak in there.
if ((fpBin = fopen(BINARYFILE, "w+b")) == NULL)
{
printf("Could not open binary file %s.\n", BINARYFILE);
return;
}
fwrite(fpBin, sizeof(struct student), 50, fpBin); //Write entire hash table to disk
struct definition
typedef struct student
{
char firstName[20]; //name
char lastName[20];
double amount; //amount owed
char stuID[5]; //4 digit code
}student;
Is how I was taught, yet I'm still getting some junk in my data instead of it being a clean slate. So question: How do I set all fields to blank?
Answer:
student tempStu[50] = {0};
fwrite(tempStu, sizeof(struct student), BUCKETSIZE, fpBin); //Write entire hash table to disk
fwrite(fpBin, sizeof(struct student), 50, fpBin);
You're writing your file pointer, not your student structs, to disk. That first fpBin should instead be a pointer to your data. That data can be an array of 50 student structs initialized to 0, perhaps with calloc or by defining it at file scope, but it has to be somewhere. Instead, you are writing 50*sizeof(struct student) bytes from your fpBin pointer, which is undefined behavior -- you'll either crash with an access violation or you'll write junk to disk. That junk is what you're getting when you read it back.
Also, using a constant like 50 is bad practice ... it should be a variable (or manifest constant) that holds the number of students that you're writing out.
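As a rough sketch of the calloc route mentioned above: allocate the records zero-filled, write them, and free the buffer. The function name write_blank_students is made up for the example; the struct is the one from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct student {
    char   firstName[20];
    char   lastName[20];
    double amount;
    char   stuID[5];
} student;

/* Hypothetical helper: write `count` zero-filled student records to f.
   Returns 0 on success, -1 on allocation or write failure. */
int write_blank_students(FILE *f, size_t count)
{
    assert(f != NULL && count > 0);
    student *blank = calloc(count, sizeof *blank);   /* every byte is zero */
    if (blank == NULL)
        return -1;
    size_t written = fwrite(blank, sizeof *blank, count, f);
    free(blank);
    return written == count ? 0 : -1;
}
```

Because the first argument to fwrite is now a pointer to real, zeroed data, every name field reads back as an empty string.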
BTW, on Linux and other POSIX systems, you could allocate a block of zeroes on disk just by writing the last byte (or in some other way making the file that large).
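The "write the last byte" trick could look something like this. It is a sketch that assumes a POSIX system, where seeking past the end of a file and writing there leaves a gap that reads back as zero bytes; extend_with_zeros is a made-up name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: make f `size` bytes long by writing a single zero
   byte at offset size - 1. Assumes the file is currently shorter than
   `size`; otherwise it just overwrites that one byte with zero. */
int extend_with_zeros(FILE *f, long size)
{
    assert(f != NULL && size > 0);
    if (fseek(f, size - 1, SEEK_SET) != 0)
        return -1;
    if (fputc(0, f) == EOF)
        return -1;
    return fflush(f) == 0 ? 0 : -1;
}
```

For the 50-student table you would call it with 50 * sizeof(struct student) as the size.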

Array of structure - memory release in C

I am trying to implement a priority queue based on binary heap using a static array (I will be later using a linked list, just wanted to test first with an array).
typedef struct n
{
int x;
int y;
int size;
double value;
} node;
node arr[100];
int total = 1;
void insertElement(int x, int y, int size, double value)
{
node n;
n.x = x;
n.y = y;
n.size = size;
n.value = value;
arr[total] = n;
if (total > 1)
insertArrange(total);
total += 1;
}
Now in the delete function I will just return the topmost node and delete it, then rearrange the whole heap. The problem is that I cannot free any memory. Suppose I use
free(&arr[1]);
I am getting pointer being freed was not allocated error. Is this the proper way of implementation? How to tackle memory issues?
I am using Xcode with Apple LLVM 4.2 compiler. This entire thing will be ultimately put into a bigger project in Objective-C but for now I do not want to use NSMutableArray. I want a simple solution in C.
You only need to call free() if you have used malloc() or calloc(). In fact, attempting to free anything else is undefined behaviour.
As it stands, your code will not be leaking any memory.
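To illustrate why no free() is needed: with a static array, "deleting" the top of the heap just means overwriting slot 1 and shrinking a counter. Below is a minimal sketch that assumes a min-heap ordered on value (the question does not say which ordering is wanted) and uses made-up names heap, heap_count, heap_push and heap_pop.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y, size; double value; } node;

#define CAPACITY 100
static node   heap[CAPACITY + 1];   /* 1-based like the question; slot 0 unused */
static size_t heap_count = 0;       /* number of nodes currently stored */

static void swap_nodes(node *a, node *b) { node t = *a; *a = *b; *b = t; }

/* Add n and sift it up until its parent is not larger. */
void heap_push(node n)
{
    assert(heap_count < CAPACITY);
    size_t i = ++heap_count;
    heap[i] = n;
    while (i > 1 && heap[i].value < heap[i / 2].value) {
        swap_nodes(&heap[i], &heap[i / 2]);
        i /= 2;
    }
}

/* Remove and return the smallest node. Nothing is freed: the last node is
   moved into slot 1 and the counter shrinks. */
node heap_pop(void)
{
    assert(heap_count > 0);
    node top = heap[1];
    heap[1] = heap[heap_count--];
    size_t i = 1;
    for (;;) {
        size_t smallest = i, l = 2 * i, r = 2 * i + 1;
        if (l <= heap_count && heap[l].value < heap[smallest].value) smallest = l;
        if (r <= heap_count && heap[r].value < heap[smallest].value) smallest = r;
        if (smallest == i)
            break;
        swap_nodes(&heap[i], &heap[smallest]);
        i = smallest;
    }
    return top;
}
```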
Why delete? You could just zero it out and write new data to it whenever you need to. My recommendation would also be to remember which nodes you delete, so that later, when you need to insert a new node, you know beforehand where the free space is.
For example:
node arr[10];
int free_index[10]; // indexes of deleted slots; -1 means "nothing recorded here"
// "delete" the 6th member of arr by zeroing it out (memset is in <string.h>)
memset(&arr[5], 0, sizeof arr[5]);
// remember which one you deleted
free_index[0] = 5;
// later, when you add a new node, search free_index and pick the first valid entry
int i = free_index[0]; // finding which entry is valid is a task for a loop
arr[i] = new_node; // new_node stands for whatever node you are inserting
// clear the entry so that the slot will not be reused accidentally
free_index[0] = -1;
This code example is very incomplete; you have to complete it depending on your own implementation. I just gave you the idea. Watch out when marking entries of free_index as empty: 0 is a valid array index, so do not clear entries with 0 or NULL. Use a value that can never be an index, such as -1.
There is also a big assumption from my side that you do not wish to shrink the size of this array or grow it. Just empty some elements and then add new ones.
If you want to grow the array you have to calloc it first. I advise calloc because you can allocate array of structs with it.
Growing this is easy with realloc.
But with shrinking you need to create temporary array of nodes where you will store all active results, shrink the original array, put the active results from temporary array back into original and free temporary array.
calloc(numberofnodearrays,sizeof(node));
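One thing to keep in mind when growing with realloc is that, unlike calloc, it does not zero the newly added part. A small sketch of a grow helper that does (grow_nodes is a made-up name, and the node struct is the one from the question):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { int x, y, size; double value; } node;

/* Hypothetical helper: grow a heap-allocated node array from old_count to
   new_count elements and zero-fill the new tail the way calloc would.
   Returns the new array, or NULL on failure, in which case the old array
   is untouched and still has to be freed by the caller. */
node *grow_nodes(node *arr, size_t old_count, size_t new_count)
{
    assert(new_count > old_count);
    node *p = realloc(arr, new_count * sizeof *p);
    if (p == NULL)
        return NULL;
    memset(p + old_count, 0, (new_count - old_count) * sizeof *p);
    return p;
}
```

Assign the result to a temporary first; writing arr = grow_nodes(arr, ...) directly would lose the old pointer if the call fails.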

how do i delete arrays of typedef structs?

I am trying to delete an array of initialized structs e.g. reset the array
My struct:
struct entry{
char name[NAME_SIZE];
int mark;
};
typedef struct entry Acct;
Acct dism2A03[MAX_ENTRY];
Acct clear[0]; << temp struct to set original struct to null
My attempt:
entry_total keeps track of how many structs in the struct array dism2A03[x] have values set in them.
I tried to create an empty array of the same struct clear[0]. Looped through initialized arrays in dism2A03[x] and set them to clear[0]
for(m=0;m<entry_total;m++){
dism2A03[m]=clear[0];
}
break;
However, that sets them to 0; I want them to become uninitialized, i.e. with no values in them.
You cannot have memory with no value in it. It's physically impossible. It's due to the laws of physics of our universe :-)
Also, this:
Acct clear[0];
is wrong. You cannot have an array with zero elements. Some compilers will allow this as an extension, but it's not valid C. And for the compilers that allow this, it doesn't do what you think it does.
It would seem to me that what you want instead is to resize the array. To do that, you would need to copy the elements you want to keep into a new array, and then free() the old one. To do that, you need to create dism2A03 using dynamic memory:
Acct *dism2A03 = malloc(sizeof(Acct) * MAX_ENTRY);
if (dism2A03 == NULL) {
// Error: We're out of memory.
}
(malloc() returns NULL if there's no more free memory, and the code checks that. Usually all you can do if this happens is terminate the program.)
When you want a new array with some elements removed, then you should back up the starting address of the current one:
Acct* oldArray = dism2A03;
then create a new one with the new size you want:
dism2A03 = malloc(sizeof(Acct) * NEW_SIZE);
if (dism2A03 == NULL) {
// Error: We're out of memory.
}
Copy the elements you want from the old array (oldArray) to the new one (dism2A03); which ones is up to you, since I don't know which ones you want to keep. After that you must free the old array:
free(oldArray);
As a final note, you might actually not want to create a new array at all. Instead, you could keep having your original, statically allocated array ("statically allocated" means you're not using malloc()):
Acct dism2A03[MAX_ENTRY];
and have an index variable where you keep track of how many useful elements are actually in that array. At first, there are 0:
size_t dism2A03_size = 0;
As you add elements to that array, you do that at the position given by dism2A03_size:
dism2A03[dism2A03_size] = <something>
++dism2A03_size; // Now there's one more in there, so remember that.
While doing so, you need to make sure that dism2A03_size does not grow larger than the maximum capacity of the array, which is MAX_ENTRY in your case. So the above would become:
if (dism2A03_size < MAX_ENTRY) {
dism2A03[dism2A03_size] = <something>
++dism2A03_size; // Now there's one more in there, so remember that.
} else {
// Error: the array is full.
}
As you can see, adding something to the end of the array is rather easy. Removing something from the end of the array is just as easy; you just decrement dism2A03_size by one. However, "removing" something from the middle of the array means copying all following elements by one position to the left:
for (size_t i = elem_to_remove + 1; i < dism2A03_size; ++i) {
dism2A03[i - 1] = dism2A03[i];
}
--dism2A03_size; // Remember the new size, since we removed one.
Note that you should not attempt to remove an element if the array is empty (meaning when dism2A03_size == 0.)
There's also the case of adding a new element in the middle of the array rather than at the end. But I hope that now you can figure that out on your own, since it's basically a reversed version of the element removal case.
Also note that instead of copying elements manually one by one in a for loop, you can use the memmove() function, which will usually do the copying faster. (It has to be memmove() rather than memcpy() here, because the source and destination regions overlap.) But I went with the loop here so that the logic of it all is more obvious (hopefully.)
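In case it helps, here is one way the insertion case could be written. It is only a sketch: acct_insert is a made-up name, and the values of NAME_SIZE and MAX_ENTRY are picked arbitrarily because the question does not show them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_SIZE 20   /* assumed value */
#define MAX_ENTRY 10   /* assumed value */

struct entry {
    char name[NAME_SIZE];
    int  mark;
};
typedef struct entry Acct;

/* Hypothetical helper: insert e at position pos (0 <= pos <= *size),
   moving the following elements one slot to the right.
   Returns 0 on success, -1 if the array is already full. */
int acct_insert(Acct *arr, size_t *size, size_t pos, Acct e)
{
    assert(pos <= *size);
    if (*size >= MAX_ENTRY)
        return -1;
    /* Source and destination overlap, so this must be memmove, not memcpy. */
    memmove(&arr[pos + 1], &arr[pos], (*size - pos) * sizeof arr[0]);
    arr[pos] = e;
    ++*size;
    return 0;
}
```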
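When you declare an array this way inside a function, Acct dism2A03[MAX_ENTRY]; it is allocated on the stack, so it is released automatically when the function returns. (Declared at file scope, it lives for the whole run of the program instead.) Either way, you cannot free() it yourself.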
What you can do is to allocate the structure in the heap via malloc/calloc, and then you can free that memory area via the free function.
For example :
typedef struct entry Acct;
Acct * dism2A03 = calloc(MAX_ENTRY, sizeof( struct entry));
// ....
free(dism2A03);

How to _delete_ element from dynamic array?

I have seen other answers to questions like this, but none seemed to work for me. Say I have a dynamic array:
int* myarray;
myarray = malloc(4*sizeof(int));
myarray[0] = 1;
myarray[1] = 2;
myarray[2] = 3;
myarray[3] = 4;
What I want to do is to remove (and free, because the array will keep on getting larger and larger) the first element of the array. I am well aware of realloc which removes the last element of the array if shrunk. Any ideas on this? Is this possible?
Thanks in advance!
One method I can think of is doing
memmove(myarray, myarray+1, 3*sizeof(int));
and then use realloc to shrink the array. I'm not sure there are more efficient ways to do this in C.
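Spelled out as a function, the memmove-then-realloc idea might look like the sketch below. remove_first is a made-up name, and the array is assumed to come from malloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: drop the first element of a malloc'd int array that
   holds *count elements. Returns the (possibly moved) array and decrements
   *count. */
int *remove_first(int *arr, size_t *count)
{
    assert(arr != NULL && *count > 0);
    memmove(arr, arr + 1, (*count - 1) * sizeof *arr);
    --*count;
    if (*count == 0)
        return arr;               /* keep the block rather than realloc to size 0 */
    int *p = realloc(arr, *count * sizeof *p);
    return p != NULL ? p : arr;   /* if shrinking fails, the old block is still valid */
}
```

Call it as myarray = remove_first(myarray, &n); since realloc is allowed to move the block.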
You have to shunt all the other elements along one. Conceptually, it's like this:
for( int i = 0; i < 3; i++ ) p[i] = p[i+1];
As others have mentioned, memmove is designed to handle memory regions that overlap and is usually faster than the above loop.
Moving data around is still inefficient as your array grows larger. Reallocating an array every time you add an item is even worse. General advice is don't do it. Just keep track of how large your array is and how many items are currently stored in it. When you grow it, grow it by a significant amount (typically you would double the size).
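A minimal sketch of the "track size and capacity, double when full" approach. The int_vec type, the int_vec_push name and the starting capacity of 4 are all choices made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable int array. Start it as {NULL, 0, 0}. */
typedef struct {
    int   *data;
    size_t count;      /* items currently stored */
    size_t capacity;   /* items the allocation can hold */
} int_vec;

/* Append value, doubling the capacity when the array is full.
   Returns 0 on success, -1 if realloc fails (the vector stays usable). */
int int_vec_push(int_vec *v, int value)
{
    assert(v != NULL);
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);   /* realloc(NULL, n) acts like malloc */
        if (p == NULL)
            return -1;
        v->data = p;
        v->capacity = new_cap;
    }
    v->data[v->count++] = value;
    return 0;
}
```

With doubling, 100 appends cost only six reallocations instead of one per item.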
It sounds like you might want a circular queue, where you preallocate the array, and a head and tail pointer chase each other round and round as you push and pop items on.
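A circular queue over a fixed array could be sketched like this. Instead of separate head and tail pointers it keeps a head index plus a count, which is a common variant that makes "full" and "empty" easy to tell apart; the names ring, ring_push and ring_pop and the capacity of 8 are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 8   /* assumed capacity */

/* Hypothetical fixed-size circular queue of ints. Zero-initialize it. */
typedef struct {
    int    items[QUEUE_CAP];
    size_t head;    /* index of the oldest item */
    size_t count;   /* number of items stored */
} ring;

/* Add v at the tail; returns 0 on success, -1 if the queue is full. */
int ring_push(ring *q, int v)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAP)
        return -1;
    q->items[(q->head + q->count) % QUEUE_CAP] = v;
    q->count++;
    return 0;
}

/* Remove the oldest item into *out; returns 0 on success, -1 if empty.
   No elements are moved: only the head index advances. */
int ring_pop(ring *q, int *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return -1;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return 0;
}
```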
Typically a "Delete" operation is not possible on an array. Perhaps you want to create and use a linked list?
C++ has its std::vector which supports this. What it would do is to shift elements that come later, forward by 1 element. You could implement this, and call realloc later.
Storing them in reverse is an obvious workaround if only first element needs to be deleted.
I don't think that you'll find a proper/clean way to do that in C. C++ has some libraries that do that, and almost all the OO languages can do that, but not C. All I can think of is moving memory and, yes, calling realloc, or setting the position you want to free to a known value which you'll consider empty in a memory re-use policy.
Another way to approach the problem is a dynamic implementation of the array. I don't know if you want to go there, but if you do, here's a brief example.
Since you're only saving integers, a struct like this:
typedef struct DynamicArray_st{
int x;
struct DynamicArray_st *next;
}DynamicArray;
Makes it possible to alloc and free elements as the program needs to. It also allows insertion in the middle, begin or end and the same for frees.
The way you'll do it is by saving a pointer to the begin of this dynamic type and then iterate over it.
The problem is that you can't access data with the [] notation. Iteration is necessary, which makes it heavier on processing time.
Besides that, your code would become something like this:
DynamicArray *array = malloc(sizeof(DynamicArray)); /*Just a first element that will stay empty so your Dynamic array persists*/
array->next = NULL;
DynamicArray *aux = array;
DynamicArray *new;
for(int i = 0; i<4; i++){
new = malloc(sizeof(DynamicArray));
new->next = NULL;
new->x = i+1;
aux->next = new;
aux = new;
}
Here you have a sequence of structs in which each struct points to the next one and has an integer inside.
If now you'd do something like:
aux = array->next; /*array points to that empty one, must be the next*/
while(aux != NULL){
printf("%d\n",aux->x);
aux = aux->next;
}
You'll get the output:
1
2
3
4
And freeing the first element is as easy as:
aux = array->next;
array->next = aux->next;
free(aux);
If you try to draw it (structs are boxes and next/aux/array are arrows), you'll see one box's arrow bypass a box: the one you want to free.
Hope this helps.

Recursive struct and malloc()

I have a recursive struct which is:
typedef struct dict dict;
struct dict {
dict *children[M];
list *words[M];
};
Initialized this way:
dict *d = malloc(sizeof(dict));
bzero(d, sizeof(dict));
I would like to know what bzero() exactly does here, and how can I malloc() recursively for children.
Edit: This is how I would like to be able to malloc() the children and words:
void dict_insert(dict *d, char *signature, unsigned int current_letter, char *w) {
int occur;
occur = (int) signature[current_letter];
if (current_letter == LAST_LETTER) {
printf("word found : %s!\n",w);
list_print(d->words[occur]);
char *new;
new = malloc(strlen(w) + 1);
strcpy(new, w);
list_append(d->words[occur],new);
list_print(d->words[occur]);
}
else {
d = d->children[occur];
dict_insert(d,signature,current_letter+1,w);
}
}
bzero(3) initializes the memory to zero. It's equivalent to calling memset(3) with a second parameter of 0. In this case, it initializes all of the member variables to null pointers. bzero is considered deprecated, so you should replace uses of it with memset; alternatively, you can just call calloc(3) instead of malloc, which automatically zeroes out the returned memory for you upon success.
You should not use either of the two casts you have written—in C, a void* pointer can be implicitly cast to any other pointer type, and any pointer type can be implicitly cast to void*. malloc returns a void*, so you can just assign it to your dict *d variable without a cast. Similarly, the first parameter of bzero is a void*, so you can just pass it your d variable directly without a cast.
To understand recursion, you must first understand recursion. Make sure you have an appropriate base case if you want to avoid allocating memory infinitely.
In general, when you are unsure what the compiler is generating for you, it is a good idea to use a printf to report the size of the struct. In this case, the size of dict should be 2 * M * the size of a pointer. In this case, bzero will fill a dict with zeros. In other words, all M elements of the children and words arrays will be zero.
To initialize the structure, I recommend creating a function that takes a pointer to a dict and mallocs each child and then calls itself to initialize it:
void init_dict(dict* d)
{
int i;
for (i = 0; i < M; i++)
{
d->children[i] = malloc(sizeof(dict));
init_dict(d->children[i]);
/* initialize the words elements, too */
}
}
+1 to you if you can see why this code won't work as is. (Hint: it has an infinite recursion bug and needs a rule that tells it how deep the children tree needs to be so it can stop recursing.)
bzero just zeros the memory. bzero(addr, size) is essentially equivalent to memset(addr, 0, size). As to why you'd use it, from what I've seen around half the time it's used, it's just because somebody thought zeroing the memory seemed like a good idea, even though it didn't really accomplish anything. In this case, it looks like the effect would be to set some pointers to NULL (though it's not entirely portable for that purpose).
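If the portability caveat matters to you, the pointers can be set to NULL explicitly instead of relying on all-zero bytes. A minimal sketch, where dict_new is a made-up name, M is given an arbitrary value, and list is left as an opaque type because the question does not show it:

```c
#include <assert.h>
#include <stdlib.h>

#define M 26   /* assumed value; the question does not show it */

typedef struct list list;   /* opaque here */
typedef struct dict dict;
struct dict {
    dict *children[M];
    list *words[M];
};

/* Hypothetical helper: allocate one dict node with every pointer set to NULL
   by assignment rather than by zeroing bytes. Returns NULL on failure. */
dict *dict_new(void)
{
    dict *d = malloc(sizeof *d);
    if (d == NULL)
        return NULL;
    for (size_t i = 0; i < M; i++) {
        d->children[i] = NULL;
        d->words[i] = NULL;
    }
    return d;
}
```

On the usual platforms, where a null pointer is all zero bits, calloc(1, sizeof(dict)) gives the same result in one call.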
To allocate recursively, you'd basically just keep track of a current depth, and allocate child nodes until you reached the desired depth. Code something on this order would do the job:
void alloc_tree(dict **root, size_t depth) {
int i;
if (depth == 0) {
(*root) = NULL;
return;
}
(*root) = malloc(sizeof(**root));
for (i=0; i<M; i++)
alloc_tree((*root)->children+i, depth-1);
}
I should add that I can't quite imagine doing recursive allocation like this though. In a typical case, you insert data, and allocate new nodes as needed to hold the data. The exact details of that will vary depending on whether (and if so how) you're keeping the tree balanced. For a multi-way tree like this, it's fairly common to use some B-tree variant, in which case the code I've given above won't normally apply at all -- with a B-tree, you fill a node, and when it's reached its limit, you split it in half and promote the middle item to the parent node. You allocate a new node when this reaches the top of the tree, and the root node is already full.
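To show what "allocate new nodes as needed" can look like for a plain (unbalanced, non-B-tree) multi-way tree, here is a small sketch that creates children only along the path being inserted. The names dict_new and dict_descend and the value of M are assumptions, and list is left opaque.

```c
#include <assert.h>
#include <stdlib.h>

#define M 26   /* assumed value; the question does not show it */

typedef struct list list;   /* opaque here */
typedef struct dict dict;
struct dict {
    dict *children[M];
    list *words[M];
};

/* Hypothetical helper: one empty node, all pointers NULL. */
static dict *dict_new(void)
{
    dict *d = malloc(sizeof *d);
    if (d == NULL)
        return NULL;
    for (size_t i = 0; i < M; i++) {
        d->children[i] = NULL;
        d->words[i] = NULL;
    }
    return d;
}

/* Hypothetical helper: walk from root along path[0..len-1], allocating a
   child only when the path needs one that does not exist yet. Returns the
   node reached, or NULL if an allocation fails. */
dict *dict_descend(dict *root, const unsigned char *path, size_t len)
{
    dict *d = root;
    for (size_t i = 0; i < len; i++) {
        unsigned idx = path[i];
        assert(idx < M);
        if (d->children[idx] == NULL) {
            d->children[idx] = dict_new();
            if (d->children[idx] == NULL)
                return NULL;
        }
        d = d->children[idx];
    }
    return d;
}
```

The recursion depth problem disappears because the length of the inserted key decides how deep the tree grows.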

Resources