Some confusions about struct memory allocation mechanism? - c

In my project I have to work with a C program.
As shown below, htmp is a pointer to a struct. We first allocate memory for it. But why do we then have to allocate memory again for its member word?
If it is essential to allocate memory for each member of a struct, why is no memory allocated for its other members, id and next?
#define HASHFN bitwisehash
typedef struct hashrec {
char *word;
long long id;
struct hashrec *next;
} HASHREC;
/* Move-to-front hashing and hash function from Hugh Williams, http://www.seg.rmit.edu.au/code/zwh-ipl/ */
/* Simple bitwise hash function */
unsigned int bitwisehash(char *word, int tsize, unsigned int seed) {
char c;
unsigned int h;
h = seed;
for (; (c = *word) != '\0'; word++) h ^= ((h << 5) + c + (h >> 2));
return((unsigned int)((h&0x7fffffff) % tsize));
}
/* Insert string in hash table, check for duplicates which should be absent */
void hashinsert(HASHREC **ht, char *w, long long id) {
HASHREC *htmp, *hprv;
unsigned int hval = HASHFN(w, TSIZE, SEED);
for (hprv = NULL, htmp = ht[hval]; htmp != NULL && scmp(htmp->word, w) != 0; hprv = htmp, htmp = htmp->next);
if (htmp == NULL) {
htmp = (HASHREC *) malloc(sizeof(HASHREC)); /* allocate memory for htmp */
htmp->word = (char *) malloc(strlen(w) + 1); /* why allocate memory again? */
strcpy(htmp->word, w);
htmp->id = id; /* why not allocate memory for htmp->id? */
htmp->next = NULL; /* why not allocate memory for htmp->next? */
if (hprv == NULL) ht[hval] = htmp;
else hprv->next = htmp;
}
else fprintf(stderr, "Error, duplicate entry located: %s.\n",htmp->word);
return;
}

You need to separate two things in your mind: (1) in what memory is the thing I want to store actually stored? and (2) what variable (pointer) holds the address of that memory so I can find it again?
You first declare two pointers to struct hashrec:
HASHREC *htmp, *hprv;
A pointer is nothing but a variable that holds the address of something else as its value. When you first declare the two pointers, they are uninitialized and their values are indeterminate. You then, in a rather awkward manner, initialize both pointers in the for loop header (hprv = NULL, htmp = ht[hval]) and advance them on each iteration (hprv = htmp, htmp = htmp->next), so after the loop each pointer holds either a valid address or NULL.
Following the loop (which has an empty body), you test if (htmp == NULL), meaning that htmp does not point to any node. That is the case when the bucket at that hash index is empty, or when the whole chain was walked without finding w.
Then in order to provide storage for one HASHREC (e.g. a struct hashrec) you need to allocate storage so you have a block of memory in which to store the thing you want to store. So you allocate a block to hold one struct. (See: Do I cast the result of malloc?)
Now, look at what you have allocated memory for:
typedef struct hashrec {
char *word;
long long id;
struct hashrec *next;
} HASHREC;
You have allocated storage for a struct that contains (1) a char *word; (a pointer to char, typically 8 bytes on a 64-bit system and 4 bytes on 32-bit x86); (2) a long long id; (typically 8 bytes on both); and (3) a pointer to hold the address of the next HASHREC in the sequence.
There is no question that id can hold a long long value, but what about word and next? They are both pointers. What do pointers hold? The address to where the thing they point to can be found. Where can word be found? The thing you want is currently pointed to by w, but there is no guarantee that w will continue to hold the word you want, so you are going to make a copy and store it as part of the HASHREC. So you see:
htmp->word = malloc(strlen(w) + 1); /* useless cast removed */
Now what does malloc return? It returns the address of the beginning of a new block of memory, strlen(w) + 1 bytes long. Since a pointer holds the address of something else as its value, htmp->word now stores the address of the beginning of the new block of memory as its value. So htmp->word "points" to the new block of memory and you can use htmp->word to refer to that block.
What happens next is important:
strcpy(htmp->word, w);
htmp->id = id; /* why not allocate memory for htmp->id? */
htmp->next = NULL; /* why not allocate memory for htmp->next? */
strcpy(htmp->word, w); copies w into that new block of memory. htmp->id = id; assigns the value of id to htmp->id and no allocation is required because when you allocate:
htmp = malloc(sizeof(HASHREC)); /* useless cast removed */
You allocate storage for a (1) char * pointer, (2) a long long id; and (3) a struct hashrec* pointer -- you have already allocated for a long long so htmp->id can store the value of id in the memory for the long long.
htmp->next = NULL; /* why not allocate memory for htmp->next? */
What is it that you are attempting to store that would require a new allocation for htmp->next? (hint: nothing currently) It will point to the next struct hashrec. Currently it is initialized to NULL so that the next time you iterate to the end of all the struct hashrec next pointers, you know you are at the end when you reach NULL.
Another way to think of it is that the previous struct hashrec next can now point to this node you just allocated. Again no additional allocation is required, the previous node ->next pointer simply points to the next node in sequence, not requiring any specific new memory to be allocated. It is just used as a reference to refer (or point to) the next node in the chain.
There is a lot of information here, but when you go through the thought process of determining (1) in what memory is the thing I want to store stored?; and (2) what variable (pointer) holds the address to where it is stored so I can find it again... -- things start to fall into place. Hope this helps.
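The two allocations discussed above can be sketched as a pair of helper functions. This is only an illustration: hashrec_new and hashrec_free are hypothetical names that do not appear in the original program, and the error handling is an assumption about how one might want failures reported.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct hashrec {
    char *word;            /* pointer only: the characters live elsewhere */
    long long id;          /* stored inside the struct itself */
    struct hashrec *next;  /* pointer only: refers to another node, or NULL */
} HASHREC;

/* Hypothetical helper: build one node holding a private copy of w. */
HASHREC *hashrec_new(const char *w, long long id) {
    assert(w != NULL);
    HASHREC *node = malloc(sizeof *node);   /* room for two pointers and one long long */
    if (node == NULL) return NULL;
    node->word = malloc(strlen(w) + 1);     /* room for the characters plus '\0' */
    if (node->word == NULL) {
        free(node);
        return NULL;
    }
    strcpy(node->word, w);
    node->id = id;       /* plain assignment: the storage is already there */
    node->next = NULL;   /* nothing to point to yet */
    return node;
}

/* Hypothetical helper: release what the node points to, then the node. */
void hashrec_free(HASHREC *node) {
    if (node == NULL) return;
    free(node->word);
    free(node);
}
```

Because the node owns its own copy of the string, the caller can overwrite or free the buffer behind w afterwards without affecting node->word.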

When you allocate the memory for the struct, only enough memory to hold a pointer is allocated for the member word. That pointer also has to point to valid memory, which you then allocate separately.
Without making it point to valid memory, its value is indeterminate, and dereferencing a pointer with an indeterminate value is undefined behavior.

malloc(size) allocates a block of memory of the given size and returns the starting address of that block. Addresses are stored in pointer variables.
In char *word, the variable word is of pointer type and can only hold the address of a char, i.e. the address of character data in memory.
So, when you allocate memory for the struct that htmp points to, the block has room for the members themselves: one pointer for word, one long long for id, and one pointer for next. The exact sizes depend on the platform (on a typical 64-bit system, 8 bytes each).
Now, lets assume that you want to store following into your struct:
{
word = Elephant
id = 1245
}
Now you can save id directly into the id member by doing htmp->id = 1245. But you also want to save the word "Elephant" in your struct. How do you achieve this?
htmp->word = 'Elephant' will not work: single quotes denote a character constant, not a string, and word is a pointer. Even htmp->word = "Elephant" would only store the address of a string literal that you must not modify, rather than a copy of your own. So you allocate memory to hold the characters of Elephant, copy them in, and store the starting address of that memory in htmp->word.
Instead of using char *word in your struct, you could have used char word[size], in which case there is no need to allocate memory separately for the member word. The reason for not doing so is that you would have to pick one fixed size for every word, which wastes memory if you store fewer characters and falls short if a word is too long.
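To make the trade-off of a fixed-size member concrete, here is a minimal sketch. WORD_MAX, struct fixedrec, and fixedrec_set are hypothetical names chosen for the illustration, and the capacity of 8 is deliberately small so that truncation is easy to see.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define WORD_MAX 8   /* assumed capacity, kept small to show truncation */

struct fixedrec {
    char word[WORD_MAX];   /* the characters are stored inside the struct */
    long long id;
};

/* Hypothetical helper: copy w into the fixed buffer, truncating if needed.
   Returns 1 if the whole word fit, 0 if it was cut short. */
int fixedrec_set(struct fixedrec *r, const char *w, long long id) {
    assert(r != NULL && w != NULL);
    int needed = snprintf(r->word, sizeof r->word, "%s", w);
    r->id = id;
    return needed >= 0 && (size_t)needed < sizeof r->word;
}
```

With this layout no separate allocation is needed, but "Elephant" (8 characters plus the terminator) no longer fits in 8 bytes and is stored as "Elephan", while a short word such as "cat" leaves most of the buffer unused.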

Related

Dynamic allocation of array in C that points to linked list

I have these structs and I want to initialize the PageTable and PageEntry. I want to create the shape below.
typedef struct PageEntry {
unsigned int page_number;
char mode;
int count, R;
struct PageEntry* next;
} PE;
typedef struct PageTable {
int p_faults, reads, writes, disk_writes, maxFrames, curFrames;
char* algorithm;
struct PE **pe;
} PT;
I want to create a hash table, so I allocate for maxFrames PE*. My PageTable needs to have a pointer to the array and each element has to point to a linked list.
Here is my init function:
PT *initialize_Table(int maxFrames, char *algorithm) {
PT *ptr = malloc(sizeof(PT)); //Aloc
ptr->p_faults = 0;
ptr->reads = 0;
ptr->writes = 0;
ptr->curFrames = 0;
ptr->disk_writes = 0;
ptr->maxFrames = maxFrames;
ptr->algorithm = malloc(strlen(algorithm) + 1);
strcpy(ptr->algorithm, algorithm);
ptr->pe = malloc((ptr->maxFrames) * sizeof(PE*));
return ptr;
}
So ptr->pe should be an array, but it isn't.
I get this error:
What should I do ?
No, ptr->pe is a pointer, not an array. You allocated memory for an array, and you can index ptr->pe as if it were an array. So ptr->pe[i] is valid if i is within range.
The contents of this freshly malloced block of memory are indeterminate. Use memset to set it to all zeros, or use calloc (instead of malloc) to allocate cleared memory.
Arrays in C are second-class citizens. You can declare them, initialize them, and query their size with sizeof, but you can't do much else with them. For most other purposes an array variable decays to (is treated as) a pointer to its first element.
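As a rough sketch of the advice above, the bucket array can be allocated with calloc so that every list head starts out empty, and entries can then be pushed onto a bucket's list. The PE struct is trimmed to two fields, and make_buckets and bucket_push are hypothetical helpers, not part of the question's code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct PageEntry {
    unsigned int page_number;
    struct PageEntry *next;
} PE;

/* Hypothetical helper: allocate n bucket heads, all starting out empty.
   calloc clears the block to all-zero bits, which is a null pointer on
   mainstream platforms. */
PE **make_buckets(size_t n) {
    assert(n > 0);
    return calloc(n, sizeof(PE *));
}

/* Hypothetical helper: push a new entry at the front of bucket i.
   Returns 1 on success, 0 if the allocation failed. */
int bucket_push(PE **buckets, size_t i, unsigned int page_number) {
    PE *e = malloc(sizeof *e);
    if (e == NULL) return 0;
    e->page_number = page_number;
    e->next = buckets[i];   /* old head (possibly NULL) becomes second */
    buckets[i] = e;
    return 1;
}
```

Here buckets is still just a pointer, but buckets[i] indexes the allocated block exactly as it would an array of PE * elements.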

C - Copying Array of Strings into a Struct

I've been banging my head against the wall on this for a couple hours now.
I have a struct defined as follows:
typedef struct historyNode {
int pos;
char cmd[MAXLINE];
char* args[(MAXLINE / 2) + 1];
struct historyNode* next;
} historyNode_t;
I am attempting to copy a passed-in array of strings into the args field within the above struct. This happens in the method below:
void addToHistory(history_t* history, char* args[(MAXLINE / 2) + 1]) {
historyNode_t* node = malloc(sizeof(historyNode_t));
...
int index = 0;
while (args[index] != NULL) {
node->args[index] = args[index];
...
When I attempt to access this node's args value at a later time outside of the method, it spits out a value equal to whatever is in the passed-in args array at that moment; ie, the values aren't actually being copied, but rather the addresses are.
I feel like this is simple but it is frustrating me. Any tips on how this can be remedied are super appreciated.
I attempt to access this node's args value at a later time outside of the method
To do that, you should return a pointer to the struct historyNode to wherever you want to access the args values. Here is an example:
#include<stdio.h>
#include<stdlib.h>
struct node
{
char *a[2];
};
struct node *fun()
{
char *c[]={"hello","World"};
struct node *ptr= malloc(sizeof(struct node));
ptr->a[0]=c[0];
ptr->a[1]=c[1];
return ptr;
}
int main()
{
struct node *ptr=fun();
printf("%s %s\n",ptr->a[0],ptr->a[1]);
return 0;
}
OUTPUT: hello World
In C, C++,
char charA[10]; // Array of char (i.e., string) up to 9+1 byte
// 10 bytes of memory is reserved
char *string; // Pointer to a null-terminated string
// Memory for 1 pointer (4 or 8 bytes) is reserved
// Need to allocate arbitrary bytes of memory
// Up to programmer to interpret the memory structure
// E.g.,
// As array of pointers to string
// A long string
// etc.
// This pointer could be passed to other functions
// and content at that pointed address could be changed
//char *strings[]; // Cannot declare pointer to unknown length of array
// Use char** as below
char **ptrs2Strings; // Pointer to pointer to char
// Memory for 1 pointer (4 or 8 bytes) is reserved
// Need to allocate arbitrary bytes of memory
// Up to programmer to interpret the memory structure
// E.g.,
// As array of pointers to string
// A long string
// etc.
// The pointer to pointer could be passed to other
// functions. The content at that pointed address is
// only an address to an user-allocated memory.
// These functions could change this second address as
// well as content of the memory pointed by the second
// address.
char *charvar[10]; // array of 10 char pointers
// Memory for 10 pointers (40 or 80 bytes) is reserved
// Programmer could allocate arbitrary bytes of memory
// for each char pointer
char stringA[10][256]; // array of 10 strings, and each string could store
// up to 255+1 bytes
I hope this could help you.
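The behaviour described in the question (the node's args changing whenever the caller's buffers change) comes from copying only the pointers. One way to copy the strings themselves is sketched below, assuming the source array is NULL-terminated as in the question's loop. dup_string, copy_args, and free_args are hypothetical helpers; malloc plus strcpy is used rather than strdup so the sketch does not depend on POSIX.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate one string into freshly allocated memory. */
static char *dup_string(const char *s) {
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL) strcpy(copy, s);
    return copy;
}

/* Hypothetical helper: deep-copy a NULL-terminated array of strings into dst,
   which has room for cap pointers (including the terminating NULL).
   Returns the number of strings copied, or -1 on allocation failure. */
int copy_args(char *dst[], size_t cap, char *const src[]) {
    assert(dst != NULL && src != NULL && cap > 0);
    size_t i = 0;
    while (i + 1 < cap && src[i] != NULL) {
        dst[i] = dup_string(src[i]);
        if (dst[i] == NULL) {
            while (i > 0) free(dst[--i]);   /* undo the copies made so far */
            dst[0] = NULL;
            return -1;
        }
        i++;
    }
    dst[i] = NULL;
    return (int)i;
}

/* Hypothetical helper: free every string copied by copy_args. */
void free_args(char *dst[]) {
    for (size_t i = 0; dst[i] != NULL; i++) {
        free(dst[i]);
        dst[i] = NULL;
    }
}
```

In the question's addToHistory this would correspond to something like copy_args(node->args, (MAXLINE / 2) + 1, args), with free_args(node->args) called before the node itself is freed.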

Check if a pointer points to a valid structure

I have N statically allocated structures.
struct exemple{
...
};
struct exemple array[N];
struct exemple *test_ptr = 0x3; /* random address */
Can I check if test_ptr points to a valid address, i.e. that it points to one of the allocated "struct exemple" objects?
You can't. You have to know. It's not a problem if you manage your pointers correctly. A good habit is to always set pointers to 0 / NULL as soon as you destroy the object they point to. Then you can just test with if (ptr) or if (!ptr) (or, more verbose: if (ptr == NULL) / if (ptr != NULL)).
Note that your last assignment
struct exemple *test_ptr = 0x3; /* random address */
is invalid. You can't assign an integer to a pointer without a cast, but you can cast it to the pointer type:
struct exemple *test_ptr = (struct exemple *)0x3; /* random address */
The result will depend on your implementation / system.
The only validity check the language gives you is pointer != NULL, because any value other than NULL is treated as a valid pointer.
In your case, to check whether your pointer points to one of your array entries, you can only do this:
size_t i = 0;
int isValid = 0;
for (i = 0; i < N; i++) {
if (test_ptr == &array[i]) {
isValid = 1;
break;
}
}
if (isValid) {
//Pointer points to one of your array entries
}
But in general, you cannot test whether a pointer points to a valid location. It is up to you to take care of where it points. It can also have a non-NULL value but point to an invalid location, for example:
int* ptr = malloc(10); //Now points to allocated memory
*ptr = 10;
free(ptr); //Free memory
*ptr = 10; //Undefined behaviour, it still points to the same address but
//we don't know what will happen. Depends on implementation
In general, no, you can't test if a pointer is valid or not.
But, if you want to know if a pointer points to an element of an array, you can:
if(test_ptr >= &array[0] && test_ptr < &array[N]
&& ((intptr_t)test_ptr - (intptr_t)array)%((intptr_t)(&array[1]) - (intptr_t)array) == 0) {
// test_ptr points to an element of array
}
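This works because array elements are stored contiguously. Strictly speaking, comparing pointers with < or >= is only defined when both point into the same array (or one past its end), so for a completely unrelated test_ptr this relies on the platform having a flat address space. intptr_t is declared in <stdint.h>.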
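The range-and-alignment check can be wrapped in a small function. The sketch below converts to uintptr_t before comparing, which avoids relational comparison of unrelated pointers but still assumes that the integer values preserve address order (true on mainstream platforms, though implementation-defined). points_to_element is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is the address of one of the count
   elements (each elem_size bytes) starting at base, 0 otherwise.
   Assumes uintptr_t conversions preserve address order. */
int points_to_element(const void *p, const void *base,
                      size_t count, size_t elem_size) {
    assert(base != NULL && elem_size > 0);
    uintptr_t addr  = (uintptr_t)p;
    uintptr_t first = (uintptr_t)base;
    uintptr_t end   = first + count * elem_size;   /* one past the last element */
    if (addr < first || addr >= end) return 0;
    return (addr - first) % elem_size == 0;        /* must sit on an element boundary */
}
```

For the question's array this would be called as points_to_element(test_ptr, array, N, sizeof array[0]). It only says that the address matches an element, not that the pointer was obtained in a valid way.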
There is no language mechanism for this, but in some circumstances you can place known values at certain positions in the structure. If the pointed-to memory holds those values you can assume it is valid, though of course you have no guarantee. You need to write your own functions that set the values when you create the structure and clear them when you destroy it (by filling it with zeros before freeing the memory). It is a very weak workaround, but combined with other measures, and if you accept the overhead, it lowers the probability of incorrect program behaviour.
Sometimes it is called a security cookie.
It is of course possible to make it more elaborate: at certain positions you store only offsets to those cookies. That makes it less probable that a random position in memory will contain such a chain of data :)
I don't know if I get your question properly.
If you want to know whether a pointer points to a struct of some type (for example when casting structs to void * and back), this is how I do it:
#include <assert.h>
#include <stdint.h> /* uint64_t */
struct my_struct {
#ifndef NDEBUG
#define MY_STRUCT_MAGIC 0x1234abcd
uint64_t magic;
#endif
int my_data;
};
void init_struct(struct my_struct *s, int t_data) {
#ifdef MY_STRUCT_MAGIC
s->magic = MY_STRUCT_MAGIC;
#endif
s->my_data = t_data;
}
struct my_struct *my_struct_cast(void *vs) {
struct my_struct *s = vs;
#ifdef MY_STRUCT_MAGIC
assert(MY_STRUCT_MAGIC == s->magic);
#endif
return s;
}
It has a little bit more code because of inclusion of const-casting, but I think you get the idea.
If you want to know if test_ptr points to an array member, you have to check this way: test_ptr >= array && test_ptr < &array[sizeof(array)/sizeof(array[0])]. If the pointer comes from a void *, a char *, or some kind of dangerous arithmetic, you should also check that its byte offset from the start of the array is a multiple of the element size: ((char *)test_ptr - (char *)array) % sizeof(array[0]) == 0.
If you want to know if a pointer points to valid memory "ever allocated" by your program, you will have to intercept the allocation functions, save each returned chunk's pointer and size, and do a range check like the previous example.
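A very small sketch of that idea follows, using wrapper functions instead of truly intercepting malloc. Everything here is hypothetical: the names tracked_malloc, tracked_free, and is_tracked, the fixed registry capacity, and the choice to fail when the registry is full. It also assumes, as above, that uintptr_t values preserve address order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_TRACKED 64   /* assumed fixed capacity, for illustration only */

static struct { void *ptr; size_t size; } tracked[MAX_TRACKED];

/* Hypothetical wrapper: allocate and remember the block's address and size. */
void *tracked_malloc(size_t size) {
    assert(size > 0);
    void *p = malloc(size);
    if (p == NULL) return NULL;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].ptr == NULL) {
            tracked[i].ptr = p;
            tracked[i].size = size;
            return p;
        }
    }
    free(p);   /* registry full: refuse rather than hand out an untracked block */
    return NULL;
}

/* Hypothetical wrapper: forget the block, then free it.
   Pointers that are not in the registry are ignored. */
void tracked_free(void *p) {
    if (p == NULL) return;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].ptr == p) {
            tracked[i].ptr = NULL;
            tracked[i].size = 0;
            free(p);
            return;
        }
    }
}

/* Returns 1 if p lies inside any block currently held in the registry. */
int is_tracked(const void *p) {
    uintptr_t addr = (uintptr_t)p;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].ptr == NULL) continue;
        uintptr_t start = (uintptr_t)tracked[i].ptr;
        if (addr >= start && addr < start + tracked[i].size) return 1;
    }
    return 0;
}
```

This only covers memory obtained through the wrappers; stack variables, static data, and blocks allocated directly with malloc are reported as not tracked.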

What is the intent of the pointer arithmetic in this code?

What does this line of code do, newnode -> item.string = (char *)newnode + sizeof(log_t);, in the below example?
int nodesize = sizeof(log_t) + strlen(data.string) + 1;
newnode = (log_t *)malloc(nodesize);
if (newnode == NULL) return -1;
// What is this line doing:
newnode -> item.string = (char *)newnode + sizeof(log_t);
strcpy(newnode -> item.string, data.string);
where newnode is the variable of type log_t (log_t has a variable named item of type data_t). data_t has a property char *string.
Is this code setting up the buffer size of string?
I'm assuming the memory for newnode was allocated with additional space for the string.
The result is the same, but I'd personally write that as newnode->item.string = (char*) &newnode[1];
That is, the storage space for string is immediately after the log_t object. This is sometimes done when a single chunk of memory has been allocated in advance, and objects and their members all point to memory in this chunk. It's been done in the past to cut down on the overhead of small memory allocations.
If log_t is 32 bytes, and the string is 16 bytes long (including the nul terminator!), you could malloc 48 bytes, point the string member at byte offset 32 of this allocation (the first byte after the struct) and copy the string there.
newnode -> item.string = (char *)newnode + sizeof(log_t);
The right side will take the pointer newnode, cast it to a character pointer then add the size of an object to it.
This gives a pointer n bytes beyond newnode, where n is the size of the log_t object.
It then places this pointer value into the string member of the item member of the object pointed to by newnode.
Without seeing the actual structures in use, it's a little hard to tell why this is being done but my best guess would be that it's to provide an efficient self-referential pointer.
In other words, the pointer within newnode will point to an actual part of the newnode itself, or part of a larger memory block that was allocated which contains a newnode object at the start of it. And, since you state that newnode is of the type log_t, it must be the latter case (a type cannot contain a copy of itself - it can contain a pointer to the type itself but not a actual copy).
An example of where this may come in handy is an object allocation where small sizes are satisfied completely by the object itself but larger ones are handled differently, such as with an int-to-string map entry:
typedef struct {
int id;
char *string;
} hdr_t;
typedef struct {
hdr_t hdr;
char smallBuff[31];
} entry_t;
In the case where you want to populate an entry_t variable with a 500-character string, you would allocate the string separately then just set up string to point to it.
However, for a string of thirty characters or less, you could just create it in smallBuff then set string to point to that instead (no separate memory needed). That would be done with:
entry_t *entry = malloc (sizeof (*entry)); // should check for NULL.
entry->hdr.id = 7;
entry->hdr.string = (char*)entry + sizeof (hdr_t);
strcpy (entry->hdr.string, "small string");
The third line in that sample above is very similar to what you have in your code.
Similarly (and probably more apropos to your case), you can allocate more memory than you need and use it:
typedef struct {
int id;
char *string;
} entry_t;
char *str = "small string";
entry_t *entry = malloc (sizeof (*entry) + strlen (str) + 1); // with extra bytes.
entry->id = 7;
entry->string = (char*)entry + sizeof (entry_t);
strcpy (entry->string, str);
It's doing pointer arithmetic. In C, when you add a value to a pointer, it moves the pointer by that value * the size of the type pointed to by the pointer.
What this is doing is casting newnode to a char*, so that the pointer arithmetic is done assuming that newnode is a char*, thus the size of the data it points to is 1. It then adds sizeof(log_t), which is the size of type log_t. Based on your description of log_t, it looks like it contains a single char*, so its size would be the size of a pointer, either 4 or 8 bytes, depending on the architecture.
So, this will set newnode->item.string to be sizeof(log_t) bytes after the address that newnode contains.
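The scaling rule can be checked directly. In the sketch below the definitions of data_t and log_t are guesses based on the question's description (the real ones are not shown), and log_new is a hypothetical helper. It shows that adding sizeof(log_t) to a char * lands on the same address as adding 1 to a log_t *.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: the question does not show these types. */
typedef struct { char *string; } data_t;
typedef struct { data_t item; } log_t;

/* Hypothetical helper: one allocation holding the struct followed by the text. */
log_t *log_new(const char *text) {
    assert(text != NULL);
    log_t *node = malloc(sizeof *node + strlen(text) + 1);
    if (node == NULL) return NULL;
    /* char * arithmetic moves in bytes, so this skips exactly one log_t */
    node->item.string = (char *)node + sizeof *node;
    strcpy(node->item.string, text);
    return node;
}
```

After a successful call, node->item.string compares equal to (char *)(node + 1) and to (char *)&node[1], and a single free(node) releases both the struct and the string.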

Declare and allocate memory for an array of structures in C

I'm trying to declare and allocate memory for an array of structures defined as follows:
typedef struct y{
int count;
char *word;
} hstruct;
What I have right now is:
hstruct *final_list;
final_list = calloc (MAX_STR, sizeof(hstruct));
MAX_STR being the max size of the char word selector.
I plan on being able to refer to it as:
final_list[i].count, which would be an integer and
final_list[i].word, which would be a string.
i being an integer variable.
However, such expressions always return (null). I know I'm doing something wrong, but I don't know what. Any help would be appreciated. Thanks.
A struct that contains a pointer doesn't directly hold the data, but holds a pointer to the data. The memory for the pointer itself is correctly allocated by your calloc, but it is just an address (and calloc has set it to null).
This means that it is your duty to allocate the memory it points to:
hstruct *final_list;
final_list = calloc(LIST_LENGTH, sizeof(hstruct));
for (int i = 0; i < LIST_LENGTH; ++i)
final_list[i].word = calloc(MAX_STR, sizeof(char));
This also requires freeing the memory pointed to by each final_list[i].word before releasing the array of structs itself.
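Allocation and cleanup for such a list could be paired as in the sketch below. list_new and list_free are hypothetical helpers, and passing the length and buffer size as parameters is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct y {
    int count;
    char *word;
} hstruct;

/* Hypothetical helper: allocate n entries, each with a zeroed buffer
   of word_cap bytes. Returns NULL if any allocation fails. */
hstruct *list_new(size_t n, size_t word_cap) {
    assert(n > 0 && word_cap > 0);
    hstruct *list = calloc(n, sizeof *list);
    if (list == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {
        list[i].word = calloc(word_cap, 1);
        if (list[i].word == NULL) {
            while (i > 0) free(list[--i].word);   /* undo earlier buffers */
            free(list);
            return NULL;
        }
    }
    return list;
}

/* Hypothetical helper: free every word buffer, then the array itself. */
void list_free(hstruct *list, size_t n) {
    if (list == NULL) return;
    for (size_t i = 0; i < n; i++) free(list[i].word);
    free(list);
}
```

After list_new each final_list[i].word is an empty string rather than a null pointer, so printing it with %s shows nothing instead of (null).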

Resources