What is the intent of the pointer arithmetic in this code? - c

What does this line of code do, newnode -> item.string = (char *)newnode + sizeof(log_t);, in the below example?
int nodesize = sizeof(log_t) + strlen(data.string) + 1;
newnode = (log_t *)malloc(nodesize);
if (newnode == NULL) return -1;
// What is this line doing:
newnode -> item.string = (char *)newnode + sizeof(log_t);
strcpy(newnode -> item.string, data.string);
where newnode is the variable of type log_t (log_t has a variable named item of type data_t). data_t has a property char *string.
Is this code setting up the buffer size of string?

I'm assuming the memory for newnode was allocated with additional space for the string.
The result is the same, but I'd personally write that as newnode->item.string = (char*) &newnode[1];
That is, the storage space for string is immediately after the log_t object. This is sometimes done when a single chunk of memory has been allocated in advance, and objects and their members all point to memory in this chunk. It's been done in the past to cut down on the overhead of small memory allocations.
If log_t is 32 bytes, and the string is 16 bytes long (including the nul terminator!), you could malloc 48 bytes, point the string member at byte offset 32 of this allocation (the first byte after the log_t object), and copy the string there.
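A minimal sketch of that single-allocation layout follows. The definitions of data_t and log_t are assumptions, since the question does not show them, and make_node is a hypothetical helper name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical definitions; the question does not show the real ones. */
typedef struct {
    char *string;
} data_t;

typedef struct log {
    data_t item;
    struct log *next;
} log_t;

/* Allocate a node and its string in one block; the string is stored
   immediately after the log_t object. Returns NULL if malloc fails. */
log_t *make_node(const char *text)
{
    size_t nodesize = sizeof(log_t) + strlen(text) + 1;
    log_t *newnode = malloc(nodesize);
    if (newnode == NULL)
        return NULL;

    /* Address of the first byte past the struct. */
    newnode->item.string = (char *)newnode + sizeof(log_t);
    /* Equivalent form: newnode->item.string = (char *)&newnode[1]; */

    strcpy(newnode->item.string, text);
    newnode->next = NULL;
    return newnode;
}
```

Because the struct and the string share one allocation, a single free(node) releases both; item.string must not be freed separately.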

newnode -> item.string = (char *)newnode + sizeof(log_t);
The right side will take the pointer newnode, cast it to a character pointer then add the size of an object to it.
This gives a pointer n bytes beyond newnode, where n is the size of the log_t object.
It then places this pointer value into the string member of the item member of the object pointed to by newnode.
Without seeing the actual structures in use, it's a little hard to tell why this is being done but my best guess would be that it's to provide an efficient self-referential pointer.
In other words, the pointer within newnode will point to an actual part of the newnode itself, or part of a larger memory block that was allocated which contains a newnode object at the start of it. And, since you state that newnode is of the type log_t, it must be the latter case (a type cannot contain a copy of itself - it can contain a pointer to the type itself but not an actual copy).
An example of where this may come in handy is an object allocation where small sizes are satisfied completely by the object itself but larger ones are handled differently, such as with an int-to-string map entry:
typedef struct {
int id;
char *string;
} hdr_t;
typedef struct {
hdr_t hdr;
char smallBuff[31];
} entry_t;
In the case where you want to populate an entry_t variable with a 500-character string, you would allocate the string separately then just set up string to point to it.
However, for a string of thirty characters or less, you could just create it in smallBuff then set string to point to that instead (no separate memory needed). That would be done with:
entry_t *entry = malloc (sizeof (*entry)); // should check for NULL.
entry->hdr.id = 7;
entry->hdr.string = (char*)entry + sizeof (hdr_t);
strcpy (entry->hdr.string, "small string");
The third line in that sample above is very similar to what you have in your code.
Similarly (and probably more apropos to your case), you can allocate more memory than you need and use it:
typedef struct {
int id;
char *string;
} entry_t;
char *str = "small string";
entry_t *entry = malloc (sizeof (*entry) + strlen (str) + 1); // with extra bytes.
entry->id = 7;
entry->string = (char*)entry + sizeof (entry_t);
strcpy (entry->string, str);

It's doing pointer arithmetic. In C, when you add a value to a pointer, it moves the pointer by that value * the size of the type pointed to by the pointer.
What this is doing is casting newnode to a char*, so that the pointer arithmetic is done as if newnode were a char*, and the size of the data it points to is therefore 1. It then adds sizeof(log_t), which is the size of type log_t in bytes. If log_t contains only a single char* (as your description suggests), its size would be the size of a pointer, typically 4 or 8 bytes, depending on the architecture.
So, this will set newnode->item.string to be sizeof(log_t) bytes after the address that newnode contains.
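The scaling rule can be checked with a small sketch. The function names int_step and char_step are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Number of bytes the address moves when 1 is added to an int *. */
size_t int_step(void)
{
    int values[2] = {0, 0};
    int *p = values;
    /* Cast to char * so the difference is measured in bytes. */
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Number of bytes the address moves when n is added to a char *.
   n must not exceed 64, the size of the local buffer. */
size_t char_step(size_t n)
{
    char bytes[64];
    char *p = bytes;
    return (size_t)((p + n) - p);
}
```

int_step() returns sizeof(int), while char_step(n) returns n. This is why the cast to char * is needed before adding sizeof(log_t): without it, the addition would advance by sizeof(log_t) whole log_t objects.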

Related

C - Copying Array of Strings into a Struct

I've been banging my head against the wall on this for a couple hours now.
I have a struct defined as follows:
typedef struct historyNode {
int pos;
char cmd[MAXLINE];
char* args[(MAXLINE / 2) + 1];
struct historyNode* next;
} historyNode_t;
I am attempting to copy a passed-in array of strings into the args field within the above struct. This happens in the method below:
void addToHistory(history_t* history, char* args[(MAXLINE / 2) + 1]) {
historyNode_t* node = malloc(sizeof(historyNode_t));
...
int index = 0;
while (args[index] != NULL) {
node->args[index] = args[index];
...
When I attempt to access this node's args value at a later time outside of the method, it spits out a value equal to whatever is in the passed-in args array at that moment; i.e., the values aren't actually being copied, but rather the addresses are.
I feel like this is simple but it is frustrating me. Any tips on how this can be remedied are super appreciated.
I attempt to access this node's args value at a later time outside of the method
To do that, you should return the pointer to the struct historyNode to wherever you want to access the args value. Here is an example:
#include<stdio.h>
#include<stdlib.h>
struct node
{
char *a[2];
};
struct node *fun()
{
char *c[]={"hello","World"};
struct node *ptr= malloc(sizeof(struct node));
ptr->a[0]=c[0];
ptr->a[1]=c[1];
return ptr;
}
int main()
{
struct node *ptr=fun();
printf("%s %s\n",ptr->a[0],ptr->a[1]);
return 0;
}
OUTPUT: hello World
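If the goal is for the node to keep its own copies of the strings, so that later changes to the caller's buffers do not show through, each string needs its own storage. A minimal sketch, assuming a simplified node type: node_t, MAXARGS, copy_args and free_args are hypothetical names, not part of the question's code.

```c
#include <stdlib.h>
#include <string.h>

#define MAXARGS 8  /* hypothetical limit, standing in for (MAXLINE / 2) + 1 */

typedef struct {
    char *args[MAXARGS];
} node_t;

/* Copy each string from args into freshly allocated memory owned by node.
   Returns 0 on success, -1 if an allocation fails. */
int copy_args(node_t *node, char *args[])
{
    int i = 0;
    while (i < MAXARGS - 1 && args[i] != NULL) {
        node->args[i] = malloc(strlen(args[i]) + 1);
        if (node->args[i] == NULL) {
            while (i > 0)
                free(node->args[--i]);  /* undo the partial copy */
            node->args[0] = NULL;
            return -1;
        }
        strcpy(node->args[i], args[i]);
        i++;
    }
    node->args[i] = NULL;  /* keep the array NULL-terminated */
    return 0;
}

/* Release the copies made by copy_args. */
void free_args(node_t *node)
{
    for (int i = 0; i < MAXARGS && node->args[i] != NULL; i++) {
        free(node->args[i]);
        node->args[i] = NULL;
    }
}
```

After copy_args succeeds, node->args[i] points to memory the node owns, so overwriting the original buffers no longer affects the stored history entry.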
In C, C++,
char charA[10]; // Array of char (i.e., string) up to 9+1 byte
// 10 bytes of memory is reserved
char *string; // Pointer to a null-terminated string
// Memory for 1 pointer (4 or 8 bytes) is reserved
// Need to allocate arbitrary bytes of memory
// Up to programmer to interpret the memory structure
// E.g.,
// As array of pointers to string
// A long string
// etc.
// This pointer could be passed to other functions
// and content at that pointed address could be changed
//char *strings[]; // Cannot define an array of unknown length
// Use char** as below
char **ptrs2Strings; // Pointer to pointer to char
// Memory for 1 pointer (4 or 8 bytes) is reserved
// Need to allocate arbitrary bytes of memory
// Up to programmer to interpret the memory structure
// E.g.,
// As array of pointers to string
// A long string
// etc.
// The pointer to pointer could be passed to other
// functions. The content at that pointed address is
// only an address to an user-allocated memory.
// These functions could change this second address as
// well as content of the memory pointed by the second
// address.
char *charvar[10]; // array of 10 char pointers
// Memory for 10 pointers (40 or 80 bytes) is reserved
// Programmer could allocate arbitrary bytes of memory
// for each char pointer
char stringA[10][256]; // array of 10 strings, and each string could store
// up to 255+1 bytes
I hope this could help you.

Some confusions about struct memory allocation mechanism?

In my project, I have run into the following C program.
As shown below, htmp is a struct pointer. We first allocate memory for it. But why should we allocate memory again for its member word?
If it's essential to allocate memory for each member of a struct, why not allocate memory for its other members, id and next?
#define HASHFN bitwisehash
typedef struct hashrec {
char *word;
long long id;
struct hashrec *next;
} HASHREC;
/* Move-to-front hashing and hash function from Hugh Williams, http://www.seg.rmit.edu.au/code/zwh-ipl/ */
/* Simple bitwise hash function */
unsigned int bitwisehash(char *word, int tsize, unsigned int seed) {
char c;
unsigned int h;
h = seed;
for (; (c = *word) != '\0'; word++) h ^= ((h << 5) + c + (h >> 2));
return((unsigned int)((h&0x7fffffff) % tsize));
}
/* Insert string in hash table, check for duplicates which should be absent */
void hashinsert(HASHREC **ht, char *w, long long id) {
HASHREC *htmp, *hprv;
unsigned int hval = HASHFN(w, TSIZE, SEED);
for (hprv = NULL, htmp = ht[hval]; htmp != NULL && scmp(htmp->word, w) != 0; hprv = htmp, htmp = htmp->next);
if (htmp == NULL) {
htmp = (HASHREC *) malloc(sizeof(HASHREC)); /* allocate memory for htmp */
htmp->word = (char *) malloc(strlen(w) + 1); /* why allocate memory again? */
strcpy(htmp->word, w);
htmp->id = id; /* why not allocate memory for htmp->id? */
htmp->next = NULL; /* why not allocate memory for htmp->next? */
if (hprv == NULL) ht[hval] = htmp;
else hprv->next = htmp;
}
else fprintf(stderr, "Error, duplicate entry located: %s.\n",htmp->word);
return;
}
You need to separate in your mind two things (1) in what memory is the thing I want to store stored?; and (2) what variable (pointer) holds the address to where it is stored so I can find it again.
You first declare two pointers to struct hashrec:
HASHREC *htmp, *hprv;
A pointer is nothing but a variable that holds the address of something else as its value. When you first declare the two pointers, they are uninitialized, so their values are indeterminate. You then, in a rather awkward manner, initialize both pointers within the for loop header, e.g. hprv = NULL, htmp = ht[hval] and later hprv = htmp, htmp = htmp->next, so both pointers now hold either a valid address or NULL.
Following the loop (with an empty body), you test if (htmp == NULL), meaning that htmp does not point to an address (which can be the case if you have found the hash-index of interest empty).
Then in order to provide storage for one HASHREC (e.g. a struct hashrec) you need to allocate storage so you have a block of memory in which to store the thing you want to store. So you allocate a block to hold one struct. (See: Do I cast the result of malloc?)
Now, look at what you have allocated memory for:
typedef struct hashrec {
char *word;
long long id;
struct hashrec *next;
} HASHREC;
You have allocated storage for a struct that contains (1) a char *word; (a pointer to char - 8-bytes (4-bytes on x86)); (2) a long long id; (8-bytes on both) and (3) a pointer to hold the address of the next HASHREC in the sequence.
There is no question that id can hold a long long value, but what about word and next? They are both pointers. What do pointers hold? The address to where the thing they point to can be found. Where can word be found? The thing you want is currently pointed to by w, but there is no guarantee that w will continue to hold the word you want, so you are going to make a copy and store it as part of the HASHREC. So you see:
htmp->word = malloc(strlen(w) + 1); /* useless cast removed */
Now what does malloc return? It returns the address of the beginning of a new block of memory, strlen(w) + 1 bytes long. Since a pointer holds the address of something else as its value, htmp->word now stores the address of the beginning of the new block of memory as its value. So htmp->word "points" to the new block of memory and you can use htmp->word as a reference to refer to that block of memory.
What happens next is important:
strcpy(htmp->word, w);
htmp->id = id; /* why not allocate memory for htmp->id? */
htmp->next = NULL; /* why not allocate memory for htmp->next? */
strcpy(htmp->word, w); copies w into that new block of memory. htmp->id = id; assigns the value of id to htmp->id and no allocation is required because when you allocate:
htmp = malloc(sizeof(HASHREC)); /* useless cast removed */
You allocate storage for a (1) char * pointer, (2) a long long id; and (3) a struct hashrec* pointer -- you have already allocated for a long long so htmp->id can store the value of id in the memory for the long long.
htmp->next = NULL; /* why not allocate memory for htmp->next? */
What is it that you are attempting to store that would require new allocation for htmp->next? (hint: nothing currently) It will point to the next struct hashrec. Currently it is initialized to NULL so that the next time you iterate to the end of all the struct hashrec next pointers, you know you are at the end when you reach NULL.
Another way to think of it is that the previous struct hashrec next can now point to this node you just allocated. Again no additional allocation is required, the previous node ->next pointer simply points to the next node in sequence, not requiring any specific new memory to be allocated. It is just used as a reference to refer (or point to) the next node in the chain.
There is a lot of information here, but when you go through the thought process of determining (1) in what memory is the thing I want to store stored?; and (2) what variable (pointer) holds the address to where it is stored so I can find it again... -- things start to fall into place. Hope this helps.
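The reasoning above can be sketched as a pair of helpers: one allocation for the struct, one for the characters that word points to, and none for id or next. The names new_rec and free_rec are hypothetical and do not appear in the original code.

```c
#include <stdlib.h>
#include <string.h>

typedef struct hashrec {
    char *word;
    long long id;
    struct hashrec *next;
} HASHREC;

/* Create a record holding a private copy of w. Returns NULL on failure. */
HASHREC *new_rec(const char *w, long long id)
{
    HASHREC *rec = malloc(sizeof *rec);   /* room for word, id and next */
    if (rec == NULL)
        return NULL;

    rec->word = malloc(strlen(w) + 1);    /* separate block for the characters */
    if (rec->word == NULL) {
        free(rec);
        return NULL;
    }
    strcpy(rec->word, w);

    rec->id = id;      /* stored inside the struct itself */
    rec->next = NULL;  /* points nowhere yet */
    return rec;
}

/* Release a record: the string first, then the struct that points to it. */
void free_rec(HASHREC *rec)
{
    if (rec != NULL) {
        free(rec->word);
        free(rec);
    }
}
```

Two calls to malloc mean two calls to free, in the reverse order, as free_rec shows.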
When you allocate the memory for the struct, only enough memory to hold a pointer is allocated for the member word, that pointer also has to point to valid memory which you can then allocate.
Without making it point to valid memory, its value is indeterminate, and trying to dereference a pointer with an indeterminate value is undefined behavior.
malloc(size) will allocate memory of given size and returns starting address of the allocated memory. Addresses are stored in pointer variables.
In char *word, variable word is of pointer type and it can only hold a pointer of type char i.e., address of a character data in memory.
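So, when you allocate memory for the struct that htmp points to, it holds room for each member: typically 8 bytes for the pointer word, 8 bytes for id, and 8 bytes for the pointer next on a 64-bit system (pointers are usually 4 bytes on a 32-bit system).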
Now, lets assume that you want to store following into your struct:
{
word = Elephant
id = 1245
}
Now you can save id directly into id member by doing this htmp->id = 1245. But you also want to save a word "Elephant" in your struct, how to achieve this?
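htmp->word = 'Elephant' is not valid, because single quotes denote a character constant, not a string, and word is a pointer. Assigning a string literal, htmp->word = "Elephant", would compile, but it only stores the address of the literal; it does not copy anything, and in the code above the text comes from a buffer w that may change later. So you allocate memory to store the actual string Elephant and store the starting address of that memory in htmp->word.
Instead of using char *word in your struct, you could have used char word[size], which needs no separate allocation for the member word. The reason for not doing so is that you would have to pick a fixed size in advance, which wastes memory when you store fewer characters and falls short when the word is too long.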

Freeing a pointer in a structure referenced by a pointer

I have a pointer to several structures that have been allocated memory via:
STRUCTNAME *ptr;
ptr = (STRUCTNAME *)malloc(sizeof(STRUCTNAME)*numberOfStructs);
The structures are accessed via a offset like so:
(ptr + i)->field;
The structures have 2 fields that are character pointers as follows:
typedef struct
{
char *first;
char *second;
} STRUCTNAME;
These fields are allocated memory as follows:
(ptr + i)->first = (char *)malloc(strlen(buffer));
This appears to work but when I try to free the pointers within the structures I get a segmentation fault 11 when I do this:
free((ptr + i)->first);
Help?
Notes:
buffer is a character array. Offsetting a pointer by an integer should increment the pointer by the size of what it is pointing to times the integer, correct?
Here is a link to my full source code. I have not written some of the functions and I am not using the freeAllpointers and printAll yet.
https://drive.google.com/file/d/0B6UPDg-HHAHfdjhUSU95aEVBb0U/edit?usp=sharing
OH! Thanks everyone! Have a happy Thanksgiving! =D (If you're into that kinda stuff)
In case you don't initialize all those members in the part of the code you're not showing us:
Allocate the struct storage (STRUCTNAME*) with calloc(), so that all allocated memory, namely first and second, is zero at the beginning. Passing NULL to free() will result in a no-op. Passing any wild (garbage) pointer to free() may cause a segmentation fault.
To detect a double-free, set ptr[i].first = NULL; after free(ptr[i].first); as a defensive measure for testing.
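A minimal sketch of that pattern follows. The helper names alloc_structs, set_field and free_structs are hypothetical. It assumes a platform where all-zero bytes represent a null pointer, which holds on mainstream systems but is not guaranteed by the C standard. Note that the copy reserves strlen(text) + 1 bytes so the terminating '\0' fits.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *first;
    char *second;
} STRUCTNAME;

/* Allocate count structs with every byte set to zero. */
STRUCTNAME *alloc_structs(size_t count)
{
    return calloc(count, sizeof(STRUCTNAME));
}

/* Store a private copy of text in *field. Returns 0 on success, -1 on failure. */
int set_field(char **field, const char *text)
{
    char *copy = malloc(strlen(text) + 1);  /* +1 for the terminating '\0' */
    if (copy == NULL)
        return -1;
    strcpy(copy, text);
    free(*field);   /* a no-op when *field is NULL */
    *field = copy;
    return 0;
}

/* Free every string, then the array itself. */
void free_structs(STRUCTNAME *ptr, size_t count)
{
    if (ptr == NULL)
        return;
    for (size_t i = 0; i < count; i++) {
        free(ptr[i].first);
        ptr[i].first = NULL;   /* defensive: makes a later double free harmless */
        free(ptr[i].second);
        ptr[i].second = NULL;
    }
    free(ptr);
}
```

Fields that were never assigned stay NULL, so free_structs can be called safely even when only some of the strings were filled in.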
Notes: buffer is a character array. Offsetting a pointer by a integer
should increment the pointer by the size of what it is pointing to
times the integer correct?
Yes, except for void*: void is an incomplete type with no size, so the C standard does not define arithmetic on void*. Some compilers (GCC, as an extension) treat sizeof(void) as 1 and allow it anyway. See: What is the size of void?
Edit:
void makeReviews(FILE *input, REVIEW *rPtr, int numReviews) <-- This does NOT return the new value of rPtr. In main(), it will remain NULL.
Do something like this:
REVIEW* makeReviews(FILE *input, int numReviews);
//...
int main(){
//...
rPtr = makeReviews(input,numReviews);
//...
}
or
void makeReviews(FILE *input, REVIEW **rPtrPtr, int numReviews){
REVIEW* rPtr = *rPtrPtr;
//...
*rPtrPtr = rPtr;
}
//...
int main(){
//...
makeReviews(input,&rPtr,numReviews);
//...
}
fgets(cNumReviews, sizeof(cNumReviews), input); <-- Perhaps, you could use something like fscanf().

how to convert a dynamically numerical string into a dynamically array of integers

struct integer* convert_integer(char* stringInt)
{
struct integer* convertedInt_1;
char* stringArray3 = (char *) malloc (sizeof(char));;
free(stringArray3);
stringArray3 = stringInt;
convertedInt_1->digits = atoi(stringArray3);
stringArray4 = stringInt;
}
This is a sample of the code. It gives me the warning "assignment makes pointer from integer without a cast" when I use the C standard library.
So I need to know how to convert a dynamically allocated numeric string into a dynamically allocated struct integer.
You do not need any dynamic allocation for char string here, nor do you need an additional char * pointer.
struct integer* convert_integer(char* stringInt)
{
/*Allocate memory to structure,You cannot return pointer to local structure*/
struct integer* convertedInt_1 = (struct integer*)malloc(sizeof(*convertedInt_1));
/*Convert the string to integer*/
int i = atoi(stringInt);
/*Assign converted integer to structure member*/
convertedInt_1->digits = i;
/*return pointer to heap allocated structure*/
return convertedInt_1 ;
}
There are a lot of problems with this code, but I'll try to walk you through them.
One, you malloc only one char worth of memory, not the amount of memory needed to hold your array. You really need to include an argument for the size of the array to be changed if the string is not null terminated.
Second, you're trying to use memory after you free it. This is bad. Very bad. You should only free memory after you're finished with it.
Next, you're trying to atoi the entire array at once. This is going to try to convert the entire string into one number, not one int per digit.
What I think you want, is to convert each character from stringInt to a (single digit) int in your result. For this, use a for loop to iterate through the array.
I'm pretty sure you want to be using int and not integer.
Last, you forgot to return anything - this doesn't compile.
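A minimal sketch of the per-digit conversion described above. The layout of struct integer is an assumption (the question does not show it), with digits taken to be a dynamically allocated int array and size its length; free_integer is a hypothetical helper.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the question does not show struct integer. */
struct integer {
    int *digits;  /* one decimal digit per element */
    size_t size;  /* number of digits */
};

/* Convert a string of decimal digits into a heap-allocated struct integer.
   Returns NULL on empty input, a non-digit character, or allocation failure. */
struct integer *convert_integer(const char *stringInt)
{
    size_t len = strlen(stringInt);
    if (len == 0)
        return NULL;

    struct integer *result = malloc(sizeof *result);
    if (result == NULL)
        return NULL;

    result->digits = malloc(len * sizeof *result->digits);
    if (result->digits == NULL) {
        free(result);
        return NULL;
    }

    for (size_t i = 0; i < len; i++) {
        if (!isdigit((unsigned char)stringInt[i])) {
            free(result->digits);
            free(result);
            return NULL;
        }
        result->digits[i] = stringInt[i] - '0';  /* character to digit value */
    }
    result->size = len;
    return result;
}

/* Release a struct integer created by convert_integer. */
void free_integer(struct integer *n)
{
    if (n != NULL) {
        free(n->digits);
        free(n);
    }
}
```

The input string is only read, never freed or modified, so the caller keeps ownership of it.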

Which is correct when allocating memory for a struct in C

Assuming we have a simple struct like:
typedef struct
{
int d1;
int d2;
float f1;
}Type;
Which is the correct way to allocate memory for a new instance of it?
This:
// sizeof *t == sizeof(Type) ( gcc prints 12 bytes)
Type *t = malloc(sizeof *t);
or:
// sizeof pointer always == 4 (in my case also on gcc)
Type *t = malloc(sizeof(t));
Which one is correct?
This is the correct way:
Type *t = malloc(sizeof *t);
Why this is correct?
Because you allocate a block big enough to hold a structure: t is a pointer to Type, so *t has type Type and sizeof *t is the size of the structure.
This is incorrect way:
Type *t = malloc(sizeof(t));
Why this is Incorrect?
sizeof(t) returns the size of the pointer and not of the actual type (i.e., not the size of the structure).
What you need to allocate is a block big enough to hold a structure, not one the size of a pointer to a structure.
Note that on a given system, pointers to object types typically all have the same size, regardless of what they point to.
Why is the first approach better?
With the first approach, when you change Type, the malloc automatically changes size to be the correct value, you do not have to do that explicitly unlike other ways.
Also, the important part of writing a malloc call is finding the correct size to pass. The best way to do this is to look only at the left-hand side of the malloc statement and nowhere else (looking elsewhere is where mistakes creep in). Since the left-hand side is t in this case, the correct size is sizeof(*t).
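A short sketch of the idiom, extended to several elements. The helper name alloc_types is hypothetical.

```c
#include <stdlib.h>

typedef struct {
    int d1;
    int d2;
    float f1;
} Type;

/* Allocate room for count objects of whatever type t points to.
   The size expression never names Type, so it stays correct
   if the declaration of t changes. Returns NULL on failure. */
Type *alloc_types(size_t count)
{
    Type *t = malloc(count * sizeof *t);
    return t;
}
```

Here sizeof *t equals sizeof(Type), whereas sizeof(t) would be the size of a pointer (commonly 4 or 8 bytes) and would under-allocate.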
How to standardize use of malloc?
With above mentioned correct approach there is one problem say, if we want to malloc say 30 elements. Then our malloc expression becomes:
t = (T *) malloc(30 * sizeof (*t));
This is not the preferred way to write a malloc expression, because one can make a mistake when entering the number 30 in the malloc parameter. What we would like is that, irrespective of the number of elements required, the malloc parameter is always the standard sizeof(*x) or something similar.
So here is an approach with an example:
Suppose we have a pointer p, pointing to a single dimensional array of size 20, whose each element is struct node. The declaration will be:
struct node (*p) [20];
Now if we wish to malloc 20 elements of struct node, and wish that pointer p should hold the return address of malloc, then we have
p = (data-type of p) malloc (size of 20 elements of struct node);
To find the data type of p, for casting we just make the variable name disappear or replace p with a blank. So we now have
p = (struct node (*)[20] ) malloc(size of 20 elements of struct node);
We can't go very wrong over here because the compiler will complain if we are wrong. Finally the size! We just do the standard way we have described, that is
p = (struct node (*)[20] ) malloc(sizeof (*p));
And we are done!
Type *t = malloc(sizeof *t);
This is the correct way to allocate the amount of memory needed for a new instance.
Type *t = malloc(sizeof (t));
This will only allocate enough storage for a pointer, not an instance.
sizeof(*t), because t is of type Type*, so that *t is of type Type.
But I would suggest to use this instead, because it's more readable and less error prone:
Type *t = malloc(sizeof(Type));
this one:
Type *t = malloc(sizeof(*t));
you allocate memory for the struct, not for a pointer.
The first is correct. Second will not allocate enough memory, because t has the size of a pointer.
Better yet is
Type *t = malloc(sizeof(Type));
Preferably: Type * t = malloc(sizeof(Type));
Arguably, sizeof *t works as well and allows you to change the actual type of *t without requiring you to modify two separate locations, but using the type rather than an expression in the initial allocation feels more readable and expressive... that's subjective, though. If you want to keep your options open to change the type, I'd personally prefer factoring that change into the typedef rather than the variable declaration/initialization.
