char** args = (char**)malloc(MAX_ARGS*sizeof(char*));
and
char* args = (char*)malloc(MAX_ARGS*sizeof(char*));
Please explain the difference between these two types of declaration. Why do we need 2 stars and why 1 star?
In both cases, the cast is unnecessary and can mask errors; I'll delete the casts in the following. (malloc() returns a void*, which can be implicitly converted to a pointer-to-whatever.)
char **args = malloc(MAX_ARGS*sizeof(char*));
This defines args as a pointer-to-pointer-to-char, and initializes it to point to a chunk of memory big enough to hold MAX_ARGS elements, each of which is a char*. Once you've done this, you'll want to assign values to those char* elements, likely making them point to strings.
char *args = malloc(MAX_ARGS*sizeof(char*));
This is legal, but almost certainly a logical error. args is a pointer-to-char, which means it can point either to a single char object, or to the first element of an array of char elements. But you're allocating memory that can hold MAX_ARGS pointers.
A more likely thing to do is:
char *s = malloc(MAX_LEN);
which will cause s to point to a region of memory that can hold MAX_LEN char elements (a string of length up to MAX_LEN - 1). (Note that sizeof (char) == 1 by definition.)
There's a useful trick to avoid type mismatches. A pointer of type FOO*, for any type FOO, needs to point to a chunk of memory big enough to hold one or more elements of type FOO. If you write:
ptr = malloc(count * sizeof *ptr);
and ptr is a FOO*, then sizeof *ptr is the same as sizeof (FOO) -- but you won't have to update the line if you later change ptr to be a pointer to BAR.
In your case, the pointed-to type is itself a pointer, so you can write:
char **args = malloc(MAX_ARGS * sizeof *args);
And when you call malloc, you should always check whether it succeeded, and take some action if it failed -- even if that action is to terminate the program with an error message:
if (args == NULL) {
fprintf(stderr, "malloc failed\n");
exit(EXIT_FAILURE);
}
Two stars behave a bit like a matrix (a two-dimensional array). The first malloc only reserves memory for the row of pointers; you have to iterate over its elements and call malloc a second time for each one to reserve space for the data it points to.
The two stars mean pointer-to-pointer-to-data_type. So a char** initialized this way points to the first element of a dynamically allocated block that can be used like a char*[MAX_ARGS] array.
So for this case the following would be valid as args[n] is a char*:
args[0] = strdup("aaa");
args[1] = strdup("bbbb");
args[2] = strdup("ccccc");
NOTE: you should remember to free each member of the array prior to freeing args otherwise the pointers are lost.
With a single star it is just a pointer-to-data and essentially you are writing char[MAX_ARGS*sizeof(char*)]
For this args[n] is a char type:
args[0] = 'a';
args[1] = 'b';
args[2] = 'c';
With this you are creating a simple array of characters. It becomes a C-style string only once the characters are followed by a terminating '\0'.
I have this program:
#include<stdio.h>
void copy_string(char string1[], char string2[]){
int counter=0;
while(string1[counter]!='\0'){
string2[counter] = string1[counter];
counter++;
}
string2[counter] = '\0';
}
int main() {
char* myString = "Hello there!";
char* myStringCopy;
copy_string(myString, myStringCopy);
printf("%s", myStringCopy);
}
My question is, why isn't it working unless I declare myStringCopy as a fixed-size variable (char myStringCopy[12];)? Shouldn't it work if I add a \0 character after the copy as I'm doing?
It can work by doing char* myStringCopy as long as you allocate memory space for it.
for example
char* myStringCopy;
myStringCopy = malloc(sizeof(char) * (strlen(myString)+1));
The +1 is needed: it makes room for the terminating null character.
char myStringCopy[12]; tells the compiler to create an array of 12 char. When myStringCopy is passed to copy_string, this array is automatically converted to a pointer to its first element, so copy_string receives a pointer to the characters.
char *myStringCopy; tells the compiler to create a pointer to char. The compiler creates this pointer, including providing memory for it, but it does not set the value of the pointer. When this pointer is passed to copy_string, copy_string does not receive a valid value.
To make char *myStringCopy; work, you must allocate memory (which you can do with malloc). For example, you could use:
char *myStringCopy;
myStringCopy = malloc(13 * sizeof *myStringCopy);
if (myStringCopy == NULL)
{
fprintf(stderr, "Error, the malloc did not work.\n");
exit(EXIT_FAILURE);
}
Also, note that 12 is not enough. The string “Hello there!” contains 12 characters, but it also includes a terminating null character. You must provide space for the null character. char myStringCopy[12]; appeared to work, but copy_string was actually writing a thirteenth character beyond the array, damaging something else in your program.
The problem is that you don't have room for myStringCopy.
You need to reserve space first:
char* myString = "Hello there!";
char* myStringCopy = malloc(strlen(myString) + 1);
char* myStringCopy;
This is only a pointer to char. You must allocate memory for myStringCopy to point to before you start copying. When you declare it like this:
char myStringCopy[12];
the compiler allocates the memory for the array on the stack.
I am working on a short program that reads a .txt file. Initially, I was playing around in the main function, and I had gotten my code to work just fine. Later, I decided to abstract it into a function. Now I cannot seem to get my code to work, and I have been hung up on this problem for quite some time.
I think my biggest issue is that I don't really understand what is going on at a memory/hardware level. I understand that a pointer simply holds a memory address, and a pointer to a pointer simply holds a memory address to an another memory address, a short breadcrumb trail to what we really want.
Yet, now that I am introducing malloc() to expand the amount of memory allocated, I seem to lose sight of what's going on. In fact, I am not really sure how to think of memory at all anymore.
So, a char takes up a single byte, correct?
If I understand correctly, then by default a char* takes up a single byte of memory?
If we were to have a:
char* str = "hello"
Would it be safe to say that it takes up 6 bytes of memory (including the null character)?
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Now, if you would judge my interpretation. We are telling the compiler that we need "size" number of contiguous memory reserved for chars. If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
// Add characters to buffer
int i = 0;
char c;
while((c=fgetc(file))!=EOF){
*(buffer + i) = (char)c;
i++;
}
Adding the characters to the buffer and allocating the memory is what I cannot seem to wrap my head around.
If **buffer is pointing to *str which is equal to null, then how do I allocate memory to *str and add characters to it?
I understand that this is lengthy, but I appreciate the time you all are taking to read this! Let me know if I can clarify anything.
EDIT:
Whoa, my code is working now, thanks so much!
Although, I don't know why this works:
*((*buffer) + i) = (char)c;
So, a char takes up a single byte, correct?
Yes.
If I understand correctly, by default a char* takes up a single byte of memory.
Your wording is somewhat ambiguous. A char takes up a single byte of memory. A char * can point to one char, i.e. one byte of memory, or a char array, i.e. multiple bytes of memory.
The pointer itself takes up more than a single byte. The exact value is implementation-defined, usually 4 bytes (32bit) or 8 bytes (64bit). You can check the exact value with printf( "%zu\n", sizeof(char *) ).
If we were to have a char* str = "hello", would it be safe to say that it takes up 6 bytes of memory (including the null character)?
Yes, the string literal itself occupies 6 bytes. The pointer str is a separate object with its own size.
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Do not cast the result of malloc. And sizeof(char) is by definition always 1.
If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Yes. Well, almost. str* makes no sense, and it's 10 chars, not 10 memory addresses. But str would point to the first of the 10 chars, yes.
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
No. You would write *buffer = malloc( size );. The idea is that the memory you are allocating inside the function can be addressed by the caller of the function. So the pointer provided by the caller -- str, which is NULL at the point of the call -- needs to be changed. That is why the caller passes the address of str, so you can write the pointer returned by malloc() to that address. After your function returns, the caller's str will no longer be NULL, but contain the address returned by malloc().
buffer is the address of str, passed to the function by value. Assigning to buffer would only change that (local) pointer value.
Assigning to *buffer, on the other hand, is the same as assigning to str. The caller will "see" the change to str after your file_read() returns.
Although, I don't know why this works: *((*buffer) + i) = (char)c;
buffer is the address of str.
*buffer is, basically, the same as str -- a pointer to char (array).
(*buffer) + i is pointer arithmetic -- the pointer *buffer plus i means a pointer to the ith element of the array.
*((*buffer) + i) is dereferencing that pointer to the ith element -- a single char.
to which you are then assigning (char)c.
A simpler expression doing the same thing would be:
(*buffer)[i] = (char)c;
With char **buffer, buffer stands for the pointer to the pointer to the char, *buffer accesses the pointer to a char, and **buffer accesses the char value itself.
To pass back a pointer to a new array of chars, write *buffer = malloc(size).
To write values into the char array, write *((*buffer) + i) = c, or (probably simpler) (*buffer)[i] = c
See the following snippet demonstrating what's going on:
#include <stdio.h>
#include <stdlib.h>
void generate0to9(char** buffer) {
*buffer = malloc(11); // *buffer dereferences the pointer to the pointer buffer one time, i.e. it writes a (new) pointer value into the address passed in by `buffer`
for (int i=0;i<=9;i++) {
//*((*buffer)+i) = '0' + i;
(*buffer)[i] = '0' + i;
}
(*buffer)[10]='\0';
}
int main(void) {
char *b = NULL;
generate0to9(&b); // pass a pointer to the pointer b, such that the pointer`s value can be changed in the function
printf("b: %s\n", b);
free(b);
return 0;
}
Output:
b: 0123456789
Is this:
char* s1[size];
the same as this:
char** s2 = malloc(size * sizeof(char*));
Is there any difference between them?
Theoretically, *arr[] and **arr are different. For example :
char *arr[size]; //case 1
Here arr is an array of size size whose elements are of the type char*
Whereas,
char **arr; //case2
Here arr is itself a pointer to the type char*
Note: In case 1 the array arr decays to a pointer of type char** in most expressions, but it's not possible the other way around, i.e. the pointer in case 2 cannot become an array.
char* s1[size];
Is an array of pointers to char, allocated on the stack.
char** s2 = malloc(size * sizeof(char*));
Is a pointer of type char ** that lives on the stack but points to a dynamic array of char * pointers allocated on the heap.
The two differ in terms of scope and the usual difference between arrays and pointers.
There are few differences:
s1 is not a modifiable lvalue, so it cannot be changed with assignment or increment operators. In this respect it resembles char** const s1, which also does not allow modification (though there the cause is the const qualifier).
Applying & to the array yields the address of the whole array, which is numerically equal to the address of its first element (though the type differs: char *(*)[size] rather than char **). Applying & to a pointer variable yields the address of the variable itself, not the address it holds:
assert((void*)&s1 == (void*)s1);
assert((void*)&s2 != (void*)s2);
sizeof() used on array will return array size, while sizeof() used on pointer will return pointer size - usually it will be the same as sizeof(void*), but C standard does not require this (see comments below):
assert(sizeof(s1) == size * sizeof(char*));
assert(sizeof(s1) == size * sizeof(s1[0])); // this is the same
assert(sizeof(s2) == sizeof(void*)); // on some platforms this may fail
And of course the obvious one: the array s1 lives on the stack, while the memory s2 points to lives on the heap. Because of this, s1 is destroyed automatically when execution leaves the current scope, whereas the memory behind s2 must be released with a call to free.
Update: here is example code which checks above asserts:
#include <assert.h>
#include <stdlib.h>
int main()
{
const int size = 22;
char* s1[size];
char** s2 = (char**)malloc(size * sizeof(char*));
assert((void*)&s1 == (void*)s1);
assert((void*)&s2 != (void*)s2);
assert(sizeof(s1) == size * sizeof(char*));
assert(sizeof(s1) == size * sizeof(s1[0])); // this is the same
assert(sizeof(s2) == sizeof(void*)); // on some platforms this may fail
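// Attempts to modify value
char** const s3 = s1;
++s2; // allowed: s2 is an ordinary, modifiable pointer
--s2; // restore the value malloc returned before freeing it
free(s2);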
//++s1; // compilation error - lvalue required as increment operand
//++s3; // compilation error - increment of read-only variable ‘s3’
return 0;
}
s1 is an array, s2 is a pointer. s2 points to the first element of the malloced array.
The array s1 has automatic storage duration, while the array which s2 points to has dynamic storage duration.
Also, in C89 char* s1[size]; is valid only if size is a constant expression, because C89 doesn't support variable-length arrays.
In a program I am writing I made a Tokenize function that says:
TokenizerT *Tokenize(TokenizerT *str) {
TokenizerT *tok;
*tok->array = malloc(sizeof(TokenizerT));
char * arr = malloc(sizeof(50));
const char *s = str->input_strng;
int i = 0;
char *ds = malloc(strlen(s) + 1);
strcpy(ds, s);
*tok->array[i] = strtok(ds, " ");
while(*tok->array[i]) {
*tok->array[++i] = strtok(NULL, " ");
}
free(ds);
return tok;
}
where TokenizerT is defined as:
struct TokenizerT_ {
char * input_strng;
int count;
char **array[];
};
So what I am trying to do is create smaller tokens out of a large token that I already created. I had issues returning an array so I made array part of the TokenizerT struct so I can access it by doing tok->array. I am getting no errors when I build the program, but when I try to print the tokens I get issues.
TokenizerT *ans;
TokenizerT *a = Tokenize(tkstr);
char ** ab = a->array;
ans = TKCreate(ab[0]);
printf("%s", ans->input_strng);
TKCreate works because I use it to print argv, but when I try to print ab it does not work. I figured it would be like argv and so would work as well. If someone can help me it would be greatly appreciated. Thank you.
Creating the Tokenizer
I'm going to go out on a limb, and guess that the intent of:
TokenizerT *tok;
*tok->array = malloc(sizeof(TokenizerT));
char * arr = malloc(sizeof(50));
was to dynamically allocate a single TokenizerT with the capacity to contain 49 strings and a NULL endmarker. arr is not used anywhere in the code, and tok is never given a value; it seems to make more sense if the values are each shifted one statement up, and corrected:
// Note: I use 'sizeof *tok' instead of naming the type because that's
// my style; it allows me to easily change the type of the variable
// being assigned to. I leave out the parentheses because
// that makes sure that I don't provide a type.
// Not everyone likes this convention, but it has worked pretty
// well for me over the years. If you prefer, you could just as
// well use sizeof(TokenizerT).
TokenizerT *tok = malloc(sizeof *tok);
// (See the third section of the answer for why this is not *tok->array)
tok->array = malloc(50 * sizeof *tok->array);
(tok->array is not a great name. I would have used tok->argv since you are apparently trying to produce an argument vector, and that's the conventional name for one. In that case, tok->count would probably be tok->argc, but I don't know what your intention for that member is since you never use it.)
Filling in the argument vector
strtok will overwrite (some) bytes in the character string it is given, so it is entirely correct to create a copy (here ds), and your code to do so is correct. But note that all of the pointers returned by strtok are pointers to characters in the copy. So when you call free(ds), you free the storage occupied by all of those tokens, which means that your freshly-created TokenizerT, which you are just about to return to an unsuspecting caller, is full of dangling pointers. So that will never do; you need to avoid freeing those strings until the argument vector is no longer needed.
But that leads to another problem: how will the string be freed? You don't save the value of ds, and it is possible that the first token returned by strtok does not start at the beginning of ds. (That will happen if the first character in the string is a space character.) And if you don't have a pointer to the very beginning of the allocated storage, you cannot free the storage.
The TokenizerT struct
char is a character (usually a byte). char* is a pointer to a character, which is usually (but not necessarily) a pointer to the beginning of a NUL-terminated string. char** is a pointer to a character pointer, which is usually (but not necessarily) the first character pointer in an array of character pointers.
So what is char** array[]? (Note the trailing []). "Obviously", it's an array of unspecified length of char**. Because the length of the array is not specified, it is an "incomplete type". Using an incomplete array type as the last element in a struct is allowed by modern C, but it requires you to know what you're doing. If you use sizeof(TokenizerT), you'll end up with the size of the struct without the incomplete type; that is, as though the size of the array had been 0 (although that's technically illegal).
At any rate, that wasn't what you wanted. What you wanted was a simple char**, which is the type of an argument vector. (It's not the same as char*[] but both of those pointers can be indexed by an integer i to return the ith string in the vector, so it's probably good enough.)
That's not all that's wrong with this code, but it's a good start at fixing it. Good luck.
I'm attempting to run execvp using the data from a char[][] type (aka an array of strings). Now I know that execvp() takes a pointer to a string as its first parameter and then a pointer to an array of strings as its second - in fact I have even used it successfully before as such - however I cannot seem to get the correct combination of pointers & strings to get it to work out below - whatever I try is deemed incompatible!
Any help very grateful :) - I've removed my headers to compact down the code a bit!
struct userinput {
char anyargs[30][30]; //The tokenised command
};
int main() {
struct userinput input = { { { 0 } } }; //I believe is valid to set input to 0's
struct userinput *inPtr = &input; //Pointer to input (direct access will be unavailable)
strcpy(inPtr->anyargs[0], "ls"); //Hard code anyargs to arbitary values
strcpy(inPtr->anyargs[1], "-lh");
char (*arrPointer)[30]; //Pointer to an array of char *
arrPointer = &(inPtr->anyargs[0]);
printf("arrPointer[0]: %s, arrPointer[1]: %s\n", arrPointer[0],
arrPointer[1]);
printf("At exec case; ");
execvp( arrPointer[0], arrPointer);
perror("Command not recognised"); //Prints string then error message from errno
return 0;
}
There is no such thing as char[][] in C. execvp requires an array of const pointers to char. This can be written as either char * const * or char * const [].
You however have an array of 30-characters-long arrays, not an array of pointers. The two types are not compatible, not interchangeable, and not convertible one to another in either direction.
In this line
char (*arrPointer)[30]; //Pointer to an array of char *
you attempt to declare a pointer to an array of char*, incorrectly. What you have declared instead is a pointer to char[30], which is very different from what execvp expects.
The next line
arrPointer = &(inPtr->anyargs[0]);
purports to initialize a pointer to an array of char* with a pointer to char[30], which cannot possibly be correct even if you declare a pointer to an array of char*, because the right hand side of the assignment is not a pointer to an array of char*, it's a pointer to char[30] and no sequence of casts, indices, addresses and dereferences will turn one to the other.
An array of 30 pointers to char is declared like this:
char* arguments[30];
A dynamically-sized array of pointers to char is made like this:
char** arguments = calloc (nargs, sizeof(char*));
You need to use one of those if you want to call execvp.
In either case each pointer in the array of pointers must be initialized to point to an individual NUL-terminated character array (possibly to elements of your char[30][30] array) and the last pointer (one after all the arguments we want to pass) must be set to NULL. (I wonder how you expected to find a NULL in a char[30][30]).
The execvp() expects as second argument a char *const argv[]. This means an array of pointers to char. This is different from a char[30][30] which is represented in memory as 30x30 contiguous chars (so no pointer).
To solve this, define your structure
struct userinput {
char *anyargs[30]; //space for 30 char* pointers
};
You could as well define anyargs as char** and initialize it dynamically with (char**)calloc(number_of_args+1,sizeof(char*))
Later, assign directly the pointers:
inPtr->anyargs[0] = "ls"; //Hard code (or use strdup() )
inPtr->anyargs[1] = "-lh";
inPtr->anyargs[2] = NULL; // end of the argument list !!!
char **arrPointer; //Pointer to an array of char *
arrPointer = inPtr->anyargs;
Edit: Caution: "The array of pointers must be terminated by a NULL pointer.".