Related
This is well known code to compute array length in C:
sizeof(array)/sizeof(type)
But I can't seem to find out the length of the array passed as an argument to a function:
#include <stdio.h>
int length(const char* array[]) {
return sizeof(array)/sizeof(char*);
}
int main() {
const char* friends[] = { "John", "Jack", "Jim" };
printf("%d %d", sizeof(friends)/sizeof(char*), length(friends)); // 3 1
}
I assume that array is copied by value to the function argument as constant pointer and reference to it should solve this, but this declaration is not valid:
int length(const char**& array);
I find passing the array length as second argument to be redundant information, but why is the standard declaration of main like this:
int main(int argc, char** argv);
Please explain if it is possible to find out the array length in function argument, and if so, why is there the redundancy in main.
sizeof only works to find the length of the array if you apply it to the original array.
int a[5]; //real array. NOT a pointer
sizeof(a); // :)
However, by the time the array decays into a pointer, sizeof will give the size of the pointer and not of the array.
int a[5];
int * p = a;
sizeof(p); // :(
As you have already smartly pointed out main receives the length of the array as an argument (argc). Yes, this is out of necessity and is not redundant. (Well, it is kind of reduntant since argv is conveniently terminated by a null pointer but I digress)
There is some reasoning as to why this would take place. How could we make things so that a C array also knows its length?
A first idea would be not having arrays decaying into pointers when they are passed to a function and continuing to keep the array length in the type system. The bad thing about this is that you would need to have a separate function for every possible array length and doing so is not a good idea. (Pascal did this and some people think this is one of the reasons it "lost" to C)
A second idea is storing the array length next to the array, just like any modern programming language does:
a -> [5];[0,0,0,0,0]
But then you are just creating an invisible struct behind the scenes and the C philosophy does not approve of this kind of overhead. That said, creating such a struct yourself is often a good idea for some sorts of problems:
struct {
size_t length;
int * elements;
}
Another thing you can think about is how strings in C are null terminated instead of storing a length (as in Pascal). To store a length without worrying about limits need a whopping four bytes, an unimaginably expensive amount (at least back then). One could wonder if arrays could be also null terminated like that but then how would you allow the array to store a null?
The array decays to a pointer when passed.
Section 6.4 of the C FAQ covers this very well and provides the K&R references etc.
That aside, imagine it were possible for the function to know the size of the memory allocated in a pointer. You could call the function two or more times, each time with different input arrays that were potentially different lengths; the length would therefore have to be passed in as a secret hidden variable somehow. And then consider if you passed in an offset into another array, or an array allocated on the heap (malloc and all being library functions - something the compiler links to, rather than sees and reasons about the body of).
Its getting difficult to imagine how this might work without some behind-the-scenes slice objects and such right?
Symbian did have a AllocSize() function that returned the size of an allocation with malloc(); this only worked for the literal pointer returned by the malloc, and you'd get gobbledygook or a crash if you asked it to know the size of an invalid pointer or a pointer offset from one.
You don't want to believe its not possible, but it genuinely isn't. The only way to know the length of something passed into a function is to track the length yourself and pass it in yourself as a separate explicit parameter.
As stated by #Will, the decay happens during the parameter passing. One way to get around it is to pass the number of elements. To add onto this, you may find the _countof() macro useful - it does the equivalent of what you've done ;)
First, a better usage to compute number of elements when the actual array declaration is in scope is:
sizeof array / sizeof array[0]
This way you don't repeat the type name, which of course could change in the declaration and make you end up with an incorrect length computation. This is a typical case of don't repeat yourself.
Second, as a minor point, please note that sizeof is not a function, so the expression above doesn't need any parenthesis around the argument to sizeof.
Third, C doesn't have references so your usage of & in a declaration won't work.
I agree that the proper C solution is to pass the length (using the size_t type) as a separate argument, and use sizeof at the place the call is being made if the argument is a "real" array.
Note that often you work with memory returned by e.g. malloc(), and in those cases you never have a "true" array to compute the size off of, so designing the function to use an element count is more flexible.
Regarding int main():
According to the Standard, argv points to a NULL-terminated array (of pointers to null-terminated strings). (5.1.2.2.1:1).
That is, argv = (char **){ argv[0], ..., argv[argc - 1], 0 };.
Hence, size calculation is performed by a function which is a trivial modification of strlen().
argc is only there to make argv length calculation O(1).
The count-until-NULL method will NOT work for generic array input. You will need to manually specify size as a second argument.
This is a old question, and the OP seems to mix C++ and C in his intends/examples. In C, when you pass a array to a function, it's decayed to pointer. So, there is no way to pass the array size except by using a second argument in your function that stores the array size:
void func(int A[])
// should be instead: void func(int * A, const size_t elemCountInA)
They are very few cases, where you don't need this, like when you're using multidimensional arrays:
void func(int A[3][whatever here]) // That's almost as if read "int* A[3]"
Using the array notation in a function signature is still useful, for the developer, as it might be an help to tell how many elements your functions expects. For example:
void vec_add(float out[3], float in0[3], float in1[3])
is easier to understand than this one (although, nothing prevent accessing the 4th element in the function in both functions):
void vec_add(float * out, float * in0, float * in1)
If you were to use C++, then you can actually capture the array size and get what you expect:
template <size_t N>
void vec_add(float (&out)[N], float (&in0)[N], float (&in1)[N])
{
for (size_t i = 0; i < N; i++)
out[i] = in0[i] + in1[i];
}
In that case, the compiler will ensure that you're not adding a 4D vector with a 2D vector (which is not possible in C without passing the dimension of each dimension as arguments of the function). There will be as many instance of the vec_add function as the number of dimensions used for your vectors.
int arsize(int st1[]) {
int i = 0;
for (i; !(st1[i] & (1 << 30)); i++);
return i;
}
This works for me :)
length of an array(type int) with sizeof:
sizeof(array)/sizeof(int)
Best example is here
thanks #define SIZE 10
void size(int arr[SIZE])
{
printf("size of array is:%d\n",sizeof(arr));
}
int main()
{
int arr[SIZE];
size(arr);
return 0;
}
I've always used string constants in C as one of the following
char *filename = "foo.txt";
const char *s = "bar"; /* preferably this or the next one */
const char * const s3 = "baz":
But, after reading this, now I'm wondering, should I be declaring my string constants as
const char s4[] = "bux";
?
Please note that linked question suggested as a duplicate is different because this one is specifically asking about constant strings. I know how the types are different and how they are stored. The array version in that question is not const-qualified. This was a simple question as to whether I should use constant array for constant strings vs. the pointer version I had been using. The answers here have answered my question, when two days of searching on SO and Google did not yield an exact answer. Thanks to these answers, I've learned that the compiler can do special things when the array is marked const, and there are indeed (at least one) case where I will now be using the array version.
Pointer and arrays are different. Defining string constants as pointers or arrays fits different purposes.
When you define a global string constant that is not subject to change, I would recommend you make it a const array:
const char product_name[] = "The program version 3";
Defining it as const char *product_name = "The program version 3"; actually defines 2 objects: the string constant itself, which will reside in a constant segment, and the pointer which can be changed to point to another string or set to NULL.
Conversely, defining a string constant as a local variable would be better done as a local pointer variable of type const char *, initialized with the address of a string constant:
int main() {
const char *s1 = "world";
printf("Hello %s\n", s1);
return 0;
}
If you define this one as an array, depending on the compiler and usage inside the function, the code will make space for the array on the stack and initialize it by copying the string constant into it, a more costly operation for long strings.
Note also that const char const *s3 = "baz"; is a redundant form of const char *s3 = "baz";. It is different from const char * const s3 = "baz"; which defines a constant pointer to a constant array of characters.
Finally, string constants are immutable and as such should have type const char []. The C Standard purposely allows programmers to store their addresses into non const pointers as in char *s2 = "hello"; to avoid producing warnings for legacy code. In new code, it is highly advisable to always use const char * pointers to manipulate string constants. This may force you to declare function arguments as const char * when the function does not change the string contents. This process is known as constification and avoid subtile bugs.
Note that some functions violate this const propagation: strchr() does not modify the string received, declared as const char *, but returns a char *. It is therefore possible to store a pointer to a string constant into a plain char * pointer this way:
char *p = strchr("Hello World\n", 'H');
This problem is solved in C++ via overloading. C programmers must deal with this as a shortcoming. An even more annoying situation is that of strtol() where the address of a char * is passed and a cast is required to preserve proper constness.
The linked article explores a small artificial situation, and the difference demonstrated vanishes if you insert const after * in const char *ptr = "Lorum ipsum"; (tested in Apple LLVM 10.0.0 with clang-1000.11.45.5).
The fact the compiler had to load ptr arose entirely from the fact it could be changed in some other module not visible to the compiler. Making the pointer const eliminates that, and the compiler can prepare the address of the string directly, without loading the pointer.
If you are going to declare a pointer to a string and never change the pointer, then declare it as static const char * const ptr = "string";, and the compiler can happily provide the address of the string whenever the value of ptr is used. It does not need to actually load the contents of ptr from memory, since it can never change and will be known to point to wherever the compiler chooses to store the string. This is then the same as static const char array[] = "string";—whenever the address of the array is needed, the compiler can provide it from its knowledge of where it chose to store the array.
Furthermore, with the static specifier, ptr cannot be known outside the translation unit (the file being compiled), so the compiler can remove it during optimization (as long as you have not taken its address, perhaps when passing it to another routine outside the translation unit). The result should be no differences between the pointer method and the array method.
Rule of thumb: Tell the compiler as much as you know about stuff: If it will never change, mark it const. If it is local to the current module, mark it static. The more information the compiler has, the more it can optimize.
From the performance perspective, this is a fairly small optimization which makes sense for low-level code that needs to run with the lowest possible latency.
However, I would argue that const char s3[] = "bux"; is better from the semantic perspective, because the type of the right hand side is closer to type of the left hand side. For that reason, I think it makes sense to declare string constants with the array syntax.
I was doing some pointers and arrays practice in C and I noticed all my four methods returned the same answer.My question is are there disadvantages of using any one of my below methods? I am stunned at how all these four give me the same output. I just noticed you can use a pointer as if it was an array and you can also use an array as if it was a pointer?
char *name = "Madonah";
int i= 0;
for (i=0;i<7; i++){
printf("%c", *(name+i));
}
char name1 [7] = "Madonah";
printf("\n");
int j= 0;
for (j=0;j<7; j++){
printf("%c", name1[j]);
}
char *name2 = "Madonah";
printf("\n");
int k= 0;
for (k=0;k<7; k++){
printf("%c", name2[k]);
}
char name3 [7] = "Madonah";
printf("\n");
int m= 0;
for (m=0;m<7; m++){
printf("%c", *(name+m));
}
Results:
Madonah
Madonah
Madonah
Madonah
It is true that pointers and arrays are equivalent in some context, "equivalent" means neither that they are identical nor even interchangeable. Arrays are not pointers.
It is pointer arithmetic and array indexing that are equivalent, pointers and arrays are different.
which one is preferable and the advantages/Disadvantages?
It depends how you want to use them. If you do not wanna modify string then you can use
char *name = "Madonah";
It is basically equivalent to
char const *name = "Madonah";
*(name + i) and name[i] both are same. I prefer name[i] over *(name + i) as it is neat and used most frequently by C and C++ programmers.
If you like to modify the string then you can go with
char name1[] = "Madonah";
In C, a[b], b[a], *(a+b) are equivalent and there's no difference between these 3. So you only have 2 cases:
char *name = "Madonah"; /* case 1 */
and
char name3 [7] = "Madonah"; /* case 2 */
The former is a pointer which points to a string literal. The latter is an array of 7 characters.
which one is preferred depends on your usage of name3.
If you don't intend to modify then the string then you can use (1) and I would also make it const char* to make it clear and ensure the string literal is not modified accidentally. Modifying string literal is undefined behaviour in C
If you do need to modify it then (2) should be used as it's an array of characters that you can modify. One thing to note is that in case (2), you have explicitly specified the size of the array as 7. That means the character array name3 doesn't have a null terminator (\0) at the end. So it can't be used as a string. I would rather not specify the size of the array and let the compiler calculate it:
char name3 [] = "Madonah"; /* case 2 */
/* an array of 8 characters */
Just in addition to what others said, I will add an image for better illustration. If you have
char a[] = "hello";
char *p = "world";
What happens in first case enough memory is allocated for a (6 characters) on the stack usually, and the string "hello" is copied to memory which starts at a. Hence, you can modify this memory region.
In the second case "world" is allocated somewhere else(usually in read only region), and a pointer to that memory is returned which is simply stored in p. You can't modify the string literal in this case via p.
Here is how it looks:
But for your question stick to notation which is easier, I prefer [].
More info on relationship between arrays and pointers is here.
In all cases, in C pointer + index is the same as pointer[index]. Also, in C an array name used in an expression is treated as a pointer to the first element of the array. Things get a little more mystifying when you consider that addition is commutative, which makes index + pointer and index[pointer] legal also. Compilers will usually generate similar code no matter how you write it.
This allows cool things like "hello"[2] == 'l' and 2["hello"] == 'l'.
it is convenient to use array syntax for random access
int getvalue(int *a, size_t x){return a[x];}
and pointer arithmetic syntax for sequential access
void copy_string(char *dest, const char *src){while((*dest++=*src++));}
Many compilers can optimize sequential accesses to pointers better than using array indices.
For practicing pointer and array idioms in C, those are all valid code blocks and illustrative of the different ways of expressing the same thing.
For production code, you wouldn't use any of those; you would use something both easier for humans to read and more likely to be optimized by the compiler. You'd dispense with the loop entirely and just say:
printf("%s", name);
(Note that this requires name include the \0 character at the end of the string, upon which printf relies. Your name1 and name3 definitions, as written, do not allocate the necessary 8 bytes.)
If you were trying to do something tricky in your code that required one of the more verbose methods you posted, which method you chose would depend on exactly what tricky thing you were trying to do (which you would of course explain in code comments -- otherwise, you would look at your own code six months later and ask yourself, "What the heck was I doing here??"). In general, for plucking characters out of a string, name[i] is a more common idiom than *(name+i).
char name[] = "Madonah";
char* name = "Madonah";
When declared inside a function, the first option yields additional operations every time the function is called. This is because the contents of the string are copied from the RO data section into the stack every time the function is called.
In the second option, the compiler simply sets the variable to point to the address of that string in memory, which is constant throughout the execution of the program.
So unless you're planning to change the contents of the local array (not the original string - that one is read-only), you may opt for the second option.
Please note that all the details above are compiler-implementation dependent, and not dictated by the C-language standard.
can I know which one is easy to use [...]
Try sticking to use the indexing operator [].
One time your code gets more complex and you then notice that things are "more easy" to code using pointer arithmetical expressions.
This typically will be the case when you also start to use the address-of operator: &.
If for example you see your self in the need to code:
char s[] = "Modonah";
char * p = &s[2]; /* This gives you the address of the 'd'. */
you will soon notice that it's "easier" to write:
char * p = s + 2; /* This is the same as char * p = &s[2];. */
An array is just a variable that contains several values, each value has an index, you probably know that already.
Pointers are one of those things you dont need to know about until you realize you need to use them. Pointers are not variables themselves they are literally pointers to variables.
An example of how and why you might want to use a pointer.
You create a variable in one function which you want to use in another function.
You could pass your variable to the new function in the function header.
This effectively COPIES the values from the original variable to a new variable local to the new function. Changes made to it in the new function only change the new variable in the new function.
But what if you wanted changes made in the new function to change the original variable where it is in the first function ?
You use a pointer.
Instead of passing the actual variable to the new function, you pass a pointer to the variable. Now changes made to the variable in the new function are reflected in the original variable in the first function.
This is why in your own example using the pointer to the array and using the actual array while in the same function has identical results. Both of them are saying "change this array".
I need to initialize an array of structs with the same default values. This is a VERY large array, so setting each element by hand in an initializer is not feasable. Is the following code a correct and sane way to do this, or will I need to fall back on some initializer function and a for loop?
#define SIZE_OF_S1_ARR 10000 //just some arbitrary size for an example
typedef struct { char* id, char* description} S1;
/*
* Array of structs, with each element having an id and a description
* which is an empty c-string
/*/
S1 s1_arr[SIZE_OF_S1_ARR] = {{ "", "" }};
I will add that this array already existed as a char array which only contained the ids as a single character. I am replacing it with the more useful struct.
In standard C there is not array initiaization other than 0, unless you specify each value separately.
However, if you use the GNU C compiler, you can use something like this:
char s1_arr[SIZE_OF_S1_ARR] = {[0 ... SIZE_OF_S1_ARR-1] = '_' };
Please note that this approach is not portable.
In addition, please note that initalizing strings (i.e., pointer to char) to have the same value can be done in two ways:
(possible automatically) allocate a number of stings and assign them or allocate one string and assign it to all elements
It seems you are looking for the memset function.
To initialize your S1 type
#define ARBITER 10000 // just an example number for elements
S1 *s1_arr = (S1*)malloc(sizeof(S1)*ARBITER);
id and description are initialized with NULL.
To set s1_arr->id and s1_arr->description you need also a *char variable
or just a literal string (char* a_string = "I am a String"; // This is a literal).
Hope I help you out.
I've seen people's code as:
char *str = NULL;
and I've seen this is as well,
char *str;
I'm wonder, what is the proper way of initializing a string? and when are you supposed to initialize a string w/ and w/out NULL?
You're supposed to set it before using it. That's the only rule you have to follow to avoid undefined behaviour. Whether you initialise it at creation time or assign to it just before using it is not relevant.
Personally speaking, I prefer to never have variables set to unknown values myself so I'll usually do the first one unless it's set in close proximity (within a few lines).
In fact, with C99, where you don't have to declare locals at the tops of blocks any more, I'll generally defer creating it until it's needed, at which point it can be initialised as well.
Note that variables are given default values under certain circumstances (for example, if they're static storage duration such as being declared at file level, outside any function).
Local variables do not have this guarantee. So, if your second declaration above (char *str;) is inside a function, it may have rubbish in it and attempting to use it will invoke the afore-mentioned, dreaded, undefined behaviour.
The relevant part of the C99 standard 6.7.8/10:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
if it is an aggregate, every member is initialized (recursively) according to these rules;
if it is a union, the first named member is initialized (recursively) according to these rules.
I'm wonder, what is the proper way of initializing a string?
Well, since the second snippet defines an uninitialized pointer to string, I'd say the first one. :)
In general, if you want to play it safe, it's good to initialize to NULL all pointers; in this way, it's easy to spot problems derived from uninitialized pointers, since dereferencing a NULL pointer will yield a crash (actually, as far as the standard is concerned, it's undefined behavior, but on every machine I've seen it's a crash).
However, you should not confuse a NULL pointer to string with an empty string: a NULL pointer to string means that that pointer points to nothing, while an empty string is a "real", zero-length string (i.e. it contains just a NUL character).
char * str=NULL; /* NULL pointer to string - there's no string, just a pointer */
const char * str2 = ""; /* Pointer to a constant empty string */
char str3[] = "random text to reach 15 characters ;)"; /* String allocated (presumably on the stack) that contains some text */
*str3 = 0; /* str3 is emptied by putting a NUL in first position */
this is a general question about c variables not just char ptrs.
It is considered best practice to initialize a variable at the point of declaration. ie
char *str = NULL;
is a Good Thing. THis way you never have variables with unknown values. For example if later in your code you do
if(str != NULL)
doBar(str);
What will happen. str is in an unknown (and almost certainly not NULL) state
Note that static variables will be initialized to zero / NULL for you. Its not clear from the question if you are asking about locals or statics
Global variables are initialized with default values by a compiler, but local variables must be initialized.
an unitialized pointer should be considered as undefined so to avoid generating errors by using an undefined value it's always better to use
char *str = NULL;
also because
char *str;
this will be just an unallocated pointer to somewhere that will mostly cause problems when used if you forget to allocate it, you will need to allocate it ANYWAY (or copy another pointer).
This means that you can choose:
if you know that you will allocate it shortly after its declaration you can avoid setting it as NULL (this is a sort of rule to thumb)
in any other case, if you want to be sure, just do it. The only real problem occurs if you try to use it without having initialized it.
It depends entirely on how you're going to use it. In the following, it makes more sense not to initialize the variable:
int count;
while ((count = function()) > 0)
{
}
Don't initialise all your pointer variables to NULL on declaration "just in case".
The compiler will warn you if you try to use a pointer variable that has not been initialised, except when you pass it by address to a function (and you usually do that in order to give it a value).
Initialising a pointer to NULL is not the same as initialising it to a sensible value, and initialising it to NULL just disables the compiler's ability to tell you that you haven't initialised it to a sensible value.
Only initialise pointers to NULL on declaration if you get a compiler warning if you don't, or you are passing them by address to a function that expects them to be NULL.
If you can't see both the declaration of a pointer variable and the point at which it is first given a value in the same screen-full, your function is too big.
static const char str[] = "str";
or
static char str[] = "str";
Because free() doesn't do anything if you pass it a NULL value you can simplify your program like this:
char *str = NULL;
if ( somethingorother() )
{
str = malloc ( 100 );
if ( NULL == str )
goto error;
}
...
error:
cleanup();
free ( str );
If for some reason somethingorother() returns 0, if you haven't initialized str you will
free some random address anywhere possibly causing a failure.
I apologize for the use of goto, I know some finds it offensive. :)
Your first snippet is a variable definition with initialization; the second snippet is a variable definition without initialization.
The proper way to initialize a string is to provide an initializer when you define it. Initializing it to NULL or something else depends on what you want to do with it.
Also be aware of what you call "string". C has no such type: usually "string" in a C context means "array of [some number of] char". You have pointers to char in the snippets above.
Assume you have a program that wants the username in argv[1] and copies it to the string "name". When you define the name variable you can keep it uninitialized, or initialize it to NULL (if it's a pointer to char), or initialize with a default name.
int main(int argc, char **argv) {
char name_uninit[100];
char *name_ptr = NULL;
char name_default[100] = "anonymous";
if (argc > 1) {
strcpy(name_uninit, argv[1]); /* beware buffer overflow */
name_ptr = argv[1];
strcpy(name_default, argv[1]); /* beware buffer overflow */
}
/* ... */
/* name_uninit may be unusable (and untestable) if there were no command line parameters */
/* name_ptr may be NULL, but you can test for NULL */
/* name_default is a definite name */
}
By proper you mean bug free? well, it depends on the situation. But there are some rules of thumb I can recommend.
Firstly, note that strings in C are not like strings in other languages.
They are pointers to a block of characters. The end of which is terminated with a 0 byte or NULL terminator. hence null terminated string.
So for example, if you're going to do something like this:
char* str;
gets(str);
or interact with str in any way, then it's a monumental bug. The reason is because as I have just said, in C strings are not strings like other languages. They are just pointers. char* str is the size of a pointer and will always be.
Therefore, what you need to do is allocate some memory to hold a string.
/* this allocates 100 characters for a string
(including the null), remember to free it with free() */
char* str = (char*)malloc(100);
str[0] = 0;
/* so does this, automatically freed when it goes out of scope */
char str[100] = "";
However, sometimes all you need is a pointer.
e.g.
/* This declares the string (not intialized) */
char* str;
/* use the string from earlier and assign the allocated/copied
buffer to our variable */
str = strdup(other_string);
In general, it really depends on how you expect to use the string pointer.
My recommendation is to either use the fixed size array form if you're only going to be using it in the scope of that function and the string is relatively small. Or initialize it to NULL. Then you can explicitly test for NULL string which is useful when it's passed into a function.
Beware that using the array form can also be a problem if you use a function that simply checks for NULL as to where the end of the string is. e.g. strcpy or strcat functions don't care how big your buffer is. Therefore consider using an alternative like BSD's strlcpy & strlcat. Or strcpy_s & strcat_s (windows).
Many functions expect you to pass in a proper address as well. So again, be aware that
char* str = NULL;
strcmp(str, "Hello World");
will crash big time because strcmp doesn't like having NULL passed in.
You have tagged this as C, but if anyone is using C++ and reads this question then switch to using std::string where possible and use the .c_str() member function on the string where you need to interact with an API that requires a standard null terminated c string.