Here's what I did:
#include <stdio.h>
#include <string.h>
int main() {
char name[] = "longname";
printf("Name = %s \n",name);
strcpy(name,"evenlongername");
printf("Name = %s \n",name);
printf("size of the array is : %d",sizeof(name));
return 0;
}
It works, but how? I thought that once memory is assigned to an array in a program, it is not possible to change it. But, the output of this program is:
Name = longname
Name = evenlongername
size of the array is 9
So the compiler affirms that the size of the array is still 9. How is it able to store the word 'evenlongername' which has a size of 15 bytes (including the string terminator)?
In this case, name is allocated to fit "longname", which is 9 bytes. When you copy "evenlongername" into it, you're writing outside of bounds of that array. It's undefined behavior to write outside of the bounds, this means it may or may not work. Some times, it'll work, other times you'll get seg fault, yet other times you'll get weird behavior.
So the compiler affirms that the size of the array is still 9. How is it able to store the word 'evenlongername' which has a size of 15 bytes(including the string terminator)?
You are using a dangerous function (see Bugs), strcpy, which blindly copies source string to destination buffer without knowing about its size; in your case of copying 15 bytes into a buffer with size 9 bytes, essentially you have overflown. Your program may work fine if the memory access is valid and it doesn't overwrite something important.
Because C is a lower-level programming language, a C char[] is "barebone" mapping of memory, and not a "smart" container like C++ std::vector which automatically manages its size for you as you dynamically add and remove elements. If you are still not clear about the philosophy of C in this, I'd recommend you read *YOU* are full of bullshit. Very classic and rewarding.
Using sizeof on a char array will return the size of the buffer, not the length of the null-terminated string in the buffer. If you use strcpy to try and overflow the array, and it just happens to work (it's still undefined behavior), sizeof is still going to report the size used at declaration. That never changes.
If what you're interested in is observing how the length of a string changes with different assignments:
Use an adequate buffer to store every string you're going to test.
Use the function strlen in <string.h> which will give you the actual length of the string, and not the length of your buffer, which, once declared, is constant.
Related
I have a problem, and I cannot figure out the solution for it. I have to programm some code to a µC, but I am not familiar with it.
I have to create an analysis and show the results of it on the screen of the machine. The analysis is allready done and functional. But getting the results from the analysis to the screen is my problem.
I have to store all results in a global array. Since the stack is really limited on the machine, I have to bring it to the larger heap. The linker is made that way, that every dynamic allocation ends up on the heap. But this is done in C so I cannot use "new". But everything allocated with malloc ends up on the heap automatically and that is why I need to use malloc, but I haven't used that before, so I have real trouble with it. The problem with the screen is, it accepts only char arrays.
In summaray: I have to create a global 2D char array holding the results of up to 100 positions and I have to allocate the memory for it using malloc.
To make it even more complicated I have to declare the variable with "extern" in the buffer.h file and have to implement it in the buffer.c file.
So my buffer.h line looks like this:
extern char * g_results[100][10];
In the buffer.c I am using:
g_results[0][0] = malloc ( 100 * 10 )
Each char is 1 byte, so the array should have the size of 1000 byte to hold 100 results with the length of 9 and 1 terminating /0. Right?
Now I try to store the results into this array with the help of strcpy.
I am doing this in a for loop at the end of the analysis.
for (int i = 0; i < 100, i++)
{
// Got to convert it to text 1st, since the display does not accept anything but text.
snprintf(buffer, 9, "%.2f", results[i]);
strcpy(g_results[i][0], buffer);
}
And then I iterate through the g_results_buffer on the screen and display the content. The problem is: it works perfect for the FIRST result only. Everything is as I wanted it.
But all other lines are empty. I checked the results-array, and all values are stored in them, so that is not the cause for the problem. Also, the values are not overwritten, it is really the 1st value.
I cannot see what it is the problem here.
My guesses are:
a) allocation with malloc isn't done correctly. Only allocating space for the 1st element? When I remove the [0][0] I get a compiler error: "assignment to expression with array type". But I do not know what that should mean.
b) (totally) wrong usage of the pointers. Is there a way I can declare that array as a non-pointer, but still on the heap?
I really need your help.
How do I store the results from the results-array after the 1st element into the g_results-array?
I have to store all results in a global array. Since the stack is really limited on the machine, I have to bring it to the larger heap.
A “global array“ and “the larger heap” are different things. C does not have a true global name space. It does have objects with static storage duration, for which memory is reserved for the entire execution of the program. People use the “heap” to refer to dynamically allocated memory, which is reserved from the time a program requests it (as with malloc) until the time the program releases it (as with free).
Variables declared outside of functions have file scope for their names, external or internal linkage, and static storage duration. These are different from dynamic memory. So it is not clear what memory you want: static storage duration or dynamic memory?
“Heap” is a misnomer. Properly, that word refers to a type of data structure. You can simply call it “allocated memory.” A “heap” may be used to organize pieces of memory available for allocation, but it can be used for other purposes, and the memory management routines may use other data structures.
The linker is made that way, that every dynamic allocation ends up on the heap.
The linker links object modules together. It has nothing to do with the heap.
But everything allocated with malloc ends up on the heap automatically and that is why I need to use malloc,…
When you allocate memory, it does not end up on the heap. The heap (if it is used for memory management) is where memory that has been freed is kept until it is allocated again. When you allocate memory, it is taken off of the heap.
The problem with the screen is, it accepts only char arrays.
This is unclear. Perhaps you mean there is some display device that you must communicate with by providing strings of characters.
In summaray: I have to create a global 2D char array holding the results of up to 100 positions and I have to allocate the memory for it using malloc.
That would have been useful at the beginning of your post.
So my buffer.h line looks like this:
extern char * g_results[100][10];
That declares an array of 100 arrays of 10 pointers to char *. So you will have 1,000 pointers to strings (technically 1,000 pointers to the first character of strings, but we generally speak of a pointer to the first character of a string as a pointer to the string). That is not likely what you want. If you want 100 strings of up to 10 characters each (including the terminating null byte in that 10), then a pointer to an array of 100 arrays of 10 characters would suffice. That can be declared with:
extern char (*g_results)[100][10];
However, when working with arrays, we generally just use a pointer to the first element of the array rather than a pointer to the whole array:
extern char (*g_results)[10];
In the buffer.c I am using:
g_results[0][0] = malloc ( 100 * 10 )
Each char is 1 byte, so the array should have the size of 1000 byte to hold 100 results with the length of 9 and 1 terminating /0. Right?
That space does suffice for 100 instances of 10-byte strings. It would not have worked with your original declaration of extern char * g_results[100][10];, which would need space for 1,000 pointers.
However, having changed g_results to extern char (*g_results)[10];, we must now assign the address returned by malloc to g_results, not to g_results[0][0]. We can allocate the required space with:
g_results = malloc(100 * sizeof *g_results);
Alternately, instead of allocating memory, just use static storage:
char g_results[100][10];
Now I try to store the results into this array with the help of strcpy. I am doing this in a for loop at the end of the analysis.
for (int i = 0; i < 100, i++)
{
// Got to convert it to text 1st, since the display does not accept anything but text.
snprintf(buffer, 9, "%.2f", results[i]);
strcpy(g_results[i][0], buffer);
}
There is no need to use buffer; you can send the snprintf results directly to the final memory.
Since g_results is an array of 100 arrays of 10 char, g_results[i] is an array of 10 char. When an array is used as an expression, it is automatically converted to a pointer to its first element, except when it is the operand of sizeof, the operand of unary &, or is a string literal used to initialize an array (in a definition). So you can use g_results[i] to get the address where string i should be written:
snprintf(g_results[i], sizeof g_results[i], "%.2f", results[i]);
Some notes about this:
We see use of the array both with automatic conversion and without. The argument g_results[i] is converted to &g_results[i][0]. In sizeof g_results[i], sizeof gives the size of the array, not a pointer.
The buffer length passed to snprintf does not need to be reduced by 1 for allow for the terminating null character. snprintf handles that by itself. So we pass the full size, sizeof g_results[i].
But all other lines are empty.
That is because your declaration of g_results was wrong. It declared 1,000 pointers, and you stored an address only in g_results[0][0], so all the other pointers were uninitialized.
This is all odd, you seem to just want:
// array of 100 arrays of 10 chars
char g_results[100][10];
for (int i = 0; i < 100, i++) {
// why snprintf+strcpy? Just write where you want to write.
snprintf(g_results[i], 10, "%.2f", results[i]);
// ^^^^^^^^ has to be float or double
// ^^ why 9? The buffer has 10 chars.
}
Only allocating space for the 1st element?
Yes, you are, you only assigned first element g_results[0][0] to malloc ( 100 * 10 ).
wrong usage of the pointers. Is there a way I can declare that array as a non-pointer, but still on the heap?
No. To allocate something on the heap you have to call malloc.
But there is no reason to use the heap, especially that you are on a microcontroller and especially that you know how many elements you are going to allocate. Heap is for unknowns, if you know that you want exactly 100 x 10 x chars, just take them.
Overall, consider reading some C books.
I do not know what that should mean.
You cannot assign to an array as a whole. You can assign to array elements, one by one.
While working on dynamic memory allocation in C, I am getting confused when allocating size of memory to a char pointer. While I am only giving 1 byte as limit, the char pointer successfully takes input as long as possible, given that each letter corresponds to 1 byte.
Also I have tried to find sizes of pointer before and after input. How can I understand what is happening here? The output is confusing me.
Look at this code:
#include <stdio.h>
#include <stdlib.h>
int main()
{
int limit;
printf("Please enter the limit of your string - ");
gets(&limit);
char *text = (char*) malloc(limit*4);
printf("\n\nThe size of text before input is %d bytes",sizeof(text));
printf("\n\nPlease input your string - ");
scanf("%[^\n]s",text);
printf("\n\nYour string is %s",text);
printf("\n\nThe size of char pointer text after input is %d bytes",sizeof(text));
printf("\n\nThe size of text value after input is %d bytes",sizeof(*text));
printf("\n\nThe size of ++text value after input is %d bytes",sizeof(++text));
free(text);
return 0;
}
Check this output:
It works because malloc usually doesn't allocate the same number of bytes you pass to it.
It reserves memory multiple of "blocks". It usually reserve more memory to "cache" it for next malloc calls as an optimization. (it is an implementation specific)
check glibc malloc internals for example.
Using more memory than allocated by malloc is an undefined behavior. you may overwrite metadata of malloc saved on heap or corrupt other data.
Also I have tried to find sizes of pointer before and after input. How
can I understand what is happening here? The output is confusing me.
The size of pointer is fixed for all pointer types in a machine, it is usually 4/8 bytes depending on the address size. It doesn't have anything to do with data size.
Welcome to the world of Undefined Behaviour!
char *text = malloc(limit*4); (don't cast malloc in C) will make text point the the first element of an array of size limit*4.
C will not prevent you to write past the end of any array, simply the behaviour is undefined by the standard. It may work fine, or it may crash immediately, or you may experience abnormal behaviour later in the program.
Here, the underlying system call has probably allocated a full memory page (often 4k), and as you have not used another malloc you have just used a memory belonging to the process but still officially unused. But do not rely on that and never use it in production code.
And sizeof does not make sense with pointers. sizeof(text) is sizeof(char *) (same for sizeof(++text) for same reason) and is the size of a pointer (generaly 2, 4 or 8 bytes) and sizeof(*text) is sizeof(char) which by definition is 1.
C is confident that you as the programmer know how much memory you have asked, and will not try to use more. Anything can happen if you do (including expected result) but do not blame the language or the compiler if it breaks: only you will be guilty.
I create a buffer like this but I don't give it content. Then I try to view strlen() of this block memory.
int size = 24;
char *c;
c = (char *)malloc(size);
printf("%d\n", strlen(c));
What I get is not 24 but 40. I try to view the value from c[40] to c[47] and I always get \0 but after c[47] is not null anymore.
When I set size = 18, the result isn't 40 anymore. It's 32. And values from c[32] to c[47] are all \0.
When I set size = 7, the result is 24 and values form c[24] to c[47] are all \0.
I know using strlen for an array like this is not able to give me the size I used in malloc() function.
I just wonder why this happened and when we change the value of size, how the result change? Is there anything we can deal with using this?
Edit: It seem like everyone think the result is unpredictable. It's the fact that it's always a multiple of 8 and when we increase the size, there is a limit where the result increase. We can determine exactly the value of size that make the result change and it doesn't change despite how many times we test. Does it depend on OS not just C language or compiler? Why 8 is chosen?
The memory you allocated isn't initialized. Passing it to strlen, a function that expects a NUL terminated string, has undefined behavior. So you can get whatever, you don't even have to get any result.
There is not built-in way in C to know the exact size of an allocated block of memory.
You should read the documentation, strlen() is for the length of strings. Without much detail I will tell you one thing,
There is no way to get the length of a pointer dynamically, so your only option is to store it and use it later.
A simple elegant method, is to use a struct for that, where you store the size and use it every time you need.
struct pointer_type {
void *pointer;
size_t allocated_size;
};
As for the "What i get is not 24 but 40" question,
Your pointer, is uninitialized. This causes what is known as undefined behavior because the way strlen() works is by dereferencing the pointer and counting characters until a given character occurs, something like
int strlen(const char *const str)
{
int count = 0;
while (*(str++) != '\0') ++count;
return count;
}
and since your pointer points to random garbage, this will not work right and the returned value is unpredictable.
I have this code..
#include <stdio.h>
#include <string.h>
int main() {
char a[6]="Hello";
char b[]="this is mine";
strcpy(a,b);
printf("%d\n",sizeof(a));
printf("%d\n",sizeof(b));
printf("%s\n",a);
printf("%s\n",b);
printf("%d\n",sizeof(a));
return 0;
}
6
13
this is mine
this is mine
6
I want to ask even I have copied the larger strng to a but its size is not changing but its contents are changing why?
Any help would be appreciated..
You cannot change the size of array A. strcpy is meant to be very fast, so it assumes that you, the user, passes a large enough array to fit the copied string. What you have done is override your array a's null terminator, and changed memory past where you have allocated. In many cases this will not work and cause your program to crash, but in a simple example it will run.
The array a has a size of six char that cannot be changed. When you copy the other, longer string into the array, you overrun the array, introducing both instability and security concerns to the program.
When the computer loads the program into memory, the string literal hello is loaded into read-only memory as a constant, the space needed for the array is allocated in the stack memory, and finally, the string is copied into the array.
In this case, the source string overruns the destination array's length, as array a can hold 6 characters and the string literal that you are trying to copy to it is is 13 characters. This leads to a buffer overrun, which can lead to bugs, at the very least. Worse than that is the potential for information leaks and even more disastrous security consequences.
Please reference the strcpy man page:
Strcpy man page
In this example code, it minimally worked, but this is definitely something to avoid.
The size is not changing because for the compiler the array a has a fixed size and you cannot change it and no one can.
The contents are changing because there is no check performed for the bounds of the array, and it's working because of a coincidence. The code you posted has undefined behavior, and one of the possible outcomes is that it works as it is working in your case, but that will not necessarily always happen, add a variable to your main() function for example, and it might stop working.
This may be a very basic question for some. I was trying to understand how strcpy works actually behind the scenes. for example, in this code
#include <stdio.h>
#include <string.h>
int main ()
{
char s[6] = "Hello";
char a[20] = "world isnsadsdas";
strcpy(s,a);
printf("%s\n",s);
printf("%d\n", sizeof(s));
return 0;
}
As I am declaring s to be a static array with size less than that of source. I thought it wont print the whole word, but it did print world isnsadsdas .. So, I thought that this strcpy function might be allocating new size if destination is less than the source. But now, when I check sizeof(s), it is still 6, but it is printing out more than that. Hows that working actually?
You've just caused undefined behaviour, so anything can happen. In your case, you're getting lucky and it's not crashing, but you shouldn't rely on that happening. Here's a simplified strcpy implementation (but it's not too far off from many real ones):
char *strcpy(char *d, const char *s)
{
char *saved = d;
while (*s)
{
*d++ = *s++;
}
*d = 0;
return saved;
}
sizeof is just returning you the size of your array from compile time. If you use strlen, I think you'll see what you expect. But as I mentioned above, relying on undefined behaviour is a bad idea.
http://natashenka.ca/wp-content/uploads/2014/01/strcpy8x11.png
strcpy is considered dangerous for reasons like the one you are demonstrating. The two buffers you created are local variables stored in the stack frame of the function. Here is roughly what the stack frame looks like:
http://upload.wikimedia.org/wikipedia/commons/thumb/d/d3/Call_stack_layout.svg/342px-Call_stack_layout.svg.png
FYI things are put on top of the stack meaning it grows backwards through memory (This does not mean the variables in memory are read backwards, just that newer ones are put 'behind' older ones). So that means if you write far enough into the locals section of your function's stack frame, you will write forward over every other stack variable after the variable you are copying to and break into other sections, and eventually overwrite the return pointer. The result is that if you are clever, you have full control of where the function returns. You could make it do anything really, but it isn't YOU that is the concern.
As you seem to know by making your first buffer 6 chars long for a 5 character string, C strings end in a null byte \x00. The strcpy function copies bytes until the source byte is 0, but it does not check that the destination is that long, which is why it can copy over the boundary of the array. This is also why your print is reading the buffer past its size, it reads till \x00. Interestingly, the strcpy may have written into the data of s depending on the order the compiler gave it in the stack, so a fun exercise could be to also print a and see if you get something like 'snsadsdas', but I can't be sure what it would look like even if it is polluting s because there are sometimes bytes in between the stack entries for various reasons).
If this buffer holds say, a password to check in code with a hashing function, and you copy it to a buffer in the stack from wherever you get it (a network packet if a server, or a text box, etc) you very well may copy more data from the source than the destination buffer can hold and give return control of your program to whatever user was able to send a packet to you or try a password. They just have to type the right number of characters, and then the correct characters that represent an address to somewhere in ram to jump to.
You can use strcpy if you check the bounds and maybe trim the source string, but it is considered bad practice. There are more modern functions that take a max length like http://www.cplusplus.com/reference/cstring/strncpy/
Oh and lastly, this is all called a buffer overflow. Some compilers add a nice little blob of bytes randomly chosen by the OS before and after every stack entry. After every copy the OS checks these bytes against its copy and terminates the program if they differ. This solves a lot of security problems, but it is still possible to copy bytes far enough into the stack to overwrite the pointer to the function to handle what happens when those bytes have been changed thus letting you do the same thing. It just becomes a lot harder to do right.
In C there is no bounds checking of arrays, its a trade off in order to have better performance at the risk of shooting yourself in the foot.
strcpy() doesn't care whether the target buffer is big enough so copying too many bytes will cause undefined behavior.
that is one of the reasons that a new version of strcpy were introduced where you can specify the target buffer size strcpy_s()
Note that sizeof(s) is determined at run time. Use strlen() to find the number of characters s occupied. When you perform strcpy() source string will be replaced by destination string so your output wont be "Helloworld isnsadsdas"
#include <stdio.h>
#include <string.h>
main ()
{
char s[6] = "Hello";
char a[20] = "world isnsadsdas";
strcpy(s,a);
printf("%s\n",s);
printf("%d\n", strlen(s));
}
You are relying on undefined behaviour in as much as that the compiler has chose to place the two arrays where your code happens to work. This may not work in future.
As to the sizeof operator, this is figured out at compile time.
Once you use adequate array sizes you need to use strlen to fetch the length of the strings.
The best way to understand how strcpy works behind the scene is...reading its source code!
You can read the source for GLibC : http://fossies.org/dox/glibc-2.17/strcpy_8c_source.html . I hope it helps!
At the end of every string/character array there is a null terminator character '\0' which marks the end of the string/character array.
strcpy() preforms its task until it sees the '\0' character.
printf() also preforms its task until it sees the '\0' character.
sizeof() on the other hand is not interested in the content of the array, only its allocated size (how big it is supposed to be), thus not taking into consideration where the string/character array actually ends (how big it actually is).
As opposed to sizeof(), there is strlen() that is interested in how long the string actually is (not how long it was supposed to be) and thus counts the number of characters until it reaches the end ('\0' character) where it stops (it doesn't include the '\0' character).
Better Solution is
char *strcpy(char *p,char const *q)
{
char *saved=p;
while(*p++=*q++);
return saved;
}