C char* pointers pointing to same location where they definitely shouldn't - c

I'm trying to write a simple C program on Ubuntu using Eclipse CDT (yes, I'm more comfortable with an IDE and I'm used to Eclipse from Java development), and I'm stuck on something weird. In one part of my code, I declare a char array in a function, and by default it points to the same location as one of the inputs, which has nothing to do with that char array. Here is my code:
char* subdir(const char input[], const char dir[]){
[*] int totallen = strlen(input) + strlen(dir) + 2;
char retval[totallen];
strcpy(retval, input);
strcat(retval, dir);
...}
OK, at the part I've marked with [*], there is a breakpoint. Even at that breakpoint, when I check my locals, I see that retval points to the same address as my argument input. That shouldn't even be possible, as input comes from another function and retval is created in this function. Is it me being inexperienced with C and missing something, or is there a bug somewhere in the C compiler?
It seems so obvious to me that they shouldn't point to the same location (and a valid one, of course; they aren't NULL). As the code goes on, it literally messes up everything; I get random characters and shapes in the console and the program crashes.

I don't think it makes sense to check the address of retval BEFORE it appears, it being a VLA and all (by definition the compiler and the debugger don't know much about it, it's generated at runtime on the stack).
Try checking its address after its point of definition.
EDIT
I just read the "I get random characters and shapes in console". It's obvious now that you are returning the VLA and expecting things to work.
A VLA is only valid inside the block where it was defined. Using it outside is undefined behavior and thus very dangerous. Even if the size were constant, it still wouldn't be valid to return it from the function. In this case you most definitely want to malloc the memory.

What cnicutar said.
I hate people who do this, so I hate me ... but ... Arrays of non-const size (VLAs) are a C99 feature and are not supported by standard C++. Of course, GCC has extensions to make it happen.
Under the covers you are essentially doing an _alloca, so your odds of blowing out the stack are proportional to who has access to abuse the function.
Finally, I hope it doesn't actually get returned, because that would be returning a pointer to a stack allocated array, which would be your real problem since that array is gone as of the point of return.
In C++ you would typically use a string class.
In C you would either pass a pointer and length in as parameters, or a pointer to a pointer (or return a pointer) and specify that callers should call free() on it when done. These solutions all suck because they are prone to leaks, truncation, or overflow. :/

Well, your fundamental problem is that you are returning a pointer to the stack allocated VLA. You can't do that. Pointers to local variables are only valid inside the scope of the function that declares them. Your code results in Undefined Behaviour.
At least I am assuming that somewhere in the ..... in the real code is the line return retval.
You'll need to use heap allocation, or pass a suitably sized buffer to the function.
As well as that, you only need +1 rather than +2 in the length calculation - there is only one null-terminator.

Try changing retval to a character pointer and allocating your buffer using malloc().

Pass the two string arguments as char * or const char *.
Rather than returning char *, you should just pass another parameter with a string pointer that you already malloc'd space for.
Return bool or int describing what happened in the function, and use the parameter you passed to store the result.
Lastly don't forget to free the memory since you're having to malloc space for the string on the heap...
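#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//retstr is not a const like the other two; the caller must make it large enough
bool subdir(const char *input, const char *dir, char *retstr){
strcpy(retstr, input);
strcat(retstr, dir);
return true;
}
int main(void)
{
char h[]="Hello ";
char w[]="World!";
char *greet=malloc(strlen(h)+strlen(w)+1); //Size of the result plus room for the terminator!
if (greet == NULL) return 1; //allocation failed
subdir(h,w,greet);
printf("%s",greet);
free(greet); //release the buffer once we are done with it
return 0;
}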
This will print: "Hello World!" added together by your function.
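Also, when you create a string on the fly and need it to outlive the function, you must malloc. The compiler doesn't know at compile time how long the two other strings are going to be; char greet[totallen]; compiles only as a C99 variable-length array, and such an array cannot be returned from the function.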

Related

Whats wrong with this C code lines

char *string()
{
char *text[20];
strcpy(text,"Hello world");
return text;
}
I am weak at pointers and I saw this in a previous exam paper.
I wasn't able to solve it.
It doesn't compile, since it treats an array of character pointers as a single array of characters.
The variable declaration line should be:
char text[200];
With that fix done, it's still broken for the reason you're probably interested in: it returns the address of a local variable (the text character array) which goes out of scope as the function returns, thus making the address invalid. There are two ways around that:
The easiest is to make the array static, since that makes it live for as long as the program runs.
You can also switch to dynamic (heap) memory by using malloc(), but that transfers ownership to the caller and requires a call to free() or memory will leak if this function gets called a lot.
Also, as a minor point, its name is in a reserved name space (user programs cannot define functions whose names begin with str). Also, a function taking no arguments should be declared as (void) in C, an empty pair of parentheses does not mean the same thing.
This code will not compile because you are declaring an array of pointers.
In simple words, if you want to handle a string through a pointer, you can use the following:
char *str="HELLO WORLD";
And if you want to handle the string using a char array, you have to remove the indirection operator (*) from the declaration.
Here it is:
char text[20];
Then you can call strcpy on it.
There is still an error, as text only exists while the function is running, so if you want to return its address and retain the value, make it static:
static char text[20];
return text;

Why do I need to allocate memory?

#include<stdio.h>
#include<stdlib.h>
void main()
{
char *arr;
arr=(char *)malloc(sizeof (char)*4);
scanf("%s",arr);
printf("%s",arr);
}
In the above program, do I really need to allocate the arr?
It is giving me the result even without using the malloc.
My second doubt: I am expecting an error on the 9th line, because I think it must be
printf("%s",*arr);
or something.
do I really need to allocate the arr?
Yes, otherwise you're dereferencing an uninitialised pointer (i.e. writing to a random chunk of memory), which is undefined behaviour.
do I really need to allocate the arr?
You need to set arr to point to a block of memory you own, either by calling malloc or by setting it to point to another array. Otherwise it points to a random memory address that may or may not be accessible to you.
In C, casting the result of malloc is discouraged [1]; it's unnecessary, and in some cases can mask an error if you forget to include stdlib.h or otherwise don't have a prototype for malloc in scope.
I usually recommend malloc calls be written as
T *ptr = malloc(N * sizeof *ptr);
where T is whatever type you're using, and N is the number of elements of that type you want to allocate. sizeof *ptr is equivalent to sizeof (T), so if you ever change T, you won't need to duplicate that change in the malloc call itself. Just one less maintenance headache.
It is giving me the result even without using the malloc
Because you don't explicitly initialize it in the declaration, the initial value of arr is indeterminate [2]; it contains a random bit string that may or may not correspond to a valid, writable address. The behavior on attempting to read or write through an invalid pointer is undefined, meaning the compiler isn't obligated to warn you that you're doing something dangerous. One of the possible outcomes of undefined behavior is that your code appears to work as intended. In this case, it looks like you're accessing a random segment of memory that just happens to be writable and doesn't contain anything important.
My second doubt is ' I am expecting an error in 9th line because I think it must be printf("%s",*arr); or something.
The %s conversion specifier tells printf that the corresponding argument is of type char *, so printf("%s", arr); is correct. If you had used the %c conversion specifier, then yes, you would need to dereference arr with either the * operator or a subscript, such as printf("%c", *arr); or printf("%c", arr[i]);.
Also, unless your compiler documentation explicitly lists it as a valid signature, you should not define main as void main(); either use int main(void) or int main(int argc, char **argv) instead.
[1] The cast is required in C++, since C++ doesn't allow you to assign void * values to other pointer types without an explicit cast.
[2] This is true for pointers declared at block scope. Pointers declared at file scope (outside of any function) or with the static keyword are implicitly initialized to NULL.
Personally, I think this is a very bad example of allocating memory.
A char * will take up, with a modern OS/compiler, at least 4 bytes, and on a 64-bit machine, 8 bytes. So you use four to eight bytes to store the location of the four bytes for your three-character string. Not only that, but malloc has overheads that probably add between 16 and 32 bytes to the actual allocated memory. So we're using something like 20 to 40 bytes to store 4 bytes. That's 5 to 10 times more than is actually needed.
The code also casts the result of malloc, which is unnecessary in C (though not an error).
And with only four bytes in the buffer, the chance of scanf overflowing it is substantial.
Finally, there is no call to free to return the memory to the system.
It would be MUCH better to use:
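int len;
char arr[5];
if (fgets(arr, sizeof(arr), stdin) == NULL) arr[0] = '\0'; /* treat a failed read as an empty string */
len = strlen(arr);
if (len > 0 && arr[len - 1] == '\n') arr[len - 1] = '\0'; /* the newline, if any, is the last character */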
This will not overflow the string, and only use 9 bytes of stackspace (not counting any padding...), rather than 4-8 bytes of stackspace and a good deal more on the heap. I added an extra character to the array, so that it allows for the newline. Also added code to remove the newline that fgets adds, as otherwise someone would complain about that, I'm sure.
In the above program, do I really need to allocate the arr?
You bet you do.
It is giving me the result even without using the malloc.
Sure, that's entirely possible... arr is a pointer. It points to a memory location. Before you do anything with it, it's uninitialized... so it's pointing to some random memory location. The key here is that wherever it's pointing is a place your program is not guaranteed to own. That means you can do the scanf() and the value will go to that random location arr is pointing to, but anything else in your program can overwrite that data, or the write itself can crash the program.
When you say malloc(X) you're telling the computer that you need X bytes of memory for your own usage that no one else can touch. Then when arr captures the data it will be there safely for your usage until you call free() (which you forgot to do in your program BTW)
This is a good example of why you should always initialize your pointers to NULL when you create them... it reminds you that you don't own what they're pointing at and you better point them to something valid before using them.
I am expecting an error in 9th line because I think it must be printf("%s",*arr)
Incorrect. scanf() wants an address, which is what arr holds; that's why you don't need to write scanf("%s", &arr). And printf's "%s" specifier wants a pointer to the first character of a string, which again is what arr is, so there is no need to dereference.

Passing String as argument- getting segfault in function

SOLVED See bottom of question for solution.
I'm having trouble with passing on a String argument to my function, and am getting a segmentation fault when the function is called. The program takes in a command line input and passes on the file provided to the function after validation.
My function code goes like this:
char *inputFile; //
inputFile= argv[2];
strcpy(inputFile, argv[2]);
compress(inputFile){
//file open and creation work bug-free
//compression action to be written
void compress(char inputFile){
//compression code here
}
When the function is called, a segfault is thrown, and the value of inputFile is 0x00000000, when prior to the function call, it had a memory location and value of the test file path.
Some of the variations I've tried, with matching function prototypes:
compress(char *inputFile)
compress (char inputFile[])
I also changed the variable.
Why is a variable with a valid memory address and value in the debugger suddenly erased when used as a parameter?
Edit 1:
Incorporating suggestions here, I removed the inputFile= argv[2] line, and the debugger shows the strcpy function working.
However, I've tried both compress(char *inputFile) per Edwin Buck and compress(argv[2]) per unwind, and both changes still result in Cannot access memory at address 0xBEB9C74C
The strange thing is my file validation function checkFile(char inputFile[]) works with the inputFile value, but when I pass that same parameter to the compress(char inputFile[]) function, I get the segfault.
Edit 2- SOLVED
You know something is going on when you stump your professor for 45 min. It turns out I had declared the file read buffer as a 5MB long array inside the compress() method, which in turn maxed out the stack frame. Changing the buffer declaration to a global variable did the trick, and the code executes.
Thanks for the help!
You shouldn't be writing into memory used to hold argv[2].
You don't seem to quite understand how strings are represented; you're copying both the pointer (with the assignment) and the actual characters (with the strcpy()).
You should just do compress(argv[2]); once you've verified that that argument is valid.
First, to copy something from argv[2] to somewhere else, you need some memory for "that somewhere else". You could allocate the memory based on the size of argv[2] but for our simple example, a very large fixed size buffer will do.
char inputfile[2048];
It looks like you tried to do this by the assignment operator, which doesn't really do what you intended.
// this is not the way to what you seek, as it doesn't create any new memory for inputfile
char* inputfile = argv[2];
In passing the inputfile variable to a procedure, you want to pass much more than a single character, so void compress(char inputfile) is not an option. That leaves
compress(char *inputFile) // I prefer this one
compress (char inputFile[])
which both work, but in my experience the first is preferred, as some older compilers tend to make distinctions between pointer and array semantics. These compilers have no issues casting an array to a pointer (which is required as part of the C language specification); however, casting a pointer to an array gets a bit messy in such systems.
You've not allocated any memory for the char * to use. All you've done with char *inputfile is allocated a pointer.

Initialization strings in C

I have a question about the correct way to handle the initialization of C strings.
For example, the following code isn't always correct:
char *something;
something = "zzzzzzzzzzzzzzzzzz";
I tested a little by increasing the number of z's, and the program does crash at about two lines' worth, so what is the real size limit of this char array? How can I be sure that it is not going to crash? Is this limit implementation-dependent? Is the following code the correct approach that I must always use?
char something[FIXEDSIZE];
strcpy(something, "zzzzzzzzzzzzzzzzzzz");
As you say, manipulating this string leads to undefined behaviour:
char *something;
something = "zzzzzzzzzzzzzzzzzz";
If you are curious as to why, see "C String literals: Where do they go?".
If you plan to manipulate your string at all, (i.e. if you want it to be mutable) you should use this:
char something[] = "skjdghskfjhgfsj";
Otherwise, simply declare your char * as a const char * to indicate that it points to a constant.
In the second example, the compiler will be smart enough to declare this as an array on the stack of the exact size to hold the string. Thus, the size of this is limited by your stack.
Of course, you will likely want to specify the size anyway, since it is usually useful to know when manipulating the string.
The second is always correct.
The first is correct only if you never change the string, since you've assigned a pointer to fixed data.
The first example is only incorrect in that char *something should really be const char *something. Otherwise, this:
const char *something = "fooooooooooooooooooooooobar";
...should work, and should not crash.
char something[FIXEDSIZE];
...this one, however, can typically crash with a stack overflow if you, well, overflow the stack, which depends on how big that stack is, how big that array is, where this gets called, etc.
The first should never crash. The second will crash as soon as the number of 'z' characters + 1 goes over the available space on the stack, or if you try to return it from the function.

Pointer initialization and string manipulation in C

I have this function which is called about 1000 times from main(). When I initialize a pointer in this function using malloc(), a seg fault occurs, possibly because I did not free() it before leaving the function. I then tried free()ing the pointer before returning to main, but it's of no use; eventually a seg fault occurs.
The above scenario being one thing, how do I initialize double pointers (**ptr) and pointers to arrays of pointers (*ptr[])?
Is there a way to copy a string (which is a char array) into an array of char pointers?
char arr[]; (Let's say there are fifty such arrays)
char *ptr_arr[50]; Now I want to point each element of *ptr_arr[] at one such char arr[]
How do I initialize char *ptr_arr[] here?
What are the effects of uninitialized pointers in C?
Does strcpy() append the '\0' on its own or do we have to do it manually? How safe is strcpy() compared to strncpy()? Like wise with strcat() and strncat().
Thanks.
A segfault can be caused by many things. Do you check the pointer after the malloc (whether it's NULL)? Step through the lines of the code to see exactly where it happens (and ask a separate question with more details and code).
You don't seem to understand the relation of pointers and arrays in C. First, a pointer to array of pointers is defined like type*** or type**[]. In practice, only twice-indirected pointers are useful. Still, you can have something like this, just dereference the pointer enough times and do the actual memory allocation.
This is messy. Should be a separate question.
They most likely crash your program, BUT this is undefined, so you can't be sure. They might have the address of an already used memory "slot", so there might be a bug you don't even notice.
From your question, my advice would be to google "pointers in C" and read some tutorials to get an understanding of what pointers are and how to use them - there's a lot that would need to be repeated in an SO answer to get you up to speed.
The top two hits are here and here.
It's hard to answer your first question without seeing some code -- Segmentation Faults are tricky to track down and seeing the code would be more straightforward.
Double pointers are not more special than single pointers as the concepts behind them are the same. For example...
char *c = malloc(4);
char **pc = &c; /* pc holds the address of the pointer c */
I'm not quite sure what c) is asking, but to answer your last question, using an uninitialized pointer has undefined behavior in C, i.e. you shouldn't rely on any specific result happening.
EDIT: You seem to have added a question since I replied...
strcpy(..) will indeed copy the null terminator of the source string to the destination string.
for part 'a', maybe this helps:
void myfunction(void) {
int * p = (int *) malloc (sizeof(int));
free(p);
}
int main () {
int i;
for (i = 0; i < 1000; i++)
myfunction();
return 0;
}
Here's a nice introduction to pointers from Stanford.
A pointer is a special type of variable which holds the address or location of another variable. Pointers point to these locations by keeping a record of the spot at which they were stored. Pointers to variables are found by recording the address at which a variable is stored. It is always possible to find the address of a piece of storage in C using the special & operator. For instance: if location were a float type variable, it would be easy to find a pointer to it called location_ptr
float location;
float *location_ptr,*address;
location_ptr = &(location);
or
address = &(location);
The declarations of pointers look a little strange at first. The star * symbol which stands in front of the variable name is C's way of declaring that variable to be a pointer. The four lines above make two identical pointers to a floating point variable called location, one of them is called location_ptr and the other is called address. The point is that a pointer is just a place to keep a record of the address of a variable, so they are really the same thing.
A pointer is a bundle of information that has two parts. One part is the address of the beginning of the segment of memory that holds whatever is pointed to. The other part is the type of the value that the pointer points to. This tells the computer how much of the memory after the beginning to read and how to interpret it. Thus, if the pointer is of type int *, the segment of memory read will typically be four bytes long (32 bits) on common platforms and will be interpreted as an integer. In the case of a function, the type is the type of value that the function will return, although the address is the address of the beginning of the function executable.
Also get more tutorial on C/C++ Programming on http://www.jnucode.blogspot.com
You've added an additional question about strcpy/strncpy.
strcpy is actually safer, provided the destination is large enough to hold the source.
It copies a nul terminated string, and it adds the nul terminator to the copy. i.e. you get an exact duplicate of the original string.
strncpy on the other hand has two distinct behaviours:
if the source string is fewer than 'n' characters long, it copies the string and fills the rest of the 'n' bytes with nuls, so the copy is terminated
if the source string is greater than or equal to 'n' characters long, then it simply stops copying when it gets to 'n', and leaves the string unterminated. It is therefore necessary to always nul-terminate the resulting string to be sure it's still valid:
char dest[123];
strncpy(dest, source, 123);
dest[122] = '\0';

Resources