How do variable declarations behave? - c

#include<stdio.h>
#include<conio.h>
int main(){
char i;
int c;
scanf("%i",&c);
scanf("%c",&i);// catch the new line or character introduced before x number
printf("%i",i);// value of that character
getch();
return(0);
}
The program behaves the same way if I replace the declarations above with either of the following.
This:
int c;
int *x;
int i;
or this:
int *x;
int c;
int i;
It only works this way: with the c variable and the x pointer declared before the i variable.
I know these last declarations don't make sense (int i instead of char i, and an added pointer that isn't even needed).
But this happened accidentally, and I'm wondering whether it's only a coincidence.

The order in which you declare your variables should make no difference at all, assuming there's nothing wrong with the rest of your code. The order of declaration needn't have anything at all to do with the way they're laid out in memory. And even if it did, you refer to variables by name; as long as your code is correct, a reference to i is a reference to i, and the compiler will generate whatever code is needed to access the variable correctly.
Now if you do this:
int i;
scanf("%c", &i);
then you're doing something wrong. scanf with a "%c" format requires a char* argument, which points to the char object into which the value will be stored. You're giving it an int* rather than a char*. As a result, your program's behavior is undefined; the language standard says nothing about how it will behave.
So why does it appear to work correctly? What's probably happening is that scanf treats the address of the int object i as if it were a pointer to a char. It will probably point to the first byte of the representation of i; for example, i might be 32 bits, and the pointer will point to the first 8 of those bits. (They could be the high-order or low-order bits, depending on the system.)
Now when you print the value of i:
printf("%d\n", i);
the contents of i are, for example, 1 byte consisting of whatever character you just read into it, and 3 bytes of garbage. Those 3 garbage bytes may well all be zeros, but they could be anything. If the garbage bytes happen to be 0, and the first byte happens to be the low-order byte (i.e., you're on a little-endian machine), then you're likely to get the "correct" output.
But don't do that. Since the behavior is undefined, it can work "correctly" for years, and then fail spectacularly at the worst possible moment.
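As a minimal sketch of the well-defined version, the helper below pairs each conversion with a pointer of the matching type. The name parse_number_and_char is hypothetical, and it uses sscanf on a string instead of scanf on stdin only so that the example is easy to try without typing input.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parses "<number><whitespace><character>" from text.
 * Returns 1 if both values were read, 0 otherwise. */
int parse_number_and_char(const char *text, int *number, char *ch)
{
    assert(text != NULL && number != NULL && ch != NULL);
    /* %d stores through an int *, %c stores through a char *.
     * The space before %c skips whitespace, such as the newline
     * left behind after the number. */
    return sscanf(text, "%d %c", number, ch) == 2;
}
```

With scanf the same format string applies: scanf("%d %c", &c, &i) with int c and char i. Because the types match, the order of the declarations no longer matters.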
The lesson here is that C tends to assume that you know what you're doing. There are a lot of constructs that have undefined behavior, which means that they're invalid, but neither the compiler nor the runtime system is required to tell you that there's a problem. In C, more than in most other languages, it's up to you as a programmer to get things right. The compiler (and other tools) will tell you about some errors, but not all of them.
And in the presence of undefined behavior, the order in which you declare your variables can make a difference. For example, if you write code that reads or writes past the end of a variable, it can matter what happens to be stored there. But don't be tempted to shuffle your declarations around until the program works. Get rid of the undefined behavior so the order doesn't matter.
The solution: Don't make mistakes in the first place. (Of course that's much easier said than done.)
And naming conventions can be helpful. If you had called your char variable c, and your int variable i, rather than vice versa, it would have been easier to keep track of which is which.
But c is a reasonable name for an int variable used to hold input character values -- not for scanf, but for getchar(), as in:
int c;
while ((c = getchar()) != EOF) {
/* ... */
}

The function expects a sequence of references as additional arguments, each one pointing to an object of the type specified by their corresponding %-tag within the format string, in the same order. Read about scanf
These can additionally help you:
I don't understand why I can't get three inputs in c
scanf() leaves the new line char in buffer?
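Regarding the last portion of your question: an int is at least as wide as a char (and wider on virtually every system), so storing a single character into it won't write past the end of the variable. Passing an int * where %c expects a char * is still undefined behavior, though.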


How do I initialize a char array with a memory address in C?

I'm in a sophomore C class and this project is about dealing with pointers and designing a memory dump function. So I've been able to struggle through the pointers and got a beginning and ending address to dump, even bitmasked it, and I wanted to initialize a char array with the beginning memory address. I initialize it with the same variable storing my masked beginning address but when I print the array, it contains a different memory address. Here's the function:
void memDump(void *base, int bytes)
{
unsigned char *begin;
begin = base;//beginning of range of memory
unsigned char *end;// ending range of memory
end = base + bytes;
int a, b;
long long int d=base;
d=d&0xFFFFF0; //trying to bitmask
long long int e=end;
e = e&0xFFFF0; //masked off the beginning and ending range
char c[16]={d}; //loop variables
printf("%x", c);
for (a=begin; a<=end; a+=16)
{
printf("\n%016X\n", d);
printf("%016X\n", a);
printf("%016X", e);
}
}
Sorry guys, I can't find anything similar and this is my last resort. Thanks!
Update: Thanks for the insight everyone, reading some more about C and some articles on how to debug helped me out.
You cannot "initialize a char array" with some "memory address." A char array can only be initialized with characters.
Stackoverflow is not about doing your homework for you, so I will give you some advice, and then you can try implementing it. If you cannot put the advice into code, then you do not deserve to turn in a completed assignment.
First of all, once you have bitmasked your "d", you need to store it back into "begin", so that you have a pointer from which you can start reading bytes to dump.
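This instruction:
printf( "%p ", (void *)begin );
will render your "begin" address (in hexadecimal on most implementations), followed by a space. The exact width and format of %p are implementation-defined, and combining it with the 0 flag, as in "%08p", is undefined. This is how you need to begin each row of your memory dump.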
The instruction:
printf( "%02x ", *(begin++) );
gets the byte pointed to by "begin", and renders the hexadecimal representation of that byte in two characters, followed by a space. It then increments "begin" to point to the next byte. You need to do this 8 or 16 times, depending on how wide you want your memory dump to be, then do a printf( "\n" ) to move to the next line.
Then you need to keep repeating the above until your "begin" has exceeded your "end". (So, you are looking at an outer loop, for each row, and an inner loop, for each byte within the row.)
I hope this helps.
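For readers who want to see the shape of the two loops, here is one possible sketch, not the assignment's required solution. The names format_row and mem_dump are hypothetical, the row width of 16 is an assumption, and the dump starts at base itself instead of at a 16-byte-aligned address, because reading bytes before base could touch memory the caller never handed over.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (inner loop): writes up to n bytes as two-digit hex
 * values, each followed by a space, into out. Returns characters written. */
size_t format_row(char *out, size_t out_size, const unsigned char *p, size_t n)
{
    size_t used = 0;
    assert(out != NULL && p != NULL && out_size > 0);
    out[0] = '\0';
    /* Each byte needs 3 characters plus room for the terminating '\0'. */
    for (size_t i = 0; i < n && used + 3 < out_size; i++) {
        used += (size_t)snprintf(out + used, out_size - used, "%02x ", p[i]);
    }
    return used;
}

/* Outer loop: one printed row per 16 bytes, prefixed with the row address. */
void mem_dump(const void *base, size_t bytes)
{
    const unsigned char *p = base;
    char row[16 * 3 + 1];
    for (size_t off = 0; off < bytes; off += 16) {
        size_t n = bytes - off < 16 ? bytes - off : 16;
        format_row(row, sizeof row, p + off, n);
        printf("%p  %s\n", (const void *)(p + off), row);
    }
}
```

A call such as mem_dump(&some_variable, sizeof some_variable) prints one line per 16 bytes. The address column will differ from run to run and from system to system.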
As @Jean-FrançoisFabre observed,
char c[16]={d};
probably does not do what you think it does. That is, unless what you think it does is convert the long long int value stored in d to type char (producing an implementation-defined result drawn from a much smaller range than that of d itself), initializing the first element of array c with that value, and initializing the other fifteen with 0. I can't imagine what you would want to do with the result, but since you actually don't do anything with it, that's probably moot.
As I observed myself,
printf("%x", c);
also probably does not do what you think it does. Indeed, you cannot rely on it to do any particular thing, because its behavior is undefined. You are passing a pointer to the first element of c as the second argument, but a value of type unsigned int will be expected instead (based on the format). In any case, this neither "print[s] the array" nor tells you anything about what it contains.
I suspect that what you actually had in mind was to declare c as an array whose address -- not contents -- is that designated by base, truncated to a 16-byte-aligned address. You cannot do that, because you cannot specify the address of any variable you declare, but you can declare c as a pointer, like this:
unsigned char *c = (unsigned char *)d;
(Oh no, more pointers!) There's some implementation-dependency there, but it probably has the result I think you want. Or if you want to be really clever, you might do this:
unsigned char (*c16)[16] = (unsigned char (*)[16])d;
That declares c16 as a pointer to an array of 16 unsigned char. It's as close as you can get to declaring an array at an address specified by you. I suspect you'll find it easier to work with the other declaration, however.
If you want to print the contents of the memory to which such a pointer points (as a "memory dump" function seems wont to do) then you'll need to do a little more work. The standard library's formatted I/O functions do not provide directly for printing arrays (for good reasons that I'll not go into here), except C strings, and you do not appear to want to print the data as a C string. Do, however, consider this call, and how you might modify it for or adapt it to your purpose (assuming my above declaration for c):
printf("%02x", *c);

strlen() of an empty array within a struct is not 0

I'm very new to C, and I'm not understanding this behavior. Upon printing the length of this empty array I get 3 instead of 0.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct entry entry;
struct entry{
char arr[16];
};
int main(){
entry a;
printf("%d\n",strlen(a.arr));
return 0;
}
What am I not understanding here?
The statement entry a; does not initialize the struct, so its value is likely garbage. Therefore, there's no guarantee that strlen on any of its members will return anything sensible. In fact, it might even crash the program, or worse.
There is no such thing as an "empty array" in C. Your array of char[16]; always contains 16 bytes - uninitialized as a local variable each char has an unspecified value. In addition, if none of these unspecified values happen to be 0, strlen will read outside the array and your code will have undefined behaviour.
Additionally strlen returns size_t and using %d to print this has undefined behaviour too; you must use %zu where z says that the corresponding argument is size_t.
(If by happenstance you're using the MSVC++ "C" compiler, do note that it might not support %zu. Get a real C compiler and C standard library instead.)
Here's the source code of one typical strlen() implementation:
size_t strlen(const char *str)
{
const char *s;
for (s = str; *s; ++s);
return(s - str);
}
Wait, you mean there's source code to strlen()? Why yes. Most of the standard library functions in C can themselves be written in C.
This function starts at the memory address specified by str. It then uses a for loop to go forward from that address, byte by byte, until it reaches a zero. How does the for loop do that? First it assigns str to s. Then it checks the value s points to. If it's zero (i.e. if *s evaluates to zero), the loop is done. If that value is not zero, the s pointer is incremented and the zero check is repeated, over and over, until it finds a zero.
Finally, the final value of s minus the original pointer you passed in is the result of strlen().
In other words, strlen() just walks through memory until it finds the next zero character, and it returns the number of characters between the original pointer and that point.
But, what if it doesn't find a zero? Does it stop? Nope. It will just trudge on and on until it finds a zero or the program crashes.
That is why strlen() is so confusing, and why it is a source of many critical bugs in modern software. This doesn't mean you can't use it, but it does mean you must be very, very careful to make sure that whatever you pass in is a null-terminated string (i.e. a sequence of zero or more non-zero characters, followed by a zero character).
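When the size of the buffer is known, as it is for char arr[16], one way to stay inside it is to search for the terminator with memchr, which never looks past the length you give it. The helper name bounded_length is hypothetical, and this is only a sketch of the idea. Note that it still reads the array's contents, so the array must have been initialized first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the string stored in buf, never reading
 * past buf[size - 1]. Returns size if no terminator is found. */
size_t bounded_length(const char *buf, size_t size)
{
    assert(buf != NULL);
    const char *end = memchr(buf, '\0', size);
    return end ? (size_t)(end - buf) : size;
}
```

A result equal to size tells the caller that the buffer does not hold a terminated string, which plain strlen() cannot report.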
Remember also that in C, you basically have no idea what memory contains when you allocate it or set it aside. If you want it to be all zeros, then you need to make sure to fill it with zeros yourself!
Anyway, the answer to your question involves the use of the memset() function. You'll have to pass memset() the pointer to the beginning of your array, the length of that array, and the value to fill it with (in your case, zero of course!)
No initialization of a, this leads to undefined behavior.
C "strings" are '\0' terminated arrays of char. So strlen() will browse whole memory from given address until it either finds a '\0' or results in a segmentation fault.
What am I not understanding here?
Perhaps the mis-understanding is that auto variables, such as:
entry a;
are assigned memory from the process' stack. The pre-existing content of that stack memory is not zeroed-out for your benefit. Hence the value(s) of the elements of a, which will also be located on the process stack, will not be initially zeroed-out for your benefit. Rather, the entire content of a and its elements (including .arr) will contain bizarre and perhaps unexpected values.
C programmers learn to initialize auto variables by zeroing them out, or initializing them with a desirable value.
For example, the question code might do this as follows:
int main(){
entry a =
{
.arr[0] = 0
};
...
}
Or:
int main(){
entry a;
memset(&a, 0, sizeof(a));
...
}

Multiple variables have the same address, yet they don't?

I was just goofing around on C, as I had just learned more about pointers. But my main confusing point was about addresses, I was just checking out random variable addresses and here's what I found.
#include<stdio.h>
#include<conio.h>
int main()
{
int x=5;
int y=4;
int z=8;
printf("%p\n",&x);
printf("%p\n",&y);
printf("%p\n",&z);
}
This code shows 3 different addresses, which is to be expected. If I only have one printf statement, like this
printf("%p\n",&x);
It will show me one of the addresses, which is expected as well. But, if I remove the variable 'x', and use another variable like 'y', or 'z', it will show me the exact same address that 'x' had. How is that possible?
So basically, when you check all the variables' addresses in the same code, each will have a different address. If you check each of them separately, they will have the exact same address.
PS: This only happens for integers and floats, I tried the same with char and each variable gave a separate address.
The compiler is optimizing away unused variables in your code. This is dead code elimination.
So when you have code like this
int main()
{
int x=5;
int y=4;
int z=8;
printf("%p\n",&z);
}
The compiler can tell that you never use x or y, and therefore can remove that code from the compiled object. This makes your generated code faster and generates smaller object sizes.
Probably, the compiler is eliminating the unused variables. Hence, only the ones you are calling to print are getting created.
"But, if I remove the variable 'x', and use another variable like 'y', or 'z', it will show me the exact same address that 'x' had."
This has nothing to do with optimizing away as other answers suggest, though the result is the same: in your second test you did not declare your variable. Your program then looks like
int main()
{
//int x=5; // commented out
int y=4;
int z=8;
//printf("%p\n",&x); // commented out
printf("%p\n",&y);
printf("%p\n",&z);
}
For each variable, the compiler reserves space (here: on the stack) and this compiler apparently does that in the order it encounters your declarations (nearly all compilers do that). x no longer being declared, all following variables just move up in the address space.
Note that it does that for all variables, also char variables. But for char variables you often give a size. Consider:
char s1[10];
char s2[10];
char s3[10];
and:
//char s1[10]; // commented out
char s2[10];
char s3[10];
s2 in the second example will have the same address as s1 in the first example because all these s variables are the same size. Were they different sizes, you would see different addresses, but you could calculate the differences from their sizes. (Note: "padding" can come into play here, where the compiler allocates slightly more than you declare so the next variable starts on a 4-byte or 8-byte boundary; 4 is typical for 32-bit compilation, 8 for 64-bit compilation.)

Why can an array receive values more than it is declared to hold

int main(void)
{
char name1[5];
int count;
printf("Please enter names\n");
count = scanf("%s",name1);
printf("You entered name1 %s\n",name1);
return 0;
}
When I entered more than 5 characters, it printed all of the characters I entered, even though the char array is declared as:
char name1[5];
Why did this happen?
Because the characters are stored on the addresses after the 'storage space'. This is very dangerous and can lead to crashes.
E.g. suppose you enter name: Michael and the name1 variable starts at 0x1000.
name1:  M      i      c      h      a      e      l      \0
        0x1000 0x1001 0x1002 0x1003 0x1004 0x1005 0x1006 0x1007
        [................................]
The allocated space is shown with [...]
This means that memory from 0x1005 onward is overwritten.
Solution:
Copy only 5 characters (including the \0 at the end) or check the length of the entered string before you copy it.
This is undefined behavior, you are writing beyond the bounds of allocated memory. Anything can happen, including a program that appears to work correctly.
The C99 draft standard section J.2 Undefined Behavior says:
The behavior is undefined in the following circumstances:
and contains the following bullet:
An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).
This applies to the more general case since E1[E2] is identical to (*((E1)+(E2))).
This is undefined behavior, you can't count on it. It just happens to work, it may not work on another machine.
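To avoid buffer overflow, use
fgets(name1, sizeof(name1), stdin);
or, in C11 implementations that provide the optional Annex K functions,
gets_s(name1, sizeof(name1));
Both take the full size of the buffer and store at most sizeof(name1) - 1 characters plus the terminating '\0', so there is no need to subtract 1 yourself.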
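If you would rather keep the %s conversion from the question, a field width gives a similar guarantee. The sketch below assumes the 5-byte buffer from the question. The helper name parse_name is hypothetical, and it reads from a string with sscanf so that the behavior is easy to check; the same "%4s" format works with scanf.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies at most 4 characters of the first word in
 * text into name (5 bytes including the terminator). Returns 1 on success. */
int parse_name(const char *text, char name[5])
{
    assert(text != NULL && name != NULL);
    /* The width 4 leaves room for the '\0' that %s always appends. */
    return sscanf(text, "%4s", name) == 1;
}
```

With the input "Michael", only "Mich" is stored. When reading from stdin with scanf, the remaining characters stay in the input stream for the next read.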
another example to make things clearer :
#include <stdio.h>
int array[5] ;
int main ( void )
{
array[-1] = array[-1] ; // looks strange?
printf ( "%d" , array[-1] ) ; // may appear to work, but is undefined
return 0 ;
}
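Here array is converted to the address of its first element, and array[-1]
refers to whatever lies just before that address. This may appear to work,
but it is undefined behavior, so don't rely on it. (Pointer arithmetic with
++ or -- is only valid while the pointer stays within the array or one past its end.)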
It's very clear from other answers that this constitutes some kind of vulnerability to your program.
What can be learned from this? Lets assume:
int func(void)
{
char buffer[1];
...
In almost every implementation of the C compiler, the code generated here will create a local stack area and enable you to access this stack through the address given by buffer. Other important data resides on this stack too, for example the address of the next instruction to be executed after the function returns to its caller.
You could, therefore, theoretically:
Enter a lot of code into your input function,
Create a code that defines (in binary code) a new function that does something ugly,
Overwrite the correct return address (on the stack) with the address that the new function would have if you wrote it beyond the buffer's bounds.
This is called a buffer overflow exploit; you can read up on it here (and in many other places).
Yes it is allowed in C, as there is no bound checking.

Explain the output of this C code?

I wrote this code today, just out of experimentation, and I'm trying to figure out the output.
/*
* This code in C attempts to exploit insufficient bounds checking
* to legitimate advantage.
*
* A dynamic structure with the accessibility of an array.
* Handy for small-time code, but largely unreliable.
*/
int array[1] = {0};
int index = 0;
put(), get();
main ( )
{
put(1); put(10), put(100);
printf("%6d %5d %5d\n", get(0), get(1), get(2));
}
put ( x )
int x;
{
array[index++] = x;
}
get ( index )
int index;
{
return array[index];
}
The output:
1 3 100
There is a problem there, in that you declare 'array' as an array of length 1 but you write 3 values to it. It should be at least 'array[3]'. Without that, you are writing to unallocated memory, so anything could happen.
The reason it outputs '3' there without the fix is that it is outputting the value of the global 'index' variable, which is the next int in memory (in your case; as I said, anything could happen). Even though you do overwrite this with your put(10) call, the old index value is used as the index in the assignment and then post-incremented, which sets it back to 2. It then gets set to 3 at the end of the put(100) call and is subsequently output via printf.
It's undefined behavior, so the only real explanation is "It does some things on one machine and other things on other machines".
Also, what's with the K&R function syntax?
EDIT: The printf guess was wrong. As far as the syntax, read K&R 2nd Edition (the cover has a red ANSI stamp), which uses modern function syntax (among other useful updates).
To expand on what has been said, accessing out-of-bounds array members results in undefined behavior. Undefined behavior means that literally anything could happen. There is no way to exploit undefined behavior unless you're deep into esoteric platform-specific hacks. Don't do it.
If you do want a "dynamic array", you'll have to take care of it yourself. If your requirements are simple, you can just malloc and realloc a buffer. If your needs are more complicated, you might want to define a struct that keeps a separate buffer, a size, and a count, and write functions that operate on that struct. If you're just learning, try it both ways.
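A minimal sketch of the struct-based approach described above, keeping the put/get shape of the question. All names (int_vec, vec_put, vec_get, vec_free) and the doubling growth policy are assumptions made for illustration, not a standard API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable int array: buffer, capacity, and count. */
struct int_vec {
    int *data;
    size_t capacity;
    size_t count;
};

/* Appends x, growing the buffer when it is full. Returns 1 on success,
 * 0 if memory could not be allocated (the vector is left unchanged). */
int vec_put(struct int_vec *v, int x)
{
    assert(v != NULL);
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return 0;
        v->data = p;
        v->capacity = new_cap;
    }
    v->data[v->count++] = x;
    return 1;
}

/* Stores element i in *out. Returns 0 if i is out of range. */
int vec_get(const struct int_vec *v, size_t i, int *out)
{
    assert(v != NULL && out != NULL);
    if (i >= v->count)
        return 0;
    *out = v->data[i];
    return 1;
}

/* Releases the buffer and resets the vector to its empty state. */
void vec_free(struct int_vec *v)
{
    assert(v != NULL);
    free(v->data);
    v->data = NULL;
    v->capacity = 0;
    v->count = 0;
}
```

Start from struct int_vec v = {0}; so that the first vec_put call sees a null buffer and zero capacity. Unlike the original array[1], every access is checked against count, so an out-of-range index is reported instead of silently touching a neighboring variable.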
Finally, your function declaration syntax is valid, but archaic. That form is rarely seen, and virtually unheard of in new code. Declare put as:
int put(int x) {…}
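And declare main as either
int main(void) {…}
or, if you need the command-line arguments:
int main(int argc, char **argv) {…}
The names of argc and argv aren't important, but the types are. If you use some other signature that your implementation doesn't document, demons could fly out of your nose.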
