Can anyone tell me how this piece of code works? - c

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void print_reverse(char *s)
{
size_t len=strlen(s);
char *t=s+len-1;
printf("%s %s\n",t,s);
while(t>=s){
printf("%c",*t);
t=t-1;
}
puts("");
}
int main(){
print_reverse("Hello");
}
Can anyone tell me how char *t=s+len-1; and while(t>=s) work? I can't understand how a number can be added to a pointer and how the pointers are compared in the while loop. This program is for reversing a string in C.

Let's do this line by line:
print_reverse("Hello");
void print_reverse(char *s)
Now s points to a string that contains:
- - ----+----+----+----+----+----+----+---- - -
        | H  | e  | l  | l  | o  | \0 |
- - ----+----+----+----+----+----+----+---- - -
          ^
          s
That last character is called the "NUL" terminator because "NUL" is the name of the character with ASCII value zero (the non-printable ASCII control characters all have short two- or three-letter names).
size_t len=strlen(s);
Now len has a value of five. Notice it does not include the "NUL" terminator so even though the string takes 6 bytes the length is five.
char *t=s+len-1;
Now t has a value of s+4. If you count the memory locations this is what you get:
- - ----+----+----+----+----+----+----+---- - -
        | H  | e  | l  | l  | o  | \0 |
- - ----+----+----+----+----+----+----+---- - -
          ^                   ^
          s                   t
Note that s+strlen(s) would point to the "NUL" terminator.
printf("%s %s\n",t,s);
That printf should print o Hello, because t (which points at the final o) is printed first and s second.
while(t>=s)
This while loop will continue as long as t>=s which means it will do the body of the loop for every character, including the one where s is pointing.
printf("%c",*t);
This prints the contents of the memory that t is pointing at. It starts with the o and continues backwards towards the H.
t=t-1;
That is the part that moves t backwards. Eventually t will be before s and then the loop will end. When the loop finishes it will look like this:
- - ----+----+----+----+----+----+----+---- - -
        | H  | e  | l  | l  | o  | \0 |
- - ----+----+----+----+----+----+----+---- - -
     ^    ^
     t    s
Then there is this one final line:
puts("");
That prints an empty string and a final linefeed - there wasn't a linefeed in the string but we needed one so this is a way to do that.
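As a rough sketch of the arithmetic walked through above, the helper below computes s + len - 1 on its own. The name last_char is made up for illustration and is not part of the original program; the empty-string check is an added assumption, since s + len - 1 would point before the array when len is zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the last character before
   the terminating '\0', or NULL for an empty string. */
const char *last_char(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);   /* does not count the '\0' */
    if (len == 0) {
        return NULL;          /* s + len - 1 would be before the array */
    }
    return s + len - 1;       /* same arithmetic as char *t=s+len-1; */
}
```

For "Hello", last_char(s) - s is 4, which matches the diagram where t is s+4.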

Pointer Arithmetic
When a pointer points into an array, adding integers to the pointer or subtracting integers from the pointer moves the pointer back and forth within the array.
This function should be passed a char *s that points to a string, which is an array of characters ending in a null character ('\0'). Then size_t len = strlen(s); sets len to the size of this string, and char *t = s+len-1; sets t to point to the last character before the null character.
Then, in the loop t=t-1; moves t backward.
Unfortunately, this loop uses t>=s as its control condition. This is intended to stop when t has been moved to the character before s, meaning it has gone back before the start point. However, the C standard only defines pointer arithmetic for elements within the array plus a special position at the end of the array. If this function is passed an s that points to the beginning of an array, then the loop will eventually make t point before the array, and the C standard does not define the resulting behavior.
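One possible way to avoid stepping before the array is to start t at the terminator and decrement before reading, so the loop condition can be t > s. The sketch below is only an illustration: reverse_copy is a hypothetical name, and it writes into a caller-supplied buffer instead of printing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant: writes the reverse of s into out, which must
   have room for strlen(s) + 1 bytes. */
void reverse_copy(const char *s, char *out)
{
    assert(s != NULL && out != NULL);
    const char *t = s + strlen(s);  /* points at the '\0', inside the array */
    while (t > s) {                 /* t never moves below s */
        --t;                        /* step back first, then read */
        *out++ = *t;
    }
    *out = '\0';
}
```

With a 16-byte buffer buf, reverse_copy("Hello", buf) leaves "olleH" in buf, and an empty input leaves an empty string.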
Other Things to Know About Pointer Arithmetic
Any object may be treated as an array of one element. If you have some type T and some object T x;, you may set a pointer T *p = &x;, and then it is allowed to advance the pointer by one element, p = p+1;. Dereferencing that pointer with *p is not defined, but you can compare it, as in &x == p, or you can subtract one from it.
If print_reverse were passed a pointer into an array beyond the beginning, then its loop would be okay. However, that is not how it is used in the example code; print_reverse("Hello"); is not good code.
Any object may be treated as an array of characters. You can convert a pointer to any object to a pointer to unsigned char and then examine the bytes that make up an object. This is used for special purposes. You should not use it in general code while you are learning C, but you should be aware it exists.
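A minimal sketch of that byte-wise view, assuming you only want to inspect the bytes. The helper name count_zero_bytes is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts how many bytes of an object are zero. */
size_t count_zero_bytes(const void *obj, size_t size)
{
    assert(obj != NULL);
    const unsigned char *p = obj;   /* view the object as bytes */
    size_t zeros = 0;
    for (size_t i = 0; i < size; ++i) {
        if (p[i] == 0) {
            ++zeros;
        }
    }
    return zeros;
}
```

It can be called on any object, for example count_zero_bytes(&x, sizeof x).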

Related

Understanding dereference operator and pointers in C, and logic of a small program

I got this code snippet from here:
int main(int argc, char *argv[])
{
for (int i = 1; i < argc; ++i) {
char *pos = argv[i];
while (*pos != '\0') {
printf("%c\n", *(pos++));
}
printf("\n");
}
}
I have two questions:
Why are we starting the iterations of for loop at i=1, why not
start it at i=0, especially when we are ending the iterations at
i<argc and Not at i<=argc?
The second, third and fourth last lines of code! In char *pos =
argv[i];, we declare a pointer type variable and assign it a
pointer to a commandline parameter passed when running the program.
Then in while (*pos != '\0'), *pos dereferences the pointer
stored in pos, so *pos contains the actual value pointed by the
pointer stored in pos.
Then in printf("%c\n", *(pos++));, we have *(pos++), and
that is the actual question: (a) Why did he increment pos, and
(b) what is the meaning of dereferencing (pos++) with the
dereference operator *?
Why are we starting the iterations of for loop at i=1, why not start it at i=0, especially when we are ending the iterations at i<argc and Not at i<=argc?
We start at 1 because argv[0] holds the name of the program itself, which we don't care about. Ignoring the first element of an array does not move the index of the last element.
We have argc elements stored in argv[]. Therefore we mustn't run until i==argc but need to stop one element earlier, just as with every other array.
The second, third and fourth last lines of code! In char *pos = argv[i];, we declare a pointer type variable and assign it a pointer
to a commandline parameter passed when running the program.
Correct. pos is a pointer and points to the first character of the argument currently being processed, argv[i].
Then in while (*pos != '\0'), *pos dereferences the pointer stored in
pos, so *pos contains the actual value pointed by the pointer stored
in pos.
*pos contains the first character of the string we are currently inspecting.
Then in printf("%c\n", *(pos++));, we have *(pos++), and that is the
actual question: (a) Why did he increment pos, and (b) what is the
meaning of dereferencing (pos++) with the dereference operator *?
You have 2 things here:
1. (pos++): pos is a pointer to char, and ++ increments it to point to the next element, i.e. the next char. Because this is a post-increment, the expression yields the value pos had before the increment.
2. That old value of pos is dereferenced to read the char at that position.
As a result the while loop will read all characters, while the for loop handles all strings.
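The two loops can be sketched as a function that counts characters instead of printing them. total_arg_chars is a made-up name used only for illustration; the loop bounds and the *(pos++) expression are the same as in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the characters in argv[1] .. argv[argc-1],
   skipping argv[0] (the program name). */
size_t total_arg_chars(int argc, char *argv[])
{
    assert(argv != NULL);
    size_t total = 0;
    for (int i = 1; i < argc; ++i) {   /* i < argc: last valid index is argc-1 */
        const char *pos = argv[i];
        while (*pos != '\0') {
            char c = *(pos++);         /* read the current char, then advance */
            (void)c;                   /* a real program would use c here */
            ++total;
        }
    }
    return total;
}
```

Calling it from main as total_arg_chars(argc, argv) gives the combined length of all arguments after the program name.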
For the first part,
Why are we starting the iterations of for loop at i=1, why not start it at i=0, especially when we are ending the iterations at i<argc and Not at i<=argc?
Because, for hosted environment, argv[0] represents the executable name. Here, we're only interested in supplied command line arguments other than the executable name itself.
Quoting C11, chapter §5.1.2.2.1
If the value of argc is greater than zero, the string pointed to by argv[0]
represents the program name; [....] If the value of argc is
greater than one, the strings pointed to by argv[1] through argv[argc-1]
represent the program parameters.
Point to note: using i<=argc as the loop condition would be wrong. C arrays use 0-based indexing, so the last argument is argv[argc-1], and argv[argc] is a null pointer that must not be dereferenced.
For the second part,
(a) Why did he increment pos, and (b) what is the meaning of dereferencing (pos++) with the dereference operator *?
*(pos++), can also be read as *pos; pos++; which, reads the current value from the memory location pointed to by pos and then advances pos by one element.
To elaborate, at the beginning of each iteration of the for loop, by saying
char *pos = argv[i];
pos holds the pointer to the start of the string which holds the supplied program parameter. By incrementing it repeatedly (up to the null terminator), we're traversing the string, and by dereferencing it, we're reading the value at each of those locations.
Just for the sake of completeness, let me state, that the whole for loop body
char *pos = argv[i];
while (*pos != '\0') {
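printf("%c\n", *(pos++));
}
can be replaced with the following if you want each argument printed on a single line (the loop above prints one character per line, so the output is not identical):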
puts(argv[i]);
Why are we starting the iterations of for loop at i=1, why not start
it at i=0, especially when we are ending the iterations at i<argc and
Not at i<=argc?
Because first parameter is name of the program - and it seems author is not interested in it.
Second case is basically similar to the following (argv[i] basically being a char *):
So you have something like this:
char * c = "Hello"
and then char * p = c;
Now you have
+------------------+
| H  e  l  l  o \0 |
| ^                |
| |                |
+------------------+
  |
  |
  +
  p
When you do p++ you have
+------------------+
| H  e  l  l  o \0 |
|    ^             |
|    |             |
+------------------+
     |
   +-+
   +
   p
If you do now *p - the value you get is 'e'.
*(p++) is basically same as above two steps, just due to post increment, first the value where p points will be retrieved (before increment), and then p will advance.
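To see the "retrieve first, then advance" order in isolation, here is a small sketch. read_and_advance is a hypothetical helper; it takes the address of a pointer so the caller can observe that the pointer has moved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the char that *pp points at, then moves
   *pp forward by one, the same way *(p++) does on a local pointer. */
char read_and_advance(const char **pp)
{
    assert(pp != NULL && *pp != NULL);
    return *((*pp)++);
}
```

Starting with p pointing at "Hello", the first call returns 'H' and leaves p pointing at 'e'.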
Then in printf("%c\n", *(pos++));, we have *(pos++), and that is the
actual question: (a) Why did he increment pos, and (b) what is the
meaning of dereferencing (pos++) with the dereference operator *?
So in the while loop the author is traversing through the whole string until he meets null terminator '\0' and printing each character.
This on the other hand is repeated for each parameter in argv using the for loop.
Why are we starting the iterations of for loop at i=1, why not start
it at i=0, especially when we are ending the iterations at i < argc and
Not at i <= argc?
Note that argc also counts the name of the program being executed, which is the first entry (index zero). So the actual arguments start at index 1 and end at index argc - 1.
A note here: the command line arguments are stored in an array of pointers to char, i.e.
argv[0] -> "YourProgramName"
argv[1] -> "YourFirstArgument"
argv[2] -> "YourSecondArgument"
.
.
argv[argc-1] -> "YourLastArgument" //Remember argc-1 is the last argument
and so on. Note that each argument is a null-terminated string.
So in
char *pos = argv[i]; // Create another pointer to each string
while (*pos != '\0') {
printf("%c\n", *(pos++)); // Note %c, you're printing char by char.
}
You're just printing character by character using the format specifier %c in printf. So you need to dereference character by character in the while loop and this answers
a) Why did he increment pos, and
b) what is the meaning of dereferencing (pos++) with the dereference operator *?

How does the string populate an array

I'm creating a simple program to see how a string populates an array.
#include <stdio.h>
#include <string.h>
#include <stddef.h>
#include <stdlib.h>
int main(void)
{
char string1[100];
int i;
printf("Enter sentence.\n");
fgets(string1, 100, stdin);
for (i=0; i<=15; i++)
puts(&string1[i]);
return 0;
}
I'm having a bit of a problem understanding how the string is populating an array. My expectation is that the string will be completely stored in string1[0] and any further indexes will come up blank. However, when I throw the loop to see if my assumption is true, it turns out that every index has been filled in by the string. Am I misunderstanding how the string is filling the array?
For the string "Hello!", the memory representation would be something like this
+-------+-------+-------+-------+-------+-------+-------+
| 'H' | 'e' | 'l' | 'l' | 'o' | '!' | '\0' |
+-------+-------+-------+-------+-------+-------+-------+
The first cell, at index 0, contains the first character. And each subsequent character is contained in a cell with an increasing index.
Library functions like puts expect you to pass the address of the first character, and then they read the string up to \0.
So if you pass simply string1 or &string1[0], it will resolve to the address of 'H'.
If you pass &string1[1], it will resolve to the address of 'e', and the library function will think that is the first character, because that's the contract C strings are designed with.
Your problem is not the layout of string1 per se but how puts interprets it. Strings are represented by char arrays in C, and their end is marked by a null terminator (the character with code 0):
S e n t e n c e \0
^         ^
string1   &string1[5]
&string1[5] is a pointer to a single character, but since the memory from there on is not immediately ended by a null terminator, it is interpreted as a string and nce gets printed.
You'll need to use putchar (or putc with an explicit stream) and access individual characters:
putchar(string1[i]);
The string is not stored in string1[0]; only its first character is stored at string1[0]. The string starts at (string1+0). Here, &string1[0] or (string1+0) is a pointer to the first character of the C string held in string1.
In that sense, every valid index i of string1 gives you a valid pointer (string1 + i) that points to some part of that C string.
In the for loop you are printing the suffixes of the string, which are pointed to by (string1 + 0), (string1 + 1), (string1 + 2)...
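The suffix idea can be sketched as a helper that returns string1 + i only while i stays within the string. suffix_at is a hypothetical name, and the bounds check is an added assumption: the loop in the question runs to i<=15 regardless of how long the input is, so for short input it reads parts of the array that fgets never filled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the suffix of s that starts at index i,
   or NULL if i is beyond the terminating '\0'. */
const char *suffix_at(const char *s, size_t i)
{
    assert(s != NULL);
    if (i > strlen(s)) {
        return NULL;      /* s + i would be past the terminator */
    }
    return s + i;         /* same address as &s[i] */
}
```

For "Sentence", suffix_at(s, 5) points at "nce", and suffix_at(s, 8) points at the empty string formed by the terminator alone.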

Understanding C Pointers

#include <stdio.h>
#include <string.h>
int main(){
// char* is a pointer
char str_a[20];
char *pointer;
char *pointer2;
strcpy(str_a, "Hello, world\n");
pointer = str_a;
printf(pointer);
pointer2 = pointer + 2;
printf(pointer2);
strcpy(pointer2, "y you guys!\n");
printf(pointer);
}
Hi, I'm following a book and have come across an program explaining pointers which I am unable to understand. The book does not seem to mention why it is this way, which means that I have to consult elsewhere to get a better understanding. The above code generates the following output:
Hello, world! (pointer)
llo, world! (pointer2)
Hey you guys! (pointer)
What I fail to understand is that the last change to the variable pointer is on line 8. Yet the value of pointer can clearly be seen to change in the last line of output.
I would expect the value of pointer2 to be He rather than llo, world! on the second line of output. The only thing that I can think of - is that on line 14, when + 2 is specified, the first two bytes of pointer is chopped (or the remaining bytes are chopped off, and the first two bytes stay the same in pointer?)
But this cannot be the case - because when I add printf(pointer) below pointer2 = pointer + 2 - the output is "Hello, world!" again rather than "He"
First of all, pointer and pointer2 must be declared as pointers, with a '*' before each name (char *pointer;), as in the code above. Without the '*' they would be plain char variables, which would not be correct even if a book printed it that way.
What I fail to understand is that the last change to the variable
pointer is on line 8. Yet the value of pointer can clearly be seen to
change in the last line of output.
Yes! That's the point of pointers! pointer2 holds the same address as pointer plus 2 elements (remember that an array name such as str_a converts to the address of the array's first element), as assigned in "pointer2 = pointer + 2;". Therefore, the "strcpy(pointer2, "y you guys!\n");" call begins copying characters right after the "He", since pointer2 points to the first 'l'.
pointer and pointer2 basically point to the same chunk of memory. That said, the initial state is like this:
p   p2
H e l l o ,   w o r l d \n \0
Then you overwrite the chunk under p2 and it becomes:
p   p2
H e y   y o u   g u y s ! \n \0
Hope these charts make sense. When you print a string through a pointer, printing always continues to the end of the string, which is marked by \0.
Now, strings in C are zero-byte (\0) terminated, so when you assign str_a to pointer and print pointer, printf goes from the first address pointer points to until the terminating \0, so it prints all of the text.
But when assigning pointer2 the value of pointer plus 2, you make it point two elements after where pointer points, and when printing it you start from the first l of Hello and continue until the terminating \0.
And for the third one, you replace the content in memory starting where pointer2 points with "y you guys!\n", so the total string starting two elements earlier (where pointer points) is "Hey you guys!\n", leading to the result you got.
H e l l o ,   w o r l d \n     // start from pointer
^ pointer
H e l l o ,   w o r l d \n     // start from pointer2
^ pointer
    ^ pointer2
H e y   y o u   g u y s ! \n   // start from pointer
^ pointer
    ^ pointer2
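The same sequence can be checked without printing. This is a sketch under the assumption that the buffer has at least 20 bytes, as str_a does; build_greeting is a hypothetical wrapper around the statements from the question.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper: repeats the two strcpy calls from the question
   on a caller-supplied buffer of at least 20 bytes. */
void build_greeting(char str_a[20])
{
    assert(str_a != NULL);
    strcpy(str_a, "Hello, world\n");    /* 13 chars + '\0' fits in 20 */
    char *pointer = str_a;              /* points at 'H' */
    char *pointer2 = pointer + 2;       /* points at the first 'l' */
    strcpy(pointer2, "y you guys!\n");  /* overwrites from index 2 onward */
}
```

After the call the buffer holds "Hey you guys!\n", even though nothing was ever assigned to pointer after its initialization. When printing such a buffer, printf("%s", pointer) is a safer habit than printf(pointer), because the string is then never treated as a format string.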
In C, pointers to strings (or anything) point to the start of the string, not the end.
For strings, the end is identified by the presence of a null ('\0') character.
So, if you have a pointer to a string, say, p pointing to "Hello", this is how the string is laid out in memory:
<p points to 1000>
| 1000 | 1001 | 1002 | 1003 | 1004 | 1005 |
   H      e      l      l      o      \0
Now, if you add 2 to p, p now points to 1002:
| 1002 | 1003 | 1004 | 1005 |
   l      l      o      \0
So, obviously, accessing p as a string pointer will give you "llo". If you want to make the string end prematurely, you should set the target character to be '\0':
char mystr[10] = "Hello";
char *p = mystr;
*(p+2)='\0';
Hell, you don't even need a pointer to do that. You can just do mystr[2] = '\0';
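A short sketch of ending a string early, assuming the string lives in a writable array (a string literal must not be modified). truncate_at is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: shortens s to at most n characters by writing
   a '\0' at index n. Does nothing if s is already that short. */
void truncate_at(char *s, size_t n)
{
    assert(s != NULL);
    if (n < strlen(s)) {
        s[n] = '\0';      /* same as *(s + n) = '\0'; */
    }
}
```

With char mystr[10] = "Hello";, calling truncate_at(mystr, 2) leaves "He" in the array.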

C arrays and null byte

I have a question to better understand how arrays and null bytes work in C.
Let's say I have an int array of 13 cells.
Let's say I want cells number: 1, 2, 3 and 10 to have a value. The others that are left as default, automatically get the nullchar \0 as value ?
My understanding of \0 was that the nullbyte is always at the end of the array and its function is to tell the program where array ends. But seems to be wrong
I wrote a simple prog to verify that and seems it is like that:
int nums[13] = {1,2,3};
nums[10] = 69;
int i;
for(i=0;i<13;i++) {
if(nums[i]=='\0') {
printf("null char found! in position: %d\n",i);
}
else {
printf("element: %d found in position: %d of int array\n",nums[i],i);
}
}
return 0;
here is the output:
element: 1 found in position: 0 of int array
element: 2 found in position: 1 of int array
element: 3 found in position: 2 of int array
null char found! in position: 3
null char found! in position: 4
null char found! in position: 5
null char found! in position: 6
null char found! in position: 7
null char found! in position: 8
null char found! in position: 9
element: 69 found in position: 10 of int array
null char found! in position: 11
null char found! in position: 12
| 1 | 2 | 3 | \0 | \0 | \0 | \0 | \0 | \0 | \0 | 69 | \0 | \0 |
So why default cells are set with the \0 value ? instead of being left empty for example ?
Shouldn't the null char be just once at the end of the entire array ?
Thanks
There is no requirement in C that arrays need a \0 at the end. A NUL-terminator is only needed for C strings (which usually have the char or wchar_t or other character type). In a C string the \0 byte also doesn't have to be at the end of the array that contains it, but it must be at the end of the string part. It is perfectly valid to have 0's anywhere within an array. But if that array is used as a string, then the standard C string functions will interpret the 0 with the lowest index to signify the end of the string.
When you declare a variable (nums) in C with an initializer ({1,2,3}) in
int nums[13] = {1,2,3};
all indexes that aren't mentioned in the initializer (3 through 12) have their value initialized to 0. It is not possible to have 'empty' cells in an array. All cells will have a value, it is up to the program(mer) what values to consider empty.
C types correspond to memory, and memory has no real concept of "empty". There are languages where (almost) everything can be made "empty" by assigning a special constant (Python has None, for instance), but C doesn't allow that. One reason is that it would force every type to reserve a universal pattern for the empty state, and this has low-level repercussions. For instance, an unsigned char on a typical platform can take any value from 0 to 255 inclusive, because it occupies 8 bits. If you also wanted an empty state without sacrificing any of those values, you'd need at least one more bit, since all 8 bits can already be used for legitimate values, and this is undesirable for a lot of reasons.
For your array, the initialization syntax that you're using sets every unspecified element to zero. If you write:
char foo[4] = {1, 2, 3, 4};
then every element has a value (notice that it has no null byte in the end, because arrays don't need to have a null byte in the end–however, if you're using them as strings, then they very much should). If you write:
char foo[4] = {1, 2};
elements 0 and 1 have a specified value, but 2 and 3 don't, and with this syntax C will assume that you want to make them zero. On the other hand, if you write:
char foo[4];
you are not assigning any value to any element, and in this case C will not initialize the array at all if it is a local (automatic) variable; arrays with static storage duration are still zero-initialized. It would be undefined behavior to read from the uninitialized local array; in practice, usually, the elements will take the values of whatever happened to exist at its memory location previously.
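The zero-filling of a partial initializer can be checked with a small counting helper. count_zeros is a hypothetical name used here only to illustrate the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the elements of a that are equal to 0. */
size_t count_zeros(const int *a, size_t n)
{
    assert(a != NULL);
    size_t zeros = 0;
    for (size_t i = 0; i < n; ++i) {
        if (a[i] == 0) {
            ++zeros;
        }
    }
    return zeros;
}
```

For the array from the question (int nums[13] = {1,2,3}; followed by nums[10] = 69;), count_zeros(nums, 13) is 9: indexes 3 through 9, plus 11 and 12.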
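NULL is commonly defined as (void*)0, a null pointer constant: the integer zero cast to a generic pointer.
Its numeric value matches the ASCII code of the NUL character (\0), which is 0, but NULL is a pointer and NUL is a character.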
Arrays do not need to end with any special character/number.
Strings do need to end with a special character, and the reason is simple: it lets functions which operate on strings "know" where the string ends, for example:
char str[100] = {'h','e','l','l','o',0}; // same as {'h','e','l','l','o','\0'}
printf("%s",str);
prints:
hello
If the array held no NUL at all (for example char str[5] = {'h','e','l','l','o'};), printf would keep reading past the end of the array. That is undefined behavior and often shows up as garbage characters after "hello", because nothing tells the function where the string ends.
Even without the explicit 0 in the 6th cell, initializing char str[100] with only the "hello" characters makes the compiler fill the rest of the cells with zeroes (the C standard requires this for partial initializers), so it will be OK in both cases.
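A minimal sketch of that second case, with the explicit 0 left out of the initializer. hello_length is a hypothetical function name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical demo: a partially initialized char array is still a
   valid string because the remaining elements are set to 0. */
size_t hello_length(void)
{
    char str[100] = {'h', 'e', 'l', 'l', 'o'};  /* no explicit 0 */
    assert(str[5] == '\0');                     /* filled in by the initializer */
    return strlen(str);
}
```

The function returns 5, the number of characters before the first zero element.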
First of all, you are confusing C strings with regular arrays. With strings, there is always a \0 at the end of the string stored in the char array. It signifies the end of the string. For example, say you have this:
char myText[] = "hello";
In this case, the array places look like this:
myText[0] = 'h';
myText[1] = 'e';
myText[2] = 'l';
myText[3] = 'l';
myText[4] = 'o';
myText[5] = '\0';
However, arrays do not terminate with '\0'. Take another example:
int myArray[3] = {1, 2, 3};
According to your rule, since arrays have to terminate with a '\0', this is not a legal statement since we only give the array 3 elements instead of 4, and we would need 4 elements to include a '\0'. However, this is a completely legal statement in C. Clearly, space for the '\0' is not needed in arrays, just at the end of C strings.
Also note that '\0' is equivalent to the integer 0, as Kninnug pointed out in the comments:
\0 (the null character) isn't the same as the NULL-pointer. \0 is a byte with all bits set to 0, which would always compare equal to the int 0.
So, in your program, you could just equally check if:
if(nums[i] == 0)
Now, let's prove why you are getting your output.
Shouldn't the null char be just once at the end of the entire array?
No. Any elements not given a value in the initializer are initialized to zero. That is why you are seeing the output that you have: every element other than nums[0], nums[1], nums[2], and nums[10] is zero. Since you are checking for \0 (which is also 0), all of those elements match.
As alk pointed out in the comments, the null character and the null pointer constant are different. At the end of C strings, you see the null character (NUL), which is '\0' or 0. However, the null pointer constant (NULL) is different.

Recursive addition code in C

The following code works but I don't quite understand how if (*s == 0) works.
It checks if the string is 0?
Also for return(isnumber(s+1)) what is the logic behind that?
I know s is a string but I can just pass s+1 into a function? How does it even know what character I'm looking for?
int isnumber(char *s) {
if (*s == 0) {
return 1; /* Reached end, we've only seen digits so far! */
}
if(!isdigit(*s)) {
printf("The number is invalid\n");
return 0; /* first character is not a digit, so no go */
}
return(isnumber(s+1));
}
int main () {
char inbuf[LENGTH];
int i, j;
printf("Enter a string > ");
fgets(inbuf, LENGTH-1, stdin); // ignore carriage return
inbuf[strlen(inbuf)-1] = 0;
j = isnumber(inbuf);
....
}
This function is a recursive function that checks if a string contains all numbers. To understand how the code works, you must understand how C stores strings. If you have the string "123", C stores this string in memory, like this:
|-----------------------------------|
| 0x8707 | 0x8708 | 0x8709 | 0x870A |
|--------|--------|--------|--------|
| | | | |
| '1' | '2' | '3' | '\0' |
|-----------------------------------|
What C does is break your string up into characters, store them at some location in memory, and add a null character (\0) (ASCII 0) to the end of the string. This null character is how C knows where the string ends.
Your isnumber() function takes a char *s as a parameter. This is called a pointer. Internally, what's going on is that your main() function calls isnumber() and actually passes in the address of your string, not the string itself. This is important:
j = isnumber(inbuf);
How the compiler interprets this is call isnumber() and pass along the address of inbuf and assign the return value to j.
Now back up at the isnumber() function: it's receiving the address of inbuf and assigning it to s. By placing an asterisk (*) in front of s, you are doing something called dereferencing s. Dereferencing means you want the value stored at the address held in s. So the line that says if (*s == 0) is basically saying "if the value stored at the address held in s is equal to 0". Remember earlier I told you that in memory, strings always have a terminating null (\0) character? This is how your function knows to end and return.
The next thing to understand is pointer arithmetic. In C, sizeof(char) is 1 by definition, so a char always occupies exactly one byte (other types, such as int, usually occupy more). When you refer to (s+1), you are telling the computer to take the memory address held in s and add to it the size of one element of the pointed-to type. So if s points to 0x8707, then (s+1) evaluates to 0x8708 (s itself is unchanged), and inside the recursive call *s is the '2' in our string (see my memory block diagram above). This is how we iterate through each character in the string.
Hopefully this clears up the confusion!
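To make the "add the size of one element" rule concrete, the sketch below measures the distance in bytes between p and p + 1. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes between p and p + 1 for an int pointer.
   p must point at an object (or array element) of type int. */
size_t int_step_bytes(const int *p)
{
    assert(p != NULL);
    return (size_t)((const char *)(p + 1) - (const char *)p);
}

/* Hypothetical helper: elements (and, for char, bytes) between s and s + 1. */
size_t char_step_bytes(const char *s)
{
    assert(s != NULL);
    return (size_t)((s + 1) - s);
}
```

int_step_bytes returns sizeof(int), while char_step_bytes always returns 1, which is why s+1 in isnumber lands on the very next character.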
The statement if (*s == 0) checks to see if the char s points to is zero. In other words, it checks to see if s is a zero-length string and returns 1 if so.
The statement return (isnumber(s+1)) passes s+1, a pointer to the second char in the string, to isnumber() (s itself is not modified). That call returns true if the rest of the string, starting at s[1], consists only of digits.
In C, strings are terminated with a null character.
(*s == 0) is checking for the null terminator.
This code is a little weirder.
return(isnumber(s+1));
Since the current character is a digit, keep going...call the function again starting at the NEXT character. This is a recursive function call and there is really no need when iteration would be simpler.
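For comparison, an iterative version might look like the sketch below. isnumber_iter is a hypothetical name; unlike the original it does not print a message, and the cast to unsigned char is an addition to keep isdigit well defined for negative char values.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical iterative variant: returns 1 if every character of s
   is a digit, 0 otherwise. */
int isnumber_iter(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s) {          /* same walk as isnumber(s+1) */
        if (!isdigit((unsigned char)*s)) {
            return 0;
        }
    }
    return 1;   /* like the original, the empty string counts as valid */
}
```

isnumber_iter("123") returns 1 and isnumber_iter("12a") returns 0.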

Resources