I want to read my string backwards. I did that, but I don't understand how two parts of my code work.
char s1[] = "ABC";
printf("%s", s1);
size_t len = strlen(s1);
printf("\n%d", len);
char *t = s1 + len - 1;
printf("\n%s\n", t);
while (t>=s1)
{
printf("%c", *t);
t = t - 1;
}
First: How does t point to letter C?
Second: How is it possible to add variable len which holds integer with an array that holds literals? Is it because pointer t adds their addresses by using pointer arithmetic?
This
char s1[] = "ABC";
looks like
 0x100 0x101 0x102 0x103      (assume 0x100 is the base address of s1)
-------------------------
|  A  |  B  |  C  | \0  |
-------------------------
  s1
Here s1, which is a char array, starts at base address 0x100 (an assumed value).
I want to read my string reversely ?
For this you need a pointer to location 0x102, i.e. the last character of the array before the terminator. That is what these two lines do:
size_t len = strlen(s1); /* length of the string in s1, i.e. it returns 3 here */
char *t = s1 + len - 1; /* (0x100 + 3*1) - 1, i.e. char pointer t points to 0x102 */
Now it looks like
 0x100 0x101 0x102 0x103
-------------------------
|  A  |  B  |  C  | \0  |
-------------------------
  s1          t   <-- t points here
Now *t yields the char stored at the location t points to, i.e. the value at 0x102, so it prints C. In the next iteration you need to print the char one position back, which is why you do t = t - 1;.
Note : Here
char *t = s1 + len - 1;
s1 decays to a char pointer and len is an integer variable, so pointer arithmetic automatically scales the integer by the size of the type being pointed to. For example,
char *t = s1 + len;
evaluated as
t = 0x100 + 3*sizeof(*s1); ==> 0x100 + 3*1 ==> 0x103
char s1[] = "ABC";
s1 is an array of 4 characters char[4] with values {'A','B','C','\0'}
size_t len = strlen(s1);
s1 "decays" (read: is automagically converted) from an array of type to a pointer to type. So s1 decays from an array of 4 characters into a pointer to the first character of the array.
The strlen counts the number of bytes before encountering the null byte delimiter '\0'. Starting from 'A' we can count 'A', 'B', 'C' - that's len = 3.
On most architectures a pointer is represented as a plain integer address. You can add to and subtract from pointers, and you can convert them to an integer with uintptr_t. But adding to a pointer without a cast uses pointer arithmetic, which means that the numeric value of (int*)5 + 2 equals 5 + 2 * sizeof(int).
char *t = s1 + len - 1;
s1 decays to the pointer to the first character inside the s1 array, that's 'A'. We add + (len = 3), that means that s1 + 3 points to the byte holding '\0' inside the s1 = (char[4]){'A','B','C','\0'} array. Then we subtract - 1, so t will now point to a byte holding the character 'C' inside the s1 array.
while (t >= s1) {
... *t ...
t = t - 1;
}
Start: s1 points to 'A'. t points to 'C'.
while: t is greater than s1, by two. t - s1 = 2, i.e. s1 + 2 = t.
loop: *t is equal to 'C'.
decrement: t--, so now t will point to 'B'.
while: t is greater than s1, by one.
loop: *t is equal to 'B'.
decrement: t--, so now t will point to 'A'.
while: now t is equal to s1. Both point to the first character of the array.
loop: *t is equal to 'A'.
decrement: t--, so now t will point to an unknown location before the array. As pointers (on most architectures) are simple integers, you can decrement and increment them like normal variables.
while: t is now lower than s1. Loop terminates.
Notes:
printf("\n%d", len); is undefined behavior and spawns nasal demons. Use printf("\n%zu", len); to print a size_t variable.
You can print a pointer value by using the %p specifier and casting to void *: printf("%p", (void*)t)
t = s1 - 1. Forming a pointer to one element before an array is undefined behavior in C. That happens at the end of the loop, when the last t = t - 1 runs. Change it to a do { .. } while loop.
In C, arrays are a bit strange. An array-of-x can decay into a pointer-to-x very easily, for example when it is passed to another routine or, as in your case, when something is added to it. So you are correct, it is pointer arithmetic. (In pointer arithmetic you can add an integer to a pointer to get another pointer.)
Related
Okay, say I have a sequence "abcdefg" and I do
char* s = strdup("abcdefg");
char* p;
char* q;
p = strchr(s, 'c');// -> cdefg
q = strchr(p, 'd');// -> defg
And I want to display s - p basically abcdefg - cdefg = ab, can I do this using pointer arithmetic?
You can do:
printf("%.*s", (int)(p - s), s);
This prints s with a maximum length of p - s which is the number of characters from s to p.
You cannot. The string is stored as a sequence of letters ending in a zero byte. If you print a pointer as a string, you tell printf to print the sequence up to the zero byte, which is how you get cdefg and defg. There is no sequence of bytes in memory that holds 'a', 'b', 0, so you would need to copy that into a new char array. p - s will simply give you 2 (the distance between the two pointers); no amount of pointer arithmetic will copy a sequence for you.
To get a substring, see Strings in C, how to get subString - copy with strncpy, then place a null terminator.
Okay, if you need to cut p from s, the following code should work:
char s[] = "abcdefg"; // mutable array initialized with a copy of the literal
char* p = strchr(s, 'c');// -> cdefg
if (p != NULL)
    *p = 0; // cut the found substring off from main string
printf("%s\n", s); // -> ab
I got this code snippet from here:
int main(int argc, char *argv[])
{
for (int i = 1; i < argc; ++i) {
char *pos = argv[i];
while (*pos != '\0') {
printf("%c\n", *(pos++));
}
printf("\n");
}
}
I have two questions:
Why are we starting the iterations of for loop at i=1, why not
start it at i=0, especially when we are ending the iterations at
i<argc and Not at i<=argc?
The second, third and fourth last lines of code! In char *pos =
argv[i];, we declare a pointer type variable and assign it a
pointer to a commandline parameter passed when running the program.
Then in while (*pos != '\0'), *pos dereferences the pointer
stored in pos, so *pos contains the actual value pointed by the
pointer stored in pos.
Then in printf("%c\n", *(pos++));, we have *(pos++), and
that is the actual question: (a) Why did he increment pos, and
(b) what is the meaning of dereferencing (pos++) with the
dereference operator *?
Why are we starting the iterations of for loop at i=1, why not start it at i=0, especially when we are ending the iterations at i<argc and not at i<=argc?
We start at 1 because argv[0] holds the name of the program itself, which we don't care about. Ignoring the first element of an array does not change the index of its last element.
We have argc elements stored in argv[]. Therefore we mustn't run until i==argc but need to stop one element earlier, just as with every other array.
The second, third and fourth last lines of code! In char *pos = argv[i];, we declare a pointer type variable and assign it a pointer
to a commandline parameter passed when running the program.
Correct. pos is a pointer and points to the start of the i-th string passed via the command line.
Then in while (*pos != '\0'), *pos dereferences the pointer stored in
pos, so *pos contains the actual value pointed by the pointer stored
in pos.
*pos contains the first character of the string we are currently inspecting.
Then in printf("%c\n", *(pos++));, we have *(pos++), and that is the
actual question: (a) Why did he increment pos, and (b) what is the
meaning of dereferencing (pos++) with the dereference operator *?
You have 2 things here:
1. (pos++): pos is a pointer to char, and ++ increments the pointer so that it points to the next element, i.e. the next char. Because it is a post-increment, the expression still yields the old value of pos.
2. That old value of pos (from before the increment) is dereferenced to read the char at that position.
As a result the while loop will read all characters, while the for loop handles all strings.
For the first part,
Why are we starting the iterations of for loop at i=1, why not start it at i=0, especially when we are ending the iterations at i<argc and Not at i<=argc?
Because, for hosted environment, argv[0] represents the executable name. Here, we're only interested in supplied command line arguments other than the executable name itself.
Quoting C11, chapter §5.1.2.2.1
If the value of argc is greater than zero, the string pointed to by argv[0]
represents the program name; [....] If the value of argc is
greater than one, the strings pointed to by argv[1] through argv[argc-1]
represent the program parameters.
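Point to note: using i<=argc as the loop condition would be wrong. C arrays use 0-based indexing, so the last argument is argv[argc-1]; argv[argc] is a null pointer, and dereferencing it is undefined behavior.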
For the second part,
(a) Why did he increment pos, and (b) what is the meaning of dereferencing (pos++) with the dereference operator *?
*(pos++), can also be read as *pos; pos++; which, reads the current value from the memory location pointed to by pos and then advances pos by one element.
To elaborate, at the beginning of each iteration of the for loop, by saying
char *pos = argv[i];
pos holds a pointer to the start of the string that holds the supplied program parameter. By incrementing it repeatedly (up to the null terminator) we traverse the string, and by dereferencing it we read the value at each location.
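Just for the sake of completeness: if you want each argument printed on one line instead of one character per line (so this is not an exact equivalent), the whole for loop body
char *pos = argv[i];
while (*pos != '\0') {
printf("%c\n", *(pos++));
}
can be substituted using
puts(argv[i]);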
Why are we starting the iterations of for loop at i=1, why not start
it at i=0, especially when we are ending the iterations at i < argc and
not at i <= argc?
Because first parameter is name of the program - and it seems author is not interested in it.
Second case is basically similar to the following (argv[i] basically being a char *):
So you have something like this:
char * c = "Hello";
and then char * p = c;
Now you have
+-------------------+
| H  e  l  l  o  \0 |
| ^                 |
+-|-----------------+
  |
  p
When you do p++ you have
+-------------------+
| H  e  l  l  o  \0 |
|    ^              |
+----|--------------+
     |
     p
If you do now *p - the value you get is 'e'.
*(p++) is basically same as above two steps, just due to post increment, first the value where p points will be retrieved (before increment), and then p will advance.
Then in printf("%c\n", *(pos++));, we have *(pos++), and that is the
actual question: (a) Why did he increment pos, and (b) what is the
meaning of dereferencing (pos++) with the dereference operator *?
So in the while loop the author is traversing through the whole string until he meets null terminator '\0' and printing each character.
This on the other hand is repeated for each parameter in argv using the for loop.
Why are we starting the iterations of for loop at i=1, why not start
it at i=0, especially when we are ending the iterations at i < argc and
Not at i <= argc?
Note that argc also counts the name of the program being executed, which is the first entry (index zero). So the actual arguments start at index 1 and end at index argc - 1.
A note here: the command line arguments are stored in an array of pointers to char, i.e.
argv[0] -> "YourProgramName"
argv[1] -> "YourFirstArgument"
.
.
argv[argc-1] -> "YourLastArgument" //Remember argc-1 is the last argument
and so on. Note that each of the arguments is a null-terminated string.
So in
char *pos = argv[i]; // Create another pointer to each string
while (*pos != '\0') {
printf("%c\n", *(pos++)); // Note %c, you're printing char by char.
}
You're just printing character by character using the format specifier %c in printf. So you need to dereference character by character in the while loop and this answers
a) Why did he increment pos, and
b) what is the meaning of dereferencing (pos++) with the dereference operator *?
Here is a small function from the book Head First on C.
This function should display a string backward on the screen.
void print_reverse(char *s)
{
size_t len = strlen(s);
char *t = s + len - 1;
while ( t >= s )
{
printf("%c", *t);
t -- ;
}
puts("");
}
Unfortunately, I don't understand how it reverses the string.
size_t len = strlen(s); // computes the number of characters in the string
char *t = s + len - 1; // 't' is a pointer to a type char
I don't understand what values are used for 's' in this equation; I've read that when the name of an array variable is assigned to a pointer, it actually refers to the address of the first character, i.e. array[0]; thus does 's' have the value 0 here, or does it have the integer value of a particular character?
For example, the word is s[] = "hello". Then if s[0] = 'h'. Adding strlen(s) to s[0] should yield 104 (in decimal), thus s + len - 1 = 104 + 6 - 1 = 109 (-1 because I assume I have to subtract '\0' character that strlen takes into account). But 109 is 'm'. I don't see the way this equation traverses the string.
while (t >= s); I assume this means that while t is not equal to zero, is it correct?
Thank you!
First things first, s is a pointer and not a normal char variable.
So, when you assign memory address of a string to s, it contains the address of the first location.
Pointer arithmetic: by adding 1 to a pointer, you make it point to the next memory location. Recall that strings are stored in contiguous memory locations.
So, if s points to "Hello",
printf("%c", *s) will print 'H'
printf("%c", *(s+1)) will print 'e'.
Now, we have the length (=5) in len. When we add len - 1 to s, we make it point 4 locations ahead. It now points to 'o'.
Then by doing while(t >= s) we compare two pointers (t and s), print the value at the address pointed to by t and decrement t. This repeats as long as t has not moved before s, which points to the first element.
Illustration:
Initial condition:
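H e l l o
^       ^
s       t
Now we print *t and decrement t.
Output: o
H e l l o
^     ^
s     t
We continue further:
Output: olle
H e l l o
^
s, t
Now t == s, so the condition t >= s still holds: we print 'H' (output: olleH) and decrement t once more. After that t < s and we stop.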
void print_reverse(char *s)
here s is the pointer to the beginning of the string
size_t len = strlen(s);
this equals the number of characters in the string (\0 not counted)
char *t = s + len - 1;
At this point, t is a new pointer which points at the last element of the string (read something about pointer arithmetic if this is not clear to you)
while ( t >= s )
{
printf("%c", *t);
t -- ;
}
In this loop t gets decremented at each iteration, so that every time it points at the previous character in the string.
At the last iteration, t==s, which means that you are printing the first element of the string.
s contains the address of the first character of the array, not the first character itself. An address is not an offset from the start of the array, but an arbitrary (to the user) value. So when you add len - 1 to s, the result is a pointer to the character at index len - 1.
Put another way, this:
char *t = s + len - 1;
Is the same as this:
char *t = &s[len - 1];
The condition while (t >= s) evaluates to true as long as t points to a memory location at or after the start of the array.
It reverses the string by setting the pointer t to the last character of the string and then doing the following:
1. Print the character that t points to.
2. Decrease t by one (make it move one char to the left).
3. Go to (1) unless t now points before the first character of the string.
s is a pointer to the first character of the string. A string is simply a sequence of characters in memory terminated by the NUL character ('\0'). When t == s, t points to the first character; when t < s, t points to the byte before the first character, and this byte is not part of the string.
And when the loop terminates, t won't be zero. t and s are pointers, that means their values are memory addresses. The value of s is the memory address of the first character of the string and this address will certainly not be zero.
char *start = str;
char *end = start + strlen(str) - 1; /* -1 for \0 */
char temp;
How does that find the end of the string? If the string is giraffe, start holds that string, then you have:
char *end = "giraffe" + 7 - 1;
How does that give you the last char in giraffe? (e)
Here's how "giraffe" is laid out in memory, with each number giving that character's index.
g i r a f f e \0
0 1 2 3 4 5 6 7
The last character e is at index 6. str + 6, alternatively written as &str[6], yields address of the last character. This is the address of the last character, not the character itself. To get the character you need to dereference that address, so *(str + 6) or str[6] (add a *, or remove the &).
In English, here are various ways to access parts of the string:
str  == str + 0    == &str[0]   address of character at index 0
        str + 6    == &str[6]   address of character at index 6
*str == *(str + 0) == str[0]    character at index 0 ('g')
        *(str + 6) == str[6]    character at index 6 ('e')
If the string is giraffe, start holds that string
Not exactly, start is a char *, so it holds a pointer, not a string. It points to the first character of "giraffe": g.
start + 1 is also a char * pointer and it points to the next element of size char, in this case the i.
start + 5 is a pointer to the second f.
start + 6 is a pointer to the e.
start + 7 is a pointer to the special character \0 or NUL, which denotes the end of a string in C.
start is a pointer to the string "giraffe" in memory, not the string itself.
Likewise, end is a pointer to the start of the string + 7 - 1 = 6 bytes, i.e. to the last character 'e'. strlen does not count the null terminator; the - 1 is needed because indices start at 0.
There is no string type in C, just char, and arrays of char.
I would tell you to google "c strings", but since that's apparently a term that you need safesearch for these days, here's a link: http://www.cprogramming.com/tutorial/lesson9.html
The code snippet is:
char c[]="gate2011";
char *p=c;
printf("%s",p+p[3]-p[1]);
The output is:
2011
Can anyone explain how it came?
-----Thanks in advance-----
Going through each line in turn:
char c[] = "gate2011";
Let's assume that array c is located at memory address 200.
char *p = c;
p is now a pointer to c. It therefore points to memory address 200. The actual content of p is "200", indicating the memory address.
printf("%s", p + p[3] - p[1]);
The value of p is 200 when we treat it like a pointer. However, we can also index it like an array. p[3] gets the value of the 4th item in the string, which is 'e'. C stores characters as numeric codes; in ASCII, the code for 'e' is 101.
Next, we get the value of p[1]. p[1] == 'a', which has an ASCII value of 97. Substituting these into the expression:
printf("%s", 200 + 101 - 97);
That evaluates to:
printf("%s", 204);
At memory address 204, we have the string "2011". Therefore, the program prints "2011".
I'm not sure why you'd want to do something like this, but anyway, this is what's happening.
p + p[3] - p[1]
Here you are taking the value of a pointer, adding the value of the char at position 3, and then subtracting the value of the char at position 1. The char values are implicitly converted to int before the addition and subtraction.
If p is location 1000, then the sum 1000 + 101 (ASCII for e) - 97 (ASCII for a) is computed. Therefore the result is a pointer to location 1004 in memory. The %s in the printf then substitutes the string that starts at this location and ends at the special character '\0'. So the string is effectively clipped to "2011" (the first 4 characters are skipped because 101 - 97 = 4).
If this still doesn't make sense, I'd suggest you have a good look at how arrays in C work.
What did you expect, and why?
p[3]-p[1] = 'e'-'a' = 4
p+4 = "2011"