Okay, say I have a sequence "abcdefg" and I do
char* s = strdup("abcdefg");
char* p;
char* q;
p = strchr(s, 'c');// -> cdefg
q = strchr(p, 'd');// -> defg
And I want to display s - p basically abcdefg - cdefg = ab, can I do this using pointer arithmetic?
You can do:
printf("%.*s", (int)(p - s), s);
This prints s with a maximum length of p - s which is the number of characters from s to p.
You cannot do it with pointer arithmetic alone. The string is stored as a sequence of characters ending in a zero byte. If you print a pointer as a string, you tell printf to print the sequence up to the zero byte, which is how you get cdefg and defg. There is no sequence of bytes in memory that holds 'a', 'b', 0, so you would need to copy those characters into a new char array. p - s will simply give you 2 (the distance between the two pointers); no amount of pointer arithmetic will copy a sequence for you.
To get a substring, see Strings in C, how to get subString - copy with strncpy, then place a null terminator.
Okay, if you need to cut p from s, the following code should work:
char s[100]; // Use a writable array
strcpy(s, "abcdefg"); // init array (char s[100] = "abcdefg"; would also work)
char* p = strchr(s, 'c');// -> cdefg
*p = 0; // cut the found substring off from main string
printf("%s\n", s); // -> ab
I want to read my string backwards. I did that, but I don't understand how two parts of my code work.
char s1[] = "ABC";
printf("%s", s1);
size_t len = strlen(s1);
printf("\n%d", len);
char *t = s1 + len - 1;
printf("\n%s\n", t);
while (t>=s1)
{
printf("%c", *t);
t = t - 1;
}
First: How does t point to letter C?
Second: How is it possible to add variable len which holds integer with an array that holds literals? Is it because pointer t adds their addresses by using pointer arithmetic?
This
char s1[] = "ABC";
looks like
0x100 0x101 0x102 0x103 . . . .Assume 0x100 is base address of s1
--------------------------------
| A | B | C | \0 |
--------------------------------
s1
Here s1, which is a char array, starts at base address 0x100 (let's assume).
I want to read my string reversely ?
For this you need something that points to location 0x102, i.e. the last element of the array. For that,
size_t len = strlen(s1); /* finding length of array s1 i.e it returns 3 here */
char *t = s1 + len - 1; /* (0x100 + 3*1) - 1 i.e char pointer t points to 0x102 */
the above two lines of code are written. Now it looks like
0x100 0x101 0x102 0x103 . . . .
--------------------------------
| A | B | C | \0 |
--------------------------------
s1 t <-- t points here
Now when you do *t it prints the char value at t location i.e value at 0x102 i.e it prints C, in the next iteration you need to print the char one position back, so for that you are doing t = t - 1;.
Note : Here
char *t = s1 + len - 1;
s1 is a char array that decays to a char pointer, and len is an integer variable, so the pointer arithmetic automatically scales the offset by the size of the type being pointed to. For example,
char *t = s1 + len;
evaluated as
t = 0x100 + 3*sizeof(*s1); ==> 0x100 + 3*1 ==> 0x103
char s1[] = "ABC";
s1 is an array of 4 characters char[4] with values {'A','B','C','\0'}
size_t len = strlen(s1);
s1 "decays" (read: is automagically converted) from an array of type to a pointer to type. So s1 decays from an array of 4 characters into a pointer to the first character of the array.
The strlen counts the number of bytes before encountering the null byte delimiter '\0'. Starting from 'A' we can count 'A', 'B', 'C' - that's len = 3.
Pointers in C behave much like normal integers (ok, on most architectures). You can add to them and subtract from them, and use uintptr_t to convert them to an integer. But adding to them without a cast uses pointer arithmetic, which means that the address produced by (int*)5 + 2 is equal to 5 + 2 * sizeof(int).
char *t = s1 + len - 1;
s1 decays to the pointer to the first character inside the s1 array, that's 'A'. We add + (len = 3), that means that s1 + 3 points to the byte holding '\0' inside the s1 = (char[4]){'A','B','C','\0'} array. Then we subtract - 1, so t will now point to a byte holding the character 'C' inside the s1 array.
while (t >= s1) {
... *t ...
t = t - 1;
}
Start: s1 points to 'A'. t points to 'C'.
while: t is greater than s1, by two: t - s1 = 2, i.e. s1 + 2 = t.
loop: *t is equal to 'C'.
decrement: t--, so now t will point to 'B'.
while: t is greater than s1, by one.
loop: *t is equal to 'B'.
decrement: Then t--, so now t will point to 'A'.
while: Now t is equal to s1. Both point to the first character of the array.
loop: *t is equal to 'A'.
decrement: Then t--, so now t will point to an unknown location before the array. As pointers (on most architectures) are simple integers, you can decrement and increment them as normal variables.
while: t is now lower than s1. Loop terminates.
Notes:
printf("\n%d", len); is undefined behavior and spawns nasal demons. Use printf("\n%zu", len); to print a size_t variable.
you can print a pointer value by using the %p specifier and casting to void *: printf("%p", (void*)t)
t = s1 - 1. Assigning a pointer to one element before an array is undefined behavior in C. That happens in the end condition of the loop when t = t - 1. Change to do { .. } while loop.
In the C language, arrays are a bit strange. An array-of-x decays into a pointer-to-x very easily, for example when you pass it to another routine or, as in your case, when you add to it. So you are correct: it is pointer arithmetic. (In pointer arithmetic you can add a pointer and an integer to get a pointer.)
So I was studying a code about a custom linux shell and I am having a hard time understanding this section:
// add null to the end
char *end;
end = tokenized + strlen(tokenized) - 1;
end--;
*(end + 1) = '\0';
I don't understand what decreasing a char pointer yields and how this section functions in general, I get that it is pointing end at the last position of the tokenized array but I don't understand the following two lines. If anything similar has been posted I don't mind supplying me the links (although I did a good amount of research). Thank you!
Also a quick question is: I don't believe end is an array. Am I wrong on this?
Decreasing the pointer moves its location in memory to the preceding address. In the case of a char * string, end will now point to its preceding character.
// add null to the end
// declare a `char *`
char *end;
// set `end` to point to the last character of `tokenized`
end = tokenized + strlen(tokenized) - 1;
// decrease `end`; it now points to the character before the one it was pointing to
end--;
// set the character after the one `end` points to to `NUL`
*(end + 1) = '\0';
I commented your code as I understand it...
char *end;
end = tokenized + strlen(tokenized) - 1; //
end--;
*(end + 1) = '\0';
strlen(tokenized)
This is the offset location of the null-terminator in the tokenized string.
Meaning, if you increment the pointer by this offset (the number of non-null characters) you end up with a pointer to the index right after the last character. In order to get the index of the last character itself you subtract one from the offset.
Let offset = strlen(tokenized) - 1
tokenized + offset
This means the pointer tokenized is moved by an offset. If the pointer would reference 1 byte that means it just increments by 1, if its 2 bytes by 2, and so on. This is because if you have, e.g. an array of ints you want to access only integers when taking offsets of that array pointer. The int is at least 2 bytes in size so the pointer will move at least 2 bytes further when being incremented.
end--
Same thing as above, this decrements the pointer by one, since we moved to the last character of the string using our offset we are now at the second to last character in the string, not much to say other than this is equivalent to end = end - 1;.
*(end + 1) = '\0'
Again, we move an offset of 1 ahead with the pointer, so we once again point at the last character of the string. This is rather redundant since we only just decremented the pointer by the same offset. The only difference here is that the pointer end itself is not changed.
Then we dereference the pointer and write to it, which means we change the value the pointer is currently pointing at, namely the last character of the string. We change this to '\0', which moves the terminating null byte to this location, effectively shortening the string by cutting off the last character.
The code here is equivalent to
size_t len = strlen(tokenized);
tokenized[len - 1] = '\0';
char *end = tokenized + len - 2; // we still have this pointer
Note that we do -2 now because we include the end--; statement.
The current end remains pointing at the last character of the now shortened string.
An illustration of what is happening:
tokenized = "hello world"; // [h e l l o w o r l d \0 ]
tokenized = "hello worl"; // [h e l l o w o r l \0 \0 ]
I don't believe end is an array. Am I wrong on this?
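You are right: end is not an array, it is a pointer (char *). Arrays and pointers are distinct types in C: an array is a block of elements, while a pointer holds an address. In most expressions an array is converted ("decays") to a pointer to its first element, which is why the two can often be used the same way, but they differ in sizeof results and in that an array cannot be reassigned.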
After assigning 26th element, when printed, still "Computer" is printed out in spite I assigned a character to 26th index. I expect something like this: "Computer K "
What is the reason?
#include <stdio.h>
int main()
{
char m1[40] = "Computer";
printf("%s\n", m1); /*prints out "Computer"*/
m1[26] = 'K';
printf("%s\n", m1); /*prints out "Computer"*/
printf("%c", m1[26]); /*prints "K"*/
}
At index 8 of that string there is a \0 character, and %s prints only until it finds a \0 (the end of the string). The character 'K' is there at index 26, but it will not be printed because a \0 is found before it.
char s[100] = "Computer";
is basically the same as
char s[100] = { 'C', 'o', 'm', 'p', 'u','t','e','r', '\0'};
Since printf stops when the string is 0-terminated it won't print character 26
Whenever you partially initialize an array, the remaining elements are filled with zeroes. (This is a rule in the C standard, C17 6.7.9 §19.)
Therefore char m1[40] = "Computer"; ends up in memory like this:
[0] = 'C'
[1] = 'o'
...
[7] = 'r'
[8] = '\0' // the null terminator you automatically get by using the " " syntax
[9] = 0 // everything to zero from here on
...
[39] = 0
Now of course \0 and 0 mean the same thing, the value 0. Either will be interpreted as a null terminator.
If you go ahead and overwrite index 26 and then print the array as a string, it will still only print until it encounters the first null terminator at index 8.
If you do like this however:
#include <stdio.h>
int main()
{
char m1[40] = "Computer";
printf("%s\n", m1); // prints out "Computer"
m1[8] = 'K';
printf("%s\n", m1); // prints out "ComputerK"
}
You overwrite the null terminator, and the next zero that happened to be in the array is treated as null terminator instead. This code only works because we partially initialized the array, so we know there are more zeroes trailing.
Had you instead written
int main()
{
char m1[40];
strcpy(m1, "Computer");
This is not initialization but run-time assignment. strcpy would only set index 0 to 8 ("Computer" with null term at index 8). Remaining elements would be left uninitialized to garbage values, and writing m1[8] = 'K' would destroy the string, as it would then no longer be reliably null terminated. You would get undefined behavior when trying to print it: something like garbage output or a program crash.
In C strings are 0-terminated.
Your initialization fills all array elements after the 'r' with 0.
If you place a non-0 character in any random field of the array, this does not change anything in the fields before or after that element.
This means your string is still 0-terminated right after the 'r'.
How should any function know that after that string some other string might follow?
That's because after "Computer" there's a null terminator (\0) in your array. If you add a character after this \0, it won't be printed because printf() stops printing when it encounters a null terminator.
Just as an addition to the other users answers - you should try to answer your question by being more proactive in your learning. It is enough to write a simple program to understand what is happening.
#include <stdio.h>
int main()
{
char m1[40] = "Computer";
printf("%s\n", m1); /*prints out "Computer"*/
m1[26] = 'K';
for(size_t index = 0; index < 40; index++)
{
printf("m1[%zu] = 0x%hhx ('%c')\n", index, (unsigned char)m1[index], (m1[index] >=32) ? m1[index] : ' ');
}
}
Here is a small function from the book Head First on C.
This function should display a string backward on the screen.
void print_reverse(char *s)
{
size_t len = strlen(s);
char *t = s + len - 1;
while ( t >= s )
{
printf("%c", *t);
t -- ;
}
puts("");
}
Unfortunately, I don't understand how it reverses the string.
size_t len = strlen(s); // computes the number of characters in the string
char *t = s + len - 1; // 't' is a pointer to a type char
I don't understand what values are used for 's' in this equation; I've read that when the name of an array variable is assigned to a pointer, it actually refers to the address of the first character, i.e. array[0]; thus does 's' have the value 0 here, or does it have the integer value of a particular character?
For example, the word is s[] = "hello". Then if s[0] = 'h'. Adding strlen(s) to s[0] should yield 104 (in decimal), thus s + len - 1 = 104 + 6 - 1 = 109 (-1 because I assume I have to subtract '\0' character that strlen takes into account). But 109 is 'm'. I don't see the way this equation traverses the string.
while (t >= s); I assume this means that while t is not equal to zero, is it correct?
Thank you!
First things first, s is a pointer and not a normal char variable.
So, when you assign memory address of a string to s, it contains the address of the first location.
Pointer arithmetic: by adding 1 to a pointer, you make it point to the next memory location. Recall that strings are stored in contiguous memory locations.
So, if s points to "Hello",
printf("%c", *s) will print 'H'
printf("%c", *(s+1)) will print 'e'.
Now, we have the length (=5) in len. When we add len - 1 to s, we get a pointer 4 locations ahead. It points to 'o'.
Then with while (t >= s) we compare the two pointers (t and s): we print the value at the address t points to and decrement t, until t drops below s, which points to the first element.
Illustration:
Initial condition:
H e l l o
*s      *t
Now we print *t and decrement it.
Output: o
H e l l o
*s    *t
We continue it further:
Output: olleH
H e l l o
*s
*t
Here t == s, so 'H' is the last character printed. The next decrement makes t < s, the condition t >= s fails, and we stop.
void print_reverse(char *s)
here s is the pointer to the beginning of the string
size_t len = strlen(s);
this is equal to the number of characters in the string (\0 not counted)
char *t = s + len - 1;
At this point, t is a new pointer which points at the last element of the string (read something about pointer arithmetic if this is not clear to you)
while ( t >= s )
{
printf("%c", *t);
t -- ;
}
In this loop t gets decremented at each iteration, so that every time it points at the previous character in the string.
At the last iteration, t==s, which means that you are printing the first element of the string.
s contains the address of the first character of the array, not the first character itself. An address is not an offset from the start of the array, but an arbitrary (to the user) value. So when you add len - 1 to s, the result is a pointer to the character at index len - 1.
Put another way, this:
char *t = s + len - 1;
Is the same as this:
char *t = &s[len - 1];
The condition while (t >= s) evaluates to true as long as t points to a memory location at or after the start of the array.
It reverses the string by setting the pointer t to the last character of the string and then doing the following:
1. Print the character that t points to.
2. Decrease t by one (make it move one char to the left).
3. Go to (1) unless t now points before the first character of the string.
s is a pointer to the first character of the string. A string is simply a sequence of characters in memory terminated by the NUL character ('\0'). When t == s, t points to the first character, when t < s, t points to the first byte before the first character and this byte is no part of the string.
And when the loop terminates, t won't be zero. t and s are pointers, that means their values are memory addresses. The value of s is the memory address of the first character of the string and this address will certainly not be zero.
The code snippet is:
char c[]="gate2011";
char *p=c;
printf("%s",p+p[3]-p[1]);
The output is:
2011
Can anyone explain how it came?
-----Thanks in advance-----
Going through each line in turn:
char c[] = "gate2011";
Let's assume that array c is located at memory address 200.
char *p = c;
p is now a pointer to c. It therefore points to memory address 200. The actual content of p is "200", indicating the memory address.
printf("%s", p + p[3] - p[1]);
The value of p is 200 when we treat it like a pointer. However, we can also index it like an array. p[3] gets the value of the 4th character in the string, which is 'e'. C stores characters as their numeric codes; in ASCII, 'e' is 101.
Next, we get the value of p[1]. p[1] == 'a', which has an ASCII value of 97. Substituting these into the call:
printf("%s", 200 + 101 - 97);
That evaluates to:
printf("%s", 204);
At memory address 204, we have the string "2011". Therefore, the program prints "2011".
I'm not sure why you'd want to do something like this, but anyway, this is what's happening.
p + p[3] - p[1]
Here you are taking the value of a pointer, adding the value of the char at position 3, and then subtracting the value of the char at position 1. The char values are implicitly converted to integers before the addition and subtraction.
If p is location 1000, then the sum 1000 + 101 (ASCII for e) - 97 (ASCII for a) is computed. The result is therefore a pointer to location 1004 in memory. The %s in the printf then substitutes the string that starts at this location and ends with the special character '\0'. So the string is effectively clipped to "2011" (the first 4 letters are skipped because 101 - 97 = 4).
If this still doesn't make sense, I'd suggest you have a good look at how arrays in C work.
What have you expected? Why not?
p[3]-p[1] = 'e'-'a' = 4
p+4 = "2011"