Reseting a char pointer to the top of an array - c

I am writing a function and I need to count the length of an array:
while(*substring){
substring++;
length++;
}
Now when I exit the loop. Will that pointer still point to the start of the array? For example:
If the array is "Hello"
when I exit the loop with the pointer be pointed at:
H or the NULL?
If it is pointing at NULL how do I make it point at H?

Strings in C are stored with a null character (denoted \0) at the end.
Thus, one might declare a string as follows.
char *str="Hello!";
In memory, this will look like Hello!0 (or rather, a string of numbers corresponding to each letter followed by a zero).
Your code looks like this:
substring=str;
length=0;
while(*substring){
substring++;
length++;
}
When you reach the end of this loop, *substring will be equal to 0 and substring will contain the address of the 0 character mentioned above. The value of substring will not change unless you explicitly do so.
To make it point at the beginning of the string you could use substring-length, since pointers are integers and may be manipulated as such. Alternatively, you could memorize the location before you begin:
beginning=str;
substring=str;
length=0;
while(*substring){
substring++;
length++;
}
substring=beginning;

It's pointing at the NULL-terminator of the array. Just remember the position in another variable, or subtract length from the pointer.

Pointer once moved will not automatically move to any another location. So once the while loop gets over the pointer would be pointing to NULL or precisely '\0' which is a termination sequence for the string.
In order to move back to the length of string just calculate the string length, which you already are doing by incrementing the length variable.
Sample code:
#include<stdio.h>
int main()
{
char name1[10] = "test program";
char *name = '\0';
name = name1;
int len = strlen(name);
while(*name)
{
name++;
}
name=name-len;
printf("\n%s\n",name);
}
Hope this helps...

At the end of the loop, *substring will be 0. That's the condition for the loop to end:
while(*substring)
So while( (the value pointed to by substring) is not equal to 0), do stuff
But then *substring becomes 0 (i.e. end of string), so *substring will point to NULL.
If you want to bring it back to H, do substring - length
However, the function you are writing already exists. It's in string.h and it's size_t strlen(const char*) size_t is an integer the size of a pointer (i.e. 32 bits on 32 bit OS and 64 bits on 64 bit OS).

Related

Finding the Beginning of a string in C

To solve a question, I am looking for a way to stop a loop after it has reached the beginning of the string, assuming the loop starts from the end and decrements, is there an alternative way to do this without finding the length of the string first and decrementing till the number is zero?
Please keep in mind the only functions I can use are malloc, free and write.
This is not possible, because there is nothing special about a string's contents at the beginning. C strings have a "sentinel value" at their end - '\0' - but the first character, and the byte in memory before the first character, can have any value.
is there an alternative way to do this without finding the length of the string first and decrementing till the number is zero?
Apparently you already know where the end of the string is. I suppose you must have a pointer to the terminator character, since you think you do not know the string length.
If finding the length of the string is a viable option at all, however, then you must already know where the beginning is, too. And if you know where the beginning is and you know where the end is, then you already know the length: it is end - beginning. But you do not need to keep a separate counter to iterate backward from the end of a string to the beginning, supposing that you do know where both the end and the beginning are. You can simply use pointer comparisons instead. For example:
int count_a_backwards(const char *beginning, const char *end) {
int count = 0;
for (const char *c = end; c > beginning; ) {
if (*--c == 'a') count += 1;
}
return count;
}
If in fact you do not know where the beginning of the string is, however, then you cannot identify it at all, at least not in the general case. Perhaps you can recognize the beginning if you have some kind of prior knowledge about the string's contents, or about its alignment, or some such, but in general, the beginning of a string cannot be recognized.
Please keep in mind the only functions I can use are malloc, free and
write.
If you are using the function malloc then the function returns pointer to the first byte of the allocated memory. So if the allocated array will contain a string then its beginning will be known.
The task is to find the end of the string.
You can use either the standard C function strlen or write your own loop that will find the end of the stored string.
So if you have two pointers, one that points to the beginning of a string and the second that points to the end of the same string then to traverse the string in the reverse order is not a hard work.
Pay attention to that if you have a character array that contains a string like this
char s[] = "Hello";
then the expressions s, s + 1, s + 2 and so on all points to a string correspondingly "Hello", "ello", "llo" and so on.
You could find the beginning of a string having a pointer to its end provided that the first element of the array contains a unique symbol that is a sentinel value. However in general this is a very rare case.
Here is a demonstrative program that shows how you can traverse a string in the reverse order without using standard C string functions except a function that places a string in a dynamically allocated array.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
enum { N = 12 };
char *s = malloc( N );
strcpy( s, "Hello World" );
puts( s );
char *p = s;
while ( *p ) ++p;
while ( p != s ) putchar( *--p );
putchar( '\n');
free( s );
return 0;
}
The program output is
Hello World
dlroW olleH

How does a char pointer differ from an int pointer in the below code?

1)
int main()
{
int *j,i=0;
int A[5]={0,1,2,3,4};
int B[3]={6,7,8};
int *s1=A,*s2=B;
while(*s1++ = *s2++)
{
for(i=0; i<5; i++)
printf("%d ", A[i]);
}
}
2)
int main()
{
char str1[] = "India";
char str2[] = "BIX";
char *s1 = str1, *s2=str2;
while(*s1++ = *s2++)
printf("%s ", str1);
}
The second code works fine whereas the first code results in some error(maybe segmentation fault). But how is the pointer variable s2 in program 2 working fine (i.e till the end of the string) but not in program 1, where its running infinitely....
Also, in the second program, won't the s2 variable get incremented beyond the length of the array?
The thing with strings in C is that they have a special character that marks the end of the string. It's the '\0' character. This special character has the value zero.
In the second program the arrays you have include the terminator character, and since it is zero it is treated as "false" when used in a boolean expression (like the condition in your while loop). That means your loop in the second program will copy characters up to and including the terminator character, but since that is "false" the loop will then end.
In the first program there is no such terminator, and the loop will continue and go out of bounds until it just randomly happen to find a zero in the memory you're copying from. This leads to undefined behavior which is a common cause of crashes.
So the difference isn't in how pointers are handled, but in the data. If you add a zero at the end of the source array in the first program (B) then it will also work well.
In str2, You have assigned String. Which means there will be end Of
String('\0' or NULL) due to which when you will increment Str2 and it will
reach to end of string, It will return null and hence your loop will break.
And with integer pointer, there is no end of string. thats why its going to infinite loop.
Joachim gave a good explanation about String terminal character \0 in C language.
Another thing to be aware of when working with pointer is pointer arithmetic.
Arithmetic unit for pointer is the size of the entity pointed.
With a char * pointer named charPtr, on system where char are stored on 1 byte, doing charPtr++ will increase the value in charPtr by *1 (1 byte) to make it ready to point to the next char in memory.
With a int * pointer named intPtr, on system where int are stored on 4 bytes, doing intPtr++ will increase the value in intPtr by 4 (4 bytes) to make it ready to point to the next int in memory.

Understanding character pointers in a while loop

I am learning C and a I came across this function in my study materials. The function accepts a string pointer and a character and counts the number of characters that are in the string. For example for a string this is a string and a ch = 'i' the function would return 3 for 3 occurrences of the letter i.
The part I found confusing is in the while loop. I would have expected that to read something like while(buffer[j] != '\0') where the program would cycle through each element j until it reads a null value. I don't get how the while loop works using buffer in the while loop, and how the program is incremented character by character using buffer++ until the null value is reached. I tried to use debug, but it doesn't work for some reason. Thanks in advance.
int charcount(char *buffer, char ch)
{
int ccount = 0;
while(*buffer != '\0')
{
if(*buffer == ch)
ccount++;
buffer++;
}
return ccount;
}
buffer is a pointer to a set of chars, a string, or a memory buffer holding char data.
*buffer will dereference the value at buffer, as a char. This can be compared with the null character.
When you add to buffer - you are adding to the address, not the value it points to, buffer++ adds 1 to the address, pointing to the next char. This means that now *buffer results in the next character.
In the loop you are incrementing the pointer buffer until it points to the null character, at which point you know you scanned the whole string. Instead of buffer[j], which is equivalent to *(buffer+j), we are incrementing the pointer itself.
When you say buffer++ you increment the address stored in buffer by one.
Once you internalize how pointers work, this code is cleaner than the code that uses a separate index to scan the character string.
In C and C++, arrays are stored in sequence, and an array is stored according to its first address and length.
Therefore *buffer is actually the address of the first byte, and is synonymous with buffer[0]. Because of this, you can use buffer as an array, like this:
int charcount(char *buffer, char ch)
{
int ccount = 0;
int charno = 0;
while(buffer[charno] != '\0')
{
if(buffer[charno] == ch)
ccount++;
charno++;
}
return ccount;
}
Note that this works because strings are null terminated - if you don't have a null termination in the character array pointed to by *buffer it will continue reading forever; you lose the bit where c knows how long the array is. This is why you see so many c functions to which you pass a pointer and a length - the pointer tells it the [0] position of the array, and the size you specify tells it how far to keep reading.
Hope this helps.

Don't understand how this for loop works

Can someone explain how this loop works? The entire function serves to figure out where in hash to place certain strings and the code is as follows:
//determine string location in hash
int hash(char* str)
{
int size = 100;
int sum;
for(; *str; str++)
sum += *str;
return sum % size;
}
It seems to iterate over the string character by character until it hits null, however why does simple *str works as a condition? Why does str++ moves to the next character, shouldn't it be something like this instead: *(str+i) where i increments with each loop and moves "i" places in memory based on *str address?
In C, chars and integers implicitly convert to booleans as: 0 - false, non-zero - true;
So for(; *str; str++) iterates until *str is zero. (or nul)
str is a pointer to an array of chars. str++ increments this pointer to point to the next element in the array and therefore the next character in the string.
So instead of indexing by index. You are moving the pointer.
The condition in a for loop is an expression that is tested for a zero value. The NUL character at the end of str is zero.
The more explicit form of this condition is of course *str != '\0', but that's equivalent since != produces zero when *str is equal to '\0'.
As for why str++ moves to the next character: that's how ++ is defined on pointers. When you increment a char*, you point it to the next char-sized cell in memory. Your *(str + i) solution would also work, it just takes more typing (even though it can be abbreviated str[i]).
This for loop makes use of pointer arithmetic. With that you can increment/decrement the pointer or add/substract an offset to it to navigate to certain entries in the array, since array are continuous blocks of memory you can do that.
str points to a string. Strings in C always end with a terminating \0.
*str dereferences the actual pointer to get the char value.
The for loop's break condition is equivalent to:
*str != '\0'
and
str++
moves the pointer forward to next element.
The hole for-loop is equivalent to:
int len = strlen(str);
int i;
for(i = 0; i < len; i++)
sum += str[i];
You could also write is as while-loop:
while(*str)
sum += *str++;
Why does str++ moves to the next character, shouldn't it be something like this
instead: *(str+i) where i increments with each loop and moves "i" places in
memory based on *str address?
In C/C++, string is a pointer variable that contains the address of your string literal.Initially Str points to the first character.*(str) returns the first character of string.
Str++ points to second charactes.Thus *(str) returns the second character of the string.
why does simple *str works as a condition?
Every c/c++ string contains null character.These Null Characters signify the end of a character string in C. ASCII code of NUL character is 0.
In C/C++,0 means FALSE.Thus, NUL Character in Conditional statement
means FALSE Condition.
for(;0;)/*0 in conditions means false, hence the loop terminates
when pointer points to Null Character.
{
}
It has to do with how C converts values to "True" and "False". In C, 0 is "False" and anything else is "True"
Since null (the character) happens to also be zero it evaluates to "False". If the character set were defined differently and the null character had a value of "11" then the above loop wouldn't work!
As for the 2nd half of the question, a pointer points to a "location" in memory. Incrementing that pointer makes it point to the next "location" in memory. The type of the pointer is relevant here too because the "Next" location depends on how big the thing being pointed to is
When the pointer points to a null character it is regarded as false. This happens in pointers. I don't know who defined it, but it happens.
It may be just becuase C treats 0 as false and every other things as true.
For example in the following code.
if(0) {
puts("true");
} else {
puts("false");
}
false will be the output
The unary * operator is a dereference operator -- *str means "the value pointed to by str." str is a pointer, so incrementing it with str++ (or ++str) changes the pointer to point to the next character. So it is the correct way to increment in the for loop.
Any integral value can be treated as a Boolean. *str as the condition of the for loop takes the value pointed to by str and determine if it is non-zero. If so, the loop continues Once it hits a null character, it terminates.

understanding strlen function in C

I am learning C. And, I see this function find length of a string.
size_t strlen(const char *str)
{
size_t len = 0U;
while(*(str++)) ++len; return len;
}
Now, when does the loop exit? I am confused, since str++, always increases the pointer.
while(*(str++)) ++len;
is same as:
while(*str) {
++len;
++str;
}
is same as:
while(*str != '\0') {
++len;
++str;
}
So now you see when str points to the null char at the end of the string, the test condition fails and you stop looping.
C strings are terminated by the NUL character which has the value of 0
0 is false in C and anything else is true.
So we keep incrementing the pointer into the string and the length until we find a NUL and then return.
You need to understand two notions to grab the idea of the function :
1°) A C string is an array of characters.
2°) In C, an array variable is actually a pointer to the first case of the table.
So what strlen does ? It uses pointer arithmetics to parse the table (++ on a pointer means : next case), till it gets to the end signal ("\0").
Once *(str++) returns 0, the loop exits. This will happen when str points to the last character of the string (because strings in C are 0 terminated).
Correct, str++ increases the counter and returns the previous value. The asterisk (*) dereferences the pointer, i.e. it gives you the character value.
C strings end with a zero byte. The while loop exits when the conditional is no longer true, which means when it is zero.
So the while loop runs until it encounters a zero byte in the string.

Resources