I am learning C and a I came across this function in my study materials. The function accepts a string pointer and a character and counts the number of characters that are in the string. For example for a string this is a string and a ch = 'i' the function would return 3 for 3 occurrences of the letter i.
The part I found confusing is in the while loop. I would have expected that to read something like while(buffer[j] != '\0') where the program would cycle through each element j until it reads a null value. I don't get how the while loop works using buffer in the while loop, and how the program is incremented character by character using buffer++ until the null value is reached. I tried to use debug, but it doesn't work for some reason. Thanks in advance.
int charcount(char *buffer, char ch)
{
int ccount = 0;
while(*buffer != '\0')
{
if(*buffer == ch)
ccount++;
buffer++;
}
return ccount;
}
buffer is a pointer to a set of chars, a string, or a memory buffer holding char data.
*buffer will dereference the value at buffer, as a char. This can be compared with the null character.
When you add to buffer - you are adding to the address, not the value it points to, buffer++ adds 1 to the address, pointing to the next char. This means that now *buffer results in the next character.
In the loop you are incrementing the pointer buffer until it points to the null character, at which point you know you scanned the whole string. Instead of buffer[j], which is equivalent to *(buffer+j), we are incrementing the pointer itself.
When you say buffer++ you increment the address stored in buffer by one.
Once you internalize how pointers work, this code is cleaner than the code that uses a separate index to scan the character string.
In C and C++, arrays are stored in sequence, and an array is stored according to its first address and length.
Therefore *buffer is actually the address of the first byte, and is synonymous with buffer[0]. Because of this, you can use buffer as an array, like this:
int charcount(char *buffer, char ch)
{
int ccount = 0;
int charno = 0;
while(buffer[charno] != '\0')
{
if(buffer[charno] == ch)
ccount++;
charno++;
}
return ccount;
}
Note that this works because strings are null terminated - if you don't have a null termination in the character array pointed to by *buffer it will continue reading forever; you lose the bit where c knows how long the array is. This is why you see so many c functions to which you pass a pointer and a length - the pointer tells it the [0] position of the array, and the size you specify tells it how far to keep reading.
Hope this helps.
Related
This is my code for two functions in C:
// Begin
void readTrain(Train_t *train){
printf("Name des Zugs:");
char name[STR];
getlinee(name, STR);
strcpy(train->name, name);
printf("Name des Drivers:");
char namedriver[STR];
getlinee(namedriver, STR);
strcpy(train->driver, namedriver);
}
void getlinee(char *str, long num){
char c;
int i = 0;
while(((c=getchar())!='\n') && (i<num)){
*str = c;
str++;
i++;
}
printf("i is %d\n", i);
*str = '\0';
fflush(stdin);
}
// End
So, with void getlinee(char *str, long num) function I want to get user input to first string char name[STR] and to second char namedriver[STR]. Maximal string size is STR (30 charachters) and if I have at the input more than 30 characters for first string ("Name des Zuges"), which will be stored in name[STR], after that I input second string, which will be stored in namedriver, and then printing FIRST string, I do not get the string from the user input (first 30 characters from input), but also the second string "attached" to this, I simply do not know why...otherwise it works good, if the limit of 30 characters is respected for the first string.
Here my output, when the input is larger than 30 characters for first string, problem is in the row 5 "Zugname", why I also have second string when I m printing just first one...:
Name des Zugs:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
i is 30
Name des Drivers:xxxxxxxx
i is 8
Zugname: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaxxxxxxxx
Drivername: xxxxxxxx
I think your issue is that your train->name is not properly terminated with '\0', as a consequence when you call printf("%s", train->name) the function keeps reading memory until it finds '\0'. In your case I guess your structure looks like:
struct Train_t {
//...
char name[STR];
char driver[STR];
//...
};
In getlinee() function, you write '\0' after the last character. In particular, if the input is more than 30 characters long, you copy the first 30 characters, then add '\0' at the 31-th character (name[30]). This is a first buffer overflow.
So where is this '\0' actually written? well, at name[30], even though your not supposed to write there. Then, if you have the structure above when you do strcpy(train->name, name); you will actually copy a 31-bytes long string: 30 chars into train->name, and the '\0' will overflow into train->driver[0]. This is the second buffer overflow.
After this, you override the train->driver buffer so the '\0' disappears and your data in memory basically looks like:
train->name = "aaa...aaa" // no '\0' at the end so printf won't stop reading here
train->driver = "xxx\0" // but there
You have an off-by-one error on your array sizes -- you have arrays of STR chars, and you read up to STR characters into them, but then you store a NUL terminator, requiring (up to) STR + 1 bytes total. So whenever you have a max size input, you run off the end of your array(s) and get undefined behavior.
Pass STR - 1 as the second argument to getlinee for the easiest fix.
Key issues
Size test in wrong order and off-by-one. ((c=getchar())!='\n') && (i<num) --> (i+1<num) && ((c=getchar())!='\n'). Else no room for the null character. Bad form to consume an excess character here.
getlinee() should be declared before first use. Tip: Enable all compiler warnings to save time.
Other
Use int c; not char c; to well distinguish the typical 257 different possible results from getchar().
fflush(stdin); is undefined behavior. Better code would consume excess characters in a line with other code.
void getlinee(char *str, long num) better with size_t num. size_t is the right size type for array sizing and indexing.
int i should be the same type as num.
Better code would also test for EOF.
while((i<num) && ((c=getchar())!='\n') && (c != EOF)){
A better design would return something from getlinee() to indicate success and identify troubles like end-of-file with nothing read, input error, too long a line and parameter trouble like str == NULL, num <= 0.
I believe you have a struct similar to this:
typedef struct train_s
{
//...
char name[STR];
char driver[STR];
//...
} Train_t;
When you attempt to write a '\0' to a string that is longer than STR (30 in this case), you actually write a '\0' to name[STR], which you don't have, since the last element of name with length STR has an index of STR-1 (29 in this case), so you are trying to write a '\0' outside your array.
And, since two strings in this struct are stored one after another, you are writing a '\0' to driver[0], which you immediately overwrite, hence when printing out name, printf doesn't find a '\0' until it reaches the end of driver, so it prints both.
Fixing this should be easy.
Just change:
while(((c=getchar())!='\n') && (i<num))
to:
while(((c=getchar())!='\n') && (i<num - 1))
Or, as I would do it, add 1 to array size:
char name[STR + 1];
char driver[STR + 1];
i trying to create a function that return an array of zeros us a char array
and print this array in a file text but when i return a string an addition char was returned
this the text file string the program wrote
this my fuction :
char *zeros_maker (int kj,int kj1)
{
char *zeros;
zeros = (char *) malloc(sizeof(char)*(kj-kj1));
int i;
for(i =0;i<kj-kj1;i++)
zeros[i]='0';
printf("%s\n",zeros);
return zeros;
}
the instruction i used when i printed in the file
fprintf(pFile,"%c%s%c &",34,zeros_maker(added_zeros,0),34);
Thanks in advance
'0' in C is the value of the encoding used for the digit zero. This is not allowed to have the value 0 by the C standard.
You need to add a NUL-terminator '\0' to the end of the char array, in order for the printf function to work correctly.
Else you run the risk of it running past the end of the char array, with undefined results.
Finally, don't forget to free the allocated memory at some point in your program.
Read about how string in C are meant to be terminated.
Each string terminates with the null char '\0' (the NULL symbol ASCII value 0, not to be confused with the char '0' that has ASCII value 48). It identifies the end of the string.
zeros[kj-kj1]='\0';
Plus check always if you are accessing an element out of bound. In this case it happens if kj1> kj
Instead of for loop, you may get hand of memset.
char* zeros_maker(int kj,int kj1)
{
int len=kj-kj1;
char *zeros=malloc(sizeof(char)*(len));
memset(zeros,'0',len-1);
zeros[len-1]=0;
printf("%s\n",zeros);
//fflush(stdout);
return zeros;
}
Or if you are not fan of C-style string, and it's going to be ASCII only, following could be used too. Just be careful what your are doing this way.
char* zeros_maker_pascal_form(int kj,int kj1)
{
int len=kj-kj1;
char *zeros=malloc(sizeof(char)*(len));
memset(zeros,'0',len);
for(int a=0;a<len;a++){
printf("%c",zeros[a]);
}
printf("\n");
//fflush(stdout);
return zeros;
}
Your code has a few basic issues, the main one is that it fails to terminate the string (and include space for the terminator).
Here's a fixed and cleaned-up version:
char * zeros_maker(size_t length)
{
char *s = malloc(length + 1);
if(s != NULL)
{
memset(s, '0', length);
s[length - 1] = '\0';
}
return s;
}
This has the following improvements over your code:
It simplifies the interface, just taking the number of zeroes that should be returned (the length of the returned string). Do the subtraction at the call site, where those two values make sense.
No cast of the return value from malloc(), and no scaling by sizeof (char) since I consider that pointless.
Check for NULL being returned by malloc() before using the memory.
Use memset() to set a range of bytes to a single value, that's a standard C function and much easier to know and verify than a custom loop.
Terminate the string, of course.
Call it like so:
char *zs = zeros_maker(kj - kj1);
puts(s);
free(s);
Remember to free() the string once you're done with it.
1)
int main()
{
int *j,i=0;
int A[5]={0,1,2,3,4};
int B[3]={6,7,8};
int *s1=A,*s2=B;
while(*s1++ = *s2++)
{
for(i=0; i<5; i++)
printf("%d ", A[i]);
}
}
2)
int main()
{
char str1[] = "India";
char str2[] = "BIX";
char *s1 = str1, *s2=str2;
while(*s1++ = *s2++)
printf("%s ", str1);
}
The second code works fine whereas the first code results in some error(maybe segmentation fault). But how is the pointer variable s2 in program 2 working fine (i.e till the end of the string) but not in program 1, where its running infinitely....
Also, in the second program, won't the s2 variable get incremented beyond the length of the array?
The thing with strings in C is that they have a special character that marks the end of the string. It's the '\0' character. This special character has the value zero.
In the second program the arrays you have include the terminator character, and since it is zero it is treated as "false" when used in a boolean expression (like the condition in your while loop). That means your loop in the second program will copy characters up to and including the terminator character, but since that is "false" the loop will then end.
In the first program there is no such terminator, and the loop will continue and go out of bounds until it just randomly happen to find a zero in the memory you're copying from. This leads to undefined behavior which is a common cause of crashes.
So the difference isn't in how pointers are handled, but in the data. If you add a zero at the end of the source array in the first program (B) then it will also work well.
In str2, You have assigned String. Which means there will be end Of
String('\0' or NULL) due to which when you will increment Str2 and it will
reach to end of string, It will return null and hence your loop will break.
And with integer pointer, there is no end of string. thats why its going to infinite loop.
Joachim gave a good explanation about String terminal character \0 in C language.
Another thing to be aware of when working with pointer is pointer arithmetic.
Arithmetic unit for pointer is the size of the entity pointed.
With a char * pointer named charPtr, on system where char are stored on 1 byte, doing charPtr++ will increase the value in charPtr by *1 (1 byte) to make it ready to point to the next char in memory.
With a int * pointer named intPtr, on system where int are stored on 4 bytes, doing intPtr++ will increase the value in intPtr by 4 (4 bytes) to make it ready to point to the next int in memory.
I am writing a function and I need to count the length of an array:
while(*substring){
substring++;
length++;
}
Now when I exit the loop. Will that pointer still point to the start of the array? For example:
If the array is "Hello"
when I exit the loop with the pointer be pointed at:
H or the NULL?
If it is pointing at NULL how do I make it point at H?
Strings in C are stored with a null character (denoted \0) at the end.
Thus, one might declare a string as follows.
char *str="Hello!";
In memory, this will look like Hello!0 (or rather, a string of numbers corresponding to each letter followed by a zero).
Your code looks like this:
substring=str;
length=0;
while(*substring){
substring++;
length++;
}
When you reach the end of this loop, *substring will be equal to 0 and substring will contain the address of the 0 character mentioned above. The value of substring will not change unless you explicitly do so.
To make it point at the beginning of the string you could use substring-length, since pointers are integers and may be manipulated as such. Alternatively, you could memorize the location before you begin:
beginning=str;
substring=str;
length=0;
while(*substring){
substring++;
length++;
}
substring=beginning;
It's pointing at the NULL-terminator of the array. Just remember the position in another variable, or subtract length from the pointer.
Pointer once moved will not automatically move to any another location. So once the while loop gets over the pointer would be pointing to NULL or precisely '\0' which is a termination sequence for the string.
In order to move back to the length of string just calculate the string length, which you already are doing by incrementing the length variable.
Sample code:
#include<stdio.h>
int main()
{
char name1[10] = "test program";
char *name = '\0';
name = name1;
int len = strlen(name);
while(*name)
{
name++;
}
name=name-len;
printf("\n%s\n",name);
}
Hope this helps...
At the end of the loop, *substring will be 0. That's the condition for the loop to end:
while(*substring)
So while( (the value pointed to by substring) is not equal to 0), do stuff
But then *substring becomes 0 (i.e. end of string), so *substring will point to NULL.
If you want to bring it back to H, do substring - length
However, the function you are writing already exists. It's in string.h and it's size_t strlen(const char*) size_t is an integer the size of a pointer (i.e. 32 bits on 32 bit OS and 64 bits on 64 bit OS).
I am learning C. And, I see this function find length of a string.
size_t strlen(const char *str)
{
size_t len = 0U;
while(*(str++)) ++len; return len;
}
Now, when does the loop exit? I am confused, since str++, always increases the pointer.
while(*(str++)) ++len;
is same as:
while(*str) {
++len;
++str;
}
is same as:
while(*str != '\0') {
++len;
++str;
}
So now you see when str points to the null char at the end of the string, the test condition fails and you stop looping.
C strings are terminated by the NUL character which has the value of 0
0 is false in C and anything else is true.
So we keep incrementing the pointer into the string and the length until we find a NUL and then return.
You need to understand two notions to grab the idea of the function :
1°) A C string is an array of characters.
2°) In C, an array variable is actually a pointer to the first case of the table.
So what strlen does ? It uses pointer arithmetics to parse the table (++ on a pointer means : next case), till it gets to the end signal ("\0").
Once *(str++) returns 0, the loop exits. This will happen when str points to the last character of the string (because strings in C are 0 terminated).
Correct, str++ increases the counter and returns the previous value. The asterisk (*) dereferences the pointer, i.e. it gives you the character value.
C strings end with a zero byte. The while loop exits when the conditional is no longer true, which means when it is zero.
So the while loop runs until it encounters a zero byte in the string.