The following code works, but I don't quite understand how if (*s == 0) works.
It checks if the string is 0?
Also for return(isnumber(s+1)) what is the logic behind that?
I know s is a string but I can just pass s+1 into a function? How does it even know what character I'm looking for?
int isnumber(char *s) {
if (*s == 0) {
return 1; /* Reached end, we've only seen digits so far! */
}
if(!isdigit(*s)) {
printf("The number is invalid\n");
return 0; /* first character is not a digit, so no go */
}
return(isnumber(s+1));
}
int main () {
char inbuf[LENGTH];
int i, j;
printf("Enter a string > ");
fgets(inbuf, LENGTH-1, stdin); // ignore carriage return
inbuf[strlen(inbuf)-1] = 0;
j = isnumber(inbuf);
....
}
This is a recursive function that checks whether a string consists only of digits. To understand how the code works, you must understand how C stores strings. If you have the string "123", C stores it in memory like this:
|-----------------------------------|
| 0x8707 | 0x8708 | 0x8709 | 0x870A |
|--------|--------|--------|--------|
| | | | |
| '1' | '2' | '3' | '\0' |
|-----------------------------------|
C breaks your string up into characters, stores them in consecutive memory locations at some arbitrary address, and adds a null character ('\0', value 0) to the end of the string. This null character is how C knows where the string ends.
Your isnumber() function takes a char *s as a parameter. This is called a pointer. Internally, what's going on is that your main() function calls isnumber() and passes in the address of your string, not the string itself. This is important:
j = isnumber(inbuf);
How the compiler interprets this is call isnumber() and pass along the address of inbuf and assign the return value to j.
Now back up in the isnumber() function, it's receiving the address of inbuf and assigning it to s. By placing an asterisk (*) in front of s, you are doing something called dereferencing s. Dereferencing means you want the value stored at the address held in s. So the line that says if (*s == 0) is basically saying "if the value stored at the address s is equal to 0". Remember I told you earlier that strings in memory always have a terminating null (\0) character? This is how your function knows when to stop and return.
The next thing to understand is pointer arithmetic. In C, a char always occupies exactly 1 byte; sizeof(char) is 1 by definition. When you refer to (s+1), you are telling the computer to take the memory address held in s and add to it the size of one char. So if s points to 0x8707, then (s+1) evaluates to 0x8708, and *(s+1) is the '2' in our string (see my memory block diagram above). Note that s+1 does not modify s; it produces a new address, which becomes s inside the next call to isnumber(). This is how we iterate through each character in the string.
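To make the pointer arithmetic concrete, here is a minimal sketch. The helper name char_at_offset is made up for illustration and is not part of the question's code; it only shows that s + offset yields a new address without changing s.

```c
#include <stddef.h>

/* Hypothetical helper: returns the character stored at (s + offset).
   s itself is not modified; s + offset is simply a new address. */
char char_at_offset(const char *s, size_t offset)
{
    const char *p = s + offset; /* moves forward by offset chars (1 byte each) */
    return *p;                  /* dereference: fetch the char stored there */
}
```

With the string "123", char_at_offset("123", 1) gives '2', and char_at_offset("123", 3) gives the terminating '\0'.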
Hopefully this clears up the confusion!
The statement if (*s == 0) checks to see if the char s points to is zero. In other words, it checks to see if s is a zero-length string and returns 1 if so.
The statement return (isnumber(s+1)) computes s+1, a pointer to the second char in the string, and passes that to isnumber(). That call returns 1 (true) only if the rest of the string, starting at s[1], consists entirely of digits.
In C, strings are terminated with a null character.
(*s == 0) is checking for the null terminator.
This code is a little weirder.
return(isnumber(s+1));
Since the current character is a digit, keep going...call the function again starting at the NEXT character. This is a recursive function call and there is really no need when iteration would be simpler.
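For comparison, an iterative version might look like the sketch below. The name isnumber_iter is an assumption chosen to avoid clashing with the original function; like the original, it treats an empty string as valid.

```c
#include <ctype.h>

/* Iterative sketch: returns 1 if every character of s is a digit, else 0.
   An empty string returns 1, matching the recursive version. */
int isnumber_iter(const char *s)
{
    for (; *s != '\0'; s++) {
        /* the cast avoids undefined behavior for negative char values */
        if (!isdigit((unsigned char)*s))
            return 0;
    }
    return 1;
}
```

It would be called the same way as the recursive version, e.g. j = isnumber_iter(inbuf);.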
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void print_reverse(char *s)
{
size_t len=strlen(s);
char *t=s+len-1;
printf("%s %s\n",t,s);
while(t>=s){
printf("%c",*t);
t=t-1;
}
puts("");
}
int main(){
print_reverse("Hello");
}
Can anyone explain how char *t=s+len-1; and while(t>=s) work? I can't understand how a number can be added to a pointer or how the pointers are compared in the while loop. This program is for reversing a string in C.
Let's do this line by line:
print_reverse("Hello");
void print_reverse(char *s)
Now s points to a string that contains:
- - ----+----+----+----+----+----+----+---- - -
| H | e | l | l | o | \0 |
- - ----+----+----+----+----+----+----+---- - -
^
s
That last character is called the string "NUL" terminator because "NUL" is the name of the character with ASCII value zero (the non-printable ASCII control characters all have short abbreviated names such as NUL, LF, and CR).
size_t len=strlen(s);
Now len has a value of five. Notice it does not include the "NUL" terminator so even though the string takes 6 bytes the length is five.
char *t=s+len-1;
Now t has a value of s+4. If you count the memory locations this is what you get:
- - ----+----+----+----+----+----+----+---- - -
| H | e | l | l | o | \0 |
- - ----+----+----+----+----+----+----+---- - -
^ ^
s t
Note that s+strlen(s) would point to the "NUL" terminator.
printf("%s %s\n",t,s);
That printf should print o Hello, because t points at the final o and s points at the start of the string.
while(t>=s)
This while loop will continue as long as t>=s which means it will do the body of the loop for every character, including the one where s is pointing.
printf("%c",*t);
This prints the contents of the memory that t is pointing at. It starts with the o and continues backwards towards the H.
t=t-1;
That's the part that moves t backwards. Eventually t will be before s and then the loop will end. When the loop finishes it will look like this:
- - ----+----+----+----+----+----+----+---- - -
| H | e | l | l | o | \0 |
- - ----+----+----+----+----+----+----+---- - -
^ ^
t s
Then there is this one final line:
puts("");
That prints an empty string and a final linefeed - there wasn't a linefeed in the string but we needed one so this is a way to do that.
Pointer Arithmetic
When a pointer points into an array, adding integers to the pointer or subtracting integers from the pointer moves the pointer back and forth within the array.
This function should be passed a char *s that points to a string, which is an array of characters ending in a null character ('\0'). Then size_t len = strlen(s); sets len to the size of this string, and char *t = s+len-1; sets t to point to the last character before the null character.
Then, in the loop t=t-1; moves t backward.
Unfortunately, this loop uses t>=s as its control condition. This is intended to stop when t has been moved to the character before s, meaning it has gone back before the start point. However, the C standard only defines pointer arithmetic for elements within the array plus a special position at the end of the array. If this function is passed an s that points to the beginning of an array, then the loop will eventually make t point before the array, and the C standard does not define the resulting behavior.
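One way to avoid stepping before the start of the array is to begin one past the last character and decrement before printing. This is a sketch only; the name print_reverse_safe and the returned count are additions for illustration, not part of the original program.

```c
#include <stdio.h>
#include <string.h>

/* Prints s backwards followed by a newline.
   Returns the number of characters printed (not counting the newline). */
size_t print_reverse_safe(const char *s)
{
    const char *t = s + strlen(s); /* points at the '\0', still inside the array */
    size_t count = 0;

    while (t > s) {
        t--;          /* step back first, so t never moves below s */
        putchar(*t);
        count++;
    }
    putchar('\n');
    return count;
}
```

Here t only ever takes values between s and s + strlen(s), so every pointer the loop forms is one the C standard defines.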
Other Things to Know About Pointer Arithmetic
Any object may be treated as an array of one element. If you have some type T and some object T x;, you may set a pointer T *p = &x;, and then it is allowed to advance the pointer by one element, p = p+1;. Dereferencing that pointer with *p is not defined, but you can compare it, as in &x == p, or you can subtract one from it.
If print_reverse were passed a pointer into an array beyond the beginning, then its loop would be okay. However, that is not how it is used in the example code; print_reverse("Hello"); is not good code.
Any object may be treated as an array of characters. You can convert a pointer to any object to a pointer to unsigned char and then examine the bytes that make up an object. This is used for special purposes. You should not use it in general code while you are learning C, but you should be aware it exists.
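As a small illustration of examining an object's bytes, the sketch below walks over any object through an unsigned char pointer. The function name sum_bytes is hypothetical, and summing the bytes is just a stand-in for whatever inspection is needed.

```c
#include <stddef.h>

/* Adds up the n bytes that make up the object at obj. */
unsigned long sum_bytes(const void *obj, size_t n)
{
    const unsigned char *p = obj; /* view the object as an array of bytes */
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += p[i];
    return sum;
}
```

For an int x, the call would be sum_bytes(&x, sizeof x). The individual byte values depend on the platform's representation of the type.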
I wanted to test things out with arrays on C as I'm just starting to learn the language. Here is my code:
#include <stdio.h>
main(){
int i,t;
char orig[5];
for(i=0;i<=4;i++){
orig[i] = '.';
}
printf("%s\n", orig);
}
Here is my output:
.....�
It is exactly that. What are those mysterious characters? What have I done wrong?
%s with printf() expects a pointer to a string, that is, pointer to the initial element of a null terminated character array. Your array is not null terminated.
Thus, in search of the terminating null character, printf() goes out of bound, and subsequently, invokes undefined behavior.
You have to null-terminate your array, if you want that to be used as a string.
Quote: C11, chapter §7.21.6.1, (emphasis mine)
s
If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type. Characters from the array are
written up to (but not including) the terminating null character. If the
precision is specified, no more than that many bytes are written. If the
precision is not specified or is greater than the size of the array, the array shall
contain a null character.
Quick solution:
Increase the array size by 1, char orig[6];.
Add a null terminator at the end. After the loop body, add orig[i] = '\0';
And then, print the result.
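The quoted passage also suggests an alternative: when a precision is given and it does not exceed the array size, no null character is required. A minimal sketch of that approach, with the hypothetical helper name print_n_chars:

```c
#include <stdio.h>

/* Prints at most n chars from arr, then a newline.
   arr does not need a null terminator as long as it holds at least n chars.
   Returns printf's result: the number of characters written. */
int print_n_chars(const char *arr, int n)
{
    return printf("%.*s\n", n, arr); /* the precision n is passed as an argument */
}
```

With the original char orig[5] filled with dots, print_n_chars(orig, 5) prints the five dots and stops there.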
char orig[5];//creates an array of 5 char. (with indices ranging from 0 to 4)
|?|?|?|0|0|0|0|0|?|?|?|?|
| ^memory you do not own (your mysterious characters)
^start of orig
for(i=0;i<=4;i++){ //attempts to populate array with '.'
orig[i] = '.';
|?|?|?|.|.|.|.|.|?|?|?|?|
| ^memory you do not own (your mysterious characters)
^start of orig
This results in a non null terminated char array, which will invoke undefined behavior if used in a function that expects a C string. C strings must contain enough space to allow for null termination. Change your declaration to the following to accommodate.
char orig[6];
Then add the null termination to the end of your loop:
...
for(i=0;i<=4;i++){
orig[i] = '.';
}
orig[i] = 0;
Resulting in:
|?|?|?|.|.|.|.|.|0|?|?|?|
| ^memory you do not own
^start of orig
Note: Because the null termination results in a C string, the function using it knows how to interpret its contents (i.e. no undefined behavior), and your mysterious characters are held at bay.
There is a difference between a character array and a C string. A character array is simply an array in which each element is of type char. It can only be used as a string if its contents are terminated by a null character (value 0).
The %s format specifier with printf() expects a pointer to a character array that is terminated by a null character. Your array is not null terminated, and hence printf() reads beyond the 5 characters you assigned and prints whatever garbage values happen to follow your 5th character ('.').
To solve your issue, you need to declare the character array one element larger than the number of characters you want to store. In your case, a character array of size 6 will work.
#include <stdio.h>
int main(){
int i,t;
char orig[6]; // If you want to store 5 characters, allocate an array of size 6 to store null character at last position.
for (i=0; i<=4; i++) {
orig[i] = '.';
}
orig[5] = '\0';
printf("%s\n", orig);
}
There is a reason to spend one extra character of space on the null character. Whenever you pass an array to a function, only a pointer to its first element is passed. This makes it impossible for the function to determine where the array ends (sizeof applied to the parameter inside the function returns the size of a pointer on your machine, not the size of the array). That is why functions like memcpy and memset take an additional argument that gives the number of bytes to operate on.
With a null-terminated character array, however, the function can find the end of the string by looking for the special null character.
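The difference between what the caller and the callee can see might be sketched like this. Both function names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Inside the function, arr is only a pointer to the first element. */
size_t size_seen_by_callee(char *arr)
{
    return sizeof arr;  /* size of a pointer, not of the caller's array */
}

/* The null terminator lets the callee find the end on its own. */
size_t length_seen_by_callee(const char *str)
{
    return strlen(str); /* counts chars up to, but not including, '\0' */
}
```

For char orig[6] = "....."; the caller sees sizeof orig as 6, while size_seen_by_callee(orig) returns sizeof(char *) and length_seen_by_callee(orig) returns 5.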
You need to add a NUL character (\0) at the end of your string.
#include <stdio.h>
main()
{
int i,t;
char orig[6];
for(i=0;i<=4;i++){
orig[i] = '.';
}
orig[i] = '\0';
printf("%s\n", orig);
}
If you do not know what \0 is, I strongly recommend you check the ASCII table (https://www.asciitable.com/).
Good luck
printf takes a pointer to the start of some memory, an array in this case, and prints until it encounters a \0 character. Strings of this kind are called null-terminated strings.
So please add a \0 at the end and fill in characters only up to index (size of array - 2), like this:
main(){
int i,t;
char orig[5];
for(i=0;i<4;i++){ //less than size of array - 1
orig[i] = '.';
}
orig[i] = '\0';
printf("%s\n", orig);
}
I'm creating a simple program to see how a string populates an array.
#include <stdio.h>
#include <string.h>
#include <stddef.h>
#include <stdlib.h>
int main(void)
{
char string1[100];
int i;
printf("Enter sentence.\n");
fgets(string1, 100, stdin);
for (i=0; i<=15; i++)
puts(&string1[i]);
return 0;
}
I'm having a bit of a problem understanding how the string is populating an array. My expectation is that the string will be completely stored in string1[0] and any further indexes will come up blank. However, when I throw the loop to see if my assumption is true, it turns out that every index has been filled in by the string. Am I misunderstanding how the string is filling the array?
For the string "Hello!", the memory representation would be something like this
+-------+-------+-------+-------+-------+-------+-------+
| 'H' | 'e' | 'l' | 'l' | 'o' | '!' | '\0' |
+-------+-------+-------+-------+-------+-------+-------+
The first cell, at index 0, contains the first character. And each subsequent character is contained in a cell with an increasing index.
Library functions like puts expect you to pass the address of the first character, and then they read the string up to \0.
So if you pass simply string1 or &string1[0], it will resolve to the address of 'H'.
If you pass &string1[1], it will resolve to the address of 'e', and the library function will think that is the first character, because that's the contract C strings are designed with.
Your problem is not the layout of string1 per se but how puts interprets it. Strings are represented by char arrays in C, and their end is marked by a null terminator (the character with code 0):
S e n t e n c e \0
^ ^
string1 &string1[5]
&string1[5] is a pointer to a single character, but since the characters that follow are not a null terminator, the memory from there on is interpreted as a string, and nce gets printed.
To print individual characters instead, use putchar (or putc with an explicit stream):
putchar(string1[i]);
string is not stored in string1[0] but string's first character is stored at string1[0] or string starts at (string1+0). Here, &string1[0] or (string1+0) can be seen as a pointer, a pointer to C String string1.
In that sense, every valid index i of string1 will give you a valid pointer (string1 + i) which will point to some part of C String string1.
In the last for loop you are printing the suffixes of string string1 which are pointed by (string1 + 0), (string1 + 1), (string1 + 2)...
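To see one character per index instead of one suffix per index, a loop along these lines could be used. This is a sketch; the function name print_chars is an assumption.

```c
#include <stdio.h>

/* Prints each character of str on its own line.
   Returns the number of characters printed. */
int print_chars(const char *str)
{
    int n = 0;

    for (; str[n] != '\0'; n++) {
        putchar(str[n]); /* a single char, not the suffix starting here */
        putchar('\n');
    }
    return n;
}
```

Calling print_chars(string1) after the fgets call shows that each index holds exactly one character.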
I am learning C and I came across this function in my study materials. The function accepts a pointer to a string and a character, and counts how many times that character occurs in the string. For example, for the string "this is a string" and ch = 'i', the function would return 3 for the 3 occurrences of the letter i.
The part I found confusing is in the while loop. I would have expected that to read something like while(buffer[j] != '\0') where the program would cycle through each element j until it reads a null value. I don't get how the while loop works using buffer in the while loop, and how the program is incremented character by character using buffer++ until the null value is reached. I tried to use debug, but it doesn't work for some reason. Thanks in advance.
int charcount(char *buffer, char ch)
{
int ccount = 0;
while(*buffer != '\0')
{
if(*buffer == ch)
ccount++;
buffer++;
}
return ccount;
}
buffer is a pointer to a set of chars, a string, or a memory buffer holding char data.
*buffer will dereference the value at buffer, as a char. This can be compared with the null character.
When you add to buffer - you are adding to the address, not the value it points to, buffer++ adds 1 to the address, pointing to the next char. This means that now *buffer results in the next character.
In the loop you are incrementing the pointer buffer until it points to the null character, at which point you know you scanned the whole string. Instead of buffer[j], which is equivalent to *(buffer+j), we are incrementing the pointer itself.
When you say buffer++ you increment the address stored in buffer by one.
Once you internalize how pointers work, this code is cleaner than the code that uses a separate index to scan the character string.
In C and C++, the elements of an array are stored contiguously in memory, so an array is fully described by the address of its first element and its length.
Therefore buffer holds the address of the first byte, and *buffer is synonymous with buffer[0]. Because of this, you can use buffer as an array, like this:
int charcount(char *buffer, char ch)
{
int ccount = 0;
int charno = 0;
while(buffer[charno] != '\0')
{
if(buffer[charno] == ch)
ccount++;
charno++;
}
return ccount;
}
Note that this works because strings are null terminated. If there is no null terminator in the character array pointed to by buffer, the loop will read past the end of the array, which is undefined behavior; once an array is passed as a pointer, C no longer knows how long it is. This is why you see so many C functions to which you pass a pointer and a length: the pointer tells it the [0] position of the array, and the size you specify tells it how far to keep reading.
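A pointer-plus-length variant of the same function could look like the sketch below. The name charcount_n and its len parameter are assumptions for illustration; nothing in the study material defines them.

```c
#include <stddef.h>

/* Counts occurrences of ch in the first len bytes of buffer.
   No null terminator is needed because the caller supplies the length. */
size_t charcount_n(const char *buffer, size_t len, char ch)
{
    size_t count = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (buffer[i] == ch)
            count++;
    }
    return count;
}
```

This form also works on data that legitimately contains zero bytes, which the null-terminated version cannot scan past.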
Hope this helps.
I was looking at the following code I came across for printing a string in reverse order in C using recursion:
void ReversePrint(char *str) { //line 1
if(*str) { //line 2
ReversePrint(str+1); //line 3
putchar(*str); //line 4
}
}
I am relatively new to C and am confused by line 2. *str from my understanding is dereferencing the pointer and should return the value of the string in the current position. But how is this being used as an argument to a conditional statement (which should expect a boolean, right?)? In line 3, the pointer will always be incremented to the next block (4 bytes since it's an int)...so couldn't this code fail if there happens to be data in the next memory block after the end of the string?
Update: so there are no boolean types in c correct? A conditional statement evaluates to 'false' if the value is 0, and 'true' otherwise?
Line 2 checks whether the current character is the null terminator of the string. Since C strings are null-terminated and the null character is treated as false, the recursion begins unwinding when it hits the end of the string (instead of calling ReversePrint on the character after the null terminator, which would be beyond the bounds of the valid data).
Also note that the pointer is to a char, thus incrementing the pointer only increments by 1 byte (since char is a single-byte type).
Example:
0 1 2 3
+--+--+--+--+
|f |o |o |\0|
+--+--+--+--+
When str = 0, then *str is 'f' so the recursive call is made for str+1 = 1.
When str = 1, then *str is 'o' so the recursive call is made for str+1 = 2.
When str = 2, then *str is 'o' so the recursive call is made for str+1 = 3.
When str = 3, then *str is '\0' and \0 is a false value thus if(*str) evaluates to false, so no recursive call is made, thus going back up the recursion we get...
Most recent recursion was followed by putchar('o'), then after that,
Next most recent recursion was followed by putchar('o'), then after that,
Least recent recursion was followed by putchar('f'), and we're done.
The type of a C string is nothing but a pointer to char. The convention is that what the pointer points to is an array of characters, terminated by a zero byte.
*str, thus, is the first character of the string pointed to by str.
Using *str in a conditional evaluates to false if str points to the terminating null byte in the (empty) string.
At the end of a string is typically a 0 byte - the line if (*str) is checking for the existence of that byte and stopping when it gets to it.
In line 3, the pointer will always be incremented to the next block (4 bytes since its an int)...
That's wrong. This is a char *, so it will only be incremented by 1 byte, because a char is exactly 1 byte long.
But how is this being used as an argument to a conditional statement (which should expect a boolean, right?)?
You can use any scalar value as the condition in if( ... ), and it will only be checked for being zero or non-zero. A bool works the same way: true is simply 1 and false is 0.
In some higher-level, strongly typed languages you can't use such values in an if, but in C conditions boil down to numbers, so any number or pointer will do.
if(1) // evaluates to true
if("string") // evaluates to true
if(0) // evaluates to false
You can use any scalar expression in if and while conditions in C.
At the end of the string there is a 0 - so you have "test" => [0]'t' [1]'e' [2]'s' [3]'t' [4]0
and if(0) -> false
this way this will work.
C originally had no boolean type (C99 added _Bool, available as bool through <stdbool.h>), and even now every scalar type (i.e. arithmetic and pointer types) can be used in boolean contexts, where 0 means false and non-zero means true.
As strings are null-terminated, the terminator will be interpreted as false, whereas every other character (with non-zero value!) will be true. This means there's an easy way to iterate over the characters of a string:
for(;*str; ++str) { /* do something with *str */ }
ReversePrint() does the same thing, but by recursion instead of iteration.
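As a worked instance of that loop idiom, here is a sketch that counts the characters of a string using only the truthiness of *str. The name count_chars is hypothetical; in real code strlen does this job.

```c
#include <stddef.h>

/* Counts characters up to the null terminator. */
size_t count_chars(const char *str)
{
    size_t n = 0;

    for (; *str; ++str) /* *str is non-zero (true) until it reaches '\0' */
        n++;
    return n;
}
```

For "test" the loop body runs four times and stops when *str is the terminating 0.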
conditional statements (if, for, while, etc) expect a boolean expression. If you provide an integer value the evaluation boils down to 0 == false or non-0 == true. As mentioned already, the terminating character of a c-string is a null byte (integer value 0). So the if will fail at the end of the string (or first null byte within the string).
As an aside, if you do *str on a NULL pointer you are invoking undefined behavior; you should always verify that a pointer is valid before dereferencing.
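A version of the recursive function with such a check might look like this sketch. The name reverse_print_checked and the returned count are additions for illustration.

```c
#include <stdio.h>

/* Prints str in reverse using recursion.
   Returns the number of characters printed; a NULL pointer prints nothing. */
int reverse_print_checked(const char *str)
{
    int printed;

    if (str == NULL || *str == '\0') /* check the pointer before dereferencing */
        return 0;

    printed = reverse_print_checked(str + 1); /* handle the rest first */
    putchar(*str);                            /* then print this character */
    return printed + 1;
}
```

The order of the two tests matters: because || short-circuits, *str is only evaluated once str is known not to be NULL.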
This is kind of off topic, but when I saw the question I immediately wondered if that was actually faster than just doing an strlen and iterate from the back.
So, I made a little test.
#include <string.h>
void reverse1(const char* str)
{
int total = 0;
if (*str) {
reverse1(str+1);
total += *str;
}
}
void reverse2(const char* str)
{
int total = 0;
size_t t = strlen(str);
while (t > 0) {
total += str[--t];
}
}
int main()
{
const char* str = "here I put a very long string ...";
int i=99999;
while (--i > 0) reverseX(str);
}
I first compiled it with X=1 (using the function reverse1) and then with X=2. Both times with -O0.
It consistently returned approximately 6 seconds for the recursive version and 1.8 seconds for the strlen version.
I think it's because strlen is implemented in assembler and the recursion adds quite an overhead.
I'm quite sure that the benchmark is representative, if I'm mistaken please correct me.
Anyway, I thought I should share this with you.
1.
str is a pointer to a char. Incrementing str will make the pointer point to the second character of the string (as it's a char array).
NOTE: Incrementing a pointer advances it by the size of the data type the pointer points to.
For ex:
int *p_int;
p_int++; /* Advances by sizeof(int) bytes, typically 4 */
double *p_dbl;
p_dbl++; /* Advances by sizeof(double) bytes, typically 8 */
2.
if(expression)
{
statements;
}
The expression is evaluated and if the resulting value is zero (NULL, \0, 0), the statements are not executed. As every string ends with \0 the recursion will have to end some time.
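The note about pointer increments in point 1 can be checked by measuring the distance in bytes between p and p + 1. This is a sketch with hypothetical function names; the actual numbers depend on the platform's sizeof(int) and sizeof(double).

```c
#include <stddef.h>

/* Number of bytes an int pointer moves when incremented by one. */
size_t int_stride(void)
{
    int a[2] = {0, 0};
    int *p = a;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Number of bytes a double pointer moves when incremented by one. */
size_t double_stride(void)
{
    double d[2] = {0.0, 0.0};
    double *p = d;
    return (size_t)((char *)(p + 1) - (char *)p);
}
```

The pointers are made to point into real arrays first, since incrementing an uninitialized pointer is not valid.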
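Try this code, which is as simple as the one you are using. Call it with lower = 0 and upper = (int)strlen(string) - 1:
int rev(int lower,int upper,char*string)
{
if(lower>upper)
return 0;
putchar(string[upper]); /* print the last character of the remaining range */
return rev(lower,upper-1,string); /* then shrink the range from the end */
}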