Cannot explain the output of a simple string operation in C

This is the code:
#include<stdio.h>
#include<string.h>
int main()
{
char *s = "name";
int n = strlen(s);
int i;
s = &s[n+1];
for(i=0; i<=n; i++)
{
printf("%d %c",i,*s);
s++;
}
return 0;
}
Output:
0 %1 d2 3 %4 c
I am unable to understand the output. Why is it printing % even though there is no escape sequence?

This line s = &s[n+1]; is causing your pointer to point off into the middle of nowhere. After that you start reading random garbage from it. Apparently that random garbage includes some % characters.

You first assign s = &s[n+1]; and then read out-of-bounds memory in printf through *s. According to the C standard, that is undefined behavior.
The highest valid index into s[] is the length of the string, which is where the \0 sits. Remember that valid indexes run from 0 to (size of the array) - 1.
Your string is stored in memory something like:
  s          23   24   25   26   27   28
+----+     +----+----+----+----+----+----+
| 23 |     | n  | a  | m  | e  | \0 | ?  |
+----+     +----+----+----+----+----+----+
             0    1    2    3    4    5
s points to string "name"
string length of "name" is 4
length("name") + 1 = 5
? means garbage values
In the expression s = &s[n+1];, n + 1 is 5, which points just past the memory allocated for the string "name" (indexes 0 to 4). The printf statement then reads that memory through the * dereference operator, which is an invalid access, so the behavior of this code at run time is undefined. That is why your code can behave differently from one execution to the next.
Your code compiles because it is syntactically correct, but at run time the OS kernel may detect an access to unallocated memory. In that case the kernel sends a signal to the offending process, which typically terminates it with a core dump. (Interesting to note how the OS reports a memory-access violation: an invalid access to valid memory gives SIGSEGV, and an access to an invalid address gives SIGBUS.) In the worst case your program executes without any failure and simply produces garbage results.

s = &s[n+1];
This makes s point to unknown memory. Dereferencing s after that invokes undefined behavior, and anything may happen when you access it in printf.

This is undefined behaviour, because s[n+1], where n is the length of the string, lies outside the string. You are also assigning that new address back into s. Accessing any index starting from this location results in undefined behaviour, because you have no idea what lies at those locations or whether you are allowed to access them.
You may try defining another string immediately after the one you defined.
char *s = "name";
char *d = "hello test";
In that case you might end up printing the characters from the string "hello test", if the compiler happens to store that string immediately after the string "name" in the read-only area. This is not guaranteed.
The bottom line is that this piece of code is not correct and results in undefined behaviour.

You are moving the pointer s past the end of your string, which is why you get random garbage.
If, for example, you change your program to
void foo(char *str)
{}
int main()
{
char *s = "name";
int n = strlen(s);
int i;
s = &s[n+1];
foo("Test");
for(i=0; i<=n; i++)
{
printf("%d %c\n",i,*s);
s++;
}
return 0;
}
I think it might display Test, since the literal "Test" may happen to be stored right after "name".
But you should not do such a thing.

s = &s[n+1]; points s out of bounds. The value of s[n] is '\0', and whatever follows it at s[n+1] is some garbage value.
The assignment above stores the address of s[n+1] in s, and you then print values starting from this new s, so all of them will be garbage. Accessing out-of-bounds memory is undefined behaviour.

In your program:
n = 4;
s = &s[n + 1]; /* i.e. &s[5] */
The pointer s now points to memory that your program never initialized, so the output is unpredictable!

You asked:
Why is it printing % even though there is no escape sequence?
The escape sequence to print a % only applies when you are trying to print the % from within the format string itself. That is:
printf("%% %d\n", 1);
/* output:
% 1
*/
There is no need to escape it when it is being provided as the argument for the format conversion:
printf("%s %c\n", "%d", '%');
/* output:
%d %
*/
Your program invokes undefined behavior: you make s point one past the last element of the object it points to (which is allowed), and then you read from it (and beyond) during the printing loop (which is not allowed). Since it is undefined behavior, it could do nothing, it could crash, or it could create the output you are seeing.
The output you are getting can be obtained from the following program:
#include <stdio.h>
int main () {
const char *s = "%d %c";
int i;
for (i = 0; i < 5; ++i) {
printf("%d %c", i, *s);
s++;
}
puts("");
return 0;
}
/* output is:
0 %1 d2 3 %4 c
*/
This output would be less strange if there was a delimiter between each call to printf. If we add a newline at the end of the output after each call to printf, the output becomes:
0 %
1 d
2
3 %
4 c
As you can see, it is simply outputting the string pointed to by s, where each character it prints is preceded by the index position of that character.

As many people have pointed out, you have moved the pointer s past the end of the static string.
As the printf format string is also a static string, there is a chance that the memory next to the static string "name" is the printf format string. This, however, is not guaranteed; you could just as well be printing garbage memory.

Related

C printf prints an array that I didn't ask for

I have recently started learning C and I got into this problem where printf() prints an array I didn't ask for.
I was expecting an error since I used %s format in char array without the '\0', but below is what I got.
char testArray1[] = { 'a','b','c'};
char testArray2[] = { 'q','w','e','r','\0' };
printf("%c", testArray1[0]);
printf("%c", testArray1[1]);
printf("%c\n", testArray1[2]);
printf("%s\n", testArray1);
the result is
abc
abcqwer
thanks
The format "%s" expects that the corresponding argument points to a string: sequence of characters terminated by the zero character '\0'.
printf("%s\n", testArray1);
Since the array testArray1 does not contain a string, the call above has undefined behavior.
Instead you could write
printf("%.*s\n", 3,testArray1);
or
printf("%.3s\n", testArray1);
specifying exactly how many elements of the array you are going to output.
Note that in C, instead of these declarations
char testArray1[] = { 'a','b','c'};
char testArray2[] = { 'q','w','e','r','\0' };
you may write
char testArray1[3] = { "abc" };
char testArray2[] = { "qwer" };
or, equivalently,
char testArray1[3] = "abc";
char testArray2[] = "qwer";
In C++ the first declaration will be invalid.
%s does stop when it encounters \0, but testArray1 does not have one, so it keeps printing the bytes that follow in memory.
And the compiler magically (actually intentionally, though nothing in the language guarantees it) places testArray2 right after testArray1, so the memory looks like:
a b c q w e r \0
^ testArray1 starts here
      ^ testArray2 starts here
And %s will print all of those chars until it meets a \0.
You can validate that by:
printf("%d\n", testArray2 == testArray1 + 3);
// prints `1`(true)
As for your question: there was no error because the a ... r \0 sequence in memory is owned by your process. Only when the program tries to access an address it does not own will the OS raise an error.
Add a zero at the end of the first array:
char testArray1[] = { 'a','b','c', 0 };
Otherwise printf continues into the memory after 'c' until it finds a zero byte, and that is where the second array is.
PS: a plain 0 is exactly equivalent to the longer '\0'.

Why doesn't i++ work? Meanwhile printf("%c", a) is working just as intended

int getLevelWidth(FILE *level){
char a;
int i = 0;
while(fgets(&a, 2, level)) {
printf("%c",a);
i++;
}
printf("%i", i);
return 0;
}
This is file's content:
ABCDEFGHIJ
KLMNOPQ
RSTUVW
XYZ
And this is the output:
ABCDEFGHIJ
KLMNOPQ
RSTUVW
XYZ
1
The fgets function expects as its first parameter a pointer to the first element of an array of char, and the length of that array as the second. You're passing it the address of a single character and telling it that it is an array of size 2. This means that fgets is writing past the bounds of the variable a, triggering undefined behavior.
What most likely happened in this particular case is that a was followed immediately by i in memory, so writing past the bounds of a ended up writing into i. And assuming your system uses little-endian byte ordering, the first byte of i contains its lowest order byte. So by treating a as a 2 character array, the character in the file is written into a and the terminating null byte (i.e. the value 0) for the string is written into the first byte of i, and assuming the value of i was less than 256 this resets its value to 0.
But again, this is undefined behavior. Just because this is what happened in this particular case doesn't mean that it will always happen.
Since you only want to read a single character at a time, you instead want to use fgetc. You'll also want to change the type of a to an int to match what the function returns so you can check for EOF.
int a;
int i = 0;
while((a=fgetc(level)) != EOF) {
printf("%c",a);
i++;
}
You need a buffer that is 2 chars long. Your code writes 2 chars into a single char, so the second one is written out of bounds. That is undefined behaviour.
Using the same fgets function:
int getLevelWidth(FILE *level){
char a[2];
int i = 0;
while(fgets(a, 2, level)) {
printf("%c",a[0]);
i++;
}
printf("%i", i);
return 0;
}

Why does the program print out an "#" when I enter nothing?

I wrote a program in order to reverse a string. However, when I enter nothing but press the "enter" key, it prints out an "#".
The code is as follows:
#include <stdio.h>
#include <string.h>
int main(void)
{
int i, j, temp;
char str[80];
scanf("%[^\n]s", str);
i = strlen(str);
//printf("%d\n", i);
//printf("%d\n", sizeof(str));
for (j=0; j<i/2; j++) {
temp=str[i-j-1];
str[i-j-1]=str[j];
str[j]=temp;
}
for(i = 0; str[i] != 0; i++)
putchar(str[i]);
}
I tried to use printf() function to see what happens when I press the "Enter" key.
However, after adding printf("%d\n", i);, the output became "3 #".
After adding printf("%d\n", sizeof(str));, the output became "0 80".
It seemed as if "sizeof" had automatically "fixed" the problem.
My roommate said that the problem may result from initialization. I tried to change the code char str[80] to char str[80] = {0}, and everything works well. But I still don't understand what "sizeof" does when it exists in the code. If it really results in the initialization, why will such thing happen when the program runs line by line?
When you declare an array without initializing any part of the array, you receive a pointer to a memory location that has not been initialized. That memory location could contain anything. In fact, you're lucky it stopped just at the #.
By specifying char str[80] = {0} you are effectively saying:
char str[80] = {0, 0, 0, /* 77 more times */ };
Thereby initializing the string to all null values. This is because the compiler automatically pads arrays with nulls if it is partially initialized. (However, this is not the case when you allocate memory from the heap, just a warning).
To understand why everything was happening, let's follow through your code.
When you set i to the value returned by strlen(str), strlen iterates over the location starting at the memory location pointed to by str. Since your memory is not initialized, it finds a # at location 0 and then 0 at location 1, so it correctly returns 1.
What happens with the loops when you don't enter anything? i is set to 0, j is set to 0, so the condition j<i/2 evaluates to 0<0, which is false so it moves on to the second condition. The second condition only tests if the current location in the array is null. Coincidentally you are returned a memory location where the first char is #. It prints it and luckily the next value is null.
When you use the sizeof operator, you are receiving the size of the entire array that you were allocated on the stack (this is important; you may run into this issue later if you start using pointers). If you used strlen, you would have received 1 instead.
Suggestions
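Instead of relying only on i = strlen(str);, I would suggest checking the return value of scanf("%[^\n]s", str) first. scanf returns the number of input items it successfully assigned, so here it returns 1 if it stored a line in str and 0 if the line was empty, in which case str is left untouched and must not be passed to strlen. Also, try to use more descriptive variable names; it makes reading code so much easier.
Do a memset of str; then it will contain nothing instead of garbage: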
char str[80];
memset(str,0,80);

How printf statement works in the below code?

void main()
{
printf("Adi%d"+2,3);
}
Output: i3
This printf statement worked, but how did it work?
printf("Adi%d"+2,3);
"Adi%d" is interpreted as the address of the start of the memory where the string literal "Adi%d" is stored. When you add 2 to it, it becomes the address of the memory where the string "i%d" is stored. So basically you passed printf the string "i%d". printf then replaces %d with 3, hence the output i3.
It is plain pointer-to-character arithmetic and has nothing to do with printf: "Adi" + 2 makes it read from position 0 + 2 = 2, which is the i.
int main()
{
char* a = "Adi" + 2;
printf(a); // output i
}

simple string counter program debugging (pointers)

I am new to C programming and pointers.
I made a simple program that reads in a string and tells you how many characters it contains and how many times each character appears.
Somehow, my output is not right. I think it might be my pointer and dereferencing problem.
here is my main:
extern int* count (char* str);
int main(int argc, char* argv[])
{
int numOfChars =0;
int numOfUniqueChars = 0;
char str[80];
int *counts;
strcpy(str, argv[1]);
printf("counting number of characters for \"%s\"..\n", str);
printf("\n");
counts = count(str);
int j;
for (j =0; j<sizeof(counts); j++)
{
if(counts[j])
printf("character %c", *str);
printf("appeared %d times\n", counts[j]);
numOfChars++;
numOfUniqueChars++;
}
printf("\"%s\" has a total of %d character(s)\n", str, numOfChars);
printf("wow %d different ascii character(s) much unique so skill\n", numOfUniqueChars);
}
and this is my count function:
int* count(char* str)
{
int* asctb = malloc(256);
int numOfChars =0;
int i;
int c;
for(i = 0; i<strlen(str); i++)
c = str[i];
asctb[c]++;
numOfChars += strlen(str);
return asctb;
}
and when I compile and run it, my result comes up like this:
./countingCharacter doge
counting number of characters for "doge"...
appeared 0 times
appeared 0 times
appeared 0 times
appeared 0 times
"doge" has a total of 4 character(s)
wow 4 different ascii character(s) much unique so skill
But, I want my result to be like this:
Character d appeared 1 times
Character e appeared 1 times
Character g appeared 1 times
Character o appeared 1 times
"doge" has a total of 4 character(s)
wow 4 different ascii character(s) much unique so skill
Any help will be much appreciated.
Thanks in advance!
EDIT:
i added curly braces for my for loop in the main function.
now i get this result:
./countingCharacter doge
character # appeared 7912 times
character d appeared 1 times
character e appeared 1 times
character g appeared 1 times
character o appeared 1 times
why do I get that "#" in the beginning??
As @kaylum said, one particularly large issue is your use of braces. If you don't use braces with a control flow statement (if, for, while, etc.), only the next statement is counted as a part of that statement. As such, this segment:
if (counts[j])
printf("character %c", *str);
printf("appeared %d times\n", counts[j]);
/* ... */
...will only execute the first printf if counts[j] != 0, but will unconditionally execute the following statements.
Your use of malloc is also incorrect. malloc(256) will only allocate 256 bytes; one int is generally 4 bytes, but this differs based on the compiler and the machine. As such, when malloc'ing an array of any type, it's good practice to use the following technique:
type *array = malloc(element_count * sizeof(type));
In your case, this would be:
int *asctb = malloc(256 * sizeof(int));
This ensures you have room to count all the possible values of char. In addition, you'll have to change the way you iterate through counts as sizeof (counts) does not accurately represent the size of the array (it will most likely be 4 or 8 depending on your system).
The variable numOfChars will not behave the way you expect it to. It looks to me like you're trying to share it between the two functions, but because of the way it's declared this will not happen. In order to give global access to the variable, it needs to be declared at global scope, outside of any function.
Also, the line:
printf("character %c ", *str);
...neither keeps track of what characters you've printed nor which you're supposed to, instead just repeatedly printing the first character. *str should be (char)j, since you're printing ASCII values.
That ought to do it, I think.
If you are new to C, there are a number of issues in your code you need to pay attention to. First, if a function returns a value, validate that value. Otherwise, from that point in your code on, you can have no confidence that it is actually operating on the value or memory location you think it is. For example, each of the following should be validated (or changed to stay within allowable array bounds):
strcpy(str, argv[1]);
int* asctb = malloc(256);
counts = count(str);
What if argv[1] had 100 chars? What if malloc returned NULL? How do you know count succeeded? Always include the necessary validations needed by your code.
While not an error, the standard coding style for C avoids camelCase variables in favor of all lower-case. See e.g. NASA - C Style Guide, 1994. So
int numOfChars =0;
int numOfUniqueChars = 0;
could simply be nchars and nunique.
Next, none of your if statements or for loops enclose the required statements in braces, e.g. {...}, to create a proper block. For example, the following:
for(i = 0; i<strlen(str); i++)
c = str[i];
asctb[c]++;
only loops over c = str[i];, while asctb[c]++; is only executed AFTER the loop exits.
You must initialize your variables (especially your array elements) before you attempt to read them; otherwise undefined behavior results. (It could seem to work, give weird output like a strange "#" character, or segfault; that is why it is undefined.) You have a big problem here:
int* asctb = malloc(256);
None of the values in asctb are initialized. So when you return the array to main() and loop over all values in the array, every element that was not explicitly assigned a value causes undefined behavior. You can either set all values to 0 with memset, or recognize when you need all values initialized and use calloc instead:
int *asctb = calloc (256, sizeof *asctb);
Avoid the use of "magic numbers" in your code. 256 above is a great example. Don't litter your code with these magic numbers; instead, define a constant for them at the beginning of your code with either #define or, for numerical constants, an enum.
Lastly, in any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address of the block of memory so that (2) it can be freed using free when it is no longer needed. You should validate your memory use by running your code through a memory error checking program, such as valgrind on Linux. It's simple to do and will save you from yourself more times than you can imagine.
Putting all these pieces together and fixing additional logic errors in your code, it looks like you were attempting something similar to the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* constants for max characters in str and values in asctb */
enum { MAXC = 80, MAXTB = 128 };
int *count (char *str);
int main (int argc, char **argv) {
if (argc < 2) { /* validate str given as argument 1 */
fprintf (stderr, "error: insufficient input, usage: %s str.\n",
argv[0]);
return 1;
}
/* initialize all variables; avoid camelCase names in C */
char str[MAXC] = "";
int j = 0, nchars = 0, nunique = 0;
int *counts = NULL;
strncpy (str, argv[1], MAXC - 1); /* limit copy len */
str[MAXC - 1] = 0; /* nul-terminate str */
printf ("\ncounting number of characters for \"%s\"..\n\n", str);
if (!(counts = count (str))) { /* validate return */
fprintf (stderr, "error: count() returned NULL.\n");
return 1;
}
for (j = 0; j < MAXTB; j++)
if (counts[j]) {
printf ("character '%c' appeared: %d times\n",
(char)j, counts[j]);
nchars += counts[j];
nunique++;
}
free (counts); /* free allocated memory */
printf ("\n\"%s\" has a total of %d character(s)\n", str, nchars);
printf (" wow %d different ascii character(s) much unique so skill\n\n",
nunique);
return 0; /* main is a function of type 'int' and returns a value */
}
int *count (char *str)
{
if (!str) return NULL; /* validate str */
int *asctb = calloc (1, sizeof *asctb * MAXTB);
size_t i; /* you are comparing with size_t in loop */
if (!asctb) { /* validate memory allocation - always */
fprintf (stderr, "count() error: virtual memory exhausted.\n");
return NULL;
}
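for (i = 0; str[i]; i++) { /* stop at the nul terminator */
unsigned char c = (unsigned char)str[i]; /* never a negative index */
if (c < MAXTB) /* skip bytes outside the table */
asctb[c]++;
}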
return asctb;
}
(note: the first 32 characters in counts (codes 0 to 31) are in the non-printable range, see ASCIItable.com. The indexes were left as you had them, but note, in practice you may want to shift them unless you are interested in counting the non-printable \t, \n, etc. chars).
Example Use/Output
$ ./bin/ccount "address 12234"
counting number of characters for "address 12234"..
character ' ' appeared: 1 times
character '1' appeared: 1 times
character '2' appeared: 2 times
character '3' appeared: 1 times
character '4' appeared: 1 times
character 'a' appeared: 1 times
character 'd' appeared: 2 times
character 'e' appeared: 1 times
character 'r' appeared: 1 times
character 's' appeared: 2 times
"address 12234" has a total of 13 character(s)
wow 10 different ascii character(s) much unique so skill
Look over the logic and syntax corrections and let me know if you have any further questions.

Resources