Scanning and printing strings using pointers - c

I wrote code that scans a string from the user using pointers, copies it into another char array, and prints that array. The output is strange: the first three characters are printed, but the rest come out as random garbage values. Please tell me what is wrong with the code. Here it is:
#include<stdio.h>
int main(void)
{
char str1[8];
char *p1=str1;
char str2[8];
char *p2=str2;
printf("Enter a string\n");
while(*p1)
scanf("%c",p1++);
*p1='\0';
p1=&str1[0];
while(*p1)
*p2++=*p1++;
*p2='\0';
printf("The copied string is :");
p2=&str2[0];
while(*p2)
printf("%c",*p2++);
}

Your strings str1[] and str2[] are not reliably terminated with the null character ('\0'), because the condition of your first while loop, while(*p1), dereferences and tests a value that has never been initialized. Read until the newline instead:
printf("Enter a string\n");
do{
scanf("%c",p1++);
}while(*(p1 - 1) != '\n'); //try the do while loop
*(p1 - 1) = '\0'; //placing terminating null character
p1 = &str1[0];
while(*p1){
*p2++ = *p1++;
}
*p2 = '\0'; //placing terminating null character
here's the demo code: https://ideone.com/dF2QsJ
Why have you checked the condition for the new line in the do while condition? And why p1-1?
This is because you end the input by entering a '\n' which gets stored at p1 and then p1 moves to p1 + 1 at the end of each iteration. So, I check whether a '\n' is present at p1 - 1.

okay,
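Why aren't you using %s to read the input directly? You can read a whole word at once rather than looping over each character. (%s stops at the first whitespace, and it needs a field width such as %7s so that it cannot overflow the 8-char array.)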

This loop
while(*p1)
scanf("%c",p1++);
checks the contents of str1 (pointed at by p1) before ever storing anything there. That uninitialized memory might contain anything, so this loop might never execute (if the first char happens to be NUL), or might run off the end of the array (corrupting memory).
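A minimal sketch of one way to avoid that problem: let the loop condition depend on the character just read and on the space left in the array, never on the array's old contents. The name read_line_ptr and the FILE * parameter are illustrative and not from the question; passing stdin corresponds to the original program.

```c
#include <stdio.h>

/* Hypothetical helper: reads characters from `in` into buf, whose capacity
 * is `size` (must be at least 1). Stops at a newline (consumed, not stored),
 * at end of input, or when size - 1 characters have been stored; anything
 * beyond that stays unread in the stream. Always writes the terminating
 * '\0' and returns the number of characters stored. */
size_t read_line_ptr(FILE *in, char *buf, size_t size)
{
    char *p = buf;
    char *last = buf + size - 1;   /* slot reserved for '\0' */
    int c;                         /* int, so EOF stays distinguishable */

    while (p < last && (c = fgetc(in)) != EOF && c != '\n')
        *p++ = (char)c;
    *p = '\0';
    return (size_t)(p - buf);
}
```

With char str1[8], a call such as read_line_ptr(stdin, str1, sizeof str1) stores at most 7 characters, so the copy loop that follows always finds a terminator inside the array.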

Related

while loop with only parentheses syntax, in c

I just saw this "while(something);" syntax. I googled it but did not find anything. How does this work? The second while in the example code especially confuses me.
This code is a program that concatenates two strings using pointers.
#include <stdio.h>
#define MAX_SIZE 100 // Maximum string size
int main()
{
char str1[MAX_SIZE], str2[MAX_SIZE];
char * s1 = str1;
char * s2 = str2;
/* Input two strings from user */
printf("Enter first string: ");
gets(str1);
printf("Enter second string: ");
gets(str2);
/* !!!!!!!!!!!!!!!!! this is it!!!!!!!!!!!!!!!!!!!! Move till the end of str1 */
while(*(++s1));
/* !!!!!!!!!!!!!!!!! this is it!!!!!!!!!!!!!!!!!!!! Copy str2 to str1 */
while(*(s1++) = *(s2++));
printf("Concatenated string = %s", str1);
return 0;
}
The while loop is defined in C the following way
while ( expression ) statement
In this while loop
while(*(++s1));
the statement is a null statement. (The C Standard, 6.8.3 Expression and null statements)
3 A null statement (consisting of just a semicolon) performs no
operations.
So in the above while loop the expression is evaluated cyclically until it logically becomes false.
Note that this while loop has a bug. ;)
Let's assume that the pointed string is empty "". In memory it is represented the following way
{ '\0' }
So initially s1 points to the terminating zero.
But before dereferencing it is incremented in the expression of the while loop
while(*(++s1));
        ^^^^
and after that it points into the part of the character array after the terminating zero '\0', which may be uninitialized. So the loop can invoke undefined behavior.
It would be more correct to rewrite it as
while( *s1 != '\0' ) ++s1;
In this case after the loop the pointer s1 will point to the terminating zero '\0' of the source string.
This while loop where the statement is again a null statement
while(*(s1++) = *(s2++));
can be rewritten the following way
while( ( *s1++ = *s2++ ) != '\0' );
that is in essence the same as
while( ( *s1 = *s2 ) != '\0' )
{
++s1;
++s2;
}
(except that if the terminating zero was encountered and copied the pointers are not incremented)
That is, the result of the assignment ( *s1 = *s2 ) is the assigned character, which is then compared with the terminating zero character '\0'. If they are equal, the loop stops, and at that point the string pointed to by s2 has been appended to the string pointed to by s1.
Note that the function gets is unsafe and is no longer part of the C Standard (it was removed in C11). Instead you should use the function fgets, for example
#include <string.h>
#include <stdio.h>
//...
printf("Enter first string: ");
fgets(str1, sizeof( str1 ), stdin );
str1[ strcspn( str1, "\n" ) ] = '\0';
The last statement removes the newline character '\n' that fgets may store at the end of the entered string.
You also need to check in the program that the array str1 has enough space for the string stored in str2 to be appended to the string stored in str1.
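As a rough sketch, the two corrected loops and the space check can be combined into one function. The name concat_checked and its return convention are assumptions made for illustration; they do not come from the question's program.

```c
#include <stddef.h>

/* Hypothetical helper: appends src to the string already stored in dst,
 * where dst_size is the total capacity of the dst array. Returns 0 on
 * success, or -1 (leaving dst unchanged) if the result would not fit. */
int concat_checked(char *dst, size_t dst_size, const char *src)
{
    char *d = dst;
    const char *s = src;
    size_t used = 0, need = 0;

    while (*d != '\0') {          /* test first, then advance */
        ++d;
        ++used;
    }
    while (s[need] != '\0')       /* length of src */
        ++need;

    if (need >= dst_size - used)  /* no room for src plus its '\0' */
        return -1;

    while ((*d++ = *s++) != '\0') /* the assigned character is the condition */
        ;
    return 0;
}
```

Because the first loop tests *d before incrementing, an empty dst is handled correctly: d stays on the terminating zero and the copy starts there.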
while(*(++s1)); is an obfuscated and bugged way of writing while(*s1 != '\0') { s1++; }.
(It should have been while(*(s1++)); to behave as expected, but that too is wrong since it increments the pointer upon failure and won't work with an empty string.)
while(*(s1++) = *(s2++)); is an obfuscated (and likely inefficient) way of writing strcpy(s1,s2);.
The whole program is an obfuscated way of writing strcat(s1, s2);. You can replace both of these buggy while loops with that single function call.
Generally while(something); is bad practice, to the point where compilers might even warn about it, since it isn't clear whether the semicolon ended up there on purpose or by a slip of the finger. Preferred style is either:
while(something)
; // aha this was surely not placed there by accident
or
while(something){}
or
while(something)
{}
++s1 advances (or increments) the pointer before the while checks its value.
The while loop will iterate through the string until it reaches the null terminator, since the terminator '\0' has the value 0 and while(0) is the same as while(false).
The loop
while(*(++s1));
doesn't need a body because everything is done inside the loop condition.
Therefore the loop body is an empty statement ;.
The loop consists of the following steps:
++s1 increment pointer
*(...) dereference pointer, i.e. get the data where the pointer points to.
use the value as the condition (0 is false, everything else is true)
The loop can be rewritten as
do
{
++s1;
}
while(*s1); // or while(*s1 != '\0');
Similarly, the other loop
while(*(s1++) = *(s2++));
can be written as
char c; /* declared outside the block so the loop condition can see it */
do
{
    *s1 = *s2;
    c = *s1;
    s1++;
    s2++;
}
while(c != '\0');
Note that the original loop condition contains an assignment (=), not a comparison (==). The assigned value is used as the loop condition.

Problem reading two strings with getchar() and then printing those strings in C

This is my code for two functions in C:
// Begin
void readTrain(Train_t *train){
printf("Name des Zugs:");
char name[STR];
getlinee(name, STR);
strcpy(train->name, name);
printf("Name des Drivers:");
char namedriver[STR];
getlinee(namedriver, STR);
strcpy(train->driver, namedriver);
}
void getlinee(char *str, long num){
char c;
int i = 0;
while(((c=getchar())!='\n') && (i<num)){
*str = c;
str++;
i++;
}
printf("i is %d\n", i);
*str = '\0';
fflush(stdin);
}
// End
So, with the void getlinee(char *str, long num) function I want to read user input into the first string, char name[STR], and into the second, char namedriver[STR]. The maximum string size is STR (30 characters). If I enter more than 30 characters for the first string ("Name des Zugs"), which is stored in name[STR], then enter the second string, which is stored in namedriver, and then print the FIRST string, I do not get just the first 30 characters of my input: the second string is "attached" to it as well, and I simply do not know why. Otherwise it works fine, as long as the 30-character limit is respected for the first string.
Here is my output when the input for the first string is longer than 30 characters. The problem is in row 5, "Zugname": why do I also get the second string when I am printing just the first one?
Name des Zugs:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
i is 30
Name des Drivers:xxxxxxxx
i is 8
Zugname: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaxxxxxxxx
Drivername: xxxxxxxx
I think your issue is that your train->name is not properly terminated with '\0', as a consequence when you call printf("%s", train->name) the function keeps reading memory until it finds '\0'. In your case I guess your structure looks like:
struct Train_t {
//...
char name[STR];
char driver[STR];
//...
};
In the getlinee() function, you write '\0' after the last character. In particular, if the input is more than 30 characters long, you copy the first 30 characters, then add '\0' as the 31st character (name[30]). This is the first buffer overflow.
So where is this '\0' actually written? Well, at name[30], even though you're not supposed to write there. Then, if you have the structure above, when you do strcpy(train->name, name); you will actually copy a 31-byte string: 30 chars into train->name, and the '\0' will overflow into train->driver[0]. This is the second buffer overflow.
After this, you overwrite the train->driver buffer, so the '\0' disappears and your data in memory basically looks like:
train->name = "aaa...aaa" // no '\0' at the end so printf won't stop reading here
train->driver = "xxx\0" // but there
You have an off-by-one error on your array sizes -- you have arrays of STR chars, and you read up to STR characters into them, but then you store a NUL terminator, requiring (up to) STR + 1 bytes total. So whenever you have a max size input, you run off the end of your array(s) and get undefined behavior.
Pass STR - 1 as the second argument to getlinee for the easiest fix.
Key issues
Size test in wrong order and off-by-one. ((c=getchar())!='\n') && (i<num) --> (i+1<num) && ((c=getchar())!='\n'). Else no room for the null character. Bad form to consume an excess character here.
getlinee() should be declared before first use. Tip: Enable all compiler warnings to save time.
Other
Use int c; not char c; to reliably distinguish the typical 257 different possible results from getchar() (256 character values plus EOF).
fflush(stdin); is undefined behavior. Better code would consume excess characters in a line with other code.
void getlinee(char *str, long num) better with size_t num. size_t is the right size type for array sizing and indexing.
int i should be the same type as num.
Better code would also test for EOF.
while((i+1<num) && ((c=getchar())!='\n') && (c != EOF)){
A better design would return something from getlinee() to indicate success and identify troubles like end-of-file with nothing read, input error, too long a line and parameter trouble like str == NULL, num <= 0.
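The suggestions above (int c, size_t for sizes, an EOF test, a return value, and discarding excess characters without fflush(stdin)) might be combined roughly as follows. The enum, its constant names, the function name get_line, and the FILE * parameter are all made up for this sketch.

```c
#include <stdio.h>

/* Illustrative status codes; the names are not from the original code. */
enum line_status { LINE_OK, LINE_TOO_LONG, LINE_EOF, LINE_BAD_ARG };

/* Reads one line from `in` into str, whose capacity is num bytes.
 * Stores at most num - 1 characters plus the terminating '\0'.
 * Characters that do not fit are read and discarded up to the newline,
 * so the next call starts on a fresh line. */
enum line_status get_line(FILE *in, char *str, size_t num)
{
    size_t i = 0;
    int c;
    enum line_status status = LINE_OK;

    if (in == NULL || str == NULL || num == 0)
        return LINE_BAD_ARG;

    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (i + 1 < num)
            str[i++] = (char)c;
        else
            status = LINE_TOO_LONG;   /* keep reading to drop the excess */
    }
    str[i] = '\0';

    /* Nothing stored and no newline seen: end of file or a read error. */
    if (c == EOF && i == 0 && status == LINE_OK)
        return LINE_EOF;
    return status;
}
```

In readTrain() this would be called as get_line(stdin, name, STR), leaving room for the terminator inside the STR-byte array, and the caller can decide what to do when LINE_TOO_LONG comes back.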
I believe you have a struct similar to this:
typedef struct train_s
{
//...
char name[STR];
char driver[STR];
//...
} Train_t;
When the input is STR characters or longer (30 in this case), you write the '\0' to name[STR], which does not exist: the last element of an array of length STR has index STR-1 (29 in this case), so you are writing a '\0' outside your array.
And, since two strings in this struct are stored one after another, you are writing a '\0' to driver[0], which you immediately overwrite, hence when printing out name, printf doesn't find a '\0' until it reaches the end of driver, so it prints both.
Fixing this should be easy.
Just change:
while(((c=getchar())!='\n') && (i<num))
to:
while(((c=getchar())!='\n') && (i<num - 1))
Or, as I would do it, add 1 to array size:
char name[STR + 1];
char driver[STR + 1];

Using getchar() to Read Two Strings

For the following code, I added two printf statements to test if the two strings are read properly. However, when I enter something like: abcabcabcza,cb
The outputs are:
abcabcabcza▒
cb9
Does anyone know where the symbol at the end of the first string, and the '9' at the end of the second string, come from? Thank you so much!
printf("\nEnter two words, seperated by a comma: ");
int temp1, temp2, index3, index4; char temp3[20], temp4[20];
index3=index4=0;
while((temp1 = getchar())!= ','){
temp3[index3++] = temp1;
}
printf("\n%s", temp3);
while((temp2 = getchar())!= '\n'){
temp4[index4++] = temp2;
}
printf("\n%s", temp4);
You need to add the string terminator '\0' to each string before printing it (or zero out the buffers' memory first).
Also: you have declared buffers of size 20, but have no guards in your code to respect that allocated length, which means you could overrun them and corrupt memory. [Run with two words greater than 20 characters...]
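Both points, the terminator and the length guard, could be handled in one small helper. This is only a sketch: read_field is a hypothetical name, and the FILE * parameter is an assumption added so the function does not depend on stdin.

```c
#include <stdio.h>

/* Hypothetical helper: reads characters from `in` up to (not including)
 * `delim`, a newline, or end of input. Stores at most size - 1 of them in
 * buf and always terminates buf; characters that do not fit are discarded.
 * size must be at least 1. Returns the number of characters stored. */
size_t read_field(FILE *in, int delim, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    while ((c = fgetc(in)) != EOF && c != delim && c != '\n') {
        if (n + 1 < size)
            buf[n++] = (char)c;
    }
    buf[n] = '\0';
    return n;
}
```

The two loops in the question would then become read_field(stdin, ',', temp3, sizeof temp3); followed by read_field(stdin, '\n', temp4, sizeof temp4);, and both buffers are safe to print with %s.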
'\n' and '\0' are different here. You need to add '\0' at the end of the string because printf prints a string until it meets '\0'. C doesn't initialize local arrays automatically. If you do not initialize them manually, they will hold garbage values.
I would do:
char temp3[20] = {0};
char temp4[20] = {0};
to fill temp3 and temp4 with 0, which is the same as '\0'.
When you write a string literal such as "abc", it is stored as 'a', 'b', 'c', '\0'. So check for the '\0' as well, and don't print it.
A string in C must be null-terminated.
Many functions rely on that terminator to know where the string ends.
What happens if it is missing? Take a simple string that occupies 5 bytes of memory.
...[?][?][H][e][l][l][o][?][?][?]...
As you can see, no terminator has been stored, so a function such as printf reads past the 'o' into whatever bytes follow. That is undefined behavior, and you may see different garbage on every run. The bytes after the array are not under your control, and you cannot rely on the compiler or the runtime to have zeroed them.

Pointers and Strings?

I want to write a program that erases all characters in string 1 that appear in string 2, using pointers.
This is what I did, but it did not work.
#include<stdlib.h>
#include<stdio.h>
#include<string.h>
main()
{
char ch1[100] , ch2[100] ;
char *p1 , *p2;
printf("first chaine ");
gets(ch1);
printf("sd chaine");
gets(ch2);
for(p1=ch1;p1<ch1+100;p1++)
{
for(p2=ch2;p2<ch2;p2++)
{
if(*p1==*p2)
{
strcpy(p1,p1+1);
}
}
}
puts(ch1);
return 0 ;
}
strcpy() expects that its source and destination arguments don't overlap in memory — in other words, writing to the destination string shouldn't overwrite parts of the source string. So you can't use it to "shift" a string by an amount that's less than its length. Instead, you can use memmove(), which supports overlapping ranges.
You can replace your strcpy line with:
memmove(p1, p1+1, strlen(p1+1) + 1);
which will correctly do what you had expected the strcpy() call to do. The + 1 makes the move include the terminating '\0'.
Also, your termination condition for the inner loop is p2<ch2, which is always false since they start out equal. You probably meant to write p2<ch2+100.
Your loop conditions have another problem, though: they go past the end of the actual string that's stored in the array. If the user types fewer than 99 characters of input for either string, the corresponding array will contain garbage characters after the null terminator. In the ch1 array, scanning past the end of the string may cause strlen() to go past the end of the whole array looking for another null terminator, and in ch2, going past the end of the string will cause the program to filter out characters that the user didn't specify.
You should change the two loop conditions to *p1 != '\0' and *p2 != '\0'. This will make the loops stop when they reach the end of the two strings.
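Put together, the memmove() approach with the corrected loop conditions might look like the sketch below. erase_chars is a hypothetical name, strchr() stands in for the inner loop, and the length passed to memmove() includes the terminating '\0'.

```c
#include <string.h>

/* Hypothetical helper: removes from s every character that occurs in
 * reject. Both arguments must be null-terminated; s is modified in place. */
void erase_chars(char *s, const char *reject)
{
    char *p = s;

    while (*p != '\0') {
        if (strchr(reject, *p) != NULL) {
            /* Shift the tail, including its '\0', one place to the left.
             * p is not advanced: a new character now sits at *p. */
            memmove(p, p + 1, strlen(p + 1) + 1);
        } else {
            ++p;
        }
    }
}
```

Not advancing p after a removal matters: with an unconditional p1++ the character that slides into the current position would never be checked, so runs of matching characters would only be partly removed.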
NEVER USE GETS()
It's unsafe to use gets() under any circumstances, because it doesn't check the length of its input against the length of the array. Use fgets() instead.
Now that you understand that, take a look at your inner loop:
for(p2=ch2;p2<ch2;p2++)
You set p2=ch2, then check if p2<ch2. This will always be false. Perhaps you mean to check if p2<ch2+100?
First of all, you need bounds checking. gets() does not provide bounds checking.
As for your loops, you will never enter the nested loop:
for(p2=ch2;p2<ch2;p2++)
Your initialization will always make your condition false, and you will never enter the loop.
Here is one solution to the problem. This code eliminates the inner loop of the question code by using strchr() to determine whether a specific character of string1 is present in string2:
#include <stdio.h>
#include <string.h>
int main(void)
{
char ch1[100] , ch2[100];
char *p1, *p2;
/* Get string1 from stdin. */
printf("first chaine ");
fgets(ch1, sizeof(ch1), stdin);
/* Get string2 from stdin. */
printf("sd chaine ");
fgets(ch2, sizeof(ch2), stdin);
/* Eliminate all chars from string1 that appear in string2. */
for(p1=ch1, p2=ch1; *p1; p1++)
{
if(strchr(ch2, *p1))
continue;
*p2++ = *p1;
}
*p2 = '\0';
/* Print modified string1. */
puts(ch1);
return(0);
}
Execution example of the above code:
SLES11SP2:~/SO> ./test
first chaine Now is the time for all good men to come to the aid of their country.
sd chaine aeiou
Nw s th tm fr ll gd mn t cm t th d f thr cntry.
SLES11SP2:~/SO>

Initializing end of the string in C

I am learning C now, and I'm at the point where I don't really get why the end of a string has to be set to the null character '\0'. Below is the example from the book:
#include <stdio.h>
#include <string.h>
int main(){
int i;
char str1[] = "String to copy";
char str2[20];
for(i = 0; str1[i]; i++)
str2[i] = str1[i];
str2[i] = '\0'; //<====WHY ADDING THIS LINE??
printf("String str2 %s\n\n", str2);
return 0;
}
So, why do I have to add the null character? It works without that line as well. Also, is there a difference if I use:
for (i = 0; str1[i]; i++){
str2[i] = str1[i];
}
Thanks for your time.
The line you're referring to is added in general use for safety. When you copy values to a string you always want to be sure that it's null terminated, otherwise when reading the string it will continue past the point where you want the end of that string to be (because it doesn't know where to stop due to lack of the null terminator).
There is no difference with the alternate code you posted: the braces enclose only the single statement below the for, which is part of the loop anyway when you don't use curly braces {}.
In C, the end of a string is detected by the null character. Consider the string "abcd". If another variable happens to be stored in memory immediately after the 'd' character and there is no terminator, C will treat the bytes that follow as part of that string and keep reading. This is called a buffer overrun.
The declaration that reserves 20 bytes for str2 does not initialize them. They will often happen to be zero, which is why the program appears to work without the line, but this is not required and may not occur. Additionally, let us say you move a 15-character string into str2. If the array starts out as 20 zeroes, this will work. However, say that you then copy a 10-character string into str2. The remaining 5 characters will be unchanged, and you will then have a 15-character string consisting of the new 10 characters followed by the last five characters previously copied in.
In the code above the for loop says move the character in str1 to str2 and point to the next character. If the character now pointed to in str1 is not 0, loop back and do again. Otherwise drop out of the loop. Now add the null character to the end of the str2. If you left that out, the null character at the end of str1 would not be copied to str2, and you would have no null character at the end of str2.
This can be expressed as
i = 0;
label:
if (str1[i] == 0) goto end;
str2[i] = str1[i];
i = i + 1;
goto label;
end: /* This is the end of the loop*/
Note that the '\0' character has not yet been moved into str2.
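The leftover-characters effect described earlier in this answer can be reproduced in a well-defined way by starting from a zero-filled array. The two function names below are invented for this sketch; the loop itself is the one from the book.

```c
#include <stddef.h>

/* Copies the characters of src but, like the loop without the extra line,
 * never writes a terminating '\0'. Returns the number of characters copied. */
size_t copy_chars_only(char *dst, const char *src)
{
    size_t i;

    for (i = 0; src[i]; i++)
        dst[i] = src[i];
    return i;
}

/* Same loop, followed by the line from the book. */
size_t copy_terminated(char *dst, const char *src)
{
    size_t i = copy_chars_only(dst, src);

    dst[i] = '\0';
    return i;
}
```

Starting from char str2[20] = {0};, calling copy_chars_only(str2, "String to copy") and then copy_chars_only(str2, "0123456789") leaves "0123456789copy" in str2, because the tail of the first string is still there. With copy_terminated() for the second call, str2 holds just "0123456789".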
Since C requires braces to show the range of the for, only the first statement after the for is part of the loop. If i had scope local to the loop and were lost after it, you would not be able to simply fall out of the loop and then store the 0: you would no longer have a valid index i to tell you where in str2 the 0 belongs.
An example is C++, or C99 and later, which allow (syntactically)
for (int i = 0; str1[i]; i++)
{
str2[i] = str1[i];
}
str2[i] = 0;
This would fail because the i declared in the for statement goes out of scope when the loop ends. After the loop, i refers to a variable declared before the loop, which still holds whatever value it had before the loop was entered (probably 0). If no such variable had been declared, you would get an undeclared-variable compiler error.
I see that you fixed the indentation, but had the original indentation stayed there, the following comment would apply.
C does not work solely by indentation (as Python does, for example). If it did, the logic would be as follows and it would fail because str2 would be overwritten as all 0.
for (int i = 0; str1[i]; i++)
{
str2[i] = str1[i];
str2[i] = 0;
}
You should only add a \0 (also called the null byte) at the end of the string. Do as follows:
...
for(i = 0; str1[i]; i++) {
str2[i] = str1[i];
}
str2[i] = '\0'; //<====WHY ADDING THIS LINE??
...
(note that I simply added braces to make the code more readable, it was confusing before)
For me, that is clearer. What you were doing before is basically take advantage of the fact that the integer i that you declared is still available after you ran the loop to add a \0 in the end of str2.
The way strings work in C is that they are basically a pointer to the location of the first character and string functions (such as the ones you can find in string.h) will read every single char until they find a \0 (null byte). It is simply a convention for marking the end of the string.
Some further reading: http://www.cs.nyu.edu/courses/spring05/V22.0201-001/c_tutorial/classes/String.html
'\0' is used to denote the end of a string. It is not for the compiler; it is for the libraries and possibly your own code. C does not treat arrays as first-class values. You can have local arrays, but you cannot pass them to functions by value. If you try, you just pass the start address (the address of the first element). So you can either make the last element special, e.g. '\0', or always pass the size, being careful not to get it wrong.
For example:
If your string is like this:
char str[]="Hello \0 World";
what would be displayed if you print str?
Output is:
Hello
This is the case for character arrays. Hence, to be on the safe side, it is good to add '\0' at the end of a string.
If you didn't add '\0', some garbage values might get printed, and printing would continue until a '\0' is reached.
In C, a char[] does not know the length of the string it holds. The character '\0' (ASCII 0) is therefore needed to indicate the end of the string. Your for loop does not copy the '\0', so the output can be longer than the string copied into str2 (printing stops only at the next '\0' found in memory).
Try:
#include <stdio.h>
#include <string.h>
int main(){
int i;
char str[5] = "1234";
str[4] = '5';
printf("String %s\n\n", str);
return 0;
}

Resources