How do you assign a string in C - c

Printing the initials (first character) of the string held in the variable 'fn' and the variable 'ln'
#include <stdio.h>
#include <cs50.h>
int main(void)
{
string fn, ln, initials;
fn = get_string("\nFirst Name: ");
ln = get_string("Last Name: ");
initials = 'fn[0]', 'ln[0]';
printf("%s", initials)
}

Read more about C. In particular, read a good C programming book, consult a C reference site, and read the C11 standard (n1570). Notice that cs50.h is not a standard C header (and I have never encountered it).
The string type does not exist in standard C, so your example doesn't compile and is not valid C code.
An important (and difficult) notion in C is undefined behavior (UB). I won't explain what it is here, but see this, read much more about UB, and be really afraid of UB.
Even if you (wrongly) add something like
typedef char* string;
(and your cs50.h might do that) you need to understand that:
not every pointer is valid, and some pointers may contain an invalid address (such as NULL, or most random addresses; in particular an uninitialized pointer variable often has an invalid pointer). Be aware that in your virtual address space most addresses are invalid. Dereferencing an invalid pointer is UB (often, but not always, giving a segmentation fault).
even when a pointer to char is valid, it could point to something which is not a string (e.g. some sequence of bytes which is not NUL terminated). Passing such a pointer (to a non-string data) to string related functions -e.g. strlen or printf with %s is UB.
A string is a sequence of bytes, with additional conventions: at the very least it should be NUL terminated and you generally want it to be a valid string for your system. For example, my Linux is using UTF-8 (in 2017 UTF-8 is used everywhere) so in practice only valid UTF-8 strings can be correctly displayed in my terminals.
Arrays decay into pointers (read more to understand what that means, it is tricky). So on several occasions you might declare an array variable (a buffer)
char buf[50];
then fill it, perhaps using strcpy like
strcpy(buf, "abc");
or using snprintf like
int xx = something();
snprintf(buf, sizeof(buf), "x%d", xx);
and later you can use it as a "string", e.g.
printf("buf is: %s\n", buf);
In some cases (but not always!), you might even do some array accesses like
char c=buf[4];
printf("c is %c\n", c);
or pointer arithmetic like
printf("buf+8 is %s\n", buf+8);
BTW, since stdio is buffered, I recommend ending your printf control format strings with \n or using fflush.
Beware and be very careful about buffer overflows. It is another common cause of UB.
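As a minimal sketch of guarding against overflow when filling a buffer as described above, the return value of snprintf can be compared with the buffer size to detect truncation. The helper name format_tag is hypothetical and only used for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats "x<value>" into buf.
   Returns 1 if the whole text fit, 0 if it was truncated or an error occurred. */
int format_tag(char *buf, size_t size, int value)
{
    /* snprintf never writes more than size bytes, and it NUL-terminates
       the result whenever size is at least 1. Its return value is the
       length the full text would have had. */
    int needed = snprintf(buf, size, "x%d", value);
    return needed >= 0 && (size_t)needed < size;
}
```

With a buffer declared as char buf[50], a call such as format_tag(buf, sizeof buf, xx) lets the caller decide what to do when the text did not fit, instead of silently overflowing.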
You might want to declare
char initials[8];
and fill that memory zone to become a proper string:
initials[0] = fn[0];
initials[1] = ln[0];
initials[2] = (char)0;
the last assignment (to initials[2]) is putting the NUL terminating byte and makes that initials buffer a proper string. Then you could output it using printf or fputs
fputs(initials, stdout);
and you'd better output a newline with
putchar('\n');
(or you might just do puts(initials); ....)
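The steps above can be gathered into one small function. This is only a sketch: the name make_initials and its error convention are assumptions made for illustration, not part of cs50.h or the standard library.

```c
#include <stddef.h>

/* Hypothetical helper: stores the first character of fn and of ln in out,
   followed by the terminating NUL byte. out must have room for 3 bytes.
   Returns 0 on success, -1 if either name is missing or empty. */
int make_initials(const char *fn, const char *ln, char out[3])
{
    if (fn == NULL || ln == NULL || fn[0] == '\0' || ln[0] == '\0')
        return -1;
    out[0] = fn[0];
    out[1] = ln[0];
    out[2] = '\0';   /* makes out a proper string */
    return 0;
}
```

After a successful call, out can be passed to puts or to printf with %s like any other string.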
Please compile with all warnings and debug info, so gcc -Wall -Wextra -g with GCC. Improve your code to get no warnings. Learn how to use your compiler and your debugger gdb. Use gdb to run your program step by step and query its state. Take time to read the documentation of every standard function that you are using (e.g. strcpy, printf, scanf, fgets) even if at first you don't understand all of it.

char initials[]={ fn[0], ln[0], '\0'};
This will form the char array and you can print it with
printf("%s", initials); // This is a string: a null-terminated character array.
There is no string datatype in C. We simulate it using a null-terminated character array.
If you don't put the \0 at the end, it won't be a null-terminated char array, and to print it you will have to index into the array and print the individual characters. (You can't pass it to printf with %s or to the standard string functions.)
char s[] = {'h','i'}; // not null terminated
// But you can work with this by iterating over the elements.
for (size_t i = 0; i < sizeof s; i++)
    printf("%c", s[i]);
To explain further there is no string datatype in C. So what you can do is you simulate it using char [] and that is sufficient for that work.
For example you have to do this to get a string
char fn[MAXLEN], ln[MAXLEN];
Reading an input can be like :
if(!fgets(fn, MAXLEN,stdin) ){
fprintf(stderr,"Error in input");
}
Do similarly for the second char array.
And then you initialize the array initials.
char initials[]={fn[0],ln[0],'\0'};
The benefit of the null-terminated char array is that you can pass it to the functions which work on char* and get a correct result, like strcmp() or strcpy().
Also, there are lots of ways to get input from stdin, and it is always better to check the return value of the standard functions that you use.
The standard doesn't require that all char arrays be null terminated. But if we don't do it that way, then the array is hardly useful in common cases. Like my example above: the array I showed earlier (without the null terminator) can't be passed to strlen() or strcpy() etc.
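A minimal sketch of reading a name with fgets while checking its return value might look like this. The helper name read_line is hypothetical; note that fgets keeps the newline in the buffer, so the sketch removes it with strcspn.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf (at most size - 1
   characters) and removes the trailing newline if there is one.
   Returns 1 on success, 0 on end of input or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    /* strcspn gives the index of the first '\n', or the string length
       if there is none, so this assignment is always inside the string. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

It could be called as read_line(stdin, fn, MAXLEN) and then read_line(stdin, ln, MAXLEN), checking both results before using fn[0] and ln[0].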
Also, knowingly or unknowingly, you have used something interesting: the comma operator.
Suppose you write a statement like this
char initialChar = fn[0] , ln[0]; //This is error
char initialChar = (fn[0] , ln[0]); // This is correct and the result will be `ln[0]`
The , operator first evaluates the first expression, fn[0], discards its value, and then evaluates the second, ln[0]; that value becomes the value of the whole expression, which is assigned to initialChar.
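To make the evaluation order visible, here is a small sketch in which the left operand has a side effect. The function name comma_demo is made up for this illustration.

```c
/* Hypothetical demonstration of the comma operator: the left operand is
   evaluated first (incrementing *counter), its value is discarded, and the
   value of the right operand becomes the value of the whole expression. */
int comma_demo(int *counter)
{
    int result = ((*counter)++, *counter * 10);
    return result;
}
```

Starting with a counter of 1, the call returns 20, because the increment on the left is completed before the right operand is evaluated.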
You can check these helpful links to get you started
Beginner's Guide Away from scanf()
How to debug small programs

Related

Segmentation fault of small code

I am trying to test something and I made a small test file to do so. The code is:
void main(){
int i = 0;
char array1 [3];
array1[0] = 'a';
array1[1] = 'b';
array1[2] = 'c';
printf("%s", array1[i+1]);
printf("%d", i);
}
I receive a segmentation error when I compile and try to run. Please let me know what my issue is.
Firstly, char array1[3]; is not null terminated, as there is not enough space to put '\0' at the end of array1. To avoid this undefined behavior, increase the size of array1 and add the terminator.
Secondly, array1[i+1] is a single char not string, so use %c instead of %s as
printf("%c", array1[i+1]);
I suggest you get yourself a good book/video series on C. It's not a language that's fun to pick up out of the blue.
Regardless, your problem here is that you haven't formed a correct string. In C, a string is a pointer to the start of a contiguous region of memory that happens to be filled with characters. There is no data whatsoever stored about its size or any other characteristics, only where it starts and what it is. Therefore you must provide information as to when the string ends explicitly. This is done by having the very last character in a string be set to the so-called null character (in C represented by the escape sequence '\0').
This implies that any string must be one character longer than the content you want it to hold. You should also never be setting up a string manually like this. Use a library function like strlcpy (a BSD extension, not part of standard C) to do it. It will automatically add a null character, even if your array is too small (by truncating the string). Alternatively you can statically create a literal string like this:
char array[] = "abc";
It will automatically be null terminated and be of size 4.
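Since strlcpy is not available everywhere, a truncating copy with the same basic idea can be sketched in portable C. The name copy_truncated is an assumption for this example; it is not a library function.

```c
#include <string.h>

/* Hypothetical helper in the spirit of strlcpy: copies as much of src as
   fits into dst, always NUL-terminates when dst_size > 0, and returns the
   length of src so the caller can detect truncation. */
size_t copy_truncated(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);
    if (dst_size > 0) {
        size_t n = len < dst_size - 1 ? len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

If the returned length is greater than or equal to the size of the destination, the copy was truncated.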
Strings need to have a NUL terminator, and you don't have one, nor is there room for one.
The solution is to add one more character:
char array1[4];
// ...
array1[3] = 0;
Also you're asking to print a string but supplying a character instead. You need to supply the whole buffer:
printf("%s", array1);
Then you're fine.
Spend the time to learn about how C strings work, in particular about the requirement for the terminator, as buffer overflow bugs are no joke.
When printf sees a "%s" specifier in the format string, it expects a char* as the corresponding argument, but you passed the char value of the array1[i+1] expression. That char gets promoted to int, which is still incompatible with char *. And even if it were compatible, it would have no chance of being a valid pointer to any meaningful character string...

Space for Null character in c strings

When is it necessary to explicitly provide space for a NULL character in C strings.
For eg;
This works without any error although I haven't declared str to be 7 characters long,i.e for the characters of string plus NULL character.
#include<stdio.h>
int main(){
char str[6] = "string";
printf("%s", str);
return 0;
}
Though in this question https://stackoverflow.com/a/7652089 the user says
"This is useful if you need to modify the string later on, but know that it will not exceed 40 characters (or 39 characters followed by a null terminator, depending on context)."
What does it mean by "depending on context" ?
When is it necessary to explicitly provide space for a NULL character in C strings?
Always. Not having that \0 character there will make functions like strcpy, strlen and printing via %s behave wrong. It might work for some examples (like your own) but I won't bet anything on that.
On the other hand, if your string is binary and you know the length of the packet you don't need that extra space. But then you cannot use str* functions. And this is not the case of your question, anyway.
This is a bug; the keyword to search for is "buffer overflow": the code ends up accessing memory beyond the end of the array.
char str[4] = "stringulation";
char str2[20];
printf("%s", str);
printf("%s", str2);
Accessing memory that you have not reserved may lead to data corruption, random output, or other undefined behavior.
Your code invokes undefined behaviour. You may think it works, but the code is broken.
To store a C string with 6 characters, and a null-terminator, you need a character array of length 7 or more.
When is it necessary to explicitly provide space for a NULL character in C strings
There are no exceptions. A C string must always include a null terminating character.
What does it mean by "depending on context"?
The answer there is drawing the distinction between a string variable that you intend to modify at a later time, or a string variable that you will not modify. In the former case, you may choose to allocate more than you need for the initial contents, because you want to be able to add more later. In the latter case, you can simply allocate as many characters are needed for the initial value, and no more.
That 0 terminator [1] is how the various library functions (strcpy(), strlen(), printf(), etc.) identify the end of a string. When you call a function like
char foo[6] = "hello";
printf( "%s\n", foo );
the array expression foo is converted to a pointer value before it's passed to the function, so all the function receives is the address of the first character; it doesn't know how long the foo array is. So it needs some way to know where the end of the string is. If foo didn't have that space for the 0 terminator, printf() would continue to print characters beyond the end of the array until it saw a 0-valued byte.
1. I prefer using the term "0 terminator" instead of "NULL terminator", just to avoid confusion with the NULL pointer, which is a different thing.
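One way to check whether a buffer actually contains a terminator, without reading past its end, is to search only within the known size. This is a sketch; is_terminated is a hypothetical helper name.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if a 0 byte occurs within the first
   size bytes of buf, i.e. buf holds a properly terminated string. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}
```

For an array declared as char str[6] = "string"; this check fails, because all six elements hold letters and nothing is left for the terminator, whereas char str[] = "string"; has size 7 and passes.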

scanf() does not read input string when first string of earlier defined array of strings is empty

I defined an array for strings. It works fine if I define it in such a way the first element is not an empty string. When its an empty string, the next scanf() for the other string stops reading the input string and program stops execution.
Now I don't understand how can defining the array of strings affect reading of input by scanf().
char *str_arr[] = {"","abc","","","b","c","","",""}; // if first element is "abc" instead of "" then works fine
int size = sizeof(str_arr)/sizeof(str_arr[0]);
int i;
printf("give string to be found %d\n",size);
char *str;
scanf("%s",str);
printf("OK\n");
Actually, you are getting it wrong, my brother. The initialization of str_arr doesn't affect the working of scanf(); it may seem that way to you, but it isn't so. As described in other answers too, this is called undefined behavior. Undefined behavior in C is itself very vaguely defined.
The C FAQ defines “undefined behavior” like this:
Anything at all can happen; the Standard imposes no requirements. The
program may fail to compile, or it may execute incorrectly (either
crashing or silently generating incorrect results), or it may
fortuitously do exactly what the programmer intended.
It basically means anything can happen. When you do it like this :
char *str;
scanf("%s",str);
It's UB. Sometimes you get results which you are not supposed to and you think it's working. That's where debuggers come in handy. Use them almost every time, especially in the beginning. Other recommendations w.r.t. your program:
Instead of scanf() use fgets() to read strings. If you want to use scanf, then use it like scanf("%ws",name); where name is a character array and w is the field width (at most the array size minus one).
Compile using the -Wall option to get all the warnings; if you had used it, you might have got a warning that you are using str uninitialized.
Go on reading THIS ARTICLE, it has sufficient information to clear your doubts.
Declaring a pointer does not allocate a buffer for it in memory and does not initialize it, so you are trying to dereference an uninitialized pointer (str) which results in an undefined behavior.
Note that scanf will cause a potential buffer overflow if not used carefully when reading strings. I recommend you read this page for some ideas on how to avoid it.
You are passing to scanf a pointer that is not initialized to anything particular, so scanf will try to write the characters provided by the user in some random memory location; whether this results in a crash or something else depends mostly by luck (and by how the compiler decides to set up the stack, that we may also see as "luck"). Technically, that's called "undefined behavior" - i.e. as far as the C standard is concerned, anything can happen.
To fix your problem, you have to pass to scanf a buffer big enough for the string you plan to receive:
char str[101];
scanf("%100s",str); /* the "100" bit tells to scanf to avoid reading more than 100 chars, which would result in a buffer overflow */
printf("OK\n");
And remember that char * in C is not the equivalent of string in other languages - char * is just a pointer to char, that knows nothing about allocation.
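The same width-limited conversion can be tried out with sscanf, which parses from a string instead of stdin and follows the same format rules. This is only a sketch; first_word and WORD_MAX are names invented for the example.

```c
#include <stdio.h>

enum { WORD_MAX = 100 };  /* must match the width written in the format below */

/* Hypothetical helper: copies the first whitespace-delimited word of line
   into out, reading at most WORD_MAX characters. out needs WORD_MAX + 1
   bytes so there is room for the terminator. Returns 1 if a word was read. */
int first_word(const char *line, char out[WORD_MAX + 1])
{
    return sscanf(line, "%100s", out) == 1;
}
```

The important part is that out is a real array supplied by the caller, not an uninitialized pointer, and that the width in the format is one less than the array size.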

C: Why string variable accepts more characters than its size?

I have the following code and output:
#include<stdio.h>
int main()
{
char pal_tmp[4];
printf("Size of String Variable %d\n",sizeof(pal_tmp));
strcpy(pal_tmp,"123456789");
printf("Printing Extended Ascii: %s\n",pal_tmp);
printf("Size of String Variable %d\n",sizeof(pal_tmp));
}
Output:
Size of String Variable 4
Printing Extended Ascii: 123456789
Size of String Variable 4
My questions is Why String variable (character array) accepts characters more than what its capacity is? Should not it just print 1234 instead of 123456789 ?
Am I doing something wrong?
Well yes. You are doing something wrong. You're putting more characters into the string than you are supposed to. According to the C specification, that is wrong and referred to as "undefined behaviour".
However, that very same C specification does not require the compiler (nor runtime) to actually flag that as an error. "Undefined behaviour" means that anything could happen, including getting an error, random data corruption or the program actually working.
In this particular case, your call to strcpy simply writes outside the reserved memory and will overwrite whatever happens to be stored after the array. There is probably nothing of importance there, which is why nothing bad seems to happen.
As an example of what would happen if you do have something relevant after the array, let's add a variable to see what happens to it:
#include <stdio.h>
#include <string.h>
int main( void )
{
char foo[4];
int bar = 0;
strcpy( foo, "a long string here" );
printf( "%d\n", bar );
return 0;
}
When run, I get the result 1701322855 on my machine (the results on yours will likely be different).
The call to strcpy clobbered the content of the bar variable, resulting in the random output that you saw.
Well yes, you are overwriting memory that doesn't belong to that buffer (pal_tmp). In some cases this might work, in others you might get a segfault and your program will crash. In the case you showed, it looks like you happened to not overwrite anything "useful". If you tried to write more, you'll be more likely to overwrite something useful and crash the program.
C arrays of char don't have a predefined size, as far as the string handling functions are concerned. The functions will happily write off the end of the array into other variables (bad), or malloc's bookkeeping data (worse), or the call stack's bookkeeping data (even worse). The C standard makes this undefined behaviour, and for good reason.
If a version of a particular function accepts a size argument to limit how much data it writes, use it. It protects you against this stuff.
C does not keep track of the size of strings (or arrays, or allocated memory, etc.), so that is your job. If you create a string, you must be careful to always make sure it never gets longer than the amount of memory you've allocated to it.
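Doing that job yourself can be as simple as comparing lengths before copying. The following is a sketch under the assumption that rejecting an over-long string is acceptable; copy_if_fits is a hypothetical name.

```c
#include <string.h>

/* Hypothetical helper: copies src into dst only if the whole string,
   including its terminator, fits in dst_size bytes.
   Returns 0 on success, -1 if src is too long (dst is left untouched). */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;
    memcpy(dst, src, len + 1);  /* + 1 copies the terminator too */
    return 0;
}
```

With char pal_tmp[4], copying "123456789" is refused, while "123" fits exactly: three characters plus the terminator.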
In the C language, strings are defined as an array of characters or a pointer to a portion of memory containing ASCII characters. A string in C is a sequence of zero or more characters followed by a null '\0' character. It is important to preserve the null terminating character, as it is how C defines and manages variable-length strings. All the C standard library string functions require this for successful operation.
For a complete reference, refer to this.
The function strcpy has no knowledge of the length of the destination character array, which is why it is considered unsafe.
You may use strncpy, where you pass the size of the buffer; if a longer argument is provided, only the memory of the buffer is written and nothing else is changed. Note that in that case strncpy does not add the terminating '\0', so you must add it yourself.

C's strtok() and read only string literals

char *strtok(char *s1, const char *s2)
repeated calls to this function break string s1 into "tokens"--that is
the string is broken into substrings,
each terminating with a '\0', where
the '\0' replaces any characters
contained in string s2. The first call
uses the string to be tokenized as s1;
subsequent calls use NULL as the first
argument. A pointer to the beginning
of the current token is returned; NULL
is returned if there are no more
tokens.
Hi,
I have been trying to use strtok just now and found out that if I pass in a char* into s1, I get a segmentation fault. If I pass in a char[], strtok works fine.
Why is this?
I googled around and the reason seems to be something about how char* is read only and char[] is writeable. A more thorough explanation would be much appreciated.
What did you initialize the char * to?
If something like
char *text = "foobar";
then you have a pointer to some read-only characters
For
char text[7] = "foobar";
then you have a seven element array of characters that you can do what you like with.
strtok writes into the string you give it - overwriting the separator character with null and keeping a pointer to the rest of the string.
Hence, if you pass it a read-only string, it will attempt to write to it, and you get a segfault.
Also, because strtok keeps a reference to the rest of the string, it's not reentrant: you can use it on only one string at a time. It's best avoided, really. Consider strsep(3) instead; see, for example, here: http://www.rt.com/man/strsep.3.html (although that still writes into the string, so it has the same read-only/segfault issue)
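A short sketch of using strtok on a writable array follows. The function name split_words is hypothetical; the point is that text must be modifiable storage, such as a char array, never a string literal.

```c
#include <string.h>

/* Hypothetical helper: splits text in place at any character from delims
   and stores pointers to the tokens in tokens[]. text is modified:
   strtok overwrites delimiters with '\0'. Returns the number of tokens
   stored, at most max_tokens. */
size_t split_words(char *text, const char *delims,
                   char *tokens[], size_t max_tokens)
{
    size_t n = 0;
    for (char *tok = strtok(text, delims);
         tok != NULL && n < max_tokens;
         tok = strtok(NULL, delims))
        tokens[n++] = tok;
    return n;
}
```

Calling it on char text[] = "a,b,,c" with "," as the delimiter yields three tokens, since strtok skips runs of delimiters; afterwards the commas that ended tokens in text have been replaced by '\0'.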
An important point that's inferred but not stated explicitly:
Based on your question, I'm guessing that you're fairly new to programming in C, so I'd like to explain a little more about your situation. Forgive me if I'm mistaken; C can be hard to learn mostly because of subtle misunderstanding in underlying mechanisms so I like to make things as plain as possible.
As you know, when you write out your C program the compiler pre-creates everything for you based on the syntax. When you declare a variable anywhere in your code, e.g.:
int x = 0;
The compiler reads this line of text and says to itself: OK, I need to replace all occurrences in the current code scope of x with a constant reference to a region of memory I've allocated to hold an integer.
When your program is run, this line leads to a new action: I need to set the region of memory that x references to int value 0.
Note the subtle difference here: the memory location that reference point x holds is constant (and cannot be changed). However, the value stored at that location can be changed. You do it in your code through assignment, e.g. x = 15;. Also note that the single line of code actually amounts to two separate commands to the compiler.
When you have a statement like:
char *name = "Tom";
The compiler's process is like this: OK, I need to replace all occurrences in the current code scope of name with a constant reference to a region of memory I've allocated to hold a char pointer value. And it does so.
But there's that second step, which amounts to this: I need to create a constant array of characters which holds the values 'T', 'o', 'm', and the null character '\0'. Then I need to replace the part of the code which says "Tom" with the memory address of that constant string.
When your program is run, the final step occurs: setting the pointer to char's value (which isn't constant) to the memory address of that automatically created string (which is constant).
So a char * is not read-only. Only a const char * is read-only. But your problem in this case isn't that char *s are read-only, it's that your pointer references a read-only region of memory.
I bring all this up because understanding this issue is the barrier between you looking at the definition of that function from the library and understanding the issue yourself versus having to ask us. And I've somewhat simplified some of the details in the hopes of making the issue more understandable.
I hope this was helpful. ;)
I blame the C standard.
char *s = "abc";
could have been defined to give the same error as
const char *cs = "abc";
char *s = cs;
on grounds that string literals are unmodifiable. But it wasn't, it was defined to compile. Go figure. [Edit: Mike B has gone figured - "const" didn't exist at all in K&R C. ISO C, plus every version of C and C++ since, has wanted to be backward-compatible. So it has to be valid.]
If it had been defined to give an error, then you couldn't have got as far as the segfault, because strtok's first parameter is char*, so the compiler would have prevented you passing in the pointer generated from the literal.
It may be of interest that there was at one time a plan in C++ for this to be deprecated (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/1996/N0896.asc). But 12 years later I can't persuade either gcc or g++ to give me any kind of warning for assigning a literal to non-const char*, so it isn't all that loudly deprecated.
[Edit: aha: -Wwrite-strings, which isn't included in -Wall or -Wextra]
In brief:
char *s = "HAPPY DAY";
printf("\n %s ", s);
s = "NEW YEAR"; /* Valid */
printf("\n %s ", s);
s[0] = 'c'; /* Invalid */
If you look at your compiler documentation, odds are there is an option you can set to make those strings writable.
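The portable way to get a modifiable string is to copy the literal into an array instead of relying on a compiler option. As a sketch, with capitalize_first being a made-up helper name:

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases the first character of s in place.
   s must point to writable storage (for example a char array);
   passing a string literal here would be undefined behavior. */
void capitalize_first(char *s)
{
    if (s[0] != '\0')
        s[0] = (char)toupper((unsigned char)s[0]);
}
```

Declaring char s[] = "new year"; creates an array initialized from the literal, so capitalize_first(s) is fine, whereas with char *s = "new year"; the same call writes into the literal itself.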
