ANSI C strncpy messing up screen output and other variables' values - c

Using ANSI C, the screen output gets messed up after the strncpy call. Also, if I try to print any int variable, the values are incorrect. However, if I move the print line before strncpy, everything is fine.
Does anybody know why?
#define TICKET_NAME_LEN 40
struct stock_data
{
char ticket_name[TICKET_NAME_LEN+1];
};
struct stock_data user_input;
char tname[TICKET_NAME_LEN+1] = "testing it";
strncpy(user_input.ticket_name, tname, TICKET_NAME_LEN);

The symptoms you are describing are the classic ones for a copy that is out of control. However, the real source of your problem is almost certainly not in the code you show.
The only possible issue with the code you show is that strncpy() does not guarantee that the output (target) string is null terminated. This won't hurt with the code shown (it doesn't do anything untoward), but other code that expects the string to be null terminated that blithely copies it without ensuring that there's space may go trampling other memory because the string is not null terminated.
If the input (source) string is at least as long as the space specified (in this case TICKET_NAME_LEN or more bytes long), then strncpy() writes no null terminator, so user_input.ticket_name will not be null terminated except by accident. If it is shorter, then user_input.ticket_name will be null padded to the length TICKET_NAME_LEN bytes.
If this is the problem, a very simple fix is to add the line:
user_input.ticket_name[TICKET_NAME_LEN] = '\0';
after (or even before, but it is more conventional to do it after) the strncpy().
However, to run into this problem, you'd have to be trying to copy a name of 40 or more characters into the ticket name member of the structure.
It is much more likely that something else is the cause of your trouble.
ISO/IEC 9899:2011 §7.24.2.4 The strncpy function
¶2 The strncpy function copies not more than n characters (characters that follow a null
character are not copied) from the array pointed to by s2 to the array pointed to by
s1. [308] If copying takes place between objects that overlap, the behavior is undefined.
¶3 If the array pointed to by s2 is a string that is shorter than n characters, null characters
are appended to the copy in the array pointed to by s1, until n characters in all have been
written.
[308] Thus, if there is no null character in the first n characters of the array pointed to by s2, the result will not be null-terminated.

Related

Why printf() function print other characters? [duplicate]

I understand that strings in C are just character arrays. So I tried the following code, but it gives strange results, such as garbage output or program crashes:
#include <stdio.h>
int main (void)
{
char str [5] = "hello";
puts(str);
}
Why doesn't this work?
It compiles cleanly with gcc -std=c17 -pedantic-errors -Wall -Wextra.
Note: This post is meant to be used as a canonical FAQ for problems stemming from a failure to allocate room for a NUL terminator when declaring a string.
A C string is a character array that ends with a null terminator.
All characters have a symbol table value. The null terminator is the symbol value 0 (zero). It is used to mark the end of a string. This is necessary since the size of the string isn't stored anywhere.
Therefore, every time you allocate room for a string, you must include sufficient space for the null terminator character. Your example does not do this; it only allocates room for the 5 characters of "hello". Correct code should be:
char str[6] = "hello";
Or equivalently, you can write self-documenting code for 5 characters plus 1 null terminator:
char str[5+1] = "hello";
But you can also use this and let the compiler do the counting and pick the size:
char str[] = "hello"; // Will allocate 6 bytes automatically
When allocating memory for a string dynamically in run-time, you also need to allocate room for the null terminator:
char input[n] = ... ;
...
char* str = malloc(strlen(input) + 1);
If you don't append a null terminator at the end of a string, then library functions expecting a string won't work properly and you will get "undefined behavior" bugs such as garbage output or program crashes.
The most common way to write a null terminator character in C is by using a so-called "octal escape sequence", looking like this: '\0'. This is 100% equivalent to writing 0, but the \ serves as self-documenting code to state that the zero is explicitly meant to be a null terminator. Code such as if(str[i] == '\0') will check if the specific character is the null terminator.
Please note that the term null terminator has nothing to do with null pointers or the NULL macro! This can be confusing - very similar names but very different meanings. This is why the null terminator is sometimes referred to as NUL with one L, not to be confused with NULL or null pointers. See answers to this SO question for further details.
The "hello" in your code is called a string literal. This is to be regarded as a read-only string. The "" syntax means that the compiler will append a null terminator in the end of the string literal automatically. So if you print out sizeof("hello") you will get 6, not 5, because you get the size of the array including a null terminator.
It compiles cleanly with gcc
Indeed, not even a warning. This is because of a subtle detail/flaw in the C language that allows character arrays to be initialized with a string literal that contains exactly as many characters as there is room in the array and then silently discard the null terminator (C17 6.7.9/15). The language is purposely behaving like this for historical reasons, see Inconsistent gcc diagnostic for string initialization for details. Also note that C++ is different here and does not allow this trick/flaw to be used.
From the C Standard (7.1.1 Definitions of terms)
1 A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
In this declaration
char str [5] = "hello";
the string literal "hello" has the internal representation like
{ 'h', 'e', 'l', 'l', 'o', '\0' }
so it has 6 characters including the terminating zero. Its elements are used to initialize the character array str, which reserves space for only 5 characters.
The C Standard (unlike the C++ Standard) allows such an initialization of a character array, in which the terminating zero of the string literal is not used as an initializer.
However, as a result, the character array str does not contain a string.
If you want the array to contain a string, you could write
char str [6] = "hello";
or just
char str [] = "hello";
In the last case, the size of the character array is determined from the number of initializers supplied by the string literal, which is 6.
Can all strings be considered an array of characters (Yes), can all character arrays be considered strings (No).
Why Not? and Why does it matter?
In addition to the other answers explaining that the length of a string is not stored anywhere as part of the string and the references to the standard where a string is defined, the flip-side is "How do the C library functions handle strings?"
While a character array can hold the same characters, it is simply an array of characters unless the last character is followed by the nul-terminating character. That nul-terminating character is what allows the array of characters to be considered (handled as) a string.
All functions in C that expect a string as an argument expect the sequence of characters to be nul-terminated. Why?
It has to do with the way all string functions work. Since the length isn't included as part of an array, string functions scan forward in the array until the nul-character (e.g. '\0' -- equivalent to decimal 0) is found. See ASCII Table and Description. Regardless of whether you are using strcpy, strchr, strcspn, etc., all string functions rely on the nul-terminating character being present to define where the end of that string is.
A comparison of two similar functions from string.h will emphasize the importance of the nul-terminating character. Take for example:
char *strcpy(char *dest, const char *src);
The strcpy function simply copies bytes from src to dest until the nul-terminating character is found telling strcpy where to stop copying characters. Now take the similar function memcpy:
void *memcpy(void *dest, const void *src, size_t n);
The function performs a similar operation, but does not consider or require the src parameter to be a string. Since memcpy cannot simply scan forward in src copying bytes to dest until a nul-terminating character is reached, it requires an explicit number of bytes to copy as a third parameter. This third parameter provides memcpy with the same size information strcpy is able to derive simply by scanning forward until a nul-terminating character is found.
This also emphasizes what goes wrong in strcpy (or any function expecting a string) if you fail to provide the function with a nul-terminated string: it has no idea where to stop and will happily race off across the rest of your memory segment, invoking Undefined Behavior, until a nul-character just happens to be found somewhere in memory or a Segmentation Fault occurs.
That is why functions expecting a nul-terminated string must be passed a nul-terminated string and why it matters.
Intuitively...
Think of an array as a variable (holds things) and a string as a value (can be placed in a variable).
They are certainly not the same thing. In your case the variable is too small to hold the string, so the string gets cut off. ("quoted strings" in C have an implicit null character at the end.)
However it's possible to store a string in an array that is much larger than the string.
Note that the usual assignment and comparison operators (= == < etc.) don't work as you might expect. But the strxyz family of functions comes pretty close, once you know what you're doing. See the C FAQ on strings and arrays.

A little query, String in C

Recently I was programming in Code Blocks and wrote a little hobby program in C.
char littleString[1];
fflush( stdin );
scanf( "%s", littleString );
printf( "\n%s", littleString);
If I created a string of one character, why does CodeBlocks allow me to save 13 characters?
C has no bounds checking; writing out of bounds of arrays or dynamically allocated memory is not checked by the compiler. Instead, it leads to undefined behavior.
To prevent buffer overflow with scanf you can tell it to only read a specific number of characters, and nothing more. So to tell it to read only one character you use the format "%1s".
As a small side-note: Remember that strings in C have an extra character in them, the terminator (character '\0'). So if you have a string that should contain one character, the size actually needs to be two characters.
littleString is not a string. It is a char array of length one. For a char array to hold a string, it must be null terminated with '\0'. You are writing past the memory you have allotted for littleString, which is undefined behavior. scanf just reads user input from the console and stores it in the variable specified, in this case littleString. If you would like to limit the length of the input stored in the variable, I would suggest using scanf_s. Please note that scanf_s is not part of the C99 standard.
Many functions in C are implemented without any checks for correctness of use. In other words, it is the caller's responsibility to make sure the arguments fulfill the rules set by the function.
Example: For strcpy the Linux man page says
The strcpy() function copies the string pointed to by src,
including the terminating null byte ('\0'), to the buffer
pointed to by dest. The strings may not overlap, and the
destination string dest must be large enough to receive the copy.
If you as a caller break that contract by passing a too small buffer, you'll have undefined behavior and anything can happen.
The program may crash or even do exactly what you expected in 99 out of 100 times and do something strange in 1 out of 100 times.

strncpy introduces funny character

When I run some code on my machine then it behaves as I expect it to.
When I run it on a colleague's machine, it misbehaves. This is what happens.
I have a string with a value of:
croc_data_0001.idx
when I do a strncpy on the string providing 18 as the length my copied string has a value of:
croc_data_0001.idx♂
If I do the following
myCopiedString[18]='\0';
puts (myCopiedString);
Then the value of the copied string is:
croc_data_0001.idx
What could be causing this problem and why does it get resolved by setting the last char to \0?
According to http://www.cplusplus.com/reference/clibrary/cstring/strncpy/
char * strncpy ( char * destination, const char * source, size_t num );
Copy characters from string
Copies the first num characters of source to destination. If the end
of the source C string (which is signaled by a null-character) is
found before num characters have been copied, destination is padded
with zeros until a total of num characters have been written to it.
No null-character is implicitly appended to the end of destination, so destination will only be null-terminated if the length
of the C string in source is less than num.
Thus, you need to manually terminate your destination with '\0'.
strncpy does not want the size of the string to be copied, but the size of the target buffer.
In your case, the size you pass is 1 too small, which prevents strncpy from zero-terminating the string. So everything after the string (from position 18 on) that is non-zero will be treated as belonging to the string.
Normally, functions taking a buffer size are called with exactly that, i.e.
char dest[50];
strncpy(dest, "croc_data_0001.idx", sizeof dest);
With this and an additional
dest[sizeof dest - 1] = '\0';
the string will always be 0-terminated.
I think the C standard describes this function in a clearer manner than the links others have posted.
ISO 9899:2011
7.24.2.4 The strncpy function
char *strncpy (char * restrict s1,
const char * restrict s2,
size_t n);
The strncpy function copies not more than n characters (characters that follow a null
character are not copied) from the array pointed to by s2 to the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.
If the array pointed to by s2 is a string that is shorter than n characters, null characters
are appended to the copy in the array pointed to by s1, until n characters in all have been
written.
How much space has been allotted to the myCopiedString variable? If it is more than the length of the source string, then make sure you use bzero to clear out the destination variable.
strncpy does not always add a \0. See http://www.cplusplus.com/reference/clibrary/cstring/strncpy/
So either clear out your destination buffer beforehand, or always add the \0 yourself, or use strcpy.
If the question is: "why does uninitialised memory on my machine have different content than on another machine", well, one can only guess.
