So I have this code:
#include <stdio.h>
int main() {
char B,y[2];
scanf("%c",&B);
scanf("%s",y);
printf("%c\n",B);
}
When I enter in a character for B like S, then a character for y like a, it works fine.
It prints out
a
S
However, when I enter 2 characters for y like ab, it prints the two characters but doesn't print out S.
It prints:
ab
Am I doing something wrong?
First of all, a char array defined as y[2] can hold only one character when used as a string, because the other element is needed for the terminating null. In other words, the longest string it can hold has length 1.
That said, you should change
scanf("%s",y);
to
scanf("%1s",y);
to limit the input length. Otherwise, you'll experience buffer overflow which invokes undefined behavior.
To elaborate on adding that literal 1 in the format string, that 1 denotes the maximum field width.
Quoting C11, chapter §7.21.6.2, fscanf(), (emphasis mine)
An input item is read from the stream, unless the specification includes an n specifier. An
input item is defined as the longest sequence of input characters which does not exceed
any specified field width and which is, or is a prefix of, a matching input sequence. [....]
I want to input an integer and a character with the scanf function, but it doesn't work as I expect.
The code is as follows.
#include <stdio.h>
int main()
{
int a;
char c;
scanf("%d",&a);
scanf("%2c",&c);
printf("%d%c",a,c);
return 0;
}
I tried to input 12a (there is a space after a) from the terminal, but the output is not "12a" but "32a".
I also stepped through the code above and found that after the first scanf the value of a is 12, but after the second scanf the value of a has become 32.
I want to figure out why the second scanf changes the value of a, which is not even passed to it.
The problem is that the compiler happens to have placed variable a just after variable c in memory. In the second scanf() you ask to read two characters into a variable that has room for only one. This is a buffer overflow: you have overwritten memory past the variable c (and a happens to be there). The space has been written into a, which is why you get 32 as output (a now holds the value of an ASCII space, which is 32).
What has happened is known as Undefined Behaviour, and it is common when you make this kind of mistake. You can solve this by defining an array of at least two char elements to receive the two characters, and then use something like:
#include <stdio.h>
int main()
{
int a;
char c[2];
scanf("%d", &a);
scanf("%2c", c); /* now c is a char array so don't use & */
printf("%d%.2s", a, c); /* use %.2s format instead */
return 0;
}
Note:
The %.2s format specifier is needed because c is an array of two chars that has been filled completely, leaving no room for a \0 string terminator. Printing it with a plain %s would cause undefined behaviour; the precision ensures that output stops at the second character (or earlier, if a \0 is found in the first or second array position).
Quoting C11, chapter 7.21.6.2, The fscanf function (emphasis mine)
c
[...]If an l length modifier is present, the input shall be a sequence of multibyte characters that begins in the initial shift state. Each multibyte character in the sequence is converted to a wide character as if by a call to the mbrtowc function, with the conversion state described by an mbstate_t object initialized to zero before the first multibyte character is converted. The corresponding argument shall be a pointer to the initial element of an array of wchar_t large enough to accept the resulting sequence of wide characters. [...]
and you're supplying a char *. The supplied argument does not match the expected type of argument, so this is undefined behavior.
Therefore the outcome cannot be justified.
To hold an input like "a ", you'll need a (long enough) char array, a char variable is not sufficient.
I have the following C code:
#include <stdio.h>
#include <strings.h>
int main(void){
char * str = "\012\0345";
char testArr[8] = {'\0','1','2','\0','3','4','5','\0'};
printf("%s\n",str);
printf("**%s**",testArr);
return 0;
}
I'm having trouble understanding the results and I have googled but am unsure that I understand why a null character at the start of a string and why one in the middle would cause only the string "5" to display. Also, when I assign each string character to array testArr and then attempt to display that array of characters the result is different despite the string and the array having the same characters. So, I'm struck by the confounding results, especially their disparity. With the string str, does the code display "5" because the null characters overwrite what is in memory?
Also, with the array I created using the same characters, nothing displays of the data contained in array testArr. Is it that once the first null is encountered for some reason everything else is ignored? If so, why doesn't the same behavior occur with string str which contains the same characters?
An octal escape sequence is \ followed by one to three octal digits, per C 2018 6.4.4.4 1. Per 6.4.4.4 7: “Each octal or hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence.” So, when the compiler sees "\012\0345", it interprets it as the sequence \012 (which is ten), the sequence \034 (which is twenty-eight), and the character 5.
To represent the string you intended, you could use "\00012\000345". Since an octal escape sequence stops at three digits, this is interpreted as the sequence \000, the characters 1 and 2, the sequence \000, and the characters 3, 4, and 5. (A null terminating character will also be appended automatically.)
When you printed "\012\0345", the characters with codes ten and twenty-eight were printed but had no visible effect. (Your C implementation likely uses ASCII, in which case they are control characters. \012 is new-line, so it should have caused a line advance, but you probably did not notice that. \034 is a file-separator control character, which likely has no effect when printed to a regular terminal display.)
When you printed testArr, the null character in the first position ended the string.
#include <stdio.h>
int main(){
printf("Enter 10 numbers: ");
int a[10], i = 0;
for (i = 0; i < 10; i++)
{
scanf("%d", &a[i]);
}
}
When I enter a value for each array element, why does pressing the space bar store a value in the array?
For example, when I type 1space2space3space, each value is stored in its own element (a[0], a[1], a[2]).
Why is this happening?
From the C Standard (7.21.6.2 The fscanf function)
12 The conversion specifiers and their meanings are:
d Matches an optionally signed decimal integer, whose format is the same as
expected for the subject sequence of the strtol function with the value 10
for the base argument. The corresponding argument shall be a pointer to
signed integer.
And (7.22.1.4 The strtol, strtoll, strtoul, and strtoull functions)
...First, they decompose the input string into three parts: an initial, possibly empty, sequence of white-space characters (as
specified by the isspace function), a subject sequence resembling an
integer represented in some radix determined by the value of base, and
a final string of one or more unrecognized characters, including the
terminating null character of the input string. Then, they attempt to
convert the subject sequence to an integer, and return the result.
For such an input
1space2space3space
the first subject sequence is 1, the second subject sequence (after skipping white-space characters) is 2, and the third subject sequence is 3. They are used to store integers correspondingly in a[0], a[1], and a[2] because each subject sequence represents a valid integer.
Take into account that in general implementations use the so-called line buffering for text streams.
From the C Standard (7.21.3 Files)
... When a stream is line buffered, characters are intended to be
transmitted to or from the host environment as a block when a
new-line character is encountered.
As you are using scanf() with %d, each call skips any leading white-space characters (spaces, tabs, new-lines) before it reads an integer, so all of them act as separators between the numbers.
So, I'm rewriting the tar extract command, and I stumbled upon a weird problem:
In short, I allocate a HEADER struct that contains multiple char arrays, let's say:
struct HEADER {
char foo[42];
char bar[12];
};
When I fprintf foo, I get a 3 character-long string, which is OK since the fourth character is a '\0'. But when I print bar, I have 25 characters that are printed.
How can I print only the 12 characters of bar?
EDIT The fact that the array isn't null terminated is 'normal' and cannot be changed, otherwise I wouldn't have so much trouble with it. What I want to do is parse the x first characters of my array, something like
char res[13];
magicScanf(res, 12, bar);
res[12] = '\0';
EDIT It turns out the string WAS null-terminated already. I thought it wasn't, since that was the most logical explanation for my bug. As that is a different question, I'll accept an answer that matches the problem as described. If someone has an idea as to why sprintf could have printed 25 characters INCLUDING 2 \0, I would be glad to hear it.
You can print strings without NUL terminators by including a precision:
printf ("%.25s", s);
or, if your precision is unknown at compilation time:
printf ("%.*s", length, s);
The problem is that the size of an array is lost when it is passed to a function. Thus, the fprintf function does not know the size of the array and can only stop at a \0.
No, unless you have supplied the precision, fprintf() has no magical way to know the size of the array supplied as argument to %s, it still relies on the terminating null.
Quoting C11, chapter §7.21.6.1, (emphasis mine)
s
If no l length modifier is present, the argument shall be a pointer to the initial
element of an array of character type. Characters from the array are
written up to (but not including) the terminating null character. If the
precision is specified, no more than that many bytes are written. If the
precision is not specified or is greater than the size of the array, the array shall
contain a null character.
So, in case your array is not null terminated, you must use a precision to avoid out-of-bounds access.
void printbar(struct HEADER *h) {
printf("%.12s", h->bar);
}
You can use it like this
struct HEADER data[100];
/* ... */
printbar(data + 42); /* print data[42].bar */
Note that if one of the 12 bytes of bar has a value of zero, not all of them get printed.
You might be better off printing them one by one
void printbar(struct HEADER *h) {
printf("%02x", (unsigned char)h->bar[0]); /* cast so a negative char is not sign-extended */
for (int i = 1; i < 12; i++) printf(" %02x", (unsigned char)h->bar[i]);
}
How to use a scanf width specifier of 0?
1) unrestricted width (as seen with Cygwin gcc version 4.5.3)
2) UB
3) something else?
My application (not shown) dynamically forms the width specifier as part of a larger format string for scanf(). Rarely it would create a "%0s" in the middle of the format string. In this context, the destination string for that %0s has just 1 byte of room for scanf() to store a \0 which with behavior #1 above causes problems.
Note: The following test cases use constant formats.
#include <string.h>
#include <stdio.h>
void scanf_test(const char *Src, const char *Format) {
char Dest[10];
int NumFields;
memset(Dest, '\0', sizeof(Dest)-1);
NumFields = sscanf(Src, Format, Dest);
printf("scanf:%d Src:'%s' Format:'%s' Dest:'%s'\n", NumFields, Src, Format, Dest);
}
int main(int argc, char *argv[]) {
scanf_test("1234" , "%s");
scanf_test("1234" , "%2s");
scanf_test("1234" , "%1s");
scanf_test("1234" , "%0s");
return 0;
}
Output:
scanf:1 Src:'1234' Format:'%s' Dest:'1234'
scanf:1 Src:'1234' Format:'%2s' Dest:'12'
scanf:1 Src:'1234' Format:'%1s' Dest:'1'
scanf:1 Src:'1234' Format:'%0s' Dest:'1234'
My question is about the last line. It seems that a 0 width results in no width limitation rather than a width of 0. If this is correct behavior or UB, I'll have to approach the zero-width situation another way. Or are there other scanf() formats to consider?
The maximum field width specifier must be non-zero. C99, 7.19.6.2:
The format shall be a multibyte character sequence, beginning and ending in its initial
shift state. The format is composed of zero or more directives: one or more white-space
characters, an ordinary multibyte character (neither % nor a white-space character), or a
conversion specification. Each conversion specification is introduced by the character %.
After the %, the following appear in sequence:
— An optional assignment-suppressing character *.
— An optional nonzero decimal integer that specifies the maximum field width (in
characters).
— An optional length modifier that specifies the size of the receiving object.
— A conversion specifier character that specifies the type of conversion to be applied.
So, if you use 0, the behavior is undefined.
This came from 7.21.6.2 of n1570.pdf (C11 standard draft):
After the %, the following appear in sequence:
— An optional assignment-suppressing character *.
— An optional decimal integer greater than zero that specifies the
maximum field width (in characters).
...
It's undefined behaviour, because the C standard states that your maximum field width must be greater than zero.
An input item is defined as the longest sequence of input characters
which does not exceed any specified field width and ...
What is it you wish to achieve by reading a field of width 0 and assigning it as a string (empty string) into Dest? Which actual problem are you trying to solve? It seems clearer to just assign directly, e.g. *Dest = '\0';.