I have a bit of code like this:
#define MAXSIZE 100
int main() {
char str[MAXSIZE+1];
scanf("%100s", str);
...
The problem is that I still have the "magic number" 100, even though I defined MAXSIZE.
Is there a way to properly "insert" MAXSIZE into the scanf format string? (Pure C, C99 standard.)
There is, but you'd be better off using fgets():
if (fgets(str, sizeof str, stdin) != NULL) {
// process input
}
The good thing about this (apart from an undoubted readability boost) is that fgets() takes care of the size correctly, i.e. it accounts for the terminating null character (which the scanf() field width doesn't), so you don't have to hack around with adding one to the size when declaring your buffer. On success it also always NUL-terminates the array for you. Way less error-prone.
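One thing to keep in mind is that fgets() stores the newline in the buffer when it fits. A minimal sketch of a wrapper that strips it follows; the name read_line_trimmed is made up for illustration, and the cast assumes the buffer size fits in an int.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from fp into buf (capacity size) and
   removes the trailing newline, if any. Returns 1 on success, 0 on
   end-of-file or read error. Assumes size fits in an int. */
int read_line_trimmed(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut at the first newline, if present */
    return 1;
}
```

With char str[MAXSIZE + 1], a call would look like read_line_trimmed(stdin, str, sizeof str).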
As to the original question: try the usual "stringify" trick:
#define REAL_STRINGIFY(x) #x
#define STRINGIFY(x) REAL_STRINGIFY(x)
scanf("%" STRINGIFY(MAXSIZE) "s", str);
But this is very ugly, isn't it?
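To see why two macro levels are needed, here is a small sketch. It uses sscanf() on a string so the effect is easy to check, MAXSIZE is set to 8 only to keep the example short, and first_word is a hypothetical helper. The trick assumes MAXSIZE expands to a plain decimal literal (not (100), 0x64 or 100u), because the expansion is pasted into the format string as-is.

```c
#include <stdio.h>
#include <string.h>

#define MAXSIZE 8
#define REAL_STRINGIFY(x) #x               /* turns its argument into a string as written */
#define STRINGIFY(x) REAL_STRINGIFY(x)     /* expands x first, then stringifies it */

/* Hypothetical helper: copies the first whitespace-delimited word of input
   into out, reading at most MAXSIZE characters. Returns sscanf's result. */
int first_word(const char *input, char out[MAXSIZE + 1])
{
    /* Adjacent string literals are concatenated, giving "%8s". */
    return sscanf(input, "%" STRINGIFY(MAXSIZE) "s", out);
}
```

STRINGIFY(MAXSIZE) yields "8", while REAL_STRINGIFY(MAXSIZE) would yield "MAXSIZE", which is why the extra level of indirection is there.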
I am using write to create a csv file with the following type of values on every line
int16_t, int16_t, int16_t, int64_t, uint64_t
First a buffer is filled using sprintf and then it is passed to write. However, there is only one line with all the values in the written file. No new line.
static char line[34];
sprintf(line, "%d,%d,%d,%ld,%lu\n", ...);
write(fd_csv_data, line, sizeof(line));
%d,%d,%d,%ld,%lu makes 32 bytes in total, adding \n and \0 results in 34. What am I doing wrong ?
Two problems:
You write the full buffer, even the parts that are after the null-terminator. This part could be uninitialized and have indeterminate values.
Even if you fill the buffer completely, you write the null-terminator, which shouldn't be written to a text file.
To solve both these issues, use strlen instead to get the actual length of the string:
write(fd_csv_data, line, strlen(line));
On another couple of notes:
Use snprintf instead of sprintf, to avoid possible buffer overruns
The size-prefix l might be wrong for 64-bit types, use the standard format macro constants, like PRId64 for int64_t.
Your buffer could overflow, so you'll have to calculate the maximum size of the generated string or just use a buffer big enough.
To write to the file, you can use the return value of sprintf():
static char line[256];
int n = sprintf(line, "%d,%d,%d,%ld,%lu\n", ...);
write(fd_csv_data, line, n);
As an alternative the safer snprintf() could be used.
With some extra checks:
#define LINESIZE 256
static char line[LINESIZE];
int n = snprintf(line, sizeof line, "%d,%d,%d,%ld,%lu\n", ...);
if (n > 0 && n < LINESIZE) {
write(fd_csv_data, line, n);
}
// else..
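Putting the notes from both answers together, a sketch could look like the following. The function name format_row and the ROW_MAX constant are made up for illustration; the format macros come from <inttypes.h>. Note that the worst case is longer than the 34 bytes in the question: three fields of "-32768", one of "-9223372036854775808", one of "18446744073709551615", four commas, the newline and the terminating null add up to 64 bytes.

```c
#include <inttypes.h>
#include <stdio.h>

/* Worst-case row length including '\n' and the terminating '\0'. */
#define ROW_MAX 64

/* Hypothetical helper: formats one CSV row into line. Returns the number of
   characters to write (excluding the '\0'), or -1 on error or truncation. */
int format_row(char *line, size_t size,
               int16_t a, int16_t b, int16_t c, int64_t d, uint64_t e)
{
    int n = snprintf(line, size,
                     "%" PRId16 ",%" PRId16 ",%" PRId16 ",%" PRId64 ",%" PRIu64 "\n",
                     a, b, c, d, e);
    if (n < 0 || (size_t)n >= size)
        return -1;                    /* encoding error or output truncated */
    return n;
}
```

When the result n is not negative, the row can be written with write(fd_csv_data, line, (size_t)n), which sends neither the null terminator nor any stale bytes after it.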
GNU manual
This quote is from the GNU manual
Warning: If the input data has a null character, you can’t tell. So
don’t use fgets unless you know the data cannot contain a null. Don’t
use it to read files edited by the user because, if the user inserts a
null character, you should either handle it properly or print a clear
error message. We recommend using getline instead of fgets.
As I usually do, I spent time searching before asking a question, and I did find a similar question on Stack Overflow from five years ago:
Why is the fgets function deprecated?
Although GNU recommends getline over fgets, I noticed that getline in stdio.h takes any size line. It calls realloc as needed. If I try to set the size to 10 char:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char *buffer;
size_t bufsize = 10;
size_t characters;
buffer = (char *)malloc(bufsize * sizeof(char));
if( buffer == NULL)
{
perror("Unable to allocate buffer");
exit(1);
}
printf("Type something: ");
characters = getline(&buffer,&bufsize,stdin);
printf("%zu characters were read.\n",characters);
printf("You typed: '%s'\n",buffer);
return(0);
}
In the code above, type any size string, over 10 char, and getline will read it and give you the right output.
There is no need to even malloc, as I did in the code above — getline does it for you. I'm setting the buffer to size 0, and getline will malloc and realloc for me as needed.
#include <stdio.h>
#include <stdlib.h>
int main()
{
char *buffer = NULL; /* getline needs NULL (or a malloc'd pointer) here */
size_t bufsize = 0;
size_t characters;
printf("Type something: ");
characters = getline(&buffer,&bufsize,stdin);
printf("%zu characters were read.\n",characters);
printf("You typed: '%s'\n",buffer);
return(0);
}
If you run this code, again you can enter any size string, and it works. Even though I set the buffer size to 0.
I've been looking at safe coding practices from CERT guidelines www.securecoding.cert.org
I was thinking of switching from fgets to getline, but the issue I am having is that I cannot figure out how to limit the input in getline. I think a malicious attacker could use a loop to send an unlimited amount of data and use up all the RAM available in the heap?
Is there a way of limiting the input size that getline uses or does getline have some limit within the function?
Using fgets is not necessarily problematic. All the GNU manual tells you is that if there's a '\0' byte in the file, there will be one in your buffer too. You won't be able to tell whether a null in your buffer is the terminator written by fgets or a null that came from the file. This means you can read a 100-char line into a 200-char buffer and end up with what looks like a 50-char C string.
getline from stdio.h in fact doesn't appear to have any sane length limitation, so fread might be a viable alternative.
Unlike C getline and C++ std::getline(), C++ std::istream::getline() is limited by its count argument.
The GNU manual is just bad. Limiting the input length is usually the right thing to do, especially if input is untrusted, and fgets does this correctly. getline cannot be used safely in such a context.
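As a rough sketch of what limiting the input with fgets can look like, the helper below reads into a fixed buffer and reports lines that do not fit instead of growing without bound. The name read_bounded_line and its return convention are assumptions made for this example, and it does not try to handle embedded null bytes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. Reads one line into buf (capacity size, at least 2,
   assumed to fit in an int). Returns 1 if the whole line fit (newline
   stripped), 0 on end-of-file or read error, and -1 if the line was too long;
   in that case the rest of the line is discarded so the next call starts on
   a fresh line. */
int read_bounded_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    if (len + 1 < size)
        return 1;                     /* short last line without a newline */

    /* Buffer is full: the line fits only if it ends right here. */
    int ch = getc(fp);
    if (ch == EOF || ch == '\n')
        return 1;
    while ((ch = getc(fp)) != EOF && ch != '\n')
        ;                             /* skip the remainder of the long line */
    return -1;
}
```

Memory use stays fixed at the size of buf no matter how much data the other side sends.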
I can specify the maximum amount of characters for scanf to read to a buffer using this technique:
char buffer[64];
/* Read one line of text to buffer. */
scanf("%63[^\n]", buffer);
But what if we do not know the buffer length when we write the code? What if it is the parameter of a function?
void function(FILE *file, size_t n, char buffer[n])
{
/* ... */
fscanf(file, "%[^\n]", buffer); /* WHAT NOW? */
}
This code is vulnerable to buffer overflows as fscanf does not know how big the buffer is.
I remember seeing this before and started to think that it was the solution to the problem:
fscanf(file, "%*[^\n]", n, buffer);
My first thought was that the * in "%*[^\n]" meant that the maximum string size is passed as an argument (in this case n). This is the meaning of the * in printf.
When I checked the documentation for scanf I found out that it means that scanf should discard the result of [^\n].
This left me somewhat disappointed as I think that it would be a very useful feature to be able to pass the buffer size dynamically for scanf.
Is there any way I can pass the buffer size to scanf dynamically?
Basic answer
There isn't an analog to the printf() format specifier * in scanf().
In The Practice of Programming, Kernighan and Pike recommend using snprintf() to create the format string:
size_t sz = 64;
char format[32];
snprintf(format, sizeof(format), "%%%zus", sz);
if (scanf(format, buffer) != 1) { …oops… }
Extra information
Upgrading the example to a complete function:
int read_name(FILE *fp, char *buffer, size_t bufsiz)
{
char format[16];
snprintf(format, sizeof(format), "%%%zus", bufsiz - 1);
return fscanf(fp, format, buffer);
}
This emphasizes that the size in the format specification is one less than the size of the buffer (it is the number of non-null characters that can be stored without counting the terminating null). Note that this is in contrast to fgets() where the size (an int, incidentally; not a size_t) is the size of the buffer, not one less. There are multiple ways of improving the function, but it shows the point. (You can replace the s in the format with [^\n] if that's what you want.)
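One possible improvement, sketched under a few assumptions: the variant below uses [^\n] instead of s, refuses buffers that cannot hold even one character, checks that the generated format was not truncated, and uses a larger format array so that a 20-digit size_t width still fits. The name read_line_field is hypothetical.

```c
#include <stdio.h>

/* Hypothetical variant of read_name(): reads up to bufsiz - 1 characters of
   the current line. Returns fscanf's result, or EOF if the arguments or the
   generated format are unusable. */
int read_line_field(FILE *fp, char *buffer, size_t bufsiz)
{
    char format[32];
    if (bufsiz < 2)
        return EOF;                   /* no room for even one character */
    int n = snprintf(format, sizeof format, "%%%zu[^\n]", bufsiz - 1);
    if (n < 0 || (size_t)n >= sizeof format)
        return EOF;                   /* format string did not fit */
    return fscanf(fp, format, buffer);
}
```

As with %63[^\n], the newline itself is left in the stream for the next read.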
Also, as Tim Čas noted in the comments, if you want (the rest of) a line of input, you're usually better off using fgets() to read the line, but remember that it includes the newline in its output (whereas %63[^\n] leaves the newline to be read by the next I/O operation). For more general scanning (for example, 2 or 3 strings), this technique may be better — especially if used with fgets() or getline() and then sscanf() to parse the input.
Also, the TR 24731-1 'safe' functions, implemented by Microsoft (more or less) and standardized in Annex K of ISO/IEC 9899:2011 (the C11 standard), require a length explicitly:
if (scanf_s("%[^\n]", buffer, sizeof(buffer)) != 1)
...oops...
This avoids buffer overflows, but probably generates an error if the input is too long. The size could/should be specified in the format string as before:
if (scanf_s("%63[^\n]", buffer, sizeof(buffer)) != 1)
...oops...
if (scanf_s(format, buffer, sizeof(buffer)) != 1)
...oops...
Note that the warning (from some compilers under some sets of flags) about 'non-constant format string' has to be ignored or suppressed for code using the generated format string.
There is indeed no variable width specifier in the scanf family of functions. Alternatives include creating the format string dynamically (though this seems a bit silly if the width is a compile-time constant) or simply accepting the magic number. One possibility is to use preprocessor macros for specifying both the buffer and format string width:
#define STR_VALUE(x) STR(x)
#define STR(x) #x
#define MAX_LEN 63
char buffer[MAX_LEN + 1];
fscanf(file, "%" STR_VALUE(MAX_LEN) "[^\n]", buffer);
Another option is to #define the whole conversion specification, width included:
#define STRING_MAX_LENGTH "%10s"
or
#define DOUBLE_LENGTH "%5lf"
I have this snippet of the code:
char* receiveInput(){
char *s;
scanf("%s",s);
return s;
}
int main()
{
char *str = receiveInput();
int length = strlen(str);
printf("Your string is %s, length is %d\n", str, length);
return 0;
}
I receive this output:
Your string is hellàÿ", length is 11
my input was:
helloworld!
can somebody explain why, and why this style of the coding is bad, thanks in advance
Several answers have addressed what you've done wrong and how to fix it, but you also said (emphasis mine):
can somebody explain why, and why this style of the coding is bad
I think scanf is a terrible way to read input. It's inconsistent with printf, makes it easy to forget to check for errors, makes it hard to recover from errors, and is incompatible with ordinary (and easier to do correctly) read operations (like fgets and company).
First, note that the "%s" format will read only until it sees whitespace. Why whitespace? Why does "%s" print out an entire string, but reads in strings in such a limited capacity?
If you'd like to read in an entire line, as you may often be wont to do, scanf provides... with "%[^\n]". What? What is that? When did this become Perl?
But the real problem is that neither of those is safe. They both freely overflow with no bounds checking. Want bounds checking? Okay, you got it: "%10s" (and "%10[^\n]" is starting to look even worse). That will read at most 10 characters and then add a terminating null character automatically, so the array must hold at least 11 bytes. So that's good... for when our array size never needs to change.
What if we want to pass the size of our array as an argument to scanf? printf can do this:
char string[] = "Hello, world!";
printf("%.*s\n", (int)sizeof string, string); // prints whole message; the precision must be an int
printf("%.*s\n", 6, string); // prints just "Hello,"
Want to do the same thing with scanf? Here's how:
static char tmp[/*bit twiddling to get the log10 of SIZE_MAX plus a few*/];
// if we did the math right we shouldn't need to use snprintf
snprintf(tmp, sizeof tmp, "%%%zus", bufsize - 1); // bufsize is a size_t; the width excludes the terminating null
scanf(tmp, buffer);
That's right - scanf doesn't support the "%.*s" variable precision printf does, so to do dynamic bounds checking with scanf we have to construct our own format string in a temporary buffer. This is all kinds of bad, and even though it's actually safe here it will look like a really bad idea to anyone just dropping in.
Meanwhile, let's look at another world. Let's look at the world of fgets. Here's how we read in a line of data with fgets:
fgets(buffer, bufsize, stdin);
Infinitely less headache, no wasted processor time converting an integer precision into a string that will only be reparsed by the library back into an integer, and all the relevant elements are sitting there on one line for us to see how they work together.
Granted, this may not read an entire line. It will only read an entire line if the line is shorter than bufsize - 1 characters. Here's how we can read an entire line:
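#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *readline(FILE *file)
{
size_t size = 80; // start off small
size_t curr = 0; // offset of the first byte not yet checked for '\n'
char *buffer = malloc(size);
if(buffer == NULL) return NULL; // out of memory
while(fgets(buffer + curr, (int)(size - curr), file))
{
if(strchr(buffer + curr, '\n')) return buffer; // success
curr += strlen(buffer + curr); // continue where this read stopped
size *= 2;
char *tmp = realloc(buffer, size);
if(tmp == NULL) { free(buffer); return NULL; } // out of memory
buffer = tmp;
}
if(curr > 0 && !ferror(file)) return buffer; // last line had no '\n'
free(buffer);
return NULL; // end of file with no data, or read error
}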
The curr variable is an optimization to prevent us from rechecking data we've already read, and is unnecessary (although useful as we read more data). We could even use the return value of strchr to strip off the ending "\n" character if you preferred.
Notice also that size_t size = 80; as a starting place is completely arbitrary. We could use 81, or 79, or 100, or add it as a user-supplied argument to the function. We could even add an int (*inc)(int) argument, and change size *= 2; to size = inc(size);, allowing the user to control how fast the array grows. These can be useful for efficiency, when reallocations get costly and boatloads of lines of data need to be read and processed.
We could write the same with scanf, but think of how many times we'd have to rewrite the format string. We could limit it to a constant increment, instead of the doubling (easily) implemented above, and never have to adjust the format string; we could give in and just store the number, do the math with as above, and use snprintf to convert it to a format string every time we reallocate so that scanf can convert it back to the same number; we could limit our growth and starting position in such a way that we can manually adjust the format string (say, just increment the digits), but this could get hairy after a while and may require recursion (!) to work cleanly.
Furthermore, it's hard to mix reading with scanf with reading with other functions. Why? Say you want to read an integer from a line, then read a string from the next line. You try this:
int i;
char buf[BUFSIZE];
scanf("%i", &i);
fgets(buf, BUFSIZE, stdin);
Suppose the input is "2" on one line, followed by some text on the next. That will read the "2", but then fgets will read an empty line because scanf didn't read the newline! Okay, take two:
...
scanf("%i\n", &i);
...
You think this eats up the newline, and it does - but it also eats up leading whitespace on the next line, because scanf can't tell the difference between newlines and other forms of whitespace. (Also, turns out you're writing a Python parser, and leading whitespace in lines is important.) To make this work, you have to call getchar or something to read in the newline and throw it away:
...
scanf("%i", &i);
getchar();
...
Isn't that silly? What happens if you use scanf in a function, but don't call getchar because you don't know whether the next read is going to be scanf or something saner (or whether or not the next character is even going to be a newline)? Suddenly the best way to handle the situation seems to be to pick one or the other: do we use scanf exclusively and never have access to fgets-style full-control input, or do we use fgets exclusively and make it harder to perform complex parsing?
Actually, the answer is we don't. We use fgets (or non-scanf functions) exclusively, and when we need scanf-like functionality, we just call sscanf on the strings! We don't need to have scanf mucking up our filestreams unnecessarily! We can have all the precise control over our input we want and still get all the functionality of scanf formatting. And even if we couldn't, many scanf format options have near-direct corresponding functions in the standard library, like the infinitely more flexible strtol and strtod functions (and friends). Plus, i = strtoumax(str, NULL, 10) for C99 sized integer types is a lot cleaner looking than scanf("%" SCNuMAX, &i);, and a lot safer (we can use that strtoumax line unchanged for smaller types and let the implicit conversion handle the extra bits, but with scanf we have to make a temporary uintmax_t to read into).
The moral of this story: avoid scanf. If you need the formatting it provides, and don't want to (or can't) do it (more efficiently) yourself, use fgets / sscanf.
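For the "integer on one line, string on the next" case from above, a minimal sketch of the fgets-based approach might look like this. The helper name read_int_line is made up, lines are assumed to be shorter than 64 bytes, and anything other than a newline after the number is rejected.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one whole line from fp and parses it as a
   decimal int. Returns 1 on success, 0 on end-of-file, -1 if the line is not
   a valid int. Assumes lines are shorter than 64 bytes. */
int read_int_line(FILE *fp, int *out)
{
    char line[64];
    if (fgets(line, sizeof line, fp) == NULL)
        return 0;

    char *end;
    errno = 0;
    long value = strtol(line, &end, 10);
    if (end == line || errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -1;                    /* no digits, or out of range for int */
    if (*end != '\n' && *end != '\0')
        return -1;                    /* trailing junk after the number */
    *out = (int)value;
    return 1;
}
```

Because the whole line, newline included, is consumed by fgets, a following fgets call sees the next line untouched, leading whitespace and all.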
scanf doesn't allocate memory for you.
You need to allocate memory for the variable passed to scanf.
You could do like this:
char* receiveInput(){
char *s = (char*) malloc( 100 );
scanf("%s",s);
return s;
}
But warning:
the function that calls receiveInput takes ownership of the returned memory: you'll have to free(str) after you print it in main. (Giving ownership away like this is usually not considered good practice.)
An easy fix is getting the allocated memory as a parameter.
if the input string is longer than 99 characters (in my case), your program will suffer from a buffer overflow (which is what is already happening).
An easy fix is to pass to scanf the length of your buffer:
scanf("%99s",s);
A fixed code could be like this:
// s must be of at least 100 chars!!!
char* receiveInput( char *s ){
scanf("%99s",s);
return s;
}
int main()
{
char str[100];
receiveInput( str );
int length = strlen(str);
printf("Your string is %s, length is %d\n", str, length);
return 0;
}
You first have to allocate memory for s in your receiveInput() function, such as:
s = (char *)calloc(50, sizeof(char));
Is it possible to use a toString operator? If not, how do I convert numbers to char arrays?
int myNumber = 27; /* or what have you */
char myBuffer[100];
snprintf(myBuffer, 100, "%d", myNumber);
There are several considerations for you to think about here. Who provides the memory to hold the string? How long do you need it for, etc? (Above it's a stack buffer of 100 bytes, which is way bigger than necessary for any integer value being printed.)
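If the caller should own the memory and the buffer should be exactly as large as needed, one option is to ask snprintf() for the length first; in C99 a call with a null buffer and size 0 returns the number of characters that would be written. The function name int_to_string below is made up for this sketch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd decimal string for value, or NULL
   on failure. The caller owns the result and must free() it. */
char *int_to_string(int value)
{
    int len = snprintf(NULL, 0, "%d", value);   /* length without the '\0' */
    if (len < 0)
        return NULL;
    char *s = malloc((size_t)len + 1);
    if (s != NULL)
        snprintf(s, (size_t)len + 1, "%d", value);
    return s;
}
```

This trades the fixed stack buffer for a heap allocation, so it only pays off when the string has to outlive the calling function.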
Best answer: start using Java. Or Javascript, or C#, or for the love of God almost anything but C. Only tigers lie this way.
Use the sprintf() function.
sprintf() is considered unsafe because it can lead to a buffer overflow. If it's available (and on many platforms it is), you should use snprintf() instead.
Consider the following code:
#include <stdio.h>
int main()
{
int i = 12345;
char buf[4];
sprintf(buf, "%d", i);
}
This leads to a buffer overflow. So, you have to over-allocate a buffer to the maximum size (as a string) of an int, even if you require fewer characters, since you have the possibility of an overflow. Instead, if you used snprintf(), you could specify the number of characters to write, and any more than that number would simply be truncated.
#include <stdio.h>
int main()
{
int i = 12345;
char buf[4];
snprintf(buf, 4, "%d", i);
//truncates the string to 123
}
Note that in either case, you should take care to allocate enough buffer space for any valid output. It's just that snprintf() provides you with a safety net in case you haven't considered that one edge case where your buffer would otherwise overflow.
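The return value of snprintf() tells you whether that safety net was used: it is the length the full output would have had, so a value greater than or equal to the buffer size means truncation. A small sketch, with format_int as a made-up helper name:

```c
#include <stdio.h>

/* Hypothetical helper: formats value into buf. Returns 1 if the whole
   number fit, 0 if it was truncated or an error occurred. */
int format_int(char *buf, size_t size, int value)
{
    int n = snprintf(buf, size, "%d", value);
    return n >= 0 && (size_t)n < size;
}
```

With char buf[4], formatting 12345 reports truncation and leaves "123" in the buffer, while formatting 123 fits.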