Lack of understanding of security engineering in simple C code

I am completely new to C programming and am currently preparing for a new course of studies in IT security. In a slightly older exam I found a task that I have no idea how to approach. The task is in German; in essence, it asks you to find the critical errors in the code below.
The task does not say what the passed parameters look like.
1) I have worked out that you should not use strcpy because it has no bounds checking.
2) Also, char[10] should not be used if you want to store 10 characters (plus \0). It should be char[11].
Is it possible to read addresses or write something through the printf(argv[1]) call?
I would like to stress again that you are helping me personally here, not helping me collect bonus points at university.
#include <stdio.h>
int main(int argc, char *argv[])
{
char code[10];
if(argc != 2) return 1;
printf(argv[1]);
strcpy(code, "9999999999");
for(int i = 0; i < 10; ++i){
code[i] -= argv[1][i] % 10;
}
printf(", %s\n", code);
return 0;
}

you should not use strcpy() because it has no bounds checking
Nothing in C has bounds checking unless either
the compiler writer put it there, or
you put it there.
Few compiler writers incorporate bounds checking into their products, because it usually causes the resulting code to be bigger and slower. Some tools exist (e.g. Valgrind, Electric Fence) to provide bounds-checking-related debugging assistance, but they are not commonly incorporated into delivered software because of limitations they impose.
You absolutely should use strcpy() if
you know your source is a NUL-terminated array of characters, a.k.a. "a string", and
you know your destination is large enough to hold all of the source array including the terminating NUL
because the compiler writer is permitted to use behind-the-scenes tricks unavailable to compiler users to ensure strcpy() has the best possible performance while still providing the behaviour guaranteed by the standard.
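As a minimal sketch of "you put it there": the two preconditions above can be verified in code before strcpy() is called. The helper name copy_checked and its return convention are assumptions made for this illustration, not a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, terminator
 * included. Returns 0 on success, -1 if dst is too small (dst is then
 * left untouched). */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (strlen(src) + 1 > dst_size)
        return -1;          /* source plus '\0' would not fit */
    strcpy(dst, src);       /* safe: both preconditions were just verified */
    return 0;
}
```

With a char[10] destination, copy_checked(code, sizeof code, "9999999999") would report failure instead of writing an eleventh byte past the end of the array.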
char[10] should not be used if you want to store 10 characters (\0)
Correct.
To store 10 characters and the terminating NUL ('\0'), you must have at least 11 characters of space available.
Is it possible to read addresses or write something through the printf(argv[1]) call?
In principle: maybe.
The first argument to printf() is a format string which is interpreted by printf() to determine what further arguments have been provided. If the format string contains any format specifications (e.g. "%d" or "%n") then printf() will try to retrieve corresponding arguments.
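If they were not in fact passed to it, then it invokes Undefined Behaviour, which is Bad. In practice printf() usually reads whatever happens to sit where those arguments would have been, so "%x" or "%p" can leak stack contents and "%n" can write to memory.
An attacker could run your program with a command-line argument containing format specifiers, which would lead to exactly such UB.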
The right way to print an arbitrary string like this with printf() is printf("%s", argv[1]);
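Putting the points of this answer together, here is a hedged sketch of how the exam program's logic could be written defensively. The names decode_code, print_untrusted and CODE_DIGITS are invented for this illustration; the exam task itself does not define them, and the length check is an assumption about what the program should do with short input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define CODE_DIGITS 10

/* Hypothetical helper: fills out[] with the ten-digit code derived from arg.
 * Returns 0 on success, -1 if arg has fewer than CODE_DIGITS characters. */
int decode_code(const char *arg, char out[CODE_DIGITS + 1])
{
    assert(arg != NULL && out != NULL);
    if (strlen(arg) < CODE_DIGITS)
        return -1;                      /* never index past the end of arg */

    for (size_t i = 0; i < CODE_DIGITS; ++i) {
        /* go through unsigned char so the remainder is never negative */
        out[i] = (char)('9' - (unsigned char)arg[i] % 10);
    }
    out[CODE_DIGITS] = '\0';            /* 11th byte holds the terminator */
    return 0;
}

/* Prints untrusted text as data, never as a format string. */
int print_untrusted(const char *text)
{
    return printf("%s", text);
}
```

In main() one would call print_untrusted(argv[1]) and then decode_code(argv[1], code) with char code[CODE_DIGITS + 1], returning an error if the argument is too short.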

Related

How can I solve this problem during debugging? (unhandled exception at 0xFEFEFEFE)

It seems like there is a problem in scanf_s.
Here is my code.
#include <stdio.h>
#include "stack.h"
int main(){
int disk;
int hanoi[3][9];
char input[3] = { 0,0,0 };
int moveDisk;
for (int i = 0; i < 9; i++) {
hanoi[0][i] = i + 1;
hanoi[1][i] = 0;
hanoi[2][i] = 0;
}
printf("Insert the number of disks(1~9): ");
scanf_s("%d", &disk);
while (input[0] != 'q') {
printf("%3c %3c %3c\n", 'A', 'B', 'C');
for (int i = 0; i < disk; i++) {
printf("%3d %3d %3d\n", hanoi[0][i], hanoi[1][i], hanoi[2][i]);
}
scanf_s("%s", &input); //getting moving disk -- LOCATION OF ERROR
}
}
I have no idea how to solve this
No doubt you tried to use scanf() in the normal way and Visual Studio reported an error instructing you to use scanf_s()? It is not a direct replacement. For all %c, %s and %[ format specifiers you must provide two arguments - the target receiving the input, and the size of target (or strictly the number of elements).
In VS2019 even at /W1 warning level, it issues a clear explanation of the problem in this case:
warning C4473: 'scanf_s' : not enough arguments passed for format string
message : placeholders and their parameters expect 2 variadic arguments, but 1 were provided
message : the missing variadic argument 2 is required by format string '%s'
message : this argument is used as a buffer size
Don't ignore the warnings, and certainly don't disable them globally (/W0).
So in this case:
scanf_s("%s", input, sizeof(input) ) ;
again more strictly:
scanf_s("%s", input, sizeof(input)/sizeof(*input) ) ;
but the latter is really only necessary for wscanf_s (wide characters). In both cases you could use the _countof() macro, but it is Microsoft specific.
scanf_s("%s", input, _countof(input) ) ;
Note also the lack of an & before input. You don't need it for an argument that is already array or pointer. That is true of scanf() too.
Whilst there are arguments for using scanf_s() over scanf() (which is intrinsically more dangerous), it can just make life difficult if you are learning from standard examples or using a different toolchain. The simpler solution is just to use scanf() and disable the warning (in Visual Studio, for example, by defining _CRT_SECURE_NO_WARNINGS before including <stdio.h>), while understanding that it is unsafe.
You cited the line
scanf_s("%s", &input);
There are several things wrong with this line:
1. You are reading a string into a character array. This is an exception to the normal pattern for scanf, in that you do not need the &.
2. You are using the semistandard scanf_s, instead of the normal scanf. scanf_s is supposed to be "safer", but in order for it to provide its safety guarantees you have to call it differently than normal scanf, too. You have to tell it the size of the array you're reading the string into. Combined with #1 above, I believe a more correct call would be scanf_s("%s", input, 3);.
3. For most purposes, a string of size 3 would be far too small for reading a line of input from the user. Since in this case I guess you're only reading a "line" to give yourself an opportunity to hit RETURN before the program makes another trip through its loop, I guess it's okay.
4. As I mentioned, scanf_s is not quite Standard, so using it is a mixed bag. Pros: it's allegedly safer, and some people (including perhaps your instructor) will recommend always using it for that reason. Cons: it's not fully standard (it's an optional part of the standard), meaning that not all C compilers and libraries will support it, and its calling patterns are necessarily quite different from normal scanf; it is not a drop-in replacement, so confusion is likely. (I'm not saying "don't use scanf_s", but you should be aware of its somewhat dubious status.)
5. If you want to read a line of input from the user before continuing, and if the line might be a "q" or something else, scanf (of any variety) might not be the best choice. In particular, %s wants to read a non-whitespace string, so if you just hit the Return key, it's going to keep waiting. This might or might not be a problem for you. (Or it might not be something you need to worry about right now; you may have bigger fish to fry.)
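One portable alternative for "read a line, which may be empty or may be q" is fgets(). The sketch below is only an illustration: the helper name read_line and its return convention are assumptions, and discarding the overlong tail of a line is one possible policy among several.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf (at most size-1
 * characters). Strips the trailing newline and discards any part of the
 * line that did not fit. Returns 1 if a line was read, 0 on end of input
 * or read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 1);
    if (fgets(buf, (int)size, in) == NULL)
        return 0;

    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';                /* whole line fit: drop the newline */
    } else {
        int ch;                         /* line was longer: skip the rest */
        while ((ch = getc(in)) != '\n' && ch != EOF)
            ;
    }
    return 1;
}
```

In the Hanoi loop this could be used as read_line(stdin, input, sizeof input); pressing Return on its own then yields an empty string instead of blocking.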
How can I solve this problem during debugging?
Run your program step by step using the debugger. Then when you get the exception, you've found the line causing it.
Restart your program and run up to the line where the exception will occur. That is, stop on that line without executing it.
Then with the debugger, you can look at all variables and try to understand if their value is what you expect.
Does this answer your question?
BTW: The compiler should at least have emitted some warnings. You really should fix those warnings first. If you get no warnings, make sure you have turned on all warnings in the compiler options.

What happens when strnlen() is used with a larger maximum length than the buffer size actually is?

I've written the following code to understand better how strnlen behaves:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
char bufferOnStack[10]={'a','b','c','d','e','f','g','h','i','j'};
char *bufferOnHeap = (char *) malloc(10);
bufferOnHeap[ 0]='a';
bufferOnHeap[ 1]='b';
bufferOnHeap[ 2]='c';
bufferOnHeap[ 3]='d';
bufferOnHeap[ 4]='e';
bufferOnHeap[ 5]='f';
bufferOnHeap[ 6]='g';
bufferOnHeap[ 7]='h';
bufferOnHeap[ 8]='i';
bufferOnHeap[ 9]='j';
int lengthOnStack = strnlen(bufferOnStack,39);
int lengthOnHeap = strnlen(bufferOnHeap, 39);
printf("lengthOnStack = %d\n",lengthOnStack);
printf("lengthOnHeap = %d\n",lengthOnHeap);
return 0;
}
Note the deliberate lack of null termination in both buffers.
According to the documentation, it seems that the lengths should
both be 39:
RETURN VALUE
The strnlen() function returns strlen(s), if that is less than maxlen, or
maxlen if there is no null terminating ('\0') among the first maxlen characters
pointed to by s.
Here's my compile line:
$ gcc ./main_08.c -o main
And the output:
$ ./main
lengthOnStack = 10
lengthOnHeap = 10
What's going on here? Thanks!
First of all, strnlen() is not defined by C standard; it's a POSIX standard function.
That being said, read the documentation carefully
The strnlen() function returns the number of bytes in the string pointed to by s, excluding the terminating null byte ('\0'), but at most maxlen. In doing this, strnlen() looks only at the first maxlen bytes at s and never beyond s+maxlen.
So when calling the function you need to make sure that, for the value you provide for maxlen, array indexing is valid up to [maxlen - 1] for the supplied string, i.e. the array has at least maxlen elements (unless a terminating null byte appears earlier).
Otherwise, while accessing the string, you'll venture into memory locations that are not allocated to you (out-of-bounds array access), thereby invoking undefined behaviour.
Remember, this function calculates the length of a string held in an array, capped at an upper bound (maxlen). That implies the supplied array is at least as large as the bound, not the other way around.
[Footnote]:
By definition, a string is null-terminated.
Quoting C11, chapter §7.1.1, Definitions of terms
A string is a contiguous sequence of characters terminated by and including the first null
character. [...]
Firstly, don't cast the result of malloc.
Secondly, you are reading past the end of your arrays. The contents of memory outside your array bounds are indeterminate, so there is no guarantee that a given byte is not zero; in this instance, one is!
In general, this kind of behaviour is sloppy - see this answer for a good summary of the potential consequences
Your question is roughly equivalent to the following:
I know that a burglar alarm is supposed to prevent your house from getting robbed. This morning when I left the house, I turned off the burglar alarm. Sometime during the day when I was away, a burglar broke in and stole my stuff. How did this happen?
Or to this:
I know you can use the cruise control on your car to help you avoid getting speeding tickets. Yesterday I was driving on a road where the speed limit was 65. I set the cruise control to 95. A cop pulled me over and I got a speeding ticket. How did this happen?
Actually, those aren't quite right. Here's a more contrived analogy:
I live in a house with a 10 yard long driveway to the street. I have trained my dog to fetch my newspaper. One day I made sure there were no newspapers on the driveway. I put my dog on a 39 yard leash, and I told him to fetch the newspaper. I expected him to go to the end of the leash, 39 yards away. But instead, he only went 10 yards, then stopped. How did this happen?
And of course there are many answers. Perhaps, when your dog got to the end of your newspaper-free driveway, right away he found someone else's newspaper in the gutter. Or perhaps, when the leash failed to stop him at the end of the driveway and he continued into the street, he got run over by a car.
The point of putting your dog on a leash is to restrict him to a safe area -- in this case, your property, that you control. If you put him on such a long leash that he can go off into the street, or into the woods, you're kind of defeating the purpose of controlling him by putting him on a leash.
Similarly, the whole point of strnlen is to behave gracefully if, within the buffer you have defined, there is no null character for strnlen to find.
The problem with non-null-terminated strings is that functions like strlen (which blindly search for null terminators) sail off the end and rummage blindly around in undefined memory, desperately trying to find the terminator. For example, if you say
char non_null_terminated_string[3] = "abc";
int len = strlen(non_null_terminated_string);
the behavior is undefined, because strlen sails off the end. One way to fix this is to use strnlen:
char non_null_terminated_string[3] = "abc";
int len = strnlen(non_null_terminated_string, 3);
But if you hand a bigger number to strnlen, it defeats the whole purpose. You're back wondering what will happen when strnlen sails off the end, and there's no way to answer that.
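To make the "leash no longer than the driveway" idea concrete, here is a small sketch that ties the bound to the real buffer size. Because strnlen is POSIX rather than standard C, the sketch uses a hypothetical stand-in named bounded_len, built on the standard memchr(); the name is an assumption for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strnlen(): returns the number of chars before
 * the first '\0' in buf[0..size-1], or size if there is none.
 * size must not exceed the real size of the buffer. */
size_t bounded_len(const char *buf, size_t size)
{
    assert(buf != NULL);
    const char *nul = memchr(buf, '\0', size);
    return nul != NULL ? (size_t)(nul - buf) : size;
}
```

Called as bounded_len(bufferOnStack, sizeof bufferOnStack), the search can never leave the array, whatever the array contains. For the malloc'ed buffer the caller has to pass the allocated size (10) explicitly, because sizeof on a pointer gives the size of the pointer, not of the allocation.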
What happens when ... "Undefined behaviour (UB)"?
“When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose”
Your heading is actually not UB, since calling strnlen("hi", 5) is perfectly legal, but the specifics of your question show it is indeed UB...
strlen expects a string, i.e. a nul-terminated char sequence, and strnlen expects either a string or an array of at least maxlen chars. Providing your non-nul-terminated 10-char array together with a maxlen of 39 is UB.
What happens in your case is that the function reads the first 10 chars, finds no '\0', and since it hasn't reached maxlen yet it continues to read further, thereby invoking UB (reading memory beyond the array). It could be that your compiler took the liberty of ending your array with '\0', or it could be that the '\0' was there before... the possibilities are limited only by the compiler designers.

How to handle array out-of-bounds errors in C

Is there any way to handle index-out-of-bounds errors in C?
I just want to know; please explain it in the context of this example.
If I enter a string of more than 20 characters I get:
*** stack smashing detected ***: ./upper1 terminated
Aborted (core dumped)
main()
{
char st[20];
int i;
/* accept a string */
printf("Enter a string : ");
gets(st);
/* display it in upper case */
for ( i = 0 ; st[i] != '\0'; i++)
if ( st[i] >= 'a' && st[i] <= 'z' )
putchar( st[i] - 32);
else
putchar( st[i]);
}
I want to handle those errors, stop them, and display a custom message, as is done with Java's exception handling. Is it possible? If yes, how?
Thanks in advance
To answer the original question: there is no way to have out-of-bounds array indexes handled implicitly in C. You should add that check explicitly in your code, or you should prove (or at least be absolutely sure) that it cannot happen. Beware of buffer overflows and other undefined behavior; they can hurt a lot.
Remember that C arrays don't "know" their size at runtime. You have to know and manage that size yourself, especially when passing arrays to functions (where they decay into pointers). Read also about flexible array members in structs.
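An explicit check of this kind might look like the sketch below. The helper name get_checked and its error-code convention are assumptions for illustration; C itself provides nothing like it, which is exactly why the size has to be passed alongside the array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reads arr[index] only if index is in range.
 * Returns 0 and stores the element in *out, or returns -1 on a bad index
 * (leaving *out unchanged). */
int get_checked(const char *arr, size_t len, size_t index, char *out)
{
    assert(arr != NULL && out != NULL);
    if (index >= len)
        return -1;          /* the caller decides how to report the error */
    *out = arr[index];
    return 0;
}
```

The caller can then print its own message when -1 comes back, which is the closest everyday C equivalent of catching an exception.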
BTW, your code is in poor taste. First, char st[20]; is really too small these days: an input line can easily have a hundred characters (I often have terminal emulators wider than 80 columns). So make it e.g.
char st[128];
Then, as everyone told you, gets(3) is dangerous; it is documented as "Never use this function". Get into the habit of reading the documentation of every function that you dare to use.
I would suggest always clearing such a string buffer with
memset (st, 0, sizeof st);
You should at the very least use fgets(3), but read the documentation first. You'll need to handle the failure case.
Also, your conversion to upper case is specific to ASCII (and some other encodings). It won't work on an old EBCDIC machine, and it is unreadable. So use isalpha(3) to detect letters (in ASCII or another single-byte encoding); in UTF-8 it is more complex, since some letters, e.g. Cyrillic ones, are encoded in several bytes. My family name (СТАРЫНКЕВИЧ when spelt in Russian) contains an Ы, a single letter called yery, whose UTF-8 encoding for the capital letter is the two bytes 0xD0 0xAB. You'll need a UTF-8 library like unistring to handle these. And use toupper(3) to convert (e.g. ASCII) letters to upper case.
Notice that your main function is wrongly defined. It should return an int and preferably be declared as int main(int argc, char **argv).
Lastly, on POSIX systems, the "right" way to read a line is to use the getline(3) function. It can read a line as long as system resources permit (so it might read a line of a million characters on my machine). See this answer.
Regarding exceptions, C doesn't really have them (so most programmers get into the habit of having functions return some error code). However, for non-local jumps consider setjmp(3), to be used with great caution. (In C++, you have exceptions, and they are related to destructors.)
Don't forget to compile with all warnings and debug info (e.g. with gcc -Wall -g if using GCC). You absolutely need to learn how to use the debugger (e.g. gdb) and you also should use a memory leak detector like valgrind.
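As a sketch of the toupper(3) advice above, the upper-casing loop could be written as follows. The function name upper_in_place is an assumption for this example, and the code only handles single-byte encodings; multi-byte UTF-8 letters such as Ы pass through unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: upper-cases a string in place using toupper(),
 * which follows the current locale instead of assuming ASCII. */
void upper_in_place(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; ++i) {
        /* toupper() requires a value representable as unsigned char */
        s[i] = (char)toupper((unsigned char)s[i]);
    }
}
```

The cast to unsigned char matters: passing a negative plain char to toupper() is undefined behavior, which is easy to trigger with non-ASCII input.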
Yes, you must use fgets() instead of gets(). In fact, gets() was deprecated and then removed from the C standard in C11; it should never, ever be used, because it is impossible to use safely, as you discovered.
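Though it's not directly possible to detect that the user has written out of bounds, we can add some logic to report an error without crashing.
#include <stdio.h>
#define USERINPUT_MAX 20 /* buffer size, including the terminating null */
int main(void)
{
    char user_input[USERINPUT_MAX];
    int i = 0; /* declared outside the loop so it is still valid afterwards */
    int ch;
    /* read characters until Enter or end of input */
    while ((ch = getchar()) != '\n' && ch != EOF)
    {
        if (i == USERINPUT_MAX - 1) /* no room left for another character */
        {
            printf("you have exceeded the character range\n");
            return 1;
        }
        user_input[i++] = (char)ch; /* not Enter: store it in the array */
    }
    user_input[i] = '\0'; /* add the null at the end */
    printf("%s\n", user_input);
    return 0;
}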
I guess you get the idea of how to handle such situations from user input.

String behavior in C

I want to understand a number of things about strings in C:
I could not understand why you cannot change a string with a normal assignment (but only through the functions of string.h). For example, I can't do d="aa" (where d is a pointer to char or an array of char).
Can someone explain to me what's going on behind the scenes? The compiler lets such a thing run, and you receive a segmentation fault error.
Something else, I run a program in C that contains the following lines:
char c='a',*pc=&c;
printf("Enter a string:");
scanf("%s",pc);
printf("your first char is: %c",c);
printf("your string is: %s",pc);
1. If I enter more than 2 letters (in scanf) I get a segmentation fault error. Why is this happening?
2. If I enter two letters, the first letter is printed correctly, but the string is printed with a lot of garbage after it.
3. If I enter one letter, the letter is printed correctly, but the string is printed with a lot of garbage and something weird at the end (a square with four numbers containing zeros and ones).
Can anyone explain what is happening behind the scenes?
Please note: I am not asking how to make the program work or for suggestions for another program; I just want to understand what happens behind the scenes in these situations.
Strings almost do not exist in C (except as C string literals like "abc" in some C source file).
In fact, strings are mostly a convention: a C string is an array of char whose last element is the zero char '\0'.
So declaring
const char s[] = "abc";
is exactly the same as
const char s[] = {'a','b','c','\0'};
in particular, sizeof(s) is 4 (3+1) in both cases (and so is sizeof("abc")).
The standard C library contains a lot of functions (such as strlen(3) or strncpy(3)...) which obey and/or presuppose the convention that strings are zero-terminated arrays of char-s.
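The equivalence of the two initialisers can be checked directly. The sketch below is only an illustration: the array names and the function check_string_layout are made up for this example, and assert() is used so that a false claim would abort the program.

```c
#include <assert.h>
#include <string.h>

static const char from_literal[] = "abc";
static const char from_braces[]  = {'a', 'b', 'c', '\0'};

/* Checks the claims from the text; aborts via assert() if one is false. */
void check_string_layout(void)
{
    assert(sizeof from_literal == 4);                   /* 3 letters + '\0' */
    assert(sizeof from_braces == sizeof from_literal);  /* same array size */
    assert(memcmp(from_literal, from_braces, sizeof from_literal) == 0);
    assert(strlen(from_literal) == 3);                  /* stops at the '\0' */
}
```

Note the difference between sizeof (which counts the terminator, because it measures the array) and strlen (which does not, because it counts characters up to the terminator).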
Better code would be:
char buf[16]="a",*pc= buf;
printf("Enter a string:"); fflush(NULL);
scanf("%15s",pc);
printf("your first char is: %c",buf[0]);
printf("your string is: %s",pc);
Some comments: be afraid of buffer overflow. When reading a string, always give a bound to the read string, or else use a function like getline(3) which dynamically allocates the string in the heap. Beware of memory leaks (use a tool like valgrind ...)
When computing a string, be also aware of the maximum size. See snprintf(3) (avoid sprintf).
Often, you adopt the convention that a string is returned and dynamically allocated in the heap. You may want to use strdup(3) or asprintf(3) if your system provides it. But you should adopt the convention that the calling function (or something else, but well defined in your head) is free(3)-ing the string.
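A minimal sketch of that convention, using only standard snprintf(3) and malloc(3) rather than the non-standard asprintf(3): the function measures the result first, allocates exactly enough, and leaves the free(3) to the caller. The name make_pair and the "name=value" format are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds "name=value" in freshly allocated memory.
 * Returns NULL on failure; on success the caller must free() the result. */
char *make_pair(const char *name, int value)
{
    assert(name != NULL);
    int needed = snprintf(NULL, 0, "%s=%d", name, value);  /* measure first */
    if (needed < 0)
        return NULL;

    char *out = malloc((size_t)needed + 1);                /* + terminator */
    if (out == NULL)
        return NULL;

    snprintf(out, (size_t)needed + 1, "%s=%d", name, value);
    return out;
}
```

A caller would write char *s = make_pair("disks", 9); use s, and then call free(s) exactly once.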
Your program can be semantically wrong and, by bad luck, still sometimes happen to work. Read carefully about undefined behavior and avoid it absolutely (your points 1, 2, 3 are probably UB). Sadly, UB may sometimes appear to "work".
To explain some actual undefined behavior, you have to take into account your particular implementation: the compiler, the flags -notably optimization flags- passed to the compiler, the operating system, the kernel, the processor, the phase of the moon, etc etc... Undefined behavior is often non reproducible (e.g. because of ASLR etc...), read about heisenbugs. To explain the behavior of points 1,2,3 you need to dive into implementation details; look into the assembler code (gcc -S -fverbose-asm) produced by the compiler.
I suggest you compile your code with all warnings and debugging info (e.g. using gcc -Wall -g with GCC), improve the code until you get no warnings, and learn how to use the debugger (e.g. gdb) to run your code step by step.
If I put more than 2 letters (on scanf) I get segmentation fault error, why is this happening?
Because memory is allocated for only one byte.
char c has room for exactly one character, but scanf("%s") stores every character it reads plus a terminating '\0', so even a one-letter input writes two bytes starting at a one-byte location.
If scanf() uses this memory for writing more than one byte, then this is simply undefined behavior.
char c="a"; is a wrong declaration in C, since a character enclosed in double quotes ("") is treated as a string: "a" is stored as 'a' followed by '\0', because all strings end with a '\0' null character.
char c="a"; is wrong, whereas char c='c'; is correct.
Also note that the memory allocated for a char is only 1 byte, so it can hold only one character.

Pointer mystery/noobish issue

I am originally a Java programmer who is now struggling with C and specifically C's pointers.
The idea on my mind is to receive a string, from the user, on a command line, into a character pointer. I then want to access its individual elements. The idea is later to devise a function that will reverse the elements' order. (I want to work with anagrams in texts.)
My code is
#include <stdio.h>
char *string;
int main(void)
{
printf("Enter a string: ");
scanf("%s\n",string);
putchar(*string);
int i;
for (i=0; i<3;i++)
{
string--;
}
putchar(*string);
}
(Sorry, Code marking doesn't work).
What I am trying to do is to have a first shot at accessing individual elements. If the string is "Santillana" and the pointer is set at the very beginning (after scanf()), the content *string ought to be an S. If unbeknownst to me the pointer should happen to be set at the '\0' after scanf(), backing up a few steps (string-- repeated) ought to produce something in the way of a character with *string. Both these putchar()'s, though, produce a Segmentation fault.
I am doing something fundamentally wrong and something fundamental has escaped me. I would be eternally grateful for any advice about my shortcomings, most of all of any tips of books/resources where these particular problems are illuminated. Two thick C books and the reference manual have proved useless as far as this.
You haven't allocated space for the string. You'll need something like:
char string[1024];
You also should not be decrementing the variable string. If it is an array, you can't do that.
You could simply do:
putchar(string[i]);
Or you can use a pointer (to the proposed array):
char *str = string;
for (i = 0; i < 3; i++)
str++;
putchar(*str);
But you could shorten that loop to:
str += 3;
or simply write:
putchar(*(str+3));
Etc.
You should check that scanf() is successful. You should limit the size of the input string to avoid buffer (stack) overflows:
if (scanf("%1023s", string) != 1)
...something went wrong — probably EOF without any data...
Note that %s skips leading white space, and then reads characters up to the next white space (a simple definition of 'word'). Adding the newline to the format string makes little difference. You could consider "%1023[^\n]\n" instead; that looks for up to 1023 non-newlines followed by a newline.
You should start off avoiding global variables. Sometimes, they're necessary, but not in this example.
On a side note, using scanf(3) is bad practice. You may want to look into fgets(3) or similar functions that avoid common pitfalls that are associated with scanf(3).
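Since the stated goal is a function that reverses the order of the elements, here is one hedged sketch of how that could look once the string lives in a real array. The name reverse_in_place is an assumption for this illustration; it works on bytes, so it is only correct for single-byte encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reverses the characters of s in place. */
void reverse_in_place(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    for (size_t i = 0; i < len / 2; ++i) {
        char tmp = s[i];                /* swap the i-th char from the front */
        s[i] = s[len - 1 - i];          /* with the i-th char from the back */
        s[len - 1 - i] = tmp;
    }
}
```

With char string[1024] filled by a bounded read, string[0] is the first character ('S' for "Santillana"), and reverse_in_place(string) works by indexing inside the array rather than by moving the pointer below its start.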

Resources