There are many non safe functions in C like gets,printf and others.
When these vulnerabilities are well known, why haven't the functions been modified to make them safe?
In most cases, you can't change a function's signature (i.e. the number and type of the arguments) without breaking existing code.
Vulnerable functions tend to get fixed by introducing alternatives that have additional arguments (such as buffer sizes). These alternative functions are designed to be non-exploitable when used correctly.
Compare, for example, sprintf() and snprintf().
Related
Is it allowed to use varargs in an autosar C code? If not, why?
I'm not familiar with autosar. I found this document for c++14, which says:
Rule A8-4-1 (required, implementation, automated)
Functions shall not be defined using the ellipsis notation.
The reasoning is, that the ellipsis notation bypasses the type check. It is recommended to use variadic templates, function overloading or function call chain.
I haven't found any rule regarding varargs for autosar c. Is there any rule against varargs in c code? Is there any reason to avoid it? Is there any way to avoid it (I need to implement a logging function with string formatting)?
It is in Misra as well. Misra C is to C as AutoSAR C++ is to C++. It improves code quality, safety, and security. Lots of stdlib things in C is unsafe. But especially strings are a lot harder without things like variable arguments.
What I do (also for logging) is to create multiple functions that is appropriate to logging in specific situations. Some thing like log(text, int, int) and log(text, binary data block, size) etc. as required. Inside these functions there is calls to single variable argument function (usually snprintf) that prints everything to the log. You are not fully compliant but you are close and the use of variable arguments is contained to a specific area of code. If you need to be fully compliant the code is decoupled and easier to change.
Can a function tell what's calling it, through the use of memory addresses maybe? For example, function foo(); gets data on whether it is being called in main(); rather than some other function?
If so, is it possible to change the content of foo(); based on what is calling it?
Example:
int foo()
{
if (being called from main())
printf("Hello\n");
if (being called from some other function)
printf("Goodbye\n");
}
This question might be kind of out there, but is there some sort of C trickery that can make this possible?
For highly optimized C it doesn't really make sense. The harder the compiler tries to optimize the less the final executable resembles the source code (especially for link-time code generation where the old "separate compilation units" problem no longer prevents lots of optimizations). At least in theory (but often in practice for some compilers) functions that existed in the source code may not exist in the final executable (e.g. may have been inlined into their caller); functions that didn't exist in the source code may be generated (e.g. compiler detects common sequences in many functions and "out-lines" them into a new function to avoid code duplication); and functions may be replaced by data (e.g. an "int abcd(uint8_t a, uint8_t b)" replaced by a abcd_table[a][b] lookup table).
For strict C (no extensions or hacks), no. It simply can't support anything like this because it can't expect that (for any compiler including future compilers that don't exist yet) the final output/executable resembles the source code.
An implementation defined extension, or even just a hack involving inline assembly, may be "technically possible" (especially if the compiler doesn't optimize the code well). The most likely approach would be to (ab)use debugging information to determine the caller from "what the function should return to when it returns".
A better way for a compiler to support a hypothetical extension like this may be for the compiler to use some of the optimizations I mentioned - specifically, split the original foo() into 2 separate versions where one version is only ever called from main() and the other version is used for other callers. This has the bonus of letting the compiler optimize out the branches too - it could become like int foo_when_called_from_main() { printf("Hello\n"); }, which could be inlined directly into the caller, so that neither version of foo exists in the final executable. Of course if foo() had other code that's used by all callers then that common code could be lifted out into a new function rather than duplicating it (e.g. so it might become like int foo_when_called_from_main() { printf("Hello\n"); foo_common_code(); }).
There probably isn't any hypothetical compiler that works like that, but there's no real reason you can't do these same optimizations yourself (and have it work on all compilers).
Note: Yes, this was just a crafty way of suggesting that you can/should refactor the code so that it doesn't need to know which function is calling it.
Knowing who called a specific function is essentially what a stack trace is visualizing. There are no general standard way of extracting that though. In theory one could write code that targeted each system type the software would run on, and implement a stack trace function for each of them. In that case you could examine the stack and see what is before the current function.
But with all that said and done, the question you should probably ask is why? Writing a function that functions in a specific way when called from a specific function is not well isolated logic. Instead you could consider passing in a parameter to the function that caused the change in logic. That would also make the result more testable and reliable.
How to actually extract a stack trace has already received many answers here: How can one grab a stack trace in C?
I think if loop in C cannot have a condition as you have mentioned.
If you want to check whether this function is called from main(), you have to do the printf statement in the main() and also at the other function.
I don't really know what you are trying to achieve but according to what I understood, what you can do is each function will pass an additional argument that would uniquely identify that function in form of a character array, integer or enumeration.
for example:
enum function{main, add, sub, div, mul};
and call functions like:
add(3,5,main);//adds 3 and 5. called from main
changes to the code would be typical like if you are adding more functions. but it's an easier way to do it.
No. The C language does not support obtaining the name or other information of who called a function.
As all other answers show, this can only be obtained using external tools, for example that use stack traces and compiler/linker emitted symbol tables.
When I have looked into the source of glibc, I sometimes stumbles over functions that are wrappers that does nothing and only works as an alias. For example:
int
rand (void)
{
return (int) __random ();
}
What is the reason for things like this? Why not just take the body of __random() and put it in rand()?
This is a very case specific question as there are a variety of reasons for such a behavior. One answer cannot cover all the reasons for all the cases.
For example, some compilers contain a variety of system specific "builtin" implementations, so the source / header files simply tell the compiler to place their implementation in there.
Another reason would be to type cast from a more general function to a standard conforming type.
Some functions contain repeated functionality (think printf vs. fprintf(stdin,...), and using wrappers is a simple way to keep the code more DRY.
Specifically, __random returns a long int and needs to be converted to int (which may or may not be the same, depending on your system).
In addition, __random reuses functionality in __random_r, but adds a lock to make the functionality thread safe.
Reusing the same functionality with minor variations (a global thread-safe state) keeps the code more DRY.
As the other C library function, like strcpy, strcat, there is a version which limits the size of string (strncpy, etc.), I am wondering why there is no such variant for strchr?
It does exist -- it is called memchr:
http://en.wikipedia.org/wiki/C_string_handling
In C, the term "string" usually means "null terminated array of characters", and the str* functions operate on those kinds of strings. The n in the functions you mention is mostly for the sake of controlling the output.
If you want to operate on an arbitary byte sequence without any implied termination semantics, use the mem* family of functions; in your case memchr should serve your needs.
strnchr and also strnlen are defined in some linux environments, for example https://manpages.debian.org/experimental/linux-manual-4.11/strnchr.9.en.html. It is really necessary. A program may crash on end of memory area if strlen or strcmp do not found any \0-termination. Unfortunately such things often are not standardized or too late and too sophisticated standardized. strnlen_s is existing in C11, but strnchr_s is not available.
You may found some more information about such problems in my internet page: https://www.vishia.org/emcdocu/html/portability_emC.html. I have defined some C-functions strnchr_emC... etc. which delivers the required functionality. To achieve compatibility you can define
#define strnchr strnchr_emC
In a common header but platform-specific. Refer the further content on https://www.vishia.org/emc/. You find the sources in https://github.com/JzHartmut
I'm writing some C code and use the Windows API. I was wondering if it was in any way good practice to cast the types that are obviously the same, but have a different name? For example, when passing a TCHAR * to strcmp(), which expects a const char *. Should I do, assuming I want to write strict and in every way correct C, strcmp((const char *)my_tchar_string, "foo")?
Don't. But also don't use strcmp() but rather _tcscmp() (or even the safe alternatives).
_tcs* denotes a whole set of C runtime (string) functions that will behave correctly depending on how TCHAR gets translated by the preprocessor.
Concerning safe alternatives, look up functions with a trailing _s and otherwise named as the classic string functions from the C runtime. There is another set of functions that returns HRESULT, but it is not as compatible with the C runtime.
No, casting that away is not safe because TCHAR is not always equal to char. Instead of casting, you should pick a function that works with a TCHAR. See http://msdn.microsoft.com/en-us/library/e0z9k731(v=vs.71).aspx
Casting is generally a bad idea. Casting when you don't need to is terrible practice.
Think what happens if you change the type of the variable you are casting? Suppose that at some future date you change my_tchar_string to be wchar_t* rather than char*. Your code will still compile but will behave incorrectly.
One of your primary goals when writing C code is to minimise the number of casts in your code.
My advice would be to just avoid TCHAR (and associated functions) completely. Their real intent was to allow a single code base to compile natively for either 16-bit or 32-bit versions of Windows -- but the 16-bit versions of Windows are long gone, and with them the real reason to write code like this.
If you want/need to support wide characters, do it. If you're fine with only narrow/multibyte characters, do that. At least IME, trying to sit on the fence and do some of both generally means you end up not doing either one well. It also means roughly doubling the amount of testing necessary without even coming close to doubling the functionality you provide to the user.