How to avoid buffer overflow in c - c

I tried to setup this code to avoid buffer overflow and I'm not sure why it doesn't work. I'm fairly new to this and help would be appreciated.
I've tried using assert to make sure it ends but i want the assert to succeed
void authenticate (void)
{
char buffer1[8];
int i;
for (i = 0; i < 16; i++)
{
assert (i < sizeof(buffer1));
buffer1[i] = ‘x’;
}
}
expect assert to pass but it fails. Want to fix it without completely rewriting the loop. Thanks!

There seems to be some misunderstanding here on exactly how assert functions. The assert macro performs a runtime check of the given condition. If that condition is false it causes the program to abort.
In this case, the value of i ranges from 0 to 15 inside of the loop. On the iterations where the value of i is less that 8 the assert passes. But once i becomes 8 the assert fails causing the program to abort. The failed assert will not cause the program to for example skip the next loop iteration.
The proper way to handle this is to limit the loop counter to not go out of bounds:
for (i=0; i<sizeof(buf); i++)
The C language by itself doesn't perform bounds checking like some other languages. That's part of what makes it fast. That also means that the language trusts the developer to not do things like read / write out of bounds of an array. Breaking that trust results in undefined behavior. So it's up to you to make sure that doesn't happen.
There are also tools such an valgrind which will help identify mismanagement of memory.

Assert fails as expected. Change counter limit to 8 to pass.
for (i = 0; i < 8; i++)
But perhaps you really want
buf[7]=0;
for (i = 0; i < 8; i++)

Related

Understanding the reason behind a output

I am a beginner is Computer Science and I recently started learning the language C.
I was studying the for loop and in the book it was written that even if we replace the initialization;testing;incrementation statement of a for loop by any valid statement the compiler will not show any syntax error.
So now I run the following program:
#include<stdio.h>
int main()
{
int i;
int j;
for(i<4;j=5;j=0)
printf("%d",i);
return 0;
}
I have got the following output.
OUTPUT:
1616161616161616161616161616161616161616.........indefinitely
I understood why this is an indefinite loop but i am unable to understand why my PC is printing this specific output? Is there any way to understand in these above kind of programs what the system will provide us as output?
This is a case of undefined behaviour.
You have declared i so it has a memory address but haven't set the value so its value is just whatever was already in that memory address (in this case 16)
I'd guess if you ran the code multiple times (maybe with restarts between) the outputs would change.
Some more information on uninitialized variables and undefined behaviour: https://www.learncpp.com/cpp-tutorial/uninitialized-variables-and-undefined-behavior/
Looks like i was never initialized, and happens to contain 16, which the for loop is printing continuously. Note that printf does not add a new line automatically. Probably want to read the manual on how to use for.
To better understand what is going on, you can add additional debugging output to see what other variables are set to (don't forget the \n newline!) but really problem is the for loop doesn't seem right. Unless there's something going on with j that you aren't showing us, it really doesn't belong there at all.
You are defining the variable i but never initializing it, instead, you are defining the value for j but never using it.
On a for loop, first you initialize the control variable, if you haven't already done it. Then you specify the condition you use to know if you should run another iteration or not. And last but not least, change the value of the control variable so you won't have infinite loops.
An example:
#include <stdio.h>
int main()
{
// 1. we declare i as an integer starting from 1
// 2. we will keep iterating as long as i is smaller than 10
// 3. at the end of each iteration, we increment it's value by 1
for (int i = 1; i < 10; i++) {
printf("%d\n", i);
}
return 0;
}

Why does not my program go into infinite loop when array out of bounds occur in C

int main(){
int i;
int arr[4];
for(int i=0; i<=4; i++){
arr[i] = 0;
}
return 0;
}
I watched a video on youtube of CS107(lecture 13) in which this example is shown and told why the above program will lead to infinite loop by showing memory diagrams. arr[4] goes out of bounds and should lead to an address where i is stored and changing the value of i back to 0 hence leading to infinite loop. But when I tried to run this on my mac using gcc compiler, for loop executed (checked by inserting printf) 5 times. i.e for the value of i = 0,1,2,3,4.
When there is undefined behavior you can't expect a single or particular behavior. Anything could happen.
What said in that video is just one of many possibilities of undefined behavior and what you are getting is another possibility. One should not rely upon any particular behavior.
first: it is undefined behaviour, anything can happen.
said this, you should try different optimization levels. one optimization can lead to loop-reduction:
for (i = 0; i <= 2; i++) arr[i] = 0;
can be reduced, because i and arr[i] are invariant in the scope of the loop, to
arr[0] = 0;
arr[1] = 0;
arr[2] = 0;
that would lead to your tested result.
Another thing to consider is the architecture which arranges the stack and places the items onto the stack (you cannot rely on the ordering of the items)

auto parallelization unsafe calls

I have ran into a problem when I was playing around with the auto parallelization function of the Oracle Solaris Compiler. Let's say I have the following code:
int var = -1;
int i;
for (i = 0; i < 3; i++){
bool flag = false;
// do operations to set the flag
if (flag == true)
var = i;
}
// do other operations with var
when I run this code, the compiler complains that it cannot be parallelized because of unsafe dependences.
Does anyone know what might be wrong here? Is there any way to avoid this but maintain the original functionality of the code?
Any help would be appreciated, thank you!
What the compiler sees is a bunch of loop iterations, all of which can potentially assign to V. If writing to V happens to be atomic, then this will get some random value of i, and one might argue that is OK. Under the assumption that writing to a variable is not atomic, most compilers see this as a data race ... (then what exactly ends up in V?). Thus the complaint.

C string length - is this valid code? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
Is this type of expression in C valid (on all compilers)?
If it is, is this good C?
char cloneString (char clone[], char string[], int Length)
{
if(!Length)
Length=64
;
int i = 0
;
while((clone[i++] = string[i]) != '\0', --Length)
;
clone[i] = '\0';
printf("cloneString = %s\n", clone);
};
Would this be better, worse, indifferent?
char *cloneString (char clone[], char string[], int Length)
{
if(!Length)
Length=STRING_LENGTH
;
char *r = clone
;
while
( //(clone[i++] = string[i]) != '\0'
*clone++ = *string++
, --Length
);
*clone = '\0';
return clone = r
;
printf("cloneString = %s\n", clone);
};
Stackoverflow wants me to add more text to this question!
Okay! I'm concerned about
a.) expressions such as c==(a=b)
b.) performance between indexing vs pointer
Any comments?
Thanks so much.
Yes, it's syntactically valid on all compilers (though semantically valid on none), and no, it isn't considered good C. Most developers will agree that the comma operator is a bad thing, and most developers will generally agree that a single line of code should do only one specific thing. The while loop does a whole four and has undefined behavior:
it increments i;
it assigns string[i] to clone[i++]; (undefined behavior: you should use i only once in a statement that increments/decrements it)
it checks that string[i] isn't 0 (but discards the result of the comparison);
it decrements Length, and terminates the loop if Length == 0 after being decremented.
Not to mention that assuming that Length is 64 if it wasn't provided is a terrible idea and leaves plenty of room for more undefined behavior that can easily be exploited to crash or hack the program.
I see that you wrote it yourself and that you're concerned about performance, and this is apparently the reason you're sticking everything together. Don't. Code made short by squeezing statements together isn't faster than code longer because the statements haven't been squeezed together. It still does the same number of things. In your case, you're introducing bugs by squeezing things together.
The code has Undefined Behavior:
The expression
(clone[i++] = string[i])
both modifies and accesses the object i from two different subexpressions in an unsequenced way, which is not allowed. A compiler might use the old value of i in string[i], or might use the new value of i, or might do something entirely different and unexpected.
Simple answer no.
Why return char and the function has no return statement?
Why 64?
I assume that the two arrays are of length Length - Add documentation to say this.
Why the ; on a new line and not after the statement?
...
Ok so I decided to evolve my comments into an actual answer. Although this doesn’t address the specific piece of code in your question, it answers the underlying issue and I think you will find it illuminating as you can use this — let’s call it guide — on your general programming.
What I advocate, especially if you are just learning programming is to focus on readability instead of small gimmicks that you think or was told that improve speed / performance.
Let’s take a simple example. The idiomatic way to iterate through a vector in C (not in C++) is using indexing:
int i;
for (i = 0; i < size; ++i) {
v[i] = /* code */;
}
I was told when I started programming that v[i] is actually computed as *(v + i) so in generated assembler this is broken down (please note that this discussion is simplified):
multiply i with sizeof(int)
add that result to the address of v
access the element at this computed address
So basically you have 3 operations.
Let’s compare this with accessing via pointers:
int *p;
for (p = v; p != v + size; ++p) {
*p = /*..*/;
}
This has the advantage that *p actually expands to just one instruction:
access the element at the address p.
2 extra instructions don’t seam much but if your program spends most of it’s time in this loop (either extremely large size or multiple calls to (the functions containing this) loop) you realise that the second version makes your program almost 3 times faster. That is a lot. So if you are like me when I started, you will choose the second variant. Don’t!
So the first version has readability (you explicitly describe that you access the i-th element of vector v), the second one uses a gimmick in detriment of readability (you say that you access a memory location). Now this might not be the best example for unreadable code, but the principle is valid.
So why do I tell you to use the first version: until you have a firm grasp on concepts like cache, branching, induction variables (and a lot more) and how they apply in real world compilers and programs performance, you should stay clear of these gimmicks and rely on the compiler to do the optimizations. They are very smart and will generate the same code for both variants (with optimization enabled of course). So the second variant actually differs just by readability and is identical performance-wise with the first.
Another example:
const char * str = "Some string"
int i;
// variant 1:
for (i = 0; i < strlen(str); ++i) {
// code
}
// variant 2:
int l = strlen(str);
for (i = 0; i < l; ++i) {
// code
}
The natural way would be to write the first variant. You might think that the second improves performance because you call the function strlen on each iteration of the loop. And you know that getting the length of a string means iterating through all the string until you reach the end. So basically a call to strlen means adding an inner loop. Ouch that has to slow the program down. Not necessarily: the compiler can optimize the call out because it always produces the same result. Actually you can do harm as you introduce a new variable which will have to be assigned a different register from a very limited registry pool (a little extreme example, but nevertheless a point is to be made here).
Don’t spend your energy on things like this until much later.
Let me show you something else that will illustrate further more that any assumptions that you make about performance will be most likely be false and misleading (I am not trying to tell you that you are a bad programmer — far from it — just that as you learn, you should invest your energy in something else than performance):
Let’s multiply two matrices:
for (k = 0; k < n; ++k) {
for (i = 0; i < n; ++i) {
for (j = 0; j < n; ++j) {
r[i][j] += a[i][k] * b[k][j];
}
}
}
versus
for (k = 0; k < n; ++k) {
for (j = 0; j < n; ++j) {
for (i = 0; i < n; ++i) {
r[i][j] += a[i][k] * b[k][j];
}
}
}
The only difference between the two is the order the operations get executed. They are the exact same operations (number, kind and operands), just in a different order. The result is equivalent (addition is commutative) so on paper they should take the EXACT amount of time to execute. In practice, even with optimizations enable (some very smart compilers can however reorder the loops) the second example can be up to 2-3 times slower than the first. And even the first variant is still a long long way from being optimal (in regards to speed).
So basic point: worry about UB as the other answers show you, don’t worry about performance at this stage.
The second block of code is better.
The line
printf("cloneString = %s\n", clone);
there will never get executed since there a return statement before that.
To make your code a bit more readable, change
while
(
*clone++ = *string++
, --Length
);
to
while ( Length > 0 )
{
*clone++ = *string++;
--Length;
}
This is probably a better approach to your problem:
#include <stdio.h>
#include <string.h>
void cloneString(char *clone, char *string)
{
for (int i = 0; i != strlen(string); i++)
clone[i] = string[i];
printf("Clone string: %s\n", clone);
}
That been said, there's already a standard function to to that:
strncpy(const char *dest, const char *source, int n)
dest is the destination string, and source is the string that must be copied. This function will copy a maximum of n characters.
So, your code will be:
#include <stdio.h>
#include <string.h>
void cloneString(char *clone, char *string)
{
strncpy(clone, string, strlen(string));
printf("Clone string: %s\n", clone);
}

C for loop with integers

I am following the book Let us C and the following code has been shown as being perfectly correct:
for ( i < 4 ; j = 5 ; j = 0 )
printf ( "%d", i ) ;
But in the Turbo C it gives 3 warnings:
Code has no effect. Possibly incorrect assignment. 'j' is assigned a
value that is never used.
If the book is making the point that this code is allowed by the C standard, then it is correct. This code does not violate any rule of the C standard, provided that i and j have previously been declared correctly (and printf too, by including #include <stdio.h>).
However, nobody would actually write code like this, because it is not useful. That is why the compiler is issuing a warning, because the code is technically allowed but is probably not what a programmer would intend.
If the book is claiming that this code is useful in some way, then it is probably a typographical error. It is certainly wrong. If the book has more than a few errors like this, you should discard it.
I don't know what your book want to teach you with this example, but AFAIK a for loop should always be in the form
for ( init; check; next ) {
/* do something */
}
where init initialize what you're going to use, check check if it should stop or continue and next perform some kind of action. It is the same as
init;
while ( check ) {
/* do something */
next;
}
Therefore you are getting the warning because:
Code has no effect is referred to i < 4. As you can see in the while form, this comparison isn't used in any way, therefore it has no effect.
Possibly incorrect assignment. is refereed to j = 5 cause you're making a check of an assignment witch will always evaluate to the value assigned (in this case 5)
'j' is assigned a value that is never used as it says, 'j' is never used, as you print the 'i' in this example.
Probably what the book wants to do is for ( i = 5; i < 5; i++ ).
And probably what you need to do is using a better book.
It is valid C code but it's pretty much meaningless. This will not initialize the loop properly and trigger an infinite loop. Loops look something like
for (i = 0; i < 10; i++)
The first statement is the initializer, the second stipulates the end case, and the last is the increment. I would get rid of that book
Check this out.
int i=0;
for(i=0;i<5;i++)
{
printf("%d",i);
}
This is a correct but infinite loop,
the correct way to instantiate a for loop is
int i ;
for(i = 0; i< [variable or number];i++){
printf("%d",i);
}
the code you wrote is meaningless and you can't do anything with that code, actually it print the value of i infinite time because it never change.
The only thing we know about i is less then 4. Probably the output is always the same number.

Resources