printing strings produces garbage even when (I think) they are null terminated - c

When I run print_puzzle(create_puzzle(input)), I get a bunch of gobbledegook at the bottom of the output, only in the last row. I have no idea why this keeps happening. The output is supposed to be 9 rows of 9 numbers (the input is a sudoku puzzle with zeroes representing empty spaces).
This code should take that input, make a 2d array of strings and then, with print_puzzle, print those strings out in a grid. They are strings because eventually I will implement a way to display all the values the square could possibly be. But for now, when I print it out, things are screwed up. I even tried putting the null character in every single element of all 81 strings, but it still gets screwed up when it goes to print the strings. I'm lost!
typedef struct square {
char vals[10]; // string of possible values
} square_t;
typedef struct puzzle {
square_t squares[9][9];
} puzzle_t;
static puzzle_t *create_puzzle(unsigned char vals[9][9]) {
puzzle_t puz;
puzzle_t *p = &puz;
int i, j, k, valnum;
for (i = 0; i < 9; i++) {
for (j = 0; j < 9; j++) {
puz.squares[i][j].vals[0] = '\0';
puz.squares[i][j].vals[1] = '\0';
puz.squares[i][j].vals[2] = '\0';
puz.squares[i][j].vals[3] = '\0';
puz.squares[i][j].vals[4] = '\0';
puz.squares[i][j].vals[5] = '\0';
puz.squares[i][j].vals[6] = '\0';
puz.squares[i][j].vals[7] = '\0';
puz.squares[i][j].vals[8] = '\0';
puz.squares[i][j].vals[9] = '\0';
valnum = vals[i][j] -'0';
for (k = 0; k < 10; k++){
if ((char)(k + '0') == (char)(valnum + '0')){
char tmpStr[2] = {(char)(valnum +'0'),'\0'};
strcat(puz.squares[i][j].vals, tmpStr);
}
}
}
}
return p;
}
void print_puzzle(puzzle_t *p) {
int i, j;
for (i=0; i<9; i++) {
for (j=0; j<9; j++) {
printf(" %2s", p->squares[i][j].vals);
}
printf("\n");
}
}

In short:
In function create_puzzle(), you are returning a pointer to the local variable puz. A local variable exists only while its own function is executing, so once create_puzzle() has returned, the content referenced by the returned pointer is indeterminate.
More details:
In C, local variables are usually stored on a "stack" data structure. When create_puzzle() is entered, its local variables come alive, and they are dead again as soon as the function returns. An implementation of C is not required to leave the garbage you left on the stack untouched so that you can access its original content later. C is not a memory-safe language: implementations let you make this mistake and sometimes get away with it. Memory-safe languages solve this problem by restricting your power. For example, in C# you can take the address of a local, but the language is designed so that it is impossible to use it after the lifetime of the local ends.
This answer is very awesome:
Can a local variable's memory be accessed outside its scope?

In function create_puzzle(), you are returning a pointer of type puzzle_t *. But the address of the variable puz is invalid once you return from the function.
Variables declared inside a function are local variables: they exist only while that function is executing. When the function returns, the storage it was using on the stack is considered free by the program, though it may not get overwritten right away. The value at puz is then indeterminate, and accessing it through the returned pointer results in undefined behavior.
You can make puz a global variable, and use it the way you are doing right now.

You are returning the address of a local variable here:
return p;
Declare p and puz outside of the function, then it should work.
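A minimal sketch of that suggestion, assuming the input grid holds the characters '0' to '9' as in the question. The simplified loop body (storing the digit directly instead of using strcat) is my own shortcut, not the asker's code:

```c
#include <string.h>

typedef struct square {
    char vals[10];               /* string of possible values */
} square_t;

typedef struct puzzle {
    square_t squares[9][9];
} puzzle_t;

/* File scope: static storage duration, so it outlives every call. */
static puzzle_t puz;

puzzle_t *create_puzzle(unsigned char vals[9][9])
{
    memset(&puz, 0, sizeof puz);             /* every vals string becomes "" */
    for (int i = 0; i < 9; i++) {
        for (int j = 0; j < 9; j++) {
            unsigned char c = vals[i][j];
            if (c >= '0' && c <= '9')
                puz.squares[i][j].vals[0] = (char)c;  /* vals[1] is already '\0' */
        }
    }
    return &puz;                             /* valid after the function returns */
}
```

The trade-off is that every call reuses the same object, so a second call overwrites the puzzle returned by the first.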

p points to local memory that is unavailable after the function ends. Returning that leads to problems. Instead, allocate memory.
// puzzle_t puz;
// puzzle_t *p = &puz;
puzzle_t *p = malloc(sizeof *p);
assert(p);
Be sure to free() the memory after the calling code completes using it.
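Put together, the heap-allocated version might look like the sketch below. It assumes the grid holds the characters '0' to '9'. Using calloc and returning NULL on failure (rather than assert) are choices of this sketch, not part of the original code:

```c
#include <stdlib.h>

typedef struct square {
    char vals[10];               /* string of possible values */
} square_t;

typedef struct puzzle {
    square_t squares[9][9];
} puzzle_t;

/* Returns a heap-allocated puzzle, or NULL if allocation fails.
   The caller owns the result and must free() it. */
puzzle_t *create_puzzle(unsigned char vals[9][9])
{
    puzzle_t *p = calloc(1, sizeof *p);      /* zero-filled: all strings are "" */
    if (p == NULL)
        return NULL;
    for (int i = 0; i < 9; i++) {
        for (int j = 0; j < 9; j++) {
            unsigned char c = vals[i][j];
            if (c >= '0' && c <= '9')
                p->squares[i][j].vals[0] = (char)c;  /* terminator already in place */
        }
    }
    return p;
}
```

The calling code would then do something like puzzle_t *p = create_puzzle(input); print_puzzle(p); free(p);.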

Related

Initialize an array in a loop (array vs malloc)

Hi I am quite new to C and I have a question about the behavior of array initialization using [] and malloc.
int main() {
int* pointer;
for(int i = 0; i < 100; i++) {
// Init the Array
int tmp[2] = {};
// Do some operation here...
tmp[0] = 0;
tmp[1] = i;
// If the value is 1, copy that array pointer
if(i == 1) {
pointer = tmp;
}
}
// expected 1 here, but got 99
printf("%d\n", pointer[1]);
return 0;
}
Why is the output 99? I thought the array is re-inited every loop, but it turns out using the same memory address. And if I use malloc to init the array instead, the result becomes 1 as expected.
Is there any way I could get result 1 without using malloc?
Your code is invalid: after the loop you access, through a pointer, an object whose lifetime has ended. That is undefined behaviour.
On every iteration you assign i to the same element of the array. The pointer only references (points to) the first element of this array, so if you change the underlying object, the value you read through the pointer changes as well. If your finger is pointing at a box of 5 apples and someone eats 2 apples, your finger will point at a box of 3 apples, not 5.
You need to make a copy of the object.
if(i == 1) {
pointer = malloc(sizeof(tmp));
memcpy(pointer, tmp, sizeof(tmp));
}
or break out of the loop (after declaring tmp static or moving it out of the for loop scope)
for(int i = 0; i < 100; i++) {
// Init the Array
static int tmp[2];
// Do some operation here...
tmp[0] = 0;
tmp[1] = i;
// If the value is 1, copy that array pointer
if(i == 1) {
pointer = tmp;
break;
}
}
The scope of the array tmp is the block scope of the for loop
for(int i = 0; i < 100; i++) {
// Init the Array
int tmp[2] = {};
// Do some operation here...
tmp[0] = 0;
tmp[1] = i;
// If the value is 1, copy that array pointer
if(i == 1) {
pointer = tmp;
}
}
That is, in each iteration of the loop a new array tmp is created, and it ceases to exist when the block is exited.
Thus the pointer pointer is invalid after the for loop. Dereferencing the pointer after the for loop invokes undefined behavior.
You got the result 99 only because the array tmp happened to be placed at the same address in every iteration and the memory it occupied had not yet been overwritten. So the last value stored in that memory, i.e. the value of i equal to 99, was output.
Even if you declare the array tmp before the for loop, printing through pointer will output 99, the value last stored in the array.
You could write for example
int tmp[2] = { 0 };
int *pointer = tmp;
for(int i = 0; i < 100; i++) {
// Do some operation here...
tmp[0] = 0;
tmp[1] = i;
}
And the last value stored in the array (when i is equal to 99)
tmp[1] = i;
will be outputted in this call
printf("%d\n", pointer[1]);
Note that such an initialization with empty braces is invalid in C before C23, unlike in C++
int tmp[2] = {};
You need to write at least like
int tmp[2] = { 0 };
As we know, a pointer stores a memory address.
Here, when you execute pointer = tmp;,
the address of the array 'tmp' is copied into 'pointer'.
But when the iteration for i = 1 completes, the array 'tmp' created in that iteration ceases to exist.
Then the iteration for i = 2 starts, and 'tmp' is created again.
This repeats until the loop ends.
In practice the program places tmp at the same location every time, so the data stored there changes again and again.
So when you execute printf("%d\n", pointer[1]);, the data at that address gets printed, and it is no longer equal to 1.
The mistake is that we stored the address of 'tmp' in 'pointer' instead of its contents.
When we use malloc, that memory stays reserved until we free it, so nothing else in the program is placed there. (That's why we always need to free that memory to avoid memory leaks.)
That is why you get 1 as output when using malloc: the later iterations don't touch that particular memory.
Solution:
If you want to solve the problem without malloc,
declare 'pointer' as an array to store the data of 'tmp', e.g. int pointer[2];.
Use this code,
pointer[0] = tmp[0]; pointer[1] = tmp[1];
in place of
pointer = tmp;.
Now you will not be copying the address to 'pointer' but the data in 'tmp'.
And if you have a big array with many values in it, just use a for loop.
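As a rough sketch of that idea, the loop from the question can copy the contents into an array that is declared outside the loop. The names captured_value and saved are made up for this illustration, and memcpy stands in for the element-by-element copy:

```c
#include <string.h>

/* Runs the loop from the question and returns the value captured when i == 1. */
int captured_value(void)
{
    int saved[2] = { 0 };        /* declared outside the loop, outlives every tmp */

    for (int i = 0; i < 100; i++) {
        int tmp[2] = { 0 };
        tmp[0] = 0;
        tmp[1] = i;
        if (i == 1)
            memcpy(saved, tmp, sizeof saved);   /* copy the data, not the address */
    }
    return saved[1];             /* later iterations never touched saved */
}
```

Because saved holds its own copy of the data, the later writes to tmp cannot change it.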
Also, you will get the same problem in any other code that copies only the address instead of the data.

Weird output behavior with char array in C [duplicate]

I have a function that returns char*. When I print the result with printf("%c", *(k+i)); in main, it prints:
0' 10101001 Q -> Q
but if I print with printf(" %c", *(k+i)); there are fewer problems.
If I print inside the tobinary function, the output comes out perfect, like this:
1010.011010011011101001011110001101010011111101111100111 -> 111011
What am I doing wrong? Here is the code.
char *tobinary(double num) {
int length = 62;
char bin[length];
int intpart = (int)num;
double decpart = 1000*(num - intpart);
int i = 0;
while (intpart!=0) {
if(intpart%2 == 1) bin[3-i] = '1';
else bin[3-i] = '0';
intpart /= 2;
i++;
}
bin[i++] = '.';
while (i <= length) {
decpart *= 2;
if (decpart >= 1000) {
bin[i] = '1';
decpart -= 1000;
}
else bin[i] = '0';
i++;
}
char *k = bin;
return k;
}
int main(int argc, char **argv) {
char *k = tobinary(10.413);
for(int i = 0; i <= 62; ++i) {
printf("%c", *(k+i));
if (i==56) printf(" -> ");
}
}
When you declare a local variable in a function, like this
char *tobinary(double num) {
int length = 62;
char bin[length];
/* ... */
}
it is stored in a special memory area called the stack. Whenever a function func() is called, the CPU saves some useful data there, such as the return address in the calling function where execution will resume after func() returns, along with the parameters of func() and, as I wrote above, any local variables declared in it.
All this data is stacked in LIFO order (Last In, First Out), so that when the function returns, a special pointer (the stack pointer) is changed to point back to the data belonging to the calling function. func()'s data is still there, but it can be overwritten whenever another function is called or other local variables are declared by the caller. Note that this is consistent with the fact that local variables have a lifetime limited to the function in which they are declared.
That's what happens in your scenario. As execution goes on, your bin[] array is not guaranteed to stay "safe":
int i is declared in the for-loop section
printf() is called
This is what corrupts "your" data (I used double quotes because it is not yours anymore).
Whenever you need to return data manipulated by a function, you have three options:
Declare the array outside the function and pass it in, after changing the prototype: int tobinary(char *arr, unsigned int arrsize, double num);. In this way the function can modify the data passed by the caller (changing at most arrsize characters). The return value can become an error code; something like 0 on success and -1 on failure.
Dynamically allocate the array inside your function using malloc(). In this case freeing the memory (with free()) is the responsibility of the caller.
Declare the array in your function as static (its size must then be a constant). This storage-class specifier tells the compiler that the lifetime of the variable is the whole life of the program, and a different memory area is used to store it instead of the stack. Be aware that in this case your function won't be thread safe anymore (different threads accessing the same memory area would lead to bizarre results).
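The first option could look roughly like this. It is a sketch under a few assumptions of my own: num is non-negative and fits in an unsigned int, the result is null terminated, and the fractional bits fill whatever space remains in the buffer. The conversion logic is simplified and is not the asker's original algorithm:

```c
/* Writes the binary representation of num into arr (at most arrsize bytes,
   including the terminating '\0'). Returns 0 on success, -1 if arr is too small. */
int tobinary(char *arr, unsigned int arrsize, double num)
{
    unsigned int intpart = (unsigned int)num;   /* assumes 0 <= num <= UINT_MAX */
    double frac = num - intpart;
    char tmp[sizeof(unsigned int) * 8];         /* integer bits, lowest first */
    unsigned int n = 0, pos = 0;

    do {                                        /* collect integer bits */
        tmp[n++] = (char)('0' + intpart % 2);
        intpart /= 2;
    } while (intpart != 0);

    if (n + 2 > arrsize)                        /* digits + '.' + '\0' must fit */
        return -1;

    while (n > 0)                               /* emit highest bit first */
        arr[pos++] = tmp[--n];
    arr[pos++] = '.';

    while (pos + 1 < arrsize) {                 /* fractional bits in the rest */
        frac *= 2;
        if (frac >= 1.0) {
            arr[pos++] = '1';
            frac -= 1.0;
        } else {
            arr[pos++] = '0';
        }
    }
    arr[pos] = '\0';
    return 0;
}
```

The caller declares the buffer, e.g. char bin[64]; and checks the result with if (tobinary(bin, sizeof bin, 10.413) == 0) puts(bin);. The array then lives in the caller's frame, so it is still valid when it is printed.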
bin is a character array inside your function.
It is not static, so when you return a pointer to it, it is not guaranteed to keep the same contents.
Either change it to static or return memory that you allocate, which the caller will then need to free.

Why give a variable a null value when declaring it?

I noticed that many times when declaring variables/arrays/etc, people give it a null value or 0.
for example, take a look at the following snippet:
int nNum = 0;
char cBuffer[10] = { 0 };
int *nPointer = NULL;
and etc.
Until asking this question I figured it was for debugging purposes, since when I was debugging a program with Visual Studio I noticed that variables that had not been given a value showed arbitrary numbers, whilst with 0 they had... 0.
There are a number of reasons to initialize variables to 0 or NULL.
First of all, remember that unless it's declared at file scope (outside of any function) or with the static keyword, a variable will contain an indeterminate value; it may be 0, it may be 0xDEADBEEF, it may be something else.
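To illustrate the distinction, here is a small sketch. The names file_counter and next_id are invented for this example:

```c
int file_counter;                /* file scope: static storage, starts at 0 */

/* Returns 1, 2, 3, ... on successive calls. */
int next_id(void)
{
    static int calls;            /* static local: zero-initialized once */
    int step = 1;                /* automatic: indeterminate unless initialized */

    calls += step;
    return calls;
}
```

Only step needs an explicit initializer here. The other two variables are guaranteed to start at zero.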
For sums, counters, and the like, you want to make sure you start from 0, otherwise you will get an invalid result:
int sum = 0;
while ( not_at_end_of_things_to_sum )
sum += next_thing_to_sum;
int count = 0;
while ( there_is_another_thing_to_count )
count++;
etc.
Granted, you don't have to initialize these kinds of variables as part of the declaration; you just want to make sure they're zeroed out before you use them, so you could write
int count;
...
count = 0;
while ( there_is_another_thing_to_count )
count++;
it's just that by doing it as part of the declaration, you don't have to worry about it later.
For arrays intended to hold strings, it's to make sure that there's a 0 terminator if you're building a string without using strcpy or strcat or scanf or similar:
char buf[N] = { 0 }; // first element is *explicitly* initialized to 0,
// remaining elements are *implicitly* initialized to 0
while ( not_at_end_of_input && i < N - 1 )
buf[i++] = next_char;
You don't have to do it this way, but otherwise you'd have to be sure to add the 0 terminator manually:
buf[i] = 0;
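As a sketch of the manual approach, the hypothetical helper below (copy_digits is not a library function) builds a string one character at a time and writes the terminator itself, so it does not depend on the buffer having been zeroed:

```c
#include <ctype.h>
#include <stddef.h>

/* Copies the decimal digits found in src into buf (capacity n bytes).
   Always null terminates when n > 0. Returns the number of digits stored. */
size_t copy_digits(char *buf, size_t n, const char *src)
{
    size_t i = 0;

    if (n == 0)
        return 0;
    while (*src != '\0' && i < n - 1) {      /* leave room for the terminator */
        if (isdigit((unsigned char)*src))
            buf[i++] = *src;
        src++;
    }
    buf[i] = '\0';                           /* explicit 0 terminator */
    return i;
}
```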
For pointers, it's to make it easy to test if a pointer is valid or not. A NULL pointer is a well-defined invalid pointer value that's easy to test against:
char *p = NULL;
...
if ( !p ) // or p == NULL )
{
// p has not yet been assigned or allocated
p = malloc( ... );
}
Otherwise, it's effectively impossible to tell if the pointer value is valid (points to an object or memory allocated with malloc) or not.
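The NULL test above can be wrapped in a small helper. This is only a sketch, and ensure_buffer is a name made up for the illustration:

```c
#include <stdlib.h>

/* Allocates size bytes only if p has not been assigned yet.
   Returns p unchanged when it already points somewhere. */
char *ensure_buffer(char *p, size_t size)
{
    if (p == NULL)               /* NULL is the reliable "not yet allocated" marker */
        p = malloc(size);
    return p;                    /* may still be NULL if malloc failed */
}
```

This only works because the pointer was initialized to NULL in the first place. With an uninitialized pointer the test would read an indeterminate value.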
An uninitialized pointer variable may contain a garbage value. The purpose of initializing a pointer with NULL is that if you inadvertently use it without assigning a proper address, you do not end up modifying the contents at a random memory address.
Depending on the language, you would write NULL explicitly or just int *nPointer;
This is called initializing a variable, i.e. giving it a known value at the point where it is created. It is extremely helpful if you want your program to stay in a known state, because you know that "un-initialized" variables won't cause a crash or other erratic behaviour.
Suppose you initialize a variable inside a block that only executes when some condition holds, and you want to use it outside that block, e.g.:
if(nNum != 0){
int *nPointer = &nNum;
for(int i=0; i<10; i++){
(*nPointer)++;
}
}
In this case the variable does not exist after the block. If you instead declare it earlier without initializing it and then use it later down the line, your program might break. However, if it has been initialized, you're safe: you know it exists, and if the block did not run it is still NULL, which you can test for.
SAFER CODE:
int *nPointer = NULL; //Or this will be a class member
if(nNum != 0){
nPointer = &nNum;
for(int i=0; i<10; i++){
(*nPointer)++;
}
}

Input and output array don't match

I am writing a program for a class at school and I cannot get the program to print out what I type in.
The problem states that the first line needs to contain the number of questions on an 'exam', followed by a space, then the answer key. I wanted to print the answer key to make sure that it was being entered correctly, and it never matches what I type in. The code is posted below.
This is the main file that gets run first, and it calls a function from another file. I have written the prototype correctly, so I'm pretty sure it's not that.
int main()
{
int i;
int noOfQuestions;
scanf("%d ", &noOfQuestions);
char * answerKeyPtr;
answerKeyPtr = fgetAnswers(noOfQuestions);
for(i = 0; i < noOfQuestions; i++){
printf("%c",answerKeyPtr[i]);
}
printf("\n");
return 0;
}
char *fgetAnswers(int noOfQuestions){
int i;
char * answerKeyPtr;
char AnswerKey[noOfQuestions];
answerKeyPtr = AnswerKey;
for(i = 0; i < noOfQuestions; i++){
scanf("%c",&AnswerKey[i]);
}
return answerKeyPtr;
}
What you have here is a memory problem.
You're storing data into the AnswerKey array, which is local to fgetAnswers(). The problem is that you're returning a pointer to that local variable, and its memory must not be accessed once fgetAnswers() finishes. So when you try to print the data in main() you're accessing memory you shouldn't.
To solve it, create the AnswerKey array in main, and pass it as a parameter to the fgetAnswers() function.
The char array AnswerKey is allocated on the stack when fgetAnswers is called. When you return from fgetAnswers, data stored in the stack frame for that call is no longer valid. You'll need to pass in the array or allocate it with malloc so the input isn't stored on the stack.
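A sketch of the pass-the-array-in approach follows. The function name read_answers, its FILE * parameter and its return value are assumptions of this example. The original reads from stdin with scanf, and taking a stream just makes the function easier to try out:

```c
#include <stdio.h>

/* Reads up to n answer characters from in into key, which must have room
   for n + 1 bytes. Stops early at end of line or end of input.
   Returns the number of answers stored. */
int read_answers(FILE *in, char *key, int n)
{
    int i = 0;

    while (i < n) {
        int c = fgetc(in);
        if (c == EOF || c == '\n')
            break;
        key[i++] = (char)c;      /* written into the caller's array */
    }
    key[i] = '\0';
    return i;
}
```

In main, the caller would declare the array (for example char answerKey[noOfQuestions + 1];) and call read_answers(stdin, answerKey, noOfQuestions);. The data then lives in main's frame and is still valid when it is printed.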

Variable reuse in C

The code I'm looking at is this:
for (i = 0; i < linesToFree; ++i ){
printf("Parsing line[%d]\n", i);
memset( &line, 0x00, 65 );
strcpy( line, lines[i] );
//get Number of words:
int numWords = 0;
tok = strtok(line , " \t");
while (tok != NULL) {
++numWords;
printf("Number of words is: %d\n", numWords);
println(tok);
tok = strtok(NULL, " \t");
}
}
My question centers around the use of numWords. Does the runtime system reuse this variable or does it allocate a new int every time it runs through the for loop? If you're wondering why I'm asking this, I'm a Java programmer by trade who wants to get into HPC and am therefore trying to learn C. Typically I know you want to avoid code like this, so this question is really exploratory.
I'm aware the answer is probably reliant upon the compiler... I'm looking for a deeper explanation than that. Assume the compiler of your choice.
Your conception about how this works in Java might be misinformed - Java doesn't "allocate" a new int every time through a loop like that either. Primitive type variables like int aren't allocated on the Java heap, and the compiler will reuse the same local storage for each loop iteration.
On the other hand, if you call new anything in Java every time through a loop, then yes, a new object will be allocated every time. However, you're not doing that in this case. C also won't allocate anything from the heap unless you call malloc or similar (or in C++, new).
Please note the difference between automatic and dynamic memory allocation. In Java, objects are always allocated dynamically; only primitives and references are automatic.
This is automatic allocation:
int numWords = 0;
This is dynamic allocation:
int *pNumWords = malloc(sizeof(int));
*pNumWords = 0;
The dynamic allocation in C only happens explicitly (when you call malloc or its derivatives).
In your code, only the value is set to your variable, no new one is allocated.
From a performance standpoint, it's not going to matter. (Variables map to registers or memory locations, so it has to be reused.)
From a logical standpoint, yes, it will be reused because you declared it outside the inner while loop.
From a logical standpoint:
numWords will not be reused in the outer loop because it is declared inside it.
numWords will be reused in the inner loop because it isn't declared inside.
This is what is called "block", "automatic" or "local" scope in C. It is a form of lexical scoping, i.e., a name refers to its local environment. In C, it is top down, meaning that it happens as the file is parsed and compiled and visible only after defined in the program.
When the variable goes out of scope, the lexical name is no longer valid (visible) and the memory may be reused.
The variable is declared in a local scope or a block defined by curly braces { /* block */ }. This defines a whole group of C and C99 idioms, such as:
for(int i=0; i<10; ++i){ // C99 only. int i is local to the loop
// do something with i
} // i goes out of scope here...
There are subtleties, such as:
int x = 5;
int y = x + 10; // this works
int x = y + 10;
int y = 5; // compiler error
and:
int g; // static by default and init to 0
extern int x; // defined and allocated elsewhere - resolved by the linker
int main (int argc, const char * argv[])
{
int j=0; // automatic by default
while (++j<=2) {
int i=1,j=22,k=3; // j from outer scope is lexically redefined
for (int i=0; i<10; i++){
int j=i+10,k=0;
k++; // k will always be 1 when printed below
printf("INNER: i=%i, j=%i, k=%i\n",i,j,k);
}
printf("MIDDLE: i=%i, j=%i, k=%i\n",i,j,k); // prints middle j
}
// printf("i=%i, j=%i, k=%i\n",i,j,k); compiler error
return 0;
}
There are idiosyncrasies:
In K&R C, ANSI C89, and older versions of Visual Studio, all variables must be declared at the beginning of the function or compound statement (i.e., before the first statement).
In gcc, variables may be declared anywhere in the function or compound statement and are only visible from that point on.
In C99 and C++, loop variables may be declared in the for statement and are visible until the end of the loop body.
In a loop block, the allocation is performed ONCE and the RH assignment (if any) is performed each time.
In the particular example you posted, you enquired about int numWords = 0; and if a new int is allocated each time through the loop. No, there is only one int allocated in a loop block, but the right hand side of the = is executed every time. This can be demonstrated so:
#include <stdio.h>
#include <time.h>
#include <unistd.h>
time_t ti(void){
return time(NULL);
}
void t1(void){
time_t t1=ti(); // initialized once, before the loop
for(int i=0; i<=10; i++){
time_t t2=ti(); // The allocation once, the assignment every time
sleep(1);
printf("t1=%ld:%p t2=%ld:%p\n",(long)t1,(void *)&t1,(long)t2,(void *)&t2);
}
}
Compile that with gcc or a compatible compiler such as clang, with optimizations off (-O0) or on. In practice the address of t2 is the same on every iteration.
Now compare with a recursive function:
int factorial(int n) {
if(n <= 1)
return 1;
printf("n=%i:%p\n",n,(void *)&n);
return n * factorial(n - 1);
}
The address of n will be different each time because a new automatic n is allocated with each recursive call.
Compare with an iterative version of factorial forced to use a loop-block allocation:
int fac2(int num) {
int r=1; // carries the running product, because 'result' goes out of scope
for (int i=1; i<=num; i++) {
int result=r*i; // the initializer is evaluated on every iteration
r=result;
printf("result=%i:%p\n",result,(void *)&result); // address is typically the same
}
return r;
}
In conclusion, you asked about int numWords = 0; inside the for loop. The variable is reused in this example.
The way the code is written, the programmer is relying on the right-hand side of int numWords = 0; being executed on every iteration, not just the first, so that the variable is reset to 0 for the while loop that follows.
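That reset can be checked with a self-contained sketch modelled on the question's loop. The function count_words and its parameters are invented here, and the 65-byte line buffer follows the question:

```c
#include <string.h>

/* Stores in counts[i] the number of whitespace-separated words in lines[i]. */
void count_words(const char *lines[], int n, int counts[])
{
    for (int i = 0; i < n; ++i) {
        char line[65];
        int numWords = 0;                    /* initializer runs on every iteration */

        strncpy(line, lines[i], sizeof line - 1);
        line[sizeof line - 1] = '\0';        /* strtok needs a writable, terminated copy */

        for (char *tok = strtok(line, " \t"); tok != NULL; tok = strtok(NULL, " \t"))
            ++numWords;
        counts[i] = numWords;
    }
}
```

If the initializer ran only once, the counts for later lines would include the words from earlier lines.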
The scope of the numWords variable is the body of the for loop. Just as in Java, you can only use the variable inside the loop, so theoretically its memory would have to be released at the end of each iteration, since it is on the stack in your case.
Any good compiler, however, would use the same memory and simply reset the variable to 0 on each iteration.
If you were using a C++ class instead of an int, you would see the destructor being called on every iteration of the for loop.
Even consider this:
class A {};
A* pA = new A;
delete pA;
pA = new A;
The two objects created here will probably reside at the same memory.
It will be allocated every time through the loop (the compiler can optimize out that allocation)
for (i = 0; i < 100; i++) {
int n = 0;
printf("%d : %p\n", i, (void*)&n);
}
No guarantees all 100 lines will have the same address (though probably they will).
Edit: The C99 Standard, in 6.2.4/5 says: "[the object] lifetime extends from entry into the block with which it is associated until execution of that block ends in any way." and, in 6.8.5/5, it says that the body of a for statement is in fact a block ... so the paragraph 6.2.4/5 applies.

Resources