Mysterious crash report - looks like a CPU bug - c

A user sent me a crash dump of my program, and I cannot understand how what I'm seeing is possible. It looks like one of the registers just changed its value, without any visible reason. I don't have any explanation except for a CPU bug, but I'm very skeptical about that. Perhaps you can spot what's going on here.
[Screenshot: the code disassembly, as seen when opening the crash report]
Here's, roughly, how the C code looks:
void **pp = *g_some_global;
if(!pp)
    return NULL;
int array_count = (int)pp[0];
void **array_ptr = (void **)pp[1];
for(i = 0; i < array_count; i++)
{
    LONG_PTR *contents = array_ptr[i];
    if(contents[4] == compare)
    {
        void **pp2 = (LONG_PTR *)contents[7]; // contents is different here! pp2 is NULL
        int array_count2 = (int)pp2[0]; // the CRASH!
        void **array_ptr2 = (void **)pp2[1];
        // ...
    }
}


Issues using realloc (old size)

I'm trying to use realloc function in C, to dynamically operate on a char array of strings (char**).
I usually get a realloc(): invalid old size error after the 41st cycle of the for loop and I really can't understand why.
So, thanks to everyone who will help me ^-^
[EDIT] I'm trying to make the post clearer and to follow your advice, as a "new active member" of this community, so thank you all!
typedef struct _WordsOfInterest { // this is in a header file containing just
    char **saved;                 // the struct and libraries
    int index;
} wordsOfInterest;
int main() {
    char *token1, *token2, *save1 = NULL, file[LEN], *temp, *word, **tokenArr;
    int n=0, ch,ch2, flag=0, size, init=0,position,currEdit,init2=0,tempEdit,size_arr=LEN,
        oldIndex=0,totalIndex=0,*editArr,counterTok=0;
    wordsOfInterest toPrint;
    char **final;
    toPrint.index = 0;
    toPrint.saved = malloc(sizeof(char*)*LEN);
    editArr = malloc(sizeof(int)*LEN);
    tokenArr = malloc(sizeof(char*)*LEN);
    final = malloc(sizeof(char*)*1);
    // external for loop
    for(...) {
        tokenArr[counterTok] = token1;
        // internal while loop
        while(...) {
            // some code here surely not involved in the issue
            } else {
                if(init2 == 0) {
                    currEdit = config(token1,token2);
                    toPrint.saved[toPrint.index] = token2;
                    toPrint.index++;
                    init2 = 1;
                } else {
                    if((abs((int)strlen(token1)-(int)strlen(token2)))<=currEdit) {
                        if((tempEdit = config(token1,token2)) == currEdit) {
                            toPrint.saved[toPrint.index] = token2;
                            toPrint.index++;
                            if(toPrint.index == size_arr-1) {
                                size_arr = size_arr*2;
                                toPrint.saved = realloc(toPrint.saved, size_arr);
                            }
                        } else if((tempEdit = config(token1,token2))<currEdit) {
                            freeArr(toPrint, size_arr);
                            toPrint.saved[toPrint.index] = token2;
                            toPrint.index++;
                            currEdit = tempEdit;
                        }
                    }
                }
                flag = 0;
                word = NULL;
                temp = NULL;
                freeArr(toPrint, size_arr);
            }
        }
        editArr[counterTok] = currEdit;
        init2 = 0;
        totalIndex = totalIndex + toPrint.index + 1;
        final = realloc(final, (sizeof(char*)*totalIndex));
        uniteArr(toPrint, final, oldIndex);
        oldIndex = toPrint.index;
        freeArr(toPrint,size_arr);
        fseek(fp2,0,SEEK_SET);
        counterTok++;
    }
You start with final uninitialized.
char **final;
change it to:
char **final = NULL;
Even if you start with no allocation, the pointer needs a valid value (e.g. NULL). An uninitialized local variable holds garbage, so realloc() will treat it as a pointer to a previously allocated block, and that is Undefined Behaviour. This is probably your problem, but since you have removed a lot of code between the declaration and the first use of realloc, we cannot tell what is actually happening here.
Whether you have in fact initialized it, I cannot say, because you have hidden part of the code, ignoring the recommendation in How to create a Minimal, Reproducible Example.
There are several reasons (mostly explained there) to provide complete but minimal code that fails out of the box. It lets us test the code without having to write (and probably fix, in whole or in part) the code needed to make it run. If you post only a concept, you cannot expect complete, running, tested code from us, and that strongly degrades the quality of SO answers.
This means you have work to do before posting, not just eliminating what you think is not worth mentioning.
You need to build a sample that, with minimum code, shows the actual behaviour you see (a complete, non-working program). This means eliminating everything that is not related to the problem.
You need (and this is far more important) to test the code yourself before sending it, and check that it still behaves as you describe. There are many cases where, once the unrelated code is removed, the described behaviour no longer appears.
...and then send it as is, without touching the code any further. Many times we see code that was edited just before posting, and the problem disappeared.
If we have to build the program ourselves, we will probably do it with many other mistakes, but not yours, and that defeats the purpose of this forum.
Finally, sorry for the flame... but it is necessary to make people read the rules.

Error caused by function that hasn't yet been run

So I'm writing a bill handling system. The data currently sits in a Stack structure that I've written.
I have this partially written function that writes out a report:
void GenerateReport(Bill* bill)
{
    PrintBillHeading(bill);
    //CallEntry* collatedEntries = CollapseCallStack(bill->callEntries);
    //TODO
}
Which works fine as long as I leave the second line commented out. If I uncomment it I get a SIGSEGV fault within the PrintBillHeading() function where indicated below.
void PrintBillHeading(Bill* bill)
{
    printf("Big Brother Telecom\n");
    printf("Bill Date: %s\n\n",DateTimeToISOString(bill->date));
    printf("Contract Holder: %s %s\n", bill->title, bill->name);
    printf("Address:\n");
    char* addressSeg;
    char* addressCpy;
    strcpy(addressCpy,bill->address); //This line throws the SIGSEGV
    while ((addressSeg = strtok_r(addressCpy,";",&addressCpy)))
    {
        printf("%s\n\0",addressSeg);
    }
}
And for completeness, here is my CollapseCallStack() function. It is incomplete, entirely untested and probably doesn't work.
CallEntry* CollapseCallStack(Stack* calls)
{
    int size = calls->topIndex;
    CallEntry* collatedSet = malloc(sizeof(CallEntry) * size);
    CallEntry* poppedCall;
    int curIndex = 0;
    while (PopStack(calls,poppedCall))
    {
        bool found = false;
        for (int i = 0; i < size; i++)
        {
            CallEntry* arrItem = collatedSet + i * sizeof(CallEntry);
            if (StringEquals(arrItem->phoneNumber,poppedCall->phoneNumber))
            {
                found = true;
                arrItem->minutes += poppedCall->minutes;
            }
        }
        if (!found)
        {
            memcpy(collatedSet,poppedCall,sizeof(CallEntry)); //
        }
    }
}
And the CallEntry struct:
typedef struct{
    char* phoneNumber;
    int minutes;
    DateTime* callDateTime;
} CallEntry;
My question is this: how can a function that hasn't yet been called cause a SIGSEGV fault to be expressed earlier on in a program?
Once I've got past this, I can debug the CollapseCallStack() function myself, although if anyone sees any glaring problems I would appreciate a comment on that.
In function PrintBillHeading(), the statement strcpy(addressCpy,bill->address) uses the value of an uninitialized variable addressCpy. This is undefined behavior. Undefined behavior means that the program may crash in any random place. If the program contains undefined behavior the entire program is invalid.
In addition to the correct answer by AlexP, I'd like to point out another (lurking) undefined behaviour:
void GenerateReport(Bill* bill)
{
    PrintBillHeading(bill);
    CallEntry* collatedEntries = CollapseCallStack(bill->callEntries);
    //TODO
}
Now, CollapseCallStack in your current implementation does not return anything. It will still be called, and something will be stored in your collatedEntries pointer when it is initialized.
The problem is that CollapseCallStack is declared to return a CallEntry*, but control reaches its closing brace without a return statement, so the return value is never given a meaningful value. Using that value in the caller is undefined behaviour in C. In practice your collatedEntries pointer will typically hold a garbage value, and if you tried to dereference it, that would cause UB as well.

C: Segmentation fault: GDB: <error reading variable>

I have a function shortestPath() that is a modified implementation of Dijkstra's algorithm for use with a board game AI I am working on for my comp2 class. I have trawled through the website and using gdb and valgrind I know exactly where the segfault happens (actually knew that a few hours ago), but can't figure out what undefined behaviour or logic error is causing the problem.
The function in which the problem occurs is called around 10x and works as expected until it segfaults with GDB:
"error reading variable: cannot access memory"
and valgrind:
"Invalid read of size 8"
Normally that would be enough, but I can't work this one out. Also, any general advice and tips are appreciated... thanks!
GDB: https://gist.github.com/mckayryan/b8d1e9cdcc58dd1627ea
Valgrind: https://gist.github.com/mckayryan/8495963f6e62a51a734f
Here is the function in which the segfault occurs:
static void processBuffer (GameView currentView, Link pQ, int *pQLen,
                           LocationID *buffer, int bufferLen, Link prev,
                           LocationID cur)
{
    //printLinkIndex("prev", prev, NUM_MAP_LOCATIONS);
    // adds newly retrieved buffer Locations to queue adding link types
    appendLocationsToQueue(currentView, pQ, pQLen, buffer, bufferLen, cur);
    // calculates distance of new locations and updates prev when needed
    updatePrev(currentView, pQ, pQLen, prev, cur); // <--- this line here
    qsort((void *) pQ, *pQLen, sizeof(link), (compfn)cmpDist);
    // qsort sanity check
    int i, qsortErr = 0;
    for (i = 0; i < *pQLen-1; i++)
        if (pQ[i].dist > pQ[i+1].dist) qsortErr = 1;
    if (qsortErr) {
        fprintf(stderr, "loadToPQ: qsort did not sort successfully");
        abort();
    }
}
and the function whereby after it is called everything falls apart:
static void appendLocationsToQueue (GameView currentView, Link pQ,
                                    int *pQLen, LocationID *buffer,
                                    int bufferLen, LocationID cur)
{
    int i, c, conns;
    TransportID type[MAX_TRANSPORT] = { NONE };
    for (i = 0; i < bufferLen; i++) {
        // get connection information (up to 3 possible)
        conns = connections(currentView->gameMap, cur, buffer[i], type);
        for (c = 0; c < conns; c++) {
            pQ[*pQLen].loc = buffer[i];
            pQ[(*pQLen)++].type = type[c];
        }
    }
}
So I thought that a pointer had been overwritten with the wrong address, but after a lot of printing in GDB that doesn't seem to be the case. I also tried reading and writing each of the variables in question in turn to see which ones trigger the fault, and they all do after appendLocationsToQueue(), but not before (or at the end of that function, for that matter).
Here is the rest of the relevant code:
shortestPath():
Link shortestPath (GameView currentView, LocationID from, LocationID to, PlayerID player, int road, int rail, int boat)
{
    if (!RAIL_MOVE) rail = 0;
    // index of locations that have been visited
    int visited[NUM_MAP_LOCATIONS] = { 0 };
    // current shortest distance from the source
    // the previous node for current known shortest path
    Link prev;
    if(!(prev = malloc(NUM_MAP_LOCATIONS*sizeof(link))))
        fprintf(stderr, "GameView.c: shortestPath: malloc failure (prev)");
    int i;
    // initialise link data structure
    for (i = 0; i < NUM_MAP_LOCATIONS; i++) {
        prev[i].loc = NOWHERE;
        prev[i].type = NONE;
        if (i != from) prev[i].dist = INF;
        else prev[i].dist = LAST;
    }
    LocationID *buffer, cur;
    // a priority queue that dictates the order LocationID's are checked
    Link pQ;
    int bufferLen, pQLen = 0;
    if (!(pQ = malloc(MAX_QUEUE*sizeof(link))))
        fprintf(stderr, "GameView.c: shortestPath: malloc failure (pQ)");
    // load initial location into queue
    pQ[pQLen++].loc = from;
    while (!visited[to]) {
        // remove first item from queue into cur
        shift(pQ, &pQLen, &cur);
        if (visited[cur]) continue;
        // freeing malloc from connectedLocations()
        if (cur != from) free(buffer);
        // find all locations connected to
        buffer = connectedLocations(currentView, &bufferLen, cur,
                                    player, currentView->roundNum, road,
                                    rail, boat);
        // mark current node as visited
        visited[cur] = VISITED;
        // locations from buffer are used to update priority queue (pQ)
        // and distance information in prev
        processBuffer(currentView, pQ, &pQLen, buffer, bufferLen, prev,
                      cur);
    }
    free(buffer);
    free(pQ);
    return prev;
}
The fact that all your parameters look good before this line:
appendLocationsToQueue(currentView, pQ, pQLen, buffer, bufferLen, cur);
and become unavailable after it tells me that you've stepped on (wrote 0x7fff00000000 to) the $rbp register (all local variables and parameters are relative to $rbp when building without optimization).
You can confirm this in GDB with print $rbp before and after call to appendLocationsToQueue ($rbp is supposed to always have the same value inside a given function, but will have changed).
Assuming this is true, there are only a few ways this could happen, and the most likely way is a stack buffer overflow in appendLocationsToQueue (or something it calls).
You should be able to use Address Sanitizer (gcc -fsanitize=address -g ...) to find this bug fairly easily.
It's also fairly easy to find the overflow in GDB: step into appendLocationsToQueue, and do watch -l *(char**)$rbp, continue. The watchpoint should fire when your code overwrites the $rbp save location.

Fgets errors seg fault

Is there any reason that a program, which compiled earlier, should seg fault at a point because of fgets? I changed no code related to it AT ALL. Suddenly I believe it is failing to open the file, but I tested it with the file like fifteen minutes ago.... All I did was add a search function, so I don't understand what the issue is.....
Could it be the server I'm connecting to over PuTTY?
int createarray( int **arrayRef, FILE *fptr){
    int size = 0, i;
    char rawdata[100];
    while (fgets(rawdata, 99, fptr) != NULL){
        size++;
    }
    rewind(fptr);
    *arrayRef = malloc(sizeof(int) * size);
    for ( i = 0; i < size; i++ ){
        fgets(rawdata, 99, fptr);
        *(*arrayRef + i) = atoi(rawdata);
    }
    return size;
}
int main ( int argc, char **argv ) { //main call
    // declare variable to hold file
    FILE *inFilePtr = fopen(*(argv + 1), "r");
    int **aryHold;
    int numElements, sortchoice, key, foundindex;
    // Call function to create array and return num elements
    numElements = createarray(aryHold, inFilePtr);
This is the code that compiled, ran correctly, and hasn't been changed since. GDB says there is an error with fgets.
OK, the reason it used to "work" is that you were clobbering an unimportant memory location. Changing your code shifted things around and now you are clobbering something important.
You're passing an uninitialized pointer to createarray(). You wanted to do something like:
int* aryHold;
//...
... createarray(&aryHold ...
BTW, many compilers have the ability to catch this kind of error for you. If you haven't already, you might want to see if your compiler has an error checking option that could have saved you hassling with this (and perhaps find some other code that only "works" accidentally).

Running out of memory.. How?

I'm attempting to write a solver for a particular puzzle. It tries to find a solution by trying every possible move one at a time until it finds a solution. The first version tried to solve it depth-first by continually trying moves until it failed, then backtracking, but this turned out to be too slow. I have rewritten it to be breadth-first using a queue structure, but I'm having problems with memory management.
Here are the relevant parts:
int main(int argc, char *argv[])
{
    ...
    int solved = 0;
    do {
        solved = solver(queue);
    } while (!solved && !pblListIsEmpty(queue));
    ...
}
int solver(PblList *queue) {
    state_t *state = (state_t *) pblListPoll(queue);
    if (is_solution(state->pucks)) {
        print_solution(state);
        return 1;
    }
    state_t *state_cp;
    puck new_location;
    for (int p = 0; p < puck_count; p++) {
        for (dir i = NORTH; i <= WEST; i++) {
            if (!rules(state->pucks, p, i)) continue;
            new_location = in_dir(state->pucks, p, i);
            if (new_location.x != -1) {
                state_cp = (state_t *) malloc(sizeof(state_t));
                state_cp->move.from = state->pucks[p];
                state_cp->move.direction = i;
                state_cp->prev = state;
                state_cp->pucks = (puck *) malloc (puck_count * sizeof(puck));
                memcpy(state_cp->pucks, state->pucks, puck_count * sizeof(puck)); /*CRASH*/
                state_cp->pucks[p] = new_location;
                pblListPush(queue, state_cp);
            }
        }
    }
    free(state->pucks);
    return 0;
}
When I run it I get the error:
ice(90175) malloc: *** mmap(size=2097152) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Bus error
The error happens around iteration 93,000.
From what I can tell, the error message is from malloc failing, and the bus error is from the memcpy after it.
I have a hard time believing that I'm running out of memory, since each game state is only ~400 bytes. Yet that does seem to be what's happening, seeing as the activity monitor reports that it is using 3.99GB before it crashes. I'm using http://www.mission-base.com/peter/source/ for the queue structure (it's a linked list).
Clearly I'm doing something dumb. Any suggestions?
Check the result of malloc. If it's NULL, you might want to print out the length of that queue.
Also, the code snippet you posted frees only state->pucks; the state_t structs themselves are never freed...
You need to free() the memory you've allocated manually after you're done with it; dynamic memory doesn't just "free itself".
