Realloc Causing App Crash - c

This question has been asked multiple times, but I've done (from what I can tell) everything that's been mentioned here. Basically, I'm getting 1 character at a time from a TCP socket and I'm building a dynamically growing string with the one character. I can do looping prints and see that the string grows and grows, and then when it gets to 20 characters long, the program crashes.
while(FilterAmount != 0)
{
g_io_channel_read_chars (source,(gchar *) ScanLine,1,&BytesRead,&GlibError);
printf("Scanline: %s FilterAmount: %ld\n", ScanLine, FilterAmount);
//the filters are delimited by \n, munch these off, reset important variables, save the last filter which is complete
if(ScanLine[0] == FilterTerminator[0]) {
//if the Filter Name actually has a filter in it
if(FilterName != NULL){
FilterArray = FilterName; //save off the filter name
printf("This is the filter name just added: %s\n", FilterName);
FilterArray++; //increment the pointer to point to the next memory location.
FilterAmount--; //update how many filters we have left
FilterNameCount = 0; //reset how many characters each filter name is
free(FilterName);
free(FilterTmp);
}
}
else {
printf("else!\n");
//keep track of the string length of the filter
FilterNameCount++;
//allocate more memory in the string used to store the filter name + null terminating character
FilterTmp = (gchar*)realloc(FilterName, FilterNameCount*sizeof(char) + 1);
if(FilterTmp == NULL)
{
free(FilterName);
printf("Error reallocating memory for the filter name temporary variable!");
return 1;
}
FilterName = FilterTmp;
printf("filter name: %s\n", FilterName);
//concat the character to the end of the string where space was just made for it.
strcat(FilterName, ScanLine);
}
}
}
This section of code loops and loops whenever we have a non "\n" character in a buffer we're reading data into. The program crashes when allocating the 21st character's location every single time. Here are the pertinent declarations:
static gchar *FilterName = NULL, *FilterTmp = NULL;
static gchar ScanLine[9640];
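A few things in the loop above look suspect, though this is only a reading of the posted code. realloc on a NULL pointer returns uninitialized memory, so the first strcat has no terminator to find. ScanLine is never null-terminated after the one-byte read. After free(FilterName) the pointer is not reset, so the next realloc receives a dangling pointer, and free(FilterTmp) releases the same block a second time. A minimal sketch of appending one byte at a time without strcat follows; append_char is a hypothetical helper, not part of GLib.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append one byte to a heap string of length *len.
 * Returns the (possibly moved) buffer, or NULL on failure, in which case
 * the old buffer is still valid and still owned by the caller. */
char *append_char(char *buf, size_t *len, char c)
{
    /* +2: one byte for the new character, one for the terminator */
    char *tmp = realloc(buf, *len + 2);
    if (tmp == NULL)
        return NULL;
    tmp[*len] = c;        /* write by index, no strcat needed */
    tmp[*len + 1] = '\0'; /* always keep the string terminated */
    (*len)++;
    return tmp;
}
```

With a helper like this, the loop would pass ScanLine[0] rather than the whole array, and after saving a completed name it would hand the buffer over and reset FilterName to NULL and the length to 0 instead of freeing memory that FilterArray still points to.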

Related

Input Parsing in C & deciding where to forward

I have the following project I need to work on for school. It involves a server and a client, where the client sends requests to the server to get certain information. The server gets the request, parses it, and then sends a response based on the type of the request. For example:
GET /apple/fn/tom/ln/sawyer/a/25/id/1234 : This is a request to get the info for the following person who works at the Apple company (/apple):
fn (first name): Tom
ln (last name): Sawyer
a (age): 25
id (ID): 1234
Now the server should accept input, parse it, and return the requested information from its own database. Implementing the server/client is not an issue. I need to know the best way to implement the algorithm that deals with the input, since not all requests look like the one above. Other examples of requests:
GET /apple/i: This should return info about Apple company (i.e. address, phone number)
GET /apple/e: return number of employees in Apple company
GET /apple/e/id/1234: return info of employee in Apple company with the following id=1234 (which in our example would be Tom Sawyer) i.e. return first name, last name, age, address.
GET /apple/fn/tom/ln/sawyer/a/25/id/1234 : discussed above
SET /apple/fn/tom/ln/sawyer/a/25/id/5678 : update id of this employee to 5678
SET /apple/fn/tom/ln/sawyer/a/23 : update age of this employee to 23
...
I will have to implement different structs for requests and responses (i.e. one struct for the request and another for the response) as well as a different function for each request/response pair. But what is the best way to parse the input and decide which function to send it to? I was told to look into a parser generator like Bison, but to my understanding that would only help parse the input and break it into pieces, which is not that hard for me: I know there is always a "/" between fields, so I can use:
strtok( input, "/" );
So the main issue is how to decide where to send each request. Assume I have the following structs and functions:
struct GetEmployeeInfoReq
{
char *fn;
char *ln;
int age;
};
struct GetEmployeeInfoResp
{
int house_num;
int street_num;
char *street_name;
char * postal_code;
int years_worked_here;
};
void GetEmployeeInfo( struct GetEmployeeInfoResp *resp, struct GetEmployeeInfoReq *req );
struct GetCompanyInfoReq
{
...
};
struct GetCompanyInfoResp
{
...
};
void GetCompanyInfo( struct GetCompanyInfoResp *resp, struct GetCompanyInfoReq *req );
Now I know that to call the first function I need the following request:
GET /apple/fn/tom/ln/sawyer/a/25/id/1234
and to call 2nd function I need the following:
GET /apple/i
My question is how to get this done. Off the top of my head, I'm thinking of defining a variable for each possible field in the input request, so that if this is my request:
GET /apple/e/id/1234
then I would have the following values defined and set to true:
bool is_apple = true;
bool is_employee = true;
bool is_id = true;
After that I know that this request is:
GET /apple/e/id/<id_num>
AND NOT
GET /apple/e
so I can send it to the proper function.
Is this the correct approach? I'm lost on how to tackle this issue.
Thanks,
Get yourself a large piece of paper and diagram the logic to derive a grammar (not strictly needed here, you could just parse it, but since this is a school assignment I assume it is meant to be built upon).
Some observations
every input starts with a /
every entry ends with a / except the last one
entries consist of characters and/or digits
empty entries get you in trouble if you insist on using strtok (thanks to wildplasser, I missed that)
order matters?
Entries are
either key (one entry) or key/value (two consecutive entries).
In other words: they may or may not have an argument
Entries with a meaning
first entry is the company name (what do you do if that's all? Check assignment)
next entry is either
i print company info
e
without arguments: print #employees
with argument: print information about argument, argument must be correct
fn first name as a key, must have a value because of strtok, argument must be a string
ln last name as a key, must have a value because of strtok, argument must be a string
id id as a key, must have a value because of strtok, argument must be a string of digits (an integer so to say but it is still a string at that point) only
a age as a key, must have a value because of strtok, argument must be a string of digits (an integer so to say but it is still a string at that point) only
Examples:
/apple/e
/ start of input
apple company name. Must have an argument, so go on
e something with employees, may or may not have an argument, so check.
EOI (end of input) that means that e has no arguments, so print the number of employees.
/apple/e/id/1234
/ start of input
apple company name. Must have an argument, so go on
e something with employees, may or may not have an argument, so check.
id is a key and must have a value, that means we need an argument, so check
1234 all digits which is correct value for id
EOI (end of input) no other things, so print the information you have about id-1234
/apple/fn/tom/ln/sawyer/a/25/id/1234
/ start of input
apple company name. Must have an argument, so go on
fn is a key, must have a value, check
tom the value for fn must be a string which it is, safe
ln is a key, must have a value, check
sawyer the value for ln must be a string which it is, safe
a is a key, must have a value, check
25 the value for a must be a string of digits which it is, safe
id is a key, must have a value, check
1234 the value for id must be a string of digits which it is, safe
EOI (end of input). The key for the database entry seems to be the name fn and ln, so read the entry tomsawyer and check if any of id or a is different and change accordingly. If nothing is different: check your assignment.
It might be a good idea to build a struct with all of the information and fill it while parsing (where I wrote "safe"). You need two main functions, printIt(keys) and changeIt(key, value). Some things can be done immediately, as in my first example; some need further ado.
That's it, should be straightforward to implement.
EDIT a short example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// ALL CHECKS OMITTED!
#define DELIMITER '/'
// should resemble a row from the DB
typedef struct company {
char *comp_name;
unsigned int num_empl;
unsigned int empl_id;
unsigned int empl_age;
char *first_name;
char *last_name;
} company;
int main(int argc, char **argv)
{
company *entries;
char *input, *token;
size_t ilen;
if (argc < 2) {
fprintf(stderr, "Usage: %s stringtoparse \n", argv[0]);
exit(EXIT_FAILURE);
}
// work on copy
ilen = strlen(argv[1]);
input = malloc(ilen + 1);
strcpy(input, argv[1]);
entries = malloc(sizeof(company));
// skip first delimiter
if (*input == DELIMITER) {
input++;
ilen--;
}
token = strtok(input, "/");
if (token == NULL) {
fprintf(stderr, "Usage : %s stringtoparse \n", argv[0]);
exit(EXIT_FAILURE);
}
// first entry is the company name
entries->comp_name = malloc(strlen(token) + 1); // +1 for the terminating '\0'
strcpy(entries->comp_name, token);
// mark empty entries as empty
entries->first_name = NULL;
entries->last_name = NULL;
entries->empl_age = -1;
entries->empl_id = -1;
// F(23)
entries->num_empl = 28657;
// only very small part of grammar implemented for simplicity
for (;;) {
token = strtok(NULL, "/");
if (token == NULL) {
break;
}
// "e" [ "/" "id" "/" number <<EOF>> ]
if (strcmp(token, "e") == 0) {
token = strtok(NULL, "/");
if (token == NULL) {
puts("Info about number of employees wanted\n");
// pure info, pull from DB (not impl.) and stop
break;
} else {
if (strcmp(token, "id") != 0) {
fprintf(stderr, "Only \"id\" allowed after \"e\" \n");
// free all heap memory here
exit(EXIT_FAILURE);
}
token = strtok(NULL, "/");
if (token == NULL) {
fprintf(stderr, "ERROR: \"id\" needs a number \n");
// free all heap memory here
exit(EXIT_FAILURE);
}
// does not check if it really is a number, use strtol() in prod.
entries->empl_id = atoi(token);
printf("Info about employee with id %d wanted\n", entries->empl_id);
// pure info, pull from DB (not impl.) and stop
break;
}
}
// "a" "/" number
else if (strcmp(token, "a") == 0) {
token = strtok(NULL, "/");
if (token == NULL) {
fprintf(stderr, "ERROR: \"a\" needs a number \n");
// free all heap memory here
exit(EXIT_FAILURE);
}
// does not check if it actually is a number, use strtol() in prod.
entries->empl_age = atoi(token);
printf("Age given: %d\n", entries->empl_age);
}
// "id" "/" number
else if (strcmp(token, "id") == 0) {
token = strtok(NULL, "/");
if (token == NULL) {
fprintf(stderr, "ERROR: \"id\" needs a number \n");
// free all heap memory here
exit(EXIT_FAILURE);
}
// does not check if it actually is a number, use strtol() in prod.
entries->empl_id = atoi(token);
printf("ID given: %d\n", entries->empl_id);
}
// "fn" "/" string
else if (strcmp(token, "fn") == 0) {
token = strtok(NULL, "/");
if (token == NULL) {
fprintf(stderr, "ERROR: \"fn\" needs a string \n");
// free all heap memory here
exit(EXIT_FAILURE);
}
entries->first_name = malloc(strlen(token) + 1); // +1 for the terminating '\0'
strcpy(entries->first_name, token);
printf("first name given: %s\n", entries->first_name);
}
// "ln" "/" string
else if (strcmp(token, "ln") == 0) {
token = strtok(NULL, "/");
if (token == NULL) {
fprintf(stderr, "ERROR: \"ln\" needs a string \n");
// free all heap memory here
exit(EXIT_FAILURE);
}
entries->last_name = malloc(strlen(token) + 1); // +1 for the terminating '\0'
strcpy(entries->last_name, token);
printf("last name given: %s\n", entries->last_name);
} else {
fprintf(stderr, "ERROR: Unknown token \"%s\" \n", token);
// free all heap memory here
exit(EXIT_FAILURE);
}
}
printf("\n\nEntries:\nCompany name: %s\nFirst name: %s\nLast name: %s\n",
entries->comp_name, entries->first_name, entries->last_name);
printf("Age: %d\nID: %d\nNumber of employees: %d\n",
entries->empl_age, entries->empl_id, entries->num_empl);
/*
* At this state you have information about what is given (in "entries")
* and what is wanted.
*
* Connect to the DB.
*
* If firstnamelastname is the DB-id and in the DB, you can check if
* the given ID is the same as the one in the DB and change if not.
*
* You can do the same for age.
*
* If firstnamelastname is not in the DB but ID is given check if the
* ID is in the DB, change firstname and/or lastname if necessary and
* congratulate on the wedding (many other reasons are possible, please
* check first or it might get really embarassing)
*/
// free all heap memory here
/* Disconnect from the DB */
exit(EXIT_SUCCESS);
}
Compiled with:
gcc -g3 -std=c11 -W -Wall -pedantic jjadams.c -o jjadams
try with
./jjadams "/Apple/fn/Tom/ln/Sawyer/a/10/id/3628800"
./jjadams "/Apple/e"
./jjadams "/Apple/e/id/1234"
./jjadams "/Apple/e/1234"
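The routing decision the question asks about can be sketched separately from the parsing: split the path into tokens once, then pick a request kind from the shape of the token list. The enum names, MAX_TOKENS and classify below are hypothetical and only illustrate the idea; what to do with a bare company name is left as unknown, since the assignment text is not shown.

```c
#include <string.h>

#define MAX_TOKENS 16

/* Hypothetical request kinds; the names are illustrative only. */
enum req_kind {
    REQ_UNKNOWN,
    REQ_COMPANY_INFO,    /* /company/i */
    REQ_EMPLOYEE_COUNT,  /* /company/e */
    REQ_EMPLOYEE_BY_ID,  /* /company/e/id/<number> */
    REQ_EMPLOYEE_FIELDS  /* /company/ followed by key/value pairs */
};

/* Classify a path such as "/apple/e/id/1234". The buffer is modified
 * by strtok, so pass a writable copy. */
enum req_kind classify(char *path)
{
    char *tok[MAX_TOKENS];
    int n = 0;

    for (char *t = strtok(path, "/"); t != NULL; t = strtok(NULL, "/")) {
        if (n == MAX_TOKENS)
            return REQ_UNKNOWN;      /* too many fields */
        tok[n++] = t;
    }
    if (n < 2)
        return REQ_UNKNOWN;          /* company name alone: not specified */
    if (strcmp(tok[1], "i") == 0)
        return n == 2 ? REQ_COMPANY_INFO : REQ_UNKNOWN;
    if (strcmp(tok[1], "e") == 0) {
        if (n == 2)
            return REQ_EMPLOYEE_COUNT;
        if (n == 4 && strcmp(tok[2], "id") == 0)
            return REQ_EMPLOYEE_BY_ID;
        return REQ_UNKNOWN;
    }
    /* anything else must be complete key/value pairs */
    if ((n - 1) % 2 != 0)
        return REQ_UNKNOWN;
    for (int i = 1; i < n; i += 2) {
        if (strcmp(tok[i], "fn") != 0 && strcmp(tok[i], "ln") != 0 &&
            strcmp(tok[i], "a") != 0 && strcmp(tok[i], "id") != 0)
            return REQ_UNKNOWN;
    }
    return REQ_EMPLOYEE_FIELDS;
}
```

A switch on the returned value could then call GetCompanyInfo, GetEmployeeInfo and so on, which keeps the per-field boolean flags from the question out of the picture.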

Inserting word from a text file into a tree in C

I have been encountering a weird problem for the past 2 days and I can't solve it yet. I am trying to get words from 2 text files and add those words to a tree. The methods I chose to get the words are referenced here:
Splitting a text file into words in C.
The function that I use to insert words into a tree is the following:
void InsertWord(typosWords Words, char * w)
{
int error ;
DataType x ;
x.word = w ;
printf(" Trying to insert word : %s \n",x.word );
Tree_Insert(&(Words->WordsRoot),x, &error) ;
if (error)
{
printf("Error Occured \n");
}
}
As mentioned in the linked post, when I try to import the words from a text file into the tree, I get "Error Occured". Here is the text file, followed once again by the reading loop:
a
aaah
aaahh
char this_word[15];
while (fscanf(wordlist, "%14s", this_word) == 1)
{
printf("Latest word that was read: '%s'\n", this_word);
InsertWord(W,this_word);
}
But when I am inserting the exact same words with the following way , it works just fine.
for (i = 0 ; i <=2 ; i++)
{
if (i==0)
InsertWord(W,"a");
if (i==1)
InsertWord(W,"aaah");
if (i==2)
InsertWord(W,"aaahh");
}
That proves the tree's functions work fine, but I can't understand what's happening then. I have been debugging for 2 days straight and still can't figure it out. Any ideas?
When you read the words using
char this_word[15];
while (fscanf(wordlist, "%14s", this_word) == 1)
{
printf("Latest word that was read: '%s'\n", this_word);
InsertWord(W,this_word);
}
you are always reusing the same memory buffer for the strings. This means when you do
x.word = w ;
you are ALWAYS storing the SAME address. Every read then overwrites ALL already stored words, basically corrupting the data structure.
Try changing char this_word[15]; to char *this_word; and allocating a fresh buffer with malloc(15) for every word, so that each stored pointer refers to its own memory. The buffer has to exist before fscanf writes into it, so allocate before each read:
char *this_word = malloc(15);
while (this_word != NULL && fscanf(wordlist, "%14s", this_word) == 1)
{
printf("Latest word that was read: '%s'\n", this_word);
InsertWord(W,this_word);
this_word = malloc(15); // fresh buffer for the next word
}
free(this_word); // the last buffer never reached the tree
As suggested by Michael Walz, strdup(3) also solves the immediate problem.
Of course you will also have to free the .word elements when finished with the tree.
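The copy-per-word idea can be sketched as follows. The tree types from the question are not shown, so this sketch stores the copies in a plain array instead; copy_word and read_words are hypothetical names, and copy_word simply does what strdup(3) does on systems that provide it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(3): returns a private heap copy of s. */
char *copy_word(const char *s)
{
    size_t n = strlen(s) + 1;      /* include the terminator */
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Read up to max words from in. Each stored pointer owns its own memory,
 * so reusing this_word for the next read cannot change earlier words. */
size_t read_words(FILE *in, char **words, size_t max)
{
    char this_word[15];
    size_t count = 0;

    while (count < max && fscanf(in, "%14s", this_word) == 1) {
        char *copy = copy_word(this_word);
        if (copy == NULL)
            break;                 /* out of memory: stop reading */
        words[count++] = copy;     /* the question would call InsertWord(W, copy) */
    }
    return count;
}
```

The caller frees each stored pointer once the words are no longer needed, which corresponds to freeing the .word members when the tree is torn down.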
Seems like the problem was in the assignment of the strings. strdup seemed to solve the problem!

Two-dimensional char array too large exit code 139

Hey guys, I'm attempting to read in workersinfo.txt and store it in a two-dimensional char array. The file is around 4,000,000 lines with around 100 characters per line. I want to store each line of the file in the array. Unfortunately, I get exit code 139 (not enough memory). I'm aware I have to use malloc() and free(), but I've tried a couple of things and I haven't been able to make them work. Eventually I have to sort the array by ID number, but I'm stuck on declaring the array.
The file looks something like this:
First Name, Last Name,Age, ID
Carlos,Lopez,,10568
Brad, Patterson,,20586
Zack, Morris,42,05689
This is my code so far:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
FILE *ptr_file;
char workers[4000000][1000];
ptr_file =fopen("workersinfo.txt","r");
if (!ptr_file)
perror("Error");
int i = 0;
while (fgets(workers[i],1000, ptr_file)!=NULL){
i++;
}
int n;
for(n = 0; n < 4000000; n++)
{
printf("%s", workers[n]);
}
fclose(ptr_file);
return 0;
}
Stack memory is limited. As you pointed out in your question, you MUST use malloc to allocate such a big (need I say HUGE) chunk of memory, as the stack cannot contain it. Exit code 139 is 128 + 11, i.e. the process was killed by SIGSEGV, which is what overflowing the stack typically produces.
You can use ulimit to review the limits of your system (usually including the stack size limit).
On my Mac, the limit is 8MB. After running ulimit -a I get:
...
stack size (kbytes, -s) 8192
...
Or, test the limit using:
struct rlimit rlim; // declared in <sys/resource.h>
getrlimit(RLIMIT_STACK, &rlim);
rlim.rlim_cur // the stack limit
I truly recommend you process each database entry separately.
As mentioned in the comments, assigning the memory as static memory would, in most implementations, circumvent the stack.
Still, IMHO, allocating 400MB of memory (or 4GB, depending which part of your question I look at), is bad form unless totally required - especially for a single function.
Follow-up Q1: How to deal with each DB entry separately
I hope I'm not doing your homework or anything... but I doubt your homework would include an assignment to load 400MB of data into the computer's memory... so... to answer the question in your comment:
The following sketch of single-entry processing isn't perfect: it's limited to 1KB of data per entry (which I thought to be more than enough for such simple data).
Also, I didn't allow for UTF-8 encoding or anything like that (I followed the assumption that English would be used).
As you can see from the code, we read each line separately and perform error checks to check that the data is valid.
To sort the file by ID, you might consider either running two lines at a time (this would be a slow sort) and sorting them, or creating a sorted node tree with the ID data and the position of the line in the file (get the position before reading the line). Once you sorted the binary tree, you can sort the data...
... The binary tree might get a bit big. did you look up sorting algorithms?
#include <stdio.h>
// assuming this is the file structure:
//
// First Name, Last Name,Age, ID
// Carlos,Lopez,,10568
// Brad, Patterson,,20586
// Zack, Morris,42,05689
//
// Then this might be your data structure per line:
struct DBEntry {
char* last_name; // a pointer to the last name
char* age; // a pointer to the age - could probably be an int
char* id; // a pointer to the ID
char first_name[1024]; // the actual buffer...
// I unified the first name and the buffer since the first name is first.
};
// each time you read only a single line, perform an error check for overflow
// and return the parsed data.
//
// return 1 on success or 0 on failure.
int read_db_line(FILE* fp, struct DBEntry* line) {
if (!fgets(line->first_name, 1024, fp))
return 0;
// parse data and review for possible overflow.
// first, zero out data
int pos = 0;
line->age = NULL;
line->id = NULL;
line->last_name = NULL;
// read each byte, looking for the EOL marker and the ',' separators
while (pos < 1024) {
if (line->first_name[pos] == ',') {
// we encountered a divider. we should handle it.
// if the ID field's location is already known, we have an excess comma.
if (line->id) {
fprintf(stderr, "Parsing error, invalid data - too many fields.\n");
return 0;
}
// replace the comma with 0 (separate the strings)
line->first_name[pos] = 0;
if (line->age)
line->id = line->first_name + pos + 1;
else if (line->last_name)
line->age = line->first_name + pos + 1;
else
line->last_name = line->first_name + pos + 1;
} else if (line->first_name[pos] == '\n') {
// we encountered a terminator. we should handle it.
if (line->id) {
// if we have the id string's position (the start marker), this is a
// valid entry and we should process the data.
line->first_name[pos] = 0;
return 1;
} else {
// we reached an EOL without enough ',' separators, this is an invalid
// line.
fprintf(stderr, "Parsing error, invalid data - not enough fields.\n");
return 0;
}
}
pos++;
}
// we ran through all the data but there was no EOL marker...
fprintf(stderr,
"Parsing error, invalid data (data overflow or data too large).\n");
return 0;
}
// the main program
int main(int argc, char const* argv[]) {
// open file
FILE* ptr_file;
ptr_file = fopen("workersinfo.txt", "r");
if (!ptr_file) {
perror("File Error");
return 1; // nothing to read, and fgets on a NULL stream would crash
}
struct DBEntry line;
while (read_db_line(ptr_file, &line)) {
// do what you want with the data... print it?
printf(
"First name:\t%s\n"
"Last name:\t%s\n"
"Age:\t\t%s\n"
"ID:\t\t%s\n"
"--------\n",
line.first_name, line.last_name, line.age, line.id);
}
// close file
fclose(ptr_file);
return 0;
}
Follow-up Q2: Sorting array for 400MB-4GB of data
IMHO, 400MB is already touching on the issues related to big data. For example, implementing a bubble sort on your database would be agonizing as far as performance goes (unless it's a one-time task, where performance might not matter).
Creating an array of DBEntry objects will eventually get you a larger memory footprint than the actual data.
This will not be the optimal way to sort large data.
The correct approach will depend on your sorting algorithm. Wikipedia has a decent primer on sorting algorithms.
Since we are handling a large amount of data, there are a few things to consider:
It would make sense to partition the work, so different threads/processes sort a different section of the data.
We will need to minimize IO to the hard drive (as it will slow the sorting significantly and prevent parallel processing on the same machine/disk).
One possible approach is to create a heap for a heap sort, but only storing a priority value and storing the original position in the file.
Another option would probably be to employ a divide and conquer algorithm, such as quicksort, again, only sorting a computed sort value and the entry's position in the original file.
Either way, writing a decent sorting method will be a complicated task, probably involving threading, forking, tempfiles or other techniques.
Here's some simplified demo code. It is far from optimized, but it demonstrates the idea of a sorted structure (here a singly linked list kept in order on insertion, rather than a true binary tree) that holds the sorting value and the position of the data in the file.
Be aware that this code will be both relatively slow, since every insertion walks the list, and memory intensive...
On the other hand, it will require about 24 bytes per entry. For 4 million entries, that's 96MB, somewhat better than 400MB and definitely better than 4GB.
#include <stdlib.h>
#include <stdio.h>
// assuming this is the file structure:
//
// First Name, Last Name,Age, ID
// Carlos,Lopez,,10568
// Brad, Patterson,,20586
// Zack, Morris,42,05689
//
// Then this might be your data structure per line:
struct DBEntry {
char* last_name; // a pointer to the last name
char* age; // a pointer to the age - could probably be an int
char* id; // a pointer to the ID
char first_name[1024]; // the actual buffer...
// I unified the first name and the buffer since the first name is first.
};
// this might be a sorting node for a sorted linked list:
struct SortNode {
struct SortNode* next; // a pointer to the next node
fpos_t position; // the DB entry's position in the file
long value; // The computed sorting value
}* top_sorting_node = NULL;
// this function will free all the memory used by the global sorting list
void clear_sort_heap(void) {
struct SortNode* node;
// as long as there is a first node...
while ((node = top_sorting_node)) {
// step forward.
top_sorting_node = top_sorting_node->next;
// free the original first node's memory
free(node);
}
}
// each time you read only a single line, perform an error check for overflow
// and return the parsed data.
//
// return 0 on success or -1 when no valid line could be read.
int read_db_line(FILE* fp, struct DBEntry* line) {
if (!fgets(line->first_name, 1024, fp))
return -1;
// parse data and review for possible overflow.
// first, zero out data
int pos = 0;
line->age = NULL;
line->id = NULL;
line->last_name = NULL;
// read each byte, looking for the EOL marker and the ',' separators
while (pos < 1024) {
if (line->first_name[pos] == ',') {
// we encountered a divider. we should handle it.
// if the ID field's location is already known, we have an excess comma.
if (line->id) {
fprintf(stderr, "Parsing error, invalid data - too many fields.\n");
clear_sort_heap();
exit(2);
}
// replace the comma with 0 (separate the strings)
line->first_name[pos] = 0;
if (line->age)
line->id = line->first_name + pos + 1;
else if (line->last_name)
line->age = line->first_name + pos + 1;
else
line->last_name = line->first_name + pos + 1;
} else if (line->first_name[pos] == '\n') {
// we encountered a terminator. we should handle it.
if (line->id) {
// if we have the id string's position (the start marker), this is a
// valid entry and we should process the data.
line->first_name[pos] = 0;
return 0;
} else {
// we reached an EOL without enough ',' separators, this is an invalid
// line.
fprintf(stderr, "Parsing error, invalid data - not enough fields.\n");
clear_sort_heap();
exit(1);
}
}
pos++;
}
// we ran through all the data but there was no EOL marker...
fprintf(stderr,
"Parsing error, invalid data (data overflow or data too large).\n");
return -1; // report failure; 0 would be treated as a successfully parsed line
}
// read and sort a single line from the database.
// return 0 if there was no data to sort. return 1 if data was read and sorted.
int sort_line(FILE* fp) {
// allocate the memory for the node - use calloc for zero-out data
struct SortNode* node = calloc(1, sizeof(*node));
if (!node)
return 0; // out of memory: stop reading
// store the position on file
fgetpos(fp, &node->position);
// use a stack allocated DBEntry for processing
struct DBEntry line;
// check that the read succeeded (read_db_line will return -1 on error)
if (read_db_line(fp, &line)) {
// free the node's memory
free(node);
// return no data (0)
return 0;
}
// compute sorting value - I'll assume all IDs are numbers up to long size.
sscanf(line.id, "%ld", &node->value);
// insert into a sorted, singly linked list (an insertion sort).
// Each insert walks the list, so this is simple rather than fast.
// I'll be using pointers to pointers, so it might be a headache to read
// (it's a headache to write, too...) ;-)
struct SortNode** tmp = &top_sorting_node;
// walk the list until we reach a node with a larger value than ours,
// OR until the list is finished.
while (*tmp && (*tmp)->value <= node->value)
tmp = &((*tmp)->next);
// update the node's `next` value.
node->next = *tmp;
// inject the new node into the list at the position we found
*tmp = node;
// return 1 (data was read and sorted)
return 1;
}
// writes the next line in the sorting
int write_line(FILE* to, FILE* from) {
struct SortNode* node = top_sorting_node;
if (!node) // are we done? top_sorting_node == NULL ?
return 0; // return 0 - no data to write
// step top_sorting_node forward
top_sorting_node = top_sorting_node->next;
// read data from one file to the other
fsetpos(from, &node->position);
free(node); // this node has been unlinked from the list, so release it
char* buffer = NULL;
ssize_t length;
size_t buff_size = 0;
length = getline(&buffer, &buff_size, from);
if (length <= 0) {
perror("Line Copy Error - Couldn't read data");
free(buffer); // getline may have allocated even on failure
return 0;
}
fwrite(buffer, 1, length, to);
free(buffer); // getline allocates memory that we're in charge of freeing.
return 1;
}
// the main program
int main(int argc, char const* argv[]) {
// open file
FILE *fp_read, *fp_write;
fp_read = fopen("workersinfo.txt", "r");
fp_write = fopen("sorted_workersinfo.txt", "w+");
if (!fp_read) {
perror("File Error");
goto cleanup;
}
if (!fp_write) {
perror("File Error");
goto cleanup;
}
printf("\nSorting");
while (sort_line(fp_read))
printf(".");
// write all sorted data to a new file
printf("\n\nWriting sorted data");
while (write_line(fp_write, fp_read))
printf(".");
// clean up - close files and make sure the sorting tree is cleared
cleanup:
printf("\n");
if (fp_read)
fclose(fp_read); // fclose(NULL) is undefined, so only close what opened
if (fp_write)
fclose(fp_write);
clear_sort_heap();
return 0;
}
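As an alternative to the ordered list above, the same value-plus-position index could be collected into an array and handed to qsort, which avoids walking a list on every insert. This is only a sketch under a few assumptions: IndexEntry, by_id and build_sorted_index are hypothetical names, the ID is taken to be the last comma-separated field, lines are shorter than 1024 bytes, and file offsets fit in a long.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index entry: the sort key plus where the line starts. */
struct IndexEntry {
    long id;
    long offset;   /* from ftell(); assumes the file fits in a long */
};

static int by_id(const void *a, const void *b)
{
    const struct IndexEntry *x = a, *y = b;
    return (x->id > y->id) - (x->id < y->id);
}

/* Build an index of (id, offset) for every line whose last
 * comma-separated field parses as a number, then sort it by id.
 * Returns the number of entries, or -1 on allocation failure. */
long build_sorted_index(FILE *fp, struct IndexEntry **out)
{
    struct IndexEntry *idx = NULL;
    long count = 0, cap = 0;
    char line[1024];

    for (;;) {
        long pos = ftell(fp);          /* remember where this line starts */
        if (!fgets(line, sizeof line, fp))
            break;
        char *comma = strrchr(line, ',');
        char *end;
        if (comma == NULL)
            continue;                  /* not a data line */
        long id = strtol(comma + 1, &end, 10);
        if (end == comma + 1)
            continue;                  /* no digits, e.g. the header row */
        if (count == cap) {
            long ncap = cap ? cap * 2 : 1024;
            struct IndexEntry *tmp = realloc(idx, (size_t)ncap * sizeof *tmp);
            if (tmp == NULL) {
                free(idx);
                return -1;
            }
            idx = tmp;
            cap = ncap;
        }
        idx[count].id = id;
        idx[count].offset = pos;
        count++;
    }
    if (count > 0)
        qsort(idx, (size_t)count, sizeof *idx, by_id);
    *out = idx;
    return count;
}
```

Writing the sorted file would then be a loop over the index that seeks to each offset with fseek and copies one line. On a typical 64-bit Linux or macOS build each entry takes 16 bytes, so 4 million entries would need roughly 64MB.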

Bus Error on void function return

I'm learning to use libcurl in C. To start, I'm using a randomized list of accession names to search for protein sequence files that may be found hosted here. These follow a set format: the first line is of variable length (and contains no information I'm trying to query), followed by a series of capitalized letters with a new line every sixty (60) characters (this is what I want to pull down, but reformatted to eighty (80) characters per line).
I have the call itself in a single function:
//finds and saves the fastas for each protein (assuming one exists)
void pullFasta (proteinEntry *entry, char matchType, FILE *outFile) {
//Local variables
URL_FILE *handle;
char buffer[2] = "", url[32] = "http://www.uniprot.org/uniprot/", sequence[2] = "";
//Build full URL
/*printf ("u:%s\nt:%s\n", url, entry->title); /*This line was used for debugging.*/
strcat (url, entry->title);
strcat (url, ".fasta");
//Open URL
/*printf ("u:%s\n", url); /*This line was used for debugging.*/
handle = url_fopen (url, "r");
//If there is data there
if (handle != NULL) {
//Skip the first line as it's got useless info
do {
url_fread(buffer, 1, 1, handle);
} while (buffer[0] != '\n');
//Grab the fasta data, skipping newline characters
while (!url_feof (handle)) {
url_fread(buffer, 1, 1, handle);
if (buffer[0] != '\n') {
strcat (sequence, buffer);
}
}
//Print it
printFastaEntry (entry->title, sequence, matchType, outFile);
}
url_fclose (handle);
return;
}
With proteinEntry being defined as:
//Entry for fasta formatable data
typedef struct proteinEntry {
char title[7];
struct proteinEntry *next;
} proteinEntry;
And the url_fopen, url_fclose, url_feof, url_fread, and URL_FILE code found here; they mimic the file functions for which they are named.
As you can see I've been doing some debugging with the URL generator (uniprot URLs follow the same format for different proteins), I got it working properly and can pull down the data from the site and save it to file in the proper format that I want. I set the read buffer to 1 because I wanted to get a program that was very simplistic but functional (if inelegant) before I start playing with things, so I would have a base to return to as I learned.
I've tested the url_<function> calls and they are giving no errors. So I added incremental printf calls after each line to identify exactly where the bus error is occurring and it is happening at return;.
My understanding of bus errors is that it's a memory access issue wherein I'm trying to get at memory that my program doesn't have control over. My confusion comes from the fact that this is happening at the return of a void function. There's nothing being read, written, or passed to trigger the memory error (as far as I understand it, at least).
Can anyone point me in the right direction to fix my mistake please?
EDIT: As @BLUEPIXY pointed out, I had a potential url_fclose (NULL). As @deltheil pointed out, I had sequence as a static array. This also made me notice I was repeating my bad memory allocation for url, so I updated it and it now works. Thanks for your help!
If we look at e.g. http://www.uniprot.org/uniprot/Q6GZX1.fasta and skip the first line (as you do) we have:
MNAKYDTDQGVGRMLFLGTIGLAVVVGGLMAYGYYYDGKTPSSGTSFHTASPSFSSRYRY
which is a 60-character string.
When you try to read this sequence with:
//Grab the fasta data, skipping newline characters
while (!url_feof (handle)) {
url_fread(buffer, 1, 1, handle);
if (buffer[0] != '\n') {
strcat (sequence, buffer);
}
}
The problem is sequence is not expandable and not large enough (it is a fixed length array of size 2).
So make sure to choose a large enough size to hold any sequence, or implement the ability to expand it on-the-fly.
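Expanding on the fly could look roughly like the sketch below, which doubles the capacity whenever the buffer fills up. seq_buf and seq_push are hypothetical names and are not part of libcurl or of the url_fopen example code.

```c
#include <stdlib.h>

/* Hypothetical growable byte buffer. */
struct seq_buf {
    char *data;    /* always NUL-terminated once non-NULL */
    size_t len;    /* bytes stored, excluding the terminator */
    size_t cap;    /* bytes allocated */
};

/* Append one character, doubling the capacity when full.
 * Returns 0 on success, -1 if memory ran out (buffer left intact). */
int seq_push(struct seq_buf *b, char c)
{
    if (b->len + 2 > b->cap) {         /* room for c and the terminator? */
        size_t ncap = b->cap ? b->cap * 2 : 64;
        char *tmp = realloc(b->data, ncap);
        if (tmp == NULL)
            return -1;
        b->data = tmp;
        b->cap = ncap;
    }
    b->data[b->len++] = c;
    b->data[b->len] = '\0';
    return 0;
}
```

In the read loop, a zero-initialized struct seq_buf would replace sequence[2], and strcat (sequence, buffer) would become seq_push(&sequence, buffer[0]). The url array has the same problem: its 32 bytes are already full before the first strcat.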

All objects have the same name

I'm doing my final project for my algorithms course in C. For the project, we have to take an input text file that contains lines like:
P|A|0
or
E|0|1|2
The former indicates a vertex to be added to the graph we're using in the program, the 2nd token being the name of the vertex, and the last token being its index in the vertices[] array of the graph struct.
I've got a while loop going through this program line by line, it takes the first token to decide whether to make a vertex or an edge, and then proceeds accordingly.
When I finish the file traversal, I call my show_vertices function, which is just a for-loop that prints each name (g->vertices[i].name) sequentially.
The problem is that where the name should go in the output (%s), I keep getting the last "token1" I collected. In the case of the particular input file I'm using it happens to be the source node of the last edge in the list...which is odd because there are two other values passed through the strtok() function afterward. The line in the file looks like:
E|6|7|1
which creates an edge from indexes 6 to 7 with a weight of 1. The edge comes up fine. But when I call any printf with a %s, it comes up "6". Regardless.
This is the file traversal.
fgets(currLn, sizeof(currLn), infile);
maxv = atoi(currLn);
if(maxv == 0)
{
//file not formatted correctly, print error message
return;
}
t_graph *g = new_graph(maxv, TRUE);
while((fgets(currLn, sizeof(currLn), infile)) != NULL)
{
token1 = strtok(currLn, "|");
key = token1[0];
if(key == 'P' || key == 'p')
{
token1 = strtok(NULL, "|");
if(!add_vertex(g, token1))
{
//file integration fail, throw error!
return;
}
//***If I print the name here, it works fine and gives me the right name!****
continue;
}
if(key == 'E' || key == 'e')
{
token1 = strtok(NULL, "|");
token2 = strtok(NULL, "|");
token3 = strtok(NULL, "|");
src = atoi(token1);
dst = atoi(token2);
w = atoi(token3);
if(!add_edge(g, src, dst, w))
{
//file integration fail, throw error
return;
}
continue;
}
else
{
//epic error message because user doesn't know what they're doing.
return;
}
}
If I run show_vertices here, I get:
0. 6
1. 6
2. 6
etc...
You aren't copying the name. So you end up with a pointer (returned by strtok) into the single buffer, currLn, into which you read each line. Since the name is always at offset 2, that pointer will always be currLn+2. When you traverse and print, every vertex shows whatever currLn holds at that offset after the last line was read.
You need to strdup(token1) before passing it to (or in) add_vertex.
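A sketch of what copying inside the vertex-adding step might look like is shown below. The real t_graph and add_vertex are not shown in the question, so struct vertex and parse_vertex are hypothetical, and the malloc plus memcpy pair simply does what strdup would do.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical vertex record; the real t_graph layout is not shown. */
struct vertex {
    char *name;    /* owned copy, released with free() */
};

/* Parse a "P|name|index" line and store a private copy of the name.
 * The line is modified by strtok. Returns 1 on success, 0 otherwise. */
int parse_vertex(char *line, struct vertex *v)
{
    char *key = strtok(line, "|");
    char *name = strtok(NULL, "|");

    if (key == NULL || name == NULL || (key[0] != 'P' && key[0] != 'p'))
        return 0;
    size_t n = strlen(name) + 1;   /* include the terminator */
    v->name = malloc(n);           /* what strdup(name) would do */
    if (v->name == NULL)
        return 0;
    memcpy(v->name, name, n);
    return 1;
}
```

Because each vertex now owns its own string, reading the next line into the same buffer no longer changes names that were stored earlier.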
No, there isn't enough information to be certain this is the answer. But I'll bet money this is it.
