I'm using lex to implement a scanner, and I want to build a symbol table while parsing. I have two structs, SymbolEntry and SymbolTable (below). Most of the time, when I call my function for inserting a symbol (registerID, also below), I have all the information for the entry. However, when I have a constant I also want to store its value, which is not yet available when I first create the entry. When I try to change the entry's value later in the code, I invalidate the whole memory block used by that entry, and the name and value print garbage.
Here are the two structs:
typedef struct{
char* type;
char* name;
char* value;
} SymbolEntry;
typedef struct{
SymbolEntry *entries;
size_t size;
size_t capacity;
} SymbolTable;
This is the registerID function, called when an {id} is matched. yytext contains the ID.
int registerID(char* type){
//create a new symbol entry with the specified type and name and a default value
SymbolEntry e;
e.type = type;
e.name = (char *)calloc(yyleng+1, sizeof(char));
strcpy(e.name, yytext);
e.value = "";
prevSym = insertSymbol(&table, e);
return prevSym;
}
This is the relevant code for insertSymbol(SymbolTable* st, SymbolEntry entry). pos is always the last element in the array when inserting (otherwise the entry isn't unique and pos is just returned).
st->entries[pos].name = (char *)calloc(strlen(entry.name)+1, sizeof(char));
st->entries[pos].type = (char *)calloc(strlen(entry.type)+1, sizeof(char));
st->entries[pos].value = (char *)calloc(strlen(entry.value)+1, sizeof(char));
strcpy(st->entries[pos].name, entry.name);
strcpy(st->entries[pos].type, entry.type);
strcpy(st->entries[pos].value, entry.value);
Later, after the lex framework has matched the value immediately following a CONSTANT's name, this code is executed (directly in the rule for <CONSTANT_VAL>{number}):
table.entries[prevSym].value = (char *)calloc(yyleng+1, sizeof(char));
strcpy(table.entries[prevSym].value, yytext);
Why does this invalidate the SymbolEntry at this position in the array, and how can I safely change the contents of value?
EDIT:
It doesn't only happen with constants. The first two SymbolEntrys are always garbage. I'm assuming that probably means they ALL are, but the others just haven't been overwritten.
Also, it seems like subsequent calls to registerID are causing the data to get corrupted. With just 9 symbols, only the first two are garbage; with 34, it's the first 7. Adding more text to parse without variables did not cause any issues.
SOLVED
Well it turns out that I just accidentally deleted a line somewhere along the way and that's what introduced the bug. I accidentally erased my call to initSymbolTable. Thanks to chux for asking me how I initialized the table. Sorry about that.
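For readers hitting the same symptom: the original initSymbolTable is not shown, so the following is only a minimal sketch of what such an initializer might look like. The signature, the reserveSlot helper and the doubling growth policy are all assumptions. Without a step like this, entries, size and capacity hold indeterminate values, and every later write through st->entries is undefined behaviour.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    char *type;
    char *name;
    char *value;
} SymbolEntry;

typedef struct {
    SymbolEntry *entries;
    size_t size;
    size_t capacity;
} SymbolTable;

/* Hypothetical initializer: the original initSymbolTable is not shown. */
int initSymbolTable(SymbolTable *st, size_t initial_capacity)
{
    assert(st != NULL && initial_capacity > 0);
    st->entries = calloc(initial_capacity, sizeof *st->entries);
    if (st->entries == NULL)
        return -1;
    st->size = 0;
    st->capacity = initial_capacity;
    return 0;
}

/* Hypothetical helper: make room for one more entry, doubling when full. */
int reserveSlot(SymbolTable *st)
{
    assert(st != NULL && st->entries != NULL);
    if (st->size == st->capacity) {
        size_t new_capacity = st->capacity * 2;
        SymbolEntry *grown = realloc(st->entries, new_capacity * sizeof *grown);
        if (grown == NULL)
            return -1;          /* the old block is still valid */
        st->entries = grown;
        st->capacity = new_capacity;
    }
    return 0;
}
```

Calling reserveSlot before writing to st->entries[st->size] keeps the insert from running past the end of the array.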
2 potential problems.
1 - Compare
// Fields set with non-malloc'ed memory
e.type = type;
e.value = "";
// Fields set with malloc'ed memory
st->entries[pos].type = (char *)calloc(strlen(entry.type)+1, sizeof(char));
st->entries[pos].value = (char *)calloc(strlen(entry.value)+1, sizeof(char));
strcpy(st->entries[pos].type, entry.type);
strcpy(st->entries[pos].value, entry.value);
Both of these set the fields to valid memory; the second case allocates the memory dynamically and fills it. The concern is subsequent use: how does the OP know to free() or realloc() the second kind and not the first? A further concern: with registerID(char* type), how do we know the value passed as type is still valid much later, when that pointer is used via the field type? Suggest:
e.type = strdup(type); // or the usual strlen()+1, malloc() and copy
e.value = strdup("");
2 - The type and setting of yyleng are not shown. Maybe it is not big enough as compared to strlen(e.name), etc.?
[Edit] After review, I really think e.type = type; is the problem. e.type needs its own copy of type.
Minor: Consider
// st->entries[pos].type = (char *)calloc(strlen(entry.type)+1, sizeof(char));
// strcpy(st->entries[pos].type, entry.type);
size_t Length = strlen(entry.type) + 1;
st->entries[pos].type = malloc(Length);
memcpy(st->entries[pos].type, entry.type, Length);
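To make the ownership point concrete, here is a rough sketch in which every field of an entry always points at heap memory that the entry owns, so the value can be replaced later without guessing whether the old pointer may be freed. The helper names (copyString, entryInit, entrySetValue, entryFree) are made up for illustration and are not from the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *type;
    char *name;
    char *value;
} SymbolEntry;

/* Hypothetical helper: return a heap copy of s, or NULL on allocation failure. */
static char *copyString(const char *s)
{
    size_t length = strlen(s) + 1;
    char *copy = malloc(length);
    if (copy != NULL)
        memcpy(copy, s, length);
    return copy;
}

/* Hypothetical helper: fill an entry so that every field owns its memory. */
int entryInit(SymbolEntry *e, const char *type, const char *name)
{
    assert(e != NULL && type != NULL && name != NULL);
    e->type = copyString(type);
    e->name = copyString(name);
    e->value = copyString("");
    if (e->type == NULL || e->name == NULL || e->value == NULL) {
        free(e->type);
        free(e->name);
        free(e->value);
        return -1;
    }
    return 0;
}

/* Hypothetical helper: replace the value, releasing the old one only on success. */
int entrySetValue(SymbolEntry *e, const char *value)
{
    assert(e != NULL && value != NULL);
    char *copy = copyString(value);
    if (copy == NULL)
        return -1;
    free(e->value);
    e->value = copy;
    return 0;
}

/* Hypothetical helper: release everything the entry owns. */
void entryFree(SymbolEntry *e)
{
    free(e->type);
    free(e->name);
    free(e->value);
}
```

In the lex rule for the constant's value, a call such as entrySetValue(&table.entries[prevSym], yytext) would then replace the empty default without leaking it.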
Related
I struggled to write a correct title for this post. Forgive me if it is not 100% accurate.
The initial issue was just freeing a malloced output without disturbing the variable it was assigned to. I then decided to copy the source (encrypt_Data) into another variable before freeing it.
That raised another issue, which is where I am now. If I could find a proper solution for at least one of them, it would be great.
Issue#1
typedef struct {
const char* sTopic;
const char* pData;
} CLIENT_MESSAGE;
CLIENT_MESSAGE Publish;
char * pData = "Hello World!";
char * encrypt_Data = Encrypt_Data_Base64(pData);
Publish.pData = encrypt_Data;
free(encrypt_Data);
If I free encrypt_Data, Publish.pData is also freed (both are just pointers to the same memory location).
Note that the function Encrypt_Data_Base64 calls several nested functions underneath and returns malloced output. This is why I try to free the memory that comes from it.
So I decided to make a copy of encrypt_Data so that I can then free the original.
Issue#1 solving attempt
char * pData = "Hello World!";
char * encrypt_Data = Encrypt_Data_Base64(pData);
// ------- addition starts ------
int len = strlen(encrypt_Data);
char temp[len+1];
char * pTemp = temp;
memcpy(pTemp, encrypt_Data, len+1);
pTemp[len] = '\0';
// ------- addition ends------
Publish.pData = pTemp;
free(encrypt_Data);
The struct variable's value is preserved. So far so good.
And then I have to pass the struct to a library function (I don't have source code for it).
Issue#2
CLIENT_Publish(&Publish); // This is how it supposed to be.
//Prototype: int CLIENT_Publish(CLIENT_MESSAGE* pPublish);
This time, when I debug, the struct value has already been altered as soon as my current function is left and that one is entered, before it does anything else. I assumed this might be related to a non-terminated string, so I added NUL termination, as you can see in the solving attempt above. But it didn't help.
[Image: array content before leaving the function (the required block is between indices 0 and 12)]
[Image: array content when entering the other function (CLIENT_Publish)]
Since I can't do much about the library part, I have to do something in the part I can control.
EDIT:
If I get my value without using this line
char * encrypt_Data = Encrypt_Data_Base64(pData);
for example;
AFunction_GetPtr(&pData);
Publish.pData = pData;
CLIENT_Publish(&Publish);
This way it works nicely. But I would like to intercept the value coming from AFunction_GetPtr, use it in Encrypt_Data_Base64, and then pass it to CLIENT_Publish.
Any input is highly appreciated.
It is not the correct solution, but the simplest thing for you to do right now is:
char * pData = "Hello World!";
char * encrypt_Data = Encrypt_Data_Base64(pData);
Publish.pData = strdup(encrypt_Data);
free(encrypt_Data);
Now that you've made another copy of the data, you'll need to eventually free it. So you might as well just do:
char * pData = "Hello World!";
char * encrypt_Data = Encrypt_Data_Base64(pData);
Publish.pData = encrypt_Data;
/* Do not free(encrypt_Data); */
Just remember to free Publish.pData when you no longer need it.
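As a rough sketch of that ordering: keep the encrypted buffer alive until the publish call has returned, and only then free it. The real Encrypt_Data_Base64 and CLIENT_Publish are not available here, so fakeEncrypt and fakePublish below are stand-ins invented for illustration, and the sketch assumes the library does not keep the pointer after the call returns. This also suggests why the copy into a local array did not help: that array stops existing once the enclosing function returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *sTopic;
    const char *pData;
} CLIENT_MESSAGE;

/* Stand-in for Encrypt_Data_Base64: returns a malloc'ed string (here just a copy). */
static char *fakeEncrypt(const char *input)
{
    size_t length = strlen(input) + 1;
    char *out = malloc(length);
    if (out != NULL)
        memcpy(out, input, length);
    return out;
}

/* Stand-in for CLIENT_Publish: reports the payload length it was given. */
static int fakePublish(const CLIENT_MESSAGE *message)
{
    return (int)strlen(message->pData);
}

/* Encrypt, publish, and only then free the buffer the message points at. */
int publishEncrypted(const char *topic, const char *plain)
{
    assert(topic != NULL && plain != NULL);
    char *encrypted = fakeEncrypt(plain);
    if (encrypted == NULL)
        return -1;

    CLIENT_MESSAGE message;
    message.sTopic = topic;
    message.pData = encrypted;   /* the buffer must stay alive during the call */

    int result = fakePublish(&message);
    free(encrypted);             /* safe: nothing uses message.pData any more */
    return result;
}
```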
I'm working on a program in C and one of my key functions is defined as follows:
void changeIndex(char* current_index)
{
char temp_index[41]; // note: same size as current_index
// do stuff with temp_index (inserting characters and such)
current_index = temp_index;
}
However, this function has no effect on current_index. I thought I found a fix and tried changing the last line to
strcpy(current_index, temp_index)
but this gave me yet another error. Can anyone spot what I'm doing wrong here? I basically just want to set the contents of current_index equal to that of temp_index at each call of changeIndex.
If more information is needed, please let me know.
strcpy should work if current_index points to allocated memory of sufficient size. Consider the following example, where changeIndex requires an additional parameter: the size of the destination string.
void changeIndex(char* current_index, int max_length)
{
// check the destination memory
if(current_index == NULL)
{
return; // do nothing
}
char temp_index[41];
// do stuff with temp_index (inserting characters and such)
// copy to external memory, that should be allocated
strncpy(current_index, temp_index, max_length-1);
current_index[max_length-1] = '\0';
}
Note: strncpy is better for the case when temp_index is longer than current_index.
Examples of usage:
// example with automatic memory
char str[20];
changeIndex(str, 20);
// example with dynamic memory
char * ptr = (char *) malloc(50);
changeIndex(ptr, 50);
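A possible variation on the same idea uses snprintf, which always NUL-terminates (for a non-zero size) and reports how many characters it wanted to write, so truncation can be detected. The function name changeIndexBounded and the placeholder "prefix with '#'" step are assumptions, standing in for whatever the real function does with temp_index.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of changeIndex: builds the new index locally, then copies
   it into the caller's buffer. Returns 0 on success, -1 if it had to truncate. */
int changeIndexBounded(char *current_index, size_t capacity)
{
    assert(current_index != NULL && capacity > 0);

    char temp_index[41];
    /* Placeholder for the real work: prefix the current contents with '#'. */
    snprintf(temp_index, sizeof temp_index, "#%s", current_index);

    /* Copy back into the caller's storage, never writing past capacity. */
    int needed = snprintf(current_index, capacity, "%s", temp_index);
    return (needed >= 0 && (size_t)needed < capacity) ? 0 : -1;
}
```

The caller passes sizeof of its own array (or the allocated size), for example changeIndexBounded(str, sizeof str).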
Obviously defining a local char array on the stack and returning a pointer to it is wrong. You should never do that as the memory is not defined after the function ends.
In addition to the previous answers, which use strncpy into a caller-supplied char pointer (which seems unsafe in my opinion) or malloc (which is safer, but you need to remember to free the memory outside the function, and it is inconsistent with the hierarchy of the program), you can do the following:
char* changeIndex()
{
static char temp_index[41]; // note: same size as current_index
// do stuff with temp_index (inserting characters and such)
return temp_index;
}
As the char array is static it will not be undefined at the end of the function and you do not need to remember to free the pointer at the end of the use.
Caveat: if you are using multiple threads, you cannot use this option, as the static memory could be changed by different threads entering the function at the same time.
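One more thing to keep in mind with the static-buffer approach, even in a single thread: every call returns the same storage, so a later call overwrites the earlier result. A small sketch follows; formatIndex is a made-up example function, not the original changeIndex.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical example: formats a number into a static buffer and returns it. */
const char *formatIndex(int n)
{
    static char temp_index[41];
    snprintf(temp_index, sizeof temp_index, "index-%d", n);
    return temp_index;   /* same address on every call */
}
```

If an earlier result is still needed after the next call, the caller has to copy it into its own buffer first.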
Your array temp_index is local to the function, so assigning it to current_index does not give the caller what you want.
You can also use the function strdup, which returns the start of a newly allocated copy of the string, or NULL if an error occurred. Its prototype is char *strdup(const char *).
char temp[] = "fruit";
char *line = strdup(temp );
I am creating a symbol table for a compiler I am writing and when I try adding to my symbol table I keep getting valgrind errors. When I call my function, I am calling my add function
stAdd (&sSymbolTable, "test", RSRVWRD, 4, 9);
and in my stAdd function it is currently
void stAdd (StPtr psSymbolTable, char *identifier, SymbolTableType type,
int addressField, int arrayDimensions)
{
int hashValue;
hashValue = hash (identifier, psSymbolTable->numBuckets);
if (psSymbolTable->spSymbolTable[hashValue] == NULL)
{
psSymbolTable->spSymbolTable[hashValue] = (StEntryPtr) malloc (sizeof(StEntry));
strcpy (psSymbolTable->spSymbolTable[hashValue]->identifier, identifier);
psSymbolTable->spSymbolTable[hashValue]->entryLevel = psSymbolTable->currentLevel;
psSymbolTable->spSymbolTable[hashValue]->type = type;
psSymbolTable->spSymbolTable[hashValue]->addressField = addressField;
psSymbolTable->spSymbolTable[hashValue]->arrayDimensions = arrayDimensions;
psSymbolTable->spSymbolTable[hashValue]->psNext = NULL;
}
}
But every time I set a value within my StEntry struct, I get the error
Use of uninitialised value of size 8
This happens every time I set something within the if statement. Does anyone see where I am going wrong?
My StEntry is
typedef struct StEntry
{
char identifier[32];
SymbolTableLevel entryLevel;
SymbolTableType type;
int addressField;
int arrayDimensions;
StEntryPtr psNext;
} StEntry;
This would be a lot easier if I could see the definition of struct StEntry or even the precise valgrind error. But I'll take a wild guess anyway, because I'm feeling overconfident.
Here, you malloc a new StEntry which you will proceed to fill in:
psSymbolTable->spSymbolTable[hashValue] = (StEntryPtr) malloc (sizeof(StEntry));
This is C, by the way. You don't need to cast the result of the malloc, and it is generally a good idea not to do so. Personally, I'd prefer:
StEntry* new_entry = malloc(sizeof *new_entry);
// Fill in the fields in new_entry
psSymbolTable->spSymbolTable[hashValue] = new_entry;
And actually, I'd ditch the Hungarian prefixes, too, but that's an entirely other discussion, which is primarily opinion-based. But I digress.
The next thing you do is:
strcpy (psSymbolTable->spSymbolTable[hashValue]->identifier, identifier);
Now, psSymbolTable->spSymbolTable[hashValue]->identifier might well be a char *, which will point to the character string of the identifier corresponding to this symbol table entry. So it's a pointer. But what is its value? Answer: it doesn't have one. It's sitting in a block of malloc'd and uninitialized memory.
So when strcpy tries to use it as the address of a character string... well, watch out for the flying lizards. (If that's the problem, you could fix it in a flash by using strdup instead of strcpy.)
Now, I could well be wrong. Maybe the identifier member is not char*, but rather char[8]. Then there is no problem with what it points to, but there's also nothing stopping the strcpy from writing beyond its end. So either way, there's something ungainly about that line, which needs to be fixed.
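Here is a small sketch of a more defensive version, under the assumption that identifier is a fixed-size char array as in the posted StEntry. The Entry type below is a cut-down stand-in, and entryCreate and bucketsCreate are hypothetical helpers. It rejects identifiers that would not fit and makes sure every bucket pointer starts out as NULL. An uninitialised bucket array is one plausible source of a pointer-sized "uninitialised value" report, though the posted code does not show how the table is set up.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define IDENTIFIER_CAPACITY 32

/* Reduced, hypothetical version of StEntry with only the fields used here. */
typedef struct Entry {
    char identifier[IDENTIFIER_CAPACITY];
    int addressField;
    struct Entry *psNext;
} Entry;

/* Allocate an entry; reject identifiers that would not fit. */
Entry *entryCreate(const char *identifier, int addressField)
{
    assert(identifier != NULL);
    size_t length = strlen(identifier);
    if (length >= IDENTIFIER_CAPACITY)
        return NULL;                          /* would overflow identifier[] */

    Entry *entry = calloc(1, sizeof *entry);  /* all bytes start at zero */
    if (entry == NULL)
        return NULL;

    memcpy(entry->identifier, identifier, length + 1);
    entry->addressField = addressField;
    entry->psNext = NULL;
    return entry;
}

/* Allocate the bucket array and set every bucket to NULL explicitly. */
Entry **bucketsCreate(size_t numBuckets)
{
    Entry **buckets = malloc(numBuckets * sizeof *buckets);
    if (buckets == NULL)
        return NULL;
    for (size_t i = 0; i < numBuckets; i++)
        buckets[i] = NULL;
    return buckets;
}
```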
Hello, I am experiencing something with structures in C whose principle I really don't understand.
One of my structures contains two character strings (named 'seq' and 'foldedSeq'). Both strings should have the same dimensions.
However, when I try to modify one, the second automatically receives the same modification at the same position in the string.
Here is the interesting chunk of code:
typedef struct MD {
int nb_line;
int nb_colomn;
EM ** matrix;
char * seq; // Initial sequence.
char * foldedSeq;
} MD;
void set_sequences(MD * M, char * seq) {
M->seq = seq;
M->foldedSeq = M->seq; //Purpose: give to foldedSeq the seq dimensions (perhaps it is useless).
printf("seq= %s\tstrlen= %zu\nM->seq= %s\nM->foldedSeq= %s\n", seq, strlen(seq), M->seq, M->foldedSeq);
// Up to this point 'seq' = 'foldedSeq'
int i;
for( i = 0; i < strlen(seq); i++) {
M->foldedSeq[i] = '-'; // Original purpose: make 'foldedSeq' string filled with hyphens only.
}
printf("seq= %s\tstrlen= %zu\nM->seq= %s\nM->foldedSeq= %s\n", seq, strlen(seq), M->seq, M->foldedSeq);
// Here is the problem: the string 'seq' REALLY IS modified alongside with 'foldedSeq'... WHY? :(
}
Since I wrote that "M->foldedSeq[i]" should be modified, why would "M->seq[i]" be modified as well?
Thank you for reading and providing me explanations, my logic found a dead end here.
M->seq = seq;
M->foldedSeq = M->seq;
is the same as saying
M->seq = seq;
M->foldedSeq = seq;
They are both pointing to the same location in memory. So modifying one is modifying both.
Probably what you want to do instead is malloc a block of memory that is the same length as the other.
M->foldedSeq = calloc(strlen(seq) + 1, sizeof(char));
What you're witnessing is simple pointer aliasing, a basic feature of the C language. Because you explicitly assign both the seq and foldedSeq members to point to the same bit of memory, modifications through one pointer will be seen through the other. If that's not what you intended, you need to copy the memory block of seq before assigning it to foldedSeq to keep the two distinct.
Because they both point to the same memory address and when you modify one you are modifying the other.
This assignment: M->foldedSeq = M->seq; is just assigning memory locations, not doing any sort of copy.
If you want to keep them separate, you will have to allocate memory and copy the string into the new memory.
Try either:
M->foldedSeq = strdup(M->seq); if you want to copy the content too.
Or:
M->foldedSeq = malloc(strlen(M->seq) + 1); to just have a new memory space of the same size.
This line:
M->foldedSeq = M->seq;
is setting the foldedSeq pointer to the same value as seq. It is not creating new space and copying the contents of seq to foldedSeq which is maybe where the confusion is. So when you modify either one the other will be modified as well. One possible solution is to use strdup:
M->foldedSeq = strdup( M->seq ) ;
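Putting the answers together, a minimal sketch might look like this. The Sequences struct is a cut-down stand-in for MD, and sequencesSet and sequencesFree are hypothetical names. Each member gets its own allocation, so writing to foldedSeq can no longer touch seq.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reduced, hypothetical version of MD with only the two strings. */
typedef struct {
    char *seq;
    char *foldedSeq;
} Sequences;

/* Copy seq and build an independent, same-length string of hyphens. */
int sequencesSet(Sequences *s, const char *seq)
{
    assert(s != NULL && seq != NULL);
    size_t length = strlen(seq);

    char *copy = malloc(length + 1);
    char *folded = malloc(length + 1);
    if (copy == NULL || folded == NULL) {
        free(copy);
        free(folded);
        return -1;
    }

    memcpy(copy, seq, length + 1);   /* includes the terminating NUL */
    memset(folded, '-', length);
    folded[length] = '\0';

    s->seq = copy;
    s->foldedSeq = folded;
    return 0;
}

/* Release both strings owned by the struct. */
void sequencesFree(Sequences *s)
{
    free(s->seq);
    free(s->foldedSeq);
    s->seq = s->foldedSeq = NULL;
}
```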
I use pointers to hold the name and research lab properties. But when I print the existing vertices, I can't see these attributes properly.
For example, although the real value of name is "lancelot", I see something wrong such as "asdasdasdasd".
struct vertex {
int value;
char*name;
char* researchLab;
struct vertex *next;
struct edge *list;
};
void GRAPHinsertV(Graph G, int value,char*name,char*researchLab) {
//create new Vertex.
Vertex newV = malloc(sizeof newV);
// set value of new variable to which belongs the person.
newV->value = value;
newV->name=name;
newV->researchLab=researchLab;
newV->next = G->head;
newV->list = NULL;
G->head = newV;
G->V++;
}
/***
The method creates new person.
**/
void createNewPerson(Graph G) {
int id;
char name[30];
char researchLab[30];
// get required variables.
printf("Enter id of the person to be added.\n");
scanf("%d",&id);
printf("Enter name of the person to be added.\n");
scanf("%s",name);
printf("Enter research lab of the person to be added\n");
scanf("%s",researchLab);
// insert the people to the social network.
GRAPHinsertV(G,id,name,researchLab);
}
void ListAllPeople(Graph G)
{
Vertex tmp;
Edge list;
for(tmp = G->head;tmp!=NULL;tmp=tmp->next)
{
fprintf(stdout,"V:%d\t%s\t%s\n",tmp->value,tmp->name,tmp->researchLab);
}
system("pause");
}
When you do this:
newV->name=name;
newV->researchLab=researchLab;
You are copying the pointer to the strings name and researchLab. You are not copying the strings themselves. In other words, after this, newV->name and name point to exactly the same location in memory where the name is stored; you have not created a duplicate copy of the data.
Since you then proceed to overwrite the name array in the createNewPerson function, at the end of this function, all of your vertex structs will have their name attribute pointing to the same memory location, which is only storing the last name entered.
Worse, when createNewPerson returns, its local name array goes out of scope, and is re-used for other things. Since your vertex structs are still pointing here for their name attributes, this is how you get garbage.
You need to duplicate the string. A simple way to do it is:
newV->name = strdup(name);
You will need to #include <string.h> to get the strdup library function.
And then you also need to make sure that you call free on the name attribute whenever you are disposing of a vertex structure.
GRAPHinsertV copies the pointers to the name and researchLab strings into the vertex structure.
createNewPerson creates a temporary for the name and researchLab strings.
The problem here is, you're pointing to a temporary string which causes undefined behaviour when you access it after createNewPerson returns.
To solve this problem, you can duplicate the strings in GRAPHinsertV using malloc+strcpy, or by using the non-standard strdup.
The name variable you pass to GRAPHinsertV() is allocated on the stack for createNewPerson(), so the pointer points to a local variable. Once the activation records are popped off, that value can (and will) be overwritten by subsequent code.
You need to allocate memory on the heap if you are only going to keep a char * in the struct.
Ex. Instead of
char name[30];
you could use
char *name = (char *)malloc(30*sizeof(char));
but keep in mind if you manually allocate it you have to take care of freeing it as well, otherwise it will have a memory leak.
When you assign the char *name pointer, like
newV->name=name;
You're not creating a new string, but making the newV->name member point to the same memory as the char[] array that was passed in. You'll need to malloc() or otherwise allocate a new char[] array in order to obtain separate storage for each structure.
There's a problem here:
Vertex newV = malloc(sizeof newV);
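Vertex is evidently a pointer type (the code uses newV->value), so sizeof newV is only the size of a pointer, not of the struct. It should be
Vertex newV = malloc(sizeof *newV);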
You are allocating memory in the function createNewPerson() that lasts exactly as long as createNewPerson() is executing, and is available for overwriting immediately after it returns. You need to copy the text fields with something like newV->name = strdup(name), rather than point to the local variables in createNewPerson(). If your implementation doesn't have strdup(), you can easily define it as:
char * strdup(const char *inp)
{
    char * s = malloc(strlen(inp) + 1);
    if (s != NULL) // malloc can fail
        strcpy(s, inp);
    return s;
}
In addition, your I/O has potential problems. If you enter my name, "David Thornley", for the name, it'll take "David" as the name and "Thornley" as the lab, since "%s" searches for a whitespace-delimited string. If I enter "Forty-two" for the ID, nothing will be put in id, and "Forty-two" will be used for the name. If I enter a name or lab name over 29 characters, it will overwrite other memory.
I'd suggest using fgets() to get one line of input per answer, then use sscanf() to parse it.
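A rough sketch of that approach: read a whole line with fgets, then check the result of sscanf. The helper names readLine and parseId are made up for illustration. The trailing " %c" in the format is a common trick to reject input such as "42abc".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from in and strip the newline.
   Returns 0 on success, -1 on end of file or error. */
int readLine(FILE *in, char *buffer, size_t capacity)
{
    if (fgets(buffer, (int)capacity, in) == NULL)
        return -1;
    buffer[strcspn(buffer, "\n")] = '\0';
    return 0;
}

/* Hypothetical helper: parse a whole line as an int. Returns 0 on success,
   -1 if the line is not a number or has trailing junk. */
int parseId(const char *line, int *id)
{
    char extra;
    return sscanf(line, "%d %c", id, &extra) == 1 ? 0 : -1;
}
```

With this, a name such as "David Thornley" is read as one line including the space, and fgets never writes more than capacity bytes into the buffer.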
When passing and assigning strings, always make a copy of them. There is no guarantee that the string you received is still in memory afterwards, since the pointer could have been freed.
Of course, if you are only going to use name inside the function (that is, you are not going to assign it to a variable outside the scope of the function), you don't have to make the copy.
In order to do that, inside GRAPHinsertV, instead of
newV->name=name;
do
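if (name != NULL) // avoid dereferencing a null pointer
{
    newV->name = malloc(strlen(name)+1);
    if (newV->name != NULL) // malloc can fail
        strcpy(newV->name, name);
}
else
{
    newV->name = NULL; // never leave the field uninitialised
}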
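Pulling the suggestions from this thread together, a self-contained sketch could look like the following. It uses a plain struct vertex list rather than the original Graph type, which is not shown, and vertexInsert, vertexListFree and copyText are hypothetical names. The vertex is allocated with the size of the struct, both strings are duplicated, and a matching free routine releases them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct vertex {
    int value;
    char *name;
    char *researchLab;
    struct vertex *next;
};

/* Hypothetical helper: heap copy of s, or NULL on allocation failure. */
static char *copyText(const char *s)
{
    size_t length = strlen(s) + 1;
    char *copy = malloc(length);
    if (copy != NULL)
        memcpy(copy, s, length);
    return copy;
}

/* Create a vertex that owns copies of both strings and push it onto *head. */
struct vertex *vertexInsert(struct vertex **head, int value,
                            const char *name, const char *researchLab)
{
    assert(head != NULL && name != NULL && researchLab != NULL);
    struct vertex *v = malloc(sizeof *v);  /* size of the struct, not of a pointer */
    if (v == NULL)
        return NULL;
    v->name = copyText(name);
    v->researchLab = copyText(researchLab);
    if (v->name == NULL || v->researchLab == NULL) {
        free(v->name);
        free(v->researchLab);
        free(v);
        return NULL;
    }
    v->value = value;
    v->next = *head;
    *head = v;
    return v;
}

/* Release every vertex together with the strings it owns. */
void vertexListFree(struct vertex *head)
{
    while (head != NULL) {
        struct vertex *next = head->next;
        free(head->name);
        free(head->researchLab);
        free(head);
        head = next;
    }
}
```

Because each vertex holds its own copies, the caller can reuse or discard its local input buffers as soon as vertexInsert returns.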