I am writing a program that analyzes Pascal code. I've finished the text processing, and now I have to save my results in linked lists, but I have no idea how to do it. My program has the following elements:
variables(name, type)
constants(name)
typedef(name and components)
procedures(local variables, constants and types)
function (its var, const and types)
To my mind, creating 5 lists won't be an efficient solution. Can you give me a tip on how to create a list for these elements, which have different numbers of components?
Thanks a lot
It depends on what you mean by "efficient".
If you want to have all the different elements in the same list, you will need to add type information to the elements, and of course spend time filtering out all the wrong elements when looking for something.
It would seem more efficient, then, to have separate collections of various types of element.
You should do something like:
typedef struct Element Element;
struct Element {
Element *next;
};
typedef struct {
Element base; /* must be the first member */
char *name;
VariableType type;
} ElementVariable;
/* ... and so on */
Then you can have various linked list headers:
ElementVariable *variables;
ElementConstant *constants;
Since each ElementXXX type begins with an Element, you can write fairly generic linked-list code to deal with all the different types.
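As a rough sketch of what that generic code could look like, assuming every element type embeds an Element as its first member: the names ElementConstant, list_push, list_length and demo_constants below are made up for illustration, and the list heads are plain Element pointers.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Element Element;
struct Element {
    Element *next;
};

/* Hypothetical element type: the Element header must be the first member. */
typedef struct {
    Element base;
    const char *name;
} ElementConstant;

/* Push any element onto the front of a list; works for every ElementXXX. */
void list_push(Element **head, Element *e)
{
    e->next = *head;
    *head = e;
}

/* Count the elements of any list, whatever their concrete type. */
size_t list_length(const Element *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Build a two-element list of constants and return the name at its head. */
const char *demo_constants(size_t *length)
{
    static ElementConstant pi = { { NULL }, "pi" };
    static ElementConstant e  = { { NULL }, "e" };
    Element *constants = NULL;

    list_push(&constants, &pi.base);
    list_push(&constants, &e.base);
    *length = list_length(constants);
    /* The cast is valid because base is the first member of ElementConstant. */
    return ((ElementConstant *)constants)->name;
}
```

The cast back from Element * to the concrete type is the only type-specific step; insertion, counting and traversal never need to know what the element holds.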
I don't know exactly what you want to do with the lists, but here's an idea: create a tree where siblings act like the regular next item in the linked list and where children provide additional information, like the entries of compound types or function parameters.
So the struct would look something like this:
struct Entity {
const char *id;
enum Type type;
char is_const;
char is_var;
/* ... whatever ... */
struct Entity *next;
struct Entity *child;
};
So if you have
var x: integer;
procedure proc(var res: integer; x, y: integer); forward;
type date = record
dd, mm, yy: integer;
end;
your tree would look like this:
[var x: integer]
|
[proc: procedure] -> [var res: integer]
| |
| [x: integer]
| |
| [y: integer]
|
[type date: record] -> [dd: integer]
|
[mm: integer]
|
[yy: integer]
Here, the arrow -> denotes children and the vertical bar | siblings or just the next item in the list.
The elements at the left form your primary list, that include your first-level (global scope) symbols. The next level means the scope of the parent element. For example dd is only known inside the record date. This is essentially a multi-level linked list that is easy to expand for arbitrary numbers of function arguments or record entries.
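To make the two kinds of links concrete, here is a small sketch that builds the tree from the example and walks it. It uses a cut-down node with only a name and the two pointers; struct Node, find_in_scope, count_all and build_example are illustrative names, not part of the struct shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified version of the Entity node: only a name plus the two links. */
struct Node {
    const char *id;
    struct Node *next;   /* next sibling in the same scope */
    struct Node *child;  /* first entry of the nested scope, or NULL */
};

/* Search one scope (a chain of siblings) for a name. */
struct Node *find_in_scope(struct Node *first, const char *id)
{
    for (; first != NULL; first = first->next)
        if (strcmp(first->id, id) == 0)
            return first;
    return NULL;
}

/* Count every node reachable from first, including nested scopes. */
size_t count_all(const struct Node *first)
{
    size_t n = 0;
    for (; first != NULL; first = first->next)
        n += 1 + count_all(first->child);
    return n;
}

/* Build the example tree: x, proc(res, x, y), date{dd, mm, yy}. */
struct Node *build_example(void)
{
    static struct Node yy   = { "yy", NULL, NULL };
    static struct Node mm   = { "mm", &yy, NULL };
    static struct Node dd   = { "dd", &mm, NULL };
    static struct Node py   = { "y", NULL, NULL };
    static struct Node px   = { "x", &py, NULL };
    static struct Node res  = { "res", &px, NULL };
    static struct Node date = { "date", NULL, &dd };
    static struct Node proc = { "proc", &date, &res };
    static struct Node x    = { "x", &proc, NULL };
    return &x;
}
```

Looking up dd in the global scope fails, while looking it up in the child list of date succeeds, which is exactly the scoping rule described above.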
5 lists aren't necessarily inefficient unless you want to search them all for a name, or something.
An alternative is to make a generic way of storing all your info in one list. You could enumerate what you're storing.
I have a continuous block of memory, where the first N bytes all contain objects of type A and the remaining bytes contain elements of type B. So for example, I might have 120 objects of type A followed by 40 objects of type B.
Type A and type B are both structures with different sizes, but both have an integer member "index" which is what I want to sort on. Basically I want to end up with an array that is sorted on index and I currently have one which is sorted on data type. Ideally, I'd like to end up with it sorted on index, then by type, so something like
Index==1 elements | Index==2 elements | ... | Index==L elements
Type A | Type B | Type A | Type B | ... | Type A| Type B
The only thing I've come up with so far is to sort the type A and type B blocks separately by index and then use memcpy to shuffle things around so they look like the layout above. Is there a better solution?
How is the original array set up? Mixing different types in the same array seems like a bad idea. Is there a reason you need to have them in the same array?
If they must be in the same array you might want to use a union. Something like
enum myType { A, B };
struct typeA { enum myType type; int key; /* ... data ... */ };
struct typeB { enum myType type; int key; /* ... data ... */ };
union myTypes { struct typeA myA; struct typeB myB; };
union myTypes data[128];
You can now use the qsort function from the C library.
Another option would be to use a separate array of pointers to the objects and then sort that auxiliary array. You would still need some kind of type field at the beginning of each struct though.
Instead of bare pointers as suggested, you might want to use a struct consisting of a type tag and a pointer (the latter even as a union of two different pointer types). That way you can always tell which type an object has. You need the tag unless both types keep the ID field at the same offset.
So suppose you have
struct typeA {
int whatever;
int id;
};
struct typeB {
double whatever;
int id;
};
you are out of luck, because the id fields sit at different offsets, and you have to do as I stated:
struct typptr {
enum type { typ_A, typ_B } type;
union {
struct typeA * Aptr;
struct typeB * Bptr;
};
};
int getID(struct typptr t)
{
if (t.type == typ_A) {
return (t.Aptr)->id * 2;
} else {
return (t.Bptr)->id * 2 + 1; // have B always sorted after A...
}
}
This way you can easily write a cmp function for qsort in order to sort the struct typptrs.
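A comparison function along those lines might look like the sketch below. It is self-contained, so it redefines the structs under slightly different, made-up names (struct ref, ref_id, cmp_ref, demo_sort), and it compares the id and the type tag separately instead of encoding them as id * 2 + 1, which sidesteps integer overflow for large ids.

```c
#include <assert.h>
#include <stdlib.h>

struct typeA { int whatever; int id; };
struct typeB { double whatever; int id; };

enum kind { KIND_A, KIND_B };

/* Tagged pointer: the tag says which union member is valid. */
struct ref {
    enum kind kind;
    union {
        struct typeA *a;
        struct typeB *b;
    } p;
};

int ref_id(const struct ref *r)
{
    return r->kind == KIND_A ? r->p.a->id : r->p.b->id;
}

/* qsort comparator: order by id first, then A before B for equal ids. */
int cmp_ref(const void *lhs, const void *rhs)
{
    const struct ref *l = lhs, *r = rhs;
    int li = ref_id(l), ri = ref_id(r);

    if (li != ri)
        return (li > ri) - (li < ri);
    return (l->kind > r->kind) - (l->kind < r->kind);
}

/* Fill out[] with references to two A and two B objects and sort them. */
void demo_sort(struct ref out[4])
{
    static struct typeA a[2] = { { 0, 2 }, { 0, 1 } };
    static struct typeB b[2] = { { 0.0, 1 }, { 0.0, 2 } };

    out[0].kind = KIND_A; out[0].p.a = &a[0];
    out[1].kind = KIND_A; out[1].p.a = &a[1];
    out[2].kind = KIND_B; out[2].p.b = &b[0];
    out[3].kind = KIND_B; out[3].p.b = &b[1];
    qsort(out, 4, sizeof out[0], cmp_ref);
}
```

Only the small reference structs are moved during the sort; the differently sized objects stay where they are.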
If the types are "shaped similarly", such as
struct typeA {
int id;
int whatever;
};
struct typeB {
int id;
double whatever;
};
i.e. have the id field at the start, things are easier (but I don't know if this is portable) as you can always cast to one of them and read out the id field. So you only need an array of pointers and can omit the type field.
There's important information in your later comment - the types aren't initially mixed.
If so, you could run a sort phase on each array part individually (quicksort or any other method you'd like), and then do a final phase of merge-sort between the two arrays (assuming you can spare the additional space)
That is still O(N log N) + O(N) = O(N log N).
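The merge phase could be sketched roughly as follows, assuming both blocks are already sorted and that their index values have been copied into two plain int arrays. Rather than moving the objects, it fills an output array that says which block and position comes next; struct slot and merge_by_index are made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* One entry of the merged order: which block ('A' or 'B') and which position. */
struct slot {
    char type;
    size_t pos;
};

/*
 * Merge two blocks that are each already sorted by index.
 * ka/kb hold the index values of the A and B objects; out must have
 * room for na + nb entries. For equal indices, A comes before B.
 */
void merge_by_index(const int *ka, size_t na,
                    const int *kb, size_t nb,
                    struct slot *out)
{
    size_t i = 0, j = 0, k = 0;

    while (i < na || j < nb) {
        if (j == nb || (i < na && ka[i] <= kb[j])) {
            out[k].type = 'A';
            out[k].pos = i++;
        } else {
            out[k].type = 'B';
            out[k].pos = j++;
        }
        k++;
    }
}
```

Walking out[] in order then tells you where to copy each object from when building the final buffer, at the cost of the extra space mentioned above.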
I know that when defining an enum you can enumerate through the list numerically:
typedef enum MONTH { Jan = 1, Feb, March, ... };
Can you enumerate through values in a struct the same way? I basically want to loop through the values in a struct using a for or while loop.
struct items {char *item_name, int item_value};
struct items Items_list[] =
{
"item 1", 2000,
"item 2", 3600,
....
};
Language used is C.
Edit: I may have just answered my own question since what I had in mind is an array of structs. Will leave the question up for now however.
This declaration and initializer combination are invalid. (The question changed while the original version of this answer was written.)
If you are asking "is there a way to access the first member of the structure, then the second, without knowing the structure element names", then the answer is 'no, not without careful encoding beforehand'.
The careful coding involves multiple steps. For each element, you need an encoding of the type, the offset of the member in the structure, and perhaps the size of the member (if the encoding of the type does not give that to you anyway):
typedef enum { MT_INT, MT_CHAR_PTR, ... } MemberType;
typedef struct MemberAccess
{
const char *name;
size_t offset;
MemberType type;
} MemberAccess;
static const MemberAccess members[] =
{
{ "item_name", offsetof(struct items, item_name), MT_CHAR_PTR },
{ "item_value", offsetof(struct items, item_value), MT_INT },
};
And now, with excruciating care, you can write code to either get or set the value in the Nth member of a struct items pointed at by a particular pointer. However, doing so is still far from trivial.
int get_int(const void *data, const MemberAccess *member)
{
assert(member->type == MT_INT);
return (*(const int *)((const char *)data + member->offset));
}
GCC notwithstanding, you need the cast to a character pointer; you cannot legitimately do pointer arithmetic on void *.
You might then invoke:
int value = get_int(&Items_list[1], &members[1]);
to get at the integer value of the second field of the second element of the array.
This is so excruciating to deal with that there have to be excellent reasons to go through the overhead. There can be such reasons. I know of a system with 400 configuration parameters (which is a problem in its own right, but let's pretend that's OK; they've accumulated over 20 years of development) stored in a structure with heterogeneous types for the members. The code that manipulates it is written out 400 times - ouch! - because it doesn't use a system driven off an analogue of the MemberAccess structure. With such a system the code would be a lot more compact than it currently is, because there are only about a dozen data types to deal with, so most of the code is repetitive. Another way of reducing the complexity of that code would be to make everything into a string, but there are issues with that transformation too.
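For a feel of what table-driven code buys you, here is a sketch that loops over every member of a structure using a descriptor table like the one above. It is self-contained, so it repeats the definitions with a corrected struct items; format_members is a made-up helper that writes "name=value;" pairs into a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct items {
    const char *item_name;
    int item_value;
};

typedef enum { MT_INT, MT_CHAR_PTR } MemberType;

typedef struct {
    const char *name;
    size_t offset;
    MemberType type;
} MemberAccess;

static const MemberAccess members[] = {
    { "item_name",  offsetof(struct items, item_name),  MT_CHAR_PTR },
    { "item_value", offsetof(struct items, item_value), MT_INT },
};

/* Write every member of *it into buf as "name=value;" pairs.
 * Returns the number of characters written, or -1 if buf is too small. */
int format_members(const struct items *it, char *buf, size_t size)
{
    size_t used = 0;

    for (size_t i = 0; i < sizeof members / sizeof members[0]; i++) {
        /* Address of the i-th member, computed from its recorded offset. */
        const void *field = (const char *)it + members[i].offset;
        int n = 0;

        switch (members[i].type) {
        case MT_INT:
            n = snprintf(buf + used, size - used, "%s=%d;",
                         members[i].name, *(const int *)field);
            break;
        case MT_CHAR_PTR:
            n = snprintf(buf + used, size - used, "%s=%s;",
                         members[i].name, *(const char *const *)field);
            break;
        }
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

The switch has one case per member type rather than one per member, which is where the savings come from when a structure has hundreds of fields but only a dozen types.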
No, you cannot iterate the elements of a struct. The best you can do is hardcode the names of the struct in the loop:
struct items *item = Items_list;
while (item < Items_list + sizeof(Items_list) / sizeof(*Items_list)) {
printf("%s %d", item->item_name, item->item_value);
++item;
}
Also note that you cannot reliably iterate an enum either, because it could be defined like this:
typedef enum MONTH { Jan = 1, Feb = 13, March = 10, ... };
And the elements are both out of order and non-continuous (i.e. there are gaps in the numbers).
One way to do this, when you have a pointer type in the inner struct, and when that pointer cannot be meaningfully NULL, is to do something like:
for (int i=0; Items_list[i].item_name != 0; i++) {
// do whatever
}
If you don't have a handy pointer type, a "sentinel" value can often be used to mark the last record.
You'll need to remember to add a null element/sentinel at the end of your structure array though, and fix your syntax.
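Put together, a sentinel-terminated version of the array might look like this sketch. The initializer syntax is corrected, item_name is declared const char * for the string literals, and total_value is a made-up function used only to show the loop.

```c
#include <assert.h>
#include <stddef.h>

struct items {
    const char *item_name;
    int item_value;
};

/* The last entry is the sentinel: a NULL name marks the end of the array. */
const struct items Items_list[] = {
    { "item 1", 2000 },
    { "item 2", 3600 },
    { NULL, 0 }
};

/* Sum item_value over all entries up to, but not including, the sentinel. */
int total_value(const struct items *list)
{
    int total = 0;

    for (size_t i = 0; list[i].item_name != NULL; i++)
        total += list[i].item_value;
    return total;
}
```

The advantage over the sizeof-based loop is that the function works on any pointer to such an array, not just on the array object itself.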
No, there's no easy way to do this. A struct is not a numerical value, so you can't loop through its values/members. You can either use an array instead of a struct and access its members with a simple for loop, or write a special enumerator callback function which takes the struct as one argument and a number as another, and which uses case or if statements to look up each member of the structure.
Often stacks in C are dependent upon datatype used to declare them. For example,
int arr[5]; //creates an integer array of size 5 for stack use
char arr[5]; //creates a character array of size 5 for stack use
are both limited to holding integer and character datatypes respectively and presumes that the programmer knows what data is generated during the runtime. What if I want a stack which can hold any datatype?
I initially thought of implementing it as a union, but the approach is not only difficult but also flawed. Any other suggestions?
I would use a structure like this:
struct THolder
{
int dataType; // this is a value representing the type
void *val; // this is the value
};
Then use an array of THolder to store your values.
This is really just a variant of Pablo Santa Cruz' answer, but I think it looks neater:
typedef enum { integer, real, other } type_t;
typedef struct {
type_t type;
union {
int normal_int; /* valid when type == integer */
double large_float; /* valid when type == real */
void * other; /* valid when type == other */
} content;
} stack_data_t;
You still need to use some way to explicitly set the type of data stored in each element, there is no easy way around that.
You could look into preprocessor magic relying on the compiler-dependent typeof keyword to do that automagically, but that will probably not do anything but ruin the portability.
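As an illustration of the tagged-union approach, here is a minimal fixed-size stack whose elements carry their own type tag. All names (value, value_kind, stack, stack_push, stack_pop, make_int, make_real) and the capacity of 16 are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { VAL_INT, VAL_REAL } value_kind;

/* One stack element: a tag plus the storage for whichever type it holds. */
typedef struct {
    value_kind kind;
    union {
        int i;      /* valid when kind == VAL_INT */
        double d;   /* valid when kind == VAL_REAL */
    } as;
} value;

#define STACK_MAX 16

typedef struct {
    value items[STACK_MAX];
    size_t top;     /* number of elements currently stored */
} stack;

bool stack_push(stack *s, value v)
{
    if (s->top == STACK_MAX)
        return false;           /* stack is full */
    s->items[s->top++] = v;
    return true;
}

bool stack_pop(stack *s, value *out)
{
    if (s->top == 0)
        return false;           /* stack is empty */
    *out = s->items[--s->top];
    return true;
}

/* Small constructors so callers cannot forget to set the tag. */
value make_int(int i)
{
    value v;
    v.kind = VAL_INT;
    v.as.i = i;
    return v;
}

value make_real(double d)
{
    value v;
    v.kind = VAL_REAL;
    v.as.d = d;
    return v;
}
```

After a pop, the caller checks kind before touching the union, which is the explicit bookkeeping mentioned above.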
Some people have suggested a void* member. In addition to that solution I'd like to offer an alternative (assuming your stack is a linked list of heap-allocated structures):
struct stack_node
{
struct stack_node *next;
char data[];
};
The data[] is a C99 construct (a flexible array member). data must be the last member; this takes advantage of the fact that we can stuff arbitrary quantities after the address of the struct. If you're using a non-C99 compiler you might have to do some sketchy trick like declaring it as data[0].
Then you can do something like this:
struct stack_node*
allocate_stack_node(size_t extra_size)
{
return malloc(sizeof(struct stack_node) + extra_size);
}
/* In some other function... */
struct stack_node *ptr = allocate_stack_node(sizeof(int));
int *p = (int*)ptr->data;
If this looks ugly and hacky, it is... But the advantage here is that you still get the generic goodness without introducing more indirection (thus slightly quicker access times for ptr->data than if it were void* pointing to a different location from the structure.)
Update: I'd also like to point out that the code sample I give may have problems if your machine happens to have different alignment requirements for int than char. This is meant as an illustrative example; YMMV.
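One way to stay clear of the alignment question is to never dereference a typed pointer into data at all and to copy the value in and out with memcpy instead. The sketch below shows that variant; node_new and node_read are made-up names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct stack_node {
    struct stack_node *next;
    char data[];            /* payload bytes follow the header */
};

/* Allocate a node and copy size bytes from value into its payload. */
struct stack_node *node_new(const void *value, size_t size)
{
    struct stack_node *n = malloc(sizeof *n + size);

    if (n == NULL)
        return NULL;
    n->next = NULL;
    memcpy(n->data, value, size);
    return n;
}

/* Copy the payload back out; memcpy has no alignment requirements. */
void node_read(const struct stack_node *n, void *out, size_t size)
{
    memcpy(out, n->data, size);
}
```

The price is one extra copy per access, so this trades a little of the speed advantage for portability.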
You could use macros and a "container" type to reduce "type" from being per-element, to whole-container. (C99 code below)
#define GENERIC_STACK(name, type, typeid, elements) \
struct name##_stack { \
unsigned int TypeID; \
type Data[elements]; \
} name = { .TypeID = typeid }
Of course, your "TypeID" would have to allow every possible agreed-upon type you expect; might be a problem if you intend to use whole structs or other user-defined types.
I realize having a uniquely named struct type for every variable is odd and probably not useful... oops.
I created a library that works for any data type:
List new_list(int,int);
Creates a new list, e.g.:
List list=new_list(TYPE_INT,sizeof(int));
//This will create a list of integers
Error append(List*,void*);
Appends an element to the list. append accepts two pointers as arguments; if you want to store a pointer in the list, pass the pointer itself rather than a pointer to it.
eg:
//using the int list from above
int a=5;
Error err;
err=append(&list,&a);
//for a list of pointers
List listptr=new_list(TYPE_CUSTOM,sizeof(int*));
int num=7;
int *ptr=&num;
append(&listptr,ptr);
//for list of structs
struct Foo
{
int num;
float *ptr;
};
List list=new_list(TYPE_CUSTOM,sizeof(struct Foo));
struct Foo x;
x.num=9;
x.ptr=NULL;
append(&list,&x);
Error get(List*,int);
Gets the data at the specified index. After the call, the list's current pointer points to the data.
eg:
List list=new_list(TYPE_INT,sizeof(int));
int i;
for(i=1;i<=10;i++)
append(&list,&i);
//This will print the element at index 2
get(&list,2);
printf("%d",*(int*)list.current);
Error pop(List*,int);
Pops an element from the specified index.
eg:
List list=new_list(TYPE_INT,sizeof(int));
int i;
for(i=1;i<=10;i++)
append(&list,&i);
//element in the index 2 will be deleted,
//the current pointer will point to a location that has a copy of the data
pop(&list,2);
printf("%d",*(int*)list.current);
//To use the list as stack, pop at index list.len-1
pop(&list,list.len-1);
//To use the list as queue, pop at index 0
pop(&list,0);
Error merge(List ,List);
Merges two lists of the same type. If the types differ, it returns an error message in the Error object.
eg:
//Merge two lists of type int
//List 2 will come after list 1
Error err;
err=merge(&list1,&list2);
Iterator get_iterator(List*);
Gets an iterator for a list. When initialized, it holds a pointer to the first element of the list.
eg:
Iterator ite=get_iterator(&list);
Error next(Iterator*);
Get the next element of the list.
eg:
//How to iterate a list of integers
Iterator itr;
for(itr=get_iterator(&list); itr.content!=NULL; next(&itr))
printf("%d",*(int*)itr.content);
https://github.com/malayh/C-List
I need a doubly linked list in C, but it must work for different types. In C++ we use templates for that. Where can I find an example in C of a doubly linked list whose items have abstract types?
Thank you
There are a few approaches you can take, one of which involves storing a void* in your ADT.
I've always found this to be a bit of a pain in a linked list since you have to manage its allocation separately from the list itself. In other words, to allocate a node, you need to allocate both the node and its payload separately (and remember to clean them both up on deletion as well).
One approach I've used in the past is to have a 'variable sized' structure like:
typedef struct _tNode {
struct _tNode *prev;
struct _tNode *next;
char payload[1];
} tNode;
Now that doesn't look variable sized but let's allocate a structure thus:
typedef struct {
char Name[30];
char Addr[50];
char Phone[20];
} tPerson;
tNode *node = malloc (sizeof (tNode) - 1 + sizeof (tPerson));
Now you have a node that, for all intents and purposes, looks like this:
typedef struct _tNode {
struct _tNode *prev;
struct _tNode *next;
char Name[30];
char Addr[50];
char Phone[20];
} tNode;
or, in graphical form (where [n] means n bytes):
+------------+
| prev[4] |
+------------+
| next[4] |
+------------+ +-----------+
| payload[1] | | Name[30] | <- overlap
+------------+ +-----------+
| Addr[50] |
+-----------+
| Phone[20] |
+-----------+
That is, assuming you know how to address the payload correctly. This can be done as follows:
node->prev = NULL;
node->next = NULL;
tPerson *person = (tPerson *)node->payload; // cast for easy changes to payload.
strcpy (person->Name, "Richard Cranium");
strcpy (person->Addr, "10 Smith St");
strcpy (person->Phone, "555-5555");
That cast line simply casts the address of the payload character (in the tNode type) to be an address of the actual tPerson payload type.
Using this method, you can carry any payload type you want in a node, even different payload types in each node, if you make the structure more like:
typedef struct _tNode {
struct _tNode *prev;
struct _tNode *next;
int payloadType; // Allows different payload type at each node.
char payload[1];
} tNode;
and use payloadType to store an indicator as to what the payload actually is.
This has the advantage over a union in that it doesn't waste space, as can be seen with the following:
union {
int fourBytes;
char oneHundredBytes[100];
} u;
where 96 bytes are wasted every time you store an integer type in the list (for a 4-byte integer).
The payload type in the tNode allows you to easily detect what type of payload this node is carrying, so your code can decide how to process it. You can use something along the lines of:
#define PAYLOAD_UNKNOWN 0
#define PAYLOAD_MANAGER 1
#define PAYLOAD_EMPLOYEE 2
#define PAYLOAD_CONTRACTOR 3
or (probably better):
typedef enum {
PAYLOAD_UNKNOWN,
PAYLOAD_MANAGER,
PAYLOAD_EMPLOYEE,
PAYLOAD_CONTRACTOR
} tPayLoad;
The only thing you need to watch out for is to ensure that the alignment of the payload is correct. Since both my payload placeholder and the payload are all char types, that's not an issue. However, if your payload consists of types with more stringent alignment requirements (such as something stricter than the pointers), you may need to adjust for it.
While I've never seen an environment with alignments more strict than pointers, it is possible according to the ISO C standard.
You can usually get the required alignment simply by using a data type for the payload placeholder which has the strictest alignment requirement such as:
long payload;
In retrospect, it occurs to me that you probably don't need an array as the payload placeholder. It's simple enough to just have something you can take the address of. I suspect that particular idiom of mine hearkens back to the days where I just stored an array of characters (rather than a structure) and referenced them directly. In that case, you could use payload[] on its own without casting to another type.
Handling arbitrary data in C is usually done by using pointers - specifically void * in most cases.
The Linux kernel uses linked lists in many, many places, both in the kernel itself and in the many device driver modules. Almost all of these are implemented using the same set of macros defined in linux/list.h
See http://isis.poly.edu/kulesh/stuff/src/klist/ or http://kernelnewbies.org/FAQ/LinkedLists for a good explanation.
The macros look a bit strange at first but are easy to use and soon become second nature. They can trivially be adapted for use in user space (see list.h).
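The core idea behind those macros is an intrusive list: the link node lives inside your own struct, and the containing struct is recovered from the link by subtracting the member's offset. The sketch below is a simplified user-space imitation of that idea, not the kernel's actual code; CONTAINER_OF, list_init, list_append, struct person and sum_ages are names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* The link that gets embedded in any struct that should live on a list. */
struct list_node {
    struct list_node *prev, *next;
};

/* Recover a pointer to the containing struct from a pointer to its member. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty circular list is a head whose links point at itself. */
void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

/* Insert n just before head, i.e. at the end of the list. */
void list_append(struct list_node *head, struct list_node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Example payload type; the link can sit anywhere in the struct. */
struct person {
    const char *name;
    int age;
    struct list_node link;
};

int sum_ages(struct list_node *head)
{
    int total = 0;

    for (struct list_node *p = head->next; p != head; p = p->next)
        total += CONTAINER_OF(p, struct person, link)->age;
    return total;
}
```

Because the list code only ever sees struct list_node, the same functions serve every payload type, and one object can sit on several lists by embedding several links.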
The closest thing in C to an "object" base class or templated types is a void pointer. A void * represents a pointer to something, but it does not specify what type of data is being pointed to. If you want to access the data, you need to use a cast.
A doubly linked list node could look like this:
struct DoubleLinkedListNode {
struct DoubleLinkedListNode *previous, *next;
void *data;
};
To assign a node a string, for example, you could do:
char myString[80] = "hello, world";
struct DoubleLinkedListNode myNode;
myNode.data = myString;
To get the string back from a node, you use a cast:
char *string = (char *)myNode.data;
puts(string);
To store a non-pointer, you need to make a pointer from the data. For structures, you may be able to simply dereference the instance if its lifetime is long enough (similar to the above example). If not, or if you're dealing with a primitive type (e.g. an int or float), you need to malloc some space. Just be sure to free the memory when you're done.
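For the primitive-type case, a sketch might look like this. node_from_int and node_free are hypothetical helpers, and the node definition is repeated so the example stands alone.

```c
#include <assert.h>
#include <stdlib.h>

struct DoubleLinkedListNode {
    struct DoubleLinkedListNode *previous, *next;
    void *data;
};

/* Allocate an unlinked node that owns a heap copy of value. */
struct DoubleLinkedListNode *node_from_int(int value)
{
    struct DoubleLinkedListNode *node = malloc(sizeof *node);
    int *copy = malloc(sizeof *copy);

    if (node == NULL || copy == NULL) {
        free(node);     /* free(NULL) is a no-op */
        free(copy);
        return NULL;
    }
    *copy = value;
    node->previous = node->next = NULL;
    node->data = copy;
    return node;
}

/* Release both the payload and the node itself. */
void node_free(struct DoubleLinkedListNode *node)
{
    if (node != NULL) {
        free(node->data);
        free(node);
    }
}
```

Note that node_free is only correct for nodes whose data was malloc'ed; a node pointing at a local array, like the string example above, must not have its data freed.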
You could use macros as demonstrated here (this particular example implements generic hash-tables).
In simplistic terms, a feature structure is an unordered list of attribute-value pairs.
[number:sg, person:3 | _ ],
which can be embedded:
[cat:np, agr:[number:sg, person:3 | _ ] | _ ],
can subindex stuff and share the value
[number:[1], person:3 | _ ],
where [1] is another feature structure (that is, it allows reentrancy).
My question is: what data structure would people think this should be implemented with for later access to values, to perform unification between 2 fts, to "type" them, etc.
There is a full book on this, but it's in lisp, which simplifies list handling. So, my choices are: a hash of lists, a list of lists, or a trie. What do people think about this?
Think a little harder about what constitutes a value.
I'd try the simplest thing that might possibly work:
typedef struct value {
    enum { INT, BOOL, STRING, FEATURE_STRUCTURE } ty;
    union {
        int intval;   /* "int" and "bool" cannot be used as member names */
        bool boolval;
        char *string;
        struct fs *feature_structure;
    } u;
} *Value;
typedef struct fs { // list of pairs; this rep could change
    struct avpair *pair;
    struct fs *next;
} *Feature_Structure;
struct avpair {
    const char *attribute;
    Value value;
};
You'll want a bunch of constructor functions like
Value mkBool(bool p) {
    Value v = malloc(sizeof(*v));
    assert(v);
    v->ty = BOOL;
    v->u.boolval = p;
    return v;
}
At this point you can start to be in business. If "list of pairs" turns out not to be the right representation, you can change it. Without knowing what operations you plan or what your expectations are for cost model, I'd start off here. Then, if you need to move to something more efficient, I'd probably represent a feature structure using a ternary search tree and keep the same representation for Value.
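To show how access would work on a list-of-pairs representation, here is a small self-contained sketch. It uses its own simplified types (atomic values are plain strings, and each list cell holds one pair directly), so struct value, struct fs, fs_get and build_np are assumptions of this example rather than the definitions above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct fs;

/* A value is either an atom (stored as a string) or a nested structure. */
struct value {
    enum { V_STRING, V_FS } ty;
    union {
        const char *string;
        const struct fs *fs;
    } u;
};

/* One attribute-value pair per list cell. */
struct fs {
    const char *attribute;
    struct value value;
    const struct fs *next;
};

/* Linear lookup of an attribute in one feature structure. */
const struct value *fs_get(const struct fs *f, const char *attribute)
{
    for (; f != NULL; f = f->next)
        if (strcmp(f->attribute, attribute) == 0)
            return &f->value;
    return NULL;
}

/* Build [cat:np, agr:[number:sg, person:3]] from static data. */
const struct fs *build_np(void)
{
    static const struct fs person = { "person", { V_STRING, { "3" } }, NULL };
    static const struct fs number = { "number", { V_STRING, { "sg" } }, &person };
    static const struct fs agr    = { "agr", { V_FS, { .fs = &number } }, NULL };
    static const struct fs cat    = { "cat", { V_STRING, { "np" } }, &agr };
    return &cat;
}
```

Reentrancy falls out of the pointer representation: two attributes share a value simply by pointing at the same struct fs. Lookup is linear in the number of attributes, which is the cost a ternary search tree would reduce.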