Concatenate in C, like lists in Haskell

I've sort of ended up writing a translator from Haskell to C, a hobby thing..
Haskell's (:)-function, with type a -> [a] -> [a] is what I want to do in C.
1 : [2,3] is in fact 1 : (2 : (3 : [])) if I'm not mistaken.
Say I want to create an infinite list with increasing numbers in it:
lst i = i : lst (i + 1)
How do I do this in C? I imagine the final product looking something along the lines of:
int* lst(int i) {
return cons(i, lst(i + 1));
}
My thought so far:
C has arrays.
Arrays need to be of a defined length, this clashes with recursive reasoning.
C has pointers.
Arrays decay to pointers when passed as arguments anyway so, might as well use pure pointers.
array[i] is equivalent to *(ptr + i), I'm thinking I can use this to get around the problem of having to define things you cannot know (final length of the list etc).
I'm unsure of the implementation of cons though.. My best guess is:
int* cons(int head, int *tail) {
int *ptr;
*(ptr + 1) = *tail;
*ptr = head;
return ptr;
}
Pointers are hard for me, dereferencing etc etc, I don't know C very well and my brain hurts. I just want to make a pointer which contains both the head and the tail. Order is not important for the moment.
It compiles, but that's as far as it goes. Help would be appreciated, I'm open to suggestions, I'm not even sure I'm on the right track or if it's even possible.

First, this is what your function is doing:
int* cons(int head, int *tail) {
int *ptr; // Declare a pointer on the stack
*(ptr + 1) = *tail; // Copy the value tail points to into the int after the one ptr points to, but ptr is uninitialised
*ptr = head; // Write head through the (still uninitialised) pointer
return ptr; // Return the uninitialised pointer
}
Second, what you want is a linked list. You can create a structure just like data List a = [] | (:) a (List a) in C. For example,
typedef struct list {
void *element;
struct list *next;
} list_t;
Now cons would look like this:
list_t *harp_cons(void *element, list_t *rest) {
list_t *list = (list_t*)malloc(sizeof(list_t));
list->element = element;
list->next = rest;
return list;
}
This is allocating data on the heap, so you need to free it afterwards. You can provide a function free_list which would look like the following. (Assuming the elements can simply be freed with free() for the sake of simplicity.)
void free_list(list_t *list) {
if(list != NULL) {
if(list->next != NULL) {
free_list(list->next);
}
free(list->element);
free(list);
}
}
I just took that code from some of my open source code.
If you want to look at a full implementation of (sort of) a list API: https://github.com/thoferon/harp/blob/master/libharp/list.c.
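Because C evaluates eagerly, a literal translation of lst i = i : lst (i + 1) would recurse until the stack overflows. The sketch below is one possible workaround, not part of the linked library: it stores int elements directly instead of void *, and builds only the first n cells, roughly what take n (lst i) would give in Haskell. The names int_list, cons_int, lst_take and free_int_list are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical int-only variant of the list_t shown above. */
typedef struct int_list {
    int head;
    struct int_list *tail;
} int_list;

/* cons: allocate one cell holding head and pointing at tail.
   Returns NULL if malloc fails. */
int_list *cons_int(int head, int_list *tail) {
    int_list *cell = malloc(sizeof *cell);
    if (cell == NULL)
        return NULL;
    cell->head = head;
    cell->tail = tail;
    return cell;
}

/* Free every cell, iteratively to avoid deep recursion. */
void free_int_list(int_list *list) {
    while (list != NULL) {
        int_list *next = list->tail;
        free(list);
        list = next;
    }
}

/* Finite stand-in for `take n (lst i)`: C is strict, so the
   recursion needs an explicit bound. */
int_list *lst_take(int n, int i) {
    assert(n >= 0);
    if (n == 0)
        return NULL;                 /* [] */
    int_list *rest = lst_take(n - 1, i + 1);
    if (rest == NULL && n > 1)
        return NULL;                 /* an allocation further down failed */
    int_list *cell = cons_int(i, rest);
    if (cell == NULL)
        free_int_list(rest);         /* do not leak the partial list */
    return cell;
}
```

Calling lst_take(3, 5) would yield the cells 5, 6, 7 followed by NULL; the caller releases them with free_int_list.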

Related

Why declare pointer in linked list?

I'm new to the C language.
While studying linked lists, I found it very hard to understand why pointers are used.
(I understand the benefit of a linked list compared to an array.)
Let's assume I have 3 customers and a specific value for each.
struct linknode{
int data;
struct linknode *next;
};
Why do we use pointers like this (case 1):
linknode *a = malloc(sizeof(Node));
linknode *b = malloc(sizeof(Node));
a->value = 1;
b->value = 2;
a->next = b;
b->next = NULL;
How about just (case2)
linknode a, b;
a.value = 1;
b.value = 2;
a.next = &b;
b.next = NULL;
Isn't it possible to make linked list with case 2?
also insert, delete being possible?
thanks.
Isn't it possible to make linked list with case 2? also insert, delete being possible?
It is possible, it just isn’t very useful. The maximum size of your list is limited to the number of variables you declare, and I doubt you’re going to want to declare more than a dozen separate variables.
Something you can do is use an array as your backing store - instead of declaring separate variables a and b you can declare an array of 10, 100, or 1000 elements, then do something like:
a[i].next = &a[j];
But you’re still limited - your list can never be bigger than the array. The advantage of using dynamic memory is that the list size isn’t limited (at least, not some fixed compile-time limit); however, it means messing with pointers.
Pointers are a fundamental part of programming in C - you cannot write useful C code without using pointers in some fashion.
Edit: A more realistic implementation of a linked list would use an insert function like
/**
* Inserts items into the list in ascending order.
*
* If the list is empty (head is NULL) or if the value
* of the new node is less than the value of the current
* head, then the new node becomes the new head of the
* list.
*
* Returns the pointer to the new node. If the allocation
* was unsuccessful, it returns NULL.
*/
struct linknode *insert( struct linknode **head, int val )
{
struct linknode *newnode = calloc( 1, sizeof *newnode );
if ( !newnode )
return NULL;
newnode->data = val;
if ( !*head )
{
/**
* list is empty, newnode becomes the head of the list.
*/
*head = newnode;
}
else if ( newnode->data < (*head)->data )
{
/**
* Value stored in newnode is less than the
* value stored at the list head, newnode
* becomes the new list head.
*/
newnode->next = *head;
*head = newnode;
}
else
{
/**
* Iterate through the list and insert the
* newnode in the correct location.
*/
struct linknode *cur = *head;
while ( cur->next && cur->next->data < newnode->data )
cur = cur->next;
newnode->next = cur->next;
cur->next = newnode;
}
return newnode;
}
and it would be used something like this:
int main( void )
{
struct linknode *list = NULL;
int val;
while ( scanf( "%d", &val ) == 1 )
{
if ( !insert( &list, val ) )
{
fprintf( stderr, "Could not add %d to list, not taking any more input...\n", val );
break;
}
}
...
}
So the elements of the list are allocated and added dynamically, and you're only limited by the amount of memory you have available.
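Since every node created by insert comes from calloc, the list eventually has to be walked and released. A minimal sketch of both steps follows; print_list and free_list are hypothetical helper names, not part of the answer above, and the struct is repeated only so the block stands alone.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct linknode {
    int data;
    struct linknode *next;
};

/* Print each value on its own line and return how many nodes were visited. */
size_t print_list(const struct linknode *head) {
    size_t count = 0;
    for (const struct linknode *cur = head; cur != NULL; cur = cur->next) {
        printf("%d\n", cur->data);
        count++;
    }
    return count;
}

/* Release every node and reset the caller's head pointer to NULL,
   so the list cannot be used after it has been freed. */
void free_list(struct linknode **head) {
    assert(head != NULL);
    while (*head != NULL) {
        struct linknode *next = (*head)->next;
        free(*head);
        *head = next;
    }
}
```

In the main function above, print_list(list) followed by free_list(&list) could replace the elided part after the input loop.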
Statically-allocated nodes (the latter) are fine, but not very useful in practice.
You'll presumably need to add nodes to your list. And chances are overwhelmingly in favour of some of them needing to be dynamically allocated. For example, nodes created in a loop would need to be dynamically allocated.
If you had a mix of statically- and dynamically-allocated nodes, you wouldn't know which ones to free without some extra flag in each node. This would add complexity to the program. It's easier to deal only with dynamically-allocated nodes than with a mix.
Your examples are not about pointers but about memory allocation and lifetime.
linknode *a = malloc(sizeof(linknode)); //creates a linknode on heap
linknode b; //creates a linknode on stack
The first example creates a node on the heap. It exists in memory until you free it. The second example creates a node on the stack. It exists until the program leaves the current scope (e.g. a function).
Pointers are very easy to understand: they just point somewhere. You've probably already used pointers in Java, C# or other languages (they're called different names in those languages but they work mostly the same). What's difficult to understand is object lifetime. In other languages the garbage collector makes sure objects stay alive as long as you need them, as if by magic. In C it's your duty to carefully design the lifetime of your objects. If you mess up a lifetime, you end up using memory that was already assigned to something else, or you end up with memory leaks.
In your second example, the list ceases to exist when the function returns. I guess that's not what you intended.
In both of your examples struct linknode *next is a pointer. In one you make it point to an object on the heap and in the other you make it point to an object on the stack, but the pointer works the same. It's the target of the pointer where the magic happens.
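To make the lifetime point concrete, here is a small sketch. The helper names make_node and sum_list are made up for illustration. The broken variant is shown only as a comment, because returning the address of a local variable leaves the caller with a dangling pointer.

```c
#include <stdlib.h>

struct linknode {
    int data;
    struct linknode *next;
};

/* WRONG (kept as a comment on purpose): n lives on the stack and
 * dies when bad() returns, so the caller gets a dangling pointer.
 *
 *   struct linknode *bad(void) {
 *       struct linknode n = { 1, NULL };
 *       return &n;
 *   }
 */

/* Heap node: stays alive until the caller passes it to free().
   Returns NULL if malloc fails. */
struct linknode *make_node(int data, struct linknode *next) {
    struct linknode *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = data;
        n->next = next;
    }
    return n;
}

/* Sum the values. The traversal is identical for stack and heap
   nodes, because next is just a pointer in both cases. */
int sum_list(const struct linknode *head) {
    int total = 0;
    for (; head != NULL; head = head->next)
        total += head->data;
    return total;
}
```

A heap node created with make_node may safely point at a stack node as long as the stack node outlives every use of the list; only the heap node is passed to free.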
There is a way to build a similar data structure without pointers. This is the same technique used to serialize a linked data structure for storage or transmission where the pointers lose their meaning.
You allocate a large fixed static array, put your data into it, and use integers to index into the array as your "pointers". Using an integer as an index is commonly called a cursor.
typedef unsigned index;
typedef struct linknode {
int data;
index next;
} link;
link memory[ 1000 ];
index next = 0;
Then you can add data into it.
link *a = memory + next++;
link *b = memory + next++;
a->data = 1;
b->data = 2;
a->next = b - memory;
Some details would need to be worked out for a robust system like whether index 0 is considered valid or a NULL pointer, or if all the .next indices need to be pre-initialized with some non-zero "null" index. IMO, simplest is to treat 0 as NULL and initialize next to 1.
You could also create nodes as above without using pointers, but keeping track of the index is a little clumsier and the expression involving the index is more cumbersome.
index a_index = next++;
index b_index = next++;
memory[ a_index ].data = 1;
memory[ b_index ].data = 2;
memory[ a_index ].next = b_index;
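A fuller sketch of the cursor idea, treating index 0 as the null value as suggested above. The names cursor, cell, pool, NIL, push_front and sum are hypothetical; they were chosen partly to avoid index and link, which can clash with POSIX functions of the same names when certain headers are included.

```c
#include <stddef.h>

#define POOL_SIZE 1000
#define NIL 0u                  /* index 0 plays the role of NULL */

typedef unsigned cursor;

struct cell {
    int data;
    cursor next;
};

static struct cell pool[POOL_SIZE];
static cursor next_free = 1;    /* slot 0 is reserved for NIL */

/* Push a value on the front of the list. Returns the new head,
   or NIL if the pool is exhausted. */
cursor push_front(cursor head, int data) {
    if (next_free >= POOL_SIZE)
        return NIL;
    cursor c = next_free++;
    pool[c].data = data;
    pool[c].next = head;
    return c;
}

/* Walk the list by following indices instead of pointers. */
int sum(cursor head) {
    int total = 0;
    for (cursor c = head; c != NIL; c = pool[c].next)
        total += pool[c].data;
    return total;
}
```

Starting from cursor list = NIL, two calls to push_front build a two-element list without a single pointer stored in the nodes. This sketch never reuses slots; a real pool would also need a free list.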
Aside: There's room for improvement in your malloc calls.
linknode *a = malloc(sizeof(Node));
A better style is to use either the typename or variable name from the same line of code, so it can be verified at a glance.
linknode *a = malloc( sizeof( linknode ) );
Or, as many prefer, use the variable itself; then you can change the type easily if you want, because it's only written once.
linknode *a = malloc( sizeof *a );
By giving sizeof an expression argument (which you can do because it's an operator, not a function) you can drop the parentheses, too. The expression argument to sizeof is not evaluated, just inspected for its type. There is one weird exception if the type is variably modified, but that's too complicated to explain (I don't fully understand it). So just remember, there is a weird exception, but for the most part *a in the above code is safe because sizeof just needs the size of the type.
Think of it as "what the size would be if the malloc call succeeds".
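A short sketch of the sizeof *a idiom, assuming the struct from the question. The function name new_node is hypothetical. The assert only demonstrates that the operand is not evaluated: a is still NULL when sizeof *a is computed.

```c
#include <assert.h>
#include <stdlib.h>

struct linknode {
    int data;
    struct linknode *next;
};

/* Allocate one node; the size comes from the variable, not a type name. */
struct linknode *new_node(int data) {
    struct linknode *a = NULL;

    /* sizeof *a does not dereference a. Only the type of *a is
       inspected, so this is fine even though a is NULL here. */
    assert(sizeof *a == sizeof(struct linknode));

    a = malloc(sizeof *a);
    if (a != NULL) {
        a->data = data;
        a->next = NULL;
    }
    return a;
}
```

If the type of a later changes, the malloc line needs no edit.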

How can a Linked List be implemented using only pointers (w/o structures)?

I'm trying to create a linked list without using structures in C.
I want to be able to store an int variable on every node and a pointer to the next node, add unlimited numbers to the list, remove the first item, print all of the elements, etc.
I was thinking that every node of type int** should have 2 pointers of type int*.
the first one will point to an int address and the second will point to NULL.
Then, if I like to add a number to the list, I'll use the last pointer to point to a new allocated node of type int** and so on.
I'm having trouble writing the proper code for this though, and can't seem to reach the actual int values.
You can achieve this by allocating two uintptr_t each time: the first allocated memory space will be responsible for storing the value of the integer and the second one will be pointing to the next memory location.
uintptr_t *nodeFirst = malloc(2 * sizeof(uintptr_t));
...
...
uintptr_t *nodeNext = malloc(2 * sizeof(uintptr_t));
....
....
*nodeFirst = someIntValue;
*(nodeFirst + 1) = (uintptr_t)nodeNext;
...
The fact is, my solution above is still using the struct analogy, but w/o the struct keyword.
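The fragment above can be filled out along the following lines. This is only a sketch: make_cell, cell_value and cell_next are hypothetical helper names, and values are restricted to non-negative ints so that the round trip through uintptr_t stays portable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* A node is two uintptr_t slots: [0] holds the value,
   [1] holds the address of the next node (0 for none). */
uintptr_t *make_cell(int value, uintptr_t *next) {
    assert(value >= 0);   /* keeps the int <-> uintptr_t conversion lossless */
    uintptr_t *cell = malloc(2 * sizeof *cell);
    if (cell == NULL)
        return NULL;
    cell[0] = (uintptr_t)value;
    cell[1] = (uintptr_t)(void *)next;
    return cell;
}

/* Read the stored value back out of slot 0. */
int cell_value(const uintptr_t *cell) {
    return (int)cell[0];
}

/* Convert slot 1 back into a pointer to the next node. */
uintptr_t *cell_next(const uintptr_t *cell) {
    return (uintptr_t *)(void *)cell[1];
}
```

Each cell is released with a single free, since value and link share one allocation.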
Here is a complete solution of a LinkedList managed as int ** pointers.
Step 1 - the addNode() function to add one node to the int **head.
int **addNode(int **head, int ival)
{
int **node = malloc(2 * sizeof(int *));
// don't forget to alloc memory to store the int value
node[0] = malloc(sizeof(int));
*(node[0]) = ival;
// next is pointing to NULL
node[1] = NULL;
if (head == NULL) {
// first node to be added
head = node;
}
else {
int **temp;
temp = head;
// temp[1] is the next
while (temp[1]!=NULL) {
// cast needed to go to the next node
temp = (int **)temp[1];
}
// cast needed to store the next node
temp[1] = (int *)node;
}
return (head);
}
Step 2 - a function display() to explore the current linkedlist.
void display(int **head)
{
int **temp;
int i = 0;
temp = head;
printf("display:\n");
while (temp!=NULL) {
// temp[0] is pointing to the ivalue
printf("node[%d]=%d\n",i++,*(temp[0]));
temp = (int **)temp[1];
}
printf("\n");
}
Step 3 - the popNode() function to remove the first node.
int **popNode(int **head)
{
int **temp;
if (head!=NULL) {
temp = (int **)head[1];
// don't forget to free ivalue
free(head[0]);
// then free the node itself; head[1] is the next node and must stay alive
free(head);
head = temp;
}
return (head);
}
Step 4 - then an example of main() function using the linkedlist.
int main()
{
int **head = NULL;
head = addNode(head,111);
head = addNode(head,222);
head = addNode(head,333);
display(head);
// display:
// node[0]=111
// node[1]=222
// node[2]=333
head = popNode(head);
display(head);
// display:
// node[0]=222
// node[1]=333
while ((head = popNode(head))!=NULL);
display(head);
// display:
return (0);
}
Allocate two arrays, both of which are stored as pointers. In C, they can be the pointers you get back from calloc(). The first holds your node data. We can call it nodes. The second is an array of pointers (or integral offsets). We can call it nexts. Whenever you update the list, update nodes so that each nexts[i] links to the next node after the one that contains nodes[i], or an invalid value such as NULL or -1 if it is the tail. For a double-linked list, you’d need befores or to use the XOR trick. You’ll need a head pointer and some kind of indicator of which elements in your pool are unallocated, which could be something simple like a first free index, or something more complicated like a bitfield.
You would still need to wrap all this in a structure to get more than one linked list in your program, but that does give you one linked list using no data structure other than pointers.
This challenge is crazy, but a structure of arrays isn’t, and you might see a graph or a list of vertices stored in a somewhat similar way. You can allocate or deallocate your node pool all at once instead of in small chunks, it could be more efficient to use 32-bit offsets instead of 64-bit next pointers, and contiguous storage gets you locality of reference.

Malloc of pointers in structs + pointer arithmetic + free() + linkedList

I'm trying to implement a linked-list data structure in which each node has an identifier key, some data of variable length (malloc), and a pointer to the next node. Now I want to have 3 functions which respectively: add a new node to the front of the list, print the values of a given node using its identifier key, and delete a given node.
The struct I have for the node is as follows:
struct node {
char key[5];
int* data;
node* next;
};
struct node* headNode = NULL;
I have questions regarding each of functions. I will list the function codes I have and ask questions regarding that specific function below:
The code for my set function:
void command_set (char key[], int val[], int numOfVal){
struct node* temp = (node*)malloc(sizeof(node));
strcpy(temp->key, key);
temp->data = (int*)malloc(numOfVal*sizeof(int));
*(temp->data) = *(val);
temp->next = entry_head;
entry_head = temp;
return;
}
Now I have one question regarding this function:
1) Is my method of storing the data valid? i.e. "temp->data = (int*)malloc(numOfValues*sizeof(int));" + "*(temp->data) = *(val);". What I'm trying to do is dynamically allocate some memory, then store the given values as my node's data in that memory.
The code for my print function:
void printNode (char key[], int numOfVal){
int i;
struct node *currentNode = headNode;
while(currentNode->next!=NULL){
if(!strcmp(currentNode->key,key) ){
for(i=0; i<numOfVal; i++){
printf("%d ",*((currentNode->data)+i));
}
return;
}
currentNode = currentNode->next;
}
}
I have one question regarding this function:
2) The data of a node is a list of integers, so does my way of printing out each integer actually work? i.e. "*((currentNode->data)+i)". What I'm trying to do is by using pointer arithmetic I print all the ints stored under data.
The code for my delete function:
void deleteNode (char key[]){
struct node *currentNode = headNode;
struct node *prevNode = headNode;
while(currentNode->next!=NULL){
if(!strcmp(currentNode->key,key) ){
prevNode->next = currentNode->next;
free(currentNode->data);
free(currentNode->next);
free(currentNode);
return;
}
prevNode = currentNode;
currentNode = currentNode->next;
}
}
I have two questions regarding this function:
3) Am I "deleting" the nodes properly? By using free(). Is this the way to do it?
4) Is this how you link up nodes after deletion? By setting the next pointer to another node.
Please assume that malloc will not return NULL, for simplicity. Also note that I have simplified my actual code, because otherwise there would be far too much to post, so there might be slight errors. You may also assume that the while loops will always work (i.e. there will not be a case where currentNode->next == NULL). The main point of this post is my questions about whether each way of doing something is correct.
An example of the program would be:
-set ex1 2 3 4 5
-get ex1
2 3 4 5
-set ab 32 112
-get ab
32 112
Thanks in advance.
strcpy(temp->key, key);
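For the purpose of your program, this is probably OK, but to be safe you should use strncpy(temp->key, key, 5), or at least check the length of key to make sure it fits. Note that strncpy does not add a terminating '\0' when key is 5 characters or longer, so in that case you would have to terminate temp->key yourself.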
*(temp->data) = *(val);
This only sets the first index in the array. You should use memcpy here.
memcpy (temp->data,val, sizeof (int) * numOfVal);
Your print function's test is fine: strcmp returns 0 when the strings are equal, so !strcmp(currentNode->key, key) is true for the first node whose key matches.
Your delete function finds its node the same way.
You also don't want to free currentNode->next;
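Putting these points together (memcpy for the data, and not freeing currentNode->next), a hedged sketch of the set and delete functions might look like this. It keeps the question's names where possible, but the int return values, the size_t count, the key-length check and the pointer-to-pointer traversal are my own assumptions. The traversal also covers the case where the matching node is the head, which the original deleteNode does not relink.

```c
#include <stdlib.h>
#include <string.h>

struct node {
    char key[5];
    int *data;
    struct node *next;
};

struct node *headNode = NULL;

/* Prepend a node holding a copy of val[0..numOfVal-1].
   Returns 0 on success, -1 on a bad argument or failed allocation. */
int command_set(const char key[], const int val[], size_t numOfVal) {
    if (numOfVal == 0 || strlen(key) >= sizeof headNode->key)
        return -1;
    struct node *temp = malloc(sizeof *temp);
    if (temp == NULL)
        return -1;
    temp->data = malloc(numOfVal * sizeof *temp->data);
    if (temp->data == NULL) {
        free(temp);
        return -1;
    }
    strcpy(temp->key, key);   /* safe: length was checked above */
    memcpy(temp->data, val, numOfVal * sizeof *temp->data);
    temp->next = headNode;
    headNode = temp;
    return 0;
}

/* Unlink and free the first node whose key matches.
   Returns 1 if a node was removed, 0 otherwise. */
int deleteNode(const char key[]) {
    for (struct node **link = &headNode; *link != NULL; link = &(*link)->next) {
        if (strcmp((*link)->key, key) == 0) {
            struct node *victim = *link;
            *link = victim->next;   /* relink around the victim */
            free(victim->data);
            free(victim);           /* victim->next stays in the list */
            return 1;
        }
    }
    return 0;
}
```

Because link always points at the pointer that leads to the current node (either headNode or some node's next field), removing the head needs no special case.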

Cons Cell data structure in C

I'm a newbie at C, in the early stages of building a small Scheme interpreter. For this part of the project I'm trying to build a simple cons cell data structure. It should take a list like
(a b c)
and represent it internally like so:
[ ][ ] -> [ ][ ] -> [ ][/]
| | |
A B C
To test that it's working correctly, I have a print function to echo out the input. Here is the code that isn't working:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "lexer.h"
#include "parse.h"
char token[20];
struct conscell {
char *data;
struct conscell *first, *rest;
};
void S_Expression ()
{
/* function from lexer to receive input and split it into tokens no greater than 20 */
startTokens(20);
/* gets the next token */
strcpy(token, getToken());
/* List is a typedef for the struct conscell */
List tree = createList ();
tree = nextNode (tree);
printList(tree);
}
List createList ()
{
List node = malloc(sizeof (List));
if (node == NULL) {
printf("Out of memory!\n");
exit(1);
}
node->data = NULL;
node->first = NULL;
node->rest = NULL;
return node;
}
/* Recursive function to build cons cell structure */
List nextNode (List node)
{
node = createList ();
if (token[0] == '(')
{
strcpy(token, getToken());
node->first = nextNode(node->first);
node->rest = nextNode(node->rest);
}
else
{
if (token[0] == ')')
{
node = NULL;
}
else
{
List temp = createList();
temp->data = token;
temp->first = NULL;
temp->rest = NULL;
node->first = temp;
strcpy(token, getToken());
node->rest = nextNode(node->rest);
}
}
return node;
}
/* Prints output. So far, just trying to print symbols */
void printList(List node)
{
if (node != NULL)
{
if (node->data != NULL)
{
printf("%s", node->data);
}
}
}
So far I can't print out anything. I'm almost positive it's a pointer issue. If anyone could point me (no pun intended) in the right direction, it'd be very much appreciated.
Thank you
First, I'm assuming List is a typedef for a struct conscell*. If it's not, it should be, otherwise your code won't compile without tons of warnings.
A scheme cons cell should be a simple singly linked list, not a doubly-linked list. So your individual cells should be more like:
typedef struct conscell
{
unsigned char *data; //<== use unsigned char for a memory buffer
struct conscell* next; //<== only a "next" pointer needed
} conscell;
I see you're just trying to print symbols at the moment, so using char rather than unsigned char can work for that purpose, but when you go with more generic data-structures like lambdas, etc., you're going to have to switch to either unsigned char* or void* for the reference to the memory buffer holding those types of more complex data-structures.
The other issue that seems a bit confusing is that you're making each cell of your cons cells another cons cell, for instance, these lines of code,
if (token[0] == '(')
{
strcpy(token, getToken());
node->first = nextNode(node->first);
node->rest = nextNode(node->rest);
}
are recursively adding cons cells as your "first" and "rest" ... but that's not what a linked list should look like. It should have a pointer to a list-node as the "head" of the list (not another cons-cell like it seems you're doing here), and then each node in the list points to some data and the next node in the list.
Next, you have memory leaks all over the place with your createList() function as you allocate memory with it, but then never delete that memory (i.e., you have code like node = NULL which effectively is a memory leak because you've lost the memory reference to the allocated memory location that node was originally pointing to). You have to call free() on a node pointer before you assign NULL to it.
Finally, printList() doesn't do anything but print the first element of the list you pass it ... there are no recursive calls or loops to cycle to the next node in the linked list. So you're not going to be printing much with that function. It should look more like:
void printList(List node)
{
List current = node;
while (current != NULL) //<== guard for the end-of-list
{
if (current->data != NULL)
{
printf("%s", current->data);
}
current = current->next; //cycle to the next node in the linked list
}
}
So to sum things up, 1) your cons data-structure should represent a singly linked list composed of a structure data-type having a data element and a pointer to the next node. The cons'ed list is accessed through a head pointer pointing to the first node. 2) As you parse the input, you should add nodes to the front of the linked list since Scheme's cons operation, and really all the operations in Scheme, are recursive, and "fold to the right", meaning they work from a base-case (i.e., the cons'ing of two elements), and then expand on that base-case. So if you had something like (cons 'd (cons 'c (cons 'b (cons 'a '())))), you'd then print the list (d c b a). If you want, it could also help to push tokens onto a stack as you recursively parse the input, and then move them from the stack into your linked list (sort of like how an RPN calculator would work).
Also add \n to your printf to make sure it is flushed to stdout:
printf("%s\n", node->data);
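As a rough illustration of the singly linked layout described above, the sketch below builds a list by consing onto the front and prints it. It makes two assumptions beyond the answers: each cell gets its own copy of the token text (so a shared token buffer can be overwritten safely), and the allocation uses sizeof *cell, the size of the struct rather than of a pointer to it. The function names cons and freeList are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct conscell {
    char *data;
    struct conscell *next;
} conscell;

/* Prepend a cell holding its own copy of text.
   Returns NULL if an allocation fails. */
conscell *cons(const char *text, conscell *next) {
    conscell *cell = malloc(sizeof *cell);
    if (cell == NULL)
        return NULL;
    cell->data = malloc(strlen(text) + 1);
    if (cell->data == NULL) {
        free(cell);
        return NULL;
    }
    strcpy(cell->data, text);
    cell->next = next;
    return cell;
}

/* Print the list in the form (a b c) and return the number of cells. */
size_t printList(const conscell *node) {
    size_t count = 0;
    printf("(");
    for (const conscell *cur = node; cur != NULL; cur = cur->next) {
        printf("%s%s", count ? " " : "", cur->data);
        count++;
    }
    printf(")\n");
    return count;
}

/* Free every cell together with its copied string. */
void freeList(conscell *node) {
    while (node != NULL) {
        conscell *next = node->next;
        free(node->data);
        free(node);
        node = next;
    }
}
```

Under these assumptions, cons("a", cons("b", cons("c", NULL))) gives a three-cell list whose first cell holds a.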

Reversing Doubly Linked Deque in C

I'm having trouble reversing my doubly linked deque (with only a back sentinel) in C. I'm approaching it by switching the pointers, and here is the code I have so far:
/* Reverse the deque
param: q pointer to the deque
pre: q is not null and q is not empty
post: the deque is reversed
*/
/* reverseCirListDeque */
void reverseCirListDeque(struct cirListDeque *q)
{
struct DLink *back = q->backSentinel;
struct DLink *second = q->backSentinel->prev;
struct DLink *third = q->backSentinel->next;
while (second != q->backSentinel->next){
back->next = second;
third = back->prev;
back->next->prev = back;
back = second;
second = third;
}
}
But it doesn't seem to work, I've been testing it with a deque that looks like this: 1, 2, 3
The output is: 3 and this process seems to mess up the actual value of the numbers. ie. 2 becomes 2.90085e-309... I think the pointer switching is messed up but I cannot find the problem. And even though it doesn't mean my code is correct; it compiles fine.
Linked structures like deques lend themselves readily to recursion, so I tend to favor a recursive style when dealing with linked structures. This also allows us to write it incrementally so that we can test each function easily. Looping as your function does has many downsides: you can easily introduce fencepost errors and it tends toward large functions that are confusing.
First, you've decided to do this by swapping the pointers, right? So write a function to swap pointers:
void swapCirListDequePointers(
struct cirListDeque** left,
struct cirListDeque** right)
{
struct cirListDeque* temp = *left;
*left = *right;
*right = temp;
}
Now, write a function that reverses the pointers in a single node:
void swapPointersInCirListDeque(struct cirListDeque* q)
{
swapCirListDequePointers(&(q->prev),&(q->next));
}
Now, put it together recursively:
void reverseCirListDeque(struct cirListDeque* q)
{
if(q == q->backSentinel)
return;
swapPointersInCirListDeque(q);
// Leave this call in tail position so that compiler can optimize it
reverseCirListDeque(q->prev); // Tricky; this used to be q->next
}
I'm not sure exactly how your struct is designed; my function assumes that your deque is circular and that you'll be calling this on the sentinel.
EDIT: If your deque isn't circular, you'll want to call swapPointersInCirListDeque(q) on the sentinel as well, so move swapPointersInCirListDeque(q) before the if statement.
If you plan to use the backSentinel after this, you should change that also, since it's now the front of the list. If you have a frontSentinel, you can just add swapCirListDequePointers(&(q->frontSentinel),&(q->backSentinel)); to swapPointersInCirListDeque. Otherwise, you'll have to pass in the first node along with q and set q->backSentinel to that.
If it's a doubly linked list, you shouldn't need to change any pointers at all. Just swap over the payloads:
pointer1 = first
pointer2 = last
while pointer1 != pointer2 and pointer2->next != pointer1:
temp = pointer1->payload
pointer1->payload = pointer2->payload
pointer2->payload = temp
pointer1 = pointer1->next
pointer2 = pointer2->prev
If by back sentinel you mean the last pointer (as in no first pointer is available), then you need to step backwards through the deque to find it. It's hard to believe however that this would be the case since it would be a fairly inefficient deque (which is supposed to be a double ended queue).
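The payload-swapping pseudocode above translates to C roughly as follows. This is a sketch under stated assumptions: the node type dnode, its field names and the function name reverse_payloads are hypothetical, the payload is an int, and the list is NULL-terminated at both ends rather than circular.

```c
#include <stddef.h>

struct dnode {
    int payload;
    struct dnode *prev, *next;
};

/* Reverse the values in place by walking inwards from both ends.
   No links are changed; only payloads move. */
void reverse_payloads(struct dnode *first, struct dnode *last) {
    if (first == NULL || last == NULL)
        return;
    /* Stop when the cursors meet (odd length) or cross (even length). */
    while (first != last && last->next != first) {
        int temp = first->payload;
        first->payload = last->payload;
        last->payload = temp;
        first = first->next;
        last = last->prev;
    }
}
```

The second loop condition is what handles lists with an even number of nodes, where the two cursors pass each other without ever being equal.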
You've been given a couple of suggestions already; here's another possibility:
// Assumes a node something like:
typedef struct node {
struct node *next, *prev;
int data;
} node;
and also assumes a couple of variables (globals for the moment) named head and tail that point to the head and tail of the deque, respectively.
void reverse() {
node *pos = head;
node *temp = pos->next;
head = tail;
tail = pos;
while (pos != NULL) {
node *t = pos->prev;
pos->prev = pos->next;
pos->next = t;
pos = temp;
if (temp)
temp = temp->next;
}
}
At least for the moment, this does not assume any sentinels -- just NULL pointers to signal the ends of the list.
If you're just storing ints in the deque, Paxdiablo's suggestion is a good one (except that creating a doubly-linked node to hold only an int is a massive waste). Assuming that in reality you were storing something large enough for doubly-linked nodes to make sense, you'd also prefer to avoid moving that data around any more than necessary, at least as a general rule.
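For the layout the question seems to describe (a circular deque with a single back sentinel), the pointer-swapping approach can also be written as one loop. The struct definitions below are guesses, since the question does not show them: DLink is assumed to have value, next and prev fields, and cirListDeque a size and a backSentinel pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout; the question does not show these definitions. */
struct DLink {
    double value;
    struct DLink *next, *prev;
};

struct cirListDeque {
    int size;
    struct DLink *backSentinel;
};

/* Reverse the deque by swapping next and prev in every link,
   including the sentinel, going once around the ring. */
void reverseCirListDeque(struct cirListDeque *q) {
    assert(q != NULL && q->backSentinel != NULL);
    struct DLink *cur = q->backSentinel;
    do {
        struct DLink *oldNext = cur->next;
        cur->next = cur->prev;
        cur->prev = oldNext;
        cur = oldNext;          /* advance in the original direction */
    } while (cur != q->backSentinel);
}
```

Because the sentinel's own pointers are swapped too, it keeps its role after the call, and no values are copied.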
