Obtain node number n of a linked list

If I have a linked list:
first node -----> second node ------> third node ---> ?
Can I show the value of the third node (for example) without using a classic linear list search?
My attempt of getting the n'th node:
struct node* indexof( struct node* head, int i )
{
int offset = (int)((char*)head->next - (char*)head);
return ((struct node*)((char*)head + offset * i));
}

That depends on your exact linked list implementation, but in general, no. You will have to traverse the list in order to access the nth element.
This is a characteristic of linked lists: the usual tricks for computing an offset into an array or other contiguous structure will not work, because the individual list elements are not guaranteed to be laid out in memory in any predictable way. You are forced to follow the next pointers in order to reach the third element.
If you need constant-time indexed access, consider a different data structure, such as an array.
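As a minimal sketch of the traversal described above: the question does not show its struct, so the node layout and the name node_at here are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical node layout; the question does not show its struct. */
struct node {
    int value;
    struct node *next;
};

/* Return the node at zero-based position i, or NULL if the list is shorter.
   This follows the next pointers one by one, so it costs O(i). */
struct node *node_at(struct node *head, size_t i)
{
    while (head != NULL && i > 0) {
        head = head->next;
        i--;
    }
    return head;
}
```

With this, node_at(head, 2) would give the third node, or NULL when the list has fewer than three nodes.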

It sounds like you've picked the wrong data structure. If you want to go straight to the nth element, you should use an array.
Failing that, what's so bad about going through the list linearly? It would have to be called a lot on a very long linked list to cause a performance problem.

One of the purposes of a linked list is to be able to easily add and delete nodes with little cost.
You can renounce that capability and use an array of payload pointers, but then it is no longer a linked list (what would the purpose be of having a pointer to the next node when the same node can be obtained trivially by arithmetic increment?).
E.g. instead of
typedef struct node
{
    struct node *next;
    void *payload;
    ...
} node;
node *root = NULL;
and allocating space for no nodes up front, you can have
typedef struct
{
    void *payload;
    ...
} node;
node *vector = NULL;
size_t vectorsize = 0;
and allocate space for as many nodes as initially required, then use realloc to extend the list when needed and memmove to remove nodes by shifting back the nodes beyond the deleted one. This incurs a clear performance loss when adding or removing nodes. On the other hand, the n-th node is just vector[n].
I repeat, this is no longer a linked list: it may be that whatever you need this for is better accomplished with an array of pointers than with a linked list.
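A rough sketch of the realloc/memmove approach described above. The wrapper type node_vector, the capacity field and the function names are assumptions added for illustration; the answer only names vector and vectorsize.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    void *payload;
} node;

/* Hypothetical wrapper around the growing array */
typedef struct {
    node  *items;
    size_t size;      /* nodes in use */
    size_t capacity;  /* nodes allocated */
} node_vector;

/* Append a node, doubling the allocation when it is full. Returns 0 on success. */
int vector_append(node_vector *v, void *payload)
{
    if (v->size == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 8;
        node *p = realloc(v->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;            /* the old block is still valid */
        v->items = p;
        v->capacity = new_cap;
    }
    v->items[v->size++].payload = payload;
    return 0;
}

/* Remove node n by shifting the later nodes back one slot: O(size - n). */
int vector_remove(node_vector *v, size_t n)
{
    if (n >= v->size)
        return -1;
    memmove(&v->items[n], &v->items[n + 1],
            (v->size - n - 1) * sizeof v->items[0]);
    v->size--;
    return 0;
}
```

Starting from a zero-initialized node_vector, the n-th payload is then simply v.items[n].payload.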
Which reminds me, you'd do well to explain why you need the direct-addressing ability ("State the problem, don't ask how to implement the solution"): it may well be that what you need is neither an array nor a linked list but, who knows, maybe a ring buffer, a stack, a heap, or a binary tree.
In some implementations you can even deploy two bonded structures, e.g. you might use a (doubly?) linked list in a first phase with lots of insertions and deletions especially of recently inserted data; then you build, and switch to, a pointer array for a second phase where you need direct addressing driven by the node number (use the array as "cache" of list node addresses):
/* vector is an array of pointers to the existing list nodes */
node **vector;
size_t listsize = 0;
for (scan = root; scan; scan = scan->next)
    listsize++;
if (NULL == (vector = malloc(listsize * sizeof *vector)))
{
    // out of memory
    return EXIT_FAILURE;
}
listsize = 0;
for (scan = root; scan; scan = scan->next)
    vector[listsize++] = scan;
// Now vector[i]->payload is the payload of the i-th node

Related

Linked stacks and queues

I came across linked stacks and queues in a book (and it is not a stack/queue implementation using a linked list).
It says that a stack or queue can be represented sequentially if we have only one of them. However, when several stacks or queues coexist, there is no efficient way to represent them sequentially.
Below is the code given:
#define MAX_STACKS 10 //maximum number of stacks;
typedef struct {
int key;
//other fields.
}element;
typedef struct stack *stackpointer;
typedef struct {
element data;
stackpointer link;
}stack;
stackpointer top[MAX_STACKS];
void push(int i ,element item) {
stackpointer temp;
malloc(temp,sizeof(*temp));
temp->data = item;
temp->link = top[i];
top[i] = temp;
}
I am a newbie to data structures. Can I get a brief explanation of the above concept, i.e. linked stacks and queues?

So I checked out your book and I kind of understand what your problem is.
Such a representation proved efficient if we had only one stack or one queue. However, when several stacks and queues co-exist, there was no efficient way to represent them sequentially.
So, by sequential, understand that it means using arrays to represent stacks rather than a linked list. Now assume you have a matrix made up of 10 arrays, each of size 100, one per stack, and you push some data into each. If you push only a few elements onto each stack, you end up wasting a lot of memory, since the matrix reserves 1000 elements. This problem already exists with a single array, but it becomes more pronounced when you have multiple arrays for multiple stacks.
Now as you might have understood, using the linked list representation of a stack uses as much memory as needed, with only a slight overhead of keeping track of the next element, in this case stackpointer link.
stackpointer top[MAX_STACKS]
So what we have done here is create an array of type stackpointer to keep track of the top position of each individual stack. So now whenever the user wishes to enter an element, they must pass the index(int i) as well as the data (element item).
void push(int i ,element item)
{
stackpointer temp;
malloc(temp,sizeof(*temp));
temp->data = item;
temp->link = top[i];
top[i] = temp;
}
So we create a temp node to store our data, which will become the new top of the stack. Before that, we must point it to the previous top of the stack (temp->link = top[i]); then we point top[i] to temp (top[i] = temp).
However, malloc(temp, sizeof(*temp)) is not the standard malloc call, so you might want to correct your code with this:
stackpointer temp = malloc(sizeof(*temp));
If you have doubts about malloc, refer to its documentation.
If anything else is unclear, let me know and I will clarify.
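To round out the picture, here is a sketch of the linked-stacks idea with a matching pop. It is a variation on the book's code rather than the book's own: the struct is given a tag so the stackpointer typedef resolves, push returns a status code, and pop is an assumption added for illustration.

```c
#include <stdlib.h>

#define MAX_STACKS 10

typedef struct {
    int key;
} element;

typedef struct stack *stackpointer;
struct stack {
    element data;
    stackpointer link;
};

stackpointer top[MAX_STACKS];   /* all NULL at start: every stack is empty */

/* Push item onto stack i. Returns 0 on success, -1 on a bad index or no memory. */
int push(int i, element item)
{
    if (i < 0 || i >= MAX_STACKS)
        return -1;
    stackpointer temp = malloc(sizeof *temp);
    if (temp == NULL)
        return -1;
    temp->data = item;
    temp->link = top[i];   /* new node points at the previous top */
    top[i] = temp;         /* new node becomes the top */
    return 0;
}

/* Pop the top of stack i into *out. Returns 0 on success, -1 if it is empty. */
int pop(int i, element *out)
{
    if (i < 0 || i >= MAX_STACKS || top[i] == NULL)
        return -1;
    stackpointer temp = top[i];
    *out = temp->data;
    top[i] = temp->link;
    free(temp);
    return 0;
}
```

Each stack uses only as many nodes as it currently holds, which is the memory advantage over reserving a fixed-size array per stack.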

Which data structure in C allow me to store lines and append lines easily?

I got a list of string data. 10,20,30 are the line numbers
10. string 1
20. string 2
30. string 3
and if user types in "23 string data". 23 is the line number user wants to insert into. The data should become like that
10. string 1
20. string 2
23. string data
30. string 3
and if user types in "40 string data". The data should become like that
10. string 1
20. string 2
23. string data
30. string 3
40. string data
I'm relatively new to data structures in C. Which data structure should I use to store this kind of data efficiently? My current direction is to implement a dynamic array or a linked list. However, below are the problems I ran into.
Problem with dynamic array:
Using the line number as the index means allocating enough space that the array length is always greater than or equal to the highest line number. That wastes a lot of memory on unused indexes.
Printing out the data would be an issue. E.g. indexes 0-9 have no memory allocated, so accessing them will cause an error. I need a way to know which indexes are used.
Problem with linked list:
I cannot jump to an index (not sure). How do I identify which line comes after another and insert a line in between easily?

Let's assume the following about your requirements:
No strong real time. (I.e. it's not for high frequency trading, or controlling machinery.)
It runs on a relatively contemporary PC (RAM measured in GB, CPU frequency in GHz). In particular it does not run on an embedded system.
The data is no more than a few tens of thousands of lines.
Then you can use almost any data structure you like; it won't matter with respect to memory or run-time behavior.
For example, to find the point of insertion in a linked list, just iterate over the list. PCs are fast enough to iterate tens of thousands of times before you finish blinking.
Or just allocate an array of 100,000 lines of 80 characters each. No problem whatsoever. Or of a million lines. Still no problem. Or of 10 million lines, still no problem. You see my point? (In an array you'll need a marker to mark unused lines. I would use a struct line { bool used; char text[80]; } or the like. You can also cater to arbitrarily long lines — and save memory — by having just a char *text member and allocating dynamically, or defining the text as a linked list of chunks.)
The choice therefore boils down to what's easiest for you to use. Could be the array.
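A small sketch of the array-with-marker idea, assuming the line number is used directly as the index. MAX_LINES, set_line and print_lines are names made up for this example, and the bound of 1000 is arbitrary.

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_LINES 1000   /* assumed upper bound on line numbers */
#define MAX_TEXT  80

struct line {
    bool used;
    char text[MAX_TEXT];
};

static struct line lines[MAX_LINES];   /* zero-initialized: every slot unused */

/* Store text at the given line number, replacing anything already there. */
int set_line(int number, const char *text)
{
    if (number < 0 || number >= MAX_LINES)
        return -1;
    lines[number].used = true;
    snprintf(lines[number].text, MAX_TEXT, "%s", text);   /* truncates long text */
    return 0;
}

/* Print the used lines in ascending order; returns how many were printed. */
int print_lines(void)
{
    int count = 0;
    for (int i = 0; i < MAX_LINES; i++) {
        if (lines[i].used) {
            printf("%d. %s\n", i, lines[i].text);
            count++;
        }
    }
    return count;
}
```

Inserting line 23 between 20 and 30 is then just set_line(23, "string data"); the ordering falls out of the index.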

I'll give the two solutions I could come up with, but this question is possibly open-ended.
1. Use a hash table. Keys are line numbers. Values are (string, pointer to next line's value). This makes both random and linear access fast. Edit: Insertion is still O(n) with this. It'll only help with access time, which will be O(1). The second solution has O(1) insertion.
2. Assuming you don't have wildly spaced out line numbers: Use a singly linked list L to store strings. Also create a separate array P containing a pointer to every k-th node in the list. To access line i, check P[floor(i/k)], jump to the node it points to in L, and jump forward i mod k times to reach your string. Access time is therefore O(k). Insertion time is O(1). Space usage for n strings is O(n + max{i}/k).
The one thing that makes this relevant to C... is that there's no built-in hash table, of course! So #2 may be easier to implement.

I know you're looking for a specialized data structure, but how about instead using a simple data structure but sorting it lazily? You could append new lines to a dynamic array and then sort the array (with qsort) when you need to print them.
I think that this would be better because printing all lines is probably done much less frequently than adding/inserting lines. Therefore you should make adding lines cheap (in this case, O(1) amortized), and printing can be more expensive (in this case, O(n log n)). This also keeps your data structures simple and lets the C standard library handle the complicated parts.
You could make this a bit better still by keeping a flag that tracks whether all of the data is already known to be sorted; that way repeatedly printing (or, presuming you're trying to write a BASIC interpreter, repeatedly running) will be cheap too. Such a flag also might be helpful if you expect that lines are usually entered in order; then as each line is added:
alreadySorted = alreadySorted && (new_line_number > last_line_number)
I'll note that you have not specified what happens if a line is added that reuses an existing line number. If you wish to replace the old line, then you could tweak this approach by using a stable sort and afterward iterating over the lines to remove lines with duplicate numbers, keeping only the last one.
(If you want to make qsort stable for this case, instead of storing just a string for each line, you could store some extra metadata with it (any monotonically increasing counter would do, such as the current time, or just the total number of lines at the time the line was added). Then the comparison function you give to qsort would just need to use that extra data to resolve ties from duplicate line numbers.)
One disadvantage to this approach is that removing lines either won't be fast or won't reclaim memory immediately. However, you haven't specified whether line removal is a requirement; even if it is, it is likely to be a rare operation (so being a bit more time-inefficient or a bit more space-inefficient might be acceptable).
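The lazy-sort idea could look roughly like this. The struct names, the seq counter used as the tie-breaker, and the function names are all assumptions for the sketch; removing duplicate line numbers after sorting is left out.

```c
#include <stdbool.h>
#include <stdlib.h>

struct line {
    int number;          /* user-visible line number */
    unsigned long seq;   /* insertion order, used to break ties */
    const char *text;    /* not owned by this structure */
};

struct program {
    struct line *lines;
    size_t count, capacity;
    unsigned long next_seq;
    bool sorted;
};

/* Order by line number, then by insertion order, which makes qsort's result stable. */
static int compare_lines(const void *a, const void *b)
{
    const struct line *x = a, *y = b;
    if (x->number != y->number)
        return (x->number > y->number) - (x->number < y->number);
    return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Append in amortized O(1); sorting is deferred. Returns 0 on success. */
int add_line(struct program *p, int number, const char *text)
{
    if (p->count == p->capacity) {
        size_t cap = p->capacity ? p->capacity * 2 : 16;
        struct line *q = realloc(p->lines, cap * sizeof *q);
        if (q == NULL)
            return -1;
        p->lines = q;
        p->capacity = cap;
    }
    /* The array stays sorted only if the new number exceeds the last one. */
    if (p->count > 0 && number <= p->lines[p->count - 1].number)
        p->sorted = false;
    p->lines[p->count++] = (struct line){ number, p->next_seq++, text };
    return 0;
}

/* Sort only when needed, e.g. just before printing or running. */
void ensure_sorted(struct program *p)
{
    if (!p->sorted) {
        if (p->count > 1)
            qsort(p->lines, p->count, sizeof *p->lines, compare_lines);
        p->sorted = true;
    }
}
```

After ensure_sorted, entries with the same line number sit next to each other with the most recently added one last, so a single pass can keep only the last of each group.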

The best solution for this task is to use a dictionary data type.
Depending on the nature of the keys (the line numbers), you can optimize it with an appropriate hash table.
The C library does not have a dictionary implementation, but you can create your own based on a red-black tree. Cormen explains this data structure clearly: https://www.amazon.com/Introduction-Algorithms-3rd-MIT-Press/dp/0262033844
Note: if your collection is small, or you will rarely modify it, then you can just use a linked list.

My suggestion is to use a linked list and insertion sort, inserting whenever needed.
Here is code adapted from geeksforgeeks.org.
I haven't tested it; it is just a modified version of the code from the site.
/* C program for insertion sort on a linked list */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Link list node */
struct node
{
int lineNumber;
char *str;
struct node* next;
};
// Function to insert a given node in a sorted linked list
void sortedInsert(struct node**, struct node*);
// function to sort a singly linked list using insertion sort
void insertionSort(struct node **head_ref)
{
// Initialize sorted linked list
struct node *sorted = NULL;
// Traverse the given linked list and insert every
// node to sorted
struct node *current = *head_ref;
while (current != NULL)
{
// Store next for next iteration
struct node *next = current->next;
// insert current in sorted linked list
sortedInsert(&sorted, current);
// Update current
current = next;
}
// Update head_ref to point to sorted linked list
*head_ref = sorted;
}
/* function to insert a new_node in a list. Note that this
function expects a pointer to head_ref as this can modify the
head of the input linked list (similar to push())*/
void sortedInsert(struct node** head_ref, struct node* new_node)
{
struct node* current;
/* Special case for the head end */
if (*head_ref == NULL || (*head_ref)->lineNumber >= new_node->lineNumber)
{
new_node->next = *head_ref;
*head_ref = new_node;
}
else
{
/* Locate the node before the point of insertion */
current = *head_ref;
while (current->next!=NULL &&
current->next->lineNumber < new_node->lineNumber)
{
current = current->next;
}
new_node->next = current->next;
current->next = new_node;
}
}
/* BELOW FUNCTIONS ARE JUST UTILITY TO TEST sortedInsert */
/* Function to print linked list */
void printList(struct node *head)
{
struct node *temp = head;
while(temp != NULL)
{
printf("%d %s \n", temp->lineNumber,temp->str);
temp = temp->next;
}
}
/* A utility function to insert a node at the beginning of linked list */
void push(struct node** head_ref, int new_data, char *line)
{
/* allocate node */
struct node* new_node = malloc(sizeof(struct node));
size_t len = strlen(line) + 1; /* includes the terminating '\0' */
/* put in the data */
new_node->lineNumber = new_data;
new_node->str = malloc(len);
strcpy(new_node->str, line); /* copies the terminator too */
/* link the old list off the new node */
new_node->next = (*head_ref);
/* move the head to point to the new node */
(*head_ref) = new_node;
}
// Driver program to test above functions
int main(int argc,char *argv[])
{
struct node *a = NULL;
push(&a, 5 , "TestLine");
push(&a, 1 , "SecondTest");
push(&a, 1 , "SecondTest");
push(&a, 3 , "SecondTest");
insertionSort(&a);
printf("\nLinked List after sorting \n");
printList(a);
return 0;
}

I'd advise you to use a linked list.
// Define your list like this
#include <stdlib.h>
#include <string.h>
typedef struct node {
int line; // To hold the line number
char * data;
struct node * next;
} node_t;
// To insert
node_t* insert(node_t *head, const char * data, int line) // line is the line number of the new node
{
// Node to be inserted in given line
node_t *newNode;
// Allocating Memory
newNode = malloc(sizeof(node_t));
// Filling the Data to New Node
newNode->data = malloc(strlen(data)+1); // Allocate memory to store data
strcpy(newNode->data, data);
newNode->line = line;
newNode->next = NULL;
// It might be our First Node in Linked List
if(head == NULL) {
//Address of New Node Becomes our head
return (head = newNode);
}
// Node might be inserted at the head (smaller line number than the first node)
else if(line < head->line) {
// Joining previous Linked List After new Node
newNode->next = head;
// Address of New Node Becomes our head
return (head = newNode);
}
// Inserting after the last node whose line number is <= line
else {
// Pointer to store intermediate address of node
// To be used in Traversing
node_t * current = head;
// Walk the list to find the insertion point
while(current != NULL) {
node_t * next = current->next; //The next Node
// Insert here if line falls between current and next, or if there is no next node
if(line >= current->line && (next == NULL || line < next->line)) {
// Point newNode to the next node of current
newNode->next = current->next;
// Now point current towards our New Node
current->next = newNode;
// Return Head as soon as we have inserted our new node
return head;
}
current = next; // Point current to the next node to continue
}
}
// Not reached: the loop always inserts; kept so every path returns a value
return head;
}
If new line numbers are guaranteed to always be greater than the existing ones, you could also keep a pointer to the node with the greatest line number. This uses a little more space but makes the insertion O(1).

C Code: Efficient way to parse through to end of LinkedList

I have a struct Data that forms a linked list of messages. Every new message must be appended at the end of the linked list. I have a counter that tells me how many messages are present.
Instead of traversing to the end of the linked list, is there a better way to get to a specific position in the list?
struct Data {
char *message;
struct Data *next;
}data;
int total_message;
Right now I am parsing like below:
struct Data *traverse;
while(traverse->next != NULL)
traverse = traverse->next;
I tried the line below as well. I am not sure why it is wrong; logically it seems right to me.
data[total_messages - 1].next = new_data;
Is there any better way other than storing pointer to Last Message?

Consider maintaining a pointer to the tail of the linked list.
struct Data *head = NULL;
struct Data *tail = NULL;
void Append(struct Data *entry) {
    entry->next = NULL; /* the new entry becomes the last node */
    if (!head) {
        head = entry;
    }
    if (tail) {
        tail->next = entry;
    }
    tail = entry;
}
Why is traversing (as in the question) bad?
If we maintain only the head and the number of messages, say n, then for each append we have to traverse the n linked nodes starting from head. That is an O(n) operation, which is inefficient when repeated. If adding to the tail of the list is a frequent operation, as it seems to be in your case, then maintaining the tail pointer is efficient: each append is O(1). Space-wise, maintaining a counter costs about the same as maintaining a pointer.
Why is the following bad?
data[total_messages - 1].next = new_data;
That's array notation. Arrays are contiguous blocks of memory. In a linked list, the nodes could be anywhere in memory, so they cannot be accessed with array notation like that.

The [] syntax works for arrays because arrays arrange their data in a line, in a predictable way. Linked lists do not. You can only find out where the ith element is by following the next pointers.
data[i] refers to the data i places after the memory address data, which is unlikely to be at the location of a Data struct. Writing data to that position will generally just disrupt a random section of code somewhere else in the program.

A fast and relatively simple solution is to push each new element onto the front of the list while parsing, and then reverse the list in place once you have pushed all the elements.
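A minimal sketch of that push-then-reverse approach, reusing the struct Data from the question; the function names push_front and reverse are assumptions.

```c
#include <stddef.h>

struct Data {
    char *message;
    struct Data *next;
};

/* Push entry onto the front of the list in O(1); returns the new head. */
struct Data *push_front(struct Data *head, struct Data *entry)
{
    entry->next = head;
    return entry;
}

/* Reverse the list in place in a single O(n) pass; returns the new head. */
struct Data *reverse(struct Data *head)
{
    struct Data *prev = NULL;
    while (head != NULL) {
        struct Data *next = head->next;   /* remember the rest of the list */
        head->next = prev;                /* flip this node's link */
        prev = head;
        head = next;
    }
    return prev;
}
```

This only pays off when all the messages are added in one batch before the list is read; for interleaved appends and reads, the tail pointer is the simpler fit.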

Why create heap when creating a linked list when we can simply do this?

I'm studying linked lists from this lesson.
The writer (and every other coder in every tutorial I have seen) creates node-type pointer variables and then allocates memory for them using typecasting and malloc. It seems kind of unnecessary to me (of course, I know I'm missing something). Why can't we implement the same thing like this?
struct node
{
int data;
struct node *next;
};
int main()
{
struct node head;
struct node second;
struct node third;
head.data = 1;
head.next = &second;
second.data = 2;
second.next = &third;
third.data = 3;
third.next = NULL;
getchar();
return 0;
}
I've created nodes and the next pointers points towards the addresses of the next nodes...

Let's say you create a variable of type node called my_node:
struct node my_node;
You can access its members as my_node.data and my_node.next because it is not a pointer. Your code, however, will only ever be able to create 3 nodes. Say you have a loop that asks the user for a number and stores it in the linked list, stopping only when the user types 0. You don't know when the user will type 0, so you need a way of creating variables while the program is running. "Creating a variable" at runtime is called dynamic memory allocation and is done by calling malloc, which always returns a pointer. Don't forget to free the dynamically allocated data once it is no longer needed, by calling free with the pointer returned by malloc. The tutorial you mentioned is just explaining the fundamental concepts of linked lists. In an actual program you would not limit yourself to a fixed number of nodes; you would make the linked list grow and shrink depending on information you only have at runtime (unless a fixed-size linked list is all you need).
Edit:
"Creating a variable at runtime" was just a highly simplified way of explaining the need for pointers. When you call malloc, it allocates memory on the heap and gives you an address, which you must store in a pointer.
int var = 5;
int * ptr = &var;
In this case, ptr is a variable (it was declared in all its glory) that holds the address of another variable, and so it is called a pointer. Now consider an excerpt from the tutorial you mentioned:
struct node* head = NULL;
head = (struct node*)malloc(sizeof(struct node));
In this case, the variable head will point to data allocated on the heap at runtime.
If you keep allocating nodes on the heap and assigning the returned address to the next member of the last node in the linked list, you will be able to iterate over the linked list simply by writing pointer_to_node = pointer_to_node->next. Example:
struct node * my_node = head; // my_node points to the first node in the linked list
while (my_node != NULL) // the 'next' member of the last node is assumed to be NULL
{
    printf("%d\n", my_node->data); // print the data of the node we're iterating over
    my_node = my_node->next; // advance the my_node pointer to the next node
}
You can, of course, insert an element into any position of the linked list, not just at the end as I mentioned above. Note though that the only node you ever have a name for is head, all the others are accessed through pointers because you can't possibly name all nodes your program will ever have a hold of.

When you declare 'struct node xyz;' in a function, it exists only as long as that function is executing. If you add it to a linked list and then return from the function, that object no longer exists, but the linked list still holds a pointer to it. On the other hand, if you allocate it from the heap and add it to the linked list, it will exist until it is removed from the linked list and freed.
This mechanism allows an arbitrary number of nodes to be created at various times throughout your program and inserted into the linked list. The method you show above only allows a fixed number of specific items to be placed in the list for a short duration. You can do that, but it serves little purpose, since you could have just accessed the items directly outside the list.

Of course you can do it like that, but how far does it go? How many nodes are you going to create? We use linked lists when we don't know how many entries we will need at the time we create the list. So how would you create the nodes, and how many?
That's why we use malloc() (or new) to create nodes.

But what if you had a file containing an unknown number of entries, and you needed to iterate over them, adding each one to the linked list? Think about how you might do that without malloc.
You would have a loop, and in each iteration you need to create a completely new "instance" of a node, different to all the other nodes. If you just had a bunch of locals, each loop iteration they would still be the same locals.
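A sketch of that loop, assuming the file holds whitespace-separated integers; the function names append_node, read_list and free_list are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Allocate one fresh node, store its address in *link, and return the address
   of the new node's next member (where the following node will be hooked in).
   Returns NULL if malloc fails. */
struct node **append_node(struct node **link, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->data = value;
    n->next = NULL;
    *link = n;
    return &n->next;
}

/* Read integers until EOF or a non-number; the count is not known in advance.
   On allocation failure the nodes built so far are returned. */
struct node *read_list(FILE *fp)
{
    struct node *head = NULL;
    struct node **link = &head;
    int value;

    while (link != NULL && fscanf(fp, "%d", &value) == 1)
        link = append_node(link, value);
    return head;
}

/* Release every node once the list is no longer needed. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Each call to malloc yields a distinct node that outlives the loop iteration, which is exactly what a local variable declared inside the loop cannot provide.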

Your code and approach is correct as long as you know the number of nodes that you need in advance. In many cases, though, the number of nodes depends on user input and is not known in advance.
You definitely have to decide between C and C++, because typecasting and malloc belong in C only. Your C++ linked list code won't be doing typecasting nor using malloc precisely because it's not C code, but C++ code.

Say you are writing an application such as a text editor. The writer of the application has no idea how big a file a user in the future may want to edit.
Making the editor always use a large amount of memory is not helpful in multi-tasking environments, especially one with a large number of users.
With malloc() an editing application can take additional amounts of memory from the heap as required, with different processes using different amounts of memory, without large amounts of memory being wasted.

You can, and you can exploit this technique to write cute code like this, which in a way uses the stack as a malloc.
The code below should be safe enough, assuming no tail-call optimization is enabled.
#include <stdint.h>
#include <stdio.h>
typedef struct node_t {
    struct node_t *next;
    int cur;
    int n;
} node_t;
/* Print one node: its next pointer (converted to unsigned for %x), cur and n */
static void show(const char *label, const node_t *node)
{
    printf("%s: %x %d %d.\n", label, (unsigned)(uintptr_t)node->next, node->cur, node->n);
}
void factorial(node_t *state, void (*then)(node_t *))
{
    node_t tmp;
    if (state->n <= 1) {
        then(state);
    } else {
        tmp.next = state;
        tmp.cur = state->n * state->cur;
        tmp.n = state->n - 1;
        show("down", &tmp);
        factorial(&tmp, then);
        show("up", &tmp);
    }
}
void andThen(node_t *result)
{
    while (result != NULL) {
        show("printing", result);
        result = result->next;
    }
}
int main(void)
{
    node_t initial_state;
    initial_state.next = NULL;
    initial_state.n = 6; // factorial of
    initial_state.cur = 1; // identity for factorial
    factorial(&initial_state, andThen);
    return 0;
}
result:
$ ./fact
down: 28ff34 6 5.
down: 28ff04 30 4.
down: 28fed4 120 3.
down: 28fea4 360 2.
down: 28fe74 720 1.
printing: 28fe74 720 1.
printing: 28fea4 360 2.
printing: 28fed4 120 3.
printing: 28ff04 30 4.
printing: 28ff34 6 5.
printing: 0 1 6.
up: 28fe74 720 1.
up: 28fea4 360 2.
up: 28fed4 120 3.
up: 28ff04 30 4.
up: 28ff34 6 5.
factorial works differently than usual here: we can't return the result to the caller, because the nodes live in stack frames that are invalidated as soon as the functions return, and any later function call would overwrite them. Instead, we must pass the result to another function, whose own frame sits on top of the current one and so leaves intact the arbitrary number of stack frames below it that hold our nodes.
I imagine there are many ways for this to break other than tail-call optimization, but it's really elegant when it doesn't: the links are fairly cache-local, since they are close to each other, and no malloc/free is needed for an arbitrary number of consecutive allocations, since everything is cleaned up as soon as the returns happen.

Suppose you are making an application like the Chrome web browser and you want to create links between the tabs a user opens at run time. That is only possible with dynamic memory allocation.
That's why we use new, malloc(), etc. to allocate memory dynamically.

fast random acces to linked list nodes

I have a singly linked list which can have well over 10000 nodes at any given time.
In the interface I need to print these in order, and a user can access a single node and perform operations on it. Obviously, if the user chooses a node with a very high index, the code has to walk over thousands of nodes before reaching the desired one.
My current fix "translates" the linked list to an array. Since my code is multithreaded, the linked list can grow at any given time, but by design it never shrinks.
Here is the code I use to translate the linked list to an array.
unsigned int i=0;
unsigned int LL_arr_bufsize=128;
my_ll **LL_arr;
my_ll *temp;
LL_arr = malloc(LL_arr_bufsize * sizeof(my_ll *));
// err check mem alloc
temp = l_list->next;
while (temp != NULL) {
LL_arr[i] = temp;
temp = temp->next;
if (++i == LL_arr_bufsize) {
LL_arr_bufsize = LL_arr_bufsize * 2;
LL_arr = realloc(LL_arr, LL_arr_bufsize * sizeof(my_ll *));
// err check mem alloc
}
}
What I am basically wondering is whether there is a better way to access any given node without incurring the overhead of traversing the entire list before that node.

I will probably get downvoted because I literally just thought of this idea and it might have some flaws. Here it goes.
What if you use a two-dimensional node stack? Hear me out.
NodeList: holds an array of 10 nodes and its own index (you can experiment with bigger values).
NodeList is a regular linked list that you can enqueue to and dequeue from, but you still get some of the constant-time lookup you are looking for. A search function walks the linked list normally; once it reaches the NodeList that holds your particular node, it gets that node from the stored array in constant time.
I can clarify more of this concept if you want, but I think the description gives a good picture of what I'm going for.
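One way to read this idea is as an unrolled linked list: each list element holds a small array of node pointers. The sketch below rests on that reading and on the assumption that every chunk except possibly the last is full, which fits a list that only grows. The names chunk, item and chunk_get are hypothetical.

```c
#include <stddef.h>

#define CHUNK 10   /* nodes per chunk; larger values mean fewer hops */

struct item {      /* hypothetical payload type */
    int value;
};

struct chunk {
    struct item *slots[CHUNK];
    size_t used;               /* filled slots in this chunk */
    struct chunk *next;
};

/* Return the item at zero-based index i, or NULL if it does not exist.
   Assumes every chunk except possibly the last one is full. */
struct item *chunk_get(struct chunk *c, size_t i)
{
    while (c != NULL && i >= CHUNK) {   /* skip whole chunks, CHUNK items at a time */
        c = c->next;
        i -= CHUNK;
    }
    if (c == NULL || i >= c->used)
        return NULL;
    return c->slots[i];                 /* constant-time lookup inside the chunk */
}
```

The lookup still walks the list, but in roughly n / CHUNK steps instead of n, so it reduces the traversal cost rather than removing it.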
