Find number of records in an array of structures - c

Suppose we have a structure array of up to 50 elements, filled one element at a time by a buffer-write function. How do I find the current number of records in the array if the maximum number of items has not been reached?
typedef struct
{
    remoteInstructionReceived_t instruction;
    uint16_t parameter;
} instructionData_type;

instructionData_type commandBuffer[50];

C arrays are fixed-size: there are always exactly 50 objects in your array. If your program logic requires some of them to be "inactive" (e.g. not written yet), you must keep track of such information separately. For example, you could use a size_t variable to store the number of "valid" entries in the array.
An alternative would be to designate a value of remoteInstructionReceived_t as a terminator, similarly to how 0 is used as a terminator for NUL-terminated strings. Then, you wouldn't have to track the "useful length" of the array separately, but you'd have to ensure a terminator always follows the last valid item in it.
In general, length tracking is likely both more efficient and more maintainable; I mention the second (terminator) option only for completeness.
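For illustration, here is a minimal sketch of the counter approach applied to the question's buffer. The typedef for remoteInstructionReceived_t is a stand-in, and bufferWrite/commandCount are made-up names, not anything from the original code:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int remoteInstructionReceived_t;   /* stand-in; defined elsewhere in the real project */

typedef struct
{
    remoteInstructionReceived_t instruction;
    uint16_t parameter;
} instructionData_type;

#define COMMAND_BUFFER_SIZE 50

static instructionData_type commandBuffer[COMMAND_BUFFER_SIZE];
static size_t commandCount = 0;   /* number of valid records written so far */

/* append a record; returns false once the buffer is full */
static bool bufferWrite(instructionData_type item)
{
    if (commandCount >= COMMAND_BUFFER_SIZE)
        return false;
    commandBuffer[commandCount++] = item;
    return true;
}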

You can't: C doesn't have a way of knowing whether a variable "has a value". All values are values, and no value is more real than any other.
The answer is that additional state, i.e. some form of counter variable, is required to hold this information. Typically you update it when inserting new records, so you know where the next record should go.

Have you considered using a different data structure? You can wrap your structure to allow the creation of a linked list, for example. Deletion then becomes a real removal: you just free the node's memory. Besides, it's more efficient for some kinds of operation, such as adding elements in the middle of the list.

Related

How best can a Sentinel Value be established when the full range of input is possible?

When parsing a file I need to detect whether an item with minimum and maximum occurrence of 1 has been processed already. Later on in validation I need to detect if it was not processed at all.
I can do this with a count variable that increments each time, but that is cumbersome and inelegant. Perhaps a boolean flag would do. In general I would use some form of Sentinel Value, such as NULL for a pointer, "" for a statically allocated string array, or memset() to zero for many items.
The problem is that if the full range of the data type is potentially valid input, it gets very sticky trying to pick a sentinel.
If the type is signed and only positive numbers are used, the sentinel can be any negative number. If the data type is unsigned but values that would use the sign bit are not in use, that upper half of the range (or the corresponding negative range of the signed counterpart) can serve as the sentinel.
If a larger data type can be used to store the value, the added range can be used for the SV, although this may raise issues with type compatibility, truncation, and promotion.
In an enum I can add an entry, creating an SV.
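In C, that extra enum entry might look like this (a purely illustrative sketch; the names are made up):

typedef enum {
    COLOR_UNSET = -1,   /* sentinel: not parsed yet */
    COLOR_RED,
    COLOR_GREEN,
    COLOR_BLUE
} color_t;

color_t background = COLOR_UNSET;
/* later: if (background == COLOR_UNSET) the field was never read in */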
It gets difficult to keep track of all these different ways of indicating, for each member of a structure, whether it has been initialized or not.
I almost forgot - an easy and universal way could be to make every variable dynamically allocated and initialized to NULL, even integers. Though a bit strange and perhaps slightly wasteful of memory, this would be highly consistent and would also let the boolean logic of conditional statements work, e.g.:
if (age) printf("Age is a valid variable with value: %d", *age);
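A minimal sketch of that pointer-as-optional idea (purely illustrative):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *age = NULL;               /* NULL means "no value parsed yet" */

    /* ...during parsing, when a value is actually found... */
    age = malloc(sizeof *age);
    if (age != NULL)
        *age = 42;

    /* the pointer itself records whether the field was ever set */
    if (age)
        printf("Age is a valid variable with value: %d\n", *age);
    else
        printf("Age was never read in\n");

    free(age);
    return 0;
}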
Edit to clarify the question (no changes above):
I am parsing logs from another application (there is no documentation on the format). The log entries include data structures/objects, and the files also contain occasional corrupt entries, because another thread sometimes writes to them without synchronizing access.
The structures have members of any base type, e.g. integer, string, sub-structure, in different quantities, e.g. 1, 0-1, or 1-N. It gets more complicated if you add the rules on valid combinations and valid sequences.
It might be easiest for me to define everything as an array with an associated counter variable.
I was motivated to ask about this because managing the initialization, and checking whether a variable has already been read in, starts to get overwhelming.
The next stage - input validation - is even more difficult.
"The problem is that if the full range of the data type is potentially valid input, it gets very sticky trying to pick a sentinel."
I would say that if that is the situation, there is no way to make a sentinel. You might get lucky if the data type in question has a trap representation (which essentially means that there are some bit patterns that you can store in the data type, but which are not interpretable as a value in the data type), which you could (ab)use.
Other than that, I think you need to resort to some secondary way (variable) to achieve your goal.
As a side note: sometimes it is practical (but not safe) to reason about what values would be valid but extremely unlikely as input. You might use such a "special" value as a sentinel, but you would have to provide some way to determine whether, when you encounter such a "special" value, it truly is a sentinel or a valid input.
Think of an array of doubles: you could use the value of pi to 30 significant digits, if it is highly unlikely that you would ever encounter that number as valid input, say in accounting software. But you would still need some handler for the sentinel value to determine whether it truly is a sentinel or, indeed, valid but improbable.
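A small sketch of that idea in C; the chosen value and the helper name are made up for illustration, and, as noted above, some out-of-band check would still be needed if the value could ever occur as genuine input:

#include <stdbool.h>
#include <string.h>

/* an improbable-but-representable double designated as the sentinel */
static const double SENTINEL = 3.14159265358979323846;

/* compare object representations; unlike ==, this would also work if a
   NaN were chosen as the sentinel */
static bool is_sentinel(double x)
{
    return memcmp(&x, &SENTINEL, sizeof x) == 0;
}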

Array vs singly linked list vs doubly linked list

I am learning about arrays, singly linked lists, and doubly linked lists these days, and this question came up:
"What is the best option between these three data structures when it comes to fast searching, less memory, easy insertion and updating of things?"
As far as I know, an array cannot be the answer, because it has a fixed size: if we want to insert a new item, it wouldn't always be possible. A doubly linked list can do the task, but it needs two pointers for each node, so there is a memory cost; therefore I think a singly linked list fulfils all the given requirements. Am I right? Please correct me if I am missing any point. One more question: instead of choosing one of them, can I combine two or more of these data structures to meet all the requirements?
"What is the best option between these three data structures when it comes to fast searching, less memory, easily insertion and updating of things".
As far as I can tell Arrays serve the purpose.
Fast search: You could do binary search if array is sorted. You dont get that option in linkedlist
Less memory: Arrays will take least memory (but contiguous memory )
Insertion: Inserting in array is a matter of a[i] = "value". If array size is exceeded then simply export data into a new array. That is exactly how HashMaps / ArrayLists work under covers.
Updating things: Only Arrays provide you with Random access. a[i] ="new value".. updated in O(1) time if you know the index.
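As a rough sketch of that grow-when-full behaviour in C (the names are illustrative, not any standard API):

#include <stdlib.h>

typedef struct {
    int    *items;
    size_t  count;      /* elements currently stored */
    size_t  capacity;   /* elements the buffer can hold */
} int_vector;

/* append a value, doubling the backing array when it is full;
   returns 0 on success, -1 on allocation failure */
static int vector_push(int_vector *v, int value)
{
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 8;
        int *bigger = realloc(v->items, new_cap * sizeof *bigger);
        if (bigger == NULL)
            return -1;
        v->items = bigger;
        v->capacity = new_cap;
    }
    v->items[v->count++] = value;
    return 0;
}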
Each of those has its own benefits and downsides.
For search speed, I'd say arrays are better suited, due to the quick lookup times.
Since an array is a sequence of same-size elements, retrieving the value at an index is just memoryLocation + index * elementSize. For a linked list, the whole list needs traversing.
Arrays also win in the "less memory" category, since there's no need to store extra pointers.
For insertions, arrays are slow. You'll need to traverse the array, copy contents to a new array, assign the new array, delete the old one...
Insertions go much quicker in singly or doubly linked lists, because it's just a matter of changing one or two pointers.
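For example, inserting after a known node in a singly linked list touches exactly two pointers (a small illustrative sketch):

#include <stdlib.h>

struct node {
    int          value;
    struct node *next;
};

/* insert a new node directly after 'pos'; no shifting or copying of
   other elements is needed */
static struct node *insert_after(struct node *pos, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next  = pos->next;   /* pointer change 1 */
    pos->next = n;          /* pointer change 2 */
    return n;
}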
In the end, it all just depends on the use case. Are you inserting a lot? Then you probably want to consider a non-array structure.
Do you need many quick lookups? Consider those arrays again. Etc..
A linked list is usually the best choice when we don’t know in advance the number of elements we will have to store or the number can change dynamically.
Arrays have slow insertion and deletion times. To insert an element at the front or middle of an array, the first step is to ensure that there is space for the new element; otherwise the array needs to be resized, which is an expensive operation. The next step is to open space for the new element by shifting every element after the desired index. Likewise, shifting is required after removing an element. This implies that insertion time for arrays is O(n), as up to n elements must be shifted.
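A sketch of that shifting step for an insertion into the middle of a C array, assuming the array already has room for one more element:

#include <string.h>

/* insert 'value' at 'index' in 'a', which currently holds 'count' elements
   and has capacity for at least count + 1 */
static void array_insert(int *a, size_t count, size_t index, int value)
{
    /* shift everything from 'index' onward one slot to the right: O(n) */
    memmove(&a[index + 1], &a[index], (count - index) * sizeof a[0]);
    a[index] = value;
}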
Using static arrays, we can save some extra memory in comparison to linked lists, because we do not need to store pointers to the next node.
A doubly linked list supports fast insertion and removal at both ends. This is used in an LRU cache, where you need to add new items at the front and remove the oldest item from the end.
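A compact sketch of those end operations on a doubly linked list (illustrative only, not a full LRU cache):

#include <stdlib.h>

struct dnode {
    int           value;
    struct dnode *prev, *next;
};

struct deque {
    struct dnode *head, *tail;   /* both NULL when the list is empty */
};

/* O(1): add the newest item at the front */
static struct dnode *push_front(struct deque *d, int value)
{
    struct dnode *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->prev = NULL;
    n->next = d->head;
    if (d->head)
        d->head->prev = n;
    else
        d->tail = n;
    d->head = n;
    return n;
}

/* O(1): remove the oldest item from the back */
static void pop_back(struct deque *d)
{
    struct dnode *last = d->tail;
    if (last == NULL)
        return;
    d->tail = last->prev;
    if (d->tail)
        d->tail->next = NULL;
    else
        d->head = NULL;
    free(last);
}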

Delete a specific element from a Fortran array

Is there a function in Fortran that deletes a specific element in an array, such that the array upon deletion shrinks its length by the number of elements deleted?
Background:
I'm currently working on a project that contains sets of populations with corresponding descriptions of the individuals (e.g. age, age at death, and so on).
A method I use is to loop through the array, find which elements I need, place them in another array, and deallocate the previous array; before the next time step, this array is moved back into the original array before going through the subroutines that again find the elements that are not needed.
You can use the PACK intrinsic function and intrinsic assignment to create an array value that is comprised of selected elements from another array. Assuming array is allocatable, and the elements to be removed are nominated by a logical mask logical_mask that is the same size as the original value of array:
array = PACK(array, .NOT. logical_mask)
Succinct syntax for a single element nominated by its index is:
array = [array(:index-1), array(index+1:)]
Depending on your Fortran processor, the above statements may result in the compiler creating temporaries that may impact performance. If this is problematic then you will need to use the subroutine approach that you describe.
Maybe you want to look into linked lists. You can insert and remove items and the list automatically resizes. This resource is pretty good.
http://www.iag.uni-stuttgart.de/IAG/institut/abteilungen/numerik/images/4/4c/Pointer_Introduction.pdf
To continue the discussion, the solution you might want to implement depends on the number of deletions and accesses you perform, where you insert/delete elements (at the start, at the end, at random positions?), how you access the data (from first to last, at random positions?), and what your efficiency requirements are in terms of CPU and memory.
Then you might want to go for a linked list or for static or dynamic vectors (other types of data structure might also fit your needs better).
For example:
a static vector can be used when you want to access a lot of elements randomly and know the maximum number nmax of elements in the vector. Simply use an array of nmax elements with an associated length variable that tracks the last element. A deletion can be done simply and quickly by exchanging the last element with the deleted one and reducing the length (see the sketch after this list).
a dynamic vector can be implemented when you don't know the maximum number of elements. In order to avoid a systematic allocate+copy+deallocate cycle for each deletion/insertion, you fix a maximum number of elements (as above) and only increase the size (e.g. nmax becomes 10*nmax, then reallocate and copy) when the limit is reached (the reverse scheme can also be implemented to reduce the number of elements).
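The bookkeeping for the swap-with-last deletion is easy to express in any language; here is a small C sketch of the idea (the Fortran version is analogous, and the names are made up):

#include <stddef.h>

/* delete element 'index' from 'a', which holds '*count' valid elements,
   by overwriting it with the last element and shrinking the length: O(1) */
static void swap_delete(double *a, size_t *count, size_t index)
{
    a[index] = a[*count - 1];
    --*count;
}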

Is it more efficient to use a linked list and delete nodes, or use an array and do a small computation on a string to see if the element can be skipped?

I am writing a program in C that reads a file. Each line of the file is a string of characters on which a computation will be done. The result of the computation on a particular string may imply that strings later in the file do not need any computation done to them. Also, if the reverse of a string comes alphabetically before the (current, non-reversed) string, it does not need to be checked.
My question is: would it be better to put each string in a linked list and delete each node once I find that particular strings don't need to be checked, or to use an array, check the last few characters of each string, and skip it if it comes alphabetically after the string in the previous element? Either way, the list or array only needs to be iterated through once.
The rule of thumb is that if you are dealing with small objects (< 32 bytes), std::vector is better than a linked list for most general operations.
But for larger objects (say, 1 KB), you generally need to consider lists.
There is an article that details the comparison; the link is here:
http://www.baptiste-wicht.com/2012/11/cpp-benchmark-vector-vs-list/3/
Without further details about your needs, it is a bit difficult to tell you which one would better fit your requirements.
Arrays are easy to access, especially if you are going to do it in a non-sequential way, but they are hard to maintain if you need to perform deletions on them or if you don't have a good approximation of the final number of elements.
Lists are good if you plan to access them sequentially, but terrible if you need to jump between elements. On the other hand, deletion can be done in constant time if you are already at the node you want to delete.
I don't quite understand how you plan to access them, since you say that either one would be iterated through just once; if that is the case, then either structure would give you similar performance, since you are not really taking advantage of their key benefits.
It's really difficult to understand what you are trying to do, but it sounds like you should create an array of records, with each record holding one of your strings and a boolean flag to indicate whether it should be processed.
You set each record's flag to true as you load the array from the file.
You use one pointer to scan the array once, processing only the strings from records whose flags are still true.
For each record processed, you use a second pointer to scan from the first pointer + 1 to the end of the array, identify strings that won't need processing (in light of the current string), and set their flags to false.
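A rough sketch of that flag-based, two-pointer scan; the record layout and the can_skip rule are placeholders, not the actual logic from the question:

#include <stdbool.h>
#include <stddef.h>

struct record {
    const char *text;
    bool        process;   /* true until we learn it can be skipped */
};

/* placeholder for the real rule deciding whether 'other' can be skipped
   once 'current' has been processed */
static bool can_skip(const struct record *current, const struct record *other)
{
    (void)current;
    (void)other;
    return false;
}

static void scan(struct record *recs, size_t n)
{
    for (size_t i = 0; i < n; ++i) {              /* first pointer */
        if (!recs[i].process)
            continue;
        /* ...do the real computation on recs[i].text here... */
        for (size_t j = i + 1; j < n; ++j)        /* second pointer */
            if (recs[j].process && can_skip(&recs[i], &recs[j]))
                recs[j].process = false;
    }
}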

Multi dimensional array with varying size

I want to make a 2D array "data" with the following dimensions: data(T,N)
T is a constant, and I don't know anything about N to begin with. Is it possible to do something like this in Fortran?
c = 0
do i = 1, T
   ! check a few flags
   if (all_flags_ok) then
      c = c + 1
      data(i,c) = some_value
   end if
end do
Basically I have no idea about the second dimension. Depending on some flags, if those flags are fine, I want to keep adding more elements to the array.
How can I do this?
There are several possible solutions. You could make data an allocatable array and guess a maximum value for N. As long as you don't exceed N, you keep adding data items. If a new item would exceed the array size, you create a temporary array, copy data to the temporary array, deallocate data, and reallocate it with a larger dimension.
Another design choice would be to use a linked list. This is more flexible in that the length is indefinite, but you lose "random access" in that the list is chained rather than indexed. You create a user-defined type that contains various data, e.g. scalars, arrays, whatever, and also a pointer. When you add a list item, the pointer points to that next item. This is possible in Fortran >= 90, since pointers are supported.
I suggest searching the web or reading a book about these data structures.
Assuming what you wrote is more or less how your code really goes, you do know one thing: N cannot be greater than T, since c is incremented at most once per loop iteration. You would not have to change your do loop, but you will definitely need to allocate data before the loop (for example as data(T,T), since N cannot exceed T).
