Buffer management in c - c

I have a buffer of size 2000, and the amount of data to be inserted is unlimited. Once more than 2000 items have arrived, I want new data to be added at the end of the buffer, i.e. shift all existing data from right to left (dropping the oldest) and insert the new data at the end. What kind of algorithm or flow should I use?

You want a FIFO implemented as a circular buffer (ring buffer). See http://en.wikipedia.org/wiki/Circular_buffer for a complete explanation and example code.
Depending on your actual needs, the implementation can differ. If, for example, you always access the 2000 items sequentially, you can omit the read pointer, since its position can always be derived from the write pointer.
Edit: A queue is something similar. If you are using C++, consider std::queue: http://www.cplusplus.com/reference/stl/queue/
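As a rough illustration of the circular-buffer idea, here is a minimal sketch in C. It assumes the elements are int and that overwriting the oldest item is acceptable once the buffer is full; the names Ring, ring_push and ring_get are made up for this example and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAPACITY 2000  /* buffer size from the question */

typedef struct {
    int    data[RING_CAPACITY];  /* element type is an assumption */
    size_t head;                 /* index where the next item is written */
    size_t count;                /* valid items, at most RING_CAPACITY */
} Ring;

static void ring_init(Ring *r)
{
    r->head = 0;
    r->count = 0;
}

/* Append one item; once the buffer is full the oldest item is overwritten,
   so nothing ever has to be shifted. */
static void ring_push(Ring *r, int value)
{
    r->data[r->head] = value;
    r->head = (r->head + 1) % RING_CAPACITY;
    if (r->count < RING_CAPACITY)
        r->count++;
}

/* Read the i-th item in arrival order (i = 0 is the oldest one kept). */
static int ring_get(const Ring *r, size_t i)
{
    assert(i < r->count);
    size_t oldest = (r->head + RING_CAPACITY - r->count) % RING_CAPACITY;
    return r->data[(oldest + i) % RING_CAPACITY];
}
```

Note that only a write index and a count are stored; the position of the oldest item is computed from them, which is the "omit the read pointer" simplification mentioned above.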

Related

Declaring a 10 item array as Dim items(10) As String. Any disadvantages? [duplicate]

In declaring an array in VB, would you ever leave the zero element empty and adjust the code to make it more user friendly?
This is for Visual Basic 2008
No, I wouldn't do that. It seems like it might help maintainability, but that's a very short-sighted view.
Think about it this way. It only takes each programmer who has to understand and maintain the code a short amount of time to get comfortable with zero-indexed arrays. But if you're using one-based arrays, which are unlike those found in almost all other VB.NET code, and in fact almost every other common programming language, it will take everyone on the team much longer. They'll be constantly making mistakes, tripping up because their natural assumptions aren't accurate in this one special case.
I know how it feels. When I worked in VB 6, I loved one-based arrays. They were very natural for the type of data that I was storing, and I used them all over the place. Perfectly documentable here, because you have an explicit syntax to specify the upper and lower bounds of the array. That's not the case in VB.NET (which is a newer, but incompatible version of the Visual Basic language), where all arrays have to be zero-indexed. I had a hard time switching to VB.NET's zero-based arrays for the first couple of days. After that initial period of adjustment, I can honestly say I've never looked back.
Some might argue that leaving the first element of every array empty would consume extra memory needlessly. While that's obviously true, I think it's a secondary reason behind the one I presented above. Good developers write code for others to read, so I commend you for considering how to make your code logical and understandable. You're on the right path by asking this question. But in the long run, I don't think this decision is a good one.
There might be a handful of exceptions in very specific cases, depending on the type of data that you're storing in the array. But again, failing to do this across the board seems like it would hurt readability in the aggregate, rather than helping it. It's not particularly counter-intuitive to simply write the following, once you've learned how arrays are indexed:
For i As Integer = 0 To (myArray.Length - 1)
    'Do work
Next
And remember that in VB.NET, you can also use the For Each statement to iterate through your array elements, which many people find more readable. For example:
For Each i As Integer In myArray
    'Do work
Next
First, this is about being programmer-friendly, not user-friendly. The user will never know whether the code is 0-based or 1-based.
Second, 0-based is the default and will be used more and more.
Third, 0-based is more natural to the computer. At the most basic level, a bit has two states, 0 and 1, not 1 and 2.
I have upgraded a couple of VB6 projects to VB.NET. Converting to 0-based arrays at the beginning is better than debugging the code later.
Most of my VB.Net arrays are 0-based and every element is used. That's usual in VB.Net and code mustn't surprise the reader. Readability is vital.
Any exceptions? Maybe if I had a program ported from VB6, so it used 0-based arrays with unused initial elements, and it needed a small change, I might match the pattern of the existing code. Least surprise again.
99 times out of 100 the question shouldn't arise because you should be using List(Of T) rather than an array!
Who are the "users" that are going to see the array indexes? Any good developer will be able to handle a zero-indexed array and no real user should ever see them. If the user has to interact with the array, then make an actually user-friendly system for doing so (text or a 1-based virtual index or whatever is called for).
In classic Visual Basic (VB6 and earlier) it is possible to declare an array starting from 1, if you find a 0-based array inconvenient. VB.NET only accepts a lower bound of 0.
Dim array(1 To 10) As Integer
It is just a matter of taste. I use 1-based arrays in Visual Basic but 0-based arrays in C ;)

Optimal method for handling a changing array in Fortran

Let's say I have a 2D array. Along the first axis I have a series of properties for one individual measurement. Along the second axis I have a series of measurements.
So, for example, the array could look something like this:
          personA   personB   personC
height    1.8       1.75      2.0
weight    60.5      82.0      91.3
age       23.1      65.8      48.5
or anything similar.
I want to change the size of the array very often - for example, ignoring personB's data and including personD and personE. I will be looping through "time", probably with >10^5 timesteps. Each timestep, there is a chance that each "person" in the array could be deleted and a chance that they will introduce several new people into the simulation.
From what I can see there are several ways to manage an array like this:
Overwriting and infrequent reallocation
I could use a very large array with an extra column, in which I put a "skip" flag. So, if I decide I no longer need personB, I set the flag to 1 and ignore personB every time I loop through the list of people. When I need to add personD, I search through the list for the first person with skip == 1, replace the data with the data for personD, and set skip = 0. If there aren't any people with skip == 1, I copy the array, deallocate it, reallocate it with several more columns, and then fill the first new column with personD's data.
Advantages:
infrequent allocation - possibly better performance
easy access to array elements
easier to optimise
Disadvantages:
if my array shrinks a lot, I'll be wasting a lot of memory
I need a whole extra row in the data, and I have to perform checks to make sure I don't use the irrelevant data. If the array shrinks from 1000 people to 1, I'm going to have to loop through 999 extra records
could encounter memory issues, if I have a very large array to copy
Frequent reallocation
Every time I want to add or remove some data, I copy and reallocate the entire array.
Advantages:
I know every piece of data in the array is relevant, so I don't have to check them
easy access to array elements
no wasted memory
easier to optimise
Disadvantages:
probably slow
could encounter memory issues, if I have a very large array to copy
A linked list
I refactor everything so that each individual's data includes a pointer to the next individual's data. Then, when I need to delete an individual I simply remove it from the pointer chain and deallocate it, and when I need to add an individual I just add some pointers to the end of the chain.
Advantages:
every record in the chain is relevant, so I don't have to do any checks
no wasted memory
less likely to encounter memory problems, as I don't have to copy the entire array at once
Disadvantages:
no easy access to array elements. I can't slice with data(height,:), for example
more difficult to optimise
I'm not sure how this option will perform compared to the other two.
--
So, to the questions: are there other options? When should I use each of these options? Is one of these options better than all of the others in my case?
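For comparison, the first option ("overwriting and infrequent reallocation") can be sketched as follows. This is written in C rather than Fortran, purely as an illustration of the bookkeeping. The Person and People types, the function names, and the doubling growth policy are assumptions of this sketch, not part of the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    double height, weight, age;
    bool   skip;   /* true: slot is free and must be ignored in loops */
} Person;

typedef struct {
    Person *slots;
    size_t  capacity;
} People;

/* Store p in the first free slot; reallocate only when no slot is free.
   Growth by doubling is an assumption made to keep reallocations rare.
   Returns the slot index, or (size_t)-1 if allocation fails. */
static size_t people_add(People *ps, Person p)
{
    p.skip = false;
    for (size_t i = 0; i < ps->capacity; i++) {
        if (ps->slots[i].skip) {
            ps->slots[i] = p;
            return i;
        }
    }
    size_t new_cap = ps->capacity ? ps->capacity * 2 : 4;
    Person *grown = realloc(ps->slots, new_cap * sizeof *grown);
    if (!grown)
        return (size_t)-1;
    for (size_t i = ps->capacity; i < new_cap; i++)
        grown[i].skip = true;          /* mark the new slots as free */
    size_t idx = ps->capacity;
    grown[idx] = p;
    ps->slots = grown;
    ps->capacity = new_cap;
    return idx;
}

/* Deleting a person only sets the flag; no data is moved. */
static void people_remove(People *ps, size_t i)
{
    assert(i < ps->capacity);
    ps->slots[i].skip = true;
}
```

Starting from People ps = {NULL, 0}, removing a person and then adding another reuses the freed slot instead of reallocating. The linear search for a free slot is the cost listed under the disadvantages of this option.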

Is it more efficient to use a linked list and delete nodes or use an array and do a small computation to a string to see if element can be skipped?

I am writing a program in C that reads a file. Each line of the file is a string of characters on which a computation will be done. The result of the computation on a particular string may imply that strings later on in the file do not need any computation done on them. Also, if the reverse of a string comes alphabetically before the (current, non-reversed) string, then it does not need to be checked.
My question is: would it be better to put each string in a linked list and delete each node once I find that a particular string doesn't need to be checked, or to use an array, check the last few characters of each string, and skip it if it comes alphabetically after the string in the previous element? Either way, the list or array only needs to be iterated through once.
A rule of thumb is that if you are dealing with small objects (< 32 bytes), std::vector is better than a linked list for most general operations.
But for larger objects (say, 1K bytes), you generally need to consider lists.
There is an article that details the comparison; the link is here:
http://www.baptiste-wicht.com/2012/11/cpp-benchmark-vector-vs-list/3/
Without further details about your needs, it is a bit difficult to tell which one would fit your requirements better.
Arrays are easy to access, especially if you are going to do it in a non-sequential way, but they are hard to maintain if you need to perform deletions or if you don't have a good approximation of the final number of elements.
Lists are good if you plan to access them sequentially, but terrible if you need to jump between elements. Deletion can be done in constant time if you are already at the node you want to delete.
I don't quite understand how you plan to access them, since you say that either one would be iterated just once; but if that is the case, then either structure would give you similar performance, since you are not really taking advantage of their key benefits.
It's really difficult to understand what you are trying to do, but it sounds like you should create an array of records, with each record holding one of your strings and a boolean flag to indicate whether it should be processed.
You set each record's flag to true as you load the array from the file.
You use one pointer to scan the array once, processing only the strings from records whose flags are still true.
For each record processed, you use a second pointer to scan from the first pointer + 1 to the end of the array, identify strings that won't need processing (in light of the current string), and set their flags to false.
-Al.
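The flag-per-record scheme described above could look roughly like this in C. The actual rule that decides whether a later string can be skipped is not given in the question, so it is passed in as a callback; Record, SkipRule, process_records and the same_first_char placeholder rule are all hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *text;
    bool        active;   /* true while the string still needs processing */
} Record;

/* Hypothetical rule type: true if `later` can be skipped given `current`. */
typedef bool (*SkipRule)(const char *current, const char *later);

/* Placeholder rule for illustration only: skip later strings that start
   with the same character as the current one. */
static bool same_first_char(const char *current, const char *later)
{
    return current[0] == later[0];
}

/* First index i scans the array once; second index j clears the flags of
   later records that no longer need work. Returns the number processed. */
static size_t process_records(Record *recs, size_t n, SkipRule can_skip)
{
    assert(recs != NULL || n == 0);
    size_t processed = 0;
    for (size_t i = 0; i < n; i++) {
        if (!recs[i].active)
            continue;
        /* ... the real computation on recs[i].text would go here ... */
        processed++;
        for (size_t j = i + 1; j < n; j++) {
            if (recs[j].active && can_skip(recs[i].text, recs[j].text))
                recs[j].active = false;
        }
    }
    return processed;
}
```

Nothing is ever deleted or moved, so the array stays contiguous; the price is the inner scan over the remaining records for each string that is processed.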

Fast way to remove bytes from a buffer

Is there a faster way to do this:
Vector3* points = malloc(maxBufferCount * sizeof(Vector3));
// put content into the buffer and increment bufferCount
...
// remove one point at index `removeIndex`
bufferCount--;
for (int j = removeIndex; j < bufferCount; j++) {
    points[j] = points[j + 1];
}
I'm asking because I have a huge buffer from which I remove elements quite often.
No, sorry - removing elements from the middle of an array takes O(n) time. If you really want to modify the elements often (i.e. remove certain items and/or add others), use a linked list instead - that has constant-time removal and addition once you hold a pointer to the node. In contrast, arrays have constant lookup time, while linked lists can only be accessed (read) in linear time. So decide what you will do more frequently (reading or writing) and choose the appropriate data structure based upon that decision.
Note, however, that I (kindly) assumed you are not trying to commit the crime of premature optimization. If you haven't benchmarked that this is the bottleneck, then probably just don't worry about it.
Unless you know it's a bottleneck you can probably let the compiler optimize for you, but you could try memmove.
The selected answer here is pretty comprehensive: When to use strncpy or memmove?
A description is here: http://www.kernel.org/doc/man-pages/online/pages/man3/memmove.3.html
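A minimal sketch of the memmove variant, assuming Vector3 is a plain struct of three floats (the question does not show its definition) and using a made-up helper name, remove_point:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { float x, y, z; } Vector3;  /* assumed layout */

/* Remove points[index] by shifting the tail down by one element.
   memmove is used because source and destination overlap.
   Returns the new element count. */
static size_t remove_point(Vector3 *points, size_t count, size_t index)
{
    assert(index < count);
    memmove(&points[index], &points[index + 1],
            (count - index - 1) * sizeof points[0]);
    return count - 1;
}
```

This is still O(n) per removal, just like the loop in the question; any speed-up comes only from the library's optimised copy, so it is worth measuring before relying on it.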
A few things to say. The memmove function will probably copy faster than your loop; it is often optimised by the writers of your particular compiler or C library to use special instructions which aren't available in the C language without inline assembler. I believe these instructions are called SIMD instructions (Single Instruction Multiple Data)? Somebody correct me if I am wrong.
If you can save up items to be removed, then you can optimise by sorting the list of items you wish to remove and then doing a single pass. It isn't hard but just takes some funny arithmetic.
Also, you could just store each item in a linked list; removing an item is trivial, but you lose random access to your array.
Finally, you can have an additional array of pointers, the same size as your array, each pointer pointing to an element. Then you can access the array through double indirection, you can sort the array by swapping pointers, and you can delete items by making their pointer NULL.
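One way the single-pass batch removal might look, as a sketch: it assumes int elements for brevity and that the removal indices are already sorted ascending without duplicates (for example after a call to qsort). The function name remove_sorted is invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Remove the elements whose indices are listed in `remove`, which must be
   sorted ascending, free of duplicates, and all less than `count`.
   Each surviving element is copied at most once. Returns the new count. */
static size_t remove_sorted(int *items, size_t count,
                            const size_t *remove, size_t remove_count)
{
    size_t write = 0;   /* next position to fill with a surviving element */
    size_t next = 0;    /* next entry of `remove` still to be matched */
    for (size_t read = 0; read < count; read++) {
        if (next < remove_count && remove[next] == read) {
            /* check the precondition: strictly ascending indices */
            assert(next + 1 == remove_count || remove[next + 1] > read);
            next++;
            continue;   /* drop this element */
        }
        items[write++] = items[read];
    }
    return write;
}
```

Removing k items this way costs one pass over the array instead of k separate shifts.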
Hope this gives you some ideas. There usually is a way to optimise things, but then it becomes more application specific.

Most appropriate MPI_Datatype for "block decomposition"?

With the help from Jonathan Dursi and osgx, I've now done the "row decomposition" among the processes:
row http://img535.imageshack.us/img535/9118/ghostcells.jpg
Now, I'd like to try the "block decomposition" approach (pictured below):
block http://img836.imageshack.us/img836/9682/ghostcellsblock.jpg
How should one go about it? This time, the MPI_Datatype will be necessary, right? Which datatype would be most appropriate/easy to use? Or can it plausibly be done without a datatype?
You can always make do without a datatype by just creating a buffer, copying the data into it, and sending the buffer as a count of the underlying type; that's conceptually the simplest. On the other hand, it is slower and it actually involves a lot more lines of code. Still, it can be handy when you're trying to get something to work, and then you can implement the datatype-y version alongside that and make sure you're getting the same answers.
For the ghost-cell filling, in the i direction you don't need a type, as it's similar to what you had been doing; but you can use one, MPI_Type_contiguous, which just specifies a count of some type (which you can do anyway in your send/recv).
For ghost-cell filling in the j direction, probably easiest is to use MPI_Type_vector. If you're sending the rightmost column of (say) a row-major array with i=0..N-1, j=0..M-1, you want to send a vector with count=N, blocklength=1, stride=M. That is, you're sending count chunks of 1 value, each separated by M values in the array.
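To make the count / block length / stride arithmetic concrete, here is a sketch of the manual "copy into a buffer" alternative in plain C. It contains no MPI calls; pack_strided is a hypothetical helper, and a row-major array of doubles is assumed. The packed buffer is what you would then pass to a send as a count of the underlying type.

```c
#include <assert.h>
#include <stddef.h>

/* Copy `count` blocks of `blocklength` doubles, whose starting positions
   are `stride` doubles apart, into the contiguous buffer `out`.
   Returns the number of values copied (count * blocklength). */
static size_t pack_strided(const double *src, size_t count,
                           size_t blocklength, size_t stride, double *out)
{
    assert(blocklength <= stride);  /* blocks must not overlap */
    size_t k = 0;
    for (size_t b = 0; b < count; b++)
        for (size_t e = 0; e < blocklength; e++)
            out[k++] = src[b * stride + e];
    return k;
}
```

For an N x M array stored row by row in a flat array a, the rightmost column would be packed with pack_strided(&a[M - 1], N, 1, M, buf), which uses the same three numbers as the vector type described above.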
You can also use MPI_Type_create_subarray to pull out just the region of the array you want; that's probably a little overkill in this case.
Now, if as in your previous question you want to be able at some point to gather all the sub-arrays onto one processor, you'll probably be using subarrays, and part of the question is answered here: MPI_Type_create_subarray and MPI_Gather. Note that if your array chunks are of different sizes, though, then things start getting a little trickier.
(Actually, why are you doing the gather onto one processor, anyway? That'll eventually be a scalability bottleneck. If you're doing it for I/O, once you're comfortable with data types, you can use MPI-IO for this.)
