Comparison of all array elements - C algorithm - c

I have an m * n matrix and, for each row, I need to compare all of its elements with each other.
For each pair I find, I'll call a function that performs some calculations.
Example:
my_array -> {1, 2, 3, 4, 5, ...}
I take 1 and I have: (1,2)(1,3)(1,4)(1,5)
I take 2 and I have: (2,1)(2,3)(2,4)(2,5)
and so on
Using C I wrote this:
for (i = 0; i < array_length; i++) {
    for (k = 0; k < array_length; k++) {
        if (i == k) continue;
        // Do something
    }
}
I was wondering if I can use an algorithm with lower complexity.

No. If you really have to visit every pair, the work is O(n^2): there are n(n-1) ordered pairs, and each one costs at least one step.
But you can cut the number of iterations in half:
for (i = 0; i < array_length; i++) {
    for (k = i + 1; k < array_length; k++) { // <-- no need to check the values before "i"
        // Do something
        // If the order of i and k makes a difference, then here you should:
        // 'Do something' for (i,k) and 'Do something' for (k,i)
    }
}
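As a rough sketch of the halved loop, the helper below visits each unordered pair exactly once and returns how many pairs it saw, which for n elements is n(n-1)/2. The names for_each_pair, pair_fn and add_abs_diff are made up for this illustration; the callback stands in for the "Do something" step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: called once per unordered pair. */
typedef void (*pair_fn)(int a, int b, void *ctx);

/* Visits every unordered pair (i < k) exactly once and returns the pair count. */
size_t for_each_pair(const int *array, size_t n, pair_fn fn, void *ctx)
{
    size_t count = 0;
    assert(array != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t k = i + 1; k < n; k++) {   /* start after i: no repeats */
            fn(array[i], array[k], ctx);
            count++;
        }
    }
    return count;
}

/* Example callback: accumulates |a - b| into the long that ctx points to. */
void add_abs_diff(int a, int b, void *ctx)
{
    long *sum = ctx;
    *sum += (a > b) ? (long)a - b : (long)b - a;
}
```

For a five-element row this makes 10 calls instead of the 20 made by the original double loop. If the calculation is not symmetric, the callback would have to handle both (a, b) and (b, a).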

There are several things you might do, but which are possible and which are not depends on the nature of the array and the formula you apply. Overall complexity will probably remain unchanged or even grow, even if the calculation can be made to go faster, unless the cost of the formula itself depends on its arguments, in which case a decrease in complexity may be achievable.
Also, going from A*O(N^a) to B*O(N^b) with b > a (higher complexity) can still be worth pursuing, for some range of N, if the constant factor B is sufficiently smaller than A.
In no particular order:
if the matrix has several repeated items, it can be convenient to use a caching function:
result function(arg1, arg2) {
    int i = index(arg1, arg2);   // Depending on the values, it could be
                                 // something like arg1*(MAX_ARG2+1) + arg2;
    if (!stored[i]) {            // stored and values are allocated and initialised
                                 // somewhere else - or in this function using a
                                 // static flag.
        stored[i] = 1;
        values[i] = true_function(arg1, arg2);
    }
    return values[i];
}
Then, you have a memory overhead proportional to the number of different couples of values available. The call overhead can be O(|arg1|*|arg2|), but in some circumstances (e.g. true_function() is expensive) the savings will more than offset the added complexity.
chop the formula into pieces (not possible for every formula) and express it as:
F(x,y) = G(x) op H(y) op J(x,y)
then, you can do an O(max(M,N)) cycle pre-calculating G[] and H[]. This also has an O(M+N) memory cost. It is only convenient if the computational expenditure difference between F and J is significant. Or you might do:
for (i in 0..N) {
    g = G(array[i]);
    for (j in 0..N) {
        if (i != j) {
            result = f(array[i], array[j], g);
        }
    }
}
which brings some of the complexity from O(N^2) down to O(N).
the first two techniques are useable in tandem if G() or H() are practical to cache (limited range of argument, expensive function).
find a "law" to link F(a, b) with F(a+c, b+d). Then you can run the caching algorithm much more efficiently, reusing the same calculations. This shifts some complexity from O(N^2) to O(N) or even O(log N), so that while the overall cost is still quadratic, it grows much more slowly, and a higher bound for N becomes practical. If F is itself of a higher order of complexity than constant in (a,b), this may also reduce this order (as an extreme example, suppose F is iterative in a and/or b).
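The formula-splitting idea can be sketched in C as follows. This is only an illustration under assumed definitions: G, H and J here are invented placeholders, with "op" taken to be addition, and sum_over_pairs is a hypothetical driver that sums F over all ordered pairs. The point is that G and H are each evaluated N times, while only J is evaluated inside the quadratic loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pieces of F(x, y) = G(x) + H(y) + J(x, y). */
static long G(int x) { return (long)x * x * x; }      /* stands in for a costly term */
static long H(int y) { return 3L * y; }
static long J(int x, int y) { return (long)x * y; }   /* genuinely needs both values */

/* Sums F over all ordered pairs (i != j). Returns 0 on success, -1 if out of memory. */
int sum_over_pairs(const int *a, size_t n, long *out)
{
    long total = 0;
    long *g = NULL, *h = NULL;

    assert(out != NULL);
    if (n > 0) {
        g = malloc(n * sizeof *g);
        h = malloc(n * sizeof *h);
        if (g == NULL || h == NULL) {
            free(g);
            free(h);
            return -1;
        }
    }
    for (size_t i = 0; i < n; i++) {   /* O(N) pass: pre-calculate G[] and H[] */
        g[i] = G(a[i]);
        h[i] = H(a[i]);
    }
    for (size_t i = 0; i < n; i++) {   /* O(N^2) pass: only J is computed here */
        for (size_t j = 0; j < n; j++) {
            if (i != j) {
                total += g[i] + h[j] + J(a[i], a[j]);
            }
        }
    }
    free(g);
    free(h);
    *out = total;
    return 0;
}
```

The overall loop is still quadratic, but the per-pair work shrinks to two table lookups plus J, which only pays off when G and H are noticeably more expensive than J.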

No, you can only get lower computational complexity if you include knowledge of the contents of the array, and semantics of the operation to optimize your algorithm.

Related

Does time complexity change based on parameters?

If my function void foo(n) has a time complexity of O(n), and I have a function call foo(4), would I say that the time complexity of foo(4) is O(4)?
Normally, the complexity of a function doesn't change depending on the arguments passed in, since it is expressed in terms of that parameter. In this case the complexity is O(n) in the function's parameter, whatever value you pass (in your case, 4). Let's say that your function contains a for loop. Here is an example in pseudo-code:
fun foo(int n) {
    for (int i = 0; i < n; i++) {
        print(i);
    }
}
This function prints the numbers from 0 to n - 1. Since increasing n only increases the number of printed elements linearly, the function is O(n) regardless of the value of n.
Another example in pseudo-code:
fun foo(int n) {
    for (int i = 0; i < 2^n; i++) {
        print(i);
    }
}
In this case, the function prints the values from 0 to 2^n - 1. Increasing n increases the number of elements exponentially, so the function is O(2^n). Changing the value of n does not change the complexity of the function.
But what happens if we have a function like this one?
fun foo(int n, boolean b) {
    if (b == true) {
        for (int i = 0; i < n; i++) {
            print(i);
        }
    } else {
        for (int i = 0; i < 2^n; i++) {
            print(i);
        }
    }
}
In this case, the complexity of the function is O(n) if b is true, and O(2^n) if b is false. So, yes, the complexity of a function can change depending on the value of its parameters, but only when the parameter in question is not the one the complexity is expressed in.
Big-O notation gives you a notion of how the running time increases with the input n. In this case, if foo() always gets the same argument, the running time will be constant. But keep in mind that almost all metrics used to measure time and/or memory complexity only make sense for a big n, because when the input is small the performance will be good anyway. So, in Big-O, you look at your function for an increasing n; only then is the cost measurable and comparable.
I hope this will help you.
Big-O notation, O(n), gives an order of growth for the execution time of your algorithm as n increases. For example, if the complexity of your algorithm foo(n) is of order n^2, written O(n^2), then in the worst case (when your algorithm performs the maximum number of loop iterations) the running time will be on the order of n^2, where n is a variable that represents the size of the input data. I hope it helped you.
No.
O(n) simply means that as n gets large, the time it takes the function to execute grows at most in proportion to n; in other words, it's a qualitative statement, not a quantitative one. The time complexity of foo is O(n) whether the input is 4, 4000, 4000000, or 4000000000; all O(n) says is that the runtime grows linearly with n.
O(4) implies that the runtime is constant regardless of the size of the input - it's equivalent to writing O(1). Your function cannot be both O(n) and O(4).
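To make the distinction concrete, here is a small C sketch with a hypothetical foo_steps that counts loop iterations instead of doing real work. A single call such as foo_steps(4) always costs the same four steps; the O(n) label describes how that cost changes between different values of n.

```c
#include <assert.h>

/* Hypothetical stand-in for foo(n): returns how many loop iterations ran. */
unsigned long foo_steps(unsigned long n)
{
    unsigned long steps = 0;
    for (unsigned long i = 0; i < n; i++) {
        steps++;   /* one unit of work per iteration */
    }
    return steps;
}

/* Doubling n doubles the step count: that relationship is what O(n) describes. */
void demo_linear_growth(void)
{
    assert(foo_steps(8) == 2 * foo_steps(4));
    assert(foo_steps(4000) == 1000 * foo_steps(4));
}
```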

What kind of drawbacks are there, performance-wise, if I sort an array by using hashing?

I wrote a simple function to sort an array int a[] using a hash.
For that, I stored the frequency of every element in a new array hash1[] and then put the elements back into the original array in linear time.
#include<bits/stdc++.h>
using namespace std;
int hash1[10000];
void sah(int a[], int n)
{
    int maxo = -1;
    for (int i = 0; i < n; i++)
    {
        hash1[a[i]]++;
        if (maxo < a[i]) { maxo = a[i]; }
    }
    int i = 0, freq = 0, idx = 0;
    while (i < maxo + 1)
    {
        freq = hash1[i];
        if (freq > 0)
        {
            while (freq > 0)
            {
                a[idx++] = i; freq--;
            }
        }
        i++;
    }
}
int main()
{
    int a[] = {6, 8, 9, 22, 33, 59, 12, 5, 99, 12, 57, 7};
    int n = sizeof(a) / sizeof(a[0]);
    sah(a, n);
    for (int i = 0; i < n; i++)
    {
        printf("%d ", a[i]);
    }
}
This algorithm runs in O(max_element). What kind of disadvantages am I facing here, considering only performance (time and space)?
The algorithm you've implemented is called counting sort. Its runtime is O(n + U), where n is the total number of elements and U is the maximum value in the array (assuming the numbers go from 0 to U), and its space usage is Θ(U). Your particular implementation assumes that U = 10,000. Although you've described your approach as "hashing," this isn't so much a hash (computing some function of the elements and using that to put them into buckets) as a distribution (spreading elements around according to their values).
If U is a fixed constant - as it is in your case - then the runtime is O(n) and the space usage is O(1), though remember that big-O talks about long-term growth rates and that if U is large the runtime can be pretty high. This makes it attractive if you're sorting very large arrays with a restricted range of values. However, if the range of values can be large, this is not a particularly good approach. Interestingly, you can think of radix sort as an algorithm that repeatedly runs counting sort with U = 10 (if using the base-10 digits of the numbers) or U = 2 (if going in binary) and has a runtime of O(n log U), which is strongly preferable for large values of U.
You can clean up this code in a number of ways. For example, you have an if statement and a while loop with the same condition, which can be combined together into a single while loop. You also might want to put in some assert checks to make sure all the values are in the range from 0 to 9,999, inclusive, since otherwise you'll have a bounds error. Additionally, you could consider making the global array either a local variable (though watch your stack usage) or a static local variable (to avoid polluting the global namespace). You could alternatively have the user pass in a parameter specifying the maximum size or could calculate it yourself.
Issues you may consider:
Input validation. What if the user enters -10 or a very large value?
If the maximum element is large, you will at some point get a performance hit when the L1 cache is exhausted. The hash1-array will compete for memory bandwidth with the a-array. When I have implemented radix-sorting in the past I found that 8-bits per iteration was fastest.
The time complexity is actually O(max_element + number_of_elements). E.g. what if you sorted 2 million ones or zeros? That is not as fast as sorting 2 ones or zeros.
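The suggestions in these answers (validate the input, drop the global array, pass the range in) could be combined along these lines. This is a sketch in C rather than a drop-in replacement for the code in the question; the function name counting_sort and the max_value parameter are choices made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sorts a[0..n-1] in place, assuming every value lies in [0, max_value].
   Returns 0 on success, -1 on out-of-range input or allocation failure.
   Time is O(n + max_value), extra space is O(max_value). */
int counting_sort(int *a, size_t n, int max_value)
{
    assert(a != NULL || n == 0);
    if (max_value < 0) {
        return -1;
    }

    size_t range = (size_t)max_value + 1;
    size_t *count = calloc(range, sizeof *count);   /* local, zero-initialised */
    if (count == NULL) {
        return -1;
    }

    /* Validate and count first, so a bad value leaves the array untouched. */
    for (size_t i = 0; i < n; i++) {
        if (a[i] < 0 || a[i] > max_value) {
            free(count);
            return -1;
        }
        count[a[i]]++;
    }

    /* Write each value back as many times as it was seen. */
    size_t idx = 0;
    for (size_t v = 0; v < range; v++) {
        while (count[v] > 0) {
            a[idx++] = (int)v;
            count[v]--;
        }
    }

    free(count);
    return 0;
}
```

Calling counting_sort(a, n, 99) on the array from the question sorts it, while an array containing -10 is rejected instead of writing out of bounds.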

Optimized Selection Sort?

I have read sources that say that the time complexities for Selection sort are:
Best-case: O(n^2)
Average-case: O(n^2)
Worst-case: O(n^2)
I was wondering if it is worth it to "optimize" the algorithm by adding a certain line of code to make the algorithm "short-circuit" itself if the remaining part is already sorted.
Here's the code written in C:
I have also added a comment which indicates which lines are part of the "optimization" part.
#include <stdio.h>
#include <stdlib.h>
#include <conio.h>   /* non-standard header, needed only for getch() */

void printList(int* num, int numElements) {
    int i;
    for (i = 0; i < numElements; i ++) {
        printf("%d ", *(num + i));
    }
    printf("\n");
}
int main() {
    int numElements = 0, i = 0, j = 0, min = 0, swap = 0, numSorted = 0;
    printf("Enter number of elements: ");
    scanf("%d", &numElements);
    int* num = malloc(sizeof(int) * numElements);
    for (i = 0; i < numElements; i ++) {
        printf("Enter number = ");
        scanf(" %d", num + i);
    }
    for (i = 0; i < numElements-1; i++) {
        numSorted = i + 1; // "optimized"
        min = i;
        for (j = i + 1; j < numElements; j++) {
            numSorted += *(num + j - 1) <= *(num + j); // "optimized"
            if (*(num + min) > *(num + j))
                min = j;
        }
        if (numSorted == numElements) // "optimized"
            break;
        if (min != i) {
            swap = *(num + i);
            *(num + i) = *(num + min);
            *(num + min) = swap;
        }
        printList(num, numElements);
    }
    printf("Sorted list:\n");
    printList(num, numElements);
    free(num);
    getch();
    return 0;
}
Optimizing selection sort is a little silly. It has awful best-case, average, and worst-case time complexity, so if you want a remotely optimized sort you would (almost?) always pick another sort. Even insertion sort tends to be faster and it's hardly much more complicated to implement.
More to the point, checking if the list is sorted increases the time the algorithm takes in the worst case scenarios (the average case too I'm inclined to think). And even a mostly sorted list will not necessarily go any faster this way: consider 1,2,3,4,5,6,7,9,8. Even though the list only needs two elements swapped at the end, the algorithm will not short-circuit as it is not ever sorted until the end.
Just because something can be optimized, doesn't necessarily mean it should. Assuming profiling or "boss-says-so" indicates optimization is warranted there are a few things you can do.
As with any algorithm involving iteration over memory, anything that reduces the number of iterations can help.
  - keep track of the min AND max values - cut number of iterations in half
  - keep track of multiple min/max values (4 each will be 1/8th the iterations)
      - at some point temp values will not fit in registers
      - the code will get more complex
It can also help to maximize cache locality.
  - do a backward iteration after the forward iteration
      - the recently accessed memory should still be cached
      - going straight to another forward iteration would cause a cache miss
      - since you are moving backward, the cache predictor may prefetch the rest
      - this could actually be worse on some architectures (RISC-V)
  - operate on a cache line at a time where possible
      - this can allow the next cache line to be prefetched in the mean time
      - you may need to align the data or specially handle the first and last data
      - even with increased alignment, the last few elements may need "padding"
Use SIMD instructions and registers where useful and practical
  - useful for non-branching rank order sort of temps
  - can hold many data points simultaneously (AVX-512 can do a cache line)
  - avoids memory access (thus less cache misses)
If you use multiple max/min values, optimize sorting the n values of max and min
  - see here for techniques to sort a small fixed number of values.
save memory swaps until the end of each iteration and do them once
  - keep temporaries (or pointers) in registers in the mean time
There are quite a few more optimization methods available, but eventually the resemblance to selection sort starts to get foggy. Each of these is going to increase complexity and therefore maintenance cost to the point where a simpler implementation of a more appropriate algorithm may be a better choice.
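The first suggestion above (tracking the min and the max in the same pass) might look like the following C sketch. The function name minmax_selection_sort is invented here. Note that this halves the number of passes over the array, but each pass now makes two comparisons per element, so the total comparison count stays roughly the same.

```c
#include <assert.h>
#include <stddef.h>

static void swap_ints(int *x, int *y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Each pass places the smallest remaining value at lo and the largest at hi. */
void minmax_selection_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n < 2) {
        return;
    }
    size_t lo = 0, hi = n - 1;
    while (lo < hi) {
        size_t min = lo, max = lo;
        for (size_t j = lo + 1; j <= hi; j++) {
            if (a[j] < a[min]) min = j;
            if (a[j] > a[max]) max = j;
        }
        swap_ints(&a[lo], &a[min]);
        if (max == lo) {
            max = min;   /* the swap above just moved the maximum to index min */
        }
        swap_ints(&a[hi], &a[max]);
        lo++;
        hi--;
    }
}
```

The max == lo adjustment is the easy detail to miss: without it, an array whose largest value starts at the front of the unsorted range ends up with the wrong element at the back.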
The only way I see how this can be answered is if you define the purpose of why you are optimizing it.
Is it worth it in a professional setting: on the job, for code running "in production" - most likely (even almost certainly) not.
Is it worth it as a teaching/learning tool - sometimes yes.
I teach programming to individuals and sometimes I teach them algorithms and datastructures. I consider selection sort to be one of the easiest to explain and teach - it flows so naturally after explaining the algorithm for finding the minimum and swapping two values (swap()). Then, at the end I introduce the concept of optimization where we can implement this counter "already sorted" detection.
Admittedly bubble sort is even better to introduce optimization, because it has at least 3 easy to explain and substantial optimizations.
I was wondering if it is worth it to "optimize" the algorithm by adding a certain line of code to make the algorithm "short-circuit" itself if the remaining part is already sorted.
Clearly this change reduces the best-case complexity from O(n^2) to O(n). This will be observed for inputs that are already sorted except for O(1) leading elements. If such inputs are a likely case, then the suggested code change might indeed yield an observable and worthwhile performance improvement.
Note, however, that your change more than doubles the work performed in the innermost loop, and consider that for uniform random inputs, the expected number of outer-loop iterations saved is 1. Consider also that any outer-loop iterations you do manage to trim off will be the ones that otherwise would do the least work. Overall, then, although you do not change the asymptotic complexity, the actual performance in the average and worst cases will be noticeably worse -- runtimes on the order of twice as long.
If you're after better speed then your best bet is to choose a different sorting algorithm. Among the comparison sorts, Insertion Sort will perform about the same as your optimized Selection Sort on the latter's best case, but it has a wider range of best-case scenarios, and will usually outperform (regular) Selection Sort in the average case. How the two compare in the worst case depends on implementation.
If you want better performance still then consider Merge Sort or Quick Sort, both of which are pretty simple to implement. Or if your data are suited to it then Counting Sort is pretty hard to beat.
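For comparison, here is a minimal insertion sort in C that also reports how many element comparisons it made. The comparison counter is added purely for illustration. On already sorted input it makes n - 1 comparisons, and on the nearly sorted list 1,2,3,4,5,6,7,9,8 mentioned in another answer it makes 9, whereas the short-circuiting selection sort gets no benefit there.

```c
#include <assert.h>
#include <stddef.h>

/* Sorts a[0..n-1] in place and returns the number of element comparisons made. */
size_t insertion_sort(int *a, size_t n)
{
    size_t comparisons = 0;
    assert(a != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        while (j > 0) {
            comparisons++;
            if (a[j - 1] <= key) {
                break;           /* key is already in place relative to a[j-1] */
            }
            a[j] = a[j - 1];     /* shift the larger element one slot right */
            j--;
        }
        a[j] = key;
    }
    return comparisons;
}
```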
We can optimize selection sort so that its best case is O(n) instead of O(n^2).
Here is my optimized code.
import java.util.Arrays;

public class SelectionSort {
    static void selectionSort(int[] arr) {
        for (int i = 0; i < arr.length; i++) {
            int maxValue = arr[0];
            int maxIndex = 0;
            int cnt = 1;
            for (int j = 1; j < arr.length - i; j++) {
                if (maxValue <= arr[j]) {
                    maxValue = arr[j];
                    maxIndex = j;
                    cnt++;
                }
            }
            // cnt can only reach arr.length on the first pass (i == 0),
            // and only when no element is smaller than its predecessor.
            if (cnt == arr.length) break;
            arr[maxIndex] = arr[arr.length - 1 - i];
            arr[arr.length - 1 - i] = maxValue;
        }
    }
    public static void main(String[] args) {
        int[] arr = {1, -3, 0, 8, -45};
        selectionSort(arr);
        System.out.println(Arrays.toString(arr));
    }
}

Optimising C for performance vs memory optimisation using multidimensional arrays

I am struggling to decide between two optimisations for building a numerical solver for the Poisson equation.
Essentially, I have a two dimensional array, of which I require n doubles in the first row, n/2 in the second n/4 in the third and so on...
Now my difficulty is deciding whether or not to use a contiguous 2d array grid[m][n], which for a large n would have many unused zeroes but would probably reduce the chance of a cache miss. The other, and more memory efficient method, would be to dynamically allocate an array of pointers to arrays of decreasing size. This is considerably more efficient in terms of memory storage but would it potentially hinder performance?
I don't think I clearly understand the trade-offs in this situation. Could anybody help?
For reference, I made a nice plot of the memory requirements in each case:
There is no hard and fast answer to this one. If your algorithm needs more memory than you expect to be given then you need to find one which is possibly slower but fits within your constraints.
Beyond that, the only option is to implement both and then compare their performance. If saving memory results in a 10% slowdown, is that acceptable for your use? If the version using more memory is 50% faster but only runs on the biggest computers, will it be used? These are the questions that we have to grapple with in Computer Science. But you can only look at them once you have numbers. Otherwise you are just guessing, and a fair amount of the time our intuition when it comes to optimizations is not correct.
Build a custom array that will follow the rules you have set.
The implementation will use a simple 1d contiguous array. You will need a function that will return the start of array given the row. Something like this:
int* Get( int* array , int n , int row ) //might contain logical errors
{
    int pos = 0 ;
    while( row-- )
    {
        pos += n ;
        n /= 2 ;
    }
    return array + pos ;
}
Where n is the same n you described and is rounded down on every iteration.
You will have to call this function only once per entire row.
This function will never take more than O(log n) time, but if you want you can replace it with a single expression: http://en.wikipedia.org/wiki/Geometric_series#Formula
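As a sketch of that single expression: the row lengths n, n/2, n/4, ... form a geometric series, so the start of row r is n + n/2 + ... + n/2^(r-1) = 2n - 2n/2^r. The C version below holds only under the assumption that n is a power of two and that r is at most log2(n) + 1, so that no halving rounds down; both function names are made up for this example, and the loop version is kept as a reference to check against.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Reference version: adds up the row lengths n, n/2, n/4, ... one by one. */
size_t row_offset_loop(size_t n, unsigned row)
{
    size_t pos = 0;
    while (row--) {
        pos += n;
        n /= 2;
    }
    return pos;
}

/* Closed form of the same sum: 2n - 2n / 2^row.
   Assumes n is a power of two and row <= log2(n) + 1. */
size_t row_offset_closed(size_t n, unsigned row)
{
    assert(n != 0 && (n & (n - 1)) == 0);       /* n is a power of two */
    assert(row < sizeof(size_t) * CHAR_BIT);    /* the shift stays well defined */
    assert(((2 * n) >> row) >= 1);              /* row <= log2(n) + 1 */
    return 2 * n - ((2 * n) >> row);
}
```

For n = 8 the rows start at offsets 0, 8, 12 and 14. When n is not a power of two the integer divisions round down and the loop version is the one to trust.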
You could use a single array and just calculate your offset yourself:
size_t get_offset(int n, int row, int column) {
    size_t offset = column;
    while (row--) {
        offset += n;
        n >>= 1;   /* each row is half the size of the previous one */
    }
    return offset;
}
/* offset of row 64 = total size of all rows (n reaches 0 well before 64 halvings) */
double *array = calloc(get_offset(n, 64, 0), sizeof(double));
access via
array[get_offset(n, row, column)]

What is the bug in this code?

Based on the logic given in an answer on SO to a different (similar) question, which removes repeated numbers in an array in O(N) time, I implemented that logic in C, as shown below. But my code does not return only unique numbers. I tried debugging but could not work out how to fix the logic.
#include <stdio.h>

int remove_repeat(int *a, int n)
{
    int i, k;
    k = 0;
    for (i = 1; i < n; i++)
    {
        if (a[k] != a[i])
        {
            a[k+1] = a[i];
            k++;
        }
    }
    return (k+1);
}
int main(void)
{
    int a[] = {1, 4, 1, 2, 3, 3, 3, 1, 5};
    int n;
    int i;
    n = remove_repeat(a, 9);
    for (i = 0; i < n; i++)
        printf("a[%d] = %d\n", i, a[i]);
    return 0;
}
1] What is incorrect in the above code for removing duplicates?
2] Is there any other O(N) or O(N log N) solution for this problem? What is its logic?
Heap sort in O(n log n) time.
Iterate through in O(n) time replacing repeating elements with a sentinel value (such as INT_MAX).
Heap sort again in O(n log n) to distil out the repeating elements.
Still bounded by O(n log n).
Your code only checks whether an item in the array is the same as its immediate predecessor.
If your array starts out sorted, that will work, because all instances of a particular number will be contiguous.
If your array isn't sorted to start with, that won't work because instances of a particular number may not be contiguous, so you have to look through all the preceding numbers to determine whether one has been seen yet.
To do the job in O(N log N) time, you can sort the array, then use the logic you already have to remove duplicates from the sorted array. Obviously enough, this is only useful if you're all right with rearranging the numbers.
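A minimal C sketch of that sort-then-compact approach, reusing the loop from the question. The names cmp_int and sort_and_dedupe are chosen for this example. One caveat: the C standard does not specify the complexity of qsort, so the O(N log N) bound is an assumption about the library implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort; avoids the overflow that "a - b" could cause. */
static int cmp_int(const void *p, const void *q)
{
    int a = *(const int *)p;
    int b = *(const int *)q;
    return (a > b) - (a < b);
}

/* Sorts a[0..n-1], then squeezes out adjacent duplicates.
   Returns the number of unique values, which end up in a[0..result-1]. */
size_t sort_and_dedupe(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n == 0) {
        return 0;
    }
    qsort(a, n, sizeof *a, cmp_int);

    size_t k = 0;                    /* index of the last unique value kept */
    for (size_t i = 1; i < n; i++) {
        if (a[k] != a[i]) {          /* after sorting, duplicates are adjacent */
            a[++k] = a[i];
        }
    }
    return k + 1;
}
```

With the array from the question, {1, 4, 1, 2, 3, 3, 3, 1, 5}, this returns 5 and leaves 1, 2, 3, 4, 5 at the front.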
If you want to retain the original order, you can use something like a hash table or bit set to track whether a number has been seen yet or not, and only copy each number to the output when/if it has not yet been seen. To do this, we change your current:
if (a[k] != a[i]) {
    a[k+1] = a[i];
    k++;
}
to something like:
if (!hash_find(hash_table, a[i])) {
    hash_insert(hash_table, a[i]);
    a[k+1] = a[i];
    k++;
}
If your numbers all fall within fairly narrow bounds or you expect the values to be dense (i.e., most values are present) you might want to use a bit-set instead of a hash table. This would be just an array of bits, set to zero or one to indicate whether a particular number has been seen yet.
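The bit-set variant could be sketched in C like this. It assumes every value lies in [0, MAX_VALUE]; MAX_VALUE and the function name dedupe_keep_order are made up for the example, and the assertion guards the assumed range.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_VALUE 1023   /* assumed inclusive upper bound on the input values */

/* Removes duplicates in place while keeping first occurrences in their
   original order. Returns the new length. Runs in O(n) time. */
size_t dedupe_keep_order(int *a, size_t n)
{
    unsigned char seen[(MAX_VALUE / CHAR_BIT) + 1] = {0};   /* one bit per value */
    size_t k = 0;

    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        assert(a[i] >= 0 && a[i] <= MAX_VALUE);
        size_t byte = (size_t)a[i] / CHAR_BIT;
        unsigned char mask = (unsigned char)(1u << (a[i] % CHAR_BIT));
        if (!(seen[byte] & mask)) {   /* first time this value appears */
            seen[byte] |= mask;
            a[k++] = a[i];
        }
    }
    return k;
}
```

On {1, 4, 1, 2, 3, 3, 3, 1, 5} this returns 5 and leaves 1, 4, 2, 3, 5 at the front, so the original order survives, unlike with the sorting approach.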
On the other hand, if you're more concerned with the upper bound on complexity than the average case, you could use a balanced tree-based collection instead of a hash table. This will typically use more memory and run more slowly, but its expected complexity and worst case complexity are essentially identical (O(N log N)). A typical hash table degenerates from constant complexity to linear complexity in the worst case, which will change your overall complexity from O(N) to O(N^2).
Your code would appear to require that the input is sorted. With unsorted inputs as you are testing with, your code will not remove all duplicates (only adjacent ones).
You can get an O(N) solution if the range of the integers is known up front and is smaller than the amount of memory you have :). Make one pass to determine the unique integers you have using auxiliary storage, then another to output the unique values.
Code below is in Java, but hopefully you get the idea.
int[] removeRepeats(int[] a) {
    // Assume these are the integers between 0 and 999
    Boolean[] v = new Boolean[1000]; // A lazy way of getting a tri-state var (false, true, null)
    for (int i = 0; i < a.length; ++i) {
        v[a[i]] = Boolean.TRUE;
    }
    // v[i] = null => number not seen
    // v[i] = true => number seen but not yet written to the output
    // v[i] = false => number already written to the output
    int[] out = new int[a.length];
    int ptr = 0;
    for (int i = 0; i < a.length; ++i) {
        if (v[a[i]] != null && v[a[i]].equals(Boolean.TRUE)) {
            out[ptr++] = a[i];
            v[a[i]] = Boolean.FALSE;
        }
    }
    // out now doesn't contain duplicates, order is preserved and ptr represents how
    // many elements are set, so trim the array to that length.
    return java.util.Arrays.copyOf(out, ptr);
}
You are going to need two loops, one to go through the source and one to check each item in the destination array.
You are not going to get O(N).
[EDIT]
The article you linked to suggests a sorted output array which means the search for duplicates in the output array can be a binary search...which is O(LogN).
Your logic is just wrong, so the code is wrong too. Work the logic out yourself before coding it.
I suggest an O(N ln N) approach using a modification of heapsort.
With heapsort, we go from a[i] to a[n], find the minimum and swap it with a[i], right?
The modification is this: if the minimum is the same as a[i-1], swap the minimum with a[n] instead and reduce your array's item count by 1.
That should do the trick in O(N ln N).
Your code will work only in particular cases. Clearly, you're checking adjacent values, but duplicate values can occur anywhere in the array. Hence, it's totally wrong.
