What is the time complexity of this function (f1)?
As I see it, the inner loop runs about n/4 times on the first pass of the outer loop (i=0), about (n-3)/4 times on the second pass (i=3), and so on, so the total is: n/4 + (n-3)/4 + (n-6)/4 + (n-9)/4 + ..., with about n/3 terms.
I am stuck here. How do I continue?
int f1(int n){
    int s=0;
    for(int i=0; i<n; i+=3)
        for (int j=n; j>i; j-=4)
            /* constant-time body (not shown) */;
    return s;
}

The important thing about Big O notation is that it eliminates 'constants'. The objective is to determine the trend as input size grows without concern for specific numbers.
Think of it as determining the curve on a graph where you don't know the number ranges of the x and y axes.
So in your code, even though you skip most of the values in the range of n for each iteration of each loop, this is done at a constant rate. So regardless of how many you actually skip, this still scales relative to n^2.
It wouldn't matter if you calculated any of the following:
1/4 * n^2
0.0000001 * n^2
(1/4 * n)^2
(0.0000001 * n)^2
1000000 + n^2
n^2 + 10000000 * n
In Big O, these are all equivalent to O(n^2). The point being that once n gets big enough (whatever that may be), all the lower order terms and constant factors become irrelevant in the 'big picture'.
(It's worth emphasising that this is why on small inputs you should be wary of relying too heavily on Big O. That's when constant overheads can still have a big impact.)

Key observation: The inner loop executes (n-i)/4 times in step i, hence i/4 times in step n-i.
Now sum all these quantities for i = 3k, 3(k-1), 3(k-2), ..., 9, 6, 3, 0, where 3k is the largest multiple of 3 before n (i.e., 3k <= n < 3(k+1)):
3k/4 + 3(k-1)/4 + ... + 6/4 + 3/4 + 0/4 = 3/4(k + (k-1) + ... + 2 + 1)
= 3/4(k(k+1))/2
= O(k^2)
= O(n^2)
because k <= n/3 <= k+1 and therefore k^2 <= n^2/9 <= (k+1)^2 <= 4k^2
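As a quick empirical check of the derivation above, here is a minimal sketch that counts how often the inner loop body runs. The helper name count_inner_iterations is hypothetical (it is not part of the question); it mirrors the two loops of f1 and replaces the body with a counter. If the analysis holds, doubling n should roughly quadruple the count, and count / n^2 should settle near a constant (roughly 1/24, from the 1/3 and 1/4 step factors and the 1/2 of the triangular sum).

```c
#include <assert.h>

/* Hypothetical helper: counts how many times the inner loop body of f1
 * would execute for a given n (n must not be negative). */
long count_inner_iterations(int n) {
    assert(n >= 0);
    long count = 0;
    for (int i = 0; i < n; i += 3)        /* same outer loop as f1 */
        for (int j = n; j > i; j -= 4)    /* same inner loop as f1 */
            ++count;
    return count;
}
```

For example, count_inner_iterations(12) is 3 + 3 + 2 + 1 = 9 (for i = 0, 3, 6, 9).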

In theory it's "O(n*n)", but...
What if the compiler felt like optimising it into this:
int f1(int n){
    int s=0;
    for(int i=0; i<n; i+=3)
        s += table[i];
    return s;
}
Or even this:
int f1(int n){
    if(n <= 0) return 0;
    return table[n];
}
Then it could also be "O(n)" or "O(1)".
Note that on the surface these kinds of optimisations seem impractical (due to worst case memory costs); but with a sufficiently advanced compiler (e.g. using "whole program optimisation" to examine all callers and determine that n is always within a certain range) it's not inconceivable. In a similar way it's not impossible for all of the callers to be using a constant (e.g. where a sufficiently advanced compiler can replace things like x = f1(123); with x = constant_calculated_at_compile_time).
In other words; in practice, the time complexity of the original function depends on how the function is used and how good/bad the compiler is.


Big O - Why is this algorithm O(AxB)?

I am unsure why this code evaluates to O(A*B)?
void printUnorderedPairs(int[] arrayA, int[] arrayB) {
    for (int i = 0; i < arrayA.length; i++) {
        for (int j = 0; j < arrayB.length; j++) {
            for (int k = 0; k < 100000; k++) {
                System.out.println(arrayA[i] + "," + arrayB[j]);
            }
        }
    }
}
Sure, more precisely it's O(100000*AB), and we would drop the 100000, making it O(AB). But what if array A had a length of 2? Wouldn't the 100000 iterations be more significant? Is it just because we know the final loop is constant (and its value is shown) that we don't count it? What if we knew all of the array sizes?
Can anyone explain why we would not say it's O(ABC)? What would be the runtime if I made the code this:
int[] arrayA = new int[20];
int[] arrayB = new int[500];
int[] arrayC = new int[100000];
void printUnorderedPairs(int[] arrayA, int[] arrayB) {
    for (int i = 0; i < arrayA.length; i++) {
        for (int j = 0; j < arrayB.length; j++) {
            for (int k = 0; k < arrayC.length; k++) {
                System.out.println(arrayA[i] + "," + arrayB[j]);
            }
        }
    }
}
If the running time (or number of execution steps, or number of times println gets called, or whatever you are assessing with your Big O notation) is O(AB), it means that the running time approaches being linearly proportional to AB as AB grows without bound (approaches infinity). It is literally a limit to infinity, in terms of calculus.
Big O is not concerned with what happens for any finite number of iterations. It's about what the limiting behaviour of the function is as its free variables approach infinity. Sure, for small values of A there could very well be a constant term that dominates execution time. But as A approaches infinity, all those other factors become insignificant.
Consider a polynomial like Ax^3 + Bx^2 + Cx + D. It will be proportional to x^3 as x grows to infinity, regardless of the magnitude of A, B, C, or D. B can be Graham's number for all Big O cares; infinity is still way bigger than any big finite number you pick and therefore the x^3 term dominates.
So first, considering what happens if A were 2 is not really in the spirit of AB approaching infinity. Any number you can fit on a whiteboard basically rounds down to zero.
And second, remember that proportional to AB means equal to AB times some constant; and it doesn't matter what that constant is. It is fine if the constant happens to be 100000. Saying something is proportional to 2N is the same as saying it is proportional to N, or any other number times N. So O(2N) is the same as O(N). By convention we always simplify when using Big-O notation to drop any constant factors. So we would always write O(N), and never O(2N). And for that same reason, we would write O(AB) and not O(100000AB).
And finally we don't say O(ABC) only because "C" (the number of iterations of your inner loop in your question) happens to be a constant, which also happens to equal 100000. That's why we say it's O(AB) and not O(ABC): C is not a free variable; it's hard-coded to 100000. If the size of B were not expected to change (were to be constant for whatever reason) then you could say that it is simply O(A). But if you allow B to grow without bound, then the limit is O(AB), and if you also allow C to grow without bound then the limit is O(ABC). You get to decide which numbers are constant and which variables are free variables depending on the context of your analysis.
You can read more about Big O notation at Wikipedia.
Appreciate that the for loops in i and j are independent of each other, so their running time is O(A*B). The inner loop in k is a fixed number of iterations, 100000, and is also independent of the two outer loops, so we get O(100000*A*B). But since the k loop is just a constant (non-variable) penalty, we are still left with O(A*B) for the overall complexity.
If you were to write the inner loop in k from 0 to C, then you could write O(A*B*C) for the complexity, and that would be valid as well.
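To make the role of the constant concrete, here is a small sketch in C (rather than the question's Java) that counts how many times the print statement would run. The names count_calls and INNER_REPEATS are hypothetical and only serve this illustration; the array lengths are passed in directly as a_len and b_len. The point it tries to show: doubling a_len doubles the count no matter what INNER_REPEATS is, which is why only A*B survives in the Big O.

```c
#include <stddef.h>

/* Hypothetical name for the hard-coded bound of the k loop. */
#define INNER_REPEATS 100000ULL

/* Counts how many times the innermost statement would execute
 * for input arrays of length a_len and b_len. */
unsigned long long count_calls(size_t a_len, size_t b_len) {
    unsigned long long calls = 0;
    for (size_t i = 0; i < a_len; i++)
        for (size_t j = 0; j < b_len; j++)
            calls += INNER_REPEATS;   /* stands in for the entire k loop */
    return calls;
}
```

With a_len = 20 and b_len = 500 this gives 20 * 500 * 100000 calls: a fixed number, but one that still grows in proportion to A*B as soon as the array lengths are allowed to vary.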
Generally the A*B doesn't matter, and it's just considered O(N).
If there was some knowledge that A and B were always somewhat the same length, then one could argue that it's really O(N^2).
Any sort of constant doesn't really matter in order-notation, because for really really large numbers of A/B, the constant becomes of negligible importance.
void printUnorderedPairs(int[] arrayA, int[] arrayB) {
    for (int i = 0; i < arrayA.length; i++) {
        for (int j = 0; j < arrayB.length; j++) {
            for (int k = 0; k < 100000; k++) {
                System.out.println(arrayA[i] + "," + arrayB[j]);
            }
        }
    }
}
This code is O(AB), because the innermost loop runs a constant number of times. Of course, its run time is proportional to AB*100000. Here, we never care about constant values, because when the variables get higher and higher, like 10^10000, the constants can be easily ignored.
In the second code, we say it's O(1), because all arrays have constant length and we can calculate its run time without any variable.

Shuffle an array while making each index have the same probability to be in any index

I want to shuffle an array, and that each index will have the same probability to be in any other index (excluding itself).
I have this solution, but I find that the last 2 indexes will always be swapped with each other:
void Shuffle(int arr[], size_t n)
{
    int newIndx = 0;
    int i = 0;
    for(; i > n - 2; ++i)
    {
        newIndx = rand() % (n - 1);
        if (newIndx >= i)
            swap(i, newIndx, arr);
    }
}
but in the end it might be that some indexes will go back to their first place once again.
Any thoughts?
C lang.
A permutation (shuffle) where no element is in its original place is called a derangement.
Generating random derangements is harder than generating random permutations, but it can be done in linear time and space. (Generating a random permutation can be done in linear time and constant space.) Here are two possible algorithms.
The simplest solution to understand is a rejection strategy: do a Fisher-Yates shuffle, but if the shuffle attempts to put an element at its original spot, restart the shuffle. [Note 1]
Since the probability that a random shuffle is a derangement is approximately 1/e, the expected number of shuffles performed is about e (that is, 2.71828…). But since unsuccessful shuffles are restarted as soon as the first fixed point is encountered, the total number of shuffle steps is less than e times the array size. For a detailed analysis, see this paper, which proves the expected number of random numbers needed by the algorithm to be around (e−1) times the number of elements.
In order to be able to do the check and restart, you need to keep an array of indices. The following little function produces a derangement of the indices from 0 to n-1; it is necessary to then apply the permutation to the original array.
/* n must be at least 2 for this to produce meaningful results */
void derange(size_t n, int ind[]) {   /* int elements, to match swap() below */
  for (size_t i = 0; i < n; ++i) ind[i] = (int)i;
  swap(ind, 0, randint(1, n));
  for (size_t i = 1; i < n; ++i) {
    int r = randint(i, n);
    swap(ind, i, r);
    if ((size_t)ind[i] == i) i = 0;   /* fixed point: restart (++i resumes at 1) */
  }
}
Here are the two functions used by that code:
#include <stdlib.h>   /* rand */

void swap(int arr[], size_t i, size_t j) {
  int t = arr[i]; arr[i] = arr[j]; arr[j] = t;
}
/* This is not the best possible implementation */
int randint(int low, int lim) {
  return low + rand() % (lim - low);
}
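The rejection approach yields a deranged array of indices, which then has to be applied to the data. A minimal sketch of that last step might look as follows; the function name apply_permutation and the use of a separate output array dst are assumptions made for this illustration (they do not appear in the answer), and the indices are assumed to be stored as int.

```c
#include <stddef.h>

/* Hypothetical helper: writes src rearranged by ind into dst, so that
 * dst[i] = src[ind[i]]. src and dst must not overlap, and ind must hold
 * a permutation of 0 .. n-1. */
void apply_permutation(const int src[], const int ind[], int dst[], size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[ind[i]];
}
```

If ind is a derangement (ind[i] != i for every i), then no position of dst receives the element that sat at the same position in src.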
The following function is based on the 2008 paper "Generating Random Derangements" by Conrado Martínez, Alois Panholzer and Helmut Prodinger, although I use a different mechanism to track cycles. Their algorithm uses a bit vector of size N but uses a rejection strategy in order to find an element which has not been marked. My algorithm uses an explicit vector of indices not yet operated on. The vector is also of size N, which is still O(N) space [Note 2]; since in practical applications, N will not be large, the difference is not IMHO significant. The benefit is that selecting the next element to use can be done with a single call to the random number generator. Again, this is not particularly significant since the expected number of rejections in the MP&P algorithm is very small. But it seems tidier to me.
The basis of the algorithms (both MP&P and mine) is the recursive procedure to produce a derangement. It is important to note that a derangement is necessarily the composition of some number of cycles where each cycle is of size greater than 1. (A cycle of size 1 is a fixed point.) Thus, a derangement of size N can be constructed from a smaller derangement using one of two mechanisms:
1. Produce a derangement of the N-1 elements other than element N, and add N to some cycle at any point in that cycle. To do so, randomly select any element j among the N-1 elements and place N immediately after j in j's cycle. This alternative covers all possibilities where N is in a cycle of size > 2.
2. Produce a derangement of N-2 of the N-1 elements other than N, and add a cycle of size 2 consisting of N and the element not selected from the smaller derangement. This alternative covers all possibilities where N is in a cycle of size 2.
If D_n is the number of derangements of size n, it is easy to see from the above recursion that:
D_n = (n−1)(D_{n−1} + D_{n−2})
The multiplier is n−1 in both cases: in the first alternative, it refers to the number of possible places N can be added, and in the second alternative to the number of possible ways to select n−2 elements of the recursive derangement.
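The recurrence can be checked numerically. The sketch below uses hypothetical names (derangement_count, derangement_fraction) and exact 64-bit integers, so it is only valid for small n; it starts from D_0 = 1 and D_1 = 0 and also computes the fraction of all permutations that are derangements, which should approach 1/e.

```c
#include <assert.h>

/* Number of derangements of n elements, using
 * D_0 = 1, D_1 = 0, D_n = (n-1)(D_{n-1} + D_{n-2}).
 * Exact only while the result fits in 64 bits, hence n <= 20. */
unsigned long long derangement_count(unsigned n) {
    assert(n <= 20);
    unsigned long long prev2 = 1;   /* D_0 */
    unsigned long long prev1 = 0;   /* D_1 */
    if (n == 0) return prev2;
    for (unsigned k = 2; k <= n; ++k) {
        unsigned long long cur = (k - 1) * (prev1 + prev2);
        prev2 = prev1;
        prev1 = cur;
    }
    return prev1;
}

/* Fraction of the n! permutations of n elements that are derangements. */
double derangement_fraction(unsigned n) {
    double factorial = 1.0;
    for (unsigned k = 2; k <= n; ++k) factorial *= k;
    return (double)derangement_count(n) / factorial;
}
```

The first values are 1, 0, 1, 2, 9, 44, 265 for n = 0 .. 6, and derangement_fraction(10) already agrees with 1/e to about seven decimal places.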
Therefore, if we were to recursively produce a random derangement of size N, we would randomly select one of the N-1 previous elements, and then make a random boolean decision on whether to produce alternative 1 or alternative 2, weighted by the number of possible derangements in each case.
One advantage to this algorithm is that it can derange an arbitrary vector; there is no need to apply the permuted indices to the original vector as with the rejection algorithm.
As MP&P note, the recursive algorithm can just as easily be performed iteratively. This is quite clear in the case of alternative 2, since the new 2-cycle can be generated either before or after the recursion, so it might as well be done first and then the recursion is just a loop. But that is also true for alternative 1: we can make element N the successor in a cycle to a randomly-selected element j even before we know which cycle j will eventually be in. Looked at this way, the difference between the two alternatives reduces to whether or not element j is removed from future consideration.
As shown by the recursion, alternative 2 should be chosen with probability (n−1)D_{n−2}/D_n, which is how MP&P write their algorithm. I used the equivalent formula D_{n−2} / (D_{n−1} + D_{n−2}), mostly because my prototype used Python (for its built-in bignum support).
Without bignums, the number of derangements and hence the probabilities need to be approximated as double, which will create a slight bias and limit the size of the array to be deranged to about 170 elements. (long double would allow slightly more.) If that is too much of a limitation, you could implement the algorithm using some bignum library. For ease of implementation, I used the POSIX drand48 function to produce random doubles in the range [0.0, 1.0). That's not a great random number function, but it's probably adequate to the purpose and is available in most standard C libraries.
Since no attempt is made to verify the uniqueness of the elements in the vector to be deranged, a vector with repeated elements may produce a derangement where one or more of these elements appear to be in the original place. (It's actually a different element with the same value.)
The code:
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>   /* drand48 (POSIX) */

/* Deranges the vector `arr` (of length `n`) in place, to produce
 * a permutation of the original vector where every element has
 * been moved to a new position. Returns `true` unless the derangement
 * failed because `n` was 1.
 */
bool derange(int arr[], size_t n) {
  if (n < 2) return n != 1;
  /* Compute derangement counts ("subfactorials") */
  double subfact[n];
  subfact[0] = 1;
  subfact[1] = 0;
  for (size_t i = 2; i < n; ++i)
    subfact[i] = (i - 1) * (subfact[i - 2] + subfact[i - 1]);
  /* The vector 'todo' is the stack of elements which have not yet
   * been (fully) deranged; `u` is the count of elements in the stack
   */
  size_t todo[n];
  for (size_t i = 0; i < n; ++i) todo[i] = i;
  size_t u = n;
  /* While the stack is not empty, derange the element at the
   * top of the stack with some element lower down in the stack
   */
  while (u) {
    size_t i = todo[--u];      /* Pop the stack */
    size_t j = u * drand48();  /* Get a random stack index */
    swap(arr, i, todo[j]);     /* i will follow j in its cycle */
    /* If we're generating a 2-cycle, remove the element at j */
    if (drand48() * (subfact[u - 1] + subfact[u]) < subfact[u - 1])
      todo[j] = todo[--u];
  }
  return true;
}
Note 1: Many people get this wrong, particularly in social occasions such as "secret friend" selection (I believe this is sometimes called "the Santa game" in other parts of the world.) The incorrect algorithm is to just choose a different swap if the random shuffle produces a fixed point, unless the fixed point is at the very end in which case the shuffle is restarted. This will produce a random derangement but the selection is biased, particularly for small vectors. See this answer for an analysis of the bias.
Note 2: Even if you don't use the RAM model where all integers are considered fixed size, the space used is still linear in the size of the input in bits, since N distinct input values must have at least N log N bits. Neither this algorithm nor MP&P makes any attempt to derange lists with repeated elements, which is a much harder problem.
Your algorithm is only almost correct (which in algorithmics means unexpected results). Because of a few small errors scattered through it, it will not produce the expected results.
First, rand() % N is not guaranteed to produce a uniform distribution unless N is a divisor of the number of possible values. In any other case, you will get a slight bias. Anyway, my man page for rand describes it as a bad random number generator, so you should try to use random or, if available, arc4random_uniform.
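Where arc4random_uniform is not available, one common way to avoid the modulo bias is to reject the few raw values that fall into the uneven tail of the generator's range. The sketch below is only an illustration under stated assumptions: the name uniform_below is hypothetical, it still relies on rand() (so its quality is no better than rand() itself), and it assumes rand() is uniform over [0, RAND_MAX].

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns an integer in [0, n) with every value
 * equally likely, assuming rand() is uniform over [0, RAND_MAX].
 * Requires 0 < n <= RAND_MAX. */
int uniform_below(int n) {
    assert(n > 0 && n <= RAND_MAX);
    /* rand() has RAND_MAX + 1 possible results; keep only the largest
     * prefix of that range whose size is a multiple of n. */
    unsigned range = (unsigned)RAND_MAX + 1u;
    unsigned limit = range - range % (unsigned)n;
    unsigned r;
    do {
        r = (unsigned)rand();
    } while (r >= limit);          /* reject values from the uneven tail */
    return (int)(r % (unsigned)n);
}
```

Because fewer than n of the RAND_MAX + 1 raw values are ever rejected, the loop usually finishes on the first try.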
But preventing an index from coming back to its original place is both uncommon and rather hard to achieve. The only way I can imagine is to keep an array of the numbers [0; n[ and swap it the same way as the real array, so that the original index of each number is known.
The code could become:
void Shuffle(int arr[], size_t n)
{
    int i, newIndx;
    int *indexes = malloc(n * sizeof(int));
    for (i=0; i<n; i++) indexes[i] = i;
    for(i=0; i < n - 1; ++i) // beware of the inequality!
    {
        int i1;
        // search if index i is in the [i; n[ current array:
        for (i1=i; i1 < n; ++i1) {
            if (indexes[i1] == i) { // move it to i position
                if (i1 != i) { // nothing to do if already at i
                    swap(i, i1, arr);
                    swap(i, i1, indexes);
                }
                break;
            }
        }
        i1 = (i1 == n) ? i : i+1; // we will start the search at i1
                                  // to guarantee that no element keeps its place
        newIndx = i1 + arc4random_uniform(n - i1);
        /* if arc4random is not available:
        newIndx = i1 + (random() % (n - i1));
        */
        swap(i, newIndx, arr);
        swap(i, newIndx, indexes);
    }
    /* special case: a permutation of [0: n-1[ has left the last element in place
     * we will exchange the last element with a random one
     */
    if (indexes[n-1] == n-1) {
        newIndx = arc4random_uniform(n-1);
        swap(n-1, newIndx, arr);
        swap(n-1, newIndx, indexes);
    }
    free(indexes); // don't forget to free what we have malloc'ed...
}
Beware: the algorithm should be correct, but the code has not been tested and can contain typos...

Big-O small clarification

Is O(log(log(n))) actually just O(log(n)) when it comes to time complexity?
Do you agree that this function g() has a time complexity of O(log(log(n)))?
int f(int n) {
    if (n <= 1)
        return 0;
    return f(n/2) + 1;
}
int g(int n) {
    int m = f(f(n));
    int i;
    int x = 0;
    for (i = 0; i < m; i++) {
        x += i * i;
    }
    return m;
}
Function f(n) computes the logarithm in base 2 of n by repeatedly dividing by 2. It iterates log2(n) times.
Calling it on its own result will indeed return log2(log2(n)) for an additional log2(log2(n)) iterations.
So far the complexity is O(log(N)) + O(log(log(N))). The first term dominates the second, so the overall complexity is O(log(N)).
The final loop iterates log2(log2(n)) times; the time complexity of this last phase is O(log(log(N))), negligible compared to the initial phase.
Note that since x is not used before the end of function g, computing it is not needed and the compiler may well optimize this loop to nothing.
Overall time complexity comes out as O(log(N)), which is not the same as O(log(log(N))).
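One way to see the two phases is to count the recursive calls directly. The sketch below is an instrumented copy of f under hypothetical names (counted_f, f_calls, calls_for_nested_f); the counter exists purely for illustration and is not part of the original code.

```c
/* Instrumentation only: number of calls made to counted_f so far. */
static long f_calls = 0;

/* Same logic as f from the question, plus a call counter. */
int counted_f(int n) {
    ++f_calls;
    if (n <= 1)
        return 0;
    return counted_f(n / 2) + 1;
}

/* Returns how many calls are made in total while evaluating f(f(n)). */
long calls_for_nested_f(int n) {
    f_calls = 0;
    (void)counted_f(counted_f(n));
    return f_calls;
}
```

For n = 2^20 the inner call takes 21 calls and returns 20; the outer call on 20 then takes only 5 more (20, 10, 5, 2, 1). The first, logarithmic phase clearly dominates the total of 26.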
Looks like it is log(n) + log(log n) + log(log n).
In order: the first recursion of f(), plus the second recursion of f(), and the for loop, so the final complexity is O(log n), because lower order terms are ignored.
int f(int n) {
    if (n<=1)
        return 0;
    return f(n/2) + 1;
}
This has time complexity O(log2(n)), where 2 is the base of the logarithm.
int g(int n) {
    int m = f(f(n)); // inner call: O(log2(n)); outer call: O(log2(log2(n)))
    int i, x=0;
    for( i = 0; i < m; i++) {
        x += i*i;
    }
    // This for loop will take O(log2(log2(n)))
    return m;
}
Hence the overall time complexity of the given function is:
T(n) = t1 + t2 + t3
But here O(log2(n)) dominates O(log2(log2(n))).
Hence the time complexity of the given function is O(log2(n)).
Please read "What is a plain English explanation of "Big O" notation?".
The time consumed by O(log n) algorithms depends only linearly on the number of digits of n. So it is very easy to scale it.
Say you want to compute F(100000000), the 10^8th Fibonacci number. For an O(log n) algorithm it is only going to take 4x the time consumed by computing F(100).
O(log log n) terms can show up in a variety of different places, but there are typically two main routes that will arrive at this runtime.

What sort of indexing method can I use to store the distances between X^2 vectors in an array without redundancy?

I'm working on a demo that requires a lot of vector math, and in profiling, I've found that it spends the most time finding the distances between given vectors.
Right now, it loops through an array of X^2 vectors, and finds the distance between each one, meaning it runs the distance function X^4 times, even though (I think) there are only (X^2)/2 unique distances.
It works something like this: (pseudo c)
#define MATRIX_WIDTH 8
typedef float vec2_t[2];
for(int i = 0; i < MATRIX_WIDTH; i++)
{
    for(int j = 0; j < MATRIX_WIDTH; j++)
    {
        float xd, yd;
        float distance;
        for(int k = 0; k < MATRIX_WIDTH; k++)
        {
            for(int l = 0; l < MATRIX_WIDTH; l++)
            {
                int index_a = (i * MATRIX_WIDTH) + j;
                int index_b = (k * MATRIX_WIDTH) + l;
                xd = matrix[index_a][0] - matrix[index_b][0];
                yd = matrix[index_a][1] - matrix[index_b][1];
                distance = sqrtf(powf(xd, 2) + powf(yd, 2));
                // More code that uses the distances between each vector
            }
        }
    }
}
What I'd like to do is create and populate an array of (X^2) / 2 distances without redundancy, then reference that array when I finally need it. However, I'm drawing a blank on how to index this array in a way that would work. A hash table would do it, but I think it's much too complicated and slow for a problem that seems like it could be solved by a clever indexing method.
EDIT: This is for a flocking simulation.
performance ideas:
a) if possible work with the squared distance, to avoid root calculation
b) never use pow for constant, integer powers - instead use xd*xd
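Both suggestions fit in a few lines. The following is a minimal sketch with a hypothetical function name (distance_squared); it assumes each vector is a pair of floats, as in the question's vec2_t.

```c
/* Hypothetical helper: squared Euclidean distance between two 2D points.
 * Plain multiplication replaces powf, and no square root is taken. */
float distance_squared(const float a[2], const float b[2]) {
    float xd = a[0] - b[0];
    float yd = a[1] - b[1];
    return xd * xd + yd * yd;
}
```

Since the square root is monotonic, comparing squared distances gives the same ordering as comparing the distances themselves, for example when testing against a squared neighbourhood radius.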
I would consider changing your algorithm - O(n^4) is really bad. When dealing with interactions in physics (also O(n^4) for distances in 2d field) one would implement b-trees etc and neglect particle interactions with a low impact. But it will depend on what "more code that uses the distance..." really does.
I just did some calculations: the number of unique distances is 0.5*n*(n+1) with n = w*h.
If you write down when unique distances occur, you will see that both inner loops can be reduced, by starting at i and j.
Additionally if you only need to access those distances via the matrix index, you can set up a 4D-distance matrix.
If memory is limited we can save nearly 50%, as mentioned above, with a lookup function that accesses a triangular matrix, as Code-Guru said. We would probably precalculate the line index to avoid summing up on access.
float distanceArray[(H*W+1)*H*W/2];
int lineIndices[H*W];   /* start offset of each line, precalculated */
float searchDistance(int i, int j)
{
    return i<j?distanceArray[i+lineIndices[j]]:distanceArray[j+lineIndices[i]];
}
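For reference, the line offsets can also be computed in closed form. The sketch below is one possible layout, not necessarily the one the answer had in mind: it assumes the packed array stores one entry for every pair (i, j) with i <= j, diagonal included, row by row. The names pair_index and pair_count are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: position of the unordered pair (i, j) in a packed
 * triangular array. Row j starts at offset j(j+1)/2 and holds the
 * entries for i = 0 .. j. */
size_t pair_index(size_t i, size_t j) {
    if (i > j) {                 /* distance is symmetric: order the pair */
        size_t t = i;
        i = j;
        j = t;
    }
    return j * (j + 1) / 2 + i;
}

/* Number of entries needed for n points: n(n+1)/2. */
size_t pair_count(size_t n) {
    return n * (n + 1) / 2;
}
```

For an 8x8 grid (n = 64 points) this needs pair_count(64) = 2080 floats instead of 64 * 64 = 4096, and filling the table only requires looping over pairs with i <= j.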

Big O for number of operations for a decreasing function

I have a problem with a loop that requires a decreasing number of operations each time the loop executes. Here's the code:
for (int i = 1; i < n; i++) {
    ...code that takes at most 100/i operations to execute...
}
I need to find a big O that describes the number of operations. I think what's tripping me up here is that bigger n = more operations, but the growth is smaller.
Thanks for your help!
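The harmonic number 1 + 1/2 + 1/3 + ... + 1/n is O(log n), so if 100/i is taken as a real-valued bound, the loop body performs at most 100 * (1 + 1/2 + ... + 1/(n-1)) = O(log n) operations in total.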
Also, what if n > 100? For instance: Is 100/12345 operations well defined?
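The logarithmic growth is easy to observe numerically. The sketch below uses a hypothetical name (total_operations_bound) and treats 100/i as a real number; that is an assumption, since with C integer division 100/i would simply be 0 for every i > 100.

```c
/* Hypothetical helper: sum of the per-iteration bounds 100/i for
 * i = 1 .. n-1, treating 100/i as a real number. */
double total_operations_bound(long n) {
    double total = 0.0;
    for (long i = 1; i < n; i++)
        total += 100.0 / (double)i;
    return total;
}
```

Going from n = 1000 to n = 1000000 multiplies n by a thousand but adds only about 100 * ln(1000), roughly 691, to the total, which is the logarithmic behaviour of the harmonic sum.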
