Optimized Selection Sort? - c

I have read sources that say that the time complexities for Selection sort are:
Best-case: O(n^2)
Average-case: O(n^2)
Worst-case: O(n^2)
I was wondering if it is worth it to "optimize" the algorithm by adding a certain line of code to make the algorithm "short-circuit" itself if the remaining part is already sorted.
Here's the code written in C:
I have also added a comment which indicates which lines are part of the "optimization" part.
#include <stdio.h>
#include <stdlib.h>
#include <conio.h> /* for getch(); Windows-specific */

void printList(int* num, int numElements) {
    int i;
    for (i = 0; i < numElements; i++) {
        printf("%d ", *(num + i));
    }
    printf("\n");
}

int main() {
    int numElements = 0, i = 0, j = 0, min = 0, swap = 0, numSorted = 0;
    printf("Enter number of elements: ");
    scanf("%d", &numElements);
    int* num = malloc(sizeof(int) * numElements);
    for (i = 0; i < numElements; i++) {
        printf("Enter number = ");
        scanf(" %d", num + i);
    }
    for (i = 0; i < numElements - 1; i++) {
        numSorted = i + 1; // "optimized"
        min = i;
        for (j = i + 1; j < numElements; j++) {
            numSorted += *(num + j - 1) <= *(num + j); // "optimized"
            if (*(num + min) > *(num + j))
                min = j;
        }
        if (numSorted == numElements) // "optimized"
            break;
        if (min != i) {
            swap = *(num + i);
            *(num + i) = *(num + min);
            *(num + min) = swap;
        }
        printList(num, numElements);
    }
    printf("Sorted list:\n");
    printList(num, numElements);
    free(num);
    getch();
    return 0;
}

Optimizing selection sort is a little silly. It has awful best-case, average, and worst-case time complexity, so if you want a remotely optimized sort you would (almost?) always pick another sort. Even insertion sort tends to be faster and it's hardly much more complicated to implement.
More to the point, checking if the list is sorted increases the time the algorithm takes in the worst-case scenarios (the average case too, I'm inclined to think). And even a mostly sorted list will not necessarily go any faster this way: consider 1,2,3,4,5,6,7,9,8. Even though the list only needs two elements swapped at the end, the algorithm will not short-circuit, because the list is never sorted until the very end.

Just because something can be optimized, doesn't necessarily mean it should. Assuming profiling or "boss-says-so" indicates optimization is warranted, there are a few things you can do.
As with any algorithm involving iteration over memory, anything that reduces the number of iterations can help.
- Keep track of the min AND max values (a C sketch is given at the end of this answer) - cuts the number of iterations in half.
- Keep track of multiple min/max values (4 each gives 1/8th the iterations).
  - At some point the temporary values will not fit in registers.
  - The code will get more complex.
It can also help to maximize cache locality.
- Do a backward iteration after the forward iteration.
  - The recently accessed memory should still be cached.
  - Going straight to another forward iteration would cause a cache miss.
  - Since you are moving backward, the cache predictor may prefetch the rest.
  - This could actually be worse on some architectures (RISC-V).
- Operate on a cache line at a time where possible.
  - This can allow the next cache line to be prefetched in the meantime.
  - You may need to align the data or specially handle the first and last data.
  - Even with increased alignment, the last few elements may need "padding".
Use SIMD instructions and registers where useful and practical.
- Useful for a non-branching rank-order sort of temporaries.
- They can hold many data points simultaneously (AVX-512 can do a cache line).
- Avoids memory accesses (and thus cache misses).
If you use multiple max/min values, optimize sorting the n values of max and min.
- See here for techniques to sort a small fixed number of values.
- Save memory swaps until the end of each iteration and do them once.
  - Keep temporaries (or pointers) in registers in the meantime.
There are quite a few more optimization methods available, but eventually the resemblance to selection sort starts to get foggy. Each of these is going to increase complexity and therefore maintenance cost, to the point where a simpler implementation of a more appropriate algorithm may be a better choice.
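To make the first point concrete, here is a rough C sketch (not from the original question; names are illustrative) of a selection sort that picks out both the minimum and the maximum on each pass, so the unsorted range shrinks from both ends:
#include <stdio.h>

static void swap_ints(int *a, int *b) {
    int t = *a; *a = *b; *b = t;
}

/* Each pass places the minimum at the low end and the maximum at the
 * high end, roughly halving the number of passes. */
void selection_sort_minmax(int *a, int n) {
    for (int lo = 0, hi = n - 1; lo < hi; lo++, hi--) {
        int minIdx = lo, maxIdx = lo;
        for (int j = lo + 1; j <= hi; j++) {
            if (a[j] < a[minIdx]) minIdx = j;
            if (a[j] > a[maxIdx]) maxIdx = j;
        }
        swap_ints(&a[lo], &a[minIdx]);
        /* If the maximum started at lo, the swap above just moved it to minIdx. */
        if (maxIdx == lo) maxIdx = minIdx;
        swap_ints(&a[hi], &a[maxIdx]);
    }
}

int main(void) {
    int a[] = {5, 2, 9, 1, 7, 3};
    selection_sort_minmax(a, 6);
    for (int i = 0; i < 6; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}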

The only way I see this can be answered is if you define the purpose for which you are optimizing.
Is it worth it in a professional setting: on the job, for code running "in production" - most likely (even almost certainly) not.
Is it worth it as a teaching/learning tool - sometimes yes.
I teach programming to individuals, and sometimes I teach them algorithms and data structures. I consider selection sort to be one of the easiest to explain and teach - it flows so naturally after explaining the algorithm for finding the minimum and swapping two values (swap()). Then, at the end, I introduce the concept of optimization, where we can implement this counter-based "already sorted" detection.
Admittedly, bubble sort is even better for introducing optimization, because it has at least three easy-to-explain and substantial optimizations.

I was wondering if it is worth it to "optimize" the algorithm by adding a certain line of code to make the algorithm "short-circuit" itself if the remaining part is already sorted.
Clearly this change reduces the best-case complexity from O(n^2) to O(n). This will be observed for inputs that are already sorted except for O(1) leading elements. If such inputs are a likely case, then the suggested code change might indeed yield an observable and worthwhile performance improvement.
Note, however, that your change more than doubles the work performed in the innermost loop, and consider that for uniform random inputs, the expected number of outer-loop iterations saved is 1. Consider also that any outer-loop iterations you do manage to trim off will be the ones that otherwise would do the least work. Overall, then, although you do not change the asymptotic complexity, the actual performance in the average and worst cases will be noticeably worse -- runtimes on the order of twice as long.
If you're after better speed then your best bet is to choose a different sorting algorithm. Among the comparison sorts, Insertion Sort will perform about the same as your optimized Selection Sort on the latter's best case, but it has a wider range of best-case scenarios, and will usually outperform (regular) Selection Sort in the average case. How the two compare in the worst case depends on implementation.
If you want better performance still then consider Merge Sort or Quick Sort, both of which are pretty simple to implement. Or if your data are suited to it then Counting Sort is pretty hard to beat.
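For completeness, here is a minimal counting sort sketch, assuming the values are small non-negative integers in a known range (an assumption; the question does not state one, and MAX_KEY is an illustrative name):
#include <stdio.h>
#include <string.h>

#define MAX_KEY 100   /* assumed upper bound on the values */

void counting_sort(int *a, int n)
{
    int count[MAX_KEY + 1];
    memset(count, 0, sizeof count);

    for (int i = 0; i < n; i++)          /* tally each value */
        count[a[i]]++;

    int k = 0;
    for (int v = 0; v <= MAX_KEY; v++)   /* write values back in sorted order */
        while (count[v]-- > 0)
            a[k++] = v;
}

int main(void)
{
    int a[] = {5, 2, 9, 1, 7, 3, 3};
    counting_sort(a, 7);
    for (int i = 0; i < 7; i++)
        printf("%d ", a[i]);
    printf("\n");
    return 0;
}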

We can optimize selection sort so that its best case becomes O(n) instead of O(n^2).
Here is my optimized code (in Java):
import java.util.Arrays;

public class SelectionSort {
    static void selectionSort(int[] arr) {
        for (int i = 0; i < arr.length; i++) {
            int maxValue = arr[0];
            int maxIndex = 0;
            int cnt = 1;
            for (int j = 1; j < arr.length - i; j++) {
                if (maxValue <= arr[j]) {
                    maxValue = arr[j];
                    maxIndex = j;
                    cnt++;
                }
            }
            if (cnt == arr.length) break;
            arr[maxIndex] = arr[arr.length - 1 - i];
            arr[arr.length - 1 - i] = maxValue;
        }
    }

    public static void main(String[] args) {
        int[] arr = {1, -3, 0, 8, -45};
        selectionSort(arr);
        System.out.println(Arrays.toString(arr));
    }
}

Related

Matrix multiplication in 2 different ways (comparing time)

I've got an assignment: compare two matrix multiplications - one done the default way, and one where the second matrix is transposed first - and point out which method is faster. I've written something like the code below, but time and time2 are nearly equal to each other. Running the multiplication with the same matrix size, in one run the first method is faster and in another the second one is. Is something done wrong? Should I change something in my code?
clock_t start = clock();
int sum;
for (int i = 0; i < size; ++i) {
    for (int j = 0; j < size; ++j) {
        sum = 0;
        for (int k = 0; k < size; ++k) {
            sum = sum + (m1[i][k] * m2[k][j]);
        }
        score[i][j] = sum;
    }
}
clock_t end = clock();
double time = (end - start) / (double)CLOCKS_PER_SEC;

for (int i = 0; i < size; ++i) {
    for (int j = 0; j < size; ++j) {
        int temp = m2[i][j];
        m2[i][j] = m2[j][i];
        m2[j][i] = temp;
    }
}

clock_t start2 = clock();
int sum2;
for (int i = 0; i < size; ++i) {
    for (int j = 0; j < size; ++j) {
        sum2 = 0;
        for (int k = 0; k < size; ++k) {
            sum2 = sum2 + (m1[k][i] * m2[k][j]);
        }
        score[i][j] = sum2;
    }
}
clock_t end2 = clock();
double time2 = (end2 - start2) / (double)CLOCKS_PER_SEC;
You have multiple severe issues with your code and/or your understanding. Let me try to explain.
Matrix multiplication is bottlenecked by the rate at which the processor can load and store the values to memory. Most current architectures use cache to help with this. Data is moved from memory to cache and from cache to memory in blocks. To maximize the benefit of caching, you want to make sure you will use all the data in that block. To do that, you make sure you access the data sequentially in memory.
In C, multi-dimensional arrays are specified in row-major order. It means that the rightmost index is consecutive in memory; i.e. that a[i][k] and a[i][k+1] are consecutive in memory.
Depending on the architecture, the time it takes for the processor to wait (and do nothing) for the data to be moved from RAM to cache (and vice versa), may or may not be included in the CPU time (that e.g. clock() measures, albeit at a very poor resolution). For this kind of measurement ("microbenchmark"), it is much better to measure and report both CPU and real (or wall clock) time used; especially so if the microbenchmark is run on different machines, to get a better idea of the practical impact of the change.
There will be a lot of variation, so typically, you measure the time taken by a few hundred repeats (each repeat possibly making more than one operation; enough to be easily measured), storing the duration of each, and report their median. Why median, and not minimum, maximum, average? Because there will always be occasional glitches (unreasonable measurement due to an external event, or something), which typically yield a much higher value than normal; this makes the maximum uninteresting, and skews the average (mean) unless removed. The minimum is typically an over-optimistic case, where everything just happened to go perfectly; that rarely occurs in practice, so is only a curiosity, not of practical interest. The median time, on the other hand, gives you a practical measurement: you can expect 50% of all runs of your test case to take no more than the median time measured.
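As a small illustration of the median-of-repeats idea (the measurement loop itself is assumed to exist elsewhere; the names below are illustrative), one might collect the per-repeat durations and report the middle one:
#include <stdlib.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sort the measured durations (in seconds) and return their median. */
double median_seconds(double *t, size_t n)
{
    qsort(t, n, sizeof *t, cmp_double);
    return (n % 2) ? t[n / 2] : 0.5 * (t[n / 2 - 1] + t[n / 2]);
}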
On POSIXy systems (Linux, Mac, BSDs), you should use clock_gettime() to measure the time. The struct timespec format has nanosecond precision (1 second = 1,000,000,000 nanoseconds), but the resolution may be coarser (i.e., the clocks may change by more than 1 nanosecond whenever they change). I personally use
#define _POSIX_C_SOURCE 200809L
#include <time.h>

static struct timespec cpu_start, wall_start;
double cpu_seconds, wall_seconds;

void timing_start(void)
{
    clock_gettime(CLOCK_REALTIME, &wall_start);
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &cpu_start);
}

void timing_stop(void)
{
    struct timespec cpu_end, wall_end;
    clock_gettime(CLOCK_REALTIME, &wall_end);
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &cpu_end);
    wall_seconds = (double)(wall_end.tv_sec - wall_start.tv_sec)
                 + (double)(wall_end.tv_nsec - wall_start.tv_nsec) / 1000000000.0;
    cpu_seconds = (double)(cpu_end.tv_sec - cpu_start.tv_sec)
                + (double)(cpu_end.tv_nsec - cpu_start.tv_nsec) / 1000000000.0;
}
You call timing_start() before the operation, and timing_stop() after the operation; then, cpu_seconds contains the amount of CPU time taken and wall_seconds the real wall clock time taken (both in seconds, use e.g. %.9f to print all meaningful decimals).
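For instance, wrapped around the multiplication from the question (assuming the helper functions and globals above are in the same file), usage might look like this:
timing_start();
/* ... the matrix multiplication loops go here ... */
timing_stop();
printf("CPU time:  %.9f s\n", cpu_seconds);   /* needs <stdio.h> */
printf("Wall time: %.9f s\n", wall_seconds);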
The above won't work on Windows, because Microsoft does not want your C code to be portable to other systems. It prefers to develop their own "standard" instead. (Those C11 "safe" _s() I/O function variants are a stupid sham, compared to e.g. POSIX getline(), or the state of wide character support on all systems except Windows.)
Matrix multiplication is
c[r][c] = a[r][0] * b[0][c]
        + a[r][1] * b[1][c]
          :         :
        + a[r][L] * b[L][c]
where a has L+1 columns, and b has L+1 rows.
In order to make the summation loop use consecutive elements, we need to transpose b. If B[c][r] = b[r][c], then
c[r][c] = a[r][0] * B[c][0]
        + a[r][1] * B[c][1]
          :         :
        + a[r][L] * B[c][L]
Note that it suffices that a and B are consecutive in memory, but separate (possibly "far" away from each other), for the processor to utilize cache efficiently in such cases.
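As a sketch of what this looks like in code (the names n, a, B, and out are illustrative, not taken from the question), the multiplication loop with a pre-transposed second operand walks both operands sequentially:
/* B is the transposed copy of the second matrix: B[c][k] == b[k][c]. */
void multiply_transposed(int n, int a[n][n], int B[n][n], int out[n][n])
{
    for (int r = 0; r < n; r++) {
        for (int c = 0; c < n; c++) {
            int sum = 0;
            for (int k = 0; k < n; k++)
                sum += a[r][k] * B[c][k];   /* contiguous accesses in both arrays */
            out[r][c] = sum;
        }
    }
}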
OP uses a simple loop, similar to the following pseudocode, to transpose b:
For r in rows:
    For c in columns:
        temporary = b[r][c]
        b[r][c] = b[c][r]
        b[c][r] = temporary
    End For
End For
The problem above is that each element participates in a swap twice. For example, if b has 10 rows and columns, r = 3, c = 5 swaps b[3][5] and b[5][3], but then later, r = 5, c = 3 swaps b[5][3] and b[3][5] again! Essentially, the double loop ends up restoring the matrix to the original order; it does not do a transpose.
Consider the following entries and the actual transpose:
b[0][0] b[0][1] b[0][2]       b[0][0] b[1][0] b[2][0]
b[1][0] b[1][1] b[1][2]   ⇔   b[0][1] b[1][1] b[2][1]
b[2][0] b[2][1] b[2][2]       b[0][2] b[1][2] b[2][2]
The diagonal entries are not swapped. You only need to do the swap in the upper triangular portion (where c > r) or in the lower triangular portion (where r > c), to swap all entries, because each swap swaps one entry from the upper triangular to the lower triangular, and vice versa.
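A sketch of a transpose loop that touches each off-diagonal pair exactly once, by restricting the inner loop to the upper triangle (c > r); the function name is illustrative:
void transpose_in_place(int n, int m[n][n])
{
    for (int r = 0; r < n; r++) {
        for (int c = r + 1; c < n; c++) {   /* upper triangle only */
            int tmp = m[r][c];
            m[r][c] = m[c][r];
            m[c][r] = tmp;
        }
    }
}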
So, to recap:
Is something done wrong?
Yes. Your transpose does nothing. You haven't understood the reason why one would want to transpose the second matrix. Your time measurement relies on a low-precision CPU time, which may not reflect the time taken by moving data between RAM and CPU cache. In the second test case, with m2 "transposed" (except it isn't, because you swap each element pair twice, returning them back to the way they were), your innermost loop is over the leftmost array index, which means it calculates the wrong result. (Moreover, because consecutive iterations of the innermost loop accesses items far from each other in memory, it is anti-optimized: it uses the pattern that is worst in terms of speed.)
All the above may sound harsh, but it isn't intended to be, at all. I do not know you, and I am not trying to evaluate you; I am only pointing out the errors in this particular answer, in your current understanding, and only in the hopes that it helps you, and anyone else encountering this question in similar circumstances, to learn.

Best (fastest) way to find the number most frequently entered in C?

Well, I think the title basically explains my doubt. I will have n numbers to read; these n numbers go from 1 to x, where x is at most 10^5. What is the fastest (least running time) way to find out which number was entered the most times? We know that the number that appears the most times appears more than half of the time.
What I've tried so far:
//for (1 <= x <= 10⁵)
int v[100000+1];

//multiple instances, ends when n = 0
while (scanf("%d", &n) && n > 0) {
    zerofill(v);
    for (i = 0; i < n; i++) {
        scanf("%d", &x);
        v[x]++;
        if (v[x] > n/2)
            i = n;
    }
    printf("%d\n", x);
}
Zero-filling an array of x positions, then incrementing v[x] and at the same time checking whether v[x] is greater than n/2, is not fast enough.
Any idea might help, thank you.
Observation: no need to care about the amount of memory used.
The trivial solution of keeping a counter array is O(n), and you obviously can't get better than that. The fight is then about the constants, and this is where a lot of details come into play, including exactly what the values of n and x are, what kind of processor, what kind of architecture, and so on.
On the other hand, this really looks like the "knockout" (majority vote) problem, but that algorithm needs two passes over the data and an extra conditional, so in practical terms, on the computers I know, it will most probably be slower than the counter-array solution for a lot of n and x values.
The good point of the knockout solution is that you don't need to put a limit x on the values and you don't need any extra memory.
If you already know that there is a value with the absolute majority (and you simply need to find what this value is), then this could do it (but there are two conditionals in the inner loop):
initialize count = 0
loop over all elements
    if count is 0 then set champion = element and count = 1
    else if element != champion decrement count
    else increment count
at the end of the loop your champion will be the value with the absolute majority of elements, if such a value is present.
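A minimal C rendering of that pseudocode (the function name is illustrative); note it only returns the correct answer when a majority value really exists, as the question guarantees:
int majority(const int *a, int n)
{
    int champion = 0, count = 0;
    for (int i = 0; i < n; i++) {
        if (count == 0) {          /* adopt a new candidate */
            champion = a[i];
            count = 1;
        } else if (a[i] != champion) {
            count--;               /* knock out one occurrence */
        } else {
            count++;
        }
    }
    return champion;
}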
But as said before I'd expect a trivial
for (int i = 0, n = size; i < n; i++) {
    if (++count[x[i]] > half) return x[i];
}
to be faster.
EDIT
After your edit, it seems you're really looking for the knockout algorithm, but if you care about speed, that's probably still the wrong question with modern computers (100000 elements is nothing, even for a nail-sized single chip today).
I think you can create a max heap of the counts of the numbers you read, and use heap sort to find all the counts which are greater than n/2.

calculating the no of steps in insertion sort

Here are two versions of insertion sort, one that I implemented from pseudocode and one written directly. I want to know which version takes more steps and more space (even a little extra space counts).
void insertion_sort(int a[], int n) {
    int key, i, j;
    for (i = 1; i < n; i++) {
        key = a[i];
        j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j+1] = a[j];
            j--;
        }
        a[j+1] = key;
    }
}
and this one
/* 'item' and swap() are assumed to be defined elsewhere. */
void insertion_sort(item s[], int n) {
    int i, j;
    for (i = 1; i < n; i++) {
        j = i;
        while ((j > 0) && (s[j] < s[j-1])) {
            swap(&s[j], &s[j-1]);
            j = j - 1;
        }
    }
}
Here is the sample sorting array a = {5, 2, 4, 6, 1, 3}.
In my opinion the 2nd version takes more steps, because it swaps numbers one by one, while the 1st one shifts the greater numbers in the while loop and then places the smallest number once. For example:
Up to index = 3, both versions take an equal number of steps, but when index = 4 comes, i.e. when the number 1 has to be moved into place, the 2nd takes more steps than the 1st.
What do you think?
"Number of steps" isn't a useful measure of anything.
Is a step a line? A statement? An expression? An assembler instruction? A CPU micro-op?
That is, your "steps" are transformed into assembler and then optimized, and the resulting instructions can have different (and potentially variable) runtime costs.
Sensible questions you might ask:
1. What is the algorithmic complexity? As given in Rafe Kettler's comment and Arpit's answer, this is about how the algorithm scales as the input size grows.
2. How does it perform? If you want to know which is faster (for some set of inputs), you should just measure it.
If you just want to know which performs more swaps, why not just write a swap function that increments a global counter every time it is called, and find out?
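A minimal sketch of that idea: a swap() helper that bumps a global counter, which both versions can then call (names are illustrative):
#include <stdio.h>

static long swap_count = 0;          /* incremented on every swap */

static void swap(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
    swap_count++;
}

/* After running each sort on the same input, print swap_count, e.g.:
 * printf("swaps: %ld\n", swap_count);
 */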
Number of swaps is the wrong term, you should count the number of assignments. swap() expands to three assignments and you therefore usually end up with more assignments in the second version without saving space (you may not have key in the second version, but swap() internally has something similar).
Both versions use two loops, so the time complexity is O(n*n), treating all other statements as constant (O(1)) time.
Let's analyze it line by line. I assume the cost of swap() to be 3.
a)
Computational complexity:
3+(n-1)*(1+1+((n-1)/2)*(1+1+1)*(1+1)+1) = 1+(n-1)*(3n) = 3n^2-3n+1
(We use n/2 because it appears to be the average of continuous worst case scenarios).
Memory:
3 ints, +1 (for loop)
b)
Computational complexity: 2+(n-1)*(1+((n-1)/2)*(1+1+1)*(3+1)) = 2+(n-1)*(6n-5) = 6n^2-11n+7
Memory:
2 ints, +cost of swap (most likely additional 1 integer)
Not counting the input memory, as it is the same in both cases.
Hope it helps.

Comparison of all array elements - C algorithm

I have a matrix m * n and, for each row, I need to compare all elements among them.
For each couple I find, I'll call a function that is going to perform some calculations.
Example:
my_array -> {1, 2, 3, 4, 5, ...}
I take 1 and I have: (1,2)(1,3)(1,4)(1,5)
I take 2 and I have: (2,1)(2,3)(2,4)(2,5)
and so on
Using C I wrote this:
for (i = 0; i < array_length; i++) {
    for (k = 0; k < array_length; k++) {
        if (i == k) continue;
        //Do something
    }
}
I was wondering if I can use an algorithm with lower complexity.
No, it's O(n^2) by definition [ too long to explain here, but trust me (-: ]
But you can decrease the number of iterations by half:
for (i = 0; i < array_length; i++) {
    for (k = i + 1; k < array_length; k++) { // <-- no need to check the values before "i"
        //Do something
        //If the order of i and k makes a difference then here you should:
        //'Do something' for (i,k) and 'Do something' for (k,i)
    }
}
There are several things you might do, but which are possible and which are not depends on the nature of the array and the formula you apply. The overall complexity will probably remain unchanged or even grow, even if the calculation can be made to go faster, unless the formula's cost depends on its arguments, in which case a decrease in complexity may be achievable.
Also, going from A*O(N^a) to B*O(N^b) with b > a (higher complexity) can still be worth pursuing, for some range of N, if B is sufficiently smaller than A.
In no particular order:
if the matrix has several repeated items, it can be convenient to use a caching function:
result function(arg1, arg2) {
    int i = index(arg1, arg2); // Depending on the values, it could be
                               // something like arg1*(MAX_ARG2+1) + arg2;
    if (!stored[i]) {          // stored and values are allocated and initialised
                               // somewhere else - or in this function using a
                               // static flag.
        stored[i] = 1;
        values[i] = true_function(arg1, arg2);
    }
    return values[i];
}
Then, you have a memory overhead proportional to the number of different couples of values available. The call overhead can be O(|arg1|*|arg2|), but in some circumstances (e.g. true_function() is expensive) the savings will more than offset the added complexity.
chop the formula into pieces (not possible for every formula) and express it as:
F(x,y) = G(x) op H(y) op J(x,y)
then, you can do a O(max(M,N)) cycle pre-calculating G[] and H[]. This also has a O(M+N) memory cost. It is only convenient if the computational expenditure difference between F and J is significant. Or you might do:
for (i in 0..N) {
    g = G(array[i]);
    for (j in 0..N) {
        if (i != j) {
            result = f(array[i], array[j], g);
        }
    }
}
which moves the evaluation of G() out of the inner loop, bringing that part of the work from O(N^2) down to O(N).
The first two techniques are usable in tandem if G() or H() are practical to cache (limited range of argument, expensive function).
find a "law" to link F(a, b) with F(a+c, b+d). Then you can run the caching algorithm much more efficiently, reusing the same calculations. This shifts some complexity from O(N^2) to O(N) or even O(log N), so that while the overall cost is still quadratic, it grows much more slowly, and a higher bound for N becomes practical. If F is itself of a higher order of complexity than constant in (a,b), this may also reduce this order (as an extreme example, suppose F is iterative in a and/or b).
No, you can only get lower computational complexity if you include knowledge of the contents of the array, and semantics of the operation to optimize your algorithm.

What is the bug in this code?

Based on logic given as an answer on SO to a different (similar) question, for removing repeated numbers from an array in O(N) time complexity, I implemented that logic in C, as shown below. But my code does not return only unique numbers. I tried debugging but could not work out the logic well enough to fix it.
#include <stdio.h>

int remove_repeat(int *a, int n)
{
    int i, k;
    k = 0;
    for (i = 1; i < n; i++)
    {
        if (a[k] != a[i])
        {
            a[k+1] = a[i];
            k++;
        }
    }
    return (k+1);
}

int main()
{
    int a[] = {1, 4, 1, 2, 3, 3, 3, 1, 5};
    int n;
    int i;

    n = remove_repeat(a, 9);
    for (i = 0; i < n; i++)
        printf("a[%d] = %d\n", i, a[i]);
    return 0;
}
1] What is incorrect in the above code for removing duplicates?
2] Is there any other O(N) or O(N log N) solution for this problem? What is its logic?
1. Heap sort in O(n log n) time.
2. Iterate through in O(n) time, replacing repeating elements with a sentinel value (such as INT_MAX).
3. Heap sort again in O(n log n) to distil out the repeating elements.
Still bounded by O(n log n).
Your code only checks whether an item in the array is the same as its immediate predecessor.
If your array starts out sorted, that will work, because all instances of a particular number will be contiguous.
If your array isn't sorted to start with, that won't work because instances of a particular number may not be contiguous, so you have to look through all the preceding numbers to determine whether one has been seen yet.
To do the job in O(N log N) time, you can sort the array, then use the logic you already have to remove duplicates from the sorted array. Obviously enough, this is only useful if you're all right with rearranging the numbers.
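For the rearranging-is-fine case, a sketch: sort with qsort() first, after which the question's own adjacent-duplicate logic works, at an overall O(N log N) cost (remove_repeat_sorted and cmp_int are illustrative names):
#include <stdlib.h>

static int cmp_int(const void *p, const void *q)
{
    int a = *(const int *)p, b = *(const int *)q;
    return (a > b) - (a < b);
}

int remove_repeat_sorted(int *a, int n)
{
    if (n == 0) return 0;
    qsort(a, n, sizeof *a, cmp_int);    /* duplicates are now adjacent */
    int k = 0;
    for (int i = 1; i < n; i++)
        if (a[k] != a[i])
            a[++k] = a[i];
    return k + 1;                       /* number of unique elements kept */
}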
If you want to retain the original order, you can use something like a hash table or bit set to track whether a number has been seen yet or not, and only copy each number to the output when/if it has not yet been seen. To do this, we change your current:
if (a[k] != a[i])
    a[k+1] = a[i];
to something like:
if (!hash_find(hash_table, a[i])) {
    hash_insert(hash_table, a[i]);
    a[k+1] = a[i];
}
If your numbers all fall within fairly narrow bounds or you expect the values to be dense (i.e., most values are present) you might want to use a bit-set instead of a hash table. This would be just an array of bits, set to zero or one to indicate whether a particular number has been seen yet.
On the other hand, if you're more concerned with the upper bound on complexity than the average case, you could use a balanced tree-based collection instead of a hash table. This will typically use more memory and run more slowly, but its expected complexity and worst case complexity are essentially identical (O(N log N)). A typical hash table degenerates from constant complexity to linear complexity in the worst case, which will change your overall complexity from O(N) to O(N^2).
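To illustrate the bit-set idea mentioned above, here is a sketch assuming all values fall in a known range 0..MAX_VAL (an assumption, not something the question states); for simplicity it uses one byte per flag rather than a packed bit set:
#include <stdio.h>
#include <string.h>

#define MAX_VAL 1000   /* assumed upper bound on the values */

int remove_repeat_bitset(int *a, int n)
{
    unsigned char seen[MAX_VAL + 1];
    memset(seen, 0, sizeof seen);

    int k = 0;                      /* next write position */
    for (int i = 0; i < n; i++) {
        if (!seen[a[i]]) {          /* first time we meet this value */
            seen[a[i]] = 1;
            a[k++] = a[i];
        }
    }
    return k;                       /* number of unique elements kept */
}

int main(void)
{
    int a[] = {1, 4, 1, 2, 3, 3, 3, 1, 5};
    int n = remove_repeat_bitset(a, 9);
    for (int i = 0; i < n; i++)
        printf("a[%d] = %d\n", i, a[i]);
    return 0;
}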
Your code would appear to require that the input is sorted. With unsorted inputs as you are testing with, your code will not remove all duplicates (only adjacent ones).
You can get an O(N) solution if the range of the integers is known up front and small enough to fit in the memory you have :). Make one pass to record the unique integers you have using auxiliary storage, then another to output the unique values.
Code below is in Java, but hopefully you get the idea.
int[] removeRepeats(int[] a) {
    // Assume the values are integers between 0 and 1000
    Boolean[] v = new Boolean[1001]; // A lazy way of getting a tri-state var (false, true, null)
    for (int i = 0; i < a.length; ++i) {
        v[a[i]] = Boolean.TRUE;
    }
    // v[i] = null => number not seen
    // v[i] = true => number seen
    int[] out = new int[a.length];
    int ptr = 0;
    for (int i = 0; i < a.length; ++i) {
        if (v[a[i]] != null && v[a[i]].equals(Boolean.TRUE)) {
            out[ptr++] = a[i];
            v[a[i]] = Boolean.FALSE;
        }
    }
    // out now doesn't contain duplicates, order is preserved and ptr represents how
    // many elements are set.
    return out;
}
You are going to need two loops, one to go through the source and one to check each item in the destination array.
You are not going to get O(N).
[EDIT]
The article you linked to suggests a sorted output array, which means the search for duplicates in the output array can be a binary search... which is O(log N).
Your logic is just wrong, so the code is wrong too. Work out the logic by yourself before coding it.
I suggest an O(N ln N) way using a modification of heapsort.
With heapsort, we go from a[i] to a[n], find the minimum and replace it with a[i], right?
So here is the modification: if the minimum is the same as a[i-1], then swap the minimum with a[n] and reduce the array's item count by 1.
It should do the trick in O(N ln N).
Your code will work only in particular cases. Clearly, you're checking adjacent values, but duplicate values can occur anywhere in the array. Hence, it's totally wrong.
