Sum of bit differences among all pairs

Sum of bit differences among all pairs - arrays

The problem statement is the following:
Given an integer array of n integers, find sum of bit differences in all pairs that can be formed from array elements. Bit difference of a pair (x, y) is count of different bits at same positions in binary representations of x and y.
For example, the bit difference for 2 and 7 is 2. The binary representation of 2 is 010 and of 7 is 111 (the first and last bits differ between the two numbers).
Examples:
Input: arr[] = {1, 2}
Output: 4
All pairs in array are (1, 1), (1, 2), (2, 1), (2, 2)
Sum of bit differences = 0 + 2 + 2 + 0 = 4
Based on this post the most efficient (running time of O(n)) solution is the following:
The idea is to count differences at individual bit positions. We traverse from 0 to 31 and count the numbers with the i'th bit set. Let this count be 'count'. There would be "n - count" numbers with the i'th bit not set. So the count of differences at the i'th bit would be "count * (n - count) * 2", because each pair of one set and one unset bit is counted in both orders.
// C++ program to compute sum of pairwise bit differences
#include <bits/stdc++.h>
using namespace std;
int sumBitDifferences(int arr[], int n)
{
int ans = 0; // Initialize result
// traverse over all bits
for (int i = 0; i < 32; i++)
{
// count number of elements with i'th bit set
int count = 0;
for (int j = 0; j < n; j++)
if ( (arr[j] & (1 << i)) )
count++;
// Add "count * (n - count) * 2" to the answer
ans += (count * (n - count) * 2);
}
return ans;
}
// Driver program
int main()
{
int arr[] = {1, 3, 5};
int n = sizeof arr / sizeof arr[0];
cout << sumBitDifferences(arr, n) << endl;
return 0;
}
What I'm not entirely clear on is how the running time would be linear when there are two for loops incrementing by 1 for each iteration. The way I'm interpreting it is that since the outer loop is iterating from 0 to 31 (corresponding to the 0th through 31st bits of each number) and because I'm guessing all 32 bit shifts would happen in the same clock period (or relatively fast compared to linear iteration), the overall running time would be dominated by the linear iteration over the array.
Is this the correct interpretation?

In English, "My algorithm runs in O(n) time" translates to "My algorithm runs in time that is at most proportional to n for very large inputs". The proportionality aspect of that is the reason that 32 iterations in an outer loop don't make any difference. The execution time is still proportional to n.
Let's look at a different example:
for (int i=0; i<n; i++) {
for (int j=0; j<n; j++) {
// do something
}
}
In this example the execution time is proportional to n^2, so it's not O(n). It is, however, O(n^2). And technically O(n^3) and O(n^4), ... as well. This follows from the definition.
There's only so much you can talk about this stuff in English without misinterpretation, so if you want to nail down the concepts you're best off checking out the formal definition in an introductory algorithms textbook or online class and working out a few examples.
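For reference, the usual textbook definition can be written out and applied to both loops. The constants c, k and n_0 below are generic symbols introduced for this sketch, not values taken from the post:

```latex
f(n) = O(g(n)) \iff \exists\, c > 0,\ n_0 \ \text{such that}\ f(n) \le c \cdot g(n) \ \text{for all}\ n \ge n_0

% 32 outer iterations, each doing at most k units of work per element
T(n) = 32 \cdot k \cdot n \le (32k) \cdot n \quad\Rightarrow\quad T(n) = O(n) \ \text{with}\ c = 32k

% two nested passes over n elements
T(n) = k \cdot n^2 \le c \cdot n \ \text{fails for every}\ n > c/k \quad\Rightarrow\quad T(n) \ne O(n)
```

The factor 32 is absorbed into c, which is why the fixed-width outer loop does not change the bound.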

Instead of comparing every pair of numbers, we can compare the i-th bit of every number with the i-th bit of every other number, which reduces the time complexity from O(n*n) to O(32*n). We just need to count the pairs of zeroes and ones possible at the i-th bit across all numbers.
A simple C++ implementation would look like this:
int cntBits(vector<int> &A) {
int n = A.size();
int ans = 0;
for(int i=0;i<31;i++) {
long long z = 0,o = 0;
for(int j=0;j<n;j++) {
if( ((A[j]>>i)&1 ) == 1 ) o++;
else z++;
}
ans = ( ans + (z*o)%1000000007 )%1000000007;
}
return (2*ans)%1000000007;
}
Anyone who is feeling stuck over this logic can refer to this explanation: https://www.youtube.com/watch?v=OKROwC2fLEg

C - Generate random sequence with no repeats without shuffling

I want to generate an array of the sequence [0...1'000'000] in random order without shuffling.
This means that I don't want to do:
int arr[1000000];
for (int i = 0; i < 1000000; i++)
{
arr[i] = i;
}
shuffle(arr);
I want to figure out how to do it without the "black-box" shuffle function. I also don't want to randomly select an index between 1 and 1'000'000 because at number 999'999 there would be only a 1/1'000'000 chance to continue.
I've been trying to think of a solution and I think the key is parallel arrays and looping backwards then using modulus to limit only to the indexes that you haven't already been to, but then I can't guarantee that the value I get is unique.
I don't want to use a HashSet or TreeSet implementation as well.

This can be done in O(n) time with two lists, one with the numbers (initially) in order, and one in the resulting order.
You start with n elements in order in your source list. Then you select a random number mod n. That gives you the next element, which you place in the destination list.
Now the key part. If you were to pick a random number between 0 and n-1 each time, as you seem to think a shuffle does, you have an increasing chance of selecting a number you selected before. So how do you handle this? By decreasing the available list of numbers to select from.
In the source list, after selecting a number, you move the last element of the list to the index that was just used. You now have a list of n-1 numbers to choose from. So on the next iteration you take a random number mod n-1. Keep going until your source list only has one element.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define LEN 10
int main()
{
int a[LEN], b[LEN];
int i, val;
int count = LEN;
srand(time(NULL));
for (i=0;i<LEN;i++) {
a[i]=i+1;
}
for (i=0;i<LEN;i++) {
val = rand() % count;
b[i] = a[val];
a[val] = a[count-1];
count--;
}
for (i=0;i<LEN;i++) {
printf("%d ", b[i]);
}
printf("\n");
return 0;
}
EDIT:
Here's a slightly more efficient version that doesn't use two arrays and is therefore O(1) space:
int a[LEN];
int i, val, tmp;
srand(time(NULL));
for (i=0;i<LEN;i++) {
a[i]=i+1;
}
for (i=0;i<LEN-1;i++) {
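// pick from i..LEN-1 so an element may stay in place; starting at i+1 would only produce cyclic permutations
val = (rand() % (LEN - i)) + i;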
tmp = a[i];
a[i] = a[val];
a[val] = tmp;
}
for (i=0;i<LEN;i++) {
printf("%d ", a[i]);
}
printf("\n");

The O(N) answer is great but here is an alternative way using binary search and binary indexed tree to do this in O(NlogN).
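arr = []
N = 1,000,000
for i from 0 to N-1
    low = 0
    high = N-1
    while low < high
        mid = (low+high)/2 // recomputed on every step of the search
        if full(low,mid)
            low = mid+1
        else if full(mid+1,high)
            high = mid
        else
            // note: a fair coin ignores how many unmarked elements each half holds,
            // so the order is random but not uniformly distributed
            if rand() < 0.5
                low = mid+1
            else
                high = mid
    mark(low) // marking the element in binary indexed tree
    arr[i] = low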
The function full is implemented using binary indexed tree and checks whether all the elements in the range given are marked or not.
Both mark and full have O(logN) complexity.
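A minimal C sketch of how `mark` and `full` could sit on top of a binary indexed (Fenwick) tree. The size `N`, the helper `prefix_count` and the global `tree` array are assumptions made for illustration, and the sketch assumes each index is marked at most once:

```c
#include <assert.h>
#include <stdbool.h>

#define N 16                 /* hypothetical small size for illustration */

static int tree[N + 1];      /* 1-based Fenwick tree holding counts of marked slots */

/* Marks position idx (0-based) in O(log N). */
void mark(int idx)
{
    assert(idx >= 0 && idx < N);
    for (int i = idx + 1; i <= N; i += i & -i)
        tree[i]++;
}

/* Number of marked positions in 0..idx (inclusive); idx == -1 yields 0. */
static int prefix_count(int idx)
{
    int total = 0;
    for (int i = idx + 1; i > 0; i -= i & -i)
        total += tree[i];
    return total;
}

/* True when every position in [low, high] is already marked, in O(log N). */
bool full(int low, int high)
{
    assert(low >= 0 && low <= high && high < N);
    return prefix_count(high) - prefix_count(low - 1) == high - low + 1;
}
```

Both functions walk at most one tree node per bit of the index, which is where the O(logN) bound comes from.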

How can I make this very small C program faster?

Is there any simple way to make this small program faster? I've made it for an assignment, and it's correct but too slow. The aim of the program is to print the nth pair of primes where the difference between the two is two, given n.
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
bool isPrime(int number) {
for (int i = 3; i <= number/2; i += 2) {
if (!(number%i)) {
return 0;
}
}
return 1;
}
int findNumber(int n) {
int prevPrime, currentNumber = 3;
for (int i = 0; i < n; i++) {
do {
prevPrime = currentNumber;
do {
currentNumber+=2;
} while (!isPrime(currentNumber));
} while (!(currentNumber - 2 == prevPrime));
}
return currentNumber;
}
int main(int argc, char *argv[]) {
int numberin, numberout;
scanf ("%d", &numberin);
numberout = findNumber(numberin);
printf("%d %d\n", numberout - 2, numberout);
return 0;
}
I considered using some kind of array or list that would contain all primes found up until the current number and divide each number by this list instead of all numbers, but we haven't really covered these different data structures yet so I feel I should be able to solve this problem without. I'm just starting with C, but I have some experience in Python and Java.

To find pairs of primes which differ by 2, you only need to find one prime and then add 2 and test if it is also prime.
if (isPrime(x) && isPrime(x+2)) { /* found pair */ }
To find primes, the standard algorithm is the Sieve of Eratosthenes. You need to build a lookup table up to N, where N is the maximum number that you can get. Once the Sieve is built, you can check in O(1) whether a number is prime. While building the Sieve you can also build a sorted list of primes.
If your N is big you can also profit from the fact that a number P is prime iff it doesn't have any prime factors <= SQRT(P) (because if it has a factor > SQRT(P) then it must also have one < SQRT(P)). You can build a Sieve of Eratosthenes with size SQRT(N) to get a list of primes and then test whether any of those primes divides P. If none divides P, P is prime.
With this approach you can test numbers up to 1 billion or so relatively fast and with little memory.
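As a rough sketch of the lookup-table idea, the function below sieves up to a caller-chosen bound and then scans for pairs. The name `nth_twin_prime` and the `limit` parameter are assumptions for illustration; the caller has to guess a bound large enough for the requested n:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Returns the smaller member of the n-th pair of primes that differ by 2,
   counting (3,5) as the first, or 0 if the pair does not fit below limit
   or the allocation fails. */
unsigned nth_twin_prime(unsigned n, unsigned limit)
{
    assert(n >= 1 && limit >= 5);
    bool *composite = calloc((size_t)limit + 1, sizeof *composite);
    if (!composite)
        return 0;
    /* Sieve of Eratosthenes: cross out multiples of each prime, starting at p*p */
    for (unsigned long long p = 2; p * p <= limit; p++)
        if (!composite[p])
            for (unsigned long long m = p * p; m <= limit; m += p)
                composite[m] = true;
    unsigned found = 0, result = 0;
    for (unsigned a = 3; a + 2 <= limit; a += 2) {
        /* two O(1) table lookups replace two trial-division loops */
        if (!composite[a] && !composite[a + 2] && ++found == n) {
            result = a;
            break;
        }
    }
    free(composite);
    return result;
}
```

For example, `nth_twin_prime(3, 100)` looks for the third pair below 100, and a return value of 0 signals that the bound was too small.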

Here is an improvement to speed up the loop in isPrime:
bool isPrime(int number) {
for (int i = 3; i * i <= number; i += 2) { // Changed the loop condition
if (!(number%i)) {
return 0;
}
}
return 1;
}

You are calling isPrime more often than necessary. You wrote
currentNumber = 3;
/* ... */
do {
currentNumber+=2;
} while (!isPrime(currentNumber));
...which means that isPrime is called for every odd number. However, when you identified that e.g. 5 is prime, you can already tell that 10, 15, 20 etc. are not going to be prime, so you don't need to test them.
This approach of 'crossing-out' multiples of primes is done when using a sieve filter, see e.g. Sieve of Eratosthenes algorithm in C for an implementation of a sieve filter for primes in C.

Avoid testing every 3rd candidate
Pairs of primes a, a+2 may only be found at a = 6*n + 5 (except the pair 3, 5).
Why?
a + 0 = 6*n + 5 Maybe a prime
a + 2 = 6*n + 7 Maybe a prime
a + 4 = 6*n + 9 Not a prime when more than 3 as 6*n + 9 is a multiple of 3
So rather than test every other integer with + 2, test with
a = 5;
loop {
if (isPrime(a) && isPrime(a+2)) PairCount++;
a += 6;
}
Improve loop exit test
Many processors/compilers, when calculating the remainder, will also have available, for nearly "free" CPU time cost, the quotient. YMMV. Use the quotient rather than i <= number/2 or i*i <= number to limit the test loop.
Use of sqrt() has a number of problems: range of double vs. int, exactness, conversion to/from integer. Recommend avoid sqrt() for this task.
Use unsigned for additional range.
bool isPrime(unsigned x) {
// With OP's selective use, the following line is not needed.
// Yet needed for a general purpose `isPrime()`
if (x%2 == 0) return x == 2;
if (x <= 3) return x == 3;
unsigned p = 1;
unsigned quotient, remainder;
do {
p += 2;
remainder = x%p;
if (remainder == 0) return false;
quotient = x/p; // quotient for "free"
} while (p < quotient); // Low cost compare
return true;
}
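The two suggestions above (stepping candidates by 6 and ending the trial division on the quotient) can be combined roughly as follows. The names `is_prime` and `twin_prime` are illustrative, and the sketch assumes n is small enough that the candidates stay within the range of `unsigned`:

```c
#include <assert.h>
#include <stdbool.h>

/* Trial division by odd numbers, stopping once the divisor reaches the quotient. */
static bool is_prime(unsigned x)
{
    if (x % 2 == 0) return x == 2;
    if (x <= 3) return x == 3;
    for (unsigned p = 3; ; p += 2) {
        if (x % p == 0) return false;
        if (p >= x / p) return true;   /* no divisor up to sqrt(x) was found */
    }
}

/* Smaller member of the n-th pair of primes that differ by 2,
   counting (3,5) as the first pair. */
unsigned twin_prime(unsigned n)
{
    assert(n >= 1);
    if (n == 1)
        return 3;                      /* the only pair not of the form 6k+5 */
    unsigned found = 1;
    for (unsigned a = 5; ; a += 6)     /* candidates 5, 11, 17, 23, ... */
        if (is_prime(a) && is_prime(a + 2) && ++found == n)
            return a;
}
```

Only one candidate in every six integers is examined, and `a + 2` is tested only when `a` itself turned out to be prime.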

grey codes using 2d arrays (C)

My assignment is to print out grey codes using recursion. A user puts in a bit value between 0-8, therefore the maximum amount of strings you can have is 256 (2^8).
I've got the base case done but I don't know what I would do for the else portion.
My code so far:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
void gcodes (int n) {
char bits[256][8];
int i, j;
int x = pow (2, n);
if (n == 1) {
bits[0][0] = '0';
bits[1][0] = '1';
} else {
gcodes (n-1);
}
for (i=0; i<x; i++) {
for (j=0; j<n; j++) {
printf("%c", bits[i][j]);
}
printf("\n");
}
}
int main(int argc, char *argv[]) {
if (argc != 2) {
printf("Invalid number of arguments\n");
return 0;
}
int n;
n = atoi (argv[1]);
if (n > 8 || n <= 0) {
printf("Invalid integer\n");
return 0;
}
gcodes (n);
}

A Gray code can have only one bit change from one number to the next consecutive number, and over the whole sequence there are no repeated values.
Given those criteria, there are several possible Gray code implementations.
There are several dead-end sequences where the values start off OK, then fail.
Calculating a Gray code via code will take lots of experimentation.
In reality it is much easier to simply find a valid Gray code sequence from the net, and paste that into any program that needs a Gray code sequence.
Most often, an input is a Gray-coded wheel that is read to determine if the wheel moved, rather than something generated in code.
However, if I were implementing a Gray code generator, I would expect it to perform an exclusive-or between the last generated value and the proposed new/next value and, if that is valid (only one bit changed), I would search through the existing table of values to ensure it is not a duplicate.
This SO question suggests a possible algorithm:
Non-recursive Grey code algorithm understanding
and the answer is repeated below:
The answer to all four of your questions is that this algorithm does not start with lower values of n. All strings it generates have the same length, and the i-th (for i = 1, ..., 2^n - 1) string is generated from the (i-1)-th one.
Here are the first few steps for n = 4:
Start with G0 = 0000
To generate G1, flip 0-th bit in G0, as 0 is the position of the least significant 1 in the binary representation of 1 = 0001b. G1 = 0001.
To generate G2, flip 1-st bit in G1, as 1 is the position of the least significant 1 in the binary representation of 2 = 0010b. G2 = 0011.
To generate G3, flip 0-th bit in G2, as 0 is the position of the least significant 1 in the binary representation of 3 = 0011b. G3 = 0010.
To generate G4, flip 2-nd bit in G3, as 2 is the position of the least significant 1 in the binary representation of 4 = 0100b. G4 = 0110.
To generate G5, flip 0-th bit in G4, as 0 is the position of the least significant 1 in the binary representation of 5 = 0101b. G5 = 0111.
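The steps listed above translate into a short loop. This is only a sketch of that non-recursive rule: the name `gray_sequence` and the caller-supplied `codes` array are assumptions, and the codes are stored as integers instead of character strings:

```c
#include <assert.h>

/* Fills codes[0 .. 2^n - 1] with the n-bit Gray sequence. At step i the bit
   at the position of the least significant 1 of i is flipped in the
   previous code. */
void gray_sequence(unsigned n, unsigned codes[])
{
    assert(n >= 1 && n <= 8);          /* same range as in the question */
    unsigned count = 1u << n;
    codes[0] = 0;                      /* G0 = 00...0 */
    for (unsigned i = 1; i < count; i++)
        codes[i] = codes[i - 1] ^ (i & -i);   /* i & -i isolates the lowest set bit of i */
}
```

For n = 4 this reproduces G1 = 0001, G2 = 0011, G3 = 0010, G4 = 0110 and G5 = 0111 from the walk-through, and each entry equals `i ^ (i >> 1)`, the usual closed form of the reflected Gray code.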

Since you define
char bits[256][8];
with automatic storage duration inside the function gcodes(), the array's lifetime ends when returning from the function, so you lose the results of the recursive calls. Thus, at least define it
static char bits[256][8];
or globally if you want to keep the resulting bits for use outside of gcodes().
Since in the standard Gray code the least significant bit (bit 0) follows the repetitive pattern of 0110, it is convenient to set the complete pattern in the base case even if it is not needed for n = 1.
For the ith code's bit j where j > 0, its value can be taken from bit j-1 of code i/2.
This leads to the completed function:
void gcodes(int n)
{
static char bits[256][8];
int i, j, x = pow(2, n);
if (n == 1)
{
bits[0][0] = '0';
bits[1][0] = '1';
bits[2][0] = '1';
bits[3][0] = '0';
}
else
{
gcodes(n-1);
// generate bit j (from n-1 down to 1) for codes up to x-1
for (i=0, j=n; --j; i=x/2)
for (; i<x; i++)
bits[i][j] = bits[i/2][j-1];
// replicate bit 0 for codes up to x-1
for (; i<x; i++)
bits[i][0] = bits[i%4][0];
}
for (i=0; i<x; i++, printf("\n"))
for (j=n; j--; )
printf("%c", bits[i][j]);
}

Multiplication of very large numbers using character strings

I'm trying to write a C program which performs multiplication of two numbers without directly using the multiplication operator, and it should take into account numbers which are sufficiently large so that even the usual addition of these two numbers cannot be performed by direct addition.
I was motivated for this when I was trying to (and successfully did) write a C program which performs addition using character strings, I did the following:
#include<stdio.h>
#define N 100000
#include<string.h>
void pushelts(char X[], int n){
int i, j;
for (j = 0; j < n; j++){
for (i = strlen(X); i >= 0; i--){
X[i + 1] = X[i];
}
X[0] = '0';
}
}
int max(int a, int b){
if (a > b){ return a; }
return b;
}
void main(){
char E[N], F[N]; int C[N]; int i, j, a, b, c, d = 0, e;
printf("Enter the first number: ");
gets_s(E);
printf("\nEnter the second number: ");
gets_s(F);
a = strlen(E); b = strlen(F); c = max(a, b);
pushelts(E, c - a); pushelts(F, c - b);
for (i = c - 1; i >= 0; i--){
e = d + E[i] + F[i] - 2*'0';
C[i] = e % 10; d = e / 10;
}
printf("\nThe answer is: ");
for (i = 0; i < c; i++){
printf("%d", C[i]);
}
getchar();
}
It can add any two numbers with "N" digits. Now, how would I use this to perform multiplication of large numbers? First, I wrote a function which performs the multiplication of number, which is to be entered as a string of characters, by a digit n (i.e. 0 <= n <= 9). It's easy to see how such a function is written; I'll call it (*). Now the main purpose is to multiply two numbers (entered as a string of characters) with each other. We might look at the second number with k digits (assuming it's a1a2.....ak) as:
a1a2...ak = a1 x 10^(k - 1) + a2 x 10^(k - 2) + ... + ak-1 x 10 + ak
So the multiplication of the two numbers can be achieved using the solution designed for addition and the function (*).
If the first number is x1x2.....xn and the second one is y1y2....yk, then:
x1x2...xn x y1y2...yk = (x1x2...xn) x y1 x 10^(k-1) + .....
Now the function (*) can multiply (x1x2...xn) by y1, and the multiplication by 10^(k-1) just appends k-1 zeros to the number; finally we add all of these k terms together to obtain the result. The difficulty lies in knowing how many digits each number contains, so that the addition can be performed each time inside the loop that sums the terms. I have thought about starting with an array of zeros and adding to it, each time, the result of multiplying (x1x2....xn) by yi x 10^(k-i), but as I've said I am unable to pin down the required bounds, and I don't know how many zeros I should put in front of each partial result in order to add it to that array using the above algorithm. More difficulty arises because I'll have to do several conversions from char to int and back. Maybe I'm making this more complicated than it should be; I don't know if there's an easier way to do this or if there are tools I'm unaware of. I'm a beginner at programming and I don't know much beyond the elementary tools.
Does anyone have a solution or an idea or an algorithm to present? Thanks.

There is an algorithm for this which I developed when doing Small Factorials problem on SPOJ.
This algorithm is based on the elementary school multiplication method. In school days we learn multiplication of two numbers by multiplying each digit of the first number with the last digit of the second number. Then multiplying each digit of the first number with second last digit of the second number and so on as follows:
      1234
x       56
------------
      7404
+    6170-      // - is denoting the left shift
------------
     69104
What actually is happening:
num1 = 1234, num2 = 56, left_shift = 0;
char_array[] = all digits in num1, least significant digit first
result_array[] = all zeros
while(num2)
    n = num2%10
    num2 /= 10
    carry = 0, i = left_shift, j = 0
    while(j < number of digits in char_array)
        i. partial_result = char_array[j++]*n + carry
        ii. partial_result += result_array[i]
        iii. result_array[i++] = partial_result%10
        iv. carry = partial_result/10
    if(carry > 0) result_array[i] = carry
    left_shift++
Print the result_array in reverse order.
You should note that the above algorithm works only if num1 and num2 do not exceed the range of their data type. If you want a more generic program, then you have to read both numbers into char arrays. The logic will be the same. Declare num1 and num2 as char arrays. See the implementation:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    char num1[200], num2[200];
    char result_arr[400] = {'\0'};
    int left_shift = 0;
    if (!fgets(num1, 200, stdin) || !fgets(num2, 200, stdin))
        return EXIT_FAILURE;
    // strip the trailing newline that fgets may leave in the buffer
    num1[strcspn(num1, "\n")] = '\0';
    num2[strcspn(num2, "\n")] = '\0';
    size_t n1 = strlen(num1);
    size_t n2 = strlen(num2);
    // size_t is unsigned, so "i >= 0" is always true; count down with "i-- > 0" instead
    for(size_t i = n2; i-- > 0; )
    {
        int carry = 0, k = left_shift;
        for(size_t j = n1; j-- > 0; )
        {
            int partial_result = (num1[j] - '0')*(num2[i] - '0') + carry;
            if(result_arr[k])
                partial_result += result_arr[k] - '0';
            result_arr[k++] = partial_result%10 + '0';
            carry = partial_result/10;
        }
        if(carry > 0)
            result_arr[k] = carry +'0';
        left_shift++;
    }
    // digits are stored least significant first, so print them in reverse
    size_t len = strlen(result_arr);
    for(size_t i = len; i-- > 0; )
        printf("%c", result_arr[i]);
    printf("\n");
}
This is not a standard algorithm but I hope this will help.

Bignum arithmetic is hard to implement efficiently. The algorithms are quite hard to understand (and efficient algorithms are better than the naive one you are trying to implement), and you could find several books on them.
I would suggest using an existing Bignum library like GMPLib or use some language providing bignums natively (e.g. Common Lisp with SBCL)

You could re-use your character-string-addition code as follows (using user300234's example of 384 x 56):
Set result="0" /* using your character-string representation */
repeat:
Set N = ones_digit_of_multiplier /* 6 in this case */
for (i = 0; i < N; ++i)
result += multiplicand /* using your addition algorithm */
Append "0" to multiplicand /* multiply it by 10 --> 3840 */
Chop off the bottom digit of multiplier /* divide it by 10 --> 5 */
Repeat if multiplier != 0.
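The pseudocode above could be fleshed out along these lines. This is only a sketch: `MAXLEN`, `add_into` and `multiply` are hypothetical names, `add_into` stands in for the asker's own string-addition routine, and the inputs are assumed to be non-empty strings of decimal digits:

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 512   /* hypothetical buffer size shared by all number strings */

/* acc = acc + x; both are decimal strings, most significant digit first. */
static void add_into(char acc[MAXLEN], const char *x)
{
    size_t la = strlen(acc), lx = strlen(x);
    size_t len = (la > lx ? la : lx) + 1;      /* one extra digit for a final carry */
    assert(len < MAXLEN);
    char sum[MAXLEN];
    int carry = 0;
    for (size_t k = 0; k < len; k++) {         /* k counts digits from the right */
        int d = carry;
        if (k < la) d += acc[la - 1 - k] - '0';
        if (k < lx) d += x[lx - 1 - k] - '0';
        sum[len - 1 - k] = (char)('0' + d % 10);
        carry = d / 10;
    }
    sum[len] = '\0';
    const char *start = sum;                   /* skip leading zeros, keep one digit */
    while (*start == '0' && start[1] != '\0')
        start++;
    strcpy(acc, start);
}

/* result = a * b, using only repeated addition and appended zeros. */
void multiply(const char *a, const char *b, char result[MAXLEN])
{
    char multiplicand[MAXLEN];
    size_t la = strlen(a), lb = strlen(b);
    assert(la >= 1 && lb >= 1 && la + lb + 1 < MAXLEN);
    strcpy(multiplicand, a);
    strcpy(result, "0");
    for (size_t k = lb; k-- > 0; ) {           /* digits of b, ones digit first */
        for (int rep = b[k] - '0'; rep > 0; rep--)
            add_into(result, multiplicand);    /* result += multiplicand */
        strcat(multiplicand, "0");             /* multiplicand *= 10 */
    }
}
```

Each digit of the multiplier costs at most nine additions, so the operand lengths never have to be worked out in advance: `add_into` pads the shorter number implicitly.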

How to find a duplicate element in an array of shuffled consecutive integers?

I recently came across a question somewhere:
Suppose you have an array of 1001 integers. The integers are in random order, but you know each of the integers is between 1 and 1000 (inclusive). In addition, each number appears only once in the array, except for one number, which occurs twice. Assume that you can access each element of the array only once. Describe an algorithm to find the repeated number. If you used auxiliary storage in your algorithm, can you find an algorithm that does not require it?
What I am interested in to know is the second part, i.e., without using auxiliary storage. Do you have any idea?

Just add them all up, and subtract from that the total you would expect if only the numbers 1 to 1000 were used.
Eg:
Input: 1,2,3,2,4 => 12
Expected: 1,2,3,4 => 10
Input - Expected => 2
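In C the same idea might look like the sketch below. The function name is an assumption, and `long` is used for the running total on the assumption that it is wide enough for the sum:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the repeated value in an array that holds every number of
   1..count-1 once, plus one of them a second time. */
long find_duplicate_by_sum(const int *array, size_t count)
{
    assert(count >= 2);
    long n = (long)count - 1;          /* the values range over 1..n */
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += array[i];             /* each element is read exactly once */
    return total - n * (n + 1) / 2;    /* subtract 1 + 2 + ... + n */
}
```

For the 1001-element array of the question, `count` is 1001 and the subtracted total is 500500.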

Update 2: Some people think that using XOR to find the duplicate number is a hack or trick. To which my official response is: "I am not looking for a duplicate number, I am looking for a duplicate pattern in an array of bit sets. And XOR is definitely suited better than ADD to manipulate bit sets". :-)
Update: Just for fun before I go to bed, here's "one-line" alternative solution that requires zero additional storage (not even a loop counter), touches each array element only once, is non-destructive and does not scale at all :-)
printf("Answer : %d\n",
array[0] ^
array[1] ^
array[2] ^
// continue typing...
array[999] ^
array[1000] ^
1 ^
2 ^
// continue typing...
999^
1000
);
Note that the compiler will actually calculate the second half of that expression at compile time, so the "algorithm" will execute in exactly 1002 operations.
And if the array element values are known at compile time as well, the compiler will optimize the whole statement to a constant. :-)
Original solution: Which does not meet the strict requirements of the questions, even though it works to find the correct answer. It uses one additional integer to keep the loop counter, and it accesses each array element three times - twice to read it and write it at the current iteration and once to read it for the next iteration.
Well, you need at least one additional variable (or a CPU register) to store the index of the current element as you go through the array.
Aside from that one though, here's a destructive algorithm that can safely scale for any N up to MAX_INT.
for (int i = 1; i < 1001; i++)
{
array[i] = array[i] ^ array[i-1] ^ i;
}
printf("Answer : %d\n", array[1000]);
I will leave the exercise of figuring out why this works to you, with a simple hint :-):
a ^ a = 0
0 ^ a = a

A non-destructive version of the solution by Franci Penov.
This can be done by making use of the XOR operator.
Let's say we have an array of size 5: 4, 3, 1, 2, 2
Which are at the indices: 0, 1, 2, 3, 4
Now do an XOR of all the elements and all the indices. We get 2, which is the duplicate element. This happens because 0 plays no role in the XORing. The remaining n-1 indices pair with the same n-1 values in the array, and the only unpaired element in the array will be the duplicate.
int i;
int dupe = 0;
for(i = 0; i < N; i++) {
dupe = dupe ^ arr[i] ^ i;
}
// dupe has the duplicate.
The best feature of this solution is that it does not suffer from the overflow problems that are seen in the addition-based solution.
Since this is an interview question, it would be best to start with the addition-based solution, identify the overflow limitation and then give the XOR-based solution :)
This makes use of an additional variable so does not meet the requirements in the question completely.

Add all the numbers together. The final sum will be the 1+2+...+1000+duplicate number.

To paraphrase Franci Penov's solution.
The (usual) problem is: given an array of integers of arbitrary length that contains only elements repeated an even number of times, except for one value which is repeated an odd number of times, find that value.
The solution is:
acc = 0
for i in array: acc = acc ^ i
Your current problem is an adaptation. The trick is that you are to find the element that is repeated twice, so you need to adapt the solution to compensate for this quirk.
acc = 0
for i in range(len(array)): acc = acc ^ i ^ array[i]
Which is what Franci's solution does in the end, although it destroys the whole array (by the way, it could only destroy the first or last element...)
But since you need extra-storage for the index, I think you'll be forgiven if you also use an extra integer... The restriction is most probably because they want to prevent you from using an array.
It would have been phrased more accurately if they had required O(1) space (1000 can be seen as N since it's arbitrary here).

Add all numbers. The sum of integers 1..1000 is (1000*1001)/2. The difference from what you get is your number.

One line solution in Python
from functools import reduce  # reduce is no longer a builtin in Python 3
arr = [1,3,2,4,2]
print(reduce(lambda acc, ix: acc ^ ix[0] ^ ix[1], enumerate(arr), 0))
# -> 2
An explanation of why it works is in Matthieu M.'s answer.

If you know that we have the exact numbers 1-1000, you can add up the results and subtract 500500 (sum(1, 1000)) from the total. This will give the repeated number because sum(array) = sum(1, 1000) + repeated number.

Well, there is a very simple way to do this... each of the numbers between 1 and 1000 occurs exactly once except for the number that is repeated.... so, the sum from 1....1000 is 500500. So, the algorithm is:
sum = 0
for each element of the array:
sum += that element of the array
number_that_occurred_twice = sum - 500500

public static void main(String[] args) {
int start = 1;
int end = 10;
int arr[] = {1, 2, 3, 4, 4, 5, 6, 7, 8, 9, 10};
System.out.println(findDuplicate(arr, start, end));
}
static int findDuplicate(int arr[], int start, int end) {
int sumAll = 0;
for(int i = start; i <= end; i++) {
sumAll += i;
}
System.out.println(sumAll);
int sumArrElem = 0;
for(int e : arr) {
sumArrElem += e;
}
System.out.println(sumArrElem);
return sumArrElem - sumAll;
}

No extra storage requirement (apart from loop variable).
int length = (sizeof array) / (sizeof array[0]);
for(int i = 1; i < length; i++) {
array[0] += array[i];
}
printf(
"Answer : %d\n",
( array[0] - ((length - 1) * length) / 2 ) // the values are 1..length-1, so subtract T(length-1)
);

Do arguments and callstacks count as auxiliary storage?
int sumRemaining(int* remaining, int count) {
if (!count) {
return 0;
}
return remaining[0] + sumRemaining(remaining + 1, count - 1);
}
printf("duplicate is %d", sumRemaining(array, 1001) - 500500);
Edit: tail call version
int sumRemaining(int* remaining, int count, int sumSoFar) {
if (!count) {
return sumSoFar;
}
return sumRemaining(remaining + 1, count - 1, sumSoFar + remaining[0]);
}
printf("duplicate is %d", sumRemaining(array, 1001, 0) - 500500);

public int duplicateNumber(int[] A) {
int count = 0;
for(int k = 0; k < A.Length; k++)
count += A[k];
return count - (A.Length * (A.Length - 1) >> 1);
}

A triangular number T(n) is the sum of the n natural numbers from 1 to n. It can be represented as n(n+1)/2. Thus, knowing that among the given 1001 natural numbers one and only one number is duplicated, you can sum all the given numbers and subtract T(1000). The result is the duplicate.
For a triangular number T(n), if n is a power of 10 (10 or larger), there is also a neat way of finding T(n) from its base-10 representation:
n = 1000
s = sum(GivenList)
r = str(n//2)
duplicate = s - int( r + r )

I support adding all the elements and then subtracting the sum of all the indices, but this won't work if the number of elements is very large, i.e. it will cause an integer overflow! So I have devised this algorithm, which may reduce the chances of an integer overflow to a large extent.
dup = 0
for i=0 to n-1
begin:
diff = a[i]-i;
dup = dup + diff;
end
// where dup is the duplicate element..
But by this method I won't be able to find out the index at which the duplicate element is present!
For that I need to traverse the array another time which is not desirable.

Improvement of Franci's answer based on the property of XORing consecutive values:
int result = xor_sum(N);
for (i = 0; i < N+1; i++)
{
result = result ^ array[i];
}
Where:
// Compute (((1 xor 2) xor 3) .. xor value)
int xor_sum(int value)
{
int modulo = value % 4;
if (modulo == 0)
return value;
else if (modulo == 1)
return 1;
else if (modulo == 2)
return value + 1;
else
return 0;
}
Or in pseudocode/math lang f(n) defined as (optimized):
if n mod 4 = 0 then X = n
if n mod 4 = 1 then X = 1
if n mod 4 = 2 then X = n+1
if n mod 4 = 3 then X = 0
And in canonical form f(n) is:
f(0) = 0
f(n) = f(n-1) xor n
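A compact, non-destructive sketch that uses the four cases directly. The names `xor_upto` and `find_duplicate_by_xor` are assumptions, and the array is taken to hold unsigned values 1..count-1 plus one repeat:

```c
#include <assert.h>

/* Closed form for 1 ^ 2 ^ ... ^ n, following the four cases listed above. */
unsigned xor_upto(unsigned n)
{
    switch (n % 4) {
    case 0:  return n;
    case 1:  return 1;
    case 2:  return n + 1;
    default: return 0;
    }
}

/* Repeated value in an array of count elements holding 1..count-1 plus one repeat. */
unsigned find_duplicate_by_xor(const unsigned *array, unsigned count)
{
    assert(count >= 2);
    unsigned acc = xor_upto(count - 1);   /* replaces the loop over 1..count-1 */
    for (unsigned i = 0; i < count; i++)
        acc ^= array[i];                  /* every value except the repeat cancels out */
    return acc;
}
```

The closed form can be checked against the recurrence f(n) = f(n-1) xor n by running both for a range of n and comparing the results.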

My answer to question 2:
Find the sum and product of the numbers from 1 to N, say SUM and PROD.
Find the sum and product of the numbers from 1 to N with x and y removed (assume x and y are missing), say mySum and myProd.
Thus:
SUM = mySum + x + y;
PROD = myProd* x*y;
Thus:
x*y = PROD/myProd; x+y = SUM - mySum;
We can find x and y by solving these two equations.
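A small C sketch of this sum-and-product idea. The function name and parameters are illustrative, and because the product of 1..N is kept in a 64-bit integer, the sketch is restricted to N <= 20:

```c
#include <assert.h>

/* Finds the two values x < y missing from an array that holds all other
   numbers of 1..n exactly once. Sets both to 0 if no solution exists. */
void find_two_missing(const int *array, int count, int n, int *x, int *y)
{
    assert(n >= 2 && n <= 20 && count == n - 2);   /* 20! still fits in 64 bits */
    unsigned long long sum = 0, prod = 1, my_sum = 0, my_prod = 1;
    for (int v = 1; v <= n; v++) {
        sum += (unsigned long long)v;
        prod *= (unsigned long long)v;
    }
    for (int i = 0; i < count; i++) {
        my_sum += (unsigned long long)array[i];
        my_prod *= (unsigned long long)array[i];
    }
    unsigned long long s = sum - my_sum;           /* x + y */
    unsigned long long p = prod / my_prod;         /* x * y */
    for (unsigned long long c = 1; 2 * c < s; c++) {
        if (c * (s - c) == p) {                    /* c and s - c satisfy both equations */
            *x = (int)c;
            *y = (int)(s - c);
            return;
        }
    }
    *x = *y = 0;                                   /* the input broke the assumptions */
}
```

The linear search stands in for solving the quadratic t^2 - (x+y)t + xy = 0; for larger N the product overflows, so a different pair of equations (for example sums of squares) would be needed.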

In the aux version, you first set all the values to -1 and, as you iterate, check whether you have already inserted the value into the aux array. If not (the value must then be -1), insert it. If it is already there, you have found the duplicate.
In the one without aux, you take an element from the list and check whether the rest of the list contains that value. If it does, you have found it.
private static int findDuplicated(int[] array) {
if (array == null || array.length < 2) {
System.out.println("invalid");
return -1;
}
int[] checker = new int[array.length];
Arrays.fill(checker, -1);
for (int i = 0; i < array.length; i++) {
int value = array[i];
int checked = checker[value];
if (checked == -1) {
checker[value] = value;
} else {
return value;
}
}
return -1;
}
private static int findDuplicatedWithoutAux(int[] array) {
if (array == null || array.length < 2) {
System.out.println("invalid");
return -1;
}
for (int i = 0; i < array.length; i++) {
int value = array[i];
for (int j = i + 1; j < array.length; j++) {
int toCompare = array[j];
if (value == toCompare) {
return array[i];
}
}
}
return -1;
}
