Very large loop counts in C

How can I run a loop in C for a very large count, e.g. 2^1000 times?
Also, by nesting two loops that run a and b times, we get a block that runs a*b times. Is there any smart method for running a loop a^b times?

You could loop recursively, e.g.
#include <stdio.h>

/* Runs the innermost action a^b times. */
void loop( unsigned a, unsigned b ) {
    unsigned int i;
    if ( b == 0 ) {
        printf( "." );
    } else {
        for ( i = 0; i < a; ++i ) {
            loop( a, b - 1 );
        }
    }
}
...will print a^b . characters.
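For example, a quick sanity check (a hypothetical driver, assuming the loop function above):
int main( void ) {
    loop( 2, 3 );   /* prints 2^3 = 8 dots */
    return 0;
}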

While I cannot answer your first question (although look into libgmp, which might help you work with large numbers), a way to perform an action a^b times would be to use recursion.
void function( unsigned a, unsigned b ) {
    unsigned i;
    if ( b == 0 ) {
        /* perform the action here; this point is reached a^b times */
        return;
    }
    for ( i = 0; i < a; ++i ) {
        function( a, b - 1 );
    }
}
This performs the loop a times at each level until b reaches 0.

Regarding your reply in one of the comments: "But if I have two lines of input and 2^n lines of trash between them, how do I skip past them?" Can you tell me a real-life scenario where you will see 2^1000 lines of trash that you have to monitor?
For a more reasonable (smaller) number of inputs, you may be able to solve what sounds to be your real need (i.e. handle only relevant lines of input), not by iterating an index, but rather by simply checking each line for the relevant component as it is processed in a while loop...
pseudo code:
BOOL criteriaMet = FALSE;
while (1)
{
    while (!criteriaMet)
    {
        // test next line of input
        // if criteria met, set criteriaMet = TRUE;
        // if criteria met, handle line of input
        // if EOF or similar, break out of loops
    }
    // criteria met, handle it here and continue
    criteriaMet = FALSE; // reset for more searching...
}

Use a b-sized array i[] where each cell holds values from 0 to a-1. For example, for 2^3 use a 3-element array of booleans.
On each iteration, increment i[0]. If i[0] reaches a, set i[0] to 0 and increment i[1]. If i[1] reaches a, set i[1] to 0 and increment i[2], and so on, until you increment a cell without reaching a. This can easily be done in a loop:
for (int j = 0; j < b; ++j) {
    ++i[j];
    if (i[j] < a) {
        break;      /* no carry needed; stop here */
    }
    i[j] = 0;       /* cell overflowed: reset it and carry into i[j+1] */
}
After a iterations, i[0] will return to zero. After a^2 iterations, i[0] and i[1] will both be zero. After a^b iterations, all cells will be 0 and you can exit the loop. You don't need to check the whole array each time - the moment you reset i[b-1] you know the entire array is back to zero.
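Putting the pieces together, a minimal sketch of the full a^b loop (the concrete a and b and the done flag are illustrative choices, not part of the answer above):
#include <stdio.h>

int main(void) {
    const unsigned a = 2, b = 3;   /* run the body a^b = 8 times */
    unsigned i[3] = {0};           /* b cells, each counting 0..a-1 */
    int done = 0;
    while (!done) {
        printf(".");               /* the body that runs a^b times */
        done = 1;                  /* assume the odometer wrapped around */
        for (unsigned j = 0; j < b; ++j) {
            if (++i[j] < a) {
                done = 0;          /* no carry needed; keep looping */
                break;
            }
            i[j] = 0;              /* reset this cell, carry into the next */
        }
    }
    return 0;
}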

Your question doesn't make sense. Even when your loop is empty you'd be hard pressed to do more than 2^32 iterations per second. Even in this best case scenario, processing 2^64 loop iterations which you can do with a simple uint64_t variable would take 136 years. This is when the loop does absolutely nothing.
Same thing goes for skipping lines as you later explained in the comments. Skipping or counting lines in text is a matter of counting newlines. In 2006 it was estimated that the world had around 10*2^64 bytes of storage. If we assume that all the data in the world is text (it isn't) and the average line is 10 characters including the newline (it probably isn't), you'd still fit the count of lines in all the data in the world in one uint64_t. Processing it would of course still take at least 136 years, even if your CPU's cache were fed straight from four 10 Gbps network interfaces (since it's inconceivable that your machine could have that much disk).
In other words, whatever problem you think you're solving is not a problem of looping more than a normal uint64_t in C can handle. The n in your 2^n can't reasonably be more than 50-55 on any hardware your code can be expected to run on.
So to answer your question: if looping a uint64_t is not enough for you, your best option is to wait at least 30 years until Moore's law has caught up with your problem and solve the problem then. It will go faster than trying to start running the program now. I'm sure we'll have a uint128_t at that time.

Related

Is it a bug in ReduceVocab() or am I missing something?

Here's a piece of code from word2vec.c, the word2vec source I've downloaded from Google:
// Reduces the vocabulary by removing infrequent tokens
void ReduceVocab() {
  int a, b = 0;
  unsigned int hash;
  for (a = 0; a < vocab_size; a++) if (vocab[a].cn > min_reduce) {
    vocab[b].cn = vocab[a].cn;
    vocab[b].word = vocab[a].word;
    b++;
  } else free(vocab[a].word);
  vocab_size = b;
  for (a = 0; a < vocab_hash_size; a++) vocab_hash[a] = -1;
  for (a = 0; a < vocab_size; a++) {
    // Hash will be re-computed, as it is not actual
    hash = GetWordHash(vocab[a].word);
    while (vocab_hash[hash] != -1) hash = (hash + 1) % vocab_hash_size;
    vocab_hash[hash] = a;
  }
  fflush(stdout);
  min_reduce++;
}
which is called in the LearnVocabFromTrainFile function.
Assume min_reduce = 5.
So if the input file is not that good, meaning a word, say "hello", has appeared only 4 times when ReduceVocab is called, then the vocab will remove "hello" from itself.
Later, when ReduceVocab is called again and "hello" has luckily appeared 5 times, it seems ReduceVocab will remove "hello" again.
In truth, "hello" appeared 9 times and should be in the vocab, but the code above removed it.
It doesn't matter that much, as the situation seems to happen seldom. I'm just wondering whether my analysis is right or whether I've missed something in the code.
Thanks for any advice.
A better URL for reviewing the relevant source is:
https://github.com/tmikolov/word2vec/blob/master/word2vec.c#L185
As I understand it, this is not a bug – just a compromise with non-intuitive effects.
This code uses an intentionally rough/approximate method of ensuring the number of tracked vocabulary terms never exceeds 0.7 * vocab_hash_size (21 million). Whenever the number of terms hits that high-water mark, all terms with fewer than min_reduce occurrences are discarded - & min_reduce is increased to take even more, next time.
(And in practice, this escalating-floor, along with the typical long-tail Zipfian distribution of word frequencies, can mean that at each triggered ReduceVocab operation, most terms are discarded, bringing the total vocab size to something that's way smaller than 0.7 * vocab_hash_size.)
An unavoidable effect of discarding known counts, in an interim running fashion, is that counts after each discard are no longer complete & exact. The relative position of terms in the corpus can thus have a big effect on which terms are ReduceVocab-pruned - with terms that "just miss" the cutoff each time potentially having far more occurrences, in total, than the final min_reduce. And further, all final counts of less-frequent words might be incomplete, if the term's early occurrence counts didn't survive earlier ReduceVocab steps.
Still, this approach works to keep the vocabulary-survey from taking an arbitrary amount of RAM, and the imprecision in the tail of rarer word counts isn't too big of a concern in typical cases.
If you have the RAM & want to prevent this behavior, you could edit the source to make vocab_hash_size arbitrarily larger, so that either ReduceVocab() is never triggered (and thus your final counts are exact), or happens rarely enough that any words it affects don't concern you.

How to solve a runtime error happening when I use a big size of static array

My development environment: Visual Studio.
I have to create an input file and print random numbers from 1 to 500000 into it, without duplicates. First, I considered that if I use a big local array, problems related to the heap may happen. So I declared it as a static array. Then, in the main function, I put random numbers into the array without overlap and wrote them to the input file by accessing the array elements. However, runtime errors (the continuous blinking of the cursor in the console window) continue to occur.
The source code is as follows.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SIZE 500000
int sort[500000];

int main()
{
    FILE* input = NULL;
    input = fopen("input.txt", "w");
    if (input != NULL)
    {
        srand((unsigned)time(NULL));
        for (int i = 0; i < SIZE; i++)
        {
            sort[i] = (rand() % SIZE) + 1;
            for (int j = 0; j < i; j++)
            {
                if (sort[i] == sort[j])
                {
                    i--;
                    break;
                }
            }
        }
        for (int i = 0; i < SIZE; i++)
        {
            fprintf(input, "%d ", sort[i]);
        }
        fclose(input);
    }
    return 0;
}
When I tried to reduce the array size from 1 to 5000, it has been implemented. So, Carefully, I think it's a memory out phenomenon. Finally, I'd appreciate it if you could comment on how to solve this problem.
“First, I considered that if I use a big local array, problems related to the heap may happen.”
That does not make any sense. Automatic local objects generally come from the stack, not the heap. (Also, “heap” is the wrong word; a heap is a particular kind of data structure, but the malloc family of routines may use other data structures for managing memory. This can be referred to simply as dynamically allocated memory or allocated memory.)
However, runtime errors (the continuous blinking of the cursor in the console window)…
Continuous blinking of the cursor is normal operation, not a run-time error. Perhaps you are trying to say your program continues executing without ever stopping.
#define SIZE 500000
...
sort[i] = (rand() % SIZE) + 1;
The C standard only requires rand to generate numbers from 0 to 32767. Some implementations may provide more. However, if your implementation does not generate numbers up to 499,999, then it will never generate the numbers required to fill the array using this method.
Also, using % to reduce the rand result skews the distribution. For example, if we were reducing modulo 30,000, and rand generated numbers from 0 to 44,999, then rand() % 30000 would generate the numbers from 0 to 14,999 each two times out of every 45,000 and the numbers from 15,000 to 29,999 each one time out of every 45,000.
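One common workaround for the skew, sketched here under the assumption that 0 < n <= RAND_MAX, is rejection sampling: discard results from the uneven top slice of rand's range so that every residue is equally likely. (This does not fix the limited range of rand itself; if RAND_MAX is 32767 you would still need to combine several rand() calls.)
#include <stdlib.h>

/* Uniform integer in [0, n-1], assuming 0 < n <= RAND_MAX.
   Rejecting r >= limit may waste a few calls, but it keeps the
   distribution exactly uniform, since limit is a multiple of n. */
int uniform_rand(int n) {
    int limit = RAND_MAX - (RAND_MAX % n);
    int r;
    do {
        r = rand();
    } while (r >= limit);
    return r % n;
}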
for (int j = 0; j < i; j++)
So this algorithm attempts to find new numbers by rejecting those that duplicate previous numbers. When working on the last of n numbers, the average number of tries is n, if the selection of random numbers is uniform. When working on the second-to-last number, the average is n/2. When working on the third-to-last, the average is n/3. So the average number of tries for all the numbers is n + n/2 + n/3 + n/4 + n/5 + … + 1.
For 5000 elements, this sum is around 45,472.5. For 500,000 elements, it is around 6,849,790. So your program will average around 150 times as many tries with 500,000 elements as with 5,000. However, each try also takes longer: For the first try, you check against zero prior elements for duplicates. For the second, you check against one prior element. For try n, you check against n−1 elements. So, for the last of 500,000 elements, you check against 499,999 elements, and, on average, you have to repeat this 500,000 times. So the last selection takes around 500,000•499,999 = 249,999,500,000 units of work.
Refining this estimate, for each selection i, a successful attempt that gets completely through the loop of checking requires checking against all i−1 prior numbers. An unsuccessful attempt will average going halfway through the prior numbers. So, for selection i, there is one successful check of i−1 numbers and, on average, n/(n+1−i) unsuccessful checks of an average of (i−1)/2 numbers.
For 5,000 numbers, the average number of checks will be around 107,455,347. For 500,000 numbers, the average will be around 1,649,951,055,183. Thus, your program with 500,000 numbers takes more than 15,000 times as long as with 5,000 numbers.
When I tried to reduce the array size from 1 to 5000, it has been implemented.
I think you mean that with an array size of 5,000, the program completes execution in a short amount of time?
So, Carefully, I think it's a memory out phenomenon.
No, there is no memory issue here. Modern general-purpose computer systems easily handle static arrays of 500,000 int.
Finally, I'd appreciate it if you could comment on how to solve this problem.
Use a Fisher-Yates shuffle: Fill the array A with the integers from 1 to SIZE. Set a counter, say d, to the number of selections completed so far, initially zero. Then pick a random number r from 1 to SIZE-d. Move the number in that position of the array to the front by swapping A[r] with A[d]. Then increment d. Repeat until d reaches SIZE-1.
This will swap a random element of the initial array into A[0], then a random element from those remaining into A[1], then a random element from those remaining into A[2], and so on. (We stop when d reaches SIZE-1 rather than when it reaches SIZE because, once d reaches SIZE-1, there is only one more selection to make, but there is also only one number left, and it is already in the last position in the array.)
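A minimal C sketch of that shuffle (0-indexed; it writes to stdout rather than a file, and uses rand() % k for brevity despite the range and bias caveats discussed above):
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SIZE 500000
static int A[SIZE];

int main(void) {
    srand((unsigned)time(NULL));
    for (int i = 0; i < SIZE; i++)
        A[i] = i + 1;                     /* fill with 1..SIZE in order */
    for (int d = 0; d < SIZE - 1; d++) {  /* Fisher-Yates shuffle */
        int r = d + rand() % (SIZE - d);  /* pick from the untouched tail */
        int tmp = A[d]; A[d] = A[r]; A[r] = tmp;
    }
    for (int i = 0; i < SIZE; i++)
        printf("%d ", A[i]);
    return 0;
}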

Looping through all character combinations with increasing number of elements

What I want to achieve:
I have a function where I want to loop through all possible combinations of printable ascii-characters, starting with a single character, then two characters, then three etc.
The part that makes this difficult for me is that I want this to work for as many characters as I can (leave it overnight).
For the record: I know that abc really is 97 98 99, so a numeric representation is fine if that's easier.
This works for a few characters:
I could create a list of all possible combinations for n characters, and just loop through it, but that would require a huge amount of memory already when n = 4. This approach is literally impossible for n > 5 (at least on a normal desktop computer).
In the script below, all I do is increment a counter for each combination. My real function does more advanced stuff.
If I had unlimited memory I could do (thanks to Luis Mendo):
counter = 0;
some_function = @(x) 1;
number_of_characters = 1;
max_time = 60;
max_number_of_characters = 8;
tic;
while toc < max_time && number_of_characters < max_number_of_characters
    number_of_characters = number_of_characters + 1;
    vectors = [repmat({' ':'~'}, 1, number_of_characters)];
    n = numel(vectors);
    combs = cell(1, n);
    [combs{end:-1:1}] = ndgrid(vectors{end:-1:1});
    combs = cat(n+1, combs{:});
    combs = reshape(combs, [], n);
    for ii = 1:size(combs, 1)
        counter = counter + some_function(combs(ii, :));
    end
end
Now, I want to loop through as many combinations as possible in a certain amount of time, 5 seconds, 10 seconds, 2 minutes, 30 minutes, so I'm hoping to create a function that's only limited by the available time, and uses only some reasonable amount of memory.
Attempts I've made (and failed at) for more characters:
I've considered pre-computing the combinations for two or three letters using one of the approaches above, and use a loop only for the last characters. This would not require much memory, since it's only one (relatively small) array, plus one or more additional characters that gets looped through.
I manage to scale this up to 4 characters, but beyond that I start getting into trouble.
I've tried to use an iterator that just counts upwards. Every time I hit any(mod(number_of_ascii .^ 1:n, iterator) == 0) I increment the m'th character by one. So, the last character just repeats the cycle !"# ... ~, and every time it hits tilde, the second character increments. Every time the second character hits tilde, the third character increments etc.
Do you have any suggestions for how I can solve this?
It looks like you're basically trying to count in base-26 (or base-52 if you need caps). Each number in that base corresponds to a specific string of characters. For example,
0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,10,11,12,...
Here, capital A through P are just symbols used to represent digits in the base-26 system. The sequence above simply represents this string of characters:
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,ba,bb,bc,...
Then, you can simply do this:
symbols = ['0','1','2','3','4','5','6','7','8','9','A','B','C','D','E',...
           'F','G','H','I','J','K','L','M','N','O','P'];
characters = ['a','b','c','d','e','f','g','h','i','j','k','l',...
              'm','n','o','p','q','r','s','t','u','v','w','x','y','z'];
count = 0;
while (true)
    str_base26 = dec2base(count, 26);
    actual_str = % char-by-char lookup of str_base26 into the character string
    count = count + 1;
end
Of course, this does not represent strings that begin with leading zeros (i.e. leading a's). But that should be pretty simple to handle.
You were not far off with your idea of an iterator that just counts upward.
What you need with this idea is a map from the integers to ASCII characters. As StewieGriffin suggested, you'd just need to work in base 95 (94 characters plus whitespace).
Why whitespace? You need something that will be mapped to 0 and be equivalent to it; whitespace is the perfect candidate. You'd then just skip the strings containing any whitespace. If you don't do that and start directly at !, you won't be able to represent strings like !! or !ab.
First, let's define a function that maps (1:1) integers to strings:
function [outstring, toskip] = dec2ASCII(m)
    out = [];
    while m ~= 0
        out = [mod(m, 95) out];
        m = (m - out(1)) / 95;
    end
    if any(out == 0)
        toskip = 1;
    else
        toskip = 0;
    end
    outstring = char(out + 32);
end
And then in your main script :
counter = 1;
some_function = @(x) 1;
max_time = 60;
max_number_of_characters = 8;
currString = '';
tic;
while numel(currString) <= max_number_of_characters && toc < max_time
    [currString, toskip] = dec2ASCII(counter);
    if ~toskip
        some_function(currString);
    end
    counter = counter + 1;
end
Some sample outputs of the dec2ASCII function:
dec2ASCII(47)
ans =
O
dec2ASCII(145273)
ans =
0)2
In terms of performance I can't really elaborate as I don't know what you want to do with your some_function. The only thing I can say is that the running time of dec2ASCII is around 2*10^(-5) s
Side note: iterating like this will be very limited in terms of speed. With some_function doing nothing, you'd just be able to cycle through 4 characters in around 40 minutes, and 5 characters would already take up to 64 hours. Maybe you'd want to reduce the amount of work you pass through the function you iterate on.
This code, though, is easily parallelizable, so if you want to check more combinations, I'd suggest trying to do it in a parallel manner.

Best (fastest) way to find the number most frequently entered in C?

Well, I think the title basically explains my doubt. I will have n numbers to read; these n numbers go from 1 to x, where x is at most 10⁵. What is the fastest (least possible running time) way to find out which number was inserted the most times? Know that the number that appears the most times appears more than half of the time.
What I've tried so far:
// for (1 <= x <= 10⁵)
int v[100000+1];
// multiple instances; input ends when n = 0
while (scanf("%d", &n) && n > 0) {
    zerofill(v);
    for (i = 0; i < n; i++) {
        scanf("%d", &x);
        v[x]++;
        if (v[x] > n/2)
            i = n;
    }
    printf("%d\n", x);
}
Zero-filling an array of x positions, incrementing position v[x], and at the same time verifying whether v[x] is greater than n/2, is not fast enough.
Any idea might help, thank you.
Observation: no need to care about the amount of memory used.
The trivial solution of keeping a counter array is O(n) and you obviously can't get better than that. The fight is then about the constants and this is where a lot of details will play the game, including exactly what are the values of n and x, what kind of processor, what kind of architecture and so on.
On the other side, this really seems like the "knockout" problem, but that algorithm needs two passes over the data and an extra conditional, so in practical terms, on the computers I know, it will most probably be slower than the array-of-counters solution for a lot of n and x values.
The good point of the knockout solution is that you don't need to put a limit x on the values and you don't need any extra memory.
If you already know that there is a value with an absolute majority (and you simply need to find which value it is), then this could do it (though there are two conditionals in the inner loop):
initialize count = 0
loop over all elements
    if count is 0 then set champion = element and count = 1
    else if element != champion decrement count
    else increment count
at the end of the loop your champion will be the value with the absolute majority of elements, if such a value is present.
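In C, that single pass (widely known as the Boyer-Moore majority vote) might look like this minimal sketch:
/* Returns the majority candidate; the result is only meaningful if
   some value really occurs more than half the time, as guaranteed. */
int majority(const int *a, int n) {
    int champion = 0, count = 0;
    for (int i = 0; i < n; i++) {
        if (count == 0) { champion = a[i]; count = 1; }
        else if (a[i] != champion) count--;
        else count++;
    }
    return champion;
}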
But as said before I'd expect a trivial
/* count[] is a zero-filled array indexed by value; half = n/2 */
for (int i = 0, n = size; i < n; i++) {
    if (++count[x[i]] > half) return x[i];
}
to be faster.
EDIT
After your edit, it seems you're really looking for the knockout algorithm; but caring about speed, that's probably still the wrong question on modern computers (100,000 elements is nothing, even for a nail-sized single chip today).
I think you can create a max heap of the counts of the numbers you read, and use heap sort to find the count that is greater than n/2.

Finding an element in an array where every element is repeated odd number of times (but more than single occurrence) and only one appears once

You have an array in which every number is repeated an odd number of times (but more than a single occurrence). Exactly one number appears once. How do you find the number that appears only once?
e.g.: {1, 6, 3, 1, 1, 6, 6, 9, 3, 3, 3, 3}
The answer is 9.
I was thinking about having a hash table and then just counting the element whose count is 1.
This seems trivial, and I am not using the fact that every other element is repeated an odd number of times. Is there a better approach?
I believe you can still use the basic idea of XOR to solve this problem in a clever fashion.
First, let's change the problem so that one number appears once and all other numbers appear three times.
Algorithm:
Here A is the array of length n:
int ones = 0;
int twos = 0;
int not_threes, x;
for (int i = 0; i < n; ++i) {
    x = A[i];
    twos |= ones & x;
    ones ^= x;
    not_threes = ~(ones & twos);
    ones &= not_threes;
    twos &= not_threes;
}
And the element that occurs precisely once is stored in ones. This uses O(n) time and O(1) space.
I believe I can extend this idea to the general case of the problem, but possibly one of you can do it faster, so I'll leave this for now and edit it when and if I can generalize the solution.
Explanation:
If the problem were this: "one element appears once, all others an even number of times", then the solution would be to XOR the elements. The reason is that x^x = 0, so all the paired elements would vanish leaving only the lonely element. If we tried the same tactic here, we would be left with the XOR of distinct elements, which is not what we want.
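For reference, a minimal sketch of that simpler even-repeats case (plain XOR, not the ones/twos algorithm above):
/* All values appear an even number of times except one;
   paired values cancel because x ^ x == 0 and 0 ^ x == x. */
int find_single(const int *A, int n) {
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc ^= A[i];
    return acc;
}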
Instead, the algorithm above does the following:
ones is the XOR of all elements that have appeared exactly once so far
twos is the XOR of all elements that have appeared exactly twice so far
Each time we take x to be the next element in the array, there are three cases:
if this is the first time x has appeared, it is XORed into ones
if this is the second time x has appeared, it is taken out of ones (by XORing it again) and XORed into twos
if this is the third time x has appeared, it is taken out of ones and twos.
Therefore, in the end, ones will be the XOR of just one element, the lonely element that is not repeated. There are 5 lines of code that we need to look at to see why this works: the five after x = A[i].
If this is the first time x has appeared, then ones&x=ones so twos remains unchanged. The line ones ^= x; XORs x with ones as claimed. Therefore x is in exactly one of ones and twos, so nothing happens in the last three lines to either ones or twos.
If this is the second time x has appeared, then ones already has x (by the explanation above), so now twos gets it with the line twos |= ones & x;. Also, since ones has x, the line ones ^= x; removes x from ones (because x^x=0). Once again, the last three lines do nothing since exactly one of ones and twos now has x.
If this is the third time x has appeared, then ones does not have x but twos does. So the first line lets twos keep x and the second adds x to ones. Now, since both ones and twos have x, the last three lines remove x from both.
Generalization:
If some numbers appear 5 times, then this algorithm still works. This is because the 4th time x appears, it is in neither ones nor twos. The first two lines then add x to ones but not twos, and the last three lines do nothing. The 5th time x appears, it is in ones but not twos. The first line adds it to twos, the second removes it from ones, and the last three lines do nothing.
The problem is that the 6th time x appears, it is taken from ones and twos, so it gets added back to ones on the 7th pass. I'm trying to think of a clever way to prevent this, but so far I'm coming up empty.
For the problem as stated it is most likely that the most efficient answer is the O(n) space answer. On the other hand, if we narrow the problem to be "All numbers appear n times except for one which only appears once" or even "All numbers appear a multiple of n times except for one which only appears once" then there's a fairly straightforward solution for any n (greater than 1, obviously) which takes only O(1) space, which is to break each number into bits and then count how many times each bit is turned on and take that modulo n. If the result is 1, then it should be turned on in the answer. If it is 0, then it should be turned off. (Any other answer shows that the parameters of the problem did not hold). If we examine the situation where n is 2, we can see that using XOR does exactly this (bitwise addition modulo 2). We're just generalizing things to do bitwise addition modulo n for other values of n.
This, by the way, is what the other answer for n=3 does, it's actually just a complex way of doing bit-wise addition where it stores a 2-bit count for each bit. The int called "ones" contains the ones bit of the count and the int called "twos" contains the twos bit of the count. The int not_threes is used to set both bits back to zero when the count reaches 3, thus counting the bits modulo 3 rather than normally (which would be modulo 4 since the bits would wrap around). The easiest way to understand his code is as a 2-bit accumulator with an extra part to make it work modulo 3.
So, for the case of all numbers appearing a multiple of 3 times except the one unique number, we can write the following code for 32 bit integers:
int findUnique(int A[], int size) {
    // First we set up a bit vector and initialize it to 0.
    int count[32];
    for (int j = 0; j < 32; j++) {
        count[j] = 0;
    }
    // Then we go through each number in the list.
    for (int i = 0; i < size; i++) {
        int x = A[i];
        // And for each number we go through its bits one by one.
        for (int j = 0; j < 32; j++) {
            // We add the bit to the total.
            count[j] += x & 1;
            // And then we take it modulo 3.
            count[j] %= 3;
            x >>= 1;
        }
    }
    // Then we just have to reassemble the answer by putting together any
    // bits which didn't appear a multiple of 3 times.
    int answer = 0;
    for (int j = 31; j >= 0; j--) {
        answer <<= 1;
        if (count[j] == 1) {
            answer |= 1;
        }
    }
    return answer;
}
This code is slightly longer than the other answer (and superficially looks more complex due to the additional loops, but they're each constant time), but is hopefully easier to understand. Obviously, we could decrease the memory space by packing the bits more densely since we never use more than two of them for any number in count. But I haven't bothered to do that since it has no effect on the asymptotic complexity.
If we wish to change the parameters of the problem so that instead the numbers are repeated 5 times, we just change the 3s to 5s. Or we can do likewise for 7, 11, 137, 727, or any other number (including even numbers). But instead of using the actual number, we can use any factor of it, so for 9, we could just leave it as 3, and for even numbers we can just use 2 (and hence just use xor).
However, there is no general bit-counting based solution for the original problem where a number can be repeated any odd number of times. This is because even if we count the bits exactly without using modulo, when we look at a particular bit, we simply can't know whether the 9 times it appears represents 3 + 3 + 3 or 1 + 3 + 5. If it was turned on in three different numbers which each appeared three times, then it should be turned off in our answer. If it was turned on in a number which appeared once, a number which appeared three times, and a number which appeared five times, then it should be turned on in our answer. But with just the count of the bits, it's impossible for us to know this.
This is why the other answer doesn't generalize and the clever idea to handle the special cases is not going to materialize: any scheme based on looking at things bit by bit to figure out which bits should be turned on or off does not generalize. Given this, I don't think that any scheme which takes space O(1) works for the general case. It is possible that there are clever schemes which use O(lg n) space or so forth, but I would doubt it. I think that the O(n) space approach is probably the best which can be done in the problem as proposed. I can't prove it, but at this point, it's what my gut tells me and I hope that I've at least convinced you that small tweaks to the "even number" technique are not going to cut it.
I know that the subtext of this question is to find an efficient or performant solution, but I think that the simplest, readable code counts for a lot and in most cases it is more than sufficient.
So how about this:
var value = (new [] { 1, 6, 3, 1, 1, 6, 6, 9, 3, 3, 3, 3, })
    .ToLookup(x => x)
    .Where(xs => xs.Count() == 1)
    .First()
    .Key;
Good old LINQ. :-)
Test score 100% with C#
using System;
using System.Collections.Generic;
// you can also use other imports, for example:
// using System.Collections.Generic;
// you can write to stdout for debugging purposes, e.g.
// Console.WriteLine("this is a debug message");

class Solution {
    public int solution(int[] A) {
        // Count how many times each value occurs.
        Dictionary<int, int> dic = new Dictionary<int, int>();
        foreach (int i in A)
        {
            if (dic.ContainsKey(i))
            {
                dic[i] = dic[i] + 1;
            }
            else
            {
                dic.Add(i, 1);
            }
        }
        // Return the first value with an odd count. Note: this targets the
        // variant where every other value appears an even number of times.
        foreach (var d in dic)
        {
            if (d.Value % 2 == 1)
            {
                return d.Key;
            }
        }
        return -1;
    }
}
Java, Correctness 100%, Performance 100%, Task score 100%
// you can also use imports, for example:
// import java.util.*;
// you can write to stdout for debugging purposes, e.g.
// System.out.println("this is a debug message");
import java.util.HashMap;

class Solution {
    /* Simple solution.
       Will be using HashMap (for performance) as an array;
       only the key set is needed.
       Initially tried with ArrayList, but there was a performance issue
       with that, so switched to HashMap.
       Iterate over the given array: if an item is already in the key set,
       remove it (because you found its pair); otherwise add it as a new key.
       After the full iteration only one key will be left in the map; that
       is your solution.
       In short: if a pair is found, remove it from the map's key set;
       the last key will be your solution.
       (Note: this pairing assumes every other value appears an even
       number of times.)
    */
    public int solution(int[] A) {
        // Map, but used as a set of keys
        final HashMap<Integer, Boolean> mapHave = new HashMap<>();
        // Iterate over the given array
        for (int nIdx = 0; nIdx < A.length; nIdx++) {
            // Current item
            Integer nVal = A[nIdx];
            // Try to remove the current item; if it does not exist, remove
            // will return null and the if condition will be true
            if (mapHave.remove(nVal) == null) {
                // current item not found, add it
                mapHave.put(nVal, Boolean.TRUE);
            }
        }
        // There will be only one key remaining in the map; that is
        // your solution
        return mapHave.keySet().iterator().next();
    }
}
