XOR problem solvable with 2x2x1 neural network without bias?

XOR problem solvable with 2x2x1 neural network without bias? - artificial-intelligence

Is a Neural network with 2 input nodes, 2 hidden nodes and an output supposed to be able to solve the XOR problem provided there is no bias? Or can it get stuck?

Leave the bias in. It doesn't see the values of your inputs.
In terms of a one-to-one analogy, I like to think of the bias as the offsetting c-value in the straight line equation: y = mx + c; it adds an independent degree of freedom to your system that is not influenced by the inputs to your network.

If I remember correctly it's not possible to have XOR without a bias.

I have built a neural network without bias and a 2x2x1 architecture solves XOR in 280 epochs.
Am new to this, so didn't know either way, but it works, so it is possible.
Regards,

Yes, you can if you use an activation function like Relu (f(x) =max(0,x))
Example of weights of such network are:
Layer1: [[-1, 1], [1, -1]]
Layer2: [[1], [1]]
For the first (hidden) layer:
If the input is [0,0], both nodes will have an activation of 0: ReLU(-1*0 + 1*0) = 0, ReLU(1*0 + -1*0) = 0
If the input is [1,0], one node will have activation of 0 ReLU(-1*1 + 1*0) = 0 and the other activation of 1 ReLU(1*1 + -1*0) = 1
If the input is [0,1], one node will have activation of 1 ReLu(-1*0 + 1*1) = 1 and the other activation of 0 ReLU(1*0 + -1*1) = 0
If the input is [1,1], both nodes will have an activation of 0: ReLU(-1*1 + 1*1 = 0) = 0, ReLU(1*1 + -1*1 = 0) = 0
For the second (output) layer:
Since the weights are [[1], [1]] (and there can be no negative activations from previous layer due to ReLU), the layer simply acts as a summation of activations in layer 1
If the input is [0,0], the sum of activations in the previous layer is 0
If the input is [1,0], the sum of activations in the previous layer is 1
If the input is [0,1], the sum of activations in the previous layer is 1
If the input is [1,1], the sum of activations in the previous layer is 0
While this method coincidentally works in the example above, it is limited to using zero (0) label for False examples of the XOR problem. If, for example, we used ones for False examples and twos for True examples, this approach would not work anymore.

Related

Daily Coding Problem 260 : Reconstruct a jumbled array - Intuition?

I'm going through the question below.
The sequence [0, 1, ..., N] has been jumbled, and the only clue you have for its order is an array representing whether each number is larger or smaller than the last. Given this information, reconstruct an array that is consistent with it.
For example, given [None, +, +, -, +], you could return [1, 2, 3, 0, 4].
I went through the solution on this post but still unable to understand it as to why this solution works. I don't think I would be able to come up with the solution if I had this in front of me during an interview. Can anyone explain the intuition behind it? Thanks in advance!

This answer tries to give a general strategy to find an algorithm to tackle this type of problems. It is not trying to prove why the given solution is correct, but lying out a route towards such a solution.
A tried and tested way to tackle this kind of problem (actually a wide range of problems), is to start with small examples and work your way up. This works for puzzles, but even so for problems encountered in reality.
First, note that the question is formulated deliberately to not point you in the right direction too easily. It makes you think there is some magic involved. How can you reconstruct a list of N numbers given only the list of plusses and minuses?
Well, you can't. For 10 numbers, there are 10! = 3628800 possible permutations. And there are only 2⁹ = 512 possible lists of signs. It's a very huge difference. Most original lists will be completely different after reconstruction.
Here's an overview of how to approach the problem:
Start with very simple examples
Try to work your way up, adding a bit of complexity
If you see something that seems a dead end, try increasing complexity in another way; don't spend too much time with situations where you don't see progress
While exploring alternatives, revisit old dead ends, as you might have gained new insights
Try whether recursion could work:
given a solution for N, can we easily construct a solution for N+1?
or even better: given a solution for N, can we easily construct a solution for 2N?
Given a recursive solution, can it be converted to an iterative solution?
Does the algorithm do some repetitive work that can be postponed to the end?
....
So, let's start simple (writing 0 for the None at the start):
very short lists are easy to guess:
'0++' → 0 1 2 → clearly only one solution
'0--' → 2 1 0 → only one solution
'0-+' → 1 0 2 or 2 0 1 → hey, there is no unique outcome, though the question only asks for one of the possible outcomes
lists with only plusses:
'0++++++' → 0 1 2 3 4 5 6 → only possibility
lists with only minuses:
'0-------'→ 7 6 5 4 3 2 1 0 → only possibility
lists with one minus, the rest plusses:
'0-++++' → 1 0 2 3 4 5 or 5 0 1 2 3 4 or ...
'0+-+++' → 0 2 1 3 4 5 or 5 0 1 2 3 4 or ...
→ no very obvious pattern seem to emerge
maybe some recursion could help?
given a solution for N, appending one sign more?
appending a plus is easy: just repeat the solution and append the largest plus 1
appending a minus, after some thought: increase all the numbers by 1 and append a zero
→ hey, we have a working solution, but maybe not the most efficient one
the algorithm just appends to an existing list, no need to really write it recursively (although the idea is expressed recursively)
appending a plus can be improved, by storing the largest number in a variable so it doesn't need to be searched at every step; no further improvements seem necessary
appending a minus is more troublesome: the list needs to be traversed with each append
what if instead of appending a zero, we append -1, and do the adding at the end?
this clearly works when there is only one minus
when two minus signs are encountered, the first time append -1, the second time -2
→ hey, this works for any number of minuses encountered, just store its counter in a variable and sum with it at the end of the algorithm
This is in bird's eye view one possible route towards coming up with a solution. Many routes lead to Rome. Introducing negative numbers might seem tricky, but it is a logical conclusion after contemplating the recursive algorithm for a while.

It works because all changes are sequential, either adding one or subtracting one, starting both the increasing and the decreasing sequences from the same place. That guarantees we have a sequential list overall. For example, given the arbitrary
[None, +, -, +, +, -]
turned vertically for convenience, we can see
None 0
+ 1
- -1
+ 2
+ 3
- -2
Now just shift them up by two (to account for -2):
2 3 1 4 5 0
+ - + + -

Let's look at first to a solution which (I think) is easier to understand, formalize and demonstrate for correctness (but I will only explain it and not demonstrate in a formal way):
We name A[0..N] our input array (where A[k] is None if k = 0 and is + or - otherwise) and B[0..N] our output array (where B[k] is in the range [0, N] and all values are unique)
At first we see that our problem (find B such that B[k] > B[k-1] if A[k] == + and B[k] < B[k-1] if A[k] == -) is only a special case of another problem:
Find B such that B[k] == max(B[0..k]) if A[k] == + and B[k] == min(B[0..k]) if A[k] == -.
Which generalize from "A value must larger or smaller than the last" to "A value must be larger or smaller than everyone before it"
So a solution to this problem is a solution to the original one as well.
Now how do we approach this problem?
A greedy solution will be sufficient, indeed is easy to demonstrate that the value associated with the last + will be the biggest number in absolute (which is N), the one associated with the second last + will be the second biggest number in absolute (which is N-1) ecc...
And in the same time the value associated with the last - will be the smallest number in absolute (which is 0), the one associated with the second last - will be the second smallest (which is 1) ecc...
So we can start filling B from right to left remembering how many + we have seen (let's call this value X), how many - we have seen (let's call this value Y) and looking at what is the current symbol, if it is a + in B we put N-X and we increase X by 1 and if it is a - in B we put 0+Y and we increase Y by 1.
In the end we'll need to fill B[0] with the only remaining value which is equal to Y+1 and to N-X-1.
An interesting property of this solution is that if we look to only the values associated with a - they will be all the values from 0 to Y (where in this case Y is the total number of -) sorted in reverse order; if we look to only the values associated with a + they will be all the values from N-X to N (where in this case X is the total number of +) sorted and if we look at B[0] it will always be Y+1 and N-X-1 (which are equal).
So the - will have all the values strictly smaller than B[0] and reverse sorted and the + will have all the values strictly bigger than B[0] and sorted.
This property is the key to understand why the solution proposed here works:
It consider B[0] equals to 0 and than it fills B following the property, this isn't a solution because the values are not in the range [0, N], but it is possible with a simple translation to move the range and arriving to [0, N]

The idea is to produce a permutation of [0,1...N] which will follow the pattern of [+,-...]. There are many permutations which will be applicable, it isn't a single one. For instance, look the the example provided:
[None, +, +, -, +], you could return [1, 2, 3, 0, 4].
But you also could have returned other solutions, just as valid: [2,3,4,0,1], [0,3,4,1,2] are also solutions. The only concern is that you need to have the first number having at least two numbers above it for positions [1],[2], and leave one number in the end which is lower then the one before and after it.
So the question isn't finding the one and only pattern which is scrambled, but to produce any permutation which will work with these rules.
This algorithm answers two questions for the next member of the list: get a number who’s both higher/lower from previous - and get a number who hasn’t been used yet. It takes a starting point number and essentially create two lists: an ascending list for the ‘+’ and a descending list for the ‘-‘. This way we guarantee that the next member is higher/lower than the previous one (because it’s in fact higher/lower than all previous members, a stricter condition than the one required) and for the same reason we know this number wasn’t used before.
So the intuition of the referenced algorithm is to start with a referenced number and work your way through. Let's assume we start from 0. The first place we put 0+1, which is 1. we keep 0 as our lowest, 1 as the highest.
l[0] h[1] list[1]
the next symbol is '+' so we take the highest number and raise it by one to 2, and update both the list with a new member and the highest number.
l[0] h[2] list [1,2]
The next symbol is '+' again, and so:
l[0] h[3] list [1,2,3]
The next symbol is '-' and so we have to put in our 0. Note that if the next symbol will be - we will have to stop, since we have no lower to produce.
l[0] h[3] list [1,2,3,0]
Luckily for us, we've chosen well and the last symbol is '+', so we can put our 4 and call is a day.
l[0] h[4] list [1,2,3,0,4]
This is not necessarily the smartest solution, as it can never know if the original number will solve the sequence, and always progresses by 1. That means that for some patterns [+,-...] it will not be able to find a solution. But for the pattern provided it works well with 0 as the initial starting point. If we chose the number 1 is would also work and produce [2,3,4,0,1], but for 2 and above it will fail. It will never produce the solution [0,3,4,1,2].
I hope this helps understanding the approach.

This is not an explanation for the question put forward by OP.
Just want to share a possible approach.
Given: N = 7
Index: 0 1 2 3 4 5 6 7
Pattern: X + - + - + - + //X = None
Go from 0 to N
[1] fill all '-' starting from right going left.
Index: 0 1 2 3 4 5 6 7
Pattern: X + - + - + - + //X = None
Answer: 2 1 0
[2] fill all the vacant places i.e [X & +] starting from left going right.
Index: 0 1 2 3 4 5 6 7
Pattern: X + - + - + - + //X = None
Answer: 3 4 5 6 7
Final:
Pattern: X + - + - + - + //X = None
Answer: 3 4 2 5 1 6 0 7

My answer definitely is too late for your problem but if you need a simple proof, you probably would like to read it:
+min_last or min_so_far is a decreasing value starting from 0.
+max_last or max_so_far is an increasing value starting from 0.
In the input, each value is either "+" or "-" and for each increase the value of max_so_far or decrease the value of min_so_far by one respectively, excluding the first one which is None. So, abs(min_so_far, max_so_far) is exactly equal to N, right? But because you need the range [0, n] but max_so_far and min_so_far now are equal to the number of "+"s and "-"s with the intersection part with the range [0, n] being [0, max_so_far], what you need to do is to pad it the value equal to min_so_far for the final solution (because min_so_far <= 0 so you need to take each value of the current answer to subtract by min_so_far or add by abs(min_so_far)).

What is the advantage of linspace over the colon ":" operator?

Is there some advantage of writing
t = linspace(0,20,21)
over
t = 0:1:20
?
I understand the former produces a vector, as the first does.
Can anyone state me some situation where linspace is useful over t = 0:1:20?

It's not just the usability. Though the documentation says:
The linspace function generates linearly spaced vectors. It is
similar to the colon operator :, but gives direct control over the
number of points.
it is the same, the main difference and advantage of linspace is that it generates a vector of integers with the desired length (or default 100) and scales it afterwards to the desired range. The : colon creates the vector directly by increments.
Imagine you need to define bin edges for a histogram. And especially you need the certain bin edge 0.35 to be exactly on it's right place:
edges = [0.05:0.10:.55];
X = edges == 0.35
edges = 0.0500 0.1500 0.2500 0.3500 0.4500 0.5500
X = 0 0 0 0 0 0
does not define the right bin edge, but:
edges = linspace(0.05,0.55,6); %// 6 = (0.55-0.05)/0.1+1
X = edges == 0.35
edges = 0.0500 0.1500 0.2500 0.3500 0.4500 0.5500
X = 0 0 0 1 0 0
does.
Well, it's basically a floating point issue. Which can be avoided by linspace, as a single division of an integer is not that delicate, like the cumulative sum of floting point numbers. But as Mark Dickinson pointed out in the comments:
You shouldn't rely on any of the computed values being exactly what you expect. That is not what linspace is for. In my opinion it's a matter of how likely you will get floating point issues and how much you can reduce the probabilty for them or how small can you set the tolerances. Using linspace can reduce the probability of occurance of these issues, it's not a security.
That's the code of linspace:
n1 = n-1
c = (d2 - d1).*(n1-1) % opposite signs may cause overflow
if isinf(c)
y = d1 + (d2/n1).*(0:n1) - (d1/n1).*(0:n1)
else
y = d1 + (0:n1).*(d2 - d1)/n1
end
To sum up: linspace and colon are reliable at doing different tasks. linspace tries to ensure (as the name suggests) linear spacing, whereas colon tries to ensure symmetry
In your special case, as you create a vector of integers, there is no advantage of linspace (apart from usability), but when it comes to floating point delicate tasks, there may is.
The answer of Sam Roberts provides some additional information and clarifies further things, including some statements of MathWorks regarding the colon operator.

linspace and the colon operator do different things.
linspace creates a vector of integers of the specified length, and then scales it down to the specified interval with a division. In this way it ensures that the output vector is as linearly spaced as possible.
The colon operator adds increments to the starting point, and subtracts decrements from the end point to reach a middle point. In this way, it ensures that the output vector is as symmetric as possible.
The two methods thus have different aims, and will often give very slightly different answers, e.g.
>> a = 0:pi/1000:10*pi;
>> b = linspace(0,10*pi,10001);
>> all(a==b)
ans =
0
>> max(a-b)
ans =
3.5527e-15
In practice, however, the differences will often have little impact unless you are interested in tiny numerical details. I find linspace more convenient when the number of gaps is easy to express, whereas I find the colon operator more convenient when the increment is easy to express.
See this MathWorks technical note for more detail on the algorithm behind the colon operator. For more detail on linspace, you can just type edit linspace to see exactly what it does.

linspace is useful where you know the number of elements you want rather than the size of the "step" between them. So if I said make a vector with 360 elements between 0 and 2*pi as a contrived example it's either going to be
linspace(0, 2*pi, 360)
or if you just had the colon operator you would have to manually calculate the step size:
0:(2*pi - 0)/(360-1):2*pi
linspace is just more convenient
For a simple real world application, see this answer where linspace is helpful in creating a custom colour map

Efficient histogram implementation using a hash function

Is there a more efficient approach to computing a histogram than a binary search for a non-linear bin distribution?
I'm actually only interested in the bit of the algorithm that matches the key (value) to the bin (the transfer function?) , i.e. for a bunch of floating point values I just want to know the appropriate bin index for each value.
I know that for a linear bin distribution you can get O(1) by dividing the value by the bin width, and that for non linear bins a binary search gets you O(logN). My current implementation uses a binary search on unequal bin widths.
In the spirit of improving efficiency I was curious as to whether you could use a hash function to map a value to its appropriate bin and achieve O(1) time complexity when you have bins of unequal widths?

In some simple cases you can get O(1).
Suppose, your values are 8-bit, from 0 to 255.
If you split them into 8 bins of sizes 2, 2, 4, 8, 16, 32, 64, 128, then the bin value ranges will be: 0-1, 2-3, 4-7, 8-15, 16-31, 32-63, 64-127, 128-255.
In binary these ranges look like:
0000000x (bin 0)
0000001x
000001xx
00001xxx
0001xxxx
001xxxxx
01xxxxxx
1xxxxxxx (bin 7)
So, if you can quickly (in O(1)) count how many most significant zero bits there are in the value, you can get the bin number from it.
In this particular case you may precalculate a look-up table of 256 elements, containing the bin number and finding the appropriate bin for a value is just one table look-up.
Actually, with 8-bit values you can use bins of arbitrary sizes since the look-up table is small.
If you were to go with bins of sizes of powers of 2, you could reuse this look-up table for 16-bit values as well. And you'd need two look-ups. You can extend it to even longer values.

Ordinary hash functions are intended to scatter different values quite randomly across some range. A single-bit difference in arguments may lead to dozens of bits different in results. For that reason, ordinary hash functions are not suitable for the situation described in the question.
An alternative is to build an array P with entries that index into the table B of bin limits. Given some value x, we find the bin j it belongs to (or sometimes a nearby bin) via j = P[⌊x·r⌋] where r is a ratio that depends on the size of P and the maximum value in B. The effectiveness of this approach depends on the values in B and the size of P.
The behavior of functions like P[⌊x·r⌋] can be seen via the python code shown below. (The method is about the same in any programming language. However, tips for Python-to-C are given below.) Suppose the code is stored in file histobins.py and loaded into the ipython interpreter with the command import histobins as hb. Then a command like hb.betterparts(27, 99, 9, 80,155) produces output like
At 80 parts, steps = 20 = 7+13
At 81 parts, steps = 16 = 7+9
At 86 parts, steps = 14 = 6+8
At 97 parts, steps = 13 = 12+1
At 108 parts, steps = 12 = 3+9
At 109 parts, steps = 12 = 8+4
At 118 parts, steps = 12 = 6+6
At 119 parts, steps = 10 = 7+3
At 122 parts, steps = 10 = 3+7
At 141 parts, steps = 10 = 5+5
At 142 parts, steps = 10 = 4+6
At 143 parts, steps = 9 = 7+2
These parameters to betterparts set nbins=27, topsize=99, seed=9, plo=80, phi=155 which creates a test set of 27 bins for values from 0 to 99, with random seed 9, and size of P from 80 to 155-1. The number of “steps” is the number of times the two while loops in testparts() operated during a test with 10*nbins values from 0 to topsize. Eg, “At 143 parts, steps = 9 = 7+2” means that when the size of P is 143, out of 270 trials, 261 times P[⌊x·r⌋] produced the correct index at once; 7 times the index had to be decreased, and twice it had to be increased.
The general idea of the method is to trade off space for time. Another tradeoff is preparation time versus operation time. If you are going to be doing billions of lookups, it is worthwhile to do a few thousand trials to find a good value of |P|, the size of P. If you are going to be doing only a few millions of lookups, it might be better to just pick some large value of |P| and run with it, or perhaps just run betterparts over a narrow range. Instead of doing 75 tests as above, if we start with larger |P| fewer tests may give a good enough result. For example, 10 tests via “hb.betterparts(27, 99, 9, 190,200)” produces
At 190 parts, steps = 11 = 5+6
At 191 parts, steps = 5 = 3+2
At 196 parts, steps = 5 = 4+1
As long as P fits into some level of cache (along with other relevant data) making |P| larger will speed up access. So, making |P| as large as practical is a good idea. As |P| gets larger, the difference in performance between one value of |P| and the next gets smaller and smaller. The limiting factors on speed then include time to multiply and time to set up while loops. One approach for faster multiplies may be to choose a power of 2 as a multiplier; compute |P| to match; then use shifts or adds to exponents instead of multiplies. One approach to spending less time setting up while loops is to move the statement if bins[bin] <= x < bins[bin+1]: (or its C equivalent, see below) to before the while statements and do the while's only if the if statement fails.
Python code is shown below. Note, in translating from Python to C,
• # begins a comment
• def begins a function
• a statement like ntest, right, wrong, x = 10*nbins, 0, 0, 0 assigns values to respective identifiers
• a statement like return (ntest, right, wrong, stepdown, stepup) returns a tuple of 5 values that the caller can assign to a tuple or to respective identifiers
• the scope of a def, while, or if ends with a line not indented farther than the def, while, or if
• bins = [0] initializes a list (an extendible indexable array) with value 0 as its initial entry
• bins.append(t) appends value t at the end of list bins
• for i,j in enumerate(p): runs a loop over the elements of iterable p (in this case, p is a list), making the index i and corresponding entry j == p[i] available inside the loop
• range(nparts) stands for a list of the values 0, 1, ... nparts-1
• range(plo, phi) stands for a list of the values plo, plo+1, ... phi-1
• if bins[bin] <= x < bins[bin+1] means if ((bins[bin] <= x) && (x < bins[bin+1]))
• int(round(x*float(nparts)/topsize))) actually rounds x·r, instead of computing ⌊x·r⌋ as advertised above
def makebins(nbins, topsize):
bins, t = [0], 0
for i in range(nbins):
t += random.random()
bins.append(t)
for i in range(nbins+1):
bins[i] *= topsize/t
bins.append(topsize+1)
return bins
#________________________________________________________________
def showbins(bins):
print ''.join('{:6.2f} '.format(x) for x in bins)
def showparts(nbins, bins, topsize, nparts, p):
ratio = float(topsize)/nparts
for i,j in enumerate(p):
print '{:3d}. {:3d} {:6.2f} {:7.2f} '.format(i, j, bins[j], i*ratio)
print 'nbins: {} topsize: {} nparts: {} ratio: {}'.format(nbins, topsize, nparts, ratio)
print 'p = ', p
print 'bins = ',
showbins(bins)
#________________________________________________________________
def testparts(nbins, topsize, nparts, seed):
# Make bins and make lookup table p
import random
if seed > 0: random.seed(seed)
bins = makebins(nbins,topsize)
ratio, j, p = float(topsize)/nparts, 0, range(nparts)
for i in range(nparts):
while j<nbins and i*ratio >= bins[j+1]:
j += 1
p[i] = j
p.append(j)
#showparts(nbins, bins, topsize, nparts, p)
# Count # of hits and steps with avg. of 10 items per bin
ntest, right, wrong, x = 10*nbins, 0, 0, 0
delta, stepdown, stepup = topsize/float(ntest), 0, 0
for i in range(ntest):
bin = p[min(nparts, max(0, int(round(x*float(nparts)/topsize))))]
while bin < nbins and x >= bins[bin+1]:
bin += 1; stepup += 1
while bin > 0 and x < bins[bin]:
bin -= 1; stepdown += 1
if bins[bin] <= x < bins[bin+1]: # Test if bin is correct
right += 1
else:
wrong += 1
print 'Wrong bin {} {:7.3f} at x={:7.3f} Too {}'.format(bin, bins[bin], x, 'high' if bins[bin] > x else 'low')
x += delta
return (ntest, right, wrong, stepdown, stepup)
#________________________________________________________________
def betterparts(nbins, topsize, seed, plo, phi):
beststep = 1e9
for parts in range(plo, phi):
ntest, right, wrong, stepdown, stepup = testparts(nbins, topsize, parts, seed)
if wrong: print 'Error with ', parts, ' parts'
steps = stepdown + stepup
if steps <= beststep:
beststep = steps
print 'At {:3d} parts, steps = {:d} = {:d}+{:d}'.format(parts, steps, stepdown, stepup)
#________________________________________________________________

Interpolation search is your friend. It's kind of an optimistic, predictive binary search where it guesses where the bin should be based on a linear assumption about the distribution of inputs, rather than just splitting the search space in half at each step. It will be O(1) if the linear assumption is true, but still works (though more slowly) when the assumption is not. To the degree that its predictions are accurate, the search is fast.

Depends on the implementation of the hashing and the type of data you're working with. For smaller data sets a more simple algorithm like binary search might outperform constant lookup if the lookup-overhead of hashing is larger on average.
The usual implementation of hashing, consists of an array of linked lists and a hashing function that maps a string to an index in the array of linked lists. There's a thing called the load factor, which is the number of elements in the hash map / length of the linked-list array. Thus for load factors < 1 you'll achieve constant lookup in the best case because no linked-list will contain more than one element (best case).
There's only one way to find out which is better - implement a hash map and see for yourself. You should be able to get something near constant lookup :)

What's the fastest way to find deepest path in a 3D array?

I've been trying to find solution to my problem for more than a week and I couldn't find out anything better than a milion iterations prog, so I think it's time to ask someone to help me.
I've got a 3D array. Let's say, we're talking about the ground and the first layer is a surface.
Another layers are floors below the ground. I have to find deepest path's length, count of isolated caves underground and the size of the biggest cave.
Here's the visualisation of my problem.
Input:
5 5 5 // x, y, z
xxxxx
oxxxx
xxxxx
xoxxo
ooxxx
xxxxx
xxoxx
and so...
Output:
5 // deepest path - starting from the surface
22 // size of the biggest cave
3 // number of izolated caves (red ones) (izolated - cave that doesn't reach the surface)
Note, that even though red cell on the 2nd floor is placed next to green one, It's not the same cave because it's placed diagonally and that doesn't count.
I've been told that the best way to do this, might be using recursive algorithm "divide and rule" however I don't really know how could it look like.

I think you should be able to do it in O(N).
When you parse your input, assign each node a 'caveNumber' initialized to 0. Set it to a valid number whenever you visit a cave:
CaveCount = 0, IsolatedCaveCount=0
AllSizes = new Vector.
For each node,
ProcessNode(size:0,depth:0);
ProcessNode(size,depth):
If node.isCave and !node.caveNumber
if (size==0) ++CaveCount
if (size==0 and depth!=0) IsolatedCaveCount++
node.caveNumber = CaveCount
AllSizes[CaveCount]++
For each neighbor of node,
if (goingDeeper) depth++
ProcessNode(size+1, depth).
You will visit each node 7 times at worst case: once from the outer loop, and possibly once from each of its six neighbors. But you'll only work on each one once, since after that the caveNumber is set, and you ignore it.
You can do the depth tracking by adding a depth parameter to the recursive ProcessNode call, and only incrementing it when visiting a lower neighbor.

The solution shown below (as a python program) runs in time O(n lg*(n)), where lg*(n) is the nearly-constant iterated-log function often associated with union operations in disjoint-set forests.
In the first pass through all cells, the program creates a disjoint-set forest, using routines called makeset(), findset(), link(), and union(), just as explained in section 22.3 (Disjoint-set forests) of edition 1 of Cormen/Leiserson/Rivest. In later passes through the cells, it counts the number of members of each disjoint forest, checks the depth, etc. The first pass runs in time O(n lg*(n)) and later passes run in time O(n) but by simple program changes some of the passes could run in O(c) or O(b) for c caves with a total of b cells.
Note that the code shown below is not subject to the error contained in a previous answer, where the previous answer's pseudo-code contains the line
if (size==0 and depth!=0) IsolatedCaveCount++
The error in that line is that a cave with a connection to the surface might have underground rising branches, which the other answer would erroneously add to its total of isolated caves.
The code shown below produces the following output:
Deepest: 5 Largest: 22 Isolated: 3
(Note that the count of 24 shown in your diagram should be 22, from 4+9+9.)
v=[0b0000010000000000100111000, # Cave map
0b0000000100000110001100000,
0b0000000000000001100111000,
0b0000000000111001110111100,
0b0000100000111001110111101]
nx, ny, nz = 5, 5, 5
inlay, ncells = (nx+1) * ny, (nx+1) * ny * nz
masks = []
for r in range(ny):
masks += [2**j for j in range(nx*ny)][nx*r:nx*r+nx] + [0]
p = [-1 for i in range(ncells)] # parent links
r = [ 0 for i in range(ncells)] # rank
c = [ 0 for i in range(ncells)] # forest-size counts
d = [-1 for i in range(ncells)] # depths
def makeset(x): # Ref: CLR 22.3, Disjoint-set forests
p[x] = x
r[x] = 0
def findset(x):
if x != p[x]:
p[x] = findset(p[x])
return p[x]
def link(x,y):
if r[x] > r[y]:
p[y] = x
else:
p[x] = y
if r[x] == r[y]:
r[y] += 1
def union(x,y):
link(findset(x), findset(y))
fa = 0 # fa = floor above
bc = 0 # bc = floor's base cell #
for f in v: # f = current-floor map
cn = bc-1 # cn = cell#
ml = 0
for m in masks:
cn += 1
if m & f:
makeset(cn)
if ml & f:
union(cn, cn-1)
mr = m>>nx
if mr and mr & f:
union(cn, cn-nx-1)
if m & fa:
union(cn, cn-inlay)
ml = m
bc += inlay
fa = f
for i in range(inlay):
findset(i)
if p[i] > -1:
d[p[i]] = 0
for i in range(ncells):
if p[i] > -1:
c[findset(i)] += 1
if d[p[i]] > -1:
d[p[i]] = max(d[p[i]], i//inlay)
isola = len([i for i in range(ncells) if c[i] > 0 and d[p[i]] < 0])
print "Deepest:", 1+max(d), " Largest:", max(c), " Isolated:", isola

It sounds like you're solving a "connected components" problem. If your 3D array can be converted to a bit array (e.g. 0 = bedrock, 1 = cave, or vice versa) then you can apply a technique used in image processing to find the number and dimensions of either the foreground or background.
Typically this algorithm is applied in 2D images to find "connected components" or "blobs" of the same color. If possible, find a "single pass" algorithm:
http://en.wikipedia.org/wiki/Connected-component_labeling
The same technique can be applied to 3D data. Googling "connected components 3D" will yield links like this one:
http://www.ecse.rpi.edu/Homepages/wrf/pmwiki/pmwiki.php/Research/ConnectedComponents
Once the algorithm has finished processing your 3D array, you'll have a list of labeled, connected regions, and each region will be a list of voxels (volume elements analogous to image pixels). You can then analyze each labeled region to determine volume, closeness to the surface, height, etc.
Implementing these algorithms can be a little tricky, and you might want to try a 2D implementation first. Thought it might not be as efficient as you like, you could create a 3D connected component labeling algorithm by applying a 2D algorithm iteratively to each layer and then relabeling the connected regions from the top layer to the bottom layer:
For layer 0, find all connected regions using the 2D connected component algorithm
For layer 1, find all connected regions.
If any labeled pixel in layer 0 sits directly over a labeled pixel in layer 1, change all the labels in layer 1 to the label in layer 0.
Apply this labeling technique iteratively through the stack until you reach layer N.
One important considering in connected component labeling is how one considers regions to be connected. In a 2D image (or 2D array) of bits, we can consider either the "4-connected" region of neighbor elements
X 1 X
1 C 1
X 1 X
where "C" is the center element, "1" indicates neighbors that would be considered connected, and "X" are adjacent neighbors that we do not consider connected. Another option is to consider "8-connected neighbors":
1 1 1
1 C 1
1 1 1
That is, every element adjacent to a central pixel is considered connected. At first this may sound like the better option. In real-world 2D image data a chessboard pattern of noise or diagonal string of single noise pixels will be detected as a connected region, so we typically test for 4-connectivity.
For 3D data you can consider either 6-connectivity or 26-connectivity: 6-connectivity considers only the neighbor pixels that share a full cube face with the center voxel, and 26-connectivity considers every adjacent pixel around the center voxel. You mention that "diagonally placed" doesn't count, so 6-connectivity should suffice.

You can observe it as a graph where (non-diagonal) adjacent elements are connected if they both empty (part of a cave). Note that you don't have to convert it to a graph, you can use normal 3d array representation.
Finding caves is the same task as finding the connected components in a graph (O(N)) and the size of a cave is the number of nodes of that component.

Algorithm to find "most common elements" in different arrays

I have for example 5 arrays with some inserted elements (numbers):
1,4,8,10
1,2,3,4,11,15
2,4,20,21
2,30
I need to find most common elements in those arrays and every element should go all the way till the end (see example below). In this example that would be the bold combination (or the same one but with "30" on the end, it's the "same") because it contains the smallest number of different elements (only two, 4 and 2/30).
This combination (see below) isn't good because if I have for ex. "4" it must "go" till it ends (next array mustn't contain "4" at all). So combination must go all the way till the end.
1,4,8,10
1,2,3,4,11,15
2,4,20,21
2,30
EDIT2: OR
1,4,8,10
1,2,3,4,11,15
2,4,20,21
2,30
OR anything else is NOT good.
Is there some algorithm to speed this thing up (if I have thousands of arrays with hundreds of elements in each one)?
To make it clear - solution must contain lowest number of different elements and the groups (of the same numbers) must be grouped from first - larger ones to the last - smallest ones. So in upper example 4,4,4,2 is better then 4,2,2,2 because in first example group of 4's is larger than group of 2's.
EDIT: To be more specific. Solution must contain the smallest number of different elements and those elements must be grouped from first to last. So if I have three arrrays like
1,2,3
1,4,5
4,5,6
Solution is 1,1,4 or 1,1,5 or 1,1,6 NOT 2,5,5 because 1's have larger group (two of them) than 2's (only one).
Thanks.
EDIT3: I can't be more specific :(
EDIT4: #spintheblack 1,1,1,2,4 is the correct solution because number used first time (let's say at position 1) can't be used later (except it's in the SAME group of 1's). I would say that grouping has the "priority"? Also, I didn't mention it (sorry about that) but the numbers in arrays are NOT sorted in any way, I typed it that way in this post because it was easier for me to follow.

Here is the approach you want to take, if arrays is an array that contains each individual array.
Starting at i = 0
current = arrays[i]
Loop i from i+1 to len(arrays)-1
new = current & arrays[i] (set intersection, finds common elements)
If there are any elements in new, do step 6, otherwise skip to 7
current = new, return to step 3 (continue loop)
print or yield an element from current, current = arrays[i], return to step 3 (continue loop)
Here is a Python implementation:
def mce(arrays):
count = 1
current = set(arrays[0])
for i in range(1, len(arrays)):
new = current & set(arrays[i])
if new:
count += 1
current = new
else:
print " ".join([str(current.pop())] * count),
count = 1
current = set(arrays[i])
print " ".join([str(current.pop())] * count)
>>> mce([[1, 4, 8, 10], [1, 2, 3, 4, 11, 15], [2, 4, 20, 21], [2, 30]])
4 4 4 2

If all are number lists, and are all sorted, then,
Convert to array of bitmaps.
Keep 'AND'ing the bitmaps till you hit zero. The position of the 1 in the previous value indicates the first element.
Restart step 2 from the next element

This has now turned into a graphing problem with a twist.
The problem is a directed acyclic graph of connections between stops, and the goal is to minimize the number of lines switches when riding on a train/tram.
ie. this list of sets:
1,4,8,10 <-- stop A
1,2,3,4,11,15 <-- stop B
2,4,20,21 <-- stop C
2,30 <-- stop D, destination
He needs to pick lines that are available at his exit stop, and his arrival stop, so for instance, he can't pick 10 from stop A, because 10 does not go to stop B.
So, this is the set of available lines and the stops they stop on:
A B C D
line 1 -----X-----X-----------------
line 2 -----------X-----X-----X-----
line 3 -----------X-----------------
line 4 -----X-----X-----X-----------
line 8 -----X-----------------------
line 10 -----X-----------------------
line 11 -----------X-----------------
line 15 -----------X-----------------
line 20 -----------------X-----------
line 21 -----------------X-----------
line 30 -----------------------X-----
If we consider that a line under consideration must go between at least 2 consecutive stops, let me highlight the possible choices of lines with equal signs:
A B C D
line 1 -----X=====X-----------------
line 2 -----------X=====X=====X-----
line 3 -----------X-----------------
line 4 -----X=====X=====X-----------
line 8 -----X-----------------------
line 10 -----X-----------------------
line 11 -----------X-----------------
line 15 -----------X-----------------
line 20 -----------------X-----------
line 21 -----------------X-----------
line 30 -----------------------X-----
He then needs to pick a way that transports him from A to D, with the minimal number of line switches.
Since he explained that he wants the longest rides first, the following sequence seems the best solution:
take line 4 from stop A to stop C, then switch to line 2 from C to D
Code example:
stops = [
[1, 4, 8, 10],
[1,2,3,4,11,15],
[2,4,20,21],
[2,30],
]
def calculate_possible_exit_lines(stops):
"""
only return lines that are available at both exit
and arrival stops, discard the rest.
"""
result = []
for index in range(0, len(stops) - 1):
lines = []
for value in stops[index]:
if value in stops[index + 1]:
lines.append(value)
result.append(lines)
return result
def all_combinations(lines):
"""
produce all combinations which travel from one end
of the journey to the other, across available lines.
"""
if not lines:
yield []
else:
for line in lines[0]:
for rest_combination in all_combinations(lines[1:]):
yield [line] + rest_combination
def reduce(combination):
"""
reduce a combination by returning the number of
times each value appear consecutively, ie.
[1,1,4,4,3] would return [2,2,1] since
the 1's appear twice, the 4's appear twice, and
the 3 only appear once.
"""
result = []
while combination:
count = 1
value = combination[0]
combination = combination[1:]
while combination and combination[0] == value:
combination = combination[1:]
count += 1
result.append(count)
return tuple(result)
def calculate_best_choice(lines):
"""
find the best choice by reducing each available
combination down to the number of stops you can
sit on a single line before having to switch,
and then picking the one that has the most stops
first, and then so on.
"""
available = []
for combination in all_combinations(lines):
count_stops = reduce(combination)
available.append((count_stops, combination))
available = [k for k in reversed(sorted(available))]
return available[0][1]
possible_lines = calculate_possible_exit_lines(stops)
print("possible lines: %s" % (str(possible_lines), ))
best_choice = calculate_best_choice(possible_lines)
print("best choice: %s" % (str(best_choice), ))
This code prints:
possible lines: [[1, 4], [2, 4], [2]]
best choice: [4, 4, 2]
Since, as I said, I list lines between stops, and the above solution can either count as lines you have to exit from each stop or lines you have to arrive on into the next stop.
So the route is:
Hop onto line 4 at stop A and ride on that to stop B, then to stop C
Hop onto line 2 at stop C and ride on that to stop D
There are probably edge-cases here that the above code doesn't work for.
However, I'm not bothering more with this question. The OP has demonstrated a complete incapability in communicating his question in a clear and concise manner, and I fear that any corrections to the above text and/or code to accommodate the latest comments will only provoke more comments, which leads to yet another version of the question, and so on ad infinitum. The OP has gone to extraordinary lengths to avoid answering direct questions or to explain the problem.

I am assuming that "distinct elements" do not have to actually be distinct, they can repeat in the final solution. That is if presented with [1], [2], [1] that the obvious answer [1, 2, 1] is allowed. But we'd count this as having 3 distinct elements.
If so, then here is a Python solution:
def find_best_run (first_array, *argv):
# initialize data structures.
this_array_best_run = {}
for x in first_array:
this_array_best_run[x] = (1, (1,), (x,))
for this_array in argv:
# find the best runs ending at each value in this_array
last_array_best_run = this_array_best_run
this_array_best_run = {}
for x in this_array:
for (y, pattern) in last_array_best_run.iteritems():
(distinct_count, lengths, elements) = pattern
if x == y:
lengths = tuple(lengths[:-1] + (lengths[-1] + 1,))
else :
distinct_count += 1
lengths = tuple(lengths + (1,))
elements = tuple(elements + (x,))
if x not in this_array_best_run:
this_array_best_run[x] = (distinct_count, lengths, elements)
else:
(prev_count, prev_lengths, prev_elements) = this_array_best_run[x]
if distinct_count < prev_count or prev_lengths < lengths:
this_array_best_run[x] = (distinct_count, lengths, elements)
# find the best overall run
best_count = len(argv) + 10 # Needs to be bigger than any possible answer.
for (distinct_count, lengths, elements) in this_array_best_run.itervalues():
if distinct_count < best_count:
best_count = distinct_count
best_lengths = lengths
best_elements = elements
elif distinct_count == best_count and best_lengths < lengths:
best_count = distinct_count
best_lengths = lengths
best_elements = elements
# convert it into a more normal representation.
answer = []
for (length, element) in zip(best_lengths, elements):
answer.extend([element] * length)
return answer
# example
print find_best_run(
[1,4,8,10],
[1,2,3,4,11,15],
[2,4,20,21],
[2,30]) # prints [4, 4, 4, 30]
Here is an explanation. The ...this_run dictionaries have keys which are elements in the current array, and they have values which are tuples (distinct_count, lengths, elements). We are trying to minimize distinct_count, then maximize lengths (lengths is a tuple, so this will prefer the element with the largest value in the first spot) and are tracking elements for the end. At each step I construct all possible runs which are a combination of a run up to the previous array with this element next in sequence, and find which ones are best to the current. When I get to the end I pick the best possible overall run, then turn it into a conventional representation and return it.
If you have N arrays of length M, this should take O(N*M*M) time to run.

I'm going to take a crack here based on the comments, please feel free to comment further to clarify.
We have N arrays and we are trying to find the 'most common' value over all arrays when one value is picked from each array. There are several constraints 1) We want the smallest number of distinct values 2) The most common is the maximal grouping of similar letters (changing from above for clarity). Thus, 4 t's and 1 p beats 3 x's 2 y's
I don't think either problem can be solved greedily - here's a counterexample [[1,4],[1,2],[1,2],[2],[3,4]] - a greedy algorithm would pick [1,1,1,2,4] (3 distinct numbers) [4,2,2,2,4] (two distinct numbers)
This looks like a bipartite matching problem, but I'm still coming up with the formulation..
EDIT : ignore; This is a different problem, but if anyone can figure it out, I'd be really interested
EDIT 2 : For anyone that's interested, the problem that I misinterpreted can be formulated as an instance of the Hitting Set problem, see http://en.wikipedia.org/wiki/Vertex_cover#Hitting_set_and_set_cover. Basically the left hand side of the bipartite graph would be the arrays and the right hand side would be the numbers, edges would be drawn between arrays that contain each number. Unfortunately, this is NP complete, but the greedy solutions described above are essentially the best approximation.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight