Solving Kakuro puzzles - c

Here's a good one to reflect on:
http://en.wikipedia.org/wiki/Kakuro
I'm attempting to make a solver for this game. The paperwork is done (reading an initial file with a variable number of columns and rows). It's assumed the input file follows the rules of the game, so the game is always solvable. Take your time to read the game rules.
I've taken care of the data structure which I think will suit best:
struct aSquare { int verticalSum; int horizontalSum; int value; };
And made an "array" of these dynamically to work on.
I made it so that the black squares have value of -1 and white squares (the actual solution squares) initialize at 0. You can also get the position of each aSquare struct from the array easily, no need to make additional struct fields for it.
Now the algorithm... How in the world will I reconcile all these sums and find a general way that will solve all types of grids? I have been struggling with this all afternoon to no avail.
Help is appreciated, have fun!
*EDIT: I just realized the actual link I posted has some tips regarding solving techniques. I will still keep this up to see what people come up with.

Regarding Constraint Programming: Here are some different implementations of how to solve a Kakuro puzzle with Constraint Programming (all using the same basic principle). The problem instance is fixed in the program.
Google or-tools/Python: http://www.hakank.org/google_or_tools/kakuro.py
Comet: http://www.hakank.org/comet/kakuro.co
MiniZinc: http://www.hakank.org/minizinc/kakuro.mzn
SICStus: http://www.hakank.org/sicstus/kakuro.pl
ECLiPSe: http://www.hakank.org/eclipse/kakuro.ecl
Gecode: http://www.hakank.org/gecode/kakuro.cpp
Google or-tools/C#: http://hakank.org/google_or_tools/kakuro.cs
Answer Set Programming: http://hakank.org/asp/kakuro.lp
Edit: Added Google or-tools/C# and Answer Set Programming.

A simple brute-force solver for Sudoku takes milliseconds to run, so you don't need to bother implementing any special tactics. I think the same will hold for Kakuro. A simple algorithm:
def solve(kakuro):
    if kakuro has no empty fields:
        print kakuro
        stop.
    else:
        position = pick a position
        values = [calculate possible legal values for that field]
        for value in values:
            kakuro[position] = value
            solve(kakuro)
        kakuro[position] = None # unset after trying all possibilities
This algorithm might work better if you find a good order in which to fill the fields. Try to choose the most constrained fields first (that is, the fields with the fewest legal values).
Anyway, this will probably be the simplest to implement, so try it and look for more sophisticated solvers only if this one does not work. (Actually this is one of the simplest constraint programming solvers; real CP solvers are much more complicated, of course.)
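To make the backtracking idea above concrete, here is a minimal Python sketch. The puzzle representation is an assumption made for this illustration, not the asker's struct: white cells are numbered 0..num_cells-1, and each run is a (target_sum, [cell numbers]) pair. The names solve_kakuro, legal and backtrack are hypothetical. Cells are filled in index order, so the "most constrained first" ordering is not implemented here.

```python
def solve_kakuro(num_cells, runs):
    """Return a list of digits (one per white cell) or None if unsolvable.

    runs is a list of (target_sum, [cell indices]) pairs; this layout is an
    assumption for the sketch, not part of the original question.
    """
    values = [0] * num_cells            # 0 means "still empty"
    runs_of = [[] for _ in range(num_cells)]
    for target, cells in runs:
        for cell in cells:
            runs_of[cell].append((target, cells))

    def legal(cell, digit):
        # A digit is legal if it breaks no run that contains the cell.
        for target, cells in runs_of[cell]:
            others = [values[c] for c in cells if c != cell]
            if digit in others:         # digits may not repeat within a run
                return False
            total = digit + sum(others)
            empty = others.count(0)
            if empty == 0 and total != target:
                return False            # run is full but the sum is wrong
            if empty > 0 and total >= target:
                return False            # no room left for the empty cells
        return True

    def backtrack(cell):
        if cell == num_cells:
            return True                 # every cell is filled consistently
        for digit in range(1, 10):
            if legal(cell, digit):
                values[cell] = digit
                if backtrack(cell + 1):
                    return True
                values[cell] = 0        # undo and try the next digit
        return False

    return values if backtrack(0) else None
```

For a 2x2 block of white cells numbered row by row, with row sums 3 and 7 and column sums 4 and 6, the call would be solve_kakuro(4, [(3, [0, 1]), (7, [2, 3]), (4, [0, 2]), (6, [1, 3])]).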

If your interest is ultimately in making a software solver for these games, but not getting into the algorithmic details, I recommend using a Constraint Programming (CP) engine. CP is a declarative programming paradigm that is very well suited to these sorts of problems. Several commercial and open source CP engines are available.
http://en.wikipedia.org/wiki/Constraint_programming

I would guess that linear programming can be used to solve this kind of game. Since the unknowns are integers, this is an integer programming problem, for which exact methods exist (branch and bound? cutting planes?).
In any case, a table of the most constrained sum combinations will certainly be useful (e.g. http://www.enigmoteka.com/Kakuro%20Cheatsheet.pdf)
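A table like that cheatsheet can also be generated rather than looked up. The following Python sketch lists every set of distinct digits 1 to 9 that fills a run of a given length with a given sum; the function name digit_combinations is made up for this illustration.

```python
from itertools import combinations


def digit_combinations(total, cells):
    """All sets of `cells` distinct digits from 1..9 that add up to `total`."""
    return [combo for combo in combinations(range(1, 10), cells)
            if sum(combo) == total]
```

A run with only one combination, such as 17 in two cells, pins down which digits appear in it (though not their order), which is what makes such runs good starting points.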

I have found some nice bit-manipulation tricks that speed up Kakuro solving. You can pick up the source here.

This is straight-forward linear algebra, use vector/matrix manipulation techniques to solve.
[edit - answering the comments]
a + b + 0 + d + 0 = n1
0 + b + c + 0 + e = n2
a + 0 + c + 0 + 0 = n3
a + b + c + 0 + e = n4
a + 0 + c + d + 0 = n5
Above is converted to a matrix, and by adding and subtracting multiples of the rows, you end up with:
a 0 0 0 0 na
0 b 0 0 0 nb
0 0 c 0 0 nc
0 0 0 d 0 nd
0 0 0 0 e ne
No combinatorics, all remain integers.

Related

Daily Coding Problem 260 : Reconstruct a jumbled array - Intuition?

I'm going through the question below.
The sequence [0, 1, ..., N] has been jumbled, and the only clue you have for its order is an array representing whether each number is larger or smaller than the last. Given this information, reconstruct an array that is consistent with it.
For example, given [None, +, +, -, +], you could return [1, 2, 3, 0, 4].
I went through the solution on this post but still unable to understand it as to why this solution works. I don't think I would be able to come up with the solution if I had this in front of me during an interview. Can anyone explain the intuition behind it? Thanks in advance!
This answer tries to give a general strategy for finding an algorithm to tackle this type of problem. It is not trying to prove why the given solution is correct, but laying out a route towards such a solution.
A tried and tested way to tackle this kind of problem (actually a wide range of problems) is to start with small examples and work your way up. This works for puzzles, but equally for problems encountered in reality.
First, note that the question is formulated deliberately to not point you in the right direction too easily. It makes you think there is some magic involved. How can you reconstruct a list of N numbers given only the list of plusses and minuses?
Well, you can't. For 10 numbers, there are 10! = 3628800 possible permutations, and there are only 2⁹ = 512 possible lists of signs. That is a huge difference. Most original lists will be completely different after reconstruction.
Here's an overview of how to approach the problem:
Start with very simple examples
Try to work your way up, adding a bit of complexity
If you see something that seems a dead end, try increasing complexity in another way; don't spend too much time with situations where you don't see progress
While exploring alternatives, revisit old dead ends, as you might have gained new insights
Try whether recursion could work:
given a solution for N, can we easily construct a solution for N+1?
or even better: given a solution for N, can we easily construct a solution for 2N?
Given a recursive solution, can it be converted to an iterative solution?
Does the algorithm do some repetitive work that can be postponed to the end?
....
So, let's start simple (writing 0 for the None at the start):
very short lists are easy to guess:
'0++' → 0 1 2 → clearly only one solution
'0--' → 2 1 0 → only one solution
'0-+' → 1 0 2 or 2 0 1 → hey, there is no unique outcome, though the question only asks for one of the possible outcomes
lists with only plusses:
'0++++++' → 0 1 2 3 4 5 6 → only possibility
lists with only minuses:
'0-------'→ 7 6 5 4 3 2 1 0 → only possibility
lists with one minus, the rest plusses:
'0-++++' → 1 0 2 3 4 5 or 5 0 1 2 3 4 or ...
'0+-+++' → 0 2 1 3 4 5 or 4 5 0 1 2 3 or ...
→ no very obvious pattern seems to emerge
maybe some recursion could help?
given a solution for N, appending one sign more?
appending a plus is easy: just repeat the solution and append the largest plus 1
appending a minus, after some thought: increase all the numbers by 1 and append a zero
→ hey, we have a working solution, but maybe not the most efficient one
the algorithm just appends to an existing list, no need to really write it recursively (although the idea is expressed recursively)
appending a plus can be improved, by storing the largest number in a variable so it doesn't need to be searched at every step; no further improvements seem necessary
appending a minus is more troublesome: the list needs to be traversed with each append
what if instead of appending a zero, we append -1, and do the adding at the end?
this clearly works when there is only one minus
when two minus signs are encountered, the first time append -1, the second time -2
→ hey, this works for any number of minuses encountered, just store its counter in a variable and sum with it at the end of the algorithm
This is in bird's eye view one possible route towards coming up with a solution. Many routes lead to Rome. Introducing negative numbers might seem tricky, but it is a logical conclusion after contemplating the recursive algorithm for a while.
It works because all changes are sequential, either adding one or subtracting one, starting both the increasing and the decreasing sequences from the same place. That guarantees we have a sequential list overall. For example, given the arbitrary
[None, +, -, +, +, -]
turned vertically for convenience, we can see
None 0
+ 1
- -1
+ 2
+ 3
- -2
Now just shift them up by two (to account for -2):
2 3 1 4 5 0
+ - + + -
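The "count up for a plus, count down for a minus, shift at the end" idea above can be sketched in Python as follows. The function name reconstruct and the use of "+" and "-" strings for the signs are assumptions made for this illustration.

```python
def reconstruct(signs):
    """Build one permutation of 0..N consistent with signs (signs[0] is None)."""
    low = high = 0
    result = [0]                      # the first element starts at 0
    for sign in signs[1:]:
        if sign == "+":
            high += 1                 # next unused value above everything so far
            result.append(high)
        else:
            low -= 1                  # next unused value below everything so far
            result.append(low)
    # low is 0 or negative; subtracting it shifts the values into 0..N
    return [value - low for value in result]
```

With the sign list from the worked example, reconstruct([None, "+", "-", "+", "+", "-"]) gives [2, 3, 1, 4, 5, 0].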
Let's first look at a solution which (I think) is easier to understand, formalize and prove correct (though I will only explain it here, not prove it formally):
We name A[0..N] our input array (where A[k] is None if k = 0 and is + or - otherwise) and B[0..N] our output array (where B[k] is in the range [0, N] and all values are unique)
At first we see that our problem (find B such that B[k] > B[k-1] if A[k] == + and B[k] < B[k-1] if A[k] == -) is only a special case of another problem:
Find B such that B[k] == max(B[0..k]) if A[k] == + and B[k] == min(B[0..k]) if A[k] == -.
This strengthens the requirement from "a value must be larger or smaller than the last one" to "a value must be larger or smaller than every value before it".
So a solution to this problem is a solution to the original one as well.
Now how do we approach this problem?
A greedy solution is sufficient. Indeed, it is easy to show that the value associated with the last + will be the largest number (which is N), the one associated with the second-to-last + will be the second largest (which is N-1), etc.
Likewise, the value associated with the last - will be the smallest number (which is 0), the one associated with the second-to-last - will be the second smallest (which is 1), etc.
So we can fill B from right to left, remembering how many + we have seen (call this X) and how many - we have seen (call this Y). At each position we look at the current symbol: if it is a +, we put N-X in B and increase X by 1; if it is a -, we put 0+Y in B and increase Y by 1.
In the end we fill B[0] with the only remaining value, which is equal to both Y and N-X (because X + Y = N once every sign has been processed).
An interesting property of this solution is that the values associated with a - are exactly the values from 0 to Y-1 (where Y is now the total number of -) in descending order; the values associated with a + are exactly the values from N-X+1 to N (where X is now the total number of +) in ascending order; and B[0] is always Y, which equals N-X.
So the - will have all the values strictly smaller than B[0], in descending order, and the + will have all the values strictly bigger than B[0], in ascending order.
This property is the key to understanding why the solution proposed here works:
It takes B[0] equal to 0 and then fills B following this property. The result is not yet a solution because the values are not in the range [0, N], but a simple translation shifts them into [0, N].
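The right-to-left greedy fill described in this answer can be sketched in Python like this. The function name and the "+" / "-" string encoding are assumptions made for the illustration; B[0] is taken to be whichever value in 0..N is left unused.

```python
def reconstruct_right_to_left(signs):
    """Fill B from the right: each '+' takes the largest unused value,
    each '-' takes the smallest unused value (signs[0] is None)."""
    n = len(signs) - 1
    result = [None] * (n + 1)
    plus_seen = 0                     # X in the text
    minus_seen = 0                    # Y in the text
    for k in range(n, 0, -1):
        if signs[k] == "+":
            result[k] = n - plus_seen
            plus_seen += 1
        else:
            result[k] = minus_seen
            minus_seen += 1
    # exactly one value in 0..N has not been handed out yet
    unused = set(range(n + 1)) - set(result[1:])
    result[0] = unused.pop()
    return result
```

For the example from the question, reconstruct_right_to_left([None, "+", "+", "-", "+"]) gives [1, 2, 3, 0, 4].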
The idea is to produce a permutation of [0,1...N] which follows the pattern of [+,-...]. Many permutations qualify, not just one. For instance, look at the example provided:
[None, +, +, -, +], you could return [1, 2, 3, 0, 4].
But you could also have returned other solutions, just as valid: [2,3,4,0,1] and [0,3,4,1,2] are also solutions. The only concern is that the first number must have at least two numbers above it for positions [1] and [2], and that you leave one number near the end which is lower than the ones before and after it.
So the question isn't finding the one and only pattern which is scrambled, but to produce any permutation which will work with these rules.
This algorithm answers two questions for the next member of the list: find a number that is higher/lower than the previous one, and find a number that hasn't been used yet. It takes a starting number and essentially creates two lists: an ascending list for the '+' and a descending list for the '-'. This way we guarantee that the next member is higher/lower than the previous one (because it is in fact higher/lower than all previous members, a stricter condition than the one required), and for the same reason we know this number wasn't used before.
So the intuition of the referenced algorithm is to start with a reference number and work your way through. Let's assume we start from 0. In the first place we put 0+1, which is 1. We keep 0 as our lowest and 1 as the highest.
l[0] h[1] list[1]
the next symbol is '+' so we take the highest number and raise it by one to 2, and update both the list with a new member and the highest number.
l[0] h[2] list [1,2]
The next symbol is '+' again, and so:
l[0] h[3] list [1,2,3]
The next symbol is '-' and so we have to put in our 0. Note that if the symbol after that were also '-' we would have to stop, since we have no lower number to produce.
l[0] h[3] list [1,2,3,0]
Luckily for us, we've chosen well and the last symbol is '+', so we can put our 4 and call it a day.
l[0] h[4] list [1,2,3,0,4]
This is not necessarily the smartest solution, as it can never know whether the starting number will solve the sequence, and it always progresses by 1. That means that for some patterns [+,-...] it will not be able to find a solution. But for the pattern provided it works well with 0 as the initial starting point. If we chose the number 1 it would also work and produce [2,3,4,0,1], but for 2 and above it will fail. It will never produce the solution [0,3,4,1,2].
I hope this helps understanding the approach.
This is not an explanation for the question put forward by OP.
Just want to share a possible approach.
Given: N = 7
Index: 0 1 2 3 4 5 6 7
Pattern: X + - + - + - + //X = None
Go from 0 to N
[1] fill all '-' starting from right going left.
Index: 0 1 2 3 4 5 6 7
Pattern: X + - + - + - + //X = None
Answer: _ _ 2 _ 1 _ 0 _
[2] fill all the vacant places i.e [X & +] starting from left going right.
Index: 0 1 2 3 4 5 6 7
Pattern: X + - + - + - + //X = None
Answer: 3 4 _ 5 _ 6 _ 7
Final:
Pattern: X + - + - + - + //X = None
Answer: 3 4 2 5 1 6 0 7
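The two passes above can be sketched in Python as follows; the function name and the "+" / "-" string encoding are assumptions made for this illustration.

```python
def reconstruct_two_pass(signs):
    """Pass 1: number the '-' slots 0, 1, 2, ... from right to left.
    Pass 2: give the remaining slots the next values from left to right."""
    n = len(signs) - 1
    result = [None] * (n + 1)
    next_value = 0
    for k in range(n, 0, -1):         # pass 1, right to left
        if signs[k] == "-":
            result[k] = next_value
            next_value += 1
    for k in range(n + 1):            # pass 2, left to right
        if result[k] is None:
            result[k] = next_value
            next_value += 1
    return result
```

Every '-' slot ends up below all the other slots, and within each group the values move in the required direction, which is why neighbouring entries always satisfy their sign.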
My answer is definitely too late for your problem, but if you need a simple proof, you may like to read it:
min_so_far (or min_last) is a decreasing value starting from 0.
max_so_far (or max_last) is an increasing value starting from 0.
Each input value after the initial None is either "+" or "-"; a "+" increases max_so_far by one and a "-" decreases min_so_far by one. So max_so_far - min_so_far is exactly N, right? At that point max_so_far equals the number of "+"s and min_so_far equals minus the number of "-"s, so the values produced span [min_so_far, max_so_far], which overlaps the required range [0, N] only in [0, max_so_far]. To get the final solution, shift every value by subtracting min_so_far (equivalently, adding abs(min_so_far), since min_so_far <= 0).

I really can't figure out where to start

Using the nine digits 1 to 9 in order, you should find the number of ways to get N using multiplication and addition (adjacent digits may also be joined into one number, as in the examples below).
For example, if 100 is given, you would answer 7.
The reason is that there are 7 possible ways.
100 = 1*2*3*4+5+6+7*8+9
100 = 1*2*3+4+5+6+7+8*9
100 = 1+2+3+4+5+6+7+8*9
100 = 12+3*4+5+6+7*8+9
100 = 1+2*3+4+5+67+8+9
100 = 1*2+34+5+6*7+8+9
100 = 12+34+5*6+7+8+9
If this question is given to you, how would you start?
Are we allowed to use parentheses? That would expand the number of possibilities by a lot.
I would try to find the first additive term, let’s say 1×23, first. There are a limited number of those, and since we can’t subtract, we know that if we get a term above our target, we can prune it from our search. That leaves us looking for the solution to 23 + f = 100, where f is another formula of exactly the same form. But that is exactly the same as solving the original problem for numbers 4–9 and target 77! So call your algorithm recursively and add the solutions for that subproblem to the solutions to the original problem. That is, if we have 23 + 4, are there any solutions to the subproblem with numbers 5–9 and n = 73? Divide and conquer.
You might benefit from a dynamic table of partial solutions, since it's possible you might get the same subproblem in different ways: 1+2+3 = 1×2×3, so solving the subproblem with numbers 4–9 and target 94 twice duplicates work.
You are probably better going from right to left than from left to right, on the principle of most-constrained first. 89, 8×9, or 78+9 leave much less room for possible solutions than 1+2+3, 1×2×3, 12×3, 12+3 or 1×23.
There are three possible operations
addition
multiplication
combine, for example combine 1 and 2 to make 12
There are 8 positions for each operator. Hence, there are a total of 3^8 = 6561 possible equations. So I would start with
for ( i = 0; i < 6561; i++ )
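As a rough sketch of that 3^8 enumeration, here is one way it could look in Python. The function name expressions_for is made up for this illustration. Each gap between digits gets "" (combine), "+" or "*", and the expression is evaluated by splitting on "+" first and on "*" second, so multiplication binds tighter and no eval is needed.

```python
from itertools import product


def expressions_for(target, digits="123456789"):
    """Return every expression over the digits, in order, that equals target."""
    found = []
    for ops in product(("", "+", "*"), repeat=len(digits) - 1):
        # interleave the digits with the chosen operators
        expr = digits[0] + "".join(op + d for op, d in zip(ops, digits[1:]))
        total = 0
        for term in expr.split("+"):          # additive terms
            value = 1
            for factor in term.split("*"):    # factors inside one term
                value *= int(factor)
            total += value
        if total == target:
            found.append(expr)
    return found
```

len(expressions_for(100)) then gives the count the question asks for, and the returned list shows the expressions themselves.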

What is the advantage of linspace over the colon ":" operator?

Is there some advantage of writing
t = linspace(0,20,21)
over
t = 0:1:20
?
I understand the latter produces a vector, just as the former does.
Can anyone give me a situation where linspace is more useful than t = 0:1:20?
It's not just the usability. Though the documentation says:
The linspace function generates linearly spaced vectors. It is
similar to the colon operator :, but gives direct control over the
number of points.
That much is the same; the main difference and advantage of linspace is that it generates a vector of integers with the desired length (100 by default) and scales it afterwards to the desired range, whereas the colon operator creates the vector directly by adding increments.
Imagine you need to define bin edges for a histogram, and in particular you need the bin edge 0.35 to be exactly in its right place:
edges = [0.05:0.10:.55];
X = edges == 0.35
edges = 0.0500 0.1500 0.2500 0.3500 0.4500 0.5500
X = 0 0 0 0 0 0
does not define the right bin edge, but:
edges = linspace(0.05,0.55,6); %// 6 = (0.55-0.05)/0.1+1
X = edges == 0.35
edges = 0.0500 0.1500 0.2500 0.3500 0.4500 0.5500
X = 0 0 0 1 0 0
does.
Well, it's basically a floating point issue, which linspace can avoid, as a single division of an integer is not as delicate as the cumulative sum of floating point numbers. But as Mark Dickinson pointed out in the comments:
You shouldn't rely on any of the computed values being exactly what you expect. That is not what linspace is for. In my opinion it's a matter of how likely you are to get floating point issues, how much you can reduce the probability of them, and how small you can set the tolerances. Using linspace can reduce the probability of these issues occurring, but it is not a guarantee.
That's the code of linspace:
n1 = n-1
c = (d2 - d1).*(n1-1) % opposite signs may cause overflow
if isinf(c)
    y = d1 + (d2/n1).*(0:n1) - (d1/n1).*(0:n1)
else
    y = d1 + (0:n1).*(d2 - d1)/n1
end
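To see the "scale an integer ramp" formula from the else branch in isolation, here is a small Python sketch. Both function names are made up for this illustration. accumulate_sketch is a naive running sum shown only for contrast; it is not how MATLAB's colon operator is actually implemented.

```python
def linspace_sketch(d1, d2, n=100):
    """n points from d1 to d2 via y = d1 + i*(d2 - d1)/n1, with n1 = n - 1."""
    if n < 2:
        raise ValueError("this sketch needs at least two points")
    n1 = n - 1
    return [d1 + i * (d2 - d1) / n1 for i in range(n)]


def accumulate_sketch(start, step, count):
    """count points built by repeatedly adding step (naive, for contrast)."""
    values = []
    current = start
    for _ in range(count):
        values.append(current)
        current += step               # rounding error can build up here
    return values
```

Comparing linspace_sketch(0.05, 0.55, 6) with accumulate_sketch(0.05, 0.10, 6) element by element shows whether the two routes agree to the last bit on a given machine; for integer steps such as linspace_sketch(0, 20, 21) there is nothing to round and both give 0, 1, ..., 20.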
To sum up: linspace and colon are reliable at doing different tasks. linspace tries to ensure (as the name suggests) linear spacing, whereas colon tries to ensure symmetry.
In your special case, as you create a vector of integers, there is no advantage to linspace (apart from usability), but when it comes to delicate floating point tasks, there may be.
The answer of Sam Roberts provides some additional information and clarifies further things, including some statements of MathWorks regarding the colon operator.
linspace and the colon operator do different things.
linspace creates a vector of integers of the specified length, and then scales it down to the specified interval with a division. In this way it ensures that the output vector is as linearly spaced as possible.
The colon operator adds increments to the starting point, and subtracts decrements from the end point to reach a middle point. In this way, it ensures that the output vector is as symmetric as possible.
The two methods thus have different aims, and will often give very slightly different answers, e.g.
>> a = 0:pi/1000:10*pi;
>> b = linspace(0,10*pi,10001);
>> all(a==b)
ans =
0
>> max(a-b)
ans =
3.5527e-15
In practice, however, the differences will often have little impact unless you are interested in tiny numerical details. I find linspace more convenient when the number of gaps is easy to express, whereas I find the colon operator more convenient when the increment is easy to express.
See this MathWorks technical note for more detail on the algorithm behind the colon operator. For more detail on linspace, you can just type edit linspace to see exactly what it does.
linspace is useful where you know the number of elements you want rather than the size of the "step" between them. So if I said make a vector with 360 elements between 0 and 2*pi as a contrived example it's either going to be
linspace(0, 2*pi, 360)
or if you just had the colon operator you would have to manually calculate the step size:
0:(2*pi - 0)/(360-1):2*pi
linspace is just more convenient
For a simple real world application, see this answer where linspace is helpful in creating a custom colour map

Hello, I have a computational q. regarding combination/permutations

A brief intro. I am creating medical software. I have forgotten some of the combination/permutation theorems from college. Let's say I have five nerves: median, ulnar, radial, tibial, peroneal. I can choose one, two, three, four, or all five of them in any combination. What is the equation to find the maximum number of combinations I can make?
For example;
median
median + ulnar
median + ulnar + radial
etc etc
ulnar + median = median + ulnar. so those would be repetitive. Thank you for your help. I know this isn't directly programming related, but I thought you guys would be familiar.
The comment that says it is (2^n)-1 is correct. 2^n is the number of possible subsets you can form from a set of n objects (in this case you have 5 objects), and then in your case, you don't want to count the empty set, so you subtract out 1.
I'm sure you can do the math, but for the sake of completeness, for 5 nerves, there would be 2^5 - 1 = 32 - 1 = 31 possible combinations you could end up with.
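If you want to see the 31 combinations rather than just count them, a short Python sketch with itertools.combinations does it. The helper name nonempty_combinations is made up for this illustration.

```python
from itertools import combinations


def nonempty_combinations(items):
    """Every selection of 1..len(items) items, ignoring order."""
    return [combo
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]


nerves = ["median", "ulnar", "radial", "tibial", "peroneal"]
all_choices = nonempty_combinations(nerves)
print(len(all_choices))               # should agree with 2**5 - 1
```

Because combinations keeps the items in their original order, ("median", "ulnar") appears once and ("ulnar", "median") never does.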

Subset Sum TI Basic Programming

I'm trying to program my TI-83 to do a subset sum search. So, given a list of length N, I want to find all lists of given length L, that sum to a given value V.
This is a little bit different than the regular subset sum problem because I am only searching for subsets of given lengths, not all lengths, and recursion is not necessarily the first choice because I can't call the program I'm working in.
I am able to easily accomplish the task with nested loops, but that is becoming cumbersome for values of L greater than 5. I'm trying for dynamic solutions, but am not getting anywhere.
Really, at this point, I am just trying to get the list references correct, so that's what I'm looking at. Let's go with an example:
L1={p,q,r,s,t,u}
so
N=6
let's look for all subsets of length 3 to keep it relatively short, so L = 3 (6c3 = 20 total outputs).
Ideally the list references that would be searched are:
{1,2,3}
{1,2,4}
{1,2,5}
{1,2,6}
{1,3,4}
{1,3,5}
{1,3,6}
{1,4,5}
{1,4,6}
{1,5,6}
{2,3,4}
{2,3,5}
{2,3,6}
{2,4,5}
{2,4,6}
{2,5,6}
{3,4,5}
{3,4,6}
{3,5,6}
{4,5,6}
Obviously accomplished by:
FOR A,1,N-2
  FOR B,A+1,N-1
    FOR C,B+1,N
      display {A,B,C}
    END
  END
END
I initially sort the data of N descending which allows me to search for criteria that shorten the search, and using FOR loops screws it up a little at different places when I increment the values of A, B and C within the loops.
I am also looking for better dynamic solutions. I've done some research on the web, but I can't seem to adapt what is out there to my particular situation.
Any help would be appreciated. I am trying to keep it brief enough as to not write a novel but explain what I am trying to get at. I can provide more details as needed.
For optimisation, you simply want to skip those sub-trees of the search where you already know they'll exceed the value V. Recursion is the way to go but, since you've already ruled that out, you're going to be best off setting an upper limit on the allowed depths.
I'd go for something like this (for a depth of 3):
N is the total number of array elements.
L is the desired length (3).
V is the desired sum
Y[] is the array
Z is the total
Z = 0
IF Z <= V
  FOR A,1,N-L+1
    Z = Z + Y[A]
    IF Z <= V
      FOR B,A+1,N-L+2
        Z = Z + Y[B]
        IF Z <= V
          FOR C,B+1,N-L+3
            Z = Z + Y[C]
            IF Z = V
              DISPLAY {A,B,C}
            END
            Z = Z - Y[C]
          END
        END
        Z = Z - Y[B]
      END
    END
    Z = Z - Y[A]
  END
END
Now that's pretty convoluted, but it basically checks at every stage whether you've already exceeded the desired value and refuses to check lower sub-trees as an efficiency measure (this pruning assumes the values are non-negative). It also keeps a running total for the current level so that it doesn't have to do a large number of additions when checking at lower levels. That's the adding and subtracting of the array values against Z.
It's going to get even more complicated when you modify it to handle more depth, by using variables from D to K for 11 levels (more if you're willing to move N and L down to W and X, or if TI BASIC allows more than one character in a variable name).
The only other non-recursive way I can think of doing that is to use an array of value groups to emulate recursion with iteration, and that will look only slightly less hairy (although the code should be less nested).
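As a rough idea of what that iterative version could look like, here is a Python sketch (Python rather than TI BASIC, purely to show the control flow). It keeps the loop counters A, B, C, ... in one list and advances them like an odometer, so the same code works for any length L. The function name is made up for this illustration, and the early-exit pruning on the running total is left out to keep the index handling visible.

```python
def fixed_length_subsets(values, length, target):
    """Return the 1-based index lists of every `length`-element subset
    of `values` whose elements sum to `target`, without recursion."""
    n = len(values)
    if length < 1 or length > n:
        return []
    idx = list(range(length))         # current combination, 0-based
    results = []
    while True:
        if sum(values[i] for i in idx) == target:
            results.append([i + 1 for i in idx])   # 1-based, like TI lists
        # find the rightmost counter that has not reached its upper bound
        pos = length - 1
        while pos >= 0 and idx[pos] == n - length + pos:
            pos -= 1
        if pos < 0:
            return results            # every combination has been visited
        idx[pos] += 1
        for j in range(pos + 1, length):
            idx[j] = idx[j - 1] + 1   # reset the counters to the right
```

The upper bound n - length + pos on each counter plays the same role as the per-level FOR limits in the nested loops, and the list idx stands in for the separate loop variables.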

Resources