To emulate a simple loop like this:
start = something;
incr = something_else;
end = yet_something_else; /* all three are numerical values, int or float */
while (start <= end) {
/* do something for its side effect, for example: */
printf("%d %d\n", start, start*start);
start += incr;
}
I could write either:
loop1(Start, End, _Incr) :-
Start > End, !. % yes, the cut is necessary!
loop1(Start, End, Incr) :-
Start =< End,
/* do something for its side effect, for example */
Square is Start*Start, % ~d needs an evaluated integer, not the term Start*Start
format('~d ~d~n', [Start, Square]),
Next is Start + Incr,
loop1(Next, End, Incr).
or:
loop2(Start, End, Incr) :-
( Start =< End
-> Square is Start*Start,
format('~d ~d~n', [Start, Square]),
Next is Start + Incr,
loop2(Next, End, Incr)
; true
).
loop/3 must be (and always will be) called with all arguments instantiated to numbers.
I should be using the second version, right? The only reason I have any doubt is that the if-then-else construct is pretty much absent from introductory Prolog material, and I can't figure out why (Learn Prolog Now!, for example, otherwise a good introduction, doesn't even mention it!). At the same time, cuts are flying haphazardly every which way.
Thanks for the help!
My preferred way, which resembles structured programming, is between/3 coupled with forall/2.
?- forall(between(1,3,N), writeln(N)).
Here is an 'applicative' example, from the ICLP2013 contest:
icecream(N) :-
loop(N, top(N)),
left, loop(N+1, center), nl,
loop(N+1, bottom(N)).
:- meta_predicate loop(+, 1).
loop(XH, PR) :-
H is XH,
forall(between(1, H, I), call(PR, I)).
top(N, I) :-
left, spc(N-I+1), pop,
( I > 1
-> pop,
spc(2*(I-2)),
pcl
; true
),
pcl, nl.
bottom(N, I) :-
left, spc(I-1), put(\), spc(2*(N-I+1)), put(/), nl.
center(_) :- put(/), put(\).
left :- spc(4).
pop :- put(0'().
pcl :- put(0')).
spc(Ex) :- V is Ex, forall(between(1, V, _), put(0' )).
yields
2 ?- [icecream].
% icecream compiled 0.00 sec, 10 clauses
true.
3 ?- icecream(5).
()
(())
(( ))
(( ))
(( ))
/\/\/\/\/\/\
\ /
\ /
\ /
\ /
\ /
\/
true.
I don't know why they don't mention it. All practical programmers use it.
But we can avoid cut/if-then-else if we rewrite your code as a failure-driven loop.
loop(From, To, Incr, Val) :-
From =< To,
( Val = From
; Next is From + Incr,
loop(Next, To, Incr, Val)
).
print_squares(Start, End, Incr) :-
loop(Start, End, Incr, Val),
Square is Val * Val,
format('~d ~d~n', [Val, Square]),
fail
;
true.
In the case Incr = 1, you can use between/3 from the standard library:
print_squares(Start, End) :-
between(Start, End, Val),
Square is Val * Val,
format('~d ~d~n', [Val, Square]),
fail
;
true.
If you know Russian or can translate it, I can recommend my book http://sourceforge.net/projects/uranium-test/files/prolog/speed_prolog.pdf/download as introductory material for Prolog.
Probably a better way to enumerate a sequence of (float) numbers:
sequence(First, Step, Last, R) :-
D is Last - First,
sign(Step) =:= sign(D),
N is floor(D / Step),
between(0, N, X),
R is First + X * Step.
One of the virtues of this solution is that it does not accumulate a floating point error like Next is This + Step.
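The floating point claim above can be checked with a minimal Python sketch. The function names `accumulate` and `scaled` are made up for this illustration, and a positive step is assumed; `scaled` mirrors the index-based idea of sequence/4, while `accumulate` mirrors Next is This + Step.

```python
import math

def accumulate(first, step, last):
    """Repeated addition: every += can add a little rounding error."""
    values = []
    value = first
    while value <= last:
        values.append(value)
        value += step
    return values

def scaled(first, step, last):
    """Index-based: each value is computed directly from an integer index."""
    count = math.floor((last - first) / step)
    return [first + i * step for i in range(count + 1)]

# Assumes step > 0; compare the last element of each sequence.
print(accumulate(0.0, 0.1, 1.0)[-1])
print(scaled(0.0, 0.1, 1.0)[-1])
```

With these inputs the repeated additions drift slightly below 1.0, while the index-based version reaches 1.0 exactly.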
I've made a function to find a color within an image and return x, y. Now I need to add a new function that finds a color within a given tolerance. Should be easy?
Code to find color in image, and return x, y:
from PIL import ImageGrab

def FindColorIn(r,g,b, xmin, xmax, ymin, ymax):
    image = ImageGrab.grab()
    for x in range(xmin, xmax):
        for y in range(ymin,ymax):
            px = image.getpixel((x, y))
            if px[0] == r and px[1] == g and px[2] == b:
                return x, y

def FindColor(r,g,b):
    image = ImageGrab.grab()
    size = image.size
    pos = FindColorIn(r,g,b, 1, size[0], 1, size[1])
    return pos
Outcome:
Taken from the answers the normal methods of comparing two colors are in Euclidean distance, or Chebyshev distance.
I decided to mostly use (squared) euclidean distance, and multiple different color-spaces. LAB, deltaE (LCH), XYZ, HSL, and RGB. In my code, most color-spaces use squared euclidean distance to compute the difference.
For example with LAB, RGB and XYZ a simple squared euc. distance does the trick:
if ((X-X1)^2 + (Y-Y1)^2 + (Z-Z1)^2) <= (Tol^2) then
...
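In Python, the same squared Euclidean check might look like the sketch below. The name `within_tolerance` is hypothetical, and the function works on any pair of equal-length channel tuples (RGB, LAB, XYZ); it is an illustration of the comparison above, not the code I actually use.

```python
def within_tolerance(c1, c2, tol):
    """True when the squared Euclidean distance between c1 and c2
    is at most tol squared (no square root needed)."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(c1, c2))
    return dist_sq <= tol ** 2

print(within_tolerance((255, 0, 0), (250, 3, 0), 6))  # 25 + 9 = 34 <= 36
print(within_tolerance((255, 0, 0), (250, 4, 0), 6))  # 25 + 16 = 41 > 36
```

Comparing squared values keeps the test cheap, which matters when it runs once per pixel.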
LCH and HSL are a little more complicated, as both have a cylindrical hue, but a bit of math solves that, and then squared Euclidean distance works here as well.
In most of these cases I've added separate tolerance parameters for each channel (using one global tolerance and per-channel "modifiers": HueTol := Tolerance * hueMod or LightTol := Tolerance * LightMod).
It seems that colorspaces built on top of XYZ (LAB, LCH) perform best in many of my scenarios, though HSL yields very good results in some cases and is much cheaper to convert to from RGB. RGB is also great, though, and fills most of my needs.
Computing distances between RGB colours in a way that's meaningful to the eye isn't as easy as just taking the Euclidean distance between the two RGB vectors.
There is an interesting article about this here: http://www.compuphase.com/cmetric.htm
The example implementation in C is this:
typedef struct {
unsigned char r, g, b;
} RGB;
double ColourDistance(RGB e1, RGB e2)
{
long rmean = ( (long)e1.r + (long)e2.r ) / 2;
long r = (long)e1.r - (long)e2.r;
long g = (long)e1.g - (long)e2.g;
long b = (long)e1.b - (long)e2.b;
return sqrt((((512+rmean)*r*r)>>8) + 4*g*g + (((767-rmean)*b*b)>>8));
}
It shouldn't be too difficult to port to Python.
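A direct port might look like the following sketch. It assumes 8-bit channels (0 to 255) passed as `(r, g, b)` tuples; the name `colour_distance` is chosen here for illustration and is not from the article.

```python
import math

def colour_distance(e1, e2):
    """Weighted RGB distance, ported line by line from the C version.
    e1 and e2 are (r, g, b) tuples of integers in 0..255."""
    rmean = (e1[0] + e2[0]) // 2
    r = e1[0] - e2[0]
    g = e1[1] - e2[1]
    b = e1[2] - e2[2]
    return math.sqrt((((512 + rmean) * r * r) >> 8)
                     + 4 * g * g
                     + (((767 - rmean) * b * b) >> 8))

print(colour_distance((0, 255, 0), (0, 0, 0)))  # 510.0
```

Python integers do not overflow, so the `long` casts of the C version are not needed.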
EDIT:
Alternatively, as suggested in this answer, you could use HLS and HSV. The colorsys module seems to have functions to make the conversion from RGB. Its documentation also links to these pages, which are worth reading to understand why RGB Euclidean distance doesn't really work:
http://www.poynton.com/ColorFAQ.html
http://www.cambridgeincolour.com/tutorials/color-space-conversion.htm
EDIT 2:
According to this answer, this library should be useful: http://code.google.com/p/python-colormath/
Here is an optimized Python version adapted from Bruno's answer:
def ColorDistance(rgb1,rgb2):
    '''d = {} distance between two colors(3)'''
    rm = 0.5*(rgb1[0]+rgb2[0])
    d = sum((2+rm,4,3-rm)*(rgb1-rgb2)**2)**0.5
    return d
usage:
>>> import numpy
>>> rgb1 = numpy.array([1,1,0])
>>> rgb2 = numpy.array([0,0,0])
>>> ColorDistance(rgb1,rgb2)
2.5495097567963922
Instead of this:
if px[0] == r and px[1] == g and px[2] == b:
Try this:
if max(map(lambda a,b: abs(a-b), px, (r,g,b))) < tolerance:
Where tolerance is the maximum difference you're willing to accept in any of the color channels.
What it does is to subtract each channel from your target values, take the absolute values, then the max of those.
Assuming that rtol, gtol, and btol are the tolerances for r,g, and b respectively, why not do:
if abs(px[0]- r) <= rtol and \
   abs(px[1]- g) <= gtol and \
   abs(px[2]- b) <= btol:
    return x, y
Here's a vectorised Python (numpy) version of Bruno and Developer's answers (i.e. an implementation of the approximation derived here) that accepts a pair of numpy arrays of shape (x, 3) where individual rows are in [R, G, B] order and individual colour values ∈[0, 1].
You can reduce it to a two-liner at the expense of readability. I'm not entirely sure whether it's the most optimised version possible, but it should be good enough.
import numpy as np

def colour_dist(fst, snd):
    rm = 0.5 * (fst[:, 0] + snd[:, 0])
    drgb = (fst - snd) ** 2
    t = np.array([2 + rm, 4 + 0 * rm, 3 - rm]).T
    return np.sqrt(np.sum(t * drgb, 1))
It was evaluated against Developer's per-element version above, and produces the same results (save for floating precision errors in two cases out of one thousand).
A cleaner Python implementation of the function stated here. The function takes two image paths, reads them using cv.imread, and outputs a matrix in which each cell holds the colour difference at that pixel. You can easily change it to compare just two colours.
import numpy as np
import cv2 as cv

def col_diff(img1, img2):
    # OpenCV reads as B, G, R; convert to float so uint8 arithmetic cannot wrap around
    img_bgr1 = cv.imread(img1).astype(np.float64)
    img_bgr2 = cv.imread(img2).astype(np.float64)
    r_m = 0.5 * (img_bgr1[:, :, 2] + img_bgr2[:, :, 2])
    delta_rgb = np.square(img_bgr1 - img_bgr2)
    cols_diffs = (delta_rgb[:, :, 2] * (2 + r_m / 256) + delta_rgb[:, :, 1] * 4 +
                  delta_rgb[:, :, 0] * (2 + (255 - r_m) / 256))
    cols_diffs = np.sqrt(cols_diffs)
    # normalize the values to the range [0, 1]
    cols_diffs_min = np.min(cols_diffs)
    cols_diffs_max = np.max(cols_diffs)
    cols_diffs_normalized = (cols_diffs - cols_diffs_min) / (cols_diffs_max - cols_diffs_min)
    return np.sqrt(cols_diffs_normalized)
Simple:
def eq_with_tolerance(a, b, t):
    return a-t <= b <= a+t

def FindColorIn(r,g,b, xmin, xmax, ymin, ymax, tolerance=0):
    image = ImageGrab.grab()
    for x in range(xmin, xmax):
        for y in range(ymin,ymax):
            px = image.getpixel((x, y))
            if eq_with_tolerance(r, px[0], tolerance) and eq_with_tolerance(g, px[1], tolerance) and eq_with_tolerance(b, px[2], tolerance):
                return x, y
From the pyautogui source code:
def pixelMatchesColor(x, y, expectedRGBColor, tolerance=0):
    r, g, b = screenshot().getpixel((x, y))
    exR, exG, exB = expectedRGBColor
    return (abs(r - exR) <= tolerance) and (abs(g - exG) <= tolerance) and (abs(b - exB) <= tolerance)
You just need a small fix and you're ready to go.
Here is a simple function that does not require any libraries:
def color_distance(rgb1, rgb2):
    rm = 0.5 * (rgb1[0] + rgb2[0])
    # the weights multiply the squared channel differences
    rd = (2 + rm) * (rgb1[0] - rgb2[0]) ** 2
    gd = 4 * (rgb1[1] - rgb2[1]) ** 2
    bd = (3 - rm) * (rgb1[2] - rgb2[2]) ** 2
    return (rd + gd + bd) ** 0.5
assuming that rgb1 and rgb2 are RGB tuples with channel values in [0, 1].
I solved a puzzle in C and tried to do the same in Prolog, but I'm having some trouble expressing the facts and goals in this language.
The very simplified version of the problem is this: there are two levers in a room. Each lever controls a mechanism that can move either forward or backward through four different positions (which I noted 0, 1, 2 and 3). If you move a mechanism four times in the same direction, it'll be in the same position as before.
Lever n°1 moves mechanism n°1 two positions forward.
Lever n°2 moves mechanism n°2 one position forward.
Initially, mechanism n°1 is in position 2 and mechanism n°2 is in position 1.
The problem is to find the quickest way to move both mechanisms to position 0 and to get the sequence of levers that leads to each solution.
Of course, here the problem is trivial: you only need to pull lever n°1 once and lever n°2 three times to have a solution.
Here's some simple code in C which gives the sequences of levers to pull to solve this problem by pulling fewer than 5 levers:
#include <stdio.h>

int pos1 = 2, pos2 = 1;

void resolve(int l, int k);

int main()
{
    resolve(0,5);
    return 0;
}

void lever1(){
    pos1 = (pos1 + 2) % 4;
}

void undolever1(){
    pos1 = (pos1 + 2) % 4; /* +2 rather than -2: % can yield a negative value in C */
}

void lever2(){
    pos2 = (pos2 + 1) % 4;
}

void undolever2(){
    pos2 = (pos2 + 3) % 4; /* +3 rather than -1, for the same reason */
}

void resolve(int l, int k){
    if(k == 0){
        return;
    }
    if(pos1 == 0 && pos2 == 0){
        printf("Solution: %d\n", l);
        return;
    }
    if(k>0){
        k--;
        lever1();
        resolve(l*10+1,k);
        undolever1();
        lever2();
        resolve(l*10+2,k);
        undolever2();
    }
}
My code in Prolog looks like this so far:
lever(l1).
lever(l2).
mechanism(m1).
mechanism(m2).
position(m1,2).
position(m2,1).
pullL1() :- position(m1, mod(position(m1,X)+2,4)).
pullL2() :- position(m2, mod(position(m2,X)+1,4)).
solve(k) :- solve_(k, []).
solve_(0, r) :- !, postion(m1, p1), postion(m2, p2), p1 == 0, p2 == 0.
solve_(k, r) :- k > 0, pullL1(), k1 is k - 1, append(r, [1], r1), solve_(k1, r1).
solve_(k, r) :- k > 0, pullL2(), k1 is k - 1, append(r, [2], r2), solve_(k1, r2).
I'm pretty sure there's multiple problems in this code but I'm not sure how to fix it.
Any help would be really appreciated.
I think this is a very interesting problem. I suppose you want a general solution, where one lever can move multiple mechanisms. In a case like yours, where each lever controls only one mechanism, the solution is trivial: you just pull every lever as many times as it takes for its mechanism to reach state zero.
But I want to provide a more general solution, where one lever can move multiple mechanisms. First a little bit of math. Don't worry, I'll end up doing an example too.
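Let's define
$L_1, \dots, L_n$
as being n levers and
$M_1, \dots, M_m$
being m mechanisms. Then let's define every lever by a vector:
$L_i = (l_{i1}, \dots, l_{im})$
where $l_{ij}$ is the number of steps $L_i$ moves $M_j$ forward.
For the mechanisms we define:
$b = (b_1, \dots, b_m)$
being the bias of the mechanisms, so $M_j$ is initially in state $b_j$, and
$s = (s_1, \dots, s_m)$
being the number of states for every mechanism. So now we can describe our whole system like this:
$\sum_{i=1}^{n} x_i l_{ij} + b_j \equiv 0 \pmod{s_j}, \quad j = 1, \dots, m$
where $x_i$ is the number of times we have to activate $L_i$ if we want to set all mechanisms to zero. If you are not familiar with the notation, $a \equiv b \pmod{m}$ just means that a%m = b%m.
$a \equiv b \pmod{m}$ can be rewritten as:
$a = b + k \cdot m$
where k can be any integer. So we can rewrite our system as a system of equations:
$\sum_{i=1}^{n} x_i l_{ij} - k_j s_j + b_j = 0, \quad j = 1, \dots, m$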
Prolog can solve such a system of equations for us.
(There are different methods for solving systems of Diophantine equations; see https://en.wikipedia.org/wiki/Diophantine_equation)
OK, now let's make an example: let's say we have two levers and three mechanisms with 4 states each. The first lever moves M1 one forward and M3 two forward. The second lever moves M2 one forward and M3 one forward. M1 is in state 2. M2 is in state 3. M3 is in state 3. So our equation system looks like this:
$x_1 - 4 k_1 + 2 = 0$
$x_2 - 4 k_2 + 3 = 0$
$2 x_1 + x_2 - 4 k_3 + 3 = 0$
In Prolog we can solve this with the clpfd library.
?- [library(clpfd)].
and then solve like this:
?- X1+(-4)*K1+2 #= 0, 1*X2+(-4)*K2+3 #= 0, 2*X1+X2+(-4)*K3+3 #= 0,Vs = [X1,X2], Vs ins 0..100,label(Vs).
which gives us the solution
Vs = [2, 1]
-> so X1 = 2 and X2 = 1 which is correct. Prolog can give you more solutions.
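The result can be cross-checked by brute force outside Prolog. The sketch below is only an illustration: the function `solve_levers` and its parameters are hypothetical names, and it simply tries every combination of pull counts up to a bound and keeps those that bring every mechanism to state 0.

```python
from itertools import product

def solve_levers(levers, bias, states, max_pulls):
    """Return all pull-count tuples (x_1, ..., x_n), each in 0..max_pulls,
    that leave every mechanism in state 0.
    levers[i][j] = steps lever i moves mechanism j forward."""
    solutions = []
    for pulls in product(range(max_pulls + 1), repeat=len(levers)):
        if all(
            (sum(x * lever[j] for x, lever in zip(pulls, levers)) + bias[j]) % states[j] == 0
            for j in range(len(bias))
        ):
            solutions.append(pulls)
    return solutions

# The example above: two levers, three mechanisms with 4 states each.
levers = [(1, 0, 2), (0, 1, 1)]
bias = (2, 3, 3)
states = (4, 4, 4)
print(solve_levers(levers, bias, states, 3))  # [(2, 1)]
```

Raising `max_pulls` yields the further solutions, which differ from (2, 1) by multiples of 4 in each component.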
Recently I was given a gcd() function written in C, which takes two arguments n and m and computes the GCD of these two numbers using recursion. I have been asked: "How many recursive calls are made by the function if n >= m?" Can anyone provide a solution with an explanation, as I am unable to figure it out?
Here is the source code of the function :
int gcd(int n, int m)
{
if (n%m==0)
return m;
else
n=n%m;
return gcd(m, n);
}
The Euclidean algorithm gives #steps =
T(a, b) = 1 + T(b, r_0) = 2 + T(r_0, r_1) = … = N + T(r_{N-2}, r_{N-1}) = N + 1
where a and b are the inputs, and r_i are the remainders. We used that T(x, 0) = 0.
Running an example on paper would help you get a better grasp of the aforementioned equation:
gcd(1071, 462) is calculated from the equivalent gcd(462, 1071 mod 462) = gcd(462, 147). The latter GCD is calculated from the gcd(147, 462 mod 147) = gcd(147, 21), which in turn is calculated from the gcd(21, 147 mod 21) = gcd(21, 0) = 21
So a = 1071 and b = 462, and we have:
T(a, b) =
1 + T(b, a % b) = 1 + T(b, r_0) = (1)
2 + T(r_0, b % r_0) = 2 + T(r_0, r_1) =
3 + T(r_1, r_0 % r_1) = 3 + T(r_1, r_2) = (2)
3 + T(r_1, 0) =
3 + 0 =
3
which says that we needed to take 3 steps to compute gcd(1071, 462).
(1): notice that the 1 is the step already done before, i.e. T(a, b)
(2): r_2 is equal to 0 in this example
You could run a plethora of examples on paper and see how this unfolds; eventually you will be able to see the pattern, if you don't see it already.
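Instead of paper, a few lines of Python can do the counting. This is a minimal sketch with hypothetical names: `gcd_steps` implements the recurrence T(a, b) = 1 + T(b, a % b) with T(x, 0) = 0, and `gcd_calls` mirrors the C function from the question while counting how many times it is entered.

```python
def gcd_steps(a, b):
    """T(a, b) = 1 + T(b, a % b), with T(x, 0) = 0."""
    return 0 if b == 0 else 1 + gcd_steps(b, a % b)

def gcd_calls(n, m, calls=1):
    """Mirror of the C gcd(); returns (gcd, number of invocations),
    where the count includes the initial, non-recursive call."""
    if n % m == 0:
        return m, calls
    return gcd_calls(m, n % m, calls + 1)

print(gcd_steps(1071, 462))  # 3
print(gcd_calls(1071, 462))  # (21, 3)
```

Since the invocation count includes the first call, the number of recursive calls made by the C function for this input is one less, i.e. 2.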
Note: While #Ian'Abott's comments are also correct, I decided to present this approach, since it's more generic, and can be applied to any similar recursive method.
What I want to achieve when doing divide([1,2], 3, X). is something like:
I should just get all the permutations of the first list, divided over N lists.
X = [[],[],[1,2]] ;
X = [[],[],[2,1]] ;
X = [[],[2],[1]] ;
X = [[],[1],[2]] ;
X = [[],[1,2],[]] ;
X = [[],[2,1],[]] ;
X = [[],[],[2,1]] ;
X = [[],[],[1,2]] ;
X = [[],[1],[2]] ;
X = [[],[2],[1]] ;
X = [[],[2,1],[]] ;
X = [[],[1,2],[]] ;
X = [[2],[],[1]] ;
X = [[2],[1],[]] ;
X = [[1],[],[2]] ;
X = [[1],[2],[]] ;
X = [[1,2],[],[]] ;
X = [[2,1],[],[]] ;
but for some reason, if my list is longer than 2 items, the code below goes into a loop and shows way too much information.
% Divides a list over N sets
divide(_,N,[]) :- N < 1.
divide(Items,1,[Items]).
divide(Items,N,[Selected|Other]) :- N > 1,
sublistPerm(Items,Selected,Rest),
N1 is N-1,
divide(Rest,N1,Other).
the sublistPerm works as it should (you can test it if you want).
% Gets all power sets of a list and permutes them
sublistPerm(Items, Sel, Rest) :- sublist(Items, Temp1, Temp2),
permutation(Temp1, Sel),
permutation(Temp2, Rest).
% Gets all power sets of a list
sublist([], [], []).
sublist([X|XS], YS, [X|ZS]) :- sublist(XS, YS, ZS).
sublist([X|XS], [X|YS], ZS) :- sublist(XS, YS, ZS).
If you make the effort of running the following query, you will see the redundant info that I am getting. I have ABSOLUTELY no idea why it doesn't just terminate, as it should. divide([1,2,3], 3, X).
As you can see in my example, there are no duplicates in the input list. Normally these won't occur, and if they occur, duplicates should be removed.
Thanks to anyone pointing me in the right direction.
There are several issues with your code, looping is none of them. We can set that issue apart very quickly:
?- divide([1,2], 3, X), false.
This terminates. No termination issues with this query.
There are some redundant solutions. But again this is not really an issue. However, what is most problematic is that your relation is incomplete. The minimal example is:
?- divide([1,2], 1, [[2,1]]).
which should succeed but fails. So let's attack this issue first. The fact
divide(Items,1,[Items]).
has to be generalized to cover all permutations.
divide(Items,1,[ItemsP]) :-
permutation(Items, ItemsP).
As for the redundant answers/solutions: the second permutation/2 goal (in sublistPerm/3) is not needed; you can replace it by (=)/2 or rewrite your program accordingly.
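To see what a complete relation without redundant solutions should enumerate, here is a minimal Python sketch for comparison. It is not a translation of the Prolog program: the generator `divide` is a hypothetical helper that assumes distinct items, takes every permutation once, and cuts it into N consecutive (possibly empty) parts.

```python
from itertools import combinations_with_replacement, permutations

def divide(items, n):
    """Yield every way to spread a permutation of items over n ordered lists."""
    for perm in permutations(items):
        # n - 1 cut points split the permutation into n consecutive parts
        for cuts in combinations_with_replacement(range(len(perm) + 1), n - 1):
            bounds = (0, *cuts, len(perm))
            yield [list(perm[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]

results = list(divide([1, 2], 3))
print(len(results))  # 12
```

For [1,2] over 3 lists this gives 12 distinct answers (2 permutations times 6 ways to place the cuts), and for N = 1 it includes [[2,1]], the case the original fact missed.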
I'm running the same OpenCL kernel code on an Intel CPU and on a NVIDIA GPU and the results are wrong on the first but right on the latter; the strange thing is that if I do some seemingly irrelevant changes the output works as expected in both cases.
The goal of the function is to calculate the matrix multiplication between A (triangular) and B (regular), where the position of A in the operation is determined by the value of the variable left. The bug only appears when left is true and when the for loop iterates at least twice.
Here is a fragment of the code; for the sake of clarity, I have omitted some bits that shouldn't affect the result.
__kernel void blas_strmm(int left, int upper, int nota, int unit, int row, int dim, int m, int n,
float alpha, __global const float *a, __global const float *b, __global float *c) {
/* [...] */
int ty = get_local_id(1);
int y = ty + BLOCK_SIZE * get_group_id(1);
int by = y;
__local float Bs[BLOCK_SIZE][BLOCK_SIZE];
/* [...] */
for(int i=start; i<end; i+=BLOCK_SIZE) {
if(left) {
ay = i+ty;
bx = i+tx;
}
else {
ax = i+tx;
by = i+ty;
}
barrier(CLK_LOCAL_MEM_FENCE);
/* [...] (Load As) */
if(bx >= m || by >= n)
Bs[tx][ty] = 0;
else
Bs[tx][ty] = b[bx*n+by];
barrier(CLK_LOCAL_MEM_FENCE);
/* [...] (Calculate Csub) */
}
if(y < n && x < (left ? row : m)) // In bounds
c[x*n+y] = alpha*Csub;
}
Now it gets weird.
As you can see, by always equals y if left is true. I checked (with some printfs, mind you) and left is always true, and the code on the else branch inside the loop is never executed. Nevertheless, if I remove or comment out the by = i+ty line there, the code works. Why? I don't know yet, but I thought it might be something related to by not having the expected value assigned.
My train of thought took me to check if there was ever a discrepancy between by and y, as they should have the same value always; I added a line that checked if by != y but that comparison always returned false, as expected. So I went on and changed the appearance of by for y so the line
if(bx >= m || by >= n)
transformed into
if(bx >= m || y >= n)
and it worked again, even though I'm still using the variable by properly three lines below.
With an open mind I tried some other things and I got to the point that the code works if I add the following line inside the loop, as long as it is situated at any point after the initial if/else and before the if condition that I mentioned just before.
if(y >= n) left = 1;
The code inside (left = 1) can be substituted for anything (a printf, another useless assignation, etc.), but the condition is a bit more restrictive. Here are some examples that make the code output the correct values:
if(y >= n) left = 1;
if(y < n) left = 1;
if(y+1 < n+1) left = 1;
if(n > y) left = 1;
And some that don't work, note that m = n in the particular example that I'm testing:
if(y >= n+1) left = 1;
if(y > n) left = 1;
if(y >= m) left = 1;
/* etc. */
That's the point where I am now. I have added a line that shouldn't affect the program at all but it makes it work. This magic solution is not satisfactory to me and I would like to know what's happening inside my CPU and why.
Just to be sure I'm not forgetting anything, here is the full function code and a gist with example inputs and outputs.
Thank you very much.
Solution
Both users DarkZeros and sharpneli were right about their assumptions: the barriers inside the for loop weren't being hit the right amount of times. In particular, there was a bug involving the very first element of each local group that made it run one iteration less than the rest, provoking an undefined behaviour. It was painfully obvious to see in hindsight.
Thank you all for your answers and time.
Have you checked that the get_local_size always returns the correct value?
You said "In short, the full length of the matrix is divided in local blocks of BLOCK_SIZE and run in parallel; ". Remember that OpenCL allows any concurrency only within a workgroup. So if you call enqueueNDrange with global size of [32,32] and local size of [16,16] it is possible that the first thread block runs from start to finish, then the second one, then third etc. You cannot synchronize between workgroups.
What are your EnqueueNDRange call(s)? Example of the calls required to get your example output would be heavily appreciated (mostly interested in the global and local size arguments).
(I'd ask this in a comment but I am a new user).
Edit (I had an answer, but upon verification it did not hold; I still need more info):
http://multicore.doc.ic.ac.uk/tools/GPUVerify/
By using that I got a complaint that a barrier could be reached by a nonuniform control flow.
It all depends on what values dim, nota and upper get. Could you provide some examples?
I did some testing. Assuming left = 1. nota != upper and dim = 32, row as 16 or 32 or whatnot, still worked and got the following result:
...
gid0: 2 gid1: 0 lid0: 14 lid1: 13 start: 0 end: 32
gid0: 2 gid1: 0 lid0: 14 lid1: 14 start: 0 end: 32
gid0: 2 gid1: 0 lid0: 14 lid1: 15 start: 0 end: 32
gid0: 2 gid1: 0 lid0: 15 lid1: 0 start: 0 end: 48
gid0: 2 gid1: 0 lid0: 15 lid1: 1 start: 0 end: 48
gid0: 2 gid1: 0 lid0: 15 lid1: 2 start: 0 end: 48
...
So if my assumptions about the variable values are even close to correct, you have a barrier divergence issue there. Some threads encounter a barrier which other threads never will. I'm surprised it did not deadlock.
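The divergence can be made concrete by counting loop iterations per work-item on the host side. The Python sketch below is only an illustration with hypothetical names; it assumes BLOCK_SIZE is 16 (the kernel's actual value is not shown here) and uses the start/end values from the log above.

```python
def loop_trip_count(start, end, block_size):
    """Iterations of `for (i = start; i < end; i += block_size)`."""
    if end <= start:
        return 0
    return -(-(end - start) // block_size)  # ceiling division

def barrier_divergence(work_items, block_size):
    """work_items maps a local id to its (start, end) pair within one work-group.
    Returns the per-item trip counts if they differ, otherwise an empty dict."""
    counts = {lid: loop_trip_count(s, e, block_size)
              for lid, (s, e) in work_items.items()}
    return counts if len(set(counts.values())) > 1 else {}

# Two work-items of the same group, taken from the log (lid0, lid1).
group = {(14, 15): (0, 32), (15, 0): (0, 48)}
print(barrier_divergence(group, 16))  # {(14, 15): 2, (15, 0): 3}
```

A non-empty result means some work-items reach the barriers inside the loop more often than others in the same group, which is exactly the undefined situation described above.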
The first thing I see that can fail terribly is that you are using barriers inside a for loop.
If not all the threads enter the for loop the same number of times, then the results are completely undefined. And you clearly state that the problem only occurs if the for loop runs more than once.
Do you ensure this condition?