Prolog, endless loop - loops

I wrote the following code to go through all student IDs, starting from 476 and ending at 520.
schedule_errors(A,B,C):-
Errors is 0,
check_Courses(476,A,B,C,Errors).
check_Courses(X,A,B,C,Errors):-
. .
. .
. .
Y is X+1,
check_Courses(Y,A,B,C,Er).
The problem is that the program keeps running indefinitely, ignoring my exit-loop predicate:
check_Courses(520,A,B,C,Er):-
write('Check complete').
I can't understand what I am doing wrong. I tried a similar, simpler version (just counting to 10) and it works fine:
loop(10):-
write('cd finished').
loop(X):-
write(X), nl,
Y is X+1,
loop(Y).
What am I missing?

One important observation is that loop/1 does not terminate either. You can see this for example as follows:
?- loop(1), false.
1
2
3
...
8
9
cd finished10
11
12
13
14
...
49
50
51
...
32394
32395
...
Note that the textual order in which you state your clauses in Prolog matters.
If you exchange the two clauses of loop/1, then you do not get a single solution, only an endless stream of output:
?- loop(1).
...
42642
42643
...
So, in check_Courses/5, if you put a more specific case after a case that subsumes it, then the textually first clause will always be tried first, and the recursion can continue past 520 before the exit clause is ever considered.
Put simple cases before more complex cases!
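The effect of clause order on the search for the first solution can be sketched outside Prolog. The following is a rough Python model, not Prolog semantics: it only picks the textually first matching clause at each step and ignores backtracking, and the names first_solution and max_steps are hypothetical (max_steps exists only so the sketch stops).

```python
def first_solution(clauses, start, max_steps=100):
    """Follow the textually first matching clause until a stop clause is hit."""
    x = start
    visited = []
    for _ in range(max_steps):
        # Pick the first clause (in list order) whose head matches x.
        kind = next(kind for matches, kind in clauses if matches(x))
        if kind == "stop":
            return visited, True
        visited.append(x)
        x += 1
    # Gave up: the stop clause was never selected.
    return visited, False


stop_at_10 = (lambda x: x == 10, "stop")   # models loop(10)
general = (lambda x: True, "recurse")      # models loop(X)

base_first = first_solution([stop_at_10, general], 1)
general_first = first_solution([general, stop_at_10], 1)
```

With the specific clause first, the model stops at 10; with the general clause first, the specific clause is never selected and the step limit is reached.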

Related

Writing my output into two columns in a text file .txt in C

I've looked around but couldn't find a satisfying solution. Basically, I made a function that calculates the probability distribution of the number of losses x in a portfolio of n credits, and I am trying to write the output to a text file in two columns, where the first column is X (number of defaults) and the second column is P (the density for each number of losses), something like this:
X P
1 0.005
2 0.003
3 0.005
4 0.005
5 0.005
etc.
I've looked around, and people suggested using a minus sign in front of the field width in my %d and %f when using fprintf, but no luck.
Here's a sample of my code and the output it gives me...
Code:
for(i=0;i<d+1;i++)
{
Densite= gsl_ran_binomial_pdf(i,p,d);
fprintf(pF,"%-5d %-20f .\n",i, Densite);
}
Output:
0 0.005921 .
1 0.031161 .
2 0.081182 .
3 0.139576 .
4 0.178143 .
5 0.180018 .
6 0.150015 .
7 0.106026 .
8 0.064871 .
9 0.034901 .
10 0.016716 .
How to remedy?
Thanks in advance! (complete noob that started coding in C like two days ago..)
Did you run the executable program on Windows or Linux? If on Windows, please use \r\n for the newline.
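For comparison, here is a minimal Python sketch of the two-column layout the question describes, with a header row and left-aligned fixed-width fields. It is not the asker's C program: binomial_pmf and format_table are hypothetical helpers, and the binomial density is computed with math.comb instead of GSL.

```python
from math import comb


def binomial_pmf(k, p, n):
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def format_table(p, n):
    """Return the table as a list of lines: a header, then one row per k."""
    lines = [f"{'X':<5} {'P':<10}"]
    for k in range(n + 1):
        # '<5d' and '<10.6f' left-align, like %-5d and %-10f in printf.
        lines.append(f"{k:<5d} {binomial_pmf(k, p, n):<10.6f}")
    # Drop trailing padding so no line ends in spaces.
    return [line.rstrip() for line in lines]


rows = format_table(0.5, 2)
```

Writing "\n".join(rows) to a file opened in text mode lets Python use the platform's line ending.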

How to recognize stored procedure name by using regex? [duplicate]

I have to parse some tables from an ASCII text file. Here's a partial sample:
QSMDRYCELL 11.00 11.10 11.00 11.00 -.90 11 11000 1.212
RECKITTBEN 192.50 209.00 192.50 201.80 5.21 34 2850 5.707
RUPALIINS 150.00 159.00 150.00 156.25 6.29 4 80 .125
SALAMCRST 164.00 164.75 163.00 163.25 -.45 80 8250 13.505
SINGERBD 779.75 779.75 770.00 773.00 -.89 8 95 .735
SONARBAINS 68.00 69.00 67.50 68.00 .74 11 3050 2.077
The table consists of 1 column of text and 8 columns of floating point numbers. I'd like to capture each column via regex.
I'm pretty new to regular expressions. Here's the faulty regex pattern I came up with:
(\S+)\s+(\s+[\d\.\-]+){8}
But the pattern captures only the first and the last columns. RegexBuddy also emits the following warning:
You repeated the capturing group
itself. The group will capture only
the last iteration. Put a capturing
group around the repeated group to
capture all iterations.
I've consulted their help file, but I don't have a clue as to how to solve this.
How can I capture each column separately?
In C# (modified from this example):
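// Requires: using System; using System.Text.RegularExpressions;
string input = "QSMDRYCELL 11.00 11.10 11.00 11.00 -.90 11 11000 1.212";
// The repeated group already consumes the whitespace before each number.
string pattern = @"^(\S+)(\s+[\d.-]+){8}$";
Match match = Regex.Match(input, pattern, RegexOptions.Multiline);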
if (match.Success) {
Console.WriteLine("Matched text: {0}", match.Value);
for (int ctr = 1; ctr < match.Groups.Count; ctr++) {
Console.WriteLine(" Group {0}: {1}", ctr, match.Groups[ctr].Value);
int captureCtr = 0;
foreach (Capture capture in match.Groups[ctr].Captures) {
Console.WriteLine(" Capture {0}: {1}",
captureCtr, capture.Value);
captureCtr++;
}
}
}
Output:
Matched text: QSMDRYCELL 11.00 11.10 11.00 11.00 -.90 11 11000 1.212
...
Group 2: 1.212
Capture 0: 11.00
Capture 1: 11.10
Capture 2: 11.00
...etc.
If you want to know what the warning is appearing for, it's because your capture group matches multiple times (8, as you specified) but the capture variable can only have one value. It is assigned the last value matched.
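The behaviour described by the warning can be reproduced in a few lines. This is a small Python sketch using the first sample row; the variable names are illustrative only, and the pattern omits the doubled \s+ so that it matches single-space input.

```python
import re

line = "QSMDRYCELL 11.00 11.10 11.00 11.00 -.90 11 11000 1.212"

# A repeated capturing group keeps only its final iteration.
repeated = re.match(r"(\S+)(\s+[\d.-]+){8}", line)
last_only = repeated.groups()

# Wrapping the repetition in an outer group keeps the whole run as one string.
wrapped = re.match(r"(\S+)((?:\s+[\d.-]+){8})", line)
whole_run = wrapped.group(2)
```

Here last_only holds the name and only the eighth number, while whole_run holds all eight numbers as one unsplit string.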
As described in question 1313332, retrieving these multiple matches is generally not possible with a regular expression, although .NET and Perl 6 have some support for it.
The warning suggests that you could put another group around the whole set, like this:
(\S+)\s+((\s+[\d\.\-]+){8})
You would then be able to see all the columns, but of course they would not be separated. Because it's generally not possible to capture them separately, the more common intention is to capture all of it, and the warning helps remind you of this.
Unfortunately you need to repeat the (…) 8 times to get each column separately.
^(\S+)\s+([-.\d]+)\s+([-.\d]+)\s+([-.\d]+)\s+([-.\d]+)\s+([-.\d]+)\s+([-.\d]+)\s+([-.\d]+)\s+([-.\d]+)$
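If the pattern is built in code, the eight repeated groups do not have to be typed out by hand. A minimal Python sketch, assuming the table is available as a string (the names NUMBER, ROW and text are illustrative):

```python
import re

# One numeric column: leading whitespace, then a captured number.
NUMBER = r"\s+([-.\d]+)"

# Repeating the string (not the group) yields eight separate capture groups.
ROW = re.compile(r"^(\S+)" + NUMBER * 8 + r"$", re.M)

text = (
    "QSMDRYCELL 11.00 11.10 11.00 11.00 -.90 11 11000 1.212\n"
    "RECKITTBEN 192.50 209.00 192.50 201.80 5.21 34 2850 5.707"
)

# Each match yields a 9-tuple: the name followed by the eight numbers.
rows = [match.groups() for match in ROW.finditer(text)]
```

The compiled pattern is the same as the long one above; only its construction differs.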
If code is possible, you can first match those numeric columns as a whole
>>> rx1 = re.compile(r'^(\S+)\s+((?:[-.\d]+\s+){7}[-.\d]+)$', re.M)
>>> allres = rx1.findall(theAsciiText)
then split the columns by spaces
>>> [[p] + q.split() for p, q in allres]

Data Entry in SAS using Loops

I just learned about the "do" loop today and would like to try using it for data entry in SAS. I have tried most examples online, but I still cannot figure it out.
My dataset is from an experiment with 6 treatments (1 to 6) using 2 sets of cues, Visual and Audio, with 3 treatments each. Lag is measured in seconds (5, 10, and 15), and there are 2 sets of these lags.
Basically it looks like this:
Table
The entries I want are:
1. Obs_no, ranging from 1 to 18 (total of 18 observations, this allows me to easily delete outliers with an IF THEN)
2. Treatment type, which are Auditory and Visual.
3. Treatment number, 1 to 6, 3 sets.
4. Lag, 5, 10 or 15.
5. And the data itself
So far, my code makes 2 and 5 possible, it also makes the rest possible with an IF THEN statement and input statement, although I assume there's a way easier method:
data AVCue;
do cue = 'Auditory','Visual';
do i = 1 to 3;
input AVCue @@;
output;
end;
end;
datalines;
.204 .167 .202 .257 .283 .256
.170 .182 .198 .279 .235 .281
.181 .187 .236 .269 .260 .258
;
Lag and the rest was made possible using an IF THEN statement and the crude method of input:
data AVCue;
set AVCue;
IF i=1 THEN Lag=5;
IF i=2 THEN Lag=10;
IF i=3 THEN Lag=15;
input obs_no treatment;
cards;
1 1
2 2
3 3
4 4
5 5
6 6
7 1
8 2
9 3
10 4
11 5
12 6
13 1
14 2
15 3
16 4
17 5
18 6
;
proc print data=AVCue;
run;
The IF THEN should be fine, but the input statement here is, in my opinion, counterproductive and defeats the purpose of using loops, which to me is to save time. If done this way, I might as well just put the data into Excel and import it, or type everything out with ample copying and pasting of the text in the
input obs_no treatment;
cards;
section.
My coding knowledge is basic, so sorry if this question sounds silly, I want to know:
1. How would I make a list of numbers using the "do" loops in SAS? I've made several attempts and all I get is a list containing the next number. I know why this happens, the loop counts to x and the value assigned would just be x. I just don't know how to get around that. Somehow this didn't happen in the datalines section, I guess SAS knows there's 18 numbers and the entry i is stored accordingly... or something?
2. How would I go about assigning in this case, the numbers 1 to 6 to each entry?
Thanks!
It is certainly much easier to read in the actual dataset instead of having to impute some of the variables based on the order the values have in the source data. You might be able to combine a SET statement and an INPUT statement in the same data step and get it to work, but it is probably NOT worth the effort. Just make two datasets and merge them.
Looking at the photograph you posted it looks like TREATMENT is not an independent variable. Instead it is just a label for the combination of CUE and LAG. To make it cycle from 1 to 6 just reset it back to 1 when it gets too large.
data AVCue;
do cue = 'Auditory','Visual';
do lag= 5, 10, 15 ;
treatment+1;
if treatment=7 then treatment=1;
obsno+1;
input AVCue @@;
output;
end;
end;
datalines;
.204 .167 .202 .257 .283 .256
.170 .182 .198 .279 .235 .281
.181 .187 .236 .269 .260 .258
;
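To check what the data step above is expected to produce, the same bookkeeping can be sketched in Python. This is only a model of the SAS logic, not SAS itself; the names design and rows are illustrative, and the values are the 18 numbers from the datalines read left to right.

```python
from itertools import product

values = [
    .204, .167, .202, .257, .283, .256,
    .170, .182, .198, .279, .235, .281,
    .181, .187, .236, .269, .260, .258,
]

# The six cue/lag combinations in nested-loop order; treatment is their label.
design = list(product(["Auditory", "Visual"], [5, 10, 15]))

rows = []
for index, value in enumerate(values):
    cue, lag = design[index % len(design)]
    rows.append({
        "obs_no": index + 1,                    # runs 1..18
        "treatment": index % len(design) + 1,   # cycles 1..6
        "cue": cue,
        "lag": lag,
        "value": value,
    })
```

The modulo plays the role of resetting treatment to 1 once it passes 6.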
You can get in trouble if you just let SAS guess at how you want to define your variables. For example, if you change the order of the CUE values to do cue = 'Visual','Auditory'; then SAS will make CUE with length $6 instead of $8, which truncates 'Auditory'. Add a LENGTH statement to define your variables before you use them.
length obsno 8 treatment 8 cue $8 lag 8 AVCue 8 ;
This will also let you control the order they are created in the dataset.
If you really did already have a SAS dataset and you wanted to add a variable like TREATMENT that cycles from 1 to 6 (or really any DO loop construct), then you could nest the SET statement inside the DO loop. Just remember to add the explicit OUTPUT statement.
data new ;
do treatment=1 to 6 ;
set old;
output;
end;
run;

SAS looped macro variables resolving incorrectly

Good afternoon.
I am writing a SAS program that will loop through several sets of time-series/ observations. For each set, there is one observation per month, with roughly 450 observations/months total. For simplicity's sake, the months start at 1 and move sequentially.
Now, for each set of observations I have an additional set of variables to be employed. I am importing an auxiliary data set that contains these variables for all of the sets, and using the &&var&i. structure to assign each observation's variables a unique macro variable to be called during the execution of the main loop. So, for example, all of the variables in the first observation have a "1" concatenated onto their variable name, second observation variables have a "2," and so on. When the main loop goes through its first iteration and calls &&var&i., it will resolve to &var1 and pull in the value assigned from the first observation in the auxiliary data set. I have tested this, and it is working fine.
Important note: each observation in the auxiliary set has a series of variables called ratio_1, ratio_2, ... up to ratio_9. After passing through the macro assignment above, they would assume macro names of ratio_11, ratio_21... for the first set, and ratio_12, ratio_22,... and so on for subsequent sets.
My problem arises when I try to insert code that is only supposed to occur at very specific time intervals within each set. Each set has a variable initial_check that determines on which month this code should begin executing. This code should then execute on each observation that occurs in 12-month increments. So, for example, set 1 might have an initial_check value of 36, meaning that the code will only execute for the observation on month 35 (see code below), with subsequent executions on months 47, 59, 71, and so on.
The first line of code is meant to determine that the code that follows only executes at the aforementioned intervals (the rem_var checks for the remainder of the difference between the current month and the initial_check, over 12 - if there is no remainder, then a multiple of 12 months has passed) :
if mon >= %eval(&&initial_check&k -1) and rem_var = 0 and mon < &&term&k. then do ;
I have run that code in isolation to check that each of its parameters is doing what it should, and it appears to be working correctly. The following code comes next:
** Iterate ratios **
if mon = %eval(&&initial_check&k. -1) then call symput('j',1) ;
else if mon = %eval(&&initial_check&k. +11) then call symput('j',2) ;
else if mon = %eval(&&initial_check&k. +23) then call symput('j',3) ;
else if mon = %eval(&&initial_check&k. +35) then call symput('j',4) ;
else if mon = %eval(&&initial_check&k. +47) then call symput('j',5) ;
else if mon = %eval(&&initial_check&k. +59) then call symput('j',6) ;
else if mon = %eval(&&initial_check&k. +71) then call symput('j',7) ;
else if mon = %eval(&&initial_check&k. +83) then call symput('j',8) ;
else if mon = %eval(&&initial_check&k. +95) then call symput('j',9) ;
end ;
Again, I have tested this using non-macro language (that is, assigning the values to regular variable j), and this also appears to be working. Unfortunately, even with the "mprint" option on, I can't see if the macro variable is being properly assigned. Following that, I have additional code that is only supposed to execute if that first condition was met.
if &&ratio_&j&k ne 0 then do ;
And HERE is the issue: I'm getting a note that macro variable j is unresolved.
This code is only supposed to execute in an instance in which &j has been defined, so I can't figure out why it is unresolved. That &&ratio_&j&k is supposed to resolve to &ratio_11 in month 35, &ratio_21 in month 47, and so on for the first loop of the broader program.
I have tried experimenting with the macro versions of the conditional logic (%IF, %THEN, %DO), but have so far failed to get the results I want.
Would anyone happen to have any insight? I'm at my wit's end. I will be following this thread, so I can add details where necessary. And thank you in advance for taking the time to read this.
We need more information. You cannot include the last two blocks of code in the same data step since the data step will use the value of the macro variable J that exists when the data step is compiled and not the one generated by the call symput() function.
Why isn't J just a data step variable?
If it is a macro variable and you want to use the value that call symput() created then you need to use symget() (or symgetn()) to retrieve it at run time. You can then use its value to generate the name of the macro variable that you actually want to reference.
if symgetn(cats('ratio_',symgetn('j'),"&k")) ne 0 then do ;
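The timing issue can be illustrated with a toy model. The Python sketch below is only an analogy and not SAS: the symbols dictionary stands in for the macro symbol table, and symput, symgetn and expand_at_compile_time are hypothetical stand-ins with made-up ratio values.

```python
# Toy macro symbol table; the ratio values are invented for illustration.
symbols = {"k": "1", "ratio_11": "0.25", "ratio_21": "0.5"}


def symput(name, value):
    """Store a value in the table while the step is running."""
    symbols[name] = str(value)


def symgetn(name):
    """Look a value up in the table while the step is running."""
    return float(symbols[name])


def expand_at_compile_time(template):
    """Substitute symbols once, before any run-time statement has executed."""
    return template.format(**symbols)


# Compile time: j has not been stored yet, so the reference cannot resolve.
try:
    expand_at_compile_time("ratio_{j}{k}")
    compile_time_result = "resolved"
except KeyError:
    compile_time_result = "j is unresolved"

# Run time: store j first, then build the name and look it up.
symput("j", 1)
name = "ratio_" + str(int(symgetn("j"))) + symbols["k"]
run_time_value = symgetn(name)
```

The text substitution fails because it happens before symput runs, while the run-time lookup sees the stored value.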

Replace values only if they are different

I have a vcf file like this:
http://www.1000genomes.org/node/101
Here's the example from that site:
##fileformat=VCFv4.0
##fileDate=20090805
##source=myImputationProgramV3.1
##reference=1000GenomesPilot-NCBI36
##phasing=partial
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP membership, build 129">
##INFO=<ID=H2,Number=0,Type=Flag,Description="HapMap2 membership">
##FILTER=<ID=q10,Description="Quality below 10">
##FILTER=<ID=s50,Description="Less than 50% of samples have data">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=HQ,Number=2,Type=Integer,Description="Haplotype Quality">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001 NA00002 NA00003
20 14370 rs6054257 G A 29 PASS NS=3;DP=14;AF=0.5;DB;H2 GT:GQ:DP:HQ 0|0:48:1:51,51 1|0:48:8:51,51 1/1:43:5:.,.
20 17330 . T A 3 q10 NS=3;DP=11;AF=0.017 GT:GQ:DP:HQ 0|0:49:3:58,50 0|1:3:5:65,3 0/0:41:3
20 1110696 rs6040355 A G,T 67 PASS NS=2;DP=10;AF=0.333,0.667;AA=T;DB GT:GQ:DP:HQ 1|2:21:6:23,27 2|1:2:0:18,2 2/2:35:4
20 1230237 . T . 47 PASS NS=3;DP=13;AA=T GT:GQ:DP:HQ 0|0:54:7:56,60 0|0:48:4:51,51 0/0:61:2
20 1234567 microsat1 GTCT G,GTACT 50 PASS NS=3;DP=9;AA=G GT:GQ:DP 0/1:35:4 0/2:17:2 1/1:40:3
After the header lines, each line has fields that contain genotypes starting with the 10th field. The 10th field is below the NA0001 heading; the 11th field is genotype NA0002, etc. I have a file with 123 different genotypes, so going from position 10 to 133 (NA0001 until NA0123). What is shown in these fields can be 0/0, 0/1, 0/2 .... till 8/9 for instance. Now I want to replace all the non-equal ones. So I would like to keep 0/0, 1/1, 2/2, etc. And replace 0/1, 0/2, 1/2, 4/5, 4/6 etc by ./.
I would like to write this in a C script. Thought about using sed y/regexp/replacement/ but no idea how to write all those unequal values in a regular expression. And on other positions in the file there could also be these values, so really only positions 10 till 133 should be replaced. And it needs to be replaced; I will be needing the rest of the file with the new values.
Hope it is clear. Anyone any idea how to do this?
This regex should do what you want:
\s(\d)[|\/](?!\1)\d:
Replace each match with " ./.:" (a space, then ./. and a colon).
Breakdown:
\s(\d) matches a space followed by a single digit, capturing the digit in capture group #1
[|\/] matches a pipe or slash (since it seems that the VCF format allows either)
(?!\1)\d uses a negative lookahead to ensure that the next character is not the same as capture group #1, and matches the digit
Caveats:
I matched a leading space and trailing : to try to ensure it matches only the intended values. I couldn't work out a good way to limit it to fields 10 and after.
Example using perl:
perl -pe 's#\s(\d)[|/](?!\1)\d:# ./.:#g' testfile.vcf > testfile_afterchange.vcf
Note: I used # as the delimiter to avoid having to escape the / characters in the regex.
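If a script is an option, limiting the change to fields 10 and after becomes simple once each line is split into fields. A minimal Python sketch, assuming tab-delimited data lines with the genotype as the first colon-separated subfield; GENOTYPE and mask_unequal are hypothetical names:

```python
import re

# Two allele numbers joined by | or /, at the start of a sample field.
GENOTYPE = re.compile(r"^(\d+)([|/])(\d+)(?=:|$)")


def mask_unequal(line):
    """Replace unequal genotypes with ./. in fields 10 and after."""
    line = line.rstrip("\n")
    if line.startswith("#"):
        return line  # header lines are left untouched
    fields = line.split("\t")
    for i in range(9, len(fields)):  # index 9 is the 10th field
        match = GENOTYPE.match(fields[i])
        if match and match.group(1) != match.group(3):
            fields[i] = "./." + fields[i][match.end():]
    return "\t".join(fields)


sample = "\t".join([
    "20", "14370", "rs6054257", "G", "A", "29", "PASS",
    "NS=3;DP=14;AF=0.5;DB;H2", "GT:GQ:DP:HQ",
    "0|0:48:1:51,51", "1|0:48:8:51,51", "1/1:43:5:.,.",
])
masked = mask_unequal(sample).split("\t")
```

Applied line by line, this leaves the first nine fields and all header lines as they were.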
