I ma still a ROOKIE when it comes to shell script. Long story short I am trying the increment the values of the array by one for every iteration. Here is my code
cmd=(1 2 3 4 5 6 7 8 ................) // How can I pass numbers 1 to 1000 with out having to type manually.
${cmd[#]}
for (( i = 0 ; i < ${#cmd[#]} ; i++ )) do
echo ${cmd[$i]}"
done
One approach would be cmd=() and then inside the loop we add the line "let cmd[i]++" , but it didnt work for me. Thanks in advance
Try the seq command
cmd=( $(seq 1 1000) )
If you are running bash you may take advantage of its features.
Try:
cmd=({1..1000})
You can say:
cmd=( $(seq 1000) )
in order to create the array.
Related
I have this small geo location dataset.
37.9636140,23.7261360
37.9440840,23.7001760
37.9637190,23.7258230
37.9901450,23.7298770
From a random location.
For example this one 37.97570, 23.66721
I need to create a bash command with awk that returns the distances with simple euclidean distance.
This is the command i use
awk -v OFMT=%.17g -F',' -v long=37.97570 -v lat=23.66721 '{for (i=1;i<=NR;i++) distances[i]=sqrt(($1 - long)^2 + ($2 - lat)^2 ); a[i]=$1; b[i]=$2} END {for (i in distances) print distances[i], a[i], b[i]}' filename
When I run this command i get this weird result which is not correct, could someone explain to me what am I doing wrong?
➜ awk -v OFMT=%.17g -F',' -v long=37.97570 -v lat=23.66721 '{for (i=1;i<=NR;i++) distances[i]=sqrt(($1 - long)^2 + ($2 - lat)^2 ); a[i]=$1; b[i]=$2} END {for (i in distances) print distances[i], a[i], b[i]}' filename
44,746962127881936 37.9440840 23.7001760
44,746962127881936 37.9901450 23.7298770
44,746962127881936 37.9636140 23.7261360
44,746962127881936
44,746962127881936 37.9637190 23.7258230
Updated.
Appended the command that #jas provided, I included od -c as #mark-fuso suggetsted.
The issue now is that I get different results from #jas
Command output which showcases the new issue.
awk -v OFMT=%.17g -F, -v long=37.97570 -v lat=23.66721 '
{distance=sqrt(($1 - long)^2 + ($2 - lat)^2 ); print distance, $1, $2}
' file
1,1820150904705098 37.9636140 23.7261360
1,1820150904705098 37.9440840 23.7001760
1,1820150904705098 37.9637190 23.7258230
1,1820150904705098 37.9901450 23.7298770
od -c that shows the content of the input file.
od -c file
0000000 3 7 . 9 6 3 6 1 4 0 , 2 3 . 7 2
0000020 6 1 3 6 0 \n 3 7 . 9 4 4 0 8 4 0
0000040 , 2 3 . 7 0 0 1 7 6 0 \n 3 7 . 9
0000060 6 3 7 1 9 0 , 2 3 . 7 2 5 8 2 3
0000100 0 \n 3 7 . 9 9 0 1 4 5 0 , 2 3 .
0000120 7 2 9 8 7 7 0 \n
0000130
While #jas has provided a 'fix' for the problem, thought I'd throw in a few comments about what OP's code is doing ...
Some basics ...
the awk program ({for (i=1;i<=NR;i++) ... ; b[i]=$2}) is applied against each row of the input file
as each row is read from the input file the awk variable NR keeps track of the row number (ie, NR=1 for the first row, NR=2 for the second row, etc)
on the last pass through the for loop the counter (i in this case) will have a value of NR+1 (ie, the i++ is applied on the last pass through the loop thus leaving i=NR+1)
unless there are conditional checks for each line of input the awk program will apply against every line from the input file (including blank lines - more on this below)
for (i in distances)... isn't guaranteed to process the array indices in numerical order
The awk/for loop is doing the following:
for the 1st input row (NR=1) we get for (i=1;i<=1;i++) ...
for the 2nd input row (NR=2) we get for (i=1;i<=2;i++) ...
for the 3rd input row (NR=3) we get for (i=1;i<=3;i++) ...
for the 4th input row (NR=4) we get for (i=1;i<=4;i++) ...
For each row processed by awk the program will overwrite all previous entries in the distance[] array; net result is the last row (NR=4) will place the same values in all 4 entries of the the distance[] array.
The a[i]=$1; b[i]=$2 array assignments occur outside the scope of the for loop so these will be assigned once per input row (ie, will not be overwritten) however, the array assignments are being made with i=NR+1; net result is the contents of the 1st row (NR=1) are stored in array entries a[2] and b[2], the contents of the 2nd row (NR=2) are stored in array entries a[3] and a[3], etc.
Modifying OP's code with print i, distances[i], a[i], b[i]} and running against the 4-line input file I get:
1 0.064310270672728084 # no data for 2nd/3rd columns because a[1] and b[1] are never set
2 0.064310270672728084 37.9636140 23.7261360 # 2nd/3rd columns are from 1st row of input
3 0.064310270672728084 37.9440840 23.7001760 # 2nd/3rd columns are from 2nd row of input
4 0.064310270672728084 37.9637190 23.7258230 # 2nd/3rd columns are from 3rd row of input
From this we can see the first column of output is the same (ie, distance[1]=distance[2]=distance[3]=distance[4]), while the 2nd and 3rd columns are the same as the input columns except they are shifted 'down' by one row.
That leaves us with two outstanding issues ...
why does OP show 5 lines of output?
why is the first column consist of the garbage 44,746962127881936?
I was able to reproduce this issue by adding a blank line on the end of my input file:
$ cat geo.dat
37.9636140,23.7261360
37.9440840,23.7001760
37.9637190,23.7258230
37.9901450,23.7298770
<<=== blank line !!
Which generates the following with OP's awk code:
44.746962127881936
44.746962127881936 37.9636140 23.7261360
44.746962127881936 37.9440840 23.7001760
44.746962127881936 37.9637190 23.7258230
44.746962127881936 37.9901450 23.7298770
NOTES:
this order is different from OP's sample output and is likely due to OP's awk version not processing for (i in distances)... in numerical order; OP can try something like for (i=1;i<=NR;i++)... or for (i=1;i in distances; i++)... (though the latter will not work correcly for a sparsely populated array)
OPs output (in the question; in comment to #jas' answer) shows a comma (,) in place of the period (.) for the first column so I'm guessing OP's env is using a locale that switches the comma/period as thousands/decimal delimiter (though the input data is based on an 'opposite' locale)
Notice we finally get to see the data from the 4th line of input (shifted 'down' and displayed on line 5) but the first column has what appears to be a nonsensical value ... which can be tracked back to applying the following against a blank line:
sqrt(($1 - long)^2 + ($2 - lat)^2 )
sqrt(( - long)^2 + ( - lat)^2 ) # empty line => $1 = $2 = undefined/empty
sqrt(( - 37.97570)^2 + ( - 23.66721^2 )
sqrt( 1442.153790 + 560.136829 )
sqrt( 2002.290619 )
44.746952... # contents of 1st column
To 'fix' this issue the OP can either a) remove the blank line from the input file or b) add some logic to the awk script to only perform calculations if the input line has (numeric) values in fields #1 & #2 (ie, $1 and $2 are not empty); it's up to the coder to decide on how much validation to apply (eg, are the fields numeric, are the fields within the bounds of legitimate long/lat values, etc).
One last design-related comment ... as demonstrated in jas' answer there is no need for any of the arrays (which in turn reduces memory usage) when all desired output can generated 'on-the-fly' while processing each line of the input file.
Awk takes care of the looping for you. The code will be run in turn for each line of the input file:
$ awk -v OFMT=%.17g -F, -v long=37.97570 -v lat=23.66721 '
{distance=sqrt(($1 - long)^2 + ($2 - lat)^2 ); print distance, $1, $2}
' file
0.060152679674309095 37.9636140 23.7261360
0.045676346307474212 37.9440840 23.7001760
0.059824979147508742 37.9637190 23.7258230
0.064310270672728084 37.9901450 23.7298770
EDIT:
OP is getting different results. I notice in OP's output that there are commas instead of decimal points when printing the distance. This points to a possible issue with the locale setting.
OP confirms that the locale was set for greek, causing the difference in output.
I just learned about the "do" loop today and would like to try using it for data entry in SAS. I have tried most examples online, but I still cannot figure it out.
My dataset in an experiment with 6 treatments (1 to 6) using 2 sets of cues, 3 each, Visual and Audio. There's lag measured in seconds, which are 5, 10, and 15, which there are 2 sets.
Basically it looks like this:
Table
The entries I want are:
1. Obs_no, ranging from 1 to 18 (total of 18 observations, this allows me to easily delete outliers with an IF THEN)
2. Treatment type, which are Auditory and Visual.
3.Treatment number, 1 to 6, 3 sets.
4. Lag, 5, 10 or 15.
5. And the data itself
So far, my code makes 2 and 5 possible, it also makes the rest possible with an IF THEN statement and input statement, although I assume there's a way easier method:
data AVCue;
do cue = 'Auditory','Visual';
do i = 1 to 3;
input AVCue ##;
output;
end;
end;
datalines;
.204 .167 .202 .257 .283 .256
.170 .182 .198 .279 .235 .281
.181 .187 .236 .269 .260 .258
;
Lag and the rest was made possible using an IF THEN statement and the crude method of input:
data AVCue;
set AVCue;
IF i=1 THEN Lag=5;
IF i=2 THEN Lag=10;
IF i=3 THEN Lag=15;
input obs_no treatment;
cards;
1 1
2 2
3 3
4 4
5 5
6 6
7 1
8 2
9 3
10 4
11 5
12 6
13 1
14 2
15 3
16 4
17 5
18 6
;
proc print data=AVCue;
run;
The IF THEN should be fine, but the input statement here is just in my opinion counterproductive, and defeats the purpose of using loops, which is to me, to save time. If done this way, I might as well just put the data into excel and import it, or type everything out with ample copy and paste of the text in the
input obs_no treatment;
cards;
section.
My coding knowledge is basic, so sorry if this question sounds silly, I want to know:
1. How would I make a list of numbers using the "do" loops in SAS? I've made several attempts and all I get is a list containing the next number. I know why this happens, the loop counts to x and the value assigned would just be x. I just don't know how to get around that. Somehow this didn't happen in the datalines section, I guess SAS knows there's 18 numbers and the entry i is stored accordingly... or something?
2. How would I go about assigning in this case, the numbers 1 to 6 to each entry?
Thanks!
It is certainly much easier to read in the actual dataset instead of having to impute some of the variables based on the order the values have in the source data. You might be able to combine a SET statement and an INPUT statement in the same data step and get it to work, but it is probably NOT worth the effort. Just make two datasets and merge them.
Looking at the photograph you posted it looks like TREATMENT is not an independent variable. Instead it is just a label for the combination of CUE and LAG. To make it cycle from 1 to 6 just reset it back to 1 when it gets too large.
data AVCue;
do cue = 'Auditory','Visual';
do lag= 5, 10, 15 ;
treatment+1;
if treatment=7 then treatment=1;
obsno+1;
input AVCue ##;
output;
end;
end;
datalines;
.204 .167 .202 .257 .283 .256
.170 .182 .198 .279 .235 .281
.181 .187 .236 .269 .260 .258
;
You can get in trouble if you just let SAS guess at how you want to define your variables. For example if you change the order of the CUE values do cue = 'Visual','Auditory'; then SAS will make CUE with length $5 instead of $8. Add a LENGTH statement to define your variables before you use them.
length obsno 8 treatment 8 cue $8 lag 8 AVCue 8 ;
This will also let you control the order they are created in the dataset.
If you really did already have a SAS dataset and you wanted to add a variable like TREATMENT that cycled from 1 to 6 (or really any DO loop construct) then could nest the SET statement inside the DO loop. Just remember to add the explicit OUTPUT statement.
data new ;
do treatment=1 to 6 ;
set old;
output;
end;
run;
I'm trying to setup a loop that prints 1000 lines in a file, but it's printing 50000 lines. It should be adding 5 to i and 2 to c but it's adding 2 to c 100 times, adding 5 to i once then adding 2 to c another 100 times.
Here's the code,
local file = io.open("output.txt", "w")
for i=60,5055,5 do
for c=0,100,2 do
file:write(" - summon tnt ~" .. c .. " ~" .. c .. " ~" .. c .. " {Fuse:" .. i .. "}\n")
end
end
file:close()
You need one cycle and inside increase your variables as you want:
--
local c = 0
for i=60,5055,5 do
if c<100 then c = c + 2 else c=0 end
-- write
end
such an idea?
Well, your inner loop wil run 50 times for each outer loop, which runs a 1000 times, so 50'000 lines is what you can expect from this code.
The math you have for the two loops works out to 50000 file-writes. The first for loop only completes every time the inner loop completes all 50 lines. So if the first loop is running 1000 times, and the inner is going 50 times, 1000 x 50 = 50000.
You could simply remove the inner loop altogether and have code to print 1000 lines to the output file.
I'm just learning SPSS and I want to do simple subgroup analysis based on a variable "status" I created which can take values from 0 to 8. I would like to print outputs in one go.
this is the pseudocode for what I want to do:
for( i = 1, i = 8, i++)
{
filter by (ststus = i)
display analysis
remove filter
}
That way I can do it all in one go but also i can add to the analysis code and do something easily for the 8 subgroups.
I don't know if it's relevant but here is the code I want to iterate over currently:
USE ALL.
COMPUTE filter_$=(Workforce EQ 1 AND SurveySample = 1 AND State = 1).
VARIABLE LABELS filter_$ 'Workforce EQ 1 (FILTER)'.
> VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'. FORMATS filter_$
> (f1.0). FILTER BY filter_$. EXECUTE.
>
>
> FREQUENCIES VARIABLES = Q86 Q33 Q34 Q88 FSEScore /BARCHART FREQ
> /ORDER=ANALYSIS.
>
> CROSSTABS /TABLES=FSEScore BY Q86 /FORMAT=AVALUE TABLES
> /CELLS=ROW /COUNT ROUND CELL.
>
> FILTER OFF. USE ALL.
Thanks guys.
split file command may solve the problem - it causes your analysis reports to show results for each category of your split variable separately:
*run your transformations.
sort cases by status.
split file by status.
FREQUENCIES .....
CROSSTABS ....
split file off.
If this is not enough, you can use a macro to run through "status" categories:
first define the macro:
define MyMacro ()
!do !ST=1 !to 8
* filter commands using **status = !ST**
* transformations using **status = !ST**
FREQUENCIES .....
CROSSTABS ....
!doend
!enddefine.
now call your macro:
MyMacro .
this is probably a very getto way of doing this, the suggestion above is probably more sensible.
You can initialise Python is spss. The following code works:
begin program.
import spss
for i in xrange(1,8):
string = str(i)
spss.Submit("""
USE ALL.
COMPUTE filter_$=(Workforce EQ 1 AND SurveySample = 1 AND Status = %s).
VARIABLE LABELS filter_$ 'Workforce EQ 1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
#analysis as required
FREQUENCIES VARIABLES = Q86
/BARCHART FREQ
/ORDER=ANALYSIS.
"""%(' '.join(string)) )
end program.
Many thanks to eli-k I probably should have just used splitfile.
What I'm really stuck on is how I'm supposed to make this loop properly I tried doing every combination for each branch but it either never stops or it stops at the wrong number of loops (supposed to stop after amount of values in RPN_START). I even tried what I did in a previous lab which also failed to work so I'm completely lost as to how I'm supposed to get this program to loop. I understand that the RPN_OUT is the end pointer of the array so I tried putting RPN_START with CPY to compare the values but doing that made it stop after one loop. So I don't know at all which branch I'm supposed to use. Any help/tips/advice would be greatly appreciated. Thank you so much!
ORG DATA
;RPN_IN FCB $06,$03,$2F,$04,$2A,$02,$2B ; 63/4*2+=10
;RPN_IN FCB $05,$01,$02,$2B,$04,$2A,$2B,$03,$2D ; 512+4*+3-
;RPN_IN FCB $02,$03,$2A,$05,$2A,$02,$2F,$01, $2B ; ( ( (2 * 3) * 5) / 2) + 1
RPN_IN FCB $11,$10,$2F,$15,$2A ; ( (11 / 10) ) * 15
RPN_OUT RMB 1
RPN_START FDB RPN_IN ; Pointer to start of RPN array
RPN_END FDB RPN_OUT ; Pointer to end of RPN array
ANSWER RMB 1
TEMP FCB $00
ORG PROG
Entry: ; KEEP THIS LABEL!!
LDS #PROG
LDX #RPN_IN
LOOP:
TFR X,A
LDAA X
...
PSHA
RET:
INX
CPX #RPN_OUT
BNE LOOP
PULA
STAA ANSWER