Batch insert heading/newline to ASCII file if value of column changes - batch-file

I have a file similar to this:
A B C
D E C
F G C
A B X
F G X
A B Q
D E Q
That's what I am looking for:
> C
A B C
D E C
F G C
> X
A B X
F G X
> Q
A B Q
D E Q
So far I have a somewhat complicated workaround.
First, using awk to add an empty line:
awk -v i=3 "NR>0 && $i!=p { print "A" }{ p=$i } 1" file.txt
I can't manage to add a ">" directly with awk since it's a newline value. Instead of the "A", awk outputs an empty line. Not really sure why.
Then, using
sed -e "s/^$/>/" file.txt
I manage to insert a ">" into the empty line, but the heading after it is still missing.

sed is for doing s/old/new, that is all. What you are attempting to do is not just s/old/new so you shouldn't be considering using sed, just use awk:
$ awk '$3!=p{print ">", $3; p=$3} 1' file
> C
A B C
D E C
F G C
> X
A B X
F G X
> Q
A B Q
D E Q

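An awk solution, assuming that equal values in the last column are grouped together in your input file (the header is printed only the first time each value is seen):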
awk '!a[$NF]++{ print ">",$NF }1' file
The output:
> C
A B C
D E C
F G C
> X
A B X
F G X
> Q
A B Q
D E Q
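The two answers above behave differently once a value comes back after another one. A minimal sketch of that difference, assuming a made-up three-line sample in which C reappears after X; the function names sample, on_change and on_first_seen are invented for this illustration:

```shell
# Hypothetical sample: the value C returns after X, so the input is not grouped.
sample() { printf '%s\n' 'A B C' 'A B X' 'D E C'; }

# Print a header whenever column 3 differs from the previous line.
on_change() { awk '$3 != p { print ">", $3; p = $3 } 1'; }

# Print a header only the first time each last-column value is seen.
on_first_seen() { awk '!seen[$NF]++ { print ">", $NF } 1'; }

sample | on_change
echo ---
sample | on_first_seen
```

The first variant prints three headers here (C, X, and C again), while the second prints only two, so which one fits depends on whether a returning value should start a new block.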

Could you please try the following as well and let me know if this helps you.
awk 'NR==1{print ">",$3 RS $0;prev=$3;next} prev!=$3{print ">",$3};1; {prev=$3}' Input_file
Output will be as follows.
> C
A B C
D E C
F G C
> X
A B X
F G X
> Q
A B Q
D E Q

Related

Join four columns into one according to each row

A B C D
E F G H
I J K L
M N O P
If I chose to join the columns I would ={A1:A;B1:B;C1:C;D1:D} but it would look like this:
A
E
I
M
B
F
J
N
... and so on
I would like it to look like this:
A
B
C
D
E
F
G
... and so on
How to proceed in this case?
Note: It may happen that some of the columns are not complete in data, some may have more values than the others, but I still want to continue following this same pattern. Example:
A B D
E G H
I J K L
M N O P
Result:
A
B
D
E
G
H
... and so on
use:
=TRANSPOSE(QUERY(TRANSPOSE(A:D),, 9^9))
then:
=TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(A:D),,9^9)),,9^9), " "))

If statement for two different arrays in bash

I have two arrays, arrAlpha[] and arrPT[]. arrAlpha contains the alphabet and arrPT contains some plaintext letters. Below is the bash script I wrote to compare the elements of both arrays and to store the position in arrAlpha of each element of arrPT in a third array, arrT[]. But when I run it, something seems to be wrong, possibly in the if statement, and the elements of arrT[] are not printed as expected. Can anyone help me, please?
#!/bin/bash
arrAlpha=(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)
arrPT=(E K N R S W )
lenPT=${#arrPT}
declare -A arrT
q=0
for((i=0; i<lenPT; i++)) do
for((j=i; j<26; j++)) do
if [ ${arrPT[$i]} = ${arrAlpha[$j]} ]; then
arrT[$q]=$j % 26;
((++q));
fi
done
done
echo ${arrAlpha[@]}
echo ${arrPT[@]}
echo ${arrT[@]}
The expected output maps each element of arrPT to its position in arrAlpha, a number from 0 to 25:
arrAlpha=(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)
arrPT =(E K N R S W)
arrT =(4 10 13 17 18 22)
Here's a fixed version of your script. It also includes some style changes that I prefer:
arrAlpha=(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)
arrPT=(E K N R S W )
# array length needs index *
lenPT=${#arrPT[*]}
# seems arrT can be simple indexed array
declare -a arrT
q=0
for((i=0; i<lenPT; ++i)); do
for((j=0; j<26; ++j)); do
if [[ ${arrPT[i]} == ${arrAlpha[j]} ]]; then
# arithmetic inside $(())
arrT[q++]=$((j % 26))
fi
done
done
echo ${arrAlpha[@]}
echo ${arrPT[@]}
echo ${arrT[@]}
The main bug is with this line:
lenPT=${#arrPT}
which should be written like:
lenPT=${#arrPT[@]}
In fact, there is no need for such a variable (although using it is not wrong). You can change this line:
for((i=0; i<lenPT; i++)) do
to:
for((i=0; i<${#arrPT[@]}; i++)) do
Some other issues:
There is no need to mod 26 the value of $j, it will never reach 26.
The array arrT seems to be an indexed array: declare it with -a, there is no need for -A.
There is no need for the variable q if each result is stored at the same position as the index i.
Please quote your expansions.
Use printf (more reliable) instead of echo.
The script with all the above done is:
#!/bin/bash
arrAlpha=(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)
arrPT=(E K N R S W )
declare -a arrT
for((i=0; i<${#arrPT[@]}; i++)) do
# start j at 0 so that arrPT does not have to be sorted
for((j=0; j<${#arrAlpha[@]}; j++)) do
[[ ${arrPT[i]} == "${arrAlpha[j]}" ]] && let arrT[i]=j
done
done
printf '%s ' "${arrAlpha[@]}" ; echo
printf '%3s ' "${arrPT[@]}" ; echo
printf '%3s ' "${arrT[@]}" ; echo
Of course, the second loop could be removed by cutting that string at the character:
#!/bin/bash
Alpha=ABCDEFGHIJKLMNOPQRSTUVWXYZ
arrPT=(E K N R S W )
declare -a arrT
for((i=0; i<${#arrPT[@]}; i++)) do
arrT[i]=${Alpha%"${arrPT[i]}"*} # cut the string at the character.
arrT[i]=${#arrT[i]} # Use the len of the cut string.
done
printf '%s ' "$Alpha" ; echo
printf '%3s ' "${arrPT[@]}" ; echo
printf '%3s ' "${arrT[@]}" ; echo
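Another way to drop the inner loop, not taken from the answers above, is to build a letter-to-position map once and then look each letter up. A minimal sketch, assuming bash 4 or newer (associative arrays are not available in older versions); letter_index is a name invented for this example:

```shell
#!/bin/bash
# Sketch only: letter_index is a hypothetical name; declare -A needs bash 4+.
arrAlpha=(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)
arrPT=(E K N R S W)

# Build the map once: letter -> position in arrAlpha.
declare -A letter_index
for i in "${!arrAlpha[@]}"; do
  letter_index[${arrAlpha[i]}]=$i
done

# Look up every plaintext letter; the order of arrPT does not matter.
arrT=()
for letter in "${arrPT[@]}"; do
  arrT+=("${letter_index[$letter]}")
done

printf '%s ' "${arrT[@]}"; echo
```

This is the one place where declare -A is actually the right tool: the keys are letters rather than integer positions.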

naming array from an array in GAWK

I have a file with repeating elements. I would like to assign records to an array until the file repeats, at which point I want to create a new array to assign the records to. I would like to do this an arbitrary amount of times.
For example:
$ cat repeat.txt
a
b
c
d
e
f
g
a
b
c
d
e
f
g
a
b
c
d
e
f
g
I want the output to be something like this:
0 a a a
1 b b b
2 c c c
3 d d d
4 e e e
5 f f f
6 g g g
Right now I am doing this with this hideous code:
awk 'BEGIN{n=0;z=0}
$1~"a" {n=0;z++}
z==1{a[n]=$0}
z==2{b[n]=$0}
z==3{c[n]=$0}
z==4{d[n]=$0}
z==5{e[n]=$0}
z==6{f[n]=$0}
{n++}
END{for (i in a)
print i,a[i],b[i],c[i],d[i],e[i],f[i],g[i],h[i],k[i],j[i]}' repeat.txt
I would like the assignment of new arrays to be automatic.
I attempted this with the following:
echo "abcdefghijklmopqrstuvwxyz" > alphabet.txt
awk 'BEGIN{N=0}
NR==FNR{FS=""}
NR==FNR{for (zz=0;zz<=NF;zz++) a[zz]=$zz; next}
NR!=FNR{FS="\t"}
NR!=FNR{if ($0~a) N++; (a[N])[N]=$0}
END{for (I in (a[N])) print I,(a[N])[I]}' alphabet.txt repeat.txt
But this didn't work, because gawk does not accept the (a[N])[N] syntax for multidimensional arrays. I can't think of another way to do this.
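One way to avoid a separately named array per repeat is awk's comma subscript, which simulates a two-dimensional array inside a single array. A minimal sketch, assuming that a new cycle starts whenever the first record of the file reappears; group_repeats and cell are names invented for this illustration:

```shell
# Hypothetical helper: group_repeats FILE
# Prints one line per position in the cycle: index, then the record from each repeat.
group_repeats() {
  awk '
    NR == 1     { first = $0 }          # the record that marks the start of a cycle
    $0 == first { col++; row = 0 }      # seeing it again starts a new column
    {
      cell[row, col] = $0               # comma subscript: one array, two indexes
      if (row + 1 > rows) rows = row + 1
      row++
    }
    END {
      for (r = 0; r < rows; r++) {
        line = r
        for (c = 1; c <= col; c++) line = line " " cell[r, c]
        print line
      }
    }
  ' "$1"
}
```

Called as group_repeats repeat.txt, this should print lines of the form "0 a a a" for the sample file. Looping over r explicitly in the END block also keeps the rows in order, which for (i in a) does not guarantee.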

How would I convert an argument into a string in bash

When I run the script I enter a single argument. I want to store the argument in a variable and access its individual characters. So if I enter $ ./script foo, I should be able to access f, o, and o, and echo $pass[0] should display f. What I am finding instead is that $pass stores the argument as one piece, so echo $pass[0] displays foo.
How do I access the different positions in the string?
#!/bin/bash
all=( 0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z )
pass=$1
max=${#pass}
for (( i=0; i<max; i++ ))
do
for (( n=0; n<10; n++ ))
do
if [ "${pass[$i]}" == ${all[$n]} ]
then
echo true
else
echo false i:$i n:$n pass:${pass[$i]} all:${all[$n]}
fi
done
done
To spell out Etan's comment in the context of this question:
set -- "my password"
chars=()
for ((i=0; i<${#1}; i++)); do chars+=("${1:i:1}"); done
declare -p chars
outputs
declare -a chars='([0]="m" [1]="y" [2]=" " [3]="p" [4]="a" [5]="s" [6]="s" [7]="w" [8]="o" [9]="r" [10]="d")'
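The same substring expansion, ${var:offset:length}, can also be used directly on the variable when a separate array is not needed. A minimal sketch; print_chars is a name invented for this example:

```shell
#!/bin/bash
# Hypothetical helper: print each character of the first argument with its index.
print_chars() {
  local s=$1 i
  for (( i = 0; i < ${#s}; i++ )); do
    # ${s:i:1} is the single character at zero-based offset i
    printf '%d:%s\n' "$i" "${s:i:1}"
  done
}

print_chars foo
```

In the question's loop, this would mean comparing "${pass:i:1}" with "${all[n]}" instead of using "${pass[$i]}".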

Cropping a .ppm file in C

I'm working on a C program that crops .ppm files from a starting point pixel (x,y) (top left corner of cropped image) to an end point pixel (x+w,y+h) (bottom right corner of cropped image).
The data in .ppm files is of the following format:
r g b r g b r g b r g b r g b r g b
r g b r g b r g b r g b r g b r g b
r g b r g b r g b r g b r g b r g b
r g b r g b r g b r g b r g b r g b
Is there a simple way, which avoids the use of 2 dimensional arrays, to do this using scanf()?
One easy way would be to simply keep track of your pixel coordinate as you read the file in. If you're currently in the crop rectangle, store the pixel; otherwise, skip it.
If you want to get more fancy: figure out the byte offset for the start of each row, seek to it, then read in the whole row.
Warning: some PNM files are in binary mode (they are distinguished by the magic number at the beginning of the file).
Maybe looking at the sources of pnmcrop would help?
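The coordinate-tracking idea from the first suggestion can be sketched without any 2-dimensional array. This is an awk illustration of the logic only, not the C/scanf() program the question asks for, and it assumes a plain-text (P3) file with the usual header of magic number, width, height and maximum value, and no comment lines; crop_p3 is a name invented here:

```shell
# Hypothetical helper: crop_p3 X Y W H FILE
# Assumes a plain-text P3 file without comment lines.
crop_p3() {
  awk -v x="$1" -v y="$2" -v w="$3" -v h="$4" '
    {
      for (f = 1; f <= NF; f++) {
        n++                                   # running token count across lines
        if (n == 1) continue                  # magic number, assumed to be P3
        if (n == 2) { width = $f; continue }  # source width
        if (n == 3) continue                  # source height is not needed
        if (n == 4) { printf "P3\n%d %d\n%d\n", w, h, $f; continue }
        k = n - 5                             # zero-based sample index
        px = int(k / 3)                       # three samples (r g b) per pixel
        col = px % width
        row = int(px / width)
        if (col >= x && col < x + w && row >= y && row < y + h) {
          last = (k % 3 == 2 && col == x + w - 1)
          printf "%s%s", $f, (last ? "\n" : " ")
        }
      }
    }
  ' "$5"
}
```

A C version would follow the same shape: read one integer at a time with scanf("%d", ...), derive the column and row from a running counter, and write the value only when it falls inside the crop rectangle.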
