Brace expansion in zsh - how to concatenate two lists for file selection?

In zsh, one can write a brace expansion {x..y}, for example to select files x through y inside a folder.
For example, {1..50} selects items, files, etc. from 1 to 50.
How can I concatenate two brace expansions into one?
Example: I would like to select {1..50} and {60..100} in one and the same command.

You can nest brace expansions, so this will work:
> print {{1..50},{60..100}}
1 2 3 (lots of numbers) 49 50 60 61 (more numbers) 99 100
Brace expansions support lists as well as sequences, and can be included in strings:
> print -l file{A,B,WORD,{R..T}}.txt
fileA.txt
fileB.txt
fileWORD.txt
fileR.txt
fileS.txt
fileT.txt
Note that brace expansions are not glob patterns. The {n..m} expansion will include every value between the start and end values, regardless of whether a file exists by that name. For finding files in folders, the <-> glob expression will usually work better:
> touch 2 3 55 89
> ls -l <1-50> <60-100>
-rw-r--r-- 1 me grp 0 Feb 18 06:52 2
-rw-r--r-- 1 me grp 0 Feb 18 06:52 3
-rw-r--r-- 1 me grp 0 Feb 18 06:52 89
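One caveat when globbing this way (an aside, not from the original answer): by default zsh aborts the whole command if any pattern matches nothing, so the (N) glob qualifier is handy to let an empty range simply disappear instead:
> ls -l <1-50>(N) <60-100>(N)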

Related

Bad substitution error when passing decremented variable to "sed" command

I would like to use the sed command to delete two consecutive lines from a file.
I can delete a single line using the following syntax, where the
variable "index" holds the line number:
sed -i "${index}d" "$PWD$DEBUG_DIR$DEBUG_MENU"
Since I want to delete consecutive lines, I tested this
syntax, which supposedly decrements / increments the variable, but I am getting a bad substitution error.
Addendum
I am not sure where to post this, but during troubleshooting of
this issue I discovered that this syntax does not work as expected:
(( index++ )) nor (( index-- ))
Using sed with "index" for single-line deletion twice in a row / sequentially / does work and resolves a few issues (after the first deletion, the old line index+1 has become line index):
#delete command line
sed -i "${index}d" "$PWD$DEBUG_DIR$DEBUG_MENU"
#delete description line
sed -i "${index}d" "$PWD$DEBUG_DIR$DEBUG_MENU"
sed is the right tool for doing s/old/new, that is all. What you're trying to do isn't that, so why coerce sed into doing something when there are far more appropriate tools that can do the job far more easily?
The right way to do what your script:
#retrieve first matching line number
index=$(sed -n "/$choice/=" "$PWD$DEBUG_DIR$DEBUG_MENU")
#delete matching line plus next line from file
sed -i "/$index[1], (( $index[1]++))/"
"$PWD$DEBUG_DIR$DEBUG_MENU"
seems to be trying to do is this:
awk -v choice="$choice" '$0~choice{skip=2} !(skip&&skip--)' "$PWD$DEBUG_DIR$DEBUG_MENU"
The second clause prints a line only while skip is zero: a match sets skip=2, so the matching line and the line after it are swallowed. For example:
$ seq 20 | awk -v choice="4" '$0~choice{skip=2} !(skip&&skip--)'
1
2
3
6
7
8
9
10
11
12
13
16
17
18
19
20
if you only want to delete the first match:
$ seq 20 | awk -v choice="4" '$0~choice{skip=2;cnt++} !(cnt==1 && skip && skip--)'
1
2
3
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
or the 2nd:
$ seq 20 | awk -v choice="4" '$0~choice{skip=2;cnt++} !(cnt==2 && skip && skip--)'
1
2
3
4
5
6
7
8
9
10
11
12
13
16
17
18
19
20
and to skip 5 lines instead of 2:
$ seq 20 | awk -v choice="4" '$0~choice{skip=5} !(skip&&skip--)'
1
2
3
9
10
11
12
13
19
20
Just use the right tool for the right job, don't go digging holes to plant trees with a teaspoon.
If you just want the first one, then quit when you see it:
sed -n "/$choice/ {=;q}" file
But you look like you're processing this file multiple times. There must be a simpler way to do it, if you can describe your over-arching goal.
For example, if you just want to remove the matched line and the next line, but only the first time, you can use awk: here we see "4" and "5" are gone, but "14" and "15" remain:
$ seq 20 | awk '/4/ && !seen {getline; seen++; next} 1'
1
2
3
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
With GNU sed, you can use +1 as the second address in an address range:
index=4
seq 10 | sed "$index,+1 d"
Or if you want to use bash/ksh arithmetic expansion: use the post-increment operator
seq 10 | sed "$((index++)),$index d"
Note that, because of the pipeline, the expansion happens in a subshell: $((index++)) expands to 4 and $index then expands to 5, giving the address range 4,5, but the increment is confined to that subshell. After the sed command ends, index still has the value 4 in the current shell.
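You can see this by echoing index afterwards (the same commands as above, with one line added):
index=4
seq 10 | sed "$((index++)),$index d"
echo "$index"    # prints 4: the increment happened only in the subshell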

trying to run arbitrary commands and parse their output

Here is part of the code:
scanf("%[^\n]%*c",command);
int pid;
pid=fork();
if (pid == 0) {
// Child process
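// note: the whole input line is passed as a single argv[0]; arguments are not split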
char *argv[] = {command, NULL};
execvp(argv[0], argv);
exit (0);
}
When I give ls as input, I want this output:
1 copy of mysh1.c mysh1.c mysh3.c mysh.c New Folder
a.out helpmanual.desktop mysh2.c mysh4.c New File
and when I give ls -l /tmp,
I expect:
total 12
-rw------- 1 antre antre 0 Nov 4 17:31 config-err-KT9sEZ
drwx------ 2 antre antre 4096 Nov 4 19:21 mozilla_antre0
drwx------ 2 antre antre 4096 Jan 1 1970 orbit-antre
drwx------ 2 antre antre 4096 Nov 4 17:31 ssh-HaOFtKdeIQnQ
but I get:
1 copy of mysh1.c mysh1.c mysh3.c mysh.c New Folder
a.out helpmanual.desktop mysh2.c mysh4.c New File
It seems that you're trying to parse the output of ls -l in a C program for some reason.
That's unlikely to be the “right” thing to do. The usual mechanism is to use opendir and readdir to read the directory directly.
If you have some truly strange situation in which you cannot opendir (the only case that comes to mind is if you're running ls on a remote system, e.g. over ssh), there is a mode in GNU ls specifically for producing an output record format that can be parsed by another program.
From the GNU coreutils info:
10.1.2 What information is listed
‘-D’
‘--dired’
With the long listing (‘-l’) format, print an additional line after
the main output:
//DIRED// BEG1 END1 BEG2 END2 ...
The BEGN and ENDN are unsigned integers that record the byte
position of the beginning and end of each file name in the output.
This makes it easy for Emacs to find the names, even when they
contain unusual characters such as space or newline, without fancy
searching.
If directories are being listed recursively (‘-R’), output a
similar line with offsets for each subdirectory name:
//SUBDIRED// BEG1 END1 ...
Finally, output a line of the form:
//DIRED-OPTIONS// --quoting-style=WORD
where WORD is the quoting style (*note Formatting the file
names::).
Here is an actual example:
$ mkdir -p a/sub/deeper a/sub2
$ touch a/f1 a/f2
$ touch a/sub/deeper/file
$ ls -gloRF --dired a
a:
total 8
-rw-r--r-- 1 0 Jun 10 12:27 f1
-rw-r--r-- 1 0 Jun 10 12:27 f2
drwxr-xr-x 3 4096 Jun 10 12:27 sub/
drwxr-xr-x 2 4096 Jun 10 12:27 sub2/
a/sub:
total 4
drwxr-xr-x 2 4096 Jun 10 12:27 deeper/
a/sub/deeper:
total 0
-rw-r--r-- 1 0 Jun 10 12:27 file
a/sub2:
total 0
//DIRED// 48 50 84 86 120 123 158 162 217 223 282 286
//SUBDIRED// 2 3 167 172 228 240 290 296
//DIRED-OPTIONS// --quoting-style=literal
Note that the pairs of offsets on the ‘//DIRED//’ line above
delimit these names: ‘f1’, ‘f2’, ‘sub’, ‘sub2’, ‘deeper’, ‘file’.
The offsets on the ‘//SUBDIRED//’ line delimit the following
directory names: ‘a’, ‘a/sub’, ‘a/sub/deeper’, ‘a/sub2’.
Here is an example of how to extract the fifth entry name,
‘deeper’, corresponding to the pair of offsets, 222 and 228:
$ ls -gloRF --dired a > out
$ dd bs=1 skip=222 count=6 < out 2>/dev/null; echo
deeper
Note that although the listing above includes a trailing slash for
the ‘deeper’ entry, the offsets select the name without the
trailing slash. However, if you invoke ‘ls’ with ‘--dired’ along
with an option like ‘--escape’ (aka ‘-b’) and operate on a file
whose name contains special characters, notice that the backslash
is included:
$ touch 'a b'
$ ls -blog --dired 'a b'
-rw-r--r-- 1 0 Jun 10 12:28 a\ b
//DIRED// 30 34
//DIRED-OPTIONS// --quoting-style=escape
If you use a quoting style that adds quote marks (e.g.,
‘--quoting-style=c’), then the offsets include the quote marks. So
beware that the user may select the quoting style via the
environment variable ‘QUOTING_STYLE’. Hence, applications using
‘--dired’ should either specify an explicit
‘--quoting-style=literal’ option (aka ‘-N’ or ‘--literal’) on the
command line, or else be prepared to parse the escaped names.
I just needed to use strtok to split the command string into separate arguments before passing them to execvp.

Two arrays with different length in a loop

I have two arrays with different length, and I need to use them in the same loop.
This is the code
#!/bin/bash
data=`date +%Y-%m-%d`
data1=`date -d "1 day" +%Y-%m-%d`
cd /home/test/em_real/
#first array (today and tomorrow)
days="$data $data1"
#second array (00 till 23)
hours="00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23"
for value in $hours
do
cp /home/test/em_real/mps_"${days[i++]}"_"$value":00:00 /home/DOMAINS/test_case/
sleep 10
done
It fails; it doesn't expand days.
How can I do it?
@fedorqui If now, inside the loop, I want to remove the dash (-) from days and run another command, I don't know why it doesn't get the string. The code is the following:
days=("$data" "$data1") #create an array properly
for value in {00..23}; do
for day in "${days[@]}"; do
cp "/path/mps_${day}_${value}:00:00" /another/path/test_case/
d=d01
hourSIMULATION=01
clean= echo ${day} | sed -e 's/-//g'
sed -e 's/<domine>/'$d'/g' -e 's/<data-initial>/'$clean$value'/g' -e 's/<hour-SIMULATION>/'$hourSIMULATION'/g' run_prhours > run_pr
done
done
The string $clean is empty when I check inside run_pr. Do you know what the reason could be?
You are using days[i++] but no i is defined anywhere. Not sure what you want to do with ${days[i++]}; days is not an array here, so $days is just a string containing "$data $data1".
You probably want to say days=($data $data1) to create an array.
Also, you can say for hour in {00..23} instead of being explicit about the numbers.
Then, you want to loop through the hours and then through the days. For this, use a nested loop:
days=("$data" "$data1") #create an array properly
for value in {00..23}; do
for day in "${days[@]}"; do
cp "/path/mps_${day}_${value}:00:00" /another/path/test_case/
done
done
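As for the addendum: clean= echo ${day} | sed -e 's/-//g' does not assign the pipeline's output to clean; it runs echo with an empty clean variable in its environment, which is why the value is empty inside run_pr. A minimal fix (a sketch, not from the original thread; the example value is hypothetical) is command substitution, or bash's built-in replacement:
day=2018-01-15                           # hypothetical example value
clean=$(echo "$day" | sed -e 's/-//g')   # command substitution captures the output
clean=${day//-/}                         # same result with no extra processes
echo "$clean"                            # 20180115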

Same column of different files into the same new file

I have multiple folders Case-1, Case-2....Case-N and they all have a file named PPD. I want to extract all 2nd columns and put them into one file named 123.dat.
It seems that I cannot use awk in a for loop.
case=$1
for (( i = 1; i <= $case ; i ++ ))
do
file=Case-$i
cp $file/PPD temp$i.dat
awk 'FNR==1{f++}{a[f,FNR]=$2}
END{for(x=1;x<=FNR;x++)
{for(y=1;y<ARGC;y++)
printf("%s ",a[y,x]);print ""} }' temp$i.dat >> 123.dat
done
Now 123.dat only has the data of the last PPD, in Case-N.
I know I can use join (I used that command before) if every PPD file has at least one column in common, but it turns out to be extremely slow when there are lots of Case folders.
Maybe
eval paste $(printf ' <(cut -f2 %s)' Case-*/PPD)
There is probably a limit to how many process substitutions you can perform in one go. I did this with 20 columns and it was fine. Process substitutions are a Bash feature, so not portable to other Bourne-compatible shells in general.
The wildcard will be expanded in alphabetical order. If you want the cases in numerical order, maybe use case-[1-9] case-[1-9][0-9] case-[1-9][0-9][0-9] to force the expansion to get the single digits first, then the double digits, etc.
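For example, with three case folders the eval ends up running something like:
paste <(cut -f2 Case-1/PPD) <(cut -f2 Case-2/PPD) <(cut -f2 Case-3/PPD)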
The interaction between the outer shell script and the inner awk invocation isn't working the way you expect.
Every time through the loop, the shell script starts a fresh awk process, so f is unset and the first clause sets it to 1; it never becomes 2, because awk starts from scratch on each iteration of the outer loop.
There are other ways to structure your code, but as a minimal tweak, you can pass in the number $i to the awk invocation using the -v option, e.g. awk -v i="$i" ....
Note that there are better ways to structure your overall solution, as other answerers have already suggested; I meant this response to be an answer the question, "Why doesn't this work?" and not "Please rewrite this code."
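As one illustration of such a restructuring (a sketch built from the question's own program, assuming every PPD file has the same number of lines, since FNR in the END block refers to the last file read): drop the shell loop and hand all the files to a single awk process, so that f really does advance per file:
awk 'FNR==1{f++} {a[f,FNR]=$2}
END{for(x=1;x<=FNR;x++){for(y=1;y<=f;y++) printf("%s ",a[y,x]); print ""}}' Case-*/PPD > 123.dat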
The AWK program below can help you.
#!/usr/bin/awk -f
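# Note: BEGINFILE and ERRNO as used below are GNU awk (gawk) extensions.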
BEGIN {
# Defaults
nrecord=1
nfiles=0
}
BEGINFILE {
# Check if the input file is accessible,
# if not skip the file and print error.
if (ERRNO != "") {
print("Error: ",FILENAME, ERRNO)
nextfile
}
}
{
# Check if the file is being read for the first time;
# if so, increment nfiles to keep count of the
# number of files processed.
if ( FNR == 1 ) {
nfiles++
} else if (FNR > nrecord) {
# Track the largest line count seen in any file so far.
nrecord=FNR
}
# Fetch the second column from the file.
array[nfiles,FNR]=$2
}
END {
# Iterate through the array and print the records.
for (i=1; i<=nrecord; i++) {
for (j=1; j<=nfiles; j++) {
printf("%5s", array[j,i])
}
print ""
}
}
Output:
$ ./get.awk Case-*/PPD
1 11 21
2 12 22
3 13 23
4 14 24
5 15 25
6 16 26
7 17 27
8 18 28
9 19 29
10 20 30
Here Case-*/PPD expands to Case-1/PPD, Case-2/PPD, Case-3/PPD and so on. Below are the source files from which the output was generated.
$ cat Case-1/PPD
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 5
6 6 6 6
7 7 7 7
8 8 8 8
9 9 9 9
10 10 10 10
$ cat Case-2/PPD
11 11 11 11
12 12 12 12
13 13 13 13
14 14 14 14
15 15 15 15
16 16 16 16
17 17 17 17
18 18 18 18
19 19 19 19
20 20 20 20
$ cat Case-3/PPD
21 21 21 21
22 22 22 22
23 23 23 23
24 24 24 24
25 25 25 25
26 26 26 26
27 27 27 27
28 28 28 28
29 29 29 29
30 30 30 30

Merge multiple files by common field - Unix

I have hundreds of files, each with two columns :
For example :
file1.txt
ID Value1
1 40
2 30
3 70
file2.txt
ID Value2
1 50
2 70
3 20
And so on, till
file150.txt
ID Value150
1 98
2 52
3 71
How do I merge these files based on the first column (which is common). My output should be
ID Value1 Value2...........Value150
1 40 50 98
2 30 70 52
3 70 20 71
Thank you.
Use a cut and paste combination to solve the file-merging problem for three or more files. cd to the folder that contains only file1, file2, file3, ... file150:
i=0
cut -f 1 file1 > delim ## save the shared ID column
for file in file*
do
i=$(($i+1)) ## counter so temp names don't collide with the originals
cut -f 2 $file > ${file}__${i}.temp
done
paste -d\\t delim file*__*.temp > output
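One caveat: file* expands in lexicographic order (file1, file10, file100, ..., file2, ...), so with 150 files the columns come out in the wrong order. A variant that forces numeric order (a sketch, assuming the names really do run file1 through file150):
cut -f 1 file1 > delim
for i in $(seq 1 150); do cut -f 2 "file$i" > "col$i.temp"; done
paste -d'\t' delim $(printf 'col%s.temp ' $(seq 1 150)) > output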
Another solution is to use join, merging two files at a time:
join -j 1 test1 test2 | join -j 1 test3 - | join -j 1 test4 -
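To scale that up to 150 files, a loop can fold them in pairwise (a sketch; GNU join's --nocheck-order flag is used here because the "ID" header line breaks join's expectation of sorted input):
cp file1.txt merged.txt
for i in $(seq 2 150)
do
join --nocheck-order -j 1 merged.txt "file$i.txt" > tmp.$$ && mv tmp.$$ merged.txt
done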
