In Shake, does the definition order of Rules matter?

If I have a general build rule for *.o object files but a more specific build rule for a foo.o object file, does the definition order matter?

As described in the documentation for the %> operator:
Patterns with no wildcards have higher priority than those with wildcards, and no file required by the system may be matched by more than one pattern at the same priority (see priority and alternatives to modify this behaviour).
So definition order doesn't matter, but files can't match multiple rules at the same priority.
Therefore, in the case of *.o and foo.o, it'll be fine. Here's an example (using foo.txt and *.txt):
import Development.Shake
main = shakeArgs shakeOptions $ do
    want ["foo.txt", "bar.txt"]
    "foo.txt" %> \out -> writeFile' out "foo"
    "*.txt" %> \out -> writeFile' out "anything"
vs
import Development.Shake
main = shakeArgs shakeOptions $ do
    want ["foo.txt", "bar.txt"]
    "*.txt" %> \out -> writeFile' out "anything"
    "foo.txt" %> \out -> writeFile' out "foo"
In both cases foo.txt will contain "foo" and bar.txt will contain "anything" because the definition for "foo.txt" doesn't contain any wildcards.
Alternatively, if you want to use definition order, you can use the alternatives function which uses "first-wins" matching semantics:
alternatives $ do
    "hello.*" %> \out -> writeFile' out "hello.*"
    "*.txt" %> \out -> writeFile' out "*.txt"
hello.txt will match the first rule, because it's defined first.
Finally, you can directly assign the priority of a rule with the priority function:
priority 4 $ "hello.*" %> \out -> writeFile' out "hello.*"
priority 8 $ "*.txt" %> \out -> writeFile' out "*.txt"
hello.txt will match the second rule, because it has a higher priority.

Related

$@ vs. "$@" when an argument is enclosed with single quotes [duplicate]

The $@ variable seems to maintain quoting around its arguments so that, for example:
$ function foo { for i in "$@"; do echo $i; done }
$ foo herp "hello world" derp
herp
hello world
derp
I am also aware that bash arrays work the same way:
$ a=(herp "hello world" derp)
$ for i in "${a[@]}"; do echo $i; done
herp
hello world
derp
What is actually going on with variables like this? Particularly when I add something to the quote like "duck ${a[@]} goose". If it's not space separated, what is it?
Usually, double quotation marks in Bash mean "make everything between the quotation marks one word, even if it has separators in it." But as you've noticed, $@ behaves differently when it's within double quotes. This is actually a parsing hack that dates back to Bash's predecessor, the Bourne shell, and this special behavior applies only to this particular variable.
Without this hack (I use the term because it seems inconsistent from a language perspective, although it's very useful), it would be difficult for a shell script to pass along its array of arguments to some other command that wants the same arguments. Some of those arguments might have spaces in them, but how would it pass them to another command without the shell either lumping them together as one big word or reparsing the list and splitting the arguments that have whitespace?
Well, you could pass an array of arguments, and the Bourne shell really only has one array, represented by $* or $@, whose number of elements is $# and whose elements are $1, $2, etc, the so-called positional parameters.
An example. Suppose you have three files in the current directory, named aaa, bbb, and cc c (the third file has a space in the name). You can initialize the array (that is, you can set the positional parameters) to be the names of the files in the current directory like this:
set -- *
Now the array of positional parameters holds the names of the files. $#, the number of elements, is three:
$ echo $#
3
And we can iterate over the positional parameters in a few different ways.
1) We can use $*:
$ for file in $*; do
> echo "$file"
> done
but that re-separates the arguments on whitespace and calls echo four times:
aaa
bbb
cc
c
2) Or we could put quotation marks around $*:
$ for file in "$*"; do
> echo "$file"
> done
but that groups the whole array into one argument and calls echo just once:
aaa bbb cc c
3) Or we could use $@ which represents the same array but behaves differently in double quotes:
$ for file in "$@"; do
> echo "$file"
> done
will produce
aaa
bbb
cc c
because $1 = "aaa", $2 = "bbb", and $3 = "cc c" and "$@" leaves the elements intact. If you leave off the quotation marks around $@, the shell will flatten and re-parse the array, echo will be called four times, and you'll get the same thing you got with a bare $*.
This is especially useful in a shell script, where the positional parameters are the arguments that were passed to your script. To pass those same arguments to some other command -- without the shell resplitting them on whitespace -- use "$@".
# Truncate the files specified by the args
rm "$@"
touch "$@"
In Bourne, this behavior only applies to the positional parameters because it's really the only array supported by the language. But you can create other arrays in Bash, and you can even apply the old parsing hack to those arrays using the special "${ARRAYNAME[@]}" syntax, whose at-sign feels almost like a wink to Mr. Bourne:
$ declare -a myarray
$ myarray[0]=alpha
$ myarray[1]=bravo
$ myarray[2]="char lie"
$ for file in "${myarray[@]}"; do echo "$file"; done
alpha
bravo
char lie
Oh, and about your last example, what should the shell do with "pre $@ post" where you have $@ within double quotes but you have other stuff in there, too? Recent versions of Bash preserve the array, prepend the text before the $@ to the first array element, and append the text after the $@ to the last element:
pre aaa
bbb
cc c post
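To make the three expansions easy to compare side by side, here is a minimal sketch that counts how many words each form produces. The helper name count_args is made up for this illustration, and the parameters are set by hand instead of from a glob:

```shell
# Print how many arguments the function received.
count_args() { echo "$#"; }

# Same three names as in the answer above; the third contains a space.
set -- aaa bbb "cc c"

unquoted=$(count_args $*)             # re-split on whitespace
joined=$(count_args "$*")             # everything joined into one word
preserved=$(count_args "$@")          # one word per positional parameter
wrapped=$(count_args "pre $@ post")   # text glued onto first and last word

echo "unquoted=$unquoted joined=$joined preserved=$preserved wrapped=$wrapped"
```

Only the quoted "$@" forms keep the word count equal to the number of positional parameters.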

GNU Make: How to set an array from a space-separated string?

I'm writing a Terminal Match-Anything Pattern Rule, i.e. %::, that, as expected, will run only if no other target is matched. In its recipe I want to iterate over makefile's explicit targets and check if the found pattern ($*) is the beginning of any other target
By now I'm successfully getting all desired targets in a space-separated string and storing it in a variable TARGETS; however, I couldn't turn it into an array to be able to iterate over each word in the string.
For instance
%::
	$(eval TARGETS ::= $(shell grep -Ph "^[^\t].*::.*##" ./Makefile | cut -d : -f 1 | sort))
	echo $(TARGETS)
gives me just what I was expecting:
build clean compile deploy execute init run serve
The Question
How could I iterate over each word of the $(TARGETS) string in a loop in GNU Make 4.2.1?
I found a bunch of Bash solutions, but none of them worked in my tests:
Reading a delimited string into an array in Bash
How to split one string into multiple strings separated by at least one space in bash shell?
It's generally a really bad idea to use eval and shell inside a recipe. A recipe is already a shell script so you should just use shell scripting.
It's not really clear exactly what you want to do. If you want to do this in a recipe, you can use a shell loop:
%::
	TARGETS=$$(grep -Ph "^[^\t].*::.*##" ./Makefile | cut -d : -f 1 | sort); \
	for t in $$TARGETS; do \
		echo $$t; \
	done
If you want to do it outside of a recipe you can use the GNU make foreach function.
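The recipe loop above only echoes each target. As a rough sketch of the prefix check the question describes, written as plain sh: the function name targets_with_prefix and the hard-coded target list are assumptions for illustration, and inside a recipe every $ would have to be doubled to $$.

```shell
# Hypothetical target list, copied from the question's sample output.
targets="build clean compile deploy execute init run serve"

# Print every target that starts with the given prefix, one per line.
targets_with_prefix() {
  prefix=$1
  for t in $targets; do            # unquoted on purpose: split on whitespace
    case $t in
      "$prefix"*) printf '%s\n' "$t" ;;
    esac
  done
}

targets_with_prefix co    # prints: compile
```

Leaving $targets unquoted is what turns the space-separated string into separate words, so no array is needed.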

Array of all files in a directory, except one

Trying to figure out how to include all .txt files except one called manifest.txt.
FILES=(path/to/*.txt)
You can use extended glob patterns for this:
shopt -s extglob
files=(path/to/!(manifest).txt)
The !(pattern-list) pattern matches "anything except one of the given patterns".
Note that this exactly excludes manifest.txt and nothing else; mmanifest.txt, for example, would still go into the array.
As a side note: a glob that matches nothing at all expands to itself (see the manual and this question). This behaviour can be changed using the nullglob (expand to nothing) and failglob (print error message) shell options.
You can build the array one file at a time, avoiding the file you do not want:
declare -a files=()
for file in /path/to/files/*
do
    ! [[ -e "$file" ]] || [[ "$file" = */manifest.txt ]] || files+=("$file")
done
Please note that globbing in the for statement does not cause problems with whitespace (even newlines) in filenames.
EDIT
I added a test for file existence to handle the case where the glob fails and the nullglob option is not set.
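For shells without arrays, the same loop can collect the names in the positional parameters instead. This is a minimal sketch, assuming a scratch directory created with mktemp and made-up file names for the demonstration:

```shell
# Scratch directory with two wanted files and the one to exclude.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b c.txt" "$demo_dir/manifest.txt"

set --                               # start with an empty list
for file in "$demo_dir"/*.txt; do
  [ -e "$file" ] || continue         # an unmatched glob stays literal
  case $file in
    */manifest.txt) continue ;;      # skip the excluded file
  esac
  set -- "$@" "$file"                # append, keeping spaces intact
done

file_count=$#
echo "collected $file_count files"
```

The collected names can then be passed on as "$@", which keeps names with spaces in one piece.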
I think this is best handled with an associative array even if just one element.
Consider:
$ touch f{1..6}.txt manifest.txt
$ ls *.txt
f1.txt f3.txt f5.txt manifest.txt
f2.txt f4.txt f6.txt
You can create an associative array for the names you wish to exclude:
declare -A exclude
for f in f1.txt f5.txt manifest.txt; do
exclude[$f]=1
done
Then add files to an array that are not in the associative array:
files=()
for fn in *.txt; do
[[ ${exclude[$fn]} ]] && continue
files+=("$fn")
done
$ echo "${files[@]}"
f2.txt f3.txt f4.txt f6.txt
This approach allows any number of exclusions from the list of files.
FILES=($(ls /path/to/*.txt | grep -v '/manifest\.txt$'))

Rename multiple files based on pattern in Unix

There are multiple files in a directory that begin with prefix fgh, for example:
fghfilea
fghfileb
fghfilec
I want to rename all of them to begin with prefix jkl. Is there a single command to do that instead of renaming each file individually?
There are several ways, but using rename will probably be the easiest.
Using one version of rename (Perl's rename):
rename 's/^fgh/jkl/' fgh*
Using another version of rename (same as Judy2K's answer):
rename fgh jkl fgh*
You should check your platform's man page to see which of the above applies.
This is how sed and mv can be used together to do rename:
for f in fgh*; do mv "$f" $(echo "$f" | sed 's/^fgh/jkl/g'); done
As per the comment below, if the file names have spaces in them, the command substitution that produces the new name needs to be quoted, and so does $f inside it:
for f in fgh*; do mv "$f" "$(echo "$f" | sed 's/^fgh/jkl/g')"; done
rename might not be available on every system, so if you don't have it, use the shell.
This example uses the bash shell:
for f in fgh*; do mv "$f" "${f/fgh/xxx}";done
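The ${f/fgh/xxx} substitution is a bash extension. Below is a sketch of a variant that sticks to POSIX parameter expansion and refuses to overwrite existing files. The function name rename_prefix, its argument order, and the demo file names are all made up for this illustration:

```shell
# rename_prefix OLD NEW DIR: rename DIR/OLD* to DIR/NEW*, never overwriting.
rename_prefix() {
  old=$1 new=$2 dir=$3
  for f in "$dir"/"$old"*; do
    [ -e "$f" ] || continue            # glob matched nothing
    base=${f##*/}                      # strip the directory part
    rest=${base#"$old"}                # strip the leading prefix only
    target=$dir/$new$rest
    if [ -e "$target" ]; then
      echo "skipping $f: $target exists" >&2
      continue
    fi
    mv -- "$f" "$target"
  done
}

# Demonstration in a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/fghfilea" "$demo_dir/fghfile b" "$demo_dir/other"
rename_prefix fgh jkl "$demo_dir"
ls "$demo_dir"
```

Because ${base#"$old"} only strips a leading match, a name such as xfghfile would be left alone even if the glob were widened.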
Using mmv:
mmv "fgh*" "jkl#1"
There are many ways to do it (not all of these will work on all unixy systems):
ls | cut -c4- | xargs -I§ mv fgh§ jkl§
The § may be replaced by anything you find convenient. You could do this with find -exec too, but that behaves subtly differently on many systems, so I usually avoid that.
for f in fgh*; do mv "$f" "${f/fgh/jkl}";done
Crude but effective as they say
rename 's/^fgh/jkl/' fgh*
Real pretty, but rename is not present on BSD, which is the most common unix system afaik.
rename fgh jkl fgh*
ls | perl -ne 'chomp; next unless -e; $o = $_; s/fgh/jkl/; next if -e; rename $o, $_';
If you insist on using Perl, but there is no rename on your system, you can use this monster.
Some of those are a bit convoluted and the list is far from complete, but you will find what you want here for pretty much all unix systems.
rename fgh jkl fgh*
Using find, xargs and sed:
find . -name "fgh*" -type f -print0 | xargs -0 -I {} sh -c 'mv "{}" "$(dirname "{}")/`echo $(basename "{}") | sed 's/^fgh/jkl/g'`"'
It's more complex than @nik's solution, but it allows renaming files recursively. For instance, the structure,
.
├── fghdir
│   ├── fdhfilea
│   └── fghfilea
├── fghfile\ e
├── fghfilea
├── fghfileb
├── fghfilec
└── other
    ├── fghfile\ e
    ├── fghfilea
    ├── fghfileb
    └── fghfilec
would be transformed to this,
.
├── fghdir
│   ├── fdhfilea
│   └── jklfilea
├── jklfile\ e
├── jklfilea
├── jklfileb
├── jklfilec
└── other
    ├── jklfile\ e
    ├── jklfilea
    ├── jklfileb
    └── jklfilec
The key to make it work with xargs is to invoke the shell from xargs.
Generic command would be
find /path/to/files -name '<search>*' -exec bash -c 'mv "$0" "${0/<search>/<replace>}"' {} \;
where <search> and <replace> should be replaced with your source and target respectively.
As a more specific example tailored to your problem (should be run from the same folder where your files are), the above command would look like:
find . -name 'fgh*' -exec bash -c 'mv "$0" "${0/fgh/jkl}"' {} \;
For a "dry run" add echo before mv, so that you'd see what commands are generated:
find . -name 'fgh*' -exec bash -c 'echo mv "$0" "${0/fgh/jkl}"' {} \;
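A possible variant of the same idea passes the prefixes as arguments instead of pasting them into the command string, and only touches the file name, not the directory part of the path. This is a sketch, not a drop-in tool: the function name rename_recursive and the demo tree are invented here, and OLD is assumed to contain no glob characters.

```shell
# rename_recursive OLD NEW ROOT: rename every regular file under ROOT whose
# name starts with OLD so that it starts with NEW instead.
rename_recursive() {
  find "$3" -type f -name "$1*" -exec sh -c '
    old=$1 new=$2
    shift 2
    for path do
      dir=${path%/*}                   # directory part, left untouched
      base=${path##*/}                 # file name only
      rest=${base#"$old"}              # name without the old prefix
      mv -- "$path" "$dir/$new$rest"
    done' sh "$1" "$2" {} +
}

# Demonstration in a scratch tree that includes spaces in names.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/sub dir"
touch "$demo_root/fghfilea" "$demo_root/sub dir/fghfile e"
rename_recursive fgh jkl "$demo_root"
find "$demo_root" -type f
```

Putting echo in front of mv inside the inline script gives the same kind of dry run as above.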
To install the Perl rename script:
sudo cpan install File::Rename
There are two renames as mentioned in the comments in Stephan202's answer.
Debian based distros have the Perl rename. Redhat/rpm distros have the C rename.
OS X doesn't have one installed by default (at least in 10.8), neither does Windows/Cygwin.
Here's a way to do it using command-line Groovy:
groovy -e 'new File(".").eachFileMatch(~/fgh.*/) {it.renameTo(it.name.replaceFirst("fgh", "jkl"))}'
On Solaris you can try:
for file in `find ./ -name "*TextForRename*"`; do
mv -f "$file" "${file/TextForRename/NewText}"
done
#!/bin/sh
# rename all files ending in .f77 to .f90 in a directory
for filename in *.f77
do
    #echo $filename
    #b= echo $filename | cut -d. -f1
    #echo $b
    mv "${filename}" "${filename%.f77}.f90"
done
This script worked for me for recursive renaming with directories/file names possibly containing white-spaces:
find . -type f -name "*\;*" | while IFS= read -r fname; do
dirname=`dirname "$fname"`
filename=`basename "$fname"`
newname=`echo "$filename" | sed -e "s/;/ /g"`
mv "${dirname}/$filename" "${dirname}/$newname"
done
Notice the sed expression, which in this example replaces all occurrences of ; with a space. This should of course be replaced according to the specific needs.
Using StringSolver tools (windows & Linux bash) which process by examples:
filter fghfilea ok fghreport ok notfghfile notok; mv --all --filter fghfilea jklfilea
It first computes a filter based on examples, where the input is the file names and the output (ok and notok, arbitrary strings). If filter had the option --auto or was invoked alone after this command, it would create a folder ok and a folder notok and push files respectively to them.
Then using the filter, the mv command is a semi-automatic move which becomes automatic with the modifier --auto. Using the previous filter thanks to --filter, it finds a mapping from fghfilea to jklfilea and then applies it on all filtered files.
Other one-line solutions
Other equivalent ways of doing the same (each line is equivalent), so you can choose your favorite way of doing it.
filter fghfilea ok fghreport ok notfghfile notok; mv --filter fghfilea jklfilea; mv
filter fghfilea ok fghreport ok notfghfile notok; auto --all --filter fghfilea "mv fghfilea jklfilea"
# Even better, automatically infers the file name
filter fghfilea ok fghreport ok notfghfile notok; auto --all --filter "mv fghfilea jklfilea"
Multi-step solution
To carefully find if the commands are performing well, you can type the following:
filter fghfilea ok
filter fghfileb ok
filter fghfileb notok
and when you are confident that the filter is good, perform the first move:
mv fghfilea jklfilea
If you want to test, and use the previous filter, type:
mv --test --filter
If the transformation is not what you wanted (e.g. even with mv --explain you see that something is wrong), you can type mv --clear to restart moving files, or add more examples mv input1 input2 where input1 and input2 are other examples
When you are confident, just type
mv --filter
and voilà! All the renaming is done using the filter.
DISCLAIMER: I am a co-author of this work made for academic purposes. There might also be a bash-producing feature soon.
It was much easier (on my Mac) to do this in Ruby. Here are 2 examples:
# for your fgh example. renames all files from "fgh..." to "jkl..."
files = Dir['fgh*']
files.each do |f|
f2 = f.gsub('fgh', 'jkl')
system("mv #{f} #{f2}")
end
# renames all files in directory from "021roman.rb" to "021_roman.rb"
files = Dir['*rb'].select {|f| f =~ /^[0-9]{3}[a-zA-Z]+/}
files.each do |f|
f1 = f.clone
f2 = f.insert(3, '_')
system("mv #{f1} #{f2}")
end
Using renamer:
$ renamer --find /^fgh/ --replace jkl * --dry-run
Remove the --dry-run flag once you're happy the output looks correct.
My version of renaming mass files:
for i in *; do
echo "mv $i $i"
done |
sed -e "s#from_pattern#to_pattern#g" > result1.sh
sh result1.sh
Another possible parameter expansion:
for f in fgh*; do mv -- "$f" "jkl${f:3}"; done
I would recommend using my own script, which solves this problem. It also has options to change the encoding of the file names, and to convert combining diacriticals to precomposed characters, a problem I always have when I copy files from my Mac.
#!/usr/bin/perl
# Copyright (c) 2014 André von Kugland
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
$help_msg =
"rename.pl, a script to rename files in batches, using Perl
expressions to transform their names.
Usage:
rename.pl [options] FILE1 [FILE2 ...]
Where options can be:
-v Verbose.
-vv Very verbose.
--apply Really apply modifications.
-e PERLCODE Execute PERLCODE. (e.g. 's/a/b/g')
--from-charset=CS Source charset. (e.g. \"iso-8859-1\")
--to-charset=CS Destination charset. (e.g. \"utf-8\")
--unicode-normalize=NF Unicode normalization form. (e.g. \"KD\")
--basename Modifies only the last element of the path.
";
use Encode;
use Getopt::Long;
use Unicode::Normalize 'normalize';
use File::Basename;
use I18N::Langinfo qw(langinfo CODESET);
Getopt::Long::Configure ("bundling");
# ----------------------------------------------------------------------------------------------- #
# Our variables. #
# ----------------------------------------------------------------------------------------------- #
my $apply = 0;
my $verbose = 0;
my $help = 0;
my $debug = 0;
my $basename = 0;
my $unicode_normalize = "";
my @scripts;
my $from_charset = "";
my $to_charset = "";
my $codeset = "";
# ----------------------------------------------------------------------------------------------- #
# Get cmdline options. #
# ----------------------------------------------------------------------------------------------- #
$result = GetOptions ("apply" => \$apply,
"verbose|v+" => \$verbose,
"execute|e=s" => \@scripts,
"from-charset=s" => \$from_charset,
"to-charset=s" => \$to_charset,
"unicode-normalize=s" => \$unicode_normalize,
"basename" => \$basename,
"help|h|?" => \$help,
"debug" => \$debug);
# If not going to apply, then be verbose.
if (!$apply && $verbose == 0) {
$verbose = 1;
}
if ((($#scripts == -1)
&& (($from_charset eq "") || ($to_charset eq ""))
&& $unicode_normalize eq "")
|| ($#ARGV == -1) || ($help)) {
print $help_msg;
exit(0);
}
if (($to_charset ne "" && $from_charset eq "")
||($from_charset eq "" && $to_charset ne "")
||($to_charset eq "" && $from_charset eq "" && $unicode_normalize ne "")) {
$codeset = langinfo(CODESET);
$to_charset = $codeset if $from_charset ne "" && $to_charset eq "";
$from_charset = $codeset if $from_charset eq "" && $to_charset ne "";
}
# ----------------------------------------------------------------------------------------------- #
# Composes the filter function using the @scripts array and possibly other options. #
# ----------------------------------------------------------------------------------------------- #
$f = "sub filterfunc() {\n my \$s = shift;\n";
$f .= " my \$d = dirname(\$s);\n my \$s = basename(\$s);\n" if ($basename != 0);
$f .= " for (\$s) {\n";
$f .= " $_;\n" foreach (@scripts); # Get scripts from '-e' opt. #
# Handle charset translation and normalization.
if (($from_charset ne "") && ($to_charset ne "")) {
if ($unicode_normalize eq "") {
$f .= " \$_ = encode(\"$to_charset\", decode(\"$from_charset\", \$_));\n";
} else {
$f .= " \$_ = encode(\"$to_charset\", normalize(\"$unicode_normalize\", decode(\"$from_charset\", \$_)));\n"
}
} elsif (($from_charset ne "") || ($to_charset ne "")) {
die "You can't use `from-charset' nor `to-charset' alone";
} elsif ($unicode_normalize ne "") {
$f .= " \$_ = encode(\"$codeset\", normalize(\"$unicode_normalize\", decode(\"$codeset\", \$_)));\n"
}
$f .= " }\n";
$f .= " \$s = \$d . '/' . \$s;\n" if ($basename != 0);
$f .= " return \$s;\n}\n";
print "Generated function:\n\n$f" if ($debug);
# ----------------------------------------------------------------------------------------------- #
# Evaluates the filter function body, so to define it in our scope. #
# ----------------------------------------------------------------------------------------------- #
eval $f;
# ----------------------------------------------------------------------------------------------- #
# Main loop, which passes names through filters and renames files. #
# ----------------------------------------------------------------------------------------------- #
foreach (@ARGV) {
$old_name = $_;
$new_name = filterfunc($_);
if ($old_name ne $new_name) {
if (!$apply or (rename $old_name, $new_name)) {
print "`$old_name' => `$new_name'\n" if ($verbose);
} else {
print "Cannot rename `$old_name' to `$new_name'.\n";
}
} else {
print "`$old_name' unchanged.\n" if ($verbose > 1);
}
}
This worked for me using regexp:
I wanted files to be renamed like this:
file0001.txt -> 1.txt
ofile0002.txt -> 2.txt
f_i_l_e0003.txt -> 3.txt
using the [a-z|_]+0*([0-9]+.*) regexp, where ([0-9]+.*) is the capture group used in the rename command
ls -1 | awk 'match($0, /[a-z|\_]+0*([0-9]+.*)/, arr) { print arr[0] " " arr[1] }'|xargs -l mv
Produces:
mv file0001.txt 1.txt
mv ofile0002.txt 2.txt
mv f_i_l_e0003.txt 3.txt
Another example:
file001abc.txt -> abc1.txt
ofile0002abcd.txt -> abcd2.txt
ls -1 | awk 'match($0, /[a-z|\_]+0*([0-9]+.*)([a-z]+)/, arr) { print arr[0] " " arr[2] arr[1] }'|xargs -l mv
Produces:
mv file001abc.txt abc1.txt
mv ofile0002abcd.txt abcd2.txt
Warning, be careful.
I wrote this script to search for all .mkv files recursively, renaming the files it finds to .avi. You can customize it to your needs. I've added some other things, such as getting the file directory, extension, and file name from a file path, just in case you need to refer to something in the future.
find . -type f -name "*.mkv" | while read fp; do
fd=$(dirname "${fp}");
fn=$(basename "${fp}");
ext="${fn##*.}";
f="${fn%.*}";
new_fp="${fd}/${f}.avi"
mv -v "$fp" "$new_fp"
done;
A generic script to run a sed expression on a list of files (combines the sed solution with the rename solution):
#!/bin/sh
e=$1
shift
for f in "$@"; do
fNew=$(echo "$f" | sed "$e")
mv "$f" "$fNew";
done
Invoke by passing the script a sed expression, and then any list of files, just like a version of rename:
script.sh 's/^fgh/jkl/' fgh*
You can also use the script below. It is very easy to run in a terminal:
# Rename multiple files at a time
for file in FILE_NAME*
do
mv -i "${file}" "${file/FILE_NAME/RENAMED_FILE_NAME}"
done
Example:-
for file in hello*
do
mv -i "${file}" "${file/hello/JAISHREE}"
done
This is an extended version of the find + sed + xargs solution.
Original solutions: this and this.
Requirements: search, prune, regex, rename
I want to rename multiple files in many folders.
Some folders should be pruned/excluded.
I am on cygwin and cannot get perl rename to work, which is required for the most popular solution (and I assume it to be slow, since it does not seem to have a pruning option?)
Solution
Use find to get files effectively (with pruning), and with many customization options.
Use sed for regex replacement.
Use xargs to funnel the result into the final command.
Example 1: rename *.js files but ignore node_modules
This example finds files and echos the found file and the renamed file. For safety reasons, it does not move anything for now. You have to replace echo with mv for that.
set -e # stop on error
set -x # verbose mode (echo all commands)
find "." -type f -not \( -path "**/node_modules/**" -prune \) -name "*.js" |
sed -nE "s/(.*)\/my(.*)/& \1\/YOUR\2/p" |
xargs -n 2 echo # echo first (replace with `mv` later)
The above script turns this:
./x/y/my-abc.js
Into this:
./x/y/YOUR-abc.js
Breakdown of Solution
find "." -type f -not \( -path "**/node_modules/**" -prune \) -name "*.js"
Searches for files (-type f).
The -not part excludes (and, importantly does not traverse!) the (notoriously ginormous) node_modules folder.
File name must match "*.js".
You can add more include and exclude clauses.
Refs:
This post discusses recursive file finding alternatives.
This post discusses aspects of pruning and excluding.
man find
sed -nE "s/(.*)\/my\-(.*\.js)/& \1\/YOUR-\2/p"
NOTE: sed always takes some getting used to.
-E enables "extended" (i.e. more modern) regex syntax.
-n is used in combination with the trailing /p flag: -n hides all results, while /p will print only matching results. This way, we only see/move files that need changing, and ignore all others.
Replacement regex with sed (and other regex tools) is always of the format: s/regex/replacement/FLAGS
In replacement, the & represents the matched input string. This will be the first argument to mv.
Refs:
linux regex tutorial
man sed
xargs -n 2 echo
Run the command echo with (the first two strings of) the replaced string.
Refs: man xargs
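To see what the sed and xargs stages do without touching any files, the pipeline can be fed a couple of literal paths. In this sketch, ./x/y/other.js is a made-up non-matching path, echo mv stands in for the real mv, and the variable name preview is arbitrary:

```shell
# Feed two literal paths through the sed and xargs stages of the pipeline.
# Only the path containing "/my" survives sed -n ... /p.
preview=$(printf '%s\n' ./x/y/my-abc.js ./x/y/other.js |
  sed -nE 's/(.*)\/my(.*)/& \1\/YOUR\2/p' |
  xargs -n 2 echo mv)

echo "$preview"
```

Note that xargs -n 2 splits on whitespace, so this pairing only holds for paths without spaces.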
Good luck!

Text specification for a tree of files?

I'm looking for examples of specifying files in a tree structure, for example, for specifying the set of files to search in a grep tool. I'd like to be able to include and exclude files and directories by name matches. I'm sure there are examples out there, but I'm having a hard time finding them.
Here's an example of a possible syntax:
*.py *.html
*.txt *.js
-*.pyc
-.svn/
-*combo_*.js
(this would mean include files with extensions .py, .html, .txt and .js; exclude .pyc files, anything under a .svn directory, and any file matching *combo_*.js)
I know I've seen these sorts of specifications in other tools before. Is this ringing any bells for anyone?
There is no single standard format for this kind of thing, but if you want to copy something that is widely recognized, have a look at the rsync documentation. Look at the chapter on "INCLUDE/EXCLUDE PATTERN RULES."
Apache Ant provides "ant globs" or patterns, where:
**/foo/**/*.java
means "any file ending in '.java' in a directory which includes a directory named 'foo' in its path" -- including ./foo/X.java
In your example syntax, is it implicitly understood that there's an escaping character so that you can explicitly include a file that begins with a dash? (The same question goes for any other wildcard characters, but I suppose I'd expect to see more files with dashes in their names than asterisks.)
Various command shells use * (and possibly ? to match a single char), as in your example, but they generally only match against a string of characters that doesn't include a path component separator (i.e. '\' on Windows systems, '/' elsewhere). I've also seen such source control apps as Perforce use additional patterns that can match against path component separators. For instance, with Perforce the pattern "foo/...ext" (without quotes) will match all files under the foo/ directory structure that end with "ext", regardless of whether they are in foo/ itself or in one of its descendant directories. This seems to be a useful pattern.
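To make the include/exclude semantics of the question's example concrete, here is a minimal sketch using shell case patterns. The function name classify_path and the choice that excludes win over includes are assumptions, since the question does not say which takes precedence. Unlike filename globbing, * in a case pattern also matches across /, so it behaves more like the Perforce-style pattern described above.

```shell
# classify_path PATH: print "include" or "exclude" for the example spec.
classify_path() {
  case $1 in
    # Exclusions are checked first, so they override any include below.
    *.pyc | .svn/* | */.svn/* | *combo_*.js) echo exclude; return ;;
  esac
  case $1 in
    *.py | *.html | *.txt | *.js) echo include; return ;;
  esac
  echo exclude                       # anything not listed is left out
}

classify_path src/app.py             # prints: include
classify_path lib/.svn/entries.txt   # prints: exclude
```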
If you're using bash, you can use the extglob extension to get some nice globbing functions. Enable it as follows:
shopt -s extglob
Then you can do things like the following:
# everything but .html, .jpg or .gif files
ls -d !(*.html|*gif|*jpg)
# list file9, file22 but not fileit
ls file+([0-9])
# begins with apl or un only
ls -d +(apl*|un*)
See also this page.
How about find in unixish environments?
Find can, of course, do more than build a list of files, but that is one of the common ways it is used. From the man page:
NAME
find -- walk a file hierarchy
SYNOPSIS
find [-H | -L | -P] [-EXdsx] [-f pathname] pathname ... expression
find [-H | -L | -P] [-EXdsx] -f pathname [pathname ...] expression
DESCRIPTION
The find utility recursively descends the directory tree for each
pathname listed, evaluating an expression (composed of the ``primaries''
and ``operands'' listed below) in terms of each file in the tree.
To achieve your goal I would write something like (formatted for readability):
find . -type d -name .svn -prune -o \
    \( -name '*.py' -o -name '*.html' -o -name '*.txt' -o -name '*.js' \) \
    ! -name '*combo_*.js' -type f -print
Moreover, there is an idiomatic pattern using xargs which makes find suitable for sending the whole list so constructed to an arbitrary command, as in:
find /path -type f -print0 | xargs -0 rm
find(1) is a fine tool as described in the previous answer, but if it gets more complicated, you should consider either writing your own script in any of the usual suspects (Ruby, Perl, Python et al.) or using one of the more powerful shells such as zsh, which has ** globbing and lets you specify things to exclude. The latter is probably more complicated though.
You might want to check out ack, which allows you to specify file types to search in with options like --perl, etc.
It also ignores .svn directories by default, as well as core dumps, editor cruft, binary files, and so on.

Resources