Simple Bash - for f in * - file

Consider this simple loop:
for f in *.{text,txt}; do echo $f; done
I want to echo ONLY valid file names. Using the $f variable in a script, everything works great unless there aren't any files with those extensions. In the case of an empty set, $f is set to the unexpanded patterns, and the above line echoes:
*.text
*.txt
rather than echoing nothing. This creates an error if you are trying to use $f for anything that expects an actual file name and instead gets a literal *.text or *.txt.
If there are any files that match the wildcard, so that the set is not empty, everything works as I would like, e.g.
123.text
456.txt
789.txt
How can I do this without the errors and without the seemingly excessive complexity of first string-matching $f for an asterisk?

Set the nullglob option.
$ for f in *.foo ; do echo "$f" ; done
*.foo
$ shopt -s nullglob
$ for f in *.foo ; do echo "$f" ; done
$
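If you only want nullglob to apply to this one loop, you can enable it and then restore the default afterwards; a minimal sketch (this assumes nullglob was not already enabled elsewhere):
shopt -s nullglob
for f in *.{text,txt}; do
    echo "$f"
done
shopt -u nullglob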

You can test if the file actually exists:
for f in *.{text,txt}; do if [ -f "$f" ]; then echo "$f"; fi; done
or you can use the find command:
for f in $(find . -name '*.text' -o -name '*.txt'); do
    echo "$f"
done

Also, if you can afford the external call to ls, you can loop over its results with
for f in `ls *.txt *.text`;

Related

Search for a substring within listed files with spaces from multiple directories in bash

I want to create a script that loops over multiple directories from an array and removes the files there that are not in a blacklist and are older than a certain time period. The problem is that no type of string comparison (whether grep -q or wildcards) works when listing a directory whose files contain spaces in their names (so I change the $IFS value to loop through them), making the script unusable. Blacklisted strings can also have spaces in them, of course.
Here's what I wrote so far:
#!/bin/bash
declare -a dirs=(~/path/to/dir1/* ~/path/to/dir2/*)
declare -a blacklist=("file number 1" "file number 2" "file number 3")
saveifs=$IFS
IFS=$'\n'
echo "Starting the autocleaner..."
for dirname in "${dirs[@]}"; do
    for filename in $(ls "$dirname"); do
        for excluded in ${blacklist[@]}; do
            if [ -e $filename ]; then
                if echo "$filename" | grep -q "$excluded"; then
                # if [[ "$filename" == *"$excluded"* ]]; then
                    :
                else
                    if test `find "$filename" -mtime +1`; then
                        # rm -f $filename
                        echo "File $filename removed."
                    else
                        echo "File $filename is up-to-date and doesn't need to be removed."
                    fi
                fi
            else
                :
            fi
        done
    done
done
IFS=$saveifs
How can I make the comparison actually work?
Have you tried using single square brackets [ ... ] for the comparison line? Reading about the difference between [ ... ] and [[ ... ]] here may help you.
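For instance, [[ ... ]] does pattern matching without word splitting, so a blacklist entry containing spaces can be tested directly; a minimal illustration with made-up names:
filename="file number 1 (old copy)"
excluded="file number 1"
if [[ "$filename" == *"$excluded"* ]]; then
    echo "blacklisted: $filename"
fi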

Using loop to convert multiple files into separate files

I used this command to convert multiple pcap log files to text using tcpdump :
$ cat /home/dalya/snort-2.9.9.0/snort_logs/snort.log.* | tcpdump -n -r - > /home/dalya/snort-2.9.9.0/snort_logs2/bigfile.txt
and it worked well.
Now I want to separate the output, each converted file in a separate output file using loop like this :
for f in /home/dalya/snort-2.9.9.0/snort_logs/snort.log.* ; do
tcpdump -n -r "$f" > /home/dalya/snort-2.9.9.0/snort_logs2/"$f.txt" ;
done
But it gave me :
bash: /home/dalya/snort-2.9.9.0/snort_logs2//home/dalya/snort-2.9.9.0/snort_logs/snort.log.1485894664.txt: No such file or directory
bash: /home/dalya/snort-2.9.9.0/snort_logs2//home/dalya/snort-2.9.9.0/snort_logs/snort.log.1485894770.txt: No such file or directory
bash: /home/dalya/snort-2.9.9.0/snort_logs2//home/dalya/snort-2.9.9.0/snort_logs/snort.log.1487346947.txt: No such file or directory
I think the problem is in $f. Where did I go wrong?
If you run
for f in /home/dalya/snort-2.9.9.0/snort_logs/snort.log.* ; do
echo $f
done
You'll find that you're getting
/home/dalya/snort-2.9.9.0/snort_logs/snort.log.1485894664
/home/dalya/snort-2.9.9.0/snort_logs/snort.log.1485894770
/home/dalya/snort-2.9.9.0/snort_logs/snort.log.1487346947
You can use basename to get only the filename, something like this:
for f in /home/dalya/snort-2.9.9.0/snort_logs/snort.log.* ; do
    base="$(basename "$f")"
    echo "$base"
done
Once you're satisfied that this is working, remove the echo statement and use
tcpdump -n -r "$f" > /home/dalya/snort-2.9.9.0/snort_logs2/"$base.txt"
instead.
Edit: tcpdump -n -r "$base" > ... should have been tcpdump -n -r "$f" > ...; you only want to use $base in the context of creating the new filename, not in the context of reading the existing data.
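Putting it together, the loop from the question would then look roughly like this (same paths as in the question):
for f in /home/dalya/snort-2.9.9.0/snort_logs/snort.log.* ; do
    base="$(basename "$f")"
    tcpdump -n -r "$f" > /home/dalya/snort-2.9.9.0/snort_logs2/"$base.txt"
done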

Script to group numbered files into folders

I have around a million files in one folder in the form xxx_description.jpg, where xxx is a number ranging from 100 to an unknown upper bound.
The list is similar to this:
146467_description1.jpg
146467_description2.jpg
146467_description3.jpg
146467_description4.jpg
14646_description1.jpg
14646_description2.jpg
14646_description3.jpg
146472_description1.jpg
146472_description2.jpg
146472_description3.jpg
146500_description1.jpg
146500_description2.jpg
146500_description3.jpg
146500_description4.jpg
146500_description5.jpg
146500_description6.jpg
To get the file count in that folder down, I'd like to put them all into folders grouped by the number at the start.
ie:
146467/146467_description1.jpg
146467/146467_description2.jpg
146467/146467_description3.jpg
146467/146467_description4.jpg
14646/14646_description1.jpg
14646/14646_description2.jpg
14646/14646_description3.jpg
146472/146472_description1.jpg
146472/146472_description2.jpg
146472/146472_description3.jpg
146500/146500_description1.jpg
146500/146500_description2.jpg
146500/146500_description3.jpg
146500/146500_description4.jpg
146500/146500_description5.jpg
146500/146500_description6.jpg
I was thinking of trying a command line along the lines of find | awk {} | mv, or maybe writing a script, but I'm not sure how to do this most efficiently.
If you really are dealing with millions of files, I suspect that a glob (*.jpg or [0-9]*_*.jpg) may fail because it makes a command line that's too long for the shell. If that's the case, you can still use find. Something like this might work:
find /path -name "[0-9]*_*.jpg" -exec sh -c 'f="{}"; mkdir -p "/target/${f%_*}"; mv "$f" "/target/${f%_*}/"' \;
Broken out for easier reading, this is what we're doing:
find /path - run find, with /path as a starting point,
-name "[0-9]*_*.jpg" - match files that match this filespec in all directories,
-exec sh -c - execute the following on each file...
'f="{}"; - put the filename into a variable...
mkdir -p "/target/${f%_*}"; - make a target directory based on that variable (read mkdir's man page about the -p option)
mv "$f" "/target/${f%_*}/"' - move the file into the directory.
\; - end the -exec expression
On the up side, it can handle any number of files that find can handle (i.e. limited only by your OS). On the down side, it's launching a separate shell for each file to be handled.
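If the per-file shell startup becomes a bottleneck, one possible refinement (a sketch in the same spirit, not the original answer) is to let find pass files in batches with -exec ... + and loop over them inside a single shell:
find /path -name "[0-9]*_*.jpg" -exec sh -c '
    for f in "$@"; do
        mkdir -p "/target/${f%_*}"
        mv "$f" "/target/${f%_*}/"
    done' sh {} +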
Note that the above answer is for Bourne/POSIX/Bash. If you're using CSH or TCSH as your shell, the following might work instead:
#!/bin/tcsh
foreach f (*_*.jpg)
    set split = ($f:as/_/ /)
    mkdir -p "$split[1]"
    mv "$f" "$split[1]/"
end
This assumes that the filespec will fit in tcsh's glob buffer. I've tested with 40000 files (894KB) on one command line and not had a problem using /bin/sh or /bin/csh in FreeBSD.
Like the Bourne/POSIX/Bash parameter expansion solution above, this avoids unnecessary calls to external programs. I haven't tested it at that scale, though, and would still recommend the find solution even though it's slower.
You can use this script:
for i in [0-9]*_*.jpg; do
    p=`echo "$i" | sed 's/^\([0-9]*\)_.*/\1/'`
    mkdir -p "$p"
    mv "$i" "$p"
done
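As a side note, the call to sed (and its subshell) can be avoided with parameter expansion, keeping everything inside bash; a sketch:
for i in [0-9]*_*.jpg; do
    p="${i%%_*}"      # strip everything from the first underscore onwards
    mkdir -p "$p"
    mv "$i" "$p"
done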
Using grep
for file in *.jpg; do
    dirName=$(echo "$file" | grep -oE '^[0-9]+')
    [[ -d $dirName ]] || mkdir "$dirName"
    mv "$file" "$dirName"
done
grep -oE '^[0-9]+' extracts the starting digits in the filename as
146467
146467
146467
146467
14646
...
[[ -d $dirName ]] succeeds (exit status 0) if the directory exists
[[ -d $dirName ]] || mkdir $dirName ensures that mkdir runs only if the test [[ -d $dirName ]] fails, that is, the directory does not exist

Bash script with using special characters in variable array and copy to folder

I have a new question, following up on a closed question of mine.
In the last one I asked for help in fixing a script which sorts files into folders by their content. (Bash script which sorts files to folders by it's content; How to solve wildcard in variables?)
Now I have a new problem with that.
The variables have changed. The old ones were single-word variables in an array; now I have multiple words with special characters as the variable values.
Here is my script:
#!/bin/bash
declare -a standorte;
standorte=('Zweigst: 00' 'Zweigst: 03' 'Zweigst: 08')
ls lp.3.* | while read f
do
    for ort in "${standorte[@]}"; do
        grep -i $ort "$f" >/dev/null 2>&1
        if [ $? -eq 0 ]; then
            echo Copying $f to $ort
            cp "$f" $ort
        fi
    done
done
Now you see, "ort" is also the folder name, so the script tries to copy the file lp.3.* to e.g. Zweigst: 00. But without escape backslashes it doesn't work, and if I put escape characters into the variable, the script doesn't work either, because the file lp.3.* contains no "Zweigst:\ 00".
I think I must declare a new variable for "ort" and put the folder names into it, but I have no idea how to change the for loop.
I need to tell the script: when you find Zweigst: 00, copy this file to the folder "zweigst00". I'm sorry, my bash scripting experience is not good at all; I can't change this on my own.
I have multiple (zero to unlimited) lp.3.* files (e.g. lp.3.1, lp.3.2, lp.3.5.2 and so on)
In this files is this text: http://pastebin.com/0ZzCUrpx
You just need to quote the variable:
for ort in "${standorte[#]}"; do
grep -i "$ort" "$f" >/dev/null 2>&1
# ^----^-------- quotes needed
if [ $? -eq 0 ]; then
echo Copying $f to $ort
cp "$f" "$ort"
# ^----^-------- quotes needed
fi
done
Why? Because otherwise this
grep -i $ort "$f" >/dev/null 2>&1
gets expanded as something that grep cannot understand properly:
grep -i Zweigst: 00 "$f" >/dev/null 2>&1
you can see that it is trying to grep for Zweigst: in a file called 00.
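As for mapping the matched text to the folder name asked about in the question (e.g. "Zweigst: 00" to zweigst00), one possible approach, not part of the original answer and assuming bash 4+ for ${var,,}, is to derive the target directory from $ort (ziel is a made-up variable name):
for ort in "${standorte[@]}"; do
    if grep -qi "$ort" "$f"; then
        ziel="${ort,,}"         # lowercase: "zweigst: 00"
        ziel="${ziel//[: ]/}"   # drop colon and spaces: "zweigst00"
        echo "Copying $f to $ziel"
        cp "$f" "$ziel"
    fi
done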

How do I capture the output from the ls or find command to store all file names in an array?

I need to process the files in the current directory one at a time. I am looking for a way to take the output of ls or find and store the resulting values as elements of an array. This way I can manipulate the array elements as needed.
To answer your exact question, use the following:
arr=( $(find /path/to/toplevel/dir -type f) )
Example
$ find . -type f
./test1.txt
./test2.txt
./test3.txt
$ arr=( $(find . -type f) )
$ echo ${#arr[@]}
3
$ echo ${arr[@]}
./test1.txt ./test2.txt ./test3.txt
$ echo ${arr[0]}
./test1.txt
However, if you just want to process files one at a time, you can either use find's -exec option if the script is somewhat simple, or you can do a loop over what find returns like so:
while IFS= read -r -d $'\0' file; do
# stuff with "$file" here
done < <(find /path/to/toplevel/dir -type f -print0)
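If you do want the array from the question and the filenames may contain spaces or newlines, a NUL-delimited variant (assuming bash 4.4+ for mapfile -d) is safer than the plain arr=( $(find ...) ) shown above:
mapfile -t -d '' arr < <(find /path/to/toplevel/dir -type f -print0)
echo "${#arr[@]} files found"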
for i in `ls`; do echo $i; done;
can't get simpler than that!
edit: hmm - as per Dennis Williamson's comment, it seems you can!
edit 2: although the OP specifically asks how to parse the output of ls, I just wanted to point out that, as the commentators below have said, the correct answer is "you don't". Use for i in * or similar instead.
You actually don't need to use ls or find for files in the current directory.
Just use a for loop:
for files in *; do
if [ -f "$files" ]; then
# do something
fi
done
And if you want to process hidden files too, you can set the dotglob option:
shopt -s dotglob
This last command works in bash only.
Depending on what you want to do, you could use xargs:
ls directory | xargs -I{} cp -v {} dir2
for example; xargs will act on each item returned.
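If the filenames may contain spaces, pairing find -print0 with xargs -0 is more robust than parsing ls; a sketch:
find directory -type f -print0 | xargs -0 -I{} cp -v {} dir2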
