Move files containing X but not containing Y

To manage my backup sync folder, I am trying to come up with a command that would move files beginning with string1* but NOT ending with *string2 from /folder1 to /folder2
What would a command containing such two opposite conditions (HAS and HAS NOT) look like?

#!/bin/bash
for i in /folder1/string1*
do
# Skip names ending in string2; the glob replaces parsing the output of ls.
[[ $i == *string2 ]] && continue
# Test that we have a regular file and not a directory, symlink etc.
if [ -f "$i" ] && [ ! -L "$i" ]; then
mv -- "$i" /folder2
fi
done

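Try something like
find /folder1 -mindepth 1 -maxdepth 1 -type f \
-name 'string1*' \! -name '*string2' -exec mv -iv -t /folder2 {} +
Note: with +, {} must be the last argument before it, which is why GNU mv's -t option is used here. If you have an older version of find, or an mv without -t, replace the -exec part with -exec mv -iv {} /folder2 \;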

To me this is another case for (what I shall denote) the read while pattern.
cd /folder1
ls string1* | grep -v 'string2$' | while IFS= read -r f; do mv -- "$f" /folder2; done
The other answers are good alternatives, and in particular, find can do a lot. But I always get a headache using find, and never quite use it enough to do so without the manpage open.
Also, starting with ls or a simple find to get a list of files, and then using any or all of sed, awk, grep or whatever you have to hand, to adjust/trim/extend this list, and then bunging it into a loop, is a crude(ish) but pretty powerful technique.
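The two opposite conditions can also be written without ls, grep or find, using only a glob and a pattern test. The following is a minimal sketch, assuming bash; the function name move_matching, its argument order, and the throwaway demo directory are illustrative and not part of the original question.

```shell
#!/usr/bin/env bash
# Sketch: move regular files whose names start with a prefix but do not end
# with a suffix. move_matching and its argument order are hypothetical.
move_matching() {
    local src=$1 dst=$2 prefix=$3 suffix=$4 f
    for f in "$src"/"$prefix"*; do
        [[ -f $f && ! -L $f ]] || continue      # regular files only (also skips an unmatched glob)
        [[ $f == *"$suffix" ]] && continue      # the HAS NOT condition
        mv -- "$f" "$dst"/
    done
}

# Demo in a throwaway directory instead of the real /folder1 and /folder2.
demo=$(mktemp -d)
mkdir "$demo/folder1" "$demo/folder2" "$demo/folder1/string1_dir"
touch "$demo/folder1/string1_a" "$demo/folder1/string1_b_string2" "$demo/folder1/other"
move_matching "$demo/folder1" "$demo/folder2" string1 string2
ls "$demo/folder2"    # prints: string1_a
```

For the real case the call would be move_matching /folder1 /folder2 string1 string2. Because the names never pass through a pipe, spaces and other odd characters in file names are not a concern.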

Related

Improved find command to list files, their dir and size

I am working on a command line that I execute with plink from PowerShell (PowerCLI) on ESXi.
The idea is to list vmdk files (with exceptions), with their symlink (because their real folder names are IDs) and first subfolder (which would help me find the VMDK file, as it may reflect the VM folder). The output is in CSV format so I can easily use it in PowerShell. This is where I have got so far:
find /vmfs/volumes -type l -exec find {} -name "*.vmdk" -follow \; | awk '{n=split($0,a,"/"); print a[4]";"a[5]";"a[n] }' | grep -v ".*-flat.vmdk$" | grep -v ".*delta.vmdk$" | grep -v ".*-ctk.vmdk$"
This is good for me, but I'd like to add file size as last field (VMDKFileName;Size). Size format does not really matter, I'll be able to manipulate it within my PS script.
I don't know if I'm on the right track.
Do not hesitate to ask for more information.
P.S: a one-liner command would be great as I'm using PLink, it's easier for me to use.
TIA

OK, the answer is here (after lots of headaches)!
find $(find /vmfs/volumes -type l -maxdepth 1) -name "*.vmdk" -follow -exec ls -lHd {} \; | awk '{n=split($0,a,"/"); print a[4]";"a[5]";"a[n]";"$5}' | grep -v ".*-flat.vmdk" | grep -v ".*delta.vmdk" | grep -v ".*-ctk.vmdk"
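The chain of grep -v filters can in principle be folded into find itself with negated -name tests. This is only a sketch of that idea on a made-up directory tree; it has not been tried against the find shipped with ESXi, which may behave differently.

```shell
#!/usr/bin/env bash
# Sketch: exclude several name patterns inside find itself.
# The directory layout below is made up for the demo.
root=$(mktemp -d)
mkdir -p "$root/datastore1/vm1"
touch "$root/datastore1/vm1/vm1.vmdk" \
      "$root/datastore1/vm1/vm1-flat.vmdk" \
      "$root/datastore1/vm1/vm1-000001-delta.vmdk" \
      "$root/datastore1/vm1/vm1-ctk.vmdk"

# Each "! -name" removes one family of files from the result.
kept=$(find "$root" -type f -name '*.vmdk' \
        ! -name '*-flat.vmdk' ! -name '*delta.vmdk' ! -name '*-ctk.vmdk')
echo "$kept"
```

Only vm1.vmdk survives the three exclusions, so the later awk stage has fewer lines to process.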

Script to group numbered files into folders

I have around a million files in one folder in the form xxxx_description.jpg where xxxx is a number ranging from 100 to an unknown upper bound.
The list is similar to this:
146467_description1.jpg
146467_description2.jpg
146467_description3.jpg
146467_description4.jpg
14646_description1.jpg
14646_description2.jpg
14646_description3.jpg
146472_description1.jpg
146472_description2.jpg
146472_description3.jpg
146500_description1.jpg
146500_description2.jpg
146500_description3.jpg
146500_description4.jpg
146500_description5.jpg
146500_description6.jpg
To get the number of files in that folder down, I'd like to put them all into folders grouped by the number at the start.
i.e.:
146467/146467_description1.jpg
146467/146467_description2.jpg
146467/146467_description3.jpg
146467/146467_description4.jpg
14646/14646_description1.jpg
14646/14646_description2.jpg
14646/14646_description3.jpg
146472/146472_description1.jpg
146472/146472_description2.jpg
146472/146472_description3.jpg
146500/146500_description1.jpg
146500/146500_description2.jpg
146500/146500_description3.jpg
146500/146500_description4.jpg
146500/146500_description5.jpg
146500/146500_description6.jpg
I was thinking to try and use command line: find | awk {} | mv command or maybe write a script, but I'm not sure how to do this most efficiently.

If you really are dealing with millions of files, I suspect that a glob (*.jpg or [0-9]*_*.jpg) may fail because it makes a command line that's too long for the shell. If that's the case, you can still use find. Something like this might work:
find /path -name "[0-9]*_*.jpg" -exec sh -c 'f=${1##*/}; d="/target/${f%%_*}"; mkdir -p "$d"; mv "$1" "$d/"' sh {} \;
Broken out for easier reading, this is what we're doing:
find /path - run find, with /path as a starting point,
-name "[0-9]*_*.jpg" - match files that match this filespec in all directories,
-exec sh -c - execute the following on each file...
'f=${1##*/}; - put the filename, stripped of its directory part, into a variable...
d="/target/${f%%_*}"; - build the target directory name from the leading number...
mkdir -p "$d"; - make that directory (read mkdir's man page about the -p option)
mv "$1" "$d/"' - move the file into the directory.
sh {} - pass the found path to the inline script as $1, rather than embedding {} in the script text,
\; - end the -exec expression
On the up side, it can handle any number of files that find can handle (i.e. limited only by your OS). On the down side, it's launching a separate shell for each file to be handled.
Note that the above answer is for Bourne/POSIX/Bash. If you're using CSH or TCSH as your shell, the following might work instead:
#!/bin/tcsh
foreach f (*_*.jpg)
set split = ($f:as/_/ /)
mkdir -p "$split[1]"
mv "$f" "$split[1]/"
end
This assumes that the filespec will fit in tcsh's glob buffer. I've tested with 40000 files (894KB) on one command line and not had a problem using /bin/sh or /bin/csh in FreeBSD.
Like the Bourne/POSIX/Bash parameter expansion solution above, this avoids unnecessary calls to external commands. I haven't tested it at the scale of millions of files, though, and would recommend the find solution even though it's slower.
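The cost of launching one shell per file can be reduced by ending -exec with + so that find hands many paths to each sh invocation. The following is a sketch under assumptions: the source and target directories are temporary demo paths, and the directory name is taken as everything before the first underscore.

```shell
#!/usr/bin/env bash
# Sketch: group NNN_description.jpg files into NNN/ directories, handing many
# files to each sh invocation via "-exec ... {} +". Paths are demo values.
src=$(mktemp -d)
target=$(mktemp -d)
touch "$src/146467_description1.jpg" "$src/146467_description2.jpg" \
      "$src/14646_description1.jpg"

find "$src" -type f -name '[0-9]*_*.jpg' -exec sh -c '
    target=$1; shift
    for path in "$@"; do
        name=${path##*/}            # strip the directory part
        dir=$target/${name%%_*}     # leading number, up to the first underscore
        mkdir -p "$dir" && mv "$path" "$dir/"
    done
' sh "$target" {} +

find "$target" -type f | sort
```

The target directory is passed as the first argument and shifted off, so the remaining arguments are exactly the paths that find collected for that batch.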

You can use this script:
for i in [0-9]*_*.jpg; do
p=`echo "$i" | sed 's/^\([0-9]*\)_.*/\1/'`
mkdir -p "$p"
mv "$i" "$p"
done

Using grep
for file in *.jpg
do
dirName=$(echo "$file" | grep -oE '^[0-9]+') || continue # skip names without a leading number
[[ -d $dirName ]] || mkdir $dirName
mv "$file" "$dirName"
done
grep -oE '^[0-9]+' extracts the starting digits in the filename as
146467
146467
146467
146467
14646
...
[[ -d $dirName ]] returns 0 (success) if the directory exists
[[ -d $dirName ]] || mkdir $dirName ensures that mkdir runs only if the test [[ -d $dirName ]] fails, that is, if the directory does not exist
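In shell terms, a test that succeeds has exit status 0 and one that fails has a non-zero status, and || runs its right-hand side only after a failure. A small sketch of that behaviour, assuming bash and using a temporary directory with a demo name:

```shell
#!/usr/bin/env bash
# Sketch: what [[ -d ... ]] returns and how || uses it. Names are demo values.
work=$(mktemp -d)
cd "$work" || exit 1

before=0; [[ -d 146467 ]] || before=$?   # directory missing: status is non-zero
[[ -d 146467 ]] || mkdir 146467          # test fails, so mkdir runs
after=0;  [[ -d 146467 ]] || after=$?    # directory present: status stays 0
[[ -d 146467 ]] || mkdir 146467          # test succeeds, so mkdir is skipped

echo "before=$before after=$after"       # prints: before=1 after=0
```

The second mkdir never runs, which is why the loop does not complain about directories that already exist.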

Looking to take only main folder name within a tarball & match it to folders to see if it's been extracted

I have a situation where I need to keep .tgz files & if they've been extracted, remove the extracted directory & contents.
In all examples, the only top-level directory within the tarball has a different name than the tarball itself:
[host1]$ find / -name "*\#*.tgz" #(has an # symbol somewhere in the name)
/1-#-test.tgz
[host1]$ tar -tzvf /1-#-test.tgz | head -n 1 | awk '{ print $6 }'
TJ #(directory name)
What I'd like to accomplish (pulling my hair out; rusty scripting fingers), is to look at each tarball, see if the corresponding directory name (like above) exists. If it does, echo "rm -rf /directoryname" into an output file for review.
I can read all of the tarballs into an array ... but how to check the directories?
Frustrated & appreciate any help.

Maybe you're looking for something like this:
find / -name "*#*.tgz" | while IFS= read -r line; do
dir=$(tar ztf "$line" | awk -F/ '{print $1; exit}')
test -d "$dir" && echo "rm -fr '$dir'"
done
Explanation:
We iterate over the *#*.tgz files found with a while loop, line by line
Get the list of files in the tgz file with tar ztf "$line"
Since paths are separated by /, use that as the separator in awk and print the 1st field, which is the top-level directory. After the print we exit, making this equivalent to but more efficient than using head -n1 first
With dir=$(...) we put the entire output of the tar..awk chain, thus the 1st field of the first file in the tar, into the variable dir
We check whether such a directory exists; if it does, we echo an rm command so you can review it and execute it later if it looks good
My original answer used a find ... -exec but I think that's not so good in this particular case:
find / -name "*#*.tgz" -exec \
sh -c 'dir=$(tar ztf "{}" | awk -F/ "{print \$1; exit}");\
test -d "$dir" && echo "rm -fr \"$dir\""' \;
It's not so good because of running sh for every file, and since we are using {} in the subshell, we lose the usual benefits of a typical find ... -exec where special characters in {} are correctly handled.
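The loop approach can be made robust against unusual file names by reading NUL-delimited paths and writing the review file directly. This is a rough sketch, assuming bash and a tar that understands -z; the helper name top_level_dir, the file review.txt, and the demo tarball are all made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print "rm -rf" lines for tarballs whose top-level directory already
# exists. top_level_dir and the demo layout are hypothetical.
top_level_dir() {
    # First path component of the first archive member.
    tar -tzf "$1" | awk -F/ '{ print $1; exit }'
}

work=$(mktemp -d)
cd "$work" || exit 1
mkdir TJ && touch TJ/file
tar -czf '1-#-test.tgz' TJ          # TJ/ stays on disk, so it counts as extracted

find . -name '*#*.tgz' -print0 |
while IFS= read -r -d '' tarball; do
    dir=$(top_level_dir "$tarball")
    [[ -n $dir && -d $dir ]] && printf 'rm -rf %q\n' "$dir"
done > review.txt

cat review.txt    # prints: rm -rf TJ
```

The directory test is relative to the current directory, so in a real run the script would need to cd to wherever the tarballs were extracted before checking.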

Using a variable to pass grep pattern in bash

I am struggling with passing several grep patterns that are contained within a variable. This is the code I have:
#!/bin/bash
GREP="$(which grep)"
GREP_MY_OPTIONS="-c"
for i in {-2..2}
do
GREP_MY_OPTIONS+=" -e "$(date --date="$i day" +'%Y-%m-%d')
done
echo $GREP_MY_OPTIONS
IFS=$'\n'
MYARRAY=( $(${GREP} ${GREP_MY_OPTIONS} "/home/user/this path has spaces in it/"*"/abc.xyz" | ${GREP} -v :0$ ) )
This is what I wanted it to do:
determine/define where grep is
assign a variable (GREP_MY_OPTIONS) holding parameters I will pass to grep
assign several patterns to GREP_MY_OPTIONS
using grep and the patterns I have stored in $GREP_MY_OPTIONS search several files within a path that contains spaces and hold them in an array
When I use "echo $GREP_MY_OPTIONS" it is generating what I expected but when I run the script it fails with an error of:
/bin/grep: invalid option -- ' '
What am I doing wrong? If the path does not have spaces in it everything seems to work fine so I think it is something to do with the IFS but I'm not sure.

If you want to grep some content in a set of paths, you can do the following:
find <directory> -type f -print |
grep '/home/user/this path has spaces in it/[^/]*/abc\.xyz$' |
xargs -I {} grep <your_options> -f <patterns> {}
Here <patterns> is a file containing the patterns you want to search for in each file from <directory>.
Considering your answer, this should do what you want:
find "/path with spaces/" -type f | xargs -I {} grep -H -c -e 2013-01-17 {}
From man grep:
-H, --with-filename
Print the file name for each match. This is the default when
there is more than one file to search.
Since you want to insert the elements into an array, you can do the following:
IFS=$'\n'; array=( $(find "/path with spaces/" -type f -print0 |
xargs -0 -I {} grep -H -c -e 2013-01-17 "{}") )
And then use the values as:
echo ${array[0]}
echo ${array[1]}
echo ${array[...]}
When using variables to pass the parameters, use eval to evaluate the entire line. Do the following:
parameters="-H -c"
eval "grep ${parameters} file"

If you build the GREP_MY_OPTIONS as an array instead of as a simple string, you can get the original outline script to work sensibly:
#!/bin/bash
path="/home/user/this path has spaces in it"
GREP="$(which grep)"
GREP_MY_OPTIONS=("-c")
j=1
for i in {-2..2}
do
GREP_MY_OPTIONS[$((j++))]="-e"
GREP_MY_OPTIONS[$((j++))]=$(date --date="$i day" +'%Y-%m-%d')
done
IFS=$'\n'
MYARRAY=( $(${GREP} "${GREP_MY_OPTIONS[@]}" "$path/"*"/abc.xyz" | ${GREP} -v :0$ ) )
I'm not clear why you use GREP="$(which grep)" since you will execute the same grep as if you wrote grep directly — unless, I suppose, you have some alias for grep (which is then the problem; don't alias grep).
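A self-contained way to see the array approach at work is to run it against a throwaway directory whose name contains spaces. The sketch below assumes bash; the directory, file contents, and fixed dates are demo values that stand in for the date loop of the original script.

```shell
#!/usr/bin/env bash
# Sketch: keep grep options in an array so each element stays one argument.
# The directory, file names, and dates are demo values.
base=$(mktemp -d)/"this path has spaces in it"
mkdir -p "$base/a" "$base/b"
printf '2013-01-17 x\n2013-01-18 y\n' > "$base/a/abc.xyz"
printf '2012-12-31 z\n' > "$base/b/abc.xyz"

opts=(-c)
for day in 2013-01-17 2013-01-18; do
    opts+=(-e "$day")               # two elements per pattern: -e and the pattern
done

# "${opts[@]}" expands to one word per element; the glob sits outside the quotes.
results=()
while IFS= read -r line; do
    results+=("$line")
done < <(grep "${opts[@]}" "$base"/*/abc.xyz | grep -v ':0$')

printf '%s\n' "${results[@]}"
```

Reading the output line by line into the array avoids having to change IFS globally, and the single surviving entry is the file with a non-zero count.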

You can do this without making things complex:
First change directory in your script, like this:
cd /home/user/this\ path\ has\ spaces\ in\ it/
$ pwd
/home/user/this path has spaces in it
or
$ cd "/home/user/this path has spaces in it/"
$ pwd
/home/user/this path has spaces in it
Then do whatever you want in your script.
$(${GREP} ${GREP_MY_OPTIONS} */abc.xyz)
EDIT :
[sgeorge@sgeorge-ld stack1]$ ls -l
total 4
drwxr-xr-x 2 sgeorge eng 4096 Jan 19 06:05 test tesd
[sgeorge@sgeorge-ld stack1]$ cat test\ tesd/file
SUKU
[sgeorge@sgeorge-ld stack1]$ grep SUKU */file
SUKU
EDIT :
[sgeorge@sgeorge-ld stack1]$ find */* -print | xargs -I {} grep SUKU {}
SUKU

How do I capture the output from the ls or find command to store all file names in an array?

Need to process files in current directory one at a time. I am looking for a way to take the output of ls or find and store the resulting value as elements of an array. This way I can manipulate the array elements as needed.

To answer your exact question, use the following:
arr=( $(find /path/to/toplevel/dir -type f) )
Example
$ find . -type f
./test1.txt
./test2.txt
./test3.txt
$ arr=( $(find . -type f) )
$ echo ${#arr[@]}
3
$ echo ${arr[@]}
./test1.txt ./test2.txt ./test3.txt
$ echo ${arr[0]}
./test1.txt
However, if you just want to process files one at a time, you can either use find's -exec option if the script is somewhat simple, or you can do a loop over what find returns like so:
while IFS= read -r -d $'\0' file; do
# stuff with "$file" here
done < <(find /path/to/toplevel/dir -type f -print0)
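The same NUL-delimited loop can also fill an array, which keeps file names containing spaces in one piece. A minimal sketch, assuming bash and a temporary directory with made-up file names:

```shell
#!/usr/bin/env bash
# Sketch: fill an array from find without splitting names on whitespace.
# The demo files are made up.
top=$(mktemp -d)
touch "$top/test1.txt" "$top/test 2.txt" "$top/test3.txt"

arr=()
while IFS= read -r -d '' file; do
    arr+=("$file")                  # one element per file, spaces preserved
done < <(find "$top" -type f -print0)

# For comparison: unquoted command substitution splits "test 2.txt" in two.
naive=( $(find "$top" -type f) )

echo "files in array: ${#arr[@]}, words from naive split: ${#naive[@]}"
```

With three files on disk the loop yields three elements, while the word-split version yields more because of the space in one name.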

for i in `ls`; do echo $i; done;
can't get simpler than that!
edit: hmm - as per Dennis Williamson's comment, it seems you can!
edit 2: although the OP specifically asks how to parse the output of ls, I just wanted to point out that, as the commentators below have said, the correct answer is "you don't". Use for i in * or similar instead.

You actually don't need to use ls/find for files in current directory.
Just use a for loop:
for files in *; do
if [ -f "$files" ]; then
# do something
fi
done
And if you want to process hidden files too, you can set the relevant option:
shopt -s dotglob
This last command works in bash only.
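The effect of dotglob is easy to check in a scratch directory. This sketch assumes bash; the helper count_files and the two file names are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: effect of dotglob on a for loop over *. File names are demo values.
work=$(mktemp -d)
cd "$work" || exit 1
touch visible.txt .hidden.txt

count_files() {
    local n=0 f
    for f in *; do
        [ -f "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

without=$(count_files)
shopt -s dotglob
with=$(count_files)
shopt -u dotglob

echo "without dotglob: $without, with dotglob: $with"   # prints: without dotglob: 1, with dotglob: 2
```

Even with dotglob set, the glob does not match . or .., so the loop only ever sees real directory entries.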

Depending on what you want to do, you could use xargs:
ls directory | xargs -I {} cp -v directory/{} dir2
for example. xargs runs the command once for each item returned, with the destination dir2 as cp's last argument.
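A variant that does not depend on the output of ls pairs find -print0 with xargs -0, so names with spaces arrive intact. The following is a sketch with made-up directory names in a temporary location; -print0 and -0 are widely available but are extensions rather than strict POSIX.

```shell
#!/usr/bin/env bash
# Sketch: copy every regular file from one directory to another with xargs.
# Directory names are demo values.
work=$(mktemp -d)
mkdir "$work/directory" "$work/dir2"
touch "$work/directory/a.txt" "$work/directory/b c.txt"

# -print0 / -0 keep names with spaces intact; -I {} runs one cp per file,
# with the destination as the final argument.
find "$work/directory" -type f -print0 | xargs -0 -I {} cp {} "$work/dir2/"

ls "$work/dir2"
```

Both files, including the one with a space in its name, end up in dir2.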
