Sorting a bash array - arrays

I'm trying to sort the output of this code by size of the file. Currently I have:
IFS=!
FILEARRAY=(`find * -printf %f!`)
to get all of the file names out of the directory. I've tried piping it all sorts of ways and nothing works. Is it even possible to do like this or do I need to go about getting the file names in my array a different way?
Thanks

Try something like this instead:
FILEARRAY=$(find * -printf '%s~%f\n' | sort -n | awk -F"~" '{print $2}')
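This should give you a newline-separated list of file names sorted by size. Note that $(...) on its own stores that list in a plain string variable, not in an array.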
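If the goal is a real bash array, the sorted output has to be read in line by line. The following is a minimal sketch, assuming GNU find (for -printf) and bash 4 or later (for mapfile). The function name files_by_size is made up for illustration, and file names containing newlines are not handled:

```shell
#!/usr/bin/env bash
# Print the names of all regular files under a directory, smallest first.
# files_by_size is a hypothetical helper name; requires GNU find for -printf.
files_by_size() {
    local dir=$1
    # Emit "size<TAB>name", sort numerically on the size, then drop the size column.
    find "$dir" -type f -printf '%s\t%f\n' | sort -n | cut -f2-
}

# Read the sorted names into an array, one element per line.
mapfile -t FILEARRAY < <(files_by_size .)
echo "${#FILEARRAY[@]} files found"
```

Because mapfile splits on newlines only, names containing spaces stay in one array element.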

I'm not sure what you are trying to achieve here, but to extract the size of the files you might want to use sed. To pass the result to sort or some other sorting utility, check out xargs, which gives you some extra features when piping and might be of some use.
Edit:
If you are trying to sort all of the files in the current directory by size,
something like this:
find ./ -name "*" | xargs ls -s | sort -n
should work.

This does not use bash arrays, and it does not parse ls:
find . -type f -printf '%s:%f\n' | sort -t: -n -k1 | cut -d: -f2-

Related

How to run a command on all .cs files in directory and store file path as a variable to be used as command on windows

I'm trying to run the following command on each file of a directory.
svn blame FILEPATH | gawk '{print $2}' | sort | uniq -c
It works well however it only works on individual files. For whatever reason, it won't run on the directory as a whole. I was hoping to create some form of batch script that would iterate through the directory and would grab the file path and store it as a variable to be used in the command. However, I've never written a batch script nor do I know the first thing about them. I tried this loop but couldn't get it to work
set codedirectory=%C:\Repo\Pineapple% for %codedirectory% %%i in (*.cs) do
but I'm not necessarily sure what to do next. Unfortunately, this all has to be run on windows. Any help would be greatly appreciated. Thanks!
Use for and find, similar to the example at
https://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-7.html
for i in $(find . -name "*.cs"); do
svn blame "$i" | gawk '{print $2}' | sort | uniq -c
done
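A for loop over $(find ...) splits paths on whitespace, so a variant that survives spaces in file names may be useful. This is only a sketch: blame_all is a hypothetical helper name, awk stands in for gawk, and on Windows it would need a bash environment such as Git Bash, Cygwin or WSL:

```shell
#!/usr/bin/env bash
# Run `svn blame` on every .cs file under a directory and count lines per author.
# blame_all is a hypothetical helper name, not part of svn.
blame_all() {
    local dir=${1:-.}
    # -print0 together with read -d '' keeps paths containing spaces intact.
    find "$dir" -type f -name '*.cs' -print0 |
        while IFS= read -r -d '' file; do
            printf '%s\n' "$file"
            svn blame "$file" | awk '{print $2}' | sort | uniq -c
        done
}

# Usage (inside an svn working copy): blame_all /path/to/checkout
```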

Counting the number of files in a directory that contain the different variables in my array - bash script

I have a bash script, which needs to check certain files for certain variables, and count how many files come back containing those variables.
As there is more than one variable I need to look for I decided to to use an array for the variables.
The code I am using is below:
#!/bin/bash
declare -a MYARRAY=('Variable One' 'Variable Two' 'Variable Three');
COUNT_MYARRAY=$(find $DIRECTORY -mtime -1 -exec grep -ln $MYARRAY {} \; | wc -l)
I have declared the $DIRECTORY in my real script.
However, it does not seem to pick up files if they have the second and third variable within?
Can anyone see where I might be going wrong?
You can use grep's regex support and pass multiple expressions using 'var1\|var2'. First construct the grep argument and then execute grep.
You don't need line numbers (-n) from grep to count the files.
grep can handle multiple files: it is faster to pass many files to one grep with -exec ... + than to spawn a grep for each file.
UPPER_CASE_VARIABLES are shouting at me, and by convention uppercase variable names are reserved for exported variables.
myarray=('Variable One' 'Variable Two' 'Variable Three')
arg=$(printf "%s\|" "${myarray[@]}" | sed 's/\\|$//')
directory=.
count_myarray=$(find "$directory" -type f -mtime -1 -exec grep -l "$arg" {} + | wc -l)
Alternatively: you can pass multiple -exec arguments to find. So first from myarray construct arguments to find in the form -exec grep -l <the var>. Note that multiple variables can be in same files, so get unique filenames after grepping.
myarray=('Variable One' 'Variable Two' 'Variable Three');
findargs=()
for i in "${myarray[@]}"; do
findargs+=(-exec grep -l "$i" {} +)
done
directory=.
count_myarray=$(find "$directory" -type f -mtime -1 "${findargs[@]}" | sort -u | wc -l)
or similar:
count_myarray=$(printf '-exec\0grep\0-l\0%s\0{}\0+\0' "${myarray[@]}" | xargs -0 find "$directory" -type f -mtime -1 | sort -u | wc -l)
Remember to quote your variable expansions to protect against whitespace or special characters in file names and directory names.
Going wrong:
echo $MYARRAY prints only Variable One: without an index, $MYARRAY expands to the first element of the array, so grep never sees the other two strings.
Also note that it is better to use lowercase for your variable names. I will use ${directory} and not $DIRECTORY (and in double quotes for directories with a space).
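A short illustration of how an array expands with and without an index. The variable names here are made up for the example:

```shell
#!/usr/bin/env bash
patterns=('Variable One' 'Variable Two' 'Variable Three')

first_only=$patterns        # no index: same as ${patterns[0]}
count=${#patterns[@]}       # number of elements in the array
# Join all elements with | by setting IFS inside the command substitution only.
joined=$(IFS='|'; printf '%s' "${patterns[*]}")

printf '%s\n' "$first_only"   # prints: Variable One
printf '%s\n' "$count"        # prints: 3
printf '%s\n' "$joined"       # prints: Variable One|Variable Two|Variable Three
```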
grep offers more options. Because a file with 8 occurrences should be counted once, the -c option (which counts matching lines) is not what you want; -l lists each matching file once. A useful option is -r, for a recursive search. You are looking for something like
grep -Erl "Variable One|Variable Two|Variable Three" | wc -l
This is difficult when the variables might contain special characters like $ or |.
Another option of grep is using the option
-f FILE, Obtain patterns from FILE, one per line
So you should make a function that writes the variables to a file, and use something like
grep -rlFf "myVariablesFile" "${directory}" | wc -l
When the content of the file is changing rapidly, you might want to avoid the temporary file with
grep -rlFf <(function_that_writes_variables_to_stdout) "${directory}"| wc -l
or directly
grep -rlFf <(printf "%s\n" "${var1}" "${var2}" "${var3}") "${directory}" | wc -l
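The same idea works directly from an array, which avoids naming each variable. This is a sketch rather than a drop-in solution: count_matching_files is a hypothetical helper name, and it assumes a grep that supports -r (GNU or BSD):

```shell
#!/usr/bin/env bash
# Count the files under a directory that contain at least one of the given fixed strings.
# count_matching_files is a hypothetical helper name.
count_matching_files() {
    local dir=$1
    shift
    # With no patterns, an empty line would match everything, so report 0 instead.
    if (( $# == 0 )); then
        echo 0
        return
    fi
    # -F: treat patterns as literal strings, -l: print each matching file once,
    # -f: read one pattern per line from the process substitution.
    grep -rlF -f <(printf '%s\n' "$@") -- "$dir" | wc -l
}

# Usage: count_matching_files "$directory" "${myarray[@]}"
```

Since -F disables regex interpretation, strings containing $ or | need no escaping.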

bash array count always returns 1

I searched all over for this, but the terms are apparently too general. I'm writing a script to search a group of folders for .mp3 files. Some folders don't have mp3's so they have to be excluded.
I created an array to hold the uniq'd folder names. This find command will get the folders I need.
Folders=$(sudo find /my/music/ -type f -name "*.mp3" | cut -d'/' -f7 | sort -u)
When I try to count the number of folders in the array, I always get 1
echo ${#Folders[@]}
echo ${Folders[@]} prints them out on separate lines so I thought they were separate array elements. Can anyone explain what is going on? You might have to jiggle the field number in the cut command to reproduce locally.
Folders is not an array but a variable.
You need:
Folders=( $(sudo find /my/music/ -type f -name "*.mp3" | cut -d'/' -f7 | sort -u) )
i.e. enclose the command substitution in (). Now ${#Folders[@]} gives you the number of elements in the array Folders.
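The difference can be seen with a stand-in for the find pipeline. In this sketch list_folders is a made-up function that simply prints three folder names, and mapfile assumes bash 4 or later:

```shell
#!/usr/bin/env bash
# Stand-in for the find | cut | sort pipeline: prints three folder names, one per line.
list_folders() { printf '%s\n' 'Rock' 'Jazz Classics' 'Blues'; }

as_string=$(list_folders)               # one scalar holding all three lines
word_split=( $(list_folders) )          # array, but split on every space and newline
mapfile -t as_array < <(list_folders)   # array with exactly one element per line

echo "${#as_string[@]}"    # prints 1
echo "${#word_split[@]}"   # prints 4, because "Jazz Classics" became two elements
echo "${#as_array[@]}"     # prints 3
```

The unquoted ( $(...) ) form is fine for names without whitespace; mapfile is the safer choice when folder names may contain spaces.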
Or do :
sudo find /my/music/ -type f -name "*.mp3" | cut -d'/' -f7 | sort -u | wc -l
Note
wc -l prints the number of lines, which in this case is the number of unique folder names.
To make things a bit more explicit, use the -printf "%p\n" option with find, where the %p specifier prints the file name with its full path.
Assuming bash 4 or later, don't use find here; use the globstar operator.
shopt -s globstar
folders=( /my/music/**/*.mp3 )
Also assuming that cut -d/ -f7 is supposed to extract the filename alone, follow this up with
folders=( "${folders[@]##*/}" )
Other methods for populating the array must take more care to accommodate files containing whitespace or characters like ?, *, or [. File names containing newlines (rare, but not illegal) are much more difficult to handle correctly. Pathname expansion is done inside the shell, so you don't need to worry about any such special characters.
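If what is wanted is the unique folders rather than the files, the glob results can be reduced with an associative array. A minimal sketch, assuming bash 4 or later; mp3_folders is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Print the unique directories that contain at least one .mp3 file, sorted.
# mp3_folders is a hypothetical helper; the ( ) body runs in a subshell so the
# shopt settings do not leak into the caller.
mp3_folders() (
    root=$1
    declare -A seen=()
    shopt -s globstar nullglob
    for file in "$root"/**/*.mp3; do
        seen["${file%/*}"]=1      # strip the file name, keep the directory part
    done
    if (( ${#seen[@]} )); then
        printf '%s\n' "${!seen[@]}" | sort
    fi
)

mapfile -t folders < <(mp3_folders /my/music)
echo "${#folders[@]}"
```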

Improved find command to list files, their dir and size

I am working on a command line that I execute with plink from PowerShell (PowerCLI) on ESXi.
The idea is to list vmdk files (with exceptions), with their symlink (because their real folders names are IDs) and first subfolder (that'd help me finding VMDK file as it may reflect VM folder). Output is CSV format so I can easily use it in PowerShell. This is where I came so far:
find /vmfs/volumes -type l -exec find {} -name "*.vmdk" -follow \; | awk '{n=split($0,a,"/"); print a[4]";"a[5]";"a[n] }' | grep -v ".*-flat.vmdk$" | grep -v ".*delta.vmdk$" | grep -v ".*-ctk.vmdk$"
This is good for me, but I'd like to add file size as last field (VMDKFileName;Size). Size format does not really matter, I'll be able to manipulate it within my PS script.
I don't know if I'm on the right track to fulfill my needs.
Do not hesitate to ask for more information.
P.S: a one-liner command would be great as I'm using PLink, it's easier for me to use.
TIA
OK, the answer is here (after lots of headaches)!
find $(find /vmfs/volumes -type l -maxdepth 1) -name "*.vmdk" -follow -exec ls -lHd {} \; | awk '{n=split($0,a,"/"); print a[4]";"a[5]";"a[n]";"$5}' | grep -v ".*-flat.vmdk" | grep -v ".*delta.vmdk" | grep -v ".*-ctk.vmdk"
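On a system with GNU find, the three grep -v filters and the ls call could be folded into find itself. This is a sketch under that assumption only: ESXi's own find may not support -printf, list_vmdk is a hypothetical helper name, and the output is reduced to name and size for brevity:

```shell
#!/usr/bin/env bash
# List .vmdk files as "name;size-in-bytes", skipping -flat, delta and -ctk files.
# list_vmdk is a hypothetical helper name; requires GNU find for -printf.
list_vmdk() {
    local root=$1
    # -L follows symlinks; each "! -name" drops one family of unwanted files.
    find -L "$root" -type f -name '*.vmdk' \
        ! -name '*-flat.vmdk' ! -name '*delta.vmdk' ! -name '*-ctk.vmdk' \
        -printf '%f;%s\n'
}

# Usage: list_vmdk /vmfs/volumes
```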

Move files containing X but not containing Y

To manage my backup sync folder, I am trying to come up with a command that would move files beginning with string1* but NOT ending with *string2 from /folder1 to /folder2
What would a command containing such two opposite conditions (HAS and HAS NOT) look like?
#!/bin/bash
for i in `ls -d /folder1/string1* | grep -v 'string2$'`
do
ls -ld "$i" | grep '^-' > /dev/null # Test that we have a regular file and not a directory etc.
if [ $? == 0 ]; then
mv "$i" /folder2
fi
done
Try something like
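find /folder1 -mindepth 1 -maxdepth 1 -type f \
-name 'string1*' \! -name '*string2' -exec cp -iv -t /folder2 {} +
Note: {} must come immediately before +, hence the -t option (GNU cp). If you have an older version of find, or a cp without -t, use -exec cp -iv {} /folder2 \; instead. Swap cp for mv to move the files rather than copy them.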
To me this is another case for (what I shall denote) the read while pattern.
cd /folder1
ls string1* | grep -v 'string2$' | while IFS= read -r f; do mv "$f" /folder2; done
The other answers are good alternatives, and in particular, find can do a lot. But I always get a headache using find, and never quite use it enough to do so without the manpage open.
Also, starting with ls or a simple find to get a list of files, and then using any or all of sed, awk, grep or whatever you have to hand, to adjust/trim/extend this list, and then bunging it into a loop, is a crude(ish) but pretty powerful technique.
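The two conditions can also be written with shell pattern matching alone, without ls, grep or find. A minimal sketch; move_matching is a hypothetical helper name and its argument order is an arbitrary choice for the example:

```shell
#!/usr/bin/env bash
# Move regular files whose names start with one string but do not end with another.
# move_matching is a hypothetical helper: source dir, prefix, excluded suffix, target dir.
move_matching() {
    local src=$1 prefix=$2 suffix=$3 dest=$4 file
    for file in "$src"/"$prefix"*; do
        [[ -f $file ]] || continue                 # skip directories and an unmatched glob
        [[ $file == *"$suffix" ]] && continue      # skip names ending in the excluded suffix
        mv -- "$file" "$dest"/
    done
}

# Usage: move_matching /folder1 string1 string2 /folder2
```

The glob handles names with spaces correctly because each match arrives as a single word.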

Resources