Remove blank first line from files - script

I have this script which prints out the files whose first line is blank:
for f in `find . -regex ".*\.php"`; do
    for t in head; do
        $t -1 $f | egrep '^[ ]*$' >/dev/null && echo "blank line at the $t of $f";
    done;
done
How can I improve this to actually remove the blank line too, or at least copy all the files with the blank first line somewhere else.
I tried copying using this, which is good because it copies while preserving the directory structure, but it was copying every php file; I needed to capture the positive output of the egrep and only copy those files.
rsync -R $f ../DavidSiteBlankFirst/

I would use sed personally
find ./ -type f -regex '.*\.php' -exec sed -i -e '1{/^[[:blank:]]*$/d;}' '{}' \;
This finds all the regular files ending in .php and executes the sed command, which works on the first line only, checks whether it's blank, and deletes it if it is; other blank lines in the file remain unaffected.
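To sanity-check the command on a scratch file first (hypothetical path /tmp/test.php), you can run the same sed expression without -i so it just prints the result:
printf '\n<?php\n\necho "hi";\n' > /tmp/test.php    # blank first line, plus a later blank line
sed -e '1{/^[[:blank:]]*$/d;}' /tmp/test.php
# the leading blank line is gone; the interior blank line remains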

Just using find and sed:
find . -type f -name "*.php" -exec sed -i '1{/^\s*$/d;q;}' {} \;
The -type f option finds only files; not that I expect you would name folders with a .php suffix, but it's good practice. The use of -regex '.*\.php' is overkill and messier than simply globbing with -name "*.php". Use find's -exec instead of a shell loop; the sed script will operate on each matching file passed by find.
The sed script addresses the first line only (1) and applies the operations inside {} to that line. We check whether the line is blank (/^\s*$/); if it matches, we delete (d) it and quit (q) the script so as not to read all the other lines in the file. The -i option saves the change back to the file, as the default behaviour of sed is to print to stdout. If you want backup files made, use -i~ instead; this will create a backup file~ for each file.
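If you'd rather first copy the affected files somewhere for review, as you attempted with rsync -R, a minimal sketch (assuming the ../DavidSiteBlankFirst/ destination from your attempt) gates the copy on the same blank-first-line test:
find . -type f -name "*.php" | while IFS= read -r f; do
    # copy only files whose first line is blank, preserving paths with -R
    head -n 1 "$f" | grep -q '^[[:blank:]]*$' && rsync -R "$f" ../DavidSiteBlankFirst/
done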

Related

Script to read file, find in the file system, and move to a directory

I'm horrible with scripting and have been poring over sites for the past 2 days and haven't found a solution, nor been able to piece one together. I'm at my wits' end and need some help.
I have a text file with partial names and numbers that I need to find within a directory and its subdirectories and move to a separate directory.
I've tried the following with no luck:
#!/bin/bash
getArray() {
    array=()
    while IFS= read -r line
    do
        array+=("$line")
    done < "$1"
}
getArray "dan.txt"
for e in "${array[@]}"
do
    mv "$e" /root/moved
done
Thanks in advance.
If by partial file names you mean strings that match somewhere within each file name, you can do:
find . -name "foo*"
If it's dependent also on the directory (i.e., any parent folders within the path), use the -regex option instead, which will match against the entire file path instead of just the file name:
find . -regex '.*/foo/[^/]*.doc'
So:
while read -r line; do
find . -name "$line"
done < "$1"
EDIT per the comments:
while read -r line; do
find "/data/Match" -type f -iname "$line\.*" -exec mv -t /root/moved {} \;
done < "$1"
You may need to change the -iname argument to the following:
If they are complete file names (e.g., a line with value "filename.txt" matches a file named "filename.txt"), then -iname "$line".
If they are partial file names (e.g., a line with value "file" matches "file.txt" and "filename.txt"), then -iname "$line*".
If they are complete names without file extensions (e.g., a line with value "file" matches "file.txt", but not "filename.txt"), then -iname "$line\.*".
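As a quick illustration (hypothetical file names), given files named file.txt and filename.txt in the current directory:
find . -iname "file"       # matches neither; the names on disk have extensions
find . -iname "file*"      # matches both file.txt and filename.txt
find . -iname "file.*"     # matches file.txt only: a literal dot, then anything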
Bit of an explanation for that find command btw:
-type f - files only (d for directories, l for symbolic links);
-iname - like -name, but case-insensitive;
-exec - command to execute for each matching file, ending with \;
mv -t /root/moved {} - the -t option specifies the target directory to move files into. The source file argument (i.e., the path of the file we're moving) is the final parameter to mv, which is {}, representing the current file path in find's -exec command.
Note that you may need to play with "$line*", as I can't tell how vague your names are from the text file. If they will definitely match from the beginning (e.g., a line in the file with the value "somefile" only matches files named in the format "somefile.*", where the * is any amount of characters), then "$line*" is the right choice.

Adapt file renaming script so that it searches out each file in sub-directories

I have a .csv file which looks something like this:
unnamed_0711-42_p1.mov,day1_0711-42_p1.mov
unnamed_0711-51_p2.mov,day1_0711-51_p2.mov
unnamed_0716-42_p1_2.mov,day1_0716-42_p1_2.mov
unnamed_0716-51_p2_2.mov,day1_0716-51_p2_2.mov
I have written this code to rename files from the name in field 1 (e.g. unnamed_0711-42_p1.mov), to the name in field 2 (e.g. day1_0711-42_p1.mov).
csv=/location/rename.csv
cat $csv | while IFS=, read -r -a arr; do mv "${arr[@]}"; done
However, this script only works when it and all the files that need to be renamed are in the same directory. This was okay previously, but now I need to find files in various subdirectories (without adding the full path to my .csv file).
How can I adapt my script so that it searches out the files in subdirectories and then changes the names as before?
A simple way to make this work, though it leads to an inefficient script, is this:
for D in `find . -type d`
do
    # cd into each directory; $csv must be an absolute path for this to work
    (cd "$D" && cat "$csv" | while IFS=, read -r -a arr; do mv "${arr[@]}"; done)
done
This will run your command in every directory under the current directory, but it runs through the entire list of filenames for every subdirectory. An alternative would be to search for each file as you process its name:
csv=files.csv
while IFS=, read -ra arr; do
    while IFS= read -r -d '' old_file; do
        old_dir=$(dirname "$old_file")
        mv "$old_file" "$old_dir/${arr[1]}"
    done < <(find . -name "${arr[0]}" -print0)
done < "$csv"
This uses find to locate each old filename, then uses dirname to get the directory of the old file (which we need so that mv does not place the renamed file into a different directory).
This will rename every instance of each file (i.e., if unnamed_0711-42_p1.mov appears in multiple subdirectories, each instance will be renamed to day1_0711-42_p1.mov). If you know each file name will only appear once, you can speed things up a bit by adding -quit after the -print0 in the find command, as shown below.
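With that change, find stops searching after the first match for each name; a sketch under the same assumptions as above:
csv=files.csv
while IFS=, read -ra arr; do
    while IFS= read -r -d '' old_file; do
        mv "$old_file" "$(dirname "$old_file")/${arr[1]}"
    done < <(find . -name "${arr[0]}" -print0 -quit)
done < "$csv"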
The script below
while IFS=, read -ra arr    # -r to prevent mangling backslashes
do
    find . -type f -name "${arr[0]}" -printf "mv '%p' '%h/${arr[1]}'\n" | bash
done < csvfile
should do it.
See the find manpage to understand what printf format specifiers like %p and %h do.
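For example, with a hypothetical match under ./clips/, the -printf format expands %p to the full path and %h to its leading directory, emitting the mv command that then gets piped to bash:
find . -type f -name "unnamed_0711-42_p1.mov" -printf "mv '%p' '%h/day1_0711-42_p1.mov'\n"
# prints (hypothetical layout): mv './clips/unnamed_0711-42_p1.mov' './clips/day1_0711-42_p1.mov'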

Read filenames with embedded whitespace into an array in a shell script

Basically I'm searching for a multi-word file which is present in many directories using the find command, and the output is stored in a variable vari
vari = `find -name "multi word file.xml"`
When I try to delete the file using a for loop to iterate through the results,
for file in ${vari[@]}
the execution fails, saying:
rm: cannot remove `/abc/xyz/multi': No such file or directory
Could you guys please help me with this scenario??
If you really need to capture all file paths in an array up front (assumes bash, primarily due to use of arrays and process substitution (<(...))[1]; a POSIX-compliant solution would be more cumbersome[2]; also note that this is a line-based solution, so it won't handle filenames with embedded newlines correctly, but that's very rare in practice):
# Read matches into array `vari` - safely: no word splitting, no
# globbing. The only caveat is that filenames with *embedded* newlines
# won't be handled correctly, but that's rarely a concern.
# bash 4+:
readarray -t vari < <(find . -name "multi word file.xml")
# bash 3:
IFS=$'\n' read -r -d '' -a vari < <(find . -name "multi word file.xml")
# Invoke `rm` with all array elements:
rm "${vari[#]}" # !! The double quotes are crucial.
Otherwise, let find perform the deletion directly (these solutions also handle filenames with embedded newlines correctly):
find . -name "multi word file.xml" -delete
# If your `find` implementation doesn't support `-delete`:
find . -name "multi word file.xml" -exec rm {} +
As for what you tried:
vari=`find -name "multi word file.xml"` (I've removed the spaces around =, which would cause vari to be interpreted as a command name) does not create an array; such a command substitution returns the stdout output from the enclosed command as a single string (with trailing newlines stripped).
By enclosing the command substitution in ( ... ), you could create an array:
vari=( `find -name "multi word file.xml"` ),
but that would perform word splitting on the find's output and not properly preserve filenames with spaces.
While this could be addressed with IFS=$'\n' so as to only split at line boundaries, the resulting tokens are still subject to pathname expansion (globbing), which can inadvertently alter the file paths.
While this could also be addressed with a shell option, you now have two settings to change ahead of time and restore to their original values afterwards; thus, using readarray or read as demonstrated above is the simpler choice.
Even if you did manage to collect the file paths correctly in $vari as an array, referencing that array as ${vari[@]} - without double quotes - would break, because the resulting strings are again subject to word splitting, and also pathname expansion (globbing).
To safely expand an array to its elements without any interpretation of its elements, double-quote it: "${vari[@]}"
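A quick illustration of the difference (hypothetical array contents):
vari=("multi word file.xml")
printf '<%s>\n' ${vari[@]}      # unquoted: word-split into <multi> <word> <file.xml>
printf '<%s>\n' "${vari[@]}"    # quoted: a single token, <multi word file.xml>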
[1]
Process substitution rather than a pipeline is used so as to ensure that readarray / read is executed in the current shell rather than in a subshell.
As eckes points out in a comment, if you were to try find ... | IFS=$'\n' read ... instead, read would run in a subshell, which means that the variables it creates will disappear (go out of scope) when the command returns and cannot be used later.
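A quick way to observe that subshell behaviour (a sketch, assuming bash without the lastpipe option enabled):
unset first
find . -name '*.xml' | read -r first    # read runs in a subshell of the pipeline
echo "${first:-<unset>}"                # prints <unset>: the assignment did not survive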
[2]
The POSIX shell spec. supports neither arrays nor process substitution (nor readarray, nor any read options other than -r); you'd have to implement line-by-line processing as follows:
while IFS='
' read -r vari; do
    rm "$vari"    # process one path per line; rm here, to match the task
done <<EOF
$(find . -name "multi word file.xml")
EOF
Note that an actual newline is required between IFS=' and ' in order to assign a newline, given that the $'\n' syntax is not available.
Here are a few approaches:
# change the input field separator to a newline to ignore spaces
IFS=$'\n'
for file in $(find . -name '* *.xml'); do
    ls "$file"
done

# pipe find result lines to a while loop
IFS=
find . -name '* *.xml' | while read -r file; do
    ls "$file"
done

# feed the while loop with process substitution
IFS=
while read -r file; do
    ls "$file"
done < <(find . -name '* *.xml')
When you're satisfied with the results, replace ls with rm.
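For instance, the last approach with rm substituted in:
IFS=
while read -r file; do
    rm "$file"
done < <(find . -name '* *.xml')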
These are all line-based solutions. There is a test environment at the bottom for which there is no known line-based solution.
As already written, the file could be removed with this tested command:
$ find . -name "multi word file".xml -exec rm {} +
I did not manage to use the rm command with a variable when the path or filename contains \n.
Test environment:
$ mkdir "$(printf "\1\2\3\4\5\6\7\10\11\12\13\14\15\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37\40\41\42\43\44\45\46\47testdir" "")"
$ touch "multi word file".xml
$ mv *xml *testdir/
$ touch "2nd multi word file".xml ; mv *xml *testdir
$ ls -b
\001\002\003\004\005\006\a\b\t\n\v\f\r\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\ !"#$%&'testdir
$ ls -b *testdir
2nd\ multi\ word\ file.xml multi\ word\ file.xml

Replace string with sed in array and store as variable

If I hard-code the csgo path, my code works; but if I use a search function and replace the directory I searched for using sed, my code fails.
#Find directories of CSGO instances to update
updatepaths=`find /home/tcagame/ -type f -name "update_csgo.txt"`
#Splits directories on space to be read from the array
updates=($updatepaths)
#Path to CSGO instances to update
#csgo="/home/tcagame/user/33/csgo/steam.inf"
#Creating automated path
csgo= echo "${updates[0]}" | sed 's,update_csgo.txt,csgo/steam.inf,'
#Check for updates
python $updatecheck $csgo > ~/autoupdate/status/updatestatus.txt
When I echo "$csgo" it creates a new line; I think that's why it's not working.
/home/tcagame/user/33/csgo/steam.inf
[New Line]
This is what I am tryin to achieve in an automated style:
python srcupdatecheck /home/tcagame/iceman/206/csgo/steam.inf
Using mapfile to read the lines of find output into an array is safer than relying on word splitting: the only trouble you'll have is if a filename contains a newline character.
mapfile -t updates < <(find /home/tcagame/ -type f -name "update_csgo.txt")
Here, you only need parameter expansion, not sed:
csgo="${updates[0]%update_csgo.txt}csgo/steam.inf"
Or, let find do more of the heavy lifting for you:
mapfile -t update_dirs < <(
find /home/tcagame/ -type f -name "update_csgo.txt" -exec dirname '{}' \;
)
csgo="${update_dirs[0]}/csgo/steam.inf"

Looking to take only main folder name within a tarball & match it to folders to see if it's been extracted

I have a situation where I need to keep .tgz files & if they've been extracted, remove the extracted directory & contents.
In all examples, the only top-level directory within the tarball has a different name than the tarball itself:
[host1]$ find / -name "*\#*.tgz" #(has an # symbol somewhere in the name)
/1-#-test.tgz
[host1]$ tar -tzvf /1-#-test.tgz | head -n 1 | awk '{ print $6 }'
TJ #(directory name)
What I'd like to accomplish (pulling my hair out; rusty scripting fingers) is to look at each tarball and see if the corresponding directory name (like above) exists. If it does, echo "rm -rf /directoryname" into an output file for review.
I can read all of the tarballs into an array ... but how to check the directories?
Frustrated & appreciate any help.
Maybe you're looking for something like this:
find / -name "*#*.tgz" | while read line; do
dir=$(tar ztf "$line" | awk -F/ '{print $6; exit}')
test -d "$dir" && echo "rm -fr '$dir'"
done
Explanation:
We iterate over the *#*.tgz files found with a while loop, line by line.
Get the list of files in the tgz file with tar ztf "$line".
Since paths are separated by /, use that as the separator in the awk and print the first field, which is the top-level directory. After the print we exit, making this equivalent to, but more efficient than, using head -n1 first.
With dir=$(...) we put the entire output of the tar..awk chain, thus the top-level directory of the first entry in the tar, into the variable dir.
We check whether such a directory exists; if yes, we echo an rm command so you can review it and execute it later if it looks good.
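To see what the tar-to-awk step extracts, you can feed it a sample listing (hypothetical archive contents):
printf 'TJ/\nTJ/notes.txt\nTJ/sub/more.txt\n' | awk -F/ '{print $1; exit}'    # -> TJ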
My original answer used a find ... -exec but I think that's not so good in this particular case:
find / -name "*#*.tgz" -exec \
sh -c 'dir=$(tar ztf "{}" | awk -F/ "{print \$6; exit}");\
test -d "$dir" && echo "rm -fr \"$dir\""' \;
It's not so good because it runs sh for every file, and since we are embedding {} inside the sh -c string, we lose the usual benefit of a typical find ... -exec, where special characters in {} are handled correctly.
