Compare strings inside two different directories using arrays

I don't understand what this code does. All I want is to compare the files listed below, but when I run this script nothing happens. I assume the script can be executed from anywhere, such as /root, and it will still run. Please check it out.
#!/bin/bash
for file in /var/files/sub/old/*
do
# Strip path from file name
file="${file##*/}"
# Strip everything after the first hyphen
prefix="${file%%-*}-"
# Strip everything before the second-to-last dot
suffix="$(echo $file | awk -F. '{ print "."$(NF-1)"."$NF }')"
# Create new file name from $prefix and $suffix, and any version number
new=$(echo "/var/files/new/${prefix}"*"${suffix}")
# If file exists in the 'new' folder:
if test -f "${new}"
then
# Do string comparison to see if new file is lexicographically "greater than" old
if [[ "${new##*/}" > "${file}" ]]
then
# If so, delete the old version.
rm /var/sub/files/old/"${file}"
else
# 'new' file is NOT newer, delete it instead.
rm "${new}"
fi
fi
done
# Move all new files into the old folder.
mv /var/files/new/* /var/files/sub/old/
Example files inside each subdirectory:
/var/files/sub/old/
firefox-24.5.0-1.el5_10.i386.rpm
firefox-24.5.0-1.el5_10.x86_64.rpm
google-1.6.0-openjdk-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
google-1.6.0-openjdk-demo-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
/var/files/new/
firefox-25.5.0-1.el5_10.i386.rpm
firefox-25.5.0-1.el5_10.x86_64.rpm
ie-1.6.0-openjdk-devel-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
ie-1.6.0-openjdk-javadoc-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
ie-1.6.0-openjdk-src-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
google-2.6.0-openjdk-demo-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
In this instance, I want to get the files that are the same. So the files that are the same in the given example are:
firefox-24.5.0-1.el5_10.i386.rpm
firefox-24.5.0-1.el5_10.x86_64.rpm
google-1.6.0-openjdk-demo-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
in the old/ directory and for the new/ directory the equivalents are:
firefox-25.5.0-1.el5_10.i386.rpm
firefox-25.5.0-1.el5_10.x86_64.rpm
google-2.6.0-openjdk-demo-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
The matching files share the same leading characters, and they should be displayed in the terminal. After that, the matching files are compared again to decide which one is more up to date, based on the version number after the name: for example, firefox-24.5.0-1.el5_10.i386.rpm compared with firefox-25.5.0-1.el5_10.i386.rpm. In that case firefox-24.5.0-1.el5_10.i386.rpm will be replaced by firefox-25.5.0-1.el5_10.i386.rpm because the latter has the greater version and is more up to date; the same applies to the other matching files. The old file is removed and the new one takes its place.
So at this moment after the script has been executed the output will be like this.
/var/files/sub/old/
google-1.6.0-openjdk-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
firefox-25.5.0-1.el5_10.i386.rpm
firefox-25.5.0-1.el5_10.x86_64.rpm
ie-1.6.0-openjdk-devel-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
ie-1.6.0-openjdk-javadoc-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
ie-1.6.0-openjdk-src-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
google-2.6.0-openjdk-demo-1.6.0.0-5.1.13.3.el5_10.x86_64.rpm
/var/files/new/
<<empty: all files here must be moved to the other directory as replacements>>
Can anyone help me make a script for this? The above is just an example. Let's assume there are lots of files that need to be matched, removed and moved.

You can use rpm to get the name of the package without version or architecture strings:
rpm -qi -p /firefox-25.5.0-1.el5_10.i386.rpm
Gives:
Name : firefox
Version : 25.5.0
Release : 1.el5_10
Architecture: i386
....
So you can compare the Names to find related packages.

If the goal here is to have the newrpms directory have only the newest version of each RPM from a combination of sources then you most likely want to simply combine all the files in a single directory and then use the repomanage tool (from the yum-utils package, at least on CentOS) to have it inform you which of the RPMS are old and remove them.
Something like:
repomanage --old combined_rpms_directory | xargs -r rm
As to your initial script
for i in $(\ls -d ./new/*);
do
diff ${i} newrpms/;
rm ${i}
done
You generally don't want to "parse" the output from ls, especially when a glob will do what you want just as easily (for i in ./new/* in this case).
diff ${i} newrpms/ is attempting to diff a file and a directory (or two directories if your ls/glob happened to catch a directory) but in neither case will diff do what you want there. That being said what diff does doesn't really matter because, as Barmar said in his comment
your script is removing them without testing the result of diff

A bash script that does the checking. Here's how it works:
Traverse over each file in the old files directory. Get the prefix (package name with no version, architecture, etc), eg. firefox-; get the suffix (architecture.rpm), eg. .i386.rpm.
Attempt to match prefix and suffix with any version number within the new files directory, ie. firefox-*.i386.rpm. If there is a match, $new will contain the file name, eg. firefox-25.5.0-1.el5_10.i386.rpm; if no match, $new will equal the literal string firefox-*.i386.rpm which is not a file.
Check new files directory for existence of $new.
If it exists, check that $new is indeed newer than the old version. This is done by lexicographical string comparison, ie. firefox-24.5.0-1.el5_10.i386.rpm is less than firefox-25.5.0-1.el5_10.i386.rpm because it comes earlier in the alphabet. Conveniently, sane versioning schemes also happen to be alphabetical. NB: this may fail, for example, when comparing version 2 to version 10.
A new version of a file in the old files directory has been found! In this case, get rid of the old file with rm. If the file in the new directory is not newer, then delete it instead.
Done removing old versions. Old files directory has only files without newer versions.
Move all new files into old directory, leaving newest files in old directory, and new directory empty.
#!/bin/bash
for file in /var/files/sub/old/*
do
# Strip path from file name
file="${file##*/}"
# Strip everything after the first hyphen
prefix="${file%%-*}-"
# Strip everything before the second-to-last dot
suffix="$(echo "$file" | awk -F. '{ print "."$(NF-1)"."$NF }')"
# Create new file name from $prefix and $suffix, and any version number
new=$(echo "/var/files/new/${prefix}"*"${suffix}")
# If file exists in the 'new' folder:
if test -f "${new}"
then
# Do string comparison to see if new file is lexicographically "greater than" old
if [[ "${new##*/}" > "${file}" ]]
then
# If so, delete the old version.
rm /var/files/sub/old/"${file}"
else
# 'new' file is NOT newer, delete it instead.
rm "${new}"
fi
fi
done
# Move all new files into the old folder.
mv /var/files/new/* /var/files/sub/old/
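The caveat about version 2 versus version 10 can be worked around with a version-aware sort. The following is only a sketch, assuming GNU sort (its -V option compares embedded numbers numerically); the helper name is_newer is made up for illustration and is not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper: succeed if $1 sorts strictly after $2 in version order.
# Assumes GNU sort, whose -V option compares embedded numbers numerically.
is_newer() {
    [[ "$1" != "$2" ]] &&
        [[ "$(printf '%s\n' "$1" "$2" | sort -V | tail -n 1)" == "$1" ]]
}

# A plain string comparison ranks "10" before "2"; version sort does not.
if is_newer "firefox-10.0.0-1.el5_10.i386.rpm" "firefox-2.0.0-1.el5_10.i386.rpm"; then
    echo "10.0.0 is newer"
fi
```

In the script, the [[ "${new##*/}" > "${file}" ]] test could then be replaced by is_newer "${new##*/}" "${file}".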

Related

How do I create a crontab job in unix that will move all files in my home directory to another directory at a specific time/date?

Hello I'm new to Unix and I am trying to create a crontab job that moves all the files I have in my home directory where the name contains the letter f followed by a digit 1,3 or 7 to a directory called backups, on the 12th of April and November at 9:30 PM.
This is my home directory:
arsenal.by flhome list1 stmnpgs
arsenal.pass flhome2 list2 test.c
assignment foreachScript1 list2.c testdir
availisting.csv funxdir local.cshrc testfile
backups funxdir2 local.login tmp.test
backups1 homlnk local.profile train
biglist lab4 myfile treat
biglist.c lab5 myfile2 trick
biglist2 lab6 Myhome.list tricking
CFiles.tar.Z lab7 myinfo.fl troll
clssnotes.txt lab8 myList typescript
delfh lec3 names.txt workdir
If anyone could help me out with this it'd be much appreciated!
Firstly, home-rolling a backup solution for work, professional or college, is usually a bad idea: the stakes of an error are potentially very high, and local backups can obviously be lost to whatever makes the original files inaccessible.
However it's a worthwhile exercise to show how you would do it in cron as it's a frequent type of task and it would provide you some cover while looking for a better solution.
Your date specification can safely be written as one cron entry because only the month differs between the two dates; if the time of day or the day of the month (or the day of week) also differed, you would need two entries.
# M H DoM MoY DoW
30 21 12 4,11 * BACKUPDIR=~/backups; ds=$(date +\%Y\%m\%d\%H\%M\%S); mkdir -p "$BACKUPDIR"; find ~/* -type d -prune -o -type f -name f\*\[137\] -exec sh -c 'mv "$1" "$2/${1##*/}.$3"' sh {} "$BACKUPDIR" "$ds" \;
The find command is told to look at all entries in your home directory that do not start with a . ("visible" files), if they are directories to ignore them (do not descend the directory tree) and if they are files that start with an f to move them (not copy them) to the $BACKUPDIR. If you wanted any file containing an f instead the find pattern would be \*f\*\[137\]
Above we define two variables for the backup dir and a datestamp (the \ before each % is needed because % is a special character to cron).
The file globbing patterns * and [] are similarly escaped because they are shell special and we want to pass them to the find command.
The reason to use a timestamp is that moving or copying files frequently causes unintentional overwriting of files so if the backup directory path does not contain a date stamp then the target file name should.
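For readability, the same move-with-timestamp idea can be put into a small script that cron calls. This is a minimal sketch rather than a tested backup tool; the function name move_with_stamp and its two arguments are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: move top-level files in $1 whose names match f*[137]
# into $2, appending a timestamp so earlier backups are never overwritten.
move_with_stamp() {
    local src=$1 dest=$2 stamp f
    stamp=$(date +%Y%m%d%H%M%S)
    mkdir -p "$dest"
    for f in "$src"/f*[137]; do
        [[ -f $f ]] || continue      # skip directories and an unmatched glob
        mv -- "$f" "$dest/${f##*/}.$stamp"
    done
}

# Example call (commented out so that sourcing this file moves nothing):
# move_with_stamp "$HOME" "$HOME/backups"
```

Because the glob is expanded by the shell, nothing needs escaping here; the escaping is only required inside the crontab line.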
Lastly it might be better to use a tar command to create a compressed date stamped archive that you can easily copy elsewhere, a local backup directory is asking for trouble, particularly if nested underneath the directory you are working in.
eg: Something like
#!/bin/bash
backup_file=~/backups/backup.$(date +%Y%m%d%H%M%S).tar.gz
tar czf $backup_file $(find ~/* -type d -prune -o -type f -name f\*\[137\] -print)
# <Commands to copy the file elsewhere here>
# You should then copy this file elsewhere (another system) or email it to yourself (after possibly encrypting it)

bash array of file locations - how to find last updated file?

Have an array of files built from a locate command that I need to cycle through and figure out the latest and print the latest. We have a property file called randomname-properties.txt that is in multiple locations and is sometimes called randomname-properties.txt.bak or randomname-properties.txt.old. Example is below
Directory structure
/opt/test/something/randomname-properties.txt
/opt/test2/something/randomname-properties.txt.old
/opt/test3/something/randomname-properties.txt.bak
/opt/test/something1/randomname-properties.txt.working
Code
#Builds list of all files
PropLoc=(`locate randomname-properties.txt`)
#Parse list and remove older file
for i in ${PropLoc[@]} ; do
if [ ${PropLoc[0]} -ot ${PropLoc[1]} ] ; then
echo "Removing ${PropLoc[0]} from the list as it is older"
#Below should rebuild the array while removing the older element
PropLoc=( "${PropLoc[@]/$PropLoc[0]}" )
fi
done
echo "Latest file found is ${PropLoc[@]}"
Overall this isn't working. It currently appears that it doesn't even go into the loop as the first two files have the same timestamp of last year (doesn't appear to deconflict down past the day for things older than a year). Any thoughts on how to get this to work properly? Thank you
You can use ls -t, which will sort the files by modification time. The first line will then be the newest file.
newest=$(ls -t "${PropLoc[@]}" | head -n 1)
This should work as long as none of the filenames contain newlines.
Don't forget to quote your variables in case they contain whitespace or wildcard characters.
Without parsing the output of ls:
#!/usr/bin/env bash
latest=
while read -r -d '' file; do
if [ "$file" -nt "$latest" ]; then
latest=$file
fi
done < <(locate --null randomname-properties.txt)
printf 'Latest file found is %s\n' "$latest"
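If the paths are already held in an array, as in the question, the same -nt test can be applied to the array elements directly. A small sketch, with newest_of being a made-up function name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print whichever of the given paths was modified last.
newest_of() {
    local latest= f
    for f in "$@"; do
        # The first path seeds the comparison; later ones replace it if newer.
        if [[ -z $latest || $f -nt $latest ]]; then
            latest=$f
        fi
    done
    printf '%s\n' "$latest"
}

# Example call with the array from the question (commented out):
# newest_of "${PropLoc[@]}"
```

Quoting "${PropLoc[@]}" keeps paths with whitespace intact, one argument per element.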

Loop thru a filename list and iterate thru a variable/array removing all strings from filenames with bash

I have a list of strings that I have in a variable and would like to remove those strings from a list of filenames. I'm pulling that string from a file that I can add to and modify over time. Some of the strings in the variable may include part of the item needed to be removed while the other may be another line in the list. That's why I need to loop thru the entire variable list.
I'm familiar using a while loop to loop thru a list but not sure how I can loop thru each line to remove all strings from that filename.
Here's an example:
getstringstoremove=$(cat /text/from/some/file.txt)
echo "$getstringstoremove"
# Or the above can be an array
getstringstoremove=$(cat /text/from/some/file.txt)
declare -a arr=($getstringstoremove)
the above 2 should return the following lines
-SOMe.fil
(Ena)M-3_1
.So[Me].filEna)M-3_2
SOMe.fil(Ena)M-3_3
Here's the loop I was running to grab all filenames from a directory and remove anything other than the filenames
ls -l "/files/in/a/folder/" | awk -v N=9 '{sep=""; for (i=N; i<=NF; i++) {printf("%s%s",sep,$i); sep=OFS}; printf("\n")}' | while read line; do
echo "$line"
returns the following result after each loop
# 1st loop
ilikecoffee1-SOMe.fil(Ena)M-3_1.jpg
# iterate thru $getstringstoremove to remove all strings from the above file.
# 2nd loop
ilikecoffee2.So[Me].filEna)M-3_2.jpg
# iterate thru $getstringstoremove again
# 3rd loop
ilikecoffee3SOMe.fil(Ena)M-3_3.jpg
# iterate thru $getstringstoremove and again
done
the final desired output would be the following
ilikecoffee1.jpg
ilikecoffee2.jpg
ilikecoffee3.jpg
I'm running this in bash on Mac.
I hope this makes sense as I'm stuck and can use some help.
If someone has a better way of doing this by all means it doesn't have to be the way I have it listed above.
You can get the new filenames with this awk one-liner:
$ awk 'NR==FNR{a[$0];next} {for(i in a){n=index($0,i);if(n){$0=substr($0,1,n-1)substr($0,n+length(i))}}} 1' rem.txt files.lst
This assumes your exclusion strings are in rem.txt and there's a files list in files.lst.
Spaced out for easier commenting:
NR==FNR { # suck the first file into the indices of an array,
a[$0]
next
}
{
for (i in a) { # for each file we step through the array,
n=index($0,i) # search for an occurrence of this string,
if (n) { # and if found,
$0=substr($0,1,n-1)substr($0,n+length(i))
# rewrite the line with the string missing,
}
}
}
1 # and finally, print the line.
If you stow the above script in a file, say foo.awk, you could run it as:
$ awk -f foo.awk rem.txt files.lst
to see the resultant files.
Note that this just shows you how to build new filenames. If what you want is to do this for each file in a directory, it's best to avoid running your renames directly from awk, and use shell constructs designed for handling files, like a for loop:
for f in path/to/*.jpg; do
mv -v "$f" "$(awk -f foo.awk rem.txt - <<<"$f")"
done
This should be pretty obvious except perhaps for the awk options, which are:
-f foo.awk, use the awk script from this filename,
rem.txt, your list of removal strings,
-, a hyphen indicating that standard input should be used IN ADDITION to rem.txt, and
<<<"$f", a "here-string" to provide that input to awk.
Note that this awk script will work with both gawk and the non-GNU awk that is included in macOS.
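The same literal-string removal can also be sketched in bash alone, using parameter expansion. This is an illustration under assumptions: strip_strings is a made-up helper, and the removal strings are passed as arguments rather than read from rem.txt.

```shell
#!/bin/bash
# Hypothetical helper: remove every literal occurrence of each remaining
# argument from the name given as the first argument.
strip_strings() {
    local name=$1 s
    shift
    for s in "$@"; do
        [[ -n $s ]] || continue      # ignore empty removal strings
        name=${name//"$s"/}          # quoting "$s" disables glob matching
    done
    printf '%s\n' "$name"
}

strip_strings 'ilikecoffee2.So[Me].filEna)M-3_2.jpg' \
    '-SOMe.fil' '(Ena)M-3_1' '.So[Me].filEna)M-3_2' 'SOMe.fil(Ena)M-3_3'
# prints: ilikecoffee2.jpg
```

Quoting the pattern matters here because the removal strings contain characters such as [ ] and ( ) that would otherwise be treated as glob syntax.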
I think I have understood what you mean, and I would do it with Perl which comes built-in to the standard macOS - so nothing to install.
I assume you have a file called remove.txt with your list of stuff to remove, and that you want to run the script on all files in your current directory. If so, the script would be:
#!/usr/bin/perl -w
use strict;
# Load the strings to remove into array "strings"
my @strings = `cat remove.txt`;
for(my $i=0;$i<=$#strings;$i++){
# Strip trailing newlines and quote metacharacters - e.g. *()[]
chomp($strings[$i]);
$strings[$i] = quotemeta($strings[$i]);
}
# Iterate over all filenames
my @files = glob('*');
foreach my $file (@files){
my $new = $file;
# Iterate over replacements
foreach my $string (@strings){
$new =~ s/$string//;
}
# Check if name would change
if($new ne $file){
if( -f $new){
printf("Cowardly refusing to rename %s as %s since it involves overwriting\n",$file,$new);
} else {
printf("Rename %s as %s\n",$file,$new);
# rename $file,$new;
}
}
}
Then save that in your HOME directory as renamer. Make it executable - only necessary once - with this command in Terminal:
chmod +x $HOME/renamer
Then you can go into any directory where your madly named files are and run the script like this:
cd path/to/mad/files
$HOME/renamer
As with all things you download off the Internet, make a backup first and just run on a small, copied, subset of your files till you get the idea of how it works.
If you use homebrew as your package manager, you could install rename using:
brew install rename
You could then take all the Perl from my other answer and condense it down to a couple of lines and embed it in a rename command which would give you the added benefit of being able to do dry-runs etc. The code below does exactly the same as my other answer but is somewhat harder to read for non-Perl folk.
Your command would simply be:
rename --dry-run '
my @strings = map { s/\r|\n//g; $_=quotemeta($_) } `cat remove.txt`;
foreach my $string (@strings){ s/$string//; } ' *
Sample Output
'ilikecoffee(Ena)M-3_1' would be renamed to 'ilikecoffee'
'ilikecoffee-SOMe.fil' would be renamed to 'ilikecoffee'
'ilikecoffee.So[Me].filEna)M-3_2' would be renamed to 'ilikecoffee'
To try and understand it, remember:
the rename part applies the following Perl to each file because of the asterisk at the end
the @strings part reads all the strings from the file remove.txt and removes any carriage returns and linefeeds from them and quotes any metacharacters
the foreach applies each of the deletions to the current filename which rename stores in $_ for you
Note that this method trades simplicity for performance somewhat. If you have millions of files to do, the other method will be quicker because here I read the remove.txt file for each and every file whose name is checked, but if you only have a few hundred/thousand files, I doubt you'll notice it.
This should be much the same, just shorter:
rename --dry-run '
my @strings = `cat remove.txt`; chomp @strings;
foreach my $string (@strings){ s/\Q$string\E//; } ' *

Script for renaming files and directories with special characters

I am looking for a script to rename files and directories that have special characters in them.
My files:
?rip?ev <- Directory
- Juhendid ?rip?evaks.doc <- Document
- ?rip?ev 2 <- Subdirectory
-- t?ts?.xml <- Subdirectory file
They need to be like this:
ripev <- Directory
- Juhendid ripevaks.doc <- Document
- ripev 2 <- Subdirectory
-- tts.xml <- Subdirectory file
I need to rename the files and the folders so that the file type stays the same; for example, .doc and .xml must not be lost. Last time I did it with rename, every file lost its file type and the files were moved to the parent directory (in this case the ?rip?ev directory), leaving the subdirectories empty. Everything is located under the parent directory /home/samba/.
So in this case I just need to remove the question marks from the file and directory names, without moving anything elsewhere or losing any other character or the file type. I have been looking around Google for an answer but haven't found one. I know it can be done with find and rename, but I haven't been able to overcome the complexity of the script. Can anyone help me please?
You can just do something like this
find -name '*\?*' -exec bash -c 'echo mv -iv "$0" "${0//\?/}"' {} \;
Note the echo before the mv so you can see what it does before actually changing anything. Otherwise above:
searches for ? in the name (? is equivalent to a single char version of * so needs to be escaped)
executes a bash command passing the {} as the first argument (since there is no script name it's $0 instead of $1)
${0//\?/} performs parameter expansion in bash replacing all occurrences of ? with nothing.
Note also that file types do not depend on the name in linux, but this should not change any file extension unless they contain ?.
Also this will rename symlinks as well if they contain ? (not clear whether or not that was expected from question).
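When directories and the files inside them both contain ?, the order of renames matters: a parent renamed first invalidates the paths of its children. The sketch below is one possible way to handle that, not a drop-in replacement for the one-liner. It assumes a find that supports -depth and -print0, and the function name strip_qmarks is invented for illustration. It processes the deepest entries first and changes only the last path component.

```shell
#!/bin/bash
# Hypothetical helper: remove '?' from the names of all entries under $1,
# deepest entries first, changing only the final path component.
strip_qmarks() {
    local top=$1 path dir base new
    find "$top" -depth -name '*\?*' -print0 |
    while IFS= read -r -d '' path; do
        if [[ $path == */* ]]; then dir=${path%/*}; else dir=.; fi
        base=${path##*/}
        new=${base//\?/}
        [[ -n $new ]] || continue         # name consisted only of '?'
        mv -n -- "$path" "$dir/$new"      # -n: never overwrite an existing entry
    done
}

# Example call (commented out; try it on a copy of the data first):
# strip_qmarks /home/samba
```

Since only the basename is edited, files stay in their own directories and extensions such as .doc and .xml are untouched.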
I usually do this kind of thing in Perl:
#!/usr/bin/perl
sub procdir {
chdir $_[0];
for (<*>) {
my $oldname = $_;
rename($oldname, $_) if s/\?//g;
procdir($_) if -d;
}
chdir "..";
}
procdir("top_directory");

Mercurial, stop versioning cache directory but keep directory

I have a CakePHP project under Mercurial version control. Right now all the files in the app/tmp directory are being versioned, which are always changing.
I do not want to version control these files.
I know I can stop by running hg forget app/tmp/*
But this will also forget the file structure. Which I want to keep.
Now I know that Mercurial doesn't version directories, just files, but the CakePHP folks were also smart enough to put an empty file called empty in every empty directory (I am guessing for this reason).
So what I want to do is tell Mercurial to forget every file under app/tmp except files whose name is exactly empty.
What would the command be for this?
Well, if nothing else works, you can always just ask Mercurial to forget everything, and then revert empty before committing:
Here's how I reproduced it, first create initial repo:
hg init
md app
md app\tmp
echo a>app\empty
echo a>app\tmp\empty
hg commit -m "initial" -A
Then add some files we later want to get rid of:
echo a >app\tmp\test1.txt
echo a >app\tmp\test2.txt
hg commit -m "adding" -A
Then forget the files we don't want:
hg forget app\tmp\*
hg status <-- will show all 3 files
hg revert app\tmp\empty
hg status <-- now empty is gone
echo glob:app/tmp>.hgignore
hg commit -m "ignored" -A
Note that all .hgignore does is to prevent Mercurial from discovering new files during addremove or commit -A, if you have explicitly tracked files that match your ignore filter, Mercurial will still track changes to those files.
In other words, even though I asked Mercurial to ignore app/tmp above, the file empty inside will not be ignored, or removed, since I have explicitly asked Mercurial to track it.
At least theoretically (I don't have time to try it right now), pattern matching should work with the hg forget command. So, you could do something like hg forget -X empty while in the directory (-X means "exclude").
You may want to consider using .hgignore, of course.
Since you only need to do it once I'd just do this:
find app/tmp -type f | grep -v empty | xargs hg forget
hg commit
From then on, just put this in your .hgignore:
^app/tmp
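Note that grep -v empty drops every path containing the text "empty", which would also skip a file such as nonempty.log. If only files named exactly empty should be kept, find can do the filtering itself. A minimal sketch, with list_forgettable as a made-up helper name:

```shell
#!/bin/bash
# Hypothetical helper: list every regular file under $1 except those whose
# name is exactly "empty".
list_forgettable() {
    find "$1" -type f ! -name empty
}

# Example call (commented out; assumes file names without whitespace):
# list_forgettable app/tmp | xargs hg forget
```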
Mercurial has built-in support for globbing and regexes, as explained in the relevant chapter in the mercurial book. The python regex implementation is used.
This should work for you:
hg forget "re:app/tmp/.*(?<!/empty)$"
