Unique file names in a directory in Unix

I have a capture file in a directory into which some logs are being written:
word.cap
A script clears this file whenever its size reaches exactly 1.6 GB, and saves the old contents in the same directory in the following format:
word.cap.COB2T_1389889231
word.cap.COB2T_1389958275
word.cap.COB2T_1390035286
word.cap.COB2T_1390132825
word.cap.COB2T_1390213719
Now I want to pick up these files in a script, one by one, and perform some actions on them. My script is:
today=$(date +%d_%m_%y)
grep -E '^IPaddress|^Node' /var/rawcap/word.cap.COB2T* | awk '{print $3}' >> "snmp$today.txt"
sort -u "snmp$today.txt" > "snmp_final_$today.txt"
So, what should I write to pick up all file names of the above format, one by one? I will place this script in crontab, but I don't want to read the main word.cap file, as it is still being written to.

As per your comment:
Thanks, this is working, but I have a small issue: some files are bzipped, i.e. word.cap.COB2T_1390213719.bz2, and I don't want those files in the list. What should be done?
You could add a condition inside the loop:
for file in word.cap.COB2T*; do
    if [[ "$file" != *.bz2 ]]; then
        # Do something here
        echo "$file"
    fi
done
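Putting it together with your original commands, a minimal sketch (assuming the archive names stay free of whitespace, as shown above). Note that the glob word.cap.COB2T* never matches the live word.cap itself, so the file still being written is left alone:
#!/bin/bash
today=$(date +%d_%m_%y)
for file in /var/rawcap/word.cap.COB2T*; do
    [[ "$file" == *.bz2 ]] && continue    # skip the bzipped archives
    grep -E '^IPaddress|^Node' "$file" | awk '{print $3}' >> "snmp$today.txt"
done
sort -u "snmp$today.txt" > "snmp_final_$today.txt"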


bash array of file locations - how to find last updated file?

I have an array of files built from a locate command that I need to cycle through to figure out the latest one and print it. We have a property file called randomname-properties.txt that exists in multiple locations and is sometimes called randomname-properties.txt.bak or randomname-properties.txt.old. An example is below.
Directory structure
/opt/test/something/randomname-properties.txt
/opt/test2/something/randomname-properties.txt.old
/opt/test3/something/randomname-properties.txt.bak
/opt/test/something1/randomname-properties.txt.working
Code
#Builds list of all files
PropLoc=(`locate randomname-properties.txt`)
#Parse list and remove older file
for i in ${PropLoc[@]} ; do
    if [ ${PropLoc[0]} -ot ${PropLoc[1]} ] ; then
        echo "Removing ${PropLoc[0]} from the list as it is older"
        #Below should rebuild the array while removing the older element
        PropLoc=( "${PropLoc[@]/$PropLoc[0]}" )
    fi
done
echo "Latest file found is ${PropLoc[@]}"
Overall, this isn't working. It currently appears that it doesn't even enter the loop, as the first two files have the same timestamp from last year (the listing doesn't resolve timestamps below the day for files older than a year). Any thoughts on how to get this to work properly? Thank you.
You can use ls -t, which will sort the files by modification time. The first line will then be the newest file.
newest=$(ls -t "${PropLoc[#]}" | head -n 1)
This should work as long as none of the filenames contain newlines.
Don't forget to quote your variables in case they contain whitespace or wildcard characters.
Without parsing the output of ls:
#!/usr/bin/env bash
latest=
while read -r -d '' file; do
    if [ "$file" -nt "$latest" ]; then
        latest=$file
    fi
done < <(locate --null randomname-properties.txt)
printf 'Latest file found is %s\n' "$latest"

Shell script [use grep to look up files and extract the specific pattern lines]

I am using the following simple shell script to check a transaction file against the contents of a master file, and just output the matching lines from the transaction file.
Transaction file contains:
this is a sample line - first
this is a sample line - second
this is a sample line - nth line
Master File contains:
first
Output:
this is a sample line - first
for tranfile in transactionfile.txt
do
    grep -f MasterFile.txt $tranfile >> out.txt
done
PS: When I execute this line outside of the above shell script it works like a charm; it just won't return anything from within the script.
What am I missing?
Without the script's output, or knowing which shell you use, I'm just guessing, but I suspect you either fail to find grep in the $PATH (or are using a different version of grep), fail to find one of the files, or are running one shell on the command line and a different shell in the script.
Try adding a shebang with the correct shell to the script, use the full path to grep (usually /bin/grep or /usr/bin/grep), and also give the full paths to the files you are reading.
To help debug, I suggest you add set -x at the top of the script, so the shell will print what it is doing and you can see what is missing. The set -x may be replaced with a -x option in the shebang (for example #!/bin/bash -x).
set -x also works on the command line; use set +x to disable it.
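As a minimal sketch of that advice (the absolute paths here are placeholders for wherever your files actually live):
#!/bin/bash
set -x    # trace each command as the shell runs it; remove once the problem is found
/usr/bin/grep -f /path/to/MasterFile.txt /path/to/transactionfile.txt >> /path/to/out.txt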
I would have done this using awk
awk 'FNR==NR {a=$0;next} $0~a' master transaction
FNR==NR {a=$0;next} stores the line from the master file in the variable a.
$0~a tests every line of transaction for whether it matches a; if it does, the default action prints the line.
If the master file contains more than one word, test the last field against all the words like this:
awk 'FNR==NR {a[$0];next} $NF in a' master transaction
If the word can be anywhere on the line:
awk 'FNR==NR {a[$0];next} {for (i in a) if ($0~i) print}' master transaction
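Since plain grep was your starting point, it can also cover the match-anywhere case; a one-line sketch, where -F treats each master entry as a literal string rather than a regular expression:
grep -Ff MasterFile.txt transactionfile.txt > out.txt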

Bash Array Script Exclude Duplicates

So I have written a bash script (named music.sh) for a Raspberry Pi to perform the following functions:
When executed, look into one single directory (Music folder) and select a random folder to look into. (Note: none of these folders here have subdirectories)
Once a folder within "Music" has been selected, then play all mp3 files IN ORDER until the last mp3 file has been reached
At this point, the script would go back to the folders in the "Music" directory and select another random folder
Then it would again play all mp3 files in that folder in order
Loop indefinitely until input from user
I have this code which does all of the above EXCEPT for the following items:
I would like it to NOT play any "album" that has already been played
Once all albums played once, then shutdown the system
Here is my code so far that is working (WITH duplicates allowed):
#!/bin/bash
folderarray=($(ls -d /home/alphekka/Music/*/))
for i in "${folderarray[@]}";
do
    folderitems=(${folderarray[RANDOM % ${#folderarray[@]}]})
    for j in "${folderitems[@]}";
    do
        echo `ls $j`
        cvlc --play-and-exit "${j[@]}"
    done
done
exit 0
Please note that no folder or file has a space in its name; if there is a space, then I face some issues with this code working.
Anyways, I'm getting close, but I'm not quite there with the entire functionality I'm looking for. Any help would be greatly appreciated! Thank you kindly! :)
Use an associative array as a set. Note that this will work for all valid folder and file names.
#!/bin/bash
declare -A folderarray
# Each folder name is a key mapped to an empty string
for d in /home/alphekka/Music/*/; do
    folderarray["$d"]=
done
while [[ "${!folderarray[*]}" ]]; do
    # Get a list of the remaining folder names
    foldernames=( "${!folderarray[@]}" )
    # Pick a folder at random
    folder=${foldernames[RANDOM%${#foldernames[@]}]}
    # Remove the folder from the set
    # Must use single quotes; see below
    unset folderarray['$folder']
    for j in "$folder"/*; do
        cvlc --play-and-exit "$j"
    done
done
Dealing with keys that contain spaces (and possibly other special characters) is tricky. The quotes shown in the call to unset above are not syntactic quotes in the usual sense. They do not prevent $folder from being expanded, but they do appear to be used by unset itself to quote the resulting string.
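A small demonstration of that quirk, as a sketch (the array and key names here are invented; in bash 4.x the unset builtin expands the subscript itself, so the single quotes stop the shell from expanding and word-splitting it first):
declare -A seen
key='name with spaces'
seen["$key"]=1
echo "${#seen[@]}"    # prints 1
unset seen['$key']    # unset evaluates $key while locating the element
echo "${#seen[@]}"    # prints 0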
Here's another solution: randomize the list of directories first, save the result in an array and then play (my script just prints) the files from each element of the array
MUSIC=/home/alphekka/Music
OLDIFS=$IFS
IFS=$'\n'
folderarray=($(ls -d $MUSIC/*/ | while read line; do echo $RANDOM $line; done | sort -n | cut -f2- -d' '))
for folder in ${folderarray[*]};
do
    printf "Folder: %s\n" $folder
    fileArray=($(find $folder -type f))
    for j in ${fileArray[@]};
    do
        printf "play %s\n" $j
    done
done
For the random shuffling I used this answer.
One-liner solution with mpv, rl (randomlines), xargs, and find:
find /home/alphekka/Music/ -maxdepth 1 -type d -print0 | rl -d \0 | xargs -0 -l1 mpv
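If rl is not available, GNU shuf can shuffle a NUL-separated list the same way; a sketch (with -mindepth 1 added so the Music directory itself is not handed to the player):
find /home/alphekka/Music/ -mindepth 1 -maxdepth 1 -type d -print0 | shuf -z | xargs -0 -n1 mpv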

How to copy second column from all the files in the directory and place them as columns in a new text file

I have 150 tab-delimited text files. I want to copy the 2nd column of each file and paste them side by side in a new text file, so the new file will have 150 columns, the 2nd column from each file. Help me out, guys.
This code ran, but placed each column under the previous one, forming one long column.
for file in *.txt
do
    awk '{print $2}' *.txt > AllCol.txt
done
Here is another approach, without looping:
$ c=$(ls -1 file*.tsv | wc -l); cut -f2 file*.tsv | pr -$c -t
#!/bin/bash
# Be sure the file suffix of the new file is not .txt, or the loop would pick it up
OUT=AllColumns.tsv
rm -f "$OUT"
for file in *.txt
do
    if [ -s "$OUT" ]; then
        paste "$OUT" <(awk -F'\t' '{print $2}' "$file") > "$OUT.tmp"
        mv "$OUT.tmp" "$OUT"
    else
        # Seed the output with the first column to avoid a leading empty column
        awk -F'\t' '{print $2}' "$file" > "$OUT"
    fi
done
One of many alternatives would be to use cut -f 2 instead of awk, but you flagged your question with awk.
Since your files are so regular, you could also skip the do loop, and use a command-line utility such as rs (reshape) or datamash.
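Alternatively, a single awk pass can assemble the columns itself; a sketch assuming every file has the same number of rows (the file*.tsv glob and the AllColumns.tsv name are placeholders):
awk -F'\t' '
    FNR == 1 { col++ }                                  # each new input file starts a new column
    { row[FNR] = (col == 1) ? $2 : row[FNR] "\t" $2 }   # append the 2nd field of the current file
    FNR > nrows { nrows = FNR }
    END { for (i = 1; i <= nrows; i++) print row[i] }
' file*.tsv > AllColumns.tsv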

Need bash to separate a cat'ed string into separate variables and do a for loop

I need to get a list of files added to a master folder and copy only the new files to the respective backup folders. The paths to each folder contain multiple folders, all named by numbers and only one level deep, e.g.
/tester/a/100
/tester/a/101 ...
diff -r typically returns "Only in /testing/a/101: 2093_thumb.png" per line in the generated diff.txt file.
NOTE: there is a space after the colon
I need to get the 101 from the path and filename into separate variables and copy them to the backup folders.
I need the lesserfolder variable to hold 101 without the colon, and the mainfile variable to hold 2093_thumb.png, from each line of diff.txt, and then run the for loop, but I can't seem to get $file to behave. Each time I try to echo the variables I get all the wrong results.
#!/bin/bash
diff_file=/tester/diff.txt
mainfolder=/testing/a
bacfolder= /testing/b
diff -r $mainfolder $bacfolder > $diff_file
LIST=`cat $diff_file`
for file in $LIST
do
    maindir=$file[3]
    lesserfolder=
    mainfile=$file[4]
    # cp $mainfolder/$lesserFolder/$mainfile $bacfolder/$lesserFolder/$mainfile
    echo $maindir $mainfile $lesserfolder
done
If I could just get the echo statement working, then the cp would work too.
I believe this is what you want:
#!/bin/bash
diff_file=/tester/diff.txt
mainfolder=/testing/a
bacfolder=/testing/b
diff -r -q $mainfolder $bacfolder | egrep "^Only in ${mainfolder}" | awk '{print $3,$4}' > $diff_file
cat ${diff_file} | while read foldercolon mainfile ; do
    folderpath=${foldercolon%:}
    lesserFolder=${folderpath#${mainfolder}/}
    cp $mainfolder/$lesserFolder/$mainfile $bacfolder/$lesserFolder/$mainfile
done
But it is much more reliable (and much easier!) to use rsync for this kind of backup. For example:
rsync -a /testing/a/* /testing/b/
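A slightly tighter variant, as a sketch: the trailing slash copies the contents of a/ (including dotfiles, which a/* would miss), and --ignore-existing restricts the transfer to files the backup does not already have:
rsync -a --ignore-existing /testing/a/ /testing/b/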
You could try a while read loop:
diff -r $mainfolder $bacfolder | while read dummy dummy dir file; do
    dir=${dir%:}    # strip the trailing colon diff leaves on the directory
    echo $dir $file
done
