To read each file and execute the func in loop [closed] - c

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I have 5000 files on my hard disk named ip_file_1, ip_file_2, ....
I have an executable that can merge only two files. How can I write a script that takes all the files on the hard disk whose names start with ip_file_ and calls the executable to merge them all?
The 5000 files are binaries that contain logging information (the time each function call has taken). The executable takes only two files, merges them according to timestamp, and writes the merged output.
I run it in a format like the one below:
./trace ip_file1 ip_file2 mergefile # I'm not using the trace tool. It's an example
I can use the executable to merge only two files, so I thought of automating it to merge all the others.
The merges have to be done in order (merged according to the timestamp). The logic to merge is already there, and the output of the merge is written to a file.
My question is not about how to merge the files. It is about how to automate merging all of the files instead of just two.

To avoid excessive number of parameters or length of parameters to a command line, you want to write your merge command so that it can take a previously merged output and merge another file. The description of merge in the original problem statement is quite scant, so I'll make the assumption that you can do this:
merge -o output_file input_file
Where output_file can be a previously merged file or a new file. If you can do that, then it would be simple to merge all of them by:
find drive_path -name "ip_file_*" -exec merge -o output_file {} \;
The order here is directory order in the file system. If a different order is needed, that will need to be specified.
ADDENDUM
If you need the files in timestamp order, then I would revamp this approach and create a merge command that accepts as an input a text file which lists all of the files to merge. Create this list of files using the information given in this post: https://superuser.com/questions/294161/unix-linux-find-and-sort-by-date-modified
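As a rough sketch of how such a list could be produced, assuming GNU find (for -printf) and file names without embedded newlines; list_by_mtime is a name made up for this example:

```bash
# Print the ip_file_* names found directly under directory $1,
# oldest modification time first, one name per line.
# Assumes GNU find (-printf) and no newlines in file names.
list_by_mtime() {
    find "$1" -maxdepth 1 -type f -name 'ip_file_*' -printf '%T@ %p\n' |
        sort -n |          # sort on the leading epoch timestamp
        cut -d ' ' -f 2-   # drop the timestamp, keep the path
}
```

You could then run something like list_by_mtime drive_path > files_to_merge.txt and hand that text file to the revamped merge command.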

Where your external merge tool is real_merge, and this tool writes merged output from two command-line arguments to stdout, the following recursive shell function will do the job:
merge_files() {
    next=$1; shift
    case $# in
        0) cat "$next" ;;
        1) real_merge "$next" "$1" ;;
        *) real_merge "$next" <(merge_files "$@") ;;
    esac
}
This approach is highly parallelized -- which means that it'll use as much CPU and disk IO as is available to it. Depending on your available resources, and your operating system's facility at managing those resources, this may or may not be a good thing.
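To try the recursive idea without the real tool, here is a minimal self-contained sketch in which real_merge is a stand-in built on sort -m (an assumption for the demo only: it merges two already-sorted text files to stdout, which is not what your binary tool does):

```bash
#!/usr/bin/env bash
# Stand-in for the real merge tool (assumption for this demo):
# merge two already-sorted text files and write the result to stdout.
real_merge() { sort -m -- "$1" "$2"; }

# Merge any number of files by feeding the merge of the remaining files
# to real_merge through process substitution.
merge_files() {
    local next=$1; shift
    case $# in
        0) cat -- "$next" ;;                          # single file: nothing to merge
        1) real_merge "$next" "$1" ;;                 # exactly two files
        *) real_merge "$next" <(merge_files "$@") ;;  # recurse on the remainder
    esac
}
```

With three sorted text files a, b and c, merge_files a b c > merged would start one real_merge per remaining file, all running at the same time and connected by pipes.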
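The other approach is to use temporary files. Note that here real_merge is assumed to take the output file as a third argument (as in the question's example) rather than writing to stdout: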
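# swap the values of the two variables whose names are given
swap() {
    local var_curr=$1
    local var_next=$2
    local tmp
    tmp="${!var_curr}"
    # use an explicit '%s' format so the values are never interpreted as format strings
    printf -v "$var_curr" '%s' "${!var_next}"
    printf -v "$var_next" '%s' "$tmp"
}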
merge_files() {
    local tempfile_curr=tempfile_A
    local tempfile_next=tempfile_B
    local tempfile_A="$(mktemp -t sort-wip-A.XXXXXX)"
    local tempfile_B="$(mktemp -t sort-wip-B.XXXXXX)"
    while (( $# )); do
        if [[ -s ${!tempfile_curr} ]]; then
            # we already populated our temporary file: merge the next input into it
            real_merge "${!tempfile_curr}" "$1" "${!tempfile_next}"
            swap tempfile_curr tempfile_next
        elif (( $# >= 2 )); then
            # nothing merged yet: start by merging the first two inputs
            real_merge "$1" "$2" "${!tempfile_curr}"
            shift
        else
            # only one argument at all
            cat "$1"
            rm -f "$tempfile_A" "$tempfile_B"
            return
        fi
        shift
    done
    # write output to stdout
    cat "${!tempfile_curr}"
    # ...and clean up.
    rm -f "$tempfile_A" "$tempfile_B"
}
You can invoke it as: merge_files ip_file_* if the filenames' lexical sort order is accurate. (This will be true if their names are zero-padded, i.e. ip_file_00001, but not if they aren't padded.) If not, you'll need to sort the stream of names first. If you're using bash and have GNU stat and sort available, you could sort them by modification time like so:
declare -a filenames=()
while IFS='' read -r -d ' ' timestamp && IFS='' read -r -d '' filename; do
    filenames+=( "$filename" )
done < <(stat --printf '%Y %n\0' ip_file_* | sort -n -z)
merge_files "${filenames[@]}"
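If the numeric suffix, not the modification time, defines the order, a similar sketch can sort on the number after the second underscore instead. This assumes GNU sort (for -z) and names of exactly the form ip_file_N with no other underscores in the path; both helper names are made up for illustration:

```bash
#!/usr/bin/env bash
# Print the given names NUL-delimited, ordered numerically by the third
# underscore-separated field (ip_file_2 sorts before ip_file_10).
names_in_numeric_order() {
    printf '%s\0' "$@" | sort -z -t _ -k 3,3n
}

# Fill the global array "filenames" with the ordered names.
collect_ordered() {
    filenames=()
    local name
    while IFS= read -r -d '' name; do
        filenames+=( "$name" )
    done < <(names_in_numeric_order "$@")
}
```

Usage would then be collect_ordered ip_file_* followed by merge_files "${filenames[@]}".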

Related

Organizing ".bash_history" into ".bash_history.cmd" category files

My script loads a "categories" array with chosen terminal commands.
This array is then used to yank matching ".bash_history" records into
separate category files. My function: "extract_records()" extracts each
category using an ERE grep:
BASH_HISTORY="$HOME"/.bash_history
...
# (where "$1" here is a category)
grep -E "^($1 )" "$BASH_HISTORY" >> ".bash_history.$1"
Once the chosen records are grepped from "$BASH_HISTORY" into individual
category files, they are then removed from "$BASH_HISTORY". This is done
using "grep -v -e" patterns where the category list is re-specified.
My script works but a potential problem exists: the list of history
command keywords is defined twice, once in the array and then in a grep
pattern list. Excerpted from the script:
#!/usr/bin/bash
# original array definition.
categories=(apt cat dpkg echo file find git grep less locate)
...
for i in "${categories[@]}"; do
    extract_records "$i" # which does the grep -E shown above.
done
...
# now remove what has been categorized to separate files.
grep -E -v \
-e "^(apt )" \
-e "^(cat )" \
-e "^(dpkg )" \
... \
"$BASH_HISTORY" >> "$BASH_HISTORY".$$
# finally the temporary "$$" file is optionally sorted and moved
# back as the main "$BASH_HISTORY".
The first part calls extract_records() each time to grep and create
each category file. The second part uses a single grep to remove
records using a pattern list, re-specified based on the array.
PROBLEM: Potentially, the two independent lists can be mismatched.
Optimally, the array: "${categories[@]}" should be used for each part:
extracting chosen records, and then rebuilding "$BASH_HISTORY" without
the separated records. This would replace my now using the "grep -E -v"
pattern list. Something of the sort:
grep -E -v "^(${categories[@]})" "$BASH_HISTORY"
It's nice and compact, but this does not work.
The goal is to divide out oft used terminal commands into separate files
so as to keep "$BASH_HISTORY" reasonably small. The separately saved
records can then be recalled using another script that functions like
the Bash's internal history facility. In this way, no history is lost
and everything is grouped and better managed.
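One possible way to drive both steps from the same array is to join its elements with | into a single ERE alternation. The sketch below assumes the category names contain no regex metacharacters; remaining_history is a made-up helper name:

```bash
#!/usr/bin/env bash
categories=(apt cat dpkg echo file find git grep less locate)

# Join the array elements with '|' inside a command substitution, so the
# change to IFS does not leak into the rest of the script.
alternation=$(IFS='|'; printf '%s' "${categories[*]}")
pattern="^(${alternation}) "

# Hypothetical helper: print the lines of file $1 that match no category.
remaining_history() {
    grep -E -v -- "$pattern" "$1"
}
```

The removal step could then become remaining_history "$BASH_HISTORY" >> "$BASH_HISTORY".$$, with no second keyword list to keep in sync.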

How to get a list of files of the current directory sorted by modification date in a bash script? [duplicate]

This question already has an answer here:
Find files in current directory sorted by modified time and store result in an array
(1 answer)
Closed 12 months ago.
I would like to get a list (or array) of all files in my current directory which is sorted by modification date. In the terminal, something like ls -lt works, but that should not be used in a bash script (http://mywiki.wooledge.org/BashPitfalls#for_i_in_.24.28ls_.2A.mp3.29)...
I tried to use the -nt operator (https://tips.tutorialhorizon.com/2017/11/18/nt-file-test-operator-in-bash/) but I am hoping that there is a simpler and more elegant solution to this.
This might help you:
In bash with GNU extensions:
Creating an array
mapfile -d '' a < <(find . -maxdepth 1 -type f -printf "%T@ %p\0" | sort -z -k1,1g | cut -z -d ' ' -f2-)
or looping over the files:
while read -r -d '' _ file; do
    echo "${file}"
done < <(find . -maxdepth 1 -type f -printf "%T@ %p\0" | sort -z -k1,1g)
Here we build up a list of files with the NUL character as the delimiter. Each record consists of the modification time in seconds since the epoch, followed by a space and the file name. We use sort to order that list by modification time. The output is passed to a while loop that reads one NUL-terminated record at a time. The first field is the modification time, which we read into _, and the remainder goes into file.
In ZSH:
If you want to use another shell like zsh, you can just do something like:
a=( *(Om) )
or
for file in *(Om); do echo "${file}"; done
Here, Om is a glob qualifier that tells zsh to sort the matches by modification time (oldest first).

bash array of file locations - how to find last updated file?

Have an array of files built from a locate command that I need to cycle through and figure out the latest and print the latest. We have a property file called randomname-properties.txt that is in multiple locations and is sometimes called randomname-properties.txt.bak or randomname-properties.txt.old. Example is below
Directory structure
/opt/test/something/randomname-properties.txt
/opt/test2/something/randomname-properties.txt.old
/opt/test3/something/randomname-properties.txt.bak
/opt/test/something1/randomname-properties.txt.working
Code
#Builds list of all files
PropLoc=(`locate randomname-properties.txt`)
#Parse list and remove older file
for i in ${PropLoc[@]} ; do
    if [ ${PropLoc[0]} -ot ${PropLoc[1]} ] ; then
        echo "Removing ${PropLoc[0]} from the list as it is older"
        #Below should rebuild the array while removing the older element
        PropLoc=( "${PropLoc[@]/$PropLoc[0]}" )
    fi
done
echo "Latest file found is ${PropLoc[@]}"
Overall this isn't working. It currently appears that it doesn't even go into the loop as the first two files have the same timestamp of last year (doesn't appear to deconflict down past the day for things older than a year). Any thoughts on how to get this to work properly? Thank you
You can use ls -t, which will sort the files by modification time. The first line will then be the newest file.
newest=$(ls -t "${PropLoc[@]}" | head -n 1)
This should work as long as none of the filenames contain newlines.
Don't forget to quote your variables in case they contain whitespace or wildcard characters.
Without parsing the output of ls:
#!/usr/bin/env bash
latest=
while read -r -d '' file; do
if [ "$file" -nt "$latest" ]; then
latest=$file
fi
done < <(locate --null randomname-properties.txt)
printf 'Latest file found is %s\n' "$latest"

Unique file names in a directory in unix

I have a capture file in a directory, to which some logs are being written:
word.cap
A script clears this file whenever its size reaches exactly 1.6 GB and saves the contents to files named in the format below, in the same directory:
word.cap.COB2T_1389889231
word.cap.COB2T_1389958275
word.cap.COB2T_1390035286
word.cap.COB2T_1390132825
word.cap.COB2T_1390213719
Now I want to pick up all these files in a script, one by one, and perform some actions on them.
My script is:
today=`date +%d_%m_%y`
grep -E '^IPaddress|^Node' /var/rawcap/word.cap.COB2T* | awk '{print $3}' >> snmp$today.txt
sort -u snmp$today.txt > snmp_final_$today.txt
So what should I write to pick up all file names of the above format one by one? I will place this script in crontab, but I don't want to read the main word.cap file, as that is still being written to.
As per your comment:
Thanks, this is working but i have a small issue in this. There are
some files which are bzipped i.e. word.cap.COB2T_1390213719.bz2, so i
dont want these files in list, so what should be done?
You could add a condition inside the loop:
for file in word.cap.COB2T*; do
    if [[ "$file" != *.bz2 ]]; then
        # Do something here
        echo "${file}"
    fi
done
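Putting the loop together with the grep and awk from the question might look like the sketch below. rotated_values is a made-up function name, and grep -h is used so that no file name is prefixed to the matches:

```bash
# Print the third field of every IPaddress/Node line found in the rotated
# capture files of directory $1, skipping the live word.cap file and any
# .bz2 archives.
rotated_values() {
    for file in "$1"/word.cap.COB2T_*; do
        [ -f "$file" ] || continue     # the glob matched nothing
        case $file in
            *.bz2) continue ;;         # compressed file, skip it
        esac
        grep -h -E '^IPaddress|^Node' "$file" | awk '{print $3}'
    done
}
```

In the cron script this could be called as rotated_values /var/rawcap | sort -u > "snmp_final_$today.txt".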

Moving things in terminal based on their name

Edit: I think this has been answered successfully, but I can't check 'til later. I've reformatted it as suggested though.
The question: I have a series of files, each with a name of the form XXXXNAME, where XXXX is some number. I want to move them all to separate folders called XXXX and have them called NAME. I can do this manually, but I was hoping that by naming them XXXXNAME there'd be some way I could tell Terminal (I think that's the right name, but not really sure) to move them there. Something like
mv *NAME */NAME
but where it takes whatever * was in the first case and regurgitates it to the path.
This is on some form of Linux, with a bash shell.
In the real life case, the files are 0000GNUmakefile, with sequential numbering. I'm having to make lots of similar-but-slightly-altered versions of a program to compile and run on a cluster as part of my research. It would probably have been quicker to write a program to edit all the files and put in the right place in the first place, but I didn't.
This is probably extremely simple, and I should be able to find an answer myself, if I knew the right words. Thing is, I have no formal training in programming, so I don't know what to call things to search for them. So hopefully this will result in me getting an answer, and maybe knowing how to find out the answer for similar things myself next time. With the basic programming I've picked up, I'm sure I could write a program to do this for me, but I'm hoping there's a simple way to do it just using functionality already in Terminal. I probably shouldn't be allowed to play with these things.
Thanks for any help! I can actually program in C and Python a fair amount, but that's through trial and error largely, and I still don't know what I can do and can't do in Terminal.
SO many ways to achieve this.
I find that the old standbys sed and awk are often the most powerful.
ls | sed -rne 's:^([0-9]{4})(NAME)$:mv -iv & \1/\2:p'
If you're satisfied that the commands look right, pipe the command line through a shell:
ls | sed -rne 's:^([0-9]{4})(NAME)$:mv -iv & \1/\2:p' | sh
I put NAME in brackets and used \2 so that if it varies more than your example indicates, you can come up with a regular expression to handle your filenames better.
To do the same thing in gawk (GNU awk, the variant found in most GNU/Linux distros):
ls | gawk '/^[0-9]{4}NAME$/ {printf("mv -iv %s %s/%s\n", $1, substr($0,1,4), substr($0,5))}'
As with the first sample, this produces commands which, if they make sense to you, can be piped through a shell by appending | sh to the end of the line.
Note that with all these mv commands, I've added the -i and -v options. This is for your protection. Read the man page for mv (by typing man mv in your Linux terminal) to see if you should be comfortable leaving them out.
Also, I'm assuming with these lines that all your directories already exist. You didn't mention if they do. If they don't, here's a one-liner to create the directories.
ls | sed -rne 's:^([0-9]{4})(NAME)$:mkdir -p \1:p' | sort -u
As with the others, append | sh to run the commands.
I should mention that it is generally recommended to use constructs like for (in Tim's answer) or find instead of parsing the output of ls. That said, when your filename format is as simple as /[0-9]{4}word/, I find the quick sed one-liner to be the way to go.
Lastly, if by NAME you actually mean "any string of characters" rather than the literal string "NAME", then in all my examples above, replace NAME with .*.
The following script will do this for you. Copy the script into a file on the remote machine (we'll call it sortfiles.sh).
#!/bin/bash
# Get all regular files in the current directory having names XXXXsomename, where X is a digit
files=$(find . -maxdepth 1 -type f -name '[0-9][0-9][0-9][0-9]*')
# Build a list of the XXXX patterns found in the list of files
# (characters 3-6 of "./XXXXsomename"; assumes names without whitespace)
dirs=
for name in ${files}; do
    dirs="${dirs} $(echo "${name}" | cut -c 3-6)"
done
# Remove redundant entries from the list of XXXX patterns
dirs=$(echo ${dirs} | tr ' ' '\n' | sort -u)
# Create any XXXX directories that are not already present
for name in ${dirs}; do
    if [[ ! -d ${name} ]]; then
        mkdir "${name}"
    fi
done
# Move each XXXXsomename file into its XXXX directory, dropping the XXXX prefix
for name in ${files}; do
    mv "${name}" "$(echo "${name}" | cut -c 3-6)/$(echo "${name}" | cut -c 7-)"
done
# Return from script with normal status
exit 0
From the command line, do chmod +x sortfiles.sh
Execute the script with ./sortfiles.sh
Just open the Terminal application, cd into the directory that contains the files you want moved/renamed, and copy and paste these commands into the command line.
shopt -s extglob   # enables the *(pattern) syntax used below
for file in [0-9][0-9][0-9][0-9]*; do
    dirName="${file%%*([^0-9])}"
    mkdir -p "$dirName"
    mv "$file" "$dirName/${file##*([0-9])}"
done
This assumes all the files that you want to rename and move are in the same directory. The file globbing also assumes that there are at least four digits at the start of the filename. Names with more than four leading digits will still be caught, but names with fewer than four will not. If there are fewer than four, take off the appropriate number of [0-9]s from the first line.
It does not handle the case where "NAME" (i.e. the name of the new file you want) starts with a number.
See this site for more information about string manipulation in bash.
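If NAME may itself begin with a digit, a variant that always takes exactly the first four characters as the directory avoids the pattern matching altogether. This is only a sketch: it uses bash substring expansion, and move_by_prefix is a made-up name:

```bash
#!/usr/bin/env bash
# Move every file named XXXXNAME in the current directory to XXXX/NAME,
# where XXXX is exactly the first four characters (all digits).
move_by_prefix() {
    local file dir rest
    for file in [0-9][0-9][0-9][0-9]*; do
        [ -f "$file" ] || continue   # skip directories and an unmatched glob
        dir=${file:0:4}              # the four-digit prefix
        rest=${file:4}               # everything after the prefix
        [ -n "$rest" ] || continue   # nothing left to use as a name
        mkdir -p -- "$dir"
        mv -i -- "$file" "$dir/$rest"
    done
}
```

After cd-ing into the directory, running move_by_prefix would turn 0000GNUmakefile into 0000/GNUmakefile.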
