I am working on a script that will run daily. The script will compare individual configuration files from multiple hosts on one day with the corresponding files from the previous day. I am working on a CentOS host I have limited access to, meaning I can't make major changes.
Details
My hosts are running a cron job that uploads their configuration file to an sftp server in a generic stable directory (/var/log/Backups) by hostname (hostname.txt).
A cron job on the server creates a date stamped directory and moves the files from /var/log/Backups to /var/log/Backups/ddmmyyyy.
Later, well after all sftp and file move operations have finished, I want to load one array with the file names in the current day's directory and a second array with the matching file names from the previous day's directory.
I want a script to diff the matching file names and output the information to a single text file.
I can't get the array to load the current day's files and echo them to the terminal. I get a file operation error.
Script:
#!/bin/bash
# Set current date
now=`date +%d-%m-%Y`
echo $now
base=/var/log/GmonBackups/
loc=$base$now
echo $loc
# Load files from /var/log/GmonBackups/$now into an array
t_files=`ls loc`
echo $t_files
Something along these lines might help you get further:
today=$(date +%d%m%Y)
yesterday=$(date --date=yesterday +%d%m%Y)
base=/var/log/GmonBackups
today_dir=$base/$today
yesterday_dir=$base/$yesterday
today_files=( $today_dir/* )
yesterday_files=( $yesterday_dir/* )
A few points:
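your ls loc is missing the $, so ls looks for a file literally named loc; that is the likely source of the file error
prefer $() to ``
don't use ls to get your list of files: parsing its output breaks on file names containing spaces or other special characters, whereas a glob does not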
I didn't put quotes around the variables because there are no spaces in your directory names.
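Building on the two directories above, the comparison step the question asks for could be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name diff_backup_dirs and the report path are hypothetical, and it assumes each day's directory holds one file per host, named by hostname as described in the question.

```bash
#!/bin/bash
# Sketch: diff each of today's files against the same-named file from the
# previous day and collect everything in a single report file.
# diff_backup_dirs is a hypothetical helper name, not part of the original script.
diff_backup_dirs() {
    local today_dir=$1 yesterday_dir=$2 report=$3
    local f name

    : > "$report"                     # start with an empty report
    for f in "$today_dir"/*; do
        [ -f "$f" ] || continue       # also skips the unexpanded pattern if the directory is empty
        name=${f##*/}                 # strip the directory part, e.g. hostname.txt
        if [ -f "$yesterday_dir/$name" ]; then
            {
                echo "=== $name ==="
                # diff exits with status 1 when the files differ; that is not an error here
                diff "$yesterday_dir/$name" "$f" || true
            } >> "$report"
        else
            echo "=== $name: no file from the previous day ===" >> "$report"
        fi
    done
    return 0
}
```

It could be called as diff_backup_dirs "$today_dir" "$yesterday_dir" "$base/diff_$today.txt", reusing the variables defined above. The variables are quoted here so the function keeps working if a path ever does contain spaces.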
Related
I am trying to use a simple windows cmd command to move a set of .csv files from one directory to another.
I know that this can be easily achieved through:
MOVE "*.csv" "D:/lorik_home"
The issue comes when I want to move only specific files matching a filename mask. For instance, I want to move all files that have the .csv extension and whose names start with lorik_files_. I already tried using:
MOVE "lorik_files_*.csv" "D:/lorik_home"
This works when there is only one file of the format: lorik_files_20112233_09_33_33.csv, but when there are two files with the masks such as:
lorik_files_20112233_09_33_33.csv
lorik_files_20112233_10_23_42.csv
I get an error such as:
A duplicate file name exists, or the file cannot be found.
I need a hand for capturing those filename prefix masks, conditioned by .csv extension at the end.
I am using this command in cmd with cpdf:
cpdf -split a.pdf -o page%%%.pdf
but I want to apply it to every PDF in a directory.
That is, I need a batch script that runs over all PDF files in the directory and applies the cpdf split command to each one, producing one file per page.
For example, transform the files:
a.pdf
b.pdf
c.pdf
and more ...
into several files, one per page of the original, each named after the original:
a1.pdf
a2.pdf
a3.pdf
b1.pdf
b2.pdf
b3.pdf
c1.pdf
c2.pdf
c3.pdf
and more ...
Can anyone help?
I'm a little bit stuck and need some help.
I currently have a process where completed work is stored in a directory as .txt files (the filename describes what the job was, e.g. Job1_Machine1_Randomly generated number.txt).
What I would like to do is run a directory listing via a batch file to easily extract the day's work. I currently do this as:
dir *.* /s | find "%date%" >dirlisttoday.txt
The next part of my process is to upload the list into an Access database for matching and tracking; I use the file's date and time stamp as a 'completed date'.
Currently the second step requires manual manipulation, as the Access import from a .txt file only allows a single delimiter (I have spaces between the date and time, and underscores in the title).
I can't use fixed-width imports as Machine1 can vary in length. The directory is also used by other processes which can't be changed, so changing the filename isn't possible either.
I want to automate this process so it can be carried out by Windows Task Scheduler. Is there a line of script which can be added to my batch file to alter the directory list I create (dirlisttoday.txt) from:
31/08/2017 12:30 Job1_Machine1_Randomly generated number.txt
to:
31/08/2017 12:30 Job1 Machine1 Randomly generated number.txt
I have a PC which runs specific BAT files at different times in the day. In order to ensure I keep track that all tasks are occurring I make each task register an entry in a log file called tack.txt. This is typically how it looks:
Start Cams # 10/09/2016 15:11:43
Update List # 11/09/2016 14:13:47
EPG Update # 11/09/2016 16:59:48
Start Cams # 11/09/2016 17:01:35
Restart # 13/09/2016 17:16:52
Each entry is on a separate line and has a date and time as presented.
Is it possible to create a BAT file which deletes entries older than, say, 3 days to keep the log tidy?
Yes. It's hard to get yesterday's date with Batch (handling special cases across months/years/leapdays), but it's easy with a bit of help from Powershell.
You can get the dates like this:
for /f %%a in ('"powershell [DateTime]::Now.AddDays(-2).ToString('dd/MM/yyyy')"') do ( set today-2=%%a )
Then you can search for them like this, and output the matching lines to a file:
findstr "%today-2% %today-1% %today%" test.txt >matching.txt
Then you can replace your log file with matching.txt via copy /y and del.
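For readers on a Unix-like host, the same idea (collect the dates of the last few days, keep only the lines that contain one of them, then overwrite the log) can be sketched in bash. This is not the Batch/PowerShell answer above, only a transposition of it: prune_log is a hypothetical helper name, and the sketch assumes GNU date (for --date) and log lines that carry a dd/mm/yyyy date as in the sample.

```bash
#!/bin/bash
# Sketch: keep only log lines dated within the last N days (default 3).
# prune_log is a hypothetical helper; assumes GNU date and dd/mm/yyyy dates.
prune_log() {
    local log=$1 days=${2:-3}
    local i tmp patterns=()

    # One fixed-string pattern per day to keep: today, yesterday, ...
    for ((i = 0; i < days; i++)); do
        patterns+=(-e "$(date --date="$i days ago" +%d/%m/%Y)")
    done

    tmp=$(mktemp) || return 1
    # grep -F treats the dates as literal strings. Exit status 1 only means
    # that no line matched, which simply leaves an empty log.
    grep -F "${patterns[@]}" "$log" > "$tmp" || true
    # Overwrite in place so the log keeps its original permissions.
    cat "$tmp" > "$log"
    rm -f "$tmp"
}
```

A call such as prune_log tack.txt 3 would keep entries from today and the two preceding days. Like the findstr approach, it matches the date anywhere on the line rather than parsing the timestamp.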
Is it possible to speed up batch processing of documents with CoreNLP from command line so that models load only one time? I would like to trim any unnecessarily repeated steps from the process.
I have 320,000 text files and I am trying to process them with CoreNLP. The desired result is 320,000 finished XML file results.
To get from one text file to one XML file, I use the CoreNLP jar file from command line:
java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props config.properties
-file %%~f -outputDirectory MyOutput -outputExtension .xml -replaceExtension
This loads models and does a variety of machine learning magic. The problem I face is when I try to loop for every text in a directory, I create a process that by my estimation will complete in 44 days. I literally have had a command prompt looping on my desktop for the last 7 days and I'm nowhere near finished. The loop I run from batch script:
for %%f in (Data\*.txt) do (
java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props config.properties
-file %%~f -outputDirectory Output -outputExtension .xml -replaceExtension
)
I am using these annotators, specified in config.properties:
annotators = tokenize, ssplit, pos, lemma, ner, parse, dcoref, sentiment
I know nothing about Stanford CoreNLP, so I googled for it (you didn't include any link) and on this page I found this description (below "Parsing a file and saving the output as XML"):
If you want to process a list of files use the following command line:
java -cp
stanford-corenlp-VV.jar:stanford-corenlp-VV-models.jar:xom.jar:joda-time.jar:jollyday.jar:ejml-VV.jar
-Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP [ -props YOUR CONFIGURATION FILE ] -filelist A FILE CONTAINING YOUR LIST OF FILES
where the -filelist parameter points to a file whose content lists all
files to be processed (one per line).
So I guess that you may process your files faster if you store a list of all your text files in a list file:
dir /B *.txt > list.lst
... and then pass that list in the -filelist list.lst parameter in a single execution of Stanford CoreNLP.
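On a Unix-like host, the list-building step could be sketched in bash as below. build_filelist is a hypothetical helper name, and the commented java call is an assumption: it combines the -filelist parameter quoted above with the output options from the question, a combination that neither the question nor the quoted documentation shows.

```bash
#!/bin/bash
# Sketch: write the path of every .txt file in a directory to a list file,
# one per line, so that CoreNLP can be started once for the whole batch.
# build_filelist is a hypothetical helper name.
build_filelist() {
    local data_dir=$1 list=$2
    # -maxdepth 1 keeps the listing to the directory itself, like dir /B
    find "$data_dir" -maxdepth 1 -type f -name '*.txt' | sort > "$list"
}

# Possible usage (commented out; it needs the CoreNLP jars in the current
# directory, and the mix of flags is an assumption, see the note above):
# build_filelist Data list.lst
# java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP \
#     -props config.properties -filelist list.lst \
#     -outputDirectory Output -outputExtension .xml -replaceExtension
```

The point of the single invocation is that the models load once instead of once per file, which is where the original loop spends most of its time.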