Cannot load files from a concatenated path - arrays

I am working on a script that will run daily. The script will compare individual configuration files from multiple hosts on one day with the same files from the previous day. I am working on a CentOS host I have limited access to, meaning I can't make major changes.
Details
My hosts are running a cron job that uploads each host's configuration file, named by hostname (hostname.txt), to an sftp server in a fixed staging directory (/var/log/Backups).
A cron job on the server creates a date-stamped directory and moves the files from /var/log/Backups to /var/log/Backups/ddmmyyyy.
Later, well after all sftp and file-move operations have finished, I want to load one array with the file names in the current day's directory and another array with the matching file names from the previous day's directory.
I want a script to diff the matching file names and output the information to a single text file.
I can't get the array to load the current day's files and echo them to the terminal. I get a file operation error.
Script:
#!/bin/bash
# Set current date
now=`date +%d-%m-%Y`
echo $now
base=/var/log/GmonBackups/
loc=$base$now
echo $loc
# Load files from /var/log/GmonBackups/$now into an array
t_files=`ls loc`
echo $t_files

Something along these lines might help you get further:
today=$(date +%d%m%Y)
yesterday=$(date --date=yesterday +%d%m%Y)
base=/var/log/GmonBackups
today_dir=$base/$today
yesterday_dir=$base/$yesterday
today_files=( $today_dir/* )
yesterday_files=( $yesterday_dir/* )
A few points:
- prefer $() to ``
- don't use ls to get your list of files, because it's not robust
- I didn't put quotes around the variables because there are no spaces in your directory names
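Building on those two arrays, the whole job the question describes (diff each of today's files against yesterday's copy, collecting everything into one report) can be sketched as a small function. The directory layout and the report name below are assumptions taken from the question, not something tested on the asker's server:

```shell
#!/bin/bash
# compare_dirs OLD_DIR NEW_DIR REPORT
# Diff every file in NEW_DIR against the same-named file in OLD_DIR,
# appending all output to a single REPORT file.
compare_dirs() {
    local old_dir=$1 new_dir=$2 report=$3
    : > "$report"                       # start a fresh report
    for f in "$new_dir"/*; do
        [ -e "$f" ] || continue         # directory was empty
        local name=${f##*/}
        printf '=== %s ===\n' "$name" >> "$report"
        if [ -f "$old_dir/$name" ]; then
            # diff exits 1 when files differ; that's expected, not an error
            diff "$old_dir/$name" "$f" >> "$report" || true
        else
            printf 'no file from previous day\n' >> "$report"
        fi
    done
}

# Usage with the directories from the question (paths assumed):
#   base=/var/log/GmonBackups
#   compare_dirs "$base/$(date --date=yesterday +%d%m%Y)" \
#                "$base/$(date +%d%m%Y)" \
#                "$base/diff_$(date +%d%m%Y).txt"
```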

Related

Using cmd to move files with a specific filename mask

I am trying to use a simple windows cmd command to move a set of .csv files from one directory to another.
I know that this can be easily achieved through:
MOVE "*.csv" "D:/lorik_home"
The issue comes when I want to move only specific files matching a filename mask; for instance, I want to move all files with the .csv extension whose names start with lorik_files_. I already tried using:
MOVE "lorik_files_*.csv" "D:/lorik_home"
This works when there is only one file of the format: lorik_files_20112233_09_33_33.csv, but when there are two files with the masks such as:
lorik_files_20112233_09_33_33.csv
lorik_files_20112233_10_23_42.csv
I get an error such as:
A duplicate file name exists, or the file cannot be found.
I need a hand capturing those filename-prefix masks, conditioned on the .csv extension at the end.
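One common workaround, sketched here on the assumption that D:\lorik_home already exists, is to let a for loop do the globbing and move the matching files one at a time (note the backslashes in the destination; cmd paths conventionally use backslashes, and forward slashes can be mis-parsed):

```batch
rem Sketch, from a batch file: move the matching files one at a time.
rem D:\lorik_home is assumed to exist; note the backslashes.
for %%f in (lorik_files_*.csv) do move "%%f" "D:\lorik_home\"

rem At the interactive prompt, use single percent signs instead:
rem for %f in (lorik_files_*.csv) do move "%f" "D:\lorik_home\"
```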

Batch file with CPDF to multi files

I am using this command in cmd with cpdf:
cpdf -split a.pdf -o page%%%.pdf
but I want to use it on every PDF in a directory. That is, I need a batch script that runs on all PDF files in the directory and applies the cpdf split command to each one, dividing it into one file per page.
For example, transform the files:
a.pdf
b.pdf
c.pdf
and more ...
into several files, one per page of the original, named after the original:
a1.pdf
a2.pdf
a3.pdf
b1.pdf
b2.pdf
b3.pdf
c1.pdf
c2.pdf
c3.pdf
and more ...
Can anyone help?
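A sketch of such a batch script, assuming cpdf is on the PATH. %%~nf expands to the file name without its extension, and because percent signs must be doubled inside a batch file, the literal %%% page-number placeholder from the command above is written as six percent signs here:

```batch
rem Sketch: split every PDF in the directory, one output file per page,
rem named after the original (the number of %-placeholders controls
rem zero-padding of the page number, e.g. a.pdf -> a001.pdf, a002.pdf, ...).
rem Inside a batch file each literal % must be written as %%, so cpdf's
rem %%% placeholder becomes %%%%%%.
for %%f in (*.pdf) do cpdf -split "%%f" -o "%%~nf%%%%%%.pdf"
```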

Changing Delimiter in Directory List

I'm a little bit stuck and need some help.
I currently have a process where completed work is stored in a directory as .txt files (the filename describes what the job was, e.g. Job1_Machine1_Randomly generated number.txt).
What I would like to do is run a dir list via a batch file to easily extract the day's work. I currently do this as:
dir *.* /s | find "%date%" >dirlisttoday.txt
The next part of my process is to upload the list into an Access database for matching and tracking, I use the files date and time stamp as a 'completed date'.
Currently the second process requires a manual manipulation as the Access import from .txt file only allows a single delimiter, (I have spaces between date/times and underscores in the title).
I can't use fixed width imports as Machine1 can vary in length. The directory is also used by other processes which can't be changed, so changing the filename isn't possible either.
I want to automate this process so it can be carried out by Windows Task Scheduler. Is there a line of script I can add to my batch file to alter the directory list I create (dirlisttoday.txt) from:
31/08/2017 12:30 Job1_Machine1_Randomly generated number.txt
to:
31/08/2017 12:30 Job1 Machine1 Randomly generated number.txt
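One way to do that without touching the filenames themselves is to post-process dirlisttoday.txt with a PowerShell one-liner appended to the batch file. This is a sketch, assuming no underscore in the list needs to survive the import:

```batch
dir *.* /s | find "%date%" > dirlisttoday.txt
rem Replace every underscore with a space so Access sees one delimiter.
powershell -Command "(Get-Content dirlisttoday.txt) -replace '_',' ' | Set-Content dirlisttoday.txt"
```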

BAT File to delete old log entries from a TXT File

I have a PC which runs specific BAT files at different times of the day. To ensure I can keep track that all tasks are occurring, I make each task register an entry in a log file called tack.txt. This is typically how it looks:
Start Cams # 10/09/2016 15:11:43
Update List # 11/09/2016 14:13:47
EPG Update # 11/09/2016 16:59:48
Start Cams # 11/09/2016 17:01:35
Restart # 13/09/2016 17:16:52
Each entry is on a separate line and has a date and time as presented.
Is it possible to create a BAT file which deletes entries older than, say, 3 days to keep the process tidy?
Yes. It's hard to get yesterday's date with batch alone (handling the special cases across months, years and leap days), but it's easy with a bit of help from PowerShell.
You can get the dates like this (repeat with AddDays(-1) and AddDays(0) for %today-1% and %today%):
for /f %%a in ('"powershell [DateTime]::Now.AddDays(-2).ToString('dd/MM/yyyy')"') do ( set today-2=%%a )
Then you can search for them like this, and output the matching lines to a file:
findstr "%today-2% %today-1% %today%" tack.txt >matching.txt
Then you can replace your log file with matching.txt via copy /y and del.
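Putting those steps together, a sketch of the whole cleanup script (the tack.txt name is taken from the question; dates are compared as literal dd/MM/yyyy strings, so only lines stamped with one of the last three days will survive):

```batch
@echo off
rem Compute the last three dates in the same dd/MM/yyyy format as the log.
for /f %%a in ('"powershell [DateTime]::Now.AddDays(-2).ToString('dd/MM/yyyy')"') do set "today-2=%%a"
for /f %%a in ('"powershell [DateTime]::Now.AddDays(-1).ToString('dd/MM/yyyy')"') do set "today-1=%%a"
for /f %%a in ('"powershell [DateTime]::Now.ToString('dd/MM/yyyy')"') do set "today=%%a"

rem Keep only log lines stamped with one of those dates.
findstr "%today-2% %today-1% %today%" tack.txt > matching.txt
copy /y matching.txt tack.txt
del matching.txt
```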

Efficient batch processing with Stanford CoreNLP

Is it possible to speed up batch processing of documents with CoreNLP from command line so that models load only one time? I would like to trim any unnecessarily repeated steps from the process.
I have 320,000 text files and I am trying to process them with CoreNLP. The desired result is 320,000 finished XML file results.
To get from one text file to one XML file, I use the CoreNLP jar file from command line:
java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props config.properties
-file %%~f -outputDirectory MyOutput -outputExtension .xml -replaceExtension
This loads the models and does a variety of machine-learning magic. The problem I face is that when I loop over every text file in the directory, I create a process that by my estimation will complete in 44 days. I have literally had a command prompt looping on my desktop for the last 7 days and I'm nowhere near finished. The loop I run from my batch script:
for %%f in (Data\*.txt) do (
java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props config.properties
-file %%~f -outputDirectory Output -outputExtension .xml -replaceExtension
)
I am using these annotators, specified in config.properties:
annotators = tokenize, ssplit, pos, lemma, ner, parse, dcoref, sentiment
I know nothing about Stanford CoreNLP, so I googled for it (you didn't include any link) and on this page I found this description (below "Parsing a file and saving the output as XML"):
If you want to process a list of files use the following command line:
java -cp
stanford-corenlp-VV.jar:stanford-corenlp-VV-models.jar:xom.jar:joda-time.jar:jollyday.jar:ejml-VV.jar
-Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP [ -props YOUR CONFIGURATION FILE ] -filelist A FILE CONTAINING YOUR LIST OF FILES
where the -filelist parameter points to a file whose content lists all
files to be processed (one per line).
So I guess that you may process your files faster if you store a list of all your text files in a list file:
dir /B *.txt > list.lst
... and then pass that list via the -filelist list.lst parameter in a single execution of Stanford CoreNLP.
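Putting the two pieces together as a sketch: note that plain dir /B emits bare file names, so /S is added here to get full paths when the inputs live in the Data subdirectory from the question; the flags other than -filelist are carried over from the original command:

```batch
rem Build the list of inputs, one full path per line.
dir /B /S Data\*.txt > list.lst

rem Single CoreNLP run: the models load once for all 320,000 files.
java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props config.properties -filelist list.lst -outputDirectory Output -outputExtension .xml -replaceExtension
```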
