Gentlemen and -women, one question regarding programming:
I need to write a program to batch rename files, under the following conditions:
All files are in a directory. Can be the same directory as the executable, if this simplifies things pathwise.
All files are .pdf, but this should not matter I believe.
All files start with a two-digit prefix ranging from 01 to 99
e.g.: 01_FileNameOriginal1.pdf ; 02_FileNameOriginal2.pdf
This double digit needs to stay.
All names must be modified to a predefined range of filenames (sort of a database) which can be embedded in the executable, or read out from an external file (being .txt, .csv, whatever is most feasible).
e.g.: 01_NewName1.pdf ; 02_NewName2.pdf ; ...
Some original filenames contain an expiry date, which is labelled "EXP YYYY MMM DD", and should be appended to the new name in a different format: "e_YYYY_MM_DD". So basically it needs to be able to use a "for" statement (to loop for the number of files), and "if" statement to search for the "EXP" (matching case), cut the string and append in the rearranged format before the file extension.
The result of the program can be either renaming, or returning a renamed copy. The former seems easier to do.
So to summarize:
step 1: Run program
step 2: integer = number of files
step 3: loop:
3.A check first two digits, copy to a temp string.
3.B compare the temp string with an array of predefined filenames. The array can be embedded in the code, or read externally. For the user friendliness sake, an external read from a .csv seems easier to modify the filenames later on.
3.C to rename or not to rename. In case of a match:
Assume the old file has the following name: 01_FileNameOriginal EXP YYYY MM DD Excessivetext.pdf
Copy first two digits to a temp2 string
Scan the old name for EXP (for length of filename, if = "EXP ", matching case) and cut the following 10 characters. Put YYYY, MM, and DD in separate strings. All essential values have now been extracted from the old filename.
Match dbl digits with the first two digits of filenames in the database. Copy the new name to a temp string.
Rename the file with the new name: eg 01_NewName.pdf
Append date strings before the extension: eg 01_NewName_e_YYYY_MM_DD.pdf
Note: date can be extracted in a single string, as long as spaces are replaced by underscores. Probably shorter to program.
in case of no match: copy old filename, and put it in a temp string, to return at the end of the process as an error (or .txt) file, with filenames that could not be renamed.
step 4: finish and return error or report (see previous)
Question:
Based on these conditions, what would be the easiest way to get started? Are there freeware programs that can do such a thing. If not what is the best approach. I have basic programming knowledge (Java/VBA), some small C++ stints but nothing spectacular. I have only programmed in a programming environment and have never produced any executables or batch files or the likes so I don't have any idea how to start atm, but wouldn't mind to give it a shot. As long as it's a guided shot, and not one in the dark cos that's where I am now.
Would love to hear some thoughts on this.
Greetings
Wouter
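One possible starting point for the steps above, as a minimal Python sketch rather than a finished tool. It assumes the date in the old name is numeric ("EXP YYYY MM DD", 10 characters, as in step 3.C) and that the new names live in a two-column CSV of the form prefix,new_name. The function names (load_names, build_new_name, rename_all) and the CSV layout are made up for illustration.

```python
import csv
import re
from pathlib import Path
from typing import Dict, List, Optional

PREFIX_RE = re.compile(r"^(\d{2})_")                  # two-digit prefix, e.g. "01_"
EXP_RE = re.compile(r"EXP (\d{4}) (\d{2}) (\d{2})")   # assumed numeric date, case-sensitive


def load_names(csv_path: str) -> Dict[str, str]:
    """Read rows such as '01,NewName1' into {'01': 'NewName1'} (assumed layout)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return {row[0]: row[1] for row in csv.reader(handle) if len(row) >= 2}


def build_new_name(old_name: str, new_names: Dict[str, str]) -> Optional[str]:
    """Return the new filename, or None when the prefix has no entry."""
    prefix_match = PREFIX_RE.match(old_name)
    if prefix_match is None or prefix_match.group(1) not in new_names:
        return None
    prefix = prefix_match.group(1)
    stem = f"{prefix}_{new_names[prefix]}"
    exp = EXP_RE.search(old_name)
    if exp:
        # Append the expiry date as e_YYYY_MM_DD before the extension.
        stem += "_e_" + "_".join(exp.groups())
    return stem + Path(old_name).suffix


def rename_all(directory: Path, new_names: Dict[str, str]) -> List[str]:
    """Rename every matching .pdf in place; return the names that were skipped."""
    unmatched = []
    for path in sorted(directory.glob("*.pdf")):
        new_name = build_new_name(path.name, new_names)
        if new_name is None:
            unmatched.append(path.name)
        elif new_name != path.name:
            path.rename(path.with_name(new_name))
    return unmatched
```

Calling rename_all(Path("."), load_names("names.csv")) would rename the files in the current directory and return the list for the error report; names.csv is a placeholder. Keeping build_new_name free of file access makes it easy to try on a few sample names before touching real files.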
Command ffmpeg -i file-1.mp4 -vf ass=file-1a.ass burned-1.mp4
works to burn file-1a.ass subtitles on file-1.mp4 video.
But each time I have to reiterate the same command on over 40 different videos and subtitles and each time I have to wait for rendering the output.
So perhaps there is a way to automatically reiterate the same command on all the files.
While looking for an answer I found this loop command:
for f in *; do ffmpeg $f;
But I am confused about how to use it with two files, the .mp4 and the .ass file, plus the output file, which should carry the same number.
I imagine I should give each pair of files the same name, such as:
1.mp4 1.ass
2.mp4 2.ass
3.mp4 3.ass
etc
and then
for f in *; do ffmpeg -i $f.mp4 -vf ass=$f.ass $f-output.mp4
But I have no clear idea
You have the right idea. But it won’t work if the loop executes with f == 1.mp4, then again with f == 1.ass, and so on.
So you want to modify the loop to only iterate over .mp4 files. Then you want to strip the .mp4 extension from the value of f, that is, strip the last 4 characters from the value of f, using ${f:0: -4} (this means “get a substring of f, starting at character 0 and ending 4 characters before the end”).
You obviously want to terminate the loop with done. I also suggest wrapping the parameters in quotes, to prevent word splitting (that is, if the filenames contain certain characters, they might be split into multiple arguments to ffmpeg).
Putting it all together:
for f in *.mp4; do f=${f%.*}; ffmpeg -i "$f.mp4" -vf ass="$f.ass" "$f-output.mp4"; done
Of course, once you have run this, you need to get rid of all the output files before you can run it again. Or you can just put the output files in a different directory to begin with.
Edit: Another user posted an answer, which seems to have been deleted. It was a good answer but lacked explanation. It was basically the same as my answer, except that it used ${f%.mp4} to strip the .mp4 extension. My answer is probably slightly more complex but slightly more efficient, so it’s basically a matter of personal preference.
Edit 2: Based on the link provided by llogan’s comment, I have made these changes:
Remove the quotes in the assignment, as assignments are not subject to word splitting (this is also stated in the bash man page).
Use ${f%.*} to strip the extension. This strips a dot followed by any sequence of characters from the end. It looks for the shortest possible match, so it’s really looking for a dot followed by any sequence of non-dot characters at the end.
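For comparison, the same loop can be sketched in Python. This is only an illustration of the pairing logic, assuming each N.mp4 has a matching N.ass next to it and that ffmpeg is on the PATH. The helper names ffmpeg_args and burn_all are made up, and writing the results into a separate output directory follows the suggestion above.

```python
import subprocess
from pathlib import Path


def ffmpeg_args(video: Path, out_dir: Path) -> list:
    """Build the argument list for one video/subtitle pair."""
    subtitles = video.with_suffix(".ass")   # 1.mp4 -> 1.ass
    output = out_dir / video.name           # same file name, different directory
    # Passing a list avoids shell word splitting. Paths with characters that are
    # special to ffmpeg's filter syntax would still need escaping (not handled here).
    return ["ffmpeg", "-i", str(video), "-vf", f"ass={subtitles}", str(output)]


def burn_all(src_dir: Path, out_dir: Path) -> None:
    """Run ffmpeg once for every .mp4 that has a matching .ass file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for video in sorted(src_dir.glob("*.mp4")):
        if not video.with_suffix(".ass").exists():
            print(f"skipping {video.name}: no matching .ass file")
            continue
        subprocess.run(ffmpeg_args(video, out_dir), check=True)
```

Calling burn_all(Path("."), Path("burned")) would process the current directory; because the outputs land in burned/, a second run does not pick them up as inputs.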
Could someone help me in writing a program that has to compile all the files in the directory and report error, if any. For which my program has to get the list of all files under the folder with its full path and store it in a temp-table and then it has to loop through the temp table and compile the files.
Below is a very rough start.
Look for more info around the COMPILE statement and the COMPILER system handle in the online help (F1).
Be aware that compiling requires you to have a developer license installed. Without it the COMPILE statement will fail.
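/* Directory to scan; adjust as needed. */
DEFINE VARIABLE cDir  AS CHARACTER NO-UNDO.
DEFINE VARIABLE cFile AS CHARACTER NO-UNDO FORMAT "x(30)".

ASSIGN
    cDir = "c:\temp\".

/* OS-DIR yields one line per directory entry; IMPORT reads the base name. */
INPUT FROM OS-DIR(cDir).
REPEAT:
    IMPORT cFile.
    /* In MATCHES, "." matches any single character, so the literal dot is escaped with a tilde. */
    IF cFile MATCHES "*~~.p" THEN DO:
        COMPILE VALUE(cDir + cFile) SAVE NO-ERROR.
        IF COMPILER:ERROR THEN DO:
            DISPLAY
                cFile
                COMPILER:GET-MESSAGE(1) FORMAT "x(60)"
                WITH FRAME frame1 WIDTH 300 20 DOWN.
        END.
    END.
END.
INPUT CLOSE.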
Since the comment wouldn't let me paste this much into it... using INPUT FROM OS-DIR returns all of the files and directories under a directory. You can use this information to keep going down the directory tree to find all sub directories
OS-DIR documentation:
Sometimes, rather than reading the contents of a file, you want to read a list of the files in a directory. You can use the OS–DIR option of the INPUT FROM statement for this purpose.
Each line read from OS–DIR contains three values:
*The simple (base) name of the file.
*The full pathname of the file.
*A string value containing one or more attribute characters. These characters indicate the type of the file and its status.
Every file has one of the following attribute characters:
*F — Regular file or FIFO pipe
*D — Directory
*S — Special device
*X — Unknown file type
In addition, the attribute string for each file might contain one or more of the following attribute characters:
*H — Hidden file
*L — Symbolic link
*P — Pipe file
The tokens are returned in the standard ABL format that can be read by the IMPORT or SET statements.
I have two small to medium sized files (2k) that are for all intents and purposes identical. The second file is the result of the first file being duplicated and replacing backslashes with forward slashes. The new file is bigger by 80 bytes (or one byte per line).
I did this with a simple batch script, and at first I thought the script might have unintentionally added some spaces or other artifacts. Or maybe the fact that their extensions are different has something to do with it (one has a tmp extension and the other has a lst extension).
From an editor, I replaced all forward slashes in the new file with backslashes and saved it without changing the extension.
And, hey guess what? The files were the same size again.
Now, before this is written off as a random fluke, I also see the same behavior exhibited in three other pairs of files (in other words six files) created in the same manner as the first. They are all one byte bigger per line in the file. The largest is about 12k bytes, and the smallest is about 2k.
I wouldn't think it has anything to do with escaping because I am on a Windows box using the Windows 7 cmd.exe shell.
Also one other thing. I tried the following:
echo \\\\\ >> a.txt
echo ///// >> b.txt
The files matched in size (7 bytes)
Does anyone have an explanation for this behavior?
I would suggest opening the files with an editor like Notepad++ that shows the type of linefeed (Windows/Mac/Unix). This is most likely your problem if the file size differs 1 byte per line.
Notepad++ can show line endings as small CR/LF symbols (View -> Show Symbol -> Show End of Line) and convert between the Windows/Mac/Unix line endings (Edit -> EOL Conversion).
Unix and modern macOS store files with a one-byte line ending (LF), as classic Mac OS did with CR; Windows uses two bytes (CR LF).
Depending on the programs your batch scripts use, this might occur even though your system is a pure Windows box. The reason you don't get a difference when using an editor is that editors usually keep the file's original line endings.
Okay. I just solved it. @schnaader pointed me in the right direction. It actually has nothing to do with the forward or backslashes.
What happened is that my script added one character of trailing white space to each line. Why the file again became the same size after I reverted the slashes is because the editor I used to find and replace (Komodo Edit) is set up to automatically trim trailing white space on file save.
Funny.
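For anyone chasing a similar one-byte-per-line difference, a small Python sketch along these lines can show where the extra bytes come from. It counts line-ending styles and lines with trailing spaces or tabs in the raw bytes of a file; the function name line_stats is made up for this example.

```python
def line_stats(data: bytes) -> dict:
    """Count line-ending styles and lines that end in spaces or tabs."""
    stats = {"crlf": 0, "lf": 0, "cr": 0, "trailing_ws": 0}
    for line in data.splitlines(keepends=True):
        if line.endswith(b"\r\n"):
            stats["crlf"] += 1
            body = line[:-2]
        elif line.endswith(b"\n"):
            stats["lf"] += 1
            body = line[:-1]
        elif line.endswith(b"\r"):
            stats["cr"] += 1
            body = line[:-1]
        else:
            body = line  # last line without a terminator
        if body != body.rstrip(b" \t"):
            stats["trailing_ws"] += 1
    return stats
```

Reading both files in binary mode, for example line_stats(open("a.tmp", "rb").read()), and comparing the two dictionaries would show whether the difference is in the line endings or in trailing white space; a.tmp is a placeholder name.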
1) I have a directory which contains 4 files, namely 1.c, 2.c, 3.c and 4.c. I am reading the file names in this directory with the readdir system call, which returns them in a structure variable named myStruct.
2) I have another open file, a.txt, which contains file names like 1.c, 2.c, 3.c, 4.c etc...
My intention is to compare the file names listed in a.txt with the files present in the directory (just the name comparison is enough, not checking the contents).
When I do the comparison, even though the names in the directory match those in a.txt, they do not compare as equal, and when I printed the lengths they were unequal.
Can anyone please let me know any solution to this problem
thanks
maddy
When you read from the file, there is an extra newline character at the end of the line you have read, so the comparison will show that they are unequal. So after reading the line, trim off the \n and then try.
EDIT
This discussion tells you about how to trim whitespaces in a string using C - Painless way to trim leading/trailing whitespace in C?
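The effect is easy to reproduce. The following is a small Python illustration of the same idea, not the C code from the question; names_match is a made-up helper. A line read from a file keeps its line terminator, so it is one character longer than the directory entry until the terminator is stripped.

```python
def names_match(line_from_file: str, dir_entry_name: str) -> bool:
    """Compare a line read from a text file with a directory entry name."""
    # Strip the line terminator (LF, or CR LF on Windows-style files) first.
    return line_from_file.rstrip("\r\n") == dir_entry_name


line = "1.c\n"   # what a line-based read returns for the first line of a.txt
entry = "1.c"    # what the directory listing returns
print(len(line), len(entry))     # lengths differ by one
print(line == entry)             # False: the newline is part of the string
print(names_match(line, entry))  # True once the newline is removed
```

In C the usual equivalent is to overwrite the newline with a terminating zero byte before calling strcmp, for example with line[strcspn(line, "\n")] = '\0'.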
I am working with a text file, which contains a list of processes under my programs control, along with relevant data.
At some point, one of the processes will finish, and thus will need to be removed from the file (as it's no longer under control).
Here is a sample of the file contents (which has entries added "randomly"):
PID=25729 IDLE=0.200000 BUSY=0.300000 USER=-10.000000
PID=26416 IDLE=0.100000 BUSY=0.800000 USER=-20.000000
PID=26522 IDLE=0.400000 BUSY=0.700000 USER=-30.000000
So for example, if I wanted to remove the line that says PID=26416.... how could I do that, without writing the file over again?
I can use external unix commands, however I am not very familiar with them so please if that is your suggestion, give an example.
Thanks!
Either you keep the contents of the file in temporary memory and then rewrite the file. Or you could have a file for each of the PIDs with the relevant information in them. Then you simply delete the file when it's no longer running. Or you could use a database for this instead.
As others have already pointed out, your only real choice is to rewrite the file.
The obvious way to do that with "external UNIX commands" would be grep -v "PID=26416" (or whatever PID you want to remove, obviously).
Edit: It is probably worth mentioning that if the lines are all the same length (as you've shown here) and order doesn't matter, you could delete a line more efficiently by copying the last line into the space being vacated, then shortening the file to eliminate what had been the last line. This will only work if they really are all the same length though (e.g., if you got a PID of '1', you'd need to pad it to the same length as the others in the file).
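A rough Python sketch of that trick, assuming every record (including its newline) has exactly the same length and that the order of lines does not matter. The function name remove_record and its parameters are made up for illustration.

```python
import os


def remove_record(path: str, index: int, record_len: int) -> None:
    """Delete record number `index` by overwriting it with the last record."""
    size = os.path.getsize(path)
    if record_len <= 0 or size % record_len != 0:
        raise ValueError("file is not a whole number of fixed-length records")
    count = size // record_len
    if not 0 <= index < count:
        raise IndexError("record index out of range")
    last_offset = (count - 1) * record_len
    with open(path, "r+b") as handle:
        if index != count - 1:
            # Copy the last record over the one being removed.
            handle.seek(last_offset)
            last = handle.read(record_len)
            handle.seek(index * record_len)
            handle.write(last)
        # Drop what used to be the last record.
        handle.truncate(last_offset)
```

Only one record is read and written regardless of file size, but the update is not atomic: a reader that opens the file between the write and the truncate sees the last record twice.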
The only way is by copying each character that comes after the deleted line down over the characters that are deleted.
It is far more efficient to simply rewrite the file.
how could I do that, without writing the file over again?
You cannot. Filesystems (perhaps besides more esoteric record-based ones) do not support insertion or deletion.
So you'll have to write the lines to a temporary file up till the line you want to delete, skip over that line, and write the rest of the lines to the file. When done, rename/copy the temp file to the original filename
Why are you maintaining these in a text file? That's not the best model for such a task. But, if you're stuck with it ... if these lines are guaranteed to all be the same length (it appears that way from the sample), and if the order of the lines in the file doesn't matter, then you can write the last line over the line for the process that has died and then shorten the file by one line with the (f)truncate() call if you're on a POSIX system: see Jonathan Leffler's answer in How to truncate a file in C?
But note carefully netrom's answer, which gives three different better ways to maintain this info.
Also, if you stick with a text file (preferably written from scratch each time from data structures you maintain, as per netrom's first suggestion), and you want to be sure that the file is always well formed, then write the new data into a temp file on the same device (putting it in the same directory is easiest) and then do a rename() call, which is an atomic operation.
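The temp-file-plus-rename approach could look roughly like this in Python. It is a sketch under the assumption that each line starts with "PID=<number> " as in the sample; remove_pid is a made-up name. os.replace performs the rename step.

```python
import os
import tempfile


def remove_pid(path: str, pid: int) -> bool:
    """Rewrite `path` without the line for `pid`; return True if a line was dropped."""
    prefix = f"PID={pid} "  # trailing space so PID=2641 does not match PID=26416
    directory = os.path.dirname(os.path.abspath(path))
    removed = False
    # Create the temp file in the same directory so the final rename stays on one device.
    # Note: mkstemp creates the file with owner-only permissions.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", newline="") as tmp, open(path, newline="") as src:
            for line in src:
                if line.startswith(prefix):
                    removed = True
                    continue
                tmp.write(line)
        os.replace(tmp_path, path)  # swap the new file in under the original name
    except BaseException:
        os.unlink(tmp_path)
        raise
    return removed
```

Readers of the file see either the old contents or the new ones, never a half-written file. A process that appends to the original between the read and the replace would still lose its update, so writers need some form of locking if that can happen.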
You can use sed:
sed -i.bak -e '/PID=26416/d' test
-i is for editing in place. With the suffix .bak it also keeps a backup of the original file, named test.bak
-e is for specifying the sed script. /PID=26416/ is the pattern, and the d command deletes every line matching it.
test is the filename
The unix command for it is:
grep -v "PID=26416" myfile > myfile.tmp
mv myfile.tmp myfile
The grep -v part outputs the file without the rows with the search term.
The > myfile.tmp part creates a new temp file for this output.
The mv part renames the temp file to the original file.
Note that we are rewriting the file here, and moreover, we can lose data if someone writes to the file between the two commands.