PDF to PNG Ghostscript batch file issue

I am trying to drag a PDF onto a batch file that converts it to a PNG in the same directory. This works for a single-page PDF, but not for a PDF with multiple pages: the output says "Processing pages 1, 2, 3", etc., yet I end up with only the first page of the PDF. Could anyone steer me in the right direction? It would be much appreciated.
I have created a batch file with the code below. Thanks in advance.
path=%PATH%;C:\Program Files (x86)\gs\gs9.20\bin\
cd %~dp1
gswin32c.exe -sDEVICE=pngalpha -sCompression=lzw -r300x300 -dBATCH -sOutputFile=%1.png %1

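You have to use %d as part of the output filename to tell Ghostscript where the page number goes, so that each page gets its own PNG. Inside a batch file the percent sign must be doubled, so the switch becomes -sOutputFile=%1_%%d.png in that case.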
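Putting that together, a minimal sketch of the drag-and-drop script could look like the following. It assumes the Ghostscript 9.20 install path from the question. The zero-padded page pattern, the quoting, and the added -dNOPAUSE are choices made for this example, not part of the original answer. The -sCompression=lzw switch is left out because, as far as I know, it only applies to the TIFF devices.

```batch
@echo off
rem Sketch only: install path taken from the question, adjust as needed.
set "PATH=%PATH%;C:\Program Files (x86)\gs\gs9.20\bin"

rem Switch to the drive and folder of the dropped PDF.
cd /d "%~dp1"

rem The doubled percent sign reaches Ghostscript as a single one, so the
rem pattern is replaced by a three-digit page number: name_001.png, name_002.png
gswin32c.exe -sDEVICE=pngalpha -r300x300 -dBATCH -dNOPAUSE -sOutputFile="%~n1_%%03d.png" "%~1"
```

Using %~n1 instead of %1 drops the .pdf extension and the path from the output name, so report.pdf would produce report_001.png rather than report.pdf_1.png.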

Related

Batch file script to convert PDFs and image files to TIFF

I need to convert multiple PDFs, and image files (JPG, GIF, etc.) within a certain folder to TIFF and then move the TIFF to a different folder and put the original in an "ARCHIVE" folder (within the first folder). This will run every 5-10 minutes using a scheduled task within Windows.
I have been using a program called Ghostscript which works great, the command line I am using for converting a PDF for example is:
gswin64.exe -o test03.tiff -r720x720 -g6120x7920 -sDEVICE=tiffg4 test.pdf
Can anybody help me out with a script to do the above?
To convert an image to TIFF without external binaries you can use convImg.bat. It is not finished yet (I still need to add a help message and some checks), but it works. It accepts two arguments, the source image and the target image, and takes the output format from the target's extension:
call convImg.bat "C:\putin_gay_clown.jpg" "C:\putin_gay_clown.tiff"
To complete your script you can use:
@echo off
::: !!! CHANGE THE LOCATIONS BELOW !!!
set "pics=c:\pics"
set "tiff_folder=c:\tiffs"
for %%a in ("%pics%\*.jpg" "%pics%\*.gif" "%pics%\*.png" "%pics%\*.bmp") do (
    call convImg.bat "%%~fa" "%tiff_folder%\%%~na.tiff"
)
As for converting a PDF to TIFF, I don't think that is possible without an external tool.
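Ghostscript, which the question already uses, is one such external tool. A minimal sketch of the PDF half of the task follows, reusing the switches from the question's command line. It assumes the console build gswin64c.exe is on PATH, that c:\pics and c:\tiffs are placeholder folders as in the script above, and that the ARCHIVE folder sits inside the source folder as the question describes.

```batch
@echo off
rem Sketch only: folder names are placeholders, change them to match your setup.
set "src=c:\pics"
set "tiff_folder=c:\tiffs"
set "archive=%src%\ARCHIVE"
if not exist "%archive%" mkdir "%archive%"

for %%a in ("%src%\*.pdf") do (
    gswin64c.exe -o "%tiff_folder%\%%~na.tiff" -r720x720 -g6120x7920 -sDEVICE=tiffg4 "%%~fa"
    if not errorlevel 1 move "%%~fa" "%archive%\" >nul
)
```

The original PDF is moved to the archive only when Ghostscript exits without an error, so a failed conversion is picked up again on the next scheduled run.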

Using ghostscript in a Windows .bat file to convert multiple pdf files to png

I have many many pdf files in a directory that I need to convert from pdf to png. Currently, I am using the ImageMagick command:
magick mogrify -format png *.pdf
Because there are so many files, I would like to use Ghostscript directly: several sources suggest that this could reduce processing time by 75%.
However, I am having trouble finding a clean dos command example to accomplish the same thing as the ImageMagick command above. I believe I need to execute the gswin64c.exe module but I am unsure how to do this to accomplish what I need to get done. Can someone provide me with a clean example of the ghostscript that accomplishes what I'm doing in ImageMagick?
After much digging, I discovered that Ghostscript does not have a wildcard for referring to all files that match a pattern (as ImageMagick does). To convert every PDF in a directory to PNG, a batch script like the following can be used:
for %%x in (*.pdf) do gswin64c.exe -sDEVICE=png16m -dBATCH -dNOPAUSE -dQUIET -sOutputFile="%%~nx.png" "%%x"
This can also be run from the command line by simply using single percentage signs (%) instead of the double percentage signs in the script above.
The terms are as follows:
gswin64c.exe: This is the console version of Ghostscript. It should be used instead of gswin64.exe, which opens a Ghostscript window.
-sDEVICE=png16m: This selects the output device, in this case 24-bit color PNG.
-dBATCH -dNOPAUSE: These Ghostscript options allow the script to run unattended. -dNOPAUSE disables the prompt at the end of each page, and -dBATCH makes Ghostscript exit after the last input file instead of waiting at its interactive prompt.
-dQUIET: This suppresses the routine messages that are otherwise written to stdout for each processed file.
-sOutputFile="%%~nx.png": This sets the output file name and is followed by the input file name. x is the loop variable, and the ~n modifier in %%~nx expands it to the file name without its extension. Note that the switch starts with a lowercase s.

Automating multiple imagemagick commands

I have hundreds of PDFs containing powerpoint handouts (with 6 (2x3) slides on each page) that I want to convert to single slide per page PDF (or PPT, ultimate goal is importing to OneNote properly).
So far I have succeeded in finding the correct commands to create single, nicely cropped JPEG files for each slide:
C:\"Program Files"\ImageMagick-6.8.9-Q16\convert.exe -density 300 C:\Users\matt\Desktop\cmd\5.pdf -gravity Center -crop 80%x+0+0 -quality 100 -sharpen 0x1.0 C:\Users\matt\Desktop\cmd\output.jpg
This splits up the PDF pages and crops away the page numbers for easier trimming later.
C:\"Program Files"\ImageMagick-6.8.9-Q16\convert.exe -crop 2x3@ C:\Users\matt\Desktop\cmd\output-0.jpg C:\Users\matt\Desktop\cmd\output-0%d.jpg
This divides slides up from one page.
C:\"Program Files"\ImageMagick-6.8.9-Q16\convert.exe -trim C:\Users\matt\Desktop\cmd\output-00.jpg C:\Users\matt\Desktop\cmd\output-00%d.jpg
This trims the borders of each slide.
How would I automate all this into a drag and drop script so I can quickly convert a PDF into a collection of JPEGs or from there a full PDF?
After tons of reading I figured it out. I wrote a .bat (batch) file that first sets the working directory to the folder the .bat file is located in. Then I did all the ImageMagick operations, using leading zeros in the file names to get the right order (if you use *.jpg it will go through 1.jpg, 11.jpg, THEN 2.jpg). The operations basically took any .pdf in the folder and turned it into JPGs, manipulated them as I needed, and then, once they were in the proper order, made a single PDF.
Simple in theory, tricky in practice.
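The answer does not include its script, so what follows is only a rough sketch of how those steps might be chained for a single PDF dropped onto the batch file. The work folder and all output names are assumptions, the +repage switches are additions for this example, and the ImageMagick options are otherwise taken from the commands in the question.

```batch
@echo off
rem Sketch only: the work folder and output names are assumptions.
set "im=C:\Program Files\ImageMagick-6.8.9-Q16\convert.exe"
cd /d "%~dp1"
if not exist work mkdir work

rem Step 1: one cropped JPEG per handout page, page-000.jpg, page-001.jpg and so on.
"%im%" -density 300 "%~1" -gravity Center -crop 80%%x+0+0 +repage -quality 100 "work\page-%%03d.jpg"

rem Step 2: cut each page into its six slides and trim the borders.
for %%j in (work\page-*.jpg) do "%im%" "%%j" -crop 2x3@ +repage -trim +repage "work\slide-%%~nj-%%d.jpg"

rem Step 3: the zero padding keeps the wildcard in page order for the final PDF.
"%im%" "work\slide-*.jpg" "%~n1-slides.pdf"
```

Inside a batch file every literal percent sign is doubled, which is why the crop geometry from the question appears as 80%%x+0+0 and the numbering pattern as %%03d.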

creating a rtf file from a text file using MS DOS commands with wordpad header inserted in the rtf file

I was going through the question "Why is my license not showing up?", in which Tom answered that he adjusted the script to add a WordPad header to the RTF file. I would like to know how to achieve this using MS-DOS commands in a .bat file.
Just open up WordPad, type anything, and save the .rtf file. Now open the .rtf file you just created in a plain-text viewer such as Notepad or Notepad++. The first couple of lines are the RTF header.
So the batch code to create an RTF file would look like this:
@echo off
setlocal
set "myrtf=C:\rtffile.rtf"
>%myrtf% echo {\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Calibri;}}
>>%myrtf% echo {\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\sa200\sl276\slmult1\lang9\f0\fs22\par
>>%myrtf% echo this is a test\par
>>%myrtf% echo }

Parsing through many files and extract data to a new file?

Ok! My grey hairs have started popping out because of this.
I have 400 PDF files from which I want to extract one line each. The line starts with DIR, followed by a number. I also need the file name!
Does anyone know a way to parse through the PDFs (or I can convert them to txt), search for a term, append the file name to the matching line, and save the result in a new file?
Any help will be greatly appreciated!!
Thanks,
Tor
You can use the iText library to open the PDFs.
Then you will need to scan each PDF for your pattern.
The library is available at www.itextpdf.com
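If converting to text first is acceptable, as the question suggests, a batch-only alternative to iText is to let Ghostscript extract the text and then search it with findstr. This is a minimal sketch under several assumptions: gswin64c.exe is on PATH, the PDFs contain real text rather than scanned images, the DIR line survives extraction at the start of a line, and dir_lines.log is a hypothetical output name.

```batch
@echo off
rem Sketch only: assumes gswin64c.exe is on PATH and the PDFs contain real text.
for %%i in (*.pdf) do gswin64c.exe -q -dBATCH -dNOPAUSE -sDEVICE=txtwrite -sOutputFile="%%~ni.txt" "%%i"

rem Collect every line that starts with DIR from all extracted text files.
findstr /b /c:"DIR" *.txt > dir_lines.log
```

When findstr searches a wildcard, it prefixes each match with the name of the file it came from, which covers the requirement to keep the file name next to the extracted line. The output goes to a .log file so that it is not picked up by the *.txt pattern itself.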
