Simplest way to modify a document "template" and print via WPF? - wpf

The Situation
I have a WPF program that I want to print several documents from using some data. Currently these documents exist as an Excel Spreadsheet and a Word Document.
What I have tried
Opened the XPS file (saved as XPS from Excel) as a zip, pulled out the page (it consists of a single page), and dropped it into a Window with a Grid, just as a test. The result was a mass of resources that could not be found and red squiggles everywhere. Fonts specified in the XPS are stored as .odttf files, which WPF does not seem to like, and renaming them to .ttf doesn't appear to work. The layout itself appeared correct, grid lines and all, so that is hopeful.
What I really would rather not have to do
Recreate the files as a flow document, XPS, or other XAML objects by hand. The layout is pretty involved for the Excel spreadsheet. The Word doc is not so bad.
So really I just need to know: from the two inputs that I am using (Word document, Excel spreadsheet), how best would I get these into a format that I could easily print from WPF? Currently I have some code snippets that would allow me to open Excel, open the spreadsheet, put the data into the specified cells, print, issue a close command, check that the program has unloaded, and kill it if necessary. I don't want to do that anymore though. It is messy and can be buggy, and it requires the Office Interop assemblies and other components to be installed.

I found an article here which explains some things that I had previously not realized.

Related

Multiple Similar (Duplicate) Reports

I have created a report that I would like to duplicate 40 times, each with a different filter. So far it looks like this is only possible through a very manual process, but I was hoping someone has a faster solution.
Would it be possible to connect to the DataStudio API (if there is one) and run a script for this?
Also, making a design change can be problematic as it needs to be copied to 40 reports. Does anyone have a suggestion for this?
A report is stored as an RDL file in the file system. It is in XML format. Make a copy of the file and open the copy in a text editor (I prefer Notepad++) and find the parameter. Change the parameter to whatever value you want it to be and save the file under a different name. I would include the parameter value in the report name: if the report is using 'Memphis', I would name the report Sales_Memphis.RDL.
Do this 40 times. Be very careful not to change the structure of the XML file (don't change any of the element names or the opening and closing symbols, < and >). Re-import the file into the SSDT report project to verify it is using the correct value.
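The copy-and-edit step above could be scripted rather than done by hand. The following is only a sketch: it assumes the parameter's default appears in the RDL exactly once as `<Value>Memphis</Value>`, and the function name `make_report_variants` and the `Sales` prefix are made up for illustration. It works on the text directly so that nothing else in the XML is touched.

```python
from xml.sax.saxutils import escape


def make_report_variants(template_text, original_value, new_values, base_name="Sales"):
    """Return {file name: RDL text} with the parameter value swapped in each copy.

    Assumption: the value appears exactly once as <Value>original_value</Value>.
    Plain text replacement is used so the rest of the XML stays byte-for-byte intact.
    """
    needle = "<Value>%s</Value>" % escape(original_value)
    if template_text.count(needle) != 1:
        raise ValueError("expected exactly one occurrence of %s" % needle)

    variants = {}
    for value in new_values:
        replacement = "<Value>%s</Value>" % escape(value)  # escape &, <, > in the value
        file_name = "%s_%s.RDL" % (base_name, value.replace(" ", "_"))
        variants[file_name] = template_text.replace(needle, replacement)
    return variants
```

Each returned text would then be written to a file of that name and re-imported into the project to check that the value took effect.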
To import a file into an existing project:
Right click on “Reports” folder and select “Add” then “Existing Item”
Now file browser window will open.
Find the file and double click on it.
Ideally you can have a dropdown with all the possible values for users to choose from but I guess that is not appropriate for your needs.
If this is a good solution please check it off as valid solution.
I checked this with google support and apparently there is no way yet to do this.
There are currently no APIs available to do this, nor can you download something similar to an RDL file. Right now the only way is manual duplication, making adjustments for each report separately.
Have you looked into custom bookmarking? It sounds like it might address the problem you're describing. This way you would only have one report, but the links you share would automatically apply the correct filter value.

Is it possible to manipulate pdf files in Visual Basic without an external library/SDK?

I am looking at how to implement PDF merging with raw VB code so that the code may be invoked by a bot for business process automation.
The software used to create the bot provides a function to invoke VB code, but I don't believe it can access any externally imported libraries because it expects plain source, so I essentially need to produce code that one could run in a VB shell environment without anything fancy (or convenient, it seems).
All the research I've done so far points me in the direction of external packages I would need to install, such as iText; this is what I'm looking to avoid.
(previous iText employee here)
PDF is not an easy (binary) format.
Essentially, blobs of information (text that has to be rendered, fonts, images, vector graphics, etc) are compressed and gathered into objects.
Each object gets a number. Objects are allowed to reference each other (a piece of text might say 'I want to be rendered with font 4433').
All object numbers and their byte offsets in the file are gathered in the cross-reference (often called XREF) table.
A PDF includes a 'Pages' dictionary object that tells the viewer which objects belong on which page.
In order to merge PDF files, you would need to:
- read all XREF tables of all files
- adjust all of those to the correct byte offset
- update various dictionary objects within the PDF file that tell it where all the objects per page are kept
This is by no means a trivial task, but it can be done using only VB.
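To make the XREF step concrete, here is a rough sketch in Python (chosen only for brevity; the same byte-level logic would carry over to VB). It handles just the classic, uncompressed cross-reference table; newer files may use xref streams, which this does not cover. The function names and the tiny demo file are invented for the example.

```python
import re
from typing import Dict


def find_startxref(data: bytes) -> int:
    """Read the byte offset of the xref section from the tail of the file."""
    match = re.search(rb"startxref\s+(\d+)\s+%%EOF\s*$", data[-1024:])
    if match is None:
        raise ValueError("no startxref marker found")
    return int(match.group(1))


def read_xref_table(data: bytes) -> Dict[int, int]:
    """Parse a classic xref table into {object number: byte offset}."""
    lines = data[find_startxref(data):].splitlines()
    if lines[0].strip() != b"xref":
        raise ValueError("classic xref table expected (xref streams not handled)")
    offsets = {}
    i = 1
    while i < len(lines) and lines[i].strip() != b"trailer":
        first, count = (int(x) for x in lines[i].split())  # subsection header
        for n in range(count):
            offset, _generation, kind = lines[i + 1 + n].split()
            if kind == b"n":  # 'n' marks an object in use, 'f' a free entry
                offsets[first + n] = int(offset)
        i += 1 + count
    return offsets


def build_demo_pdf() -> bytes:
    """Assemble a minimal one-page PDF so the parser has something to read."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # object 0 is always the free-list head
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Because every entry is an absolute byte offset, appending the objects of a second file means renumbering them and recomputing each offset, which is where most of the merging work lies.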
If you are serious about implementing a robust, scalable version of this tool, perhaps it's better to look at the iText source code and try to port it to VB?

Data Extraction from PDF

I get 15+ PDFs a day that I have to enter into a database. They are generated from a table where the "Blanks" are filled in from specific table fields. Are there any tools or Python code examples I could use to develop a means of extracting the data from the PDFs, either to write to the database table directly or to create a table to import into it? The database is currently Access mdb.
Thanks
There are a number of approaches that will work.
One simple approach is to print the PDF file out to a text file and then have Access import that text. All recent versions of Windows allow you to install a “text” printer that outputs the printing of a document to a text file. You can have Access “process” a folder of PDFs, print them to text and then import those text files. You might need some VBA to remove “pages” and some extra lines before you import the data into Access.
Another approach is to use Word (automated from Access) to open the PDF. When Word opens a PDF, it converts it to a Word document, and it will even format rows as a Word table. You can then pluck out that table data and send it to Access. You can likely pull the text out without writing it to a text file, or just use Word's “save-as” to produce a text file (this too can be automated from Access).
Another approach is to use the free Ghostscript library, which can extract text from a PDF (this I would consider if you did not have Word at your disposal).
So which solution is best will depend on the software you are going to have installed on the computer running Access. Opening the PDF files with Word would be my first choice and test.
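Since the question asked for Python, here is one rough sketch of the clean-up step once the PDF text has been dumped to a file by any of the approaches above. It assumes each filled-in blank comes out as a "Label: value" line, which may well not match the real layout; the field names in the usage note are invented.

```python
import csv
import io
import re

# One "Label: value" pair per line; anything else (page footers, blank lines) is skipped.
FIELD_PATTERN = re.compile(r"^\s*([^:]+?)\s*:\s*(.*?)\s*$")


def parse_form_text(text):
    """Collect label/value pairs from the extracted text of one PDF."""
    record = {}
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match and match.group(2):
            record[match.group(1)] = match.group(2)
    return record


def records_to_csv(records, columns):
    """Render the records as CSV text with a fixed column order for import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells; unexpected fields are dropped.
        writer.writerow({column: record.get(column, "") for column in columns})
    return buffer.getvalue()
```

For example, `records_to_csv([parse_form_text(text)], ["Customer", "Order No"])` would give a small CSV that Access can import as a table.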
At my old job we used Cogniview, which converted PDFs to Excel spreadsheets quite quickly. If you want to use Python, a quick search turned up "PDF to XLS with Python", which seems straightforward enough.

Print PDF programmatically - C# WinForms

I need to print an SSRS report in PDF format from a WinForms application written in C#. The report is a PDF document (containing text, images and tables) held in a byte array, and I don't want to save it to disk for security/performance reasons. The requirements for printing are that it needs to be done:
- in the fastest way possible
- with no user interaction
- without the need to install anything on the client machine (we can't rely on any Adobe products being installed)
- third-party libraries can be used, as long as they can be installed together with the application
I came to 2 potential solutions:
1. using MigraDoc - but I can't find a way to load and print an existing file, only a newly created PDF file, or one already saved to disk
2. sending the PDF directly to the printer, using "PDF Direct Print"/PCL/etc. This seems to be the fastest option, but I haven't implemented it yet, and it seems to not be supported by all printers.
Does anybody have any suggestions on how to implement the options above, or any other options which meet the requirements?
MigraDoc cannot print PDF files, so one of your potential solutions is void.

SSRS 2008 R2 - Excel output not formatting to page size

I have a batch of reports that are set up to print very nicely in landscape on an A4 page. But when I set the default format to Excel, the resulting spreadsheet, when printed without changing anything in the print setup, is wider than an A4 page, so of course it gets broken up over multiple pages (i.e. each page is 2 pages wide rather than 1).
Most of our users just want to print these as soon as they arrive via email (but they still want Excel format so they can re-sort, cut and paste, etc.), so how can I make Excel keep the print format defined in the report in SSRS so the users don't have to mess about with print settings? (These are daily reports, so this is driving our users mad, as some of them may get 4 or 5 reports!)
Do I have to use an Excel template (can this even be done?) or is there a way to achieve what I want via SSRS?
TIA for any help....
Mike
The short answer is that you can't exactly do what you want with the Excel renderer. Some workarounds that come to mind:
Filling an Excel template with data might be an option, but is more of a job for SSIS, not reporting services.
Send the report in PDF for printing, and if needed in Excel as well.
Re-layout the report so it plays well with the default printing of Excel. This won't be very pretty, you'd need to either make columns much smaller (and perhaps rotate headers using the WritingMode property) or turn columns into row groups somehow.
(hack warning!) create an Excel macro or something alike for your users, that does some printing-quick-fixes.
Some background
Unfortunately SSRS gives you only a small bit of control over how the report is rendered in the various rendering extensions. There's this MSDN page on rendering extensions (additional emphasis mine) with some useful info:
Soft page-break renderers: Soft page-break renderers maintain the report layout and formatting. The resulting file is optimized for screen-based viewing and delivery, such as on a Web page. The available soft page-break renderers are: Microsoft Excel, Microsoft Word, Web archive (MHTML), and HTML.
Hard page-break renderers: Hard page-break renderers maintain the report layout and formatting. The resulting file is optimized for a consistent printing experience, or to view the report online in a book format. The available hard page-break renderers are supported: TIFF and PDF.
So, if you want to optimize for printing experience, you should probably use the PDF export. You can then play around with the page size and margins to fit as much info as possible on a page, and let the client program (probably Adobe Reader) worry about printing it nicely.
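When playing with page size and margins for the PDF export, the rule of thumb I would work from is that the report body width plus the left and right margins must not exceed the page width; otherwise each page spills onto an extra one. A small sketch of that arithmetic, with made-up helper names and millimetre values purely as an illustration:

```python
A4_LANDSCAPE_MM = (297, 210)  # width, height of A4 turned sideways


def printable_width_mm(page_width_mm, left_margin_mm, right_margin_mm):
    """Width left for the report body once both side margins are taken off."""
    return page_width_mm - left_margin_mm - right_margin_mm


def fits_one_page_wide(body_width_mm, page_width_mm, left_margin_mm, right_margin_mm):
    """True if the body fits across a single page without spilling over."""
    available = printable_width_mm(page_width_mm, left_margin_mm, right_margin_mm)
    return body_width_mm <= available
```

With 10 mm margins on A4 landscape, `printable_width_mm(297, 10, 10)` leaves 277 mm, so a 270 mm body fits and a 280 mm body does not.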

Resources