RTF blob to file via SQL Server - sql-server

We are importing a source database X to our target database Y.
X has blobs of text in RTF format which are somehow displayed in its application.
Our web app can't display RTF, so we were instructed to convert those blobs of RTF into files in our database Y.
We simply copy the RTF blob from X, where it is nvarchar, into a column in Y which we already use for storing attachments, which is of type varbinary. Then we write it out as a file foo.rtf when the user wants to view it - so they can download and open the RTF in Word.
Unfortunately the foo.rtf file, when opened in Word, just shows raw RTF, something like:
{\rtf1\ansi\ansicpg1252\deff0\deftab1134{\fonttbl{....
What do we need to do in order to correctly convert this RTF "blob of text" into an actual RTF file? It looks like just saving the bytes doesn't work.
Thank you.

Did you save the file with the .rtf extension? I know Word opens RTF files just fine that way (assuming the RTF is valid, of course).
ADDED
Something else is wrong then. I built a web site that generated .rtf files for a few thousand users with a low level of sophistication, and there was not a single complaint about problems opening them in Word.
ADDED MORE
Make sure the web server is serving the correct MIME type for your files (if they are rendered in the browser). IIS, Apache, etc. configure this in different ways.
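As a rough illustration of the MIME type point, here is a minimal Python sketch of the response headers a download handler might send for an RTF file. The function name and the sample size are hypothetical; how the headers are actually set depends on the web server or framework in use.

```python
def rtf_download_headers(filename: str, size: int) -> dict:
    """Build HTTP response headers for sending RTF bytes as a download."""
    return {
        # application/rtf is a registered MIME type for RTF documents.
        "Content-Type": "application/rtf",
        # "attachment" asks the browser to save the file instead of rendering it.
        "Content-Disposition": f'attachment; filename="{filename}"',
        "Content-Length": str(size),
    }


# Hypothetical file name and size, for illustration only.
headers = rtf_download_headers("foo.rtf", 1234)
```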

It turns out the source and target encodings were different.
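We used Cast(Cast(Value as Varchar(max)) as Varbinary(max)) and that made everything work. (Spell out the length on the outer cast: inside Cast, a bare Varbinary defaults to 30 bytes and would truncate the data.)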
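A likely explanation, sketched in Python rather than T-SQL: nvarchar data is stored as UTF-16, so copying it straight into varbinary yields two bytes per character, and the result no longer starts with the plain `{\rtf` signature that RTF readers look for. The sample string and the helper name below are illustrative assumptions, not the real data.

```python
# Illustrative RTF snippet; the real blobs come from database X.
rtf_text = r"{\rtf1\ansi\ansicpg1252\deff0 Hello}"

# What ends up in varbinary when nvarchar is cast directly (UTF-16LE).
utf16_bytes = rtf_text.encode("utf-16-le")

# What ends up in varbinary after casting through varchar (single-byte code page).
cp1252_bytes = rtf_text.encode("cp1252")


def looks_like_rtf(data: bytes) -> bool:
    """An RTF file is expected to begin with the byte sequence {\\rtf."""
    return data.startswith(b"{\\rtf")


utf16_ok = looks_like_rtf(utf16_bytes)    # interleaved zero bytes break the signature
cp1252_ok = looks_like_rtf(cp1252_bytes)  # single-byte text keeps it intact
```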

Related

Data Extraction from PDF

I get 15+ PDFs a day that I have to enter into a database. They are generated from a table where the "blanks" are filled in from specific table fields. Are there any tools or Python code examples I could use to extract the data from the PDFs and either write it to the database table or create a table to import into it? The database is currently an Access .mdb.
Thanks
There are a number of approaches that will work.
One simple approach is to print the PDF out to a text file and then have Access import that text. All recent versions of Windows allow you to install a “text” printer that sends a document's print output to a text file. You can have Access “process” a folder of PDFs, print them to text, and then import those text files. You might need some VBA to remove “pages” and some extra lines before you import the data into Access.
Another approach is to use Word (automated from Access) to open a PDF. When Word opens a PDF, it converts it to a Word document. This approach will even format rows as a Word table. You can then pluck out that table data and send it to Access. You can likely pull that text out without writing the data to a text file, or just use Word's “save-as” to a text file (you can automate this process from Access).
Another approach is to use the free Ghostscript library, which can extract text from a PDF (I would consider this if you did not have Word at your disposal).
So which solution is best will depend largely on the software you are going to have installed on the computer running Access. Opening the PDF files with Word would be my first choice and test.
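For the print-to-text route, the clean-up step could also be done outside VBA. The following is a minimal Python sketch that assumes each field appears on its own line as `Label: value`; that layout, the labels (`Customer`, `Order No`, `Total`) and the sample text are all hypothetical and would need to match what your PDFs actually produce. It turns the text into a CSV string that Access can import.

```python
import csv
import io


def parse_form_text(text: str) -> dict:
    """Collect 'Label: value' lines into a dict, ignoring everything else."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue  # blank lines, page breaks, decorative lines
        label, value = line.split(":", 1)
        record[label.strip()] = value.strip()
    return record


def records_to_csv(records, fieldnames) -> str:
    """Serialize parsed records as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


# Hypothetical text as it might come out of a "text" printer.
sample = "Customer: Acme Ltd\nOrder No: 1042\n\f\nTotal: 99.50\n"
record = parse_form_text(sample)
csv_text = records_to_csv([record], ["Customer", "Order No", "Total"])
```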
At my old job we used Cogniview, which converted PDFs to Excel spreadsheets quite quickly. If you want to use Python, a quick search turned up this, which seems straightforward enough: PDF to XLS with Python

Looking for a .fil converter

The short: I inherited a website I didn't make. The previous site (which I have been redesigning) had a file upload feature which converted .docx, .xlsx, .pdf, etc. into .fil and stored them in an uploads folder.
The previous developer is no longer available and I'm looking for a reliable way to convert the files back into their original types/extensions. Any ideas on a reliable conversion application? Or just a simple way to go about this?
I'd try to just change the file extension to .pdf or .doc and try to open the file with Microsoft Word or Adobe Reader. If that fails, give opening straight into Word a shot. I don't think that Word handles that file extension natively, but it may be able to interpret it. Good luck.
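If the .fil files are simply the original uploads under a new extension (an assumption; the upload feature may have transformed them), the leading bytes can tell you which extension to restore before you try opening anything. A minimal Python sketch follows; the function name is made up, and only a few common signatures are checked.

```python
import io
import zipfile


def guess_original_extension(data: bytes):
    """Guess a file's original extension from its leading bytes, or None."""
    if data.startswith(b"%PDF"):
        return ".pdf"
    if data.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        # OLE compound file: legacy Office formats share this signature.
        return ".doc or .xls"
    if data.startswith(b"PK\x03\x04"):
        # ZIP container: .docx and .xlsx are ZIP archives with known folders.
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                names = zf.namelist()
        except zipfile.BadZipFile:
            return ".zip"
        if any(name.startswith("word/") for name in names):
            return ".docx"
        if any(name.startswith("xl/") for name in names):
            return ".xlsx"
        return ".zip"
    return None


# Build a tiny stand-in for a .docx to exercise the ZIP branch.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr("word/document.xml", "<w:document/>")

docx_guess = guess_original_extension(sample.getvalue())
pdf_guess = guess_original_extension(b"%PDF-1.7\n")
unknown_guess = guess_original_extension(b"plain text")
```

In practice you would read each file from the uploads folder with `open(path, "rb").read()` and rename it according to the guess.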

File Maker Scripting - Sending Different Attachment

Is there a way to send an e-mail with a different PDF file to each contact using FileMaker?
I am aware of sending batch e-mails with one attachment, but I would like to send a personalized PDF to each contact, which seems not so simple.
Also:
Can I add PDF files to the table itself, or would I have to use the path to the file?
Example:
Table 1
**Name** [James Brown] [James Blue]
**Email** [brown.j#gmail.com] [blue.j#gmail.com]
**PDFfileAttachment** [folder/PDF/JamesBrown.pdf] [folder/PDF/JamesBlue.pdf]
So an Email for James Brown would look like:
Dear James Brown, please see the attached file.
Attachment [JamesBrown.pdf] {actual file}
and
Dear James Blue, please see the attached file.
Attachment [JamesBlue.pdf] {actual file}
I think you can solve it by creating a container field in your database and importing the PDFs into it.
Then you can use Export Field Contents[] to export each one and send it by e-mail.
Hope it's useful.
"I would like to send a personalized PDF for each contact, which seems not so simple."
Find the records of the contacts you want to include and loop through them, sending mail to each one individually (i.e. without selecting the 'Collect addresses across found set' option).
"Can I add PDF files to the table itself or would I have to use the path to the file?"
You can do either, it's up to you. If the path to the file can be calculated (as in your example), you can calculate it right there in the Send Mail script step.
Note that you can also generate the PDF files during the process itself.
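The one-message-per-record idea is not specific to FileMaker. Purely as an illustration outside FileMaker, here is a Python sketch that builds one message per contact with that contact's own PDF attached. The sender, the example.com addresses, the subject line and the placeholder PDF bytes are all invented for the example, and nothing is actually sent.

```python
from email.message import EmailMessage


def build_personal_mail(name, address, pdf_name, pdf_bytes,
                        sender="sender@example.com"):
    """Build one message addressed to one contact, with one PDF attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = address
    msg["Subject"] = "Your document"
    msg.set_content(f"Dear {name}, please see the attached file.")
    msg.add_attachment(pdf_bytes, maintype="application",
                       subtype="pdf", filename=pdf_name)
    return msg


# Hypothetical contacts; the PDF bytes would normally be read from disk.
contacts = [
    ("James Brown", "brown.j@example.com", "JamesBrown.pdf"),
    ("James Blue", "blue.j@example.com", "JamesBlue.pdf"),
]
messages = [
    build_personal_mail(name, address, pdf_name, b"%PDF-1.4 placeholder")
    for name, address, pdf_name in contacts
]
```

Each message could then be handed to `smtplib.SMTP.send_message` in the same loop.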
Do I understand correctly that you would actually like to personalize the PDF document(s)?
This is possible; perhaps not trivial, but fairly simple. The trick is to prepare the PDF as a form, and then fill the form fields to personalize it.
PDF has a native forms data format (called FDF), which is described in ISO 32000 (as well as the older PDF specification documents provided by Adobe, as you can find in the Acrobat SDK, downloadable from the Adobe website).
FDF is a simple structured text file, which can easily be assembled using FileMaker (I have done that routinely for several catalog projects). The easiest way to get going is to open the form in Acrobat, fill in the fields, and then export the data as FDF. This gives you the pattern to "fill in the blanks".
So, you create the FDF files using FileMaker. With them you can fill the blank form and feed the saved document to the e-mail system.
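To give an idea of what such an FDF file looks like, here is a minimal Python sketch that assembles one from field names and values. The field names are hypothetical and must match the names in your form; the sketch only handles simple text values, and an FDF exported from Acrobat for your own form remains the better template.

```python
def fdf_escape(value: str) -> str:
    """Escape backslashes and parentheses inside a PDF literal string."""
    return (value.replace("\\", "\\\\")
                 .replace("(", "\\(")
                 .replace(")", "\\)"))


def build_fdf(fields: dict) -> str:
    """Assemble a small FDF document that sets each field to a text value."""
    entries = "\n".join(
        f"<< /T ({fdf_escape(name)}) /V ({fdf_escape(value)}) >>"
        for name, value in fields.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n"
        f"{entries}\n"
        "] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )


# Hypothetical form fields for one contact.
fdf_text = build_fdf({"Name": "James Brown", "Note": "Invoice (copy)"})
```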
Which tool to use to fill the blank form depends on the volume you have to process. Acrobat is not very powerful (and you may end up in a bit of a legal gray zone, because Acrobat is not set up for being used as a service). There are applications made specifically for filling out forms on a server (such as FDFMerge by Appligent), and there are also several libraries that have the tools to fill out forms (iText or PDFlib come to mind). These applications also allow you to flatten the PDF, which means that there are no longer form fields; their contents become part of the base document.
The resulting file can now either be sent as an e-mail attachment, or be made available on a server, with an e-mail sent containing the link to the file (which method you use may depend on security and privacy regulations).

WinForms and Access Database Attachments

If I have a PDF file saved in an attachment field in an Access database, is there any way I could get that attachment from the database and view it in the WinForm? Or in the WinForm WebBrowser, maybe?
Or am I just better off sticking to a field in the database that tells me the file path of said file so I can navigate my WebBrowser to that?
I've been working with Access since long before Access 2007 introduced the Attachments field type, so I have a history of shying away from embedding images and documents in the database. (They tended to bloat the database quite significantly, and the OLE "wrappers" added to the files were a real nuisance when trying to extract the files via code.)
Access 2007+ makes this quite a bit simpler with the Attachments field because DAO has been updated to support .SaveToFile and .LoadFromFile on Attachments. Also, attachments are (apparently) compressed when saved to the database, which should help with the bloat problem.
So, I'd say that the choice is really up to you, because if you want to view (or preview, or open) the PDF attachment in your WinForm then you'll probably wind up using Microsoft.Office.Interop.Access.Dao to save the attachment as a temporary file anyway. Therefore, whatever mechanism you use to preview/view/open the attachment will be working on a file; it will either be
a temporary file extracted from the database, or
a persistent file in a filesystem that you reference from a pathname or URL in the database.
To read the file from the database you would need to read the bytes from the database and write them to a new pdf file, then point your viewer to that file. To view a pdf directly on the WinForm you would need a 3rd party control. If you have a plugin where you can view pdfs in your browser, the WinForm WebBrowser will work.
Storing just the path in the database leads to less hassle from a coding standpoint, because you are going to have to point your viewer to a file anyway. Also, if there's an issue with the database, there is a higher chance that all the attachments will be lost. On the flip side, if just the paths are stored, you need to make sure those paths are always accessible.
I would recommend storing them outside of the database for the above reasons, especially if this is a larger database.
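Whichever storage option you pick, the "bytes to temporary file, then point the viewer at it" step is small. Here is a language-neutral illustration in Python (not WinForms code); the placeholder bytes stand in for whatever you read from the attachment field, and the function name is made up.

```python
import os
import tempfile


def write_temp_pdf(pdf_bytes: bytes) -> str:
    """Write the bytes to a new temporary .pdf file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    with os.fdopen(fd, "wb") as handle:
        handle.write(pdf_bytes)
    return path


# Placeholder bytes; real data would come from the database.
path = write_temp_pdf(b"%PDF-1.4 placeholder")
```

The returned path is what you would hand to the viewer or browser control; the caller is responsible for deleting the file afterwards.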

dsofile c# API / NTFS custom file properties

I'm searching for a good way to add metadata to a file. dsofile.dll works fine on NTFS, but the metadata is lost when one drops a copy on a FAT32 share (it uses NTFS hidden streams, I guess). Microsoft Word documents contain metadata that is not lost; how do they do it? Similar to FAT, sending the file via e-mail strips off all metadata created with dsofile (and also metadata created by hand with Windows Explorer). Separate metadata files are not an option. It must be compatible with standard Windows techniques. If I send someone a file with Outlook and he sends it back, the metadata should not be lost.
(the required meta data is actually only an ID)
The issue is that all file systems provide a single-stream view of the file as a greatest common denominator. Through this interface, which exposes the file's "contents", you can read or store properties and have them transported with the "contents" by naive system (or user) utilities. For example, CopyFile in Windows will carefully lose alternate data streams and has no notion of "shadow files".
The question is whether or not the format of the "contents" allows for arbitrary addition of properties.
Some formats allow arbitrary content (e.g., MSFT's docfile aka .doc/.xls/etc). Some allow limited content (.mp3, .jpg, .exe).
Some are completely SOL (.txt, .bmp).
Any solution would be format-dependent. MS Office files are (all) compound files and there's a place for properties there. In some formats (PE files, for example) it's safe to just append data to the end of the file, if you know how to read it later. In a ZIP file you can probably find a place in the directory or just add a helper file with your data to the archive. Other formats can't stand this, and you'd need to find your own way of solving the problem.
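As one concrete instance of the ZIP case, the archive-level comment field travels inside the file itself, so it survives FAT32 copies and e-mail. The Python sketch below stores an ID there and reads it back; the function names and the sample ID are invented, and any tool that rewrites the archive may well drop the comment, so treat it as an illustration rather than a robust scheme.

```python
import io
import zipfile


def set_archive_id(archive, file_id: str) -> None:
    """Store an ID in the ZIP archive's comment field."""
    with zipfile.ZipFile(archive, "a") as zf:
        zf.comment = file_id.encode("ascii")


def get_archive_id(archive) -> str:
    """Read the ID back from the ZIP archive's comment field."""
    with zipfile.ZipFile(archive, "r") as zf:
        return zf.comment.decode("ascii")


# In-memory archive for demonstration; a file path works the same way.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("report.txt", "payload")

set_archive_id(buffer, "ID-12345")
stored_id = get_archive_id(buffer)
```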
Actually, the file name can also be a good placeholder for your ID.
If you need to store the files somewhere but don't need them to remain readable by outside applications, you can pack them into a ZIP archive or use something like our SolFS library.
What about the standard properties rather than custom DSOFile properties, i.e. Comments, Author, etc.? Do they get wiped?
Not sure if it's ideal, but a way we've gotten around it is a tool that takes the DSOFile properties and saves them to a text file, which is then e-mailed along with the file; at the other end the user runs a tool to re-import the DSOFile properties from the text.
