How do I export data with attachments from a Lotus Notes Database into an Excel Spreadsheet or into a Microsoft Access Database? - export

I'm not a Lotus Notes developer, but I have to get data from a Lotus Notes database into SharePoint. All of the LN entries have attachments. I tried exporting to a CSV file, but that doesn't include the attachments. I then created a new view with the Attachments field, but that only returns the number of attachments. How can I extract the attachments associated with each LN form? Thanks in advance.

Your question is pretty broad. Attachments are (sometimes) treated as embedded objects in a Rich Text Field. This URL has some sample code:
https://www.ibm.com/support/knowledgecenter/en/SSVRGU_9.0.1/basic/H_EXAMPLES_EMBEDDEDOBJECTS_PROPERTY_RTITEM.html
Copy/paste may not work for you because the attachments may not be in a field called "Body", or there may be multiple "Body" fields on the document (which requires other considerations beyond the scope of this question), or the attachments may be embedded objects in the document. Or all of the above. Still, that code will give you a sense of what you need to do.
Also, see this:
How to retrieve Lotus Notes attachments?

I have done this by writing LotusScript code to detach all the attachments from all docs into a single folder, using the document's UNID plus the attachment name for the filename in the folder. Adding the UNID covers cases where attachments with the same name exist in multiple documents and might actually have different content. I do not attempt to de-duplicate.
The agent adds a NotesItem to each document giving the filename(s) of the detached attachment(s).
I then create a view containing all the fields that I want to export, including the new field with the filenames. I export that view to CSV. I hand the CSV and a zip file containing the attachments over to the SharePoint team.
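The naming scheme and the CSV hand-off described above can be sketched roughly as follows. This is not the LotusScript agent itself, only an illustration in Python of the UNID-plus-name convention and the extra filename column; the function names, the column names, and the character-replacement rule are assumptions made for this example.

```python
import csv
import io
import re


def detached_name(unid, attachment_name):
    """Build a folder-unique filename from a document UNID and an attachment name."""
    # Replace characters that are unsafe in Windows filenames (illustrative choice).
    safe = re.sub(r'[\\/:*?"<>|]', "_", attachment_name)
    return f"{unid}_{safe}"


def build_manifest(docs):
    """Return CSV text with one row per document and a column listing its detached files."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Hypothetical column names; a real export would list the view's own fields.
    writer.writerow(["UNID", "Subject", "DetachedFiles"])
    for doc in docs:
        files = [detached_name(doc["unid"], name) for name in doc["attachments"]]
        writer.writerow([doc["unid"], doc["subject"], ";".join(files)])
    return out.getvalue()
```

Because the UNID is part of every filename, two documents that both carry a "report.pdf" end up as two distinct files in the shared folder.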

Maybe a bit late but... I do have extensive experience (approx. 15 years) with data extraction from IBM Notes applications/databases - independent of the type of application - and have supported migrations of quite a few large IBM Notes applications to various targets for companies around the world.
You can access IBM Notes databases using, for example, the native C-API, LotusScript, COM or Java, or you can make a document available for further processing by exporting it to Domino XML (DXL) format.
The C-API is the foundation of IBM Notes, meaning that COM and Java APIs only offer a subset of the C-API's functionality. Any of the APIs should give you the ability to extract a document's metadata and attachments. However:
A document, including its attachments, can be encrypted using an IBM Notes ID. If you do not have access to the ID that was used to encrypt the document, you will be able to extract neither the document nor its attachments.
Attachments can be "real attachments" or so-called "embedded objects". Depending on the type, the attachment needs to be handled differently when it comes to the API calls required for the export.
Attachments can be compressed. In most cases, the API should handle the decompression transparently. However, there is at least one proprietary compression algorithm (based on Huffman) that is widely used. If you extract documents in DXL format, you will not be able to read those attachments, as they are embedded into the DXL in compressed form.
Objects embedded into a document using Object Linking and Embedding (OLE) cannot be extracted using the COM or Java API. That is, even if you gain access to the documents, you will not be able to transform them into a readable format.
If the information you are trying to transfer from IBM Notes to SharePoint is important to the company you work for, I would recommend relying on a proven solution for the export/migration rather than developing this on your own, as the details can really be tricky.
Should you have any further questions, don't hesitate to get in touch.

Related

Is there a way to show pdf in its original structure in the human review custom entity labelling in aws sagemaker?

I have modified this sample to read PDFs in tabular format. I would like to keep the tabular structure of the original pdf when doing the human review process. I notice the custom worker task template uses the crowd-entity-annotation element which seems to read only texts. I am aware that the human reviewer process reads from an S3 key which contains raw text written by the textract process.
I have been considering writing to S3 using tabulate but I don't think that is the best solution. I would like to keep the structure and still have the ability to annotate custom entities.
Comprehend now natively supports detecting custom-defined entities in PDF documents. To do so, you can try the following steps:
Follow this GitHub README to start the annotation process for PDF documents.
Once the annotations are produced, you can use the Comprehend CreateEntityRecognizer API to train a custom entity model for semi-structured documents.
Once the entity recognizer is trained, you can use the StartEntitiesDetectionJob API to run inference on PDF documents.

IBM Watson, how to input data of entire books

I'm using the IBM Watson Analytics trial, and it says it only takes data as CSV, Excel and a few other formats. How can I convert books or bodies of text into an acceptable format? Thank you.
It seems that the architecture of WCA (Watson Content Analytics) does not support PDF directly. Please refer to the following images from IBM Link
I think it would be better to convert the PDF to text with a converter such as CONVERTER and push it into a database or similar.
Then you can crawl the text data from there.
FYI, the document has to have a KEY column (i.e. name of the book).
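As a rough illustration of the conversion step, assuming the book is already available as plain text, the sketch below writes one paragraph per CSV row with a header row and a key column holding the book name. The column names (`book`, `paragraph_no`, `text`) and the paragraph-per-row layout are assumptions for this example, not something Watson prescribes.

```python
import csv
import io


def book_to_csv(title, text):
    """Turn plain book text into CSV rows: one paragraph per row, keyed by title."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Header row first; the "book" column acts as the key column.
    writer.writerow(["book", "paragraph_no", "text"])
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    for number, paragraph in enumerate(paragraphs, start=1):
        # Collapse internal line breaks so each paragraph stays on one CSV row.
        writer.writerow([title, number, " ".join(paragraph.split())])
    return out.getvalue()
```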
Even if you do convert your book into an acceptable format (.csv, .xls, .xlsx, .sav), Watson Analytics isn't optimized for text analytics. It sounds like Watson Explorer is the offering that would best suit your needs.
Hope this helps.
Even though CSV and XLS are acceptable file formats, datasets need to be in a specific structure. You need headers for all the tables, with the data following them. I am not sure how the contents of a book can fit into that format.
I have recently published this blog post on how to structure and refine data before importing into Watson Analytics to get the best results.
For your specific requirement, you can look into Watson Explorer as suggested by Brennan above, or even better you can learn to use IBM Content Analytics here.

How to instruct IBM Watson Discovery about the format of my documents?

I am trying to use the Watson Discovery service to build a virtual customer support agent. We have many documents with tons of Q and A in various formats. In the simplest case, we just have a doc, with an array of:
Q:..
A:...
Q:...
A:...
etc. When we upload these PDF files and then try to query them, it returns the full document that included the relevant answer. Is there a way to instruct the Discovery service so that it will only return the relevant question and answer pair instead of the full document?
To have Discovery return the individual relevant QA pairs, they should be split up and passed to the service as separate documents. Discovery does not have a method to split a single document on its own.
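A minimal sketch of the splitting step, assuming the text has already been extracted from the PDF and that every pair starts with "Q:" and "A:" at the beginning of a line, as in the question. The function name and the dictionary keys are made up for this example; uploading each pair to the service is left out.

```python
import re


def split_qa(text):
    """Split text of the form 'Q: ... A: ...' into a list of question/answer dicts."""
    # Each pair starts at a line beginning with "Q:" and runs to the next "Q:" or end of text.
    pattern = re.compile(r"^Q:\s*(.*?)\s*^A:\s*(.*?)\s*(?=^Q:|\Z)", re.S | re.M)
    return [
        {"question": match.group(1), "answer": match.group(2)}
        for match in pattern.finditer(text)
    ]
```

Each dictionary can then be serialized (for example as JSON) and added as its own document, so a query returns only the matching pair.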
If your primary requirement is Q&A, you might look into Retrieve-Rank.
Discovery is meant for complex unstructured data, whereas in your case the data is in a consistent format.
Have a look at this sample app here

Generate a series of documents based on SQL table

I am trying to formulate a proposal for an application that allows a user to print a batch of documents based on data stored in a SQL table. The SQL table indicates which documents are due and also contains all demographic information. This is outside of what I normally do, and I am trying to see if there is a platform/application that already exists for such a task.
For example
List of all documents: Document #1 - Document #10
Person 1 is due for document #: 1,5,7,8
Person 2 is due for document #: 2,6
Person 3 is due for document #: 7,8,10
etc
Ideally, what I would like is for the user to be able to push a button and get a printed stack of documents that have been customized for each user including basic demographic info like name, DOB, etc
Like I said at the top, I already have all of the needed information in a database; I am just trying to figure out the best approach to move that information onto a document.
I have done some research and found some people have used mail merge in Word or using Access as a front end but I don't know if this is the best way. I've also found this document. Any advice would be greatly appreciated
If I understand correctly, your problem is two-fold: firstly, you need to find a way to generate documents based on data (mail merge), and secondly, you might need to print them too.
For document generation you have two basic approaches: template-based and programmatically from scratch. I suppose that you will opt for a template based approach which basically means that you design (in MS Word) a template document (Word, RTF, ...) that acts as a template and contains placeholders and other tags that designate »dynamic« parts of the document. Then, at document generation time, you need a .NET library/processor that you will pass this template document and the data, where the processor will populate the template with the data and return the resulting document.
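To make the template-based idea concrete, here is a small sketch. It uses Python with an in-memory SQLite table and a plain-text template rather than a .NET library and a Word template, purely to show the flow of "template plus data row gives document". The table name `due_documents`, its columns, and the template text are assumptions for this example.

```python
import sqlite3
from string import Template

# Hypothetical plain-text template; a real one would be a Word/RTF file with placeholders.
TEMPLATE = Template("Document #$doc_no\nName: $name\nDOB: $dob\n")


def render_due_documents(conn):
    """Return one rendered document per (person, due document) row."""
    rows = conn.execute(
        "SELECT name, dob, doc_no FROM due_documents ORDER BY name, doc_no"
    )
    # The processor step: populate the template's placeholders from each row.
    return [TEMPLATE.substitute(name=n, dob=d, doc_no=k) for n, d, k in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE due_documents (name TEXT, dob TEXT, doc_no INTEGER)")
conn.executemany(
    "INSERT INTO due_documents VALUES (?, ?, ?)",
    [("Person 1", "1980-01-01", 5), ("Person 1", "1980-01-01", 1), ("Person 2", "1975-06-30", 2)],
)
documents = render_due_documents(conn)
```

The resulting list is ordered per person, which is the order a printed stack would need.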
One way to achieve this functionality would be to employ MS Word's native mail merge, but you should know that this would involve using Office COM and Word Application Automation, which should almost always be avoided.
Another option is to build such a system on top of the Open XML SDK. This is a valid option, but it will be a pretty demanding task and will most probably cost you much more than buying a commercial .NET library that does mail merge out of the box (been there, done that). But of course, the good side here is that you will be able to tailor the solution to your needs. If you go down this road, I recommend that you use Content Controls for tagging documents/templates. The solution with CCs will be much easier to implement than the solution with bookmarks.
I'm not very familiar with the open source solutions, and I'm not sure how many there are that can do mail merge. One I know is FlexDoc (on CodePlex), but its problem is that it uses a construct (XmlControl) for tagging that is deprecated in Word 2010+.
Then there are commercial solutions. Again, I don't know them in detail, but I know that the majority of them are general-purpose document processing libraries. Our company has been using this document generation toolkit for some time now, and I can say it covers all our »template-based document generation« needs. It doesn't require MS Word at document generation time, it has a really helpful add-in for MS Word, and you only need several lines of code to integrate it into your project. Templating is very powerful, and you can set up a template in a very short time. While templates are Word documents, you can generate PDF or XPS docs as well. XPS is useful because you can use the .NET/WPF printing framework, which works with XPS docs, to print documents. This is a very high-end solution, but of course, the downside here is that it is not free.

File Maker Scripting - Sending Different Attachment

Is there a way to send a mail with a different PDF file to different contacts using FileMaker?
I am aware of sending batch emails with one attachment, but I would like to send a personalized PDF to each contact, which seems not so simple.
Also
Can I add PDF files to the table itself or would I have to use the path to the file?
Example:
Table 1
**Name** [James Brown] [James Blue]
**Email** [brown.j#gmail.com] [blue.j#gmail.com]
**PDFfileAttchamnet** [folder/PDF/JamesBrown.pdf] [folder/PDF/JamesBlue.pdf]
So an Email for James Brown would look like:
Dear James Brown, please see the attached file.
Attachment [JamesBrown.pdf] {actual file}
and
Dear James Blue, please see the attached file.
Attachment [JamesBlue.pdf] {actual file}
I think you can solve it by creating a container field in your database and importing the PDFs into it.
Then you can use Export Field Contents [] to export each one and send it by email.
Hope this is useful.
"I would like to send a personalized PDF to each contact, which seems not so simple."
Find the records of contacts you want to include and loop among them, sending mail to each one individually (i.e. without selecting the 'Collect addresses across found set' option).
"Can I add PDF files to the table itself or would I have to use the path to the file?"
You can do either, it's up to you. If the path to the file can be calculated (as in your example), you can calculate it right there in the Send Mail script step.
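The per-contact loop is a FileMaker script in practice; the Python below only illustrates the logic of building one message per record with that record's own file. The function name, the subject line, and the filename rule (contact name without spaces plus ".pdf", mirroring the question's example) are assumptions, and nothing is actually sent.

```python
from email.message import EmailMessage


def build_message(name, email, pdf_bytes):
    """Build one personalized message with that contact's own PDF attached."""
    msg = EmailMessage()
    msg["To"] = email
    msg["Subject"] = "Your document"  # hypothetical subject line
    msg.set_content(f"Dear {name}, please see the attached file.")
    # Filename calculated from the contact name, as in the question's example.
    filename = name.replace(" ", "") + ".pdf"
    msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf", filename=filename)
    return msg
```

Calling this once per found record, instead of collecting all addresses into one message, gives every contact a different attachment.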
Note that you can also generate the PDF files during the process itself.
Do I understand correctly that you would actually like to personalize the PDF document(s)?
This is possible; it may not be trivial, but it is fairly simple. The trick is to prepare the PDF as a form, and then fill the form fields to personalize it.
PDF has a native forms data format (called FDF), which is described in ISO 32000 (as well as the older PDF specification documents provided by Adobe, as you can find in the Acrobat SDK, downloadable from the Adobe website).
FDF is a simple structured text file, which can easily be assembled using FileMaker (I have done that routinely for several catalog projects). The easiest way to get going is to open the form in Acrobat, fill in the fields, and then export the data as FDF. This gives you the pattern to "fill in the blanks".
So, you create the FDF files using FileMaker. With them you can fill the blank form and feed the saved document to the eMail system.
Which tool to use to fill the blank form depends on the volume you have to process. Acrobat is not very powerful (and you may end up in a bit of a legal gray zone, because Acrobat is not set up for being used as a service). There are applications which are made specifically for filling out forms on a server (such as FDFMerge by Appligent), and there are also several libraries which have the tools to fill out forms (iText or pdflib come to mind). These applications also allow you to flatten the PDF, which means that there are no longer form fields; instead, their contents become part of the base document.
The resulting file can now either be sent as an eMail attachment, or you can make it available on a server and send an eMail with a link to the file (which method you use may depend on security and privacy regulations).
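For a sense of what "fill in the blanks" looks like, here is a minimal sketch that assembles an FDF file as text. It is written in Python rather than as a FileMaker calculation, the field names are hypothetical, and it only covers plain text values; exporting an FDF from Acrobat for your own form remains the reliable way to get the exact pattern, including any encoding needed for non-Latin characters.

```python
def fdf_escape(value):
    """Escape backslashes and parentheses for a PDF literal string."""
    return value.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_fdf(fields):
    """Return minimal FDF text that sets each form field name to its value."""
    # One dictionary per field: /T is the field name, /V is its value.
    entries = "\n".join(
        f"<< /T ({fdf_escape(name)}) /V ({fdf_escape(value)}) >>"
        for name, value in fields.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n"
        f"{entries}\n"
        "] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )
```

For example, `build_fdf({"name": "James Brown"})` produces a file whose field list contains `<< /T (name) /V (James Brown) >>`, which a form-filling tool can merge into the blank PDF.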
