OpenSearchServer: how to crawl PDF content stored in a database

I have a number of PDF documents, but all of the document data is stored in a database (file name, description, client ID, and the complete PDF content as a field of the database).
Is it possible to have the PDF parser take its data from a field returned by the database crawler?
Thanks.

Related

XML generation for Amazon seller Feed API in Salesforce?

Amazon seller accounts accept XML files when you create a listing (product) there. I need to do this from Salesforce. Is there any way to generate XML that fits Amazon's API? They provide an XSD file, but I am not sure how it helps in Salesforce.
An XSD defines the structure of an XML file. Try generating a sample XML file from the provided XSD to see the structure.
http://xsd2xml.com/

How to duplicate a file field entry from Drupal Commerce to another field so it can be previewed

I'm looking for a way to copy a file that is uploaded to Drupal Commerce via Commerce File/License to another file field in Drupal 7. I want to preview audio files sold on Drupal Commerce, but because Commerce License forces a private file structure, I have to duplicate the file to another field that is public.
Is there a way to copy an uploaded file to a public directory and also add a database entry for an additional field associated with the content type?
Basically, you have to create a "public mirror" of the file you uploaded.
Without digging too much into the code: since commerce_product is an entity, you can alter the product whenever it is saved, so you can proceed as follows:
create a module and implement hook_entity_insert and/or hook_entity_update
find the file you need and make a public copy
store the public file in another field of the same entity

How do I access a file through SOQL queries?

I need to find something that is inside an sObject referenced in my SOQL search, but the search just returns the file name. Can anybody help me with this? Thanks.
Perhaps this will help: https://www.salesforce.com/us/developer/docs/api/Content/sforce_api_objects_contentdocumentlink.htm.
From this document:
"ContentDocumentLink represents the link between a Salesforce CRM
Content document or Chatter file and where it's shared. A file can be
shared with other users, Chatter groups, records, and Salesforce CRM
Content libraries. This object is available in versions 21.0 and later
for Salesforce CRM Content documents and Chatter files."
Example query:
"SELECT ContentDocument.title FROM ContentDocumentLink WHERE ContentDocumentId = '069D00000000so2' AND LinkedEntityId = '0D5000000089123'"

Using a collection in the where clause in the salesforce command line data loader

I'm currently trying to back up the attachment files from old cases and emails in our Salesforce org through an automated process. I first tried to do this with the Bulk API, but sadly it doesn't allow me to export the Body column of attachments.
I did manage to pull this data out with the Data Loader via the command line (and the FileExporter tool). Now I would like to export only those attachments that are attached to old emails or cases.
Is it possible to use a collection of IDs (preferably in a file) from those parent objects in the WHERE clause of the query in the beans file? If so, could somebody post an example?
Something along the lines of:
<entry key="sfdc.extractionSOQL" value="SELECT Id, Body FROM Attachment WHERE id IN parentIdFile.csv"/>
Would be much appreciated!
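As far as I know, SOQL has no syntax for pointing an IN clause at a file, so the IDs have to be written into the query as a literal list. One possible workaround is to generate the query text from the CSV before running the Data Loader. The following is only a sketch under assumptions: the CSV header name (Id), the filter field (ParentId), the chunk size, and the helper names read_ids and build_queries are all hypothetical, and the sample IDs are made up. Each generated string would then be used as the value of sfdc.extractionSOQL.

```python
import csv
import io


def read_ids(csv_text, column="Id"):
    """Return the non-empty values of `column` from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row.get(column) or "").strip()
            for row in reader
            if (row.get(column) or "").strip()]


def build_queries(ids, field="ParentId", chunk_size=200):
    """Yield one SOQL statement per chunk of IDs.

    Chunking keeps each statement short; the right chunk size depends on
    the query length limit of your org, so 200 is only a placeholder.
    """
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        for record_id in chunk:
            # Salesforce IDs are alphanumeric; refuse anything else
            # rather than writing it into the query text.
            if not record_id.isalnum():
                raise ValueError("unexpected ID value: %r" % record_id)
        id_list = ", ".join("'%s'" % record_id for record_id in chunk)
        yield "SELECT Id, Body FROM Attachment WHERE %s IN (%s)" % (field, id_list)


# Made-up sample standing in for the contents of parentIdFile.csv.
sample = "Id\n500D000000AAAAA\n500D000000BBBBB\n500D000000CCCCC\n"
queries = list(build_queries(read_ids(sample), chunk_size=2))
for query in queries:
    print(query)
```

With a real file, read_ids(open("parentIdFile.csv").read()) would replace the sample string. If the parent records can be described by a condition instead of a list, a subquery in the IN clause may avoid the file altogether.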

Accessing SharePoint database to read all blob data

I am uploading an image to SharePoint, and it is being saved as a blob. I need to create an XML file containing the blob data plus other data that helps users identify it. The following gives an idea of what I want:
<image>
<name>mydog</name>
<extension>.jpg</extension>
<blobid>0234234</blobid>
<blobpath>435343445</blobpath>
</image>
I was looking at the tables in wss_content and came across alldocumentstreams, which has a column called rbsid. Unfortunately, I cannot link this ID to any of my documents. My question is: is there a way to get all the blob information from the DB so I can link it to the other details?
Directly accessing the SharePoint database isn't supported by Microsoft.
If a server component requires information from the database, it must
get that data by using the appropriate items in the SharePoint object
model, and not by trying to get the items from the data structures in
the database through some query mechanism.
You might be better using the SharePoint object model to read these files.
Some links that should help
http://www.codeproject.com/KB/sharepoint/File_Shunter.aspx
http://www.learningsharepoint.com/2011/04/01/read-a-file-in-sharepoint-document-library/
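Once the file details have been read through a supported route, producing the XML from the question is straightforward. Below is a minimal sketch, assuming the details are already available as plain records; the records list is hand-written using the sample values from the question, and the images root element and the image_element helper are hypothetical additions for illustration. How the blob ID and path are actually obtained from SharePoint is not covered here.

```python
import xml.etree.ElementTree as ET


def image_element(name, extension, blob_id, blob_path):
    """Build one <image> element with the child tags used in the question."""
    image = ET.Element("image")
    for tag, value in (("name", name),
                       ("extension", extension),
                       ("blobid", blob_id),
                       ("blobpath", blob_path)):
        ET.SubElement(image, tag).text = str(value)
    return image


# Placeholder records; in practice these would come from SharePoint.
records = [
    {"name": "mydog", "extension": ".jpg",
     "blob_id": "0234234", "blob_path": "435343445"},
]

root = ET.Element("images")  # assumed wrapper element, not in the question
for record in records:
    root.append(image_element(**record))

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Using ElementTree rather than string concatenation means special characters in file names (for example & or <) are escaped automatically.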
