I want to bulk load a set of documents into a Cloudant DB. Cloudant provides a REST endpoint for this, _bulk_docs. But some of the documents I want to load contain attachments. If I were creating these documents individually, I could create the attachment along with the document by including it as an inline attachment. But it's not clear whether the _bulk_docs endpoint supports documents with inline attachments. The documentation does not say one way or the other, and my own attempts are so far unsuccessful.
Can someone please give an authoritative answer on whether the _bulk_docs endpoint of Cloudant supports docs with inline attachments?
The _bulk_docs endpoint does support docs with inline attachments.
I finally got this to work. My earlier attempts failed due to an unrelated problem. I was able to successfully bulk load 51 documents, 14 of which have inline attachments.
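For anyone who hits the same wall, here is a minimal sketch of a _bulk_docs request with an inline attachment. The account, database, document contents and the omitted authentication are placeholders; the _attachments structure (a content_type plus base64-encoded data) is the same inline-attachment format used when creating a single document.
var body = {
  docs: [
    {
      _id: 'doc-with-attachment',
      title: 'Example document',
      _attachments: {
        'hello.txt': {
          content_type: 'text/plain',
          data: 'aGVsbG8gd29ybGQ='  // base64 encoding of "hello world"
        }
      }
    }
    // ...the remaining docs, with or without attachments
  ]
};
var xhr = new XMLHttpRequest();
// placeholders: substitute your own account, database and credentials
xhr.open('POST', 'https://myaccount.cloudant.com/mydb/_bulk_docs');
xhr.setRequestHeader('Content-Type', 'application/json');
xhr.onload = function() { console.log(xhr.status, xhr.responseText); };
xhr.send(JSON.stringify(body));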
I'd like to update a document's metadata only, without re-uploading the document itself.
So I used the updateDocument API without the "file" parameter to update only the metadata, but unfortunately the enriched data is gone (the metadata itself is successfully updated!).
Is this the intended behavior of the updateDocument API?
If I want to update the metadata, do I need to upload the document itself as well?
https://watson-api-explorer.ng.bluemix.net/apis/discovery-v1#!/Documents/updateDocument
Unfortunately, Discovery does not support updating only the metadata.
As you speculate, you do need to re-upload the document itself, along with the new or updated metadata that you want.
This documentation says:
Update a document
Replace an existing document. Starts ingesting a document with optional metadata.
which I can see may not be completely clear. By saying "Replace an existing document", it is trying to convey that the existing document is always and completely replaced.
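For illustration, here is a rough sketch of the re-upload against the updateDocument endpoint in the linked API explorer. Everything in braces, the credentials, and the fileBlob variable are placeholders to fill in; this is an assumption-laden example rather than official sample code.
// fileBlob is assumed to be the original document, kept locally
var form = new FormData();
form.append('file', fileBlob, 'my-document.pdf');
form.append('metadata', JSON.stringify({ author: 'Jane Doe', tag: 'updated' }));

var xhr = new XMLHttpRequest();
xhr.open('POST', 'https://gateway.watsonplatform.net/discovery/api/v1' +
  '/environments/{environment_id}/collections/{collection_id}' +
  '/documents/{document_id}?version={version-date}');
xhr.setRequestHeader('Authorization', 'Basic ' + btoa('{username}:{password}'));
xhr.onload = function() { console.log(xhr.status, xhr.responseText); };
xhr.send(form);  // the browser sets the multipart/form-data boundary itself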
Please presume that I do not know anything about any of the things I will be mentioning because I really do not.
Most OpenData sites offer the option of exporting the presented data in, for example, .csv or .json format (Example). They also always have an API tab (Example API).
I presume using the API would mean that if the data is updated you would receive the change whereas exporting it as .csv would mean the content will not be changed anymore.
My question is: how does one use this API to display the same table one would get when exporting a .csv file?
Would you use a database to extract this information? What kind of database and how do you link the API to the database?
I presume using the API would mean that if the data is updated you would receive the change whereas exporting it as .csv would mean the content will not be changed anymore.
You are correct in the sense that, if you download the csv to your computer, that csv file won't be updated any more.
An API is something you would call - in this case, you can call the API, saying "Hey, do you have the latest data on xxx?", and you will be given back the latest information about what you have asked. This does not mean though, that this site will notify you when there's a new update - you will have to keep calling the API (every hour, every day etc) to see if there are any changes.
My question is: how does one use this API to display the same table one would get when exporting a .csv file?
You would:
Call the API from server code, or a cloud service
Let the server code or cloud service decipher (or "parse") the response
Use the deciphered response to create a table made out of HTML, or to place it into a database
Would you use a database to extract this information? What kind of database and how do you link the API to the database?
You wouldn't necessarily need a database to extract information, although a database would be nice to place the final data inside.
You would first need some sort of way to "call the REST API". There are many ways to do this - using Shell Script, using Python, using Excel VBA etc.
I understand this is hard to visualize, so here is an example of step 1, where you can retrieve information.
Try placing the URL below (taken from the site you showed us) in the address bar of your Chrome browser, and hit Enter:
http://opendata.brussels.be/api/records/1.0/search/?dataset=associations-clubs-sportifs
See how it gives back a lot of text with many brackets and commas? You've basically asked the site to give you some data, and this is the response they gave back (different browsers work differently - IE asks you to download the response as a .json file). You've basically called an API.
To see this data more cleanly, open the developer tools of your Chrome browser and enter the following JavaScript code:
var url = 'http://opendata.brussels.be/api/records/1.0/search/?dataset=associations-clubs-sportifs';
var xhr = new XMLHttpRequest();
xhr.open('GET', url);
xhr.setRequestHeader('X-Requested-With', 'XMLHttpRequest');
xhr.onload = function() {
if (xhr.status === 200) {
// success: parse the JSON response text and log the resulting object
console.log(JSON.parse(xhr.responseText));
} else {
// error: log the status code and raw response text (it may not be valid JSON)
console.log(xhr.status, xhr.responseText);
}
};
xhr.send();
When you hit enter, a response will come back, stating "Object". If you click through the arrows, you can see this is a cleaner version of the data we just saw - more human readable.
In this case, I used JavaScript to retrieve the data, but you can use whatever code you want. You could proceed to use JavaScript to decipher the data, manipulate it, and push it into a database.
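To sketch the "create a table made out of HTML" step from the list above, here is one way to continue in the same console. It assumes the records/fields shape of the response we just retrieved from opendata.brussels.be, and that every record has the same fields as the first one.
function buildTable(data) {
  var records = data.records;
  var columns = Object.keys(records[0].fields);
  var table = document.createElement('table');
  // header row: one cell per field name
  var header = table.insertRow();
  columns.forEach(function(col) {
    header.insertCell().textContent = col;
  });
  // body: one row per record, one cell per field
  records.forEach(function(record) {
    var row = table.insertRow();
    columns.forEach(function(col) {
      var value = record.fields[col];
      row.insertCell().textContent = value === undefined ? '' : String(value);
    });
  });
  document.body.appendChild(table);
}
You could call it from the onload handler above, e.g. buildTable(JSON.parse(xhr.responseText)).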
kintone is an online cloud database that you can customize to run JavaScript code and have it store the data in its database, so you'll have the data stored online. This is just one example of a database you can use.
There are other cloud services that allow you to connect API endpoints of different services with each other, like IFTTT and Zapier, but I'm not sure if they connect with open data.
The page you linked to shows that the API returns values as a JSON object. To access the data you can just send an appropriate HTTP request, and the response will be the requested data as JSON. You can send requests like that from your browser if you want to.
Most languages allow JSON objects to be manipulated programmatically if you need to do work on the data.
RESTful APIs follow a request/response publishing model. When you request data via an API endpoint, you receive response strings as JSON objects, CSV tables or XML.
The publisher, in this case opendata.brussels.be, would update their database on a regular basis and publish the results via an API endpoint.
If you want to download the table as a relational data table in a CSV file, you'd need to parse the JSON objects into relational tables. This can be tricky, since each JSON response string can vary in its paths.
There are several ways to do it. You can either write scripts to flatten the JSON objects (see the sketch below) or use a tool to parse and flatten the objects for you.
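As a sketch of the script option: the helper below recursively flattens one nested JSON record into dotted column names, so that each flattened record can become one CSV row. The example fields are made up.
// Recursively flatten a nested object into { 'path.to.leaf': value } pairs.
function flatten(obj, prefix, out) {
  prefix = prefix || '';
  out = out || {};
  Object.keys(obj).forEach(function(key) {
    var path = prefix ? prefix + '.' + key : key;
    var value = obj[key];
    if (value !== null && typeof value === 'object') {
      flatten(value, path, out);  // recurse into nested objects and arrays
    } else {
      out[path] = value;
    }
  });
  return out;
}

// Example with made-up fields:
// flatten({ fields: { name: 'Club', geo: { lat: 50.85 } } })
// -> { 'fields.name': 'Club', 'fields.geo.lat': 50.85 }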
I use a tool called Acho to turn API endpoints into CSV files. It can parse almost all API endpoints through their parameters, and it can even be configured for multiple requests, such as iterative and recursive requests.
I'm going to test my Solr analyzer, and I've found instructions on how to do it here: https://cwiki.apache.org/confluence/display/solr/Running+Your+Analyzer.
But I need to check several thousand words, so I'm going to do it programmatically, not manually. Does Solr have any REST API to run the analyzer?
Thank you!
The Solr Admin page is just a set of static HTML files that uses the REST API offered by Solr behind the scenes. If you watch the Network tab in your browser's developer tools while navigating it, you'll see all the endpoints it talks to.
After doing this on the Analysis page, you can see that it makes requests to three endpoints: one to fetch the HTML, one to get the schema (for the field list), and one to perform the actual analysis:
http://localhost:8983/solr/corename/analysis/field?wt=json&analysis.showmatch=true&analysis.fieldvalue=asd&analysis.query=asd&analysis.fieldname=content
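For several thousand words, you can script calls to that same endpoint. Here is a rough sketch: the core name corename and field name content are taken from the URL above, so substitute your own, and inspect one response in the Network tab to see the exact structure of the result.
var words = ['first', 'second', 'third'];  // your several thousand words
words.forEach(function(word) {
  var url = 'http://localhost:8983/solr/corename/analysis/field' +
    '?wt=json&analysis.fieldname=content' +
    '&analysis.fieldvalue=' + encodeURIComponent(word);
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url);
  xhr.onload = function() {
    // each response lists the tokens produced by every stage of the analyzer chain
    console.log(word, JSON.parse(xhr.responseText));
  };
  xhr.send();
});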
I'm still new to Liferay, and I'm using Liferay 6.2.
What I'm doing:
I am trying to add a document manually into my database using SQL insert statements.
I inserted into DLFileEntry, DLFileVersion and AssetEntry.
I also created a folder with a valid name and file.
The issue:
Upon entering the Documents and Media portlet, I can see the document name there, but when I click on checkout, it prompts an error saying that Documents and Media is temporarily unavailable. However, I am still able to download the valid document.
Am I doing something wrong? Personally, I feel that I am missing one more table in the database, but I'm not sure.
Thanks!
Yes, you're doing something wrong: You should never write to Liferay's database with SQL, as there might be more data required than what's directly visible to you. Obviously, you're running into exactly such an issue.
Liferay has an API which you can use locally, from within the same application server, or remotely as a JSON or SOAP service. You should exclusively use this for write access to the database.
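For example, from a logged-in browser session you can call the document-library JSON web service. The sketch below is a best guess at the DLApp service in 6.2, not verified code: check /api/jsonws in your portal for the exact service name and parameter list, and treat the ids and file content as placeholders.
// Best-guess sketch of the 6.2 JSON web service for adding a file;
// verify the signature under /api/jsonws before relying on it.
Liferay.Service(
  '/dlapp/add-file-entry',
  {
    repositoryId: 10180,        // placeholder: your site's group/repository id
    folderId: 0,                // 0 = root folder of the document library
    sourceFileName: 'report.pdf',
    mimeType: 'application/pdf',
    title: 'Report',
    description: '',
    changeLog: '',
    bytes: fileBytes            // placeholder: the file content
  },
  function(result) {
    console.log('created file entry', result);
  }
);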
Alternatively, you might consider WebDAV access to your document repository as the way to add more documents to the document library.
I am trying to add new data to Solandra according to the Solr schema, but I can't find any examples of this. My ultimate goal is to integrate Solandra with django-solr.
My understanding of inserting and updating in Solr, based on the original Solr and django-solr, is that you send the new data over HTTP to the appropriate path, for example:
http://localhost:8983/solandra/wikipedia/update/json
However, when I access that URL, the browser keeps giving me HTTP ERROR: 404.
Can you help me understand the steps to add new data and delete data in the Solandra environment?
I also had a look at the reuters-demo, but its data-insertion procedure happens inside reutersimporter.jar, and I can't see the source either. So please help me understand how the system works in terms of inserting and deleting data.
Thank you.
Since you are using the JSON update handler, this UpdateJSON page on the Solr Wiki has some good examples of inserting data using the JSON handler via curl. Also, the Indexing Data section of the Solr Tutorial shows how you can insert data using the post.jar file that is included with the Solr source.
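As a starting point, an insert through the JSON handler looks roughly like this. Treat it as a sketch: the path is the one from your question, the field names (id, title_t) are placeholders that must match your schema.xml, and it assumes the JSON update handler is actually registered in your solrconfig.xml, which the 404 suggests may be the real problem.
var xhr = new XMLHttpRequest();
// add or replace one document, committing immediately
xhr.open('POST', 'http://localhost:8983/solandra/wikipedia/update/json?commit=true');
xhr.setRequestHeader('Content-Type', 'application/json');
xhr.onload = function() { console.log(xhr.status, xhr.responseText); };
xhr.send(JSON.stringify([
  { id: 'doc1', title_t: 'Example title' }  // field names must match your schema
]));
In stock Solr, deletes go to the same handler with a body like {"delete": {"id": "doc1"}}.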
Are you creating the Solr schema.xml and solrconfig.xml and posting them to Solandra? If you add the JSON handler, then this should work. The reuters-demo uses SolrJ; django-solr should work as well.