How to load data into Solr from Kettle? - solr

Is there any way to load data into Solr from Kettle?
I have tried some plugins, but they do not seem to work.
I want to index data into Solr from Kettle, with a file, table, or CSV as the input and Solr as the output. How can this be done?
I have finished creating a custom plugin to load the data into Solr. The problem now is that the HttpSolrServer class I am using accepts a SolrInputDocument with only one field. It throws a "no such method" error if the document has more than one field.

Related

Determine last page in Solr using cursorMark

I am building an API that allows searching on a number of fields in Solr with pagination. It returns the cursorMark from the Solr results in the API response so that users can pass it in their next query.
I'm following the Google AIP for pagination, which requires returning a blank string when there are no more results left. However, Solr keeps returning the same cursor.
I'm looking for a simple solution that doesn't involve keeping state on the server or needing to make multiple requests. If I knew the position of the page in the total list of results I could determine if the page is the last, but I'm not able to access this.
Is there a simple way to achieve this?
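One possible approach, as a minimal sketch: Solr signals the end of the results by returning a nextCursorMark equal to the cursorMark that was sent with the request, and a page holding fewer documents than the requested rows cannot be followed by more results. The helper name next_page_token and its arguments are hypothetical, and the response is assumed to be Solr's JSON response parsed into a dict.

```python
def next_page_token(sent_cursor, solr_response, rows):
    """Return the token for the next page, or "" when no results are left.

    sent_cursor   -- the cursorMark sent with this request ("*" for the first page)
    solr_response -- Solr's JSON response parsed into a dict (assumed shape)
    rows          -- the rows value sent with this request
    """
    docs = solr_response.get("response", {}).get("docs", [])
    next_cursor = solr_response.get("nextCursorMark", "")

    # Solr repeats the cursor it was given once the result set is exhausted.
    if not next_cursor or next_cursor == sent_cursor:
        return ""
    # A short page cannot be followed by more results.
    if len(docs) < rows:
        return ""
    return next_cursor
```

This keeps no state on the server. The one case it cannot detect early is a total that is an exact multiple of rows: the final full page still yields a new cursor, and the following request returns an empty page with a blank token.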

How to check if a field has been indexed

I recently updated my Solr config and ran the import query again.
The data is present, but I still can't filter by a new field I just added.
How can I check that the new field has been indexed correctly?
Possibly from the Solr console.
If you are able to search on the field, then you can be sure that the field is indexed.
Another option is to go to the Solr analysis page and check whether the field is indexed.
The last option is to go to the Solr analysis page and check whether the schema file has been updated for the particular core or collection you are referring to.
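The "search on the field" check can be sketched as a query that counts documents having any value in the field. This is only an illustration: the function name, core name, and field name are hypothetical, and the base URL assumes a default local install.

```python
from urllib.parse import urlencode

def field_check_url(base_url, core, field):
    """Build a select URL that counts documents with any value in `field`."""
    params = {
        "q": f"{field}:[* TO *]",  # matches documents that have a value in the field
        "rows": 0,                 # only numFound is needed, not the documents
        "wt": "json",
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"

print(field_check_url("http://localhost:8983/solr", "your_core_name", "new_field"))
```

Opening the printed URL in a browser and finding a numFound greater than zero suggests the field is searchable; zero usually means the documents were imported before the field was added or the field is not marked as indexed.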

changing values in DB not reflecting in solr

I have a Solr search already implemented. It shows values in the UI and works fine. The problem is that if I change any data in the DB, the change is not reflected in the UI; it still shows the old values. What should I do?
Every time you change something in the DB, you need to re-import the data:
http://localhost:8983/solr/your_core_name/dataimport?command=full-import
Make sure that the handler is defined in solrconfig.xml.
I suppose you're using DataImportHandler:
http://wiki.apache.org/solr/DataImportHandler
You can do it via the GET method.
To reload a core (just in case): http://localhost:8983/solr/admin/cores?action=RELOAD&core=your_core_name
As suggested by "Oyeme" in the answer above, you should update the document in your core once the data is updated in the DB.
If you're using the SolrJ client, you can add or update documents in the core.
link to solrj documentation
You need to update the document through code whenever it is updated in the DB, and then commit your changes to the Solr core.
To commit your changes, use the SolrServer.commit() method; this commits your changes to the core so that they appear in search.
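If the re-import needs to be triggered from a script (for example after a batch of DB changes), the two URLs quoted above can be built like this. It is a minimal sketch: the function names are hypothetical, and the host, port, and core name are simply the ones used in the answers.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # host and port taken from the URLs above

def dataimport_url(core, command="full-import"):
    """URL that triggers a DataImportHandler run for the given core."""
    return f"{SOLR_BASE}/{core}/dataimport?{urlencode({'command': command})}"

def reload_core_url(core):
    """URL that asks the CoreAdmin handler to reload the given core."""
    return f"{SOLR_BASE}/admin/cores?{urlencode({'action': 'RELOAD', 'core': core})}"
```

Either URL can be fetched with a plain GET, for instance with urllib.request.urlopen(dataimport_url("your_core_name")) against a running Solr instance.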

How to update the data in solr

I am trying to update Solr documents with the data below:
[{"id":"6","status":{"set":"3"}},
{"id":"10","status":{"set":"3"}}]
It throws this error message:
"msg": "Expected: OBJECT_START but got ARRAY_START at [16]",
Please suggest the best way to update Solr 4.0 document data with a single URL.
Quoting from the Lucene discussion page: Reference Link
The admin page accepts only a single JSON document to be added, because it wraps it in tags like so...
{"add":{ "doc": YOUR_TEXT_AREA_INPUT, ....
You can use the curl utility or post.jar to add multiple documents at the same time.
Reference for updating a Solr document using curl: Updating a Solr Index with JSON
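As an alternative to curl, the same multi-document update can be sketched in Python. This assumes the update handler is reachable at /update and accepts a JSON array when the content type is application/json; depending on solrconfig.xml, a Solr 4.0 install may expose it at /update/json instead. The function name and core name are hypothetical.

```python
import json
import urllib.request

def build_update_request(core, docs, base_url="http://localhost:8983/solr"):
    """Build (but do not send) a POST that applies `docs` in one request."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{core}/update?commit=true",  # commit=true makes the change searchable
        data=body,
        headers={"Content-Type": "application/json"},
    )

docs = [
    {"id": "6", "status": {"set": "3"}},
    {"id": "10", "status": {"set": "3"}},
]
request = build_update_request("your_core_name", docs)
# urllib.request.urlopen(request) would send it to a running Solr instance.
```

Because the array is posted directly to the update handler, it is not wrapped in the single-document {"add": {"doc": ...}} envelope that the admin page applies.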

Retrieve Response data from another website using cakephp

I am trying to query courier companies' websites (e.g. Bluedart, FedEx) from a controller by passing the courier tracking number, and to fetch the status for that tracking number.
I am using $HttpSocket->get/post to request the page URL.
I am able to display the response body.
How can I fetch the data from the response?
Or is there any other way to achieve the same?
Please help me out.
How can I fetch the data from the response?
Parse the result, either using regular expressions or the DOMDocument class, and traverse it. See Parsing HTML in Cakephp as well.
Or is there any other way to achieve the same?
Use the APIs these companies usually offer.
