Java code for Solr geolocation indexing

I am a beginner with Solr and am using it to implement indexing and search.
I want to index geolocation data in Solr and run queries on it, so I went through some articles, including:
http://wiki.apache.org/solr/SpatialSearch
The field types described there are already present in my schema.xml.
My question is how to write Java code that indexes a geolocation into dynamic geolocation fields. Is there any sample Java code for this? I looked but didn't find any, so any help would be appreciated.
I understand that when indexing I would need to write something like:
document.addField(myDynLocFld + "_p", val);
With this approach, what should val be? An instance of a location object with both the lat and lng values embedded in it? How should I handle this, or is there a different approach in Solr's Java API?
Thanks in advance.

Check this code sample (written against the Lucene 4.x API):
// Store the index in memory:
// Directory directory = new RAMDirectory();
// To store the index on disk (FSDirectory.open takes a File, not a String):
Directory directory = FSDirectory.open(new File("/tmp/testindex"));
// The analyzer was not defined in the original snippet; StandardAnalyzer is one common choice.
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
IndexWriter iwriter = new IndexWriter(directory, config);
Document doc = new Document();
String text = "This is the text to be indexed.";
doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
iwriter.addDocument(doc);
iwriter.close();
For more details, check the Lucene API documentation.
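On the original question of what val should be: Solr's spatial field types described on the SpatialSearch wiki page linked in the question take a point as a single "lat,lon" string, so no dedicated location object is needed. Below is a minimal sketch in plain Java. The class name GeoPoint, the helper toSolrValue, and the field name store_p are hypothetical, and the commented-out addField call assumes a SolrJ SolrInputDocument named document.

```java
public class GeoPoint {

    /** Formats a latitude/longitude pair as a "lat,lon" string. */
    public static String toSolrValue(double lat, double lon) {
        // The negated range checks also reject NaN.
        if (!(lat >= -90.0 && lat <= 90.0)) {
            throw new IllegalArgumentException("latitude out of range: " + lat);
        }
        if (!(lon >= -180.0 && lon <= 180.0)) {
            throw new IllegalArgumentException("longitude out of range: " + lon);
        }
        // String concatenation of a double always uses '.' as the decimal separator.
        return lat + "," + lon;
    }

    public static void main(String[] args) {
        String val = toSolrValue(37.7752, -122.4232);
        System.out.println(val);
        // With SolrJ (assumed to be on the classpath), the value would be added like this:
        // document.addField("store_p", val);
    }
}
```

Validating the ranges on the client side is a design choice here, so that a bad coordinate fails before the document is sent.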

Related

How to sort by geodist() function in Spring Solr Data?

Ciao,
I'm unable to get the following solr query working with Spring Solr Data.
q=name:rsa&sfield=coordinates&pt=-8.506854,115.262478&sort=geodist()%20asc
The query works in the Solr admin console, but not with Spring Solr Data. I didn't find any examples of geodist sorting in the documentation, so I created a custom repository with the following method:
public Page<StructureDocument> findStructuresByNameAndCoordinates(String value, Point point, Pageable page) {
StringBuilder queryBuilder = new StringBuilder();
queryBuilder.append("name:*").append(value).append("*");
queryBuilder.append("&sfield=coordinates");
queryBuilder.append("&pt=").append(point.getX()).append(",").append(point.getY());
queryBuilder.append("&sort=geodist() asc");
Query query = new SimpleQuery(new SimpleStringCriteria(queryBuilder.toString())).setPageRequest(page);
But it doesn't work because of a syntax error (it seems that SolrJ doesn't like the characters "()"). This is the error:
; nested exception is org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr: org.apache.solr.search.SyntaxError: Cannot parse 'name:*francesco*&sfield=coordinates&pt=-8.506854,115.262478&sort=geodist() asc': Encountered " ")" ") "" at line 1, column 71.
How can I sort by the geodist function with Spring Solr Data?
Thank you very much.
I found a solution in the question "SOLR Spring Data geospatial query":
@Override
public Page<StructureDocument> findStructuresByNameAndCoordinates(String value, Point point, Pageable page) {
SimpleQuery query = new SimpleQuery().setPageRequest(page);
query.addProjectionOnField("*");
query.addProjectionOnField("score");
StringBuffer buf = new StringBuffer("");
buf.append("{!geofilt pt=");
buf.append(point.getX());
buf.append(",");
buf.append(point.getY());
buf.append(" sfield=coordinates d=");
buf.append(2000.0);
buf.append(" score=distance}");
query.addCriteria(new SimpleStringCriteria(buf.toString()));
FilterQuery searchTerm = new SimpleFilterQuery();
searchTerm.addCriteria(new Criteria("name").contains(value));
query.addFilterQuery(searchTerm);
query.addSort(new Sort(Sort.Direction.ASC, "score"));
I also added a "score" attribute to my Solr document class, which holds the distance value calculated for each result:
@Score
private Double score;
This solution works, but I know there is a better approach that does not currently work for me; see the question "How to return distance and score in spatial search using spring-data-solr". I'll continue to investigate it.
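The syntax error in the first attempt comes from where the parameters are placed. In the working URL, sfield, pt, and sort are separate request parameters, whereas the string builder puts them all inside q, where the query parser tries to read them as query text. Below is a minimal plain-Java sketch of keeping them separate and URL-encoding each value. GeodistParams, build, and toQueryString are hypothetical names; only the parameter names and values come from the query above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class GeodistParams {

    /** Builds the request parameters, one map entry per Solr parameter. */
    public static Map<String, String> build(String name, double lat, double lon) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "name:" + name);      // only the query text belongs in q
        params.put("sfield", "coordinates");  // spatial field used by geodist()
        params.put("pt", lat + "," + lon);    // reference point as "lat,lon"
        params.put("sort", "geodist() asc");  // sorting is its own parameter
        return params;
    }

    /** Joins the parameters into a URL query string, encoding keys and values. */
    public static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 should always be available", ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toQueryString(build("rsa", -8.506854, 115.262478)));
    }
}
```

The same separation applies inside Spring Solr Data: the criteria string should carry only the query text, and the spatial and sort settings have to reach Solr by some other route than being appended to q.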

How to get word count of SOLR document?

I have the binary content of a pdf file, and I want to upload it to SOLR and index its content:
ContentStreamUpdateRequest up = new ContentStreamUpdateRequest('/update/extract')
up.setParam("literal.id", map.id)
def tmpFile = null
tmpFile = File.createTempFile(map.id, ".tmp")
tmpFile.append(binary)
up.addFile(tmpFile, "application/pdf") // the second argument is the content type, not a file extension
// Do the SOLR stuff here
def solr = getSolrServer()
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true)
def response = solr.request(up)
if (tmpFile) {
tmpFile.delete()
}
return response
When I query SOLR, I can retrieve the SOLR document. How can I get the actual content of the file? Basically, I need to find the word count of the document I've uploaded, so I was planning to call size() on the string returned (if that's even possible).
I'm very new to SOLR so am probably on the wrong track... any assistance greatly appreciated :)
I am assuming you want to count the number of words in the PDF that you have indexed. Make sure that:
1. The entire extracted content of the PDF is indexed into one field.
2. This field has at least a whitespace tokenizer enabled, so that it splits sentences into words on whitespace.
Once you do this, you can find the number of words using either facets or the Term Vector Component. The Stack Overflow answer below might be helpful:
https://stackoverflow.com/a/26933126/689625
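If the extracted text is also stored and returned with the document (an assumption; the answer above counts on the Solr side), a client-side count that mimics whitespace tokenization can be sketched in plain Java. WordCount and countWords are hypothetical names, and the result may differ from Solr's own count if the field's analyzer does more than split on whitespace.

```java
public class WordCount {

    /** Counts whitespace-separated tokens; returns 0 for null or blank input. */
    public static int countWords(String text) {
        if (text == null) {
            return 0;
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        // One or more whitespace characters separate two words.
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        System.out.println(countWords("This is the text to be indexed."));
    }
}
```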

Parsing Solr Results - javabin format

I am trying to integrate Solr with Java using SolrJ. The results retrieved are in the following format:
{
numFound=3,
start=0,
docs=[
SolrDocument{
id=IW-02,
name=iPod&iPodMiniUSB2.0Cable,
manu=Belkin,
manu_id_s=belkin,
cat=[
electronics,
connector
],
features=[
carpoweradapterforiPod,
white
],
weight=2.0,
price=11.5,
price_c=11.50,USD,
popularity=1,
inStock=false,
store=37.7752,-122.4232,
manufacturedate_dt=Tue Feb 14 18:55:59 EST 2006,
_version_=1452625905160552448
}
Now, this is the javabin format. How do I extract results from it? I have heard that SolrJ converts the results to objects by itself, but I can't figure out how.
Thanks in advance for the help.
Let solrReply be the response object. Then you can access different parts of the result using the appropriate keys. Say you want docs, you can do:
docs = solrReply['docs']
If you want the first result, you could do:
first = solrReply['docs'][0]
Within a result, you can access each field in the same way.
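In SolrJ itself, the documents come from QueryResponse.getResults(), which returns a SolrDocumentList (a list of SolrDocument), and each field is read with SolrDocument.getFieldValue(name). The sketch below mirrors that access pattern with plain Java collections so that it runs without a Solr server. The class ResultAccess and its sample values (copied from the output above) are illustrative stand-ins, not SolrJ types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ResultAccess {

    /** Builds a stand-in for the "docs" list: one map of field name to value per document. */
    public static List<Map<String, Object>> sampleDocs() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "IW-02");
        doc.put("manu", "Belkin");
        doc.put("cat", Arrays.asList("electronics", "connector")); // multi-valued field
        doc.put("price", 11.5);
        doc.put("inStock", false);

        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc);
        return docs;
    }

    public static void main(String[] args) {
        // With SolrJ this list would come from response.getResults().
        List<Map<String, Object>> docs = sampleDocs();
        for (Map<String, Object> doc : docs) {
            // With SolrJ these reads would be doc.getFieldValue("id") and so on.
            Object id = doc.get("id");
            Object price = doc.get("price");
            List<?> categories = (List<?>) doc.get("cat");
            System.out.println(id + " costs " + price + " in " + categories);
        }
    }
}
```

Field values come back as Object, so multi-valued fields such as cat need a cast to a collection before use.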

Indexing PDF documents with additional search fields using SolrNet?

I found this article useful when indexing documents, however, how can I attach additional fields so I can pass in, say, the ID of the document in our database for use in displaying the search results? I thought by using the Fields (Of the ExtractParameters class) property I could index additional data with the document, but that doesn't seem to work or that is not its function.
Example code:
var solr = ObjectLocator.Instance.Resolve<ISolrOperations<IndexDocument>>();
var guid = Guid.NewGuid().ToString();
using (var fileStream = System.IO.File.OpenRead(Server.MapPath("~/files/") + "greenroof.pdf"))
{
var response =
solr.Extract(
new ExtractParameters(fileStream, "greenRoof1234")
{
ExtractFormat = ExtractFormat.Text,
ExtractOnly = false,
Fields = new[] { new ExtractField("field1", "value1"), new ExtractField("field2", "value2") }
});
}
@aitchnyu is correct: passing the values via the literal.field=value method is the right way to do this.
However, according to this post on ExtractingRequestHandler support in the SolrNet Google Group, there was a bug with the ExtractParameters.Fields not working properly. This was fixed in the 0.4.0.X versions of SolrNet. Please make sure you are using one of the latest versions of SolrNet. You can obtain that by one of the following means:
Project Site Downloads
NuGet PreRelease Package
Also that discussion has some good examples of using the ExtractingRequestHandler in SolrNet as well as a workaround for adding the additional field values if you cannot upgrade to a newer version of SolrNet.
This is sufficient: http://wiki.apache.org/solr/ExtractingRequestHandler#Literals .
In general use a literal.field=value while uploading.
It turned out not to be an issue with SolrNet, but with my knowledge of Solr in general. I needed to specify the fields in my schema. After I added the fields to my schema, they were visible in my Solr query.
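The literal.field=value convention can be sketched independently of the client library: each extra field becomes a request parameter whose name is the field name prefixed with "literal.". The plain-Java sketch below builds such a parameter map. LiteralParams and toLiteralParams are hypothetical names, and the field names and values are the ones from the example code above. As noted in the last answer, the fields still have to be defined in the schema to show up in query results.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LiteralParams {

    /** Prefixes every field name with "literal." so it can be sent with the extract request. */
    public static Map<String, String> toLiteralParams(Map<String, String> fields) {
        Map<String, String> params = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            params.put("literal." + e.getKey(), e.getValue());
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("field1", "value1");
        fields.put("field2", "value2");
        // Prints each parameter as name=value, one per line.
        for (Map.Entry<String, String> e : toLiteralParams(fields).entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```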

Create Index for Full Text Search in Google App Engine

I'm reading the documentation on the Full Text Search API (Java) in Google App Engine at https://developers.google.com/appengine/docs/java/search/overview. It has an example of getting an index:
public Index getIndex() {
IndexSpec indexSpec = IndexSpec.newBuilder()
.setName("myindex")
.setConsistency(Consistency.PER_DOCUMENT)
.build();
return SearchServiceFactory.getSearchService().getIndex(indexSpec);
}
But what about creating an index? How do I create one?
Thanks
You just did. You just created one.
public class IndexSpec
Represents information about an index. This class is used to fully specify the index you want to retrieve from the SearchService. To build an instance use the newBuilder() method and set all required parameters, plus optional values different than the defaults.
https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/search/IndexSpec
You can confirm this by looking at the SearchService
SearchService is also responsible for creating new indexes. For example:
SearchService searchService = SearchServiceFactory.getSearchService();
index = searchService.getIndex(IndexSpec.newBuilder().setName("myindex"));
https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/search/SearchService
Anyway, it seems your code will create a new index if it doesn't exist. That's what the docs suggest:
// Get the index. If not yet created, create it.
Index index = searchService.getIndex(
IndexSpec.newBuilder()
.setIndexName("indexName")
.setConsistency(Consistency.PER_DOCUMENT));
https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/search/Index
Now, what happens if you run the code again and change the Consistency? Do you have the same index with a different consistency? Is the index overwritten? I don't know. I would use the SearchService to look up existing indexes instead of using code that might create them, to avoid inadvertently changing the specs of an index I only meant to retrieve.
An Index is implicitly created when a document is written. Consistency is an attribute of the index, i.e. you can't have two indexes of the same name with different consistencies.
