Parsing Solr Results - javabin format - solr

I am trying to integrate Solr with Java using SolrJ. The results retrieved are in the following format:
{
  numFound=3,
  start=0,
  docs=[
    SolrDocument{
      id=IW-02,
      name=iPod & iPod Mini USB 2.0 Cable,
      manu=Belkin,
      manu_id_s=belkin,
      cat=[electronics, connector],
      features=[car power adapter for iPod, white],
      weight=2.0,
      price=11.5,
      price_c=11.50,USD,
      popularity=1,
      inStock=false,
      store=37.7752,-122.4232,
      manufacturedate_dt=Tue Feb 14 18:55:59 EST 2006,
      _version_=1452625905160552448
    }
  ]
}
Now this is the javabin format. How do I extract the results from it? I have heard that SolrJ converts the results to objects by itself, but I can't figure out how.
Thanks for the help in advance.

Let solrReply be the response object. Then you can access the different parts of the result using the appropriate keys. Say you want the docs; you can do:
docs = solrReply['docs']
if you want the first result you could do:
first = solrReply['docs'][0]
Within a result you can access each field in the same way.
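If you are using SolrJ specifically, you do not need to parse the javabin payload yourself: the QueryResponse wrapper decodes it into SolrDocument objects for you. A minimal sketch against a recent SolrJ (6+); the URL and core name are assumptions, adjust them to your setup:
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/techproducts").build();
QueryResponse response = solr.query(new SolrQuery("*:*"));

// SolrJ has already decoded the javabin response into SolrDocument objects
SolrDocumentList docs = response.getResults();
System.out.println("numFound=" + docs.getNumFound());

for (SolrDocument doc : docs) {
    // getFieldValue returns Object; cast to the concrete type as needed
    System.out.println(doc.getFieldValue("id") + " costs " + doc.getFieldValue("price"));
}
solr.close();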

Related

How to remove escape character from solr indexed field?

I am indexing JSON data into a Solr field, e.g.
{"employees":[
{"firstName":"John", "lastName":"Doe"},
{"firstName":"Anna", "lastName":"Smith"},
{"firstName":"Peter", "lastName":"Jones"}
]}
But the JSON is getting indexed with escaped characters, so now I get the JSON back as
"{\"employees\":[\n {\"firstName\":\"John\", \"lastName\":\"Doe\"},\n {\"firstName\":\"Anna\", \"lastName\":\"Smith\"},\n {\"firstName\":\"Peter\", \"lastName\":\"Jones\"}\n]}"
Is there any way to index the JSON without escaping it, or to de-escape the result when displaying it, handled solely on the Solr end?
This is perfectly fine storage of JSON data in a Solr text field.
If you view it through the admin UI, you will see the JSON in escaped form, but if you query the field and then decode the JSON, you will get the correct object back in the language you are using.
Python example:
import json

my_json_field = json_string  # the raw field value, read from Solr via API calls or a module like pysolr
my_obj = json.loads(my_json_field)
In the end the solution was very simple, using Transforming Result Documents (doc transformers), e.g.
fl=my_field_with_escaped_json:[json]
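If you query through SolrJ, the same transformer can be requested via the field list; a minimal sketch (only my_field_with_escaped_json comes from the example above, the rest is assumed):
import org.apache.solr.client.solrj.SolrQuery;

SolrQuery query = new SolrQuery("*:*");
// The [json] doc transformer makes Solr inline the stored string as raw JSON
// in JSON responses instead of returning it as one escaped string value
query.setFields("id", "my_field_with_escaped_json:[json]");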
Thanks everyone

Getting rates from an array - json

I am trying to get the rates from this website.
So I connect with website = Faraday.get('https://bitpay.com/api/rates'), check that website.status == 200, and then try to parse the response.
A segment of the response I get is:
#<Faraday::Response:0x007fcf1ce25688
#env=
#<struct Faraday::Env
method=:get,
body=
"[{\"code\":\"BTC\",\"name\":\"Bitcoin\",\"rate\":1}, {\"code\":\"USD\",\"name\":\"US Dollar\",\"rate\":586.66},{\"code\":\"EUR\",\"name\":\"Eurozone Euro\",\"rate\":528.991322},{\"code\":\"GBP\",\"name\":\"Pound Sterling\",\"rate\":449.441986},{\"code\":\"JPY\",\"name\":\"Japanese Yen\",\"rate\":59907.95922},{\"code\":\"CAD\",\"name\"
When I call website.body I get a String of all these values found on that website. I want to parse them (as JSON?) so that I can get each rate as a float.
I tried something like JSON.parse(website.body)["GBP"]["rate"].to_f, but yet again it does not work.
The return I get is TypeError: no implicit conversion of String into Integer
I was handling a similar (but not the same) format from a different rates website this way. Do I need to change the format first, or is there a different way around it?
You're trying to access the parsed JSON with the key "GBP", but you have an array. It's as if you did
a = [1,2,3,4,5]
a['foo']
Try out
require 'json'

currencies = JSON.parse(website.body)
currencies.each { |currency| puts currency['rate'] }
and adapt it as you need; for a single currency you can do, for example:
gbp_rate = currencies.find { |c| c['code'] == 'GBP' }['rate'].to_f

Parsing data from Kafka in Apache Flink

I am just getting started with Apache Flink (Scala API), and my issue is the following:
I am trying to stream data from Kafka into Apache Flink based on one example from the Flink site:
val stream =
env.addSource(new FlinkKafkaConsumer09("testing", new SimpleStringSchema() , properties))
Everything works correctly; the stream.print() statement displays the following on the screen:
2018-05-16 10:22:44 AM|1|11|-71.16|40.27
I would like to use a case class to load the data. I've tried using
flatMap(p => p.split("|"))
but it's only splitting the data one character at a time.
Basically, the expected result is to be able to populate the 5 fields of the case class as follows:
field(0)=2018-05-16 10:22:44 AM
field(1)=1
field(2)=11
field(3)=-71.16
field(4)=40.27
but it's now doing:
field(0) = 2
field(1) = 0
field(2) = 1
field(3) = 8
etc...
Any advice would be greatly appreciated.
Thank you in advance
Frank
The problem is the usage of String.split. If you call it with a String, then the method expects it to be a regular expression. Thus, p.split("\\|") would be the correct regular expression for your input data. Alternatively, you can also call the split variant where you specify the separating character p.split('|'). Both solutions should give you the desired result.
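To illustrate the difference (shown in plain Java, since Scala strings are java.lang.String and split(String) has the same regex semantics there):
String line = "2018-05-16 10:22:44 AM|1|11|-71.16|40.27";

// "|" is regex alternation with two empty branches, so it matches at every
// position and the string is split into one-character pieces
String[] chars = line.split("|");

// escaping the pipe makes it a literal separator
String[] fields = line.split("\\|");
System.out.println(fields.length);  // 5
System.out.println(fields[0]);      // 2018-05-16 10:22:44 AM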

java code for solr geolocation indexing

I am using Solr for the indexing and searching feature of my application, and I am a beginner with Solr.
I want to index geolocations into the Solr index and also make queries on them, so I went through some articles:
http://wiki.apache.org/solr/SpatialSearch
And exactly those schema types are present in my schema.xml.
Now my question: I want to write Java code that indexes the geolocation into dynamic geolocation fields. How do I write it, and is there any sample Java code for indexing it? I looked for some but didn't find any, so I would appreciate any help with it.
I also understand that when indexing we would need to write something like:
document.addField(myDynLocFld + "_p", val);
If using this approach, what should val be? An instance of a location object with both lat and lng values embedded in it? How should I handle this, or is there a different approach for this in Solr's Java API?
Thanks in advance.
Check this sample of code:
import java.io.File;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.store.*;
import org.apache.lucene.util.Version;

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);

// Store the index in memory:
// Directory directory = new RAMDirectory();
// To store an index on disk:
Directory directory = FSDirectory.open(new File("/tmp/testindex"));
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
IndexWriter iwriter = new IndexWriter(directory, config);
Document doc = new Document();
String text = "This is the text to be indexed.";
doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
iwriter.addDocument(doc);
iwriter.close();
For more details check Lucene APIs.
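Note that the snippet above uses the raw Lucene API rather than Solr. For the original question about dynamic *_p fields: Solr's LatLonType takes the value as a plain "lat,lon" string, so val can simply be that string. A minimal SolrJ sketch (the client URL, core and field names are assumptions):
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "store-1");
// LatLonType dynamic fields (*_p) accept "lat,lon" as a string
doc.addField("store_p", "37.7752,-122.4232");

solr.add(doc);
solr.commit();
solr.close();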

Indexing PDF documents with addtional search fields using SolrNet?

I found this article useful when indexing documents. However, how can I attach additional fields so I can pass in, say, the ID of the document in our database, for use in displaying the search results? I thought that by using the Fields property (of the ExtractParameters class) I could index additional data with the document, but that doesn't seem to work, or that is not its function.
Example code:
var solr = ObjectLocator.Instance.Resolve<ISolrOperations<IndexDocument>>();
var guid = Guid.NewGuid().ToString();
using (var fileStream = System.IO.File.OpenRead(Server.MapPath("~/files/") + "greenroof.pdf"))
{
var response =
solr.Extract(
new ExtractParameters(fileStream, "greenRoof1234")
{
ExtractFormat = ExtractFormat.Text,
ExtractOnly = false,
Fields = new[] { new ExtractField("field1", "value1"), new ExtractField("field2", "value2") }
});
}
@aitchnyu is correct: passing the values via the literal.field=value method is the correct way to do this.
However, according to this post on ExtractingRequestHandler support in the SolrNet Google Group, there was a bug with ExtractParameters.Fields not working properly. This was fixed in the 0.4.0.x versions of SolrNet, so please make sure you are using one of the latest versions. You can obtain one by either of the following means:
Project Site Downloads
NuGet PreRelease Package
Also that discussion has some good examples of using the ExtractingRequestHandler in SolrNet as well as a workaround for adding the additional field values if you cannot upgrade to a newer version of SolrNet.
This is sufficient: http://wiki.apache.org/solr/ExtractingRequestHandler#Literals
In general, pass a literal.field=value parameter while uploading, e.g. literal.dbid=1234 (the field name is whatever your schema uses for the database ID).
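For comparison, this is what the same literal.* mechanism looks like from SolrJ, the Java client; the file name is taken from the question, while the dbid field is purely illustrative:
import java.io.File;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();

ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
req.addFile(new File("greenroof.pdf"), "application/pdf");
// literal.* parameters attach extra field values to the extracted document
req.setParam("literal.id", "greenRoof1234");
req.setParam("literal.dbid", "1234");  // hypothetical database-ID field
req.setParam("commit", "true");

req.process(solr);
solr.close();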
It turned out not to be an issue with SolrNet but with my knowledge of Solr in general: I needed to specify the fields in my schema. After I added the fields to my schema, they were visible in my Solr query.
