I am using Solr to search institutions. My Solr database has around 400k documents, each of which has multiple fields such as "name", "id", "city", and so on.
A document in my DB looks like this:
"docs":
{
"id": "91348",
"p_code": "71637",
"name": "University of Toronto - Mississauga",
"ext_name": "",
"city": "Mississauga",
"country": "CA",
"state": "ON",
"type": "academic/campus",
"alt_name": "",
"ext_city": "",
"zip": "L5L 1C6",
"alt_ext_city": "",
}
I run a query like name:(university of toronto). The top two matches are:
"docs":
{
"id": "91348",
"p_code": "71637",
"name": "University of Toronto - Mississauga",
"ext_name": "",
"city": "Mississauga",
"country": "CA",
"state": "ON",
"type": "academic/campus",
"alt_name": "",
"ext_city": "",
"zip": "L5L 1C6",
"alt_ext_city": "",
"_version_": 1473710223400108000,
"score": 1.499069
},
{
"id": "10624",
"p_code": "7938",
"name": "University of Toronto",
"ext_name": "",
"city": "Toronto",
"country": "CA",
"state": "ON",
"type": "academic",
"alt_name": "Saint George Downtown Campus",
"ext_city": "",
"zip": "M5S 1A1",
"alt_ext_city": "",
"_version_": 1473710220148473900,
"score": 1.4967358
}
I am really surprised to see that "University of Toronto - Mississauga" returns a higher score than "University of Toronto". Intuitively, the document whose name field contains "University of Toronto - Mississauga" should get a lower score, since that field is longer than the other one.
I was also very surprised to see that Solr gives different queryNorm values: (0.03198291 = queryNorm) for the top document and (0.03203078 = queryNorm) for the second-ranked document. I presumed that the query norm should be exactly the same for all documents, as it is only a function of the query.
I am not sure whether I have misunderstood how Solr works, or whether something is wrong in my indexing or configuration. Has anybody faced the same problem?
Make sure that omitNorms is set to false for that field and that your collection is using the latest version of the schema. Then re-index all of your documents for the change to the field to take effect.
I've found that some schema modifications are best handled with a complete wipe of the index before indexing new content. I am not sure, but I believe this may be one of them. For most changes you can simply re-index all of your content and overwrite the old documents.
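For reference, a field definition with norms enabled might look like the following in your managed-schema (the field and type names here are assumptions based on the documents shown); remember to re-index after changing it:
<field name="name" type="text_general" indexed="true" stored="true" omitNorms="false"/>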
Can someone give me a tip on how to see the BACnet scrape data in the VOLTTRON log?
Would this have anything to do with the log level? Maybe I just can't see any data because of an incorrect log level? Any tips on setting an appropriate log level would be greatly appreciated.
vctl config get platform.driver devices/201201
returns this:
{
"driver_config": {
"device_address": "12345:2",
"device_id": 201201
},
"driver_type": "bacnet",
"interval": 60,
"registry_config": "config://registry_configs/201201.csv"
}
Running:
vctl config get platform.driver registry_configs/201201.csv
This looks good; I can see all of the device points that were discovered:
[
{
"Reference Point Name": "Oat",
"Volttron Point Name": "Oat",
"Units": "degreesFahrenheit",
"Unit Details": "",
"BACnet Object Type": "analogValue",
"Property": "presentValue",
"Writable": "FALSE",
"Index": "301",
"Write Priority": "",
"Notes": ""
},
{
"Reference Point Name": "RmTmpSpt",
"Volttron Point Name": "RmTmpSpt",
"Units": "degreesFahrenheit",
"Unit Details": "",
"BACnet Object Type": "analogValue",
"Property": "presentValue",
"Writable": "FALSE",
"Index": "302",
"Write Priority": "",
"Notes": ""
},
{
"Reference Point Name": "RmTmp",
"Volttron Point Name": "RmTmp",
"Units": "degreesFahrenheit",
"Unit Details": "",
"BACnet Object Type": "analogValue",
"Property": "presentValue",
"Writable": "FALSE",
"Index": "300",
"Write Priority": "",
"Notes": ""
}
]
Running vctl status and even restarting agents a and 4 doesn't seem to do anything:
UUID AGENT IDENTITY TAG STATUS HEALTH
a bacnet_proxyagent-0.5 platform.bacnet_proxy proxy running [73753] GOOD
4 platform_driveragent-4.0 platform.driver platform_driver running [73754] GOOD
6 simplewebagent-0.1 webagent simpleWebAgent
Also, BACpypes.ini has the proper address set for the IP address of the computer running VOLTTRON.
Any tips appreciated.
Generally you won't want all of the data going through the message bus written to the log, as that will make your log huge and fill up the system.
However, if you install and start a listener agent, you will get that behaviour: a listener agent writes everything that goes through the message bus to the log. It is located in examples/ListenerAgent in the volttron repository.
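If it helps, installing and starting it from a checkout of the repository might look like the following (a sketch; the exact flags vary by VOLTTRON version, and the log file location depends on how you started the platform):
vctl install examples/ListenerAgent --tag listener --start
tail -f volttron.log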
I am completely new to JSON and Java in general.
I have a task with a similar block of code:
{
"name": "Chew Barka",
"breed": "Bichon",
"age": "2 years",
"weight": 8,
"bio": "The park, The pool or the Playground - I love to go anywhere!",
"filename": ""
},
And I would like to have the contents of a folder, for example "C:/Temp", stored in "filename", so that when I read "filename" I get the "C:/Temp" contents.
I am working on a simple mini app using NoSQL. I currently have the design below, and I am seeking practical advice and feedback on the database schema design.
Here is the basic overview of the website: a user can log in (via Google or GitHub) and create a course review; other users can rate the course, leave comments, and favorite it.
Here is the schema design for now:
/user
"userid": "12jdfbvsidf3123"
"username": "admin"
"FirstName": "Jobs"
"LastName": "Tim"
"email": "123123#123.edu"
"rated"
"course1": 2
"course2" 3
"favorites":
"course1_id": 2
"course2_id" 3
/courses
"courseid": "sfgsdfwthrw34523"
"userid": "12jdfbvsidf3123"
"title": "the Art of Copy and Paste from StackOverflow"
"instructor": "you-know-nothing"
"rating": 0
"introduction": "bla bla bla"
"timestamp": "2017-08-09"
/comments
"commentid": "242334h5kjh2j4"
"userid": "12jdfbvsidf3123"
"courseid": "sfgsdfwthrw34523"
"content": "Great course, tell me all about how to copy and paste without thinking"
"timestamp": "2019-09-07"
I am using gettext to translate my AngularJS site - it all works fine where I have HTML elements to which I can add the 'translate' attribute.
However, I also have quite a large and complex JSON file that needs translating, which includes arrays and objects.
Is there any way to include this file in the extraction that gettext does into the PO file? Or would I need to rethink the whole idea of using a JSON file to segment the customer flow?
I have included an initial extract of the JSON file below:
{
"version": "1.1",
"name": "MVP",
"description": "Initial customer segmenting flow",
"enabled": true,
"funnel": [
{
"text": "I am...",
"image": "",
"help": "",
"options": [
{
"text": "Placing an order",
"image": "image1.png",
"next": 2
},
{
"text": "E-mailing customer service",
"image": "image2.png",
"next": 2
},
Thanks
James
Process the JSON file yourself with a script at build time and dump all translatable messages into a dummy source file with the syntax expected by your string extractor, probably something like this:
<translate>I am ...</translate>
<translate>Placing an order</translate>
<translate>E-mailing customer service</translate>
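A minimal sketch of such a build step, written here in Python, assuming the file is named funnel.json and that every non-empty "text" value should be translated (both the filename and that rule are assumptions):
import html
import json

# Load the segmenting-flow definition (the filename is an assumption).
with open("funnel.json", encoding="utf-8") as f:
    flow = json.load(f)

def collect_texts(node):
    # Recursively yield every non-empty "text" value in the structure.
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "text" and isinstance(value, str) and value:
                yield value
            else:
                yield from collect_texts(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_texts(item)

# Dump a dummy source file that the gettext string extractor can scan.
with open("funnel-strings.html", "w", encoding="utf-8") as out:
    for text in collect_texts(flow):
        out.write("<translate>%s</translate>\n" % html.escape(text))
Run this before the extraction step so the existing extractor picks the strings up along with the rest of your templates.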
I have the data ready to index now; it is a JSON file:
{"122": "20180320-08:08:35.038", "49": "VIPER", "382": "0", "151": "1.0", "9": "653", "10071": "20180320-08:08:35.088", "15": "JPY", "56": "XSVC", "54": "1", "10202": "APMKTMAKING", "10537": "XOSE", "10217": "Y", "48": "179492540", "201": "1", "40": "2", "8": "FIX.4.4", "167": "OPT", "421": "JPN", "10292": "115", "10184": "337912000000002", "456": "101", "11210": "337912000000002", "1133": "G", "10515": "178", "10": "200", "11032": "-1", "10436": "20180320-08:08:35.038", "10518": "178", "11": "337912000000002", "75": "20180320", "10005": "178", "10104": "Y", "35": "RIO", "10208": "APAC.VIPER.OOE", "59": "0", "60": "20180320-08:08:35.088", "528": "P", "581": "13", "1": "TEST", "202": "25375.0", "455": "179492540", "55": "JNI253D8.OS", "100": "XOSE", "52": "20180320-08:08:35.088", "10241": "viperooe", "150": "A", "10039": "viperooe", "39": "A", "10438": "RIO.4.5", "38": "1", "37": "337912000000002", "372": "D", "660": "102", "44": "2.0", "10066": "20180320-08:08:35.038", "29": "4", "50": "JPNIK01", "22": "101"}
You can inspect the JSON here: https://jsonformatter.org/
I need to create an index and enable searching on tags 37 (order_id), 75 (trade_date), and 10242 (where available; this sample message doesn't have it).
My understanding is that I need to create the managed-schema file, so I added two fields as below:
<field name="order_id" type="text_general" indexed="true" stored="false" multiValued="true"/>
<field name="trd_date" type="text_general" indexed="true" stored="false" multiValued="true"/>
Then when I go back to the Solr Admin UI, I don't see the two new fields in the Schema section.
Is there anything I am missing here? And once the two fields are in the managed-schema, can I add the JSON file through upload in the Solr Admin UI?
Thank you very much.
Update: I have 100+ fields in the data to be indexed, and the data is in JSON format. I wonder what the best practice is for creating the schema file. Thanks.
You shouldn't have to create the file yourself; it should be created by Solr (since it's a managed schema). If you're manually editing the file, you have to reload the collection/core or restart Solr afterwards.
Otherwise you can use the Schema API to add or change fields. If you're running in a cloud context / cluster, you'll want to use the Schema API so that your changes are spread across all nodes (and your schema would live in ZooKeeper in that case anyway).
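For example, adding the order_id field through the Schema API might look like this (the collection name and URL are assumptions; adjust them to your setup):
curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field": {
    "name": "order_id",
    "type": "text_general",
    "indexed": true,
    "stored": false,
    "multiValued": true
  }
}' http://localhost:8983/solr/mycollection/schema
The same command with a different field name works for trd_date.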