Elastic4s - how to express "matched_fields"

I want to implement the following query in Elastic4s, but I don't see any way to express the matched_fields clause in the highlighter. Any help?
{
  "query": {
    "multi_match": {
      "type": "most_fields",
      "query": "hello world",
      "fields": [ "text", "text.human" ]
    }
  },
  "highlight": {
    "order": "score",
    "fields": {
      "text": {
        "matched_fields": [ "text", "text.human" ],
        "fragment_size": 100,
        "number_of_fragments": 10
      }
    }
  }
}

A question with a very simple answer. You can't do it yet in elastic4s :)
So, I've just added it:
https://github.com/sksamuel/elastic4s/commit/3f8a4e47ae603b7a3263bc3d31c27f2b6706cd8e
This will be in the next 1.5.x release and 1.6.0.

useQuery's onCompleted being called with cached value

Hopefully I can articulate this question clearly without too much code as it's difficult to extract the pieces from my codebase.
I was observing odd behavior yesterday with useQuery that I can't seem to understand. I think I understand Apollo's cache pretty well but this particular behavior doesn't make sense to me. I have a query that looks something like this:
query {
  reservations {
    priceBreakdown {
      sections {
        id
        name
        total
      }
    }
  }
}
The schema is something like:
type Query {
  reservations: [Reservation]
}

type Reservation {
  priceBreakdown: PriceBreakdown
}

type PriceBreakdown {
  sections: [Section]
}

type Section {
  id: String
  name: String
  total: Float
}
That id on Section is not a proper ID and, in fact, is not unique. It's just a string and all PriceBreakdowns have a list of Sections that contain the same ID. I've pointed this out to the backend folks and it's being fixed but I realize this causes incorrect caching with Apollo since there will be collisions w.r.t. __typename and id. My confusion comes from how onCompleted is called. I noticed when doing
const { data } = useQuery(myQuery, {
  onCompleted: console.log
})
that when the network call returns, all PriceBreakdowns are unique and correct, as they should be. But when onCompleted is called with what I thought would be that same API data, it's different and seems to reflect the cached values. In case that's confusing, here are the two results. First is straight from the API and second is the log from onCompleted:
// api results
"data": [
  {
    "id": "92267",
    "price_breakdown": {
      "sections": [
        {
          "name": "Reservation",
          "total": "$60.00",
          "id": "RESERVATION"
        },
        {
          "name": "Promotions and Fees",
          "total": null,
          "id": "PROMOTIONS_AND_FEES"
        },
        {
          "name": "Total",
          "total": "$51.00",
          "id": "HOST_TOTAL"
        }
      ]
    }
  },
  {
    "id": "92266",
    "price_breakdown": {
      "sections": [
        {
          "name": "Reservation",
          "total": "$30.00",
          "id": "RESERVATION"
        },
        {
          "name": "Promotions and Fees",
          "total": null,
          "id": "PROMOTIONS_AND_FEES"
        },
        {
          "name": "Total",
          "total": "$25.50",
          "id": "HOST_TOTAL"
        }
      ]
    }
  }
]
// onCompleted log
"data": [
  {
    "id": "92267",
    "price_breakdown": {
      "sections": [
        {
          "name": "Reservation",
          "total": "$60.00",
          "id": "RESERVATION"
        },
        {
          "name": "Promotions and Fees",
          "total": null,
          "id": "PROMOTIONS_AND_FEES"
        },
        {
          "name": "Total",
          "total": "$51.00",
          "id": "HOST_TOTAL"
        }
      ]
    }
  },
  {
    "id": "92266",
    "price_breakdown": {
      "sections": [
        {
          "name": "Reservation",
          "total": "$60.00",
          "id": "RESERVATION"
        },
        {
          "name": "Promotions and Fees",
          "total": null,
          "id": "PROMOTIONS_AND_FEES"
        },
        {
          "name": "Total",
          "total": "$51.00",
          "id": "HOST_TOTAL"
        }
      ]
    }
  }
]
As you can see, in the onCompleted log, the Sections that had the same ID as Sections from the previous record are duplicated, suggesting Apollo is rebuilding the payload from cache and calling onCompleted with that. Is that what's happening? If I set the fetchPolicy to no-cache, the results are correct, but of course that's just a patch for the problem. I want to better understand Apollo because I thought I understood and now I see something unintuitive. I wouldn't have expected onCompleted to be called with something built from the cache. Thanks in advance.
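That lines up with the collision you describe: Apollo normalizes objects by __typename plus id, so both reservations' sections collapse into the same Section:RESERVATION, Section:PROMOTIONS_AND_FEES and Section:HOST_TOTAL cache entries, and the data handed to onCompleted comes from that cache rather than being the raw network payload, which matches what you're seeing. Until the backend ids are fixed, one workaround is to stop the cache from normalizing Section at all. A minimal sketch, assuming Apollo Client 3's InMemoryCache (the uri is a placeholder):

import { ApolloClient, InMemoryCache } from '@apollo/client';

const client = new ApolloClient({
  uri: '/graphql', // placeholder endpoint
  cache: new InMemoryCache({
    typePolicies: {
      Section: {
        // Section ids are not unique across PriceBreakdowns, so don't give
        // Section objects their own cache identity; keep them embedded in
        // their parent instead of collapsing them into shared entries.
        keyFields: false,
      },
    },
  }),
});

With that in place each price breakdown keeps its own sections in the cache, so the cached read should match the network response; fetchPolicy: 'no-cache' remains the blunt alternative you already found.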

I have a datastudio error - basic json config

I've returned to try and make some Data Studio custom JavaScript.
So I started off with template-type settings and basic JS. The manifest is listing correctly - Data Studio sees the custom item.
It took a long time for it to be authorised.
However, on adding the custom JS, the console is reporting a load of errors.
First: data.0.type is not a valid config.
Second: data.0.elements.data.0.type is not a valid config.
JSON:
{
  "data": [
    {
      "id": "idtestviz",
      "label": "Dimension Element Heading",
      "type": "DIMENSION"
    }
  ],
  "style": [
    {
      "id": "idtestvizstyles",
      "label": "Test Styles",
      "elements": [
        {
          "id": "idtestvizfontcolor",
          "label": "Font Colour",
          "defaultValue": "#FFFF00"
        }
      ]
    }
  ]
}
It did have options in before; same error.
And it appears to be the same as the example at https://developers.google.com/datastudio/visualization/define-config
It is also erroring with 'is already used in the config',
and with data.0.elements.style.0.elements.0.type being a required field that cannot be found.
It seems like there are more checks that need to be done.
Is there a validator for the JSON etc. before running, or has something been updated on Google's side that their documentation doesn't reflect yet?
Or, more likely, I'm missing some critical stuff...
Regards
Vince
Re-checked my JSON config against a previous one that works and noted some errors in the objects. Corrected those, and the JSON errors in the console have gone away.
JS errors remain - working on those... closing this question.
{
  "data": [
    {
      "id": "test_viz_data",
      "label": "Test Viz Data",
      "elements": [
        {
          "id": "text_viz_dimensions",
          "label": "Dimension Element Heading",
          "type": "DIMENSION",
          "options": {
            "min": 1,
            "max": 1
          }
        },
        {
          "id": "test_metrics",
          "label": "Metric fields",
          "type": "METRIC",
          "options": {
            "min": 1,
            "max": 1
          }
        }
      ]
    }
  ],
  "style": [
    {
      "id": "idstyles",
      "label": "Test Styles",
      "elements": [
        {
          "id": "idfontcolor",
          "label": "Font Colour",
          "type": "FONT_COLOR",
          "defaultValue": "#FFFF00"
        }
      ]
    }
  ],
  "interactions": []
}

Multiple match_phrase conditions with another bool in a single ElasticSearch query?

I am trying to conduct an Elasticsearch query that searches a text field ("body") and returns items that match at least one of two multi-word phrases I provide (i.e. "stack overflow" OR "the stackoverflow"). I would also like the query to only return results that occur after a given timestamp, with the results ordered by time.
My current solution is below. I believe the MUST is working correctly (gte a timestamp), but the BOOL + SHOULD with two match_phrases is not correct. I am getting the following error:
Unexpected character ('{' (code 123)): was expecting double-quote to start field name
Which I think is because I have two match_phrases in there?
This is the ES mapping, and the details of the ES API I am using are here.
{"query":
{"bool":
{"should":
[{"match_phrase":
{"body":"a+phrase"}
},
{"match_phrase":
{"body":"another+phrase"}
}
]
},
{"bool":
{"must":
[{"range":
{"created_at:
{"gte":"thispage"}
}
}
]}
}
},"size":10000,
"sort":"created_at"
}
I think you were just missing a single " after created_at.
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "created_at": {
              "gte": "1534004694"
            }
          }
        },
        {
          "bool": {
            "should": [
              {
                "match_phrase": {
                  "body": "a+phrase"
                }
              },
              {
                "match_phrase": {
                  "body": "another+phrase"
                }
              }
            ]
          }
        }
      ]
    }
  },
  "size": 10,
  "sort": "created_at"
}
Also, you are allowed to have both must and should as properties of a bool object, so this is also worth trying.
{
  "query": {
    "bool": {
      "must": {
        "range": {
          "created_at": {
            "gte": "1534004694"
          }
        }
      },
      "should": [
        {
          "match_phrase": {
            "body": "a+phrase"
          }
        },
        {
          "match_phrase": {
            "body": "another+phrase"
          }
        }
      ]
    }
  },
  "size": 10,
  "sort": "created_at"
}
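One caveat with this second form: when a bool query contains a must clause, its should clauses are no longer required to match (minimum_should_match defaults to 0 in that case), so documents matching only the range would come back too. If at least one phrase has to match, add "minimum_should_match": 1. A hedged sketch of that variant, assuming a v7-style official JavaScript client (@elastic/elasticsearch) and a placeholder index name:

import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

async function searchPhrases() {
  // Range is mandatory; at least one of the two phrases must match as well.
  const { body } = await client.search({
    index: 'my-index', // placeholder index name
    body: {
      query: {
        bool: {
          must: { range: { created_at: { gte: '1534004694' } } },
          should: [
            { match_phrase: { body: 'a+phrase' } },
            { match_phrase: { body: 'another+phrase' } },
          ],
          minimum_should_match: 1,
        },
      },
      size: 10,
      sort: 'created_at',
    },
  });
  return body.hits.hits;
}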
On a side note, Postman or any JSON formatter/validator would really help in determining where the error is.

Remove elements/objects From Array in ElasticSearch Followed by Matching Query

I'm having issues trying to remove elements/objects from an array in elasticsearch.
This is the mapping for the index:
{
  "example1": {
    "mappings": {
      "doc": {
        "properties": {
          "locations": {
            "type": "geo_point"
          },
          "postDate": {
            "type": "date"
          },
          "status": {
            "type": "long"
          },
          "user": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
And this is an example document.
{
  "_index": "example1",
  "_type": "doc",
  "_id": "8036",
  "_score": 1,
  "_source": {
    "user": "kimchy8036",
    "postDate": "2009-11-15T13:12:00",
    "locations": [
      [
        72.79887719999999,
        21.193036000000003
      ],
      [
        -1.8262150000000001,
        51.178881999999994
      ]
    ]
  }
}
Using the query below, I can add multiple locations.
POST /example1/_update_by_query
{
  "query": {
    "match": {
      "_id": "3"
    }
  },
  "script": {
    "lang": "painless",
    "inline": "ctx._source.locations.add(params.newsupp)",
    "params": {
      "newsupp": [
        -74.00,
        41.12121
      ]
    }
  }
}
But I'm not able to remove array objects from locations. I have tried the query below but it's not working.
POST example1/doc/3/_update
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source.locations.remove(params.tag)",
    "params": {
      "tag": [
        -74.00,
        41.12121
      ]
    }
  }
}
Kindly let me know where I am going wrong here. I am using Elasticsearch version 5.5.2.
In Painless scripts, the remove() method removes elements by index, not by value.
Here's a working example that removes array elements by value in Elasticsearch script:
POST objects/_update_by_query
{
  "query": {
    ... // use regular ES query to remove only in relevant documents
  },
  "script": {
    "source": """
      if (ctx._source[params.array_attribute] != null) {
        for (int i = ctx._source[params.array_attribute].length - 1; i >= 0; i--) {
          if (ctx._source[params.array_attribute][i] == params.value_to_remove) {
            ctx._source[params.array_attribute].remove(i);
          }
        }
      }
    """,
    "params": {
      "array_attribute": "<NAME_OF_ARRAY_PROPERTY_TO_REMOVE_VALUE_IN>",
      "value_to_remove": "<VALUE_TO_REMOVE_FROM_ARRAY>"
    }
  }
}
You might want to simplify the script if it only needs to remove values from one specific array attribute. For example, removing "green" from a document's .color_list array:
_doc/001 = {
  "color_list": ["red", "blue", "green"]
}
Script to remove "green":
POST objects/_update_by_query
{
  "query": {
    ... // use regular ES query to remove only in relevant documents
  },
  "script": {
    "source": """
      for (int i = ctx._source.color_list.length - 1; i >= 0; i--) {
        if (ctx._source.color_list[i] == params.color_to_remove) {
          ctx._source.color_list.remove(i);
        }
      }
    """,
    "params": {
      "color_to_remove": "green"
    }
  }
}
Unlike add(), remove() takes the index of the element and removes it.
Your ctx._source.locations in painless is an ArrayList. It has List's remove() method:
E remove(int index)
Removes the element at the specified position in this list (optional operation). ...
See Painless API - List for other methods.
See this answer for example code.
"script" : {
"lang":"painless",
"inline":"ctx._source.locations.remove(params.tag)",
"params":{
"tag":indexToRemove
}
}
Just as ctx._source.locations.add(elt) adds an element, ctx._source.locations.remove(indexToRemove) removes the element at the given index in the array.
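Since remove() wants an index, one option is to look the index up first with indexOf() (Painless lists expose the Java List API) and only remove when the value is found. A hedged sketch of that script sent through a v7-style official JavaScript client (@elastic/elasticsearch), reusing the example1 index and document id from the question; note it relies on the [lon, lat] pair matching the stored values exactly:

import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

async function removeLocation() {
  await client.updateByQuery({
    index: 'example1',
    body: {
      query: { match: { _id: '3' } },
      script: {
        lang: 'painless',
        // Find the position of the [lon, lat] pair, then remove by index.
        source:
          'int i = ctx._source.locations.indexOf(params.tag); ' +
          'if (i >= 0) { ctx._source.locations.remove(i); }',
        params: { tag: [-74.0, 41.12121] },
      },
    },
  });
}

If the value may not be stored exactly as passed in, the loop-and-compare script from the earlier answer is the safer choice.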

How to filter embedded array in mongo document with morphia

Given that my Profile data looks like below, I want to find the profile for a combination of userName and productId, and only return the profile with the respective contract for this product.
{
  "firstName": "John",
  "lastName": "Doe",
  "userName": "john.doe#gmail.com",
  "language": "NL",
  "timeZone": "Europe/Amsterdam",
  "contracts": [
    {
      "contractId": "DEMO1-CONTRACT",
      "productId": "ticket-api",
      "startDate": ISODate('2016-06-29T09:06:42.391Z'),
      "roles": [
        {
          "name": "Manager",
          "permissions": [
            {
              "activity": "ticket",
              "permission": "createTicket"
            },
            {
              "activity": "ticket",
              "permission": "updateTicket"
            },
            {
              "activity": "ticket",
              "permission": "closeTicket"
            }
          ]
        }
      ]
    },
    {
      "contractId": "DEMO2-CONTRACT",
      "productId": "comment-api",
      "startDate": ISODate('2016-06-29T10:27:45.899Z'),
      "roles": [
        {
          "name": "Manager",
          "permissions": [
            {
              "activity": "comment",
              "permission": "createComment"
            },
            {
              "activity": "comment",
              "permission": "updateComment"
            },
            {
              "activity": "comment",
              "permission": "deleteComment"
            }
          ]
        }
      ]
    }
  ]
}
I managed to find out how to do this from the command line, but I don't seem to find a way to accomplish it with Morphia (latest version).
db.Profile.aggregate([
  { $match: { "userName": "john.doe#gmail.com" } },
  { $project: {
    contracts: {
      $filter: {
        input: '$contracts',
        as: 'contract',
        cond: { $eq: ['$$contract.productId', "ticket-api"] }
      }
    }
  }}
])
This is what I have so far. Any help is most appreciated
Query<Profile> matchQuery = getDatastore().createQuery(Profile.class).field(Profile._userName).equal(userName);
getDatastore()
    .createAggregation(Profile.class)
    .match(matchQuery)
    .project(Projection.expression(??))
Note... meanwhile I found another solution which does not use an aggregation pipeline.
public Optional<Profile> findByUserNameAndContractQuery(String userName, String productId) {
  DBObject contractQuery = BasicDBObjectBuilder.start(Contract._productId, productId).get();
  Query<Profile> query =
      getDatastore()
          .createQuery(Profile.class)
          .field(Profile._userName).equal(userName)
          .filter(Profile._contracts + " elem", contractQuery)
          .retrievedFields(true, Profile._contracts + ".$");
  return Optional.ofNullable(query.get());
}
I finally found the best way (under the assumption that I only want to return at most one element from the array) to filter the embedded array.
db.Profile.aggregate([
  { $match: { "userName": "john.doe#gmail.com" } },
  { $unwind: "$contracts" },
  { $match: { "contracts.productId": "comment-api" } }
])
To match your first design, you could try the projection settings with the Morphia aggregation pipeline.
Query<Profile> matchQuery = getDatastore().createQuery(Profile.class).field(Profile._userName).equal(userName);
getDatastore()
    .createAggregation(Profile.class)
    .match(matchQuery)
    .project(Projection.expression("$filter", new BasicDBObject()
        .append("input", "$contracts")
        .append("as", "contract")
        .append("cond", new BasicDBObject()
            .append("$eq", Arrays.asList("$$contract.productId", "ticket-api")))));
Also see the example written by the morphia crew around line 88 at https://github.com/mongodb/morphia/blob/master/morphia/src/test/java/org/mongodb/morphia/aggregation/AggregationTest.java.
