I have an Elasticsearch index where some records have a @timestamp of 1st Feb. There is also a field in _source called process_time, which is different for each record with the same @timestamp.
{
  "_index": "elasticsearch-index",
  "_type": "doc",
  "_id": "qByop2gBw60PM5VYP0aG",
  "_score": 1,
  "_source": {
    "task": "INFO",
    "@timestamp": "2019-02-01T06:04:08.365Z",
    "num_of_batches": 0,
    "batch_size": 1000,
    "process_time": "2019-02-04 06:04:04,489"
  }
},
{
  "_index": "elasticsearch-index",
  "_type": "doc",
  "_id": "qByop2gBw60PM5VYP0aG",
  "_score": 1,
  "_source": {
    "task": "INFO",
    "@timestamp": "2019-02-01T06:04:08.365Z",
    "num_of_batches": 0,
    "batch_size": 1000,
    "process_time": "2019-02-05 06:04:04,489"
  }
}
I want to update the @timestamp of all records having a @timestamp of 1st Feb to whatever the process_time is in that record.
How can I do this?
EDIT:
After @mysterion's answer I made the following changes:
{
  "script": {
    "source": """ctx._source['@timestamp'] = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").parse(ctx._source.process_time.replaceAll(",","."))""",
    "lang": "painless"
  },
  "query": {
    "term": {
      "@timestamp": "2019-02-01"
    }
  }
}
But I got the following exception:
"type": "class_cast_exception",
"reason": "Cannot cast java.lang.String to java.util.function.Function"
You could utilize the Update By Query API to update the needed documents in Elasticsearch.
POST index_name/_update_by_query?conflicts=proceed
{
  "script": {
    "source": "ctx._source['@timestamp'] = new SimpleDateFormat('yyyy-MM-dd HH:mm:ss,SSS').parse(ctx._source.process_time)",
    "lang": "painless"
  },
  "query": {
    "term": {
      "@timestamp": "2019-02-01T00:00:00Z"
    }
  }
}
You're using the replaceAll method expecting it to be the plain Java String method; in fact, it's not. In Painless it is:
String replaceAll(Pattern, Function)
More information in the Painless API reference: https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-api-reference.html
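Putting this together with your EDIT, a corrected request might look like the following (a sketch, untested): the comma is handled directly inside the date-format pattern, so replaceAll isn't needed at all; the parsed date is stored as epoch millis via getTime(), which a date field accepts by default; and a range query covers the whole of Feb 1st, since a term query would only match one exact instant:
POST elasticsearch-index/_update_by_query?conflicts=proceed
{
  "script": {
    "source": "ctx._source['@timestamp'] = new SimpleDateFormat('yyyy-MM-dd HH:mm:ss,SSS').parse(ctx._source.process_time).getTime()",
    "lang": "painless"
  },
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2019-02-01",
        "lt": "2019-02-02"
      }
    }
  }
}
Note that SimpleDateFormat parses in the JVM's default time zone, so the resulting instant may be shifted if process_time is meant to be UTC.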
Related
I have a MongoDB document collection with multiple arrays that looks like this:
{
  "_id": "1235847",
  "LineItems": [
    {
      "StartDate": ISODate("2017-07-31T00:00:00.000+00:00"),
      "EndDate": ISODate("2017-09-19T00:00:00.000+00:00"),
      "Amount": {"$numberDecimal": "0.00"}
    },
    {
      "StartDate": ISODate("2022-03-20T00:00:00.000+00:00"),
      "EndDate": ISODate("2022-10-21T00:00:00.000+00:00"),
      "Amount": {"$numberDecimal": "6.38"}
    },
    {
      "StartDate": ISODate("2022-09-20T00:00:00.000+00:00"),
      "EndDate": ISODate("9999-12-31T00:00:00.000+00:00"),
      "Amount": {"$numberDecimal": "6.17"}
    }
  ]
}
Is there a simple way to find documents where a StartDate overlaps with a previous StartDate/EndDate pair?
A StartDate cannot be before a previous EndDate within the array.
A start/end cannot fall between a previous start/end pair within the array.
The query below works, but I don't want to hardcode the array indexes to find all such documents:
{
  $match: {
    $expr: {
      $gt: [
        '$LineItems.3.EndDate',
        '$LineItems.2.StartDate'
      ]
    }
  }
}
Here's one way you could find docs where "StartDate" is earlier than the immediately previous "EndDate".
db.collection.find({
  "$expr": {
    "$getField": {
      "field": "overlapped",
      "input": {
        "$reduce": {
          // walk the array from the second element onward, carrying the
          // previous element's EndDate forward as "prevEnd"
          "input": {"$slice": ["$LineItems", 1, {"$size": "$LineItems"}]},
          "initialValue": {
            "overlapped": false,
            "prevEnd": {"$first": "$LineItems.EndDate"}
          },
          "in": {
            // flag the document once any StartDate precedes the
            // immediately previous EndDate
            "overlapped": {
              "$or": [
                "$$value.overlapped",
                {"$lt": ["$$this.StartDate", "$$value.prevEnd"]}
              ]
            },
            "prevEnd": "$$this.EndDate"
          }
        }
      }
    }
  }
})
Try it on mongoplayground.net.
Hopefully I can articulate this question clearly without too much code as it's difficult to extract the pieces from my codebase.
I was observing odd behavior yesterday with useQuery that I can't seem to understand. I think I understand Apollo's cache pretty well but this particular behavior doesn't make sense to me. I have a query that looks something like this:
query {
  reservations {
    priceBreakdown {
      sections {
        id
        name
        total
      }
    }
  }
}
The schema is something like:
type Query {
  reservations: [Reservation]
}

type Reservation {
  priceBreakdown: PriceBreakdown
}

type PriceBreakdown {
  sections: [Section]
}

type Section {
  id: String
  name: String
  total: Float
}
That id on Section is not a proper ID and, in fact, is not unique. It's just a string and all PriceBreakdowns have a list of Sections that contain the same ID. I've pointed this out to the backend folks and it's being fixed but I realize this causes incorrect caching with Apollo since there will be collisions w.r.t. __typename and id. My confusion comes from how onCompleted is called. I noticed when doing
const { data } = useQuery(myQuery, {
  onCompleted: console.log
})
that when the network call returns, all PriceBreakdowns are unique and correct, as they should be. But when onCompleted is called with what I thought would be that same API data, it's different and seems to reflect the cached values. In case that's confusing, here are the two results. First is straight from the API and second is the log from onCompleted:
// api results
"data": [
  {
    "id": "92267",
    "price_breakdown": {
      "sections": [
        {
          "name": "Reservation",
          "total": "$60.00",
          "id": "RESERVATION"
        },
        {
          "name": "Promotions and Fees",
          "total": null,
          "id": "PROMOTIONS_AND_FEES"
        },
        {
          "name": "Total",
          "total": "$51.00",
          "id": "HOST_TOTAL"
        }
      ]
    }
  },
  {
    "id": "92266",
    "price_breakdown": {
      "sections": [
        {
          "name": "Reservation",
          "total": "$30.00",
          "id": "RESERVATION"
        },
        {
          "name": "Promotions and Fees",
          "total": null,
          "id": "PROMOTIONS_AND_FEES"
        },
        {
          "name": "Total",
          "total": "$25.50",
          "id": "HOST_TOTAL"
        }
      ]
    }
  }
]
// onCompleted log
"data": [
  {
    "id": "92267",
    "price_breakdown": {
      "sections": [
        {
          "name": "Reservation",
          "total": "$60.00",
          "id": "RESERVATION"
        },
        {
          "name": "Promotions and Fees",
          "total": null,
          "id": "PROMOTIONS_AND_FEES"
        },
        {
          "name": "Total",
          "total": "$51.00",
          "id": "HOST_TOTAL"
        }
      ]
    }
  },
  {
    "id": "92266",
    "price_breakdown": {
      "sections": [
        {
          "name": "Reservation",
          "total": "$60.00",
          "id": "RESERVATION"
        },
        {
          "name": "Promotions and Fees",
          "total": null,
          "id": "PROMOTIONS_AND_FEES"
        },
        {
          "name": "Total",
          "total": "$51.00",
          "id": "HOST_TOTAL"
        }
      ]
    }
  }
]
As you can see, in the onCompleted log, the Sections that had the same ID as Sections from the previous record are duplicated, suggesting Apollo is rebuilding the payload from the cache and calling onCompleted with that. Is that what's happening? If I set the fetchPolicy to no-cache, the results are correct, but of course that's just a patch for the problem. I want to better understand Apollo, because I thought I understood it and now I see something unintuitive. I wouldn't have expected onCompleted to be called with something built from the cache. Thanks in advance.
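That is essentially what's happening: useQuery writes the network result into the normalized cache and then reads the query back out of it, so both data and the argument to onCompleted reflect the cache-reconstructed payload, and the colliding Section:ID keys mean later writes clobber earlier ones. Until the backend fix lands, a less drastic workaround than no-cache (a sketch, assuming Apollo Client 3's InMemoryCache) is to disable normalization for Section so each one stays embedded in its parent PriceBreakdown:
import { InMemoryCache } from "@apollo/client";

const cache = new InMemoryCache({
  typePolicies: {
    Section: {
      // Don't normalize Section objects into their own cache entries
      // keyed by __typename:id; keep them inline in the parent object,
      // so the non-unique ids can no longer collide.
      keyFields: false,
    },
  },
});
Once the backend returns truly unique ids, this policy can simply be removed.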
I have been working on a Mongo database for a while. The database has some visits that have this form:
[
  {
    "isPaid": false,
    "_id": "5c12bc3dcea46f9d3658ca98",
    "clientId": "5c117f2d1d6b9f9182182ae4",
    "came_by": "Twitter Ad",
    "reason": "Some reason",
    "doctor_services": "Some service",
    "doctor": "Dr. Michael",
    "service": "Special service",
    "payments": [
      {
        "date": "2018-12-13T21:23:05.424Z",
        "_id": "5c12cdb9b236c59e75fe8190",
        "sum": 345,
        "currency": "$",
        "note": "First Payment"
      },
      {
        "date": "2018-12-13T21:23:07.954Z",
        "_id": "5c12cdbbb236c59e75fe8191",
        "sum": 100,
        "currency": "$",
        "note": "Second payment"
      },
      {
        "date": "2018-12-13T21:23:16.767Z",
        "_id": "5c12cdc4b236c59e75fe8192",
        "sum": 5,
        "currency": "$",
        "note": "Third Payment"
      }
    ],
    "price": 500,
    "createdAt": "2018-12-13T20:08:29.425Z",
    "updatedAt": "2018-12-13T21:42:21.559Z"
  }
]
I need a query to update a field of a single payment, based on the _id of the visit and the _id of the payment inside the nested array. Also, when a payment's sum is updated so that the total of all payments is greater than or equal to price, the isPaid field should automatically be set to true.
I have tried some queries in Mongoose to achieve the first part, but none of them seem to work:
let paymentId = req.params.paymentId;
let updatedFields = req.body;

Visit.update(
  { "payments._id": paymentId },
  {
    $set: {
      "visits.$": updatedFields
    }
  }
).exec((err, visit) => {
  if (err) {
    return res.status(500).send("Couldn't update payment");
  }
  return res.status(200).send("Updated payment");
});
As for the second part of the question, I haven't really come up with anything, so I would appreciate at least some direction on how to approach it.
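For the record, here is a sketch of both parts (untested; visitId is a hypothetical variable taken from the route, and this assumes Mongoose with MongoDB 4.2+ for the pipeline-style update in part two). The key fixes for part one are matching the payment inside the filter and targeting payments.$ rather than visits.$:
// Part 1: update a single payment, matched by visit _id and payment _id.
// "payments.$" refers to the array element matched in the filter.
Visit.updateOne(
  { _id: visitId, "payments._id": paymentId },
  {
    $set: {
      "payments.$.sum": updatedFields.sum,
      "payments.$.note": updatedFields.note
    }
  }
).exec((err) => {
  // handle err / send response
});

// Part 2: recompute isPaid from the payments total using an
// aggregation-pipeline update (available from MongoDB 4.2):
Visit.updateOne(
  { _id: visitId },
  [
    { $set: { isPaid: { $gte: [{ $sum: "$payments.sum" }, "$price"] } } }
  ]
).exec((err) => {
  // handle err / send response
});
Running part two right after part one keeps isPaid in sync without storing the running total separately.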
I am trying to get an object out of a JSON array that is stored in Elasticsearch. The layout is like this:
[
  object{}
  object{}
  object{}
]
What I need is, when a search hits one of these objects, to get back the specific object that matched. Currently, using the Java API, I am searching with:
QueryBuilder qb = QueryBuilders.boolQuery()
    .should(QueryBuilders.matchQuery("text", "pottery").boost(5)
    .minimumShouldMatch("1"));

SearchResponse response = client.prepareSearch("stuff")
    .setTypes("things")
    .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
    .setQuery(qb)
    .setPostFilter(filter) //.setHighlighterQuery(qb)
    .addField("places.numbers")
    .addField("name")
    .addField("city")
    .setFrom(0).setSize(60).setExplain(true)
    .execute()
    .actionGet();
But this will just return the whole object that I hit, or, when I tell it to return the field "places.numbers", it will only return the first object in the "places" array, not the one that was matched by the query.
Thank you for any help!
There are a couple of ways to handle this. I would probably do it with a nested type and inner hits, given what you've shown in your question, but it could also probably be done with the parent/child relationship.
Here is an example with nested docs. I set up a simple index like this:
PUT /test_index
{
  "mappings": {
    "parent_doc": {
      "properties": {
        "parent_name": {
          "type": "string"
        },
        "nested_docs": {
          "type": "nested",
          "properties": {
            "nested_name": {
              "type": "string"
            }
          }
        }
      }
    }
  }
}
Then added a couple of simple documents:
POST /test_index/parent_doc/_bulk
{"index":{"_id":1}}
{"parent_name":"p1","nested_docs":[{"nested_name":"n1"},{"nested_name":"n2"}]}
{"index":{"_id":2}}
{"parent_name":"p2","nested_docs":[{"nested_name":"n3"},{"nested_name":"n4"}]}
And now I can search like this, using "inner_hits":
POST /test_index/_search
{
  "query": {
    "nested": {
      "path": "nested_docs",
      "query": {
        "match": {
          "nested_docs.nested_name": "n3"
        }
      },
      "inner_hits": {}
    }
  }
}
which returns:
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 2.098612,
    "hits": [
      {
        "_index": "test_index",
        "_type": "parent_doc",
        "_id": "2",
        "_score": 2.098612,
        "_source": {
          "parent_name": "p2",
          "nested_docs": [
            {
              "nested_name": "n3"
            },
            {
              "nested_name": "n4"
            }
          ]
        },
        "inner_hits": {
          "nested_docs": {
            "hits": {
              "total": 1,
              "max_score": 2.098612,
              "hits": [
                {
                  "_index": "test_index",
                  "_type": "parent_doc",
                  "_id": "2",
                  "_nested": {
                    "field": "nested_docs",
                    "offset": 0
                  },
                  "_score": 2.098612,
                  "_source": {
                    "nested_name": "n3"
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
Here's the code I used to test it:
http://sense.qbox.io/gist/ef7debf436fec2a10097ba2106d5ff30ff8d7c77
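Since your code uses the Java API, roughly the same search can be expressed there as well. This is a sketch, untested, assuming the same transport-client era as your snippet (where QueryInnerHitBuilder lives in org.elasticsearch.index.query.support):
QueryBuilder qb = QueryBuilders.nestedQuery(
        "nested_docs",
        QueryBuilders.matchQuery("nested_docs.nested_name", "n3"))
    .innerHit(new QueryInnerHitBuilder());

SearchResponse response = client.prepareSearch("test_index")
    .setQuery(qb)
    .execute()
    .actionGet();

// Each top-level hit exposes only the nested objects that matched:
for (SearchHit hit : response.getHits().getHits()) {
    SearchHits matched = hit.getInnerHits().get("nested_docs");
}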
I've modified my contactNumber field to have a unique filter by updating the index settings as follows:
curl -XPUT localhost:9200/test-index2/_settings -d '
{
  "index": {
    "analysis": {
      "analyzer": {
        "unique_keyword_analyzer": {
          "only_on_same_position": "true",
          "filter": "unique"
        }
      }
    }
  },
  "mappings": {
    "business": {
      "properties": {
        "contactNumber": {
          "analyzer": "unique_keyword_analyzer",
          "type": "string"
        }
      }
    }
  }
}'
A sample item looks like this:
doc_type:"Business"
contactNumber:"(+12)415-3499"
name:"Sam's Pizza"
address:"Somewhere on earth"
The filter does not work: duplicate items are still inserted. I'd like no two documents to have the same contactNumber.
In the above I've also set only_on_same_position -> true, so that existing duplicate values would be truncated/deleted.
What am I doing wrong in the settings?
That's something Elasticsearch can't help you with out of the box... you need to implement this uniqueness check in your app. The only idea I can think of is to use the phone number as the _id of the document itself: whenever you insert/update something, ES will use the contactNumber as the _id, and it will either associate that document with the one that already exists or create a new one.
For example:
PUT /test-index2
{
  "mappings": {
    "business": {
      "_id": {
        "path": "contactNumber"
      },
      "properties": {
        "contactNumber": {
          "type": "string",
          "analyzer": "keyword"
        },
        "address": {
          "type": "string"
        }
      }
    }
  }
}
Then you index something:
POST /test-index2/business
{
  "contactNumber": "(+12)415-3499",
  "address": "whatever 123"
}
Getting it back:
GET /test-index2/business/_search
{
  "query": {
    "match_all": {}
  }
}
It looks like this:
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "test-index2",
"_type": "business",
"_id": "(+12)415-3499",
"_score": 1,
"_source": {
"contactNumber": "(+12)415-3499",
"address": "whatever 123"
}
}
]
}
You see there that the _id of the document is the phone number itself. If you want to change or insert another document (the address is different, there is a new field - whatever_field - but the contactNumber is the same):
POST /test-index2/business
{
  "contactNumber": "(+12)415-3499",
  "address": "whatever 123 456",
  "whatever_field": "whatever value"
}
Elasticsearch "updates" the existing document and responds with:
{
  "_index": "test-index2",
  "_type": "business",
  "_id": "(+12)415-3499",
  "_version": 2,
  "created": false
}
created is false, which means the document has been updated, not created. _version is 2, which again says that the document has been updated. And the _id is the phone number itself, which indicates this is the document that was updated.
Looking again in the index, ES stores this:
"hits": [
{
"_index": "test-index2",
"_type": "business",
"_id": "(+12)415-3499",
"_score": 1,
"_source": {
"contactNumber": "(+12)415-3499",
"address": "whatever 123 456",
"whatever_field": "whatever value"
}
}
]
So, the new field is there, the address has changed, and the contactNumber and _id are exactly the same.
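One caveat: the _id.path mapping setting used above was deprecated in Elasticsearch 1.5 and removed in 2.0, so on newer versions you'd get the same behavior by putting the phone number into the URL yourself when indexing (URL-encoded here):
PUT /test-index2/business/%28%2B12%29415-3499
{
  "contactNumber": "(+12)415-3499",
  "address": "whatever 123"
}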