Pact: ensure key names exist in array elements

If the returned JSON is a map, all key names specified in the response body are checked for existence. So
...
"response":
{
  "status": 200,
  "body":
  {
    "field1": "value1"
  }
...
will ensure that the body contains a key "field1"; if it is missing, an error occurs.
But what if the response body is an array? I see no way to test whether all (or at least one) of the elements in this array have a specific key name. This is important to me: I want to be warned if key names in the backend change, because that would cause errors in my application.

You can use eachLike to specify that array elements match a particular format. The correct syntax depends on which Pact framework you're using, but with pact-js, you would say:
const { somethingLike: like, term, eachLike } = pact
....
willRespondWith: {
  status: 200,
  body: eachLike({
    "field1": "value1"
  })
}
Here is the relevant part of the documentation.
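For context, a fuller interaction using the eachLike matcher imported above might look roughly like this. This is only a sketch: provider is assumed to be your Pact mock provider instance, the /items path and description strings are placeholders, and the exact options object (e.g. { min: 1 }) can differ between pact-js versions:
provider.addInteraction({
  state: 'some items exist',                    // hypothetical provider state
  uponReceiving: 'a request for the item list', // hypothetical description
  withRequest: {
    method: 'GET',
    path: '/items'                              // placeholder endpoint
  },
  willRespondWith: {
    status: 200,
    headers: { 'Content-Type': 'application/json' },
    // every array element must have the shape of this template;
    // verification fails if an element is missing "field1"
    body: eachLike({
      field1: 'value1'
    }, { min: 1 })
  }
})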
Your example suggests you're writing the Pact file yourself - if this is the case, you can use the [*] notation to describe any array element, as described in the specification:
"response":
{
"status": 200,
"body":
[
{
"field1": "value1"
}
],
...
"matchingRules": {
"$.body": {
"min": 1,
"match": "type"
},
"$.body[*].field1": {
"match": "type"
},
...

Related

MongoDB Array Query - Single out an array element

I am having trouble with querying a MongoDB collection with an array inside.
Here is the structure of my collection that I am querying. This is one record:
{
  "_id": "abc123def4567890",
  "profile_id": "abc123def4567890",
  "image_count": 2,
  "images": [
    {
      "image_id": "ABC123456789",
      "image_url": "images/something.jpg",
      "geo_loc": "-0.1234,11.234567890",
      "title": "A Title",
      "shot_time": "01:23:33",
      "shot_date": "11/22/2222",
      "shot_type": "scenery",
      "conditions": "cloudy",
      "iso": 16,
      "f": 2.4,
      "ss": "1/545",
      "focal": 6.0,
      "equipment": "",
      "instructions": "",
      "upload_date": 1234567890,
      "update_date": 1234567890
    },
    {
      "image_id": "ABC123456789",
      "image_url": "images/something.jpg",
      "geo_loc": "-0.1234,11.234567890",
      "title": "A Title",
      "shot_time": "01:23:33",
      "shot_date": "11/22/2222",
      "shot_type": "portrait",
      "conditions": "cloudy",
      "iso": "16",
      "f": "2.4",
      "ss": "1/545",
      "focal": "6.0",
      "equipment": "",
      "instructions": "",
      "upload_date": 1234567890,
      "update_date": 1234567890
    }
  ]
}
Forgive the formatting, I didn't know how else to show this.
As you can see, it's a profile with a series of images within an array called 'images', and there are 2 images. Each of the 'images' array items contains an object of attributes for the image (url, title, type, etc.).
All I want to do is to return the object element whose attributes match certain criteria:
Select object from images which has shot_type = "scenery"
I tried to make it as simple as possible, so I started with:
find( { "images.shot_type": "scenery" } )
This returns the entire record with both images inside it. So I tried projection, but I could not isolate the single object within the array (in this case the object at position 0) and return it.
I think the answer lies with projection, but I am unsure.
I have gone through the MongoDB documentation for hours now and can't find inspiration. I have read about $elemMatch, $, and the other array operators; nothing seems to allow you to single out an array item based on the data within it. I have also been through this page: https://docs.mongodb.com/manual/tutorial/query-arrays/ and still can't work it out.
Can anyone provide help?
Have I made an error by using '$push' to populate my images field (making it an array) instead of using '$set' which would have made it into an embedded document? Would this have made a difference?
Using aggregation:
db.collection.aggregate([
  {
    $project: {
      _id: 0,
      "result": {
        $filter: {
          input: "$images",
          as: "img",
          cond: {
            $eq: [
              "$$img.shot_type",
              "scenery"
            ]
          }
        }
      }
    }
  }
])
Playground
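If you also want to restrict the result to a single profile before projecting, one possible variation (a sketch reusing the profile_id value from the sample record) is to prepend a $match stage:
db.collection.aggregate([
  // keep only the profile of interest
  { $match: { profile_id: "abc123def4567890" } },
  {
    $project: {
      _id: 0,
      // keep only the images whose shot_type is "scenery"
      result: {
        $filter: {
          input: "$images",
          as: "img",
          cond: { $eq: ["$$img.shot_type", "scenery"] }
        }
      }
    }
  }
])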
You can use $elemMatch in this way (simplified query):
db.collection.find({
  "profile_id": "1",
},
{
  "images": {
    "$elemMatch": {
      "shot_type": 1
    }
  }
})
You can pass two objects to the find query. The first filters the documents and only returns those whose profile_id is 1. You can omit this stage and use just { } if you want to search the entire collection.
Then, the second object uses $elemMatch to project only the element whose shot_type is 1.
Check an example here
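Adapted to the actual field values in the question, the same idea might look like this (a sketch; note that an $elemMatch projection returns only the first matching array element):
db.collection.find(
  // select documents that contain at least one scenery image
  { "images.shot_type": "scenery" },
  // project only the first array element that matches
  { "images": { "$elemMatch": { "shot_type": "scenery" } } }
)
If more than one element can match and you need all of them, the $filter aggregation from the other answer is the better fit.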

JSON schema deeper object uniqueness

I'm trying to get into JSON Schema definitions and want to find out how to achieve deeper object uniqueness in the schema definition. Please look at the following example definition, in this case a simple IO of a module.
{
  "$schema": "http://json-schema.org/draft-06/schema#",
  "type": "object",
  "required": ["modulIOs"],
  "properties": {
    "modulIOs": {
      "type": "array",
      "uniqueItems": true,
      "items": {
        "allOf": [
          {
            "type": "object",
            "required": ["ioPosition", "ioType", "ioFunction"],
            "additionalProperties": false,
            "properties": {
              "ioPosition": {
                "type": "integer"
              },
              "ioType": {
                "type": "string",
                "enum": ["in", "out"]
              },
              "ioFunction": {
                "type": "string"
              }
            }
          }
        ]
      }
    }
  }
}
When I validate the following with, e.g., a draft-06 validator, I get a positive validation.
{"modulIOs":
[
{
"ioPosition":1,
"ioType":"in",
"ioFunction":"240 V AC in"
},
{
"ioPosition":1,
"ioType":"in",
"ioFunction":"24 V DC in"
}
]
}
I'm aware that the validation is successful because the validator does what it's intended to do: it checks the structure of a JSON object. But is there a possibility to validate object values in deeper objects, or do I need to perform that check elsewhere?
This is not currently possible with JSON Schema (at draft-7).
There is an issue raised on the official spec repo github for this: https://github.com/json-schema-org/json-schema-spec/issues/538
If you (or anyone reading this) really want this, please thumbs-up the first issue comment.
It's currently unlikely to make it into the next draft, and even if it did, implementations may be slow to pick it up.
You'll need to do this validation after your JSON Schema validation process.
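As an illustration, a minimal post-validation check in JavaScript, assuming data holds the already parsed and schema-validated object from the question, could look like this:
// collect all ioPosition values and look for repeats
const positions = data.modulIOs.map(io => io.ioPosition);
const duplicates = positions.filter((pos, idx) => positions.indexOf(pos) !== idx);

if (duplicates.length > 0) {
  // with the sample data above this reports ioPosition 1, which appears twice
  throw new Error("Duplicate ioPosition values: " + [...new Set(duplicates)].join(", "));
}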
You can validate the data values of your object fields by using JSON Schema validation.
For example, if you need to check if ioPosition is between 0 and 100 you can use:
"ioPosition": {
"type": "integer",
"minimum": 0,
"maximum": 100
}
If you need to validate the ioFunction field, you can use a regular expression, such as:
"ioFunction": {
"type": "string",
"pattern": "^[0-9]+ V [A,D]C"
}
Take a look at json-schema-validation.

Updating a field with a nested array in Elastic Search

I am trying to update a field in a document with an array. I want to add an array to the field "products". I tried this:
POST /index/type/1/_update
{
  "doc": {
    "products": [
      {
        "name": "A",
        "count": 1
      },
      {
        "name": "B",
        "count": 2
      },
      {
        "name": "c",
        "count": 3
      }
    ]
  }
}
This is the error response I get when I try to run the code:
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse [products]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse [products]",
    "caused_by": {
      "type": "illegal_state_exception",
      "reason": "Can't get text on a START_OBJECT at 1:2073"
    }
  },
  "status": 400
}
Anyone know what I am doing wrong?
The message "Can't get text on a START_OBJECT" means that Elasticsearch was expecting an entry of type "string" but you are trying to give an object as input.
If you check Kibana you will find that the field "products" exists there and is defined as a string. But since you are sending a list of objects, the field "products" should have been defined from the beginning as an object (preferably with dynamic fields in it). An example would be (see the full example at https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic.html):
"products": {
"dynamic": true,
"properties": {}
}
However, since you already have the index, you can't change the existing mapping, so you would need to delete the index, define the mapping beforehand, and then do the update.
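If you go the delete-and-recreate route, a rough sketch (assuming the same index and type names as in the question, and an Elasticsearch version that still uses mapping types) would put the mapping above in place before re-running the update:
DELETE /index

PUT /index
{
  "mappings": {
    "type": {
      "properties": {
        "products": {
          "dynamic": true,
          "properties": {}
        }
      }
    }
  }
}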

"There is no index available for this selector" despite the fact I made one

In my data, I have two fields that I want to use as an index together. They are sensorid (any string) and timestamp (yyyy-mm-dd hh:mm:ss).
So I made an index for these two using the Cloudant index generator. This was created successfully and it appears as a design document.
{
  "index": {
    "fields": [
      {
        "name": "sensorid",
        "type": "string"
      },
      {
        "name": "timestamp",
        "type": "string"
      }
    ]
  },
  "type": "text"
}
However, when I try to make the following query to find all documents with a timestamp newer than some value, I am told there is no index available for the selector:
{
  "selector": {
    "timestamp": {
      "$gt": "2015-10-13 16:00:00"
    }
  },
  "fields": [
    "_id",
    "_rev"
  ],
  "sort": [
    {
      "_id": "asc"
    }
  ]
}
What have I done wrong?
It seems to me that Cloudant Query only allows sorting on fields that are part of the selector.
Therefore your selector should include the _id field and look like:
"selector":{
"_id":{
"$gt":0
},
"timestamp":{
"$gt":"2015-10-13 16:00:00"
}
}
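Putting that together with the original query, the full request would then look something like this (a sketch; the fields and sort clauses are kept as in the question):
{
  "selector": {
    "_id": {
      "$gt": 0
    },
    "timestamp": {
      "$gt": "2015-10-13 16:00:00"
    }
  },
  "fields": [
    "_id",
    "_rev"
  ],
  "sort": [
    {
      "_id": "asc"
    }
  ]
}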
I hope this works for you!

Elasticsearch not returning hits for multi-valued field

I am using Elasticsearch with no modifications whatsoever. This means the mappings, norms, and analyzed/not_analyzed settings are all default config. I have a very small data set of two items for experimentation purposes. The items have several fields, but I query only on one, which is a multi-valued/array-of-strings field. The doc looks like this:
{
  "_index": "index_profile",
  "_type": "items",
  "_id": "ega",
  "_version": 1,
  "found": true,
  "_source": {
    "clicked": [
      "ega"
    ],
    "profile_topics": [
      "Twitter",
      "Entertainment",
      "ESPN",
      "Comedy",
      "University of Rhode Island",
      "Humor",
      "Basketball",
      "Sports",
      "Movies",
      "SnapChat",
      "Celebrities",
      "Rite Aid",
      "Education",
      "Television",
      "Country Music",
      "Seattle",
      "Beer",
      "Hip Hop",
      "Actors",
      "David Cameron",
      ... // other topics
    ],
    "id": "ega"
  }
}
A sample query is:
GET /index_profile/items/_search
{
  "size": 10,
  "query": {
    "bool": {
      "should": [
        {
          "terms": {
            "profile_topics": [
              "Basketball"
            ]
          }
        }
      ]
    }
  }
}
Again, there are only two items, and the one listed should match the query because its profile_topics field contains the "Basketball" term. The other item does not match. I only get a result if I ask for clicked = ega in the should.
With Solr I would probably specify that the fields are multi-valued string arrays with no norms and no analyzer, so profile_topics values are not stemmed or tokenized, since each value should be treated as a single token (even with the spaces). I'm not sure this would solve the problem, but it is how I treat similar data in Solr.
I assume I have run afoul of some norm/analyzer/TF-IDF issue; if so, how do I solve this so that even with two items the query returns ega? If possible I'd like to solve this index- or type-wide rather than per field.
Basketball (with a capital B) in the terms query will not be analyzed. This means it will be looked up in the Elasticsearch index exactly as written.
You say you have the defaults. If so, indexing Basketball under the profile_topics field means that the actual term in the index will be basketball (with a lowercase b), which is the result of the standard analyzer. So either you set profile_topics as not_analyzed, or you search for basketball and not Basketball.
Read this about terms.
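To illustrate the second option, searching the default (analyzed) field for the lowercased term is just a matter of changing the query value; a sketch based on the query from the question:
GET /index_profile/items/_search
{
  "size": 10,
  "query": {
    "bool": {
      "should": [
        {
          "terms": {
            "profile_topics": [
              "basketball"
            ]
          }
        }
      ]
    }
  }
}
Note that multi-word topics such as "Country Music" are split into separate lowercase terms by the standard analyzer, so for exact tag-style matching the not_analyzed (.raw) approach described next is usually the better fit.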
Regarding setting all the fields to not_analyzed, you could do that with a dynamic template. With a template you can also do what Logstash does: define a .raw subfield for each string field, where only this subfield is not_analyzed. The original/parent field still holds the analyzed version of the same text, in case you want to use the analyzed field in the future.
Take a look at this dynamic template. It's the one Logstash is using.
More specifically:
{
  "template": "your_indices_name-*",
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": true,
        "omit_norms": true
      },
      "dynamic_templates": [
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "omit_norms": true,
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed"
                }
              }
            }
          }
        }
      ]
    }
  }
}
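With a template like that in place (or any mapping that adds a not_analyzed .raw subfield), the original query can target the raw subfield and match the exact, case-sensitive value; a sketch:
GET /index_profile/items/_search
{
  "size": 10,
  "query": {
    "bool": {
      "should": [
        {
          "terms": {
            "profile_topics.raw": [
              "Basketball"
            ]
          }
        }
      ]
    }
  }
}
Keep in mind that an index template only applies to indices created after it is installed, so existing data has to be reindexed before the .raw subfield exists.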
