Background
I am using wiremock-jre8-standalone-2.35.0.jar
I want it to return a 200 response if the incoming request's array contains any values:
{
"field1": "data1",
"array": [
{...},
{...},
...
],
"field2": "data2",
"field3": "data3",
"field4": "data4",
"field5": "data5"
}
I want it to return a 400 response if the incoming request's array is empty:
{
"field1": "data1",
"array": [],
"field2": "data2",
"field3": "data3",
"field4": "data4",
"field5": "data5"
}
WireMock should match the incoming request against the "request": {...} part of the mapping below:
{
"id": "...",
"request": {
"urlPattern": "...",
"method": "POST",
"headers": {...},
"bodyPatterns": [
{
"matchesJsonPath": "$[?(#.length < 1)]"
}
]
},
"response": {
"status": 400,
"bodyFileName": "...",
"headers": {...}
},
"uuid": "..."
}
Problem
WireMock is rejecting my JSONPath expression in the bodyPatterns array:
[{"matchesJsonPath":"$[?(#.length < 1)]"}] is not a valid match operation
Yet the expression appears to be valid according to https://jsonpath.com/:
JSONPath
---
$[?(#.length < 1)]
Inputs
---
{
"field1": "data1",
"array": [],
"field2": "data2",
"field3": "data3",
"field4": "data4",
"field5": "data5"
}
Evaluation Results
---
[
[]
]
...What gives?
This is what gives:
JSON Path
Deems a match if the attribute value is valid JSON and matches the JSON Path expression supplied. A JSON body will be considered to match a path expression if the expression returns either a non-null single value (string, integer etc.), or a non-empty object or array.
https://wiremock.org/docs/request-matching/
My JSON Path expression returns an empty array, so it cannot be considered to match the path expression.
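A workaround that follows from that rule is to make the expression return something non-empty exactly when the array is empty. WireMock's JSON Path matching is backed by Jayway JsonPath, which offers an empty filter operator, so a stub along these lines might work (untested sketch; "array" is the field name from the example request above):
{
  "request": {
    "urlPattern": "...",
    "method": "POST",
    "bodyPatterns": [
      { "matchesJsonPath": "$[?(@.array empty true)]" }
    ]
  },
  "response": {
    "status": 400
  }
}
For the 200 stub, an expression such as "$.array[0]" only returns a value when the array has at least one element, so it matches exactly the non-empty case.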
Related
I have a simple JSON schema that looks like this (and works):
{
"cols": {
"type": "array",
"items": {
"type": "string",
"enum": [
"id",
"name",
"age",
"affiliation",
""
]
},
"additionalProperties": false
}
}
I would like the enum to allow the values listed above plus an optional decoration, so that any of the following would be allowed:
"enum" = [
"id",
"lower(name)",
"average(age)",
"distinct(affiliation)",
""
]
In other words, for cols
cols=id would be valid but no further decoration would be allowed around id
cols=name and cols=lower(name) would be valid
cols=age and cols=average(age) would be valid
cols=affiliation and cols=distinct(affiliation) would be valid
cols='' empty string would be valid
Specifying the decorations as patterns would be great so that they would be case-insensitive. For example, cols=lower(name) and cols=LOWER(name) would both be ok.
You can change your enumerated list in enum to a list of patterns:
"items": [
"type": "string",
"anyOf": [
{ "pattern": "^cols\b...the rest of your pattern here...$" },
{ etc... }
]
]
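For the specific columns and decorations described above, a concrete sketch could look like this (these patterns are mine, not the original answer's; the character classes stand in for case-insensitive matching, since JSON Schema patterns have no portable ignore-case flag):
{
  "cols": {
    "type": "array",
    "items": {
      "type": "string",
      "anyOf": [
        { "pattern": "^id$" },
        { "pattern": "^(name|[Ll][Oo][Ww][Ee][Rr]\\(name\\))$" },
        { "pattern": "^(age|[Aa][Vv][Ee][Rr][Aa][Gg][Ee]\\(age\\))$" },
        { "pattern": "^(affiliation|[Dd][Ii][Ss][Tt][Ii][Nn][Cc][Tt]\\(affiliation\\))$" },
        { "pattern": "^$" }
      ]
    },
    "additionalProperties": false
  }
}
With this in place, cols=name and cols=LoWeR(name) would both validate, while something like cols=lower(id) would be rejected.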
I'm trying to parse some data in NiFi (1.7.1) using the UpdateRecord processor.
The original data are JSON files that I would like to convert to Avro, based on a schema.
The Avro conversion is OK, but as part of that conversion I also need to parse one array element from the JSON data into a different structure in Avro.
This is a sample data of the input json:
{ "geometry" : {
"coordinates" : [ [ 4.963087975800593, 45.76365595859971 ], [ 4.962874487781098, 45.76320922779652 ], [ 4.962815443439148, 45.763116079159374 ], [ 4.962744732112515, 45.763010484202866 ], [ 4.962096825239138, 45.762112721939246 ] ]} ...}
This is its schema (specified in the RecordReader):
{ "type": "record",
"name": "features",
"fields": [
{
"name": "geometry",
"type": {
"type": "record",
"name": "geometry",
"fields": [
{
"name": "coordinatesJson",
"type": {
"type": "array",
"items": {
"type": "array",
"items": "double"
}
}
},
]
}
},
....
]
}
As you can see, coordinates is an array of arrays.
And I need to parse that data into Avro, based on this schema (specified in the RecordWriter):
{
"name": "outputdata",
"type": "record",
"fields": [
{"name": "coordinatesAvro",
"type": {
"type": "array",
"items" : {
"type" : "record",
"name" : "coordinatesAvro",
"fields" : [ {
"name" : "X",
"type" : "double"
}, {
"name" : "Y",
"type" : "double"
} ]
}
}
},
.....
]
}
The problem here is that I'm not able to map from coordinatesJson to coordinatesAvro using RecordPath functions.
I tried several mappings, like:
Property                      Value
/coordinatesJson[0..-1]/X     /geometry/coordinatesAvro[*][0]
/coordinatesJson[0..-1]/Y     /geometry/coordinatesAvro[*][1]
It should be a pretty straightforward parsing step, but as I said, I've been going in circles trying to achieve this for a while.
Any help would be really appreciated.
When I run into something like that, I do the following:
1) Transform the JSON into JSON with the structure I need (in your case: coordinatesAvro) using an ExecuteScript processor, as sketched below. I have used ECMAScript because you can simply parse the JSON and work with the objects directly (transform them).
2) ConvertJSONToAvro with one common schema (coordinatesAvro in your case) for the Reader and Writer.
It works very well and I have used it on big data cases. This is one possible resolution for your problem.
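For step 1, a minimal ExecuteScript (ECMAScript) sketch could look like the following. It is untested, and the output field name coordinatesAvro plus the decision to drop the original geometry element are assumptions based on the schemas above:
// Untested sketch for ExecuteScript (Script Engine: ECMAScript).
// Reads the flow file, reshapes geometry.coordinates ([[x, y], ...])
// into coordinatesAvro ([{X: x, Y: y}, ...]) and writes the result back.
var InputStreamCallback = Java.type("org.apache.nifi.processor.io.InputStreamCallback");
var OutputStreamCallback = Java.type("org.apache.nifi.processor.io.OutputStreamCallback");
var IOUtils = Java.type("org.apache.commons.io.IOUtils");
var StandardCharsets = Java.type("java.nio.charset.StandardCharsets");

var flowFile = session.get();
if (flowFile != null) {
    var text = null;
    // read the incoming JSON content
    session.read(flowFile, new InputStreamCallback(function (inputStream) {
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
    }));

    var record = JSON.parse(text);
    // turn each [x, y] pair into an {X, Y} record
    record.coordinatesAvro = record.geometry.coordinates.map(function (pair) {
        return { "X": pair[0], "Y": pair[1] };
    });
    delete record.geometry; // assumption: only the reshaped structure is kept

    // write the transformed JSON back to the flow file
    flowFile = session.write(flowFile, new OutputStreamCallback(function (outputStream) {
        outputStream.write(JSON.stringify(record).getBytes(StandardCharsets.UTF_8));
    }));
    session.transfer(flowFile, REL_SUCCESS);
}
The conversion step can then use the coordinatesAvro schema shown above as its single common schema.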
I have a collection in MongoDB which has a field called "geometry" with latitude and longitude, like this:
{
"abc":"xyz",
"geometry" : [
{
"lat" : 45.0,
"lng" : 25.0
},
{
"lat" : 46.0,
"lng" : 26.0
}
]
}
I want to convert the field geometry into something like this, to be compliant with the GeoJSON format:
{
"abc":"xyz",
"geometry": {
"type": "LineString",
"coordinates": [
[
25.0,
45.0
],
[
26.0,
46.0
]
]
}
}
The operation essentially involves taking an array of objects with two K/V pairs, picking only the values, and storing them as an array of arrays (with the order reversed, so the value of "lng" comes first).
My failed attempts:
I tried using an aggregation and projecting the following:
"geometry": {"type":"LineString", "coordinates":["$points.lng","$points.lat"] }
which gave me a result similar to:
"geometry": {
"type": "LineString",
"coordinates": [
[
25.0,
26.0
],
[
45.0,
46.0
]
]
}
I've tried working with this and modifying the data record by record, but the results are not consistent, and I'm trying to avoid going through every record and changing the structure one by one. Is there a way to do this efficiently?
You would think that the following code should theoretically work:
db.collection.aggregate({
$project: {
"abc": 1, // include the "abc" field in the output
"geometry": { // add a new geometry sub-document
"type": "LineString", // with the hardcoded "type" field
"coordinates": {
$map: {
"input": "$geometry", // transform each item in the "geometry" array
"as": "this",
"in": [ "$$this.lng", "$$this.lat" ] // into an array of values only, ith "lng" first, "lat" second
}
}
}
}
}, {
$out: "result" // creates a new collection called "result" with the transformed documents in it
})
However, as per SERVER-37635, the way MongoDB works at this stage means that the above query results in a surprising output where the coordinates field contains the desired result several times. So the following query can be used to generate the desired output instead:
db.collection.aggregate({
$addFields: {
"abc": 1,
"geometry2": {
"type": "LineString",
"coordinates": {
$map: {
"input": "$geometry",
"as": "this",
"in": [ "$$this.lng", "$$this.lat" ]
}
}
}
}
}, {
$project: {
"abc": 1,
"geometry": "$geometry2"
}
}, {
$out: "result"
})
In the comments section of the JIRA ticket mentioned above, Charlie Swanson mentions another workaround which uses $let to "trick" MongoDB into interpreting the query in the desired way. I re-post it here (note that it's missing the $out part):
db.collection.aggregate([
{
$project: {
"abc": 1,
"geometry": {
$let: {
vars: {
ret: {
"type": "LineString",
"coordinates": {
$map: {
"input": "$geometry",
"as": "this",
"in": [
"$$this.lng",
"$$this.lat"
]
}
}
}
},
in: "$$ret"
}
}
}
}
])
Ex.
"docs": [
{
"id": "f37914",
"index_id": "some_index",
"field_1": [
{
"Some value",
"boost": 20.
}
]
},
]
If 'field_1' is matched, then boost by the corresponding 'boost' field.
Boost what? The document? The specific field? You can do either.
Anyway, the way to do it is to use Function Queries:
https://lucene.apache.org/solr/guide/6_6/function-queries.html#FunctionQueries-AvailableFunctions
For example, if you want to boost the document (and assuming the score should be 0 when the value doesn't match), you can do something like this:
q=_val_:"if(query($q1), field(boost), 0)"&q1=field_1:"Some Value"
_val_ is just a hook into Solr's function queries: query() returns the score of q1 for documents that match it, field() is a simple function that just returns the value of the field itself, and if() lets us join the two together.
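Put together against a hypothetical core named mycore, the full request could look like this one-liner (it assumes boost is a stored numeric field on the documents and that field_1 is searchable):
/solr/mycore/select?q=_val_:"if(query($q1),%20field(boost),%200)"&q1=field_1:"Some%20Value"&fl=id,score
Documents that match q1 are scored by their boost value; everything else gets a score of 0.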
So what I ended up doing is using Lucene payloads and Solr 6.6's new DelimitedPayloadTokenFilter feature.
First I created a terms field with the following configuration:
{
"add-field-type": {
"name": "terms",
"stored": "true",
"class": "solr.TextField",
"positionIncrementGap": "100",
"indexAnalyzer": {
"tokenizer": {
"class": "solr.KeywordTokenizerFactory"
},
"filters": [
{
"class": "solr.LowerCaseFilterFactory"
},
{
"class": "solr.DelimitedPayloadTokenFilterFactory",
"encoder": "float",
"delimiter": "|"
}
]
},
"queryAnalyzer": {
"tokenizer": {
"class": "solr.KeywordTokenizerFactory"
},
"filters": [
{
"class": "solr.LowerCaseFilterFactory"
},
{
"class": "solr.SynonymGraphFilterFactory",
"ignoreCase": "true",
"expand": "false",
"tokenizerFactory": "solr.KeywordTokenizerFactory",
"synonyms": "synonyms.txt"
}
]
}
},
"add-field" : {
"name":"terms",
"type":"terms",
"stored": "true",
"multiValued": "true"
}
}
I indexed my documents likes so:
[
{
"id" : "1",
"terms" : [
"some term|10.0",
"another term|60.0"
]
}
,
{
"id" : "2",
"terms" : [
"some term|11.0",
"another term|21.0"
]
}
]
I used Solr's function query support to query for a match on terms, grab the attached boost payload, and apply it to the relevancy score:
/solr/payloads/select?indent=on&wt=json&q={!payload_score%20f=terms%20v=$payload_term%20func=max}&fl=id,score&payload_term=some+term
I have defined a field named "value" of type JSON (among some other fields) in a resource datastore. If I run upserts using simple values or non-empty arrays, everything works OK:
POST http://host/api/3/action/datastore_upsert
{
"resource_id": "...",
"records": [
{ "value": [ "1", "2" ] }
],
"method": "insert",
"force": "True"
}
POST http://host/api/3/action/datastore_upsert
{
"resource_id": "...",
"records": [
{ "value": "23" }
],
"method": "insert",
"force": "True"
}
However, if I use an empty array:
POST http://host/api/3/action/datastore_upsert
{
"resource_id": "...",
"records": [
{ "value": [ ] }
],
"method": "insert",
"force": "True"
}
I get the following error:
{
...
"success": false,
"error": {
"info": {
"orig": [
"malformed record literal: \"{}\"\nLINE 2: VALUES (NULL, NULL, NULL, NULL, '{}', NULL, to_t...\n ^\nDETAIL: Missing left parenthesis.\n"
]
},
"__type": "Validation Error",
"data": "(DataError) malformed record literal: \"{}\"\nLINE 2: VALUES (NULL, NULL, NULL, NULL, '{}', NULL, to_t...\n ^\nDETAIL: Missing left parenthesis.\n"
}
}
Given that [] is a valid value in JSON, I wonder why this error happens. Is it a known issue/bug in the CKAN datastore API?
Thanks!
You have found a bug. I fixed it and sent a pull request: https://github.com/ckan/ckan/pull/1776. This will be reviewed by another core dev and will be merged soon.
If you have some time, it would be useful if you could test that branch to confirm that it solves this issue. If you do so, please add a comment on the pull request.
Cheers!