Can Stream Analytics filter items of array property? - arrays

Hi, I wonder whether it is possible to select certain items from an array property of a JSON input in Stream Analytics and return them as an array property of a JSON output.
An example to make this clearer: I send a list of OSGi bundles running on a device, with the name, version and state of each bundle. (I leave out the rest of the content.) Sample message:
{"bundles":[{"name":"org.eclipse.osgi","version":"3.5.1.R35x_v20090827","state":32},{"name":"slf4j.log4j12","version":"1.6.1","state":4}]}
Via Stream Analytics I want to create one JSON output (event hub) for the active bundles (state == 32) and put the rest in a different output. The content of those event hubs will be processed later. In that processing I also need the original device ID, so I fetch it from the IoT Hub message properties.
So my query looks like this:
WITH Step1 AS
(
SELECT
IoTHub.ConnectionDeviceId AS deviceId,
bundles as bundles
FROM
iotHubMessages
)
SELECT
messages.deviceId AS deviceId,
bundle.ArrayValue.name AS name,
bundle.ArrayValue.version AS version
INTO
active
FROM
Step1 as messages
CROSS APPLY GetArrayElements(messages.bundles) AS bundle
WHERE
bundle.ArrayValue.state = 32
SELECT
messages.deviceId AS deviceId,
bundle.ArrayValue.name AS name,
bundle.ArrayValue.version AS version
INTO
other
FROM
Step1 as messages
CROSS APPLY GetArrayElements(messages.bundles) AS bundle
WHERE
bundle.ArrayValue.state != 32
This way there is a row for every item of the original array containing deviceId, name and version properties in the active output. So the deviceId property is copied several times, which means additional data in a message. I'd prefer a JSON with one deviceId property and one array property bundles, similar to the original JSON input.
Like active:
{"deviceid":"javadevice","bundles":[{"name":"org.eclipse.osgi","version":"3.5.1.R35x_v20090827"}]}
And other:
{"deviceid":"javadevice","bundles":[{"name":"slf4j.log4j12","version":"1.6.1"}]}
Is there any way to achieve this, that is, to filter the items of an array and return them as an array in the same format as the input? (In my code I change the number of properties, but that is not necessary.)
Thanks for any ideas!

I think you can achieve this using the Collect() aggregate function.
The only issue I see is that the deviceId property will be output inside the bundles array as well.
WITH Step1 AS
(
SELECT
IoTHub.ConnectionDeviceId AS deviceId,
bundles as bundles
FROM
iotHubMessages
),
Step2 AS
(
SELECT
messages.deviceId AS deviceId,
bundle.ArrayValue.name AS name,
bundle.ArrayValue.version AS version,
bundle.ArrayValue.state AS state
FROM
Step1 as messages
CROSS APPLY GetArrayElements(messages.bundles) AS bundle
)
SELECT deviceId, Collect() AS bundles
FROM Step2
WHERE state = 32
GROUP BY deviceId, state, System.Timestamp
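To make the target shape concrete, here is a minimal sketch in plain Python (not Stream Analytics) of the reshaping the question asks for: filter the items of the array by state and emit one record per device with a single bundles array. The function name split_bundles and its parameters are hypothetical and only serve the illustration.

```python
from collections import defaultdict


def split_bundles(device_id, message, active_state=32):
    """Split the bundle array into 'active' and 'other', one record per device."""
    groups = defaultdict(list)
    for bundle in message["bundles"]:
        key = "active" if bundle["state"] == active_state else "other"
        # Keep only name and version, as in the desired output of the question.
        groups[key].append({"name": bundle["name"], "version": bundle["version"]})
    # deviceid appears once per record, not once per bundle.
    return {key: {"deviceid": device_id, "bundles": items} for key, items in groups.items()}


message = {
    "bundles": [
        {"name": "org.eclipse.osgi", "version": "3.5.1.R35x_v20090827", "state": 32},
        {"name": "slf4j.log4j12", "version": "1.6.1", "state": 4},
    ]
}
result = split_bundles("javadevice", message)
print(result["active"])
print(result["other"])
```

Unlike this sketch, the Collect() query above gathers whole rows of Step2, which is why deviceId and state would also show up inside each array element there.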

Related

Mongo query to filter with an array specific index based on parameters

I have a few documents in a mongoDB that have the following structure
In an API web application that I am developing with Spring Boot I have to code the following query. I can receive a voltageLevelCode and filter for documents that contain this voltage level code in the array (that is easy). The problem is that I can also receive a voltageLevelCode and a type. In that case I have to filter for documents that contain this voltage level code in the array and, WITHIN that voltage level, also contain this type (remember, the type within the voltage level).
I have been trying to write the query, but I don't know how to dynamically set the index to filter the types within this voltage level. Something like:
{"voltageLevel.<TheIndexByTheDefinenVoltageLevelCode>.types" : "X" }
Example:
public List<MyClassRepresenting> findByFilter(String type,String voltageLevelCode);
{$and: [{'voltageLevel.voltageLevelCode' : ?1 },{'voltageLevel.<HowTogetIndexForSelectingVoltageLevelCode>.types' : ?2}]}
In this case, depending on the voltage level received, the type parameter must filter according to the types within that voltage level.
The same happens to me with another query. In SQL the equivalent is a SELECT within another SELECT to select the sub-records, but I have no idea how to do it in Mongo.
When asking a question on Stack Overflow, it is always useful to include what you have already tried.
I think what you need is a simple $elemMatch:
db.mycoll.find(
{ voltageLevel: { $elemMatch: { voltageLevelCode: "MT", types: "E" } } }
)
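The difference between $elemMatch and two separate dotted conditions can be sketched in plain Python. This is a simplified model of the matching semantics, not MongoDB code. The document structure (a voltageLevel array whose elements carry a voltageLevelCode and a types list) is an assumption, since the question's sample document is not shown, and elem_match and dotted_match are hypothetical helper names.

```python
def field_matches(value, wanted):
    # Simplified: equality against an array field matches if any member equals the value.
    return wanted in value if isinstance(value, list) else value == wanted


def elem_match(array, conditions):
    """True if ONE element of the array satisfies every condition."""
    return any(
        all(field_matches(elem.get(field), wanted) for field, wanted in conditions.items())
        for elem in array
    )


def dotted_match(array, conditions):
    """True if each condition is met by SOME element, not necessarily the same one."""
    return all(
        any(field_matches(elem.get(field), wanted) for elem in array)
        for field, wanted in conditions.items()
    )


# Assumed document shape, for illustration only.
doc_a = {"voltageLevel": [{"voltageLevelCode": "MT", "types": ["E", "F"]},
                          {"voltageLevelCode": "BT", "types": ["G"]}]}
doc_b = {"voltageLevel": [{"voltageLevelCode": "MT", "types": ["F"]},
                          {"voltageLevelCode": "BT", "types": ["E"]}]}

wanted = {"voltageLevelCode": "MT", "types": "E"}
print(elem_match(doc_a["voltageLevel"], wanted))
print(elem_match(doc_b["voltageLevel"], wanted))
print(dotted_match(doc_b["voltageLevel"], wanted))
```

In doc_b the code "MT" and the type "E" sit in different array elements, so only the dotted-style check accepts it. That is the case the question wants to exclude, which is why $elemMatch fits.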

Gatling: Save random attribute in another attribute

Is it possible to save an attribute at runtime and then save it as another attribute? For instance, I have an ID that is used in the URL. I've captured it from one page, but there is a list of five on the page. I can use findAll to select them all, and then ${AttributeName.random()} to select one at random.
However, how do I then save that as an attribute and use it elsewhere? It needs to be the same each time, and if I run random again it will obviously pick a different string each time.
I could do an ${AttributeName(storedRandomNumber)} but the code could start to be a little messy and was wondering if there was something a little cleaner to use?
You could add another exec() right after this request that assigns the random value you want with the session.set() method. The value is then stored in the session for that virtual user and can be reused.
Example:
val scn = scenario("scenarioName")
  .exec(
    http("<-- Name Of Request -->")
      .get("<LINK _TO_FIRST_REQ>")
      // capture every ID on the page as a list
      .check(jsonPath("$.items[*].id").findAll.optional.saveAs("ListOfAttributeNames"))
  )
  // pick one ID at random, once, and store it under its own attribute
  .exec { session =>
    val ids = session("ListOfAttributeNames").as[Seq[String]]
    // nextInt(n) returns a value from 0 until n, so the index is always valid
    session.set("randomAttributeNameSelected", ids(scala.util.Random.nextInt(ids.size)))
  }
  .exec(
    http("We use the ID here")
      .get(session => "http://domain.something.com/api/" + session("randomAttributeNameSelected").as[String])
  )
Thus, anywhere later in the same session, session("randomAttributeNameSelected").as[String] returns the same randomly chosen ID.

Zeppelin - pass variable from Spark to Markdown to generate dynamic narrative text

Is it possible to pass a variable from Spark interpreter (pyspark or sql) to Markdown? The requirement is to display a nicely formatted text (i.e. Markdown) such as "20 events occurred between 2017-01-01 and 2017-01-08" where the 20, 2017-01-01 and 2017-01-08 are dynamically populated based on output from other paragraphs.
Posting this for benefit of other users, this is what I have been able to find:
Markdown paragraphs can only contain static text.
But it is possible to achieve a dynamic formatted text output with the Angular interpreter instead.
(First paragraph)
%spark
// create data frame
val eventLogDF = ...
// register temp table for SQL access
eventLogDF.registerTempTable( "eventlog" )
val query = sql( "select max(Date), min(Date), count(*) from eventlog" ).take(1)(0)
val maxDate = query(0).toString()
val minDate = query(1).toString()
val evCount = query(2).toString()
// bind variables which can be accessed from angular interpreter
z.angularBind( "maxDate", maxDate )
z.angularBind( "minDate", minDate )
z.angularBind( "evCount", evCount )
(Second paragraph)
%angular
<div>There were <b>{{evCount}} events</b> between <b>{{minDate}}</b> and <b>{{maxDate}}</b>.</div>
You could also produce Markdown output by translating it into HTML first. This helps if you already have a Markdown template for the output, or if your Zeppelin environment has no Angular interpreter (e.g. a K8s deployment).
First, install markdown2.
%sh
pip install markdown2
And use it.
%pyspark
import markdown2
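# prepare your markdown string; template_mymarkdown is your own template
# whose {placeholders} are filled from the local variables
markdown_string = template_mymarkdown.format(**locals())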
# use Zeppelin %html for output
print("%html", markdown2.markdown(markdown_string, extras=["tables"]))
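The snippet above does not show template_mymarkdown. As a rough sketch, such a template could be an ordinary Python format string. The template text, the render helper and the placeholder names below are hypothetical, and the values are taken from the example sentence in the question.

```python
# Hypothetical template; the names in braces must match the values passed in.
template_mymarkdown = (
    "## Event summary\n"
    "\n"
    "**{ev_count} events** occurred between {min_date} and {max_date}.\n"
)


def render(template, **values):
    """Fill the {placeholders} of a Markdown template with the given values."""
    return template.format(**values)


markdown_string = render(
    template_mymarkdown,
    ev_count=20,
    min_date="2017-01-01",
    max_date="2017-01-08",
)
print(markdown_string)
```

Passing the values explicitly instead of using locals() is a design choice here: it makes it visible which variables the template depends on.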

App Engine Search API - Sort Results

I have several entities that I am searching across that include dates, and the Search API works great across all of them except for one thing - sorting.
Here's the data model for one of my entities (simplified of course):
class DepositReceipt(ndb.Expando):
#Sets creation date
creation_date = ndb.DateTimeProperty(auto_now_add=True)
And the code to create the search.Document where de is an instance of the entity:
document = search.Document(doc_id=de.key.urlsafe(),
fields=[search.TextField(name='deposit_key', value=de.key.urlsafe()),
search.DateField(name='created', value=de.creation_date),
search.TextField(name='settings', value=de.settings.urlsafe()),
])
This returns a valid document.
And finally the problem line. I took this snippet from the official GAE Search API tutorial and just changed the direction of the sort to DESCENDING and changed the search expression to created (the date property from the Document above).
expr_list = [search.SortExpression(
expression="created", default_value='',
direction=search.SortExpression.DESCENDING)]
I don't think this is important, but the rest of the search code looks like this:
sort_opts = search.SortOptions(expressions=expr_list)
query_options = search.QueryOptions(
cursor=query_cursor,
limit=_NUM_RESULTS,
sort_options=sort_opts)
query_obj = search.Query(query_string=query, options=query_options)
search_results = search.Index(name=index_name).search(query=query_obj)
In production, I get this error message:
InvalidRequest: Failed to parse search request "settings:ag5zfmdoaWRvbmF0aW9uc3IQCxIIU2V0dGluZ3MYmewDDA"; failed to parse date
Changing the expression="created" to anything else works perfectly fine. This also happens across my other entity types that use dates, so I have no idea what's going on. Advice?
I think default_value needs to be a valid date, rather than '' as you have it.

WFS GetFeature with multiple layers and different propertyNames

Suppose I have a GeoServer instance running with two layers exposed by WFS (with properties):
StreetLayer (geom, StreetName, Lanes, Length)
HouseLayer (geom, Address)
Now, if I want to query StreetLayer for all streets but only get the StreetName and Lanes properties, I'd send a GET request to this:
http://geoserver/wfs?REQUEST=GetFeature&VERSION=1.1.0&typename=StreetLayer&propertyname=StreetName,Lanes
But what if I now want to query both HouseLayer and StreetLayer? This doesn't work:
http://geoserver/wfs?REQUEST=GetFeature&VERSION=1.1.0&typename=StreetLayer,HouseLayer&propertyname=StreetName,Lanes,Address
I get an exception that says that StreetName and Lanes aren't in HouseLayer and vice versa. Do I need to make multiple requests?
EDIT:
So what I want to do is something like this:
http://geoserver/wfs?REQUEST=GetFeature&VERSION=1.1.0&typename=StreetLayer,HouseLayer&propertyname=(StreetName,Lanes),(Address)
Almost there, you just have an extra comma in propertyName. This one works against the vanilla GeoServer install:
http://localhost:8087/gswps/topp/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=topp:tasmania_cities,topp:tasmania_roads&propertyName=(ADMIN_NAME,CITY_NAME)(TYPE)
The difference: No comma between ) and (
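The per-layer grouping can be generated rather than typed by hand. Below is a small Python sketch that builds such a request URL. The function name build_getfeature_url is hypothetical, the layer and property names come from the question, and the default version 1.0.0 simply mirrors the working request above.

```python
from urllib.parse import urlencode


def build_getfeature_url(base_url, layers, version="1.0.0"):
    """Build a WFS GetFeature URL; `layers` maps each type name to its property names."""
    type_names = ",".join(layers)
    # One parenthesised group per layer, in the same order as typeName,
    # with no comma between ")" and "(".
    property_names = "".join("(" + ",".join(props) + ")" for props in layers.values())
    query = urlencode(
        {
            "service": "WFS",
            "version": version,
            "request": "GetFeature",
            "typeName": type_names,
            "propertyName": property_names,
        },
        safe="(),:",  # keep the grouping characters readable in the URL
    )
    return base_url + "?" + query


url = build_getfeature_url(
    "http://geoserver/wfs",
    {"StreetLayer": ["StreetName", "Lanes"], "HouseLayer": ["Address"]},
)
print(url)
```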
