Multiple terms referring to one IRI in JSON-LD

I want to create a context file to use for multiple data sources. Is it possible to state different terms that will refer to the exact same IRI?
For example:
{
  "@context": {
    "twitter_name": "http://schema.org/name",
    "facebook_name": "http://schema.org/name"
  }
}

If I understand your question correctly, you want to define different aliases for the same property. Without using prefixes, that would look like this:
{
  "@context": {
    "twitter_name": "http://schema.org/name",
    "facebook_name": "http://schema.org/name"
  }
}
This should be valid. In an object, the keys must be unique, but there is no such requirement for the values.
You can test it in the JSON-LD Playground.
This example uses the four ways in which the property can be specified:
{
  "@context": {
    "bi": "http://schema.org/",
    "twitter_name": "bi:name",
    "facebook_name": "bi:name"
  },
  "bi:name": "Alice (prefix)",
  "twitter_name": "Alice (alias for Twitter)",
  "facebook_name": "Alice (alias for Facebook)",
  "http://schema.org/name": "Alice (full URI)"
}
The compacted result contains an array value with the four names:
{
  "http://schema.org/name": [
    "Alice (prefix)",
    "Alice (alias for Facebook)",
    "Alice (full URI)",
    "Alice (alias for Twitter)"
  ]
}
So all the keys are correctly interpreted to be Schema.org’s name property.

Related

MongoDB: search only if it contains all ids within the array

I am searching for an array of ids inside another array. The problem is that if the field contains at least one of the ids, the document is returned; the validation should be that it has to contain all of the ids I am passing.
{ subcategories: { $in: [ ObjectId('61729d550e8fe20011cc57d2'), ObjectId('61693f589a34340012b1d5d8'), ObjectId('61693f2c9a34340012b1d5b7') ] } }
example:
subcategories: ["61729d550e8fe20011cc57d2", "61693f2c9a34340012b1d5b7"] -> this one should not match, because it contains only 2 of the 3 ids I am looking for
I think you are looking for $all.
The docs say:
The $all operator selects the documents where the value of a field is an array that contains all the specified elements.
db.collection.find({
  subcategories: {
    $all: [
      "61729d550e8fe20011cc57d2",
      "61693f589a34340012b1d5d8",
      "61693f2c9a34340012b1d5b7"
    ]
  }
})

Using $rename in MongoDB for an item inside an array of objects

Consider the following MongoDB collection of a few thousand Objects:
{
  _id: ObjectId("xxx"),
  FM_ID: "123",
  Meter_Readings: [
    { Date: 2011-10-07, Begin_Read: true, Reading: 652 },
    { Date: 2018-10-01, Begin_Reading: true, Reading: 851 }
  ]
}
The wrong key ("Begin_Reading") was entered for the 2018 entry in the array and needs to be renamed to "Begin_Read". I have a list, from another aggregate, of all the objects that have the incorrect key. The objects within the array don't have an _id value, so they are hard to select. I was thinking I could iterate through the collection, find the array index of the errored readings, and use the _id of the document to perform the $rename on the key.
I am trying to get the index of the array, but cannot seem to select it correctly. The following aggregate is what I have:
[
  {
    '$match': {
      '_id': ObjectId('xxx')
    }
  }, {
    '$project': {
      'index': {
        '$indexOfArray': [
          '$Meter_Readings', {
            '$eq': [
              '$Meter_Readings.Begin_Reading', True
            ]
          }
        ]
      }
    }
  }
]
Its result is always -1, which I think means my expression must be wrong, as the expected result is 1.
I'm using Python for this script (I can use JavaScript as well). If there is a better way to do this (maybe a filter?), I'm open to alternatives; this is just what I've come up with.
I fixed this myself. I was close with the aggregate, but I needed to look at a different field, because for some reason that one did not work:
{
  '$project': {
    'index': {
      '$indexOfArray': [
        '$Meter_Readings.Water_Year', 2018
      ]
    }
  }
}
What I did learn is that to find an object within an array, you can reference its fields through the array identifier in the $indexOfArray method. I hope that might help someone else.
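
For completeness, the original goal was the rename itself, and $rename cannot reach inside arrays. Below is an untested sketch of one way to do it with a pipeline-style update (this assumes MongoDB 4.2+; db.collection and the field names are taken from the question): each array element is converted to key/value pairs, the misspelled key is rewritten, and the element is rebuilt.
// Sketch (untested): rename Begin_Reading to Begin_Read inside every
// element of Meter_Readings, using a pipeline-style update (MongoDB 4.2+).
db.collection.updateMany(
  { "Meter_Readings.Begin_Reading": { $exists: true } },
  [{
    $set: {
      Meter_Readings: {
        $map: {
          input: "$Meter_Readings",
          as: "r",
          in: {
            // Decompose the element, rewrite the key, recompose it.
            $arrayToObject: {
              $map: {
                input: { $objectToArray: "$$r" },
                as: "kv",
                in: {
                  k: {
                    $cond: [
                      { $eq: ["$$kv.k", "Begin_Reading"] },
                      "Begin_Read",
                      "$$kv.k"
                    ]
                  },
                  v: "$$kv.v"
                }
              }
            }
          }
        }
      }
    }
  }]
)
This avoids computing indexes entirely: every element is rewritten, and only those carrying the bad key actually change.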

Spark get datatype of nested object

I have some JSON data which looks like this:
{
  "key1": "value1",
  "key2": [
    1,
    2,
    3
  ],
  "key3": {
    "key31": "value31",
    "key32": "value32"
  },
  "key4": [
    {
      "key41": "value411",
      "key42": "value412",
      "key43": "value413"
    },
    {
      "key41": "value421",
      "key42": "value422",
      "key43": "value423"
    }
  ],
  "key5": {
    "key51": [
      {
        "key511": "value511",
        "key512": "value512",
        "key513": "value513"
      },
      {
        "key511": "value521",
        "key512": "value522",
        "key513": "value523"
      }
    ]
  },
  "key6": {
    "key61": {
      "key611": [
        {
          "key_611": "value_611",
          "key_612": "value_612",
          "key_613": "value_613"
        },
        {
          "key_611": "value_621",
          "key_612": "value_622",
          "key_613": "value_623"
        },
        {
          "key_611": "value_621",
          "key_612": "value_622",
          "key_613": "value_623"
        }
      ]
    }
  }
}
It contains a mix of simple, complex, and array-type values.
If I try to get the datatype of key1 with schema("key1").dataType, I get StringType, and likewise for key2, key3 and key4.
For key5 also, I get StructType.
But when I try to get the datatype of key51, which is nested under key5, using schema("key5.key51").dataType, I get the following error:
java.lang.IllegalArgumentException: Field "key5.key51" does not exist.
at org.apache.spark.sql.types.StructType$$anonfun$apply$1.apply(StructType.scala:264)
at org.apache.spark.sql.types.StructType$$anonfun$apply$1.apply(StructType.scala:264)
at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
at scala.collection.AbstractMap.getOrElse(Map.scala:59)
at org.apache.spark.sql.types.StructType.apply(StructType.scala:263)
... 48 elided
My main intention is to explode a given column if it is of ArrayType, and not explode it for any other type.
The explode function recognizes the given key (key5.key51) properly and explodes the array; the problem is determining the datatype.
One possible solution for me is to select key5.key51 as a separate column key51 and then explode that column.
But is there any better and more elegant way of doing this while still being able to determine the datatype of the given column?
The simplest solution is to select the field of interest, and then retrieve the schema:
df.select("key5.key51").schema.head.dataType
Using the full schema directly would require traversing the schema, which might be hard to do right in the presence of embedded dots in field names, nested StructTypes, and complex types (Maps and Arrays).
Here is some (recursive) code to find all ArrayType fields names:
import org.apache.spark.sql.types._

def findArrayTypes(parents: Seq[String], f: StructField): Seq[String] = {
  f.dataType match {
    case array: ArrayType => parents
    case struct: StructType =>
      struct.fields.toSeq.flatMap(child => findArrayTypes(parents :+ child.name, child))
    case _ => Seq.empty[String]
  }
}
val arrayTypeColumns = df.schema.fields.toSeq
  .map(f => findArrayTypes(Seq(f.name), f))
  .filter(_.nonEmpty)
  .map(_.mkString("."))
For your dataframe, this gives:
arrayTypeColumns.foreach(println)
key2
key4
key5.key51
key6.key61.key611
This does not work yet for arrays inside maps or nested arrays.
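
Tying this back to the original goal, here is a hypothetical usage sketch (df and the column name are taken from the question): explode a column only when it shows up in arrayTypeColumns.
// Hypothetical usage: explode the column only if it was detected as an ArrayType.
import org.apache.spark.sql.functions.{col, explode}

val target = "key5.key51" // column of interest, from the question

val result =
  if (arrayTypeColumns.contains(target)) df.select(explode(col(target)))
  else df.select(col(target))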

Using $in to return all documents that match list

I'm using MongoDb. I'm trying to execute a query to search a collection and return all documents that match a list of names (strings).
Here is my query:
{
db.employees.find({ "n": { "$in": ["Alice", "Mary"]}})
}
But I'm getting the error:
"Field names in dot notation need to be in quotes at line 2, col 22"
That is the character right before "n". What gives? Thanks!
I figured it out, doh. When using Studio 3T, the collection is implied. As such, you only need to type:
{ "n": { "$in": ["Alice", "Mary"]}}
not
db.employees.find({ "n": { "$in": ["Alice", "Mary"]}})
Silly error; leaving this up in case someone finds it useful.

JSON-LD: defining a type for a node

I have a simple json file like:
{
  "name": "something"
}
Now I have a JSON-LD definition containing objects. There is an object with id #something; let's say it exists at http://example.com/test.jsonld#something.
Now I want to add a context, without modifying the original data, so that name becomes a type and its value becomes the IRI http://example.com/test.jsonld#something.
I've done something like this:
{
  "@context": {
    "name": "@type"
  },
  "@id": "1234",
  "name": "something"
}
This gives me almost what I want in the JSON-LD Playground:
{
  "@id": "1234",
  "@type": "http://json-ld.org/playground/something"
}
How do I add a context so that the value "something" is expanded to the IRI http://example.com/test.jsonld#something instead of the playground URL?
I tried with "@base", but it also changes the @id to a URL.
You can use terms (strings mapped to IRIs) as values of @type. As you already alias name to @type, all you need to do is add a mapping from something to http://example.com/test.jsonld#something:
{
  "@context": {
    "name": "@type",
    "something": "http://example.com/test.jsonld#something"
  },
  "@id": "1234",
  "name": "something"
}
I tried with "@base", but it also changes the @id to a URL.
The value of @id is always an IRI. It is just not expanded if you don't have a base ("@base": null).
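For illustration, a minimal sketch of that last point (the same document as above, with the base nulled out), so that "1234" is not resolved against the document's base IRI:
{
  "@context": {
    "@base": null,
    "name": "@type",
    "something": "http://example.com/test.jsonld#something"
  },
  "@id": "1234",
  "name": "something"
}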
