Iterate over a JSONArray in Scala

I'm pretty new to the Scala language, so I need some help here.
I have this JSONArray (org.json is the name of the package):
[{"id":"HomePDA"},{"id":"House2"},{"id":"House"},{"id":"7c587a4b-851d-4aa7-a61f-dfdae8842298","value":"xxxxxxxxxxx"},{"id":"Home"}]
If this were in Java, I could solve this with a foreach loop, but I can't find an equivalent here. I only need to get the JSONObjects out of this array.
Is that possible, or do I need to change the data structure? I'd prefer the first option; the second would be a bit of a mess.
Thank you in advance.

Something like this should do:
val objects = (0 until jsonArray.length).map(jsonArray.getJSONObject)
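Alternatively, recent versions of org.json make JSONArray implement java.lang.Iterable, so (assuming such a version and Scala 2.13+) you can convert it to a Scala collection directly; a minimal sketch:
import org.json.{JSONArray, JSONObject}
import scala.jdk.CollectionConverters._

// JSONArray iterates as plain java.lang.Object elements, so collect
// narrows them down to the JSONObjects in one pass.
val jsonObjects = jsonArray.asScala.collect { case o: JSONObject => o }
jsonObjects.foreach(o => println(o.getString("id")))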

I would introduce a case class House to help extract the data.
import org.json._
import scala.util.{Try, Success, Failure}

case class House(id: String, value: String)

val jsonArray = new JSONArray("""[
  {"id":"HomePDA"},
  {"id":"House2"},
  {"id":"House"},
  {"id":"7c587a4b-851d-4aa7-a61f-dfdae8842298", "value":"xxxxxxxxxxx"},
  {"id":"Home"}]""")

val objects = (0 until jsonArray.length).map(jsonArray.getJSONObject)

// getString throws a JSONException when a key is missing, so the
// construction is wrapped in Try; only objects carrying both "id"
// and "value" yield a Success.
val houses = objects.map(s => Try(House(s.getString("id"), s.getString("value"))))

houses.foreach {
  case Success(house)     => println(house.value)
  case Failure(exception) => Console.err.println(s"Error: $exception")
}
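If you'd rather not drive the control flow with exceptions, org.json also offers opt* accessors that return a default instead of throwing. A small variant of the same extraction that keeps only the objects actually carrying a "value":
// optString returns the fallback instead of throwing a JSONException,
// so objects without a "value" can be filtered out without a Try.
val housesWithValue = objects.flatMap { o =>
  val value = o.optString("value", "")
  if (value.nonEmpty) Some(House(o.getString("id"), value)) else None
}
housesWithValue.foreach(h => println(h.value))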

Related

GenericRowWithSchema ClassCastException in Spark 3 Scala UDF for Array data

I am writing a Spark 3 UDF to mask an attribute in an Array field.
My data (stored in Parquet, but shown here in JSON format):
{"conditions":{"list":[{"element":{"code":"1234","category":"ABC"}},{"element":{"code":"4550","category":"EDC"}}]}}
Case classes:
case class MyClass(conditions: Seq[MyItem])
case class MyItem(code: String, category: String)
Spark code:
import spark.implicits._
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, udf}

val data = Seq(MyClass(conditions = Seq(MyItem("1234", "ABC"), MyItem("4550", "EDC"))))
val rdd = spark.sparkContext.parallelize(data)
val ds = rdd.toDF().as[MyClass]

val maskedConditions: Column = updateArray.apply(col("conditions"))

ds.withColumn("conditions", maskedConditions)
  .select("conditions")
  .show(2)
I tried the following UDF:
def updateArray = udf((arr: Seq[MyItem]) => {
  for (i <- 0 to arr.size - 1) {
    // Line 3: cast the element to GenericRowWithSchema
    val a = arr(i).asInstanceOf[org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema]
    // val a = arr(i) // variant without the cast; fails with the error below
    println(a.getAs[MyItem](0))
    // TODO: How to set code = "XXXX" here?
    // a.code = "XXXX"
  }
  arr
})
Goal:
I need to set 'code' field value in each array item to "XXXX" in a UDF.
Issue:
I am unable to modify the array fields.
I also get the following error if I remove line 3 in the UDF (the cast to GenericRowWithSchema) and use arr(i) directly.
Error:
Caused by: java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be cast to MyItem
Question: How do I receive an Array of Structs in a function, and how do I return a modified array of items?
Welcome to Stack Overflow!
There is a small JSON linting error in your data: I assumed that you wanted to close the square brackets of the list array. So for this example I used the following data (which is the same as yours):
{"conditions":{"list":[{"element":{"code":"1234","category":"ABC"}},{"element":{"code":"4550","category":"EDC"}}]}}
You don't need UDFs for this: a simple map operation will be sufficient! The following code does what you want:
import spark.implicits._

case class MyItem(code: String, category: String)
case class MyElement(element: MyItem)
case class MyList(list: Seq[MyElement])
case class MyClass(conditions: MyList)

val df = spark.read.json("./someData.json").as[MyClass]

val transformedDF = df.map {
  case MyClass(MyList(list)) => MyClass(MyList(list.map {
    case MyElement(item) => MyElement(MyItem(code = "XXXX", item.category))
  }))
}
transformedDF.show(false)
+--------------------------------+
|conditions |
+--------------------------------+
|[[[[XXXX, ABC]], [[XXXX, EDC]]]]|
+--------------------------------+
As you can see, we're doing some simple pattern matching on the case classes we've defined and replacing all of the code fields' values with "XXXX". If you want to get JSON back, you can call the to_json function like so:
import org.apache.spark.sql.functions.to_json
transformedDF.select(to_json($"conditions")).show(false)
+----------------------------------------------------------------------------------------------------+
|structstojson(conditions) |
+----------------------------------------------------------------------------------------------------+
|{"list":[{"element":{"code":"XXXX","category":"ABC"}},{"element":{"code":"XXXX","category":"EDC"}}]}|
+----------------------------------------------------------------------------------------------------+
Finally, a very small remark about the data. If you have any control over how the data is produced, I would add the following suggestions:
The conditions JSON object seems to serve no purpose here, since it just contains a single array called list. Consider making conditions the array itself, which would allow you to discard the list name. That would simplify your structure.
The element object does nothing except contain a single item. Consider removing one level of abstraction there too.
With these suggestions, your data would contain the same information but look something like:
{"conditions":[{"code":"1234","category":"ABC"},{"code":"4550","category":"EDC"}]}
With these suggestions, you would also remove the need for the MyElement and MyList case classes! But very often we're not in control of the data we receive, so this is just a small disclaimer :)
Hope this helps!
EDIT: After you added simplified data following the above suggestions, the task gets even easier. Again, you only need a map operation here:
import spark.implicits._

case class MyItem(code: String, category: String)
case class MyClass(conditions: Seq[MyItem])

val data = Seq(MyClass(conditions = Seq(MyItem("1234", "ABC"), MyItem("4550", "EDC"))))
val df = data.toDF().as[MyClass]

val transformedDF = df.map {
  case MyClass(conditions) => MyClass(conditions.map {
    item => MyItem("XXXX", item.category)
  })
}
transformedDF.show(false)
+--------------------------+
|conditions |
+--------------------------+
|[[XXXX, ABC], [XXXX, EDC]]|
+--------------------------+
I was able to find a simple solution with Spark 3.1+, which adds the Column.withField method used below.
Updated code:
val data = Seq(
  MyClass(conditions = Seq(MyItem("1234", "ABC"), MyItem("234", "KBC"))),
  MyClass(conditions = Seq(MyItem("4550", "DTC"), MyItem("900", "RDT")))
)

import spark.implicits._
import org.apache.spark.sql.functions.{col, transform, udf}

val ds = data.toDF()

// transform maps over the array elements; withField replaces just the
// "code" field inside each struct, leaving "category" untouched.
val updatedDS = ds.withColumn(
  "conditions",
  transform(
    col("conditions"),
    x => x.withField("code", updateArray(x.getField("code")))))
updatedDS.show()
UDF:
def updateArray = udf((oldVal: String) => {
  // Mask only the sensitive code; pass everything else through.
  if (oldVal.contains("1234")) "XXX"
  else oldVal
})
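As an aside, the UDF could arguably be dropped entirely: the same masking can be expressed with built-in Column functions, which lets Catalyst optimize the whole expression instead of treating the UDF as a black box. A sketch against the same ds as above:
import org.apache.spark.sql.functions.{col, transform, when}

// Pure-Column version of the masking above: no UDF involved.
val maskedDS = ds.withColumn(
  "conditions",
  transform(
    col("conditions"),
    x => x.withField("code",
      when(x.getField("code").contains("1234"), "XXX")
        .otherwise(x.getField("code")))))
maskedDS.show()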

How to get IDs from a JSON array in Groovy using the each method?

I'm trying to get a list of IDs from a JSON array in Groovy. I know how to get the IDs using a regular for loop, but I would like to do the same with the each method, and I'm not sure how to implement that. Does anyone have any idea?
Thank you in advance. Here's my code, which works just fine using the regular for loop:
import groovy.json.*

def restresponse = '[{"id":5, "name":"Bob"},{"id":8, "name":"John"},{"id":12, "name":"Jim"},{"id":20, "name":"Sally"}]'
def json = new JsonSlurper().parseText(restresponse)
def myListOfIDs = []
for (int i = 0; i < json.size; i++) {
    myListOfIDs.add(json[i].id) // getting all IDs for each SourceSystem
}
log.info(myListOfIDs) // this prints out all the IDs
The shortest way to perform this "conversion" is Groovy's Collection collect method, e.g.:
def myListOfIDs = json.collect { ele -> ele.id }
EDIT: As pointed out by @dmahapatro, there's an even shorter possibility:
def myListOfIDs = json*.id

Convert a Parse.com JSON array into an Array with Swift

Can someone please help me with this? I saved my data in Parse.com in a column of type array (example: ["11:30","12:45","13:02"], just a list of times as strings). I have tried to get this data with Swift:
var take: NSMutableArray!
var query = PFQuery(className: "test")
query.getObjectInBackgroundWithId("QZ6Y8Oljc5") {
    (testData: PFObject!, error: NSError!) -> Void in
    if (error == nil) {
        take = testData["workday"]
        println(take)
    } else {
        println(error)
    }
}
The problem is that I only get a JSON-array type back:
(
    "11:30",
    "12:45",
    "13:02"
)
How can I convert it into an NSArray so it could be like:
var myArray = ["11:30","12:45","13:02"]
Thank you for any suggestions; I've tried every method I found here, but without any results.
The problem with JSON data is that it comes back as its own structure that has to be sifted and groomed. Normally people end up writing huge nested if statements, which looks messy. Luckily, someone created a library that sifts through JSON data and gives you back usable types (Int, Array, String) by means of a massive switch table:
https://github.com/SwiftyJSON/SwiftyJSON
Look it up; it should help. Once you have it implemented, you can call it by typing:
let json = JSON(data: JSONData)
Then to sift through, you use subscripts (depending on the data, you index with an Int or a String):
let workdays = json["workday"]
// subscript with an Int:
let firstWorkday = json["workday"][0]
// subscript with a String:
let workdayTime = json["workday"]["time"]
and so on. However, you will need to cast the value once you've singled out the data:
let firstWorkday = json["workday"][0].stringValue
// printing it would give 11:30
Although personally I use .description, since sometimes when I sift through the whole array it's a mix of types:
let firstWorkday = json["workday"][0].description
println(firstWorkday)
// would literally print "11:30", including the quotation marks
Then I use string methods to trim the quotations and cast it to whatever type I need. But it's up to your creativity once you figure out how it works.

How to iterate over a list of type Class to edit the properties of its objects in Groovy

I know there are more elaborate ways to achieve this in Java, but Groovy should have a concise way to do the same as per http://groovy.codehaus.org/Looping
Currency.groovy:
class Currency {
    String name
    double rate
}
CurrencyController:
def select() {
    List<Currency> selectedCurrencies = Currency.getAll(params.currencies)
    selectedCurrencies.eachWithIndex { obj, i -> obj.rate = update(obj.name) }
    [selectedCurrencies: selectedCurrencies]
}

def update(String sym) {
    return sym
}
The above code throws:
No signature of method: currencychecker.CurrencyController$_$tt__select_closure12.doCall() is applicable for argument types: (currencychecker.Currency)
Thanks to @dmahapatro, the issue was that I was indexing with obj[i], even though obj itself is the iterated object. The rest was correct!
I experimented with selectedCurrencies.each as well instead of selectedCurrencies.eachWithIndex; however, the right one in this case is eachWithIndex.

Is there a way to convert a struct into an array without using a loop?

I'm curious: is there another way to convert a struct into an array in ColdFusion without looping over it? I know it can be done this way with a for-in loop:
local.array = [];
for (local.value in local.struct) {
    arrayAppend(local.array, local.value);
}
Does StructKeyArray() suit your requirements? From its description: "Finds the keys in a ColdFusion structure."
If you are trying to maintain order in your structure you could always use a Java LinkedHashMap like so:
cfmlLinkedMap = createObject("Java", "java.util.LinkedHashMap").init();
cfmlLinkedMap["a"] = "Apple";
cfmlLinkedMap["b"] = "Banana";
cfmlLinkedMap["c"] = "Carrot";
for (key in cfmlLinkedMap) {
    writedump(cfmlLinkedMap[key]);
}
You could also do the same thing in a more "Java" way. Not sure why you'd want to, but it's always an option:
// no need to init
linkedMap = createObject("Java", "java.util.LinkedHashMap");
// the Java way
linkedMap.put("d", "Dragonfruit");
linkedMap.put("e", "Eggplant");
linkedMap.put("f", "Fig");

// loop through values
iterator = linkedMap.entrySet().iterator();
while (iterator.hasNext()) {
    writedump(iterator.next().value);
}

// or loop through keys
iterator = linkedMap.keySet().iterator();
while (iterator.hasNext()) {
    writedump(linkedMap.get(iterator.next()));
}
Just remember that the keys are case SeNsItIvE!
In ColdFusion 10 or Railo 4, if you want an array of values (instead of keys), you can use the Underscore.cfc library like so:
_ = new Underscore(); // instantiate the library
valueArray = _.toArray({first: 'one', second: 'two'}); // returns: ['one','two']
Note: Coldfusion structures are unordered, so you are not guaranteed to have any specific order for the values in the resulting array.
(Disclaimer: I wrote Underscore.cfc)
