Obtaining keys and values from JSON nested array in nest - arrays

First time posting! I am converting JSON data (a dictionary) from a server into a CSV file. The keys and values are extracted fine, apart from the nested "Astronauts" entry, which is an array. Basically, every individual JSON string is a datum that may contain anywhere from zero to an unlimited number of astronauts, whose features I would like to extract as independent values. For instance, something like this:
Astronaut1_Spaceships_First: Katabom
Astronaut1_Spaceships_Second: The Kraken
Astronaut1_name: Jebeddiah
(...)
Astronaut2_gender: Hopefully female
and so on. The problem here is that the nested entry is an array and not a dictionary, so I do not know what to do. I have tried the dpath library as well as flattening the nested entry, but nothing changed. Any ideas?
import json
import os
import csv
import datetime
import dpath.util #Dpath library needs to be installed first
datum = {"Mission": "Make Earth Greater Again", "Objective": "Prove Earth is flat", "Astronauts": [{"Spaceships": {"First": "Katabom", "Second": "The Kraken"}, "Name": "Jebeddiah", "Gender": "Hopefully male", "Age": 35, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}, {"Spaceships": {"First": "The Kraken", "Second": "Minnus I"}, "Name": "Bob", "Gender": "Hopefully female", "Age": 23, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}]}
#Parsing process
parsed = json.loads(datum) #datum is the JSON string retrieved from the server
def flattenjson(parsed, delim):
    val = {}
    for i in parsed.keys():
        if isinstance(parsed[i], dict):
            get = flattenjson(parsed[i], delim)
            for j in get.keys():
                val[i + delim + j] = get[j]
        else:
            val[i] = parsed[i]
    return val
flattened = flattenjson(parsed,"__")
#process of creating csv file
keys=['Astronaut1_Spaceship_First','Astronaut2_Spaceship_Second', 'Astronaut1_Name'] #reduced to 3 keys for this example
writer = csv.DictWriter(OD, keys ,restval='Null', delimiter=",", quotechar="\"", quoting=csv.QUOTE_ALL, dialect= "excel")
writer.writerow(flattened)
#JSON DATA FROM SERVER
{
"Mission": "Make Earth Greater Again",
"Objective": "Prove Earth is flat",
"Astronauts": [ {
"Spaceships": {
"First": "Katabom",
"Second": "The Kraken"
},
"Name": "Jebeddiah",
"Gender": "Hopefully male",
"Age": 35,
"Prefered colleages": [],
"Following missions": [
{
"Payment_status": "TO BE CONFIRMED"
}
]
},
{
"Spaceships": {
"First": "The Kraken",
"Second": "Minnus I"
},
"Name": "Bob",
"Gender": "Hopefully female",
"Age": 23,
"Prefered colleages": [],
"Following missions": [
{
"Payment_status": "TO BE CONFIRMED"
}
]
}
]
}

Firstly, the datum you have defined here is not what you would get from the server. The server returns a string, whereas the datum in your program is already a parsed dictionary, so json.loads(datum) will fail. Now, assuming datum to be:
datum = '{"Mission": "Make Earth Greater Again", "Objective": "Prove Earth is flat", "Astronauts": [{"Spaceships": {"First": "Katabom", "Second": "The Kraken"}, "Name": "Jebeddiah", "Gender": "Hopefully male", "Age": 35, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}, {"Spaceships": {"First": "The Kraken", "Second": "Minnus I"}, "Name": "Bob", "Gender": "Hopefully female", "Age": 23, "Prefered colleages": [], "Following missions": [{"Payment_status": "TO BE CONFIRMED"}]}]}'
You don't need the dpath library. The problem here is that your JSON flattener doesn't handle embedded lists. Try using the one below.
Assuming that you want a one-line CSV file:
import csv
import json

def flattenjson(data, delim, topname=''):
    """JSON flattener that can handle embedded lists and dictionaries"""
    flattened = {}

    def internalflat(int_data, name=topname):
        if isinstance(int_data, dict):
            for key in int_data:
                internalflat(int_data[key], name + key + delim)
        elif isinstance(int_data, list):
            # Number list elements from 1: Astronaut1, Astronaut2, ...
            for i, elem in enumerate(int_data, start=1):
                internalflat(elem, name + str(i) + delim)
        else:
            # Leaf value: drop the trailing delimiter from the key
            flattened[name[:-len(delim)]] = int_data

    internalflat(data)
    return flattened

# If you don't want mission or objective in csv file
flattened_astronauts = flattenjson(json.loads(datum)["Astronauts"], "__", "Astronaut")
# Keys look like 'Astronaut1__Spaceships__First'; .keys().sort() would give None
keys = sorted(flattened_astronauts)
# OD is the already opened output file object from the question
writer = csv.DictWriter(OD, keys, restval='Null', delimiter=",", quotechar="\"", quoting=csv.QUOTE_ALL, dialect="excel")
writer.writerow(flattened_astronauts)
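Since each datum can carry a different number of astronauts, a CSV with several rows needs the union of all flattened keys as its header. Below is a minimal sketch of that idea, assuming the records arrive as JSON strings. The flatten helper, the shortened sample records and the in-memory buffer are illustrative stand-ins, not part of the original code:

```python
import csv
import io
import json

def flatten(value, delim="__", prefix=""):
    """Flatten nested dicts/lists into {path: leaf}; list items are numbered from 1."""
    flat = {}
    if isinstance(value, dict):
        for key, item in value.items():
            flat.update(flatten(item, delim, prefix + key + delim))
    elif isinstance(value, list):
        for i, item in enumerate(value, start=1):
            flat.update(flatten(item, delim, prefix + str(i) + delim))
    else:
        # Leaf: strip the trailing delimiter from the accumulated path
        flat[prefix[:-len(delim)]] = value
    return flat

# Made-up records with two, one and zero astronauts
records = [
    '{"Mission": "A", "Astronauts": [{"Name": "Jebeddiah", "Age": 35}, {"Name": "Bob", "Age": 23}]}',
    '{"Mission": "B", "Astronauts": [{"Name": "Val", "Age": 30}]}',
    '{"Mission": "C", "Astronauts": []}',
]

rows = [flatten(json.loads(r)["Astronauts"], "__", "Astronaut") for r in records]

# The header is the union of every key seen in any row
fieldnames = sorted({key for row in rows for key in row})

buffer = io.StringIO()  # stands in for the real output file
writer = csv.DictWriter(buffer, fieldnames, restval="Null", quoting=csv.QUOTE_ALL)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Rows with fewer astronauts get "Null" in the missing columns through restval, so every line of the file ends up with the same number of fields.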

Related

Groovy - FindAll - unique record - condition declared by field

I have a json as below
{
"Animals": [
{
"Name": "monkey",
"Age": 4
},
{
"Name": "lion",
"Age": 3
},
{
"Name": "lion",
"Age": 3,
"Misc": "001"
}
]
}
Two of the three elements in the JSON array have the same Name and Age. The only difference is that the third element has Misc and the second does not.
How do I keep the record that has Misc when there are two records with the same Name and Age?
Below is what I tried
parsedJson?.Animals = parsedJson?.Animals?.unique().findAll{animal -> animal?.Misc?.trim() ? animal?.Misc?.trim() : site?.Name?.trim() };
Looks like I missed one more statement or I missed something inside unique()
I also tried
parsedJson?.Animals = parsedJson?.Animals?.unique{a1,a2 -> a1?.Misc <=> a2?.Misc}
but I still do not get what I want.
What I want is
{
"Animals": [
{
"Name": "monkey",
"Age": 4
},
{
"Name": "lion",
"Age": 3,
"Misc": "001"
}
]
}
One way to go about this is to group the elements and then merge the maps.
groupBy groups the elements by their "primary key"; let's assume that this is Name and Age. The resulting data structure is a map with [it.Name, it.Age] tuples as keys and, as values, lists of the elements that share that key.
Next, reduce over each list of maps and merge them. This assumes that the information there does not contradict itself (i.e. each map only adds to the result). Otherwise the last map would just win.
def data = [["Name": "monkey", "Age": 4],
            ["Name": "lion", "Age": 3],
            ["Name": "lion", "Age": 3, "Misc": "001"]]
println data.groupBy{[it.Name, it.Age]}.collect{ _, xs -> xs.inject{ acc, x -> acc.putAll x; acc } }
// → [[Name:monkey, Age:4], [Name:lion, Age:3, Misc:001]]
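The same group-then-merge idea can be sketched in Python for readers who do not use Groovy. This is only an illustration under the same assumption (Name and Age form the key); the function name merge_by_key is made up here:

```python
def merge_by_key(records, key_fields):
    """Merge records that share the same values for key_fields.

    Later records overwrite earlier ones on conflicting fields,
    so contradictory data means the last record wins.
    """
    merged = {}
    for record in records:
        key = tuple(record.get(field) for field in key_fields)
        # Start a fresh dict per key so the input records are not mutated
        merged.setdefault(key, {}).update(record)
    return list(merged.values())

animals = [
    {"Name": "monkey", "Age": 4},
    {"Name": "lion", "Age": 3},
    {"Name": "lion", "Age": 3, "Misc": "001"},
]

result = merge_by_key(animals, ["Name", "Age"])
```

Unlike the inject/putAll version, this sketch copies into new dictionaries, so the original list stays untouched.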

Parsing JSON to do math?

I'm new to programming (especially JSON format), so please forgive me for not using proper terminology :)
Using Python 3.7 Requests module, I receive a JSON response. To keep things simple, I made an example:
{
"Bob":
{
"Age": "15",
"LastExamGrade": "45"
},
"Jack":
{
"Age": "16",
"LastExamGrade": "58"
}
}
What I would like to do is parse the JSON responses to extract two items from each response/structure and save it to a list like this (I think this is called a tuple of tuples?):
[("Bob","45"),("Jack","58")]
Then, after doing this, I will receive another similar response, such as the following (where the only thing that changed is the exam grade):
{
"Bob":
{
"Age": "15",
"LastExamGrade": "54"
},
"Jack":
{
"Age": "16",
"LastExamGrade": "70"
}
}
I want to also save the name and grade into a tuple of tuples (list).
Lastly, I would like to subtract the first exam score of each person from their last exam score, and save this into a final list, which includes the name, final exam grade, and grade improvement, like this:
[("Bob","54","9"),("Jack","70","12")]
What is the simplest way to do this using Python 3? As for my own research, I've searched all throughout StackOverflow, but couldn't find out how to parse a JSON like mine (For example, in mine, the name is outside of the curly braces), and had difficulty doing math operations for JSON items.
I'd recommend using a dedicated package for calculations, such as pandas. Assuming first_exam_results and second_exam_results hold the two parsed responses as dictionaries:
import pandas as pd
first_exam_grades = pd.DataFrame.from_dict(first_exam_results, orient='index').astype(int)
second_exam_grades = pd.DataFrame.from_dict(second_exam_results, orient='index').astype(int)
improvements = second_exam_grades.LastExamGrade.to_frame()
improvements['Improvement'] = second_exam_grades.LastExamGrade - first_exam_grades.LastExamGrade
This will give you something that looks like this:
      LastExamGrade  Improvement
Bob              54            9
Jack             70           12
Now you can output it any way you'd like:
list(zip(*([improvements.index.tolist()] + [improvements[c].values.tolist() for c in improvements])))
This will give you [('Bob', 54, 9), ('Jack', 70, 12)], as you want.
One possible solution uses coroutines. The coroutine receive_message keeps up to the last two LastExamGrade values for each student and yields a list of tuples with the student's name, the latest grade and, once two grades are known, the improvement over the previous grade:
json_messages = [
# 1st message:
{
"Bob":
{
"Age": "15",
"LastExamGrade": "45",
},
"Jack":
{
"Age": "16",
"LastExamGrade": "58",
}
},
# 2nd message
{
"Bob":
{
"Age": "15",
"LastExamGrade": "54",
},
"Jack":
{
"Age": "16",
"LastExamGrade": "70",
}
},
# 3rd message (optional)
{
"Bob":
{
"Age": "15",
"LastExamGrade": "14",
},
"Jack":
{
"Age": "16",
"LastExamGrade": "20",
}
}
]
def receive_message():
    d, message = {}, (yield)
    while True:
        for k, v in message.items():
            d.setdefault(k, []).append(v['LastExamGrade'])
            d[k] = d[k][-2:] # store max last two messages
        message = yield [(k, *tuple(v if len(v)==1 else [v[1], str(int(v[1])-int(v[0]))])) for k, v in d.items()]
receiver = receive_message()
next(receiver) # prime coroutine
for msg in json_messages:
    print(receiver.send(msg))
Prints:
[('Bob', '45'), ('Jack', '58')]
[('Bob', '54', '9'), ('Jack', '70', '12')]
[('Bob', '14', '-40'), ('Jack', '20', '-50')]
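For comparison, the same result can be had with the standard library alone. This is a minimal sketch, assuming both responses are available as JSON text (with Requests, response.json() would give the parsed dictionary directly). The helper name grades and the variable names are made up for illustration:

```python
import json

# Assumed inputs: the two response bodies as JSON text
first_response = '{"Bob": {"Age": "15", "LastExamGrade": "45"}, "Jack": {"Age": "16", "LastExamGrade": "58"}}'
second_response = '{"Bob": {"Age": "15", "LastExamGrade": "54"}, "Jack": {"Age": "16", "LastExamGrade": "70"}}'

def grades(response_text):
    """Return a list of (name, grade) tuples; the name is the top-level key."""
    data = json.loads(response_text)
    return [(name, info["LastExamGrade"]) for name, info in data.items()]

first = grades(first_response)
second = grades(second_response)

# Look up each student's earlier grade by name
previous = dict(first)

# Grades are strings in the JSON, so convert to int for the subtraction
improvements = [
    (name, grade, str(int(grade) - int(previous[name])))
    for name, grade in second
]
```

The key point is that the names are simply the keys of the outer dictionary, so iterating over data.items() yields each name together with that student's inner dictionary.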

Import data - with firebase keys

I am trying to import some data into firebase
{
"people":
[
{
"name": "John Smith",
"age": 23
},
{
"name": "Tony Jones",
"age": 61
}
]
}
This is fine but it adds a "traditional" array index in firebase (0,1) - which I believe is bad?
When I insert a new value via my web form I get a mix
"0" : {
"name": "John Smith",
"age": 23,
},
"1" : {
"name": "Tony Jones",
"age": 61,
},
"-LgWkhX2DdD_ChbWJkXo" : { // inserted via form it has a firebase index
"name": "Simon Green",
"age": 37,
}
How can I get the initially imported data to use Firebase keys as well? It is just a normal .json file.
When you write array type JSON data into Realtime Database, you are going to get array type numeric indexes in the database. If you don't want to write like this, you will have to convert the array yourself - there is no API that's going to do that for you. You'll have to read the JSON, iterate each element of the array, and write each item into the database the way you want it to be written. It looks like perhaps you want to add each item using an automatic push ID, since you are trying to create something that looks like "-LgWkhX2DdD_ChbWJkXo".
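One way to picture the conversion step is to rewrite the array as an object with non-numeric keys before importing the file. The sketch below is an assumption-heavy illustration: array_to_keyed is a made-up helper, and the "item-" plus UUID keys are only stand-ins. They are not real Firebase push IDs, which are generated by the SDK's push operation and sort chronologically.

```python
import json
import uuid

def array_to_keyed(items, make_key=lambda: "item-" + uuid.uuid4().hex):
    """Turn a list into a dict keyed by generated, non-numeric strings.

    make_key is a hypothetical hook; swap in whatever key source you prefer.
    """
    return {make_key(): item for item in items}

source = json.loads(
    '{"people": [{"name": "John Smith", "age": 23}, {"name": "Tony Jones", "age": 61}]}'
)

# Replace the array with a keyed object; other top-level fields are untouched
converted = dict(source, people=array_to_keyed(source["people"]))

# This text could then be saved as the .json file to import
output_text = json.dumps(converted, indent=2)
```

If you need genuine push IDs, writing each element through the SDK, as described above, is the safer route.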

How to fetch data from a nested array in a JSON file?

I have fetched data from a JSON file, but when I try to fetch other data from it I am unable to do so, as it is in a nested array. I know the solution is probably simple, but this is the first time I am trying to loop over a JSON file, so kindly give your inputs.
SampleData = {
"squadName": "Super hero squad",
"homeTown": "Metro City",
"formed": 2016,
"secretBase": "Super tower",
"active": true,
"members": [
{
"name": "Molecule Man",
"age": 29,
"secretIdentity": "Dan Jukes",
"powers": [
"Immortality",
"Turning tiny",
"Radiation blast"
]
},
{
"name": "Madame Uppercut",
"age": 39,
"secretIdentity": "Jane Wilson",
"powers": [
"Million tonne punch",
"Damage resistance",
"Superhuman reflexes"
]
},
{
"name": "Eternal Flame",
"age": 1000,
"secretIdentity": "Unknown",
"powers": [
"Immortality",
"Heat Immunity",
"Inferno",
"Teleportation",
"Interdimensional travel"
]
}
]
};
GetJsonData() {
  console.log(this.SampleData["powers"]);
  for (let i = 0; i < this.SampleData["powers"].length; i++) {
    if (this.SampleData["powers"][i].Immortality) {
      console.log(this.SampleData.powers[i]);
    }
  }
}
{name: "Molecule Man", age: 29, secretIdentity: "Dan Jukes", powers: Array(3)}
{name: "Eternal Flame", age: 1000, secretIdentity: "Unknown", powers: Array(3)}
Your code needs to follow the structure of the JSON data; in particular, these are all valid things you could print:
console.log(this.SampleData.squadName);
console.log(this.SampleData.homeTown);
console.log(this.SampleData.members[0].name);
console.log(this.SampleData.members[0].powers[0]);
If you wanted to loop through each member and print their info, that might look like this:
this.SampleData.members.forEach(member => {
  let powerString = member.powers.join(', ');
  console.log('Name: ' + member.name);
  console.log('Age: ' + member.age);
  console.log('Powers: ' + powerString);
});
I used a forEach, but you can also use a for (let i = ...) loop.
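The question's expected output suggests the goal is to pick out the members whose powers include "Immortality". As a language-neutral illustration of following the structure (squad, then members, then each member's powers list), here is a minimal Python sketch on a trimmed copy of the data. The variable names are made up:

```python
# Trimmed copy of the sample data: only the fields needed for the filter
squad = {
    "squadName": "Super hero squad",
    "members": [
        {"name": "Molecule Man",
         "powers": ["Immortality", "Turning tiny", "Radiation blast"]},
        {"name": "Madame Uppercut",
         "powers": ["Million tonne punch", "Damage resistance", "Superhuman reflexes"]},
        {"name": "Eternal Flame",
         "powers": ["Immortality", "Heat Immunity", "Inferno",
                    "Teleportation", "Interdimensional travel"]},
    ],
}

# "powers" lives on each member, not on the squad, so filter the members
immortals = [m for m in squad["members"] if "Immortality" in m["powers"]]
names = [m["name"] for m in immortals]
```

The original attempt failed because it looked for powers on the top-level object and treated "Immortality" as a property name, whereas it is a string inside each member's powers array.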

Can't flatten JSON array to prepare for CSV conversion using Ruby 2.1.4

I have an array of nested JSON "hash" objects that I need to completely flatten so it ports over to CSV cleanly, which is obviously not nested and "multidimensional" like JSON typically is.
But the flatten method (used here with ! bang) is not working (it creates the file with no error but then the file is empty).
In my Ruby file below I leave a working, commented-out example which is just doing a simple conversion without the .flatten method. Since the JSON is an array at the highest level (separated by commas and enclosed in square brackets), shouldn't it accept the .flatten method, just as it accepts .each in the working commented-out block? (This is also what the docs seem to indicate!)
require 'csv'
require 'json'
# CSV.open('false-hotels-merged.csv', 'w') do |csv|
# JSON.parse(File.open('monfri-false-hotels-merged.json').read).each do |hash|
# csv << hash.values
# end
# end
CSV.open('wed-all-false-hotels.csv', 'w') do |csv|
  JSON.parse(File.open('monfri-false-hotels-merged.json').read).flatten! do |f|
    csv << f.values
  end
end
Example JSON data snippet:
[...
{
"id": "111707",
"name": "Seven Park Place by William Drabble",
"phone": "+442073161600",
"email": "restaurant#stjameshotelandclub.com",
"website": "http://www.stjameshotelandclub.com/michelin-star-chef-william-drabble",
"location": {
"latitude": 51.5062548,
"longitude": -0.1403209,
"address": {
"line1": "7-8 Park Place",
"line2": "St James's",
"line3": "",
"postcode": "SW1A 1LP",
"city": "London",
"country": "UK"
}
}
},
{
"id": "104493",
"name": "Seymour's Restaurant & Bar",
"phone": "+442079352010",
"email": "reservations#theleonard.com",
"website": "http://www.theleonard.com",
"location": {
"latitude": 51.51463,
"longitude": -0.15779,
"address": {
"line1": "15 Seymour Street",
"line2": "",
"line3": "",
"postcode": "W1H 7JW",
"city": "London",
"country": "UK"
}
}
},
{
"id": "250922",
"name": "Shaka Zulu",
"phone": "+442033769911",
"email": "info#shaka-zulu.com",
"website": "http://www.shaka-zulu.com/",
"location": {
"latitude": 51.5414979,
"longitude": -0.1458655,
"address": {
"line1": "Stables Market ",
"line2": "Camden",
"line3": "",
"postcode": "NW1 8AB",
"city": "London",
"country": "UK"
}
}
}
]
Again, no errors at all in the terminal - just blank CSV file created.
Array#flatten only flattens arrays. There is also Hash#flatten, which also produces an array. You seem to want to flatten a nested Hash for which I don't know of a library method.
It seems that your result is empty because there's an .each missing after the flatten - the block is simply not run.
Try this:
require 'csv'
require 'json'
def hflat(h)
  h.values.flat_map {|v| v.is_a?(Hash) ? hflat(v) : v }
end
CSV.open('file.csv', 'w') do |csv|
  JSON.parse(File.open('file.json').read).each do |h|
    csv << hflat(h)
  end
end
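The hflat approach writes values only, so the CSV has no header row. If column names are wanted, one option is to keep the key path of each leaf while flattening. Here is a minimal Python sketch of that variation, using a cut-down version of the sample data. The helper flatten_items, the dotted path format and the in-memory buffer are assumptions for illustration:

```python
import csv
import io

def flatten_items(mapping, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf of a nested dict."""
    for key, value in mapping.items():
        path = prefix + key
        if isinstance(value, dict):
            yield from flatten_items(value, path + ".")
        else:
            yield path, value

# Cut-down sample records with the same nesting as the question's data
hotels = [
    {"id": "111707", "name": "Seven Park Place by William Drabble",
     "location": {"latitude": 51.5062548,
                  "address": {"city": "London", "country": "UK"}}},
    {"id": "250922", "name": "Shaka Zulu",
     "location": {"latitude": 51.5414979,
                  "address": {"city": "London", "country": "UK"}}},
]

rows = [dict(flatten_items(hotel)) for hotel in hotels]

buffer = io.StringIO()  # stands in for the real output file
# Assumes every record has the same shape as the first one
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

If records can differ in shape, the header would need to be built from the union of all paths rather than from the first row.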
