I'm working with these JSON objects:
{
"extension": [
{
"url": "url1",
"system": "system1"
},
{
"url": "url2",
"system": "system2"
}
]
}
{
"extension": [
{
"url": "url3",
"system": "system3"
}
]
}
As you can see, the two JSON objects have .extension arrays of different lengths.
I'm using this command to map the input JSONs:
jq --raw-output '[.extension[] | .url, .system] | @csv'
I'm getting this:
"url1","system1","url2","system2"
"url3","system3"
What I would like to get is:
"url1","system1","url2","system2"
"url3","system3",,
Any ideas on how I could map those fields correctly?
Flip the table twice using transpose | transpose: the first transpose pads the shorter columns with null to square off the jagged shape, and the second restores the original orientation:
jq -rs 'map(.extension) | transpose | transpose[] | map(.url, .system) | @csv'
"url1","system1","url2","system2"
"url3","system3",,
A fairly efficient solution:
def pad:
(map(length)|max) as $mx
| map( . + [range(length;$mx)|null] );
[inputs | [.extension[] | (.url, .system)]]
| pad[]
| @csv
This of course should be used with the -n command-line option.
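For comparison, here is the same padding idea sketched in Python, with the two JSON objects from the question assumed inline. Like the jq `pad` helper, it widens every row to the length of the longest one (jq pads with null, which @csv renders as empty fields):

```python
import csv
import io

# The two JSON objects from the question, assumed inline for the sketch
docs = [
    {"extension": [{"url": "url1", "system": "system1"},
                   {"url": "url2", "system": "system2"}]},
    {"extension": [{"url": "url3", "system": "system3"}]},
]

# Flatten each object into one row of url/system pairs
rows = [[v for e in d["extension"] for v in (e["url"], e["system"])] for d in docs]

# Pad every row to the width of the longest one
width = max(len(r) for r in rows)
padded = [r + [""] * (width - len(r)) for r in rows]

out = io.StringIO()
csv.writer(out).writerows(padded)
print(out.getvalue())
```

The quoting differs slightly from jq's output (Python's csv module only quotes when needed), but the row shapes match.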
I know there are most likely many questions like this, but I could not find an answer.
I have a JSON object that contains time-series data, and I'm having trouble loading it into a DataFrame.
The data looks like this:
{
"categories": [
"02.01.2007",
"03.01.2007",
"04.01.2007",
...
],
"unixtime": [
"1167696000",
"1167782400",
"1167868800",
...
],
"series": [
{
"au": [
{
"name": "Gold",
"data": [
15.51,
15.48,
...
],
"color": "#FFD200"
}
],
"ag": [
{
"name": "Silber",
"data": [
315.21,
313.97,
...
],
"color": "#FFD200"
}
],
...
}
]
}
All in all there is price data for 7 metals (au, ag, pt, pd, rh, ir, ru). categories contains a list of timestamps, and unixtime contains the same timestamps in a different format. Then comes the series object itself, which is nested and contains objects that each have a data field holding the prices I want to convert into pandas time-series data.
My code, however, is not working:
import requests
import pandas as pd
r = # http request
j = r.json()
data = pd.json_normalize(data=j)
It is not working. I guess I have to use the record_path keyword, but I do not understand it. I want a DataFrame that looks like this:
 | categories | unixtime   | au    | ag     | ... |
1| 02.01.2007 | 1167696000 | 15.51 | 315.21 | ... |
2| 03.01.2007 | 1167782400 | 15.48 | 313.97 | ... |
3| ...
thank you in advance!
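One possible sketch, without record_path: since the nesting is shallow and regular, you can build the columns yourself, assuming each metal's data array lines up element-for-element with categories and unixtime. The payload below is a two-row stand-in for the real response:

```python
import pandas as pd

# Two-row stand-in for the real JSON response (structure from the question)
j = {
    "categories": ["02.01.2007", "03.01.2007"],
    "unixtime": ["1167696000", "1167782400"],
    "series": [{
        "au": [{"name": "Gold", "data": [15.51, 15.48], "color": "#FFD200"}],
        "ag": [{"name": "Silber", "data": [315.21, 313.97], "color": "#FFD200"}],
    }],
}

# One column per metal, taken from the single object inside "series"
columns = {metal: entries[0]["data"] for metal, entries in j["series"][0].items()}
df = pd.DataFrame(columns)
df.insert(0, "unixtime", j["unixtime"])
df.insert(0, "categories", j["categories"])
print(df)
```

With all 7 metals present, the dict comprehension picks them up automatically, one column each.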
I have an array of JSON objects, each with an array of tags. Specific tags can appear multiple times in the child array, but I only want the first matching tag (key+value) copied up onto the parent object. I've come up with a filter-set, but it gives me multiple outputs if the given tag appears more than once in the child array ... I only want the first one.
Sample JSON input:
[
{
"name":"integration1",
"accountid":111,
"tags":[
{ "key": "env",
"values":["prod"]
},
{ "key": "team",
"values":["cougar"]
}
]
},
{
"name":"integration2",
"accountid":222,
"tags":[
{ "key": "env",
"values":["prod"]
},
{ "key": "team",
"values":["bear"]
}
]
},
{
"name":"integration3",
"accountid":333,
"tags":[
{ "key": "env",
"values":["test"]
},
{ "key": "team",
"values":["lemur"]
},
{ "key": "Env",
"values":["qa"]
}
]
}
]
Filter-set that I came up with:
jq -r '.[] | .tags[].key |= ascii_downcase | .env = (.tags[] | select(.key == "env").values[0])|[.accountid,.name,.env] | @csv' test.json
Example output with undesirable extra line:
111,"integration1","prod"
222,"integration2","prod"
333,"integration3","test"
333,"integration3","qa" <<<
Try using first(<expression>) to get only the first matching value. In case there are no matching values at all, you can use first(<expression>, <default_value>).
jq -r '.[] | .tags[].key |= ascii_downcase | .env = first((.tags[] | select(.key == "env").values[0]),null)|[.accountid,.name,.env] | @csv' test.json
Alternatively, if you are going to want to extract other tags similarly, you might prefer to extract them all into one object like this. I'm using reverse to meet your requirement of keeping the first match for any given key, otherwise the last match would win.
jq -r '.[] | .tags |= ( map({(.key|ascii_downcase): .values[0]}) | reverse | add ) | [.accountid, .name, .tags.env] | @csv'
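The reverse | add trick works because jq's add merges objects left to right with later entries winning, so reversing first makes the earliest tag win. The same first-match-wins logic, sketched in Python on the tags of the third sample object:

```python
# Tags from the third sample object; "env" appears twice with different case
tags = [
    {"key": "env", "values": ["test"]},
    {"key": "team", "values": ["lemur"]},
    {"key": "Env", "values": ["qa"]},
]

merged = {}
for tag in tags:
    # setdefault keeps the first occurrence of each lowercased key
    merged.setdefault(tag["key"].lower(), tag["values"][0])
print(merged)
```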
Let's say I have multiple namespaces with similar k8s resources (some might use different images). I am trying to get .metadata.namespace using jq from the following JSON object (let's call it test.json):
{
"items": [
{
"metadata": {
"name": "app",
"namespace": "test1"
},
"spec": {
"components": [
{
"database": {
"from": "service",
"value": "redis"
},
"image": "test.com/lockmanager:1.1.1",
"name": "lockmanager01",
"replicas": 2,
"type": "lockmanager"
},
{
"database": {
"from": "service",
"value": "postgresql"
},
"image": "test.com/jobmanager:1.1.1",
"name": "jobmanager01",
"replicas": 2,
"type": "jobmanager"
}
]
}
}
]
}
if the following condition is met:
.spec.components[].type == "jobmanager" and .spec.components[].image != "test.com/jobmanager:1.1.1"
but can't find the correct statement.
I tried:
cat test.json | jq '.items[] | select((.spec.components[].name? | contains("jobmanager01")) and (.spec.components[].image != "test.com/jobmanager:1.1.1")) | .metadata.namespace'
but it returns all namespaces and, moreover, those I am interested in (because I know they contain a different image) are returned twice.
Please advise: what am I doing wrong?
You state that the selection criterion is:
.spec.components[].type == "jobmanager" and
.spec.components[].image != "test.com/jobmanager:1.1.1"
but that does not make much sense, given the semantics of .[].
I suspect you meant that you want to select items from .spec.components such that
.type == "jobmanager" and .image != "test.com/jobmanager:1.1.1"
If that's the case, you could use any, so that your query would look like this:
.items[]
| select( any(.spec.components[];
(.name? | contains("jobmanager01")) and
.image != "test.com/jobmanager:1.1.1") )
| .metadata.namespace
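In Python terms, any(...) over the components plays the same role: the item is selected once if any single component satisfies both conditions, instead of producing one output per component pair. Using the sample item from test.json (where the image does match, so nothing is selected):

```python
# The single item from test.json, trimmed to the fields the filter uses
item = {
    "metadata": {"name": "app", "namespace": "test1"},
    "spec": {"components": [
        {"image": "test.com/lockmanager:1.1.1", "name": "lockmanager01"},
        {"image": "test.com/jobmanager:1.1.1", "name": "jobmanager01"},
    ]},
}

# Select the namespace only if some component matches both conditions
selected = [item["metadata"]["namespace"]] if any(
    "jobmanager01" in c.get("name", "") and
    c["image"] != "test.com/jobmanager:1.1.1"
    for c in item["spec"]["components"]
) else []
print(selected)
```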
all distinct
If you want all the distinct .namespace values satisfying the condition, you could go with:
[.items[]
| .metadata.namespace as $it
| .spec.components[]
| select( (.name? | contains("jobmanager01")) and
.image != "test.com/jobmanager:1.1.1" )
| $it]
| unique[]
Efficient version of "all-distinct" solution
To avoid unnecessary checks, if .namespace is always a string, we could write:
reduce .items[] as $item ({};
$item.metadata.namespace as $it
| if .[$it] then . # already seen
elif any( $item.spec.components[];
((.name? | contains("jobmanager01")) and
.image != "test.com/jobmanager:1.1.1") )
then .[$it] = true
else . end )
| keys_unsorted[]
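The reduce version is effectively a "seen set": each namespace is tested at most once and emitted at most once. A Python rendering of the same bookkeeping; the second item below is an assumed variant with a non-matching image tag, added so the filter has something to select:

```python
items = [
    {"metadata": {"namespace": "test1"},
     "spec": {"components": [
         {"name": "jobmanager01", "image": "test.com/jobmanager:1.1.1"}]}},
    # Assumed variant: same component name, different image tag
    {"metadata": {"namespace": "test2"},
     "spec": {"components": [
         {"name": "jobmanager01", "image": "test.com/jobmanager:1.1.2"}]}},
]

seen = {}
for item in items:
    ns = item["metadata"]["namespace"]
    if ns in seen:
        continue  # already checked this namespace
    if any("jobmanager01" in c.get("name", "") and
           c["image"] != "test.com/jobmanager:1.1.1"
           for c in item["spec"]["components"]):
        seen[ns] = True

print(list(seen))  # namespaces whose jobmanager runs an unexpected image
```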
I have the given input as such:
[{
"ciAttributes": {
"entries": "{\"hostname-cdc1.website.com\":[\"127.0.0.1\"],\"hostname-cdc1-extension.website.com\":[\"127.0.0.1\"]}"
},
"ciAttributes": {
"entries": "{\"hostname-dfw1.website.com\":[\"127.0.0.1\"],\"hostname-dfw1-extension.website.com\":[\"127.0.0.1\"]}"
},
"ciAttributes": {
"entries": "{\"hostname-cdc2.website.com\":[\"127.0.0.1\"],\"hostname-cdc2-extension.website.com\":[\"127.0.0.1\"]}"
},
"ciAttributes": {
"entries": "{\"hostname-dfw2.website.com\":[\"127.0.0.1\"],\"hostname-dfw2-extension.website.com\":[\"127.0.0.1\"]}"
}
}]
...and when I execute jq with the following command (manipulating the existing JSON):
jq '.[].ciAttributes.entries | fromjson | keys | [ { hostname: .[0] }] | add' | jq -s '{ instances: . }'
...I get this output:
{
"instances": [
{
"hostname": "hostname-cdc1.website.com"
},
{
"hostname": "hostname-dfw1.website.com"
},
{
"hostname": "hostname-cdc2.website.com"
},
{
"hostname": "hostname-dfw2.website.com"
}
]
}
My end goal is to extract only the "hostnames" that contain "cdc". I've tried playing with the jq select expression, but I get a syntax error, so I'm sure I'm doing something wrong.
First, there is no need to call jq more than once.
Second, because the main object does not have distinct key names, you would have to use the --stream command-line option.
Third, you could use test to select the hostnames of interest, especially if as seems to be the case, the criterion can most easily be expressed as a regex.
So here in a nutshell is a solution:
Invocation
jq -n --stream -c -f program.jq input.json
program.jq
{instances:
[inputs
| select(length==2 and (.[0][-2:] == ["ciAttributes", "entries"]))
| .[-1]
| fromjson
| keys_unsorted[]
| select(test("cdc.[.]"))]}
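The core of the filter is just: parse the embedded JSON string, then keep the keys matching the regex. Note that cdc.[.] means "cdc, any single character, then a literal dot", which is what excludes the -extension hostnames. Sketched in Python on one of the entries values from the input:

```python
import json
import re

# One "entries" value from the input: a JSON object encoded as a string
entries = ('{"hostname-cdc1.website.com":["127.0.0.1"],'
           '"hostname-cdc1-extension.website.com":["127.0.0.1"]}')

# Same regex as the jq filter: "cdc", any character, then a literal dot
hostnames = [h for h in json.loads(entries) if re.search(r"cdc.[.]", h)]
print(hostnames)
```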
I have a JSON result from an ElasticSearch query that contains multiple objects.
{
"buckets": [{
"perf_SP_percentiles": {
"values": {
"80.0": 0,
"95.0": 0
}
},
"perf_QS_percentiles": {
"values": {
"80.0": 12309620299487,
"95.0": 12309620299487
}
},
"latest": {
"hits": {
"total": 3256,
"max_score": null,
"hits": [{
"_source": {
"is_external": true,
"time_ms": 1492110000000
},
"sort": [
1492110000
]
}]
}
}
}]
}
I wrote the following jq with help from others
jq -r '.buckets[].latest.hits.hits[]._source | [."is_external",."time_ms"] | @csv'
I need to add the perf_QS_percentiles to the CSV, but I am getting an error.
jq -r '.buckets[].latest.hits.hits[]._source | [."is_external",."time_ms"], .buckets[].perf_QS_percentiles.values | [."80.0",."95.0"] | @csv'
I am getting the error jq: error (at <stdin>:734665): Cannot index array with string. Maybe I am missing something here. I am reading the jq manual (https://stedolan.github.io/jq/manual/#Basicfilters) to see how to parse the different JSON objects in the array, but I'm asking here as someone may be able to point out the problem more easily.
Once the pipeline has descended into ._source, the top-level .buckets is no longer in scope, which is why indexing it again fails. Instead, stay at the bucket level and use (....) + (....) to concatenate the two arrays before piping to @csv:
jq -r '.buckets[] |
(.latest.hits.hits[]._source | [."is_external",."time_ms"]) +
(.perf_QS_percentiles.values | [."80.0",."95.0"]) | @csv'
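The same row assembly in Python, using the bucket from the question (one row per hit, extended with the percentile columns, mirroring jq's array concatenation):

```python
# The single bucket from the question, trimmed to the fields used
bucket = {
    "perf_QS_percentiles": {"values": {"80.0": 12309620299487,
                                       "95.0": 12309620299487}},
    "latest": {"hits": {"hits": [
        {"_source": {"is_external": True, "time_ms": 1492110000000}}]}},
}

percentiles = bucket["perf_QS_percentiles"]["values"]
rows = []
for hit in bucket["latest"]["hits"]["hits"]:
    src = hit["_source"]
    # Source fields first, then the percentile columns: (...) + (...)
    rows.append([src["is_external"], src["time_ms"],
                 percentiles["80.0"], percentiles["95.0"]])
print(rows)
```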