I have a JSON file like this:
[
{
"topic": "Example1",
"ref": {
"1": "Example Topic",
"2": "Topic"
},
"contact": [
{
"ref": [
1
],
"corresponding": true,
"name": "XYZ"
},
{
"ref": [
1
],
"name": "ZXY"
},
{
"ref": [
1
],
"name": "ABC"
},
{
"ref": [
1,
2
],
"name":"BCA"
}
] ,
"type": "Presentation"
},
{
"topic": "Example2",
"ref": {
"1": "Example Topic",
"2": "Topic"
},
"contact": [
{
"ref": [
1
],
"corresponding": true,
"name": "XYZ"
},
{
"ref": [
1
],
"name": "ZXY"
},
{
"ref": [
1
],
"name": "ABC"
},
{
"ref": [
1,
2
],
"name":"BCA"
}
] ,
"type": "Poster"
}
]
I created 3 tables (Items, Reference, Contact):
Items:
  Item_ID
  topic
  type
Reference:
  ref_ID
  content
Contact:
  ref_ID
  contact_ID
  Item_ID
  name
Relationships:
1) Items has many References
2) Items has many Contacts (authors)
3) Contacts have many References
Now, my questions are:
1) Am I doing anything wrong here?
2) Is there any way to improve my current implementation?
3) I am confused about how to implement the "corresponding" flag (inside the contact array). How do I model that in the design?
Thanks.
From your JSON above, what I could infer is the normalized schema below. You have two different "ref" fields in your JSON; could you clarify them?
Also, here is a useful link for you: http://jsonviewer.stack.hu/ (switch between the Viewer and Text tabs).
The actual example from your scenario is:
P = Primary Key
Ref = Reference (foreign) Key
Topic:

| TopicID (P) | TopicName | TypeID (Ref) |
|-------------|-----------|--------------|
| 0           | Example1  | 0            |
| 1           | Example2  | 1            |

TopicReferences:

| TopicID (P) | ReferenceID (Ref) |
|-------------|-------------------|
| 0           | 0                 |
| 0           | 1                 |
| 1           | 0                 |
| 1           | 1                 |

Reference:

| ReferenceID (P) | ReferenceName |
|-----------------|---------------|
| 0               | Example Topic |
| 1               | Topic         |

Presentation Type:

| TypeID (P) | TypeName     |
|------------|--------------|
| 0          | Presentation |
| 1          | Poster       |

TopicContacts:

| TopicID | ContactID (Ref) |
|---------|-----------------|
| 0       | 0               |
| 0       | 1               |
| 0       | 2               |
| 0       | 3               |
| 1       | 0               |
| 1       | 1               |
| 1       | 2               |
| 1       | 3               |

Contact:

| ContactID (P) | ContactName | IsCorresponding (boolean, nullable) |
|---------------|-------------|-------------------------------------|
| 0             | XYZ         | YES                                 |
| 1             | ZXY         | NULL                                |
| 2             | ABC         | NULL                                |
| 3             | BCA         | NULL                                |

ContactsReference2:

| ContactID | Reference2ID (Ref) |
|-----------|--------------------|
| 0         | 0                  |
| 1         | 0                  |
| 2         | 0                  |
| 3         | 0                  |
| 3         | 1                  |

Reference2:

| Reference2ID (P) | Reference2Value (NUM) |
|------------------|-----------------------|
| 0                | 1                     |
| 1                | 2                     |
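In case it helps, the schema above could be written as SQL DDL roughly like this. This is only a sketch; the types, NOT NULL choices, and inline constraint style are my assumptions, not part of the original design.

CREATE TABLE PresentationType (
    TypeID   INT PRIMARY KEY,
    TypeName VARCHAR(50) NOT NULL
);

CREATE TABLE Topic (
    TopicID   INT PRIMARY KEY,
    TopicName VARCHAR(200) NOT NULL,
    TypeID    INT NOT NULL REFERENCES PresentationType (TypeID)
);

CREATE TABLE Reference (
    ReferenceID   INT PRIMARY KEY,
    ReferenceName VARCHAR(200) NOT NULL
);

CREATE TABLE TopicReferences (
    TopicID     INT REFERENCES Topic (TopicID),
    ReferenceID INT REFERENCES Reference (ReferenceID),
    PRIMARY KEY (TopicID, ReferenceID)
);

CREATE TABLE Contact (
    ContactID       INT PRIMARY KEY,
    ContactName     VARCHAR(200) NOT NULL,
    IsCorresponding BIT NULL -- the "corresponding" flag from the JSON; NULL when absent
);

CREATE TABLE TopicContacts (
    TopicID   INT REFERENCES Topic (TopicID),
    ContactID INT REFERENCES Contact (ContactID),
    PRIMARY KEY (TopicID, ContactID)
);

-- ContactsReference2 and Reference2 would follow the same junction-table pattern.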
Here is my time series. Timestamps may be either the same or different. I need to convert the data to a single DataFrame:
{
    'param1': [
        {'ts': 1669246574000, 'value': '6.06'},
        {'ts': 1669242973000, 'value': '6.5'}
    ],
    'param2': [
        {'ts': 1669246579000, 'value': '7'},
        {'ts': 1669242973000, 'value': '5'}
    ],
}
Update 1: format of DataFrame
ts param1 param2
1669246574000 6.06 1
1669242973000 6.5 2
1669246579000 7 3
1669242973000 5 4
Update 2:
Timestamp (ts) should be index
ts param1 param2
1669242973000 6.5 5
1669246574000 6.06 NaN
1669246579000 NaN 7
Update 3: my solution
import pandas as pd

# `data` is the dict shown above
data_frames = []
for key, values in data.items():
    df = pd.DataFrame(values).set_index('ts').rename(columns={'value': key})
    data_frames.append(df)
data_frame = pd.concat(data_frames, axis=1)
Try:
import pandas as pd

data = {
    "param1": [
        {"ts": 1669246574000, "value": "6.06"},
        {"ts": 1669242973000, "value": "6.5"},
    ],
    "param2": [
        {"ts": 1669246579000, "value": "7"},
        {"ts": 1669242973000, "value": "5"},
    ],
}

df = pd.concat([pd.DataFrame(v).assign(param=k) for k, v in data.items()])
print(df)
Prints:
ts value param
0 1669246574000 6.06 param1
1 1669242973000 6.5 param1
0 1669246579000 7 param2
1 1669242973000 5 param2
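If the wide, ts-indexed layout from Update 2 is the goal, this long frame can simply be pivoted. A small follow-up sketch (the pd.to_numeric step is optional and assumes every value is numeric):

# one row per ts, one column per param; NaN where a param has no reading
wide = df.pivot(index="ts", columns="param", values="value").apply(pd.to_numeric)
print(wide)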
EDIT: With updated question:
import pandas as pd
from itertools import count
data = {
    "param1": [
        {"ts": 1669246574000, "value": "6.06"},
        {"ts": 1669242973000, "value": "6.5"},
    ],
    "param2": [
        {"ts": 1669246579000, "value": "7"},
        {"ts": 1669242973000, "value": "5"},
    ],
}

c = count(1)
df = pd.DataFrame(
    [
        {"ts": d["ts"], "param1": d["value"], "param2": next(c)}
        for v in data.values()
        for d in v
    ]
)
print(df)
Prints:
ts param1 param2
0 1669246574000 6.06 1
1 1669242973000 6.5 2
2 1669246579000 7 3
3 1669242973000 5 4
I have a jCal JSON array which I'd like to filter with jq. JSON arrays are somewhat new to me and I have been banging my head against the wall on this for hours...
The file looks like this:
[
"vcalendar",
[
[
"calscale",
{},
"text",
"GREGORIAN"
],
[
"version",
{},
"text",
"2.0"
],
[
"prodid",
{},
"text",
"-//SabreDAV//SabreDAV//EN"
],
[
"x-wr-calname",
{},
"unknown",
"Call log private"
],
[
"x-apple-calendar-color",
{},
"unknown",
"#ffaa00"
],
[
"refresh-interval",
{},
"duration",
"PT4H"
],
[
"x-published-ttl",
{},
"unknown",
"PT4H"
]
],
[
[
"vevent",
[
[
"dtstamp",
{},
"date-time",
"2015-04-05T16:42:10Z"
],
[
"created",
{},
"date-time",
"2015-02-18T16:44:04Z"
],
[
"uid",
{},
"text",
"9b23142b-8d86-3e17-2f44-2bed65b2e471"
],
[
"last-modified",
{},
"date-time",
"2015-04-05T16:42:10Z"
],
[
"description",
{},
"text",
"Phone call to +49xxxxxxxxxx lasted for 0 seconds."
],
[
"summary",
{},
"text",
"Outgoing: +49xxxxxxx"
],
[
"dtstart",
{},
"date-time",
"2015-02-18T10:58:12Z"
],
[
"dtend",
{},
"date-time",
"2015-02-18T10:58:44Z"
],
[
"transp",
{},
"text",
"OPAQUE"
]
],
[]
],
[
"vevent",
[
[
"dtstamp",
{},
"date-time",
"2015-04-05T16:42:10Z"
],
[
"created",
{},
"date-time",
"2015-01-09T19:12:05Z"
],
[
"uid",
{},
"text",
"c337e092-a012-5f5a-497f-932fbc6159e5"
],
[
"last-modified",
{},
"date-time",
"2015-04-05T16:42:10Z"
],
[
"description",
{},
"text",
"Phone call to +1xxxxxxxxxx lasted for 39 seconds."
],
[
"summary",
{},
"text",
"Outgoing: +1xxxxxxxxxx"
],
[
"dtstart",
{},
"date-time",
"2015-01-09T17:23:16Z"
],
[
"dtend",
{},
"date-time",
"2015-01-09T17:24:19Z"
],
[
"transp",
{},
"text",
"OPAQUE"
]
],
[]
]
]
]
I would like to extract dtstart, dtend, the target phone number, and the call duration from the description of each vevent that was created in a given month, e.g. January 2019 ("2019-01.*"), and output them as CSV.
This JSON is a bit strange because the information is stored position-based in arrays instead of objects.
Using the first element of an array ("vevent") to identify its contents is not best practice.
But anyway ... if this is the data source you are dealing with, this code should help you.
jq -r '..
  | arrays
  | select(.[0] == "vevent")[1]
  | [
      (.[] | select(.[0] == "dtstart") | .[3]),
      (.[] | select(.[0] == "dtend") | .[3]),
      (.[] | select(.[0] == "description") | .[3])
    ]
  | @csv
'
Alternatively, the repeated code can be moved into a function:
jq -r 'def getField($name; $idx): .[] | select(.[0] == $name) | .[$idx];
  ..
  | arrays
  | select(.[0] == "vevent")[1]
  | [ getField("dtstart"; 3), getField("dtend"; 3), getField("description"; 3) ]
  | @csv
'
Output
"2015-02-18T10:58:12Z","2015-02-18T10:58:44Z","Phone call to +49xxxxxxxxxx lasted for 0 seconds."
"2015-01-09T17:23:16Z","2015-01-09T17:24:19Z","Phone call to +1xxxxxxxxxx lasted for 39 seconds."
You can also extract phone number and duration with the help of regular expressions in jq:
jq -r 'def getField($name; $idx): .[] | select(.[0] == $name) | .[$idx];
  ..
  | arrays
  | select(.[0] == "vevent")[1]
  | [
      getField("dtstart"; 3),
      getField("dtend"; 3),
      (getField("description"; 3) | match("call to ([^ ]*)") | .captures[0].string),
      (getField("description"; 3) | match("(\\d+) seconds") | .captures[0].string)
    ]
  | @csv
'
Output
"2015-02-18T10:58:12Z","2015-02-18T10:58:44Z","+49xxxxxxxxxx","0"
"2015-01-09T17:23:16Z","2015-01-09T17:24:19Z","+1xxxxxxxxxx","39"
Not the most efficient solution, but quite understandable: first build an object out of key-value pairs, then filter and transform those objects.
.[2][][1] is a stream of events encoded as arrays.
Which means that:
.[2][][1]
| map({key:.[0], value:.[3]})
| from_entries
the above gives you a stream of objects; one object per event:
{
"dtstamp": "2015-04-05T16:42:10Z",
"created": "2015-02-18T16:44:04Z",
"uid": "9b23142b-8d86-3e17-2f44-2bed65b2e471",
"last-modified": "2015-04-05T16:42:10Z",
"description": "Phone call to +49xxxxxxxxxx lasted for 0 seconds.",
"summary": "Outgoing: +49xxxxxxx",
"dtstart": "2015-02-18T10:58:12Z",
"dtend": "2015-02-18T10:58:44Z",
"transp": "OPAQUE"
}
{
"dtstamp": "2015-04-05T16:42:10Z",
"created": "2015-01-09T19:12:05Z",
"uid": "c337e092-a012-5f5a-497f-932fbc6159e5",
"last-modified": "2015-04-05T16:42:10Z",
"description": "Phone call to +1xxxxxxxxxx lasted for 39 seconds.",
"summary": "Outgoing: +1xxxxxxxxxx",
"dtstart": "2015-01-09T17:23:16Z",
"dtend": "2015-01-09T17:24:19Z",
"transp": "OPAQUE"
}
Now plug that into the final program: select the wanted objects, add CSV headers, build the rows and ultimately convert to CSV:
["start", "end", "description"],
(
.[2][][1]
| map({key:.[0], value:.[3]})
| from_entries
| select(.created | startswith("2015-01"))
| [.dtstart, .dtend, .description]
)
| @csv
Raw output (-r):
"start","end","description"
"2015-01-09T17:23:16Z","2015-01-09T17:24:19Z","Phone call to +1xxxxxxxxxx lasted for 39 seconds."
If you need to further transform .description, you can use split or capture. Or use a different property, such as .summary, in your CSV rows. Only a single line needs to be changed.
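For example, a capture-based variant of the row-building step could look like this (a sketch; the regular expression assumes the description format shown above):

["start", "end", "number", "seconds"],
(
  .[2][][1]
  | map({key: .[0], value: .[3]})
  | from_entries
  | select(.created | startswith("2015-01"))
  | (.description | capture("call to (?<number>[^ ]+) lasted for (?<secs>[0-9]+) seconds")) as $d
  | [.dtstart, .dtend, $d.number, $d.secs]
)
| @csv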
I am using Laravel 8 and I want to display a list of events by joining tables that have a many-to-many relationship.
Here is how my tables look:
Users Table
| id | firstname | status |
|----|------------|--------|
| 1 | Amy | 0 |
| 2 | 2 amy | 0 |
| 3 | 3 amy | 1 |
| 4 | 4 amy | 0 |
| 5 | 5 amy | 1 |
| 6 | 6 amy | 1 |
Here is my pivot table
events_users Table
| id | event_id | user_id |
|----|------------|---------|
| 1 | 123 | 1 |
| 1 | 123 | 2 |
| 1 | 123 | 3 |
| 1 | 123 | 4 |
Here is my events table
events Table
| id | eventid | title |
|----|------------|---------|
| 1 | 123 | title |
| 1 | 124 | title 1 |
| 1 | 125 | title 2 |
| 1 | 126 | title 3 |
Here is my model fetching the results:
$events = DB::table('events')
    ->join('events_users', 'events.eventid', '=', 'events_users.event_id')
    ->join('users', 'users.id', '=', 'events_users.user_id')
    ->when($sortBy, function ($query, $sortBy) {
        return $query->orderBy($sortBy);
    }, function ($query) {
        return $query->orderBy('events.created_at', 'desc');
    })
    ->when($search_query, function ($query, $search_query) {
        return $query->where('title', 'like', '%' . $search_query . '%');
    })
    ->select(
        'title', 'eventuid', 'description', 'start_date',
        'end_date', 'start_time', 'end_time', 'status',
        'venue', 'address_line_1', 'address_line_2', 'address_line_3',
        'postcode', 'city', 'city_id', 'country', 'image',
        'users.firstname', 'users.lastname', 'users.avatar'
    )
    ->simplePaginate(15);
This results in duplicate entries:
Current Result:
{
"current_page": 1,
"data": [
{
"title": "Who in the newspapers, at the mushroom (she had.",
"eventuid": "be785bac-70d5-379f-a6f8-b35e66c8e494",
"description": "I'd been the whiting,' said Alice, 'and why it is I hate cats and dogs.' It was opened by another footman in livery came running out of sight before the trial's over!' thought Alice. 'I'm glad they.",
"start_date": "2000-11-17",
"end_date": "1988-02-24",
"start_time": "1972",
"end_time": "2062",
"status": 1,
"venue": "4379",
"address_line_1": "Kuhn Expressway",
"address_line_2": "2295 Kerluke Drive Suite 335",
"address_line_3": "Fredtown",
"postcode": "57094",
"city": "New Cassidyburgh",
"city_id": 530,
"country": "Cocos (Keeling) Islands",
"image": "https://via.placeholder.com/1280x720.png/00dd99?text=repellat",
"firstname": "Marielle",
"lastname": "Tremblay",
"avatar": "https://via.placeholder.com/640x480.png/002277?text=eum"
},
{
"title": "Who in the newspapers, at the mushroom (she had.",
"eventuid": "be785bac-70d5-379f-a6f8-b35e66c8e494",
"description": "I'd been the whiting,' said Alice, 'and why it is I hate cats and dogs.' It was opened by another footman in livery came running out of sight before the trial's over!' thought Alice. 'I'm glad they.",
"start_date": "2000-11-17",
"end_date": "1988-02-24",
"start_time": "1972",
"end_time": "2062",
"status": 1,
"venue": "4379",
"address_line_1": "Kuhn Expressway",
"address_line_2": "2295 Kerluke Drive Suite 335",
"address_line_3": "Fredtown",
"postcode": "57094",
"city": "New Cassidyburgh",
"city_id": 530,
"country": "Cocos (Keeling) Islands",
"image": "https://via.placeholder.com/1280x720.png/00dd99?text=repellat",
"firstname": "Floyd",
"lastname": "Waelchi",
"avatar": "https://via.placeholder.com/640x480.png/0033cc?text=inventore"
},
...
]
}
What I want to retrieve is something like this:
Expecting:
{
"current_page": 1,
"data": [
{
"title": "Who in the newspapers, at the mushroom (she had.",
"eventuid": "be785bac-70d5-379f-a6f8-b35e66c8e494",
"description": "I'd been the whiting,' said Alice, 'and why it is I hate cats and dogs.' It was opened by another footman in livery came running out of sight before the trial's over!' thought Alice. 'I'm glad they.",
"start_date": "2000-11-17",
"end_date": "1988-02-24",
"start_time": "1972",
"end_time": "2062",
"status": 1,
"venue": "4379",
"address_line_1": "Kuhn Expressway",
"address_line_2": "2295 Kerluke Drive Suite 335",
"address_line_3": "Fredtown",
"postcode": "57094",
"city": "New Cassidyburgh",
"city_id": 530,
"country": "Cocos (Keeling) Islands",
"image": "https://via.placeholder.com/1280x720.png/00dd99?text=repellat",
"users" : {[
{
"firstname": "Marielle",
"lastname": "Tremblay",
"avatar": "https://via.placeholder.com/640x480.png/002277?text=eum"
},
{
"firstname": "Amy",
"lastname": "Bond",
"avatar": "https://via.placeholder.com/640x480.png/005277?text=eum"
}
]
},
...
]
}
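For reference, the nested shape above is what Laravel's Eloquent eager loading produces, as opposed to a manual join that repeats the event row per user. A minimal sketch, assuming Event and User models exist (model names and the relationship wiring are assumptions based on the tables shown):

// app/Models/Event.php
namespace App\Models;

use Illuminate\Database\Eloquent\Model;

class Event extends Model
{
    public function users()
    {
        // Pivot table and key names come from the question;
        // note events_users.event_id references events.eventid, not events.id.
        return $this->belongsToMany(
            User::class, 'events_users', 'event_id', 'user_id', 'eventid', 'id'
        );
    }
}

// In the controller: each event then carries a nested "users" collection.
$events = Event::with('users')
    ->when($search_query, fn ($q, $s) => $q->where('title', 'like', '%'.$s.'%'))
    ->orderByDesc('created_at')
    ->simplePaginate(15);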
I have the following JSON array
[
{
"city": "Seattle",
"array10": [
"1",
"2"
]
},
{
"city": "Seattle",
"array11": [
"3"
]
},
{
"city": "Chicago",
"array20": [
"1",
"2"
]
},
{
"city": "Denver",
"array30": [
"3"
]
},
{
"city": "Reno",
"array50": [
"1"
]
}
]
My task is the following: for each "city" value (the values are known), get the names of its arrays, and for each array, print its contents. City names and array names are unique; array contents are not.
The result should look like the following:
Now working on Seattle
Seattle has the following arrays:
array10
array11
Content of the array10
1
2
Content of the array11
3
Now working on Chicago
Chicago has the following arrays:
array20
Content of the array20
1
2
Now working on Denver
Denver has the following arrays:
array30
Content of the array30
3
Now working on Reno
Reno has the following arrays:
array50
Content of the array50
1
Now, for each city name (which is provided/known) I can find the names of the arrays using the following filter (I can put the city names in variables, obviously):
jq -r '.[] | select(.city | test("Seattle")) | del(.city) | keys | @tsv'
Then I assign these names to a bash variable and iterate over them in a new loop to get the content of each array.
While I can get what I want with the above, my question is: is there a more efficient way to do this with jq?
And the second, related question: if my JSON had the structure below, would it make my task easier from a speed/efficiency/simplicity standpoint?
[
{
"name": "Seattle",
"content": {
"array10": [
"1",
"2"
],
"array11": [
"3"
]
}
},
{
"name": "Chicago",
"content": {
"array20": [
"1",
"2"
]
}
},
{
"name": "Denver",
"content": {
"array30": [
"3"
]
}
},
{
"name": "Reno",
"content": {
"array50": [
"1"
]
}
}
]
Using the -r command-line option, the following program produces the output shown below:
group_by(.city)[]
| .[0].city as $city
| map(keys_unsorted[] | select(test("^array"))) as $arrays
| "Now working on \($city)",
"\($city) has the following arrays:",
$arrays[],
(.[] | to_entries[] | select(.key | test("^array"))
| "Content of the \(.key)", .value[])
Output
Now working on Chicago
Chicago has the following arrays:
array20
Content of the array20
1
2
Now working on Denver
Denver has the following arrays:
array30
Content of the array30
3
Now working on Reno
Reno has the following arrays:
array50
Content of the array50
1
Now working on Seattle
Seattle has the following arrays:
array10
array11
Content of the array10
1
2
Content of the array11
3
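As for the second question: the restructured input would indeed make the program a bit simpler, because the array names and contents are already grouped per city. A sketch for that shape (also run with -r):

.[]
| "Now working on \(.name)",
  "\(.name) has the following arrays:",
  (.content | keys_unsorted[]),
  (.content | to_entries[] | "Content of the \(.key)", .value[])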
I have a multidimensional JSON array. I am accessing it in SQL Server and using OPENJSON to convert the JSON data to rows, and I am currently facing a problem fetching the data from the multidimensional array.
DECLARE @Json NVARCHAR(MAX)
SET @Json = '[{
"id": 0,
"healthandSafety": "true",
"estimationCost": "7878",
"comments": "\"Comments\"",
"image": [{
"imageData": "1"
}, {
"imageData": "2"
}, {
"imageData": "3"
}, {
"imageData": "4"
}, {
"imageData": "5"
}]
}, {
"id": 1,
"healthandSafety": "false",
"estimationCost": "90",
"comments": "\"89089\"",
"image": [{
"imageData": "6"
}, {
"imageData": "7"
}, {
"imageData": "8"
}, {
"imageData": "9"
}, {
"imageData": "10"
}, {
"imageData": "11"
}]
}]'
SELECT ImageJsonFile FROM OPENJSON (@Json) WITH (ImageJsonFile NVARCHAR(MAX) '$.image[0].imageData')
When I tried the above code I obtained the following output:
ImageJsonFile
1
6
The output I am expecting:
ImageJsonFile
1
2
3
4
5
You need to define the query path:
SELECT * FROM OPENJSON (@Json, '$[0].image') WITH (ImageJsonFile NVARCHAR(MAX) '$.imageData')
You've got an answer already, so this is just to add some more details:
The following will bring back all the data from your multidimensional array, not just the one array index you'd have to specify explicitly.
DECLARE @Json NVARCHAR(MAX) =
N'[{
"id": 0,
"healthandSafety": "true",
"estimationCost": "7878",
"comments": "\"Comments\"",
"image": [{
"imageData": "1"
}, {
"imageData": "2"
}, {
"imageData": "3"
}, {
"imageData": "4"
}, {
"imageData": "5"
}]
}, {
"id": 1,
"healthandSafety": "false",
"estimationCost": "90",
"comments": "\"89089\"",
"image": [{
"imageData": "6"
}, {
"imageData": "7"
}, {
"imageData": "8"
}, {
"imageData": "9"
}, {
"imageData": "10"
}, {
"imageData": "11"
}]
}]';
--The query
SELECT A.id
      ,A.healthandSafety
      ,A.estimationCost
      ,A.comments
      ,B.imageData
FROM OPENJSON(@Json)
     WITH (id              INT
          ,healthandSafety BIT
          ,estimationCost  INT
          ,comments        NVARCHAR(1000)
          ,[image]         NVARCHAR(MAX) AS JSON) A
CROSS APPLY OPENJSON(A.[image])
     WITH (imageData INT) B;
The result
| id | healthandSafety | estimationCost | comments | imageData |
|----|-----------------|----------------|----------|-----------|
| 0  | 1               | 7878           | Comments | 1         |
| 0  | 1               | 7878           | Comments | 2         |
| 0  | 1               | 7878           | Comments | 3         |
| 0  | 1               | 7878           | Comments | 4         |
| 0  | 1               | 7878           | Comments | 5         |
| 1  | 0               | 90             | 89089    | 6         |
| 1  | 0               | 90             | 89089    | 7         |
| 1  | 0               | 90             | 89089    | 8         |
| 1  | 0               | 90             | 89089    | 9         |
| 1  | 0               | 90             | 89089    | 10        |
| 1  | 0               | 90             | 89089    | 11        |
The idea in short:
We use the first OPENJSON to get the elements of the first level. The WITH clause names each element and returns [image] as NVARCHAR(MAX) AS JSON. This allows a second OPENJSON, via CROSS APPLY, to read the numbers from imageData, your nested dimension, while the id column serves as the grouping key.