Reading data from MongoDB that contains an array using Talend

I have a collection in my MongoDB that contains one field that is an array.
In the data below, the field 'Courses' is an array.
The JSON format of the data is like this:
{
"_id": {
"$oid": "60eb59b98a970a20865142e8"
},
"Name": "Sadia",
"Age": 24,
"Institute": "IBA",
"Courses": [{
"Name": "ITP",
"Grade": "A-"
}, {
"Name": "OOP",
"Grade": "A-"
}]
}
I am aware that there is a way to do this when the field is an object, but I could not find a way to read this data using Talend, since it contains an array.

Related

Ruby - parse JSON file with nested arrays to ruby hash without data loss

I have a file1.json with structure like this :
[
{
"uri": "features/hdp.feature",
"id": "as-a-user-i-want-to-use-house-detailed-page",
"keyword": "Feature",
"name": "As a user I want to use house detailed page",
"description": "",
"line": 2,
"tags": [
{
"name": "#hdp",
"line": 1
}
],
"elements": [
{
As you can see, it is an array with nested key:value pairs and other arrays. I need to convert it to a Ruby hash, but when I call JSON.parse(file1), it creates an array (http://prntscr.com/lqio6r) of Ruby hashes, arrays and so on. If I call JSON.parse(file1).reduce(Hash.new, :merge) or JSON.parse(file1).reduce(Hash.new, :update), as one of the answers on StackOverflow suggested, the resulting hash loses about 60% of the .json content. Can you please advise on how I can convert the JSON file to a Ruby hash without any data loss?
UPD: the full, untruncated array: https://gist.githubusercontent.com/M1khah/3337507e3ca1544e6098bc726bca90cb/raw/c8262ad753bd0eebf1180e111acd016ffc07d1a5/gistfile1.txt
What I want is a hash of hashes, something like this, instead of an array with nested hashes:
{
{
"uri": "features/hdp.feature",
"id": "as-a-user-i-want-to-use-house-detailed-page",
"keyword": "Feature",
"name": "As a user I want to use house detailed page",
"description": "",
"line": 2,
"tags": [
{
"name": "#hdp",
"line": 1
}
],
"elements": [
{
}

How to do a NoSQL linked query

I have a NoSQL (Cloudant) database.
- Within the database we have documents where one of the document fields represents the “table” (the type of document).
- Within the documents we have fields that represent links to other documents within the database.
For example:
{_id: 111, table:main, user_id:222, field1:value1, other1_id: 333}
{_id: 222, table:user, first:john, other2_id: 444}
{_id: 333, table:other1, field2:value2}
{_id: 444, table:other2, field3:value3}
We want a way of searching for _id:111
and getting back one document with data from the linked tables:
{_id:111, user_id:222, field1:value1, other1_id: 333, first:john, other2_id: 444, field2:value2, field3:value3}
Is there a way to do this?
There is flexibility on the structure of how we store or get the data back—any suggestions on how to better structure the data to make this possible?
The first thing to say is that there are no joins in Cloudant. If your schema relies on lots of joins, then you're working against the grain of Cloudant, which may mean extra complication for you or performance hits.
There is a way to de-reference other documents' ids in a MapReduce view. Here's how it works:
- Create a MapReduce view that emits the main document's body and the ids of its linked documents in the form { _id: 'linkedid' }.
- Query the view with include_docs=true to pull back the document AND the de-referenced ids in one go.
In your case, a map function like this:
function(doc) {
if (doc.table === 'main') {
emit(doc._id, doc);
if (doc.user_id) {
emit(doc._id + ':user', { _id: doc.user_id });
}
}
}
would allow you to pull back the main document and its linked user document in one API call by hitting the GET /mydatabase/_design/mydesigndoc/_view/myview?startkey="111"&endkey="111z"&include_docs=true endpoint:
{
"total_rows": 2,
"offset": 0,
"rows": [
{
"id": "111",
"key": "111",
"value": {
"_id": "111",
"_rev": "1-5791203eaa68b4bd1ce930565c7b008e",
"table": "main",
"user_id": "222",
"field1": "value1",
"other1_id": "333"
},
"doc": {
"_id": "111",
"_rev": "1-5791203eaa68b4bd1ce930565c7b008e",
"table": "main",
"user_id": "222",
"field1": "value1",
"other1_id": "333"
}
},
{
"id": "111",
"key": "111:user",
"value": {
"_id": "222"
},
"doc": {
"_id": "222",
"_rev": "1-6a277581235ca01b11dfc0367e1fc8ca",
"table": "user",
"first": "john",
"other2_id": "444"
}
}
]
}
Notice how we get two rows back: the first is the main document body, the second is the linked user.
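If a single merged document is wanted, as in the question, the rows returned by the view can be combined on the client. The following is a minimal sketch, assuming the response has the shape shown above. mergeLinkedRows is a hypothetical helper name, not part of the Cloudant API, and dropping _rev and table from the result is an assumption based on the desired output in the question.

```javascript
// Hypothetical helper (not part of Cloudant): merge the "doc" of every row
// returned by the view into one plain object.
function mergeLinkedRows(response) {
  const merged = {};
  for (const row of response.rows) {
    if (!row.doc) continue; // skip rows whose linked document was not found
    // Leave out the bookkeeping fields, which differ for every document.
    const { _id, _rev, table, ...fields } = row.doc;
    Object.assign(merged, fields);
  }
  // Every row carries the id of the main document that emitted it.
  if (response.rows.length > 0) {
    merged._id = response.rows[0].id;
  }
  return merged;
}

// Trimmed copy of the response shown above.
const response = {
  rows: [
    {
      id: "111",
      key: "111",
      doc: { _id: "111", _rev: "1-5791203eaa68b4bd1ce930565c7b008e", table: "main", user_id: "222", field1: "value1", other1_id: "333" },
    },
    {
      id: "111",
      key: "111:user",
      doc: { _id: "222", _rev: "1-6a277581235ca01b11dfc0367e1fc8ca", table: "user", first: "john", other2_id: "444" },
    },
  ],
};

const merged = mergeLinkedRows(response);
console.log(merged);
```

Only the user link is followed here, because the map function above emits only user_id. A map function sees one document at a time, so other1_id could be emitted in the same style from the main document, while other2_id lives in the user document and is not visible when the main document is mapped.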

How to project an element in an array field of a MongoDB collection?

MongoDB collection example (Person):
{
"id": "12345",
"schools": [
{
"name": "A",
"zipcode": "12345"
},
{
"name": "B",
"zipcode": "67890"
}
]
}
Desired output:
{
"id": "12345",
"schools": [
{
"zipcode": "12345"
},
{
"zipcode": "67890"
}
]
}
My current partial code for retrieving all:
collection.find({}, {id: true, schools: true})
I am querying the entire collection, but I only want to return the zipcode part of each school element, not the other fields (the actual school object might contain much more data that I do not need). I could retrieve everything and remove the unneeded fields (like the "name" of a school) in code, but that's not what I am looking for. I want to do this with a MongoDB query.
You can use the dot notation to project specific fields inside documents embedded in an array.
db.collection.find({},{id:true, "schools.zipcode":1}).pretty()
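To make the effect of that projection concrete, here is a sketch in plain JavaScript that mimics it on the in-memory example document. It only illustrates the shape of the result for this example and is not how MongoDB implements projection; projectSchoolZipcodes is a hypothetical name.

```javascript
// Hypothetical helper: mimics the projection {id: true, "schools.zipcode": 1}
// on a single in-memory document.
function projectSchoolZipcodes(person) {
  return {
    id: person.id,
    // Keep only the zipcode of every embedded school document.
    schools: person.schools.map((school) => ({ zipcode: school.zipcode })),
  };
}

// The example Person document from the question.
const person = {
  id: "12345",
  schools: [
    { name: "A", zipcode: "12345" },
    { name: "B", zipcode: "67890" },
  ],
};

const projected = projectSchoolZipcodes(person);
console.log(JSON.stringify(projected, null, 2));
```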

Identify documents in mongodb when matching two key:value pairs within a single array

I am trying to identify documents where both key-value pairs within a single array element match, using the aggregate pipeline. Specifically, I want to find documents where one array element has user_attribute.Name = Quests_In_Progress and user_attribute.Value = 3. Below is an example of such a document that I'm trying to match.
If I use
db.myCollection.aggregate({
$match: {
"user_attribute.Name": "Quests_In_Progress",
"user_attribute.Value": "3"
}
})
It will match every document that contains Quests_In_Progress for user_attribute.Name in one element of the array and contains "3" for user_attribute.Value, regardless of whether they exist in the same element of the array or not.
i.e.
db.myCollection.aggregate({
$match: {
"user_attribute.Name": "Quests_In_Progress",
"user_attribute.Value": "0"
}
})
will match the same document simply because one element of the array has a key:Value pair of Value:0 and another element of the array contains a key:value pair of Quests_In_Progress.
What I want to do is identify documents where both of those conditions are met within one element of the array.
I tried to do this with $elemMatch, but I couldn't get it to work. Plus the aggregate documentation doesn't indicate that $elemMatch works, so maybe that's why I couldn't get it to work.
Lastly, I need to use the aggregate pipeline, because there are a bunch of other things I have to do after finding these documents, specifically unwinding them.
{
"_id": ObjectId("5555bb32de938ce667f78ce00"),
"user_attribute": [{
"Value": "Facebook",
"Name": "Social_Connection"
}, {
"Name": "Total_Fireteam_Missions_Initiated",
"Value": "0"
}, {
"Name": "Quests_Completed",
"Value": "3"
}, {
"Name": "Item_Slots_Owned",
"Value": "36"
}, {
"Name": "Quests_In_Progress",
"Value": "3"
}, {
"Name": "Player_Progression",
"Value": "0"
}, {
"Value": "1",
"Name": "Characters_Owned"
}, {
"Name": "Quests_Started",
"Value": "6"
}, {
"Name": "Total_Friends",
"Value": "0"
}, {
"Name": "Device_Type",
"Value": "Phone"
}]
}
Try using $elemMatch:
db.myCollection.aggregate([{$match: {"user_attribute": {$elemMatch: {"Name": "Quests_In_Progress", "Value": "0"}}}}, {$out: "temp"}])
That query will find every document that has an array element with Name "Quests_In_Progress" and Value "0", and write the results into the collection temp.
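The difference between the two kinds of match can be sketched in plain JavaScript on an in-memory array. This is only an illustration of the semantics described in the question, not MongoDB code; matchesAnyElement and matchesSameElement are hypothetical names.

```javascript
// Shortened copy of the user_attribute array from the question.
const userAttribute = [
  { Name: "Total_Fireteam_Missions_Initiated", Value: "0" },
  { Name: "Quests_Completed", Value: "3" },
  { Name: "Quests_In_Progress", Value: "3" },
];

// Dot-notation style: each condition may be satisfied by a different element.
function matchesAnyElement(attrs, name, value) {
  return attrs.some((a) => a.Name === name) && attrs.some((a) => a.Value === value);
}

// $elemMatch style: a single element must satisfy both conditions.
function matchesSameElement(attrs, name, value) {
  return attrs.some((a) => a.Name === name && a.Value === value);
}

const looseZero = matchesAnyElement(userAttribute, "Quests_In_Progress", "0");
const strictZero = matchesSameElement(userAttribute, "Quests_In_Progress", "0");
const strictThree = matchesSameElement(userAttribute, "Quests_In_Progress", "3");

console.log(looseZero, strictZero, strictThree); // true false true
```

The first result is the unwanted match from the question: the name and the value "0" are found, but in different elements.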

Node.JS - How to access Values of Dictionary within an Array of a Key in a Dictionary?

I'm new to Node.js. I'm able to parse the JSON data and use console.log to print out the name and the number of badges.
var details = JSON.parse(body);
console.log(details.name, details.badges.length);
But I don't know how to get the data inside the elements of the badges array, such as id, name and url.
I tried
console.log(details.badges.length.id);
But nothing shows up. How can I access that? Thank you.
{
"name": "Andrew Chalkley",
"badges": [
{
"id": 49,
"name": "Newbie",
"url": "http:\/\/teamtreehouse.com\/chalkers",
"icon_url": "https:\/\/achievement-images.teamtreehouse.com\/Generic_Newbie.png",
"earned_date": "2012-07-23T19:59:34.000Z",
"courses": [
]
},
{
"id": 26,
"name": "Introduction",
"url": "http:\/\/teamtreehouse.com\/library\/html\/introduction",
"icon_url": "https:\/\/achievement-images.teamtreehouse.com\/HTML_Basics.png",
"earned_date": "2012-07-23T21:57:24.000Z",
"courses": [
{
"title": "HTML",
"url": "http:\/\/teamtreehouse.com\/library\/html",
"badge_count": 1
},
{
"title": "Introduction",
"url": "http:\/\/teamtreehouse.com\/library\/html\/introduction",
"badge_count": 1
}
]
}
]
}
It is an array, so you need the index, for example: details.badges[0].id
This returns the id of the first element (index 0).
.length only returns the number of elements in the array, so it cannot be used to reach the data inside it.
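To read every badge rather than one index, the array can be looped over. A minimal sketch, assuming body holds JSON text shaped like the sample in the question (shortened here and built inline so the snippet runs on its own):

```javascript
// Stand-in for the response body: a shortened version of the JSON above.
const body = JSON.stringify({
  name: "Andrew Chalkley",
  badges: [
    { id: 49, name: "Newbie", url: "http://teamtreehouse.com/chalkers" },
    { id: 26, name: "Introduction", url: "http://teamtreehouse.com/library/html/introduction" },
  ],
});

const details = JSON.parse(body);

// Visit each element of the array in turn.
for (const badge of details.badges) {
  console.log(badge.id, badge.name, badge.url);
}

// Collect a single field from every badge into a new array.
const badgeIds = details.badges.map((badge) => badge.id);
console.log(badgeIds);
```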
