Import an object of arrays from CSV to Neo4j

I have a collection in MongoDB that I need to migrate to Neo4j. To do that, I will export it to CSV and then import the resulting CSV into Neo4j using Cypher. The documents in the collection have an object with an array that contains objects with arrays inside them. Take a look at the JSON below:
"services" : [
{
"max_id" : "646767779849326594",
"log" : [
{
"date" : 1443024000,
"steps" : 6
},
{
"date" : 1442512800,
"steps" : 1
}
],
"service" : "home_timeline"
},
{
"max_id" : 0.0,
"log" : [
{
"date" : 1443024000,
"steps" : 4
},
{
"date" : 1442512800,
"steps" : 1
}
],
"service" : "user_timeline"
},
{
"max_id" : 0.0,
"log" : [
{
"date" : 1443024000,
"steps" : 6
},
{
"date" : 1442512800,
"steps" : 1
}
],
"service" : "mentions_timeline"
}
]
How can I import this into Neo4j properly? I already found a solution for importing arrays, but I didn't find anything similar to my problem. What should the header of the CSV look like? What should the Cypher code look like to get these objects?

You can pass the JSON document as a parameter to a Cypher query.
With your example, something like this:
// Older Neo4j versions use the {json} parameter syntax instead of $json.
WITH $json AS data
UNWIND data.services AS service
// Insert data for each service.
MERGE (s:Service {service_name: service.service})
SET s.max_id = service.max_id
// The array is called "log" in the document above.
FOREACH (log IN service.log | CREATE (l:Log {date: log.date, steps: log.steps})<-[:LOGGED]-(s))
There is also a tool for translating data from the MongoDB document data model to the Neo4j property graph model that you might find useful: https://github.com/neo4j-contrib/neo4j_doc_manager
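If you would still rather go through CSV, one option is to flatten the nested structure into one row per log entry. The sketch below is only an illustration: the column names (service, max_id, date, steps) and the helper names are assumptions, not something Neo4j requires.

```python
import csv
import io

# Hypothetical column layout: one CSV row per (service, log entry) pair.
FIELDNAMES = ["service", "max_id", "date", "steps"]


def flatten_services(doc):
    """Yield one flat row per log entry of every service in the document."""
    for service in doc.get("services", []):
        for entry in service.get("log", []):
            yield {
                "service": service["service"],
                "max_id": service["max_id"],
                "date": entry["date"],
                "steps": entry["steps"],
            }


def to_csv(doc):
    """Render the flattened rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(flatten_services(doc))
    return buffer.getvalue()


# Two of the services from the question, used as sample input.
sample = {"services": [
    {"max_id": "646767779849326594", "service": "home_timeline",
     "log": [{"date": 1443024000, "steps": 6}, {"date": 1442512800, "steps": 1}]},
    {"max_id": 0.0, "service": "user_timeline",
     "log": [{"date": 1443024000, "steps": 4}, {"date": 1442512800, "steps": 1}]},
]}

csv_text = to_csv(sample)
print(csv_text)
```

Each row could then be read with LOAD CSV WITH HEADERS, merging the Service node on the service column and creating one Log node per row. LOAD CSV reads every field as a string, so the numeric columns would need converting on the Cypher side.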

Related

How can I select a value from an array inside another array?

I'm trying to select all objects in my database which are between two dates.
Problem is: Dates are inside of an array
I already tried using both Robo 3T and Studio 3T with SQL, with no success.
{
"_id" : "5d9b703fe1bc4f138c5977b5",
"Number" : 112795,
"Finalizations" : [
{
"Value" : "89.95",
"Portions" : [
{
"Expiration" : ISODate("2019-11-06T02:00:00.000Z"),
"Value" : "89.95"
}
]
}
]
}
I need to return all the objects that have an "Expiration" between 11/01 and 11/25.
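Assuming your collection is called mycollection you can query using the mongo shell...
db.mycollection.find(
{
// Both bounds go into one condition: repeating the same key in the
// query object would silently discard the first bound.
// $elemMatch requires a single Portions element to satisfy both bounds.
"Finalizations.Portions": {
"$elemMatch": {
"Expiration": {"$gte": ISODate("2019-11-01"), "$lt": ISODate("2019-11-25")}
}
}
}
)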
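The same range condition can be built from Python. This is a minimal sketch, assuming pymongo as the driver; the function names are hypothetical. Both bounds are placed in a single condition because a Python dict, like a JavaScript object, keeps only the last value when a key is repeated. The `matches` helper is a pure-Python equivalent that lets you check the logic without a server.

```python
from datetime import datetime


def expiration_filter(start, end):
    """Build a filter for documents that have at least one portion
    whose Expiration lies in the half-open range [start, end)."""
    return {
        "Finalizations.Portions": {
            "$elemMatch": {"Expiration": {"$gte": start, "$lt": end}}
        }
    }


def matches(doc, start, end):
    """Pure-Python version of the same condition, for local checking."""
    return any(
        start <= portion["Expiration"] < end
        for finalization in doc.get("Finalizations", [])
        for portion in finalization.get("Portions", [])
    )


# The document from the question, with ISODate replaced by datetime.
doc = {
    "_id": "5d9b703fe1bc4f138c5977b5",
    "Number": 112795,
    "Finalizations": [
        {"Value": "89.95",
         "Portions": [{"Expiration": datetime(2019, 11, 6, 2, 0), "Value": "89.95"}]}
    ],
}

query = expiration_filter(datetime(2019, 11, 1), datetime(2019, 11, 25))
# With pymongo this would be used as: db.mycollection.find(query)
print(matches(doc, datetime(2019, 11, 1), datetime(2019, 11, 25)))
```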

MongoDB: Updating A specific array element in a sub document

I'm a novice with mongodb so please excuse me if the question is a little basic. I have a mongo collection with a relatively complex document structure. The documents contain sub documents and arrays. I need to add additional data to some of the documents in this collection. A cut down version of the document is:
"date" : ISODate("2018-08-07T08:00:00.000+0000"),
.
. <<-- Other fields
.
"basket" :
[
{
"assetId" : NumberInt(639),
"securityId" : NumberInt(12470),
.
. <<-- Other fields
.
"exGroup" : [
. << -- Fields......
.
. << -- New Data will go here
]
}
.
. << More elements
]
The following (abridged) aggregation query finds the documents that need modifying:
{
"$match" : {
"date" : {
"$gte" : ISODate("2018-08-07T00:00:00.000+0000"),
"$lt" : ISODate("2018-08-08T00:00:00.000+0000")
}
}
},
{
"$unwind" : {
"path" : "$basket"
}
},
{
"$unwind" : {
"path" : "$basket.exGroup"
}
},
{
"$project" : {
"_id" : 1.0,
"date" : 1.0,
"assetId" : "$basket.assetId",
"securityId" : "$basket.securityId",
"exGroup" : "$basket.exGroup"
}
},
{
"$unwind" : {
"path" : "$exGroup"
}
},
{
"$match" : {
"exGroup.order" : {
"$exists" : true
}
}
}
For each document returned by the mongo query I need to (in python) retrieve a set of additional data from a SQL database and then append this data to the original mongo document as shown above. The set of new fields will be the same, the data will be different. What is not clear to me is how, once I have the data I go about updating the array values.
Could somebody give me a pointer?
Try this, it works for me!
mySchema.aggregate([
// your aggregation code
], function (err, docList) {
// for each doc in docList
async.each(docList, function (doc, callback) {
// match the parent document and the array element to update
var query = {$and: [{idField: doc.idField}, {"myArray.ArrayId": doc.myArray.ArrayId}]};
// update or create a field in the array element matched by the query
// (the positional $ operator refers to that element)
var update = {$set: {"myArray.$.FieldNameToCreateOrUpdate": value}};
var projection = {field1: 1, field2: 1, field3: 1};
mySchema.findOneAndUpdate(query, update, projection, function (err, done) {
// return early so the callback is never called twice
if (err) { return callback(err); }
callback(null);
});
}, function (err) {
// code if error
// code if no error
});
});
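Since the question asks for Python, here is a rough sketch of how the update could be expressed for pymongo. It assumes MongoDB 3.6 or later, because the array being updated is nested two levels deep (basket, then exGroup) and the positional `$` operator only covers one level; filtered positional operators with `array_filters` are used instead. The function name, the `new_fields` argument, and the choice to target every exGroup element that has an `order` field are assumptions for illustration.

```python
def build_exgroup_update(doc_id, asset_id, security_id, new_fields):
    """Return (filter, update, array_filters) for adding new_fields to the
    exGroup elements of one basket entry in one document."""
    flt = {"_id": doc_id}
    # $[b] and $[g] are placeholders resolved by the array filters below.
    update = {
        "$set": {
            f"basket.$[b].exGroup.$[g].{name}": value
            for name, value in new_fields.items()
        }
    }
    array_filters = [
        # Pick the basket element identified by the aggregation result.
        {"b.assetId": asset_id, "b.securityId": security_id},
        # Pick the exGroup elements that carry an "order" field.
        {"g.order": {"$exists": True}},
    ]
    return flt, update, array_filters


# Hypothetical values standing in for one aggregation result and the
# extra data fetched from the SQL database.
flt, update, array_filters = build_exgroup_update(
    doc_id="some-id", asset_id=639, security_id=12470,
    new_fields={"price": 10.5},
)
print(update)
```

With a pymongo collection this would be applied as `collection.update_one(flt, update, array_filters=array_filters)`, once per document returned by the aggregation.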

MongoDB: Check for missing documents using a model tree structure with an array of ancestors

I'm using a model tree structure with an array of ancestors and I need to check if any document is missing.
{
"_id" : "GbxvxMdQ9rv8p6b8M",
"type" : "article",
"ancestors" : [ ]
}
{
"_id" : "mtmTBW8nA4YoCevf4",
"parent" : "GbxvxMdQ9rv8p6b8M",
"ancestors" : [
"GbxvxMdQ9rv8p6b8M"
]
}
{
"_id" : "J5Dg4fB5Kmdbi8mwj",
"parent" : "mtmTBW8nA4YoCevf4",
"ancestors" : [
"GbxvxMdQ9rv8p6b8M",
"mtmTBW8nA4YoCevf4"
]
}
{
"_id" : "tYmH8fQeTLpe4wxi7",
"refType" : "reference",
"parent" : "J5Dg4fB5Kmdbi8mwj",
"ancestors" : [
"GbxvxMdQ9rv8p6b8M",
"mtmTBW8nA4YoCevf4",
"J5Dg4fB5Kmdbi8mwj"
]
}
My attempt would be to check whether each ancestor id exists. If the lookup fails, that document is missing and the data structure is corrupted.
let missing = [];
Collection.find().forEach(r => {
if (r.ancestors) {
r.ancestors.forEach(a => {
if (!Collection.findOne(a))
missing.push(r._id);
});
}
});
But doing it like this will need MANY db calls. Is it possible to optimize this?
Maybe I could get an array with all unique ancestor ids first and check if these documents are existing within one db call??
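First, collect all distinct ancestor ids from your collection.
var allAncestorIds = db.<collectionName>.distinct("ancestors");
Then find out which of those ancestor ids actually exist as documents.
var existingIds = db.<collectionName>.distinct("_id", {_id : {$in : allAncestorIds}});
Every ancestor id that is referenced but does not exist is a missing document. Collect those and insert them into a collection.
// ids are plain strings here, so indexOf is enough for the comparison
allAncestorIds.filter(function (id) {
return existingIds.indexOf(id) === -1;
}).forEach(function (missingId) {
db.missing.insert({_id : missingId});
});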
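The check boils down to a set difference: every id referenced in some ancestors array, minus the ids that exist as documents. Below is a small in-memory sketch of that idea in Python; the function name is hypothetical, and the sample data is the tree from the question with one document deliberately left out.

```python
def find_missing_ancestors(docs):
    """Return the set of ancestor ids that are referenced by some document
    but do not exist as a document themselves."""
    existing_ids = {doc["_id"] for doc in docs}
    referenced_ids = {a for doc in docs for a in doc.get("ancestors", [])}
    return referenced_ids - existing_ids


complete = [
    {"_id": "GbxvxMdQ9rv8p6b8M", "ancestors": []},
    {"_id": "mtmTBW8nA4YoCevf4", "ancestors": ["GbxvxMdQ9rv8p6b8M"]},
    {"_id": "J5Dg4fB5Kmdbi8mwj",
     "ancestors": ["GbxvxMdQ9rv8p6b8M", "mtmTBW8nA4YoCevf4"]},
    {"_id": "tYmH8fQeTLpe4wxi7",
     "ancestors": ["GbxvxMdQ9rv8p6b8M", "mtmTBW8nA4YoCevf4", "J5Dg4fB5Kmdbi8mwj"]},
]

# Simulate corruption by dropping the second document.
corrupted = [doc for doc in complete if doc["_id"] != "mtmTBW8nA4YoCevf4"]

print(find_missing_ancestors(complete))
print(find_missing_ancestors(corrupted))
```

Against a real database, the two sets would come from two calls (one for the distinct ancestor ids, one for the ids that exist), so the number of round trips no longer grows with the number of documents.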

MongoDB Aggregate Array with Two Fields

I have a vehicles collection with the following schema; all the articles are just general products (no child products included):
{
"_id" : ObjectId("554995ac3d77c8320f2f1d2e"),
"model" : "ILX",
"year" : 2015,
"make" : "Acura",
"motor" : {
"cylinder" : 4,
"liters" : "1.5"
},
"products" : [
ObjectId("554f92433d77c803836fefe3"),
...
]
}
I also have a products collection. Some of its documents are general products related to warehouse SKUs, and some are "son" products that fit multiple general products; these son products are also related to warehouse SKUs:
general products
{
"_id" : ObjectId("554b9f223d77c810e8915539"),
"brand" : "Airtex",
"product" : "E7113M",
"type" : "Fuel Pump",
"warehouse_sku" : [
"1_5551536f3d77c870fc388a04",
"2_55515e163d77c870fc38b00a"
]
}
child product
{
"_id" : ObjectId("55524d0c3d77c8ba9cb2d9fd"),
"brand" : "Performance",
"product" : "P41K",
"type" : "Repuesto Bomba Gasolina",
"general_products" : [
ObjectId("554b9f223d77c810e8915539"),
ObjectId("554b9f123d77c810e891552f")
],
"warehouse_sku" : [
"1_555411043d77c8066b3b6720",
"2_555411073d77c8066b3b6728"
]
}
I want to obtain a list of general products (_id, and general_products inside child products) for every warehouse_sku that follows the pattern 1_.
I have created an aggregate query with the following structure:
list_products = db.getCollection('products').aggregate([
{$match: {warehouse_sku: /^1\_/}},
{$group: { "_id": "$_id" } }
])
That query successfully gives me this result:
{ "_id" : ObjectId("55524d0c3d77c8ba9cb2d9fd") }
{ "_id" : ObjectId("554b9f223d77c810e8915539") }
But I need a plain list of general product ids so that I can use $in on the vehicles collection:
list_products = [ ObjectId("55524d0c3d77c8ba9cb2d9fd"), ObjectId("554b9f223d77c810e8915539")]
Example: db.vehicles.find({products:{$in: list_products}})
I could not get this last query to work.
Use the aggregation cursor's map() method to return an array of ObjectIds as follows:
var pipeline = [
{$match: {warehouse_sku: /^1\_/}},
{$group: { "_id": "$_id" } }
],
list_products = db.getCollection('products')
.aggregate(pipeline)
.map(function(doc){ return doc._id });
The find() cursor's map() would work here as well:
var query = {'warehouse_sku': /^1\_/},
list_products = db.getCollection('products')
.find(query)
.map(function(doc){ return doc._id });
UPDATE
In pymongo, you could use a lambda function with the map function. Because map expects a function to be passed in, it also happens to be one of the places where lambda routinely appears:
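import re
# db is assumed to be a pymongo Database handle.
# A raw string avoids the invalid "\_" escape; the underscore needs no escaping.
regx = re.compile(r"^1_")
products_cursor = db.products.find({"warehouse_sku": regx})
list_products = list(map((lambda doc: doc["_id"]), products_cursor))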

mongodb - mapreducing two collections where one collection has ids in an array of array

I'm very new to mongoDB and having some problems on joining two collections.
I've read some posts on mapReduce to perform NOSQL way of joining but still having some difficulties here
Collection 1: attraction
{
"_id" : "0001333b-e485-4fee-a0e2-9b7dc338d5a2",
"types" : "Shops",
"name" : "name",
"geo_location" : {
"lat" : 36.0567700000000002,
"lon" : -112.1354520000000008
},
"overall_rating" : 10.0000000000000000,
"num_of_review" : 6,
"review" : [
{
"review_ids" : [
"66ea1cd8-da34-40dc-8ad6-f30df5de9c2c",
"76f51c8d-d2a8-4609-8b7c-c2b0c386e35c",
"185c962a-fcfe-4d03-a3ac-86398be6312a",
"2212535b-28c6-423e-91f7-cc1dfb407d79",
"7e0f1d85-e79e-4bec-9e9c-7dfb03223816",
"f19a83a6-c6ef-4cbe-b90d-f6187bd50baa"
]
}
]
}
Collection 2: attraction_review
{
"_id" : "7e0f1d85-e79e-4bec-9e9c-7dfb03223816",
"user_id" : "somename",
"review_id" : "r122796525",
"unified_id" : "0001333b-e485-4fee-a0e2-9b7dc338d5a2",
"source_id" : "d1057961",
"review_url" : "someURL",
"title" : "some title",
"overall_rating" : 10,
"review_date" : "dates",
"content" : "some contents here",
"source" : "source",
"traval_date" : "dates",
"sort" : ""
}
Basically I need to keep (or copy) the reviews in the attraction_review whose _id has appeared in the review_ids array of the attraction collection.
In the example above, the matching review is the one with _id "7e0f1d85-e79e-4bec-9e9c-7dfb03223816", which also appears in review_ids.
It is guaranteed that the attraction_review collection contains every id listed in review_ids for all records in the attraction collection.
The difficulty here is that the review_ids array is within review array, and I am not sure how I would go about mapping many instances of ids.
I would be grateful for some suggestions.
Many thanks
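One way to think about the nested review_ids is to flatten them first and then select by membership. The sketch below works on in-memory lists and is only an illustration of that idea, not a mapReduce solution; the function names are hypothetical and the sample documents are trimmed versions of the ones above.

```python
def referenced_review_ids(attractions):
    """Flatten review[].review_ids[] across all attraction documents."""
    return {
        review_id
        for attraction in attractions
        for review in attraction.get("review", [])
        for review_id in review.get("review_ids", [])
    }


def select_reviews(attractions, reviews):
    """Keep only the reviews whose _id is referenced by some attraction."""
    wanted = referenced_review_ids(attractions)
    return [review for review in reviews if review["_id"] in wanted]


attractions = [{
    "_id": "0001333b-e485-4fee-a0e2-9b7dc338d5a2",
    "review": [{"review_ids": [
        "66ea1cd8-da34-40dc-8ad6-f30df5de9c2c",
        "7e0f1d85-e79e-4bec-9e9c-7dfb03223816",
    ]}],
}]
reviews = [
    {"_id": "7e0f1d85-e79e-4bec-9e9c-7dfb03223816", "title": "some title"},
    {"_id": "not-referenced-anywhere", "title": "other title"},  # made-up id
]

kept = select_reviews(attractions, reviews)
print([review["_id"] for review in kept])
```

With pymongo, the flattened ids could be passed to a query of the form `{"_id": {"$in": list(ids)}}` on attraction_review, and the matching documents copied to a new collection.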
