I have three csv files:
sample_users.csv
user_id, first_name, location
sample_orders.csv
order_id, user_id, product,order_ts
sample_products.csv
product_id, product
I was able to create the relationship between sample_users and sample_orders, but now I want to get the product_id of each product ordered in the sample_orders table, and that is failing. There is no error, but the relationship is not created. What is going wrong?
//load user nodes
LOAD CSV WITH HEADERS FROM 'file:///sample_users.csv' AS row
MERGE(u:User {user_id:row.user_id, name:row.first_name, location:row.location})
RETURN count(u);
//load order nodes
LOAD CSV WITH HEADERS FROM 'file:///sample_orders.csv' AS row WITH row WHERE row.order_id IS NOT NULL
MERGE(o:Orders {order_id:row.order_id, order_ts:row.order_ts, user_id_2:row.user_id, product:row.product})
RETURN count(o);
//load product nodes
LOAD CSV WITH HEADERS FROM 'file:///sample_products_v2.csv' AS row WITH row where row.p_name IS NOT NULL
MERGE(p:Products {product_id:row.p_id, product_name:row.p_name})
RETURN count(p);
//Create relationships
LOAD CSV WITH HEADERS FROM 'file:///sample_users.csv' AS row
MATCH(u:User{user_id:row.user_id})
MATCH(o:Orders{user_id_2:row.user_id})
MERGE(u)-[:HAS_ORDERED]->(o)
RETURN *;
LOAD CSV WITH HEADERS FROM 'file:///sample_products_v2.csv' AS row
MATCH(o:Orders{product:row.product})
MATCH(p:Products{product_name:row.p_name})
MERGE(p)-[:IS_ITEM]->(o)
RETURN *;
There is a typographical error in your script. Your column names for Products are product_id and product, but the script reads row.p_id and row.p_name from sample_products_v2.csv.
This should be the loading script:
//load product nodes
LOAD CSV WITH HEADERS FROM 'file:///sample_products.csv' AS row WITH row where row.product IS NOT NULL
MERGE(p:Products {product_id:row.product_id, product_name:row.product})
RETURN count(p);
Thus you should use
LOAD CSV WITH HEADERS FROM 'file:///sample_products.csv' AS row
MATCH(o:Orders{product:row.product_id})
MATCH(p:Products{product_name:row.product})
MERGE(p)-[:IS_ITEM]->(o)
RETURN *;
Also check whether Orders.product can be found in Products.product_id by running this query:
MATCH (o:Orders)
MATCH (p:Products) WHERE p.product_id = o.product
RETURN o, p LIMIT 5
If the query does not return any result, then your Orders and Products are not matching.
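The same check can be run on the CSV files before loading anything. Below is a minimal Python sketch, assuming the column names listed in the question (product in the orders file; product_id and product in the products file). The sample rows and the helper name unmatched_products are made up for illustration.

```python
import csv
import io

def unmatched_products(orders_csv, products_csv, product_key="product"):
    """Return order product values that never appear in the products file."""
    # skipinitialspace handles headers written as "product_id, product"
    products = csv.DictReader(io.StringIO(products_csv), skipinitialspace=True)
    known = {row[product_key].strip() for row in products if row.get(product_key)}
    orders = csv.DictReader(io.StringIO(orders_csv), skipinitialspace=True)
    missing = []
    for row in orders:
        value = (row.get("product") or "").strip()
        if value and value not in known and value not in missing:
            missing.append(value)
    return missing

# Hypothetical sample data, not taken from the real files
ORDERS = "order_id, user_id, product,order_ts\n1,10,apple,2021-01-01\n2,11,pear,2021-01-02\n"
PRODUCTS = "product_id, product\n100,apple\n101,banana\n"

print(unmatched_products(ORDERS, PRODUCTS))
```

Passing product_key="product_id" tests the other pairing. Whichever call returns an empty list points to the column that Orders.product can be matched against.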
The Snowflake documentation has an example that loads 3 columns from a Parquet file:
"copy into cities
from (select
$1:continent::varchar,
$1:country:name::varchar,
$1:country:city::variant
from @sf_tut_stage/cities.parquet);
"
If I have 1000+ columns, can I avoid listing all the columns like $1:col1, $1:col2 ... $1:col1000?
You may want to check out our INFER_SCHEMA function to dynamically obtain the columns and data types: https://docs.snowflake.com/en/sql-reference/functions/infer_schema.html
The EXPRESSION column in its output should get you 95% of the way there.
select *
from table(
infer_schema(
location=>'@mystage'
, file_format=>'my_parquet_format'
)
);
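To illustrate how the EXPRESSION values could be assembled into a COPY statement, here is a rough Python sketch. The expression strings are hypothetical stand-ins for what INFER_SCHEMA returns, and build_copy_statement is a made-up helper, not a Snowflake API.

```python
def build_copy_statement(table, stage_path, expressions):
    """Join per-column expressions into a COPY INTO ... FROM (SELECT ...) statement."""
    select_list = ",\n    ".join(expressions)
    return (
        f"copy into {table}\n"
        f"from (select\n    {select_list}\n"
        f"from {stage_path});"
    )

# Hypothetical expressions, shaped like the EXPRESSION column of INFER_SCHEMA
exprs = ["$1:continent::TEXT", "$1:country::VARIANT"]
print(build_copy_statement("cities", "@sf_tut_stage/cities.parquet", exprs))
```

With 1000+ columns the list of expressions would come from the query result instead of being typed by hand.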
I have CSV files with a varying number of columns: sometimes 2, sometimes 43. I have mapped these columns to the Snowflake metadata table. I want to insert values into the target table, but the column name can differ between CSV files, for example subject, subject_name or subject_names. In the target table I have only one column for this, called subject_name. So if the column "subject" in the CSV file is null, I need to check the "subject_name" column, and if "subject_name" is also null, I need to check "subject_names". Is there any way to check whether these columns have null values? I must add that the columns in the CSV are not always in the same position, so I can't use select $1 from @stage
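The lookup order described here (subject, then subject_name, then subject_names) can be sketched in Python, addressing columns by header name rather than by position. This only illustrates the fallback logic and is not a Snowflake solution. The sample data and the helper first_non_empty are hypothetical.

```python
import csv
import io

CANDIDATES = ["subject", "subject_name", "subject_names"]

def first_non_empty(row, candidates=CANDIDATES):
    """Return the first non-empty value among the candidate columns, else None."""
    for name in candidates:
        value = row.get(name)  # None when the file has no such column
        if value is not None and value.strip() != "":
            return value.strip()
    return None

# Hypothetical file: column order differs from the candidate order, one column is absent
DATA = "id,subject_names,subject\n1,maths,\n2,,physics\n3,,\n"
rows = csv.DictReader(io.StringIO(DATA))
result = [(r["id"], first_non_empty(r)) for r in rows]
print(result)
```

Because the lookup is by header name, a missing column and an empty value are handled the same way, wherever the column sits in the file.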
I have two record types ('orders' and 'menulist') that are joined through 'orderitem' by orderID and menuID
[1]: https://i.stack.imgur.com/5kF8R.png
[2]: https://i.stack.imgur.com/sabUE.png
[3]: https://i.stack.imgur.com/0sqk1.png
I was trying to promote each orderitem record into a relationship in the graph.
What I did is:
LOAD CSV WITH HEADERS FROM "file:///orders.csv" AS row
CREATE (n:Orders)
SET n = row
LOAD CSV WITH HEADERS FROM "file:///menulist.csv" AS row
CREATE (n:Menu)
SET n = row
CREATE INDEX FOR (m:Menu) ON (m.MenuID)
CREATE INDEX FOR (o:Orders) ON (o.OrderID)
LOAD CSV WITH HEADERS FROM "file:///orderitem.csv" AS row
MATCH (m:Menu), (o:Orders)
WHERE m.MenuID = row.MenuID AND o.orderID = row.orderID
CREATE (o)-[oi:CONTAINS]->(m)
SET oi = row,
oi.Quantity = toInteger(row.Quantity)
But I got (no changes, no records). It seems there is an error here; can anyone help to solve it?
A more reliable way of creating nodes from CSV is to follow the examples found in this link: https://neo4j.com/docs/cypher-manual/current/clauses/load-csv/#load-csv-import-data-from-a-csv-file
In your example, replace SET n = row with an explicit list of properties, defined one at a time, so that the property names on the node are under your control. Property names in Cypher are case-sensitive, so also make sure the names in your WHERE clause match the stored ones exactly: your index is on o.OrderID, but the WHERE clause compares o.orderID. See below.
LOAD CSV WITH HEADERS FROM "file:///orders.csv" AS row
CREATE ( :Orders {ID: row.OrderID, <and so on> , paymentID: row.PaymentID })
Also notice that the variable n on the :Orders node is not needed.
In SQL Server, I have a products table that contains JSON records as shown below, and I want the comma-separated keys of each JSON row in table format.
Input:
JsonData
{"add-on":[{"ID":2546978,"ActionType":"Added","Add-On-Name":"tetss","Add-On-Price":"3"},{"ID":2546979,"ActionType":"Added","Add-On-Name":"test Addon2","Add-On-Price":"5"},{"ID":2546980,"ActionType":"Added","Add-On-Name":"tdgd","Add-On-Price":"2"}]}
{"Name":"testing ABC","Menu Column Location":"1"}
{"Features":[{"ActionType":"Added","Feature-Option":"25"}]}
{"add-on":[{"ID":2546993,"ActionType":"Updated"}],"sub add-on":[{"ID":"","ActionType":"Added","Sub-Add-On-Description":"des3"},{"ID":"","ActionType":"Added","Sub-Add-On-Description":"des34"},{"ID":"","ActionType":"Added","Sub-Add-On-Description":"des35"}]}
Desired Output:
ID,ActionType,Add-On-Name,Add-On-Price
Name,Menu Column Location
ActionType,Feature-Option,Features
ID,ActionType,Sub-Add-On-Description,add-on,sub add-on
Note:
JSON keys should not be repeated within a record. The keys for all JSON records should be returned in one table.
I have two Hive tables as shown below, along with their columns
Tbl_Customer
Id
Name
Tbl_Cntct
Id
Phone
One Id can have many phone numbers so I have a table
Tbl_All
Id
Name
Phn_List ARRAY
My question is how to load data from Tbl_Customer and Tbl_Cntct into Tbl_All.
I can do it in Pig, but I want to do the same in Hive.
Thanks
Insert overwrite table Tbl_All
select cus.id,cus.name,collect_set(ctc.phone)
from Tbl_Customer cus join Tbl_Cntct ctc on cus.id = ctc.id
group by cus.id,cus.name
The collect_set UDAF collects the column values into an array with no duplicates. If you want to keep all the values, including duplicates, use the collect_list function instead.
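The difference between the two functions can be illustrated with a small Python sketch. The contact rows are made up, and collect is a hypothetical helper that only mimics the grouping; the order of elements in Hive's arrays may differ from what this sketch produces.

```python
from collections import defaultdict

def collect(rows, distinct):
    """Group phone values by id; drop duplicates when distinct is True."""
    grouped = defaultdict(list)
    for id_, phone in rows:
        if distinct and phone in grouped[id_]:
            continue  # collect_set-like behaviour: skip values already seen
        grouped[id_].append(phone)
    return dict(grouped)

# Hypothetical rows of (Id, Phone), with one duplicate for Id 1
contacts = [(1, "555-0100"), (1, "555-0100"), (1, "555-0101"), (2, "555-0199")]
print(collect(contacts, distinct=True))   # like collect_set
print(collect(contacts, distinct=False))  # like collect_list
```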