I am using a Cloudant NoSQL database to store historical data. I want the attributes in the data column to appear in separate columns. Please help!
Data in the database
It looks like your data has this form:
{
d: {
id: 'xyz',
ts: 123,
ax: 'abc'
}
}
Cloudant's grid view only renders keys at the top of the object tree. If you stored data like this:
{
id: 'xyz',
ts: 123,
ax: 'abc'
}
then the grid view would show id/ts/ax in their own columns.
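If you cannot change how the documents are produced, one option is to hoist the nested fields to the top level before writing each document. The following is a minimal sketch in plain Python, assuming the nested object always sits under the key `d`; the function name `hoist_nested` and the `_id` value are made up for illustration, and no Cloudant client calls are shown.

```python
def hoist_nested(doc, key="d"):
    """Return a copy of doc with the fields under `key` moved to the top level."""
    nested = doc.get(key)
    if not isinstance(nested, dict):
        # Nothing to hoist: return an unchanged shallow copy.
        return dict(doc)
    flat = {k: v for k, v in doc.items() if k != key}
    for k, v in nested.items():
        # Keep an existing top-level value (e.g. _id, _rev) rather than overwrite it.
        flat.setdefault(k, v)
    return flat


# Hypothetical document in the nested shape shown above.
doc = {"_id": "doc1", "d": {"id": "xyz", "ts": 123, "ax": "abc"}}
flat_doc = hoist_nested(doc)
print(flat_doc)
```

The flattened document then has id, ts and ax as top-level keys, which is the shape the grid view can show as columns.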
Related
I am trying to create a table which has a complex data type. The data type is listed below.
map<string, array<map<string,string>>>. The data I am looking at is as follows:
{'tags': [{'type': 'type1', 'value': 'value1'}, {'type': 'type1', 'value': 'value1'}]}
Kindly help me with the create table statement in Hive, as well as with inserting values into this Hive table.
I have an external table INV_EXT_TBL (1 variant column named "VALUE") in Snowflake and it has 6000 rows (each row is a JSON file). Each JSON record is wrapped in double quotes because it is in dynamo_json format.
What is the best approach to parse all the JSON files and convert them into table format so that I can run SQL queries? I have given the sample format of just the top 3 JSON files.
"{
""Item"": {
""sortKey"": {
""S"": ""DR-1630507718""
},
""vin"": {
""S"": ""1FMCU9GD2JUA29""
}
}
}"
"{
""Item"": {
""sortKey"": {
""S"": ""affc5dd0875c-1630618108496""
},
""vin"": {
""S"": ""SALCH625018""
}
}
}"
"{
""Item"": {
""sortKey"": {
""S"": ""affc5dd0875c-1601078453607""
},
""vin"": {
""S"": ""KL4CB018677""
}
}
}"
I created a local table and inserted data into it from the external table by casting the data type. Is this the correct approach, or should I use the parse_json function against the JSON files to store data in the local table?
insert into DB.SCHEMA.INV_HIST(VIN,SORTKEY)
(SELECT value:Item.vin.S::string AS VIN, value:Item.sortKey.S::string AS SORTKEY FROM INV_EXT_TBL);
I resolved this by creating a materialized view that casts the variant column of the external table. This got rid of the outer double quotes, and performance improved considerably. I did not proceed with the table creation approach.
CREATE OR REPLACE MATERIALIZED VIEW DB.SCHEMA.MVW_INV_HIST
AS
SELECT value:Item.vin.S::string AS VIN, value:Item.sortKey.S::string AS SORTKEY
FROM INV_EXT_TBL;
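For anyone who wants to see what the path expression value:Item.vin.S is doing, or who needs to pre-process such files outside Snowflake, here is a rough Python sketch of the same unwrapping. It assumes each record is CSV-quoted (outer double quotes, inner quotes doubled) exactly as in the samples in the question; the helper names `unwrap` and `parse_record` are made up for illustration.

```python
import csv
import io
import json


def unwrap(attr):
    """Strip one DynamoDB type descriptor, e.g. {"S": "abc"} -> "abc"."""
    (type_tag, value), = attr.items()
    if type_tag == "M":
        return {k: unwrap(v) for k, v in value.items()}
    if type_tag == "L":
        return [unwrap(v) for v in value]
    # S, N, BOOL and other scalar tags are returned as stored (N stays a string).
    return value


def parse_record(raw):
    """Undo the CSV-style quoting, parse the JSON, and flatten the Item map."""
    text = next(csv.reader(io.StringIO(raw, newline="")))[0]
    item = json.loads(text)["Item"]
    return {name: unwrap(attr) for name, attr in item.items()}


# First sample record from the question, written on one line.
raw = '"{""Item"": {""sortKey"": {""S"": ""DR-1630507718""}, ""vin"": {""S"": ""1FMCU9GD2JUA29""}}}"'
row = parse_record(raw)
print(row)
```

Each parsed row is a flat dictionary with one entry per attribute, which corresponds to the VIN and SORTKEY columns selected in the view.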
Let's say we have the following relations in our database:
A Company, that can have zero to many employees
An Employee is hired by one and only one Company
An Employee can be the manager of zero or many employees
...and the following entities in our database:
Company: Id=1, Name='Contoso'
Employee: Id=1, CompanyId=1, ManagerId=Null, Name='Jack Sparrow', Salary=1000
Employee: Id=2, CompanyId=1, ManagerId=1, Name='Will Turner', Salary=900
...which will (in my case) result in this JSON representation when converted by NewtonSoft:
{
Id: 1,
Name: 'Contoso',
"Employees": [
{Id:1, CompanyId:1, ManagerId: null, Name: 'Jack Sparrow', Salary:1000},
{Id:2, CompanyId:1, ManagerId: 1, Name: 'Will Turner', Salary:900,
Manager:{Id:1, CompanyId:1, ManagerId: null, Name: 'Jack Sparrow', Salary:1000}
}
]
}
As you can see, there are two instances of the Employee/Manager 'Jack Sparrow'.
When binding one of these instances to the UI, I would like both instances to stay in sync if changes occur.
Is there any mechanism (or trick) in AngularJS v1 which could help me achieve this?
Please note that this is just a simple data structure to illustrate my problem.
One could argue that the 'Manager'-object should not be fully represented, but in my case it is...
Change your query so that it doesn't return the Manager property. Then just fix up the data when you receive it.
I assume your ManagerId field for Will Turner is supposed to be 1 (it's 2 in the JSON sample) in your example to refer to Jack Sparrow. If so all you need is something like:
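// Index employees by Id; array positions need not match Ids.
var byId = {};
angular.forEach(employees, function(emp) { byId[emp.Id] = emp; });
angular.forEach(employees, function(emp) {
  if (emp.ManagerId !== null) {
    // Point Manager at the shared Employee object, not a copy.
    emp.Manager = byId[emp.ManagerId];
  }
});
Of course, you don't even need to change the query so that it omits the manager; you can still run this code when you receive the data and overwrite the redundant manager data.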
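The reason this keeps things in sync is that the manager and the employee become the same object rather than two copies. Here is a language-neutral sketch of that idea, written in Python rather than AngularJS; the function name `link_managers` is made up, and the data is the example from the question.

```python
def link_managers(employees):
    """Replace each embedded Manager copy with a reference to the shared employee."""
    by_id = {emp["Id"]: emp for emp in employees}
    for emp in employees:
        # Drop any duplicated copy that came back from the server.
        emp.pop("Manager", None)
        if emp["ManagerId"] is not None:
            emp["Manager"] = by_id.get(emp["ManagerId"])
    return employees


employees = [
    {"Id": 1, "CompanyId": 1, "ManagerId": None, "Name": "Jack Sparrow", "Salary": 1000},
    {"Id": 2, "CompanyId": 1, "ManagerId": 1, "Name": "Will Turner", "Salary": 900,
     "Manager": {"Id": 1, "CompanyId": 1, "ManagerId": None,
                 "Name": "Jack Sparrow", "Salary": 1000}},
]
link_managers(employees)

# A change made through one reference is visible through the other.
employees[0]["Salary"] = 1200
print(employees[1]["Manager"]["Salary"])
```

Because both paths lead to one object, an edit made through either is seen by the other, which is the behaviour the bindings rely on.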
I am currently running a job to transfer data from one table to another via a query. But I can't seem to find a way to convert a column into a nested field containing the column as a child field. For example, I have a column customer_id: 3 and I would like to convert it to {"customer": {"id":3}}. Below is a snippet of my job data.
query='select * FROM ['+ BQ_DATASET_ID+'.'+Table_name+'] WHERE user="'+user+'"'
job_data={"projectId": PROJECT_ID,
'jobReference': {
'projectId': PROJECT_ID,
'job_id': str(uuid.uuid4())
},
'configuration': {
'query': {
'query': query,
'priority': 'INTERACTIVE',
'allowLargeResults': True,
"destinationTable":{
"projectId": PROJECT_ID,
"datasetId": user,
"tableId": destinationTable,
},
"writeDisposition": "WRITE_APPEND"
},
}
}
Unfortunately, if the "customer" RECORD does not exist in the input schema, it is not currently possible to generate that nested RECORD field with child fields through a query. We have features in the works that will allow schema manipulation like this via SQL, but I don't think it's possible to accomplish this today.
I think your best option today would be an export, transformation to desired format, and re-import of the data to the desired destination table.
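The transformation step in the middle could look roughly like the sketch below. It assumes the exported rows are available as Python dictionaries and that the re-import takes newline-delimited JSON; the helper names `nest_customer` and `to_ndjson` and the sample row are made up, and the export and load calls themselves are not shown.

```python
import json


def nest_customer(row):
    """Move the flat customer_id column into a nested customer.id field."""
    out = {k: v for k, v in row.items() if k != "customer_id"}
    if "customer_id" in row:
        out["customer"] = {"id": row["customer_id"]}
    return out


def to_ndjson(rows):
    """Serialize rows as newline-delimited JSON, one record per line."""
    return "\n".join(json.dumps(nest_customer(r), sort_keys=True) for r in rows)


# Hypothetical exported rows.
rows = [{"customer_id": 3, "user": "alice"}, {"customer_id": 7, "user": "bob"}]
ndjson = to_ndjson(rows)
print(ndjson)
```

The destination table would need a schema in which customer is a RECORD with a child field id for the load to produce the nested shape.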
A simple solution is to run
select customer_id as customer.id ....
I am trying to create a table which has a complex data type. And the data types are listed below.
array
map
array< map < String,String> >
I am trying to create a data structure that combines these 3 types. Is it possible to create this in Hive? My table DDL looks like below.
create table complexTest(names array<String>,infoMap map<String,String>, deatils array<map<String,String>>)
row format delimited
fields terminated by '/'
collection items terminated by '|'
map keys terminated by '='
lines terminated by '\n';
And my sample data looks like below.
Abhieet|Test|Complex/Name=abhi|age=31|Sex=male/Name=Test,age=30,Sex=male|Name=Complex,age=30,Sex=female
Whenever I query the data from the table, I get the values below.
["Abhieet"," Test"," Complex"] {"Name":"abhi","age":"31","Sex":"male"} [{"Name":null,"Test,age":null,"31,Sex":null,"male":null},{"Name":null,"Complex,age":null,"30,Sex":null,"female":null}]
This is not what I am expecting. Could you please help me find out what the DDL should be, if it is possible at all, for the data type array<map<String,String>>?
I don't think this is possible using the inbuilt serde. If you know in advance what the values in your maps are going to be, then I think a better way of approaching this would be to convert your input data to JSON, and then use the Hive json serde:
Sample data:
{'Name': ['Abhieet', 'Test', 'Complex'],
'infoMap': {'Sex': 'male', 'Name': 'abhi', 'age': '31'},
'details': [{'Sex': 'male', 'Name': 'Test', 'age': '30'}, {'Sex': 'female', 'Name': 'Complex', 'age': '30'}]
}
Table definition code:
create table complexTest
(
names array<string>,
infomap struct<Name:string,
age:string,
Sex:string>,
details array<struct<Name:string,
age:string,
Sex:string>>
)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
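Converting the delimited input to JSON can be done with a small script. This is a rough sketch that assumes the input layout from the question ('/' between fields, '|' between collection items, '=' between key and value, and ',' between the entries of each inner map); the function names are made up, and the JSON keys are chosen here to match the column names in the table definition, which is an assumption on my part.

```python
import json


def parse_map(text, pair_sep):
    """Turn 'k1=v1<sep>k2=v2' into a dict, splitting each pair on the first '='."""
    return dict(pair.split("=", 1) for pair in text.split(pair_sep))


def line_to_json(line):
    """Convert one delimited input line into one JSON record on a single line."""
    names, info, details = line.rstrip("\n").split("/")
    record = {
        "names": names.split("|"),
        "infomap": parse_map(info, "|"),
        "details": [parse_map(m, ",") for m in details.split("|")],
    }
    return json.dumps(record)


line = ("Abhieet|Test|Complex/Name=abhi|age=31|Sex=male/"
        "Name=Test,age=30,Sex=male|Name=Complex,age=30,Sex=female")
json_line = line_to_json(line)
print(json_line)
```

json.dumps emits double-quoted strings and keeps each record on one line, which as far as I know is what the JSON serdes expect.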
This can be handled with array of structs using the following query.
create table complexStructArray(custID String,nameValuePairs array<struct< key:String, value:String>>) row format delimited fields terminated by '/' collection items terminated by '|' map keys terminated by '=' lines terminated by '\n';
Sample data:
101/Name=Madhavan|age=30
102/Name=Ramkumar|age=31
Though a struct allows duplicate key values, unlike a map, the above query should meet the need if the data has unique key values.
A select query gives the following output.
hive> select * from complexStructArray;
101 [{"key":"Name","value":"Madhavan"},{"key":"age","value":"30"}]
102 [{"key":"Name","value":"Ramkumar"},{"key":"age","value":"31"}]
Sample data: {"Name": ["Abhieet", "Test", "Complex"],"infoMap": {"Sex":"male", "Name":"abhi", "age":31},"details": [{"Sex":"male", "Name":"Test", "age":30}, {"Sex":"female", "Name":"Complex", "age":30}]}
Table definition code:
#hive>
create table complexTest
(names array<string>,infomap struct<Name:string,
age:string,
Sex:string>,details array<struct<Name:string,
age:string,
Sex:string>>)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'