Load Data to Hive array column - arrays

I have two Hive tables as shown below, along with their columns
Tbl_Customer
Id
Name
Tbl_Cntct
Id
Phone
One Id can have many phone numbers so I have a table
Tbl_All
Id
Name
Phn_List ARRAY
My question is how to load data from Tbl_Customer and Tbl_Cntct into Tbl_All.
I can do it in Pig, but I want to do the same in Hive.
Thanks

Insert overwrite table Tbl_All
select cus.id,cus.name,collect_set(ctc.phone)
from Tbl_Customer cus join Tbl_Cntct ctc on cus.id = ctc.id
group by cus.id,cus.name
The collect_set UDAF collects the column's values into an array with no duplicates. If you want to keep all the values, including duplicates, use the collect_list function instead.
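To make the difference between the two aggregates concrete without a Hive cluster, here is a minimal Python sketch that mimics the join and grouping with plain dictionaries. The sample rows and the helper name join_and_collect are invented for illustration; it is not Hive code, and Hive itself does not guarantee the order of elements in the resulting array.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for Tbl_Customer (Id, Name)
# and Tbl_Cntct (Id, Phone). Customer 1 has a duplicated phone number.
customers = [(1, "Ann"), (2, "Bob")]
contacts = [(1, "555-0100"), (1, "555-0101"), (1, "555-0100"), (2, "555-0200")]


def join_and_collect(customers, contacts, distinct):
    """Mimic SELECT id, name, collect_set/collect_list(phone) ... GROUP BY id, name."""
    phones = defaultdict(list)
    for cust_id, phone in contacts:
        if distinct and phone in phones[cust_id]:
            continue  # collect_set behaviour: skip values already collected
        phones[cust_id].append(phone)
    # Inner join: customers without any contact row are dropped.
    return [(cid, name, phones[cid]) for cid, name in customers if cid in phones]


as_set = join_and_collect(customers, contacts, distinct=True)    # like collect_set
as_list = join_and_collect(customers, contacts, distinct=False)  # like collect_list

print(as_set)
print(as_list)
```

With collect_set, customer 1 ends up with two phone numbers; with collect_list, the duplicate is kept and the array has three entries.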

Related

Create a cross reference table using Golden record and associate other records to the record

I have a table where I have ranked all the rows based on the created date column and have the rank indicated on the table as below
main table
I would like to create a cross-reference table with the golden record as the recurring column and the other two records as associated records as below.
output desired
I would like to know how I can achieve this using SQL.
I have tried creating a separate table with all ID numbers (Rank = 1) and then joining it with the main table to get the records with rank 1, 2 and 3 associated with it, but it doesn't work the way I intend.
output
I have not tested but something like this should work. You might want to add a name_id field.
select b.id_number,a.id_number
from table a
join table b on a.name=b.name
where b.rank=1
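The self-join above can be tried out with a small Python and SQLite sketch. The table name main_table, the column rnk (used instead of rank, which is a reserved word in some databases), and the sample rows are all assumptions, since the original data was only shown as an image. The extra condition a.rnk > 1 is also an assumption: it keeps the golden record from being paired with itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per record, ranked per name by created date.
conn.execute("CREATE TABLE main_table (id_number INTEGER, name TEXT, rnk INTEGER)")
conn.executemany(
    "INSERT INTO main_table VALUES (?, ?, ?)",
    [(101, "Acme", 1), (102, "Acme", 2), (103, "Acme", 3),
     (201, "Beta", 1), (202, "Beta", 2)],
)

# b is the golden record (rank 1); a is every other record with the same name.
xref = conn.execute(
    """
    SELECT b.id_number AS golden_id, a.id_number AS associated_id
    FROM main_table a
    JOIN main_table b ON a.name = b.name
    WHERE b.rnk = 1
      AND a.rnk > 1          -- assumption: do not pair the golden record with itself
    ORDER BY b.id_number, a.rnk
    """
).fetchall()

for row in xref:
    print(row)
```

Dropping the a.rnk > 1 condition gives the behaviour of the query as originally posted, where each golden record also appears as its own associated record.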

Should I be using multiple PostgreSQL queries for this?

I have a couple of tables.
Table 1 (player_main)
This table has a "id" column which is a UUID type and is the primary key.
Table 2 (game_main)
This table also has an "id" column of the UUID type and is a primary key.
Table 3 (game_members)
This table has a column "member_id" of UUID type which is a primary key and a foreign reference to player_main(id).
There is also a "game_id" column of UUID type which references game_main(id).
My problem is, if a player connects to the server, I want to be able to load up their "game data" by querying the database and receiving all the data to construct their data object. I am given the UUID of the player which is stored in player_main(id). I need to obtain the game_main(id) and a list of all the game member ids that correspond to that game_main(id).
How would I do this? I've attempted different types of joins with a where clause to identify the game_members(member_id), but that only returns the row that corresponds to the member who has just joined, not all of the members for that game.
Any help is appreciated, thank you.
Edit
I have tried the following query:
SELECT t1.member_id, t2.*
FROM game_members t1
INNER JOIN game_main t2
ON t1.game_id = t2.id
WHERE t1.member_id = <some UID>
which resulted in 1 row and 2 columns, the columns being "game_members.member_id" and "game_main.id". The value in the first column is the UUID that I specified in the where clause and the value in the second column is the UUID of the game. I was expecting to see 2 rows of data with the same "game_main.id" but with different "game_members.member_id" values, as I currently have 2 entries in the same game.
Edit 2
As requested, I will provide sample data for my tables as well as the output that I wish to see.
Sample Data:
[player_main]
id
------------------------------------|
863fdf91-86fb-49a7-9232-bcb596e3a86f|
7af64cd7-72a2-410f-9b5c-620127fca0ac|
c7b1952a-b263-470f-9cae-9d5e6d7a8186|
[game_main]
id
------------------------------------|
dd76c680-5853-40a6-b757-0457d1a7e95f|
ca4f5b1f-0f8c-4f10-969c-464ccf207d9c|
[game_members]
member_id | game_id
------------------------------------|------------------------------------
863fdf91-86fb-49a7-9232-bcb596e3a86f|dd76c680-5853-40a6-b757-0457d1a7e95f
7af64cd7-72a2-410f-9b5c-620127fca0ac|dd76c680-5853-40a6-b757-0457d1a7e95f
c7b1952a-b263-470f-9cae-9d5e6d7a8186|ca4f5b1f-0f8c-4f10-969c-464ccf207d9c
[desired output]
This is what the game info of the player's current game should look like. The query should take only the player's UUID and return the following if the UUID were equal to 863fdf91-86fb-49a7-9232-bcb596e3a86f
member_id | game_id
------------------------------------|------------------------------------
863fdf91-86fb-49a7-9232-bcb596e3a86f|dd76c680-5853-40a6-b757-0457d1a7e95f
7af64cd7-72a2-410f-9b5c-620127fca0ac|dd76c680-5853-40a6-b757-0457d1a7e95f
Try doing a self-join of the game_members table. The following query will generate all unique members who have any game in common with games used by a certain player.
SELECT DISTINCT t2.member_id, t1.game_id
FROM game_members t1
INNER JOIN game_members t2
ON t1.game_id = t2.game_id
WHERE t1.member_id = <some UID>
You could break down your problem into:
looking up the player's current game in game_members and then,
looking up the game's current players from the same game_members table
This approach translates to the following query involving a self-join:
select
gm2.member_id,
gm1.game_id
from
game_members gm1
inner join game_members gm2 on
gm1.game_id = gm2.game_id -- lookup the game's current players
where
gm1.member_id = '863fdf91-86fb-49a7-9232-bcb596e3a86f' -- lookup the player's current game
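As a quick check, the self-join can be run against the sample data from the question using Python's built-in sqlite3 module. SQLite stands in for PostgreSQL here and the UUIDs are stored as plain text, both of which are simplifications; the function name members_of_players_game is made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the PostgreSQL table; UUIDs are kept as TEXT.
conn.execute(
    "CREATE TABLE game_members (member_id TEXT PRIMARY KEY, game_id TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO game_members VALUES (?, ?)",
    [
        ("863fdf91-86fb-49a7-9232-bcb596e3a86f", "dd76c680-5853-40a6-b757-0457d1a7e95f"),
        ("7af64cd7-72a2-410f-9b5c-620127fca0ac", "dd76c680-5853-40a6-b757-0457d1a7e95f"),
        ("c7b1952a-b263-470f-9cae-9d5e6d7a8186", "ca4f5b1f-0f8c-4f10-969c-464ccf207d9c"),
    ],
)


def members_of_players_game(conn, player_id):
    """Return (member_id, game_id) for every member of the given player's game."""
    query = """
        SELECT gm2.member_id, gm1.game_id
        FROM game_members gm1
        INNER JOIN game_members gm2 ON gm1.game_id = gm2.game_id
        WHERE gm1.member_id = ?          -- bound parameter, not string formatting
        ORDER BY gm2.member_id
    """
    return conn.execute(query, (player_id,)).fetchall()


rows = members_of_players_game(conn, "863fdf91-86fb-49a7-9232-bcb596e3a86f")
for row in rows:
    print(row)
```

A single query is enough here: the first alias finds the player's game and the second alias lists everyone in it, so there is no need for two round trips to the database.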

how to fill a table with using some of the data of other tables

I need to create a new table called “customer” that includes some of the columns from the “user” table and also from the “project” table. I built my suppliers table with specific column names and I need to fill its columns using data from the other tables. The end goal is this: when a user creates a new account and project, the customer table is automatically filled with some of the values from the other two tables, under different column names.
INFO: I have three different user types: “suppliers”, “customers” and “managers”. I keep their information (including user type) in one table called users.
Use the following query as an example and write a query that inserts rows into the destination table from the source table.
Example:
INSERT INTO TestTable (FirstName, LastName)
SELECT FirstName, LastName
FROM Person.Contact
WHERE EmailPromotion = 2
Note: Use Join in the select query to join two tables
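A minimal sketch of INSERT ... SELECT with a join, run through Python's sqlite3 module. Every table and column name here (users, projects, customer, customer_name, project_title and so on) is hypothetical, because the question does not show the real schema; the point is only that the SELECT can rename and combine columns from both source tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical source tables
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, user_type TEXT);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, owner_id INTEGER, title TEXT);
    -- Hypothetical destination table with different column names
    CREATE TABLE customer (customer_name TEXT, project_title TEXT);

    INSERT INTO users VALUES (1, 'alice', 'customer'),
                             (2, 'bob',   'supplier'),
                             (3, 'carol', 'customer');
    INSERT INTO projects VALUES (10, 1, 'Website'),
                                (11, 2, 'Warehouse'),
                                (12, 3, 'Mobile app');
    """
)

# Fill the destination table from a join of the two source tables,
# keeping only users whose type is 'customer'.
conn.execute(
    """
    INSERT INTO customer (customer_name, project_title)
    SELECT u.username, p.title
    FROM users u
    JOIN projects p ON p.owner_id = u.id
    WHERE u.user_type = 'customer'
    """
)

customers = conn.execute(
    "SELECT customer_name, project_title FROM customer ORDER BY customer_name"
).fetchall()
print(customers)
```

The column list after INSERT INTO maps the selected columns to the destination names by position, which is how the differing column names are handled.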
The first step would be to combine the data from the different tables using a join. If you can write a query whose result matches your new table, then creating the table is simply a call like the one below.
Create table CUSTOMER as (Select ...)
"when user create a new account and project.." this is something you plan on doing at run time in your application and not something you need to collate using sql at this point?

SQL - Updating multiple records for each record with a matching id in a table

I am a bit new to updating multiple records and wanted to know the best way to approach this. I am writing a stored procedure where I basically have two tables:
one that maps a server id to a user id
and another with record information for each user id, with multiple columns of values.
Basically, here is how it is going to work:
Get all the matching user ids for the specific server id in the tb_UserServerMap table
then, for each userId in the tb_setting table, update the columns with the new values
Basic structure of your stored procedure would be:
CREATE PROCEDURE Blah
@Server_ID int /* or whatever data type is appropriate */
as
UPDATE ts
SET
ColumnA = 10 /* New value for column A - maybe passed as a parameter? */
/* More columns here */
FROM
tb_setting ts
inner join
tb_UserServerMap usm
on
ts.user_id = usm.user_id
WHERE
usm.server_id = @Server_ID
I can't fill in more of it without knowing the names of columns to be updated, how those values are obtained, data types, etc.
You don't need a foreach,
Update tblName set firstCol = val1, secondCol = val2 where id in (id1, id2, id3)
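The set-based idea can be sketched in Python with sqlite3, where the list of ids comes from a subquery on the mapping table rather than a loop. SQLite stands in for SQL Server here, and the column names user_id, server_id and ColumnA are taken from the first answer's guess at the schema, so treat them as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tb_UserServerMap (server_id INTEGER, user_id INTEGER);
    CREATE TABLE tb_setting (user_id INTEGER PRIMARY KEY, ColumnA INTEGER);

    INSERT INTO tb_UserServerMap VALUES (1, 100), (1, 101), (2, 102);
    INSERT INTO tb_setting VALUES (100, 0), (101, 0), (102, 0);
    """
)

# One UPDATE statement touches every user mapped to the server;
# the subquery replaces the "foreach userId" loop.
cur = conn.execute(
    """
    UPDATE tb_setting
    SET ColumnA = ?
    WHERE user_id IN (
        SELECT user_id FROM tb_UserServerMap WHERE server_id = ?
    )
    """,
    (10, 1),
)
updated = cur.rowcount

settings = conn.execute(
    "SELECT user_id, ColumnA FROM tb_setting ORDER BY user_id"
).fetchall()
print(updated, settings)
```

Users 100 and 101 belong to server 1 and get the new value; user 102 belongs to a different server and is left untouched.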

Storing multiple employee IDs in one column of data

Web app is being written in classic ASP with a MSSQL backend. On this particular page, the admin can select 1 or any/all of the employees to assign the project to. I'm trying to figure out a simple way to store the employee IDs of the people assigned to it in one column.
The list of employees is generated from another table and can be dynamic (firing or hiring) so I want the program to be flexible enough to change based on these table changes.
Basically, I need to know how to assign multiple people to a project so that the assignment can later be called up on a different page or from a different query.
Sorry for the n00bish question, but thanks!
Don't store multiple IDs in one column! Create another table that holds the primary key of your existing table and a single ID that you want to store. You can then insert multiple rows into this new table, creating a 1:m (one-to-many) relationship. For example, let's look at an order table:
order:
order_id
order_date
and I have a product table...
product:
product_id
product_name
Now, you could go down the road of adding a column to order that lets you list the products in the order, but that would be bad form. What you want instead is something like...
order_item:
order_item_id
order_id
product_id
quantity
unit_price
You can then perform a join to get all of the products for a particular order...
select
product.*
from orders
inner join order_item on order_item.order_id = orders.order_id
inner join product on product.product_id = order_item.product_id
where orders.order_id = 5
Here's an example order_id of 5, and this will get all of the products in that order.
You need to create another table that stores these values. This new table would store one row for each ID and link back to the original record through the original record's ID.
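Applied to the project and employee case from the question, the linking table might look like the following Python and SQLite sketch. The table and column names (employee, project, project_employee) and the sample rows are hypothetical, and SQLite is used only so the example runs on its own; the same design carries over to MSSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE project (project_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Linking table: one row per (project, employee) assignment.
    CREATE TABLE project_employee (
        project_id  INTEGER NOT NULL REFERENCES project(project_id),
        employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
        PRIMARY KEY (project_id, employee_id)
    );

    INSERT INTO employee VALUES (1, 'Dana'), (2, 'Eli'), (3, 'Fay');
    INSERT INTO project VALUES (7, 'Intranet redesign');
    """
)

# The admin ticks two employees for project 7: insert one row per selection.
selected_employee_ids = [1, 3]
conn.executemany(
    "INSERT INTO project_employee (project_id, employee_id) VALUES (?, ?)",
    [(7, emp_id) for emp_id in selected_employee_ids],
)

# Any other page or query can now look up who is assigned to the project.
assigned = [
    name
    for (name,) in conn.execute(
        """
        SELECT e.name
        FROM project_employee pe
        JOIN employee e ON e.employee_id = pe.employee_id
        WHERE pe.project_id = ?
        ORDER BY e.name
        """,
        (7,),
    )
]
print(assigned)
```

Because the employee list lives in its own table, hiring or removing someone only changes rows there and in the linking table, and no stored list of IDs has to be parsed or rewritten.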

Resources