I have two tables:
1,'hello'
2,'world'
4,'this'
and
1,'john'
3,'king'
and I want to produce a table
1,'hello','john'
2,'world',''
3,'','king'
4,'this',''
I am currently using the Pig command:
JOIN A BY code FULL OUTER,
B BY code;
but this gives me the output:
1,'hello',1,'john'
2,'world',,''
,'',3,'king'
4,'this',,''
I need the two code columns combined into one. How can I do this? Thanks
Yes, JOIN will always produce output like this; it is the expected behavior in Pig. One option is to try the GROUP operator instead of JOIN.
a.txt
1,'hello'
2,'world'
4,'this'
b.txt
1,'john'
3,'king'
PigScript:
A = LOAD 'a.txt' USING PigStorage(',') AS (code:int,name:chararray);
B = LOAD 'b.txt' USING PigStorage(',') AS (code:int,name:chararray);
C = GROUP A BY code,B BY code;
D = FOREACH C GENERATE group,(IsEmpty(A.name) ? TOTUPLE('') : BagToTuple(A.name)) AS aname,(IsEmpty(B.name) ? TOTUPLE('') : BagToTuple(B.name)) AS bname;
E = FOREACH D GENERATE group,FLATTEN(aname),FLATTEN(bname);
DUMP E;
Output:
(1,'hello','john')
(2,'world',)
(3,,'king')
(4,'this',)
BagToTuple() is not available in older Pig releases. If yours lacks it, download pig-0.11.0.jar, which includes it, and add it to your classpath.
Download the jar from this link:
http://www.java2s.com/Code/Jar/p/Downloadpig0110jar.htm
A = load 'a' using PigStorage(',') as (code:int,name:chararray);
B = load 'b' using PigStorage(',') as (code:int,name:chararray);
C = join A by code full outer ,B by code;
D = foreach C generate
(A::code IS NULL ? B::code : A::code) AS code,
A::name as aname, B::name as bname;
dump D;
the result is
(1,'hello','john')
(2,'world',)
(3,,'king')
(4,'this',)
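The same idea, sketched in Python rather than Pig purely to make the logic easy to step through: do the full outer join first, then collapse the two code columns by taking B's code whenever A's is missing. The function names here are hypothetical, and the sketch assumes each code appears at most once per table.

```python
def full_outer_join(a_rows, b_rows):
    """Mimic a full outer join on code: one row per code, None where a side is missing."""
    a_by_code = dict(a_rows)
    b_by_code = dict(b_rows)
    joined = []
    for code in sorted(set(a_by_code) | set(b_by_code)):
        a_code = code if code in a_by_code else None
        b_code = code if code in b_by_code else None
        joined.append((a_code, a_by_code.get(code), b_code, b_by_code.get(code)))
    return joined


def merge_code(joined):
    """Equivalent of (A::code IS NULL ? B::code : A::code) AS code."""
    return [
        (b_code if a_code is None else a_code, a_name, b_name)
        for a_code, a_name, b_code, b_name in joined
    ]


a = [(1, "hello"), (2, "world"), (4, "this")]
b = [(1, "john"), (3, "king")]

joined = full_outer_join(a, b)
merged = merge_code(joined)
for row in merged:
    print(row)
```

The intermediate `joined` list has the same shape as the output the question complains about; `merged` is the three-column result.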
You can use UNION and then group by the code.
Union A,B will give you:
1,'hello'
2,'world'
4,'this'
1,'john'
3,'king'
Now group by the code. This will give you:
1, {'hello', 'john'}
2, {'world'}
3, {'king'}
4, {'this'}
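Now you just need a UDF to parse the bag. In the UDF, iterate over each group and generate the output in your format. Note that after a plain union the bag no longer records which table each value came from, so the UDF needs some way to tell them apart.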
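A rough Python sketch of the union-then-group approach follows. It is not Pig and not a UDF; it only illustrates the grouping step. The "A"/"B" source tag is an addition of this sketch (the answer above does not mention it), included because without it a group such as `3, {'king'}` cannot be placed in the right output column.

```python
from collections import defaultdict


def union_group(a_rows, b_rows):
    """Union both tables with a source tag, group by code, then emit one row per code."""
    # Tagging each row with its origin is an assumption of this sketch.
    tagged = [(code, "A", name) for code, name in a_rows]
    tagged += [(code, "B", name) for code, name in b_rows]

    groups = defaultdict(dict)
    for code, source, name in tagged:
        groups[code][source] = name

    # Missing sides become empty strings, as in the desired output.
    return [
        (code, groups[code].get("A", ""), groups[code].get("B", ""))
        for code in sorted(groups)
    ]


a = [(1, "hello"), (2, "world"), (4, "this")]
b = [(1, "john"), (3, "king")]

grouped = union_group(a, b)
for row in grouped:
    print(row)
```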
I also ran into the same issue. This is how I solved it.
You can use the ternary operator after your join to assign a new code, based on whether it was populated in the A or the B relation. In this example, if A::code is null then B::code is used, else A::code is used. After a join, fields are referenced with the :: disambiguation operator.
C = JOIN A BY code FULL OUTER, B BY code;
D = FOREACH C GENERATE
(A::code IS NULL ? B::code : A::code) AS code,
A::field1,
A::field2,
B::field3,
B::field4;
I am trying to load a table based on a load type, Full or Incremental, that is passed as a parameter to a stored procedure. I previously got a substitution variable working with one line of code, but the code below doesn't seem to work:
Stored procedure possible arguments:
LOAD_TYPE=FULL
LOAD_TYPE=INCR
var incr_condition = (load_type=='INCR')?"INNER JOIN temp_table"
with temp_table(
select data
from table a
where dt between 01-01-2019 and 09-09-2020
)
select *
from table b
${incr_condition} -- execute only if load_type=INCR
INNER JOIN TABLE C ON B.ID = C.ID
Is there any way to restrict the with clause to execute only if load_type==INCR? Please advise.
I think the conditional operator (question mark) must have a false part in addition to the true part. Otherwise, it generates a syntax error when there's a semicolon ending the line. This example obviously doesn't run anything, but it returns the SQL string assigned to the "out" variable, which is what would be run.
Since you're using a replacement variable ${incr_condition} be sure to use backticks to open and close your SQL string.
create or replace procedure foo(LOAD_TYP string)
returns string
language javascript
as
$$
var load_type = LOAD_TYP;
var incr_condition = (load_type === 'INCR') ? "INNER JOIN temp_table" : "";
var out = `
with temp_table as (
select data
from table a
where dt between '2019-01-01' and '2020-09-09'
)
select *
from table b
${incr_condition} -- execute only if load_type=INCR
INNER JOIN TABLE C ON B.ID = C.ID
`;
return out;
$$;
call foo('INCR'); --Adds the inner join
call foo('FULL'); --Does not add the inner join
I also recommend changing your comparison on strings from == to ===. For details on why, see "What is the correct way to check for string equality in JavaScript?".
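The question also asks how to leave out the with clause itself when the load type is not INCR. One way, sketched here in Python only to show the string logic (the same approach works with template literals in the JavaScript procedure), is to build both the CTE and the join from the same flag. The table names `table_a`, `table_b`, `table_c` and the join condition `t.data = b.data` are placeholders invented for this sketch, not taken from the question.

```python
def build_query(load_type):
    """Return the SQL text; the CTE and its join appear only for incremental loads."""
    incremental = load_type == "INCR"

    # Hypothetical CTE; emitted only when load_type is INCR.
    cte = (
        "with temp_table as (\n"
        "  select data\n"
        "  from table_a\n"
        "  where dt between '2019-01-01' and '2020-09-09'\n"
        ")\n"
        if incremental
        else ""
    )
    # Hypothetical join condition; the question does not state one.
    join = "inner join temp_table t on t.data = b.data\n" if incremental else ""

    return (
        f"{cte}"
        "select *\n"
        "from table_b b\n"
        f"{join}"
        "inner join table_c c on b.id = c.id"
    )


incr_sql = build_query("INCR")
full_sql = build_query("FULL")
print(incr_sql)
print(full_sql)
```

Keeping both fragments behind one flag means the query never references a CTE that was not defined.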
For example, I have a query like this:
Chat.objects.filter(users__contains=[user.pk]).filter(users__contained_by=mentors.values_list('pk', flat=True))
This turns into the following SQL:
SELECT "chat_chat"."created_at", "chat_chat"."updated_at", "chat_chat"."id", "chat_chat"."users"
FROM "chat_chat"
WHERE
("chat_chat"."users" @> [1]::integer[] AND
"chat_chat"."users" <@ (SELECT V0."id" FROM "users_user" V0
INNER JOIN "users_userrequest_mentors" V1 ON (V0."id" = V1."user_id")
WHERE V1."userrequest_id" IN
(SELECT U0."id" FROM "users_userrequest" U0 WHERE U0."user_id" = 1))::integer[])
If I run this query, I get a casting error:
cannot cast integer to integer[]
But the Postgres documentation says that array values should be written as ARRAY[1,2,...,4].
So I wrapped [1] and (SELECT ...) in ARRAY, like this:
SELECT "chat_chat"."created_at", "chat_chat"."updated_at", "chat_chat"."id", "chat_chat"."users"
FROM "chat_chat"
WHERE
("chat_chat"."users" @> ARRAY[1]::integer[] AND
"chat_chat"."users" <@ ARRAY(SELECT V0."id" FROM "users_user" V0
INNER JOIN "users_userrequest_mentors" V1 ON (V0."id" = V1."user_id")
WHERE V1."userrequest_id" IN
(SELECT U0."id" FROM "users_userrequest" U0 WHERE U0."user_id" = 1))::integer[])
And everything works fine.
I'm just curious why Django (or psycopg2) skips this wrapping. Is there a particular reason for this behavior?
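One likely explanation, offered as an assumption rather than a confirmed diagnosis: the SQL you get by printing a queryset's query is produced by plain string interpolation of the parameters, while psycopg2 adapts a Python list to ARRAY[...] only when it actually executes the statement. The sketch below imitates both paths without needing a database. `adapt_list` is a hypothetical stand-in for the driver's list adapter, and the sketch covers only the list parameter, not the subquery.

```python
def adapt_list(values):
    """Hypothetical stand-in for a driver's list adapter: render a list as ARRAY[...]."""
    return "ARRAY[" + ",".join(str(int(v)) for v in values) + "]"


template = "SELECT %s::integer[]"
params = ([1],)

# What a printed query looks like: the list is formatted with Python's own str().
printed = template % params

# What a driver-style adapter would send instead.
executed = template % tuple(adapt_list(p) for p in params)

print(printed)
print(executed)
```

If this is what is happening, the printed SQL is a debugging aid rather than the exact statement sent to Postgres, which would explain why pasting it into a SQL client fails.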
I'm new to PowerBuilder code. I need to add a condition to a join dynamically. Any help is appreciated. Thanks a lot.
String szdSQL, psql, sznewsql
szdSQL = "Select A, B, C, D
FROM sy_staging
LEFT OUTER JOIN fd_M
ON sy_staging.id = fd_M.id
LEFT OUTER JOIN gl_M
ON sy_staging.id= gl_M.id AND sy_staging.version = gl_M.version
WHERE sy_staging.year = :lyear AND
sy_staging.location = :llocation "
psql = "Upper(fd_M.code3) = 'SMM' "
In my new query I want to add the condition held in this string variable (psql) to the join, as below:
sznewsql = "Select A, B, C, D
FROM sy_staging
LEFT OUTER JOIN fd_M
ON sy_staging.id = fd_M.id AND Upper(fd_M.code3) = 'SMM'
LEFT OUTER JOIN gl_M
ON sy_staging.id= gl_M.id AND sy_staging.version = gl_M.version
WHERE sy_staging.year = :lyear AND
sy_staging.location = :llocation "
Hmmm... That's an interesting case - adding a parameter to the ON clause, not the WHERE clause.
I'd use a datawindow, for sure (because I always do), but I'm not sure you can do that in graphic mode. You might have to convert to syntax, and then just add the new parameter into the ON clause with the ":" syntax.
LEFT OUTER JOIN fd_M
ON sy_staging.id = fd_M.id AND Upper(fd_M.code3) = :newStringParm
LEFT OUTER JOIN gl_M
...
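and then your PowerScript would pass the value to compare against (here 'SMM') as the third retrieval argument, rather than the whole condition string:
dw_1.retrieve( lYear, lLocation, "SMM" )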
-Paul Horan-
If you're only going to have those two different SQLs, you could just create two separate datawindows and then dynamically swap out the one used in the datawindow control.
I have the following view, which works, but I'm not sure how to add two more tables to the join.
The first is prodta.adres1. It joins on IDENT# and IDSFX#, which are called adent# and adsfx# in adres1, and from it I need the column ads15.
I also need the ship-to row from adres1. We first get it from the order table, prodta.oeord1, in column odgrc#. This grc# is 11 positions long and combines the 8-position ent and the 3-position suffix. Those two identify the ship-to record, and by looking it up in the same table adres1 (we have many logical views on it if that's easier, like adres15) we can get column ADSTTC for the ship-to state.
I'm not sure whether these two new parts can be included in the current view, whose code is below. Please ask if something is not clear; it's an old system and was developed in a somewhat convoluted way.
CREATE VIEW Prolib.SHPWEIGHTP AS SELECT
T01.IDORD#,
T01.IDDOCD,
T01.IDPRT#,
t01.idsfx#,
T01.IDSHP#,
T01.IDNTU$,
T01.IDENT#,
(T01.IDNTU$ * T01.IDSHP#) AS LINTOT,
T02.IAPTWT,
T02.IARCC3,
T02.IAPRLC,
T03.PHVIAC,
T03.PHORD#,
PHSFX#,
T01.IDORDT,
T01.IDHCD3
FROM PRODTA.OEINDLID T01
INNER JOIN PRODTA.ICPRTMIA T02 ON T01.IDPRT# = T02.IAPRT#
INNER JOIN
(SELECT DISTINCT
PHORD#,
PHSFX#,
PHVIAC,
PHWGHT
FROM proccdta.pshippf) AS T03 ON t01.idord# = T03.phord#
WHERE T01.IDHCD3 IN ('MDL','TRP')
I'm not exactly clear on what you're asking, and it looks like some of the column-names are missing from your description, but this should get you pretty close:
CREATE VIEW Prolib.SHPWEIGHTP AS
SELECT T01.IDORD#,
T01.IDDOCD,
T01.IDPRT#,
t01.idsfx#,
T01.IDSHP#,
T01.IDNTU$,
T01.IDENT#,
( T01.IDNTU$ * T01.IDSHP# ) AS LINTOT ,
T02.IAPTWT,
T02.IARCC3,
T02.IAPRLC,
T03.PHVIAC,
T03.PHORD#,PHSFX#,
T01.IDORDT,
T01.IDHCD3,
t04.ads15
FROM PRODTA.OEINDLID T01
INNER JOIN PRODTA.ICPRTMIA T02
ON T01.IDPRT# = T02.IAPRT#
INNER JOIN (SELECT DISTINCT
PHORD#,
PHSFX#,
PHVIAC,
PHWGHT
FROM proccdta.pshippf) AS T03
ON t01.idord# = T03.phord#
JOIN prodta.adres1 as t04
on t04.adent# = t01.ident#
and t04.adsfx# = t01.idsfx#
JOIN prodta.oeord1 t05
on t05.odgrc# = T01.IDENT# || T01.SUFFIX
WHERE T01.IDHCD3 IN ('MDL','TRP')
Let me know if you need more details.
HTH !
Please pardon me if this question has been asked before, but as a novice in databases I simply don't have the vocabulary to search for what I need.
I am using SQL server 2008.
I have a table tblPDCDetails with several columns. One of the columns PDCof holds values :
"A"(for applicant),
"C" for coapplicant,
"G" (for Guarantor).
Another column, HolderID, holds the unique ID of the holder.
The PDC holders reside in their respective tables: Applicants in tblApplBasicDetails, CoApplicants in their own table, and so on.
What I need is a way to retrieve the names of the holders from their respective tables, depending on the value in the PDCof column.
Can I do it at all?
If not, how should I work around this?
This should do:
SELECT A.*,
COALESCE(B.Name,C.Name,D.Name) Name
FROM dbo.tblPDCDetails A
LEFT JOIN dbo.tblApplBasicDetails B
ON A.HolderID = B.HolderID
AND A.PDCof = 'A'
LEFT JOIN dbo.tblCoApplBasicDetails C
ON A.HolderID = C.HolderID
AND A.PDCof = 'C'
LEFT JOIN dbo.tblGuarantorlBasicDetails D
ON A.HolderID = D.HolderID
AND A.PDCof = 'G'
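To see the conditional LEFT JOIN pattern run end to end, here is a small sketch using Python's built-in sqlite3 module instead of SQL Server. The co-applicant and guarantor table names, the PDCId and Name columns, and the sample rows are all assumptions made for the demo. HolderID 10 deliberately exists in all three holder tables to show that the extra PDCof condition in each ON clause picks the right one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPDCDetails (PDCId INTEGER, PDCof TEXT, HolderID INTEGER);
CREATE TABLE tblApplBasicDetails (HolderID INTEGER, Name TEXT);
CREATE TABLE tblCoApplBasicDetails (HolderID INTEGER, Name TEXT);
CREATE TABLE tblGuarantorBasicDetails (HolderID INTEGER, Name TEXT);

INSERT INTO tblPDCDetails VALUES (1, 'A', 10), (2, 'C', 10), (3, 'G', 20);
INSERT INTO tblApplBasicDetails VALUES (10, 'Alice');
INSERT INTO tblCoApplBasicDetails VALUES (10, 'Carol');
INSERT INTO tblGuarantorBasicDetails VALUES (20, 'Gita'), (10, 'Gus');
""")

# Each LEFT JOIN can match only when PDCof names that table,
# so at most one of the three Name columns is non-NULL per row.
rows = conn.execute("""
SELECT p.PDCId, p.PDCof, COALESCE(a.Name, c.Name, g.Name) AS Name
FROM tblPDCDetails p
LEFT JOIN tblApplBasicDetails a
       ON p.HolderID = a.HolderID AND p.PDCof = 'A'
LEFT JOIN tblCoApplBasicDetails c
       ON p.HolderID = c.HolderID AND p.PDCof = 'C'
LEFT JOIN tblGuarantorBasicDetails g
       ON p.HolderID = g.HolderID AND p.PDCof = 'G'
ORDER BY p.PDCId
""").fetchall()

for row in rows:
    print(row)
```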
The other option is to use a case switch:
Select case Main.PDCof
when 'A' then (select Name from Applicants where HolderID = main.HolderID)
when 'C' then (select Name from CoApplicants where HolderID = main.HolderID)
when 'G' then (select Name from Guarantor where HolderID = main.HolderID)
end
,main.*
from tblPDCDetails main
Which one to prefer depends on whether you run this a few times a day or a few thousand times an hour.