GOAL
I am trying to select a user ID from one table and a count of associated items from another table in DB2. I execute this query in SSIS to import the data into a SQL Server database, where I perform additional transformations on it. I mention SSIS not because I think it's part of the issue, but only as background. I'm fairly certain the problem lies with my inexperience in DB2; my background is in SQL Server and I'm very new to DB2.
ISSUE
The problem occurs when (I'm assuming) the count is 0. When I execute the query in the DB2 Command Editor, it just returns a blank row. I would expect it to at least return the user ID with a blank field for the count, but instead the whole row is blank. This then causes issues with my SSIS package, where it tries to do a bunch of inserts with no data.
Again, a lot of this is an assumption because I'm not experienced with the DB2 Command Editor, or DB2 in general, but if I widen the date ranges a bit, I do get the expected results (user ID and count).
I've tried wrapping the count in a COALESCE() function, but that didn't resolve the issue.
QUERY
SELECT a.user_id, COUNT(DISTINCT b.item_number) as Count
FROM TABLE_A a
LEFT OUTER JOIN TABLE_B b
ON a.PRIMARY_KEY = b.PRIMARY_KEY
WHERE a.user_id = '1234'
AND b.DATE_1 >= '01/01/2013'
AND b.DATE_1 <= '01/05/2013'
AND b.DATE_2 >= '01/01/2013'
AND b.DATE_2 <= '12/23/2014'
AND a.OTHER_FILTER_FIELD = 'ABC'
GROUP BY a.user_id
Wrap the field in COALESCE()
COUNT(DISTINCT COALESCE(b.item_number,0))
Also make sure
WHERE a.user_id = '1234'
refers to a user that exists in TABLE_A. If there is no user 1234, you will get no results with the query as written.
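Purely as an illustration, here is the original query with that COALESCE applied (a sketch only; it assumes item_number is numeric, so if it is a character column substitute a string placeholder such as '' to keep the types compatible):
SELECT a.user_id,
       -- COALESCE as suggested above; 0 stands in when no item_number was matched
       COUNT(DISTINCT COALESCE(b.item_number, 0)) AS item_count
FROM TABLE_A a
LEFT OUTER JOIN TABLE_B b
    ON a.PRIMARY_KEY = b.PRIMARY_KEY
WHERE a.user_id = '1234'
  AND b.DATE_1 >= '01/01/2013'
  AND b.DATE_1 <= '01/05/2013'
  AND b.DATE_2 >= '01/01/2013'
  AND b.DATE_2 <= '12/23/2014'
  AND a.OTHER_FILTER_FIELD = 'ABC'
GROUP BY a.user_id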
Merging into a target table in Snowflake using JSON data as the source table
merge into cust tgt using (
select parse_json(s.$1):application_num as application num
from prd_json s qualify
row_number() over(partition application
order_by application desc)=1) src
on tgt.application =src.application
when not matched and op_type='I' then
insert(application) values (src.application );
The QUALIFY clause removes the duplicate data and returns only the unique records, but once the join is in place far fewer records come through than a plain SELECT statement returns.
For example:
select distinct application
from prd_json where op_type='I';
--15000 rows are there
With the join in place it reports that there are no matching records in the target. If nothing matches, it should insert all 15,000 rows, but only 8,500 rows are inserted, even though they are not duplicates. Is there any way to insert the records without using QUALIFY? If I drop QUALIFY I get a DML duplicate error. Please guide me if anyone knows.
How about using SELECT DISTINCT?
Your demo SQL does not compile, and your use of $1 makes it hard to guess the names of your columns and therefore how the ROW_NUMBER is working.
So it's hard to nail down the problem.
But with the following SQL you can replace ROW_NUMBER with DISTINCT
CREATE TABLE cust(application INT);
CREATE OR REPLACE table prd_json as
SELECT parse_json(column1) as application, column2 as op_type
FROM VALUES
('{"application_num":1,"other":1}', 'I'),
('{"application_num":1,"other":2}', 'I'),
('{"application_num":2,"other":3}', 'I'),
('{"application_num":1,"other":1}', 'U')
;
MERGE INTO cust AS tgt
USING (
SELECT DISTINCT
parse_json(s.$1):application_num::int as application,
s.op_type
FROM prd_json AS s
) AS src
ON tgt.application = src.application
WHEN NOT MATCHED AND src.op_type = 'I' THEN
INSERT(application) VALUES (src.application );
number of rows inserted
2
SELECT * FROM cust;
APPLICATION
1
2
running the MERGE code a second time gives:
number of rows inserted
0
Now if I truncate CUST and swap to using this SQL for the inner part:
SELECT --DISTINCT
parse_json(s.$1):application_num::int as application,
s.op_type
FROM prd_json AS s
qualify row_number() over (partition by application order by application desc)=1
I get three rows inserted, because the PARTITION BY application is effectively binding to s.application rather than to the output column application, and there are three different "applications" once the other values are taken into account.
The reason I wrote my code this way is that your
select distinct application
from prd_json where op_type='I';
implies there is already a column called application in the table, and thus it runs the chance of being the one picked up by the ROW_NUMBER statement.
Anyway, there is another potentially large problem: if you also have "update" data (the U rows, I guess) in your transaction block, you will want an ORDER BY in the sub-select so you never end up attempting an Insert/Update pair in Update/Insert order, and that assumes you want all of the update operations if there are many of them. I will stop there. But if you do not have updates, the sub-select should filter on op_type = 'I' to keep the non-insert operations out, or, possibly worse again, to stop them replacing the inserts in your ROW_NUMBER pattern. I suspect that is the underlying cause of your problem.
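As a sketch of that last suggestion, here is the same MERGE from above with the op_type filter pushed into the sub-select, so that only insert operations are ever considered as source rows (the column names are the hypothetical ones from the demo tables):
MERGE INTO cust AS tgt
USING (
    SELECT DISTINCT
        parse_json(s.$1):application_num::int AS application
    FROM prd_json AS s
    WHERE s.op_type = 'I'   -- keep U (and any other non-insert) operations out of the source set
) AS src
ON tgt.application = src.application
WHEN NOT MATCHED THEN   -- the op_type test is no longer needed here
    INSERT (application) VALUES (src.application);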
I have a linked server setup in SQL Server to hit an Oracle database. I have a query in SQL Server that joins on the Oracle table using dot notation. I am getting a “No Data Found” error from Oracle. On the Oracle side, I am hitting a table (not a view) and no stored procedure is involved.
First, when there is no data I should just get zero rows and not an error.
Second, there should actually be data in this case.
Third, I have only seen the ORA-01403 error in PL/SQL code; never in SQL.
This is the full error message:
OLE DB provider "OraOLEDB.Oracle" for linked server "OM_ORACLE" returned message "ORA-01403: no data found".
Msg 7346, Level 16, State 2, Line 1
Cannot get the data of the row from the OLE DB provider "OraOLEDB.Oracle" for linked server "OM_ORACLE".
Here are some more details, but it probably does not mean anything since you don’t have my tables and data.
This is the query with the problem:
select *
from eopf.Batch b join eopf.BatchFile bf
on b.BatchID = bf.BatchID
left outer join [OM_ORACLE]..[OM].[DOCUMENT_UPLOAD] du
on bf.ReferenceID = du.documentUploadID;
I can’t understand why I get a “no data found” error. The query below uses the same Oracle table and returns no data but I don’t get an error - I just get no rows returned.
select * from [OM_ORACLE]..[OM].[DOCUMENT_UPLOAD] where documentUploadID = -1
The query below returns data. I just removed one of the SQL Server tables from the join. But removing the batch table does not change the rows returned from batchFile (271 rows in both cases – all rows in batchFile have a batch entry). It should still be joining the same batchFile rows to the same Oracle rows.
select *
from eopf.BatchFile bf
left outer join [OM_ORACLE]..[OM].[DOCUMENT_UPLOAD] du
on bf.ReferenceID = du.documentUploadID;
And this query returns 5 rows. It should be the same 5 from the original query. (I can't use this because I need data from the Batch and BatchFile tables.)
select *
from [OM_ORACLE]..[OM].[DOCUMENT_UPLOAD] du
where du.documentUploadId
in
(
select bf.ReferenceID
from eopf.Batch b join eopf.BatchFile bf
on b.BatchID = bf.BatchID);
Has anyone experienced this error?
Today I experienced the same problem with an inner join. As creating a table-valued function as suggested by codechurn, using a temporary table as suggested by user1935511, or changing the join types as suggested by cymorg are not options for me, I would like to share my solution.
I used join hints to push the query optimizer in the right direction, as the problem seems to arise from the nested loops join strategy being applied locally to the remote table. For me the HASH, MERGE and REMOTE join hints all worked.
For you, REMOTE will not be an option because it can only be used for inner join operations, so something like the following should work.
select *
from eopf.Batch b
join eopf.BatchFile bf
on b.BatchID = bf.BatchID
left outer merge join [OM_ORACLE]..[OM].[DOCUMENT_UPLOAD] du
on bf.ReferenceID = du.documentUploadID;
I've had the same problem.
Solution 1: load the data from the Oracle database into a temp table, then join to that temp table instead (a sketch follows) - here's a link.
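A minimal sketch of that approach, assuming the same linked server and column names as in the question:
-- Pull the remote rows into a local temp table first
SELECT du.*
INTO #du
FROM [OM_ORACLE]..[OM].[DOCUMENT_UPLOAD] du;

-- Then join locally, so the outer join never executes against the remote table
SELECT *
FROM eopf.Batch b
JOIN eopf.BatchFile bf
    ON b.BatchID = bf.BatchID
LEFT OUTER JOIN #du du
    ON bf.ReferenceID = du.documentUploadID;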
From this post (a link) you can find out that the problem can come from using a left join.
I checked this against my problem, and after changing my query accordingly the problem was solved.
In my case I had a complex view made from a linked table, 3 views based on the linked table and a local table. I was using Inner Joins throughout and this problem manifested. Changing the joins to Left and Right Outer Joins (as appropriate) resolved the issue.
Another way to work around the problem is to pull the Oracle data back through a table-valued function. This causes SQL Server to go out, retrieve all of the data from Oracle, and put it into a resultant table variable. For all intents and purposes, the Oracle data is now "local" to SQL Server if you use the resultant table-valued function in a query.
I believe the original problem is that SQL Server is trying to optimize the execution of your compound query which includes the remote Oracle query results in-line. By using a Table Valued Function to wrap the Oracle call, SQL Server will optimize the compound query on the resultant table variable returned from the function and not the results from the remote query execution.
CREATE FUNCTION [dbo].[documents]()
RETURNS @results TABLE (
    DOCUMENT_ID INT NOT NULL,
    TITLE VARCHAR(6) NOT NULL,
    LEGALNAME VARCHAR(50) NOT NULL,
    AUTHOR_ID INT NOT NULL,
    DOCUMENT_TYPE VARCHAR(1) NOT NULL,
    LAST_UPDATE DATETIME
) AS
BEGIN
    INSERT INTO @results
    SELECT CAST(DOCUMENT_ID AS INT) AS DOCUMENT_ID, TITLE, LEGALNAME, CAST(AUTHOR_ID AS INT) AS AUTHOR_ID, DOCUMENT_TYPE, LAST_UPDATE
    FROM OPENQUERY(ORACLE_SERVER,
        'select DOCUMENT_ID, TITLE, LEGALNAME, AUTHOR_ID, DOCUMENT_TYPE, FUNDTYPE, LAST_UPDATE
         from documents')
    RETURN
END
You can then use the table-valued function as if it were a table in your SQL queries:
SELECT * FROM DOCUMENTS()
I resolved it by avoiding the = operator. Try using this instead:
select * from [OM_ORACLE]..[OM].[DOCUMENT_UPLOAD] where documentUploadID < 0
I have a SQL query looking something like this:
WITH Results_CTE AS
(SELECT
COLUMN1,
COLUMN2,
[MORE COLUMNS...],
ROW_NUMBER() OVER (ORDER BY R.RANKING DESC) AS RowNum
FROM TABLE1 AS R, TABLE2 AS A, TABLE3 AS U, TABLE4 AS S, TABLE5 AS T
WHERE R.RID = A.LID
AND S.QRYID = R.QRYID
AND A.AID = U.AID
AND CONDITION1 = 'VALUE'
AND CONDITION2 = 'VALUE'
AND [MORE CONDITIONS...]
),
Results_Cnt AS
(SELECT COUNT(*) CNT FROM Results_CTE)
SELECT * FROM Results_CTE, Results_Cnt WHERE RowNum >= 1 AND RowNum <= 25
Now, this query typically runs under 1 sec and returns the 25 records out of 5000 based on CONDITION1.
Recently, though, I added a new column to TABLE1 and then used its values as CONDITION2 in the query above. The column is populated going forward, but all the values in the past are NULL.
I read something about joins on tables that contain NULLs being a reason for slow execution. The table has about 1,300,000 records, and 90% of them are NULL in the problematic column. But that column is not being joined on. (The one that is being joined on has an index.)
However, I wanted to try that anyway by creating a new column and simply copying the data like so:
ALTER TABLE TABLE1 ADD COL_NEW
UPDATE TABLE1 SET COL_NEW = COL_OLD
My next step was to replace the NULLs with an actual value, but first, just for kicks, I changed the query to use the new field COL_NEW as the condition, and the problem went away.
Although I'm happy the problem is gone, I can't explain it to myself. Why was the execution slow in the first place if it had nothing to do with the NULLs?
UPDATE: It appears the problem may have resulted from a cached query plan. So the question essentially becomes, how to force a query plan refresh?
UPDATE: Although doing ALTER TABLE may have refreshed the execution plan, the problem returned. How can I find out what is happening?
It sounds like your query plan got cached while the statistics for the new column showed it as completely full of NULLs, forcing a table scan. Following the ALTER TABLE the query plan was refreshed, replacing the table scan with an index lookup again, and performance returned to normal.
The only way to know for sure if that is what happened would be to examine the query plans for both queries, but those are long gone now.
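On the follow-up question of how to force a plan refresh: a few standard options in SQL Server are sketched below, using the table name from the example above (each one causes the next execution to compile a fresh plan against current statistics):
-- Mark every cached plan that references TABLE1 for recompilation
EXEC sp_recompile 'TABLE1';

-- Refresh the statistics the optimizer uses for TABLE1
UPDATE STATISTICS TABLE1;

-- Or force a fresh plan for a single statement by appending a query hint:
-- SELECT ... FROM ... WHERE ... OPTION (RECOMPILE);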
I have been fighting with this all weekend and am out of ideas. In order to have pages in my search results on my website, I need to return a subset of rows from a SQL Server 2005 Express database (i.e. start at row 20 and give me the next 20 records). In MySQL you would use the "LIMIT" keyword to choose which row to start at and how many rows to return.
In SQL Server I found ROW_NUMBER()/OVER, but when I try to use it it says "Over not supported". I am thinking this is because I am using SQL Server 2005 Express (free version). Can anyone verify if this is true or if there is some other reason an OVER clause would not be supported?
Then I found the old school version similar to:
SELECT TOP X * FROM TABLE WHERE ID NOT IN (SELECT TOP Y ID FROM TABLE ORDER BY ID) ORDER BY ID
where X = the number of records per page and Y = the number of records to skip (so the page starts at record Y + 1).
However, my queries are a lot more complex with many outer joins and sometimes ordering by something other than what is in the main table. For example, if someone chooses to order by how many videos a user has posted, the query might need to look like this:
SELECT TOP 50 iUserID, iVideoCount
FROM MyTable
LEFT OUTER JOIN (SELECT count(iVideoID) AS iVideoCount, iUserID FROM VideoTable GROUP BY iUserID) as TempVidTable
ON MyTable.iUserID = TempVidTable.iUserID
WHERE iUserID NOT IN
(SELECT TOP 100 iUserID, iVideoCount
FROM MyTable
LEFT OUTER JOIN (SELECT count(iVideoID) AS iVideoCount, iUserID FROM VideoTable GROUP BY iUserID) as TempVidTable
ON MyTable.iUserID = TempVidTable.iUserID
ORDER BY iVideoCount)
ORDER BY iVideoCount
The issue is in the subquery SELECT line: TOP 100 iUserID, iVideoCount
To use the "NOT IN" clause it seems I can only have 1 column in the subquery ("SELECT TOP 100 iUserID FROM ..."). But when I don't include iVideoCount in that subquery SELECT statement then the ORDER BY iVideoCount in the subquery doesn't order correctly so my subquery is ordered differently than my parent query, making this whole thing useless. There are about 5 more tables linked in with outer joins that can play a part in the ordering.
I am at a loss! The two methods above are the only two ways I can find to get SQL Server to return a subset of rows. I am about ready to return the whole result and loop through each record in PHP, only displaying the ones I want. That is such an inefficient way to do things that it is really my last resort.
Any ideas on how I can make SQL Server mimic MySQL's LIMIT clause in the above scenario?
Unfortunately, although ROW_NUMBER() can be used for paging in SQL Server 2005, and SQL Server 2012 enhances paging support with ORDER BY ... OFFSET ... FETCH NEXT, if you cannot use either of those solutions you need to first
create a temp table with an identity column,
then insert the data into the temp table with an ORDER BY clause,
and use the temp table's identity column value just like the ROW_NUMBER() value, as sketched below.
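A minimal sketch of that approach, reusing the hypothetical MyTable/VideoTable names from the question and fetching rows 101-125 ordered by video count:
-- Temp table whose IDENTITY column records each row's ordered position
CREATE TABLE #Paged (
    RowNum INT IDENTITY(1,1) PRIMARY KEY,
    iUserID INT,
    iVideoCount INT
);

-- The ORDER BY controls the sequence in which the identity values are assigned
INSERT INTO #Paged (iUserID, iVideoCount)
SELECT MyTable.iUserID, TempVidTable.iVideoCount
FROM MyTable
LEFT OUTER JOIN (SELECT COUNT(iVideoID) AS iVideoCount, iUserID
                 FROM VideoTable
                 GROUP BY iUserID) AS TempVidTable
    ON MyTable.iUserID = TempVidTable.iUserID
ORDER BY TempVidTable.iVideoCount DESC;

-- Use the identity value exactly like a ROW_NUMBER() value to pick one page
SELECT iUserID, iVideoCount
FROM #Paged
WHERE RowNum BETWEEN 101 AND 125
ORDER BY RowNum;

DROP TABLE #Paged;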
I hope it helps,