I apologize if I don't make much sense, but I've tangled my brain up trying to work this out.
I'm trying to build a result set from the results of one query, then include those results in a second query and somehow group everything together.
What I have are parent work order numbers and their child work order numbers.
Sadly, the system I am using doesn't yet have the functionality to simply produce a report showing all work of a specific type and its linked work.
So I have a basic initial query (query 1) to find anything that has "JPNUM like AK0147" and "STATUS NOT IN ('COMPLETE', 'CANCELLED', 'REVIEWED', 'CLOSED')".
Query 1 returns a result set that includes the column 'WONUM'.
I then need to do a separate search (query 2) on the column 'PARENT', returning any rows whose PARENT matches one of the WONUMs returned by query 1.
I also want to include the results of query 1, probably in a query 3, so I can group them together.
How do I write a query that feeds my results from query 1 into query 2, and how do I then group them so that the parent WONUM is at the top with its child work orders underneath, like the final results table in the attached image?
You could run a select from another select, and so on.
Here is an example:
SELECT PARENT.WONUM, PARENT.JPNUM
FROM (SELECT WONUM, JPNUM
      FROM yourTable
      WHERE JPNUM LIKE 'AK0147' -- add % wildcards if you need a partial match
        AND STATUS NOT IN ('COMPLETE', 'CANCELLED', 'REVIEWED', 'CLOSED')) PARENT
WHERE ...
This way the result of the inner SELECT acts like a temporary table named PARENT.
There's more than one way to do it. If you're using SQL Server, I recommend a CTE:
WITH Query1 AS
(
    SELECT WONUM, JPNUM
    FROM MyTable1
    WHERE ...
),
Query2 AS
(
    SELECT MyTable2.WONUM, MyTable2.PARENT
    FROM Query1 -- You can use Query1, if you want
    JOIN MyTable2 ON Query1.JPNUM = ...
    WHERE ...
)
-- Final Result:
SELECT Query2.WONUM, Query2.PARENT
FROM Query2
JOIN Query1 ON ...
JOIN Table3 ON ...
WHERE ...
This way, each query can use the result of the previous query, or of the one before that, if needed.
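For the parent/child question specifically, one possible shape is sketched below. The table name workorder is hypothetical, the % wildcards are an assumption about what "JPNUM like AK0147" was meant to match, and GROUP_WONUM and IS_CHILD are helper columns introduced only for sorting. The column names WONUM, PARENT, JPNUM and STATUS come from the question.

```sql
-- Query 1: the parent work orders
WITH parents AS (
    SELECT WONUM, JPNUM, STATUS
    FROM workorder                      -- hypothetical table name
    WHERE JPNUM LIKE '%AK0147%'         -- wildcard placement is an assumption
      AND STATUS NOT IN ('COMPLETE', 'CANCELLED', 'REVIEWED', 'CLOSED')
)
-- Parents, tagged with their own WONUM as the group key
SELECT p.WONUM AS GROUP_WONUM, 0 AS IS_CHILD, p.WONUM, p.JPNUM, p.STATUS
FROM parents p
UNION ALL
-- Query 2: children whose PARENT matches a WONUM from query 1
SELECT c.PARENT, 1, c.WONUM, c.JPNUM, c.STATUS
FROM workorder c
JOIN parents p ON c.PARENT = p.WONUM
-- Each parent first, then its children underneath
ORDER BY GROUP_WONUM, IS_CHILD, WONUM;
```

As written, children are returned whatever their status, and a child that also satisfies the query 1 filter will appear twice (once as a parent, once as a child). Add a filter to the second SELECT if that is not what you want.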
I have data like below in the table. The component part no. column shows the part no. that replaced the part no.
(image: Table having data)
I want to write code that gets the last part, i.e. the latest part. The loop ends when a part doesn't return anything.
I want to show the data like below:
(image: How data is needed)
I tried using a recursive CTE, but the table is so large that it keeps running for 2 hours.
I am weak at writing stored procedures.
Is there any way we can achieve this? We are okay if it completes in 1 hour.
If you need to walk the levels of nesting, a recursive CTE is a good solution. The key is to choose the right starting point: only the roots, i.e. parts that never appear as a component of another part. That avoids duplicate results and, as long as the data itself contains no cycles, infinite loops.
If the CTE takes too long because there is too much data, try scaling up the warehouse or dividing the data into batches.
The CTE should look something like this:
CREATE OR REPLACE TABLE T1 (
    PART_NO STRING,
    COMPOMENT_NO STRING);
INSERT INTO T1 (PART_NO, COMPOMENT_NO)
VALUES ('9U8806', '1252127'),
       ('1252127', '1073295'),
       ('1073295', '1386464'),
       ('1386464', '2320160'),
       ('2320160', '3153441');
WITH CTE AS (
    SELECT T1.PART_NO AS ORIGINAL_PART_NO, T1.PART_NO,
           T1.PART_NO AS PREVIOUS_PART_NO, 1 AS PART_LEVEL
    FROM T1
    WHERE T1.PART_NO NOT IN (SELECT COMPOMENT_NO FROM T1) -- only roots
    UNION ALL
    SELECT CTE.ORIGINAL_PART_NO, T1.COMPOMENT_NO AS PART_NO,
           CTE.PART_NO AS PREVIOUS_PART_NO, CTE.PART_LEVEL + 1 AS PART_LEVEL
    FROM T1
    JOIN CTE ON CTE.PART_NO = T1.PART_NO
)
SELECT *
FROM CTE;
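The query above returns every level of each chain. If only the latest part per original part is wanted, as the question describes, one option is to keep the deepest row per ORIGINAL_PART_NO with Snowflake's QUALIFY clause. This is a sketch against the T1 table defined above (column name spelled as in that example); the LATEST_PART_NO alias is introduced here for illustration.

```sql
WITH CTE AS (
    -- Anchor: only the roots (parts that never appear as a component)
    SELECT T1.PART_NO AS ORIGINAL_PART_NO, T1.PART_NO, 1 AS PART_LEVEL
    FROM T1
    WHERE T1.PART_NO NOT IN (SELECT COMPOMENT_NO FROM T1)
    UNION ALL
    -- Recursive step: follow each part to the part that replaced it
    SELECT CTE.ORIGINAL_PART_NO, T1.COMPOMENT_NO, CTE.PART_LEVEL + 1
    FROM T1
    JOIN CTE ON CTE.PART_NO = T1.PART_NO
)
SELECT ORIGINAL_PART_NO, PART_NO AS LATEST_PART_NO, PART_LEVEL
FROM CTE
-- Keep only the deepest level reached for each original part
QUALIFY ROW_NUMBER() OVER (PARTITION BY ORIGINAL_PART_NO
                           ORDER BY PART_LEVEL DESC) = 1;
```

With the five sample rows this should return a single row: 9U8806, 3153441, 6. Note that NOT IN returns no roots at all if the component column contains a NULL, so filter NULLs out of the subquery if your data has them.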
I am creating a view in Snowflake that has a CTE on a base table without any filters. I have other CTEs that depend on the parent CTE to fetch further information.
Everything works fine when I query all records from the base table, which has 45K rows. But when I query the view for one particular ID, the explain plan shows the base CTE picking up all 45K rows, joining the rest of the CTEs on those 45K rows, and only then applying my unique ID filter and returning one row.
I see no difference in performance between pulling all records and pulling one record. Snowflake is not optimizing the base CTE to apply the filter criteria I am looking for.
Any suggestions on how I can resolve this issue? I tried local variables in the filter criteria of the base CTE, but that is not a viable solution.
CREATE OR REPLACE VIEW test_v AS
WITH parent_cte as
    (select document_id, time, ...
     from audit_table
    ),
emp_cte as
    (select employee_details, ...
     from employee_tab,
          parent_cte
     where parent_cte.document_id = employee_tab.document_id),
dep_cte as
    (select dep_details, ....
     from dependent_tab,
          emp_cte
     where ..........)
select *
from dep_cte, emp_cte, parent_cte;
Now when I query the view for one document_id, the plan fetches all the data, joins it, and only then applies the filter, which is not efficient.
select * from test_v where document_id = '1001';
I can't combine these tables in a single select with join conditions because I am using "LATERAL FLATTEN", which multiplies each base table record, so I am going with the CTE approach.
I'd appreciate your ideas.
I have had the following occur a few times, and my solution feels crude.
Generically, I have tables that might look like this:
WorkOrder:
WorkOrderID (primary key)
WorkOrderDesc
Process:
ProcessID (primary key)
ProcessDesc
Work:
ProcessID
WorkOrderID
I want to find out how many processes have been done on a work order, so I do something like
select w.WorkOrderId,w.ProcessID,Count(*)
from Work as w
inner join WorkOrder as wo on w.WorkOrderID=wo.WorkOrderID
inner join Process as p on w.ProcessID = p.ProcessID
group by w.WorkOrderID,w.ProcessID
This code will tell me how many times each process was run on each work order.
The problem I run into is that I really don't want the query results to be the IDs; I want the descriptions, because those will be reported on or plotted. Because of the group by, I know that each WorkOrderID and ProcessID combination is unique, so all descriptions within a group will be the same and I can just take the Max (or the Min) of them.
select w.WorkOrderId,w.ProcessID,Count(*),Max(wo.WorkOrderDesc),Max(p.ProcessDesc)
from Work as w
inner join WorkOrder as wo on w.WorkOrderID=wo.WorkOrderID
inner join Process as p on w.ProcessID = p.ProcessID
group by w.WorkOrderID,w.ProcessID
This query gives me the IDs like the first query, but it also fills in the descriptions.
Is there a way to write this type of query as a single query without using this hack? I know I could use a CTE where the results of the first query are joined a second time to grab the descriptions from the two tables, but that seems like it ends up running the same(ish) query again.
It seems like the query engine could detect that I am grouping by a primary key and allow me to select other columns from that row in the select statement.
Note: I realize that in this toy example I could have just grouped by the descriptions. In a real example, each of these queries would report many columns that all come from the joined tables.
I would write the query like this:
SELECT wo.WorkOrderDesc, p.ProcessDesc, x.Cnt
FROM (
    SELECT w.WorkOrderID, w.ProcessID, Count(*) AS Cnt
    FROM Work AS w
    GROUP BY w.WorkOrderID, w.ProcessID
) x
INNER JOIN WorkOrder AS wo ON x.WorkOrderID = wo.WorkOrderID
INNER JOIN Process AS p ON x.ProcessID = p.ProcessID
This way, the grouping is done using the ID, but we later retrieve the description for each ID which appears in the results.
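Another common way to avoid the Max() hack, sketched here against the same toy tables, is to list the description columns in the GROUP BY alongside the primary keys. Since each description is determined by its ID, the extra columns do not change the groups.

```sql
SELECT wo.WorkOrderDesc, p.ProcessDesc, COUNT(*) AS Cnt
FROM Work w
INNER JOIN WorkOrder wo ON w.WorkOrderID = wo.WorkOrderID
INNER JOIN Process p ON w.ProcessID = p.ProcessID
-- The IDs define the groups; the descriptions are listed only so that
-- they may appear in the SELECT list.
GROUP BY wo.WorkOrderID, wo.WorkOrderDesc, p.ProcessID, p.ProcessDesc;
```

Some engines do what the question hopes for: PostgreSQL, for example, accepts GROUP BY wo.WorkOrderID, p.ProcessID here and still lets you select the description columns, because it recognizes that they depend on the grouped primary keys. SQL Server does not, so the longer GROUP BY is the portable form.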
I have a number of queries that are run at the same time, and I now want their results to populate a permanent table that I've created.
Each of the queries has a column called 'Descript', which is what I want all the results to join on. I want to make sure that if the Descript column is out of order (or null) in one of the queries, the figures are still linked to the correct Descript.
I added an INSERT INTO after each query, but this didn't work.
The first set of data went in, but the second set just went underneath the first (if that makes sense), creating more rows.
INSERT INTO dbo.RESULTTABLE (Descript, Category, DescriptCount)
SELECT Descript, Category, DescriptCount
FROM #Query1
I have around 15 queries to join into 1 table so any help to understand the logic is appreciated.
Thanks
If I understood your question correctly, you want to insert the query results that are not yet in the result table, and update the records that already exist there.
UPDATE R SET Category = Q.Category, DescriptCount = Q.DescriptCount
FROM dbo.RESULTTABLE R INNER JOIN #Query1 Q ON R.Descript = Q.Descript
INSERT INTO dbo.RESULTTABLE (Descript, Category, DescriptCount)
SELECT Descript, Category, DescriptCount FROM #Query1 WHERE Descript NOT IN (SELECT Descript FROM dbo.RESULTTABLE)
Then you can apply the same approach to the other queries.
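On SQL Server the update-then-insert pair can also be written as a single MERGE statement. The following is only a sketch using the table and column names from the question; it assumes Descript is unique within #Query1, since MERGE raises an error when several source rows match the same target row.

```sql
MERGE dbo.RESULTTABLE AS R
USING #Query1 AS Q
    ON R.Descript = Q.Descript
-- Descript already present: refresh its figures
WHEN MATCHED THEN
    UPDATE SET R.Category = Q.Category,
               R.DescriptCount = Q.DescriptCount
-- Descript not present yet: add a new row
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Descript, Category, DescriptCount)
    VALUES (Q.Descript, Q.Category, Q.DescriptCount);
```

Rows where Descript is NULL never satisfy the ON condition, so they would be inserted as new rows on every run. Filter them out or replace the NULL with a placeholder value first if that matters.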
I can't get my head around why the following query:
SELECT ref,
article_number,
count(article_number)
FROM invoice
INNER JOIN goods_list USING (invoice_number)
WHERE invoice_owner = 'someone'
GROUP BY invoice_number, article_number
LIMIT 1
is way slower than this one:
WITH base_data AS (
SELECT invoice_number
FROM invoice
WHERE invoice_owner = 'someone'
LIMIT 1
)
SELECT invoice_number,
article_number,
count(article_number)
FROM base_data
INNER JOIN goods_list USING (invoice_number)
GROUP BY invoice_number, article_number
Is the limit applied after the whole result set is returned?
The first query processes all the data for the invoice owner. It does the group by and finally returns one row.
The second query gets one row in the CTE for the invoice owner up front. It joins that single row into another table and then does the aggregation on way fewer rows.
Hence, it is not surprising that the second query is much faster, because it is processing many fewer rows for the aggregation.
Note: when using limit you should also use order by. Otherwise, you can get any matching row when the code runs -- and you might even get different rows on different runs.
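To illustrate the note above, here is the second query with an ORDER BY added inside the CTE so that the same invoice is picked on every run. Ordering by invoice_number is just an assumption for the example; use whichever column defines "first" for you.

```sql
WITH base_data AS (
    SELECT invoice_number
    FROM invoice
    WHERE invoice_owner = 'someone'
    ORDER BY invoice_number   -- makes the LIMIT deterministic
    LIMIT 1
)
SELECT invoice_number,
       article_number,
       count(article_number)
FROM base_data
INNER JOIN goods_list USING (invoice_number)
GROUP BY invoice_number, article_number
ORDER BY article_number;
```

Keep in mind that the two queries in the question are not equivalent: the first returns a single (invoice, article) group, while this form returns every article group for the one selected invoice.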