Getting the equivalent of UNION without using UNION

I've been learning about DBMSs lately, and I've run into trouble using SQL*Plus.
I want to combine these two tables without using a UNION query.
Table1 = '1','2','3','4','5'
Table2 = '1','2','6','7'
The Union result of these two tables is '1','2','3','4','5','6','7'
I want to achieve the same result without UNION, using only CREATE, SELECT, or INSERT.
Is there an alternative to UNION that does this?

Technically, you can try:
insert
into table1 ( col )
select col
from table2 t2
where not exists (
select 1
from table1 tt1
where tt1.col = t2.col
)
;
But I doubt that this will be any more efficient than a UNION in the first place. Note also that it modifies table1 rather than just returning the combined rows.
A similar comment holds for a construct like insert into table1 select col from table2 minus select col from table1.
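To see the INSERT ... WHERE NOT EXISTS approach run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Oracle (an assumption; the question uses SQL*Plus). The table and column names follow the answer above. It assumes neither table contains internal duplicates, because unlike UNION this approach does not remove them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col TEXT);
    CREATE TABLE table2 (col TEXT);
    INSERT INTO table1 VALUES ('1'), ('2'), ('3'), ('4'), ('5');
    INSERT INTO table2 VALUES ('1'), ('2'), ('6'), ('7');

    -- Copy over only the table2 rows that table1 does not already hold.
    INSERT INTO table1 (col)
    SELECT col
    FROM table2 t2
    WHERE NOT EXISTS (SELECT 1 FROM table1 tt1 WHERE tt1.col = t2.col);
""")

# table1 now holds what UNION would have returned for the sample data.
union_like = [row[0] for row in conn.execute("SELECT col FROM table1 ORDER BY col")]
print(union_like)
```

If table1 must stay untouched, the same INSERT can target a freshly created third table that is first filled from table1.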

Related

SQL - Attain Previous Transaction Information [duplicate]

I need to calculate the difference of a column between two lines of a table. Is there any way I can do this directly in SQL? I'm using Microsoft SQL Server 2008.
I'm looking for something like this:
SELECT value - (previous.value) FROM table
Imagine that the "previous" variable references the most recently selected row. Of course, with a select like that I will end up with n-1 rows selected from a table with n rows. That's not a problem; it's actually exactly what I need.
Is that possible in some way?
Use the lag function:
SELECT value - lag(value) OVER (ORDER BY Id) FROM table
Sequences used for Ids can skip values, so Id-1 does not always work.
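As a rough illustration of why LAG copes with skipped ids, here is a small sketch run through Python's sqlite3 module (an assumption made for easy testing; it needs SQLite 3.25 or later for window functions). The readings table and its data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (id INTEGER PRIMARY KEY, value INTEGER);
    -- ids 3 and 4 are missing on purpose, as after rolled-back inserts
    INSERT INTO readings VALUES (1, 10), (2, 15), (5, 12), (6, 20);
""")

# LAG looks at the previous row in the ORDER BY sequence, not at id - 1.
rows = conn.execute("""
    SELECT id, value - LAG(value) OVER (ORDER BY id) AS diff
    FROM readings
    ORDER BY id
""").fetchall()
print(rows)
```

The first row has no predecessor, so its difference is NULL (None in Python); filtering out that row leaves the n-1 rows the question asks for.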
SQL has no built in notion of order, so you need to order by some column for this to be meaningful. Something like this:
select t1.value - t2.value from table t1, table t2
where t1.primaryKey = t2.primaryKey - 1
If you know how to order things but not how to get the previous value given the current one (e.g., you want to order alphabetically), then I don't know of a way to do that in standard SQL, but most SQL implementations have extensions for it.
Here is a way for SQL server that works if you can order rows such that each one is distinct:
select rank() OVER (ORDER BY id) as 'Rank', value into temp1 from t
select t1.value - t2.value from temp1 t1, temp1 t2
where t1.Rank = t2.Rank - 1
drop table temp1
If you need to break ties, you can add as many columns as necessary to the ORDER BY.
WITH CTE AS (
SELECT
rownum = ROW_NUMBER() OVER (ORDER BY columns_to_order_by),
value
FROM table
)
SELECT
cur.value - prev.value
FROM CTE cur
INNER JOIN CTE prev on prev.rownum = cur.rownum - 1
Oracle, PostgreSQL, SQL Server and many more RDBMS engines have analytic functions called LAG and LEAD that do this very thing.
In SQL Server prior to 2012 you'd need to do the following:
SELECT value - (
SELECT TOP 1 value
FROM mytable m2
WHERE m2.col1 < m1.col1 OR (m2.col1 = m1.col1 AND m2.pk < m1.pk)
ORDER BY
col1 DESC, pk DESC
)
FROM mytable m1
ORDER BY
col1, pk
where col1 is the column you are ordering by.
Having an index on (COL1, PK) will greatly improve this query.
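A hedged sketch of the correlated-subquery approach, run through Python's sqlite3 module rather than SQL Server (an assumption, so TOP 1 becomes LIMIT 1). The sample data is invented, with a tie on col1 to show pk breaking it. The subquery is ordered descending so that the first row it returns is the nearest earlier row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (pk INTEGER PRIMARY KEY, col1 INTEGER, value INTEGER);
    INSERT INTO mytable VALUES (1, 100, 10), (2, 100, 15), (3, 200, 12), (4, 300, 20);
    -- The index suggested above, covering the ordering columns.
    CREATE INDEX ix_mytable_col1_pk ON mytable (col1, pk);
""")

rows = conn.execute("""
    SELECT m1.pk,
           m1.value - (
               SELECT m2.value
               FROM mytable m2
               WHERE m2.col1 < m1.col1
                  OR (m2.col1 = m1.col1 AND m2.pk < m1.pk)
               ORDER BY m2.col1 DESC, m2.pk DESC  -- nearest earlier row first
               LIMIT 1
           ) AS diff
    FROM mytable m1
    ORDER BY m1.col1, m1.pk
""").fetchall()
print(rows)
```

The first row has no earlier row, so the subquery returns nothing and the difference comes out as NULL.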
LEFT JOIN the table to itself, with the join condition worked out so the row matched in the joined version of the table is one row previous, for your particular definition of "previous".
Update: At first I thought you would want to keep all rows, with NULLs where there was no previous row. Reading it again, you just want those rows culled, so you should use an inner join rather than a left join.
Update:
Newer versions of SQL Server also have the LAG and LEAD windowing functions, which can be used for this too.
select t2.col from (
select col,MAX(ID) id from
(
select ROW_NUMBER() over(PARTITION by col order by col) id ,col from testtab t1) as t1
group by col) as t2
The selected answer will only work if there are no gaps in the sequence. However if you are using an autogenerated id, there are likely to be gaps in the sequence due to inserts that were rolled back.
This method should work if you have gaps:
create table #temp (value int, primaryKey int, tempid int identity)
insert into #temp (value, primaryKey)
select value, primaryKey from mytable order by primaryKey
select t1.value - t2.value from #temp t1
join #temp t2
on t1.tempid = t2.tempid - 1
Another way to refer to the previous row in an SQL query is to use a recursive common table expression (CTE):
CREATE TABLE t (counter INTEGER);
INSERT INTO t VALUES (1),(2),(3),(4),(5);
WITH cte(counter, previous, difference) AS (
-- Anchor query
SELECT MIN(counter), 0, MIN(counter)
FROM t
UNION ALL
-- Recursive query
SELECT t.counter, cte.counter, t.counter - cte.counter
FROM t JOIN cte ON cte.counter = t.counter - 1
)
SELECT counter, previous, difference
FROM cte
ORDER BY counter;
Result:
counter  previous  difference
1        0         1
2        1         1
3        2         1
4        3         1
5        4         1
The anchor query generates the first row of the common table expression cte where it sets cte.counter to column t.counter in the first row of table t, cte.previous to 0, and cte.difference to the first row of t.counter.
The recursive query joins each row of common table expression cte to the previous row of table t. In the recursive query, cte.counter refers to t.counter in each row of table t, cte.previous refers to cte.counter in the previous row of cte, and t.counter - cte.counter refers to the difference between these two columns.
Note that a recursive CTE is more flexible than the LAG and LEAD functions because a row can refer to any arbitrary result of a previous row. (A recursive function or process is one where the input of the process is the output of the previous iteration of that process, except the first input which is a constant.)
I tested this query at SQLite Online.
You can use the following window function to get the current row's value and the previous row's value:
SELECT value,
min(value) over (order by id rows between 1 preceding and 1 preceding) as value_prev
FROM table
Then you can just select value - value_prev from that select to get your answer.

How to divide to multiple column sql?

Based on the SQL result above, I want to split the result as in the image below.
I tried using CASE, but it returns duplicate data.
Has anyone done this, or does anyone have an idea how to do it?
Can you try this one?
SELECT t1.*, t2.* FROM yourtable t1
JOIN yourtable t2 ON t1.delay_code_1 = t2.delay_code_1
WHERE t1.hatch_num_1 != t2.hatch_num_1
Afterwards you can list exactly which columns you want from t1 and t2 and use 'as' to name them in your select statement, so instead of having two hatch_num_1 columns you will have one ending in _1 and one ending in _3.
;With
a As (SELECT * FROM yourtable X Where X.hatch= 'H1' ),
b AS (SELECT * FROM yourtable Y Where Y.hatch= 'H3')
SELECT A.* ,B.* FROM A , B WHERE A.[delay] = B.[delay]
If you have a limited number of hatches and the same times repeat, you can do it like this. Otherwise, show me some more records or details so I can understand the data better.

SQL set operation with different number of columns in each set

Let's say I have set 1:
1 30 60
2 45 90
3 120 240
4 30 60
5 20 40
and set 2
30 60
20 40
I would like to do some sort of union where I only keep rows 1,4,5 from set 1 because the latter 2 columns of set 1 can be found in set 2.
My problem is that set-based operations insist on the same number of columns.
I've thought of concatenating the columns contents, but it feels dirty to me.
Is there a 'proper' way to accomplish this?
I'm on SQL Server 2008 R2
In the end, I would like to end up with
1 30 60
4 30 60
5 20 40
CLEARLY I need to go sleep as a simple join on 2 columns worked.... Thanks!
You are literally asking for
give me the rows in t1 where the 2 columns match in T2
So if the output is only rows 1, 4 and 5 from table 1, then it is a set-based operation and can be done with EXISTS, INTERSECT, or JOIN. As for the "same number of columns" issue, you simply use two conditions joined with AND, which is evaluated per row.
EXISTS is the most portable and compatible way, and it lets you select any column from table1:
select id, val1, val2
from table1 t1
WHERE EXISTS (SELECT * FROM table2 t2
WHERE t1.val1 = t2.val1 AND t1.val2 = t2.val2)
INTERSECT requires the same columns in each clause, and not all engines support it (SQL Server has since 2005):
select val1, val2
from table1
INTERSECT
select val1, val2
from table2
With an INNER JOIN, if you have duplicate values for val1, val2 in table2 then you'll get more rows than expected. The internals usually make it slower than EXISTS:
select t1.id, t1.val1, t1.val2
from table1 t1
JOIN
table2 t2 ON t1.val1 = t2.val1 AND t1.val2 = t2.val2
Some RDBMSs support IN on multiple columns, but this isn't portable and SQL Server doesn't support it.
Edit: some background
Relationally, it's a semi-join.
SQL Server does it as a "left semi join"
INTERSECT and EXISTS in SQL Server usually give the same execution plan. The join type is a "left semi join" whereas INNER JOIN is a full "equi-join".
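The difference between the semi-join and the plain join is easiest to see with a duplicate in table2. Below is a small sketch using Python's sqlite3 module in place of SQL Server (an assumption); the data is the question's sample with one extra (30, 60) row added to table2 on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, val1 INTEGER, val2 INTEGER);
    CREATE TABLE table2 (val1 INTEGER, val2 INTEGER);
    INSERT INTO table1 VALUES (1, 30, 60), (2, 45, 90), (3, 120, 240), (4, 30, 60), (5, 20, 40);
    -- (30, 60) appears twice in table2 on purpose
    INSERT INTO table2 VALUES (30, 60), (20, 40), (30, 60);
""")

# Semi-join: each table1 row is returned at most once.
exists_ids = [r[0] for r in conn.execute("""
    SELECT id FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table2 t2
                  WHERE t1.val1 = t2.val1 AND t1.val2 = t2.val2)
    ORDER BY id
""")]

# Inner join: a table1 row is repeated once per matching table2 row.
join_ids = [r[0] for r in conn.execute("""
    SELECT t1.id FROM table1 t1
    JOIN table2 t2 ON t1.val1 = t2.val1 AND t1.val2 = t2.val2
    ORDER BY t1.id
""")]

print(exists_ids, join_ids)
```

With the duplicate present, the EXISTS form still returns ids 1, 4 and 5 once each, while the join returns ids 1 and 4 twice.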
You could use union which, as opposed to union all, eliminates duplicates:
select val1, val2
from table1
union
select val1, val2
from table2
EDIT: Based on your edited question, you can exclude rows that match the second table using a not exists subquery:
select id, col1, col2
from table1 t1
where not exists
(
select *
from table2 t2
where t1.col1 = t2.col1
and t1.col2 = t2.col2
)
union all
select null, col1, col2
from table2
If you'd like to exclude rows from table2, omit union all and everything below it.

set difference in SQL query

I'm trying to select records with a statement
SELECT *
FROM A
WHERE
LEFT(B, 5) IN
(SELECT * FROM
(SELECT LEFT(A.B,5), COUNT(DISTINCT A.C) c_count
FROM A
GROUP BY LEFT(B,5)
) p1
WHERE p1.c_count = 1
)
AND C IN
(SELECT * FROM
(SELECT A.C , COUNT(DISTINCT LEFT(A.B,5)) b_count
FROM A
GROUP BY C
) p2
WHERE p2.b_count = 1)
which takes a long time to run ~15 sec.
Is there a better way of writing this SQL?
If you would like to represent Set Difference (A-B) in SQL, here is solution for you.
Let's say you have two tables A and B, and you want to retrieve all records that exist only in A but not in B, where A and B have a relationship via an attribute named ID.
An efficient query for this is:
-- (A - B)
SELECT DISTINCT A.* FROM (A LEFT OUTER JOIN B on A.ID=B.ID) WHERE B.ID IS NULL
-from Jayaram Timsina's blog.
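For anyone who wants to try the anti-join, here is a minimal sketch run through Python's sqlite3 module (an assumption; any engine with LEFT JOIN should behave the same way). The name column and the sample rows are made up. A NOT EXISTS version is included for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, name TEXT);
    CREATE TABLE B (ID INTEGER);
    INSERT INTO A VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO B VALUES (2), (4), (4);
""")

# Anti-join: keep the A rows for which the outer join found no partner in B.
anti_join = conn.execute("""
    SELECT DISTINCT A.*
    FROM A LEFT OUTER JOIN B ON A.ID = B.ID
    WHERE B.ID IS NULL
    ORDER BY A.ID
""").fetchall()

# The same difference written with NOT EXISTS.
not_exists = conn.execute("""
    SELECT A.*
    FROM A
    WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.ID = A.ID)
    ORDER BY A.ID
""").fetchall()

print(anti_join, not_exists)
```

Both forms return the same rows here. The DISTINCT in the join version also collapses rows of A that are exact duplicates of each other, which NOT EXISTS does not do.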
You don't need to return data from the nested subqueries. I'm not sure this will make a difference without indexing, but it's easier to read.
And EXISTS/JOIN is probably nicer, IMHO, than using IN:
SELECT *
FROM
A
JOIN
(SELECT LEFT(B,5) AS b1
FROM A
GROUP BY LEFT(B,5)
HAVING COUNT(DISTINCT C) = 1
) t1 On LEFT(A.B, 5) = t1.b1
JOIN
(SELECT C AS C1
FROM A
GROUP BY C
HAVING COUNT(DISTINCT LEFT(B,5)) = 1
) t2 ON A.C = t2.c1
But you'll need a computed column as marc_s said at least
And 2 indexes: one on (computed, C) and another on (C, computed)
Well, not sure what you're really trying to do here - but obviously, that LEFT(B, 5) expression keeps popping up. Since you're using a function, you're giving up any chance to use an index.
What you could do in your SQL Server table is to create a computed, persisted column for that expression, and then put an index on that:
ALTER TABLE A
ADD LeftB5 AS LEFT(B, 5) PERSISTED
CREATE NONCLUSTERED INDEX IX_LeftB5 ON dbo.A(LeftB5)
Now use the new computed column LeftB5 instead of LEFT(B, 5) anywhere in your query - that should help to speed up certain lookups and GROUP BY operations.
Also - you have a GROUP BY C in there - is that column C indexed?
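If you are looking for just the set difference between table1 and table2,
the query below is a simple way to get the rows that are in table1 but not in table2, assuming both tables share the same schema with column names
columnone, columntwo, ...
Note that it compares column by column, so it only keeps table1 rows none of whose column values appear in the corresponding column of table2.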
with
col1 as (
select columnone from table2
),
col2 as (
select columntwo from table2
)
...
select * from table1
where (
columnone not in (select columnone from col1)
and columntwo not in (select columntwo from col2)
...
);

SQL Server Comparing Subsequent Rows for Duplicates

I am trying to write a SQL Server query but have had no luck, and I was wondering if anyone has ideas on how to achieve it.
What I'm trying to do:
I have a table with several columns; the ones I am dealing with are TaskID, StatusCode, and Timestamp. This table holds tasks for one of our systems that runs throughout the day. When something runs, it gets a timestamp and a status code depending on the status of that task.
Sometimes the task table is updated with a new timestamp while the status code has not changed since the last update of the task, so two or more consecutive rows of a given task can have the same status code. By consecutive rows I mean consecutive with regard to timestamp.
For example, task 88 could have twenty rows at StatusCode 2, after which the status code changes to something else.
What I am trying to do, with no luck so far, is retrieve a list of all the tasks, status codes, and timestamps from this table. Where there is more than one consecutive row for a task with the same status code, I just want the first row with the lowest timestamp, ignoring the rest of the rows until the status code for that task changes.
To make it simpler, you can assume that I am filtering on a TaskID, so I am looking at a single task.
Does anyone have any ideas as to how I can do this, or something I could read that would help?
Thanks
Irfan.
Here are a couple of ways of getting what you want:
SELECT
T1.task_id,
T1.status_code,
T1.status_timestamp
FROM
My_Table T1
LEFT OUTER JOIN My_Table T2 ON
T2.task_id = T1.task_id AND
T2.status_timestamp < T1.status_timestamp
LEFT OUTER JOIN My_Table T3 ON
T3.task_id = T1.task_id AND
T3.status_timestamp < T1.status_timestamp AND
T3.status_timestamp > T2.status_timestamp
WHERE
T3.task_id IS NULL AND
(T2.status_code IS NULL OR T2.status_code <> T1.status_code)
ORDER BY
T1.status_timestamp
or
SELECT
T1.task_id,
T1.status_code,
T1.status_timestamp
FROM
My_Table T1
LEFT OUTER JOIN My_Table T2 ON
T2.task_id = T1.task_id AND
T2.status_timestamp = (
SELECT
MAX(status_timestamp)
FROM
My_Table T3
WHERE
T3.task_id = T1.task_id AND
T3.status_timestamp < T1.status_timestamp)
WHERE
(T2.status_code IS NULL OR T2.status_code <> T1.status_code)
ORDER BY
T1.status_timestamp
Both methods rely on there being no exact matches of the status_timestamp values (two rows can't have the same exact status_timestamp for a given task_id.)
Something like
select TaskID,StatusCode,Min(TimeStamp)
from table
group by TaskID,StatusCode
order by 1,2
Note that if a status code can recur for a task, you will need an additional field, but hopefully this can point you in the right direction...
Something like the following should get you in the right direction....
CREATE TABLE #T
(
TaskId INT
,StatusCode INT
,StatusTimeStamp DATETIME
)
INSERT INTO #T
SELECT 1, 1, '2009-12-01 14:20'
UNION SELECT 1, 2, '2009-12-01 16:20'
UNION SELECT 1, 2, '2009-12-02 09:15'
UNION SELECT 1, 2, '2009-12-02 12:15'
UNION SELECT 1, 3, '2009-12-02 18:15'
;WITH CTE AS
(
SELECT TaskId
,StatusCode
,StatusTimeStamp
,ROW_NUMBER() OVER (PARTITION BY TaskId, StatusCode ORDER BY StatusTimeStamp) AS RNUM
FROM #T
)
SELECT TaskId
,StatusCode
,StatusTimeStamp
FROM CTE
WHERE RNUM = 1
DROP TABLE #T
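If a task can return to a status code it had earlier, grouping or partitioning by StatusCode merges the separate runs. One alternative is to compare each row with the previous row of the same task and keep only the rows where the code changed. The sketch below does that with LAG, run through Python's sqlite3 module (an assumption; it needs SQLite 3.25 or later, and on SQL Server LAG requires 2012 or later). The table name T and the extra final row, where the task goes back to code 2, are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (TaskId INTEGER, StatusCode INTEGER, StatusTimeStamp TEXT);
    INSERT INTO T VALUES
        (1, 1, '2009-12-01 14:20'),
        (1, 2, '2009-12-01 16:20'),
        (1, 2, '2009-12-02 09:15'),
        (1, 2, '2009-12-02 12:15'),
        (1, 3, '2009-12-02 18:15'),
        (1, 2, '2009-12-03 08:00');  -- status returns to 2 in a new run
""")

first_of_each_run = conn.execute("""
    WITH flagged AS (
        SELECT TaskId, StatusCode, StatusTimeStamp,
               LAG(StatusCode) OVER (PARTITION BY TaskId
                                     ORDER BY StatusTimeStamp) AS PrevCode
        FROM T
    )
    SELECT TaskId, StatusCode, StatusTimeStamp
    FROM flagged
    -- keep a row when it starts the task or when the code differs from the row before
    WHERE PrevCode IS NULL OR PrevCode <> StatusCode
    ORDER BY TaskId, StatusTimeStamp
""").fetchall()

for row in first_of_each_run:
    print(row)
```

Like the self-join answers above, this relies on timestamps being unique per task, since ties would make the "previous row" ambiguous.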
