How can I perform a subtraction between the results of 2 queries with a group by?
The first query returns the number of all, let's say houses that I can rent out, while the second returns the ones already rented.
SELECT
(SELECT COUNT(*) FROM ... GroupBy ...)
- (SELECT COUNT(*) FROM ... WHERE ...group by) AS Difference
First query result
count() column2 column3
3 studio newYork
6 studio pekin
3 apprtment pekin
5 house london
1 house lagos
Second query result
count() column2 column3
2 studio newYork
I would love to have the first query getting updated depending the the result of the second
count() column2 column3
1 studio newYork
6 studio pekin
3 apprtment pekin
5 house london
1 house lagos
Just use conditional aggregation:
SELECT COUNT(*) -
SUM(CASE WHEN <some_condition> THEN 1 ELSE 0 END) AS some_count,
column2,
column3
FROM yourTable
GROUP BY column2, column3
Here <some_condition> is whatever would have appeared in the WHERE clause of your original second count query.
How can I perform a subtraction between the results of 2 queries with
a group by?
You are close with what you have. There are a few changes that would make this work easier, however:
Alias the results of your two
subqueries. This will make them easier to use.
Return more columns from your subqueries so that you may join on "something" which would allow for subtraction to only occur on matching rows.
Add an alias to your Count(*) statements. Again, this will make them easier to use.
If this image demonstrates what you are looking for:
Then I believe this query will help you out:
SELECT op.ApartmentType,
op.ApartmentLocation,
op.TotalOwned,
ISNULL(tp.TotalOccupied, 0) AS [TotalOccupied],
op.TotalOwned - ISNULL(tp.TotalOccupied,0) AS [TotalVacant]
FROM
(
SELECT *,
COUNT(*) as TotalOwned
FROM SO_SubtractionQuestion.OwnedProperties
GROUP BY ApartmentType, ApartmentLocation
) AS op
LEFT JOIN
(
SELECT *,
COUNT(*) as TotalOccupied
FROM [SO_SubtractionQuestion].[OccupiedProperties]
GROUP BY ApartmentType, ApartmentLocation
) AS tp
ON op.ApartmentType = tp.ApartmentType
AND op.ApartmentLocation = tp.ApartmentLocation
I set this query up similar to your own: it has a select statement with two subqueries and the subqueries have a Count(*) on a grouped query. I also added what I suggested above to it:
My first subquery is aliased with op (owned properties) and my second is aliased with tp(taken properties).
I am returning more columns so that I may properly join them in my outer query.
My Count(*) statements in my subqueries have aliases.
In my outer query, I am then able to join on ApartmentType and ApartmentLocation (look below for the example table/data setup). This creates a result set that is joined on ApartmentType and ApartmentLocation that also contains how many Owned Properties there are (the Count(*) from the first subquery) and how many Occupied Properties there are (the Count(*) from the second subquery). At this point, because I have everything aliased, I am able to do simple subtraction to see how many properties are vacant with op.TotalOwned - ISNULL(tp.TotalOccupied,0) AS [TotalVacant].
I am also using ISNULL to correct for null values. If I did not have this, the result of the subtraction would also be null for rows that did not have a match from the second subquery.
Test Table/Data Setup
To set up the example for yourself, here are the queries to run:
Step 1
For organizational purposes
CREATE SCHEMA SO_SubtractionQuestion;
Step 2
CREATE TABLE SO_SubtractionQuestion.OwnedProperties
(
ApartmentType varchar(20),
ApartmentLocation varchar(20)
);
CREATE TABLE SO_SubtractionQuestion.OccupiedProperties
(
ApartmentType varchar(20),
ApartmentLocation varchar(20)
);
INSERT INTO [SO_SubtractionQuestion].[OwnedProperties] VALUES ('Studio', 'New York'), ('Studio', 'New York'), ('Studio', 'New York'), ('House', 'New York'), ('House', 'Madison');
INSERT INTO [SO_SubtractionQuestion].[OccupiedProperties] VALUES ('Studio', 'New York'), ('Studio', 'New York');
Select Column2, Column3, Count(*)-(Select Count(*)
From Table2
Where Table1.Column2=Table2.Column2 and Table1.Column3=Table2.Column3)
From Table1
Group by Column2, Column3
Want to improve this post? Provide detailed answers to this question, including citations and an explanation of why your answer is correct. Answers without enough detail may be edited or deleted.
This question already has answers here:
Retrieving the last record in each group - MySQL
(33 answers)
Closed 3 years ago.
I have this table for documents (simplified version here):
id
rev
content
1
1
...
2
1
...
1
2
...
1
3
...
How do I select one row per id and only the greatest rev?
With the above data, the result should contain two rows: [1, 3, ...] and [2, 1, ..]. I'm using MySQL.
Currently I use checks in the while loop to detect and over-write old revs from the resultset. But is this the only method to achieve the result? Isn't there a SQL solution?
At first glance...
All you need is a GROUP BY clause with the MAX aggregate function:
SELECT id, MAX(rev)
FROM YourTable
GROUP BY id
It's never that simple, is it?
I just noticed you need the content column as well.
This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I heard that a lot during my career. Actually, it was one the questions I answered in my current job's technical interview.
It is, actually, so common that Stack Overflow community has created a single tag just to deal with questions like that: greatest-n-per-group.
Basically, you have two approaches to solve that problem:
Joining with simple group-identifier, max-value-in-group Sub-query
In this approach, you first find the group-identifier, max-value-in-group (already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier and max-value-in-group:
SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
SELECT id, MAX(rev) rev
FROM YourTable
GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
Left Joining with self, tweaking join conditions and filters
In this approach, you left join the table with itself. Equality goes in the group-identifier. Then, 2 smart moves:
The second join condition is having left side value less than right value
When you do step 1, the row(s) that actually have the max value will have NULL in the right side (it's a LEFT JOIN, remember?). Then, we filter the joined result, showing only the rows where the right side is NULL.
So you end up with:
SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
Conclusion
Both approaches bring the exact same result.
If you have two rows with max-value-in-group for group-identifier, both rows will be in the result in both approaches.
Both approaches are SQL ANSI compatible, thus, will work with your favorite RDBMS, regardless of its "flavor".
Both approaches are also performance friendly, however your mileage may vary (RDBMS, DB Structure, Indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one which make most of sense to you.
My preference is to use as little code as possible...
You can do it using IN
try this:
SELECT *
FROM t1 WHERE (id,rev) IN
( SELECT id, MAX(rev)
FROM t1
GROUP BY id
)
to my mind it is less complicated... easier to read and maintain.
I am flabbergasted that no answer offered SQL window function solution:
SELECT a.id, a.rev, a.contents
FROM (SELECT id, rev, contents,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) ranked_order
FROM YourTable) a
WHERE a.ranked_order = 1
Added in SQL standard ANSI/ISO Standard SQL:2003 and later extended with ANSI/ISO Standard SQL:2008, window (or windowing) functions are available with all major vendors now. There are more types of rank functions available to deal with a tie issue: RANK, DENSE_RANK, PERSENT_RANK.
Yet another solution is to use a correlated subquery:
select yt.id, yt.rev, yt.contents
from YourTable yt
where rev =
(select max(rev) from YourTable st where yt.id=st.id)
Having an index on (id,rev) renders the subquery almost as a simple lookup...
Following are comparisons to the solutions in #AdrianCarneiro's answer (subquery, leftjoin), based on MySQL measurements with InnoDB table of ~1million records, group size being: 1-3.
While for full table scans subquery/leftjoin/correlated timings relate to each other as 6/8/9, when it comes to direct lookups or batch (id in (1,2,3)), subquery is much slower then the others (Due to rerunning the subquery). However I couldnt differentiate between leftjoin and correlated solutions in speed.
One final note, as leftjoin creates n*(n+1)/2 joins in groups, its performance can be heavily affected by the size of groups...
I can't vouch for the performance, but here's a trick inspired by the limitations of Microsoft Excel. It has some good features
GOOD STUFF
It should force return of only one "max record" even if there is a tie (sometimes useful)
It doesn't require a join
APPROACH
It is a little bit ugly and requires that you know something about the range of valid values of the rev column. Let us assume that we know the rev column is a number between 0.00 and 999 including decimals but that there will only ever be two digits to the right of the decimal point (e.g. 34.17 would be a valid value).
The gist of the thing is that you create a single synthetic column by string concatenating/packing the primary comparison field along with the data you want. In this way, you can force SQL's MAX() aggregate function to return all of the data (because it has been packed into a single column). Then you have to unpack the data.
Here's how it looks with the above example, written in SQL
SELECT id,
CAST(SUBSTRING(max(packed_col) FROM 2 FOR 6) AS float) as max_rev,
SUBSTRING(max(packed_col) FROM 11) AS content_for_max_rev
FROM (SELECT id,
CAST(1000 + rev + .001 as CHAR) || '---' || CAST(content AS char) AS packed_col
FROM yourtable
)
GROUP BY id
The packing begins by forcing the rev column to be a number of known character length regardless of the value of rev so that for example
3.2 becomes 1003.201
57 becomes 1057.001
923.88 becomes 1923.881
If you do it right, string comparison of two numbers should yield the same "max" as numeric comparison of the two numbers and it's easy to convert back to the original number using the substring function (which is available in one form or another pretty much everywhere).
Unique Identifiers? Yes! Unique identifiers!
One of the best ways to develop a MySQL DB is to have each id AUTOINCREMENT (Source MySQL.com). This allows a variety of advantages, too many to cover here. The problem with the question is that its example has duplicate ids. This disregards these tremendous advantages of unique identifiers, and at the same time, is confusing to those familiar with this already.
Cleanest Solution
DB Fiddle
Newer versions of MySQL come with ONLY_FULL_GROUP_BY enabled by default, and many of the solutions here will fail in testing with this condition.
Even so, we can simply select DISTINCT someuniquefield, MAX( whateverotherfieldtoselect ), ( *somethirdfield ), etc., and have no worries understanding the result or how the query works :
SELECT DISTINCT t1.id, MAX(t1.rev), MAX(t2.content)
FROM Table1 AS t1
JOIN Table1 AS t2 ON t2.id = t1.id AND t2.rev = (
SELECT MAX(rev) FROM Table1 t3 WHERE t3.id = t1.id
)
GROUP BY t1.id;
SELECT DISTINCT Table1.id, max(Table1.rev), max(Table2.content) : Return DISTINCT somefield, MAX() some otherfield, the last MAX() is redundant, because I know it's just one row, but it's required by the query.
FROM Employee : Table searched on.
JOIN Table1 AS Table2 ON Table2.rev = Table1.rev : Join the second table on the first, because, we need to get the max(table1.rev)'s comment.
GROUP BY Table1.id: Force the top-sorted, Salary row of each employee to be the returned result.
Note that since "content" was "..." in OP's question, there's no way to test that this works. So, I changed that to "..a", "..b", so, we can actually now see that the results are correct:
id max(Table1.rev) max(Table2.content)
1 3 ..d
2 1 ..b
Why is it clean? DISTINCT(), MAX(), etc., all make wonderful use of MySQL indices. This will be faster. Or, it will be much faster, if you have indexing, and you compare it to a query that looks at all rows.
Original Solution
With ONLY_FULL_GROUP_BY disabled, we can use still use GROUP BY, but then we are only using it on the Salary, and not the id:
SELECT *
FROM
(SELECT *
FROM Employee
ORDER BY Salary DESC)
AS employeesub
GROUP BY employeesub.Salary;
SELECT * : Return all fields.
FROM Employee : Table searched on.
(SELECT *...) subquery : Return all people, sorted by Salary.
GROUP BY employeesub.Salary: Force the top-sorted, Salary row of each employee to be the returned result.
Unique-Row Solution
Note the Definition of a Relational Database: "Each row in a table has its own unique key." This would mean that, in the question's example, id would have to be unique, and in that case, we can just do :
SELECT *
FROM Employee
WHERE Employee.id = 12345
ORDER BY Employee.Salary DESC
LIMIT 1
Hopefully this is a solution that solves the problem and helps everyone better understand what's happening in the DB.
Another manner to do the job is using MAX() analytic function in OVER PARTITION clause
SELECT t.*
FROM
(
SELECT id
,rev
,contents
,MAX(rev) OVER (PARTITION BY id) as max_rev
FROM YourTable
) t
WHERE t.rev = t.max_rev
The other ROW_NUMBER() OVER PARTITION solution already documented in this post is
SELECT t.*
FROM
(
SELECT id
,rev
,contents
,ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) rank
FROM YourTable
) t
WHERE t.rank = 1
This 2 SELECT work well on Oracle 10g.
MAX() solution runs certainly FASTER that ROW_NUMBER() solution because MAX() complexity is O(n) while ROW_NUMBER() complexity is at minimum O(n.log(n)) where n represent the number of records in table !
Something like this?
SELECT yourtable.id, rev, content
FROM yourtable
INNER JOIN (
SELECT id, max(rev) as maxrev
FROM yourtable
GROUP BY id
) AS child ON (yourtable.id = child.id) AND (yourtable.rev = maxrev)
I like to use a NOT EXIST-based solution for this problem:
SELECT
id,
rev
-- you can select other columns here
FROM YourTable t
WHERE NOT EXISTS (
SELECT * FROM YourTable t WHERE t.id = id AND rev > t.rev
)
This will select all records with max value within the group and allows you to select other columns.
SELECT *
FROM Employee
where Employee.Salary in (select max(salary) from Employee group by Employe_id)
ORDER BY Employee.Salary
Note: I probably wouldn't recommend this anymore in MySQL 8+ days. Haven't used it in years.
A third solution I hardly ever see mentioned is MySQL specific and looks like this:
SELECT id, MAX(rev) AS rev
, 0+SUBSTRING_INDEX(GROUP_CONCAT(numeric_content ORDER BY rev DESC), ',', 1) AS numeric_content
FROM t1
GROUP BY id
Yes it looks awful (converting to string and back etc.) but in my experience it's usually faster than the other solutions. Maybe that's just for my use cases, but I have used it on tables with millions of records and many unique ids. Maybe it's because MySQL is pretty bad at optimizing the other solutions (at least in the 5.0 days when I came up with this solution).
One important thing is that GROUP_CONCAT has a maximum length for the string it can build up. You probably want to raise this limit by setting the group_concat_max_len variable. And keep in mind that this will be a limit on scaling if you have a large number of rows.
Anyway, the above doesn't directly work if your content field is already text. In that case you probably want to use a different separator, like \0 maybe. You'll also run into the group_concat_max_len limit quicker.
I think, You want this?
select * from docs where (id, rev) IN (select id, max(rev) as rev from docs group by id order by id)
SQL Fiddle :
Check here
NOT mySQL, but for other people finding this question and using SQL, another way to resolve the greatest-n-per-group problem is using Cross Apply in MS SQL
WITH DocIds AS (SELECT DISTINCT id FROM docs)
SELECT d2.id, d2.rev, d2.content
FROM DocIds d1
CROSS APPLY (
SELECT Top 1 * FROM docs d
WHERE d.id = d1.id
ORDER BY rev DESC
) d2
Here's an example in SqlFiddle
I would use this:
select t.*
from test as t
join
(select max(rev) as rev
from test
group by id) as o
on o.rev = t.rev
Subquery SELECT is not too eficient maybe, but in JOIN clause seems to be usable. I'm not an expert in optimizing queries, but I've tried at MySQL, PostgreSQL, FireBird and it does work very good.
You can use this schema in multiple joins and with WHERE clause. It is my working example (solving identical to yours problem with table "firmy"):
select *
from platnosci as p
join firmy as f
on p.id_rel_firmy = f.id_rel
join (select max(id_obj) as id_obj
from firmy
group by id_rel) as o
on o.id_obj = f.id_obj and p.od > '2014-03-01'
It is asked on tables having teens thusands of records, and it takes less then 0,01 second on really not too strong machine.
I wouldn't use IN clause (as it is mentioned somewhere above). IN is given to use with short lists of constans, and not as to be the query filter built on subquery. It is because subquery in IN is performed for every scanned record which can made query taking very loooong time.
Since this is most popular question with regard to this problem, I'll re-post another answer to it here as well:
It looks like there is simpler way to do this (but only in MySQL):
select *
from (select * from mytable order by id, rev desc ) x
group by id
Please credit answer of user Bohemian in this question for providing such a concise and elegant answer to this problem.
Edit: though this solution works for many people it may not be stable in the long run, since MySQL doesn't guarantee that GROUP BY statement will return meaningful values for columns not in GROUP BY list. So use this solution at your own risk!
If you have many fields in select statement and you want latest value for all of those fields through optimized code:
select * from
(select * from table_name
order by id,rev desc) temp
group by id
How about this:
SELECT all_fields.*
FROM (SELECT id, MAX(rev) FROM yourtable GROUP BY id) AS max_recs
LEFT OUTER JOIN yourtable AS all_fields
ON max_recs.id = all_fields.id
This solution makes only one selection from YourTable, therefore it's faster. It works only for MySQL and SQLite(for SQLite remove DESC) according to test on sqlfiddle.com. Maybe it can be tweaked to work on other languages which I am not familiar with.
SELECT *
FROM ( SELECT *
FROM ( SELECT 1 as id, 1 as rev, 'content1' as content
UNION
SELECT 2, 1, 'content2'
UNION
SELECT 1, 2, 'content3'
UNION
SELECT 1, 3, 'content4'
) as YourTable
ORDER BY id, rev DESC
) as YourTable
GROUP BY id
Here is a nice way of doing that
Use following code :
with temp as (
select count(field1) as summ , field1
from table_name
group by field1 )
select * from temp where summ = (select max(summ) from temp)
I like to do this by ranking the records by some column. In this case, rank rev values grouped by id. Those with higher rev will have lower rankings. So highest rev will have ranking of 1.
select id, rev, content
from
(select
#rowNum := if(#prevValue = id, #rowNum+1, 1) as row_num,
id, rev, content,
#prevValue := id
from
(select id, rev, content from YOURTABLE order by id asc, rev desc) TEMP,
(select #rowNum := 1 from DUAL) X,
(select #prevValue := -1 from DUAL) Y) TEMP
where row_num = 1;
Not sure if introducing variables makes the whole thing slower. But at least I'm not querying YOURTABLE twice.
here is another solution hope it will help someone
Select a.id , a.rev, a.content from Table1 a
inner join
(SELECT id, max(rev) rev FROM Table1 GROUP BY id) x on x.id =a.id and x.rev =a.rev
None of these answers have worked for me.
This is what worked for me.
with score as (select max(score_up) from history)
select history.* from score, history where history.score_up = score.max
Here's another solution to retrieving the records only with a field that has the maximum value for that field. This works for SQL400 which is the platform I work on. In this example, the records with the maximum value in field FIELD5 will be retrieved by the following SQL statement.
SELECT A.KEYFIELD1, A.KEYFIELD2, A.FIELD3, A.FIELD4, A.FIELD5
FROM MYFILE A
WHERE RRN(A) IN
(SELECT RRN(B)
FROM MYFILE B
WHERE B.KEYFIELD1 = A.KEYFIELD1 AND B.KEYFIELD2 = A.KEYFIELD2
ORDER BY B.FIELD5 DESC
FETCH FIRST ROW ONLY)
Sorted the rev field in reverse order and then grouped by id which gave the first row of each grouping which is the one with the highest rev value.
SELECT * FROM (SELECT * FROM table1 ORDER BY id, rev DESC) X GROUP BY X.id;
Tested in http://sqlfiddle.com/ with the following data
CREATE TABLE table1
(`id` int, `rev` int, `content` varchar(11));
INSERT INTO table1
(`id`, `rev`, `content`)
VALUES
(1, 1, 'One-One'),
(1, 2, 'One-Two'),
(2, 1, 'Two-One'),
(2, 2, 'Two-Two'),
(3, 2, 'Three-Two'),
(3, 1, 'Three-One'),
(3, 3, 'Three-Three')
;
This gave the following result in MySql 5.5 and 5.6
id rev content
1 2 One-Two
2 2 Two-Two
3 3 Three-Two
You can make the select without a join when you combine the rev and id into one maxRevId value for MAX() and then split it back to original values:
SELECT maxRevId & ((1 << 32) - 1) as id, maxRevId >> 32 AS rev
FROM (SELECT MAX(((rev << 32) | id)) AS maxRevId
FROM YourTable
GROUP BY id) x;
This is especially fast when there is a complex join instead of a single table. With the traditional approaches the complex join would be done twice.
The above combination is simple with bit functions when rev and id are INT UNSIGNED (32 bit) and combined value fits to BIGINT UNSIGNED (64 bit). When the id & rev are larger than 32-bit values or made of multiple columns, you need combine the value into e.g. a binary value with suitable padding for MAX().
Explanation
This is not pure SQL. This will use the SQLAlchemy ORM.
I came here looking for SQLAlchemy help, so I will duplicate Adrian Carneiro's answer with the python/SQLAlchemy version, specifically the outer join part.
This query answers the question of:
"Can you return me the records in this group of records (based on same id) that have the highest version number".
This allows me to duplicate the record, update it, increment its version number, and have the copy of the old version in such a way that I can show change over time.
Code
MyTableAlias = aliased(MyTable)
newest_records = appdb.session.query(MyTable).select_from(join(
MyTable,
MyTableAlias,
onclause=and_(
MyTable.id == MyTableAlias.id,
MyTable.version_int < MyTableAlias.version_int
),
isouter=True
)
).filter(
MyTableAlias.id == None,
).all()
Tested on a PostgreSQL database.
I used the below to solve a problem of my own. I first created a temp table and inserted the max rev value per unique id.
CREATE TABLE #temp1
(
id varchar(20)
, rev int
)
INSERT INTO #temp1
SELECT a.id, MAX(a.rev) as rev
FROM
(
SELECT id, content, SUM(rev) as rev
FROM YourTable
GROUP BY id, content
) as a
GROUP BY a.id
ORDER BY a.id
I then joined these max values (#temp1) to all of the possible id/content combinations. By doing this, I naturally filter out the non-maximum id/content combinations, and am left with the only max rev values for each.
SELECT a.id, a.rev, content
FROM #temp1 as a
LEFT JOIN
(
SELECT id, content, SUM(rev) as rev
FROM YourTable
GROUP BY id, content
) as b on a.id = b.id and a.rev = b.rev
GROUP BY a.id, a.rev, b.content
ORDER BY a.id
I need to get some items from database with top three comments for each item.
Now I have two stored procedures GetAllItems and GetTopThreeeCommentsByItemId.
In application I get 100 items and then in foreach loop I call GetTopThreeeCommentsByItemId procedure to get top three comments.
I know that this is bad from performance standpoint.
Is there some technique that allows to get this with one query?
I can use OUTER APPLY to get one top comment (if any) but I don't know how to get three.
Items {ItemId, Title, Description, Price etc.}
Comments {CommentId, ItemId etc.}
Sample data that I want to get
Item_1
-- comment_1
-- comment_2
-- comment_3
Item_2
-- comment_4
-- comment_5
One approach would be to use a CTE (Common Table Expression) if you're on SQL Server 2005 and newer (you aren't specific enough in that regard).
With this CTE, you can partition your data by some criteria - i.e. your ItemId - and have SQL Server number all your rows starting at 1 for each of those "partitions", ordered by some criteria.
So try something like this:
;WITH ItemsAndComments AS
(
SELECT
i.ItemId, i.Title, i.Description, i.Price,
c.CommentId, c.CommentText,
ROW_NUMBER() OVER(PARTITION BY i.ItemId ORDER BY c.CommentId) AS 'RowNum'
FROM
dbo.Items i
LEFT OUTER JOIN
dbo.Comments c ON c.ItemId = i.ItemId
WHERE
......
)
SELECT
ItemId, Title, Description, Price,
CommentId, CommentText
FROM
ItemsAndComments
WHERE
RowNum <= 3
Here, I am selecting up to three entries (i.e. comments) for each "partition" (i.e. for each item) - ordered by the CommentId.
Does that approach what you're looking for??
You can write a single stored procedure which calls GetAllItems and GetTopThreeeCommentsByItemId, takes results in temp tables and join those tables to produce the single resultset you need.
If you do not have a chance to use a stored procedure, you can still do the same by running a single SQL script from data access tier, which calls GetAllItems and GetTopThreeeCommentsByItemId and takes results into temp tables and join them later to return a single resultset.
This gets two elder brother using OUTER APPLY:
select m.*, elder.*
from Member m
outer apply
(
select top 2 ElderBirthDate = x.BirthDate, ElderFirstname = x.Firstname
from Member x
where x.BirthDate < m.BirthDate
order by x.BirthDate desc
) as elder
order by m.BirthDate, elder.ElderBirthDate desc
Source data:
create table Member
(
Firstname varchar(20) not null,
Lastname varchar(20) not null,
BirthDate date not null unique
);
insert into Member(Firstname,Lastname,Birthdate) values
('John','Lennon','Oct 9, 1940'),
('Paul','McCartney','June 8, 1942'),
('George','Harrison','February 25, 1943'),
('Ringo','Starr','July 7, 1940');
Output:
Firstname Lastname BirthDate ElderBirthDate ElderFirstname
-------------------- -------------------- ---------- -------------- --------------------
Ringo Starr 1940-07-07 NULL NULL
John Lennon 1940-10-09 1940-07-07 Ringo
Paul McCartney 1942-06-08 1940-10-09 John
Paul McCartney 1942-06-08 1940-07-07 Ringo
George Harrison 1943-02-25 1942-06-08 Paul
George Harrison 1943-02-25 1940-10-09 John
(6 row(s) affected)
Live test: http://www.sqlfiddle.com/#!3/19a63/2
marc's answer is better, just use OUTER APPLY if you need to query "near" entities (e.g. geospatial, elder brothers, nearest date to due date, etc) to the main entity.
Outer apply walkthrough: http://www.ienablemuch.com/2012/04/outer-apply-walkthrough.html
You might need DENSE_RANK instead of ROW_NUMBER/RANK though, as the criteria of a comment being a top could yield ties. TOP 1 could yield more than one, TOP 3 could yield more than three too. Example of that scenario(DENSE_RANK walkthrough): http://www.anicehumble.com/2012/03/postgresql-denserank.html
Its better that you select the statement by using the row_number statement and select the top 3 alone
select a.* from
(
Select *,row_number() over(partition by column)[dup]
) as a
where dup<=3
I want to learn how to combine two db tables which have no fields in common. I've checked UNION but MSDN says :
The following are basic rules for combining the result sets of two queries by using UNION:
The number and the order of the columns must be the same in all queries.
The data types must be compatible.
But I have no fields in common at all. All I want is to combine them in one table like a view.
So what should I do?
There are a number of ways to do this, depending on what you really want. With no common columns, you need to decide whether you want to introduce a common column or get the product.
Let's say you have the two tables:
parts: custs:
+----+----------+ +-----+------+
| id | desc | | id | name |
+----+----------+ +-----+------+
| 1 | Sprocket | | 100 | Bob |
| 2 | Flange | | 101 | Paul |
+----+----------+ +-----+------+
Forget the actual columns since you'd most likely have a customer/order/part relationship in this case; I've just used those columns to illustrate the ways to do it.
A cartesian product will match every row in the first table with every row in the second:
> select * from parts, custs;
id desc id name
-- ---- --- ----
1 Sprocket 101 Bob
1 Sprocket 102 Paul
2 Flange 101 Bob
2 Flange 102 Paul
That's probably not what you want since 1000 parts and 100 customers would result in 100,000 rows with lots of duplicated information.
Alternatively, you can use a union to just output the data, though not side-by-side (you'll need to make sure column types are compatible between the two selects, either by making the table columns compatible or coercing them in the select):
> select id as pid, desc, null as cid, null as name from parts
union
select null as pid, null as desc, id as cid, name from custs;
pid desc cid name
--- ---- --- ----
101 Bob
102 Paul
1 Sprocket
2 Flange
In some databases, you can use a rowid/rownum column or pseudo-column to match records side-by-side, such as:
id desc id name
-- ---- --- ----
1 Sprocket 101 Bob
2 Flange 101 Bob
The code would be something like:
select a.id, a.desc, b.id, b.name
from parts a, custs b
where a.rownum = b.rownum;
It's still like a cartesian product but the where clause limits how the rows are combined to form the results (so not a cartesian product at all, really).
I haven't tested that SQL for this since it's one of the limitations of my DBMS of choice, and rightly so, I don't believe it's ever needed in a properly thought-out schema. Since SQL doesn't guarantee the order in which it produces data, the matching can change every time you do the query unless you have a specific relationship or order by clause.
I think the ideal thing to do would be to add a column to both tables specifying what the relationship is. If there's no real relationship, then you probably have no business in trying to put them side-by-side with SQL.
If you just want them displayed side-by-side in a report or on a web page (two examples), the right tool to do that is whatever generates your report or web page, coupled with two independent SQL queries to get the two unrelated tables. For example, a two-column grid in BIRT (or Crystal or Jasper) each with a separate data table, or a HTML two column table (or CSS) each with a separate data table.
This is a very strange request, and almost certainly something you'd never want to do in a real-world application, but from a purely academic standpoint it's an interesting challenge. With SQL Server 2005 you could use common table expressions and the row_number() functions and join on that:
with OrderedFoos as (
select row_number() over (order by FooName) RowNum, *
from Foos (nolock)
),
OrderedBars as (
select row_number() over (order by BarName) RowNum, *
from Bars (nolock)
)
select *
from OrderedFoos f
full outer join OrderedBars u on u.RowNum = f.RowNum
This works, but it's supremely silly and I offer it only as a "community wiki" answer because I really wouldn't recommend it.
SELECT *
FROM table1, table2
This will join every row in table1 with table2 (the Cartesian product) returning all columns.
select
status_id,
status,
null as path,
null as Description
from
zmw_t_status
union
select
null,
null,
path as cid,
Description from zmw_t_path;
try:
select * from table 1 left join table2 as t on 1 = 1;
This will bring all the columns from both the table.
If the tables have no common fields then there is no way to combine the data in any meaningful view. You would more likely end up with a view that contains duplicated data from both tables.
To get a meaningful/useful view of the two tables, you normally need to determine an identifying field from each table that can then be used in the ON clause in a JOIN.
THen in your view:
SELECT T1.*, T2.* FROM T1 JOIN T2 ON T1.IDFIELD1 = T2.IDFIELD2
You mention no fields are "common", but although the identifying fields may not have the same name or even be the same data type, you could use the convert / cast functions to join them in some way.
why don't you use simple approach
SELECT distinct *
FROM
SUPPLIER full join
CUSTOMER on (
CUSTOMER.OID = SUPPLIER.OID
)
It gives you all columns from both tables and returns all records from customer and supplier if Customer has 3 records and supplier has 2 then supplier'll show NULL in all columns
Select
DISTINCT t1.col,t2col
From table1 t1, table2 t2
OR
Select
DISTINCT t1.col,t2col
From table1 t1
cross JOIN table2 t2
if its hug data , its take long time ..
SELECT t1.col table1col, t2.col table2col
FROM table1 t1
JOIN table2 t2 on t1.table1Id = x and t2.table2Id = y
Joining Non-Related Tables
Demo SQL Script
IF OBJECT_ID('Tempdb..#T1') IS NOT NULL DROP TABLE #T1;
CREATE TABLE #T1 (T1_Name VARCHAR(75));
INSERT INTO #T1 (T1_Name) VALUES ('Animal'),('Bat'),('Cat'),('Duet');
SELECT * FROM #T1;
IF OBJECT_ID('Tempdb..#T2') IS NOT NULL DROP TABLE #T2;
CREATE TABLE #T2 (T2_Class VARCHAR(10));
INSERT INTO #T2 (T2_Class) VALUES ('Z'),('T'),('H');
SELECT * FROM #T2;
To Join Non-Related Tables , we are going to introduce one common joining column of Serial Numbers like below.
SQL Script
SELECT T1.T1_Name,ISNULL(T2.T2_Class,'') AS T2_Class FROM
( SELECT T1_Name,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS S_NO FROM #T1) T1
LEFT JOIN
( SELECT T2_Class,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS S_NO FROM #T2) T2
ON t1.S_NO=T2.S_NO;
select * from this_table;
select distinct person from this_table
union select address as location from that_table
drop wrong_table from this_database;
Very hard when you have to do this with three select statments
I tried all proposed techniques up there but it's in-vain
Please see below script. please advice if you have alternative solution
select distinct x.best_Achiver_ever,y.Today_best_Achiver ,z.Most_Violator from
(SELECT Top(4) ROW_NUMBER() over (order by tl.username) AS conj, tl.
[username] + '-->' + str(count(*)) as best_Achiver_ever
FROM[TiketFollowup].[dbo].N_FCR_Tikect_Log_Archive tl
group by tl.username
order by count(*) desc) x
left outer join
(SELECT
Top(4) ROW_NUMBER() over (order by tl.username) as conj, tl.[username] + '-->' + str(count(*)) as Today_best_Achiver
FROM[TiketFollowup].[dbo].[N_FCR_Tikect_Log] tl
where convert(date, tl.stamp, 121) = convert(date,GETDATE(),121)
group by tl.username
order by count(*) desc) y
on x.conj=y.conj
left outer join
(
select ROW_NUMBER() over (order by count(*)) as conj,username+ '--> ' + str( count(dbo.IsViolated(stamp))) as Most_Violator from N_FCR_Ticket
where dbo.IsViolated(stamp) = 'violated' and convert(date,stamp, 121) < convert(date,GETDATE(),121)
group by username
order by count(*) desc) z
on x.conj = z.conj
Please try this query:
Combine two tables that have no common columns:
SELECT *
FROM table1
UNION
SELECT *
FROM table2
ORDER BY orderby ASC