How do I get multiple values from a table in SQLite?

I have three tables:
authors
id name
1 Albert
2 Bobby
3 Carl
4 Dan
authors_musicals
rowid author_id musical_id
1 1 1
2 2 1
3 1 2
4 1 3
musicals
id title year
1 Brigadoon 1947
2 My Fair Lady 1956
3 Oklahoma! 1943
4 Camelot 1960
I need to get all the titles belonging to Albert (his id (1) from authors corresponds to musical_id (1, 2, 3) in authors_musicals which each correspond to title (Brigadoon, My Fair Lady, Oklahoma!) in musicals). I thought the following would work:
SELECT title FROM musicals WHERE id=(SELECT musical_id FROM authors_musicals WHERE author_id=(SELECT id FROM authors WHERE name="Albert"));
This only gives me the first listing. How can I get all three and since these tables are linked, is there a simpler way of getting what I want?

JOIN the tables:
SELECT musicals.title
FROM musicals
JOIN authors_musicals ON (musicals.id = authors_musicals.musical_id)
JOIN authors ON (authors.id = authors_musicals.author_id)
WHERE authors.name = "Albert"

I don't use SQLite but I would assume that it's basically the same as using SQL for any other database. When you use SomeColumn = SomeValue you can only have one value on the right-hand side. Even if your subqueries produce multiple results, only the first will be used because you're using =.
You should be able to keep your current SQL structure and make it work by replacing = with IN, assuming that SQLite supports that operator. Then you'll be comparing to all the results instead of just one.
That said, I don't think that you should be using subqueries at all. It seems more appropriate to be using joins there. Again, there might be some small syntax difference but something like this should work:
SELECT title FROM musicals INNER JOIN authors_musicals
ON musicals.id = authors_musicals.musical_id INNER JOIN authors
ON authors.id = authors_musicals.author_id
WHERE authors.name = 'Albert'
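A minimal sketch of the difference between = and IN, run against the question's sample data through Python's built-in sqlite3 module. The in-memory database and the titles helper are assumptions made for illustration; the table and column names come from the question.

```python
import sqlite3

# In-memory database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE musicals (id INTEGER PRIMARY KEY, title TEXT, year INTEGER);
CREATE TABLE authors_musicals (author_id INTEGER, musical_id INTEGER);
INSERT INTO authors VALUES (1,'Albert'),(2,'Bobby'),(3,'Carl'),(4,'Dan');
INSERT INTO musicals VALUES (1,'Brigadoon',1947),(2,'My Fair Lady',1956),
                            (3,'Oklahoma!',1943),(4,'Camelot',1960);
INSERT INTO authors_musicals VALUES (1,1),(2,1),(1,2),(1,3);
""")

def titles(sql):
    """Run a query with the author name bound to '?' and return sorted titles."""
    return sorted(row[0] for row in conn.execute(sql, ("Albert",)))

# Original form: '=' compares against a single value from the subquery.
with_equals = titles("""
    SELECT title FROM musicals
    WHERE id = (SELECT musical_id FROM authors_musicals
                WHERE author_id = (SELECT id FROM authors WHERE name = ?))
""")

# Same structure with IN: compares against every row the subquery returns.
with_in = titles("""
    SELECT title FROM musicals
    WHERE id IN (SELECT musical_id FROM authors_musicals
                 WHERE author_id IN (SELECT id FROM authors WHERE name = ?))
""")

# JOIN form from the answers.
with_join = titles("""
    SELECT musicals.title
    FROM musicals
    JOIN authors_musicals ON musicals.id = authors_musicals.musical_id
    JOIN authors ON authors.id = authors_musicals.author_id
    WHERE authors.name = ?
""")

print(with_equals)
print(with_in)
print(with_join)
```

The = version yields a single title, while the IN and JOIN versions both yield all three of Albert's musicals.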

Combine the info between tables and get what you need:
SELECT title
FROM authors, authors_musicals, musicals
WHERE name="Albert" and authors.id=authors_musicals.author_id and musical_id = musicals.id;

Related

SELECT INTO returning multiple records when it should return 1

EDIT:
I've since edited the query using JOINS instead of a WHERE clause in light
of suggested comments. I was using a WHERE clause instead of JOIN because I
couldn't get it to work across three tables but have figured it out. I've
also inserted SELECT DISTINCT because it does solve the problem.
Thanks @MichaelEvanchik and @SeanLange for the help. I'm still learning and hope I don't frustrate you guys too much.
I've looked over many of the multiple return threads and don't seem to find an answer that helps me.
I have 4 tables.
table1
ID Cat1_Name1 Cat1_Name2 Cat2_Name1 Cat2_Name2
12 Mike Mike George Mike
13 Jen Jen Amy Amy
14 Jeff Jen Mike Ben
15 Jeff Jeff Fred Tom
16 George Jen Luke Amy
table2
ID Cat1_Name1 Cat1_Name2 Cat2_Name1 Cat2_Name2
25 Mike Mike Jen George
table3
Name Cat1_Value Cat2_Value
Mike 6.5 20.25
Jen 10.2 0.5
Jeff 11.5 1.5
George 8.0 27.1
table4
Name Cat1_Value Cat2_Value
Mike 7.8 20.0
Jen 6.0 13.0
Jeff 13.2 5.0
George 8.0 1.2
Before anyone asks, the set of names in table2 must stay separate from table1. It isn’t duplicate information, but a SINGLE UNKNOWN SET that will be compared to every record in table1, which can contain millions of known sets (i.e., no IDs in table1 will ever match the ID in table2). If you look at the tables you can see that the set of names CAN match between table1 and table2 but does not have to. For example, the cat1 names match between IDs 12 and 25 (all four are Mike) but do not match between IDs 13, 14, 15, 16 and 25 (only the two names in 25 are Mike). In cat2, IDs 12 and 25 match partially (i.e., the cat2 names in both tables contain George, but the second name differs). Here I show two categories. There will be upwards of 30 categories of names for one record, but for now I am focusing on one, Cat1_Name1 and Cat1_Name2, to solve this particular problem. I will worry about aggregating the different categories and logical name combinations with JOINs and UNIONs, and about answering my other question, later (hopefully).
I want to create a new table that returns the ID from table1, with the associated value for each category depending on how many names match in the category. For example, since Cat1_Name1 and Cat1_Name2 in table1 are Mike, Mike AND Cat1_Name1 and Cat1_Name2 in table2 are Mike, Mike, return the ID from table1 (12) and the value in table3 for cat1 (6.5). Different sets of matching names would return values from different tables (i.e., the partially matching set in cat2 between 12 and 25 might return the value from table4, etc.). I asked a similar question about this previously, but the problem was different:
Returning Results from different tables depending on conditions from two other tables
I have a partial answer for it but now have a different problem. I plan on posting an answer to the first, once I figure out this problem (hopefully with a little help).
Here’s my query:
SELECT DISTINCT dbo.table1.ID, dbo.table3.Cat1_Value
INTO Cat1Table
FROM dbo.table2
INNER JOIN dbo.table3 ON (dbo.table1.Cat1_Name1 = dbo.table3.Name ) AND
(dbo.table1.Cat1_Name2 = dbo.table3.Name )
INNER JOIN dbo.table1 ON (dbo.table2.Cat2_Name1 = dbo.table3.Name ) AND
(dbo.table2.Cat2_Name2 = dbo.table3.Name )
Result table that I want:
Cat1Table
ID Cat1_Value
12 6.5
What I’m getting:
Cat1Table
ID Cat1_Value
12 6.5
12 6.5
Why am I getting a duplicate? Is it my logic or am I missing something else more simple? If I use SELECT DISTINCT it gives me the correct return but I'm thinking there might be a more efficient way because this will be expanded to millions of records. Wouldn't SELECT DISTINCT slow everything down?

Distinct rows from three tables using joins

I have three tables related to the article section of my website. I need to show the top authors based on the number of times each author's articles were read. I use three basic tables to store this information.
Articles has all the details related to articles, author information is stored in Authors, and when a user views a particular article I update or insert a record in Popularity.
Below is sample data:
Articles
ArticleID Title Desc AuthorID
--------- ---------------- ---- --------
1 Article One .... 100
2 Article Two .... 200
3 Article Three .... 100
4 Article Four .... 300
5 Article Five .... 100
6 Article Six .... 300
7 Article Seven .... 500
8 Article Eight .... 100
9 Article Nine .... 600
Authors
AuthorID AuthorName
-------- ------------
100 Author One
200 Author Two
300 Author Three
400 Author Four
500 Author Five
600 Author Six
Popularity
ID ArticleID Hits
-- --------- ----
1 1 20
2 2 50
3 5 100
4 3 11
5 4 21
I am trying to use following query to get the TOP 10 authors:
SELECT TOP 10 AuthorID
,au.AuthorName
,ArticleHits
,SUM(ArticleHits)
FROM Authors au
JOIN Articles ar
ON au.AuthorID = ar.ArticleAuthorID
JOIN Popularity ap
ON ap.ArticleID = ar.ArticleID
GROUP BY AuthorID,1,1,1
But this generates the following error:
Msg 164, Level 15, State 1, Line 12
Each GROUP BY expression must contain at least one column that is not an outer reference.
SQL Server requires that every column in the SELECT list either appears in the GROUP BY clause or is wrapped in an aggregate function. The following query appears to be working; as you can see, I included GROUP BY au.AuthorID, au.AuthorName, which contains both columns in the SELECT list that are not in an aggregate function:
SELECT top 10 au.AuthorID
,au.AuthorName
,SUM(Hits) TotalHits
FROM Authors au
JOIN Articles ar
ON au.AuthorID = ar.AuthorID
JOIN Popularity ap
ON ap.ArticleID = ar.ArticleID
GROUP BY au.AuthorID, au.AuthorName
order by TotalHits desc
See SQL Fiddle with Demo.
I am not sure if you want the Hits in the SELECT statement because you will then have to GROUP BY it. This could alter the Sum(Hits) for each article because if the hits are different in each entry you will not get an accurate sum.
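As a rough check of the grouped query, here is a sketch that runs it on the question's sample data using Python's built-in sqlite3 module. Two assumptions to note: SQLite has no TOP, so LIMIT 10 stands in for TOP 10, and the Desc column is left out because it is not needed here.

```python
import sqlite3

# In-memory database with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Authors (AuthorID INTEGER PRIMARY KEY, AuthorName TEXT);
CREATE TABLE Articles (ArticleID INTEGER PRIMARY KEY, Title TEXT, AuthorID INTEGER);
CREATE TABLE Popularity (ID INTEGER PRIMARY KEY, ArticleID INTEGER, Hits INTEGER);
INSERT INTO Authors VALUES (100,'Author One'),(200,'Author Two'),(300,'Author Three'),
                           (400,'Author Four'),(500,'Author Five'),(600,'Author Six');
INSERT INTO Articles VALUES (1,'Article One',100),(2,'Article Two',200),
                            (3,'Article Three',100),(4,'Article Four',300),
                            (5,'Article Five',100),(6,'Article Six',300),
                            (7,'Article Seven',500),(8,'Article Eight',100),
                            (9,'Article Nine',600);
INSERT INTO Popularity VALUES (1,1,20),(2,2,50),(3,5,100),(4,3,11),(5,4,21);
""")

# Every non-aggregated SELECT column appears in GROUP BY; Hits is only summed.
top_authors = conn.execute("""
    SELECT au.AuthorID, au.AuthorName, SUM(ap.Hits) AS TotalHits
    FROM Authors au
    JOIN Articles ar ON au.AuthorID = ar.AuthorID
    JOIN Popularity ap ON ap.ArticleID = ar.ArticleID
    GROUP BY au.AuthorID, au.AuthorName
    ORDER BY TotalHits DESC
    LIMIT 10  -- SQLite stand-in for SQL Server's TOP 10
""").fetchall()

for row in top_authors:
    print(row)
```

Authors with no rows in Popularity drop out because of the inner joins, so only three authors appear for this sample data.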
I would do it this way. First figure out who your top ten authors are, then go get the name (and any other columns you want to pull along). For this query it's not a huge difference but all that grouping can become more complex and expensive as your output list requirements increase.
;WITH TopAuthors(AuthorID, ArticleHits) AS
(
  SELECT TOP (10) a.AuthorID, SUM(p.Hits)
  FROM dbo.Authors AS a
  INNER JOIN dbo.Articles AS ar
    ON a.AuthorID = ar.AuthorID
  INNER JOIN dbo.Popularity AS p
    ON ar.ArticleID = p.ArticleID
  GROUP BY a.AuthorID -- required: AuthorID is selected next to an aggregate
  ORDER BY SUM(p.Hits) DESC
)
SELECT t.AuthorID, a.AuthorName, t.ArticleHits
FROM TopAuthors AS t
INNER JOIN dbo.Authors AS a
  ON t.AuthorID = a.AuthorID
ORDER BY t.ArticleHits DESC;
For this specific query bluefeet's version is likely to be more efficient. But if you add additional columns to the output (e.g. more info from the authors table) the grouping might outweigh the additional seek or scan I have presented.
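The "aggregate first, then fetch the names" idea can be sketched the same way. This is an illustration in SQLite via Python's sqlite3 module, not the SQL Server query itself: LIMIT replaces TOP (10), the dbo. schema prefix is dropped, and only the columns the query touches are created.

```python
import sqlite3

# In-memory database with the relevant columns of the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Authors (AuthorID INTEGER PRIMARY KEY, AuthorName TEXT);
CREATE TABLE Articles (ArticleID INTEGER PRIMARY KEY, AuthorID INTEGER);
CREATE TABLE Popularity (ID INTEGER PRIMARY KEY, ArticleID INTEGER, Hits INTEGER);
INSERT INTO Authors VALUES (100,'Author One'),(200,'Author Two'),(300,'Author Three'),
                           (400,'Author Four'),(500,'Author Five'),(600,'Author Six');
INSERT INTO Articles VALUES (1,100),(2,200),(3,100),(4,300),(5,100),
                            (6,300),(7,500),(8,100),(9,600);
INSERT INTO Popularity VALUES (1,1,20),(2,2,50),(3,5,100),(4,3,11),(5,4,21);
""")

# Step 1 (the CTE): group on the key only and keep the ten largest sums.
# Step 2 (outer query): join back to Authors to pick up display columns.
rows = conn.execute("""
    WITH TopAuthors(AuthorID, ArticleHits) AS (
        SELECT a.AuthorID, SUM(p.Hits)
        FROM Authors AS a
        INNER JOIN Articles AS ar ON a.AuthorID = ar.AuthorID
        INNER JOIN Popularity AS p ON ar.ArticleID = p.ArticleID
        GROUP BY a.AuthorID
        ORDER BY SUM(p.Hits) DESC
        LIMIT 10
    )
    SELECT t.AuthorID, a.AuthorName, t.ArticleHits
    FROM TopAuthors AS t
    INNER JOIN Authors AS a ON t.AuthorID = a.AuthorID
    ORDER BY t.ArticleHits DESC
""").fetchall()

for row in rows:
    print(row)
```

Adding more author columns to the outer SELECT does not change the GROUP BY inside the CTE, which is the point of splitting the work this way.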
Every column that appears in the SELECT list alongside an aggregate function must also be present in the GROUP BY clause. In your case, AuthorID, au.AuthorName and ArticleHits should all be present. Hence the GROUP BY clause would become
GROUP BY AuthorID, au.AuthorName, ArticleHits
This should help.

Binding literal list to SQL query as column that would return 1 row?

I have an SQL query that returns a single row. I also have a list of numbers that I need returned as individual rows, each with the single row's data attached.
For example, here's what I'm trying to do:
select a,b,c, barcode
from database
join ('12345', '67890',...) as barcode
where a=1 and b=2 and c=3
I need to do it this way because I'm modifying some code that expects a specific format to come back from the query, and changing that code to work with my literal list is far more difficult than doing something like this.
Example Output:
a b c barcode
- - - -------
1 2 3 12345
1 2 3 67890
1 2 3 ....
1 2 3 ....
...
Easiest method would be to create a barcode table with a single column, insert the values you want here one at a time, then join to that table.
Could use a union to fudge it as well. Problem with join ('484','48583',...) is you are joining to a single row with multi columns, when you want one row per record.
Pseudocode:
select a, b, c, barcode
from database
cross join (select 12345 as barcode union all select 289384 union all ...) as codes
where a=1 and b=2 and c=3
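The UNION ALL idea can be sketched end to end with Python's built-in sqlite3 module. The items table, its second row, and the codes alias are hypothetical stand-ins for the question's table; the derived table is built with one bound parameter per list entry so no literal values are spliced into the SQL text.

```python
import sqlite3

# Hypothetical stand-in for the question's table (called "database" there).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (a INTEGER, b INTEGER, c INTEGER);
INSERT INTO items VALUES (1,2,3),(4,5,6);
""")

barcodes = ["12345", "67890"]

# One "SELECT ? AS barcode" per list entry, glued together with UNION ALL,
# gives a derived table with one row per barcode.
derived = " UNION ALL ".join("SELECT ? AS barcode" for _ in barcodes)

sql = f"""
    SELECT t.a, t.b, t.c, codes.barcode
    FROM items AS t
    CROSS JOIN ({derived}) AS codes
    WHERE t.a = 1 AND t.b = 2 AND t.c = 3
    ORDER BY codes.barcode
"""

# The single matching row is repeated once per barcode.
rows = conn.execute(sql, barcodes).fetchall()
for row in rows:
    print(row)
```

A CROSS JOIN is used because there is no join condition: every barcode should be paired with the one matching row.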
Basically, you could pass the list as a single CSV string and transform it into a row set of items. A table-valued function is often used in such cases, but there are actually many options to explore. A very comprehensive collection of various methods and their tests can be found in the set of articles by Erland Sommarskog: Arrays and Lists in SQL Server.
If it was e.g. a function, your query might look like this:
SELECT
t.a,
t.b,
t.c,
s.Value AS barcode
FROM yourtable t
CROSS JOIN dbo.Split('12345,67890', ',') s
WHERE t.a = 1
AND t.b = 2
AND t.c = 3

Is the 'BETWEEN' function very expensive in SQL Server?

I'm trying to join two relatively simple tables together, but my query is experiencing serious hangups. I'm not sure why, but I think it might have something to do with the 'between' function. My first table looks something like this (with a lot of other columns, but this would be the only column I'm pulling):
RowNumber
1
2
3
4
5
6
7
8
My second table "groups" my rows into "blocks", and has the following schema:
BlockID RowNumberStart RowNumberStop
1 1 3
2 4 7
3 8 8
The desired result I'm looking to get is to link the RowNumber with the BlockID like below, with the same number of rows with the first table. So the result would look like this:
RowNumber BlockID
1 1
2 1
3 1
4 2
5 2
6 2
7 2
8 3
In order to get that, I used the following query, writing the results into a temp table:
select A.RowNumber, B.BlockID
into TEMP_TABLE
from TABLE_1 A left join TABLE_2 B
on A.RowNumber between B.RowNumberStart and B.RowNumberStop
TABLE_1 and TABLE_2 are actually very large tables. Table 1 is about 122M Rows, and TABLE_2 is about 65M rows. In TABLE_1, the RowNumber is defined as a 'bigint', and in TABLE_2, the BlockID, RowNumberStart, and RowNumberStop are all defined as 'int'. Not sure that makes a difference, but just wanted include that information, too.
The query has now been hung up for eight hours. Similar queries on this type and volume of data are not taking anywhere near this long. So I'm wondering if it could be the 'between' statement that's hanging up this query.
Definitely would welcome any suggestions on how to make this more efficient.
BETWEEN is simply shorthand for :
select A.RowNumber, B.BlockID
into TEMP_TABLE
from TABLE_1 A left join TABLE_2 B
on A.RowNumber >= B.RowNumberStart AND A.RowNumber <= B.RowNumberStop
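To confirm the equivalence on the question's sample data, here is a small sketch using Python's built-in sqlite3 module. It only checks that the two forms return the same rows; it says nothing about SQL Server's execution plan, and the in-memory tables are an assumption for illustration.

```python
import sqlite3

# In-memory copies of the two sample tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TABLE_1 (RowNumber INTEGER PRIMARY KEY);
CREATE TABLE TABLE_2 (BlockID INTEGER PRIMARY KEY,
                      RowNumberStart INTEGER, RowNumberStop INTEGER);
INSERT INTO TABLE_1 VALUES (1),(2),(3),(4),(5),(6),(7),(8);
INSERT INTO TABLE_2 VALUES (1,1,3),(2,4,7),(3,8,8);
""")

# Range join written with BETWEEN.
between_rows = conn.execute("""
    SELECT A.RowNumber, B.BlockID
    FROM TABLE_1 A LEFT JOIN TABLE_2 B
      ON A.RowNumber BETWEEN B.RowNumberStart AND B.RowNumberStop
    ORDER BY A.RowNumber
""").fetchall()

# The same join with BETWEEN expanded into two inclusive comparisons.
expanded_rows = conn.execute("""
    SELECT A.RowNumber, B.BlockID
    FROM TABLE_1 A LEFT JOIN TABLE_2 B
      ON A.RowNumber >= B.RowNumberStart AND A.RowNumber <= B.RowNumberStop
    ORDER BY A.RowNumber
""").fetchall()

print(between_rows)
print(between_rows == expanded_rows)
```

Both bounds are inclusive, which is why row 3 lands in block 1 and row 8 lands in block 3.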
If execution plan goes from B to A (but left join would indicate it has to go from A to B, really), then I'm assuming TABLE_1 is indexed on RowNumber (and that should be covering on this query). If it's only got a clustered index on RowNumber and the table is very wide, I recommend a non-clustered index only on RowNumber, since you'll fit a lot more rows per page that way.
Otherwise, you want to index on TABLE_2 on RowNumberStart DESC or RowNumberStop ASC, because for given A you'd need a DESC on RowNumberStart to match.
I think you might want to change your join to an INNER JOIN, the way your join criteria is set up. (Are you ever going to get TABLE_1 in no block?)
If you look at your execution plan, you should get more clues as to why the performance might be bad, but the Stop criteria is probably not used on the seek into TABLE_1.
Unfortunately SQLMenace's answer about the SELECT INTO has been deleted. My comment regarding that was meant to be: @Martin SELECT INTO performance isn't as bad as it once was, but I still recommend a CREATE TABLE for most production because SELECT INTO will infer types and NULLability. This is fine if you verify it is doing what you think it is doing, but creating a super long varchar or a decimal column with very strange precision can result in not only odd tables, but performance issues (especially with some of those big varchars when you forget a LEFT or whatever). I think it just helps to make it clear what you are expecting the table to look like. Often I will SELECT INTO using WHERE 0 = 1 and check out the schema and then script it with my tweaks (like adding an IDENTITY or adding a column with a timestamp default).
You have one main problem: you are trying to process too much data at once. Are you really sure you want to handle the result for ALL 122M rows from table 1 in one go? Do you really need that?

Outputting Results from complicated database structure (SQL Server)

This will be a long question so I'll try and explain it as best as I can.
I've developed a simple reporting tool in which a number of results are stored and given a report id, these results were generated from a particular quote being used on the main system, with a huge list of these being stored in a quotes table. Here are the current batch:
REPORTS
REP_ID DESC QUOTE_ID
-----------------------------------
1 Test 1
2 Today 1
3 Last Week 2
RESULTS
RES_ID TITLE REFERENCE REP_ID
---------------------------------------------------
1 Equipment Toby 1
2 Inventory Carl 1
3 Stocks Guest 2
4 Portfolios Guest 3
QUOTE
QUOTE_ID QUOTE
------------------------------------
1 Booking a meeting room
2 Car Park Policy
3 New User Guide
So far, so good, a simple stored procedure was able to pull all the information necessary.
Now, the feature list has been upped to include categories and groups of the quotes. In the Reports table quote_id has been changed to group_id to link to the following tables.
REPORTS
- REPORT_ID
- DESC
- GROUP_ID
GROUP
- GROUP_ID
- GROUP
GROUP_CAT_JOIN
- GCJ_ID
- CAT_ID
- GROUP_ID
CATEGORIES
- CAT_ID
- CATEGORY
CAT_QUOTE_JOIN
- CQJ_ID
- CAT_ID
- QUOTE_ID
The idea of these changes is so that instead of running a report on a quote I should now write a report for a group where a group is a set of quotes for certain occasions. I should also be able to run a report on a category where a category is also a set of quotes for certain departments. The trick is that several categories can fall into one group.
To explain it further, the results table has a report_id that links to reports, reports has a group_id that links to groups, groups and categories are linked through a group_cat_join table, the same with categories and quotes through a cat_quote_join table.
In basic terms I should be able to pull all the results from either a group of quotes or a category of quotes. The query will aim to pull all the results from a certain report under either a certain category, a group or both. This puzzle has left me stumped for days now as inner joins don't appear to be working and I'm struggling to find other ways to solve the problem using SQL.
Can anyone here help me?
Here's some extra clarification.
I want to be able to return all the results within a category, but as of right now the solution below and the ones I've tried always output every solution within a description, which is not what I want.
Here's an example of the data I have in there at the moment
Results
RES_ID TITLE REFERENCE REP_ID
---------------------------------------------------
1 Equipment Toby 1
2 Inventory Carl 1
3 Stocks Guest 2
4 Portfolios Guest 3
Reports
REP_ID DESC GROUP_ID
-----------------------------------
1 Test 1
2 Today 1
3 Last Week 2
GROUP
GROUP_ID GROUP
---------------------------------
1 Standard
2 Target Week
GROUP_CAT_JOIN
GCJ_ID GROUP_ID CAT_ID
----------------------------------
1 1 1
2 1 2
3 2 3
CATEGORIES
CAT_ID CAT
-------------------------------
1 York Office
2 Glasgow Office
3 Aberdeen Office
CAT_QUOTE_JOIN
CQJ_ID CAT_ID QUOTE_ID
-----------------------------------
1 1 1
2 2 2
3 3 3
QUOTE
QUOTE_ID QUOTE
------------------------------------
1 Booking a meeting room
2 Car Park Policy
3 New User Guide
This is the test data I am using at the moment and to my knowledge it is similar to what will be run through once this is done. In all honesty I'm still trying to get my head around this structure.
The result I am looking for is if I choose to search by group I'll get everything within a group, if I choose everything inside a category I get everything just inside that category, and if I choose something from a category in a group I get everything inside that category. The problem at the moment is that whenever the group is referenced everything inside every category that's linked to the group is pulled.
The following will get the necessary rows from the results:
select
    a.*
from
    results a
    inner join reports b on
        a.rep_id = b.rep_id
        and (-1 = @GroupID or
             b.group_id = @GroupID)
        and (-1 = @CatID or
             b.cat_id = @CatID)
Note that I used -1 as the placeholder for all Groups and Categories. Obviously, use a value that makes sense to you. However, this way, you can specify a specific group_id or a specific cat_id and get the results that you want.
Additionally, if you want Group/Category/Quote details, you can always append more inner joins to get that info.
Also note that I added the Group_ID and Cat_ID conditions to the Reports table. This would be the SQL necessary if and only if you add a Cat_ID column to the Reports table. I know that your current table structure doesn't support this, but it needs to. Otherwise, as my grandfather used to say, "Boy, you can't get there from here." The issue here is that you want to limit reports by group and category, but reports only knows about group. Therefore, we need to tie something to the category from reports. Otherwise, it will never, ever, ever limit reports by category. The only thing that you can limit by both group and category is quotes. And that doesn't seem to be your requirement.
As an addendum: If you add cat_id to results instead of reports, the join condition should be:
        and (-1 = @CatID or
             a.cat_id = @CatID)
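Here is a sketch of that optional-filter pattern using Python's built-in sqlite3 module. It assumes the cat_id column has been added to reports as suggested above; the cat_id values, the descr column name (standing in for DESC), and the result_ids helper are all made up for illustration. SQLite's named parameters (:group_id, :cat_id) play the role of the SQL Server variables.

```python
import sqlite3

# Sample data from the question, plus a hypothetical cat_id column on reports.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reports (rep_id INTEGER PRIMARY KEY, descr TEXT,
                      group_id INTEGER, cat_id INTEGER);
CREATE TABLE results (res_id INTEGER PRIMARY KEY, title TEXT,
                      reference TEXT, rep_id INTEGER);
INSERT INTO reports VALUES (1,'Test',1,1),(2,'Today',1,2),(3,'Last Week',2,3);
INSERT INTO results VALUES (1,'Equipment','Toby',1),(2,'Inventory','Carl',1),
                           (3,'Stocks','Guest',2),(4,'Portfolios','Guest',3);
""")

# -1 is the "no filter" sentinel: each condition is true either when the
# parameter is -1 or when the column equals the parameter.
SQL = """
    SELECT a.res_id
    FROM results a
    INNER JOIN reports b
        ON a.rep_id = b.rep_id
       AND (-1 = :group_id OR b.group_id = :group_id)
       AND (-1 = :cat_id OR b.cat_id = :cat_id)
    ORDER BY a.res_id
"""

def result_ids(group_id=-1, cat_id=-1):
    """Return result ids, filtered by group and/or category when given."""
    params = {"group_id": group_id, "cat_id": cat_id}
    return [row[0] for row in conn.execute(SQL, params)]

print(result_ids())                      # no filter
print(result_ids(group_id=1))            # group only
print(result_ids(group_id=1, cat_id=2))  # category within a group
print(result_ids(cat_id=3))              # category only
```

Choosing a group and a category together narrows the output to that category's reports, which is the behaviour the question asks for.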
Is this what you are looking for?
SELECT a.*
FROM Results a
JOIN Reports b ON a.REP_Id = b.REP_Id
WHERE EXISTS (
    SELECT * FROM CAT_QUOTE_JOIN c
    WHERE c.QUOTE_ID = b.QUOTE_ID -- correlation to the outer query
    AND c.CAT_ID = @CAT_ID -- parameterization
)
OR EXISTS (
    -- note that subquery table aliases are not visible to other subqueries
    -- so we can reuse the same letters
    SELECT * FROM CAT_QUOTE_JOIN c, GROUP_CAT_JOIN d
    WHERE c.CAT_ID = d.CAT_ID -- subquery join
    AND c.QUOTE_ID = b.QUOTE_ID -- correlation to the outer query
    AND d.GROUP_ID = @GROUP_ID -- parameterization
)
