I am trying to solve this exercise:
Under the assumption that receipts of money (inc) and payouts (out) can be registered any number of times a day for each collection point [i.e. the code column is the primary key], display a table with one corresponding row for each operating date of each collection point.
Result set: point, date, total payout per day (out), total money intake per day (inc).
Missing values are considered to be NULL.
After several hours of headbanging I found this solution online:
SELECT X.POINT,X.DATE,SUM(OUT),SUM(INC) FROM (
SELECT I.POINT,I.DATE,NULL AS OUT, SUM(I.INC) AS INC FROM INCOME I
GROUP BY I.POINT,I.DATE
UNION
SELECT O.POINT,O.DATE,SUM(O.OUT) AS OUT , NULL AS INC FROM OUTCOME O
GROUP BY O.POINT,O.DATE) AS X
GROUP BY POINT,DATE
I tried to understand how this works. I googled every variation of "NULL AS OUT" but could not find any explanation of the concept. All the results point to stored procedures, which I don't think is what I am looking for.
Can someone explain to me how these lines with "AS OUT" work, please?
SELECT I.POINT,I.DATE,NULL AS OUT, SUM(I.INC) AS INC FROM INCOME I
GROUP BY I.POINT,I.DATE
UNION
SELECT O.POINT,O.DATE,SUM(O.OUT) AS OUT , NULL AS INC FROM OUTCOME O
[Image: on the left, the complete version of both tables; on the right, the result]
out is probably an unfortunate name there, but other than that, there's nothing magical there.
"null" is the literal value of null.
"as out" assigns a column alias to the selected null.
Syntactically, this is the equivalent to any other literal value with any other alias, e.g., SELECT 'some_varchar_literal' AS some_alias or SELECT 123 AS numeric_alias.
Copying comment as answer to mark it as worked
Is this SQL Server or MySQL? As I understand it, AS OUT means the NULL is given the column name OUT, i.e. it is an alias for the column. The INCOME table has no payout column and the OUTCOME table has no income column, so each SELECT fills in the column it lacks with NULL (NULL AS OUT and NULL AS INC). When you then aggregate these columns, the NULL values are ignored.
You can use NULL AS whenever you come across a similar UNION where one of the two tables does not have a column. In that case you can create the missing column as a dummy one and use it in your code.
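To see the NULL AS alias pattern run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine. The sample rows are made up for illustration, and the column name "out" is quoted only as a precaution; the query itself follows the one in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE income  (code INTEGER PRIMARY KEY, point INTEGER, date TEXT, inc INTEGER);
    CREATE TABLE outcome (code INTEGER PRIMARY KEY, point INTEGER, date TEXT, "out" INTEGER);
    -- made-up sample data: point 1 has both kinds of rows, points 2 and 3 only one kind
    INSERT INTO income  VALUES (1, 1, '2001-03-22', 100), (2, 1, '2001-03-22', 50), (3, 2, '2001-03-23', 70);
    INSERT INTO outcome VALUES (1, 1, '2001-03-22', 30), (2, 3, '2001-03-24', 20);
""")

rows = conn.execute("""
    SELECT x.point, x.date, SUM(x."out") AS "out", SUM(x.inc) AS inc
    FROM (
        -- income rows carry a placeholder NULL in the "out" column
        SELECT i.point, i.date, NULL AS "out", SUM(i.inc) AS inc
        FROM income i
        GROUP BY i.point, i.date
        UNION
        -- outcome rows carry a placeholder NULL in the inc column
        SELECT o.point, o.date, SUM(o."out") AS "out", NULL AS inc
        FROM outcome o
        GROUP BY o.point, o.date
    ) AS x
    GROUP BY x.point, x.date
    ORDER BY x.point, x.date
""").fetchall()

for row in rows:
    print(row)
```

Both halves of the UNION must have the same number of columns, which is why each side pads the column it lacks. The outer SUM skips the NULL placeholders, and a group made only of NULLs yields NULL, which matches the "missing values are NULL" requirement.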
I have a small db for a college SQL class project. The database is a collection of information regarding scuba training.
I have a table which is CLASS and has a column IREQ char(2). This is a list of classes and the IREQ column is the instructor type required for that class.
Another table is INST with a column INSTYPE char(2). This is a table of instructor information and the INSTYPE column is the instructor's current type.
A third table is ITITLE with three columns; TNUM int, TITLE varchar (40), TABR char(2). TNUM is sequential numbers for each row for ranking. TITLE is the full name of each trainer level. TABR is the two character abbreviation of the TITLE and corresponds to INSTYPE and IREQ in the previous tables.
I need to check which instructors have a high enough trainer level to teach a given set of courses for a given month.
I have the Class selection and month with
where
CNUMBER like 'SD____'
and 7 = month(STARTDATE)
The SELECT list includes each instructor's first/alias/last names, instructor number, class number, class name, and start date, drawn from all three tables.
I've tried a subquery within a subquery but get an error requiring an 'exists' statement after the where.
I've tried a few other things but can't get the conversion to a number value for the ranking comparison. The only field common to all three tables is the 2-character abbreviation of the title, and it can't be a foreign key (that I can see). The only thing I can think of is to somehow set a value for the IREQ or TNUM fields equal to the TNUM value for the matching TABR field. The ITITLE table is exactly 8 rows and the TABR values are not in alphabetical order relative to the TNUM value. I made the table exclusively to assign a number to the TABR so I could do this and a similar comparison between another set of tables.
Thanks!
Try using window functions in SQL:
https://www.sqlshack.com/use-window-functions-sql-server/
Since you haven't provided the source of STARTDATE nor the CNUMBER, you can adapt this to your needs (add the where clause and update the columns you need):
select c.*, t1.*, i.*
from CLASS c
join ITITLE t1 on c.IREQ = t1.TABR
join ITITLE t2 on t1.TNUM <= t2.TNUM
join INST i on t2.TABR = i.INSTYPE
This will list every class with every instructor with high enough rank to teach it.
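The double join on ITITLE can be tried out with a small sketch like the one below, run through Python's sqlite3 module. The three rank rows, the INUM column, and the assumption that a higher TNUM means a higher trainer level are all hypothetical, since the question does not show the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ITITLE (TNUM INTEGER, TITLE TEXT, TABR TEXT);
    CREATE TABLE CLASS  (CNUMBER TEXT, IREQ TEXT);
    CREATE TABLE INST   (INUM INTEGER, INSTYPE TEXT);
    -- hypothetical ranks: a higher TNUM is assumed to mean a higher trainer level
    INSERT INTO ITITLE VALUES (1, 'Assistant Instructor', 'AI'),
                              (2, 'Instructor', 'IN'),
                              (3, 'Master Instructor', 'MI');
    INSERT INTO CLASS  VALUES ('SD0001', 'IN');
    INSERT INTO INST   VALUES (10, 'AI'), (11, 'IN'), (12, 'MI');
""")

qualified = conn.execute("""
    SELECT c.CNUMBER, i.INUM
    FROM CLASS c
    JOIN ITITLE t1 ON c.IREQ = t1.TABR      -- rank required by the class
    JOIN ITITLE t2 ON t1.TNUM <= t2.TNUM    -- every rank at or above that one
    JOIN INST i    ON t2.TABR = i.INSTYPE   -- instructors holding such a rank
    ORDER BY i.INUM
""").fetchall()

print(qualified)
```

The first join translates the class's abbreviation into a number, the second fans out to all ranks that are high enough, and the third maps those ranks back to instructors, so no conversion of the char(2) field is needed.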
I've been puzzling over this problem for days now, and have just identified the source of my woes - an order by clause is not working as expected.
The script goes like this:
select * from my_table
order by change_effective_date, unique_id desc
change_effective_date is a datetime field, and unique_id is an int field.
I had expected this to give me the most recent row first (i.e. the row with the highest value in change_effective_date). However, it was giving the oldest row first, and the unique_id was also in ascending order (these IDs are normally sequential, so I would generally expect them to follow the same order as the dates anyway, though this is not completely reliable).
Puzzled, I turned to Google and found that data type precedence can affect order by clauses, with lower-ranking datatypes being converted to the higher-ranking datatype: https://blog.sqlauthority.com/2010/10/08/sql-server-simple-explanation-of-data-type-precedence/
However, datetime takes precedence over int, so it shouldn't be affected in this way.
More curiously, if I take unique_id out of the order by clause, it sorts the data in descending date order perfectly. I do want to add a unique identifier to the order by clause, though, as there could be multiple rows with the same date and further on in the script I want to identify the most recent (in this case, the unique_id would be the tie-breaker as I would assume it to be sequential).
If anyone can explain what's happening here, I'd really appreciate it!
Thanks.
select * from my_table
order by change_effective_date desc, unique_id desc
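The reason this works is that ASC or DESC applies only to the single sort expression it follows, and ASC is the default for any expression without one. A minimal sketch with made-up rows, using Python's sqlite3 module as a stand-in engine, shows the difference:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (unique_id INTEGER, change_effective_date TEXT);
    -- made-up rows: ids 2 and 3 share the most recent date
    INSERT INTO my_table VALUES (1, '2020-01-01'), (2, '2020-06-01'), (3, '2020-06-01');
""")

# DESC on the last column only: dates ascend, ids descend within equal dates
mixed = [r[0] for r in conn.execute(
    "SELECT unique_id FROM my_table ORDER BY change_effective_date, unique_id DESC")]

# DESC on both columns: newest date first, highest id breaks ties
newest_first = [r[0] for r in conn.execute(
    "SELECT unique_id FROM my_table ORDER BY change_effective_date DESC, unique_id DESC")]

print(mixed, newest_first)
```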
Why are column ordinals legal for ORDER BY but not for GROUP BY? That is, can anyone tell me why this query
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY OrgUnitID
cannot be written as
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY 1
When it's perfectly legal to write a query like
SELECT OrgUnitID FROM Employee AS e ORDER BY 1
?
I'm really wondering if there's something subtle about the relational calculus, or something, that would prevent the grouping from working right.
The thing is, my example is pretty trivial. It's common that the column that I want to group by is actually a calculation, and having to repeat the exact same calculation in the GROUP BY is (a) annoying and (b) makes errors during maintenance much more likely. Here's a simple example:
SELECT DATEPART(YEAR,LastSeenOn), COUNT(*)
FROM Employee AS e
GROUP BY DATEPART(YEAR,LastSeenOn)
I would think that SQL's normalization rule of representing data only once in the database ought to extend to code as well. I'd want to write that calculation expression only once (in the SELECT column list), and be able to refer to it by ordinal in the GROUP BY.
Clarification: I'm specifically working on SQL Server 2008, but I wonder about an overall answer nonetheless.
One reason is that ORDER BY is the last thing that runs in a SQL query. Here is the order of operations:
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
So once you have the columns from the SELECT clause, you can use ordinal positioning.
EDIT, added this based on the comment
Take this for example
create table test (a int, b int)
insert test values(1,2)
go
The query below will parse without a problem, but it won't run:
select a as b, b as a
from test
order by 6
here is the error
Msg 108, Level 16, State 1, Line 3
The ORDER BY position number 6 is out of range of the number of items in the select list.
This also parses fine
select a as b, b as a
from test
group by 1
But it blows up with this error
Msg 164, Level 15, State 1, Line 3
Each GROUP BY expression must contain at least one column that is not an outer reference.
There are a lot of elementary inconsistencies in SQL, and the use of scalars is one of them. For example, anyone might expect
select * from countries
order by 1
and
select * from countries
order by 1.00001
to be similar queries (the difference between the two can be made infinitesimally small, after all), but they are not.
I'm not sure if the standard specifies if it is valid, but I believe it is implementation-dependent. I just tried your first example with one SQL engine, and it worked fine.
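As one concrete case of an engine that accepts it, SQLite treats an integer constant in GROUP BY as a position in the select list. The sketch below uses Python's sqlite3 module with a made-up Employee table; it says nothing about SQL Server, which rejects the same query as shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EmployeeID INTEGER, OrgUnitID INTEGER);
    -- made-up rows: two employees in unit 10, one in unit 20
    INSERT INTO Employee VALUES (1, 10), (2, 10), (3, 20);
""")

by_name = conn.execute(
    "SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY OrgUnitID ORDER BY 1").fetchall()

# In SQLite, GROUP BY 1 refers to the first select-list column
by_ordinal = conn.execute(
    "SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY 1 ORDER BY 1").fetchall()

print(by_name, by_ordinal)
```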
Use aliases (where your engine allows a select-list alias in GROUP BY):
SELECT DATEPART(YEAR,LastSeenOn) as 'seen_year', COUNT(*) as 'count'
FROM Employee AS e
GROUP BY seen_year
** EDIT **
if GROUP BY alias is not allowed for you, here's a solution / workaround:
SELECT seen_year
, COUNT(*) AS Total
FROM (
SELECT DATEPART(YEAR,LastSeenOn) as seen_year, *
FROM Employee AS e
) AS inline_view
GROUP
BY seen_year
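The derived-table workaround can be checked with a small sketch like this one, run through Python's sqlite3 module. SQLite has no DATEPART, so strftime('%Y', ...) is swapped in to extract the year; the Employee rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EmployeeID INTEGER, LastSeenOn TEXT);
    -- made-up rows: two employees last seen in 2007, one in 2008
    INSERT INTO Employee VALUES (1, '2007-05-01'), (2, '2007-11-30'), (3, '2008-02-14');
""")

rows = conn.execute("""
    SELECT seen_year, COUNT(*) AS Total
    FROM (
        -- the calculation is written once, here, and named
        SELECT strftime('%Y', LastSeenOn) AS seen_year
        FROM Employee
    ) AS inline_view
    GROUP BY seen_year
    ORDER BY seen_year
""").fetchall()

print(rows)
```

Because the expression is named inside the inner query, the outer GROUP BY only refers to the name, so a later change to the calculation is made in one place.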
Databases that don't support this are basically choosing not to. I understand the order in which the various steps are processed, but it is very easy (as many databases have shown) to parse the SQL, understand it, and apply the translation for you. Where it's really a pain is when a column is a long CASE expression: having to repeat that in the GROUP BY clause is super annoying. Yes, you can use the nested-query workaround that someone demonstrated above, but at this point not supporting GROUP BY column numbers is just a lack of care for your users.
I created a table, tblNewParts with 3 columns:
NewCustPart
AddedDate
Handled
and I am trying to FULL JOIN it to an existing table, tblPartsWorkedOn.
tblNewParts is defined to have Handled defaulted to 'N'...
SELECT *
FROM dbo.tblPartsWorkedOn AS BASE
FULL JOIN dbo.tblNewParts AS ADDON ON BASE.[CustPN] = ADDON.[NewCustPart]
WHERE ADDON.[Handled] IS NULL
ORDER BY [CustPN] DESC
I want the field [Handled] to come back as 'N' instead of NULL when I run the query. The problem is that when there aren't any matching records in the new table, I get NULLs instead of 'N's.
I saw a SELECT CASE WHEN col1 IS NULL THEN defaultval ELSE col1 END as a mostly suitable answer from here. I am wondering if this will work in this instance, and how would I write that in T-SQL for SQL Server 2012? I need all of the columns from both tables, rather than just the one.
I'm making this a question, rather than a comment on the cited link, so as to not obscure the original link's question.
Thank you for helping!
Name the column (alias.column_name) in select statement and use ISNULL(alias.column,'N').
Thanks
After many iterations I found the answer, it's kind of bulky but here it is anyway. Synopsis:
Yes, the CASE statement does work, but it gives the output as an unnamed column. Also, in this instance to get all of the original columns AND the corrected column, I had to use SELECT *, CASE...END as [ColumnName].
But, here is the better solution, as it will place the information into the correct column, rather than adding a column to the end of the table and calling that column 'Unnamed Column'.
Select [ID], [Seq], [Shipped], [InternalPN], [CustPN], [Line], [Status],
CASE WHEN ADDON.[NewCustPart] IS NULL THEN BASE.[CustPN] ELSE
ADDON.[NewCustPart] END as [NewCustPart],
GetDate() as [AddedDate],
CASE WHEN ADDON.[Handled] IS NULL THEN 'N' ELSE ADDON.[Handled] END as [Handled]
from dbo.tblPartsWorkedOn as BASE
full join dbo.tblNewParts as AddOn ON Base.[CustPN] = AddOn.NewCustPart
where AddOn.Handled = 'N' or AddOn.Handled is null
order by [NewCustPart] desc
This sql code places the [CustPN] into [NewCustPart] if it's null, it puts a 'N' into the field [Handled] if it's null and it assigns the date to the [AddedDate] field. It also only returns records that have not been handled, so that you get the ones that need to be looked at; and it orders the resulting output by the [NewCustPart] field value.
Resulting Output looks something like this: (I shortened the DateTime for the output here.)
[ID] [SEQ] [Shipped] [InternalPN] [CustPN] [Status] [NewCustPart] [AddedDate] [Handled]
1 12 N 10012A 10012A UP 10012A 04/02/2016 N
...
Rather than with the nulls:
[ID] [SEQ] [Shipped] [InternalPN] [CustPN] [Status] [NewCustPart] [AddedDate] [Handled]
1 12 N 10012A 10012A UP NULL NULL NULL
...
I'm leaving this up, and just answering it rather than deleting it, because I am fairly sure that someone else will eventually ask this same question. I think that lots of examples showing how and why something is done, is a very helpful thing to have as not everything can be generalized. Just some thoughts and I hope that this helps someone else!
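The CASE WHEN ... IS NULL THEN ... pattern in this thread can also be written with COALESCE, which SQL Server supports alongside ISNULL. Below is a minimal sketch using Python's sqlite3 module as a stand-in engine. It uses a LEFT JOIN rather than the FULL JOIN from the question, because older SQLite versions lack FULL JOIN, and the table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblPartsWorkedOn (CustPN TEXT);
    CREATE TABLE tblNewParts (NewCustPart TEXT, AddedDate TEXT, Handled TEXT DEFAULT 'N');
    -- made-up rows: only 10013B has a matching row in tblNewParts
    INSERT INTO tblPartsWorkedOn VALUES ('10012A'), ('10013B');
    INSERT INTO tblNewParts (NewCustPart, AddedDate, Handled) VALUES ('10013B', '2016-04-02', 'Y');
""")

rows = conn.execute("""
    SELECT base.CustPN,
           COALESCE(addon.NewCustPart, base.CustPN) AS NewCustPart,  -- fall back to CustPN
           COALESCE(addon.Handled, 'N')             AS Handled       -- fall back to 'N'
    FROM tblPartsWorkedOn AS base
    LEFT JOIN tblNewParts AS addon ON base.CustPN = addon.NewCustPart
    ORDER BY base.CustPN
""").fetchall()

print(rows)
```

The column default of 'N' only applies when a row is inserted into tblNewParts. An outer join that finds no matching row produces NULL regardless of the default, which is why the fallback has to be expressed in the query.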
I'm working on a project where we have to figure out if a given field is potentially a company name versus an address.
In taking a very broad swipe at it, we are going under the assumption that if this field contains no numbers, odds are it is a name vs. a street address (we're aiming for the 80% case, knowing some will have to be done manually).
So now to the question at hand. Given a table with, for the sake of simplicity, a single varchar(100) column, how could I find the records that have no numeric characters at any position within the field?
For example:
"Main Street, Suite 10A" --Do not return this.
"A++ Billing" --Should be returned
"XYZ Corporation" --Should be returned
"100 First Ave, Apt 20" --Should not be returned
Thanks in advance!
SQL Server allows a regex-like syntax for a range [0-9] or a set [0123456789] to be specified in a LIKE pattern, which can be combined with the any-string wildcard (%). For example:
select * from Address where StreetAddress not like '%[0-9]%';
The wildcard % at the start of the like will obviously hurt performance (Scans are likely), but in your case this seems inevitable.
Another MSDN Reference.
select * from table where column not like '%[0-9]%'
This query returns you all rows from table where column does not contain any of the digits from 0 to 9.
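The bracket character class in LIKE is specific to SQL Server. To try the same filter on the example strings from the question without a SQL Server instance, the sketch below swaps in SQLite's GLOB operator, which supports [0-9] classes with * as the any-string wildcard, run through Python's sqlite3 module. The Address table and StreetAddress column follow the names used in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Address (StreetAddress TEXT);
    INSERT INTO Address VALUES ('Main Street, Suite 10A'),
                               ('A++ Billing'),
                               ('XYZ Corporation'),
                               ('100 First Ave, Apt 20');
""")

# SQLite stand-in for T-SQL's  NOT LIKE '%[0-9]%'
no_digits = [r[0] for r in conn.execute("""
    SELECT StreetAddress
    FROM Address
    WHERE StreetAddress NOT GLOB '*[0-9]*'
    ORDER BY StreetAddress
""")]

print(no_digits)
```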
I like the simple regex approach, but for the sake of discussion will mention this alternative which uses PATINDEX.
SELECT InvoiceNumber from Invoices WHERE PATINDEX('%[0-9]%', InvoiceNumber) = 0
This worked for me:
select total_employee_count from company_table where total_employee_count like '%[^0-9]%'
This returned all rows that contain non-numeric characters, including 2-3 ..
This query lists the user tables whose names contain numeric characters:
select * from SYSOBJECTS where xtype='u' and name like '%[0-9]%'