MS SQL Server - Use Calculated Field of SELECT statement - sql-server

I would like to ask whether there is a way to reuse a calculated field within the same SELECT statement:
For example:
Table Test:
Machine Amount Value
500 20 20
SELECT Machine,
Amount*Value AS TestFormula,
TestFormula*12 AS TestFormulaYear
FROM Test
What is the correct statement to reuse this calculated field?
Thanks in advance,
Kevin

In SQL Server at least, you can do it with a subquery:
SELECT Machine
, TestFormula
, TestFormula*12 AS TestFormulaYear
FROM (
SELECT Machine
, Amount*Value AS TestFormula
FROM Test
) T

For the simple example you showed us, I would just recommend repeating the expression
SELECT
Machine,
Amount*Value AS TestFormula,
Amount*Value*12 AS TestFormulaYear
FROM Test;
Other answers have already shown how you can use a subquery to truly reuse the column, but that is not very performant compared to what I wrote above.

You can use a common-table expression (CTE) to reuse the value:
WITH formula AS (
SELECT Machine,
Amount*Value AS TestFormula
FROM Test
)
SELECT Machine,
TestFormula,
TestFormula*12 AS TestFormulaYear
FROM formula;
If the batch with the CTE contains multiple statements, the preceding statement must be terminated with a semicolon.

Assuming this is T-SQL:
You can't reference the alias of a column in the SELECT statement, no. If you look at SELECT (Transact-SQL) you'll note that the SELECT is the 8th part of the query to be processed. This means only ORDER BY is going to be able to reference a column's alias.
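To illustrate the processing-order point, here is a minimal sketch. The table variable @Test and its second row are assumptions made up for the demo; only the column names come from the question.

```sql
-- Hypothetical sample data shaped like the question's Test table.
DECLARE @Test TABLE (Machine int, Amount int, Value int);
INSERT INTO @Test VALUES (500, 20, 20), (501, 5, 3);

-- Works: ORDER BY is processed after SELECT, so the alias is visible.
SELECT Machine,
       Amount * Value AS TestFormula
FROM @Test
ORDER BY TestFormula DESC;

-- Does not work: WHERE is processed before SELECT, so the alias is unknown
-- there and SQL Server reports an invalid column name.
-- SELECT Machine,
--        Amount * Value AS TestFormula
-- FROM @Test
-- WHERE TestFormula > 100;
```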
If you need to do further calculations on a calculated value you need to use a CTE, subquery, or redeclare the calculation. For example:
Repeated calculation:
SELECT [Column] * 10 As Expression,
[Column] * 10 * 5 AS Expression2
FROM [Table];
CTE:
WITH Formula AS(
SELECT [Column] * 10 As Expression
FROM [Table])
SELECT Expression,
Expression * 5 AS Expression2
FROM Formula;
Sub Query:
SELECT Expression,
Expression * 5 AS Expression2
FROM (SELECT [Column] * 10 As Expression
FROM [Table]) Formula;

If you are looking to set up a Statement so that when formulas are changed many columns will be updated, I suppose you could declare the formulas and use Dynamic SQL. There can be an advantage to this if you want to be sure that lots of columns are updated correctly:
Declare @TestFormula as nvarchar(100) = '([Amount]*[Value])'
Declare @TestFormulaYear as nvarchar(100) = '(12*' + @TestFormula + ')'
declare @sql as nvarchar(max)
set @sql = 'SELECT [Machine], ' + @TestFormula + ' AS TestFormula, ' + @TestFormulaYear + ' AS TestFormulaYear
FROM (values(500, 20, 20)) a([Machine], [Amount], [Value])'
exec(@sql)

Related

SQL Server: Running CONTAINS against a substring of a column

I have a table table1 with an nvarchar column column1 that looks something like this:
phrase 1.1;phrase 1.2;phrase 1.3 ...
phrase 2.1;phrase 2.2;phrase 2.3 ...
...
I would like to run a CONTAINS query on only the first phrase in the column. I've tried several variations of this:
SELECT * FROM table1
WHERE CONTAINS(LEFT(table1.column1, CHARINDEX(';', table1.column1) - 1), <search query>)
Is this possible? Ideally, I'd like to do it without creating a new table or column.
Edit -- Some of the errors I'm getting:
Incorrect syntax near the keyword 'LEFT'., An expression of non-boolean type specified in a context where a condition is expected., Incorrect syntax near ";". Expecting '(', or SELECT.
The CONTAINS function is only used when the table (column?) is configured to use full-text indexing. I'm going to guess that this is not the case, here. (Apologies if it is--I have no experience with full-text indexing.)
In any case, as you are matching the first characters in the string, the more precise function LEFT should work fine:
SELECT *
FROM table1
WHERE LEFT(table1.column1, CHARINDEX(';', table1.column1) - 1) = @SearchQuery
Note that you may have problems if there are no semicolons in the string. One way to avoid that would be to guarantee there is always one present, like so:
SELECT *
FROM table1
WHERE LEFT(table1.column1 + ';', CHARINDEX(';', table1.column1 + ';') - 1) = @SearchQuery
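As a rough check of the missing-semicolon case, the sketch below uses a made-up table variable and search value (both are assumptions, not from the question). The key detail is that the semicolon has to be appended inside CHARINDEX; otherwise CHARINDEX returns 0 and LEFT receives a length of -1.

```sql
-- Hypothetical sample rows: one with semicolons, one without.
DECLARE @table1 TABLE (column1 nvarchar(100));
INSERT INTO @table1 VALUES
    (N'phrase 1.1;phrase 1.2;phrase 1.3'),
    (N'no semicolon here');

DECLARE @SearchQuery nvarchar(100) = N'no semicolon here';

-- Appending ';' inside CHARINDEX guarantees a match position of at least 1,
-- so the length passed to LEFT is never negative.
SELECT column1,
       LEFT(column1, CHARINDEX(';', column1 + ';') - 1) AS first_phrase
FROM @table1
WHERE LEFT(column1, CHARINDEX(';', column1 + ';') - 1) = @SearchQuery;
```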
So there's a couple of problems here. You can only use CONTAINS on a full-text indexed column. If your table is configured that way then great!
Your second issue is that the CONTAINS syntax is a bit clunky, and it doesn't like complexity. You could work around that using a common-table expression, e.g.:
DECLARE @table TABLE (column1 NVARCHAR(100));
INSERT INTO @table SELECT 'phrase 1.1;phrase 1.2;phrase 1.3;'
INSERT INTO @table SELECT 'phrase 2.1;phrase 2.2;phrase 2.3;'
SELECT LEFT(column1, CHARINDEX(';', column1) - 1) FROM @table;
WITH x AS (SELECT LEFT(column1, CHARINDEX(';', column1) - 1) AS search FROM @table)
SELECT * FROM x WHERE CONTAINS(x.search, 'phrase 1.2');
Noting that this won't work, because @table.column1 isn't full-text indexed. But it gets around the syntax error, and could be adapted for your case. Something like this:
WITH x AS (SELECT LEFT(column1, CHARINDEX(';', column1) - 1) AS search FROM table1)
SELECT * FROM x
WHERE CONTAINS(search, <search query>)

Unexpected output on SELECT data retrieval [duplicate]

I'm confused. How could you explain this difference in variable concatenation with ORDER BY?
declare @tbl table (id int);
insert into @tbl values (1), (2), (3);
declare @msg1 varchar(100) = '', @msg2 varchar(100) = '',
@msg3 varchar(100) = '', @msg4 varchar(100) = '';
select @msg1 = @msg1 + cast(id as varchar) from @tbl
order by id;
select @msg2 = @msg2 + cast(id as varchar) from @tbl
order by id+id;
select @msg3 = @msg3 + cast(id as varchar) from @tbl
order by id+id desc;
select TOP(100) @msg4 = @msg4 + cast(id as varchar) from @tbl
order by id+id;
select
@msg1 as msg1,
@msg2 as msg2,
@msg3 as msg3,
@msg4 as msg4;
Results
msg1 msg2 msg3 msg4
---- ---- ---- ----
123 3 1 123
As many have confirmed, this is not the right way to concatenate all the rows in a column into a variable - even though in some cases it does "work". If you want to see some alternatives, please check out this blog.
According to MSDN (applies to SQL Server 2008 through 2014 and Azure SQL Database), the SELECT should not be used to assign local variables. In the remarks, it describes how, when you do use the SELECT, it attempts to behave. The interesting points to note:
While typically it should only be used to return a single value to a variable, when the expression is the name of the column, it can return multiple values.
When the expression does return multiple values, the variable is assigned the last value that is returned.
If no value is returned, the variable retains its original value (not directly relevant here, but worth noting).
The first two points here are key - concatenation happens to work because SELECT @msg1 = @msg1 + cast(id as varchar) is essentially SELECT @msg1 += cast(id as varchar), and as the syntax notes, += is an accepted compound assignment operator on this expression. Note that you should not expect this operation to remain supported for string concatenation on VARCHAR - just because it happens to work in some situations doesn't mean it is OK for production code.
The bottom line as to the underlying reason is whether the Compute Scalar that runs on the select expression uses the original id column or an expression of the id column. You probably can't find any docs on why the optimizer might choose the specific plans for each query, but each example highlights different use cases that allow the msg value to be evaluated from the column (and therefore multiple rows being returned and concatenated) or expression (and therefore only the last value).
@msg1 is '123' because the Compute Scalar (the row-by-row evaluation of the variable assignment) occurs after the Sort. This allows the scalar computation to return multiple values on the id column, concatenating them through the += compound operator. I doubt there is specific documentation why, but it appears the optimizer chose to do the sort before the scalar computation because the order by was a column and not an expression.
@msg2 is '3' because the Compute Scalar is done before the sort, which leaves the @msg2 in each row just being the ('' + id) - so never concatenated, just the value of the id. Again, probably not any documentation why the optimizer chose this, but it appears that since the order by was an expression, perhaps it needed to do the (id+id) in the order by as part of the scalar computation before it could sort. At this point, your original column is no longer referencing the source column, but it has been replaced by an expression. Therefore, as MSDN stated, your first column points to an expression, not a column, so the behavior assigns the last value of the result set to the variable in the SELECT. Since you sorted ASC, you get '3' here.
@msg3 is '1' for the same reason as example 2, except you ordered DESC. Again, this becomes an expression in the evaluation - not the original column, so therefore the assignment gets the last value of the DESC order, so you get '1'.
@msg4 is '123' again because the TOP operation forces an initial scalar evaluation of the ORDER BY so that it can determine your top 100 records. This is different from examples 2 and 3, in which the scalar computation contained both the order by and select computations, which caused each example to be an expression and not refer back to the original column. Example 4 has the TOP separating the ORDER BY and SELECT computations, so after the SORT (TOP N SORT) is applied, it then does the scalar computation for the SELECT columns, in which at this point you are still referencing the original column (not an expression of the column), and therefore it returns multiple rows, allowing the concatenation to occur.
Sources:
MSDN: https://msdn.microsoft.com/en-us/library/ms187330.aspx
SQL Server will calculate the results, then sort them, then return them. In the case of assigning a variable, only the first result will be used to populate your variable. You are receiving the first value from the sorted result sets, which can move around the order SQL Server will scan the records as well as the position within the results.
TOP will always produce special query plans as it immediately forces SQL Server to stick to the natural ordering of the results instead of producing query plans that would statistically reduce the number of records it must read.
To explain the differences, you'll have to refer to how SQL Server decided to implicitly sort the values to optimize the query.
Query 1
Insert -> Table Insert -> Constant Scan
Query 2
SELECT -> Compute Scalar -> Sort -> Table Scan
Query 3 and 4
SELECT -> Sort -> Compute Scalar -> Table Scan
Query 5 and 6 (using TOP)
SELECT -> Compute Scalar -> Sort (Top N) -> Compute Scalar -> Table Scan
I added Query 6:
select top (100)
@msg5 = @msg5 + cast(id as varchar)
from @tbl
order by id+id desc
All I can see is there is a difference in the execution plans. They all start with SELECT and end with Table Scan. The difference is in between, the Compute Scalar and the Sort.
@msg1 has Compute Scalar then Sort. Results: 123
@msg2 has Sort then Compute Scalar. Results: 3
@msg3 has Sort then Compute Scalar. Results: 1
The fourth one is different because of the top. It still starts with select and ends with table scan, but it's different in the middle. It uses a different sort.
@msg4 has Compute Scalar then Sort (Top N Sort) then Compute Scalar
You're not supposed to set variables in a select that returns more than a single row. Consider this code:
select top 1 @msg1 = @msg1 + cast(id as varchar) from @tbl
order by id;
select top 1 @msg2 = @msg2 + cast(id as varchar) from @tbl
order by id+id;
select top 1 @msg3 = @msg3 + cast(id as varchar) from @tbl
order by id+id desc;
select top 1 @msg4 = @msg4 + cast(id as varchar) from @tbl
order by id+id;
Producing 1, 1, 3 and 1, respectively.
I'm pretty surprised it doesn't cause an exception, I was quite sure it used to forbid this outright.
The underlying point is still the same: the SQL engine isn't just executing some commands procedurally, one by one, as you might expect. It will build an execution plan that is tailored to be as efficient as possible (given many constraints).
On the other hand, assigning a variable is inherently procedural, and requires an explicit execution / evaluation order to work correctly.
You're combining the two approaches - select id from @tbl order by id is a non-procedural query, but select @id = id from @tbl order by id is a mix of the procedural @id = id, and the very much non-procedural select.
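For reference, here is a sketch of two concatenation approaches whose ordering is explicit instead of depending on the execution plan. It reuses the question's sample data. STRING_AGG assumes SQL Server 2017 or later; the FOR XML PATH variant is the older workaround.

```sql
DECLARE @tbl TABLE (id int);
INSERT INTO @tbl VALUES (1), (2), (3);

-- SQL Server 2017+ (assumption): the order is part of the aggregate itself.
SELECT STRING_AGG(CAST(id AS varchar(10)), '') WITHIN GROUP (ORDER BY id DESC) AS msg
FROM @tbl;   -- '321'

-- Older versions: the ORDER BY inside the FOR XML subquery fixes the order,
-- and the variable is assigned exactly once from a single scalar value.
DECLARE @msg varchar(100);
SELECT @msg = (SELECT CAST(id AS varchar(10))
               FROM @tbl
               ORDER BY id
               FOR XML PATH(''), TYPE).value('.', 'varchar(100)');
SELECT @msg AS msg;   -- '123'
```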

For xml path returns null instead of nothing

I thought the following query was supposed to return nothing, but instead it returns one record with a column containing null:
select *
from ( select 1 as "data"
where 0 = 1
for xml path('row') ) as fxpr(xmlcol)
If you run just the subquery - nothing is returned, but when this subquery has an outer query, performing a select on it, null is returned.
Why is that happening?
SQL Server will try to predict the type. Look at this
SELECT tbl.[IsThereAType?] + '_test'
,tbl.ThisIsINT + 100
FROM
(
SELECT NULL AS [IsThereAType?]
,3 AS ThisIsINT
UNION ALL
SELECT 'abc'
,NULL
--UNION ALL
--SELECT 1
-- ,NULL
) AS tbl;
The first column will be predicted as string type, while the second is taken as INT. That's why the + operator on top works. Try to add a number to the first or a string to the second. This will fail.
Try to uncomment the last block and it will fail too.
The prediction is done at a very early stage. Look at this, where I did include the third UNION ALL (invalid query, breaking the type):
EXEC sp_describe_first_result_set
N'SELECT *
FROM
(
SELECT NULL AS [IsThereAType?]
,3 AS ThisIsINT
UNION ALL
SELECT ''abc''
,NULL
UNION ALL
SELECT 1
,NULL
) AS tbl';
The result returns "IsThereAType?" as INT! (I'm pretty sure this is rather random and might be different on your system.)
Btw: Without this last block the type is VARCHAR(3)...
Now to your question
A naked XML is taken as NTEXT (although this is deprecated!) and needs ,TYPE to be predicted as XML:
EXEC sp_describe_first_result_set N'SELECT ''blah'' FOR XML PATH(''blub'')';
EXEC sp_describe_first_result_set N'SELECT ''blah'' FOR XML PATH(''blub''),TYPE';
The same wrapped within a sub-select returns as NVARCHAR(MAX) resp. XML
EXEC sp_describe_first_result_set N'SELECT * FROM(SELECT ''blah'' FOR XML PATH(''blub'')) AS x(y)';
EXEC sp_describe_first_result_set N'SELECT * FROM(SELECT ''blah'' FOR XML PATH(''blub''),TYPE) AS x(y)';
Well, this is a bit weird actually...
An XML is a scalar value taken as NTEXT, NVARCHAR(MAX) or XML (depending on the way you are calling it). But it is not allowed to place a naked scalar in a sub-select:
SELECT * FROM('blah') AS x(y) --fails
While this is okay
SELECT * FROM(SELECT 'blah') AS x(y)
Conclusion:
The query parser seems to be slightly inconsistent in your special case:
Although a sub-select cannot consist of one scalar value only, the SELECT ... FOR XML (which actually returns a scalar) is not rejected. The engine seems to interpret this as a SELECT returning a scalar value. And this is perfectly okay.
This is useful with nested sub-selects as a column (correlated sub-queries) to nest XML:
SELECT TOP 5 t.TABLE_NAME
,(
SELECT COLUMN_NAME,DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS AS c
WHERE c.TABLE_SCHEMA=t.TABLE_SCHEMA
AND c.TABLE_NAME=t.TABLE_NAME
FOR XML PATH('Column'),ROOT('Columns'),TYPE
) AS AllTablesColumns
FROM INFORMATION_SCHEMA.TABLES AS t;
Without the FOR XML clause this would fail (...more than one value... / ...Only one column...)
Pass a generic SELECT as a parameter?
Some would say this is not possible, but you can try this:
CREATE FUNCTION dbo.TestType(@x XML)
RETURNS TABLE
AS
RETURN
SELECT @x AS BringMeBack;
GO
--The SELECT must be wrapped in parentheses!
SELECT *
FROM dbo.TestType((SELECT TOP 5 * FROM sys.objects FOR XML PATH('x'),ROOT('y')));
GO
DROP FUNCTION dbo.TestType;
Empty XML Data is treated as NULL in SQL Server.
select *
from ( select 1 as "data"
where 0 = 1
for xml path('row') ) as fxpr(xmlcol)
The subquery is executed first, and its result (an empty rowset) is then converted to XML, which yields a NULL value.
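If the goal is to get no row at all in that case, one simple workaround (a sketch, not something the answers above propose) is to filter the NULL out in the outer query:

```sql
-- The derived table yields a single scalar, which is NULL when the inner
-- SELECT produces no rows; the outer WHERE then removes that one row.
SELECT fxpr.xmlcol
FROM ( SELECT 1 AS [data]
       WHERE 0 = 1
       FOR XML PATH('row') ) AS fxpr(xmlcol)
WHERE fxpr.xmlcol IS NOT NULL;
```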

SQL Server Regular expression extract pattern from DB column

I have a question about SQL Server: I have a database column with a pattern which is like this:
up to 10 digits
then a comma
up to 10 digits
then a semicolon
e.g.
100000161, 100000031; 100000243, 100000021;
100000161, 100000031; 100000243, 100000021;
and I want to extract within the pattern the first digits (up to 10) (1.) and then a semicolon (4.)
(or, in other words, remove everything from the semicolon to the next semicolon)
100000161; 100000243; 100000161; 100000243;
Can you please advise me how to do this in SQL Server? I'm not very familiar with regex and therefore have no clue how to fix this.
Thanks,
Alex
Try this
Declare @Sql Table (SqlCol nvarchar(max))
INSERT INTO @Sql
SELECT '100000161,100000031;100000243,100000021;100000161,100000031;100000243,100000021;'
;WITH cte
AS (SELECT Row_number()
OVER(
ORDER BY (SELECT NULL)) AS Rno,
split.a.value('.', 'VARCHAR(1000)') AS Data
FROM (SELECT Cast('<S>'
+ Replace( Replace(sqlcol, ';', ','), ',',
'</S><S>')
+ '</S>' AS XML) AS Data
FROM @Sql) AS A
CROSS apply data.nodes('/S') AS Split(a))
SELECT Stuff((SELECT '; ' + data
FROM cte
WHERE rno%2 <> 0
AND data <> ''
FOR xml path ('')), 1, 2, '') AS ExpectedData
ExpectedData
-------------
100000161; 100000243; 100000161; 100000243
I believe this will get you what you are after as long as that pattern truly holds. If not it's fairly easy to ensure it does conform to that pattern and then apply this
Select Substring(TargetCol, 1, 10) + ';' From TargetTable
You can take advantage of SQL Server's XML support to convert the input string into an XML value and query it with XQuery and XPath expressions.
For example, the following query replaces each ; with </b><a> and each , with </a><b>, so that once .query('a') keeps only the <a> elements, each string becomes <a>100000161</a><a>100000243</a><a />. After that, you can select individual <a> nodes with /a[1], /a[2]:
declare @table table (it nvarchar(200))
insert into @table values
('100000161, 100000031; 100000243, 100000021;'),
('100000161, 100000031; 100000243, 100000021;')
select
xCol.value('/a[1]','nvarchar(200)'),
xCol.value('/a[2]','nvarchar(200)')
from (
select convert(xml, '<a>'
+ replace(replace(replace(it,';','</b><a>'),',','</a><b>'),' ','')
+ '</a>')
.query('a') as xCol
from @table) as tmp
-------------------------
A1 A2
100000161 100000243
100000161 100000243
value extracts a single value from an XML field. nodes returns a table of nodes that match the XPath expression. The following query will return all "keys" :
select
a.value('.','nvarchar(200)')
from (
select convert(xml, '<a>'
+ replace(replace(replace(it,';','</b><a>'),',','</a><b>'),' ','')
+ '</a>')
.query('a') as xCol
from @table) as tmp
cross apply xCol.nodes('a') as y(a)
where a.value('.','nvarchar(200)')<>''
------------
100000161
100000243
100000161
100000243
With 200K rows of data though, I'd seriously consider transforming the data when loading it and storing it in individual, indexable columns, or adding a separate, related table. Applying string manipulation functions on a column means that the server can't use any covering indexes to speed up queries.
If that's not possible (why?) I'd consider at least adding a separate XML-typed column that would contain the same data in XML form, to allow the creation of an XML index.
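On newer versions there is also a way to do this without XML. The sketch below assumes SQL Server 2016 or later with database compatibility level 130+, because it relies on STRING_SPLIT; the table variable and its single row are sample data made up from the question. STRING_SPLIT does not guarantee output order, so this only fits when the order of the extracted numbers does not matter.

```sql
-- Hypothetical sample row in the question's "number, number; ..." format.
DECLARE @table TABLE (it nvarchar(200));
INSERT INTO @table VALUES (N'100000161, 100000031; 100000243, 100000021;');

-- Split on ';' to get one "first, second" pair per row, then keep the part
-- before the comma. Appending ',' inside CHARINDEX keeps the length passed
-- to LEFT non-negative even for the empty piece after the trailing ';'.
SELECT LTRIM(LEFT(s.value, CHARINDEX(',', s.value + ',') - 1)) AS first_number
FROM @table AS t
CROSS APPLY STRING_SPLIT(t.it, ';') AS s
WHERE LTRIM(s.value) <> '';
```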

How to add column dynamically in where clause

I want to include column in where clause depending on the condition.
e.g
select * From emp
where id=7,
and if(code is not null) then code=8;
how can i do this in sql server
If I understand you correctly, you could make use of COALESCE.
COALESCE()
Returns the first nonnull expression
among its arguments.
SQL Statement
SELECT *
FROM emp
WHERE id=7
AND code = COALESCE(@code, code)
If code is a column rather than a variable the query in your question would be rewritten as follows.
SELECT *
FROM emp
WHERE id=7 AND (code IS NULL OR code=8)
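One caveat with the COALESCE form: when the variable is NULL, the predicate becomes code = code, which is not true for rows whose code is NULL, so those rows get dropped. A sketch of the usual optional-filter pattern that keeps them follows; the table variable @emp and its rows are made up for the demo.

```sql
-- Hypothetical sample data for the emp table.
DECLARE @emp TABLE (id int, code int);
INSERT INTO @emp VALUES (7, 8), (7, NULL), (7, 9);

DECLARE @code int = NULL;   -- NULL means "do not filter on code"

-- With @code = NULL all three rows come back, including the NULL code row.
-- With @code = 8 only the (7, 8) row comes back.
SELECT *
FROM @emp
WHERE id = 7
  AND (@code IS NULL OR code = @code);
```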
You'll probably have to create a query dynamically, as a string, and then use the Execute method to actually execute it. This approach has some potential optimization issues, but it's commonly done. You might want to Google T-SQL Dynamic Query, or something like that.
Also use this in case of a NULL value in @var1.
Select * from ABC where Column1 = isNull(@var1, Column1)
Here is the example:
declare @SQL varchar(500)
declare @var1 int
set @var1 = 1
set @SQL = 'Select * from ABC Where 1 = 1'
if(@var1 = 1)
set @SQL = @SQL + ' And column1 = ' + cast(@var1 as varchar(10))
exec(@SQL)
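If you do go the dynamic route, it is generally safer to pass the value as a parameter through sp_executesql than to concatenate it into the string. A minimal sketch follows; the temp table #ABC stands in for the ABC table above and @p1 is a made-up parameter name.

```sql
-- Hypothetical stand-in for the ABC table. A temp table is used because
-- dynamic SQL cannot see table variables from the calling batch.
CREATE TABLE #ABC (Column1 int);
INSERT INTO #ABC VALUES (1), (2);

DECLARE @var1 int = 1;
DECLARE @SQL nvarchar(500) = N'SELECT * FROM #ABC WHERE 1 = 1';

-- Add the predicate only when a value was supplied; the value itself is
-- passed as a parameter, never spliced into the SQL text.
IF @var1 IS NOT NULL
    SET @SQL = @SQL + N' AND Column1 = @p1';

EXEC sp_executesql @SQL, N'@p1 int', @p1 = @var1;

DROP TABLE #ABC;
```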
You can use COALESCE function.
Well,
I don't know if I understood your question, but I guess that you want to include the value of the code column in the results.
If I'm right, it can be done in the select part instead of the where clause, i.e.
Select ..., case when code is not null then 8 else code end as code from emp where id = 7
The other interpretation is that you want to filter out rows where code is set to something other than 8; that would be
Select * from emp where id = 7 and (code is null OR code = 8)
