After executing a query on a table, we need to use its result via TABLE(RESULT_SCAN(LAST_QUERY_ID())). However, we are running into a complication because the column names coming back from TABLE(RESULT_SCAN(LAST_QUERY_ID())) are in lower case.
Is aliasing the only option to get the column names in upper case? We want to avoid aliasing, because the double quoting around column names increases the complexity of the logic.
The case of the column names in the result set depends on the case that was used in the previous query.
Remember that Snowflake treats every identifier as UPPERCASE unless it is double quoted. So all of the queries below return the same column name, MY_COLUMN:
SELECT my_column FROM test;
SELECT MY_column FROM test;
SELECT MY_COLUMN FROM test;
However, if you double quote the column name, its case will be preserved:
SELECT MY_column as "my_column" FROM test;
So the query above forces the returned column name to be lower case.
See test below:
create or replace table test (my_column int);
insert into test values (1);
select my_column from test;
SELECT my_column from TABLE(RESULT_SCAN(LAST_QUERY_ID()));
+-----------+
| MY_COLUMN |
|-----------|
| 1 |
+-----------+
SELECT MY_column FROM test;
SELECT my_column from TABLE(RESULT_SCAN(LAST_QUERY_ID()));
+-----------+
| MY_COLUMN |
|-----------|
| 1 |
+-----------+
SELECT MY_COLUMN FROM test;
SELECT my_column from TABLE(RESULT_SCAN(LAST_QUERY_ID()));
+-----------+
| MY_COLUMN |
|-----------|
| 1 |
+-----------+
SELECT MY_column as "my_column" FROM test;
SELECT my_column from TABLE(RESULT_SCAN(LAST_QUERY_ID()));
000904 (42000): SQL compilation error: error line 1 at position 7
invalid identifier 'MY_COLUMN'
SELECT MY_column as "my_column" FROM test;
SELECT MY_COLUMN from TABLE(RESULT_SCAN(LAST_QUERY_ID()));
000904 (42000): SQL compilation error: error line 1 at position 7
invalid identifier 'MY_COLUMN'
SELECT MY_column as "my_column" FROM test;
SELECT "my_column" from TABLE(RESULT_SCAN(LAST_QUERY_ID()));
+-----------+
| my_column |
|-----------|
| 1 |
+-----------+
So, if I understand your question correctly, to avoid aliasing in the query against RESULT_SCAN you will need to ensure the column names in the previous query already match what you are after. Otherwise, you will have to add an alias to force a different case.
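For completeness, if the previous query must emit the lower-case name and you still want an upper-case column out of RESULT_SCAN, an alias on the RESULT_SCAN side works as well (a sketch against the same test table; the unquoted alias folds back to MY_COLUMN):
SELECT MY_column as "my_column" FROM test;
SELECT "my_column" AS my_column from TABLE(RESULT_SCAN(LAST_QUERY_ID()));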
I thought this was a simple task, but I have been struggling with it for a couple of hours :-(
I want to get the list of column names of a table, together with their data types and the values contained in the columns, but I have no idea how to bind the table itself to get the current values:
DECLARE @TTab TABLE
(
fieldName nvarchar(128),
dataType nvarchar(64),
currentValue nvarchar(128)
)
INSERT INTO @TTab (fieldName,dataType)
SELECT
i.COLUMN_NAME,
i.DATA_TYPE
FROM
INFORMATION_SCHEMA.COLUMNS i
WHERE
i.TABLE_NAME = 'Users'
Expected result:
+------------+----------+---------------+
| fieldName | dataType | currentValue |
+------------+----------+---------------+
| userName | nvarchar | John |
| active | bit | true |
| age | int | 43 |
| balance | money | 25.20 |
+------------+----------+---------------+
In general the answer is: no, this is impossible. But there is a hack using text-based containers like XML or JSON (SQL Server 2016+):
--Let's create a test table with some rows
CREATE TABLE dbo.TestGetMetaData(ID INT IDENTITY,PreName VARCHAR(100),LastName NVARCHAR(MAX),DOB DATE);
INSERT INTO dbo.TestGetMetaData(PreName,LastName,DOB) VALUES
('Tim','Smith','20000101')
,('Tom','Blake','20000202')
,('Kim','Black','20000303')
GO
--Here's the query
SELECT C.colName
,C.colValue
,D.*
FROM
(
SELECT t.* FROM dbo.TestGetMetaData t
WHERE t.Id=2
FOR XML PATH(''),TYPE
) A(rowSet)
CROSS APPLY A.rowSet.nodes('*') B(col)
CROSS APPLY(VALUES(B.col.value('local-name(.)','nvarchar(500)')
,B.col.value('text()[1]', 'nvarchar(max)'))) C(colName,colValue)
LEFT JOIN INFORMATION_SCHEMA.COLUMNS D ON D.TABLE_SCHEMA='dbo'
AND D.TABLE_NAME='TestGetMetaData'
AND D.COLUMN_NAME=C.colName;
GO
--Clean-up (careful with real data)
DROP TABLE dbo.TestGetMetaData;
GO
Part of the result
+----------+------------+-----------+--------------------------+-------------+
| colName | colValue | DATA_TYPE | CHARACTER_MAXIMUM_LENGTH | IS_NULLABLE |
+----------+------------+-----------+--------------------------+-------------+
| ID | 2 | int | NULL | NO |
+----------+------------+-----------+--------------------------+-------------+
| PreName | Tom | varchar | 100 | YES |
+----------+------------+-----------+--------------------------+-------------+
| LastName | Blake | nvarchar | -1 | YES |
+----------+------------+-----------+--------------------------+-------------+
| DOB | 2000-02-02 | date | NULL | YES |
+----------+------------+-----------+--------------------------+-------------+
The idea in short:
Using FOR XML PATH(''),TYPE will create an XML document representing your SELECT's result set.
The big advantage with this: each XML element carries the column's name.
We can use a CROSS APPLY to get each column's name and value.
Now we can JOIN in the metadata from INFORMATION_SCHEMA.COLUMNS.
One hint: all values will actually come back as nvarchar(max).
The values being strings might lead to unexpected results due to implicit conversions, or to trouble with BLOBs.
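If that string typing becomes a problem, an explicit conversion is one way out. A small illustrative sketch (TRY_CONVERT exists from SQL Server 2012 onwards and returns NULL instead of raising an error when the cast fails):
SELECT TRY_CONVERT(date, N'2000-02-02')   AS dob_as_date
      ,TRY_CONVERT(int,  N'not a number') AS failed_cast_is_null;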
UPDATE
The following query wouldn't even need to specify the table's name in the JOIN:
SELECT C.colName
,C.colValue
,D.DATA_TYPE,D.CHARACTER_MAXIMUM_LENGTH,IS_NULLABLE
FROM
(
SELECT * FROM dbo.TestGetMetaData
WHERE Id=2
FOR XML AUTO,TYPE
) A(rowSet)
CROSS APPLY A.rowSet.nodes('/*/@*') B(attr)
CROSS APPLY(VALUES(A.rowSet.value('local-name(/*[1])','nvarchar(500)')
,B.attr.value('local-name(.)','nvarchar(500)')
,B.attr.value('.', 'nvarchar(max)'))) C(tblName,colName,colValue)
LEFT JOIN INFORMATION_SCHEMA.COLUMNS D ON CONCAT(D.TABLE_SCHEMA,'.',D.TABLE_NAME)=C.tblName
AND D.COLUMN_NAME=C.colName;
Why?
Using FOR XML AUTO produces attribute-centered XML: the element's name is the table's name, while the values sit in attributes.
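For illustration, running the inner query on its own shows the shape (the exact element name can vary, but with the two-part table name above it is roughly what the CONCAT in the JOIN expects):
SELECT * FROM dbo.TestGetMetaData WHERE ID=2 FOR XML AUTO;
--> something like: <dbo.TestGetMetaData ID="2" PreName="Tom" LastName="Blake" DOB="2000-02-02"/>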
UPDATE 2
Fully generic function:
CREATE FUNCTION dbo.GetRowWithMetaData(@input XML)
RETURNS TABLE
AS
RETURN
SELECT C.colName
,C.colValue
,D.*
FROM @input.nodes('/*/@*') B(attr)
CROSS APPLY(VALUES(@input.value('local-name(/*[1])','nvarchar(500)')
,B.attr.value('local-name(.)','nvarchar(500)')
,B.attr.value('.', 'nvarchar(max)'))) C(tblName,colName,colValue)
LEFT JOIN INFORMATION_SCHEMA.COLUMNS D ON CONCAT(D.TABLE_SCHEMA,'.',D.TABLE_NAME)=C.tblName
AND D.COLUMN_NAME=C.colName;
--You call it like this (see the extra parentheses!)
SELECT * FROM dbo.GetRowWithMetaData((SELECT * FROM dbo.TestGetMetaData WHERE ID=2 FOR XML AUTO));
As you see, the function does not even have to know anything about the table in advance...
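As mentioned at the start, JSON (2016+) can serve as the container as well. Here is a sketch of that flavour (the aliases are mine and it is less battle-tested than the XML version): FOR JSON PATH turns the row into key/value pairs and OPENJSON reads them back.
SELECT J.[key]   AS colName
      ,J.[value] AS colValue
      ,D.DATA_TYPE
      ,D.CHARACTER_MAXIMUM_LENGTH
      ,D.IS_NULLABLE
FROM
(
    SELECT (SELECT t.* FROM dbo.TestGetMetaData t
            WHERE t.ID=2
            FOR JSON PATH, WITHOUT_ARRAY_WRAPPER) AS js
) A
CROSS APPLY OPENJSON(A.js) J
LEFT JOIN INFORMATION_SCHEMA.COLUMNS D ON D.TABLE_SCHEMA='dbo'
                                      AND D.TABLE_NAME='TestGetMetaData'
                                      AND D.COLUMN_NAME=J.[key];
--Note: FOR JSON skips NULL values unless INCLUDE_NULL_VALUES is specified.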
I've seen a few questions like this - Count NULL Values from multiple columns with SQL
But is there really not a way to count nulls in a table with say, over 30 columns? Like I don't want to specify them all by name?
I don't understand why it's so difficult; it's like one line in pandas.
The key point here is that if something is not provided "batteries included", you need to write your own version. It is not as hard as it may look.
Let's say the input table is as follow:
CREATE OR REPLACE TABLE t AS SELECT $1 AS col1, $2 AS col2, $3 AS col3, $4 AS col4
FROM VALUES (1,2,3,10),(NULL,2,3,10),(NULL,NULL,4,10),(NULL,NULL,NULL,10);
SELECT * FROM t;
/*
+------+------+------+------+
| COL1 | COL2 | COL3 | COL4 |
+------+------+------+------+
| 1 | 2 | 3 | 10 |
| NULL | 2 | 3 | 10 |
| NULL | NULL | 4 | 10 |
| NULL | NULL | NULL | 10 |
+------+------+------+------+
*/
You probably know how to write the query that gives the desired output, but as it was not provided in the question, I will use my own version:
WITH cte AS (
SELECT
COUNT(*) AS total_rows
,total_rows - COUNT(col1) AS col1
,total_rows - COUNT(col2) AS col2
,total_rows - COUNT(col3) AS col3
,total_rows - COUNT(col4) AS col4
FROM t
)
SELECT COLUMN_NAME, NULLS_COLUMN_COUNT,SUM(NULLS_COLUMN_COUNT) OVER() AS NULLS_TOTAL_COUNT
FROM cte
UNPIVOT (NULLS_COLUMN_COUNT FOR COLUMN_NAME IN (col1,col2,col3, col4))
ORDER BY COLUMN_NAME;
/*
+-------------+--------------------+-------------------+
| COLUMN_NAME | NULLS_COLUMN_COUNT | NULLS_TOTAL_COUNT |
+-------------+--------------------+-------------------+
| COL1 | 3 | 6 |
| COL2 | 2 | 6 |
| COL3 | 1 | 6 |
| COL4 | 0 | 6 |
+-------------+--------------------+-------------------+
*/
Here we can see that the query is "static" in nature, with a few moving parts (column_count_list / table_name / column_list):
WITH cte AS (
SELECT
COUNT(*) AS total_rows
<column_count_list>
FROM <table_name>
)
SELECT COLUMN_NAME, NULLS_COLUMN_COUNT,SUM(NULLS_COLUMN_COUNT) OVER() AS NULLS_TOTAL_COUNT
FROM cte
UNPIVOT (NULLS_COLUMN_COUNT FOR COLUMN_NAME IN (<column_list>))
ORDER BY COLUMN_NAME;
Now using the metadata and variables:
-- input
SET sch_name = 'my_schema';
SET tab_name = 't';
SELECT
LISTAGG(c.COLUMN_NAME, ', ') WITHIN GROUP(ORDER BY c.COLUMN_NAME) AS column_list
,ANY_VALUE(c.TABLE_SCHEMA || '.' || c.TABLE_NAME) AS full_table_name
,LISTAGG(REPLACE(SPACE(6) || ',total_rows - COUNT(<col_name>) AS <col_name>'
|| CHAR(13)
, '<col_name>', c.COLUMN_NAME), '')
WITHIN GROUP(ORDER BY COLUMN_NAME) AS column_count_list
,REPLACE(REPLACE(REPLACE(
'WITH cte AS (
SELECT
COUNT(*) AS total_rows
<column_count_list>
FROM <table_name>
)
SELECT COLUMN_NAME, NULLS_COLUMN_COUNT,SUM(NULLS_COLUMN_COUNT) OVER() AS NULLS_TOTAL_COUNT
FROM cte
UNPIVOT (NULLS_COLUMN_COUNT FOR COLUMN_NAME IN (<column_list>))
ORDER BY COLUMN_NAME;'
,'<column_count_list>', column_count_list)
,'<table_name>', full_table_name)
,'<column_list>', column_list) AS query_to_run
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE TABLE_SCHEMA = UPPER($sch_name)
AND TABLE_NAME = UPPER($tab_name);
Running the code above generates the query to be run.
Copying that output and running it produces the desired result. This template could be further refined and wrapped in a stored procedure if needed (but I will leave that as an exercise).
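If you do want the stored-procedure flavour, here is a rough sketch using Snowflake Scripting and EXECUTE IMMEDIATE (an anonymous block; the MY_SCHEMA.T names are just the example above, and depending on your client you may need to wrap the block in EXECUTE IMMEDIATE $$ ... $$):
DECLARE
  col_list   STRING;
  count_list STRING;
  q          STRING;
  rs         RESULTSET;
BEGIN
  -- build the two moving parts from the metadata
  SELECT LISTAGG(column_name, ', ') WITHIN GROUP(ORDER BY column_name)
        ,LISTAGG(',total_rows - COUNT(' || column_name || ') AS ' || column_name, ' ')
           WITHIN GROUP(ORDER BY column_name)
    INTO :col_list, :count_list
    FROM information_schema.columns
   WHERE table_schema = 'MY_SCHEMA'
     AND table_name   = 'T';
  -- assemble the same query as the template above and run it
  q := 'WITH cte AS (SELECT COUNT(*) AS total_rows ' || count_list || ' FROM my_schema.t) '
    || 'SELECT column_name, nulls_column_count, SUM(nulls_column_count) OVER() AS nulls_total_count '
    || 'FROM cte UNPIVOT (nulls_column_count FOR column_name IN (' || col_list || ')) '
    || 'ORDER BY column_name';
  rs := (EXECUTE IMMEDIATE :q);
  RETURN TABLE(rs);
END;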
@chris, you should note that the metadata in Snowflake is similar to SQL Server's. So anything you want to know at the metadata level has most likely already been solved by SQL Server practitioners.
See this link - Count number of NULL values in each column in SQL
This is different in Oracle, where the metadata views give the number of NULLs in each column as well as the density.
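For reference, the Oracle query is roughly this (schema and table names are illustrative, and the columns are only populated once optimizer statistics have been gathered):
SELECT column_name, num_nulls, num_distinct, density
FROM   all_tab_columns
WHERE  owner = 'MY_SCHEMA'
  AND  table_name = 'T';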
Because it is too complicated to solve this problem without real data, I will try to add some:
                | table 1            | table 2            | ... | table n
------------------------------------------------------------------------------------------
column_names:   | name | B | C | D   | name | B | C | D   | ... | name | B | C | D
------------------------------------------------------------------------------------------
column_content: | John | ...         | Ben  | ...         | ... | John | ...
The objective is to extract the rows from the N tables where name = 'John'.
We already have a table called [table_names] with the n table names stored in the column [column_table_name].
Now we want to do something like this:
SELECT [name]
FROM (SELECT [table_name]
FROM INFORMATION_SCHEMA.TABLES)
WHERE [name] = 'John'
Table names are dynamic and thus unknown until we run the INFORMATION_SCHEMA.TABLES query.
This final query is giving me an error. Any clue about how to use multiple stored table names in a subquery?
You need to alias your subquery in order to reference it. Also, [name] should be [table_name]:
SELECT [table_name]
FROM (SELECT [table_name]
FROM INFORMATION_SCHEMA.TABLES) AS X
WHERE [table_name] = 'John'
Scenario:
For whatever reason, you discover a table with columns that contain the literal characters 'NULL' but they should actually contain a real NULL. So you decide to update the 'NULL' to NULL. To wit:
IF OBJECT_ID('tempdb..#my_nulls') IS NOT NULL DROP TABLE #my_nulls;
SELECT 1 as id
,'NULL' as my_column
INTO #my_nulls
UNION ALL
SELECT 2 as id
,'to the man with a hammer' as my_column
UNION ALL
SELECT 3 as id
,'everything looks like a' as my_column
UNION ALL
SELECT 4 as id
,'NULL' as my_column
--------- -------------------------------------------------------------------------------------------------
SELECT id
,my_column
,replace(my_column,'NULL',NULL) as nullified
FROM #my_nulls
Here's the possibly surprising answer:
id | my_column | nullified
---- |------------------------- |-------------
1 | NULL | NULL
2 | to the man with a hammer | NULL
3 | everything looks like a | NULL
4 | NULL | NULL
So why have rows 2 and 3 been converted to NULL when they clearly don't contain the string 'NULL'?
I think I know the answer to this but I'm asking for a few reasons:
I could easily be wrong :-)
I can't see this specific question/answer with reference to replace() on Stack Overflow, AND
Others may be scratching their heads too AND
I didn't read BOL properly the first time!
So what I eventually discovered is that the reason the strings are getting converted to NULLs is that NULL is an undefined value. You can't create knowledge out of nothing.
Anything compared with NULL returns NULL, and the REPLACE function is no exception. BOL says:
"...Returns NULL if any one of the arguments is NULL...."
To get around this problem I replaced REPLACE with the NULLIF expression. So:
SELECT id
,my_column
,nullif(my_column,'NULL') as nullified
FROM #my_nulls
Returns:
id | my_column | nullified
-----|--------------------------|-------------------------
1 | NULL | NULL
2 | to the man with a hammer | to the man with a hammer
3 | everything looks like a | everything looks like a
4 | NULL | NULL
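And since the original scenario was to fix the data rather than just select around it, the same idea works in an UPDATE (illustrative, against the temp table above):
--Equivalent to SET my_column = NULLIF(my_column, 'NULL') across the whole table
UPDATE #my_nulls
SET my_column = NULL
WHERE my_column = 'NULL';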
I have a complex SQL query with several joins and subqueries. I want to know if there is a simple way to make a GROUP BY clause generate several groups for the same value of the grouping fields, when those fields hold some special value.
For example, if I am grouping by fieldA and fieldB, I get one group for each distinct pair of fieldA and fieldB values, but when fieldA takes a special constant value like "specialValue", I want a different group for each record.
For example, if I have this records:
fieldA | fieldB | fieldC
_______________________
val1 | val2 | 1
val1 | val2 | 2
val2 | valx | 3
val2 | valx | 4
specialValue | vala | 5
specialValue | vala | 6
Selecting (fieldA, fieldB, max(fieldC)), grouping by (fieldA, fieldB) but ignoring "specialValue" in fieldA, I would obtain the following results:
fieldA | fieldB | fieldC
_______________________
val1 | val2 | 2
val2 | valx | 4
specialValue | vala | 5 <-- Two rows
specialValue | vala | 6 <--
I want to get this in the simplest possible way, if possible without joins or subqueries, because the query is already too complex.
Thanks
What about
SELECT fieldA, fieldB, MAX(fieldC)
FROM table
WHERE fieldA <> 'specialValue'
GROUP BY fieldA, fieldB
UNION ALL
SELECT fieldA, fieldB, fieldC
FROM table
WHERE fieldA = 'specialValue'
?
Or alternatively with a subselect, although the availability of ROW_NUMBER and the OVER clause will depend on your version of SQL Server.¹
SELECT fieldA, fieldB, MAX(fieldC)
FROM (
SELECT fieldA, fieldB, fieldC,
case when fieldA = 'specialValue' then
ROW_NUMBER() over (ORDER BY fieldA)
end as rownum
FROM t
) as subselect
GROUP BY rownum, fieldA, fieldB
¹ 2008 to present - source
You may want to look into using cube and/or rollup.
Does the table have a primary key column that is different from these? You could do:
group by case when fieldA = 'specialValue' then primarykey else fieldA end
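A sketch of how that could look in full (table and column names are illustrative; the CAST is needed so both CASE branches have a compatible string type):
SELECT MAX(fieldA) AS fieldA, fieldB, MAX(fieldC) AS fieldC
FROM t
GROUP BY CASE WHEN fieldA = 'specialValue'
              THEN CAST(primarykey AS varchar(30))
              ELSE fieldA END
        ,fieldB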