SQL Count Records by Columns (Data Points) - sql-server

I am pretty familiar with the COUNT function in SQL, but I don't know how to use it to achieve the results I'm looking for.
I manage a data warehouse with routine loads and I would like to report the number of raw data points (defined by #records * #columns) each time I load records into a table.
I can use the COUNT function to get the #records in the formula, but how can I dynamically count columns in a table? From there I can do the math.
Thanks!

You can query the schema views where SQL Server stores table metadata.
SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'MyTable'
SELECT COUNT(*) FROM MyTable
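If you want the raw data point count as a single figure, a minimal sketch along these lines should work (MyTable is just a placeholder for the table you loaded):
-- rows * columns = raw data points; 'MyTable' is a placeholder table name
SELECT
(SELECT COUNT(*) FROM MyTable)
* (SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'MyTable')
AS RawDataPoints;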

Related

Query to find if a list of columns exists in the SQL Server database

I have a list of columns. I want to build a query to find whether those columns exist in the ServicingDB database.
I would also like to be able to filter for tables starting with abcd (for example).
Thanks in advance.
You can use the query below to find a specific column, along with the database name, table name, etc.
select * from information_schema.columns where column_name = 'yourColumnName'
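To cover the filters mentioned in the question, a sketch like this should work (the column names in the IN list and the abcd prefix are placeholders):
-- adjust the IN list and the LIKE prefix to your own column list and table prefix
SELECT table_catalog, table_name, column_name
FROM information_schema.columns
WHERE column_name IN ('ColumnA', 'ColumnB', 'ColumnC')
AND table_name LIKE 'abcd%';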

Count null columns in each row - SQL

I want to count the number of columns that are NULL or = '' in each row in SQL, grouped by Row_ID.
Something like this:
SELECT
Row_ID, COUNT(*) AS 'cnt_blankCol'
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
table_catalog = 'db'
AND table_name = 'tblName'
AND COLUMNS IS NULL OR COLUMNS = ''
GROUP BY
Row_ID
ORDER BY
COUNT(*)
Thank you.
INFORMATION_SCHEMA views are metadata views that contain information about the database objects themselves; they do not contain the actual data from the tables, and they do not contain data aggregates per object.
This cannot be done by querying INFORMATION_SCHEMA. Perhaps the OP can update the question and give a scenario explaining the goal of the question.
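For what it's worth, this kind of count is normally done against the data table itself, naming each column explicitly. A sketch, assuming a hypothetical tblName with columns Col1, Col2 and Col3:
-- Col1..Col3 are assumed column names; add one CASE per column of tblName
SELECT
Row_ID,
CASE WHEN Col1 IS NULL OR Col1 = '' THEN 1 ELSE 0 END
+ CASE WHEN Col2 IS NULL OR Col2 = '' THEN 1 ELSE 0 END
+ CASE WHEN Col3 IS NULL OR Col3 = '' THEN 1 ELSE 0 END AS cnt_blankCol
FROM tblName;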

Use result of stored procedure to join to a table

I have a stored procedure that returns a dataset from a dynamic pivot query (meaning the pivot columns aren't known until run time because they are driven by data).
The first column in this dataset is a product id. I want to join that product id with another product table that has all sorts of other columns that were created at design time.
So, I have a normal table with a product id column and I have a "dynamic" dataset that also has a product id column that I get from calling a stored procedure. How can I inner join those 2?
Dynamic SQL is very powerful, but it has some severe drawbacks. One of them is exactly this: you cannot use its result in ad-hoc SQL.
The only way to get the result of an SP into a table is to create a table with a fitting schema and use the INSERT INTO NewTbl EXEC ... syntax.
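As a sketch of that syntax (dbo.GetPivotedProducts and the column list are placeholders, and this only works when the result set's schema is already known):
-- the target table must match the SP's result set exactly
CREATE TABLE NewTbl (ProductId INT, SomeCol NVARCHAR(100));
INSERT INTO NewTbl
EXEC dbo.GetPivotedProducts;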
But there are other possibilities:
1) Use SELECT ... INTO ... FROM
Within your SP, when the dynamic SQL is executed, you could add INTO NewTbl to your select:
SELECT Col1, Col2, [...] INTO NewTbl FROM ...
This will create a table with the fitting schema automatically.
You might even hand in the name of the new table as a parameter - it is dynamic SQL anyway - but in this case it will be more difficult to handle the join outside (it must be dynamic again).
If you need your SP to return the result, you just add SELECT * FROM NewTbl. This will return the same resultset as before.
Outside your SP you can join this table as any normal table...
BUT, there is a big BUT - oops, that sounds nasty somehow - this will fail if the table already exists...
So you have to drop it first, which can lead to deep trouble if this is a multi-user application with possible concurrency.
If not: Use IF EXISTS(SELECT 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME='NewTbl') DROP TABLE NewTbl;
If yes: Create the table with a name you pass in as a parameter and run your external query dynamically with this name.
After this you can re-create this table using the SELECT ... INTO syntax...
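Put together, option 1 could look roughly like this from the outside (dbo.GetPivotedProducts, NewTbl and dbo.Products are placeholder names):
-- inside the SP, the dynamic statement ends with ... INTO NewTbl ...
IF EXISTS(SELECT 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME='NewTbl') DROP TABLE NewTbl;
EXEC dbo.GetPivotedProducts;   -- recreates NewTbl via SELECT ... INTO
SELECT p.*, n.*
FROM dbo.Products AS p
INNER JOIN NewTbl AS n ON n.ProductId = p.ProductId;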
2) Use XML
One advantage of XML is the fact, that any structure and any amount of data can be stuffed into one single column.
Let your SP return a table with one single XML column. You can - as you know the schema now - create a table and use INSERT INTO XmlTable EXEC ....
Knowing that there will be a ProductID element, you can extract this value and create a two-column derived table with the ID and the corresponding XML. This is easy to join.
Using wildcards in XQuery makes it possible to query XML data without knowing all the details...
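A sketch of option 2, assuming the SP returns one row per product with a single XML column, that each XML value contains a /row/ProductID element, and that dbo.Products is the design-time product table (all of these names are assumptions):
CREATE TABLE #XmlTable (Result XML);
INSERT INTO #XmlTable
EXEC dbo.GetPivotedProductsXml;   -- placeholder SP name

SELECT p.*, d.ProductXml
FROM (
SELECT
x.Result.value('(/row/ProductID)[1]', 'int') AS ProductId,
x.Result AS ProductXml
FROM #XmlTable AS x
) AS d
INNER JOIN dbo.Products AS p ON p.ProductId = d.ProductId;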
3) This was my favourite: Don't use dynamic queries...

Retrieve data from a partitioned table using SELECT

I created partitions on one table, using the year column.
I want to retrieve the data from the partitioned table using a WHERE clause.
How can I write my SELECT statement?
Something like this...
-- fetch all the rows on the 2012 partition
SELECT
t.name,
t.year
FROM
mytable AS t
WHERE
$PARTITION.<partition_function_name>(t.year) = $PARTITION.<partition_function_name>(2012)
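To make the placeholder concrete, here is a minimal sketch assuming a partition function named pf_Year on the year column (the partition scheme and the table definition on that scheme are omitted):
-- assumed setup, for illustration only
CREATE PARTITION FUNCTION pf_Year (int)
AS RANGE RIGHT FOR VALUES (2011, 2012, 2013);

-- fetch all the rows on the 2012 partition
SELECT t.name, t.year
FROM mytable AS t
WHERE $PARTITION.pf_Year(t.year) = $PARTITION.pf_Year(2012);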

Does MS SQL Server automatically create a temp table if the query contains a lot of ids in the IN clause?

I have a big query to get multiple rows by id's like
SELECT *
FROM TABLE
WHERE Id in (1001..10000)
This query runs very slowly and ends up with a timeout exception.
A temporary fix is to query with a limit, breaking the query into 10 parts of 1,000 ids each.
I heard that using temp tables may help in this case, but it also looks like MS SQL Server does this automatically under the hood.
What is the best way to handle problems like this?
You could write the query as follows using a temporary table:
CREATE TABLE #ids(Id INT NOT NULL PRIMARY KEY);
INSERT INTO #ids(Id) VALUES (1001),(1002),/*add your individual Ids here*/,(10000);
SELECT
t.*
FROM
[Table] AS t
INNER JOIN #ids AS ids ON
ids.Id=t.Id;
DROP TABLE #ids;
My guess is that it will probably run faster than your original query. Lookup can be done directly using an index (if it exists on the [Table].Id column).
Your original query translates to
SELECT *
FROM [TABLE]
WHERE Id=1001 OR Id=1002 OR /*...*/ OR Id=10000;
This would require evaluation of the expression Id=1001 OR Id=1002 OR /*...*/ OR Id=10000 for every row in [Table], which probably takes longer than with a temporary table. The example with a temporary table takes each Id in #ids and looks for a corresponding Id in [Table] using an index.
This all assumes that there are gaps in the Ids between 1001 and 10000. Otherwise it would be easier to write
SELECT *
FROM [TABLE]
WHERE Id BETWEEN 1001 AND 10000;
This would also require an index on [Table].Id to speed it up.
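If that index doesn't exist yet, creating it is a one-liner (IX_Table_Id is just an assumed name; skip this if Id is already the clustered primary key):
CREATE INDEX IX_Table_Id ON [Table] (Id);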
