TSQL - Get maximum length of data in every column in every table without Dynamic SQL - sql-server

Is there a way to get maximum length of data stored in every column in the database? I have seen some solutions which used Dynamic SQL, but I was wondering if it can be done with a regular query.

Yes. Query the INFORMATION_SCHEMA.COLUMNS view for the database; it exposes metadata, including the declared maximum length, for every column of every table. See the following for more details:
Information_Schema - COLUMNS
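For example, a query along these lines lists the declared maximum length of every column (CHARACTER_MAXIMUM_LENGTH is NULL for non-character types and -1 for the MAX types):

SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
ORDER BY TABLE_SCHEMA, TABLE_NAME, ORDINAL_POSITION;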

If you are talking about the length of the actual data stored in a column, and not the declared length of the column, I am afraid that is not achievable without dynamic SQL.
The reason is that there is only one way to retrieve data, and that is the SELECT statement. That statement, however, requires the column to be named explicitly as part of the statement itself. There is nothing like
-- This does not work
select col.Data
from Table
where Table.col.Name='ColumnName'
So the answer is: No.
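For completeness, this is roughly what the dynamic SQL route looks like: a statement per column is generated from the metadata and the results are stitched together. A minimal sketch, limited here to character columns, and relying on QUOTENAME to protect the spliced-in names:

DECLARE @sql nvarchar(max) = N'';
SELECT @sql = @sql
    + N'SELECT ''' + c.TABLE_NAME + N''' AS TableName, '''
    + c.COLUMN_NAME + N''' AS ColumnName, MAX(DATALENGTH('
    + QUOTENAME(c.COLUMN_NAME) + N')) AS MaxBytes FROM '
    + QUOTENAME(c.TABLE_SCHEMA) + N'.' + QUOTENAME(c.TABLE_NAME)
    + N' UNION ALL '
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.DATA_TYPE IN ('char', 'nchar', 'varchar', 'nvarchar');
IF @sql <> N''
BEGIN
    -- remove the trailing ' UNION ALL ' (11 chars; LEN ignores trailing spaces, hence the +'x' trick)
    SET @sql = LEFT(@sql, LEN(@sql + N'x') - 1 - 11);
    EXEC sys.sp_executesql @sql;
END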

Related

Can a Snowflake UDF be used to create MD5 on the fly?

I was wondering if anyone has an example of creating an MD5 result using a UDF in Snowflake?
Scenario: I want a UDF that can take X columns, depending on the source, to create an MD5 result. So table A might have 5 columns, table B has 10, and so on, accounting for various data types.
Thanks,
Todd
Snowflake already provides a built-in MD5 function:
https://docs.snowflake.com/en/sql-reference/functions/md5.html
select md5('Snowflake');

+----------------------------------+
| MD5('SNOWFLAKE')                 |
+----------------------------------+
| edf1439075a83a447fb8b630ddc9c8de |
+----------------------------------+
There are many ways you can do the MD5 calculation, but I thought it would be good to understand your use case first. I am assuming that you want to use MD5 to validate data migrated to Snowflake. If that is the case, checking each row on Snowflake via MD5 may be expensive. A more optimal way to validate is to identify each column of the table and calculate the MIN, MAX, COUNT, number of NULLs, and DISTINCT COUNT for each column, then compare those figures with the source. I have created a framework with this approach, where I use the 'SHOW COLUMNS' query to get the list of columns. The framework also allows skipping some columns if required, and filtering the rows retrieved based on a dynamic criterion. This way of validating the data is more optimal. It would definitely help to understand your use case better.
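As an illustration of that per-column validation idea (a hand-written sketch; orders and order_amount are hypothetical names, and in the framework described above they would come from SHOW COLUMNS):

select
    min(order_amount)                    as min_value,
    max(order_amount)                    as max_value,
    count(*)                             as row_count,
    sum(iff(order_amount is null, 1, 0)) as null_count,
    count(distinct order_amount)         as distinct_count
from orders;
-- run the same query at the source and compare the numbers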
Does this work for you?
-- Snowflake SQL UDF that wraps the built-in MD5 function
create or replace function md5_calc (column_name varchar)
returns varchar
language sql
as $$
    select md5(column_name)
$$;

-- Usage:
SELECT EMPLID, md5_calc(EMPLID), EMPNAME, md5_calc(EMPNAME) FROM employee;
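To cover the "X columns depending on the source" part of the question, one common pattern (a sketch, not taken from the answers above) is to concatenate the columns with a delimiter before hashing, so each table simply passes its own column list:

-- table_a and its columns are placeholders for a real source table
select md5(concat_ws('|',
           coalesce(to_varchar(col1), ''),
           coalesce(to_varchar(col2), ''),
           coalesce(to_varchar(col3), ''))) as row_md5
from table_a;

The delimiter and NULL handling matter: Snowflake's concat_ws returns NULL if any argument is NULL, and without a delimiter ('ab','c') and ('a','bc') would hash identically.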

SSIS 2008 R2 - Can I set a variable to the name of the current scope?

This seems ridiculously easy, but I can't find it anywhere...
I have a VERY simple sequence container with two tasks: Truncate a SQL table, and repopulate it from production. But this container will be repeated for about 50 tables. The container's name (entered manually) = the name of both the source and destination tables.
I have two variables:
"TableName" is entered manually.
"DelTable" is an expression that uses #[User::TableName] to generate a simple SQL statement.
I'm super-lazy and would like to use an expression to set "TableName" = the name of the current scope so I only have to enter it once.
Ideas???
THANK YOU!
If you are truncating all the tables in a database and repopulating them with exactly the same structure, how about this approach:
1. Execute SQL:
select table_name
from INFORMATION_SCHEMA.TABLES -- add a WHERE clause to limit the tables to the ones you want
2. Save the results to an object variable called TABLES.
3. Add a Foreach Loop: loop through the ADO object, setting the value to a string variable called table.
4. Add an Execute SQL task to the Foreach Loop: truncate table ? and map the parameter.
5. Add a second Execute SQL task:
INSERT INTO SERVER.DB.SCHEMA.?
select * from ?
Again, map the parameters.
If you are having trouble mapping the parameters, set up string variables and use them to build the SQL statements to run, as sketched below.
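For example (a hedged sketch; the variable names DelTable, InsTable, and table are hypothetical, and the question's own @[User::TableName] pattern works the same way), each Execute SQL task can take its statement from a string variable built with an SSIS expression:

expression for User::DelTable:
"TRUNCATE TABLE " + @[User::table]

expression for User::InsTable:
"INSERT INTO SERVER.DB.SCHEMA." + @[User::table] + " SELECT * FROM " + @[User::table]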
@TomPhillips is correct; unfortunately I cannot comment on or mark that answer as useful, hence commenting here.
There's no easy quick fix to loop or automate this unless all 50 tables have the same structure, which is rare by any stretch of the imagination.
BIML is the way to go if you are lazy :)
SSIS is not dynamic. Data Flows require fixed input and output at compile time, not runtime. You cannot simply change the table name and have it work.
If you have a list of 50 tables to do the same function on, you can use BIML to dynamically generate the SSIS package(s). But the DF itself cannot be dynamic.

Is it possible in SQL Server to convert the type of a field based on the content of another field?

I have a table, DD, which is a data dictionary, with fields (say):
ColumnID (longint PK), ColumnName (varchar), Datatype (varchar)
I have another table, V, where I have sets of records in the form:
ColumnID (longint FK), ColumnValue (varchar)
I want to be able to convert sets of records from V into another table, Results, where each field will be translated based on the value of DD.Datatype, so that the destination table might be (say):
ColumnID (longint FK), ColumnValue (datetime)
To be able to do this, ISTM that I need to be able to do something like
CONVERT(value of DD.Datatype, V.ColumnValue)
Can anyone give me any clues on whether this is even possible, and if so, what the syntax would be? My google-fu has proved inadequate to find anything relevant.
You could do something like this with dynamic sql, certainly. As long as you are aware of the limitation that the datatype is a property of the COLUMN in the resultset, and not each cell in the resultset. So all the rows in a given column must have the same datatype.
The only way to accomplish something like CONVERT(value of DD.Datatype, V.ColumnValue) in SQL is with dynamic SQL. That has its own problems, such as basically needing to use stored procedures to keep queries efficient.
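A rough sketch of what that dynamic SQL could look like (assuming DD.Datatype holds valid T-SQL type names such as 'datetime' or 'int'; the value 42 is a placeholder ColumnID, and DD.Datatype should be validated, e.g. against sys.types, before being spliced in):

DECLARE @sql nvarchar(max), @id bigint = 42;
SELECT @sql = N'SELECT V.ColumnID, CONVERT(' + DD.Datatype
            + N', V.ColumnValue) AS TypedValue FROM V WHERE V.ColumnID = @id'
FROM DD
WHERE DD.ColumnID = @id;
EXEC sys.sp_executesql @sql, N'@id bigint', @id = @id;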
Alternately, you could fetch the datatype metadata with one query, construct a new query in your application, and then query the database again. Assuming you're using SQL Server 2012+, you could also try using TRY_CAST() or TRY_CONVERT(), and writing your query like:
SELECT TRY_CAST(value as VARCHAR(2)) FieldName
FROM table
WHERE datatype = 'VARCHAR' AND datalength = 2
But, again, you've got to know what the valid types are; you can't determine that dynamically with SQL without dynamic SQL. Variables and parameters are not allowed to be used for object or type names. However, no matter what you do, you need to remember that all data in a given column of a result set must be of the same datatype.
Most Entity-Attribute-Value tables like this sacrifice data integrity that strong typing brings by accepting that the data type is determined by the application and not the RDBMS. EAV does not allow you to have your cake (store data without a fixed schema) and eat it, too (enjoy DB enforced strong data typing, not having to typecast strings in the application, etc.).
EAV breaks data normalization pretty badly. It breaks First Normal Form, the most basic rule, and this is just one of the consequences. EAV tables make querying the data anywhere from awkward to extremely difficult, and you're almost always going to sacrifice performance doing it, because the RDBMS is built around the relational model.
That doesn't mean you shouldn't ever use EAV tables. They're relatively great for user defined fields. However, it does mean that they're always going to suck to query and manage. That's just the tradeoff. You broke First Normal Form. Querying and performance are going to suffer consequences of that choice.
If you really want to store all your data like this, you should look at either storing the data as blobs of XML or JSON (SQL Server 2016+), though that is a general pain to query, or using a NoSQL data store like MongoDB or Cassandra instead of a SQL RDBMS.

SQL Server : parameters for column names instead of values

This might seem like a silly question, but I'm surprised that I didn't find a clear answer to this already:
Is it possible to use SQL Server parameters for writing a query with dynamic column names (and table names), or does the input just need to be sanitized very carefully?
The situation is that tables and their column names (and the number of columns) are generated dynamically, so there is no way to know beforehand how to write a query manually. Since the tables and columns aren't known, I can't use an ORM, so I'm resorting to manual queries. Usually I'd use parameters to fill in values to prevent SQL injection; however, I'm pretty sure that this cannot be done the same way when specifying the table name and/or column names. I want to create generic queries for insert, update, upsert, and select, but I obviously don't want to open myself up to potential injection. Is there a best practice on how to accomplish this safely?
Just as an FYI - I did see this answer, but since there's no way for me to know the column / table names beforehand a case statement probably won't work for this situation.
Environment: SQL Server 2014 via ADO.NET (.NET 4.5 / C#)
There is no mechanism for passing table or column references to procedures. You just pass them as strings and then use dynamic SQL to build your queries. You do have to take precautions to ensure that your string parameters are valid.
One way to do this would be to validate that all table and column reference strings have valid names in sys.tables and sys.columns before building your T-SQL queries. Then you can be sure that they can be used safely.
You can also use literal parameters with dynamic SQL via the sp_executesql procedure. That won't validate your table and column names, but it does validate your other parameters and prevents SQL injection through them.
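Putting both suggestions together, a minimal T-SQL sketch (dbo.Customers and the variable values are hypothetical; the identifier is checked against sys.columns and bracket-quoted with QUOTENAME, while the user-supplied value travels as a real parameter):

DECLARE @col sysname = N'CustomerName';    -- column name received from the application
DECLARE @search nvarchar(100) = N'Smith';  -- user-supplied value

IF EXISTS (SELECT 1 FROM sys.columns
           WHERE object_id = OBJECT_ID(N'dbo.Customers') AND name = @col)
BEGIN
    DECLARE @sql nvarchar(max) =
        N'SELECT * FROM dbo.Customers WHERE ' + QUOTENAME(@col) + N' = @p';
    EXEC sys.sp_executesql @sql, N'@p nvarchar(100)', @p = @search;
END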

SSIS, splitting a single row into multiple rows

My problem is as follows. I have a CSV file (~100k rows) containing history information with the column format of:
ID1,History1,ID2,History2...ID110,History110
Each row may have anywhere between 0 and 110 history entries. Each separate entry requires a stored procedure to be called.
If there were a small number of possible entries per row, I imagine the way to do this would be to transform the data using a script, and send it to a unique path. Creating 110 paths would probably work, but isn't very elegant (and quite time consuming).
What would the best way to approach this be?
Just load the data (the raw CSV unchanged, one row per file line) into a staging table. Then call a stored procedure that uses a string splitter to break up and loop over the staging table rows, calling your other procedure for each history entry.
see: Arrays and Lists in SQL Server 2005 and Beyond
also see this previous answer: SQL comma delimted column => to rows then sum totals?
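A minimal sketch of that staging-plus-splitter idea (dbo.HistoryStaging and dbo.ProcessHistory are hypothetical names; a set-based splitter like those in the links above will outperform this cursor, but the shape of the solution is the same):

DECLARE @line varchar(max), @id varchar(50), @hist varchar(4000), @pos int;

DECLARE line_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT RawLine FROM dbo.HistoryStaging;
OPEN line_cur;
FETCH NEXT FROM line_cur INTO @line;
WHILE @@FETCH_STATUS = 0
BEGIN
    WHILE LEN(@line) > 0
    BEGIN
        -- peel off the next token (an ID); the appended comma handles the last token
        SET @pos  = CHARINDEX(',', @line + ',');
        SET @id   = LEFT(@line, @pos - 1);
        SET @line = SUBSTRING(@line, @pos + 1, LEN(@line));
        -- peel off its paired history value
        SET @pos  = CHARINDEX(',', @line + ',');
        SET @hist = LEFT(@line, @pos - 1);
        SET @line = SUBSTRING(@line, @pos + 1, LEN(@line));
        IF @id <> ''
            EXEC dbo.ProcessHistory @ID = @id, @History = @hist;
    END;
    FETCH NEXT FROM line_cur INTO @line;
END;
CLOSE line_cur;
DEALLOCATE line_cur;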
If you want to solve this in SSIS without the staging tables, you could create a destination script component and use a switch statement or a hashtable to look up the right sproc to execute for each data row.
It is unclear whether this is a better solution than the staging table approach above, but it is an alternative.
I know you already accepted an answer, but couldn't you use an Unpivot task to achieve what you wanted to do here?
