I have limited knowledge of SQL and access to a SQL Server database from Python 3, with no documentation and no user-friendly descriptions of the table and column names. I'm struggling to find the right tables and columns.
I've written a few helper functions using Pandas to get names and tables in the database and to find table or column names containing specific strings:
def find_tables(tablas, s):
    # One row per table whose name contains s (case-insensitive).
    return tablas.loc[tablas.table_name.str.contains(s, case=False)].drop_duplicates('table_name')

def find_column(tablas, s):
    # s may be a single pattern or a list of alternatives; both are matched case-insensitively.
    pattern = '|'.join(s) if isinstance(s, list) else s
    return tablas.loc[tablas.column_name.str.contains(pattern, case=False)]

def explora_tabla(tablas, s):
    # All columns of every table whose name contains s.
    return tablas.loc[tablas.table_name.str.contains(s, case=False)]
Unfortunately, given the odd names, this is usually not enough to pull out the information I need. So I thought I could try the "brute force" way: find, by value, which tables and columns contain a specific value, possibly filtering on other known fields when they are available. Obviously, more subtle ways to solve the problem are also welcome.
I found the following answer and tried it with the value 8004YS1LSLR, but it returns an error, I guess because of the data type. In any case, I would need to be able to match a wider range of formats.
I would like a general query that I could pass to pd.read_sql and that returns the table and column names in the database containing a given value, which could be an integer, float, string, etc.
This is yuck, but is based on something I had to write a while ago for a similar problem at the office.
I use a sql_variant here as this means that you can use it to search for other data types, and not have a bunch of implicit conversions in the WHERE. Note, however, that this means it will filter to the underlying data type of the sql_variant; if you supply an nvarchar it won't search varchar, nchar or char columns for example.
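Since the search only hits columns of one base type at a time, one way to cover "a wider range of formats" from the Python side is to decide which column types are worth trying for a given value and run the search once per type. A minimal sketch, assuming a grouping of types into families that is my own (SQL Server does not define `TYPE_FAMILIES`, and `candidate_types` is a hypothetical helper):

```python
# Hypothetical grouping of SQL Server base types; adjust to taste.
TYPE_FAMILIES = {
    "string": ["char", "varchar", "nchar", "nvarchar"],
    "integer": ["tinyint", "smallint", "int", "bigint"],
    "approximate": ["real", "float"],
    "exact": ["decimal", "numeric", "money", "smallmoney"],
}


def candidate_types(value):
    """Return the column base types worth searching for a Python value."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return ["bit"]
    if isinstance(value, int):
        return TYPE_FAMILIES["integer"] + TYPE_FAMILIES["exact"]
    if isinstance(value, float):
        return TYPE_FAMILIES["approximate"] + TYPE_FAMILIES["exact"]
    if isinstance(value, str):
        return TYPE_FAMILIES["string"]
    raise TypeError(f"no type family defined for {type(value).__name__}")
```

Each returned type name can then drive one run of the search, with the value converted to that type first.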
As I'm using sql_variant there's some silliness with explicit conversions as well, so I include the definition of the helper function I use for that here too.
QuoteSqlvariant:
CREATE FUNCTION [fn].[QuoteSqlvariant] (@SQLVariant sql_variant)
RETURNS nvarchar(258)
AS
/*
Written by Thom A 2021-03-21
Original Source: https://wp.larnu.uk/sql_variant-and-dynamic-sql/
Licenced under CC BY-ND 4.0
*/
BEGIN
    RETURN QUOTENAME(CONVERT(sysname,SQL_VARIANT_PROPERTY(@SQLVariant,'BaseType'))) +
           CASE WHEN CONVERT(sysname,SQL_VARIANT_PROPERTY(@SQLVariant,'BaseType')) IN (N'char',N'varchar') THEN CONCAT(N'(',CONVERT(int,SQL_VARIANT_PROPERTY(@SQLVariant,'MaxLength')),N')')
                WHEN CONVERT(sysname,SQL_VARIANT_PROPERTY(@SQLVariant,'BaseType')) IN (N'nchar',N'nvarchar') THEN CONCAT(N'(',CONVERT(int,SQL_VARIANT_PROPERTY(@SQLVariant,'MaxLength'))/2,N')')
                WHEN CONVERT(sysname,SQL_VARIANT_PROPERTY(@SQLVariant,'BaseType')) IN (N'datetime2',N'datetimeoffset',N'time') THEN CONCAT(N'(',CONVERT(int,SQL_VARIANT_PROPERTY(@SQLVariant,'Scale')),N')')
                WHEN CONVERT(sysname,SQL_VARIANT_PROPERTY(@SQLVariant,'BaseType')) IN (N'decimal',N'numeric',N'time') THEN CONCAT(N'(',CONVERT(int,SQL_VARIANT_PROPERTY(@SQLVariant,'Precision')),N',',CONVERT(int,SQL_VARIANT_PROPERTY(@SQLVariant,'Scale')),N')')
                WHEN CONVERT(sysname,SQL_VARIANT_PROPERTY(@SQLVariant,'BaseType')) IN (N'varbinary') THEN CONCAT(N'(',CONVERT(int,SQL_VARIANT_PROPERTY(@SQLVariant,'TotalBytes'))-4,N')')
                ELSE N''
           END;
END
GO
Solution
DECLARE @SearchValue sql_variant = CONVERT(varchar(15),'8004YS1LSLR'); --Explicit converting is **IMPORTANT** here
DECLARE @CRLF nchar(2) = NCHAR(13) + NCHAR(10),
        @OrDelim nvarchar(10) = NCHAR(13) + NCHAR(10) + N' OR ',
        @SQL nvarchar(MAX);
WITH ORs AS(
    SELECT s.name AS SchemaName,
           t.name AS TableName,
           STRING_AGG(QUOTENAME(c.[name]) + N' = CONVERT(' + fn.QuoteSqlvariant(@SearchValue) + N',@SearchValue)', @OrDelim) AS ORClauses
    FROM sys.schemas s
         JOIN sys.tables t ON s.schema_id = t.schema_id
         JOIN sys.columns c ON t.object_id = c.object_id
         JOIN sys.types ct ON c.system_type_id = ct.system_type_id
    WHERE ct.[name] = CONVERT(sysname,SQL_VARIANT_PROPERTY(@SearchValue,'BaseType'))
    GROUP BY s.name,
             t.name)
SELECT @SQL = STRING_AGG(N'SELECT N' + QUOTENAME(SchemaName,'''') + N' AS SchemaName,' + @CRLF +
                         N'       N' + QUOTENAME(TableName,'''') + N' AS TableName,' + @CRLF +
                         N'       *' + @CRLF +
                         N'FROM ' + QUOTENAME(SchemaName) + N'.' + QUOTENAME(TableName) + @CRLF +
                         N'WHERE ' + ORClauses + N';',@CRLF)
FROM ORs;
EXEC sys.sp_executesql @SQL, N'@SearchValue sql_variant', @SearchValue;
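If you would rather drive this from Python, the same "one SELECT per table, one OR per column" shape can be assembled client-side from the metadata you already have. This is only a sketch: `quotename` and `build_search_queries` are hypothetical helpers, the input is assumed to be (schema, table, column) tuples already filtered to a compatible data type, and the `?` placeholders assume a driver with qmark-style parameters such as pyodbc.

```python
from collections import defaultdict


def quotename(name):
    """Mimic T-SQL QUOTENAME's default bracket quoting (length limit not modelled)."""
    return "[" + name.replace("]", "]]") + "]"


def build_search_queries(columns):
    """Group (schema, table, column) rows by table and emit one SELECT each.

    Returns (schema, table, sql, parameter_count) tuples. Every comparison
    uses a '?' placeholder, so the search value never enters the SQL text.
    """
    by_table = defaultdict(list)
    for schema, table, column in columns:
        by_table[(schema, table)].append(column)

    queries = []
    for (schema, table), cols in by_table.items():
        where = "\n   OR ".join(f"{quotename(c)} = ?" for c in cols)
        sql = f"SELECT *\nFROM {quotename(schema)}.{quotename(table)}\nWHERE {where};"
        queries.append((schema, table, sql, len(cols)))
    return queries
```

Each statement could then be run with something like `pd.read_sql(sql, conn, params=[value] * n)`, keeping only the tables that return rows.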
To get the column use this code:
SELECT name,is_nullable,max_length
FROM sys.columns
WHERE object_id = OBJECT_ID('yourtablename')
Related
I have a list of tables and a list of users. I need to find every occurrence of each user in every table on the list, searching each table for one user ID at a time. I have 310,000 users to search for and 400 tables to search through, and each user needs to be searched for in each table. I'm not sure of the best way to go about this, as I don't know how to loop through either list to find all the records for each user.
Summary:
310,000 users
400 tables
need to find how many records each user has in every table.
This is a complete stab in the dark, but this might be what you are after. This assumes you are using a fully supported version of SQL Server (as you haven't stated otherwise); if not you'll need to use the old FOR XML PATH (and STUFF) method for string aggregation.
DECLARE @ColumnName sysname = N'YourColumnName',
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10),
        @SQL nvarchar(MAX);
SELECT @SQL = STRING_AGG(N'SELECT N' + QUOTENAME(t.[name],'''') + N' AS TableName,' + @CRLF +
                         N'       ' + QUOTENAME(c.name) + N',' + @CRLF +
                         N'       COUNT(*) AS RowsInTable' + @CRLF +
                         N'FROM ' + QUOTENAME(s.[name]) + N'.' + QUOTENAME(t.name) + @CRLF +
                         N'GROUP BY ' + QUOTENAME(c.name) + N';',@CRLF) --GROUP BY is required to return the column alongside COUNT(*)
FROM sys.schemas s
     JOIN sys.tables t ON s.schema_id = t.schema_id
     JOIN sys.columns c ON t.object_id = c.object_id
WHERE t.[name] IN (SELECT YT.TableName
                   FROM dbo.YourTableOfTables YT)
  AND c.[name] = @ColumnName;
CREATE TABLE #UserRowCounts (TableName sysname,
                             UserID int, --Guessed data type
                             RowsInTable int);
INSERT INTO #UserRowCounts
EXEC sys.sp_executesql @SQL;
SELECT TableName,
       UserID,
       RowsInTable
FROM #UserRowCounts URC
WHERE URC.UserID IN (SELECT YT.UserID
                     FROM dbo.YourTableOfUsers YT)
ORDER BY UserID,
         TableName;
As mentioned in the comments, this probably won't be quick; but then you are counting the rows from roughly 400 tables in a single batch, so you shouldn't expect it to be.
I am trying to get all the table names, and the values present in a particular column, for every table in the database that has that column. Tables without the column should be ignored.
For example: find all the table names and values of the column 'last_refresh_date' for every table in the database that has a last_refresh_date column.
The code I tried is:
EXEC sp_MSforeachtable 'SELECT distinct ''?'' TableName, last_refresh_date FROM ?'
I get an error because some of the tables don't have a column named 'last_refresh_date'.
Thanks for the help in advance
I'm going to assume you are using a recent version of SQL Server, and thus have access to STRING_AGG; if not, you'll need to use the "old" FOR XML PATH method.
Anyway, you can achieve this with a little bit of dynamic SQL and UNION ALL. I assume you want the table and schema names as well:
DECLARE @SQL nvarchar(MAX),
        @ColumnName sysname = N'YourColumnName',
        @CRLF nchar(2) = CHAR(13) + CHAR(10);
DECLARE @Delimiter nvarchar(30) = @CRLF + N'UNION ALL' + @CRLF;
SET @SQL = (SELECT STRING_AGG(N'SELECT N' + QUOTENAME(s.[name],'''') + N' AS SchemaName, N' + QUOTENAME(t.[name],'''') + N' AS TableName, ' + QUOTENAME(c.[name]) + N' FROM ' + QUOTENAME(s.[name]) + N'.' + QUOTENAME(t.[name]), @Delimiter) WITHIN GROUP (ORDER BY t.object_id)
            FROM sys.schemas s
                 JOIN sys.tables t ON s.schema_id = t.schema_id
                 JOIN sys.columns c ON t.object_id = c.object_id
            WHERE c.[name] = @ColumnName);
EXEC sys.sp_executesql @SQL;
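For comparison, here is roughly what that batch builds, written out in Python so the string assembly is easier to follow. It is a sketch only: `quotename`, `quote_literal` and `build_union_all` are hypothetical helpers, and the list of (schema, table) pairs is assumed to come from a metadata query like the one above.

```python
def quotename(name):
    """Mimic T-SQL QUOTENAME's default bracket quoting (length limit not modelled)."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(text):
    """Render text as an N'...' literal, doubling embedded single quotes."""
    return "N'" + text.replace("'", "''") + "'"


def build_union_all(tables, column):
    """Build one SELECT per (schema, table) pair and glue them with UNION ALL."""
    selects = [
        f"SELECT {quote_literal(schema)} AS SchemaName, "
        f"{quote_literal(table)} AS TableName, "
        f"{quotename(column)} FROM {quotename(schema)}.{quotename(table)}"
        for schema, table in tables
    ]
    return "\nUNION ALL\n".join(selects)
```

Because only tables known to have the column are listed, the "missing column" error from sp_MSforeachtable cannot occur. UNION ALL does still need the column to have compatible data types across the tables.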
Is there a possibility to alter a column from "allows null" to "does not allow null" without knowledge of the actual data type of the column?
I think not, so I have written this as the basic skeleton of my stored procedure:
SELECT t.name,c.max_length FROM sys.types t
LEFT JOIN sys.columns c ON(t.system_type_id = c.system_type_id)
WHERE object_id=OBJECT_ID(#TableName) AND c.name=#FieldName;
and
EXEC('UPDATE ' + #TableName + ' SET ' + #FieldName + ' = ' + #DefaultValue + ' WHERE ' + #FieldName + ' IS NULL');
EXEC('ALTER TABLE ' + #TableName + ' ALTER COLUMN ' + #FieldName + ' NOT NULL');
I guess now I only have to get the return values from the first query back into the second. I can't get my head around how to get the values into a variable and then access them again. Ideas?
Since the INFORMATION_SCHEMA has all required information and is part of a SQL standard, it might be better to use that in this case (however, SQL Server's ALTER TABLE ALTER COLUMN is non-standard anyway so it might not matter as much).
Either way, you should also be checking for whether there's character length and/or numeric precision being specified, and make sure you're altering the table in the correct schema (and not getting dbo.TableName instead of customschema.TableName). You could try something like this (I used INFORMATION_SCHEMA here but you could easily refactor this to use the sys.columns view):
DECLARE @retVal VARCHAR(500);
SELECT @retVal =
    CASE WHEN CHARACTER_MAXIMUM_LENGTH > 0
             THEN CONCAT(DATA_TYPE, '(', CHARACTER_MAXIMUM_LENGTH ,')')
         WHEN CHARACTER_MAXIMUM_LENGTH = -1 AND DATA_TYPE <> 'xml'
             THEN CONCAT(DATA_TYPE, '(MAX)')
         WHEN DATA_TYPE IN ('numeric', 'decimal')
             THEN CONCAT(DATA_TYPE, '(', NUMERIC_PRECISION,',', NUMERIC_SCALE,')')
         ELSE DATA_TYPE
    END
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = @schemaName
  AND TABLE_NAME = @tableName
  AND COLUMN_NAME = @columnName;
@retVal will now capture data types like int, varchar(100), varbinary(MAX), or decimal(10,2) correctly.
Then build up a dynamic SQL query like this:
DECLARE @sql VARCHAR(MAX);
SET @sql = 'ALTER TABLE ' + @schemaName + '.' + @tableName + ' ALTER COLUMN ' + @columnName + ' ' + @retVal + ' NOT NULL;'
EXEC(@sql);
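If it helps to see the branching in isolation, the CASE expression can be mirrored in Python and tried against a few metadata rows. This is a sketch, not part of the procedure: `column_type` is a hypothetical helper whose arguments stand for DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION and NUMERIC_SCALE, with None standing in for SQL NULL.

```python
def column_type(data_type, char_max_len=None, precision=None, scale=None):
    """Rebuild a column's type declaration from INFORMATION_SCHEMA-style values."""
    # Character and binary types with an explicit length, e.g. varchar(100).
    if char_max_len is not None and char_max_len > 0:
        return f"{data_type}({char_max_len})"
    # -1 marks the MAX variants; xml also reports -1 but takes no length.
    if char_max_len == -1 and data_type != "xml":
        return f"{data_type}(MAX)"
    # Only numeric/decimal carry precision and scale in their declaration.
    if data_type in ("numeric", "decimal"):
        return f"{data_type}({precision},{scale})"
    return data_type
```

Note that int also reports a numeric precision, which is why the precision branch is restricted to numeric and decimal rather than firing whenever a precision is present.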
You select values into variables like this:
SELECT @Var1=t.name,@Var2=c.max_length FROM sys.types t
LEFT JOIN sys.columns c ON(t.system_type_id = c.system_type_id)
WHERE object_id=OBJECT_ID(@TableName) AND c.name=@FieldName;
This of course assumes that you have already declared @Var1 and @Var2, and that your query will return only one row.
I have a table, tabEvent, with 11 fields that are DateTime. I want to query every DateTime field in tabEvent, and if any value is > '4/1/2014' and < GETDATE(), return that row.
But I don't want to hard-code the field names, because they are always changing (the site is growing and morphing constantly).
I have the snippet below which returns me the field names, but is not a query on the table (yet).
select c.name ColumnName
from sys.columns c
join sys.types t on (c.user_type_id = t.user_type_id)
where object_name(c.OBJECT_ID) = 'tabEvent'
and t.name = 'datetime'
order by c.OBJECT_ID
English translation: I'm trying to code a proc that will, each time I log into the admin page, check the tabEvent table for any key dates that have passed (since the last notification) and send a message to me, the admin. Then it will update the notification-date value to today, for next time's running of the proc.
Any help is appreciated!
You can achieve this using dynamic sql:
DECLARE @StartDate VARCHAR(10) = '2014-04-01'
DECLARE @EndDate VARCHAR(10) = convert(VARCHAR(10),GETDATE(),20)
DECLARE @sql nVARCHAR(max)
DECLARE @where nVARCHAR(max)
SELECT @where = ' WHERE ' + STUFF((select c.name + ' BETWEEN ''' + @StartDate + ''' AND ''' + @EndDate + ''' AND '
                                   from sys.columns c
                                   join sys.types t on (c.user_type_id = t.user_type_id)
                                   where object_name(c.OBJECT_ID) = 'tabEvent'
                                   and t.name = 'datetime'
                                   order by c.OBJECT_ID
                                   FOR XML PATH('')),1,0,'')
SELECT @sql = 'SELECT * FROM tabEvent' + LEFT(@where,LEN(@where)-4)
EXECUTE sp_executesql @sql
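The query above joins the per-column predicates with AND, so a row is returned only when every datetime column falls in the range. If "any" is what you want, the predicates need OR instead. Here is a small Python sketch of both variants; `quotename` and `build_date_filter` are hypothetical helpers, the column list is assumed to come from the sys.columns query, and the `?` placeholders assume a qmark-style driver.

```python
def quotename(name):
    """Mimic T-SQL QUOTENAME's default bracket quoting (length limit not modelled)."""
    return "[" + name.replace("]", "]]") + "]"


def build_date_filter(table, columns, start, end, match="any"):
    """Return (sql, params) selecting rows whose date columns lie strictly between start and end.

    match="any" joins the per-column predicates with OR, anything else with AND.
    """
    if not columns:
        raise ValueError("at least one datetime column is required")
    joiner = " OR " if match == "any" else " AND "
    # Parentheses keep each column's range check intact when joined with OR.
    predicates = [f"({quotename(c)} > ? AND {quotename(c)} < ?)" for c in columns]
    sql = f"SELECT * FROM {quotename(table)} WHERE " + joiner.join(predicates)
    params = [start, end] * len(columns)
    return sql, params
```

Joining a list also avoids having to trim a trailing ' AND ' off the end of the string.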
When executing the following stored procedure I get "Invalid object name 'dbo.Approved'". The object dbo.Approved does exist, so presumably this has something to do with the way I pass the table name in as the parameter?
I should also add that I get the error whether I execute the procedure via .NET or from within SSMS.
@tableName as nvarchar(100)
AS
BEGIN
    EXEC('
        UPDATE T1
        SET T1.NPTid = dbo.Locations.NPT_ID
        FROM ' + '[' + @tableName + '] As T1
        INNER JOIN dbo.Locations ON T1.Where_Committed = dbo.Locations.Location_Name
    ')
END
Edit: after receiving help from Joe and JNK I rewrote the sproc, but now I get this error:
Msg 102, Level 15, State 1, Procedure sp_Updater, Line 14
Incorrect syntax near 'QUOTENAME'.
New sproc:
@tableName as nvarchar(100),
@schemaName as nvarchar(20)
AS
BEGIN
    EXEC('
        --Update NPT
        UPDATE T1
        SET T1.NPTid = dbo.Locations.NPT_ID
        FROM ' + QUOTENAME(@schemaName) + '.' + QUOTENAME(@tableName) + ' As T1
        INNER JOIN dbo.Locations ON T1.Where_Committed = dbo.Locations.Location_Name
    ')
END
With the square brackets in your string, your table reference turns into [dbo.Approved], which is not valid. The reference should be [dbo].[Approved] instead.
You might want to consider passing schema name and table name as two separate parameters.
It would also be better to use the QUOTENAME function instead of hard coding the square brackets.
declare @sql nvarchar(1000)
set @sql = N'UPDATE T1
SET T1.NPTid = dbo.Locations.NPT_ID
FROM ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N' As T1
INNER JOIN dbo.Locations ON T1.Where_Committed = dbo.Locations.Location_Name
'
EXEC (@sql)
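To make the difference concrete, here is a small Python model of QUOTENAME's default behaviour (wrap in square brackets, double any closing bracket). It is an illustration only: `quotename` and `qualified_name` are hypothetical helpers, and QUOTENAME's limit on input length is not modelled.

```python
def quotename(name):
    """Wrap an identifier in square brackets, doubling any ']' inside it."""
    return "[" + name.replace("]", "]]") + "]"


def qualified_name(schema, table):
    """Quote schema and table separately, then join them with a dot."""
    return quotename(schema) + "." + quotename(table)


# Quoting the dotted string as a whole yields a single identifier, i.e. a
# table literally named "dbo.Approved" in the caller's default schema.
whole = quotename("dbo.Approved")

# Quoting each part yields the schema-qualified reference that was intended.
parts = qualified_name("dbo", "Approved")
```

Here `whole` is `[dbo.Approved]` and `parts` is `[dbo].[Approved]`, which is why the schema and table are best passed as two separate parameters.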
If you use brackets for a multi-part name, you need to have brackets around each part but not around the period, i.e.:
[dbo].[Approved]
If you pass dbo.Approved as your parameter, your Dynamic SQL is reading it as [dbo.Approved] which would only work if you had a table named that (i.e. the dbo. is part of the table name not the schema).
Change it to:
'...[dbo].[' + @tableName + ']...
And just pass Approved as the parameter.
You're wrapping the identifier too early, so '[' + @tableName + ']' is translated to [dbo.Approved] when it should be [dbo].[Approved].
Table names and column names are actually sysname, which is equivalent to NVARCHAR(128) NOT NULL.
Also, you are vulnerable to a SQL injection attack. You should validate that @tableName is a real table by checking it against INFORMATION_SCHEMA.TABLES.
Finally, just to be absolutely sure, in case the real table has some odd characters in its name, you should use QUOTENAME(@tableName) to fully escape the table name.
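If the procedure is called from application code, the same two checks (validate against the catalog, then quote) can also be done before the name ever reaches the database. A minimal sketch, assuming `known_tables` holds (TABLE_SCHEMA, TABLE_NAME) pairs read from INFORMATION_SCHEMA.TABLES; `resolve_table` is a hypothetical helper, and the case-insensitive comparison assumes a case-insensitive database collation.

```python
def quotename(name):
    """Mimic T-SQL QUOTENAME's default bracket quoting (length limit not modelled)."""
    return "[" + name.replace("]", "]]") + "]"


def resolve_table(schema, table, known_tables):
    """Return a quoted [schema].[table] reference, or raise if the table is unknown.

    The name is only accepted when it matches a catalog entry, and the
    catalog's own spelling is what gets quoted and returned.
    """
    wanted = (schema.lower(), table.lower())
    for known_schema, known_table in known_tables:
        if (known_schema.lower(), known_table.lower()) == wanted:
            return quotename(known_schema) + "." + quotename(known_table)
    raise ValueError(f"unknown table: {schema}.{table}")
```

An injection attempt passed as a table name simply fails the lookup, so it never reaches the dynamic SQL.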