How to access information schema using a pypika query? - sql-server

I'm trying to get the names of the columns from a table in an Azure SQL database using a PyPika SQL query, but keep running into trouble. Here's the code I'm using to generate the query:
def dbView(table):
    infoSchema = ppk.Table("INFORMATION_SCHEMA.COLUMNS")
    return ppk.MSSQLQuery.from_(infoSchema).select(infoSchema.COLUMN_NAME).where(infoSchema.TABLE_NAME == table)
I created another function that uses the PyODBC library to get the SQL from the query, execute it against the database, and return all the rows:
def getData(query: ppk.Query):
    '''
    Execute a query against the Azure db and return
    every row in the results list.
    '''
    print("QUERY: ", query.get_sql())
    conn = getConnection()
    with conn.cursor() as cursor:
        cursor.execute(query.get_sql())
        return cursor.fetchall()
I know the getData() function works because when I pass it a simple select query, everything works correctly. However, when I try to use the query generated by pypika above, I get the following error:
pyodbc.ProgrammingError: ('42S02', "[42S02] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name 'INFORMATION_SCHEMA.COLUMNS'. (208) (SQLExecDirectW)")
To make sure this wasn't just some kind of permissions error, I wrote the following query by hand and executed it using the getData() function and it worked just fine:
SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'Validation'
I also printed out the query that pypika generated to the console. The only difference appears to be the addition of some double quotes:
SELECT "COLUMN_NAME" FROM "INFORMATION_SCHEMA.COLUMNS" WHERE "TABLE_NAME"='Validation'
What am I doing wrong? For some reason, this error appears to be limited to specifically the information schema table, because I have used similar queries several other times in my code without issue. I know I can just use the query I wrote by hand, but the point of using PyPika was to make all my SQL queries more readable and reusable - it'd be nice to understand why it doesn't work in this very specific situation.
Thanks!

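PyPika has an API for schema-qualifying tables. Passing "INFORMATION_SCHEMA.COLUMNS" to Table makes PyPika quote the whole string as a single identifier, which is why SQL Server reports an invalid object name. Create the table through a Schema object instead:
from pypika import Table, Query, Schema
views = Schema('views')
q = Query.from_(views.customers).select(views.customers.id, views.customers.phone)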
https://pypika.readthedocs.io/en/latest/2_tutorial.html#tables-columns-schemas-and-databases
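The double quotes in the printed query explain the error: a dotted name passed as one string is quoted as a single identifier, so SQL Server looks for an object literally named INFORMATION_SCHEMA.COLUMNS. The sketch below is not PyPika code. quote_identifier and qualified_name are hypothetical helpers that only imitate ANSI-style identifier quoting, to show the difference between the two forms. Building the table from Schema('INFORMATION_SCHEMA') should produce the second form.

```python
def quote_identifier(name: str) -> str:
    """Wrap one identifier in double quotes, doubling any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


def qualified_name(*parts: str) -> str:
    """Quote each part separately and join the parts with dots."""
    return ".".join(quote_identifier(part) for part in parts)


# One string containing a dot: treated as a single object name.
as_single_name = quote_identifier("INFORMATION_SCHEMA.COLUMNS")

# Schema and table quoted separately: a two-part name.
as_two_part_name = qualified_name("INFORMATION_SCHEMA", "COLUMNS")

print(as_single_name)    # "INFORMATION_SCHEMA.COLUMNS"
print(as_two_part_name)  # "INFORMATION_SCHEMA"."COLUMNS"
```

Only the second form matches the hand-written query, where INFORMATION_SCHEMA is the schema and COLUMNS is the view.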

Related

SAS query issue on external DBMS TABLE where Column Name has space

Through SAS/ACCESS, I can successfully run data steps querying external DBMS tables. E.g.,
Data OutTable;
Set ExternalDBMS.Table1;
Where Var1 ='abc';
Run;
However, when a column name contains a space, the step fails even when I use a name literal (''n).
One example as shown below:
Data OutTable;
Set ExternalDBMS.Table1;
Where 'Var 2'n ='abc';
Run;
ERROR: CLI open cursor error: [SAS][ODBC SQL Server Wire Protocol driver][Microsoft SQL Server]Incorrect syntax near the keyword 'Function'.
A further attempt with the SAS option validvarname=v7, to standardize the variable names that contain spaces, produced the same error.
After setting the SAS option sastrace=',,,d', I found that SAS/ACCESS submitted a statement like this to SQL Server:
SELECT Var 1, .....
FROM schema1.Table1
WHERE (Var 1 ='abc' );
Apparently the code above causes an error on the SQL Server side because Var 1 is neither quoted nor bracketed.
One way to fix it is to use an explicit pass-through query. I'm just wondering whether there are other ways to solve this problem.
Thanks in advance!
When using an explicit pass-through query, put a set of square brackets around the variable name. This is similar to how you'd write the code in SSMS.
SELECT [Var 1], ...
FROM schema1.Table1
WHERE ([Var 1] ='abc' );

How to use a string to search varbinary columns in SQL Server 2008+ using Linq to Entities

My environment is .NET Framework 4.5.1, C#, Entity Framework 6.1.3 (Object Context), SQL Server 2008+, ASP.NET MVC5.
I have a table in SQL Server with document objects being stored in a varbinary(max) column.
I am rewriting an existing web application within which we give the users the ability to enter a search string via a textbox ~ which then searches through each record in the table and returns a list of records that contain the user input search string.
Currently the SQL query, including the user specified search string is built as a string command, and then executed via ADO.NET data adapter and passed to SQL Server and works perfectly.
Here's a test example of the complete SQL query that is built, including the user specified search string that currently runs (taken from SQL Server profiler)
SELECT TOP 1000000 *
FROM
(SELECT DISTINCT docInfo.docInfoID
FROM docFile
INNER JOIN docInfo ON docInfo.docFileID = docFile.docFileID
INNER JOIN SOInfoArchive ON SOInfoArchive.docInfoID = docInfo.docInfoID
WHERE (docInfo.docType LIKE 'SYSOUT')
AND (SOInfoArchive.serverInfoID IN (1))
AND CONTAINS(docFileObject, 'REM')
)
The above query successfully returns 89,576 records.
I would like to replace the current process of building the SQL query as a string command within my C# code by using LINQ to Entities.
I am using LINQ to Entities for all of my SQL server interactions through out the project.
However, I am unable to pass the user supplied string to my LINQ queries in such a way that I can replace the SQL command string query as shown above.
At first (wrongly) I thought I could use the .Contains() method and pass to it the user submitted string converted into bytes, an example as follows :
string input = "orderno=012p92"; // passed from the user via textbox
var bytes = System.Text.Encoding.Unicode.GetBytes(input); // convert the string to binary ready for searching the varbinary fields
using (SysviewEntities context = new SysviewEntities())
{
    var docs = from df in context.docFile
               where df.docFileObject.Contains('bytes')
               select df;
}
But this will not even compile as I get the error
'byte[]' does not contain a definition for 'Contains' and the best
extension method overload
'System.Linq.ParallelEnumerable.Contains(System.Linq.ParallelQuery,
TSource)' has some invalid arguments
I then researched some similar questions that had been asked here on SO and thought the solution would be to use an equals evaluation operator replacing the .Contains() method as follows:
string theString = "orderno=*012p92";
byte[] theBytes = Encoding.Unicode.GetBytes(theString);
using (SysviewEntities context = new SysviewEntities())
{
    string input = "orderno=012p92";
    var bytes = System.Text.Encoding.Unicode.GetBytes(input);
    var docs = (from df in context.docFile
                where df.docFileObject == bytes
                select df).ToList();
}
Now the query compiles but returns NO results ~ the resultant SQL query passed to SQL Server (from SQL Profiler) from my LINQ query is
exec sp_executesql N'SELECT
[Extent1].[docFileID] AS [docFileID],
[Extent1].[docFileHash] AS [docFileHash],
[Extent1].[docFileObject] AS [docFileObject],
[Extent1].[docFilterType] AS [docFilterType]
FROM [dbo].[docFile] AS [Extent1]
WHERE [Extent1].[docFileObject] = @p__linq__0',N'@p__linq__0 varbinary(8000)',@p__linq__0=0x6F0072006400650072006E006F003D00300031003200700039003200
So my binary conversion of the search string obviously has not worked.
All the other questions that I've read here at SO that seem similar, don't seem to apply to what I'm trying to achieve here ~ although I do concede it is very possible that I have just not understood the recommended and suggested solutions correctly.
So to summarise my questions are
Is it possible to easily duplicate this SQL query using LINQ to Entities?
SELECT TOP 1000000 *
FROM
(SELECT DISTINCT docInfo.docInfoID
FROM docFile
INNER JOIN docInfo ON docInfo.docFileID = docFile.docFileID
INNER JOIN SOInfoArchive ON SOInfoArchive.docInfoID = docInfo.docInfoID
WHERE (docInfo.docType LIKE 'SYSOUT')
AND (SOInfoArchive.serverInfoID IN (1))
AND CONTAINS(docFileObject, 'REM')
)
How / why does this part of the SQL query
AND CONTAINS(docFileObject, 'REM')
take the string (in this instance REM) and search through all the varbinary docFileObjects in the table to find records that match the search string REM ?
There must be some sort of conversion going on somewhere?
How do I replicate this conversion successfully so that I can pass a converted string value to my LINQ query?
Am I better off using the current process and not using LINQ to Entities for this specific functionality?
I would be very grateful to hear from you if you can explain to me in simple terms what it is that I'm doing wrong, what are the issues of trying to do this using LINQ to Entities and any simple suggestions as to what I can do to get a solution.
The bottom line is that I cannot use the LINQ to Entities .Contains method to pass a string variable to a SQL Contains statement where the SQL column is Varbinary format.
My solution was to create a stored procedure in SQL and pass the string variable as an input nvarchar within my stored procedure.
My first attempt to do this using temporary SQL tables failed (the stored procedure worked ~ but I was unable to consume the data within EF because EF could not 'see' the metadata of the columns my stored procedure was trying to return).
I eventually solved the problem by using SQL temporary variables.
Both my failed process and the way I resolved that problem are documented in the following SO post
(Additional) EF can't infer return schema from stored procedure selecting from a #temp table
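A quick check outside .NET suggests the string-to-bytes conversion in the question was fine. Encoding.Unicode in .NET is UTF-16 little-endian, and the sketch below (Python, with a made-up sample document) reproduces the parameter value from the Profiler trace. The equality comparison is the likelier culprit: it asks for rows whose entire column value equals those bytes, which is different from containing them. CONTAINS is a full-text predicate that works against a full-text index rather than comparing raw bytes, so a byte-level substring test would not reproduce it either.

```python
search_text = "orderno=012p92"

# .NET's Encoding.Unicode corresponds to UTF-16 little-endian without a BOM.
needle = search_text.encode("utf-16-le")
hex_value = "0x" + needle.hex().upper()

# Hypothetical stored document that merely contains the search text.
document = "customer=42;orderno=012p92;status=open".encode("utf-16-le")

is_equal = document == needle      # what `df.docFileObject == bytes` tests
is_contained = needle in document  # a substring test, not equality

print(hex_value)
print(is_equal, is_contained)
```

The printed hex value matches the @p__linq__0 parameter in the trace, so the empty result comes from the comparison operator rather than from the encoding.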

Why can't SQL Server database name begin with a number if I run a query?

Recently I found an anomaly with SQL Server database creation. If I create a database with the SQL query
create database 6033SomeDatabase;
It throws an error.
But with the Management Studio UI, I can manually create a database with a name of 6033SomeDatabase.
Is this expected behaviour or is it a bug? Please throw some light on this issue.
Try like this,
IF DB_ID('6033SomeDatabase') IS NULL
CREATE DATABASE [6033SomeDatabase]
I'll try to give you a detailed answer.
SQL syntax imposes some restrictions on the names of databases, tables, and fields. For example:
SELECT * FROM SELECT, FROM WHERE SELECT.Id = FROM.SelectId
The SQL parser can't parse this query. You should rewrite it as:
SELECT * FROM [SELECT], [FROM] WHERE [SELECT].Id = [FROM].SelectId
Another example:
SELECT * FROM T1 WHERE Code = 123e10
Is 123e10 the name of a column in T1, or is it the numeric constant 123×10^10? The parser can't tell.
Therefore, there are naming rules. If you need an unusual database or table name, you can enclose it in brackets.
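When names come from code rather than being typed by hand, the bracket rule can be applied mechanically. The sketch below uses a hypothetical helper, bracket_quote, that mimics what T-SQL's QUOTENAME does by default: wrap the name in square brackets and double any closing bracket inside it. needs_brackets is a simplified check that covers only ASCII names and ignores reserved words, so treat it as illustrative.

```python
import re

# Simplified rule for a regular identifier: starts with a letter or
# underscore, followed by letters, digits, or underscores. The real T-SQL
# rules also allow @, #, $ and Unicode letters, and exclude reserved words.
_REGULAR_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def needs_brackets(name: str) -> bool:
    """Return True when the name does not fit the simplified rule above."""
    return _REGULAR_IDENTIFIER.fullmatch(name) is None


def bracket_quote(name: str) -> str:
    """Delimit a name with square brackets, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


statement = "CREATE DATABASE " + bracket_quote("6033SomeDatabase")
print(needs_brackets("6033SomeDatabase"))
print(statement)
```

Quoting every generated name, whether or not it strictly needs it, avoids having to track the full identifier rules in application code.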

Execute sql task mapping variables in ssis

INSERT INTO [DEV_BI].dbo.[DimAktivitet]([Beskrivning],[företag],[Projektnummer],[Aktivitet],
loaddate)
SELECT NULL,
a.DATAAREAID,
a.PROJID,
a.MID_ACTIVITYNUMBER,
GETDATE()
FROM [?].dbo.[v_ProjCostTrans_ProjEmplTrans] a
LEFT OUTER JOIN [DEV_BI] .dbo.[DimAktivitet] b ON a.MID_ACTIVITYNUMBER = b.Aktivitet
AND a.DataAreaID = b.företag
AND a.ProjID = b.Projektnummer
WHERE b.Aktivitet_key IS NULL
I have the SQL code above in an Execute SQL Task, and in the parameter mapping I have mapped a variable named User::connectionstring with data type NVARCHAR and parameter name 0. I'm getting the following error:
[Execute SQL Task] Error: Executing the query "insert into [DEV_BI].dbo.[DimAktivitet]([Beskrivni..." failed with the following error: "Invalid object name '?.dbo.v_ProjCostTrans_ProjEmplTrans'.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
Could someone please help me solve this?
It appears you are trying to change the database based on a variable. Parameters in an Execute SQL Task can only stand in for values, such as filters in the WHERE clause; they cannot replace object names. This behavior is described in this TechNet article. For instance, you could do this:
insert into [DEV_BI].dbo.[DimAktivitet]([Beskrivning],[företag],[Projektnummer],[Aktivitet],loaddate)
select null,a.DATAAREAID,a.PROJID,a.MID_ACTIVITYNUMBER,GETDATE() from
[DEV_BI].dbo.[v_ProjCostTrans_ProjEmplTrans] a
left outer join
[DEV_BI] .dbo.[DimAktivitet] b
on a.MID_ACTIVITYNUMBER = b.Aktivitet AND a.DataAreaID = b.företag AND a.ProjID = b.Projektnummer
where b.Aktivitet_key is null
AND b.SomeFilterCriteria = ?;
If you really want to vary the database based on a variable, then you have three options:
Vary the Connection Manager connection string to your database connection based on an expression as described in a blog post. This is the best solution if you are only changing the database and nothing else.
Generate the entire SQL code as a variable and execute that variable as the SQL command instead of passing parameters to the Execute SQL Task. This is described in this blog post under the section "Passing in the SQL Statement from a Variable".
Create a stored procedure, pass the parameter to the stored procedure, and let it generate the SQL it needs on the fly.
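The underlying rule, that a parameter marker can supply a value but never an object name, is not specific to SSIS. The sketch below illustrates it with Python's built-in sqlite3 module standing in for SQL Server, an assumption made purely so the example runs anywhere. The table name, the allow-list, and build_select are all hypothetical. It also shows the idea behind the second option: assemble the statement text yourself, after checking the name against a known list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trans (activity TEXT)")
conn.execute("INSERT INTO trans VALUES ('A1')")

# Inside brackets, ? is an identifier literally named "?", not a parameter.
try:
    conn.execute("SELECT activity FROM [?]")
    bracket_error = None
except sqlite3.OperationalError as exc:
    bracket_error = str(exc)

# As a value in the WHERE clause, the marker works as intended.
rows = conn.execute(
    "SELECT activity FROM trans WHERE activity = ?", ("A1",)
).fetchall()

ALLOWED_TABLES = {"trans"}  # hypothetical allow-list of known object names


def build_select(table: str) -> str:
    """Build the statement text for a table name that is known to be safe."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table!r}")
    return f"SELECT activity FROM [{table}]"


dynamic_rows = conn.execute(build_select("trans")).fetchall()
print(bracket_error)
print(rows, dynamic_rows)
```

The allow-list matters because a name spliced into statement text bypasses the protection that parameters give against SQL injection.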

SQL Server query behaves differently interactively than over JDBC - omits some tables

I've been trying to retrieve constraint data for all tables using an SQL query over JDBC.
My test database has only 3 tables.
If I execute the query interactively using MS SQL Server Management Studio, I get all the results that I expect (i.e., 3 rows, since there's a primary key in each of the 3 tables).
If I use the JDBC method to specifically retrieve primary keys (as below), then I also correctly get 3 results:
ResultSet rs = dbmd.getPrimaryKeys(jdbcCatalog, jdbcSchema, jdbcTableName);
If I use the exact same SQL statement (that I used interactively and got 3 results back) as a query over JDBC (using executeQuery() shown below) then I only get 1 result instead of the expected 3.
String query =
    "select PK.CONSTRAINT_NAME, PK.TABLE_SCHEMA, PK.TABLE_NAME " +
    "from information_schema.TABLE_CONSTRAINTS PK";
ResultSet rs = null;
try {
    Statement stmt = con.createStatement();
    rs = stmt.executeQuery(query);
} catch (Exception exception) {
    // Exception handler code
}
while (rs.next()) {
    // Only executes once.
}
I would be very grateful if someone could explain why the SQL query over JDBC is performing differently to the exact same SQL query performed interactively. Could it be a security/ownership issue? (although the JDBC call getPrimaryKeys() doesn't suffer this)
Thanks.
I don't see where you're setting your database context, but I suspect that that's the issue. As a test, you can change your statement to "select db_name()" and see what it returns. If it's not the database that you think that you should be in, that's your issue.

Resources