SAS query issue on external DBMS TABLE where Column Name has space

Through SAS/ACCESS, I can successfully run data steps querying external DBMS tables. E.g.,
Data OutTable;
Set ExternalDBMS.Table1;
Where Var1 ='abc';
Run;
However, when a column name contains a space, the query fails even when I use a name literal (''n).
One example is shown below:
Data OutTable;
Set ExternalDBMS.Table1;
Where 'Var 2'n ='abc';
Run;
ERROR: CLI open cursor error: [SAS][ODBC SQL Server Wire Protocol driver][Microsoft SQL Server]Incorrect syntax near the keyword 'Function'.
A further attempt with the SAS option validvarname=v7, to standardize the variable names that contain spaces, still produced the same error.
After setting the SAS option sastrace=',,,d', I found that SAS/ACCESS submitted a statement like this to SQL Server:
SELECT Var 1, .....
FROM schema1.Table1
WHERE (Var 1 ='abc' );
This statement fails on the SQL Server side because Var 1 is neither quoted nor bracketed.
One way to fix it is to use an explicit pass-through query. I'm wondering whether there are any other ways to solve this problem.
Thanks in advance!

When using an explicit pass-through query, put a set of square brackets around the column name. This is the same way you would write the code in SSMS.
SELECT [Var 1], ...
FROM schema1.Table1
WHERE ([Var 1] ='abc' );
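The bracketing rule can be sketched in a few lines of Python. This is a minimal illustration, not SAS code: the helper name bracket_identifier is hypothetical, the doubling of a closing bracket follows the T-SQL convention, and SQLite is used only as a stand-in because it happens to accept bracket-quoted identifiers.

```python
import sqlite3


def bracket_identifier(name: str) -> str:
    """Wrap an identifier in square brackets (hypothetical helper).

    A closing bracket inside the name is doubled, as T-SQL expects.
    """
    return "[" + name.replace("]", "]]") + "]"


# In-memory SQLite database, used only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 ([Var 1] TEXT, Var2 INTEGER)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?)", [("abc", 1), ("xyz", 2)])

# The column name is bracketed; the value is passed as a bound parameter.
column = bracket_identifier("Var 1")
query = f"SELECT {column}, Var2 FROM Table1 WHERE {column} = ?"
rows = conn.execute(query, ("abc",)).fetchall()
print(query)
print(rows)
```

Without the brackets, the engine would read Var and 1 as two separate tokens, which is the kind of syntax error the trace in the question points to.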

Related

How to access information schema using a pypika query?

I'm trying to get the names of the columns from a table in an Azure SQL database using a PyPika SQL query, but keep running into trouble. Here's the code I'm using to generate the query:
def dbView(table):
    infoSchema = ppk.Table("INFORMATION_SCHEMA.COLUMNS")
    return ppk.MSSQLQuery.from_(infoSchema).select(infoSchema.COLUMN_NAME).where(infoSchema.TABLE_NAME == table)
I created another function that uses the PyODBC library to get the SQL from the query, execute it against the database, and return all the rows:
def getData(query: ppk.Query):
    '''
    Execute a query against the Azure db and return
    every row in the results list.
    '''
    print("QUERY: ", query.get_sql())
    conn = getConnection()
    with conn.cursor() as cursor:
        cursor.execute(query.get_sql())
        return cursor.fetchall()
I know the getData() function works because when I pass it a simple select query, everything works correctly. However, when I try to use the query generated by pypika above, I get the following error:
pyodbc.ProgrammingError: ('42S02', "[42S02] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name 'INFORMATION_SCHEMA.COLUMNS'. (208) (SQLExecDirectW)")
To make sure this wasn't just some kind of permissions error, I wrote the following query by hand and executed it using the getData() function and it worked just fine:
SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'Validation'
I also printed out the query that pypika generated to the console. The only difference appears to be the addition of some double quotes:
SELECT "COLUMN_NAME" FROM "INFORMATION_SCHEMA.COLUMNS" WHERE "TABLE_NAME"='Validation'
What am I doing wrong? For some reason, this error appears to be limited to specifically the information schema table, because I have used similar queries several other times in my code without issue. I know I can just use the query I wrote by hand, but the point of using PyPika was to make all my SQL queries more readable and reusable - it'd be nice to understand why it doesn't work in this very specific situation.
Thanks!

It apparently has an API to schema-qualify tables.
from pypika import Query, Schema
views = Schema('views')
customers = views.customers  # table qualified with the 'views' schema
q = Query.from_(customers).select(customers.id, customers.phone)
https://pypika.readthedocs.io/en/latest/2_tutorial.html#tables-columns-schemas-and-databases
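The underlying problem is that "INFORMATION_SCHEMA.COLUMNS" was quoted as one identifier instead of as a schema and a table. Here is a rough, stdlib-only sketch of that difference. It uses SQLite's main schema and sqlite_master table as stand-ins for INFORMATION_SCHEMA and COLUMNS, which is an assumption made purely so the example runs without a SQL Server connection. With PyPika, Schema("INFORMATION_SCHEMA").COLUMNS should give the two-part form, though I have not verified that against Azure SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Validation (id INTEGER)")

# Whole dotted name quoted as ONE identifier: the engine looks for a table
# literally called "main.sqlite_master", which does not exist.
try:
    conn.execute('SELECT name FROM "main.sqlite_master"')
    single_identifier_error = None
except sqlite3.Error as exc:
    single_identifier_error = str(exc)

# Schema and table quoted separately: resolves to the real catalog table.
rows = conn.execute(
    'SELECT name FROM "main"."sqlite_master" WHERE type = ?', ("table",)
).fetchall()

print("one identifier :", single_identifier_error)
print("two identifiers:", rows)
```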

I have a CSV file connected as a database and I want to update some (not all) column values using a query from Automation Anywhere

I have connected a CSV file as a database in the Automation Anywhere tool and I want to update certain column values using an update query.
Update [$vOutputFileName$]
Set [column 7] = 88
Where [column1] = "5744543"
When I use this query, I get an error:
[Microsoft][ODBC Text Driver] Too few parameters. Expected 1.
Please help.

Try ' instead of " around the string value.
This is from MS Reference.
Other incorrect formatting, such as extra spaces in the SQL statement, can also cause this error.

Upsert into SQL Server from SAS

I've got several datasets which need to be upserted into a SQL Server database from SAS (my environment uses SAS DI 4.9).
The default table loader transformation that comes packaged with SAS DI offers an Update/Insert load style, with options to match by SQL set, column, or index. None of these works for me, instead throwing the error
ERROR: CLI execute error: [SAS][ODBC SQL Server Wire Protocol driver][Microsoft SQL Server]A cursor with the name
'SQL_CUR608F0C44282B0000' does not exist.
This SAS note indicates that this issue may be related to the version of the DataDirect driver and that there are workarounds, but the workaround for the version of SAS running in my environment causes poor read performance (which isn't acceptable for my needs). The environment is administered by IT.
What I'd like to do is leverage SAS DI's custom transformation abilities to build something that works the way the Table Loader transformation should have for users with my setup. This would entail some SQL pass-through which uses an update + insert approach, but where the column and table names are programmatically determined from the inputs and outputs to the transformation, and the match columns are specified by the user as with the default transformation.
This requires some serious macro magic.
Here's what I've tried for just the update portion (with anonymized info in [ square brackets ]):
%let conn = %str([my libname]);
%let where_clause = &match_col0 = &match_col0;
%macro custom_upsert;
  data _null_;
    put 'proc sql;';
    put 'connect to ODBC(&conn);';
    put "execute(update &_OUTPUT";
    %do i=1 %to &_OUTPUT_col_count;
      put '&&_OUTPUT_col_&i_name = &&_OUTPUT_col_&i_name';
    %end;
    put 'from &_OUTPUT join &_INPUT on';
    put 'where &where_clause';
    put ') by ODBC;';
    put 'quit;';
  run;
%mend;
%custom_upsert;
But this is failing with errors about unbalanced quotation marks and the quoted string exceeding 262 characters.
How can I get this working as intended?
EDIT
Here is the SQL server code that I am ultimately trying to get at with my SAS code, with the major difference here being that the SQL code references two SQL server tables but in reality I'm trying to update from a SAS table:
begin
    update trgt_tbl
    set col1 = col1
        , ...
        ,coln = coln
    from trgt_tbl
    join upd_tbl
        on trgt_tbl.match_col = upd_tbl.match_col;
    insert into trgt_tbl
    select * from
        (select
            col1
            , ...
            ,coln
        from upd_tbl) as temp
    where not exists
        (select 1 from trgt_tbl
        where match_col = temp.match_col);
end

The macro could generate the SQL code directly instead of writing the desired code to the log (which is what put does). Alternatively, you could put the code to a file and then submit that file via %include. The code generated into the file still contains unresolved macro references (&&) because the put strings are single-quoted, so those macro variables must exist in scope at %include time.
%macro myupsert;
  filename myupsert 'c:\temp\passthrough-upsert.sas';
  data _null_;
    file myupsert;
    …
    /* same puts */
    …
  run;
  %include myupsert;
  filename myupsert;
%mend;
%myupsert;
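Independent of the SAS macro mechanics, the generation step itself (column list plus match columns in, update + insert text out) can be sketched as below. This is Python rather than SAS, and only an illustration under assumptions: build_upsert_sql and its arguments are hypothetical names, the SET clause qualifies the right-hand side with the source table (the SQL in the question leaves that ambiguous), and both tables are assumed to already be visible to SQL Server.

```python
def build_upsert_sql(target, source, columns, match_cols):
    """Return T-SQL text for an update-then-insert upsert (hypothetical helper).

    Names are interpolated as-is, so they must come from trusted metadata
    (for example the transformation's own column list), never from end users.
    """
    # Match columns are used for joining, so they are not updated.
    set_clause = ",\n    ".join(
        f"{c} = {source}.{c}" for c in columns if c not in match_cols
    )
    join_on = " and ".join(f"{target}.{m} = {source}.{m}" for m in match_cols)
    not_exists_on = " and ".join(f"{target}.{m} = temp.{m}" for m in match_cols)
    col_list = ", ".join(columns)
    return (
        f"update {target}\n"
        f"set {set_clause}\n"
        f"from {target}\n"
        f"join {source} on {join_on};\n"
        f"insert into {target} ({col_list})\n"
        f"select {col_list} from {source} as temp\n"
        f"where not exists\n"
        f"    (select 1 from {target} where {not_exists_on});"
    )


sql = build_upsert_sql(
    "trgt_tbl", "upd_tbl", ["match_col", "col1", "col2"], ["match_col"]
)
print(sql)
```

The same shape carries over to a macro: loop over the column metadata once to build the SET list and once to build the column list, then emit the two statements inside a single execute( ... ) block.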

Dynamically Create Views from parameters/variables SQL SSIS

I have a table that contains just under a million rows. I'm building a form using SSIS that asks for user input and uses the values as parameters to build a view from source data. I'm having trouble getting SSIS to create the view from a variable.
The purpose of this 'tool' is to provide a dialogue that programmatically builds a view, and later an update statement, based on parameters defined via a form that will execute an SSIS package. A number of the people on my team know no SQL, so this removes the need for any SQL knowledge. Creating an entirely standalone app is not ideal, as it would require too much additional overhead on my side and would deviate from a number of our existing processes that currently use SSIS/SQL to achieve similar results.
With that, here is what I've tried so far.
I have an SSIS package that contains an 'Execute SQL Task'. This task brings up a form with 5 inputs (variables): var1, var2, var3, var4, var5. Some variables are strings, others are doubles, ints, etc. (they all vary). You populate the fields and hit OK.
These variables are passed to an 'Execute Package Task'. Inside this package (Package B), the variables are used in an 'Execute SQL Task'. This task attempts to take the user's input and create a view with a where clause containing the 4 other variables.
Example:
Create View ? AS Select col1,col2,col3,col4 WHERE
col1 = ?
AND col2 =?
AND col3 =?.........
First, it appears that using ? in the CREATE VIEW statement is invalid.
The error is:
Error: 0x0 at Build_Query: Incorrect syntax near '@P1'.
Error: 0xC002F210 at Build_Query, Execute SQL Task: Executing the query "CREATE VIEW ? as Select * from S_t_equip_template
..." failed with the following error: "The batch could not be analyzed because of compile errors.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established
correctly.
Task failed: Build_Query
If I use the create view variable as an expression and remove the variables/parameters for the where statements, I can create the view no problem.
However the where statements throw errors once I add them back in. I've tried evaluating these as an expression in the 'Execute SQL Task' but as these are of various types I get the error:
[Execute SQL Task] Error: Executing the query "
CREATE VIEW testing AS SELECT
P.label,P.uniq..." failed with the following error: "The metadata could not
be determined because statement 'CREATE VIEW testingagain AS SELECT
P.label,P.uniqueid,C.label as Child_Label,C.uniqueid as Child_uni' does not
support metadata discovery.". Possible failure reasons: Problems with the
query, "ResultSet" property not set correctly, parameters not set correctly,
or connection not established correctly.
I have no idea what is going on. Any help would be appreciated.
I've googled the error and found some info, but the other use cases are so different that it's difficult to understand the actual cause or find another workaround.
As requested (simplified example):
I've created a package variable (datatype string) called: View_Name
Execute SQL Task:
CREATE VIEW @[User::View_Name] AS SELECT
* from table1
where col1 = 100;
Specifically it does not like that I use a variable here.
If I set the View name everything works until: I move on to my Where clauses that contain variables.
Create a variable called type (datatype int)
I map the variable/parameter in my sql task
Example:
CREATE VIEW tempTable AS SELECT
* from table1
where col1 = ?;
This won't work, same error.
If I attempt to do the above via an expression or expressions, I get the following error:
[Execute SQL Task] Error: Executing the query "CREATE VIEW test_45678 AS SELECT P.label,P.uniquei..." failed with the following error: "Must declare the scalar variable "@".". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
This generally seems to mean that the variables cannot be evaluated this way. I'm guessing I'd need to evaluate each piece individually and build the expression piece by piece. That's fine, but very inefficient and not maintainable.

I had a fundamental misunderstanding in how I was evaluating my expressions: there were syntax issues, and each variable has to be cast to a string.
My SQLStatement variable (excerpt):
ON C.pid = P.ID
where
C.width >="+(DT_WSTR, 8)@[User::Width] +"-"+ (DT_WSTR, 8)@[User::Range]+........
The final expression looks like so:
"Create View "+@[User::View_Name] + " AS SELECT " + @[User::SQLStatement]
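The same idea, building the full CREATE VIEW text with every value cast to a string, can be sketched outside SSIS. The following is a minimal Python illustration under assumptions: SQLite stands in for SQL Server (it also refuses bound parameters inside a view definition), and the table equip, the helper build_create_view, and the sample values are all hypothetical.

```python
import re
import sqlite3


def build_create_view(view_name: str, width: float, range_: float) -> str:
    """Build CREATE VIEW text by casting each value to a string."""
    # Object names cannot be bound as parameters, so validate before splicing.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", view_name):
        raise ValueError(f"unsafe view name: {view_name!r}")
    # float() rejects non-numeric input before it reaches the SQL text.
    return (
        f"CREATE VIEW {view_name} AS SELECT * FROM equip "
        f"WHERE width >= {float(width)!r} - {float(range_)!r}"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE equip (label TEXT, width REAL)")
conn.executemany(
    "INSERT INTO equip VALUES (?, ?)", [("a", 5.0), ("b", 9.0), ("c", 12.0)]
)

# A bound parameter inside CREATE VIEW is rejected, like the '?' attempt above.
try:
    conn.execute("CREATE VIEW bad_view AS SELECT * FROM equip WHERE width >= ?", (8.0,))
    parameter_error = None
except sqlite3.Error as exc:
    parameter_error = str(exc)

statement = build_create_view("wide_equip", 10, 2)
conn.execute(statement)
rows = conn.execute("SELECT label FROM wide_equip ORDER BY label").fetchall()
print(statement)
print(rows)
```

Because the values end up inside the statement text, validating or casting each one before concatenation matters more here than it does with ordinary parameter mapping.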

Execute sql task mapping variables in ssis

INSERT INTO [DEV_BI].dbo.[DimAktivitet]([Beskrivning],[företag],[Projektnummer],[Aktivitet],
loaddate)
SELECT NULL,
a.DATAAREAID,
a.PROJID,
a.MID_ACTIVITYNUMBER,
GETDATE()
FROM [?].dbo.[v_ProjCostTrans_ProjEmplTrans] a
LEFT OUTER JOIN [DEV_BI] .dbo.[DimAktivitet] b ON a.MID_ACTIVITYNUMBER = b.Aktivitet
AND a.DataAreaID = b.företag
AND a.ProjID = b.Projektnummer
WHERE b.Aktivitet_key IS NULL
I have the SQL code above in an Execute SQL Task, and in the parameter mapping I have mapped a variable named User::connectionstring with data type NVARCHAR and parameter name 0. I'm getting the following error:
[Execute SQL Task] Error: Executing the query "insert into [DEV_BI].dbo.[DimAktivitet]([Beskrivni..." failed with the following error: "Invalid object name '?.dbo.v_ProjCostTrans_ProjEmplTrans'.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
Could someone please help me solve this?

It appears you are trying to change the database based on a variable. In the Execute SQL Task, parameters can only stand in for values, such as filters in the WHERE clause, not for object names such as a database or table. This behavior is described in this TechNet article. For instance, you could do this:
insert into [DEV_BI].dbo.[DimAktivitet]([Beskrivning],[företag],[Projektnummer],[Aktivitet],loaddate)
select null,a.DATAAREAID,a.PROJID,a.MID_ACTIVITYNUMBER,GETDATE() from
[DEV_BI].dbo.[v_ProjCostTrans_ProjEmplTrans] a
left outer join
[DEV_BI] .dbo.[DimAktivitet] b
on a.MID_ACTIVITYNUMBER = b.Aktivitet AND a.DataAreaID = b.företag AND a.ProjID = b.Projektnummer
where b.Aktivitet_key is null
AND b.SomeFilterCriteria = ?;
If you really want to vary the database based on a variable, then you have three options:
Vary the Connection Manager connection string to your database connection based on an expression as described in a blog post. This is the best solution if you are only changing the database and nothing else.
Generate the entire SQL code as a variable and execute that variable as the SQL command, instead of passing variables to the Execute SQL Task. This is described in this blog post under the section "Passing in the SQL Statement from a Variable".
Create a stored procedure, pass the parameter to the stored procedure, and let it generate the SQL it needs on the fly.
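The second option (generating the whole statement as a string) can be sketched as follows. This is a Python illustration rather than SSIS, with SQLite attached databases standing in for SQL Server databases; the names dev_a, dev_b, proj_trans and build_select are hypothetical. The database name is checked against an allowlist before it is spliced into the text, while ordinary values still go through a parameter.

```python
import sqlite3

ALLOWED_DATABASES = {"dev_a", "dev_b"}  # hypothetical database names


def build_select(database: str) -> str:
    """Return a SELECT whose database prefix comes from a variable."""
    if database not in ALLOWED_DATABASES:
        raise ValueError(f"unknown database: {database!r}")
    return f"SELECT projid FROM {database}.proj_trans WHERE projid > ?"


# Autocommit mode, because ATTACH is not allowed inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
for name, value in (("dev_a", 1), ("dev_b", 2)):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {name}")
    conn.execute(f"CREATE TABLE {name}.proj_trans (projid INTEGER)")
    conn.execute(f"INSERT INTO {name}.proj_trans VALUES (?)", (value,))

# A parameter marker in place of the database name is rejected outright.
try:
    conn.execute("SELECT projid FROM ?.proj_trans", ("dev_a",))
    marker_error = None
except sqlite3.Error as exc:
    marker_error = str(exc)

rows = conn.execute(build_select("dev_b"), (0,)).fetchall()
print("marker error:", marker_error)
print("rows from dev_b:", rows)
```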