Run SQL statement from file to create data in SAS - sql-server

I have very little experience in SAS. I do have experience in SQL.
I want to do the following:
- Use a SQL statement that is stored in a text file to import data into SAS.
What works is to copy and paste the SQL server query and run it as a pass-through query in SAS. I get the data (after a few minutes).
But I would like to be able to manage and develop the SQL script in SSMS, and store the script in a sql file. So I tried the following:
proc sql;
connect to ODBC("dsn=DatabaseOfInterest");
create table NewDataSet as select * from connection to odbc(
%include 'C:\sqlscript.sql';
);
quit ;
This does not work and creates the following error:
ERROR: CLI prepare error:
[Microsoft][ODBC SQL Server Driver][SQL Server]Incorrect syntax near '%'.
Is there a way to achieve this?

I don't know if there's a truly clean way to work around this. The issue is that the pass-through connection sends %include verbatim to the SQL Server parser, which is of course not what you intend.
It will, however, correctly resolve macros and macro variables, so you can read your SQL command into a macro variable and use it that way. One way to do that is below.
filename tempfile temp; *imaginary file - this would be your SQL script;
data _null_; *creating a mocked up SQL script file;
file tempfile;
put "select * from country";
run;
data _null_; *reading the SQL script into a variable, hopefully under 32767?;
infile tempfile recfm=f lrecl=32767 pad;
input #1 sqlcode $32767.;
call symputx('sqlcode',sqlcode); *putting it into a macro variable;
run;
proc sql; *using it;
connect to oledb(init_string=&dev_string);
select * from connection to oledb(
&sqlcode.
);
quit;
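If the script spans several lines, the same idea can be applied line by line. This is a minimal sketch, not a definitive implementation. It assumes the script sits at C:\sqlscript.sql as in the question, totals under 32,767 characters, and contains no `--` line comments (joining the lines with spaces would turn the rest of the statement into a comment). The fileref `sqlfile` and the variable names are made up for illustration.

```sas
filename sqlfile 'C:\sqlscript.sql';   /* path taken from the question */

data _null_;
  length sqlcode $32767;
  retain sqlcode;                      /* keep the text across iterations */
  infile sqlfile lrecl=32767 truncover end=last;
  input line $char32767.;              /* read one line of the script */
  sqlcode = catx(' ', sqlcode, line);  /* append it, separated by a blank */
  if last then call symputx('sqlcode', sqlcode);
run;

proc sql;
  connect to odbc("dsn=DatabaseOfInterest");
  create table NewDataSet as
    select * from connection to odbc (
      &sqlcode
    );
  disconnect from odbc;
quit;
```

Note that the macro processor will try to resolve any & or % that appears in the SQL text when &sqlcode is used, so a script containing those characters would need extra care.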

The file containing your SQL code is C:\sqlscript.sql. I'll assume it looks something like this:
select * from mytable;
Edit the file so that it now looks like this...
%macro sqlscript;
select * from mytable;
%mend;
... and then rename the file to C:\sqlscript.sas.
Finally, change your proc sql code to look like this:
options sasautos = ("c:\", sasautos);
proc sql;
connect to ODBC("dsn=DatabaseOfInterest");
create table NewDataSet as select * from connection to odbc
(
%sqlscript;
);
quit;
Explanation: Although the %include statement starts with a % sign and looks like macro code, it is a SAS statement, so it can't be substituted at an arbitrary point in your code. It's really meant to be issued outside of PROC steps and DATA steps (it probably shouldn't even have a % sign in front of it, but unfortunately that's how SAS designed it). That's why it won't work.
SAS provides the ability to search for macros outside of the program currently being run. If you call a macro that isn't defined in your current SAS program (in this case %sqlscript), SAS looks for it in the list of pathnames specified in the SASAUTOS option. If it finds a file in one of the SASAUTOS pathnames whose name exactly matches the macro it's searching for, and that file contains a definition for the macro, SAS will compile and run that macro. In the above example, the macro simply substitutes in the SQL code contained within it.
In the options sasautos= statement, we are simply prepending the c:\ path to the existing list of pathnames in SASAUTOS. SAS searches the pathnames in order, and I'm assuming we want our custom macros to override any existing macros if there happens to be a conflict. You only need to specify options sasautos= once per SAS session, so don't copy/paste it before every proc sql statement.
See the SAS documentation for the SASAUTOS option. These are also known as autocall macros, so searching for that term should turn up some useful hits too.
Also, obviously I don't recommend storing code in c:\, so adjust as necessary. A note to non-Windows users: on UNIX the autocall file name must be all lowercase (sqlscript.sas), so be consistent!

Based on feedback from my previous answer I've provided an alternate approach below that should better address your exact needs.
The code below shows how the final program will 'work' once it is all combined together. We are going to take this code and split it into different files as indicated by the comments:
%macro myQuery; /* FILE 1 - header.sas */
select * from myTable; /* FILE 2 - query.sql */
%mend; /* FILE 3 - footer.sas */
/* BEGIN FILE 4 - main.sas */
proc sql;
connect to ODBC("dsn=DatabaseOfInterest");
create table NewDataSet as
select *
from connection to odbc
(
%myQuery;
);
quit ;
/* END FILE 4 */
FILE1 - "header.sas" will look like:
%macro myQuery;
FILE2 - "query.sql" will look like:
select * from myTable;
FILE3 - "footer.sas" will look like:
%mend;
FILE4 will become:
%include "c:\header.sas"
"c:\query.sql"
"c:\footer.sas"
;
proc sql;
connect to ODBC("dsn=DatabaseOfInterest");
create table NewDataSet as
select *
from connection to odbc
(
%myQuery;
);
quit ;
You can see that we are defining the macro using the include statements. The query, which is the body of the macro, is kept separate in its own .sql file. This should allow you to continue to edit/submit your queries via both SAS and your favorite SQL editor. The header and footer files can be re-used if you have multiple query files.

Related

Upsert into SQL Server from SAS

I've got several datasets which need to be upserted into a SQL server database from SAS (my environment uses SAS DI 4.9).
The default table loader transformation that comes packaged with SAS DI offers an Update/Insert load style, with options to match by SQL set, column, or index. None of these works for me, instead throwing the error
ERROR: CLI execute error: [SAS][ODBC SQL Server Wire Protocol driver][Microsoft SQL Server]A cursor with the name
'SQL_CUR608F0C44282B0000' does not exist.
This SAS note indicates that this issue may be related to the version of the DataDirect driver and that there are workarounds, but the workaround for the version of SAS running in my environment causes poor read performance (which isn't acceptable for my needs). The environment is administered by IT.
What I'd like to do is leverage SAS DI's custom transformation abilities to build something that works the way the Table Loader transformation should have for users with my setup. This would entail some SQL pass-through which uses an update + insert approach, but where the column and table names are programmatically determined from the inputs and outputs to the transformation, and the match columns are specified by the user as with the default transformation.
This requires some serious macro magic.
Here's what I've tried for just the update portion (with anonymized info in [ square brackets ]):
%let conn = %str([my libname]);
%let where_clause = &match_col0 = &match_col0;
%macro custom_upsert;
data _null_;
put 'proc sql;';
put 'connect to ODBC(&conn);';
put "execute(update &_OUTPUT";
%do i=1 %to &_OUTPUT_col_count;
put '&&_OUTPUT_col_&i_name = &&_OUTPUT_col_&i_name';
%end;
put 'from &_OUTPUT join &_INPUT on';
put 'where &where_clause';
put ') by ODBC;';
put 'quit;';
run;
%mend;
%custom_upsert;
But this is failing with errors about unbalanced quotation marks and the quoted string exceeding 262 characters.
How can I get this working as intended?
EDIT
Here is the SQL server code that I am ultimately trying to get at with my SAS code, with the major difference here being that the SQL code references two SQL server tables but in reality I'm trying to update from a SAS table:
begin
update trgt_tbl
set col1 = col1
, ...
,coln = coln
from trgt_tbl
join upd_tbl
on trgt_tbl.match_col = upd_tbl.match_col;
insert into trgt_tbl
select * from
(select
col1
, ...
,coln
from upd_tbl) as temp
where not exists
(select 1 from trgt_tbl
where match_col = temp.match_col);
end
The macro could generate the SQL code directly instead of writing the desired code to the log (which is what put does when no file statement is in effect). Alternatively, you could put the code to a file and then submit that file via %include. Because the put statements are single-quoted, the generated file still contains unresolved macro references (&&...), so those macro variables must exist in scope at the time of the %include.
%macro myupsert;
filename myupsert 'c:\temp\passthrough-upsert.sas';
data _null_;
file myupsert;
…
/* same puts */
…
run;
%include myupsert;
filename myupsert;
%mend;
%myupsert;
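For the first option (letting the macro generate the SQL text directly), a minimal sketch of building the SET clause with a %do loop follows. The macro variables tgt, src, match_col, col_count, col1 and col2 and the macro name build_set are hypothetical stand-ins for whatever the transformation supplies, and the sketch assumes both tables live on the SQL Server side, as in the T-SQL shown in the question. It only prints the generated statement with %put so it can be inspected before being sent anywhere.

```sas
/* Hypothetical inputs; in SAS DI these would come from the transformation */
%let tgt       = trgt_tbl;
%let src       = upd_tbl;
%let match_col = match_col;
%let col_count = 2;
%let col1      = col1;
%let col2      = col2;

%macro build_set;
  %local i;
  %do i = 1 %to &col_count;
    %if &i > 1 %then ,;                 /* comma before every item but the first */
    &tgt..&&col&i = &src..&&col&i
  %end;
%mend build_set;

/* Write the generated UPDATE statement to the log for inspection */
%put NOTE: update &tgt set %build_set from &tgt join &src on &tgt..&match_col = &src..&match_col;
```

The same text could be placed inside execute( ... ) by odbc once it looks right. When an indexed name carries a suffix after the index, the reference is written as &&prefix&i._suffix; the period ends the &i reference so that the suffix is not read as part of the index variable's name.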

Mapping an SQL server library in SAS

I'm trying to map the source of a dataset I usually access using SQL pass-through as a library.
Below is the SQL pass-through code I use to access the table:
proc sql noprint;
connect to ODBC (DSN='Location1' );
create table test as
Select *
from Connection to ODBC
(
Select *
from CentralDB.dbo.Table_i_want
)
;
disconnect from ODBC
;
quit;
And below is the libname statement I tried writing
LIBNAME mylib ODBC DATASRC='Location1' SCHEMA=dbo ;
The above statement doesn't map to the correct location. Where do I put the CentralDB part?
Can anyone help me create a libname statement out of this?
Thanks,
libname mydblib odbc
noprompt="uid=testuser;pwd=testpass;dsn=sqlservr;"
stringdates=yes;
proc print data=mydblib.customers;
where state='CA';
run;
If you have a metadata server: it is always better to ask your SAS Admin to register the library and tables you want in SAS metadata / Folders, so you will have a permanent and standard way to access them.
Quick Solution:
Use the libname below and update the schema, user and password:
LIBNAME mylib ODBC DATASRC=location_1 SCHEMA=dbo USER=sql_user PASSWORD="xxx" ;
The step below will list the tables in the library
proc datasets lib=mylib ;
quit;
I always opt to build a connection string. I don't have an example on hand, but you should be able to build a string that specifies both the database (CentralDB) and the schema (dbo).
It will look something like this:
libname mylib odbc noprompt='Driver={SQL Server};Server=Your_Server_Name;Database=CentralDB;Schema=dbo;Uid=Your_Username;Pwd=Your_Password;';
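A variation on the connection-string approach is to name the database in the string and give the schema through the SCHEMA= LIBNAME option instead. The following is a sketch under assumptions: the server name is a placeholder, Trusted_Connection=yes assumes Windows authentication is available (swap in Uid/Pwd otherwise), and the table name is the one from the question.

```sas
/* Database goes in the connection string, schema in the LIBNAME option */
libname mylib odbc
  noprompt="Driver={SQL Server};Server=Your_Server_Name;Database=CentralDB;Trusted_Connection=yes;"
  schema=dbo;

/* List the tables SAS can see through the libref */
proc datasets lib=mylib;
quit;

/* Equivalent of the pass-through query in the question */
proc sql;
  create table test as
    select * from mylib.Table_i_want;
quit;
```

If proc datasets lists the expected tables, the libref is pointing at CentralDB.dbo.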

How to pass macro variable to PROC SQL on IN statement in WHERE clause on MS SQL Server

I have a table in MS SQL Server that looks like:
ID, Code
01, A
02, A
03, B
04, C
...
and is defined in SAS as
LIBNAME MSSQLDB ODBC
CONNECTION=SHAREDREAD
COMPLETE='Description=OIPE DW (Dev);DRIVER=SQL Server Native Client 11.0;SERVER=Amazon;Trusted_Connection=Yes;DATABASE=OIPEDW_Dev;'
SCHEMA='dbo'
PRESERVE_TAB_NAMES=YES
PRESERVE_COL_NAMES=YES;
I have a SAS dataset that has records of the same format as MSSQLDB (ID and Code variables) but is just a subset of the full database.
I would like to do the following:
PROC SQL NOPRINT;
/* If SASDS contains just codes A and B, CodeVar=A B
SELECT DISCTINCT CODE INTO :CodeVar SEPARATED BY ' ' FROM SASDS;
QUIT;
/* seplist is a macro that wraps each code in a quote */
%LET CodeInVar=%seplist( &CodeVar, nest=%STR(") );
PROC SQL;
DELETE * FROM MSSQLDB WHERE CODE IN (&CodeInVar);
/* Should execute DELETE * FROM MSSQL WHERE CODE IN ('A','B');
QUIT;
The problem is this generates a syntax error on the values in the &CodeInVar macro variable.
Any idea how to pass the macro variable value to SQL Server in the IN statement?
I think you have a few problems here; hopefully some of them are just transcription errors.
First off, this will not do anything:
PROC SQL;
DELETE * FROM MSSQLDB WHERE CODE IN (&CodeInVar);
/* Should execute DELETE * FROM MSSQL WHERE CODE IN ('A','B');
QUIT;
MSSQLDB is your libname, not the table; you need to reference it as MSSQLDB.tablename here. Perhaps that's just a copying error.
Fundamentally there's nothing explicitly wrong with what you've typed. I would suggest first identifying if there are any problems with your macro variable. Put a %put statement in there:
%put &codeinvar.;
See what that outputs. Is it what you wanted? If not, then fix that part (the macro, presumably).
I would say that there are a lot of better ways to do this. First off, you don't need to add a macro to add commas or quotes or anything.
PROC SQL NOPRINT;
/* If SASDS contains just codes A and B, CodeVar='A','B' */
SELECT DISTINCT cats("'",CODE,"'") INTO :CodeVar SEPARATED BY ',' FROM SASDS;
QUIT;
That should get you &codevar precisely as you want (i.e., 'A','B').
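Putting the pieces together, a minimal sketch of the whole flow might look like the following. The table name yourtable is a placeholder, and the sketch assumes Code is a character column with no embedded single quotes.

```sas
/* Build a quoted, comma-separated list such as 'A','B' */
proc sql noprint;
  select distinct cats("'", Code, "'")
    into :CodeInVar separated by ','
    from SASDS;
quit;

/* Check the value in the log before using it */
%put NOTE: CodeInVar=&CodeInVar;

/* Delete the matching rows through the libref */
proc sql;
  delete from MSSQLDB.yourtable
    where Code in (&CodeInVar);
quit;
```

Note that PROC SQL expects delete from rather than delete * from.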
Secondly, since you're using LIBNAME and not pass-through SQL, consider doing this entirely in SQL syntax rather than with a macro variable.
proc sql;
delete from MSSQLDB.yourtable Y where exists
(select 1 from SASDS S where S.code=Y.code);
quit;
That is sometimes faster, depending on the circumstances (it also could be slower). If code is something that has a high frequency, summarize it using PROC FREQ or a SQL query first.

How to Find Current Directory of itself with a TSQL Script File In SQL Server

I think my question is simple.
How can I find out where my query is running from
(that is, the location of the script file itself)?
Edit:
Thank you for your answer.
I need to import an XML file using my T-SQL script file, and I want to keep them together,
so wherever someone runs the T-SQL script file, it must know its own current directory in order to locate the XML file and import it. Thanks again!
You need a well known location where you can place XML files for the server to load. This could be a share on the SQL Server machine, or on a file server which the SQL Server service account has permissions to read from.
You then need a comment like this at the top of your script:
--Make sure you've placed the accompanying XML file on \\RemoteMachine\UploadShare
--Otherwise, expect this script to produce errors
Change \\RemoteMachine\UploadShare to match the well known location you've selected. Optionally, have the comment followed by 30-40 blank lines (or more comments), so that it's obvious to anyone running it that they might need to read what's there.
Then, write the rest of your script based on that presumption.
I found a simpler solution to my problem!
I import my XML file into a temp table once.
Then I write a SELECT query against that temp table, which contains my imported data, like this:
SELECT 'INSERT INTO MyTable VALUES (' + Col1 + ', ' + Col2 + ')' FROM MyImportedTable
Now I have an INSERT command for each of my imported records.
I save all of the INSERT commands in my script, so the script file is all I need wherever I go.

Fully Qualified Names from SQL Server in SAS

I need to be able to specify the schema that I want to access in SAS. I have used a connection string with the following schema=?? but SAS will not let me select or print the contents of any object in the named schema. Has anyone been able to write a PROC SQL statement selecting objects in a schema other than dbo?
Thank you,
SAS does not use fully qualified names from SQL Server, but you can direct SAS to a specific schema with the SCHEMA= data set option. The following example uses a libname as the connection to a SQL Server 2008 instance.
proc print data=myDBconn.v_Lots (SCHEMA=SAS);
WHERE Study_ID IS NOT NULL;
run;
proc print data=myDBconn.Drugs (SCHEMA=Pharmacy);
where _drug_id=1;
run;
proc sql;
create table myTest.drugs as
select * from myDBconn.drugs (SCHEMA=Pharmacy);
quit;

Resources