Fully Qualified Names from SQL Server in SAS - sql-server

I need to be able to specify the schema that I want to access in SAS. I have used a connection string with the following schema=?? but SAS will not let me select or print the contents of any object in the named schema. Has anyone been able to write a PROC SQL statement selecting objects in a schema other than dbo?
Thank you,

SAS does not use fully qualified names from SQL Server, but you can direct SAS to a specific schema with the SCHEMA= option. The following example uses a libname as the connection to a SQL Server 2008 instance.
proc print data=myDBconn.v_Lots (SCHEMA=SAS);
WHERE Study_ID IS NOT NULL;
run;
proc print data=myDBconn.Drugs (SCHEMA=Pharmacy);
where _drug_id=1;
run;
proc sql;
create table myTest.drugs as
select * from myDBconn.Drugs (SCHEMA=Pharmacy);
quit;
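The examples above assume the libref myDBconn has already been assigned. A minimal sketch of one way to do that, assuming the ODBC engine and a placeholder data source name MyDsn (neither is given in the original answer, and authentication options are omitted). Setting SCHEMA= on the LIBNAME statement makes that schema the default for the libref, so the data set option is only needed to override it:

```sas
/* MyDsn is a hypothetical ODBC data source name */
libname myDBconn odbc datasrc=MyDsn schema=Pharmacy;

/* Pharmacy is now the default schema for this libref */
proc print data=myDBconn.Drugs (obs=10);
run;

/* Override the default for a single table with the data set option */
proc print data=myDBconn.v_Lots (schema=SAS obs=10);
run;
```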

Related

Extracting SQL Server data into SAS with datetime filter

I'm pulling data into SAS from a SQL server database with a (SAS DI) extract transformation. The data in the table goes back several years and has a little over 16 million rows; I only need data for the last few years which should amount to roughly 2.6 million rows.
So my extract uses a datetime filter. The autogenerated proc sql looks as follows (after obfuscating the libref, table, and column names):
%let SYSLAST = %nrquote(LIBREF.SRC_TBL);
proc sql;
create table work.WFJVX4PU as
select
col1,
col2,
dt_col,
col3
from &SYSLAST
where dt_col >= &start_dt
;
quit;
The code works and returns the desired rows, but it takes far too long to run when compared with executing a similar query directly on the SQL server. When inspecting the statistics of the execution I discovered that the entire table was brought into SAS before the where clause was applied.
I remember reading that this will occur when attempting to use filters that apply functions to source data (a la datepart(dt_col) >= input("&start_date",date9.)) which is why I tried to pass the datetime used by the filter directly to the SQL engine. I also tried the "datepart" approach and got the same result.
Is there something else I should be doing to apply the filter server side before bringing the data into SAS? This isn't the experience I've had in the past when working with other database tables (e.g. Teradata, MySQL, Oracle, etc).
Additional Details:
The macro variable start_date is defined using the SAS DI prompt creation tool; the auto-generated code for this is
%let start_date_label = 730 days ago (July 02, 2017);
%let start_date_rel = D-730D;
%let start_date = 02Jul2017;
I then create the datetime macro variable start_dt by executing the following in the precode of the job:
%let start_dt = %sysfunc(dhms("&start_date"d,0,0,0));
Below is the libname statement (after obfuscation):
LIBNAME MYLIB SQLSVR CONNECTION=SHARED PRESERVE_TAB_NAMES=YES dbconinit="set ansi_warnings off;use MY_DB_NAME;set nocount on;" Datasrc=MY_DATA_SRC SCHEMA=dbo AUTHDOMAIN="my_sql_authentication" ;
If I modify the where clause to use a literal of the form '2019-06-25' then SAS throws the error
ERROR: Expression using greater than or equal (>=) has components that
are of different data types.
because the dt_col field is of type numeric (format datetime22.3) on the SAS instance of the SQL server table that is registered from the library. If I use a literal as follows:
dt_col >= '25JUN2019'D
then I do get the desired set of rows (despite comparison between a datetime field and a date literal), but the query still takes a long time to execute and the job statistics indicate SAS is still grabbing all 16 million rows to perform this task.
UPDATE
I'm having issues following Tom's advice below. If I execute the following code:
LIBNAME MYLIB SQLSVR CONNECTION=SHARED PRESERVE_TAB_NAMES=YES dbconinit="set ansi_warnings off;use MYDB;set nocount on;" Datasrc=MYSRC SCHEMA=dbo AUTHDOMAIN="my_sql_authentication" ;
/*---- Map the columns ----*/
proc datasets lib = work nolist nowarn memtype = (data view);
delete sql_psthru2;
quit;
proc sql;
create table work.sql_psthru2 as
select
col1,
col2,
dt_col,
col3
from MYLIB.MYTBL
where dt_col>= '25JUN2019'd
;
quit;
then I get data from the database, but if I execute
LIBNAME MYLIB SQLSVR CONNECTION=SHARED PRESERVE_TAB_NAMES=YES dbconinit="set ansi_warnings off;use MYDB;set nocount on;" Datasrc=MYSRC SCHEMA=dbo AUTHDOMAIN="my_sql_authentication" ;
proc sql;
connect using MYLIB;
create table work.sql_psthru as
select * from connection to MYLIB
(select
col1
,col2
,dt_col
from MYTBL
where dt_col >= '2019-06-25'
)
;
quit;
then I receive the error
ERROR: CLI error trying to establish connection: [DataDirect][ODBC
lib] Data source name not found and no default driver specified
immediately after the connect using MYLIB; line.
I've tried many variants of the explicit passthru which I've found all over the internet which I won't post here, but none worked.
An interesting side note is that I believe the reason the SAS DI statistics are indicating that all 16 million rows are returned is that the extract transformation auto-generates the following macro:
%macro etls_recordCheck;
%let etls_recCheckExist = %eval(%sysfunc(exist(MYLIB.MYTBL, DATA)) or
%sysfunc(exist(MYLIB.MYTBL, VIEW)));
%if (&etls_recCheckExist) %then
%do;
proc sql noprint;
select count(*) into :etls_recnt from MYLIB.MYTBL;
quit;
%end;
%mend etls_recordCheck;
%etls_recordCheck;
So the culprit in the long execution time is not that the full dataset is being returned to SAS (I removed the macro and the code still takes far too long to run).
You could try explicit pass-through, that is, writing the query in the remote database's own SQL. If you already have a libref named LIBREF defined, you can use it in the CONNECT statement in PROC SQL.
proc sql;
connect using libref ;
create table WFJVX4PU as
select * from connection to libref
(select
col1
,col2
,dt_col
,col3
from SRC_TBL
where dt_col >= &start_dt
)
;
quit;
Just make sure everything inside the () is valid syntax for that database system, including the value of the macro variable START_DT.
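As written in the question, &start_dt resolves to a SAS numeric datetime (seconds since 1960), which SQL Server would not read as a date. A minimal sketch of one way around that, assuming a new helper macro variable start_dt_sql (a hypothetical name) that holds a single-quoted yyyy-mm-dd literal:

```sas
/* start_date comes from the DI prompt, e.g. 02Jul2017 */
%let start_date = 02Jul2017;

/* Build a single-quoted ISO date literal: '2017-07-02' */
data _null_;
  call symputx('start_dt_sql',
               cats("'", put("&start_date"d, yymmdd10.), "'"));
run;

proc sql;
  connect using libref;
  create table WFJVX4PU as
  select * from connection to libref
    (select col1, col2, dt_col, col3
       from SRC_TBL
      where dt_col >= &start_dt_sql
    );
quit;
```

Because the quotes are part of the macro variable's value, the reference resolves before the query is sent, and SQL Server sees an ordinary string literal that it can compare to the datetime column.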

Mapping an SQL server library in SAS

I'm trying to map the source of a dataset I usually access using SQL pass-through as a library.
Below is the SQL pass-through code I use to access the table
proc sql noprint;
connect to ODBC (DSN='Location1' );
create table test as
Select *
from Connection to ODBC
(
Select *
from CentralDB.dbo.Table_i_want
)
;
disconnect from ODBC
;
quit;
And below is the libname statement I tried writing
LIBNAME mylib ODBC DATASRC='Location1' SCHEMA=dbo ;
The above statement doesn't map it to the correct location. Where do I put the CentralDB part?
Can anyone help me create a libname statement out of this?
Thanks,
libname mydblib odbc
noprompt="uid=testuser;pwd=testpass;dsn=sqlservr;"
stringdates=yes;
proc print data=mydblib.customers;
where state='CA';
run;
If you have a metadata server: it is always better to ask your SAS Admin to register the library and the tables you want in SAS metadata / Folders, so you will have a permanent and standard way to access them.
Quick Solution:
Use the libname below and update the schema, user and password
LIBNAME mylib ODBC DATASRC=location_1 SCHEMA=dbo USER="sql_user" PASSWORD="xxx" ;
The step below will list the tables in the library
proc datasets lib=mylib ;
quit;
I always opt to build a connection string. I don't have an example on hand, but you should be able to build a string that specifies both the database CentralDB and the schema dbo.
It will look something like this:
libname mylib odbc noprompt='Driver={SQLServer};Server=Your_Server_Name; Database=CentralDB;Schema=dbo;Uid=Your_Username; Pwd=Your_Password;';
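If you would rather keep the existing DSN, a variant worth trying is to name the database in the connection string and leave the schema to the LIBNAME option. This is only a sketch: it assumes the SQL Server ODBC driver honors a Database= keyword alongside DSN=, which may depend on the driver version, and it reuses the DSN Location1 from the question.

```sas
/* Assumption: the driver lets Database= override the DSN's default database */
libname mylib odbc
  noprompt="dsn=Location1;database=CentralDB;"
  schema=dbo;

/* List the tables visible through the libref to confirm the mapping */
proc datasets lib=mylib;
quit;
```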

WHERE clause operator requires compatible variables

The script below (fx is a SQL Server table):
LIBNAME SQL ODBC DSN='sql server' ;
DATA new;
SET SQL.fx;
WHERE repo_date = '2016-04-29 00:00:00.000';
RUN;
PROC PRINT DATA=new;
RUN;
returns an error in the SAS log:
191 WHERE repo_date = '2016-04-29 00:00:00.000';
ERROR: WHERE clause operator requires compatible variables.
192 RUN;
Where can I check which data conversion I need (in this case and in others)?
In SQL Server 2008R2 repo_date is a datetime column.
You are comparing a string to a numeric value.
So your datetime format is wrong (as Heinzi mentioned), and you also have to turn it into a datetime value by adding dt after the closing quote.
This should work:
WHERE repo_date ='29APR2016 00:00:00.000'dt;
If repo_date is a datetime and the time is not relevant, you can compare just the date:
WHERE datepart(repo_date) = '29APR2016'd;
SAS uses the ODBC libname engine to translate SAS DATA step code into SQL. Because you wrote the value as a quoted string, SAS assumes you are looking for the character string 2016-04-29 00:00:00.000. Instead, put it in a SAS datetime literal so SAS knows how to translate the value.
LIBNAME SQL ODBC DSN='sql server' ;
DATA new;
SET SQL.fx;
WHERE repo_date = '29APR2016:00:00:00'dt;
RUN;
PROC PRINT DATA=new;
RUN;
If you were doing SQL passthrough to directly run SQL on the server, then your above code would work.
proc sql noprint;
connect to odbc(datasrc='sql server');
create table new as
select * from connection to odbc
(select * from schema.fx where repo_date='2016-04-29 00:00:00.000');
disconnect from odbc;
quit;
Basically, what the above does is have SQL Server run the query, and then SAS simply pulls the result set over to itself. Think of it as using SAS as a proxy to run commands directly on the SQL server.
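To check what the libname engine actually sends to the server, you can turn on SAS/ACCESS tracing. A minimal sketch, reusing the SQL libref and fx table from the question; the generated SQL statement should appear in the SAS log:

```sas
/* Write the SQL that SAS/ACCESS sends to the DBMS to the SAS log */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

data new;
  set SQL.fx;
  where repo_date = '29APR2016:00:00:00'dt;
run;

/* Turn tracing off again */
options sastrace=off;
```

If the WHERE clause shows up in the traced statement, the filter is being applied on the server; if only a plain SELECT of the columns appears, SAS is filtering after the rows are transferred.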

Insert Date Field to MS SQL from Proc SQL

I'd like to insert a date field into a SQL Server table from Proc SQL in SAS. Here is my code for Proc SQL:
proc sql;
insert into CFS_SQL.Data_DSB_Raw(sasdatefmt=(TheDate='mmddyy10.'))
select TheDateIncoming
from Work.Upload;
quit;
According to the SAS help documentation (http://support.sas.com/kb/6/450.html), this should work as long as TheDateIncoming also has format mmddyy10.. I've verified that the format on TheDateIncoming is correct, so I think this should work.
Unfortunately, however, I'm getting a "Value 1 on the SELECT clause does not match the data type of the corresponding column" error.
Any thoughts?
Annnnnd... solved. It actually had nothing to do with the code. It was a driver problem. Switching to the SQL Server Native Client 11.0 ODBC driver fixed the issue.

How to pass macro variable to PROC SQL on IN statement in WHERE clause on MS SQL Server

I have a table in MS SQL Server that looks like:
ID, Code
01, A
02, A
03, B
04, C
...
and is defined in SAS as
LIBNAME MSSQLDB ODBC
CONNECTION=SHAREDREAD
COMPLETE='Description=OIPE DW (Dev);DRIVER=SQL Server Native Client 11.0;SERVER=Amazon;Trusted_Connection=Yes;DATABASE=OIPEDW_Dev;'
SCHEMA='dbo'
PRESERVE_TAB_NAMES=YES
PRESERVE_COL_NAMES=YES;
I have a SAS dataset that has records of the same format as MSSQLDB (ID and Code variables) but is just a subset of the full database.
I would like to do the following:
PROC SQL NOPRINT;
/* If SASDS contains just codes A and B, CodeVar=A B */
SELECT DISTINCT CODE INTO :CodeVar SEPARATED BY ' ' FROM SASDS;
QUIT;
/* seplist is a macro that wraps each code in a quote */
%LET CodeInVar=%seplist( &CodeVar, nest=%STR(") );
PROC SQL;
DELETE * FROM MSSQLDB WHERE CODE IN (&CodeInVar);
/* Should execute DELETE * FROM MSSQL WHERE CODE IN ('A','B'); */
QUIT;
The problem is this generates a syntax error on the values in the &CodeInVar macro variable.
Any idea how to pass the macro variable value to SQL Server in the IN statement?
I think you have a few problems here; hopefully some of them are just transcription errors.
First off, this will not do anything:
PROC SQL;
DELETE * FROM MSSQLDB WHERE CODE IN (&CodeInVar);
/* Should execute DELETE * FROM MSSQL WHERE CODE IN ('A','B'); */
QUIT;
MSSQLDB is your libname, not the table; you need to reference the table as MSSQLDB.tablename here. Perhaps that's just a copying error.
Fundamentally there's nothing explicitly wrong with what you've typed. I would suggest first identifying if there are any problems with your macro variable. Put a %put statement in there:
%put &codeinvar.;
See what that outputs. Is it what you wanted? If not, then fix that part (the macro, presumably).
I would say that there are a lot of better ways to do this. First off, you don't need to add a macro to add commas or quotes or anything.
PROC SQL NOPRINT;
/* If SASDS contains just codes A and B, CodeVar='A','B' */
SELECT DISTINCT cats("'",CODE,"'") INTO :CodeVar SEPARATED BY ',' FROM SASDS;
QUIT;
That should get you &codevar precisely as you want [ie, 'A','B' ].
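With the list built that way, it can go straight into the IN clause. A minimal sketch, where yourtable is a placeholder for the real table name behind the MSSQLDB libref:

```sas
/* Confirm the generated list in the log */
%put CodeVar=&CodeVar;

proc sql;
  /* yourtable is a placeholder for the actual table name */
  delete from MSSQLDB.yourtable
  where Code in (&CodeVar);
quit;
```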
Secondly, since you're using LIBNAME and not pass-through SQL, consider skipping the macro variable entirely and using plain SQL syntax instead.
proc sql;
delete from MSSQLDB.yourtable Y where exists
(select 1 from SASDS S where S.code=Y.code);
quit;
That is sometimes faster, depending on the circumstances (it also could be slower). If code is something that has a high frequency, summarize it using PROC FREQ or a SQL query first.
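A minimal sketch of the "summarize first" idea, using a SQL query rather than PROC FREQ. The table work.codes is a hypothetical name, and yourtable is again a placeholder:

```sas
proc sql;
  /* Reduce SASDS to one row per code before using it in the subquery */
  create table work.codes as
  select distinct code
  from SASDS;

  /* Delete rows whose code appears in the reduced list */
  delete from MSSQLDB.yourtable Y
  where exists
    (select 1 from work.codes C where C.code = Y.code);
quit;
```

Whether this helps depends on how many duplicate codes SASDS holds; with few duplicates the extra step may not be worth it.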
