I am attempting to use the following code from Jack Shostak's book 'SAS Programming in the Pharmaceutical Industry' for a medications table in SAS:
PROC SQL NOPRINT;
SELECT COUNT(DISTINCT USUBJID) FORMAT = 3.
INTO :n1
FROM ADSL
WHERE TRTPN = 1;
SELECT COUNT(DISTINCT USUBJID) FORMAT = 3.
INTO :n2
FROM ADSL
WHERE TRTPN = 0;
SELECT COUNT(DISTINCT USUBJID) FORMAT = 3.
INTO :n3
FROM ADSL
WHERE TRTPN NE .;
QUIT;
PROC SQL NOPRINT;
CREATE TABLE CMTOSUM AS
SELECT UNIQUE(C.CMDECOD) AS CMDECOD, C.USUBJID, T.TRTPN
FROM CM AS C, ADSL AS T
WHERE C.USUBJID = T.USUBJID
ORDER BY USUBJID, CMDECOD;
QUIT;
ODS LISTING CLOSE;
ODS OUTPUT CROSSTABFREQS = COUNTS;
PROC FREQ DATA = CMTOSUM;
TABLES TRTPN * CMDECOD;
RUN;
ODS OUTPUT CLOSE;
ODS LISTING;
PROC SORT DATA = COUNTS;
BY CMDECOD;
RUN;
DATA CM;
MERGE COUNTS(WHERE = (TRTPN = 1) RENAME = (FREQUENCY = COUNT1))
COUNTS(WHERE = (TRTPN = 0) RENAME = (FREQUENCY = COUNT2))
COUNTS(WHERE = (TRTPN = .) RENAME = (FREQUENCY = COUNT3))
END = EOF;
BY CMDECOD;
KEEP CMDECOD ROWLABEL COL1-COL3 SECTION;
LENGTH ROWLABEL $25 COL1-COL3 $10;
IF CMDECOD = '' THEN
DO;
ROWLABEL = 'ANY MEDICATION';
SECTION = 1;
END;
ELSE
DO;
ROWLABEL = CMDECOD;
SECTION = 2;
END;
PCT1 = (COUNT1/ &n1) *100;
PCT2 = (COUNT2/ &n2) *100;
PCT3 = (COUNT3/ &n3) *100;
COL1 = PUT(COUNT1, 3.) || " (" || PUT(PCT1, 3.) || "%)";
COL2 = PUT(COUNT2, 3.) || " (" || PUT(PCT2, 3.) || "%)";
COL3 = PUT(COUNT3, 3.) || " (" || PUT(PCT3, 3.) || "%)";
RUN;
This code correctly tabulates the number of subjects within each treatment arm on specific medications. However, when I run this code it generates a count based on the number of medications in the 'ANY MEDICATION' row rather than the total number of subjects. Currently the percentage exceeds 100; I would like to modify the count so that it stops once it hits the total number of subjects in each treatment arm. Any insight would be appreciated.
I was able to resolve the issue by adding the following lines of code:
IF COUNT1 GE &N1 THEN COUNT1 = &n1;
IF COUNT2 GE &N2 THEN COUNT2 = &n2;
IF COUNT3 GE &N3 THEN COUNT3 = &n3;
This restricts the counts to the total number of subjects within each group.
Below is the updated code for reference.
PROC SQL NOPRINT;
SELECT COUNT(DISTINCT USUBJID) FORMAT = 3.
INTO :n1
FROM ADSL
WHERE TRTPN = 1;
SELECT COUNT(DISTINCT USUBJID) FORMAT = 3.
INTO :n2
FROM ADSL
WHERE TRTPN = 0;
SELECT COUNT(DISTINCT USUBJID) FORMAT = 3.
INTO :n3
FROM ADSL
WHERE TRTPN NE .;
QUIT;
PROC SQL NOPRINT;
CREATE TABLE CMTOSUM AS
SELECT UNIQUE(C.CMDECOD) AS CMDECOD, C.USUBJID, T.TRTPN
FROM CM AS C, ADSL AS T
WHERE C.USUBJID = T.USUBJID
ORDER BY USUBJID, CMDECOD;
QUIT;
ODS LISTING CLOSE;
ODS OUTPUT CROSSTABFREQS = COUNTS;
PROC FREQ DATA = CMTOSUM;
TABLES TRTPN * CMDECOD;
RUN;
ODS OUTPUT CLOSE;
ODS LISTING;
PROC SORT DATA = COUNTS;
BY CMDECOD;
RUN;
DATA CM;
MERGE COUNTS(WHERE = (TRTPN = 1) RENAME = (FREQUENCY = COUNT1))
COUNTS(WHERE = (TRTPN = 0) RENAME = (FREQUENCY = COUNT2))
COUNTS(WHERE = (TRTPN = .) RENAME = (FREQUENCY = COUNT3))
END = EOF;
BY CMDECOD;
KEEP CMDECOD ROWLABEL COL1-COL3 SECTION;
LENGTH ROWLABEL $25 COL1-COL3 $10;
IF COUNT1 GE &N1 THEN COUNT1 = &n1;
IF COUNT2 GE &N2 THEN COUNT2 = &n2;
IF COUNT3 GE &N3 THEN COUNT3 = &n3;
IF CMDECOD = '' THEN
DO;
ROWLABEL = 'ANY MEDICATION';
SECTION = 1;
END;
ELSE
DO;
ROWLABEL = CMDECOD;
SECTION = 2;
END;
PCT1 = (COUNT1/ &n1) *100;
PCT2 = (COUNT2/ &n2) *100;
PCT3 = (COUNT3/ &n3) *100;
COL1 = PUT(COUNT1, 3.) || " (" || PUT(PCT1, 3.) || "%)";
COL2 = PUT(COUNT2, 3.) || " (" || PUT(PCT2, 3.) || "%)";
COL3 = PUT(COUNT3, 3.) || " (" || PUT(PCT3, 3.) || "%)";
RUN;
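One caveat about the cap: it keeps the percentage at or below 100, but it does not actually count subjects. CMTOSUM holds one row per subject and medication, so if only 5 of 10 subjects took any medication and each took 3, the 'ANY MEDICATION' total is 15 and the cap reports 10 rather than 5. A minimal sketch of counting each subject once instead, assuming CMTOSUM as built above (ANYMED and ANYCOUNT are hypothetical names, not from the book's code):

```sas
/* Sketch: one row per treatment arm, each subject counted once.    */
/* Assumes CMTOSUM has one row per subject and medication.          */
/* ANYMED and ANYCOUNT are hypothetical names.                      */
PROC SQL;
    CREATE TABLE ANYMED AS
    SELECT TRTPN,
           COUNT(DISTINCT USUBJID) AS ANYCOUNT
    FROM CMTOSUM
    GROUP BY TRTPN;
QUIT;
```

ANYCOUNT for TRTPN = 1 and TRTPN = 0 could then stand in for COUNT1 and COUNT2 on the 'ANY MEDICATION' row; the overall column would need the same count without the GROUP BY.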
In the following example, assume that NULL values are not allowed.
Are these 2 update statements:
Query #1:
UPDATE #R12
SET SOURCE_NAME = DEFLT.SOURCE_NAME
FROM R12_MLR_REBATE_GL_STRING_DEFLT DEFLT
WHERE #R12.PAYMT_TY = DEFLT.PAYMT_TY
AND #R12.AMOUNT_TYPE = DEFLT.AMOUNT_TYPE
AND #R12.COMPANY <> #BNKER_COMPANY
AND DEFLT.BNKER_IND = 'N'
Query #2:
UPDATE #R12
SET SOURCE_NAME = DEFLT.SOURCE_NAME
FROM R12_MLR_REBATE_GL_STRING_DEFLT DEFLT
WHERE #R12.PAYMT_TY = DEFLT.PAYMT_TY
AND #R12.AMOUNT_TYPE = DEFLT.AMOUNT_TYPE
AND #R12.COMPANY = #BNKER_COMPANY
AND DEFLT.BNKER_IND = 'Y'
equivalent to this one update statement?
UPDATE #R12
SET SOURCE_NAME = DEFLT.SOURCE_NAME
FROM R12_MLR_REBATE_GL_STRING_DEFLT DEFLT
WHERE #R12.PAYMT_TY = DEFLT.PAYMT_TY
AND #R12.AMOUNT_TYPE = DEFLT.AMOUNT_TYPE
AND ((#R12.COMPANY <> #BNKER_COMPANY AND DEFLT.BNKER_IND = 'N')
OR (#R12.COMPANY = #BNKER_COMPANY AND DEFLT.BNKER_IND = 'Y')
)
To be clear about what I mean by equivalent: if I run the two statements, in that order, will they produce the same final result as running the 'unified' version? And do I get the same behavior if I use JOIN syntax?
UPDATE R12
SET SOURCE_NAME = DEFLT.SOURCE_NAME
FROM #R12 R12
INNER JOIN R12_MLR_REBATE_GL_STRING_DEFLT DEFLT ON R12.PAYMT_TY = DEFLT.PAYMT_TY
AND R12.AMOUNT_TYPE = DEFLT.AMOUNT_TYPE
AND ((R12.COMPANY <> #BNKER_COMPANY AND DEFLT.BNKER_IND = 'N')
OR (R12.COMPANY = #BNKER_COMPANY AND DEFLT.BNKER_IND = 'Y'))
I'm experiencing some paranoia and would feel better with a second opinion.
Thank you
Hello, I have a procedure and some questions about it. The procedure extracts data and inserts it into one table. When I test my code, I have to pass a parameter to execute the procedure.
--this is how I execute the procedure
begin
GPU_DATA_EXTRACTOR(to_date('31/08/2021','DD/MM/YYYY'));
end;
What I want is that when the billdate parameter is NULL, the procedure automatically uses the last day of the previous month as the parameter. How can I make this change? I am open to any advice on updating it. Thanks in advance.
Updated the script below.
create or replace procedure GPU_DATA_EXTRACTOR_TEST(pid_billdate DATE DEFAULT LAST_DAY(ADD_MONTHS(TRUNC(SYSDATE), -1))) is
c_limit CONSTANT PLS_INTEGER DEFAULT 10000;
CURSOR c1 IS
SELECT DISTINCT intl_prod_id
FROM apld_bill_rt abr,
acct_bill ab
WHERE abr.CHRG_TP = 'INSTALLMENT'
AND abr.TAX_CATG_ID = 'NOTAX'
AND abr.acct_bill_id = ab.acct_bill_id
AND ab.bill_date = pid_billdate;
TYPE prod_ids_t IS TABLE OF apld_bill_rt.intl_prod_id%TYPE INDEX BY PLS_INTEGER;
l_prod_ids prod_ids_t;
begin
execute immediate 'truncate table GPU_INV_TEST';
OPEN c1;
LOOP
FETCH c1 BULK COLLECT INTO l_prod_ids LIMIT c_limit;
EXIT WHEN l_prod_ids.COUNT = 0;
FORALL indx IN 1 .. l_prod_ids.COUNT
INSERT INTO GPU_INV_TEST
SELECT AB.ACCT_BILL_ID,
AB.BILL_NO,
AB.INV_ID,
AB.BILL_DATE,
ba2.bill_acct_id,
ba1.bill_acct_id parent_bill_acct_id,
AB.DUE_DATE,
PG.CMPG_ID,
ABR.NET_AMT,
AB.DUE_AMT,
P.PROD_NUM,
pds.DST_ID,
ABR.DESCR,
p.intl_prod_id
FROM apld_bill_rt abr,
acct_bill ab,
prod p,
FCBSADM.PROD_DST pds,
bill_acct_prod bap,
bill_acct ba1,
bill_acct ba2,
prod_cmpg pg
WHERE ab.intl_bill_acct_id = ba1.intl_bill_acct_id
AND AB.ACCT_BILL_ID = ABR.ACCT_BILL_ID
AND ba1.intl_bill_acct_id = ba2.parent_bill_acct_id
AND ba2.intl_bill_acct_id = bap.intl_bill_acct_id
AND bap.intl_prod_id = abr.intl_prod_id
AND ABR.CHRG_TP = 'INSTALLMENT'
AND bap.intl_prod_id = pds.intl_prod_id
AND bap.intl_prod_id = p.intl_prod_id
AND p.intl_prod_id = pg.intl_prod_id(+)
AND ABR.intl_prod_id = l_prod_ids(indx)
UNION
SELECT AB.ACCT_BILL_ID,
AB.BILL_NO,
AB.INV_ID,
AB.BILL_DATE,
ba1.bill_acct_id,
ba1.bill_acct_id parent_bill_acct_id,
AB.DUE_DATE,
PG.CMPG_ID,
ABR.NET_AMT,
AB.DUE_AMT,
P.PROD_NUM,
pds.DST_ID,
ABR.DESCR,
p.intl_prod_id
FROM apld_bill_rt abr,
acct_bill ab,
prod p,
FCBSADM.PROD_DST pds,
bill_acct_prod bap,
bill_acct ba1,
prod_cmpg pg
WHERE ab.intl_bill_acct_id = ba1.intl_bill_acct_id
AND AB.ACCT_BILL_ID = ABR.ACCT_BILL_ID
--AND ba1.intl_bill_acct_id = ba2.parent_bill_acct_id
AND ba1.intl_bill_acct_id = bap.intl_bill_acct_id
AND bap.intl_prod_id = abr.intl_prod_id
AND ABR.CHRG_TP = 'INSTALLMENT'
AND bap.intl_prod_id = pds.intl_prod_id
AND bap.intl_prod_id = p.intl_prod_id
AND p.intl_prod_id = pg.intl_prod_id(+)
AND ABR.intl_prod_id = l_prod_ids(indx);
COMMIT;
END LOOP;
CLOSE c1;
end;
You can add a default value for your parameters. Take the following function as an example:
CREATE OR REPLACE FUNCTION sf_showDefault
(
p_in DATE DEFAULT LAST_DAY(ADD_MONTHS(TRUNC(SYSDATE), -1))
)
RETURN DATE
IS
BEGIN
RETURN p_in;
END sf_showDefault;
/
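When no parameter is passed, it takes a truncated SYSDATE, subtracts one month, and then finds the last day of that month. All the function does is return that date (or the one that you pass in, if you feel like it). Note that the default is used only when the argument is omitted; an explicitly passed NULL is not replaced by it.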
Here is a DBFiddle showing the effect of DEFAULT parameters (LINK)
I have a SAS data step statement –
Data work.CABGothers2;
set work.CABGothers1;
IF proc_p in (a HUGE LIST OF ICD10 CODES) and PDDCABG = 1
and TypeofCABG_PDDTemp = . then TypeofCABG_PDDTemp = 4;
IF proc2 in (a HUGE LIST OF ICD10 CODES) and PDDCABG = 1
and TypeofCABG_PDDTemp = . then TypeofCABG_PDDTemp = 4;
IF proc3 in (a HUGE LIST OF ICD10 CODES) and PDDCABG = 1
and TypeofCABG_PDDTemp = . then TypeofCABG_PDDTemp = 4;
...
run;
This IF-THEN section is repeated 21 times, so you can imagine how huge and cumbersome this SAS code file gets, especially when the ICD10 code list needs to be modified: the list would have to be changed individually for each of the proc1, proc2, ... columns.
Also, the ICD10 lists are very large, with over 7000 codes. I was wondering if someone could show me better SAS code that takes a column of data (ICD10 codes) from a file as input.
I would like a PROC SQL or DATA step solution, whichever is more efficient.
Current code-
Data work.CABGothers2;
set work.CABGothers1;
IF proc_p in (a HUGE LIST OF ICD10 CODES) and PDDCABG = 1
and TypeofCABG_PDDTemp = . then TypeofCABG_PDDTemp = 4;
run;
UPDATE--
I got this to work if the list is small...however I have a column with 8000 unique ICD10 codes. So I get an error message as shown below.
proc sql;
select quote(icd10) into :cabgvalexcl separated by ','
from newlink.cabgvalexcl2019;
quit;
Data work.test1;
set WORK.cabgpddcol;
IF proc_p in (&cabgvalexcl.) and PDDCABG = 1 then CABGVAL_Excl = 1;
IF oproc1 in (&cabgvalexcl.) and PDDCABG = 1 then CABGVAL_Excl = 1 ;
IF oproc2 in (&cabgvalexcl.) and PDDCABG = 1 then CABGVAL_Excl = 1;
IF oproc3 in (&cabgvalexcl.) and PDDCABG = 1 then CABGVAL_Excl = 1 ;
IF oproc4 in (&cabgvalexcl.) and PDDCABG = 1 then CABGVAL_Excl = 1;
run;
Error message:
ERROR: The length of the value of the macro variable CABGVALEXCL (65540) exceeds the maximum length (65534). The value has been truncated to 65534 characters.
UPDATE --
Example (just a few rows) of the single column containing ICD10 codes, and the data file in which I have to tag rows that have any of those codes. (I do not have multiple columns; I only did that in the macro example because the macro variable was running out of space.)
Output table:
Logic: if any of the ICD10 codes listed in cabgvalexcl2019 (shown here in red) is found in the table CABGOTHERS1, create a column called EXCLUDE and set it to 1 for that record.
Here's a hash-based example. It doesn't use macro variables, so it should work for any number of ICD10 codes:
data cabgvalexcl2019;
input (icd1-icd3) (:$2.);
datalines;
1 2 3
4 5 6
7 8 9
;
run;
/*Generate some dummy data*/
data cabgpddcol;
array keys[*] $2 proc_p oproc1-oproc20;
call streaminit(1); /*Set random number seed*/
do i = 1 to 20;
do j = 1 to dim(keys);
keys[j] = put(int(rand('uniform') * 11 + 9), 2.); /*Chosen so we get a few rows with no exclusion codes*/
end;
PDDCABG = rand('uniform') < 0.75;
output;
end;
drop i j;
run;
/* CABGval_Excl = Identify CABG+VALVE exclusions which are "CABG OTHERS". This is the 2019 CABG+VALVE exclusion list. */
/* If the RECORD IN following table has CABGVAL_Excl = 1 then it is a CABG+valve WITH EXCLUSION*/
Data work.CABGval_Excl; /* CABG OTHERS prior to refinement into non-iso CABG WITH Valve and non-iso CABG WITHOUT Valve */
/*Create hash object to hold list of ICD codes*/
length icd $ 2;
if _n_ = 1 then do;
declare hash h();
rc = h.definekey('icd');
rc = h.definedone();
do until(eof);
set cabgvalexcl2019 end = eof;
/*Consider using an array here if you have lots of ICD columns*/
do icd = icd1, icd2, icd3;
rc = h.add();
end;
end;
end;
set cabgpddcol;
/*Loop through all the keys and stop if we find one in the hash*/
array keys[*] proc_p oproc1-oproc20;
rc = -1;
do i = 1 to dim(keys) until(rc = 0);
rc = h.find(key:keys[i]); /*This sets rc = 0 if a match is found*/
end;
drop i rc icd:;
CABGVAL_Excl = (rc = 0 and PDDCABG = 1); /*rc = 0 means one of the keys matched an exclusion code*/
run;
Constructing the hash object is a little bit fiddly if you have multiple columns holding all the distinct ICD10 codes you care about - if they're all in one column then there's a simpler way of doing this:
declare hash h(dataset:'cabgvalexcl2019');
rc = h.definekey('icd');
rc = h.definedone();
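Put together, a full step using that shortcut might look like the sketch below. It assumes the single-column layout from the question's macro attempt, that is, a table newlink.cabgvalexcl2019 with one character column icd10, and that the libref newlink is already assigned. It uses the hash check method, which returns 0 when the key is found, and it sets the flag to 0 rather than missing when nothing matches.

```sas
/* Sketch only: assumes libref NEWLINK is assigned and that          */
/* newlink.cabgvalexcl2019 has one character column, icd10.          */
data work.test1;
    /* Define icd10 in the PDV without reading any rows */
    if 0 then set newlink.cabgvalexcl2019(keep=icd10);
    if _n_ = 1 then do;
        /* Load every code from the table into the hash once */
        declare hash h(dataset: 'newlink.cabgvalexcl2019(keep=icd10)');
        rc = h.definekey('icd10');
        rc = h.definedone();
    end;
    set work.cabgpddcol;
    array keys[*] proc_p oproc1-oproc4;
    CABGVAL_Excl = 0;
    /* check() returns 0 when the key is in the hash */
    do i = 1 to dim(keys) while (CABGVAL_Excl = 0);
        if PDDCABG = 1 and h.check(key: keys[i]) = 0 then CABGVAL_Excl = 1;
    end;
    drop i rc icd10;
run;
```

Because the codes are read from the table at run time, the 65534-character macro variable limit never comes into play, and changing the code list only means changing the table.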
I have the following MSSQL update statement that contains an inner join and a CASE expression. Is it possible to convert it into a DB2 update statement?
UPDATE LIBNAME1.OPTR_POS_FIX
SET VAL_TYPE = #VAL_TYPE
,PORT_SNAME = #PORT_SNAME
,ISIN_NO = #ISIN_NO
,SEC_SNAME = #SEC_SNAME
,SEC_CCY_ABBR = #SEC_CCY_ABBR
,BASE_CCY = #BASE_CCY
,TRX_BCCY_EX_RATE = #TRX_BCCY_EX_RATE
,QUANTITY = #QUANTITY
,MKT_PRICE = #MKT_PRICE
,AVG_COST = #AVG_COST
,MVAL_AMT_SC = ROUND(#QUANTITY * #MKT_PRICE / (
CASE
WHEN FDBVAL.VLGTI = 100
THEN 100
ELSE 1
END
), 3)
,MVAL_AMT_BC = ROUND(#QUANTITY * #MKT_PRICE / (
CASE
WHEN FDBVAL.VLGTI = 100
THEN 100
ELSE 1
END
) / #TRX_BCCY_EX_RATE, 3)
,AVG_BVAL_SC = #AVG_BVAL_SC
,AVG_BVAL_BC = #AVG_BVAL_BC
,INT_AMT_SC = #INT_AMT_SC
,INT_AMT_BC = #INT_AMT_BC
FROM LIBNAME1.OPTR_POS_FIX
INNER JOIN LIBNAME2.FDBVAL ON OPTR_POS_FIX.SEC_CODE = FDBVAL.VLVALR
WHERE (OPTR_POS_FIX.VALN_DATE = #VALN_DATE)
AND (OPTR_POS_FIX.PORT_CODE = #PORT_CODE)
AND (OPTR_POS_FIX.SEC_CODE = #SEC_CODE)
I appreciate any help.
This is how you would write the UPDATE statement in Db2:
UPDATE LIBNAME1.OPTR_POS_FIX F
SET VAL_TYPE = #VAL_TYPE
,PORT_SNAME = #PORT_SNAME
,ISIN_NO = #ISIN_NO
,SEC_SNAME = #SEC_SNAME
,SEC_CCY_ABBR = #SEC_CCY_ABBR
,BASE_CCY = #BASE_CCY
,TRX_BCCY_EX_RATE = #TRX_BCCY_EX_RATE
,QUANTITY = #QUANTITY
,MKT_PRICE = #MKT_PRICE
,AVG_COST = #AVG_COST
,AVG_BVAL_SC = #AVG_BVAL_SC
,AVG_BVAL_BC = #AVG_BVAL_BC
,INT_AMT_SC = #INT_AMT_SC
,INT_AMT_BC = #INT_AMT_BC
,(MVAL_AMT_SC, MVAL_AMT_BC)
= (SELECT ROUND(#QUANTITY * #MKT_PRICE / (CASE WHEN V.VLGTI = 100 THEN 100 ELSE 1 END ), 3)
, ROUND(#QUANTITY * #MKT_PRICE / (CASE WHEN V.VLGTI = 100 THEN 100 ELSE 1 END ) / #TRX_BCCY_EX_RATE, 3)
FROM LIBNAME2.FDBVAL V
WHERE F.SEC_CODE = V.VLVALR
)
WHERE
VALN_DATE = #VALN_DATE
AND PORT_CODE = #PORT_CODE
AND SEC_CODE = #SEC_CODE
AND EXISTS (
SELECT 1
FROM LIBNAME2.FDBVAL V
WHERE F.SEC_CODE = V.VLVALR
)
Probably something like this, assuming that the #var names are application parameters:
MERGE INTO LIBNAME1.OPTR_POS_FIX O
USING LIBNAME2.FDBVAL F ON O.SEC_CODE = F.VLVALR
AND (O.VALN_DATE = #VALN_DATE)
AND (O.PORT_CODE = #PORT_CODE)
AND (O.SEC_CODE = #SEC_CODE)
WHEN MATCHED THEN UPDATE SET
VAL_TYPE = #VAL_TYPE
, ...
, MVAL_AMT_SC = ROUND(#QUANTITY * #MKT_PRICE / (
CASE
WHEN F.VLGTI = 100
THEN 100
ELSE 1
END
), 3)
, ...
;
I have defined four macro variables here, each holding a different number of ICD10 codes:
%LET DX_27800_CODE = 'E6609', 'E661', 'E668', 'E669';
%LET DX_27801_CODE = 'E6601';
%LET DX_2859_CODE = 'D649';
%LET DX_6202_CODE = 'N8320', 'N8329';
Now I want to create an array that maps those variables to the columns of my ICD10 table, so that I can assign flag variables from it.
The regular way would be:
data test; set input;
if (dx1 in ( &DX_27800_CODE) or dx2 in (&DX_27800_CODE) or dx3 in (&DX_27800_CODE))
then dx_27800 = 1; else dx_27800 =0;
run;
Done the regular way, I would need to repeat this four times to get all four flag variables, so I'm wondering whether it could be done with an array.
data test; set input;
array dx_code10 [4] &DX_27800_CODE &DX_27801_CODE &DX_2859_CODE &DX_6202_CODE;
ARRAY DX_VARIABLE[4] DX_27800 DX_27801 DX_2859 DX_6202;
DO I = 1 TO DIM(dx_code10);
IF (DX1 IN (DX_CODE10[I]) OR DX2 IN (DX_CODE10[I]) OR DX3 IN (DX_CODE10[I]))
THEN DX_VARIABLE[I] = 1;
ELSE DX_VARIABLE[I] = 0;
END;
END;
RUN;
But it seems it can't be done this way. Please help me solve this problem. Thanks.
I think a better approach is to use formats. I'd rather have those DX codes in a spreadsheet or a text file or something, and then input that to make the formats, but even with the not-best-practice %LETs, you can still use a format solution.
The approach is to make a format that maps each of those DX codes to its dx value (27800, 27801, etc.), then use that to drive how you assign the follow-up array.
%LET DX_27800_CODE = 'E6609', 'E661', 'E668', 'E669';
%LET DX_27801_CODE = 'E6601';
%LET DX_2859_CODE = 'D649';
%LET DX_6202_CODE = 'N8320', 'N8329';
proc format;
value $dxcode
&dx_27800_code = '27800'
&dx_27801_code = '27801'
&dx_2859_code = '2859'
&dx_6202_code = '6202'
other=' '
;
quit;
data input;
input dx1 $;
datalines;
E6601
E6609
E6608
E661
E668
D649
D650
N8320
E669
N8329
;;;;
run;
data want;
set input;
array dx_codes[4] dx_27800 dx_27801 dx_2859 dx_6202;
dx_code_val = put(dx1,$dxcode5.);
do _i = 1 to dim(dx_codes);
if dx_code_val = scan(vname(dx_codes[_i]),2,'_') then dx_codes[_i]=1;
else dx_codes[_i]=0;
end;
run;
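The step above only looks at dx1. If the data has dx1-dx3, as in the question, one way to extend it is to loop over the diagnosis columns as well. This is a sketch under that assumption; it reuses the $dxcode format defined above, and the output name want3 and the helper variables _j and dx_code_val are illustrative only:

```sas
/* Sketch: assumes INPUT has character variables dx1-dx3 and that   */
/* the $dxcode format above has already been created.               */
data want3;
    set input;
    array dx[*] dx1-dx3;
    array dx_codes[4] dx_27800 dx_27801 dx_2859 dx_6202;
    /* Start every flag at 0 */
    do _i = 1 to dim(dx_codes);
        dx_codes[_i] = 0;
    end;
    /* Map each diagnosis through the format and set the matching flag */
    do _j = 1 to dim(dx);
        dx_code_val = put(dx[_j], $dxcode5.);
        do _i = 1 to dim(dx_codes);
            if dx_code_val = scan(vname(dx_codes[_i]), 2, '_') then dx_codes[_i] = 1;
        end;
    end;
    drop _i _j dx_code_val;
run;
```

A flag stays 0 unless at least one of the diagnosis columns maps to its group, which matches the OR logic in the question.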
For your specific example you could use FINDW() function instead of the IN operator. Turn your code lists into delimited strings instead.
%LET DX_27800_CODE = E6609,E661,E668,E669;
%LET DX_27801_CODE = E6601 ;
%LET DX_2859_CODE = D649 ;
%LET DX_6202_CODE = N8320,N8329;
data test;
set input;
array dx_code_list (4) $200 _temporary_ ("&dx_27800_code" "&dx_27801_code" "&dx_2859_code" "&dx_6202_code");
array dx_variable (4) dx_27800 dx_27801 dx_2859 dx_6202;
array dx dx1-dx3 ;
do i = 1 to dim(dx_variable);
dx_variable(i)=0;
do j=1 to dim(dx) while (dx_variable(i)=0);
if findw(dx_code_list(i),dx(j),',','it') then dx_variable(i)=1;
end;
end;
drop i j;
run;
So if I make some sample data.
data input ;
length dx1-dx3 $7 ;
input dx1 - dx3 ;
cards;
E6609 E661 .
E668 E669 .
E6601 . .
D649 N8320 N8329
. . .
;
I get this result: