SAS: create a permanent format from a permanent dataset

I have a permanent dataset called Branch (Branch code, Branch description).
I want to create a permanent format from that dataset.
I can see that this gives me more or less what I want, but how do I put it into a permanent dataset?
proc format library = Home.Branch fmtlib;
Run;
What I've tried
proc print data=Home.DataSetToApply;
format B_Code $B_CODE_FORMAT.;
run;
This works if I manually create the format. I can't seem to create a permanent format directly from a data set.
Could you point me in the right direction?
Resources
Creating a Format from Raw Data or a SAS® Dataset

SAS has an autoexec.sas file which executes when you start SAS.
Of course, whether this is a valid option depends on your access rights + the OS you're running.
Have a look here: http://support.sas.com/documentation/cdl/en/hostwin/63285/HTML/default/viewer.htm#win-sysop-autoexec.htm
You could drop the format code into that auto-executing script so that your format is always available when you use SAS.

This will create a dataset (myfmtdataset) from the formats stored in library mylibname.
proc format cntlout=myfmtdataset lib=mylibname;
select myformatname; *if you want to just pick one or some - leave out select for all;
quit;
This will import that back into formats (later):
proc format cntlin=myfmtdataset lib=myotherlibname;
quit;
That could of course be in your autoexec, or in your regular code.
If you are trying to take a dataset to make a permanent format, you need to set it up like this:
Required:
fmtname = name of format
start = starting value (or, single value)
end = ending value (this can be missing if only single values)
label = formatted value
Optional:
type = type of format (n=numeric, c=character, i=informat, j=character informat)
hlo = various options (h=end is highest value, l=start is lowest value, o=other, m=multilabel, etc.)
Then use the CNTLIN option to load it. SAS documentation has more detail if you need it.
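To make the layout above concrete, here is a rough sketch of the rows such a control dataset would hold for the Branch case. It is written in Python purely to show the shape of the data. The branch values and the helper name to_cntlin_rows are made up, and the format name is taken from the question. In SAS you would build the same columns in a data step and pass the result to CNTLIN.

```python
# Hypothetical branch rows: (branch code, branch description).
branches = [("001", "Head Office"), ("002", "North Branch")]

def to_cntlin_rows(pairs, fmtname="B_CODE_FORMAT"):
    """Map (code, description) pairs onto the columns CNTLIN expects.

    'end' is left out because every entry is a single value;
    type 'C' marks a character format.
    """
    return [
        {"fmtname": fmtname, "type": "C", "start": code, "label": desc}
        for code, desc in pairs
    ]

rows = to_cntlin_rows(branches)
for row in rows:
    print(row)
```

Each branch becomes one row, and every row repeats the same fmtname so that they all end up in one format.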

Macro that loads multiple datasets that updates and change names?

I'm working with a database in SAS that updates every so often. I want the macro to automatically load the most recent dataset for a given year. The datasets cover the years 2015-2018, and each year has its own updated version, which is stated in the name of the dataset, e.g. 2015_version9. With my current code you need to update the macro manually every time a dataset changes its version and name.
You can scan through each library and find the max version number, then save those to a single macro variable string that you can supply to a set statement. Here are the assumptions of this solution:
Your libraries are named lib_2015, lib_2016, etc. and follow 8-char libname requirements
Your libraries are static for years 2015-2018
Your datasets are named _version1, _version2, etc.
Here's how we'll do it.
%let libraries = "LIB_2015", "LIB_2016", "LIB_2017", "LIB_2018";
proc sql noprint;
select cats(libname, '.', memname)
, input(compress(memname,,'KD'), 8.) as version
into :data separated by ' '
from dictionary.members
where upcase(libname) IN(&libraries.)
AND upcase(memname) LIKE "^_VERSION%" escape '^'
group by libname
having version = max(version)
;
quit;
data want;
set &data. indsname=name;
dsn = name;
run;
This code does the following:
Gets all dataset names from each library that start with _VERSION. The ^ in the like clause is an escape character that we defined so that we can match _ literally.
Removes all non-digits from the dataset name and converts it to a version number, version. The KD option in the compress() function says to keep only digits from the string.
Keeps only names in each library where version is the highest value
Saves all the dataset names to a single macro variable, &data
&data will store a string of all the relevant datasets you want with the highest version number for each library. For example:
%put &data.;
LIB_2015._VERSION9 LIB_2016._VERSION19 LIB_2017._VERSION12 LIB_2018._VERSION8
The indsname option in the data step will store the full dataset name of each observation. We're saving that to a variable named dsn. This shows where each observation comes from so you can split them out to individual datasets as needed.
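If it helps to see the selection logic outside of PROC SQL, here is a small Python sketch of the same idea: keep only the digits of each member name, compare them as numbers, and keep the highest one per library. The member list and the function name latest_per_library are invented for illustration. In SAS the list would come from dictionary.members.

```python
import re

# Hypothetical (libname, memname) pairs standing in for dictionary.members.
members = [
    ("LIB_2015", "_VERSION8"),
    ("LIB_2015", "_VERSION9"),
    ("LIB_2016", "_VERSION19"),
    ("LIB_2016", "_VERSION2"),
    ("LIB_2016", "NOTES"),
]

def latest_per_library(pairs):
    """Return 'LIB.MEMBER' names with the highest version per library."""
    best = {}
    for lib, mem in pairs:
        if not mem.upper().startswith("_VERSION"):
            continue  # same filter as the LIKE clause
        digits = re.sub(r"\D", "", mem)  # keep digits only, like compress(,,'KD')
        if not digits:
            continue
        version = int(digits)  # numeric compare, so 19 beats 2
        if lib not in best or version > best[lib][0]:
            best[lib] = (version, f"{lib}.{mem}")
    return " ".join(best[lib][1] for lib in sorted(best))

data = latest_per_library(members)
print(data)
```

The conversion to a number matters: compared as text, _VERSION2 would sort after _VERSION19.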

Large Number Entry in Oracle APEX

I have a Number field in DB and Oracle APEX.
My Issue is:
If users want to enter a number in the format "1.000.000,01", they get a character error saying that the entry must be a number.
How can I solve this problem in the application layer? There are some solutions in the database layer, but so far I cannot find any in the application layer.
In summary: I want to enter a number as 1.000.000,12 in the application and see it in the same format.
Note: A procedure runs in the application to insert the data into the DB.
You can/should set appropriate format mask, e.g. 999G999G990D00 where
G represents thousands character (dot in your case)
D represents decimal character (comma in your case)
But, where do you set NLS numeric characters (represented by G and D)? In Apex 20.2, it is to be set in:
application builder
shared components
globalization attributes
security
initialization PL/SQL code - in here, you'll probably see what they are set to. Change those values, if necessary. For example:
begin
execute immediate q'[alter session set nls_numeric_characters = ',.']';
execute immediate q'[alter session set nls_date_format = 'dd.mm.yyyy hh24:mi:ss']';
end;
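To illustrate what G and D do once they are set to dot and comma, here is a small Python sketch of the round trip: parsing "1.000.000,01" into a number and rendering it back with the same separators. This is only a model of the behaviour, not how APEX does it internally, and the function names parse_number and format_number are made up for the example.

```python
from decimal import Decimal

# Assumption: these mirror nls_numeric_characters = ',.' (decimal first, group second).
GROUP, DECIMAL = ".", ","

def parse_number(text):
    """Turn a string such as '1.000.000,01' into a Decimal."""
    return Decimal(text.replace(GROUP, "").replace(DECIMAL, "."))

def format_number(value):
    """Render a Decimal with grouped thousands and two decimals, roughly like 999G999G990D00."""
    us_style = f"{value:,.2f}"  # comma groups, dot decimal
    # Swap the two separators in one pass.
    return us_style.translate(str.maketrans({",": GROUP, ".": DECIMAL}))

amount = parse_number("1.000.000,01")
print(amount)
print(format_number(amount))
```

The value stored in the database stays a plain number; only the display and input conversion depend on the separators.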

How can you create a table (or other object) that always returns the value passed to its WHERE-clause, like a mirror

There is a legacy application that uses a table to translate job names to filenames. This legacy application queries it as follows:
SELECT filename FROM aJobTable WHERE jobname = 'myJobName'
But in reality those jobnames always match the filenames (e.g. 'myJobName.job' is the jobname but also the filename). That makes this table appear unnecessary. But unfortunately, we cannot change the code of this program, and the program just needs to select it from a table.
That's actually a bit annoying, because we do need to keep this database in sync. If a jobname is not in the table, then it cannot be used. So, as our only way out, right now we have some VBScripts to synchronize this table, adding records for each possible filename. As a result, the table just has 2 columns with identical values. We want to get rid of this.
So, we have been dreaming about some hack that queries the data with the jobname, but just always returns the jobname again, like a copy/mirror query. Then we don't actually have to populate a table at all.
"Exploits"
The following can be configured in this legacy application. My hunch is that these may open the door for some tricks/hacks.
use of either MS Access or SQL Server (we prefer sql server)
The name of the table (e.g. aJobTable)
The name of the filename column (e.g. filename)
The name of the jobname column (e.g. jobname)
Here is what I came up with:
If I create a table-valued function mirror(a) then I get pretty close to what I want. Then I could use it like
SELECT filename FROM mirror('MyJobName.job')
But that's just not good enough, it would be if I could force it to be like
SELECT filename FROM mirror WHERE param1 = 'MyJobName.job'
Unfortunately, I don't think it's possible to call functions like that.
So, I was wondering if perhaps somebody else knows how to get it working.
So my question is: "How can you create a table (or other object) that always returns the value passed to its WHERE-clause, like a mirror."
It's hard to answer without knowing the code that the application uses, but if we assume it only takes strings and concatenates them without any checks whatsoever, I would expect code like this (translated to C#):
var sql = "SELECT "+ field +" FROM "+ table +" WHERE "+ conditionColumn +" = '"+ searchValue +"'";
As this is an open door for SQL injection, and given the fact that SQL Server allows you two ways of creating an alias - value as alias and alias = value,
you can take advantage of that and try to generate an SQL statement like this:
SELECT field /* FROM table WHERE conditionColumn */ = 'searchValue'
So field should be "field /* ",
and conditionColumn should be "conditionColumn */"
table name doesn't matter, you could leave an empty string for it.
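To see what the configured values would turn into, here is a quick Python sketch that mimics the assumed concatenation. It does not run anything against a database; it only builds the string. The function build_query is hypothetical and stands in for whatever the legacy application really does, which we don't know.

```python
def build_query(field, table, condition_column, search_value):
    # Assumed concatenation, mirroring the C# line above;
    # this is a guess, not the application's actual code.
    return ("SELECT " + field + " FROM " + table +
            " WHERE " + condition_column + " = '" + search_value + "'")

# Normal configuration.
normal = build_query("filename", "aJobTable", "jobname", "myJobName.job")

# "Mirror" configuration: the comment markers swallow FROM ... WHERE,
# leaving an alias = value select.
mirror = build_query("filename /*", "", "jobname */", "myJobName.job")

print(normal)
print(mirror)
```

If the application validates the configured names, or quotes them, the trick will not survive, so it is worth checking the generated statement with a profiler first.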

Remove Duplicate adjacent Sub-String from String in Microsoft SQL Server

I am using SQL Server 2008 and I have a column in a table, which has values like below. It basically shows departure and arrival information.
-->Heathrow/Dublin*Dublin/Heathrow
-->Gatwick/Liverpool*Liverpool/Carlisle *Carlisle/Gatwick
-->Heathrow/Dublin*Liverpool/Heathrow
(The 3rd example shown above is slightly different: the person did not depart from Dublin, but from Liverpool instead.)
This makes the column too lengthy, and I want to remove only the adjacent duplicates, so the information can be shown like below:
-->Heathrow/Dublin/Heathrow
-->Gatwick/Liverpool/Carlisle/Gatwick
-->Heathrow/Dublin***Liverpool/Heathrow
So, this would still show the correct travel route, but omit only the contiguous duplicates. Also, in the 3rd case, since the arrival and departure locations are not the same, I would like to show the break as ***.
I found a post here that removes all duplicates (Find and Remove Repeated Substrings) but this is slightly different from the solution that I need.
Could someone share any thoughts please?
The first step is to adapt the process defined in the following link so that it splits based on /:
T-SQL split string
This returns a table which you would then loop through checking if the value contains an *. In that case you would get the text values before and after the * and compare them. Use CHARINDEX to get the position of the *, and SUBSTRING to get the values before and after. Once you have those check both values and append to your output string accordingly.
So you have a database column that contains this text string? Is your concern to display the data to the user in a new format, or to update the data in your database table with a new value?
Do you have access to the original data from which this text string was built? It would probably be easier to re-create the string in the format you desire than it would be to edit the existing string programmatically.
If you don't have access to this data, it would probably be a lot simpler to update your data (or reformat it for display) if you do the string manipulation in a high-level language such as c# or java.
If you're reformatting it for display, write the string manipulation code in whatever language is appropriate, right before displaying it. If you're updating your table, you could write a program to process the table, reading each record, building the replacement string, and updating the record before moving on to the next one.
The bottom line is that T-SQL is just not a good language for doing this sort of string examination and manipulation. If you can build a fresh string from the original data, or do your manipulation in a high-level language, you'll have an easier job of it and end up with more maintainable code.
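As a rough idea of what that could look like in a high-level language, here is a minimal Python sketch. It assumes every leg has the form departure/arrival and that legs are separated by *. The function name collapse_route is made up for the example.

```python
def collapse_route(route):
    """Merge legs whose departure equals the previous arrival; mark breaks with ***."""
    legs = [leg.strip() for leg in route.split("*")]
    out = legs[0]
    prev_arrival = legs[0].split("/")[-1]
    for leg in legs[1:]:
        stops = leg.split("/")
        if stops[0] == prev_arrival:
            rest = stops[1:]  # drop the repeated place name
            if rest:
                out += "/" + "/".join(rest)
        else:
            out += "***" + leg  # route is not contiguous here
        prev_arrival = stops[-1]
    return out

for sample in (
    "Heathrow/Dublin*Dublin/Heathrow",
    "Gatwick/Liverpool*Liverpool/Carlisle *Carlisle/Gatwick",
    "Heathrow/Dublin*Liverpool/Heathrow",
):
    print(collapse_route(sample))
```

The stray space before the second * in the Gatwick example is stripped, so it does not prevent the match.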
I wrote some code for the first example you gave. You still need to improve it for the rest.
DECLARE @STR VARCHAR(50) = 'Heathrow/Dublin*Dublin/Heathrow'
-- Compare the arrival before the * with the departure right after it
IF (SELECT SUBSTRING(@STR,CHARINDEX('/',@STR)+1,CHARINDEX('*',@STR)-CHARINDEX('/',@STR)-1)) =
(SELECT SUBSTRING(@STR,CHARINDEX('*',@STR)+1,LEN(SUBSTRING(@STR,CHARINDEX('/',@STR)+1,CHARINDEX('*',@STR)-CHARINDEX('/',@STR)-1))))
BEGIN
-- Same place: remove the * together with the repeated name
SELECT STUFF(@STR,CHARINDEX('*',@STR),LEN(SUBSTRING(@STR,CHARINDEX('/',@STR)+1,CHARINDEX('*',@STR)-CHARINDEX('/',@STR)-1))+1,'')
END
ELSE
BEGIN
-- Different place: replace only the * with ***
SELECT STUFF(@STR,CHARINDEX('*',@STR),1,'***')
END

SAS dataset fieldtype num to char with current format

I have currently a dataset which contains the variables I need together with the needed formats.
Now I am using the getvarc() function (among others) in a loop to get those variables to write to a file.
The first problem, of course, is that some variables are not char type but num type. I could use getvarn() to retrieve those, but then I lose the format, which I really need.
E.g. date in num type: 18750. Format = yymmdd10. Thus looking like 2011-05-03
Using the value=getvarn() for this field, it would retrieve 18750 for value. If I then want to output (using PUT) this value, it would not give me the 2011-05-03 date as I want.
So now I am looking for a better way to do this.
My first option is to use the sashelp.vcolumn data set, to retrieve the format on this field, and use it in the put statement to output in the right format.
My second option is to convert the data set containing this field (in reality there are multiple data sets), turning all num type fields into char type while retaining the right format.
Which option should I go with?
And in case of the second option, can this be done in a generic way (as I said, this is not about one variable only) ?
EDIT:
After Cmjohns' answer I found a way to fix my problem: not using sashelp.vcolumn, but using the varfmt function.
So a short explanation of what I am doing in my code:
data _null_;
set dataset1;
file file1;
if somefield = x then do;
dsid = open(datasetfield, 'i');
rc = fetchobs(dsid, 1); /* observation number 1 is an assumption */
varnum = varnum(dsid, anotherfield);
varfmt = varfmt(dsid, varnum); /* new: format of the variable */
if vartype(dsid, varnum) = 'N' then do; /* new: type check */
value_n = getvarn(dsid, varnum);
value_formatted = putn(value_n, varfmt); /* new */
end;
else do; /* new */
value_c = getvarc(dsid, varnum);
value_formatted = putc(value_c, varfmt); /* new */
end;
rc = close(dsid);
put value_formatted;
end;
run;
So this is, in general (quickly off the top of my head), what I am doing now. The lines marked "new" are what I came up with after Cmjohns' response. So my first question, 'How to do it?', is answered.
But my follow-up question: what would be most efficient in the long run: keeping this process, or restructuring my data sets so that all data can be read using only getvarc(), without the type check and the varfmt() call?
I'd suggest that you use the vvalue function, or any of the other techniques provided in the answers to this question to put formatted values to a file.
