SAS ExportPackage command exceeds 8191 characters

We have an automated process for exporting metadata items for promotion, using the ExportPackage command-line utility (documented here).
The command is written to a .bat file, and then executed (in SAS) via a filename pipe.
We recently observed a strange behaviour when exporting multiple objects (around 60), which we believe is due to the Windows line-length limitation for batch commands.
Basically, one character would be dropped at the 8191-character point (meaning that particular object would not be found), but the rest of the line after 8191 characters executed successfully.
I am interested to know:
Can the ExportPackage command be executed in a way that does not hit the 8191 limitation?
Alternatively, can the ExportPackage command be split over multiple lines somehow?
Or is there some way to pass a file to the -objects parameter, rather than space separated values?
Or is it possible to append to (rather than replace) an .spk file?

I doubt there's any answer to this that you're going to like. The documentation you linked states that existing package files with the same names are overwritten and does not mention any way of appending to one.
You can split the command over multiple lines within the batch file using ^ characters, but this still doesn't get around the overall 8191 character limit after recombining the pieces.
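For illustration, a caret-continued version of the command would look something like this (the paths and object names here are placeholders):
ExportPackage -profile "\path\to\profile.swa" ^
  -package "\path\to\desired.spk" ^
  -objects "/path/to/Object1(Table)" ^
  "/path/to/Object2(Table)"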
Therefore, you will need to do one or more of the following:
Export your items to separate packages with different filenames or in different folders, e.g. 20 at a time (see the sketch after this answer)
Move your objects into a limited set of folders and subfolders before exporting, and export only the top level folders rather than the individual objects. It looks as though you can still use the other command line options to limit which objects are exported.
Silly option: create a dummy object with dependencies on all of the objects you want to export, mention only that one explicitly in the objects list, and use the -includeDep parameter to force the export utility to export all its dependencies.
Disclaimer: I have never actually used the export utility in question.
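For the first option, here is a minimal sketch (Python; the paths, object list, and batch filename are all placeholders) that writes one ExportPackage command per chunk of 20 objects, each producing its own package file:
# Hypothetical sketch: emit one ExportPackage line per chunk of 20 objects,
# each chunk going to its own .spk file, so no single line nears 8191 chars.
objects = ['"/path/to/Object1(Table)"', '"/path/to/Object2(Table)"']  # placeholder list
profile = r'-profile "\path\to\my\profile.swa"'
chunk_size = 20
with open("export_all.bat", "w") as bat:
    for n, i in enumerate(range(0, len(objects), chunk_size)):
        package = rf'-package "\path\to\export_{n:03d}.spk"'
        objs = " ".join(objects[i:i + chunk_size])
        bat.write(f"ExportPackage {profile} {package} -objects {objs}\n")
Each command line stays well below the 8191-character ceiling, at the cost of ending up with several .spk files.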

I got around this issue by inserting a dummy character at the 8191 point. Note that this 8191-character limit applies to everything in the command AFTER "ExportPackage".
One solution is therefore as follows:
/* If the ExportPackage command line is more than 8191 characters, it will
   fail due to a Windows line-length limitation. To avoid this, add a
   hash character at the 8191 point. */
%let log=\path\to\my.log;
%let profile=-profile "\path\to\my\dummy\profile.swa";
%let package=-package "\path\to\my\desired.spk";
%let str=&my_list_of_objects; /* previously defined */
%let breakpoint=%eval(8191 - %length(%str(&profile &package -objects)) - 1);
%if %length(&str) >= &breakpoint %then
  %let objects=%substr(&str,1,&breakpoint-1)#%substr(&str,&breakpoint,%length(&str)-&breakpoint+1);
%else %let objects=&str;
The command could then be executed along the lines of:
ExportPackage &profile &package &objects -subprop -includeEmptyFolders -log &log
What DIDN'T work:
Inserting spaces between the quoted values at the 8191 point, e.g.:
"object1" "object2"
The extra spaces were ignored and a character was still removed from the second object.
Inserting spaces within the literal, e.g.:
"object1 " "object2"
Object 1 was not found, presumably due to the trailing spaces.

Related

Determining number of characters in SQL Script

I'm looking to parameterize a SQL script that holds more than 8000 characters, and since a variable can only hold 8000 characters, I am wondering if there is a way to determine how many characters are in a specific script, so I would have some foresight on when I should use a new variable.
Any ideas?
I had many cases like this, and the simplest free tool to use is Notepad++. Just copy your script there and start selecting characters (you can first press Ctrl+A to see whether the script is more than 8000 characters at all). There is a "Sel" counter in the status bar at the bottom; when it reaches about 8000 characters, just break your current variable and start a new one.
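If you would rather not count by hand in an editor, a couple of lines of Python give the same answer (assuming the script is saved to disk as script.sql):
# Count the characters in a script file and estimate how many
# 8000-character variables would be needed to hold it.
import math

with open("script.sql", encoding="utf-8") as f:
    text = f.read()
print(len(text))                    # total number of characters
print(math.ceil(len(text) / 8000))  # rough number of variables needed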

JMeter read from csv file to array

Is there any possibility to read files from .csv into array of variables?
Instead of getting:
https://loadtest.com/mo/75245.json
https://loadtest.com/mo/190554MHG.json
https://loadtest.com/mo/190223MJG.json
https://loadtest.com/mo/198533FTR.json
...
I would like to get an array:
https://loadtest.com/mo/75245.190554MHG.190223MJG.198533FTR.19023.HGTYTRWEYRWEHF.1922MHGDGO.json
Does anybody have some idea?
Thank you in advance.
Check out the following JMeter Functions:
__FileToString() - to read your CSV file into a JMeter Variable
__split() - to "split" the aforementioned JMeter Variable holding the CSV file content into separate variables, using any suitable delimiter (comma, tabulation symbol, newline, whatever)
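For example (the file path and variable names here are made up, so treat this as a sketch):
${__FileToString(/path/to/urls.csv,,csvData)}
${__split(${csvData},part,${__char(10)})}
After the split, the individual lines should be available as ${part_1}, ${part_2}, and so on, ready to be joined however you need.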
A workaround for this, if you don't want to use Groovy, can be to use a text editor that supports regex (like Notepad++) to restructure your CSV so that multiple lines are collapsed into a single multi-value line.
An example for Notepad++ would be replacing all instances of:
^(.+)\R(.+)\R(.+)\R
With
$1 $2 $3
To collapse every 3 lines of text into a single line.
Then you can just use that one line as a single variable in JMeter. This way I've passed multiple comma-separated IDs into an array in an HTTP request. Remember to use a different delimiter in the JMeter CSV Data Set Config for the actual CSV columns than the one used to delimit your multiple values.
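If the file is large or changes often, the same collapsing is easy to script; here is a minimal Python sketch under the same assumption (every 3 lines become one space-separated line; the filenames are made up):
# Collapse every `group` consecutive non-empty lines of the CSV into a
# single space-separated line, mirroring the Notepad++ regex above.
group = 3
with open("input.csv") as src, open("collapsed.csv", "w") as dst:
    lines = [ln.strip() for ln in src if ln.strip()]
    for i in range(0, len(lines), group):
        dst.write(" ".join(lines[i:i + group]) + "\n")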

Writing a batch program for renaming files

Gentlemen and -women, one question regarding programming:
I need to write a program to batch rename files, under the following conditions:
All files are in a directory. Can be the same directory as the executable, if this simplifies things pathwise.
All files are .pdf, but this should not matter I believe.
All files start with a double-digit prefix ranging from 01 to 99,
e.g.: 01_FileNameOriginal1.pdf ; 02_FileNameOriginal2.pdf
This double digit needs to stay.
All names must be modified to a predefined range of filenames (sort of a database), which can be embedded in the executable or read from an external file (.txt, .csv, whatever is most feasible).
e.g.: 01_NewName1.pdf ; 02_NewName2.pdf ; ...
Some original filenames contain an expiry date, labelled "EXP YYYY MM DD", which should be appended to the new name in a different format: "e_YYYY_MM_DD". So basically the program needs a "for" statement (to loop over the number of files) and an "if" statement to search for "EXP" (matching case), cut the string, and append it in the rearranged format before the file extension.
The result of the program can be either renaming, or returning a renamed copy. The former seems easier to do.
So to summarize:
step 1: Run program
step 2: integer = number of files
step 3: loop:
3.A check first two digits, copy to a temp string.
3.B compare the temp string with an array of predefined filenames. The array can be embedded in the code, or read externally. For user-friendliness' sake, reading from an external .csv makes it easier to modify the filenames later on.
3.C to rename or not to rename. In case of a match:
Assume the old file has the following name: 01_FileNameOriginal EXP YYYY MM DD Excessivetext.pdf
Copy first two digits to a temp2 string
Scan the old name for EXP (for the length of the filename, if = "EXP ", matching case) and cut the following 10 characters. Put YYYY, MM, and DD in separate strings. All essential value has now been extracted from the old filename.
Match dbl digits with the first two digits of filenames in the database. Copy the new name to a temp string.
Rename the file with the new name: eg 01_NewName.pdf
Append date strings before the extension: eg 01_NewName_e_YYYY_MM_DD.pdf
Note: date can be extracted in a single string, as long as spaces are replaced by underscores. Probably shorter to program.
In case of no match: copy the old filename and put it in a temp string, to return at the end of the process as an error (or .txt) file listing filenames that could not be renamed.
step 4: finish and return error or report (see previous)
Question:
Based on these conditions, what would be the easiest way to get started? Are there freeware programs that can do such a thing? If not, what is the best approach? I have basic programming knowledge (Java/VBA) and some small C++ stints, but nothing spectacular. I have only programmed in a programming environment and have never produced any executables or batch files or the like, so I don't have any idea how to start at the moment, but wouldn't mind giving it a shot. As long as it's a guided shot, and not one in the dark, because that's where I am now.
Would love to hear some thoughts on this.
Greetings
Wouter
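For what it's worth, here is a minimal Python sketch of steps 1 through 4 (the names.csv format, the date pattern, and the folder layout are all assumptions, not part of the original spec):
# Hypothetical sketch: rename NN_*.pdf files using a prefix->name map read
# from names.csv (lines like "01,NewName1"), appending any "EXP YYYY MM DD"
# date as "_e_YYYY_MM_DD" before the extension, and reporting non-matches.
import csv, os, re

folder = "."  # directory containing the PDFs
with open("names.csv", newline="") as f:
    mapping = {prefix: name for prefix, name in csv.reader(f)}

errors = []
for old in os.listdir(folder):
    if not old.lower().endswith(".pdf"):
        continue
    prefix = old[:2]
    if prefix not in mapping:
        errors.append(old)  # no match: collect for the error report
        continue
    new = f"{prefix}_{mapping[prefix]}"
    m = re.search(r"EXP (\d{4}) (\d{2}) (\d{2})", old)
    if m:  # append the date as e_YYYY_MM_DD
        new += "_e_{}_{}_{}".format(*m.groups())
    os.rename(os.path.join(folder, old), os.path.join(folder, new + ".pdf"))

with open("errors.txt", "w") as f:
    f.write("\n".join(errors))
Plain Windows batch can do this too, but the string slicing and date reformatting are far less painful in a scripting language.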

Which is the best character to use as a delimiter for ETL?

I recently unloaded a customer table from an Informix DB, and several rows were rejected because the customer name column contained non-escaped vertical bar (pipe) characters, which is the default DBDELIMITER in the source DB. I found out that the field in their customer form has an input mask allowing any alphanumeric character to be entered, which can include any letters, numbers or symbols. So I persuaded the user to run a blanket update on that column to change the pipe symbol to a semicolon. I also discovered other rows containing asterisks and commas in different columns. I can imagine what would happen if this table were unloaded in CSV format, or what damage the asterisks could do!
What is the best character to define as a delimiter?
If tables are already tainted with pipes, commas, asterisks, tabs, backslashes, etc., what's the best way to clean them up?
I have to deal with large volumes of narrative data at my job. This is always a nightmare because users are apt to put ANY character in there, including unprintable characters. You can run a cleanup operation, but you have to do it every time you load data, and it likely won't work forever. Eventually someone will put in whatever character you choose as a separator, which is not a problem if your CSV-handling libraries can handle escaping properly, but many can't. If this is a one-time load/unload, you're probably fine, but if you have to do it more often....
In the past I've changed the separator to the back-tick '`', the tilde '~', or the caret '^'. All failed in the current effort. The best solution I could come up with is to not use CSV format at all. I switched to XML. Even so there were still XML illegal characters, but these can be translated out with atlassian-xml-cleaner-0.1.jar.
Unload the customer table; instead of the default pipe, string-search the data for a character that doesn't exist in it, e.g. "~":
unload to file delimiter "~"
select * from customer;
Clean your file (or not):
(vi replace string) :g/theoldstring/s//thenewstring/g
or
(unix prompt) sed 's/old-char/new-char/g' fileold > filenew
(Once clean, I'd personally change the "~" in the unload file back to "|" or "," as the CSV standard.)
Load to the source db.
If you can, use a multi-character delimiter. It can still fail, but failure becomes far less likely.
Or, escape the delimiter while writing the export file (the Informix docs say "LOAD TABLE" escapes by prefixing delimiter characters with a backslash). Proper CSV has quoting and escaping, so it shouldn't matter if a comma is in the data, unless your exporter and loader cannot handle proper CSV.
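To illustrate that last point, here is a short sketch using Python's csv module (the filename and values are arbitrary), showing that a round trip survives delimiters, quotes and newlines embedded in the data:
# With proper quoting, the delimiter choice stops mattering: the writer
# quotes any field containing the delimiter, a quote character, or a newline.
import csv

rows = [["ACME|Corp", 'He said "hi"', "line1\nline2"]]  # hostile field values
with open("out.csv", "w", newline="") as f:
    csv.writer(f, delimiter="|", quoting=csv.QUOTE_MINIMAL).writerows(rows)
with open("out.csv", newline="") as f:
    assert list(csv.reader(f, delimiter="|")) == rows  # round-trips intact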

Removing a line from a text file?

I am working with a text file which contains a list of processes under my program's control, along with relevant data.
At some point, one of the processes will finish, and thus will need to be removed from the file (as it's no longer under control).
Here is a sample of the file contents (entries are added "randomly"):
PID=25729 IDLE=0.200000 BUSY=0.300000 USER=-10.000000
PID=26416 IDLE=0.100000 BUSY=0.800000 USER=-20.000000
PID=26522 IDLE=0.400000 BUSY=0.700000 USER=-30.000000
So for example, if I wanted to remove the line that says PID=26416.... how could I do that, without writing the file over again?
I can use external unix commands, however I am not very familiar with them so please if that is your suggestion, give an example.
Thanks!
Either you keep the contents of the file in memory and then rewrite the file, or you could have a file for each of the PIDs with the relevant information in it, and simply delete that file when the process is no longer running. Or you could use a database for this instead.
As others have already pointed out, your only real choice is to rewrite the file.
The obvious way to do that with "external UNIX commands" would be grep -v "PID=26416" (or whatever PID you want to remove, obviously).
Edit: It is probably worth mentioning that if the lines are all the same length (as you've shown here) and order doesn't matter, you can delete a line more efficiently by copying the last line into the space being vacated, then shortening the file to eliminate what had been the last line. This only works if the lines really are all the same length, though (e.g., if you got a PID of '1', you'd need to pad it to the same length as the others in the file).
The only way is by copying each character that comes after the deleted line down over the characters that are deleted.
It is far more efficient to simply rewrite the file.
how could I do that, without writing the file over again?
You cannot. Filesystems (perhaps besides more esoteric record-based ones) do not support insertion or deletion in the middle of a file.
So you'll have to write the lines to a temporary file up to the line you want to delete, skip over that line, and write the rest of the lines to the temporary file. When done, rename/copy the temp file to the original filename.
Why are you maintaining these in a text file? That's not the best model for such a task. But, if you're stuck with it ... if these lines are guaranteed to all be the same length (it appears that way from the sample), and if the order of the lines in the file doesn't matter, then you can write the last line over the line for the process that has died and then shorten the file by one line with the (f)truncate() call if you're on a POSIX system: see Jonathan Leffler's answer in How to truncate a file in C?
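A minimal sketch of that same-length trick in Python (the record length, filename, and PID are assumptions; every line, newline included, must be exactly RECLEN bytes):
# Overwrite the doomed fixed-length record with the last record, then
# truncate the file by one record.
import os

RECLEN = 54  # bytes per line, newline included
target = b"PID=26416"
with open("processes.txt", "r+b") as f:
    size = f.seek(0, os.SEEK_END)
    nrecs = size // RECLEN
    for i in range(nrecs):
        f.seek(i * RECLEN)
        if f.read(RECLEN).startswith(target):
            f.seek((nrecs - 1) * RECLEN)  # read the last record...
            last = f.read(RECLEN)
            f.seek(i * RECLEN)
            f.write(last)  # ...copy it over the doomed one
            f.truncate((nrecs - 1) * RECLEN)  # drop the duplicate tail
            break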
But note carefully netrom's answer, which gives three different better ways to maintain this info.
Also, if you stick with a text file (preferably written from scratch each time from data structures you maintain, as per netrom's first suggestion), and you want to be sure that the file is always well formed, then write the new data into a temp file on the same device (putting it in the same directory is easiest) and then do a rename() call, which is an atomic operation.
You can use sed:
sed -i.bak -e '/PID=26416/d' test
-i is for editing in place; the .bak suffix makes sed save a backup copy of the original file with that extension.
-e is for specifying the script; the /d means all lines matching the pattern /PID=26416/ should be deleted.
test is the filename
The unix command for it is:
grep -v "PID=26416" myfile > myfile.tmp
mv myfile.tmp myfile
The grep -v part outputs the file without the rows with the search term.
The > myfile.tmp part creates a new temp file for this output.
The mv part renames the temp file to the original file.
Note that we are rewriting the file here too; moreover, we can lose data if someone writes to the file between the two commands.
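The same rewrite can be done from code with an atomic rename, per the temp-file suggestion above; a minimal Python sketch (filename and PID taken from the question):
# Copy every line except the doomed one to a temp file, then atomically
# swap it into place; os.replace is atomic on POSIX, so readers never see
# a half-written file (it does not, however, serialize concurrent writers).
import os

with open("myfile") as src, open("myfile.tmp", "w") as dst:
    for line in src:
        if "PID=26416" not in line:
            dst.write(line)
os.replace("myfile.tmp", "myfile")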
