Stripping text from the beginning/end of a variable - arrays

I have an array containing a list of backup files. I want to go through it and strip off the leading /path/to/file/ and the trailing _date_stamp.tar.gz. My code strips the leading path, and it also works if I only strip the .tar.gz, but if I try to strip the date it fails. As an example, I want to take:
/path/to/file/backup_domain1.com_02_16_2015.tar.gz
and be left with:
domain1.com
Removed from the start: /path/to/file/backup_
Removed from the end: _02_16_2015.tar.gz, but since the files are date-stamped the digits will vary.
My code snippet:
# strip leading path/to/file :
bubasedir=/path/to/file
buarray=( "${buarray[@]#"$bubasedir/backup_"}" )
buarray=( "${buarray[@]%".tar.gz"}" )
This strips .tar.gz but I need to strip the date as well.

Use a pattern which matches the date stamp, just like you do for the prefix. Assuming the domain name cannot contain an underscore (as per the DNS spec, but sometimes violated for internal domains and special domains like _dkim),
buarray=( "${buarray[@]%%_*}" )
%% says to trim the longest possible match, and _* matches everything starting from an underscore, so this trims from the first underscore onwards. ("${buarray[@]%_*}" would trim only from the last underscore.)
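
Putting the two expansions together, here is a minimal sketch. The sample file names are made up for illustration, and the second variant assumes the stamp always has the form _MM_DD_YYYY.tar.gz, which the question suggests but does not guarantee.

```bash
# Hypothetical sample data; the paths and dates are invented for illustration.
bubasedir=/path/to/file
buarray=(
  "/path/to/file/backup_domain1.com_02_16_2015.tar.gz"
  "/path/to/file/backup_domain2.org_11_03_2014.tar.gz"
)

# Remove the leading "/path/to/file/backup_" from every element.
buarray=( "${buarray[@]#"$bubasedir/backup_"}" )

# Remove the longest trailing match of "_*", i.e. the first underscore onwards.
buarray=( "${buarray[@]%%_*}" )

printf '%s\n' "${buarray[@]}"

# Variant for names that may contain underscores: match the stamp explicitly.
# Assumes the stamp is always _MM_DD_YYYY.tar.gz.
name="backup_my_host.example_02_16_2015.tar.gz"
name=${name#backup_}
name=${name%_[0-9][0-9]_[0-9][0-9]_[0-9][0-9][0-9][0-9].tar.gz}
printf '%s\n' "$name"
```

The explicit date pattern is longer, but it leaves underscores inside the name alone, whereas %%_* cuts at the first underscore it finds.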

Related

ksh: remove last extension from a multiple extension filename

I have a filename in the format dir1/dir2/filename.txt.org and I would like to rename it to dir1/dir2/filename.txt. How can this be done? I tried cut with '.' as the separator, but it also removes .txt.
You can use Korn shell variable expansion instead of a subprocess (e.g. cut). This can be much faster.
example:
var1=dir1/dir2/filename.txt.org
var2=${var1%.*}
If you now print $var2 its value will be dir1/dir2/filename.txt
The % tells it to delete the shortest trailing match of the pattern .* (that is, the rightmost period and everything after it). Note that this is a shell glob pattern, not a regular expression, so the . is literal.
${variable%pattern} - return the value of variable without the smallest ending portion that matches pattern.
Other variable expansion formats are available, it is worthwhile to study the docs.
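
As a quick illustration of those other formats, here is a small sketch of the four prefix/suffix-trimming operators applied to the same value. The variable names other than var1 are made up for this example; the syntax is the same in ksh and bash.

```bash
var1=dir1/dir2/filename.txt.org

short_suffix=${var1%.*}     # shortest trailing ".*"  removed
long_suffix=${var1%%.*}     # longest trailing ".*"   removed
short_prefix=${var1#*/}     # shortest leading "*/"   removed
long_prefix=${var1##*/}     # longest leading "*/"    removed

printf '%s\n' "$short_suffix" "$long_suffix" "$short_prefix" "$long_prefix"
```

For the rename itself, something like mv -- "$var1" "${var1%.*}" should do it, assuming the target name is not already taken.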

How to avoid sub folders in snowflake copy statement

I have a requirement to exclude a certain folder under a prefix when processing data in Snowflake (COPY statement).
In the below example I need to process files under emp/ and exclude files from abc/
Input :
s3://bucket1/emp/
E1.CSV
E2.CSV
/abc/E11.csv
s3://bucket1/emp/abc/ - E11.csv
Output :
s3://bucket1/emp/
E1.CSV
E2.CSV
Is there any suggestion around pattern to handle this ?
With the pattern keyword you can try to exclude certain files. However, when you use a negated character class ([^...]) for this, you exclude every file that has any of the listed characters in that position, not just the folder you had in mind.
Assuming your stage URL is defined as s3://bucket1/emp/
LS @MY_STAGE pattern = '[^abc].*';
Excludes anything starting with a, b, or c
LS @MY_STAGE pattern = '[^a][^b][^c][^\\/].*';
Excludes anything where:
The first character is a, OR
The second character is b, OR
The third character is c, OR
The fourth character is a forward slash /
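
To see why the negated class over-excludes, here is a rough stand-in using bash's regex matching rather than Snowflake itself. It assumes the pattern is tested against the path relative to the stage URL and must match the whole path; the file b_report.csv and the helper is_listed are hypothetical, added only to show a file that gets excluded by accident.

```bash
# Stand-in for the first pattern above, anchored so it must match the whole path.
re='^[^abc].*$'

# Hypothetical helper: succeeds when the path would be listed.
is_listed() {
  [[ $1 =~ $re ]]
}

# Paths relative to the stage; b_report.csv is invented for illustration.
for path in "E1.CSV" "E2.CSV" "abc/E11.csv" "b_report.csv"; do
  if is_listed "$path"; then
    echo "listed:   $path"
  else
    echo "excluded: $path"
  fi
done
```

Both abc/E11.csv and b_report.csv are excluded here, because the class only looks at the first character and knows nothing about the folder name as a whole.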
Edit
After testing with Sharvan's example. Here is what I've found:
Doesn't work:
ls @my_stage PATTERN='^((?!/abc/).)*$'; because the first forward slash is duplicated as part of the stage URL (it is automatically appended to the stage URL if not present)
Works: ls @my_stage PATTERN='^((?!abc/).)*$'; because the first forward slash is removed
Updated as the forward slash does not need to be escaped
Snowflake's documentation says backreferences are not supported, but it does not mention lookaheads or lookbehinds, which I had assumed were unsupported.
https://docs.snowflake.net/manuals/sql-reference/functions-regexp.html#backreferences
Use this to exclude the prefix pattern
ls @stage PATTERN='^((?!/abc/).)*$'

RegEx for matching two letters with special boundaries

I want to make sure user input has:
Two letters at the start
And the support for any number of optional space characters following these two letters.
Additionally, if at least one space character is provided, optionally allow letters, digits or . characters after it.
Here's the expression I currently have:
[a-zA-Z][a-zA-Z] (?\\s+ (?a-zA-Z0-9.))
And here's my thinking:
[a-zA-Z][a-zA-Z] makes sure the input begins with at least two letters
(?\\s+ begins an optional statement. This optional statement must start with at least one space (I'm on windows which is why I have two slashes).
(?a-zA-Z0-9.)) finishes the optional statement. So, if at least one space is provided, at least one optional character, number or . can also be added.
For instance, ab, ab , ab .s, and ab .asd2 should all be valid inputs.
How do I solve this problem?
The problem with your attempt is that both (?\ and (?a are syntax errors. If you want to create an optional group, you need to write (...)?, not (?...).
(The other issue is that a-zA-Z0-9 in your regex matches literally because it's not part of a character class.)
Besides, \s (to match whitespace) does not exist in POSIX regex.
My suggestion:
^[a-zA-Z]{2}( +[a-zA-Z0-9.]*)?$
That is:
^ # beginning of string
[a-zA-Z]{2} # exactly two letters
(
\ + # one or more spaces
[a-zA-Z0-9.]* # zero or more of: letters, digits, or dot
)? # ... this group is optional
$ # end of string
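
A quick way to check the suggested expression against the sample inputs is bash's [[ =~ ]] operator, which uses POSIX extended regular expressions. This is only a sketch: the helper name is_valid and the extra invalid inputs are made up here.

```bash
# Keep the pattern in a variable so bash treats it as a regex, not a quoted literal.
re='^[a-zA-Z]{2}( +[a-zA-Z0-9.]*)?$'

# Hypothetical helper: succeeds when the input matches the expression.
is_valid() {
  [[ $1 =~ $re ]]
}

# The first four inputs come from the question; the last three are invented.
for input in "ab" "ab " "ab .s" "ab .asd2" "a" "abc" "ab.s"; do
  if is_valid "$input"; then
    printf 'valid:   [%s]\n' "$input"
  else
    printf 'invalid: [%s]\n' "$input"
  fi
done
```

The four inputs from the question are accepted, while "a", "abc" and "ab.s" are rejected because anything after the two letters must start with at least one space.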

C How do i specify a POSIX regex that begins in a blank line and ends in a blank line?

I am trying to write code to scan a file and produce a "match!" message when the tool reads a certain line of code preceded and followed by blank lines. The line I am interested in matching is:
Appliance Version 3.1.2
Using regex.h, I have a simple tool that compiles my regex pattern then executes it against every line in the file to search for a match. The basic functionality of the tool is fine: I am able to get it to successfully search for various regex matches. Trouble arises when I try to match a regex containing a blank line before and after the above line of text. Here is my precompiled regex:
[[:space:]]+\n^Appliance Version [[:alnum:]]$\n
I have tried a series of different combinations similar to this, and nothing seems to work. I think it might have to do with \n in which case I would need to figure out a new way to specify the two blank lines. Any insight of POSIX regex would be greatly appreciated!
Looking at your regex, it looks like it is trying to match
Appliance Version [[:alnum:]]
at the end of a line ($). That would be matched by
Appliance Version 3
(3 is an instance of [:alnum:]), but not by
Appliance Version 33
([[:alnum:]] only matches one character), and much less by
Appliance Version 3.1.2
(the above problem, and also . is not an instance of [:alnum:])
So at a minimum you need to change [[:alnum:]] to [.[:alnum:]]* (or some such).
In addition, your use of ^ and $ is redundant with the explicit \n, but nothing in the regex requires the match to be preceded or followed by a blank line. For example, [[:space:]]\n would happily be matched with the line:
Not a blank line, but with a blank at the end: \n
(where I've written the \n explicitly to show the blank character at the end of the line.)
Matching blank lines
A single blank line is matched with ^[[:space:]]*$. That does not match the newlines at either end. If you want to match a blank line before something, use: ^[[:space:]]*\nSOMETHING. To match a blank line after something: SOMETHING\n[[:space:]]*$. Or, if you really want a blank line before and after: ^[[:space:]]*\nSOMETHING\n[[:space:]]*$. (But that won't match if SOMETHING happens to be the first line of the input, for example. Or the last line.)
As @rici notes, you cannot combine \n^ to match two blank lines -- the markers ^ and $ match a position, not a literal \n character.
To match a blank line, use \n\n, or -- better because you probably don't want to do anything with the hard return that ends the line above, (?<=\n)\n at the start. You can leave the \n\n at the end, though.
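
Since the tool in the question runs the regex against one line at a time, a pattern containing \n has nothing to match across. One alternative is to keep single-line patterns and remember the two previous lines. The sketch below does that in bash rather than C with regex.h (the POSIX extended syntax is the same); the function name find_isolated_version is hypothetical, and the version pattern is an assumption about what counts as a version string.

```bash
# Hypothetical helper: reads lines on stdin and prints "match!" whenever the
# target line has a blank line directly before and directly after it.
find_isolated_version() {
  local blank='^[[:space:]]*$'
  local target='^Appliance Version [[:alnum:].]+$'
  local prev2='x' prev1='x' line   # start non-blank so the first line cannot match

  while IFS= read -r line || [[ -n $line ]]; do
    # Window is: prev2 (line before), prev1 (candidate), line (line after).
    if [[ $prev2 =~ $blank && $prev1 =~ $target && $line =~ $blank ]]; then
      echo "match!"
    fi
    prev2=$prev1
    prev1=$line
  done
}

printf 'header\n\nAppliance Version 3.1.2\n\nfooter\n' | find_isolated_version
```

Like the multi-line patterns above, this does not report a target on the first or last line of the input, because there is no blank line on that side.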

vim, reformat text to initializers

I've a big file with lines that look like
2 No route to specified transit
network
3 No route to destination
i.e. a number at the start of a line followed by a description.
And I'd like to transform that for use as a struct initializer
{2,"No route to specified transit
network"},
{3,"No route to destination"},
How would I do this ?
Try
:%s/^\(\d\+\)\s\(.*\)$/{\1, "\2"},/
This uses search-and-replace and searches for a line starting with a digit, followed by whitespace, followed by arbitrary text until the end of the line. This is replaced by the pattern you specified.
Or, using "very magic" mode (\v), which removes the need for most of the backslashes (thanks to Al in the comments):
:%s/\v^(\d+)\s(.*)$/{\1, "\2"},/
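
If the file needs converting outside the editor, roughly the same substitution can be written with sed. This is a swapped-in tool, not the vim command itself; the function name to_initializer is made up, and \d and \s are replaced with POSIX classes because sed -E does not portably support them.

```bash
# Hypothetical helper: same substitution as above, expressed as a sed filter.
to_initializer() {
  sed -E 's/^([0-9]+)[[:space:]](.*)$/{\1, "\2"},/'
}

printf '3 No route to destination\n' | to_initializer
```

As with the vim command, a continuation line such as "network" does not start with a number, so it is passed through unchanged; wrapped descriptions would need to be joined first.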
