How do I set word delimiters? - arrays

The User's Guide, chapter 6.1.5 "The Word Chunk", says: "A word is a string of characters delimited by space, tab, or return characters or enclosed by double quotes." Is it possible to have additional word delimiters?
I have the following code snippet taken from the User's Guide chapter 6.5.1 'When to use arrays', p. 184
on mouseUp
--cycle through each word adding each instance to an array
repeat for each word tWord in field "sample text"
add 1 to tWordCount[tWord]
end repeat
-- combine the array into text
combine tWordCount using return and comma
answer tWordCount
end mouseUp
It counts the number of occurrences of each word form in the field "sample text".
I realize that, with the default setting, a full stop after a word is counted as part of the word.
How do I change the settings so that a full stop (and/or a comma) is considered a word boundary?

Alternatively you could simply remove the offending characters before processing.
This can be done using either the REPLACE command or the REPLACETEXT function.
The REPLACETEXT function can use a regular expression as its match string but is slower than the REPLACE command, so here I am using REPLACE.
on mouseUp
put field "sample" into twords
--remove all trailing punctuation and quotes
replace "." with "" in twords
replace "," with "" in twords
replace "?" with "" in twords
replace ";" with "" in twords
replace ":" with "" in twords
replace quote with "" in twords
--hyphenated words need to be separated?
replace "-" with " " in twords
repeat for each word tword in twords
add 1 to twordcount[tword]
end repeat
combine twordcount using return and comma
answer twordcount
end mouseUp

I think you are asking a question about delimiters. Some delimiters are built-in:
spaces for words,
commas for items,
return (CR) for lines.
The ability to create your own custom delimiter property (the itemDelimiter) is a powerful feature of the language, and pertains to "items". You can set this to any single character:
set the itemDelimiter to "C"
answer the number of items in "XXCXXCXX" --call this string "theText"
The result will be "3"
As others have pointed out, replacing one string with another gives you formidable control over custom parsing of text:
replace "C" with space in theText
yields "XX XX XX"
Craig Newman

As the User's Guide says in chapter 6.1.5 "The Word Chunk": "A word is a string of characters delimited by space, tab, or return characters or enclosed by double quotes."
There is an itemDelimiter but no wordDelimiter.
So punctuation has to be removed first, before adding the word to the word count array.
This may be done with a function effectiveWord.
function effectiveWord aWord
put last char of aWord into it
if it is "." then delete last char of aWord
if it is "," then delete last char of aWord
if it is ":" then delete last char of aWord
if it is ";" then delete last char of aWord
return aWord
end effectiveWord
on mouseUp
--cycle through each word adding each instance to an array
repeat for each word tWord in field "Sample text"
add 1 to tWordCount[effectiveWord(tWord)]
end repeat
-- combine the array into text
combine tWordCount using return and comma
answer tWordCount
end mouseUp
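For comparison, the same idea (treat punctuation as a word boundary, then count) can be sketched in Python. This is only an illustration, not LiveCode: the function name count_words and the default delimiter set are assumptions, and unlike the scripts above it splits on the extra characters instead of deleting them.

```python
import re
from collections import Counter

def count_words(text, extra_delimiters=".,;:?\"-"):
    """Count words, treating whitespace and extra_delimiters as boundaries.

    count_words and its default delimiter set are hypothetical names
    chosen for this sketch.
    """
    # Build a character class: whitespace plus the escaped extra characters.
    pattern = "[\\s" + re.escape(extra_delimiters) + "]+"
    # re.split can yield empty strings at the edges, so filter them out.
    words = [w for w in re.split(pattern, text) if w]
    return Counter(words)

sample = "The cat sat. The dog, the cat; well-fed?"
counts = count_words(sample)
print(counts)
```

Note that this sketch is case-sensitive, so "The" and "the" are counted separately.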

Related

Repeat loop with if conditions for first and last time

I have a script that I made that works fine but I have to make some very minor edits to the output. Instead I'd like to just do it correctly.
on run {input, parameters}
set the formatted to {}
set listContents to get the clipboard
set delimitedList to paragraphs of listContents
repeat with listitem in delimitedList
set myVar to "#\"" & listitem & "\"," & (ASCII character 10)
copy myVar to the end of formatted
end repeat
display dialog formatted as string
return formatted as string
end run
I'd like to prepend the first item slightly differently and append the last a little differently.
I tried the following but the script is not right.
repeat with n from 1 to count of delimitedList
-- not sure how to if/else n == 0 or delimitedList.count
end repeat
There is a more efficient way: text item delimiters. They can insert the comma and the linefeed character between the list items:
on run {input, parameters}
set the formatted to {}
set listContents to get the clipboard
set delimitedList to paragraphs of listContents
repeat with listitem in delimitedList
copy "#" & quote & listitem & quote to the end of formatted
end repeat
set {saveTID, text item delimiters} to {text item delimiters, {"," & linefeed}}
set formatted to formatted as text
set text item delimiters to saveTID
display dialog formatted
return formatted
end run
Side note: ASCII character 10 has been deprecated since macOS 10.5 Leopard. Use the constants instead: linefeed (10), tab (9), return (13), space (32) and quote (34).
In addition to an index or range, the various items of a list can also be accessed by using location parameters such as first, last, beginning, etc (note that AppleScript lists start at index 1). To deal with the first and last items separately, you can do something like:
on run {input, parameters} -- example
set formatted to {}
set delimitedList to paragraphs of (the clipboard)
if delimitedList is not {} then
if (count delimitedList) > 2 then repeat with anItem in items 2 thru -2 of delimitedList
set the end of formatted to "#" & quote & anItem & quote
end repeat
set the beginning of formatted to "#" & quote & "First: " & first item of delimitedList & quote -- or whatever
if rest of delimitedList is not {} then set the end of formatted to "#" & quote & "Last: " & last item of delimitedList & quote -- or whatever
set {tempTID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, "," & linefeed}
set {formatted, AppleScript's text item delimiters} to {formatted as text, tempTID}
end if
display dialog formatted as text
return formatted as text
end run
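The first/last logic above can also be sketched in Python, purely as a language-neutral illustration of the technique (label the first and last items, then join with a comma and linefeed). The function name format_lines and the "First: " / "Last: " labels are assumptions carried over from the example above.

```python
def format_lines(text):
    """Wrap each line as #"..." and join with ',' + linefeed.

    format_lines is a hypothetical helper; the labels mirror the
    "or whatever" placeholders in the AppleScript example.
    """
    items = text.splitlines()
    formatted = []
    for index, item in enumerate(items):
        if index == 0:
            label = "First: "   # first item gets special treatment
        elif index == len(items) - 1:
            label = "Last: "    # last item, only when there are 2+ items
        else:
            label = ""
        formatted.append('#"' + label + item + '"')
    # The join puts the separator between items only, never after the last.
    return ",\n".join(formatted)

print(format_lines("alpha\nbeta\ngamma"))
```

Because the separator is inserted by the join, there is no trailing comma to edit out afterwards.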

Python joining a list by "\"

Let's say I have a list of elements.
l = ["xf3", "x03", "x8c"] etc.
Now I would like to join the elements inside my list with a "\". I tried r"\".join(l) but it didn't work.
\ is used to escape 'special' characters, hence a Python string can not terminate with a single \ because it escapes the closing quote.
You have to escape it by using a second \, ie '\\'.join(l)
l = ["xf3", "x03", "x8c"]
'\\'.join(l)
The important part is to escape the backslash as '\\', because in Python strings:
the backslash "\" is a special character, also called the "escape"
character. It is used in representing certain whitespace characters:
"\t" is a tab, "\n" is a newline, and "\r" is a carriage return. As well, "\"
can be used to escape itself: "\\" is the literal backslash character.
I'm assuming that what you actually want is to create a string containing those escaped characters. The easiest way I can think of is ast.literal_eval:
>>> import ast
>>> ast.literal_eval("'\\" + "\\".join(l) + "'")
'ó\x03\x8c'
This works by first creating a string of those strings joined by backslash characters (xf3\x03\x8c), surrounding those by quotes and adding the initial backslash ('\xf3\x03\x8c'), and finally, by evaluating it as a literal, to turn it from a length 12 string into a length 3 string.
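The steps described above can be checked one at a time. The following is a small sketch that only uses the list from the question; the variable names are chosen here for illustration.

```python
import ast

parts = ["xf3", "x03", "x8c"]

# Step 1: join with a single backslash (written '\\' in source code).
joined = "\\".join(parts)            # the 11 characters xf3\x03\x8c

# Step 2: add the leading backslash so every element is a \xNN escape.
escaped = "\\" + joined              # the 12 characters \xf3\x03\x8c

# Step 3: wrap in quotes and evaluate as a string literal.
decoded = ast.literal_eval("'" + escaped + "'")

print(len(joined), len(escaped), len(decoded))
```

ast.literal_eval only evaluates literals, so it is a safer choice than eval for this kind of conversion.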

Regex to reject if all numbers and reject colon

I am trying to write a regex that will
reject if input is all numbers
accept alphanumeric
reject colon ':'
I tried
ng-pattern="/[^0-9]/" and
ng-pattern="/[^0-9] [^:]*$/"
For example,
"Block1 Grand-street USA" must be accepted
"111132322" must be rejected
"Block 1 grand : " must be rejected
You may use
ng-pattern="/^(?!\d+$)[^:]+$/"
To only forbid a : at the end of the string, use
ng-pattern="/^(?!\d+$)(?:.*[^:])?$/"
The pattern matches
^ - start of string
(?!\d+$) - no 1+ digits to the end of the string
[^:]+ - one or more chars other than :
(?:.*[^:])? - an optional non-capturing group that matches 1 or 0 occurrences of
.* - any 0+ chars other than line break chars, as many as possible
[^:] - any char other than : (if you do not want to match an empty string, remove the (?: and )?)
$ - end of string.
According to comments, you want to match any character but colon.
This should do the job:
ng-pattern="/^(?!\d+$)[^:]+$/"
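The two patterns can be tried against the sample inputs from the question. This sketch uses Python's re module rather than AngularJS, on the assumption that these particular patterns behave the same in both regex flavors; the names STRICT, TRAILING_ONLY and accepts are hypothetical.

```python
import re

# Rejects all-digit input and any colon anywhere in the string.
STRICT = re.compile(r"^(?!\d+$)[^:]+$")
# Rejects all-digit input and a colon only as the final character.
TRAILING_ONLY = re.compile(r"^(?!\d+$)(?:.*[^:])?$")

def accepts(pattern, value):
    """Return True when the whole value satisfies the pattern."""
    return pattern.search(value) is not None

for value in ["Block1 Grand-street USA", "111132322", "Block 1 grand : "]:
    print(repr(value), accepts(STRICT, value), accepts(TRAILING_ONLY, value))
```

Note that "Block 1 grand : " ends with a space, not a colon, so the second pattern accepts it; only the first pattern rejects a colon in the middle of the input.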

SQL Select statement until a character

I'm looking to extract all the text up until a '\' (backslash).
The substring is required to remove all preceding characters (17 in total), and so I would like to return everything after the 17th until it comes across a backslash.
I've tried using charindex, but it doesn't seem to stop at the \; it returns the characters afterward as well. My code is as follows:
SELECT path, substring(path,17, CHARINDEX('\',Path)+ LEN(Path)) As Data
FROM [Table].[dbo].[Projects]
WHERE Path like '\ENQ%\' AND
Deleted = '0'
Example
The below screen shot shows the basic query and result i.e the whole string
I then use substring to remove the first X characters, as there will always be the same number of preceding characters
But what I'm actually after is (based on the above result) the "Testing 1", "Testing 2" and "Testing ABC" section
The substring is required to remove all preceding characters (17 in total), and so I would like to return everything after the 17th until it comes across a backslash.
select
substring(path,17,CHARINDEX('\',Path,17)-17) -- search for the backslash from position 17 onward
from
table
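To overcome the "Invalid length parameter passed to the LEFT or SUBSTRING function" error, which occurs when no backslash is found from position 17 onward, you can use CASE:
select
substring(path,17,
CASE when CHARINDEX('\',Path,17)>0
Then CHARINDEX('\',Path,17)-17
else VA end -- VA stands in for a fallback length of your choice
)
from
table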
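The position arithmetic is easy to get wrong, so here is the same logic sketched in Python. The sample paths are invented for illustration (the real data is only shown in a screenshot), and segment_after_prefix is a hypothetical helper; the start argument follows SQL's 1-based SUBSTRING convention.

```python
def segment_after_prefix(path, start=17):
    """Return the text from 1-based position `start` up to the next backslash.

    Hypothetical helper mirroring SUBSTRING(path, 17, CHARINDEX(...) - 17).
    If no backslash follows, the rest of the string is returned (this
    fallback is an assumption of the sketch).
    """
    begin = start - 1                     # convert to Python's 0-based index
    end = path.find("\\", begin)          # like CHARINDEX('\', Path, 17)
    if end == -1:                         # like the CASE ... ELSE branch
        return path[begin:]
    return path[begin:end]

# Invented sample paths with a 16-character prefix before the wanted segment.
print(segment_after_prefix("\\ENQ\\2017\\00001\\Testing 1\\Docs"))
print(segment_after_prefix("\\ENQ\\2017\\00001\\Testing ABC"))
```

The key point is that the search for the backslash has to begin at the start position; searching from the beginning finds the leading backslash of the path instead.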

Including single quote in data while inserting to database

I have a string "I love McDonald's burgers. it's the best."
and I would like to insert it into a column breaking them into 15 character strings.
Hence I need the result as string inserted in 3 rows
I love McDonald
's burgers. it'
s the best.
But if I use '' to include the ', an extra ' is present in the string, which will affect my calculation of the 15-character breaks.
Is there any other way to include ' without having to use one more ' to escape it?
Please help.
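You don't need to worry about the extra ' if you put the string into a variable first. The doubled '' is only literal syntax; the variable holds a single ', so the 15-character split is not affected: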
DECLARE
mcdonald_string VARCHAR2(50) := 'I love McDonald''s burgers. it''s the best.';
BEGIN
WHILE LENGTH(mcdonald_string) > 0 LOOP
INSERT INTO your_table(your_field) VALUES (SUBSTR(mcdonald_string,1,15));
mcdonald_string := SUBSTR(mcdonald_string,16);
END LOOP;
COMMIT;
END;
Doubling the quotation marks within a complicated literal,
particularly one that represents a SQL statement, can be tricky. You
can also use the following notation to define your own delimiter
characters for the literal. You choose a character that is not present
in the string, and then do not need to escape other single quotation
marks inside the literal:
-- q'!...!' notation allows the use of single quotes
-- inside the literal
string_var := q'!I'm a string, you're a string.!';
http://docs.oracle.com/cd/B19306_01/appdev.102/b14261/fundamentals.htm#sthref339
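When the insert is driven from application code, bind parameters avoid quote escaping altogether. The sketch below is an illustration only: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for Oracle (an assumption of this example), and reuses the placeholder names your_table and your_field from the answer above.

```python
import sqlite3

text = "I love McDonald's burgers. it's the best."

# Split into 15-character pieces; the apostrophes count as one character each.
chunks = [text[i:i + 15] for i in range(0, len(text), 15)]

# In-memory SQLite database used as a stand-in for the real target database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (your_field TEXT)")

# The ? placeholder passes each value as data, so no quote doubling is needed.
conn.executemany(
    "INSERT INTO your_table (your_field) VALUES (?)",
    [(chunk,) for chunk in chunks],
)

rows = [row[0] for row in conn.execute(
    "SELECT your_field FROM your_table ORDER BY rowid")]
print(rows)
```

The three stored rows match the split shown in the question, because the value never passes through SQL literal syntax.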
