Case insensitive array in Lua

I'm trying to program an addon for WoW (in Lua). It's a chat filter based on specific words. I can't figure out how to make the array of these words case insensitive, so that any upper/lower case combination of a word matches the array. Any ideas would be greatly appreciated. Thanks!
local function wordFilter(self,event,msg)
    local keyWords = {"word","test","blah","here","code","woot"}
    local matchCount = 0;
    for _, word in ipairs(keyWords) do
        if (string.match(msg, word)) then
            matchCount = matchCount + 1;
        end
    end
    if (matchCount > 1) then
        return false;
    else
        return true;
    end
end

Use if msg:lower():find(word:lower(), 1, true) then
This lower-cases both arguments to string.find, which makes the match case insensitive.
I used string.find because you probably want the 'plain' option (the final true argument), which turns off pattern matching and doesn't exist for string.match.
You can also return as soon as the first word is found:
for _, keyword in ipairs(keyWords) do
    if msg:lower():find(keyword:lower(), 1, true) then return true end
end
return false
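For readers who want to try the idea outside the game client, here is a minimal sketch of the same technique in Python rather than Lua. It lower-cases both sides and does a plain substring test. The names `KEYWORDS` and `contains_keyword` are made up for this illustration and are not part of any WoW API.

```python
KEYWORDS = ["word", "test", "blah", "here", "code", "woot"]

def contains_keyword(msg, keywords=KEYWORDS):
    """Return True if any keyword occurs in msg, ignoring case."""
    lowered = msg.lower()  # lower-case the message once, not once per keyword
    # `in` is a plain substring test, so no pattern characters are interpreted;
    # any() stops at the first keyword that matches
    return any(keyword.lower() in lowered for keyword in keywords)
```

Lower-casing the message once before the loop avoids repeating that work for every keyword.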

1. Define keyWords outside of the function. Otherwise you're recreating the table on every call just to throw it away moments later, wasting time on both creation and GC.
2. Convert keyWords to patterns that match both upper and lower case letters.
3. You don't need captured data from the string, so use string.find for speed.
4. According to your logic, if you've got more than one match you signal 'false'. Since you need only 1 match, you don't need to count them. Just return false as soon as you hit it. That also saves you the time of checking all remaining words. If you later decide you want more than one match, you are still better off checking inside the loop and returning as soon as you've reached the desired count.
5. Don't use ipairs. It's slower than a simple for loop from 1 to the array length. (ipairs was marked as deprecated in pre-release versions of Lua 5.2, although the released 5.2 still provides it.)
local keyWords = {"word","test","blah","here","code","woot"}
local caselessKeyWordsPatterns = {}
local function letter_to_pattern(c)
    return string.format("[%s%s]", string.lower(c), string.upper(c))
end
for idx = 1, #keyWords do
    caselessKeyWordsPatterns[idx] = string.gsub(keyWords[idx], "%a", letter_to_pattern)
end
local function wordFilter(self, event, msg)
    for idx = 1, #caselessKeyWordsPatterns do
        if (string.find(msg, caselessKeyWordsPatterns[idx])) then
            return false
        end
    end
    return true
end
local _
print(wordFilter(_, _, 'omg wtf lol'))
print(wordFilter(_, _, 'word man'))
print(wordFilter(_, _, 'this is a tEsT'))
print(wordFilter(_, _, 'BlAh bLAH Blah'))
print(wordFilter(_, _, 'let me go'))
Result is:
true
false
false
false
true
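The pattern-building step can be seen in isolation with a small Python sketch (Python instead of Lua, purely as an illustration). Each letter becomes a two-character class such as [wW], and the patterns are built once, outside the filter. `caseless_pattern` and `word_filter` are hypothetical names. In Python one would normally just pass re.IGNORECASE, so this only mirrors the approach needed for Lua patterns, which have no such flag.

```python
import re

def caseless_pattern(word):
    """Turn 'woot' into '[wW][oO][oO][tT]'; non-letters are escaped as literals."""
    parts = []
    for ch in word:
        if ch.isalpha():
            parts.append("[%s%s]" % (ch.lower(), ch.upper()))
        else:
            parts.append(re.escape(ch))
    return "".join(parts)

# Built once at load time, like caselessKeyWordsPatterns in the Lua code.
PATTERNS = [re.compile(caseless_pattern(w))
            for w in ["word", "test", "blah", "here", "code", "woot"]]

def word_filter(msg):
    """Mirror the Lua wordFilter: False when a keyword is found, True otherwise."""
    return not any(p.search(msg) for p in PATTERNS)
```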

You can also arrange this with metatables, in an entirely transparent way:
mt={__newindex=function(t,k,v)
        if type(k)~='string' then
            error'this table only takes string keys'
        else
            rawset(t,k:lower(),v)
        end
    end,
    __index=function(t,k)
        if type(k)~='string' then
            error'this table only takes string keys'
        else
            return rawget(t,k:lower())
        end
    end}
keywords=setmetatable({},mt)
for idx,word in pairs{"word","test","blah","here","code","woot"} do
    keywords[word]=idx;
end
for idx,word in ipairs{"Foo","HERE",'WooT'} do
    local res=keywords[word]
    if res then
        print(("%s at index %d in given array matches index %d in keywords"):format(word,idx,keywords[word] or 0))
    else
        print(word.." not found in keywords")
    end
end
This way the table can be indexed in whatever case. If you add new words to it, it will automatically lower-case them too. You can even adjust it to allow matching with patterns or whatever you'd like.
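A rough Python analogue of this "normalize the key on every access" idea is a dict subclass that lower-cases string keys on the way in and on the way out. This is only a sketch: `CaselessDict` is a hypothetical name, and it covers just the operations shown here, not the full dict interface.

```python
class CaselessDict(dict):
    """Dictionary whose string keys are stored and looked up in lower case."""

    @staticmethod
    def _normalize(key):
        if not isinstance(key, str):
            raise TypeError("this table only takes string keys")
        return key.lower()

    def __setitem__(self, key, value):
        super().__setitem__(self._normalize(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self._normalize(key))

    def __contains__(self, key):
        return super().__contains__(self._normalize(key))

    def get(self, key, default=None):
        return super().get(self._normalize(key), default)

keywords = CaselessDict()
for idx, word in enumerate(["word", "test", "blah", "here", "code", "woot"], start=1):
    keywords[word] = idx
```

With this in place, keywords.get("HERE") and keywords["WooT"] find the entries stored under "here" and "woot", while "Foo" is reported as missing.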

Related

Recursive SQL function returning array has extra elements when self-invocation uses array function

Goal: write a function in PostgreSQL SQL that takes as input an integer array whose each element is either 0, 1, or -1 and returns an array of the same length, where each element of the output array is the sum of all adjacent nonzero values in the input array having the same or lower index.
Example, this input:
{0,1,1,1,1,0,-1,-1,0}
should produce this result:
{0,1,2,3,4,0,-1,-2,0}
Here is my attempt at such a function:
CREATE FUNCTION runs(input int[], output int[] DEFAULT '{}')
RETURNS int[] AS $$
SELECT
CASE WHEN cardinality(input) = 0 THEN output
ELSE runs(input[2:],
array_append(output, CASE
WHEN input[1] = 0 THEN 0
ELSE output[cardinality(output)] + input[1]
END)
)
END
$$ LANGUAGE SQL;
Which gives unexpected (to me) output:
# select runs('{0,1,1,1,1,0,-1,-1,-1,0}');
runs
----------------------------------------
{0,1,2,3,4,5,6,0,0,0,-1,-2,-3,-4,-5,0}
(1 row)
I'm using PostgreSQL 14.4. While I am ignorant of why there are more elements in the output array than the input, the cardinality() in the recursive call seems to be causing it, as also does using array_length() or array_upper() in the same place.
Question: how can I write a function that gives me the output I want (and why is the function I wrote failing to do that)?
Bonus extra: For context, this input array is coming from array_agg() invoked on a table column and the output will go back into a table using unnest(). I'm converting to/from an array since I see no way to do it directly on the table, in particular because WITH RECURSIVE forbids references to the recursive table in either an outer join or subquery. But if there's a way around using arrays (especially with a lack of tail-recursion optimization) that will answer the general question (But I am still very very curious why I'm seeing the extra elements in the output array).
Everything indicates that you have found a reportable Postgres bug. The function should work properly, and a slight modification unexpectedly changes its behavior. Add SELECT; right after $$ to get the function to run as expected, see Db<>fiddle.
A good alternative to a recursive solution is a simple iterative function. Handling arrays in PL/pgSQL is typically simpler and faster than recursion.
create or replace function loop_function(input int[])
returns int[] language plpgsql as $$
declare
    val int;
    tot int = 0;
    res int[];
begin
    foreach val in array input loop
        if val = 0 then tot = 0;
        else tot := tot + val;
        end if;
        res := res || tot;
    end loop;
    return res;
end $$;
Test it in Db<>fiddle.
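The logic of the loop is easy to check outside the database. The following Python sketch (the function name `run_sums` is made up for illustration) follows the same steps: reset the running total on a zero, otherwise add the value, and append the total after each element.

```python
def run_sums(values):
    """Running sum that resets to 0 whenever a 0 is encountered."""
    total = 0
    result = []
    for val in values:
        total = 0 if val == 0 else total + val
        result.append(total)
    return result
```

Applied to the example from the question, run_sums([0, 1, 1, 1, 1, 0, -1, -1, 0]) gives [0, 1, 2, 3, 4, 0, -1, -2, 0].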
The OP wrote:
this input array is coming from array_agg() invoked on a table column and the output will go back into a table using unnest().
You can calculate these cumulative sums directly in the table with the help of window functions.
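select id, val, sum(val) over w
from (
    select
        id,
        val,
        case val
            when 0 then 0
            -- nonzero rows get 1 + the number of zeros seen so far,
            -- so a run never shares series 0 with the zero rows
            else 1 + sum((val = 0)::int) over w
        end as series
    from my_table
    window w as (order by id)
) t
window w as (partition by series order by id)
order by id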
Test it in Db<>fiddle.
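The grouping idea behind the window functions can be sketched in Python: give every run of nonzero values its own series number (based on how many zeros precede it), keep the zero rows in a series of their own, and accumulate a running sum per series. This is an illustration under those assumptions rather than a translation of what the database executes. `windowed_run_sums` is a hypothetical name.

```python
from itertools import accumulate

def windowed_run_sums(rows):
    """rows: (id, val) pairs; returns (id, val, running_sum) tuples ordered by id."""
    rows = sorted(rows)
    # number of zeros seen up to and including each row (the inner window)
    zeros_so_far = list(accumulate(int(val == 0) for _, val in rows))
    totals = {}  # running sum per series (the outer, partitioned window)
    out = []
    for (row_id, val), zeros in zip(rows, zeros_so_far):
        # zero rows share series 0; each nonzero run gets a series >= 1
        series = 0 if val == 0 else 1 + zeros
        totals[series] = totals.get(series, 0) + val
        out.append((row_id, val, totals[series]))
    return out
```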

Dealing with errors while parsing strings

I'm tasked with pulling relevant data out of a field which is essentially free text. I have been able to get the information I need 98% of the time by looking for keywords and using CASE statements to break the field down into 5 different fields.
My issue is I can't get around the last 2% because the errors don't follow any logical order - they are mostly misspellings.
I could bypass the field with a TRY CATCH, but I don't like giving up 4 good pieces of data when the routine is choking on one.
Is there any way to handle blanket errors within a CASE statement, or is there another option?
Current code, the 'b' with the commented out section is where it's choking right now:
CASE WHEN #Location = 0 THEN
CASE WHEN #Duration = 0 THEN
CASE WHEN #Timing = 0 THEN
SUBSTRING(#Comment,#Begin, #Context-#Begin)
ELSE
SUBSTRING(#Comment,#Begin, #Timing-#Begin)
END
ELSE SUBSTRING(#Comment,#Begin, #Duration-#Begin)
END
ELSE SUBSTRING(#Comment,#Begin, #Location-#Begin)
END AS Complaint
,CASE WHEN #Location = 0 THEN ''
ELSE
CASE WHEN #Duration = 0 THEN
CASE WHEN #Timing = 0 THEN SUBSTRING(#Comment,#Location+10, (#CntBegin-11))
ELSE SUBSTRING(#Comment,#Location+10, #Timing-(#Location+10))
END
ELSE SUBSTRING(#Comment,#Location+10, #Duration-(#Location+10))
END
END AS Location
,CASE WHEN #Timing = 0 THEN ''
ELSE
CASE WHEN #CntBegin = 0 THEN
SUBSTRING(#Comment,#Timing+#TimingEnd, (#Location+#Context)-(#Timing+#TimingEnd))
ELSE
'b'--SUBSTRING(#Comment,#Timing+#TimingEnd, (#Location+#CntBegin-1)-(#Timing+#TimingEnd))
END
END AS Timing
It fails on the statement below, which has a comma in an odd spot. I usually have to reference the comma for the #CntBegin, but in this case it's making my (#Location+#CntBegin-1) smaller than the (#Timing+#TimingEnd):
'Pt also presents with/for mild check MGP/MGD located in OU, since 12/2015 ? Stability.'
Please take into account, I'm not necessarily trying to fix this error, I'm looking for a way to handle any error that comes up as who knows what someone is going to type. I'd like to just display 'ERR' in that particular field when the code runs into something it can't handle. I just don't want the routine to die.
I'm assuming your error is caused by the length parameter of SUBSTRING being less than 0. I always alias my parameters using CROSS APPLY and then validate the input before calling SUBSTRING(). Something like this should work:
SELECT
CASE WHEN CA.StringLen > 0 /*Ensure valid length*/
THEN SUBSTRING(#comment,#Timing+#TimingEnd,CA.StringLen)
ELSE 'Error'
END
FROM YourTable
CROSS APPLY (SELECT StringLen = (#Location+#CntBegin-1)-(#Timing+#TimingEnd)) AS CA
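The validate-before-slicing pattern is independent of SQL. Here is a minimal Python sketch of it, with a hypothetical helper `safe_substring` that takes a 1-based start like SUBSTRING and returns a marker value instead of failing. Treating a start position below 1 as invalid is an assumption of this sketch, not something the answer above requires.

```python
def safe_substring(text, start, length, fallback="ERR"):
    """Return `length` characters of text from 1-based `start`, or fallback if the arguments are invalid."""
    if start < 1 or length <= 0:
        # a misplaced keyword or comma produced a nonsensical range;
        # report it in this field only instead of aborting the whole row
        return fallback
    return text[start - 1:start - 1 + length]
```

Because each field is guarded separately, one bad range yields 'ERR' in that field while the other fields of the row are still extracted.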

How to extract substring from string in crystal report

I am new to crystal reports.
My data(employee ids) is of the following format
Abc123, uttd333, ddt-435
I want to extract only numbers and remove leading letters and special characters.
Also there are certain values that should never be printed.
Admin Ids such as
Gree999, ttt999
I know there is a mid function but that requires me to specify the position from where the substring should begin. These values don't have a fixed number of leading letters.
Is there anything like Ltrim like we have in SQL that we can use to achieve this in crystal reports?
Afaik there's no built-in function in CR to remove non-numeric characters from a string, so you'll have to roll your own (replace {Befehl.EmployeeId} with your field):
StringVar employeeId := {Befehl.EmployeeId};
StringVar result := "";
NumberVar i;
// transfer numeric chars into result one by one
For i := 1 To Length(employeeId) Do
(
If IsNumeric(employeeId[i]) Then
result := result + employeeId[i];
);
// output result only if employeeId is not in your admin list
If employeeId In Split('Gree999,ttt999', ',') Then
'--Admin--'
Else
result;
Here's an alternative approach to grabbing just the numeric portion of the string:
strReverse(ToText(Val(strReverse({EmpId})),0,""))
It takes advantage of knowing that the digits are always at the end.
To handle trailing zeros, you can use:
local stringvar withoutTrailingZeros := strReverse(ToText(Val(strReverse({EmpId})),0,""));
withoutTrailingZeros + ReplicateString("0", Len({EmpId}) - (instr({EmpId}, withoutTrailingZeros) + Len(withoutTrailingZeros) - 1))
But mweber's answer above is simpler now.
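For comparison, the character-by-character approach from the first answer looks like this as a Python sketch (not Crystal syntax). `ADMIN_IDS` and `employee_number` are names invented for the example, and the admin list contains only the two IDs mentioned in the question.

```python
ADMIN_IDS = {"Gree999", "ttt999"}

def employee_number(employee_id):
    """Keep only the digits of an ID; admin IDs are replaced by a marker."""
    if employee_id in ADMIN_IDS:
        return "--Admin--"
    # copy the numeric characters one by one, dropping letters and punctuation
    return "".join(ch for ch in employee_id if ch.isdigit())
```

Because the digits are copied as text and never converted to a number, zeros at either end of the numeric part are preserved.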

Is there a way to specify various types for a parameter

Is there a way to restrict the conformance of a type to be a collection of types?
Let me explain by example:
give_foo (garbage: ANY): STRING
do
if attached {STRING} garbage as l_s then
Result := l_s
elseif attached {INTEGER} garbage as l_int then
Result := l_int.out
elseif attached {JSON_OBJECT} garbage as l_int then
Result := l_int.representation
elseif attached {RABBIT} garbage as l_animal then
Result := l_animal.name + l_animal.color
else
Result := ""
check
unchecked_type_that_compiler_should_be_able_to_check_for_me: False
end
end
end
Couldn't I do something like (like a convert function could do)
give_foo (garbage: {STRING, INTEGER, JSON_OBJECT, RABBIT}): STRING
do
if attached {STRING} garbage as l_s then
Result := l_s
elseif attached {INTEGER} garbage as l_int then
Result := l_int.out
elseif attached {JSON_OBJECT} garbage as l_int then
Result := l_int.representation
elseif attached {RABBIT} garbage as l_animal then
Result := l_animal.name + l_animal.color
else
Result := ""
check
unchecked_type_that_compiler_should_be_able_to_check_for_me: False
end
end
end
or something like
not_garbage_hash_table: HASH_TABLE[{INTEGER, STRING, ANIMAL}, STRING]
Conformance to a collection of types is not supported for several reasons:
Calling a feature on an expression of such a type becomes ambiguous because the same name could refer to completely unrelated features.
In one case we need a sum (disjoint union) of types, in the second - plain union, in the third - an intersection, etc. And then, there could be combinations. One would need an algebra built on top of a type system that becomes too complicated.
If the requirement is to check that an argument is one of expected types, the following precondition can be used:
across {ARRAY [TYPE [detachable ANY]]}
<<{detachable STRING}, {INTEGER}, {detachable JSON_OBJECT}>> as t
some argument.generating_type.conforms_to (t.item) end
A common practice to process an expression of a potentially unknown type is a visitor pattern that deals with known cases and falls back to a default for unknown ones.
Possibly place Alexander's solution in a BOOLEAN query so it can be reused?
is_string_integer_or_json_object (v: detachable ANY): BOOLEAN
        -- Does `v' conform to {STRING}, {INTEGER}, or {JSON_OBJECT}?
    do
        Result := across {ARRAY [TYPE [detachable ANY]]}
            <<{detachable STRING}, {INTEGER}, {detachable JSON_OBJECT}>> as t
            some v.generating_type.conforms_to (t.item) end
    end
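The "check against a list of allowed types up front" idea carries over to other languages. In Python, for example, it could be sketched as a run-time guard like the one below. This is not Eiffel and it gives no compile-time checking. Here dict merely stands in for JSON_OBJECT, and `ALLOWED_TYPES` and `give_foo` are illustrative names.

```python
import json

ALLOWED_TYPES = (str, int, dict)  # dict stands in for JSON_OBJECT in this sketch

def give_foo(value):
    """Return a string form of value; reject anything outside ALLOWED_TYPES."""
    # plays the role of the precondition: fail early on unexpected types
    if not isinstance(value, ALLOWED_TYPES):
        raise TypeError("unsupported type: %s" % type(value).__name__)
    if isinstance(value, str):
        return value
    if isinstance(value, int):
        return str(value)
    return json.dumps(value)
```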

Lcase operation on array taking too long

I'm working with a script designed to compare values returned from a form against values from a database dumped to an array, via GetRows. The purpose of the check is to compare form values against database values and only update the matching ids' rows in the database.
I've seen this done with hidden variables in forms, but as we have quite a few users online at any given time, the values on the db end could change while a user was completing the form.
Currently, the code uses an inner and outer loop to run this comparison, with a temporary variable being assigned the current col/row from the aforementioned array. An lcase and trim operation are performed on the value to obtain the temporary variable.
This is causing a considerable performance drain, and I was wondering if the lcase/trim functionality could perhaps be performed during the creation of that array, rather than in a looping situation?
Here's my code:
Note: this uses the FastString class for concatenation, hence the "FastString" and ".Append".
dim iRowLoop, iColLoop, zRowLoop, strChange, tempDbValsCase
Set strChange = New FastString
for iRowLoop = 0 to ubound(arrDbVals, 2)
for zRowLoop = 0 to ubound(arrFormComplete)
''#****below line is what is causing the bottleneck, according
''#****to a timer test
tempDbValsCase = lcase(trim(arrDbVals(1, iRowLoop)))
''#****
if (mid(trim(arrFormComplete(zRowLoop)),1,8) = trim(arrDbVals(0, iRowLoop))) AND (mid(trim(arrFormComplete(zRowLoop)),9) <> tempDbValsCase) then
dim strFormAllVals
strFormAllVals = arrFormComplete(zRowLoop)
strChange.Append strFormAllVals & ","
end if
next
next
On the database side (MS SQL Server 2008), the table from which the array is derived through GetRows contains the bit datatype column "Complete". The lcase and trim operations are performed upon this column of the array. Does the bit datatype add any hidden characters in the output? Visually, I don't detect any, but when I compare a value of "True" from the form input against a value from the array that looks like "True," it doesn't match, until I run the lcase and trim on the "Complete" column.
Try
dim iRowLoop, iColLoop, zRowLoop, strChange, tempDbValsCase
dim iCount1, iCount2, match
Set strChange = New FastString
iCount1 = ubound(arrDbVals, 2)
iCount2 = ubound(arrFormComplete)
for iRowLoop = 0 to iCount1
for zRowLoop = 0 to iCount2
' Assign array lookup to a variable '
tempDbValsCase = arrDbVals(1, iRowLoop)
' ...and then perform operations on it one at a time '
tempDbValsCase = trim(tempDbValsCase)
tempDbValsCase = lcase(tempDbValsCase)
' Assign this array lookup to a variable and perform trim on it '
match = trim(arrFormComplete(zRowLoop))
if (mid(match,1,8) = trim(arrDbVals(0, iRowLoop))) AND (mid(match,9) <> tempDbValsCase) then
strChange.Append match & ","
end if
next
next
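The trimmed, lower-cased database value depends only on the outer loop's row, so it does not have to be recomputed for every form value. A Python sketch of that restructuring follows, with both inputs normalized once before the nested loops. The function name and the data layout (pairs of id and "Complete" value, and form strings made of an 8-character id followed by the value) are assumptions based on the code in the question. Unlike the original, it joins the results without a trailing comma.

```python
def changed_form_values(db_rows, form_values):
    """Return the form values whose id matches a database row but whose value differs."""
    # normalize each database row once, instead of once per inner iteration
    normalized_db = [(str(row_id).strip(), str(complete).strip().lower())
                     for row_id, complete in db_rows]
    # trim each form value once as well
    trimmed_form = [value.strip() for value in form_values]

    changed = []
    for db_id, db_complete in normalized_db:
        for value in trimmed_form:
            # first 8 characters are the id, the rest is the submitted value
            if value[:8] == db_id and value[8:] != db_complete:
                changed.append(value)
    return ",".join(changed)
```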

Resources