Semantic Syntax Markup Language (SSML) on blueprint? - alexa

I was trying to use SSML syntax while filling out the Alexa blueprint skill form but then I got an error. Is there a way the form can support SSML?
Entry:
<amazon:effect name="whispered">I am not a real human.</amazon:effect>.
Error:
Special characters are not supported. However, Alexa can speak special characters ( # # $ % _ + = | ; ), if enclosed in single quotes ( '#' ).

Sorry, but (currently) you can only use PlainText in blueprint based skills, because (as you mentioned the error message) the special characters needed for an XML based syntax are not supported by the form.
Just an additional hint to your SSML text, if you want to use it in a regular skill:
As a XML based syntax it needs embracing speak tag and no outside/final dot.
<speak><amazon:effect name="whispered">I am not a real human.</amazon:effect></speak>
https://developer.amazon.com/en-US/docs/alexa/custom-skills/speech-synthesis-markup-language-ssml-reference.html

Related

Regex Negative lookbehind in React app breaking app on iOS devices

After a recent build my app has stopped displaying on iOS devices (just shows a blank screen).
After a log of digging, I've been able to narrow down the cause and it's this regex expression:
(?<!#)
Here's a context how I used it:
/\b(?<!#)gmail\b|\b(?<!#)google\b/i
which means I want to capture the words "gmail" and "google", but only if they are not preceded by an "#" symbol.
My question is, what is the correct regex expression that will do the same job, and work on all browsers/devices?
Thank you
You could capture the words "gmail" and "google", but only if they are not preceded by an "#" symbol by matching them using a non capture group #(?:gmail|google)
Use an alternation | and a capture group (gmail|google) for google or gmail.
#(?:gmail|google)\b|\b(gmail|google)\b
See a regex demo
For example, if you are doing a replacement you could check for the existence of group 1.
const regex = /#(?:gmail|google)\b|\b(gmail|google)\b/g;
const str = `gmail
google
#gmail
#google
test#google.com
#agmail`;
let res = Array.from(str.matchAll(regex), m => m[1] ? `[REPLACED]${m[1]}[REPLACED]` : m[0]);
console.log(res)
You could just directly match a non # character:
/[^#](google|gmail)\b/i
If you need to also match the domain (which can only be google or gmail), then you may access the first capture group.
It seems that a combination of your two answers worked for me:
/[^#](?:gmail|google)\b|\b(gmail|google)\b/i
Thanks!

QRadar, parsing Log

I want to parse some application log, I did a lot of regex that works correctly with notepad++ and the website www.regex101.com .
But when I apply them in QRadar they don't match nothing.
For example
12/2/2017 9:53:58,4040007,blablablbla,blablabla --- Abonnement Mobile N° : 0663016666 | balbalbal | 03/06/2006 11:11:22 --- Soldes,10.10.10.10
I did this regex (?<=---)\s+[A-Za-z+ \/\w+0-9._%+-]+(?=(\sN°|\s\sN°|\sID)) to match Abonnement mobile it works correctly , but it doesn't match anything in QRadar.
QRadar does not accept all regex configurations. When you try parsing something you can use extract property field to check. Here is a regex that works fine in my system.
\-\-\-\s(\w+\s\w+)\s
this regex will work if only "Abonnement Mobile" field is includes letters or digits. If you want to catch "Abonnement Mobile N°" you can use this regex and this will work whatever comes in this field.
\-\-\-\s([^\:]+)\:

How to query atom field with unicode value in Google App Engine production search?

I wrote some text search with use Google App Engine search.
In SDK I tested such query on atom field:
u'tag:"wartości"'
In production I run the same query but it not works on same data.
How can I do unicode query on atom field?
Is it possible to use unicode in Google App Engine search?
We are aware of this issue and plan to fix ASAP. The fix that we're currently planning will require that the atom field value include exactly the same accent characters in order to match. Matches will continue to be case-insensitive. We expect that at least initially, values that use combining diacritical marks will be treated as different values than those using precomposed characters. We may revisit that decision depending on feedback, but it's the most straightforward fix on our end.
For more on the precomposed characters vs. combining diacritical marks, see this Wikipedia article:
http://en.wikipedia.org/wiki/Precomposed_character
Chris
It looks that I need translate AtomField values into new string and I need to translate queries too. This workaround will allow only Polish unicode search. I do not know tonkenization rules so I use 'q', 'x' to expand alphabet since not used in Polish.
# coding=utf-8
translate = {
u'ą': u'aq',
u'Ą': u'Aq',
u'ć': u'cq',
u'Ć': u'Cq',
u'ę': u'eq',
u'Ę': u'Eq',
u'ł': u'lq',
u'Ł': u'Lq',
u'ń': u'nq',
u'Ń': u'Nq',
u'ó': u'oq',
u'Ó': u'Oq',
u'ś': u'sq',
u'Ś': u'Sq',
u'ż': u'zx',
u'Ż': u'Zx',
u'ź': u'zq',
u'Ź': u'Zq',
}
import re
reTranslate = re.compile(u'(%s)' % u'|'.join(translate))
print reTranslate.pattern
test = u"""\
Właściwie prowadzona komunikacja wewnętrzna w firmie,\
zwłaszcza dużej czy posiadającej rozproszoną sieć oddziałów,\
może przynieść oszczędność czasu, a co za tym idzie, również pieniędzy."""
print reTranslate.sub(lambda match: translate[match.group(0)], test)

How does Salesforce.com validate Email Fields?

I'm trying to store email addresses in Salesforce.com from another service that allows invalid email addresses to be specified. If one of those bad invalid email addresses is sent to Salesforce.com via their Web Services API, Salesforce.com will prevent the record from saving with an INVALID_EMAIL_ADDRESS error code.
I can't find any documentation on how to disable validation on Email fields, so it looks like I'll need to validate them in my integration and pull out those that fail. Does anyone know the validation process Salesforce.com uses to determine if an email address is valid? All I have right now is a Regex, but I'd like it to match Salesforce.com's process.
EDIT: For reference, here is my Regex (I'm using C#/.NET):
^(\w|[!#$%'*+-/=?^_`\{\}~.&])+#\w+([-.]\w+)*\.\w+([-.]\w+)*([,;]\s*\w+([-+.]\w+)*#\w+([-.]\w+)*\.\w+([-.]\w+)*)*$
Summary: we're using the following .NET RegEx:
const string SFEmailRegExPattern = #"^[A-Z0-9._%-]+#[A-Z0-9.-]+\.[A-Z]{2,4}$";
If you can believe SF's own documentation then:
For the local part of the email address we accept the following characters. The local part is anything before the # sign.
abcdefg.hijklmnopqrstuvwxyz!#$%&'*/=?^_+-`{|}~0123456789
Note: The character dot . is supported; provided that it is not the first or last character in the local-part
For the domain part of the email address we accept. The domain part is anything after the # in an email address:
0-9 and A-Z and a-z and dash -
A couple of people have coded this up as a Java regex as:
String pat = '[a-zA-Z0-9\\.\\!\\#\\$\\%\\&\\*\\/\\=\\?\\^\\_\\+\\-\\`\\{\\|\\}\\~\'._%+-]+#[a-zA-Z0-9\\-.-]+\\.[a-zA-Z]+';
although to me this looks like it fails to reject an email that starts with a "." so isn't perfect.
I don't know how salesforce.com is validating email addresses, but since you are using .NET I'd suggest you to consider an email validation component like our EmailVerify.NET, which is 100% compliant with the current IETF standards (RFC 1123, RFC 2821, RFC 2822, RFC 3490, RFC 3696, RFC 4291, RFC 5321, RFC 5322 and RFC 5336) and does not suffer from ReDoS: if needed, it even checks the DNS records of the email domain under test, its SMTP availability, validates the related mailbox and can even tell if the target mail exchanger is a catch-all or if it is a disposable/free email address provider.
I don't know what salesforce.com uses (and I don't think there's any way for you to find out), but \b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b from here is a commmon one and should work for most of the cases.
I've looked previously and not been able to find a definitive answer on exactly which rules SFDC applies to the native "Email" field type. The quickest path to success that I would suggest would be this:
in your initial data integration from the external application, map the email field that you describe into a new (non-email, just text 255) custom field in SFDC.
if this is a one-time dataload, run a separate process that, for every row in SFDC with this custom field populated, attempts to copy the contents of this custom field to the native email field. If any row fails with the email validation error, you just skip it. Then you can decide what to do with the non-compliant addresses.
if this is an ongoing integration, it may be better to do something like attempt to insert new rows one-at-a-time via WS API, and if the email validation exception is thrown, you catch it and either insert the record without an email address, store the bad email in a different field (like a custom field called "non-compliant email address"), or skip the row altogether (if bad emails == bad record).
Hope that helps.
Apex has native Pattern and Matcher classes, based on java.
You can validate your email addresses in Apex code, using your RegEx expression as a string
String emailPattern = {your regex expression);
Boolean validEmail = pattern.match(emailPattern, emailAddress);
You can't definitely create common regex for salesforce email, due to inconsistency of their own requirements.
The one rule is about to give possibilities to put IP address after the local part. Example -> email#123.123.123.123.
The second is about do not allow digits in top-level domain.
For example: test#test.com1
So, they are mutually excluded.
But as I understood the email address with IP after the local part is more important and commonly used comparing with numbers in top-level domain.
Here is some examples of valid/invalid emails for salesforce.
Valid:
a#ua.fm
email#domain.com
firstname.lastname#domain.com
email#subdomain.domain.com
firstname+lastname#domain.com
email#123.123.123.123
1234567890#domain.com
email#domain-one.com
_______#domain.com
email#domain.name
email#buyacar.co.uk
ail#github.dennis.co.uk
email#news.i.ua
firstname-lastname#domain.com
Alexka1!+1123klsn&*^%$%$#^^^#a3432.4s.c4p.uk
frw...??//||/wt'f`fe#wfwfg-----wfwef.mm
a..#test.jp
abcdefg.hijklmnopqrstuvwxyz!#$%&'*/=?^_+-`{|}~0123456789#acme-inc.com
Invalid:
aasd#sdfжжж.rf
plainaddress
##%^%#$##$##.com
#domain.com
email.domain.com
email#domain#domain.com
.email#domain.com
あいうえお#domain.com
email#domain.com (Joe Smith)
email#domain
email#domain..com
email#domain.com.e
email#domain.com.33
As result of above, the final regex is:
/^(?!\.)(([^<>()\[\]\\a-zA-Z0-9.,;:\s#"]*(\.[^<>()\[\]\\.,;:\s#"]+)*)|(".+"))[a-zA-Z0-9.!#$%&'‘*+\/=?^_{|}~-]+#[\w.-?]+.[A-Za-z]*(?
Here is a regular expression based on this help page + a lot of experimenting in Salesforce:
^(?=(?:\([^)]*\))*[^()]+[^#]*#)(?!(?:\([^)]*\))*\.)(?:(?:[\w!#$%&'*+-/=?^`{|}~]|\([\w!#$%&'*+-/=?^`{|}~]*\))+|"(?:[\w!#$%&'*+-/=?^`{|}~]|\([\w!#$%&'*+-/=?^`{|}~]*\))*")#(?:\([A-Za-z0-9-]*\))*(?:(?:[A-Za-z0-9]+|[A-Za-z0-9]+(?:\([A-Za-z0-9-]*\))*-(?:\([A-Za-z0-9-]*\))*[A-Za-z0-9]+)(?:\([A-Za-z0-9-]*\))*)(?:\.(?:\([A-Za-z0-9-]*\))*(?:(?:[A-Za-z0-9]+|[A-Za-z0-9]+(?:\([A-Za-z0-9-]*\))*-(?:\([A-Za-z0-9-]*\))*[A-Za-z0-9]+)(?:\([A-Za-z0-9-]*\))*))+$
See this Demo. It gives the same validation result as Salesforce for all the values I could think of testing - copied below - any counter examples are welcome...
************* VALID *************
a#a.a
-#a.a
a#1.a
a#a-a.a
a#a.a-a
!#$%&'*+-/=?^_`{|}~#test.jp
a..a#test.jp
a..#test.jp
"a"#test.jp
""#test.jp
(comment)(comment)a(comment)(comment)(comment)#(comment)a.a
(comment)(comment)a.(comment)(comment)(comment)#(comment)a.a
(comment)(comment)a(comment).(comment)(comment)#(comment)a.a
a#(comment)a(comment)-(comment)a(comment).a
john.doe#(-comment)example.com
john.doe#example.com(comment-)
()a#test.jp
(a)a#test.jp
a(a)#test.jp
a#(a)test.jp
a#test.jp(a)
simple#example.com
very.common#example.com
disposable.style.email.with+symbol#example.com
other.email-with-hyphen#example.com
fully-qualified-domain#example.com
user.name+tag+sorting#example.com
x#example.com
example-indeed#strange-example.com
test/test#test.com
example#s.example
"john..doe"#example.org
mailhost!username#example.org
user%example.com#example.org
user-#example.org
1#1234567890123456789012345678901234567890123456789012345678901234.1.2.3.4.5.6.7
************* INVALID *************
a#a
a#a.
a#-a.a
a#a-.a
a#a.a-
a#a.-a
a;a#test.jp
.a#test.jp
";"#test.jp
"#"#test.jp
"a#test.jp
a"#test.jp
a""#test.jp
""a#test.jp
()#test.jp
)(a#test.jp
(a)#test.jp
(a#test.jp
(())a#test.jp
(comment)(comment).(comment)a(comment)(comment)#(comment)a.a
john.doe#(comment).com
a#(comment)a(comment)-(comment)(comment).a
Αθήνα#email.com
admin#mailserver1
" "#example.org
"very.(),:;<>[]\".VERY.\"very#\\ \"very\".unusual"#strange.example.com
postmaster#[123.123.123.123]
postmaster#[IPv6:2001:0db8:85a3:0000:0000:8a2e:0370:7334]

local.cf spamassassin

Actually I am trying to write one rule in local.cf spamassassin.
What I need is to block all Viagra emails.
As you know in these emails they write Viagra,VIAGRA,VIAGRA(c) sometimes in the Subject field, sometimes in the Name field, sometime it is the body of the message.
Can you please tell me what will be rule exactly to stop all these emails?
Well, you can try these kind of simple rules:
header VIAGRA_SUBJECT Subject =~ /viagra/i
header VIAGRA_FROM From =~ /viagra/i
meta VIAGRA_HEADER VIAGRA_FROM && VIAGRA_SUBJECT
score VIAGRA_HEADER 10.0
describe VIAGRA_HEADER Block Mails with Viagra in subject
body VIAGRA_BODY /viagra/i
score VIAGRA_BODY 10.0
describe VIAGRA_BODY Block Mails with Viagra in body
I have two more to add:
body LOCAL_OBFU_VIAGRA /(?:\b[vu]|\B(?:\\\/|\xCE\xBD))[\W_]{0,3}(?:[il1:\|\*\xCC-\xCF\xEC-\xEF\xA6]|\xC4[\xA8-\xB0]|\xC4\xBA|\xC4\xBC|\xC4\xBE|\xC5\x80|\xC5\x82|\xC7[\x8F-\x90]|\xD0[\x86-\x87]|\xD1[\x96-\x97]|\xCE\x8A|\xCE\x90|\xCE\x99|\xCE\xAA|\xCE\xAF|\xCE\xB9|\xCF\x8A)[\W_]{0,3}(?:[a4\*\#\xC0-\xC5\xAA\xE0-\xE5]|\/\\|\xC4[\x80-\x85]|\xC7[\x8D-\x8E]|\xC7[\xBA-\xBB]|\xCE\x86|\xCE\x91|\xCE\x94|\xCE\x9B|\xCE\xAC|\xCE\xB1|\xD0\x90|\xD0\xB0)[\W_]{0,3}(?:[g6]|\xC4[\x9C-\xA3]])[\W_]{0,3}(?:[r\xAE]|\xC5[\x94-\x99]|\xD1\x93)[\W_]{0,3}(?:[a4]\b|(?:[\*\#\xC0-\xC5\xAA\xE0-\xE5]|\/\\|\xC4[\x80-\x85]|\xC7[\x8D-\x8E]|\xC7[\xBA-\xBB]|\xCE\x86|\xCE\x91|\xCE\x94|\xCE\x9B|\xCE\xAC|\xCE\xB1|\xD0\x90|\xD0\xB0)\B)/i
score LOCAL_OBFU_VIAGRA 1.8
describe LOCAL_OBFU_VIAGRA Obfuscated 'VIAGRA' in body
The rule above will block obfuscated "Viagra" in the body of the message. The following rule does the same sort of thing but with special characters spelled out:
describe MANGLED_VIAGRA mangled viagra
body MANGLED_VIAGRA /(?!viagra)v{1,3}(?:[_\W]{0,5}|[viagra])[iÌÍÎÏìíîï\|1l\!](?:[_\W]{0,5}|[viagra])[aÀÁÂÃÄÅàáâãäå4\#](?:[_\W]{0,5}|[viagra])g(?:[_\W]{0,5}|[viagra])r(?:[_\W]{0,5}|[viagra])[aÀÁÂÃÄÅàáâãäå4\#]/i
score MANGLED_VIAGRA 2.5

Resources