How does Salesforce.com validate Email Fields? - salesforce

I'm trying to store email addresses in Salesforce.com from another service that allows invalid email addresses to be specified. If one of those bad invalid email addresses is sent to Salesforce.com via their Web Services API, Salesforce.com will prevent the record from saving with an INVALID_EMAIL_ADDRESS error code.
I can't find any documentation on how to disable validation on Email fields, so it looks like I'll need to validate them in my integration and pull out those that fail. Does anyone know the validation process Salesforce.com uses to determine if an email address is valid? All I have right now is a Regex, but I'd like it to match Salesforce.com's process.
EDIT: For reference, here is my Regex (I'm using C#/.NET):
^(\w|[!#$%'*+-/=?^_`\{\}~.&])+#\w+([-.]\w+)*\.\w+([-.]\w+)*([,;]\s*\w+([-+.]\w+)*#\w+([-.]\w+)*\.\w+([-.]\w+)*)*$

Summary: we're using the following .NET RegEx:
const string SFEmailRegExPattern = #"^[A-Z0-9._%-]+#[A-Z0-9.-]+\.[A-Z]{2,4}$";
If you can believe SF's own documentation then:
For the local part of the email address we accept the following characters. The local part is anything before the # sign.
abcdefg.hijklmnopqrstuvwxyz!#$%&'*/=?^_+-`{|}~0123456789
Note: The character dot . is supported; provided that it is not the first or last character in the local-part
For the domain part of the email address we accept. The domain part is anything after the # in an email address:
0-9 and A-Z and a-z and dash -
A couple of people have coded this up as a Java regex as:
String pat = '[a-zA-Z0-9\\.\\!\\#\\$\\%\\&\\*\\/\\=\\?\\^\\_\\+\\-\\`\\{\\|\\}\\~\'._%+-]+#[a-zA-Z0-9\\-.-]+\\.[a-zA-Z]+';
although to me this looks like it fails to reject an email that starts with a "." so isn't perfect.

I don't know how salesforce.com is validating email addresses, but since you are using .NET I'd suggest you to consider an email validation component like our EmailVerify.NET, which is 100% compliant with the current IETF standards (RFC 1123, RFC 2821, RFC 2822, RFC 3490, RFC 3696, RFC 4291, RFC 5321, RFC 5322 and RFC 5336) and does not suffer from ReDoS: if needed, it even checks the DNS records of the email domain under test, its SMTP availability, validates the related mailbox and can even tell if the target mail exchanger is a catch-all or if it is a disposable/free email address provider.

I don't know what salesforce.com uses (and I don't think there's any way for you to find out), but \b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b from here is a commmon one and should work for most of the cases.

I've looked previously and not been able to find a definitive answer on exactly which rules SFDC applies to the native "Email" field type. The quickest path to success that I would suggest would be this:
in your initial data integration from the external application, map the email field that you describe into a new (non-email, just text 255) custom field in SFDC.
if this is a one-time dataload, run a separate process that, for every row in SFDC with this custom field populated, attempts to copy the contents of this custom field to the native email field. If any row fails with the email validation error, you just skip it. Then you can decide what to do with the non-compliant addresses.
if this is an ongoing integration, it may be better to do something like attempt to insert new rows one-at-a-time via WS API, and if the email validation exception is thrown, you catch it and either insert the record without an email address, store the bad email in a different field (like a custom field called "non-compliant email address"), or skip the row altogether (if bad emails == bad record).
Hope that helps.

Apex has native Pattern and Matcher classes, based on java.
You can validate your email addresses in Apex code, using your RegEx expression as a string
String emailPattern = {your regex expression);
Boolean validEmail = pattern.match(emailPattern, emailAddress);

You can't definitely create common regex for salesforce email, due to inconsistency of their own requirements.
The one rule is about to give possibilities to put IP address after the local part. Example -> email#123.123.123.123.
The second is about do not allow digits in top-level domain.
For example: test#test.com1
So, they are mutually excluded.
But as I understood the email address with IP after the local part is more important and commonly used comparing with numbers in top-level domain.
Here is some examples of valid/invalid emails for salesforce.
Valid:
a#ua.fm
email#domain.com
firstname.lastname#domain.com
email#subdomain.domain.com
firstname+lastname#domain.com
email#123.123.123.123
1234567890#domain.com
email#domain-one.com
_______#domain.com
email#domain.name
email#buyacar.co.uk
ail#github.dennis.co.uk
email#news.i.ua
firstname-lastname#domain.com
Alexka1!+1123klsn&*^%$%$#^^^#a3432.4s.c4p.uk
frw...??//||/wt'f`fe#wfwfg-----wfwef.mm
a..#test.jp
abcdefg.hijklmnopqrstuvwxyz!#$%&'*/=?^_+-`{|}~0123456789#acme-inc.com
Invalid:
aasd#sdfжжж.rf
plainaddress
##%^%#$##$##.com
#domain.com
email.domain.com
email#domain#domain.com
.email#domain.com
あいうえお#domain.com
email#domain.com (Joe Smith)
email#domain
email#domain..com
email#domain.com.e
email#domain.com.33
As result of above, the final regex is:
/^(?!\.)(([^<>()\[\]\\a-zA-Z0-9.,;:\s#"]*(\.[^<>()\[\]\\.,;:\s#"]+)*)|(".+"))[a-zA-Z0-9.!#$%&'‘*+\/=?^_{|}~-]+#[\w.-?]+.[A-Za-z]*(?

Here is a regular expression based on this help page + a lot of experimenting in Salesforce:
^(?=(?:\([^)]*\))*[^()]+[^#]*#)(?!(?:\([^)]*\))*\.)(?:(?:[\w!#$%&'*+-/=?^`{|}~]|\([\w!#$%&'*+-/=?^`{|}~]*\))+|"(?:[\w!#$%&'*+-/=?^`{|}~]|\([\w!#$%&'*+-/=?^`{|}~]*\))*")#(?:\([A-Za-z0-9-]*\))*(?:(?:[A-Za-z0-9]+|[A-Za-z0-9]+(?:\([A-Za-z0-9-]*\))*-(?:\([A-Za-z0-9-]*\))*[A-Za-z0-9]+)(?:\([A-Za-z0-9-]*\))*)(?:\.(?:\([A-Za-z0-9-]*\))*(?:(?:[A-Za-z0-9]+|[A-Za-z0-9]+(?:\([A-Za-z0-9-]*\))*-(?:\([A-Za-z0-9-]*\))*[A-Za-z0-9]+)(?:\([A-Za-z0-9-]*\))*))+$
See this Demo. It gives the same validation result as Salesforce for all the values I could think of testing - copied below - any counter examples are welcome...
************* VALID *************
a#a.a
-#a.a
a#1.a
a#a-a.a
a#a.a-a
!#$%&'*+-/=?^_`{|}~#test.jp
a..a#test.jp
a..#test.jp
"a"#test.jp
""#test.jp
(comment)(comment)a(comment)(comment)(comment)#(comment)a.a
(comment)(comment)a.(comment)(comment)(comment)#(comment)a.a
(comment)(comment)a(comment).(comment)(comment)#(comment)a.a
a#(comment)a(comment)-(comment)a(comment).a
john.doe#(-comment)example.com
john.doe#example.com(comment-)
()a#test.jp
(a)a#test.jp
a(a)#test.jp
a#(a)test.jp
a#test.jp(a)
simple#example.com
very.common#example.com
disposable.style.email.with+symbol#example.com
other.email-with-hyphen#example.com
fully-qualified-domain#example.com
user.name+tag+sorting#example.com
x#example.com
example-indeed#strange-example.com
test/test#test.com
example#s.example
"john..doe"#example.org
mailhost!username#example.org
user%example.com#example.org
user-#example.org
1#1234567890123456789012345678901234567890123456789012345678901234.1.2.3.4.5.6.7
************* INVALID *************
a#a
a#a.
a#-a.a
a#a-.a
a#a.a-
a#a.-a
a;a#test.jp
.a#test.jp
";"#test.jp
"#"#test.jp
"a#test.jp
a"#test.jp
a""#test.jp
""a#test.jp
()#test.jp
)(a#test.jp
(a)#test.jp
(a#test.jp
(())a#test.jp
(comment)(comment).(comment)a(comment)(comment)#(comment)a.a
john.doe#(comment).com
a#(comment)a(comment)-(comment)(comment).a
Αθήνα#email.com
admin#mailserver1
" "#example.org
"very.(),:;<>[]\".VERY.\"very#\\ \"very\".unusual"#strange.example.com
postmaster#[123.123.123.123]
postmaster#[IPv6:2001:0db8:85a3:0000:0000:8a2e:0370:7334]

Related

unicodePwd Attribute in AD

Can someone explain how the unicodePwd attribute works in AD.
Specifically, when I open the "ADSI Editor", then open a users properties and look at the attributes, the "unicodePwd" attribute shows as "not set". When in fact the user does have a password. Is this where password gets stored?
Also, I can reset a users password by right-clicking on the user and choosing Reset Password...but if I try to set by updating the unicodePwd, using proper Hex and I'm pretty sure using the right complexity, I get:
Operation Failed. Error code: 0x1f
SvcErr: DSID-031A1236, problem 5003 (WILL_NOT_PERFORM), data 0
I have been working on this for a while, and you cannot change the user password from the gui like that. There are two ways you can do this as far as I have found which is:
Use the powershell module found documented here: https://learn.microsoft.com/en-us/powershell/module/activedirectory/set-adaccountpassword?view=windowsserver2022-ps
Do it with the help of LDAP / OPENLDAP, like so:
a. add quotes (") to the beginning and end of the password
b. convert each character to unicode
c. convert the bytestring you now have using base 64 encoding.
The code I used to do the encoding is shown here:
import base64
import sys
def encode_pwd(pw):
new_pw = b""
pw = "\"" + pw + "\""
for char in pw:
new_pw += char.encode("utf-16le")
return base64.standard_b64encode(new_pw)
pw = sys.argv[1]
result = encode_pwd(pw)
stripped_result=str(result).strip("b'")
print(stripped_result)
This will return the encoded unicodePwd.
Finally modify the entry with ldapmodify:
ldapmodify -v -H 'ldaps://host' -U "user" -w "password" -Y DIGEST-MD5 \
<<EOF
dn: "DN=contoso,DN=com"
replace: unicodePwd
unicodePwd::$b64pwd
EOF
I hope this answers your question. :)
Active Directory stores the password on a user object or inetOrgPerson object in the unicodePwd attribute.
UnicodePwd doesn’t store the user password it is not set by default itself. It is use for encoding the password in a attribute.
This attribute can be written under restricted conditions, but it cannot be read. The attribute can only be modified; it cannot be added on object creation or queried by a search.
In order to modify this attribute, the client must have a 128-bit Transport Layer Security (TLS)/Secure Socket Layer (SSL) connection to the server. An encrypted session using SSP-created session keys using NTLM or Kerberos are also acceptable as long as the minimum key length is met.
Reference: https://learn.microsoft.com/en-us/troubleshoot/windows/win32/change-windows-active-directory-user-password
Note: The syntax of the unicodePwd attribute is Object(Replica-Link). However, the DC requires that the password value be specified in a UTF-16 encoded Unicode string containing the password surrounded by quotation marks, which has been BER-encoded as an octet string per the Object(Replica-Link) syntax.

Validate angular regex so so it doesn’t have info#, admin#, help#, sales#

I would like to block some kind of emails using angular ng-pattern
The emails below should not be valid
info#anything.com
admin#anything.com
help#anything.com
sales#anything.com
The regex below worked
^((?!info)(?!admin)(?!help)(?!sales)[a-zA-Z0-9._%+-])+#[a-zA-Z0-9.-]+\.[a-zA-Z]{1,63}$
But not as I expected because I wold like to allow i.e
information#anything.com
How can I block the info#, admin#, help#, sales#?
Thanks
You may join the lookaheads into 1 and add # after the values to ensure you match the user part up to # (as a whole):
/^(?!(?:info|admin|help|sales)#)[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.-]+\.[a-zA-Z]{1,63}$/
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
See the regex demo
The (?!(?:info|admin|help|sales)#) negative lookahead fails the match if, after the start of a string (^), there is info# or admin# or help#, or sales#.

How to write a email pattern in angularjs for specific domain list

I need to write a email pattern with ng-pattern for the following domain list: any domain that end with .com/.is.edu after # or #com/#is.edu. For example, the email address for xxx#y.com or xxx#com should be valid, xxx#g.is.edu and xxx#is.edu should all be valid.
Try something like this:
/^\S+#\S*\.(com|id.edu)$/

AD returns Objectsid as String and SecurityIdentifier is failing interprete this

Usually AD returns 'Objectsid' as a byte[]. So I type cast the value returned by AD in to byte[]. This procedure worked against several AD but not in one case. In this AD environment, I get following exception.
Exception: Unable to cast object of type 'System.String' to type 'System.Byte[]'. (System.InvalidCastException)
To debug this I started checking data-type of the value returned by AD, and it was system.string not byte[]. I printed this string and it was garbage. Then I passed this string to SecurityIdentifier() and I got exception again.
Exception: Value was invalid. Parameter name: sddlForm (System.ArgumentException)
Code:
//Using System.DirectoryServices.Protocols objects
object s = objSrEc[k1].Attributes[(string)obj3.Current][0];
string x = s.GetType().FullName;
if (x.ToLower() == "system.byte[]")
{
byte[] bSID = ((byte[])s);
if (bSID != null)
{
SecurityIdentifier SID = new SecurityIdentifier(bSID, 0);
String ObjectSID = SID.Value;
}
}
else if (x.ToLower() == "system.string")
{
SecurityIdentifier SID = new SecurityIdentifier((String)s); //ssdl excception
String ObjectSID = SID.Value;
}
This is the first time I am seeing AD return string data for ObjectSID. I have run my code against many AD servers. I am planning to check the data-type of ObjectSID in AD schema.
Do any one come across this behavior? Should I call the Win32 api ConvertByteToStringSid()?
Thanks
Ramesh
Sorry for reviving a graveyard post, but I had the same issue a year or so ago, managed to find out why and I figured I'd at least share the reason behind this behavior.
When using the System.DirectoryServices.Protocols namespace, all attribute values should be either a) a byte array, or b) a UTF-8 string. Thing is, the developers at Microsoft figured that they should help people by returning a string when the byte array returned from the underlying LDAP API can be formatted as one, and the byte array itself when the UTF-8 conversion fails. However, this is only true for the indexer of the DirectoryAttribute class, and not for the iterator (which always returns the byte array) or the GetValues method.
The safest way to always get a byte array when you want the SID is, as previously mentioned by others, the GetValues method.
I came through the same. Found this behavior normal when deal with ForeignSecurityPrincipals, however recently found this when translate attributes of built-in groups from some old Win 2K3 domains.
I don't like this as can't just ask the result attribute to tell me via GetType() what type are you and what should I do with you ( GetValues(Attribute.GetType()) ). One of the solutions was reading all attributes definition from AD schema, but this part might be a bit heavy (depends what you're looking for) although it was only a small part of overall AD processing the solution was performing.
Cheers,
Greg

How to retrieve field name in case of unmarshalling error in CXF?

Precondition: service based on CXF receives request/response with data, which violates XSD restriction.
Actual behavior:
In this case CXF returns fault with message like:
cvc-maxLength-valid: Value 'string_length_violated_value' with length = '28' is not facet-valid with respect to maxLength '13' for type 'XSDStringTypeWithLengthRestriction'
Goal:
return fault to consumer with name of field which contains invalid data. F.e. something like this:
Response from provider contains invalid data. Value 'string_length_violated_value' of field 'field_name' is not facet-valid with respect to maxLength '13'.
I'm wondering if it is possible and if so, then how to determine (where to retrieve from) this field name?
I'm not sure if this will completely work, but you can give it a try:
You can create a JAXB ValidationEventHandler and register that on your endpoint.
The ValidationEvent that it gets has the basic string (that you see above) and other information. I would put a breakpoint in there and dig into the event to see if ANY additional and useful information is available.

Resources