I need help with the following question:
I need to search Gmail using non-ASCII characters (the Cyrillic alphabet, for example Russian or Ukrainian). When I use the standard IMAP SEARCH command, I receive an error:
A12 SEARCH CHARSET UTF-8 SUBJECT "текст" ALL
A12 BAD Could not parse command
In Java, it looks like this:
Message[] foundMessages = imapFolder.search(new SubjectTerm("текст"));
I found some help in the question "IMAP search for non-ascii characters". Using openssl s_client -crlf -connect imap.gmail.com:993, I connected to my mailbox from the terminal and received the following results:
A12 SEARCH CHARSET UTF-8 X-GM-RAW {10}
+ go ahead
текст
* SEARCH 226
A13 OK SEARCH completed (Success)
The main question: how do I implement this in Java?
UPDATE
I did some research on the JavaMail source code and found the following lines:
// if server supports UTF-8, enable it for client use
// note that this is safe to enable even if mail.mime.allowutf8=false
if (p.hasCapability("UTF8=ACCEPT") || p.hasCapability("UTF8=ONLY"))
p.enable("UTF8=ACCEPT");
}
The Gmail server returns the following capabilities:
A1 LOGIN test@gmail.com password
* CAPABILITY IMAP4rev1 UNSELECT IDLE NAMESPACE QUOTA ID XLIST CHILDREN
X-GM-EXT-1 UIDPLUS COMPRESS=DEFLATE ENABLE MOVE CONDSTORE ESEARCH
UTF8=ACCEPT LIST-EXTENDED LIST-STATUS
LITERAL-SPECIAL-USE APPENDLIMIT=35651584
So JavaMail enables UTF8=ACCEPT automatically, regardless of mail.mime.allowutf8. But in this case, JavaMail performs the search using the following command:
C6 SEARCH CHARSET UTF-8 X-GM-RAW "текст" ALL
and I receive an error:
C6 BAD Could not parse command
I've gone ahead and investigated
https://github.com/javaee/javamail/blob/52e04fc107d0b83fa794e6f622f7c76b9e85e395/mail/src/main/java/com/sun/mail/iap/Argument.java#L313
Argument.nastring(byte[] bytes, Protocol protocol, boolean doQuote)
boolean utf8 = protocol.supportsUtf8(); // For Gmail this is true, which is why JavaMail doesn't use a literal.
byte b;
for (int i = 0; i < len; i++) {
b = bytes[i];
if (b == '\0' || b == '\r' || b == '\n' ||
(!utf8 && ((b & 0xff) > 0177))) {
// NUL, CR or LF means the bytes need to be sent as literals
literal(bytes, protocol);
return;
}
if (b == '*' || b == '%' || b == '(' || b == ')' || b == '{' ||
b == '"' || b == '\\' ||
((b & 0xff) <= ' ') || ((b & 0xff) > 0177)) {
quote = true;
if (b == '"' || b == '\\') // need to escape these characters
escape = true;
}
}
I tested another email provider that doesn't advertise UTF8=ACCEPT, and everything works fine there:
K11 SEARCH CHARSET UTF-8 SUBJECT {10}
+ continue
текст ALL
* SEARCH 1194
K11 OK SEARCH completed
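The {10} in the working transcripts is the size of the search term in bytes once it is encoded as UTF-8, not its length in characters. Below is a minimal sketch of how such a literal command could be assembled, written in Python rather than Java for brevity. The function name build_search_literal is hypothetical, and the sketch only builds the bytes; it does not talk to a server or wait for the continuation response.

```python
from typing import Tuple


def build_search_literal(tag: str, key: str, term: str) -> Tuple[bytes, bytes]:
    """Return (command line, literal payload) for a UTF-8 IMAP SEARCH.

    Hypothetical helper: the client sends the command line, waits for the
    server's "+" continuation, then sends the payload.
    """
    payload = term.encode("utf-8")
    # The {N} prefix announces N bytes (octets), not N characters.
    line = f"{tag} SEARCH CHARSET UTF-8 {key} {{{len(payload)}}}\r\n".encode("ascii")
    return line, payload


line, payload = build_search_literal("A12", "X-GM-RAW", "текст")
print(line)
print(len(payload))
```

Each of the five Cyrillic letters takes two bytes in UTF-8, which is where the 10 comes from.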
From a quick look at the source, it ought to Just Work if you use JavaMail 1.6.1. You may want or need to set the mail.mime.allowutf8 property to true.
In more detail: 1.6 adds support for Unicode email addresses such as jøran@blåbærsyltetøy.gulbrandsen.priv.no, which as a side effect regularises the use of UTF-8 almost everywhere. When you connect to Gmail, JavaMail 1.6 ought to send a login command, then automatically one along the lines of a04 enable utf8=accept, and once Gmail has acknowledged utf8=accept, a12 search subject "текст" all becomes legal syntax that ought to do what you want.
So I'm coding a Discord bot, and I have a statement that is supposed to send a certain message only if the second argument (argument 1) is "version".
Here's the code:
case 'help':
if(args[1] === 'version'){
message.channel.send(VERSIONDETAILS)
}
else
message.channel.send('Command list for version ' + VERSION)
message.channel.send('Prefix: ' + PREFIX)
message.channel.send(`botdetails: Sends information on the bot \ninvite: Sends a permanent invite link \ninfo: Sends info on Brothaus \nversioninfo: Sends the version number, and latest version details` )
break;
But when I run the b!help version command, it sends both the VERSIONDETAILS and the else statement messages. How do I solve this?
Edit: Gosh this was beyond a nimwit post
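It should look like this; you forgot the {} around the else branch. Without braces, only the first statement after else belongs to it, so the remaining sends run unconditionally.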
case 'help':
if (args[1] === 'version') {
message.channel.send(VERSIONDETAILS)
} else {
message.channel.send('Command list for version ' + VERSION)
message.channel.send('Prefix: ' + PREFIX)
message.channel.send(`botdetails: Sends information on the bot \ninvite: Sends a permanent invite link \ninfo: Sends info on Brothaus \nversioninfo: Sends the version number, and latest version details`)
}
break;
I can't find a way to restrict my commands to a single channel; right now all commands are available in all channels. I'm creating a bot that allows you to play "arcade" games in Discord.
...
client.on('message', (msg) => {
if (message.channel.id != config.singleChannelID) return; // I tried using this, as recommended by other users but I don't know where to put ChannelID.
let prefixd = 'd!'
if (!msg.content.startsWith(prefixd)) return
let command = msg.content.toLowerCase().slice(prefixd.length).split(" ")[0]
if (command == '20') msg.channel.send(`You rolled a(n) **${Math.floor(Math.random() * 20) + 1}**!`)
if (command == '12') msg.channel.send(`You rolled a(n) **${Math.floor(Math.random() * 12) + 1}**!`)
if (command == '8') msg.channel.send(`You rolled a(n) **${Math.floor(Math.random() * 8) + 1}**!`)
if (command == '6') msg.channel.send(`You rolled a(n) **${Math.floor(Math.random() * 6) + 1}**!`)
});
What you have in there will work, but you need to either store the desired channel ID somewhere (config in your example) or just hard code it if that is acceptable for your needs:
if (msg.channel.id != SomeChannelsIdGoesHere) return;
If you don't know how to get the channel ID, consult the Discord support pages.
Also, note that you used the identifier 'message' instead of 'msg' as it appears in your event callback.
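The guard described above can be sketched as a plain function. This is written in Python rather than discord.js purely for illustration; ALLOWED_CHANNEL_ID, its value, and handle_message are hypothetical names, and the real bot would call msg.channel.send instead of returning a string.

```python
import random

ALLOWED_CHANNEL_ID = "123456789012345678"  # hypothetical channel ID
PREFIX = "d!"
DICE = {"20": 20, "12": 12, "8": 8, "6": 6}


def handle_message(channel_id, content, rng=random):
    """Return the reply text, or None if the message should be ignored."""
    # Guard first: ignore everything outside the one allowed channel.
    if channel_id != ALLOWED_CHANNEL_ID:
        return None
    if not content.startswith(PREFIX):
        return None
    # First word after the prefix is the command, e.g. "20" for a d20.
    command = content.lower()[len(PREFIX):].split(" ")[0]
    sides = DICE.get(command)
    if sides is None:
        return None
    return f"You rolled a(n) **{rng.randint(1, sides)}**!"


print(handle_message("some-other-channel", "d!20"))
print(handle_message(ALLOWED_CHANNEL_ID, "d!6"))
```

The point is only the ordering: the channel check comes before any command parsing, and it uses the same variable name as the callback parameter.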
I'm using Java LDAP to access Active Directory, more specifically Spring LDAP.
A group search by objectGUID yields no results when the filter is encoded as specified in RFC 2254.
This is the GUID in its hex representation:
\49\00\f2\58\1e\93\69\4b\ba\5f\8b\86\54\e9\d8\e9
Spring LDAP encodes the filter like this:
(&(objectClass=group)(objectGUID=\5c49\5c00\5cf2\5c58\5c1e\5c93\5c69\5c4b\5cba\5c5f\5c8b\5c86\5c54\5ce9\5cd8\5ce9))
As mentioned in RFC 2254 and on Microsoft TechNet:
the character must be encoded as the backslash '\' character (ASCII
0x5c) followed by the two hexadecimal digits representing the ASCII
value of the encoded character. The case of the two hexadecimal
digits is not significant.
So a backslash should be '\5c'.
But I get no results from AD with the filter above. It also doesn't work if I put that filter into the custom filters of the AD management console.
When I remove the 5c from the filter, it works both from Java and in the AD console.
Am I missing something here?
Of course I can encode the filter without the 5c, but I'm not sure it's the right way, and I prefer to let Spring encode the filters because it handles a lot of things that I would otherwise have to do manually.
I think the blog entry at http://www.developerscrappad.com/1109/windows/active-directory/java-ldap-jndi-2-ways-of-decoding-and-using-the-objectguid-from-windows-active-directory/ provides the information you need.
I found a solution in PHP to look up a user by objectGUID.
Step one: when I create a user, I store their objectGUID in the database, in the form you see in AD, e.g. $guid_str = "31207E1C-D81C-4401-8356-33FEF9C8A"
Then I wrote my own function to convert this GUID to hexadecimal:
function guidToHex($guid_str){
$str_g= explode('-',$guid_str);
$str_g[0] = strrev($str_g[0]);
$str_g[1] = strrev($str_g[1]);
$str_g[2] = strrev($str_g[2]);
$retour = '\\';
$strrev = 0;
foreach($str_g as $str){
for($i=0;$i < strlen($str)+2; $i++){
if($strrev < 3)
$retour .= strrev(substr($str,0,2)).'\\' ;
else
$retour .= substr($str,0,2).'\\' ;
$str = substr($str,2);
}
if($strrev < 3)
$retour .= strrev($str);
else
$retour .= $str ;
$strrev++;
}
return $retour;
}
This function returns a string like \1C\7E\20\31\1C\D8\01\44\83\EF\9C\8A"\F9\ED\C2\7F. I then put this string in my filter and get the user.
To get the objectGUID in that format, I use this function that I found on the internet:
function convertBinToMSSQLGuid($binguid)
{
$unpacked = unpack('Va/v2b/n2c/Nd', $binguid);
return sprintf('%08X-%04X-%04X-%04X-%04X%08X', $unpacked['a'], $unpacked['b1'], $unpacked['b2'], $unpacked['c1'], $unpacked['c2'], $unpacked['d']);
}
I mean this format: 31207E1C-D81C-4401-8356-33FEF9C8A
Pass the GUID as a byte array and the search should work.
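To see why the \5c version finds nothing, it can help to compare the two encodings side by side. The sketch below is in Python rather than Java or PHP and uses only the standard uuid module. The GUID string is not from the original post: it is derived from the question's hex bytes on the assumption that AD stores objectGUID in Microsoft's mixed-endian layout (first three fields little-endian). The function names are hypothetical.

```python
import uuid


def guid_to_ldap_filter_value(guid_str):
    """Escape each raw GUID byte as \\xx for use inside an LDAP filter."""
    # bytes_le gives the layout with the first three fields little-endian.
    raw = uuid.UUID(guid_str).bytes_le
    return "".join("\\{:02x}".format(b) for b in raw)


def escape_again(value):
    """Simulate escaping an already-escaped value: each backslash becomes \\5c."""
    return value.replace("\\", "\\5c")


# Assumed display form of the GUID whose bytes appear in the question.
value = guid_to_ldap_filter_value("58f20049-931e-4b69-ba5f-8b8654e9d8e9")
print("(&(objectClass=group)(objectGUID=" + value + "))")
print("(&(objectClass=group)(objectGUID=" + escape_again(value) + "))")
```

The first filter asks for 16 binary bytes. The second asks for a text value made of literal backslashes and hex digits, which is what you get when a string that is already escaped is escaped a second time. That would explain why removing the 5c makes the search work.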
I want to create a parser for a document very similar to the following Samba configuration file. It has many sections. Every section has a header line, which starts with [ followed by the section name (e.g. global, share_name) and runs to the end of the line. The header line is followed by the parameters for that section. We don't know where a section ends until we reach the beginning of another section, i.e. a new line starting with [. How can I write a rule for this kind of document? All the ANTLR examples I found know exactly where a section starts and where it ends. Thanks a lot!
[global]
netbios name = NETBIOS_NAME
workgroup = WORKGROUP
security = user
[SHARE_NAME]
comment = COMMENT
force create mode = 0770
locking = yes
[printers]
comment = COMMENT
path = /var/spool/samba
browseable = No
Here is my grammar:
grammar SambaConfiguration;
file : global_section
share_name_section
printer_section
EOF
;
global_section
: SECTION_TAG_START GLOBAL_SECTION_TAG (.)* SECTION_TAG_END NEW_LINE
(~SECTION_TAG_START (.)* NEW_LINE)*
;
share_name_section
: SECTION_TAG_START SHARE_NAME_SECTION_TAG (.)* SECTION_TAG_END NEW_LINE
((~SECTION_TAG_START) (.)* NEW_LINE)*
;
printer_section
: SECTION_TAG_START PRINTER_SECTION_TAG (.)* SECTION_TAG_END NEW_LINE
((~SECTION_TAG_START) (.)* NEW_LINE)*
;
SECTION_TAG_START
: '['
;
SECTION_TAG_END
: ']'
;
GLOBAL_SECTION_TAG
: 'global'
;
SHARE_NAME_SECTION_TAG
: 'SHARE_NAME'
;
PRINTER_SECTION_TAG
: 'printer'
;
NEW_LINE :
'\r' ? '\n' | '\r'
;
WHITE_SPACE
: ' ' | '\t'
;
Somehow, it does not work properly. When I run it in ANTLRWorks, it gives me the following exception:
problem matching token at 12:19 NoViableAltException('o'@[1:1: Tokens
: ( SECTION_TAG_START | SECTION_TAG_END | GLOBAL_SECTION_TAG |
SHARE_NAME_SECTION_TAG | PRINTER_SECTION_TAG | NEW_LINE | WHITE_SPACE
);])
Thanks.
The error message:
problem matching token at 12:19 NoViableAltException('o'@[1:1: Tokens : ( SECTION_TAG_START | SECTION_TAG_END | GLOBAL_SECTION_TAG | SHARE_NAME_SECTION_TAG | PRINTER_SECTION_TAG | NEW_LINE | WHITE_SPACE );])
means that ANTLR encounters a character, 'o', that it cannot create a token for. You probably think it will be matched by the . in your parser rules, but it doesn't. Inside parser rules, the . matches any token, while only inside lexer rules it matches any character.
Your lexer only creates the following tokens: SECTION_TAG_START, SECTION_TAG_END, GLOBAL_SECTION_TAG, SHARE_NAME_SECTION_TAG, PRINTER_SECTION_TAG, NEW_LINE and WHITE_SPACE. So a . inside a parser rule matches any of these tokens, nothing more.
Unless you're doing this to learn ANTLR, I'd hesitate to use ANTLR for this task. You can do this more easily by reading the input line by line and using some built-in string operations.
Using ANTLR, you could do something similar to this:
grammar T;
parse
: section* EOF
;
section
: header line*
;
header
: SECTION_TAG_START name=text SECTION_TAG_END NEW_LINE
{
System.out.println("name=" + $name.text);
}
;
line
: key=text ASSIGN value=text (NEW_LINE | EOF)
{
System.out.println(" key=`" + $key.text.trim() +
"`, value=`" + $value.text.trim() + "`");
}
;
text
: OTHER+
;
SECTION_TAG_START : '[';
SECTION_TAG_END : ']';
ASSIGN : '=';
NEW_LINE : '\r'? '\n';
OTHER : . /* any other char: must be the last rule! */;
Parsing your example input would print the following to your console:
name=global
key=`netbios name`, value=`NETBIOS_NAME`
key=`workgroup`, value=`WORKGROUP`
key=`security`, value=`user`
name=SHARE_NAME
key=`comment`, value=`COMMENT`
key=`force create mode`, value=`0770`
key=`locking`, value=`yes`
name=printers
key=`comment`, value=`COMMENT`
key=`path`, value=`/var/spool/samba`
key=`browseable`, value=`No`
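For comparison, the line-by-line alternative mentioned earlier in this answer could look roughly like the following. It is a minimal sketch in Python, not a full Samba parser: parse_sections is a hypothetical name, and skipping lines that start with # or ; is an assumption about comment syntax that the sample file does not show.

```python
SAMPLE = """[global]
netbios name = NETBIOS_NAME
workgroup = WORKGROUP
security = user
[SHARE_NAME]
comment = COMMENT
force create mode = 0770
locking = yes
[printers]
comment = COMMENT
path = /var/spool/samba
browseable = No
"""


def parse_sections(text):
    """Map each section name to a dict of its key/value parameters."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in "#;":  # assumed comment markers
            continue
        if line.startswith("[") and line.endswith("]"):
            # A new header implicitly ends the previous section.
            current = sections.setdefault(line[1:-1].strip(), {})
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return sections


sections = parse_sections(SAMPLE)
for name, params in sections.items():
    print(name, params)
```

As in the grammar, a section has no explicit end marker; seeing the next header (or the end of input) is what closes it.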
email = self.request.get('email')
name = self.request.get('name')
mail.send_mail(sender="myemail", email=email, body=name, subject="sss " + name + "sdafsaã")
// Edit: I added ã. The problem was that "sdafsaã" should be u"sdafsaã", with a "u" before the string. Now it works.
Then I get this:
main.py", line 85, in post
subject="sss " + name + "sdafsa",
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 36: ordinal not in range(128)
The name might contain characters like õ and ó.
For more details, this is the code that enqueues the worker (it runs before the code above). The name comes from the datastore and contains characters like õ and ó:
taskqueue.add(url='/emailworker', params={'email': e.email, 'name': e.name})
thanks
Try reading a little about how unicode works in Python:
Dive Into Python - Unicode
Unicode In Python, Completely Demystified
Also, make sure you're running Python 2.5 if you are seeing this error on the development server.
You should use:
email = self.request.get('email')
name = self.request.get('name')
mail.send_mail(sender="myemail",
email=email,
body=name,
subject="hello " + name.encode('utf-8') + " user!")
The variable name is a unicode string and should be encoded as UTF-8 (or whatever encoding your web application uses) before it is concatenated with other byte strings.
Without name.encode(), Python falls back to the default 7-bit ASCII codec, which can't handle that character.
The problem is joining two strings:
body = name + "ã" => error
body = name + u"ã" => works!
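The traceback can be reproduced outside App Engine. The sketch below is written for Python 3, where mixing bytes and text is never done implicitly, so it performs by hand the ASCII decode that Python 2 attempted when a byte string was joined with a unicode string. The value of name is a made-up example.

```python
name = "João"  # hypothetical text (unicode) value, as read from the datastore

# What a plain "sdafsaã" literal is in a UTF-8 encoded Python 2 source file:
suffix_bytes = "sdafsaã".encode("utf-8")

error = None
try:
    # Python 2 did this implicitly when joining unicode + byte string.
    suffix_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    error = str(exc)
print(error)

# Decoding with the right codec (or starting from text, like u"sdafsaã") works.
subject = "sss " + name + suffix_bytes.decode("utf-8")
print(subject)
```

Either make every piece text before joining, as in the u"..." fix, or make every piece bytes, as in the name.encode('utf-8') answer; the error only appears when the two kinds are mixed.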
Try using encode:
t = u'việt ứng '
m = MyModel()
m.data = t.encode('utf-8')
m.put() #success!