I am trying to evaluate a WSD model using well-known WSD data sets (SemEval, Senseval), but I don't understand the format of the gold key text file.
senseval3.gold.key.txt
d000.s000.t000 man%1:18:00::
d000.s000.t001 say%2:32:01::
d000.s001.t000 peer%2:39:00::
d000.s001.t001 companion%1:18:00::
d000.s001.t002 bleary%5:00:00:indistinct:00
d000.s001.t003 eye%1:08:00::
d000.s002.t000 have%2:40:00::
d000.s002.t001 ready%5:00:01:available:00
d000.s002.t002 answer%1:04:00::
d000.s002.t003 much%3:00:00::
d000.s002.t004 surprise%1:12:00::
d000.s002.t005 fit%1:26:00::
d000.s002.t006 coughing%1:26:00::
d000.s003.t000 man%1:18:00::
d000.s003.t001 drunk%3:00:00::
d000.s003.t002 crazy%5:00:00:insane:00
d000.s004.t000 newfound%5:00:00:new:00
By looking at the data file, I know that d000.s000.t000 in the first line refers to document #0, sentence #0, token #0.
senseval3.data.xml
<sentence id="d000.s000">
<wf lemma="that" pos="DET">That</wf>
<wf lemma="'" pos="VERB">'s</wf>
<wf lemma="what" pos="PRON">what</wf>
<wf lemma="the" pos="DET">the</wf>
<instance id="d000.s000.t000" lemma="man" pos="NOUN">man</instance>
<wf lemma="have" pos="VERB">had</wf>
<instance id="d000.s000.t001" lemma="say" pos="VERB">said</instance>
<wf lemma="." pos=".">.</wf>
</sentence>
But I don't know what the part after % means, for example 1:18:00:: for the lemma man.
This answer is based on the comment given on this SO post.
Each entry is a WordNet sense key of the form lemma%lex_sense. The number sequence following % is the lex_sense, which is composed as follows:
ss_type:lex_filenum:lex_id:head_word:head_id
More information is in the WordNet documentation.
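As a rough illustration of the layout above, here is a minimal Python sketch that splits a gold key line into its fields. The names `SenseKey`, `parse_sense_key` and `parse_gold_line` are made up for this example, and it assumes each line holds an instance id followed by one or more whitespace-separated sense keys. The ss_type codes are the ones listed in the WordNet sense index documentation.

```python
from typing import NamedTuple

# ss_type codes as listed in the WordNet sense index documentation
SS_TYPES = {1: "noun", 2: "verb", 3: "adjective", 4: "adverb", 5: "adjective satellite"}


class SenseKey(NamedTuple):  # hypothetical container, not part of any library
    lemma: str
    ss_type: str
    lex_filenum: int
    lex_id: int
    head_word: str  # only filled for adjective satellites
    head_id: str    # only filled for adjective satellites


def parse_sense_key(key: str) -> SenseKey:
    """Split lemma%ss_type:lex_filenum:lex_id:head_word:head_id into fields."""
    lemma, lex_sense = key.split("%", 1)
    ss_type, lex_filenum, lex_id, head_word, head_id = lex_sense.split(":")
    return SenseKey(lemma, SS_TYPES[int(ss_type)], int(lex_filenum),
                    int(lex_id), head_word, head_id)


def parse_gold_line(line: str):
    """Assumed line layout: instance id, then one or more sense keys."""
    instance_id, *keys = line.split()
    return instance_id, [parse_sense_key(k) for k in keys]


print(parse_gold_line("d000.s000.t000 man%1:18:00::"))
print(parse_sense_key("bleary%5:00:00:indistinct:00"))
```

For man%1:18:00:: this gives a noun from lexicographer file 18 with lex_id 0 and empty head fields. For bleary%5:00:00:indistinct:00 the head word is indistinct, because adjective satellites point to the head of their cluster.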
I have a postgres data-config file.
<dataConfig>
<dataSource driver=”org.postgresql.Driver” url=”jdbc:postgresql://127.0.0.1:5432/mydb” user=”user” password=”pw” />
...
</dataConfig>
But when I run it, it shows this error:
Data Config problem: Open quote is expected for attribute "driver" associated with an element type "dataSource".
What's the problem here? Is the driver information that I put in wrong?
Your quotes are wrong.
” and " are not the same kind of quotes (see the different presentation). Only " is a valid double quote in an XML file (and in most/all programming contexts).
The quotes in your config file seem to have been mangled by a blog or a text editor on the way.
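To check a file for this problem, a small Python sketch like the one below can help. It is only an illustration: `straighten_quotes` and `is_well_formed` are names invented here, and blindly replacing every curly quote is an assumption that only holds when the file has no legitimate curly quotes in its text content.

```python
import xml.etree.ElementTree as ET

# Map typographic quotes to the plain ASCII quotes that XML expects
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"', "\u201d": '"',   # left/right double quotation marks
    "\u2018": "'", "\u2019": "'",   # left/right single quotation marks
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ones (hypothetical helper)."""
    return text.translate(CURLY_TO_STRAIGHT)


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


broken = "<dataConfig><dataSource driver=\u201dorg.postgresql.Driver\u201d /></dataConfig>"
fixed = straighten_quotes(broken)

print(is_well_formed(broken))  # the curly-quoted version is rejected
print(is_well_formed(fixed))   # the straight-quoted version parses
```

In practice it is usually simpler to retype the quotes in a plain text editor; the sketch mainly shows that the parser rejects the curly-quoted attribute and accepts the straight-quoted one.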
I have a small script (spring/groovy/ldap) that finds, in Active Directory, the 'management tree' under a person,
i.e. from a 'root person' the script finds the root person's direct reports then uses recursion: for each direct report find their direct reports, etc.
The user's directReports attribute holds a list of DNs in the form:
CN=Simpson\, Homer,OU=OU_0731DevOps,OU=OU_0100Monitor Services,OU=OU_0001U*Nuclear Energy Corporation,OU=OU_UNuclearUsers,DC=corp,DC=unucleargrp,DC=com
The script does an "ldap lookup" for each direct report by DN, e.g.:
obj = ldapTemplate.lookup(pDn, new UserAttributesMapper())
Problem
The ldap lookup throws an InvalidNameException
[LDAP: error code 34 - 0000208F: LdapErr: DSID-0C090787
I've tried various combinations of escaping but still get the error.
What am I missing???
More Info
This page, https://social.technet.microsoft.com/wiki/contents/articles/5312.active-directory-characters-to-escape.aspx, shows which characters to escape:
Active Directory requires that the following ten characters be escaped
with the backslash "\" escape character if they appear in any of the
individual components of a distinguished name:
Comma ,
Backslash character \
Pound sign (hash sign) #
Plus sign +
Less than symbol <
Greater than symbol >
Semicolon ;
Double quote (quotation mark) "
Equal sign =
Leading or trailing spaces
Tools
Groovy
Spring Boot
JVM
thanks!
I found the answer by poking around with LdapNameBuilder.
TLDR:
When a base is configured, ldapTemplate.lookup expects a DN relative to that base, so the trailing "DC=..." portion of the full DN has to be stripped off.
If you know a cleaner/more-official solution, please post!
LDAP Lookup fails with a DN like this:
This DN has "DC=.." components and fails using spring ldap lookup.
CN=Simpson\, Homer,OU=OU_0731DevOps,OU=OU_0100Monitor Services,OU=OU_0001U*Nuclear Energy Corporation,OU=OU_UNuclearUsers,DC=corp,DC=unucleargrp,DC=com
LDAP succeeds with this (no "DC" components):
This DN has no "DC=" components. Spring LDAP template provides the basedn.
CN=Simpson\, Homer,OU=OU_0731DevOps,OU=OU_0100Monitor Services,OU=OU_0001U*Nuclear Energy Corporation
Context Reminder
This application traverses the 'management tree'. It gets a person's direct reports from the 'directReports' attribute (which lists the full DN of each direct report), and then needs to look up each of those users by DN.
Tweak/Example
This tweak got the ldap lookup to work:
User lookupUserByDn(String pDn) {
    // Needed this to get it to work: strip the base DN suffix,
    // because Spring LDAP supplies the configured base itself.
    String dn = pDn.replace(",${ldapConfig.base}", "")
    ldapTemplate.lookup(dn, new UserAttributesMapper())
}
For the record, the ldap portion of my application.yml looked like this:
spring:
  ldap:
    urls: ldap://dc.corp.unucleargrp.com:389
    base: DC=corp,DC=unucleargrp,DC=com
    username: username_val
    password: password_val
According to https://docs.spring.io/spring-ldap/docs/2.3.1.RELEASE/reference/#contextsource-configuration, if you remove the base attribute, all operations going back and forth will use full DNs.
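The suffix-stripping idea can be sketched a little more defensively. The snippet below is Python rather than Groovy and is only an illustration: `to_relative_dn` is a made-up name, and it assumes that a plain case-insensitive string comparison is good enough (it does not normalize spacing or escaping inside the DN). Unlike a plain replace, it only removes the base when it really is the trailing part of the DN.

```python
def to_relative_dn(full_dn: str, base_dn: str) -> str:
    """Return full_dn without a trailing ',<base_dn>' (hypothetical helper).

    The DN is returned unchanged when it does not end with the base.
    """
    suffix = "," + base_dn
    if full_dn.lower().endswith(suffix.lower()):
        return full_dn[: -len(suffix)]
    return full_dn


base = "DC=corp,DC=unucleargrp,DC=com"
full = (r"CN=Simpson\, Homer,OU=OU_0731DevOps,OU=OU_0100Monitor Services,"
        r"OU=OU_0001U*Nuclear Energy Corporation,OU=OU_UNuclearUsers,"
        r"DC=corp,DC=unucleargrp,DC=com")

print(to_relative_dn(full, base))
```

The same check translates directly to Groovy with endsWith and substring, if you prefer to keep everything in one language.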
When I try to get the attribute of URL in a test XML:
<Test> <Item URL="http://127.0.0.1?a=1&b=2"/>
</Test>
After I call: attr=xmlGetProp(cur, BAD_CAST "URL");
libxml2 gives the message: Entity: line 1: parser error : EntityRef: expecting ';'
and the returned value of attr is "http://127.0.0.1?a=1=2"
How can I get the complete value of the URL attribute? Thanks
You cannot get the “correct” URL here because the XML file is not well-formed. The & should have been written as &amp;. You have to ask the creator of the XML file to produce a syntactically valid, well-formed XML file.
XML cannot be created by just putting strings together; the values also have to be escaped properly.
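On the producing side, the escaping is normally left to a library. Here is a minimal Python sketch, assuming the file is generated by code you control; it uses `quoteattr` from the standard library's `xml.sax.saxutils`, and the `Test`/`Item` element names are taken from the question.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

url = "http://127.0.0.1?a=1&b=2"

# quoteattr escapes the value and wraps it in quotes for use as an attribute
item = "<Item URL=%s/>" % quoteattr(url)
document = "<Test>%s</Test>" % item
print(item)

# A conforming parser turns &amp; back into & when reading the attribute
parsed_url = ET.fromstring(document).find("Item").get("URL")
print(parsed_url)
```

With the ampersand written as &amp; in the file, a reader such as libxml2's xmlGetProp receives the original URL including &b=2.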
I am trying to parse an XML file using the SAX interface of libxml2 in C.
My problem is that whitespace characters between the end of a tag and the start of a new tag cause the "characters" callback to be executed.
For example,
<?xml version="1.0"?>
<doc>
<para>Hello, world!</para>
</doc>
produces these events:
start document
start element: doc
start element: para
characters: Hello, world!
end element: para
characters:
end element: doc
characters:
end document
It would be really nice if this whitespace were not reported as "characters".
Does anybody have an idea why this is happening, or how it can be prevented?
This is, of course, happening since whitespace between elements is significant in XML. So it's just operating according to specification.
See, for instance, this discussion.
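Since the parser is behaving correctly, one common workaround is to filter whitespace-only text inside the handler. The sketch below shows the idea with Python's xml.sax instead of the libxml2 C API, so treat it as an analogy only; `SkipBlankText` and `parse_events` are names invented for this example. Text is buffered until the next element boundary, because a SAX parser may deliver one text run in several chunks.

```python
import xml.sax


class SkipBlankText(xml.sax.ContentHandler):
    """Collects events but drops text runs that are only whitespace."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []

    def _flush_text(self):
        # Join buffered chunks and report them only if something is left
        text = "".join(self._text)
        self._text = []
        if text.strip():
            self.events.append(("characters", text))

    def startElement(self, name, attrs):
        self._flush_text()
        self.events.append(("start", name))

    def endElement(self, name):
        self._flush_text()
        self.events.append(("end", name))

    def characters(self, content):
        self._text.append(content)


def parse_events(xml_text: str):
    handler = SkipBlankText()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


doc = '<?xml version="1.0"?>\n<doc>\n<para>Hello, world!</para>\n</doc>\n'
for event in parse_events(doc):
    print(event)
```

The same check (skip the callback body when the buffer contains only whitespace) can be written in a C characters callback. Note that it also discards whitespace-only content inside elements where it might matter, so it is a trade-off rather than a general fix.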