Explanation of the IMAP protocol? - c

I'm looking for information about how the IMAP protocol works. Google yields only high-level information, but not enough to understand the details. I'd like to know enough to be able to create my own implementation. I found a C library which does it, but it is poorly documented.
Some basic questions are: what are IMAP UIDs and what are their guarantees? For example, will a UID ever change? Will it be reused if the message is deleted?

This looks like a good starting point:
http://www.imapwiki.org/ImapRFCList
In general, the keyword you want when searching for details on an internet protocol is "RFC". Add that to your search along with the name of the protocol and you should get off to a good start.

Google yields only high-level information, but not enough to understand the details.
Google is a general search engine, and its results will only be as good as the search terms you supply. If you want detailed and definitive technical information about a protocol, standard, or programming language, you should start by searching for the specification; i.e., use "specification" as one of your search terms.
I'd like to know enough to be able to create my own implementation. I found a C library which does it, but it is poorly documented.
If you've already found an implementation, why would you want to create another? Or even know enough to (hypothetically) create another?
I'm sure there are other open source implementations of IMAP around in various languages.
It is a bit much to expect an implementation of IMAP to be sufficiently well documented as to serve as a specification.
Some basic questions are: what are IMAP UIDs and what are their guarantees? For example, will a UID ever change? Will it be reused if the message is deleted?
I expect that these questions can be answered by reading the IMAP specification; see RFC 3501.
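In short, per RFC 3501: within a mailbox, UIDs are assigned in strictly ascending order and must not change during a session. They persist across sessions and are not reused as long as the mailbox's UIDVALIDITY value stays the same; if the server ever changes UIDVALIDITY, the client must discard every UID it has cached for that mailbox.
You can see this directly by speaking the protocol yourself. Here is a minimal session sketch (in Java; the host and credentials are placeholders) that logs in, selects INBOX - the untagged SELECT responses include UIDVALIDITY - and lists message UIDs:

    import java.io.*;
    import java.net.Socket;
    import javax.net.ssl.SSLSocketFactory;

    // Minimal IMAP session sketch: connect over TLS, log in, select INBOX
    // (its untagged responses include UIDVALIDITY), and list message UIDs.
    // Host and credentials are placeholders.
    public class ImapUidDemo {
        public static void main(String[] args) throws IOException {
            try (Socket sock = SSLSocketFactory.getDefault()
                     .createSocket("imap.example.com", 993);
                 BufferedReader in = new BufferedReader(
                     new InputStreamReader(sock.getInputStream()));
                 PrintWriter out = new PrintWriter(sock.getOutputStream())) {

                System.out.println(in.readLine());           // server greeting
                send(out, in, "a1 LOGIN user password");
                send(out, in, "a2 SELECT INBOX");            // look for * OK [UIDVALIDITY n]
                send(out, in, "a3 UID FETCH 1:* (FLAGS)");   // each FETCH line carries a UID
                send(out, in, "a4 LOGOUT");
            }
        }

        // Send one command, then echo responses until the tagged completion line.
        static void send(PrintWriter out, BufferedReader in, String cmd) throws IOException {
            out.print(cmd + "\r\n");                         // IMAP requires CRLF line endings
            out.flush();
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
                if (line.startsWith(cmd.split(" ")[0] + " ")) break;
            }
        }
    }

A client that caches messages would store the UIDVALIDITY value alongside the cached UIDs and invalidate the whole cache whenever that value changes.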

Related

Using the IBM Watson Concept Insights service for Natural Language Search

We are trying to implement a natural language search function using the IBM Watson Concept Insights (CI) service. We want the user to be able to type in a question using natural language and then return the appropriate document(s) from a CI corpus. We are using CI rather than the Watson QA service to avoid the need for training and to keep Watson infrastructure costs down (i.e. avoid the need for a dedicated instance of Watson for each corpus/use case).
We are able to build the necessary corpus through the CI API but we are not sure which APIs to use in what order to accomplish the most precise/accurate query possible.
Our initial thought was to:
Accept the user’s natural language question and Post that text string to the “Identifies concepts in a piece of text” API (listed 6th from the bottom in the CI API Reference document) to get a list of concepts related to the question.
Then do a GET using the “Performs a conceptual search within a corpus” API (listed 3rd from the bottom in the CI API Reference document) to get a list of related documents back from the corpus.
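In code, the flow we have in mind looks roughly like the sketch below. Only the annotateText endpoint is taken from the curl examples further down; the corpus-search call in step 2 (endpoint and parameters) is a placeholder:

    import java.io.*;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    // Hypothetical sketch of the two-step flow. Only the annotateText URL is
    // taken from the curl examples in this question; the corpus-search call
    // in step 2 is a placeholder.
    public class ConceptSearchSketch {
        static final String BASE =
            "https://gateway.watsonplatform.net/concept-insights-beta/api/v1";

        public static void main(String[] args) throws IOException {
            String auth = "Basic " + Base64.getEncoder().encodeToString(
                "userid:password".getBytes(StandardCharsets.UTF_8));

            // Step 1: POST the natural-language question to annotateText.
            String annotations = post(
                BASE + "/graph/wikipedia/en-20120601?func=annotateText",
                "How can I repair MySQL database corruption", auth);
            System.out.println("Annotations: " + annotations);

            // Step 2 (placeholder): GET a conceptual search over the corpus,
            // passing the concept IDs parsed out of the step-1 response.
        }

        static String post(String url, String body, String auth) throws IOException {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Authorization", auth);
            conn.setDoOutput(true);
            conn.getOutputStream().write(body.getBytes(StandardCharsets.UTF_8));
            try (BufferedReader in = new BufferedReader(new InputStreamReader(
                     conn.getInputStream(), StandardCharsets.UTF_8))) {
                StringBuilder sb = new StringBuilder();
                for (String line; (line = in.readLine()) != null; ) sb.append(line);
                return sb.toString();
            }
        }
    }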
The first question - is this the right way to go about achieving our objective described in the first paragraph of this post? Should we be combining the CI APIs differently or using multiple Watson services together to achieve the objective?
If our initial approach is the right one, then we are finding that when we submit a simple question (e.g. “How can I repair MySQL database corruption”) to the “Identifies concepts in a piece of text” API we are not getting a comprehensive list of associated concepts back. For example:
curl -u userid:password -k -d "How can I repair MySQL database corruption" https://gateway.watsonplatform.net/concept-insights-beta/api/v1/graph/wikipedia/en-20120601?func=annotateText
returns:
[{"concept":"/graph/wikipedia/en-20120601/MySQL","coords":[[17,22]],"weight":0.85504603}]
Yet clearly there are other concepts associated with the example question (repair, corruption, database, etc.).
In another example we just submitted the text “repair” to the “Identifies concepts in a piece of text” API:
curl -u userid:password -k -d "repair" https://gateway.watsonplatform.net/concept-insights-beta/api/v1/graph/wikipedia/en-20120601?func=annotateText
and it returned:
[{"concept":"/graph/wikipedia/en-20120601/Repair","coords":[[0,6]],"weight":0.65392953}]
It seems that we should have gotten the “Repair” concept back from the first example as well. Why would the API return the “Repair” concept when we submit "repair" alone but not when we submit the text “How can I repair MySQL database corruption”, which also includes the word “repair”?
Please advise as to the best way to implement a natural language search function based on the Watson Concept Insights service (perhaps in combination with other services if appropriate).
Thank you very much for your question and my apologies for being so late in answering it.
The first question - is this the right way to go about achieving our objective described in the first paragraph of this post? Should we be combining the CI APIs differently or using multiple Watson services together to achieve the objective?
Doing the steps above would be a natural way to accomplish what you want to do. Please note, however, that the "annotate text" API currently uses exactly the same technology the system uses for connecting documents in your corpus to concepts in the core knowledge graph, and as such it is "paragraph" oriented rather than oriented to individual questions. To be more precise, extracting concepts from a smaller piece of text is generally harder than from a larger one, because the larger text offers more context that can be used to make the right choices. Given this observation and its paragraph focus, the annotate-text API takes the more conservative route.
Having said that, the /v2 API that we now have does improve the speed and quality of the concept extraction technology, so you may be more successful using it to extract topics from natural language questions. Here's what I would do and watch out for:
1) Clearly display to the user what CI extracted from the natural language input. Our APIs give you a way to retrieve a short abstract for each concept, which can be used to explain to the user what a concept means - do use that.
2) Give the user the ability to eliminate a concept from the extracted concept list (strike it out).
3) Since the concepts in Concept Insights currently correspond roughly to the notion of "topics", there is no way to deduce more abstract intent (for example, if the key to the meaning of a question is a verb or an adjective rather than a noun, Concept Insights would be a poor way to deduce it). Watson does have technology oriented towards question answering, as you pointed out (the natural language classifier being one component of that), so I would take a look at that.
Yet clearly there are other concepts associated with the example question (repair, corruption, database, etc.).
The answer to this and the rest of the posted question is, in a sense, above - our intention was to provide a technology first for "larger text", which as I explained is an easier task. Between when this question was first posted and today, we did introduce new annotation technology (/v2), so I would encourage the reader to see whether it performs a little better.
For the longer term, we do intend to give the user a formal way to specify context for a general application so that the chances of extracting relevant concepts increase. We also plan to let users specify custom concepts, as we have observed that some topics of interest to users are impossible to match in our current design because they are not in Wikipedia.

Genetic Algorithm vs Expert System

I'm having some doubts about which approach I should use for a new piece of software.
No code has been written yet; I'm just breaking down all the requirements before I start coding.
This will be implemented in a computer company that provides services for other companies, onsite and remotely.
These are my variables:
Number of technicians
Location of customer
Type of problem
Services already scheduled for the technician
Expertise of the technician about the situation
Customer priority
Maybe some are missing, but these are the most important ones.
This job is currently being done manually, and as humans, we sometimes fail to see the best route to take.
Let's say that a customer calls with a printer problem.
First, check which tech knows about printers.
Then: is that tech available? How far is he from the customer? Can it be done remotely (a software issue)?
Can it be done by another tech who is closer to the customer's location?
Does this customer have more priority than the other where the same tech should be going?
Is the technician's schedule full? If yes, pass the job to another printer/hardware tech.
I know my english is not perfect (not my natural language), but I'll try to provide more details or correct the text as needed.
So, my question is this: what kind of approach would you take? A genetic algorithm seems nice for this kind of job, and I also have some experience with GAF and WatchMaker (Java GA frameworks). However, reading the text above, an expert system also seems appropriate.
Has anyone done something like this? I searched for this kind of software and couldn't find anything similar.
Would another approach be better than the two I asked about?
Also, I'm building a table of all the techs' capabilities and expertise, with simple ratings, 1 to 5, for each skill. This is also a decision factor.
Thanks.
Why not do both? Use an expert system (a rule engine) to define your constraints and use a metaheuristic (such as local search or genetic algorithms) to solve it. The planning engine OptaPlanner (Java, open source) does exactly that (by using the rule engine Drools). In that architecture, the metaheuristic solver searches the solution space while the rule engine scores each candidate solution against your constraints.
Here's a video demonstrating the constraint flexibility on the vehicle routing problem (VRP). Your problem looks like an advanced variant of VRP (which is itself a variant of TSP).
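For a feel of what that looks like in code, here is a hypothetical planning entity for your dispatching problem. All class and field names are invented for illustration; only the two OptaPlanner annotations are the library's real API:

    import java.util.Map;

    import org.optaplanner.core.api.domain.entity.PlanningEntity;
    import org.optaplanner.core.api.domain.variable.PlanningVariable;

    // Hypothetical planning entity: the solver assigns a Technician to each
    // ServiceVisit, while score rules (e.g. in Drools) penalise poor
    // expertise matches, long travel, and ignored priorities.
    @PlanningEntity
    public class ServiceVisit {

        private String customerLocation;   // problem fact: where the customer is
        private String problemType;        // problem fact: e.g. "printer"
        private int customerPriority;      // problem fact: 1 (low) .. 5 (high)

        @PlanningVariable(valueRangeProviderRefs = "technicianRange")
        private Technician technician;     // the decision the solver makes

        public Technician getTechnician() { return technician; }
        public void setTechnician(Technician technician) { this.technician = technician; }
    }

    // Planning value: a technician with a 1-5 rating per area of expertise,
    // matching the capability table described in the question.
    class Technician {
        String name;
        Map<String, Integer> expertise;    // e.g. {"printer": 4, "networking": 2}
    }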
Maybe you can start off with TSP: http://en.m.wikipedia.org/wiki/Travelling_salesman_problem. I guess it only deals with the distance, though.
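For reference, the classic nearest-neighbour construction heuristic for TSP fits in a few lines. A baseline sketch (a GA or local search would then improve on its tour):

    import java.util.*;

    // Nearest-neighbour TSP heuristic: repeatedly visit the closest
    // unvisited customer. Only a baseline; metaheuristics refine the tour.
    public class NearestNeighbourTsp {
        public static List<Integer> tour(double[][] dist) {
            int n = dist.length;
            List<Integer> tour = new ArrayList<>();
            boolean[] visited = new boolean[n];
            int current = 0;                 // start at the depot (index 0)
            tour.add(current);
            visited[current] = true;
            for (int step = 1; step < n; step++) {
                int next = -1;
                for (int j = 0; j < n; j++) {
                    if (!visited[j] && (next == -1 || dist[current][j] < dist[current][next])) {
                        next = j;            // closest unvisited stop so far
                    }
                }
                visited[next] = true;
                tour.add(next);
                current = next;
            }
            return tour;
        }

        public static void main(String[] args) {
            double[][] dist = {
                {0, 2, 9, 10},
                {2, 0, 6, 4},
                {9, 6, 0, 3},
                {10, 4, 3, 0}
            };
            System.out.println(tour(dist));  // prints [0, 1, 3, 2]
        }
    }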

Conversation bot source or API

I would like to make a bot that can carry on a simple conversation. I would like to be able to supply the bot with parameters about the things it knows and how it responds to certain subjects. I am wondering if anyone knows of any freely available source code or an API for a decent conversational bot.
I would like to use this to facilitate gaming by having computer-controlled characters that interact with the real players without having completely pre-scripted, static dialog. I am hoping that I can find something capable of holding a simple, generic conversation unless asked about a specific topic, at which point it can give specific replies to a pre-set list of specific topics.
I am asking more about the conversational-processing aspect and not so much about a front end or hooks to other apps or anything like that. Initially, I will just make this a local command-line based thing, then if satisfied I am looking into libpurple as an API to access various communication networks once I have the dialog processing ready.
So, does anyone know of any source code or an API for something like this? Google brings up mostly tools for things like imified. I'm not expecting there to be a lot. Source code for something that can handle various emotions and topics would be awesome, but I'd be happy with something that just holds the simplest of conversations - there should be something out there that does this, seeing how multiple IM bots already exist.
In the absence of a good source or API, would anyone happen to know of any good materials about programming an AI that can have a conversation? Again, I'm not talking about PhD papers discussing robots that can pass believably as humans or anything like that; I mean materials that discuss some simple programming techniques that common conversational bots use to hold rudimentary conversations.
Because of the libpurple API, I'll probably be doing this in C++. So C++ resources are preferable but not required.
(edit) I just stumbled onto AIML (Artificial Intelligence Markup Language). I am currently looking into that, and it sounds like it might be promising, especially if there are any pre-made conversational resources available for it, as then I could just add topics to it in the manner I mentioned, if I am understanding it correctly.
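To illustrate the kind of rudimentary technique such bots use - scan the input for known topic keywords, answer from a canned list, otherwise fall back to generic small talk - here is a minimal sketch (in Java; the topics and replies are made up, and the same table-driven structure ports directly to C++):

    import java.util.*;

    // Minimal topic-keyword chatbot: canned replies per topic keyword,
    // generic small talk as the fallback. All content is invented.
    public class TinyBot {
        private static final Map<String, List<String>> TOPICS = Map.of(
            "weather", List.of("Looks like rain to me.", "A fine day for travel."),
            "sword",   List.of("That blade was forged in the old kingdom.")
        );
        private static final List<String> FALLBACK =
            List.of("Is that so?", "Tell me more.", "Hmm, interesting.");

        private static final Random RNG = new Random();

        static String reply(String input) {
            String lower = input.toLowerCase(Locale.ROOT);
            for (Map.Entry<String, List<String>> e : TOPICS.entrySet()) {
                if (lower.contains(e.getKey())) {
                    return pick(e.getValue());   // topic-specific reply
                }
            }
            return pick(FALLBACK);               // generic small talk
        }

        static String pick(List<String> options) {
            return options.get(RNG.nextInt(options.size()));
        }

        public static void main(String[] args) {
            Scanner in = new Scanner(System.in);
            System.out.println("bot> Greetings, traveller.");
            while (in.hasNextLine()) {
                System.out.println("bot> " + reply(in.nextLine()));
            }
        }
    }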
AIML is old and obsolete, and creating its database is torture. I suggest you read this Gamasutra article about chatbot languages. It describes the ChatScript language, which is a great alternative to AIML.
Another language is RiveScript, which has a nice clean style, but it seems like a copy of AIML with the same bad concepts.
I'm developing the Aerolito language, which is based on YAML; it's just a hobby project and it's not usable yet. =]
In my opinion, ChatScript is the best option for now.
I understand this question is old, but things have changed in the time since it was posted. Check out the following projects; these bots learn from text files, IRC chat logs, or, in the case of triplie, from reading websites (albeit not perfectly).
triplie-ng: https://github.com/spion/triplie-ng
cobe: https://github.com/pteichman/cobe
Giorgio Robino mentioned http://superscriptjs.com/ but it's more than just ChatScript - it's a superset of RiveScript and ChatScript and also includes a built-in triple store to implement WordNet etc.

Technology for long-term archiving (LTA) of digitally signed documents

Imagine that you have thousands or millions of documents signed in CAdES, XAdES or PAdES format. A signing certificate for an end user is typically issued for 1-3 years. After a few years the certificate will expire, the revocation data (CRLs) required for verification will no longer be available, and the original crypto algorithms will not guarantee anything after 10-20 years.
I am curious whether there is some mature, ready-to-use solution for this. I know this can be handled by archive timestamps, but I need a real product that will automatically maintain the data required for long-term validation, add timestamps automatically, etc.
Could you recommend an application or library? Is it a standalone solution, or something that can be integrated with FileNet or a similar system?
The EU is currently trying to endorse advanced digital signatures based on the CAdES, XAdES and PAdES standards. These were specifically designed with the goal of enabling long-term archiving and validation.
CAdES is based on CMS, XAdES on XML-DSig and PAdES on the signatures defined in ISO 32000-1, which themselves again are based on CMS.
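Since CAdES signatures are CMS structures underneath, a library such as Bouncy Castle can parse them directly. A minimal inspection sketch (the file name is a placeholder, and this covers only the first step of what an LTA product would automate):

    import java.nio.file.Files;
    import java.nio.file.Paths;

    import org.bouncycastle.cms.CMSSignedData;
    import org.bouncycastle.cms.SignerInformation;

    // Sketch: load a CAdES/CMS signature file and list its signers with
    // Bouncy Castle. A real LTA solution would additionally verify the
    // certificate chains, check the embedded (archive) timestamps, and
    // refresh the stored revocation data over time.
    public class CmsInspect {
        public static void main(String[] args) throws Exception {
            byte[] der = Files.readAllBytes(Paths.get("document.p7s")); // placeholder path
            CMSSignedData signed = new CMSSignedData(der);
            for (SignerInformation signer : signed.getSignerInfos().getSigners()) {
                System.out.println("Signer: " + signer.getSID());
            }
        }
    }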
One open source solution for XAdES is the Belgian eID project; you could have a look at that.
These are all standards for signatures, they do not, however, go into detail on how you would actually implement an archiving solution, this would still be up to you.
These are all standards for signatures, they do not, however, go into detail on how you would actually implement an archiving solution, this would still be up to you.
However, this is exactly what I am looking for. It seems that the Belgian eID project mentioned above does not address it at all. (I added some clarification to my original question.)
You may find this web site helpful. It's an official site even though it's pointing to an IP address. The site discusses your problem in detail and offers a great deal of advice on dealing with long-term electronic record storage through a standards-based approach.
The VERS standard is quite extensive and fully supports digital signatures and how best to deal with expired signatures.
The standard is also being adopted by leading EDMS/ECM providers.
If I understood your question right, our SecureBlackbox components support the XAdES, PAdES and CAdES standards and pull the necessary revocation information (and timestamps) and embed it into the signature automatically.

What is the best approach of creating a talking bot?

When creating an AI talking bot, what kind of design methods should I use? Should it be one function or multiple modules? Should it have classes?
Understanding language is complicated, so the goal you need to determine first is what aspect of language you want to understand.
An AI must be able to understand what the person says to it, then relate it to what it already knows, and then generate a legitimate response.
These three steps can all be thought of as nearly independent, so you need to address each on its own.
The brain, the world's best language processor, uses a Neural Network, but that's not likely to work well for you.
A logic-based proof-solving system, in which facts that follow from other facts are derived, would probably work best, and I know of at least one system that uses this approach fairly effectively.
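Since the three steps are nearly independent, one natural structure is to put each behind its own interface so they can be developed and swapped separately. A sketch (all names are illustrative):

    // The three nearly independent stages as separate interfaces, so each
    // can be developed and replaced on its own.
    interface Understander {
        ParsedInput parse(String utterance);    // step 1: understand the input
    }

    interface KnowledgeBase {
        Facts relate(ParsedInput input);        // step 2: relate it to what is known
    }

    interface Responder {
        String respond(Facts facts);            // step 3: generate a legitimate reply
    }

    class ParsedInput { /* tokens, intent, entities */ }
    class Facts { /* derived or retrieved knowledge */ }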
I'd start with an existing AI program (like the famous Eliza) and run its output through a speech synthesizer.
Some source for Eliza is available here. One open source speech synthesizer is FreeTTS.
If you're using a language other than Java, there are similar candidate AI bots and text-to-speech tools out there.
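Wiring the synthesizer in is short. A minimal FreeTTS sketch ("kevin16" is one of the voices bundled with FreeTTS; the spoken line is a placeholder for your bot's output):

    import com.sun.speech.freetts.Voice;
    import com.sun.speech.freetts.VoiceManager;

    // Minimal FreeTTS sketch: speak one line of bot output aloud.
    public class SpeakDemo {
        public static void main(String[] args) {
            Voice voice = VoiceManager.getInstance().getVoice("kevin16");
            if (voice == null) {
                System.err.println("Voice not found; check the FreeTTS jars on the classpath.");
                return;
            }
            voice.allocate();                        // load the voice
            voice.speak("Hello, I am your bot.");    // synthesize and play
            voice.deallocate();                      // release resources
        }
    }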
I've started to do some work in this space using this open source project called Talkify:
https://github.com/manthanhd/talkify
It is a bot framework intended to help orchestrate the flow of information between bot providers like Microsoft (Skype), Facebook (Messenger), etc. and your backend services. The framework doesn't really provide implementations for the bot providers yet, but it does provide hooks into its natural language recognition engine.
The built in natural language recognition library can be used to classify sentences to topics which you can then map to skill functions.
Give it a try! I'd really like people's input on how it can be improved.
