Where can I find the simple information of the format for uploading questions? - retrieve-and-rank

Situation:
I want to train and simple configure the retrieve and rank service.
I just uploaded some PDFs and now I want to upload some questions.
In the documentation I do not find a simple information how the csv file must be structured and which are the must fields and which are not must files.
Something like: "[YOUR QUESTION (MUST)]",[DOCUMENT ID (MUST)], [RANKING (OPTIONAL)]
The document ID you will find in xyz in section xyz.
Inside the help I can not find such kind of help.
https://www.ibm.com/watson/developercloud/doc/retrieve-rank/training_data.shtml#script
Impact:
There is no chance to get a "real" documentation of the configuration outside the tutorial.
Possible Solution:
Provide additional documenation.
Maybe I was not able to find it and someone can guide me to the right place?

Ok, I found the solution for me, by try and error. Following steps do work for me:
1) You need a plain text file and the ending should be *.txt
2) Inside the file you have to write your questions like this:
What is the best place to be?
Why should I travel to the USA?
-> Don't do it like
"What is the best place to be?"
For me the help was missleading, because saying something about CSV files.
You can take a look also in the comment of #dalelane he is right, and highlight the entry text for the upload of the file.

Related

shapefile for mumbai city areawise or section wise

please don't kill me if this question is listed anywhere else, i really searched high and low to find answer to this question.
i am working on a project in which i need to plot data on Mumbai city map section wise or area wise but i am not able to get Shapefile for the same. can anyone please provide me the link where i can get this Shapefile? or guide me on creating one using QGIS??.I mentioned QGIS because i know a little about OGIS and hence it will save time.Also i tried certain links which are listed below. but i couldn't find what i need.Guys it would be a great help because i really need this and i am stuck on this from few days.Also i mentioned shapefile because these are easy to manipulate, later i will convert it into geoJSON and work on it so if geoJSON files are there then it would also help.
Note: only the important links are listed:
1. http://monsoon.mcgm.gov.in:8080/RESTFulWS/WardMaps.html0
2.http://www.naturalearthdata.com/downloads/10m-cultural-vectors/
thanks.

Create a corpus with a single file (webpage)

I want to read a single file (the file is a html document) from my computer and store it in a Corpus (I'm using the package tm).
Do you have any solution to do that?
Here is what I tried :
data<-read.csv(fileName)
c2<-Corpus(VectorSource(data))
it mostly works, but I sometime get the error : more columns than column names
I guess I'm not supposed to use read.csv for a webpage, as I didn't find a better solution.
Thanks for your help =)
A webpage definitely does not conform to the specifications that a CSV should. Instead you probably want to use the readHTMLTable function from the XML package.
This is grabbing from an actual webpage but it should be the same idea
file <- "http://xkcd.com/"
dat <- readLines(file)
c2 <- Corpus(VectorSource(dat))

Difficulty with filename and filemime when using Migrate module

I am using the Drupal 7 Migrate module to create a series of nodes from JPG and EPS files. I can get them to import just fine. But I notice that when I am done importing them if I look at the nodes it creates, none of the attached filefield and thumbnail files contain filename information.
Upon inspecting the file_managed table I see that both the filename and filemime fields are empty for ONLY the files that I attached via the migrate module. This also creates an issue with downloading the files.
Now I think the problem has to do with the fact that I am using "file_link" instead of "file_copy" as the file operation I specify. The problem is I am importing around 2TB (thats Terabytes) of image files. We had to put in a special request with Rackspace just to get access to that much disk space on our server. So I can't go around copying from one directory to the next because of space issues. So "file_link" seems like the obvious choice.
Now you probably want to see how I am doing this exactly, so here is the code snippet:
$jpg_arguments = MigrateFileFieldHandler::arguments(NULL,
'file_link', FILE_EXISTS_RENAME, 'en', array('source_field' => 'jpg_name'),
array('source_field' => 'jpg_filename'), array('source_field' => 'jpg_filename'));
$this->addFieldMapping('field_image', 'jpg_uri')
->arguments($jpg_arguments);
As you can see I am specifying no base path (just like the beer.inc example file does). I have set file_link, the language, and the source fields for the description, title, and alt.
It is able to generate thumbnails from the JPGs. But still missing those columns of data in the db table. I traced through the functions the best I could but I don't see what is causing this. I tried running the uri in the table through the functions that generate the filename and the filemime and they output just fine. It is like something is removing just those segments of data.
Does anyone have any idea what this could be? I am using the Drupal 7 Migrate module version 2.2. It is running on Drupal 7.8.
Thanks,
Patrick
Ok, so I have found the answer to yet another question of mine. This is actually an issue with the migrate module itself. The issue is documented here. I will be repealing this bounty (as soon as I figure out how).

Can I digitalize a dictionary?

I've found a public domain latin<->portuguese dictionary in PDF which I'd like to convert to plain text, parse and use as the database of a program. After some testing, however, I got a little skeptical. Take a look at the original file and at the resulting text of gocr. Is there any hope that I might reach 99%+ accuracy in some method? I thought of reCaptcha's database, but I guess it is Google's property, isn't it?
Thanks!
Another route is to use one of the freely available dictionary files, like http://www.brothersoft.com/downloads/dictionary-database.html
Or WordNet.
EDIT: I've just spotted that this is a Latin/Portuguese dictionary, so WordNet clearly is no good.

How Do I Use Multiple po Files in CakePHP?

I'm just beginning the process of exploring i18n in CakePHP and I can't seem to find the right combination of files and functions that will allow me to use multiple po files. If I want to use a single po file (default.po) for every bit of translatable text, that works fine, but I see that becoming an unmaintainable hairball very, very quickly. I've read the docs and the few articles I can find, but none really dive into i18n beyond the trivial use of one .po file.
Here's where I am right now:
I've "baked" my po templates (.pot files) and copied those into app/locale/eng/LC_MESSAGES (I'm not going to be using the default text as the key so that I can easily spot missing keys). For now, I have -views-layouts-default.po and -views-pages-index.po.
In those .po files, I've entered the text I want to use for each key.
In my homepage (views/pages/index.ctp) and default layout (views/layouts/default.ctp) I've wrapped the text key I want to translate with the __() function.
When I load the homepage, though, all I see are they keys. No text has been translated. If I throw up a default.po file, though, any keys I drop in there are populated just fine. I'm clearly missing some piece of the puzzle, but I can't find it. Any help would be much appreciated.
Thanks.
I found the piece I was missing thanks to the CakePHP Google Group. I had been playing with the __d() convenience function, but didn't have a clear picture of how to tie it together to my .po files. The answer is easy once you know it:
The domain translation:
__d ( 'login', 'PLEASE_LOGIN' );
Will look for the "PLEASE_LOGIN" key in the file named login.po. I didn't know (and hadn't read anywhere) that domain == po file name (without extension). Learning that made all the difference.

Resources