Is there any interests database for download? - database

I need a database of interests for a coming project.
By saying "interests" I mean like:
Sports - Football, Basketball and so on...
Is somebody know something like this?
I just don't want start writing thousands (or even millions) of interests.

Try to google on "interests list" or "hobbies list" and the write simple parcer to extract all (based on neko-html for example). Examples of useful links:
http://www.notsoboringlife.com/list-of-hobbies/
http://www.buzzle.com/articles/list-types-of-hobbies/
Also it is possible to ask admins of any dating sites to make a simple DB-query to get relevant result.

There is an API from google called Freebase. I allows you suggest to the user what input he should give the same way google search suggests you what to look for.
Freebase

Related

Matching article text against pre-existing list of categories

I'm new to Azure Cognitive Services, and while I'm pretty sure it can help me solve my problem, I don't quite understand which part of it to use for it...
Here's what I want to do:
We have blog posts, say ~1k, and those blog posts all have categories and tags (multiple each). What I want to do, is to "guess" the right categories/tags for each article based on the content, and then present that to the editor as a suggestions at the time of input ("looks like this article is about: health, well-being, ..."). The ~1k articles we already have in the system are currently correctly tagged/categorized, so I'd like to use these a data source for this "guessing".
I've used Azure Search before, and it seems like some combination of EntityRecognition and KeyPhraseExtraction might be a way in the right direction? Azure Cognitive Services also seems to have an API that supports TextAnalytics that would do something similar. I'm a bit confused about why these are two different things (or are they not?)
This also seems like an entirely common problem (matching text against pre-defined categories based on other text that is categorized), so I'm wondering if I'm just missing an obvious solution here?
Thanks in advance.
I think the Azure Cognitive Text Analytics API is your best bet as you are looking for real-time analysis prior to tagging/categorizing for storage.
Text Analytics could return a list of named entities that you could map to your available tags/categories and present to the user.
Azure Cognitive Search requires an indexer and skillset to process target text with an end result of storing the processed results to an index specifically for searching.

Auto-complete and some technical in web application

I have some problems still do not understand and I need help.
In some web application like Facebook, when I type a word to find my friends or places, apps..., a suggestion list appear for me. I can traverse and choose one of them.
So I want to ask something:
Best or good tool, language or something like that to do this? I think it's jQuery. More options for me?
Which is a list item in the suggestion list called?
Where did these things from? I means: how can I create or take them? From database directly?
The answers are:
You can use jQuery coupled with a server side language that retrieve the data
Usually it's called suggestion list. These function is Google is called Google suggest
The data are retrieved from the database

Where to get an updated list of video games?

I am currently designing a reviews site for video games similar to gamespot am wondering where and if there is an online database that contains information such as name, publisher, release date etc with an API. I dont really want to have to enter each title manually or let users enter the title manually.
Where do these large sites get information like this? I wouldn't think it would be manually. I know for movies IMDB exists.
How would I go about adding it to my database?
Thanks
May I point you to web scraping?
Be sure to read the section legal issues and on well-behaved bots.
There's always Amazon and their product advertising API. Some older, but interesting code snippets can be found on this page.
If you know Perl, there is an amzing module called WWW::Mechanize
Pretty much you can write a script to get to any website and grab any data you need.
So for example you can go to www.gamespot.com, get list like the one below and put them in your database.
http://www.gamespot.com/games.html?platform=1029&mode=all&sort=views&dlx_type=all&sortdir=asc&official=all&tag=games%3Bfooter%3Bmore

Receiving / retrieving email in CakePHP

I am developing a basic yet highly customized CRM for a small training centre which has the ability to store student records and also send emails to them. I'm using SwiftMailer following this excellent tutorial in CakePHP to accomplish the sending part.
Of course, students are sometimes going to reply to emails and I'd like to retrieve them within my CRM and store them along with the student record.
However, I cannot find a single reference to doing this. I've tried the following Google searches: "receiving email cakephp" , "retrieving email cakephp" and even "email client cakephp" but all of these queries give results relating to sending mail rather than receiving it -- very frustrating!
Finally, I broadened my search to non-cake solutions and found someone recommending a library called ezComponents. It doesn't seem to have had any active development for about a year, but it includes an email receiving class which is exactly what I want. Unfortunately, I have no idea how to add this to CakePHP and the only post I've been able to find on the entire web on the matter doesn't exactly go into much detail. It's certainly not a step-by-step tutorial on using ezComponents on CakePHP like the SwiftMailer tutorial I mentioned above.
I also found a class on Google Code called php-imap which looks like it would do the job but, again, I haven't the slightest clue how to get it working happily in Cake like SwiftMailer is.
I realize that I may have to learn how to package classes for use in Cake by myself but I'm asking this question first on the off-chance that there is already a Cake-friendly solution to this problem that I just haven't realized :-)
Joseph
Thanks to everyone for your answers, but I've been doing some more searching and it looks like the solution is actually incredibly simple.
Basically, with the help of a plugin, I can set up the mail server in databases.php as a datasource and then write a Model and Controller to interact with it.
Here's the example I found: https://github.com/kvz/cakephp-emails-plugin
Edit: the repo has been deprecated and is now available at https://github.com/kvz/deprecated/tree/cakephp-emails-plugin
You will want to pipe your email to PHP and use stdin:// to read the contents of the email and add the e-mail to your database.
I've done this with cake and the simplest way is to make a Cake console application to handle the parsing. Also using cpanel's account level filtering to generate the pipe is really simple.
http://forums.cpanel.net/f5/piping-mail-php-scripts-howto-checklist-50985.html
http://www.evolt.org/incoming_mail_and_php
Sounds like you want to include SwiftMailer as a Cake plugin, amirite?
http://book.cakephp.org/view/1111/Plugins
-- if you want to package it yourself. Otherwise, a cursory search of the Bakery yielded this result:
http://bakery.cakephp.org/articles/sky_l3ppard/2009/11/07/updated-swiftmailer-4-xx-component-with-attachments-and-plugins
Hopefully it will at least get you pointed in the right direction. HTH. :)

Webapps: Storing and searching through user submitted blocks of text

Background:
I'm building a poetry site with user submitted content. The relevant user actions for my questions are that users can:
a. Go to fancysitename.com/view to see all poems so far
b. Go to fancysitename.com/submit to submit your own poem.
c. Go to fancysitename.com/apoemid to view a particular poem you've bookmarked before.
d. Go to fancysitename.com/search to enter a word to search for in all the poems.
All the poems are stored as text fields in a database and referenced by a poem id. So the "apoemid" in step c will be the primary key of the tuple and I'll just pull up the text after getting the key from the url.
Question:
The poems exist nowhere except in a database. My webapp is literally 4 html files. Will this approach affect my search engine rankings?
Is there a more efficient way to do 'd' rather than do a Select * on the db and manually parsing the text on the server? Each poem will be at the most 10 lines long, so I would imagine using a full text search engine like Lucerne will probably be overkill.
Caveat
I'm running this on the google app engine for now, so my database customization options are pretty limited. So while I'd certainly be interested in hearing about the ideal way to do this, this is a pet side project so my budget is limited :(
Thanks!
Edit: Apparently I don't google so well at 7am. I've since found a solution for question 2 here so please disregard question 2.
AppEngine currently doesnt support full text indexing, they do have a better than nothing SearchableModel.
Some details of SearchableModel can be found here:
http://groups.google.com/group/google-appengine/browse_thread/thread/f64eacbd31629668/8dac5499bd58a6b7?lnk=gst&q=searchablemodel
Regarding search engine ranking, yes having all your poems in the datastore can affect your ranking. This is generally overcome through the use of a sitemap. Here is an article about how StackOverflow uses a sitemap to help its search ranking.
http://www.codinghorror.com/blog/archives/001174.html
In most database engines, you can accomplish this kind of searching. For example MysQL does have full text searching. I am not sure how app engine works but you can always have a stored procedure does this search.
Where you store your data will not affect your site's ranking, only how you serve it up (on what URLs, etc). There's absolutely no way for an arbitrary search spider to tell where you store your data, and no reason for it to care, either.
Regardless of the length of your text, you will need full-text searching if you want to search inside a string. As Sam points out, SearchableModel ought to work just fine for that.

Resources