Search Engine that can use SKOS? - solr

I am currently working on a project where we want to take SKOS and plug it into a search engine to improve the search results. An example of this would be something like Semaphore Smartlogic (closed, not free, too big to partner with).
Searchblox is a very good, free, configurable Lucene/Solr search engine, but it does not have SKOS abilities and is not open source.
Constellio is similar to Searchblox (not quite as good) and claims to be working on accepting SKOS, but I can't get it to function properly.
Before I go and build this: does anyone know of an existing free search engine that can accept SKOS? Or does anyone know of an open source Lucene/Solr search engine like Searchblox that I could add this functionality to quickly?

You know Solr is a search engine on its own? Check http://wiki.apache.org/solr/ for more info.
A Google search led me to http://code.google.com/p/lucene-skos/wiki/HowTo
Not the most active project, but I guess a good start.
It shouldn't be too hard to combine the two into the solution you need.
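To make the idea concrete, here is a minimal sketch of query-time expansion driven by SKOS labels. It is not the lucene-skos API: the thesaurus contents and the function names are made up for illustration, and a real setup would load prefLabel/altLabel pairs from a SKOS file instead of a hard-coded dict.

```python
# Hypothetical in-memory thesaurus: prefLabel -> altLabels, as might be
# extracted from a SKOS file. The concepts and labels are invented.
THESAURUS = {
    "automobile": ["car", "motorcar"],
    "aircraft": ["airplane", "plane"],
}


def build_label_index(thesaurus):
    """Map every label (preferred or alternative) to all labels of its concept."""
    index = {}
    for pref, alts in thesaurus.items():
        labels = [pref] + list(alts)
        for label in labels:
            index[label.lower()] = labels
    return index


def expand_query(query, index):
    """Rewrite each known term as an OR group, using Lucene-style query syntax."""
    parts = []
    for term in query.lower().split():
        labels = index.get(term, [term])
        if len(labels) > 1:
            parts.append("(" + " OR ".join(labels) + ")")
        else:
            parts.append(labels[0])
    return " ".join(parts)


INDEX = build_label_index(THESAURUS)
```

The expanded string can then be passed to Solr as an ordinary query. Expanding at index time instead is the other common option; it makes the index larger but keeps queries short.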

I am not sure if SIREn supports SKOS, but it is a semantic Lucene plugin that may be worth checking out.

Related

How to integrate search engine of a wiki with a search engine of a QA system?

In a knowledge management project I need to use a wiki engine (I'm thinking about DokuWiki) and a question-and-answer system (I'm thinking about Question2Answer), and I need to create a search function that searches both systems (wiki and QA) and returns what has been found (like a Google for the two systems at the same time).
Does anyone know how to do this properly?
I don't know about the wiki side, but you can write your own search plugin for Question2Answer that lets you integrate your own search results into those of Q2A. See the docs for more info.
If your wiki software has an API or functions you can call to find search results, then in Q2A you can use the process_search method in your plugin, call your wiki search function and add the results to the Q2A results.
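The merging step can be sketched independently of Q2A. The snippet below is only an illustration in Python (a real Q2A plugin is written in PHP), and the result shape, a dict with a "url" key, is an assumption. It interleaves the two result lists round-robin rather than sorting by score, since relevance scores from two different engines are usually not comparable.

```python
from itertools import zip_longest


def interleave(*result_lists, limit=10):
    """Merge ranked result lists round-robin, dropping duplicate URLs."""
    seen = set()
    merged = []
    for group in zip_longest(*result_lists):
        for hit in group:
            # zip_longest pads shorter lists with None
            if hit is None or hit["url"] in seen:
                continue
            seen.add(hit["url"])
            merged.append(hit)
    return merged[:limit]


# Hypothetical results from the wiki search and the Q&A search.
wiki_hits = [{"url": "wiki/install"}, {"url": "wiki/backup"}]
qa_hits = [{"url": "qa/42"}]
combined = interleave(wiki_hits, qa_hits)
```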
Having said all that, you might find it easier to integrate Google Custom Search Engine into your site!

Choose Lucene or Solr

We need to integrate a search engine into our platform, a catalog management application in SharePoint. The information is stored in multiple databases and in a file store (doc, ppt, pdf, ...). Our dev platform is ASP.NET and we have done some preliminary work with Lucene and found it to be good. However, we have just learned about Solr.
We want to continue using Lucene, but we need to justify that choice against Solr.
Any help is appreciated.
And sorry for my English.
Lucene is a full-text search library used to provide search functionalities to an application. It can't be used as an application by itself. Solr is a complete search engine built around Lucene providing its search functionalities and others. Solr is a web application that can be used by itself without any development around it.
If you need a search engine that your application can call, I recommend using Solr.
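Because Solr is a web application, "calling it" means sending an HTTP request to its select handler. The sketch below only builds the request URL; the host, port, and core name ("catalog") are assumptions, and any HTTP client (including one in .NET) can issue the resulting request.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, query, rows=10):
    """Build a URL for Solr's /select handler that returns JSON results."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


# Assumed local Solr instance with a hypothetical core named "catalog".
url = solr_select_url("http://localhost:8983/solr", "catalog", "title:lucene")
```

With Lucene alone, the equivalent step is code inside your own application that opens the index and runs the query through the library.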

Implement search across application

I have a search text box in the top bar of my Angular application.
The user can type a keyword and search across the application, and gets back a list of links with short descriptions.
I can implement it on the client side as well as on the server side. I am using Angular on the client side and .NET on the server side.
Can anyone suggest a framework for implementing this? It can be client side or server side.
I can implement it from scratch, that is not an issue, but first I want to look at the solutions that are already available.
Please suggest.
It depends on many things:
if you have very little content to search in, you may opt for a full client-side solution, but that's usually not a very good idea
if you need full-text search, and at least basic semantic features (make sure e.g. "trees" matches "tree"), you should really have a look at Elasticsearch, which is pretty easy to set up and has very good .NET bindings (look for "NEST")
if you want fuzzy suggest on keywords (e.g. tags associated with documents) and are open to paying for a service to handle it for you, I can suggest Algolia (http://www.algolia.com), which is a SaaS for search and suggest and should be very reasonably priced if you are in an enterprise environment and not a high-traffic website. We are using it for that use case and we are extremely happy with it.
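To show what the "trees" matches "tree" point means in practice, here is a toy inverted index that normalizes tokens the same way at index time and at query time. The plural-stripping rule is a deliberately naive stand-in for a real stemmer, and the documents are invented; a search engine such as Elasticsearch does this with proper analyzers.

```python
import re
from collections import defaultdict


def normalize(token):
    """Lowercase and strip a trailing plural 's' (naive stand-in for stemming)."""
    token = token.lower()
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def tokenize(text):
    return [normalize(t) for t in re.findall(r"[A-Za-z0-9]+", text)]


def build_index(docs):
    """Map each normalized token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


DOCS = {1: "Planting a tree", 2: "Trees of Europe", 3: "Glass houses"}
INDEX = build_index(DOCS)
```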

How can one perform full text search in Google App Engine?

It's a simple question, but I haven't found the answer anywhere. Thoughts and input appreciated.
I'm using Django, too, for what it's worth. :)
Cheers.
The Search API is now available as experimental for Java and Python.
With Java GAE, you could use Compass, but that won't help with Django. For Python, Bill Katz offers one solution -- open source -- and these guys offer a Django-specific approach which, however, is free only for non-commercial applications (i.e. if your app makes money they want you to pay for their full-text search). I have no real-world experience with either of these solutions so I can't really give well-grounded recommendations, but from what one can see with just a little playing around they seem quite useful.
An overview of the Python App Engine searches that I am aware of:
Google did add a cut-down search using SearchableModel, although that has limitations (5000 indexed word limit, String properties only, not Text):
http://groups.google.com/group/google-appengine/browse_thread/thread/f64eacbd31629668/8dac5499bd58a6b7?lnk=gst&q=searchablemodel
Or, as other posters have pointed out, there are these options:
The Quick and simple text search:
http://www.billkatz.com/2009/6/Simple-Full-Text-Search-for-App-Engine
This product which has a fairly comprehensive free version and a more extensive commercial version:
http://gae-full-text-search.appspot.com/customers/download/
I've read that Google does have a project to bring full text search to App Engine, although this is not scheduled to happen any time soon.
I'd really like to see a comparison of the various searching frameworks and see how they stack up to each other. Does anyone know of any report like this?
Edit:
Google Search API now available (although still experimental)
For now, the real answer is that there is no real full-text search on Google App Engine. The solutions provided by the other answers here are fine for toy data sets, but do not scale to anything more than O(10000) documents or so. Google will have to provide search as an infrastructural feature of GAE. See the feature request for (mostly superfluous) discussion.
The Quick and simple text search:
http://www.billkatz.com/2009/6/Simple-Full-Text-Search-for-App-Engine
This solution did not work for me, and looking at the limitations below, it is unlikely to be useful for real use cases.
It uses StringListProperty to store phrases, which has a limit of 500 characters.
It does not work with the standard query filters.
Bill Katz released a package to deal with Issue 217, and http://gae-full-text-search.appspot.com/ is available as an alternative. Levenshtein distance is another match measure.
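For reference, the Levenshtein measure mentioned above counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. A minimal pure-Python sketch (the function name is my own, and this is not taken from either package) looks like this:

```python
def levenshtein(a, b):
    """Edit distance between two strings, computed row by row."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]
```

A small distance between a query term and an indexed term can be treated as a fuzzy match, at the cost of comparing the query against many candidate terms.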
You should be able to adapt Whoosh to write to the datastore instead of to disk. It's a pure-Python full-text search engine. It's not as fast or full-featured as Lucene, but it should run on GAE without too many modifications.

Crawler/parser for Xapian

I would like to implement a search engine that crawls a set of web sites, extracts specific information from the pages, and creates a full-text index of that information.
It seems to me that Xapian could be a good choice for the search engine library.
What are the options for a crawler/parser to integrate with Xapian?
Would Solr be a better choice than Xapian to integrate with open source crawlers/parsers?
Here's a little comparison between Xapian and Solr.
But if you want to build a crawler, take a look at Nutch. It's extensible with plugins, so you could write a plugin that analyzes the information that you're looking for.
Flax may provide some of what you're looking for.

Resources