How can one perform full text search in Google App Engine? - google-app-engine

It's a simple question, but I haven't found the answer anywhere. Thoughts and input appreciated.
I'm using Django, too, for what it's worth. :)
Cheers.

The Search API is now available as an experimental feature for Java and Python.

With Java GAE, you could use Compass, but that won't help with Django. For Python, Bill Katz offers one solution -- open source -- and these guys offer a Django-specific approach which, however, is free only for non-commercial applications (i.e. if your app makes money they want you to pay for their full-text search). I have no real-world experience with either of these solutions so I can't really give well-grounded recommendations, but from what one can see with just a little playing around they seem quite useful.

An overview of the Python App Engine searches that I am aware of:
Google did add a cut-down search using SearchableModel, although it has limitations (a limit of 5000 indexed words, and it indexes String properties only, not Text properties):
http://groups.google.com/group/google-appengine/browse_thread/thread/f64eacbd31629668/8dac5499bd58a6b7?lnk=gst&q=searchablemodel
Or, as other posters have pointed out, there are these options:
The Quick and simple text search:
http://www.billkatz.com/2009/6/Simple-Full-Text-Search-for-App-Engine
This product which has a fairly comprehensive free version and a more extensive commercial version:
http://gae-full-text-search.appspot.com/customers/download/
I've read that Google does have a project to bring full text search to App Engine, although this is not scheduled to happen any time soon.
I'd really like to see a comparison of the various searching frameworks and see how they stack up to each other. Does anyone know of any report like this?
Edit:
The Google Search API is now available (although still experimental).

For now, the real answer is that there is no real full-text search on Google App Engine. The solutions provided by the other answers here are fine for toy data sets, but do not scale to anything more than O(10000) documents or so. Google will have to provide search as an infrastructural feature of GAE. See the feature request for (mostly superfluous) discussion.

The Quick and simple text search:
http://www.billkatz.com/2009/6/Simple-Full-Text-Search-for-App-Engine
This solution did not work for me, and given the limitations below, it is unlikely to be useful for real use cases.
It uses a StringListProperty to store phrases, and each string in that list is limited to 500 characters.
It does not work with the standard query filters.
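To make the list-property approach concrete, here is a minimal sketch of how text might be broken into words and short phrases before being stored in a list of strings. This is not Bill Katz's actual code; the function name `index_terms` and the `max_phrase_words` parameter are made up for illustration, and the assumption is that the resulting list would be saved in a StringListProperty and matched with an equality filter.

```python
import re


def index_terms(text, max_phrase_words=2):
    """Return the sorted, unique words and short phrases found in text.

    Hypothetical helper: the output is the kind of list one might store
    in a list-of-strings property so that an equality filter can match it.
    """
    # Lowercase and keep only runs of letters and digits as words.
    words = re.findall(r"[a-z0-9]+", text.lower())
    terms = set(words)
    # Add every run of adjacent words up to max_phrase_words long.
    for size in range(2, max_phrase_words + 1):
        for start in range(len(words) - size + 1):
            terms.add(" ".join(words[start:start + size]))
    return sorted(terms)


print(index_terms("Full text search"))
```

Because every extra phrase length multiplies the number of stored terms, this kind of index grows quickly with document size, which is consistent with the scaling concerns raised in the other answers.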

Regarding Issue 217: Bill Katz released a package to deal with it, and http://gae-full-text-search.appspot.com/ is available as an alternative. Levenshtein distance is another match measure.
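For reference, Levenshtein distance can be computed in pure Python with no extra libraries. The sketch below is the standard dynamic-programming formulation; the helper `closest_matches` and its `max_distance` default are assumptions added for illustration, and comparing a query against every candidate like this is only practical for small word lists.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            substitute_cost = previous[j - 1] + (ch_a != ch_b)
            current.append(min(insert_cost, delete_cost, substitute_cost))
        previous = current
    return previous[-1]


def closest_matches(query, candidates, max_distance=2):
    """Hypothetical helper: candidates within max_distance edits, nearest first."""
    scored = [(levenshtein(query.lower(), c.lower()), c) for c in candidates]
    return [c for distance, c in sorted(scored) if distance <= max_distance]


print(levenshtein("kitten", "sitting"))
print(closest_matches("serch", ["search", "engine", "perch"]))
```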

You should be able to adapt Whoosh to write to the datastore instead of to disk. It's a pure-Python full-text search engine. It's not as fast or full-featured as Lucene, but it should run on GAE without too many modifications.

Related

How to do FTS within Google Cloud Platform

Does Google Cloud Platform have a product to do full-text search via an API with non-web data (such as json or xml documents)? This may seem like a pretty silly question, but the only options I have come across are:
Search inside of Google App Engine (only available for Python 2, not Python 3) -- https://cloud.google.com/appengine/training/fts_intro/.
Related to web search only: https://developers.google.com/custom-search/docs/tutorial/introduction
Using a managed Elasticsearch: https://console.cloud.google.com/marketplace/details/google/elasticsearch.
Cloud Firestore explicitly states it doesn't offer that and suggests using Algolia (and gives details on integrating): https://cloud.google.com/firestore/docs/solutions/search
Is there something I'm missing? I'm basically looking to index and search about a million documents in a sort of free-form type of search. Is this offered as a product from Google outside of App Engine? If so, how can I access it?
You have pretty much covered it there. There is currently no specific Google service for full-text search. As you mentioned, App Engine Search API is available for Python 2.7, which will stop being maintained after January 2020, and not Python 3.
There is one more option you could consider, which is using Lucene for GAE. I found this blog where several possibilities are studied; it could be interesting reading for you.
To sum up, I would recommend Elasticsearch or Algolia, but for the latter you need a Firebase project.

Are Cloud Endpoints with Go Google App Engine Standard possible?

I have implemented a simple API in Go on Google App Engine Standard using just:
func init() {
    http.HandleFunc("/api/v1/resource", submitResource)
}
Nothing special. However I want to port this code to using Cloud Endpoints instead in order to get the better monitoring and diagnostics.
Is it even possible with STANDARD instances or must I move to FLEXIBLE?
I can't find any documentation on this. Nor answers to this seemingly simple question. At the moment I half wish I had chosen Python because its support seems more mature. I chose Go because it seems more appropriate for API-like code because my minimal research suggested Go offered better performance.
If it is possible, are there any pointers to how please?
Only Python and Java are supported on GAE Standard via the Endpoints Frameworks. However, Go is supported on GAE Flexible.
Here is the Go GAE Flexible sample:
https://github.com/GoogleCloudPlatform/golang-samples/tree/master/endpoints/getting-started
After much research and trial and error, the simple answer is "No." - as of Dec 2016.
The longer answer is that it's possible if you want to put far too much effort into building up-to-date libraries of your own. There is basically no support, even in alpha, for the current Google Cloud Endpoints using Go with Google App Engine Standard.
It's possible to run Go+endpoints on the GAE Standard environment; however, the libraries might be outdated now.
Libraries and sample app can be found on github:
https://github.com/GoogleCloudPlatform/go-endpoints
I have successfully deployed "Greetings" as AppEngine SE app, and it works.

Stanford Parser as a Google App Engine Service

I'm new to Google App Engine. I'm struggling to find a way to use Stanford Parser as a backend for a mobile app (iOS, Android). Is it possible to run the Parser as a service in GAE, so that the app can send the string to be parsed and, after processing, get back a JSON with the results?
If yes, any hints or tutorial that you can direct me to?
Thank you.
I can't answer your exact question, but I'm also very interested in this.
Have you tried running the parser locally on iOS or Android? I suspect it would be somewhat slow, but I wonder if it's "tolerably slow" for small sentences. The official page just mentions a 100MB memory minimum; I can't find any mention of a minimum requirement for CPU power.
Here they explain how they run their own online parser:
https://mailman.stanford.edu/pipermail/parser-user/2011-March/000954.html

Search Engine that can use SKOS?

I am currently working on a project where we want to take SKOS and plug it into a search engine to make the search results better. An example of this would be something like Semaphore Smartlogic (closed, not free, too big to partner with).
Searchblox is a very good, free, configurable, lucene/solr search engine, but it does not have SKOS abilities and is not open source.
Constellio is similar to Searchblox (not quite as good), and claims to be working on accepting SKOS, but I can't get it to function properly.
Before I go and build this: Does anyone know of an existing free search engine that has the ability to accept SKOS? Or, does anyone know of an open source Lucene/Solr search engine like Searchblox that I could add this functionality to quickly?
You know Solr is a search engine on its own? Check http://wiki.apache.org/solr/ for more info.
A Google search led me to http://code.google.com/p/lucene-skos/wiki/HowTo
Not the most active project, but I guess a good start.
It shouldn't be too hard to combine the two into the solution you need.
I am not sure if SIREn supports SKOS, but it is a semantic Lucene plugin that may be worth checking out.

Google App Engine : GeoPtProperty query

I have this latlng = db.GeoPtProperty() in app engine.
Given a latitude and longitude, how do I query for the 10 closest latlng values without using any third-party library?
Is there any good documentation I can refer to?
Thanks in advance.
Geospatial queries aren't supported natively by the datastore. There are userland implementations however, including geomodel, documented here.
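If the number of stored points is small enough to fetch into memory, a brute-force approach using only the standard library is possible. The sketch below assumes the points have already been read out of the datastore as plain `(lat, lon)` tuples; the function names and the `limit` parameter are illustrative, and this does not replace a real geospatial index for large data sets.

```python
import heapq
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def nearest(origin, points, limit=10):
    """Return up to `limit` points from `points`, closest to origin first."""
    lat0, lon0 = origin
    return heapq.nsmallest(
        limit, points,
        key=lambda p: haversine_km(lat0, lon0, p[0], p[1]))


# Example with made-up coordinates.
print(nearest((0.0, 0.0), [(10.0, 10.0), (1.0, 1.0), (5.0, 5.0)], limit=2))
```

Every call scans all points, so the cost grows linearly with the number of entities fetched.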
I don't believe this is natively supported yet. However, there is a talk at Google I/O 2011 on App Engine full text search (emphasis mine):
"At last we are adding a full text search service to App Engine. The upcoming service will be built on top of the very infrastructure used by Google. In addition to full text search queries we will also offer numeric, geo, date search capabilities, and much more. This session will cover the basic full text search API, briefly outline more advanced features, and how full text search ties to existing services such as datastore."
Stay tuned...

Resources