Database search with a RESTful Web API - database

I am quite new to REST, but what I want to do is implement a (not that simple) search function to my API. I want to search a database using multiple keywords (e.g. author, booktitle, ...) and different search operators (e.g. ~, =, !=, ...).
What I'm looking for is something like a convention, an example or a "best-practice"-tutorial how to do this in an elegant way in terms of routing parameters etc.
EDIT: Basically I want to know how to include the operators belonging to each keyword in the URL in a nice way.

First check the URI template standard. If it is enough for you, and you can solve your problem with multiple links, then you are in luck.
If not, then you have to send back some description of the structure of your search queries. First you have to choose a query language. It is probably better to choose a standard query language, so you don't have to create a new one. After that you have to send back some semantics (probably in RDF) about the constraints of your search queries. For example, the client can search for words in the article titles, order the results by date, and so on. The client can then generate the query from the details of the query language, the constraint descriptions and the user input.
Once your client has assembled the query, it can send it in one of the following formats:
GET /blah?q="query details" - the query string serialized in a single query param, (the query language can be anything)
GET /blah?query=x|details=z - using a URI query language
SEARCH /blah ... - sending the query in the body of a SEARCH request with the proper MIME type (the query language can be anything, but be aware that caching is probably not supported for the SEARCH method, because it is an old WebDAV method whose standard is no longer maintained)
So the key problem is that we currently don't have a standard or an RDF vocab to describe the query structure for the client, and thus sending a query link will probably violate the self-descriptive constraint and couple the client to the service implementation. (For most current APIs, client reusability and fulfilling the REST constraints are not a concern.)
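To make the first format (one q parameter carrying the whole query) a bit more concrete, here is a minimal Python sketch. The mini query syntax (field, operator, value clauses joined by ";"), the operator set and the function names are assumptions made up for this example; they are not a standard, and /blah is just the placeholder path from above.

```python
import re
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical operator set for this sketch.
OPERATORS = ("!=", "~", "=")
# "!=" is listed before "=" so the two-character operator is tried first.
CLAUSE_RE = re.compile(r"^(\w+)(!=|~|=)(.*)$")


def build_search_url(base, clauses):
    """Serialize (field, operator, value) triples into a single q parameter."""
    for _, op, value in clauses:
        if op not in OPERATORS:
            raise ValueError(f"unsupported operator: {op}")
        if ";" in value:
            raise ValueError("';' is the clause separator in this sketch")
    q = ";".join(f"{field}{op}{value}" for field, op, value in clauses)
    return f"{base}?{urlencode({'q': q})}"


def parse_search_url(url):
    """Server side: recover the triples from the q parameter."""
    q = parse_qs(urlsplit(url).query)["q"][0]
    clauses = []
    for part in q.split(";"):
        match = CLAUSE_RE.match(part)
        if match is None:
            raise ValueError(f"malformed clause: {part}")
        clauses.append(match.groups())
    return clauses


url = build_search_url(
    "/blah",
    [("author", "~", "tolkien"), ("booktitle", "!=", "The Hobbit")],
)
print(url)
print(parse_search_url(url))
```

Note that a client still has to know this syntax in advance, which is exactly the coupling problem described above.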

Related

How to use "Like" where predicate in commerce tools

I am using commercetools and I want to fetch data for partially matching results.
For example: I need to fetch "keyword" by giving only "key", the same way we do with LIKE in SQL.
Does anyone know of any such query predicate in commercetools?
Thanks in advance :)
The commercetools query predicates do not support substring matches today (Feb 2019). You can do language-aware full text search over Products and Categories, which would match certain types of substrings (grammatical variations or parts of compound words in languages like German), but not the exact behaviour that you know from SQL LIKE '%key%'.
It basically means you have to plan ahead relatively well in what form of data you need on a resource for your use cases. Or check whether there is an explicit feature for your use case that helps you out.

Solr multilingual search

I'm currently working on a project where we have indexed text content in Solr. Each piece of content is written in one specific language (we have 4 different European languages), but we would like to add a feature so that if the primary search (the search text entered by the user) doesn't return many results, we then try to look for documents in the other languages. Thus we would somehow need to translate the query.
Our starting point is that we can have a mapping list of translated words commonly used in the field of the project.
One solution that came to me was to use the synonym search feature. But this might not provide the best results.
Does anyone have pointers to existing modules that could help us achieve this multilingual search feature? Or design ideas we could try to investigate?
Thanks
It seems like multi-lingual search is not a unique problem.
Please take a look
http://lucene.472066.n3.nabble.com/Multilingual-Search-td484201.html
and
Solr index and search multilingual data
Those two links suggest having dedicated fields for each language, but you can also have a single field that states the language and add a filter query (&fq=) for the language you have detected from the user's query. I think this is the more scalable solution.
One option would be to translate your terms at index time. This could probably be done at the Solr level, or even before Solr at the application level. You would then store the translated texts in different fields, so you would have fields like:
text_en: "Hello",
text_fi: "Hei"
Then you can just query text_en:Hello and it would match.
And if you want to score primary language matches higher, you could have a primary_language field and then boost documents where it matches the search language higher.
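Here is a rough Python sketch of how the request parameters for these two approaches could be put together. The field names text_en / text_fi and primary_language come from the answers above; the single text and language fields, the function names and the boost value are assumptions for illustration, and the boost relies on the eDisMax parser's bq (boost query) parameter.

```python
from urllib.parse import urlencode


def build_multilingual_params(term, search_lang, boost=2.0):
    """Query the per-language field and boost documents whose primary
    language matches the search language (eDisMax boost query, bq)."""
    escaped = term.replace('"', '\\"')
    return {
        "defType": "edismax",
        "q": f'text_{search_lang}:"{escaped}"',
        "bq": f"primary_language:{search_lang}^{boost}",
    }


def build_language_filter_params(term, detected_lang):
    """Alternative: one text field plus a language field (both names are
    hypothetical), restricted with a filter query (fq)."""
    escaped = term.replace('"', '\\"')
    return {"q": f'text:"{escaped}"', "fq": f"language:{detected_lang}"}


# urlencode takes care of escaping ':', '"' and '^' in the values.
print(urlencode(build_multilingual_params("Hello", "en")))
print(urlencode(build_language_filter_params("Hei", "fi")))
```

For the fallback described in the question, the application could run the primary-language query first and only issue the queries for the other languages when the result count is too low.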

Solr AND operator

I have a problem getting the right results with my SOLR query. Basically, let's say I want all documents in English containing the string "toto".
http://127.0.0.1:8080/solr-webservice/query/?q=iso_lang_cd:en&ctnt_val:*toto*
The problem is that this query sends me all documents in English AND all documents containing toto.
What I need is to get all documents that are in English AND contain toto. How could I achieve this? I'd think this is the standard use of the AND operator...
Actually OR is the default query operator for Solr, and your query is not formatted in such a way as to force an AND operation. In order to achieve the AND behavior you could specify your query in one of the following formats:
+iso_lang_cd:en +ctnt_val:*toto*
iso_lang_cd:en && ctnt_val:*toto*
Or you can optionally pass q.op=AND to force an AND operation. Additionally, you might want to consider using filter queries (the fq parameter), where you could filter on the language. Filter queries can bring some performance improvements, since their results are cached independently of the main query, but please refer to the documentation for more details.
q=ctnt_val:*toto*&fq=iso_lang_cd:en
Please see The Standard Query Parser for more details and a good overview of querying.
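If you build the URL from code, it is worth letting a library do the encoding, because a raw & in a URL separates request parameters and would cut the q value short. A minimal Python sketch, assuming the endpoint from the question (the function names are made up for this example):

```python
from urllib.parse import urlencode

# Endpoint taken from the question; adjust to your own Solr handler.
BASE_URL = "http://127.0.0.1:8080/solr-webservice/query/"


def and_query_url(lang, text):
    """Both conditions in q, joined explicitly with AND."""
    q = f"iso_lang_cd:{lang} AND ctnt_val:*{text}*"
    return f"{BASE_URL}?{urlencode({'q': q})}"


def filter_query_url(lang, text):
    """Main query on the content, language restriction as a filter query."""
    params = {"q": f"ctnt_val:*{text}*", "fq": f"iso_lang_cd:{lang}"}
    return f"{BASE_URL}?{urlencode(params)}"


print(and_query_url("en", "toto"))
print(filter_query_url("en", "toto"))
```

Both URLs ask for documents that are in English and contain toto; the second one keeps the language restriction out of the relevance scoring.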

Wildcard search in cassandra database

I want to know if there is any way to perform wildcard searches in cassandra database.
e.g.
select KEY,username,password from User where username='*hello*';
Or
select KEY,username,password from User where username='%hello%';
something like this.
There is no native way to perform such queries in Cassandra. Typical options to achieve the same are
a) Maintain an index yourself on likely search terms. For example, whenever you are inserting an entry which has hello in the username, insert an entry in the index column family with hello as the key and the column value as the key of your data entry. While querying, query the index CF and then fetch data from your data CF. Of course, this is pretty restrictive in nature but can be useful for some basic needs.
b) A better bet is to use a full text search engine. Take a look at Solandra (https://github.com/tjake/Solandra) or DataStax Enterprise (http://www.datastax.com/products/enterprise).
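Option (a) can be sketched roughly as follows in Python. Plain dictionaries stand in for the data column family and the index column family, and the list of search terms and the function names are assumptions for illustration; a real implementation would issue the corresponding writes and reads against Cassandra.

```python
from collections import defaultdict

# Plain dicts stand in for the two column families in this sketch.
data_cf = {}                   # row key -> user record
index_cf = defaultdict(set)    # search term -> row keys of matching users

# Hypothetical list of terms you expect users to search for.
SEARCH_TERMS = {"hello", "admin"}


def insert_user(key, username, password):
    """Write the row, then add it to the index for every term it contains."""
    data_cf[key] = {"username": username, "password": password}
    lowered = username.lower()
    for term in SEARCH_TERMS:
        if term in lowered:
            index_cf[term].add(key)


def find_users(term):
    """Query the index first, then fetch the rows from the data CF."""
    keys = index_cf.get(term.lower(), set())
    return [data_cf[k] for k in sorted(keys)]


insert_user("u1", "say_hello_42", "secret1")
insert_user("u2", "bob", "secret2")
print(find_users("hello"))
```

As noted above, this only finds terms you decided to index in advance; a search for anything outside SEARCH_TERMS returns nothing.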
This project also looks promising
http://tuplejump.github.io/stargate/
I have not looked deeply at it recently, but when I last evaluated it, it looked promising.

How to implement an Enterprise Search

We are searching disparate data sources in our company. We have information in multiple databases that need to be searched from our Intranet. Initial experiments with Full Text Search (FTS) proved disappointing. We've implemented a custom search engine that works very well for our purposes. However, we want to make sure we are doing "the right thing" and aren't missing any great tools that would make our job easier.
What we need:
Column search
    ability to search by column
    we flag which columns in a table are searchable
Keep some relation between db column and data
    we provide advanced filtering on the results
    facilitates (Amazon-style) filtering
    filter provided by grouping of results and allowing the user to filter them via a checkbox
    this is a great feature, users like it very much
Partial Word Match
    we have a lot of unique identifiers (product id, etc.)
    the unique ids can have sub-parts with meaning (location, etc.)
    or only a portion may be available (when the user is searching)
    or (by a decidedly poor design decision) there may be white space in the id
    this is a major feature that we've implemented now via CHARINDEX (MSSQL) and INSTR (Oracle)
        using the char index functions turned out to give equivalent performance (+/-) on MSSQL compared to full text
        didn't test on Oracle
        however searches against both types of db are very fast
We take advantage of Indexed (MSSQL) and Materialized (Oracle) views to increase speed
    this is a huge win, Oracle Materialized views are better than MSSQL Indexed views
    both provide speedups in read-only join situations (like a search combining company and product)
A search that matches user expectations of the paradigm CTRL-f -> enter text -> find matches
    Full Text Search is not the best in this area (slow and inconsistent matching)
    partial matching (see "Partial Word Match")
Nice to have:
Search database in real time
    skip the indexing step, this is not a hard requirement
Spelling suggestion
    Xapian has this http://xapian.org/docs/spelling.html
    Similar to Google's "Did you mean:"
What we don't need:
We don't need to index documents
    at this point searching our data sources is the most important thing
    even when we do search documents, we will be looking for partial word matching, etc.
Ranking
    Our own simple ranking algorithm has proven much better than an FTS equivalent.
    Users understand it, we understand it, it's almost always relevant.
Stemming
    Just don't need to get [run|ran|running]
Advanced search operators
    phrase matching, or/and, etc.
    according to Jakob Nielsen http://www.useit.com/alertbox/20010513.html
        most users are using simple search phrases
        very few use advanced searches (when they are available)
    also in Information Architecture 3rd edition Page 185
        "few users take advantage of them [advanced search functions]"
        http://oreilly.com/catalog/9780596000356
    our Amazon-like filtering allows better filtering anyway (via user testing)
Full Text Search
    We've found that results don't always "make sense" to the user
    Searching with FTS is hard to tune (which set of operators matches the users' expectations)
    Advanced search operators are a no go
        we don't need them because
        users don't understand them
    Performance has been very close (+/-) to the char index functions
        but the results are sometimes just "weird"
The question:
Is there a solution that allows us to keep the key value pair "filtering feature", offers the column specific matching, partial word matching and the rest of the features, without the pain of full text search?
I'm open to any suggestion. I've wondered if a document/hash table nosql data store (MongoDB, et al) might be of use? ( http://www.mongodb.org/display/DOCS/Full+Text+Search+in+Mongo ). Any experience with these is appreciated.
Again, just making sure we aren't missing something with our in-house customized version. If there is something "off the shelf" I would be interested in it. Or if you've built something from some components, what components (search engines, data stores, etc) did you use and why?
You can also make your point for FTS. Just make sure it meets the requirements above before you say "just use Full Text Search because that's the only tool we have."
I ended up coding my own.
The results are fantastic. Users like it, it works well with our existing technologies.
It really wasn't that hard. Just took some time.
Features:
Faceted search (amazon, walmart, etc)
Partial word search (the real stuff not full text)
Search databases (oracle, sql server, etc) and non database sources
Integrates well with our existing environment
Maintains relations, so I can have an n-to-n search and display
--> this means I can display child records of a master record in search results
--> also I can search any child field and return the master record
It's really amazing what you can do with dictionaries and a lot of memory.
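To give an idea of what "dictionaries and a lot of memory" can look like, here is a small Python sketch of partial word matching plus facet counts over rows held in memory. This is not the implementation described above, only an illustration under assumptions: the sample rows, the column names and all function names are made up.

```python
from collections import Counter

# Hypothetical rows, e.g. loaded from Oracle / SQL Server into memory.
ROWS = [
    {"product_id": "NY 1234-A", "company": "Acme", "category": "Tools"},
    {"product_id": "LA-1234-B", "company": "Acme", "category": "Paint"},
    {"product_id": "NY-9876-C", "company": "Globex", "category": "Tools"},
]
SEARCHABLE = ("product_id", "company")   # columns flagged as searchable
FACETS = ("company", "category")         # columns offered as filters


def normalize(text):
    """Lower-case and drop whitespace so 'ny 1234' matches 'NY 1234-A'."""
    return "".join(text.lower().split())


def search(term, filters=None):
    """Substring match on searchable columns, then apply facet filters."""
    needle = normalize(term)
    hits = [
        row for row in ROWS
        if any(needle in normalize(row[col]) for col in SEARCHABLE)
    ]
    for col, value in (filters or {}).items():
        hits = [row for row in hits if row[col] == value]
    return hits


def facet_counts(hits):
    """Count values per facet column, for checkbox-style filtering."""
    return {col: Counter(row[col] for row in hits) for col in FACETS}


hits = search("1234")
print(hits)
print(facet_counts(hits))
```

A linear scan like this is only reasonable for modest data sizes; for more rows the same idea would need a precomputed lookup structure, but the matching and faceting logic stays the same.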
I recommend looking into Solr, I believe it will meet your needs:
http://lucene.apache.org/solr/
For an off-the-shelf solution: Have you checked out the Google Search Appliance?
Quote from the Google Mini/GSA site:
... If direct database indexing is a requirement for you, we encourage you to consider the Google Search Appliance, which has direct database connectivity.
And of course it indexes everything else in the Googly manner you'd expect it to.
Apache Solr is a good way to start your project, and it is open source. You can also try Elasticsearch, and there are a lot of off-the-shelf products which offer good customization abilities and search features, such as Coveo, SharePoint FAST, Google ...

Resources