Apply pagination using Sunburnt highlighted search - solr

I am using the Sunburnt Python API for Solr search. Highlighted search in Sunburnt works fine.
I am using the following code:
search_record = solrconn.query(search_text).highlight("content").highlight("title")
records = search_record.execute().highlighting
The problem is that it returns only 10 records. I know this can be changed in solrconfig.xml, but I want all records.
I want to apply pagination to Sunburnt's highlighted search.

Given the SOLR-534 issue, which is still unresolved, you can't tell Solr to give you all results, but you can use a really high rows parameter depending on how many documents you expect to have in your index. I don't know anything about sunburnt but I believe something like this should work:
search_record = solrconn.query(search_text).paginate(rows=10000).highlight("content").highlight("title")
You just have to replace the rows value with something large enough for your index size.
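Instead of one very large rows value, the same paginate call can also walk through the results one page at a time. This is a minimal sketch, assuming sunburnt's paginate accepts start and rows keyword arguments (the line above only shows rows). page_window and highlighted_page are hypothetical helper names, and solrconn is the connection from the question:

```python
def page_window(page, per_page=10):
    """Translate a 1-based page number into Solr's start/rows window."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {"start": (page - 1) * per_page, "rows": per_page}


def highlighted_page(solrconn, search_text, page, per_page=10):
    # Assumption: the sunburnt query object accepts paginate(start=..., rows=...).
    query = (solrconn.query(search_text)
             .paginate(**page_window(page, per_page))
             .highlight("content")
             .highlight("title"))
    # Same attribute the question reads from the executed query.
    return query.execute().highlighting
```

Calling highlighted_page(solrconn, search_text, page=2) would then request records 11 to 20 with their highlighting.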

The general approach to this is to use a paginator:
from django.core.paginator import Paginator
paginator = Paginator(si.query("black"), 30)
Once that's done, you can just paginate through everything:
for result in paginator.object_list:
    print(result)

Related

Get each item in a collection with one query

I have a collection of slugs and want to get each corresponding page with one query.
Something like ...
Page::whereIn('slug', $slugs)->get();
... does only return the first page matching any slug in the collection.
Currently there is a loop, but that means dozens of queries, which I want to avoid.
Try using the whereRaw method and imploding your array into a string:
Page::whereRaw('slug IN ("' . $slugs->implode('","') . '")')->get();
As it turned out, whereIn was the right way. There was a minor mistake in my logic and, at the same time, insufficient seed data, which together blew everything up.
In case someone does not know: whereRaw should be used with caution. To avoid SQL injection vulnerabilities, all user-submitted values have to be passed as bound parameters.
Page::whereRaw('slug IN (?)', [$slug]);
Beware: wrapping ? in quotes is a syntax error. The passed data will be single-quoted by default, at least on my machine™.
select * from `pages` where `slug` in ('page');
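A single ? binds exactly one value, which is why the generated SQL above contains only 'page'. For a whole list, one placeholder per value is needed. The sketch below shows that idea with Python's built-in sqlite3 module rather than Laravel; the pages table, its rows, and the pages_by_slug name are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (slug TEXT PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [("home", "Home"), ("about", "About"), ("contact", "Contact")],
)


def pages_by_slug(conn, slugs):
    """Fetch all pages whose slug is in `slugs` with a single query."""
    slugs = list(slugs)
    if not slugs:
        return []  # an empty IN () list is not portable SQL, so short-circuit
    placeholders = ", ".join("?" for _ in slugs)  # one "?" per value
    sql = ("SELECT slug, title FROM pages "
           "WHERE slug IN (%s) ORDER BY slug" % placeholders)
    # Only the placeholders are interpolated; the values stay bound parameters.
    return conn.execute(sql, slugs).fetchall()


found = pages_by_slug(conn, ["home", "contact", "missing"])
```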

peewee get_or_create and then save: error binding

Is there an easy way to update a field on a get of a get_or_create?
I have a class ItemCategory and I want to either create a new entry or get the already created entry and update a field (update_date).
What I do is:
item, created = ItemCategory.get_or_create(cat=cat_id, item=item_id)
if created == True:
    print("created")
else:
    item.update_date = datetime.now
    item.somethingelse = 'aaa'
    item.save()
This works for a while in my loop. But after 50-100 get/create it crashes:
peewee.InterfaceError: Error binding parameter 4 - probably unsupported type.
Maybe I should use upsert(). I tried but wasn't able to get anything working. It's also probably not the best solution, since it replaces the whole row instead of just one field.
I like peewee, it's very easy and fast to use, but I can't find many full examples, and that's a pity.
Newbie mistake: I assigned the function datetime.now instead of calling it. It should be:
item.update_date = datetime.now()
I am not 100% sure this is the only answer though. I modified my code so many times that it might be also something else.
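The error message fits that diagnosis: datetime.now without parentheses is a function object, which the database driver cannot bind as a parameter. A small sketch using Python's built-in sqlite3 module directly (not peewee; the item table and try_insert are hypothetical) reproduces the same kind of failure:

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, update_date TEXT)")


def try_insert(value):
    """Return True if sqlite3 can bind `value` as a query parameter."""
    try:
        conn.execute("INSERT INTO item (update_date) VALUES (?)", (value,))
        return True
    except sqlite3.Error:
        # The exact exception class differs between Python versions
        # (InterfaceError or ProgrammingError); both derive from sqlite3.Error.
        return False


bound_function = try_insert(datetime.now)             # the function itself
bound_value = try_insert(datetime.now().isoformat())  # the function's result
```

Here bound_function ends up False and bound_value True, so only the called version reaches the database.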
Regarding my question about create_or_update, I've done this:
try:
    Item.create(...)
except IntegrityError:
    Item.update(...)
peewee is really great, I wonder why no one ever asked for a create_or_update.
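For reference, here is the same try/except pattern sketched against Python's built-in sqlite3 module instead of peewee, so that it runs standalone. The item_category table, its columns, and the create_or_update name are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_category ("
    " cat INTEGER, item INTEGER, update_date TEXT,"
    " UNIQUE (cat, item))"
)


def create_or_update(conn, cat, item, update_date):
    """Insert the row; if (cat, item) already exists, update only update_date."""
    try:
        conn.execute(
            "INSERT INTO item_category (cat, item, update_date) VALUES (?, ?, ?)",
            (cat, item, update_date),
        )
        return True   # a new row was created
    except sqlite3.IntegrityError:
        # The UNIQUE constraint fired, so the row exists: touch one field only.
        conn.execute(
            "UPDATE item_category SET update_date = ? WHERE cat = ? AND item = ?",
            (update_date, cat, item),
        )
        return False  # an existing row was updated


first = create_or_update(conn, 1, 7, "2015-01-01")
second = create_or_update(conn, 1, 7, "2015-02-01")
rows = conn.execute("SELECT cat, item, update_date FROM item_category").fetchall()
```

Unlike a replace-style upsert, the UPDATE branch leaves every other column of the existing row untouched.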

Solr find all ids that start with certain path

I have a number of IDs that look like this:
/content/myProject/path1
/content/myProject/path1/page1
/content/myProject/path2
Now, I want to find all the children of path1, so I query for /content/myProject/path1/*.
The problem is that I also receive /content/myProject/path2. How do I write a correct query?
Thanks,
Peter
Sounds like you are using a generic text tokenizer definition. You may want to look at the PathHierarchyTokenizer instead. It's designed to split at the path prefixes, and then you will not need the * at the end.
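To see why this helps, here is a rough sketch of what a path hierarchy tokenizer emits. It is plain Python imitating the behaviour, not Solr code, and path_prefixes is a made-up name. Each ID is indexed under every one of its path prefixes, so an exact match on /content/myProject/path1 finds that page and its descendants, but not path2:

```python
def path_prefixes(path, delimiter="/"):
    """Return every prefix of `path` that ends at a delimiter boundary."""
    parts = [p for p in path.split(delimiter) if p]
    return [delimiter + delimiter.join(parts[:i])
            for i in range(1, len(parts) + 1)]


ids = [
    "/content/myProject/path1",
    "/content/myProject/path1/page1",
    "/content/myProject/path2",
]

# Toy inverted index: token -> set of IDs whose token stream contains it.
index = {}
for doc_id in ids:
    for token in path_prefixes(doc_id):
        index.setdefault(token, set()).add(doc_id)

parent = "/content/myProject/path1"
matches = sorted(index.get(parent, set()))
# The parent matches its own full path, so drop it to keep only children.
children = [m for m in matches if m != parent]
```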

Paging with reverse cursors in appengine

I am trying to get forward and backwards pagination working for a query I have on my app.
I have started with the example at: https://developers.google.com/appengine/docs/python/ndb/queries#cursors
I would expect that example to do a typical forward/back pagination to create cursors that you can pass to your template in order to be used in a subsequent request for the page after/before the current one. But what it is doing is getting cursors for the same page, one from the beginning and the other from the end (if I have understood correctly).
What I want is a cursor to the beginning of the following page, and a cursor to the beginning of the previous page, to use in my UI.
I have managed to almost get that with the following code, based on the mentioned example:
curs = Cursor(urlsafe=self.request.get('cur'))
q = MyModel.query(MyModel.usett == usett_key)
q_forward = q.order(-MyModel.sugerida)
q_reverse = q.order(MyModel.sugerida)
ofus, next_curs, more = q_forward.fetch_page(num_items_page,
                                             start_cursor=curs)
rev_cursor = curs.reversed()
ofus1, prev_curs, more1 = q_reverse.fetch_page(num_items_page,
                                               start_cursor=rev_cursor)
context = {}
if more and next_curs:
    context['next_curs'] = next_curs.urlsafe()
if more1 and prev_curs:
    context['prev_curs'] = prev_curs.reversed().urlsafe()
The problem, and the point of this question, is that I use more and more1 to see if there is a next page, and that is not working in the backward direction. For the first page, more1 is True, for the second page more1 is False, and subsequent pages give True.
I would need something that gives False for the first page and True for every other page. It seems like this more return value is the thing to use, but maybe my Query setup is wrong, or something else is.
Thanks everyone!
Edit: Since I didn't find a simple solution for this, I switched to using ndbpager.
There's no such thing.
You know that there is (at least) one page before the current page if you started the query with a cursor (the first page usually doesn't have a cursor).
A common trick to access the previous page is inverting the sort-order.
If you have a list sorted by creationdate desc, you could take the creationdate of the first element of your current page and query for elements with creationdate > this creationdate using the inverted (ascending) sort order. This will return the oldest elements which are newer than the given creationdate. Flip the list of retrieved elements (to bring them into the correct order again) and there you have the elements of the page before, without using a cursor.
Note: this requires the values of your sort order to be distinct.
In some cases, it's also possible to use a prebuilt index allowing random access to different pages; see https://bitbucket.org/viur/server/src/98de79b91778bb9b16e520acb28e257b21091790/indexes.py for more.
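The inverted sort order trick can be sketched on an in-memory list. This is plain Python rather than an ndb query; previous_page and the created key are hypothetical names, and the values are assumed to be distinct, as the note above requires:

```python
def previous_page(rows, first_created, page_size):
    """Return the page shown before the one whose first element has
    `first_created`, for a listing sorted by 'created' descending."""
    # Elements newer than the first element of the current page.
    newer = [r for r in rows if r["created"] > first_created]
    # Inverted (ascending) order puts the closest newer elements first.
    newer.sort(key=lambda r: r["created"])
    page = newer[:page_size]
    # Flip back so the page is in the listing's descending order again.
    page.reverse()
    return page


rows = [{"created": n} for n in range(1, 11)]  # created = 1..10
# With 3 items per page the listing is [10, 9, 8], [7, 6, 5], [4, 3, 2], [1].
before_third = previous_page(rows, first_created=4, page_size=3)
before_first = previous_page(rows, first_created=10, page_size=3)
```

An empty result, as for before_first, is also a direct signal that there is no previous page.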
I have a workaround, though it is not the best solution: it basically redirects back to the previous page.
Previous
I think PagedQuery has the capability, but I am still waiting for someone to post a more comprehensive tutorial about it.

How to Increase/Configure Snippet size of a highlight?

I want to know how to configure the snippet size (number of words or characters) in highlighting. Currently I am facing a problem: sometimes Solr gives me a snippet containing exactly the matched word. For example, if I query Solr with "Contents:risk" using SolrNet, the highlighting snippet is exactly "risk", with no additional characters or words. I get the same result from the Solr admin page.
I'm not quite familiar with highlighting features but I believe this is done with the hl.fragsize parameter.
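As a rough illustration of where hl.fragsize goes, the sketch below builds the request parameters for a plain HTTP query to Solr. The host, the core name mycore, the value 200, and the highlight_params helper are assumptions for illustration; only q, hl, hl.fl and hl.fragsize are Solr parameter names:

```python
from urllib.parse import urlencode


def highlight_params(query, field, fragsize=200):
    """Build Solr request parameters that ask for larger highlight snippets."""
    return {
        "q": query,
        "hl": "true",             # turn highlighting on
        "hl.fl": field,           # field to generate snippets for
        "hl.fragsize": fragsize,  # approximate snippet size in characters
    }


params = highlight_params("Contents:risk", "Contents", fragsize=200)
# Hypothetical host and core name; adjust to your installation.
url = "http://localhost:8983/solr/mycore/select?" + urlencode(params)
```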
Mauricio already answered and this is a bit of an old thread, but just to add, the solution using SolrNet would be:
Create a new Highlighting parameters object.
Set fragsize
Other parameters are possible
Highlighting documentation can be found here: Highlighting.md
Here is a sample code:
private HighlightingParameters SetHighLightSnippetParameters()
{
    return new HighlightingParameters
    {
        Fragsize = SearchConstants.SnippetSize
    };
}

Resources