Batch collecting in Solr PostFilter

What I want to do is exactly like this:
Solr: How to perform a batch request to an external system from a PostFilter?
and the approach I took is similar:
- don't call super.collect(docId) in the collect() method of the PostFilter, but store all docIds in an internal map
- call the external system in finish(), then call super.collect(docId) for all the docs that pass the external filtering
The problem I have: docId exceeds maxDoc ("docID must be >= 0 and < maxDoc=100000 (got docID=123456)").
I suspect I am storing local docIds, and when the reader changes, docBase changes too, so the global docId, which I believe is constructed in super.collect(docId) from the parameter docId and docBase, becomes incorrect. I've tried storing super.delegate.getLeafCollector(context) along with the docId and calling super.delegate.getLeafCollector(context).collect() instead of super.collect(), but this doesn't work either (I got a NullPointerException).

Look at the code for the CollapsingQParserPlugin in the Solr codebase, particularly CollapsingScoreCollector.finish.
The docIds you receive in the collect() call are not globally unique. The Collapsing collector makes them unique by adding the docBase from the context to the local docId to create a globalDoc during the collect() phase.
Then in the finish() phase, you must find the context containing the doc in question and set the reader/leafDelegate, depending on which version of Solr you're running. Specifying the right docId with the wrong context will throw exceptions. In the Collapsing collector, you iterate through the contexts until you find the first docBase smaller than the globalDoc.
Finally, if you added docBase in collect(), don't forget to subtract docBase in finish() when you call collect() on the appropriate DelegatingCollector object, as the author may or may not have done the first time.
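Putting that together, here is a minimal sketch of the pattern (a DelegatingCollector subclass; callExternalSystem() is a hypothetical stand-in for your batched call, and the exact hooks vary a little between Solr versions):

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.ReaderUtil;
import org.apache.lucene.search.LeafCollector;
import org.apache.solr.search.DelegatingCollector;

public class BatchingCollector extends DelegatingCollector {
    private final List<LeafReaderContext> leaves = new ArrayList<>();
    private final List<Integer> globalDocs = new ArrayList<>();
    private int docBase;

    @Override
    public void doSetNextReader(LeafReaderContext context) throws IOException {
        super.doSetNextReader(context);
        leaves.add(context);
        docBase = context.docBase; // segment offset for this reader
    }

    @Override
    public void collect(int doc) throws IOException {
        globalDocs.add(docBase + doc); // store GLOBAL ids, not segment-local ones
    }

    @Override
    public void finish() throws IOException {
        // Hypothetical single batched call to the external system.
        List<Integer> passing = callExternalSystem(globalDocs);
        Collections.sort(passing); // ascending, so we walk the leaves forward

        int currentLeaf = -1;
        LeafCollector leafDelegate = null;
        for (int globalDoc : passing) {
            int leaf = ReaderUtil.subIndex(globalDoc, leaves);
            if (leaf != currentLeaf) {
                currentLeaf = leaf;
                leafDelegate = delegate.getLeafCollector(leaves.get(leaf));
            }
            // Subtract docBase again: the delegate expects segment-local ids.
            leafDelegate.collect(globalDoc - leaves.get(leaf).docBase);
        }
        if (delegate instanceof DelegatingCollector) {
            ((DelegatingCollector) delegate).finish();
        }
    }

    private List<Integer> callExternalSystem(List<Integer> docs) {
        return docs; // placeholder: call your external service and filter here
    }
}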


How to find a MoveTo destination filled by database?

I could use some help with an AnyLogic model.
Model (in short): a manufacturing scenario where orders move along individual routes. The workplaces (WP) are created dynamically at simulation start; their names, quantity, and other parameters are stored in a database (Excel import). The orders are also created according to an import. The agent population "order" has a collection routing which contains the workplaces it has to stop at, in that specific order.
Target: I want a moveTo block in Main which finds the next destination of the order agent.
Problem and solution paths:
I set the destination type to agent, and in the Agent field I call a function agent.getDestination(). This function lives in order and returns the next entry of the collection: WP destinationName = routing.get(i). With this I get a datatype error (at runtime, not at compile time). I guess it's because the database does not save the entries as the WP type but only as String.
Is there a possibility to create a collection with agents from an Excel file?
After this I tried to use the same getDestination as a String, and then find via findFirst the WP matching the returned name and return it as a WP: WP targetWP = findFirst(wps, w -> w.name == destinationName);
Of course wps (the population of workplaces) couldn't be found.
How can I search the population?
Maybe with an AgentLink?
I think it is not that difficult, but I can't find an answer or a solution. As you can tell, I'm a beginner... I hope the description is good and someone can help me or give me a hint :)
Thanks
Is there a possibility to create a collection with agents from an Excel file?
Not directly using the collection's properties and, as you've seen, you can't have database (DB) column types which are agent types.[1]
But this is relatively simple to do directly via Java code (and you can use the Insert Database Query wizard to construct the skeleton code for you).
After this I tried to use the same getDestination as a String, and then find via findFirst the WP matching the returned name and return it as a WP
Yes, this is one approach. If your order details are in Excel/the database, they are presumably referring to workplaces via some String ID (which will be a parameter of the workplace agents you've created from a separate Excel worksheet/database table). You need to use the Java equals method to compare strings though, not == (which is for comparing numbers or whether two objects are the same object).
I want a moveTo block in Main which finds the next destination of the order agent
So the general overall solution is:
- Create a population of Workplace agents (let's say called workplaces in Main) from the DB, each with a String parameter id or similar mapped from a DB column.
- Create a population of Order agents (let's say called orders in Main) from the DB and then, in their on-startup action, set up their collection of workplace IDs (type ArrayList, element class String; let's say called workplaceIDsList) using data from another DB table.
- Order probably also needs a working variable storing the next index in the list that it needs to go to (so let's say an int variable nextWorkplaceIndex which starts at 0).
- Write a function in Main called getWorkplaceByID that has a single String argument id and returns a Workplace. This gets the workplace from the population that matches the ID; a one-line way similar to yours is findFirst(workplaces, w -> w.id.equals(id)) (see the sketch below).
- The MoveTo block (which I presume is in Main) needs to move the Order to an agent defined by getWorkplaceByID(agent.workplaceIDsList.get(nextWorkplaceIndex++)). (The ++ bit increments the index after evaluating the expression, so it is ready for the next workplace to go to.)
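As a sketch, that function body in Main could be as simple as the following (workplaces and id being the names assumed in the steps above):

Workplace getWorkplaceByID(String id) {
    // Return the first workplace whose id parameter matches, or null if none does.
    return findFirst(workplaces, w -> w.id.equals(id));
}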
For populating the collection, you'd have two tables, something like the below (assuming string IDs for workplaces and orders):
orders table: columns for parameters of your orders (including some String id column) other than the workplace-list. (Create one Order agent per row.)
order_workplaces table: columns order_id, sequence_num and workplace_id (so with multiple rows specifying the sequence of workplace IDs for an order ID).
In the On startup action of Order, set up the skeleton query code via the Insert Database Query wizard as below (where we want to loop through all rows for this order's ID and do something --- we'll change the skeleton code to add entries to the collection instead of just printing stuff via traceln like the skeleton code does).
Then we edit the skeleton code to look like the below. (Note we add an orderBy clause to the initial query so we ensure we get the rows in ascending sequence number order.)
List<Tuple> rows = selectFrom(order_workplaces)
    .where(order_workplaces.order_id.eq(id))
    .orderBy(order_workplaces.sequence_num.asc())
    .list();

for (Tuple row : rows) {
    workplaceIDsList.add(row.get(order_workplaces.workplace_id));
}
[1] The AnyLogic database is a normal relational database --- HSQLDB in fact --- and databases only understand their own specific data types like VARCHAR, with AnyLogic and the libraries it uses translating these to Java types like String. In the user interface, AnyLogic makes it look like you set the column types as int, String, etc., but these are really the Java types that the columns' contents will ultimately be translated into.
AnyLogic does support columns which have option list types (and the special Code type column for columns containing executable Java code), but these are special cases using special logic under the covers to translate the column data (which is ultimately still a string of characters) into the appropriate option list instance or (for Code columns) into compiled-on-the-fly-and-then-executed Java.
Welcome to Stack Overflow :) To create a population via Excel import, you have to create a function and call code like this. You also need an empty population.
int n = excelFile.getLastRowNum(YOUR_SHEET_NAME);
for (int i = FIRST_ROW; i <= n; i++) {
    String name = excelFile.getCellStringValue(YOUR_SHEET_NAME, i, 1);
    double SEC_PARAMETER_TO_READ = excelFile.getCellNumericValue(YOUR_SHEET_NAME, i, 2);
    WP workplace = add_wps(name, SEC_PARAMETER_TO_READ);
}
Now, if you want to get a workplace by name, you have to create a function similar to your attempt.
Function body:
WP workplaceToFind = findFirst(wps, w -> w.name.equals(destinationName));
if (workplaceToFind != null) {
    // do whatever you want with it
}

How to send an Email notification by day end using Rules with all the nodes published that day?

I am trying to set up an email notification. The condition: it should go out at the end of the day with the list of content published that day.
I have tried a couple of things using Rules, but got stuck along the way.
Any help?
I tried using Rules, and I created a rule like so:
Events:
After updating existing content of type(content type name)
Cron maintenance tasks are performed
Condition: Data to compare: [node:field-img-status], Data value: Approve
When I try to add a second condition to check whether the node was published within the last 24 hours, I am unable to achieve it. When I add strtotime("-1 day"), I get an error like:
Wrong date format. Specify the date in the format 2017-05-10 08:17:18.
I tried date('Y-m-d h:i:s', strtotime("-1 day")) but I did not succeed.
Now I am trying one more method to achieve it, using Views Rules, as suggested in this answer to the question 'How to create a Drupal rule to check (on cron) a date field and if passed set field "status" to "ended"?'.
Below is a blueprint of how I'd get this to work ...
Step 1: Create a single eMail for each node that was published
Create a view (using Views) of all the nodes that were published the last 24 hours. Make sure to include a column in that view for the various data you want to be included about each node in your eMail later on.
Use Rules to create a rule with a Rules Action that consists of a "Rules Loop", in which the "list items" are the list of nodes that you want included in your eMail later on. To create this Rules Loop, use the Views Rules module combined with a Views display type of "Views Rules" for the view that you created. Refer to my answer to "How to pass arguments to a view from Rules?" for way more details on how to use the Views Rules module.
For each list item in the Rules Loop of the previous step, you have access to all data for each column in the View you created. By using these data you could add an additional Rules Action (within the same Rules Loop) to send an appropriate eMail about the node being processed.
Step 2: Group all eMails in a single eMail
Obviously, the previous step creates a single eMail for each node that was published in the last 24 hours. If you only have a few nodes, that may not be a real issue to worry about. But if you have dozens (or more?) of such nodes, then you might want to consider consolidating all those eMails into a single eMail which contains the complete list of nodes in its eMail body.
A possible solution to implement such consolidation is similar to what is shown in the Rules example included in my answer to "How to concatenate all token values of a list in a single field within a Rules loop?". In your case, you could make it work like so:
Add some new Rules variable that will be used later on as part of the eMail body, before the start of your loop. Say you name the variable nodes_list_var_for_email_body.
Within your loop, for each iteration, prepend or append the value for each "list item" to that variable nodes_list_var_for_email_body.
Move the Rules Action to send an eMail outside your loop, after the loop has completed, and fine-tune the details (configuration) of your (new) "send an eMail" Rules Action. When doing so, you'll be able to select the token for nodes_list_var_for_email_body to include anywhere in your eMail body.
Step 3: Schedule the daily execution of your rule
Use the Rules Once per Day module to schedule the daily execution of your rule. Refer to my answer to "How to limit the execution of a rule for sending an email to only run once in a day?" for way more details about this module.
Voilà, that's it ...
This is how I would achieve this:
Make some view which would list all nodes created today.
Make some endpoint in a custom module (check out hook_menu: https://api.drupal.org/api/drupal/modules%21system%21system.api.php/function/hook_menu/7.x).
It would call this view and grab that node list (e.g. with views_get_view_result: https://api.drupal.org/api/views/views.module/function/views_get_view_result/7.x-3.x), loop through the list, compose the email, and send it.
Then I would set a cron job to call that endpoint at the end of every day.
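As a hedged sketch of that custom-module approach (Drupal 7; the module name daily_digest, the view name nodes_published_today, and the row property node_title are all assumptions, so adjust them to your site):

/**
 * Implements hook_menu().
 */
function daily_digest_menu() {
  $items['daily-digest/send'] = array(
    'page callback' => 'daily_digest_send',
    'access callback' => TRUE, // lock this down in real code
    'type' => MENU_CALLBACK,
  );
  return $items;
}

/**
 * Page callback: grab the view's rows, compose the email, and send it.
 */
function daily_digest_send() {
  $rows = views_get_view_result('nodes_published_today', 'default');
  $lines = array();
  foreach ($rows as $row) {
    $lines[] = $row->node_title; // property name depends on your view's fields
  }
  $params['body'] = implode("\n", $lines);
  drupal_mail('daily_digest', 'digest', 'admin@example.com', language_default(), $params);
  return 'Digest sent.';
}

/**
 * Implements hook_mail().
 */
function daily_digest_mail($key, &$message, $params) {
  $message['subject'] = t('Content published today');
  $message['body'][] = $params['body'];
}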

BadArgumentError: _MultiQuery with cursors requires __key__ order in ndb

I can't understand what this error means, and apparently no one on the internet has ever had the same error:
BadArgumentError: _MultiQuery with cursors requires __key__ order
This happens here:
return SocialNotification.query().order(-SocialNotification.date).filter(SocialNotification.source_key.IN(nodes_list)).fetch_page(10)
The property source_key is obviously a key and nodes_list is a list of entity keys previously retrieved.
What I need is to find all the SocialNotifications that have a field source_key that match one of the keys in the list.
The error message tries to tell you that queries involving IN and cursors must be ordered by __key__ (which is the internal name for the key of the entity). This is needed so that the results can be properly merged and made unique. In this case you have to replace your .order() call with .order(SocialNotification._key).
It seems that this also happens when you filter on an inequality and try to fetch a page, e.g. MyModel.query(MyModel.prop != 'value').fetch_page(...). This basically means (unless I missed something) that you can't fetch_page when using an inequality filter, because on one hand you need the sort to be on MyModel.prop, but on the other hand you need it to be on MyModel._key, which is hard :)
I found the answer here: https://developers.google.com/appengine/docs/python/ndb/queries#cursors
You can change your query to the following in order to get this to work:
SocialNotification.query().order(-SocialNotification.date, SocialNotification.key).filter(SocialNotification.source_key.IN(nodes_list)).fetch_page(10)
Note that it seems to be slow (18 seconds) when nodes_list is large (1000 entities), at least on the development server. I don't have a large amount of test data on a test server.
You need both the property you want to order on and the key:
.order(-SocialNotification.date, SocialNotification.key)
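As a hedged illustration of the full pattern (the model definition and function name are assumptions; note that the ndb docs spell the key property SocialNotification._key when used in a sort order):

from google.appengine.ext import ndb

class SocialNotification(ndb.Model):
    date = ndb.DateTimeProperty()
    source_key = ndb.KeyProperty()

def notifications_page(nodes_list, cursor=None):
    # The trailing key in the sort order is what keeps _MultiQuery happy
    # when an IN filter is combined with cursors.
    query = (SocialNotification.query()
             .filter(SocialNotification.source_key.IN(nodes_list))
             .order(-SocialNotification.date, SocialNotification._key))
    results, next_cursor, more = query.fetch_page(10, start_cursor=cursor)
    return results, next_cursor, more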
I had the same error when filtering without a group.
The error occurred every time my filter returned more than one result.
To fix it I actually had to add ordering by key.

How to ignore errors in datastore.Query.GetAll()?

I just started developing a GAE app with the Go runtime, so far it's been a pleasure. However, I have encountered the following setback:
I am taking advantage of the flexibility that the datastore provides by having several different structs with different properties being saved with the same entity name ("Item"). The Go language datastore reference states that "the actual types passed do not have to match between Get and Put calls or even across different App Engine requests", since entities are actually just a series of properties, and can therefore be stored in an appropriate container type that can support them.
I need to query all of the entities stored under the entity name "Item" and encode them as JSON all at once. Using that entity property flexibility to my advantage, it is possible to store queried entities into an arbitrary datastore.PropertyList; however, the Get and GetAll functions return ErrFieldMismatch as an error when a property of the queried entities cannot be properly represented (that is to say, incompatible types, or simply a missing value). All of the structs I'm saving are user generated and most values are optional, so empty values end up being saved to the datastore. There are no problems while saving these structs with empty values (datastore flexibility again), but there are when retrieving them.
The datastore Go documentation also states that it is up to the caller of the Get methods to decide whether the errors returned due to empty values are ignorable, recoverable, or fatal. I would like to know how to properly do this, since just ignoring the errors won't suffice: the destination structs (datastore.PropertyList) of my queries are not filled at all when a query results in this error.
Thank you in advance, and sorry for the lengthy question.
Update: Here is some code
query := datastore.NewQuery("Item") // here I use some Filter calls, as well as a Limit call and an Order call
items := make([]datastore.PropertyList, 0)
_, err := query.GetAll(context, &items) // context has been defined before
if err != nil {
    // handle the error; in my case, print it and set the response status to 500
}
Update 2: Here is some output
If I use make([]datastore.PropertyList, 0), I get this:
datastore: invalid entity type
And if I use make(datastore.PropertyList, 0), I get this:
datastore: cannot load field "Foo" into a "datastore.Property": no such struct field
And in both cases (the first one I assume can be discarded) in items I get this:
[]
According to the following post, the Go datastore package doesn't support PropertyList yet.
Use a pointer to a slice of datastore.Map instead.
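For what it's worth, here is a hedged sketch against the newer cloud.google.com/go/datastore client, which does accept []datastore.PropertyList as a GetAll destination and returns ErrFieldMismatch alongside still-usable results (this is an assumption about the newer API, not the SDK the original answer referred to):

package main

import (
    "context"
    "log"

    "cloud.google.com/go/datastore"
)

func loadItems(ctx context.Context, client *datastore.Client) ([]datastore.PropertyList, error) {
    var items []datastore.PropertyList
    _, err := client.GetAll(ctx, datastore.NewQuery("Item"), &items)
    if err != nil {
        if _, ok := err.(*datastore.ErrFieldMismatch); ok {
            // Mismatched or missing properties only: items is still populated,
            // so treat this as ignorable, per the docs' guidance.
            log.Printf("ignoring field mismatch: %v", err)
        } else {
            return nil, err // anything else is fatal
        }
    }
    return items, nil
}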

Rename field using Objectify and Google App Engine

I am handling a case where we changed a field name in our entity. We have something like this, for example:
class Person {
    String name; // the original declaration was "String fullName"
}
According to Objectify, you have to use the annotation @AlsoLoad("fullName"). This is OK and it works: the Datastore doesn't actually delete the old data but writes a new field, so this annotation acts as a mapping between the old and the new field. There is no problem when you are reading the whole table.
The problem arises when you apply a filter to your query (suppose you made 5 objects with the old name and 5 with the new name). The result of your query depends on whether you used the old field name or the new one (it returns only the matching 5, never all 10). It won't fetch both sets and map them. Any suggestions for this problem? I hope I explained it in a clear way.
Thanks in advance
The simplest, most straightforward solution: fetch all the data with the @AlsoLoad annotation in place, then store it again. This way the data is saved under the new field name; the old one no longer exists, or at least no longer contains any data. It is like migrating the data from the old name to the new name. Does anyone have better suggestions?
If you've changed the name of your field, you need to load and re-put all your data (using the mapreduce API would be one option here). There's no magic way around this - the data you've stored exists with two different names on disk.
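As a hedged sketch of that load-and-re-put migration (Objectify-style API; entity registration, batching, and cursors are omitted, and the class/method names are made up):

import static com.googlecode.objectify.ObjectifyService.ofy;

import com.googlecode.objectify.annotation.AlsoLoad;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;

@Entity
class Person {
    @Id Long id;
    @AlsoLoad("fullName") String name; // loads old "fullName" data into "name"
}

class PersonMigration {
    static void migrateAll() {
        // Loading applies the @AlsoLoad mapping; saving writes only "name",
        // so re-putting every entity completes the rename.
        ofy().save().entities(ofy().load().type(Person.class).list()).now();
    }
}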
You can use @OldName (the equivalent annotation in older Objectify versions):
http://www.mail-archive.com/google-appengine-java@googlegroups.com/msg05586.html
