GAE Datastore (Golang): Filter Query When Adding New DB Field - google-app-engine

I'm running a GAE Golang application that works with the Datastore. I have a struct which translates to a DB model on the Datastore, and I have added a new field to the struct; call it NewField (type string).
Existing instances ("rows" in the DB) for this struct are, of course, missing this NewField, which is expected.
I'm looking to create a query that will return all instances where this NewField is missing (the existing instances).
This is what I tried:
q := datastore.NewQuery("MyModel")
q = q.Filter("NewField =", "")
However this doesn't seem to work.
Any ideas on how to achieve this?

The bad news is that you can't.
Every query on GAE Datastore operates on an index. Since you just added the new property, existing entities will not appear in any index that includes that property. What you would need is to loop over entities that have no index record for it, but that is not possible.
Your best bet is to query all entities and do the filtering / update manually in Go code, keeping the ones where the NewField field has its zero value. Once you re-save the existing entities, the new property will get indexed, and you will be able to search / filter by that property in the future.
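For illustration, here is a minimal sketch of that backfill, assuming the google.golang.org/appengine/datastore package; the function name and the default value are made up for the example:
import (
    "context"

    "google.golang.org/appengine/datastore"
)

type MyModel struct {
    // ... your existing fields ...
    NewField string
}

func backfillNewField(ctx context.Context, defaultValue string) error {
    var items []MyModel
    // Load every entity of the kind; old entities simply come back with NewField == "".
    keys, err := datastore.NewQuery("MyModel").GetAll(ctx, &items)
    if err != nil {
        return err
    }
    for i := range items {
        if items[i].NewField == "" {
            items[i].NewField = defaultValue
            // Re-saving writes (and indexes) the new property.
            if _, err := datastore.Put(ctx, keys[i], &items[i]); err != nil {
                return err
            }
        }
    }
    return nil
}
For a large kind you would run this in batches (keys-only query plus cursors, or a task queue) rather than a single GetAll, but the idea is the same.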
If by any chance your entities store the creation time or last updated time (in a property), then you may use that: filter by last updated time to list only entities where the timestamp is less than the time when you added the new property to your Go model.
Another option (for future changes) is to add a "version" property to your entities. Whenever you perform a model update, increment the version for new entities. And you can always query entities with old versions (or with a specific version).
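As a hypothetical sketch of that versioning idea, assuming your entities have carried an indexed int Version property from the start (none of this is provided by GAE itself):
// Bump this constant whenever the Go model changes.
const currentModelVersion = 2

// All entities written under an older model version:
q := datastore.NewQuery("MyModel").Filter("Version <", currentModelVersion)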

Related

How to work with unsaved entities even though ID attribute is needed?

I'm creating a React application where my data has the following structure:
interface BookCase {
  id: number;
  bookShelves: BookShelf[];
}
interface BookShelf {
  id: number;
}
Every bookcase and every bookshelf has an id property. I use this for the key attribute and for locating a bookshelf inside the bookShelves array. The id is generated in the backend by the database (with a bigserial in PostgreSQL) on save.
I now want to create a new bookcase inside my frontend without immediately saving it to the backend. First I want to work with it, perform some operations on it (e.g. place a book on the shelf), and afterwards send the whole batch of changes with the new entities to the backend where it will then be persisted in the database.
Of course I do not yet have an id, although I need one to work on the bookcases. Should I rewrite my application to also accept null for id (I would prefer not to)? Should I just randomly create a temporary id, possibly colliding with the ids already present in the database (or, for example, use a negative value like -1)? Then I would need to replace all the ids after everything has been saved to the database.
With UUIDs I could generate it on the frontend, but I guess there also has to be a common pattern to work with just incrementing integers as the id.
I do not think there is a clear answer here.
Essentially you have an object-relational mapping, and there are various ways to handle it. Entity Framework, for example, just uses the default for the data type. So if the entity does not exist yet, the ID will be 0, and any persisted entities have values starting at 1, so there are no conflicts.
One way I usually handle saving is by returning the updated record from the request, so you just replace your old one with that and you have the correct ID value applied automatically.

Howto: Reload entities in Solr

Let's say you have a Solr core with multiple entities in your documents. In my case the reason for that is that the index is fed by SQL queries and I don't want to deal with multiple cores. So, in case you add or change one entity configuration, you eventually have to re-index the whole shop, which can be time-consuming.
There is a way to delete and re-index one single entity, and this is how it works:
Prerequisite: your index entries have to have a field that reflects the entity name. You could either do that via a constant in your SQL statement or by using the TemplateTransformer:
<field column="entityName" name="entityName" template="yourNameForTheEntity"/>
You can use this name to remove all entity items from the index using the Solr admin UI. Go to Documents,
request-Handler: /update
Document-Type: JSON
Document(s): {"delete": {"query": "entityName:yourNameForTheEntity"}}
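If you prefer to script this step instead of using the admin UI, a small Go program can POST the same delete-by-query JSON to the update handler (the URL and core name below are placeholders, not from the original setup):
package main

import (
    "bytes"
    "fmt"
    "log"
    "net/http"
)

func main() {
    // Same delete-by-query payload as in the admin UI; commit=true makes the deletion visible right away.
    body := []byte(`{"delete": {"query": "entityName:yourNameForTheEntity"}}`)
    resp, err := http.Post("http://localhost:8983/solr/yourCore/update?commit=true",
        "application/json", bytes.NewReader(body))
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()
    fmt.Println("Solr responded with", resp.Status)
}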
After submitting the delete, all related items are gone, and you can verify that by running a query on the Query page:
{!term f=entityName}yourNameForTheEntity
Then go to the Dataimport page to re-load your entity. Uncheck the Clean checkbox, select your entity and Execute.
After the indexing is complete, you can go back to the query page and check the result.
That's it.
Have fun,
Christian

Datastore does not return any result when cursor pointing to entity gets updated but retains its position in list

I'm using Google Datastore to get a user's data.
This is what I'm trying to do:
When data is updated, its updated_at [indexed] property gets set to the current timestamp.
I query the data ordered by updated_at ascending and store the returned cursor for later use.
Now the user has updated the last entity (the one the cursor currently points to) and no other data has been added or updated.
I'm expecting that last entity to be returned in the next query (using that old cursor), because it was updated and now has a new updated_at timestamp.
But that is not the case; my result is an empty list. And now I have lost that update completely, because the query will return all the other objects except the last entity that was updated.
Am I doing something wrong, or is this the way it is? If this is the natural behavior, what is the preferred way to get the last entity that was updated?
Disclosure: this answer represents just my understanding of how GAE Datastore works. Reality can be different, but the solution should work anyway.
You can think of a cursor as a pointer to a node in a linked list.
Basically it stores just the query used to obtain it and the key of the "last/current" entity. When entities are updated in the datastore, there is no way to update a cursor.
When you change the entity's updated_at field, that does not change the key stored in the cursor. So if you update filtered/ordered properties, the old cursor points to the same node but in a different "chain".
Solution: instead of storing a cursor, store the last (max) updated_at and query your data with .filter('updated_at >', last_updated_at) (see the sketch after this list). This way you will:
Get your entity in the results if updated_at changed (increased)
Have a smaller & more readable "cursor" to pass around.
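To illustrate, here is a minimal sketch of that pattern with the Go google.golang.org/appengine/datastore API (the question's snippet looks like Python, and the kind, field and function names here are assumptions):
import (
    "context"
    "time"

    "google.golang.org/appengine/datastore"
)

type UserData struct {
    // ... your other fields ...
    UpdatedAt time.Time `datastore:"updated_at"`
}

// fetchSince returns everything modified after lastUpdatedAt plus the new "cursor" value.
func fetchSince(ctx context.Context, lastUpdatedAt time.Time) ([]UserData, time.Time, error) {
    var items []UserData
    q := datastore.NewQuery("UserData").
        Filter("updated_at >", lastUpdatedAt).
        Order("updated_at")
    if _, err := q.GetAll(ctx, &items); err != nil {
        return nil, lastUpdatedAt, err
    }
    if len(items) > 0 {
        lastUpdatedAt = items[len(items)-1].UpdatedAt // the largest updated_at seen so far
    }
    return items, lastUpdatedAt, nil
}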
Think of datastore cursors as pointing "between" entities, as in "Here's where to continue the scan."
The documentation says "... a cursor, which is an opaque base64-encoded string marking the index position of the last result retrieved," but that last result won't be re-retrieved.
It is the expected behaviour, because the data was modified between the moment when the cursor was created and when it is used. From Cursors and data updates:
... If the results for a query change between uses of a cursor, the
query notices only changes that occur in results after the cursor.
...

Are incremental adds possible in Neo4j?

I have a quick question. I have a database of one million nodes and 4 million relationships, all of which I created in Neo4j with the import CSV command. After testing the graph database and analyzing the queries according to my needs, I now want to make a PHP program where the data will be loaded automatically and I will get the results in the end (according to my query). Here is the question: my data will update every 15 minutes, so does Neo4j have the ability to do incremental adds, i.e. to show which new relationships or nodes were added in a specific period of time? I was thinking of using the time to see which data was created in that period. Correct me if I am wrong. I only want to see the new additions, because I don't want Neo4j to waste time on the calculation of already existing nodes/relationships. Is there any other way to do that?
Thanks in advance.
You could add a property to store a string of the date/time that the nodes are added. Then you could query for everything since the last date/time. I'm not 100% sure on the index performance of that, though.
However, if all you care about is showing the most recently imported data, you could have a boolean value with an index:
CREATE INDEX ON :Label(recently_added)
Then when you import your data you can unset all of the current ones and set the new ones like this:
MATCH (n:Label {recently_added: true})
REMOVE n.recently_added
Second query:
MATCH (n:Label)
WHERE n.id IN {id_list}
SET n.recently_added = true
That is assuming that you have some sort of unique identifier on the nodes which you can use to set the ones which you just added.

WPF and LINQ to Entities binding to newly added records

I'm in the process of learning LINQ to Entities and WPF so forgive me if I get some terminology wrong. I have a model which contains Clients and I want the user to be able to bulk enter up to 20 clients at a time (this will be done by data entry staff off a paper list so I want to avoid entering one and saving one).
I was planning on adding 20 new clients to my model and have a datagrid/listbox bound to this.
In LINQ, how do I select the records newly added to the model? I could rely on certain fields being blank, but is there a better method? Alternatively, is there another way of doing this?
DataContext db; // ... your data context
ChangeSet changes = db.GetChangeSet(); // lists of pending insert, update and delete operations
The change set will contain lists of pending insert, update and delete operations. If you access it prior to any SubmitChanges call, you should be able to get what you want. However, LINQ performs inserts in a transactional manner, so what is it that you want to achieve here?
