Will two entities get written to datastore with this endpoint?

I have an endpoint method that first uses a query to check whether an entity with certain params exists; if it does not, it creates it. If it does exist, I want to increment a counter on the entity:
Report report = ofy().load().type(Report.class)
        .filter("userID", userID)
        .filter("state", state)
        .first().now();
if (report == null) {
    // write new entity to datastore with userID and state
} else {
    // increment the entity's counter by 1
    report.setCount(report.getCount() + 1);
    // save entity to datastore
}
My question is: what happens if someone rapidly clicks a button that executes the above endpoint with the same params? Will two identical Report entities get written to the datastore? I want to make sure only one is written.

By itself this code is not safe; it has a race condition that can allow multiple Report entities to be created.
To make it safe, you need to run the code in a transaction, which means the query must be an ancestor query (or be converted to a simple primary-key lookup). One option is to give Report a @Parent of the User. Then you can do something like this:
ofy().transact(() -> {
    Report report = ofy().load().type(Report.class)
            .ancestor(user)
            .filter("state", state)
            .first().now();
    if (report == null) {
        // write new entity to datastore with userID and state
    } else {
        // increment the entity's counter by 1
        report.setCount(report.getCount() + 1);
        // save entity to datastore
    }
});
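For reference, here is a minimal sketch of what Report could look like with a @Parent key (the exact field set beyond the question is an assumption); the parent key is what makes the ancestor query above legal inside a transaction:

import com.googlecode.objectify.Key;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;
import com.googlecode.objectify.annotation.Index;
import com.googlecode.objectify.annotation.Parent;

@Entity
public class Report {
    @Id Long id;
    @Parent Key<User> user;   // places each Report in its User's entity group
    @Index String state;      // indexed so the .filter("state", ...) query works
    int count;

    public int getCount() { return count; }
    public void setCount(int count) { this.count = count; }
}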

Related

Trigger to restrict duplicate record for a particular type

I have a custom object, Consent and Preferences, which is a child of Account.
The requirement is to restrict duplicate records based on the channel field.
For example, if I have created a consent with channel Email, it should throw an error when I try to create a second record with Email as the channel.
Below is the code I have written, but it is letting me create only one record: for the second record, irrespective of the channel, it throws the error:
Trigger code:
Set<String> newChannelSet = new Set<String>();
Set<String> dbChannelSet = new Set<String>();
for (PE_ConsentPreferences__c newCon : Trigger.new) {
    newChannelSet.add(newCon.PE_Channel__c);
}
for (PE_ConsentPreferences__c dbcon : [SELECT Id, PE_Channel__c FROM PE_ConsentPreferences__c WHERE PE_Channel__c IN :newChannelSet]) {
    dbChannelSet.add(dbcon.PE_Channel__c);
}
for (PE_ConsentPreferences__c newConsent : Trigger.new) {
    if (dbChannelSet.contains(newConsent.PE_Channel__c)) {
        newConsent.addError('You are inserting Duplicate record');
    }
}
Your trigger blocks you because you didn't filter by Account in the query. So it'll let you add 1 record of each channel type and that's all.
I recommend not doing it with code. It is going to get crazier than you think really fast.
You need to stop inserts. To do that you need to compare against values already in the database (fine), but you should also protect against mass loading with Data Loader, for example, which means comparing against the other records in Trigger.new as well. You can simplify it somewhat by moving the logic from before insert to after insert, where you can query everything from the DB... but that's weak: it's a validation that should prevent the save, so it logically belongs in before. It'll waste an account id, maybe some autonumbers... Not elegant.
On update you should handle changes to Channel but also to Account Id (reparenting to another record!). Otherwise I'll create a consent on acc1 and move it to acc2.
What about the undelete scenario? I create one consent, delete it, create an identical one, and restore the first one from the Recycle Bin. If you didn't cover after undelete - boom, headshot.
Instead go with pure config route (or simple trigger), let the database handle that for you.
Make a helper text field, mark it unique.
Write a workflow / process builder / simple trigger (before insert, before update) that writes the combination of Account__c + ' ' + PE_Channel__c to this field. The condition could be ISNEW() || ISCHANGED(Account__c) || ISCHANGED(PE_Channel__c).
Optionally prepare data fix to update existing records.
Job done, you can't break it now. And if you ever need to allow more combinations (a 3rd field), it's easy for an admin to extend, as long as you keep under 255 chars total.
Or (even better) there are duplicate matching rules ;) Give them a go before you do anything custom? Maybe check out https://trailhead.salesforce.com/en/content/learn/modules/sales_admin_duplicate_management.

Common strategy in handling concurrent global 'inventory' updates

To give a simplified example:
I have a database with one table: names, which has 1 million records each containing a common boy or girl's name, and more added every day.
I have an application server that takes as input HTTP requests from parents using my website 'Name Chooser'. With each request, I need to pick a name from the DB and return it, and then NOT give that name to another parent. The server is concurrent so it can handle a high volume of requests, yet it has to respect "unique name per request" and still be highly available.
What are the major components and strategies for an architecture of this use case?
From what I understand, you have two operations: adding a name and choosing a name.
I have a couple of questions:
Question 1: Do parents choose names only, or do they also add names?
Question 2: If they add names, does that mean that when a name is added it should also be marked as already chosen?
Assuming that you don't want all name-selection requests to wait for one another (by locking or queueing them):
One solution to resolve concurrency in the case of choosing a name only is to use an Optimistic Offline Lock.
The most common implementation is to add a version field to your table and increment this version when you mark a name as chosen. You will need DB support for this, but most databases offer a mechanism for it. MongoDB adds a version field to its documents by default; for an RDBMS you have to add this field yourself.
You haven't specified what technology you are using, so I will give an example in pseudocode for an SQL DB. For MongoDB you can check how the DB makes these checks for you.
NameRecord {
    id,
    name,
    parentID,
    version,
    isChosen,

    function chooseForParent(parentID) {
        if (this.isChosen) {
            throw Error/Exception;
        }
        this.parentID = parentID;
        this.isChosen = true;
        this.version++;
    }
}

NameRecordRepository {
    function getByName(name) { ... }

    function save(record) {
        var oldVersion = record.version - 1;
        var query = "UPDATE records SET .....
                     WHERE id = {record.id} AND version = {oldVersion}";
        var rowsCount = db.execute(query);
        if (rowsCount == 0) {
            throw ConcurrencyViolation;
        }
    }
}

// somewhere else in an object or module or whatever...
function chooseName(parentID, name) {
    var record = NameRecordRepository.getByName(name);
    record.chooseForParent(parentID);
    NameRecordRepository.save(record);
}
Before this object is saved to the DB, a version comparison must be performed. SQL provides a way to execute an update conditionally and return the count of affected rows. In our case we check that the version in the database is still the old one before updating; if it's not, someone else has updated the record in the meantime.
In this simple case you can even remove the version field and use the isChosen flag in your SQL query, like this:
var query = "UPDATE records SET .....
             WHERE id = {record.id} AND isChosen = false";
When adding a new name to the database, you will need a unique constraint; that will solve the concurrency issues on insert.
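To make the pseudocode concrete, here is a minimal JDBC sketch of the version-checked save, assuming a hypothetical table records(id, name, parent_id, version, is_chosen); all names are illustrative, not part of the original answer:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class NameRecordRepository {
    private final Connection db;

    public NameRecordRepository(Connection db) {
        this.db = db;
    }

    // Persists a chosen record only if nobody has modified it since we read it.
    public void save(long id, long parentId, int newVersion) throws SQLException {
        String sql = "UPDATE records SET parent_id = ?, is_chosen = TRUE, version = ? "
                   + "WHERE id = ? AND version = ?";
        try (PreparedStatement ps = db.prepareStatement(sql)) {
            ps.setLong(1, parentId);
            ps.setInt(2, newVersion);
            ps.setLong(3, id);
            ps.setInt(4, newVersion - 1); // optimistic check against the version we read
            if (ps.executeUpdate() == 0) {
                // 0 affected rows: another request chose or changed this name first
                throw new SQLException("ConcurrencyViolation");
            }
        }
    }
}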

GAE datastore inequality filter two properties advice

I've got a scenario where I need to query the datastore for some random users who have been active in the last X minutes.
Each of my User entities has a property called 'random'. When I want to find some random users, I generate random min and max values and use them to query the datastore against the users' random property.
This is what I've got so far:
public static List<Entity> getRandomUsers(Key filterKey, String gender, String language, int maxResults) {
    ArrayList<Entity> nonDuplicateEntities = new ArrayList<>();
    HashSet<Entity> hashSet = new HashSet<>();
    int attempts = 0;
    while (nonDuplicateEntities.size() < maxResults) {
        attempts++;
        if (attempts >= 10) {
            return nonDuplicateEntities;
        }
        double ran1 = Math.random();
        double ran2 = Math.random();
        Filter randomMinFilter = new Query.FilterPredicate(Constants.KEY_RANDOM, Query.FilterOperator.GREATER_THAN_OR_EQUAL, Math.min(ran1, ran2));
        Filter randomMaxFilter = new Query.FilterPredicate(Constants.KEY_RANDOM, Query.FilterOperator.LESS_THAN_OR_EQUAL, Math.max(ran1, ran2));
        Filter languageFilter = new Query.FilterPredicate(Constants.KEY_LANGUAGE, Query.FilterOperator.EQUAL, language);
        Filter randomRangeFilter;
        if (gender == null || gender.equals(Constants.GENDER_ANY)) {
            randomRangeFilter = Query.CompositeFilterOperator.and(randomMinFilter, randomMaxFilter, languageFilter);
        } else {
            Filter genderFilter = new Query.FilterPredicate(Constants.KEY_GENDER, Query.FilterOperator.EQUAL, gender);
            randomRangeFilter = Query.CompositeFilterOperator.and(randomMinFilter, randomMaxFilter, genderFilter, languageFilter);
        }
        Query q = new Query(Constants.KEY_USER_CLASS).setFilter(randomRangeFilter);
        PreparedQuery pq = DatastoreServiceFactory.getDatastoreService().prepare(q);
        List<Entity> entities = pq.asList(FetchOptions.Builder.withLimit(maxResults - nonDuplicateEntities.size()));
        for (Entity entity : entities) {
            if (filterKey.equals(entity.getKey())) {
                continue;
            }
            if (hashSet.add(entity)) {
                nonDuplicateEntities.add(entity);
            }
            if (nonDuplicateEntities.size() == maxResults) {
                return nonDuplicateEntities;
            }
        }
    }
    return nonDuplicateEntities;
}
I now need just users who have been active recently.
Each User entity also has a 'last active' property, which I want to include in the query, e.g. last active > 30 minutes ago.
This would mean having an inequality filter on two properties, which I can't do.
What is the most efficient way to do this?
I could get all user entities active in the last X minutes and then pick some random ones, or I could leave my code as is and check last active before adding each entity to the non-duplicate list, but that might involve lots of calls to the datastore.
Is there some other way I can do this just using the query?
Given the above comments, as requested, here is one approach.
Assuming you have a "last active" property which stores a datetime stamp, you can perform a keys-only query where last active > the datetime stamp of interest.
On retrieving the keys, make a random choice from the result set, then explicitly fetch the chosen key with a get operation. This limits costs to small ops plus one get.
I would then consider caching this set of keys in memcache with a defined expiry period, so that if you need another random choice in the next nominated period you can reuse the set of keys rather than re-querying two seconds later. Accuracy doesn't appear to be too important given the random choice.
If you do adopt the caching strategy, you do have to deal with cache expiry and refreshing the cache.
A potential issue here is the dogpile effect, where multiple requests all miss the cache at the same time and each handler starts rebuilding it. In a lightly loaded system this may not be an issue; in a heavily loaded system with a lot of activity, you may want to keep the cache warm with a task. Just something to think about.
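Here is a rough sketch of the keys-only query plus random get, using the same low-level datastore API as the question; the kind and property names ("User", "lastActive") and the fetch limit are assumptions:

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.Query;
import java.util.List;
import java.util.Random;

public class RandomActiveUser {
    // Returns a random user active within the given window, or null if none found.
    public static Entity pickRandomActiveUser(long windowMillis) throws EntityNotFoundException {
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
        long cutoff = System.currentTimeMillis() - windowMillis;
        Query q = new Query("User")
                .setFilter(new Query.FilterPredicate("lastActive",
                        Query.FilterOperator.GREATER_THAN, cutoff))
                .setKeysOnly();  // keys-only results are billed as small ops
        List<Entity> keysOnly = ds.prepare(q).asList(FetchOptions.Builder.withLimit(1000));
        if (keysOnly.isEmpty()) {
            return null;
        }
        Key chosen = keysOnly.get(new Random().nextInt(keysOnly.size())).getKey();
        return ds.get(chosen);  // a single get fetches the full entity
    }
}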

Is there any way to do a Insert or Update / Merge / Upsert in LLBLGen

I'd like to do an upmerge using LLBLGen without first fetching then saving the entity.
I already found the possibility to update without fetching the entity first, but then I have to know it is already there.
Updating entries would be about as often as inserting a new entry.
Is there a possibility to do this in one step?
Would it make sense to do it in one step?
Facts:
LLBLgen Pro 2.6
SQL Server 2008 R2
.NET 3.5 SP1
I know I'm a little late to this, but as I remember from working with LLBLGen Pro, it is totally possible, and one of its beauties is that everything is possible!
I don't have my samples, but I'm pretty sure there is a method named UpdateEntitiesDirectly that can be used like this:
// suppose we have Product and Order entities
using (var daa = new DataAccessAdapter())
{
    // the first argument carries the new field values, the second the filter
    var newValues = new OrderEntity { State = 1 };
    var filter = new RelationPredicateBucket(
        (OrderFields.ProductId == 23) & (OrderFields.Date > DateTime.Now.AddDays(-2)));
    int numberOfUpdatedEntities = daa.UpdateEntitiesDirectly(newValues, filter);
}
When using LLBLGen Pro we were able to do pretty much everything that is possible with an ORM framework; it's just great!
It also has a method for batch deletes, called DeleteEntitiesDirectly, that may be useful in scenarios where you need to delete an entity and replace it with another one.
Hope this is helpful.
I think you can achieve what you're looking for by using an EntityCollection. First fetch the entities you want to update with the FetchEntityCollection method of DataAccessAdapter; then change anything you want in that collection, insert new entities into it, and save it using the SaveEntityCollection method of DataAccessAdapter. This way existing entities are updated and new ones are inserted into the database. For example, in a product-order scenario in which you want to manipulate the orders of a specified product, you can use something like this:
int productId = 23;
var orders = new EntityCollection<OrderEntity>();
using (DataAccessAdapter daa = new DataAccessAdapter())
{
    daa.FetchEntityCollection(orders, new RelationPredicateBucket(OrderFields.ProductId == productId));
    foreach (var order in orders)
    {
        order.State = 1;
    }
    OrderEntity newOrder = new OrderEntity();
    newOrder.ProductId = productId;
    newOrder.State = 0;
    orders.Add(newOrder);
    daa.SaveEntityCollection(orders);
}
As far as I know, this is not possible, nor could it really be.
If you were to just call adapter.Save(entity) on an entity that was not fetched, the framework would assume it is new. If you think about it, how could the framework know whether to emit an UPDATE or an INSERT statement? No matter what, something somewhere has to query the database to see if the row exists.
It would not be too difficult to create something that does this more or less automatically for single-entity (non-recursive) saves. The steps would be something like:
1. Create a new entity and set its fields.
2. Attempt to fetch an entity of the same type using the PK or a unique constraint (there are other options as well, but none as uniform).
3. If the fetch fails, just save the new entity (INSERT).
4. If the fetch succeeds, map the fields of the created entity onto the fields of the fetched entity.
5. Save the fetched entity (UPDATE).

Datastore - Resource contention in a single entity group

I am getting lost on the following regarding the Datastore:
It is recommended to denormalize data, as the Datastore does not support join queries. This means that the same information is copied into several entities.
Denormalization means that whenever you have to update data, it must be updated in different entities.
But there is a limit of 1 write per second in a single entity group.
The problem I have is therefore the following:
In order to update records, I open a transaction and then update all the required entities. The entities to be updated are within the same entity group but are of different kinds.
I am getting a "resource contention" exception.
==> It therefore seems that the only way to update denormalized data is outside of a transaction. But doing this is really bad, as some entities could be updated while others wouldn't be.
Am I the only one having this problem? How did you solve it?
Thanks,
Hugues
The (simplified version of the) code is as follows:
Objectify ofy = ObjectifyService.beginTransaction();
try {
    Key<Party> partyKey = new Key<Party>(realEstateKey, Party.class, partyDTO.getId());
    //--------------------------------------------------------------------------
    //-- 1 - We update the party
    //--------------------------------------------------------------------------
    Party party = ofy.get(partyKey);
    party.update(partyDTO);
    //---------------------------------------------------------------------------------------------
    //-- 2 - We update the kinds which have Party as an embedded field, all in the same entity group
    //---------------------------------------------------------------------------------------------
    //2.1 Invoices
    Query<Invoice> q1 = ofy.query(Invoice.class).ancestor(realEstateKey).filter("partyKey", partyKey);
    for (Invoice invoice : q1) {
        invoice.setParty(party);
        ofy.put(invoice);
    }
    //2.2 Payments
    Query<Payment> q2 = ofy.query(Payment.class).ancestor(realEstateKey).filter("partyKey", partyKey);
    for (Payment payment : q2) {
        payment.setParty(party);
        ofy.put(payment);
    }
    ofy.getTxn().commit();
    return RPCResults.SUCCESS;
}
catch (Exception e) {
    final Logger log = Logger.getLogger(InternalServiceImpl.class.getName());
    log.severe("Problem while updating party : " + e.getLocalizedMessage());
    return RPCResults.FAILURE;
}
finally {
    if (ofy.getTxn().isActive()) {
        ofy.getTxn().rollback();
        partyDTO.setCreationResult(RPCResults.FAILURE);
        return RPCResults.FAILURE;
    }
}
This is happening because multiple requests to update the same entity group are occurring within a short period of time, not because you are updating many entities in the same entity group at once.
Since you have not shown your code, I can assume one of two things is happening:
The method you describe above is not actually using a transaction, and you are running put_multi() with many entities of the same entity group. (If I had to guess, it'd be this.)
You have a high-traffic site and many other updates are occurring at the same time.
Just in case someone runs into the same issue:
The problem was in party.update(partyDTO), where under some specific conditions I was initiating another transaction.
What I learned today is that:
--> Inside a transaction, you are allowed to include multiple puts, even beyond the 1 write per second per entity group limit
--> However, you should take care not to initiate another transaction within your transaction
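As a sketch of that takeaway (the class internals and field names here are hypothetical, not from the original post), have helpers accept the caller's transactional Objectify instance instead of beginning their own transaction:

import com.googlecode.objectify.Objectify;

public class Party {
    private String name;  // hypothetical field

    // Reuses the caller's transaction: no ObjectifyService.beginTransaction()
    // in here, which would open a second, competing transaction on the same
    // entity group and trigger the contention exception.
    public void update(Objectify txnOfy, PartyDTO dto) {
        this.name = dto.getName();  // copy whichever fields apply
        txnOfy.put(this);           // enlisted in the caller's transaction
    }
}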
