Thread safety in Google Endpoints and Objectify, and how does allocateId work? - google-app-engine

I have an OfyService class of this type
/**
* Custom Objectify Service that this application should use.
*/
public class OfyService {
/**
* This static block ensures the entity registration.
*/
static {
factory().register(MerchantProfile.class);
factory().register(Product.class);
}
/**
* Use this static method for getting the Objectify service object in order to make sure the
* above static block is executed before using Objectify.
* @return Objectify service object.
*/
public static Objectify ofy() {
return ObjectifyService.ofy();
}
/**
* Use this static method for getting the Objectify service factory.
* @return ObjectifyFactory.
*/
public static ObjectifyFactory factory() {
return ObjectifyService.factory();
}
}
I use the factory().allocateId() method to allocate a Key (to get a Long id) before saving an entity. I have a case where I need to transfer money from one account to another and add an entry to the Transaction table, so I use ofy().transact(new Work<~>) in the following way:
WrappedBoolean result = ofy().transact(new Work<WrappedBoolean>() {
@Override
public WrappedBoolean run() {
}
});
I allocate the id for the Transaction before entering the transact part. Inside it I subtract money from one account, add it to the other, and then save both accounts and the Transaction entity.
My concerns are as follows:
What happens when there are two concurrent requests, the App Engine instance gives them separate request handlers, and the same id is allocated to both of them depending on the database state? Or is it impossible for the same id to be allocated twice?
How does the flow of control of Work compare to the conventional synchronized block that we use in Java for critical sections?
PS: To do the same in other frameworks like Jersey (with JPA), I would have used a synchronized block and performed the transaction inside it. Since only one thread at a time can enter that block, and the id is assigned only once the data is saved to the table, there would have been no issues.

Thread safety is not relevant to data consistency with either the datastore or with JPA/RDBMSes. If you are relying on synchronization, you are doing something wrong.
If you create a complete unit of work that performs your task and execute it in a transaction, the datastore will ensure that it is either completely applied or not applied at all. It will also guarantee that all transactions behave as if they were operated in serial. This might result in any particular execution aborting and retrying, but you don't see this as a user.
In short: Just put this in a transaction and do not worry about threading.
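To make the "abort and retry, yet behave as if serial" idea concrete, here is a minimal in-memory sketch of optimistic concurrency. It is not Objectify or datastore code: the `Balances` snapshot, the `transact` helper, and the `AtomicReference` standing in for the store are all assumptions made for illustration only.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

public class OptimisticTransfer {

    /** Immutable snapshot of two account balances (hypothetical stand-in for stored entities). */
    static final class Balances {
        final long from;
        final long to;

        Balances(long from, long to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Stand-in for the datastore: holds the latest committed snapshot. */
    private final AtomicReference<Balances> store;

    OptimisticTransfer(long from, long to) {
        this.store = new AtomicReference<>(new Balances(from, to));
    }

    /**
     * Applies the unit of work to a snapshot and commits only if nobody else
     * committed in the meantime; otherwise the work is re-run on fresh state.
     */
    Balances transact(UnaryOperator<Balances> work) {
        while (true) {
            Balances snapshot = store.get();
            Balances updated = work.apply(snapshot);
            if (store.compareAndSet(snapshot, updated)) {
                return updated;
            }
            // Lost the race: loop and retry against the new snapshot.
        }
    }

    Balances current() {
        return store.get();
    }

    /** Two threads each move 1 unit fifty times, with no synchronized blocks. */
    static Balances demo() {
        OptimisticTransfer bank = new OptimisticTransfer(100, 0);
        Runnable mover = () -> {
            for (int i = 0; i < 50; i++) {
                bank.transact(b -> new Balances(b.from - 1, b.to + 1));
            }
        };
        Thread t1 = new Thread(mover);
        Thread t2 = new Thread(mover);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return bank.current();
    }

    public static void main(String[] args) {
        Balances end = demo();
        System.out.println(end.from + " / " + end.to);
    }
}
```

Every one of the 100 transfers is applied exactly once, so the run ends with 0 in the source account and 100 in the target, even though no thread ever blocks another. A transaction that loses the race is simply re-run, which is why the unit of work should be free of side effects outside the store.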

Related

Prevent one user from accessing a particular page when another user is already using it in .net core api and react js front end

We have a requirement to create a kind of user session. Our front end is React, our back end is a .NET Core 6 API, and the database is Postgres.
When one user clicks a delete button, they should not be allowed to delete the item while another user is already using it and performing some actions.
Can you suggest an approach, or any kind of service that is available, to achieve this? Please help.
I would say don't make it too complicated. A simple approach could be to add the properties 'BeingEditedByUserId' and 'ExclusiveEditLockEnd' (datetime) to the entity and check these when performing any action on this entity. When a user performs an action on the entity, their id is assigned and a timeslot (for example 10 minutes) is reserved for them. If any other user tries to perform an action during that time, you block them. Once the timeslot has expired, anyone can edit again.
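The check described above can be sketched roughly as follows. This is an in-memory illustration in Java, not the asker's .NET code: the `EditLock` class, the `tryAcquire` method, and the fixed 10-minute slot are assumptions, and in a real application the check and the assignment would have to happen atomically in the database (for example in one transaction).

```java
import java.time.Duration;
import java.time.Instant;

public class EditLock {
    // These two fields mirror the properties suggested in the answer.
    private String beingEditedByUserId;   // null when nobody holds the lock
    private Instant exclusiveEditLockEnd; // null when nobody holds the lock

    /** Length of the exclusive timeslot (assumed value, taken from the "10 minutes" example). */
    private static final Duration SLOT = Duration.ofMinutes(10);

    /**
     * Grants or renews the lock if it is free, expired, or already held by this user.
     * Returns false while another user's timeslot is still running.
     */
    public synchronized boolean tryAcquire(String userId, Instant now) {
        boolean free = beingEditedByUserId == null || !now.isBefore(exclusiveEditLockEnd);
        if (free || beingEditedByUserId.equals(userId)) {
            beingEditedByUserId = userId;
            exclusiveEditLockEnd = now.plus(SLOT);
            return true;
        }
        return false;
    }
}
```

Passing `now` in as a parameter keeps the expiry rule easy to test. Because an expired slot is treated as free, no cleanup job is needed for users who close the tab without unlocking.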
I have had to do something similar with Java (also backed by a postgres db)
There are some pitfalls to avoid with a custom lock implementation, like forgetting to unlock when finished. There is no guarantee that a client makes a 'goodbye, unlock the table' call when they finish editing a page: they could simply close the browser tab, or have a power outage... Here is what I decided to do:
Decide if the lock should be implemented in the API or DB?
Is this a distributed/scalable application? Does it run as just a single instance or multiple? If multiple, then you can not (as easily) implement an API lock (you could use something like a shared cache, but that might be more trouble than it is worth)
Is there a record in the DB that could be used as a lock, guaranteed to exist for each editable item in the DB? I would assume so, but if the app is backed by multiple DBs maybe not.
API locking is fairly easy, you just need to handle thread safety as most (if not all) REST/SOAP... implementations are heavily multithreaded.
If you implement at the DB consider looking into a 'Row Level Lock' which allows you to request a lock on a specific row in the DB, which you could use as a write lock.
If you want to implement in the API, consider something like this:
class LockManager
{
private static readonly object writeLock = new();
// the `object` is whatever you want to use as the ID of the resource being locked, probably a UUID/GUID but could be a String too
// the `holder` is an ID of the person/system that owns the lock
Dictionary<object, _lock> locks = new Dictionary<object, _lock>();
_lock acquireLock(object id, String holder)
{
_lock lok = new _lock();
lok.id = id;
lok.holder = holder;
lock (writeLock)
{
if (locks.ContainsKey(id))
{
if (locks[id].release < DateTime.Now) // the existing lock has expired, so it can be replaced
{
locks.Remove(id);
}
else
{
throw new InvalidOperationException("Resource is already locked, lock held by: " + locks[id].holder);
}
}
lok.allocated = DateTime.Now;
lok.release = lok.allocated.AddMinutes(5);
locks[id] = lok; // store the lock so later callers can see it
}
return lok;
}
void releaseLock(object id)
{
lock (writeLock)
{
locks.Remove(id);
}
}
// called by .js code to renew the lock via ajax call if the user is determined to be active
void extendLock(object id)
{
lock (writeLock)
{
// check inside the lock so the entry cannot be removed between the check and the update
if (locks.ContainsKey(id))
{
locks[id].release = DateTime.Now.AddMinutes(5);
}
}
}
}
class _lock
{
public object id;
public String holder;
public DateTime allocated;
public DateTime release;
}
}
This is what I did, because it does not depend on the DB or the client and was easy to implement. It also does not require configuring any lock timeouts or cleanup tasks to release items with expired locks on them, as that is taken care of in the locking step.

Non Serializable object in Apache Flink

I am using Apache Flink to perform analytics on streaming data.
I am using a dependency whose object takes more than 10 seconds to create, as it reads several files in HDFS before initialisation.
If I initialise the object in the open method I get a timeout exception, and if I do it in the constructor of a sink/flatMap I get a serialisation exception.
Currently I initialise the object in a static block in some other class and call Preconditions.checkNotNull(MGenerator.mGenerator) in the main file. It then works when used in a flatMap or sink.
Is there a way to create a non-serializable dependency object that might take more than 10 seconds to initialise in Flink's flatMap or sink?
public class DependencyWrap {
static MGenerator mGenerator;
static {
final String configStr = "{}";
final Config config = new Gson().fromJson(configStr, Config.class);
mGenerator = new MGenerator(config);
}
}
public class MyStreaming {
public static void main(String[] args) throws Exception {
Preconditions.checkNotNull(MGenerator.mGenerator);
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(parallelism);
...
input.flatMap(new RichFlatMapFunction<Map<String,Object>,List<String>>() {
@Override
public void open(Configuration parameters) {
}
@Override
public void flatMap(Map<String,Object> value, Collector<List<String>> out) throws Exception {
out.collect(MFVGenerator.mfvGenerator.generateMyResult(value.f0, value.f1));
}
});
}
}
Also, please correct me if I am wrong about the question.
Doing it in the Open method is 100% the right way to do it. Is Flink giving you a timeout exception, or the object?
As a last-ditch method, you could wrap your object in a class that contains both the object and its JSON string or Config (is Config serializable?), with the object marked transient, and then override the readObject/writeObject methods to call the constructor. If the mGenerator object itself is stateless (and you'll have other problems if it's not), the serialization code should get called only once when jobs are distributed to taskmanagers.
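A rough sketch of that wrapper idea, using plain Java serialization and no Flink classes. The names `GeneratorHolder`, `Generator`, and `roundTrip` are hypothetical stand-ins, and the sketch assumes the configuration can be carried as a String.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class GeneratorHolder implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Hypothetical stand-in for the non-serializable dependency. */
    static final class Generator {
        final String config;

        Generator(String config) {
            this.config = config;
        }
    }

    private final String configJson;       // serializable input needed to rebuild the object
    private transient Generator generator; // skipped by Java serialization

    public GeneratorHolder(String configJson) {
        this.configJson = configJson;
        this.generator = new Generator(configJson);
    }

    public Generator get() {
        return generator;
    }

    /** Called by Java deserialization: restore the fields, then rebuild the transient object. */
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        this.generator = new Generator(configJson);
    }

    /** Serializes and deserializes a holder, roughly what shipping it to a worker involves. */
    static GeneratorHolder roundTrip(GeneratorHolder holder) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(holder);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (GeneratorHolder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Only `configJson` travels over the wire. The expensive constructor runs again on the receiving side inside `readObject`, so the slow start-up cost is moved rather than removed.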
Using open is usually the right place to load external lookup sources. The timeout is a bit odd, maybe there is a configuration around it.
However, if it's huge, using a static loader (either a static class as you did, or a singleton) has the benefit that you only need to load it once for all parallel instances of the task on the same task manager. Hence, you save memory and CPU time. This is especially true for you, as you use the same data structure in two separate tasks. Further, the static loader can be lazily initialized when it's used for the first time, to avoid the timeout in open.
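One common way to get such a lazily initialized static loader in Java is the initialization-on-demand holder idiom, sketched below. `LazyGenerator`, `Generator`, and the `CONSTRUCTED` counter are hypothetical names used only to show the behavior; the real dependency would take their place.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyGenerator {
    /** Counts constructions so the once-only behavior can be observed. */
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();

    /** Hypothetical stand-in for the slow, non-serializable dependency. */
    static final class Generator {
        Generator() {
            CONSTRUCTED.incrementAndGet(); // the expensive loading would happen here
        }
    }

    /** The JVM initializes this nested class on first access, once and thread-safely. */
    private static final class Holder {
        static final Generator INSTANCE = new Generator();
    }

    /** First caller pays the construction cost; later callers get the same instance. */
    public static Generator get() {
        return Holder.INSTANCE;
    }
}
```

Calling `LazyGenerator.get()` from inside `flatMap` rather than from `open` means nothing is constructed until the first record arrives, and nothing non-serializable is captured by the function object.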
The clear downside of this approach is that the testability of your code suffers. There are some ways around that, which I could expand if there is interest.
I don't see a benefit of using the proxy serializer pattern. It's unnecessarily complex (custom serialization in Java) and offers little benefit.

Java-EE database connection pool runs out of max

I have a default standalone.xml configuration where there is a maximum of 20 connections to be active at the same time in the pool of connections to the database. With good reasons, I guess. We run an Oracle database.
There's a reasonable amount of database traffic as there is third party API traffic, e.g. SOAP and HTTP calls in the enterprise application I'm developing.
We often do something like the following:
@PersistenceContext(unitName = "some-pu")
private EntityManager em;
public void someBusinessMethod() {
someEntity = em.findSomeEntity();
soap.callEndPoint(someEntity.getSomeProperty()); // may take up to 1 minute
em.update(someEntity);
cdiEvent.fire(finishedBusinessEvent);
}
However, in this case the database connection is acquired when the entity is fetched and is released after the update (actually when the entire transaction is done). About transactions, everything is container managed, no additional annotations. I know that you shouldn't "hold" the database connection longer than necessary, and this is exactly what I'm trying to solve. For one I wouldn't know how to programmatically release the connection nor do I think it would be a good idea, because you still want to be able to roll back for the entire transaction.
So? How to attack this problem? There's a number of options I tried:
Option 1, using ManagedExecutorService:
@Resource
private ManagedExecutorService mes;
public void someBusinessMethod() {
someEntity = em.findSomeEntity();
this.mes.submit(() -> {
soap.callEndPoint(someEntity.getSomeProperty()); // may take up to 1 minute
em.update(someEntity);
cdiEvent.fire(finishedBusinessEvent);
});
}
Option 2, using @Asynchronous:
@Inject
private AsyncBean asyncBean;
public void someBusinessMethod() {
someEntity = em.findSomeEntity();
this.asyncBean.process(someEntity);
}
public class AsyncBean {
@Asynchronous
public void process(SomeEntity someEntity) {
soap.callEndPoint(someEntity.getSomeProperty()); // may take up to 1 minute
em.update(someEntity);
cdiEvent.fire(finishedBusinessEvent);
}
}
This did in fact solve the database connection pooling issue, i.e. the connection is released as soon as the soap.callEndPoint call happens. But it did not feel really stable (I can't pinpoint the problems here). And of course the transaction is finished once you enter the async processing, so whenever something went wrong during the SOAP call, nothing was rolled back.
Wrapping up...
I'm about to move the long-running IO tasks (SOAP and HTTP calls) to a separate part of the application, offloaded via queues, with the results fed back into the application via queues as well. In this case everything is done via transactions and no connections are held up. But this is a lot of overhead, so before doing so I'd like to hear your opinions and best practices on how to solve this problem!
Your queue solution is viable, but perhaps not necessary. If you only perform read operations before your calls, you could split the transaction into two transactions (as you would also do with the queue) by using a DAO pattern.
Example:
@EJB // injects the DAO; the DaoBean class itself carries @Stateless
private DaoBean dao;
@TransactionAttribute(TransactionAttributeType.NEVER)
public void someBusinessMethod() {
Entity e = dao.getEntity(); // creates and discards TX
e = soap.callEndPoint(e.getSomeProperty());
dao.update(e); // creates TX 2 and commits
}
This solution has a few caveats.
The business method above can not be called while a transaction is already active because it would negate the purpose of the DAO (one TX suspended with NOT_SUPPORTED).
You will have to handle or ignore the possible changes that could have occurred on the entity during the soap call (@Version ...).
The entity will be detached in the business method, so you will have to eager load everything you need in the soap call.
I can't tell you if this would work for you as it depends on what is done before the business call. While still complex, it would be easier than a queue.
You were kind of heading down the right track with Option 2, it just needs a little more decomposition to get the transaction management happening in a way that keeps them very short.
Since you have a potentially long running web service call you're definitely going to need to perform your database updates in two separate transactions:
short find operation
long web service call
short update operation
This can be accomplished by introducing a third EJB as follows:
Entry point
@Stateless
public class MyService {
@Inject
private AsyncService asyncService;
@PersistenceContext
private EntityManager em;
/*
* Short lived method call returns promptly
* (unless you need a fancy multi join query)
* It will execute in a short REQUIRED transaction by default
*/
public void someBusinessMethod(long entityId) {
SomeEntity someEntity = em.find(SomeEntity.class, entityId);
asyncService.process(someEntity);
}
}
Process web service call
@Stateless
public class AsyncService {
@Inject
private BusinessCompletionService businessCompletionService;
@Inject
private SomeSoapService soap;
/*
* Long lived method call with no transaction.
*
* Asynchronous methods are effectively run as REQUIRES_NEW
* unless it is disabled.
* This should avoid transaction timeout problems.
*/
@Asynchronous
@TransactionAttribute(TransactionAttributeType.NOT_SUPPORTED)
public void process(SomeEntity someEntity) {
soap.callEndPoint(someEntity.getSomeProperty()); // may take up to 1 minute
businessCompletionService.handleBusinessProcessCompletion(someEntity);
}
}
Finish up
@Stateless
public class BusinessCompletionService {
@PersistenceContext
private EntityManager em;
@Inject
@Any
private Event<BusinessFinished> businessFinishedEvent;
/*
* Short lived method call returns promptly.
* It defaults to REQUIRED, but will in effect get a new transaction
* for this scenario.
*/
public void handleBusinessProcessCompletion(SomeEntity someEntity) {
someEntity.setSomething(SOMETHING);
someEntity = em.merge(someEntity);
// you may have to deal with optimistic locking exceptions...
businessFinishedEvent.fire(new BusinessFinished(someEntity));
}
}
I suspect that you may still need some connection pool tuning to cope effectively with your peak load. Monitoring should clear that up.

How to know when Google App Engine datastore operations are complete

I execute Datastore.delete(key) from my GWT web application, and the AsyncCallback's onSuccess() method is called. Then I immediately refresh http://localhost:8888/_ah/admin, and the entity I intended to delete still exists. Similarly, when I immediately refresh my GWT web application, the item I intended to delete is still shown on the page, even though onSuccess() has been called.
So, how can I know when the entity has actually been deleted?
public void deleteALocation(int removedIndex,String symbol ){
if(Window.confirm("Sure ?")){
System.out.println("XXXXXX " +symbol);
loCalservice.deletoALocation(symbol, callback_delete_location);
}
}
public AsyncCallback<String> callback_delete_location = new AsyncCallback<String>() {
public void onFailure(Throwable caught) {
Window.alert(caught.getMessage());
}
public void onSuccess(String result) {
// TODO Auto-generated method stub
int removedIndex = ArryList_Location.indexOf(result);
ArryList_Location.remove(removedIndex);
LocationTable.removeRow(removedIndex + 1);
//Window.alert(result+"!!!");
}
};
Server:
public String deletoALocation(String name) {
// TODO Auto-generated method stub
Transaction tx = Datastore.beginTransaction();
Key key = Datastore.createKey(Location.class,name);
Datastore.delete(tx,key);
tx.commit();
return name;
}
Sorry, I'm not good at English :-)
According to the docs
Returns the Key object (if one model instance is given) or a list of Key objects (if a list of instances is given) that correspond with the stored model instances.
If you need an example of a working delete function, this might help. Line 108
class DeletePost(BaseHandler):
    def get(self, post_id):
        iden = int(post_id)
        post = db.get(db.Key.from_path('Posts', iden))
        db.delete(post)
        return webapp2.redirect('/')
How do you check the existence of the entity? Via a query?
Queries on HRD are eventually consistent, meaning that if you add/delete/change an entity then immediately query for it you might not see the changes. The reason for this is that when you write (or delete) an entity, GAE asynchronously updates the index and entity in several phases. Since this takes some time it might happen that you don't see the changes immediately.
The linked article discusses ways to mitigate this limitation.

Hibernate 2nd level cache invalidation when another process modifies the database

We have an application that uses Hibernate's 2nd level caching to avoid database hits.
I was wondering if there is some easy way to invalidate the Java application's Hibernate 2nd level cache when an outside process such as a MySQL administrator directly connected to modify the database (update/insert/delete).
We are using EHCache as our 2nd level cache implementation.
We use a mix of @Cache(usage = CacheConcurrencyStrategy.READ_WRITE) and @Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE), and we don't have optimistic concurrency control using timestamps enabled on each entity.
The SessionFactory contains methods to manage the 2nd level cache:
- Managing the Caches
sessionFactory.evict(Cat.class, catId); //evict a particular Cat
sessionFactory.evict(Cat.class); //evict all Cats
sessionFactory.evictCollection("Cat.kittens", catId); //evict a particular collection of kittens
sessionFactory.evictCollection("Cat.kittens"); //evict all kitten collections
But because we annotate individual entity classes with @Cache, there's no central place for us to "reliably" (e.g. no manual steps) add that to the list.
// Easy to forget to update this to properly evict the class
public static final Class[] cachedEntityClasses = {Cat.class, Dog.class, Monkey.class};
public void clear2ndLevelCache() {
SessionFactory sessionFactory = ... //Retrieve SessionFactory
for (Class entityClass : cachedEntityClasses) {
sessionFactory.evict(entityClass);
}
}
There's no real way for Hibernate's 2nd level cache to know that an entity changed in the DB unless it queries that entity (which is what the cache is protecting you from). So maybe as a solution we could simply call some method to force the second level cache to evict everything (again, because of the lack of locking and concurrency control, you risk in-progress transactions reading or updating stale data).
Based on ChssPly76's comments here's a method that evicts all entities from 2nd level cache (we can expose this method to admins through JMX or other admin tools):
/**
* Evicts all second level cache hibernate entities. This is generally only
* needed when an external application modifies the game database.
*/
public void evict2ndLevelCache() {
try {
Map<String, ClassMetadata> classesMetadata = sessionFactory.getAllClassMetadata();
for (String entityName : classesMetadata.keySet()) {
logger.info("Evicting Entity from 2nd level cache: " + entityName);
sessionFactory.evictEntity(entityName);
}
} catch (Exception e) {
logger.logp(Level.SEVERE, "SessionController", "evict2ndLevelCache", "Error evicting 2nd level hibernate cache entities: ", e);
}
}
SessionFactory has plenty of evict() methods precisely for that purpose:
sessionFactory.evict(MyEntity.class); // remove all MyEntity instances
sessionFactory.evict(MyEntity.class, new Long(1)); // remove a particular MyEntity instance
Both hibernate and JPA now provide direct access to the underlying 2nd level cache:
sessionFactory.getCache().evict(..);
entityManager.getEntityManagerFactory().getCache().evict(..);
I was searching how to invalidate all Hibernate caches and I found this useful snippet:
sessionFactory.getCache().evictQueryRegions();
sessionFactory.getCache().evictDefaultQueryRegion();
sessionFactory.getCache().evictCollectionRegions();
sessionFactory.getCache().evictEntityRegions();
Hope it helps to someone else.
You may try doing this:
private EntityManager em;
public void clear2ndLevelHibernateCache() {
Session s = (Session) em.getDelegate();
SessionFactory sf = s.getSessionFactory();
sf.getCache().evictQueryRegions();
sf.getCache().evictDefaultQueryRegion();
sf.getCache().evictCollectionRegions();
sf.getCache().evictEntityRegions();
return;
}
I hope It helps.
One thing to take into account when using a distributed cache is that the QueryCache is local, and evicting it on one node does not evict it from the others. Another issue: evicting the entity region without evicting the query region will cause N+1 selects when trying to retrieve data from the query cache. Good readings on this topic here.

Resources