I need a locking mechanism in memcache. Since all operations are atomic, this should be an easy task. My idea is to use a basic spin-lock: every object that needs locking gets a lock object in memcache, which is polled for access.
// pseudocode
// try to acquire the lock
int lock;
do
{
    lock = Memcache.increment("lock", 1);
}
while (lock != 1);
// OK, we got the lock
// do something here
// and finally unlock
Memcache.put("lock", 0);
How does such a solution perform? Do you have a better idea how to lock a memcache object?
Best regards,
Friedrich Schick
Be careful. You could potentially burn through a lot of your quota in that loop.
Locking is generally a bad idea - and in your example, will result in a busy-wait loop that consumes huge amounts of quota.
What do you need locking for? Perhaps we can suggest a better alternative.
If you really do need a loop: don't busy-wait, but include a delay, possibly with exponential back-off:
int lock;
int delay = 100; // microseconds
do {
    lock = Memcache.increment("lock", 1);
    if (lock != 1) {
        usleep(delay);                  // don't sleep once the lock is held
        delay = min(delay * 2, 100000);
    }
} while (lock != 1);
(Note that increment returns the new counter value, so test for == 1 rather than for truthiness; a while (!lock) test would exit on the very first iteration.)
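The same back-off idea can be sketched in Python; adding random jitter keeps competing clients from retrying in lock-step. FakeMemcache below is a hypothetical in-memory stand-in for a real memcache client, included only to make the sketch self-contained; the semantics assumed are that increment returns the new counter value, creating the counter if it is missing.

```python
import random
import time

class FakeMemcache:
    """Hypothetical in-memory stand-in for a real memcache client."""
    def __init__(self):
        self.store = {}

    def incr(self, key, delta=1, initial=0):
        # A real memcache increments atomically; here we just simulate it.
        self.store[key] = self.store.get(key, initial) + delta
        return self.store[key]

    def set(self, key, value):
        self.store[key] = value

def acquire_lock(mc, key="lock", max_delay=0.1):
    """Spin until incr() returns 1, backing off exponentially with jitter."""
    delay = 0.0001
    while mc.incr(key) != 1:
        time.sleep(delay + random.uniform(0, delay))  # jitter avoids lock-step retries
        delay = min(delay * 2, max_delay)

def release_lock(mc, key="lock"):
    mc.set(key, 0)

mc = FakeMemcache()
acquire_lock(mc)      # counter goes 0 -> 1: we hold the lock
release_lock(mc)      # counter back to 0: lock free
```

This still spins, just more politely; it only bounds the damage of the busy-wait rather than eliminating it.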
All operations on memcache are atomic, as you said. To echo the other responses: do not use a naive spin lock on App Engine; you'll use up your daily quota in about 20 minutes. Now, to your solution:
I've done something like this. I created a task queue with a bucket size of 1 and an execution rate of 1/10s (one task per 10 seconds). I used this queue for "spinning", except that it has the advantage of only checking once per 10 seconds. I'm not sure what your use case is, but even executing a task once per second is far better than spinning in a loop. So you implement a task servlet that checks the status of the lock and, if it is free, does whatever you want to do.
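Another common alternative, sketched here under the assumption that your client exposes memcache's atomic add() (which fails if the key already exists): a single add() attempt with an expiry both avoids spinning and guards against a crashed lock holder. FakeCache is a hypothetical stand-in so the sketch runs on its own.

```python
import time

class FakeCache:
    """Hypothetical in-memory stand-in for a memcache client with add()."""
    def __init__(self):
        self.store = {}   # key -> (value, expiry_timestamp)

    def add(self, key, value, expiry):
        # Real memcache add() is atomic: it fails if the key already exists.
        now = time.time()
        entry = self.store.get(key)
        if entry is not None and entry[1] > now:
            return False
        self.store[key] = (value, now + expiry)
        return True

    def delete(self, key):
        self.store.pop(key, None)

def try_lock(cache, key="lock", expiry=30):
    """One atomic attempt, no spinning; expiry frees the lock if the holder dies."""
    return cache.add(key, 1, expiry)

cache = FakeCache()
assert try_lock(cache) is True     # acquired
assert try_lock(cache) is False    # already held
cache.delete("lock")               # unlock
assert try_lock(cache) is True     # free again
```

With try_lock() the caller decides what to do on failure (defer the work, enqueue a task, report contention) instead of burning quota in a loop.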
I am having some trouble understanding how windowing is implemented internally in Flink and could not find any article that explains this in depth. To my mind, there are two ways this could be done. Consider a simple windowed word-count, as below:
env.socketTextStream("localhost", 9999)
   .flatMap(new Splitter())
   .groupBy(0)
   .window(Time.of(500, TimeUnit.SECONDS)).sum(1)
Method 1: Store all events for 500 seconds and at the end of the window, process all of them by applying the sum operation on the stored events.
Method 2: We use a counter to store a rolling sum for every window. As each event in a window comes, we do not store the individual events but keep adding 1 to previously stored counter and output the result at the end of the window.
Could someone kindly help me understand which of the above methods (or maybe a different approach) Flink actually uses? There are pros and cons to both approaches, and understanding this is important in order to configure the cluster's resources correctly.
For example, Method 1 seems very close to batch processing and might cause a processing spike at every 500-second interval while sitting idle otherwise, whereas Method 2 would need to maintain a shared counter across all task managers.
sum is a reducing function, as described here: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/#reducefunction. Internally, Flink applies the reduce function to each input element as it arrives and simply saves the reduced result in a ReducingState.
For other window functions, such as window.apply(WindowFunction), there is no incremental aggregation, so all input elements are saved in a ListState.
The window documentation describes how window elements are handled internally in Flink: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/#window-functions.
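The difference between the two state strategies can be illustrated with a toy Python sketch (this is not Flink code; it just mimics what the two kinds of window state keep around):

```python
# Incremental aggregation (what Flink does for sum/reduce):
# state per window is a single value, updated as each element arrives.
def windowed_sum_incremental(events):
    state = 0                # analogous to a ReducingState entry: constant size
    for count in events:
        state = state + count
    return state

# Buffering (what a general WindowFunction implies):
# every element is kept until the window fires, then processed at once.
def windowed_sum_buffered(events):
    buffer = list(events)    # analogous to ListState: grows with the window
    return sum(buffer)

events = [1, 1, 1, 1]        # four occurrences of one word in the window
assert windowed_sum_incremental(events) == windowed_sum_buffered(events) == 4
```

Both produce the same result, but the incremental version needs O(1) state per key and window, while the buffered version needs O(n), which is exactly why the resource implications in the question differ between the two.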
I am trying to shape a JMeter test involving a Concurrency Thread Group and a Throughput Shaping Timer, as documented here and here. The timer is configured to run ten ramps and stages with RPS from 1 to 333.
I want to set up the Concurrency Thread Group to use the schedule feedback function, and I added the formula in the Target Concurrency field (updating the example from tst-name to the actual timer name). Ramp-Up Time and Ramp-Up Steps Count I set to 1, as I assume these properties are not that important if throughput is managed by the timer; Hold Target Rate Time is 8000, which is longer than the total of the stages added in the timer (6200).
When I run the test, it ends without any exceptions within 3 seconds or so. The log file shows a few rows about starting and ending threads but nothing alarming.
The only thing I find suspicious is the log entry "VirtualUserController: Test limit reached, thread is done", plus the thread name.
I am not getting enough clues from the documentation linked here to figure this out myself, do you have any hints?
According to the documentation, Ramp-Up Time and Ramp-Up Steps Count should be blank:
"When using this approach, leave Concurrency Thread Group Ramp Up Time and Ramp-Up Steps Count fields blank."
So your assumption that setting them to 1 is OK seems to be false...
I want to test how many subscribers I can connect to a publisher that sends out messages quickly, but not at maximum speed, e.g. one message every microsecond.
The reason is that if I send out messages at maximum speed, I lose messages at the receiver (high-water mark).
I thought I could use nanosleep(), and it works nicely at 20 messages per second (sleep: 50,000,000 ns). But with shorter sleep times it gets worse: 195 messages per second (5,000,000 ns), 1,700 (500,000 ns), 16,000 (50,000 ns). With even shorter sleep times I don't really get more messages; the sleep function itself seems to take some time, which I can see when printing out timestamps.
So I think this is the wrong way to run a function at a specific rate, but I didn't find another way to do it.
Is there a possibility to send out roughly 1,000,000 messages a second?
Q: How to send out messages with a defined rate?
Given API v4.2.3+, one can use the { pgm:// | epgm:// } transport classes for this very purpose and set an adequately tuned .setsockopt( ZMQ_RATE, <kbps> ), plus apply some additional performance-related tuning of buffer sizing ( ZMQ_SNDBUF, ZMQ_IMMEDIATE, ZMQ_AFFINITY, ZMQ_TOS and ZMQ_MULTICAST_MAXTPDU ) and priority mapping etc., so as to safely get as close to the hardware limits as needed.
Q: Is there a possibility to send out roughly 1,000,000 messages a second?
Well, given a budget of not more than about 1,000 ns per message-to-wire dispatch, careful engineering is required.
The best candidate for such a rate would be the inproc:// transport class, as it does not depend on the performance of the Context instance's I/O thread(s) or on bottlenecks in the external O/S scheduler (and will definitely work faster than any other available transport class). Whether it can meet the required sub-1,000 ns latency still depends on your application design and message sizes (zero-copy being our friend here in meeting the latency deadline).
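Independent of the transport class, the rate problem described in the question (sleep overhead eating into every period) can be reduced by pacing against absolute deadlines rather than sleeping a fixed interval each time. A minimal Python sketch of that idea, with a no-op standing in for the actual send call:

```python
import time

def paced_calls(rate_hz, duration_s, action):
    """Invoke action() at rate_hz for duration_s, pacing off absolute deadlines.

    Sleeping "until a timestamp" instead of "for an interval" means the
    overhead of each individual sleep no longer accumulates into rate drift.
    """
    period = 1.0 / rate_hz
    start = time.perf_counter()
    next_deadline = start
    calls = 0
    while next_deadline - start < duration_s:
        remaining = next_deadline - time.perf_counter()
        if remaining > 0:
            time.sleep(remaining)        # sleep only the gap that is left, if any
        action()
        calls += 1
        next_deadline += period          # absolute schedule: late wake-ups self-correct
    return calls

count = paced_calls(1000, 0.1, lambda: None)   # roughly 100 calls in 0.1 s
```

When a sleep overshoots, the next deadline is already in the past, so the loop catches up by sending without sleeping; the average rate holds even though individual intervals jitter. At the 1 MHz target, of course, no sleep-based scheme is fine-grained enough and the loop degenerates into batched catch-up sends.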
I'm not an expert in Network Programming. I basically have two kinds of clients who have different time-outs. I am supposed to use UDP with connected sockets for client-server communication.
The problem is twofold:
a) I need to mark as dead whichever client (equivalently, socket) does not respond for t1 seconds. select() only times out if none of the sockets in read_fd_set has anything to read within the timeout value. So how do I time out a single socket that has had no data to read for some time?
Currently, whenever select() returns, I keep track of which sockets responded and which did not, and I add t1.tv_sec to the elapsed time of each idle client (socket). Then I manually close, and exclude from the fd_set, any socket that does not respond for (n) * (t1.tv_sec) time. Is this a good enough approach?
b) The main problem is that there are two kinds of clients which have different time-outs, t1 and t2. How do I handle this?
Can I have two select()s for the two kinds of clients in the same loop? Would it cause starvation without threads? Is using threads advisable (or even required) in this case?
I've been roaming around the web for ages!
Any help is much appreciated.
This is just a special case of a very common pattern, where a select/poll loop is associated with a collection of timers.
You can use a priority queue of tasks, ordered on next (absolute) firing time; the select timeout is always then just the absolute time at the front of the queue.
When select() times out (and just before the next iteration, if your tasks may take a long time to complete), get the current time, pull every task that should already have executed off the queue, and execute it.
Some tasks will need to be re-scheduled, so make sure they can mutate the priority queue while you do this.
Then your logic is trivial:
on read, mark the socket busy
on timer execution, mark the socket idle
if it was already idle, that means nothing was received since the last timer expiry: it's dead
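A minimal Python sketch of such a timer queue, using heapq for the priority queue; time.sleep() stands in for the select() timeout here so the sketch is self-contained:

```python
import heapq
import time

class TimerQueue:
    """Priority queue of (deadline, callback) entries driving a select() loop timeout."""
    def __init__(self):
        self._heap = []

    def schedule(self, delay, callback):
        # id(callback) breaks ties so callbacks themselves are never compared
        heapq.heappush(self._heap, (time.monotonic() + delay, id(callback), callback))

    def next_timeout(self):
        """Timeout to pass to select(): time until the earliest deadline, or None."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - time.monotonic())

    def run_expired(self):
        """Pop and run every timer whose deadline has passed."""
        now = time.monotonic()
        while self._heap and self._heap[0][0] <= now:
            _, _, cb = heapq.heappop(self._heap)
            cb(self)   # callback receives the queue so it can re-schedule itself

fired = []
tq = TimerQueue()
tq.schedule(0.01, lambda q: fired.append("t1"))
tq.schedule(0.02, lambda q: fired.append("t2"))
while tq.next_timeout() is not None:
    time.sleep(tq.next_timeout())   # in the real loop: select(..., timeout=this)
    tq.run_expired()
```

Per-client timeouts of different lengths (the t1/t2 case from the question) then fall out for free: each socket just schedules its own timer with its own delay, and a single select() serves them all.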
A quick solution that comes to my mind, is to keep the sockets in a collection sorted by the time remaining until the nearest timeout.
Use select with the timeout set to the smallest time remaining, remove/close/delete the timed-out socket from the collection, and repeat.
So, in pseudo-code it might look like this:
C = collection of structs (socket, timeout, time_remaining := timeout)
while (true) {
    sort_the_collection_by_time_remaining
    next_timeout = min(time_remaining in C)   // the front of the sorted collection
    select(sockets in C, next_timeout)
    update_all_time_remaining_values
    remove_from_C_if_required                 // if a timeout occurred
}
This can easily be solved with a single select() call. For each socket, keep two values related to the timeout: the actual timeout, and the time remaining until that timeout. Then count down the time remaining every 0.1 second (or similar), and when it reaches zero, close the socket. If the socket receives traffic before the timeout, simply reset the time remaining to the timeout value and start the countdown again.
Is it a good idea to call a recursive function inside a thread?
I am creating 10 threads, and each thread function in turn calls a recursive function. The bad part is:
ThreadFunc()
{
    for (; condn; )
        recursiveFunc(objectId);
}

bool recursiveFunc(objectId)
{
    // Get an instance of the database connection
    // Query for attributes of this objectId
    if (attributes satisfy some condition)
        return true;
    else
        return recursiveFunc(objectId); // that's the next level of objectId
}
The recursive function makes some calls to the database.
My guess is that the call to the recursive function inside a loop is causing the performance degradation. Can anyone confirm?
Calling a function recursively inside a thread is not a bad idea per se. The only thing you have to be aware of is to limit the recursion depth, or you may produce a (wait for it...) stack overflow. This is not specific to multithreading but applies in any case where you use recursion.
In this case, I would recommend against recursion because it's not necessary. Your code is an example of tail recursion, which can always be replaced with a loop. This eliminates the stack overflow concern:
bool recursiveFunc(objectId)
{
    do
    {
        // Get an instance of the database connection
        // Query for attributes of this objectId
        // Update objectId if necessary (not sure what the "next level of objectId" is)
    }
    while (!(attributes satisfy some condition));
    return true;
}
There's no technical reason why this wouldn't work - it's perfectly legal.
Why is this code the "bad part"?
You'll need to debug/profile this and recursiveFunc to see where the performance degradation is.
Going by the code you've posted, have you checked that condn is ever satisfied so that your loop terminates? If not, it will loop forever.
Also what does recursiveFunc actually do?
UPDATE
Based on your comment that each thread performs 15,000 iterations, the first thing I'd do would be to move the "get an instance of the database connection" code outside recursiveFunc, so that you only get it once per thread.
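A Python sketch of that refactoring, with the tail recursion replaced by a loop and a hypothetical FakeConnection standing in for the database, to show the connection being acquired once per thread rather than once per call:

```python
class FakeConnection:
    """Hypothetical stand-in for a database connection."""
    calls = 0   # counts how many connections were ever opened

    def __init__(self):
        FakeConnection.calls += 1
        self.closed = False

    def query(self, object_id):
        return object_id % 2 == 0      # pretend even ids satisfy the condition

    def close(self):
        self.closed = True

def check_attributes(conn, object_id):
    """Loop until the attributes satisfy the condition (was the tail recursion)."""
    while not conn.query(object_id):
        object_id += 1                 # the "next level" of objectId (assumed)
    return object_id

def thread_func(object_ids):
    conn = FakeConnection()            # acquired once per thread, not once per call
    try:
        return [check_attributes(conn, oid) for oid in object_ids]
    finally:
        conn.close()

results = thread_func([1, 2, 3])       # one connection serves all iterations
```

How objectId advances to the "next level" is not shown in the question, so the increment here is purely illustrative; the point is only that connection acquisition sits outside both the loop and the per-object check.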
Even if you rewrite into a loop (as per Martin B's answer) you would still want to do this.
It depends on how the recursive function talks to the database. If each (or many) level of recursion reopens the database connection, that can be the reason for the degradation. If they all share the same connection, the problem is not the recursion but the number of threads concurrently accessing the database.
The only potential problem I see with the posted code is that it can turn into an infinite loop, and that's usually not what you want; you'd have to force a break on some reachable condition to avoid having to abort the whole application just to break out of the thread.
Performance degradation can happen with threading, recursion, and database access alike, for a variety of reasons.
Whether any or all of them are at fault for your problems is impossible to ascertain from the little you're showing us.