How do I perform an atomic operation using the Datomic database?

TL;DR: I want an operation like "update Y ONLY IF Y=10", which otherwise fails.
Example: imagine a timeline with points T1, T2 and T3. At T1 the entity X has the attribute Y=10; at T2 the attribute is updated to Y=14. My aim is to apply a complex operation to Y (assume the operation adds 1). I read the value of Y at T1, which is 10, and place it in a queue for processing. At T3, when the complex operation has completed and the result is 11, I want to update the attribute Y. If I simply update the attribute, the value Y=14 written at T2 would be mistakenly discarded. So at T3, before updating, I want to be sure that the current value is still Y=10; otherwise I have to read the new value (14) and reprocess it.
I know about Database Functions for atomic read-modify-update processing, but this approach is not suitable when the operation is complex and needs to be done in a distributed fashion (after being put in a queue).
What I want is something equivalent to Conditional Writes in DynamoDB.

You could run the expensive process peer-side, validate it against a certain basis T, and then check the basis T of the database in the transaction. That way the computationally complex or expensive code is handled peer-side and the transaction function is only responsible for the basis-T validation.
For anything that matches the standard use case (e.g., your example in your description), Database Functions are the correct and canonical answer.
The built-in :db.fn/cas function is available and is now documented at http://docs.datomic.com/transactions.html
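For the scenario in the question, a compare-and-swap transaction from a JVM peer could look roughly like this (a minimal Scala sketch; the connection URI, entity id, and attribute name :entity/y are made up for illustration):

import datomic.{Peer, Util}

// Hypothetical connection URI, entity id and attribute name.
Peer.createDatabase("datomic:mem://example")
val conn = Peer.connect("datomic:mem://example")
val entityId = java.lang.Long.valueOf(17592186045418L) // placeholder for an existing entity id

// :db.fn/cas takes [entity attribute old-value new-value] and aborts the whole
// transaction if the attribute's current value is no longer old-value (here: 10).
val tx = Util.list(
  Util.list(":db.fn/cas", entityId, ":entity/y",
    Integer.valueOf(10), Integer.valueOf(11)))

conn.transact(tx).get() // throws if Y is not 10 anymore, so the caller can re-read and reprocess

The expensive computation still runs peer-side; only the cheap compare-and-swap happens inside the transaction.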

With regard to Felipe's comment about "a transaction function that inserts only if ... or throws an exception", can you not just use the built-in one, :db.fn/cas?

Related

Flink: handle skew by partitioning by a field of the key

I have skew when I keyBy on my data. Let's say the key is:
case class MyKey(x: X, y: Y)
To solve this I am thinking of adding an extra field that would make distribution even among the workers by using this field only for partitioning:
class MyKeyWithSalt(val z: EvenlyDistributedField, x: X, y: Y) extends MyKey(x, y) {
  // partition only by the evenly distributed field; equality still comes from MyKey(x, y)
  override def hashCode(): Int = z.hashCode
}
Due to this, my records will use the overridden hashCode and be distributed evenly across the workers, while the original equals method (which takes into consideration only the X and Y fields) is used to find the proper keyed state in later stateful operators.
I know that the same (X, Y) pairs will end up in different workers, but I can handle that later (after doing the necessary processing with my new key to avoid skew).
My question is: where else is the hashCode method of the key used?
I suspect it is certainly used when getting keyed state (what is a namespace, by the way?), as I saw subclasses use the key in a hash map to get the state for this key. I know that retrieving the KeyedState from the map will be slower, as the hashCode will not consider the X and Y fields. But is there any other place in the Flink code that uses the hashCode method of the key?
Is there any other way to solve this? I thought of physical partitioning, but then I cannot use keyBy as well, as far as I know.
SUMMING UP I WANT TO:
partition my data in each worker randomly to produce an even distribution
[EDITED] do a .window().aggregate() in each partition independently of the others (as if the others don't exist). The data in each window aggregate should be keyed on the (X, Y)s of this partition, ignoring the same (X, Y) keys in other partitions.
merge the conflicts caused by the same (X, Y) pairs appearing in different partitions later (for this I need no guidance; I just do a new keyBy on (X, Y))
In this situation I usually create a transient Tuple2<MyKey, Integer>, where I fill in the Tuple.f1 field with whatever I want to use to partition by. The map or flatMap operation following the .keyBy() can emit MyKey. That avoids mucking with MyKey.hashCode().
And note that having a different set of fields for the hashCode() vs. equals() methods leads to pain and suffering. Java has a contract that says "equals consistency: objects that are equal to each other must return the same hashCode".
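A tiny illustration of that pain (class and field names are made up): two keys that are equal but hash differently will simply miss each other in any hash-based lookup:

class BadKey(val x: Int, val salt: Int) {
  override def equals(o: Any): Boolean = o match {
    case k: BadKey => k.x == x        // equality ignores salt
    case _         => false
  }
  override def hashCode(): Int = salt // hash uses only salt
}

val m = new java.util.HashMap[BadKey, String]()
m.put(new BadKey(1, 7), "state")
println(m.get(new BadKey(1, 8)))      // null: the equal key hashes differently and is never found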
[updated]
If you can't offload a significant amount of unkeyed work, then what I would do is...
Set the Integer in the Tuple2<MyKey, Integer> to be hashCode(MyKey) % <operator parallelism * factor>. Assuming your parallelism * factor is high enough, you'll only get a few cases of 2 (or more) of the groups going to the same sub-task.
In the operator, use MapState<MyKey, value> to store state. You'll need this since you'll get multiple unique MyKey values going to the same keyed group.
Do your processing and emit a MyKey from this operator.
By using hashCode(MyKey) % some value, you should get a pretty good mix of unique MyKey values going to each sub-task, which should mitigate skew. Of course if one value dominates, then you'll need another approach, but since you haven't mentioned this I'm assuming it's not the case.
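A rough Scala sketch of those steps (the Event/MyKey field types, the running-sum state, and the parallelism/factor values are all assumptions for illustration):

import org.apache.flink.api.common.state.{MapState, MapStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

// Assumed types: the real MyKey/event classes will differ.
case class MyKey(x: String, y: String)
case class Event(key: MyKey, value: Long)

val parallelism = 10
val factor = 4

val env = StreamExecutionEnvironment.getExecutionEnvironment
val events: DataStream[Event] = env.fromElements(Event(MyKey("a", "b"), 1L)) // stand-in source

val result: DataStream[(MyKey, Long)] = events
  // key by a salted group id instead of MyKey itself
  .map(e => (math.abs(e.key.hashCode) % (parallelism * factor), e))
  .keyBy(_._1)
  .process(new KeyedProcessFunction[Int, (Int, Event), (MyKey, Long)] {
    // several distinct MyKey values share one keyed group, so keep per-key state in a MapState
    @transient private var perKey: MapState[MyKey, java.lang.Long] = _

    override def open(parameters: Configuration): Unit = {
      perKey = getRuntimeContext.getMapState(
        new MapStateDescriptor[MyKey, java.lang.Long]("per-key", classOf[MyKey], classOf[java.lang.Long]))
    }

    override def processElement(in: (Int, Event),
                                ctx: KeyedProcessFunction[Int, (Int, Event), (MyKey, Long)]#Context,
                                out: Collector[(MyKey, Long)]): Unit = {
      val sum = Option(perKey.get(in._2.key)).map(_.longValue).getOrElse(0L) + in._2.value
      perKey.put(in._2.key, sum)
      out.collect((in._2.key, sum)) // emits MyKey, ready for a later keyBy on (X, Y)
    }
  })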

Should All Parameters In A Prepared Statement Always Use Placeholders?

I have been attempting to find something that tells me one way or the other on how to write a prepared statement when a parameter is static.
Should all parameters always use placeholders, even when the value is always the same?
SELECT *
FROM student
WHERE admission_status = 'Pending' AND
gpa BETWEEN ? AND ?
I.e., in this example admission_status will never be anything but 'Pending', but gpa will change depending on user input or different method calls.
I know this isn't the best example, but the reason I ask is that I have found a noticeable difference in execution speed when replacing all static parameters that were using a placeholder with their static value counterparts in queries that are hundreds of lines long.
Is it acceptable to do this? Or does this go against the standards of prepared statement use? I would like to know one way or the other before I begin to "optimize" larger queries by testing new indexes and replacing the ?s with values to see if there is a boost in execution speed.
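For reference, the mixed form from the example would look like this in plain JDBC (a sketch; the connection URL and the GPA bounds are placeholders I made up):

import java.sql.DriverManager

// Hypothetical JDBC URL; only the values that actually vary are bound as placeholders.
val conn = DriverManager.getConnection("jdbc:postgresql://localhost/school")
val ps = conn.prepareStatement(
  "SELECT * FROM student WHERE admission_status = 'Pending' AND gpa BETWEEN ? AND ?")
ps.setDouble(1, 3.0) // lower GPA bound from user input
ps.setDouble(2, 4.0) // upper GPA bound from user input
val rs = ps.executeQuery()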

Apache flink complex analytics stream design & challenges

Problem statement:
Trying to evaluate Apache Flink for modelling advanced real-time, low-latency distributed analytics.
Use case abstract:
Provide complex analytics for instruments I1, I2, I3, ... each having a product definition P1, P2, P3; configured with dynamic user parameters U1, U2, U3; and requiring streaming market data M1, M2, M3, ...
Instrument analytics functions (A1, A2) are complex in terms of computational complexity; some of them could take 300-400 ms, but they can be computed in parallel.
From the above, the market data stream is clearly much faster (<1 ms) than the analytics functions, which need to consume the latest consistent market data for their calculations.
The next challenge is multiple dependent enrichment functions E1, E2, E3 (e.g. Risk/PnL) which combine streaming market data with instrument analytics results (e.g. price or yield).
The last challenge is consistency of the calculations: function A1 could be faster than A2, yet a consistent all-instrument result is needed for a given market input.
Calculation graph dependency examples (scale this to hundreds of instruments and 10-15 market data sources); the dependency flow is:
- M1 + M2 + P1 => A2
- M1 + P1 => A1
- A1 + A2 => E2
- A1 => E1
- E1 + E2 => Result numbers
Questions:
What is the correct design/model for these calculation data streams? Currently I use ConnectedStreams for (P1 + M1); another approach could be to use the iterative model, feeding the same instrument's static data back to itself again?
I am facing an issue using only the latest market data events in calculations, as the analytics function (A1) is a lot slower than the market data (M1) stream.
Hence I need stale market data to be evicted for the next iteration, retaining entries for which no newer value is available (LRU-cache-like).
I need to synchronize/correlate the execution of functions of different time complexity so that iteration 2 starts only when everything in iteration 1 has finished.
This is quite a broad question and to answer it more precisely, one would need a few more details.
Below are a few thoughts that I hope will point you in a good direction and help you to approach your use case:
Connected streams by key (a.keyBy(...).connect(b.keyBy(...))) are the most powerful join- or union-like primitive. Using a CoProcessFunction on a connected stream should give you the flexibility to correlate or join values as needed. You can, for example, store the events from one stream in state while waiting for a matching event to arrive from the other stream.
Always holding the latest data of one input is easily doable by just putting that value into the state of a CoFlatMapFunction or a CoProcessFunction. For each event from input 1, you store the event in the state. For each event from stream 2, you look into the state to find the latest event from stream 1.
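A Scala sketch of that pattern, keeping the latest market data per instrument and enriching the slower analytics input with it (type and field names are assumptions):

import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.co.CoProcessFunction
import org.apache.flink.util.Collector

// Hypothetical record types, both keyed by instrument id.
case class MarketData(instrument: String, price: Double)
case class AnalyticsRequest(instrument: String, params: String)
case class AnalyticsInput(request: AnalyticsRequest, latestMarket: MarketData)

class LatestMarketJoin
    extends CoProcessFunction[MarketData, AnalyticsRequest, AnalyticsInput] {

  @transient private var latest: ValueState[MarketData] = _

  override def open(parameters: Configuration): Unit = {
    latest = getRuntimeContext.getState(
      new ValueStateDescriptor[MarketData]("latest-market", classOf[MarketData]))
  }

  // Input 1: keep only the most recent market data per instrument key.
  override def processElement1(md: MarketData,
      ctx: CoProcessFunction[MarketData, AnalyticsRequest, AnalyticsInput]#Context,
      out: Collector[AnalyticsInput]): Unit =
    latest.update(md)

  // Input 2: enrich the request with whatever market data is currently stored.
  override def processElement2(req: AnalyticsRequest,
      ctx: CoProcessFunction[MarketData, AnalyticsRequest, AnalyticsInput]#Context,
      out: Collector[AnalyticsInput]): Unit = {
    val md = latest.value()
    if (md != null) out.collect(AnalyticsInput(req, md))
  }
}

// Usage (streams assumed): marketStream.keyBy(_.instrument)
//   .connect(requestStream.keyBy(_.instrument))
//   .process(new LatestMarketJoin)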
To synchronize on time, you could actually look into using event time. Event time can also be "logical time", meaning just a version number, iteration number, or anything. You only need to make sure that the timestamp you assign and the watermarks you generate reflect that consistently.
If you then window by event time, you will get all data of that version together, regardless of whether one operator is faster than others or the events arrive via paths with different latency. That is the beauty of real event-time processing :-)
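For example, treating an iteration/version number as the event timestamp might look like this (a sketch against the WatermarkStrategy API; the Calc type and its fields are made up, and versions are assumed to arrive in non-decreasing order):

import org.apache.flink.api.common.eventtime.{SerializableTimestampAssigner, WatermarkStrategy}
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

// Assumed record type: each calculation result carries the iteration/version it belongs to.
case class Calc(instrument: String, version: Long, value: Double)

val env = StreamExecutionEnvironment.getExecutionEnvironment
val calcs: DataStream[Calc] = env.fromElements(Calc("IBM", 1L, 101.5)) // stand-in source

val versioned = calcs.assignTimestampsAndWatermarks(
  WatermarkStrategy
    .forMonotonousTimestamps[Calc]()
    .withTimestampAssigner(new SerializableTimestampAssigner[Calc] {
      // use the logical version number as the event timestamp
      override def extractTimestamp(c: Calc, recordTs: Long): Long = c.version
    }))

// One event-time window per version number: all results of iteration N are grouped
// together, no matter how long the individual functions took.
val perVersion = versioned
  .keyBy(_.instrument)
  .window(TumblingEventTimeWindows.of(Time.milliseconds(1)))
  .reduce((a, b) => if (a.version >= b.version) a else b) // placeholder for the real aggregation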

Spark speed up multiple join operations

Suppose I have a rule like this:
p(v3,v4) :- t1(k1,v1), t2(k1,v2), t3(v1,v3), t4(v2,v4).
The task is to join t1, t2, t3, and t4 together to produce a relation p.
Suppose t1, t2, t3, and t4 already have the same partitioner for their keys.
A common strategy is to join the relations one by one, but it will force at least 3 shuffle/repartition operations. Details are below (suppose I have 10 partitions):
1. join: x = t1.join(t2)
2. repartition: x = x.map(lambda (k1, (v1,v2)): (v1,v2)).partitionBy(10)
3. join: x = x.join(t3)
4. repartition: x = x.map(lambda (v1, (v2,v3)): (v2,v3)).partitionBy(10)
5. join: x = x.join(t4)
6. repartition: x = x.map(lambda (v2, (v3,v4)): (v3,v4)).partitionBy(10)
Because t1 to t4 all have the same partitioner, and I repartition the intermediate result after every join, the join operations themselves will not involve any shuffle.
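For reference, the same chain in the Scala RDD API (a sketch; the stand-in relations t1..t4 and the local SparkContext are just placeholders for the real, already co-partitioned data):

import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("join-chain").setMaster("local[*]"))
val p = new HashPartitioner(10)

// Stand-in relations; in practice t1..t4 come from your data, already co-partitioned.
val t1 = sc.parallelize(Seq(("k1", "v1"))).partitionBy(p)
val t2 = sc.parallelize(Seq(("k1", "v2"))).partitionBy(p)
val t3 = sc.parallelize(Seq(("v1", "v3"))).partitionBy(p)
val t4 = sc.parallelize(Seq(("v2", "v4"))).partitionBy(p)

val x1 = t1.join(t2)                      // co-partitioned join, no shuffle
  .map { case (_, (v1, v2)) => (v1, v2) }
  .partitionBy(p)                         // shuffle 1: re-key on v1
val x2 = x1.join(t3)
  .map { case (_, (v2, v3)) => (v2, v3) }
  .partitionBy(p)                         // shuffle 2: re-key on v2
val result = x2.join(t4)
  .map { case (_, (v3, v4)) => (v3, v4) }
  .partitionBy(p)                         // shuffle 3: re-key on v3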
However, the intermediate result (i.e. the variable x) is huge in my practical code, and 3 shuffle operations are still too many for me.
My questions are:
Is there anything wrong with my strategy to evaluate this rule? Is there any better, more efficient solution?
My understanding of the shuffle operation is that, for each partition, Spark will repartition independently and write the repartitioned results for each partition to disk (so-called shuffle write). Then, for each partition, Spark will read the new repartitioned results from disk (so-called shuffle read). If my understanding is correct, each shuffle/repartition will always cost disk reads and writes. That seems wasteful if I can guarantee my memory is big enough to store all the data, as described in http://www.trongkhoanguyen.com/2015/04/understand-shuffle-component-in-spark.html. Is there any workaround to disable these shuffle write and read operations? I think my program's performance bottleneck is shuffle I/O overhead.
Thank you.

What is the serializability graph of this?

I am trying to figure out a question; however, I do not know how to solve it, and I am unfamiliar with most of the terms in it. Here is the question:
Three transactions; T1, T2 and T3 and schedule program s1 are given
below. Please draw the precedence or serializability graph of the s1
and specify the serializability of the schedule S1. If possible, write
at least one serial schedule. r ==> read, w ==> write
T1: r1(X);r1(Z);w1(X);
T2: r2(Z);r2(Y);w2(Z);w2(Y);
T3: r3(X);r3(Y);w3(Y);
S1: r1(X);r2(Z);r1(Z);r3(Y);r3(Y);w1(X);w3(Y);r2(Y);w2(Z);w2(Y);
I do not have any idea how to solve this question; I need a detailed description. Which resources should I look at? Thanks in advance.
There are various ways to test for serializability. The objective of serializability is to find nonserial schedules that allow transactions to execute concurrently without interfering with one another.
First we do a Conflict-Equivalent Test. This will tell us whether the schedule is serializable.
To do this, we must define some rules (i & j are 2 transactions, R=Read, W=Write).
We cannot swap the order of two actions if they match one of these patterns (i.e. they conflict):
1. Ri(x), Wi(y) - conflict (actions of the same transaction are never reordered)
2. Wi(x), Wj(x) - conflict
3. Ri(x), Wj(x) - conflict
4. Wi(x), Rj(x) - conflict
But these are perfectly valid:
Ri(x), Rj(y) - no conflict (two reads never conflict)
Ri(x), Wj(y) - no conflict (working on different items)
Wi(x), Rj(y) - no conflict (same as above)
Wi(x), Wj(y) - no conflict (same as above)
So, applying the rules above, we can derive the following (I used Excel for simplicity):
From the result, we can clearly see that we managed to derive a serial relation (i.e. the schedule you have above can be split into S(T1, T3, T2)).
Now that we have a serializable schedule and the corresponding serial schedule, we do the conflict-serializability test:
The simplest way to do this is, using the same rules as in the conflict-equivalence test, to look for any combinations that would conflict.
r1(x); r2(z); r1(z); r3(y); r3(y); w1(x); w3(y); r2(y); w2(z); w2(y);
----------------------------------------------------------------------
r1(z) w2(z)
r3(y) w2(y)
w3(y) r2(y)
w3(y) w2(y)
Using the rules above, we end up with the table above (e.g. we know that reading z in one transaction and then writing z in another transaction causes a conflict; see rule 3).
Given the table, from left to right, we can create a precedence graph with these conditions:
T1 -> T2
T3 -> T2 (only 1 arrow per combination)
Thus we end up with a graph containing just those two edges.
From the graph, since it is acyclic (no cycle), we can conclude that the schedule is conflict-serializable. Furthermore, it is also view-serializable, since every schedule that is conflict-serializable is also view-serializable. We could run the view-serializability test to prove this, but it is rather complicated.
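If you want to check a schedule mechanically, a small Scala sketch of the same procedure (derive the conflict edges, then test the precedence graph for cycles) could look like this:

object PrecedenceGraph extends App {
  // (transaction, operation, item) triples of schedule S1, in order.
  val s1 = Seq(
    (1, 'r', "X"), (2, 'r', "Z"), (1, 'r', "Z"), (3, 'r', "Y"), (3, 'r', "Y"),
    (1, 'w', "X"), (3, 'w', "Y"), (2, 'r', "Y"), (2, 'w', "Z"), (2, 'w', "Y"))

  // Two operations conflict if they belong to different transactions, touch the
  // same item, and at least one of them is a write (rules 2-4 above).
  val edges = (for {
    (a, i) <- s1.zipWithIndex
    (b, j) <- s1.zipWithIndex
    if i < j && a._1 != b._1 && a._3 == b._3 && (a._2 == 'w' || b._2 == 'w')
  } yield (a._1, b._1)).distinct

  println(s"edges: ${edges.mkString(", ")}") // (1,2) and (3,2), i.e. T1 -> T2 and T3 -> T2

  // Depth-first search for a cycle in the precedence graph.
  val adj = edges.groupBy(_._1).map { case (from, es) => from -> es.map(_._2) }
  def cyclic(n: Int, seen: Set[Int]): Boolean =
    adj.getOrElse(n, Nil).exists(m => seen(m) || cyclic(m, seen + m))
  val hasCycle = adj.keys.exists(n => cyclic(n, Set(n)))

  println(if (hasCycle) "not conflict-serializable" else "conflict-serializable (graph is acyclic)")
}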
Regarding sources to learn this material, I recommend:
"Database Systems: A practical Approach To design, implementation and management: International Edition" by Thomas Connolly; Carolyn Begg - (It is rather expensive so I suggest looking for a cheaper, pdf copy)
Good luck!
Update
I've developed a little tool which will do all of the above for you (including the graph). It's pretty simple to use, and I've also added some examples.
