ClearCase - selective merge

I have a peculiar ClearCase question. I can't fully justify why I'm doing such a confusing architecture, but I need to do it (thanks to a mistake someone made long ago).
Here's a bit of detail: B1 is a contaminated branch where both my group's changes and another group's changes got mixed together so badly that there is no way of telling which code is whose. So the proposed solution is to create a new branch called B2 (at the same level as B1) and put all of the other group's unmodified code on it (the way to do that would be to merge B1 into B2 and then remove changes from it until it matches the original). Then create a CR branch off B1 and keep only my group's newly added or modified files on that branch. Finally, create an integration branch off B2 and merge the changes from the CR branch of B1 onto the integration branch of B2.
So here is what I did (the use case: a directory D containing files a, b and c; my group ended up modifying file a, while b and c were not modified at all).
There is a branch B1 on which there are files a, b and c. There is another branch B2. A merge is done from B1 to B2. Now B2 also has a, b and c.
At this point branches B1 and B2 are the same. Now I delete file a from branch B2 (rmname). Now B2 has b and c only. I put a label called Label1 on this branch, which marks the code labeled Label1 as the other group's unmodified code.
Now I create a sub-branch called CR1 off B1 and delete all the files that are also on the B2 branch (i.e. b and c), so that it contains only the code modified from the original; in my case that is file a.
At this point branch B2 with label Label1 has files b and c (the unmodified code), and branch CR1 coming off B1 has only a (which we modified).
Now I create another branch, the integration branch, coming off B2 at Label1. Then I merge the CR branch onto it, expecting it to end up with all three files a, b and c; all I'd need to do then is open a version tree view and see who modified what.
But here is the problem: since I had done a rmname of file a on branch B2 before applying the label, the merge does not actually bring file a over from the CR branch.
How do I get around that problem? I want to merge selectively. Is that possible?
Sorry if it is a bad design. I'm not really conversant with ClearCase and have limited options and time to clean up someone else's mess.

(tongue in cheek)
In one of those situations, I actually created a git workspace within my ClearCase view, untangled the situation in Git branches, and then clearfsimported the result back!
More to the point:
You can do a negative or subtractive merge on one file's history to remove some versions.
Let me try to follow your scenario:
CR1---------------------------(a)
 .                              \
 . (CR1 branch from B1)          \ (a merged to Int)
 .                                \
B1-(abc)                         Int (b,c)   # where is a???
 .                                .
 . (B2 branch from B1)            . (Int branch from B2)
B2-(abc)----------------------(bc, Label1)
A merge from CR1 won't add a back, because:
the common ancestor of a (on B1) was part of B2 before being deleted on B2;
Int comes from B2;
so a is not considered for a merge to a branch coming from B2, since it was already part of B2.
In this instance, I would recommend a graphical merge of the directory, which would allow you to resolve that merge manually (forcing the resolution to be: "take a").


Variable Break Even Values - Spreadsheet

I'm looking to figure out the break even variables for a variety of columns based on the following spreadsheet.
Locations have a set number of devices (column B) and transactions (column C) that occur.
I would like to figure out a formula for columns M-Q that would show the break evens for each of those columns.
I made the following adjustments to match these specific columns for "Location 1" as an example:
Calculated value M3 by updating cell C3 until E3 (variable) matched J3 (static).
Calculated value N3 by updating cell C3 until F3 (variable) matched K3 (static).
Calculated value O3 by updating cell B3 until J3 (variable) matched E3 (static).
Calculated value P3 by updating cell B22 until J3 (variable) matched E3 (static).
I was not able to figure out a simple way to determine how many years it would take for the lane model (static) to match the per-transaction model (variable).
I'd like the sheet to be dynamic, meaning that if I adjust any of the variable fields B3:B18, C3:C18, or B22:B24, the break-even values in columns M-Q would update automatically.
I think that I've got part of the answer for you - still working on the rest.
Try these two formulas, in M3 and N3.
Break Even Transactions in M3
=ArrayFormula((H3:H+I3:I)/$B$23)
Break Even Transactions 5 Yr in N3
=ArrayFormula(((H3:H+(5*I3:I))/5)/$B$23)
UPDATE
Joe, I think I have the remaining three formulas that you need. Please check the results to see that they make sense, but I believe that I have the correct logic in the formulas.
Note also that to help me follow the logic, I used named ranges for three variables, which you currently have in cells B22, B23, and B24. I find it increases the chance of errors if I mix up true data with different types of variables or constants, so I moved them off to the right in my sample sheet. Your B22 I named PerLicenseCost, B23 is PerTransCost, and B24 is MaintRate.
The new formulas are as follows:
Break Even Device Count
=ArrayFormula(ROUNDUP(((C3:C*PerTransCost)/(1+MaintRate))/PerLicenseCost))
Break Even Avg. License Fee
=ArrayFormula(((C3:C*PerTransCost)/(1 + MaintRate))/B3:B)
Break Even Years
= ArrayFormula(-1 * (B3:B * PerLicenseCost) / ((B3:B * PerLicenseCost * MaintRate) - (C3:C * PerTransCost)))
Note that for the Break Even Device Count, I used ROUNDUP on the result in the formula, so that it would not include a fraction of a device. You may not want this.
For the Break Even Years, there were four negative results. I'm hoping this means that those locations are losing money, and will never break even, but I'll let you review them and decide that.
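To sanity-check the logic, here is the same math as a small Python sketch. The function names are mine, mirroring the named ranges above, and it assumes the cost model implied by the formulas: the lane model is an upfront per-device license fee plus an annual maintenance rate, while the transaction model is a yearly per-transaction cost.

```python
import math

def break_even_device_count(transactions, per_trans_cost, per_license_cost, maint_rate):
    # Mirrors =ROUNDUP(((C * PerTransCost) / (1 + MaintRate)) / PerLicenseCost)
    return math.ceil(((transactions * per_trans_cost) / (1 + maint_rate)) / per_license_cost)

def break_even_avg_license_fee(devices, transactions, per_trans_cost, maint_rate):
    # Mirrors =((C * PerTransCost) / (1 + MaintRate)) / B
    return ((transactions * per_trans_cost) / (1 + maint_rate)) / devices

def break_even_years(devices, transactions, per_license_cost, per_trans_cost, maint_rate):
    # Year y at which  B*L + y*B*L*M == y*C*T, i.e. upfront license cost
    # plus cumulative maintenance equals cumulative transaction cost.
    upfront = devices * per_license_cost
    return -upfront / ((upfront * maint_rate) - (transactions * per_trans_cost))
```

A negative result from break_even_years means the annual transaction cost never catches up with the maintenance cost, which matches the interpretation of the negative rows noted above.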
Note also that you can easily modify these formulas to move them up into the header row (row 2), if you prefer. To do that, in the header of the desired column, copy the existing formula from the cell below WITHOUT the equal sign and paste in to the following:
={"text header for the column"; PASTE CURRENT FORMULA HERE}
Be sure to then delete the formula in the cell below (in row 3) or you’ll have a #REF error. If you need any help with this, let me know.
Let me know if you have any questions about any of this. I’ll be happy to answer them, but possibly not for a day or two.

Trouble decomposing to 3rd Normal Form (DB)

I've recently started studying databases but I'm struggling with this specific part.
I've read the definition of each normal form but I still can't seem to understand them. Here's an example that I couldn't solve properly:
**R(A,B,C,D,E,F)**
A->B; B->CD; AD->E
Solution: R1(*A*,B,E); R2(*B*,C,D); R3(*A*,*F*)
I can't understand why R3 is like that.
R3 is there to make sure the result is in 2nd Normal Form and there is no update anomaly. Keeping F in R1 would lead to duplicate rows of A, B, E wherever there are multiple F values for a given A; the B and E values would then be either ambiguous or completely redundant.
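One way to see this mechanically is to compute attribute closures. The helper below is my own sketch, not part of the exercise: F appears on no right-hand side, so no closure ever picks it up, which means the candidate key must include F itself. The key turns out to be {A, F}, and since F appears in neither R1 nor R2, the decomposition needs R3(A, F) to preserve it.

```python
def closure(attrs, fds):
    """Closure of a set of attributes under the given functional dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is covered and the right side adds something new,
            # absorb the right side and keep iterating until a fixed point.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [("A", "B"), ("B", "CD"), ("AD", "E")]

closure("A", fds)   # {'A', 'B', 'C', 'D', 'E'} - F is never reached
closure("AF", fds)  # all six attributes, so {A, F} is the candidate key
```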

Non-trivial Functional Dependency of this table

Which non-trivial functional dependencies hold in the following table?
Can anyone explain step by step the rules please?
A B C
------------
a1 b2 c1
a2 b1 c6
a3 b2 c4
a1 b2 c5
a2 b1 c3
a1 b2 c7
I'll start with a disclaimer to state that my knowledge of functional dependencies is limited to what was explained in the Wikipedia article, and that I currently don't have the need nor the inclination to study up on it further.
However, since OP asked for clarification, I'll attempt to clarify how I obtained the seemingly correct answer that I posted in the comments.
First off, this is Wikipedia's definition:
Given a relation R, a set of attributes X in R is said to
functionally determine another set of attributes Y, also in R,
(written X → Y) if, and only if, each X value is associated with
precisely one Y value; R is then said to satisfy the functional
dependency X → Y.
Additionally, Wikipedia states that:
A functional dependency FD: X → Y is called trivial if Y is a
subset of X.
Taking these definitions, I arrive at the following two non-trivial functional dependencies for the given relation:
A → B
C → {A, B}
Identifying these was a completely inductive process. Rather than applying a series of rules, formulas and calculations, I looked at the presented data and searched for those constraints that satisfy the above definitions.
In this case:
A → B
There are three possible values presented for A: a1, a2 and a3. Looking at the corresponding values for B, you'll find the following combinations: a1 → b2, a2 → b1, and a3 → b2. Or, every value of A is associated with precisely one B value, conforming to the definition.
C → {A, B}
The same reasoning goes for this dependency. In this case, identifying it is a little bit easier as the values for C are unique in this relation. In this sense, C could be considered as a key. In database terms, a candidate key is exactly that: a minimal set of attributes that uniquely identifies every tuple.
Undoubtedly, there's a way to mathematically derive the functional dependencies from the data, but for simple cases like this, the inductive process seems to work just fine.
So, non-trivial functional dependencies in the above table are:
1. A->B
2. A,C->B
3. B,C->A
4. C->A,B
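For anyone who wants to check these against the data rather than by eye, here is a small sketch (function and variable names are mine) that tests whether X → Y holds in the table, directly applying the Wikipedia definition quoted above:

```python
rows = [("a1", "b2", "c1"), ("a2", "b1", "c6"), ("a3", "b2", "c4"),
        ("a1", "b2", "c5"), ("a2", "b1", "c3"), ("a1", "b2", "c7")]
cols = {"A": 0, "B": 1, "C": 2}

def holds(lhs, rhs):
    """X -> Y holds iff each X-value is associated with exactly one Y-value."""
    seen = {}
    for row in rows:
        x = tuple(row[cols[c]] for c in lhs)
        y = tuple(row[cols[c]] for c in rhs)
        # setdefault records the first Y seen for this X; a mismatch later
        # means the same X maps to two different Y values.
        if seen.setdefault(x, y) != y:
            return False
    return True

holds("A", "B")   # True
holds("B", "A")   # False: b2 is associated with both a1 and a3
```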

what is the serializability graph of this?

I'm trying to figure out a question, but I do not know how to solve it; I am unfamiliar with most of the terms in it. Here is the question:
Three transactions; T1, T2 and T3 and schedule program s1 are given
below. Please draw the precedence or serializability graph of the s1
and specify the serializability of the schedule S1. If possible, write
at least one serial schedule. r ==> read, w ==> write
T1: r1(X);r1(Z);w1(X);
T2: r2(Z);r2(Y);w2(Z);w2(Y);
T3: r3(X);r3(Y);w3(Y);
S1: r1(X);r2(Z);r1(Z);r3(Y);r3(Y);w1(X);w3(Y);r2(Y);w2(Z);w2(Y);
I do not have any idea how to solve this question, so I need a detailed description. Which resource should I look at? Thanks in advance.
There are various ways to test for serializability. The objective of serializability is to find nonserial schedules that allow transactions to execute concurrently without interfering with one another.
First we do a conflict-equivalence test. This will tell us whether the schedule is serializable.
To do this, we must define some rules (i and j are two transactions, R = read, W = write).
We cannot swap the order of two actions if they match one of these patterns:
1. Ri(x), Wi(y) - Conflicts (two actions of the same transaction can never be reordered)
2. Wi(x), Wj(x) - Conflicts
3. Ri(x), Wj(x) - Conflicts
4. Wi(x), Rj(x) - Conflicts
But these are perfectly valid:
Ri(x), Rj(y) - No conflict (2 reads never conflict)
Ri(x), Wj(y) - No conflict (working on different items)
Wi(x), Rj(y) - No conflict (same as above)
Wi(x), Wj(y) - No conflict (same as above)
So applying the rules above we can derive this (using Excel for simplicity):
From the result, we can clearly see that we managed to derive a serial relation (i.e. the schedule you have above can be split into S(T1, T3, T2)).
Now that we know the schedule is serializable and have the serial schedule, we do the conflict-serializability test:
The simplest way to do this, using the same rules as the conflict-equivalence test, is to look for any combinations that would conflict.
r1(x); r2(z); r1(z); r3(y); r3(y); w1(x); w3(y); r2(y); w2(z); w2(y);
----------------------------------------------------------------------
r1(z) w2(z)
r3(y) w2(y)
w3(y) r2(y)
w3(y) w2(y)
Using the rules above, we end up with the table shown (e.g. we know that reading z in one transaction and then writing z in another will cause a conflict - see rule 3).
Given the table, from left to right, we can create a precedence graph with these conditions:
T1 -> T2
T3 -> T2 (only 1 arrow per combination)
Thus we end up with a graph looking like this:
From the graph, since it is acyclic (no cycle), we can conclude that the schedule is conflict-serializable. Furthermore, it is also view-serializable, since every schedule that is conflict-serializable is also view-serializable. We could run the view-serializability test to prove this, but it's rather complicated.
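The precedence-graph construction can be automated. Here's a sketch (my own code, not from the book) that builds the edges from the schedule exactly as given in the question, including the duplicated r3(Y), and checks for cycles:

```python
from collections import defaultdict

# S1 as given: (transaction, action, item) triples.
schedule = [("T1", "r", "X"), ("T2", "r", "Z"), ("T1", "r", "Z"),
            ("T3", "r", "Y"), ("T3", "r", "Y"), ("T1", "w", "X"),
            ("T3", "w", "Y"), ("T2", "r", "Y"), ("T2", "w", "Z"), ("T2", "w", "Y")]

def precedence_edges(schedule):
    """Ti -> Tj when an action of Ti conflicts with a LATER action of Tj:
    different transactions, same item, at least one write."""
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "w" in (ai, aj):
                edges.add((ti, tj))
    return edges

def is_acyclic(edges):
    """Depth-first search; a node revisited while still on the stack is a cycle."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
    seen, stack = set(), set()
    def dfs(u):
        if u in stack:
            return False
        if u in seen:
            return True
        seen.add(u); stack.add(u)
        ok = all(dfs(v) for v in graph[u])
        stack.discard(u)
        return ok
    return all(dfs(u) for u in list(graph))

precedence_edges(schedule)  # {('T1', 'T2'), ('T3', 'T2')}
```

This reproduces the two edges T1 -> T2 and T3 -> T2 from the table above, and the acyclicity check confirms the conflict-serializable conclusion.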
Regarding sources to learn this material, I recommend:
"Database Systems: A Practical Approach to Design, Implementation and Management (International Edition)" by Thomas Connolly and Carolyn Begg (it is rather expensive, so I suggest looking for a cheaper PDF copy)
Good luck!
Update
I've developed a little tool which will do all of the above for you (including graph). It's pretty simple to use, I've also added some examples.

Efficient data structure and strategy for synchronizing several item collections

I want one primary collection of items of a single type that modifications are made to over time. Periodically, several slave collections are going to synchronize with the primary collection. The primary collection should send a delta of items to the slave collections.
Primary Collection: A, C, D
Slave Collection 1: A, C (add D)
Slave Collection 2: A, B (add C, D; remove B)
The slave collections cannot add or remove items on their own, and they may exist in a different process, so I'm probably going to use pipes to push the data.
I don't want to push more data than necessary since the collection may become quite large.
What kind of data structures and strategies would be ideal for this?
For that I use differential execution.
(BTW, the word "slave" is uncomfortable for some people, with reason.)
For each remote site, there is a sequential file at the primary site representing what exists on the remote site.
There is a procedure at the primary site that walks through the primary collection, and as it walks it reads the corresponding file, detecting differences between what currently exists on the remote site and what should exist.
Those differences produce deltas, which are transmitted to the remote site.
At the same time, the procedure writes a new file representing what will exist at the remote site after the deltas are processed.
The advantage of this is it does not depend on detecting change events in the primary collection, because often those change events are unreliable or can be self-cancelling or made irrelevant by other changes, so you cut way down on needless transmissions to the remote site.
In the case that the collections are simple lists of things, this boils down to having local copies of the remote collections and running a diff algorithm to get the delta.
Here are a couple such algorithms:
If the collections can be sorted (like your A,B,C example), just run a merge loop:
while (ix < nx && iy < ny){
    if (X[ix] < Y[iy]){
        // X[ix] was inserted in X
        ix++;
    } else if (Y[iy] < X[ix]){
        // Y[iy] was deleted from X
        iy++;
    } else {
        // the two elements are equal; skip them both
        ix++; iy++;
    }
}
while (ix < nx){
    // X[ix] was inserted in X
    ix++;
}
while (iy < ny){
    // Y[iy] was deleted from X
    iy++;
}
If the collections cannot be sorted (note relationship to Levenshtein distance),
Until we have read through both collections X and Y,
See if the current items are equal
else see if a single item was inserted in X
else see if a single item was deleted from X
else see if 2 items were inserted in X
else see if a single item was replaced in X
else see if 2 items were deleted from X
else see if 3 items were inserted in X
else see if 2 items in X replaced 1 item in Y
else see if 1 item in X replaced 2 items in Y
else see if 3 items were deleted from X
etc. etc. up to some limit
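In Python, the standard library's difflib.SequenceMatcher does essentially this kind of edit-script search (using the Ratcliff/Obershelp longest-matching-block heuristic rather than the exact insert/delete ladder above), so the unsorted case could be sketched as:

```python
from difflib import SequenceMatcher

def diff_unsorted(X, Y):
    """Delta turning the last-known remote copy Y into the current primary X."""
    inserted, deleted = [], []
    # Opcodes describe how to transform a (=Y) into b (=X).
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=Y, b=X, autojunk=False).get_opcodes():
        if tag in ("insert", "replace"):
            inserted += X[j1:j2]
        if tag in ("delete", "replace"):
            deleted += Y[i1:i2]
    return inserted, deleted

diff_unsorted(["A", "Q", "C", "D"], ["A", "B", "C"])  # (['Q', 'D'], ['B'])
```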
Performance is generally not an issue, because the procedure does not have to be run at high frequency.
There's a crude video demonstrating this concept, and source code where it is used for dynamically changing user interfaces.
If one doesn't push all the data, a log of sorts is required, which uses main memory instead of pipe bandwidth. The parameter to tune for a good balance between CPU and memory usage would be the 'push' frequency.
From your question, I assume you have more than one slave process. In that case, a shared-memory or CMA (Linux) approach with double buffering in the master process should outperform multiple pipes by far, as it doesn't even require multithreaded pushing (which would otherwise be used to optimize overall pipe throughput during synchronization).
The slave processes could be notified via a global synchronization barrier to read from masterCollectionA without copying, while the master modifies masterCollectionB (initialized as a copy of masterCollectionA), and vice versa. Access to a collection should be interlocked between the slaves and the master. A slave could copy a collection (snapshot) if it would otherwise block past the master's next update attempt, thus allowing the master to continue. Modifications in slave processes could be implemented with a copy-on-write strategy for single elements. This cooperative approach is rather simple to implement, and as long as the slave processes don't copy whole snapshots every time, the overall memory consumption stays low.
