How to Reset a Counter Column in Scylla DB to Zero?

Is it possible to reset a counter column in Scylla DB?
In Cassandra, it is not possible to directly reset a counter column to zero:
There is no operation that will allow resetting the counter value. You need to read the value and decrement it using that value.
Be aware that this operation might not be successful since the counter can be changed between your "reset" operations.
Is this true with Scylla DB as well?

Scylla behaves similarly.
A counter value cannot be set to 0 directly; counters can only be incremented or decremented. To bring a counter back to zero, decrement it by its current value:
scylla@cqlsh:ks> CREATE TABLE cf (pk int PRIMARY KEY, my_counter counter);
scylla@cqlsh:ks> UPDATE cf SET my_counter = my_counter + 3 WHERE pk = 0;
scylla@cqlsh:ks> SELECT * FROM cf;
 pk | my_counter
----+------------
  0 |          3
(1 rows)
scylla@cqlsh:ks> UPDATE cf SET my_counter = 0 WHERE pk = 0;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot set the value of counter column my_counter (counters can only be incremented/decremented, not set)"
scylla@cqlsh:ks> UPDATE cf SET my_counter = my_counter - 3 WHERE pk = 0;
scylla@cqlsh:ks> SELECT * FROM cf;
 pk | my_counter
----+------------
  0 |          0
(1 rows)
Notice the error when trying to set a value directly.
More about it:
https://docs.scylladb.com/using-scylla/counters/#scylla-counters
https://docs.scylladb.com/getting-started/types/#counters
https://www.scylladb.com/2017/04/04/counters/
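The read-then-decrement workaround, and the race mentioned in the question, can be sketched in Python without a database. The `Counter` class and the `concurrent_delta` parameter below are hypothetical stand-ins for illustration only (this is not a Scylla driver API); like a counter column, the class supports only increments and decrements.

```python
class Counter:
    """Hypothetical in-memory stand-in for a counter column: no direct set."""

    def __init__(self):
        self._value = 0

    def add(self, delta):
        # Mirrors: UPDATE cf SET my_counter = my_counter + delta WHERE ...
        self._value += delta

    def read(self):
        # Mirrors: SELECT my_counter FROM cf WHERE ...
        return self._value


def reset(counter, concurrent_delta=0):
    """Read the value, then decrement by it.

    concurrent_delta simulates another writer's update landing between
    the read and the decrement.
    """
    seen = counter.read()
    counter.add(concurrent_delta)  # someone else writes in the gap
    counter.add(-seen)             # our "reset" uses the stale value
    return counter.read()


quiet = Counter()
quiet.add(3)
quiet_result = reset(quiet)  # no concurrent writers

busy = Counter()
busy.add(3)
busy_result = reset(busy, concurrent_delta=2)  # a writer adds 2 mid-reset

print(quiet_result, busy_result)
```

With no concurrent writer the counter ends at zero; with one, the concurrent increment survives the "reset", which is why the operation cannot be relied on to leave the counter at exactly zero.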

Related

Setting conditions in arrays in SAS

I have a set of data that looks like this:
ID Status31Jan2007 Status28Feb2007 Status31Mar2007
001 . 0 0
002 1 0 0
003 1 1 0
I have Statusddmmyyyy fields of either '0' or '1' for 118 months. (here, I only have three months as a sample)
I want to get results like this:
ID Flag1 Flag2 Flag3
001 N N N
002 Y N N
003 Y Y N
The logic is: if the status for a given month (for example Status31Jan2007) is 1 and at least one of the following two months has a status of 0, flag it as 'Y'. Otherwise, flag it as 'N'.
That is:
For ID 001, the value of Status31Jan2007 is missing, so I flag it as 'N' under Flag1.
Moving on to the next month, Status28Feb2007, the value is 0, so I automatically flag it as 'N' under Flag2 as well. The same applies to the month after.
For ID 002, Status31Jan2007 is 1, and the following two months both have a value of 0. The count of 0 values is > 0, so I flag it as 'Y' under Flag1.
But Status28Feb2007 is 0, which does not fit the criteria, so I flag it as 'N' under Flag2.
In other words, only when the status for the month itself is 1 do I go on to look at the following two months.
After getting the results, how do I count the number of N and Y flags under each field?
Count1 Count2 Count3
N 1 2 3
Y 2 1 0
Would appreciate the help as I am new to SAS. Thanks.
This will only work if the column names across are in calendar order.
Use an ARRAY statement to organize and then access variables by index and thus easily process the [index+1] and [index+2] checks your logic indicates. You can also use temporary arrays to maintain a count as you assign the flag values; at the last row the counts are output to a separate table.
Note: for status variables taking on either 0 or 1 the count of 1's can be computed using SUM. The sum of two status variables will be < 2 when either of them is 0.
* simulate some data;
data prelim;
do id = 1 to 20;
do date = '01jan07'd by 1 until(intck('month', '01jan07'd, date) >= 117);
date = intnx('month', date, 1) - 1;
status = ranuni(123) < 0.45;
if date = '31jan07'd and mod(id,5) = 1 then status = .;
output;
end;
end;
format date date9.;
run;
* change the shape of simulated data to match the question;
proc transpose data=prelim prefix=Status out=have(drop=_name_);
by id;
var status;
id date;
run;
* process the problem shaped data;
data
want (keep=id status: flag:)
want_count (keep=flag_value count:);
;
set have end=lastid;
retain sentinel1 sentinel2 0;
array status status: sentinel1 sentinel2; * map all the Status* variables to an array named status;
array flag [118] $1 ; * automatically creates 118 new variables flag1 to flag118;
array yfreq [118] _temporary_ (118*0); * temporary arrays initialized to 0;
array nfreq [118] _temporary_ (118*0);
* process each month status, -2 because of the sentinels ;
do i = 1 to dim(status)-2;
* assign flag according to the logic, some cases require a 2-month look ahead;
select;
when ( status(i) = . ) flag(i) = 'N';
when ( status(i) = 0 ) flag(i) = 'N';
when ( status(i) = 1
and sum(status(i+1),status(i+2)) < 2 ) flag(i) = 'Y'; * SUM trick;
otherwise
flag(i) = 'N';
end;
* track frequencies of flags assigned;
if flag(i) = 'N'
then nfreq(i)+1;
else yfreq(i)+1;
end;
output want;
if lastid then do;
* all flags for all ids have been binned for frequency;
* output the freqs to a count data set;
length flag_value $1;
array freq count1-count118;
flag_value = 'N'; do i = 1 to dim(nfreq); freq(i) = nfreq(i); end; output want_count;
flag_value = 'Y'; do i = 1 to dim(yfreq); freq(i) = yfreq(i); end; output want_count;
end;
run;
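For readers who do not use SAS, the same look-ahead logic can be sketched in Python. This is an illustrative restatement under stated assumptions, not part of the original answer: `flags_for` is a hypothetical helper, `None` stands for a missing status, and the sample rows are the three IDs from the question.

```python
def flags_for(statuses):
    """Return a 'Y'/'N' flag per month for one ID's list of statuses.

    None represents a missing status.
    """
    padded = list(statuses) + [0, 0]  # two sentinel months, as in the SAS step
    flags = []
    for i, status in enumerate(statuses):
        following = padded[i + 1:i + 3]
        # SUM trick: the sum of two 0/1 values is below 2 when either is 0.
        # A missing value counts as 0 here, since SAS SUM ignores missings.
        total = sum(v or 0 for v in following)
        flags.append("Y" if status == 1 and total < 2 else "N")
    return flags


rows = {"001": [None, 0, 0], "002": [1, 0, 0], "003": [1, 1, 0]}
flags = {key: flags_for(values) for key, values in rows.items()}

# Tally the flags per month column, like the temporary frequency arrays.
counts = {"N": [0, 0, 0], "Y": [0, 0, 0]}
for row_flags in flags.values():
    for i, flag in enumerate(row_flags):
        counts[flag][i] += 1

print(flags)
print(counts)
```

The sentinels play the same role as `sentinel1` and `sentinel2` in the DATA step: they let the last two months use the same two-month look-ahead without an index check.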

SQL Server - poor performance during Insert transaction

I have a stored procedure that runs a query and assigns the resulting row to variables, like this:
SELECT @item_id = I.ID, @label_ID = SL.label_id
FROM tb_A I
LEFT JOIN tb_B SL ON I.ID = SL.item_id
WHERE I.NUMBER = @VAR
An IF then checks whether @label_ID is null. If it is null, the procedure runs an INSERT; otherwise it runs an UPDATE. Let's focus on the INSERT, where I know I'm having problems. The INSERT part looks like this:
IF @label_ID IS NULL
BEGIN
INSERT INTO tb_B (item_id, label_qrcode, label_barcode, data_leitura, data_inclusao)
VALUES (@item_id, @label_qrcode, @label_barcode, @data_leitura, GETDATE())
END
tb_B has a PK on the ID column and an FK on the item_ID column, which references the ID column in tb_A.
I ran SQL Server Profiler and saw that the stored procedure sometimes takes around 2300 ms, while its normal average is 16 ms.
I looked at the execution plan, and the biggest cost is in the "Clustered Index Insert" operator, as shown below:
Estimated Execution Plan
Actual Execution Plan
Details
More details about the tables:
tb_A Storage:
Index space: 6.853,188 MB
Row count: 45988842
Data space: 5.444,297 MB
tb_B Storage:
Index space: 1.681,688 MB
Row count: 15552847
Data space: 1.663,281 MB
Statistics for INDEX 'PK_tb_B'.
--------------------------------------------------------------------------------
Name Updated Rows Rows Sampled Steps Density Average Key Length String Index
--------------------------------------------------------------------------------
PK_tb_B Sep 23 2018 2:30AM 15369616 15369616 5 1 4 NO 15369616
All Density Average Length Columns
--------------------------------------------------------------------------------
6.506343E-08 4 id
Histogram Steps
RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS AVG_RANGE_ROWS
--------------------------------------------------------------------------------
1 0 1 0 1
8192841 8192198 1 8192198 1
8270245 65535 1 65535 1
15383143 7111878 1 7111878 1
15383144 0 1 0 1
Statistics for INDEX 'IDX_tb_B_ITEM_ID'.
--------------------------------------------------------------------------------
Name Updated Rows Rows Sampled Steps Density Average Key Length String Index
--------------------------------------------------------------------------------
IDX_tb_B_ITEM_ID Sep 23 2018 2:30AM 15369616 15369616 12 1 7.999424 NO 15369616
All Density Average Length Columns
--------------------------------------------------------------------------------
6.50728E-08 3.999424 item_id
6.506343E-08 7.999424 item_id, id
Histogram Steps
RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS AVG_RANGE_ROWS
--------------------------------------------------------------------------------
0 2214 0 1
16549857 0 1 0 1
29907650 65734 1 65734 1
32097131 131071 1 131071 1
32296132 196607 1 196607 1
32406913 98303 1 98303 1
40163331 7700479 1 7700479 1
40237216 65535 1 65535 1
47234636 6946815 1 6946815 1
47387143 131071 1 131071 1
47439431 31776 1 31776 1
47439440 0 1 0 1
PK_tb_B Index fragmentation
IDX_tb_B_Item_ID
Are there any best practices I can apply to make the execution duration stable?
I hope you can help.
Thanks in advance.
The problem is probably related to the data type of the clustered index key. A clustered index stores the table's rows in key order, and by default a primary key is created as a clustered index. That is often the best choice, but not always.
If the clustered index is on, say, an NVARCHAR column, every INSERT has to find the right position for the new row. For example, if the table has one million rows ordered alphabetically and the new row starts with A, it belongs near the beginning of the index; when the target page is already full, SQL Server has to split the page and move rows to make room. A new row that starts with Z lands near the end and disturbs fewer rows, but that does not make it free either. If no column lets you insert new rows sequentially, you can add an identity column for this purpose, or use another column that is naturally sequential for every transaction, for example a datetime column that records when the insert occurs.
If you want more info, please check this Microsoft documentation
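The cost difference between sequential and non-sequential keys can be illustrated with a toy model in Python. This is a rough sketch under loud assumptions: it is not how SQL Server's storage engine is implemented, and `PAGE_CAPACITY`, `rows_moved_by_inserts`, and the half-page split rule are all invented for illustration. It only counts how many rows would be relocated when a full page has to be split.

```python
import bisect
import random

PAGE_CAPACITY = 100  # rows per page; an arbitrary value for this toy model


def rows_moved_by_inserts(keys):
    """Insert keys into sorted fixed-size pages; return rows moved by splits."""
    pages = [[]]
    moved = 0
    for key in keys:
        # Locate the page whose key range should hold the new key.
        firsts = [page[0] for page in pages[1:]]
        idx = bisect.bisect_right(firsts, key)
        page = pages[idx]
        if len(page) < PAGE_CAPACITY:
            bisect.insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            # Append past the end of the table: start a fresh page, nothing moves.
            pages.append([key])
        else:
            # Insert into a full page in the middle: move half the rows out.
            half = PAGE_CAPACITY // 2
            new_page = page[half:]
            del page[half:]
            moved += len(new_page)
            pages.insert(idx + 1, new_page)
            bisect.insort(page if key < new_page[0] else new_page, key)
    return moved


sequential = list(range(5000))
shuffled = random.Random(42).sample(sequential, len(sequential))

sequential_moved = rows_moved_by_inserts(sequential)
random_moved = rows_moved_by_inserts(shuffled)
print(sequential_moved, random_moved)
```

In this model, ever-increasing keys (like an identity column) never relocate existing rows, while keys arriving in random order keep splitting full pages.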

Create an ID to identify each loop in SAS

I have a log dataset in SAS like this, sorted by TimeStamp in ascending order:
TimeStamp Status
2015Dec01:1:00:00 1
2015Dec01:2:00:00 2
2015Dec01:3:00:00 3
2015Dec01:4:00:00 4
2015Dec01:5:00:00 1
2015Dec01:6:00:00 2
2015Dec01:7:00:00 2
2015Dec01:8:00:00 4
2015Dec01:9:00:00 5
2015Dec01:10:00:00 1
2015Dec01:11:00:00 3
2015Dec01:11:30:00 4
I want to create an ID to identify each loop, which always starts at status 1 and ends at status 4 (no matter which statuses occur in between), like this:
Time Stamp Status ID
2015Dec01:1:00:00 1 1
2015Dec01:2:00:00 2 1
2015Dec01:3:00:00 3 1
2015Dec01:4:00:00 4 1
2015Dec01:5:00:00 1 2
2015Dec01:6:00:00 2 2
2015Dec01:7:00:00 2 2
2015Dec01:8:00:00 4 2
2015Dec01:9:00:00 5 .
2015Dec01:10:00:00 1 3
2015Dec01:11:00:00 3 3
2015Dec01:11:30:00 4 3
Can anyone help me out? Thanks a lot.
Define your rules (assumed):
Increment ID when status = 1.
If status >= 5, ID is missing.
data tmp;
set have;
retain ID_TMP 0; *initialize ID;
if status=1 then ID_TMP + 1;
if status<5 then ID=ID_TMP;
DROP ID_TMP;
run;
We know that a new group has started if the current status is less than the previous status. We save the previous status with the lag function so that we can compare the current status to it.
Create a temporary variable called count that is incremented whenever a new ID starts. Since we are using if-then logic, we initialize ID and count to 1 with the retain statement.
There are three cases to account for:
If the current status is less than the previous status, increment count by 1 and set ID to the value of count.
If the current status is > 4, set ID to missing.
At any other time, ID stays the same.
data want;
set log;
retain count ID 1;
Prior_Status = lag(Status);
if(Status < Prior_Status) then do;
count+1;
ID = count;
end;
else if(Status > 4) then call missing(ID);
drop count Prior_Status;
run;
Here's a couple of ways of doing it.
Method 1:
data want;
set have;
if status = 1 then tmp + 1;
if status <= 4 then id = tmp;
else id = .;
run;
Method 2:
data want;
set have;
if status = 1 then tmp + 1;
id = choosen((status<=4)+1, ., tmp);
run;
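The running-counter idea behind Method 1 can also be sketched in Python, using the statuses from the question. The function name `assign_loop_ids` is hypothetical, and `None` stands in for a SAS missing value.

```python
def assign_loop_ids(statuses):
    """Return one loop ID per status; None where the status is above 4."""
    ids = []
    loop = 0  # plays the role of the retained tmp / ID_TMP variable
    for status in statuses:
        if status == 1:
            loop += 1  # a status of 1 always opens a new loop
        ids.append(loop if status <= 4 else None)
    return ids


statuses = [1, 2, 3, 4, 1, 2, 2, 4, 5, 1, 3, 4]
ids = assign_loop_ids(statuses)
print(ids)
```

Like the SAS versions, this relies on the rows already being sorted by timestamp.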

Updating multiple rows with respect to a change in one row

I am using PostgreSQL. Let's say I have a tasks table to keep track of task items, as follows:
Id Name Index
7 name A 1
5 name B 2
6 name C 3
3 name D 4
The Index column stores the sort order of the tasks, so I output the tasks by index in increasing order.
So when I change the index of Task D (id = 3) to 2, the new indexes should be as below:
Id Name Index
7 name A 1
5 name B 3
6 name C 4
3 name D 2
Or, when I change the index of Task A (id = 7) to 4, the new indexes should be as below:
Id Name Index
7 name A 4
5 name B 2
6 name C 3
3 name D 1
I think that updating every row's index value one by one is pretty inefficient.
So what is the most efficient way to update all the index values when I change one of the indexes in my tasks table?
Edit:
First of all, sorry for the confusion. What I am asking about is not a simple exchange of two rows' indexes. If you look at the examples, when I change Task D's index to 2, more than one row changes: when Task D becomes 2, Task B becomes 3 and Task C becomes 4.
For instance:
It is like dragging Task D and dropping it below Task A, so that its index becomes 2 and the indexes of B and C increase by 1.
SQL Fiddle
What you are doing is exchanging two rows' indexes. So it is necessary to store the index value of the first updated row in a temporary table and to set that row's index temporarily to a special value to avoid a unique index collision, that is, if the index is unique. If the index is not unique, that step is unnecessary.
begin;
create temp table t as
select
(
select index
from tasks
where id = 3
) as index,
(
select id
from tasks
where index = 2
) as id
;
update tasks
set index = -1
where id = (select id from t)
;
update tasks
set index = 2
where id = 3
;
update tasks
set index = (select index from t)
where id = (select id from t)
;
drop table t;
commit;
The following assumes the index column (as well as id) is unique:
with swapped as (
select n1.id as id1,
n1.name as name1,
n1.index as index1,
n2.id as id2,
n2.name as name2,
n2.index as index2
from names n1
join names n2 on n2.index = 2 -- this is the value of the "new index"
where n1.id = 3 -- this is the id of the row where the index should be changed to the new value
)
update names as n
set index = case
when n.id = s.id1 then s.index2
when n.id = s.id2 then s.index1
end
from swapped s
where n.id in (s.id1, s.id2);
The CTE first creates a single row with the ids of the two rows to be swapped and then the update just compares the ids of the target table with those from the CTE, swapping the values.
SQLFiddle example: http://sqlfiddle.com/#!15/71dc2/1
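Both answers above swap two rows. For the drag-and-drop behavior described in the question's edit, where every row between the old and new position shifts by one, a range update is one possible approach. The sketch below is an assumption-laden illustration, not taken from either answer: it uses Python's built-in sqlite3 module instead of PostgreSQL so that it is self-contained, `move_task` is a hypothetical helper, and the table has no unique constraint on the index column (with one, the intermediate states could collide).

```python
import sqlite3


def move_task(conn, task_id, new_index):
    """Move one task to new_index, shifting the rows in between by one."""
    (old_index,) = conn.execute(
        'SELECT "index" FROM tasks WHERE id = ?', (task_id,)
    ).fetchone()
    with conn:  # commit the updates together, or roll back on error
        if new_index < old_index:
            # Moving up: rows in [new_index, old_index) move one position later.
            conn.execute(
                'UPDATE tasks SET "index" = "index" + 1 '
                'WHERE "index" >= ? AND "index" < ?',
                (new_index, old_index),
            )
        elif new_index > old_index:
            # Moving down: rows in (old_index, new_index] move one position earlier.
            conn.execute(
                'UPDATE tasks SET "index" = "index" - 1 '
                'WHERE "index" > ? AND "index" <= ?',
                (old_index, new_index),
            )
        conn.execute(
            'UPDATE tasks SET "index" = ? WHERE id = ?', (new_index, task_id)
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE tasks (id INTEGER PRIMARY KEY, name TEXT, "index" INTEGER)'
)
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [(7, "name A", 1), (5, "name B", 2), (6, "name C", 3), (3, "name D", 4)],
)

move_task(conn, 3, 2)  # drag Task D to position 2
result = dict(conn.execute('SELECT id, "index" FROM tasks'))
print(result)
```

Only the rows between the two positions are touched, in at most two UPDATE statements, rather than one statement per row.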

rewrite T-SQL bitwise logic

How do I rewrite this T-SQL code so that it produces the same results?
SELECT ACC.Title,
ACC.AdvertiserHierarchyId,
1 AS Counter
FROM admanAdvertiserHierarchy_tbl ACC
JOIN dbo.admanAdvertiserObjectType_tbl AOT ON AOT.AdvertiserObjectTypeId = ACC.AdvertiserObjectTypeId
WHERE (EXISTS
(SELECT 1
FROM dbo.admanAdvertiserHierarchy_tbl CAMP
JOIN dbo.admanAdvertiserAdGroup_tbl AG ON CAMP.AdvertiserHierarchyId = AG.AdvertiserHierarchyId
JOIN dbo.admanAdvertiserCreative_tbl AC ON AC.AdvertiserAdGroupId = AG.AdvertiserAdGroupId
AND CAMP.ParentAdvertiserHierarchyId = ACC.AdvertiserHierarchyId
WHERE CAMP.ERROR = 0
AND AC.Dirty & 7 > 0
AND AC.ERROR = 0
AND AG.ERROR = 0 ))
It is preventing the optimizer from using indexes efficiently.
I am trying to achieve the following results:
Title AdvertiserHierarchyId Counter
trcom65@travelrepublic.co.uk 15908 1
paul570@travelrepublic.co.uk 37887 1
es88@travelrepublic.co.uk 37383 1
it004@travelrepublic.co.uk 27006 1
011 10526 1
013 10528 1
033 12013 1
062 17380 1
076 20505 1
This is a count of the Dirty tinyint column:
Dirty total
0 36340607
1 117569
2 873553
3 59
It links to a static reason table:
DirtyReasonId Title
0 Nothing
1 Overnight Engine
2 End To End
3 Overnight And End To End
4 Pause Resume
5 Overnight Engine and Paused
6 Overnight Engine E2E and Paused
7 All Three
If you are asking specifically about the use of the BITWISE AND operator, I believe you are correct, and it's unlikely that SQL Server sees that as sargable, at least, not with an index with Dirty as a leading column.
You are showing only the lowest two bits in use (maximum value of Dirty is 3), yet you are testing the lowest three bits.
So, AC.Dirty > 0 would return an equivalent result, given that 3 is largest value of Dirty. But there is a possibility that other (higher-order) bits are set, for example Dirty could be set to 8. So, if the intent is to check ONLY the lowest three bits, then we need to ensure that we test only the three lowest-order bits. This expression would do that, and one of the predicates is sargable:
( AC.Dirty > 0 AND AC.Dirty % 8 > 0 )
This basically tests first whether ANY bits in AC.Dirty are set, and then checks whether any of the last three bits are set. (We're using the MODULO division operator to return the remainder of AC.Dirty divided by 8, which will of course return an integer value between 0 and 7. If we get a zero, then we know that none of the lower three bits are set; otherwise we know at least one of those bits is set.)
Just to be clear: the predicate on AC.Dirty > 0 is redundant. It's included here in case you want to make sure that the database can at least consider using an existing index with Dirty as a leading column.
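The equivalence claimed above is easy to check exhaustively, since a tinyint only takes values from 0 to 255. The following Python snippet is a quick sanity check written for illustration (it is not T-SQL, and the variable names are made up); it compares the two predicates for every possible value.

```python
# Dirty is a tinyint, so 0..255 covers every possible value.
values = range(256)

# Values where the bitwise test and the modulo test disagree (expected: none).
mismatches = [d for d in values if ((d & 7) > 0) != ((d % 8) > 0)]

# Whenever the modulo test passes, Dirty > 0 holds as well, so adding
# "Dirty > 0" does not change the result.
redundant = all(d > 0 for d in values if d % 8 > 0)

# Values with only higher-order bits set fail the low-three-bits test.
higher_only = [(d & 7) > 0 for d in (8, 16, 24)]

print(mismatches, redundant, higher_only)
```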
I will mention that another option to consider would be adding a persisted COMPUTED COLUMN on the expression, and create an index on it. But that seems a bit overkill for what you need here.
If you are asking specifically about getting an index used on table admanAdvertiserCreative_tbl (AC), then likely your best candidate would be covering index on (AdvertiserAdGroupId, Error, Dirty).
The SQL rewrite below should return equivalent results, perhaps with better performance (depending on your data distribution, indexes, et al.)
Basically, replace the EXISTS (correlated subquery) with a JOIN to a subquery. The subquery returns distinct values of CAMP.ParentAdvertiserHierarchyId, which is the column you referenced to correlate the subquery.
This may or may not make use of any indexes, depending on what indexes are available. (You likely have clustered unique indexes on the primary keys and non-clustered indexes on the foreign keys, which should help join performance.)
Untested:
SELECT ACC.Title,
ACC.AdvertiserHierarchyId,
1 AS Counter
FROM admanAdvertiserHierarchy_tbl ACC
JOIN dbo.admanAdvertiserObjectType_tbl AOT
ON AOT.AdvertiserObjectTypeId = ACC.AdvertiserObjectTypeId
JOIN (SELECT CAMP.ParentAdvertiserHierarchyId
FROM dbo.admanAdvertiserHierarchy_tbl CAMP
JOIN dbo.admanAdvertiserAdGroup_tbl AG
ON CAMP.AdvertiserHierarchyId = AG.AdvertiserHierarchyId
JOIN dbo.admanAdvertiserCreative_tbl AC
ON AC.AdvertiserAdGroupId = AG.AdvertiserAdGroupId
WHERE CAMP.ERROR = 0
AND ( AC.Dirty > 0 AND AC.Dirty % 8 > 0 )
AND AC.ERROR = 0
AND AG.ERROR = 0
GROUP BY CAMP.ParentAdvertiserHierarchyId
) c
ON c.ParentAdvertiserHierarchyId = ACC.AdvertiserHierarchyId
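The shape of this rewrite, replacing a correlated EXISTS with a join to a de-duplicated subquery, can be tried out on a tiny example. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than SQL Server, and the `parent` and `child` tables with their columns are hypothetical, much simpler stand-ins for the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, dirty INTEGER);
    INSERT INTO parent VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- parent 1 has two qualifying children, parent 2 none,
    -- parent 3 only a child with a higher-order bit set
    INSERT INTO child VALUES (10, 1, 0), (11, 1, 2), (12, 1, 3), (13, 2, 0), (14, 3, 8);
""")

# Original shape: correlated EXISTS with the bitwise test.
exists_sql = """
    SELECT p.id FROM parent p
    WHERE EXISTS (SELECT 1 FROM child c
                  WHERE c.parent_id = p.id AND (c.dirty & 7) > 0)
    ORDER BY p.id
"""

# Rewritten shape: join to a subquery that returns each parent id once.
join_sql = """
    SELECT p.id FROM parent p
    JOIN (SELECT c.parent_id FROM child c
          WHERE c.dirty > 0 AND c.dirty % 8 > 0
          GROUP BY c.parent_id) d ON d.parent_id = p.id
    ORDER BY p.id
"""

exists_ids = [row[0] for row in conn.execute(exists_sql)]
join_ids = [row[0] for row in conn.execute(join_sql)]
print(exists_ids, join_ids)
```

The GROUP BY matters: without it, a parent with several qualifying children would appear once per child in the joined result, which the EXISTS form never does.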
