I'm using Redis lists and pushing new items onto a list. The problem is that I really only need the most recent 10 items in a list.
I'm using LPUSH to add items to a list and LRANGE to get the most recent 10.
Is there any way to drop items after a certain number? I'll end up with lists that may have thousands of items, which can cause latency problems.
Thank you!
After every LPUSH, call LTRIM to trim the list to 10 elements.
See http://redis.io/commands/ltrim
You can call LTRIM intermittently rather than after every LPUSH, since trimming on every push adds to the overall latency of your app. Redis is really fast, but this way you still save a lot of LTRIM operations.
Here is some pseudocode that runs LTRIM on approximately every 5th LPUSH:
LPUSH mylist 1
random_int = some random number between 1 and 5
if random_int == 1:  # trim the list with a 1-in-5 chance
    LTRIM mylist 0 9
Your list may at times grow a few elements beyond 10, but it will get truncated again at regular intervals.
This approach is good for most practical purposes and saves a lot of LTRIM operations, keeping your pushes fast.
The following code pushes the item to the list, keeps the size fixed at 10, and returns the most recent 10 elements, all in one transaction:
MULTI
LPUSH list "item1"
LTRIM list 0 9
LRANGE list 0 9
EXEC
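For reference, the same transaction can be issued from a client library. This is only a sketch: it assumes the redis-py client, whose pipeline(transaction=True) queues the commands and sends them wrapped in MULTI/EXEC. The function name push_capped and the default limit are made up for the example.

```python
def push_capped(client, key, item, limit=10):
    """Push `item`, trim the list to `limit` entries and return them atomically.

    `client` is assumed to be a redis-py client (e.g. redis.Redis()).
    """
    pipe = client.pipeline(transaction=True)  # commands run inside MULTI/EXEC
    pipe.lpush(key, item)
    pipe.ltrim(key, 0, limit - 1)   # inclusive indexes: 0..limit-1 keeps `limit` items
    pipe.lrange(key, 0, limit - 1)
    _length, _trimmed, items = pipe.execute()  # one result per queued command
    return items
```

With a real server, push_capped(redis.Redis(), "list", "item1") would return the newest entries, as bytes unless the client was created with decode_responses=True.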
No one has mentioned trimming from the tail, which keeps only the 10 most recent items when you append with RPUSH.
Let's create a sample list with 15 items (here just numbers):
RPUSH list 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Now trim using offsets from the end of the list:
LTRIM list -10 -1
Show the list:
LRANGE list 0 -1
1) "6"
2) "7"
3) "8"
4) "9"
5) "10"
6) "11"
7) "12"
8) "13"
9) "14"
10) "15"
Now you can add new items and run the trim again:
RPUSH list 16
LTRIM list -10 -1
LRANGE list 0 -1
1) "7"
2) "8"
3) "9"
4) "10"
5) "11"
6) "12"
7) "13"
8) "14"
9) "15"
10) "16"
Just an alternative. According to the official documentation, LPUSH returns the length of the list after the push operation. You can set a threshold length k (in your case k > 10) and call LTRIM only when the returned length is bigger than k. Sample pseudocode follows:
len = LPUSH mylist xxx
if len > k:
    LTRIM mylist 0 9
LRANGE mylist 0 9
This is more controllable than the random method. A larger k triggers fewer LTRIM calls at the cost of more memory. You can adjust k according to how often you want to call LTRIM, since every extra command has a cost.
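The threshold idea can be sketched in Python as follows. It assumes a redis-py style client whose lpush returns the new list length; the function name and the default values for keep and k are illustrative only.

```python
def push_with_threshold(client, key, item, keep=10, k=50):
    """LPUSH `item`; trim back to `keep` entries only once the list exceeds `k`."""
    length = client.lpush(key, item)      # LPUSH replies with the new length
    if length > k:
        client.ltrim(key, 0, keep - 1)    # keep the `keep` newest entries
    return client.lrange(key, 0, keep - 1)
```

Between trims the list holds at most k entries, so k bounds the extra memory per list. Note that the three commands here are sent separately, so this version is not atomic.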
Calling LTRIM <list-name> 0 9 after LPUSH <list-name> <item>
is the simplest answer. Many have covered it.
You must do these two operations in a transaction, or use a Lua script, to ensure the operation is atomic.
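The LTRIM index arguments are easy to get wrong, so here is a small pure-Python model of how they behave, as I understand the documented semantics: both bounds are inclusive, negative values count from the tail, and a start that ends up after the stop empties the list. The function is only an illustration, not Redis code.

```python
def ltrim(items, start, stop):
    """Model of Redis LTRIM index handling on a Python list (inclusive bounds)."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)   # negative start counts from the tail
    if stop < 0:
        stop = n + stop             # negative stop counts from the tail
    stop = min(stop, n - 1)         # a stop past the end is clamped
    if start > stop:
        return []                   # empty range: the list is emptied
    return items[start:stop + 1]


data = list(range(1, 16))            # 15 items
print(len(ltrim(data, 0, 9)))        # 10 items kept (head side)
print(len(ltrim(data, 0, 10)))       # 11 items kept, one more than intended
print(ltrim(data, -10, -1)[0])       # 6: the last 10 items are kept
print(ltrim(data, -1, -10))          # []: start lies after stop
```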
In my program I keep filling the following array with data obtained from a database table then inspect it to find certain words:
01  PRODUCTS-TABLE.
    03  PRODUCT-LINE PIC X(40) OCCURS 50 TIMES.
Sometimes it holds 6 lines, sometimes more than 6.
I'd like to find the number of lines in the array every time I write data to it. How can I do that?
I tried this, but it is based on a fixed length:
INSPECT-PROCESS.
    MOVE 0 TO TALLY-1.
    INSPECT PRODUCTS-TABLE TALLYING TALLY-1 FOR ALL "PRODUCT"
    IF TALLY-1 > 0
        MOVE SER-NUMBER TO HITS-SN-OUTPUT
        MOVE FILLER-SYM TO FILLER-O
        MOVE PRODUCT-LINE(1) TO HITS-PR-OUTPUT
        WRITE HITS-REC
        PERFORM WRITE-REPORT VARYING CNT1 FROM 2 BY 1 UNTIL CNT1 = 11.
WRITE-REPORT.
    MOVE " " TO HITS-SN-OUTPUT
    MOVE PRODUCT-LINE(CNT1) TO HITS-TX-OUTPUT
    WRITE HITS-REC.
In the first output line it writes the SN and the first product-line; in the following lines it writes all the remaining product-lines and blanks out the SN.
Something like:
12345678 first product-line
Second product-line
etc
It’s working. However, it only stops when CNT1 is 11. How can I feed the procedure a variable upper bound for CNT1 based on how many lines are actually in PRODUCTS-TABLE each time?
I solved the problem by adding an array line counter (LINE-COUNTER-1) that counts how many times I add data to the array (ADD 1 TO LINE-COUNTER-1), and I stop writing the report when WRITE-COUNTER = LINE-COUNTER-1:
INSPECT-PROCESS.
    MOVE 0 TO TALLY-1
    INSPECT PRODUCTS-TABLE TALLYING TALLY-1 FOR ALL "PRODUCT"
    IF TALLY-1 > 0
        MOVE HOLD-SER-NUM TO HITS-SN-OUTPUT
        MOVE FILLER-SYM TO FILLER-O
        MOVE PRODUCT-LINE(1) TO HITS-PR-OUTPUT
        WRITE HITS-REC
        PERFORM WRITE-REPORT VARYING WRITE-COUNTER FROM 2 BY 1
            UNTIL WRITE-COUNTER = LINE-COUNTER-1.
I just learned about the DO loop today and would like to try using it for data entry in SAS. I have tried most of the examples online, but I still cannot figure it out.
My dataset comes from an experiment with 6 treatments (1 to 6) using 2 sets of cues, Auditory and Visual, with 3 treatments each. Lag is measured in seconds and takes the values 5, 10, and 15, once for each set of cues.
Basically it looks like this:
Table
The entries I want are:
1. Obs_no, ranging from 1 to 18 (18 observations in total; this lets me easily delete outliers with an IF THEN)
2. Treatment type, which is Auditory or Visual.
3. Treatment number, 1 to 6, 3 sets.
4. Lag: 5, 10 or 15.
5. And the data itself
So far, my code handles items 2 and 5. The rest is possible with IF THEN statements and an INPUT statement, although I assume there is a much easier method:
data AVCue;
  do cue = 'Auditory','Visual';
    do i = 1 to 3;
      input AVCue @@;
      output;
    end;
  end;
datalines;
.204 .167 .202 .257 .283 .256
.170 .182 .198 .279 .235 .281
.181 .187 .236 .269 .260 .258
;
Lag and the rest were made possible using IF THEN statements and the crude method of a second INPUT:
data AVCue;
set AVCue;
IF i=1 THEN Lag=5;
IF i=2 THEN Lag=10;
IF i=3 THEN Lag=15;
input obs_no treatment;
cards;
1 1
2 2
3 3
4 4
5 5
6 6
7 1
8 2
9 3
10 4
11 5
12 6
13 1
14 2
15 3
16 4
17 5
18 6
;
proc print data=AVCue;
run;
The IF THEN statements should be fine, but the INPUT statement here is, in my opinion, counterproductive and defeats the purpose of using loops, which to me is to save time. Done this way, I might as well put the data into Excel and import it, or type everything out with ample copying and pasting of the text in the
input obs_no treatment;
cards;
section.
My coding knowledge is basic, so sorry if this question sounds silly. I want to know:
1. How would I make a list of numbers using DO loops in SAS? I've made several attempts and all I get is a list containing only the next number. I know why this happens: the loop counts to x and the value assigned is just x. I just don't know how to get around that. Somehow this didn't happen in the datalines section; I guess SAS knows there are 18 numbers and stores each entry accordingly... or something?
2. How would I go about assigning, in this case, the numbers 1 to 6 to each entry?
Thanks!
It is certainly much easier to read in the actual dataset than to derive some of the variables from the order the values have in the source data. You might be able to combine a SET statement and an INPUT statement in the same data step and get it to work, but it is probably NOT worth the effort. Just make two datasets and merge them.
Looking at the photograph you posted it looks like TREATMENT is not an independent variable. Instead it is just a label for the combination of CUE and LAG. To make it cycle from 1 to 6 just reset it back to 1 when it gets too large.
data AVCue;
  do cue = 'Auditory','Visual';
    do lag = 5, 10, 15;
      treatment+1;
      if treatment=7 then treatment=1;
      obsno+1;
      input AVCue @@;
      output;
    end;
  end;
datalines;
.204 .167 .202 .257 .283 .256
.170 .182 .198 .279 .235 .281
.181 .187 .236 .269 .260 .258
;
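To check what the data step above should produce, the same design can be mirrored outside SAS. The Python sketch below is only an illustration of the expected 18 rows (the function name build_rows is made up): the observation number runs from 1 to 18, the treatment cycles from 1 to 6, and cue and lag follow from the treatment.

```python
VALUES = [.204, .167, .202, .257, .283, .256,
          .170, .182, .198, .279, .235, .281,
          .181, .187, .236, .269, .260, .258]


def build_rows(values):
    """Attach obsno, cue, lag and treatment to each value, in reading order."""
    # Same nesting as the SAS loops: cue is the outer loop, lag the inner one.
    design = [(cue, lag) for cue in ("Auditory", "Visual") for lag in (5, 10, 15)]
    rows = []
    for obsno, value in enumerate(values, start=1):
        treatment = (obsno - 1) % 6 + 1          # cycles 1..6
        cue, lag = design[treatment - 1]
        rows.append({"obsno": obsno, "cue": cue, "lag": lag,
                     "treatment": treatment, "AVCue": value})
    return rows


for row in build_rows(VALUES)[:6]:
    print(row)
```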
You can get into trouble if you just let SAS guess how you want your variables defined. For example, if you change the order of the CUE values to do cue = 'Visual','Auditory'; then SAS will create CUE with length $6 instead of $8, and 'Auditory' will be truncated. Add a LENGTH statement to define your variables before you use them.
length obsno 8 treatment 8 cue $8 lag 8 AVCue 8 ;
This will also let you control the order they are created in the dataset.
If you really did already have a SAS dataset and you wanted to add a variable like TREATMENT that cycles from 1 to 6 (or really any DO loop construct), then you could nest the SET statement inside the DO loop. Just remember to add the explicit OUTPUT statement.
data new;
  do treatment=1 to 6;
    set old;
    output;
  end;
run;
I want to do something I thought was really simple.
My (mock) data looks like this:
data list free/totalscore.1 to totalscore.5.
begin data.
1 2 6 7 10 1 4 9 11 12 0 2 4 6 9
end data.
These are total scores accumulating over a number of trials (in this mock data, from 1 to 5). Now I want to know the number of points earned in each trial. In other words, I want to subtract the value in trial n from the value in trial n+1.
The most simple syntax would look like this:
COMPUTE trialscore.1 = totalscore.2 - totalscore.1.
EXECUTE.
COMPUTE trialscore.2 = totalscore.3 - totalscore.2.
EXECUTE.
COMPUTE trialscore.3 = totalscore.4 - totalscore.3.
EXECUTE.
And so on...
So that the result would look like this:
But of course it is not possible and not fun to do this for 200+ variables.
I attempted to write a syntax using VECTOR and DO REPEAT as follows:
COMPUTE #y = 1.
VECTOR totalscore = totalscore.1 to totalscore.5.
DO REPEAT trialscore = trialscore.1 to trialscore.5.
COMPUTE #y = #x + 1.
END REPEAT.
COMPUTE trialscore(#i) = totalscore(#y) - totalscore(#i).
EXECUTE.
But it doesn't work.
Any help is appreciated.
Ps. I've looked into using LAG but that loops over rows while I need it to go over 1 column at a time.
I am assuming respid is your original (unique) record identifier.
EDIT:
If you do not have a record identifier, you can very easily create a dummy one:
compute respid=$casenum.
exe.
end of EDIT
You could try re-structuring the data, so that each score is a distinct record:
varstocases
/make totalscore from totalscore.1 to totalscore.5
/index=scorenumber
/NULL=keep.
exe.
Then sort your cases so that the scores are in descending order (in order to be able to use the lag function):
sort cases by respid (a) scorenumber (d).
Then do the lag-based computations. Because the cases are sorted in descending order, lag(totalscore) holds the score of the next trial:
do if respid=lag(respid).
compute trialscore=lag(totalscore)-totalscore.
end if.
exe.
In the end, undo the restructuring:
casestovars
/id=respid
/index=scorenumber.
exe.
You should end up with a set of trialscore variables (the last one will be empty), which will hold what you need.
You can use DO REPEAT this way:
do repeat
before=totalscore.1 to totalscore.4
/after=totalscore.2 to totalscore.5
/diff=trialscore.1 to trialscore.4 .
compute diff=after-before.
end repeat.
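The DO REPEAT above pairs each variable with its successor and subtracts. As a quick way to check the expected numbers for the mock data, the same pairing can be written in Python (a sketch only; the function name trial_scores is made up):

```python
def trial_scores(totals):
    """Differences between consecutive cumulative totals (n+1 minus n)."""
    return [after - before for before, after in zip(totals, totals[1:])]


cases = [[1, 2, 6, 7, 10], [1, 4, 9, 11, 12], [0, 2, 4, 6, 9]]
for totals in cases:
    print(trial_scores(totals))
# [1, 4, 1, 3]
# [3, 5, 2, 1]
# [2, 2, 2, 3]
```

Five totals give four differences, which is why the DO REPEAT lists stop at totalscore.4 and trialscore.4.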
I need to get all the scores available in a Redis sorted set.
redis> ZADD myzset 10 "one"
(integer) 1
redis> ZADD myzset 20 "two"
(integer) 1
redis> ZADD myzset 30 "three"
(integer) 1
Now I want to retrieve all the scores for myzset, i.e. 10, 20, 30.
EDIT: Since your problem with the size of the values wasn't obvious before, I did some additional research.
According to the current documentation, there is no way to get just the scores from a sorted set.
To get just the scores, you'll need to add them to a separate set at the same time and read them from there when needed.
What you should probably do first though is to try to map your problem differently into data structures. I can't tell from your question why you'd need to get the scores, but there may be other ways to structure the problem that will map better to Redis.
--
I'm not sure there is any way to get all the scores without getting the members, but ZRANGE will at least get the information you're looking for:
redis> ZADD myzset 10 "one"
(integer) 1
redis> ZADD myzset 20 "two"
(integer) 1
redis> ZADD myzset 30 "three"
(integer) 1
redis> ZRANGE myzset 0 -1 WITHSCORES
["one","10","two","20","three","30"]
One way to address this problem is to use server-side Lua scripting.
Consider the following script:
local res = {}
local result = {}
local tmp = redis.call( 'zrange', KEYS[1], 0, -1, 'withscores' )
for i=1,#tmp,2 do
   res[tmp[i+1]]=true
end
for k,_ in pairs(res) do
   table.insert(result,k)
end
return result
You can execute it by using the EVAL command.
It uses the zrange command to extract the content of the zset (with scores), then builds a set (represented by a table in Lua) to remove redundant scores, and finally builds the reply table. So the values of the zset are never sent over the network.
This script has a flaw if the number of items in the zset is really high, because it copies the entire zset into a Lua object (so it takes memory). However, it is easy to alter it to iterate over the zset incrementally (20 items at a time). For instance:
local res = {}
local result = {}
local n = redis.call( 'zcard', KEYS[1] )
local i=0
while i<n do
   -- zrange bounds are inclusive, so i..i+19 fetches 20 items
   local tmp = redis.call( 'zrange', KEYS[1], i, i+19, 'withscores' )
   for j=1,#tmp,2 do
      res[tmp[j+1]]=true
      i = i + 1
   end
end
for k,_ in pairs(res) do
   table.insert(result,k)
end
return result
Please note I am a total newbie in Lua, so there are perhaps more elegant ways to achieve the same thing.
You need to pass the optional WITHSCORES argument. From the documentation:
ZREVRANGE key start stop [WITHSCORES]: Return a range of members in a sorted set, by index, with scores ordered from high to low
In Ruby, the following call will do:
redis.zrange("zset", 0, -1, :with_scores => true)
# => [["a", 32.0], ["b", 64.0]]
Source: Ruby docs
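A Python counterpart, as a sketch: it assumes the redis-py client, whose zrange accepts withscores=True and returns (member, score) pairs. The function name distinct_scores is made up; the members are still transferred to the client and simply dropped there.

```python
def distinct_scores(client, key):
    """Return the sorted distinct scores stored in the sorted set `key`."""
    pairs = client.zrange(key, 0, -1, withscores=True)  # [(member, score), ...]
    return sorted({score for _member, score in pairs})
```

For the myzset example in the question, distinct_scores(redis.Redis(), "myzset") should give the three scores 10, 20 and 30 as floats.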
I have the following table in Excel (blank spaces are empty):
     A    B    C    D
1    1
2    3
3    4
4   -2
5    4
6    9
7    8
8
9
10
I would like to return the minimum of column A from A1 to A1000000, using the QUARTILE function, while excluding all negative values. The reason I want it from A1 to A1000000 and not A1 to A7 is because I want to update the table (adding new rows starting from A8) and have the formula also automatically update. The reason I want the QUARTILE and not MIN function is because I will be extending it to calculate other statistics like 1st and 3rd quartile.
This function works correctly and returns 1 (pressing ctrl+shift+enter):
QUARTILE(IF(A1:A7 > -1, A1:A7), 0)
However, when I tried the following, it returned 0 when it should still return 1 (pressing ctrl+shift+enter):
QUARTILE(IF(A1:A1000000 > -1, A1:A1000000), 0)
I also tried the following and it returned 0 (pressing ctrl+shift+enter):
QUARTILE(IF(AND(NOT(ISBLANK(A1:A1000000)), A1:A1000000 > -1), A1:A1000000), 0)
Anybody have a solution to my problem?
Create a dynamic named range called, for example, rng, defined by =OFFSET($A$1,0,0,COUNT($A$1:$A$1000000),1)
Then modify your array formula to refer to rng: =QUARTILE(IF(rng>-1,rng),0)
Actually what you have works. Try doing:
=QUARTILE(IF(A:A > 0,A:A ),0)
The reason you are getting 0 is that a blank cell is treated as the value 0 when this formula is run. For example, erase one of the values in the A1:A7 range and your original formula will return 0. Also, I would run the formula on the entire A column if possible (for readability, etc.).
Or do you need to return a "0" if that number is in the list?
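To make the blank-cell effect concrete, here is a small Python model of it. It is an illustration only, not how Excel is implemented: blanks are represented as None and read as 0, as described above, and QUARTILE(..., 0) is modeled as a plain minimum.

```python
def filtered_min(cells, threshold):
    """Minimum of the cells that pass `cell > threshold`, with blanks read as 0."""
    coerced = [0 if cell is None else cell for cell in cells]  # blank -> 0
    kept = [value for value in coerced if value > threshold]
    return min(kept)


column = [1, 3, 4, -2, 4, 9, 8, None, None, None]  # A1:A10 from the question

print(filtered_min(column[:7], -1))  # 1: no blanks in A1:A7
print(filtered_min(column, -1))      # 0: blanks pass the "> -1" test as zeros
print(filtered_min(column, 0))       # 1: "> 0" filters the blanks out again
```

The last call also drops any genuine 0 in the data, which is the trade-off raised in the question above.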