Known that the function dropPartition requires parameters like this:
dropPartition(dbHandle, partitionPaths, [tableName])
How to specify the parameter partitionPaths if I want to drop several range partitions with one call of the function? I can only drop one partition once in this way:
db=database(dbPath, RANGE, 0 5 10 15 20)
t=table(rand(20, 100) as ID,rand(1.0, 100) as x)
pt = db.createPartitionedTable(t, `tb1, `ID)
pt.append!(t)
dropPartition(db,"0_5")
To drop multiple partitions in one call, please put partition paths in array.
dropPartition(db,["/0_5", "/10_15"])
Related
If I have a function that returns a row of cells (for instance from ImportHTML), the results will always return 9 cells and I need the function to only return the cells from A2:A5 of the results, how can I trim the results to only give me the cells within the indexes that I need?
I need this to work inline, and bonus points if you can do it without using Query, because this will be on a huge spreadsheet with tens of thousands of rows, and performance will be a factor.
If I have a function that returns a row of cells
for columns B:E:
=QUERY(your-formula-here, "select Col2,Col3,Col4,Col5", 0)
without query:
=ARRAY_CONSTRAIN(OFFSET(your-formula-here,,1), 9^9, 4)
in array:
={INDEX(your-formula-here,,2), INDEX(your-formula-here,,3),
INDEX(your-formula-here,,4), INDEX(your-formula-here,,5)}
an overkill:
=ARRAYFORMULA(IFNA(VLOOKUP(INDEX(your-formula-here,,1), your-formula-here, {2,3,4,5}, 0)))
Try this:
=query({1;2;3;4;5;6;7;8;9},"Select * limit 5")
The Result:
{1,2,3,4,5}
Change {1;2;3;4;5;6;7;8;9} to your data source, for example from Range, HTML or importrange
or you use like this:
=query(query({1;2;3;4;5;6;7;8;9},"Select * offset 1"), "Select * limit 4")
or
=query({1;2;3;4;5;6;7;8;9},"Select * limit 4 offset 1")
Both give the Result:
{2,3,4,5}
or
=ARRAY_CONSTRAIN(offset({1;2;3;4;5;6;7;8;9},1 ,0),4,1)
The Result:
{2,3,4,5}
I want to do something I thought was really simple.
My (mock) data looks like this:
data list free/totalscore.1 to totalscore.5.
begin data.
1 2 6 7 10 1 4 9 11 12 0 2 4 6 9
end data.
These are total scores accumulating over a number of trials (in this mock data, from 1 to 5). Now I want to know the number of scores earned in each trial. In other words, I want to subtract the value in the n trial from the n+1 trial.
The most simple syntax would look like this:
COMPUTE trialscore.1 = totalscore.2 - totalscore.1.
EXECUTE.
COMPUTE trialscore.2 = totalscore.3 - totalscore.2.
EXECUTE.
COMPUTE trialscore.3 = totalscore.4 - totalscore.3.
EXECUTE.
And so on...
So that the result would look like this:
But of course it is not possible and not fun to do this for 200+ variables.
I attempted to write a syntax using VECTOR and DO REPEAT as follows:
COMPUTE #y = 1.
VECTOR totalscore = totalscore.1 to totalscore.5.
DO REPEAT trialscore = trialscore.1 to trialscore.5.
COMPUTE #y = #x + 1.
END REPEAT.
COMPUTE trialscore(#i) = totalscore(#y) - totalscore(#i).
EXECUTE.
But it doesn't work.
Any help is appreciated.
Ps. I've looked into using LAG but that loops over rows while I need it to go over 1 column at a time.
I am assuming respid is your original (unique) record identifier.
EDIT:
If you do not have a record indentifier, you can very easily create a dummy one:
compute respid=$casenum.
exe.
end of EDIT
You could try re-structuring the data, so that each score is a distinct record:
varstocases
/make totalscore from totalscore.1 to totalscore.5
/index=scorenumber
/NULL=keep.
exe.
then sort your cases so that scores are in descending order (in order to be bale to use lag function):
sort cases by respid (a) scorenumber (d).
Then actually do the lag-based computations
do if respid=lag(respid).
compute trialscore=totalscore-lag(totalscore).
end if.
exe.
In the end, un-do the restructuring:
casestovars
/id=respid
/index=scorenumber.
exe.
You should end up with a set of totalscore variables (the last one will be empty), which will hold what you need.
you can use do repeat this way:
do repeat
before=totalscore.1 to totalscore.4
/after=totalscore.2 to totalscore.5
/diff=trialscore.1 to trialscore.4 .
compute diff=after-before.
end repeat.
Given a PostgreSQL ARRAY of items of one type, how can I create a new array where each item is derived from the items in the initial array?
Example: I have an array of INTERVAL values. I want a new array where each item is a NUMERIC(10, 1) that is the total number of seconds in the corresponding INTERVAL value.
I know how to convert one INTERVAL value:
foo=> SELECT '00:01:20.000'::INTERVAL AS duration_interval;
duration_interval
-------------------
00:01:20
(1 row)
foo=> SELECT extract(EPOCH FROM date_trunc('second', '00:01:20.000'::INTERVAL))
::NUMERIC(10, 1) AS duration_seconds;
duration_seconds
------------------
80.0
(1 row)
The array does not exist in a table – this is a value returned from another function call – so the conversion code needs to operate on it as an array.
How can I convert an array of INTERVAL values to an array of corresponding NUMERIC values?
You need to unnest() the array, do the conversion and then aggregate back into an array.
Assuming you want to do this on a real table with a primary key:
SELECT pk, array_agg(extract(epoch from dur_int)::numeric(10,1)
ORDER BY ordinality) AS duration_seconds
FROM my_table, unnest(duration_interval) WITH ORDINALITY d(dur_int)
GROUP BY pk;
If you have a single array, such as the result from a function call:
SELECT array_agg(extract(epoch from dur_int)::numeric(10,1)
ORDER BY ordinality) AS duration_seconds
FROM unnest(function(...)) WITH ORDINALITY d(dur_int);
Note that you need the WITH ORDINALITY clause when unnesting the array. This will add a column ordinality to the result such that every row has two columns: (dur_int interval, ordinality bigint). When putting the array back again with seconds instead of an interval, you order the rows by the ordinality column. That way you ensure that the order in the resulting array of seconds is the same as in the original array of intervals. (In general, SQL row sources have no specific ordering, the server may present rows in any order it prefers.)
If you have access to the function and you are not breaking other uses of it, you might be better off by changing the function such that you can use its result directly.
If there is a primary key then #Patrick answer is enough. If not then use row_number to aggregate on:
with i(i) as (values
(array['00:01:20.000','00:00:30.000']::interval[]),
(array['00:02:10.000','00:01:30.000']::interval[])
)
select array_agg(extract(epoch from a)::numeric(10,1))
from (
select i, row_number() over() as r
from i
) s, unnest(i) a (a)
group by r
;
array_agg
--------------
{80.0,30.0}
{130.0,90.0}
I have a cell array called BodyData in MATLAB that has around 139 columns and 3500 odd rows of skeletal tracking data.
I need to extract all rows between two string values (these are timestamps when an event happened) that I have
e.g.
BodyData{}=
Column 1 2 3
'10:15:15.332' 'BASE05' ...
...
'10:17:33:230' 'BASE05' ...
The two timestamps should match a value in the array but might also be within a few ms of those in the array e.g.
TimeStamp1 = '10:15:15.560'
TimeStamp2 = '10:17:33.233'
I have several questions!
How can I return an array for all the data between the two string values plus or minus a small threshold of say .100ms?
Also can I also add another condition to say that all str values in column2 must also be the same, otherwise ignore? For example, only return the timestamps between A and B only if 'BASE02'
Many thanks,
The best approach to the first part of your problem is probably to change from strings to numeric date values. In Matlab this can be done quite painlessly with datenum.
For the second part you can just use logical indexing... this is were you put a condition (i.e. that second columns is BASE02) within the indexing expression.
A self-contained example:
% some example data:
BodyData = {'10:15:15.332', 'BASE05', 'foo';...
'10:15:16.332', 'BASE02', 'bar';...
'10:15:17.332', 'BASE05', 'foo';...
'10:15:18.332', 'BASE02', 'foo';...
'10:15:19.332', 'BASE05', 'bar'};
% create column vector of numeric times, and define start/end times
dateValues = datenum(BodyData(:, 1), 'HH:MM:SS.FFF');
startTime = datenum('10:15:16.100', 'HH:MM:SS.FFF');
endTime = datenum('10:15:18.500', 'HH:MM:SS.FFF');
% select data in range, and where second column is 'BASE02'
BodyData(dateValues > startTime & dateValues < endTime & strcmp(BodyData(:, 2), 'BASE02'), :)
Returns:
ans =
'10:15:16.332' 'BASE02' 'bar'
'10:15:18.332' 'BASE02' 'foo'
References: datenum manual page, matlab help page on logical indexing.
I need a different random number for each row in my table. The following seemingly obvious code uses the same random value for each row.
SELECT table_name, RAND() magic_number
FROM information_schema.tables
I'd like to get an INT or a FLOAT out of this. The rest of the story is I'm going to use this random number to create a random date offset from a known date, e.g. 1-14 days offset from a start date.
This is for Microsoft SQL Server 2000.
Take a look at SQL Server - Set based random numbers which has a very detailed explanation.
To summarize, the following code generates a random number between 0 and 13 inclusive with a uniform distribution:
ABS(CHECKSUM(NewId())) % 14
To change your range, just change the number at the end of the expression. Be extra careful if you need a range that includes both positive and negative numbers. If you do it wrong, it's possible to double-count the number 0.
A small warning for the math nuts in the room: there is a very slight bias in this code. CHECKSUM() results in numbers that are uniform across the entire range of the sql Int datatype, or at least as near so as my (the editor) testing can show. However, there will be some bias when CHECKSUM() produces a number at the very top end of that range. Any time you get a number between the maximum possible integer and the last exact multiple of the size of your desired range (14 in this case) before that maximum integer, those results are favored over the remaining portion of your range that cannot be produced from that last multiple of 14.
As an example, imagine the entire range of the Int type is only 19. 19 is the largest possible integer you can hold. When CHECKSUM() results in 14-19, these correspond to results 0-5. Those numbers would be heavily favored over 6-13, because CHECKSUM() is twice as likely to generate them. It's easier to demonstrate this visually. Below is the entire possible set of results for our imaginary integer range:
Checksum Integer: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Range Result: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 0 1 2 3 4 5
You can see here that there are more chances to produce some numbers than others: bias. Thankfully, the actual range of the Int type is much larger... so much so that in most cases the bias is nearly undetectable. However, it is something to be aware of if you ever find yourself doing this for serious security code.
When called multiple times in a single batch, rand() returns the same number.
I'd suggest using convert(varbinary,newid()) as the seed argument:
SELECT table_name, 1.0 + floor(14 * RAND(convert(varbinary, newid()))) magic_number
FROM information_schema.tables
newid() is guaranteed to return a different value each time it's called, even within the same batch, so using it as a seed will prompt rand() to give a different value each time.
Edited to get a random whole number from 1 to 14.
RAND(CHECKSUM(NEWID()))
The above will generate a (pseudo-) random number between 0 and 1, exclusive. If used in a select, because the seed value changes for each row, it will generate a new random number for each row (it is not guaranteed to generate a unique number per row however).
Example when combined with an upper limit of 10 (produces numbers 1 - 10):
CAST(RAND(CHECKSUM(NEWID())) * 10 as INT) + 1
Transact-SQL Documentation:
CAST(): https://learn.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql
RAND(): http://msdn.microsoft.com/en-us/library/ms177610.aspx
CHECKSUM(): http://msdn.microsoft.com/en-us/library/ms189788.aspx
NEWID(): https://learn.microsoft.com/en-us/sql/t-sql/functions/newid-transact-sql
Random number generation between 1000 and 9999 inclusive:
FLOOR(RAND(CHECKSUM(NEWID()))*(9999-1000+1)+1000)
"+1" - to include upper bound values(9999 for previous example)
Answering the old question, but this answer has not been provided previously, and hopefully this will be useful for someone finding this results through a search engine.
With SQL Server 2008, a new function has been introduced, CRYPT_GEN_RANDOM(8), which uses CryptoAPI to produce a cryptographically strong random number, returned as VARBINARY(8000). Here's the documentation page: https://learn.microsoft.com/en-us/sql/t-sql/functions/crypt-gen-random-transact-sql
So to get a random number, you can simply call the function and cast it to the necessary type:
select CAST(CRYPT_GEN_RANDOM(8) AS bigint)
or to get a float between -1 and +1, you could do something like this:
select CAST(CRYPT_GEN_RANDOM(8) AS bigint) % 1000000000 / 1000000000.0
The Rand() function will generate the same random number, if used in a table SELECT query. Same applies if you use a seed to the Rand function. An alternative way to do it, is using this:
SELECT ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) AS [RandomNumber]
Got the information from here, which explains the problem very well.
Do you have an integer value in each row that you could pass as a seed to the RAND function?
To get an integer between 1 and 14 I believe this would work:
FLOOR( RAND(<yourseed>) * 14) + 1
If you need to preserve your seed so that it generates the "same" random data every time, you can do the following:
1. Create a view that returns select rand()
if object_id('cr_sample_randView') is not null
begin
drop view cr_sample_randView
end
go
create view cr_sample_randView
as
select rand() as random_number
go
2. Create a UDF that selects the value from the view.
if object_id('cr_sample_fnPerRowRand') is not null
begin
drop function cr_sample_fnPerRowRand
end
go
create function cr_sample_fnPerRowRand()
returns float
as
begin
declare #returnValue float
select #returnValue = random_number from cr_sample_randView
return #returnValue
end
go
3. Before selecting your data, seed the rand() function, and then use the UDF in your select statement.
select rand(200); -- see the rand() function
with cte(id) as
(select row_number() over(order by object_id) from sys.all_objects)
select
id,
dbo.cr_sample_fnPerRowRand()
from cte
where id <= 1000 -- limit the results to 1000 random numbers
select round(rand(checksum(newid()))*(10)+20,2)
Here the random number will come in between 20 and 30.
round will give two decimal place maximum.
If you want negative numbers you can do it with
select round(rand(checksum(newid()))*(10)-60,2)
Then the min value will be -60 and max will be -50.
try using a seed value in the RAND(seedInt). RAND() will only execute once per statement that is why you see the same number each time.
If you don't need it to be an integer, but any random unique identifier, you can use newid()
SELECT table_name, newid() magic_number
FROM information_schema.tables
You would need to call RAND() for each row. Here is a good example
https://web.archive.org/web/20090216200320/http://dotnet.org.za/calmyourself/archive/2007/04/13/sql-rand-trap-same-value-per-row.aspx
The problem I sometimes have with the selected "Answer" is that the distribution isn't always even. If you need a very even distribution of random 1 - 14 among lots of rows, you can do something like this (my database has 511 tables, so this works. If you have less rows than you do random number span, this does not work well):
SELECT table_name, ntile(14) over(order by newId()) randomNumber
FROM information_schema.tables
This kind of does the opposite of normal random solutions in the sense that it keeps the numbers sequenced and randomizes the other column.
Remember, I have 511 tables in my database (which is pertinent only b/c we're selecting from the information_schema). If I take the previous query and put it into a temp table #X, and then run this query on the resulting data:
select randomNumber, count(*) ct from #X
group by randomNumber
I get this result, showing me that my random number is VERY evenly distributed among the many rows:
It's as easy as:
DECLARE #rv FLOAT;
SELECT #rv = rand();
And this will put a random number between 0-99 into a table:
CREATE TABLE R
(
Number int
)
DECLARE #rv FLOAT;
SELECT #rv = rand();
INSERT INTO dbo.R
(Number)
values((#rv * 100));
SELECT * FROM R
select ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) as [Randomizer]
has always worked for me
Use newid()
select newid()
or possibly this
select binary_checksum(newid())
If you want to generate a random number between 1 and 14 inclusive.
SELECT CONVERT(int, RAND() * (14 - 1) + 1)
OR
SELECT ABS(CHECKSUM(NewId())) % (14 -1) + 1
DROP VIEW IF EXISTS vwGetNewNumber;
GO
Create View vwGetNewNumber
as
Select CAST(RAND(CHECKSUM(NEWID())) * 62 as INT) + 1 as NextID,
'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'as alpha_num;
---------------CTDE_GENERATE_PUBLIC_KEY -----------------
DROP FUNCTION IF EXISTS CTDE_GENERATE_PUBLIC_KEY;
GO
create function CTDE_GENERATE_PUBLIC_KEY()
RETURNS NVARCHAR(32)
AS
BEGIN
DECLARE #private_key NVARCHAR(32);
set #private_key = dbo.CTDE_GENERATE_32_BIT_KEY();
return #private_key;
END;
go
---------------CTDE_GENERATE_32_BIT_KEY -----------------
DROP FUNCTION IF EXISTS CTDE_GENERATE_32_BIT_KEY;
GO
CREATE function CTDE_GENERATE_32_BIT_KEY()
RETURNS NVARCHAR(32)
AS
BEGIN
DECLARE #public_key NVARCHAR(32);
DECLARE #alpha_num NVARCHAR(62);
DECLARE #start_index INT = 0;
DECLARE #i INT = 0;
select top 1 #alpha_num = alpha_num from vwGetNewNumber;
WHILE #i < 32
BEGIN
select top 1 #start_index = NextID from vwGetNewNumber;
set #public_key = concat (substring(#alpha_num,#start_index,1),#public_key);
set #i = #i + 1;
END;
return #public_key;
END;
select dbo.CTDE_GENERATE_PUBLIC_KEY() public_key;
Update my_table set my_field = CEILING((RAND(CAST(NEWID() AS varbinary)) * 10))
Number between 1 and 10.
Try this:
SELECT RAND(convert(varbinary, newid()))*(b-a)+a magic_number
Where a is the lower number and b is the upper number
If you need a specific number of random number you can use recursive CTE:
;WITH A AS (
SELECT 1 X, RAND() R
UNION ALL
SELECT X + 1, RAND(R*100000) --Change the seed
FROM A
WHERE X < 1000 --How many random numbers you need
)
SELECT
X
, RAND_BETWEEN_1_AND_14 = FLOOR(R * 14 + 1)
FROM A
OPTION (MAXRECURSION 0) --If you need more than 100 numbers