creating a histogram using SQL - sql-server

I am extremely new to SQL and wish to create a histogram that looks like the following: 2 columns, one with "z" values ranging from 0 to 1 in intervals of 0.01, and a second holding count(z) for each interval. Visually, it should look something like this:
z | count(z)
-------------
0-0.01| 12312
0.01 - 0.02 | 143565
0.02 - 0.03 | 23445
and so on...
I tried looping, concatenating strings, and using EXEC, but nothing seems to work :(
The closest I've got to extracting some useful data was with the following code, which produced a 2D matrix with the first column containing the data and the rest NULL:
DECLARE @i float = 0
WHILE @i < 0.1
BEGIN
exec ('select count(z) as ''' +@i +''' from specObj where z BETWEEN
' +@i +' and (' +@i +'+0.01)')
SET @i = @i + 0.01
END
Thanks

There is no need to loop. Just use some arithmetic in the group by. Here is the basic idea:
select cast(z * 100 as int)/100.0 as range_start, (1 + cast(z * 100 as int))/100.0 as range_end,
count(*)
from table t
group by cast(z * 100 as int);
Actually turning the range start and range end into a string (such as '[0.01-0.02]') requires string manipulations that, alas, depend on the particular database, which is not specified in the question.
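The grouping arithmetic can be checked outside the database. The following Python sketch is only an illustration, not part of the original answer: the sample values and the `histogram` helper are made up. It assigns each z to bucket int(z * 100), the same expression used in the GROUP BY, and counts rows per bucket.

```python
from collections import Counter

def histogram(values, buckets_per_unit=100):
    """Count values per bucket of width 1/buckets_per_unit."""
    # int() truncates, like cast(z * 100 as int) for non-negative z.
    counts = Counter(int(v * buckets_per_unit) for v in values)
    return [
        (b / buckets_per_unit, (b + 1) / buckets_per_unit, n)
        for b, n in sorted(counts.items())
    ]

sample = [0.001, 0.005, 0.012, 0.019, 0.0125, 0.25]  # hypothetical z values
for start, end, n in histogram(sample):
    print(f"{start:.2f} - {end:.2f} | {n}")
```

Because z is a float, a value that sits exactly on a bucket boundary may land in the lower bucket after the multiplication (0.29 * 100 evaluates to slightly less than 29), so boundary rows deserve a spot check.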


Issue while adding values in SQL Server

Please read again till end (description updated)
I want something like this.
ex :
if (7200 / 42) is float then
floor(7200/42) + [7200 - {(floor(7200/42)) * 42}] / 10 ^ length of [7200 - {(floor(7200/42)) * 42}]
STEP : 1 => 171 + ((7200 - (171*42))/10 ^ len(7200-7182))
STEP : 2 => 171 + ((7200 - 7182)/10 ^ len(18))
STEP : 3 => 171 + (18/10 ^ 2)
STEP : 4 => 171 + (18/100)
STEP : 5 => 171 + 0.18
STEP : 6 => 171.18
I have written the code in SQL, and it works except that the addition 171 + 0.18 only gives 171.
If I can get "171/18" instead of "171.18" as a string, that would also be great. (/ is used only as a separator, not as a division sign.)
Following is the code I have written.
Here,
(FAP.FQTY + FAP.QTY) = 7200,
PRD.CRT = 42
(values only for example)
select
case when PRD.CRT <> 0 then
case when (FAP.FQTY + FAP.QTY)/PRD.CRT <> FLOOR((FAP.FQTY + FAP.QTY)/PRD.CRT) then --DETERMINE WHETHER VALUE IS FLOAT OR NOT
(floor((FAP.FQTY + FAP.QTY)/PRD.CRT)) +
((FAP.FQTY + FAP.QTY) - floor((FAP.FQTY + FAP.QTY)/PRD.CRT) * PRD.CRT) /
POWER(10, len(floor((FAP.FQTY + FAP.QTY) - floor((FAP.FQTY + FAP.QTY)/PRD.CRT) * PRD.CRT)))
else
(FAP.FQTY + FAP.QTY)/PRD.CRT -- INTEGER
end
else
0
end
from FAP inner join PRD on FAP.Comp_Year = PRD.Comp_Year and
FAP.Comp_No = PRD.Comp_No and FAP.Prd_Code = PRD.Prd_Code
All the values are correct up to 171 + 0.1800, but the addition then returns only 171. I want exactly 171.18.
REASON FOR THIS CONFUSING CALCULATION
It's all about accounting.
Suppose a box (or a carton) holds 42 items.
A person sends 7200 items. How many boxes does he have to send?
That will be (7200/42) = 171.4286.
But boxes cannot be cut (the count is a whole number, i.e. 171).
So 171 * 42, i.e. 7182 items.
Remaining items = 7200 - 7182 = 18.
So the answer is 171 boxes and 18 items.
In short, 171.18 or "171/18".
Please help me with this.
Thank you in advance.
Recognise that you're not producing an actual numeric result; I'd describe it as unhealthy to try to keep it in a numeric datatype [1].
This produces the strings you're seeking, if I've understood your requirement:
;With StartingPoint as (
select 7200 as Dividend, 42 as Divisor
)
select
CONVERT(varchar(10),Quotient) +
CASE WHEN Remainder > 0 THEN '.' + CONVERT(varchar(10),Remainder)
ELSE '' END as FinalString
from
StartingPoint
cross apply
(select Dividend/Divisor as Quotient, Dividend % Divisor as Remainder) t
(Not tested for negative values. Some adjustments may be required. Technically % computes the modulus rather than the remainder, etc)
[1] Because someone might try to add two of these values together, and I doubt that produces a correct result, not even necessarily when the same Divisor was used to compute both.
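As a quick check of the quotient-and-remainder idea, here is a small Python sketch. The helper name `boxes_and_items` and its `sep` parameter are hypothetical; it uses the "171/18" form and returns '0' for a zero divisor, which is an assumption about the desired behaviour.

```python
def boxes_and_items(total_items, per_box, sep="/"):
    """Return 'boxes<sep>leftover' as a string; '0' when per_box is 0."""
    if per_box == 0:
        return "0"
    # divmod gives the whole boxes and the leftover items in one step.
    boxes, leftover = divmod(total_items, per_box)
    return f"{boxes}{sep}{leftover}"

print(boxes_and_items(7200, 42))  # 171/18
```

Keeping the result as a string avoids treating "171.18" as a number, where 171.18 and 171.2 would be indistinguishable from 18 and 20 leftover items.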
Just another idea about how to calculate it.
Simply calculate the whole boxes.
Then concatenate a separator with the remaining items (using a modulus).
Wrap it all up in a CASE WHEN (or IIF) to avoid the divide by zero.
Example snippet:
declare @TestTable table (FQTY numeric(18,2), QTY numeric(18,2), CRT numeric(18,0));
insert into @TestTable (FQTY,QTY,CRT) values
(5000, 2200, 42),
(5000, 2200, 0),
( 100, 200, 10);
select *,
(CASE
WHEN CRT>0
THEN CONCAT(CAST(FLOOR((FQTY+QTY)/CRT) as INT),'/',CAST((FQTY+QTY)%CRT as INT))
ELSE '0'
END) AS Boxes
from @TestTable;
Result:
FQTY QTY CRT Boxes
------- ------- --- ------
5000.00 2200.00 42 171/18
5000.00 2200.00 0 0
100.00 200.00 10 30/0
The CONCAT returns a varchar, and so does the CASE WHEN.
But you could wrap that CASE WHEN in a CAST.
You're getting an automatic type conversion from int to decimal(10,0) which is probably not what you want.
https://learn.microsoft.com/en-us/sql/t-sql/data-types/int-bigint-smallint-and-tinyint-transact-sql?view=sql-server-2017
Check out the "Caution" box.
If you want a specific amount of precision, you'll need to explicitly cast() the values to the desired data type.
If I understand your logic correctly, you want the whole-number quotient of 7200 / 42, plus the remainder scaled down by 10^len(remainder):
declare
@dividend int = 7200,
@divisor int = 42
select (@dividend / @divisor)
+ convert(decimal(10,4),
(@dividend % @divisor) * 1.0 / power(10, len(@dividend % @divisor)))
EDIT: change to handle the 10^len(remainder)

Expression Evaluation in SQL Stored Procedure

I have formula table with some of the formula. Is it possible to write SQL stored procedure which could evaluate every formula and give the result ? Every formula is an expression which could have all different arithmetic operations. Every value used in formula is an id value referring to some other table
FormulaId | Formula
--------------+-------------
1 | `1 * 3`
2 | `(2 + 3) * (4 + 1)`
3 | `((2 + 3) * (4 + 1)) / 5`
4 | `(4 + 1) - (3 + 1) - (2 + 1)`
Id | Value
--------------+-------------
1 | 5
2 | 10
3 | 15
4 | 20
5 | 25
Result should be something like
FormulaId | EvaluatedValue
--------------+----------------
1 | 75
2 | 625
3 | 25
4 | -10
I'd strongly recommend doing this in your application rather than SQL. I have a solution, but it's ridiculously involved. I would create a proc which returns the formula, plus the values that need to be replaced. Then use .Replace() or a regex library to replace the tags with the values (Note, you're going to run into trouble if your tags are integers; consider wrapping them in something like some kind of bracket). You can then use one of several techniques to evaluate the expression at run time (see Evaluating string "3*(4+2)" yield int 18 for some ideas).
Note, I listed stuff primarily in C#, since that's what I know; but if you're using a different language, there should be something equivalent.
Here is how you can evaluate an expression stored in a table:
declare @SQL nvarchar(max)
Declare @Result As Int
Select @SQL = Formula
From dbo.Formulae
Where ID = 3
Set @SQL = 'select @x = ' + @SQL
exec sp_executesql @SQL, N'@x int out', @Result out
select @Result
Assuming that Xendi and Prasad are correct and the numbers in the Formula entry are keys used to look up values from the Values table, then you'll have to define the delimiters used in your formula strings and then write a routine to parse the formulae and do the substitution. Then you'd feed the resulting string into the logic I wrote above.
All that being said, TSQL is not the right tool for this job.
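For the application-side route, a minimal Python sketch might look like the following. It assumes every number in a formula is an id to be looked up (as in the question's sample data), holds the two tables in hypothetical in-memory dictionaries, and walks the parsed expression instead of calling eval(). The helper names are made up for illustration.

```python
import ast
import operator
import re

# Hypothetical in-memory copies of the two tables from the question.
VALUES = {1: 5, 2: 10, 3: 15, 4: 20, 5: 25}
FORMULAS = {
    1: "1 * 3",
    2: "(2 + 3) * (4 + 1)",
    3: "((2 + 3) * (4 + 1)) / 5",
    4: "(4 + 1) - (3 + 1) - (2 + 1)",
}

OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def _eval(node):
    """Evaluate a parsed arithmetic expression without calling eval()."""
    if isinstance(node, ast.Expression):
        return _eval(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")

def evaluate(formula, values):
    # Replace every id token with its looked-up value in a single pass,
    # so substituted numbers are never mistaken for ids.
    substituted = re.sub(r"\d+", lambda m: str(values[int(m.group())]), formula)
    return _eval(ast.parse(substituted, mode="eval"))

for fid, formula in FORMULAS.items():
    print(fid, evaluate(formula, VALUES))
```

With the sample data this reproduces the expected values 75, 625, 25 and -10 (the division yields a float). If formulas ever need literal constants as well as ids, the ids would have to be wrapped in some delimiter, as suggested above.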

How to get the count of digits after the decimal point in a float column in ms sql?

I have to count the digits after the decimal point in a database hosted by a MS Sql Server (2005 or 2008 does not matter), in order to correct some errors made by users.
I have the same problem on an Oracle database, but there things are less complicated.
Bottom line is on Oracle the select is:
select length( substr(to_char(MY_FIELD), instr(to_char(MY_FIELD),'.',1,1)+1, length(to_char(MY_FIELD)))) as digits_length
from MY_TABLE
where the field MY_FIELD is float(38).
On MS SQL Server I tried:
select LEN(SUBSTRING(CAST(MY_FIELD AS VARCHAR), CHARINDEX('.',CAST(MY_FIELD AS VARCHAR),1)+1, LEN(CAST(MY_FIELD AS VARCHAR)))) as digits_length
from MY_TABLE
The problem is that on MS SQL Server, when I cast MY_FIELD as varchar, the float number is truncated to only 2 decimals and the digit count is wrong.
Can someone give me any hints?
Best regards.
SELECT
LEN(CAST(REVERSE(SUBSTRING(STR(MY_FIELD, 13, 11), CHARINDEX('.', STR(MY_FIELD, 13, 11)) + 1, 20)) AS decimal))
from TABLE
I have received from my friend a very simple solution which is just great. So I will post the workaround in order to help others in the same position as me.
First, make function:
create FUNCTION dbo.countDigits(@A float) RETURNS tinyint AS
BEGIN
declare @R tinyint
IF @A IS NULL
RETURN NULL
set @R = 0
while @A - str(@A, 18 + @R, @R) <> 0
begin
SET @R = @R + 1
end
RETURN @R
END
GO
Second:
select MY_FIELD,
dbo.countDigits(MY_FIELD)
from MY_TABLE
Using the function will get you the exact number of digits after the decimal point.
The first thing is to switch to using CONVERT rather than CAST. The difference is, with CONVERT, you can specify a format code. CAST uses whatever the default format code is:
When expression is float or real, style can be one of the values shown in the following table. Other values are processed as 0.
None of the formats are particularly appealing, but I think the best for you to use would be 2. So it would be:
CONVERT(varchar(25),MY_FIELD,2)
This will, unfortunately, give you the value in scientific notation and always with 16 digits e.g. 1.234567890123456e+000. To get the number of "real" digits, you need to split this number apart, work out the number of digits in the decimal portion, and offset it by the number provided in the exponent.
And, of course, insert usual caveats/warnings about trying to talk about digits when dealing with a number which has a defined binary representation. The number of "digits" of a particular float may vary depending on how it was calculated.
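The "split the mantissa and offset by the exponent" step can be sketched in Python. This is an illustration only: Python's `.15e` format (16 significant digits) stands in for the scientific-notation output described above, and `decimal_places` is a hypothetical helper name.

```python
def decimal_places(x):
    """Count digits after the decimal point of x, based on a
    16-significant-digit scientific rendering such as
    '1.234500000000000e+02' for 123.45."""
    mantissa, exponent = f"{x:.15e}".split("e")
    # Significant digits after the mantissa's point, trailing zeros dropped.
    fraction_digits = mantissa.split(".")[1].rstrip("0")
    # Shifting the point right by the exponent consumes that many digits.
    return max(0, len(fraction_digits) - int(exponent))

for value in (123.45, 0.001, 100.0, 1.5):
    print(value, decimal_places(value))
```

The usual caveat still applies: the answer reflects the value rounded to 16 significant digits, not the exact binary value stored in the float.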
I'm not sure about the speed or elegance of this code; it was written for some ad-hoc testing to find the position of the first non-zero decimal digit. It could easily be changed to loop through all the decimals and find the last position where a digit is greater than zero.
DECLARE @NoOfDecimals int = 0
Declare @ROUNDINGPRECISION numeric(32,16) = -.00001000
select @ROUNDINGPRECISION = ABS(@ROUNDINGPRECISION)
select @ROUNDINGPRECISION = @ROUNDINGPRECISION - floor(@ROUNDINGPRECISION)
while @ROUNDINGPRECISION < 1
Begin
select @NoOfDecimals = @NoOfDecimals +1
select @ROUNDINGPRECISION = @ROUNDINGPRECISION * 10
end;
select @NoOfDecimals

ASCII increment with defined range

Client wants to append a field with a literal increment based on a count.
The range goes from 'aa' to 'zz'.
'aa' represents a count of 1 and 'zz' represents the max value in the range: 676
I have sql that almost works but would appreciate an expert eye to get me over the last hurdle.
--Constants
DECLARE @START_ASCII INT = 97
DECLARE @ASCII_OFFSET INT = 1
DECLARE @ALPHABET_LETTER_COUNT INT = 26
--Variables
DECLARE @RecordCount INT = 0
DECLARE @FirstLetter VARCHAR(1) = NULL
DECLARE @SecondLetter VARCHAR(1) = NULL
SET @RecordCount = 1 --Range is 1 to 676 (e.g. 'aa' to 'zz')
SET @FirstLetter = CHAR(round(@RecordCount / @ALPHABET_LETTER_COUNT, 2, 1) + @START_ASCII)
SET @SecondLetter = CHAR((((@RecordCount - @ASCII_OFFSET) % @ALPHABET_LETTER_COUNT) + @START_ASCII))
SELECT @FirstLetter + @SecondLetter
The problem with the above sql involves the first letter. It works till the end of the alphabet is reached for the second letter. For example, at a count of 26, I expect 'az', but instead get 'bz'.
I want to keep the SQL small and tight (e.g. no CASE statements). Is there a small tweak I can make to the above code so that it will work?
Or, if there is just a smarter way to skin this cat, I'd like to know that.
I would think of this as computing the base-26 representation of @RecordCount-1 (range 0 to 675). Then map the two digits of the base-26 number to the ASCII characters:
SET @FirstLetter = CHAR(floor((@RecordCount-1) / @ALPHABET_LETTER_COUNT) + @START_ASCII)
SET @SecondLetter = CHAR(((@RecordCount-1) % @ALPHABET_LETTER_COUNT) + @START_ASCII)
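The base-26 mapping is easy to verify at the boundaries with a short Python sketch. The function name `two_letter_code` and the range check are additions for illustration, not part of the answer.

```python
def two_letter_code(count, start="a", letters=26):
    """Map 1..letters*letters onto 'aa'..'zz' via base-26 digits of count - 1."""
    if not 1 <= count <= letters * letters:
        raise ValueError("count out of range")
    first, second = divmod(count - 1, letters)
    return chr(ord(start) + first) + chr(ord(start) + second)

for n in (1, 26, 27, 676):
    print(n, two_letter_code(n))
```

Subtracting 1 before both the division and the modulus is what keeps count 26 on 'az' instead of rolling the first letter over to 'b'.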

How do I generate a random number for each row in a T-SQL select?

I need a different random number for each row in my table. The following seemingly obvious code uses the same random value for each row.
SELECT table_name, RAND() magic_number
FROM information_schema.tables
I'd like to get an INT or a FLOAT out of this. The rest of the story is I'm going to use this random number to create a random date offset from a known date, e.g. 1-14 days offset from a start date.
This is for Microsoft SQL Server 2000.
Take a look at SQL Server - Set based random numbers which has a very detailed explanation.
To summarize, the following code generates a random number between 0 and 13 inclusive with a uniform distribution:
ABS(CHECKSUM(NewId())) % 14
To change your range, just change the number at the end of the expression. Be extra careful if you need a range that includes both positive and negative numbers. If you do it wrong, it's possible to double-count the number 0.
A small warning for the math nuts in the room: there is a very slight bias in this code. CHECKSUM() results in numbers that are uniform across the entire range of the sql Int datatype, or at least as near so as my (the editor) testing can show. However, there will be some bias when CHECKSUM() produces a number at the very top end of that range. Any time you get a number between the maximum possible integer and the last exact multiple of the size of your desired range (14 in this case) before that maximum integer, those results are favored over the remaining portion of your range that cannot be produced from that last multiple of 14.
As an example, imagine the entire range of the Int type is only 19. 19 is the largest possible integer you can hold. When CHECKSUM() results in 14-19, these correspond to results 0-5. Those numbers would be heavily favored over 6-13, because CHECKSUM() is twice as likely to generate them. It's easier to demonstrate this visually. Below is the entire possible set of results for our imaginary integer range:
Checksum Integer: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Range Result: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 0 1 2 3 4 5
You can see here that there are more chances to produce some numbers than others: bias. Thankfully, the actual range of the Int type is much larger... so much so that in most cases the bias is nearly undetectable. However, it is something to be aware of if you ever find yourself doing this for serious security code.
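The imaginary 0-19 range from the example can be tallied directly. This Python sketch only reproduces the table above; the constants `MAX_INT` and `SPAN` are names chosen for the illustration.

```python
from collections import Counter

MAX_INT = 19   # the imaginary integer range from the example above
SPAN = 14      # desired results are 0..13

# How many source integers map onto each result under "% SPAN".
counts = Counter(x % SPAN for x in range(MAX_INT + 1))
for result in range(SPAN):
    print(result, counts[result])
```

Results 0 through 5 each have two source integers, while 6 through 13 have one, which is the bias described above.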
When called multiple times in a single batch, rand() returns the same number.
I'd suggest using convert(varbinary,newid()) as the seed argument:
SELECT table_name, 1.0 + floor(14 * RAND(convert(varbinary, newid()))) magic_number
FROM information_schema.tables
newid() is guaranteed to return a different value each time it's called, even within the same batch, so using it as a seed will prompt rand() to give a different value each time.
Edited to get a random whole number from 1 to 14.
RAND(CHECKSUM(NEWID()))
The above will generate a (pseudo-) random number between 0 and 1, exclusive. If used in a select, because the seed value changes for each row, it will generate a new random number for each row (it is not guaranteed to generate a unique number per row however).
Example when combined with an upper limit of 10 (produces numbers 1 - 10):
CAST(RAND(CHECKSUM(NEWID())) * 10 as INT) + 1
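The same scaling works for any inclusive range. The Python sketch below is an illustration under the assumption that the random source returns a value in [0, 1); `random_int_between` is a hypothetical helper, and the optional `r` argument exists only so the endpoints can be checked deterministically.

```python
import math
import random

def random_int_between(low, high, r=None):
    """Map r in [0, 1) onto the integers low..high inclusive."""
    if r is None:
        r = random.random()
    # The span is (high - low + 1) so that `high` itself can be produced.
    return math.floor(r * (high - low + 1) + low)

print(random_int_between(1, 10))
```

Leaving out the "+ 1" in the span is the usual mistake: the upper bound then never appears in the output.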
Transact-SQL Documentation:
CAST(): https://learn.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql
RAND(): http://msdn.microsoft.com/en-us/library/ms177610.aspx
CHECKSUM(): http://msdn.microsoft.com/en-us/library/ms189788.aspx
NEWID(): https://learn.microsoft.com/en-us/sql/t-sql/functions/newid-transact-sql
Random number generation between 1000 and 9999 inclusive:
FLOOR(RAND(CHECKSUM(NEWID()))*(9999-1000+1)+1000)
"+1" - to include upper bound values(9999 for previous example)
This is an old question, but this answer has not been provided previously, and hopefully it will be useful for someone finding these results through a search engine.
With SQL Server 2008, a new function has been introduced, CRYPT_GEN_RANDOM(8), which uses CryptoAPI to produce a cryptographically strong random number, returned as VARBINARY(8000). Here's the documentation page: https://learn.microsoft.com/en-us/sql/t-sql/functions/crypt-gen-random-transact-sql
So to get a random number, you can simply call the function and cast it to the necessary type:
select CAST(CRYPT_GEN_RANDOM(8) AS bigint)
or to get a float between -1 and +1, you could do something like this:
select CAST(CRYPT_GEN_RANDOM(8) AS bigint) % 1000000000 / 1000000000.0
The Rand() function will generate the same random number, if used in a table SELECT query. Same applies if you use a seed to the Rand function. An alternative way to do it, is using this:
SELECT ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) AS [RandomNumber]
Got the information from here, which explains the problem very well.
Do you have an integer value in each row that you could pass as a seed to the RAND function?
To get an integer between 1 and 14 I believe this would work:
FLOOR( RAND(<yourseed>) * 14) + 1
If you need to preserve your seed so that it generates the "same" random data every time, you can do the following:
1. Create a view that returns select rand()
if object_id('cr_sample_randView') is not null
begin
drop view cr_sample_randView
end
go
create view cr_sample_randView
as
select rand() as random_number
go
2. Create a UDF that selects the value from the view.
if object_id('cr_sample_fnPerRowRand') is not null
begin
drop function cr_sample_fnPerRowRand
end
go
create function cr_sample_fnPerRowRand()
returns float
as
begin
declare @returnValue float
select @returnValue = random_number from cr_sample_randView
return @returnValue
end
go
3. Before selecting your data, seed the rand() function, and then use the UDF in your select statement.
select rand(200); -- seed the rand() function
with cte(id) as
(select row_number() over(order by object_id) from sys.all_objects)
select
id,
dbo.cr_sample_fnPerRowRand()
from cte
where id <= 1000 -- limit the results to 1000 random numbers
select round(rand(checksum(newid()))*(10)+20,2)
Here the random number will come in between 20 and 30.
round will give two decimal place maximum.
If you want negative numbers you can do it with
select round(rand(checksum(newid()))*(10)-60,2)
Then the min value will be -60 and max will be -50.
Try passing a seed value to RAND(seedInt). RAND() without a seed executes only once per statement, which is why you see the same number each time.
If you don't need it to be an integer, but any random unique identifier, you can use newid()
SELECT table_name, newid() magic_number
FROM information_schema.tables
You would need to call RAND() for each row. Here is a good example
https://web.archive.org/web/20090216200320/http://dotnet.org.za/calmyourself/archive/2007/04/13/sql-rand-trap-same-value-per-row.aspx
The problem I sometimes have with the selected "Answer" is that the distribution isn't always even. If you need a very even distribution of random 1 - 14 among lots of rows, you can do something like this (my database has 511 tables, so this works. If you have fewer rows than your random number span, this does not work well):
SELECT table_name, ntile(14) over(order by newId()) randomNumber
FROM information_schema.tables
This kind of does the opposite of normal random solutions in the sense that it keeps the numbers sequenced and randomizes the other column.
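To see why this gives such an even spread, here is a Python sketch of the same idea: shuffle the rows, then deal them into 14 tiles whose sizes differ by at most one, the way NTILE does. The table names and the helper `ntile_random` are hypothetical.

```python
import random
from collections import Counter

def ntile_random(rows, tiles, rng=random):
    """Assign tile numbers 1..tiles to rows in random order, NTILE-style:
    tile sizes differ by at most one row, larger tiles first."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    base, extra = divmod(len(shuffled), tiles)
    assignment = {}
    i = 0
    for tile in range(1, tiles + 1):
        size = base + (1 if tile <= extra else 0)
        for row in shuffled[i:i + size]:
            assignment[row] = tile
        i += size
    return assignment

tables = [f"table_{n}" for n in range(511)]  # hypothetical table names
counts = Counter(ntile_random(tables, 14).values())
print(sorted(counts.items()))
```

With 511 rows and 14 tiles, every tile ends up with 36 or 37 rows; only the assignment of rows to tiles is random.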
Remember, I have 511 tables in my database (which is pertinent only b/c we're selecting from the information_schema). If I take the previous query and put it into a temp table #X, and then run this query on the resulting data:
select randomNumber, count(*) ct from #X
group by randomNumber
I get this result, showing me that my random number is VERY evenly distributed among the many rows:
It's as easy as:
DECLARE @rv FLOAT;
SELECT @rv = rand();
And this will put a random number between 0-99 into a table:
CREATE TABLE R
(
Number int
)
DECLARE @rv FLOAT;
SELECT @rv = rand();
INSERT INTO dbo.R
(Number)
values((@rv * 100));
SELECT * FROM R
select ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) as [Randomizer]
has always worked for me
Use newid()
select newid()
or possibly this
select binary_checksum(newid())
If you want to generate a random number between 1 and 14 inclusive:
SELECT CONVERT(int, RAND() * (14 - 1 + 1) + 1)
OR
SELECT ABS(CHECKSUM(NewId())) % (14 - 1 + 1) + 1
DROP VIEW IF EXISTS vwGetNewNumber;
GO
Create View vwGetNewNumber
as
Select CAST(RAND(CHECKSUM(NEWID())) * 62 as INT) + 1 as NextID,
'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'as alpha_num;
---------------CTDE_GENERATE_PUBLIC_KEY -----------------
DROP FUNCTION IF EXISTS CTDE_GENERATE_PUBLIC_KEY;
GO
create function CTDE_GENERATE_PUBLIC_KEY()
RETURNS NVARCHAR(32)
AS
BEGIN
DECLARE @private_key NVARCHAR(32);
set @private_key = dbo.CTDE_GENERATE_32_BIT_KEY();
return @private_key;
END;
go
---------------CTDE_GENERATE_32_BIT_KEY -----------------
DROP FUNCTION IF EXISTS CTDE_GENERATE_32_BIT_KEY;
GO
CREATE function CTDE_GENERATE_32_BIT_KEY()
RETURNS NVARCHAR(32)
AS
BEGIN
DECLARE @public_key NVARCHAR(32);
DECLARE @alpha_num NVARCHAR(62);
DECLARE @start_index INT = 0;
DECLARE @i INT = 0;
select top 1 @alpha_num = alpha_num from vwGetNewNumber;
WHILE @i < 32
BEGIN
select top 1 @start_index = NextID from vwGetNewNumber;
set @public_key = concat (substring(@alpha_num,@start_index,1),@public_key);
set @i = @i + 1;
END;
return @public_key;
END;
select dbo.CTDE_GENERATE_PUBLIC_KEY() public_key;
Update my_table set my_field = CEILING((RAND(CAST(NEWID() AS varbinary)) * 10))
Number between 1 and 10.
Try this:
SELECT RAND(convert(varbinary, newid()))*(b-a)+a magic_number
Where a is the lower number and b is the upper number
If you need a specific number of random numbers, you can use a recursive CTE:
;WITH A AS (
SELECT 1 X, RAND() R
UNION ALL
SELECT X + 1, RAND(R*100000) --Change the seed
FROM A
WHERE X < 1000 --How many random numbers you need
)
SELECT
X
, RAND_BETWEEN_1_AND_14 = FLOOR(R * 14 + 1)
FROM A
OPTION (MAXRECURSION 0) --If you need more than 100 numbers
