Calculate percent in spotfire cross table - pivot-table

I have a cross table that lists, for each person, a set of tasks and whether each task has been completed. I want to add another column to the cross table that calculates the percentage of tasks completed (Yes / Grand total). Is this possible?
Here's an image of the cross table:

Insert the expression below on your value axis, where [ValueColumn] is whatever column supplies the values for YES and NO. If the values are just a count of rows, or if you are counting a non-integer column, use Count() or UniqueCount() in place of Sum([ValueColumn]).
Sum([ValueColumn]) THEN [Value] / Sum([Value]) OVER (All([Axis.Rows]))
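The arithmetic the question asks for (Yes divided by the grand total, per person) can be checked outside Spotfire with a small sketch. This is Python, not a Spotfire expression, and the names `percent_completed` and `tasks` as well as the sample data are assumptions made up for illustration.

```python
from collections import Counter

def percent_completed(rows):
    """rows: iterable of (person, completed) pairs, completed being 'Yes' or 'No'."""
    totals = Counter()
    yes = Counter()
    for person, completed in rows:
        totals[person] += 1          # grand total of tasks per person
        if completed == "Yes":
            yes[person] += 1         # completed tasks per person
    return {person: 100.0 * yes[person] / totals[person] for person in totals}

# Hypothetical sample data, not taken from the question.
tasks = [
    ("Ann", "Yes"), ("Ann", "Yes"), ("Ann", "No"), ("Ann", "Yes"),
    ("Bob", "No"), ("Bob", "Yes"),
]
print(percent_completed(tasks))
```

Comparing these per-person percentages with the cross table output is a quick way to confirm the expression divides by the total you intended.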


how to select first rows distinct by a column name in a sub-query in sql-server?

I am building a Skype-like tool in which I have to show the last 10 distinct users who have logged in to my web application.
I maintain a table in SQL Server with a field called last_active_time. My requirement is to sort the table by last_active_time and show all the columns for the last 10 distinct users.
There is another field called WWID which uniquely identifies a user.
I am able to find the distinct WWIDs, but I am not able to select all the columns of those rows.
I am using the query below to find the distinct WWIDs:
select distinct(wwid) from(select top 100 * from dbo.rvpvisitors where last_active_time!='' order by last_active_time DESC) as newView;
But how do I find those distinct rows? I want to show how long each user has been away from the web app, using the difference between the current time and the last active time.
I am new to SQL, so the question may be naive, but I am struggling to get it right.
If you are using proper data types for your columns, you won't need a subquery to get that result. The following query should do the trick:
SELECT TOP 10
    [wwid]
    ,MAX([last_active_time]) AS [last_active_time]
FROM [dbo].[rvpvisitors]
WHERE
    [last_active_time] != ''
GROUP BY
    [wwid]
ORDER BY
    [last_active_time] DESC
If the column [last_active_time] is of type varchar/nvarchar (which is probably the case, since you check for empty strings in the WHERE clause), you might need to use CAST or CONVERT to treat it as an actual date and to be able to use functions like MIN/MAX on it.
In general, I would suggest using proper data types for your columns: if you store dates or timestamps, use the "date" or "datetime2" data types.
Edit:
The query aggregates the data based on the column [wwid], and for each returns the maximum [last_active_time].
The result is then sorted and filtered.
In order to add more columns "as-is" (without aggregating them) just add them in the SELECT and GROUP BY sections.
If you need more aggregated columns add them in the SELECT with the appropriate aggregation function (MIN/MAX/SUM/etc)
I suggest you have a look at GROUP BY on W3
To know more about the "execution order" of the instruction you can have a look here
You can solve problems like this by ranking the rows within each key and keeping only the top-ranked row per key. This removes duplicates while preserving the order within each key.
;WITH RankOrdered AS
(
    SELECT
        *,
        wwidRank = ROW_NUMBER() OVER (PARTITION BY wwid ORDER BY last_active_time DESC)
    FROM
        dbo.rvpvisitors
    WHERE
        last_active_time != ''
)
SELECT TOP(10) *
FROM RankOrdered
WHERE wwidRank = 1
ORDER BY last_active_time DESC
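To make the rank-and-filter idea concrete, here is a minimal Python sketch of the same logic: keep each user's most recent row, then take the newest 10 of those. It is a simulation, not T-SQL. The function name, the `name` column and the sample rows are hypothetical, and it assumes the timestamps are strings in a format that sorts chronologically.

```python
def latest_per_user(rows, limit=10):
    """Keep each user's most recent row, then return the newest `limit` of them."""
    latest = {}
    for row in rows:
        if not row["last_active_time"]:        # mirrors last_active_time != ''
            continue
        best = latest.get(row["wwid"])
        if best is None or row["last_active_time"] > best["last_active_time"]:
            latest[row["wwid"]] = row          # this row has rank 1 for its wwid so far
    newest_first = sorted(latest.values(),
                          key=lambda r: r["last_active_time"], reverse=True)
    return newest_first[:limit]

# Hypothetical rows; only wwid and last_active_time come from the question.
visitors = [
    {"wwid": "u1", "name": "Ann", "last_active_time": "2024-01-01 10:00"},
    {"wwid": "u2", "name": "Bob", "last_active_time": "2024-01-01 09:00"},
    {"wwid": "u1", "name": "Ann", "last_active_time": "2024-01-01 11:30"},
    {"wwid": "u3", "name": "Cy", "last_active_time": ""},
]
result = latest_per_user(visitors)
```

Each returned row still carries all of its columns, which is what the plain DISTINCT query could not provide.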
If my understanding is right, the query below will give the desired output.
You can add conditions according to your needs.
select top 10 wwid
from dbo.rvpvisitors
group by wwid
order by max(last_active_time) desc

SQL Server 2014 Random Value in Group By

I'm trying to figure out how to get a single random row returned per account from a table. The table has multiple rows per account or in some cases just a single row. I want to be able to get a random result back in my select so each day that I run the same statement I might get a different result.
This is basis of the query:
select number, phonenumber
from phones_master with(nolock)
where phonetypeid = '3'
This is a sample result set
number phonenumber
--------------------------
4130772, 6789100949
4130772, 6789257988
4130774, 6784519098
4130775, 6786006874
The column called Number is the account. I'd like to return a single random row. So based on the sample result set above the query should return 3 rows.
Any suggestions would be greatly appreciated. I'm beating my head against the wall with this one.
Thanks
You can use WITH TIES in concert with Row_Number()
Select Top 1 with ties *
From YourTable
Order by Row_Number() over (Partition By Number Order By NewID())
Returns (for example)
number phonenumber
4130772 6789257988
4130774 6784519098
4130775 6786006874
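The effect of that query is that every account gets exactly one row with row number 1, in random order, and TOP 1 WITH TIES keeps all rows tied at 1. A rough Python equivalent of the outcome (not of the SQL mechanics) is sketched below; the function name is an assumption, and the rows are the sample set from the question.

```python
import random
from collections import defaultdict

def one_random_phone_per_account(rows, rng=random):
    """Group phone numbers by account, then pick one at random per account."""
    by_account = defaultdict(list)
    for number, phonenumber in rows:
        by_account[number].append(phonenumber)
    return {number: rng.choice(phones) for number, phones in by_account.items()}

# Sample result set from the question.
phones = [
    (4130772, "6789100949"),
    (4130772, "6789257988"),
    (4130774, "6784519098"),
    (4130775, "6786006874"),
]
picked = one_random_phone_per_account(phones)
```

Accounts with a single phone number always return that number; only account 4130772 can vary between runs.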
If you have another table called account where those numbers are generated/created, then here is one way using CROSS APPLY.
SELECT at.number,
       cs.phonenumber
FROM account_table at
CROSS APPLY (SELECT TOP 1 phonenumber
             FROM phones_master pm
             WHERE at.number = pm.number
               AND phonetypeid = '3'
             ORDER BY NEWID()) cs (phonenumber)
This assumes that number is unique in the account table.
Creating an index on number and phonetypeid in the phones_master table should improve performance.

How can this expression reach the NULL expression?

I'm trying to randomly populate a column with values from another table using this statement:
UPDATE dbo.SERVICE_TICKET
SET Vehicle_Type = (SELECT TOP 1 [text]
FROM dbo.vehicle_typ
WHERE id = abs(checksum(NewID()))%21)
It seems to work fine, however the value NULL is inserted into the column. How can I get rid of the NULL and only insert the values from the table?
This can happen when you don't have an appropriate index on the ID column of your vehicle_typ table. Here's a smaller query that exhibits the same problem:
create table T (ID int null)
insert into T(ID) values (0),(1),(2),(3)
select top 1 * from T where ID = abs(checksum(NewID()))%3
Because there's no index on T, what happens is that SQL Server performs a table scan and then, for each row, attempts to satisfy the where clause. Which means that, for each row it evaluates abs(checksum(NewID()))%3 anew. You'll only get a result if, by chance, that expression produces, say, 1 when it's evaluated for the row with ID 1.
If possible (I don't know your table structure), I would first populate a column in SERVICE_TICKET with a random number between 0 and 20 and then perform this update using the already generated number. Otherwise, with the current query structure, you're always relying on SQL Server being clever enough to evaluate abs(checksum(NewID()))%21 only once for each outer row, which it may not always do (as you've already found out).
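The per-row re-evaluation is easy to reproduce outside SQL Server. The Python sketch below is a simulation, not SQL Server's actual execution: it scans the ids and draws a fresh random number for every row, returning None when no row happens to match its own draw. It assumes the ids run from 0 to 20, which the question does not confirm.

```python
import random

def lookup_with_per_row_random(ids, rng):
    """Scan ids, drawing a fresh random value per row; first match or None."""
    n = len(ids)
    for row_id in ids:
        if row_id == rng.randrange(n):   # re-evaluated for each row, like the WHERE clause
            return row_id
    return None

rng = random.Random(0)
ids = list(range(21))                    # assumption: ids 0..20, matching the % 21
trials = 10_000
misses = sum(lookup_with_per_row_random(ids, rng) is None for _ in range(trials))
miss_rate = misses / trials
expected = (1 - 1 / 21) ** 21            # chance that no row matches its own draw
```

Under these assumptions roughly a third of the lookups come back empty, which is where the NULLs come from.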
@Damien_The_Unbeliever explained why your query fails.
My first variant was not correct, because I didn't fully understand the problem.
You want to set each row in SERVICE_TICKET to a different random value from vehicle_typ.
To fix it, simply order by a random number rather than comparing a random number with ID, like this (you don't need to care how many rows are in vehicle_typ, as long as there is at least one):
WITH CTE AS
(
    SELECT
        dbo.SERVICE_TICKET.Vehicle_Type,
        CA.[text]
    FROM
        dbo.SERVICE_TICKET
        CROSS APPLY
        (
            SELECT TOP 1 [text]
            FROM dbo.vehicle_typ
            ORDER BY NewID()
        ) AS CA
)
UPDATE CTE
SET Vehicle_Type = [text];
First we define a common table expression (CTE); you can think of it as a temporary table. For each row in SERVICE_TICKET we pick one random row from vehicle_typ using CROSS APPLY. Then we UPDATE the original table with the chosen rows.
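The intended result, one independently chosen value per ticket and never an empty one, can be sketched in Python as follows. This only illustrates the outcome the answer is aiming for. The function name, ticket ids and vehicle type values are hypothetical.

```python
import random

def assign_random_types(tickets, vehicle_types, rng=random):
    """Pick one vehicle type at random for every ticket."""
    if not vehicle_types:
        raise ValueError("vehicle_types must contain at least one row")
    # Choosing from the list itself cannot miss, unlike matching a random id.
    return {ticket: rng.choice(vehicle_types) for ticket in tickets}

vehicle_types = ["Sedan", "Truck", "Van"]   # hypothetical values
tickets = [101, 102, 103, 104]              # hypothetical ticket ids
assigned = assign_random_types(tickets, vehicle_types)
```

Because the choice is made from the rows that exist, the number of rows and any gaps in their ids no longer matter.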

MS SQL Server Algebraic Syntax

I have a table logging a floating point value from a scale (a weight). I'd like to evaluate the absolute value of the integral of this curve dynamically. I'm attempting to perform some simple algebra based on the trapezoidal approx. with a sampling rate (b-a=1) of one:
(b-a)((f(a)+f(b))/2 - f(a))
The values f(a) and f(b) represent the 2 most recent values logged in my SQL Server table. I've attempted the following, which fails with an evaluation error:
SELECT TOP 2
SUM(Scale_Weight) OVER(ORDER BY t_stamp DESC)/2.0
FROM table
This query evaluates, but simply divides the most recent value by 2:
SELECT
SUM(Scale_Weight) OVER(ORDER BY t_stamp DESC)/2.0
FROM table
As you can see, I haven't even attempted the absolute value or the subtraction of the "2nd most recent" value because I didn't know how to reference a specific row (cell?). As a noob, I feel the math is doable in a single query, I just can't find the proper syntax. Thanks in advance.
So to update more clearly:
Thanks for the input ps2goat, though for some reason I'm unable to implement "TOP" function, so I currently have this:
SELECT ABS(SUM(Scale_Weight) OVER(PARTITION BY quality_code
ORDER BY t_stamp
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)/2.0)
FROM table
Still need to subtract the preceding value, something like:
SELECT ABS(SUM(Scale_Weight) OVER(PARTITION BY quality_code
ORDER BY t_stamp
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)/2.0
- 1 PRECEDING)
FROM table
Any ideas to reference the preceding value for subtraction?
You can use the LAG function to refer to the previous value in a given order. For example:
SELECT Scale_Weight AS CurrentWeight, LAG(Scale_Weight) OVER (ORDER BY t_stamp) AS PreviousWeight
FROM table
You can add your formula to this query.
This is what I did. Instead of timestamps, I used an Identity field, as those are incremented and easier to enter manually (not sure if you had datetime values or actual timestamp values)
fiddle: http://sqlfiddle.com/#!6/77bcb/4/0
schema:
create table x(
xId int identity(1,1) not null primary key,
scale_weight decimal(12,4)
);
insert into x(scale_weight)
select 24.1234 union all
select 32.4455 union all
select 88.1234 union all
select 223.443;
The inner query (below) grabs the top two rows, ordered by id descending (use your t_stamp column). The outer query sums all the Scale_Weight values returned by the inner query and divides that value by two.
sql:
select SUM(Scale_Weight)/2.0 from
(
SELECT TOP 2 Scale_Weight
FROM x
ORDER BY xid DESC
) y
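As a cross-check of the arithmetic, the following Python sketch computes both the mean of the two most recent samples (what the query above returns) and the absolute value of the expression from the question, using the sample rows from the schema. The function name is an assumption, and the sampling interval b - a = 1 is taken from the question.

```python
def trapezoid_terms(weights):
    """weights: samples ordered oldest to newest; uses the two most recent."""
    f_a, f_b = weights[-2], weights[-1]
    width = 1.0                        # b - a, the sampling interval
    mean_height = (f_a + f_b) / 2.0    # SUM(Scale_Weight)/2.0 over the top two rows
    # The question's expression: |(b - a) * ((f(a) + f(b)) / 2 - f(a))|
    change = abs(width * (mean_height - f_a))
    return mean_height, change

weights = [24.1234, 32.4455, 88.1234, 223.443]   # sample rows from the schema
mean_height, change = trapezoid_terms(weights)
```

Note that the question's expression simplifies to |f(b) - f(a)| / 2, so in SQL it only needs the current value and the previous one.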

"order by newid()" - how does it work?

I know that if I run this query
select top 100 * from mytable order by newid()
it will get 100 random records from my table.
However, I'm a bit confused as to how it works, since I don't see newid() in the select list. Can someone explain? Is there something special about newid() here?
I know what NewID() does, I'm just trying to understand how it would help in the random selection. Is it that (1) the select statement will select EVERYTHING from mytable, (2) for each row selected, tack on a uniqueidentifier generated by NewID(), (3) sort the rows by this uniqueidentifier and (4) pick off the top 100 from the sorted list?
Yes, this is pretty much exactly correct (except that it doesn't necessarily need to sort all the rows). You can verify this by looking at the actual execution plan.
SELECT TOP 100 *
FROM master..spt_values
ORDER BY NEWID()
The compute scalar operator adds the NEWID() column on for each row (2506 in the table in my example query) then the rows in the table are sorted by this column with the top 100 selected.
SQL Server doesn't actually need to sort the entire set from positions 100 down so it uses a TOP N sort operator which attempts to perform the entire sort operation in memory (for small values of N)
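The same decorate, order, take-the-top pattern can be sketched in Python. This is an analogy for the plan described above, not a model of SQL Server's internals: `uuid.uuid4()` stands in for NEWID(), and `heapq.nsmallest` plays the role of the TOP N sort by keeping only the n smallest keys instead of sorting everything. The function name is an assumption.

```python
import heapq
import uuid

def random_top_n(rows, n):
    """Attach a random key to each row and return the n rows with the smallest keys."""
    # Tack a fresh random key onto every row (the compute scalar step).
    keyed = [(uuid.uuid4().int, i, row) for i, row in enumerate(rows)]
    # Keep only the n smallest keys; a full sort of all rows is unnecessary.
    return [row for _, _, row in heapq.nsmallest(n, keyed)]

rows = list(range(2506))          # same row count as the example above
sample = random_top_n(rows, 100)
```

The random key is never returned to the caller, which is why it does not need to appear in the select list.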
In general it works like this:
All rows from mytable are "looped" over
NEWID() is executed for each row
The rows are sorted according to the random value from NEWID()
The first 100 rows are selected
As MSDN says:
NewID() Creates a unique value of type uniqueidentifier.
Your table is sorted by these random values.
To see this for yourself, run select top 100 randid = newid(), * from mytable order by randid
The extra randid column shows the value each row was sorted by.
I have an unimportant query which uses newId() and joins many tables. It returns about 10k rows in about 3 seconds. So, newId() might be ok in such cases where performance is not too bad & does not have a huge impact. But, newId() is bad for large tables.
Here is the explanation from Brent Ozar's blog - https://www.brentozar.com/archive/2018/03/get-random-row-large-table/.
From the above link, I have summarized the methods which you can use to generate a random id. You can read the blog for more details.
4 ways to get a random row from a large table:
Method 1, Bad: ORDER BY NEWID() > Bad performance!
Method 2, Better but Strange: TABLESAMPLE > Many gotchas & is not really random!
Method 3, Best but Requires Code: Random Primary Key > Fastest, but won't work for negative numbers.
Method 4, OFFSET-FETCH (2012+) > Only performs properly with a clustered index.
More on method 3:
Get the top ID field in the table, generate a random number, and look for that ID. For top N rows, call the code below N times or generate N random numbers and use in an IN clause.
/* Get a random number smaller than the table's top ID */
DECLARE @rand BIGINT;
DECLARE @maxid INT = (SELECT MAX(Id) FROM dbo.Users);
SELECT @rand = ABS((CHECKSUM(NEWID()))) % @maxid;
/* Get the first row around that ID */
SELECT TOP 1 *
FROM dbo.Users AS u
WHERE u.Id >= @rand;
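A minimal Python sketch of method 3 follows, assuming positive integer ids that are visited in ascending order (in SQL Server that order is not guaranteed without an ORDER BY). The function name and the sample ids are hypothetical.

```python
import bisect
import random

def random_row_by_id(sorted_ids, rng=random):
    """Pick a random number below the top id and return the first id at or above it."""
    max_id = sorted_ids[-1]
    rand = rng.randrange(max_id)                 # 0 .. max_id - 1, like the modulo above
    pos = bisect.bisect_left(sorted_ids, rand)   # first id >= rand
    return sorted_ids[pos]

ids = [1, 2, 3, 10, 11, 50]   # hypothetical ids with gaps
picks = [random_row_by_id(ids, random.Random(seed)) for seed in range(200)]
```

With gaps in the id sequence the pick is not uniform: here any random number from 12 to 49 lands on id 50, so rows that follow a large gap are favored.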
