Move item in list with sorted numeric-index - database

Case: Consider a list stored in sql-database with 2 columns:
-----------------------
Note | Index_Value
-----------------------
Yellow | 4
Red | 7
Blue | 3
Grey | 30
The user is shown data of this list by simply sorting on Index_Value column, which is shown as:
Blue
Yellow
Red
Grey
How do I move Grey between Red & Blue, without affecting any other row but Grey's index value?
I have came up with simple maths, but it's far from perfect.
Solution: Median of values between which object is moved
( Index_Value(Previous) + Index_Value(After) )/2
(7 + 3)/2 = 5.5
This approach produces large decimal numbers just after few repetitions.

Related

Running Delta Issue in Google Data Studio

Data
Datapull | product | value
8/30 | X | 2
8/30 | Y | 3
8/30 | Y | 4
9/30 | X | 5
9/30 | Y | 6
Report
date range dimension: datapull
dimensions: data pull & product
metric: running delta of record count
Chart
For 8/30, The totals to start for product Y are right but Product X shows nothing when the first row of data has an entry for product X in 8/30.
The variances in 9/30 are wrong too.
Can someone please let me know if there is a way to do running deltas with 2 dimensions? This is not calculating correctly.
Using the combination of Breakdown dimension and Running delta calculation brakes the chart!
If you not mind to create a field for each Breakdown dimension (product), this will work:
case when product="X" then 1 else 0 end
And put each of these fields into 'Metric':

Equivalent of Excel Pivoting in Stata

I have been working with country-level survey data in Stata that I needed to reshape. I ended up exporting the .dta to a .csv and making a pivot table in in Excel but I am curious to know how to do this in Stata, as I couldn't figure it out.
Suppose we have the following data:
country response
A 1
A 1
A 2
A 2
A 1
B 1
B 2
B 2
B 1
B 1
A 2
A 2
A 1
I would like the data to be reformatted as such:
country sum_1 sum_2
A 4 4
B 3 2
First I tried a simple reshape wide command but got the error that "values of variable response not unique within country" before realizing reshape without additional steps wouldn't work anyway.
Then I tried generating new variables conditional on the value of response and trying to use reshape following that... the whole thing turned into kind of a mess so I just used Excel.
Just curious if there is a more intuitive way of doing that transformation.
If you just want a table, then just ask for one:
clear
input str1 country response
A 1
A 1
A 2
A 2
A 1
B 1
B 2
B 2
B 1
B 1
A 2
A 2
A 1
end
tabulate country response
| response
country | 1 2 | Total
-----------+----------------------+----------
A | 4 4 | 8
B | 3 2 | 5
-----------+----------------------+----------
Total | 7 6 | 13
If you want the data to be changed to this, reshape is part of the answer, but you should contract first. collapse is in several ways more versatile, but your "sum" is really a count or frequency, so contract is more direct.
contract country response, freq(sum_)
reshape wide sum_, i(country) j(response)
list
+-------------------------+
| country sum_1 sum_2 |
|-------------------------|
1. | A 4 4 |
2. | B 3 2 |
+-------------------------+
In Stata 16 up, help frames introduces frames as a way to work with multiple datasets in the same session.

make x in a cell equal 8 and total

I need an excel formula that will look at the cell and if it contains an x will treat it as a 8 and add it to the total at the bottom of the table. I have done these in the pass and I am so rusty that I cannot remember how I did it.
Generally, I try and break this sort of problem into steps. In this case, that'd be:
Determine if a cell is 'x' or not, and create new value accordingly.
Add up the new values.
If your values are in column A (for example), in column B, fill in:
=if(A1="x", 8, 0) (or in R1C1 mode, =if(RC[-1]="x", 8, 0).
Then just sum those values (eg sum(B1:B3)) for your total.
A | B
+---------+---------+
| VALUES | TEMP |
+---------+---------+
| 0 | 0 <------ '=if(A1="x", 8, 0)'
| x | 8 |
| fish | 0 |
+---------+---------+
| TOTAL | 8 <------ '=sum(B1:B3)'
+---------+---------+
If you want to be tidy, you could also hide the column with your intermediate values in.
(I should add that the way your question is worded, it almost sounds like you want to 'push' a value into the total; as far as I've ever known, you can really only 'pull' values into a total.)
Try this one for total sum:
=SUMIF(<range you want to sum>, "<>" & <x>, <range you want to sum>)+ <x> * COUNTIF(<range you want to sum>, <x>)

How can I use the LAG function and return the same value if the subsequent value in a row is duplicated?

I am using the LAG function to move my values one row down.
However, I need to use the same value as previous if the items in source column is duplicated:
ID | SOURCE | LAG | DESIRED OUTCOME
1 | 4 | - | -
2 | 2 | 4 | 4
3 | 3 | 2 | 2
4 | 3 | 3 | 2
5 | 3 | 3 | 2
6 | 1 | 3 | 3
7 | 4 | 1 | 1
8 | 4 | 4 | 1
As you can see, for instance in ID range 3-5 the source data doesn't change and the desired outcome should be fed from the last row with different value (so in this case ID 2).
Sql server's version of lag supports an expression in the second argument to determine how many rows back to look. You can replace this with some sort of check to not look back e.g.
select lagged = lag(data,iif(decider < 0,0,1)) over (order by id)
from (values(0,1,'dog')
,(1,2,'horse')
,(2,-1,'donkey')
,(3,2,'chicken')
,(4,23,'cow'))f(id,decider,data)
This returns the following list
null
dog
donkey
donkey
chicken
Because the decider value on the row with id of 2 was negative.
Well, first lag may not be the tool for the job. This might be easier to solve with a recursive CTE. Sql and window functions work over set. That said, our goal here is to come up with a way of describing what we want. We'd like a way to partition our data so that sequential islands of the same value are part of the same set.
One way we can do that is by using lag to help us discover if the previous row was different or not.
From there, we can now having a running sum over these change events to create partitions. Once we have partitions, we can assign a row number to each element in the partition. Finally, once we have that, we can now use the row number to look
back that many elements.
;with d as (
select * from (values
(1,4)
,(2,2)
,(3,3)
,(4,3)
,(5,3)
,(6,1)
,(7,4)
,(8,4)
)f(id,source))
select *,lag(source,rn) over (order by Id)
from (
select *,rn=row_number() over (partition by partition_id order by id)
from (
select *, partition_id = sum(change) over (order by id)
from (
select *,change = iif(lag(source) over (order by id) != source,1,0)
from d
) source_with_change
) partitioned
) row_counted
As an aside, this an absolutely cruel interview question I was asked to do once.

Query on C exercise

From the link
http://www.cs.cf.ac.uk/Dave/C/node8.html#SECTION00840000000000000000
Exercise 12347
I am not able to understand the meaning of these 2 points in the question.
1)
The total number of characters after tab expansion
The total number of spaces after tab expansion
The total number of leading spaces after tab expansion
2)
NOTE: All tab characters ('') on input should be interpreted as multiple spaces using the rule:
"move to the next modulo 8 column"
where the first column is numbered column 0.
col before tab | col after tab
---------------+--------------
0 | 8
1 | 8
7 | 8
8 | 16
9 | 16
15 | 16
16 | 24
Sham
It's simple enough once you understand. Tab expansion is the act of replacing tabs with a series of spaces, the quantity of which will move you to the next tab stop.
So, for example, let's consider tabstops at columns 8, 16 and so on. The first line below will be tab-expanded to the second (assuming . is the tab):
11111111112
12345678901234567890 <- Ruler line
--------------------
hi.there
hi there
You can see that the single tab has been expanded into five spaces so that the next character starts on the tabstop at column 8.
So you just need to re-examine those questions in light of that information.

Resources