SQL Server: Islands and Gaps

I'm struggling with an "Islands and Gaps" issue. This is for SQL Server 2008 / 2012 (we have databases on both).
I have a table that tracks "available" serial numbers for a pass outlet, i.e. bus passes, admission tickets, Disneyland tickets, etc. The serial numbers are VARCHAR and can be any combination of digits and letters, of any length up to the column's defined maximum, which is VARCHAR(30). This is where I'm struggling with the syntax and design of a VIEW.
The table (IM_SER) which contains all this data has a primary key consisting of:
ITEM_NO VARCHAR(20),
SERIAL_NO VARCHAR(30)
In many cases, particularly with the different types of bus passes involved, the serial numbers can easily run into the tens of thousands. What is needed is a simple view in SQL Server that outputs the consecutive ranges of available serial numbers, where each range ends as soon as a gap (a break in the sequence) is found. For example, say we have the following serial numbers on hand for a given item number:
123
124
125
139
140
ABC123
ABC124
ABC126
XYZ240003
XYZ240004
In my example above, the output would be displayed as follows:
123 -to- 125
139 -to- 140
ABC123 -to- ABC124
ABC126 -to- ABC126
XYZ240003 -to- XYZ240004
In total there are 10 serial numbers, but since we're outputting the sequential ranges, only 5 lines of output are necessary. Does this make sense? Please let me know, and again, thank you! Mark

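This should get you started. The query classifies each serial number by format and length, then uses LAG and LEAD to show the previous and next value within each group. The fun part will be determining whether there are gaps; you will have to handle each serial format a little differently to do that. Note that LAG and LEAD require SQL Server 2012 or later, so this query will not run on the 2008 databases.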
-- Classify each serial number, then look at its neighbours within the
-- same item, format and length.
select x.item_no, x.s_format, x.s_length, x.serial_no,
       LAG(x.serial_no) OVER (PARTITION BY x.item_no, x.s_format, x.s_length
                              ORDER BY x.item_no, x.s_format, x.s_length, x.serial_no) PreviousValue,
       LEAD(x.serial_no) OVER (PARTITION BY x.item_no, x.s_format, x.s_length
                               ORDER BY x.item_no, x.s_format, x.s_length, x.serial_no) NextValue
from
(
    select item_no, serial_no,
           len(serial_no) as S_LENGTH,
           case
               -- at least one digit and no letters
               WHEN PATINDEX('%[0-9]%', serial_no) > 0 AND
                    PATINDEX('%[a-z]%', serial_no) = 0 THEN 'NUMERIC'
               -- at least one digit and at least one letter
               WHEN PATINDEX('%[0-9]%', serial_no) > 0 AND
                    PATINDEX('%[a-z]%', serial_no) > 0 THEN 'ALPHANUMERIC'
               -- no digits at all
               ELSE 'ALPHA'
           end as S_FORMAT
    from table1
) x
order by item_no, s_format, s_length, serial_no
http://sqlfiddle.com/#!3/5636e2/7
| item_no | s_format | s_length | serial_no | PreviousValue | NextValue |
|---------|--------------|----------|-----------|---------------|-----------|
| 1 | ALPHA | 4 | ABCD | (null) | ABCF |
| 1 | ALPHA | 4 | ABCF | ABCD | (null) |
| 1 | ALPHANUMERIC | 6 | ABC123 | (null) | ABC124 |
| 1 | ALPHANUMERIC | 6 | ABC124 | ABC123 | ABC126 |
| 1 | ALPHANUMERIC | 6 | ABC126 | ABC124 | (null) |
| 1 | ALPHANUMERIC | 9 | XYY240004 | (null) | XYZ240003 |
| 1 | ALPHANUMERIC | 9 | XYZ240003 | XYY240004 | (null) |
| 1 | NUMERIC | 3 | 123 | (null) | 124 |
| 1 | NUMERIC | 3 | 124 | 123 | 125 |
| 1 | NUMERIC | 3 | 125 | 124 | 139 |
| 1 | NUMERIC | 3 | 139 | 125 | 140 |
| 1 | NUMERIC | 3 | 140 | 139 | (null) |
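For illustration only, here is a minimal Python sketch of the gap detection that the answer leaves open. It assumes (this is not stated in the question) that a serial number is an optional prefix followed by a trailing run of digits, and that two serials are consecutive only when they share the prefix and the digit count and their numbers differ by one. `split_serial` and `serial_ranges` are hypothetical helper names, and the sample list follows the question's expected output. The "number minus row position" trick used here is the same idea a T-SQL version would typically express with ROW_NUMBER().

```python
import re
from itertools import groupby


def split_serial(serial):
    """Return (prefix, digit_count, number); serials without trailing digits get (serial, 0, 0)."""
    match = re.fullmatch(r"(.*?)(\d+)", serial)
    if match is None:
        return serial, 0, 0
    prefix, digits = match.groups()
    return prefix, len(digits), int(digits)


def serial_ranges(serials):
    """Return a (first, last) pair for every run of consecutive serials."""
    keyed = sorted((split_serial(s), s) for s in serials)
    ranges = []
    # Assumption: only serials with the same prefix and digit count can be consecutive.
    for _, group in groupby(keyed, key=lambda item: item[0][:2]):
        members = list(group)
        # Within one group, (number - position) stays constant along a gap-free run.
        for _, run in groupby(enumerate(members),
                              key=lambda pair: pair[1][0][2] - pair[0]):
            run_serials = [serial for _, (_, serial) in run]
            ranges.append((run_serials[0], run_serials[-1]))
    return ranges


serials = ["123", "124", "125", "139", "140",
           "ABC123", "ABC124", "ABC126", "XYZ240003", "XYZ240004"]
ranges = serial_ranges(serials)
for first, last in ranges:
    print(f"{first} -to- {last}")
```

With that sample list the loop prints five lines, from `123 -to- 125` through `XYZ240003 -to- XYZ240004`. Serials that do not fit the prefix-plus-digits assumption would need their own rule.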

Related

Multi-dimensional data structure management in R

I have a concern about data organisation and the best approach to simplify some multi-layered data. Simply, I have about 10 replicate small wood beams (BeamID, ~10), each subjected to about 10 different treatments (TreatID, ~10), and each beam is load tested, which produces a series of Load values with the corresponding Displacement (ranging from 10 to 50 rows per test; I have code that corrects for disparities in row length). Each wood beam is tested multiple times (Rep, ~10).
My plan was to lump all this data into a 5-D array:
Array[Load, Deflection, BeamID, TreatID, Rep]
This way, I should be able to plot the load~deflection curves for a given BeamID, TreatID, for all Reps by using Array[ , ,1,1, ], right? So the hypothetical output for Array[ , ,1,1,1], would be:
+------------+--------+-----+
| Deflection | Load | Rep |
+------------+--------+-----+
| 0 | 0 | 1 |
| 6.35 | 10.5 | 1 |
| 12.7 | 20.8 | 1 |
| 19.05 | 45.3 | 1 |
| 25.4 | 75.2 | 1 |
+------------+--------+-----+
And Array[ , ,1,1,2] would be:
+------------+--------+-----+
| Deflection | Load | Rep |
+------------+--------+-----+
| 0 | 0 | 2 |
| 7.3025 | 12.075 | 2 |
| 14.605 | 23.92 | 2 |
| 21.9075 | 52.095 | 2 |
| 29.21 | 86.48 | 2 |
+------------+--------+-----+
Or I think I could keep it as a simpler, 'melted' dataframe, which would have columns for Load and Deflection, and BeamID, TreatID, and Rep would be repeated for each row of the test output.
+------------+--------+-----+--------+---------+
| Deflection | Load | Rep | BeamID | TreatID |
+------------+--------+-----+--------+---------+
| 0 | 0 | 1 | 1 | 1 |
| 6.35 | 10.5 | 1 | 1 | 1 |
| 12.7 | 20.8 | 1 | 1 | 1 |
| 19.05 | 45.3 | 1 | 1 | 1 |
| 25.4 | 75.2 | 1 | 1 | 1 |
| 0 | 0 | 2 | 1 | 1 |
| 7.3025 | 12.075 | 2 | 1 | 1 |
| 14.605 | 23.92 | 2 | 1 | 1 |
| 21.9075 | 52.095 | 2 | 1 | 1 |
| 29.21 | 86.48 | 2 | 1 | 1 |
+------------+--------+-----+--------+---------+
However, with the latter, I'm not sure how I could easily and discretely pull out all the Rep test values for a specific BeamID and TreatID, especially since I use a linear model to fit a 3rd order polynomial to a specific test to extract the slope of the curve. Having it as one continuous dataframe means I'd have to specify starting and stopping points for the linear model, correct?
Thoughts, suggestions? Am I headed in the right direction in using a 5-D array? R is a new programming language for me, so please pardon my misunderstandings.
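As a rough sketch of how the 'melted' layout can be used without start and stop row numbers, the Python below groups the long-format rows by their (BeamID, TreatID, Rep) key and fits each test separately. It is only an illustration in Python rather than R; `group_tests` and `linear_slope` are hypothetical names, and the plain least-squares slope is a stand-in for the 3rd order polynomial fit mentioned in the question. The rows are copied from the example tables.

```python
from collections import defaultdict

# Long-format rows: (deflection, load, rep, beam_id, treat_id).
rows = [
    (0.0, 0.0, 1, 1, 1),
    (6.35, 10.5, 1, 1, 1),
    (12.7, 20.8, 1, 1, 1),
    (19.05, 45.3, 1, 1, 1),
    (25.4, 75.2, 1, 1, 1),
    (0.0, 0.0, 2, 1, 1),
    (7.3025, 12.075, 2, 1, 1),
    (14.605, 23.92, 2, 1, 1),
    (21.9075, 52.095, 2, 1, 1),
    (29.21, 86.48, 2, 1, 1),
]


def group_tests(rows):
    """Collect the (deflection, load) points of each test under its (beam, treatment, rep) key."""
    tests = defaultdict(list)
    for deflection, load, rep, beam_id, treat_id in rows:
        tests[(beam_id, treat_id, rep)].append((deflection, load))
    return dict(tests)


def linear_slope(points):
    """Least-squares slope of load against deflection (stand-in for the cubic fit)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


tests = group_tests(rows)
slopes = {key: linear_slope(points) for key, points in tests.items()}
```

Each test is addressed by its identifying columns, so the fitting code never needs to know where a test starts or ends in the table. In R the same selection would normally be done by filtering or splitting the data frame on BeamID, TreatID and Rep.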

How do you make a table into one long row in SAS?

I have a table with a number of variables such as:
+-----------+------------+---------+-----------+--------+
| DateFrom | DateTo | Price | Discount | Cost |
+-----------+------------+---------+-----------+--------+
| 01jan17 | 01jul17 | 17 | 4 | 5 |
| 01aug17 | 01feb18 | 15 | 1 | 3 |
| 01mar18 | 01dec18 | 12 | 2 | 1 |
| ... | ... | ... | ... | ... |
+-----------+------------+---------+-----------+--------+
However I want to split this so I have:
+------------+------------+----------+-------------+---------+-------------+------------+----------+-------------+-------------+
| DateFrom1 | DateTo1 | Price1 | Discount1 | Cost1 | DateFrom2 | DateTo2 | Price2 | Discount2 | Cost2 ... |
+------------+------------+----------+-------------+---------+-------------+------------+----------+-------------+-------------+
| 01jan17 | 01jul17 | 17 | 4 | 5 | 01aug17 | 01feb18 | 15 | 1 | 3 |
+------------+------------+----------+-------------+---------+-------------+------------+----------+-------------+-------------+
There's a cool (not at all obvious) solution using proc summary and the idgroup option of its output statement that only takes a few lines of code. This runs in memory, so you're likely to run into problems if the dataset is large; otherwise it works very well.
Note that out[3] relates to the number of rows in the source data. You could easily make this dynamic by adding a prior step that calculates the number of rows and stores it in a macro variable.
/* create initial dataset */
data have;
input (DateFrom DateTo) (:date7.) Price Discount Cost;
format DateFrom DateTo date7.;
datalines;
01jan17 01jul17 17 4 5
01aug17 01feb18 15 1 3
01mar18 01dec18 12 2 1
;
run;
/* transform data into 1 row */
proc summary data=have nway;
output out=want (drop=_:)
idgroup(out[3] (_all_)=) / autoname;
run;
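To make the reshaping itself explicit, here is a small Python sketch of the same "many rows into one long row" transformation. It is not SAS and says nothing about how proc summary works internally; `flatten_rows` is a hypothetical helper, and the column naming (DateFrom1, DateTo1, ...) simply follows the layout requested in the question.

```python
def flatten_rows(rows):
    """Merge a list of row dicts into one dict, suffixing each column with its 1-based row number."""
    wide = {}
    for index, row in enumerate(rows, start=1):
        for name, value in row.items():
            wide[f"{name}{index}"] = value
    return wide


have = [
    {"DateFrom": "01jan17", "DateTo": "01jul17", "Price": 17, "Discount": 4, "Cost": 5},
    {"DateFrom": "01aug17", "DateTo": "01feb18", "Price": 15, "Discount": 1, "Cost": 3},
    {"DateFrom": "01mar18", "DateTo": "01dec18", "Price": 12, "Discount": 2, "Cost": 1},
]
want = flatten_rows(have)
```

Unlike `out[3]` in the SAS step, the number of rows does not have to be known in advance here, which is the part the answer suggests making dynamic with a macro variable.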

Building index for specific value

I have a table that keeps inventory information for products in stores on daily basis. It is like:
|------------|-----------|---------|-----------------|
| Date | ProductId | StoreId | InventoryOnHand |
|------------|-----------|---------|-----------------|
| 2017-10-11 | 348 | 121 | 2 |
| 2017-10-11 | 110 | 200 | 0 |
| 2017-10-11 | 254 | 587 | -2 |
| 2017-10-12 | 311 | 875 | 26 |
| 2017-10-12 | 954 | 364 | 15 |
| 2017-10-12 | 348 | 121 | 0 |
| 2017-10-12 | 441 | 121 | 7 |
| . | . | . | . |
| . | . | . | . |
| . | . | . | . |
|------------|-----------|---------|-----------------|
Most of the queries I use have a condition like WHERE InventoryOnHand > 0. I need to speed up these queries.
Therefore, I want to build an index that separates the values in column InventoryOnHand according to whether they are greater than 0 or not.
A filtered index does not solve my problem, because if I use a filtered index all values greater than 0 will be indexed, and this increases the index size. I only need to know whether a value is greater than 0 or not.
i.e. I want to build an index that only works when condition is InventoryOnHand > 0. Is there any way to do this on SQL-Server?
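One thing that may be worth trying: the key columns of a filtered index do not have to include the filtered column. The sketch below shows the idea with SQLite's partial indexes from Python, used here only as a stand-in for a SQL Server filtered index; the table name `Inventory`, the index name `ix_in_stock` and the choice of key columns (StoreId, ProductId) are assumptions for the example, and whether the optimizer actually picks such an index depends on the real queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Inventory ("
    " Date TEXT, ProductId INTEGER, StoreId INTEGER, InventoryOnHand INTEGER)"
)
conn.executemany(
    "INSERT INTO Inventory VALUES (?, ?, ?, ?)",
    [
        ("2017-10-11", 348, 121, 2),
        ("2017-10-11", 110, 200, 0),
        ("2017-10-11", 254, 587, -2),
        ("2017-10-12", 311, 875, 26),
        ("2017-10-12", 954, 364, 15),
        ("2017-10-12", 348, 121, 0),
        ("2017-10-12", 441, 121, 7),
    ],
)

# Partial index: keyed on StoreId and ProductId, and only rows with
# InventoryOnHand > 0 are stored in it. The stock value itself is not a key.
conn.execute(
    "CREATE INDEX ix_in_stock ON Inventory (StoreId, ProductId)"
    " WHERE InventoryOnHand > 0"
)

# A query that repeats the same predicate is a candidate for the partial index.
in_stock = conn.execute(
    "SELECT Date, ProductId FROM Inventory"
    " WHERE InventoryOnHand > 0 AND StoreId = 121"
    " ORDER BY ProductId"
).fetchall()
```

The index then holds one entry per in-stock row rather than one per row of the table, so its size grows with the number of in-stock rows, not with the range of stock values.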

SQL Database Constraint | Multi-table Constraint

I need to make 2 database constraints that connect two different tables at one time.
1. The total score of the four quarters equals the total score of the game the quarters belong to.
2. The total points of all the players on a team equal that team's score in the game.
Here is what my tables look like.
quarter table
+------+--------+--------+--------+
| gNum | Period | hScore | aScore |
+------+--------+--------+--------+
| 1 | 1 | 13 | 18 |
| 1 | 2 | 12 | 19 |
| 1 | 3 | 23 | 31 |
| 1 | 4 | 32 | 18 |
| | | Total | Total |
| | | 80 | 86 |
+------+--------+--------+--------+
Game Table
+-----+--------+--------+--------+
| gID | hScore | lScore | tScore |
+-----+--------+--------+--------+
| 1 | 86 | 80 | 166 |
+-----+--------+--------+--------+
Player Table
+-----+------+--------+--------+
| pID | gNum | Period | Points |
+-----+------+--------+--------+
| 1 | 1 | 1 | 20 |
| | | 2 | 20 |
| | | 3 | 20 |
| | | 4 | 20 |
+-----+------+--------+--------+
So essentially I think I need to use CHECK to make sure that the players' points equal the score of their team (i.e. hScore or aScore), and also that hScore and aScore add up to the total score in the Game table.
I was thinking of creating a foreign key on one of the tables and setting up the constraints on that. Would this be the best way of going about it?
Thanks
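A CHECK constraint is evaluated against a single row, so a rule that sums rows from another table usually ends up in a trigger or in a validation query instead. As a hedged sketch of the first rule only, the Python below uses SQLite to list games whose quarter scores do not add up to the game total. It assumes that tScore is the sum of both teams' quarter scores (which matches the sample numbers, 80 + 86 = 166) and that Quarter.gNum refers to Game.gID; `mismatched_games` is a hypothetical helper, and the same SELECT could serve as the body of a trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Game (gID INTEGER PRIMARY KEY, tScore INTEGER)")
conn.execute(
    "CREATE TABLE Quarter (gNum INTEGER, Period INTEGER, hScore INTEGER, aScore INTEGER)"
)
conn.execute("INSERT INTO Game VALUES (1, 166)")
conn.executemany(
    "INSERT INTO Quarter VALUES (?, ?, ?, ?)",
    [(1, 1, 13, 18), (1, 2, 12, 19), (1, 3, 23, 31), (1, 4, 32, 18)],
)

# Games whose summed quarter scores differ from the stored total.
MISMATCH_SQL = """
    SELECT g.gID
    FROM Game AS g
    LEFT JOIN Quarter AS q ON q.gNum = g.gID
    GROUP BY g.gID, g.tScore
    HAVING COALESCE(SUM(q.hScore + q.aScore), 0) <> g.tScore
"""


def mismatched_games(connection):
    """Return the ids of games that violate the quarter-total rule."""
    return [row[0] for row in connection.execute(MISMATCH_SQL)]


before = mismatched_games(conn)   # sample data is consistent
conn.execute("UPDATE Quarter SET hScore = 33 WHERE gNum = 1 AND Period = 4")
after = mismatched_games(conn)    # game 1 now violates the rule
```

The second rule (player points against the team score) could be checked the same way once the Player table records which team each player belongs to, which the sample table does not show.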

Transform ranged data in an Access table

I have a table in Access database as below;
Name | Range | X | Y | Z
------------------------------
A | 100-200 | 1 | 2 | 3
A | 200-300 | 4 | 5 | 6
B | 100-200 | 10 | 11 | 12
B | 200-300 | 13 | 14 | 15
C | 200-300 | 16 | 17 | 18
C | 300-400 | 19 | 20 | 21
I have been trying to write a query that converts this into the following format.
Name | X_100_200 | Y_100_200 | Z_100_200 | X_200_300 | Y_200_300 | Z_200_300 | X_300_400 | Y_300_400 | Z_300_400
A | 1 | 2 | 3 | 4 | 5 | 6 | | |
B | 10 | 11 | 12 | 13 | 14 | 15 | | |
C | | | | 16 | 17 | 18 | 19 | 20 | 21
After trying for a while, the best method I could come up with is to write a bunch of short queries that select the data for each Range and then put them together again using a Union query. The problem is that in this example I have shown 3 columns (X, Y and Z), but I actually have many more. Access is starting to strain under the amount of SQL I have come up with.
Is there a better way to achieve this?
The answer was simple. Just use Access Pivotview. Finding it hard to export the results to Excel though.
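For readers who want to see the reshaping spelled out, here is a small Python sketch of the pivot described in the question. It is not Access code; `pivot_ranges` is a hypothetical helper, and it assumes column names are built as the value column plus the range with the hyphen replaced by an underscore, as in the desired output.

```python
def pivot_ranges(rows, value_columns=("X", "Y", "Z")):
    """Turn (Name, Range, X, Y, Z) rows into one dict per Name with keys like X_100_200."""
    wide = {}
    for row in rows:
        suffix = row["Range"].replace("-", "_")
        target = wide.setdefault(row["Name"], {})
        for column in value_columns:
            target[f"{column}_{suffix}"] = row[column]
    return wide


rows = [
    {"Name": "A", "Range": "100-200", "X": 1, "Y": 2, "Z": 3},
    {"Name": "A", "Range": "200-300", "X": 4, "Y": 5, "Z": 6},
    {"Name": "B", "Range": "100-200", "X": 10, "Y": 11, "Z": 12},
    {"Name": "B", "Range": "200-300", "X": 13, "Y": 14, "Z": 15},
    {"Name": "C", "Range": "200-300", "X": 16, "Y": 17, "Z": 18},
    {"Name": "C", "Range": "300-400", "X": 19, "Y": 20, "Z": 21},
]
wide = pivot_ranges(rows)
```

Adding more value columns only means extending `value_columns`, and a Name with no row for a range (such as C for 100-200) simply has no entry for those columns.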
