Searching chars in database records? - sql-server

I have variable
Dim s As String = "1,5,20,22,28,31,40"
And I have records in (fieldnum) database (mssql) :
3,5,19,20,26,28,33,40
1,2,5,18,20,22,31,38
1,7,12,22,23,24,31
Now, how to find (SELECT ...???) records where at least 4 numbers in fieldnum are equal with numbers s variable?
In this example must be selected only those two rows.
3,5,19,20,26,28,33,40 (equal numbers: 5,20,28,40 = 4 nums)
1,2,5,18,20,22,31,38 (euqal numbers: 1,5,20,22,31 = 5 nums)

It is not going to recognize them as seperate numbers. You could trying using a LIKE search in your sql statement though.

Related

How to Get SSIS Expression Leftover odd Rows right?

I’m trying to come up with Expression in Conditional Split To Return Whatever leftOver Rows.
For example I have Flat File with 10 Rows and if want to split the file onto 2 Rows, I will have 5 files with 2 Rows in each file, but what if I want to split the file By 3 I will have only 3 files with 3 Rows each file, But Where is the Leftover 1 Row modulus remainder will go?. I tried the Conditional split and was fine But the LeftOver Rows Output DID NOT Come out Right. Please take look at my images and tell me if I have issues with my expression in Conditional Split and For Loop.
I have these variables:
#Counter = 0
#ReCount = total count that comes from RowCount Transformation
#SplitRow = Given Number that will Divided by
#EndCount = #ReCount / #SplitRow
I need to have Right expression for LeftOver or modulus remainder of any ‘Odds’ Number
Given that you already know the rownumber for each row as well as the rowcount of the entire file and the split number of rows in each destination file, the expression that will flag up any leftover rows from your splitting is any row in which the following returns true:
rownumber > (rowcount - (rowcount % split))

Counting rows in a table based on multiple array criterias

I am trying to count rows in a table based on multiple criteria in different columns of that table. The criteria are not directly in the formula though; they are arrays which I would like to refer to (and not list them in the formula directly).
Range table example:
Group Status
a 1
b 4
b 6
c 4
a 6
d 5
d 4
a 2
b 2
d 3
b 2
c 1
c 2
c 1
a 4
b 3
Criteria/arrays example:
group
a
b
status
1
2
3
I am able to do this if i only have one array search through a range (1 column) in that table:
{=SUM(COUNTIFS(data[Group],group[group]))}
Returns "9" as expected (=sum of rows in the group column of the data table which match any values in group[group])
But if I add a second different range and a different array I get an incorrect result:
{=SUM(COUNTIFS(data[Group],group[group], data[Status],status[status]))}
Returns "3" but should return "5" (=sum of rows which have 1, 2 or 3 in the status column and a or b in the group column)
I searched and googled for various ideas related to using sumproduct or defining arrays within the formula instead of classifying the whole formula as an array but I was not able to get expected results via those means.
Thank you for your help.
Because your group and status criteria are a different number of values (2 values for group, but 3 values for status), I'm not sure you can do this in a single formula. Best way I know of to do this would be to use a helper column (which can be hidden if preferred).
Put this array formula in a helper column and copy down the length of your data (array formulas must be confirmed with Ctrl+Shift+Enter):
=AND(OR(data[#Group]=group[group]),OR(data[#Status]=status[status]))
And then get the count with: =COUNTIF(helpercolumn,TRUE)
You could use a slightly different approach, using Power Query / Power Pivot.
Name your tables Data, Group and Status, then create the following query, named Filtered Data:
let
tbData = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
tbGroup = Excel.CurrentWorkbook(){[Name="Group"]}[Content],
tbStatus = Excel.CurrentWorkbook(){[Name="Status"]}[Content],
#"Merged Group" = Table.NestedJoin(tbData,{"Group"},tbGroup,{"Group"},"tbGroup",JoinKind.Inner),
#"Merged Status" = Table.NestedJoin(#"Merged Group",{"Status"},tbStatus,{"Status"},"Merged Status",JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(#"Merged Status",{"tbGroup", "Merged Status"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Status", type number}})
in
#"Changed Type"
Load To as connection only, and tick Load to Data Model
Now create a DAX measure:
Status Sum:=SUM ( 'Filtered Data'[Status] )
You can then use the following formula on your worksheet, to get the Sum of Status values, for rows matching the criteria specified in the Group and Status tables:
=CUBEVALUE("ThisWorkbookDataModel","[Measures].[Status Sum]")
Simply refresh the data connection to update the value.

lag over columns/ variables SPSS

I want to do something I thought was really simple.
My (mock) data looks like this:
data list free/totalscore.1 to totalscore.5.
begin data.
1 2 6 7 10 1 4 9 11 12 0 2 4 6 9
end data.
These are total scores accumulating over a number of trials (in this mock data, from 1 to 5). Now I want to know the number of scores earned in each trial. In other words, I want to subtract the value in the n trial from the n+1 trial.
The most simple syntax would look like this:
COMPUTE trialscore.1 = totalscore.2 - totalscore.1.
EXECUTE.
COMPUTE trialscore.2 = totalscore.3 - totalscore.2.
EXECUTE.
COMPUTE trialscore.3 = totalscore.4 - totalscore.3.
EXECUTE.
And so on...
So that the result would look like this:
But of course it is not possible and not fun to do this for 200+ variables.
I attempted to write a syntax using VECTOR and DO REPEAT as follows:
COMPUTE #y = 1.
VECTOR totalscore = totalscore.1 to totalscore.5.
DO REPEAT trialscore = trialscore.1 to trialscore.5.
COMPUTE #y = #x + 1.
END REPEAT.
COMPUTE trialscore(#i) = totalscore(#y) - totalscore(#i).
EXECUTE.
But it doesn't work.
Any help is appreciated.
Ps. I've looked into using LAG but that loops over rows while I need it to go over 1 column at a time.
I am assuming respid is your original (unique) record identifier.
EDIT:
If you do not have a record indentifier, you can very easily create a dummy one:
compute respid=$casenum.
exe.
end of EDIT
You could try re-structuring the data, so that each score is a distinct record:
varstocases
/make totalscore from totalscore.1 to totalscore.5
/index=scorenumber
/NULL=keep.
exe.
then sort your cases so that scores are in descending order (in order to be bale to use lag function):
sort cases by respid (a) scorenumber (d).
Then actually do the lag-based computations
do if respid=lag(respid).
compute trialscore=totalscore-lag(totalscore).
end if.
exe.
In the end, un-do the restructuring:
casestovars
/id=respid
/index=scorenumber.
exe.
You should end up with a set of totalscore variables (the last one will be empty), which will hold what you need.
you can use do repeat this way:
do repeat
before=totalscore.1 to totalscore.4
/after=totalscore.2 to totalscore.5
/diff=trialscore.1 to trialscore.4 .
compute diff=after-before.
end repeat.

MATLAB Extract all rows between two variables with a threshold

I have a cell array called BodyData in MATLAB that has around 139 columns and 3500 odd rows of skeletal tracking data.
I need to extract all rows between two string values (these are timestamps when an event happened) that I have
e.g.
BodyData{}=
Column 1 2 3
'10:15:15.332' 'BASE05' ...
...
'10:17:33:230' 'BASE05' ...
The two timestamps should match a value in the array but might also be within a few ms of those in the array e.g.
TimeStamp1 = '10:15:15.560'
TimeStamp2 = '10:17:33.233'
I have several questions!
How can I return an array for all the data between the two string values plus or minus a small threshold of say .100ms?
Also can I also add another condition to say that all str values in column2 must also be the same, otherwise ignore? For example, only return the timestamps between A and B only if 'BASE02'
Many thanks,
The best approach to the first part of your problem is probably to change from strings to numeric date values. In Matlab this can be done quite painlessly with datenum.
For the second part you can just use logical indexing... this is were you put a condition (i.e. that second columns is BASE02) within the indexing expression.
A self-contained example:
% some example data:
BodyData = {'10:15:15.332', 'BASE05', 'foo';...
'10:15:16.332', 'BASE02', 'bar';...
'10:15:17.332', 'BASE05', 'foo';...
'10:15:18.332', 'BASE02', 'foo';...
'10:15:19.332', 'BASE05', 'bar'};
% create column vector of numeric times, and define start/end times
dateValues = datenum(BodyData(:, 1), 'HH:MM:SS.FFF');
startTime = datenum('10:15:16.100', 'HH:MM:SS.FFF');
endTime = datenum('10:15:18.500', 'HH:MM:SS.FFF');
% select data in range, and where second column is 'BASE02'
BodyData(dateValues > startTime & dateValues < endTime & strcmp(BodyData(:, 2), 'BASE02'), :)
Returns:
ans =
'10:15:16.332' 'BASE02' 'bar'
'10:15:18.332' 'BASE02' 'foo'
References: datenum manual page, matlab help page on logical indexing.

Adding Numbers to get the target Numbers

I want to generate a random number against StudentID, I am using the following SQL
The Result i am getting is :
Please Help.
You're just setting that column to the column index of the partitioned row_number, which is 1. If you want a random number for each row, it would be easier to do this:
UPDATE TAB_EXAM_FORMS
SET RNO_CODE = CAST(RAND(CHECKSUM(NEWID())) * 10 AS INT) + 1
Just change the multiplication to whatever range you want. This range is 1-10.

Resources