Excel Lookup IP addresses in multiple ranges - arrays

I am trying to find a formula for column A that checks an IP address in column B and determines whether it falls within a range between two addresses in columns C and D.
E.G.
A B C D
+---------+-------------+-------------+------------+
| valid? | address | start | end |
+---------+-------------+-------------+------------+
| yes | 10.1.1.5 | 10.1.1.0 | 10.1.1.31 |
| Yes | 10.1.3.13 | 10.1.2.16 | 10.1.2.31 |
| no | 10.1.2.7 | 10.1.1.128 | 10.1.1.223 |
| no | 10.1.1.62 | 10.1.3.0 | 10.1.3.127 |
| yes | 10.1.1.9 | 10.1.4.0 | 10.1.4.255 |
| no | 10.1.1.50 | … | … |
| yes | 10.1.1.200 | | |
+---------+-------------+-------------+------------+
This is supposed to represent an Excel table with 4 columns, a heading, and 7 rows as an example.
I can do a lateral check with
=IF(AND((B3>C3),(B3 < D3)),"yes","no")
which only checks 1 address against the range next to it.
I need something that will check the one IP address against all of the ranges, i.e. rows 1 to 100.
This is checking access list rules against routes to see if I can eliminate redundant rules... but has other uses if I can get it going.
To make it extra special, I cannot use VBA macros to get it done.
I'm thinking some kind of index match to look it up in an array but not sure how to apply it. I don't know if it can even be done. Good luck.

Ok, so I've been tracking this problem since my initial comment, but have not taken the time to answer because just like Lana B:
I like a good puzzle, but it's not a good use of time if i have to keep guessing
+1 to Lana for her patience and effort on this question.
However, IP addressing is something I deal with regularly, so I decided to tackle this one for my own benefit. Also, no offense, but getting the MIN of the start and the MAX of the end is wrong. This will not account for gaps in the IP white-list. As I mentioned, this required 15 helper columns and my result is simply 1 or 0 corresponding to In or Out respectively. Here is a screenshot (with formulas shown below each column):
The formulas in F2:J2 are:
=NUMBERVALUE(MID(B2,1,FIND(".",B2)-1))
=NUMBERVALUE(MID(B2,FIND(".",B2)+1,FIND(".",B2,FIND(".",B2)+1)-1-FIND(".",B2)))
=NUMBERVALUE(MID(B2,FIND(".",B2,FIND(".",B2)+1)+1,FIND(".",B2,FIND(".",B2,FIND(".",B2)+1)+1)-1-FIND(".",B2,FIND(".",B2)+1)))
=NUMBERVALUE(MID(B2,FIND(".",B2,FIND(".",B2,FIND(".",B2)+1)+1)+1,LEN(B2)))
=F2*256^3+G2*256^2+H2*256+I2
Yes, I used formulas instead of "Text to Columns" to automate the process of adding more information to a "living" worksheet.
The formulas in L2:P2 are the same, but replace B2 with C2.
The formulas in R2:V2 are also the same, but replace B2 with D2.
The formula for X2 is
=SUMPRODUCT(--($P$2:$P$8<=J2)*--($V$2:$V$8>=J2))
I also copied your original "valid" set in column A, which you'll see matches my result.
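For readers who want to check the helper-column logic outside Excel, here is a minimal Python sketch of the same idea: convert each dotted address to one integer (the `F2*256^3+G2*256^2+H2*256+I2` step) and count how many ranges contain it (the SUMPRODUCT step). The function names `ip_to_int` and `count_matching_ranges` are made up for this illustration, and the ranges are the five complete rows from the question's example table.

```python
def ip_to_int(address):
    """Convert a dotted IPv4 string to one integer: a*256^3 + b*256^2 + c*256 + d."""
    a, b, c, d = (int(part) for part in address.split("."))
    return a * 256**3 + b * 256**2 + c * 256 + d


def count_matching_ranges(address, ranges):
    """Count the (start, end) ranges that contain the address, bounds included."""
    value = ip_to_int(address)
    return sum(
        1
        for start, end in ranges
        if ip_to_int(start) <= value <= ip_to_int(end)
    )


# Ranges copied from the example table in the question (hypothetical test data).
RANGES = [
    ("10.1.1.0", "10.1.1.31"),
    ("10.1.2.16", "10.1.2.31"),
    ("10.1.1.128", "10.1.1.223"),
    ("10.1.3.0", "10.1.3.127"),
    ("10.1.4.0", "10.1.4.255"),
]

if __name__ == "__main__":
    for ip in ["10.1.1.5", "10.1.2.7", "10.1.1.200"]:
        print(ip, count_matching_ranges(ip, RANGES))
```

A result of 0 means the address is outside every range. Anything greater than 0 means it is inside at least one, which is how the 1/0 result in column X is read when the ranges do not overlap.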

You will need helper columns.
Organise your data as outlined in the picture.
Split address, start and end into columns at the period (ribbon menu Data=>Text To Columns).
Above the start/end parts, calculate MIN FOR START and MAX FOR END for all split text parts (i.e. MIN(K5:K1000)).
FORMULAS:
VALIDITY formula - copy into cell D5, and drag down:
=IF(AND(B6>$I$1,B6<$O$1),"In",
IF(OR(B6<$I$1,B6>$O$1),"Out",
IF(B6=$I$1,
IF(C6<$J$1, "Out",
IF( C6>$J$1, "In",
IF( D6<$K$1, "Out",
IF( D6>$K$1, "In",
IF(E6>=$L$1, "In", "Out"))))),
IF(B6=$O$1,
IF(C6>$P$1, "Out",
IF( C6<$P$1, "In",
IF( D6>$Q$1, "Out",
IF( D6<$Q$1, "In",
IF(E6<=$R$1, "In", "Out") )))) )
)))

Related

How to validate two data sets coming from an algorithm to check its effectiveness

I have two data sets:
Ist (AKA "OLD") [smaller - just a sample]:
Origin | Alg.Result | Score
Star123 | Star123 | 100
Star234 | Star200 | 90
Star421 | Star420 | 98
Star578 | Star570 | 95
... | ... | ...
IInd (AKA "NEW") [bigger - used all real data]:
Origin | Alg.Result | Score
Star123 | Star120 | 90
Star234 | Star234 | 100
Star421 | Star423 | 98
Star578 | Star570 | 95
... | ... | ...
Those DFs are the results of two different algorithms. Let's call them "OLD" and "NEW".
The logic of those algorithms is as follows:
Each one takes a value from some table (shown in the column 'Origin') and tries to match it to a value from a different table (the outcome is shown in the column 'Alg.Result'). It also calculates a score for the match based on some internal logic (column 'Score').
Additional important information:
I DF (old) is a smaller sample
II DF (new) is a bigger sample
Values in ORIGIN are the same for both datasets, excluding the fact that the old dataset has fewer of them compared to the NEW set.
Values in Alg. Result can:
be exactly the same as in Origin
can be similar
can be completely something else
In a solution where those algorithms are used, the threshold is used based on SCORE. For OLD it's a Score > 90. For the new, it's the same.
What I want to find out:
How accurate is the new algorithm?
Validate how accurately the new approach ("NEW") matches values to the Origin values.
What are the discrepancies between the OLD and NEW sets:
which cases the OLD has that the NEW doesn't have
which cases the NEW has, which the OLD doesn't have
What kind of comparison would you do to achieve those goals?
I thought about checking:
True positive => by taking NEW dataset with condition NEW.Origin == NEW.Alg.Result and NEW.Score == 100
False positive => by taking NEW dataset with condition NEW.Origin != NEW.Alg.Result and NEW.Score == 100
False negative => by taking NEW dataset with condition NEW.Origin == NEW.Alg.Result and NEW.Score != 100
I don't see the point of counting true negatives if the algorithm always generates some match. I'm not sure what that could look like.
What else would you suggest? How would you compare the OLD and NEW values? Do you have any ideas?
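The checks described above could be sketched in Python roughly as follows. This is only an illustration that uses the question's own definitions of true positive, false positive and false negative. The function names, the "other" bucket for rows that fit none of the three definitions, and the tuple layout `(origin, alg_result, score)` are assumptions made for the example.

```python
from collections import Counter

# Sample rows from the question: (origin, alg_result, score).
OLD = [
    ("Star123", "Star123", 100),
    ("Star234", "Star200", 90),
    ("Star421", "Star420", 98),
    ("Star578", "Star570", 95),
]
NEW = [
    ("Star123", "Star120", 90),
    ("Star234", "Star234", 100),
    ("Star421", "Star423", 98),
    ("Star578", "Star570", 95),
]


def classify(origin, result, score):
    """Label one row using the definitions proposed in the question."""
    if score == 100:
        return "true_positive" if origin == result else "false_positive"
    return "false_negative" if origin == result else "other"


def confusion_counts(rows):
    """Count how many rows fall under each label."""
    return Counter(classify(*row) for row in rows)


def discrepancies(old_rows, new_rows):
    """Map each shared origin to (old_result, new_result) where the two differ."""
    old = {origin: result for origin, result, _ in old_rows}
    new = {origin: result for origin, result, _ in new_rows}
    return {
        origin: (old[origin], new[origin])
        for origin in old.keys() & new.keys()
        if old[origin] != new[origin]
    }


if __name__ == "__main__":
    print(confusion_counts(NEW))
    print(discrepancies(OLD, NEW))
```

Restricting the discrepancy check to origins present in both sets reflects the fact that OLD is only a sample of NEW.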

Excel how to find all rows matching elements from a comma-separated list without VBA

Here is my problem, I have a table with two columns: product references and corresponding notice ids:
| A | B | C | D |
---------------------------------------
1| Product | Notice | | |
2| p1 | n1 | | |
3| p2 | n2 | | |
4| p3 | n3 | | |
5| | | | |
6| | | p1, p3 | =... |
(edit: in my real life application, columns 'product references' and 'notice ids' are not alongside but separated by other columns)
In another cell (e.g. C6), I have a comma separated list of product references, let's say p1, p3 and I need a formula to output the corresponding notice ids, i.e. n1, n3 in this case, in cell D6.
Important: For different reasons, I cannot use VBA, I need a standard excel array formula.
Here is what I can do at the moment:
1. With the FILTERXML function, I can split the comma-separated list into an array: FILTERXML("<t><s>" & SUBSTITUTE(C6, ", ", "</s><s>") & "</s></t>", "//s")
2. With the TEXTJOIN function, I can merge an array into a string.
3. I can extract a single match with a combination of INDEX and MATCH functions, e.g.:
=IF(ISERROR(MATCH("p3"; A:A; 0)); "not found"; INDEX(B:B; MATCH("p3"; A:A; 0)))
(which is not useful for me, since again the references in column A are unique)
(By the way, I don't know if there is a better way to handle error raised by MATCH when no match is found)
4. I can extract and join elements of column B corresponding to multiple matches to a single reference in column A with (array formula activated with Ctrl+Shift+Enter):
{=TEXTJOIN(", "; TRUE; IF(A:A="p2"; B:B; ""))}
(which is not useful for me, since again the references in column A are unique)
In summary: I can find and merge multiple matches to a single reference, but I cannot find and merge single unique match to multiple references (what I want to do).
Failed attempts
I tried to mix the previous formulae in different ways to get what I want, but all failed with an error.
Combining 1, 2 and 4 (using OR on boolean array of matches):
{=TEXTJOIN(", "; TRUE; IF(OR(A:A=FILTERXML("<t><s>" & SUBSTITUTE(C6, ", ", "</s><s>") & "</s></t>", "//s")); B:B; ""))}
or (using SUM on boolean array of matches):
{=TEXTJOIN(", "; TRUE; IF(SUM(A:A=FILTERXML("<t><s>" & SUBSTITUTE(C6, ", ", "</s><s>") & "</s></t>", "//s")); B:B; ""))}
Here, I am not sure how to handle the different arrays that are considered in the IF (column A and list of references given by FILTERXML).
Combining 1, 2 and 3:
{=TEXTJOIN(", "; TRUE; INDEX(B:B; MATCH(FILTERXML("<t><s>" & SUBSTITUTE(C6, ", ", "</s><s>") & "</s></t>", "//s"); A:A; 0)))}
Here, I am not sure how to handle (i) again the different arrays that are considered (column A and list of references given by FILTERXML), (ii) the error raised by MATCH when no match is found, (iii) the array references passed to INDEX function.
Nice question. If you just have Excel 2019, you could maybe go with:
Formula in E1:
=TEXTJOIN(", ",,IFERROR(VLOOKUP(FILTERXML("<t><s>"&SUBSTITUTE(D1,", ","</s><s>")&"</s></t>","//s"),A:B,2,FALSE),""))
If you have Excel O365, then maybe:
=TEXTJOIN(", ",,XLOOKUP(FILTERXML("<t><s>"&SUBSTITUTE(D1,", ","</s><s>")&"</s></t>","//s"),A:A,B:B,"",0))
Try:
=TEXTJOIN(",",TRUE,VLOOKUP(FILTERXML("<t><s>" & SUBSTITUTE(C6,",","</s><s>")&"</s></t>","//s"),tblProd[[Product]:[Notice]],COLUMNS(tblProd[[Product]:[Notice]]),FALSE))
I used tables and structured references, although you can change this to regular addressing if you absolutely need to, but I think with Tables, and auto-adjusting references, it will be easier to maintain.
Since you did not know the distance between the Product column and the Notice column, I constructed an array and obtained the column number argument for VLOOKUP using the COLUMNS function.
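To make the three steps in these formulas explicit (split the list, look each reference up, join the hits), here is a small Python sketch of the same logic. It is not an Excel replacement, just a way to see what the formula computes. The function name `lookup_notices` and the dictionary `NOTICES` are hypothetical, built from the sample table in the question.

```python
def lookup_notices(reference_list, notice_by_product, separator=", "):
    """Split a comma-separated list, look up each item, and join what was found."""
    # Split on commas and trim spaces (the FILTERXML/SUBSTITUTE step).
    wanted = [ref.strip() for ref in reference_list.split(",") if ref.strip()]
    # Keep only references that exist (the IFERROR / if_not_found "" step).
    found = [notice_by_product[ref] for ref in wanted if ref in notice_by_product]
    # Join the results and skip the missing ones (the TEXTJOIN step).
    return separator.join(found)


# Product -> notice pairs from the question's sample table.
NOTICES = {"p1": "n1", "p2": "n2", "p3": "n3"}

if __name__ == "__main__":
    print(lookup_notices("p1, p3", NOTICES))
```

Trimming each reference after the split makes the sketch indifferent to whether the list is written as "p1,p3" or "p1, p3".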

Use excel to summarise data from a column by identifier

I have a spreadsheet with a column called MRN (the identifier) and the drugs administered next to them. There are duplicates of the MRN in column A that correspond to different courses of drugs. What I'm hoping to do is to summarise all the drugs administered associated with one MRN in one line, removing all duplicates. It looks something like this.
|   | A   | B           |
| 1 | MRN | Item        |
| 2 | 1   | cefoTAXime  |
| 3 | 1   | ampicillin  |
| 4 | 1   | cefoTAXime  |
| 5 | 1   | vancomycin  |
| 6 | 1   | cefTRIaxone |
| 7 | 2   | ampicillin  |
| 8 | 2   | vancomycin  |
| 9 | 2   | vancomycin  |
I have 3 different formulas. The first is to produce a list of MRNs that are all unique. The second is to pull all drugs by MRN and list them in one line. The third is to remove duplicates from this list. They are below (in order).
{=IFERROR(INDEX($A$2:$A$2885, MATCH(0,COUNTIF(D$1:$D1, $A$2:$A$2885),0 )),"")}
{=INDEX($A$2:$B$2885,SMALL(IF($A$2:$A$2885=$D2,ROW($A$2:$A$2885)),COLUMN(D:D))-4,2)}
{=IFERROR(INDEX($E$2:$AE$2, MATCH(0,COUNTIF(D$3:$D3, $E$2:$AE$2),0 )),"")}
*I know that I can edit the second one by adding IF(ISERROR ...) to remove NA and print blanks if drug not found, but want to keep the formulas as simple as possible at this time.
My problem is that second formula isn't pulling all the drugs by MRN, and in an ideal world I would be able to combine the second and third formula into one, but I am not sure how to. Here is a link to a test file that shows my issue and the formulas in action.
https://1drv.ms/x/s!ApoCMYBhswHzhooXnumW2iV7yx-JaA
I appreciate that there may be a better way to do this using python/R, and if that's possible then I'm more than happy to try, but I couldn't make any headway. Thanks for your help and suggestions.
If you could deal with a count of the number of courses per drug per MRN, you can do this with Power Query (aka Get & Transform in Excel 2016)
Starting with the data you provided on your worksheet, the results would look like:
M-Code
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"MRN", Int64.Type}, {"Item", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"MRN"}, {{"Count", each _, type table}}),
#"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"MRN", "Item"}, {"Count.MRN", "Count.Item"}),
#"Pivoted Column" = Table.Pivot(#"Expanded Count", List.Distinct(#"Expanded Count"[Count.Item]), "Count.Item", "Count.MRN", List.NonNullCount)
in
#"Pivoted Column"
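Since the question mentions being open to Python, the group-and-count result of the query above can also be sketched with the standard library. This is a rough illustration, not a translation of the M code. The function name `count_courses` is made up, and the rows are the sample data from the question.

```python
from collections import Counter, defaultdict

# (MRN, Item) rows from the question's sample table.
ROWS = [
    (1, "cefoTAXime"),
    (1, "ampicillin"),
    (1, "cefoTAXime"),
    (1, "vancomycin"),
    (1, "cefTRIaxone"),
    (2, "ampicillin"),
    (2, "vancomycin"),
    (2, "vancomycin"),
]


def count_courses(rows):
    """Return {mrn: {drug: number_of_courses}} for the given (mrn, drug) rows."""
    counts = defaultdict(Counter)
    for mrn, item in rows:
        counts[mrn][item] += 1
    return {mrn: dict(per_drug) for mrn, per_drug in counts.items()}


if __name__ == "__main__":
    for mrn, per_drug in count_courses(ROWS).items():
        # The keys of each inner dict are the de-duplicated drugs for that MRN.
        print(mrn, ", ".join(per_drug), per_drug)
```

The keys of each inner dictionary give the duplicate-free drug list per MRN, and the values give the course counts that the pivot produces.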

Use TQuery.Locate() function to find other than first matching

Locate moves the cursor to the first row matching a specified set of search criteria.
Let's say that q is a TQuery component connected to a database table with two columns, TAG and TAGTEXT. With the following code I get the letter a, but I would like to use the Locate() function to get the letter d.
if q.Locate('TAG', '1', [loPartialKey]) then
begin
  // assumes tag60 is a string variable
  tag60 := q.FieldByName('TAGTEXT').AsString;
end;
For example, if I have a table like this:
TAG | TAGTEXT
+---+--------+
| 1 | a |
+---+--------+
| 2 | b |
+---+--------+
| 3 | c |
+---+--------+
| 1 | d |
+---+--------+
| 4 | e |
+---+--------+
| 1 | f |
+---+--------+
is it possible to locate the second occurrence of the number 1 in the table?
EDIT
My job is to find a given occurrence of TAG with value 1 (which occurrence depends on a parameter I receive). From there I need to iterate through the table and collect the values of all the TAGTEXT fields until the TAG field is 1 again. The value 1 marks the start of a new segment, and everything between two 1s belongs to one segment. Segments do not have to contain the same number of rows. Also, I am not allowed to make any changes to the table.
What I thought I could do is create a counter variable that is increased by one every time a row with TAG value 1 is reached. When the counter equals the parameter that represents the occurrence, I know I am in the right segment, and I can iterate through that segment and get the values I need.
But this might be a slow solution, and I wanted to know if there is a faster one.
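The counter idea described above can be sketched in Python as a single forward pass over the rows. This is a language-neutral illustration rather than Delphi code. The function name `segment_texts` is made up, and it assumes the row that opens the segment (the one with TAG 1) is itself part of the segment.

```python
def segment_texts(rows, occurrence, start_tag=1):
    """Return the TAGTEXT values of the segment opened by the n-th start_tag row."""
    seen = 0      # how many segment-start rows have been passed so far
    texts = []
    for tag, text in rows:
        if tag == start_tag:
            seen += 1
            if seen > occurrence:
                break  # the next segment has started, so stop scanning
        if seen == occurrence:
            texts.append(text)
    return texts


# (TAG, TAGTEXT) rows from the example table in the question.
ROWS = [(1, "a"), (2, "b"), (3, "c"), (1, "d"), (4, "e"), (1, "f")]

if __name__ == "__main__":
    print(segment_texts(ROWS, 2))
```

Each row is visited at most once and the scan stops as soon as the following segment begins, so the cost grows with the position of the requested segment and not with repeated searches.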
You need to be a bit wary of using Locate for a purpose like this, because some TDataSet descendants' implementations of Locate (or the underlying db-access layer) construct a temporary index on the dataset, which can be discarded immediately afterwards, so repeatedly calling Locate to iterate the rows of a given segment may be a lot less efficient than one might expect.
Also, TClientDataSet constructs, uses and then discards an expression parser for each invocation of Locate (in its internal call to LocateRecord), which is a lot of overhead for repeated calls, especially when they are entirely avoidable.
In any case, the best way to do this is to ensure that your table records which segment a given row belongs to, adding a column like the SegmentID below if your table does not already have one:
TAG | TAGTEXT|SegmentID
+---+--------+---------+
| 1 | a | 1
| 2 | b | 1
| 3 | c | 1
| 1 | d | 2
+---+--------+---------+ // btw, what happened to the 2 missing rows after this one?
| 4 | e | 2
| 1 | f | 3
+---+--------+---------+
Then, you could use code like this to iterate the rows of a segment:
procedure IterateSegment(Query : TSomeTypeOfQueryComponent; SegmentID : Integer);
var
  Sql : String;
begin
  // Select only the rows that belong to the requested segment
  Sql := Format('select * from mytable where SegmentID = %d order by Tag', [SegmentID]);
  if Query.Active then
    Query.Close;
  Query.Sql.Text := Sql;
  Query.Open;
  Query.DisableControls;
  try
    while not Query.Eof do begin
      // process row here
      Query.Next;
    end;
  finally
    Query.EnableControls;
  end;
end;
Once you have the SegmentID column in the table, if you don't want to open a new query to iterate a block, you can set up a local index (by SegmentID then Tag), assuming your dataset type supports it, set a filter on the dataset to restrict it to a given SegmentID, and then iterate over it.
You have many options for doing this.
If your component doesn't provide a LocateNext, you can write your own LocateNext function that compares the value and calls Next until it finds a match.
You can also run the SQL with an ORDER BY, use Locate for the first value, and then test whether the next row matches the comparison.
If you use a ClientDataSet, you can filter through the component's Filter property, or set IndexFieldNames to order the values instead of using the ORDER BY from the previous suggestion.
You can also filter in the SQL WHERE clause.

sort 2d Array re-order the first column

For example:
Array
ID | Primary | Data2
------------------
1 | N | Something 1
2 | N | Something 2
3 | Y | Something 3
I'm trying to sort it based on the Primary column, and I want the "Y" row to show first. The other columns of that row should move to the top with it.
The end result would be:
Sorted Array
ID | Primary | Data2
------------------
3 | Y | Something 3
1 | N | Something 1
2 | N | Something 2
Is there a pre-made function for that? If not, how do we do this?
It is declared like this:
Dim Array(,) As String
regards,
I like using LINQ's OrderBy and ThenBy to order collections of objects. You just pass in a selector function to use to order the collections. For example:
orderedObjs = objs.OrderByDescending(function(x) x.isPrimary).ThenBy(function(x) x.id).ToList()
This code orders a collection first by the .isPrimary boolean, then by the id. Finally, it immediately evaluates the query into a List and assigns it to some variable.
There's a similar C# question whose solution applies just as well to VB. In short, you can use an overload of Array.Sort if you first split your 2D array into separate (1D) arrays:
Dim Primary() As String
Dim Data2() As String
' ...
Array.Sort(Primary,Data2)
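This reorders Data2 according to the Y/N sort of Primary, after which you can recombine them into a 2D array. Note that the default sort is ascending, so "N" comes before "Y"; reverse both arrays afterwards (or pass a descending comparer) to get the "Y" rows first.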
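The ordering both answers aim for, primary rows first with everything else left in its original order, can be illustrated with a stable sort on a boolean key. The sketch below is in Python rather than VB, purely to show the idea. The function name `primary_first` and the list-of-rows layout are assumptions made for the example.

```python
def primary_first(rows, primary_index=1):
    """Return the rows with the "Y" rows first, keeping the original order otherwise."""
    # False sorts before True, so rows whose flag is "Y" come first.
    # sorted() is stable, so rows with the same flag keep their relative order.
    return sorted(rows, key=lambda row: row[primary_index] != "Y")


# Rows from the question: ID, Primary, Data2.
ROWS = [
    ["1", "N", "Something 1"],
    ["2", "N", "Something 2"],
    ["3", "Y", "Something 3"],
]

if __name__ == "__main__":
    for row in primary_first(ROWS):
        print(" | ".join(row))
```

Sorting whole rows avoids having to split the 2D array into parallel 1D arrays and recombine them afterwards.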
