Sort data with multiple criteria including MATCH - arrays

I'm not 100% sure this is the right place to ask, but I have to give it a try. I'm trying to sort by course and then by name (A -> Z). I have tried plenty of approaches and never get the correct format, and the answers I found on Google don't apply to this specific case.
As you can see, I made a custom sort for the course, but I still have no idea how or where to add the A -> Z sort for the names. Whatever I try is rejected as wrong. Any advice?
I have made plenty of attempts to format it the right way, but they always end in failure. An example:
Formula: =ARRAYFORMULA(SORT(DATA!A2:BY101,match(DATA!P2:P101,{"1º";"2º";"3º";"4º";"5º";"6º";"7º";"8º";"9º";"10º";"11º";"12º"},0),TRUE))

try:
=ARRAYFORMULA(SORT(DATA!A2:BY101, MATCH(DATA!P2:P101,
{"1º";"2º";"3º";"4º";"5º";"6º";"7º";"8º";"9º";"10º";"11º";"12º"}, 0), 1, 2, 1))

I've got a pipe that consists of 5 pieces, each with 5 properties:

Inlet -> front -> middle -> rear -> outlet
Each of those five properties has a value anywhere between 4 and 40. I want to find combinations where, when a single property is summed across the pipe pieces, every sum is either a full 10 or a 5 (that is, a multiple of 5). There might be hundreds of different pipe pieces, all with different properties.
So if I have all 5 pieces and their summed properties come out as 54, 51, 23, 71, 37, that is not good and not what I'm looking for.
Instead, 55, 50, 25, 70, 40 would be perfect.
My trouble is that there are so many pieces that it would be insane to do the mixing and matching manually, and new ones come up frequently.
I have manually inserted about 100 of these into SQLite already, but it should be easy to convert to Excel or another database format, so the answer can use anything, such as MySQL or Google Sheets.
I need a calculation that takes every piece into account and either results in "no match" or tells me the id of each piece required for a match; if multiple matches are available, it should list them separately.
Edit: Even just the math needed for this kind of calculation would be a lot of help here; I'm not much of a math guy myself. I guess there should be a reference piece that I use and that gets checked against every possible scenario.
If the value you want to verify is in A1, use: =ROUND(A1/5,0)*5
If the pipes may not be shorter than the given values, use =CEILING(A1,5)
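The rounding check can be combined with a brute-force search over the pieces. The following Python is a minimal sketch under assumptions, not a tuned solution: the `PIECES` catalogue, its ids and its values are invented for illustration, and each slot is assumed to have its own list of candidate pieces. Testing `total % 5 == 0` is the same check as comparing a sum with `=ROUND(A1/5,0)*5`.

```python
from itertools import product

SLOTS = ("inlet", "front", "middle", "rear", "outlet")

# Hypothetical catalogue: slot -> {piece_id: (five property values)}.
PIECES = {
    "inlet":  {1: (10, 10, 5, 14, 8), 2: (11, 10, 5, 14, 8)},
    "front":  {3: (12, 10, 5, 14, 8)},
    "middle": {4: (11, 10, 5, 14, 8)},
    "rear":   {5: (11, 10, 5, 14, 8)},
    "outlet": {6: (11, 10, 5, 14, 8)},
}


def find_matches(pieces, step=5):
    """Return (piece ids, property sums) for every combination of one piece
    per slot whose property sums are all multiples of `step`."""
    matches = []
    for combo in product(*(pieces[slot].items() for slot in SLOTS)):
        ids = tuple(piece_id for piece_id, _ in combo)
        # zip(*...) groups the first property of every piece, then the second, etc.
        sums = tuple(sum(values) for values in zip(*(props for _, props in combo)))
        if all(total % step == 0 for total in sums):
            matches.append((ids, sums))
    return matches


results = find_matches(PIECES)
print(results if results else "no match")
```

The number of combinations is the product of the candidate counts per slot, so this stays practical for a few dozen pieces per slot but would need pruning for much larger catalogues.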

Expanding a difficult formula using ARRAYFORMULA while avoiding circular references (Google Sheets)

I'm trying to set up a formula in Google Sheets that does this...
=IF(IFERROR(INDEX(Matches!$L$2:$L,MATCH($A2&(B$1-14),Matches!$H$2:$H&Matches!$M$2:$M,0)))=1,
IFERROR(OFFSET($A$1,INDEX(Matches!$I$2:$I,MATCH($A2&(B$1-14),Matches!$H$2:$H&Matches!$M$2:$M,0)),COLUMN()-1)-(1/(2*($A2)^2))),
$A2)
That is to say, IF(match was won, take the current period's rank of your defeated opponent and do math to it, else show last period's rank).
But I want to set it into an ARRAYFORMULA so that it will expand automatically. What I have is this (and it doesn't work):
=ARRAYFORMULA(IF(IFERROR(VLOOKUP($A$2:$A&($B$1:$1-14),{Matches!$H$2:$H&Matches!$M$2:$M,Matches!$L$2:$L},2,0))=1,
OFFSET($A$1,VLOOKUP($A$2:$A&($B$1:$1-14),{Matches!$H$2:$H&Matches!$M$2:$M,Matches!$I$2:$I},2,0),SEQUENCE(1,COUNTA($B$1:$1),2,1))-(1/(2*($A$2:$A)^2)),
$A$2:$A))
What it should look like is this:
https://i.stack.imgur.com/Ui3nE.png
How it actually comes out is:
https://i.stack.imgur.com/tQOl7.png
All of those errors are the same message, which is that VLOOKUP couldn't find 143997, which is just the first value pair. I've tried using VLOOKUP/MATCH, but it produces a circular reference error.
Is this possible? I'm willing to believe it's not, but I thought I should ask. Thanks for any help you can offer.
You can't use that OFFSET formula within ARRAYFORMULA. When the VLOOKUP throws an error the OFFSET is not able to handle it. That's why it won't work. You should apply your math on the output of the VLOOKUP.

Does anyone know of potential problems with st_line_substring in postGIS?

Specifically, I'm getting a result that I do not understand. It is possible that my understanding is simply wrong, but I don't think so. So I'm hoping that someone will either say "yes, that's a known problem" or "no, it is working correctly and here is why your understanding is wrong".
Here is my example.
To start I have the following geometry of lat/longs.
LINESTRING(-1.32007599 51.06707497,-1.31192207 51.09430508,-1.30926132 51.10206677,-1.30376816 51.11133597,-1.29261017 51.12981493,-1.27510071 51.15906713,-1.27057314 51.16440941,-1.26606703 51.16897072,-1.26235485 51.17439257,-1.26089573 51.17875111,-1.26044512 51.1833917,-1.25793457 51.19727033,-1.25669003 51.20141159,-1.25347137 51.20630532,-1.24845028 51.21110444,-1.23325825 51.22457158,-1.2274003 51.22821321,-1.22038364 51.23103494,-1.20326042 51.23596583,-1.1776185 51.24346193,-1.16356373 51.24968088,-1.13167763 51.26363353,-1.12247229 51.2659966,-1.11629248 51.26682901,-1.10906124 51.26728549,-1.09052181 51.26823871,-1.08522177 51.26885628,-1.07013702 51.27070895,-1.03683472 51.27350122,-1.00917578 51.27572955,-0.98243952 51.2779175,-0.9509182 51.28095094,-0.9267354 51.28305811,-0.90499878 51.28511151,-0.86051702 51.2883055,-0.83661318 51.29023789,-0.7534647 51.29708113,-0.74908733 51.29795323,-0.7400322 51.2988924,-0.71535587 51.30125366,-0.68475723 51.29863749,-0.65746307 51.30220618,-0.63246489 51.30380261,-0.60542822 51.30645873,-0.58150291 51.3103219,-0.57603121 51.31150225,-0.57062387 51.31317883,-0.54195642 51.32475227,-0.4855442 51.34771616,-0.4553318 51.36283147)
This is in a column called "geom" in my table, called "fibre_lines". When I run the following query,
select st_length(geography(geom), false) as full_length,
st_length(geography(st_line_substring(geom, 0, 1)), false) as full_length_2,
st_length(geography(st_line_substring(geom, 0, 0.5)), false) as first_half,
st_length(geography(st_line_substring(geom, 0.5, 1)), false) as second_half
from fibre_lines
where id = 10;
I get the following result...
76399.4939375278 76399.4939375278 41008.9667229201 35390.5272197668
The first two make sense to me, they are simply the length of my line assuming a spherical earth. The first is just using the obvious function while the second is using st_line_substring to get the length of the entire line. These two values agree.
But the last two have me puzzled. I am asking for the length of the first half of the line, then I'm asking for the length of the last half. My expectation was that these would be equal or nearly equal. Instead the first half is about 6km longer than the second half.
If you plot the geometry on the map you will see that the first third of the line is fairly north/south oriented and the remaining two thirds are more east/west. I wouldn't have thought that would make a difference when asking for the length on a spherical earth, but I am happy to be told that I'm wrong (so long as it is also explained why I'm wrong).
For reference the PostGIS I am using is 1.5.8. If this is a bug, upgrading to a newer version is possible, but not trivial, so I would prefer to only do that if it is necessary.
Anyone have ideas?
While Arunas' comments didn't directly answer my question, they did lead me to some research that I think identifies the problem. I'm posting it here in part to get it straight in my own mind and in part in case others are wondering.
It seems the key is the PostGIS distinction between a "geometry" and a "geography". A geometry is a 2D planar geometry that is typically in UTMs and used with a projection of the globe onto a flat surface (which projection is configurable). A geography, on the other hand, is designed to store latitude/longitude information specifically and is used to work either on a sphere or a spheroid. So the essential problem I have is twofold:
Perhaps not obvious from my original post is that I am using a geometry object to store lat/long information rather than UTMs. I cast that to a geography most of the time so that I get the correct answers, but it would be more correct if I actually stored it as a geography object. That would eliminate the need for a number of the casts in my code as well as allow PostGIS to tell me when I am doing something wrong.
While ST_Length will work with either a geometry or a geography, ST_Line_Substring only works with geometries. Hence when I ask it for the halfway point, I am asking it for the halfway point of a flat geometry. This will give me the correct answer for the latitude coordinate, but for the longitude it will have an error term that increases (for most projections) the farther I am from the equator.
I've looked into newer versions of PostGIS and they don't seem to have an ST_Line_Substring or anything similar that will give me the 50% point of a geography, so I will have to do it the "hard" way by using ST_Length to give me all my segment lengths and then adding them up and doing the math needed for my interpolation.
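The "hard" way described here could look roughly like the following outside the database. This is a hedged Python sketch, not PostGIS code: it assumes a spherical earth with an assumed mean radius, uses the haversine formula for segment lengths, and interpolates linearly in degrees inside the final segment, which is only reasonable for short segments. The function names are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius for a spherical model


def haversine_m(p, q):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def point_at_fraction(line, fraction):
    """Return the (lon, lat) point at `fraction` of the line's spherical length."""
    segments = list(zip(line, line[1:]))
    seg_lengths = [haversine_m(a, b) for a, b in segments]
    target = fraction * sum(seg_lengths)
    walked = 0.0
    for (a, b), seg in zip(segments, seg_lengths):
        if seg > 0 and walked + seg >= target:
            t = (target - walked) / seg
            # Linear interpolation in degrees: an approximation for short segments.
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        walked += seg
    return line[-1]
```

Calling `point_at_fraction(vertices, 0.5)` on the line's (lon, lat) vertices gives a split point based on spherical length rather than planar length, so the two halves should come out close to equal when measured as a geography.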
Sorry, I can't add comments, so I will provide this as an answer.
I experienced the same problem and resolved it by transforming my lat-lon geometries to UTM geometries inside the st_line_substring function call. Then I was getting sub-geometries with the proper length. Of course, I had to transform them back to lat-lon afterward.

Need to check for range in Solr if function

We are just building a Solr index for a knowledge base and I have some problems implementing boosting.
First of all: we want multiplicative boosting, not additive.
And: the more hits a document has, the more it should be boosted, but only to a certain degree.
At first we thought about a function like boost=sum(div(hits,10000),1), but that would push certain documents too much.
So we thought about something like this (besides some others, but those all work and only these give me an error):
&boost=if(hits,[0+TO+100],1)
&boost=if(hits,[101+TO+250],1.25)
&boost=if(hits,[250+TO+100000],1.5)
Error is:
org.apache.solr.search.SyntaxError: Expected identifier at pos 8 str='if(hits,[101 TO 250],1.25)'
So the obvious cause is the range in the if function. If I replace it with a single value, everything works, but that does not really help me.
So my question is: Is it not possible to combine an "if()" function with a range of values to match?
I know I could try a million different ways to solve this, but actually we would be glad to have it in some way like this, as the boost param values could be configurable for the different ranges and it's easy to get that syntax working with our framework to access Solr.
However, if there is no chance to get this running, I am of course open for alternative solutions.
Thanks a lot,
Markus
You can use bq (Boost Query) as follows:
&bq=hits:[0 TO 100]^1.0
So to clean this up:
It is not possible to use a range within an if function.
But we found a way with the map function that does pretty much what we wanted to achieve with that if-range attempt:
&boost=map(hits, 0, 100, 1, map(hits,101, 250, 1.25, map(hits,250, 10000, 1.5)))
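Since the ranges and boost values were meant to be configurable, the nested expression could be generated rather than written by hand. The Python below is only a sketch under assumptions: `build_boost` and the tier list are hypothetical names, the upper bound 100000 is taken from the question, and the innermost call uses the five-argument form map(x,min,max,target,default) so that values outside every tier fall back to a neutral boost of 1.

```python
def build_boost(field, tiers, default=1):
    """Nest Solr map() calls so each (low, high, boost) tier is tried in order.

    Values that fall outside every tier get `default`.
    """
    expr = str(default)
    # Build from the innermost (last) tier outwards.
    for low, high, boost in reversed(tiers):
        expr = f"map({field},{low},{high},{boost},{expr})"
    return expr


# Hypothetical tier configuration: (min hits, max hits, boost factor).
TIERS = [(0, 100, 1), (101, 250, 1.25), (251, 100000, 1.5)]

print("&boost=" + build_boost("hits", TIERS))
```

Each map() sits in the default position of the one before it, so a document is tested against the tiers in order and receives the first boost whose range contains its hits value.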

Parsing a domain name

I am parsing the domain name out of a string by using strchr() to find the last . (dot) and counting back until the dot before that (if any); then I know I have my domain.
This is a rather nasty piece of code and I was wondering if anyone has a better way.
The possible strings I might get are:
domain.com
something.domain.com
some.some.domain.com
You get the idea. I need to extract the "domain.com" part.
Before you tell me to go search in google, I already did. No answer, hence I am asking here.
Thank you for your help
EDIT:
The string I have contains a full hostname. This usually is in the form of whatever.domain.com but can also take other forms and as someone mentioned it can also have whatever.domain.co.uk. Either way, I need to parse the domain part of the hostname: domain.com or domain.co.uk
Did you mean strrchr()?
I would probably approach this by doing:
1. strrchr to get the last dot in the string, save a pointer here, replace the dot with a NUL ('\0').
2. strrchr again to get the next to last dot in the string. The character after this is the start of the name you are looking for (domain.com).
3. Using the pointer you saved in #1, put the dot back where you set it to NUL.
Beware that names can sometimes end with a dot, if this is a valid part of your input set, you'll need to account for it.
Edit: To handle the flexibility you need in terms of example.co.uk and others, the function described above would take an additional parameter telling it how many components to extract from the end of the name.
You're on your own for figuring out how to decide how many components to extract -- as Philip Potter mentions in a comment below, this is a Hard Problem.
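To make the "number of components" idea concrete, here is a small sketch of the logic in Python rather than C. It is an illustration only: `last_labels` is a made-up name, the caller is assumed to already know how many labels to keep, and trailing dots are simply dropped before splitting.

```python
def last_labels(hostname, count=2):
    """Return the last `count` dot-separated labels of a hostname."""
    # Drop trailing dots ("domain.com.") before splitting into labels.
    labels = hostname.rstrip(".").split(".")
    return ".".join(labels[-count:])


print(last_labels("some.some.domain.com"))      # two labels kept
print(last_labels("whatever.domain.co.uk", 3))  # three labels kept
```

If the hostname has fewer labels than requested, the whole name comes back unchanged. Deciding the right `count` for a given hostname remains the hard part.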
This isn't a reply to the question itself, but an idea for an alternate approach:
In the context of already very nasty code, I'd argue that a good way to make it less nasty, and to get a good facility for parsing domain names and the like, is to use PCRE or a similar library for regular expressions. That will definitely help you out if you also want to validate that the TLD exists, for instance.
It may take some effort to learn initially, but if you need to make changes to existing matching/parsing code, or create more code for string matching - I'd argue that a regex-lib may simplify this a lot in the long term. Especially for more advanced matching.
Another library I recall which supports regex, is glib.
Not sure what flavor of C, but you probably want to tokenize the domain using "." as the separator.
Try this: http://www.metalshell.com/source_code/31/String_Tokenizer.html
As for the domain name, not sure what your end goal is, but domains can have lots and lots of nodes, you could have a domain name foo.baz.biz.boz.bar.co.uk.
If you just want the last 2 nodes, then use above and get the last two tokens.
