Azure Search: How does Orderby treat null integer values? - azure-cognitive-search

I have an Azure Search Index that I'm hitting from an AngularJS app and I've run into a bit of a problem. One of my tables has a property that defines a "Sort Order" for the rows in the table, which is going to be used in the app to determine what order to display the items in the table to the user. Sort Order is an integer, and the goal here is when an admin user defines the "Sort Order" as 1, that item will appear first, 2 will appear second, etc.
My preferred solution would be to just sort the results from my query by ascending and spit that out to the user, but I've run into a problem with my null values, since Sort Order isn't a required field. All my models are written in C#, so null entries are being set to 0, and thus the entries without a sort order defined would be displayed first if ordered by ascending, which is the exact opposite of what I want.
There are many ways I could deal with this, but the easiest would be to make the SortOrder property a nullable int. The only problem is that I'm not sure how Azure Search treats null values in its OrderBy statement. My initial thought was that, when sorted ascending, it would list all the numerical values in ascending order first, then push the null values to the end. If that's the case, then that's perfect, but I'd like to confirm that it is indeed the case.
So, in short, my question is: Would Azure Search treat null values as greater than or less than a defined integer value? Put another way, if I OrderBy ascending integers, would the null values appear at the top or the bottom?

Azure Search places documents whose $orderby field is null towards the end of the results. This is true regardless of whether you order by "descending" or "ascending".
If several documents have a null $orderby field, all of them are placed near the end, but you should treat the ordering among those null-valued documents as undefined.

I find the above answer wrong. In my experience, if the field contains null values, they appear first when the sort is asc and last when the sort is desc.
You can refer to the Azure documentation on $orderby for the same: search-query-odata-orderby
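
Whichever way the service orders nulls, a client can enforce "unset sort order goes last" on its own once the property is nullable. Below is a minimal Python sketch of that idea, assuming each search result arrives as a dict with a `SortOrder` key that may be `None`; the function name and the sample data are hypothetical.

```python
def sort_nulls_last(items, key="SortOrder"):
    """Return items in ascending order of items[key], with None values last."""
    # The tuple sorts on "is the value missing?" first (False < True),
    # then on the value itself; 0 is only a placeholder for missing values.
    return sorted(items, key=lambda d: (d.get(key) is None, d.get(key) or 0))

results = [
    {"name": "c", "SortOrder": None},
    {"name": "a", "SortOrder": 2},
    {"name": "b", "SortOrder": 1},
]
ordered = [d["name"] for d in sort_nulls_last(results)]
```

Because Python's sort is stable, items without a sort order keep the relative order in which the service returned them.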


How to modify array under specific JSONB key in PostgreSQL?

We're storing various heterogeneous data in a JSONB column called ext, and under some keys we have arrays of values. I know how to replace the whole key (||). If I want to add one or two values, I still need to extract the original values (that would be ext->'key2' in the example below), and in some cases there may be too many of them.
I realize this is a trivial problem in the relational world and that PG still needs to overwrite the whole row anyway, but at least I wouldn't need to pull the unchanged part of the data from the DB to the application and push it back.
I can construct the final value of the array in the select, but I don't know how to merge this into the final value of ext so it is usable in UPDATE statement:
select ext, -- whole JSONB
ext->'key2', -- JSONB array
ARRAY(select jsonb_array_elements_text(ext->'key2')) || array['asdf'], -- array + concat
ext || '{"key2":["new", "value"]}' -- JSONB with whole "key2" key replaced (not what I want)
from (select '{"key1": "val1", "key2": ["val2-1", "val2-2"]}'::jsonb ext) t
So the question: How to write such a modification into the UPDATE statement?
Example uses jsonb_*_text function, some values are non-textual, e.g. numbers, that would need non _text function, but I know what type it is when I construct the query, no problem here.
We also need to remove the values from the arrays as well, in which case if the array is completely empty we would like to remove the key from the JSONB altogether.
Currently we achieve this with the following expression in the UPDATE statement:
coalesce(ext, '{}')::jsonb - <array of items to delete> || <jsonb with additions>
(The <parts> are symbolic here; we use a single JDBC parameter for each value.) If the final value of the array is empty, the key for that value goes into the first array; otherwise the final value appears in the JSONB after the || operator.
To be clear:
I know the path to the JSONB value I want to change - it's actually always a single key on the top level.
I know whether that key stores a single value (no problem for those) or an array (that's where I don't have a satisfying solution yet), because the definition of each key is known and stored separately.
I need to add and/or remove multiple values I provide, but I don't know what is in the array at that moment - that's the whole point, so that application doesn't need to read it.
I may also want to replace the whole array under the key, but this is trivial case and I know how to do this.
Finally, if removal results in an empty array, we'd like to get rid of the key as well.
I could probably write a function doing it all if necessary but I've not committed to that yet.
Obviously, restructuring the data out of that JSONB column is not an option. Eventually I want to make it more flexible and data with these characteristics would go to some other table, but at this moment we're not able to do it with our application.
You can use jsonb_set to modify an array that is stored under some key.
To update a value in the array, specify its zero-based index as the last path element, as in the example below.
To add a new element at the start or the end, specify a negative or positive index that lies beyond the array's bounds.
UPDATE <table>
SET ext = jsonb_set(ext, '{key2, <index>}', '5')
WHERE <condition>
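
The requirements listed in the question (add values, remove values, drop the key once its array is empty) can be pinned down with a small reference model, for example to check a future SQL function against. The sketch below is plain Python, not PostgreSQL code; `modify_array_key` and its parameters are hypothetical names, and removal is applied before addition to mirror the order of the `-` and `||` operators in the question's expression.

```python
def modify_array_key(ext, key, add=(), remove=()):
    """Return a copy of ext with values added to / removed from the array at key."""
    result = dict(ext or {})            # mirrors coalesce(ext, '{}')
    removals = list(remove)
    # Drop the unwanted values first, then append the new ones.
    current = [v for v in result.get(key, []) if v not in removals]
    current.extend(add)
    if current:
        result[key] = current
    else:
        result.pop(key, None)           # an empty array removes the key entirely
    return result

ext = {"key1": "val1", "key2": ["val2-1", "val2-2"]}
added = modify_array_key(ext, "key2", add=["asdf"])
emptied = modify_array_key(ext, "key2", remove=["val2-1", "val2-2"])
```

With the sample document from the question, `added` has "asdf" appended to key2, and `emptied` no longer contains key2 at all.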

How to use index match (array formula) to return corresponding values from a drop down list?

Excel Screenshot
Excel Screenshot with Formulas
I have attached photos to show an idea of what I am trying to do. Basically, I have a very large list of features that are shared between certain groups. I want to use a drop down list of the features, and then have a formula that will output the group that has the lowest cost of that feature along with the cost of that feature within the group.
(Also you will see that I purposefully ignore zero values. I do this because not every group has a certain feature and those cells default to zero).
I figured out how to get the cost of the feature to output, but I'm having trouble getting it to output the group name. I am assuming there will be an array formula to do this, but I am just starting to learn those and I'm having trouble with this one.
Well you could always use the same approach you used to pull in the value, by pulling in the index of the column heading that matches the computed min, and using an offset function to match on the right row:
=+INDEX($B$1:$D$1,MATCH($B$10,OFFSET($B$1:$D$1,MATCH($A$7,$A$2:$A$4,0),0),0))
The thing is, I'm not sure how you would want to handle ties: if two vendors had the same price, this would match the first one in the list.
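
The lookup logic, including the tie behaviour, can be sketched outside Excel. The Python below is only an illustration under assumed data: the feature names, group names, and costs are made up, since the real sheet is only visible in the screenshots, and `cheapest_group` is a hypothetical helper.

```python
def cheapest_group(costs, feature):
    """costs maps feature -> {group: cost}; a cost of 0 means the group lacks the feature."""
    offered = [(cost, group) for group, cost in costs[feature].items() if cost != 0]
    if not offered:
        return None                     # no group offers this feature
    # min() compares on the cost only and returns the first minimal entry,
    # so on a tie the leftmost group wins, as MATCH does in the formula.
    cost, group = min(offered, key=lambda pair: pair[0])
    return group, cost

costs = {
    "Feature A": {"Group 1": 5, "Group 2": 0, "Group 3": 3},
    "Feature B": {"Group 1": 4, "Group 2": 4, "Group 3": 0},
}
best_a = cheapest_group(costs, "Feature A")
best_b = cheapest_group(costs, "Feature B")
```

For the sample data, "Feature A" resolves to Group 3 at a cost of 3, and the tie on "Feature B" resolves to Group 1.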

does postgresql array type preserve order of array?

I read over the docs for PostgreSQL v 9.3 arrays (http://www.postgresql.org/docs/9.3/static/arrays.html), but I don't see the question of ordering covered. Can someone confirm that Postgres preserves the insertion order/original order of an array when it's inserted into an array column? This seems to be the case but I would like absolute confirmation.
Thank you.
The documentation is entirely clear that arrays are useful in scenarios where order is important, inasmuch as it explicitly documents querying against specific positions within an array. If those positions were not reliable, these queries would have no meaning. (Using the word "array" is also clear on this point, being as it is a term of art: an array is an ordered datatype by its nature; an unordered collection allowed to contain duplicates would be a bag, not an array, just as an unordered collection in which duplicates were not allowed would be a set.)
See the examples given in section 8.1.4.3, of "pay by quarter", with index position within the array indicating which quarter is being queried against.
I cannot find it in the documentation, but I'm pretty sure: yes, order is preserved, and [2,4,5] is different from [5,2,4].
If I were wrong, indexes could not work.

Generate unique ID from string

I am trying to take a text string and create a unique numerical value from it and I am not having any luck.
For example, I have user names (first and last) and birthdate. I have tried taking these values and converting them to varbinary, which does give me a numerical value from the data, but it isn't unique. Out of ~700 records, I will get at least 100 numerical values that are duplicated but the text of first name, last name, and birthdate that was used to generate the number is different.
Here is some code I have been trying:
SELECT CONVERT(VARCHAR(300), CONVERT(BIGINT,(CONVERT(VARBINARY, SE.FirstName) + CONVERT(VARBINARY, SE.BirthDate) ))) FROM ELIGIBILITY SE
If I use that code and convert the following data, the result is 3530884780910457344. So the same number is generated from this unique data:
David 12/03/1952
Janice 12/23/1952
Michael 03/24/1952
Mark 12/23/1952
I am looking for some way, the simpler the better, to take these values and generate a unique numerical value from that data. And the reason why I need to use these values as input is because I am trying to avoid creating duplicates in the future as well as be able to predict the numerical value based on the formula. This is why NewID() won't work for me.
How about simply:
SELECT CHECKSUM(name, BirthDate) FROM dbo.ELIGIBILITY;
Of course, since there is still a chance of collisions, maybe you should better define what you are actually trying to do. You've stated some reasons why e.g. NEWID() won't work, but I still don't follow the underlying purpose of this unique number.
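
One reading of the duplicates in the question, offered as an assumption rather than a diagnosis: if BirthDate is stored as NVARCHAR and the BIGINT conversion keeps only the rightmost 8 bytes of the concatenated binary, every row ending in "1952" collapses to the same number. The Python sketch below reproduces that value and then shows a deterministic alternative built on a cryptographic hash. `stable_id` and the field separator are hypothetical choices, and truncating to 64 bits makes collisions very unlikely, not impossible.

```python
import hashlib

# "1952" encoded as UTF-16LE (the NVARCHAR byte layout), read as one
# big-endian 8-byte integer: the last four characters alone decide the value.
truncated = int.from_bytes("1952".encode("utf-16-le"), "big")

def stable_id(*fields):
    """Derive a repeatable 64-bit integer from the given text fields."""
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    raw = "\x1f".join(fields).encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    # Keep the first 8 bytes; the same input always yields the same number.
    return int.from_bytes(digest[:8], "big")

rows = [("David", "12/03/1952"), ("Janice", "12/23/1952"),
        ("Michael", "03/24/1952"), ("Mark", "12/23/1952")]
ids = [stable_id(*row) for row in rows]
```

Here `truncated` equals 3530884780910457344, the duplicate value from the question, while the hash-based `ids` are predictable from the input and differ for the four sample rows.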

The C programming language 2. ed. question

I'm reading the well-known book "The C Programming Language, 2nd edition" and there is one exercise that I'm stuck on. I can't figure out what exactly needs to be done, so I would like someone to explain it to me.
It's Exercise 5-17:
Add a field-searching capability, so sorting may be done on fields within lines, each field sorted according to an independent set of options.
What does the input program expect from the command line; what does it mean by "independent set of options"?
Study the POSIX sort utility, ignoring the legacy options. Or study the GNU sort program; it has even more options than POSIX sort does.
You need to decide between fixed-width fields as suggested by Neil Butterworth in his answer and variable-width fields. You need to decide on what character separates variable-width fields. You need to decide on which sorting modes to support for each field (string, case-folded string, phone-book string, integer, floating point, date, etc) as well as sort direction (forward/reverse or ascending/descending).
The 'independent options' means that you can have different sort criteria for different fields. That is, you can arrange for field 1 to be sorted in ascending string order, field 3 to be sorted in descending integer order, and field 9 to be sorted in ascending date order.
Note that when sorting, the primary criterion is the first key field specified. When two rows are compared, if there is a difference between the first key field in the two rows, then the subsequent key fields are never considered. When two rows are the same in the first key field, then the criterion for the second key field determines the relative order; then, if the second key fields are the same, the third key field is consulted, and so on. If there are no more key fields specified, then the usual default sort criterion is "the whole line of input in ascending string order". A stable sort preserves the relative order of two rows in the original data that are the same when compared using the key field criteria (instead of using the default, whole-line comparison).
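
The comparison rule described above can be sketched as a comparator that walks the key fields in order and falls back to the whole line. This is a Python illustration of the logic only, not a solution to the C exercise; it assumes whitespace-separated fields and a hypothetical key description of (0-based field index, kind, reverse flag).

```python
from functools import cmp_to_key

def make_comparator(keys):
    """keys: list of (field_index, kind, reverse); kind is 'str' or 'int'."""
    def compare(a, b):
        fa, fb = a.split(), b.split()
        for index, kind, reverse in keys:
            x, y = fa[index], fb[index]
            if kind == "int":
                x, y = int(x), int(y)
            if x != y:
                # The first differing key field decides; later keys are ignored.
                result = -1 if x < y else 1
                return -result if reverse else result
        # No key field differs: whole line in ascending string order.
        return (a > b) - (a < b)
    return compare

lines = ["bob 30 x", "alice 30 y", "carol 25 z", "alice 41 w"]
# Field 1 as a descending integer, then field 0 as an ascending string.
keys = [(1, "int", True), (0, "str", False)]
ordered = sorted(lines, key=cmp_to_key(make_comparator(keys)))
```

Each key carries its own type and direction, which is what "an independent set of options" per field amounts to.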
It's referring to the ability to specify subfields in each row to sort by. For example:
sort -f1:4a -f20:28d somefile.txt
would sort the field beginning at character position 1 and extending to position 4 ascending, and within that sort the field beginning at position 20 and extending to 28 descending.
Of course, there are lots of other ways to specify fields, sort order etc. Designing the command line switches is one of the points of the exercise, IMHO.
