In GridDB, can an index/rowkey/search be inverted? - database

Using GridDB, I sometimes need to read data starting with the first row, and sometimes starting with the last row. Currently I have two containers with the same data but inserted in the opposite order of each other. While functional, this does seem rather inefficient.
So the question is: are there better ways to do it? Can the rowkey, or an index assigned to a container be inverted with a function call? Or is there a way to adjust the search accordingly?

You're looking for the TQL ASC or DESC operator which will invert the order data is returned.
If you actually want an inverted index a timestamp I would add a field to your schema and when inserting data, insert (LLONG_MAX - timestamp)

Related

Vlookup within results of a multiple-match lookup

I am trying to find a way to return values from a table like the bottom table:
The tables are provided externally and my understanding is that I can't run filters or sorts as the table is full of extra irrelevant data that would not sort properly across all columns.
I'm approaching this twofold:
firstly, I wanted to return the row information for any entry in the table that has a CPT matching the lookup value.
Second, (where I'm stuck)--when the lookup returns a DESC that corresponds to a matched CPT, the goal would be to also pull in any Code A/Code B entries that correspond to that DESC value.
I found an existing formula that worked for the first part, shown below. (apologies for formatting--SO keeps flagging my draft as having unformatted code).
IFERROR(INDEX($B$3:$B$3000,SMALL(IF(I$2=$A$3:$A$3000,ROW($A$3:$A$3000)- MIN(ROW($A$3:$A$3000))+1,""), ROW()-1)),"")
Currently, the aforementioned formula does return the DESC entries for any matching CPTs and I use vlookup to pull the rest of the relevant columns for a corresponding row.
I'm coming up short in cases where there are multiple Code A/Code B entries for a given DESC, as the lookup only returns information for the row containing the matching CPT and not any relevant codes contained in subsequent rows.
I was thinking I'd have to use something similar to the existing lookup formula to identify a matched row and then display any subsequent rows containing Code entries until the next row with a non-blank CPT entry.
Unfortunately, I don't know if that's actually the best way to approach this. Any resources/suggestions are greatly appreciated.

Calculated field over list of values

Given this table of data:
I'd like to produce this pivot table:
I have an inkling this can be done with the calculated field, and SUMIF, but am not able to get it to work. I think the main blocker is that I'm not able to find good documentation for what I can reference inside of a calculated field formula. My best attempt was =SUMIF(color, "RED")/SUM(), but that produced zeros.
Example table at https://docs.google.com/spreadsheets/d/16htOLbwf47Neo68iFlm9OvFVS_u2Jlc-2thhdUQwrpU/edit?usp=sharing
Any guidance appreciated!
={QUERY(A1:B25,"select A,count(A)/"&COUNT(A:A)&" where B='RED' group by A label count(A)/"&COUNT(A:A)&" 'PCT RED'");{"Grand Total",COUNTIFS(A:A,">=0",B:B,"RED")/COUNT(A:A)}}
Function References
Query
COUNT
COUNTIFS
I think my concern here would be that with a normal pivot table it's robust against data moving around. This seems to break that by referencing specific columns
Method pivot table you must show all color
I think my concern here would be that with a normal pivot table it's robust against data moving around. This seems to break that by referencing specific columns
to "set it free" you can do:
={QUERY({A:B},
"select Col1,count(Col1)/"&COUNT(A:A)&"
where Col2='RED'
group by Col1
label count(Col1)/"&COUNT(A:A)&"'PCT RED'");
{"Grand Total", COUNTIFS(A:A, ">=0", B:B, "RED")/COUNT(A:A)}}

How to force table select to go over blocks

How can I make Sybase's database engine return an unsorted list of records in non-numeric order?
~~~
I have an issue where I need to reproduce an error in the application where I select from a table where the ID is generated in sequence, but the ID is not the last one in the selection.
Let me explain.
ID STATUS
_____________
1234 C
1235 C
1236 O
Above is 3 IDs. I had code where these would be the results of a
select #p_id = ID from table where (conditions).
However, there wasn't a clause to check for status = 'O' (open). Remember Sybase saves the last returned record into a variable.
~~~~~
I'm being asked to give the testing team something that will make the results not work. If Sybase selects the above in an unordered list, it could appear in ascending order, or, if the database engine needs to change blocks of stored data or something technical magic stuff, the order could be messed up. The original error was when the procedure would return say 1234 instead of 1236.
Is there a way that I can have a 100% guarantee that Sybase will search over a block of data and have to double back, effectively 'breaking' the ascending search, and returning not the last record, but any other one? (all records except the maximum will end up erroring, because they are all 'Closed')
I want some sort of magical SQL code that will make sure things don't search the table in exactly numeric order. Ideally I'd like to not have to change the procedure, as the test team want to see the exact same procedure breaking (as easy as plonking a order by id desc would fudge the results).
If you don't specify an order, there is no way to guarantee the return order of the results. It will be however the index is built - and can depend on the order of insertion, the type of index, and the content of index keys.
It's generally a bad idea to do those sorts of singleton SELECTs. You should always specify a specific record with the WHERE clause, or use a cursor, or TOPn or similar. The problem comes when someone tries to understand your code, because some databases when they see multiple hits take the first value, some take the last value, some take a random value (they call that "implementation-defined"), and some throw an error.
Is this by any chance related to 1156837? :)

What makes a SQL statement sargable?

By definition (at least from what I've seen) sargable means that a query is capable of having the query engine optimize the execution plan that the query uses. I've tried looking up the answers, but there doesn't seem to be a lot on the subject matter. So the question is, what does or doesn't make an SQL query sargable? Any documentation would be greatly appreciated.
For reference: Sargable
The most common thing that will make a query non-sargable is to include a field inside a function in the where clause:
SELECT ... FROM ...
WHERE Year(myDate) = 2008
The SQL optimizer can't use an index on myDate, even if one exists. It will literally have to evaluate this function for every row of the table. Much better to use:
WHERE myDate >= '01-01-2008' AND myDate < '01-01-2009'
Some other examples:
Bad: Select ... WHERE isNull(FullName,'Ed Jones') = 'Ed Jones'
Fixed: Select ... WHERE ((FullName = 'Ed Jones') OR (FullName IS NULL))
Bad: Select ... WHERE SUBSTRING(DealerName,4) = 'Ford'
Fixed: Select ... WHERE DealerName Like 'Ford%'
Bad: Select ... WHERE DateDiff(mm,OrderDate,GetDate()) >= 30
Fixed: Select ... WHERE OrderDate < DateAdd(mm,-30,GetDate())
Don't do this:
WHERE Field LIKE '%blah%'
That causes a table/index scan, because the LIKE value begins with a wildcard character.
Don't do this:
WHERE FUNCTION(Field) = 'BLAH'
That causes a table/index scan.
The database server will have to evaluate FUNCTION() against every row in the table and then compare it to 'BLAH'.
If possible, do it in reverse:
WHERE Field = INVERSE_FUNCTION('BLAH')
This will run INVERSE_FUNCTION() against the parameter once and will still allow use of the index.
In this answer I assume the database has sufficient covering indexes. There are enough questions about this topic.
A lot of the times the sargability of a query is determined by the tipping point of the related indexes. The tipping point defines the difference between seeking and scanning an index while joining one table or result set onto another. One seek is of course much faster than scanning a whole table, but when you have to seek a lot of rows, a scan could make more sense.
So among other things a SQL statement is more sargable when the optimizer expects the number of resulting rows of one table to be less than the tipping point of a possible index on the next table.
You can find a detailed post and example here.
For an operation to be considered sargable, it is not sufficient for it to just be able to use an existing index. In the example above, adding a function call against an indexed column in the where clause, would still most likely take some advantage of the defined index. It will "scan" aka retrieve all values from that column (index) and then eliminate the ones that do not match to the filter value provided. It is still not efficient enough for tables with high number of rows.
What really defines sargability is the query ability to traverse the b-tree index using the binary search method that relies on half-set elimination for the sorted items array. In SQL, it would be displayed on the execution plan as a "index seek".

Ordering numbers that are stored as strings in the database

I have a bunch of records in several tables in a database that have a "process number" field, that's basically a number, but I have to store it as a string both because of some legacy data that has stuff like "89a" as a number and some numbering system that requires that process numbers be represented as number/year.
The problem arises when I try to order the processes by number. I get stuff like:
1
10
11
12
And the other problem is when I need to add a new process. The new process' number should be the biggest existing number incremented by one, and for that I would need a way to order the existing records by number.
Any suggestions?
Maybe this will help.
Essentially:
SELECT process_order FROM your_table ORDER BY process_order + 0 ASC
Can you store the numbers as zero padded values? That is, 01, 10, 11, 12?
I would suggest to create a new numeric field used only for ordering and update it from a trigger.
Can you split the data into two fields?
Store the 'process number' as an int and the 'process subtype' as a string.
That way:
you can easily get the MAX processNumber - and increment it when you need to generate a
new number
you can ORDER BY processNumber ASC,
processSubtype ASC - to get the
correct order, even if multiple records have the same base number with different years/letters appended
when you need the 'full' number you
can just concatenate the two fields
Would that do what you need?
Given that your process numbers don't seem to follow any fixed patterns (from your question and comments), can you construct/maintain a process number table that has two fields:
create table process_ordering ( processNumber varchar(N), processOrder int )
Then select all the process numbers from your tables and insert into the process number table. Set the ordering however you want based on the (varying) process number formats. Join on this table, order by processOrder and select all fields from the other table. Index this table on processNumber to make the join fast.
select my_processes.*
from my_processes
inner join process_ordering on my_process.processNumber = process_ordering.processNumber
order by process_ordering.processOrder
It seems to me that you have two tasks here.
• Convert the strings to numbers by legacy format/strip off the junk• Order the numbers
If you have a practical way of introducing string-parsing regular expressions into your process (and your issue has enough volume to be worth the effort), then I'd
• Create a reference table such as
CREATE TABLE tblLegacyFormatRegularExpressionMaster(
LegacyFormatId int,
LegacyFormatName varchar(50),
RegularExpression varchar(max)
)
• Then, with a way of invoking the regular expressions, such as the CLR integration in SQL Server 2005 and above (the .NET Common Language Runtime integration to allow calls to compiled .NET methods from within SQL Server as ordinary (Microsoft extended) T-SQL, then you should be able to solve your problem.
• See
http://www.codeproject.com/KB/string/SqlRegEx.aspx
I apologize if this is way too much overhead for your problem at hand.
Suggestion:
• Make your column a fixed width text (i.e. CHAR rather than VARCHAR).
• Pad the existing values with enough leading zeros to fill each column and a trailing space(s) where the values do not end in 'a' (or whatever).
• Add a CHECK constraint (or equivalent) to ensure new values conform to the pattern e.g. something like
CHECK (process_number LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][ab ]')
• In your insert/update stored procedures (or equivalent), pad any incoming values to fit the pattern.
• Remove the leading/trailing zeros/spaces as appropriate when displaying the values to humans.
Another advantage of this approach is that the incoming values '1', '01', '001', etc would all be considered to be the same value and could be covered by a simple unique constraint in the DBMS.
BTW I like the idea of splitting the trailing 'a' (or whatever) into a separate column, however I got the impression the data element in question is an identifier and therefore would not be appropriate to split it.
You need to cast your field as you're selecting. I'm basing this syntax on MySQL - but the idea's the same:
select * from table order by cast(field AS UNSIGNED);
Of course UNSIGNED could be SIGNED if required.

Resources