Oracle IN condition without sort - database

Is there a way to get the results without ordering in Oracle? When I execute the following query, the results seem to be sorted automatically by the ID field.
SELECT ID FROM USER WHERE ID IN (5004, 5003, 5005, 5002, 5008);
Actual results    Expected results
--------------    ----------------
5002              5004
5003              5003
5004              5005
5005              5002
5008              5008
Many thanks if you have a solution for this.

SELECT statements return the rows of their result sets in an unpredictable order unless you give an ORDER BY clause.
Certain DBMS products give the illusion that their result sets are in a predictable order. But if you rely on that you're bound to be disappointed.

This is one way I've seen in the past using INSTR:
SELECT *
FROM YourTable
WHERE ID IN (5004, 5003, 5005, 5002, 5008)
ORDER BY INSTR(',5004,5003,5005,5002,5008,', ',' || id || ',')  -- the surrounding commas prevent partial matches (e.g. 500 inside 5004)
I've also seen use of CASE like this:
ORDER BY
CASE ID
WHEN 5004 THEN 1
WHEN 5003 THEN 2
WHEN 5005 THEN 3
WHEN 5002 THEN 4
WHEN 5008 THEN 5
END
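For longer lists, the CASE expression can be generated rather than typed by hand. Here is a minimal sketch of that idea; it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the `users` table and the `ordered_in_query` helper are names made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Oracle; the table name is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO users (id) VALUES (?)",
                 [(i,) for i in (5002, 5003, 5004, 5005, 5008)])

wanted = [5004, 5003, 5005, 5002, 5008]  # the desired output order

def ordered_in_query(ids):
    """Build an IN list plus a CASE expression mapping each ID to its position."""
    placeholders = ", ".join("?" for _ in ids)
    whens = " ".join(f"WHEN ? THEN {pos}" for pos, _ in enumerate(ids, start=1))
    sql = (f"SELECT id FROM users WHERE id IN ({placeholders}) "
           f"ORDER BY CASE id {whens} END")
    # Parameters are bound positionally: the IN list first, then the WHEN values.
    return sql, list(ids) + list(ids)

sql, params = ordered_in_query(wanted)
case_order = [row[0] for row in conn.execute(sql, params)]
print(case_order)
```

The rows come back in the order of `wanted` only because of the ORDER BY; the IN list on its own still guarantees nothing.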

If you want to keep the order of your IN list, you can do something like this:
create type user_va as varray(1000) of number;
/
Type created.
with users as (select /*+ cardinality(a, 10) */ rownum r, a.column_value user_id
               from table(user_va(11, 0, 19, 5)) a)
select d.user_id, d.username
from dba_users d
inner join users u
on u.user_id = d.user_id
order by u.r
/
   USER_ID USERNAME
---------- ------------------------------
        11 OUTLN
         0 SYS
        19 DIP
         5 SYSTEM
That is, we put the elements into a varray and assign a rownum before joining the set. We can then order by r to preserve the order of the IN list. The cardinality hint just tells the optimizer roughly how many rows are in the array (it doesn't have to be exact, just in the ballpark; without it, the optimizer will assume about 8k rows and may prefer a full scan over an index approach).
If you don't have privileges to create a type and this is just an ad hoc task, there are a few public ones:
select owner, type_name, upper_bound max_elements, length max_size, elem_type_name
from all_Coll_types
where coll_type = 'VARYING ARRAY'
and elem_type_name in ('INTEGER', 'NUMBER');
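The underlying idea (pair each ID with its position, join, then order by the position) is not tied to varrays. Below is a rough sketch of the same pattern using Python's sqlite3 module, with a VALUES list in place of the varray. The `users` table and the extra row with ID 21 are assumptions for the demo, not Oracle's dba_users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(0, "SYS"), (5, "SYSTEM"), (11, "OUTLN"), (19, "DIP"), (21, "OTHER")])

wanted = [11, 0, 19, 5]

# Pair each ID with its 1-based position; this plays the role of rownum over the varray.
values = ", ".join("(?, ?)" for _ in wanted)
params = [x for pos, uid in enumerate(wanted, start=1) for x in (pos, uid)]

sql = (f"WITH wanted(pos, id) AS (VALUES {values}) "
       "SELECT u.id, u.name FROM users u "
       "JOIN wanted w ON w.id = u.id "
       "ORDER BY w.pos")
joined = conn.execute(sql, params).fetchall()
print(joined)
```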

There is no guarantee of sort order without an ORDER BY clause.

If your question is about why the ordering occurs then the answer is: Do you have an index or primary key defined on the column ID? If yes the database responds to your query with an index scan. That is: it looks up the IDs in the IN clause not in the table itself but in the index defined on your ID-column. Within the index the values are ordered.
To get more information about the execution of your query try Oracle's explain plan feature.
To get the values in a certain order you have to add an ORDER BY clause. One way of doing this would be
select ID
from USER
where ID in (5004, 5003, 5005, 5002, 5008)
order by
case ID
when 5004 then 1
when 5003 then 2
...
end;
A more general way would be to add an ORDERING column to your table:
select ID
from USER
where ID in (5004, 5003, 5005, 5002, 5008)
order by
ORDERING;

Another solution that I found here.
select ID
from USER
where ID in (5004, 5003, 5005, 5002, 5008)
order by decode(ID, 5004, 1, 5003, 2, 5005, 3, 5002, 4, 5008, 5);
The general form is: order by decode(COLUMN_NAME, VALUE, POSITION, ...)
Note: repeat the VALUE, POSITION pair for each value, in the order you want.
Thanks for all the responses! I really appreciate it.

Related

SQL Server : Row Number without ordering

I want to create a Select statement that ranks the column as is without ordering.
Currently, the table is in the following order:
ITEM_Description1
ITEM_Description2
ITEM_StockingType
ITEM_RevisionNumber
I do not want the results to be numerical in any way, nor depend on the VariableID numbers, but with ROW_Number(), I have to choose something. Does anyone know how I can have the results look like this?
Row| VariableName
---------------------
1 | ITEM_Description1
2 | ITEM_Description2
3 | ITEM_StockingType
4 | ITEM_RevisionNumber
My code for an example is shown below.
SELECT
VariableName,
ROW_NUMBER() OVER (ORDER BY VariableID) AS RowNumber
FROM
SeanVault.dbo.TempVarIDs
Using ORDER BY (SELECT NULL) will give you the results you're looking for.
SELECT
VariableName,
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM
SeanVault.dbo.TempVarIDs
Your problem seems to be with this sentence:
Currently, the table is in the following order:
No, your table is NOT implicitly ordered!!
Although it might look like this...
The only way to enforce the result set's sort order is an ORDER BY clause on the outermost SELECT.
If you want to maintain the sort order of your inserts you can use
a column like ID INT IDENTITY (which will automatically increase a sequence counter)
Using GETDATE() on insert will not solve this, as multiple row inserts might get the same DateTime value.
You do not have to show this in your output of course...
Your table has no inherent order. Even if you get that order 100 times in a row, there is no guarantee it will be that order the 101st time.
You can add an identity column to the table.
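A small sketch of the identity-column approach, using Python's sqlite3 module as a stand-in for SQL Server (AUTOINCREMENT plays the role of IDENTITY here; the ID column is an assumption, since the original table definition isn't shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# AUTOINCREMENT stands in for SQL Server's IDENTITY column.
conn.execute("CREATE TABLE TempVarIDs ("
             "ID INTEGER PRIMARY KEY AUTOINCREMENT, VariableName TEXT)")

names = ["ITEM_Description1", "ITEM_Description2",
         "ITEM_StockingType", "ITEM_RevisionNumber"]
conn.executemany("INSERT INTO TempVarIDs (VariableName) VALUES (?)",
                 [(n,) for n in names])

# Insertion order is recoverable only because we ORDER BY the identity column.
rows = conn.execute("SELECT VariableName FROM TempVarIDs ORDER BY ID").fetchall()
numbered = [(i, name) for i, (name,) in enumerate(rows, start=1)]
for row in numbered:
    print(row)
```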

SSRS 2008 SUM using scope and recursive option to rollup values

Consider this data (6 rows and 1 column for now)
;with cols as
(
SELECT 1 colID, 'C1' col
--UNION SELECT 2, 'C2'
)
, rows as
(
SELECT 1 RowID, 'R1' row, null ParentID
UNION SELECT 2, 'R2', 1
UNION SELECT 3, 'R3', 2
UNION SELECT 4, 'R4', 2
UNION SELECT 5, 'R5', 1
UNION SELECT 6, 'R6', 1
)
,data
AS
(
SELECT 3 RowID, 1 as Amount
UNION SELECT 4 RowID, 2 as Amount
)
SELECT r.RowID, r.row, c.colID, c.col, d.Amount, r.ParentID
FROM rows r
CROSS JOIN cols c
LEFT JOIN data d on d.RowID = r.RowID
I apply this to a matrix control using the following layout and get the output as shown. Notice how the amounts are not rolled up to the parent rows. This is OK for now.
Now, to get the values to roll up I can use the expression (well documented by msdn, blogs, etc)
=Sum(Fields!Amount.Value, "RowGroup", recursive)
This now gives me exactly what I want, with the values rolling up to their parent rows:
However, my dataset has dynamic columns as well as rows, and when a second (or third, fourth, etc.) column is introduced, the recursive sum doesn't work as I expect. Instead of staying within the scope of the current column, it sums all the columns and then rolls those values up to the parent lines, as shown:
I want the values to only get rolled up within the current scope of a given row and column.
Any guidance would be greatly appreciated as this has stumped me.
Thanks
This isn't actually possible, because it would need to look at both the row group scope and the column group scope.
I spent a long time searching for an answer. For anyone else looking, here are some better explanations from other forums:
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/55c55aad-4755-4da5-afce-94d16c9e201d/cannot-perform-a-correct-recursive-sum-in-a-tablix?forum=sqlreportingservices
Unfortunately, the scenario you described is not supported so far (both 2008 and 2005), because it cannot do a recursive sum in two scopes in a matrix. If you replace the expression with SUM(Fields!Amount.Value, "NAME", Recursive), you will get the recursive total, but the total will ignore the MonthIndex group. Actually, you want a recursive sum for both groups, NAME and MonthIndex; however, you cannot write an expression like this: SUM(Fields!Amount.Value, "NAME" and MonthIndex, Recursive). So generally speaking, in a matrix, when you write an expression like SUM(Fields!Amount.Value), it will sum in the scope of the two groups NAME and MonthIndex, and if you want to specify the scope, you can specify only one. Unfortunately, the recursive sum needs a specified scope, and the scenario you described needs two.
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/9d7f36bc-73d6-4f19-a306-0b84321e6feb/calculating-the-sum-on-a-hierarchy-with-multiple-grouping-columns?forum=sqlreportingservices
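To make the desired result concrete: what the question asks for is a rollup where each amount is added to its own row and to every ancestor row, but only within its own column. The sketch below computes that outside SSRS, in plain Python, for example as a pre-processing step before the data reaches the report. This is an assumption about one possible workaround, not something the linked threads confirm, and the amounts for column 2 are made up.

```python
from collections import defaultdict

# RowID -> ParentID, as in the sample data in the question.
parents = {1: None, 2: 1, 3: 2, 4: 2, 5: 1, 6: 1}

# (RowID, colID) -> Amount; the colID 2 entries are invented for illustration.
amounts = {(3, 1): 1, (4, 1): 2, (5, 2): 10, (3, 2): 5}

def rollup(parents, amounts):
    """Add each amount to its own row and every ancestor, staying within its column."""
    totals = defaultdict(int)
    for (row_id, col_id), amount in amounts.items():
        node = row_id
        while node is not None:
            totals[(node, col_id)] += amount
            node = parents[node]
    return dict(totals)

totals = rollup(parents, amounts)
for key in sorted(totals):
    print(key, totals[key])
```

With the rolled-up values already in the dataset, the matrix cell could use a plain (non-recursive) aggregate.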

SQL Get Second Record

I am looking to retrieve only the second (duplicate) record from a data set. For example in the following picture:
Inside the UnitID column there are two separate records for 105. I only want the returned data set to contain the second 105 record. Additionally, I want this query to return the second record for all duplicates, not just 105.
I have tried everything I can think of, although I am not that experienced, and I cannot figure it out. Any help would be greatly appreciated.
You need to use GROUP BY for this.
Here's an example (I can't read your first column name, so I'm calling it JobUnitK):
SELECT MAX(JobUnitK), Unit
FROM JobUnits
WHERE DispatchDate = 'oct 4, 2015'
GROUP BY Unit
HAVING COUNT(*) > 1
I'm assuming JobUnitK is your ordering/id field. If it's not, just replace MAX(JobUnitK) with MAX(FieldIOrderWith).
Use RANK function. Rank the rows OVER PARTITION BY UnitId and pick the rows with rank 2 .
For reference -
https://msdn.microsoft.com/en-IN/library/ms176102.aspx
Assuming SQL Server 2005 and up, you can use the Row_Number windowing function:
WITH DupeCalc AS (
SELECT
DupID = Row_Number() OVER (PARTITION BY UnitID ORDER BY JobUnitKeyID),
*
FROM JobUnits
WHERE DispatchDate = '20151004'
)
SELECT *
FROM DupeCalc
WHERE DupID >= 2
ORDER BY UnitID DESC -- ORDER BY is not allowed inside the CTE, so it goes on the outer query
;
This is better than a solution that uses Max(JobUnitKeyID) for multiple reasons:
There could be more than one duplicate, in which case you would have to use Min(JobUnitKeyID) together with UnitID and join back on the UnitID where JobUnitKeyID <> MinJobUnitKeyID.
Using Min or Max requires you to join back to the same data (which will be inherently slower).
If the ordering key you use turns out to be non-unique, you won't be able to pull the right number of rows with either one.
If the ordering key consists of multiple columns, the query using Min or Max explodes in complexity.
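For anyone who wants to try the windowing approach without a SQL Server instance, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later). The sample rows are invented, and the DispatchDate filter is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE JobUnits (JobUnitKeyID INTEGER PRIMARY KEY, UnitID INTEGER)")
# Invented sample: 101 appears once, 105 twice, 110 three times.
conn.executemany("INSERT INTO JobUnits VALUES (?, ?)",
                 [(1, 101), (2, 105), (3, 105), (4, 110), (5, 110), (6, 110)])

dupes = conn.execute("""
    WITH DupeCalc AS (
        SELECT JobUnitKeyID, UnitID,
               ROW_NUMBER() OVER (PARTITION BY UnitID ORDER BY JobUnitKeyID) AS DupID
        FROM JobUnits
    )
    SELECT JobUnitKeyID, UnitID, DupID
    FROM DupeCalc
    WHERE DupID >= 2
    ORDER BY UnitID DESC, DupID
""").fetchall()
print(dupes)
```

Changing the filter to `DupID = 2` would return exactly the second record per UnitID rather than every record after the first.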

MS Access row number, specify an index

Is there a way in MS access to return a dataset between a specific index?
So let's say my dataset is:
rank | first_name | age
1    | Max        | 23
2    | Bob        | 40
3    | Sid        | 25
4    | Billy      | 18
5    | Sally      | 19
But I only want to return the records with 'rank' between 2 and 4, so my result set is Bob, Sid and Billy. However, rank is not part of the table and should be generated when the query is run. I don't want to use an autogenerated number, because it becomes inconsistent if a record is deleted, and it wouldn't help if I wanted the results in reverse.
This is obviously very simple; the reason I ask is that I am working on a product catalogue and am looking for a more efficient way of paging through the returned dataset. If I only return one page's worth of data from the database, this is obviously going to be quicker than returning a complete set of 3000 records and then having to subselect from that set.
Thanks R.
Original suggestion:
SELECT * from table where rank BETWEEN 2 and 4;
Modified after comment, that rank is not existing in structure:
Select top 100 * from table order by ID;
And if you want to choose subsequent results, you can take the ID of the last record from the first query, say it was ID 100, and use a WHERE clause to get the next 100:
Select top 100 * from table where ID > 100 order by ID;
But these won't give you what you're looking for either, I bet.
How are you calculating rank? I assume you are basing it on some data in another dataset somewhere. If so, create a function, do a table join, or do something that can calculate rank based on values in other table(s), then you can do queries based on the rank() function.
For example:
select *
from table
where rank() between 2 and 4
If you are not calculating rank based on some data somewhere, there really isn't a way to write this query, and you might as well be returning three random rows from the table.
I think you need to use a correlated subquery to calculate the rank on the fly e.g. I'm guessing the rank is based on name:
SELECT T1.first_name, T1.age,
(
SELECT COUNT(*) + 1
FROM MyTable AS T2
WHERE T1.first_name > T2.first_name
) AS rank
FROM MyTable AS T1;
The bad news is that the Access data engine is poorly optimized for this kind of query; in my experience, performance will start to degrade noticeably beyond a few hundred rows.
If it is not possible to maintain the rank on the db side of the house (e.g. high insertion environment) consider doing the paging on the client side. For example, an ADO classic recordset object has properties to support paging (PageCount, PageSize, AbsolutePage, etc), something for which DAO recordsets (being of an older vintage) have no support.
As always, you'll have to perform your own timings but I suspect that when there are, say, 10K rows you will find it faster to take on the overhead of fetching all the rows to an ADO recordset then finding the page (then perhaps fabricate smaller ADO recordset consisting of just that page's worth of rows) than it is to perform a correlated subquery to only fetch the number of rows for the page.
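The correlated-subquery ranking above can be tried out with a small sketch like the following. It uses Python's sqlite3 module rather than Access, ranks by first_name as the answer guesses, and wraps the query in a derived table so the rank can be filtered. The alias `rnk` is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (first_name TEXT, age INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)",
                 [("Max", 23), ("Bob", 40), ("Sid", 25), ("Billy", 18), ("Sally", 19)])

# Rank = 1 + number of rows that sort before this one (here: alphabetically by name).
page = conn.execute("""
    SELECT first_name, rnk
    FROM (
        SELECT T1.first_name,
               (SELECT COUNT(*) + 1
                FROM MyTable AS T2
                WHERE T1.first_name > T2.first_name) AS rnk
        FROM MyTable AS T1
    )
    WHERE rnk BETWEEN 2 AND 4
    ORDER BY rnk
""").fetchall()
print(page)
```

Because the rank here is alphabetical, ranks 2 to 4 are Bob, Max and Sally, not the Bob, Sid, Billy of the question; the ranking column has to be whatever actually defines the order.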
Unfortunately, the LIMIT keyword, which MySQL uses for multi-page presentation, isn't available in MS Access. If you can write an order key into the results table, you can use something like this:
SELECT TOP 25 MyOrder, Etc FROM Table1 WHERE MyOrder in
(SELECT TOP 55 MyOrder FROM Table1 ORDER BY MyOrder DESC)
ORDER BY MyOrder ASC
If I understand you correctly, there are only first_name and age columns in your table. If this is the case, then there is no way to return Bob, Sid, and Billy with a single query, unless you do something like
SELECT * FROM Table
WHERE FirstName = 'Bob'
OR FirstName = 'Sid'
OR FirstName = 'Billy'
But I think that this is not what you are looking for.
This is because SQL databases make no guarantee as to the order that the data will come out of the database unless you specify an ORDER BY clause. It will usually come out in the same order it was added, but there are no guarantees, and once you get a lot of rows in your table, there's a reasonably high probability that they won't come out in the order you put them in.
As a side note, you should probably add a "rank" column (usually called id) to your table and make it an auto-incrementing integer (see the Access documentation), so that you can run the query mentioned by Sev. It's also important to have a primary key so that you can be certain which rows are updated by an update query or deleted by a delete query. For example, if you had two people named Max who were both 23, how would you delete one row without deleting the other? With an auto-incrementing unique column, you could specify the unique ID in your query to delete only one.
[ADDITION]
Upon reading your comment: if you add an autoincrement field and want to read 3 rows, and you know the ID of the first row you want to read, then you can use TOP to read 3 rows.
Assuming your data looks like this
ID | first_name | age
1  | Max        | 23
2  | Bob        | 40
6  | Sid        | 25
8  | Billy      | 18
15 | Sally      | 19
You can query Bob, Sid and Billy with the following query.
SELECT TOP 3 first_name, age
From Table
WHERE ID >= 2
ORDER BY ID
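The same "start after the last ID you saw" idea can be turned into a paging helper. Here is a minimal sketch using Python's sqlite3 module, where LIMIT stands in for Access's TOP; the `people` table name and the `fetch_page` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ID INTEGER PRIMARY KEY, first_name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?)",
                 [(1, "Max", 23), (2, "Bob", 40), (6, "Sid", 25),
                  (8, "Billy", 18), (15, "Sally", 19)])

def fetch_page(conn, after_id, page_size):
    """Return up to page_size rows whose ID is greater than after_id, in ID order."""
    return conn.execute(
        "SELECT ID, first_name FROM people WHERE ID > ? ORDER BY ID LIMIT ?",
        (after_id, page_size),
    ).fetchall()

first_page = fetch_page(conn, 0, 2)
# The next page starts after the last ID of the previous page; gaps in the IDs do not matter.
second_page = fetch_page(conn, first_page[-1][0], 2)
print(first_page, second_page)
```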

Efficient checking of possible duplicate entities

I have a requirement to produce a list of possible duplicates before a user saves an entity to the database and warn them of the possible duplicates.
There are 7 criteria on which we should check for duplicates, and if at least 3 match we should flag this up to the user.
The criteria will all match on ID, so no fuzzy string matching is needed, but my problem is that there are many possible ways (99 if I've done my sums correctly) for at least 3 of the 7 criteria to match.
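The figure of 99 checks out: it is the number of ways to choose 3, 4, 5, 6 or 7 of the 7 criteria, added together. A quick way to verify it in Python (3.8 or later for math.comb):

```python
from math import comb

criteria = 7
minimum_matches = 3

# C(7,3) + C(7,4) + C(7,5) + C(7,6) + C(7,7) = 35 + 35 + 21 + 7 + 1
ways = sum(comb(criteria, k) for k in range(minimum_matches, criteria + 1))
print(ways)
```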
I don't want to have to do 99 separate db queries to find my search results and nor do I want to bring the whole lot back from the db and filter on the client side. We're probably only talking of a few tens of thousands of records at present, but this will grow into the millions as the system matures.
Has anyone got any thoughts on a nice, efficient way to do this?
I was considering a simple OR query to get the records where at least one field matches from the db and then doing some processing on the client to filter it some more, but a few of the fields have very low cardinality and won't actually reduce the numbers by a huge amount.
Thanks
Jon
OR conditions and CASE summing will work but are quite inefficient, since they don't use indexes.
You need to use UNION ALL for the indexes to be usable.
If a user enters name, phone, email and address into the database, and you want to check all records that match at least 3 of these fields, you issue:
SELECT i.*
FROM (
SELECT id, COUNT(*) AS cnt
FROM (
SELECT id
FROM t_info t
WHERE name = 'Eve Chianese'
UNION ALL
SELECT id
FROM t_info t
WHERE phone = '+15558000042'
UNION ALL
SELECT id
FROM t_info t
WHERE email = '42@example.com'
UNION ALL
SELECT id
FROM t_info t
WHERE address = '42 North Lane'
) q
GROUP BY
id
HAVING COUNT(*) >= 3
) dq
JOIN t_info i
ON i.id = dq.id
This will use indexes on these fields and the query will be fast.
See this article in my blog for details:
Matching 3 of 4: how to match a record which matches at least 3 of 4 possible conditions
Also see this question the article is based upon.
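Here is a small runnable sketch of the UNION ALL pattern using Python's sqlite3 module. The three sample rows are invented, and the query text is generated from a dict so that each field contributes one indexed lookup; treat it as an illustration of the technique rather than a tuned implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_info ("
             "id INTEGER PRIMARY KEY, name TEXT, phone TEXT, email TEXT, address TEXT)")
conn.executemany("INSERT INTO t_info VALUES (?, ?, ?, ?, ?)", [
    (1, "Eve Chianese", "+15558000042", "42@example.com", "1 Other Road"),    # 3 fields match
    (2, "Eve Chianese", "+15550000000", "eve@example.com", "42 North Lane"),  # 2 fields match
    (3, "Someone Else", "+15558000042", "42@example.com", "42 North Lane"),   # 3 fields match
])
for col in ("name", "phone", "email", "address"):
    conn.execute(f"CREATE INDEX ix_{col} ON t_info ({col})")

candidate = {"name": "Eve Chianese", "phone": "+15558000042",
             "email": "42@example.com", "address": "42 North Lane"}

# One single-column lookup per field; each matching field contributes one row per id.
branches = " UNION ALL ".join(
    f"SELECT id FROM t_info WHERE {col} = ?" for col in candidate)
sql = f"SELECT id FROM ({branches}) q GROUP BY id HAVING COUNT(*) >= 3 ORDER BY id"
matches = [row[0] for row in conn.execute(sql, list(candidate.values()))]
print(matches)
```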
If you want to have a list of DISTINCT values in the existing data, you just wrap this query into a subquery:
SELECT i1.*
FROM t_info i1
WHERE EXISTS
(
SELECT 1
FROM (
SELECT id
FROM t_info t
WHERE name = i1.name
UNION ALL
SELECT id
FROM t_info t
WHERE phone = i1.phone
UNION ALL
SELECT id
FROM t_info t
WHERE email = i1.email
UNION ALL
SELECT id
FROM t_info t
WHERE address = i1.address
) q
GROUP BY
id
HAVING COUNT(*) >= 3
)
Note that this DISTINCT is not transitive: if A matches B and B matches C, this does not mean that A matches C.
You might want something like the following:
SELECT id
FROM
(select id,
CASE fld1 WHEN input1 THEN 1 ELSE 0 END AS rule1,
CASE fld2 WHEN input2 THEN 1 ELSE 0 END AS rule2,
...,
CASE fld7 WHEN input7 THEN 1 ELSE 0 END AS rule7
FROM table) t
WHERE rule1+rule2+rule3+...+rule7 >= 3
This isn't tested, but it shows a way to tackle this.
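A tested version of the same idea, as a sketch: it uses Python's sqlite3 module, only four fields instead of seven to keep it short, and an invented `entities` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities ("
             "id INTEGER PRIMARY KEY, fld1 INT, fld2 INT, fld3 INT, fld4 INT)")
conn.executemany("INSERT INTO entities VALUES (?, ?, ?, ?, ?)",
                 [(1, 10, 20, 30, 99),   # 3 fields match
                  (2, 10, 99, 99, 99),   # 1 field matches
                  (3, 10, 20, 30, 40)])  # 4 fields match

inputs = (10, 20, 30, 40)  # the values of the entity about to be saved

# Each CASE yields 1 for a matching field and 0 otherwise; the sum is the match count.
inner = """
    SELECT id,
           CASE fld1 WHEN ? THEN 1 ELSE 0 END
         + CASE fld2 WHEN ? THEN 1 ELSE 0 END
         + CASE fld3 WHEN ? THEN 1 ELSE 0 END
         + CASE fld4 WHEN ? THEN 1 ELSE 0 END AS score
    FROM entities
"""
scored = conn.execute(
    f"SELECT id, score FROM ({inner}) WHERE score >= 3 ORDER BY id", inputs
).fetchall()
print(scored)
```

Note that this form evaluates every row, so it scans the table; that is the trade-off compared with the UNION ALL version.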
What DBMS are you using? Some support enforcing such constraints with server-side code.
Have you considered using a stored procedure with a cursor? You could then do your OR query and then step through the records one-by-one looking for matches. Using a stored procedure would allow you to do all the checking on the server.
However, I think a table scan with millions of records is always going to be slow. I think you should work out which of the 7 fields are most likely to match and make sure these are indexed.
I'm assuming your system is trying to match tag IDs of a certain post, or something similar. This is a many-to-many relationship, and you should have three tables to handle it: one for posts, one for tags, and one for the post-tag relationship.
If my assumptions are correct then the best way to handle this is:
SELECT postid, count(tagid) as common_tag_count
FROM posts_to_tags
WHERE tagid IN (tag1, tag2, tag3, ...)
GROUP BY postid
HAVING count(tagid) >= 3;
