I'm hoping to return a single row with a comma separated list of values from a query that returns multiple rows in Oracle, essentially flattening the returned rows into a single row.
In PostgreSQL this can be achieved using the array and array_to_string functions like this:
Given the table "people":
id | name
---------
1 | bob
2 | alice
3 | jon
The SQL:
select array_to_string(array(select name from people), ',') as names;
Will return:
names
-------------
bob,alice,jon
How would I achieve the same result in Oracle 9i?
Thanks,
Matt
Tim Hall has the definitive collection of string aggregation techniques in Oracle.
If you're stuck on 9i, my personal preference would be to define a custom aggregate (there is an implementation of string_agg on that page) such that you would have
SELECT string_agg( name )
FROM people
But you have to create the STRING_AGG objects first. If you need to avoid creating new objects, there are other approaches, but in 9i they are all messier than the PostgreSQL syntax.
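For reference, here is a condensed sketch of such an aggregate, closely following the string_agg implementation in Tim's article (treat his page as the canonical version):

CREATE OR REPLACE TYPE t_string_agg AS OBJECT
(
  g_string  VARCHAR2(32767),

  STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT t_string_agg)
    RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateIterate(self  IN OUT t_string_agg,
                                       value IN     VARCHAR2)
    RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateTerminate(self        IN  t_string_agg,
                                         returnValue OUT VARCHAR2,
                                         flags       IN  NUMBER)
    RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateMerge(self IN OUT t_string_agg,
                                     ctx2 IN     t_string_agg)
    RETURN NUMBER
);
/

CREATE OR REPLACE TYPE BODY t_string_agg IS
  STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT t_string_agg)
    RETURN NUMBER IS
  BEGIN
    sctx := t_string_agg(NULL);
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateIterate(self  IN OUT t_string_agg,
                                       value IN     VARCHAR2)
    RETURN NUMBER IS
  BEGIN
    -- Accumulate with a leading delimiter; it is stripped in Terminate.
    self.g_string := self.g_string || ',' || value;
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateTerminate(self        IN  t_string_agg,
                                         returnValue OUT VARCHAR2,
                                         flags       IN  NUMBER)
    RETURN NUMBER IS
  BEGIN
    returnValue := LTRIM(self.g_string, ',');
    RETURN ODCIConst.Success;
  END;

  -- Merge is only called for parallel evaluation; ctx2 already carries
  -- its own leading comma, so plain concatenation is enough.
  MEMBER FUNCTION ODCIAggregateMerge(self IN OUT t_string_agg,
                                     ctx2 IN     t_string_agg)
    RETURN NUMBER IS
  BEGIN
    self.g_string := self.g_string || ctx2.g_string;
    RETURN ODCIConst.Success;
  END;
END;
/

CREATE OR REPLACE FUNCTION string_agg (p_input VARCHAR2)
  RETURN VARCHAR2
  PARALLEL_ENABLE AGGREGATE USING t_string_agg;
/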
In 10g I definitely prefer the COLLECT option mentioned at the end of Tim's article.
The nice thing about that approach is that the same underlying function (that accepts the collection as an argument), can be used both as an aggregate and as a multiset function:
SELECT deptno,
       tab_to_string(CAST(MULTISET(SELECT ename
                                   FROM emp
                                   WHERE deptno = dept.deptno) AS t_varchar2_tab), ',')
FROM dept;
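The aggregate form of the same helper (10g and up, using the t_varchar2_tab type and tab_to_string function defined in Tim's article) would look something like:

SELECT deptno,
       tab_to_string(CAST(COLLECT(ename) AS t_varchar2_tab), ',') AS enames
FROM emp
GROUP BY deptno;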
However in 9i that's not available. SYS_CONNECT_BY_PATH is nice because it's flexible, but it can be slow, so be careful of that.
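For completeness, here is a sketch of SYS_CONNECT_BY_PATH against the people table from the question; the ROW_NUMBER subquery fabricates a chain for CONNECT BY, and MAX picks the longest path (every shorter path is a prefix of it):

SELECT LTRIM(MAX(SYS_CONNECT_BY_PATH(name, ',')), ',') AS names
FROM (SELECT name,
             ROW_NUMBER() OVER (ORDER BY id) AS rn
      FROM people)
START WITH rn = 1
CONNECT BY PRIOR rn = rn - 1;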
Is there any way within Snowflake (SQL) to see which tables are being queried the most, and which columns? I want to know what data is of most value to my users, and I'm not sure how to do this programmatically. Any thoughts are appreciated - thank you!
2021 update
The new ACCESS_HISTORY view has this information (in preview right now, enterprise edition).
For example, if you want to find the most used columns:
select obj.value:objectName::string objName
, col.value:columnName::string colName
, count(*) uses
, min(query_start_time) since
, max(query_start_time) until
from snowflake.account_usage.access_history
, table(flatten(direct_objects_accessed)) obj
, table(flatten(obj.value:columns)) col
group by 1, 2
order by uses desc
Ref: https://docs.snowflake.com/en/sql-reference/account-usage/access_history.html
2020 answer
The best I found (for now):
For any given query, you can find what tables are scanned through looking at the plan generated for it:
SELECT *, "objects"
FROM TABLE(EXPLAIN_JSON(SYSTEM$EXPLAIN_PLAN_JSON('SELECT * FROM a.b.any_table_or_view')))
WHERE "operation"='TableScan'
You can also find all of your previously run queries:
select QUERY_TEXT
from table(information_schema.query_history())
So the natural next step would be to combine both - but that's not straightforward, as you'll get an error like:
SQL compilation error: argument 1 to function EXPLAIN_JSON needs to be constant, found 'SYSTEM$EXPLAIN_PLAN_JSON('SELECT * FROM a.b.c')'
The solution is to take the query texts from query_history() and pass them to SYSTEM$EXPLAIN_PLAN_JSON from outside SQL (so the argument strings are constants); then you will be able to find out the most queried tables.
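One way to do that (a sketch, assuming none of the query texts contain the $$ delimiter themselves): let Snowflake generate one constant-argument statement per historical query, then execute the generated statements from a client script:

-- Emits one EXPLAIN statement per previous query; run the output externally.
select 'select "objects" from table(explain_json(system$explain_plan_json($$'
       || query_text
       || '$$))) where "operation" = ''TableScan'';' as explain_stmt
from table(information_schema.query_history());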
Let's say I have a table in SQL Server named Tasks. It contains tasks with the people assigned to them in a single Workers column, separated by semicolons.
ID Workers
1 John Newman; Troy Batchelor; Mike Smith
2 Chris Portman
3 Sara Oldman; Greg House
I need to split the workers out of that column, one per row. The result should be:
ID Worker
1 John Newman
1 Troy Batchelor
1 Mike Smith
2 Chris Portman
3 Sara Oldman
3 Greg House
I have no idea how to approach this. Do I have to use a stored procedure, or is a simple query enough?
You can solve this without any function or stored procedure (tblSemiColon below stands for your Tasks table):
SELECT ID,
       LTRIM(RTRIM(m.n.value('.[1]', 'varchar(8000)'))) AS Workers
FROM
(
    SELECT ID,
           CAST('<XMLRoot><RowData>'
                + REPLACE(Workers, ';', '</RowData><RowData>')
                + '</RowData></XMLRoot>' AS XML) AS x
    FROM tblSemiColon
) t
CROSS APPLY x.nodes('/XMLRoot/RowData') m(n)
This works on any version of SQL Server with the xml type (2005 and later). You can reduce the varchar(8000) length if your values are shorter, and note that the REPLACE trick breaks if the data contains XML-special characters such as & or <.
If you are on SQL Server 2016 or higher, you can use the STRING_SPLIT function (LTRIM removes the space that follows each semicolon in the sample data):
SELECT ID, LTRIM(value) AS Worker
FROM Tasks
CROSS APPLY STRING_SPLIT(Workers, ';')
I'm trying to take a raw data set that adds a new column for each new date and convert it to a more traditional table structure. The idea is to pull the column name (the date) into a new column and then stack each date's values on top of each other.
Example
Store 1/1/2013 2/1/2013
XYZ INC $1000 $2000
To
Store Date Value
XYZ INC 1/1/2013 $1000
XYZ INC 2/1/2013 $2000
thanks
There are a few different ways that you can get the result that you want.
You can use a SELECT with UNION ALL:
select store, '1/1/2013' date, [1/1/2013] value
from yourtable
union all
select store, '2/1/2013' date, [2/1/2013] value
from yourtable;
See SQL Fiddle with Demo.
You can use the UNPIVOT function:
select store, date, value
from yourtable
unpivot
(
value
for date in ([1/1/2013], [2/1/2013])
) un;
See SQL Fiddle with Demo.
Finally, depending on your version of SQL Server (2008 and later), you can use CROSS APPLY with VALUES:
select store, date, value
from yourtable
cross apply
(
values
('1/1/2013', [1/1/2013]),
('2/1/2013', [2/1/2013])
) c (date, value)
See SQL Fiddle with Demo. All versions will give a result of:
| STORE | DATE | VALUE |
|---------|----------|-------|
| XYZ INC | 1/1/2013 | 1000 |
| XYZ INC | 2/1/2013 | 2000 |
Depending on the details of the problem (i.e. source format, number and variability of dates, how often you need to perform the task, etc), it very well may be much easier to use some other language to parse the data and perform either a reformatting function or the direct insert into the final table.
The above said, if you're interested in a completely SQL solution, it sounds like you're looking for dynamic pivot functionality; the keywords to search for are dynamic SQL and UNPIVOT. The details vary based on which RDBMS you're using and on the specs of the initial data set.
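For SQL Server, a sketch of the dynamic version (assuming the table is named yourtable, every column except store is a date column, and the date columns share one type, since UNPIVOT requires that):

DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build the [1/1/2013],[2/1/2013],... list from the catalog.
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(name)
    FROM sys.columns
    WHERE object_id = OBJECT_ID('yourtable')
      AND name <> 'store'
    FOR XML PATH('')), 1, 1, '');

SET @sql = N'select store, date, value
             from yourtable
             unpivot (value for date in (' + @cols + ')) un;';

EXEC sp_executesql @sql;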
I would use a scripting language (Perl, Python, etc.) to generate an INSERT statement for each date column in the original data, transposing the values into rows keyed by Store and Date, and then run the inserts into your normalized table.
Can anybody tell me when to use the WITH clause?
The WITH keyword is used to create a temporary named result set. These are called Common Table Expressions.
A very basic, self-explanatory example:
WITH Administrators (Name, Surname) AS
(
    SELECT Name, Surname
    FROM Users
    WHERE AccessRights = 'Admin'
)
SELECT * FROM Administrators
For further reading and more examples, I suggest starting out with the following MSDN article:
Common Table Expressions by John Papa
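WITH is also how you write recursive queries, which is where CTEs have no real substitute. A minimal sketch, assuming a self-referencing Employees (EmployeeID, ManagerID, Name) table:

WITH Reports (EmployeeID, Name, Level) AS
(
    -- Anchor: top-level managers.
    SELECT EmployeeID, Name, 0
    FROM Employees
    WHERE ManagerID IS NULL

    UNION ALL

    -- Recursive step: direct reports of the previous level.
    SELECT e.EmployeeID, e.Name, r.Level + 1
    FROM Employees e
    JOIN Reports r ON e.ManagerID = r.EmployeeID
)
SELECT * FROM Reports;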
Note that SQL Server also has an unrelated WITH: the table-hint syntax, which you sometimes need to force a query to use an index. This is often a necessity in spatial queries, where it can reduce query time from a minute to a few seconds.
select * from MyTable with(index(MySpatialIndex)) where...
Is there a way in MS Access to return a dataset between specific indexes?
So let's say my dataset is:
rank | first_name | age
-----|------------|----
   1 | Max        | 23
   2 | Bob        | 40
   3 | Sid        | 25
   4 | Billy      | 18
   5 | Sally      | 19
But I only want to return the records between rank 2 and 4, so my result set is Bob, Sid and Billy. However, rank is not part of the table; it should be generated when the query runs. Why don't I use an autogenerated number? Because if a record is deleted the numbering becomes inconsistent - and what if I want the results in reverse?
This is obviously very simple, and the reason I ask is that I am working on a product catalogue and looking for a more efficient way of paging through the returned dataset. If I only return one page's worth of data from the database, that is obviously going to be quicker than returning the complete set of 3000 records and then sub-selecting from it!
Thanks R.
Original suggestion:
SELECT * from table where rank BETWEEN 2 and 4;
Modified after the comment that rank does not exist in the table:
Select top 100 * from table;
And if you want the subsequent results, you can take the ID of the last record from the first query - say it was 100 - and use a WHERE clause to get the next 100:
Select top 100 * from table where ID > 100;
But these won't give you what you're looking for either, I bet.
How are you calculating rank? I assume you are basing it on data in another dataset somewhere. If so, create a function, do a table join, or do something else that can calculate rank from the values in the other table(s); then you can query based on that rank() function.
For example:
select *
from table
where rank() between 2 and 4
If you are not calculating rank based on some data somewhere, there really isn't a way to write this query, and you might as well be returning three random rows from the table.
I think you need to use a correlated subquery to calculate the rank on the fly e.g. I'm guessing the rank is based on name:
SELECT T1.first_name, T1.age,
(
SELECT COUNT(*) + 1
FROM MyTable AS T2
WHERE T1.first_name > T2.first_name
) AS rank
FROM MyTable AS T1;
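To get the "between rank 2 and 4" part of the question, you can repeat the same correlated subquery in the WHERE clause (a sketch; Access accepts a scalar subquery in a comparison like this):

SELECT T1.first_name, T1.age
FROM MyTable AS T1
WHERE (SELECT COUNT(*) + 1
       FROM MyTable AS T2
       WHERE T1.first_name > T2.first_name) BETWEEN 2 AND 4;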
The bad news is that the Access database engine is poorly optimized for this kind of query; in my experience, performance starts to noticeably degrade beyond a few hundred rows.
If it is not possible to maintain the rank on the db side of the house (e.g. high insertion environment) consider doing the paging on the client side. For example, an ADO classic recordset object has properties to support paging (PageCount, PageSize, AbsolutePage, etc), something for which DAO recordsets (being of an older vintage) have no support.
As always, you'll have to perform your own timings, but I suspect that with, say, 10K rows you will find it faster to take on the overhead of fetching all the rows into an ADO recordset and locating the page there (perhaps fabricating a smaller ADO recordset holding just that page's worth of rows) than to run a correlated subquery that fetches only the page's rows.
Unfortunately the LIMIT keyword isn't available in MS Access -- that's what is used in MySQL for a multi-page presentation. If you can write an order key into the results table, then you can use it something like this:
SELECT TOP 25 MyOrder, Etc FROM Table1 WHERE MyOrder IN
    (SELECT TOP 55 MyOrder FROM Table1 ORDER BY MyOrder DESC)
ORDER BY MyOrder ASC
If I understand you correctly, there are only first_name and age columns in your table. If that is the case, then there is no way to return Bob, Sid, and Billy with a single query, unless you do something like:
SELECT * FROM Table
WHERE FirstName = 'Bob'
OR FirstName = 'Sid'
OR FirstName = 'Billy'
But I think that this is not what you are looking for.
This is because SQL databases make no guarantee as to the order that the data will come out of the database unless you specify an ORDER BY clause. It will usually come out in the same order it was added, but there are no guarantees, and once you get a lot of rows in your table, there's a reasonably high probability that they won't come out in the order you put them in.
As a side note, you should probably add a "rank" column (this column is usually called id) to your table and make it an auto-incrementing integer (see the Access documentation on AutoNumber), so that you can do the query mentioned by Sev. It's also important to have a primary key so that you can be certain which rows are being updated or deleted when you run an update or delete query. For example, if you had two people named Max, both aged 23, how would you delete one row without deleting the other? With an auto-incrementing unique column, you can specify the unique ID in your query to delete only one.
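A sketch of the DDL for that, using Access's COUNTER type (MyTable is a placeholder name):

-- Adds an AutoNumber column and makes it the primary key.
ALTER TABLE MyTable
ADD COLUMN ID COUNTER CONSTRAINT PK_MyTable PRIMARY KEY;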
[ADDITION]
Upon reading your comment: if you add an AutoNumber field, want to read 3 rows, and know the ID of the first row you want, then you can use TOP to read the 3 rows.
Assuming your data looks like this
ID | first_name | age
---|------------|----
 1 | Max        | 23
 2 | Bob        | 40
 6 | Sid        | 25
 8 | Billy      | 18
15 | Sally      | 19
You can query Bob, Sid and Billy with the following query:
SELECT TOP 3 FirstName, Age
From Table
WHERE ID >= 2
ORDER BY ID