Should recursive common table expressions over DMVs be built on cached data? - sql-server

I have written a little CTE to get the total blocking time of a head blocker process, and I am unsure whether I should first copy all of the processes I want the CTE to run over into a temp table and then run the query over that copy. That is, I want to be sure that the data cannot change under my feet while the query runs, because in the worst case I could end up with an infinite recursive loop!
This is my SQL including the temp table. I'd prefer not to have to use the table, for performance reasons, and instead go directly to the sysprocesses DMV inside my CTE, but I'm not sure of the possible implications of this; for comparison, a direct version follows the query below.
DECLARE @proc TABLE(
spid SMALLINT PRIMARY KEY,
blocked SMALLINT INDEX blocked_index,
waittime BIGINT)
INSERT INTO @proc
SELECT spid, blocked, waittime
FROM master..sysprocesses
;WITH block_cte AS
(
SELECT spid, CAST(blocked AS BIGINT) [wait_time], spid [root_spid]
FROM @proc
WHERE blocked = 0
UNION ALL
SELECT blocked.spid, blocked.waittime, block_cte.root_spid
FROM @proc AS blocked
INNER JOIN block_cte ON blocked.blocked = block_cte.spid
)
SELECT root_spid blocking_spid, SUM(wait_time) total_blocking_time
FROM block_cte
GROUP BY root_spid
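For reference, this is roughly what the direct version would look like if I skipped the snapshot and queried the DMV inside the CTE. This is only a sketch: whether the engine re-reads sysprocesses during recursion is exactly the open question, so the MAXRECURSION cap is there as a guard, making a cycle fail with an error instead of looping forever.
;WITH block_cte AS
(
SELECT spid, CAST(blocked AS BIGINT) [wait_time], spid [root_spid]
FROM master..sysprocesses
WHERE blocked = 0
UNION ALL
SELECT blocked.spid, CAST(blocked.waittime AS BIGINT), block_cte.root_spid
FROM master..sysprocesses AS blocked
INNER JOIN block_cte ON blocked.blocked = block_cte.spid
)
SELECT root_spid blocking_spid, SUM(wait_time) total_blocking_time
FROM block_cte
GROUP BY root_spid
OPTION (MAXRECURSION 32767) -- hard cap: a blocking cycle raises an error rather than spinning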

This question is probably best transferred to Stack DBA. I'm sure those clever guys and girls can not only tell you the answer but also the reason behind it.
Not being sure myself I decided to test it...
My script captures the record count from sysProcesses 1,000 times. Now to do this I had to circumnavigate several limits placed on CTEs. Among other restrictions, you cannot use aggregate functions in the recursive part. This makes counting records quite hard. So I created an inline table function to return the current row count from sysProcesses.
sysProcess Count Function
CREATE FUNCTION ProcessCount()
RETURNS TABLE
AS
RETURN
(
-- Return the current process count.
SELECT
COUNT(*) AS RecordCount
FROM
Master..sysProcesses
)
;
I wrapped this function in a CTE.
CTE
WITH RCTE AS
(
/* CTE to test if recursion is affected by updates to
* underlying data.
*/
-- Anchor part.
SELECT
1 AS ExecutionCount,
1 AS JoinField,
RecordCount
FROM
ProcessCount()
UNION ALL
-- Recursive part.
SELECT
r.ExecutionCount + 1 AS ExecutionCount,
1 AS JoinField,
pc.RecordCount
FROM
ProcessCount() AS pc
INNER JOIN RCTE AS r ON r.JoinField = 1
WHERE
r.ExecutionCount < 1000
)
SELECT
MIN(RecordCount) AS MinRecordCount,
MAX(RecordCount) AS MaxRecordCount
FROM
RCTE
OPTION
(MAXRECURSION 1000)
;
GO
If the min and max record counts are always equal this would suggest there is only one consistent view of sysProcesses, used throughout the query. Any difference proves this is not the case. Running on SQL Server 2008 R2 I did find differences:
Results
Run  Min  Max
  1  113  254
  2  107  108
  3   86  108
Of course, the inline function could be to blame here. It certainly changed my execution plan. This has taught me a lesson: I really need to better understand execution plans. I'm sure reading the OP's plan would provide a definitive answer.

Related

Does UNION or UNION all build one massive query that locks all tables selected?

I'm being told by my lead DBA that I wrote poorly formed code because I used UNION ALL to accumulate the results of successive queries on different tables. I thought that when a query contains multiple SELECT statements whose results are UNIONed, each SELECT executes separately: each one places a shared lock on its table that is released when it finishes, before the next SELECT starts.
I thought the results were accumulated in some buffer or temp table.
Would someone kindly tell me what goes on behind the scenes, and what resources are consumed, when the results of a hundred SELECT statements are UNIONed? Each SELECT operates on one table and collects schema, table, and column names.
Sorry, I don't have a query plan. The DBA complained the query was too big to show much of the plan. His comments are below the query.
SELECT 'R_Stage' as TheSchema, 'DateFrozenSectionModF63x086' as TheTable, 'PersonModTextStaffSID' as TheColumn, COUNT(*) as NullCount
FROM [R_Stage].[DateFrozenSectionModF63x086] WHERE [PersonModTextStaffSID] = -1
UNION ALL
SELECT 'R_Stage' as TheSchema, 'DateFrozenSectionModF63x086' as TheTable, 'LabDataLabSubjectSID' as TheColumn, COUNT(*) as NullCount
FROM [R_Stage].[DateFrozenSectionModF63x086] WHERE [LabDataLabSubjectSID] = -1
UNION ALL
SELECT 'R_Stage' as TheSchema, 'DateFrozenSectionModF63x086' as TheTable, 'LabDataPatientSID' as TheColumn, COUNT(*) as NullCount
FROM [R_Stage].[DateFrozenSectionModF63x086] WHERE [LabDataPatientSID] = -1
UNION ALL
SELECT 'R_Stage' as TheSchema, 'DateGrossDescChangedF63x087' as TheTable, 'PersonModTextStaffSID' as TheColumn, COUNT(*) as NullCount
FROM [R_Stage].[DateGrossDescChangedF63x087] WHERE [PersonModTextStaffSID] = -1
UNION ALL
SELECT 'R_Stage' as TheSchema, 'DateGrossDescChangedF63x087' as TheTable, 'LabDataLabSubjectSID' as TheColumn, COUNT(*) as NullCount
FROM [R_Stage].[DateGrossDescChangedF63x087] WHERE [LabDataLabSubjectSID] = -1
UNION ALL
SELECT 'R_Stage' as TheSchema, 'DateGrossDescChangedF63x087' as TheTable, 'LabDataPatientSID' as TheColumn, COUNT(*) as NullCount
FROM [R_Stage].[DateGrossDescChangedF63x087] WHERE [LabDataPatientSID] = -1
UNION ALL
In any case, the query above could certainly have been written in a much more efficient way. As written, for every table in the query it will scan the entire table for every UNION, which is 791 scans in total. Just looking at the first few lines of the query we can see these are just counts from the same table, which could have been done with a single scan of that table using a CASE expression for each count; you would have gotten all the counts in one pass per table.
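For illustration, a minimal sketch of the single-scan rewrite the DBA describes, applied to the first table in the query (conditional SUMs replace the separate COUNT(*) queries):
SELECT 'R_Stage' AS TheSchema,
       'DateFrozenSectionModF63x086' AS TheTable,
       SUM(CASE WHEN [PersonModTextStaffSID] = -1 THEN 1 ELSE 0 END) AS PersonModTextStaffSIDNullCount,
       SUM(CASE WHEN [LabDataLabSubjectSID] = -1 THEN 1 ELSE 0 END) AS LabDataLabSubjectSIDNullCount,
       SUM(CASE WHEN [LabDataPatientSID] = -1 THEN 1 ELSE 0 END) AS LabDataPatientSIDNullCount
FROM [R_Stage].[DateFrozenSectionModF63x086];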
The bottom line is that right now we only have a few users on the FRE and processes like this are already affecting many users / jobs. Imagine when we have hundreds to thousands of users. We simply can’t afford to run processes that are not vetted or properly tested like these two examples. This is nothing personal and should not be taken as such, it is all about the overall well being of the server and all the users. It is part of my job to point out such issues so they can be addressed when I see them and this is unquestionably one of those times. These can’t be run again until they are rewritten to ensure they do what they are intended to do and that they are efficient enough to not cause issues with other processes.
The advice from your DBA seems quite reasonable. He/she doesn't mention locking, and it's not clear why you've mentioned that as the problem.
As the DBA states, you're executing 791 queries that the database engine then unions together. This will impose a load on the database. Assuming your DBA is correct about those queries being full table scans, that means the entire table is going to be read 791 times.
Regardless of any locking, that is going to thrash the disks, overrun file system and database caches, and load up the CPU running those queries.
Assuming your database is large enough that it doesn't fit in RAM (file system cache or database cache), it has to be read from disk in full each time.
If the query were rewritten as your DBA advises so that it only made 1 full table scan through the database, the impact on the file system would be 1/791 of the query as currently written.
If your database does indeed take read locks at the same time, your query will impact updaters of that table 791 times.
Your DBA's recommendations have the effect of making the proposed query roughly 791 times as efficient.
If we assume, just as a working example, that your table is 100 MB: at a disk read speed of 100 MB/s it takes around 1 second to scan it once, so 791 scans take around 791 seconds, or roughly 13 minutes. Rewritten as your DBA advises, it would take around 1 second.
This isn't a locking problem, it's a classic I/O performance problem. If you have locking problems as well, that just makes it worse.
The exact performance characteristics of your query depend on many factors, including how large the table is, what indexes are defined (noting that indexes can make a query slower in certain circumstances), how 'wide' the table is, the types of columns in the table, what hardware the query is running on, what database system you use, how fast the disks are, how much RAM your DB has, what else is happening on the system, and on and on, so it's not possible to give a definitive answer without a lot more information.
But avoiding 791 full table scans is a good start towards improved performance.
I'm sorry, that post just made my eyes hurt. It sounds like you need to write a script to clean up or identify a problem. To make this easy, you could automate a script that spits out smallish, testable SQL statements before you run against those 300 tables. It does require your DBA to let you use cursors and temp tables, both of which should be avoided when possible; however, this seems more like an identify-the-problem or clean-up exercise than one focused on efficiency. That said, I would not want to lock those tables on a production system for long periods, so do a lot of smaller tasks to reduce locking and still reach the same goal. You can run this script in SQL Server Management Studio and copy the output as input to give to your DBA; maybe it helps.
SET NOCOUNT ON
DECLARE @OUTPUT TABLE
(
TheSchema NVARCHAR(45),
TheTable NVARCHAR(45),
Field1 NVARCHAR(45),
Field2 NVARCHAR(45),
Field3 NVARCHAR(45)
)
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x086','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x087','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x088','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x089','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x090','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x091','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x092','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x093','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x094','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
INSERT @OUTPUT SELECT 'R_Stage','DateFrozenSectionModF63x095','PersonModTextStaffSID','LabDataLabSubjectSID','LabDataPatientSID'
DECLARE @TheSchema NVARCHAR(45),@TheTable NVARCHAR(45),@Field1 NVARCHAR(45),@Field2 NVARCHAR(45),@Field3 NVARCHAR(45)
DECLARE LOOP CURSOR FOR
SELECT TheSchema,TheTable,Field1,Field2,Field3 FROM @OUTPUT
PRINT '
IF (EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = ''__MY_SCAN''))
DROP TABLE __MY_SCAN
CREATE TABLE __MY_SCAN(
TheSchema NVARCHAR(45),
TheTable NVARCHAR(45),
Field1NullCount INT,
Field2NullCount INT,
Field3NullCount INT
)'
OPEN LOOP
FETCH NEXT FROM LOOP INTO @TheSchema,@TheTable,@Field1,@Field2,@Field3
WHILE(@@FETCH_STATUS=0) BEGIN
PRINT
'INSERT __MY_SCAN
SELECT
'''+@TheSchema+''' AS TheSchema,
'''+@TheTable+''' AS TheTable,
SUM(Field1),
SUM(Field2),
SUM(Field3)
FROM
(
SELECT
Field1=CASE WHEN '+@Field1+'=-1 THEN 1 ELSE 0 END,
Field2=CASE WHEN '+@Field2+'=-1 THEN 1 ELSE 0 END,
Field3=CASE WHEN '+@Field3+'=-1 THEN 1 ELSE 0 END
FROM
['+@TheSchema+'].['+@TheTable+']
WHERE
'+@Field1+'=-1 OR '+@Field2+'=-1 OR '+@Field3+'=-1
)AS X
GO'
FETCH NEXT FROM LOOP INTO @TheSchema,@TheTable,@Field1,@Field2,@Field3
END
CLOSE LOOP
DEALLOCATE LOOP
PRINT '
SELECT * FROM __MY_SCAN
GO
DROP TABLE __MY_SCAN
GO
'

Awkward JOIN causes poor performance

I have a stored procedure that combines data from several tables via UNION ALL. If the parameters passed in to the stored procedure don't apply to a particular table, I attempt to "short-circuit" that table by using "helper bits", e.g. @DataSomeTableExists, and adding a corresponding condition to the WHERE clause, e.g. WHERE @DataSomeTableExists = 1
One (pseudo) table in the stored procedure is a bit awkward and is causing me some grief.
DECLARE @DataSomeTableExists BIT = (SELECT CASE WHEN EXISTS(SELECT * FROM #T WHERE StorageTable = 'DATA_SomeTable') THEN 1 ELSE 0 END);
...
UNION ALL
SELECT *
FROM REF_MinuteDimension AS dim WITH (NOLOCK)
CROSS JOIN (SELECT * FROM #T WHERE StorageTable = 'DATA_SomeTable') AS T
CROSS APPLY dbo.fGetLastValueFromSomeTable(T.ParentId, dim.TimeStamp) dpp
WHERE @DataSomeTableExists = 1 AND dim.TimeStamp >= @StartDateTime AND dim.TimeStamp <= @EndDateTime
UNION ALL
...
Note: REF_MinuteDimension is nothing more than smalldatetimes with minute increments.
(1) The execution plan (below) indicates a warning on the nested loops operator saying that there is no join predicate. This is probably not good, but there really isn't a natural join between the tables. Is there a better way to write such a query? For each ParentId in T, I want the value from the UDF for every minute between @StartDateTime and @EndDateTime.
(2) Even when @DataSomeTableExists = 0, there is I/O activity on the tables in this query, as reported by SET STATISTICS IO ON and the actual execution plan. The execution plan reports a 14.2% cost, which is too much considering these tables don't even apply in this case.
SELECT * FROM #T WHERE StorageTable = 'DATA_SomeTable' comes back empty.
Is it the way my query is written? Why wouldn't the helper bit or an empty T short-circuit this query?
For (2), I can say for sure that the line
CROSS JOIN (SELECT * FROM #T WHERE StorageTable = 'DATA_SomeTable') AS T
will force #T to be analysed and to enter a join. You can create two versions of the SP, with and without that join, and use the flag to execute one or the other, but I cannot say whether it will save any response time, CPU clocks, I/O bandwidth, or memory.
For (1), I suggest removing the (nolock) if you are using SQL Server 2005 or better, and keeping a close eye on that UDF. I cannot say more without a good SQL fiddle.
I should mention, I have no clue if this will ever work, as it's kind of an odd way to write a sproc, and table-valued UDFs aren't well understood by the query optimizer. You might have to build your resultset into a table variable or temp table conditionally, based on IF statements, and then return that data; a sketch of that follows the code below. But I would try this first:
--helper bit declared
declare @DataSomeTableExists BIT = 0x0
if exists (select 1 from #T where StorageTable = 'DATA_SomeTable')
begin
set @DataSomeTableExists = 0x1
end
...
UNION ALL
SELECT *
FROM REF_MinuteDimension AS dim WITH (NOLOCK)
CROSS JOIN (SELECT * FROM #T WHERE StorageTable = 'DATA_SomeTable' and @DataSomeTableExists = 0x1) AS T
CROSS APPLY dbo.fGetLastValueFromSomeTable(T.ParentId, dim.TimeStamp) dpp
WHERE @DataSomeTableExists = 0x1 AND dim.TimeStamp >= @StartDateTime AND dim.TimeStamp <= @EndDateTime
UNION ALL
...
And if you don't know already, the UDF might be giving you weird readings in the execution plans. I don't know enough to give you accurate data, but you should search around to understand the limitations.
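To illustrate the conditional IF-based route mentioned above, a minimal sketch; the staging table #Results and its column list are my assumption, not taken from the question:
IF @DataSomeTableExists = 0x1
BEGIN
INSERT INTO #Results -- hypothetical temp table collecting the combined resultset
SELECT *
FROM REF_MinuteDimension AS dim
CROSS JOIN (SELECT * FROM #T WHERE StorageTable = 'DATA_SomeTable') AS T
CROSS APPLY dbo.fGetLastValueFromSomeTable(T.ParentId, dim.TimeStamp) dpp
WHERE dim.TimeStamp >= @StartDateTime AND dim.TimeStamp <= @EndDateTime
END
-- ...one IF block per source table, then: SELECT * FROM #Results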
Since your query is dependent on run-time variables, consider using dynamic SQL to create your query on the fly. This way you can include the tables you want and exclude the ones you don't want.
There are downsides to dynamic SQL, so read up on them before committing to it.
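A rough sketch of that idea, reusing the names from the question; the column list is illustrative, and only the branches whose helper bit is set get appended to the statement:
DECLARE @sql NVARCHAR(MAX) = N''

IF @DataSomeTableExists = 1
BEGIN
IF @sql <> N'' SET @sql = @sql + N' UNION ALL '
SET @sql = @sql + N'
SELECT dim.TimeStamp, T.ParentId, dpp.*
FROM REF_MinuteDimension AS dim
CROSS JOIN (SELECT * FROM #T WHERE StorageTable = ''DATA_SomeTable'') AS T
CROSS APPLY dbo.fGetLastValueFromSomeTable(T.ParentId, dim.TimeStamp) AS dpp
WHERE dim.TimeStamp >= @StartDateTime AND dim.TimeStamp <= @EndDateTime'
END

-- ...append the other branches the same way, then run the assembled statement;
-- #T is visible to sp_executesql because it runs on the same session
EXEC sp_executesql @sql,
N'@StartDateTime smalldatetime, @EndDateTime smalldatetime',
@StartDateTime, @EndDateTime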

Performing INSERT for each row in a select RESULT

First, a general description of the problem: I'm running a periodic process which updates total figures in a table. The issue is that multiple updates may be required in each execution of the process, and each execution depends on the previous results.
My question is, can it be done in a single SQL Server SP?
My code (I altered it a little to simplify the sample):
INSERT INTO CustomerMinuteSessions(time, customer, sessions, bytes, previousTotalSessions)
SELECT MS.time,
MS.customer,
MS.totalSessions,
MS.totalBytes,
CTS.previousTotalSessions
FROM (SELECT time, customer, SUM(sessions) as totalSessions, SUM(bytes) AS totalBytes
FROM MinuteSessions
WHERE time > @time
GROUP BY time, x) MS
CROSS APPLY TVF_GetPreviousCustomerTotalSessions(MS.customer) CTS
ORDER BY time
The previousTotalSessions column depends on other rows in UpdatedTable, and its value is retrieved by CROSS APPLYing TVF_GetPreviousCustomerTotalSessions; but if I execute the SP as-is, all the rows use the value retrieved by the function, without taking into account the rows added during the execution of the SP.
For the sake of completeness, here's TVF_GetPreviousCustomerTotalSessions:
FUNCTION [dbo].[TVF_GetPreviousCustomerTotalSessions]
(
@customerId int
)
RETURNS @result TABLE (PreviousNumberOfSessions int)
AS
BEGIN
INSERT INTO @result
SELECT TOP 1 (PreviousNumberOfSessions + Opened - Closed) AS PreviousNumberOfSessions
FROM CustomerMinuteSessions
WHERE CustomerId = @customerId
ORDER BY time DESC
IF @@rowcount = 0
INSERT INTO @result(PreviousNumberOfSessions) VALUES(0)
RETURN
END
What is the best way (i.e. without a loop, I guess...) to make previous rows available to subsequent rows within the query?
If you are using SQL 2005 or later, you can do it with a few CTEs in one shot. If you are using SQL 2000, you can use an inline table-valued function.
Personally I like the CTE approach more, so I'm including a schematic translation of your code to CTE syntax. (Bear in mind that I didn't prepare a test set to check it.)
WITH LastSessionByCustomer AS
(
SELECT CustomerID, MAX(Time) AS LastTime
FROM CustomerMinuteSessions
GROUP BY CustomerID
)
, GetPreviousCustomerTotalSessions AS
(
SELECT LastSession.CustomerID, LastSession.PreviousNumberOfSessions + LastSession.Opened - LastSession.Closed AS PreviousNumberOfSessions
FROM CustomerMinuteSessions LastSession
INNER JOIN LastSessionByCustomer ON LastSessionByCustomer.CustomerID = LastSession.CustomerID AND LastSessionByCustomer.LastTime = LastSession.Time
)
, MS AS
(
SELECT time, customer, SUM(sessions) as totalSessions, SUM(bytes) AS totalBytes
FROM MinuteSessions
WHERE time > @time
GROUP BY time, x
)
INSERT INTO CustomerMinuteSessions(time, customer, sessions, bytes, previousTotalSessions)
SELECT MS.time,
MS.customer,
MS.totalSessions,
MS.totalBytes,
ISNULL(GetPreviousCustomerTotalSessions.PreviousNumberOfSessions, 0)
FROM MS
LEFT JOIN GetPreviousCustomerTotalSessions ON MS.Customer = GetPreviousCustomerTotalSessions.CustomerID
Going a bit beyond your question, I think that your query with CROSS APPLY could do serious damage to the database once the CustomerMinuteSessions table grows.
I would add an index like the following to improve your chances of getting an index seek:
CREATE INDEX IX_CustomerMinuteSessions_CustomerId
ON CustomerMinuteSessions (CustomerId, [time] DESC, PreviousNumberOfSessions, Opened, Closed );

SQL Server and intermediate materialization?

After reading this interesting article about intermediate materialization, I still have some questions.
I have this query :
SELECT *
FROM ...
WHERE isnumeric(MyCol)=1 and ( CAST( MyCol AS int)>1)
However, the order in which the WHERE clause predicates are evaluated is not deterministic.
So I might get an exception here (if it tries to cast "k1k1" first).
I assume this will solve the problem:
SELECT MyCol
FROM
(SELECT TOP 100 PERCENT MyCol FROM MyTable WHERE ISNUMERIC(MyCol) = 1 ORDER BY MyCol) bar
WHERE
CAST(MyCol AS int) > 100
Why does adding TOP 100 PERCENT and ORDER BY change anything compared with my regular query?
I read in the comments :
(the "intermediate" result -- in other words, a result obtained during
the process, that will be used to calculate the final result) will be
physically stored ("materialized") in TempDB and used from there for
the remainder of the query, instead of being queried back from the base
tables.
What difference does it make if it is stored in tempdb or queried back from the base tables? It is the same data!
The supported way to avoid errors due to the optimizer reorganizing things is to use CASE:
SELECT *
FROM YourTable
WHERE
1 <=
CASE
WHEN aa NOT LIKE '%[^0-9]%'
THEN CONVERT(int, aa)
ELSE 0
END;
Intermediate materialization is not a supported technique, so it should only be employed by very expert users in special circumstances where the risks are understood and accepted.
TOP 100 PERCENT is generally ignored by the optimizer in SQL Server 2005 onward.
By adding the TOP clause into the inner query, you're forcing SQL Server to run that query first before it runs the outer query - thereby discarding all rows for which ISNUMERIC returns false.
Without the TOP clause, the optimiser can rewrite the query to be the same as your first query.

Optimizing ROW_NUMBER() in SQL Server

We have a number of machines which record data into a database at sporadic intervals. For each record, I'd like to obtain the time period between this recording and the previous recording.
I can do this using ROW_NUMBER as follows:
WITH TempTable AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Machine_ID ORDER BY Date_Time) AS Ordering
FROM dbo.DataTable
)
SELECT [Current].*, Previous.Date_Time AS PreviousDateTime
FROM TempTable AS [Current]
INNER JOIN TempTable AS Previous
ON [Current].Machine_ID = Previous.Machine_ID
AND Previous.Ordering = [Current].Ordering + 1
The problem is, it goes really slowly (several minutes on a table with about 10k entries). I tried creating separate indexes on Machine_ID and Date_Time, and a single combined index, but nothing helps.
Is there any way to rewrite this query to go faster?
The given ROW_NUMBER() partition and order require an index on (Machine_ID, Date_Time) to satisfy in one pass:
CREATE INDEX idxMachineIDDateTime ON DataTable (Machine_ID, Date_Time);
Separate indexes on Machine_ID and Date_Time will help little, if any.
How does it compare to this version?:
SELECT x.*
,(SELECT MAX(Date_Time)
FROM dbo.DataTable
WHERE Machine_ID = x.Machine_ID
AND Date_Time < x.Date_Time
) AS PreviousDateTime
FROM dbo.DataTable AS x
Or this version?:
SELECT x.*
,triang_join.PreviousDateTime
FROM dbo.DataTable AS x
INNER JOIN (
SELECT l.Machine_ID, l.Date_Time, MAX(r.Date_Time) AS PreviousDateTime
FROM dbo.DataTable AS l
LEFT JOIN dbo.DataTable AS r
ON l.Machine_ID = r.Machine_ID
AND l.Date_Time > r.Date_Time
GROUP BY l.Machine_ID, l.Date_Time
) AS triang_join
ON triang_join.Machine_ID = x.Machine_ID
AND triang_join.Date_Time = x.Date_Time
Both would perform best with an index on (Machine_ID, Date_Time), and for correct results I'm assuming that combination is unique.
You haven't mentioned what is hidden away in *, and that can sometimes mean a lot, since a (Machine_ID, Date_Time) index will not generally be covering; if you have a lot of columns there, or they hold a lot of data, ...
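If the handful of columns actually selected is known and small, a covering variant might look like this sketch (the INCLUDE list names are placeholders, not from the question):
CREATE INDEX idxMachineIDDateTime_Covering
ON dbo.DataTable (Machine_ID, Date_Time)
INCLUDE (SomeColumn1, SomeColumn2); -- placeholders for the columns the query actually returns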
If the number of rows in dbo.DataTable is large, then it is likely that you are experiencing the issue of the CTE self-joining onto itself. There is a blog post explaining the issue in some detail here.
Occasionally in such cases I have resorted to creating a temporary table, inserting the result of the CTE query into it, and then doing the joins against that temporary table (although this has usually been for cases where a large number of joins against the temp table are required; in the case of a single join the performance difference will be less noticeable).
I have had some strange performance problems using CTEs in SQL Server 2005. In many cases, replacing the CTE with a real temp table solved the problem.
I would try this before going any further with using a CTE.
I never found any explanation for the performance problems I've seen, and really didn't have any time to dig into the root causes. However I always suspected that the engine couldn't optimize the CTE in the same way that it can optimize a temp table (which can be indexed if more optimization is needed).
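A minimal sketch of that substitution for the query in this question; the clustered index on the temp table is my addition, on the assumption that the self-join is the expensive part:
SELECT *, ROW_NUMBER() OVER (PARTITION BY Machine_ID ORDER BY Date_Time) AS Ordering
INTO #Ordered
FROM dbo.DataTable

CREATE CLUSTERED INDEX ix_Ordered ON #Ordered (Machine_ID, Ordering)

SELECT [Current].*, Previous.Date_Time AS PreviousDateTime
FROM #Ordered AS [Current]
INNER JOIN #Ordered AS Previous
ON [Current].Machine_ID = Previous.Machine_ID
AND Previous.Ordering = [Current].Ordering + 1

DROP TABLE #Ordered
This way the row numbers are materialized once, and the self-join reads the indexed temp table instead of evaluating the CTE twice.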
Update
After your comment that this is a view, I would first test the query with a temp table to see if that performs better.
If it does, and using a stored proc is not an option, you might consider making the current CTE into an indexed/materialized view. You will want to read up on the subject before going down this road, as whether this is a good idea depends on a lot of factors, not the least of which is how often the data is updated.
What if you use a trigger to store the last timestamp and subtract each time to get the difference?
If you require this data often, rather than calculating it each time you pull the data, why not add a column and calculate/populate it whenever a row is added? (A sketch follows below.)
(Remus' compound index will make the query fast; running it only once should make it faster still.)
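A minimal sketch of that approach; the new PreviousDateTime column and the assumption that (Machine_ID, Date_Time) uniquely identifies a row are both mine, not from the question:
ALTER TABLE dbo.DataTable ADD PreviousDateTime DATETIME NULL
GO
CREATE TRIGGER trg_DataTable_PreviousDateTime ON dbo.DataTable
AFTER INSERT
AS
BEGIN
SET NOCOUNT ON
-- for each newly inserted row, look up the latest earlier recording for the same machine
UPDATE d
SET PreviousDateTime = (SELECT MAX(x.Date_Time)
FROM dbo.DataTable AS x
WHERE x.Machine_ID = d.Machine_ID
AND x.Date_Time < d.Date_Time)
FROM dbo.DataTable AS d
INNER JOIN inserted AS i
ON i.Machine_ID = d.Machine_ID
AND i.Date_Time = d.Date_Time
END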
