On localserver (a SQL Server 2008 R2), I have a synonym called syn_view1 pointing to the linked server remoteserver.remotedb.dbo.view1
This SLOW query takes 20 seconds to run.
select e.column1, e.column2
from syn_view1 e
where e.column3 = 'xxx'
and e.column4 = 'yyy'
order by e.column1
This FAST query takes 1 second to run.
select e.column1, e.column2
from remoteserver.remotedb.dbo.view1 e
where e.column3 = 'xxx'
and e.column4 = 'yyy'
order by e.column1
The only real difference between the two queries is the presence of the synonym, so the synonym clearly has an impact on query performance.
The execution plan for the SLOW query is :
Plan Cost % Subtree cost
4 SELECT
I/O cost: 0.000000 CPU cost: 0.000000 Executes: 0
Cost: 0.000000 0.00 3.3521
3 Filter
I/O cost: 0.000000 CPU cost: 0.008800 Executes: 1
Cost: 0.008800 0.26 3.3521
2 Compute Scalar
I/O cost: 0.000000 CPU cost: 3.343333 Executes: 1
Cost: 0.000000 0.00 3.3433
1 Remote Query
I/O cost: 0.000000 CPU cost: 3.343333 Executes: 1
Cost: 3.343333 99.74 3.3433
And for the FAST query:
Plan Cost % Subtree cost
3 SELECT
I/O cost: 0.000000 CPU cost: 0.000000 Executes: 0
Cost: 0.000000 0.00 0.1974
2 Compute Scalar
I/O cost: 0.000000 CPU cost: 0.197447 Executes: 1
Cost: 0.000000 0.00 0.1974
1 Remote Query
I/O cost: 0.000000 CPU cost: 0.197447 Executes: 1
Cost: 0.197447 100.00 0.1974
My understanding is that in the SLOW query the server fetches all the data from the remote server and then applies the filter locally (without an index), whereas in the FAST query the server fetches only the filtered data from the remote server, thus using the remote indexes.
Is there any way to use the synonym while keeping the query fast?
Perhaps a setting on the linked server, or on the local database server?
Thanks for the help!
I would dump the data, without the ORDER BY, into a temp table on the local server, then select from the temp table with the ORDER BY. The ORDER BY is almost always the killer.
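A minimal sketch of that approach, using the synonym and placeholder names from the question (the temp table name is mine):

```sql
-- Pull the filtered rows across the linked server without ORDER BY,
-- then sort locally. Column names are the placeholders from the question.
SELECT e.column1, e.column2
INTO #filtered
FROM syn_view1 e
WHERE e.column3 = 'xxx'
  AND e.column4 = 'yyy';

SELECT column1, column2
FROM #filtered
ORDER BY column1;

DROP TABLE #filtered;
```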
The accepted answer for this post on dba.stackexchange.com notes that performance gotchas may occur in queries over linked servers when access rights on the linked server are limited, restricting the visibility of table statistics to the local server. This can affect the query plan, and thus performance.
Excerpt:
And this is why I got different results. When running as sysadmin I got the full distribution statistics which indicated that there are no rows with order ID > 20000, and the estimate was one row. (Recall that the optimizer never assumes zero rows from statistics.) But when running as the plain user, DBCC SHOW_STATISTICS failed with a permission error. This error was not propagated, but instead the optimizer accepted that there were no statistics and used default assumptions. Since it did get cardinality information, it learnt that the remote table has 830 rows, whence the estimate of 249 rows.
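If limited permissions on the remote server turn out to be the cause, one possible fix is to map the linked-server login to a remote account that is allowed to read statistics (on SQL Server 2008 R2 that means ddl_admin, db_owner, or sysadmin on the remote database; from SQL Server 2012 SP1 onward, SELECT permission on the table is enough). A sketch, with a hypothetical remote account name:

```sql
-- Map all local logins to a remote account that can read statistics.
-- 'stats_reader' is a hypothetical remote login with sufficient rights.
EXEC sp_addlinkedsrvlogin
    @rmtsrvname  = 'remoteserver',
    @useself     = 'FALSE',
    @locallogin  = NULL,          -- applies to all local logins
    @rmtuser     = 'stats_reader',
    @rmtpassword = '********';
```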
Related
I am researching the Snowflake database and have a data aggregation use case where I need to expose the aggregated data via a REST API. While the data ingestion and aggregation seem well defined, is Snowflake a system that can be used as an operational data store for servicing high-throughput APIs, or is this an anti-pattern for this system?
Updating based on your recent comment.
Here are some quick test results from large tables we have in production. *Table names changed for display.
vLookupView records = 175,760,316
vMainView records = 179,035,026
SELECT
LP.REGIONCODE
, SUM(M.VALUE)
FROM DBO.vLookupView AS LP
INNER JOIN DBO.vMainView AS M
ON LP.PK = M.PK
GROUP BY LP.REGIONCODE;
Results:
SQL SERVER
Production box - 2:04 minutes
Snowflake:
By Warehouse (compute) size
XS - 17.1 seconds
Small - 9.9 seconds
Medium - 7.1 seconds
Large - 5.4 seconds
Extra Large - 5.4 seconds
When I added a WHERE condition
WHERE M.ENTEREDDATE BETWEEN '1/1/2018' AND '6/1/2018'
the results were:
SQL SERVER
Production box - 5 seconds
Snowflake:
By Warehouse (compute) size
XS - 12.1 seconds
Small - 3.9 seconds
Medium - 3.1 seconds
Large - 3.1 seconds
Extra Large - 3.1 seconds
I configured the SQL Server connector in Presto and tried a few simple queries like:
Select count(0) from table_name
or,
Select sum(column_name) from table_name
Both of the above queries run in SQL Server in 300 ms, but in Presto each runs for over 3 minutes.
This is the EXPLAIN ANALYZE of the second query (it seems to do a table scan and fetch a huge amount of data before doing the sum). Why couldn't it push the sum operator down to SQL Server itself?
Query Plan
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Fragment 1 [SINGLE]
Cost: CPU 2.98ms, Input: 1 row (9B), Output: 1 row (9B)
Output layout: [sum]
Output partitioning: SINGLE []
- Aggregate(FINAL) => [sum:double]
Cost: ?%, Output: 1 row (9B)
Input avg.: 1.00 lines, Input std.dev.: 0.00%
sum := "sum"("sum_4")
- LocalExchange[SINGLE] () => sum_4:double
Cost: ?%, Output: 1 row (9B)
Input avg.: 0.06 lines, Input std.dev.: 387.30%
- RemoteSource[2] => [sum_4:double]
Cost: ?%, Output: 1 row (9B)
Input avg.: 0.06 lines, Input std.dev.: 387.30%
Fragment 2 [SOURCE]
Cost: CPU 1.67m, Input: 220770667 rows (1.85GB), Output: 1 row (9B)
Output layout: [sum_4]
Output partitioning: SINGLE []
- Aggregate(PARTIAL) => [sum_4:double]
Cost: 0.21%, Output: 1 row (9B)
Input avg.: 220770667.00 lines, Input std.dev.: 0.00%
sum_4 := "sum"("total_base_dtd")
- TableScan[sqlserver:sqlserver:table_name:ivpSQLDatabase:table_name ..
Cost: 99.79%, Output: 220770667 rows (1.85GB)
Input avg.: 220770667.00 lines, Input std.dev.: 0.00%
total_base_dtd := JdbcColumnHandle{connectorId=sqlserver, columnName=total_base_dtd, columnType=double}
Both example queries are aggregate queries that produce single row result.
Currently, in Presto it is not possible to push down an aggregation to the underlying data store. Conditions and column selection (narrowing projections) are pushed down, but aggregations are not.
As a result, when you query SQL Server from Presto, Presto needs to read all the data (from the given column) to do the aggregation, so there is a lot of disk and network traffic. Also, it may be that SQL Server can optimize away certain aggregations and skip reading the data entirely (I am guessing here).
Presto is not suited to be a frontend to some other database. It can be used as such, but this has some implications. Presto shines when it is put to work as a big data query engine (over S3, HDFS or other object stores) or as a federated query engine, where you combine data from multiple data stores / connectors.
Edit: there is ongoing work in Presto to improve pushdown, including aggregation pushdown. You can track it at https://github.com/prestosql/presto/issues/18
Presto doesn't support aggregate pushdown, but as a workaround you can create views in the source database (SQL Server in your case) and query those views from Presto.
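For example, for the second query above you could wrap the aggregation in a view on the SQL Server side so that the sum is computed there. A sketch (view name is illustrative, and the Presto catalog/schema qualifiers depend on your connector configuration):

```sql
-- On SQL Server: the aggregation runs inside the view, on the server.
CREATE VIEW dbo.table_name_sum AS
SELECT SUM(total_base_dtd) AS total_base_dtd_sum
FROM dbo.table_name;

-- From Presto: only the single pre-aggregated row crosses the wire.
-- SELECT total_base_dtd_sum FROM sqlserver.dbo.table_name_sum;
```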
After upgrading server hardware (CPU + mainboard), I'm seeing a big increase in query duration for really small and simple queries.
Software: Windows Server 2012 R2 + SQL Server 2014
Storage: Samsung SSD 850 EVO 2TB Disk
Old Hardware: i7-4790k 4.0Ghz 4core cpu + Asus H97M-E mainboard + 32 GB DDR3
New Hardware: i9-7900X 3.60Ghz 10core cpu + Asus Prime X299 mainboard + 32 GB DDR4
Query Sample:
UPDATE CLIE_PRECIOS_COMPRA SET [c_res_tr] = '0.0' WHERE eje ='18' AND mes =8 AND dia =27 AND hor =19 AND unipro='001'
SQL Profiler Result :
Old Hardware - CPU: 0, Reads 4, Writes 0, Duration 123
New Hardware - CPU: 0, Reads 4, Writes 0, Duration 2852
I've checked that the network speed of both servers is the same, but in any case I'm running the queries directly on the server through Microsoft SQL Server Management Studio to rule out application or network issues.
I checked storage speed too; it is the same for both reading and writing on the old and new hardware.
I also played with parallelism and tried different scenarios, even disabling parallelism, with the same result.
Of course the data is the same, with the same copy of the SQL database on both machines.
I've set the duration to be shown in microseconds instead of milliseconds to better appreciate the difference.
The difference in duration for a single query is not really noticeable to the user, but the problem is that there are several thousand queries of this type, so the time increase matters.
Any hint or avenue to investigate would be really appreciated.
Current Execution Plan New Server: https://www.brentozar.com/pastetheplan/?id=HJYDtQQD7
Current Execution Plan Old Server: https://www.brentozar.com/pastetheplan/?id=SynyW4mPQ
Thanks in advance.
I have a table which contains nearly a million rows. Searching for a single value in it takes 5 seconds, and searching for around 500 values takes 15 seconds. This is quite a long time. Please let me know how I can optimize the query.
My query is:
select a,b,c,d from table where a in ('a1','a2')
Job id : stable-apogee-119006:job_ClLDIUSdDLYA6tC2jfC5GxBXmv0
I'm not sure what you mean by "500 it takes 15 secs", but I ran some tests against our database trying to simulate what you are running, and I got results similar to yours
(my query is slower than yours as it has a join operation, but still, here we go):
SELECT
a.fv fv,
a.v v,
a.sku sku,
a.pp pp from(
SELECT
fullvisitorid fv,
visitid v,
hits.product.productsku sku,
hits.page.pagepath pp
FROM (TABLE_DATE_RANGE([40663402.ga_sessions_], DATE_ADD(CURRENT_DATE(), -3, 'day'), DATE_ADD(CURRENT_DATE(), -3, 'day')))
WHERE
1 = 1 ) a
JOIN EACH (
SELECT
fullvisitorid fv,
FROM (TABLE_DATE_RANGE([40663402.ga_sessions_], DATE_ADD(CURRENT_DATE(), -3, 'day'), DATE_ADD(CURRENT_DATE(), -3, 'day')))
GROUP EACH BY
fv
LIMIT
1 ) b
ON
a.fv = b.fv
Querying for just one day and bringing just one fullvisitor took BQ roughly 5 secs to process 1.7 GBs.
And when I ran the same query for the last month and removed the limit operator it took ~10s to process ~56GB of data (around 33 million rows):
This is insanely fast.
So you might have to evaluate your project specs. If 5 secs is still too much for you then maybe you'll need to find some other strategy in your architecture that suits you best.
BigQuery does consume seconds to process its demands but it's also ready to process hundreds of Gigas still in seconds.
If your project data consumption is expected to grow and you will start processing millions of rows then you might evaluate if waiting a few secs is still acceptable in your application.
Other than that, as far as your query goes, I don't think there's much optimization left to improve its performance.
(ps: I decided to run for 100 days and it processed around 100 GBs in 14s.)
I have a Lucene index with close to 480M documents. The size of the index is 36 GB. I ran around 10,000 queries against the index. Each query is a boolean AND query with three term queries inside; that is, the query has three operands which MUST occur. Executing such three-word queries gives the following latency percentiles.
50th = 16 ms
75th = 52 ms
90th = 121 ms
95th = 262 ms
99th = 76010 ms
99.9th = 76037 ms
Is the latency expected to degrade when the number of docs is as high as 480M? All the segments in the index are merged into one segment. Even when the segments are not merged, the latencies are not very different. Each document has 5-6 stored fields. But as mentioned above, the above latencies are for boolean queries that don't access any stored fields, but just do a posting list lookup on 3 tokens.
Any ideas on what could be wrong here?