Postgres: How to speed up a select statement on a partitioned table

I'm looking for ways that I can reduce the time to run select statements in my data warehouse.
We are currently running Postgres Enterprise 9.3.4.10 with the intention of upgrading to 9.6 within the next few months.
There is a fact table with about 95 million rows and b-tree indexes on all of the foreign key / id columns. The foreign key columns are a mix of smallint / integer data types, and they are all single-column indexes. There are also some measures, such as dollar amounts, that we use for aggregation (sum / avg); these fields are NOT indexed. The table is updated on a daily basis using JDBC insert/update statements via Pentaho. The table is also partitioned by activity_date_key.
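For context, the partition names in the plan further down suggest a 9.3-style inheritance layout roughly like the following sketch (a hypothetical illustration only; the child-table DDL and check-constraint bounds are assumptions, not the poster's actual definitions):
CREATE TABLE fact_gc_activity_201606 (
    CHECK (activity_date_key >= 20160601 AND activity_date_key <= 20160630)
) INHERITS (fact_gc_activity);
-- one child table per month, each constrained to its own range of
-- activity_date_key so constraint exclusion can skip irrelevant partitions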
Running "select sum(amount) from table where date between 20160701 and 20160801" runs in 0.2 seconds but if I increase that timeframe to be between 20160101 and 20160801 the run time jumps up to 70 seconds. (The date field is of type integer).
I'm looking for some ideas on what I can do to reduce this time. Possibly different types of indexes? I read that 9.6 comes with BRIN (block range) indexes, but I'm not sure if that will help me. Are there any database config parameters that I can safely tweak? Maybe my problem is just too much data in general? Any tips are welcome, and let me know if you need any more info on my environment. Thank you.
Ryan

Here is some additional info on my setup and some answers to questions so far:
Yes, there is an index on activity_date_key. There are 16 b-tree indexes in total, which represent the foreign keys.
I do want to consider BRIN indexes since we are going to upgrade to 9.6. Is there any reason to believe they would be useful in conjunction with a table partitioned by date key that is already b-tree indexed on the same key?
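For reference, a BRIN index on the date key would look roughly like the sketch below (the index name and pages_per_range value are arbitrary); whether it beats the existing b-tree depends on how well the rows are physically ordered by date within each partition:
-- BRIN is available from PostgreSQL 9.5, so this only applies after the upgrade
CREATE INDEX fact_gc_activity_201606_adk_brin
    ON fact_gc_activity_201606
    USING brin (activity_date_key)
    WITH (pages_per_range = 64);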
Hardware configuration: 2 CPUs with 16 cores each (vendor_id: AuthenticAMD, cpu family: 21, model: 2, model name: AMD Opteron(tm) Processor 6378, cpu MHz: 2400)
192 GB of RAM, SAS disks on a NetApp storage array
There are 18 other databases on this server - it is very overloaded.
Explain (analyze, buffers) output for select count(*) from fact_gc_activity where activity_date_key between 20160101 and 20160801:
"Aggregate (cost=372804.89..372804.90 rows=1 width=0) (actual time=121284.418..121284.418 rows=1 loops=1)" " Buffers: shared hit=72295 read=47564" " -> Append (cost=0.00..347567.97 rows=10094768 width=0) (actual time=350.581..118020.641 rows=10023451 loops=1)" " Buffers: shared hit=72295 read=47564" " -> Seq Scan on fact_gc_activity (cost=0.00..0.00 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1)" " Filter: ((activity_date_key >= 20160101) AND (activity_date_key <= 20160801))" " -> Index Only Scan using fact_gc_activity_201601_activity_date_key_idx on fact_gc_activity_201601 (cost=0.43..54563.87 rows=1748572 width=0) (actual time=350.577..5734.825 rows=1748572 loops=1)" " Index Cond: ((activity_date_key >= 20160101) AND (activity_date_key <= 20160801))" " Heap Fetches: 0" " Buffers: shared hit=2 read=4880" " -> Index Only Scan using fact_gc_activity_201602_activity_date_key_idx on fact_gc_activity_201602 (cost=0.43..41641.89 rows=1331873 width=0) (actual time=183.811..3071.842 rows=1331873 loops=1)" " Index Cond: ((activity_date_key >= 20160101) AND (activity_date_key <= 20160801))" " Heap Fetches: 0" " Buffers: shared hit=2 read=3737" " -> Index Only Scan using fact_gc_activity_201603_activity_date_key_idx on fact_gc_activity_201603 (cost=0.43..45535.79 rows=1456368 width=0) (actual time=171.306..3069.430 rows=1456368 loops=1)" " Index Cond: ((activity_date_key >= 20160101) AND (activity_date_key <= 20160801))" " Heap Fetches: 0" " Buffers: shared hit=2 read=4086" " -> Index Only Scan using fact_gc_activity_201604_activity_date_key_idx on fact_gc_activity_201604 (cost=0.43..34124.85 rows=1088621 width=0) (actual time=179.636..4707.148 rows=1088621 loops=1)" " Index Cond: ((activity_date_key >= 20160101) AND (activity_date_key <= 20160801))" " Heap Fetches: 0" " Buffers: shared hit=2 read=3076" " -> Seq Scan on fact_gc_activity_201605 (cost=0.00..43008.71 rows=1116181 width=0) (actual time=157.550..13863.706 rows=1094721 loops=1)" " Filter: ((activity_date_key >= 20160101) AND (activity_date_key <= 20160801))" " Buffers: shared hit=5374 read=20892" " -> Index Only Scan using fact_gc_activity_201608_activity_date_key_idx on fact_gc_activity_201608 (cost=0.43..1281.07 rows=37832 width=0) (actual time=0.058..19.571 rows=34741 loops=1)" " Index Cond: ((activity_date_key >= 20160101) AND (activity_date_key <= 20160801))" " Heap Fetches: 0" " Buffers: shared hit=124" " -> Seq Scan on fact_gc_activity_201607 (cost=0.00..64043.08 rows=1670672 width=0) (actual time=0.029..765.421 rows=1653358 loops=1)" " Filter: ((activity_date_key >= 20160101) AND (activity_date_key <= 20160801))" " Buffers: shared hit=38983" " -> Seq Scan on fact_gc_activity_201606 (cost=0.00..63368.72 rows=1644648 width=0) (actual time=0.030..81989.732 rows=1615197 loops=1)" " Filter: ((activity_date_key >= 20160101) AND (activity_date_key <= 20160801))" " Buffers: shared hit=27806 read=10893" "Total runtime: 121284.642 ms"
Configuration settings:
- shared_buffers: 4GB
- effective_cache_size: 2049MB
- work_mem: 10MB
- default_statistics_target: 100

Try tuning your PostgreSQL: in postgresql.conf you will find a lot of parameters that can speed it up. For queries there is work_mem, but be careful with the number of connections you have. pgtune will give you an idea of how to set the parameters, or read this article for a clearer picture.
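For illustration only, the parameters usually touched first in postgresql.conf look like this; the values below are placeholders, not a recommendation for this particular shared and overloaded server:
shared_buffers = 16GB             # often around 25% of RAM on a dedicated host
effective_cache_size = 96GB       # rough estimate of memory available for caching
work_mem = 64MB                   # per sort/hash operation, per connection
default_statistics_target = 200   # larger planner sample for big tables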

Related

Postgres 13.1: Slow execution of SQL query with inner select or join

I have a database with ~2.5M persons and ~3M pets. Each pet belongs to a person.
There are 32'300 cats (kind = 3) in the database.
Is there a way to speed up the following statement?
SELECT p.name
FROM person p
WHERE p.id IN (
SELECT pet.person_id
FROM pet pet
WHERE pet.kind = 3 -- cat
)
The execution plan is as follows:
Nested Loop (cost=898.14..66545.89 rows=32005 width=37) (actual time=22.546..12407.109 rows=32300 loops=1)
Output: p.name
Inner Unique: true
Buffers: shared hit=85815 read=43622
-> HashAggregate (cost=897.71..1218.20 rows=32049 width=8) (actual time=21.702..42.032 rows=32300 loops=1)
Output: pet.person_id
Group Key: pet.person_id
Batches: 1 Memory Usage: 3089kB
Buffers: shared hit=2 read=235
-> Index Only Scan using petx1 on pet pet (cost=0.43..817.59 rows=32049 width=8) (actual time=1.085..10.478 rows=32300 loops=1)
Output: pet.kind, pet.person_id
Index Cond: (pet.kind = '3'::bigint)
Heap Fetches: 2
Buffers: shared hit=2 read=235
-> Index Scan using personxpk on person p (cost=0.43..2.06 rows=1 width=45) (actual time=0.382..0.382 rows=1 loops=32300)
Output: p.name, p.id
Index Cond: (p.id = pet.person_id)
Buffers: shared hit=85813 read=43387
Planning:
Buffers: shared hit=444 read=64
Planning Time: 26.404 ms
Execution Time: 12413.696 ms
I already have the following indexes.
I don't have an index for the person's name because I need to select more attributes later from the person and an index for each attribute seems excessive.
CREATE UNIQUE INDEX personxpk ON person USING btree (person_id)
CREATE INDEX petx1 ON pet USING btree (kind, person_id)
I already tried to rewrite the SQL as follows.
This doubles the execution speed (probably because of the two parallel workers), but it is still not the desired speed.
SELECT p.name
FROM person p
JOIN pet pet ON pet.person_id = p.id AND pet.kind = 3 -- cat
Here is the execution plan for this statement:
Gather (cost=1000.86..32294.58 rows=32005 width=37) (actual time=2.663..4303.776 rows=32300 loops=1)
Output: p.name
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=85819 read=43622
-> Nested Loop (cost=0.86..28094.08 rows=13335 width=37) (actual time=1.850..4272.721 rows=10767 loops=3)
Output: p.name
Inner Unique: true
Buffers: shared hit=85819 read=43622
Worker 0: actual time=1.787..4274.702 rows=10780 loops=1
Buffers: shared hit=28696 read=14503
Worker 1: actual time=1.734..4271.052 rows=10780 loops=1
Buffers: shared hit=28598 read=14602
-> Parallel Index Only Scan using petx1 on pet pet (cost=0.43..630.63 rows=13354 width=8) (actual time=0.946..8.839 rows=10767 loops=3)
Output: pet.kind, pet.person_id
Index Cond: (pet.kind = '3'::bigint)
Heap Fetches: 2
Buffers: shared hit=4 read=235
Worker 0: actual time=0.966..8.497 rows=10780 loops=1
Buffers: shared hit=1 read=77
Worker 1: actual time=0.831..9.440 rows=10780 loops=1
Buffers: shared hit=1 read=78
-> Index Scan using personxpk on person p (cost=0.43..2.06 rows=1 width=45) (actual time=0.395..0.395 rows=1 loops=32300)
Output: p.name, p.id
Index Cond: (p.id = pet.person_id)
Buffers: shared hit=85815 read=43387
Worker 0: actual time=0.394..0.394 rows=1 loops=10780
Buffers: shared hit=28695 read=14426
Worker 1: actual time=0.394..0.394 rows=1 loops=10780
Buffers: shared hit=28597 read=14524
Planning:
Buffers: shared hit=450 read=58
Planning Time: 27.957 ms
Execution Time: 6307.435 ms
Some observations:
This is a bulk data query. It's designed to retrieve tens of thousands of rows in its result set. Bulk data handling takes some time even with the most efficient of queries. Your client program, the one issuing the query, must ingest all those rows and do something with them.
You used a covering index (kind, person_id) on your pet table to good effect.
You could try a similar covering index on your person table.
CREATE INDEX id_name ON person USING BTREE (id, name);
This might help your query time a bit, because it can be satisfied directly from the index. But see my first observation.
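If you would rather keep the index key narrow, a roughly equivalent sketch on PostgreSQL 11+ carries name as a non-key column (the index name here is made up):
CREATE INDEX person_id_incl_name ON person USING btree (id) INCLUDE (name);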
Why did you make these two indexes unique? This question comes from a guy with two (very lazy) cats. But your indexes say I may only have one.
CREATE UNIQUE INDEX petxpk ON pet USING BTREE (person_id, kind)
CREATE UNIQUE INDEX petxpk2 ON pet USING BTREE (kind, person_id)
Of the following two indexes, the second is made redundant by the first:
CREATE UNIQUE INDEX petxpk2 ON pet USING BTREE (kind, person_id)
CREATE INDEX petx1 ON pet USING BTREE (kind)
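Assuming petx1 really is that single-column index on (kind), a hypothetical cleanup is simply to drop it:
DROP INDEX IF EXISTS petx1;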

PostgreSQL index best practices / performance

I have a test PostgreSQL 10.13 instance and I'm looking for several answers about indexes. Let's take this table as an example and assume there are 40k entries in it:
CREATE TABLE "public"."ecom_input"
(
"id" serial PRIMARY KEY,
"ecom_id" INTEGER NOT NULL,
"sku_supplier" CHARACTER VARYING(100) NOT NULL
);
Does a PRIMARY KEY have its own INDEX automatically?
With this query :
EXPLAIN SELECT * FROM ecom_input WHERE id = 27846;
I have the same results whether using the PRIMARY KEY or the INDEX :
--> Index Scan using ecom_input_pkey on ecom_input (cost=0.29..8.31 rows=1 width=853)
--> Index Scan using "ecom_input_id_idx" on ecom_input (cost=0.29..8.31 rows=1 width=853)
After creating a multi column INDEX on the same table :
CREATE INDEX idx_ecom_input ON ecom_input (ecom_id, sku_supplier);
Try :
EXPLAIN SELECT * FROM ecom_input WHERE ecom_id = 22 AND sku_supplier = 'MATHILDEJAS';
I have good cost performance:
--> Index Scan using idx_ecom_input on ecom_input (cost=0.41..8.43 rows=1 width=853)
Losing a bit of performance when using only WHERE sku_supplier = 'MATHILDEJAS':
--> Index Scan using idx_ecom_input on ecom_input (cost=0.41..1044.64 rows=1 width=853)
But when using only WHERE ecom_id = 22:
--> Bitmap Heap Scan on ecom_input (cost=5.58..547.25 rows=150 width=853)
Recheck Cond: (ecom_id = 22)
-> Bitmap Index Scan on idx_ecom_input (cost=0.00..5.54 rows=150 width=0)
Index Cond: (ecom_id = 22)
Why does the optimizer use a Bitmap Heap Scan when only ecom_id is in the WHERE clause, but not when sku_supplier is? There are 150 rows where ecom_id = 22 but only 1 row where sku_supplier = 'MATHILDEJAS'; is this the reason?
We usually JOIN on the 'id' column of our tables (which are PRIMARY KEYs). Is an INDEX used during JOIN operations? If yes, is it good practice to always JOIN on a PRIMARY KEY?
At what point (approximately) do we see a real performance difference between querying with an INDEX and without? We are using cloud-based databases, and even though the network is good, the server-client round trip over the internet is what takes the longest.
Example:
EXPLAIN ANALYZE SELECT * FROM ecom_input WHERE sku_supplier = 'MATHILDEJAS' AND ecom_id = 22;
With INDEX :
"Index Scan using idx_ecom_input on ecom_input (cost=0.41..8.43 rows=1 width=853) (actual time=0.017..0.018 rows=1 loops=1)"
"Index Cond: ((ecom_id = 22) AND ((sku_supplier)::text = 'MATHILDEJAS'::text))"
"Planning time: 0.086 ms"
"Execution time: 0.034 ms"
Without INDEX :
"Seq Scan on ecom_input (cost=0.00..10034.43 rows=1 width=853) (actual time=0.006..13.555 rows=1 loops=1)"
"Filter: (((sku_supplier)::text = 'MATHILDEJAS'::text) AND (ecom_id = 22))"
"Rows Removed by Filter: 40561"
"Planning time: 0.097 ms"
"Execution time: 13.572 ms"
We gain 13 ms in total. These requests take 250-700 ms to go to the server and back, so it is not really noticeable on a 40k-entry table. Do you know at roughly how many entries an INDEX becomes useful?
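One rough way to answer that last question empirically is to load synthetic rows and compare EXPLAIN ANALYZE with and without the index; the sketch below uses arbitrary sizes and made-up values:
-- generate one million throwaway rows, refresh statistics, then re-run
-- the EXPLAIN ANALYZE comparisons above at different table sizes
INSERT INTO ecom_input (ecom_id, sku_supplier)
SELECT (random() * 1000)::int, md5(random()::text)
FROM generate_series(1, 1000000);
ANALYZE ecom_input;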

PostGIS: optimised way to find intersection between polygon and circle

I am trying to find the intersection between incidents (polygons) and watchzones (circles - points and a radius) using PostGIS. The baseline data is going to be somewhere over 10,000 polygons and 500,000 circles. Also, I am quite new to PostGIS.
I have tried a few things, but the execution is taking quite long. Can someone please suggest any optimisations or a better way using PostGIS only? Here is what I have tried -
1. Using Geometry datatype:
I have stored the incidents and watchzones in type geometry.
created GIST index on them, used ST_DWITHIN to find the intersection.
The output with 1 incident and 500,000 watchzones took about 6.750 sec. Here the time taken is acceptable, but the problem is that I have the radius in meters, while with the geometry type ST_DWithin requires it to be in SRID units. I am unable to figure out this conversion.
CREATE TABLE incident (
incident_id SERIAL NOT NULL,
incident_name VARCHAR(20),
incident_span GEOMETRY(POLYGON, 4326),
CONSTRAINT incident_id PRIMARY KEY (incident_id)
);
CREATE TABLE watchzones (
id SERIAL NOT NULL,
date_created timestamp with time zone DEFAULT now(),
latitude NUMERIC(10, 7) DEFAULT NULL,
Longitude NUMERIC(10, 7) DEFAULT NULL,
radius integer,
position GEOMETRY(POINT, 4326),
CONSTRAINT id PRIMARY KEY (id)
);
CREATE INDEX ix_spatial_geom on watchzones using gist(position);
CREATE INDEX ix_spatial_geom_1 on incident using gist(incident_span);
Insert into incident values (
1,
'test',
ST_GeomFromText('POLYGON((152.945470916 -29.212227933,152.942130026 -29.213431145,152.939345911 -29.2125423759999,152.935144791 -29.21454003,152.933185494 -29.2135838469999,152.929481762 -29.216065516,152.929698621 -29.217402937,152.927245999
-29.219576,152.921539 -29.217676,152.918487996 -29.2113786959999,152.919254355 -29.206029929,152.919692387 -29.2027824419999,152.936020197 -29.207567346,152.944901258 -29.207729953,152.945470916
-29.212227933))',
4326
)
);
insert into watchzones
SELECT generate_series(1, 500000) AS id,
now(),
-29.21073,
152.93322,
'50',
ST_GeomFromText('POINT( 152.93322 -29.21073)', 4326);
explain analyze SELECT wz.id,
i.incident_id
FROM watchzones wz,
incident i
WHERE ST_DWithin(incident_span,position,wz.radius);
"Nested Loop (cost=0.14..227467.00 rows=42 width=8) (actual time=0.142..1506.476 rows=500000 loops=1)"
" -> Seq Scan on watchzones wz (cost=0.00..11173.00 rows=500000 width=40) (actual time=0.109..47.822 rows=500000 loops=1)"
" -> Index Scan using ix_spatial_geom_1 on incident i (cost=0.14..0.42 rows=1 width=284) (actual time=0.002..0.002 rows=1 loops=500000)"
" Index Cond: (incident_span && st_expand(wz."position", (wz.radius)::double precision))"
" Filter: ((wz."position" && st_expand(incident_span, (wz.radius)::double precision)) AND _st_dwithin(incident_span, wz."position", (wz.radius)::double precision))"
"Planning time: 0.150 ms"
"Execution time: 1523.312 ms"
2. Using Geography data type:
The output with 1 incident and 500,000 watchzones took about 29.987 sec here, which is quite slow. Please note that I have tried this with both GiST and BRIN indexes and also ran VACUUM ANALYZE on the tables.
CREATE TABLE watchzones_geog
(
id SERIAL PRIMARY KEY,
date_created TIMESTAMP with time zone DEFAULT now(),
latitude NUMERIC(10, 7) DEFAULT NULL,
longitude NUMERIC(10, 7) DEFAULT NULL,
radius INTEGER,
position geography(point)
);
CREATE INDEX watchzones_geog_gix ON watchzones_geog USING GIST (position);
insert into watchzones_geog
SELECT generate_series(1,500000) AS id, now(),-29.21073,152.93322,'50',ST_GeogFromText('POINT(152.93322 -29.21073)');
CREATE TABLE incident_geog (
incident_id SERIAL PRIMARY KEY,
incident_name VARCHAR(20),
incident_span GEOGRAPHY(POLYGON)
);
CREATE INDEX incident_geog_gix ON incident_geog USING GIST (incident_span);
Insert into incident_geog values (1,'test', ST_GeogFromText
('POLYGON((152.945470916 -29.212227933,152.942130026 -29.213431145,152.939345911 -29.2125423759999,152.935144791 -29.21454003,152.933185494 -29.2135838469999,152.929481762 -29.216065516,152.929698621 -29.217402937,152.927245999
-29.219576,152.921539 -29.217676,152.918487996 -29.2113786959999,152.919254355 -29.206029929,152.919692387 -29.2027824419999,152.936020197 -29.207567346,152.944901258 -29.207729953,152.945470916
-29.212227933))'));
explain analyze SELECT i.incident_id,
wz.id
FROM watchzones_geog wz,
incident_geog i
WHERE St_dwithin(position, incident_span, radius);
"Nested Loop (cost=0.27..348717.00 rows=17 width=8) (actual time=0.277..18551.844 rows=500000 loops=1)"
" -> Seq Scan on watchzones_geog wz (cost=0.00..11173.00 rows=500000 width=40) (actual time=0.102..50.052 rows=500000 loops=1)"
" -> Index Scan using incident_geog_gix on incident_geog i (cost=0.27..0.67 rows=1 width=711) (actual time=0.036..0.036 rows=1 loops=500000)"
" Index Cond: (incident_span && _st_expand(wz."position", (wz.radius)::double precision))"
" Filter: ((wz."position" && _st_expand(incident_span, (wz.radius)::double precision)) AND _st_dwithin(wz."position", incident_span, (wz.radius)::double precision, true))"
"Planning time: 0.155 ms"
"Execution time: 18587.041 ms"
3. I have also tried creating a circle using ST_Buffer(position, radius,'quad_segs=8') and then using ST_Intersects. With this the query takes more than a minute with both geometry and geography data types.
Would be great if someone can suggest a better way or some optimisations which would speed up the execution.
Thanks
The query is fine, but your sample data is wrong. First, note that a query optimized for 1 polygon might not be the same as one optimized for several thousand.
The main issue is with the sample points. As it stands, you have 500,000 points at the exact same location, so depending on the intersecting polygon, the query will return either 0 or 500,000 results. PostGIS starts by using the index to intersect points/polygons using a bounding box, and then refines the results by computing the true distance. With your sample, it has to compute the distance 500,000 times, which is slow.
Using a point layer with random locations (within 1 degree), the query takes less than 1 second, as it has to compute the distance for only 20 locations.
INSERT INTO watchzones_geog
SELECT generate_series(1,500000) AS id, now(),0,0,'50',
ST_makePoint(152.93322+random(),-29.21073+random())::geography;
explain analyze SELECT i.incident_id,
wz.id
FROM watchzones_geog wz,
incident_geog i
WHERE St_dwithin(position, incident_span, radius);
Nested Loop (cost=0.00..272424.01 rows=1 width=8) (actual time=25.956..921.846 rows=20 loops=1)
Join Filter: ((wz."position" && _st_expand(i.incident_span, (wz.radius)::double precision)) AND (i.incident_span && _st_expand(wz."position", (wz.radius)::double precision)) AND _st_dwithin(wz."position", i.incident_span, (wz.radius)::double precision, true))
Rows Removed by Join Filter: 499980
-> Seq Scan on incident_geog i (cost=0.00..1.01 rows=1 width=36) (actual time=0.009..0.009 rows=1 loops=1)
-> Seq Scan on watchzones_geog wz (cost=0.00..11173.00 rows=500000 width=40) (actual time=0.006..65.625 rows=500000 loops=1)
Planning time: 1.887 ms
Execution time: 921.895 ms
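If you reproduce this with your own random sample, it is also worth refreshing statistics after the bulk insert before comparing plans; a small housekeeping sketch:
VACUUM ANALYZE watchzones_geog;
VACUUM ANALYZE incident_geog;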

postgresql 9.6.4: timestamp range query on large table takes forever

I need some help analyzing the bad performance of a query executed on a large table containing 83,660,142 rows, where computing the result takes from 25 minutes to more than one hour, depending on the system load.
I've created the following table that consists of a composite key and 3 indexes:
CREATE TABLE IF NOT EXISTS ds1records(
userid INT DEFAULT 0,
clientid VARCHAR(255) DEFAULT '',
ts TIMESTAMP,
site VARCHAR(50) DEFAULT '',
code VARCHAR(400) DEFAULT '');
CREATE UNIQUE INDEX IF NOT EXISTS primary_idx ON records (userid, clientid, ts, site, code);
CREATE INDEX IF NOT EXISTS userid_idx ON records (userid);
CREATE INDEX IF NOT EXISTS ts_idx ON records (ts);
CREATE INDEX IF NOT EXISTS userid_ts_idx ON records (userid ASC,ts DESC);
In a spring batch application I'm executing a query that looks as follows:
SELECT *
FROM records
WHERE userid = ANY(VALUES (2), ..., (96158 more userids) )
AND ( ts < '2017-09-02' AND ts >= '2017-09-01'
OR ts < '2017-08-26' AND ts >= '2017-08-25'
OR ts < '2017-08-19' AND ts >= '2017-08-18'
OR ts < '2017-08-12' AND ts >= '2017-08-11')
The user IDs are determined at runtime (the number of IDs lies between 95,000 and 110,000). For each user I need to extract the page views of the current day and of the same weekday in the previous three weeks. The query always returns between 3 and 4 million rows.
Executing the query with the EXPLAIN ANALYZE option returns the following execution plan.
Nested Loop (cost=1483.40..1246386.43 rows=3761735 width=70) (actual time=108.856..1465501.596 rows=3643240 loops=1)
-> HashAggregate (cost=1442.38..1444.38 rows=200 width=4) (actual time=33.277..201.819 rows=96159 loops=1)
Group Key: "*VALUES*".column1
-> Values Scan on "*VALUES*" (cost=0.00..1201.99 rows=96159 width=4) (actual time=0.006..11.599 rows=96159 loops=1)
-> Bitmap Heap Scan on records (cost=41.02..6224.01 rows=70 width=70) (actual time=8.865..15.218 rows=38 loops=96159)
Recheck Cond: (userid = "*VALUES*".column1)
Filter: (((ts < '2017-09-02 00:00:00'::timestamp without time zone) AND (ts >= '2017-09-01 00:00:00'::timestamp without time zone)) OR ((ts < '2017-08-26 00:00:00'::timestamp without time zone) AND (ts >= '2017-08-25 00:00:00'::timestamp without time zone)) OR ((ts < '2017-08-19 00:00:00'::timestamp without time zone) AND (ts >= '2017-08-18 00:00:00'::timestamp without time zone)) OR ((ts < '2017-08-12 00:00:00'::timestamp without time zone) AND (ts >= '2017-08-11 00:00:00'::timestamp without time zone)))
Rows Removed by Filter: 792
Heap Blocks: exact=77251145
-> Bitmap Index Scan on userid_ts_idx (cost=0.00..41.00 rows=1660 width=0) (actual time=6.593..6.593 rows=830 loops=96159)
Index Cond: (userid = "*VALUES*".column1)
I've adjusted the values of some Postgres tuning parameters (unfortunately with no success):
effective_cache_size=15GB (probably useless as query is executed just once)
shared_buffers=15GB
work_mem=3GB
The application runs computationally expensive tasks (e.g. data fusion/data injection) and consumes roughly 100GB memory, so the system hardware is sufficiently dimensioned with 125GB RAM and 16 cores (OS: Debian).
I'm wondering why Postgres is not using the combined index userid_ts_idx in its execution plan. Since the timestamp column in the index is sorted in reverse order, I would expect Postgres to use this to find matching tuples for the range part of the query, as it could sequentially go through the index until the condition ts < '2017-09-02 00:00:00' holds true and return all values until the condition ts >= '2017-09-01 00:00:00' is met. Instead, Postgres uses the expensive Bitmap Heap Scan, which does a linear table scan if I understood correctly. Did I misconfigure the db settings or do I have a conceptual misunderstanding?
Update
The CTE as suggested in the comments unfortunately did not bring any improvement. The Bitmap Heap Scan has been replaced by a Sequential Scan, but the performance is still poor. Following is the updated execution plan:
Merge Join (cost=20564929.37..20575876.60 rows=685277 width=106) (actual time=2218133.229..2222280.192 rows=3907472 loops=1)
Merge Cond: (ids.id = r.userid)
Buffers: shared hit=2408684 read=181785
CTE ids
-> Values Scan on "*VALUES*" (cost=0.00..1289.70 rows=103176 width=4) (actual time=0.002..28.670 rows=103176 loops=1)
CTE ts
-> Values Scan on "*VALUES*_1" (cost=0.00..0.05 rows=4 width=32) (actual time=0.002..0.004 rows=4 loops=1)
-> Sort (cost=10655.37..10913.31 rows=103176 width=4) (actual time=68.476..83.312 rows=103176 loops=1)
Sort Key: ids.id
Sort Method: quicksort Memory: 7909kB
-> CTE Scan on ids (cost=0.00..2063.52 rows=103176 width=4) (actual time=0.007..47.868 rows=103176 loops=1)
-> Sort (cost=20552984.25..20554773.54 rows=715717 width=102) (actual time=2218059.941..2221230.585 rows=8085760 loops=1)
Sort Key: r.userid
Sort Method: quicksort Memory: 1410084kB
Buffers: shared hit=2408684 read=181785
-> Nested Loop (cost=0.00..20483384.24 rows=715717 width=102) (actual time=885849.043..2214665.723 rows=8085767 loops=1)
Join Filter: (ts.r #> r.ts)
Rows Removed by Join Filter: 707630821
Buffers: shared hit=2408684 read=181785
-> Seq Scan on records r (cost=0.00..4379760.52 rows=178929152 width=70) (actual time=0.024..645616.135 rows=178929147 loops=1)
Buffers: shared hit=2408684 read=181785
-> CTE Scan on ts (cost=0.00..0.08 rows=4 width=32) (actual time=0.000..0.000 rows=4 loops=178929147)
Planning time: 126.110 ms
Execution time: 2222514.566 ms
You should get a different plan if you cast that timestamp to date and filter by a value list instead.
CREATE INDEX IF NOT EXISTS userid_ts_idx ON records (userid ASC,cast(ts AS date) DESC);
SELECT *
FROM records
WHERE userid = ANY(VALUES (2), ..., (96158 more userids) )
AND cast(ts AS date) IN('2017-09-01','2017-08-25','2017-08-18','2017-08-11');
Whether it will perform better depends on your data and date range, since I found in my case that Postgres kept using that index even when the date values covered the whole table (so a seq scan would have been better).
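One hedged follow-up: an expression index such as cast(ts AS date) only gets its own statistics once the table is analyzed, so refresh the stats before comparing plans:
ANALYZE records;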
Demo

Postgres not using a different query plan for higher offsets

I have this postgres query
explain SELECT "facilities".* FROM "facilities" INNER JOIN
resource_indices ON resource_indices.resource_id = facilities.uuid WHERE
(client_id IS NULL OR (client_tag=NULL AND client_id=7))
AND (ARRAY['country:india']::varchar[] && resource_indices.tags)
AND "facilities"."is_listed" = 't'
ORDER BY resource_indices.name LIMIT 11 OFFSET 100;
Observe the offset. When the offset is less than, say, 200, it uses the index and works fine.
The query plan for that is as follows:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=23416.57..24704.45 rows=11 width=1457) (actual time=41.951..43.035 rows=11 loops=1)
-> Nested Loop (cost=0.71..213202.15 rows=1821 width=1457) (actual time=2.107..43.007 rows=211 loops=1)
-> Index Scan using index_resource_indices_on_name on resource_indices (cost=0.42..190226.95 rows=12460 width=28) (actual time=2.096..40.790 rows=408 loops=1)
Filter: ('{country:india}'::character varying[] && tags)
Rows Removed by Filter: 4495
-> Index Scan using index_facilities_on_uuid on facilities (cost=0.29..1.83 rows=1 width=1445) (actual time=0.005..0.005 rows=1 loops=408)
Index Cond: (uuid = resource_indices.resource_id)
Filter: ((client_id IS NULL) AND is_listed)
Planning time: 1.259 ms
Execution time: 43.121 ms
(10 rows)
Increasing the offset to, say, four hundred switches to a hash join and gives much poorer performance, and higher offsets perform worse still.
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=34508.62..34508.65 rows=11 width=1457) (actual time=136.288..136.291 rows=11 loops=1)
-> Sort (cost=34507.62..34512.18 rows=1821 width=1457) (actual time=136.224..136.268 rows=411 loops=1)
Sort Key: resource_indices.name
Sort Method: top-N heapsort Memory: 638kB
-> Hash Join (cost=29104.96..34419.46 rows=1821 width=1457) (actual time=23.885..95.099 rows=6518 loops=1)
Hash Cond: (facilities.uuid = resource_indices.resource_id)
-> Seq Scan on facilities (cost=0.00..4958.39 rows=33790 width=1445) (actual time=0.010..48.732 rows=33711 loops=1)
Filter: ((client_id IS NULL) AND is_listed)
Rows Removed by Filter: 848
-> Hash (cost=28949.21..28949.21 rows=12460 width=28) (actual time=23.311..23.311 rows=12601 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 814kB
-> Bitmap Heap Scan on resource_indices (cost=1048.56..28949.21 rows=12460 width=28) (actual time=9.369..18.710 rows=12601 loops=1)
Recheck Cond: ('{country:india}'::character varying[] && tags)
Heap Blocks: exact=7334
-> Bitmap Index Scan on index_resource_indices_on_tags (cost=0.00..1045.45 rows=12460 width=0) (actual time=7.680..7.680 rows=13889 loops=1)
Index Cond: ('{country:india}'::character varying[] && tags)
Planning time: 1.408 ms
Execution time: 136.465 ms
(18 rows)
How do I resolve this? Thank you
That is unavoidable, because there is no other way to implement LIMIT 10 OFFSET 10000 than to fetch the first 10010 rows and throw away all but the last 10. This is bound to perform increasingly badly as the offset is raised.
PostgreSQL switches to a different plan because it has to retrieve more result rows, and “fast start” plans that are quick to retrieve the first few rows and usually involve nested loop joins will perform worse than other plans when more result rows are needed.
OFFSET is evil and you should avoid it in most cases. Read what Markus Winand has to say about this topic, particularly how to paginate result sets without OFFSET.
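A minimal keyset-pagination sketch of that idea, using only the filters the plan above actually applied; :last_name is a placeholder for the last name already shown, and in practice you would add a unique tie-breaker column to the ORDER BY and the seek condition:
SELECT f.*
FROM facilities f
JOIN resource_indices ri ON ri.resource_id = f.uuid
WHERE ARRAY['country:india']::varchar[] && ri.tags
  AND f.is_listed
  AND f.client_id IS NULL
  AND ri.name > :last_name   -- seek past the previous page instead of OFFSET
ORDER BY ri.name
LIMIT 11;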
