I'm brand new to the concept of database administration, so I have no basis for what to expect. I am working with approximately 100GB of data in the form of five different tables. Descriptions of the data, as well as the first few rows of each file, can be found here.
I'm currently just working with the flows table in an effort to gauge performance. Here are the results from \d flows:
Table "public.flows"
Column | Type | Modifiers
------------+-------------------+-----------
time | real |
duration | real |
src_comp | character varying |
src_port | character varying |
dest_comp | character varying |
dest_port | character varying |
protocol | character varying |
pkt_count | real |
byte_count | real |
Indexes:
"flows_dest_comp_idx" btree (dest_comp)
"flows_dest_port_idx" btree (dest_port)
"flows_protocol_idx" btree (protocol)
"flows_src_comp_idx" btree (src_comp)
"flows_src_port_idx" btree (src_port)
Here are the results from EXPLAIN ANALYZE SELECT src_comp, COUNT(DISTINCT dest_comp) FROM flows GROUP BY src_comp;, which I thought would be a relatively simple query:
GroupAggregate (cost=34749736.06..35724568.62 rows=200 width=64) (actual time=1292299.166..1621191.771 rows=11154 loops=1)
Group Key: src_comp
-> Sort (cost=34749736.06..35074679.58 rows=129977408 width=64) (actual time=1290923.435..1425515.812 rows=129977412 loops=1)
Sort Key: src_comp
Sort Method: external merge Disk: 2819360kB
-> Seq Scan on flows (cost=0.00..2572344.08 rows=129977408 width=64) (actual time=26.842..488541.987 rows=129977412 loops=1)
Planning time: 6.575 ms
Execution time: 1636290.138 ms
(8 rows)
If I'm interpreting this correctly (which I might not be, since I'm new to PostgreSQL), it's saying that my query took almost 30 minutes to execute, which is much, much longer than I would expect, even with ~130 million rows.
My computer is running with an 8th-gen i7 quad-core CPU, 16 GB of RAM, and a 2TB HDD (full specs can be found here).
My questions then are: 1) is this performance to be expected, and 2) is there anything I can do to speed it up, other than buying an external SSD?
1 - src_comp and dest_comp, which are used by the query, are both indexed. However, they are indexed independently. If you had a composite index on (src_comp, dest_comp), there is a possibility that the database could process the whole query via indexes, eliminating the full table scan (see the sketch after this list).
2 - src_comp and dest_comp are character varying. That is NOT a good thing for indexed fields, unless necessary. What are these values really? Numbers? IP addresses? Computer network names? If there is a relatively finite number of these items and they can be identified as they are added to the database, change them to integers that are used as foreign keys into other tables. That will make a HUGE difference in this query. If they can't be stored that way, but they at least have a definite finite length - e.g., 15 characters for IPv4 addresses in dotted quad format - then set a maximum length for the fields, which should help some.
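A hedged sketch of both suggestions in SQL (the comps lookup table and its columns are hypothetical, since we don't know what src_comp and dest_comp actually contain):

-- 1) A composite index covering both columns used by the query:
CREATE INDEX flows_src_dest_idx ON flows (src_comp, dest_comp);

-- 2) If the values come from a finite set, map them to integer surrogate keys:
CREATE TABLE comps (
    comp_id   serial PRIMARY KEY,
    comp_name text UNIQUE NOT NULL
);
-- flows would then store integer src_comp_id / dest_comp_id columns referencing
-- comps(comp_id), which makes the indexes smaller and the GROUP BY cheaper.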
I am hoping to get some advice regarding next steps to take for our production database that, in hindsight, has been very ill thought out.
We have an events based system that generates payment information based on "contracts" (a "contract" defines how/what the user gets paid).
Example schema:
 Approx.           Quantity is ess.             Quantity is slightly
 5M rows           1-to-1 with "Events",        less, approx.
                   i.e. ~5M rows                3M rows

 ----------        ----------------        --------------
 | Events |        | Contract     |        | Payment    |
 ----------        ----------------        --------------
 | id     |        | id           |        | id         |
 | userId |   ->   | eventId      | ---->  | contractId |
 | jobId  |        | <a bunch of  |        | amount     |
 | date   |        |  data to calc|        | currency   |
 ----------        |  payments>   |        --------------
                   ----------------
There is some additional data on each table too such as generic "created_at", "updated_at", etc.
These are the questions I have currently come up with (please let me know if I'm barking up the wrong tree):
Should I denormalise the payment information?
Currently, in order to get a payment based on an event ID we need to do a minimum of one join, but realistically we have to do two (there are rarely scenarios where we want a payment based only on some contract information). With 5M rows (and growing on a daily basis) it takes well over 20s to do a normal select with a sort, let alone a join. My thought is that by moving the payment info directly onto the events table we can skip the joins when fetching payment information (rough sketch below).
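Roughly what I mean, using simplified names from the diagram above (the real tables have more columns, and the exact names may differ):

-- What we do today: two joins to get from an event to its payment
SELECT p.amount, p.currency
FROM "event" e
JOIN "contract" c ON c."eventId" = e.id
JOIN "payment" p ON p."contractId" = c.id
WHERE e.id = 123;

-- The denormalised idea: carry the payment columns directly on the events table
ALTER TABLE "event"
    ADD COLUMN payment_amount numeric,
    ADD COLUMN payment_currency text;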
Should I be "archiving" old rows from Events table?
Since the Events table has so many new rows constantly (and also needs to be read from for other business logic purposes), should I be implementing some strategy to move events whose payments have already been calculated into some "archive" table? My thought is that this will keep the very I/O-heavy Events table more "freed up" so it can be read from and written to faster (rough sketch below).
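A rough sketch of the move I have in mind (event_archive is a hypothetical table with the same columns as event, and the "already paid" condition is a placeholder):

WITH moved AS (
    DELETE FROM "event" AS e
    WHERE EXISTS (
        SELECT 1
        FROM "contract" c
        JOIN "payment" p ON p."contractId" = c.id
        WHERE c."eventId" = e.id    -- placeholder for "payment already calculated"
    )
    RETURNING *
)
INSERT INTO event_archive
SELECT * FROM moved;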
What other optimisation strategies should I be thinking about?
The application in question is currently live to paying customers and our team is growing concerned about the manpower that will be required to keep this unoptimised setup in check. Are there any "quick wins" we could apply to the tables/database in order to speed things up? (I've read about "indexing" but I haven't grasped it well enough to be confident deploying it to a production database.)
Thank you so much in advance for any advice you can give!
Edit to include DB information:
explain (analyze, buffers) select * from "event" order by "updatedAt" desc
Gather Merge (cost=1237571.93..1699646.19 rows=3960360 width=381) (actual time=71565.082..76242.131 rows=5616130 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=107649 read=211715, temp read=805207 written=805973
I/O Timings: read=76365.059
-> Sort (cost=1236571.90..1241522.35 rows=1980180 width=381) (actual time=69698.164..70866.085 rows=1872043 loops=3)
Sort Key: "updatedAt" DESC
Sort Method: external merge Disk: 725448kB
Worker 0: Sort Method: external merge Disk: 685968kB
Worker 1: Sort Method: external merge Disk: 714432kB
Buffers: shared hit=107649 read=211715, temp read=805207 written=805973
I/O Timings: read=76365.059
-> Parallel Seq Scan on event (cost=0.00..339111.80 rows=1980180 width=381) (actual time=0.209..26506.333 rows=1872043 loops=3)
Buffers: shared hit=107595 read=211715
I/O Timings: read=76365.059
Planning Time: 1.167 ms
Execution Time: 76799.910 ms
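For reference, this is the kind of index I've been reading about but have not applied yet (my understanding of the syntax; whether the planner would actually use it for the unfiltered ORDER BY above is something I would still need to test):

CREATE INDEX CONCURRENTLY event_updated_at_idx ON "event" ("updatedAt" DESC);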
I've created composite indexes (indices for you mathematical folk) on tables before with an assumption of how they worked. I was just curious if my assumption is correct or not.
I assume that when you list the order of columns for the index, you are also specifying how the indexes will be grouped. For instance, if you have columns a, b, and c, and you specify the index in that same order a ASC, b ASC, and c ASC then the resultant index will essentially be many indexes for each "group" in a.
Is this correct? If not, what will the resultant index actually look like?
Composite indexes work just like regular indexes, except they have multi-value keys.
If you define an index on the fields (a, b, c), the records are sorted first on a, then on b, then on c.
Example:
| A | B | C |
-------------
| 1 | 2 | 3 |
| 1 | 4 | 2 |
| 1 | 4 | 4 |
| 2 | 3 | 5 |
| 2 | 4 | 4 |
| 2 | 4 | 5 |
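In SQL terms, a sketch of what produces that ordering (the table name t is purely illustrative):

CREATE TABLE t (a int, b int, c int);
CREATE INDEX t_a_b_c_idx ON t (a, b, c);

-- Index entries are kept sorted by a, then b, then c, so a query that
-- constrains a leading prefix of the key can use the index:
SELECT * FROM t WHERE a = 1 AND b = 4;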
A composite index is like a plain alphabetical index in a dictionary, but covering two or more letters, like this:
AA - page 1
AB - page 12
etc.
Table rows are ordered first by the first column in the index, then by the second one etc.
It's usable when you search by both columns OR by first column. If your index is like this:
AA - page 1
AB - page 12
…
AZ - page 245
BA - page 246
…
you can use it for searching on 2 letters ( = 2 columns in a table), or like a plain index on one letter:
A - page 1
B - page 246
…
Note that in the case of a dictionary, the pages themselves are alphabetically ordered. That's an example of a CLUSTERED index.
In a plain, non-CLUSTERED index, the references to pages are ordered, like in a history book:
Gaul, Alesia: pages 12, 56, 78
Gaul, Augustodonum Aeduorum: page 145
…
Gaul, Vellaunodunum: page 24
Egypt, Alexandria: pages 56, 194, 213, 234, 267
Composite indexes may also be used when you ORDER BY two or more columns. In this case a DESC clause may come in handy.
See this article in my blog about using DESC clause in a composite index:
Descending indexes
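A sketch of that ORDER BY case (the names are illustrative; mixed ASC/DESC index keys are supported in, for example, PostgreSQL and MySQL 8.0+):

CREATE INDEX orders_cust_date_idx ON orders (customer_id ASC, order_date DESC);

-- The index delivers rows already in this order, so no extra sort step is needed:
SELECT *
FROM orders
ORDER BY customer_id ASC, order_date DESC;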
The most common implementation of indices uses B-trees to allow somewhat rapid lookups, and also reasonably rapid range scans. It's too much to explain here, but here's the Wikipedia article on B-trees. And you are right, the first column you declare in the create index will be the high order column in the resulting B-tree.
A search on the high order column amounts to a range scan, and a B-tree index can be very useful for such a search. The easiest way to see this is by analogy with the old card catalogs in libraries that have not yet converted to online catalogs.
If you are looking for all the cards for Authors whose last name is "Clemens", you just go to the author catalog, and very quickly find a drawer that says "CLE- CLI" on the front. That's the right drawer. Now you do a kind of informal binary search in that drawer to quickly find all the cards that say "Clemens, Roger", or "Clemens, Samuel" on them.
But suppose you want to find all the cards for the authors whose first name is "Samuel". Now you're up the creek, because those cards are not gathered together in one place in the Author catalog. A similar phenomenon happens with composite indices in a database.
Different DBMSes differ in how clever their optimizer is at detecting index range scans, and accurately estimating their cost. And not all indices are B-trees. You'll have to read the docs for your specific DBMS to get the real info.
No. The resultant index will be a single index, but with a compound key.
KeyX = A,B,C,D; KeyY = 1,2,3,4;
An index on (KeyX, KeyY) will actually contain: A1, A2, A3, B1, B3, C3, C4, D2
So if you need to find something by KeyX and KeyY, that will be fast and will use the single index - something like SELECT ... WHERE KeyX = 'B' AND KeyY = 3.
But it's important to understand: WHERE KeyX = ? queries will use that index, while WHERE KeyY = ? on its own will NOT use the index at all.
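As a sketch in SQL (table and column names are illustrative):

CREATE INDEX t_keyx_keyy_idx ON t (KeyX, KeyY);

-- Can use the index (leading column constrained):
SELECT * FROM t WHERE KeyX = 'B' AND KeyY = 3;
SELECT * FROM t WHERE KeyX = 'B';

-- Generally cannot use the index efficiently (leading column not constrained):
SELECT * FROM t WHERE KeyY = 3;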
I have a table where I keep record of who is following whom on a Twitter-like application:
\d follow
Table "public.follow" .
Column | Type | Modifiers
---------+--------------------------+-----------------------------------------------------
xid | text |
followee | integer |
follower | integer |
id | integer | not null default nextval('follow_id_seq'::regclass)
createdAt | timestamp with time zone |
updatedAt | timestamp with time zone |
source | text |
Indexes:
"follow_pkey" PRIMARY KEY, btree (id)
"follow_uniq_users" UNIQUE CONSTRAINT, btree (follower, followee)
"follow_createdat_idx" btree ("createdAt")
"follow_followee_idx" btree (followee)
"follow_follower_idx" btree (follower)
The number of entries in the table is more than a million, and when I run EXPLAIN ANALYZE on the query I get this:
explain analyze SELECT "follow"."follower"
FROM "public"."follow" AS "follow"
WHERE "follow"."followee" = 6
ORDER BY "follow"."createdAt" DESC
LIMIT 15 OFFSET 0;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..353.69 rows=15 width=12) (actual time=5.456..21.497 rows=15 loops=1)
-> Index Scan Backward using follow_createdat_idx on follow (cost=0.43..61585.45 rows=2615 width=12) (actual time=5.455..21.488 rows=15 loops=1)
Filter: (followee = 6)
Rows Removed by Filter: 62368
Planning time: 0.068 ms
Execution time: 21.516 ms
Why is it doing a backward index scan on follow_createdat_idx, when execution could have been faster if it had used follow_followee_idx?
This query takes around 33 ms the first time it runs and around 22 ms on subsequent calls, which I feel is on the higher side.
I am using Postgres 9.5 provided by Amazon RDS. Any idea what wrong could be happening here?
The multicolumn index on (followee, "createdAt") that user1937198 suggested is perfect for the query - as you found in your test already.
Since "createdAt" can be NULL (not defined NOT NULL), you may want to add NULLS LAST to query and index:
...
ORDER BY "follow"."createdAt" DESC NULLS LAST
And:
"follow_follower_createdat_idx" btree (follower, "createdAt" DESC NULLS LAST)
More:
PostgreSQL sort by datetime asc, null first?
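Spelled out as DDL plus the matching query (a sketch; the index name is mine):

CREATE INDEX follow_followee_createdat_idx
    ON follow (followee, "createdAt" DESC NULLS LAST);

SELECT follower
FROM   follow
WHERE  followee = 6
ORDER  BY "createdAt" DESC NULLS LAST
LIMIT  15;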
There are minor other performance implications:
The multicolumn index on (followee, "createdAt") is 8 bytes per row bigger than the simple index on (followee) - 44 bytes vs 36. More (btree indexes have mostly the same page layout as tables):
Making sense of Postgres row sizes
Columns involved in an index in any way cannot be changed with a HOT update. Adding more columns to an index might block this optimization - though updates to a column named "createdAt" seem particularly unlikely. And since you have another index on just ("createdAt"), that's not an issue anyway. More:
PostgreSQL Initial Database Size
There is no downside to having another index on just ("createdAt"), other than the maintenance cost for each index (which affects write performance, not read performance). Both indexes support different queries. You may or may not still need the index on just ("createdAt"). Detailed explanation:
Is a composite index also good for queries on the first field?
I have to optimize my little-big database because it's too slow; maybe we'll find another solution together.
First of all, let's talk about the data stored in the database. There are two objects: users and, let's say, messages.
Users
There is something like that:
+----+---------+-------+-----+
| id | user_id | login | etc |
+----+---------+-------+-----+
| 1 | 100001 | A | ....|
| 2 | 100002 | B | ....|
| 3 | 100003 | C | ....|
|... | ...... | ... | ....|
+----+---------+-------+-----+
There is no problem with this table. (Don't be put off by having both id and user_id; user_id is used by another application, so it has to be here.)
Messages
And the second table has some problem. Each user has for example messages like this:
+----+---------+------+----+
| id | user_id | from | to |
+----+---------+------+----+
| 1 | 1 | aab | bbc|
| 2 | 2 | vfd | gfg|
| 3 | 1 | aab | bbc|
| 4 | 1 | fge | gfg|
| 5 | 3 | aab | gdf|
|... | ...... | ... | ...|
+----+---------+------+----+
There is no need to edit individual messages, but there should be a way to update the list of messages for a user. For example, an external service sends all of a user's messages to the DB and the list has to be updated.
And the most important thing is that there are about 30 million users and the average user has 500+ messages. Another problem is that I have to search through the from field and count the number of matches. I designed a simple SQL query with a join, but it takes too much time to get the data.
So... it's quite a big amount of data. I decided not to use RDS (I was using PostgreSQL) and to move to databases like ClickHouse and so on.
However, I ran into the problem that ClickHouse, for example, doesn't support the UPDATE statement.
To resolve this issue I decided to store all of a user's messages in one row. So the Messages table would look like this:
Here I'd like to store the messages in JSON format:

{"from":"aaa", "to":"bbe"}
{"from":"ret", "to":"fdd"}
{"from":"gfd", "to":"dgf"}

        ||
        \/

+----+---------+----------+------+
| id | user_id | messages | hash |   <= hash of the messages column
+----+---------+----------+------+
I think that full-text search inside the messages column would save some time and resources, and so on.
Do you have any ideas? :)
In ClickHouse, the most optimal way is to store data in a "big flat table".
So, you store every message in a separate row.
15 billion rows is OK for ClickHouse, even on a single node.
Also, it's reasonable to have each user's attributes directly in the messages table (pre-joined), so you don't need to do JOINs. This is suitable if user attributes are not updated.
These attributes will have repeated values for each of a user's messages - that's OK because ClickHouse compresses data well, especially repeated values.
If user attributes are updated, consider storing the users table in a separate database and using the 'external dictionaries' feature to join it.
If a message is updated, just don't update it in place. Write another row with the modified message to the table instead and leave the old message as is.
It's important to have the right primary key for your table. You should use a table from the MergeTree family, which constantly reorders data by primary key and so keeps range queries efficient. The primary key is not required to be unique; for example, you could define the primary key as just (from) if you frequently filter with from = ... and those queries must be processed quickly.
Or you could use user_id as the primary key if queries by user id are frequent and must be processed as fast as possible - but then queries with a predicate on 'from' will scan the whole table (mind that ClickHouse does full scans efficiently).
If you need fast lookups by many different attributes, you can simply duplicate the table with different primary keys. Typically the table compresses well enough that you can afford to keep the data in a few copies, each ordered differently for different range queries.
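A minimal sketch of the "big flat table" in recent ClickHouse syntax (table and column names are illustrative; msg_from/msg_to stand in for from/to, and login is a pre-joined user attribute):

CREATE TABLE messages
(
    user_id  UInt32,
    msg_from String,
    msg_to   String,
    login    String   -- repeated per message; compresses well
)
ENGINE = MergeTree
ORDER BY (user_id, msg_from);

-- Range-scans the primary key prefix instead of the whole table:
SELECT count() FROM messages WHERE user_id = 100001;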
First of all, when we have such a big dataset, from and to columns should be integers, if possible, as their comparison is faster.
Second, you should consider creating proper indexes (see the sketch below). As each user has relatively few records (500 compared to 30M in total), it should give you a huge performance benefit.
If everything else fails, consider using partitions:
https://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
In your case they would be dynamic and would hinder first-time inserts immensely, so I would consider them only as a last (if potentially very effective) resort.
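A minimal sketch of the indexing suggestion above (table and column names are assumed from the question; the real schema may differ):

CREATE INDEX messages_user_from_idx ON messages (user_id, "from");

-- With this index, counting a user's matches on "from" touches ~500 rows,
-- not the whole table:
SELECT count(*) FROM messages WHERE user_id = 100001 AND "from" = 'aab';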
I have a table that has over 15M rows in PostgreSQL. Users can save these rows (let's call them items) into their library, and when they request their library the system loads it.
The query in Postgresql is like
SELECT item.id, item.name
FROM items AS item JOIN library ON (library.item_id = item.id)
WHERE library.user_id = 1
The table is already indexed and denormalized, so I don't need any other JOIN.
If a user has many items in their library (say 1k items), the query time naturally increases (for example, with 1k items the query time is 7s). My aim is to reduce the query time for these large result sets.
I already use Solr for full-text searching, and I tried queries like ?q=id:1 OR id:100 OR id:345 but I'm not sure whether it's efficient or not in Solr.
I want to know my alternatives for querying this dataset. The bottleneck in my system seems to be disk speed. Should I buy a server with over 15 GB of memory and stay with PostgreSQL with an increased shared_buffers setting, try something like MongoDB or another memory-based database, or should I build a cluster and replicate the data in PostgreSQL?
items:
Column | Type
--------------+-------------------
id | text
mbid | uuid
name | character varying
length | integer
track_no | integer
artist | text[]
artist_name | text
release | text
release_name | character varying
rank | numeric
user_library:
Column | Type | Modifiers
--------------+-----------------------------+--------------------------------------------------------------
user_id | integer | not null
recording_id | character varying(32) |
timestamp | timestamp without time zone | default now()
id | integer | primary key nextval('user_library_idx_pk'::regclass)
-------------------
explain analyze
SELECT recording.id,name,track_no,artist,artist_name,release,release_name
FROM recording JOIN user_library ON (user_library.recording_id = recording.id)
WHERE user_library.user_id = 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..10745.33 rows=1036539 width=134) (actual time=0.168..57.663 rows=1000 loops=1)
Join Filter: (recording.id = (recording_id)::text)
-> Seq Scan on user_library (cost=0.00..231.51 rows=1000 width=19) (actual time=0.027..3.297 rows=1000 loops=1) (my opinion: because user_library has only 2 rows, Postgresql didn't use index to save resources.)
Filter: (user_id = 1)
-> Append (cost=0.00..10.49 rows=2 width=165) (actual time=0.045..0.047 rows=1 loops=1000)
-> Seq Scan on recording (cost=0.00..0.00 rows=1 width=196) (actual time=0.001..0.001 rows=0 loops=1000)
-> Index Scan using de_recording3_table_pkey on de_recording recording (cost=0.00..10.49 rows=1 width=134) (actual time=0.040..0.042 rows=1 loops=1000)
Index Cond: (id = (user_library.recording_id)::text)
Total runtime: 58.589 ms
(9 rows)
First, if your interesting (commonly used) set of data, along with all indexes, fits comfortably in memory, you will have much better performance - so yes, more RAM will help. However, with 1k records, part of your time will be spent materializing records and sending them to the client.
A couple other initial points:
There is no substitute for real profiling. You may be surprised as to where the time is being spent. On PostgreSQL use explain analyze to do profiling of a query.
Look into cursors and limit/offset (a sketch follows below).
I don't think one can come up with better advice until these are done.
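A sketch of the limit/offset and cursor ideas applied to the query from the question (paging through a user's library instead of materializing all rows at once):

-- Paging with LIMIT/OFFSET:
SELECT item.id, item.name
FROM   items AS item
JOIN   library ON library.item_id = item.id
WHERE  library.user_id = 1
ORDER  BY item.name
LIMIT  100 OFFSET 0;   -- next page: OFFSET 100, then 200, ...

-- Or a server-side cursor inside a transaction:
BEGIN;
DECLARE lib_cur CURSOR FOR
    SELECT item.id, item.name
    FROM   items AS item
    JOIN   library ON library.item_id = item.id
    WHERE  library.user_id = 1;
FETCH 100 FROM lib_cur;   -- fetch more batches as the client consumes rows
CLOSE lib_cur;
COMMIT;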