Any difference in the way a primary key is defined in Postgres?

I am wondering if all these are exactly the same or if there is some difference.
Method 1:
CREATE TABLE testtable
(
id serial,
title character varying,
CONSTRAINT id PRIMARY KEY (id)
);
Method 2:
CREATE TABLE testtable
(
id serial PRIMARY KEY,
title character varying,
);
Method 3:
CREATE TABLE testtable
(
id integer PRIMARY KEY,
title character varying,
);
CREATE SEQUENCE testtable_id_seq
START WITH 1
INCREMENT BY 1
NO MAXVALUE
NO MINVALUE
CACHE 1;
ALTER SEQUENCE testtable_id_seq OWNED BY testtable.id;
Update: I found something on the web saying that by using a raw sequence you can pre-allocate memory for primary keys which helps if you plan on doing several thousand inserts in the next minute.

Try it and see; remove the trailing "," after "varying" on the second and third so they run, execute each of them, then do:
\d testtable
after each one and you can see what happens. Then drop the table and move on to the next one. It will look like this:
 Column |       Type        |                       Modifiers
--------+-------------------+--------------------------------------------------------
 id     | integer           | not null default nextval('testtable_id_seq'::regclass)
 title  | character varying |
Indexes:
    "id" PRIMARY KEY, btree (id)
 Column |       Type        |                       Modifiers
--------+-------------------+--------------------------------------------------------
 id     | integer           | not null default nextval('testtable_id_seq'::regclass)
 title  | character varying |
Indexes:
    "testtable_pkey" PRIMARY KEY, btree (id)
 Column |       Type        | Modifiers
--------+-------------------+-----------
 id     | integer           | not null
 title  | character varying |
Indexes:
    "testtable_pkey" PRIMARY KEY, btree (id)
The first and second are almost identical, except that the primary key constraint is named differently. In the third, the id column has no default, so it is no longer filled in from the sequence when you insert into the table. You need to create the sequence first, then create the table like this:
CREATE TABLE testtable
(
id integer PRIMARY KEY DEFAULT nextval('testtable_id_seq'),
title character varying
);
That gives you something that looks the same as the second one. The only upside is that you can use the CACHE directive to pre-allocate some number of sequence values. Fetching values one at a time can become enough of a bottleneck that you need to reduce the contention, but you'd need to be doing several thousand inserts per second, not per minute, before that's likely to happen.
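To make the CACHE trade-off concrete, here is a toy Python model of the idea. It is only a sketch: ToySequence and Session are made-up names, and this is not how PostgreSQL implements sequences. Each session reserves a block of values in one locked step and then hands them out locally, so a larger cache means fewer trips to the shared counter.

```python
import threading


class ToySequence:
    """Toy stand-in for a database sequence (hypothetical, not PostgreSQL's code)."""

    def __init__(self):
        self._next = 1
        self._lock = threading.Lock()
        self.lock_acquisitions = 0  # how often the shared counter was touched

    def allocate(self, count):
        """Reserve `count` consecutive values and return them as a range."""
        with self._lock:
            self.lock_acquisitions += 1
            start = self._next
            self._next += count
        return range(start, start + count)


class Session:
    """One client session that keeps a private cache of reserved values."""

    def __init__(self, seq, cache=1):
        self._seq = seq
        self._cache = cache
        self._pending = iter(())  # nothing reserved yet

    def nextval(self):
        # Hand out a cached value if one is left ...
        for value in self._pending:
            return value
        # ... otherwise reserve a new block from the shared sequence.
        self._pending = iter(self._seq.allocate(self._cache))
        return next(self._pending)


seq = ToySequence()
session = Session(seq, cache=50)
ids = [session.nextval() for _ in range(1000)]
print(seq.lock_acquisitions)  # 20 trips to the shared counter instead of 1000
```

The flip side is that values a session has reserved but not used are lost when the session ends, so a cached sequence can leave gaps.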

There is no semantic difference between method 1 and method 2.
Method 3 is quite similar, too: it's what happens implicitly when using serial. However, when using serial, PostgreSQL also records a dependency of the sequence on the table. So, if you drop the table created in method 1 or 2, the sequence gets dropped as well.

Related

Add NOT NULL column without DEFAULT but WITH VALUES

I'm using SQL Server 2017 and I want to add a NOT NULL column without a DEFAULT, but supply values for the existing records, e.g. using WITH VALUES, in a single query.
I understand that I cannot add a NOT NULL column to a populated table without supplying values. But a DEFAULT clause also sets a default value for future inserts, which I don't want. I want the default value to be used only while adding the new column, and that's it.
Let me explain.
Assume the following sequence of queries:
CREATE TABLE items (
id INT NOT NULL PRIMARY KEY IDENTITY(1,1),
name VARCHAR(255) NOT NULL
);
ALTER TABLE items ADD description VARCHAR(255) NOT NULL; -- No default value needed because the table is empty
INSERT INTO items(name) VALUES ('test'); -- ERROR
The last query gives an error (as expected):
Error: Cannot insert the value NULL into column 'description', table 'suvibackend.dbo.items'; column does not allow nulls. INSERT fails.
That is because we didn't supply a value for the description column. That's obvious.
Now consider the situation where there are already some records in the items table. Without the DEFAULT and WITH VALUES clauses the ALTER TABLE would fail (obviously), so let's use them:
CREATE TABLE items (
id INT NOT NULL PRIMARY KEY IDENTITY(1,1),
name varchar(255) NOT NULL
);
INSERT INTO items(name) VALUES ('name-test-1');
INSERT INTO items(name) VALUES ('name-test-2');
ALTER TABLE items ADD description VARCHAR(255) NOT NULL DEFAULT 'no-description' WITH VALUES;
So now our table looks like this as expected:
SELECT * FROM items;
--------------------------------------
| id | name | description |
| --- | ----------- | -------------- |
| 1 | name-test-1 | no-description |
| 2 | name-test-2 | no-description |
--------------------------------------
But from now on, it is possible to INSERT records without description:
INSERT INTO items(name) VALUES ('name-test-3'); -- No description column
SELECT * FROM ITEMS;
--------------------------------------
| id | name | description |
| --- | ----------- | -------------- |
| 1 | name-test-1 | no-description |
| 2 | name-test-2 | no-description |
| 3 | name-test-3 | no-description |
--------------------------------------
But compared with the first situation (empty table, no DEFAULT clause) this is different. I still want an error about NULL in the description column.
SQL Server has created a default constraint for this column, which I don't want.
The workaround is either to drop the constraint after adding the new column with a DEFAULT clause, or to split adding the new column into three statements:
CREATE TABLE items
(
id INT NOT NULL PRIMARY KEY IDENTITY(1,1),
name varchar(255) NOT NULL
);
INSERT INTO items(name) VALUES ('name-test-1');
INSERT INTO items(name) VALUES ('name-test-2');
ALTER TABLE items
ADD description VARCHAR(255) NULL;
UPDATE items
SET description = 'no description';
ALTER TABLE items
ALTER COLUMN description VARCHAR(255) NOT NULL;
INSERT INTO items(name)
VALUES ('name-test-3'); -- ERROR as expected
My question:
Is there a way to achieve this in a single query, without having a default constraint created?
It would be nice if it were possible to use a default value just for one query without permanently creating a constraint.
Although you can't specify an ephemeral default constraint that's automatically dropped after adding the column (i.e. single statement operation), you can explicitly name the constraint to facilitate dropping it immediately afterward.
ALTER TABLE dbo.items
ADD description VARCHAR(255) NOT NULL
CONSTRAINT DF_items_description DEFAULT 'no-description' WITH VALUES;
ALTER TABLE dbo.items
DROP CONSTRAINT DF_items_description;
Explicit constraint names are a best practice, IMHO, as they make subsequent DDL operations easier.

How do I model subtyping in a relational schema?

Is the following DB-schema ok?
REQUEST-TABLE
REQUEST-ID | TYPE | META-1 | META-2 |
This table stores all the requests, each of which has a unique REQUEST-ID. The TYPE is either A, B or C and tells us which table contains the specific request parameters. META-1 and META-2 are just some additional info like timestamps and stuff. Other than that we have the tables for the respective types, which store the parameters for the respective requests.
TYPE-A-TABLE
REQUEST-ID | PARAM_X | PARAM_Y | PARAM_Z
TYPE-B-TABLE
REQUEST-ID | PARAM_I | PARAM_J
TYPE-C-TABLE
REQUEST-ID | PARAM_L | PARAM_M | PARAM_N | PARAM_O | PARAM_P | PARAM_Q
The REQUEST-ID is the foreign key into the REQUEST-TABLE.
Is this design normal/best-practice? Or is there a better/smarter way? What are the alternatives?
It somehow feels strange to me, having to do a query on the REQUEST-TABLE to find out which TYPE-TABLE contains the information I need, to then do the actual query I'm interested in.
For instance, imagine a method which, given an ID, should retrieve the parameters. This method would need two database accesses:
- Find correct table to query
- Query table to get the parameters
Note: In reality we have like 10 types of requests, i.e. 10 TYPE tables. Moreover there are many entries in each of the tables.
Meta-Note: I find it hard to come up with a proper title for this question (one that is not overly broad). Please feel free to make suggestions or edit the title.
For exclusive types, you just need to make sure rows in one type table can't reference rows in any other type table.
create table requests (
request_id integer primary key,
request_type char(1) not null
-- You could also use a table to constrain valid types.
check (request_type in ('A', 'B', 'C', 'D')),
meta_1 char(1) not null,
meta_2 char(1) not null,
-- Foreign key constraints don't reference request_id alone. If they
-- did, they might reference the wrong type.
unique (request_id, request_type)
);
You need that apparently redundant unique constraint so the pair of columns can be the target of a foreign key constraint.
create table type_a (
request_id integer not null,
request_type char(1) not null default 'A'
check (request_type = 'A'),
primary key (request_id),
foreign key (request_id, request_type)
references requests (request_id, request_type) on delete cascade,
param_x char(1) not null,
param_y char(1) not null,
param_z char(1) not null
);
The check() constraint guarantees that only 'A' can be stored in the request_type column. The foreign key constraint guarantees that each row will reference an 'A' row in the table "requests". Other type tables are similar.
create table type_b (
request_id integer not null,
request_type char(1) not null default 'B'
check (request_type = 'B'),
primary key (request_id),
foreign key (request_id, request_type)
references requests (request_id, request_type) on delete cascade,
param_i char(1) not null,
param_j char(1) not null
);
Repeat for each type table.
I usually create one updatable view for each type. The views join the table "requests" with one type table. Application code uses the views instead of the base tables. When I do that, it usually makes sense to revoke privileges on the base tables. (Not shown.)
If you don't know which type something is, then there's no alternative to running one query to get the type, and another query to select or update.
select request_type from requests where request_id = 42;
-- Say it returns 'A'. I'd use the view type_a_only.
update type_a_only
set param_x = '!' where request_id = 42;
In my own work, it's pretty rare to not know the type, but it does happen sometimes.
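Here is a small runnable sketch of the pattern above, cut down to two types and one parameter each. It uses Python's sqlite3 module purely as a stand-in so it can be run anywhere (an assumption, the original targets a server database), and open_db and try_insert are helper names invented for this example.

```python
import sqlite3

# Reduced version of the requests/type_a/type_b tables shown above.
SCHEMA = """
create table requests (
    request_id   integer primary key,
    request_type char(1) not null check (request_type in ('A', 'B')),
    unique (request_id, request_type)
);
create table type_a (
    request_id   integer primary key,
    request_type char(1) not null default 'A' check (request_type = 'A'),
    param_x      char(1) not null,
    foreign key (request_id, request_type)
        references requests (request_id, request_type) on delete cascade
);
create table type_b (
    request_id   integer primary key,
    request_type char(1) not null default 'B' check (request_type = 'B'),
    param_i      char(1) not null,
    foreign key (request_id, request_type)
        references requests (request_id, request_type) on delete cascade
);
"""


def open_db():
    """Create an in-memory database with the schema above."""
    conn = sqlite3.connect(":memory:")
    conn.execute("pragma foreign_keys = on")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, sql, params):
    """Run one insert; return True if accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = open_db()
results = [
    try_insert(conn, "insert into requests values (?, ?)", (1, "A")),
    try_insert(conn, "insert into type_a (request_id, param_x) values (?, ?)", (1, "x")),
    # Rejected: request 1 is an 'A', so the pair (1, 'B') has no parent row.
    try_insert(conn, "insert into type_b (request_id, param_i) values (?, ?)", (1, "i")),
]
print(results)  # [True, True, False]
```

The third insert fails because the composite foreign key looks for (1, 'B') in requests, which is exactly the "wrong type" case the unique constraint is there to catch.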
The phrase you may be looking for is "how do I model inheritance in a relational schema". It's been asked before. Whilst this is a reference to object oriented software design, the basic question is the same: how do I deal with data where there is a "x is a type of y" relationship.
In your case, "request" is the abstract class, and typeA, TypeB etc. are the subclasses.
Your solution is one of the classic answers - "table per subclass". It's clean and easy to maintain, but does mean you can have multiple database access requests to retrieve the data.

PostgreSQL: change primary key

Until now I had a column account_name as the primary key for my database. I'd now like to use a hash of account_name as the primary key instead.
So as an interim measure, I added an account_hash column and gave it the UNIQUE constraint, so that both account_name and account_hash exist together.
I populated account_hash for all database entries, and am now actually using account_hash as the key for the database, and am no longer actively using account_name for anything.
But of course because account_name is the "official" primary key, and must be NOT NULL, for any new entries I have been populating both account_name and account_hash with the same hash.
It's all working fine like this, but now I'd like to clean up the database, to get rid of account_name entirely, and to make account_hash the primary key instead.
What is the best way of doing this? It is a working database that is in use constantly, so any change needs to be at minimum disruption to the users.
Here is the \d+ information relating to the relevant columns:
            Column             |          Type          |          Modifiers          | Storage  | Stats target | Description
-------------------------------+------------------------+-----------------------------+----------+--------------+-------------
 account_name                  | character varying(255) | not null                    | extended |              |
 account_hash                  | character varying(256) |                             | extended |              |
Indexes:
    "users_pkey" PRIMARY KEY, btree (account_name)
    "users_account_hash_256_key" UNIQUE CONSTRAINT, btree (account_hash)
Has OIDs: no
Thanks for any help!
You can drop the current primary key with
ALTER TABLE tablename DROP CONSTRAINT users_pkey;
Make the account_hash required with
ALTER TABLE tablename ALTER account_hash SET NOT NULL;
After that you can add a new primary key with
ALTER TABLE tablename ADD PRIMARY KEY USING INDEX indexname;
You may have to drop the users_account_hash_256_key constraint first so that you don't end up with a duplicate and, depending on how the unique index was created, you may have to create the index again before this step.
If the account_name column is not used anywhere, it can then be dropped with
ALTER TABLE tablename DROP COLUMN account_name;
Note: I would advise against this. Hashes can collide, so if you use them as primary keys, there may come a time when you cannot insert a value into the database because of a collision. Also, performance is worse with varchar indexes than with integers (or a UUID, if a very large keyspace is needed), so unless there is a specific reason for using hashes, I wouldn't do this.

Syntax of creating sequence and auto increment fields in PostgreSQL for the given table

I am new to PostgreSQL. I'm trying to figure out the syntax for creating the following table.
I'm having difficulty creating the sequence and the auto-increment field.
    Column    |         Type          |                      Modifiers
--------------+-----------------------+-----------------------------------------------------
 id_numuser   | integer               | not null default nextval('id_numuser_seq'::regclass)
 username     | character varying(70) |
 completename | character varying(70) |
 id_cat       | integer               |
 email        | character varying(70) |
 password     | character varying(30) |
 active       | boolean               |
Indexes:
    "users_pkey" PRIMARY KEY, btree (id_numuser)
    "taskuser_uniq" UNIQUE, btree (username)
Foreign-key constraints:
    "users_id_cat_fkey" FOREIGN KEY (id_cat) REFERENCES usercategories(id_numcat)
Use a serial column. Details here:
Auto increment SQL function
The complete script:
CREATE TABLE users (
id_numuser serial PRIMARY KEY
,username character varying(70) UNIQUE
,completename character varying(70)
,id_cat integer REFERENCES usercategories(id_numcat)
,email character varying(70)
,password character varying(30)
,active boolean
);
You can use pgAdmin to get complete reverse-engineered SQL scripts for all objects.
Aside: I'd suggest using just text instead of varchar(n).
If you're ever in doubt about how to define something, pg_dump will help.
pg_dump -t 'users' --schema-only
will print a dump that shows the command(s) to create your users table.
It won't use shorthand like SERIAL, so it'll create the sequence then assign the sequence ownership and set the column default. So sometimes there's a shorter and simpler way than how pg_dump does it. The way pg_dump does it will always work, though.
In this case it produces (trimmed):
CREATE TABLE users (
id_numuser integer NOT NULL,
username character varying(70),
completename character varying(70),
id_cat integer,
email character varying(70),
password character varying(30),
active boolean
);
CREATE SEQUENCE users_id_numuser_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
ALTER SEQUENCE users_id_numuser_seq OWNED BY users.id_numuser;
ALTER TABLE ONLY users ALTER COLUMN id_numuser
SET DEFAULT nextval('users_id_numuser_seq'::regclass);
ALTER TABLE ONLY users
ADD CONSTRAINT users_pkey PRIMARY KEY (id_numuser);
ALTER TABLE ONLY users
ADD CONSTRAINT users_username_key UNIQUE (username);
ALTER TABLE ONLY users
ADD CONSTRAINT users_id_cat_fkey FOREIGN KEY (id_cat) REFERENCES usercategories(id_numcat);
So it's defining the sequence and all the constraints after creating the base table, not as part of it, and it's specifying a bunch of stuff that would usually be set by defaults.
The effect is the same and you can do things this way if you want.
Stuff like SERIAL PRIMARY KEY is basically just convenient shorthand. All of that is covered well in the documentation for CREATE TABLE, so once you know what you want you can generally figure out how to define it pretty easily. Most of the time, anything you can write in ALTER TABLE ... ADD ... can be written the same way in CREATE TABLE (...), e.g.:
ALTER TABLE ONLY users
ADD CONSTRAINT users_id_cat_fkey FOREIGN KEY (id_cat) REFERENCES usercategories(id_numcat);
can be done at create time with:
CREATE TABLE users (
....,
CONSTRAINT users_id_cat_fkey FOREIGN KEY (id_cat) REFERENCES usercategories(id_numcat)
);
Additionally, for any column-specific CONSTRAINT there's usually a way to tack it on to the end of the column definition. In this case, you omit the CONSTRAINT constraint_name (it's generated) and the FOREIGN KEY (id_cat) (because the column is implied, you don't need to specify it), and write:
CREATE TABLE users (
....
id_cat integer REFERENCES usercategories(id_numcat),
....
);
Once you know what to look for in the CREATE TABLE docs it's usually easy to find how to write what you want.
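As a quick check that the table-level and column-level spellings really describe the same rule, here is a small sketch using Python's sqlite3 module. SQLite is only a stand-in here (an assumption, not PostgreSQL), the two table names are invented for the comparison, and foreign_keys is a hypothetical helper built on SQLite's foreign_key_list pragma.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table usercategories (id_numcat integer primary key);
-- Table-level form: a named constraint listed after the columns.
create table users_table_level (
    id_numuser integer primary key,
    id_cat     integer,
    constraint users_id_cat_fkey foreign key (id_cat)
        references usercategories (id_numcat)
);
-- Column-level form: the same rule tacked onto the column definition.
create table users_column_level (
    id_numuser integer primary key,
    id_cat     integer references usercategories (id_numcat)
);
""")


def foreign_keys(table):
    """Return (referenced table, from column, to column) for each FK on `table`."""
    rows = conn.execute(f"pragma foreign_key_list({table})").fetchall()
    # Pragma columns: id, seq, table, from, to, on_update, on_delete, match
    return [(row[2], row[3], row[4]) for row in rows]


print(foreign_keys("users_table_level") == foreign_keys("users_column_level"))
```

The only thing the column-level form gives up is the chance to pick the constraint name yourself.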

(PostgreSQL) "Advanced" Check Constraint Question

I use PostgreSQL but am looking for an answer in SQL that is as standard as possible.
I have the following table "docs" --
   Column   |          Type          |     Modifiers
------------+------------------------+--------------------
 id         | character varying(32)  | not null
 version    | integer                | not null default 1
 link_id    | character varying(32)  |
 content    | character varying(128) |
Indexes:
    "docs_pkey" PRIMARY KEY, btree (id, version)
id and link_id are for documents that have linkage relationship between each other, so link_id self references id.
The problem comes with version. Now id is no longer the primary key (it won't be unique either) and can't be referenced by link_id as a foreign key --
my_db=# ALTER TABLE docs ADD FOREIGN KEY(link_id) REFERENCES docs (id) ;
ERROR: there is no unique constraint matching given keys for referenced table "docs"
I tried to search for check constraint on something like "if exists" but didn't find anything.
Any tip will be much appreciated.
I usually do like this:
table document (id, common, columns, current_revision)
table revision (id, doc_id, content, version)
which means that a document has a one-to-many relation with its revisions, AND a one-to-one relation with the current revision.
That way, you can always select a complete document for the current revision with a simple join, and you will only have one unique row in your documents table which you can link parent/child relations in, but still have versioning.
Sticking as close to your model as possible, you can split your table into two, one which has 1 row per 'doc' and one with 1 row per 'version':
You have the following table "versions" --
   Column   |          Type          |     Modifiers
------------+------------------------+--------------------
 id         | character varying(32)  | not null
 version    | integer                | not null default 1
 content    | character varying(128) |
Indexes:
    "versions_pkey" PRIMARY KEY, btree (id, version)
And the following table "docs" --
   Column   |          Type          |     Modifiers
------------+------------------------+--------------------
 id         | character varying(32)  | not null
 link_id    | character varying(32)  |
Indexes:
    "docs_pkey" PRIMARY KEY, btree (id)
now
my_db=# ALTER TABLE docs ADD FOREIGN KEY(link_id) REFERENCES docs (id) ;
is allowed, and you also want:
my_db=# ALTER TABLE versions ADD FOREIGN KEY(id) REFERENCES docs;
Of course, there is nothing stopping you from getting a 'combined' view similar to your original table:
CREATE VIEW v_docs AS
SELECT id, version, link_id, content from docs join versions using(id);
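The split can be tried end to end with a short script. This is a sketch that uses Python's sqlite3 module as a stand-in for PostgreSQL (an assumption), with made-up sample rows and a hypothetical helper, link_is_rejected, to show that a link to a non-existent document is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite leaves FK checks off by default
conn.executescript("""
create table docs (
    id      varchar(32) primary key,
    link_id varchar(32) references docs (id)
);
create table versions (
    id      varchar(32) not null references docs (id),
    version integer not null default 1,
    content varchar(128),
    primary key (id, version)
);
create view v_docs as
    select id, version, link_id, content from docs join versions using (id);
""")

# Sample data (invented): document 'b' links to document 'a'.
with conn:
    conn.execute("insert into docs values ('a', null)")
    conn.execute("insert into docs values ('b', 'a')")
    conn.executemany(
        "insert into versions values (?, ?, ?)",
        [("a", 1, "first"), ("a", 2, "second"), ("b", 1, "other")],
    )


def link_is_rejected(doc_id, link_id):
    """Return True if inserting a doc with this link_id violates the foreign key."""
    try:
        with conn:
            conn.execute("insert into docs values (?, ?)", (doc_id, link_id))
        return False
    except sqlite3.IntegrityError:
        return True


print(link_is_rejected("c", "missing"))  # True: 'missing' is not a document id
rows = conn.execute(
    "select version, content from v_docs where id = 'a' order by version"
).fetchall()
print(rows)  # [(1, 'first'), (2, 'second')]
```

Because docs now has one row per document, id is unique again and link_id can reference it directly, while v_docs still shows every version.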
Depending on whether it's what you want, you can simply create a FOREIGN KEY that includes the version field. That's the only way to point to a unique row...
If that doesn't work, you can write a TRIGGER (for all UPDATEs and INSERTs on the table) that makes the check. Note that you will also need a trigger on the docs table, that restricts modifications on that table that would break the key (such as a DELETE or UPDATE on the key value itself).
You cannot do this with a CHECK constraint, because a CHECK constraint cannot access data in another table.
