SQL Database design - Large permission table - sql-server

I have an interesting issue with the design for user permissions on our existing database.
We have a large table with well over 10 million records that the powers that be now want individual permissions placed on.
I immediately think OK, create a new table, give each user their own column, and track access with a bit value.
         | U1 | U2 | U3 | U4 | etc.
=============================================
Record 1 |  1 |  0 |  1 |  0 |
Record 2 |  0 |  0 |  1 |  1 |
Record 3 |  0 |  1 |  1 |  0 |
Record 4 |  1 |  0 |  1 |  1 |
Then I was told to expect 300 users as we allow more access to the database, which would mean 300 columns :/
So can anyone out there think of a better way to do this? Any thoughts or suggestions will be gratefully received.

Like I mentioned in the comments, use a normalised approach. Considering that the value for the allow/deny is a bit, you could actually just make do with 2 columns, the User's ID and the Record's ID:
CREATE TABLE dbo.UserAllowPermissions (UserID int,
                                       RecordID int);
ALTER TABLE dbo.UserAllowPermissions
    ADD CONSTRAINT PK_UserRecordPermission
    PRIMARY KEY CLUSTERED (UserID, RecordID);
Then you can INSERT the data you already have above like below:
INSERT INTO dbo.UserAllowPermissions (UserID, RecordID)
VALUES (1, 1),
       (1, 4),
       (2, 3),
       (3, 1),
       (3, 2),
       (3, 3),
       (3, 4),
       (4, 2),
       (4, 4);
If you want to revoke the permission of a User, then just delete the relevant row. For example, say you want to revoke the permission for User 3 to Record 4:
DELETE FROM dbo.UserAllowPermissions
WHERE UserID = 3
  AND RecordID = 4;
And, unsurprisingly, if you want to grant a user permission, just INSERT the row:
INSERT INTO dbo.UserAllowPermissions (UserID, RecordID)
VALUES(5,1);
Which would grant the User with the ID of 5 access to the Record with an ID of 1.
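At query time you would then filter the big table through this permission table. A minimal sketch, assuming the 10-million-row table is called dbo.BigTable and is keyed by RecordID (both names hypothetical):
-- Hypothetical names: dbo.BigTable / RecordID stand in for the real table and key.
DECLARE @UserID int = 3;

SELECT bt.*
FROM dbo.BigTable AS bt
INNER JOIN dbo.UserAllowPermissions AS uap
    ON uap.RecordID = bt.RecordID
WHERE uap.UserID = @UserID;
The clustered primary key on (UserID, RecordID) already covers this lookup, since UserID is the leading key column.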

Related

SQL Server Merge Update With Partial Sources

I have a target table for which partial data arrives at different times from 2 departments. The keys they use are the same, but the fields they provide are different. Most of the rows they provide have common keys, but there are some rows that are unique to each department. My question is about the fields, not the rows:
Scenario
the target table has a key and 30 fields.
Dept. 1 provides fields 1-20
Dept. 2 provides fields 21-30
Suppose I loaded Q1 data from Dept. 1, and that created new rows 100-199 and populated fields 1-20. Later, I receive Q1 data from Dept. 2. Can I execute the same merge code I previously used for Dept. 1 to update rows 100-199 and populate fields 21-30 without unintentionally changing fields 1-20? Alternatively, would I have to tailor separate merge code for each Dept.?
In other words, does (or can) "Merge / Update" operate only on target fields that are present in the source table while ignoring target fields that are NOT present in the source table? In this way, Dept. 1 fields would NOT be modified when merging Dept. 2, or vice-versa, in the event I get subsequent corrections to this data from either Dept.
You can use a MERGE statement, where you define a source and a target, and what happens when a record is found in both, only in the source, or only in the target. You can even extend it with custom logic, like "it's only in the source and it's older than X", or "it's from department Y".
-- I'm skipping the fields 2-20 and 22-30, just to make this shorter.
create table #target (
    id int primary key,
    field1 varchar(100), -- and so on until 20
    field21 varchar(100) -- and so on until 30
)
create table #dept1 (
    id int primary key,
    field1 varchar(100)
)
create table #dept2 (
    id int primary key,
    field21 varchar(100)
)
/*
Creates some data to merge into the target.
The expected result is:
| id | field1 | field21 |
| - | - | - |
| 1 | dept1: 1 | dept2: 1 |
| 2 | | dept2: 2 |
| 3 | dept1: 3 | |
| 4 | dept1: 4 | dept2: 4 |
| 5 | | dept2: 5 |
*/
insert into #dept1 values
(1,'dept1: 1'),
--(2,'dept1: 2'),
(3,'dept1: 3'),
(4,'dept1: 4')
insert into #dept2 values
(1,'dept2: 1'),
(2,'dept2: 2'),
--(3,'dept2: 3'),
(4,'dept2: 4'),
(5,'dept2: 5')
-- Inserts the data from the first department. This could also be a merge, if necessary.
insert into #target(id, field1)
select id, field1 from #dept1
merge into #target t
using (select id, field21 from #dept2) as source_data(id, field21)
on (source_data.id = t.id)
when matched then update set field21=source_data.field21
when not matched by source and t.field21 is not null then delete -- you can even use merge to remove some records that match your criteria
when not matched by target then insert (id, field21) values (source_data.id, source_data.field21); -- Every merge statement should end with ;
select * from #target
You can see this code running on this DB Fiddle
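To answer the original question directly: yes, you would tailor one MERGE per department, but each one stays small because it only names its own columns, so the other department's fields are never touched. The Dept. 1 counterpart of the statement above would look like this (a sketch mirroring the #dept2 merge, using the same temp tables):
merge into #target t
using (select id, field1 from #dept1) as source_data(id, field1)
on (source_data.id = t.id)
when matched then update set field1 = source_data.field1
when not matched by target then insert (id, field1) values (source_data.id, source_data.field1);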

Return all results from left table when NULL present in right table and results of inner join when no null present in right table

Hi just wondering if this scenario is possible?
I have two tables and a relationship table to create a many-to-many relationship between the two tables. See the below tables for a simple representation:
Security groups:
| Security ID | Security Group |
|-------------|----------------|
| 1           | Admin          |
| 2           | Basic          |
Relationship table:
| Security ID | Access ID |
|-------------|-----------|
| 1           | NULL      |
| 2           | 1         |
Functions:
| Function ID | Function Code |
|-------------|---------------|
| 1           | Search        |
| 2           | Delete        |
What I want to achieve is while checking the relationship table I want to return all functions a user on a security group has access to. If the user is assigned to a security group that contains a NULL value in the relationship table then grant them access to all functions.
For instance, a user on the "Basic" security group would have access to the search function while a user on the "Admin" security group should have access to both Search and Delete.
The reason it is set up this way is because a user can have 0 to many security groups, and the list of functions is very large, requiring the use of a whitelist of functions you can access instead of a blacklist of functions you can't access.
Thank you for your time.
Your sample tables:
CREATE TABLE #G
(
Security_ID INT,
Security_Group VARCHAR(32)
)
INSERT INTO #G
VALUES (1, 'Admin'), (2, 'Basic')
CREATE TABLE #A
(
Security_ID INT,
Access_ID INT
)
INSERT INTO #A
VALUES (1, NULL), (2, 1)
CREATE TABLE #F
(
Function_ID INT,
Function_CODE VARCHAR(32)
)
INSERT INTO #F
VALUES (1, 'Search'), (2, 'Delete')
Query:
SELECT #G.Security_Group, #F.Function_CODE
FROM #G
JOIN #A ON #G.Security_ID = #A.Security_ID
JOIN #F ON #F.Function_ID = #A.Access_ID OR #A.Access_ID IS NULL
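For the sample data this should return exactly what was described: the NULL wildcard gives Admin every function, while Basic only gets Search:
Security_Group | Function_CODE
---------------+--------------
Admin          | Search
Admin          | Delete
Basic          | Search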
Dropping the sample tables:
DROP TABLE #G
DROP TABLE #A
DROP TABLE #F

Generating sets from imported rules

The title is pretty imprecise but I wanted to avoid copying the whole question into it. The sets are actually maps as explained in the first section, the rules come in the second and the problem is that generating everything would mean too much data as detailed in the last section.
Current state
There are currently tens, soon hundreds of Customers. Each of them may have thousands of Items and thousands of Catalogs. Every item and every catalog belong to exactly one customer and data of different customers don't interact.
Items and catalogs stand in an m:n relationship. I see the catalogs as overlapping sets of items and there are some additional details associated. The data come from an import file looking like
catalog1 item1 details11
catalog1 item3 details13
catalog2 item1 details12
catalog2 item2 details22
In the database, there is a connecting table with three columns just like the import file.
In this example, I retrieve the content of catalog1 as {item1: details11, item3: details13}, etc. Retrieving the content of a catalog in this form is the only important query. So far, it's pretty trivial.
The imports happen a few times a day and I have to update the database content accordingly. The imports are partial in the sense that only the data of a single customer ever gets imported, which means that only a part of the data is affected. The imports are full in the sense that the import file contains all data of a given customer, so I have to add what's new, update what's changed and remove what's missing from the new import file. Still rather trivial.
New requirement
Now, groups are to be introduced. Every Item may be a member of multiple ItemGroups and every Catalog may be a member of multiple CatalogGroups. This information is available separately and the import format hardly changes: In place of a catalog, there may be a catalog group and in place of an item, there may be an item group.
So there are rules like
catalog1 item1 details100
catalogGroup1 item1 details101
catalog1 itemGroup1 details102
catalogGroup1 itemGroup1 details102
in place of simple connection table rows. These rules may and will conflict and I'm currently inclined to resolve the conflicts by giving precedence to earlier rules (the producer of the import files will accept my decision).
In the Details, there may be a piece of information stating that the corresponding item(s) should be excluded from the catalog(s), for example
catalog1 item1 EXCLUDE
catalog1 itemGroup1 someDetails
means that catalog1 includes all items from itemGroup1 except item1 (the first rule wins).
The Problem
Our connection table already has nearly one million rows and we're just starting. If there wasn't the new requirement, it could grow to some hundreds of millions of rows, which is acceptable.
With the new requirement, this number may grow much faster, so that storing the connection table may be infeasible. Already now, this table takes more disk space than all remaining ones together. It's also rather easy to write a rule generating millions of rows by mistake (which will surely happen one day).
All we need is to be able to retrieve the content of a catalog rather quickly, i.e., in less than half a second when it contains some hundreds of items. We don't necessarily need to store the table as we have until now; a few JOINs using an index and some simple postprocessing should be alright.
Many catalogs won't get queried at all, but this doesn't help as we don't know which ones.
The imports needn't be fast. Currently, they take a second or two, but a few minutes would be acceptable.
So I wonder if I should create four tables, one for each combination of catalog or catalogGroup with item or itemGroup. Each of the tables would also contain the line number from the import file, so that I could retrieve all rules matching the requested catalog and resolve the conflicts in a postprocessing.
Or would some hacky solution be better? I'm a bit inclined to create a single table
catalog, catalogGroup, item, itemGroup, lineNo, details_part1, details_part2, ...
(where always exactly two of the first four columns are used), as the details are actually tuples of a few parts, which makes the four tables very repetitive. I could extract the details into a new table or blob them together instead.
I'm looking for some general advice on how to tackle this efficiently.
I guess some details are missing, but the question is far too long already. Feel free to ask. It's a web application using Java, Hibernate and MySQL, but I'm omitting the tags as it hardly matters.
Answers to comments
Is the query still for catalog contents?
Yes, just like before. The groups are sort of input compression, nothing more.
And you return an item if it's either connected to a catalog or a group containing a queried catalog?
Or a member of such a group.
How do item groups work?
Both kinds of groups work the same:
A rule containing a group is equivalent to a list of rules, one for each group member.
I suggest beginning with normalized tables and a simple join. If I'm reading your question correctly, here's a reasonable schema:
CREATE TABLE items (
    id integer PRIMARY KEY,
    name varchar(40)
);
CREATE TABLE catalogs (
    id integer PRIMARY KEY,
    name varchar(40)
);
CREATE TABLE item_groups (
    id integer PRIMARY KEY,
    name varchar(40)
);
CREATE TABLE catalog_groups (
    id integer PRIMARY KEY,
    name varchar(40)
);
CREATE TABLE items_in_groups (
    item_id integer REFERENCES items(id),
    item_group_id integer REFERENCES item_groups(id),
    PRIMARY KEY (item_id, item_group_id)
);
CREATE TABLE catalogs_in_groups (
    catalog_id integer REFERENCES catalogs(id),
    catalog_group_id integer REFERENCES catalog_groups(id),
    PRIMARY KEY (catalog_id, catalog_group_id)
);
CREATE TABLE exclusions (
    catalog_id integer REFERENCES catalogs(id),
    item_id integer REFERENCES items(id),
    PRIMARY KEY (catalog_id, item_id)
);
CREATE TABLE connections (
    catalog_group_id integer REFERENCES catalog_groups(id),
    item_group_id integer REFERENCES item_groups(id),
    details varchar(40),
    PRIMARY KEY (catalog_group_id, item_group_id)
);
Note that items_in_groups and catalogs_in_groups get a singleton entry for each item and catalog respectively. That is, each item is represented by a 1-element group and the same for catalogs.
Now add some data:
INSERT INTO items VALUES (1, 'Item 1'), (2, 'Item 2'), (3, 'item 3'), (4, 'item 4'), (5, 'item 5'), (6, 'item 6');
INSERT INTO catalogs VALUES (1, 'Catalog 1'), (2, 'Catalog 2'), (3, 'Catalog 3'), (4, 'Catalog 4'),
(5, 'Catalog 5'), (6, 'Catalog 6');
INSERT INTO item_groups VALUES
(1, 'Item group 1'), (2, 'Item group 2'), (3, 'Item group 3'), (4, 'Item group 4'), (5, 'Item group 5'),
(6, 'Item group 6'), (10, 'Item group 10'), (11, 'Item group 11'), (12, 'Item group 12');
INSERT INTO catalog_groups VALUES
(1, 'Catalog group 1'), (2, 'Catalog group 2'), (3, 'Catalog group 3'), (4, 'Catalog group 4'),
(5, 'Catalog group 5'), (6, 'Catalog group 6'), (10, 'Catalog group 10'), (11, 'Catalog group 11'),
(12, 'Catalog group 12');
INSERT INTO items_in_groups VALUES
(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (2, 10), (4, 10),
(6, 10), (1, 11), (3, 11), (5, 11), (1, 12), (2, 12), (3, 12), (6, 12);
INSERT INTO catalogs_in_groups VALUES
(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (2, 11), (4, 11),
(6, 11), (1, 10), (3, 10), (5, 10), (3, 12), (4, 12), (5, 12), (6, 12);
INSERT INTO exclusions VALUES (5, 1), (6, 3);
INSERT INTO connections VALUES (1, 10, 'Details 1-10'), (11, 2, 'Details 11-2'), (12, 12, 'Details 12-12');
And query:
SELECT cig.catalog_id, iig.item_id, c.details
FROM connections AS c
INNER JOIN item_groups AS ig ON c.item_group_id = ig.id
INNER JOIN catalog_groups AS cg ON c.catalog_group_id = cg.id
INNER JOIN items_in_groups AS iig ON iig.item_group_id = ig.id
INNER JOIN catalogs_in_groups AS cig ON cig.catalog_group_id = cg.id
WHERE NOT EXISTS (
    SELECT NULL
    FROM exclusions e
    WHERE iig.item_id = e.item_id AND cig.catalog_id = e.catalog_id)
ORDER BY cig.catalog_id, iig.item_id;
The result:
catalog_id | item_id | details
------------+---------+---------------
1 | 2 | Details 1-10
1 | 4 | Details 1-10
1 | 6 | Details 1-10
2 | 2 | Details 11-2
3 | 1 | Details 12-12
3 | 2 | Details 12-12
3 | 3 | Details 12-12
3 | 6 | Details 12-12
4 | 1 | Details 12-12
4 | 2 | Details 11-2
4 | 2 | Details 12-12
4 | 3 | Details 12-12
4 | 6 | Details 12-12
5 | 2 | Details 12-12
5 | 3 | Details 12-12
5 | 6 | Details 12-12
6 | 1 | Details 12-12
6 | 2 | Details 11-2
6 | 2 | Details 12-12
6 | 6 | Details 12-12
(20 rows)
You could add items and catalogs to the join to look up the respective names rather than stopping with IDs.
SELECT cat.name, item.name, c.details
FROM connections AS c
INNER JOIN item_groups AS ig ON c.item_group_id = ig.id
INNER JOIN catalog_groups AS cg ON c.catalog_group_id = cg.id
INNER JOIN items_in_groups AS iig ON iig.item_group_id = ig.id
INNER JOIN catalogs_in_groups AS cig ON cig.catalog_group_id = cg.id
INNER JOIN catalogs AS cat ON cat.id = cig.catalog_id
INNER JOIN items AS item ON item.id = iig.item_id
WHERE NOT EXISTS (
    SELECT NULL
    FROM exclusions e
    WHERE iig.item_id = e.item_id AND cig.catalog_id = e.catalog_id)
ORDER BY cig.catalog_id, iig.item_id;
Like so...
name | name | details
-----------+--------+---------------
Catalog 1 | Item 2 | Details 1-10
Catalog 1 | item 4 | Details 1-10
Catalog 1 | item 6 | Details 1-10
Catalog 2 | Item 2 | Details 11-2
Catalog 3 | Item 1 | Details 12-12
Catalog 3 | Item 2 | Details 12-12
Catalog 3 | item 3 | Details 12-12
Catalog 3 | item 6 | Details 12-12
Catalog 4 | Item 1 | Details 12-12
Catalog 4 | Item 2 | Details 11-2
Catalog 4 | Item 2 | Details 12-12
Catalog 4 | item 3 | Details 12-12
Catalog 4 | item 6 | Details 12-12
Catalog 5 | Item 2 | Details 12-12
Catalog 5 | item 3 | Details 12-12
Catalog 5 | item 6 | Details 12-12
Catalog 6 | Item 1 | Details 12-12
Catalog 6 | Item 2 | Details 11-2
Catalog 6 | Item 2 | Details 12-12
Catalog 6 | item 6 | Details 12-12
(20 rows)
As you can see, there are duplicate catalog/item ID pairs with respective details. I suppose this is what you meant by "conflicts." It wouldn't be hard to tweak the query to honor a priority rule to choose one of the alternatives.
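For example, if you record the import-file line number on connections (a hypothetical line_no column, following the "earlier rules win" policy from the question), a window function can keep just the highest-priority row per catalog/item pair. A sketch, assuming MySQL 8+ (exclusions omitted for brevity):
SELECT catalog_id, item_id, details
FROM (
    SELECT cig.catalog_id, iig.item_id, c.details,
           ROW_NUMBER() OVER (PARTITION BY cig.catalog_id, iig.item_id
                              ORDER BY c.line_no) AS rn
    FROM connections AS c
    INNER JOIN items_in_groups AS iig ON iig.item_group_id = c.item_group_id
    INNER JOIN catalogs_in_groups AS cig ON cig.catalog_group_id = c.catalog_group_id
) AS ranked
WHERE rn = 1
ORDER BY catalog_id, item_id;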
Getting a specific catalog's items is just an additional AND clause:
SELECT cat.name, item.name, c.details
FROM connections AS c
INNER JOIN item_groups AS ig ON c.item_group_id = ig.id
INNER JOIN catalog_groups AS cg ON c.catalog_group_id = cg.id
INNER JOIN items_in_groups AS iig ON iig.item_group_id = ig.id
INNER JOIN catalogs_in_groups AS cig ON cig.catalog_group_id = cg.id
INNER JOIN catalogs AS cat ON cat.id = cig.catalog_id
INNER JOIN items AS item ON item.id = iig.item_id
WHERE NOT EXISTS (
    SELECT NULL
    FROM exclusions e
    WHERE iig.item_id = e.item_id AND cig.catalog_id = e.catalog_id)
  AND cat.id = 3
ORDER BY cig.catalog_id, iig.item_id;
And...
name | name | details
-----------+--------+---------------
Catalog 3 | Item 1 | Details 12-12
Catalog 3 | Item 2 | Details 12-12
Catalog 3 | item 3 | Details 12-12
Catalog 3 | item 6 | Details 12-12
(4 rows)
As usual, there are several other ways to write the query. I'm no SQL wizard, but SO has many. If this first hack isn't good enough, come back and ask. Denormalization and crazy pre- and post-processing schemes take away future flexibility.
You can make a catalog relation table like this:
Catalog-relation table
+----+------------+-----------------+
| Id | catalog    | ref_catalog     |
+----+------------+-----------------+
| 1  | 'catalog1' | 'catalog1'      |
| 2  | 'catalog1' | 'catalogGroup1' |
| 3  | 'catalog2' | 'catalog2'      |
| 4  | 'catalog2' | 'catalogGroup1' |
+----+------------+-----------------+
When calculating the content of catalog1, you can search for the rows where catalog is 'catalog1'. Assume that the count of these will not be big.
Catalog-item table
+----+-----------------+--------------+
| Id | catalog         | ref_item     |
+----+-----------------+--------------+
| 1  | 'catalog1'      | 'item1'      |
| 2  | 'catalogGroup1' | 'itemGroup2' |
| 3  | 'catalog1'      | 'item2'      |
| 4  | 'catalogGroup2' | 'itemGroup2' |
+----+-----------------+--------------+
Then you search the catalog-item table.
After that, replace each itemGroup with its separate items.
I think this will take O(n) rather than O(n x n).
I recommend using a Ruby on Rails model to improve performance. Without a hard algorithm, you can easily make a nested structure of tables using belongs_to & has_many.

How to create database within a database(postgres)?

Actually I'm a noob and have been stuck on this problem for a week. I will try explaining it.
I have a table for USER,
and a table for product.
I want to store data of every user for every product, like if_product_bought, num_of_items, and so on.
So the only solution I can think of is a database within a database, that is, to create a copy of products inside a per-user database and start storing.
If this is possible, how? Or is there any other, better solution?
Thanks in advance
You actually don't create a database within a database (or a table within a table) when you use PostgreSQL or any other SQL RDBMS.
You use tables, and JOIN them. You normally would have an orders table, together with an items_x_orders table, on top of your users and items.
This is a very simplified scenario:
CREATE TABLE users
(
    user_id INTEGER /* SERIAL */ NOT NULL PRIMARY KEY,
    user_name text
) ;
CREATE TABLE items
(
    item_id INTEGER /* SERIAL */ NOT NULL PRIMARY KEY,
    item_description text NOT NULL,
    item_unit text NOT NULL,
    item_standard_price decimal(10,2) NOT NULL
) ;
CREATE TABLE orders
(
    order_id INTEGER /* SERIAL */ NOT NULL PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    order_date DATE NOT NULL DEFAULT now(),
    other_data TEXT
) ;
CREATE TABLE items_x_orders
(
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    item_id INTEGER NOT NULL REFERENCES items(item_id),
    -- An item is not supposed to appear more than once in an order;
    -- this makes the following the "natural key" for this table.
    PRIMARY KEY (order_id, item_id),
    item_quantity DECIMAL(10,2) NOT NULL CHECK(item_quantity <> /* > */ 0),
    item_percent_discount DECIMAL(5,2) NOT NULL DEFAULT 0.0,
    other_data TEXT
) ;
This is all based on the so-called Relational Model. What you were thinking about is something else called a Hierarchical model, or a document model used in some NoSQL databases (where you store your data as a JSON or XML hierarchical structure).
You would fill those tables with data like:
INSERT INTO users
(user_id, user_name)
VALUES
(1, 'Alice Cooper') ;
INSERT INTO items
(item_id, item_description, item_unit, item_standard_price)
VALUES
(1, 'Oranges', 'kg', 0.75),
(2, 'Cookies', 'box', 1.25),
(3, 'Milk', '1l carton', 0.90) ;
INSERT INTO orders
(order_id, user_id)
VALUES
(100, 1) ;
INSERT INTO items_x_orders
(order_id, item_id, item_quantity, item_percent_discount, other_data)
VALUES
(100, 1, 2.5, 0.00, NULL),
(100, 2, 3.0, 0.00, 'I don''t want Oreo'),
(100, 3, 1.0, 0.05, 'Make it promo milk') ;
And then you would produce queries like the following one, where you JOIN all relevant tables:
SELECT
user_name, item_description, item_quantity, item_unit,
item_standard_price, item_percent_discount,
CAST(item_quantity * (item_standard_price * (1-item_percent_discount/100.0)) AS DECIMAL(10,2)) AS items_price
FROM
items_x_orders
JOIN orders USING (order_id)
JOIN items USING (item_id)
JOIN users USING (user_id) ;
...and get these results:
user_name | item_description | item_quantity | item_unit | item_standard_price | item_percent_discount | items_price
:----------- | :--------------- | ------------: | :-------- | ------------------: | --------------------: | ----------:
Alice Cooper | Oranges | 2.50 | kg | 0.75 | 0.00 | 1.88
Alice Cooper | Cookies | 3.00 | box | 1.25 | 0.00 | 3.75
Alice Cooper | Milk | 1.00 | 1l carton | 0.90 | 5.00 | 0.86
You can get all the code and test at dbfiddle here
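And to address the original "if_product_bought, num_of_items" idea directly: that is just another query over the same tables, not a new database. A sketch (not part of the fiddle above):
-- Per-user, per-product summary: whether a product was ever bought and the total quantity.
SELECT u.user_name,
       i.item_description,
       COUNT(ixo.order_id) > 0             AS product_bought,
       COALESCE(SUM(ixo.item_quantity), 0) AS num_of_items
FROM users u
CROSS JOIN items i
LEFT JOIN orders o
       ON o.user_id = u.user_id
LEFT JOIN items_x_orders ixo
       ON ixo.order_id = o.order_id
      AND ixo.item_id = i.item_id
GROUP BY u.user_name, i.item_description;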

TSQL Trigger to maintain incremental order integrity

I am trying to puzzle out a trigger in a SQL Server database. I am a student working on a summer project so I am no pro at this but can easily learn it.
This is a simplified version of my database table sorted by rank:
ID as primary key
ID | RANK
--------------
2 | NULL
1 | 1
3 | 2
4 | 3
7 | 4
The objective for me right now is to have the ability to insert/delete/update the rank and maintain an incremental order of ranks in the database, with no missing numbers in available positions and no duplicates.
/* Insert new row */
INSERT INTO TABLE (ID, RANK) VALUES (6, 4)
/* AFTER INSERT */
ID | RANK
--------------
2 | NULL
1 | 1
3 | 2
4 | 3
6 | 4 <- new
7 | 5 <- notice how the rank increased to make room for the new row
I think doing this in a trigger is the most efficient/easiest way, although I may be wrong.
As an alternative to a trigger, I have made a temporary solution that uses front-end code to run updates on each row when any rank is changed.
If you know how or if a trigger could do this please share.
EDIT: Added scenarios
The rank being inserted would always take its assigned number. Everything that is greater than or equal to the one being inserted would increase.
The rank causing the trigger will always have priority to claim its number while everything else will have rank increased to accommodate.
If rank is the highest number then the trigger would ensure that the number is +1 of the max.
This may work for you. Let me know.
DROP TABLE IF EXISTS dbo.test
CREATE TABLE dbo.test (id int, ranke int)
INSERT INTO test VALUES (2, NULL)
INSERT INTO test VALUES (1, 1)
INSERT INTO test VALUES (3, 2)
INSERT INTO test VALUES (4, 3)
INSERT INTO test VALUES (7, 4)
GO
CREATE TRIGGER t_test
ON test
AFTER INSERT
AS
-- Shift every rank at or above the newly inserted rank up by one, skipping the
-- inserted row itself. Inserting a NULL rank shifts nothing, since the
-- comparison with NULL filters out every row. Note this assumes single-row
-- inserts; a multi-row INSERT would need a set-based rewrite.
UPDATE test SET ranke += 1
WHERE ranke >= (SELECT MAX(ranke) FROM inserted)
  AND id <> (SELECT MAX(id) FROM inserted)
GO
INSERT INTO test VALUES (6, 4)
INSERT INTO test VALUES (12, NULL)
SELECT * FROM test
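With the sample inserts above, the final SELECT should show the new row claiming rank 4, the old rank 4 bumped to 5, and the NULL insert leaving every rank untouched (row order may vary without an ORDER BY):
id | ranke
---+------
2  | NULL
1  | 1
3  | 2
4  | 3
7  | 5
6  | 4
12 | NULL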
