Advanced SQL Server key combination - sql-server

I have a SQL Server address table.
The user can populate either a StreetId, a NeighborhoodId, or a CityID, but must not populate two of these three fields. I could restrict the user through the UI, but I would prefer to enforce this rule at the database level.
Is there a way to do this?

You are better off doing this in your business logic layer.
UI
↓
Business logic
↓
Data Access
Business rules like this can be defined in a separate business logic layer (BLL). The UI calls the BLL and receives a message back (which can be a warning, information, or an error) if the rules are not adhered to. If all the rules defined in the BLL are satisfied for the current use case, the BLL calls the data access layer (DAL), which in turn calls the database, e.g. a stored procedure with the parameters authorised by the BLL using the defined business rules.
Hope it is clear.

This is the first time I'm messing with CHECK constraints, but I think the code below might be helpful.
Let's say we have a table BaseTable:
CREATE TABLE dbo.BaseTable
(
    StreetId VARCHAR(50),
    NeighborhoodId VARCHAR(50),
    CityID VARCHAR(50)
)
Let's add a CHECK constraint:
ALTER TABLE dbo.BaseTable
ADD CONSTRAINT CheckOnlyOneColumnValue
CHECK (
    (
        CASE WHEN StreetId IS NOT NULL THEN 1 ELSE 0 END
        + CASE WHEN NeighborhoodId IS NOT NULL THEN 1 ELSE 0 END
        + CASE WHEN CityID IS NOT NULL THEN 1 ELSE 0 END
    ) = 1
)
GO
Test:
These insert queries will work just fine:
INSERT INTO dbo.BaseTable(StreetId) VALUES ('StreetId')
INSERT INTO dbo.BaseTable(NeighborhoodId) VALUES ('NeighborhoodId')
INSERT INTO dbo.BaseTable(CityID) VALUES ('CityID')
But these queries will fail:
INSERT INTO dbo.BaseTable(StreetId, NeighborhoodId)
VALUES ('StreetId', 'NeighborhoodId')
INSERT INTO dbo.BaseTable(StreetId, NeighborhoodId, CityID)
VALUES ('StreetId', 'NeighborhoodId', 'CityID')
UPDATE dbo.BaseTable SET NeighborhoodId = 'NeighborhoodId'
WHERE StreetId = 'StreetId'
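Note that this constraint demands exactly one of the three columns be non-NULL, so a row with all three NULL is rejected as well. If an "empty" address row should be allowed, a <= 1 variant would do it (a sketch, used as a replacement for the constraint above on the same table):
ALTER TABLE dbo.BaseTable
ADD CONSTRAINT CheckAtMostOneColumnValue
CHECK (
    (
        CASE WHEN StreetId IS NOT NULL THEN 1 ELSE 0 END
        + CASE WHEN NeighborhoodId IS NOT NULL THEN 1 ELSE 0 END
        + CASE WHEN CityID IS NOT NULL THEN 1 ELSE 0 END
    ) <= 1
)
GO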

Provide counterfeit data protection for end users of service – tools?

I need to create a service, but I need help with the choice of tools.
Imagine a service in which users create data that has value as a historical record (e.g. transactions). Other users can see this data, but they need proof that the data is real and was not falsified by users or even by the service itself.
Example:
User A creates a record with the number 42.
A couple of months pass.
User B sees this record and wants to be sure that the service can't have replaced it with some other number, say 37.
The service has a 24-hour trust window: it can still change user data created within the current day.
Question: which tools can help me achieve this?
I was thinking about publishing daily backups (or reports?) that any user can download. A hash would be calculated from each report and inserted into the next backup; thus, a chain of hashes is created. If the service later changes something in the past, the hashes in this chain will no longer match. Of course, I'll create an open-source tool for easily diffing the data and checking that the chain is valid.
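A sketch of the chaining idea in MySQL-flavoured SQL; the daily_report table and the @content_hash variable are hypothetical names, not part of any existing system:
-- One row per published report; chain_hash folds in the previous day's chain_hash
CREATE TABLE daily_report (
    report_date DATE PRIMARY KEY,
    content_hash CHAR(64) NOT NULL, -- hash of the report body
    chain_hash CHAR(64) NOT NULL -- SHA2(previous chain_hash + content_hash)
);
-- @content_hash is the hash of today's report body, e.g.:
SET @content_hash = SHA2('...report body...', 256);
-- Append today's report; COALESCE to '' seeds the very first link of the chain
INSERT INTO daily_report (report_date, content_hash, chain_hash)
SELECT CURRENT_DATE, @content_hash,
       SHA2(CONCAT(COALESCE(MAX(chain_hash), ''), @content_hash), 256)
FROM daily_report
WHERE report_date = CURRENT_DATE - INTERVAL 1 DAY;
Rewriting any past report then changes its content_hash, which breaks every later chain_hash that users have already downloaded.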
Point of trust: there is one thing I'm afraid of. The service could use many databases simultaneously and rewrite all backups with all hashes in one go (because the first backup has no hash of a previous one). So, to cover that case too, I'm thinking of storing the hashes somewhere the service can't change at all, for example in one of the existing blockchains (BTC, ETH, ...) from an official wallet of the service. Or maybe a DAG with some blockchain, like IOTA?
What do you think of the point of trust?
Can I achieve my goal in some simpler way (without a blockchain)? And which one?
What are the bottlenecks in this logic?
There are two participating variables here:
- the timestamp at which the record is created
- the data

Solution premises:
- Tamper-proofing: the data can be changed within the same GMT calendar day without violating the tamper-proof guarantee (this can be changed to a fixed window after creation).
- RDBMS as the data store (can be adapted to any NoSQL store with minor modifications; the idea remains the same).
- No dependence on any other mechanism which can be faulty or error-prone.
- Single-query verification.
## Proposed solution
Create the data table:
CREATE TABLE TEST(
    ID INT PRIMARY KEY AUTO_INCREMENT,
    DATA VARCHAR(64) NOT NULL,
    CREATED_AT DATETIME DEFAULT CURRENT_TIMESTAMP()
);
Create the checksum table, which monitors tampering:
CREATE TABLE SIGN(
    ID INT PRIMARY KEY AUTO_INCREMENT,
    DATA_ID INT NOT NULL,
    SIGNATURE VARCHAR(128) NOT NULL,
    CREATED_AT DATETIME DEFAULT CURRENT_TIMESTAMP(),
    UPDATED_AT TIMESTAMP
);
Create a trigger on insert of the data:
/** Trigger on insert */
DELIMITER //
CREATE TRIGGER sign_after_insert
AFTER INSERT
ON TEST FOR EACH ROW
BEGIN
    -- insert the signature row for the new record
    INSERT INTO SIGN(DATA_ID, `SIGNATURE`) VALUES(
        NEW.ID, MD5(CONCAT(NEW.DATA, DATE(NEW.CREATED_AT)))
    );
END; //
DELIMITER ;
Create a trigger for updates of the data:
-- UPDATE TRIGGER
DELIMITER //
CREATE TRIGGER SIGN_AFTER_UPDATE
AFTER UPDATE
ON TEST FOR EACH ROW
BEGIN
    -- re-sign only if the value changed and it is still the same calendar day
    IF (NEW.DATA <> OLD.DATA) AND (DATE(OLD.CREATED_AT) = CURRENT_DATE()) THEN
        UPDATE SIGN
        SET SIGNATURE = MD5(CONCAT(NEW.DATA, DATE(NEW.CREATED_AT)))
        WHERE DATA_ID = OLD.ID;
    END IF;
END; //
DELIMITER ;
Test
Step 1: insert the data
INSERT INTO TEST(DATA) VALUES ('DATA2');
The hash of the data concatenated with the date at which it was created is stored as the signature in the SIGN table.
Step 2: update the data
The signature will get updated only if the value is changed and it is the same day:
UPDATE TEST SET DATA='DATA' WHERE ID =1;
Step 3: validate
You can always validate the data signature as follows:
SELECT MD5(CONCAT(T.DATA, DATE(T.`CREATED_AT`))) AS CHECKSUM, S.SIGNATURE
FROM TEST AS T
JOIN SIGN AS S ON S.DATA_ID = T.ID
WHERE S.`ID` = 1;
Output
| CHECKSUM | SIGNATURE |
| ------ | ------ |
| 2bba70178abdafc5915ba0b5061597fa | 2bba70178abdafc5915ba0b5061597fa |
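One caveat: MD5 is no longer collision-resistant, so if attacker-chosen collisions matter for your threat model, MySQL's SHA2() may be the safer choice in the triggers above, e.g.:
SHA2(CONCAT(NEW.DATA, DATE(NEW.CREATED_AT)), 256)
The SIGNATURE column is already VARCHAR(128), which is wide enough for the 64-character hex digest.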

SQL Server constraints for date ranges

I am trying to constrain a SQL Server Database by a Start Date and End Date such that I can never double book a resource (i.e. no overlapping or duplicate reservations).
Assume my resources are numbered such that the table looks like
ResourceId, StartDate, EndDate, Status
So let's say I have resource #1. I want to make sure that I cannot have a reservation for 1/8/2017 thru 1/16/2017 and a separate reservation for 1/10/2017 thru 1/18/2017 for the same resource.
A couple more complications: a StartDate for a resource can be the same as the EndDate for the resource. So 1/8/2017 thru 1/16/2017 and 1/16/2017 thru 1/20/2017 is OK (i.e., one person can check in on the same day another person checks out).
Furthermore, the Status field indicates whether the booking of the resource is Active or Cancelled. So we can ignore all cancelled reservations.
We have protected against these overlapping or double-booked reservations in code (stored procs and C#) when saving, but we are hoping to add an extra layer of protection by adding a DB constraint.
Is this possible in SQL Server?
Thanks in Advance
You can use a CHECK constraint to make sure StartDate is on or before EndDate easily enough:
CONSTRAINT [CK_Tablename_ValidDates] CHECK ([EndDate] >= [StartDate])
A constraint won't help with preventing overlapping date ranges, though. You can instead use a TRIGGER to enforce this, by creating a FOR INSERT, UPDATE trigger that rolls back the transaction if it detects a duplicate:
CREATE TRIGGER [TR_Tablename_NoOverlappingDates] ON [MyTable] FOR INSERT, UPDATE AS
IF EXISTS (SELECT * FROM inserted INNER JOIN [MyTable] ON blah blah blah ...) BEGIN
    ROLLBACK TRANSACTION;
    RAISERROR('hey, no overlapping date ranges here, buddy', 16, 1);
    RETURN;
END
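As a concrete sketch of that join, assuming a dbo.Reservation table with a ReservationId key plus the ResourceId, StartDate, EndDate and Status columns from the question (names illustrative):
CREATE TRIGGER [TR_Reservation_NoOverlappingDates]
ON dbo.Reservation
FOR INSERT, UPDATE
AS
BEGIN
    IF EXISTS (
        SELECT 1
        FROM inserted i
        INNER JOIN dbo.Reservation r
            ON  r.ResourceId = i.ResourceId
            AND r.ReservationId <> i.ReservationId
            AND r.[Status] = 'Active'
            AND i.[Status] = 'Active'
            AND r.StartDate < i.EndDate -- strict <, so checkout day may equal checkin day
            AND i.StartDate < r.EndDate
    )
    BEGIN
        ROLLBACK TRANSACTION;
        RAISERROR('Overlapping reservations are not allowed for a resource.', 16, 1);
        RETURN;
    END
END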
Another option is to create an indexed view that finds duplicates, with a unique constraint on that view which will be violated if more than one such record exists. This is usually accomplished with a dummy two-row table cartesian-joined to an aggregate view that selects the duplicate id: a duplicate then produces two rows in the view with the same fake id value under a unique index.
I've done both; I like the trigger approach better.
Drawing from this answer here: Date range overlapping check constraint.
First, check to make sure there are no existing overlaps:
select *
from dbo.Reservation as r
where exists (
    select 1
    from dbo.Reservation i
    where i.ResourceId = r.ResourceId
        and i.ReservationId != r.ReservationId
        and isnull(i.EndDate, '20990101') > r.StartDate
        and isnull(r.EndDate, '20990101') > i.StartDate
);
go
If it is all clear, then create your function.
There are a couple of different ways to write the function, e.g. we could skip the StartDate and EndDate and use something based only on ReservationId like the query above, but I will use this as the example:
create function dbo.udf_chk_Overlapping_StartDate_EndDate (
    @ResourceId int
    , @StartDate date
    , @EndDate date
) returns bit as
begin;
    declare @r bit = 1;
    if not exists (
        select 1
        from dbo.Reservation as r
        where r.ResourceId = @ResourceId
            and isnull(@EndDate, '20991231') > r.StartDate
            and isnull(r.EndDate, '20991231') > @StartDate
            and r.[Status] = 'Active'
        group by r.ResourceId
        having count(*) > 1
    )
        set @r = 0;
    return @r;
end;
go
Then add your constraint:
alter table dbo.Reservation
add constraint chk_Overlapping_StartDate_EndDate
check (dbo.udf_chk_Overlapping_StartDate_EndDate(ResourceId,StartDate,EndDate)=0);
go
Last: Test it.
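For example (assuming a dbo.Reservation table shaped as above, with an identity ReservationId), the second insert should be rejected by the constraint and the third allowed:
INSERT INTO dbo.Reservation (ResourceId, StartDate, EndDate, [Status])
VALUES (1, '20170108', '20170116', 'Active'); -- ok: first booking
INSERT INTO dbo.Reservation (ResourceId, StartDate, EndDate, [Status])
VALUES (1, '20170110', '20170118', 'Active'); -- fails: overlaps the first booking
INSERT INTO dbo.Reservation (ResourceId, StartDate, EndDate, [Status])
VALUES (1, '20170116', '20170120', 'Active'); -- ok: back-to-back bookings are allowed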

SQL use a variable as TABLE NAME in a FROM

We install our database(s) to different customers and the name can change depending on the deployment.
What I need to know is if you can use a variable as a table name.
The database we are in is ****_x and we need to access ****_m.
This code is part of a function.
I need the @metadb variable to be the table name, maybe using dynamic SQL with
sp_executesql. I am just learning, so take it easy on me.
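For reference, the sp_executesql shape would be something like the sketch below (a sketch only; note that SQL Server does not allow dynamic SQL inside a function, so this has to live in a stored procedure or ad hoc batch):
DECLARE @metadb sysname = REPLACE(DB_NAME(), '_x', '_m');
DECLARE @sql nvarchar(max) =
    N'SELECT is_work_day FROM ' + QUOTENAME(@metadb) + N'.dbo.md_calendar_day'
    + N' WHERE calendar_code = @cal AND calendar_date = @d';
EXEC sp_executesql @sql,
    N'@cal nchar(30), @d datetime',
    @cal = 'CA_ON', @d = '20170116';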
CREATE FUNCTION [dbo].[datAddSp] (
    @cal NCHAR(30) -- calendar to use for non-working days
    ,@bDays INT -- number of business days to add or subtract
    ,@d DATETIME
)
RETURNS DATETIME
AS
BEGIN
    DECLARE @nDate DATETIME -- the working date
        ,@addsub INT -- factor for adding or subtracting
        ,@metadb sysname
    SET @metadb = DB_NAME()
    SET @metadb = REPLACE(@metadb, '_x', '_m')
    SET @metadb = CONCAT(@metadb, '.dbo.md_calendar_day')
    SET @nDate = @d
    IF @bDays > 0
        SET @addsub = 1
    ELSE
        SET @addsub = -1
    IF @cal = ' ' OR @cal IS NULL
        SET @cal = 'CA_ON'
    WHILE @bDays <> 0 -- keep adding/subtracting a day until @bDays becomes 0
    BEGIN
        SELECT @nDate = DATEADD(day, 1 * @addsub, @nDate) -- increase or decrease @nDate
        SELECT @bDays = CASE
            WHEN (@@DATEFIRST + DATEPART(weekday, @nDate)) % 7 IN (0, 1) -- ignore if it is Saturday or Sunday
                THEN @bDays
            WHEN (SELECT 1
                  FROM @metadb -- **THIS IS WHAT I NEED** (same for below): this table holds the holidays
                  WHERE mast_trunkibis_m.dbo.md_calendar_day.calendar_code = @cal
                    AND mast_trunkibis_m.dbo.md_calendar_day.calendar_date = @nDate
                    AND mast_trunkibis_m.dbo.md_calendar_day.is_work_day = 0
                 ) IS NOT NULL -- ignore if it is in the holiday table
                THEN @bDays
            ELSE @bDays - 1 * @addsub -- increment or decrement @nDate
        END
    END
    RETURN @nDate
END
GO
The best way to do this, if you aren't stuck with existing structures, is to keep all of the table structures and names the same and simply create a schema for each customer, building the tables out in each schema. For example, if you have the companies Global Trucking and Super Store, you would create a schema for each of them: GlobalTrucking and SuperStore are now your schemas.
Supposing you have products and payments tables, for a quick example, you would create those tables in each schema, so you end up with something that looks like this:
GlobalTrucking.products
GlobalTrucking.payments
and
SuperStore.products
SuperStore.payments
Then, in the application layer, you specify the default schema name to use in the connection string for queries using that connection. The web site or application for Global Trucking has the schema set to GlobalTrucking, so a query like SELECT * FROM products; would effectively be SELECT * FROM GlobalTrucking.products; when executed using that connection.
This way you always know where to look in your tables, each customer is in their own segregated space, and with the proper user permissions they will never be able to accidentally access another customer's data; everything is just easier to navigate.
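In SQL Server specifically, the default schema can also be pinned to the database user rather than to the connection string (names here are illustrative):
CREATE USER GlobalTruckingUser FOR LOGIN GlobalTruckingLogin
    WITH DEFAULT_SCHEMA = GlobalTrucking;
-- unqualified names such as products now resolve to GlobalTrucking.products for this user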
Here is a sample of what your schema/user/table creation script would look like (this may not be 100% correct, I just pecked this out for a quick example, and I should mention that this is the Oracle way, but SQL Server should be similar):
CREATE USER &SCHEMA_NAME IDENTIFIED BY temppasswd1;
CREATE SCHEMA AUTHORIZATION &SCHEMA_NAME
    CREATE TABLE "&SCHEMA_NAME".products
    (
        ProductId NUMBER,
        Description VARCHAR2(50),
        Price NUMBER(10, 2),
        NumberInStock NUMBER,
        Enabled VARCHAR2(1)
    )
    CREATE TABLE "&SCHEMA_NAME".payments
    (
        PaymentId NUMBER,
        Amount NUMBER(10, 2),
        CardType VARCHAR2(2),
        CardNumber VARCHAR2(15),
        CardExpire DATE,
        PaymentTimeStamp TIMESTAMP,
        ApprovalCode VARCHAR2(25)
    )
    GRANT SELECT ON "&SCHEMA_NAME".products TO &SCHEMA_NAME
    GRANT SELECT ON "&SCHEMA_NAME".payments TO &SCHEMA_NAME
;
With something like the above, you only have one script that you need to keep updated for automating the addition of new customers. When you run it, the &SCHEMA_NAME substitution variable is populated with whatever you choose for the new customer's username/schema name, and an identical table structure is created every time.
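In SQL*Plus, for instance, the substitution variable can be supplied up front before running the script (the script file name here is hypothetical):
DEFINE SCHEMA_NAME = GlobalTrucking
@create_customer_schema.sql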

PostgreSQL multi-layer partitioning

I have been using partitioning with a PostgreSQL database for a while. My database has grown quite a lot and partitions nicely. Unfortunately, I now seem to have hit another speed barrier and am trying to figure out some ways to speed up the database even more.
My basic setup is as follows:
I have one master table called database_data, from which all the partitions inherit. I chose to have one partition per month, named like database_data_YYYY_MM, which works nicely.
By analyzing my data usage, I noticed that I mostly do insert operations on the table and only some updates. The updates also occur only on a certain kind of row: I have a column called channel_id (an FK to another table), and the rows I update always have a channel_id out of a set of maybe 50 IDs, so this would be a great way of distinguishing the rows that are never updated from the ones that potentially are.
I figured it would speed up my setup further if I used the partitioning to have, per month, one table of insert-only data and one of potentially updated data, as my updates would have to check fewer rows each time.
I could of course use the "simple" partitioning I am using now and add another table for each month, called database_data_YYYY_MM_update, adding the special constraints to it and to the database_data_YYYY_MM table so that the query planner can distinguish between the tables.
I was, however, thinking that I sometimes have operations which operate on all data of a given month, no matter whether it is updateable or not. In such a case I could JOIN the two tables, but there could be an easier way for such queries.
So now to my real question:
Is "two layer" partitioning possible in PostgreSQL? What I mean by that is, that instead of having two tables for each month inheriting from the master table, I would only have one table per month directly inheriting from the master table e.g. database_data_YYYY_MM and then have two more tables inheriting from that table, one for the insert only data e.g. database_data_YYYY_MM_insert and one for the updateable data e.g. database_data_YYYY_MM_update.
Would this speed up the query planning at all? I would guess that it would be faster if the query planner could eliminate both tables at once if the intermediate table was eliminated.
The obvious advantage here would be that I could operate on all data of one month by simply using the table database_data_YYYY_MM and for my updates use the child table directly.
Any drawbacks that I am not thinking of?
Thank you for your thoughts.
Edit 1:
I don't think a schema is strictly necessary to answer my question, but if it helps understanding, I'll provide a sample schema:
CREATE TABLE database_data (
    id bigint PRIMARY KEY,
    channel_id bigint, -- This is a FK to another table
    timestamp TIMESTAMP WITH TIME ZONE,
    value DOUBLE PRECISION
);
I have a trigger on the database_data table that generates the partitions on demand:
CREATE OR REPLACE FUNCTION function_insert_database_data() RETURNS TRIGGER AS $BODY$
DECLARE
    thistablename TEXT;
    thisyear INTEGER;
    thismonth INTEGER;
    nextmonth INTEGER;
    nextyear INTEGER;
BEGIN
    -- determine year and month of the timestamp
    thismonth = extract(month from NEW.timestamp AT TIME ZONE 'UTC');
    thisyear = extract(year from NEW.timestamp AT TIME ZONE 'UTC');
    -- determine the next month for the timespan in the check constraint
    nextyear = thisyear;
    nextmonth = thismonth + 1;
    IF (nextmonth >= 13) THEN
        nextmonth = nextmonth - 12;
        nextyear = nextyear + 1;
    END IF;
    -- assemble the table name
    thistablename = 'database_datanew_' || thisyear || '_' || thismonth;
    -- Loop until the insert is successful, to catch the case where another connection
    -- simultaneously creates the table; if so, we can retry inserting the data
    LOOP
        -- try to insert into the table
        BEGIN
            EXECUTE 'INSERT INTO ' || quote_ident(thistablename) || ' SELECT ($1).*' USING NEW;
            -- RETURN NEW inserts the data into the main table, allowing insert statements
            -- to return the values like "INSERT INTO ... RETURNING *".
            -- This requires another trigger to delete the data again afterwards.
            RETURN NEW;
        -- if the table does not exist, create it
        EXCEPTION
            WHEN UNDEFINED_TABLE THEN
                BEGIN
                    -- create the table with a check constraint on the timestamp
                    EXECUTE 'CREATE TABLE ' || thistablename || ' (CHECK ( timestamp >= TIMESTAMP WITH TIME ZONE '''|| thisyear || '-'|| thismonth ||'-01 00:00:00+00''
                        AND timestamp < TIMESTAMP WITH TIME ZONE '''|| nextyear || '-'|| nextmonth ||'-01 00:00:00+00'' ), PRIMARY KEY (id)
                        ) INHERITS (database_data)';
                    -- add any triggers and indices the table might need here
                    -- insert the new data into the new table
                    EXECUTE 'INSERT INTO ' || quote_ident(thistablename) || ' SELECT ($1).*' USING NEW;
                    RETURN NEW;
                EXCEPTION WHEN DUPLICATE_TABLE THEN
                    -- another connection seems to have created the table already; simply loop again
                END;
            -- don't insert anything on other errors
            WHEN OTHERS THEN
                RETURN NULL;
        END;
    END LOOP;
END;
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER trigger_insert_database_data
BEFORE INSERT ON database_data
FOR EACH ROW EXECUTE PROCEDURE function_insert_database_data();
As for sample data: let's assume we have only two channels, 1 and 2; channel 1 is insert-only data and channel 2 is updateable.
My two-layer approach would be something like the following.
Main table:
CREATE TABLE database_data (
    id bigint PRIMARY KEY,
    channel_id bigint, -- This is a FK to another table
    timestamp TIMESTAMP WITH TIME ZONE,
    value DOUBLE PRECISION
);
Intermediate table:
CREATE TABLE database_data_2015_11 (
    CHECK ( timestamp >= TIMESTAMP WITH TIME ZONE '2015-11-01 00:00:00+00' AND timestamp < TIMESTAMP WITH TIME ZONE '2015-12-01 00:00:00+00' ),
    PRIMARY KEY (id)
) INHERITS (database_data);
Partitions:
CREATE TABLE database_data_2015_11_insert (
    CHECK (channel_id = 1),
    PRIMARY KEY (id)
) INHERITS (database_data_2015_11);
CREATE TABLE database_data_2015_11_update (
    CHECK (channel_id = 2),
    PRIMARY KEY (id)
) INHERITS (database_data_2015_11);
Of course I would then need another trigger on the intermediate table to create the child tables on demand.
It's a clever idea, but sadly it doesn't seem to work. If I have a parent table with 1000 direct children, and I run a SELECT that should pull from just one child, then EXPLAIN ANALYZE gives me a planning time of around 16 ms. On the other hand, if I have just 10 direct children, and they all have 10 children, and those all have 10 children, I get a query planning time of about 29 ms. I was surprised; I really thought it would work!
Here is some Ruby code I used to generate my tables:
0.upto(999) do |i|
  if i % 100 == 0
    min_group_id = i
    max_group_id = min_group_id + 100
    puts "CREATE TABLE datapoints_#{i}c (check (group_id > #{min_group_id} and group_id <= #{max_group_id})) inherits (datapoints);"
  end
  if i % 10 == 0
    min_group_id = i
    max_group_id = min_group_id + 10
    puts "CREATE TABLE datapoints_#{i}x (check (group_id > #{min_group_id} and group_id <= #{max_group_id})) inherits (datapoints_#{i / 100 * 100}c);"
  end
  puts "CREATE TABLE datapoints_#{i + 1} (check (group_id = #{i + 1})) inherits (datapoints_#{i / 10 * 10}x);"
end
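For later readers: on PostgreSQL 10 and newer, the same two-layer shape can be expressed with declarative partitioning instead of inheritance, and partition pruning there is handled differently by the planner, so it may be worth re-measuring. A minimal sketch using the schema from the question:
CREATE TABLE database_data (
    id bigint,
    channel_id bigint,
    "timestamp" TIMESTAMP WITH TIME ZONE,
    value DOUBLE PRECISION
) PARTITION BY RANGE ("timestamp");

CREATE TABLE database_data_2015_11 PARTITION OF database_data
    FOR VALUES FROM ('2015-11-01 00:00:00+00') TO ('2015-12-01 00:00:00+00')
    PARTITION BY LIST (channel_id);

CREATE TABLE database_data_2015_11_insert PARTITION OF database_data_2015_11
    FOR VALUES IN (1);
CREATE TABLE database_data_2015_11_update PARTITION OF database_data_2015_11
    FOR VALUES IN (2);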

SQL Server stored procedure with C# enum values

I am writing some stored procedures for my application. Within the C# code I have lots of enums. As an example:
public enum Status
{
    Active = 1,
    Inactive = 2
}
Now when I write my stored procedures, is there any way of replicating enums in the database? I want to avoid using a table to hold them all, but I was wondering if you can tuck them all away in an object somewhere, and reference this in other stored procedures?
So I want my stored procedures to use the style of:
SELECT *
FROM Users
WHERE status = Status.Active
Instead of using raw numerical values like this:
SELECT *
FROM Users
WHERE status = 1
Does anyone have any ideas on how this might be achieved?
It turns out the best way to do this is to use a SQL Server scalar function for the job, as follows:
ALTER FUNCTION [dbo].[fn_GetEnumValue]
(
    @enumName varchar(512)
)
RETURNS INT
AS
BEGIN
    DECLARE @returnVal INT = 0
    SELECT @returnVal = (
        CASE
            WHEN @enumName = 'Active' THEN 1
            WHEN @enumName = 'Inactive' THEN 2
            ELSE 0
        END
    )
    RETURN @returnVal
END
Then you can use it like this:
SELECT * FROM Users WHERE status = dbo.fn_GetEnumValue('Active');
You could create a set of single-row tables, each one representing an enumeration. So for user statuses:
CREATE TABLE Enum_Status
(
    Active TINYINT NOT NULL DEFAULT 1,
    Inactive TINYINT NOT NULL DEFAULT 2
);
INSERT INTO Enum_Status VALUES( 1 , 2 );
Then your SELECT statement would look quite neat (if not as neat as the equivalent in Oracle):
SELECT Users.*
FROM Users, Enum_Status Status
WHERE status = Status.Active;
To keep things tidy I would be tempted to put the enumeration tables in their own schema and grant all users the relevant permissions.
As I understand it, your aim is to ensure that the value passed to the stored procedure is valid.
You could create a Status table as follows:
CREATE TABLE Status
(
    StatusID int,
    StatusName nvarchar(50)
)
Insert the valid data here.
Then, in your query, you could do an INNER JOIN to validate:
SELECT *
FROM Users
INNER JOIN Status ON Users.status = Status.StatusID
WHERE Status.StatusID = @StatusID
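If the Status lookup table exists anyway, a foreign key goes one step further and makes invalid status values unstorable in the first place (a sketch, assuming the Users.status column and the Status table above):
ALTER TABLE Status ALTER COLUMN StatusID int NOT NULL;
ALTER TABLE Status ADD CONSTRAINT PK_Status PRIMARY KEY (StatusID);
ALTER TABLE Users ADD CONSTRAINT FK_Users_Status
    FOREIGN KEY (status) REFERENCES Status (StatusID);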
