I thought that schemas are namespace instances, and hence the same table created under 2 different schemas is 2 different objects from the perspective of the database. One of my colleagues claims that schemas are nothing but a security container, hence we can create the same table in different schemas. Is this true?
You are correct.
CREATE TABLE foo.T
(
c int
)
and
CREATE TABLE bar.T
(
c int
)
creates 2 separate objects. You could create a synonym bar.T that aliases foo.T though.
CREATE SCHEMA foo
GO
CREATE SCHEMA bar
GO
CREATE TABLE foo.T(c INT)
GO
CREATE SYNONYM bar.T FOR foo.T;
INSERT INTO foo.T VALUES (1);
SELECT * FROM bar.T;
They are 2 different objects, check the object_id
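A quick way to check this, assuming foo.T and bar.T were both created as tables as in the first example (T-SQL, a minimal sketch):

```sql
-- Each schema-qualified name resolves to its own object_id.
SELECT OBJECT_ID(N'foo.T') AS foo_T_object_id,
       OBJECT_ID(N'bar.T') AS bar_T_object_id;

-- The catalog lists one row per table, each tied to its own schema.
SELECT s.name AS schema_name, t.name AS table_name, t.object_id
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE t.name = N'T';
```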
Yes, it can. Just try it
-- CREATE SCHEMA must be the first statement in its batch, hence the GO separators
CREATE SCHEMA OneSchema AUTHORIZATION dbo;
GO
CREATE SCHEMA TwoSchema AUTHORIZATION dbo;
GO
CREATE TABLE dbo.SomeTable (foo int);
CREATE TABLE OneSchema.SomeTable (foo int);
CREATE TABLE TwoSchema.SomeTable (foo int);
A schema is both a securable and part of the "namespace"
I guess that you are trying to solve an issue by dividing data with the same data structure between different tenants, and that you don't want to use different databases in order to reduce costs.
To my mind, it is better to use row-level security in this case: the data are stored in one table, but one tenant can't access data that were created by another tenant.
You could read more in the next article - Row-Level Security
myschema.table1 is different than yourschema.table1
I am designing a database that restricts access to certain objects. My colleague and I have discussed different ways to approach this, with there being two main candidates: 1) Implicit access, and 2) Explicit access
For illustration, let's assume there are the following tables:
User (Id)
ObjectA (Id, ParentId) -- Where ParentId is an ObjectA
ObjectB (Id, ObjectAId)
UserObjectA (UserId, ObjectAId) -- Grants access to an ObjectA
UserObjectB (UserId, ObjectBId) -- Grants access to an ObjectB
Implicit approach:
Because ObjectA serves as a containing entity for ObjectB, if a user has access to an ObjectA that is the container for an ObjectB, then he also has access to the contained ObjectB, even though there is no such explicit record in UserObjectB.
Similarly, if a user has access to a parent ObjectA, then he has access to all ObjectA descendants, even though there is no such record in UserObjectA.
Additionally, if a user has no records in either access-granting table, then implicitly he has access to all records in ObjectA and ObjectB.
Explicit approach:
To have access to either an ObjectA or ObjectB record, a user must have a record in UserObjectA or UserObjectB, respectively.
No record equals no access, period.
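To make the contrast concrete, here is a rough T-SQL sketch of "which ObjectB rows can this user see?" under each approach. It uses the simplified tables above; @UserId is a hypothetical parameter, and the "no grant records means access to everything" rule of the implicit approach is deliberately left out, so the real logic would be longer still.

```sql
DECLARE @UserId int = 1;  -- hypothetical user id

-- Explicit approach: a plain join against the grant table.
SELECT b.Id
FROM ObjectB AS b
JOIN UserObjectB AS ub ON ub.ObjectBId = b.Id
WHERE ub.UserId = @UserId;

-- Implicit approach: start from the granted ObjectA rows, walk down the
-- hierarchy, then add the ObjectB rows contained in any reachable ObjectA.
WITH AccessibleA AS (
    SELECT a.Id
    FROM ObjectA AS a
    JOIN UserObjectA AS ua ON ua.ObjectAId = a.Id
    WHERE ua.UserId = @UserId
    UNION ALL
    SELECT child.Id
    FROM ObjectA AS child
    JOIN AccessibleA AS parent ON child.ParentId = parent.Id
)
SELECT b.Id
FROM ObjectB AS b
WHERE b.ObjectAId IN (SELECT Id FROM AccessibleA)
UNION
SELECT ub.ObjectBId          -- direct grants still count
FROM UserObjectB AS ub
WHERE ub.UserId = @UserId;
```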
The implicit approach has two benefits: 1) Save space when a user has access to many objects implicitly, and 2) a user who implicitly has access to all objects will have access to all objects added in the future without having to trigger inserts or handle the inserts in sprocs when objects are created.
The explicit approach has the benefit that queries are much simpler, more maintainable, and more performant.
Initially, we ran with the implicit approach. However, after getting into the implementation of various sprocs, the logic to handle the access is becoming a beast and there are various subtleties that we've run into that make this approach more error-prone than the explicit approach. (Note that the real scenario is somewhat more complicated than the simplified example.) I'm finding myself constantly implementing recursive CTEs to determine access, which doesn't allow me to (when considering performance) abstract away certain parts of the logic in views or inline TVFs. So I have to repeat and tweak the error-prone logic in lots of different sprocs. If anything ever changed, we'd have a big maintenance task on our hands.
So, have we made a mistake going with this implicit access approach? I'm definitely having second thoughts and would appreciate advice from anyone who has experience with similar design decisions.
If you can wait a month Postgres 9.5 will be out and has Row Security. Oracle has it now if you have ten million bucks kicking around.
For now, or in other dbs, you can mimic row security:
Each protected table gets an "owner" column. By default only the owner can select, update, or delete that row.
Each "child" table also has an owner column, with a cascading foreign key to the parent table. So if you change parent.owner, this changes the owner of all its children as well.
Use updateable CHECK OPTION views to enforce security.
You need to set current_user from your application. Here's how for pg + spring
In Postgres:
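create schema protected;
create table protected.foo (
  foo_id int primary key,
  bar text,
  owner name not null default current_user,
  -- a foreign key can only reference unique columns, so expose (foo_id, owner)
  unique (foo_id, owner)
);
create table protected.foo_children (
  foo_child_id int primary key,
  foo_id int not null,
  owner name not null default current_user,
  -- composite key: changing foo.owner cascades to the children
  foreign key (foo_id, owner)
    references protected.foo (foo_id, owner) on update cascade
);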
Now some CHECK OPTION views - use security_barrier if postgres:
create view public.foo with (security_barrier) as
select
*
from protected.foo
where
owner = current_user
WITH CHECK OPTION;
create view public.foo_children with (security_barrier) as
select
*
from protected.foo_children
where
owner = current_user
WITH CHECK OPTION;
grant delete, insert, select, update on public.foo to some_users;
grant delete, insert, select, update on public.foo_children to some_users;
For sharing, you need to add some more tables. The important thing is that you can index the right columns so that you don't kill performance:
create schema acl;
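create table acl.foo (
  foo_id int not null references protected.foo(foo_id),
  grantee name not null,
  privilege char(1) not null,
  -- one row per (row, grantee, privilege), so a row can be shared more than once
  primary key (foo_id, grantee, privilege)
);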
Update your views:
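create or replace view public.foo with (security_barrier) as
select
*
from protected.foo f
where
f.owner = current_user
-- the subquery must be correlated to the row, otherwise any grant exposes every row
or exists ( select 1 from acl.foo a where a.foo_id = f.foo_id and a.privilege in ('s','u','d') and a.grantee = current_user );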
--add update trigger that checks for update privilege
--add delete trigger that checks for delete privilege
I have to move a schema from one database to a new database, to keep a centralized schema on the same server. The problem is that I already have many stored procedures that use some of the tables from the schema that I need to move.
Is there any workaround to do this so that all the objects that use these tables point to the new database? Can I use synonyms or a linked server?
I'm working on SQL Server 2008 R2
Thank you.
yep; synonyms are the way to go (stole this example from http://msdn.microsoft.com/en-us/library/ms177544.aspx):
USE tempdb;
GO
-- Create a synonym for the Product table in AdventureWorks2012.
CREATE SYNONYM MyProduct
FOR AdventureWorks2012.Production.Product;
GO
-- Query the Product table by using the synonym.
USE tempdb;
GO
SELECT ProductID, Name
FROM MyProduct
WHERE ProductID < 5;
GO
You could pretty easily generate synonym statements by searching the sys.tables view and identifying the tables which belong to the schema you want to move.
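A minimal sketch of such a generator query. The names MySchema (the schema being moved) and NewDb (the target database) are placeholders, not taken from the question:

```sql
-- Run in the ORIGINAL database, before the tables are moved.
-- Produces one CREATE SYNONYM statement per table in the schema.
SELECT N'CREATE SYNONYM ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
     + N' FOR ' + QUOTENAME(N'NewDb') + N'.' + QUOTENAME(s.name)
     + N'.' + QUOTENAME(t.name) + N';' AS create_synonym_statement
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE s.name = N'MySchema'
ORDER BY t.name;
```

Save the generated statements and run them only after the original tables are gone, because a synonym cannot share a name with an existing object in the same schema. The stored procedures can then keep using the same two-part names.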
Is there a natural option to establish a relationship between a table and a view, or should I use a trigger as a workaround to check the data consistency?
I have a lookup view (for some reason I need it to be a view and not a table).
I want to insert records into a different table. One of the values of the record I want to insert MUST be one of the ids from the lookup view.
For example:
ViewCities (CityId, CityName) -- This is the lookup view. The table behind the view is located in a different database.
Now I want to insert a new row into tblUsers. One of the row's columns is CityId. I want no one to be able to insert a row into tblUsers that includes a CityId that does not exist in ViewCities.
You have two options that I am aware of to maintain referential integrity. You cannot use a foreign key constraint because you said that the tables are in two separate databases. The options are:
1. Use triggers, as you had mentioned.
2. Use a check constraint which references a user defined function which does the check.
For example:
Let's say I have a database named test, and another database is the Northwind database. In my test database I want to create a table which records names of users. The check I want to enforce is that the user name must be one of the LastName's of a user in the Northwind database. I first create a UDF like so:
create function chk_name (@name varchar(50))
returns bit
as
begin
-- returns 1 when the name exists in the other database, otherwise 0
declare @name_found bit=0
if exists(select * from Northwind..Employees where LastName=@name)
begin
set @name_found=1
end
return @name_found
end
Then, I create the table with a check constraint like so:
create table tst
(name varchar(50) check ( dbo.chk_name(name)=1 )
)
Now, if you try to insert a row into the tst table, it must be one of the Last Names of the Employees table in the Northwind database.
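For comparison, the trigger option (option 1) could look roughly like the sketch below. It uses the tblUsers and ViewCities names from the question; the dbo schema and the trigger name are assumptions.

```sql
-- CREATE TRIGGER must be the first statement in its batch.
CREATE TRIGGER trg_tblUsers_CheckCity
ON dbo.tblUsers
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Reject the whole statement if any new row points at a missing city.
    -- NULL CityId values are let through, as a foreign key would do.
    IF EXISTS (
        SELECT 1
        FROM inserted AS i
        WHERE i.CityId IS NOT NULL
          AND NOT EXISTS (SELECT 1
                          FROM dbo.ViewCities AS c
                          WHERE c.CityId = i.CityId)
    )
    BEGIN
        RAISERROR(N'CityId does not exist in ViewCities.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;
```

Note that neither option protects the other direction: deleting a city from the source table will not be blocked by the referencing rows, which a real foreign key would do.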
What is the difference between using
SELECT ... INTO MyTable FROM...
and
INSERT INTO MyTable (...)
SELECT ... FROM ....
?
From BOL [ INSERT, SELECT...INTO ], I know that using SELECT...INTO will create the insertion table on the default file group if it doesn't already exist, and that the logging for this statement depends on the recovery model of the database.
Which statement is preferable?
Are there other performance implications?
What is a good use case for SELECT...INTO over INSERT INTO ...?
Edit: I already stated that I know that SELECT INTO... creates a table where it doesn't exist. What I want to know is: SQL includes this statement for a reason, so what is it? Is it doing something different behind the scenes for inserting rows, or is it just syntactic sugar on top of a CREATE TABLE and INSERT INTO?
They do different things. Use INSERT when the table exists. Use SELECT INTO when it does not.
Yes. INSERT with no table hints is normally fully logged. SELECT INTO is minimally logged when the database uses the simple or bulk-logged recovery model.
In my experience SELECT INTO is most commonly used with intermediate data sets, like #temp tables, or to copy out an entire table like for a backup. INSERT INTO is used when you insert into an existing table with a known structure.
EDIT
To address your edit, they do different things. If you are making a table and want to define the structure, use CREATE TABLE and INSERT. Example of an issue that can be created: SELECT INTO takes the column definitions from the source of the query. You have a small table with a varchar field that is only wide enough for its current data, say 12 bytes. Your real data set will need up to 200 bytes. If you do SELECT INTO from your small table to make a new one, the later INSERT will fail with a truncation error because your fields are too small.
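A small T-SQL sketch of that failure mode. The temp table names are made up for illustration, and a string literal stands in for the narrow source column:

```sql
-- The literal is typed varchar(12), so SELECT INTO creates val as varchar(12).
SELECT 'twelve chars' AS val
INTO #inferred;

-- Fails with a truncation error: 200 characters do not fit in varchar(12).
INSERT INTO #inferred (val) VALUES (REPLICATE('x', 200));

-- Defining the structure up front avoids the problem.
CREATE TABLE #explicit (val varchar(200));
INSERT INTO #explicit (val) SELECT val FROM #inferred;
INSERT INTO #explicit (val) VALUES (REPLICATE('x', 200));
```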
Which statement is preferable? Depends on what you are doing.
Are there other performance implications? If the table is a permanent table, you can create indexes at the time of table creation, which has implications for performance both negatively and positively. SELECT INTO does not recreate indexes that exist on current tables, and thus subsequent use of the table may be slower than it needs to be.
What is a good use case for SELECT...INTO over INSERT INTO ...? SELECT INTO is used if you may not know the table structure in advance. It is faster to write than a CREATE TABLE and an INSERT statement, so it is used to speed up development at times. It is often faster to use when you are creating a quick temp table to test things or a backup table of a specific query (maybe records you are going to delete). It should be rare to see it used in production code that will run multiple times (except for temp tables) because it will fail if the table already exists.
It is sometimes used inappropriately by people who don't know what they are doing, and they can cause havoc in the db as a result. I strongly feel it is inappropriate to use SELECT INTO for anything other than a throwaway table (a temporary backup, a temp table that will go away at the end of the stored proc, etc.). Permanent tables need real thought as to their design, and SELECT INTO makes it easy to avoid thinking about anything, even something as basic as what columns and what datatypes.
In general, I prefer the use of the CREATE TABLE and INSERT statements: you have more control and it is better for repeatable processes. Further, if the table is a permanent table, it should be created from a separate CREATE TABLE script (one that is in source control), as permanent objects should not, in general, be created in code that inserts, deletes, updates or selects from a table. Object changes should be handled separately from data changes because objects have implications beyond the needs of a specific insert/update/select/delete. You need to consider the best data types, think about FK constraints, PKs and other constraints, consider auditing requirements, think about indexing, etc.
Each statement has a distinct use case. They are not interchangeable.
SELECT...INTO MyTable... creates a new MyTable where one did not exist before.
INSERT INTO MyTable...SELECT... is used when MyTable already exists.
The primary difference is that SELECT INTO MyTable will create a new table called MyTable with the results, while INSERT INTO requires that MyTable already exists.
You would use SELECT INTO only in the case where the table didn't exist and you wanted to create it based on the results of your query. As such, these two statements really are not comparable. They do very different things.
In general, SELECT INTO is used more often for one off tasks, while INSERT INTO is used regularly to add rows to tables.
EDIT:
While you can use CREATE TABLE and INSERT INTO to accomplish what SELECT INTO does, with SELECT INTO you do not have to know the table definition beforehand. SELECT INTO is probably included in SQL because it makes tasks like ad hoc reporting or copying tables much easier.
Actually SELECT ... INTO not only creates the table but will fail if it already exists, so basically the only time you would use it is when the table you are inserting into does not exist.
In regards to your EDIT:
I personally mainly use SELECT ... INTO when I am creating a temp table. That to me is the main use. However I also use it when creating new tables with many columns with similar structures to other tables and then edit it in order to save time.
I only want to cover the second point of the question, the one related to performance, because nobody else has covered it. SELECT INTO is a lot faster than INSERT INTO when it comes to tables with large datasets. I prefer SELECT INTO when I have to read a very large table. INSERT INTO for a table with 10 million rows may take hours while SELECT INTO will do this in minutes, and as far as losing indexes on the new table is concerned, you can recreate the indexes with a query and still save a lot of time compared to INSERT INTO.
SELECT INTO is typically used to generate temp tables or to copy another table (data and/or structure).
In day to day code you use INSERT because your tables should already exist to be read, UPDATEd, DELETEd, JOINed etc. Note: the INTO keyword is optional with INSERT
That is, applications won't normally create and drop tables as part of normal operations unless it is a temporary table for some scope limited and specific usage.
A table created by SELECT INTO will have no keys or indexes or constraints unlike a real, persisted, already existing table
The 2 aren't directly comparable because they have almost no overlap in usage
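The point about missing keys and indexes can be checked with a short T-SQL sketch like the one below. The table names dbo.Src and dbo.SrcCopy are made up for illustration:

```sql
CREATE TABLE dbo.Src (
    id  int         NOT NULL PRIMARY KEY,
    val varchar(20) NULL
);
CREATE INDEX IX_Src_val ON dbo.Src (val);
GO

-- Copy structure and data; keys and indexes are not carried over.
SELECT * INTO dbo.SrcCopy FROM dbo.Src;
GO

-- Compare what the catalog reports for each table.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       i.type_desc
FROM sys.indexes AS i
WHERE i.object_id IN (OBJECT_ID(N'dbo.Src'), OBJECT_ID(N'dbo.SrcCopy'))
ORDER BY table_name, i.index_id;
```

dbo.Src should show its primary key index and IX_Src_val, while dbo.SrcCopy should appear only as a heap with no named index.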
SELECT INTO creates a new table for you and then inserts records into it from the source table. The newly created table has the same structure as the source table. If you try to use SELECT INTO for an existing table it will produce an error, because it will try to create a new table with the same name.
INSERT INTO requires the table to exist in your database before you insert rows into it.
The simple difference between select Into and Insert Into is:
--> Select Into doesn't need an existing table. If you want to copy table A data, you just type SELECT * INTO [tablename] FROM A. Here, tablename must not already exist; a new table is created which has the same structure as table A.
--> Insert Into does need an existing table: INSERT INTO [tablename] SELECT * FROM A;
Here tablename is an existing table.
Select Into is usually the more popular way to copy data, especially for backups.
Use whichever fits your requirement; which one to use in a given scenario is entirely the developer's choice.
Performance-wise, Insert Into is fast.
References :
https://www.w3schools.com/sql/sql_insert_into_select.asp
https://www.w3schools.com/sql/sql_select_into.asp
The other answers are all great/correct (the main difference is whether the DestTable exists already (INSERT), or doesn't exist yet (SELECT ... INTO))
You may prefer to use INSERT (instead of SELECT ... INTO), if you want to be able to COUNT(*) the rows that have been inserted so far.
Using SELECT COUNT(*) ... WITH (NOLOCK) is a simple/crude technique that may help you check the "progress" of the INSERT; helpful if it's a long-running insert, as seen in this answer.
[If you use...]
INSERT DestTable SELECT ... FROM SrcTable
...then your SELECT COUNT(*) from DestTable WITH (NOLOCK) query would work.
SELECT INTO for large datasets may be good only for a single user on a single connection to the database doing a bulk operation. I do not recommend using
SELECT * INTO table
as this creates one big transaction and takes a schema lock to create the object, preventing other users from creating objects or accessing system objects until the SELECT INTO operation completes.
As proof of concept open 2 sessions, in the first session try to use
select into temp table from a huge table
and in the second session try to
create a temp table
and check the locks, the blocking and how long the second session takes to create the temp table object. My recommendation: it is always good practice to use CREATE TABLE and an INSERT statement, and if minimal logging is needed, use trace flag 610.
In SQL Server, what is the difference between a @ table, a # table and a ## table?
#table refers to a local temporary table (visible only to the connection that created it).
##table refers to a global (visible to all users) temporary table.
@variableName refers to a variable which can hold values depending on its type.
Have a look at
Temporary Tables vs. Table Variables and Their Effect on SQL Server Performance
Differences between SQL Server temporary tables and table variables
Temp Tables and Table Variables: When To Use What And Why
# and ## tables are actual tables represented in the tempdb database. These tables can have indexes and statistics, and can be accessed across sprocs in a session (in the case of a global temp table, it is available across sessions).
The @table is a table variable.
For more: http://www.sqlteam.com/article/temporary-tables
I would focus on the differences between #table and @table. ##table is a global temporary table and, for the record, in over 10 years of using SQL Server I have yet to come across a valid use case. I'm sure that some exist but the nature of the object makes it highly unusable IMHO.
The response to @whiner by @marc_s is absolutely true: it is a prevalent myth that table variables always live in memory. It is actually quite common for a table variable to go to disk and operate just like a temp table.
Anyway I suggest reading up on the set of differences by following the links pointed out by @Astander. Most of the differences involve limitations on what you can do with @table variables.
CREATE TABLE #t
Creates a table that is only visible on and during that CONNECTION.
The same user who creates another connection will not be able to see table #t from the other connection.
CREATE TABLE ##t
Creates a temporary table visible to other connections. But the table is dropped when the creating connection is ended.
If you need a unique global temp table, create your own with a uniqueidentifier prefix/suffix and drop it after execution, guarded by an IF OBJECT_ID(...) check. The only drawbacks are that you have to use dynamic SQL and drop the table explicitly.
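One way this could look in T-SQL, as a sketch only. The ##work_ prefix and the single id column are placeholders:

```sql
-- Build a unique global temp table name from a GUID (hyphens removed).
DECLARE @name sysname =
    N'##work_' + REPLACE(CONVERT(nvarchar(36), NEWID()), N'-', N'');
DECLARE @sql nvarchar(max);

-- The name is only known at run time, so every statement that touches
-- the table has to be dynamic SQL.
SET @sql = N'CREATE TABLE ' + QUOTENAME(@name) + N' (id int NOT NULL);';
EXEC sys.sp_executesql @sql;

SET @sql = N'INSERT INTO ' + QUOTENAME(@name) + N' (id) VALUES (1);';
EXEC sys.sp_executesql @sql;

-- Explicit clean-up once the work is done.
IF OBJECT_ID(N'tempdb..' + QUOTENAME(@name)) IS NOT NULL
BEGIN
    SET @sql = N'DROP TABLE ' + QUOTENAME(@name) + N';';
    EXEC sys.sp_executesql @sql;
END;
```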