I'm trying to create an indexed view and get the following error creating the index:
Cannot create index on view
....'
because column 'Amount' that is referenced by the view in the
WHERE or GROUP BY clause is imprecise. Consider eliminating the column
from the view, or altering the column to be precise.
The column in question has a data type of real which I guess is the problem?
What's the appropriate way of resolving this? Can I do a convert in the view SQL to eliminate the "impreciseness"?
The view SQL is specified below:
EXEC('
CREATE VIEW model.ReceivableBillableParties
WITH SCHEMABINDING
AS
SELECT pf.Id AS Id
, pf.InsuranceId AS InsuranceId
, pf.FinancialInsType AS InsuranceType
, pr.ReceivableId
FROM dbo.Receivables pr
INNER JOIN dbo.Demographics pd ON pd.PersonId = pr.PersonId
INNER JOIN dbo.Appointments ap ON ap.AppointmentId = pr.AppointmentId
INNER JOIN dbo.Financiasl pf ON pf.PersonId = pf.PersonId
INNER JOIN dbo.PracticeInsurers pri ON pri.InsurerId = pf.FinancialInsurerId
WHERE pri.Amount = 0
')
EXEC('
CREATE UNIQUE CLUSTERED INDEX [IX_ReceivableBillableParties]
ON model.ReceivableBillableParties ([Id]);
')
The documentation does indicate that the problem lies with the real data type (see Precision Requirements). If you want to use that column in the WHERE clause of your view, and index that view, you'll need to alter the column to a precise data type (i.e., DECIMAL(9, 2)).
EDIT
This documentation provides a clearer explanation for why this restriction exists. From the section "Deterministic Functions":
Even if an expression is deterministic, if it contains float
expressions, the exact result may depend on the processor architecture
or version of microcode. To ensure data integrity, such expressions
can participate only as non-key columns of indexed views.
Deterministic expressions that do not contain float expressions are
called precise. Only precise deterministic expressions can participate
in key columns and in WHERE or GROUP BY clauses of indexed views.
Hope that helps.
Related
so here is a little ms sql server query:
update tblSaleReturnMain set sync=0
from tblSaleOrderMain s join tblSaleReturnMain r on r.ID=s.intReturnOrderId
where s.sync=0
it updates my "tblSaleReturnMain" table just fine, also I wrote this query myself, but I dont know why it works. My question is, with all the many join-ed tables that I could reference after the "from" clause, and all the possible data that can be produced, how does this query know that the tblSaleReturnMain mentioned in "update tblSaleReturnMain .." is the same that is being filtered in join statement? Is that always like a protocol, like we mention a table before the "set" keyword and do not give it an alias, but then go on filtering/joining its data any way we like, and what remains in a resultset is what the "set" statement will apply to?
My question is specifically for Update statements that have JOINS after the FROM keyword.
Also this question is not about "how to use join in Update statement", because I already did that successfully above.
Yes, as long as you only use the target table once in the FROM SQL Server will assume it is the same table reference. From the docs (emphasis mine):
FROM <table_source>
Specifies that a table, view, or derived table source is used to provide the criteria for the update operation. For more information, see FROM (Transact-SQL).
If the object being updated is the same as the object in the FROM clause and there is only one reference to the object in the FROM clause, an object alias may or may not be specified. If the object being updated appears more than one time in the FROM clause, one, and only one, reference to the object must not specify a table alias. All other references to the object in the FROM clause must include an object alias.
If you reference the same table more than once and try to update it using just the table name rather than the alias, you'll get an error along the lines of:
Msg 8154 Level 16 State 1 Line 2
The table 'tblSaleReturnMain ' is ambiguous.
You can reference the same table more than once, but if doing so you must use the alias as the table_or_view_name, e.g.
UPDATE Alias
SET Col = 1
FROM dbo.T1 AS Alias
INNER JOIN dbo.T1 AS Alias2
ON Alias.ID = Alias2.ID;
Examples on DB<>Fiddle
I personally always use the alias regardless of whether the full table reference would be ambiguous.
SQL Server's UPDATE ... FROM syntax is non-standard and confusing.
Much better to write a CTE, examine the results, and then update the CTE, eg:
with q as
(
select r.sync
from tblSaleOrderMain s
join tblSaleReturnMain r
on r.ID=s.intReturnOrderId
where s.sync=0
)
update q set sync = 0
Often you join two tables following their foreign key, so that the row in the RHS table will always be found. Adding the join does not affect the number of rows affected by the query. For example
create table a (x int not null primary key)
create table b (x int not null primary key, y int not null)
alter table a add foreign key (x) references b (x)
Now, assuming you set up some data in these two tables, you can get a certain number of rows from a:
select x from a
Adding a join to b following the foreign key does not change this:
select a.x from a join b on a.x = b.x
However, that is not true of joins in general, which may filter out some rows or (by Cartesian product) add more:
select a.x from a join b on a.x = b.x and b.y != 42 -- probably gives fewer rows
select a.x from a join b on a.x != b.y -- probably gives more rows
When reading SQL code there is no obvious way to tell whether a join is the key-preserving kind, which may add extra columns but does not change the number of rows returned, or whether it has other effects. Over time I have developed a coding convention which I mostly stick to:
if a key-preserving join, use join
if wanting to filter rows, put the filter condition in the where clause
if wanting more rows, sometimes cross join for Cartesian product is the clearest way
These are usually just style issues, since you can often put a predicate into either the join clause or the where clause, for example.
My question
Is there some way to have these key-preserving joins statically checked by the database server when the query is compiled? I understand that the query optimizer already knows that a join on a foreign key will always find exactly one row in the table pointed to by the foreign key. But I would like to tag it in my SQL code for the benefit of human readers. For example, suppose the new syntax fkjoin is used for a join following a foreign key. Then the following SQL fragments will give errors or not:
a fkjoin b on a.x = b.x -- OK
a fkjoin b on a.x = b.x and b.y = 42 -- "Error, join can fail due to extra predicate"
a fkjoin b on a.x = b.y -- "Error, no foreign key from a.x to b.y"
This would be a useful check for me when writing the SQL, and also when returning to read it later. I understand and accept that changing the foreign keys in the database would change what SQL is legal under this scheme - to me, that is a desired outcome, since if a necessary FK ceases to exist then the key-preserving semantics of the query are no longer guaranteed, and I'd like to find out about it.
Potentially, there could be some external SQL static checker tool that does the work, and special comment syntax could be used rather than a new keyword. The checker tool would need access to the database schema to see what foreign keys exist, but it would not need to actually execute the query.
Is there something that does what I want? I am using MSSQL 2008 R2. (Microsoft SQL Server for the pedantic)
I realize that you are interested in indicating whether a particular join on particular columns is on a FK, or is a restriction, or perhaps is of some other case, or none of the preceding. (And it's not clear what you mean by "success" or "failure" of a join, or its relevance.) Whereas focusing on that information, as explained below, is to miss focusing on more important and fundamental things.
A base table has a "meaning" or "predicate (expression)" that is a fill-in-the-(named-)blanks statement given by the DBA. The names of the blanks of the statement are the columns of the table. Rows that fill in the blanks to make a true proposition about the world go in the table. Rows that fill in the blanks to make a false proposition about the world are left out. Ie a table holds the rows that satisfy its statement. You cannot set a base table to a certain value without knowing its statement, observing the world and putting the appropriate rows into the table. You cannot know about the world from base tables except by knowing its statement and taking present-row propositions to be true and absent-row propositions to be false. Ie you need its statement to use the database.
Notice that the typical syntax for a table declaration looks like a shorthand for its statement:
-- employee [eid] is named [name] and lives at [address] in ...
EMPLOYEE(eid,name,address,...)
You can make bigger statements by putting logic operators AND, OR, AND NOT, EXISTS name, AND condition, etc between/around other statements. If you translate a statement to a relation/SQL expression by converting
a table's statement to its name
AND to JOIN
OR to UNION
AND NOT to EXCEPT/MINUS
EXISTS C,... [...] to SELECT all columns but C,... FROM ...
AND condition to ON/WHERE condition
IMPLIES to SUBSETOF
IFF to =
then you get a relation expression that calculates the rows that make the statement true. (Arguments of UNION & EXCEPT/MINUS need the same columns.) So just as every table holds the rows satisfying its statement, a query expression holds the rows that satisfy its statement. You cannot know about the world from a query result except by knowing its statement and taking its present-row propositions to be true and absent-row propositions to be false. Ie you need its statement to compose or interpret a query. (Observe that this is true regardless of what constraints hold.)
This is the foundation of the relational model: table expressions calculate rows satisfying corresponding statements. (To the extent that SQL differs, it is literally illogical.)
Eg: If table T holds the rows that make statement T(...,T.Ci,...) true and table U holds the rows that make statement U(...,U.Cj,...) true then table T JOIN U holds the rows that make statement T(...,T.Ci,...) AND U(...,U.Cj,...) true. That is the semantics of JOIN that is important to using a database. You can always join, and a join always has a meaning, and it is always the AND of its operands' meanings. Whether any tables happen to have FKs to others just isn't particularly helpful for reasoning about updates or queries. (The DBMS uses constraints for when you make mistakes.)
A constraint expression just corresponds to a proposition aka always-true statement about the world and simultaneusly to one about base tables. Eg for C UNIQUE NOT NULL in U, the following three expressions are equivalent to each other:
FOREIGN KEY T (C) REFERENCES U (C)
EXISTS columns other than C T(...,C,...)
IMPLIES EXISTS columns other than C U(...,C,...)
(SELECT C FROM T) SUBSETOF (SELECT C FROM U)
It is true that this implies that SELECT C FROM T JOIN U ON T.C = U.C = SELECT C FROM U, ie a join on a FK returns the same number of rows. But so what? The join's meaning is still the same function of its arguments'.
Whether a particular join on a particular column set involves a foreign key is just not germane to understanding the meaning of a query.
I realize this is a very contrived example, but I've simplified the full version down to the following which demonstrates the problem:
CREATE VIEW model.Appointments_Partition1
WITH SCHEMABINDING AS
SELECT CONVERT(varchar(15), AppointmentId) as Id,
ap.AppTypeId as AppointmentTypeId,
ap.Duration as DurationMinutes,
ap.AppointmentId as EncounterId,
COUNT_BIG(*) as __count_big
FROM dbo.Appointments ap
JOIN dbo.PracticeCodeTable pct ON SUBSTRING(pct.Code, 1, 1) = ap.ScheduleStatus
AND pct.ReferenceType = 'AppointmentStatus'
WHERE ap.AppTime > 0
GROUP BY CONVERT(varchar(15), AppointmentId), ap.AppTypeId, ap.Duration, ap.AppointmentId
CREATE UNIQUE CLUSTERED INDEX [IX_Appointments_Partition1_Id]
ON model.Appointments_Partition1 ([Id]);
I get:
Msg 8668, Level 16, State 0, Line 12
Cannot create the clustered index 'IX_Appointments_Partition1_Id' on view 'PracticeRepository.model.Appointments_Partition1' because the select list of the view contains an expression on result of aggregate function or grouping column. Consider removing expression on result of aggregate function or grouping column from select list.
I'm including count_big...so why is the group by a problem?....and how can I resolve the error?
Here is the same error message with some boolean logic applied to it:
Cannot create the clustered index '...' on view '...' because the
select list of the view contains an expression on a grouping column.
Consider removing expression on a grouping column from the select list.
You need to remove the CONVERT in CONVERT(varchar(15), AppointmentId)
I find this reason on one of the blogs, seems reasonable to me
No, you can't use schema binding on a view that has an aggregate. And you can't index a view unless you use schema binding. You also can't bind an index that uses outer or left joins. Basically, you can only bind a view that contains a simple select statement.
http://www.tek-tips.com/viewthread.cfm?qid=1401646
You can go through the blog and see if it exactly matched your scenario.
http://technet.microsoft.com/en-us/library/cc917715.aspx
If you want to build index on views, then you must create views with schema binding, in the above link it is explained in detail. Go through the section of Design Considerations
I've been reading documentation and looking at FAQs and haven't found an answer for this one which probably means it can't be done. My actual situation is a little more complex, but I'll try to simplify it for this question. For each of the past years, I have a header/detail tables with a foreign key linking them. The year datum is in the header records! I want to be able to query all tables concatenated across years.
I have set up views that follows a 'SELECT + UNION ALL' format. I've also put check constraints on the header tables to restrict their values to their respective year. This allows the SQL server query optimizer to only query specific tables when running a query that is restricted with a WHERE clause. Awesome. Up to this point, this information can be found anywhere and everywhere by searching for Partitioned Views.
I want to do the same sort of query optimization with the detail tables but can't figure it out. There is nothing in the detail record that indicates what year it belongs to without joining with the header record; Meaning, the foreign key constraint is the only constraint I have to go off of.
The only solution I've thought of is adding a 'year' column to the detail tables and then adding another where sub clause to the queries. Is there any thing I can do to create a partitioned view of the detail tables using the existing foreign key constraint?
Here is some DDL for reference:
CREATE TABLE header2008 (
hid INT PRIMARY KEY,
dt DATE CHECK ('2008-01-01' <= dt AND dt < '2009-01-01')
)
CREATE TABLE header2009 (
hid INT PRIMARY KEY,
dt DATE CHECK ('2009-01-01' <= dt AND dt < '2010-01-01')
)
CREATE TABLE detail2008 (
did INT PRIMARY KEY,
hid INT FOREIGN KEY REFERENCES header2008(hid),
value INT
)
CREATE TABLE detail2009 (
did INT PRIMARY KEY,
hid INT FOREIGN KEY REFERENCES header2009(hid),
value INT
)
GO
CREATE VIEW headerAll AS
SELECT * FROM header2008 UNION ALL
SELECT * FROM header2009
GO
CREATE VIEW detailAll AS
SELECT * FROM detail2008 UNION ALL
SELECT * FROM detail2009
GO
--This only hits the header2008 table (GOOD)
SELECT *
FROM headerAll h
WHERE dt = '2008-04-04'
--This hits the header2008, detail2008, and detail 2009 tables. (BAD)
SELECT *
FROM headerAll h
INNER JOIN detailAll d ON h.hid = d.hid
WHERE dt = '2008-04-04'
Since you're not going for partitioned tables, I'm assuming you can't target 2005+ Enterprise Edition or higher.
Here is an alternative to adding a new physical column to your tables:
CREATE VIEW detailAll AS
SELECT 2008 AS Year, * FROM detail2008
UNION ALL
SELECT 2009, * FROM detail2009
then,
SELECT *
FROM headerAll h
INNER JOIN detailAll d ON h.hid = d.hid
WHERE dt = '2008-04-04' AND d.Year = 2008
Before you run off and implement this, there is a catch; well, two catches actually.
This solution, like the headerAll view as it's written, cannot accommodate parameters on the partitioning column and still do partition elimination. Using a search predicate of WHERE dt = #date AND d.Year = YEAR(#date) causes table scans across all tables in both views because the query optimizer assumes #date is an arbitrary value (and there's no way to fix that). This is a recipe for a performance disaster if the view is exposed publicly in your database API: there is no restriction on parameterization in queries, and most query authors and ORMs tend to use parameterized queries wherever possible (it's almost always a good thing!).
To get the views to do partition elimination in a real application, you will have to resort to dynamic string execution. How you accomplish this will depend on your business requirements, data requirements, and application architecture. It will be a bit trickier if you're grabbing data from multiple years.
Note also that using dynamic string execution would allow you to write queries directly against the base tables instead of introducing a UNIONed view for each "table". I don't think there's anything wrong with the latter, but this is an option you may not have considered.
I have a database schema like this:
[Patients] [Referrals]
| |
[PatientInsuranceCarriers] [ReferralInsuranceCarriers]
\ /
[InsuranceCarriers]
PatientInsuranceCarriers and ReferralInsuranceCarriers are identical, except for the fact that they reference either Patients or Referrals. I would like to merge those two tables, so that it looks like this:
[Patients] [Referrals]
\ /
[PatientInsuranceCarriers]
|
[InsuranceCarriers]
I have two options here
either create two new columns - ID_PatientOrReferral + IsPatient (will tell me which table to reference)
or create two different columns - ID_Patient and ID_Referral, both nullable.
Generally, I try to avoid nullable columns, because I consider them a bad practice (meaning, if you can live w/o nulls, then you don't really need a nullable column) and they are more difficult to work with in code (e.g., LINQ to SQL).
However I am not sure if the first option would be a good idea. I saw that it is possible to create two FKs on ID_PatientOrReferral (one for Patients and one for Referrals), though I can't set any update/delete behavior there for obvious reasons, I don't know if constraint check on insert works that way, either, so it looks like the FKs are there only to mark that there are relationships. Alternatively, I may not create any foreign keys, but instead add the relationships in DBML manually.
Is any of the approaches better and why?
To expand on my somewhat terse comment:
I would like to merge those two tables
I believe this would be a bad idea. At the moment you have two tables with good clear relation predicates (briefly, what it means for there to exist a record in the table) - and crucially, these relation predicates are different for the two tables:
A record exists in PatientInsuranceCarriers <=> that Patient is associated with that Insurance Carrier
A record exists in ReferralInsuranceCarriers <=> that Referral is associated with that Insurance Carrier
Sure, they are similar, but they are not the same. Consider now what would be the relation predicate of a combined table:
A record exists in ReferralAndPatientInsuranceCarriers <=> {(IsPatient is true and the Patient with ID ID_PatientOrReferral) or alternatively (IsPatient is false and the Referral with ID ID_PatientOrReferral)} is associated with that Insurance Carrier
or if you do it with NULLs
A record exists in ReferralAndPatientInsuranceCarriers <=> {(ID_Patient is not NULL and the Patient with ID ID_Patient) or alternatively (ID_Referral is not NULL and the Referral with ID ID_Referral)} is associated with that Insurance Carrier
Now, I'm not one to automatically suggest that more complicated relation pedicates are necessarily worse; but I'm fairly sure that either of the two above are worse than those they would replace.
To address your concerns:
we now have two LINQ to SQL entities, separate controllers and views for each
In general I would agree with reducing duplication; however, only duplication of the same things! Here, is it not the case that all the above are essentially 'boilerplate', and their construction and maintenance can be delegated to suitable development tools?
and have to merge them when preparing data for reports
If you were to create a VIEW, containing a UNION, for reporting purposes, you would keep the simplicity of the actual data and still have the ability to report on a combined list; eg (making assumptions about column names etc):
CREATE VIEW InterestingInsuranceCarriers
AS
SELECT
IC.Name InsuranceCarrierName
, P.Name CounterpartyName
, 'Patient' CounterpartyType
FROM InsuranceCarriers IC
INNER JOIN PatientInsuranceCarriers PIC ON IC.ID = PIC.InsuranceCarrierID
INNER JOIN Patient P ON PIC.PatientId = P.ID
UNION
SELECT
IC.Name InsuranceCarrierName
, R.Name CounterpartyName
, 'Referral' CounterpartyType
FROM InsuranceCarriers IC
INNER JOIN ReferralInsuranceCarriers RIC ON IC.ID = RIC.InsuranceCarrierID
INNER JOIN Referral R ON PIC.ReferralId = R.ID
Copying my answer from this question
If you really need A_or_B_ID in TableZ, you have two similar options:
1) Add nullable A_ID and B_ID columns to table z, make A_or_B_ID a computed column using ISNULL on these two columns, and add a CHECK constraint such that only one of A_ID or B_ID is not null
2) Add a TableName column to table z, constrained to contain either A or B. now create A_ID and B_ID as computed columns, which are only non-null when their appropriate table is named (using CASE expression). Make them persisted too
In both cases, you now have A_ID and B_ID columns which can have appropriate foreign
keys to the base tables. The difference is in which columns are computed. Also, you
don't need TableName in option 2 above if the domains of the 2 ID columns don't
overlap - so long as your case expression can determine which domain A_or_B_ID
falls into