I am trying to implement Always Encrypted on a partitioned table, on a column that is not used for partitioning the table - always-encrypted

I am trying to implement Always Encrypted with deterministic encryption on a partitioned table, on a column that is not used for partitioning the table. I am getting the error below:
Column 'SubsidiaryId' is partitioning column of the index 'XYZ'.
Partition columns for a unique index must be a subset of index key.
Cannot create Index or constraint. See previous errors.

Related

Is partitioning column required in the clustered index in SQL Server

I have a table with these indexes:
pk_id_sales PRIMARY KEY (id) -> Clustered unique index
uk_sales_id UNIQUE (sales_id) -> Non-clustered unique index
uk_sales_date_party_name (sales_date, party_name) -> Non clustered, non unique index
I want to partition this table on the column sales_date.
Should I include sales_date into the clustered index to get the benefits of partitioning? Is this an optional one? What should be the factors to be considered to make this decision if it is an optional one?
What should the order of columns in the clustered index be if I add sales_date? Should it be (id, sales_date) or (sales_date, id)? What role does the order play here?
Will the order of columns in the index make any performance impact in this case?
If we include the partition column in the query, will partition elimination always happen regardless of the indexes we have? (Eg: I already have a unique non-clustered index on the sales_id (it doesn't contain sales_date). If I make a query with sales_id and sales_date in the where clause, will the partition elimination happen?)
Please share if there is a comprehensive write-up or video that will help to gain a fair understanding of the above-given concepts.
Any response will be appreciated. I can share more details if required.
I tried the following scenarios on an existing empty table. In both cases, the new records are getting inserted into the respective partitions and partition elimination is happening correctly (Found it based on the actual execution plan in azure data studio)
SCENARIO 1
I followed the steps below, based on a tutorial. I don't know why we are performing the 4th step.
Drop the existing clustered index on ID
Create a new non-clustered index on ID
Create a clustered index on sales_date
Drop the clustered index on sales_date
SCENARIO 2
Based on another tutorial, I tried the following.
Drop the existing clustered index on ID
Create a new non-clustered index on ID
Create a clustered index on sales_date
For your first question, the partitioning column is required to be specified explicitly as a key column for all unique indexes. Furthermore, SQL Server will automatically add the partitioning column to clustered index keys if not already specified.
The partitioning column is automatically added as an included column in non-unique non-clustered indexes when not already a key or included column.
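As a minimal T-SQL sketch of those rules, assuming a simplified version of the table in the question: the partition function and scheme names (pf_sales_date, ps_sales_date), the boundary values, and the column types are hypothetical and only illustrate where sales_date has to appear.

```sql
-- Hypothetical partition function and scheme; boundary values are illustrative only.
CREATE PARTITION FUNCTION pf_sales_date (date)
    AS RANGE RIGHT FOR VALUES ('2022-01-01', '2023-01-01');

CREATE PARTITION SCHEME ps_sales_date
    AS PARTITION pf_sales_date ALL TO ([PRIMARY]);

-- Column types are assumptions; the question does not state them.
CREATE TABLE dbo.sales
(
    id         int IDENTITY(1,1) NOT NULL,
    sales_id   int               NOT NULL,
    sales_date date              NOT NULL,
    party_name varchar(100)      NOT NULL,
    -- Unique (primary key) index on the partition scheme:
    -- sales_date must be listed explicitly as a key column.
    CONSTRAINT pk_id_sales PRIMARY KEY CLUSTERED (id, sales_date)
) ON ps_sales_date (sales_date);

-- Unique nonclustered index on the partition scheme:
-- again, sales_date must be an explicit key column.
CREATE UNIQUE NONCLUSTERED INDEX uk_sales_id
    ON dbo.sales (sales_id, sales_date)
    ON ps_sales_date (sales_date);

-- Non-unique index: sales_date is already a key column here.
CREATE NONCLUSTERED INDEX uk_sales_date_party_name
    ON dbo.sales (sales_date, party_name)
    ON ps_sales_date (sales_date);
```

Note that a unique index on (sales_id, sales_date) no longer enforces uniqueness of sales_id on its own. If that guarantee matters, one option is to create the unique index on a filegroup instead of the partition scheme (for example ON [PRIMARY]), at the cost of the index no longer being aligned with the table.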
EDIT:
For this question asked in comment:
The existing clustered index on my table is id (it is an identity column and
auto incremented). I want to partition the table based on sales_date.
My understanding is that we need to add sales_date to the clustered
index. In the examples I saw on web, they are adding it as a second
part of the clustered index, ie, (id, sales_date). But for me, it
looks like (sales_date,id) will be more helpful as id is unique and
it will not help to improve performance.
It depends on your queries. The partitioning column must be specified to eliminate partitions, and the leftmost key column must be specified to perform an index seek.
With unique clustered index key (id,sales_date) and no other indexes:
WHERE id = 1 will perform an index seek against every partition to
find the single row.
WHERE sales_date = '20221114' will perform a full scan of single
partition containing the date and return only rows matching the date.
WHERE id = 1 AND sales_date = '20221114' will perform a seek
against only the single partition containing the date and touch the
single row.
With unique clustered index key (sales_date,id):
WHERE id = 1 will full scan every partition to find the single
row.
WHERE sales_date = '20221114' will perform an index seek on only
the partition containing the date and touch only rows that qualify.
WHERE id = 1 AND sales_date = '20221114' will perform an index
seek on only the partition containing the date and touch only the single
row.
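One way to check these cases yourself is sketched below in T-SQL. It assumes a table dbo.sales partitioned on sales_date by a partition function named pf_sales_date; both names are hypothetical.

```sql
-- Which partition number a given date maps to
-- (pf_sales_date is a hypothetical partition function name).
SELECT $PARTITION.pf_sales_date(CAST('20221114' AS date)) AS partition_number;

-- Row counts per partition for the heap or clustered index of the table.
SELECT p.partition_number, p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.sales')
  AND p.index_id IN (0, 1)
ORDER BY p.partition_number;

-- The three predicates discussed above; run each with the actual
-- execution plan enabled and compare the operators.
SELECT id, sales_date FROM dbo.sales WHERE id = 1;
SELECT id, sales_date FROM dbo.sales WHERE sales_date = '20221114';
SELECT id, sales_date FROM dbo.sales WHERE id = 1 AND sales_date = '20221114';
```

In the actual execution plan, the seek or scan operator should report how many partitions were accessed (the "Actual Partition Count" property), which is how partition elimination can be confirmed for each query.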

Creating MD5 for all the rows in the table

I am working on creating a unique key to find the rows that have changed since the last refresh of the table. My approach is to take the PK of the table, create an MD5 column for each row, and use the PK and MD5 together to check whether any of the rows in the table have changed since last time.
What is the best method to create the MD5 in MS SQL within the query itself, in a way that handles all data types as well as NULL columns?
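A minimal sketch of one possible approach, using the built-in HASHBYTES function. The table dbo.source_table and its columns (id, col_int, col_text, col_date) are hypothetical, and the N'<NULL>' marker and N'|' separator are arbitrary choices, not a standard.

```sql
-- Hypothetical table and columns; adapt the column list to the real table.
SELECT
    t.id,
    HASHBYTES(
        'MD5',
        CONCAT(
            -- Convert every column to text explicitly so all data types are handled,
            -- and replace NULL with a marker so NULL and '' hash differently.
            ISNULL(CONVERT(nvarchar(max), t.col_int), N'<NULL>'), N'|',
            ISNULL(CONVERT(nvarchar(max), t.col_text), N'<NULL>'), N'|',
            -- Style 126 (ISO 8601) keeps the date text independent of session settings.
            ISNULL(CONVERT(nvarchar(max), t.col_date, 126), N'<NULL>')
        )
    ) AS row_md5
FROM dbo.source_table AS t;
```

The separator prevents values such as ('ab', 'c') and ('a', 'bc') from producing the same input string. Two caveats worth checking against your version: before SQL Server 2016 the HASHBYTES input is limited to 8000 bytes, and from SQL Server 2016 onward MD5 is marked as deprecated in HASHBYTES in favor of SHA2_256 and SHA2_512.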

Single Column Huge table (2.5 B rows). Clustered index Vs Clustered Columnstore index

We have a huge table, Table1 (2.5 billion rows), with a single column A (NVARCHAR(255) datatype). What is the right approach for seek operations against this table: a clustered index on A, or a clustered columnstore index on A?
We already keep this table in a separate filegroup from the other table, Table2, with which it will be joined.
Do you suggest partitioning this table for better performance? This column will also hold Unicode data, so what kind of partitioning approach is suitable for a Unicode datatype?
UPDATE: To clarify further, the use case for the table is SEEK. The table is storing identifiers for individuals. The major concerns here are performance for SEEK in the case of huge table. This table will be referred inside a transaction. We want the transaction to be short.
Clustered index vs columnstore index depends on the use case for the table. A columnstore index stores data column by column and compresses repeated values, which makes it very useful for data warehousing tasks such as aggregates over the indexed columns, but less suitable for transactional tasks that need to pull a small number of specific rows. SQL Server 2014 introduced updatable clustered columnstore indexes, and from SQL Server 2016 you can combine both storage types on one table (for example, a nonclustered B-tree index on a clustered columnstore table, or a nonclustered columnstore index on a table with a regular clustered index). This does have some limitations and overhead that you should read up on, though.
Given that this is a seek for specific rows and not an aggregation of the column, I would recommend a clustered index instead of a column store index.
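A minimal T-SQL sketch of that recommendation, using the table and column names from the question; the index name and the sample value are hypothetical.

```sql
-- Rowstore clustered index on the single column.
-- NVARCHAR(255) is at most 510 bytes, within the 900-byte limit for clustered index keys.
-- If the identifiers are guaranteed unique, CREATE UNIQUE CLUSTERED INDEX is an option.
CREATE CLUSTERED INDEX cix_Table1_A
    ON dbo.Table1 (A);

-- Point lookup; the N'' prefix keeps the literal as NVARCHAR so the
-- comparison does not need an implicit conversion.
SELECT t.A
FROM dbo.Table1 AS t
WHERE t.A = N'some-identifier';
```

When joining to Table2, it helps if the joined column there is also NVARCHAR with the same collation, so that the predicate can still be resolved as a seek.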

What happens to a clustered index when PK is created on two columns in SQL Server

I just created a table with a primary key on TWO columns in SQL Server. One column is age, the other is ID number, and I set the option to create a clustered index, so it automatically creates a clustered index on both columns. However, when I query the table, the results only seem to be sorted by ID and completely ignore AGE (the other PK and clustered index column). Why is this? Why is it only sorting based on one of the clustered index columns?
The query optimizer may decide to use the physical ordering of the rows in the table if there is no advantage in ordering any other way. So, when you select from the table using a simple query, it may be ordered this way. It is very easy to assume that the rows are physically stored in the order specified within the definition of your clustered index. But this turns out to be a false assumption.
Please view the following article for more details: Clustered Index do “NOT” guarantee Physically Ordering or Sorting of Rows
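In practice, the only way to get a guaranteed order in the result is to ask for it with ORDER BY. A minimal sketch, assuming a hypothetical table dbo.People with the two key columns from the question:

```sql
-- Without ORDER BY, the order of the returned rows is not guaranteed,
-- regardless of how the clustered index is defined.
SELECT Age, ID
FROM dbo.People;

-- With ORDER BY, the result is sorted by Age first, then by ID within each Age.
SELECT Age, ID
FROM dbo.People
ORDER BY Age, ID;
```

If the clustered index key already matches the ORDER BY columns, the optimizer can often read the index in order and avoid a separate sort step, so the explicit ORDER BY usually costs little.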

Non spatial indexes in Oracle

I'm trying to create a non-spatial index on two columns, one of which is a geometry column (SDO_GEOMETRY). It appears from the documentation that it is possible but I'm unable to create one.
An excerpt from the Oracle documentation:
For each spatial column in a non-SPATIAL index except POINT columns, a
column prefix length must be specified. (This is the same requirement
as for indexed BLOB columns.) The prefix length is given in bytes.
Here's the query I'm trying to execute to create the index:
create index multiple_column_index on TestDB (ID, SHAPE) tablespace test;
The SHAPE column is the geometry column here. The error I'm receiving is:
SQL Error: ORA-02327: cannot create index on expression with datatype ADT
02327. 00000 - "cannot create index on expression with datatype %s"
*Cause: An attempt was made to create an index on a non-indexable
expression.
*Action: Change the column datatype or do not create the index on an
expression whose datatype is one of VARRAY, nested table, object,
LOB, or REF.
I've not applied the column prefix here as I couldn't find any documentation that explains its usage.
There's no way to index a spatial column as a part of a B-Tree index.
If you have a spatial column on one of your tables, you have to create a spatial domain index on that column in order to use spatial functions.
There's no other way to index those columns.
Spatial domain indexes are fairly complex, and there are many options to choose from when creating one.
You can achieve great performance by configuring (and using) those indexes correctly.
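A minimal Oracle sketch of that approach, using the TestDB table and SHAPE column from the question. The index names, the dimension bounds, the tolerance, and the SRID (4326) are assumptions for illustration and must match your actual data.

```sql
-- Register the geometry column's metadata; Oracle requires this
-- before a spatial index can be created on the column.
INSERT INTO user_sdo_geom_metadata (table_name, column_name, diminfo, srid)
VALUES (
  'TESTDB',
  'SHAPE',
  SDO_DIM_ARRAY(
    SDO_DIM_ELEMENT('X', -180, 180, 0.005),  -- assumed longitude bounds and tolerance
    SDO_DIM_ELEMENT('Y',  -90,  90, 0.005)   -- assumed latitude bounds and tolerance
  ),
  4326                                       -- assumed SRID (WGS 84)
);
COMMIT;

-- Spatial domain index on the geometry column only.
CREATE INDEX testdb_shape_sidx ON TestDB (SHAPE)
  INDEXTYPE IS MDSYS.SPATIAL_INDEX;

-- The non-spatial column gets its own ordinary B-tree index.
CREATE INDEX testdb_id_idx ON TestDB (ID) TABLESPACE test;
```

With two separate indexes, a query that filters on both ID and a spatial operator such as SDO_FILTER or SDO_RELATE leaves it to the optimizer to decide which index to drive from.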

Resources