I am creating a partition table using some very specific character values.
If a newly inserted row doesn't match any of the values in the partition function, I want it to go in the default partition.
Right now it's going into the partition of a neighboring value instead.
For example
if I have values of 'apple', 'banana', 'cake'
and a new record is inserted with a value of 'berry',
it gets assigned to the 'banana' partition.
How can I say that my partition values must be exact?
We must create a process in Spring Framework that reads a DB2 table by blocks.
However, that table does not have a column with a unique identifier that we can use as a cursor, so for the second block we don't know from which point we must read.
The table has these columns:
BOOK_ID SOLD_AT QUANTITY
The first one is a foreign key to the book model, the second one is the date when a book was sold, and the third one is the quantity of books sold.
Is it possible to do a SELECT ordered by DB2's row ID? Unfortunately, this is legacy code, so we cannot add an extra column to the table.
Thanks in advance.
Try this:
select hex(rowid) rowid, t.name, t.creator
from (
select t.*, rid_bit(t) rowid
from sysibm.systables t
) t
order by rowid
fetch first 10 row only;
Note that the rid_bit(table-designator) value for a row is not stable: it may change when the row moves physically (a reorg, for example), and the same value can be reused when an old row is deleted and a new row is inserted into the same physical place.
I am pretty new to the table partitioning technique supported by MS SQL Server. I have a huge table with more than 40 million records and want to apply table partitioning to it. Most of the examples I find define the partition function as RANGE LEFT|RIGHT FOR VALUES (......), but what I need is something like the following example I found on an Oracle web page:
CREATE TABLE q1_sales_by_region
(...,
...,
...,
state varchar2(2))
PARTITION BY LIST (state)
(PARTITION q1_northwest VALUES ('OR', 'WA'),
PARTITION q1_southwest VALUES ('AZ', 'UT', 'NM'),
PARTITION q1_northeast VALUES ('NY', 'VM', 'NJ'),
PARTITION q1_southeast VALUES ('FL', 'GA'),
PARTITION q1_northcentral VALUES ('SD', 'WI'),
PARTITION q1_southcentral VALUES ('OK', 'TX'));
The example shows that we can specify a PARTITION BY LIST clause in the CREATE TABLE statement, and the PARTITION clauses specify lists of discrete values that qualify rows to be included in the partition.
My question is: does MS SQL Server support table partitioning by list as well?
It does not. SQL Server's partitioned tables only support range partitioning.
In this circumstance, you may wish instead to consider using a Partitioned View.
There are a number of restrictions (scroll down slightly from the link anchor) that apply to partitioned views but the key here is that the partitioning is based on CHECK constraints within the underlying tables and one form the CHECK can take is <col> IN (value_list).
However, setting up partitioned views is considerably more "manual" than creating a partitioned table - each table that holds some of the view data has to be individually and explicitly created.
You can achieve this by using an auxiliary persisted computed column.
Here you can find a complete example:
LIST Partitioning in SQL Server
The idea is to create a computed column based on your list like this:
alter table q1_sales_by_region add calc_field as (case when state in ('OR', 'WA') then 1 ... end) PERSISTED
Then partition on calc_field using a standard range partition function.
What are you trying to accomplish with partitioning? 40M rows was huge 20 years ago but commonplace nowadays. Index and query tuning is especially important for performance of large tables, although partitioning can improve performance of large scans when the partitioning column is not the leftmost clustered index key column and partitions can be eliminated during query processing.
For improved manageability and control over physical placement on different filegroups, you can use range partitioning with a filegroup per region. For example:
CREATE TABLE q1_sales_by_region
(
--
state char(2)
);
CREATE PARTITION FUNCTION PF_State(char(2)) AS RANGE RIGHT FOR VALUES(
'AZ'
, 'FL'
, 'GA'
, 'NJ'
, 'NM'
, 'NY'
, 'OK'
, 'OR'
, 'SD'
, 'TX'
, 'UT'
, 'VM'
, 'WA'
, 'WI'
);
CREATE PARTITION SCHEME PS_State AS PARTITION PF_State TO(
[PRIMARY] --unused
, q1_southwest --'AZ'
, q1_southeast --'FL'
, q1_southeast --'GA'
, q1_northeast --'NJ'
, q1_southwest --'NM'
, q1_northeast --'NY'
, q1_southcentral --'OK'
, q1_northwest --'OR'
, q1_northcentral --'SD'
, q1_southcentral --'TX'
, q1_southwest --'UT'
, q1_northeast --'VM'
, q1_northwest --'WA'
, q1_northcentral --'WI'
);
You can also add a check constraint if you don't already have a related table to enforce only valid state values:
ALTER TABLE q1_sales_by_region
ADD CONSTRAINT ck_q1_sales_by_region_state
CHECK (state IN('OR', 'WA', 'AZ', 'UT', 'NM','NY', 'VM', 'NJ','FL', 'GA','SD', 'WI','OK', 'TX'));
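To see why the check constraint matters, here is a rough Python sketch of how a RANGE RIGHT function maps a value to a partition: each boundary is the lowest value of its partition, so the partition number is one plus the count of boundaries less than or equal to the value. The helper names partition_number and filegroup_for are made up for illustration, and plain string comparison is assumed to sort these two-letter codes the same way the column's collation would.

```python
from bisect import bisect_right

# Boundary values of PF_State, in the order given above.
BOUNDARIES = ["AZ", "FL", "GA", "NJ", "NM", "NY", "OK",
              "OR", "SD", "TX", "UT", "VM", "WA", "WI"]

# Filegroups of PS_State, one per partition (14 boundaries give 15 partitions).
FILEGROUPS = ["PRIMARY", "q1_southwest", "q1_southeast", "q1_southeast",
              "q1_northeast", "q1_southwest", "q1_northeast", "q1_southcentral",
              "q1_northwest", "q1_northcentral", "q1_southcentral", "q1_southwest",
              "q1_northeast", "q1_northwest", "q1_northcentral"]


def partition_number(state):
    """1-based partition number under RANGE RIGHT semantics (hypothetical helper)."""
    # Assumption: Python string ordering matches the collation for these codes.
    return bisect_right(BOUNDARIES, state) + 1


def filegroup_for(state):
    """Filegroup that would hold a row with this state value (hypothetical helper)."""
    return FILEGROUPS[partition_number(state) - 1]


for state in ["OR", "WI", "CA", "AA"]:
    print(state, partition_number(state), filegroup_for(state))
```

A value that is not a boundary, such as 'CA', still falls into a range (here the one that starts at 'AZ'), which is why range partitioning alone cannot enforce an exact list of values.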
I want to write a trigger to transfer some columns of all inserted rows in a table to another table while incrementing the maximum number in a sequence number field in the destination table. This field is not auto-increment but is a primary key field.
What I used to do was find the max sequence no in destination table, increment and then insert the new value. This worked fine if data is inserted row at a time. But when many rows are inserted from a single query, how can I increment the sequence number? Sample problem follows:
insert into [mssql].mssql.dbo.destination_table (name,seq_no)
select name,?
from inserted
Even a few thousand rows can be inserted at once.
seq_no is part of a composite primary key, so, for example, if data is inserted under a different name, seq_no will be different. (This requirement can be considered later, once I can increment seq_no without regard to its part in the primary key.)
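Okay, I see your problem. Try this:
insert into [mssql].mssql.dbo.destination_table (name,seq_no)
select i.name, x.MaxSeq + row_number() over (order by i.name)
from inserted i
-- current maximum in the destination table; 0 if the table is still empty
cross join (select isnull(max(seq_no), 0) as MaxSeq from [mssql].mssql.dbo.destination_table) x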
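The idea behind the query above, sketched in Python in case it helps: take the current maximum once, then give each row of the batch that maximum plus its row number. The function name assign_seq_numbers and the sample values are made up for illustration; this only mirrors the arithmetic and says nothing about concurrent inserts.

```python
def assign_seq_numbers(names, current_max):
    """Pair each name with current_max + its 1-based row number (hypothetical helper)."""
    # Treat a missing maximum (empty destination table) as 0.
    base = current_max if current_max is not None else 0
    # Sorting plays the role of ORDER BY name inside row_number().
    ordered = sorted(names)
    return [(name, base + offset) for offset, name in enumerate(ordered, start=1)]


batch = ["carol", "alice", "bob"]
print(assign_seq_numbers(batch, 10))
print(assign_seq_numbers(batch, None))
```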
I have an update statement in SQL server where there are four possible values that can be assigned based on the join. It appears that SQL has an algorithm for choosing one value over another, and I'm not sure how that algorithm works.
As an example, say there is a table called Source with two columns (Match and Data) structured as below:
(The match column contains only 1's, the Data column increments by 1 for every row)
Match    Data
-----    ----
1        1
1        2
1        3
1        4
That table will update another table called Destination with the same two columns structured as below:
Match    Data
-----    ----
1        NULL
If you want to update the Data field in Destination in the following way:
UPDATE
Destination
SET
Data = Source.Data
FROM
Destination
INNER JOIN
Source
ON
Destination.Match = Source.Match
there will be four possible values that Destination.Data could be set to after this query is run. I've found that messing with the indexes of Source has an impact on what Destination is set to, and it appears that SQL Server just updates the Destination table with the first value it finds that matches.
Is that accurate? Is it possible that SQL Server is updating the Destination with every possible value sequentially and I end up with the same kind of result as if it were updating with the first value it finds? It seems to be possibly problematic that it will seemingly randomly choose one row to update, as opposed to throwing an error when presented with this situation.
Thank you.
P.S. I apologize for the poor formatting. Hopefully, the intent is clear.
It sets all of the results to the Data. Which one you end up with after the query depends on the order of the results returned (which one it sets last).
Since there's no ORDER BY clause, you're left with whatever order Sql Server comes up with. That will normally follow the physical order of the records on disk, and that in turn typically follows the clustered index for a table. But this order isn't set in stone, particularly when joins are involved. If a join matches on a column with an index other than the clustered index, it may well order the results based on that index instead. In the end, unless you give it an ORDER BY clause, Sql Server will return the results in whatever order it thinks it can do fastest.
You can play with this by turning your update query into a select query, so you can see the results. Notice which record comes first and which record comes last in the source table for each record of the destination table. Compare that with the results of your update query. Then play with your indexes again and check the results once more to see what you get.
Of course, it can be tricky here because UPDATE statements are not allowed to use an ORDER BY clause, so regardless of what you find, you should really write the join so it matches the destination table 1:1. You may find the APPLY operator useful in achieving this goal, and you can use it to effectively JOIN to another table and guarantee the join only matches one record.
The choice is not deterministic and it can be any of the source rows.
You can try
-- temp tables, so the # names used in the rest of this answer resolve
CREATE TABLE #Source(Match INT, Data INT);
INSERT INTO #Source
VALUES
(1, 1),
(1, 2),
(1, 3),
(1, 4);
CREATE TABLE #Destination(Match INT, Data INT);
INSERT INTO #Destination
VALUES
(1, NULL);
UPDATE Destination
SET Data = Source.Data
FROM #Destination Destination
INNER JOIN #Source Source
ON Destination.Match = Source.Match;
SELECT *
FROM #Destination;
And look at the actual execution plan. I see the following.
The output columns from #Destination are Bmk1000, Match. Bmk1000 is an internal row identifier (used here due to lack of clustered index in this example) and would be different for each row emitted from #Destination (if there was more than one).
The single row is then joined onto the four matching rows in #Source and the resultant four rows are passed into a stream aggregate.
The stream aggregate groups by Bmk1000 and collapses the multiple matching rows down to one. The operation performed by this aggregate is ANY(#Source.[Data]).
The ANY aggregate is an internal aggregate function not available in TSQL itself. No guarantees are made about which of the four source rows will be chosen.
Finally the single row per group feeds into the UPDATE operator to update the row with whatever value the ANY aggregate returned.
If you want deterministic results then you can use an aggregate function yourself...
WITH GroupedSource AS
(
SELECT Match,
MAX(Data) AS Data
FROM #Source
GROUP BY Match
)
UPDATE Destination
SET Data = Source.Data
FROM #Destination Destination
INNER JOIN GroupedSource Source
ON Destination.Match = Source.Match;
Or use ROW_NUMBER...
WITH RankedSource AS
(
SELECT Match,
Data,
ROW_NUMBER() OVER (PARTITION BY Match ORDER BY Data DESC) AS RN
FROM #Source
)
UPDATE Destination
SET Data = Source.Data
FROM #Destination Destination
INNER JOIN RankedSource Source
ON Destination.Match = Source.Match
WHERE RN = 1;
The latter form is generally more useful as in the event you need to set multiple columns this will ensure that all values used are from the same source row. In order to be deterministic the combination of partition by and order by columns should be unique.
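As a rough illustration of that last point, here is the same "keep one row per key, chosen by an explicit ordering" idea in Python. The function name pick_one_row_per_key and the extra Label column are made up for the example; the point is that every value comes from the same source row.

```python
def pick_one_row_per_key(rows, key, order_by):
    """Keep, for each key value, the row with the largest order_by value.

    Mirrors ROW_NUMBER() OVER (PARTITION BY key ORDER BY order_by DESC) = 1.
    It is only deterministic if (key, order_by) is unique across rows.
    """
    best = {}
    for row in rows:
        k = row[key]
        if k not in best or row[order_by] > best[k][order_by]:
            best[k] = row
    return best


# Hypothetical source rows; Label is an extra column added for illustration.
source = [
    {"Match": 1, "Data": 1, "Label": "a"},
    {"Match": 1, "Data": 4, "Label": "d"},
    {"Match": 1, "Data": 2, "Label": "b"},
    {"Match": 1, "Data": 3, "Label": "c"},
]

chosen = pick_one_row_per_key(source, key="Match", order_by="Data")
print(chosen[1])
```

Whatever order the source rows arrive in, the row with Data = 4 is chosen, and its Label comes along with it.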
How do I get the position of a given value within the columns of a table row? I need to get the column number.
In pseudo-code:
Loop through the column collection in the result set.
When you find the value, note the index number.
This assumes one row only.
You can't do this in T-SQL; you need a client language such as .NET or Java.
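For what it's worth, here is a minimal sketch of the client-side loop, written in Python against an in-memory SQLite database purely so that it runs on its own. The table demo, its columns and the helper find_column_positions are invented for the example; the same loop applies to any driver that exposes column metadata for a result set.

```python
import sqlite3


def find_column_positions(cursor, target):
    """Return (1-based position, column name) for each column of the first row equal to target."""
    row = cursor.fetchone()
    if row is None:
        return []
    # cursor.description holds one entry per result column; item 0 is the name.
    names = [col[0] for col in cursor.description]
    return [(pos, names[pos - 1])
            for pos, value in enumerate(row, start=1)
            if value == target]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (a INTEGER, b TEXT, c TEXT)")
conn.execute("INSERT INTO demo VALUES (1, 'x', 'needle')")

cur = conn.execute("SELECT a, b, c FROM demo")
matches = find_column_positions(cur, "needle")
print(matches)  # [(3, 'c')]
```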
One option is to query the ColID column from syscolumns for your table [ select [name],[colid] from dbo.syscolumns where [id] = object_id('tablename') ]. Note that I'm not sure if this is guaranteed to be sequential or if gaps could appear if a column is dropped.