SQL Server 2005 - odd clustered index size - sql-server

Existing table structure
CREATE TABLE [MYTABLE](
[ROW1] [numeric](18, 0) NOT NULL,
[ROW2] [numeric](18, 0) NOT NULL,
[ROW3] [numeric](18, 0) NOT NULL,
[ROW4] [numeric](18, 0) NULL,
CONSTRAINT [MYTABLE_PK] PRIMARY KEY CLUSTERED ([ROW1] ASC, [ROW2] ASC, [ROW3] ASC)
)
This table has 2 non-clustered indexes, and the following stats:
RowCount: 5260744
Data Space: 229.609 MB
Index Space: 432.125 MB
I wanted to reduce the size of the indexes, and use a surrogate primary key as the clustered index, instead of the natural composite key.
New table structure
CREATE TABLE [dbo].[TEST_RUN_INFO](
[ROW1] [numeric](18, 0) NOT NULL,
[ROW2] [numeric](18, 0) NOT NULL,
[ROW3] [numeric](18, 0) NOT NULL,
[ROW4] [numeric](18, 0) NULL,
[ID] [int] IDENTITY(1,1) NOT NULL,
CONSTRAINT [MYTABLE_PK] PRIMARY KEY CLUSTERED ([ID] ASC)
)
Still with only 2 non-clustered indexes, here are the new stats:
RowCount: 5260744
Data Space: 249.117 MB
Index Space: 470.867 MB
Question
Can someone account for how a clustered index using 3 NUMERIC(18,0) columns is smaller than a clustered index using a single INT column?
I rebuilt the indexes before and after the changes, and the fill factor is set to 0 for both structures.
The two non-clustered indexes are the same, and were not changed to include the new ID column.
sys.dm_db_index_physical_stats
Stats taken with the ID column
Composite clustered index
INDEX  TYPE          DEPTH  LEVEL  PAGECOUNT  RECORDCOUNT  RECORDSIZE
1      CLUSTERED     3      0      31884      5260744      47
1      CLUSTERED     3      1      143        31884        34
1      CLUSTERED     3      2      1          143          34
5      NONCLUSTERED  3      0      27404      5260744      40
5      NONCLUSTERED  3      1      167        27404        46
5      NONCLUSTERED  3      2      1          167          46
6      NONCLUSTERED  3      0      27400      5260744      40
6      NONCLUSTERED  3      1      164        27400        46
6      NONCLUSTERED  3      2      1          164          46
INT clustered index
INDEX  TYPE          DEPTH  LEVEL  PAGECOUNT  RECORDCOUNT  RECORDSIZE
1      CLUSTERED     3      0      31887      5260744      47
1      CLUSTERED     3      1      54         31887        11
1      CLUSTERED     3      2      1          54           11
5      NONCLUSTERED  4      0      29893      5260744      44
5      NONCLUSTERED  4      1      198        29893        50
5      NONCLUSTERED  4      2      3          198          50
5      NONCLUSTERED  4      3      1          3            50
6      NONCLUSTERED  4      0      29891      5260744      44
6      NONCLUSTERED  4      1      193        29891        50
6      NONCLUSTERED  4      2      2          193          50
6      NONCLUSTERED  4      3      1          2            50

The leaf pages of a clustered index contain all the columns of the table, not just the key columns. By adding a surrogate primary key you have increased the length of every row in the leaf pages by 4 bytes. Multiplied by 5,260,744 rows, that is roughly an additional 20 MB to store the ID column.
The key is narrower, however, so you may well have fewer non-leaf pages (use sys.dm_db_index_physical_stats to see this). Because the clustered index key is used as the row locator in the non-clustered indexes, a narrower key can make those smaller too, at the cost of making them less covering.
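The 20 MB figure can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope calculation using the row count and data-space numbers quoted in the question; it assumes a 4-byte INT and ignores page overhead and free space.

```python
# Back-of-the-envelope check of the space taken by the new ID column.
ROWS = 5_260_744      # row count reported in the question
INT_BYTES = 4         # storage size of a SQL Server INT column

extra_bytes = ROWS * INT_BYTES
extra_mb = extra_bytes / (1024 * 1024)

# Growth in "Data Space" reported between the two table versions.
reported_growth_mb = 249.117 - 229.609

print(f"extra bytes for ID column: {extra_bytes:,}")
print(f"as MB:                     {extra_mb:.2f}")
print(f"reported data growth (MB): {reported_growth_mb:.3f}")
```

The computed value and the reported growth in data space are of the same order, which is consistent with the extra column accounting for most of the difference.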

Related

ORA-02291: integrity constraint (INA.member#mem_id) violated - parent key not found

Error report - ORA-02291: integrity constraint (INA.member#mem_id) violated - parent key not found.
insert into INA.member
(mem_id,mem_insertaddress,address_type,effective_date,end_date,adress,city,zip_code,phone_number,last_name,first_name)
values
(19889218,191166765,'Z2','01-AUG-2013','07-MAY-2016','45 NEWYORK','ATLANTIC NY',011101,2012922341,'BOB','GUY');
When I try to run this statement, I get the error message:
ORA-02291: integrity constraint (INA.member#mem_id) violated - parent key not found.
I would really appreciate your help.
Assume you have two more tables, socialClubs and clubMembership, with the necessary primary and foreign key constraints defined as below:
SQL> create table socialClubs( -- parent (look-up) table for "clubMembership" table.
id int primary key,
name varchar2(500)
);
Table created
SQL> create table clubMembership( -- parent table for "member" table.
id int primary key,
club_id int,
constraint fk_cmmb_sc_id
foreign key(club_id)
references socialClubs(id)
);
Table created
SQL> create table member(
mem_id int primary key,
mem_address varchar2(500),
address_type varchar2(10),
effective_date date,
end_date date,
adress varchar2(500),
city varchar2(50),
zip_code varchar2(10),
phone_number number(15),
last_name varchar2(50),
first_name varchar2(50),
constraint fk_cmmb_mem_id
foreign key(mem_id)
references clubMembership(id)
);
Table created
SQL> insert into socialClubs values(1,'Mathematics');
1 row inserted
SQL> insert into member
(mem_id,
mem_address,
address_type,
effective_date,
end_date,
adress,
city,
zip_code,
phone_number,
last_name,
first_name)
values
(19889218,
191166765,
'Z2',
date'2013-08-01',
date'2016-05-07',
'45 NEWYORK',
'ATLANTIC NY',
011101,
2012922341,
'BOB',
'GUY');
ORA-02291: integrity constraint (ASKIMEMUR.FK_CMMB_MEM_ID) violated - parent key not found
The insert raises ORA-02291 because there is no parent record in the clubMembership table for mem_id 19889218. The member table defines the constraint fk_cmmb_mem_id foreign key(mem_id) references clubMembership(id), so every mem_id must already exist as an id in clubMembership.
Insert the required id into the clubMembership table first, and the insert into member then succeeds:
SQL> insert into clubMembership values(19889218,1);
1 row inserted
SQL> insert into member
(mem_id,
mem_address,
address_type,
effective_date,
end_date,
adress,
city,
zip_code,
phone_number,
last_name,
first_name)
values
(19889218,
191166765,
'Z2',
date'2013-08-01',
date'2016-05-07',
'45 NEWYORK',
'ATLANTIC NY',
011101,
2012922341,
'BOB',
'GUY');
1 row inserted
SQL> commit;
Commit complete
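The same parent-before-child ordering can be reproduced outside Oracle. The sketch below uses Python's built-in sqlite3 module as a stand-in (SQLite reports a generic "FOREIGN KEY constraint failed" rather than ORA-02291), with the tables cut down to the columns that matter; the helper insert_member is hypothetical and only exists for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Simplified versions of the parent and child tables from the answer.
conn.execute("CREATE TABLE clubMembership (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE member ("
    " mem_id INTEGER PRIMARY KEY,"
    " last_name TEXT,"
    " FOREIGN KEY (mem_id) REFERENCES clubMembership(id))"
)

def insert_member(mem_id, last_name):
    """Return True if the insert succeeds, False on a constraint violation."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO member (mem_id, last_name) VALUES (?, ?)",
                (mem_id, last_name),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first_attempt = insert_member(19889218, "BOB")   # no parent row yet
with conn:
    conn.execute("INSERT INTO clubMembership (id) VALUES (19889218)")
second_attempt = insert_member(19889218, "BOB")  # parent row now exists

print(first_attempt, second_attempt)
```

The first insert is rejected because the parent key is missing; once the parent row exists, the identical insert is accepted.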

SQL Server index with condition

I have a table Range with the columns
Start (date), RangeTypeId (integer), ChannelId (integer), IsActive (bit)
I have this index:
CREATE UNIQUE NONCLUSTERED INDEX [IX_Range_Unique]
ON [dbo].[Range] ([Start] ASC, [RangeTypeId] ASC, [ChannelId] ASC, [IsActive] ASC)
I want the index to reject an insert or update only when it would produce 2 rows with IsActive = 1. In other words, I want an index or some sort of trigger that allows multiple Ranges with IsActive = 0 and the same start date, channel id and type, but only one with IsActive = 1 and the same start date, channel id and type.
Example of valid db table state:
Start | RangeTypeId | ChannelId | IsActive
------------------------------------------
23:00 5 1 0
23:00 5 1 0
23:00 5 1 0
23:00 5 1 1
Example of invalid db table state:
Start | RangeTypeId | ChannelId | IsActive
------------------------------------------
23:00 5 1 0
23:00 5 1 0
23:00 5 1 1
23:00 5 1 1
Is it possible?
You can create a unique filtered index like so:
CREATE UNIQUE NONCLUSTERED INDEX [uIXf_Range_Unique] ON [dbo].[Range]
(
[Start] ASC,
[RangeTypeId] ASC,
[ChannelId] ASC,
[IsActive] ASC
)
where IsActive = 1
rextester demo: http://rextester.com/LBI81243
create table range ([Start] varchar(5), [RangeTypeId] int, [ChannelId] int, [IsActive] int) ;
insert into range ([Start], [RangeTypeId], [ChannelId], [IsActive]) values
('23:00', 5, 1, 0),
('23:00', 5, 1, 0),
('23:00', 5, 1, 0),
('23:00', 5, 1, 1)
;
CREATE UNIQUE NONCLUSTERED INDEX [uIXf_Range_Unique] ON [dbo].[Range]
(
[Start] ASC,
[RangeTypeId] ASC,
[ChannelId] ASC,
[IsActive] ASC
)
where IsActive = 1
go
/* throws an error due to duplicate key */
insert into range ([Start], [RangeTypeId], [ChannelId], [IsActive]) values
('23:00', 5, 1, 1)
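For readers without a SQL Server instance, the behaviour can be sketched locally with SQLite, which supports a similar feature under the name partial index. This is a stand-in, not SQL Server itself; the table name range_demo, the snake_case column names and the helper try_insert are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE range_demo ("
    " start TEXT, range_type_id INTEGER, channel_id INTEGER, is_active INTEGER)"
)
# Partial unique index: uniqueness is enforced only for rows with is_active = 1.
conn.execute(
    "CREATE UNIQUE INDEX uix_range_active"
    " ON range_demo (start, range_type_id, channel_id)"
    " WHERE is_active = 1"
)

def try_insert(is_active):
    """Insert a row with fixed start/type/channel; report whether it was accepted."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO range_demo VALUES ('23:00', 5, 1, ?)",
                (is_active,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# Two inactive rows, one active row, then a second active row.
results = [try_insert(0), try_insert(0), try_insert(1), try_insert(1)]
print(results)
```

Only the last insert is rejected, which matches the valid and invalid table states shown in the question.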
IMHO, there is one big disadvantage to using a unique filtered index here.
The main purpose of an index is to speed up SELECT queries.
It is quite possible that the index above will not be used by most of your SELECT queries, and in that situation indexes often get changed.
The main purpose of the index is then defeated, so it gets dropped and replaced by an index on some other column or columns.
The idea behind a filtered index is also different from how it is used here. A filtered index is meant for the case where, among a huge amount of data, you query very frequently on a certain value, so you create a filtered index on that column for that value, as above. Its purpose is not to provide uniqueness.
Suppose a DBA who is not aware of this plan decides to drop the index: you may then start getting duplicate records.
So the best way to prevent duplicates in this case is to check in code.
if not exists (select id from mytable where [Start]=@Start and [RangeTypeId]=@RangeTypeId
and [ChannelId]=@ChannelId and [IsActive]=1)
BEGIN
print 'insert'
END
If inserts or updates can happen from several places, then it is wise to use a trigger.

Migration DDL from Oracle to SQLServer

I would like to migrate DDL from Oracle to SQL Server.
I was able to migrate most of it.
However, some items could not be migrated.
Oracle DDL:
CREATE TABLE ExampleTbl
(
code CHAR(3) NOT NULL,
code2 CHAR(3) NOT NULL,
username VARCHAR2(255) NOT NULL,
d DATETIME
CONSTRAINT PK_Example PRIMARY KEY (code, code2) USING INDEX
PCTFREE 10
INITRANS 2 -- <-?
MAXTRANS 255 -- <-?
TABLESPACE TBSP01
STORAGE(INITIAL 64K NEXT 1M MINEXTENTS 1 MAXEXTENTS 2147483645 BUFFER_POOL DEFAULT) -- <-?
LOGGING -- <-?
ENABLE -- <-?
)
PCTFREE 10
MAXTRANS 255
TABLESPACE TBSP01
STORAGE(INITIAL 64K NEXT 1M MINEXTENTS 1 MAXEXTENTS 2147483645 BUFFER_POOL DEFAULT) -- <-?
NOCACHE -- <-?
LOGGING
/
COMMENT ON TABLE ExampleTbl IS 'Table comment!'
/
SQLServer DDL:
CREATE TABLE [dbo].[ExampleTbl](
[code] [char](10) NOT NULL,
[code2] [char](10) NOT NULL,
[username] [varchar](255) NOT NULL,
[d] [datetime] NULL,
CONSTRAINT [PK_ExampleTbl] PRIMARY KEY CLUSTERED
(
[code] ASC,
[code2] ASC
)
WITH
(
PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON,
FILLFACTOR = 90 -- FillFactor = 100 - Oracle.PCTFREE(10)
) ON [TBSP01] -- Oracle.TableSpace
) ON [TBSP01] -- Oracle.TableSpace
GO
EXEC sys.sp_addextendedproperty
@name=N'MS_Description',
@value=N'Table comment!' , -- Oracle.Comment
@level0type=N'SCHEMA',
@level0name=N'dbo',
@level1type=N'TABLE',
@level1name=N'ExampleTbl'
GO
Don't worry about column names.
How do I migrate these?
INITRANS, MAXTRANS, STORAGE, LOGGING, ENABLE, NOCACHE.
And, are there any other problems?
CREATE TABLE Statement
Converting CREATE TABLE statement keywords and clauses:
#  | Oracle                                        | SQL Server
---------------------------------------------------------------
1  | ENABLE constraint attribute                   | Removed
Storage and physical attributes:
#  | Oracle                                        | SQL Server
---------------------------------------------------------------
1  | PCTFREE num                                   | Removed
2  | PCTUSED num                                   | Removed
3  | INITRANS num                                  | Removed
4  | MAXTRANS num                                  | Removed
5  | COMPRESS [BASIC] | COMPRESS num | NOCOMPRESS  | Removed
6  | LOGGING | NOLOGGING                           | Removed
7  | SEGMENT CREATION IMMEDIATE | DEFERRED         | Removed
8  | TABLESPACE name                               | ON name
9  | LOB (column) STORE AS BASIC FILE (params)     | Removed
10 | PARALLEL num | NOPARALLEL                     | Removed
11 | NOCACHE                                       | Removed
12 | NOMONITORING                                  | Removed
STORAGE clause:
#  | Oracle                                        | SQL Server
---------------------------------------------------------------
1  | INITIAL num                                   | Removed
2  | NEXT num                                      | Removed
3  | MINEXTENTS num                                | Removed
4  | MAXEXTENTS num | UNLIMITED                    | Removed
5  | PCTINCREASE num                               | Removed
6  | FREELISTS num                                 | Removed
7  | FREELIST GROUPS num                           | Removed
8  | BUFFER_POOL DEFAULT | KEEP | RECYCLE          | Removed
9  | FLASH_CACHE DEFAULT | KEEP | NONE             | Removed
10 | CELL_FLASH_CACHE DEFAULT | KEEP | NONE        | Removed
LOB storage clause:
#  | Oracle                                        | SQL Server
---------------------------------------------------------------
1  | TABLESPACE name                               | Removed
2  | DISABLE | ENABLE STORAGE IN ROW               | Removed
3  | CHUNK num                                     | Removed
4  | NOCACHE                                       | Removed
5  | LOGGING                                       | Removed
More details: http://www.sqlines.com/oracle-to-sql-server

Is it possible to add duplicate values to the usn column? This is the table I have created; how do I add data like the example shown?

CREATE TABLE [dbo].[studentdb] (
[usn] VARCHAR (15) NOT NULL,
[name] VARCHAR (50) NOT NULL,
[collegename] VARCHAR (50) NOT NULL,
[eventid] VARCHAR (15) NOT NULL,
[passwd] VARCHAR (50) NULL,
[email] VARCHAR (75) NULL,
CONSTRAINT [PK_studentdb] PRIMARY KEY CLUSTERED ([usn] ASC, [eventid] ASC),
FOREIGN KEY ([eventid]) REFERENCES [dbo].[eventdb] ([eventid])
);
Is it possible to add duplicate values to the usn column?
This is the table I have created. How can I add data like the example below?
USN EVENTID
1 100
2 100
3 200
1 200
3 100
4 100
5 100
5 200
Your PRIMARY KEY is a compound key based on two columns ([usn] ASC, [eventid] ASC), so as long as each pair is unique you can insert it.
In your example:
1 100
2 100
3 200
1 200
3 100
4 100
5 100
5 200
every pair is unique.
For inserting data use the INSERT INTO syntax, with the values in the same order as the column list:
INSERT INTO [dbo].[studentdb](
[usn],
[eventid],
[name],
[collegename],
[passwd],
[email])
VALUES ('1', '100', ...), -- rest of values
('2', '100', ...),
...;
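A quick way to convince yourself is to load the example pairs into a table with the same two-column key. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, and keeps only the two key columns; the other columns and the foreign key to eventdb are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only the two key columns of studentdb are modelled here.
conn.execute(
    "CREATE TABLE studentdb ("
    " usn TEXT NOT NULL,"
    " eventid TEXT NOT NULL,"
    " PRIMARY KEY (usn, eventid))"
)

# The (USN, EVENTID) pairs from the question: usn repeats, pairs do not.
pairs = [
    ("1", "100"), ("2", "100"), ("3", "200"), ("1", "200"),
    ("3", "100"), ("4", "100"), ("5", "100"), ("5", "200"),
]
conn.executemany("INSERT INTO studentdb (usn, eventid) VALUES (?, ?)", pairs)
row_count = conn.execute("SELECT COUNT(*) FROM studentdb").fetchone()[0]

# Repeating an existing pair violates the compound primary key.
try:
    conn.execute("INSERT INTO studentdb (usn, eventid) VALUES ('1', '100')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(row_count, duplicate_rejected)
```

All eight pairs load even though usn values repeat; only an exact repeat of a (usn, eventid) pair is refused.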

Considerations when dropping columns in large tables

I have a table of call data that has grown to 1.3 billion rows and 173 gigabytes of data. There are two columns that we no longer use: one is char(15) and the other is varchar(24). Both have been inserted as NULL for some time. I've been putting off removing the columns because I am unsure of the implications. We have limited space on both the drive with the database and the drive with the transaction log.
In addition, I found this post saying the space would not be available until a DBCC DBREINDEX was done. I see this as both good and bad. It's good because dropping the columns should be very fast and not involve a lot of logging, but bad because the space will not be reclaimed. Will newly inserted records take up less space, though? That would be fine in my case, as we prune the old data after 18 months, so the space would gradually decrease.
If we did a DBCC DBREINDEX (or ALTER INDEX REBUILD), would that actually help, since the columns are not part of any index? Would it take up log space or lock the table so that it could not be used?
I found your question interesting, so I decided to model it on a development database.
SQL Server 2008, database size 400 MB, log 2.4 GB.
I assume, from the link provided, that you created a table with a clustered index:
CREATE TABLE [dbo].[big_table](
[recordID] [int] IDENTITY(1,1) NOT NULL,
[col1] [varchar](50) NOT NULL,
[col2] [char](15) NULL,
[col3] [varchar](24) NULL,
CONSTRAINT [PK_big_table] PRIMARY KEY CLUSTERED
(
[recordID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
This table consists of 12 million records.
sp_spaceused big_table, true
name-big_table, rows-12031303, reserved-399240 KB, data-397760 KB, index_size-1336 KB, unused-144 KB.
Drop the columns:
sp_spaceused big_table, true
The table size stays the same. The database and log sizes also remained the same.
Add 3 million rows to the table:
name-big_table, rows-15031303, reserved-511816 KB, data-509904 KB, index_size-1752 KB, unused-160 KB.
Database size 500 MB, log 3.27 GB.
After
DBCC DBREINDEX( big_table )
the log is the same size, but the database size increased to 866 MB.
name-big_table, rows-12031303, reserved-338376 KB, data-337704 KB, index_size-568 KB, unused-104 KB.
Again add 3 million rows to see whether they go into the available space within the database.
The database size is the same and the log is 3.96 GB, which clearly shows that they do.
Hope it makes sense.
No, newly inserted records would not take up less space. I was looking at this exact issue earlier today as it happens.
Test table
CREATE TABLE T
(
id int identity primary key,
FixedWidthColToBeDropped char(10),
VariableWidthColToBeDropped varchar(10),
FixedWidthColToBeWidened char(7),
FixedWidthColToBeShortened char(20),
VariableWidthColToBeWidened varchar(7),
VariableWidthColToBeShortened varchar(20),
VariableWidthColWontBeAltered varchar(20)
)
Offsets Query
WITH T
AS (SELECT ISNULL(LEFT(MAX(name), 30), 'Dropped') AS column_name,
MAX(column_id) AS column_id,
ISNULL(MAX(case
when column_id IS NOT NULL THEN max_inrow_length
END), MAX(max_inrow_length)) AS max_inrow_length,
leaf_offset,
CASE
WHEN leaf_offset < 0 THEN SUM(CASE
WHEN column_id IS NULL THEN 2 ELSE 0
END)
ELSE MAX(max_inrow_length) - MAX(CASE
WHEN column_id IS NULL THEN 0
ELSE max_inrow_length
END)
END AS wasted_space
FROM sys.system_internals_partition_columns pc
JOIN sys.partitions p
ON p.partition_id = pc.partition_id
LEFT JOIN sys.columns c
ON column_id = partition_column_id
AND c.object_id = p.object_id
WHERE p.object_id = object_id('T')
GROUP BY leaf_offset)
SELECT CASE
WHEN GROUPING(column_name) = 0 THEN column_name
ELSE 'Total'
END AS column_name,
column_id,
max_inrow_length,
leaf_offset,
SUM(wasted_space) AS wasted_space
FROM T
GROUP BY ROLLUP ((column_name,
column_id,
max_inrow_length,
leaf_offset))
ORDER BY GROUPING(column_name),
CASE
WHEN leaf_offset > 0 THEN leaf_offset
ELSE 10000 - leaf_offset
END
Initial State of the Table
column_name column_id max_inrow_length leaf_offset wasted_space
------------------------------ ----------- ---------------- ----------- ------------
id 1 4 4 0
FixedWidthColToBeDropped 2 10 8 0
FixedWidthColToBeWidened 4 7 18 0
FixedWidthColToBeShortened 5 20 25 0
VariableWidthColToBeDropped 3 10 -1 0
VariableWidthColToBeWidened 6 7 -2 0
VariableWidthColToBeShortened 7 20 -3 0
VariableWidthColWontBeAltered 8 20 -4 0
Total NULL NULL NULL 0
Now make some changes
ALTER TABLE T
ALTER COLUMN FixedWidthColToBeWidened char(12)
ALTER TABLE T
ALTER COLUMN FixedWidthColToBeShortened char(10)
ALTER TABLE T
ALTER COLUMN VariableWidthColToBeWidened varchar(12)
ALTER TABLE T
ALTER COLUMN VariableWidthColToBeShortened varchar(10)
ALTER TABLE T
DROP COLUMN FixedWidthColToBeDropped, VariableWidthColToBeDropped
Look at the table again
column_name column_id max_inrow_length leaf_offset wasted_space
------------------------------ ----------- ---------------- ----------- ------------
id 1 4 4 0
Dropped NULL 10 8 10
Dropped NULL 7 18 7
FixedWidthColToBeShortened 5 10 25 10
FixedWidthColToBeWidened 4 12 45 0
Dropped NULL 10 -1 2
VariableWidthColToBeWidened 6 12 -2 0
Dropped NULL 20 -3 2
VariableWidthColWontBeAltered 8 20 -4 0
VariableWidthColToBeShortened 7 10 -5 0
Total NULL NULL NULL 31
Insert a row and look at the page
INSERT INTO T
([FixedWidthColToBeWidened]
,[FixedWidthColToBeShortened]
,[VariableWidthColToBeWidened]
,[VariableWidthColToBeShortened])
VALUES
('1','2','3','4')
DECLARE @DBCCPAGE nvarchar(100)
SELECT TOP 1 @DBCCPAGE = 'DBCC PAGE(''tempdb'',' + CAST(file_id AS VARCHAR) + ',' + CAST(page_id AS VARCHAR) + ',3)'
FROM T
CROSS APPLY sys.fn_PhysLocCracker(%%physloc%%)
DBCC TRACEON(3604)
EXEC (@DBCCPAGE)
Returns
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS
Record Size = 75
Memory Dump @0x000000000D5CA060
0000000000000000: 30003900 01000000 26a44500 00000000 †0.9.....&¤E.....
0000000000000010: ffffffff ffffff7f 00322020 20202020 †ÿÿÿÿÿÿÿ..2
0000000000000020: 20202003 00000000 98935c0d 00312020 † ......\..1
0000000000000030: 20202020 20202020 200a0080 00050049 † ......I
0000000000000040: 004a004a 004a004b 003334†††††††††††††.J.J.J.K.34
Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4
id = 1
Slot 0 Column 67108868 Offset 0x8 Length 0 Length (physical) 10
DROPPED = NULL
Slot 0 Column 67108869 Offset 0x0 Length 0 Length (physical) 0
DROPPED = NULL
Slot 0 Column 67108865 Offset 0x12 Length 0 Length (physical) 7
DROPPED = NULL
Slot 0 Column 67108866 Offset 0x19 Length 0 Length (physical) 20
DROPPED = NULL
Slot 0 Column 6 Offset 0x49 Length 1 Length (physical) 1
VariableWidthColToBeWidened = 3
Slot 0 Column 67108867 Offset 0x0 Length 0 Length (physical) 0
DROPPED = NULL
Slot 0 Column 8 Offset 0x0 Length 0 Length (physical) 0
VariableWidthColWontBeAltered = [NULL]
Slot 0 Column 4 Offset 0x2d Length 12 Length (physical) 12
FixedWidthColToBeWidened = 1
Slot 0 Column 5 Offset 0x19 Length 10 Length (physical) 10
FixedWidthColToBeShortened = 2
Slot 0 Column 7 Offset 0x4a Length 1 Length (physical) 1
VariableWidthColToBeShortened = 4
Slot 0 Offset 0x0 Length 0 Length (physical) 0
KeyHashValue = (010086470766)
You can see that the dropped (and altered) columns are still consuming space, even though the table was actually empty when the schema was changed.
In your case the impact of the dropped columns will be 15 bytes wasted per row for the char column and 2 bytes per row for the varchar column, unless the varchar is the last column in the variable-length section, in which case it takes up no space.
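To put those per-row figures in context, here is a rough upper-bound estimate for a table of the size described in the question. It assumes the approximate 1.3 billion row count from the question, that every row carries the full 15 + 2 bytes, and it ignores page-level overhead, so treat the result as an order of magnitude only.

```python
# Rough estimate of space held by dropped columns until the index is rebuilt.
ROWS = 1_300_000_000        # approximate row count from the question
CHAR_BYTES = 15             # dropped char(15) keeps its fixed-width slot
VARCHAR_OFFSET_BYTES = 2    # dropped varchar keeps a 2-byte offset entry

wasted_char = ROWS * CHAR_BYTES
wasted_total = ROWS * (CHAR_BYTES + VARCHAR_OFFSET_BYTES)

GIB = 1024 ** 3
print(f"char(15) alone: {wasted_char:,} bytes ({wasted_char / GIB:.1f} GiB)")
print(f"both columns:   {wasted_total:,} bytes ({wasted_total / GIB:.1f} GiB)")
```

Even as a rough figure, this is a noticeable share of the 173 GB table, which is why the decision about rebuilding matters when disk and log space are limited.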

Resources