Does SQL Server compress database size on similar data?

To estimate my database size in the production environment, I populated my tables with 1.5 million rows of nearly identical data (except for the primary key). It currently shows 261 MB.
Can I rely on this figure, or has SQL Server compressed the data because the other columns are almost identical in every row? That is, will the size be different if the values in each row are different?
Also, do columns with NULL values contribute to the size of the database?
Thanks for your time...
Edit: Here is my schema. I have also created some indexes.
CREATE TABLE [dbo].[Trn_Tickets](
[ObjectID] [bigint] IDENTITY(1,1) NOT NULL,
[TicketSeqNo] [bigint] NULL,
[BookSeqNo] [bigint] NULL,
[MatchID] [int] NULL,
[TicketNumber] [varchar](20) NULL,
[BarCodeNumber] [varchar](20) NULL,
[GateNo] [varchar](5) NULL,
[EntryFrom] [varchar](10) NULL,
[MRP] [decimal](9, 2) NULL,
[Commission] [decimal](9, 2) NULL,
[Discount] [decimal](9, 2) NULL,
[CashPrice] [decimal](9, 2) NULL,
[CashReceived] [decimal](9, 2) NULL,
[BalanceDue] [decimal](9, 2) NULL,
[CollectibleFrom] [char](1) NULL,
[PlaceOfIssue] [varchar](20) NULL,
[DateOfIssue] [datetime] NULL,
[PlaceOfSale] [varchar](20) NULL,
[AgentID] [int] NULL,
[BuyerID] [int] NULL,
[SaleTypeID] [tinyint] NULL,
[SaleDate] [smalldatetime] NULL,
[ApprovedBy] [varchar](15) NULL,
[ApprovedDate] [smalldatetime] NULL,
[InvoiceStatus] [char](1) NULL,
[InvoiceRefNo] [varchar](15) NULL,
[InvoiceDate] [smalldatetime] NULL,
[BookPosition] [char](2) NULL,
[TicketStatus] [char](2) NULL,
[RecordStatus] [char](1) NULL,
[ClosingStatus] [char](2) NULL,
[ClosingDate] [datetime] NULL,
[UpdatedDate] [datetime] NULL,
[UpdatedUser] [varchar](10) NULL,
CONSTRAINT [PK_Trn_Tickets] PRIMARY KEY CLUSTERED
(
[ObjectID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Hope this helps

SQL Server 2005 and 2008 Express will not compress your data. SQL Server 2008 can use page compression, but only in Enterprise Edition. Each column takes one bit in the row's NULL bitmap; note that a NULL fixed-length column (int, decimal, datetime, char) still reserves its full width, while a NULL varchar stores no data bytes.
From the description of your data, this sounds more like a case for ordinary normalization. Separate the repeated values into a lookup table, store only the distinct combinations, and join against the lookup table. This saves space through schema design and works on all DB platforms, all versions, all SKUs.
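To confirm that the 261 MB figure reflects uncompressed storage, the table's size and compression setting can be inspected directly. This is a minimal sketch assuming the dbo.Trn_Tickets table from the question. The sys.partitions query needs SQL Server 2008 or later; on 2005 only the sp_spaceused call applies.

```sql
-- Reserved, data, and index size for the table.
EXEC sp_spaceused N'dbo.Trn_Tickets';

-- Compression setting per index and partition (SQL Server 2008+).
-- NONE means the rows are stored uncompressed.
SELECT i.name AS index_name,
       p.partition_number,
       p.rows,
       p.data_compression_desc
FROM sys.partitions AS p
JOIN sys.indexes AS i
  ON i.object_id = p.object_id
 AND i.index_id = p.index_id
WHERE p.object_id = OBJECT_ID(N'dbo.Trn_Tickets');
```

If every row reports NONE, the measured size does not depend on how similar the test values are, apart from the actual lengths of the varchar data.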

Replace ApprovedBy etc. (varchar) with lookups to other tables.
Do you need datetime?
Do you expect more than 4 billion rows? Why are the first 3 columns bigint?
Save a few bytes here and there = a big difference. Higher page density (e.g. more rows per 8 KB page) = less space + smaller indexes.
Compress when you have 1.5 billion rows.
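As a rough worked example of the "few bytes here and there" point, the arithmetic below estimates the saving from narrowing the three bigint columns to int and the three datetime columns to smalldatetime. It assumes the values fit the narrower types and that minute precision is acceptable, neither of which the question confirms. It counts only the table rows, not the nonclustered indexes.

```sql
-- Storage sizes: bigint = 8 bytes, int = 4; datetime = 8 bytes, smalldatetime = 4.
DECLARE @rows bigint;
SET @rows = 1500000;  -- row count from the question

SELECT
    3 * (8 - 4)       -- ObjectID, TicketSeqNo, BookSeqNo: bigint -> int
  + 3 * (8 - 4)       -- DateOfIssue, ClosingDate, UpdatedDate: datetime -> smalldatetime
      AS bytes_saved_per_row,
    (3 * (8 - 4) + 3 * (8 - 4)) * @rows / 1048576.0 AS approx_mb_saved;
```

That is 24 bytes per row, or about 34 MB of the 261 MB for 1.5 million rows, before counting any index that carries the clustering key.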

Related

Loading data from SQL Server 2012 to 2008 R2: issue with a date column loading into a datetime column

I have to load data from SQL Server 2012 to SQL Server 2008 R2. The source column has the date data type, and the destination column has the datetime data type. When I try to load the data using SSIS, I get an error saying "invalid date format", and sometimes I get "conversion failed".
Source :
CREATE TABLE SourceTable(
[SSNO] [char](9) NOT NULL,
[SH_CHANGE_DATE] [date] NULL,
[SH_REASON_CODE] [char](2) NOT NULL,
[SH_ANN_SALARY] [decimal](8, 2) NOT NULL,
[GRADE] [char](2) NOT NULL,
[ITEM] [char](4) NOT NULL,
[MULTI_POSITNBR] [char](4) NOT NULL,
[STEP] [char](2) NOT NULL,
[OFF_STEP] [char](1) NOT NULL,
[SH_TITLE_NAME] [char](30) NOT NULL,
[SH_CHANGE_AMT] [decimal](7, 2) NOT NULL,
[SH_DIV_DIST_IND] [char](4) NOT NULL,
[SH_BUDGET] [char](3) NOT NULL,
[SH_ENGNO] [char](2) NOT NULL,
[SH_RCDADD_DATE] [date] NULL,
[SH_RCDADD_TIME] [time](0) NULL,
[SH_OCC_CATEGORY] [char](1) NOT NULL,
[SH_FULL_PART_CD] [char](1) NOT NULL,
[SH_SUPVY_CLASS] [char](1) NOT NULL,
[EMPLOYEE_NUMBER] [char](9) NOT NULL,
[T101_TSAL_HIST_PRIMARY_KEY]  AS ((substring([SSNO],(1),(9))+substring(CONVERT([varchar],[SH_CHANGE_DATE],(112)),(1),(8)))+substring([SH_REASON_CODE],(1),(2))) PERSISTED NOT NULL,
 CONSTRAINT SourceTableconstrain PRIMARY KEY CLUSTERED
(
[T101_TSAL_HIST_PRIMARY_KEY] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
DESTINATION :
CREATE TABLE TableName(
[SSNO] [char](9) NOT NULL,
[SH_CHANGE_DATE] [datetime] NOT NULL,
[SH_REASON_CODE] [char](2) NOT NULL,
[SH_ANN_SALARY] [money] NULL,
[GRADE] [char](2) NULL,
[ITEM] [char](4) NULL,
[MULTI_POSITNBR] [char](4) NULL,
[STEP] [char](2) NULL,
[OFF_STEP] [char](1) NULL,
[SH_TITLE_NAME] [varchar](30) NULL,
[SH_CHANGE_AMT] [money] NULL,
[SH_DIV_DIST_IND] [char](4) NULL,
[SH_BUDGET] [char](3) NULL,
[SH_ENGNO] [char](2) NULL,
[SH_RCDADD_DATE] [datetime] NULL,
[SH_RCDADD_TIME] [varchar](6) NULL,
[SH_OCC_CATEGORY] [char](1) NULL,
[SH_FULL_PART_CD] [char](1) NULL,
[SH_SUPVY_CLASS] [char](1) NULL,
[EMPLOYEE_NUMBER] [char](9) NULL,
 CONSTRAINT Tablename PRIMARY KEY CLUSTERED
(
[SSNO] ASC,
[SH_CHANGE_DATE] ASC,
[SH_REASON_CODE] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Thank you for your feedback; I resolved this issue myself. To load data from date to datetime between SQL Server 2012 and 2008 R2, I used two SSIS packages. In the first package I converted the column values from date to the datetime(n) data type. In the second package I copied the converted values from that initial load location to the destination. This way the data types match at the destination, while the data is still loaded from a date column into a datetime column.
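An alternative to the two-package approach, offered as a sketch rather than as what the author did, is to convert in the source query so that SSIS receives datetime values directly. It uses the SourceTable definition above. The '19000101' fallback is a hypothetical placeholder, needed only because the destination SH_CHANGE_DATE column is NOT NULL.

```sql
-- Run against the SQL Server 2012 source as the SSIS source query.
SELECT
    SSNO,
    -- date -> datetime keeps the date and sets the time part to 00:00:00.
    -- The fallback date for NULLs is a hypothetical placeholder,
    -- not something taken from the original post.
    CAST(ISNULL(SH_CHANGE_DATE, '19000101') AS datetime) AS SH_CHANGE_DATE,
    SH_REASON_CODE,
    CAST(SH_RCDADD_DATE AS datetime) AS SH_RCDADD_DATE
    -- ...remaining columns as needed
FROM SourceTable;
```

One thing worth checking in the source data: datetime only covers dates from 1753-01-01 onward, while date goes back to 0001-01-01, so any earlier value fails the conversion with an out-of-range error.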

How do I speed up an insert that deals with huge amount of data and non indexed joins?

I have the following query that will be used to fetch data from legacy tables. Unsurprisingly, the amount of data is huge and so it takes a long time. The first select takes 40 minutes to run using an empty dbo.commodities_copy table as a starting point and yields around 26,000 rows. Keep in mind that there are separate databases, STAGING and PRESTAGING, and that some joins are made on non-PK fields, which is most definitely affecting performance. This is something that I cannot fix, due to the way the data was organized from the start. Also, the transaction table has around 1 million rows, which weighs heavily on performance. The entire script takes a total of 3.5 hours to execute when using an EMPTY dbo.commodities_copy table. I have not tested inserting into a table that already has data.
The goal of the query is to get commodity information from the transaction table (if you guessed this was supposed to be noSQL data, you guessed right) and if the commodity code exists in the commodity table, do not insert a commodity in it.
The GROUP BYs are absolutely needed to get around duplicates, since transactions may share the same commodity. The commodity code should be unique in the commodities table, but currently it is not. If it helps, we could possibly alter that.
What can I do to speed it up?
INSERT INTO STAGING.dbo.commodities_copy
(commodity_code,
short_description_sched_b,
short_description_sched_tsusa,
long_description_sched_b,
long_description_sched_tsusa,
measurement_unit_1_sched_b,
measurement_unit_1_sched_tsusa,
measurement_unit_2_sched_b,
measurement_unit_2_sched_tsusa,
end_use_sched_b,
end_use_sched_tsusa,
year,
created_by,
created_on,
taxable_sched_b,
taxable_sched_tsusa,
non_taxable_sched_b,
non_taxable_sched_tsusa,
fk_sic_sched_b,
fk_sic_sched_tsusa,
chapter,
header,
sub_header,
needs_validation)
SELECT
--Distinct
Commodity_Code,
iif(miob2.DESC_COMM is null, UPPER(socrata.Commodity_Short_Name), miob2.DESC_COMM) as short_commmodity_description_b,
iif(mio2tsusa.DESC_COMM is null, UPPER(socrata.Commodity_Short_Name), mio2tsusa.DESC_COMM) as short_commmodity_description_tsusa,
socrata.Commodity_description as long_commodity_description_b,
socrata.Commodity_description as long_commodity_description_tsusa,
iif(miob2.UNIDAD is null, socrata.unit_1, miob2.UNIDAD) as unit_1_b,
iif(mio2tsusa.UNIDAD is null, socrata.unit_1, mio2tsusa.UNIDAD) as unit_1_tsusa,
MAX(socrata.unit_2) as unit_2_b,
MAX(socrata.unit_2) as unit_2_tsusa,
socrata.end_use_e as end_use_b,
socrata.end_use_i as end_use_tsusa,
MAX(socrata.[year]),
'system' as created_by,
getdate() as created_on,
miob.TRIBUTA as taxable_b,
miotsusa.TRIBUTA as taxable_tsusa,
miob.NTRIBUTA as non_taxable_b,
miotsusa.NTRIBUTA as non_taxable_tsusa,
sicb.id as sic_id_b,
sictsusa.id as sic_id_tsusa,
SUBSTRING(Commodity_Code, 1, 2) as chapter,
SUBSTRING(Commodity_Code, 1, 4) as header,
SUBSTRING(Commodity_Code, 1, 6) as sub_header,
0 as needs_validation
FROM PRE_STAGING.dbo.TRANSACTIONS_FROM_SOCRATA socrata
Left join PRE_STAGING.DBO.MIOB_TBL miob ON miob.COMM=socrata.Commodity_Code
Left join PRE_STAGING.dbo.MSCHB_TBL miob2 ON miob2.COMM=socrata.Commodity_Code
Left join PRE_STAGING.dbo.MIOTSUSA_TBL miotsusa ON miotsusa.COMM=socrata.Commodity_Code
Left join PRE_STAGING.dbo.MTSUSA_TBL mio2tsusa ON mio2tsusa.COMM=socrata.Commodity_Code
Left join STAGING.dbo.sics_altered sicb ON sicb.sic_code = miob.SIC
Left join STAGING.dbo.sics_altered sictsusa ON sictsusa.sic_code = miotsusa.SIC
WHERE NOT EXISTS
(Select Distinct commodity_code from STAGING.dbo.commodities_copy)
group by
Commodity_Code,
iif(miob2.DESC_COMM is null, UPPER(socrata.Commodity_Short_Name), miob2.DESC_COMM),
iif(mio2tsusa.DESC_COMM is null, UPPER(socrata.Commodity_Short_Name), mio2tsusa.DESC_COMM),
socrata.Commodity_description,
socrata.Commodity_description,
iif(miob2.UNIDAD is null, socrata.unit_1, miob2.UNIDAD),
iif(mio2tsusa.UNIDAD is null, socrata.unit_1, mio2tsusa.UNIDAD),
socrata.end_use_e,
socrata.end_use_i,
miob.TRIBUTA,
miotsusa.TRIBUTA,
miob.NTRIBUTA,
miotsusa.NTRIBUTA,
sicb.id,
sictsusa.id,
SUBSTRING(Commodity_Code, 1, 2),
SUBSTRING(Commodity_Code, 1, 4),
SUBSTRING(Commodity_Code, 1, 6)
The tables used are the following:
STAGING.dbo.commodities_copy:
CREATE TABLE [dbo].[commodities_copy](
[id] [bigint] IDENTITY(1,1) NOT NULL,
[chapter] [varchar](5) NULL,
[header] [varchar](5) NULL,
[sub_header] [varchar](10) NULL,
[commodity_code] [varchar](20) NULL,
[short_description_sched_b] [varchar](100) NULL,
[long_description_sched_b] [varchar](200) NULL,
[measurement_unit_1_sched_b] [varchar](5) NULL,
[measurement_unit_2_sched_b] [varchar](5) NULL,
[end_use_sched_b] [int] NULL,
[sitc_sched_b] [varchar](20) NULL,
[usda_sched_b] [int] NULL,
[hitech_sched_b] [int] NULL,
[naics_fk_id_sched_b] [bigint] NULL,
[short_description_sched_tsusa] [varchar](100) NULL,
[long_description_sched_tsusa] [varchar](200) NULL,
[measurement_unit_1_sched_tsusa] [varchar](5) NULL,
[measurement_unit_2_sched_tsusa] [varchar](5) NULL,
[end_use_sched_tsusa] [int] NULL,
[sitc_sched_tsusa] [varchar](20) NULL,
[usda_sched_tsusa] [int] NULL,
[hitech_sched_tsusa] [int] NULL,
[naics_fk_id_sched_tsusa] [bigint] NULL,
[year] [int] NOT NULL,
[created_on] [datetime] NOT NULL,
[created_by] [varchar](50) NULL,
[updated_on] [datetime] NULL,
[updated_by] [varchar](50) NULL,
[needs_validation] [bit] NOT NULL,
[taxable_sched_b] [nchar](3) NULL,
[non_taxable_sched_b] [nchar](3) NULL,
[taxable_sched_tsusa] [nchar](3) NULL,
[non_taxable_sched_tsusa] [nchar](3) NULL,
[fk_sic_sched_b] [bigint] NULL,
[fk_sic_sched_tsusa] [bigint] NULL
) ON [PRIMARY]
STAGING.dbo.sics_altered:
CREATE TABLE [dbo].[sics_altered](
[id] [bigint] IDENTITY(1,1) NOT NULL,
[sic_code] [varchar](4) NULL,
[sic_description] [varchar](max) NULL,
[created_on] [datetime] NOT NULL,
[created_by] [varchar](50) NOT NULL,
PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
The rest are in PRESTAGING:
PRESTAGING.dbo.TRANSACTIONS_FROM_SOCRATA:
This is the table with 1.3 million rows
CREATE TABLE [dbo].[TRANSACTIONS_FROM_SOCRATA](
[Trade] [varchar](255) NULL,
[Year] [varchar](255) NULL,
[Month] [varchar](50) NULL,
[Commodity_Code] [varchar](50) NULL,
[Commodity_Short_Name] [varchar](255) NULL,
[Commodity_description] [varchar](255) NULL,
[cty_code] [varchar](50) NULL,
[Country] [varchar](50) NULL,
[Subcountry_code] [varchar](50) NULL,
[district] [varchar](50) NULL,
[dist_name] [varchar](255) NULL,
[data] [varchar](50) NULL,
[sitc] [varchar](50) NULL,
[SITC_Short_Desc] [varchar](255) NULL,
[SITC_Long_Desc] [varchar](255) NULL,
[naics] [varchar](50) NULL,
[NAICS_description] [varchar](255) NULL,
[end_use_i] [varchar](50) NULL,
[end_use_e] [varchar](50) NULL,
[hts_desc] [varchar](255) NULL,
[unit_1] [varchar](50) NULL,
[qty_1] [varchar](50) NULL,
[unit_2] [varchar](50) NULL,
[qty_2] [varchar](50) NULL,
[ves_val_mo] [varchar](50) NULL,
[ves_wgt_mo] [varchar](50) NULL,
[cards_mo] [varchar](50) NULL,
[air_val_mo] [varchar](50) NULL,
[air_wgt_mo] [varchar](50) NULL,
[dut_val_mo] [varchar](50) NULL,
[cal_dut_mo] [varchar](50) NULL,
[con_cha_mo] [varchar](50) NULL,
[con_cif_mo] [varchar](50) NULL,
[gen_val_mo] [varchar](50) NULL,
[gen_cha_mo] [varchar](50) NULL,
[gen_cif_mo] [varchar](50) NULL,
[air_cha_mo] [varchar](50) NULL,
[ves_cha_mo] [varchar](50) NULL,
[cnt_cha_mo] [varchar](50) NULL,
[rev_data] [varchar](50) NULL
) ON [PRIMARY]
PRESTAGING.dbo.MIOB_TBL:
CREATE TABLE [dbo].[MIOB_TBL](
[id] [int] IDENTITY(1,1) NOT NULL,
[COMM] [nchar](10) NOT NULL,
[INSUMO] [nchar](3) NULL,
[PBTO] [nchar](4) NULL,
[SIC] [nchar](4) NULL,
[NAICS] [nchar](6) NULL,
[TRIBUTA] [nchar](3) NULL,
[NTRIBUTA] [nchar](3) NULL,
[LAST_UPDATE] [date] NULL,
[LAST_UPDATED_BY] [nchar](20) NULL,
[CREATION_DATE] [date] NULL,
[CREATED_BY] [nchar](15) NULL,
[migrated_on] [datetime] NOT NULL,
PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
PRESTAGING.dbo.MIOTSUSA_TBL:
CREATE TABLE [dbo].[MIOTSUSA_TBL](
[COMM] [nchar](10) NOT NULL,
[INSUMO] [nchar](3) NULL,
[PBTO] [nchar](4) NULL,
[SIC] [nchar](4) NULL,
[NAICS] [nchar](6) NULL,
[TRIBUTA] [nchar](3) NULL,
[NTRIBUTA] [nchar](3) NULL,
[id] [int] IDENTITY(1,1) NOT NULL,
[migrated_on] [datetime] NOT NULL
) ON [PRIMARY]
PRESTAGING.dbo.MSCHB_TBL:
CREATE TABLE [dbo].[MSCHB_TBL](
[id] [int] IDENTITY(1,1) NOT NULL,
[COMM] [nchar](10) NOT NULL,
[DESC_COMM] [nchar](50) NULL,
[UNIDAD] [nchar](3) NULL,
[LAST_UPDATE] [date] NULL,
[LAST_UPDATED_BY] [nchar](20) NULL,
[CREATION_DATE] [date] NULL,
[CREATED_BY] [nchar](15) NULL,
[migrated_on] [datetime] NOT NULL,
PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
PRESTAGING.dbo.MTSUSA_TBL
CREATE TABLE [dbo].[MTSUSA_TBL](
[COMM] [nchar](10) NOT NULL,
[DESC_COMM] [nchar](50) NULL,
[UNIDAD] [nchar](3) NULL,
[id] [int] IDENTITY(1,1) NOT NULL,
[migrated_on] [datetime] NOT NULL
) ON [PRIMARY]
Let me know if there's anything else I need to provide.
With all those left outer joins, the query optimizer has to start with TRANSACTIONS_FROM_SOCRATA, so I would start with that. The only filtering is the NOT IN clause--would that cut down the 1MM rows to something more reasonable? If not, you're pretty much doomed to running at least one table scan (and possibly several) on the entire table.
If filtering on Commodity_Code would significantly cut things down, that can only be done efficiently if the column is indexed, so that SQL Server can find and read only those rows; otherwise you're back to a table scan. Similarly, an index on commodity_code in table commodities_copy would help as well, if that table is large.
As discussed in the comments, a NOT EXISTS check would be most efficient, written as a correlated subquery:
WHERE NOT EXISTS (select commodity_code
from STAGING.dbo.commodities_copy
where commodity_code = socrata.Commodity_Code)
(I'd want to do a lot of testing on this, checking and double-checking everything. Improving performance is tricky, doubly so when done through SO.)
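The indexes discussed in this answer could look like the following sketch. The index names are illustrative, and the database name PRE_STAGING is taken from the query in the question. Whether each index pays off depends on the data, so compare plans before and after.

```sql
-- Index names are illustrative; each batch runs in the database that owns the tables.
USE PRE_STAGING;
GO
CREATE NONCLUSTERED INDEX IX_TRANSACTIONS_FROM_SOCRATA_Commodity_Code
    ON dbo.TRANSACTIONS_FROM_SOCRATA (Commodity_Code);
-- Join columns on the legacy lookup tables.
CREATE NONCLUSTERED INDEX IX_MIOB_TBL_COMM     ON dbo.MIOB_TBL (COMM);
CREATE NONCLUSTERED INDEX IX_MSCHB_TBL_COMM    ON dbo.MSCHB_TBL (COMM);
CREATE NONCLUSTERED INDEX IX_MIOTSUSA_TBL_COMM ON dbo.MIOTSUSA_TBL (COMM);
CREATE NONCLUSTERED INDEX IX_MTSUSA_TBL_COMM   ON dbo.MTSUSA_TBL (COMM);
GO
USE STAGING;
GO
-- Supports the correlated NOT EXISTS check.
CREATE NONCLUSTERED INDEX IX_commodities_copy_commodity_code
    ON dbo.commodities_copy (commodity_code);
GO
```

Note that COMM is nchar(10) while Commodity_Code is varchar(50), so each join involves an implicit conversion of the varchar side. That may limit how well the index on Commodity_Code is used for the joins, even though the COMM indexes remain seekable.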
Try this,
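create table #socrata(Commodity_Code varchar(50),Commodity_Short_Name varchar(255),Commodity_description varchar(255),
unit_1 varchar(50),end_use_e varchar(50),end_use_i varchar(50),
unit_2_b varchar(50),unit_2_tsusa varchar(50),[year] varchar(255))
-- pre-aggregate once, keeping the socrata columns that the second query reads
insert into #socrata
SELECT
Commodity_Code,
Commodity_Short_Name,
Commodity_description,
unit_1,
end_use_e,
end_use_i,
MAX(socrata.unit_2) as unit_2_b,
MAX(socrata.unit_2) as unit_2_tsusa,
MAX(socrata.[year])
FROM PRE_STAGING.dbo.TRANSACTIONS_FROM_SOCRATA socrata
group by Commodity_Code, Commodity_Short_Name, Commodity_description, unit_1, end_use_e, end_use_i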
SELECT
--Distinct
Commodity_Code,
iif(miob2.DESC_COMM is null, UPPER(socrata.Commodity_Short_Name), miob2.DESC_COMM) as short_commmodity_description_b,
iif(mio2tsusa.DESC_COMM is null, UPPER(socrata.Commodity_Short_Name), mio2tsusa.DESC_COMM) as short_commmodity_description_tsusa,
socrata.Commodity_description as long_commodity_description_b,
socrata.Commodity_description as long_commodity_description_tsusa,
iif(miob2.UNIDAD is null, socrata.unit_1, miob2.UNIDAD) as unit_1_b,
iif(mio2tsusa.UNIDAD is null, socrata.unit_1, mio2tsusa.UNIDAD) as unit_1_tsusa,
unit_2_b,
unit_2_tsusa,
socrata.end_use_e as end_use_b,
socrata.end_use_i as end_use_tsusa,
[year],
'system' as created_by,
getdate() as created_on,
miob.TRIBUTA as taxable_b,
miotsusa.TRIBUTA as taxable_tsusa,
miob.NTRIBUTA as non_taxable_b,
miotsusa.NTRIBUTA as non_taxable_tsusa,
sicb.id as sic_id_b,
sictsusa.id as sic_id_tsusa,
SUBSTRING(Commodity_Code, 1, 2) as chapter,
SUBSTRING(Commodity_Code, 1, 4) as header,
SUBSTRING(Commodity_Code, 1, 6) as sub_header,
0 as needs_validation
FROM #socrata socrata
Left join PRE_STAGING.DBO.MIOB_TBL miob ON miob.COMM=socrata.Commodity_Code
Left join PRE_STAGING.dbo.MSCHB_TBL miob2 ON miob2.COMM=socrata.Commodity_Code
Left join PRE_STAGING.dbo.MIOTSUSA_TBL miotsusa ON miotsusa.COMM=socrata.Commodity_Code
Left join PRE_STAGING.dbo.MTSUSA_TBL mio2tsusa ON mio2tsusa.COMM=socrata.Commodity_Code
Left join STAGING.dbo.sics_altered sicb ON sicb.sic_code = miob.SIC
Left join STAGING.dbo.sics_altered sictsusa ON sictsusa.sic_code = miotsusa.SIC
WHERE NOT EXISTS
(Select commodity_code from STAGING.dbo.commodities_copy where commodity_code = socrata.Commodity_Code)
If reading uncommitted data is not a concern, you can use WITH (NOLOCK).
Also, your EXISTS clause was wrong and there is no need for DISTINCT. Check the rest of the changes.

Fastest way to record count using filter in SQL Server

I am using SQL Server version 2012. I have a table which has more than 10 million rows. I have to count records using a SQL filter.
My query is this:
select count(*)
from reconcil
where tenantid = 101
which takes more than 5 minutes for 5 million records.
Is there a faster way to count records?
Reconcil table structure is
CREATE TABLE [dbo].[RECONCIL]
(
[AckCode] [nvarchar](50) NULL,
[AckExpireTime] [int] NULL,
[AckFileName] [nvarchar](255) NULL,
[AckKey] [int] NULL,
[AckState] [int] NULL,
[AppMsgKey] [nvarchar](30) NULL,
[CurWrkActID] [nvarchar](50) NULL,
[Date_Time] [datetime] NULL,
[Direction] [nvarchar](1) NULL,
[ErrorCode] [nvarchar](50) NULL,
[FGLOGKEY] [int] NOT NULL,
[FolderID] [int] NULL,
[FuncGCtrlNo] [nvarchar](14) NULL,
[INLOGKEY] [int] NULL,
[InputFileName] [nvarchar](255) NULL,
[IntCtrlNo] [nvarchar](14) NULL,
[IsAssoDataPresent] [nvarchar](1) NULL,
[JobState] [int] NULL,
[LOGDATA] [nvarchar](max) NULL,
[MessageID] [nvarchar](25) NULL,
[MessageState] [int] NULL,
[MessageType] [int] NULL,
[NextWrkActID] [nvarchar](50) NULL,
[NextWrkHint] [nvarchar](20) NULL,
[NONFAERRORLOG] [nvarchar](max) NULL,
[NumberOfBytes] [int] NULL,
[NumberOfSegments] [int] NULL,
[OutputFileName] [nvarchar](255) NULL,
[Priority] [nvarchar](1) NULL,
[ReceiverID] [nvarchar](30) NULL,
[RecNo] [int] NULL,
[RecordID] [int] IDENTITY(1,1) NOT NULL,
[RelationKey] [int] NULL,
[SEGLOG] [nvarchar](max) NULL,
[SenderID] [nvarchar](30) NULL,
[ServerID] [nvarchar](255) NULL,
[Standard] [int] NULL,
[TenantID] [int] NULL,
[TPAgreementKey] [int] NULL,
[TSetCtrlNo] [nvarchar](35) NULL,
[UserKey1] [nvarchar](255) NULL,
[UserKey2] [nvarchar](255) NULL,
[UserKey3] [nvarchar](255) NULL,
CONSTRAINT [RECONCIL_PK]
PRIMARY KEY CLUSTERED ([RecordID] ASC)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Unless you materialize the count, a non-clustered index on TenantID will provide better performance, because it is narrower than the clustered primary key index and the query will scan only the matching rows:
CREATE INDEX idx ON [dbo].[RECONCIL](TenantID);
If performance of the aggregate query with this index isn't acceptable, you could create an indexed view with the count. The indexed view will provide the fastest performance for this query but will incur additional costs for storage and index maintenance for inserts and deletes. Also, queries that modify the table must have required SET options for indexed views. Those costs may be justified if the count query is executed often.
SQL Server can use the indexed view automatically in Enterprise (or Developer) editions even if not directly referenced in the query as long as the optimizer can match the semantics of the query using the view. In lesser editions, you'll need to query the indexed view directly and specify the NOEXPAND hint.
CREATE VIEW dbo.VW_RECONCIL_COUNT
WITH SCHEMABINDING
AS
SELECT
TenantID
, COUNT_BIG(*) AS TenentRowCount
FROM [dbo].[RECONCIL]
GROUP BY TenantID;
GO
CREATE UNIQUE CLUSTERED INDEX cdx ON dbo.VW_RECONCIL_COUNT(TenantID);
GO
--Enterprise Edition can use the view index automatically
SELECT COUNT_BIG(*) AS TenentRowCount
FROM [dbo].[RECONCIL]
WHERE TenantID = 101
GROUP BY TenantID;
GO
--other editions require the view to be specified plus the NOEXPAND hint
SELECT TenentRowCount
FROM dbo.VW_RECONCIL_COUNT WITH (NOEXPAND)
WHERE TenantID = 101;
GO
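The SET options mentioned above are the documented requirements for indexed views. As a sketch, a session that creates the view index or modifies dbo.RECONCIL afterwards would need these settings in effect. Many client libraries already default to most of them, so check the current session before changing anything.

```sql
-- Show the options currently in effect for this session.
DBCC USEROPTIONS;

-- Settings required for creating the indexed view and for
-- INSERT/UPDATE/DELETE against its base table.
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
```

A data modification issued with the wrong settings fails with an error instead of silently skipping the view, so a mismatch usually shows up quickly in testing.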
As already suggested, create an index, or even partition your table by TenantID if you have that many rows. Each partition can then be placed on its own filegroup and data file, which can improve performance.
select count(tenantid)
from reconcil
where tenantid = 101 group by tenantid ;
Not sure, but try using this.

SQL Server: extremely slow table on a simple select

I am working with SQL Server on Azure, and I found a rare case where queries against a single table are really slow: over 10 to 12 seconds for a table that has about a thousand rows, while similar tables respond in less than 1 second.
The table definition (create script) is:
CREATE TABLE [dbo].[Content]
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Content_ID] [int] NOT NULL,
[CultureCode] [nvarchar](50) NOT NULL,
[Version] [int] NOT NULL,
[UserID] [int] NOT NULL,
[Timestamp] [datetime] NOT NULL,
[Title] [nvarchar](max) NOT NULL,
[Subtitle] [nvarchar](max) NULL,
... 14 more [nvarchar](max) FIELDS...
[NotesPlainText] [nvarchar](max) NULL,
CONSTRAINT [PK_dbo.Content]
PRIMARY KEY CLUSTERED ([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON)
)
(The columns Content_ID, CultureCode and Version seem to be a unique combination, even though the table does not declare them as such. [ID] is just used as a row identifier.)
Aside from the 17 nvarchar(max) columns, nothing else looks unusual.
As I said, no unique constraints are defined and there are no indexes.
So I tuned it as follows:
CREATE TABLE [dbo].[Content_optimized]
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Content_ID] [int] NOT NULL,
[CultureCode] [nvarchar](50) NOT NULL,
[Version] [int] NOT NULL,
[UserID] [int] NOT NULL,
[Timestamp] [datetime] NOT NULL,
[Title] [nvarchar](max) NOT NULL,
[Subtitle] [nvarchar](max) NULL,
... 14 more [nvarchar](max) FIELDS...
[NotesPlainText] [nvarchar](max) NULL,
)
CREATE INDEX IDX_dbo_Content_optimized__Content_ID
ON [dbo].[Content_optimized]([Content_ID])
CREATE INDEX IDX_dbo_Content_optimized__CultureCode
ON [dbo].[Content_optimized]([CultureCode])
CREATE INDEX IDX_dbo_Content_optimized__Version
ON [dbo].[Content_optimized]([Version])
CREATE INDEX IDX_dbo_Content_optimized__UserID
ON [dbo].[Content_optimized]([UserID])
ALTER INDEX ALL ON [dbo].[Content_optimized] REBUILD WITH (FILLFACTOR=90, ONLINE=ON)
And here is where things get weird: I am not saving even a single second of execution time.
Indeed, code like this:
select *
from [Content]
where Content_ID <> 1049
order by Content_ID, Version
select *
from [Content_optimized]
where Content_ID <> 1049
order by Content_ID, Version
shows relative costs of 53% and 47% in the execution plan (just 11% faster because of the indexes).
I am relatively new to SQL optimisation, so there may be something obvious that I am not seeing here, but right now I am lost.
Any help?

Violation of Primary Key error on Identity column

This is maddening! Code in question has been running for over 5 years.
Here's the scoop....
I am doing an INSERT...SELECT into a table with a primary key that is an identity column. I do not specify the key when I insert - SQL Server generates it as expected.
I am doing the insert in a stored procedure that I call in a loop (for loop in SSIS, actually). The stored procedure will insert rows in batches (configurable). It might insert 1000 rows at a time or it might insert 50,000 - doesn't matter. It will work for a random number of calls (inserting thousands of rows) and then it will fail, out of the blue, with a
Violation of primary key / duplicate
error. If I check the identity seed - it is correct. If I kick off the process again it will work fine, for a while.
The values being inserted are coming from 2 tables that I join together, as if that matters.
The bulk of my code is below:
WHILE @pk <= @max_pk
BEGIN
INSERT INTO tbl_claim_line (fk_batch_control_group, fk_claim, fk_provider, service_from_date, service_to_date, allowed, net_paid, COB, flex_1, flex_2, flex_3, flex_4)
SELECT
@fk_batch_control_group
, c.pk_claim
, p.pk_provider
, i.date_of_service_from
, i.date_of_service_to
, i.allowed_amount
, i.net_paid_amount
, i.cob_amount
, i.claimline_flex_1
, i.claimline_flex_2
, i.claimline_flex_3
, i.claimline_flex_4
FROM
tbl_import i
INNER JOIN
tbl_import__claim c ON i.claim_number = c.claim_number
LEFT JOIN
tbl_import__provider p ON ISNULL(i.provider_type,'') = ISNULL(p.provider_type,'')
AND ISNULL(i.provider_specialty,'') = ISNULL(p.provider_specialty,'')
AND ISNULL(i.provider_zip_code,'') = ISNULL(p.provider_zip_code,'')
WHERE
pk_import = @pk
UPDATE tbl_import
SET fk_claim_line = SCOPE_IDENTITY()
WHERE pk_import = @pk
SET @pk += 1
END
--TABLE DEFINITIONS...
CREATE TABLE [dbo].[tbl_claim_line](
[fk_batch_control_group] [int] NOT NULL,
[fk_claim] [int] NOT NULL,
[fk_provider] [int] NULL,
[service_from_date] [date] NULL,
[service_to_date] [date] NULL,
[allowed] [money] NULL,
[net_paid] [money] NULL,
[COB] [money] NULL,
[flex_1] [varchar](200) NULL,
[flex_2] [varchar](200) NULL,
[flex_3] [varchar](200) NULL,
[flex_4] [varchar](200) NULL,
[pk_claim_line] [int] IDENTITY(1,1) NOT NULL,
[insert_date] [datetime] NOT NULL,
CONSTRAINT [PK_tbl_claim_line] PRIMARY KEY NONCLUSTERED
(
[pk_claim_line] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[tbl_claim_line] WITH CHECK
ADD CONSTRAINT [FK_tbl_claim_line_tbl_batch_control_group]
FOREIGN KEY([fk_batch_control_group])
REFERENCES [dbo].[tbl_batch_control_group] ([pk_batch_control_group])
GO
ALTER TABLE [dbo].[tbl_claim_line] CHECK CONSTRAINT [FK_tbl_claim_line_tbl_batch_control_group]
GO
ALTER TABLE [dbo].[tbl_claim_line] WITH CHECK
ADD CONSTRAINT [FK_tbl_claim_line_tbl_claim]
FOREIGN KEY([fk_claim])
REFERENCES [dbo].[tbl_claim] ([pk_claim])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[tbl_claim_line] CHECK CONSTRAINT [FK_tbl_claim_line_tbl_claim]
GO
ALTER TABLE [dbo].[tbl_claim_line] WITH CHECK
ADD CONSTRAINT [FK_tbl_claim_line_tbl_provider]
FOREIGN KEY([fk_provider])
REFERENCES [dbo].[tbl_provider] ([pk_provider])
GO
ALTER TABLE [dbo].[tbl_claim_line] CHECK CONSTRAINT [FK_tbl_claim_line_tbl_provider]
GO
ALTER TABLE [dbo].[tbl_claim_line] ADD CONSTRAINT [DF_tbl_claim_line__insert_date] DEFAULT (getdate()) FOR [insert_date]
GO
----second table
CREATE TABLE [dbo].[tbl_import](
[fk_claim_line] [int] NULL,
[member_id] [varchar](50) NULL,
[member_card_id] [varchar](50) NULL,
[member_first_name] [varchar](50) NULL,
[member_last_name] [varchar](50) NULL,
[member_dob] [varchar](50) NULL,
[member_gender] [varchar](50) NULL,
[member_subscriber_relationship_code] [varchar](50) NULL,
[member_address_line_1] [varchar](100) NULL,
[member_address_line_2] [varchar](100) NULL,
[member_city] [varchar](50) NULL,
[member_state] [varchar](50) NULL,
[member_zip] [varchar](50) NULL,
[member_phone] [varchar](50) NULL,
[member_email] [varchar](50) NULL,
[subscriber_id] [varchar](50) NULL,
[group_line_of_business] [varchar](50) NULL,
[group_product] [varchar](50) NULL,
[group_employer] [varchar](50) NULL,
[provider_first_name] [varchar](50) NULL,
[provider_last_or_full_name] [varchar](200) NULL,
[provider_type] [varchar](200) NULL,
[provider_specialty] [varchar](400) NULL,
[provider_zip_code] [varchar](50) NULL,
[provider_tax_id] [varchar](50) NULL,
[medical_code_1] [varchar](10) NULL,
[medical_code_1_description] [varchar](500) NULL,
[medical_code_2] [varchar](10) NULL,
[medical_code_2_description] [varchar](500) NULL,
[medical_code_3] [varchar](10) NULL,
[medical_code_3_description] [varchar](500) NULL,
[medical_code_4] [varchar](10) NULL,
[medical_code_4_description] [varchar](500) NULL,
[medical_code_5] [varchar](10) NULL,
[medical_code_5_description] [varchar](500) NULL,
[medical_code_6] [varchar](10) NULL,
[medical_code_6_description] [varchar](500) NULL,
[medical_code_7] [varchar](10) NULL,
[medical_code_7_description] [varchar](500) NULL,
[medical_code_8] [varchar](10) NULL,
[medical_code_8_description] [varchar](500) NULL,
[medical_code_9] [varchar](10) NULL,
[medical_code_9_description] [varchar](500) NULL,
[medical_code_10] [varchar](10) NULL,
[medical_code_10_description] [varchar](500) NULL,
[medical_code_11] [varchar](10) NULL,
[medical_code_11_description] [varchar](500) NULL,
[medical_code_12] [varchar](10) NULL,
[medical_code_12_description] [varchar](500) NULL,
[medical_code_13] [varchar](10) NULL,
[medical_code_13_description] [varchar](500) NULL,
[medical_code_14] [varchar](10) NULL,
[medical_code_14_description] [varchar](500) NULL,
[medical_code_15] [varchar](10) NULL,
[medical_code_15_description] [varchar](500) NULL,
[medical_code_16] [varchar](10) NULL,
[medical_code_16_description] [varchar](500) NULL,
[date_of_service_from] [varchar](50) NULL,
[date_of_service_to] [varchar](50) NULL,
[claim_number] [varchar](50) NULL,
[claim_line_number] [varchar](50) NULL,
[original_claim_number] [varchar](50) NULL,
[allowed_amount] [varchar](50) NULL,
[net_paid_amount] [varchar](50) NULL,
[cob_amount] [varchar](50) NULL,
[date_paid] [varchar](50) NULL,
[member_flex_1] [varchar](200) NULL,
[member_flex_2] [varchar](200) NULL,
[member_flex_3] [varchar](200) NULL,
[member_flex_4] [varchar](200) NULL,
[claim_flex_1] [varchar](200) NULL,
[claim_flex_2] [varchar](200) NULL,
[claim_flex_3] [varchar](200) NULL,
[claim_flex_4] [varchar](200) NULL,
[claimline_flex_1] [varchar](200) NULL,
[claimline_flex_2] [varchar](200) NULL,
[claimline_flex_3] [varchar](200) NULL,
[claimline_flex_4] [varchar](200) NULL,
[pk_import] [int] IDENTITY(1,1) NOT NULL,
CONSTRAINT [PK_tbl_import] PRIMARY KEY NONCLUSTERED
(
[pk_import] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
I ran into this and much like user3170349, it was a seed issue on the column. I'm adding some additional info, however.
First, you can run this to figure out if you have a seed problem:
DBCC CHECKIDENT ('TABLE_NAME_GOES_HERE', NORESEED);
This will give you information which will read something like this:
Checking identity information: current identity value 'XXXX', current column value 'YYYY'.
If YYYY is larger than XXXX, then you have a problem and need to RESEED the table to get things going again. You can do so with the following command:
DBCC CHECKIDENT ('TABLE_NAME_GOES_HERE', RESEED, ZZZZZ);
Where ZZZZZ is the reseed value. That value should be at least one higher than YYYY. YMMV, so pick a value that is appropriate for your situation.
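The same comparison can be made with a plain query, which is handy for logging from the SSIS loop. This sketch assumes the dbo.tbl_claim_line table and pk_claim_line column from the question.

```sql
-- If max_key_in_table is greater than current_identity,
-- the next generated value can collide with an existing key.
SELECT IDENT_CURRENT(N'dbo.tbl_claim_line') AS current_identity,
       (SELECT MAX(pk_claim_line) FROM dbo.tbl_claim_line) AS max_key_in_table;
```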
"Code in question has been running for over 5 years."
"It might insert 1000 records at a time or it might insert 50,000 "
Is it possible you have finally overflowed the integer type of the primary key?
Did it wrap around and is now starting over? That would cause duplicate primary keys.
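A quick way to rule this out, sketched against the dbo.tbl_claim_line table from the question, is to compare the current identity value with the int maximum. As far as I know, SQL Server raises an arithmetic overflow error when an int identity is exhausted rather than wrapping around, so a value far below the limit would point to a different cause.

```sql
-- An int identity tops out at 2,147,483,647.
SELECT IDENT_CURRENT(N'dbo.tbl_claim_line') AS current_identity,
       2147483647 - IDENT_CURRENT(N'dbo.tbl_claim_line') AS values_remaining;
```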
