Vague title I know.
I have, at the moment, 16,000 rows in my database. This data was created during development, and I now want to delete all of these rows so I can start again without duplicate data.
The database is on SQL Azure.
If I run a select query
SELECT [Guid]
,[IssueNumber]
,[Severity]
,[PainIndex]
,[Status]
,[Month]
,[Year]
,[DateCreated]
,[Region]
,[IncidentStart]
,[IncidentEnd]
,[SRCount]
,[AggravatingFactors]
,[AggravatingFactorDescription]
FROM [dbo].[WeeklyGSFEntity]
GO
This returns all the rows, and SSMS says this takes 49 seconds.
If I attempt to drop the table, this goes on for 5 minutes plus.
DROP TABLE [dbo].[WeeklyGSFEntity]
GO
/****** Object: Table [dbo].[WeeklyGSFEntity] Script Date: 10/01/2013 09:46:18 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[WeeklyGSFEntity](
[Guid] [uniqueidentifier] NOT NULL,
[IssueNumber] [int] NULL,
[Severity] [int] NULL,
[PainIndex] [nchar](1) NULL,
[Status] [nvarchar](255) NULL,
[Month] [int] NULL,
[Year] [int] NULL,
[DateCreated] [datetime] NULL,
[Region] [nvarchar](255) NULL,
[IncidentStart] [datetime] NULL,
[IncidentEnd] [datetime] NULL,
[SRCount] [int] NULL,
[AggravatingFactors] [nvarchar](255) NULL,
[AggravatingFactorDescription] [nvarchar](max) NULL,
PRIMARY KEY CLUSTERED
(
[Guid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
GO
If I attempt to delete the rows, this also takes 5 minutes plus.
DELETE
FROM [dbo].[WeeklyGSFEntity]
GO
Am I doing something wrong or is it just that this is big data and I'm being impatient?
UPDATE:
Dropping the entire database took some 25 seconds.
Importing 22,000 rows (roughly the same 16,000 plus more) into localdb\v11.0 took 6 seconds. I know this is local but surely the local dev server is slower than Azure? Surely...
UPDATE the second:
Recreating the database and recreating the schema (with (Fluent) NHibernate), and then inserting some 20,000 rows took 2 minutes 6 seconds. All Unit Tests pass.
Is there anything I can do to look into this?
Dropping and recreating the database sped things up considerably.
The reason for this is unknown.
Possibly there is an open transaction causing a lock on the table. This can be caused by cancelling an operation halfway through, as we all do during development.
Run sp_who2 and see which SPID appears in the BlkBy column. If there is one, that's it.
To kill that process, run KILL followed by that SPID.
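For example, a minimal sketch of that check (the SPID 64 below is just a placeholder for whatever shows up in BlkBy; on Azure SQL Database you may need to query sys.dm_exec_requests instead if sp_who2 is not available):
-- List current sessions; look at the BlkBy column for a blocking SPID
EXEC sp_who2
GO
-- If, say, SPID 64 is the blocker, kill it (this rolls back its open transaction)
KILL 64
GO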
Related
I have a SQL Server table linked in an Access app. If I try to delete records with a delete query there is no problem. But if I try to delete records directly in the table, or through a select query in datasheet mode, Access doesn't let me delete the records and throws the following warning:
"The Microsoft Access database engine stopped the process because you and another user are attempting to change the same data at the same time."
The same happens when I try to update data. There is no other user modifying the data.
The problem is that we still have a lot of legacy forms that use datasheet mode to alter or delete records instead of using queries, and for now changing all these forms is unthinkable.
Does anyone have any idea what could be happening?
Thanks!
FINAL EDIT:
The problem was a bit field that was set to nullable, which, thanks to Kostas K., I discovered does not convert properly for Access.
So, instead of this:
[FIELD] [bit] NULL
We need this:
[FIELD] [bit] NOT NULL
ALTER TABLE [dbo].[TABLE] ADD DEFAULT ((0)) FOR [FIELD]
GO
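For an existing table the full conversion might look roughly like this (just a sketch; [dbo].[TABLE] and [FIELD] are the placeholder names from above, and the constraint name is made up):
-- Backfill existing NULLs so the column can be made NOT NULL
UPDATE [dbo].[TABLE] SET [FIELD] = 0 WHERE [FIELD] IS NULL
GO
ALTER TABLE [dbo].[TABLE] ALTER COLUMN [FIELD] [bit] NOT NULL
GO
-- Same default as above, but named so it is easier to drop later
ALTER TABLE [dbo].[TABLE] ADD CONSTRAINT [DF_TABLE_FIELD] DEFAULT ((0)) FOR [FIELD]
GO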
UPDATE: This locking only happens with new records added from Access, but not with the original records of the SQL table.
This is the script to create the table:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [Chapas].[INFO_CHAPAS](
[ID_INFO_CHAPA] [int] IDENTITY(1,1) NOT NULL,
[COD_EQUIPO] [int] NULL,
[EQUIPO] [nvarchar](255) NULL,
[NUMERO_SERIE] [nvarchar](255) NULL,
[FASES] [nvarchar](255) NULL,
[VOLTAJE] [nvarchar](255) NULL,
[FRECUENCIA] [nvarchar](255) NULL,
[POTENCIA] [nvarchar](255) NULL,
[AÑO] [int] NULL,
[IMPRESO] [bit] NULL,
[SELECTOR_REGISTRO] [bit] NULL,
[USUARIO] [int] NULL,
[FECHA_IMPRESION] [datetime] NULL,
CONSTRAINT [INFO_CHAPAS_PK] PRIMARY KEY NONCLUSTERED
(
[ID_INFO_CHAPA] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [Chapas].[INFO_CHAPAS] ADD DEFAULT ((0)) FOR [IMPRESO]
GO
Weird situation here. I'm inserting a row in a table with a primary key with IDENTITY (1,1), but the value that it uses is waaay wrong. This is the table:
CREATE TABLE [dbo].[adm_tm_modulo](
[id_modulo] [int] IDENTITY(1,1) NOT NULL,
[codigo] [varchar](255) NULL,
[descripcion] [varchar](255) NULL,
[estado] [int] NOT NULL,
[fecha_actualizacion] [datetime2](7) NULL,
[fecha_creacion] [datetime2](7) NULL,
[id_usuario_actualizacion] [int] NOT NULL,
[id_usuario_creacion] [int] NOT NULL,
[nombre] [varchar](255) NULL,
[ruta] [varchar](255) NULL,
PRIMARY KEY CLUSTERED
(
[id_modulo] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
Right now, it has 4 rows.
But whenever I insert a new row, the primary key ends up as if the IDENTITY value had started at 1000: after I execute the INSERT, the new row's id_modulo lands just past 1000.
I am certain that I have never inserted that many rows before, nor has anyone else, as this is a private DB on my own PC. Also, adding up all the rows in all the tables, there are around 400 (not even close to 1000). I tried inserting into other tables and the same thing happens, only that in some tables it inserts a value from 3001 forward, or 4001, etc. It always starts with the first number after a thousand.
Any help about why this is happening would be very appreciated.
You may want to use a SEQUENCE object, which gives you more control over how values are generated.
https://learn.microsoft.com/en-us/sql/t-sql/statements/create-sequence-transact-sql?view=sql-server-ver15
Another option (which I prefer) is to create an INSERT trigger and define the logic you want in it.
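For what it's worth, a minimal SEQUENCE sketch might look like this (the object names are made up, not taken from the question):
CREATE SEQUENCE [dbo].[seq_modulo] AS [int]
START WITH 1
INCREMENT BY 1
NO CACHE -- trades a little insert speed for no cached-range gaps after restarts
GO
CREATE TABLE [dbo].[adm_tm_modulo_v2](
[id_modulo] [int] NOT NULL CONSTRAINT [DF_modulo_id] DEFAULT (NEXT VALUE FOR [dbo].[seq_modulo]),
[codigo] [varchar](255) NULL,
CONSTRAINT [PK_adm_tm_modulo_v2] PRIMARY KEY CLUSTERED ([id_modulo] ASC)
)
GO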
I have a complex problem with SQL Server.
I administer 40 databases with identical structure but different data. Those database sizes vary from 2 MB to 10 GB of data. The main table for these databases is:
CREATE TABLE [dbo].[Eventos](
[ID_Evento] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[FechaGPS] [datetime] NOT NULL,
[FechaRecepcion] [datetime] NOT NULL,
[CodigoUnico] [varchar](30) COLLATE Modern_Spanish_CI_AS NULL,
[ID_Movil] [int] NULL,
[CodigoEvento] [char](5) COLLATE Modern_Spanish_CI_AS NULL,
[EventoData] [varchar](150) COLLATE Modern_Spanish_CI_AS NULL,
[EventoAlarma] [bit] NOT NULL CONSTRAINT [DF_Table_1_Alarma] DEFAULT ((0)),
[Ack] [bit] NOT NULL CONSTRAINT [DF_Eventos_Ack] DEFAULT ((0)),
[Procesado] [bit] NOT NULL CONSTRAINT [DF_Eventos_Procesado] DEFAULT ((0)),
[Latitud] [float] NULL,
[Longitud] [float] NULL,
[Velocidad] [float] NULL,
[Rumbo] [smallint] NULL,
[Satelites] [tinyint] NULL,
[EventoCerca] [bit] NOT NULL CONSTRAINT [DF_Eventos_FueraCerca] DEFAULT ((0)),
[ID_CercaElectronica] [int] NULL,
[Direccion] [varchar](250) COLLATE Modern_Spanish_CI_AS NULL,
[Localidad] [varchar](150) COLLATE Modern_Spanish_CI_AS NULL,
[Provincia] [varchar](100) COLLATE Modern_Spanish_CI_AS NULL,
[Pais] [varchar](50) COLLATE Modern_Spanish_CI_AS NULL,
[EstadoEntradas] [char](16) COLLATE Modern_Spanish_CI_AS NULL,
[DentroFuera] [char](1) COLLATE Modern_Spanish_CI_AS NULL,
[Enviado] [bit] NOT NULL CONSTRAINT [DF_Eventos_Enviado] DEFAULT ((0)),
[SeñalGSM] [int] NOT NULL DEFAULT ((0)),
[GeoCode] [bit] NOT NULL CONSTRAINT [DF_Eventos_GeoCode] DEFAULT ((0)),
[Contacto] [bit] NOT NULL CONSTRAINT [DF_Eventos_Contacto] DEFAULT ((0)),
CONSTRAINT [PK_Eventos] PRIMARY KEY CLUSTERED
(
[ID_Evento] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
USE [ABS]
GO
ALTER TABLE [dbo].[Eventos] WITH CHECK ADD CONSTRAINT [FK_Eventos_Eventos] FOREIGN KEY([ID_Evento])
REFERENCES [dbo].[Eventos] ([ID_Evento])
I also have a process that runs every n seconds to pick up these records (only the new ones) and mark them as processed. It uses this query:
SELECT
Tbl.ID_Cliente, Ev.ID_Evento, Tbl.ID_Movil, Ev.EventoData, Tbl.Evento,
Tbl.ID_CercaElectronica, Ev.Latitud, Ev.Longitud, Tbl.EsAlarma, Ev.FechaGPS,
Tbl.AlarmaVelocidad, Ev.Velocidad, Ev.CodigoEvento
FROM
dbo.Eventos AS Ev
INNER JOIN
(SELECT
Det.CodigoEvento, Mov.CodigoUnico, Mov.ID_Cliente, Mov.ID_Movil, Det.Evento,
Mov.ID_CercaElectronica, Det.EsAlarma, Mov.AlarmaVelocidad
FROM
dbo.Moviles Mov
INNER JOIN
dbo.GruposEventos AS GE
INNER JOIN
dbo.GruposEventosDet AS Det ON Det.ID_GrupoEventos = GE.ID_GrupoEventos
ON GE.ID_GrupoEventos = Mov.ID_GrupoEventos) as Tbl ON EV.CodigoUnico = Tbl.CodigoUnico AND Ev.CodigoEvento = Tbl.CodigoEvento
WHERE
(Ev.Procesado = 0)
In some databases the table can have more than 1,000,000 records. So, to optimize the process, I created this index specifically for this query, using a SQL optimization assistant:
CREATE NONCLUSTERED INDEX [OptimizadorProcesarEventos] ON [dbo].[Eventos]
(
[Procesado] ASC,
[CodigoEvento] ASC,
[CodigoUnico] ASC,
[FechaGPS] ASC
)
INCLUDE ( [ID_Evento],
[EventoData],
[Latitud],
[Longitud],
[Velocidad]) WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
This used to work perfectly. But now, occasionally and only in some databases, the query takes forever and times out. So I looked at the execution plan and realized that in some scenarios, depending on the data in the table, SQL Server decides not to use my index and uses the PK index instead. I verified this by checking the execution plan on another database that works fine, and there the index is being used.
So my question: why does SQL Server on some occasions decide not to use my index?
Thank you for your interest!
UPDATE
I already tried UPDATE STATISTICS and it didn't help. I prefer to avoid index hints for now, so the question remains: why does SQL Server choose a less efficient way to execute my query when it has an index for it?
UPDATE II
After many tests, I finally resolved the problem, even though I don't quite understand why this worked. I changed the index to this:
CREATE NONCLUSTERED INDEX [OptimizadorProcesarEventos] ON [dbo].[Eventos]
(
[CodigoUnico] ASC,
[CodigoEvento] ASC,
[Procesado] ASC,
[FechaGPS] ASC
)
INCLUDE ( [ID_Evento],
[EventoData],
[Latitud],
[Longitud],
[Velocidad]) WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
Basically, I changed the order of the fields in the index and the query immediately started using the index as expected. It's still a mystery to me how SQL Server chooses whether or not to use an index for a specific query. Thanks to everyone.
You must have found a lot of articles on how the query optimizer chooses the right index; if not, search around on Google.
I can point out one to start with.
Index Selection and the Query Optimizer
The simple answer is as follows:
"Based on the index usage history, statistics, number of rows inserted/updated/deleted etc.... Query optimizer has find out that using the PK index is less costly than using the other Non Clustered index."
now you will have lot of questions around how did Query Optimizer finds that out? and that will require some home work.
though in your specific situation, I am not agree with "Femi" as mentioned to try and running "Update Statistics" because there are some other situations as well where Update Statistics will also not help.
It sound like you have tested this Index on this query and if you are sure that you want only this index to be used 100% of time by that query, use the query hint and specify this index needs to be used. by that way you can always sure that this index will be used.
CAUTION: you must have done more than enough testing on various data loads to make sure in no case using this index is not expected or not acceptable. Once you use the Query hints every execution will use that only and Optimizer will always come up with execution plan using that Index.
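If you do go the hint route, a sketch of forcing the index looks like this (shown here on a simplified version of the query; in the real query the hint goes on the dbo.Eventos table reference):
SELECT Ev.ID_Evento, Ev.EventoData, Ev.Latitud, Ev.Longitud, Ev.Velocidad, Ev.CodigoEvento
FROM dbo.Eventos AS Ev WITH (INDEX([OptimizadorProcesarEventos])) -- force the nonclustered index
WHERE Ev.Procesado = 0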
It's difficult to tell in this specific case, but very often the query planner will look at the statistics it has for the table and decide to use the wrong index (for some definition of wrong; probably just not the index you think it should use). Try running UPDATE STATISTICS on the table and see if the query planner arrives at a different set of decisions.
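Something along these lines, using the table name from the question:
-- Rebuild statistics for the Eventos table with a full scan
UPDATE STATISTICS dbo.Eventos WITH FULLSCAN
GO
-- Or refresh statistics for every table in the database
EXEC sp_updatestats
GO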
Determining why the optimizer does or doesn't choose a given index can be somewhat of a dark art. I do notice, however, that there's likely a better index that you could be using. Specifically:
CREATE NONCLUSTERED INDEX [OptimizadorProcesarEventos] ON [dbo].[Eventos]
(
[Procesado] ASC,
[CodigoEvento] ASC,
[CodigoUnico] ASC,
[FechaGPS] ASC
)
INCLUDE ( [ID_Evento],
[EventoData],
[Latitud],
[Longitud],
[Velocidad])
WHERE Procesado = 0 -- this makes it a filtered index
WITH (SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF,
IGNORE_DUP_KEY = OFF,
ONLINE = OFF)
ON [PRIMARY]
This rests on my assumption that at any given time most of the rows in your table are already processed (i.e. Procesado = 1), so the above index would be much smaller than the non-filtered version.
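A quick way to sanity-check that assumption before creating the filtered index (just a rough count, not part of the fix itself):
SELECT Procesado, COUNT(*) AS Cnt
FROM dbo.Eventos
GROUP BY Procesado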
I'm currently dealing with performance/memory consumption optimizations of our application. One of the tasks is to replace all blobs in a table that correspond to empty arrays with NULL values; this should reduce database size and memory consumption, and speed up loading. Here is the table definition:
CREATE TABLE [dbo].[SampleTable](
[id] [bigint] NOT NULL,
[creationTime] [datetime] NULL,
[binaryData] [image] NULL,
[isEvent] [bit] NULL,
[lastSavedTime] [datetime] NULL,
CONSTRAINT [PK_SampleTable] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
I updated the table and replaced image field (binaryData) values with NULL where appropriate (data corresponding to empty arrays in the application). Now I observe a performance deterioration when running a trivial SELECT * FROM SampleTable.
Originally, the fields that were updated had a length of 512 bytes; not sure if that matters, though.
Any ideas why selecting blobs containing NULL values takes longer than selecting real binary data even if the data is the same for different rows?
I don't know the answer to this question. I tried the following test though and got a result that I found surprising.
CREATE TABLE [dbo].[SampleTable](
[id] [BIGINT] NOT NULL,
[creationTime] [DATETIME] NULL,
[binaryData] [IMAGE] NULL,
[isEvent] [BIT] NULL,
[lastSavedTime] [DATETIME] NULL,
CONSTRAINT [PK_SampleTable] PRIMARY KEY CLUSTERED
(
[id] ASC
)
)
INSERT INTO [dbo].[SampleTable]
SELECT 1, GETDATE(),
0x1111,
1, GETDATE()
INSERT INTO [dbo].[SampleTable]
SELECT 2, GETDATE(),
0x2222,
2, GETDATE()
INSERT INTO [dbo].[SampleTable]
SELECT 3, GETDATE(),
NULL,
3, GETDATE()
UPDATE [dbo].[SampleTable] SET [binaryData] = NULL
WHERE [id]=2
Looking at this in SQL Internals Viewer I was surprised to see a difference between the row I inserted as NULL and the one I updated to NULL.
It looks as though, even when the value is updated to NULL, for some reason it doesn't just set the NULL bitmap, and it still needs to follow a pointer to another LOB_DATA page.
Inserted as NULL: http://img809.imageshack.us/img809/9301/row3.png
Updated to NULL: http://img84.imageshack.us/img84/420/row2.png
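If you don't have SQL Internals Viewer handy, a rough way to see a related effect is to check how many LOB_DATA pages the table still has allocated after the update (this shows allocation counts only, not per-row pointers):
SELECT alloc_unit_type_desc, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.SampleTable'), NULL, NULL, 'DETAILED')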
Let me help you restate this:
On one side you have SQL Server doing a table scan, testing every.single.record. for a null value, versus the other where you have SQL Server doing a massive dump of ALL the records...
If your blobs are relatively small, then it's pretty obvious which one would be faster...
I have a 'SessionVisit' table which collects data about user visits.
The table CREATE statement is below. There may be 25,000 rows added a day.
My database knowledge is definitely not up to scratch as far as understanding the implications of such a schema goes.
Can anyone give me their 2c of advice on some of these issues:
Do I need to worry about ROWSIZE for this schema in SQL Server 2008? I'm not even sure how the 8 KB row size works in 2008, or whether I'm wasting a lot of space if I'm not using all 8 KB.
How should I purge old records I don't want? Will new rows fill in the empty space left by deleted rows?
Any advice on indexes
I know this is quite general in nature. Any 'obvious' or non-obvious info would be appreciated.
Here's the table :
USE [MyDatabase]
GO
/****** Object: Table [dbo].[SessionVisit] Script Date: 06/06/2009 16:55:05 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[SessionVisit](
[SessionGUID] [uniqueidentifier] NOT NULL,
[SessionVisitId] [int] IDENTITY(1,1) NOT NULL,
[timestamp] [timestamp] NOT NULL,
[SessionDate] [datetime] NOT NULL CONSTRAINT [DF_SessionVisit_SessionDate] DEFAULT (getdate()),
[UserGUID] [uniqueidentifier] NOT NULL,
[CumulativeVisitCount] [int] NOT NULL CONSTRAINT [DF_SessionVisit_CumulativeVisitCount] DEFAULT ((0)),
[SiteUserId] [int] NULL,
[FullEntryURL] [varchar](255) NULL,
[SiteCanonicalURL] [varchar](100) NULL,
[StoreCanonicalURL] [varchar](100) NULL,
[CampaignId] [int] NULL,
[CampaignKey] [varchar](50) NULL,
[AdKeyword] [varchar](50) NULL,
[PartnerABVersion] [varchar](10) NULL,
[ABVersion] [varchar](10) NULL,
[UserAgent] [varchar](255) NULL,
[Referer] [varchar](255) NULL,
[KnownRefererId] [int] NULL,
[HostAddress] [varchar](20) NULL,
[HostName] [varchar](100) NULL,
[Language] [varchar](50) NULL,
[SessionLog] [xml] NULL,
[OrderDate] [datetime] NULL,
[OrderId] [varchar](50) NULL,
[utmcc] [varchar](1024) NULL,
[TestSession] [bit] NOT NULL CONSTRAINT [DF_SessionVisit_TestSession] DEFAULT ((0)),
[Bot] [bit] NULL,
CONSTRAINT [PK_SessionVisit] PRIMARY KEY CLUSTERED
(
[SessionGUID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[SessionVisit] WITH CHECK ADD CONSTRAINT [FK_SessionVisit_KnownReferer] FOREIGN KEY([KnownRefererId])
REFERENCES [dbo].[KnownReferer] ([KnownRefererId])
GO
ALTER TABLE [dbo].[SessionVisit] CHECK CONSTRAINT [FK_SessionVisit_KnownReferer]
GO
ALTER TABLE [dbo].[SessionVisit] WITH CHECK ADD CONSTRAINT [FK_SessionVisit_SiteUser] FOREIGN KEY([SiteUserId])
REFERENCES [dbo].[SiteUser] ([SiteUserId])
GO
ALTER TABLE [dbo].[SessionVisit] CHECK CONSTRAINT [FK_SessionVisit_SiteUser]
I see SessionGUID and SessionVisitId, why have both a uniqueidentifier and an Identity(1,1) on the same table? Seems redundant to me.
I see referer and knownrefererid, think about getting the referer from the knownrefererid if possible. This will help reduce excess writes.
I see campaignkey and campaignid; again, get it from the campaigns table if possible.
I see orderid and orderdate. I'm sure you can get the order date from the orders table, correct?
I see hostaddress and hostname, do you really need the name? Usually the hostname doesn't serve much purpose and can be easily misleading.
I see multiple dates and timestamps, is any of this duplicate?
How about that SessionLog column? I see that it's XML. Is it a lot of data, is it data you may already have in other columns? If so get rid of the XML or the duplicated columns. Using SQL 2008 you can parse data out of that XML column when reporting and possibly eliminate a few extra columns (thus writes). Are you going to be in trouble in the future when developers add more to that XML? XML to me just screams 'a lot of excessive writing'.
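As an illustration of that last point, pulling a value out of SessionLog at report time could look something like this (the /Visit/Page/@Url path is entirely made up, since the real shape of the XML isn't shown):
SELECT SessionVisitId,
SessionLog.value('(/Visit/Page/@Url)[1]', 'varchar(255)') AS FirstPageUrl
FROM dbo.SessionVisit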
Mitch says to remove the primary key. Personally I would leave the index on the table. Since it is clustered that will help speed up write times as the DB will always write new rows at the end of the table on the disk.
Strip out some of this duplicate information and you'll probably do just fine writing a row each visit.
Well, I'd recommend NOT inserting a few k of data with EVERY page!
First thing I'd do would be to see how much of this information I could get from a 3rd party analytics tool, perhaps combined with log analysis. That should allow you to drop a lot of the fields.
25k inserts a day isn't much, but the catch here is that the busier your site gets, the more load this is going to put on the db. Perhaps you could build a queuing system that batches the writes, but really, most of this information is already in the logs.
Agree with Chris that you would probably be better off using log analysis (check out Microsoft's free Log Parser).
Failing that, I would remove the Foreign Key constraints from your SessionVisit table.
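A sketch of that, using the constraint names from the script above:
ALTER TABLE [dbo].[SessionVisit] DROP CONSTRAINT [FK_SessionVisit_KnownReferer]
GO
ALTER TABLE [dbo].[SessionVisit] DROP CONSTRAINT [FK_SessionVisit_SiteUser]
GO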
You mentioned row size; the varchars in your table do not pre-allocate to their maximum length (approximately 4 + 4 bytes for an empty field). That said, a general rule is to keep rows as 'lean' as possible.
Also, I would remove the primary key from the SessionGUID (GUID) column. It won't help you much.
That's also an awful lot of nulls in that table. I think you should group together the columns that must be non-null at the same time. In fact, you should do a better analysis of the data you're writing, rather than lumping it all together in a single table.