Why are my Azure SQL Database indexes still fragmented? - sql-server

My company has committed the sin of using GUIDs as Primary Keys on our Azure SQL Database tables (it is actually worse than that: we used VARCHAR(36) instead of UNIQUEIDENTIFIER). As such, we end up with fragmented indexes. They looked like this:
CREATE TABLE OldTable (
Id VARCHAR(36) PRIMARY KEY CLUSTERED NOT NULL DEFAULT NEWID(),
CreateTime DATETIME2 NOT NULL,
...
)
I "fixed" the problem by creating new tables. This time, I used an immutable, ever-increasing DATETIME2 column (e.g. CreateTime) for the CLUSTERED INDEX, and kept the VARCHAR(36) as PRIMARY KEY, but this time NONCLUSTERED. Like this:
CREATE TABLE NewTable (
Id VARCHAR(36) PRIMARY KEY NONCLUSTERED NOT NULL DEFAULT NEWID(),
CreateTime DATETIME2 NOT NULL INDEX IX_NewTable_CreateTime CLUSTERED,
)
Then I "copied" rows from old table to new table using INSERT INTO NewTable SELECT * FROM OldTable. Finally, I renamed tables and dropped the old one. Life seemed good.
To my surprise, a couple of weeks later, I found that NewTable has many fragmented indexes, with average fragmentation as high as 80%! Even IX_NewTable_CreateTime reports fragmentation of 18%.
Did the INSERT INTO fragment the index? Will rebuilding the index solve the problem for good?

Fragmentation depends on how frequently the indexed columns are inserted or updated and on the size of the index page.
For maintenance purposes, you can use Azure Automation and create a recurring script that checks for fragmented indexes and optimizes them.
There's a Runbook in the Gallery just for that:
The best thing about this is that Automation is free as long as you don't go over the 500 running minutes per month; time your executions well and you won't have to pay :)
I made a custom improvement to the gallery script, feel free to use it too:
<#
.SYNOPSIS
Indexes tables in a database if they have a high fragmentation
.DESCRIPTION
This runbook indexes all of the tables in a given database if the fragmentation is
above a certain percentage.
It highlights how to break up calls into smaller chunks,
in this case each table in a database, and use checkpoints.
This allows the runbook job to resume for the next chunk of work even if the
fairshare feature of Azure Automation puts the job back into the queue every 30 minutes
.PARAMETER SqlServer
Name of the SqlServer
.PARAMETER Database
Name of the database
.PARAMETER SQLCredentialName
Name of the Automation PowerShell credential setting from the Automation asset store.
This setting stores the username and password for the SQL Azure server
.PARAMETER FragPercentage
Optional parameter for specifying over what percentage fragmentation to index database
Default is 20 percent
.PARAMETER RebuildOffline
Optional parameter to rebuild indexes offline if online fails
Default is false
.PARAMETER Table
Optional parameter for specifying a specific table to index
Default is all tables
.PARAMETER SqlServerPort
Optional parameter for specifying the SQL port
Default is 1433
.EXAMPLE
Update-SQLIndexRunbook -SqlServer "server.database.windows.net" -Database "Finance" -SQLCredentialName "FinanceCredentials"
.EXAMPLE
Update-SQLIndexRunbook -SqlServer "server.database.windows.net" -Database "Finance" -SQLCredentialName "FinanceCredentials" -FragPercentage 30
.EXAMPLE
Update-SQLIndexRunbook -SqlServer "server.database.windows.net" -Database "Finance" -SQLCredentialName "FinanceCredentials" -Table "Customers" -RebuildOffline $True
.NOTES
AUTHOR: Matias Quaranta
LASTEDIT: Jan 10th, 2015
#>
workflow MyRunBook
{
param(
[parameter(Mandatory=$True)]
[string] $SqlServer,
[parameter(Mandatory=$True)]
[string] $Database,
[parameter(Mandatory=$True)]
[string] $SQLCredentialName,
[parameter(Mandatory=$False)]
[int] $FragPercentage = 20,
[parameter(Mandatory=$False)]
[int] $SqlServerPort = 1433,
[parameter(Mandatory=$False)]
[boolean] $RebuildOffline = $False,
[parameter(Mandatory=$False)]
[string] $Table
)
# Get the stored username and password from the Automation credential
$SqlCredential = Get-AutomationPSCredential -Name $SQLCredentialName
if ($SqlCredential -eq $null)
{
throw "Could not retrieve '$SQLCredentialName' credential asset. Check that you created this first in the Automation service."
}
$SqlUsername = $SqlCredential.UserName
$SqlPass = $SqlCredential.GetNetworkCredential().Password
InlineScript{
# Define the connection to the SQL Database
$Conn = New-Object System.Data.SqlClient.SqlConnection("Server=tcp:$using:SqlServer,$using:SqlServerPort;Database=$using:Database;User ID=$using:SqlUsername;Password=$using:SqlPass;Trusted_Connection=False;Encrypt=True;Connection Timeout=30;")
# Open the SQL connection
$Conn.Open()
# SQL command to find tables and their average fragmentation
$SQLCommandString = @"
SELECT a.object_id, b.name, (select name from sys.tables t where t.object_id = b.object_id) as tablename, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (
DB_ID(N'$Database')
, OBJECT_ID(0)
, NULL
, NULL
, NULL) AS a
JOIN sys.indexes AS b
ON a.object_id = b.object_id AND a.index_id = b.index_id;
"@
# Return the tables with their corresponding average fragmentation
$Cmd=new-object system.Data.SqlClient.SqlCommand($SQLCommandString, $Conn)
$Cmd.CommandTimeout=120
# Execute the SQL command
$FragmentedTable=New-Object system.Data.DataSet
$Da=New-Object system.Data.SqlClient.SqlDataAdapter($Cmd)
[void]$Da.fill($FragmentedTable)
# Return the table names that have high fragmentation
ForEach ($FragTable in $FragmentedTable.Tables[0])
{
If ($FragTable.avg_fragmentation_in_percent -ge $Using:FragPercentage)
{
Write-Verbose ("Index found : " + $FragTable.name + " on table:" + $FragTable.tablename)
$SQLCommandString = "EXEC('ALTER INDEX "+$FragTable.name+" ON "+$FragTable.tablename+" REBUILD')"
$Cmd2=new-object system.Data.SqlClient.SqlCommand($SQLCommandString, $Conn)
# Set the Timeout to be less than 30 minutes since the job will get queued if > 30
# Setting to 25 minutes to be safe.
$Cmd2.CommandTimeout=1500
Try
{
$Ds=New-Object system.Data.DataSet
$Da=New-Object system.Data.SqlClient.SqlDataAdapter($Cmd2)
[void]$Da.fill($Ds)
}
Catch
{
Write-Verbose ($FragTable.name +" on table "+$FragTable.tablename+" could NOT be indexed.")
}
}
}
$Conn.Close()
}
Write-Verbose "Finished Indexing"
}
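The runbook above builds its ALTER INDEX statement by concatenating the index and table names as-is, which can break on names that contain spaces or reserved words. A minimal sketch of one way to guard against that, assuming two hypothetical helper functions (ConvertTo-QuotedIdentifier and New-RebuildIndexStatement are not part of the gallery script) and assuming the dbo schema when none is supplied:

```powershell
# Hypothetical helpers (not part of the gallery runbook): bracket-quote
# identifiers so names containing spaces or reserved words stay valid T-SQL.
function ConvertTo-QuotedIdentifier {
    param([Parameter(Mandatory = $true)][string] $Name)
    # Same rule T-SQL's QUOTENAME applies: wrap in [] and double any ']'.
    return '[' + $Name.Replace(']', ']]') + ']'
}

function New-RebuildIndexStatement {
    param(
        [Parameter(Mandatory = $true)][string] $IndexName,
        [Parameter(Mandatory = $true)][string] $TableName,
        # Assumption: the runbook's query does not return the schema name.
        [string] $SchemaName = 'dbo'
    )
    $index = ConvertTo-QuotedIdentifier -Name $IndexName
    $table = (ConvertTo-QuotedIdentifier -Name $SchemaName) + '.' +
             (ConvertTo-QuotedIdentifier -Name $TableName)
    return "ALTER INDEX $index ON $table REBUILD;"
}

# Example call using the index from the question.
New-RebuildIndexStatement -IndexName 'IX_NewTable_CreateTime' -TableName 'NewTable'
```

Inside the ForEach loop, the returned string could replace the hand-built $SQLCommandString. Heaps appear in sys.indexes with a NULL name, so those rows would need to be skipped before calling the helper.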

Related

PowerShell SQL Statement - Insert/Update depending on value in the field

I need to check in a database whether a record in the customers table already has a subsidy.
If yes, I need to update the corresponding record in customer_subsidies with the value I retrieve from a CSV.
If not, I need to insert a new record into customer_subsidies.
The customer_subsidies table is linked to the customers table by the id of the customer.
$csvFile = "C:\exchange\dailycardimport.csv"
Import-Csv $csvFile -Delimiter ";" | ForEach-Object {
#check if dataset with the card_num = "12345" has a value in the field subsidy_id#
#if yes,
update customer_subsidies
SET subsidy_id = 1, priority = 1,
where customer_number=(Id from Customer WHERE card_NUM = "12345")
#if not,
INSERT INTO customer_subsidies (id, subsidy_id, customer_id, priority)
VALUES (
(select max(id) from customer_subsidies) +1,
$($_.priority),
(select id from customers where code ='$($_.CARD_NUM)'),
1)
}
At the moment I am using the Invoke-Sqlcmd command to interact with the SQL Server, as I am not able to download new modules to the Windows server.
Invoke-Sqlcmd -Query "SELECT * FROM customers WHERE card_num=" -ConnectionString $connectionString
Thanks for any help!

Powershell pipeline Invoke-Sqlcmd won't work in SQL agent, how to fix?

I'm trying to implement a PowerShell script (that works in PowerShell) in the SQL Server Agent. In this script, I fetch the SQL Server version (among other properties) of all the registered servers (about 100) on the management server and save the result in a table in a specific database on the management server. If I execute the script from PowerShell or PowerShell ISE, this works like a charm; the destination table is filled with the correct data.
However, if the script is placed in a job in SQL server and executed, the output is NULL.
What have I tried:
iterate over the servers by a "foreach" statement and execute a query per server, saving the outcome in a pipeline. (This is the one that works in PowerShell but fails to yield the data in SQL Agent.)
Implement the query-execution and saving of the data in a "get-childitem"- pipeline, by addition of the "Invoke-Sqlcmd" and "Write-SqlTableData" statements in the pipeline. Couldn't get this to work, the message "The input object cannot be bound to any parameters for the command either because the command does not take pipeline input or the input and its properties do not match any of the parameters that take pipeline input" is returned.
According to the documentation of Invoke-Sqlcmd however, the parameter "-ServerInstance" is the only parameter that accepts pipeline values. And that is the only one I used (for iterating over the servers); the other parameters are the same for each iteration.
changed the module from "sqlserver" to "dbatools": same outcome.
execute the ps1-script in management studio (xp_cmdshell 'PowerShell -noprofile "[path/filename]"'). Doesn't work.
Any ideas, how can I fix this?
Powershell-script that works in Powershell:
$CreateTable =
"IF EXISTS (SELECT * FROM sysobjects where name='SQL_server_current_versions' and xtype='U')
DROP TABLE [SQL_server_current_versions];
CREATE TABLE [dbo].[SQL_server_current_versions](
[Machine_name] [varchar](20) NULL,
[SQL_Service_name] [varchar](20) NULL,
[Instance_name] [varchar](30) NULL,
[VersionBuild] [varchar](20) NULL,
[Current_SP] [varchar](10) NULL,
[Edition_32_or_64_bit] [varchar](40) NULL,
[IsWindowsAuthOnly] [bit] NULL,
[IsClustered] [bit] NULL
)"
SQLserver\Invoke-Sqlcmd -ServerInstance RIVM-MANDB-W01P -Database SSCC_db -Query $CreateTable
$query =
"SELECT SERVERPROPERTY('MachineName') AS [MACHINE NAME],
SERVERPROPERTY('InstanceName') AS [SQL SERVICE NAME],
SERVERPROPERTY('ServerName') AS [INSTANCE NAME],
SERVERPROPERTY('ProductVersion') AS [VersionBuild],
SERVERPROPERTY('ProductLevel') AS [Current SP],
SERVERPROPERTY ('Edition') AS [Edition 32 or 64 BIT],
SERVERPROPERTY('IsIntegratedSecurityOnly') AS [IsWindowsAuthOnly],
SERVERPROPERTY('IsClustered') AS [IsClustered]"
foreach ($server in (Get-ChildItem 'SQLSERVER:\SQLRegistration' -Recurse | Where-Object {$_ -is [Microsoft.SqlServer.Management.RegisteredServers.RegisteredServer]} | select ServerName))
{$server.ServerName
if ($server.ServerName -ne "RIVM-MANDB-W01P\SSRS")
{SQLserver\Invoke-Sqlcmd -ServerInstance $server.ServerName -Database master -Query $query -OutputAs DataTables | SQLserver\Write-SqlTableData -ServerInstance RIVM-MANDB-W01P -DatabaseName SSCC_db -SchemaName dbo -TableName SQL_server_current_versions}
}

How to bulk import many Word documents into SQL Server database table

I need to import ~50,000 Word documents (.doc and .docx) from a single directory into a SQL Server 2016 database table so that I can use full text indexing and then search the documents' contents.
Since this is a one-off task and the database won't be required for long I'm not concerned with performance or the arguments for using FILESTREAM or FileTables.
I've just created a database with a single table:
CREATE TABLE [dbo].[MyDocument]
(
[ID] INT IDENTITY(1,1) NOT NULL,
[DocumentName] NVARCHAR(255) NOT NULL,
[Extension] NCHAR(10) NOT NULL,
[DocumentContent] VARBINARY(MAX) NOT NULL,
CONSTRAINT [PK_MyDocument] PRIMARY KEY CLUSTERED ([ID] ASC)
)
Now I'm looking for a way to get my documents into the table. There are plenty of examples online for importing a single document into a SQL Server database table using OPENROWSET, but they require me to specify a name for the file, which is obviously no use for my requirements.
I can't believe there isn't a well-documented and straightforward way to do this but a couple of hours of searching haven't turned anything up, which is starting to make me doubt this is even possible, but surely it is?
Can anybody give me an example snippet of T-SQL for importing multiple files into the database? Or suggest how else it might be achieved?
Below is a PowerShell script to import all ".docx" files in the specified folder using a parameterized query along with a FileStream parameter value to stream file contents to the database rather than loading the entire file contents into client memory.
# import all documents in specified directory using file stream parameter
try {
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$insertQuery = @"
INSERT INTO dbo.MyDocument (DocumentName, Extension, DocumentContent)
VALUES(@DocumentName, @Extension, @DocumentContent);
"@
$connection = New-Object System.Data.SqlClient.SqlConnection("Data Source=.;Initial Catalog=YourDatabase;Integrated Security=SSPI")
$command = New-Object System.Data.SqlClient.SqlCommand($insertQuery, $connection)
$documentNameParameter = $command.Parameters.Add("@DocumentName", [System.Data.SqlDbType]::NVarChar, 255)
$documentExtensionParameter = $command.Parameters.Add("@Extension", [System.Data.SqlDbType]::NVarChar, 10)
$documentContentParameter = $command.Parameters.Add("@DocumentContent", [System.Data.SqlDbType]::VarBinary, -1)
$connection.Open()
$filesToImport = Get-ChildItem "E:\DocumentsToImport\*.docx"
$importedFileCount = 0
foreach($fileToImport in $filesToImport) {
$documentContentStream = [System.IO.File]::Open($fileToImport.FullName, [System.IO.FileMode]::Open)
$documentNameParameter.Value = [System.IO.Path]::GetFileNameWithoutExtension($fileToImport.FullName)
$documentExtensionParameter.Value = [System.IO.Path]::GetExtension($fileToImport.Name)
$documentContentParameter.Value = $documentContentStream
[void]$command.ExecuteNonQuery()
$documentContentStream.Close()
$importedFileCount += 1
}
$connection.Close()
$timer.Stop()
Write-Host "$importedFileCount files imported. Duration $($timer.Elapsed)."
}
catch {
throw
}
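The question mentions both .doc and .docx files, while the script above selects only *.docx. A minimal sketch of a file-selection step that covers both extensions, assuming a hypothetical helper named Get-WordDocument (not part of the original answer) and PowerShell 3.0 or later for the -File switch and the -in operator:

```powershell
# Hypothetical helper: return the Word documents in a folder, matching both
# extensions. PowerShell's -in comparison is case-insensitive by default,
# so 'REPORT.DOCX' is picked up as well.
function Get-WordDocument {
    param([Parameter(Mandatory = $true)][string] $Path)
    Get-ChildItem -Path $Path -File |
        Where-Object { $_.Extension -in '.doc', '.docx' } |
        Sort-Object -Property Name
}

# Usage (the folder path is taken from the script above):
# $filesToImport = Get-WordDocument -Path 'E:\DocumentsToImport'
```

Filtering on the Extension property rather than a wildcard such as *.doc* avoids also picking up other extensions that merely start with ".doc", for example .docm.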

How to backup statistics in sql?

Let's say I have a table A with some data in it in SSMS. It has sub-items such as columns, constraints, triggers, indexes and statistics.
I want to create a similar table with the same properties as table A. I know I need to go to Script Table As -> Create To -> New Query Window to duplicate the table structure.
However, after doing that, I realized the statistics in my new table are empty, while table A has statistics. Did I miss something?
You can script the statistics blob only with the following bit of powershell (which I yoinked from an old blog post of mine):
pushd;
import-module sqlps -disablenamechecking;
popd;
$opts = new-object Microsoft.SqlServer.Management.SMO.ScriptingOptions;
$opts.OptimizerData = $true;
$server = new-object Microsoft.SqlServer.Management.SMO.Server ".";
$database = $server.Databases["AdventureWorks2008R2"];
foreach ($table in $database.Tables) {
foreach ($stat in $table.Statistics) {
$stat.Script($opts);
}
}
The above will script out all statistics (including the histogram data) for all tables in the AdventureWorks2008R2 database. You should be able to tailor it to your needs.

Pass a powershell variable into a SQL value during out-datatable (invoke-sqlcmd2)

I want to insert a PowerShell variable value with a Select as I build a datatable from a SQL query.
Borrowed function invoke-sqlcmd2 from TechNet gallery and dot-sourced it in.
$NewSequenceID = invoke-sqlcmd2 -ServerInstance "MyServer" -Database "MyDB" -Query "INSERT INTO [Sequence] (TimeStarted) SELECT GETDATE(); SELECT max(SequenceID) as SequenceID FROM [Sequence]" | foreach { $_.SequenceID }
This generates a new sequence ID and stamps the time we started the batch. Results in a single number which will identify this run. Verified with 'write $NewSequenceID'.
I want to keep later results from queries together with this SequenceID for analysis.
Then I have this:
$PollTime = Get-Date -format "yyyy-MM-dd HH:mm:ss"
Then I want to do this: (Edit: This statement is not working - error message at the bottom)
$AuditUserOutput = invoke-sqlcmd2 -ServerInstance "MyServer2" -Database "MyDB2" -Query "SELECT $NewSequenceID, $PollTime, [USERID], [PID], [UDATE] FROM [MyTable]" -As 'Datatable'
And do some things with the table, then write it after with write-datatable.
If I select NULL for the first two values and grab the other three from the existing table, it works fine. I want to add the $NewSequenceID and $PollTime from the previous statements.
I've read a dozen pages about using ` (backtick), $, {}, and on and on, but I haven't gotten it right. Can someone help with the correct syntax for inserting these variable values into the selection?
PS Error is: Exception calling "Fill" with "1" argument(s): "Invalid pseudocolumn "$NewSequenceID"."
You're interpolating the variables correctly in PowerShell. If I'm understanding this correctly, the problem is with your SQL query. I'm going to make an inference here, but I think this is probably what you want:
$AuditUserOutput = invoke-sqlcmd2 -ServerInstance "MyServer2" -Database "MyDB2" -Query "SELECT [NewSequenceID], [PollTime], [USERID], [PID], [UDATE] FROM [MyTable] WHERE NewSequenceID = '$NewSequenceID' AND PollTime = '$PollTime'" -As 'Datatable'
If not, please clarify by responding to the questions above.
I was able to work around this by first creating a variable to store the query text, which allowed for the natural substitution I needed:
$AuditUserQuery = "SELECT '$NewSequenceID', '$PollTime', [USERID], [PID], [UDATE] FROM [AUDITUSER]"
Then calling that variable as the $query when building the datatable.
This avoided the parameterization problem experienced before.
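The error message suggests that the literal text $NewSequenceID reached SQL Server, where a name starting with $ is read as a pseudocolumn. A minimal sketch of building the query text up front, assuming a hypothetical helper named New-AuditUserQuery (not from the original posts) that types its inputs so that only a number and a formatted timestamp can end up in the SQL text:

```powershell
# Hypothetical helper: build the SELECT text with the run's sequence id and
# poll time embedded as literals. Typing the parameters as [int] and
# [datetime] means arbitrary text cannot be spliced into the query.
function New-AuditUserQuery {
    param(
        [Parameter(Mandatory = $true)][int] $SequenceId,
        [Parameter(Mandatory = $true)][datetime] $PollTime
    )
    # Same format the question uses for $PollTime, independent of the local culture.
    $stamp = $PollTime.ToString('yyyy-MM-dd HH:mm:ss', [cultureinfo]::InvariantCulture)
    # Double quotes so PowerShell expands the variables; the single quotes
    # become part of the SQL text and mark the timestamp as a string literal.
    return "SELECT $SequenceId AS SequenceID, '$stamp' AS PollTime, [USERID], [PID], [UDATE] FROM [AUDITUSER]"
}

# Usage with the variables from the question:
# $AuditUserQuery = New-AuditUserQuery -SequenceId $NewSequenceID -PollTime (Get-Date)
```

The column aliases SequenceID and PollTime are an addition for illustration; they give the two literal columns names in the resulting datatable.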
