PowerShell SQL Statement - Insert/Update depending on value in the field - sql-server

I need to check in a database whether a record in the customers table already has a subsidy.
If yes, I need to update the corresponding record in customer_subsidies with the value I retrieve from a CSV.
If not, I need to insert a new record into customer_subsidies.
The customer_subsidies table is linked to the customers table by the customer id.
$csvFile = "C:\exchange\dailycardimport.csv"
Import-Csv $csvFile -Delimiter ";" | ForEach-Object {
    # check if the record with card_num = "12345" has a value in the field subsidy_id
    # if yes:
    UPDATE customer_subsidies
    SET subsidy_id = 1, priority = 1
    WHERE customer_id = (SELECT id FROM customers WHERE card_num = '12345')
    # if not:
    INSERT INTO customer_subsidies (id, subsidy_id, customer_id, priority)
    VALUES (
        (SELECT MAX(id) FROM customer_subsidies) + 1,
        $($_.priority),
        (SELECT id FROM customers WHERE code = '$($_.CARD_NUM)'),
        1)
}
At the moment I am using the command Invoke-Sqlcmd to interact with SQL Server, as I'm not able to install new modules on the Windows server:
Invoke-Sqlcmd -Query "SELECT * FROM customers WHERE card_num = '12345'" -ConnectionString $connectionString
Thanks for any help!
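One way to do this without installing extra modules is to let SQL Server itself decide between UPDATE and INSERT, sending a single batch per CSV row through Invoke-Sqlcmd. A minimal T-SQL sketch, assuming the table and column names from the question (not tested; the DECLAREd sample values stand in for what you would splice in from the CSV row):

-- Sketch only: sample literals so the batch runs standalone.
DECLARE @card_num   varchar(20) = '12345';  -- from $_.CARD_NUM (the question uses both card_num and code for this column; adjust to your schema)
DECLARE @subsidy_id int = 1;
DECLARE @priority   int = 1;                -- from $_.priority

-- Try the update first: this only hits rows whose customer already has a subsidy.
UPDATE cs
SET subsidy_id = @subsidy_id, priority = @priority
FROM customer_subsidies AS cs
JOIN customers AS c ON c.id = cs.customer_id
WHERE c.card_num = @card_num;

-- If no row was updated, the customer has no subsidy yet, so insert one.
IF @@ROWCOUNT = 0
    INSERT INTO customer_subsidies (id, subsidy_id, customer_id, priority)
    SELECT (SELECT ISNULL(MAX(id), 0) + 1 FROM customer_subsidies),
           @subsidy_id,
           (SELECT id FROM customers WHERE card_num = @card_num),
           @priority;

Note that MAX(id) + 1 is only safe if nothing else inserts into customer_subsidies concurrently; an IDENTITY column would be more robust.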

Related

How to drop duplicate records in the SQL Server using Python?

I have a .csv file that gets updated every day. I am pushing this .csv file into SQL Server using Python: my script reads the .csv file and uploads it into a SQL Server database.
This is my Python script:
import pandas as pd
import pyodbc

df = pd.read_csv("C:/Users/Dhilip/Downloads/test.csv")
print(df)

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=DESKTOP-7FCK7FG;'
                      'Database=test;'
                      'Trusted_Connection=yes;')
cursor = conn.cursor()
# cursor.execute('CREATE TABLE people_info (Name nvarchar(50), Country nvarchar(50), Age int)')

for row in df.itertuples():
    cursor.execute('''
        INSERT INTO test.dbo.people_info (Name, Country, Age)
        VALUES (?,?,?)
        ''',
        row.Name,
        row.Country,
        row.Age,
    )
conn.commit()
The script is working fine, and I have automated it with a batch file and Task Scheduler; that also works. However, whenever I add new data to the .csv file, SQL Server gets the new data but the old rows are inserted again, so they appear multiple times.
For example, if I add a new record called Israel, the existing records show up in SQL Server a second time alongside it; I need each record to appear only once.
Can anyone advise me on the change I need to make in the above Python script?
You can use the query below in your Python script. IF NOT EXISTS checks whether the record already exists, based on the condition in the WHERE clause; if the record exists, control goes to the ELSE branch, where you can update or do anything else.
Checking for existing records in the database is faster than checking in the Python script.
if not exists (select * from Table where Name = 'b')
begin
    insert into Table values ('b', 'Japan', 70)
end
else
begin
    update Table set Age = 54, Country = 'Korea' where Name = 'b'
end
To find existing duplicate records, use the query below:
select Name, count(Name) as dup_count
from Table
group by Name
having count(Name) > 1
I find duplicates like this:
import sqlite3

def find_duplicates(table_name):
    """
    Find duplicate rows inside a table.
    :param table_name: name of the table to check
    :return: the duplicate rows
    """
    connection = sqlite3.connect("./k_db.db")
    cursor = connection.cursor()
    findduplicates = """SELECT a.*
        FROM {} a
        JOIN (
            SELECT shot, seq, lower(user), date_time, written_by, COUNT(*)
            FROM {}
            GROUP BY shot, seq, lower(user), date_time, written_by
            HAVING count(*) > 1) b
        ON a.shot = b.shot
        AND a.seq = b.seq
        AND a.date_time = b.date_time
        AND a.written_by = b.written_by
        ORDER BY a.shot;""".format(table_name, table_name)
    cursor.execute(findduplicates)
    records = cursor.fetchall()
    cursor.close()
    connection.close()
    return records
You could rephrase your insert such that it checks for existence of the tuple before inserting:
for row in df.itertuples():
    cursor.execute('''
        INSERT INTO test.dbo.people_info (Name, Country, Age)
        SELECT ?, ?, ?
        WHERE NOT EXISTS (SELECT 1 FROM test.dbo.people_info
                          WHERE Name = ? AND Country = ? AND Age = ?)
        ''', (row.Name, row.Country, row.Age, row.Name, row.Country, row.Age))
conn.commit()
An alternative to the above would be to add a unique index on (Name, Country, Age). Then, your duplicate insert attempts would fail and generate an error.
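If you go the unique-index route, here is a minimal T-SQL sketch (the index name is hypothetical, and any existing duplicates have to be removed before the index can be created). The optional IGNORE_DUP_KEY setting makes SQL Server silently skip duplicate inserts instead of raising an error:

CREATE UNIQUE INDEX UX_people_info_name_country_age
ON test.dbo.people_info (Name, Country, Age)
WITH (IGNORE_DUP_KEY = ON);  -- optional: discard duplicates quietly instead of failing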

Perl - Get the structure of a sqlite database using DBI

I need to test the structure of my SQLite database, which consists of a single table with, let's say, 2 columns (id, name). I can't figure out the SQL query to get the table schema of my database.
I am able to get all the content of the database using the DBI method selectall_arrayref(). However, it only returns an array containing the values inside my database. This information is useful, but I would like a SQL query which returns something like id, name (basically, the table schema).
I tried the following queries: SHOW COLUMNS FROM $tablename but also SELECT * FROM $tablename (this one returns all the table content).
Here is my implementation so far:
# database path
my $db_path = "/my/path/to/.database.sqlite";
my $tablename = "table_name";

sub connect_to_database {
    # Connect to the database
    my $dbh = DBI->connect("dbi:SQLite:dbname=$db_path", "", "",
        { RaiseError => 1, AutoCommit => 0 },
    ) or confess $DBI::errstr;
    return $dbh;
}
sub get_database_structure {
    # Connect to the database
    my $dbh = &connect_to_database();
    # Get the structure of the database
    my $sth = $dbh->prepare("SHOW COLUMNS FROM $tablename");
    $sth->execute();
    while (my $inphash = $sth->fetchrow_hashref()) {
        print $inphash . "\n";
    }
    # Disconnect from the database
    $dbh->disconnect();
}
# Call the sub to print the database structure
&get_database_structure();
I expect the output to be the structure of my table, so id, name, but instead it raises an error: DBD::SQLite::db prepare failed: near "SHOW": syntax error
I can't find the right query. Any comments or help would be greatly appreciated.
Thanks!
What you're looking for is really just the SQLite query for the table and column information. The answer SQLite Schema Information Metadata has the full details if this query doesn't work for you, but assuming you're on a reasonably recent version of SQLite (one that supports the pragma_table_info() table-valued function), you can do something like this:
# Get the structure of the database
my $sth = $dbh->prepare(<<END_SQL);
SELECT
    m.name AS table_name,
    p.name AS column_name
FROM sqlite_master AS m
JOIN pragma_table_info(m.name) AS p
ORDER BY m.name, p.cid
END_SQL
$sth->execute();

my $last = '';
while (my $row = $sth->fetchrow_arrayref()) {
    my ($table, $column) = @$row;
    if ($table ne $last) {
        print "=== $table ===\n";
        $last = $table;
    }
    print "$column\n";
}
After digging through the community answers, I finally found a solution using pragma table_info.
sub get_database_structure {
    # Connect to the database
    my $dbh = &connect_to_database();
    # Return the structure of the table execution_host
    my $sth = $dbh->prepare('pragma table_info(execution_host)');
    $sth->execute();
    my @struct;
    while (my $row = $sth->fetchrow_arrayref()) {
        push @struct, $row->[1];
    }
    # Disconnect from the database
    $dbh->disconnect();
    return @struct;
}
It returns a list of the column names present in the table execution_host.
Thanks for the help !

Why are my Azure SQL Database indexes still fragmented?

My company has committed the sin of using GUIDs as Primary Keys on our Azure SQL Database tables (it is actually worse than that: we used VARCHAR(36) instead of UNIQUEIDENTIFIER). As such, we end up with fragmented indexes. They looked like this:
CREATE TABLE OldTable (
    Id VARCHAR(36) PRIMARY KEY CLUSTERED NOT NULL DEFAULT NEWID(),
    CreateTime DATETIME2 NOT NULL,
    ...
)
I "fixed" the problem by creating new tables. This time, I used an immutable, ever-increasing DATETIME2 (e.g. CreateTime) column for CLUSTERED INDEX, and kept the VARCHAR(36) as PRIMARY KEY but this time, NONCLUSTERED. Like this:
CREATE TABLE NewTable (
    Id VARCHAR(36) PRIMARY KEY NONCLUSTERED NOT NULL DEFAULT NEWID(),
    CreateTime DATETIME2 NOT NULL INDEX IX_NewTable_CreateTime CLUSTERED,
    ...
)
Then I "copied" rows from old table to new table using INSERT INTO NewTable SELECT * FROM OldTable. Finally, I renamed tables and dropped the old one. Life seemed good.
To my surprise, a couple of weeks later, I found out NewTable has many fragmented indexes, with avg fragmentation as high as 80%! Even IX_NewTable_CreateTime reports fragmentation of 18%.
Did the INSERT INTO fragment the indexes? Will a REBUILD solve the problem for good?
Fragmentation will depend on the insert/update frequency on the indexed fields and the size of the index page.
For maintenance purposes, you can use Azure Automation and create a recurring script that checks for fragmented indexes and optimizes them.
There's a Runbook in the Gallery just for that:
The best thing about this is that Automation is free as long as you don't go over the 500 running minutes per month; time your executions well and you won't have to pay :)
I made a custom improvement to the gallery script, feel free to use it too:
<#
.SYNOPSIS
Indexes tables in a database if they have a high fragmentation
.DESCRIPTION
This runbook indexes all of the tables in a given database if the fragmentation is
above a certain percentage.
It highlights how to break up calls into smaller chunks,
in this case each table in a database, and use checkpoints.
This allows the runbook job to resume for the next chunk of work even if the
fairshare feature of Azure Automation puts the job back into the queue every 30 minutes
.PARAMETER SqlServer
Name of the SqlServer
.PARAMETER Database
Name of the database
.PARAMETER SQLCredentialName
Name of the Automation PowerShell credential setting from the Automation asset store.
This setting stores the username and password for the SQL Azure server
.PARAMETER FragPercentage
Optional parameter for specifying over what percentage fragmentation to index database
Default is 20 percent
.PARAMETER RebuildOffline
Optional parameter to rebuild indexes offline if online fails
Default is false
.PARAMETER Table
Optional parameter for specifying a specific table to index
Default is all tables
.PARAMETER SqlServerPort
Optional parameter for specifying the SQL port
Default is 1433
.EXAMPLE
Update-SQLIndexRunbook -SqlServer "server.database.windows.net" -Database "Finance" -SQLCredentialName "FinanceCredentials"
.EXAMPLE
Update-SQLIndexRunbook -SqlServer "server.database.windows.net" -Database "Finance" -SQLCredentialName "FinanceCredentials" -FragPercentage 30
.EXAMPLE
Update-SQLIndexRunbook -SqlServer "server.database.windows.net" -Database "Finance" -SQLCredentialName "FinanceCredentials" -Table "Customers" -RebuildOffline $True
.NOTES
AUTHOR: Matias Quaranta
LASTEDIT: Jan 10th, 2015
#>
workflow MyRunBook
{
    param(
        [parameter(Mandatory=$True)]
        [string] $SqlServer,

        [parameter(Mandatory=$True)]
        [string] $Database,

        [parameter(Mandatory=$True)]
        [string] $SQLCredentialName,

        [parameter(Mandatory=$False)]
        [int] $FragPercentage = 20,

        [parameter(Mandatory=$False)]
        [int] $SqlServerPort = 1433,

        [parameter(Mandatory=$False)]
        [boolean] $RebuildOffline = $False,

        [parameter(Mandatory=$False)]
        [string] $Table
    )

    # Get the stored username and password from the Automation credential
    $SqlCredential = Get-AutomationPSCredential -Name $SQLCredentialName
    if ($SqlCredential -eq $null)
    {
        throw "Could not retrieve '$SQLCredentialName' credential asset. Check that you created this first in the Automation service."
    }
    $SqlUsername = $SqlCredential.UserName
    $SqlPass = $SqlCredential.GetNetworkCredential().Password

    InlineScript {
        # Define the connection to the SQL Database
        $Conn = New-Object System.Data.SqlClient.SqlConnection("Server=tcp:$using:SqlServer,$using:SqlServerPort;Database=$using:Database;User ID=$using:SqlUsername;Password=$using:SqlPass;Trusted_Connection=False;Encrypt=True;Connection Timeout=30;")

        # Open the SQL connection
        $Conn.Open()

        # SQL command to find tables and their average fragmentation
        $SQLCommandString = @"
SELECT a.object_id, b.name, (select name from sys.tables t where t.object_id = b.object_id) as tablename, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (
      DB_ID(N'$Database')
    , OBJECT_ID(0)
    , NULL
    , NULL
    , NULL) AS a
JOIN sys.indexes AS b
ON a.object_id = b.object_id AND a.index_id = b.index_id;
"@

        # Return the tables with their corresponding average fragmentation
        $Cmd = New-Object system.Data.SqlClient.SqlCommand($SQLCommandString, $Conn)
        $Cmd.CommandTimeout = 120

        # Execute the SQL command
        $FragmentedTable = New-Object system.Data.DataSet
        $Da = New-Object system.Data.SqlClient.SqlDataAdapter($Cmd)
        [void]$Da.Fill($FragmentedTable)

        # Return the table names that have high fragmentation
        ForEach ($FragTable in $FragmentedTable.Tables[0])
        {
            If ($FragTable.avg_fragmentation_in_percent -ge $Using:FragPercentage)
            {
                Write-Verbose ("Index found : " + $FragTable.name + " on table: " + $FragTable.tablename)
                $SQLCommandString = "EXEC('ALTER INDEX " + $FragTable.name + " ON " + $FragTable.tablename + " REBUILD')"
                $Cmd2 = New-Object system.Data.SqlClient.SqlCommand($SQLCommandString, $Conn)

                # Set the timeout to be less than 30 minutes since the job will get queued if > 30.
                # Setting it to 25 minutes to be safe.
                $Cmd2.CommandTimeout = 1500
                Try
                {
                    $Ds = New-Object system.Data.DataSet
                    $Da = New-Object system.Data.SqlClient.SqlDataAdapter($Cmd2)
                    [void]$Da.Fill($Ds)
                }
                Catch
                {
                    Write-Verbose ($FragTable.name + " on table " + $FragTable.tablename + " could NOT be indexed.")
                }
            }
        }
        $Conn.Close()
    }
    Write-Verbose "Finished Indexing"
}

Pass a powershell variable into a SQL value during out-datatable (invoke-sqlcmd2)

I want to insert a PowerShell variable value with a SELECT as I build a datatable from a SQL query.
I borrowed the function Invoke-Sqlcmd2 from the TechNet Gallery and dot-sourced it in.
$NewSequenceID = invoke-sqlcmd2 -ServerInstance "MyServer" -Database "MyDB" -Query "INSERT INTO [Sequence] (TimeStarted) SELECT GETDATE(); SELECT max(SequenceID) as SequenceID FROM [Sequence]" | foreach { $_.SequenceID }
This generates a new sequence ID and stamps the time we started the batch. It results in a single number which will identify this run (verified with 'write $NewSequenceID').
I want to keep later results from queries together with this SequenceID for analysis.
Then I have this:
$PollTime = Get-Date -format "yyyy-MM-dd HH:mm:ss"
Then I want to do this: (Edit: This statement is not working - error message at the bottom)
$AuditUserOutput = invoke-sqlcmd2 -ServerInstance "MyServer2" -Database "MyDB2" -Query "SELECT $NewSequenceID, $PollTime, [USERID], [PID], [UDATE] FROM [MyTable]" -As 'Datatable'
And do some things with the table, then write it after with write-datatable.
If I select NULL for the first two values and grab the other three from the existing table, it works fine. I want to add the $NewSequenceID and $PollTime from the previous statements.
I've read a dozen pages about using ` (backtick), $, {}, and on and on, but I haven't gotten it right. Can someone help with the correct syntax for inserting these variable values into the selection?
PS Error is: Exception calling "Fill" with "1" argument(s): "Invalid pseudocolumn "$NewSequenceID"."
You're interpolating the variables correctly in PowerShell. If I'm understanding this correctly, the problem is with your SQL query. I'm going to make an inference here, but I think this is probably what you want:
$AuditUserOutput = invoke-sqlcmd2 -ServerInstance "MyServer2" -Database "MyDB2" -Query "SELECT [NewSequenceID], [PollTime], [USERID], [PID], [UDATE] FROM [MyTable] WHERE NewSequenceID = '$NewSequenceID' AND PollTime = '$PollTime'" -As 'Datatable'
If not, please clarify.
I was able to work around this by first creating a variable to store the query text, which allowed for the natural substitution I needed:
$AuditUserQuery = "SELECT '$NewSequenceID', '$PollTime', [USERID], [PID], [UDATE] FROM [AUDITUSER]"
Then I called that variable as the query when building the datatable.
This avoided the parameterization problem experienced before.
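For what it's worth, this works because PowerShell expands the variables before SQL Server ever sees the text, and the single quotes turn the expanded values into string literals. Assuming hypothetical values of 42 and 2019-05-01 10:30:00, the query SQL Server actually receives is simply:

SELECT '42', '2019-05-01 10:30:00', [USERID], [PID], [UDATE] FROM [AUDITUSER]

The original error suggests the un-expanded, unquoted name $NewSequenceID reached SQL Server and was parsed as a column reference. Also keep in mind that splicing values into query text like this is open to SQL injection if the values ever come from user input.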

How to get multi-value data from Active Directory using SQL

Is it possible to get multi-value properties from AD, like description and memberOf? If I simply add memberOf to the query, it gives an error:
select *
FROM OPENQUERY(ADSI,'SELECT initials, samAccountName, displayName, distinguishedName, mail, memberOf FROM ''LDAP://DC=corp, DC=contoso, DC=com'' WHERE objectClass=''Person''')
Error:
Msg 7346, Level 16, State 2, Line 1
Cannot get the data of the row from the OLE DB provider "ADSDSOObject" for linked server "ADSI". Could not convert the data value due to reasons other than sign mismatch or overflow.
This is because memberOf is a multi-valued property in Active Directory. I am using SQL Server 2008 R2.
No, you cannot do this - and there's no "trick" or hack to get it to work, either.
The ADSI provider for SQL Server is rather limited - not supporting multi-valued attributes is one of those limitations.
So you'll need to find another way to do this, e.g. by using SQL-CLR integration and accessing Active Directory through .NET, or by exposing the data you need as a web service that you consume from SQL Server.
While you can't use ADSI to return memberOf, you can query against it. So if you have a group you want to check against, you can do the following, where extensionAttribute3 is the employee ID:
SELECT displayName
FROM OPENQUERY(ADSI,
'SELECT displayName
FROM ''LDAP://DC=company,DC=com''
WHERE memberof = ''CN=staff,OU=SharepointGroups,DC=company,DC=com''
AND extensionAttribute3 = ''12345678''
')
If the return value is not null then you can assume the user is part of the group.
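For example, one way to turn that into an explicit membership check (a sketch reusing the group DN and employee ID from the query above):

IF EXISTS (
    SELECT displayName
    FROM OPENQUERY(ADSI,
        'SELECT displayName
         FROM ''LDAP://DC=company,DC=com''
         WHERE memberof = ''CN=staff,OU=SharepointGroups,DC=company,DC=com''
         AND extensionAttribute3 = ''12345678''')
)
    PRINT 'User is a member of the group';
ELSE
    PRINT 'User is not a member';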
Here is my trick / hack for getting this to work:
exec xp_cmdshell 'powershell -command "get-aduser -filter * -properties SamAccountName, <MultiValue> | Select samaccountname, <MultiValue>"'
Change <MultiValue> to whatever attribute you are trying to pull. It will output the values as comma delimited in SQL. Change the PowerShell cmdlet as needed. All you have to do is collect the output, format it, and join it on your other data. Also, be sure your server has the AD PowerShell module and that you have enabled xp_cmdshell in SQL.
I just wrote a SQL script to include the description (multi-value) field on our company intranet directory. What I did was Export-Csv (tab-delimited) using PowerShell, then bulk insert that info into a table. It was the simplest solution for me, as we only have about 650 employees (records).
exec xp_cmdshell 'powershell.exe -command "get-aduser -filter * -properties SamAccountName, Description,GivenName,sn,title,telephoneNumber,mobile,mail,physicalDeliveryOfficeName| Select SamAccountName, Description,GivenName,sn,title,telephoneNumber,mobile,mail,physicalDeliveryOfficeName| export-csv -encoding unicode -delimiter "`t" -path C:\SQLJobs\addir.csv -notype"'
GO

CREATE TABLE dbLiftowDir.dbo.ADDir
(
    [SamAccountName] NVARCHAR(4000),
    [Description] NVARCHAR(4000),
    [GivenName] NVARCHAR(4000),
    [sn] NVARCHAR(4000),
    [title] NVARCHAR(4000) COLLATE French_CI_AS NOT NULL,
    [telephoneNumber] NVARCHAR(4000),
    [mobile] NVARCHAR(4000),
    [mail] NVARCHAR(4000),
    [physicalDeliveryOfficeName] NVARCHAR(4000)
)

BULK INSERT dbLiftowDir.dbo.ADDir
FROM 'C:\SQLJobs\addir.csv'
WITH
(
    CODEPAGE = 'ACP',
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2
)
Other things I did were to remove the " characters from the field values (SET column = REPLACE(...)) and to delete rows that were non-human accounts, as I found this easier to do in SQL than passing the code into PowerShell.
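As a rough sketch of that cleanup (column names from the table above; the filter for non-human accounts is hypothetical and depends on your naming conventions):

-- Strip the quote characters the export wrapped around values
UPDATE dbLiftowDir.dbo.ADDir
SET SamAccountName = REPLACE(SamAccountName, '"', '');

-- Remove non-human accounts (example filter only)
DELETE FROM dbLiftowDir.dbo.ADDir
WHERE SamAccountName LIKE 'svc-%';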