I am trying to insert a .csv file's contents into a SQL Server database.
Here is my code:
cursor = cnxn.cursor()
cursor.execute("Truncate table HumanResources.DepartmentTest") # Truncate old table contents
# Insert Dataframe into SQL Server:
for index, row in df.iterrows():
    cursor.execute("INSERT INTO HumanResources.DepartmentTest (DepartmentID,Name,GroupName) values(?,?,?)", row.DepartmentID, row.Name, row.GroupName)
    cnxn.commit()
    cursor.close()
I am running the above code in my Lambda function. I'm not sure why I am getting this error:
"errorMessage": "Attempt to use a closed cursor."
"errorType": "ProgrammingError"
"stackTrace":
" File "/var/task/lambda_function.py", line 56, in lambda_handler\n cursor.execute("INSERT INTO HumanResources.DepartmentTest (DepartmentID,Name,GroupName) values(?,?,?)", row.DepartmentID, row.Name, row.GroupName)
Can anyone help me with this issue?
Your cursor.close() should be outside of the for loop:
for index, row in df.iterrows():
    cursor.execute("INSERT INTO HumanResources.DepartmentTest (DepartmentID,Name,GroupName) values(?,?,?)", row.DepartmentID, row.Name, row.GroupName)
cnxn.commit()
cursor.close()
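If the per-row loop ever becomes a bottleneck, a variant of the same pattern is to hand every row to a single executemany call. This is only a sketch, assuming the same df and cnxn objects as in the question:

cursor = cnxn.cursor()
cursor.execute("Truncate table HumanResources.DepartmentTest")  # Truncate old table contents
# Build the parameter list once and let pyodbc send every row in one call;
# commit and close happen once, after all rows have been sent.
cursor.executemany(
    "INSERT INTO HumanResources.DepartmentTest (DepartmentID, Name, GroupName) VALUES (?, ?, ?)",
    [(row.DepartmentID, row.Name, row.GroupName) for _, row in df.iterrows()],
)
cnxn.commit()
cursor.close()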
Related
I'm having an issue saving a PDF file to a SQL Server database using a stored procedure in Golang. Below is the code:
tsql := fmt.Sprintf("DECLARE @tmp varbinary(max);"+
    "SET @tmp = CAST('%s' as varbinary(max));"+
    "EXEC BP_AddCorrespondenceIn @PatientID=1, @ContactName='Test', @Subject='First Test',"+
    "@Category='Report', @DocType='PDF', @Content = @tmp", content)
// Execute query
rows, err := db().Query(tsql)
Here content is a []byte. When I run the program, the query executes and I get the error below:
mssql: '3�Ze�
#��!~T��ϔljQ*���f1-~L���^ը;s;���.�)�[P�hjDN��J�.1��W�Zt���xq�\r���ן�)N���=df' is an invalid name because it contains a NULL character or an invalid unicode character.
Thank you!
I fixed the problem by changing the stored procedure call to _, err := db().Exec(tsql, content) and the varbinary(max) conversion to SET @tmp = CAST(? AS varbinary(max)); so the content is passed as a query parameter instead of being formatted into the SQL string.
Let's say I want to execute an insert in this connection, which is valid:
import pyodbc
CONNSTR = ("DRIVER={ODBC Driver 17 for SQL Server};"
           "SERVER=....database.windows.net,1433;"
           "UID=...;PWD=...;DATABASE=...")
connection = pyodbc.connect(CONNSTR, autocommit=True)
cursor = connection.cursor()
Then, I make this insert, which is valid:
cursor.execute("INSERT INTO [dbo].[products]([name], [regular_price], [sale_price], [type]) VALUES (?, ?, ?, ?)", ["Hello", 1.1, 1.1, "lalala"])
That is: I build the query using parameters and insert one single record. This works (assume the table is valid and accepts those 4 columns).
But when I use 2100 or more arguments, I get an error:
>>> cursor.execute("INSERT INTO [dbo].[products]([name], [regular_price], [sale_price], [type]) VALUES " + ", ".join("(?, ?, ?, ?)" for _ in range(525)), ["Hello", 1.1, 1.1, "lalala"] * 525)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The incoming request has too many parameters. The server supports a maximum of 2100 parameters. Reduce the number of parameters and resend the request. (8003) (SQLExecDirectW)')
>>> cursor.execute("INSERT INTO [dbo].[products]([name], [regular_price], [sale_price], [type]) VALUES " + ", ".join("(?, ?, ?, ?)" for _ in range(526)), ["Hello", 1.1, 1.1, "lalala"] * 526)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
pyodbc.Error: ('07002', '[07002] [Microsoft][ODBC Driver 17 for SQL Server]COUNT field incorrect or syntax error (0) (SQLExecDirectW)')
So it seems that using 2100 or more parameters is not allowed, yet I need to support inserting up to 1000 records like this (in fact, this will be user-driven, so I don't know in advance how many columns the table will have).
So my question is: how do I escape the arguments manually so I don't have to rely on this argument-placeholder approach (which is limited for inserts because of this)? Or, alternatively: is there a driver-enabled method in the ODBC adapters to insert a value through pyodbc that actually takes care of the escaping by itself?
Spoiler alert: no, there is no built-in method. What I had to do goes like this:
Compute the number of columns per row to insert.
Compute the number of rows to insert "per batch" as int(2099 / numcols).
Batch-insert using "insert into mytable({col_list}) values ...", passing args_chunk as the parameters.
In more detail:
col_list will be ", ".join(cols) and numcols will be len(cols), where cols is a list of the column names to insert.
args_chunk will be a flattened version of rows[index:index + batch_size], with index iterating over range(0, len(rows), batch_size).
The ... in the insert query will be ", ".join(["(?, ?...?, ?)"] * len(rows[index:index + batch_size])), where the number of question marks in each group is numcols.
So the logic goes like this (see the sketch after these steps):
Considering the number of columns (which will be at most 1024), insert an amount of rows that keeps the total number of arguments from exceeding 2099. Use this amount as a "safe" batch size.
Each iteration will insert that "safe" amount of rows.
The query, on each iteration, will have the appropriate number of row groups (and arguments).
The last iteration may have a different (lower) number of rows.
By the end, all of them will be safely inserted.
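A minimal sketch of that batching logic, assuming rows is a list of tuples and cursor is an open pyodbc cursor (function and variable names here are illustrative, not taken from the original code):

MAX_PARAMS = 2099  # stay below SQL Server's 2100-parameter limit

def batch_insert(cursor, table, cols, rows):
    col_list = ", ".join(cols)
    numcols = len(cols)
    placeholders = "(" + ", ".join("?" * numcols) + ")"  # e.g. "(?, ?, ?)"
    batch_size = MAX_PARAMS // numcols  # rows per batch

    for index in range(0, len(rows), batch_size):
        chunk = rows[index:index + batch_size]
        sql = ("INSERT INTO {} ({}) VALUES ".format(table, col_list)
               + ", ".join([placeholders] * len(chunk)))
        args_chunk = [value for row in chunk for value in row]  # flatten the batch
        cursor.execute(sql, args_chunk)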
I have a .csv file that gets updated every day. Below is an example of my .csv file.
I am pushing this .csv file into SQL Server using Python. My script reads the .csv file and uploads it into a SQL Server database.
This is my Python script:
import pandas as pd
import pyodbc
df = pd.read_csv("C:/Users/Dhilip/Downloads/test.csv")
print(df)

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=DESKTOP-7FCK7FG;'
                      'Database=test;'
                      'Trusted_Connection=yes;')
cursor = conn.cursor()

#cursor.execute('CREATE TABLE people_info (Name nvarchar(50), Country nvarchar(50), Age int)')

for row in df.itertuples():
    cursor.execute('''
                INSERT INTO test.dbo.people_info (Name, Country, Age)
                VALUES (?,?,?)
                ''',
                row.Name,
                row.Country,
                row.Age
                )
conn.commit()
The script is working fine. I am trying to automate my Python script using a batch file and Task Scheduler, and that is working fine too. However, whenever I add new data to the .csv file, SQL Server gets updated with the new data but at the same time the old data is inserted again, multiple times.
For example, if I add a new record called Israel, the output appears in SQL Server as below.
I need the output to be as below.
Can anyone advise me on the change I need to make in the above Python script?
You can use the query below in your Python script. IF NOT EXISTS will check whether the record already exists based on the condition in the WHERE clause; if the record does exist, control goes to the ELSE statement, where you can update it or do anything else.
Checking for existing records in the database is faster than checking in the Python script.
if not exists (select * from Table where Name = '')
begin
    insert into Table values('b', 'Japan', 70)
end
else
begin
    update Table set Age=54, Country='Korea' where Name = 'A'
end
To find existing duplicate records, use the query below:
select Name, count(Name) as dup_count from Table
group by Name having COUNT(Name) > 1
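A hedged sketch of wiring that pattern into the question's pyodbc loop, with the values bound as parameters instead of hard-coded literals (this assumes the same df, conn and cursor from the script above, and that Name identifies a row):

for row in df.itertuples():
    # Insert the row only if Name is not present yet; otherwise update it in place.
    cursor.execute('''
        IF NOT EXISTS (SELECT 1 FROM test.dbo.people_info WHERE Name = ?)
            INSERT INTO test.dbo.people_info (Name, Country, Age)
            VALUES (?, ?, ?)
        ELSE
            UPDATE test.dbo.people_info
            SET Country = ?, Age = ?
            WHERE Name = ?
    ''', row.Name, row.Name, row.Country, row.Age, row.Country, row.Age, row.Name)
conn.commit()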
I find duplicates like this:
import sqlite3


def find_duplicates(table_name):
    """
    Find duplicate rows inside a table.
    :param table_name:
    :return: list of duplicate rows
    """
    connection = sqlite3.connect("./k_db.db")
    cursor = connection.cursor()
    # Rows count as duplicates when they share the same
    # shot, seq, lower(user), date_time and written_by values.
    findduplicates = """ SELECT a.*
          FROM {} a
          JOIN (
            SELECT shot, seq, lower(user), date_time, written_by, COUNT(*)
            FROM {}
            GROUP BY shot, seq, lower(user), date_time, written_by
            HAVING count(*) > 1 ) b
          ON a.shot = b.shot
          AND a.seq = b.seq
          AND a.date_time = b.date_time
          AND a.written_by = b.written_by
          ORDER BY a.shot;""".format(
        table_name, table_name
    )
    # print(findduplicates)
    cursor.execute(findduplicates)
    records = cursor.fetchall()
    cursor.close()
    connection.close()
    return records
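For example (the table name here is just an illustration):

duplicate_rows = find_duplicates("shots")
for rec in duplicate_rows:
    print(rec)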
You could rephrase your insert so that it checks for the existence of the tuple before inserting:
for row in df.itertuples():
    cursor.execute('''
        INSERT INTO test.dbo.people_info (Name, Country, Age)
        SELECT ?, ?, ?
        WHERE NOT EXISTS (SELECT 1 FROM test.dbo.people_info
                          WHERE Name = ? AND Country = ? AND Age = ?)
        ''', (row.Name, row.Country, row.Age, row.Name, row.Country, row.Age,))
conn.commit()
An alternative to the above would be to add a unique index on (Name, Country, Age). Then, your duplicate insert attempts would fail and generate an error.
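A one-time statement to create that unique index could look like this (the index name is illustrative; run it once over the same pyodbc connection):

# Enforce uniqueness of (Name, Country, Age) at the database level so that
# duplicate inserts are rejected by SQL Server itself.
cursor.execute('''
    CREATE UNIQUE INDEX UX_people_info_Name_Country_Age
    ON test.dbo.people_info (Name, Country, Age)
''')
conn.commit()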
When I attempt to insert new data into the database via PowerShell, it works.
However, if the data is already in the database, I get an exception.
Someone had a similar problem (Catch exception calling "ExecuteNonQuery"), but I believe I am using the correct SQL statement in my PowerShell code, where I say SELECT 1:
$SQL_insert = "BEGIN
    IF NOT EXISTS (SELECT 1
                   FROM [dbo].[Group_Stats]
                   WHERE Date_of_Record = CAST(GETDATE() AS DATE)
                   AND [Group] = '$group')
    BEGIN
        INSERT INTO [dbo].[Group_Stats] ([Date_of_Record], [Group], [Windows_SEP_11],[Mac_SEP_11],[Windows_SEP_12],[Mac_SEP_12])
        VALUES (CAST(GETDATE() AS DATE), REPLACE ('$group', 'My Company\', ''), $win_sep_11, $mac_sep_11, $win_sep_12, $mac_sep_12)
    END
END"
Exception
Exception calling "ExecuteNonQuery" with "0" argument(s): "Violation of PRIMARY KEY constraint 'PK_Group_Stats'. Cannot insert duplicate key in object 'dbo.Group_Stats'.
And this is the database
Thanks
You are querying for the untrimmed group name, but when you insert, you call REPLACE() to strip 'My Company\' from $group. Trim $group first, then both the existence check and the insert can use the same value without calling REPLACE().
I'm inserting data from a MySQL table into a Postgres table, and my code is:
from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.orm import mapper, sessionmaker
import psycopg2


class TestTable(object):
    pass


class StoreTV(object):
    pass


if __name__ == "__main__":
    engine = create_engine('mysql://root@localhost:3306/irt', echo=False)
    Session = sessionmaker(bind=engine)
    session = Session()
    metadata = MetaData(engine)

    test_table = Table('test_1', metadata, autoload=True)
    store_tv_table = Table('roku_store', metadata, autoload=True)

    mapper(TestTable, test_table)
    mapper(StoreTV, store_tv_table)

    res = session.query(TestTable).all()
    print res[1].test_1col

    tv_list = session.query(StoreTV).all()
    for tv in tv_list:
        tv_data = dict()
        tv_data = {
            'title': tv.name,
            'email': tv.business_email
        }
        print tv_data

        conn = psycopg2.connect(database="db", user="user", password="pass", host="localhost", port="5432")
        print "Opened database successfully"
        cur = conn.cursor()
        values = cur.execute("Select * FROM iris_store")
        print values
        cur.execute("INSERT INTO iris_store(title, business_email) VALUES ('title':tv_data[title], 'business_email':tv_data[business_email])")
        print "Record created successfully"
        conn.commit()
        conn.close()
I am not able to read from or insert into the Postgres table, while I can get the data from the MySQL table successfully.
The error is:
something
{'email': 'name#example.com', 'title': "Some Name"}
Opened database successfully
None
Traceback (most recent call last):
File "/home/Desktop/porting.py", line 49, in
cur.execute("INSERT INTO iris_store(title, business_email) VALUES ('title':tv_data[title], 'business_email':tv_data[business_email])")
psycopg2.ProgrammingError: syntax error at or near ":"
LINE 1: ... iris_store(title, business_email) VALUES ('title':tv_data[t...
^
Usman
You have to check your data type for email before inserting the data, because to insert data from MySQL into Postgres both fields have to be of compatible types.
Click here; page 28 describes the data types of MySQL and Postgres.
Your main problem is that you have a SQL syntax error in your INSERT query. It should look something like this:
cur.execute("INSERT INTO iris_store(title, business_email) VALUES (%(title)s, %(email)s)", tv_data)
For reference, see: Passing parameters to SQL queries
Also, you probably don't want to create a new connection to your Postgres database for each single value in tv_list; you should move the connect and close calls outside of the for loop. Printing the whole table on each iteration also doesn't seem very useful.
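A hedged sketch of that restructuring, reusing the placeholder connection details from the question and committing once at the end:

conn = psycopg2.connect(database="db", user="user", password="pass",
                        host="localhost", port="5432")
cur = conn.cursor()

for tv in tv_list:
    tv_data = {'title': tv.name, 'email': tv.business_email}
    # Parameterized insert: psycopg2 fills %(title)s and %(email)s from tv_data
    cur.execute(
        "INSERT INTO iris_store (title, business_email) VALUES (%(title)s, %(email)s)",
        tv_data,
    )

conn.commit()
conn.close()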