Can pg_notify be as fast inside a trigger as outside?

I noticed that statements like SELECT pg_notify('foo', 'bar') are incredibly quick to execute.
BenchmarkNotify-8 257728 54542 ns/op
Then simple updates to random rows in a table are ~5x slower.
Init statement:
CREATE TABLE table1 (
id SERIAL PRIMARY KEY,
int INT
);
Benchmarked statement:
INSERT INTO table1 (id, int) VALUES($1, $2)
ON CONFLICT (id) DO UPDATE
SET int = $3
BenchmarkUpdate-8 44913 289502 ns/op
But updates to a table with a trigger that runs PERFORM pg_notify('foo', 'bar') are ~5x slower still.
Init statement:
CREATE TABLE table1 (
id SERIAL PRIMARY KEY,
int INT
);
CREATE OR REPLACE FUNCTION table1_fn() RETURNS TRIGGER AS $$
BEGIN
PERFORM pg_notify('foo', 'bar');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER table1_trg AFTER UPDATE ON table1 FOR EACH ROW EXECUTE PROCEDURE table1_fn();
Benchmarked statement:
INSERT INTO table1 (id, int) VALUES($1, $2)
ON CONFLICT (id) DO UPDATE
SET int = $3
BenchmarkUpdateTriggerNotify-8 8110 1502632 ns/op
I'd like to understand why this slowdown is so large and whether it can be avoided. I would expect BenchmarkUpdateTriggerNotify to be much closer to BenchmarkUpdate than it is.
Other Benchmarks I Tried
A benchmark on a table with a trigger that does nothing at all was 4% slower than BenchmarkUpdate, and a trigger that inserts into a different table was 12% slower. Nothing like the 5x slowdown a call to pg_notify causes, so I ruled out the mere presence of a trigger as the cause. The no-op variant looked roughly like the reconstruction below.
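For reference, that no-op-trigger variant can be written as a benchmark of the same shape as the ones in main_test.go under "How I Ran the Benchmarks" below; the exact code isn't shown here, so treat this as a reconstruction:
func BenchmarkUpdateTriggerNoop(b *testing.B) {
    ctx := context.Background()
    if err := sqlInit(ctx, `
        DROP TABLE IF EXISTS table1;
        CREATE TABLE table1 (
            id SERIAL PRIMARY KEY,
            int INT
        );
        CREATE OR REPLACE FUNCTION table1_noop_fn() RETURNS TRIGGER AS $$
        BEGIN
            RETURN NEW; -- no pg_notify, no other work
        END;
        $$ LANGUAGE plpgsql;
        CREATE TRIGGER table1_trg AFTER UPDATE ON table1 FOR EACH ROW EXECUTE PROCEDURE table1_noop_fn();
    `); err != nil {
        panic(err)
    }
    pool, err := newPool(ctx, concurrency)
    if err != nil {
        panic(err)
    }
    for i := 0; i < b.N; i++ {
        if err := pool.Send(ctx, `
            INSERT INTO table1 (id, int) VALUES($1, $2)
            ON CONFLICT (id) DO UPDATE
            SET int = $3`, i%tableSize, 2, 3); err != nil {
            panic(err)
        }
    }
    if err := pool.Close(); err != nil {
        panic(err)
    }
}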
I noticed that updating the same row every time, without any triggers, runs much closer to BenchmarkUpdateTriggerNotify's performance than to BenchmarkUpdate's. That made me think that maybe pg_notify prevents Postgres from parallelizing the workload in some way, but I'm only guessing here.
I also tried making the trigger a CONSTRAINT trigger and adding DEFERRABLE INITIALLY DEFERRED to it, to try to decouple the update from the sending of the notification. It slowed BenchmarkUpdateTriggerNotify down further by ~22% instead of speeding it up. The trigger definition I used was roughly the reconstruction below.
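The exact DDL isn't shown here; a reconstruction of the deferred constraint trigger, written as a Go raw-string constant so it can replace the plain CREATE TRIGGER statement in the init SQL of BenchmarkUpdateTriggerNotify:
const deferredTriggerDDL = `
    CREATE CONSTRAINT TRIGGER table1_trg
        AFTER UPDATE ON table1
        DEFERRABLE INITIALLY DEFERRED
        FOR EACH ROW EXECUTE PROCEDURE table1_fn();
`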
How I Ran the Benchmarks
Put this code in a file main_test.go in an empty folder:
package main

import (
    "context"
    "sync"
    "testing"

    "github.com/jackc/pgx/v4"
)

const (
    concurrency = 10
    tableSize   = 100
    connString  = "postgres://test:test@localhost:5432/test"
)

func sqlInit(ctx context.Context, sql string) error {
    conn, err := pgx.Connect(ctx, connString)
    if err != nil {
        return err
    }
    defer conn.Close(ctx)
    _, err = conn.Exec(ctx, sql)
    return err
}
type sqlQuery struct {
    sql  string
    args []interface{}
}

type pool struct {
    sqlQueryCh  chan sqlQuery
    wg          sync.WaitGroup
    err         error
    errMu       sync.Mutex
    nonNilErrCh chan struct{}
}

func newPool(ctx context.Context, cc int) (*pool, error) {
    ret := &pool{
        sqlQueryCh:  make(chan sqlQuery, cc),
        nonNilErrCh: make(chan struct{}),
    }
    for i := 0; i < cc; i++ {
        conn, err := pgx.Connect(ctx, connString)
        if err != nil {
            return nil, err
        }
        ret.wg.Add(1)
        go func(conn *pgx.Conn) {
            defer ret.wg.Done()
            defer conn.Close(ctx)
            for q := range ret.sqlQueryCh {
                if _, err := conn.Exec(ctx, q.sql, q.args...); err != nil {
                    ret.setErr(err)
                }
            }
        }(conn)
    }
    return ret, nil
}

func (p *pool) setErr(err error) {
    p.errMu.Lock()
    defer p.errMu.Unlock()
    if p.err == nil && err != nil {
        p.err = err
        close(p.nonNilErrCh)
    }
}

func (p *pool) Send(ctx context.Context, sql string, args ...interface{}) error {
    select {
    case p.sqlQueryCh <- sqlQuery{
        sql:  sql,
        args: args,
    }:
        return nil
    case <-ctx.Done():
        return ctx.Err()
    case <-p.nonNilErrCh:
        return p.err
    }
}

func (p *pool) Close() error {
    close(p.sqlQueryCh)
    p.wg.Wait()
    return p.err
}
func BenchmarkNotify(b *testing.B) {
    ctx := context.Background()
    if err := sqlInit(ctx, `
        DROP TABLE IF EXISTS table1;
        CREATE TABLE table1 (
            id SERIAL PRIMARY KEY,
            int INT
        );`); err != nil {
        panic(err)
    }
    pool, err := newPool(ctx, concurrency)
    if err != nil {
        panic(err)
    }
    for i := 0; i < b.N; i++ {
        if err := pool.Send(ctx, "SELECT pg_notify('foo', 'bar');"); err != nil {
            panic(err)
        }
    }
    if err := pool.Close(); err != nil {
        panic(err)
    }
}

func BenchmarkUpdate(b *testing.B) {
    ctx := context.Background()
    if err := sqlInit(ctx, `
        DROP TABLE IF EXISTS table1;
        CREATE TABLE table1 (
            id SERIAL PRIMARY KEY,
            int INT
        );
    `); err != nil {
        panic(err)
    }
    pool, err := newPool(ctx, concurrency)
    if err != nil {
        panic(err)
    }
    for i := 0; i < b.N; i++ {
        if err := pool.Send(ctx, `
            INSERT INTO table1 (id, int) VALUES($1, $2)
            ON CONFLICT (id) DO UPDATE
            SET int = $3`, i%tableSize, 2, 3); err != nil {
            panic(err)
        }
    }
    if err := pool.Close(); err != nil {
        panic(err)
    }
}

func BenchmarkUpdateTriggerNotify(b *testing.B) {
    ctx := context.Background()
    if err := sqlInit(ctx, `
        DROP TABLE IF EXISTS table1;
        CREATE TABLE table1 (
            id SERIAL PRIMARY KEY,
            int INT
        );
        CREATE OR REPLACE FUNCTION table1_fn() RETURNS TRIGGER AS $$
        BEGIN
            PERFORM pg_notify('foo', 'bar');
            RETURN NEW;
        END;
        $$ LANGUAGE plpgsql;
        DROP TRIGGER IF EXISTS table1_trg ON table1;
        CREATE TRIGGER table1_trg AFTER UPDATE ON table1 FOR EACH ROW EXECUTE PROCEDURE table1_fn();
    `); err != nil {
        panic(err)
    }
    pool, err := newPool(ctx, concurrency)
    if err != nil {
        panic(err)
    }
    for i := 0; i < b.N; i++ {
        if err := pool.Send(ctx, `
            INSERT INTO table1 (id, int) VALUES($1, $2)
            ON CONFLICT (id) DO UPDATE
            SET int = $3`, i%tableSize, 2, 3); err != nil {
            panic(err)
        }
    }
    if err := pool.Close(); err != nil {
        panic(err)
    }
}
Run go mod init example.com/m (followed by go mod tidy to fetch the pgx dependency).
Run the Postgres Docker image:
docker run -it -p 5432:5432 -e POSTGRES_USER=test -e POSTGRES_PASSWORD=test -e POSTGRES_DB=test postgres
Run the benchmarks in another window.
go test . --bench=.
goos: linux
goarch: amd64
pkg: example.com/m
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkUpdate-8 5182 309329 ns/op
BenchmarkUpdateTriggerNotify-8 895 1554109 ns/op
BenchmarkNotify-8 27082 43838 ns/op
PASS
ok example.com/m 7.840s

Related

How to query any table of RDS using Golang SDK

I am writing an AWS Lambda to query 10 different tables from RDS (SQL Server) using the Golang SDK. What I have learned so far is that we have to create a matching struct for a table in order to query it. But since I want to query 10 tables, I don't want to create a struct for every table, and a table's schema may change someday anyway.
Later, I want to create a CSV file per table as a backup of the queried data and upload it to S3. Is it possible to build the CSV file directly inside the Lambda, so that I can upload it straight to S3?
You can see my current code below.
func executeQuery(dbconnection *sql.DB) {
    println("\n\n----------Executing Query ----------")
    query := "select TOP 5 City,State,Country from IMBookingApp.dbo.Address"
    rows, err := dbconnection.Query(query)
    if err != nil {
        fmt.Println("Error:")
        log.Fatal(err)
    }
    println("rows", rows)
    defer rows.Close()
    count := 0
    for rows.Next() {
        var City, State, Country string
        err := rows.Scan(&City, &State, &Country)
        if err != nil {
            fmt.Println("Error reading rows: " + err.Error())
        }
        fmt.Printf("City: %s, State: %s, Country: %s\n", City, State, Country)
        count++
    }
}
This code can only work for the Address table, and not for other tables
I have also tried it with GORM
package main

import (
    "fmt"

    "github.com/jinzhu/gorm"
    _ "github.com/jinzhu/gorm/dialects/mssql"
)

type Currency struct {
    CurrencyId  int    `gorm:"column:CurrencyId;"`
    Code        string `gorm:"column:Code;"`
    Description string `gorm:"column:Description;"`
}

func main() {
    db, err := gorm.Open("mssql", "sqlserver://***")
    db.SingularTable(true)
    gorm.DefaultTableNameHandler = func(dbVeiculosGorm *gorm.DB, defaultTableName string) string {
        return "IMBookingApp.dbo.Currency"
    }
    fmt.Println("HasTable-Currency:", db.HasTable("ClientUser"))
    var currency Currency
    db.Debug().Find(&currency)
    fmt.Println("Currency:", currency)
    fmt.Println("Error", err)
    defer db.Close()
}
With both approaches I couldn't find a way to make the code generic over multiple tables. I would appreciate it if anyone can give me some suggestions or point me to some resources.
I did not test this code, but it should give you an idea of how to fetch rows into a string array.
defer rows.Close()
columns, err := rows.Columns()
if err != nil {
    panic(err)
}
for rows.Next() {
    receiver := make([]*string, len(columns))
    err := rows.Scan(receiver)
    if err != nil {
        fmt.Println("Error reading rows: " + err.Error())
    }
}
Go internally converts many types into strings - https://github.com/golang/go/blob/master/src/database/sql/convert.go#L219
If the data cannot be converted, you have 2 options:
Easy - update your SQL query to return strings or string-compatible data.
Complicated - use a slice of interface{} instead of a slice of *string and fill it with default values of the correct type based on rows.ColumnTypes(); a sketch of this follows right after this list. Later you will have to convert the real values into strings to save them into the CSV.
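A rough sketch of that second option (illustrative only; the package and helper names are made up, and ScanType() may be nil or a generic type for some drivers):
package tabledump

import (
    "database/sql"
    "fmt"
    "reflect"
)

// scanRowsAsStrings drains rows into [][]string regardless of the column
// types, by building one typed receiver per column from ColumnTypes().
func scanRowsAsStrings(rows *sql.Rows) ([][]string, error) {
    colTypes, err := rows.ColumnTypes()
    if err != nil {
        return nil, err
    }
    var out [][]string
    for rows.Next() {
        receivers := make([]interface{}, len(colTypes))
        for i, ct := range colTypes {
            st := ct.ScanType()
            if st == nil {
                st = reflect.TypeOf("") // fall back to string when the driver gives no hint
            }
            receivers[i] = reflect.New(st).Interface() // pointer to a zero value of the column's Go type
        }
        if err := rows.Scan(receivers...); err != nil {
            return nil, err
        }
        record := make([]string, len(colTypes))
        for i, r := range receivers {
            record[i] = fmt.Sprint(reflect.ValueOf(r).Elem().Interface())
        }
        out = append(out, record)
    }
    return out, rows.Err()
}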
The code below worked for me:
conn, _ := getConnection() // get database connection
rows, err := conn.Query(query)
if err != nil {
    fmt.Println("Error:")
    log.Fatal(err)
}
defer rows.Close()
columns, err := rows.Columns()
if err != nil {
    panic(err)
}
for rows.Next() {
    receiver := make([]string, len(columns))
    is := make([]interface{}, len(receiver))
    for i := range is {
        is[i] = &receiver[i]
        // each is[i] will be of type interface{} - compatible with Scan()
        // using the underlying concrete `*string` values from `receiver`
    }
    err := rows.Scan(is...)
    if err != nil {
        fmt.Println("Error reading rows: " + err.Error())
    }
    fmt.Println("receiver", receiver)
}
Reference: sql: expected 3 destination arguments in Scan, not 1 in Golang
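Since the original goal was a CSV backup per table, here is a sketch that writes any *sql.Rows to CSV with encoding/csv (assumptions: the column values are string-convertible, NULLs would need sql.NullString or a COALESCE in the query, and the names are illustrative):
package tabledump

import (
    "database/sql"
    "encoding/csv"
    "io"
)

// writeRowsAsCSV writes a header line with the column names, then one CSV
// record per row, scanning every column as a string.
func writeRowsAsCSV(rows *sql.Rows, w io.Writer) error {
    cols, err := rows.Columns()
    if err != nil {
        return err
    }
    cw := csv.NewWriter(w)
    if err := cw.Write(cols); err != nil {
        return err
    }
    record := make([]string, len(cols))
    receivers := make([]interface{}, len(cols))
    for i := range record {
        receivers[i] = &record[i] // same trick as above: *string wrapped in interface{}
    }
    for rows.Next() {
        if err := rows.Scan(receivers...); err != nil {
            return err
        }
        if err := cw.Write(record); err != nil {
            return err
        }
    }
    cw.Flush()
    if err := cw.Error(); err != nil {
        return err
    }
    return rows.Err()
}
The resulting buffer or temporary file can then be handed to the S3 uploader.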

BigQuery - fetch 1000000 records and do some process over data using goLang

I have 1,000,000 records inside BigQuery. What is the best way to fetch the data and process it using Go? I'm getting a timeout issue if I fetch all the data without a limit. I already increased the timeout to 5 minutes, but it takes more than 5 minutes.
I want to do some kind of streaming call or pagination implementation, but I don't know how to do that in Go.
var FetchCustomerRecords = func(req *http.Request) *bigquery.RowIterator {
    ctx := appengine.NewContext(req)
    ctxWithDeadline, _ := context.WithTimeout(ctx, 5*time.Minute)
    log.Infof(ctx, "Fetch Customer records from BigQuery")
    client, err := bigquery.NewClient(ctxWithDeadline, "ddddd-crm")
    q := client.Query(
        "SELECT * FROM Something")
    q.Location = "US"
    job, err := q.Run(ctx)
    if err != nil {
        log.Infof(ctx, "%v", err)
    }
    status, err := job.Wait(ctx)
    if err != nil {
        log.Infof(ctx, "%v", err)
    }
    if err := status.Err(); err != nil {
        log.Infof(ctx, "%v", err)
    }
    it, err := job.Read(ctx)
    if err != nil {
        log.Infof(ctx, "%v", err)
    }
    return it
}
You can read the table contents directly without issuing a query. This doesn't incur query charges, and provides the same row iterator as you would get from a query.
For small results, this is fine. For large tables, I would suggest checking out the new storage api, and the code sample on the samples page.
For a small table or simply reading a small subset of rows, you can do something like this (reads up to 10k rows from one of the public dataset tables):
func TestTableRead(t *testing.T) {
    ctx := context.Background()
    client, err := bigquery.NewClient(ctx, "my-project-id")
    if err != nil {
        t.Fatal(err)
    }
    table := client.DatasetInProject("bigquery-public-data", "stackoverflow").Table("badges")
    it := table.Read(ctx)
    rowLimit := 10000
    var rowsRead int
    for {
        var row []bigquery.Value
        err := it.Next(&row)
        if err == iterator.Done || rowsRead >= rowLimit {
            break
        }
        if err != nil {
            t.Fatalf("error reading row offset %d: %v", rowsRead, err)
        }
        rowsRead++
        fmt.Println(row)
    }
}
You can split your query into 10 chunks of 100,000 records each and run them in multiple goroutines, using SQL queries like
select * from somewhere order by id DESC limit 100000 offset 0
and, in the next goroutine,
select * from somewhere order by id DESC limit 100000 offset 100000
A sketch of this is shown below.
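A sketch of that idea with the cloud.google.com/go/bigquery client (project ID, table name, and page size are placeholders; note that LIMIT/OFFSET queries each still scan the table, so the Storage API mentioned in the other answer may be cheaper for very large exports):
package main

import (
    "context"
    "fmt"
    "log"
    "sync"

    "cloud.google.com/go/bigquery"
    "google.golang.org/api/iterator"
)

func main() {
    ctx := context.Background()
    client, err := bigquery.NewClient(ctx, "my-project-id")
    if err != nil {
        log.Fatal(err)
    }
    const pageSize = 100000
    const pages = 10
    var wg sync.WaitGroup
    for p := 0; p < pages; p++ {
        wg.Add(1)
        go func(offset int) {
            defer wg.Done()
            q := client.Query(fmt.Sprintf(
                "SELECT * FROM `my-dataset.somewhere` ORDER BY id DESC LIMIT %d OFFSET %d",
                pageSize, offset))
            it, err := q.Read(ctx)
            if err != nil {
                log.Printf("page at offset %d: %v", offset, err)
                return
            }
            for {
                var row []bigquery.Value
                err := it.Next(&row)
                if err == iterator.Done {
                    break
                }
                if err != nil {
                    log.Printf("read error at offset %d: %v", offset, err)
                    return
                }
                _ = row // replace with real processing
            }
        }(p * pageSize)
    }
    wg.Wait()
}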

sql: Scan error on column index 0, name "": unsupported Scan, storing driver.Value type int64 into type *main.SMSBlast?

I'm now trying a RESTful API where the column SequenceID is deliberately not auto-increment. My problem is with the gorm library: when I count like this, countSequenceId := db.Debug().Table("SMSBlast2").Count(&smsblast1), the result is sql: Scan error on column index 0, name "": unsupported Scan, storing driver.Value type int64 into type *main.SMSBlast.
package main

import (
    "encoding/json"
    "fmt"
    "log"
    "net/http"
    "strconv"
    "time"

    "github.com/gorilla/mux"
    "github.com/jinzhu/gorm"
    _ "github.com/jinzhu/gorm/dialects/mssql"
)

type SMSBlast struct {
    SequenceID     int        `gorm:"primary_key";column:"SequenceID"`
    MobilePhone    string     `gorm:"column:MobilePhone"`
    Output         string     `gorm:"column:Output"`
    WillBeSentDate *time.Time `gorm:"column:WillBeSentDate"`
    SentDate       *time.Time `gorm:"column:SentDate"`
    Status         *string    `gorm:"column:Status"`
    DtmUpd         time.Time  `gorm:"column:DtmUpd"`
}

func (SMSBlast) TableName() string {
    return "SMSBlast2"
}
func insertSMSBlast(w http.ResponseWriter, r *http.Request) {
    fmt.Println("New Insert Created")
    db, err := gorm.Open("mssql", "sqlserver://sa:@localhost:1433?database=CONFINS")
    if err != nil {
        panic("failed to connect database")
    }
    defer db.Close()
    vars := mux.Vars(r)
    mobilephone := vars["mobilephone"]
    output := vars["output"]
    var (
        smsblast1 SMSBlast
    )
    countSequenceId := db.Debug().Raw("SELECT COUNT (*) FROM SMSBlast2").Scan(&smsblast1)
    fmt.Println(countSequenceId)
    msg, err := json.Marshal(countSequenceId)
    if err != nil {
        fmt.Println(err.Error())
    }
    sequenceid1, _ := strconv.Atoi(string(msg))
    fmt.Println("SequenceID : ", sequenceid1)
    smsblasts := SMSBlast{SequenceID: sequenceid1, MobilePhone: mobilephone, Output: output, DtmUpd: time.Now()}
    prindata := db.Create(&smsblasts)
    fmt.Println(prindata)
}
func handleRequests() {
    myRouter := mux.NewRouter().StrictSlash(true)
    myRouter.HandleFunc("/smsblaststest", allSMSBlasts).Methods("POST")
    myRouter.HandleFunc("/smsblaststestInsert/{mobilephone}/{output}", insertSMSBlast).Methods("POST")
    log.Fatal(http.ListenAndServe(":8080", myRouter))
}

func main() {
    fmt.Println("SMSBLASTS ORM")
    handleRequests()
}
You seem to assume that Gorm methods return the result for you. That is not how it works: Gorm methods return the *gorm.DB handle so calls can be chained, and you pass a pointer to the variable where you want the result stored. So to use Count() you would write something like
var count int
db.Model(&SMSBlast{}).Count(&count)
fmt.Printf("count: %d\n", count)
That said, if you just want to make sure that you get a new sequence ID every time, why don't you use auto-increment?
type SMSBlast struct {
    SequenceID int `gorm:"column:SequenceID;primary_key;AUTO_INCREMENT"`
    ...
}
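An untested sketch of that auto-increment route (it assumes the SequenceID column is actually defined as IDENTITY in SQL Server, since the gorm tag alone does not change an existing table's schema; the struct is trimmed and the sample values are placeholders):
package main

import (
    "time"

    "github.com/jinzhu/gorm"
    _ "github.com/jinzhu/gorm/dialects/mssql"
)

type SMSBlast struct {
    SequenceID  int       `gorm:"column:SequenceID;primary_key;AUTO_INCREMENT"`
    MobilePhone string    `gorm:"column:MobilePhone"`
    Output      string    `gorm:"column:Output"`
    DtmUpd      time.Time `gorm:"column:DtmUpd"`
}

func (SMSBlast) TableName() string { return "SMSBlast2" }

func main() {
    db, err := gorm.Open("mssql", "sqlserver://sa:@localhost:1433?database=CONFINS")
    if err != nil {
        panic("failed to connect database")
    }
    defer db.Close()

    // SequenceID is left at its zero value so the database assigns it.
    blast := SMSBlast{MobilePhone: "0811111111", Output: "queued", DtmUpd: time.Now()}
    if err := db.Create(&blast).Error; err != nil {
        panic(err)
    }
}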

MSSQL leaking connections

I have a strange issue with Go's database/sql, probably in combination with denisenkom/go-mssqldb.
My code part:
func Auth(username string, password string, remote_ip string, user_agent string) (string, User, error) {
    var token string
    var user = User{}
    query := `exec dbo.sp_get_user ?,?`
    rows, err := DB.Query(query, username, password)
    if err != nil {
        return token, user, err
    }
    defer rows.Close()
    rows.Next()
    if err = rows.Scan(&user.Id, &user.Username, &user.Description); err != nil {
        log.Printf("SQL SCAN: Failed scan User in Auth. %v \n", err)
        return token, user, err
    }
    hashFunc := md5.New()
    hashFunc.Write([]byte(username + time.Now().String()))
    token = hex.EncodeToString(hashFunc.Sum(nil))
    query = `exec dbo.sp_set_session ?,?,?,?`
    _, err = DB.Exec(query, user.Id, token, remote_ip, user_agent)
    if err != nil {
        return token, user, err
    }
    return token, user, nil
}
Problem: defer rows.Close() does not seem to work properly.
After this, DB.Connection.Stats().OpenConnections always shows 2 open connections (and after repeated user logins it stays at 2 connections for the whole app lifecycle).
But if I rewrite the function as:
...
query := `exec dbo.sp_get_user ?,?`
rows, err := DB.Query(query, username, password)
if err != nil {
    return token, user, err
}
defer rows.Close()
rows.Next()
if err = rows.Scan(&user.Id, &user.Username, &user.Description); err != nil {
    log.Printf("SQL SCAN: Failed scan User in Auth. %v \n", err)
    return token, user, err
}
rows.Close()
...
Then the rows' underlying stmt is closed and DB.Connection.Stats().OpenConnections always shows 1 open connection.
DB in my app simply returns the underlying connection from sql.Open.
The problem only appears in this part, where two queries are executed with Query and Exec in one function.
Maybe Query and Exec use different connections, but I can't find that in the driver source or the database/sql source.
Thank you! (Sorry for my English.)
PS:
exec dbo.sp_get_user ?,? is a simple select from the user table, nothing more.
exec dbo.sp_set_session ?,?,?,? is a simple insert into the user table, nothing more.
In MSSQL, DBCC INPUTBUFFER shows me query = 'cast(@@identity as bigint)', which is executed in denisenkom/go-mssqldb mssql.go on line 593.
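As a side note (not part of the original question), the OpenConnections counter being watched here comes from database/sql's Stats(); a minimal sketch for observing and capping the pool, with a placeholder DSN:
package main

import (
    "database/sql"
    "log"

    _ "github.com/denisenkom/go-mssqldb"
)

func main() {
    db, err := sql.Open("sqlserver", "sqlserver://user:password@localhost:1433?database=mydb")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    db.SetMaxOpenConns(2) // cap the pool so growth beyond the expected size is easy to spot
    db.SetMaxIdleConns(2)

    if err := db.Ping(); err != nil {
        log.Fatal(err)
    }
    log.Printf("open connections: %d", db.Stats().OpenConnections)
}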

Call stored procedure on Microsoft SQL Server using ODBC driver

I have a stored procedure, let's call it "vijaystoredprocedure". If it were an ordinary query in MSSQL, I would query it in Go like this:
l_query_str = fmt.Sprintf(`select * from Users where Fname='%s'`, l_firstanme)
row, err := DBC.Query(l_query_str)
if err != nil {
    log.Fatal("Prepare failed:", err.Error())
}
_, rows, r_err := DBScan_fn(row)
if r_err != nil {
    fmt.Println("no data found err")
    return
}
Now, since I have to get values from a stored procedure, can someone suggest how to achieve this in Go?
I'm using the github.com/alexbrainman/odbc driver.
Example of executing a stored procedure:
proc := "exec Dbo.vijaystoredprocedure ?, ?, ?, ?" // (number of parameters)
parms := []interface{}{"parm1", "parm2", "parm3", "parm4"} // parameters if needed
if Stmt, err := DBC.Prepare(proc); err != nil {
    log.Fatal(err.Error())
} else {
    defer Stmt.Close()
    if _, err := Stmt.Exec(parms...); err != nil {
        log.Fatal(err.Error())
    }
}
Example of a stored function:
proc := "SELECT * From Dbo.[vijaystoredprocedure](?,?)" // (number of parameters)
parms := []interface{}{"parm1", "parm2"} // parameters if needed
row, err := DBC.Query(proc, parms...)
if err != nil {
    log.Fatal("Prepare failed:", err.Error())
}
_, rows, r_err := DBScan_fn(row)
if r_err != nil {
    fmt.Println("no data found err")
    return
}
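If the stored procedure itself returns rows (the question asks how to get values back from it), they can be read with Query instead of Exec; a sketch reusing the DBC handle from the snippets above, with placeholder column names and types:
rows, err := DBC.Query("exec Dbo.vijaystoredprocedure ?, ?", "parm1", "parm2")
if err != nil {
    log.Fatal("Query failed:", err.Error())
}
defer rows.Close()
for rows.Next() {
    var col1, col2 string // adjust to the procedure's actual result columns
    if err := rows.Scan(&col1, &col2); err != nil {
        log.Fatal(err.Error())
    }
    fmt.Println(col1, col2)
}
if err := rows.Err(); err != nil {
    log.Fatal(err.Error())
}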
