Perl script to populate a PostgreSQL database without using DBI

I have to make a Perl script populate a PostgreSQL database without using DBI or any other sort of database interface module. I am a beginner to scripting, so naturally I've been stuck on this for quite a while. I only have this much so far:
open my $pipe, '|-', 'psql -d postgres -U postgres'   # plus any options
    or die "cannot start psql: $!";
# NOT SURE WHAT TO DO AFTER THIS
close $pipe;
Edit 1: Now I'm trying to do this:
for ($count = $iters; $count >= 1; $count--) {
$randdecimal = rand();
$pipe "INSERT INTO random_table (runid, random_number) VALUES ($runid, $randdecimal)";
}
but it gives me a syntax error.

Like the others say, DBI is much better than printing to a pipe.
However, there is a halfway house. Just print all your SQL to STDOUT and then do something like:
myscript.pl | psql -v ON_ERROR_STOP=1 --single-transaction -f -
This lets you easily check your script output / send it to a file. The psql options stop on the first error, wrap everything in a transaction and read from STDIN. You might want the usual -h/-U options too.
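As an aside, the syntax error in the edit comes from the line that starts with $pipe: writing to a filehandle in Perl needs print, as in print $pipe "INSERT ...";. To see the halfway house in action, here is a minimal shell sketch of the same loop that only prints SQL to STDOUT; runid and iters are placeholders, and $RANDOM (an integer) merely stands in for the Perl rand() call:
#!/bin/bash
runid=1    # placeholder run id
iters=10   # placeholder iteration count
for ((i = 1; i <= iters; i++)); do
    # One INSERT per line; nothing touches the database yet.
    echo "INSERT INTO random_table (runid, random_number) VALUES ($runid, $RANDOM);"
done
Saved as myscript.sh, it plugs into the same pipeline: ./myscript.sh | psql -v ON_ERROR_STOP=1 --single-transaction -f -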
Personally, I tend to have two terminals open and just write to a .sql file then \i from a psql prompt. I like having a record of what command I ran.

Related

psql Batch File - Escaping "Not Equal" Operator

I'm working on a batch file that will import data into the PostgreSQL database I use for testing. The batch file drops all of the databases, then recreates/reloads them from a previous dump file made from our production database. However, I sometimes run into a problem if I've accidentally left a connection open to that server/database. The "drop" portion fails because there are still users connected (me).
I've been trying to "tweak" my batch file with a command to disconnect all users from the database(s) prior to issuing the command to drop them, but I can't get that part (disconnection) to work. I've taken the disconnect code from another SO question How to drop a PostgreSQL database if there are active connections to it?, and I've been looking at other questions like How to execute postgres' sql queries from batch file? for help with the syntax.
I've also seen the "alternate" syntax for a not equal operator on the 9.2. Comparison Functions and Operators page of the official PostgreSQL documentation, but that seems to also be using "special" characters that would require escaping, so I'm not sure how to proceed.
At this point, the batch file looks like this:
@Echo OFF
SET PGPASSWORD=PASSWORD
cd /D "C:\PostgreSQL\bin"
psql.exe -h localhost -p 5432 -d postgres -U username -c 'SELECT pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = ''betadb'' AND pid \<\> pg_backend_pid();'
dropdb.exe -h localhost -p 5432 -U username betadb
psql.exe -h localhost -p 5432 -d postgres -U username < "C:\PostgresSQL\prodserverdump.sql"
Everything else works except for the pg_terminate_backend query. Every time I run that, I get strange errors indicating a problem with a path, or a file, or something else like that. I believe I've narrowed the problem down to the "not equal" operator (<>) in the query, but I can't seem to find the correct way to escape this so it doesn't try to pipe in data from a file that's not being defined.
I've tried using single backslashes (\) and double backslashes (\\), in front of one or both of the characters in the operator, but that doesn't appear to work. Is there a special way to escape the "greater than" and "less than" characters for the -c command line option in psql?
Using a combination of suggestions and "trial & error", I believe I found the correct syntax for executing this particular SQL command through a batch file.
Trying the "alternative" not equal operator (!=), I was still getting errors. They were different errors (it was giving me some nonsense about too many parameters), but it still wouldn't execute.
Using @Compo's suggestion from the comments, I then tried to enclose the entire SELECT statement in double quotes instead of single quotes. Still not quite there.
Finally, I removed the "extra" single quotes I was using around the database names from before. The query appears to have executed properly.
The final result looks like this:
@Echo OFF
SET PGPASSWORD=PASSWORD
cd /D "C:\PostgreSQL\bin"
psql.exe -h localhost -p 5432 -d postgres -U username -c "SELECT pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'betadb' AND pid != pg_backend_pid();"
dropdb.exe -h localhost -p 5432 -U username betadb
psql.exe -h localhost -p 5432 -d postgres -U username < "C:\PostgresSQL\prodserverdump.sql"
I suppose I had assumed that, because all of the examples I had found used single quotes to surround the SQL statement, that was what I had to use. Apparently, that assumption was incorrect.
Regardless, it all seems to be working correctly now. Hope this helps someone else who's looking to accomplish something similar.
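For completeness: cmd.exe only treats < and > as redirection outside double quotes, so once the statement is wrapped in double quotes the standard <> operator should also work with no escaping at all. An untested sketch, with the same placeholders as above:
psql.exe -h localhost -p 5432 -d postgres -U username -c "SELECT pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'betadb' AND pid <> pg_backend_pid();"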

Managing error messages generated by Postgres \copy command

I am working on a tool to import large sets of files into a Postgres database. Currently I have a working prototype - a bash script going over the list of files, and using psql with the \copy command to import each file.
I would like to add some error handling; I'm thinking of parsing error messages to generate feedback for users, but I can't find a specification, or a list of error messages that are generated by the \copy command in particular.
Is there a tool, or a library, or even a reference list that I could use? I am constrained to use either Shell or Node with the Postgres module.
That should be fairly simple; just check the return code:
psql -c "\copy ${atable} FROM '${afile}' (FORMAT 'csv')"
if [ $? -ne 0 ]; then
    echo "copy failed!"
fi
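If you also want the text of the error message for user feedback, one option is to capture psql's stderr. A minimal sketch, reusing the variables from above:
# Keep stderr in $err for reporting; discard the normal output.
if ! err=$(psql -c "\copy ${atable} FROM '${afile}' (FORMAT 'csv')" 2>&1 >/dev/null); then
    echo "copy of ${afile} into ${atable} failed: ${err}" >&2
fi
There is no separate catalogue of \copy messages; what comes back are ordinary COPY errors from the server (plus client-side file errors), so the error codes appendix of the PostgreSQL documentation is the closest thing to a reference list.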

SQL Server sqlcmd: execute OS command in script

In SQL Server 2016, I am executing a SQL script through SQLCMD like this:
SQLCMD -H XXXXXX,1433 -U username -P password -d mydatabase
-v varMDF="testing" -i "Script.sql" -o "DATA.txt"
and in Script.sql, I want to echo some text to the console, just to see the progress. I have a while loop in the script and am executing the command
echo I am in sql script
as shown here:
OPEN tab_cursor
FETCH NEXT FROM tab_cursor INTO @tablename
WHILE @@FETCH_STATUS = 0
BEGIN
    !!echo i am in sql script
    PRINT @tablename
    FETCH NEXT FROM tab_cursor INTO @tablename
END
CLOSE tab_cursor
DEALLOCATE tab_cursor
The problem is, it displays the line "i am in sql script" only once in the console, but I can see many entries for tablename in my output file. Please help me solve this issue, or suggest another way to do this.
Thanks
I would try the following solutions in order:
1) Look into BCP; it might let you see what you are doing much more effectively, and depending on the size of your output file it may be significantly faster. (1b: look into SSIS, even though it's a huge pain)
2) Put a SQLCMD execution inside Script.sql that does the data push to the file, and have the PRINT statements work as normal without -o. (NOTE: If this is a Complicated Stored Procedure, why aren't you writing a Complicated Stored Procedure?)
3) Monkeying with server monitoring and profiler. This would be for debugging purposes only, if that's why you need the output.
Generally, it sounds to me like the source of your problem is that you're using the wrong tool for the job. If you want lots of output from SQLCMD on process status, you're probably using it where you should be using BCP, which is designed for doing exports programmatically. SQLCMD isn't all that great an interface for running complicated scripts, in my experience; it's best suited to fire-and-forget jobs.
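If you do go the BCP route from option 1, a minimal sketch looks like this (the server, credentials, and query are placeholders carried over from the question; -c writes plain character data):
bcp "SELECT * FROM mydatabase.dbo.mytable" queryout "DATA.txt" -S XXXXXX,1433 -U username -P password -c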

Insert SQL statements via command line without reopening connection to remote database

I have a large amount of data files to process and to be stored in the remote database. Each line of a data file represents a row in the database, but must be formatted before inserting into the database.
My first solution was to process the data files with bash scripts, producing SQL dump files, and then import those SQL files into the database. This solution seems too slow and, as you can see, involves the extra step of creating an intermediary SQL file.
My second solution was to write bash scripts that, while processing each line of the data file, create an INSERT INTO ... statement and send it to the remote database:
echo sql_statement | psql -h remote_server -U username -d database
i.e. it does not create an SQL file. This solution, however, has one major issue that I am seeking advice on:
Each time I have to reconnect to the remote database to insert one single row.
Is there a way to connect to the remote database, stay connected and then "pipe" or "send" the insert-SQL-statement without creating a huge SQL file?
Answer to your actual question
Yes. You can use a named pipe instead of creating a file. Consider the following demo.
Create a schema x in my database event for testing:
-- DROP SCHEMA x CASCADE;
CREATE SCHEMA x;
CREATE TABLE x.x (id int, a text);
Create a named pipe (fifo) from the shell like this:
postgres@db:~$ mkfifo --mode=0666 /tmp/myPipe
Either 1) call the SQL command COPY using a named pipe on the server:
postgres@db:~$ psql event -p5433 -c "COPY x.x FROM '/tmp/myPipe'"
This will acquire an exclusive lock on the table x.x in the database. The connection stays open until the fifo gets data. Be careful not to leave this open for too long! You can call this after you have filled the pipe to minimize blocking time. You can choose the sequence of events: the command executes as soon as two processes bind to the pipe, and the first waits for the second.
Or 2) you can execute SQL from the pipe on the client:
postgres@db:~$ psql event -p5433 -f /tmp/myPipe
This is better suited for your case. Also, there are no table locks until the SQL is executed in one piece.
Bash will appear blocked. It is waiting for input to the pipe. To do it all from one bash instance, you can send the waiting process to the background instead. Like this:
postgres@db:~$ psql event -p5433 -f /tmp/myPipe 2>&1 &
Either way, from the same bash or a different instance, you can fill the pipe now.
Demo with three rows for variant 1):
postgres@db:~$ echo '1 foo' >> /tmp/myPipe; echo '2 bar' >> /tmp/myPipe; echo '3 baz' >> /tmp/myPipe;
(Take care to use tabs as delimiters or instruct COPY to accept a different delimiter using WITH DELIMITER 'delimiter_character')
That will trigger the pending psql with the COPY command to execute and return:
COPY 3
Demo for variant 2):
postgres@db:~$ (echo -n "INSERT INTO x.x VALUES (1,'foo')" >> /tmp/myPipe; echo -n ",(2,'bar')" >> /tmp/myPipe; echo ",(3,'baz')" >> /tmp/myPipe;)
INSERT 0 3
Delete the named pipe after you are done:
postgres@db:~$ rm /tmp/myPipe
Check success:
event=# select * from x.x;
id | a
----+-------------------
1 | foo
2 | bar
3 | baz
Useful links for the code above
Reading compressed files with postgres using named pipes
Introduction to Named Pipes
Best practice to run bash script in background
Advice you may or may not need
For bulk INSERT you have better solutions than a separate INSERT per row. Use this syntax variant:
INSERT INTO mytable (col1, col2, col3) VALUES
(1, 'foo', 'bar')
,(2, 'goo', 'gar')
,(3, 'hoo', 'har')
...
;
Write your statements to a file and do one mass INSERT like this:
psql -h remote_server -U username -d database -p 5432 -f my_insert_file.sql
(5432 or whatever port the db-cluster is listening on)
my_insert_file.sql can hold multiple SQL statements. In fact, it's common practice to restore / deploy whole databases like that. Consult the manual about the -f parameter, or in bash: man psql.
Or, if you can transfer the (compressed) file to the server, you can use COPY to insert the (decompressed) data even faster.
You can also do some or all of the processing inside PostgreSQL. For that you can COPY TO (or INSERT INTO) a temporary table and use plain SQL statements to prepare and finally INSERT / UPDATE your tables. I do that a lot. Be aware that temporary tables live and die with the session.
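A minimal sketch of that staging pattern, with placeholder file, table, and column names (none of these are from the question). The here-document keeps everything in one psql session, so the temporary table survives until the final INSERT:
psql -h remote_server -U username -d database <<'SQL'
CREATE TEMP TABLE staging (col1 text, col2 text);
\copy staging FROM '/path/to/datafile' WITH (FORMAT csv)
-- transform while moving the rows into the real table
INSERT INTO mytable (col1, col2)
SELECT trim(col1), upper(col2) FROM staging;
SQL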
You could use a GUI like pgAdmin for comfortable handling. A session in an SQL Editor window remains open until you close the window. (Therefore, temporary tables live until you close the window.)
I know I'm late to the party, but why couldn't you combine all your INSERT statements into a single string, with a semicolon marking the end of each statement? (Warning! Pseudocode ahead...)
Instead of:
for each line
sql_statement="INSERT whatever YOU want"
echo $sql_statement | psql ...
done
Use:
sql_statements=""
for each line
sql_statement="INSERT whatever YOU want;"
sql_statements="$sql_statements $sql_statement"
done
echo $sql_statements | psql ...
That way you don't have to create anything on your filesystem, do a bunch of redirection, run any tasks in the background, remember to delete anything on your filesystem afterwards, or even remind yourself what a named pipe is.
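A concrete, runnable version of that pseudocode, assuming one value per input line and placeholder file/table names:
sql_statements=""
while IFS= read -r line; do
    # Append one statement per input line; the semicolon separates them.
    sql_statements+="INSERT INTO mytable (val) VALUES ('${line}');"
done < datafile.txt
echo "$sql_statements" | psql -h remote_server -U username -d database
The quoting here is naive: a single quote inside a line will break the SQL, so escape your values (or fall back to COPY as described above) for untrusted input.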

On-the-fly compression of stdin failing?

From what was suggested here, I am trying to pipe the output from sqlcmd to 7zip so that I can save disk space when dumping a 200GB database. I have tried the following:
> sqlcmd -S <DBNAME> -Q "SELECT * FROM ..." | .\7za.exe a -si <FILENAME>
This does not seem to be working even when I leave the system for a whole day. However, the following works:
> sqlcmd -S <DBNAME> -Q "SELECT TOP 100 * FROM ..." | .\7za.exe a -si <FILENAME>
and even this one:
> sqlcmd -S <DBNAME> -Q "SELECT * FROM ..."
When I remove the pipe symbol, I can see the results, and I can even redirect them to a file, which finishes in 7 hours.
I am not sure what is going on when piping a large amount of output, but what I understand so far is that 7zip seems to wait to consume the whole input before creating an archive file (I don't see a file being created at all), so I am not sure it is actually performing on-the-fly compression. So I tried gzip, and here's my experience:
> echo "Test" | .\gzip.exe > test.gz
> .\gzip.exe -d test.gz
gzip: test.gz: not in gzip format
I am not sure I am doing this the right way. Any suggestions?
Oh boy! It was PowerShell all along! I have no idea why this happens, at least with gzip; gzip kept complaining that the input was not in gzip format. I switched over to the normal command prompt and everything started working.
I had observed this before: | and > behave slightly differently in PowerShell and the command prompt. The likely culprit is that PowerShell pipelines and redirection handle data as text, decoding and re-encoding the stream (Windows PowerShell's > writes UTF-16), which corrupts binary output such as a gzip stream, whereas cmd.exe passes raw bytes through. If someone knows more about it, please add it here.
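A quick round trip to confirm the difference from cmd.exe (gzip.exe assumed to be in the current directory):
echo Test | gzip.exe > test.gz
gzip.exe -d test.gz
type test
Run from cmd.exe, this prints Test; run from Windows PowerShell, the first line already produces a file that gzip rejects, matching the behaviour described above.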
