How to save a file in PostgreSQL and retrieve it?

I am trying to get customer details such as a photograph, Aadhaar, other identity documents, and bank details in PDF format. I want these details to be saved in PostgreSQL. Which is the best format to store them in the DB? I will also need access to those files in the DB whenever I need them. There won't be much updating of the files, though.
I have the file on my Golang server as a multipart.File.
type DocumentsAll struct {
    Photograph    multipart.File `json:"photograph"`
    Identityproof multipart.File `json:"identityproof"`
}
This is the struct that holds all the files.
My column in the DB is as follows:
ALTER TABLE LOANS ADD COLUMN Documentlist bytea;
i.e. the column that contains the documents is of data type bytea.
When I try to save the documents this way, this is what gets stored in the DB:
\x7b2270686f746f6772617068223a7b7d2c22706f69223a6e756c6c2c22706f72223a6e756c6c2c2273616c617279736c697073223a6e756c6c2c2262616e6b73746174656d656e74223a6e756c6c2c2263686571756573223a6e756c6c2c22706f6a223a6e756c6c2c22706f6f223a6e756c6c2c22706f71223a6e756c6c2c226c6f616e41677265656d656e74223a6e756c6c2c227369676e766572696679223a6e756c6c2c22656373223a6e756c6c2c227365637572697479636865717565223a6e756c6c2c22706f74223a6e756c6c2c22697472223a6e756c6c2c226c6f616e73746174656d656e74223a6e756c6c7d
So when I try to read it back, I am not able to retrieve the file.
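That hex value is simply the JSON encoding of the DocumentsAll struct (it decodes to {"photograph":{},"poi":null,...}). A multipart.File is only a handle to the uploaded stream, so marshalling the struct never writes the actual file contents. The usual approach is to read the bytes out of each multipart.File with io.ReadAll and store them directly in the bytea column, then read them back with a plain SELECT. Below is a minimal sketch assuming database/sql with the github.com/lib/pq driver; the saveDocument/loadDocument helpers and the loan_id key column are illustrative assumptions, not part of the original schema.

package loandocs

import (
    "database/sql"
    "io"
    "mime/multipart"

    _ "github.com/lib/pq" // assumed Postgres driver; pgx's database/sql driver works the same way
)

// saveDocument reads the uploaded file into memory and stores the raw bytes
// in the bytea column instead of JSON-marshalling the multipart.File handle.
func saveDocument(db *sql.DB, loanID int64, f multipart.File) error {
    data, err := io.ReadAll(f) // the actual file contents
    if err != nil {
        return err
    }
    _, err = db.Exec(
        `UPDATE LOANS SET Documentlist = $1 WHERE loan_id = $2`,
        data, loanID,
    )
    return err
}

// loadDocument returns the stored bytes; they can be written to disk or
// served over HTTP with the appropriate Content-Type.
func loadDocument(db *sql.DB, loanID int64) ([]byte, error) {
    var data []byte
    err := db.QueryRow(
        `SELECT Documentlist FROM LOANS WHERE loan_id = $1`,
        loanID,
    ).Scan(&data)
    return data, err
}

If each loan has several documents, a separate documents table with one bytea row per file is usually easier to work with than packing everything into a single column.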

Related

Azure Data Factory: Lookup varbinary column in SQL DB for use in a Script activity to write to another SQL DB - ByteArray is not supported

I'm trying to insert into an on-premises SQL database table called PictureBinary:
[screenshot: PictureBinary table]
The source of the binary data is a table in another on-premises SQL database called DocumentBinary:
[screenshot: DocumentBinary table]
I have a file with all of the IDs of the DocumentBinary rows that need copying. I feed those into a ForEach activity from a Lookup activity. Each of these files has about 180 rows (there are 50 files, each fed into its own instance of the pipeline, running in parallel).
[screenshot: Lookup and ForEach activities]
So far everything is working. But then, inside the ForEach, I have another Lookup activity that tries to get the binary data to pass to a script that will insert it into the other database.
[screenshot: Lookup of the binary column]
The Script activity would then insert the binary data into the PictureBinary table (in the other database).
[screenshot: Script to insert binary data]
But when I debug the pipeline, I get this error when the binary column Lookup is reached:
ErrorCode=DataTypeNotSupported,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Column: coBinaryData,The data type ByteArray is not supported from the column named coBinaryData.,Source=,'
I know that the accepted way of storing files would be to keep them on the filesystem and only store the file paths in the database. But we are using a NOP database that stores the files in varbinary columns.
Also, if there is a better way of doing this, please let me know.
I tried to reproduce your scenario in my environment and got a similar error.
As per the Microsoft documentation, columns with the data type Byte Array are not supported in the Lookup activity, which is most likely the cause of the error.
To work around this, follow the steps below:
As you explained, you have a file that stores the IDs of all the DocumentBinary rows that need to be copied to the destination. To achieve this, you can simply use a Copy activity with a query that copies the records where the DocumentBinary ID column equals an ID stored in the file.
First, I used a Lookup activity to get the IDs of the DocumentBinary rows stored in the file.
Then I added a ForEach activity and passed the output of the Lookup activity to it.
After this, I added a Copy activity inside the ForEach activity with the following query:
Select * from DocumentBinary
where coDocumentBinaryId = '#{item().PictureId}'
In the source of the Copy activity, select Query under Use query and pass the above query with your own table and column names.
Now go to Mapping, click Import schemas, then delete the unwanted columns and map the remaining columns accordingly.
Note: for this to work, the corresponding columns in both tables must have matching data types, e.g. both uniqueidentifier or both int.
Sample input in the file:
Output (only the picture IDs contained in the file were copied from source to destination):

How to read from one DB but write to another using Snowflake's Snowpark?

I'm SUPER new to Snowflake and Snowpark, but I do have respectable SQL and Python experience. I'm trying to use Snowpark to do my data prep and eventually use it in a data science model. However, I cannot write to the database I'm pulling from -- I need to create all tables in a second DB.
I've created code blocks to represent both input and output DBs in their own sessions, but I'm not sure that's helpful, since I have to be in the first session in order to even get the data.
I use code similar to the following to create a new table while in the session for the "input" DB:
my_table = session.table("<SCHEMA>.<TABLE_NAME>")
my_table.toPandas()
table_info = my_table.select(col("<col_name1>"),
                             col("<col_name2>"),
                             col("<col_name3>").alias("<new_name>"),
                             col("<col_name4>"),
                             col("<col_name5>")
                             )
table_info.write.mode('overwrite').saveAsTable('MAINTABLE')
I need to save the table MAINTABLE to a secondary database that is different from the one where the data was pulled from. How do I do this?
It is possible to provide a fully qualified name:
table_info.write.mode('overwrite').saveAsTable('DATABASE_NAME.SCHEMA_NAME.MAINTABLE')
DataFrameWriter.save_as_table
Parameters:
table_name – A string or list of strings that specify the table name or fully-qualified object identifier (database name, schema name, and table name).

DEMO_DB and UTIL_DB in Snowflake

When we create a new Snowflake account, it comes with two blank databases, DEMO_DB and UTIL_DB. I have always wondered what the reason is for providing two blank databases. Wouldn't DEMO_DB be enough? Does anyone know why UTIL_DB is required?
As you can see from the comment on the database, UTIL_DB is a "utility database".
As far as I know, DEMO_DB does not contain any file format objects when the account is created. Sample file format objects (e.g. single_column_rows, csv, csv_dq, tsv, psv and more) are created under UTIL_DB. I'm not talking about file format types; these are sample file format objects!
Additionally, it contains two utility functions:
SELECT * FROM TABLE(util_db.public.sfwho());
which returns a row containing the current timestamp, account name, user, role, database, schema and warehouse information.
SELECT util_db.public.decode_uri('https%3A%2F%2Fgokhanatil.com');
which decodes an encoded URI (calling the decodeURIComponent function):
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/decodeURIComponent

In Mule 4, how can I save the data I retrieve from a database using the Database connector?

Suppose I retrieve a row having a particular ID using the Database connector. If I have to save a particular field, let's say name, in a variable for later use, how can I do that?
A very crude example would look like the one below:
You would need to go to the Advanced section under the DB connector, where you can configure the script to obtain the column values in the Output section.

Read a single text file and, based on a particular column value, load each record into its respective table

I have been searching the internet for a solution to my problem but cannot seem to find any info. I have a large single text file (10 million rows), and I need to create an SSIS package to load these records into different tables based on the transaction group assigned to each record. That is, Tx_grp1 would go into the Tx_Grp1 table, Tx_grp2 would go into the Tx_Grp2 table, and so forth. There are 37 different transaction groups in the single delimited text file, and records are inserted into this file in the order in which they actually occurred (by time). Also, each transaction group has a different number of fields.
Sample data file
date|tx_grp1|field1|field2|field3
date|tx_grp2|field1|field2|field3|field4
date|tx_grp10|field1|field2
.......
Any suggestion on how to proceed would be greatly appreciated.
This task can be solved with SSIS with just a bit of experience. Here are the main steps and some discussion:
Define a Flat File data source for your file, describing all columns. A possible problem here: the data types of the fields differ depending on the tx_group value. If this is the case, I would declare all fields as sufficiently long strings and convert their types later in the data flow.
Create an OLE DB connection manager for the DB you will use to store the results.
Create a main data flow where you will process the file, and add a Flat File Source.
Add a Conditional Split to the output of the Flat File Source and define there as many filters and outputs as you have transaction groups.
For each transaction group output, add a Data Conversion for the fields if necessary. Note that you cannot change the data type of an existing column; if you need to cast a string to an int, create a new column.
Add an OLE DB Destination for each destination table. Connect it to the proper transaction group data flow and map the fields.
Basically, you are done. Test the package thoroughly on a test DB before using it on a production DB.
