Errors in data transfer due to quotes in column fields - database

I am trying to do a standard data transfer from a bunch of tables (from CSV, DB2 and PostgreSQL) into a PostgreSQL instance. The tool I am using is DBeaver Lite's Data Transfer task.
However, I am hitting a large number of errors, and I have tracked them down to the actual content of the data.
What I see in the data are examples like this:
Resulting in the following error:
I believe the errors arise because of the " characters in the data, since these are used to quote identifiers like so:
SELECT "Column1", "Column2" FROM "schema"."table"
The settings I am using are:
Can anyone think of a way to solve this?
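For reference, PostgreSQL itself copes with embedded double quotes as long as the CSV escapes them: inside a quoted field, a literal " has to be doubled. A minimal sketch of loading such a file with COPY, reusing the identifiers from the query above (the file path is a placeholder):
-- A CSV field containing a literal quote must double it, e.g.
--   1,"5"" pipe fitting"
COPY "schema"."table" ("Column1", "Column2")
FROM '/tmp/export.csv'
WITH (FORMAT csv, HEADER true, QUOTE '"', ESCAPE '"');
If DBeaver's Data Transfer settings expose the quote and escape characters, aligning them with how the source data is actually quoted may resolve the errors without leaving the tool.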

Related

Count how many fields were cleansed and which fields, in SSIS

I'm doing an exercise in which I have to clean data from a Flat File Source and write it to my database. I have already managed to clean all of the fields by applying data quality rules to each field, and I also generate error codes, which I write to a different table whenever a rule is broken.
My problem is that for the final step of the exercise I have to produce some Power BI graphics showing how many fields were fixed from the source and which fields were cleansed. The only approach I have thought of is comparing the DB table to the flat file source, or maybe doing something with script components, but I don't think those are really good solutions.
Has anybody encountered this problem? If somebody could point me to information about something like this, it would be great. Thanks!
If I were facing a similar issue, I would do this in three steps:
Import the data without any transformation into a staging table.
Clean the data and load it into the destination table.
Compare the staging and destination tables to get how many values were fixed (a sketch of this comparison follows below).
From a design standpoint, establishing a key is essential before starting to clean.
You could use the SSIS Derived Column transformation to create a business key that is a concatenation of the available fields, using FINDSTRING and other string functions, so that each row gets a unique key.
Similar to the step above, add a column in your staging table, or use a derived column (depending on whether you are cleaning up in SQL or with SSIS tasks), to indicate whether each value was cleaned.
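A minimal sketch of the comparison in step 3, assuming the business key from the Derived Column step was landed in both tables and that CustomerName is one of the cleansed fields (all names here are placeholders):
-- Count how many values of one field were changed by the cleansing step
SELECT COUNT(*) AS CustomerNameFixed
FROM dbo.StagingTable AS s
JOIN dbo.DestinationTable AS d
    ON d.BusinessKey = s.BusinessKey
WHERE ISNULL(s.CustomerName, '') <> ISNULL(d.CustomerName, '');
Repeating this per column (or unpivoting both tables first) gives the per-field counts that Power BI can chart directly.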

shp2pgsql "ERROR: INSERT has more target columns than expressions"

I'm using the script provided by loader_generate_script in PostGIS to load TIGER census data, and I'm getting an error while loading EDGE data. I was able to locate the error by appending -v ON_ERROR_STOP=1 to all psql calls.
NOTICE: INSERT INTO tiger_data.al_edges(statefp,countyfp,tlid,tfidl,tfidr,mtfcc,fullname,smid,lfromadd,ltoadd,rfromadd,rtoadd,zipl,zipr,featcat,hydroflg,railflg,roadflg,olfflg,passflg,divroad,exttyp,ttyp,deckedroad,artpath,persist,gcseflg,offsetl,offsetr,tnidf,tnidt,the_geom) SELECT statefp,countyfp,tlid,tfidl,tfidr,mtfcc,fullname,smid,lfromadd,ltoadd,rfromadd,rtoadd,zipl,zipr,featcat,hydroflg,railflg,roadflg,olfflg,passflg,exttyp,ttyp,deckedroad,artpath,persist,gcseflg,offsetl,offsetr,tnidf,tnidt,the_geom FROM tiger_staging.al_edges;
ERROR: INSERT has more target columns than expressions
LINE 1: ...tpath,persist,gcseflg,offsetl,offsetr,tnidf,tnidt,the_geom) ...
The error says pretty clearly that the columns don't match (the SELECT statement is missing a column called divroad). How do I mitigate that in shp2pgsql, since that is what creates the INSERT statement? Any help is appreciated!
Turns out the issue stems from my PostGIS version. The Azure managed Postgres service uses POSTGIS="2.3.2 r15302" as the PostGIS extension, and it looks like the TIGER2017 data issues were patched in PostGIS 2.4.1. The solution is to upgrade PostGIS, though that may mean not using the Azure managed service for now.
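If it helps anyone in the same spot, the version can be confirmed and, where the platform allows it, upgraded from SQL (the target version below is illustrative; on Azure's managed service the available extension versions are controlled by the platform, so the ALTER may not be possible there):
-- Check the installed PostGIS version
SELECT PostGIS_Full_Version();
-- Upgrade the extension if a newer version is installed on the server
ALTER EXTENSION postgis UPDATE TO '2.4.1';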

"Conversion failed because the data value overflowed the specified type" error applies to only one column of the same table

I am trying to import data from an Access database file into SQL Server. To do that, I created an SSIS package through the SQL Server Import/Export Wizard. All tables passed validation when I executed the package through the Execute Package Utility with the "validate without execution" option checked. However, during the execution I received the following chunk of errors (shown as a picture, since a blockquote uses a lot of space):
Upon investigation, I found exactly which table and column were causing the problem. However, this is a problem I have been trying to solve for a couple of days now, and I'm running dry on possible options.
Structure of the troubled table column
As noted in the error list, the trouble occurs in the RHF Repairs table on the Date Returned column. In Access, the column in question is of Date/Time type. Inside the actual table, all inputs are in the form 'mmddyy', which, when clicked on, turns into 'mm/dd/yyyy' format:
In the SSIS package, it created an OLE DB Source/Destination relationship like the following:
Inside this relationship, the data type in both the output columns and the external columns is DT_DATE (I still think this is a key cause of my problems). What bugs me the most is that the column adjacent to Date Returned is exactly the same as what I described above, and none of the errors applied to it or to any other column of the same type; Date Returned is literally the only black sheep in the flock.
What I have tried
I have tried every option from the following thread, and the error remains the same.
I tried the Data Conversion transformation, converting this column to a timestamp or even a Unicode string. It didn't work.
I tried to specify the data type with the Advanced Editor on the source, to both a timestamp and a Unicode string. I tried specifying it only in the output columns, then in both the external and output columns; same result.
Plowing through the data in the Access table also did not give me anything. All of the values use the same 6-character formatting throughout.
At this point, I have literally exhausted all the options I could think of. Can you please point me in the right direction on what else I could possibly try to resolve this? It has been driving me nuts for the last two days.
PS: On my end, I will plow through each row individually, while trying not to get discouraged by the fact that there are 4000+ row entries...
UPDATE:
I resolved this matter by plowing through the data. There were 3 faulty entries among the 4000+ rows... Since the issue was resolved in a manner unlikely to help others, please close this question.
It sounds to me like you have one or more bad dates in the column. With 4,000 rows, I actually would visually scan and look for something very short or very long.
You could also change your source to select only the top 1 row instead of all 4,000. Does that row insert? If so, that lends weight to the bad-date scenario. If even 1 row does not flow through, it is another issue.
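If visually scanning 4,000 rows is impractical, a quick query in Access can surface the suspects (a sketch using the table and column names from the question; the year bounds are arbitrary):
SELECT [Date Returned]
FROM [RHF Repairs]
WHERE Year([Date Returned]) < 1900 OR Year([Date Returned]) > 2100;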
(I will just share my experience of how I overcame this problem, in case it helps someone.)
My scenario:
One of the columns, Identifier, in the OLE DB data source had changed from int to bigint. I was getting the error message "Conversion failed because the data value overflowed the specified type."
Basically, it was telling me that the source data size was greater than the destination data size.
What I have tried:
In both the OLE DB data source and the destination, I clicked "Show Advanced Editor" and checked that the data type of Identifier was bigint. But I was still getting the error message.
The solution that worked for me:
In the OLE DB data source --> Show Advanced Editor --> Input and Output Properties --> OLE DB Source Output, there are two sets of columns: External Columns and Output Columns.
In my case, although the Identifier column under External Columns showed the data type bigint, under Output Columns it showed int. So I changed that data type to bigint, and it solved my problem.
Every now and then I get this problem, especially when I have a big table with lots of data.
I hope it helps.
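A quick way to confirm this kind of overflow is to check whether the source values really exceed the 32-bit int range (a sketch; the table name is a placeholder, Identifier is the column from the scenario above):
SELECT COUNT(*) AS OverflowingRows
FROM dbo.SourceTable
WHERE Identifier > 2147483647 OR Identifier < -2147483648;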
We had this error when someone had entered the year as 216 instead of 2016. The data source was reading the data fine, but it was failing on the OLE DB destination task.
We use a script task in the data flow for validation. By adding a check that dates aren't too far in the past, we are able to trap this kind of error and at least generate a meaningful error message so the problem can be found and corrected quickly.

How to store XML result of WebService into SQL Server database?

We have a .NET client that calls a web service. We want to store the result in a SQL Server database.
I think we have two options here for how to store the data, and I am a bit undecided, as I can't see the pros and cons clearly. One would be to map the results into database fields. That would require us to have database fields corresponding to each possible result type, e.g. for each "normal" result type as well as for faults.
On the other hand, we could store the resulting XML and query it via SQL Server's built-in XML functions.
Personally, I am comfortable with dealing with both SQL and XML, so both look fine to me.
Are there any big pros and cons, and what would I need to consider in terms of database design when trying to store the resulting XML for quite a few different possible web service operations? I was thinking about a result table for each operation that we call, with different entries for the different possible outcomes/types, and then storing the XML in the right field, e.g. a fault in the fault field, a "normal" return type in the appropriate field, etc.
We use a combination of both. XML for reference and detailed data, and text columns for fields you might search on. Searchable columns include order number, customer reference, ticket number. We just add them when we need them since you can extract them from the XML column.
I wouldn't recommend just the XML. If you store 10,000 messages a day, a query like:
select * from XmlLogging with (nolock) where Response like '%Order12%'
can become slow and interfere with other queries. You also can't display the logging in a GUI because retrieval is too slow.
I wouldn't recommend just the text columns either. If the XML format changes, you'd get an empty column. That's hard to troubleshoot without the XML message. In addition, if you need to "replay" the message stream, that's a lot easier with the XML messages. Few requirements demand replay, but it's really helpful when repairing the fallout of production problems.
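A sketch of the hybrid layout described above: the full message stays in an xml column while a few extracted columns carry the values you search on (all table, column, and XPath names here are illustrative):
CREATE TABLE dbo.XmlLogging (
    LogId int IDENTITY(1,1) PRIMARY KEY,
    LoggedAt datetime2 NOT NULL DEFAULT SYSUTCDATETIME(),
    OrderNumber varchar(50) NULL,       -- extracted, searchable and indexable
    CustomerReference varchar(50) NULL, -- extracted, searchable and indexable
    Response xml NULL                   -- full message, kept for troubleshooting and replay
);
-- Populate a searchable column from the stored XML
UPDATE dbo.XmlLogging
SET OrderNumber = Response.value('(/Order/OrderNumber)[1]', 'varchar(50)')
WHERE OrderNumber IS NULL;
Searching on OrderNumber then hits an ordinary, indexable column instead of a LIKE over the XML text.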

XML column in SSIS has byte-order-mark

I'm using an OLE DB data source in an SSIS package to pull a column from a database. The column is of XML data type. In SSIS, it is automatically recognized as data type DT_NTEXT. It goes to a script component where I'm trying to load it into a System.Xml.XmlDocument. This is the code that I'm using to get the XML data into a string:
System.Text.Encoding.Default.GetString(Row.Data.GetBlobData(0, Row.Data.Length))
Is this the correct way?
One odd thing that I'm seeing is that on one server I get a byte-order mark in the resulting string, and on another server I don't. I wouldn't mind knowing why that is, but what I really need is a way to get this string without the BOM.
Help me, Stack Overflow, you're my only hope...
This is the only way I was able to get it to work:
System.Text.UnicodeEncoding.Unicode.GetString(...).Trim()
The .Trim() removes the BOM. I'm not sure if this is the "right" way, but it's the only thing that's worked so far.
