SQL Server BULK INSERT FROM different schemas - sql-server

I have a database that can have data updated from two external parties.
Each of those parties sends a pipe delimited text file that is BULK INSERTED into the staging table.
I now want to change the schema for one of the parties by adding a few columns, but this unfortunately breaks the BULK INSERT for the other party, even though the new columns are all added as NULLABLE.
Is there any obvious solution to this?
TABLE SCHEMA:
CREATE TABLE [dbo].[CUSTOMER_ENTRY_LOAD](
[CARD_NUMBER] [varchar](12) NULL,
[TITLE] [varchar](6) NULL,
[LAST_NAME] [varchar](34) NULL,
[FIRST_NAME] [varchar](40) NULL,
[MIDDLE_NAME] [varchar](40) NULL,
[NAME_ON_CARD] [varchar](26) NULL,
[H_ADDRESS_PREFIX] [varchar](50) NULL,
[H_FLAT_NUMBER] [varchar](5) NULL,
[H_STREET_NUMBER] [varchar](10) NULL,
[H_STREET_NUMBER_SUFFIX] [varchar](5) NULL,
[H_STREET] [varchar](50) NULL,
[H_SUBURB] [varchar](50) NULL,
[H_CITY] [varchar](50) NULL,
[H_POSTCODE] [varchar](4) NULL,
[P_ADDRESS_PREFIX] [varchar](50) NULL,
[P_FLAT_NUMBER] [varchar](5) NULL,
[P_STREET_NUMBER] [varchar](10) NULL,
[P_STREET_NUMBER_SUFFIX] [varchar](5) NULL,
[P_STREET] [varchar](50) NULL,
[P_SUBURB] [varchar](50) NULL,
[P_CITY] [varchar](50) NULL,
[P_POSTCODE] [varchar](4) NULL,
[H_STD] [varchar](3) NULL,
[H_PHONE] [varchar](7) NULL,
[C_STD] [varchar](3) NULL,
[C_PHONE] [varchar](10) NULL,
[W_STD] [varchar](3) NULL,
[W_PHONE] [varchar](7) NULL,
[W_EXTN] [varchar](5) NULL,
[DOB] [smalldatetime] NULL,
[EMAIL] [varchar](50) NULL,
[DNS_STATUS] [bit] NULL,
[DNS_EMAIL] [bit] NULL,
[CREDITCARD] [char](1) NULL,
[PRIMVISACUSTID] [int] NULL,
[PREFERREDNAME] [varchar](100) NULL,
[STAFF_NUMBER] [varchar](50) NULL,
[CUSTOMER_ID] [int] NULL,
[IS_ADDRESS_VALIDATED] [varchar](50) NULL
) ON [PRIMARY]
BULK INSERT STATEMENT:
SET @string_temp = 'BULK INSERT customer_entry_load FROM '+char(39)+@inpath
+@current_file+'.txt'+char(39)+' WITH (FIELDTERMINATOR = '+char(39)+'|'+char(39)
+', MAXERRORS=1000, ROWTERMINATOR = '+char(39)+'\n'+char(39)+')'
SET DATEFORMAT dmy
EXEC(@string_temp)

The documentation describes how to use a format file to handle the scenario where the target table has more columns than the source file. An alternative that can sometimes be easier is to create a view on the table and BULK INSERT into the view instead of the table; this possibility is described in the same documentation.
And please always mention your SQL Server version.
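The view approach mentioned above might look like the following sketch. It assumes the new columns were appended to the table and are left out of the view, so the party still sending the old layout loads through the view. The view name and the file path are hypothetical.

```sql
-- Hypothetical view exposing only the original 39 columns, in file order.
CREATE VIEW dbo.CUSTOMER_ENTRY_LOAD_V1
AS
SELECT CARD_NUMBER, TITLE, LAST_NAME, FIRST_NAME, MIDDLE_NAME, NAME_ON_CARD,
       H_ADDRESS_PREFIX, H_FLAT_NUMBER, H_STREET_NUMBER, H_STREET_NUMBER_SUFFIX,
       H_STREET, H_SUBURB, H_CITY, H_POSTCODE,
       P_ADDRESS_PREFIX, P_FLAT_NUMBER, P_STREET_NUMBER, P_STREET_NUMBER_SUFFIX,
       P_STREET, P_SUBURB, P_CITY, P_POSTCODE,
       H_STD, H_PHONE, C_STD, C_PHONE, W_STD, W_PHONE, W_EXTN,
       DOB, EMAIL, DNS_STATUS, DNS_EMAIL, CREDITCARD, PRIMVISACUSTID,
       PREFERREDNAME, STAFF_NUMBER, CUSTOMER_ID, IS_ADDRESS_VALIDATED
FROM dbo.CUSTOMER_ENTRY_LOAD;
GO

SET DATEFORMAT dmy;

-- Load the old-layout file through the view; the new table columns stay NULL.
-- The path below is a placeholder, not taken from the question.
BULK INSERT dbo.CUSTOMER_ENTRY_LOAD_V1
FROM 'C:\import\old_layout_party.txt'
WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', MAXERRORS = 1000);
```

The party sending the new layout can keep loading straight into the table, provided its file matches the full column list.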

Using OPENROWSET with BULK allows you to use your file in a query. You can use that to format the data and select only the columns you need.
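A minimal sketch of the OPENROWSET idea, assuming a format file already exists that describes the incoming file and names its fields after the table columns. Both paths are hypothetical, and only three columns are shown for brevity.

```sql
SET DATEFORMAT dmy;

-- Read the file as a rowset and insert only the columns this party supplies.
-- Column names in the SELECT come from the (assumed) format file.
INSERT INTO dbo.CUSTOMER_ENTRY_LOAD (CARD_NUMBER, TITLE, LAST_NAME)
SELECT src.CARD_NUMBER, src.TITLE, src.LAST_NAME
FROM OPENROWSET(
         BULK 'C:\import\old_layout_party.txt',          -- placeholder path
         FORMATFILE = 'C:\import\old_layout_party.fmt'   -- placeholder path
     ) AS src;
```

Because the target columns are listed explicitly, columns added to the table later do not affect this load.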

In the end I have handled the two different cases with two different BULK INSERT statements (depending on which file is being processed). It seems like there isn't a way to do what I was trying to do with one statement.

You could use the format file idea supplied by @Pondlife.
Adapt your insert dynamically based on the input file name (provided there are unique differences between the external parties' file names). Using a CASE expression, simply select the correct format file based on the unique identifier in the file name.
DECLARE @formatFile varchar(max);
SET @formatFile =
CASE
WHEN @current_file LIKE '%uniqueIdentifier%'
THEN 'file1'
ELSE 'file2'
END
SET @string_temp = 'BULK INSERT customer_entry_load FROM '+char(39)+@inpath
+@current_file+'.txt'+char(39)+' WITH (FORMATFILE = '+char(39)+@formatFile+char(39)
+')'
SET DATEFORMAT dmy
EXEC(@string_temp)
Hope that helps!

Related

Error while querying the partitioned table in SQL Server

I have created a partitioned table and am trying to query a partition, but I am not able to do so in SQL Server 2016. Could somebody tell me where I am going wrong?
CREATE PARTITION FUNCTION [financialStatementPartition](datetime)
AS RANGE RIGHT FOR VALUES (N'2013-01-01T00:00:00.000', N'2014-01-01T00:00:00.000',
N'2015-01-01T00:00:00.000', N'2016-01-01T00:00:00.000',
N'2017-01-01T00:00:00.000')
GO
Table schema
CREATE TABLE [dbo].[FinancialStatementIds]
(
[financialCollectionId] [int] NOT NULL,
[companyId] [int] NOT NULL,
[dataItemId] [int] NOT NULL,
[dataItemName] [varchar](200) NULL,
[dataItemvalue] [decimal](18, 0) NULL,
[unittypevalue] [int] NULL,
[fiscalyear] [int] NULL,
[fiscalquarter] [int] NULL,
[periodenddate] [datetime] NULL,
[filingdate] [datetime] NULL,
[restatementtypename] [varchar](200) NULL,
[latestforfinancialperiodflag] [bit] NULL,
[latestfilingforinstanceflag] [bit] NULL,
[currencyconversionflag] [int] NULL,
[currencyname] [varchar](200) NULL,
[periodtypename] [varchar](200) NULL
)
Query
SELECT *
FROM dbo.FinancialStatementIds
WHERE $PARTITION.financialStatementPartition(periodenddate) = '2013-01-01T00:00:00.000'
Error
Conversion failed when converting the varchar value '2013-01-01T00:00:00.000' to data type int.
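$PARTITION returns the partition number as an int, so it cannot be compared with a date string. Use the following query instead: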
SELECT *
FROM dbo.FinancialStatementIds
WHERE periodenddate = '20130101'
Also, the table is still not actually partitioned. You need to create a partition scheme on the partition function and then create the table (or its clustered index) on that scheme.
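As a rough sketch of that last step, assuming every partition can live on the PRIMARY filegroup. The scheme and index names are hypothetical.

```sql
-- Hypothetical scheme that maps all partitions to PRIMARY.
CREATE PARTITION SCHEME financialStatementScheme
AS PARTITION financialStatementPartition ALL TO ([PRIMARY]);
GO

-- Building a clustered index on the scheme moves the existing heap onto it.
CREATE CLUSTERED INDEX CIX_FinancialStatementIds_periodenddate
ON dbo.FinancialStatementIds (periodenddate)
ON financialStatementScheme (periodenddate);
GO

-- To read a whole partition, compare partition numbers (int) on both sides.
SELECT *
FROM dbo.FinancialStatementIds
WHERE $PARTITION.financialStatementPartition(periodenddate)
    = $PARTITION.financialStatementPartition('20130101');
```

With RANGE RIGHT, the partition containing 2013-01-01 holds every row from that date up to, but not including, 2014-01-01.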

resolve updatable view to xml column?

I have the following portion of a view definition
SELECT
codedValue.value('Code[1]','nvarchar(max)') AS "Code",
codedValue.value('Name[1]', 'nvarchar(max)') AS "Value"
FROM GDB_ITEMS AS items
CROSS APPLY items.Definition.nodes
('/GPCodedValueDomain2/CodedValues/CodedValue') AS CodedValues(codedValue)
WHERE items.Name = 'tlu_Loss_list'
which queries an application-generated xml column for "code" and "value". In this context, I have read-only access to the codes and values in the xml column.
Ideally, I'd like to make the view updatable, so users can enter their own codes and values, which will be replicated over to this xml column. Is this possible?
Here is the relevant portion of the xml column and table:
Existing data in xml column:
<GPCodedValueDomain2 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:typens="http://www.esri.com/schemas/ArcGIS/10.0" xsi:type="typens:GPCodedValueDomain2">
<DomainName>tlu_Loss_List</DomainName>
<FieldType>esriFieldTypeString</FieldType>
<MergePolicy>esriMPTDefaultValue</MergePolicy>
<SplitPolicy>esriSPTDefaultValue</SplitPolicy>
<Description>Loss_Reason</Description>
<Owner>DBO</Owner>
<CodedValues xsi:type="typens:ArrayOfCodedValue">
<CodedValue xsi:type="typens:CodedValue">
<Name>Abandoned</Name>
<Code xsi:type="xs:string">AB</Code>
</CodedValue>
<CodedValue xsi:type="typens:CodedValue">
<Name>Coyote</Name>
<Code xsi:type="xs:string">CO</Code>
</CodedValue>
</CodedValues>
</GPCodedValueDomain2>
Table holding the XML:
CREATE TABLE [dbo].[GDB_ITEMS](
[ObjectID] [int] NOT NULL,
[UUID] [uniqueidentifier] NOT NULL,
[Type] [uniqueidentifier] NOT NULL,
[Name] [nvarchar](226) NULL,
[PhysicalName] [nvarchar](226) NULL,
[Path] [nvarchar](512) NULL,
[Url] [nvarchar](255) NULL,
[Properties] [int] NULL,
[Defaults] [varbinary](max) NULL,
[DatasetSubtype1] [int] NULL,
[DatasetSubtype2] [int] NULL,
[DatasetInfo1] [nvarchar](255) NULL,
[DatasetInfo2] [nvarchar](255) NULL,
[Definition] [xml] NULL,
[Documentation] [xml] NULL,
[ItemInfo] [xml] NULL,
[Shape] [geometry])
You might be able to do this with an "instead-of" trigger: Designing INSTEAD OF triggers
For examples of modifying XML, see modify() Method and XML Data Modification Language
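A minimal sketch of such a trigger, assuming the view in the question is called dbo.vw_LossList (a hypothetical name) and exposes the Code and Value columns shown above. It handles single-row inserts only and omits the xsi:type attributes that the application writes, so treat it as a starting point rather than a drop-in solution.

```sql
CREATE TRIGGER dbo.trg_vw_LossList_Insert
ON dbo.vw_LossList            -- hypothetical view name
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Append one <CodedValue> element built from the inserted row.
    -- Caveat: with a multi-row insert only one row is applied per target row.
    UPDATE items
    SET Definition.modify('
        insert <CodedValue>
                   <Name>{ sql:column("i.Value") }</Name>
                   <Code>{ sql:column("i.Code") }</Code>
               </CodedValue>
        as last into (/GPCodedValueDomain2/CodedValues)[1]')
    FROM dbo.GDB_ITEMS AS items
    CROSS JOIN inserted AS i
    WHERE items.Name = 'tlu_Loss_list';
END;
```

Since the column is application-generated, it is worth checking that the application still reads the domain correctly after a change like this.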

How to store DropDownList information in SQL

I'm looking to store the contents of several dropdownlists in my SQL Server. Is it better to store them in 1 table per dropdown, or in a larger table?
My larger table would have schema like:
CREATE TABLE [dbo].[OptionTable](
[OptionID] [int] IDENTITY(1,1) NOT NULL,
[ListName] [varchar](100) NOT NULL,
[DisplayValue] [varchar](100) NOT NULL,
[Value] [varchar](100) NULL,
[OptionOrder] [tinyint] NULL,
[AssociatedDept] [int] NULL,
[Other2] [nchar](10) NULL,
[Other3] [nchar](10) NULL
) ON [PRIMARY]
And I would get the contents of 1 list by doing something like:
Select [columns]
From OptionTable
WHERE ListName = 'nameOfList'
So how can I decide? I know it will work like this, I'm just not sure if this is good practice or not? Will one way perform better? What about readability? Opinions appreciated.
I've worked in databases that had a single "super option table" that contained values for multiple drop down lists... it worked OK for the drop down list population, but when I needed to use those values for other reporting purposes, it became a pain because the "super option table" needed to be filtered based on the specific set of options that I needed, and it ended up in some ugly looking queries.
Additionally, down the road there were conditions that required an additional value to be tracked with one of the lists... but that column would need to be added to the whole table, and then all the other sets of options within that table would simply have a NULL for a column that they didn't care about...
Because of that, I'd suggest if you're dealing with completely distinct lists of data, that those lists be stored in separate tables.
The quick and easy:
CREATE TABLE [dbo].[Lists](
[ListId] [int] IDENTITY(1,1) NOT NULL,
[ListName] [varchar](100) NOT NULL,
--these could be associated with lists or options, wasn't specified
[AssociatedDept] [int] NULL,
[Other2] [nchar](10) NULL,
[Other3] [nchar](10) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[Options](
[OptionId] [int] IDENTITY(1,1) NOT NULL,
[ListId] [int] NOT NULL,
[DisplayValue] [varchar](100) NOT NULL,
[Value] [varchar](100) NULL,
[OptionOrder] [tinyint] NULL,
--these could be associated with lists or options, wasn't specified
[AssociatedDept] [int] NULL,
[Other2] [nchar](10) NULL,
[Other3] [nchar](10) NULL
) ON [PRIMARY]
Get contents with
select o.* --or a subset
from Options as o
join Lists as l
on l.ListId=o.ListId and l.ListName = 'nameOfList'
order by o.OptionOrder
The potentially more optimized version (it depends on your data, but it helps particularly if one option appears in more than one list):
CREATE TABLE [dbo].[Lists](
[ListId] [int] IDENTITY(1,1) NOT NULL,
[ListName] [varchar](100) NOT NULL,
--these could be associated with lists or options, wasn't specified
[AssociatedDept] [int] NULL,
[Other2] [nchar](10) NULL,
[Other3] [nchar](10) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[Options](
[OptionId] [int] IDENTITY(1,1) NOT NULL,
[DisplayValue] [varchar](100) NOT NULL,
[Value] [varchar](100) NULL,
--these could be associated with lists or options, wasn't specified
[AssociatedDept] [int] NULL,
[Other2] [nchar](10) NULL,
[Other3] [nchar](10) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[ListOptions](
[OptionId] [int] NOT NULL,
[ListId] [int] NOT NULL,
[OptionOrder] [tinyint] NULL,
--these could be associated with lists or options, wasn't specified
[AssociatedDept] [int] NULL,
[Other2] [nchar](10) NULL,
[Other3] [nchar](10) NULL
)
Get contents with
select o.* --or a subset
from Options as o
join ListOptions as lo
on lo.OptionId=o.OptionId
join Lists as l
on l.ListId=lo.ListId and l.ListName = 'nameOfList'
order by lo.OptionOrder
On either, you'd want to index the foreign key columns.
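For illustration, the indexes could look something like this. The index names are hypothetical, and you would run only the statements that match the design you picked.

```sql
-- Both designs: lists are looked up by name.
CREATE INDEX IX_Lists_ListName
    ON dbo.Lists (ListName);

-- Two-table design: Options.ListId points at Lists.ListId.
CREATE INDEX IX_Options_ListId
    ON dbo.Options (ListId, OptionOrder);

-- Junction-table design: ListOptions points at both Lists and Options.
CREATE INDEX IX_ListOptions_ListId
    ON dbo.ListOptions (ListId, OptionOrder);
CREATE INDEX IX_ListOptions_OptionId
    ON dbo.ListOptions (OptionId);
```

Including OptionOrder as the second key column lets the "order by" in the queries above be served from the index.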

Error 4866 when bulk inserting data from csv

I'm trying to load data from a csv file and keep getting these errors. Am I missing some params in the bulk insert script or do I need to modify the file before I attempt this?
Msg 4866, Level 16, State 1, Line 1
The bulk load failed. The column is too long in the data file for row 1, column 54. Verify that the field terminator and row terminator are specified correctly.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
Here's the script
BULK
INSERT BrowseNotes
FROM 'C:\Users\Jarek\browseNotes2.csv'
WITH
(
FIELDTERMINATOR = ','
, ROWTERMINATOR = '\n'
)
Here are sample rows from the file. I delete the first row before attempting to load. The rows end with ",/n". I've tried replacing /n with /r/n and removing the last comma, but I still get the same error.
LoanType,Maturity,LoanClass,Borrower,LoanStatus,TimeLeftBeforeExpiration,MonthlyPayment,LoanMaturity,JobTenureYearsString,AmountToInvest,AmountMissingToClose,NumberOfPayments,Id,State,Type,Status,Aid,Amount,Duration,StartD,IntRate,Grade,Purpose,HousingStatus,JobTenure,Income,CreditClassId,City,UnfundedAmnt,Fico,OpenCreditLines,TotalCreditLines,Inq6Months,RevolvUtil,FundedPercentage,FundedAmount,EmpStatus,JobTitle,AppDate,AppAmount,Employer,DelinquentAmount,EarliestCreditLine,PubRecords,DTI,AppExpiration,LapStatus,IncomeVStatus,CreditReportD,RevolvCreditBal,AccntsNowDelinquent,Delinquencies2Yrs,MnthsSinceLastDelinquency,MnthsSinceLastRecord
PERSONAL,60,C4,1248804,INFUNDING,279589,344.62,Year5,8 years,0,625.0,60,1020047,PA,1,1,1248804,13775.0,60,2011-11-11 11:40:18,0.1527,C,debt_consolidation,MORTGAGE,96,50000.0,124,PHILADELPHIA,625.0,679-713,10,21,2,62.2,0.9565972222222222,13775.0,EMPLOYED,"Quality Assurance Manager",2011-11-11 11:40:18,14400.0,"J. Ambrogi Food Distribution",0.0,01/27/2003,0,23.14,2011-11-25 11:40:18,APPROVED_CR,NOT_REQUIRED,11/11/2011,22906.0,0,0,null,null,
PERSONAL,60,A5,1247389,INFUNDING,180323,289.94,Year5,3 years,0,1975.0,60,1018925,FL,1,1,1247389,12025.0,60,2011-11-10 08:05:52,0.089,A,house,MORTGAGE,36,150000.0,105,orange park,1950.0,750-779,9,25,0,62.9,0.8607142857142858,12050.0,EMPLOYED,"Project Manager",2011-11-10 08:05:52,14000.0,"Scientific Research Corp.",0.0,10/01/1984,0,14.02,2011-11-24 08:05:52,APPROVED_CR,VERIFIED,11/09/2011,43069.0,0,0,null,null,
Here's the table I'm trying to load to
CREATE TABLE [dbo].[BrowseNotes](
[LoanType] [nvarchar](25) NULL,
[Maturity] [tinyint] NULL,
[LoanClass] [nvarchar](2) NULL,
[Borrower] [int] NULL,
[LoanStatus] [nvarchar](25) NULL,
[TimeLeftBeforeExpiration] [int] NULL,
[MonthlyPayment] [smallmoney] NULL,
[LoanMaturity] [nvarchar](10) NULL,
[JobTenureYearsString] [nvarchar](15) NULL,
[AmountToInvest] [smallmoney] NULL,
[AmountMissingToClose] [smallmoney] NULL,
[NumberOfPayments] [tinyint] NULL,
[Id] [int] NULL,
[State] [char](2) NULL,
[Type] [tinyint] NULL,
[Status] [tinyint] NULL,
[Aid] [int] NULL,
[Amount] [smallmoney] NULL,
[Duration] [tinyint] NULL,
[StartD] [datetime] NULL,
[IntRate] [decimal](18, 0) NULL,
[Grade] [char](1) NULL,
[Purpose] [nvarchar](25) NULL,
[HousingStatus] [nvarchar](25) NULL,
[JobTenure] [tinyint] NULL,
[Income] [money] NULL,
[CreditClassId] [smallint] NULL,
[City] [nvarchar](255) NULL,
[UnfundedAmnt] [smallmoney] NULL,
[Fico] [nvarchar](10) NULL,
[OpenCreditLines] [tinyint] NULL,
[TotalCreditLines] [tinyint] NULL,
[Inq6Months] [tinyint] NULL,
[RevolvUtil] [decimal](18, 0) NULL,
[FundedPercentage] [decimal](18, 0) NULL,
[FundedAmount] [smallmoney] NULL,
[EmpStatus] [nvarchar](25) NULL,
[JobTitle] [nvarchar](255) NULL,
[AppDate] [datetime] NULL,
[AppAmount] [money] NULL,
[Employer] [nvarchar](255) NULL,
[DelinquentAmount] [money] NULL,
[EarliestCreditLine] [datetime] NULL,
[PubRecords] [tinyint] NULL,
[DTI] [decimal](18, 0) NULL,
[AppExpiration] [datetime] NULL,
[LapStatus] [nvarchar](25) NULL,
[IncomeVStatus] [nvarchar](25) NULL,
[CreditReportD] [datetime] NULL,
[RevolvCreditBal] [money] NULL,
[AccntsNowDelinquent] [tinyint] NULL,
[Delinquencies2Yrs] [tinyint] NULL,
[MnthsSinceLastDelinquency] [nvarchar](10) NULL,
[MnthsSinceLastRecord] [nvarchar](10) NULL
)
What database is the table in? Try fully qualifying your table name i.e.
`mydb.dbo.BrowseNotes`
Though it certainly sounds like it's not recognizing the ROWTERMINATOR.
I know this is coming in waaaaaay late, but I figured out how to do this.
DECLARE @sql varchar(1000)
set @sql = '
BULK
INSERT BrowseNotes
FROM "C:\Users\Jarek\browseNotes2.csv"
WITH (
FIELDTERMINATOR = ",",
ROWTERMINATOR = "' + char(10) + '"
)'
exec(@sql)
GO
This script works by forcing the rowterminator to a literal '0A' (linefeed). This works for both \r\n and \n terminated data.
I would also suggest using a pipe character (or anything not contained in your data) for a fieldterminator. BULK INSERT is not very tolerant of embedded field terminators in the data.
Also, adding FIRSTROW to the statement does not skip field validation for the first row. So you have to strip the headers before import, not just skip them.
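As an alternative to the dynamic SQL above, the linefeed can also be written in hexadecimal notation directly in a static statement. A minimal sketch, using the table and path from the question:

```sql
-- '0x0a' is a bare linefeed, so nothing is prepended to the terminator.
BULK INSERT BrowseNotes
FROM 'C:\Users\Jarek\browseNotes2.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a'
);
```

On SQL Server 2017 and later, the FORMAT = 'CSV' and FIELDQUOTE options may also help with the double-quoted fields in the sample rows, though that has not been tried against this file.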

TSQL Help (SQL Server 2005)

I have been playing around with a quite complex SQL Statement for a few days, and have gotten most of it working correctly.
I am having trouble with one last part, and was wondering if anyone could shed some light on the issue, as I have no idea why it isn't working:
INSERT INTO ExistingClientsAccounts_IMPORT
SELECT DISTINCT
cca.AccountID, cca.SKBranch, cca.SKAccount, cca.SKName, cca.SKBase,
cca.SyncStatus, cca.SKCCY, cca.ClientType, cca.GFCID, cca.GFPID, cca.SyncInput,
cca.SyncUpdate, cca.LastUpdatedBy, cca.Deleted, cca.Branch_Account, cca.AccountTypeID
FROM ClientsAccounts AS cca
INNER JOIN
(SELECT DISTINCT ClientAccount, SKAccount, SKDesc,
SKBase, SKBranch, ClientType, SKStatus, GFCID,
GFPID, Account_Open_Date, Account_Update
FROM ClientsAccounts_IMPORT) AS ccai
ON cca.Branch_Account = ccai.ClientAccount
Table definitions follow:
CREATE TABLE [dbo].[ExistingClientsAccounts_IMPORT](
[AccountID] [int] NOT NULL,
[SKBranch] [varchar](2) NOT NULL,
[SKAccount] [varchar](12) NOT NULL,
[SKName] [varchar](255) NULL,
[SKBase] [varchar](16) NULL,
[SyncStatus] [varchar](50) NULL,
[SKCCY] [varchar](5) NULL,
[ClientType] [varchar](50) NULL,
[GFCID] [varchar](10) NULL,
[GFPID] [varchar](10) NULL,
[SyncInput] [smalldatetime] NULL,
[SyncUpdate] [smalldatetime] NULL,
[LastUpdatedBy] [varchar](50) NOT NULL,
[Deleted] [tinyint] NOT NULL,
[Branch_Account] [varchar](16) NOT NULL,
[AccountTypeID] [int] NOT NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[ClientsAccounts_IMPORT](
[NEWClientIndex] [bigint] NOT NULL,
[ClientGroup] [varchar](255) NOT NULL,
[ClientAccount] [varchar](255) NOT NULL,
[SKAccount] [varchar](255) NOT NULL,
[SKDesc] [varchar](255) NOT NULL,
[SKBase] [varchar](10) NULL,
[SKBranch] [varchar](2) NOT NULL,
[ClientType] [varchar](255) NOT NULL,
[SKStatus] [varchar](255) NOT NULL,
[GFCID] [varchar](255) NULL,
[GFPID] [varchar](255) NULL,
[Account_Open_Date] [smalldatetime] NULL,
[Account_Update] [smalldatetime] NULL,
[SKType] [varchar](255) NOT NULL
) ON [PRIMARY]
The error message I get is:
Msg 8152, Level 16, State 14, Line 1
String or binary data would be truncated.
The statement has been terminated.
The error is because you are trying to insert data into a column in ExistingClientsAccounts_IMPORT where the column size is smaller than the length of data attempting to be inserted into it.
e.g.
SKAccount column is VARCHAR(12) in the ExistingClientsAccounts_IMPORT table but is VARCHAR(255) in ClientsAccounts_IMPORT.
So if ClientsAccounts_IMPORT contains any rows where that field is longer than 12 characters, you will get that error, since a value of, say, 100 characters will not fit into a 12-character field.
You need to make sure all the columns in the table you are inserting into, are big enough - make sure each column definition matches the source table.
The third column of your SELECT column list means that ExistingClientsAccounts_IMPORT.SKAccount is populated from ClientsAccounts.SKAccount - however, the source is up to 255 characters, while the destination has a capacity of 12. If there's any data that wouldn't fit, you'll get this message.
I haven't gone through all the other columns.
You are trying to insert values which are longer than the max length specified for a column. Use a profiler to check the data being passed to this query and verify the length of the data against the permissible length for all columns.
There is a clear mismatch in the column lengths of the common columns of these two tables.
ClientsAccounts_IMPORT.SKBase is 10 whereas it is 16 in the other table.
Check the field definitions. You can see some are smaller than the original ones. Now run a query on the old data - you will find some of the larger fields were used, so the insert is not possible without losing data.
Example: SKAccount - from 255 length to 12.
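One way to find the offending column is to compare the longest source value with each destination width. A sketch, assuming the character columns of ClientsAccounts (whose definition is not shown in the question) are the ones selected in the INSERT:

```sql
-- Any result larger than the width in the comment will trigger Msg 8152.
SELECT
    MAX(LEN(cca.SKBranch))       AS MaxSKBranch,        -- destination: 2
    MAX(LEN(cca.SKAccount))      AS MaxSKAccount,       -- destination: 12
    MAX(LEN(cca.SKName))         AS MaxSKName,          -- destination: 255
    MAX(LEN(cca.SKBase))         AS MaxSKBase,          -- destination: 16
    MAX(LEN(cca.SyncStatus))     AS MaxSyncStatus,      -- destination: 50
    MAX(LEN(cca.SKCCY))          AS MaxSKCCY,           -- destination: 5
    MAX(LEN(cca.ClientType))     AS MaxClientType,      -- destination: 50
    MAX(LEN(cca.GFCID))          AS MaxGFCID,           -- destination: 10
    MAX(LEN(cca.GFPID))          AS MaxGFPID,           -- destination: 10
    MAX(LEN(cca.LastUpdatedBy))  AS MaxLastUpdatedBy,   -- destination: 50
    MAX(LEN(cca.Branch_Account)) AS MaxBranchAccount    -- destination: 16
FROM ClientsAccounts AS cca
WHERE EXISTS (SELECT 1
              FROM ClientsAccounts_IMPORT AS ccai
              WHERE ccai.ClientAccount = cca.Branch_Account);
```

Note that LEN ignores trailing spaces, so this reports the length of the visible text in each column.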

Resources