Error 4866 when bulk inserting data from csv - sql-server
I'm trying to load data from a csv file and keep getting these errors. Am I missing some params in the bulk insert script or do I need to modify the file before I attempt this?
Msg 4866, Level 16, State 1, Line 1
The bulk load failed. The column is too long in the data file for row 1, column 54. Verify that the field terminator and row terminator are specified correctly.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
Here's the script
BULK
INSERT BrowseNotes
FROM 'C:\Users\Jarek\browseNotes2.csv'
WITH
(
FIELDTERMINATOR = ','
, ROWTERMINATOR = '\n'
)
Here are sample rows from the file. I delete the first (header) row before attempting to load. The rows end with ",\n". I've tried replacing \n with \r\n and removing the last comma, but I still get the same error.
LoanType,Maturity,LoanClass,Borrower,LoanStatus,TimeLeftBeforeExpiration,MonthlyPayment,LoanMaturity,JobTenureYearsString,AmountToInvest,AmountMissingToClose,NumberOfPayments,Id,State,Type,Status,Aid,Amount,Duration,StartD,IntRate,Grade,Purpose,HousingStatus,JobTenure,Income,CreditClassId,City,UnfundedAmnt,Fico,OpenCreditLines,TotalCreditLines,Inq6Months,RevolvUtil,FundedPercentage,FundedAmount,EmpStatus,JobTitle,AppDate,AppAmount,Employer,DelinquentAmount,EarliestCreditLine,PubRecords,DTI,AppExpiration,LapStatus,IncomeVStatus,CreditReportD,RevolvCreditBal,AccntsNowDelinquent,Delinquencies2Yrs,MnthsSinceLastDelinquency,MnthsSinceLastRecord
PERSONAL,60,C4,1248804,INFUNDING,279589,344.62,Year5,8 years,0,625.0,60,1020047,PA,1,1,1248804,13775.0,60,2011-11-11 11:40:18,0.1527,C,debt_consolidation,MORTGAGE,96,50000.0,124,PHILADELPHIA,625.0,679-713,10,21,2,62.2,0.9565972222222222,13775.0,EMPLOYED,"Quality Assurance Manager",2011-11-11 11:40:18,14400.0,"J. Ambrogi Food Distribution",0.0,01/27/2003,0,23.14,2011-11-25 11:40:18,APPROVED_CR,NOT_REQUIRED,11/11/2011,22906.0,0,0,null,null,
PERSONAL,60,A5,1247389,INFUNDING,180323,289.94,Year5,3 years,0,1975.0,60,1018925,FL,1,1,1247389,12025.0,60,2011-11-10 08:05:52,0.089,A,house,MORTGAGE,36,150000.0,105,orange park,1950.0,750-779,9,25,0,62.9,0.8607142857142858,12050.0,EMPLOYED,"Project Manager",2011-11-10 08:05:52,14000.0,"Scientific Research Corp.",0.0,10/01/1984,0,14.02,2011-11-24 08:05:52,APPROVED_CR,VERIFIED,11/09/2011,43069.0,0,0,null,null,
Here's the table I'm trying to load to
CREATE TABLE [dbo].[BrowseNotes](
[LoanType] [nvarchar](25) NULL,
[Maturity] [tinyint] NULL,
[LoanClass] [nvarchar](2) NULL,
[Borrower] [int] NULL,
[LoanStatus] [nvarchar](25) NULL,
[TimeLeftBeforeExpiration] [int] NULL,
[MonthlyPayment] [smallmoney] NULL,
[LoanMaturity] [nvarchar](10) NULL,
[JobTenureYearsString] [nvarchar](15) NULL,
[AmountToInvest] [smallmoney] NULL,
[AmountMissingToClose] [smallmoney] NULL,
[NumberOfPayments] [tinyint] NULL,
[Id] [int] NULL,
[State] [char](2) NULL,
[Type] [tinyint] NULL,
[Status] [tinyint] NULL,
[Aid] [int] NULL,
[Amount] [smallmoney] NULL,
[Duration] [tinyint] NULL,
[StartD] [datetime] NULL,
[IntRate] [decimal](18, 0) NULL,
[Grade] [char](1) NULL,
[Purpose] [nvarchar](25) NULL,
[HousingStatus] [nvarchar](25) NULL,
[JobTenure] [tinyint] NULL,
[Income] [money] NULL,
[CreditClassId] [smallint] NULL,
[City] [nvarchar](255) NULL,
[UnfundedAmnt] [smallmoney] NULL,
[Fico] [nvarchar](10) NULL,
[OpenCreditLines] [tinyint] NULL,
[TotalCreditLines] [tinyint] NULL,
[Inq6Months] [tinyint] NULL,
[RevolvUtil] [decimal](18, 0) NULL,
[FundedPercentage] [decimal](18, 0) NULL,
[FundedAmount] [smallmoney] NULL,
[EmpStatus] [nvarchar](25) NULL,
[JobTitle] [nvarchar](255) NULL,
[AppDate] [datetime] NULL,
[AppAmount] [money] NULL,
[Employer] [nvarchar](255) NULL,
[DelinquentAmount] [money] NULL,
[EarliestCreditLine] [datetime] NULL,
[PubRecords] [tinyint] NULL,
[DTI] [decimal](18, 0) NULL,
[AppExpiration] [datetime] NULL,
[LapStatus] [nvarchar](25) NULL,
[IncomeVStatus] [nvarchar](25) NULL,
[CreditReportD] [datetime] NULL,
[RevolvCreditBal] [money] NULL,
[AccntsNowDelinquent] [tinyint] NULL,
[Delinquencies2Yrs] [tinyint] NULL,
[MnthsSinceLastDelinquency] [nvarchar](10) NULL,
[MnthsSinceLastRecord] [nvarchar](10) NULL
)
What database is the table in? Try fully qualifying your table name i.e.
`mydb.dbo.BrowseNotes`
Though it certainly sounds like it's not recognizing the ROWTERMINATOR.
I know this is coming in waaaaaay late, but I figured out how to do this.
DECLARE @sql varchar(1000)
SET @sql = '
BULK
INSERT BrowseNotes
FROM ''C:\Users\Jarek\browseNotes2.csv''
WITH (
FIELDTERMINATOR = '','',
ROWTERMINATOR = ''' + char(10) + '''
)'
EXEC(@sql)
GO
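This script works by forcing the row terminator to a literal 0x0A (linefeed) byte. Written directly in the statement, '\n' is treated by BULK INSERT as a carriage return plus linefeed (\r\n), which is why it fails on files whose lines end with \n only. This works for both \r\n and \n terminated data.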
I would also suggest using a pipe character (or anything not contained in your data) for a fieldterminator. BULK INSERT is not very tolerant of embedded field terminators in the data.
Also, adding FIRSTROW to the statement does not skip field validation for the first row. So you have to strip the headers before import, not just skip them.
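The same linefeed terminator can also be written without dynamic SQL by using hexadecimal notation. The following is a minimal sketch, assuming the header row has already been stripped from the file and that no field contains an embedded comma; the table name and path are the ones from the question.

```sql
-- Sketch: same load as above, with the row terminator given as a hex byte.
-- Assumes the header row was removed and no field contains a comma.
BULK INSERT dbo.BrowseNotes
FROM 'C:\Users\Jarek\browseNotes2.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a'   -- literal linefeed, not translated to \r\n
);
```

On SQL Server 2017 and later, adding FORMAT = 'CSV' and FIELDQUOTE = '"' to the WITH clause is one way to have quoted values such as "Quality Assurance Manager" parsed as CSV fields. Whether that applies here depends on the server version, which the question does not state.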
Related
Azure SQL Server: Insert 20m records between tables is slow (60+ mins)
I've been going through the boards and tried lots of different things without luck, so I thought I'd reach out to the community directly.
The problem: I have an Azure SQL Server DB that has 2 tables:
DATA_IMPORT (source table I import data into via Data Factory; it gets truncated each load, approx. 20m rows).
DATA_SOURCE (table where I insert the 20m rows from DATA_IMPORT with some simple transformation; this is expected to reach about 0.5b rows).
I'm a little new to SQL Server and have now resorted to having no indexes on DATA_SOURCE to see if that helps. It still takes 60+ mins. No indexes are needed on table DATA_IMPORT, since it's just a holding table.
Table Structures
CREATE TABLE [dbo].[DATA_IMPORT](
[field1] [nvarchar](255) NOT NULL,
[field2] [nvarchar](255) NOT NULL,
[field3] [nvarchar](255) NOT NULL,
[field4] [nvarchar](255) NOT NULL,
[field5] [nvarchar](255) NOT NULL,
[field6] [nvarchar](255) NOT NULL,
[field7] [nvarchar](255) NOT NULL,
[field8] [nvarchar](255) NOT NULL,
[field9] [nvarchar](255) NOT NULL,
[field10] [nvarchar](255) NOT NULL,
[measure1] int NULL,
[measure2] decimal(10,2) NULL,
[measure3] decimal(10,5) NULL,
[measure4] decimal(7,2) NULL,
[measure5] decimal(10,5) NULL
)
CREATE TABLE [dbo].[DATA_SOURCE](
[EFF_DATE] [datetime] NOT NULL,
[EFF_STATUS] [nvarchar](255) NOT NULL,
[DATA_SOURCE] [nvarchar](255) NOT NULL,
[PERIOD] [date] NOT NULL,
[field1] [nvarchar](255) NOT NULL,
[field2] [nvarchar](255) NOT NULL,
[field3] [nvarchar](255) NOT NULL,
[field4] [nvarchar](255) NOT NULL,
[field5] [nvarchar](255) NOT NULL,
[field6] [nvarchar](255) NOT NULL,
[field7] [nvarchar](255) NOT NULL,
[field8] [nvarchar](255) NOT NULL,
[field9] [nvarchar](255) NOT NULL,
[field10] [nvarchar](255) NOT NULL,
[measure1] int NULL,
[measure2] decimal(10,2) NULL,
[measure3] decimal(10,5) NULL,
[measure4] decimal(7,2) NULL,
[measure5] decimal(10,5) NULL,
[measure6] decimal(11,3) NULL,
[REC_CREATEDBY] [nvarchar](50) NOT NULL,
[REC_CREATEDON] [datetime] NOT NULL,
[REC_LASTUPDATEDBY] [nvarchar](50) NULL,
[REC_LASTUPDATEDON] [datetime] NULL
)
INSERT SQL
--YYYY-MM-DD
Declare @varPeriod varchar(30) = '2020-01-01'
Declare @varDataSource varchar(255) = 'https://blah.com'
INSERT INTO [DATA_SOURCE] (
[EFF_DATE],[EFF_STATUS],[DATA_SOURCE],[PERIOD],
[field1],[field2],[field3],[field4],[field5],
[field6],[field7],[field8],[field9],[field10],
[measure1],[measure2],[measure3],[measure4],[measure5],
[measure6],
[REC_CREATEDBY],[REC_CREATEDON],
[REC_LASTUPDATEDBY],
[REC_LASTUPDATEDON])
SELECT
SYSDATETIME() AS [EFF_DATE]
,'A' AS [EFF_STATUS]
,@varDataSource AS [DATA_SOURCE],
CONVERT(varchar, @varPeriod, 100) AS [PERIOD],
[field1],[field2],[field3],[field4],[field5],
[field6],[field7],[field8],[field9],[field10],
[measure1],[measure2],[measure3],[measure4],[measure5]
,CAST([measure1]*[measure2] AS numeric(11,3)) as [measure6]
,'DATA_LOADER' AS [REC_CREATEDBY]
,SYSDATETIME() AS [REC_CREATEDON]
,'DATA_LOADER' AS [REC_LASTUPDATEDBY]
,SYSDATETIME() AS [REC_LASTUPDATEDON]
FROM [dbo].[DATA_IMPORT];
GO
What performance recommendations do you have so I can insert these 20m rows quickly? I will need to apply 3 or 4 indexes too once I join to my dimensional data.
Thanks for your help all
Jay
EDIT: use BULK INSERT for better performance when inserting data. PS: another important thing to look at is the DTU / vCores you assign to your database.
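To see whether the service tier is the bottleneck, resource consumption can be checked while the insert runs. A minimal sketch, assuming Azure SQL Database, where the sys.dm_db_resource_stats view reports recent usage as a percentage of the tier's limits:

```sql
-- Run in the user database during or right after the load.
-- Values close to 100 suggest the DTU/vCore limit is throttling the insert.
SELECT TOP (20)
    end_time,
    avg_cpu_percent,
    avg_data_io_percent,
    avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
```

If avg_log_write_percent stays near 100 for the whole load, log throughput is the likely limit, and a higher tier or smaller batches may help more than removing indexes.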
How to use bulk import in SQL Server using files from Google Big Query?
I'm importing 700 files that, when put together, represent a 1.2 billion record table that was analyzed and exported in Big Query. Using SSIS, or the Import/Export wizard, I was able to import the first file to create the structure of the table below. Because it takes 30-45 minutes PER file in a regular transfer, I'm trying to get "bulk import" to work.
My Current Command:
BULK INSERT mother.dbo.Mother
FROM 'F:\Data\Consumer\New Consumer\Mother\Mother_000000000000.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = ''+CHAR(10)+'', KEEPNULLS );
I'm getting errors that indicate I have the ROWTERMINATOR wrong:
Msg 4866, Level 16, State 1, Line 1
The bulk load failed. The column is too long in the data file for row 1, column 45. Verify that the field terminator and row terminator are specified correctly.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
I'm being told that this is due to a bad ROWTERMINATOR. Notepad++ says LF. To date, I've tried:
\n
0x0A
0x0a
lf
LF
CHAR(10) <--example above
Has anyone imported BigQuery exports into SQL Server before using bulk import? And if so, what is the proper syntax? Thanks.
TABLE DEFINITION
CREATE TABLE [dbo].[Mother](
[FirstName] [varchar](500) NULL,
[LastName] [varchar](500) NULL,
[MiddleName] [varchar](500) NULL,
[Gender] [varchar](500) NULL,
[Age] [varchar](500) NULL,
[DOB] [varchar](500) NULL,
[Address] [varchar](500) NULL,
[Address2] [varchar](500) NULL,
[City] [varchar](500) NULL,
[State] [varchar](500) NULL,
[Zip] [varchar](500) NULL,
[Zip4] [varchar](500) NULL,
[TimeZone] [varchar](500) NULL,
[Income] [varchar](500) NULL,
[HomeValue] [varchar](500) NULL,
[Networth] [varchar](500) NULL,
[MaritalStatus] [varchar](500) NULL,
[IsRenter] [varchar](500) NULL,
[HasChildren] [varchar](500) NULL,
[CreditRating] [varchar](500) NULL,
[Investor] [varchar](500) NULL,
[LinesOfCredit] [varchar](500) NULL,
[InvestorRealEstate] [varchar](500) NULL,
[Traveler] [varchar](500) NULL,
[Pets] [varchar](500) NULL,
[MailResponder] [varchar](500) NULL,
[Charitable] [varchar](500) NULL,
[PolicalDonations] [varchar](500) NULL,
[PoliticalParty] [varchar](500) NULL,
[Attom_ID] [varchar](500) NULL,
[GEOID] [varchar](500) NULL,
[Score] [varchar](500) NULL,
[Score1] [varchar](500) NULL,
[Score2] [varchar](500) NULL,
[Score3] [varchar](500) NULL,
[Score4] [varchar](500) NULL,
[Score5] [varchar](500) NULL,
[Latitude] [varchar](500) NULL,
[Longitude] [varchar](500) NULL,
[Email1] [varchar](500) NULL,
[Email2] [varchar](500) NULL,
[Email3] [varchar](500) NULL,
[Phone1] [varchar](500) NULL,
[Phone2] [varchar](500) NULL,
[Phone3] [varchar](500) NULL
) ON [PRIMARY]
Thanks much! Unfortunately, the data coming in is personal data, and I can't put it on here as an example.
Error while querying the partitioned table in SQL Server
I have created a partitioned table and am trying to query a partition, but I am not able to do so in SQL Server 2016. Could somebody tell me where I am going wrong?
CREATE PARTITION FUNCTION [financialStatementPartition](datetime)
AS RANGE RIGHT FOR VALUES (N'2013-01-01T00:00:00.000', N'2014-01-01T00:00:00.000', N'2015-01-01T00:00:00.000', N'2016-01-01T00:00:00.000', N'2017-01-01T00:00:00.000')
GO
Table schema
CREATE TABLE [dbo].[FinancialStatementIds]
(
[financialCollectionId] [int] NOT NULL,
[companyId] [int] NOT NULL,
[dataItemId] [int] NOT NULL,
[dataItemName] [varchar](200) NULL,
[dataItemvalue] [decimal](18, 0) NULL,
[unittypevalue] [int] NULL,
[fiscalyear] [int] NULL,
[fiscalquarter] [int] NULL,
[periodenddate] [datetime] NULL,
[filingdate] [datetime] NULL,
[restatementtypename] [varchar](200) NULL,
[latestforfinancialperiodflag] [bit] NULL,
[latestfilingforinstanceflag] [bit] NULL,
[currencyconversionflag] [int] NULL,
[currencyname] [varchar](200) NULL,
[periodtypename] [varchar](200) NULL
)
Query
SELECT * FROM dbo.FinancialStatementIds
WHERE $PARTITION.financialStatementPartition(periodenddate) = '2013-01-01T00:00:00.000'
Error
Conversion failed when converting the varchar value '2013-01-01T00:00:00.000' to data type int.
Use the following query instead:
SELECT * FROM dbo.FinancialStatementIds
WHERE periodenddate = '20130101'
Anyway, the table is still not partitioned properly. You need to apply a partition scheme to connect the table to the partition function.
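The conversion error itself arises because $PARTITION returns the partition number (an int), not a boundary value, so it has to be compared with an integer. A sketch of both points, assuming a hypothetical scheme name financialStatementScheme, a hypothetical index name, and that all partitions may live on the PRIMARY filegroup:

```sql
-- Hypothetical scheme mapping every partition of the function to PRIMARY.
CREATE PARTITION SCHEME financialStatementScheme
AS PARTITION financialStatementPartition
ALL TO ([PRIMARY]);
GO

-- Move the existing heap onto the scheme by building a clustered index on it.
CREATE CLUSTERED INDEX IX_FinancialStatementIds_periodenddate
ON dbo.FinancialStatementIds (periodenddate)
ON financialStatementScheme (periodenddate);
GO

-- With RANGE RIGHT, partition 1 holds dates before 2013-01-01 and
-- partition 2 holds 2013-01-01 up to (not including) 2014-01-01.
SELECT *
FROM dbo.FinancialStatementIds
WHERE $PARTITION.financialStatementPartition(periodenddate) = 2;
```

Note that filtering on the partition number returns every row in that date range, whereas an equality test on periodenddate returns only rows with exactly that timestamp.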
Error altering table name MSSQL
Wanted to see if I could get some help concerning altering and renaming tables in MSSQL Server 2008. I am getting an error message on the sp_rename syntax and maybe I am doing something wrong?
-- Archives data existing one week ago and then recreates production table (registration data)
IF EXISTS (SELECT * FROM dbo.tab_reg13_old)
DROP TABLE dbo.tab_reg13_old;
sp_rename 'dbo.tab_reg13', 'dbo.tab_reg13_old';
CREATE TABLE [dbo].[tab_reg13](
[badge] [nvarchar](255) NULL,
[firstname] [nvarchar](255) NULL,
[lastname] [nvarchar](255) NULL,
[degree] [nvarchar](255) NULL,
[title] [nvarchar](255) NULL,
[company] [nvarchar](255) NULL,
[address1] [nvarchar](255) NULL,
[address2] [nvarchar](255) NULL,
[city] [nvarchar](255) NULL,
[state] [nvarchar](255) NULL,
[zipcode] [nvarchar](255) NULL,
[country] [nvarchar](255) NULL,
[email] [nvarchar](255) NULL,
[association] [nvarchar](255) NULL,
[regclass] [nvarchar](255) NULL,
[regtimestamp] [datetime] NULL
) ON [PRIMARY];
Getting this error message:
Msg 102, Level 15, State 1, Line 5
Incorrect syntax near 'sp_rename'.
It needs exec in front of the proc call:
exec sp_rename 'dbo.tab_reg13', 'dbo.tab_reg13_old';
You need to add exec in front of a proc if it is not the first statement in the batch.
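Put together, the start of the script could look like the sketch below. Two further changes are assumptions about the intent, not part of the answer above: OBJECT_ID is used for the existence check, because selecting from a missing table raises an error instead of evaluating to false, and the new name is passed without the schema prefix, because sp_rename treats its second argument as the bare object name.

```sql
-- Drop last week's archive only if it actually exists.
IF OBJECT_ID(N'dbo.tab_reg13_old', N'U') IS NOT NULL
    DROP TABLE dbo.tab_reg13_old;

-- EXEC is required because sp_rename is not the first statement in the batch.
-- The new name carries no schema; 'dbo.tab_reg13_old' would produce a table
-- literally named [dbo.tab_reg13_old] inside the dbo schema.
EXEC sp_rename N'dbo.tab_reg13', N'tab_reg13_old';
```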
SQL Server BULK INSERT FROM different schemas
I have a database that can have data updated from two external parties. Each of those parties sends a pipe delimited text file that is BULK INSERTED into the staging table. I now want to change the scheme for one of the parties by adding a few columns, but this is unfortunately breaking the BULK INSERT for the other party even though the new columns are all added as NULLABLE. Is there any obvious solution to this?
TABLE SCHEMA:
CREATE TABLE [dbo].[CUSTOMER_ENTRY_LOAD](
[CARD_NUMBER] [varchar](12) NULL,
[TITLE] [varchar](6) NULL,
[LAST_NAME] [varchar](34) NULL,
[FIRST_NAME] [varchar](40) NULL,
[MIDDLE_NAME] [varchar](40) NULL,
[NAME_ON_CARD] [varchar](26) NULL,
[H_ADDRESS_PREFIX] [varchar](50) NULL,
[H_FLAT_NUMBER] [varchar](5) NULL,
[H_STREET_NUMBER] [varchar](10) NULL,
[H_STREET_NUMBER_SUFFIX] [varchar](5) NULL,
[H_STREET] [varchar](50) NULL,
[H_SUBURB] [varchar](50) NULL,
[H_CITY] [varchar](50) NULL,
[H_POSTCODE] [varchar](4) NULL,
[P_ADDRESS_PREFIX] [varchar](50) NULL,
[P_FLAT_NUMBER] [varchar](5) NULL,
[P_STREET_NUMBER] [varchar](10) NULL,
[P_STREET_NUMBER_SUFFIX] [varchar](5) NULL,
[P_STREET] [varchar](50) NULL,
[P_SUBURB] [varchar](50) NULL,
[P_CITY] [varchar](50) NULL,
[P_POSTCODE] [varchar](4) NULL,
[H_STD] [varchar](3) NULL,
[H_PHONE] [varchar](7) NULL,
[C_STD] [varchar](3) NULL,
[C_PHONE] [varchar](10) NULL,
[W_STD] [varchar](3) NULL,
[W_PHONE] [varchar](7) NULL,
[W_EXTN] [varchar](5) NULL,
[DOB] [smalldatetime] NULL,
[EMAIL] [varchar](50) NULL,
[DNS_STATUS] [bit] NULL,
[DNS_EMAIL] [bit] NULL,
[CREDITCARD] [char](1) NULL,
[PRIMVISACUSTID] [int] NULL,
[PREFERREDNAME] [varchar](100) NULL,
[STAFF_NUMBER] [varchar](50) NULL,
[CUSTOMER_ID] [int] NULL,
[IS_ADDRESS_VALIDATED] [varchar](50) NULL
) ON [PRIMARY]
BULK INSERT STATEMENT:
SET @string_temp = 'BULK INSERT customer_entry_load FROM '+char(39)+@inpath
+@current_file+'.txt'+char(39)+' WITH (FIELDTERMINATOR = '+char(39)+'|'+char(39)
+', MAXERRORS=1000, ROWTERMINATOR = '+char(39)+'\n'+char(39)+')'
SET DATEFORMAT dmy
EXEC(@string_temp)
The documentation describes how to use a format file to handle the scenario where the target table has more columns than the source file. An alternative that can sometimes be easier is to create a view on the table and BULK INSERT into the view instead of the table; this possibility is described in the same documentation. And please always mention your SQL Server version.
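As an illustration of the view approach, here is a minimal sketch on a deliberately small, hypothetical staging table (demo_load, demo_load_legacy, NEW_COLUMN and the file path are made up for the example). The view exposes only the columns that the older file layout contains, so the newly added nullable column is simply left NULL.

```sql
-- Hypothetical staging table: two original columns plus one newly added column.
CREATE TABLE dbo.demo_load (
    CARD_NUMBER varchar(12) NULL,
    LAST_NAME   varchar(34) NULL,
    NEW_COLUMN  varchar(50) NULL   -- only present in the other party's file
);
GO

-- View that matches the older file layout.
CREATE VIEW dbo.demo_load_legacy
AS
SELECT CARD_NUMBER, LAST_NAME
FROM dbo.demo_load;
GO

-- The old-format file loads through the view; NEW_COLUMN stays NULL.
BULK INSERT dbo.demo_load_legacy
FROM 'C:\data\legacy_party.txt'   -- hypothetical path
WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n');
```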
Using OPENROWSET with BULK allows you to use your file in a query. You can use that to format the data and select only the columns you need.
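A rough sketch of that idea, reusing the CUSTOMER_ENTRY_LOAD table from the question. The data file and format file paths are hypothetical, and the format file is assumed to describe the pipe-delimited layout and to name its columns after the source fields:

```sql
-- Only the selected columns are inserted; columns not listed stay NULL.
INSERT INTO dbo.CUSTOMER_ENTRY_LOAD (CARD_NUMBER, LAST_NAME, FIRST_NAME)
SELECT src.CARD_NUMBER, src.LAST_NAME, src.FIRST_NAME
FROM OPENROWSET(
        BULK 'C:\data\party_a.txt',            -- hypothetical data file
        FORMATFILE = 'C:\data\party_a.fmt'     -- hypothetical format file
     ) AS src;
```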
In the end I have handled the two different cases with two different BULK INSERT statements (depending on which file is being processed). It seems like there isn't a way to do what I was trying to do with one statement.
You could use the format file idea supplied by @Pondlife. Adapt your insert dynamically based on the input file name (provided there are unique differences between the external parties). Using a CASE expression, simply select the correct format file based on the unique identifier in the file name.
DECLARE @formatFile varchar(max);
SET @formatFile =
CASE WHEN @current_file LIKE '%uniqueIdentifier%' THEN 'file1' ELSE 'file2' END
SET @string_temp = 'BULK INSERT customer_entry_load FROM '+char(39)+@inpath
+@current_file+'.txt'+char(39)+' WITH (FORMATFILE = '+char(39)+@formatFile+char(39)+')'
SET DATEFORMAT dmy
EXEC(@string_temp)
Hope that helps!