I am trying to import data from Sybase using Sqoop. From the logs I can tell that the connection succeeds,
but the job then fails with a SQL exception from Sybase. I don't primarily work with Sybase, so
I could not get far with this error. Only one of my sources resides in Sybase.
I used the following command:
sqoop import --verbose \
--driver com.sybase.jdbc3.jdbc.SybDriver \
--connect jdbc:sybase:Tds:nyhostx123.sm.com:13290/DATABASE=tempdb \
--table tempdb..mit \
--split-by sipid \
--fields-terminated-by ',' \
--target-dir /home/DEVTEST/sqoop_mit \
--username user01 \
-m 1 \
-P
Error Snippet:
13/03/14 07:36:19 INFO mapred.JobClient: Running job: job_201301151126_25936
13/03/14 07:36:20 INFO mapred.JobClient: map 0% reduce 0%
13/03/14 07:36:27 INFO mapred.JobClient: Task Id : attempt_201301151126_25936_m_000000_0, Status : FAILED
java.io.IOException: SQLException in nextKeyValue
at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:265)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
at org.apache.hadoop.mapred.Child.main(Child.java:260)
Caused by: com.sybase.jdbc3.jdbc.SybSQLException: Incorrect syntax near '.'.
at com.sybase.jdbc3.tds.Tds.a(Unknown Source)
at
attempt_201301151126_25936_m_000000_0: log4j:WARN No appenders could be found for logger (org.apache.hadoop.hdfs.DFSClient).
attempt_201301151126_25936_m_000000_0: log4j:WARN Please initialize the log4j system properly.
13/03/14 07:36:33 INFO mapred.JobClient: Task Id : attempt_201301151126_25936_m_000000_1, Status : FAILED
java.io.IOException: SQLException in nextKeyValue
I believe the problem is in the --table parameter. Sqoop expects a bare table name, but you are passing the extra prefix "tempdb.." (presumably the database name), which most likely ends up in the generated SQL and triggers the "Incorrect syntax near '.'" error. Would you mind trying it with "--table mit" only?
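A minimal sketch of that suggestion, assuming the database is already selected through the DATABASE= part of the connect string as in the question. The helper name bare_table_name is hypothetical (it is not part of Sqoop); it only strips a database or owner prefix so that the bare name can be handed to --table.

```shell
# Hypothetical helper (not part of Sqoop): keep only the text after the last '.'
# so that "tempdb..mit" or "tempdb.dbo.mit" becomes "mit".
bare_table_name() {
    printf '%s\n' "${1##*.}"
}

# Print the --table argument that would then be passed to sqoop import.
printf '%s %s\n' '--table' "$(bare_table_name 'tempdb..mit')"
```

With the value from the question this prints "--table mit", which is the form suggested above.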
Related
I am able to successfully import data from SQL Server to HDFS using Sqoop. However, when it tries to link to Hive I get an error, and I am not sure I understand the error correctly.
sudo -u hdfs sqoop import \
-Dorg.apache.sqoop.splitter.allow_text_splitter=true \
--connect "jdbc:sqlserver://XX.XX.X.X:1433;instanceName=data-engr-sql-svr; databaseName=AdventureWorks2019" \
--username sa \
--password XXXXXXXX \
--driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
--warehouse-dir "/user/hive/warehouse/AdventureWorks2019.db" \
--hive-import \
--create-hive-table \
--fields-terminated-by ',' \
--hive-table AdventureWorks2019.Production.TransactionHistory \
--table Production.TransactionHistory \
--split-by TransactionID \
-- --schema Production
I don't know how to handle schemas; most tutorials use a dummy database without proper schemas, which is not helpful.
Error
21/03/31 08:52:47 INFO conf.HiveConf: Using the default value passed in for log id: 95e2b831-cfe5-4108-be0f-0df1d9a8797e
21/03/31 08:52:47 INFO session.SessionState: Updating thread name to 95e2b831-cfe5-4108-be0f-0df1d9a8797e main
21/03/31 08:52:47 INFO conf.HiveConf: Using the default value passed in for log id: 95e2b831-cfe5-4108-be0f-0df1d9a8797e
21/03/31 08:52:47 INFO ql.Driver: Compiling command(queryId=hdfs_20210331085247_050638e8-593a-4d01-8020-c40b7db8e66a): CREATE TABLE IF NOT EXISTS AdventureWorks2019.Production.TransactionHistory ( TransactionID INT, ProductID INT, ReferenceOrderID INT, ReferenceOrderLineID INT, TransactionDate STRING, TransactionType STRING, Quantity INT, ActualCost DOUBLE, ModifiedDate STRING) COMMENT 'Imported by sqoop on 2021/03/31 08:52:45' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054' LINES TERMINATED BY '\012' STORED AS TEXTFILE
21/03/31 08:52:49 INFO hive.metastore: HMS client filtering is enabled.
21/03/31 08:52:49 INFO hive.metastore: Trying to connect to metastore with URI thrift://cnt7-naya-cdh63:9083
21/03/31 08:52:49 INFO hive.metastore: Opened a connection to metastore, current connections: 1
21/03/31 08:52:49 INFO hive.metastore: Connected to metastore.
21/03/31 08:52:49 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
FAILED: SemanticException [Error 10255]: Invalid table name AdventureWorks2019.Production.TransactionHistory
21/03/31 08:52:49 ERROR ql.Driver: FAILED: SemanticException [Error 10255]: Invalid table name AdventureWorks2019.Production.TransactionHistory
Hive has no notion of a schema inside a database. Database and schema mean the same thing in Hive and can be used interchangeably.
So the bug is the three-part name database.schema.table. Use database.table in Hive, for example --hive-table AdventureWorks2019.TransactionHistory.
See the Hive documentation: Create/Drop/Alter/Use Database
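The rule above can be checked before the import is started. This is only a sketch: check_hive_table_name is a hypothetical helper, and it tests nothing beyond the number of dots in the name.

```shell
# Hypothetical pre-flight check: Hive accepts "table" or "database.table",
# so reject any name that contains two or more dots.
check_hive_table_name() {
    case $1 in
        *.*.*) echo "invalid: $1 (use database.table)"; return 1 ;;
        *)     echo "ok: $1" ;;
    esac
}

# The name from the question is rejected, the two-part name is accepted.
check_hive_table_name AdventureWorks2019.Production.TransactionHistory || true
check_hive_table_name AdventureWorks2019.TransactionHistory
```

Dropping the source schema is one possible choice. If two source schemas contain a table with the same name, a different Hive table name would be needed to keep them apart.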
I am trying to export files from an HDFS directory to a Sybase IQ table.
I have placed the Sybase driver in the Sqoop lib path.
Sqoop command:
sqoop export \
--connect jdbc:sybase:Tds:sybasehost:port/DATABASE=OMEGA \
--username dummy \
--password dummy \
--driver com.sybase.jdbc4.jdbc.SybDriver \
--table omega_sybase_table \
--export-dir /user/cloudera/omega/output_files/ \
--input-fields-terminated-by ','
The export fails with the error below.
17/04/25 16:17:07 INFO mapreduce.Job: Task Id : attempt_1489579695153_4935_m_000002_1, Status : FAILED
Error: java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: java.sql.SQLException: SQL Anywhere Error -210: User 'another user' has the row in 'omega_sybase_table' locked
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:233)
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
... 10 more
Caused by: java.sql.SQLException: SQL Anywhere Error -210: User 'another user' has the row in 'omega_sybase_table' locked
at com.sybase.jdbc4.jdbc.SybConnection.getAllExceptions(Unknown Source)
Could someone help me fix this issue?
This happens because the sqoop export command uses multiple map tasks.
The map tasks try to insert records into the Sybase IQ table in parallel, each over its own connection, and Sybase IQ lets only one connection write to the table at a time, so the others hit the lock.
The solution is to use -m 1 in the sqoop export command.
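Applied to the command from the question, the single-mapper export could look like the sketch below. Host, port, credentials and table are the placeholders from the question, and the wrapper name export_single_mapper is hypothetical.

```shell
# Sketch: the export from the question, limited to one map task so that only
# one connection writes to the Sybase IQ table at a time.
# export_single_mapper is a hypothetical wrapper name; all values are the
# placeholders used in the question.
export_single_mapper() {
    sqoop export \
        --connect "jdbc:sybase:Tds:sybasehost:port/DATABASE=OMEGA" \
        --username dummy \
        --password dummy \
        --driver com.sybase.jdbc4.jdbc.SybDriver \
        --table omega_sybase_table \
        --export-dir /user/cloudera/omega/output_files/ \
        --input-fields-terminated-by ',' \
        -m 1
}
```

Calling export_single_mapper runs the export. The trade-off is that the whole export is serialized through one task, so it will take longer than a parallel run.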
I was trying to import all tables from a specific schema in DB2 using the command line below.
sqoop import-all-tables --username user --password pass \
--connect jdbc:db2://myip:50000/databs:CurrentSchema=testdb \
--driver com.ibm.db2.jcc.DB2Driver --fields-terminated-by ',' \
--lines-terminated-by '\n' --hive-database default --hive-import --hive-overwrite \
--create-hive-table -m 1;
I am stuck with the following error:
2017-05-02 09:21:18,474 ERROR - [main:] ~ Error reading database metadata:
com.ibm.db2.jcc.am.SqlSyntaxErrorException: [jcc][10165][10051][4.11.77]
Invalid database URL syntax:
jdbc:db2://myip:50000/msrc:CurrentSchema=testdb. ERRORCODE=-4461,
SQLSTATE=42815 (SqlManager:43)
com.ibm.db2.jcc.am.SqlSyntaxErrorException: [jcc][10165][10051][4.11.77]
Invalid database URL syntax:
jdbc:db2://myip:50000/msrc:CurrentSchema=testdb. ERRORCODE=-4461,
SQLSTATE=42815
at com.ibm.db2.jcc.am.gd.a(gd.java:676)
at com.ibm.db2.jcc.am.gd.a(gd.java:60)
at com.ibm.db2.jcc.am.gd.a(gd.java:85)
at com.ibm.db2.jcc.DB2Driver.tokenizeURLProperties(DB2Driver.java:911)
at com.ibm.db2.jcc.DB2Driver.connect(DB2Driver.java:408)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:885)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.listTables(SqlManager.java:520)
at org.apache.sqoop.tool.ImportAllTablesTool.run(ImportAllTablesTool.java:95)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.util.NoSuchElementException
at java.util.StringTokenizer.nextToken(StringTokenizer.java:349)
at java.util.StringTokenizer.nextToken(StringTokenizer.java:377)
at com.ibm.db2.jcc.DB2Driver.tokenizeURLProperties(DB2Driver.java:899)
... 13 more
Could not retrieve tables list from server
2017-05-02 09:21:18,696 ERROR - [main:] ~ manager.listTables() returned null
(ImportAllTablesTool:98)
Command:
sqoop import-all-tables \
--driver com.ibm.db2.jcc.DB2Driver \
--connect jdbc:db2://myip:50000/databs \
--username username --password password \
--hive-database default --hive-import -m 1 \
--create-hive-table --hive-overwrite
The import-all-tables tool imports a set of tables from an RDBMS to HDFS. Data from each table is stored in a separate directory in HDFS.
For the import-all-tables tool to be useful, the following conditions must be met:
Each table must have a single-column primary key.
You must intend to import all columns of each table.
You must not intend to use non-default splitting column, nor impose any conditions via a WHERE clause.
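If the schema restriction from the question is still wanted, the URL would need to be well formed first. The sketch below rests on an assumption that the thread does not confirm: as far as I know, the IBM JDBC driver expects each URL property after the database name to be terminated by a semicolon and spells the property currentSchema. The helper name db2_url is hypothetical, and whether import-all-tables then picks up only the tables of that schema is not verified here.

```shell
# Hypothetical helper: build a DB2 JDBC URL with a schema property.
# Assumption: every property must end with ';' and the property is spelled
# currentSchema. Arguments: host, port, database, schema.
db2_url() {
    printf 'jdbc:db2://%s:%s/%s:currentSchema=%s;\n' "$1" "$2" "$3" "$4"
}

db2_url myip 50000 databs testdb
```

When the result is passed to --connect it has to be quoted, for example --connect "$(db2_url myip 50000 databs testdb)", because an unquoted ';' ends the command in the shell.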
We are trying to export data from Hive to a SQL Server table.
sqoop export -D mapred.child.java.opts='\-Djava.security.egd=file:/dev/../dev/urandom' --connect 'jdbc:sqlserver://' --username $$$$ --password #### --table ib_c3 --columns BILL_TO_CUSTOMER_NAME,CONTRACT_NUMBER,SERVICE_LINE_ID,SERVICE_LINE_NAME,SERVICE_LINE_STATUS,STS_CODE,INSTANCE_ID,SERIAL_NUMBER,ITEM_NAME,QUANTITY,INVENTORY_ITEM_ID,WARRANTY_TYPE,WARRANTY_END_DATE,SHIP_DATE,PARTY_SITE_ID,LAST_DOS,IB_PRODUCT_TYPE,PRODUCT_FAMILY,ERP_ITEM_TYPE,SKU_LIST_PRICE -m 1 --input-fields-terminated-by '\001' --export-dir /app/dev/SmartAnalytics/Apps/CSP/hivewarehouse/csp.db/csp_ib_c3_export --input-null-string '\\N' --input-null-non-string '\\N' -- --schema staging
During the export we get the error below from Sqoop. Can someone help us with this issue?
Caused by: java.io.IOException: com.microsoft.sqlserver.jdbc.SQLServerException: The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction.
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:644)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
... 10 more
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:216)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1515)
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatementBatch(SQLServerPreparedStatement.java:1299)
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement$PrepStmtBatchExecCmd.doExecute(SQLServerPreparedStatement.java:1209)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:5696)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1715)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:180)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:155)
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeBatch(SQLServerPreparedStatement.java:1173)
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:231)
When I run the following sqoop command from $SQOOP_HOME/bin it works fine:
sqoop import --connect "jdbc:sqlserver://ip_address:port_number;database=database_name;username=sa;password=sa#Admin" --table $SQL_TABLE_NAME --hive-import --hive-home $HIVE_HOME --hive-table $HIVE_TABLE_NAME -m 1
But when I run the same command in a loop over different databases from a bash script, as follows:
while IFS='' read -r line || [[ -n $line ]]; do
DATABASE_NAME=$line
sqoop import --connect "jdbc:sqlserver://ip_address:port_number;database=$DATABASE_NAME;username=sa;password=sa#Admin" --table $SQL_TABLE_NAME --hive-import --hive-home $HIVE_HOME --hive-table $HIVE_TABLE_NAME -m 1
done < "$1"
I pass the database names to my bash script in a text file given as a parameter. The Hive table is the same for every run, because I want to append data from all databases into a single Hive table.
For the first two or three databases it works fine; after that it starts giving the following errors:
15/06/25 11:41:06 INFO mapreduce.Job: Job job_1435124207953_0033 failed with state FAILED due to:
15/06/25 11:41:06 INFO mapreduce.ImportJobBase: The MapReduce job has already been retired. Performance
15/06/25 11:41:06 INFO mapreduce.ImportJobBase: counters are unavailable. To get this information,
15/06/25 11:41:06 INFO mapreduce.ImportJobBase: you will need to enable the completed job store on
15/06/25 11:41:06 INFO mapreduce.ImportJobBase: the jobtracker with:
11:41:06 INFO mapreduce.ImportJobBase:mapreduce.jobtracker.persist.jobstatus.active = true
11:41:06 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.hours = 1
15/06/25 11:41:06 INFO mapreduce.ImportJobBase: A jobtracker restart is required for these settings
15/06/25 11:41:06 INFO mapreduce.ImportJobBase: to take effect.
15/06/25 11:41:06 ERROR tool.ImportTool: Error during import: Import job failed!
I have already restarted my multi-node Hadoop cluster after adding the two parameters mentioned above to mapred-site.xml, i.e.
mapreduce.jobtracker.persist.jobstatus.active = true
mapreduce.jobtracker.persist.jobstatus.hours = 1
I am still facing the same problem. As I have just started learning Sqoop, any help will be appreciated.
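This does not explain the job failure, but a version of the loop that stops at the first failing import makes it easier to see which database triggers it. It is only a sketch: import_one and import_all are hypothetical names, SQL_TABLE_NAME, HIVE_HOME and HIVE_TABLE_NAME are assumed to be set as in the original script, and the redirect from /dev/null is a precaution rather than a known fix.

```shell
# Hypothetical wrapper around the sqoop command from the question.
# $1 is the database name; stdin is redirected so the command cannot
# consume the list of database names that the loop below is reading.
import_one() {
    sqoop import \
        --connect "jdbc:sqlserver://ip_address:port_number;database=$1;username=sa;password=sa#Admin" \
        --table "$SQL_TABLE_NAME" --hive-import --hive-home "$HIVE_HOME" \
        --hive-table "$HIVE_TABLE_NAME" -m 1 < /dev/null
}

# Read one database name per line from the file given as $1 and stop at
# the first import that fails, reporting which database it was.
import_all() {
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue      # skip blank lines
        DATABASE_NAME=$line             # no leading '$' when assigning
        if ! import_one "$DATABASE_NAME"; then
            echo "import failed for database: $DATABASE_NAME" >&2
            return 1
        fi
    done < "$1"
}
```

In the script this would be called as import_all "$1". The logs of the reported database's job can then be inspected on their own.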