sqoop export to SQL server - with where clause - export

I have a partitioned table, and I only want to export the partition for today. Is there a way to pass a query with a variable to the export command?
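Sqoop's export tool has no --query or --where option, so a common workaround is to point --export-dir at the HDFS directory of just the partition you want. A minimal sketch, assuming a Hive-style partition layout and hypothetical host, database, table, and path names:

sqoop export \
  --connect "jdbc:sqlserver://host:1433;database=mydb;username=user;password=pass" \
  --table daily_table \
  --export-dir /user/hive/warehouse/mydb.db/my_table/dt=$(date +%Y-%m-%d) \
  --input-fields-terminated-by '\001'

Because the shell computes the date, the same command can run unchanged from a daily cron job.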

Related

Are functions allowed in SQL Server's Import/Export Wizard?

I am using SQL Server Import and Export Wizard to transfer data from an Excel file into a table. I wrote the following query in the wizard's query option:
Select
ISNULL([Col 1], [Col 2]),
[Col 3]
FROM [myExcelWorkSheet$]
It says this SQL Statement is not a query. In fact, no other functions seem to work, such as COALESCE and even CAST.
Does the SQL Server Import and Export Wizard not accept functions?
You are missing the schema qualification. It must be something like this:
Select
ISNULL([Col 1], [Col 2]),
[Col 3]
FROM [dbo].[myExcelWorkSheet$];

Best practice to import data from SQL server to Hive through Sqoop

We are working on importing data from MS SQL Server to Hive through Sqoop. If we use incremental append mode, which is the requirement, then we need to specify the --last-value of the row id that we inserted last time.
I have to import about 100 tables into Hive.
What is the best practice for saving the row id value for all tables and specifying it via the Sqoop --last-value option?
Why doesn't Sqoop itself compare the row ids of the source and destination tables and import only the rows after the destination table's last row id?
If I save the last row id value for all tables in a Hive table and want to use those values in a Sqoop job, how is that possible?
Above all, I want to automate the import job so that I do not have to provide the value manually for each table's daily import.
Any pointers?
Thanks
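A saved Sqoop job covers most of this: the Sqoop metastore records the last imported value after every run, so you do not have to track --last-value yourself. A minimal sketch, with hypothetical connection details, table name, and check column (note that incremental append combined with --hive-import can behave differently across Sqoop versions):

sqoop job --create import_mytable -- import \
  --connect "jdbc:sqlserver://host:1433;database=mydb;username=user;password=pass" \
  --table mytable \
  --hive-import \
  --incremental append \
  --check-column row_id \
  --last-value 0

sqoop job --exec import_mytable

For 100 tables, you could generate one saved job per table in a shell loop and schedule the --exec calls from cron or Oozie.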

Sqoop Import HBase - Sql Database

I want to migrate my data from a SQL database to HBase. One problem is that my SQL tables don't have a primary key, so to overcome this I am using a composite key in the Sqoop query. I have successfully imported data from SQL to HBase, but the main problem is that the imported data doesn't include the columns used for the candidate key, which I need along with the imported data. Kindly give some resolution to this!
The Sqoop query I am currently using is of the below format:
sqoop import --connect "jdbc:sqlserver://Ip:1433;database=dbname;username=test;password=test" --table TableName --hbase-create-table --hbase-table TableName --column-family NameSpace --hbase-row-key Candidate1,Candidate2,Candidate3 -m 1
Also let me know if anyone knows a query to import the complete database rather than a single table.
After lots of research, I came across the correct syntax, through which I was able to load all the data without losing any columns:
sqoop import -D sqoop.hbase.add.row.key=true --connect "jdbc:sqlserver://IP:1433;database=DBNAME;username=UNAME;password=PWD" --table SQLTABLENAME --hbase-create-table --hbase-table HBASETABLENAME --column-family COLUMNFAMILYNAME --hbase-row-key PRIMARYKEY -m 1
OR
sqoop import -D sqoop.hbase.add.row.key=true --connect "jdbc:sqlserver://IP:1433;database=DBNAME;username=UNAME;password=PWD" --table SQLTABLENAME --hbase-create-table --hbase-table HBASETABLENAME --column-family COLUMNFAMILYNAME --hbase-row-key CANDIDATEKEY1,CANDIDATEKEY2,CANDIDATEKEY3 -m 1
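For the complete-database part of the question: Sqoop also ships an import-all-tables tool, although it targets HDFS/Hive rather than HBase and expects each table to have a single-column primary key (or -m 1). A rough sketch with the same hypothetical connection details:

sqoop import-all-tables \
  --connect "jdbc:sqlserver://IP:1433;database=DBNAME;username=UNAME;password=PWD" \
  --warehouse-dir /data/DBNAME \
  -m 1

Tables that need composite row keys would still have to be imported one at a time with commands like those above.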

Run query on sql server through teradata and store result in teradata

I have one table in SQL Server and 5 tables in Teradata. I want to join those 5 Teradata tables with the SQL Server table and store the result in a Teradata table.
I have the SQL Server name, but I don't know how to run a query on SQL Server and Teradata simultaneously.
I want to do this:
SQL Server query:
Select distinct store
from store_Desc
Teradata query:
select cmp_id,state,sde
from xyz
where store in (
select distinct store
from sql server table)
You can create a table (or a volatile table if you do not have write privileges) to do this. Export the result from SQL Server as text or into the language of your choice.
CREATE VOLATILE TABLE store_table (
column_1 datatype_1,
column_2 datatype_2,
...
column_n datatype_n);
You may need to add ON COMMIT PRESERVE ROWS before the ; to the above depending on your transaction settings.
From a programming language, you can loop over the statement below or do an executemany.
INSERT INTO store_table VALUES(value_1, value_2, ..., value_n);
Or you can import from text using Teradata SQL Assistant by going to File and selecting Import Data. Then execute the below and navigate to your file.
INSERT INTO store_table VALUES(?, ?, ..., ?);
Once you have inserted your data you can query it by simply referencing the table name.
SELECT cmp_id,state,sde
FROM xyz
WHERE store IN(
SELECT store
FROM store_table)
The DISTINCT filtering is most easily done on export from SQL Server, to minimize the rows you need to upload.
EDIT:
If you are doing this many times, you can do it with a script; here is a very simple example in Python:
import pyodbc
con_ss = pyodbc.connect('sql_server_odbc_connection_string...')
crs_ss = con_ss.cursor()
con_td = pyodbc.connect('teradata_odbc_connection_string...')
crs_td = con_td.cursor()
# pull data from sql server
data_ss = crs_ss.execute('''
SELECT distinct store AS store
from store_Desc
''').fetchall()
# create table in teradata
crs_td.execute('''
CREATE VOLATILE TABLE store_table (
store DEC(4, 0)
) PRIMARY INDEX (store)
ON COMMIT PRESERVE ROWS;''')
con_td.commit()
# insert values; you can also use an execute many, but this is easier to read...
for row in data_ss:
    crs_td.execute('''INSERT INTO store_table VALUES(?)''', row)
con_td.commit()
# get final data
data_td = crs_td.execute('''SELECT cmp_id,state,sde
FROM xyz
WHERE store IN(
SELECT store
FROM store_table);''').fetchall()
# from here write to file or whatever you would like.
Is fetching data from SQL Server through ODBC an option?
The best option may be to use Teradata Parallel Transporter (TPT) to fetch data from SQL Server using its ODBC operator (as the producer), combined with the Load or Update operator as the consumer, to insert it into an intermediate table on Teradata. You must then perform the rest of the operations on Teradata. For those, you can use BTEQ/SQLA to store the results in the final Teradata table. You can also put the same SQL in TPT's DDL operator instead of BTEQ/SQLA and get it all done in a single job script.
To use tables residing in separate DB environments (in your case SQL Server and Teradata) in a single SELECT statement, Teradata has recently released Teradata QueryGrid. But I'm not sure about the exact level of support for SQL Server, and it would involve licensing hassle and quite a learning curve for this simple job.
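A rough sketch of a TPT job script for the ODBC-to-Load pipeline described above; the DSN, credentials, schema, and table names are all hypothetical, and attribute names may vary by TPT version:

DEFINE JOB load_store_table
(
  DEFINE SCHEMA store_schema
  ( store VARCHAR(10) );

  DEFINE OPERATOR odbc_reader
  TYPE ODBC
  SCHEMA store_schema
  ATTRIBUTES
  ( VARCHAR DSNName = 'sqlserver_dsn',
    VARCHAR SelectStmt = 'SELECT DISTINCT store FROM store_Desc;' );

  DEFINE OPERATOR td_loader
  TYPE LOAD
  SCHEMA store_schema
  ATTRIBUTES
  ( VARCHAR TdpId = 'mytdpid',
    VARCHAR UserName = 'myuser',
    VARCHAR UserPassword = 'mypassword',
    VARCHAR TargetTable = 'store_table',
    VARCHAR LogTable = 'store_table_log' );

  APPLY ('INSERT INTO store_table (store) VALUES (:store);')
  TO OPERATOR (td_loader)
  SELECT * FROM OPERATOR (odbc_reader);
);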

Can't import as null value SQL Server 2008 TSV file

I import data from a TSV file with SQL Server 2008.
When I check the table after the import, NULL values in integer columns have been replaced by 0.
How can I import them as NULL? Please help me!
Using bcp, -k switch
Using BULK INSERT, use KEEPNULLS
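Hedged sketches of both, assuming a hypothetical table dbo.MyTable and file C:\data\mytable.tsv (with -c, bcp defaults to tab-separated fields):

bcp dbo.MyTable in C:\data\mytable.tsv -S myserver -T -c -k

BULK INSERT dbo.MyTable
FROM 'C:\data\mytable.tsv'
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', KEEPNULLS);

With either, empty fields in the file are loaded as NULL instead of being replaced by the column default.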
After comment:
Using SSIS "Bulk insert" task, options page, "Keep nulls" = true
This is what the Import Wizard uses, but you'll have to save and edit the package first, because I see no such option in my SSMS 2005 wizard.
This can be set in the OLE DB Destination editor; there is a 'Keep nulls' option.
An alternative for those using the Import and Export Wizard on SQL Server Express, or anyone who finds themselves too lazy to modify the SSIS package:
Using text-editing software before you run the wizard, replace the NULLs with a valid value that you know doesn't appear in your dataset (e.g. 987654; be sure to search for it first!), and then run the Import and Export Wizard normally. If your data contains every single possible value (maybe bits or tinyints), you'll have some data massaging ahead of you, but it's still possible by using a temporary table whose datatypes can store a greater range of values. Once the data is in SQL Server, use commands like
UPDATE TempTable
SET Column1 = NULL
WHERE Column1 = 987654
to get those NULLs where they belong. If you've used a temporary table, use INSERT INTO or MERGE to get your data into your end table.
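For instance, a minimal INSERT INTO ... SELECT, with hypothetical table and column names:

INSERT INTO dbo.EndTable (Column1, Column2)
SELECT Column1, Column2
FROM TempTable;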
