I am trying to import a huge amount of data from SQL Server.
The table has a date column, and I want to import the table in chunks divided by date.
Is there any way to do this?
Alternatively, is there a way to import the table with Sqoop first and then split it by the date column in Hive?
I understand that if I use the "--split-by" option, I can divide the table by date; is this correct?
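A minimal sketch of a per-date import, assuming hypothetical connection details and an `orders` table with `order_date` and numeric `order_id` columns. Note that `--where` is what restricts the rows to one day; `--split-by` only controls how the mappers parallelize within the selected rows, so by itself it does not partition the output by date:

```shell
# Hypothetical connection and table names -- adjust for your environment.
DAY="2016-03-06"
NEXT_DAY="2016-03-07"

# --where limits the import to one day's rows; --split-by only tells the
# mappers how to divide that day's work among themselves.
CMD="sqoop import \
  --connect 'jdbc:sqlserver://dbhost:1433;databaseName=sales' \
  --username etl -P \
  --table orders \
  --where \"order_date >= '$DAY' AND order_date < '$NEXT_DAY'\" \
  --split-by order_id \
  --target-dir /staging/orders/dt=$DAY \
  -m 4"

# Printed rather than executed so the sketch is safe to run as-is.
echo "$CMD"
```

Running this once per day (scripted over a list of dates) gives you one HDFS directory per day, which maps naturally onto a date-partitioned Hive table.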
I have a history table that I'd like to convert to a SQL Server temporal history table. Every day I take a snapshot of the data and add it to the table, regardless of changes. This lets me get a point-in-time view for each day of the year, and since I have three years of data, I can compare the same day across each year. Anyway, these tables are getting rather large, and we'd like to start using temporal tables. I've been trying to figure out whether I can convert these history tables to a temporal history table with the proper from and to dates based on the date added in the current table.
The examples I've found don't seem to allow for this. Any ideas? Thanks.
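One possible approach, sketched under assumptions (a current table `dbo.Customer` with key `CustomerId`, and daily snapshots in a separate `dbo.CustomerSnapshots` table with a `SnapshotDate` column; all names are hypothetical): derive each snapshot row's validity interval from consecutive snapshot dates with LEAD, load that into a history table whose columns match the current table, then switch system-versioning on pointing at it. The history table's schema must match the current table's exactly, and using the current time as the last interval's end is a simplification.

```sql
-- 1) Add the period columns to the current table.
ALTER TABLE dbo.Customer ADD
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START
        CONSTRAINT DF_Customer_From DEFAULT SYSUTCDATETIME(),
    ValidTo DATETIME2 GENERATED ALWAYS AS ROW END
        CONSTRAINT DF_Customer_To
        DEFAULT CONVERT(DATETIME2, '9999-12-31 23:59:59.9999999'),
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo);

-- 2) Build the history rows: each snapshot is valid from its own date
--    until the next snapshot of the same key.
SELECT  CustomerId, Name,
        CAST(SnapshotDate AS DATETIME2) AS ValidFrom,
        LEAD(CAST(SnapshotDate AS DATETIME2), 1, SYSUTCDATETIME())
            OVER (PARTITION BY CustomerId ORDER BY SnapshotDate) AS ValidTo
INTO    dbo.Customer_History
FROM    dbo.CustomerSnapshots;

-- 3) Attach the prepared history table; DATA_CONSISTENCY_CHECK verifies
--    the period columns are consistent.
ALTER TABLE dbo.Customer
    SET (SYSTEM_VERSIONING = ON
         (HISTORY_TABLE = dbo.Customer_History,
          DATA_CONSISTENCY_CHECK = ON));
```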
I have an empty table into which I am trying to import data from a flat file. The only option I seem to have is to create the table, which isn't needed. Why can't I select the other options?
In the screen before the one in your screenshot, click where the destination table name is shown to select an existing table.
I have loaded data into HANA using a CSV file and now I have added a new column to the table by
ALTER TABLE <tablename> ADD (<columnname> <datatype>);
Now I want to import data into this specific column using a CSV file. I could not find any methods to import to a specific column. Any help is appreciated. Thanks in advance.
The CSV import feature doesn't allow for partial loading and "column filling".
What you could do is to load the new data (together with the key columns of course) to a staging table and then update the target table columns from there.
Be aware that the CSV import is not meant to be an ETL solution; for that, the Smart Data Integration features are there.
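A sketch of that staging approach, with hypothetical table, column, and file names. The IMPORT statement shown is HANA's server-side CSV import, and the final UPDATE copies the new column into the target by key (exact UPDATE ... FROM syntax may need adjusting for your HANA version):

```sql
-- Hypothetical: target_table(id, ..., new_col); the CSV holds (id, new_col).
CREATE COLUMN TABLE staging_newcol (id INTEGER, new_col NVARCHAR(100));

IMPORT FROM CSV FILE '/tmp/new_col.csv' INTO staging_newcol
   WITH RECORD DELIMITED BY '\n'
        FIELD DELIMITED BY ',';

UPDATE target_table
   SET new_col = s.new_col
  FROM target_table t, staging_newcol s
 WHERE t.id = s.id;
```

The staging table can be dropped once the update has been verified.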
I am looking to store data in Hive to run analysis on the past months (~100 GB per day).
My rows contain a date (STRING) field that looks like this: 2016-03-06T04:31:59.933012793+08:00
I want to partition based on this field, but only on the date part (2016-03-06); I don't care about the timezone. Is there any way to achieve that without changing the original format?
The reason for partitioning is both performance and the ability to delete old days to keep a rolling window of data.
Thank you
You can achieve this with INSERT OVERWRITE into a table that uses dynamic partitioning.
You can apply the substring or regexp_extract function to your date-time column to get the value in the required format.
Below is a sample query where I load a partitioned table by applying a function to the partition column.
CREATE TABLE base2(id int, name String)
PARTITIONED BY (state string);
INSERT OVERWRITE TABLE base2 PARTITION (state)
SELECT id, name, substring(state,0,1)
FROM base;
Here I am applying a transformation to the partition column. Hope this helps.
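Applied to the timestamp format in the question, the same pattern looks like this (table and column names are hypothetical): substr takes the first 10 characters, e.g. '2016-03-06', as the partition value while the original string is stored unchanged, and old days can then be dropped as whole partitions for the rolling window.

```sql
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

CREATE TABLE events_by_day (payload STRING, event_time STRING)
PARTITIONED BY (dt STRING);

-- event_time keeps its original format; only the partition key is derived.
INSERT OVERWRITE TABLE events_by_day PARTITION (dt)
SELECT payload, event_time, substr(event_time, 1, 10)
FROM events_raw;

-- Rolling window: drop a whole day as a single metadata operation.
ALTER TABLE events_by_day DROP IF EXISTS PARTITION (dt = '2016-03-06');
```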
Is it possible to get the names of columns that were updated recently?
A database contains several tables, and some of those tables have columns of a date type (date, datetime, or another date/time type).
Some of those date columns have been updated.
My question is how to get the list of table names, column names, and values for the columns that were updated recently.
SQL Server has triggers; have a look at their advantages and disadvantages, and I hope that clarifies your doubt: Triggers
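A minimal sketch of the trigger approach, with hypothetical table and column names: it records the table name, column name, and new value into an audit table whenever the date column is touched by an UPDATE. Note that going forward-only is the trade-off; triggers cannot recover updates that happened before they were created.

```sql
-- Hypothetical audit table and source table (dbo.Orders with OrderDate).
CREATE TABLE dbo.DateColumnAudit (
    TableName  SYSNAME,
    ColumnName SYSNAME,
    NewValue   DATETIME2,
    UpdatedAt  DATETIME2 DEFAULT SYSUTCDATETIME()
);
GO
CREATE TRIGGER trg_Orders_DateAudit
ON dbo.Orders
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- UPDATE(col) is true when the column appeared in the SET list.
    IF UPDATE(OrderDate)
        INSERT INTO dbo.DateColumnAudit (TableName, ColumnName, NewValue)
        SELECT 'dbo.Orders', 'OrderDate', i.OrderDate
        FROM inserted AS i;
END;
```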
You can try to:
a) decode DBCC LOG/fn_dblog() output by hand. Starting point: http://sqlfascination.com/2010/02/03/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-1/
b) purchase "SysTools SQL Log Analyzer" or "ApexSQL Log"
What you want is possible but costs time or money.