preserve the data while dropping a hive internal table - sql-server

I have loaded a huge table from SQL Server into Hive. The mistake I made is that I created it as an internal table in Hive. Can anyone suggest a hack so that I can alter the table structure without dropping the data?
The data is huge and I can't afford to export it from the source again.
The problem right now is that, since the column order doesn't match the SQL Server table, a lot of columns display NULL.
Any help will be highly appreciated.

I do not see any problem with using ALTER TABLE on an internal table. (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/Partition/Column)
Another option, though not recommended, is to open your Hive metastore and apply the changes there. Hive reads its schema information from a relational database (configured during the Hadoop setup; MySQL is a common choice, while embedded Derby is the out-of-the-box default). In that database you can try to change the table definitions directly. However, this is not recommended, because a single mistake can corrupt all of your Hive databases.
The safest way is to create a new table, using the existing one as the source:
create table new_table
as
select
[...]
from existing_table
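If the goal is only to make sure a later DROP TABLE does not delete the underlying files, one commonly used alternative is to flip the table from internal (managed) to external through its table properties. The following is a minimal HiveQL sketch, not a tested recipe for your cluster: `existing_table` is the name from the example above, while `col_a`, `col_b` and the STRING type are hypothetical placeholders. Behaviour can differ between Hive versions, so try it on a small copy first.

```sql
-- Mark the managed (internal) table as external so that DROP TABLE
-- removes only the metadata and leaves the data files in place.
-- On older Hive versions the key and value are case sensitive.
ALTER TABLE existing_table SET TBLPROPERTIES ('EXTERNAL'='TRUE');

-- Check the result: "Table Type" should now read EXTERNAL_TABLE.
DESCRIBE FORMATTED existing_table;

-- Reordering a column changes only the metadata, not the stored files.
-- col_a, col_b and STRING are hypothetical; use your real names and types.
ALTER TABLE existing_table CHANGE COLUMN col_b col_b STRING AFTER col_a;
```

Because CHANGE COLUMN touches only the metadata, it helps when the declared column order is wrong relative to the files, which matches the NULL symptom described in the question.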

Related

Import CSV into SQL Server database, keeping ID column values

I am working on migrating a SQLite database to SQL Server and I need to use IntelliJ IDEA to import all the data from the SQLite tables into the MSSQL database.
I have exported the data to CSV format, but when I import into SQL Server, I need to maintain the existing ID columns (as foreign keys refer to it).
Normally, I can do this by executing SET IDENTITY_INSERT xxx ON; prior to my INSERT statements.
However, I do not know how to do this when importing CSV using IntelliJ.
The only other option I see is to export the data as a series of SQL INSERT statements, but that is very time consuming as the schemas between the two databases are slightly different (not to mention the SQL syntax).
Is there another way to import this data?
I don't know how to perform an Identity Insert ON in an IntelliJ query, but I do know how to work around this problem. Import your data into a temporary (staging) table, then execute a query within SQL Server that:
1. Sets Identity Insert ON
2. Inserts the data from the temporary table into the final destination
3. Sets Identity Insert OFF
What this really does is prevent you from having to spend (potentially) hours finding out how to implement an Identity Insert ON in IntelliJ when you may never need to do this again. It is straightforward and simple to code as well.
However, if you want to learn if there is a way to do this in IntelliJ, go for it. That would be a more optimal method.
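The three steps above can be sketched in T-SQL roughly as follows. The table names `dbo.Customer` and `dbo.Customer_Staging` and the columns `Id` and `Name` are hypothetical; substitute your own. Note that SQL Server requires an explicit column list on the INSERT while IDENTITY_INSERT is ON.

```sql
-- Assumption: dbo.Customer_Staging was filled by the CSV import and
-- dbo.Customer is the real table whose Id column is an IDENTITY column.
BEGIN TRANSACTION;

SET IDENTITY_INSERT dbo.Customer ON;

-- An explicit column list is mandatory while IDENTITY_INSERT is ON.
INSERT INTO dbo.Customer (Id, Name)
SELECT Id, Name
FROM dbo.Customer_Staging;

SET IDENTITY_INSERT dbo.Customer OFF;

COMMIT TRANSACTION;
```

Only one table per session can have IDENTITY_INSERT set to ON at a time, so switch it OFF before moving on to the next table.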

Insert data into SQL tables manually using the related columns in Management Studio

I am trying to insert data into some related tables in SQL Server 2008 R2, and I am trying to figure out whether there is an easier way to insert data manually (visually) using the related columns and not the IDs. If you check the two snapshots of the tables and the table WFUserGroup, basically I am trying to see if I can have a bound query (like in MS Access) where I can see the Name column instead of the ID, and the name of the group instead of the group_id.
I know that with a TRANSACTION block and INSERT INTO statements I can create a new user in the WFUser table and then relate it to a group in the WFUserGroup table, but I keep telling myself there should be an easier way. Does anyone know a workaround?
Tables:
Using Edit Top 200 Rows Feature:
You could put the data in a flat file (.csv) or an Excel file and use the Import feature in SQL Server.
To get there, right-click the database and choose Tasks --> Import Data, then use the wizard to select the file and the target tables.
I do see that there are primary keys and foreign keys, so you have to make sure they are accounted for in the files you are going to import.

Quick way to perform a fulltext-search on MS SQL Server

First of all: I don't need a full-text search engine, and I don't need full-text search in my code. I have a database with ~2000 tables, and I need to find the table and column in which certain information is stored, for development purposes. Is there any quick way (maybe an SQL Server Management Studio trick that I should know of) to do this? I think phpMyAdmin provides such a feature for MySQL databases. At the moment I'm seriously thinking of dumping the database to an .sql file and using a text editor to search for the phrases I'm looking for.
Check the INFORMATION_SCHEMA. You can select from it: the INFORMATION_SCHEMA.COLUMNS view lists all the column names, and you can then search on that.
I don't see a way to do it without dynamic SQL. Get the list of all tables and their columns from sys.tables and sys.columns (don't forget to add the proper schema if you're using schemas), construct a query for each column that checks for the value you're trying to find and stores the table and column name in a temporary table, place all the queries into a (temp) table, and finally loop over that table with a cursor, executing each query.
PS. your idea of dumping everything into *.sql files should work as well, depends on the volume of data.
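The dynamic SQL approach described above could look roughly like this. It is a sketch under assumptions, not a tuned tool: the search phrase is a placeholder, only the built-in character types are checked, and on a database with ~2000 tables it will scan a lot of data, so run it against a development copy.

```sql
-- Placeholder search phrase; replace with the value you are looking for.
DECLARE @needle  nvarchar(200)  = N'search phrase';
DECLARE @pattern nvarchar(202)  = N'%' + @needle + N'%';

-- Collects every schema/table/column where the phrase was found.
CREATE TABLE #hits (schema_name sysname, table_name sysname, column_name sysname);

DECLARE @schema sysname, @table sysname, @column sysname, @sql nvarchar(max);

-- One row per character column in every user table.
DECLARE col_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT s.name, t.name, c.name
    FROM sys.tables  AS t
    JOIN sys.schemas AS s  ON s.schema_id = t.schema_id
    JOIN sys.columns AS c  ON c.object_id = t.object_id
    JOIN sys.types   AS ty ON ty.user_type_id = c.user_type_id
    WHERE ty.name IN (N'char', N'varchar', N'nchar', N'nvarchar');

OPEN col_cursor;
FETCH NEXT FROM col_cursor INTO @schema, @table, @column;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME guards against odd characters in object names.
    SET @sql = N'IF EXISTS (SELECT 1 FROM ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table)
             + N' WHERE ' + QUOTENAME(@column) + N' LIKE @pattern) '
             + N'INSERT INTO #hits VALUES (@s, @t, @c);';

    EXEC sp_executesql @sql,
         N'@pattern nvarchar(202), @s sysname, @t sysname, @c sysname',
         @pattern = @pattern, @s = @schema, @t = @table, @c = @column;

    FETCH NEXT FROM col_cursor INTO @schema, @table, @column;
END

CLOSE col_cursor;
DEALLOCATE col_cursor;

SELECT schema_name, table_name, column_name FROM #hits;
DROP TABLE #hits;
```

The search value is passed as a parameter to sp_executesql rather than concatenated into the statement, so only object names (wrapped in QUOTENAME) end up in the generated SQL text.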

Quickest way to restore a record from a SQL Server 2008 MDF file

I was wondering what the best approach would be to restoring a single record from an MDF file (generated as backup on the live instance) into the live SQL Server database.
I know about the process of attaching the file to the database and have read quite a bit about completely restoring, but how about selecting a single record from one of the tables and inserting it back into the same table on the live instance?
I could always create the new record from scratch myself based on the resulting row from the select statement, but I am sure that there has got to be a smarter and cleaner approach to such a simple task.
Thanks a bunch in advance, looking forward to your answers.
Cheers.
You cannot simply read a record out of an MDF file, you need to attach it or restore it to a database.
Natively, you can't. However, Red Gate has a product called Virtual Restore that allows you to mount a database from a backup.
Is this for right now or for future planning? If the latter, then you can utilize database snapshots.
Depending on what kind of flexibility you have on the live server, you could always just attach the backup database under a different name on the live server (or on another, linked server) and then select the record you want straight into the equivalent table in the live database.
How viable this is depends entirely on the primary key. If it is an auto-generated identity column, inserting the row will give it a different primary key, which may have undesirable results for any linked records you also want to add; the new primary key would have to be taken into account.
Example of query
insert into originaldb.dbo.Persons
select * from backupdb.dbo.Persons where PersonId = '654G'
originaldb.dbo.Persons is the original table that you want to insert into.
backupdb.dbo.Persons is your restored backup table.
You'll need to modify this query a little if you are not selecting the entire row but that is the gist of it.
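If the primary key is an identity column and you want the restored row to keep its original key, a variation along these lines may help. It is only a sketch: the column names `FirstName` and `LastName` and the integer key value are hypothetical, and SET IDENTITY_INSERT raises an error if the table has no identity column at all.

```sql
-- Assumption: PersonId is an int IDENTITY column here; FirstName and
-- LastName are hypothetical column names used only for illustration.
BEGIN TRANSACTION;

SET IDENTITY_INSERT originaldb.dbo.Persons ON;

-- Explicit column list is required while IDENTITY_INSERT is ON.
INSERT INTO originaldb.dbo.Persons (PersonId, FirstName, LastName)
SELECT b.PersonId, b.FirstName, b.LastName
FROM backupdb.dbo.Persons AS b
WHERE b.PersonId = 654
  -- Skip the insert if the live table already has a row with this key.
  AND NOT EXISTS (SELECT 1
                  FROM originaldb.dbo.Persons AS o
                  WHERE o.PersonId = b.PersonId);

SET IDENTITY_INSERT originaldb.dbo.Persons OFF;

COMMIT TRANSACTION;
```

Keeping the original key means any child rows you restore afterwards can reference it unchanged.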

Sql Server 2008 Replicate Synonym?

I plan on updating some table names by creating a synonym of the old name and renaming the table to what I want it to be. Can replication properly reference a synonym?
Also, as a side question, is there an easy way to see if a specific table is actually being replicated? (via a query perhaps)
I don't think so. Replication works by reading the log, and there are no log records generated for a synonym. As to your question about finding out which tables are replicated, a query on the sysarticles table in the published database should get you where you want to go. HTH.
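A query on sysarticles might look like the sketch below. It assumes you run it in the published database (the table only exists there once a snapshot or transactional publication has been set up), and `MyTable` is a hypothetical table name.

```sql
-- Run in the published database. Each row in sysarticles is one article,
-- i.e. one object that is part of a publication.
SELECT a.name                AS article_name,
       OBJECT_NAME(a.objid)  AS source_object,
       a.dest_table          AS destination_table
FROM dbo.sysarticles AS a
WHERE OBJECT_NAME(a.objid) = N'MyTable';  -- hypothetical table name
```

If the query returns no rows, the table is not published as an article in that database.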
