Tools to create and update Snowflake tables - snowflake-cloud-data-platform

Is there any tool to compare a table with a base table and generate an ALTER script? Once the alter script is generated, it should be possible to run that same script to update the base table.
Thanks in advance.

I wish there was. A lot of us are looking for that magic tool right now. As of today, there are no tools that do a true schema compare for Snowflake, at least not very well. Some tools attempt it, but they make assumptions under the hood that are sometimes not good assumptions. DBeaver does a little bit of this; DataGrip is another tool. There are a couple more out there, but again, many of these tools assume Snowflake is like all the other databases, which leads to bad scripts.
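If you only need a rough diff to drive a hand-written ALTER script, one workaround is to compare column metadata yourself. A minimal sketch, assuming both tables live in the same Snowflake database (the database, schema, and table names below are made up):
-- Columns present (or typed differently) in DEV.ORDERS but not in PROD.ORDERS
SELECT column_name, data_type
FROM my_db.information_schema.columns
WHERE table_schema = 'DEV' AND table_name = 'ORDERS'
MINUS
SELECT column_name, data_type
FROM my_db.information_schema.columns
WHERE table_schema = 'PROD' AND table_name = 'ORDERS';
Each row returned is a candidate for a hand-written statement such as ALTER TABLE prod.orders ADD COLUMN new_col VARCHAR; you still have to decide whether it represents an added column, a type change, or a rename.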

Related

Most Efficient Way to Migrate Un-Normalized Data in an Access Database to a Normalized Form in a SQL Server Database

I've been doing some research on this topic for a while now and can't seem to find a similar instance to my issue. I will try and explain everything as best I can, as simply as I can.
The problem is in the title; I am trying to migrate data from an Access database to SQL Server. Typically this isn't really a hard problem, as several import/export tools exist within SQL Server, but I am looking for the best solution. That, or some advice/tips, as I am somewhat new to database migration. I will now begin to explain my situation.
So I am currently working on migrating data that exists in an Access “database” (database in quotes because I don’t think it is actually a database, you’ll know why in a minute) in an un-normalized form. What I mean by un-normalized is that all of the data is in one table. This table has about 150+ columns and the rows number in the thousands. Yikes, I know; this is what I’ve walked into lol. Anyways, sitting down and sorting through everything, I’ve designed relationships for the data that normalize it nicely in its new home, SQL Server. Enter my predicament (or at least part of it). I have the normalized database set up to hold the data but I’m not sure how to import it, massage/cut it up, and place it in the respective tables I’ve set up.
Thus far I've done a bunch of research into what can be done, and for starters I found out about the SQL Server Migration Assistant. I've begun messing with it and was able to import the data from Access into SQL Server, but not in the way I wanted. All I got was a straight copy & paste of the data into my SQL Server database, exactly as it was in the Access database. I then learned about the typical practice of setting up a global table/staging area for this type of migration, but I am somewhat of a novice when it comes to using T-SQL. The heart of my question comes down to this: is there some feature in SQL Server (either its import/export tool or the SSMA) that will allow me to send the data to the right tables that already exist in my normalized SQL Server database? Or do I import to the staging area and write the script(s) to dissect and extract the data to the respective normalized tables? If it is the latter, can someone please show me some tips/examples of what the T-SQL would look like to do this sort of thing? Obviously I couldn't expect exact scripts from anyone without sharing the data (which I don't have the liberty to do, as it is customer data), so some cookie-cutter examples will work.
Additionally, future data is going to come into the new database from various sources (Excel, for example), so that is something to keep in mind. I would hate to create a new issue where every time someone wants to add data to the database, a new import, sort, and store script has to be written.
Hopefully this hasn't been too convoluted and someone will be willing (and able) to help me out. I would greatly appreciate any advice/tips. I believe this would help other people besides me, because I found a lot of other people searching for similar things. Additionally, it may lead to T-SQL experts showing examples of such data migration scripts, or explaining how to use the existing tools in ways others hadn't tried, or covering functions/capabilities not adequately explained in the documentation.
Thank you,
L
First this:
Additionally, future data is going to come into the new database from
various sources (like maybe excel for example)...?
That's what SSIS is for. Setting up SSIS is not a trivial task, but it's not rocket science either. SQL Server Management Studio has an Import/Export Wizard, which is an easy-to-use SSIS package creator. That will get you started. There are many alternatives, such as PowerShell, but SSIS is the quickest and easiest solution IMO, especially when dealing with data from multiple sources.
SSIS works nicely with Microsoft Products as data sources (such as Excel and Sharepoint).
For some things, too, you can create an MS Access front end that interfaces with SQL Server via SQL Server stored procedures. It just depends on the target audience. This is easy to set up; a quick Google search will return many simple examples. It's actually how I learned SQL Server 20+ years ago.
Is there some feature in SQL Server that will allow me to send the
data to the right tables that already exist in my normalized SQL
Server database?
Yes, but don't. For what you're describing, it will be frustrating.
Or do I import to the staging area and write the script(s) to dissect
and extract the data to the respective normalized table?
This.
If it is the latter, can someone please show me some tips/examples of
what the TSQL would look like to do this sort of thing.
When dealing with denormalized data, a good splitter is important. Here are my two favorites:
DelimitedSplit8K
PatternSplitCM
In SQL Server 2016 you also have STRING_SPLIT, which is faster (but has issues; for example, it does not guarantee the order of the returned items).
Another must-have is a good NGrams function. The link I posted has the function attached at the bottom of the article. I have some string-cleaning functions here.
The links I posted have some good examples.
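For completeness, here is a minimal STRING_SPLIT usage sketch (the table and column names are hypothetical); note the unguaranteed output order, one of the issues alluded to above:
SELECT s.RowId, ss.value AS Item
FROM SourceStagingTable AS s
CROSS APPLY STRING_SPLIT(s.DelimitedColumn, ';') AS ss;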
I agree with all the approaches mentioned: Load the data into one staging table (possibly using SSIS) then shred it with T-SQL (probably wrapped up in stored procedures).
This is a custom piece of work that needs hand-built scripts. There's no automated tool for this, because both your source and target schemas are custom schemas, so you'd need to define all that mapping and rules somehow... and no, SSIS does not magically do this!
It sounds like you have a target schema and mappings between source and target schemas already worked out.
As an example, your first step is to load 'lookup' tables with this kind of query:
INSERT INTO TargetLookupTable1 (Field1,Field2,Field3)
SELECT DISTINCT Field1,Field2,Field3
FROM SourceStagingTable
TargetLookupTable1 should already have an identity primary key defined (it is not mentioned in the above query because it is auto-generated).
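As a rough sketch, such a lookup table might be defined like this (all names are hypothetical):
CREATE TABLE TargetLookupTable1 (
    MyKey  INT IDENTITY(1,1) PRIMARY KEY, -- auto-generated surrogate key
    Field1 VARCHAR(100),
    Field2 VARCHAR(100),
    Field3 VARCHAR(100)
);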
This is where you will find your first problem: you'll almost certainly find that your DISTINCT query just gives you a whole lot of duplicated, misspelt, rubbish data. So before you even load your lookup table you need to do data cleansing.
I suggest you clean the data in your source system directly, but it depends how comfortable you are with that.
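As a quick illustration (names hypothetical), one way to spot raw spellings that would collapse into the same cleaned value:
-- Values that differ only in case or surrounding whitespace
SELECT UPPER(LTRIM(RTRIM(Field1))) AS CleanField1,
       COUNT(DISTINCT Field1) AS RawVariants
FROM SourceStagingTable
GROUP BY UPPER(LTRIM(RTRIM(Field1)))
HAVING COUNT(DISTINCT Field1) > 1;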
Next step: assuming your data is all clean and you've loaded a dozen lookup tables this way...
Now you need to load transactions, but you don't know the lookup keys that you just generated!
The trick is to pre-include an empty column in your staging table to record them.
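For example (column name hypothetical):
ALTER TABLE SourceStagingTable ADD MyNewLookupKey INT NULL;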
Once you've loaded your lookup table, you can write the key back into the staging table. This query matches back on the fields you used to load the lookup and writes the key into the staging table:
UPDATE TGT
SET MyNewLookupKey = NewLookupTable.MyKey
FROM SourceStagingTable AS TGT
INNER JOIN NewLookupTable
    ON  TGT.Field1 = NewLookupTable.Field1
    AND TGT.Field2 = NewLookupTable.Field2
    AND TGT.Field3 = NewLookupTable.Field3
Now you have a column called MyNewLookupKey in your staging table, which holds the correct lookup key to load into your transaction table.
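The final load is then straightforward; a sketch with made-up transaction columns:
INSERT INTO TargetTransactionTable (LookupKey1, Amount, TranDate)
SELECT MyNewLookupKey, Amount, TranDate
FROM SourceStagingTable;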
Ongoing uploads of data are a separate issue, but you might want to investigate an MS Access Data Project (although they are apparently being phased out, they are very handy as a front end into SQL Server).
The thing to remember is: if there is anything ambiguous about your data, for example, "these rows say my car is black but these rows say my car is white", then you (a human) need to come up with a rule for "disambiguating" it. It can't be done automatically.
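Once a human has defined the rule, it is usually easy to express in SQL. For instance, a hypothetical "prefer the most recently updated row per car" rule (all names made up):
SELECT CarId, Colour
FROM (
    SELECT CarId, Colour,
           ROW_NUMBER() OVER (PARTITION BY CarId ORDER BY UpdatedAt DESC) AS rn
    FROM SourceStagingTable
) AS ranked
WHERE rn = 1;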
So there are quite a number of ways to skin this cat. I don't know much about the "Migration Assistant", but I somehow doubt it's going to make your life easier given what you're trying to do.
I'd just dump the whole denormalized mess into a single big staging table then shred it where you need it using SQL. I know you asked for help with the TSQL, but without having some idea of what the denormalized data is and how you want to re-shape it, all I can do really is suggest you read up on SQL in general (select, from, where, group by, etc).
You could also do the work in SSIS, but ultimately the solution you use is largely going to depend on the nature of how you need to normalize the big denormalized data set. IMHO doing this in SQL is usually the easiest way, but then again when you're a hammer, everything looks like a nail.
As far as future proofing the process, how you import the Access data probably will have little bearing on how you'd import Excel data. If you have a whole lot of different data sources which you'll need to incorporate on a recurring basis, SSIS might be a good choice to invest some time and effort into for the long run. No matter what, incorporating data from a distinct data source takes time and effort. You'll have to do some extra work no matter what. I would weight how frequently you think you'll have to integrate a given data source, and how much effort is involved to massage it into the format you want.
I have a completely different opinion, because I do both database development and Microsoft's Power BI. On the PBI side we come across a lot of non-normalized data, because a lot of it comes in from Excel.
My guess is that what is now in Access was an import of something that originally began in Excel.
Excel Power Query and PBI offer transforms to pivot and unpivot a layout. I would use those tools for that task, then import the results into SQL.
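If you would rather stay in SQL once the data is loaded, the same reshaping can also be done with T-SQL's UNPIVOT; a sketch over a hypothetical wide table with one column per month:
SELECT AccountId, MonthName, Amount
FROM WideStagingTable
UNPIVOT (Amount FOR MonthName IN (Jan, Feb, Mar)) AS u;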

Need help understanding the difference between a script format and query

This is a very newbie question, I fear... I was wondering if there are folks on here who can tell me what the difference is between SQL scripts and SQL queries (I've been inadvertently using these terms interchangeably for too long).
I have plenty of experience executing queries (SQL Server, Oracle, Postgres), but I started working with a group that requires I submit scripts containing said SQL queries to their department for review, and they explicitly told me not to send them queries, but scripts. Can someone explain the difference for me?
If it matters, the SQL query I need to submit just joins fields from four tables together into one.
Thanks in advance if anyone can help me out with this!
In this context it sounds like they want to see any set of commands that will be changing (INSERT, UPDATE, DELETE, etc.) data and are not concerned with any queries you're running to simply return or review data.
I suspect more people use these terms interchangeably than you think.
It's a very good idea to go back to this department, unashamed, and ask for clarification, as different teams use different terms. I think you will find they're asking you to plan any data-changing action out in advance, put your commands in sequence in a file (like a .sql file, if you work in Management Studio), and forward that file to them for review.
Asking around was a great move. You cannot be too careful with these things!
A query consists of a single command. A script is just a file with a bunch of queries.
See:
http://docs.oracle.com/cd/E14373_01/user.32/e13370/sql_rep.htm
A SQL script is just a file in which you save a bunch of SQL statements (SELECT, INSERT, DELETE, UPDATE, etc.).
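To make it concrete, a script file (say, MyChange.sql) might just contain several statements run in sequence; everything below is made up for illustration:
BEGIN TRANSACTION;

UPDATE dbo.Customers SET Region = 'EMEA' WHERE Region = 'EU';
INSERT INTO dbo.AuditLog (Action, RunAt) VALUES ('Region rename', GETDATE());

COMMIT TRANSACTION;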

Easy plugin or procedure for sqlserver Management Studio to script row inserts

I've never been able to find a good script or plugin for SQL Server Management Studio (2005 and/or 2008) for a very common scripting need: specifying a few/all rows in a table and scripting their inserts. You can guess my story: I've got some configuration data in my dev DB and I need to script it for deployment to UAT and then production.
I've found a few cludgy systems in the past, that were more trouble than they were worth. I need something free and unobtrusive. Once I find it I'll share it with the other 20 developers in my shop who are annoyed by this. Aren't we all annoyed by this by the way?
What is the best, easiest, free way to specify a few/all rows in a table and get a script of their inserts?
Edit
Resolution: SSMS Tools Pack rocks! Just what I was looking for: free, unobtrusive, simple, solid. It's got a lot of other handy additions too that I look forward to exploring.
The SSMS Tools Pack can do this. Sorta.
You can use RedGate Data Compare to compare table(s) across databases. It will generate inserts for you.
Well, any time we add rows to a lookup table in dev, we do it in an insert script, which is put into source control like the rest of the project. Then the script is run as part of deployment.
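A common refinement is to make those scripts rerunnable, so a deployment can apply them safely more than once; a sketch with hypothetical names:
IF NOT EXISTS (SELECT 1 FROM dbo.OrderStatus WHERE StatusCode = 'SHIP')
    INSERT INTO dbo.OrderStatus (StatusCode, Description)
    VALUES ('SHIP', 'Shipped');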

Need good scheme/workflow for managing database objects using Subversion

How do you track/manage your stored procedures, views, and functions in SQL Server?
I'd like to use Subversion, but it looks like I would have to just save & commit the CREATE/ALTER statements. That might work okay for me, but I suspect I'd end up doing a lot of nagging.
Is anyone using versioning with their databases? Is there a better way?
In the past, people have just commented out parts of the code and left it in. Or, they add little "added on 2/31/2010" comments all over. It drives me nuts, because I know there is a better way.
We do log changes in the object's header, but that's pretty limited. It would make my life easier to be able to diff versions.
Additional Info
We are using SQL Server 2005. I have Subversion (via VisualSVN Server) and TortoiseSVN installed, but I'm open to other suggestions.
By database objects, I specifically mean stored procedures, views, and functions.
There are only a few tables I would need to track. The database is the backend for a commercial application, and we mostly pull information out for reporting.
I found a related question about stored procedure versioning
We script everything and put it into Subversion. Nothing can be loaded to Prod without a script (developers do not have rights to prod) and the people with rights on prod only accept scripts they loaded from Subversion.
We revision our database: schema creation, DW, ETL, and stored procedures, just like any other piece of code, because it's code!
I have also seen people type dates in headers, etc. This is normally due to them completely missing the point of revision control.
Have a look at liquibase, here
It manages your SQL changes/scripts for you and can apply them in conjunction with SVN via hooks or scripts. It makes doing all sorts of setup easy and helps eliminate the case of the missing trigger/sproc/etc.
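Liquibase can also consume plain SQL changelogs, which keeps everything in one language; a minimal sketch (the author, id, and table name are made up):
--liquibase formatted sql
--changeset jsmith:1
CREATE TABLE dbo.Example (
    Id   INT PRIMARY KEY,
    Name VARCHAR(50)
);
--rollback DROP TABLE dbo.Example;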
I'm not sure what you all mean by "database objects". Is that only the tables, views, procedures, etc., or also data? I mean daily created data?
Assuming you mean the database schema definition: in my experience there is only one way to handle database schema definitions (if you don't have NHibernate or some similar tool). You write SQL scripts that create your database from scratch and check them in. You use the same scripts for installation of your software. You see the differences by just comparing the script files.
Whenever I've gone through this exercise, it's come down to three main things that need to be source-controlled:
1. Stored procedures / views / triggers (more or less anything that can fairly be expressed as "code"). These are fairly simple: include a conditional drop and create at the top of the file (see the sketch after this list).
2. Table schema: DROP/CREATE statements as above. You can try to get fancy with ALTER statements, but it tends to get really messy. The biggest challenge we faced was that this forces you into a system where your DB goes back to an initial state often; if there's a fair amount of work involved in bringing DBs to something usable/testable, it can be a pain. In that case we kept a library of scripts that brought a DB to various usable states, and source-controlled those as well.
3. Data within tables. We looked at a couple of approaches here: either a series of INSERT statements stored in a file like "TableName_Data.sql", or a CSV file with custom build tooling that parsed and inserted it when the DB was rebuilt. Ultimately we went with the INSERT statements for simplicity's sake.
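The conditional drop-and-create header mentioned in item 1 typically looks something like this (the object and column names are hypothetical):
IF OBJECT_ID('dbo.usp_GetOrders', 'P') IS NOT NULL
    DROP PROCEDURE dbo.usp_GetOrders;
GO
CREATE PROCEDURE dbo.usp_GetOrders
AS
BEGIN
    SELECT OrderId, OrderDate FROM dbo.Orders;
END
GO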

How to keep code base and database schema in synch?

So recently on a project I'm working on, we've been struggling to keep the solution's code base and the associated database schema in synch (database = SQL Server 2008).
Database changes occur fairly regularly (adding columns, constraints, relationships, etc.), and as a result it's not uncommon for people to do a 'Get Latest' from source control and find that they also need to rebuild the database (and sometimes they forget to do the latter).
We're not using VSTS: Database Edition (DataDude) but the standard Visual Studio database project, with a script (batch file) that tears down and recreates the database from T-SQL scripts. The solution is a .NET & ASP.NET solution with LINQ to SQL as the underlying ORM.
Anyone have ideas on an approach to take (automated or not) which would keep everyone up to date with the latest database schema?
Continuous integration with MSBuild is an option, but only helps pick up any breaking changes committed, it doesn't really help in the scenario I highlighted above.
We are using Team Foundation Server, if that helps..
We try to work forward from the creation scripts.
I.e., a change to the database is not authorised unless the script has been tested and checked into source control.
But this assumes that the database team is integrated with your app team, which is usually not the case in a large project...
(I was tempted to answer this "with great difficulty")
EDIT: Tools won't help you if your process isn't right.
OK, although it's not the entire solution: you should include an assertion in the application code that connects to the database, asserting that the correct schema is being used. That way it at least becomes obvious, and you avoid silent bugs and people complaining that stuff went crazy all of a sudden.
As for the schema version, you could use some database-specific functionality if available, but I personally prefer to declare a schema version table and keep the version number in there; that way it's portable and can be checked with a simple SELECT statement.
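A minimal sketch of that table and the startup check (all names are illustrative):
CREATE TABLE dbo.SchemaVersion (
    VersionNumber INT NOT NULL,
    AppliedAt     DATETIME NOT NULL DEFAULT GETDATE()
);

-- The application asserts at startup:
SELECT MAX(VersionNumber) AS CurrentVersion FROM dbo.SchemaVersion;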
Have a look at DB Ghost - you can create a dbp using the scripter in seconds and then manage all your database code with the change manager. www.dbghost.com
This is exactly what DB Ghost was designed to handle.
We basically do things the way you are, with the generation script checked into source control as well. I'm the designated database master, so all changes to the script itself go through me. People send me scripts of the changes they have made, I update my master copy of the schema, run Generate Scripts in SSMS to produce the new DB script, and then check it in. I keep my copy of the code current with any changes being made elsewhere. We're a small shop, so this works pretty well for us; I realize that it probably doesn't scale.
If you are not using Visual Studio Database Professional Edition, then you will need another tool that can break the database down into its elemental pieces so that they are manageable and changeable in an easier manner.
I'd recommend seriously considering Redgate's SQL tools if you want to maintain sanity over all your database changes and updates.
SQL Packager
SQL Multi Script
SQL Refactor
Use a tool like RedGate SQL Compare to generate the change schema between any given versions of the database. You can then check that file into source code control.
Have a look at this question: dynamic patching of databases. I think it's similar enough to your problem to be helpful.
My solution to this problem is simple. Define everything as XML, and make sure that both the database, the ORM and the UI are generated from this XML, no exceptions. That way, you can use code generation tools to quickly regenerate the database creation script, which will alter your schema while (hopefully) preserving some data. It takes some effort to do, but the net result is well worth it.
