How to store hot and cold data with Azure SQL - sql-server

I have a huge order table in Azure SQL. I have one boolean field "IsOrderActive" to separate hot and cold orders. Is it possible to automatically transfer cold data to a separate database with Azure SQL?

One way to accomplish this is to split the order table in two with T-SQL commands, then transfer the table holding the cold data to a different database (on a different server) using SSMS.
These are the steps I used to reproduce it.
Create a table
create table hotcoldtable (orderID int, IsOrderActive char(3))
Inserted demo data into the table
insert into hotcoldtable
values (1,'yes')
,(2,'no')
,(3,'yes')
,(4,'yes')
,(5,'no')
,(6,'no')
,(7,'yes')
Divide the table into cold and hot data tables using the commands below
cold data table - select OrderID, IsOrderActive into coldtable from hotcoldtable where IsOrderActive = 'no'
hot data table - select OrderID, IsOrderActive into hottable from hotcoldtable where IsOrderActive = 'yes'
You can see two new tables in your database.
In SQL Server Management Studio (SSMS), log in to your Azure SQL server. Fill in the connection details and click Connect.
Right-click the name of the database that holds your order tables and click Generate Scripts...
Choose "Select specific database objects" and tick the objects you want to script.
Set the below settings.
Review the details and click on Next. This will generate your script.
Go to the location where your script got saved. Open the file in any editor and copy the script.
Now in the Azure Portal, go to the database where you want to transfer the cold data table. Go to the Query Editor and paste the copied script into the editor pane. Run the script and the tables will be created in this database.
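The split step above can be sketched in a runnable form. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Azure SQL, so "SELECT ... INTO" becomes "CREATE TABLE ... AS SELECT", and the function names are made up for the example. The table names and demo rows come from the answer.

```python
import sqlite3


def split_hot_cold(conn):
    """Copy active orders into hottable and inactive orders into coldtable."""
    # SQLite has no "SELECT ... INTO new_table"; CREATE TABLE ... AS SELECT
    # plays the same role in this sketch.
    conn.execute(
        "CREATE TABLE hottable AS "
        "SELECT orderID, IsOrderActive FROM hotcoldtable "
        "WHERE IsOrderActive = 'yes'"
    )
    conn.execute(
        "CREATE TABLE coldtable AS "
        "SELECT orderID, IsOrderActive FROM hotcoldtable "
        "WHERE IsOrderActive = 'no'"
    )
    conn.commit()


def demo_split():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hotcoldtable (orderID INTEGER, IsOrderActive TEXT)")
    rows = [(1, "yes"), (2, "no"), (3, "yes"), (4, "yes"),
            (5, "no"), (6, "no"), (7, "yes")]
    conn.executemany("INSERT INTO hotcoldtable VALUES (?, ?)", rows)
    split_hot_cold(conn)
    hot = [r[0] for r in conn.execute("SELECT orderID FROM hottable ORDER BY orderID")]
    cold = [r[0] for r in conn.execute("SELECT orderID FROM coldtable ORDER BY orderID")]
    return hot, cold
```

Note that the two target tables must have different names; writing both selects into coldtable would fail on the second statement because the table already exists.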

Are you referring to SQL Server Stretch Database to Azure? Check this out https://www.mssqltips.com/sqlservertip/5526/how-to-setup-and-use-a-sql-server-stretch-database

If you are interested in saving space by archiving the cold data, you can use two separate tables in the same or different databases. The important point is to use a columnstore index on the archive (cold) table. Depending on your data, you should be able to achieve 30%-60% data compression.
This can't be done without running some queries, but it can be automated using Azure workbooks.
I built a similar kind of functionality that helped me save 58% space in Azure SQL database.
Please comment if this is something you feel might help. I can share more details about this.
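The "queries" mentioned in this answer are not shown, so the following is only a guess at their general shape: copy the inactive rows into an archive table and delete them from the hot table inside one transaction, so that a scheduled job can be rerun safely. It uses Python's sqlite3 as a stand-in; the table names orders and orders_archive are hypothetical, and the columnstore index that the answer recommends has no SQLite equivalent and is left out.

```python
import sqlite3


def archive_cold_orders(conn):
    """Move inactive orders into orders_archive; return how many rows moved."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO orders_archive (orderID, IsOrderActive) "
            "SELECT orderID, IsOrderActive FROM orders WHERE IsOrderActive = 0"
        )
        moved = conn.execute(
            "DELETE FROM orders WHERE IsOrderActive = 0"
        ).rowcount
    return moved


def demo_archive():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (orderID INTEGER PRIMARY KEY, IsOrderActive INTEGER)")
    conn.execute("CREATE TABLE orders_archive (orderID INTEGER PRIMARY KEY, IsOrderActive INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 1), (2, 0), (3, 1), (4, 1), (5, 0), (6, 0), (7, 1)],
    )
    first = archive_cold_orders(conn)
    second = archive_cold_orders(conn)  # nothing left to move on a rerun
    hot = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    cold = conn.execute("SELECT COUNT(*) FROM orders_archive").fetchone()[0]
    return first, second, hot, cold
```

Keeping the copy and the delete in a single transaction means a failure halfway through leaves both tables as they were.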

Database sharding seems like a possible solution here. Cold orders can be put on Azure SQL serverless databases, which have auto-pause and auto-resume capabilities, so while they are not in use you pay only for the storage used. Azure SQL Database provides a good number of tools to support sharding.

Related

SQL Server table daily sync of records from table A to table B

I want to create a daily process where I reload all rows from table A into table B. Over time table A rows will change due to changes in source system and also because of aging/deletion of records in the origin table. Table A gets truncated/reloaded daily in step 1. Table B is the master table that just gets new/updated rows.
From a historical point of view, I want to keep track of ALL the rows in table B and be able to do a point in time comparison for analytics purposes.
So I need to do two things: insert rows daily from table A into table B if they don't exist, and also create a new record in table B if the record already exists but ANY of the columns have changed. At one point I attempted to use temporal tables, but I had too many false positives on 'real' changes; certain columns were throwing things off because a date/time column was updated (the only real change in the row).
I'm using an Azure SQL Server Managed Instance database (Microsoft SQL Azure (RTM) - 12.0.2000.8).
At my disposal I have SSMS, SQL Server and also Azure Data Factory.
Any suggestions on the best way to do this or tools to help with this?
There are two concepts, and you can implement either one:
Temporal table
Change Data Capture (CDC)
CDC is the more commonly used approach. With it you can create an Azure Data Factory pipeline that loads delta data, based on change data capture (CDC) information in the source Azure SQL Managed Instance database, into Azure Blob Storage.
To implement CDC, you can follow this simple Microsoft tutorial: Incrementally load data from Azure SQL Managed Instance to Azure Storage using change data capture (CDC)
Note: You also need to create a storage account, which is required but not covered in the tutorial above.
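Neither option above addresses the false positives the question describes, where only a date/time column changes. One different technique, not taken from the answer, is to hash just the columns that count as a real change and insert a new version into table B only when that hash differs from the latest stored one. The sketch below uses Python's sqlite3 and hashlib as a stand-in; every table and column name in it is hypothetical. In T-SQL a similar hash could likely be built with HASHBYTES over the concatenated columns.

```python
import hashlib
import sqlite3

# Hypothetical "real" columns; the noisy touched_at column is deliberately excluded.
TRACKED = ("name", "status")


def row_hash(row):
    # "\x00" marks NULL and "\x1f" separates fields so values cannot run together.
    parts = ["\x00" if row[c] is None else str(row[c]) for c in TRACKED]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def sync_a_to_b(conn, load_date):
    """Insert a new version row into table_b for each new or changed table_a row."""
    conn.row_factory = sqlite3.Row
    inserted = 0
    source_rows = conn.execute(
        "SELECT id, name, status, touched_at FROM table_a"
    ).fetchall()
    with conn:
        for row in source_rows:
            h = row_hash(row)
            latest = conn.execute(
                "SELECT row_hash FROM table_b WHERE id = ? "
                "ORDER BY version_id DESC LIMIT 1",
                (row["id"],),
            ).fetchone()
            if latest is None or latest["row_hash"] != h:
                conn.execute(
                    "INSERT INTO table_b (id, name, status, touched_at, row_hash, loaded_on) "
                    "VALUES (?, ?, ?, ?, ?, ?)",
                    (row["id"], row["name"], row["status"], row["touched_at"], h, load_date),
                )
                inserted += 1
    return inserted


def demo_sync():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (id INTEGER, name TEXT, status TEXT, touched_at TEXT)")
    conn.execute(
        "CREATE TABLE table_b (version_id INTEGER PRIMARY KEY AUTOINCREMENT, id INTEGER, "
        "name TEXT, status TEXT, touched_at TEXT, row_hash TEXT, loaded_on TEXT)"
    )
    conn.executemany("INSERT INTO table_a VALUES (?, ?, ?, ?)",
                     [(1, "a", "open", "t1"), (2, "b", "open", "t1")])
    day1 = sync_a_to_b(conn, "2024-01-01")
    conn.execute("DELETE FROM table_a")  # daily truncate and reload of table A
    conn.executemany("INSERT INTO table_a VALUES (?, ?, ?, ?)",
                     [(1, "a", "open", "t2"), (2, "b", "closed", "t2"), (3, "c", "open", "t2")])
    day2 = sync_a_to_b(conn, "2024-01-02")
    total = conn.execute("SELECT COUNT(*) FROM table_b").fetchone()[0]
    return day1, day2, total
```

In the demo, row 1 changes only its timestamp on day two and therefore produces no new version, while row 2 (status change) and row 3 (new) each add one.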

Azure sql cross database trigger

I have two databases on the same Azure SQL server. I want the two databases to interact with each other using a trigger, i.e. if any record is inserted into the Customer table of the first database, the trigger fires and a record is inserted into the other database.
We had / have the same problem with triggers that we use for insert-update-delete where we write a record to Database-1 that has the primary table, but also updates Database-2 where we hold "archive" versions of the tables.
The only solution we have identified and are testing is to bring all of the tables into a single database and separate the different tables under separate database schemas in the one database.
Analysis so far of this approach looks promising.
I think what you're trying to do is not allowed in Sql Azure. From my expertise what you are trying to do is a bad practice on-premise as well (think backups-restore and availability issue scenarios).
You should move the dependency in the application and have the application update both databases, as appropriate.
Anyway, if you want to continue with this approach please take a look over Elastic Query feature: https://learn.microsoft.com/en-in/azure/sql-database/sql-database-elastic-query-overview
Please let me know if I can help with something

SSIS Cross-DB "WHERE IN" Clause (or Equivalent) in Azure

I'm currently trying to build a data flow in SSIS to select all records from a mapping table where an ID column exists in the related Item table. There are two complications:
The two tables are currently in different databases on different servers.
The databases are in Azure, for which I've read Linked Servers are not supported.
To be clearer, the job is to migrate data from the Staging environment to Production. I only want to push lookup records into prod if the associated Item IDs are in there. Here's some pseudo-TSQL to give a clear goal of what I'm trying to achieve:
SELECT *
FROM [Staging_Server].[SourceDB].[dbo].[Lookup] L
WHERE L.[ID] IN (
SELECT P.[Item]
FROM [Production_Server].[TargetDB].[dbo].[Item] P
)
I haven't found a good way to create this in SSIS. I think I've created a work-around that involves sorting both tables and performing a merge join, but sorting both sides is an unnecessary hit on performance. I'm looking for a more direct and intuitive design for this seemingly simple data flow.
Doing this in a data flow, you'd have your Source query, sans filter, fed into a Lookup Component which is the subquery.
The challenge with this is SSIS is likely on-premises so that means you are going to pull all of your data out of Stage Azure to the server running SSIS and push it back to the Prod Azure instance.
That's a lot of network activity and as I'm reading the Azure pricing guide, I guess as long as you have the appropriate DTUs, you'd be fine. Back in the day, you were charged for Reads and not Writes so the idiom was to just push all your data to target server and then do the comparison there, much as ElendaDBA mentions. Only suggestion I'd make on the implementation is to avoid temporary tables or ad-hoc creation/destruction of them. Just implement as a physical table and truncate and reload prior to transmission to production.
You could create a temp table on the staging server and copy the production data into it. Then you could write a query joining those two tables. After the SSIS package runs, you could delete the temp table on the staging server.
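The idea shared by both answers (bring the IDs from one server next to the other table, then filter locally) can be sketched outside SSIS. This is an illustrative stand-in that uses two in-memory sqlite3 databases for the two Azure servers; the helper table ProdItemIds and the Label column are hypothetical. Following the first answer's advice, it uses a permanent table that is emptied and reloaded on each run rather than a temp table.

```python
import sqlite3


def lookup_rows_for_production(staging, production):
    """Return staging Lookup rows whose ID exists as an Item in production."""
    # Pull only the key column across the "network" to keep the transfer small.
    item_ids = [(r[0],) for r in production.execute("SELECT Item FROM Item")]
    with staging:
        staging.execute(
            "CREATE TABLE IF NOT EXISTS ProdItemIds (Item INTEGER PRIMARY KEY)"
        )
        staging.execute("DELETE FROM ProdItemIds")  # truncate and reload
        staging.executemany("INSERT INTO ProdItemIds (Item) VALUES (?)", item_ids)
    return staging.execute(
        "SELECT L.ID, L.Label FROM Lookup AS L "
        "WHERE EXISTS (SELECT 1 FROM ProdItemIds AS P WHERE P.Item = L.ID) "
        "ORDER BY L.ID"
    ).fetchall()


def demo_lookup():
    production = sqlite3.connect(":memory:")
    production.execute("CREATE TABLE Item (Item INTEGER PRIMARY KEY)")
    production.executemany("INSERT INTO Item VALUES (?)", [(1,), (3,)])

    staging = sqlite3.connect(":memory:")
    staging.execute("CREATE TABLE Lookup (ID INTEGER PRIMARY KEY, Label TEXT)")
    staging.executemany("INSERT INTO Lookup VALUES (?, ?)",
                        [(1, "a"), (2, "b"), (3, "c")])
    return lookup_rows_for_production(staging, production)
```

The filter then runs entirely on one side, which avoids sorting both inputs for a merge join.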

How to Copy/Consolidate data from different tables hosted on different MS SQL Servers and save them into one Table on another MS SQL Server

I am a newbie in SQL so please bear with me. I am hoping you can help/guide me. I have a table on 5 MS SQL Servers that have identical Columns and I want to consolidate the data into a separate table/separate MS SQL Server.
The challenge is that I only have "Read Only Permission" on the source tables (5 MS SQL Servers), but I have permission to create a table in the destination MS SQL Server DB.
Another challenge is that I want to truncate or extract parts of the text in one column of the source table and save them into different columns in the destination table.
The next challenge is for the destination table to query the source tables once a day for any updates.
See screenshot by clicking either of the URL.
Screenshot URL1
Screenshot URL2
Appreciate it very much if you can help/guide me. Many thanks in advance.
You'll need to set up a linked server and use either an SSIS package to pull the data into the form you need, or OPENROWSET/OPENQUERY queries with an insert on the server where you do have write privileges.
Either pre-create a table to put the new data in or, if that is not needed, build up a temporary table or insert the data into a table variable.
To concatenate fields into a new field, use something like the examples below:
SELECT (field1 + field2) as Newfield
or
SELECT (SUBSTRING(field1, 2,2) + SUBSTRING(field2, 3,1)) as Newfield
Finally, you should set all of this up as an Agent job scheduled to your needs.
Apologies if this is not as detailed as you like, but it seems there are many questions to be answered and not enough detail to help further.
Alternatively you could also do a lookup upon lookup (USING SSIS):
data flow task > download first table completely to destination server
JOIN TO
data flow task > reading from destination server, do a lookup to the 2nd origin server (if match you might update, if not, insert)
repeat until all 5 of them are done.
This is NOT the most elegant or efficient solution, but it will definitely get the work done.
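The consolidation described in this thread (read from each source, split one text column into several, write to a single destination table) can be sketched as follows. This is a stand-in using in-memory sqlite3 databases instead of linked SQL Servers; the table names, the column names and the fixed three-character split are all hypothetical. Python slicing plays the role of SUBSTRING here.

```python
import sqlite3


def consolidate(sources, dest):
    """Reload dest.consolidated from every source; return the resulting row count.

    sources maps a server label to a connection that is only ever read from.
    """
    with dest:
        dest.execute("DELETE FROM consolidated")  # full daily reload
        for label, src in sources.items():
            rows = src.execute("SELECT id, code FROM source_table").fetchall()
            dest.executemany(
                "INSERT INTO consolidated (source_server, id, prefix, suffix) "
                "VALUES (?, ?, ?, ?)",
                # code[:3] corresponds to SUBSTRING(code, 1, 3); code[3:] is the rest.
                [(label, row_id, code[:3], code[3:]) for row_id, code in rows],
            )
    return dest.execute("SELECT COUNT(*) FROM consolidated").fetchone()[0]


def demo_consolidate():
    sources = {}
    for label, rows in (("server1", [(1, "ABC123")]),
                        ("server2", [(1, "XYZ9"), (2, "DEF42")])):
        src = sqlite3.connect(":memory:")
        src.execute("CREATE TABLE source_table (id INTEGER, code TEXT)")
        src.executemany("INSERT INTO source_table VALUES (?, ?)", rows)
        src.commit()
        sources[label] = src

    dest = sqlite3.connect(":memory:")
    dest.execute(
        "CREATE TABLE consolidated (source_server TEXT, id INTEGER, prefix TEXT, suffix TEXT)"
    )
    count = consolidate(sources, dest)
    rows = dest.execute(
        "SELECT source_server, id, prefix, suffix FROM consolidated "
        "ORDER BY source_server, id"
    ).fetchall()
    return count, rows
```

Because the destination is emptied and refilled inside one transaction, running the job once a day never leaves a half-loaded table behind.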

Copy Database Data from Many DBs to One. Data Replication (sort of)

This involves data replication, kind of:
We have many sites with SQL Express installed, there is an 'audit' database on each site that has one table in 1st normal form (to make life simple :)
Now I need to get this table from each site, and copy the contents (say, with a Date Time Value > 1/1/200 00:00, but this will change obviously) and copy it to a big 'super table' in sql server proper, that also has the primary key as the Site Name (That needs injecting in) and the current primary key from the SQL Express table)
e.g. Many SQL Express DBs with the following table columns
ID, Definition Name, Definition Type, DateTime, Success, NvarChar1, NvarChar2 etc etc etc
And the big super table needs to have:
SiteName, ID, Definition Name, Definition Type, DateTime, Success, NvarChar1, NvarChar2 etc etc etc
Where SiteName and ID together form the primary key.
Is there a Microsoft (or non-MS, I suppose) app/tool/thing to manage copying all this data across already, or do we need to write our own?
Many thanks.
You can use SSIS (which comes with SQL Server) to populate it; it can be set up with variables to change the connection string to the various databases. I have one that loops through the whole list and does the same process using three different files from three different vendors. You could do something similar to loop through the different site databases. Put the whole list of databases you want to copy the audit data from in a table and loop through it, changing the connection string each time.
However, why on earth would you want one mega audit table per site? If every table in the database populates the audit table as changes happen, then the audit table eventually becomes a huge problem for performance. Every insert, update and delete has to hit this table and then you are proposing to add an export on top of that. This seems to me to be a guaranteed structure for locking and deadlocks and all sorts of nastiness. Do yourself a favor and limit each audit table to the table it is auditing.
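The loop described in this answer (walk a list of site databases, switch connection each time, inject the site name, copy rows newer than a cutoff) can be sketched outside SSIS. This is only an illustration using in-memory sqlite3 databases in place of the SQL Express sites; the table names audit and super_audit and the reduced column list are hypothetical. The composite primary key (SiteName, ID) together with INSERT OR IGNORE makes reruns harmless.

```python
import sqlite3


def pull_site(site_name, site_conn, central, since):
    """Copy audit rows newer than `since` from one site; return rows actually added."""
    rows = site_conn.execute(
        "SELECT ID, DefinitionName, DateTime FROM audit WHERE DateTime > ?",
        (since,),
    ).fetchall()
    before = central.execute("SELECT COUNT(*) FROM super_audit").fetchone()[0]
    with central:
        central.executemany(
            # Rows already copied collide on (SiteName, ID) and are skipped.
            "INSERT OR IGNORE INTO super_audit (SiteName, ID, DefinitionName, DateTime) "
            "VALUES (?, ?, ?, ?)",
            [(site_name, *row) for row in rows],
        )
    after = central.execute("SELECT COUNT(*) FROM super_audit").fetchone()[0]
    return after - before


def demo_pull():
    site_rows = {
        "SiteA": [(1, "login", "2024-01-01 10:00:00"), (2, "backup", "2024-01-02 10:00:00")],
        "SiteB": [(1, "login", "2024-01-02 09:00:00")],
    }
    sites = {}
    for name, rows in site_rows.items():
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE audit (ID INTEGER PRIMARY KEY, DefinitionName TEXT, DateTime TEXT)")
        conn.executemany("INSERT INTO audit VALUES (?, ?, ?)", rows)
        conn.commit()
        sites[name] = conn

    central = sqlite3.connect(":memory:")
    central.execute(
        "CREATE TABLE super_audit (SiteName TEXT, ID INTEGER, DefinitionName TEXT, "
        "DateTime TEXT, PRIMARY KEY (SiteName, ID))"
    )
    since = "2024-01-01 00:00:00"
    first = [pull_site(n, c, central, since) for n, c in sites.items()]
    second = [pull_site(n, c, central, since) for n, c in sites.items()]
    total = central.execute("SELECT COUNT(*) FROM super_audit").fetchone()[0]
    return first, second, total
```

Both sites use ID 1 in the demo, which is why the site name has to be part of the key in the central table.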
Things to consider:
Linked servers and sp_msforeachdb as part of a do-it-yourself solution.
SQL Server Replication (by Microsoft) (which I believe can pull data from SQL Server Express)
SQL Server Integration Services which can pull data from SQL Server Express instances.
Personally, I would investigate Integration Services first.
Good luck.
You could do this with SymmetricDS. SymmetricDS is open source, web-enabled, database independent, data synchronization/replication software. It uses web and database technologies to replicate tables between relational databases in near real time. The software was designed to scale for a large number of databases, work across low-bandwidth connections, and withstand periods of network outage.
As of right now, however, you would need to implement a custom IDataLoaderFilter extension point (in Java) to add the extra column. The metadata would be available though because your SiteName would be the external_id.
