Is there any query which can return me the number of revisions made to the structure of a database table?
Secondly, how can I determine the number of pages (in terms of size) present in mdf or ldf files?
I think you need to create a trigger and store all changes to the table in a separate table. You can then use this table to get the revision history.
You can get the creation date and last-modified date of an object in SQL Server; sys.objects exposes them as the create_date and modify_date columns.
For example, for tables:
SELECT * FROM sys.objects WHERE type='U'
More info on msdn
The number of pages can be read from sys.database_files; its size column reports each file's current size in 8-KB pages.
Check documentation
SQL Server doesn't keep a history of schema changes, so it can't tell you this.
The only way you may be able to do this is if you have a copy of all the scripts applied to the database.
To capture this information in the future, look at DDL triggers (SQL Server 2005 and later), which let you record changes.
Problem: I accidentally overwrote a view in Snowflake using CREATE OR REPLACE VIEW.
Question: Is there any way to retrieve the old view, i.e. its SQL code?
You can use QUERY_HISTORY to find the previous DDL used to create the view.
You can filter the results on QUERY_TYPE, which will help you find the right statement quickly.
If you can't find it in the query history tab using QUERY_TYPE > CREATE, you can search for it over the previous 365 days in the query history. This previous post has the SQL to run:
View DDL history of CREATE VIEW statement in Snowflake
Note that this is a big query if your account has run lots of queries over the last year. If you know more, such as the month of creation, you can modify it to reduce the scan.
If you want a totally informal and lightweight way to version your objects, I wrote one for my internal use that I decided to share. It's a table and stored procedure. Call the stored procedure with the object type and three-part name, and it adds a version row to the table. If it finds a previous version, it marks the old one obsolete as of the current_timestamp and gives the new version the old number plus 1.
https://github.com/GregPavlik/SimpleVersioning/blob/main/install.sql
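To make the idea concrete, here is a rough sketch of the same bookkeeping. It is not the code from the linked repository: it uses Python with an in-memory SQLite database instead of a Snowflake table and stored procedure, the caller passes the DDL text in, and the names object_versions, install and add_version are made up for illustration.

```python
import sqlite3

def install(conn):
    # Hypothetical version table; column names are assumptions, not the repo's.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS object_versions (
            object_type TEXT NOT NULL,
            object_name TEXT NOT NULL,   -- three-part name, e.g. db.schema.view
            version     INTEGER NOT NULL,
            ddl         TEXT NOT NULL,
            created_at  TEXT DEFAULT CURRENT_TIMESTAMP,
            obsolete_at TEXT             -- NULL while this is the current version
        )
        """
    )

def add_version(conn, object_type, object_name, ddl):
    """Store a new version row and mark the previous one obsolete."""
    key = (object_type, object_name)
    with conn:  # one transaction for the three statements
        previous = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM object_versions "
            "WHERE object_type = ? AND object_name = ?",
            key,
        ).fetchone()[0]
        conn.execute(
            "UPDATE object_versions SET obsolete_at = CURRENT_TIMESTAMP "
            "WHERE object_type = ? AND object_name = ? AND obsolete_at IS NULL",
            key,
        )
        conn.execute(
            "INSERT INTO object_versions (object_type, object_name, version, ddl) "
            "VALUES (?, ?, ?, ?)",
            (object_type, object_name, previous + 1, ddl),
        )
    return previous + 1

conn = sqlite3.connect(":memory:")
install(conn)
v1 = add_version(conn, "VIEW", "db.public.v_sales", "CREATE VIEW v_sales AS SELECT 1 AS x")
v2 = add_version(conn, "VIEW", "db.public.v_sales", "CREATE VIEW v_sales AS SELECT 2 AS x")

# The overwritten definition is still retrievable by version number.
old_ddl = conn.execute(
    "SELECT ddl FROM object_versions WHERE object_name = ? AND version = 1",
    ("db.public.v_sales",),
).fetchone()[0]
```

The point of keeping obsolete_at rather than deleting old rows is that an accidental CREATE OR REPLACE no longer loses the earlier definition.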
Do you know how to transfer only new records between two different databases (e.g. Oracle and MSSQL) using SSIS? There is no problem transferring only new data between two tables in the same database and server, but is it possible to do this between completely different servers and databases?
P.S. I know about the solution using Lookup, but it is not very efficient if you need to check and add a lot of records (50k or more) several times per day. I would like to operate on new data only.
You have several options:
Timestamp based solution
If you have a column which stores the insertion time in the source system, you can select only the records created since the last load. With the same logic you can transfer modified records too: just stamp each record with the current timestamp whenever it changes.
Sequence based solution
If there is a sequence in the source table, you can load the new records based on that sequence. Query the last value from the destination system, then load everything which is larger than that value.
CDC based solution
If you have CDC (Change Data Capture) in your source system, you can track the changes and you can load them based on the CDC entries.
Full load
This is the most resource hungry solution: you have to copy all data from the source to the destination. If you do not have any column which marks the new records, you should use this solution.
You have several options to achieve this:
TRUNCATE the destination table and reload it from source
Use a Lookup component to determine which records are missing
Load all data from source to a temporary table and write a query which retrieves the new/changed records.
Summary
If you have at least one column which marks the new/modified records, you can use it to implement a differential/incremental load with SSIS. If you have no way of telling which columns/rows have changed, you have to load (or at least query) all of them.
There is no one-query (INSERT .. SELECT) solution across multiple servers that avoids transferring all the data. (Please note that a multi-server query using Linked Servers also transfers the data from the source system.)
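The sequence based option can be sketched as follows. This is only an illustration under assumptions: two in-memory SQLite databases stand in for the two servers, the table names table_1 (destination) and table_2 (source) and the payload column are placeholders, and Python is used in place of an SSIS package. In SSIS the usual analogue would be reading the last id into a package variable and passing it as a parameter to the source query.

```python
import sqlite3

def load_new_rows(source, dest):
    """Copy rows whose id is above the destination's current maximum."""
    # Step 1: read the high-water mark from the destination connection.
    last_id = dest.execute(
        "SELECT COALESCE(MAX(id), 0) FROM table_1"
    ).fetchone()[0]
    # Step 2: pass it as a parameter to a query on the source connection.
    rows = source.execute(
        "SELECT id, payload FROM table_2 WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    # Step 3: insert only those rows into the destination.
    with dest:
        dest.executemany(
            "INSERT INTO table_1 (id, payload) VALUES (?, ?)", rows
        )
    return len(rows)

# Two separate databases; neither connection can see the other's tables.
source = sqlite3.connect(":memory:")
dest = sqlite3.connect(":memory:")
source.execute("CREATE TABLE table_2 (id INTEGER PRIMARY KEY, payload TEXT)")
dest.execute("CREATE TABLE table_1 (id INTEGER PRIMARY KEY, payload TEXT)")
source.executemany(
    "INSERT INTO table_2 VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")]
)
source.commit()
dest.execute("INSERT INTO table_1 VALUES (1, 'a')")
dest.commit()

copied = load_new_rows(source, dest)  # only ids 2 and 3 cross the wire
```

Only a single integer travels from the destination to the source, so the source query never has to return rows the destination already holds. Note that this catches inserts only; updated rows need the timestamp or CDC approach.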
What about variables? Is it possible to use the same variable across different databases and servers in SSIS?
I would like to read the last id from a destination table and pass it to the query on the source table (different server!).
I can set a variable in a database scope like this:
DECLARE @Last int;
SET @Last = (SELECT TOP 1 Id FROM dbo.Table_1 ORDER BY Id DESC);
SELECT *
FROM dbo.Table_2
WHERE ID > @Last;
However, this only works between two tables in the same database (as a SQL command). I can create a variable for an entire SSIS package in Variables --> Add variable, but I don't know whether it is possible to use that variable in a similar way: to hold the last id from the destination table and pass it to a table on the source server as the lower limit for the data.
I've got a table in SQLite, and it already has many rows stored in it. I now realise I need another column in the table. Up to now I've just deleted the database and started again because the data has just been test data. But now the data in the database can't be deleted.
I know the query to add a column to the table; my question is, what is a good way to do this so that it works for both existing users and new users? (I have updated the CREATE query I have for when the table is not found, because it's a new user or an existing user has cleared the database.) It seems wrong to have an ALTER query in software that ships, and to check every time. Is there some way of telling SQLite to automatically add the column if it doesn't exist during the UPDATE query I now need?
If I discover I need more columns in the future, is having a bunch of ALTER statements on startup (or somewhere?) really the best way to do it?
(If relevant this is for a node js app)
I'd just add a table somewhere that records which version your database is, and check that to determine whether an update is needed. Alternatively, if you already have a table that will always contain exactly one record, add a new field 'DatabaseVersion' to it.
So for example if you check the version number, and find it's a version 1 database when the newest version should be version 3, you know which updates to perform on it.
You can use PRAGMA user_version to store the version number of the database and check if the database needs to be updated.
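A minimal sketch of that approach, assuming a hypothetical items table and migration list. It is written in Python with the standard sqlite3 module rather than Node.js; the PRAGMA statements themselves are plain SQLite and not specific to either driver.

```python
import sqlite3

# Hypothetical migrations: entry i upgrades the schema from version i to i + 1.
MIGRATIONS = [
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)",
    "ALTER TABLE items ADD COLUMN price REAL",
    "ALTER TABLE items ADD COLUMN notes TEXT",
]

def migrate(conn):
    """Apply any migrations this database has not seen yet."""
    # A brand-new SQLite database reports user_version 0.
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for target, statement in enumerate(MIGRATIONS[version:], start=version + 1):
        conn.execute(statement)
        # PRAGMA does not accept bound parameters; target is an int we control.
        conn.execute(f"PRAGMA user_version = {target}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]

conn = sqlite3.connect(":memory:")
first_run = migrate(conn)    # new user: runs every migration
second_run = migrate(conn)   # already up to date: runs nothing
columns = [row[1] for row in conn.execute("PRAGMA table_info(items)")]
```

New users and existing users go through the same code path, so there is no separate "up to date" CREATE statement to maintain, and adding a column later means appending one entry to the list.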
I need to use the change tracking feature of SQL Server 2008. I have enabled it on many tables. Now I have to write a sync program to transfer this data to another location.
My problem is: how do I get only those tables whose data has changed, without having to loop through the whole list of tracked tables and check each of them?
Try sys.CHANGE_TRACKING_TABLES (documented on MSDN here).
You'll have to use OBJECT_NAME on the first column (object_id) to get the table name.
I receive new data files every day. Right now, I'm building the database with all the required tables to import the data and perform the required calculations.
Should I just append each new day's data to my current tables? Each file contains a date column, which would allow for a "WHERE" query in the future if I need to analyze data for one particular day. Or should I be creating a new set of tables for every day?
I'm new to database design (coming from Excel). I will be using SQL Server for this.
Assuming that the structure of the data being received is the same, you should only need one set of tables rather than creating new tables each day.
I'd recommend storing the value of the date column from your incoming data in your database, and also having a 'CreateDate' column in your tables, with a default value of 'GetDate()' so that it automatically gets populated with the current date when the row is inserted.
You may also want to have another column to store the data filename that the row was imported from, but if you're already storing the value of the date column and the date that the row was inserted, this shouldn't really be necessary.
In the past, when doing this type of activity using a custom data loader application, I've also found it useful to create log files to log success/error/warning messages, including some type of unique key of the source data and target database, i.e. if coming from an Excel file and going into a database table, you could store the row index from Excel and the primary key of the inserted row. This helps when tracking down any problems later on.
You might want to consider having a look at SSIS (SQL Server Integration Services). It's the SQL Server tool for doing ETL activities.
Yes, append each day's data to the tables; one set of tables for all data.
Yes, use a date column to identify the day that the data was loaded.
Maybe have another table with a date column and a CLOB column: the date to hold the load date and the CLOB to hold the file that you imported.
Good question. You most definitely should have a single set of tables and append the data daily. Consider this: if you create a new set of tables each day, what would, say, a monthly report query look like? A quarterly report query? It would be a mess, with UNIONs and JOINs all over the place.
A single set of tables with a WHERE clause makes the querying and reporting manageable.
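As a rough illustration of the single-table approach, here is a sketch using Python's sqlite3 module in place of SQL Server. The table daily_data, its columns, and the file names are invented for the example; in SQL Server the create_date default would be GETDATE() rather than CURRENT_TIMESTAMP.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE daily_data (
        data_date   TEXT NOT NULL,                  -- date column from the file
        value       REAL NOT NULL,
        source_file TEXT,                           -- optional: originating file
        create_date TEXT DEFAULT CURRENT_TIMESTAMP  -- filled in on insert
    )
    """
)

def append_file(conn, filename, rows):
    """Append one day's rows; the table itself never changes shape."""
    with conn:
        conn.executemany(
            "INSERT INTO daily_data (data_date, value, source_file) "
            "VALUES (?, ?, ?)",
            [(data_date, value, filename) for data_date, value in rows],
        )

append_file(conn, "2024-01-30.csv", [("2024-01-30", 10.0), ("2024-01-30", 5.0)])
append_file(conn, "2024-01-31.csv", [("2024-01-31", 7.0)])
append_file(conn, "2024-02-01.csv", [("2024-02-01", 4.0)])

# One particular day: a plain WHERE clause.
one_day = conn.execute(
    "SELECT SUM(value) FROM daily_data WHERE data_date = ?", ("2024-01-30",)
).fetchone()[0]

# Monthly report: a GROUP BY over the same table, no UNIONs needed.
monthly = conn.execute(
    "SELECT substr(data_date, 1, 7) AS month, SUM(value) "
    "FROM daily_data GROUP BY month ORDER BY month"
).fetchall()
```

With one table per day, the monthly query would instead need one UNION branch per daily table, and the list of branches would change every day.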
You might do a little reading on relational database theory. Wikipedia is a good place to start. The basics are pretty straightforward if you have the knack for it.
I would load the data into a staging table regardless and append it to the main tables afterwards. Once a week I would then refresh all data in the main table to ensure that it still matches the source.
Marcus