We recently ran into a major issue in which we lost critical usage data for August through October 2019.
I have set out to try to recover this data. Thankfully, I found that our SQL Server 2017 database is set to the full recovery model; from the threads I've read, if that weren't the case we would be out of luck.
With that, I'm trying to run a stored procedure I found here https://raresql.com/2012/04/08/how-to-recover-truncated-data-from-sql-server-without-backup/
I'm running into some permission issues, but assuming I get the permissions, would it even be possible to recover data from 3-4 months ago? I want to know before I proceed with asking for permissions and wasting the DBAs' time. If not, are we truly out of luck? Could third-party tools like ApexSQL work?
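For context, the first thing I plan to check (assuming I get read access) is whether any backups actually cover the missing window. A rough sketch of what I mean, where UsageDB is just a placeholder for our database name:
-- confirm the recovery model (UsageDB is a placeholder name)
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'UsageDB';
-- check whether any full or log backups cover August-October 2019
SELECT database_name, type, backup_start_date, backup_finish_date
FROM msdb.dbo.backupset
WHERE database_name = N'UsageDB'
  AND backup_finish_date >= '2019-08-01'
ORDER BY backup_finish_date;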
Backstory, if you're interested in what's going on:
What happened is that a PowerShell script on our Windows server generates CSV files from a usage table (which is updated with usage data every 5 minutes) and sends them daily to a Linux server via the Connect:Direct FTP service. Once a file is sent, the script truncates the usage table, because we don't want to run into performance issues as data keeps building up by thousands of records a day.
At the time we didn't consider archiving the export files, but we've learned our lesson and have been archiving them since. The transfer service on the Linux server stopped running, no one was notified about it for two months, and management now sees a gap in the usage report for those missing months. They're not happy about it.
Let me start by apologizing as I'm afraid this might be more of a "discussion" than an "answerable" question...but I'm running out of options.
I work for the Research Dept. for my city's public schools and am in charge of a reporting web site. We use a third-party vendor (Infinite Campus/IC) solution to track information on our students -- attendance, behavior, grades, etc. The IC database sits in a cloud and they replicate the data to a local database controlled by our IT Dept.
I have created a series of SSIS packages that pull in data each night from our local database, so the reporting data is through the prior school day. This has worked well, but recently users have requested that some of the data be viewed in real-time. My database sits on a different server than the local IC database.
My first solution was to create a linked server from my server to the local IC server; this was slow but worked. Unfortunately, it put a strain on the local IC database, my IT Dept. freaked out, and they told me I could no longer do that.
My next & current solution was to create an SSIS package that would be called by a stored procedure. The SSIS package would query the local IC database and bring in the needed data to my database. This has been working well and is actually much quicker than using the linked server. It takes about 30 seconds to pull in the data, process it and spit it out on the screen as opposed to the 2-3 minutes the linked server took. It's been in place for about a month or so.
Yesterday, this live report turned into a parking lot -- the report says "loading" and just sits like that for hours. It eventually will bring back the data. I discovered the department head that I created this report for sent out an e-mail to all schools (approximately 160) encouraging them to check out the report. As far as I can tell, about 90 people tried to run the report at the same time, and I guess this is what caused the traffic jam.
So my question is...is there a better way to pull in this data from the local IC database? I'm kind of limited with what I can do, because I'm not in our IT Dept. I think if I presented a solution to them, they may work with me, but it would have to be minimal impact on their end. I'm good with SQL queries but I'm far from a db admin so I don't really know what options are available to me.
UPDATE
I talked to my IT Dept about doing transactional replication on the handful of tables that I needed, and as suspected it was quickly shot down. What I decided to do was set up an SSIS package that is called via Job Scheduler and runs every 5 minutes. The package only takes about 25-30 seconds to execute. On the report, I've put a big "Last Updated 3/29/2018 5:50 PM" at the top of the report along with a message explaining the report gets updated every 5 minutes. So far this morning, the report is running fantastically and the users I've checked in with seem to be satisfied. I still wish my IT team was more open to replicating, but I guess that is a worry for another day.
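(For anyone setting up something similar: the every-5-minutes part is just a schedule attached to the job. A rough sketch, assuming SQL Server Agent and with a placeholder job name:)
-- attach an every-5-minutes schedule to an existing Agent job
-- (the job name is a placeholder)
EXEC msdb.dbo.sp_add_jobschedule
    @job_name             = N'Refresh IC live data',
    @name                 = N'Every 5 minutes',
    @enabled              = 1,
    @freq_type            = 4,   -- daily
    @freq_interval        = 1,   -- every day
    @freq_subday_type     = 4,   -- subday unit = minutes
    @freq_subday_interval = 5,   -- every 5 minutes
    @active_start_time    = 0;   -- starting at midnight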
Thanks to everybody who offered solutions and ideas!!
One option which I've done in the past is an "ETL on the Fly" method.
You set up an SSIS package with a data flow that writes to a DataReader Destination, and that destination then becomes the source for your SSRS report. In effect, when the SSRS report is run it automatically runs the SSIS package and fetches the data; the report can pass parameters into the SSIS package as well.
There's a bit of extra config involved but this is straightforward.
This article goes through it -
https://www.mssqltips.com/sqlservertip/1997/enable-ssis-as-data-source-type-on-sql-server-reporting-services/
I am running several databases on a dedicated Azure SQL server. All of them are S0. They all run the same processes, yet one of them got into trouble (this has never happened before). The automated stored procedure that periodically purges old records from tables did not seem to be able to finish. It has been running for over 12 hours now (it should finish in minutes) and I can't figure out how to stop it (obviously a restart is not available).
I tried to KILL the process (the session running the SP in question) but that did not seem to work. Any help appreciated. There's no need to worry about the data, but I'm trying not to go nuclear on this and delete the whole database.
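For concreteness, this is the kind of thing I mean (session id 52 below is a stand-in for the real one); the KILL is accepted, but the request never seems to go away:
-- see what the stuck session is doing and waiting on
SELECT session_id, status, command, wait_type, wait_time, percent_complete
FROM sys.dm_exec_requests
WHERE session_id > 50;
-- after a KILL, this reports rollback progress for that session
KILL 52 WITH STATUSONLY;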
Try renaming the database and then renaming it back, or altering the database to a different service level objective (SLO) and then reverting it. The other option is to toggle RCSI off and on WITH ROLLBACK IMMEDIATE.
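Roughly, the commands involved (the database name is a placeholder; the rename has to be run from the master database):
-- option 1: rename the database and rename it back
ALTER DATABASE [MyDb] MODIFY NAME = [MyDb_tmp];
ALTER DATABASE [MyDb_tmp] MODIFY NAME = [MyDb];
-- option 2: move it to a different SLO, then revert once the change completes
ALTER DATABASE [MyDb] MODIFY (SERVICE_OBJECTIVE = 'S1');
ALTER DATABASE [MyDb] MODIFY (SERVICE_OBJECTIVE = 'S0');
-- option 3: toggle RCSI, rolling back open transactions immediately
ALTER DATABASE [MyDb] SET READ_COMMITTED_SNAPSHOT OFF WITH ROLLBACK IMMEDIATE;
ALTER DATABASE [MyDb] SET READ_COMMITTED_SNAPSHOT ON;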
I have inherited a poorly documented CRM 2011 instance, let's call it 'haystack', that has literally hundreds of batch processes and workflows running on it under the same user account (IT_Job). One of them, let's call it 'needle', has a problem.
I know 'needle' ran at noon, but Advanced Find only allows me to query the system jobs down to the day they ran, not the time they ran.
How do I find out which batch process or workflow last updated an entity (incident)? All I seem to be able to filter the system jobs by is the IT_Job user.
It seems to me that the only way to find the 'needle' job would be to systematically add a dedicated user account for each job that acts on 'incident', and then search for the job running as the user that last modified the record.
So how do I find 'needle' quickly?
EDIT
OK, it turns out the site I am working for won't allow me access to the database or allow me to deploy a console app. (One could argue that they're not interested in a solution to this problem.) So is there any other way to query this data?
If you have access to the SQL Server then you can query the table dbo.WorkflowLogBase. This will help you narrow down which workflows ran at that exact time.
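A rough sketch of the kind of query I mean, joining the workflow log to its parent async operation so you can filter on the actual run time. The column names here are from memory and may differ slightly in your CRM 2011 database, so treat them as assumptions and check them against your schema; also note that CRM stores times in UTC.
-- system jobs that started in the noon window, with the record they ran against
DECLARE @WindowStart DATETIME = '2012-01-16 11:55';  -- placeholder date/time (UTC)
DECLARE @WindowEnd   DATETIME = '2012-01-16 12:05';
SELECT a.Name, a.StartedOn, a.CompletedOn, a.RegardingObjectId, w.Message
FROM dbo.AsyncOperationBase AS a
JOIN dbo.WorkflowLogBase    AS w ON w.AsyncOperationId = a.AsyncOperationId
WHERE a.StartedOn >= @WindowStart
  AND a.StartedOn <  @WindowEnd
ORDER BY a.StartedOn;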
You might also be able to pick up some extra information by looking at the audit history for the record that was changed, assuming auditing is turned on.
I'm looking for a little advice.
I have some SQL Server tables I need to move to local Access databases for some local production tasks - once per "job" setup, with 400 jobs this quarter, across a dozen users...
A little background:
I am currently using a DSN-less approach to avoid distribution issues
I can create temporary links to the remote tables and run "make table" queries to populate the local tables, then drop the linked tables. Works as expected.
Performance here in the US is decent - 10-15 seconds for ~40K records. Our India teams are seeing 5-10+ minutes for the same datasets. Their internet connection is decent, not great, and a variable I cannot control.
I am wondering if MS Access is adding some overhead here that can be avoided by a more direct approach: i.e., letting the server do all/most of the heavy lifting instead of Access?
I've tinkered with various combinations, with no clear improvement or success:
Parameterized stored procedures from Access
SQL Passthru queries from Access
ADO vs DAO
Any suggestions, or an overall approach to suggest? How about moving data as XML?
Note: I have users on Access 2007, 2010, and 2013.
Thanks!
It's not entirely clear, but if the MS Access database performing the dump is local and the SQL Server database is remote, across the internet, you are bound to bump into the physical limitations of the connection.
ODBC drivers are not meant to be used for data access beyond a LAN, there is too much latency.
When Access queries data, it doesn't open a stream: it fetches a block, waits for the data to be downloaded, then requests another batch. This is OK on a LAN but quickly degrades over long distances. Communication between the US and India probably has around 200 ms of latency, and you can't do much about it; it adds up very quickly if the communication protocol is chatty, and all of this sits on top of a connection whose bandwidth is very likely way below what you would get on a LAN.
The better solution would be to perform the dump locally and then transmit the resulting Access file after it has been compacted and maybe zipped (using 7z for instance for better compression). This would most likely result in very small files that would be easy to move around in a few seconds.
The process could easily be automated. The easiest option is maybe to perform this dump automatically every day and make it available on an FTP server or an internal website, ready for download.
You can also make it available on demand, maybe through an app running on a server and exposed through RemoteApp using RDP services on a Windows 2008 server, or simply through a website or a shell.
You could also have a simple Windows service on your SQL Server that listens for requests from a remote client installed on the local machines everywhere; it would process the dump and send it to the client, which would then unpack it and replace the previously downloaded database.
Plenty of solutions for this, even though they would probably require some amount of work to automate reliably.
One final note: if you automate the data dump from SQL Server to Access, avoid using Access in an automated way. It's hard to debug and quite easy to break. Use an export tool instead that doesn't rely on having Access installed.
Renaud and all, thanks for taking the time to provide your responses. As you note, performance across the internet is the bottleneck. The fetching of blocks of data (vs a contiguous download) is exactly what I was hoping to avoid via an alternate approach.
Our workflow is evolving to better leverage both sides of the clock: User1 in the US completes their day's efforts in the local DB and then sends JUST their updates back to the server (based on timestamps). User2 in India also has a local copy of the same DB and grabs just the updated records off the server at the start of his day. So, pretty efficient for day-to-day stuff.
The primary issue is the initial download of the local DB tables from the server (a huge multi-year DB) for the current "job" - this should happen just once at the start of the effort (a roughly week-long process). This is the piece that takes 5-10 minutes for India to accomplish.
We currently do move the DB back and forth via FTP - DAILY. It is used as a SINGLE shared DB and is a bit LARGE due to temp tables. I was hoping my new timestamp-based push-pull of just the changes daily would be an overall plus. It seems to be, but the initial download hurdle remains.
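(In case it helps clarify the push-pull idea, the server-side piece is just a watermark filter. A rough sketch with placeholder table and column names:)
-- pull only rows changed since the last sync (all names are placeholders)
DECLARE @LastSync DATETIME;
SELECT @LastSync = LastSyncTime FROM dbo.SyncWatermark WHERE UserName = 'User2';
SELECT *
FROM dbo.JobRecords
WHERE LastModified > @LastSync;
-- after a successful pull, advance the watermark
-- (ideally to the max LastModified actually pulled, to avoid gaps)
UPDATE dbo.SyncWatermark
SET LastSyncTime = GETDATE()
WHERE UserName = 'User2';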
We have been running a small application on SQL Server 2005 Express Edition for two years. The database has grown from 75 MB to nearly 400 MB in that time, so there isn't a large amount of data.
But the log file has now reached 3.7 GB, and without any change to hardware, table structure, or program code, we've noticed that the import processes which used to take 10-15 minutes now take a couple of hours.
Any idea where the problem could be? Could it be related to the log file? Does the 4 GB limit of Express Edition apply only to data files, or also to log files?
Additional information: there is no RAID on the DB server, and there are no concurrent users (only one user is logged in during the import process).
Thanks in Advance
Johannes
That the log file has grown so large is expected behavior: over the two years you have been running, SQL Server has been recording every change to the database in the transaction log as it goes about its business.
Normally you would clear the log out when you take a backup (as you most likely don't need that history anyway). If you are already backing up, all you need to do is change the SQL script to also back up or truncate the log file (it's in Books Online); depending on how you are backing up, your mileage may vary.
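If the database is in full recovery, the longer-term fix is to back the log up regularly so its space gets reused instead of growing. Roughly (database name and backup path are placeholders):
-- a regular log backup keeps the log from growing indefinitely
BACKUP LOG database_name
TO DISK = N'D:\Backups\database_name_log.trn';
-- or, if you never need point-in-time restores, switch to simple recovery
ALTER DATABASE database_name SET RECOVERY SIMPLE;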
To clear it down immediately, make sure no one is using the database, open Management Studio Express, find the database, and run:
-- truncate the log without backing it up (valid on SQL Server 2005; removed in later versions)
BACKUP LOG database_name WITH TRUNCATE_ONLY
GO
-- then reclaim the freed space
DBCC SHRINKDATABASE ('database_name')
From MSDN:
"The 4 GB database size limit applies only to data files and not to log files. "
SQL Server Express is also limited in that it can only use 1 processor and 1GB of memory. Have you tried monitoring the processor/memory usage while the import is running to see if this is causing a bottleneck?
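For example, while the import runs you could check from a second connection what the session is actually doing and waiting on; a quick sketch:
-- resource use and current waits for active user sessions
SELECT session_id, status, command, wait_type, wait_time, cpu_time, reads, writes
FROM sys.dm_exec_requests
WHERE session_id > 50;  -- filter out system sessions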