Reporting Solution for Data Validation Queries (SQL Server)

I'd like to get some advice on a reporting situation that I have. I am working in SQL Server. I have a ton of data validation queries that I run against a database. In general, for each query, I return two things -- one is the count of the offending records, and the other is the offending records themselves.
My goal is to produce a report that gives the counts of the offending records for all data validation queries (ideally on one sheet of an Excel workbook) and the offending records themselves (ideally on separate sheets of the same workbook).
How is this best achieved? That is, what technology is best suited to this situation? For example, in the past I have prototyped the queries in SSMS, copied them into a Windows batch file (adding code to write the results to separate text files), and run the batch file, which calls the sqlcmd utility, from the command prompt. However, I know that other solutions exist (e.g., SSRS). Would something like SSRS be a better tool for this situation? I'm hesitant to go the SSRS route, since I'm only giving metrics on one issue (i.e., counts of offending records) and the rest of the report consists of the offending records themselves.

This might get closed because it is a matter of opinion, but SSRS would be a good solution for this requirement. I think SSRS is a good fit if the following criteria apply:
You need to visualize the data in some kind of a table, chart, or graph
You want to send out automated emails every morning / week / month to a group of users (as opposed to just individual consumption)
You want to be able to export the report to other formats (Excel or PDF) for additional analysis or sharing.
Otherwise, if it's just for you and you currently don't have SSRS running on the server, save yourself the overhead of running another service and just keep doing it in batch files.
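If you do stay with the batch-file route, one simple way to produce the summary sheet is to union the counts from every validation query into a single result set, and keep each detail query as its own output file. A minimal T-SQL sketch, with hypothetical check names, tables, and predicates:

```sql
-- Summary: one row per data validation check.
-- Check names, tables, and predicates are illustrative placeholders.
SELECT 'MissingPostalCode' AS CheckName, COUNT(*) AS OffendingRecords
FROM dbo.Customers
WHERE PostalCode IS NULL
UNION ALL
SELECT 'NegativeQuantity', COUNT(*)
FROM dbo.OrderLines
WHERE Quantity < 0;

-- Each detail query then runs separately and is written to its own file,
-- e.g.: sqlcmd -S MyServer -d MyDb -i check_postalcode.sql -o check_postalcode.txt
```

The text files can then be pulled into separate sheets of a workbook, which keeps the whole process scriptable.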

Related

Importing a CSV into SQL Server - Truncation

I'm trying to import data into SQL Server using SQL Server Management Studio, and I keep getting the "output column... failed because truncation occurred" error. This is because I'm letting Management Studio autodetect the field lengths, which it isn't very good at.
I know I can go back and extend the column length, but I'm thinking there must be a better way to get it right the first time without having to manually work out how long each column is.
I know this must be a common issue, but my Google searches aren't coming up with anything, as I'm looking for a technique rather than a fix for a specific issue.
One approach you may take, assuming the import is not something which would take hours to complete, is to just set every text column to VARCHAR(MAX), and then complete the CSV import. Once you have the actual table in SQL Server, you can inspect each column using LEN to see how wide it is. Based on that, you can either alter columns, or you could just take notes, drop the table, and reimport using appropriate widths.
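A rough sketch of that approach (the staging table and column names are made up):

```sql
-- 1. Stage every text column as VARCHAR(MAX) so nothing is truncated.
CREATE TABLE dbo.ImportStaging
(
    CustomerName VARCHAR(MAX),
    City         VARCHAR(MAX)
);

-- 2. After the CSV import, measure how wide each column actually is.
SELECT MAX(LEN(CustomerName)) AS MaxCustomerName,
       MAX(LEN(City))         AS MaxCity
FROM dbo.ImportStaging;

-- 3. Either resize in place, or drop and reimport with proper widths.
ALTER TABLE dbo.ImportStaging ALTER COLUMN City VARCHAR(50);
```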
You should look into leveraging SSIS for this task. There is somewhat of a fixed cost in terms of the time spent setting up the process for importing the CSV file and creating a physical table in the database. Ultimately, though, you will be able to set the data types for each column/field in your file. Further, SSIS will let you transform or reformat the data along the way.
I would suggest downloading Visual Studio and SQL Server Data Tools. The latter contains the tools you would need to complete this task, including SSIS, SSRS, and SSAS.
The main point is being able to automate this task, especially if it's an ongoing project of uploading CSV files into the database.

Update Multiple SSRS Reports in bulk

I need to make identical changes to hundreds of reports, and I was hoping to do this via SQL instead of opening each individual report and its query. I can extract the report query via XML and generate my list of reports, their locations, and the queries being used. But what I cannot figure out is how to update the report query and then get that update back into the Catalog database so that the report itself reflects the changes when executed. I have never seen this done, but maybe someone on here has tried it or knows that it's flat out not possible.
I could use SSIS and do this, but I would prefer not to download all the RDLs, update them, and then redeploy/upload the reports. I was hoping to update the reports/RDLs in place.
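For reference, the extraction step described above is usually done against the ReportServer database along these lines (a read-only sketch; querying the Catalog table directly is not officially supported, which is part of why writing changes back is risky):

```sql
-- Pull each report's RDL out of the report server catalog as XML.
-- Content is stored as image, hence the double conversion.
SELECT c.[Path],
       c.Name,
       CONVERT(XML, CONVERT(VARBINARY(MAX), c.Content)) AS ReportDefinition
FROM ReportServer.dbo.[Catalog] AS c
WHERE c.[Type] = 2;  -- 2 = report
```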
You shouldn't have to download the RDLs; they should already be in your source control system, and ideally collected and grouped into project(s). If so, you are in luck - you can use the global search/replace capabilities of Visual Studio (BIDS) or Notepad++ to make your change.
If your change were to the structure of the report, then you could simply write a quick and nasty console app to load the RDL and manipulate the XML structure. But things like the report query are held as free-form text in a node, making it harder to apply mass updates in a reliable way.
You could look to refactor the report queries into stored procedures and/or functions; this will make future updates a bit easier. In any case, if you change the report RDLs, you've got no option but to republish the modified ones - there's no such thing as an in-place change on the server (having your queries as stored procedures would have avoided this issue).
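To illustrate the refactoring suggestion: once the dataset in the RDL is just an EXEC call, future query changes live in the database rather than in the report definition. A trivial sketch with made-up names:

```sql
-- The report's dataset becomes simply: EXEC dbo.rpt_SalesSummary @Year
CREATE PROCEDURE dbo.rpt_SalesSummary
    @Year int
AS
BEGIN
    SET NOCOUNT ON;
    SELECT Region, SUM(Amount) AS Total
    FROM dbo.Sales
    WHERE YEAR(OrderDate) = @Year
    GROUP BY Region;
END;
```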

Extract data from thousands of Excel files into database

We use SharePoint 2013 as a library holding thousands of Excel files, with almost never consistent formatting, used to manage projects occurring on servers. Somewhere in these files, possibly formatted as table objects, is a common set of server names.
Somehow, without being able to change this process in the short term, I need to pull data from all these files to identify how many projects are targeting a particular server.
I've got access to SQL Server 2016 Enterprise, and I'm wondering if something like PolyBase could help with this. I also wonder about SSIS, but I don't expect any two tables to look exactly alike.
Other tools may be an option, but I'm not sure what can handle this scale and variety. I think daily updates to the data would be enough, but even so it's still a mess.
How do I pull thousands of varied Excel tables into a database? Is this even possible?
Any longer-term solution that doesn't allow them to format and annotate like Excel is unlikely to actually be adopted.
The less you know in advance, the more difficult it will be...
Some ideas:
Technology
Read about OPENROWSET, which allows you to read from an Excel file (see the sketch below)
Read about linked servers
Use Excel itself and its considerable VBA abilities to iterate through all your Excel sheets, open them, analyse them, and fill proper tables. Within Excel you know the most about your messy data...
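A sketch of the OPENROWSET idea, assuming the Microsoft ACE OLE DB provider is installed and ad hoc distributed queries are enabled (the file path and sheet name are placeholders):

```sql
-- Read one sheet of one workbook directly from T-SQL.
SELECT *
FROM OPENROWSET(
         'Microsoft.ACE.OLEDB.12.0',
         'Excel 12.0 Xml;HDR=YES;Database=C:\Projects\ServerList.xlsx',
         'SELECT * FROM [Sheet1$]'
     ) AS xl;
```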
Target structure
You might create thousands of tables, each representing one single sheet from your Excel files. You could query these tables with dynamically created SQL (using the metadata in INFORMATION_SCHEMA) or think about full-text search
You might import each sheet into one single XML structure (SELECT * ... FOR XML PATH('...')). In this case you'd need a target table with columns for the path and name of your Excel file, the name of the sheet, and an XML column for your data (see the sketch after this list). Another approach would be to represent each file as one XML document and include all its sheets there. Try to define common naming for all your data. Querying XML allows you to query columns without knowing their actual names (XQuery with XPath using *).
If your files are .xlsx already, you might open them with UNZIP and take the existing XML as-is.
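A minimal sketch of the XML-based target structure and a name-agnostic query (table, column, and server names are assumptions):

```sql
-- One row per imported sheet; the sheet's data lives in an XML column.
CREATE TABLE dbo.ExcelArchive
(
    FilePath  nvarchar(400),
    SheetName nvarchar(128),
    SheetData xml
);

-- XQuery with wildcards: find sheets mentioning a given server
-- without knowing the actual element/column names in each sheet.
SELECT FilePath, SheetName
FROM dbo.ExcelArchive
WHERE SheetData.exist('//*[text() = "SERVER042"]') = 1;
```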
To be honest: I do not think that any tool can do the magic to import such a wide range of mess automatically...

Best way to import large Excel files into SQL Server

We are trying to devise an optimal method for importing very large Excel files into a SQL Server database. Using SSIS is somewhat troublesome because it scans the top X records to determine the format of the file, but rows further down may be different, so it takes a lot of trial and error, with us having to bring the unusual columns to the top so SSIS can "learn".
When we get new file formats to import, they conform to a specification in terms of row formatting etc., so we can say we know the schema in advance. The SQL destination tables have the same schema, with a couple of extra columns such as date inserted and original filename.
Is there an easier way to create format definitions for the new files we are going to insert? We don't have to use SSIS; we are open to any other tool, with a view to as much automation as possible. There's also the question of testing the sanity of the data we import; we were planning on running basic queries against the staging datasets, such as "less than 1% of records may be missing a postal code", etc.
Many thanks
Maybe you can import the data as text and convert it afterwards using a Derived Column transformation. You can read data from Excel as text using the IMEX option in the connection string. More information about this parameter can be found here.
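As an illustration of the same flag outside SSIS, IMEX=1 can also be tried from T-SQL via OPENROWSET, assuming the ACE OLE DB provider is installed (the file path is a placeholder). IMEX=1 tells the driver to treat mixed-type columns as text instead of guessing from the first rows:

```sql
SELECT *
FROM OPENROWSET(
         'Microsoft.ACE.OLEDB.12.0',
         'Excel 12.0;HDR=YES;IMEX=1;Database=C:\Import\LargeFile.xlsx',
         'SELECT * FROM [Sheet1$]'
     ) AS src;
```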

Best Practice: presenting some data (making a report) from SQL Server

I have many SQL Server databases, each with a few tables containing important (from my point of view) information. I check the data (retrieving, for example, a maximum or minimum value) within those tables using T-SQL queries.
Since I don't want to create views for each of the databases, I'm thinking about the most convenient, easiest, and simply the best way to prepare a summary that updates each time it is opened.
The output file (or web page) will be used internally within the technical team. All members can log into the database using Windows authentication.
My ideas were:
Excel + dynamic T-SQL --> connect to the database and execute T-SQL (a cursor will go through all the database names; see the sketch below)
PowerShell (showing the table using the Out-GridView cmdlet)
PHP - first ask for all the database names (executing `select name from sys.databases`), then execute the query for each DB
What is, in your opinion, the best way? Do you have any better (from a programmer's point of view) way of getting such a report/data?
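For the first option, the dynamic T-SQL loop might look roughly like this (the inner query and table name are placeholders):

```sql
-- Run the same check in every user database and collect the results.
DECLARE @db sysname, @sql nvarchar(max);
DECLARE @results TABLE (DbName sysname, MaxValue int);

DECLARE dbs CURSOR FOR
    SELECT name FROM sys.databases WHERE database_id > 4;  -- skip system DBs

OPEN dbs;
FETCH NEXT FROM dbs INTO @db;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'SELECT ' + QUOTENAME(@db, '''') + N' AS DbName,'
             + N' MAX(SomeValue) FROM ' + QUOTENAME(@db) + N'.dbo.ImportantTable;';
    INSERT INTO @results (DbName, MaxValue)
    EXEC sys.sp_executesql @sql;
    FETCH NEXT FROM dbs INTO @db;
END
CLOSE dbs;
DEALLOCATE dbs;

SELECT DbName, MaxValue FROM @results;
```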
You can use SSRS reports. You have the option of exporting the report data to several formats, such as PDF, Excel, and Word. You can create a dataset for each of your databases. Since you are interested in showing aggregations and summations of values, SSRS reports will be pretty useful in these cases.