I need to create a report using SQL Server data from a couple million records. I'm getting this error in Excel; is there any way to resolve it? Or should I switch to SSRS? I've never used it, so I don't know whether it handles large data sets any better.
I'm trying to design a Matrix report through SSRS to aggregate a column for a range of dynamic values in another column (i.e. a pivot). This data consists of just over 13 million rows, so it's a large dataset.
When doing a PIVOT on this data via T-SQL, it aggregates all of these rows in about a minute; however, when I let SSRS do the pivoting for me through a Matrix report, I get an OutOfMemory exception while trying to preview the report on my PC.
The query returning the dataset itself isn't complicated; it's as simple as:
SELECT
ID
,Test_Ref
,Data_issue_indicator
FROM MyTable
We're trying to sum Data_issue_indicator (which is either a 1 or a 0) for each value of Test_Ref, and the range of Test_Ref values is dynamic. In other words, we cannot use a standard Tablix report, because the number of columns can increase at any time should a new Test_Ref value be introduced into the dataset.
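For reference, the T-SQL pivot is along these lines (a simplified sketch, not my exact query; the column list is built dynamically from whatever Test_Ref values exist at the time):

DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Build the pivot column list from the distinct Test_Ref values
-- (STRING_AGG needs SQL Server 2017+; STUFF/FOR XML PATH works on older versions)
SELECT @cols = STRING_AGG(CONVERT(NVARCHAR(MAX), QUOTENAME(Test_Ref)), ',')
FROM (SELECT DISTINCT Test_Ref FROM MyTable) AS t;

SET @sql = N'
SELECT ID, ' + @cols + N'
FROM (SELECT ID, Test_Ref, Data_issue_indicator FROM MyTable) AS src
PIVOT (SUM(Data_issue_indicator) FOR Test_Ref IN (' + @cols + N')) AS p;';

EXEC sys.sp_executesql @sql;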
I'm using Visual Studio Enterprise 2019, and my PC is a Windows 10, i7-8850H, with 16GB memory.
Is there a suggestion on getting around this issue?
When using SSRS, the usual advice is to pull more data in one go if you will use the dataset multiple times, but with a larger dataset it becomes a trade-off between what you want to achieve and whether you really need all of the data.
So in this situation I would suggest using a stored procedure to restrict the amount of data you send to the report (a rough sketch follows below).
I have been through this sort of scenario and had to do the same, because it is not the query that times out but the huge amount of data loaded into the report that makes it fail.
If you run SQL Server Profiler, you will see the SQL execute and complete, yet the report still times out while rendering.
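For example (the procedure and parameter names here are made up), pre-aggregating in the procedure means the report only has to render the summary rows rather than all 13 million detail rows:

CREATE PROCEDURE dbo.usp_DataIssueSummary
    @TestRef VARCHAR(50) = NULL   -- optional report parameter to narrow the data further
AS
BEGIN
    SET NOCOUNT ON;

    -- Sum the 1/0 indicator per Test_Ref in SQL instead of letting the Matrix do it
    SELECT Test_Ref,
           SUM(Data_issue_indicator) AS IssueCount
    FROM dbo.MyTable
    WHERE @TestRef IS NULL OR Test_Ref = @TestRef
    GROUP BY Test_Ref;
END;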
Two ideas, assuming that you plan to deploy the report to a server that will have the memory to handle this, and that you'd prefer to do this processing on the report server rather than the SQL server for some reason:
Don't test the functionality on your PC in Visual Studio. Design the report, deploy it to your Report Server, and test it there to see if it works.
When testing on your PC, force it somehow to use a much smaller dataset: one just large enough to verify that the pivoting Matrix works, but small enough that your PC's memory can handle it (see the sketch after this list).
Or better yet, do option 2, and then option 1.
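For option 2, one simple way to do that (just a sketch; @MaxRows is a hypothetical report parameter, not something SSRS provides out of the box) is to cap the dataset query while designing locally:

-- Set @MaxRows to, say, 10000 on your PC and to a very large value (or drop the TOP) on the server
SELECT TOP (@MaxRows)
       ID,
       Test_Ref,
       Data_issue_indicator
FROM MyTable;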
I'm trying to import data into SQL Server using SQL Server Management Studio and I keep getting the "output column... failed because truncation occurred" error. This is because I'm letting the Studio autodetect the field length which it isn't very good at.
I know I can go back and extend the column length, but I'm thinking there must be a better way to get it right the first time without having to manually work out how long each column is.
I know that this must be a common issue but my Google searches aren't coming up with anything as I'm more looking for a technique rather than a specific issue.
One approach you may take, assuming the import is not something which would take hours to complete, is to just set every text column to VARCHAR(MAX), and then complete the CSV import. Once you have the actual table in SQL Server, you can inspect each column using LEN to see how wide it is. Based on that, you can either alter columns, or you could just take notes, drop the table, and reimport using appropriate widths.
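For the inspection step, a query along these lines (the table and column names are placeholders for your own) shows the widest value currently in each text column, after which you can resize with a little headroom:

-- Widest value per column; repeat the MAX(LEN(...)) expression for each text column
SELECT
    MAX(LEN(CustomerName)) AS CustomerName_MaxLen,
    MAX(LEN(AddressLine1)) AS AddressLine1_MaxLen
FROM dbo.ImportedCsv;

-- Then size the real columns based on what you find
ALTER TABLE dbo.ImportedCsv ALTER COLUMN CustomerName VARCHAR(100);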
You should look into leveraging SSIS for this task. There is somewhat of a fixed cost in terms of the time spent setting up the import process for the CSV file and creating a physical table in the database. Ultimately, though, you will be able to set the data types for each column/field in your file, and SSIS will also let you transform or reformat the data along the way.
I would suggest downloading Visual Studio and SQL Server Data Tools. The latter contains the tools you would need to complete this task, including SSIS, SSRS, and SSAS.
The main point is being able to automate this task, especially if it's an ongoing project of uploading csv files into the database.
I am building a simple database with about 6-7 tables. I will be setting a schedule to do a clean import from a .txt file.
I want to take this data and create a report, like I would do in an Excel spreadsheet, convert it to a PDF, and post it to our company intranet for anyone interested to access.
I'm trying to think of the best way to build my report. Would I just use an Excel spreadsheet with a direct connection to the database? Would I create some sort of console application (C/C#/VB/VB.NET) that would query the db, generate the report in an Excel file, convert it to PDF, and save it?
I'm quite comfortable in these different languages, just not as experienced with reporting services (although I do have a lot of experience working with Excel and VBA macros), but I want to get into SSRS and get familiar with it, as I will be doing a lot of projects like this in the future. This seems like an easy one to get my hands dirty with, learn from, and build off of.
Any insight or suggestions would be greatly appreciated.
Thanks so much!
My suggestion:
Create desired SQL queries to retrieve the data in desired form
Link these queries to your Excel sheet, perhaps directly in the form of pivot tables for aggregating the results
Using VBA, you can easily create PDF from the data at the click of a button
The initial design will be time intensive, but after that, everything is automated and one just needs to press the button that creates the PDF.
How to link Access queries to your Excel file:
Data --> Get external Data
You can easily refresh all data whenever you open the Excel sheet by putting the code below in the workbook's Open event (Workbook_Open):
ThisWorkbook.RefreshAll
If you need further clarification, do not hesitate to ask.
If your end goal is to create a PDF that will be out on your intranet then I would create the report in SSRS. Then you can schedule it to run and output a PDF to your network location.
I've had good experiences using a pivot table in Excel that is connected directly to your SQL database.
In the connection parameters in Excel there is a field where you can define your SQL query, whether it be to call a stored procedure or just a simple SELECT statement.
The main reason I prefer a pivot table SQL connection over a normal table connection is that, if you have a chart that references the connected table, the chart formatting gets reset when you refresh the connection (which you need to do to update your report).
If I use a chart that references a pivot table (or a pivot chart) then the formatting is retained.
I'd like to get some advice on a reporting situation that I have. I am working in SQL Server. I have a ton of data validation queries that I run against a database. In general, for each query, I return two things -- one is the count of the offending records, and the other is the offending records themselves.
My goal is to produce a report that gives the counts of the offending records for all data validation queries (ideally on one sheet in an Excel workbook) and the offending records themselves (ideally on separate sheets in the same workbook).
How is this best achieved? That is, what technology is best for this situation? For example, in the past I have prototyped the queries in SSMS, copied them into a Windows batch file that calls the sqlcmd utility (adding code to write the results to separate text files), and run the batch file from the command prompt. However, I know that other solutions exist (e.g., SSRS). Would something like SSRS be a better tool for this situation? I'm hesitant to go the SSRS route, since I'm only giving metrics on one issue (i.e., counts of offending records) and the rest of the report consists of the offending records themselves.
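To give a concrete (entirely made-up) illustration of the shape of these queries, the summary is just one count per check, and the details are the full offending rows:

-- Summary sheet: one row per validation check (check names and conditions are placeholders)
SELECT 'Missing email' AS CheckName, COUNT(*) AS OffendingRows
FROM dbo.Customers WHERE Email IS NULL
UNION ALL
SELECT 'Negative balance' AS CheckName, COUNT(*) AS OffendingRows
FROM dbo.Accounts WHERE Balance < 0;

-- Detail sheets: the offending records themselves, one query per check
SELECT * FROM dbo.Customers WHERE Email IS NULL;
SELECT * FROM dbo.Accounts WHERE Balance < 0;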
This might get closed because it is a matter of opinion, but SSRS would be a good solution for this requirement. I think SSRS is a good fit if you have the following criteria:
You need to visualize the data in some kind of a table, chart, or graph
You want to send out automated emails every morning / week / month to a group of users (as opposed to just individual consumption)
You want to be able to export the report to other formats (Excel or PDF) for additional analysis or sharing.
Otherwise, if it's just for you and you currently don't have SSRS running on the server, save yourself the overhead of running another service and just keep doing it in batch files.
I am tasked with exporting the data contained inside a MaxDB database to SQL Server 200x. I was wondering if anyone has gone through this before and what your process was.
Here is my idea, but it's not automated.
1) Export data from MaxDB for each table as a CSV.
2) Clean the CSV to remove ? (which it uses for nulls) and fix the date strings.
3) Use SSIS to import the data into tables in SQL Server.
I was wondering if anyone has tried linking MaxDB to SQL Server or what other suggestions or ideas you have for automating this.
Thanks.
AboutDev.
I managed to find a solution to this. There is an open source MaxDB library that will allow you to connect to it through .Net much like the SQL provider. You can use that to get schema information and data, then write a little code to generate scripts to run in SQL Server to create tables and insert the data.
MaxDb Data Provider for ADO.NET
If this is a one-time thing, you don't have to have it all automated.
I'd pull the CSVs into SQL Server tables and keep them forever; that will help with any questions that come up a year from now. You can prefix them all the same way, "Conversion_" or whatever. There are no constraints or FKs on these tables. You might consider using varchar for every column (or just for the ones that cause problems, or not at all if the data is clean), just to be sure there are no data type conversion issues.
Then pull the data from these conversion tables into the proper final tables. I'd use a single conversion stored procedure to do everything (but I like T-SQL). If the data isn't that large (millions and millions of rows or less), just loop through and build out all the tables, printing log info as necessary, or inserting into exception/bad-data tables as necessary.
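A minimal sketch of what one step of that conversion procedure could look like (the table and column names are invented; NULLIF and TRY_CONVERT handle MaxDB's '?' nulls and the date strings from the CSV):

CREATE PROCEDURE dbo.usp_ConvertMaxDbData
AS
BEGIN
    SET NOCOUNT ON;

    -- Move one staging table into its final table, cleaning as we go
    INSERT INTO dbo.Customer (CustomerId, CustomerName, CreatedDate)
    SELECT CustomerId,
           NULLIF(CustomerName, '?'),          -- MaxDB exports '?' for NULL
           TRY_CONVERT(date, CreatedDate, 104) -- 104 = dd.mm.yyyy; adjust to the exported date format
    FROM dbo.Conversion_Customer;

    -- Park rows whose dates don't convert so they can be reviewed later
    INSERT INTO dbo.Conversion_BadRows (SourceTable, SourceId)
    SELECT 'Conversion_Customer', CustomerId
    FROM dbo.Conversion_Customer
    WHERE CreatedDate IS NOT NULL
      AND TRY_CONVERT(date, CreatedDate, 104) IS NULL;
END;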