Importing Excel to SQL Server from the backend - sql-server

I have a ton of Excel files which I'm trying to import into SQL Server from the backend, and I then want to automate the import using a batch file.
I know that we can use OPENROWSET inside a T-SQL script to load the Excel files. I'm also aware of the SQLCMD and BCP options. All of these work for Excel sheets which are straightforward grids.
However, the challenge is that I only need to load a specific region/range of Excel cells from a sheet.
For example: if the sheet has the info below, I only need to load these columns:
Date, Group1, Group2 and Group 3
until it hits the "Blank Row", ignoring everything below it.
Date Group1 Group2 Group 3
Jan-13 25 26 27
Jan 18 35 29 19
20 15
<empty row> <empty row>
Y/Y % YTD % Group %
15 20 40
So, my question is: is it possible to implement this functionality using OPENROWSET in T-SQL? If so, can you please point me to any links/examples on how I can do this? I tried digging around a bit on the MSDN site but couldn't find any.
If this cannot be done in T-SQL, any ideas on how I could implement it from the backend?
Thanks in advance,
Bee
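One way to picture the range rule from the question (keep the first four columns, stop at the first blank row) is as a small pre-processing step run before the load. The sketch below is only an illustration in Python, not a T-SQL answer: the function names and the sample cell values are hypothetical, and it assumes the sheet has already been read into a list of row tuples by whatever Excel reader you use.

```python
from itertools import takewhile


def is_blank(row):
    """True when every cell in the row is empty (None or whitespace only)."""
    return all(cell is None or str(cell).strip() == "" for cell in row)


def rows_until_blank(rows, n_cols=4):
    """Keep the first n_cols cells of each row, stopping at the first blank row."""
    kept = takewhile(lambda r: not is_blank(r), rows)
    return [tuple(row[:n_cols]) for row in kept]


# Hypothetical sheet contents mirroring the layout in the question.
sheet = [
    ("Date", "Group1", "Group2", "Group 3"),
    ("Jan-13", 25, 26, 27),
    ("Jan 18", 35, 29, 19),
    (None, 20, 15, None),       # partially filled row: kept, it is not blank
    (None, None, None, None),   # first blank row: everything from here is ignored
    ("Y/Y %", "YTD %", "Group %", None),
    (15, 20, 40, None),
]

header, *data = rows_until_blank(sheet)
print(header)
for row in data:
    print(row)
```

The trimmed rows could then be written out as a plain grid (for example a CSV file) that BCP or OPENROWSET can load without any range logic.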

Related

Adding observations and variables to a dataset with .csv files in Stata

I am using Stata 17. I want to add observations and variables to a dataset, which I'll call dataset1.
Dataset1 has the following structure:
Date Year urbanname urbancode etc..
2010m1 2010 Beijing 1029 ...
2010m2 2010 Beijing 1029 ...
2010m3 2010 Beijing 1029 ...
...
2015m1 2015 Paris 1030 etc
For different cities and different time periods.
I would like to add observations of other cities (that are not in the rows of dataset1), that I have in different .csv files (dataset2.csv, dataset3.csv, and so on..). Each city has its own dataset.
In each .csv dataset I want to add I have the following variables
the dates
the urbanname
the urbancode
other variables which I do not yet have in dataset1 but that I want to add
What would be your advice on how to proceed? I thought of doing it in R, but dataset1 does not open well in RStudio and the variable Date is not imported correctly.
You do not describe what you have tried so far and what issues you are encountering but you can do something like this:
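use dataset1, clear
* Store the data in a temporary file
tempfile appendfile
save `appendfile'
foreach dataset in dataset2.csv dataset3.csv {
    * clear is required because data is already in memory
    import delimited "`dataset'", clear
    append using `appendfile'
    * replace is required because the temporary file now exists
    save `appendfile', replace
}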

Need to add dynamic columns in headers in SSRS report

I need to add column headers to an SSRS report, but they are dynamic in nature.
For example, sometimes the query will return 5 differently named columns with their data, sometimes 9 differently named columns with their data, and so on.
So how do I refresh the columns in the dataset and show them in the SSRS report dynamically?
I am totally confused; I have seen many articles but have not been able to find a solution.
How can I implement this in an SSRS report? I have the query, and the columns are generated depending on the parameters. See the sample report preview below:
it shows dates in different columns.
In SSRS, the dataset must always return the same number of columns with the same names and datatypes, so you cannot do what you want directly.
You have two options.
Option 1
Normalise the data.
So instead of returning something like
SomeID ColumnA ColumnB ColumnC
1 10 20 30
2 15 25 35
3 100 200 300
You need to return
SomeID ColName Amount
1 'ColumnA' 10
1 'ColumnB' 20
1 'ColumnC' 30
2 'ColumnA' 15
2 'ColumnB' 25
2 'ColumnC' 35
3 'ColumnA' 100
3 'ColumnB' 200
3 'ColumnC' 300
Once you have your data in this format, you can simply use a matrix in your report. Set the row group to SomeID, the column group to ColName and the data value to Amount.
This is by far the simplest solution.
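To make the reshaping in Option 1 concrete, here is a minimal sketch of the wide-to-long step. It is written in Python purely to show the shape change; in a real report you would do this in the dataset query itself. The function name `unpivot` is hypothetical, and the sample rows are the ones from the tables above.

```python
def unpivot(rows, id_col, name_col="ColName", value_col="Amount"):
    """Turn every non-id column of each row into its own (id, name, value) row."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col == id_col:
                continue
            long_rows.append({id_col: row[id_col], name_col: col, value_col: value})
    return long_rows


# The wide result set: one column per value, so the column list can change.
wide = [
    {"SomeID": 1, "ColumnA": 10, "ColumnB": 20, "ColumnC": 30},
    {"SomeID": 2, "ColumnA": 15, "ColumnB": 25, "ColumnC": 35},
    {"SomeID": 3, "ColumnA": 100, "ColumnB": 200, "ColumnC": 300},
]

# The long result set always has the same three columns, whatever the input had.
long_rows = unpivot(wide, "SomeID")
for r in long_rows:
    print(r["SomeID"], r["ColName"], r["Amount"])
```

However many columns the wide query produces, the long form keeps the fixed SomeID / ColName / Amount layout that the dataset needs.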
Option 2
Deconstruct and rebuild the table in code
There are several drawbacks to this method, but if you are interested, read my answer to this question asked a few days ago:
SQL Server - SSRS - Display the content of a Table/View directly in the report (and not using table/matrix)

how to use the Pivoted Column values in Matrix use in another Tablix and write expressions on top of it

I have one set of data with the fields
StudentId, Name, Address in one dataset, used in one Tablix.
I also have another set of data, StudentID, Subject, Marks, in another dataset, and I use a Matrix to pivot it in the report.
I am able to produce the report this way:
StudentID Name Address Maths Physics Chemistry Median
1 Mike NJ 85 70 90 2
2 David CA 81 85 90 3
I calculate Median by counting the number of subject marks greater than 80.
Now, how do I use the value of Median in the Tablix instead of in the Matrix?
Below is the expected output format:
StudentID Median Name Address Maths Physics Chemistry
1 2 Mike NJ 85 70 90
2 3 David CA 81 85 90
Note: I am using a Matrix to pivot the Subject column in the SSRS report. I pivot in SSRS instead of in the SP because I get 40 columns after pivoting in the SP and would need to physically map all 40 columns. In this example I have only given 3 columns (Maths, Physics and Chemistry).
Also, please let me know whether the expected output format is possible at all.
Is there any way to pivot the Subject column inside the Tablix itself instead of using another Matrix?
Thank you.
Thank you.
There are typically two ways to go about an aggregation like this. If you stick with the two existing datasets, you'll have to use the Lookup or LookupSet functions to get data from the other dataset. For example, if your table/matrix is using the second dataset as its source, you would Lookup the Name of each student. Keep in mind that this is not efficient for large reports.
The other approach, which I would recommend, is to join these two datasets in SQL and use that as the data source for the report. This is more efficient and makes the report simpler to maintain.
It's good that you are letting the report do the pivoting for you; it works much better that way.
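As a rough illustration of the second approach, the sketch below joins the two datasets into one flat result and attaches the per-student count of marks above 80, so that a single matrix can show everything. It is written in Python only to show the logic; in practice this would be a JOIN in the dataset query. The function name and the `threshold` parameter are hypothetical; the rows come from the question.

```python
from collections import defaultdict

students = [
    {"StudentID": 1, "Name": "Mike", "Address": "NJ"},
    {"StudentID": 2, "Name": "David", "Address": "CA"},
]
marks = [
    {"StudentID": 1, "Subject": "Maths", "Marks": 85},
    {"StudentID": 1, "Subject": "Physics", "Marks": 70},
    {"StudentID": 1, "Subject": "Chemistry", "Marks": 90},
    {"StudentID": 2, "Subject": "Maths", "Marks": 81},
    {"StudentID": 2, "Subject": "Physics", "Marks": 85},
    {"StudentID": 2, "Subject": "Chemistry", "Marks": 90},
]


def join_with_count(students, marks, threshold=80):
    """Inner-join marks to students and add the count of marks above threshold."""
    above = defaultdict(int)
    for m in marks:
        if m["Marks"] > threshold:
            above[m["StudentID"]] += 1
    by_id = {s["StudentID"]: s for s in students}
    return [
        # "Median" is the question's name for this count, kept for consistency.
        {**by_id[m["StudentID"]], **m, "Median": above[m["StudentID"]]}
        for m in marks
        if m["StudentID"] in by_id
    ]


rows = join_with_count(students, marks)
for r in rows:
    print(r)
```

With one dataset shaped like this, StudentID, Median, Name and Address can all be row-group columns, and Subject remains the column group.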

MS Analysis Cube - one-to-many joins

I am building an OLAP cube in MS SQL Server BI Studio. I have two main tables that contain my measures and dimensions.
One table contains
Date | Keywords | Measure1
where date-keyword is the composite key.
The other table looks like
Date | Keyword | Product | Measure2 | Measure3
where date-keyword-product is the composite key.
My problem is that there can be a one-to-many relationship between the date-keyword pairs in the first table and the date-keyword pairs in the second table (as the second table has data broken down by product).
I want to be able to make queries that look something like this when filtered for a given Keyword:
                              Measure1  Measure2  Measure3
============================================================
Tuesday, January 01 2013      23        19        18
============================================================
Bike                          23
Car                           23        16        13
Motorcycle                    23
Caravan                       23        2         4
Van                           23        1         1
I've created dimensions for the Date and ProductType but I'm having problems creating the dimension for the Keywords. I can create a Keyword dimension that affects the measures from the second table but not the first.
Can anyone point me to any good tutorials for doing this sort of thing?
Turns out the first table had one row with all null values (a weird side effect of uploading an Excel file straight into the MS SQL Server db). Because the value that the cube was trying to apply the dimension to was null in this one row, the whole cube build and deploy failed with no useful error messages! Grr
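A check like the following, run on the imported rows before building the cube, would have flagged the problem row. This is just a sketch in Python: the function name and the sample values are hypothetical, and only the column names come from the first table described above.

```python
def split_null_rows(rows):
    """Separate rows whose values are all None from rows that carry data."""
    good, bad = [], []
    for row in rows:
        target = bad if all(v is None for v in row.values()) else good
        target.append(row)
    return good, bad


# Hypothetical rows shaped like the first table (Date | Keywords | Measure1).
imported = [
    {"Date": "2013-01-01", "Keywords": "bike", "Measure1": 23},
    {"Date": None, "Keywords": None, "Measure1": None},  # stray row from the import
    {"Date": "2013-01-02", "Keywords": "car", "Measure1": 17},
]

good, bad = split_null_rows(imported)
print(f"{len(good)} usable rows, {len(bad)} all-null rows")
```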

GROUP_CONCAT and DISTINCT are great, but how do i get rid of these duplicates i still have?

i have a mysql table set up like so:
id uid keywords
-- --- ---
1 20 corporate
2 20 corporate,business,strategy
3 20 corporate,bowser
4 20 flowers
5 20 battleship,corporate,dungeon
what i WANT my output to look like is:
20 corporate,business,strategy,bowser,flowers,battleship,dungeon
but the closest i've gotten is:
SELECT DISTINCT uid, GROUP_CONCAT(DISTINCT keywords ORDER BY keywords DESC) AS keywords
FROM mytable
WHERE uid !=0
GROUP BY uid
which outputs:
20 corporate,corporate,business,strategy,corporate,bowser,flowers,battleship,corporate,dungeon
does anyone have a solution? thanks a ton in advance!
What you're doing isn't possible with pure SQL the way you have your data structured.
No SQL implementation is going to look at "Corporate" and "Corporate, Business" and see them as equal strings. Therefore, distinct won't work.
If you can control the database, the first thing I would do is change the data setup to be:
id uid keyword <- note, not keyword**s** - **ONE** value in this column, not a comma delimited list
1 20 corporate
2 20 corporate
2 20 business
2 20 strategy
Better yet would be
id uid keywordId
1 20 1
2 20 1
2 20 2
2 20 3
with a separate table for keywords
KeywordID KeywordText
1 Corporate
2 Business
Otherwise you'll need to massage the data in code.
Mmm, your keywords need to be in their own table (one record per keyword). Then you'll be able to do it, because the keywords will then GROUP properly.
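Here is a small sketch of why one keyword per row makes DISTINCT work. It uses Python's built-in sqlite3 module rather than MySQL, so treat it as an illustration of the idea only; the table name `user_keywords` is hypothetical, and the rows are the question's data split into one keyword per record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_keywords (id INTEGER, uid INTEGER, keyword TEXT)")

# The question's five rows, stored as one record per keyword.
conn.executemany(
    "INSERT INTO user_keywords VALUES (?, ?, ?)",
    [
        (1, 20, "corporate"),
        (2, 20, "corporate"), (2, 20, "business"), (2, 20, "strategy"),
        (3, 20, "corporate"), (3, 20, "bowser"),
        (4, 20, "flowers"),
        (5, 20, "battleship"), (5, 20, "corporate"), (5, 20, "dungeon"),
    ],
)

# DISTINCT now compares single keywords, so the repeats of "corporate" collapse.
uid, keywords = conn.execute(
    "SELECT uid, GROUP_CONCAT(DISTINCT keyword) "
    "FROM user_keywords WHERE uid != 0 GROUP BY uid"
).fetchone()
print(uid, keywords)
conn.close()
```

The order of the concatenated keywords is not guaranteed in this sketch. In MySQL you can control it with an ORDER BY inside GROUP_CONCAT, as the query in the question already does.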
Not sure if MySQL has this, but SQL Server has RANK() OVER (PARTITION BY ...) that you can use to assign each result a rank. Doing so would allow you to select only those of rank 1 and discard the rest.
You have two options as I see it.
Option 1:
Change the way you store your data (keywords in their own table, joined to the existing table through a many-to-many relationship). This will allow you to use DISTINCT. DISTINCT doesn't work currently because the query sees "corporate" and "corporate,business,strategy" as two different values.
Option 2:
Write some 'interesting' SQL to split up the keyword strings. I don't know what the limits are in MySQL, but SQL in general is not designed for this.
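If the schema cannot change, the splitting can also be done in application code after fetching the rows as they are. A minimal sketch in Python, assuming the query returns plain (uid, keywords) pairs; the function name `merge_keywords` is hypothetical:

```python
def merge_keywords(rows):
    """Map each uid to its distinct keywords, in order of first appearance."""
    merged = {}
    for uid, keywords in rows:
        seen = merged.setdefault(uid, {})  # dict keys keep insertion order
        for kw in keywords.split(","):
            kw = kw.strip()
            if kw:
                seen.setdefault(kw, None)
    return {uid: ",".join(kws) for uid, kws in merged.items()}


# The rows from the question, as (uid, keywords) pairs.
rows = [
    (20, "corporate"),
    (20, "corporate,business,strategy"),
    (20, "corporate,bowser"),
    (20, "flowers"),
    (20, "battleship,corporate,dungeon"),
]

result = merge_keywords(rows)
print(result[20])
# corporate,business,strategy,bowser,flowers,battleship,dungeon
```

This gives the output asked for in the question, at the cost of doing the de-duplication outside the database.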
