Context
There are two Excel workbooks in the same location: database and dashboards. The database workbook has one tab per client I manage, and the dashboards workbook has one tab per required report.
Navigation across reports (the dashboards worksheets) is pretty simple. On each report there's a combobox that contains the name of every dashboards worksheet. Selecting a report in that combobox hides the current worksheet/report and opens the desired one.
In each tab/report there is a second combobox that allows you to select a client, populating the report with the selected client's data.
The report
The information in the database looks like this:
Date|Device|Group|Subgroup|metric1|metric2|metric3|etc.
The information displayed in the report (in the one I'm having issues with) looks like this:
Group|metric1|2|3|...
The issues
1) Currently the group is displayed like this:
=IFERROR(LOOKUP(2,1/(COUNTIF($C$17:C18,IF($C$8="Client1",Client1_GroupName,IF($C$8="Client2",Client2_GroupName,IF($C$8="Client3",Client3_GroupName,IF($C$8="Client4",Client4_GroupName)))))=0),IF($C$8="Client1",Client1_GroupName,IF($C$8="Client2",Client2_GroupName,IF($C$8="Client3",Client3_GroupName,IF($C$8="Client4",Client4_GroupName))))),"")
The combobox writes its value into Range("C8"). Through a nested IF structure the formula identifies the client and then pulls a unique list of groups from the selected client's tab (in the database workbook).
One issue is that it is very messy and hard to scale: the more clients I get, the more complex the formula becomes. I bet there are easier ways to do it (maybe VBA?).
It can also be quite slow: the more groups we get and the more days are recorded in the database, the slower it will get.
2) Pulling the data
Most of the data to pull can be done through array formulas like this one:
={SUM((Client1_GroupName=C20)*Metric1)}
It sums all the Metric1 values for the group matching C20, C21, C22, C23 and so on (the C20:xx range holds the first formula, which pulls the group list).
I haven't added the nested IFs yet. It's going to be a pain to do it across 5 more columns. Again, very hard to scale.
This can be terribly slow. It gets to the point where changing client means waiting 2 or 3 minutes for the arrays to recalculate.
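To illustrate the shape of a more scalable approach to both steps, here is a minimal sketch in Python rather than Excel. It replaces the nested IFs with a lookup keyed by the selected client, and it builds the unique group list and the per-group totals in a single pass over that client's rows. The `DATABASE` dictionary, the sample rows and the function name `metric1_by_group` are all hypothetical stand-ins for the database workbook; this is only meant to show the idea, not a drop-in solution.

```python
from collections import defaultdict

# Hypothetical stand-in for the database workbook: one list of rows per client tab.
# Each row is (date, device, group, subgroup, metric1).
DATABASE = {
    "Client1": [
        ("2015-01-01", "mobile", "Brand", "a", 10),
        ("2015-01-01", "desktop", "Generic", "b", 5),
        ("2015-01-02", "mobile", "Brand", "a", 7),
    ],
    "Client2": [
        ("2015-01-01", "mobile", "Shoes", "x", 3),
    ],
}


def metric1_by_group(client):
    """Return {group: total metric1} for one client, groups in first-seen order."""
    totals = defaultdict(int)
    # Looking the client up by name replaces the nested IF chain:
    # adding a client means adding a key, not another IF branch.
    for _date, _device, group, _subgroup, metric1 in DATABASE[client]:
        totals[group] += metric1
    return dict(totals)


print(metric1_by_group("Client1"))  # {'Brand': 17, 'Generic': 5}
```

The point of the design is that the data is scanned once per client change, instead of once per group cell and metric column as with the array formulas.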
Conclusions
I guess what I'm seeking is some advice on how to tackle these issues, which essentially are scalability and speed.
Related
This is my first question on StackOverflow so apologies if there is not enough appropriate information.
Rather than having four different tables that I try to position 'just so' so that they look like one table, I was hoping to have all of my data in one visible table and hide the rest.
To do this I was trying to use LookupSet/Lookup with Running Value (I need a cumulative figure for each fortnight from a start date).
I have used the following code which supplies me with figures in the table - however the figures seem to be nearly double what they actually are.
=Lookup(Fields!StartFortnightDate.Value, Fields!StartFortnightDate.Value,
Fields!RowIdentifier.Value, "KPI004")
Is it possible to use Lookup with RunningValue? It won't let me use ReportItems either; it's obviously only pulling from the first box and therefore just repeats the first figure again and again.
Any help, guidance, or even a simple "it's not possible" would be appreciated.
Edited to add more information as suggested:
It's difficult to add example data without worrying about data protection etc.
Report design is currently:
ReportDesign
Each table has its own dataset - I'm trying to get them all into one table.
Let's say the first dataset is number of cars sold in each fortnight.
The second dataset (table) is number of meetings held.
The third dataset is number of days weather was sunny/cloudy/rainy etc.
(This obviously isn't what the datasets are, but I'm trying to show that they don't actually relate to each other that much and therefore can't all be in the same script)
All datasets have a table of the fortnightly dates within that quarter. My hope was to get one table that shows the cumulative figures of each item even though they're not in the same dataset - the tables are all grouped by the StartOfFortnightDate.
The script =RunningValue(Fields!NumberOfFordCarsSold.Value, Count, Nothing) and similar works fine in the separate tables; however, if I add a row to the top table and try to use RunningValue with Lookup, it doesn't work.
When I use the script mentioned at the top (the Lookup script) I get inflated figures (top row of this image) compared to the expected figures (bottom row of the image): IncorrectAndCorrectFigures
Apologies if this doesn't make sense, it's likely that my complete confusion in trying to find the answer is coming across in the question.
If the resulting datasets are all similar then why can you not combine them?
From the output they seem to be just Indicator & Date.
Add an extra column to indicate which set of data each row belongs to (Cars, Meetings, etc.). This might help with grouping rows in the report.
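As a rough illustration of that suggestion, the sketch below (in Python, not SSRS) puts rows from the separate datasets into one list with an extra indicator column and then computes a cumulative figure per indicator per fortnight. The sample rows, the column layout and the function name `cumulative_by_indicator` are assumptions made up for the example.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical combined dataset: (indicator, start_of_fortnight, value).
rows = [
    ("Cars", "2016-01-04", 3),
    ("Meetings", "2016-01-04", 1),
    ("Cars", "2016-01-18", 2),
    ("Meetings", "2016-01-18", 4),
    ("Cars", "2016-02-01", 5),
]


def cumulative_by_indicator(rows):
    """Return {indicator: [(fortnight, running_total), ...]} in date order."""
    series = defaultdict(list)
    # ISO dates sort correctly as strings, so a plain sort orders each series.
    for indicator, fortnight, value in sorted(rows, key=lambda r: (r[0], r[1])):
        series[indicator].append((fortnight, value))
    result = {}
    for indicator, points in series.items():
        totals = accumulate(value for _, value in points)
        result[indicator] = [(f, t) for (f, _), t in zip(points, totals)]
    return result


print(cumulative_by_indicator(rows)["Cars"])
# [('2016-01-04', 3), ('2016-01-18', 5), ('2016-02-01', 10)]
```

With the data in this shape, one table grouped by indicator and fortnight can show every running total without a cross-dataset lookup.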
I have been thinking about how Tinder might have set up their data model - especially the part that selects the candidates to be shown (I'm not talking about the algorithm that determines the order, but only how to get all possible candidates in the first place). This process should only display other profiles that the current user has not already voted on.
So I could imagine this:
A table for the users (>40 million entries), and another one for the swipes (>1.5 billion new entries each day).
When selecting the candidates, one could join the two tables (+ obviously apply certain other selection criteria like the location, age range etc) and only return the users that the current user has not yet swiped for.
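To make that query concrete, here is a minimal sketch of the relational version using Python's built-in `sqlite3` module. The table and column names (`users`, `swipes`, `swiper_id`, `target_id`), the sample rows and the age filter are all assumptions for illustration; nothing here reflects how Tinder actually stores its data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE swipes (
        swiper_id INTEGER,
        target_id INTEGER,
        PRIMARY KEY (swiper_id, target_id)  -- also serves as the lookup index
    );
    INSERT INTO users VALUES (1, 30), (2, 28), (3, 45), (4, 31);
    INSERT INTO swipes VALUES (1, 2);       -- user 1 already voted on user 2
""")


def candidates(conn, user_id, min_age, max_age):
    """Users in the age range that user_id has not swiped on yet."""
    rows = conn.execute(
        """
        SELECT u.id
        FROM users AS u
        WHERE u.id <> ?
          AND u.age BETWEEN ? AND ?
          AND NOT EXISTS (
              SELECT 1 FROM swipes AS s
              WHERE s.swiper_id = ? AND s.target_id = u.id
          )
        ORDER BY u.id
        """,
        (user_id, min_age, max_age, user_id),
    ).fetchall()
    return [row[0] for row in rows]


print(candidates(conn, 1, 25, 35))  # [4]
```

The anti-join only ever touches the swipes of the current user, so the composite key on (swiper_id, target_id) matters more for this query than the total size of the swipes table.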
But: does that scale? Both of those tables are rather huge - so I guess at some point you would run into problems, right?
Furthermore, I read that Tinder is using AWS DynamoDB - so not a relational model. And this makes it even harder I guess...
So my question is: do you have an idea on how Tinder accomplished this?
I am essentially building one report that ingests two types of data. One is the receptionists' data, which is each receptionist's stats by day. The other is more granular: each call for each receptionist.
Essentially the report does two things: it shows receptionist performance, and then a person can click to make the same dashboard sheet update with the specific call log etc.
This data set is huge, so it is held as an export to make it faster, and I limit the data to this month and last month (the minimum requirement). I have also eliminated any unnecessary columns.
I am curious whether I should create two separate custom queries in Tableau and then create a referential field, or bring both custom queries into one workbook and join them together. At first I had the two connections separate, but now I have brought them together and am noticing some performance issues. What are some of my options?
It would be better to have two separate queries, since the first view doesn't need all the additional details you want to show in the drill-down.
Use an action filter and link the two sheets (which use different data sources) by selecting the specific fields when configuring the action filter.
Performance-wise this is a good approach.
I am stuck on a database problem for a client and am wondering if someone could help me out. I am currently trying to implement filtering functionality so that a user can filter results after they have searched for something. We are using SQL Server 2008. I am working on an electronics e-commerce site and the database is quite large (500,000 plus records). The scenario is this - the user goes to our website, types in 'laptop' and clicks search. This brings up the first page of several thousand results. What I want to do is then filter these results further and present the user with options such as:
Filter By Manufacturer
Dell (10,000)
Acer (2,000)
Lenovo (6,000)
Filter By Colour
Black (7000)
Silver (2000)
The main columns of the database are like this - the primary key is an integer ID
ID Title Manufacturer Colour
The key part of the question is how to get the counts in various categories in an efficient manner. The only way I currently know how to do it is with separate queries. However, should we wish to filter by further categories then this will become very slow - especially as the database grows. My current SQL is this:
select count(*) as ManufacturerCount, Manufacturer from [ProductDB.Product] GROUP BY Manufacturer;
select count(*) as ColourCount, Colour from [ProductDB.Product] GROUP BY Colour;
My question is whether I can get the results as a single table using some kind of join or union, and whether this would be faster than my current method of issuing multiple queries with the Count(*) function. Thanks for your help; if you require any further information please ask.
PS I am wondering how sites like eBay and Amazon manage to do this so fast. To understand my problem better, if you go onto eBay and type in laptop you will see a number of filters on the left - this is basically what I am trying to achieve. I don't know how it can be done efficiently when there are many filters. E.g. to get functionality equivalent to eBay I would need about 10 queries and I'm sure that will be slow. I was thinking of creating an intermediate table with all the counts; however, the intermediate table would have to be continuously updated in order to reflect changes to the database, and that would be a problem if there are multiple updates per minute. Thanks.
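For reference, the "single table" version of the two count queries can be sketched with UNION ALL. The example below runs it through Python's built-in `sqlite3` module on a tiny made-up `Product` table, so the sample rows and the `Facet`/`FacetValue`/`Total` column aliases are assumptions; the same UNION ALL shape should carry over to SQL Server. It only shows how to get one result set in one round trip and makes no claim about being faster than the separate queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (
        ID INTEGER PRIMARY KEY, Title TEXT, Manufacturer TEXT, Colour TEXT
    );
    INSERT INTO Product (Title, Manufacturer, Colour) VALUES
        ('Laptop A', 'Dell', 'Black'),
        ('Laptop B', 'Dell', 'Silver'),
        ('Laptop C', 'Acer', 'Black');
""")

# One row per (facet, value) pair, so the web app reads a single result set.
FACET_SQL = """
    SELECT 'Manufacturer' AS Facet, Manufacturer AS FacetValue, COUNT(*) AS Total
    FROM Product
    GROUP BY Manufacturer
    UNION ALL
    SELECT 'Colour', Colour, COUNT(*)
    FROM Product
    GROUP BY Colour
    ORDER BY 1, 2
"""

for facet, value, total in conn.execute(FACET_SQL):
    print(facet, value, total)
```

Each branch still scans and groups the table once, so the saving is mainly in round trips rather than in work done by the database.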
The "intermediate table" is exactly the way to go. I can guarantee you that no e-commerce site with substantial traffic and a large number of products would do what you are suggesting on the fly at every inquiry.
If you are worried about keeping track of changes to products, just do all changes to the product catalog thru stored procs (my preferred method) or else use triggers.
One complication is how you will group things in the intermediate table. If you are only grouping on pre-defined categories and sub-categories that are built into the product hierarchy, then it's fairly easy. It sounds like you are allowing free-text search... if so, how will you manage multiple keywords that result in an unexpected intersection of different categories? One way is to save the keywords searched along with the counts and a time stamp. Then, the next time someone searches on the same keywords, check the intermediate table and if the time stamp is older than some predetermined threshold (say, 5 minutes), return your results to a temp table, query the category counts from the temp table, overwrite the previous counts with the new time stamp, and return the whole enchilada to the web app. Otherwise, skip the temp table and just return the pre-aggregated counts and data records. In this case, you might get some quirky front-end count behavior, like it might say "10 results" in a particular category but then when the user drills down, they actually find 9 or 11. It's happened to me on different sites as a customer and it's really not a big deal.
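The timestamp check described above can be sketched roughly as follows, with an in-memory dictionary standing in for the intermediate table. The names `facet_counts`, `compute_counts` and `CACHE_TTL_SECONDS`, and the way keywords are normalised into a key, are assumptions for the example, written in Python for brevity.

```python
import time

CACHE_TTL_SECONDS = 300  # the "say, 5 minutes" threshold

# Stand-in for the intermediate table: keywords -> (timestamp, counts).
_cache = {}


def facet_counts(keywords, compute_counts, now=None):
    """Return cached counts for the keywords, recomputing them when stale."""
    now = time.time() if now is None else now
    key = keywords.strip().lower()  # assumed normalisation of the search text
    entry = _cache.get(key)
    if entry is not None and now - entry[0] <= CACHE_TTL_SECONDS:
        return entry[1]              # fresh enough: skip the expensive query
    counts = compute_counts(key)     # expensive path: run the real count query
    _cache[key] = (now, counts)      # overwrite counts with the new time stamp
    return counts
```

A caller would pass in whatever function runs the real category-count query, for example `facet_counts("laptop", run_count_query)` where `run_count_query` is hypothetical. Counts can be up to five minutes out of date, which is the source of the "10 results but actually 9 or 11" behaviour mentioned above.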
BTW, I used to work for a well-known e-commerce company and we did things like this.
Okay, two questions on interactive sorting:
1. How do I sort on multiple columns without holding the SHIFT key? (like this: http://lukehayler.com/2011/04/sorting-on-multiple-columns-in-ssrs/)
2. How do I cancel sorting? Usually, on most websites, the first click on the sorting arrows icon sorts ascending, the second click sorts descending, and the third click cancels sorting. With SSRS I only observe the first two options. Is there a way to cancel column sorting?
1) Right now that is just how the report viewer works. As others have said, you could write your own control to view the reports, but this may not be an option due to time or skill level constraints.
Other people have written their own custom report viewer controls; however, I have always stuck to the MS version, so I haven't got much experience with these, and I would suspect any good ones would be a paid solution.
2) I would love this option myself. Again, you can't reset the sorting in this way, as SSRS does not keep a record of the initial 'unordered' state of the data.
The only options you really have here are to reload your data with its original parameters or, as nathan pointed out, to include a column that contains the starting sort order. However, users may not like this, as it adds data which is not really relevant to the report data.
1) There is no way to do this with the standard report viewer control.
2) There is no way to "cancel" sorting. However assuming the data was sorted into some order originally then you could include a column on the report that represents the original sort order (if it's complex ordering then you could represent this with a sequence number). This would allow the user to sort on that column to return to the original order of the report.
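The sequence-number idea is easy to see outside SSRS. The sketch below (Python, with made-up rows) tags each row with its original position, sorts by another column, and then sorts on the sequence column to get back to the starting order.

```python
# Hypothetical report rows in their original order: (name, quantity).
rows = [("Pear", 3), ("Apple", 7), ("Fig", 1)]

# Add a sequence number that records the original order.
with_seq = [(seq, name, qty) for seq, (name, qty) in enumerate(rows, start=1)]

# The user sorts interactively by name...
by_name = sorted(with_seq, key=lambda row: row[1])

# ...and later sorts on the sequence column to "cancel" the sort.
restored = sorted(by_name, key=lambda row: row[0])

print([name for _, name, _ in by_name])   # ['Apple', 'Fig', 'Pear']
print([name for _, name, _ in restored])  # ['Pear', 'Apple', 'Fig']
```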