SSRS Code Shared Variables and Simultaneous Report Execution - sql-server

We have some SSRS reports that are failing when two of them are executed very close together.
I've found out that if two instances of an SSRS report run at the same time, any Code variables declared at the class level (not inside a function) can collide. I suspect this may be the cause of our report failures and I'm working up a potential fix.
The reason we're using the Code portion of SSRS at all is for things like custom group and page header calculation. The code is called from expressions in TextBoxes and returns what the current label should be. The code needs to maintain state: it remembers the last header value so it can return it when the current value is unknown, and it stores the new header value for reuse.
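To illustrate the pattern (this is a simplified sketch, not our exact code), the header-tracking function looks roughly like this, called from a textbox expression such as =Code.CurrentHeader(Fields!Category.Value):
' Class-level (Shared) state -- this is exactly the kind of variable that can
' collide when two executions of the report run at the same time.
Private Shared LastHeader As String = ""

Public Shared Function CurrentHeader(ByVal CurrentValue As Object) As String
    If Not (CurrentValue Is Nothing) AndAlso CurrentValue.ToString() <> "" Then
        LastHeader = CurrentValue.ToString()   ' store the new header value for reuse
    End If
    Return LastHeader                          ' repeat the last known value when the current one is unknown
End Function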
Note: here are my resources for the variable collision problem:
The MSDN SSRS Forum:
Because this uses static variables, if two people run the report at the exact same
moment, there's a slim chance one will smash the other's variable state (In SQL 2000,
this could occasionally happen due to two users paginating through the same report at
the same time, not just due to exactly simultaneous executions). If you need to be 100%
certain to avoid this, you can make each of the shared variables a hash table based on
user ID (Globals!UserID).
Embedded Code in Reporting Services:
... if multiple users are executing the report with this code at the same time, both
reports will be changing the same Count field (that is why it is a shared field). You
don’t want to debug these sorts of interactions – stick to shared functions using only
local variables (variables passed ByVal or declared in the function body).
I guess the idea is that on the report generation server, the report is loaded and the Code module is a static class. If a second client asks for the same report quickly enough, it connects to the same instance of that static class. (You're welcome to correct my description if I'm getting this wrong.)
So, I was proceeding with the idea of using a hash table to keep things isolated. I was planning on the hash key being an internal report parameter called InstanceID with default =Guid.NewGuid().ToString().
Part way through my research into this, though, I found that it is even more complicated because Hashtables aren't thread-safe, according to Maintaining State in Reporting Services.
That writer has code similar to what I was developing, only the whole thread-safe thing is completely outside my experience. It's going to take me hours to research all this and put together sensible code that I can be confident of and that performs well.
So before I go too much farther, I'm wondering if anyone else has already been down this path and could give me some advice. Here's the code I have so far:
Private Shared Data As New System.Collections.Hashtable()

Public Shared Function Initialize() As String
    If Not Data.ContainsKey(Parameters!InstanceID.Value) Then
        Data.Add(Parameters!InstanceID.Value, New System.Collections.Hashtable())
    End If
    LetValue("SomethingCount", 0)
    Return ""
End Function

Private Shared Function GetValue(ByVal Name As String) As Object
    Return Data.Item(Parameters!InstanceID.Value).Item(Name)
End Function

Private Shared Sub LetValue(ByVal Name As String, ByVal Value As Object)
    Dim V As System.Collections.Hashtable = Data.Item(Parameters!InstanceID.Value)
    If Not V.ContainsKey(Name) Then
        V.Add(Name, Value)
    Else
        V.Item(Name) = Value
    End If
End Sub

Public Shared Function SomethingCount() As Long
    SomethingCount = GetValue("SomethingCount") + 1
    LetValue("SomethingCount", SomethingCount)
End Function
My biggest concern here is thread safety. I might be able to figure out the rest of the questions below, but I am not experienced with this, and I know it is an area in which it is EASY to go wrong. The link above uses Dim _sht As System.Collections.Hashtable = System.Collections.Hashtable.Synchronized(_hashtable). Is that best? What about a Mutex? A Semaphore? I have no experience with any of these.
I think the namespace System.Collections for Hashtable is correct, but I'm having trouble adding System.Collections as a reference in my report to try to cure my current error of "Could not load file or assembly 'System.Collections'". When I browse to add the reference, it's not an available component to select.
I just confirmed that I can call code from a parameter's default value expression, so I'll put my Initialize code there. I also just found out about the OnInit procedure, but this has its own gotchas to research and work around: the Parameters collection may not be referenced from the OnInit method during parameter initialization.
I'm unsure about declaring the Data variable with New; perhaps it should only be instantiated in the initializer if that hasn't already been done (but I worry about race conditions because of the delay between the check that it's empty and its instantiation).
I also have a question about the Shared keyword. Is it necessary in all cases? I get errors if I leave it off function declarations, but it appears to work when I leave it off the variable declaration. Testing multiple simultaneous report executions is difficult... Could someone explain what Shared means specifically in the context of SSRS Code?
Is there a better way to initialize variables? Should I provide a second parameter to the GetValue function which is the default value to use if it finds that the variable doesn't exist in the hashtable yet?
Is it better to have nested Hashtables as I chose in my implementation, or to concatenate my InstanceID with the variable name to have a flat hashtable?
I'd really appreciate guidance, ideas and/or critiques on any aspect of what I've presented here.
Thank you!
Erik

Your code looks fine. For thread safety, only the root (shared) hashtable Data needs to be synchronised. If you want to avoid using your InstanceID, you could use Globals.ExecutionTime and User.UserID concatenated.
Basically, I think you just want to change your initialization to something like this:
Private Shared Data As System.Collections.Hashtable

' Inside your Initialize function:
If Data Is Nothing Then
    Data = System.Collections.Hashtable.Synchronized(New System.Collections.Hashtable())
End If
The contained hashtables should only be used by one thread at a time anyway, but if in doubt, you could synchronize them too.
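For what it's worth, here is a rough sketch of how the pieces could fit together, keeping the InstanceID idea from the question but passing the parameter value in from the expressions rather than reading Parameters inside the Code block (GetValue, LetValue and Bucket are illustrative names, not a prescribed API):
' Root hashtable wrapped by Hashtable.Synchronized so concurrent executions
' can add their per-execution hashtables safely.
Private Shared Data As System.Collections.Hashtable = _
    System.Collections.Hashtable.Synchronized(New System.Collections.Hashtable())

' Returns the per-execution hashtable, creating it on first use.
' (Entries are never removed in this sketch; old executions' buckets will accumulate.)
Private Shared Function Bucket(ByVal InstanceID As String) As System.Collections.Hashtable
    SyncLock Data.SyncRoot
        If Not Data.ContainsKey(InstanceID) Then
            Data.Add(InstanceID, New System.Collections.Hashtable())
        End If
        Return CType(Data(InstanceID), System.Collections.Hashtable)
    End SyncLock
End Function

' GetValue with a default value, so no separate Initialize call is required.
Public Shared Function GetValue(ByVal InstanceID As String, ByVal Name As String, ByVal DefaultValue As Object) As Object
    Dim V As System.Collections.Hashtable = Bucket(InstanceID)
    If V.ContainsKey(Name) Then Return V(Name)
    Return DefaultValue
End Function

Public Shared Sub LetValue(ByVal InstanceID As String, ByVal Name As String, ByVal Value As Object)
    Bucket(InstanceID)(Name) = Value   ' the Item setter adds or replaces
End Sub

Public Shared Function SomethingCount(ByVal InstanceID As String) As Long
    Dim Count As Long = CLng(GetValue(InstanceID, "SomethingCount", 0)) + 1
    LetValue(InstanceID, "SomethingCount", Count)
    Return Count
End Function
A textbox expression would then call it as =Code.SomethingCount(Parameters!InstanceID.Value), and GetValue's default argument removes the need for a separate Initialize call.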

Related

Static methods in GOSU and Thread-safety

I have the below function in a .gs class, which gets called when accessing specific Claim information -
public static function testVisibility(claim : Claim) : boolean {
    if (claim.State == ClaimState.TC_OPEN) {
        return true;
    }
    else {
        return false;
    }
}
My question -
a) If two users are accessing their respective Claims information, this function should get called twice - the first time it should receive the Claim instance of the first user, the second time the Claim instance of the second user. If the access is simultaneous - will two copies of the same function be invoked? That should not be the case, as a static function has only one copy. So, if it's one copy, how is thread safety ensured? Will the function be called one after another?
b) Like Java, does Gosu also use Heap to run the static functions?
It seems you are a little confused about the definition here. Thread safety is only a mechanism to protect the integrity of data shared between threads. Therefore, your example function is thread-safe, no matter whether it is static or not.
a) For the reason mentioned above, there would be no thread-safety problem here, because you are working with 2 different sets of data.
b) Provided that Gosu is built to run on the JVM and produces .class files, I believe for the most part (if not 100%, besides the syntax) it will behave like Java.
This is a classic confusion when starting out with any programming language.
Consider 100 people accessing a web application at exactly the same moment; the doubt here is whether the static variable/function will return/share the same content for all 100 people.
The fact is, data sharing won't happen here, because a separate thread is created for each server connection and the entire request is handled on that thread (one thread per connection).
So if you have a static/global variable, that particular variable will be used by 100 different threads, and the content/data of each thread will be secure and can't be accessed from the other threads (directly). This is how web applications work.
If we need to share some variables/classes among threads, we have to make them singletons.
For example, for database connections we don't need to create the connection every time if an established connection already exists. In that case the connection class will be a singleton.
Hope this makes sense. :)
-Aravind

Short lived DbContext in WPF application reasonable?

In his book on DbContext, Rowan Miller shows how to use the DbSet.Local property to avoid 1) unnecessary roundtrips to the database and 2) passing around collections (created with e.g. ToList()) in the application (page 24). I then tried to follow this approach. However, I noticed that from one using {} block to another, the DbSet.Local property becomes empty:
ObservableCollection<Destination> destinationsList;
using (var context = new BAContext())
{
    var query = from d in context.Destinations …;
    query.Load();
    destinationsList = context.Destinations.Local; // Nonzero here.
}
// Do stuff with destinationsList
using (var context = new BAContext())
{
    // context.Destinations.Local is zero here again;
    // So no way of getting the in-memory data from the previous using block here?
    // Do I have to do another roundtrip to the database here to get the same data I wanted
    // to cache locally???
}
Then, what is the point on page 24? How can I avoid passing around my collections if DbSet.Local is only usable inside the using block? Furthermore, how can I benefit from the change tracking if these short-lived context instances don't hand over any cached data to each other under the hood? So, if the contexts should be short-lived to free resources such as connections, do I have to give up caching? I.e. I can't have both at the same time (short-lived connections but a long-lived cache)? So my only option would be to store the results returned by the query in my own variables, which is exactly what is discouraged in the motivation on page 24?
I am developing a WPF application which maybe will also become multi-tiered in the future, involving WCF. I know Julia has an example of this in her book, but I currently don’t have access to it. I found several others on the web, e.g. http://msdn.microsoft.com/en-us/magazine/cc700340.aspx (old ObjectContext, but good in explaining the inter-layer-collaborations). There, a long-lived context is used (although the disadvantages are mentioned, but no solution to these provided).
It's not only the single Destinations.Local that gets lost; as you surely know, all the other entities fetched by the query are, too.
[Edit]:
After some more reading in Julia Lerman's book, it seems to boil down to the fact that EF does not have 2nd-level caching by default; with some (considerable, I think) effort, however, one can add third-party caching solutions, as is also described in the book and in various articles on MSDN, CodeProject, etc.
I would have appreciated it if this had been mentioned in the section about DbSet.Local in the DbContext book: that it is in fact a first-level cache which is destroyed when the using {} block ends (just my proposal to make it more transparent to readers). After the first reading I had the impression that DbSet.Local would always return the same reference (singleton-style) in the second using {} block as well, despite the new DbContext instance.
But I am still unsure whether the 2nd-level cache is the way to go for my WPF application (Julia mentions the 2nd-level cache in her article in the context of distributed applications). Or is the way to go to load my aggregate root instances (DDD, Eric Evans) of my domain model into memory with one or a few queries in a using {} block, dispose of the DbContext and only hold on to the references to the aggregate instances, this way avoiding a long-lived context? It would be great if you could help me with this decision.
http://msdn.microsoft.com/en-us/magazine/hh394143.aspx
http://www.codeproject.com/Articles/435142/Entity-Framework-Second-Level-Caching-with-DbConte
http://blog.3d-logic.com/2012/03/31/using-tracing-and-caching-provider-wrappers-with-codefirst/
The Local property provides a “local view of all Added, Unchanged, and Modified entities in this set”. Like all change tracking it is specific to the context you are currently using.
The DB Context is a workspace for loading data and preparing changes.
If two users were to add changes at the same time, they must not see each other's changes before those are saved. Either of them may discard their prepared changes, which would suddenly lead to problems for the other user as well.
A DB Context should indeed be short-lived, but it may live longer than "super short" when necessary. Also consider that you may not save resources by keeping it short-lived if you do not load and discard data but only add changes that you will save. But it is not only about resources; it is also about the DB state potentially changing while the DB Context is still active and has data loaded, which is important to keep in mind for longer-living contexts.
If you do not know yet all related changes you want to save into the database at once then I suggest you do not use the DB Context to store your changes in-memory but in a data structure in your code.
You can of course use entity objects for doing so without an active DB Context. This makes sense if you do not have another appropriate data class for it and do not want to create one, or if you decide that preparing the changes in them makes more sense. You can then use DbSet.Attach to attach the entities to a DB Context for saving the changes when you are ready.
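A rough sketch of that load-then-attach flow, reusing the BAContext and Destination types from the question (VB.NET here, with the original query's filter elided; the C# version has the same shape and assumes Imports System.Data.Entity and System.Linq):
' Load in one short-lived context, keep only the entity references.
Dim destinations As List(Of Destination)
Using context As New BAContext()
    destinations = context.Destinations.ToList()   ' materialise once, then let the context go
End Using

' ... the user edits the now-detached objects in the UI ...

' Later, attach the edited entities to a fresh context and save.
Using context As New BAContext()
    For Each d In destinations
        context.Destinations.Attach(d)
        context.Entry(d).State = EntityState.Modified   ' mark as changed so SaveChanges issues an UPDATE
    Next
    context.SaveChanges()
End Using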

Global property for DB access rather than passing DB around everywhere? Advice anyone?

Globals are evil right? At least everything I read says so, because something might alter the state of the global at any point.
However, I've got a DB object that's a bit of a tramp with regard to class parameters. The property below is an instance of a wrapper class that automatically works with MS Access or SQL - hence why it's not EF or some other ORM.
Public Property db As New DBI.DBI(DBI.DBI.modeenum.access, String.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0} ;Persist Security Info=True;Jet OLEDB:Database Password=""lkjhgfds8928""", GetRpcd("c:\cms")))
The code itself does have PostSharp for exception handling, so I'm thinking that I can conditionally handle OleDb errors by logging them and re-initialising the DB if it is Null.
Up till now, the solution has been to continually pass the db around as a parameter to every single class that needs it. Most of the data classes have a shared ObservableCollection that is built from structures that individually implement INotifyPropertyChanged. One of these is built asynchronously. The collection property checks whether it's empty before firing off the private async BuildCollection sub.
Given that we don't use dependency injection (yet), as I still need to learn it: is the global property all that bad? Db is needed everywhere that data is pulled in or saved. The only places I don't need it at all are the View and its code-behind.
It's not a customer facing project but it does need to be solid.
Any advice gratefully received!!
Passing the DB connection as a parameter into your classes IS using dependency injection, perhaps you just didn't recognize it as such. Hard coding the connection string in the callers is still code that is not free of dependencies, but at least your database accessors themselves are free of the dependency upon a global connection.
Globals aren't just evil because they change without notice - that's just one effect you see resulting from the bad design choice. They're evil because a design using them is brittle. Code that depends upon globals requires invisible stuff to be set correctly before calling it, and that leads to inter-dependencies between unrelated code. The invisible stuff becomes critically important stuff. Reading just the interface of a module that internally uses globals, how would I know that I have to call the SetupGlobalThing() method before calling it? What happens if I call IncrementGlobalThing() and DecrementGlobalThing() and MultiplyGlobalThing() in varying orders, depending on the function the user selects?
Instead, prefer stateless methods where you pass in all the stuff to be changed and used: IncrementThing(Integer thing) doesn't rely on hidden setup steps. It clearly does one thing: it increments the thing passed in.
It may help to think about it from a unit testing viewpoint. If you were to write a unit test to prove a specific module of code works, would you need to pass in a real database connection (hard*), or would you be able to pass in a fake database reference that meets your testing needs easily?
The best way to test your logic is to unit test it. The best way to test your class interfaces and method structure is to write unit tests that call them. If the class is hard to test, it's likely due to dependencies upon external things (globals, singletons, databases, inappropriate member variables, etc.)
The reason I called using a real database "hard" is that a unit test needs to be easy and fast to run. It shouldn't rely on slow or breakable or complex external things. Think about unit testing your software on the bus, with no network connection. Think about how much work it is to create a dummy database: you have to add users, you have to have the right version of schema in it, it has to be installed, it has to be filled with the right kind of testing data, you need network connectivity to it, all those things can make your testing unreliable. Instead, in a unit test you pass in a mock database, which simply returns values that exercise your code being tested.
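For example, here is a minimal sketch of what that looks like in practice (IDataStore, CustomerListViewModel and FakeDataStore are illustrative names, not part of your existing DBI wrapper; assumes Imports System.Collections.Generic and System.Linq):
' An interface describing just what the consuming class needs from the database.
Public Interface IDataStore
    Function GetCustomerNames() As IEnumerable(Of String)
End Interface

' Production code receives the dependency through its constructor instead of a global.
Public Class CustomerListViewModel
    Private ReadOnly _store As IDataStore

    Public Sub New(ByVal store As IDataStore)
        _store = store
    End Sub

    Public Function CountCustomers() As Integer
        Return _store.GetCustomerNames().Count()
    End Function
End Class

' A unit test passes in a fake that returns canned data -- no real database needed.
Public Class FakeDataStore
    Implements IDataStore

    Public Function GetCustomerNames() As IEnumerable(Of String) Implements IDataStore.GetCustomerNames
        Return New List(Of String) From {"Alice", "Bob"}
    End Function
End Class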

combining ms access vba codes

My colleague and I are developing an MS Access based application. We are designing and coding different pages/forms in order to divide the work. We plan to merge our work later. How can we do that without problems like spoiling the design and macros? We are using MS Access 2007 for the front end and SQL Server 2005 as the data source.
I found an idea somewhere on bytes.com: I can import the forms, reports, queries, data and tables that I want. I'm going to try this. However, it's just an idea, so I need to study this approach by trial and error.
The most important requirement is to complete the overall design before you start coding. For example:
All the forms must have the same style. Help and error information must be provided in the same way on each form. If a user can divide the forms into two sets by author, you have failed.
The database design must be finished with a complete, written description of each table, its relationships and its attributes.
The purpose and parameters for each major macro must be defined. If macro A1 exists only to service macro A then A1 is not a major macro and only A's author need know of its details until coding is complete.
Agree a documentation style and level of detail. If the application needs enhancement in six or twelve months' time, you should be able to work on the other's macros and forms as easily as on your own.
If one of you thinks a change to the design is required after coding has started, this change must be documented, agreed with the other and the change specification added to the master specification.
Many years ago I lectured on Electronic Data Interchange (EDI). With EDI, the specification is divided into two, with one set of organisations providing applications for message senders and another set providing applications for message receivers. I often used an example in my lectures to help my audience understand the importance of a complete, unambiguous specification.
I want two shapes, an E and a reverse-E, which I can fit together to create a 10 cm square. I do not care what they are made of providing they fit together perfectly.
If I give this task to a single organisation, this specification will be enough. One organisation might use cardboard, another metal, but I do not care. But suppose I ask one organisation to create the E and another the reverse-E. How detailed does my specification have to be if I am to get my 10 cm square? I would suggest: material, thickness and dimensions of the E. My audience would compete to suggest more and more obscure characteristics that had to match: density, colour, pattern, texture, etc, etc.
I was not always convinced my audience listened to the rest of my lecture because they were searching for a characteristic that would cap all the others. No matter, I had got across my major point, which was why EDI specifications were so mind-blowingly detailed.
Your situation will not be so difficult since you and your colleague are probably in the same room and can talk whenever you want. But I hope this example helps you understand how easy it is for the interface between your two parts to be less than seamless if you do not agree the complete design at the beginning. It's the little assumptions - I thought you knew I was doing it that way - that will kill your application.
New section
OK, probably most of my earlier advice was inappropriate in your situation.
So you are trying to modify code you did not write in a language you do not know. Good luck; you will need it.
I think scope is going to be your biggest problem. Most modern languages have namespaces allowing you to give a variable or a routine as much or as little scope as you require. VBA only has three levels.
A variable declared within a function or subroutine is automatically private to that function or subroutine.
A variable declared as Private within a module is invisible to functions and subroutines in other modules but is visible to any function or subroutine within the module.
A variable declared as Public within a module is visible to any function or subroutine within the project.
Anything declared within a form is private to that form. If a form wishes to pass a value to an outside function or subroutine, it can do so by writing to a public variable or by passing it in a parameter to a public function or subroutine.
Avoiding Naming Conflicts within VBA Help gives useful advice.
Form and module names will have to be unique across the merged project. You will not be able to avoid having constants, variables, functions and sub-routines which are visible to the other's functions and sub-routines. Avoiding Naming Conflicts offers one approach. An approach I have used successfully is to divide the application into sub-applications and, if necessary, sub-sub-applications and to assign a prefix to each. If every public constant, variable, function and sub-routine name carries the appropriate prefix, you can simulate namespace-style control, as in the sketch below.
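For instance (illustrative names only; "Inv" stands for a hypothetical invoicing sub-application, and tblInvoices/InvoiceNo are made-up objects):
' In a standard module belonging to the invoicing sub-application:
Private mLastInvoiceDate As Date            ' module-level: invisible outside this module

Public gInvDefaultTaxRate As Double         ' project-wide: prefixed so it cannot clash with the other developer's globals

Public Function InvNextNumber() As Long     ' public routine, also carrying the sub-application prefix
    Dim n As Long                           ' local: private to this function
    n = Nz(DMax("InvoiceNo", "tblInvoices"), 0) + 1
    InvNextNumber = n
End Function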

Is it bad practice to "go deep" with your application of callbacks?

Weird question, but I'm not sure whether it's an anti-pattern or not.
Say I have a web app that will be rendering 1000 records to an html table.
The typical approach I've seen is to send a query down to the database, translate the records in some way into some abstract form (be it an array, an object, etc.) and place the translated records into a collection that is then iterated over in the view.
As the number of records grows, this approach uses up more and more memory.
Why not send along with the query a callback that performs an operation on each of the translated rows as they are read from the database? This would mean that you don't need to collect the data for further iteration in the view so the memory footprint shrinks, and you're not iterating over the data twice.
There must be something implicitly wrong with this approach, because I rarely see it used anywhere. What's wrong with this approach?
Thanks.
Actually, this is exactly how a well-developed application should behave.
There is nothing wrong with this approach, except that not all database interfaces allow you to do this easily.
If we are talking about tabularizing 10 records for yet another social network, there is no need to mess with callbacks if you can get an array of hashes or whatever with a single call that is already implemented for you.
There must be something implicitly wrong with this approach, because I rarely see it used anywhere.
I use it. Frequently. Even when I wouldn't use too much memory repeatedly copying the data, using a callback just seems cleaner. In languages with closures, it also lets you keep relevant code together while factoring out the messy DB stuff.
This is a "limited by your tools" class of problem: Most programming languages don't allow to say "Do something around this code". This was solved in recent years with the advent of closures. Think of a closure as a way to pass code into another method which is then executed in a context. For example, in GSQL, you can write:
def l = []
sql.execute("select id from table where time > ?", time) { row ->
    l << row[0]
}
This will open a connection to the database, create a statement and a result set, and then run l << row[0] for each row the DB returns. Note that the code runs inside sql.execute() but it can access local variables (l) and variables defined by sql.execute() (row).
With this kind of code, you can even generate the result of an HTTP request on the fly without keeping much of the page in RAM at any time. In my case, I'd stream a 2 MB document to the browser using only a few KB of RAM, and the browser would then chew on it for 83 seconds to parse it.
This is roughly what the iterator pattern allows you to do. In many cases this breaks down on the interface between your application and the database. Technologies like LINQ even have solutions that can send back code to the database.
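In .NET terms, the equivalent is simply a method that takes a delegate and invokes it once per row while the data reader is open; a minimal sketch (the connection string, query and table are placeholders):
Imports System.Data
Imports System.Data.SqlClient

Module RowCallbacks
    ' Runs the given callback once per row, so no intermediate collection is built.
    Sub ForEachRow(ByVal connectionString As String, ByVal sql As String,
                   ByVal callback As Action(Of IDataRecord))
        Using conn As New SqlConnection(connectionString)
            conn.Open()
            Using cmd As New SqlCommand(sql, conn)
                Using reader = cmd.ExecuteReader()
                    While reader.Read()
                        callback(reader)   ' hand each row to the caller's code
                    End While
                End Using
            End Using
        End Using
    End Sub
End Module

' Usage: stream rows straight to the output instead of materialising a list, e.g.
' RowCallbacks.ForEachRow(connStr, "select id from table", Sub(row) Console.WriteLine(row.GetInt32(0)))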
I've found it easier to use an interface resolver than a deep callback where it's hooked up through several classes. MS has a much fancier version than mine called Unity. This provides a much cleaner way of accessing classes that should not be tightly coupled:
http://www.codeplex.com/unity
