Heavy-data application: Silverlight vs MVC vs ASP.NET vs WinForms

We've been using a custom-made ERP application for more than four years, and some bad design decisions made back then are now hurting its performance.
The application can fairly be classified as a heavy-data (line-of-business) application.
The biggest design mistake is that we fetch all the data for a particular business unit, assign it to controls or keep it in memory (in ArrayLists and generic Lists), and in return give the user flexible navigation and a fully responsive application.
We didn't notice the problem during the first two years of usage, but since then we have been getting more and more complaints about the application's performance.
We were, and still are, using ADO.NET with auto-generated stored procedures for CRUD operations (Get, List, Delete, Insert, Update). If we want to separate each year's data, we have to redesign those generated SPs and make the corresponding changes in the data access layer; and since there is no real separation of layers, that modification is not easy for us. So the first solution we reached for was database tuning to balance this huge data traffic, but the problem has been back for about six months now.
So we think this is the time to correct our mistakes (a very late decision), but how? We have ended up with these candidate solutions:
Use a web application rather than a Windows application, so we can use our server's full power and minimize connection traffic.
Redesign the application so that it handles year endings, stops fetching all the data, and uses a better ORM instead of old-fashioned ADO.NET, while sticking with Windows Forms and the same application.
Given the (somewhat limited) time frame and the team (two members, both with good experience in Windows and web development), I'm not able to take the decision right now; I'm afraid the Windows application modifications could take longer than expected.
Can you advise me?
I'm using C# 4.0 and ASP.NET Web Forms (I'm an MVC newbie).

If the database itself is quick, as you've indicated in your response to my comment, then, as you have suggested, you must reduce the amount of data populating the DataSet.
Although ADO.NET uses a "disconnected" model, the developer still has control over the select-query that is used to populate each table/relation in the DataSet. A hypothetical example: suppose you had 5000 customers in your CUSTOMERS table in the database. You might populate the CUSTOMERS table in your DataSet like this:
select * from CUSTOMERS order by customername
or like this:
select id, customername from CUSTOMERS order by customername
The order-headers table could be populated like this:
create proc getOrderHeaders4Customer
    @customerid int
as
select * from ORDERHEADER where customerid = @customerid
That is, you would need a current customer in the GUI before you populate the OrderHeaders relation, and you would fetch the order-detail data only when the user clicks a particular order header in the GUI. The point is that ADO.NET can be used this way -- you don't have to abandon everything in the app in order to be more parsimonious with the disconnected DataSet.
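Following the same pattern, the order-detail fetch might look something like this (a sketch only; the ORDERDETAIL table, its columns, and the proc name are hypothetical):
create proc getOrderDetails4OrderHeader
    @orderheaderid int
as
-- fetched only when the user drills into a specific order header in the GUI
select * from ORDERDETAIL where orderheaderid = @orderheaderid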
How much data would be fetched by the client, and more importantly, when the data are fetched by the client, is completely under developer control regardless of whether it's Silverlight, WinForms, or ASP.NET. Now, if your ERP application was not efficient, retrofitting efficiency may be difficult. But you don't have to abandon ADO.NET to achieve efficiency, and simply choosing another data-layer technology will not itself bring you efficiency.

Related

Which is better: iterating and sorting data in the backend, or letting the database handle it?

I'm trying to design a database schema for a Django REST Framework web application.
At some point, I have two choices:
1. Choose a schema in which, for one or several APIs, I have to get a queryset from the database and iterate over and order it in Python. (For example, I could store some data in an array-typed column, fetch it from the database, and sort it in Python.)
2. Store the data in another table, inserting a fairly large number of rows with each insert. This way, I can get the data in the format I want with far fewer lines of ORM code.
I ran some basic tests and benchmarks to see which way is faster, and letting the database handle more of the job (the second way) didn't let me down. But I don't have the means to set up a more realistic scenario, and here's the question:
Is it still a good idea to let the database handle the job when it also has to handle hundreds of requests from other APIs and clients each second?
Is the database (and the ORM) usually faster and more reliable than the backend?
As a general rule, you want to let the database do work when the work is appropriate for the database. Sorting result sets would be in that category.
Keep in mind:
The database is running on a server, often on a distributed system and so it has access to more resources.
Databases are designed to handle large data, so they are not limited by the memory in a single thread.
When this question comes up, often more data needs to be passed back to the application than is strictly needed. Consider a problem such as getting the top 10 of something (see the sketch after this list).
Mixing processing in the application and the database often requires multiple queries and passing data back and forth, which is expensive.
(And there are no doubt other considerations.)
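As a rough illustration of the "top 10" case (the table and column names here are hypothetical), letting the database sort and limit means only ten rows ever cross the wire:
-- The database sorts and limits; only 10 rows are returned to the application.
select customer_id, total_spent
from customer_totals
order by total_spent desc
limit 10;   -- use TOP (10) on SQL Server
Pulling the whole table into the application and sorting it there would move every row over the network just to discard all but ten of them.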
There are some situations where it might be more efficient or convenient to do work in the application. A common example is formatting result sets for the application -- say turning 1234.56 into $1,234.56. Other examples would be when the application language has capabilities that are not directly in SQL or are hard to implement in SQL.

With Microsoft Common Data Service, is incremental refresh possible?

We have tables on Salesforce which we'd like to make available to other applications using Microsoft Common Data Service. Moreover, we'd like to keep CDS more or less up to date, even including data that was created or updated five minutes ago.
However, some of those tables have hundreds of thousands, or even millions, of records. So, refreshing all the data is inefficient and impractical.
In theory, when CDS queries for data, it should be able to know how recent its newest data is and include that in the query for new data.
But I'm not clear on how to make that part of the query used in the refresh operation.
Is this possible?
What do I need to do?
How are you selecting the data that you are retrieving? If you have the option to use a SOQL query, you can use the fields CreatedDate and LastModifiedDate as part of your queries. For example:
SELECT Id, Name
FROM Account
WHERE CreatedDate > 2020-10-25T23:01:01Z
or
SELECT Id, Name
FROM Account
WHERE LastModifiedDate > 2020-10-25T23:01:01Z
There are some options but I have no idea what (if anything) is implemented in your connector. Read up a bit and maybe you'll decide to do something custom. There's also a semi-decent SF connector in Azure Data Factory 2 if that helps.
Almost every Salesforce table contains CreatedDate, LastModifiedDate and SystemModstamp columns, but we don't have raw access to the underlying database. There's no ODBC driver (or if there is, it's lying, pretending SF objects are "linked tables" and hiding the API implementation magic in stored procedures). I'm not affiliated with it and have never used it personally, but I've heard good things about DBAmp. It's been a few years though; alternatives might have popped up. Go give that ADF connector a go.
You could even rephrase the problem a bit and look around for a backup tool that does incremental backups and works with SQL Server; kill two birds with one stone.
So...
The other answer gives you the query route, which is OK but a bit impractical if you have 100+ tables.
There's the Data Replication API for getting the Ids (primary keys) of recently updated/deleted records. The SOAP API has a getUpdated call, and there's something similar in the REST API. You'd still have to call it per object, though, and you'd only learn that those records were modified; you'd still need to query all the columns (there's a "retrieve" call similar to SELECT *).
Perhaps you need to reverse the direction. SF can raise events when data changes, and subscribing apps have between 1 and 3 days to consume them. It uses the CometD protocol, and chances are there's a library for that in the .NET world. There are a few types of events (you can raise custom events, or raise them only when certain conditions are met, from SF config or code; and the other way around, a subscribing app can specify a query it's interested in and get notified whenever that query's results would change). But if you just want everything, search for "Change Data Capture". It could be a nice near-realtime solution.

Publish SQL Server data to clients from saas website with multi-tenant database?

We maintain a Software as a Service (SaaS) web application that sits on top of a multi-tenant SQL Server database. There are about 200 tables in the system, the biggest with just over 100 columns; at last look the database was about 10 gigabytes in size. We have about 25 client companies using the application, each entering their data and running reports.
The single instance architecture is working very effectively for us - we're able to design and develop new features that are released to all clients every month. Each client experience can be configured through the use of feature-toggles, data dictionary customization, CSS skinning etc.
Our typical client is a corporate with several branches, one head office, and sometimes its own in-house IT software development teams.
The problem we're facing now is that a few of the clients are undertaking their own internal projects to develop reporting, data warehousing and dashboards based on the data presently stored in our multi-tenant database. We see it as likely that the number and sophistication of these projects will increase over time and we want to cater for it effectively.
At present, we have a "lite" solution whereby we expose a secured XML webservice that clients can call to get a full download of their records from a table. They specify the table, and we map that to a purpose-built stored proc that returns a fixed number of columns. Currently clients are pulling about 20 tables overnight into a local SQL database that they manage. Some clients have tens of thousands of records in a few of these tables.
This "lite" approach has several drawbacks:
1) Each client needs to develop and maintain their own data-pull mechanism, deal with all the logging, error handling etc.
2) Our database schema is constantly expanding and changing. The stored procs they are calling have a fixed number of columns, but occasionally when we expand an existing column (e.g. turn a varchar(50) into a varchar(100)) their pull will fail because it suddenly exceeds the column size in their local database.
3) We are starting to amass hundreds of different stored procs built for each client and their specific download expectations, which is a management hassle.
4) We are struggling to keep up with client requests for more data. We provide a "shell" schema (i.e. a copy of our database with no data in it) and ask them to select the tables they need to pull. They invariably say "all of them" which compounds the changing schema problem and is a heavy drain on our resources.
Sorry for the long-winded question, but what I'm looking for is an approach to this problem that other teams have had success with. We want to securely expose all their data to them in a way they can most easily use, but without getting caught in a constant process of negotiating data exchanges and cleaning up after schema changes.
What's worked for you?
Thanks,
Michael
I've worked for a SaaS company that went through a similar exercise some years back, and web services are probably the best solution here. Incidentally, one of your "drawbacks" is actually a benefit: customers should be encouraged to do their own data pulls, because each customer's needs for timing and amount of data will be different.
Now, instead of a "lite" solution, you should look at building out a WSDL with separate CRUD calls for each table and good filtering capabilities. Also, make sure you have change timestamps for records in each table. This way a customer can hit each table and immediately pull only the records that have been updated since the last time they pulled.
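A rough sketch of what each table's pull could look like behind such a call (the table, column, and proc names here are hypothetical, and the tenant filter will depend on your multi-tenant design):
-- Hypothetical incremental-pull proc: returns only this tenant's rows
-- that have changed since the caller's last successful pull.
create proc GetChangedOrders
    @tenantId     int,
    @changedSince datetime2
as
select OrderId, TenantId, OrderDate, Total, LastModifiedUtc
from dbo.Orders
where TenantId = @tenantId
  and LastModifiedUtc > @changedSince
order by LastModifiedUtc;
The client records the largest LastModifiedUtc value it received and passes it back as @changedSince on its next pull.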
Will it be easy? Not a chance, but if you want scalability, it's the only route to go.
Good luck.

Business Layer vs SQL Server

I have an application that does complex calculations for members. Each member can have multiple US states linked to their profile, and each state has different calculations for each course a member completes.
As of now I have been performing the calculations in the DB (SQL Server 2008) and then sending the data back to the app layer, where members can see their history and download a certificate for each course.
I have a business logic layer, but not a lot happens there. I know this has been asked a lot, but where do you think I should perform these calculations: business layer or database? I keep going back and forth!
I would basically do anything in SQL Server that:
does a lot of summing, counting, averaging etc. of data and returns only a single value.
There's really no point in transferring large volumes of data to the middle tier, just to sum up a column
does a lot of row / set-based manipulation; if you need to copy, insert, or update lots of data, again there's really no point in dragging all that data to the middle tier and then sending it all back to the server - do it on the server right from the get-go. Also: T-SQL is significantly faster at set-based operations (like shuffling data around) than anything in your middle tier could be (see the sketch below).
In brief: try to avoid sending large volumes of data to the client/middle-tier/business layer unless you absolutely have to (e.g. because you want to download a file stored in a database, or if you really need to have those 200 rows materialized into objects in your app to be displayed somewhere or operated on)
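To illustrate the set-based point (a sketch; the table and column names are hypothetical): a single UPDATE on the server replaces what would otherwise be a fetch-loop-write round trip through the middle tier.
-- Apply a 5% price increase to one category in one set-based statement,
-- instead of pulling the rows into the middle tier and writing them back one by one.
update dbo.Products
set Price = Price * 1.05
where CategoryId = 42;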
One feature that's often overlooked is computed columns right in your database table - they are really good at e.g. summing up your order total plus tax and shipping into a grand total, or at putting your first and last name together into a display name. Those kinds of things really shouldn't be handled only in the business logic layer - if you do them in the database directly, those "computed" values are also available to you when you inspect the database tables and look at the data in SQL Server Mgmt Studio ...
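A minimal sketch of such a computed column (hypothetical table and column names):
-- GrandTotal is computed by SQL Server itself, so it shows up in any query
-- or in Management Studio, not just in the business logic layer.
alter table dbo.OrderHeader
    add GrandTotal as (SubTotal + Tax + Shipping);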
I would put it into the middle tier / business logic layer
if it required extensive logic checking, string parsing, pattern matching and so forth; T-SQL sucks at those things
if it required things like calling a web service to get some data to validate your data against, or something like that
if it required more of a business logic operation (rather than strict "atomic" database operations)
But those are just "rough" guidelines - it's always a design decision for each case, and I don't believe in strict rules; your mileage may vary, from case to case - so pick the one approach that works best for any given task at hand.
It helps not to have business logic code inside the database (stored procedures); it is much better to have it directly in the application, so it fits right into the architecture. SQL issued from the application can still contain your business logic, and there is nothing wrong with that. (There is nothing wrong with having data- or maintenance-related code in sprocs, though.)
If your business logic layer is not doing much and is just passing the data from SQL Server to the caller, maybe you don't need it at all.
The business logic layer is not there to do the heavy lifting; its purpose is to provide an abstraction, with entities in the language of the subject matter. The business layer can then provide a shared, consistent approach for any layers/applications that are required to work in that space.
To over-engineer things to make a point, the ultimate goal would be for the business logic layer to be used across an organisation by all applications working in that subject space, i.e. wrapped in a service, etc.
In the real world of small apps that do this or that, the business logic layer does sometimes feel like an appendix. The trick is to remember that any use cases should be implemented as tests against the business layer, giving you another way to think about how its public interface should look.
How the business logic layer gets its work done should be hidden from the parts of the application calling it.
Thus it is perfectly acceptable to calculate the data in the most efficient manner (i.e. SQL), so long as these calculations are given an appropriate representation in the business logic layer.

Dataset retrieving data from another dataset

I work with an application that is switching from file-based data storage to a database. It has a very large amount of code written specifically for the file-based system. To make the switch, I am implementing functionality that behaves like the old system; the plan is then to make more optimal use of the database in new code.
One problem is that the file-based system often read single records, and read them repeatedly for reports. This has turned into a lot of queries to the database, which is slow.
The idea I have been trying to flesh out is to use two datasets: one dataset to retrieve an entire table, and another dataset to query against the first, thereby decreasing communication overhead with the database server.
I've tried to look at the DataSource property of TADODataSet but the dataset still seems to require a connection, and it asks the database directly if Connection is assigned.
The reason I would prefer to get the result in another dataset, rather than navigating the first one, is that a good amount of logic for emulating the old system is already implemented. That logic is based on having a dataset containing only the results as queried with the old interface.
The functionality only has to support reading data, not writing it back.
How can I use one dataset to supply values for another dataset to select from?
I am using Delphi 2007 and MSSQL.
You can use a ClientDataSet/DataSetProvider pair to fetch data from an existing DataSet. You can use filters on the source dataset, filters on the ClientDataSet and provider events to trim the dataset only to the interesting records.
I've used this technique successfully in a couple of migration projects, and to mitigate a similar situation where an old SQL Server 7 database was queried thousands of times to retrieve individual records, with painful performance costs. Querying it only once and then fetching individual records from the client dataset was, at the time, not only an elegant solution but a great performance boost for that particular application: the best example was an 8-hour process reduced to 15 minutes... the poor users loved me back then.
A ClientDataSet is just a TDataSet you can seamlessly integrate into existing code and UI.
