a system design question

I was asked the following question while interviewing at a cloud computing company, and I did not answer it well. Any suggestions on how to analyze this question would be greatly appreciated.
Our company has hundreds of millions of users and we expect zero downtime in production. Explain the techniques and programming practices that help improve redundancy and fail-over capabilities for front-end, middle-tier and back-end services, including database services.

This question is very much along the lines of the "Impossible Question" from Joel. There is no right answer to this question.
I would start breaking this down into a list of all possible failure points:
Database Server
Database
Middle Tier
Middle Tier Server
Application
Web Server
Then, for each one, I would identify the reasons it could break and how to recover without downtime. Where I did not know the answer, I would say as much.
For example, let's build a list of reasons a database server goes down. Since we are looking for 100% uptime, we ignore nothing, no matter how far-fetched:
Hardware goes bad
Power goes down
Network card goes bad
Operating System unexpectedly crashes
O.S. Upgrades break system
Dumb System Admin or DBA
Dumb Janitor
Some possible solutions (assuming SQL Server on a Windows back end):
Lock on door
Database Mirroring (with regular failover testing)
Multiple NICs
Clustering (with regular failover testing)
Get better people
You can basically keep answering this question until the interviewer throws in the towel because there really isn't the One-Right-Answer to this question.
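As a rough illustration of the mirroring and clustering items above, here is a minimal Python sketch of client-side failover: try each replica in order and fall through to the next when one is down. The `Replica` class, its `query` method, and the host names are hypothetical stand-ins for illustration, not a real SQL Server or driver API.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot serve requests."""


class Replica:
    # Hypothetical stand-in for a database endpoint (not a real driver API).
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def query(self, sql):
        if not self.healthy:
            raise ReplicaDown(self.name)
        return f"{self.name} ran: {sql}"


def query_with_failover(replicas, sql):
    """Try replicas in order; return the first successful result."""
    failures = []
    for replica in replicas:
        try:
            return replica.query(sql)
        except ReplicaDown as exc:
            failures.append(str(exc))  # remember who failed, then move on
    raise RuntimeError(f"all replicas down: {failures}")


primary = Replica("db-primary", healthy=False)  # simulate a crashed primary
mirror = Replica("db-mirror")
result = query_with_failover([primary, mirror], "SELECT 1")
print(result)
```

Real failover usually happens below the application (cluster manager, mirroring witness, load balancer), but the same "try the next copy" logic applies, and it only helps if the failover path is tested regularly.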

That's a pretty broad question. If they expect zero downtime, tell them to forget about it or turn all of their profits over to building redundancy. Now, if they just want "five 9's, or 99.999% uptime" then we can talk. :)
You can usually answer these kinds of questions with the usual canned blather about building a sustainable, automated build environment that includes extensive unit testing. Using design patterns like MVC or similar can help with testability. Perform regular security audits. This is much bigger than just a development question; it is a question about network and server architecture, maintaining secondary and tertiary data centers, etc. These kinds of questions really give you a chance to make the interviewer feel important.
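To put numbers on "five 9's": the sketch below converts an availability target into allowed downtime per year, and shows how redundant copies raise availability. The second function assumes failures are independent, which is an optimistic simplification (shared power, network, or software bugs break it).

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability):
    """Allowed downtime per year for a given availability fraction."""
    return (1 - availability) * MINUTES_PER_YEAR


def combined_availability(availability, copies):
    """Availability of `copies` redundant units, assuming independent failures."""
    return 1 - (1 - availability) ** copies


for label, a in [("three nines", 0.999), ("five nines", 0.99999)]:
    print(f"{label}: {downtime_minutes_per_year(a):.2f} minutes/year")

# Two independent 99% servers, where either one can serve traffic.
print(f"{combined_availability(0.99, 2):.4f}")
```

Five nines leaves a budget of roughly five minutes of downtime per year, which is why a literal "zero downtime" requirement is better treated as a cost discussion.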

Related

IBM Notes database - Slow

I am currently working with IBM Notes and I have noticed that it is sometimes very slow. Our database runs on a server, and my question is: what if 100 users each have a Notes client in which they access and edit documents (which are in a database on that server) at the same time? Would that cause slowness because too many people are doing too many things on that server?

I got the answers from the comments on my question and posted them as an answer. Thank you @Torsten Link and @Richard Schwartz:
Comment from @Torsten Link:
This question is wrong here; it belongs on Server Fault or Super User. Just one thing: we have servers with 2,000 concurrent users, and they are not slow at all (and these are "small environments" for Domino servers). If the application is slow, then a) the application is very bad, or b) the server configuration is bad, or c) the client configuration is miserable.
My reply to the above comment:
@TorstenLink Thank you very much for taking the time to answer. May I ask how many servers you have, which is the most active one, and how big the Notes databases on that server are?
Answer from @TorstenLink:
I am a consultant and know everything from single-server environments to worldwide environments with 2,000 or more users per server... This question is too broad. The answer is "this is not normal", but a fix might involve a lot of analysis of the reasons for the slowness.
Comment from @Richard Schwartz:
Torsten is correct. Properly designed Domino applications perform well under loads from thousands of users as long as the hardware is adequate, and very modest hardware can easily handle 100 users. But if the application is poorly designed, or if the hardware is not up to the job, then of course it can be slow. An experienced Domino consultant would look at all aspects of the problem; there are far too many possible issues to consider and StackOverflow isn't designed for the type of detailed back-and-forth dialogue that would be required to help you narrow it down.

Basic Database Question?

I am interested in learning a little more about databases than I currently know. I know how to set up a database back end for any web app that I happen to be creating, but that is all. For example, if I were creating three different apps, I would simply create three different databases and then configure each database for its particular app. This is all simple knowledge, and I would now like to have a deeper understanding of how databases actually work.
Let's say that I developed an application that needed a lot of space and processing power. This database would then have to be spread over numerous machines. How exactly would a database be spread across numerous machines and still be able to write records and then retrieve them? Would each table get its own machine, and what software is needed to make sure that the different machines have all performed their transactions successfully?
As you can see, I am quite a database ignoramus, lol.
Any help in clearing this up would be greatly appreciated.

I don't know what RDBMS you're using but I have two book suggestions.
For theory (which should come first, in my opinion): Database in Depth: Relational Theory for Practitioners
For implementation: High Performance MySQL: Optimization, Backups, Replication, and More
I own both these books and they are both pretty great, especially the first one.

That's quite a broad topic... You might want to start with Multi-master replication, High-availability clustering and Massively parallel processing.
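To make "spread over numerous machines" concrete: one simple technique, alongside the topics named above, is hash partitioning (sharding), where each record's key decides which machine stores it. The sketch below is an illustration only, not how any particular RDBMS does it; the node names are hypothetical and each "machine" is just an in-memory dict.

```python
import hashlib

SHARDS = ["db-node-0", "db-node-1", "db-node-2"]  # hypothetical machine names


def shard_for(key, shards=SHARDS):
    """Map a record key to one shard using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# Each shard is modeled as an in-memory dict standing in for a real machine.
storage = {name: {} for name in SHARDS}


def write(key, value):
    storage[shard_for(key)][key] = value


def read(key):
    return storage[shard_for(key)].get(key)


write("user:42", {"name": "Ada"})
print(shard_for("user:42"), read("user:42"))
```

Because the same key always hashes to the same node, reads find what writes stored. Transactions that touch several shards are the hard part, and that is where coordination software (for example, a two-phase commit protocol) comes in.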

If you want to know how to keep databases running under ever-increasing load, then it's not a basic question. Several well-known web companies are struggling to find the right way to make their databases scalable.
Using memcached to cache database information is one way to decrease the load on your database if your application is read-intensive. If your application is write-intensive, then you may want to consider using a NoSQL datastore like MongoDB or Redis.
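The caching idea can be sketched as a cache-aside helper: look in the cache first and go to the database only on a miss. This is a minimal sketch in which a plain dict stands in for memcached and `load_from_db` is a hypothetical lookup function; it does not use a real memcached client.

```python
class CacheAside:
    """Read helper: check the cache first, fall back to the database."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # hypothetical slow lookup function
        self._cache = {}                   # dict standing in for memcached
        self.db_hits = 0                   # counts how often the database is used

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        self.db_hits += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Call after a write so the next read reloads fresh data."""
        self._cache.pop(key, None)


users = {1: "alice", 2: "bob"}  # pretend this is a database table
cache = CacheAside(users.get)
cache.get(1)
cache.get(1)  # second read is served from the cache
print(cache.db_hits)
```

The trade-off is staleness: every write path has to invalidate or update the cached entry, which is why this helps read-heavy workloads far more than write-heavy ones.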

Database Design for Mere Mortals
This is the best book about the subject if you don't have any experience with databases. It's got historical background and practical examples. Most books often skip the historical stuff because they assume you know what a db is, or it doesn't matter, and jump right to the practical. This book gives you the complete picture.

Why can application developers do database stuff but database developers try to stay clear of application stuff?

In my experience, this has been a contentious issue between "backend" (database developer) and "frontend" guys (application developer, client and server side).
There have been many heated pub discussions on this subject.
I just want to know whether it is simply that people have different mindsets, or that they are too lazy to learn more and feel comfortable with what they know, or something else.

I might re-phrase the question: why do (some) application developers think they can do "database stuff" without actually bothering to understand it properly? Whereas database developers do not (in general) assume they can write a good application without some training and experience!

It is about levels of abstraction. A database is the lowest level of abstraction in a typical business application (software-wise). It is much more likely that a developer working on an outer layer of the abstraction would have knowledge of an inner layer than a developer in an inner layer would know about the outer layer.
This is because inner layers of abstraction perform best when they are ignorant of the outer layers that depend on them.
So a designer in the presentation layer of a website may know a bit about the server-side code they depend on because they interact with it. But the developer working on the server does not need to know anything about design at all.

I would say it's on a need-to-know basis. Application developers often need to know how to connect to databases, add records, delete records, etc. This is taken further with newer technologies such as LINQ, where developers can write database queries within their actual code.
Database developers on the other hand only really need to know how to write database queries as that is their job and probably won't need to worry about the code at application level.

Because programmers very often must understand and interact with databases to do their job, but DBAs very often don't need to do any programming (outside of the DBMS) to do their jobs.

I believe it stems from the fact that programming in SQL looks easy, and you need only a small amount of knowledge to get started (for a programmer, learning SELECT * FROM Table is pretty easy). Application programming is not the same: it becomes very complex in a short amount of time, and that discourages a lot of people. I am not saying that database people are any less intelligent; it is just that what they do looks easier than building applications.

If you develop applications, then chances are that sooner or later you'll have to connect an app to a back end.
The opposite is not as true.

I think it stems from necessity. If you consider the roles of each person, a programmer needs to do database-related stuff far more than database workers need to do programming tasks.

From my experience, having developed both "databases" and "applications" (following your nomenclature...), I guess there's a big difference in state management.
Properly designed databases are always in a "clean" state, and every transaction keeps this consistency. So when developing a database, you have to very clearly specify your data abstractions into tables and which updates are legal and so on.
I've found that most application developers (myself included :)) do a very sloppy job in keeping this consistent state in the application. Any non-trivial interface has many more possible states to manage than a modest database, and it's not as easy to make sure it's always in a clean state. It's also harder to analyze every possible sequence of steps that users will perform.
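The "always in a clean state" property can be seen in a small example. This sketch uses Python's built-in `sqlite3` module with a made-up `account` table: a CHECK constraint declares which states are legal, and a transaction ensures a failed transfer leaves no partial update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically; a constraint violation undoes both updates."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 500)  # overdraw: the CHECK constraint rejects it
balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
print(ok, balances)
```

The credit to the destination account is applied first and then rolled back when the debit fails, so the table never exposes a half-finished transfer. Application state rarely has an equivalent safety net, which is the point made above.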

From my experience, application developers don't do all the database stuff. Consider all the administration related to the database: backups, replication, etc.
A typical DBA (at least on most of the projects I've been involved in) takes care of everything related to the project databases: handles all administration, cooperates with application developers on performance tuning, gives advice about the SQL used by the app, does some of the stored procedure coding, creates (or at least reviews and consults on) physical DB designs, etc.
So, aren't the database guys "lazy", or "fine with what they already know" just from an application developer's perspective? I'm an app developer myself and there is a whole lot of things that I just don't know about the DBs we're using on our projects.

Part of my education ensured I got a decent understanding of how Databases work. I went into the field expecting to do database work, and a lot of it. I'm a web app guy; it comes with the territory I guess.
My two jobs as a developer have been at two shops that would best be described as tiny (2 people myself included, and then just me) and tiny (3 developers, briefly having a fourth). I have not observed an immediate business need for, nor worked anywhere that had the resources to employ a dedicated DB guy. I can envision some scenarios where that would change (including a new job :P).
As to the rest, I agree that abstraction is also a factor, and as developers we're way up on top/outside looking in. I can't imagine doing web app development without DB skills, and I consider SQL/DB management to be both an important tool and an area I need to stay sharp in.
I'll add that I treat the database side as its own field. There's skills that translate between the two, but there's a lot of specialized knowledge I need to acquire to get better at it, and that being a good programmer doesn't necessarily mean I'm doing a good job on the back end either (fortunately, I'm not a good programmer ;) ). Also, I'm pretty sure that's what she said.

2 reasons:
DB vendors facilitate bad SQL, and
SQL is hidden from view while application UI is front and center.
Most naive developers think SQL is a procedural language and write it as such because vendors ensure that the tools exist so that they can do so. DBAs know that good SQL is set-oriented and has optimization principles that are totally different from those involved in application programming.
The visibility aspect makes it so the application developers can write bad SQL against a database and get it to perform in a marginal way, and no one ever sees quite how bad it is. When a DBA writes an application, there are immediate critiques on its appearance and behavior because it's directly visible to the end user.
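The difference between procedural and set-oriented SQL is easy to show. The sketch below uses Python's built-in `sqlite3` module and a made-up `product` table; both functions produce the same kind of change, but the first sends one statement per row while the second sends a single statement and lets the database engine do the iteration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, price INTEGER)")
conn.executemany(
    "INSERT INTO product (price) VALUES (?)", [(p,) for p in (100, 200, 300)]
)


def raise_prices_row_by_row(conn):
    """Procedural style: fetch every row, then issue one UPDATE per row."""
    rows = conn.execute("SELECT id, price FROM product").fetchall()
    for product_id, price in rows:
        conn.execute(
            "UPDATE product SET price = ? WHERE id = ?", (price + 10, product_id)
        )
    return len(rows)  # number of UPDATE statements sent


def raise_prices_set_based(conn):
    """Set-oriented style: one statement describes the whole change."""
    conn.execute("UPDATE product SET price = price + 10")
    return 1  # a single UPDATE statement, regardless of table size


statements_a = raise_prices_row_by_row(conn)
statements_b = raise_prices_set_based(conn)
prices = [p for (p,) in conn.execute("SELECT price FROM product ORDER BY id")]
print(statements_a, statements_b, prices)
```

On three rows nobody notices the difference, which is exactly the visibility problem described above; on millions of rows the row-by-row version pays a round trip per row.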

Good question. Developers do database work because, where there are no dedicated database people, developers have to do it. But a company that has database people also has developers.
:) What do you think?

Is anyone using the Service Broker in SQL Server? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
When I attended a presentation of SQL Server 2008 at Microsoft, they did a quick poll to see what features we were using. It turned out that in the entire lecture hall, my company was the only one using the Service Broker. This surprised me a lot, as I thought that more people would be using it.
My experience with SB is that it does its job well, but it is pretty tough to administer and it's hard to get an overview.
So, have you considered using the Service Broker? If not, why not? Did you go for MSMQ instead? Is there anything in SQL Server 2008 that would make you consider using the Service Broker?

I've been using SQL Service Broker since a couple of months after SQL 2005 was released. We use it non-stop here sending hundreds of thousands of messages through it per day.
We use it to load data from staging tables to production tables, so that the service that loads the staging table doesn't have to wait for the data to actually process; it can go back and get more data to load.
We use it to queue the deletion of files from the file system. (When the row is deleted the file needs to be deleted as well.)
At prior companies I've used it to print loan documents and the checks that were sent out to the customers.
I even used Service Broker to do ETL from an OLTP database to an OLAP database for real time reporting.
Most people (especially DBAs) don't like Service Broker because there isn't any UI for it. If you want to use Service Broker or see what it's doing, you have to actually write and run some T-SQL.

I have been using SB in 2005 for about two years now with one implementation handling several hundred thousand messages a day. I would say the biggest challenge has been not so much in the architecture but understanding all the nuances involved. The documentation from Microsoft is poor with very few practical examples. Remus Rusanu's blogs have really been helpful in doing things like dialog reuse and activation stored procedure tuning. I have found it's REALLY important to reuse dialogs as much as possible (and working through all the associated locking involved with that) as well as handling multiple received messages as a set rather than one at a time.
Monitoring SB can be a pain. You basically depend on a bunch of system views to tell you what's going on. Orphaned messages are a pain. There's just a lot of little gotchas that can, well, getcha.
Aside from the problems, and there aren't THAT many, I think it has really worked out better than I expected it to. Since SB is integrated into the database, there's no separate message queues to back up outside the database. It's all transactionally consistent. Performance is good. It's a great solution.
I would use it again and will continue to use it.
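The advice about handling received messages as a set rather than one at a time can be sketched in generic terms. The example below is plain Python with a `deque` standing in for a queue and a list standing in for the target table; it does not use Service Broker or T-SQL, and the function names are hypothetical.

```python
from collections import deque


def receive_batch(queue, max_messages):
    """Pop up to `max_messages` items at once instead of one per call."""
    batch = []
    while queue and len(batch) < max_messages:
        batch.append(queue.popleft())
    return batch


def process_all(queue, handle_batch, max_messages=100):
    """Drain the queue in batches; returns how many batches were handled."""
    batches = 0
    while queue:
        handle_batch(receive_batch(queue, max_messages))
        batches += 1
    return batches


staging = deque(range(250))  # 250 pending messages
loaded = []                  # list standing in for the production table
batches = process_all(staging, loaded.extend, max_messages=100)
print(batches, len(loaded))
```

Here 250 messages are handled in three calls to the handler instead of 250, which is where the per-message overhead (locking, transaction setup, round trips) is saved.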

At my current company, our usage of SB is somewhat different to that of the other posters. We use SB in SQL2005 mainly as a management tool. For example, we use it to manage updates to a small set of mutable tables that are present in a large number of otherwise immutable databases. All the messages are between services running on the same instance and the message volume is very low.
My experience with SB has been that it can be somewhat 'fiddly' to set up correctly and, as you mentioned in your question, it is hard to get an overview of the state of SB because there is no single monitoring tool.
Nevertheless, we have found it hugely valuable as a way to automate a lot of database management tasks in a traceable and reliable way.

I have recently considered using Service Broker for a project, but yes, decided to go for MSMQ instead.
Our architecture consisted of a number of (clustered) servers, each needing to write information into a single instance of SQL reliably.
As I understand it, SB only works for SQL to SQL communication, so we would have needed an instance of SQL on each clustered box. We felt this was a bit unnecessary, hence using MSMQ.
To be honest, I can't think of a scenario where I would use SB. I'm interested in knowing a bit more about your scenario, to see if I'm missing something vital.

Service Broker can be used in various cases where automation is required in a distributed architecture.
Examples include applications that receive events from various devices or sensors (detections) and need them processed reliably to drive automation logic, and the exchange of data between multiple databases or applications.
I expect such implementations can be more secure and reliable with SB.

To what extent should a developer learn specifics about database systems? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
Modern database systems today come with loads of features. And you would agree with me that to learn one database you must unlearn the concepts you learned in another database. For example, each database would implement locking differently than others. So to carry the concepts of one database to another would be a recipe for failure. And there could be other examples where two databases would perform very very differently.
So, when developing database-driven systems, do programmers need to know the database in detail so that they can code for performance? I don't think it would be appropriate to call in the DBA for performance later, as his job is only to maintain the database and help out the developer in an emergency, not on a regular basis.
What do you think should be the extent the developer needs to gain an insight into the database?

I think these are the most important things (from most important to least, IMO):
SQL (obviously) - It helps to know how to at least do basic queries, aggregates (sum(), etc), and inner joins
Normalization - DB design skills are a major requirement
Locking Model/MVCC - It's nice to have at least a basic grasp of how your databases manage row locking (or use MVCC to accomplish similar goals with optimistic locking)
ACID compliance, Txns - Please know how these work and interact
Indexing - While I don't think that you need to be an expert in tablespaces, placing data on separate drives for optimal performance, and other minutiae, it does help to have a high-level knowledge of how index scans work vs. table scans. It also helps to be able to read a query plan and understand why it might be choosing one over the other.
Basic Tools - You'll probably find yourself wanting to copy production data to a test environment at some point, so knowing the basics of how to restore/backup your database will be important.
Fortunately, there are some great FOSS and free commercial databases out there today that can be used to learn quite a bit about db fundamentals.
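As a small hands-on example of the indexing point, the sketch below uses SQLite (via Python's built-in `sqlite3` module) to print the query plan for the same query before and after adding an index. The `orders` table and index name are made up for illustration, and the exact plan wording varies between SQLite versions and differs from other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"cust{i % 50}", i) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer = ?"


def plan(conn, sql, params):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


before = plan(conn, QUERY, ("cust7",))  # no index yet: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
after = plan(conn, QUERY, ("cust7",))   # the planner can now use the index
print(before)
print(after)
```

The first plan reports a scan of the whole table, while the second reports a search using `idx_orders_customer`. Most databases offer an equivalent command for inspecting plans, so the habit transfers even though the output format does not.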

I think a developer should have a fairly good grasp of how their database system works, no matter which one it is. When making design and architecture decisions, they need to understand the possible implications when it comes to the database.

Personally, I think you should know how databases work as well as the relational model and the rhetoric behind it, including all forms of normalization (even though I rarely see a need to go beyond third normal form). The core concepts of the relational model do not change from relational database to relational database - implementation may, but so what?
Developers that don't understand the rationale behind database normalization, indexes, etc. are going to suffer if they ever work on a non-trivial project.

I think it really depends on your job. If you are a developer in a large company with dedicated DBAs then maybe you don't need to know much, but if you are in a small company then it may be really helpful knowing more about databases. In small companies you may wear more than one hat.
It cannot hurt to know more in any situation.

It certainly can't hurt to be familiar with relational database theory, and have a good working knowledge of the standard SQL syntax, as well as knowing what stored procedures, triggers, views, and indexes are. Obviously it's not terribly important to learn the database-specific extensions to SQL (T-SQL, PL/SQL, etc) until you start working with that database.
I think it's important to have a basic understanding of databases when developing database applications, just as it's important to have an understanding of the hardware your software runs on. You don't have to be an expert, but you shouldn't be totally ignorant of anything your software interacts with.
That said, you probably shouldn't need to do much SQL as an application developer. Most of the interaction with the database should be done through stored procedures developed by the DBA; I'm not a big fan of including SQL code in your application code. If your queries are in stored procedures, then the DBA can change the implementation of the stored procedure, or even the database schema, and so long as the result is the same it doesn't require any changes to your application code.

If you are uncertain about how to best access the database you should be using tried and tested solutions like the application blocks from Microsoft - http://msdn.microsoft.com/en-us/library/cc309504.aspx. They can also prove helpful to you by examining how that code is implemented.

A basic knowledge of SQL queries is a must; with that you can develop a simple system. But when you are going to implement complex systems, you should know normalization, procedures, functions, etc.
