User Story - Database design

I need to build a product that will have a database on the back end to store and retrieve data.
I just started gathering the user stories from my stakeholders and I am stuck...
If I have a Project Leader who has one user story like:
"As a project leader, I want to be able to see and modify the scope of my project so that I make sure my project is up to date"
This user story would require that I had already created the database and a table holding the relevant data before the story could be implemented.
Should I collect all the user stories first and put the database work in the acceptance criteria?
Should I create some user stories only for the back end and others for the front end?
I'm not sure how to separate or make them work together.

The idea behind Scrum is that the architecture/design will emerge as you develop. With this in mind, you still need the product backlog to reflect what the product will be. So somewhere in the backlog there should be a user story like... "As a user, I want an application that I can use to manage my projects". That story is at a rather large (epic) level, so it has to be broken out into smaller stories (like "... the application must have ability x").

If that is indeed the user story, then another sub-epic (still large, needs breaking out) story would be... "As an application developer (notice the context change here), I need a database to store my project application data". Then that story gets broken out for the person writing db scripts (assuming you are creating the application database first; some applications are code-first and the ORM generates the database schema).

The main point here is that you start large and break it down until you have a full backlog of very small stories. Then you know you have a full (groomed) backlog and you are ready to start planning your sprints.

Related

User or User Profile model for app-wide relationships

I recently read a tweet that suggested that if one wants to avoid headaches in the future of an app, the user table should contain only authentication information, with a user profile table for everything else. That is, if you have bikes and peaches in the system, they should be linked to the user that owns them via the user profile id. The tweet was not clear on what the consequences of not doing this would be. Are there maintainability/scalability repercussions to not following this, especially in a large web app?
Well, don't take it as dogma, though it isn't completely worthless. Dependency is the problem: if you have a lot of different data representing a particular user, you'll change the underlying database often. If everything is stored in a single table, you might find yourself doing the repetitive monkey work of "making it work" with your types/ORM and whatever else is involved in the DB <-> runtime communication.
It is all about splitting a complicated task into smaller, less complex subtasks: auth is a self-standing task (one of the most important) and it definitely deserves its own dedicated space. However, your app might not be that big, or not that concerned with users, and then it won't be very helpful to split the data into multiple tables. You must develop a sense of purpose and measure when it comes to software design.
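To make the split concrete, here is a minimal sketch in C#, assuming an EF Core-style model (the class and property names are illustrative, not from the tweet): the User entity carries only authentication data, and everything else, including ownership of domain entities, hangs off UserProfile.

using System.Collections.Generic;

// Authentication data only: credentials, nothing else.
public class User
{
    public int Id { get; set; }
    public string Email { get; set; }
    public string PasswordHash { get; set; }

    public UserProfile Profile { get; set; }
}

// Everything else about the person lives here.
public class UserProfile
{
    public int Id { get; set; }
    public int UserId { get; set; }               // FK back to the auth record
    public string DisplayName { get; set; }

    // Domain entities (bikes, peaches, ...) point at UserProfileId, not UserId.
    public List<Bike> Bikes { get; set; } = new();
}

public class Bike
{
    public int Id { get; set; }
    public int UserProfileId { get; set; }
    public string Model { get; set; }
}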

How to structure/coordinate multiple databases?

Imagine a large corp with dozens of companies, each with its own website, and each website will have its own unique functional requirements
Most data on each website will be specific to that website
Each website can edit its own data
Some data will be shared across all websites
There will be a central CMS that is allowed to edit this data, but other websites can read and use that data
e.g. say you're planning the infrastructure for a company that owns multiple sub-companies that make different kinds of products, some in the same category (cereal, food), others in completely different categories (books, instruments). Some are marketing websites, some are for CRM, some are online stores
there is a list of regulatory requirements that affect all products
each company should manage the status of compliance of its own products to each requirement
when a new requirement surfaces, details regarding that requirement should only be entered once
How would the multiple databases be coordinated?
edit: added more info per Bob's suggestions
Thanks for the incredibly insightful questions!
compliance data is not shared; it stays siloed within each site
shared data is only on the one enterprise-wide database, they will mostly be "types of [thing]"
no conclusive list of instances where they'll be used but currently it'd be to populate CMS dropdowns for individual sites.
changes to shared data would occur a few times a year.
Ideally changes would be reflected within a few minutes, but an hour or so should be acceptable
very low volume in shared data.
All DBs will be new; the decision on which DBMS to use is pending current investigation.
Sub-systems will expose REST APIs
Here are some ways I have seen this handled; you need to think about the implications of each structure based on the details of your particular business domain. All of them can work, but all have to be carefully set up if they are going to work.
One database for shared information and one per client for client-specific information. Set up the overall application so that the first thing you specify at login is the client, and the application then connects to the correct client database (see the sketch below). You might also need a way to switch clients if some users handle more than one.
Separate servers for each client if they need to be completely siloed. Database changes are made by script (kept in source control) and are applied to each server as needed. The changes to the central database might then have a job that runs to push any data changes out to the other servers.
All the data in one database, but making sure each table has a client_id so that the data is always filtered correctly by client. You can set up separate views by client, so that the users can only see the clients they are supposed to see. This only works if the data for each client is substantially in the same form.
And since you are in a regulatory environment, I strongly urge you to create, for each database, an audit database that is updated by database triggers (never audit from the application; you will lose changes to the data).
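As a rough sketch of the first option, assuming SQL Server and .NET (the client names, connection strings, and helper below are made up; in practice the mapping would live in the shared database or in configuration):

using System.Collections.Generic;
using Microsoft.Data.SqlClient;

public static class ClientConnections
{
    // Hypothetical mapping from the client chosen at login to that client's database.
    private static readonly Dictionary<string, string> ConnectionStrings = new()
    {
        ["clientA"] = "Server=db1;Database=ClientA;Integrated Security=true;",
        ["clientB"] = "Server=db1;Database=ClientB;Integrated Security=true;",
    };

    public static SqlConnection OpenForClient(string clientId)
    {
        if (!ConnectionStrings.TryGetValue(clientId, out var cs))
            throw new KeyNotFoundException($"Unknown client '{clientId}'");

        var connection = new SqlConnection(cs);   // every query after login runs against this client's db
        connection.Open();
        return connection;
    }
}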
I agree with Chris that, even after both sets of questions, there is still a big set of possible solutions. For instance, if the databases were the same technology, and the shared data were stored in the same way in each one, you could do db-level replication from the central db to the others. Is it OK to have two separate dbs per application (one for shared data and one for non-shared)? That would influence the kind of replication.
Or you could have a purely code solution, where clicking publish in a GUI that updates the central db also calls a set of APIs that update the other dbs. Or micro-services: updating the central db also puts a message on a shared queue, which is picked up by services that each look after a different db and apply the update in whatever form makes sense for that db.
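A small sketch of the queue variant, with the broker hidden behind a hypothetical interface (none of these names are from the question; in practice this could sit on RabbitMQ, Azure Service Bus, or similar):

// Publish an event after the central db write has been committed; each site runs a
// small subscriber that applies the change to its own database in its own shape.
public interface IMessageQueue
{
    void Publish(string topic, string payload);   // hypothetical abstraction over the real broker
}

public class SharedDataPublisher
{
    private readonly IMessageQueue _queue;

    public SharedDataPublisher(IMessageQueue queue) => _queue = queue;

    public void PublishRequirementAdded(int requirementId, string title)
    {
        var payload = $"{{\"requirementId\":{requirementId},\"title\":\"{title}\"}}";
        _queue.Publish("shared-data.requirement-added", payload);
    }
}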
It depends on (among the things already mentioned) what your organisation's technology strategy is, what technology and skills you already have in-house, and so on.
So this is as much an architecture question as it is a db question.
I don't think this question is sufficiently clear to get a single answer. However, there are a few possibilities.
In many cases, where you have shared data you want a single point of ownership for that information. It could be in a database, in an Excel file (which can then be turned into CSV and periodically loaded into all the dbs), or some other form. The specifics depend on what exactly is shared.
Now in this case it sounds like you are going to have some sort of legal department in charge of the shared information, and they will manage that data, which will then be shared with the other sites. This might be done with an application they manage which aggregates information from the other companies, or it could be data which is pushed to their systems.
A final point:
Software is at its best when it facilitates human solutions to human problems, not when it tries to solve those problems directly. In these cases, you probably want a good human solution in place and then to look at what software can do to support that. A lot of the issues (who owns the information?) will already have been solved and you will be simply automating what is already done.

How can I create a custom Box application to move content directly into an MS-Access database?

I would like to automate content delivery from box.com, directly into an MS-Access 2013 database app. Is it possible to do this? What ingredients are needed to accomplish this objective?
EDIT: A typical scenario is that I have several law firms that handle personal injury cases. Their clients' doctors, insurance companies, etc. put content (mainly PDF documents) into the Box upload folders so that I can distribute it to the law firm that represents each client. I have an Access 2013/SQL Server application which invoices each law firm for the service I provide, but I have to manually retrieve the documents from Box and forward them to each law firm. Most of the law firms have a case management application, also written in Access 2007. I would like to automate the retrieval and forwarding of the documents, along with my billing, to each law firm, using my Access app as an intermediary.
Based on the information you provided in follow up questions, I understand the following to be true:
All files of interest are contained in a Box account that you own.
Each client's documents are in a separate folder.
Further, I'm going to stipulate that all of the Access/SQL Server-related file processing stuff is beyond the scope of this question, or is at least too broad to be addressed here in whole.
That said, you don't really need to use the Box API at all. Something like this might be easier:
Download and install Box Sync, which will make your Box files easily available on your work computer.
From the Box web interface choose the folders that you'd like to keep in sync.
Point your application to the client files in your Sync folder (e.g. C:\Users\Frank\Box Sync\<Client>\Inbox) for processing.
Forward the results via email or, better, write them back to the appropriate sync folder (e.g. C:\Users\Frank\Box Sync\<Client>\Outbox).
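If part of that processing ends up outside Access itself, a small .NET sketch of step 3 might look like this (the folder layout comes from the steps above; the processing hook is just a placeholder):

using System.IO;

class BoxSyncProcessor
{
    static void Main()
    {
        var syncRoot = @"C:\Users\Frank\Box Sync";   // local Box Sync root from the steps above

        foreach (var clientDir in Directory.EnumerateDirectories(syncRoot))
        {
            var inbox = Path.Combine(clientDir, "Inbox");
            var outbox = Path.Combine(clientDir, "Outbox");
            if (!Directory.Exists(inbox)) continue;
            Directory.CreateDirectory(outbox);

            foreach (var pdf in Directory.EnumerateFiles(inbox, "*.pdf"))
            {
                // ... record the document in the Access/SQL Server billing app here ...
                File.Move(pdf, Path.Combine(outbox, Path.GetFileName(pdf)));
            }
        }
    }
}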

How to minimize application downtime when updating the database and application ORM

We currently run an ecommerce solution for a leisure and travel company. Every time we have a release, we must bring the ecommerce site down as we update the database schema and the data access code. We are using a custom-built ORM where each data entity is responsible for its own CRUD operations. This is accomplished by dynamically generating the SQL based on attributes in the data entity.
For example, the data entity for an address would be...
[TableName("address")]
public class Address : DataEntity
{
    [Column("address1")]
    public string Address1;

    [Column("city")]
    public string City;
}
So, if we add a new column to the database, we must update the schema of the database and also update the data entity.
As you can expect, the business people are not too happy about this outage as it puts a crimp in their cash-flow. The operations people are not happy as they have to deal with a high-pressure time when database and applications are upgraded. The programmers are upset as they are constantly getting in trouble for the legacy system that they inherited.
Do any of you smart people out there have some suggestions?
The first answer is obviously, don't use an ORM. Only application programmers think they're good. Learn SQL like everyone else :)
OK, so back to reality. What's to stop you restricting all schema changes to additions only? Then you can update the DB schema any time you like, and hold off installing the recompiled application until a safe time (6am works best, I find) after the DB is updated. If you must remove things, perform the steps the other way round: install the new app leaving the schema unchanged, and then remove the bits from the schema.
You're always going to have a high-pressure time as you roll out changes, but at least you can manage it better by doing it in two easier-to-understand pieces. Your DBAs will be OK with updating the schema for the existing application.
The downside is that you have to be a lot more organised, but that's not a bad thing when dealing with production servers; you should be seriously organised about that already.
Supporting this scenario will add significant complexity to your environment and/or process and/or application.
You can run a complex update process where your application code is smart enough to run correctly on both the old schema and the new schema at the same time. Then you can update the application first and the schema second. A third step may be to migrate any data, which again, the application has to be able to work with. In that case, you only need to "tombstone" the application for the time it takes to upgrade the application, which could just be seconds, depending on how many files and machines are involved in the upgrade.
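To make "runs correctly on both the old and the new schema" concrete, here is one small sketch, assuming SQL Server purely for illustration (the postcode column is an invented example of a newly added column): probe the schema once and build the query accordingly.

using Microsoft.Data.SqlClient;

public static class AddressReader
{
    // Include the new column only if this database already has it, so the same build
    // of the application works before and after the schema change is applied.
    public static string BuildSelect(SqlConnection connection)
    {
        const string probe = @"SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS
                               WHERE TABLE_NAME = 'address' AND COLUMN_NAME = 'postcode'";

        using var cmd = new SqlCommand(probe, connection);
        bool hasPostcode = (int)cmd.ExecuteScalar() > 0;

        return hasPostcode
            ? "SELECT address1, city, postcode FROM address"
            : "SELECT address1, city FROM address";
    }
}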
In most cases, it's best to leave the application/environment/process simple and live with the downtime during a slow time of the day/week/month. Pretty much all applications need to be "taken down" from time to time for "regularly scheduled maintenance".

Subscription website architecture questions + SQL Server & .NET

I have a few questions about the architecture of a subscription service I am about to embark on and I am looking for some feedback on how best to set it up.
I won't have as large a number of customers as Basecamp, maybe a few hundred, and I was wondering what would be a solid architecture for setting up the customer sites. I'm running SQL Server and .NET on a dedicated machine. Should I create a new database for each customer, to have control and isolation of data, or keep them all in one database?
I am also thinking of creating a sub-domain for each customer so modifications can be made to each site as needed. The customer URLs would look like this:
https://customer1.foobar.com
https://customer2.foobar.com
I am going to have the ability to 'plug in' reports that will be uploaded to the site so each customer can customize as needed. Off the top of my head this necessitates having each sub-domain on its own code base for the uploading of these reports.
So, on the main site, the customer would sign up for their new subscription; I would programmatically create a new directory for the customer from the main code base, then create a sub-domain pointing to that new directory, and finally create their database.
Does this sound about right? Am I on the right track? How do other such sites accomplish the same thing?
Thanks for letting me bend your ear for a bit on this.
From a maintenance perspective, having a virtual directory for each customer scares me. Having done something similar, I would create separate domain pointers as you are intimating. Then you can check the request's host header to see what should be displayed. I would probably create one main site template and dynamically brand it for each customer. You can still create separate folders for customer-specific reports, or for custom pages unique to that customer if you really need them. I just wouldn't make each customer their own site.
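A small sketch of picking the customer from the host header in classic ASP.NET (the resolver and the fallback are illustrative only):

using System;
using System.Web;   // classic ASP.NET; the same idea works in ASP.NET Core via HttpContext.Request.Host

public static class CustomerResolver
{
    public static string FromRequest(HttpRequestBase request)
    {
        var host = request.Url.Host;            // e.g. "customer1.foobar.com"
        var subdomain = host.Split('.')[0];     // "customer1"

        // In a real app, validate against the customer table and fall back to default branding.
        return subdomain.Equals("www", StringComparison.OrdinalIgnoreCase)
            ? "default"
            : subdomain;
    }
}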
The advantage of separate sites (including databases) is that the fate of one client isn't bound to all the others. It'd be easier to upgrade (trial) a sub-set before deploying to everyone else. The big issue here, as Scot points out, is time. You'd want to have things as automated as possible (and well tested), etc. It's also easy when a client leaves: you can just back up their database and send it to them (for example).
Auto-provisioning new sites and databases isn't easy, and the account that does that will need plenty of privileges - so your security testing will need to be better than usual.
A multi-tenancy approach is good for minimizing your time, but you do have to be careful: you don't want customers' data getting mixed up.
One approach that will work, within the one app (and database), is to make use of HttpHandlers (or the MVC framework, perhaps) so that some sort of client identifier is part of the URL, but the folder doesn't have to physically exist (or virtually, in the IIS sense). That way you don't have to worry about getting folder permissions correct; but you do have to be careful about correctly identifying clients and their ids, and making sure clients can't make calls that use an id that isn't theirs.
https://www.foobar.com/[clientid]/subscriptions
The advantage of this is that it's relatively straightforward: everything is in the application, and you don't have to worry about adding new DNS records, setting directory and/or database permissions, etc.
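For the URL-based variant, a small ASP.NET MVC sketch (the route, controller, and ownership check are illustrative; attribute routing has to be enabled, and a real app would check against its own tenant store rather than session state):

using System.Web.Mvc;   // ASP.NET MVC 5; requires routes.MapMvcAttributeRoutes() in RouteConfig

public class SubscriptionsController : Controller
{
    // Matches https://www.foobar.com/{clientid}/subscriptions
    [Route("{clientId}/subscriptions")]
    public ActionResult Index(string clientId)
    {
        // Reject any clientId the logged-in user isn't entitled to.
        var allowedClient = (string)Session["ClientId"];
        if (!string.Equals(allowedClient, clientId, System.StringComparison.OrdinalIgnoreCase))
            return new HttpStatusCodeResult(403);

        // ... load subscriptions for clientId from that client's data ...
        return View();
    }
}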
