I'm working on a SaaS which has a database for each account, all with basically the same tables. What's the best way to index all the databases separately? I was thinking about setting up a separate Solr instance (on a different port) for each database on the same server, but that could be hard on the server. So I'm stuck on what to do next, and I haven't found any useful idea in the Solr documentation. Could you guys help out? Thanks in advance.
If you store all the data from all of your tenants in one collection, it will be easy in the beginning, because you will probably make several changes to your schema, and it is easier to make them once for all your customers.
The negative point in this scenario is that you will have lots of unrelated data grouped together, and every query has to carry a filter query on the tenant (client) id.
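To make that concrete, here is a minimal sketch (Python with the requests library) of what every query ends up looking like in the shared-collection scenario. The collection name accounts and the field tenant_id are assumptions for illustration, not something Solr gives you by default.

```python
# Minimal sketch: querying one shared collection with a mandatory per-tenant
# filter query. "accounts" and "tenant_id" are illustrative names.
import requests

SOLR_SELECT_URL = "http://localhost:8983/solr/accounts/select"

def search_for_tenant(tenant_id, user_query):
    params = {
        "q": user_query,
        "fq": f"tenant_id:{tenant_id}",  # every single query must carry this filter
        "wt": "json",
    }
    resp = requests.get(SOLR_SELECT_URL, params=params)
    resp.raise_for_status()
    return resp.json()["response"]["docs"]
```

Forgetting that fq even once means leaking one tenant's data to another, which is the main operational risk of this approach.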
What if you create, for starters, a collection for each tenant on the same Solr server? This way you don't mix your tenants' data, and you achieve the functionality you basically need.
In this scenario, as with your relational database instances, you have to keep the schema changes in sync.
For relational databases there are tools like Flyway or Liquibase that can be used to version the changes applied to each tenant database.
For Solr there are, as far as I know, no such tools, but you can apply your schema changes programmatically through the Solr Schema API. If you have to make more involved changes that can't be done via the Schema API, you can replace the schema.xml file of each collection with an updated version and restart the Solr server.
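As a rough illustration of the Schema API route, here is a hedged Python sketch that pushes the same add-field command to every tenant collection. It assumes the collections use a managed schema (the Schema API does not modify a classic, hand-edited schema.xml), and the collection names and the customer_segment field are made up.

```python
# Hedged sketch: applying the same schema change to every tenant collection
# through the Solr Schema API. Collection names and the new field are made up.
import requests

SOLR_BASE_URL = "http://localhost:8983/solr"
TENANT_COLLECTIONS = ["tenant_a", "tenant_b", "tenant_c"]

ADD_FIELD_COMMAND = {
    "add-field": {
        "name": "customer_segment",   # hypothetical new field
        "type": "string",
        "indexed": True,
        "stored": True,
    }
}

for collection in TENANT_COLLECTIONS:
    resp = requests.post(f"{SOLR_BASE_URL}/{collection}/schema", json=ADD_FIELD_COMMAND)
    resp.raise_for_status()   # stop if any tenant's schema update fails
    print(f"Updated schema of {collection}")
```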
What you need to keep in mind is backward compatibility. Whenever you apply changes to any of the databases (relational DB or Solr), you need to take into account that the old code must still work with the latest updates you perform on the relational database / Solr schema structure.
We have a requirement where we will have to move data between different database instances on a regular basis (for example, some customers are willing to pay more for better performance). So this is not going to be a one-off.
The database tables have referential integrity. Is there a way this can be done without rewriting a SQL script (or using some other method) every time we migrate a customer's data?
I came across this question: How to move data between multiple database's table while maintaining foreign-key relationships/referential integrity?. However, it appears that we have to write a script every time we migrate data (please correct me if I misunderstood the answer in that thread).
Thanks
Edit:
Both servers are running SQL Server 2012 (the same version). It's an Azure SQL database.
They are not necessarily linked (no firewall between them)
We are only transferring some data, not the whole database. This is only for certain customers who opted to pay more.
The schemas are exactly the same in both databases.
Preyash - please see the documentation on the Split-Merge tool. The Split-Merge tool enables you to move data between databases, as you have described, based on a sharding key (e.g., customer ID). One modification that you will need for your application is to add a shard map (i.e., a database that understands the global state of which customers reside in which databases).
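Just to illustrate the shard-map idea (this is only a hand-rolled sketch, not the Split-Merge tool or the Elastic Database client library): a small catalog table records which server and database each customer lives in, and the application looks it up before connecting. All table, column, and server names below are invented.

```python
# Hand-rolled shard-map sketch: a catalog table maps each customer to the
# database that holds their data. All names are illustrative.
import pyodbc

def _conn_str(server, database):
    # Driver and authentication details are placeholders; adjust as needed.
    return (
        "Driver={ODBC Driver 17 for SQL Server};"
        f"Server={server};Database={database};Trusted_Connection=yes;"
    )

def connect_for_customer(customer_id, catalog_server="catalog-server", catalog_db="ShardCatalog"):
    catalog = pyodbc.connect(_conn_str(catalog_server, catalog_db))
    try:
        cur = catalog.cursor()
        cur.execute(
            "SELECT ServerName, DatabaseName FROM dbo.ShardMap WHERE CustomerId = ?",
            customer_id,
        )
        row = cur.fetchone()
    finally:
        catalog.close()
    if row is None:
        raise LookupError(f"No shard registered for customer {customer_id}")
    return pyodbc.connect(_conn_str(row.ServerName, row.DatabaseName))
```

Moving a customer then becomes: copy their rows to the target database and flip their row in the shard map, which is essentially what the Split-Merge service automates for you.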
Have a look at Azure Data Sync. It is much more aligned with your requirements, though you may end up having another Azure SQL DB to maintain as a hub. Azure Data Sync follows a hub-and-spoke pattern and lets you set up flexible directional syncs with a sync gap of a few minutes. It is simpler and can be set up very quickly, without any scripts, as you wanted.
If you have another application that uses the data of an existing database and needs to store some more, and you don't want to change the schema of the existing database, how do you do that?
Background of my question: we use an IBM product (Connections) to store user profiles. But we have lots of custom requirements (lots of custom fields and logic), so currently we create a few extra tables, views, and functions in the backend database of Connections to store the custom data. However, since it is IBM's internal database and we are not supposed to touch it, when we upgrade Connections, all our custom tables, views, and functions are gone.
So we decided to move our custom objects out. But the problem is we still need to join with the data from Connections (or, if not a database join, some other way to integrate the data before presenting it to the users).
If we create a federated table in our own database, we can create tables and views like we used to. But would it have performance issues? And we are still going to be heavily dependent on IBM's schema and have to assume they don't change it. Is it a good approach?
What are the other options we could consider?
If we create a federated table in our own database, we can create tables and views like we used to. But would it have performance issues?
Probably. Your application code would have to do joins between the IBM database tables and your database tables.
I'm assuming that Connections uses DB2. If you bring up your own DB2 database, I think you can do SQL joins between two separate DB2 databases.
Either way, this code should reside in a separate data access package made up of data access objects. The rest of your applications would use the data access package.
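For example, here is a hedged Python (pyodbc) sketch of such a data access object, doing the "join" in application code; every table and column name below is a placeholder, not the real Connections schema.

```python
# Hedged sketch of a data access object that merges a Connections profile row
# with our own custom columns in application code. All table and column names
# are placeholders.
import pyodbc

class ProfileDao:
    def __init__(self, connections_conn, custom_conn):
        self._ibm = connections_conn   # read-only connection to the IBM database
        self._own = custom_conn        # connection to our own custom database

    def load_profile(self, user_id):
        cur = self._ibm.cursor()
        cur.execute(
            "SELECT uid, display_name, mail FROM profiles.employee WHERE uid = ?",
            user_id,
        )
        base = cur.fetchone()
        if base is None:
            return None

        cur = self._own.cursor()
        cur.execute(
            "SELECT cost_center, badge_color FROM custom.profile_extension WHERE uid = ?",
            user_id,
        )
        extra = cur.fetchone()

        # The "join" happens here, in application code, not in SQL.
        return {
            "uid": base.uid,
            "name": base.display_name,
            "mail": base.mail,
            "cost_center": extra.cost_center if extra else None,
            "badge_color": extra.badge_color if extra else None,
        }
```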
And we are still going to be heavily dependent on IBM's schema and have to assume they don't change it.
IBM will change their schema, and you have to plan on making corresponding changes to your database and / or application.
What are the other options we could consider?
You could copy the IBM data from their database to your database. You still have to make changes to the copy process when the IBM schema table definitions change.
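A hedged sketch of that copy approach, again with invented table and column names; it does a naive full refresh, and it is exactly this piece you would have to revisit whenever IBM changes their schema.

```python
# Hedged sketch: periodically pull the rows we care about out of the
# Connections database into a local copy we fully control. Table and column
# names are placeholders; this is a naive wipe-and-reload.
import pyodbc

def refresh_profile_copy(ibm_conn, own_conn):
    src = ibm_conn.cursor()
    src.execute("SELECT uid, display_name, mail FROM profiles.employee")
    rows = [tuple(r) for r in src.fetchall()]

    dst = own_conn.cursor()
    dst.execute("DELETE FROM custom.profile_copy")   # wipe and reload
    if rows:
        dst.executemany(
            "INSERT INTO custom.profile_copy (uid, display_name, mail) VALUES (?, ?, ?)",
            rows,
        )
    own_conn.commit()
```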
Can I store any custom tables in SharePoint's own database?
Is this supported behavior or not?
(I mean tables in the MS SQL database, not SharePoint lists.)
If I can, how well does this play with backup/restore functionality?
What are possible caveats?
For anyone wondering why I'm asking: there's an app that is bound to a SharePoint server and needs to store some purely relational internal information that doesn't make sense apart from that SharePoint instance. I would like to narrow data storage down to one place, but I'm not sure whether SharePoint likes its database being used for other purposes.
I'm using SharePoint 2007.
Is it possible? Sure. Should you? Nope.
The SharePoint content/configuration databases are subject to change with any update Microsoft releases; any changes you make will very likely be destroyed, and if your farm depends on them, it will be left non-functional.
If you want to store purely relational data in a set of tables, just create another database. There's nothing stopping you from using the same SQL Server instance that houses your SharePoint content and/or configuration databases to store other relational databases as well.
Not a good idea: Support for changes to the databases used by Windows SharePoint Services
...
Making any modification to the database schema
Adding tables to any of the databases
...
If an unsupported database modification is discovered during a support call, the customer must perform one of the following procedures at a minimum:
Perform a database restoration from the last known good backup that did not include the database modifications
Roll back all the database modifications
It is even worse than the above. It is likely that future upgrades will notice your changes to the content database schema and refuse to upgrade the database, period.
Our IT manager is asking for my help in deciding where it would be best to store the data: in SharePoint or in SQL Server.
On my side, I don't know much about storing data on a SharePoint server: how it works, how fast and how secure it is, etc. I even doubt whether SharePoint is capable of handling a complex database design. As far as I know, SharePoint is not a database server, which is why I have these doubts.
So obviously SQL Server would be my preferred storage, also because I have known SQL Server for a long time already: consider my 3 weeks of exposure to SharePoint vs. 7 years with SQL Server. I don't have enough experience to judge the strengths of SharePoint and decide what to do, so to be fair to SharePoint I would like to ask those of you who are more experienced with it.
My questions:
1.) Does SharePoint have the ability to store data?
2.) If SharePoint can store data, what are the pros and cons?
3.) Can it handle a complex design, such as a relational database design, like SQL Server does?
4.) If you were to develop a SharePoint project, would you choose SQL Server as the backend?
Thanks in advance!
It obviously depends on the application and its complexity, who the client or audience is, and how you want to deploy it.
Here are my answers to your questions:
1. Yes
2. Pros:
It provides a UI for updating data.
Cons:
Creating relational structures will be complicated.
Think custom lookup lists, associated with other custom lists.
3. Yes, but I wouldn't try it.
4. SQL Server, but this depends on the project and isn't an entirely technical decision.
Personally, I think given your skillset, you should use SQL Server, if your manager has said it's up to you.
SharePoint itself is built on top of SQL Server and ASP.NET.
Yes. You can create a custom list (basically similar to a table structure), and you can store documents along with their metadata. You can store web pages if you are using it as your publishing (CMS) platform.
It's not supposed to be a relational engine like SQL Server. Pros: versioning, workflow, and in most cases a UI is there to support data input/editing. Cons: the limitations of the UI with large amounts of data.
To some degree you can relate one list to a field in a different list or to document metadata.
See what I said before point 1.
SharePoint offers its own database layer built on top of SQL Server.
A complex object model is provided, and no SQL-language API is available.
Access is by API, REST, and UI list web parts with views; NOT SQL, and the database is not accessible except through these interfaces.
Deep inside, data is stored in Entity-Attribute-Value tuples (specifically: site, web, list, item, state, field, value), such that each value goes into its own record. This is strictly non-tabular.
Maintains a dynamic, end-user-populated metadata dictionary.
As a non-relational layer above a DB, it offers inheritance, multi-type lists, hierarchies, taxonomies, versioning, check-in/check-out, and other advanced features missing from a relational model.
Documents may be attached to a list.
Extensive use of GUIDs for identifiers, which causes problems when moving partially related data between systems.
No referential integrity.
No joining of database tables or lists.
Filtering is more limited than in SQL.
No concept of a schema.
Parts of SharePoint break when restoring from a backup or when published to a separate site.
Rolling new features and data from development to production is problematic and sometimes breaks.
Hope this helps.
SharePoint is obviously not a database server, but in some ways it works like one.
1.) Yes.
2.) You can, but not with designs as complicated as SQL Server allows.
Pros: It's the interface that gives SharePoint the edge; the UI gives the user a friendlier way of inputting data.
Cons: Just like I said, a complicated database design is not easy to do.
3.) 100% yes.
4.) I would prefer SharePoint if the application doesn't need a complex data design. Definitely SQL Server for an enterprise type of application.
I am developing a multi-tenant app. I chose the "Shared Database/Separate Schemas" approach.
My idea is to have a default schema (dbo) and, when deploying changes to this schema, to apply the same updates to the tenants' schemas (tenantA, tenantB, tenantC); in other words, to keep the schemas synchronized.
How can I synchronize the schemas of tenants with the default schema?
I am using SQL Server 2008.
The first thing you will need is a table or other mechanism to store the version information of the schema, if nothing else so that you can bind your application and schema together. There is nothing more painful than running a version of the application against the wrong schema: it fails, corrupts data, etc.
The application should refuse to run or shut down if the schema is not the right version. You might get some pushback when it's not right, but it protects you from the really bad day when the database corrupts the valuable data.
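A minimal sketch of that startup check, assuming a dbo.SchemaVersion table that the upgrade scripts maintain; the table name and the Python/pyodbc plumbing are my assumptions, not part of the original setup.

```python
# Minimal sketch of the startup check: compare the schema version recorded in
# the database with the version this build expects and refuse to run otherwise.
import pyodbc

EXPECTED_SCHEMA_VERSION = 42   # baked into this build of the application

def assert_schema_version(conn):
    cur = conn.cursor()
    cur.execute("SELECT MAX(VersionNumber) FROM dbo.SchemaVersion")
    actual = cur.fetchone()[0]
    if actual != EXPECTED_SCHEMA_VERSION:
        raise RuntimeError(
            f"Database schema is at version {actual}, but this build expects "
            f"{EXPECTED_SCHEMA_VERSION}; refusing to start."
        )
```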
You'll also need a way to track changes, such as Subversion or something else; from SQL Server you can export the initial schema. From there you will need a mechanism to track changes, using a nice tool like SQL Compare, and match each set of schema changes to an update of the version number in the target database.
We keep each delta in a separate folder beneath the upgrade utility we built. This utility signs onto the server, reads the version info, and then applies the transform scripts, starting from the next version after the one in the database, until it can find no more upgrade scripts in its subfolders. This gives us the ability to upgrade a database, no matter how old it is, to the current version. If there are data transforms unique to a tenant, those are going to get tricky.
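Roughly, the upgrade loop looks like the following hedged Python sketch; the deltas/0043-style folder layout, the dbo.SchemaVersion table, and the one-batch-per-file assumption (no GO separators) are all illustrative, not the exact utility described above.

```python
# Hedged sketch of the upgrade loop: keep applying numbered delta folders
# (deltas/0043/*.sql, deltas/0044/*.sql, ...) until there is no folder for the
# next version, bumping the recorded schema version as we go.
import glob
import os
import pyodbc

def upgrade_to_latest(conn, deltas_root="deltas"):
    cur = conn.cursor()
    cur.execute("SELECT MAX(VersionNumber) FROM dbo.SchemaVersion")
    current = cur.fetchone()[0]

    while True:
        next_version = current + 1
        folder = os.path.join(deltas_root, f"{next_version:04d}")
        scripts = sorted(glob.glob(os.path.join(folder, "*.sql")))
        if not scripts:
            break   # no more upgrade scripts: the database is current

        for path in scripts:
            with open(path, encoding="utf-8") as f:
                cur.execute(f.read())   # assumes each file is a single batch

        cur.execute("INSERT INTO dbo.SchemaVersion (VersionNumber) VALUES (?)", next_version)
        conn.commit()
        current = next_version
```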
Of course you should always make a backup of the database to an external file, preferably with a human-identifiable version number in the name, so you can find it and restore it when the script(s) go bad. And eventually they will, so just plan on figuring out how to recover and restore.
I saw there is some sort of schema upgrader tool in the new VS 2010 but I haven't used it. That might also be useful to you.
There is no magic command to synchronize the schemas as far as I know. You would need to use a tool, either built in-house or bought (check out Red Gate's SQL Compare and SQL Examiner; you will need to tweak them to compare different schemas).
Just synchronizing can often be tricky business, though. If you added a column, do you also need to fill that column with data? If you split a column into two new columns, there has to be conversion code for something like that.
My suggestion would be to very carefully track any scripts that you run against the dbo schema and make sure that they also get run against the other schemas when appropriate. You can then use a tool like SQL Compare as an occasional sanity check to look for any unexpected differences.
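For instance, one low-tech way to make sure a change script hits dbo and every tenant schema is to template the schema name into the script and loop over the schemas. This is a hedged sketch; the schema list, table, and column are made up, and in practice the list would come from wherever you register tenants.

```python
# Hedged sketch: run the same templated change script against dbo and every
# tenant schema. Schema names, table, and column are illustrative.
import pyodbc

TENANT_SCHEMAS = ["dbo", "tenantA", "tenantB", "tenantC"]

CHANGE_SCRIPT = "ALTER TABLE [{schema}].[Customer] ADD LoyaltyTier INT NULL;"

def apply_to_all_schemas(conn, script_template=CHANGE_SCRIPT):
    cur = conn.cursor()
    for schema in TENANT_SCHEMAS:
        cur.execute(script_template.format(schema=schema))
    conn.commit()
```

SQL Compare then stays what the answer above suggests: an occasional sanity check that the per-schema copies haven't drifted apart.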