Liferay ID generator across the database

I have been given the task of migrating users from a customer's database to a Liferay portal.
I have already managed to find all the places I need to fill with data to make a user functional (USER_, USERS_GROUPS, CONTACT_, LAYOUTSET, EXPANDOVALUE).
The only problem I have faced is the IDs. Liferay doesn't use a sequence to generate them (at least I haven't found one), but appears to generate them from the code. What's even more concerning, it looks like all the IDs (UserID, GroupId, RowID, etc.) need to be unique not only within a single table, but across the whole database.
I need to find a way to get the last ID used in the database, and a way to set the last ID used by my script, so that Liferay doesn't use it again.
I don't have access to an application server, just the database; that's why I can't use the API...

First of all I would like to ask: why do you have no access to the application server? Changing things in the database is like repairing a modern car without tools and a manual. It is possible to get everything right - but it is just as possible to screw everything up if you forget something that the API would normally take care of.
That having been said:
The counter ID is saved in the COUNTER table, in the row with the name com.liferay.counter.model.Counter. It is incremented by the value of the property counter.increment (usually 100). Check the class CounterFinderImpl to see how Liferay uses it.
Ensure that the server is stopped before modifying anything in the database, as Liferay caches many things, especially the current counter value.
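For reference, here is a hedged SQL sketch of the two operations involved, assuming a standard Liferay schema where the COUNTER table has name and currentId columns (verify the exact table and column names against your own database before running anything):

-- Read the current high-water mark of Liferay's ID generator
SELECT currentId
FROM COUNTER
WHERE name = 'com.liferay.counter.model.Counter';

-- After the migration, push the counter past the highest ID your script inserted,
-- leaving headroom for the cached block size (counter.increment, usually 100)
UPDATE COUNTER
SET currentId = 2000100   -- hypothetical value: highest ID you used + counter.increment
WHERE name = 'com.liferay.counter.model.Counter';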

Related

Export Outlook Emails Into SQL (Via Access?)

I have an email folder in Outlook that contains hundreds of emails which record my discussions with a developer of some bespoke software. I want to import these into SQL to create a knowledge base of information that can be searched to extract all the decisions we have made during the course of the two-year project.
Having searched the net, I found that it is very easy to dump the contents of an email folder into Access using the import data functionality. In fact, I have linked the table and so believe (never used Access before!!) that I now have an Access table that is connected in 'real-time' to the Outlook folder. This is exactly what I want, BUT in SQL, as that is something I am very familiar with.
So I have tried to import the Access database into SQL (which also appears to be relatively easy) but keep getting the message that 'The source database ...contains no visible tables or views'. Checking SQL permissions, I am the owner of this new database.
Two questions, please. First, I can't believe that going through Access is the simplest way to do this, and I presume that I will lose the 'real-time' link - am I right? Second, given that I can see that my Access database has a visible table, why am I getting the error?
The easiest and quickest way is to create a VBA macro with which you can populate your SQL database from Outlook emails. You can build the table structure according to your needs and extract the required information from Outlook using VBA. I'd suggest processing emails in chunks using the Find/FindNext or Restrict methods of the Items class, so you will not reach the reference counter limit. You can find the MailItem properties described on MSDN.
BTW, the internal store (if you use cached mode) in Outlook acts like a database. So, why do you need to introduce yet another database?
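To make the first suggestion concrete, here is a hedged sketch of what the target table might look like in SQL Server; the table and column names (ProjectEmails, EntryId, etc.) are purely illustrative and should be adapted to whatever you actually extract from each MailItem:

-- Hypothetical target table for the extracted emails (SQL Server syntax)
CREATE TABLE ProjectEmails (
    EmailId      int IDENTITY(1,1) NOT NULL CONSTRAINT PK_ProjectEmails PRIMARY KEY,
    ReceivedTime datetime      NOT NULL,
    SenderName   nvarchar(255) NULL,
    Subject      nvarchar(500) NULL,
    Body         nvarchar(max) NULL,
    EntryId      nvarchar(255) NULL   -- Outlook EntryID, handy for skipping duplicates on re-import
);

-- To make the knowledge base searchable, a full-text index on Subject/Body is the usual route
-- (requires a full-text catalog on the database):
-- CREATE FULLTEXT INDEX ON ProjectEmails (Subject, Body) KEY INDEX PK_ProjectEmails;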

Does hbm2ddl.auto=update not honor different DB users, maybe?

We are encountering strange hibernate behavior with hbm2ddl.auto set to update.
In our test setup we have two database users: one contains the tables for our beta application, the other is mainly used for development, i.e. the same table names under different users.
When new tables are to be created, we do so by using hbm2ddl.auto=update.
Now, suddenly, the strange behavior is this: the update process looks for existing tables under the wrong user and creates the ones not found there under the right user.
E.g. if the following tables exist
USER_A.TABLE_1
USER_B.TABLE_2
and we update with three tables configured: TABLE_1, TABLE_2, TABLE_3 using USER_B, we end up with
USER_A.TABLE_1
USER_B.TABLE_2
USER_B.TABLE_3
TABLE_1 is not created for USER_B. After renaming USER_A.TABLE_1 to USER_A.TABLE_0 and updating again we end up with the expected result:
USER_A.TABLE_0
USER_B.TABLE_1
USER_B.TABLE_2
USER_B.TABLE_3
Does this make any sense to anyone? Is there something like an internal Hibernate cache remembering something like "Hey, I have already created this table on this server (and I do not care about the user)"?
We have spent quite some time testing to make sure this is not a configuration problem: we reproduced it on different machines and with different configurations, from Ant or using the IDE, making sure USER_A's password cannot be found anywhere in the build directory, etc. So we are 100% sure the behavior is as described - but we are completely out of ideas as to what is happening.
I'd be very happy to hear your ideas about this, since this problem has been nagging us for some time now.
Thanks a lot,
Peter
Is there something like an internal Hibernate cache remembering something like "Hey, I have already created this table on this server (and I do not care about the user)"?
No. What is probably happening is that USER_A can see the tables created under the USER_B account, and vice versa. It's not clear which database you are using, but I would try to configure Hibernate to use two different schemas, in addition to using different users. You may also want to try setting the property "hibernate.default_schema", but I'm not sure that alone will solve your problem.
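If the database happens to be Oracle (where each user is effectively its own schema), a hedged sketch of how to check what USER_B can actually see, and who owns it, using the example table names from the question:

-- Run while connected as USER_B: which of these tables are visible, and under which owner?
SELECT owner, table_name
FROM all_tables
WHERE table_name IN ('TABLE_1', 'TABLE_2', 'TABLE_3');

-- Synonyms pointing at another user's tables are a common reason a table is "found" unexpectedly
SELECT synonym_name, table_owner, table_name
FROM all_synonyms
WHERE table_name IN ('TABLE_1', 'TABLE_2', 'TABLE_3');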

Web-App: Keeping track of the version of the application in the database?

We are building a webapp which is shipped to several clients as a Debian package. Each client runs their own server, but updates and support are done by us.
We make regular releases of the product, with a clean version number. Most of the users get an automatic update (via Puppet), some others don't.
We want to keep track of the version of the application (in order to allow the user to check the version in an "about" section, and for our support to help the user more accurately).
We plan to store the version of the code and the version of the database in our database, and to keep the info up to date automatically.
Is that a good idea?
The other alternative we see is a file.
EDIT: The code and database schema are updated together (if we update to version x.y.z, both code and database go to x.y.z).
Using a table to track every change to a schema, as described in this post, is a good practice that I'd definitely suggest following.
For the application, if it is shipped independently of the database (which is not clear to me), I'd embed a file in the package (and thus not use the database to store the version of the web application).
If not, and thus both the application and the database versions are maintained in sync, then I'd just use the information stored in the database.
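A minimal sketch of such a tracking table (MySQL/PostgreSQL flavoured; the names are illustrative, so adapt them to your conventions):

-- One row per applied release; the most recent row is the current version
CREATE TABLE schema_version (
    version     varchar(20)  NOT NULL PRIMARY KEY,   -- e.g. 'x.y.z'
    description varchar(255) NULL,
    applied_at  timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- The "about" section (or support) can then read the current version with:
-- SELECT version FROM schema_version ORDER BY applied_at DESC LIMIT 1;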
As a general rule, I would have both the DB version and the application version. The problem here is how "private" the database is. If the database is "private" to the application, and the user never modifies the schema, then your initial solution is fine. In my experience, databases which accumulate several years of data stop being private; that means users add a table or two and access the data using some reporting tool, and from that point on the database is no longer used exclusively by the application.
UPDATE
One more thing to consider is users (or the application) not being able to connect to the DB and calling for support. For this case it would be better to have the version, etc., stored on the file system.
Assuming there are no compelling reasons to go with one approach or the other, I think I'd go with keeping them in the database.
I'd put them in both places. Then, when running your "about" function, you quickly check that they are both the same, and if they aren't, you can display extra information about the version mismatch. If they're the same, then you only need to display one of them.
I've generally found users can do "clever" things like reverting databases back to old versions by manually copying directories around "because they can", so dealing with it defensively is always a good idea.

How do you make your model's database portable in CakePHP?

I'm not very familiar with Cake, so here are my questions: we're developing an app on MySQL, but it may eventually need to be deployed to MSSQL or Oracle. How do we make sure that we won't have strange problems with our primary keys? In MySQL they are AUTO_INCREMENT columns, but IIRC in Oracle you need to use sequences... is there a way to make this a transparent change? Am I overthinking it?
Does anyone have experience with switching database vendors on a CakePHP app? Any pointers or things to keep an eye out for?
In the Cake database config file you choose your driver (see http://book.cakephp.org/view/40/Database-Configuration). Then, if you set your PK (which will also be your auto-increment column if using MySQL) with the field name id, Cake will automatically handle the auto_increment insertion. I would presume (NB: haven't tried Cake with anything else) that Cake will take care of auto-increment columns in something like Oracle.
Cake uses its own DB abstraction layer, and the included abstractions cover quite a bit; it will perform as specified (i.e. it'll handle your auto-increment stuff for you).
In short, you're probably overthinking it. That said, I would mock up a little Cake app, then try switching databases behind it (change your db config and your app should automagically switch over).
HTH,
Travis
The following practices work great for me:
I use Cake schemas (I tend to set up one schema file for each group of models, e.g. User, Role, and Profile might all be in one UsersSchema file).
Also take a look at the debuggable.com FixturesShell - it allows you to import test case fixtures into the live database. Great for setting up that initial group of users and roles from the schema file.
Also, if you set your 'id' field to VARCHAR(36) instead of INT(#), Cake will automatically use UUID-style IDs. This means you have a far, far lower chance of ID collisions if you need to move the data to another application or server.
The fixtures shell also has a command-line tool for generating UUIDs (so you can add them to your $records variable in the fixture for insertion, etc.).
In summary: use the CakeSchema schema shell, the fixtures shell from debuggable.com, and UUID values for your IDs, and you get a portable structure-creation tool, a portable data-insertion tool, and a portable ID field format.
http://github.com/felixge/debuggable-scraps/tree/fd0e5ad625cb21f5ba16e6b186821a5774089ac7/cakephp/shells/fixtures
http://api.cakephp.org/class/schema-shell
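For the UUID point, a hedged sketch of what the more portable table definition looks like in plain SQL (MySQL syntax here, and the column names are just illustrative; the fixed-length string key is what lets Cake generate the IDs itself instead of relying on AUTO_INCREMENT or a sequence):

-- Vendor-neutral primary key: Cake fills in the UUID, so no auto-increment mechanism is needed
CREATE TABLE users (
    id       CHAR(36)     NOT NULL PRIMARY KEY,  -- UUID string, e.g. '550e8400-e29b-41d4-a716-446655440000'
    username VARCHAR(255) NOT NULL,
    created  DATETIME     NULL,
    modified DATETIME     NULL
);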
You need to be using "cake schema" to manage your DB. This will handle all the DB-specific stuff when you create your database.
http://book.cakephp.org/view/735/Generating-and-using-Schema-files

Re-Running Database Development Scripts

In our current database development environment we have automated build processes that check all the SQL code out of SVN, create database scripts, and apply them to the various development/QA databases.
This is all well and good, and is a tremendous improvement over what we did in the past, but we have a problem with re-running scripts. Obviously this isn't a problem with some scripts, like altering procedures, because you can run them over and over without adversely affecting the system. Right now, to add metadata and run statements like create/alter table statements, we add code to check and see if the objects exist, and if they do, we don't run them.
Our problem is that we really only get one shot to run the script, because once the script has been run, the objects are in the environment and the system won't run the script again. If something needs to change once it's been deployed, we have a difficult process of running update scripts against the update scripts and hoping that everything falls in the correct order and all of the PKs line up between the environments (the databases are, shall we say, "special").
Short of dropping the database and starting the process from scratch (from the last, most current release), does anyone have a more elegant solution to this?
I'm not sure how best to approach the problem in your specific environment, but I'd suggest reading up on Rails' migrations feature for some inspiration on how to get started.
http://wiki.rubyonrails.org/rails/pages/UnderstandingMigrations
We address this - or at least a similar problem to this - as follows:
The schema has a version number - this is represented by a table which has one row per version and which, as well as the version number, carries boring things like a date/time stamp for when that version came into existence.
The schema create/modify DDL is wrapped in code that performs the changes for us.
In the context above, one would build the schema change code as part of the build process, then run it, and it would only apply schema changes that haven't already been applied.
In our experience (which is bound not to be representative), in most cases the schema changes are sufficiently small/fast that they can safely be run in a transaction, which means that if it fails we get a rollback and the db is "safe" - although one would always recommend taking backups before applying schema updates if practicable.
I evolved this out of nasty, painful experience. It's not a perfect system (or an original idea), but as a result of working this way we have a high degree of confidence that if there are two instances of one of our databases with the same version, then the schema for those two databases will be the same in almost all respects, and that we can safely bring any db up to the current schema for that application without ill effects. (That last isn't 100% true, unfortunately - there's always an exception - but it's not too far from the truth!)
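A hedged T-SQL-style sketch of the version-gated pattern described above (the table name, columns, and the version number 42 are all illustrative):

-- Version bookkeeping: one row per schema version that has been applied
if object_id('schema_version') is null
    create table schema_version (
        version_number int      not null primary key,
        applied_at     datetime not null default getdate()
    )
go

-- Each change script is guarded by the version it introduces, so re-running it is a no-op;
-- wrapping it in a transaction means a failure rolls the whole change back
begin transaction

if not exists (select 1 from schema_version where version_number = 42)
begin
    -- the actual DDL for version 42 goes here, e.g. an alter table statement
    insert into schema_version (version_number) values (42)
end

commit transaction
go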
Do you keep your existing data in the database? If not, you may want to look at something similar to what Matt mentioned for .NET called RikMigrations
http://www.rikware.com/RikMigrations.html
I use that on my projects to update my database on the fly, while keeping track of revisions. Also, it makes it very simple to move database schema to different servers, etc.
If you want re-runnability in your scripts, then you can't have them as definitions... what I mean by this is that you need to focus on change scripts rather than "here is my table script".
let's say you have a table Customers:
create table Customers (
    id int identity(1,1) primary key,
    first_name varchar(255) not null,
    last_name varchar(255) not null
)
and later you want to add a status column. Don't modify your original table script; that one has already run (and can have the if(! exists) syntax to prevent it from causing errors when run again).
Instead, have a new script, called add_customer_status.sql
in this script you'll have something like:
alter table Customers
    add status varchar(50) null
go

-- seed the new column before tightening it to not null
update Customers set status = 'Silver' where status is null
go

alter table Customers
    alter column status varchar(50) not null
go
Again, you can wrap this in an if(! exists) block to allow re-running, but here we've leveraged the notion that this is a change script and adapted the database accordingly. If there is already data in the Customers table, we're still okay, since we add the column, seed it with data, then add the not null constraint.
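For completeness, a hedged T-SQL sketch of what that existence check might look like for the column-adding step (the names match the example above; the if/begin syntax is SQL Server specific, though INFORMATION_SCHEMA itself is fairly portable):

-- Re-runnable version of the first step: only add the column if it isn't there yet
if not exists (
    select 1 from information_schema.columns
    where table_name = 'Customers' and column_name = 'status'
)
begin
    alter table Customers
        add status varchar(50) null
end
go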
Both of the migration frameworks mentioned above are good; I've also had excellent experience with MigratorDotNet.
Scott named a couple of other SQL tools that address the problem of change management. But I'm still rolling my own.
I would like to second this question, and add my puzzlement that there is still no free, community-based tool for this problem. Obviously, scripts are not a satisfactory way to maintain a database schema; neither are instances. So, why don't we keep metadata in a separate (and while we're at it, platform-neutral) format?
That's what I'm doing now. My master database schema is a version-controlled XML file, created initially from a simple web service. A simple JavaScript program compares instances against it, and a simple XSL transform yields the CREATE or ALTER statements. It has limits, like RikMigrations; for instance, it doesn't always sequence interdependent objects correctly. (But guess what: neither does Microsoft's SQL Server Database Publication tool.) Really, it's too simple. I simply didn't include objects (roles, users, etc.) that I wasn't using.
So, my view is that this problem is indeed inadequately addressed, and that sooner or later we'll have to get together and tackle the devilish details.
We went the 'drop and recreate the schema' route. We had some classes in our JUnit test package which parameterized the scripts to create all the objects in the schema for the developer executing the code. This allowed all the developers to share one test database and everyone could simultaneously create/test/drop their test tables without conflicts.
Did it take a long time to run? Yes. At first we used the setup method for this, which meant the tables were dropped/created for every test, and that took way too long. Then we created a TestSuite which could be run once before all the tests for a class and then cleaned up when all the class tests were complete. This still meant that the db setup ran many times when we ran our 'AllTests' class, which included all the tests in all our packages. I solved this by adding a semaphore to the OracleTestSuite code: when the first test requested the database to be set up, it would do so, but any subsequent call would just increment a counter. As each tearDown() method was called, it would decrement the counter, and when it reached 0 the OracleTestSuite code would drop everything. One issue this leaves is whether the tests assume that the database is empty. It can be convenient to let database tests know the order in which they run, so they can take advantage of the state of the database, because it can reduce the duplication of DB setup.
We used the concept of ObjectMothers to solve a similar problem with creating complex domain objects for testing purposes. Mock objects might be a better answer but we hadn't heard about them at the time. After all this time, I'd recommend creating test helper methods that could create standardized datasets for the typical scenarios. Plus that would help document the important edge cases from a data perspective.
