Oracle Data Masking Clarification - database

Hi, I am looking at the possibility of implementing the Oracle Data Masking Package but require some clarification. I am looking to mask just my non-production data. In the process of doing this, does Oracle create clones of the database? I am hoping to mask the PII data without creating a clone or needing to create additional Oracle users. Does the Oracle solution meet these requirements? I am also being told that deploying ODM will require application changes. Can anyone elaborate on this? My apologies, I am extremely new to the DB world. Are there any other data masking solutions anyone can recommend?

Which version of Oracle are you using? Do you have an Advanced Security license?
You can make use of the Redaction feature, part of Advanced Security and available from 12.1. Oracle provides DBMS_REDACT to hide values after queries are executed, i.e. only when they are displayed on screen, and hence it doesn't impact the performance of any dependencies.
There are multiple options available, including full and partial redaction.
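As a rough illustration of what full versus partial redaction means for a displayed value, here is a conceptual sketch in plain Python. It is not the DBMS_REDACT API, and Oracle's actual output for each redaction type is defined in its documentation and may differ. The function names and the choice of keeping the last four characters are assumptions made for the example. The point is that the stored value is untouched and only the value shown to the caller changes.

```python
def redact_full(value: str, mask_char: str = "X") -> str:
    """Full redaction: every character of the displayed value is replaced."""
    return mask_char * len(value)


def redact_partial(value: str, keep_last: int = 4, mask_char: str = "X") -> str:
    """Partial redaction: mask everything except the trailing keep_last characters."""
    if keep_last <= 0:
        return redact_full(value, mask_char)
    visible = value[-keep_last:]
    return mask_char * (len(value) - len(visible)) + visible


stored_value = "123456789"            # hypothetical PII value; stays as-is in the table
print(redact_full(stored_value))      # what a session might see under full redaction
print(redact_partial(stored_value))   # same value with only the last four characters visible
```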
Let me know if you require any more details; I have recently implemented it in a production environment to protect PII data.
Official documentation

Related

Database Agnostic Application

The database for one of the applications that I am working on has not yet been confirmed by the business.
Best guess is Oracle and DB2.
What I've heard is that initially the project will go live with DB2 V9 and then move to Oracle 11g.
We are using Spring 3.0.5, Hibernate 3.5, JPA2 and JBoss 5 for this project.
So what are the best practices here going into the build phase and test phase?
Shall I build using DB2 first and worry about Oracle later (this doesn't sound right)?
Or, shall I write using JPA (Hibernate) and then generate the database schema?
Or something else?
PS: I've no control over the choice of the DB, what and when, as these are strategic decisions made by people sitting in nice rooms getting fat cheques and big bonuses.
Thanks,
Adi
Obviously you lose access to database-specific features if you write your application to be database agnostic. Apart from the automatic optimizations done by JPA and Hibernate, the database is reduced to its common features. Some things that you could set specifically if you knew the database (e.g. id generator strategies) have to be left on automatic, trusting JPA/Hibernate to get them right.
But it seems that the developer-facing features of the database are not relevant to the decision, so they can't be relevant to the application. What other reasons may influence the decision (like price, money, cash, personal relations, management tools, hardware requirements, existing knowledge and personnel) can only be speculated about.
So IMHO you don't have a choice. Strictly avoid anything database specific. That includes letting JPA/Hibernate generate the schema (your point #2). In this project setup you shouldn't tinker with the database manually.
Well... sadly there ARE some hidden traps in JPA/Hibernate development that make it database dependent (e.g. logarithmic functions are not mapped consistently). So you should run all your tests against all candidate databases from day one. As you write "Best guess is...", you should just grab any database available and test against it. That should be easy to set up with the given stack.
And you should try to accelerate the decision about the database used, if possible.
Just "write using JPA (Hibernate)" and develop it to be database agnostic. Put all your business logic in Java code, not stored procedures.
If you are using Spring you don't need JBoss; you could use just Tomcat, which has about a quarter of the footprint and is much simpler, IMHO.
On Spring vs JBoss: JBoss represents all that is bad, while Spring represents all that is good, in Java enterprise development.
We have had this issue and had to migrate late in the project, leading to a lot of extra work, frustration and delays.
My advice is to define an abstraction layer. Go to the point where you have a data model that works without any database, say with in-memory tables or text files.
Then when you have to switch to some database, you can optimize for it, while staying free to continue application development on any already developed model. That way you don't delay the developers working on the app while someone is tuning the DB2 layer. When everything is duly validated, the team can switch to it.
I will disagree with the currently accepted answer suggesting avoiding database specific things. From a performance perspective, that would be a pity, and it's definitely doable.
JPA/Hibernate and also jOOQ can abstract over a lot of things and if you're using the query builder APIs of either technology (criteria query in JPA, or jOOQ for more advanced SQL), you can get very far in a vendor agnostic way without removing all the vendor specific stuff. For example, you can easily create a vendor specific predicate like this:
.where(oracle ? oracleCondition() : db2Condition())
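The same idea can be shown outside jOOQ. The sketch below is plain Python, not jOOQ or JPA code: a hypothetical `limit_rows` helper picks a row-limiting clause per dialect (Oracle 11g has no `FETCH FIRST`, so `ROWNUM` is used there), while the rest of the query stays shared. The table and column names are made up for the example.

```python
def limit_rows(base_query: str, n: int, dialect: str) -> str:
    """Wrap base_query so it returns at most n rows on the given dialect."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if dialect == "oracle":
        # Oracle 11g: filter on ROWNUM around the (already ordered) subquery.
        return f"SELECT * FROM ({base_query}) WHERE ROWNUM <= {n}"
    if dialect == "db2":
        # DB2: FETCH FIRST clause appended to the query.
        return f"{base_query} FETCH FIRST {n} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect}")


query = "SELECT id, name FROM customer ORDER BY name"
for dialect in ("oracle", "db2"):
    print(dialect, "->", limit_rows(query, 10, dialect))
```

Keeping the branch inside one small function means the vendor-specific part is isolated and can be covered by a test per dialect.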
What you should do from the very beginning of such a project, once you know you'll have to support both dialects is to run integration tests on both database products. For this, I recommend testcontainers, which makes running such tests quite simple. If you have to add support for another dialect, and if you're using one of the above abstractions, you can simply add another testcontainers configuration, check if your application still works, tweak 2-3 things, and you're set.
Disclaimer: I work for the company behind jOOQ.

Building OLAP style applications with SalesForce/Apex

We are considering moving a planning and budgeting app to the Salesforce platform. The existing app is built on a dimensional data model, and has extensive ad-hoc query capability implemented through star joins.
We see how the platform will allow us to put together the data entry screens quickly, but the underlying data model and query languages do not seem suitable for our reporting requirements.
Is it possible to have fast and flexible reporting with this platform? If not, how cumbersome is it to extract the data on a regular basis to bring it into an analytical application?
Hmm - I guess I answer my own question? The relative silence on this (even with a bounty; who wants to have anything to do with something that is ignored on Stack Overflow?) is a kind of answer.
So - No, this platform is not well suited for applications that have any kind of ROLAP requirements. I guess shame on me for asking a silly question, but I welcome any responses...
Doing native, fast, OLAP-like queries: possible, but somewhat cumbersome since SFDC is basically a traditional-style RDBMS with somewhat limited joining capability within its native reporting. You can do OLAP-like things with custom code but it can get cumbersome if you are used to using established high-end OLAP solutions.
Extracting data from SFDC to use in other applications: really easy and supported across a number of technologies; the most common approaches are extracting CSV files or using the data web service. There are tools like the SFDC data loader which also let you extract/load data via command line or UI. That's probably what I would recommend to a client who has pre-existing expertise in a given analysis tool.
I would not attempt to build an OLAP data model in Salesforce. The limitations in both the joins and the roll-up of data from child to parent make it difficult to implement a star schema with aggregations.
There are some products such as IQ 20/20 that can integrate with Salesforce and provide near real time business intelligence functionality.
Analytical snapshots can also help as they provide a way to build aggregate tables. The snapshots pull data from a report and can be scheduled to run periodically. The different Salesforce editions offer different scheduling features, so it is best to check the limits for your edition before going too far into the design.

Umbraco Database Question- Adding custom tables

I'm working on a site managed by Umbraco. I need to store data about images and clients. I don't think there is any way I can store that data in the existing tables.
Is there any reason I shouldn't add the tables I'll need to the Umbraco database, rather than creating a separate DB? I like Umbraco so far but the documentation is a little thin and I haven't found any suggestions one way or the other.
TIA
I have built a site using Umbraco, with a separate application with a database of vehicles. I used the same database as Umbraco is using, and prefixed all my custom app tables with a few letters to distinguish them easily (e.g. vehicles_xxx).
I have had no problems with this arrangement, and don't believe there's much risk involved. Of course you'll need to take care when upgrading Umbraco (never upgrade in the live environment before fully testing, and preferably do it locally anyway); however, it's unlikely an upgrade script will ever alter or delete any tables that it does not know about.
There's heaps of documentation available for Umbraco now, much more than when I started. However, a question like this is always best for the forums. :)
all the best
greg
You might use the Umbraco API to store and retrieve your data and enjoy the ease of not having to worry about tables and much more. Or you can create your own tables. Do as Gregorius says: using the Umbraco DB is fine.
Your choice depends on:
do you have a lot of data?
do you have a large relation model?
If not - then go with Umbraco API
The rest of the answers you'll find on http://our.umbraco.org
/Jesper Ordrup

Advice on a DB that can be uploaded to a website by a smart client for collecting survey feedback

I'm hoping you can help.
I'm looking for a zero-config multi-user database that my WinForms application can easily upload to a web server folder (together with 1 or 2 classic ASP pages) and am looking for some suggestions/recommendations.
The idea is that the database will be used to collect feedback entered by people filling in the asp pages. The pages will write to the database using javascript.
The database will subsequently be downloaded again for processing once the responses are in.
In Summary:
It will mostly run in MS Windows environments.
I have a modest budget for this and do not mind paying for such a database.
No runtime licensing costs.
Should be xcopy - Once uploaded to a website folder it should be operational.
It should not have a dotnet CLR dependency.
It should support a reasonable level of concurrent access. Average respondent count would be around 20-30 but one never knows.
Should be a reasonable size so that uploads/downloads to and from the site will be reasonably fast.
Would appreciate your suggestions/comments
Many thanks
Abz
To clarify - this is a desktop commercial application for feedback management in a vertical market. It uses SQL Server as the backing store.
The application currently provides feedback management from email and paper feedback. I now want to add web feedback capability. Getting users to make their SQL Servers accessible to a website is not an option at this time, as I want to make getting up and running as painless as possible.
I intend to release a web based implementation of the software in the near future but for now am looking at the above as a pragmatic way to provide web based feedback collection.
SQLite comes to mind. It meets all of your stated requirements, is open source, and has a liberal license (public domain).
http://sqlite.org/
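A minimal sketch of the single-file idea, using Python's built-in `sqlite3` module purely for illustration (the question's pages are classic ASP, so the real access code would differ). The file name, table and columns are assumptions.

```python
import os
import sqlite3
import tempfile

# The whole database is one ordinary file that can be copied to and from a server.
db_path = os.path.join(tempfile.mkdtemp(), "feedback.db")


def save_response(path: str, respondent: str, comment: str) -> None:
    """Append one survey response, creating the table on first use."""
    conn = sqlite3.connect(path, timeout=5.0)  # wait up to 5 s if another writer holds the lock
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS response ("
                "id INTEGER PRIMARY KEY, respondent TEXT, comment TEXT)"
            )
            conn.execute(
                "INSERT INTO response (respondent, comment) VALUES (?, ?)",
                (respondent, comment),
            )
    finally:
        conn.close()


def load_responses(path: str) -> list:
    """Read all responses back, e.g. after downloading the file."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute(
            "SELECT respondent, comment FROM response ORDER BY id"
        ).fetchall()
    finally:
        conn.close()


save_response(db_path, "alice", "Very helpful")
save_response(db_path, "bob", "Too slow")
print(load_responses(db_path))
```

SQLite allows only one writer at a time, so the connection timeout matters when several respondents submit at once.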
I would use a 'normal' database (say MySQL, PostgreSQL, Firebird, etc.) on the server. Instead of copying files to the server, your WinForms application would create custom tables (or even custom databases). After collecting the data you could just get it back to your application using plain old SQL.
Why reinvent the wheel? If you want to collect feedback and the like from users of your app, and if they are connected to the internet, it might be a better idea (and in the long term cheaper) to use a service like Wufoo. We recently switched from a homegrown setup to Wufoo and are very pleased. Check it out.
Otherwise you might want to take a look at SQLite or Firebird. Both of them are very robust, and have ADO.NET providers. Firebird scales from a single user to a full-blown client-server system and has no .NET dependency.
If you really don't want a DB/SQL solution, you could try simple text files: FTP or xcopy the files down and parse them into the back-office server as needed. ASP/VBScript or ASP.NET can create the files to store the basic feedback comments. You need to consider security, of course!
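The text-file approach could look roughly like this. It is a Python sketch of the idea only (the answer suggests ASP/VBScript or ASP.NET for the writing side); the file name and the two-column layout are assumptions. Using a CSV writer rather than hand-built lines keeps commas, quotes and line breaks in comments from corrupting the file.

```python
import csv
import os
import tempfile

feedback_file = os.path.join(tempfile.mkdtemp(), "feedback.csv")


def append_feedback(path: str, respondent: str, comment: str) -> None:
    """Append one response as a CSV row; the csv module handles quoting."""
    with open(path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow([respondent, comment])


def parse_feedback(path: str) -> list:
    """Parse the downloaded file back into (respondent, comment) tuples."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [tuple(row) for row in csv.reader(handle)]


append_feedback(feedback_file, "alice", 'Great, but "search" is slow')
append_feedback(feedback_file, "bob", "Line one\nline two")
print(parse_feedback(feedback_file))
```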

Django database scalability

We have a new Django-powered project that is likely to see heavy traffic (meaning heavy DB interaction), so we need to consider database scalability in advance. After some research, the following questions are still not clear to us:
coarse-grained: how do we assign one DB table (a Django model) to a specific DB (possibly on another server)?
fine-grained: how do we assign a group of table rows to a specific DB (so-called sharding, possibly also on another DB server)?
how do we send writes and reads to different DBs (which will be helpful for future MySQL master/slave replication)?
We are looking for a solution that is:
transparent to the application code (meaning we don't need additional code in views.py)
at the ORM level (meaning it only needs to be specified in models.py)
compatible with the current (or future) Django release (to keep changes minimal when upgrading Django)
I'm still doing the research, and will share in this thread later if I find anything useful.
I hope anyone with experience can answer. Thanks.
Don't forget about caching either. Using memcached to relieve your DB of load is key to building a high performance site.
As alex said, django-core doesn't support your specific requests for those features, though they are definitely on the todo list.
If you don't do this in the application layer, you're basically asking for performance trouble. There aren't any really good open source automation layers for this sort of task, since it tends to break SQL axioms. If you're really concerned about it, you should be coding the entire application for it, not simply hoping that your ORM will take care of it.
There is the GSoC project by Alex Gaynor that will in future allow the use of multiple databases in one Django project. But right now there is no working cross-RDBMS solution.
There is no solution for this right now either.
And again, there is no cross-RDBMS solution. But if you are using MySQL you can try an excellent third-party Django application called mysql_replicated. It makes it easy to set up a master-slave replication scenario.
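For readers on later Django versions: multiple-database support with database routers was added to Django (from 1.2 onward) after these answers were written, and a router can express a read/write split like the one asked about. The sketch below is one such router. The aliases `default` and `replica` and the module path in the comment are assumptions; the class needs no Django imports because routers are plain classes that Django looks up through the `DATABASE_ROUTERS` setting.

```python
class PrimaryReplicaRouter:
    """Send reads to a replica and writes to the primary.

    Assumes settings.DATABASES defines the aliases "default" (primary)
    and "replica"; both names are placeholders for this sketch.
    """

    def db_for_read(self, model, **hints):
        return "replica"

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Both aliases hold the same data, so relations between objects are fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Apply schema changes on the primary only; replication copies them.
        return db == "default"


# In settings.py (the module path is hypothetical):
# DATABASE_ROUTERS = ["myproject.routers.PrimaryReplicaRouter"]

router = PrimaryReplicaRouter()
print(router.db_for_read(model=None), router.db_for_write(model=None))
```

Because the routing lives in one class referenced from settings, views and models do not need to change, which matches the transparency requirement in the question. Row-level sharding is not covered by this sketch.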
Here, for some reason, we are using Django with SQLAlchemy. Maybe a combination of Django and SQLAlchemy would also work for your needs.

Resources