Load a million objects from the database into working memory

I am using the Drools engine, but only from the implementation side: the framework is set up for me, and I can only work on the rules side (I hope I am explaining myself clearly).
That said, my problem is that I am trying to load about 1 million rows from an Oracle DB into the working memory (WM), and this task takes too long. Here is the rule I use to load the objects:
(BTW - loading the million records into the WM is mandatory, since I need to use these DB objects in my rules along with other objects that are injected into the engine at runtime.)
rule "Load_CMMObject"
salience 10
no-loop
when
    not CMMObjectShouldBeRefreshed() over window:time( 1500m )
then
    log("Fetching Load_CMMObject");
    ExecutionContext ctx = getExecutionContext();
    String getObjectTypeSQL = "select AID, BC_OBJECT_ID, ALR_EQP_NAME, ALR_FROM_SITE from CMM_DB.COR_EQP_VW";
    PreparedStatement pStmt = null;
    try {
        pStmt = MTServiceAccessor.getDBAccessService().prepareNativeStatement(ctx, getObjectTypeSQL, false);
        ResultSet rs = pStmt.executeQuery();
        while (rs.next()) {
            String aid = rs.getString(1);
            int objectID = rs.getInt(2);
            String eqpName = rs.getString(3);
            String fromSite = rs.getString(4);
            CMMObject cmmObject = new CMMObject();
            cmmObject.setIp(aid);
            cmmObject.setObjectID(objectID);
            cmmObject.setEqpName(eqpName);
            cmmObject.setFromSite(fromSite);
            insert(cmmObject);
        }
        log("Finish Loading All cmm_db.BCMED_EQP_VW");
    } catch (Exception e) {
        log("Failed to Load ServiceName_TBL" + e);
    } finally {
        if (pStmt != null) {
            try {
                pStmt.close();
            } catch (Exception e) {
                log("Failed to close pStmt");
            }
        }
    }
    CMMObjectShouldBeRefreshed cMMObjectShouldBeRefreshed = new CMMObjectShouldBeRefreshed();
    insert(cMMObjectShouldBeRefreshed);
end
I am using a server that allocates about 20 GB of RAM to the Drools engine, and it has 8 quad-core 1.5 GHz processors.
The problem is that loading 5,000 rows takes about 1 minute, so loading the full 1 million records from the DB would take about 200 minutes, which is far too long.
I will appreciate any help here.
Thanks a lot!
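A likely bottleneck here is not Drools insertion but JDBC round trips: Oracle's JDBC driver defaults to fetching only 10 rows per round trip, so a million-row scan costs roughly 100,000 network round trips. Below is a minimal sketch under that assumption (the connection handling and class names are illustrative, not from the posted framework) that raises the fetch size, plus the round-trip arithmetic:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class FetchSizeSketch {

    // Round trips needed to stream `rows` rows at a given fetch size.
    static long roundTrips(long rows, int fetchSize) {
        return (rows + fetchSize - 1) / fetchSize; // ceiling division
    }

    // Illustrative loader: the same query as in the rule, but with a large fetch size.
    static void load(Connection conn) throws SQLException {
        String sql = "select AID, BC_OBJECT_ID, ALR_EQP_NAME, ALR_FROM_SITE"
                   + " from CMM_DB.COR_EQP_VW";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setFetchSize(5000); // Oracle's default row prefetch is only 10
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // build the CMMObject and insert it into the session here
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(1_000_000, 10));   // 100000
        System.out.println(roundTrips(1_000_000, 5000)); // 200
    }
}
```

If the framework hands you the PreparedStatement it creates (as prepareNativeStatement appears to), calling setFetchSize on it before executeQuery may be enough; measure on a few thousand rows first before assuming this is the whole cost.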

Related

Do database changes result in change of ResultSet

try (
    Connection conn = ds.getConnection();
    PreparedStatement sm = conn.prepareStatement(SQL, ResultSet.TYPE_FORWARD_ONLY,
            ResultSet.CONCUR_READ_ONLY);
    ResultSet rs = sm.executeQuery();
) {
    // parsing start
    List<Entity> list = Lists.newArrayList();
    while (rs.next()) {
        list.add(parseFromResultSet(rs));
    }
    // parsing end
    return list;
} catch (SQLException e) {
    e.printStackTrace();
}
Considering a simple case like the above: could the result change if the database changes while the function is still iterating over the ResultSet, in case the fetch size is smaller than the result set size?
(Meaning an INSERT/UPDATE/DELETE has been committed to the database between "parsing start" and "parsing end".)
Are some database driver implementations known to behave this way?

How to reload list resource bundle in ADF 12c

I fail to reload my resource bundle class to reflect the changed translations (made by the end user) on the page, although the getContents method executes and all translations are fetched from the database as key/value pairs and the Object[][] is returned from getContents successfully. This happens every time, even after I clear the cache and refresh the JSF page through an actionListener.
ResourceBundle.clearCache();
Also I tried to use the below and got the same result.
ResourceBundle.clearCache(Thread.currentThread().getContextClassLoader());
Why does WLS always see the old one? Am I missing something?
versions: 12.2.1.1.0 and 12.2.1.3.0
After the end user makes the translations, contributing to the internationalization of the project, the translations are saved to the database.
The process to enforce these operations is done through the following steps:
Create a HashMap and load all the resource key/value pairs into the map
from the database:
while (rs.next()) {
    bundle.put(rs.getString(1), rs.getString(2));
}
Refresh the Bundle of your application:
SoftCache cache = (SoftCache) getFieldFromClass(ResourceBundle.class, "cacheList");
synchronized (cache) {
    ArrayList myBundles = new ArrayList();
    Iterator keyIter = cache.keySet().iterator();
    while (keyIter.hasNext()) {
        Object key = keyIter.next();
        String name = (String) getFieldFromObject(key, "searchName");
        if (name.startsWith(bundleName)) {
            myBundles.add(key);
            sLog.info("Resourcebundle " + name + " will be refreshed.");
        }
    }
    cache.keySet().removeAll(myBundles);
}
Get a String from the ResourceBundle of your application:
for (String resourcebundle : bundleNames) {
    String bundleName = resourcebundle + (bundlePostfix == null ? "" : bundlePostfix);
    try {
        bundle = ResourceBundle.getBundle(bundleName, locale, getCurrentLoader(bundleName));
    } catch (MissingResourceException e) {
        // bundle with this name not found
    }
    if (bundle == null)
        continue;
    try {
        message = bundle.getString(key);
        if (message != null)
            break;
    } catch (Exception e) {
        // key not present in this bundle; try the next one
    }
}
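As a sketch of the overall idea (class and variable names here are illustrative, not from the ADF application): a ListResourceBundle subclass can serve getContents() directly from the map filled in step 1, so rebuilding the bundle object after a cache clear picks up the new translations:

```java
import java.util.HashMap;
import java.util.ListResourceBundle;
import java.util.Map;

// Hypothetical bundle backed by a Map loaded from the database.
public class DbResourceBundle extends ListResourceBundle {

    private final Map<String, String> entries;

    public DbResourceBundle(Map<String, String> entries) {
        this.entries = entries;
    }

    @Override
    protected Object[][] getContents() {
        // Convert the map into the Object[][] shape ListResourceBundle expects.
        Object[][] contents = new Object[entries.size()][2];
        int i = 0;
        for (Map.Entry<String, String> e : entries.entrySet()) {
            contents[i][0] = e.getKey();
            contents[i][1] = e.getValue();
            i++;
        }
        return contents;
    }

    public static void main(String[] args) {
        Map<String, String> translations = new HashMap<>();
        translations.put("greeting", "Hello");
        DbResourceBundle bundle = new DbResourceBundle(translations);
        System.out.println(bundle.getString("greeting")); // prints "Hello"
    }
}
```

If ResourceBundle.getBundle still returns the old instance after the translations change, the stale copy is coming from a cache layer above this class, which is exactly what the clearCache/reflection workaround in the steps above targets.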

H2 performance degrades when using a remote TCP server

I have a simple application that uses a single connection; the flow is something like this:
SELECT QUERY
CONNECTION CLOSE
for(..) //thousands of iterations
{
SIMPLE SELECT QUERY
}
BUSINESS LOGIC
CONNECTION CLOSE
for(..) //thousands of iterations
{
SIMPLE SELECT QUERY
}
BUSINESS LOGIC
CONNECTION CLOSE
When I use the embedded connection mode, the application finishes in about 20 seconds, but when I switch to server mode, performance deteriorates:
localhost: 110 seconds
remote server (LAN): more than 30 minutes
Each query retrieves a small amount of data.
Is there an explanation for such poor performance? How can I speed up the application without rewriting the code?
Any help is appreciated.
Compare your problem to getting a cup of coffee.
If your coffee is on your desk, that is comparable to in-memory access to your database (H2 embedded): seconds for the coffee, microseconds for H2 embedded.
If you need to go to the kitchen for a cup of coffee, it costs you the travel time from your chair to the kitchen and back. The kitchen is comparable to TCP or file access to your local database: minutes for the coffee, single-digit milliseconds for H2 over TCP on localhost or a local file.
If you need to go out to a coffee shop, it will take you at least 15 minutes to get a cup of coffee (and, in H2 terms, at least double-digit milliseconds per access over TCP to a remote machine).
Now the question: how many times do you want to travel to the coffee shop? If I told you to loop 1000 times to the coffee shop, wouldn't you ask me after the second or third trip whether I was kidding? So why burden an I/O-dependent system with long travel times over the network?
In your case, if you reduce each for loop to a single SQL query, you will get great performance locally, and especially over the remote connection.
I hope the coffee analogy makes the situation clear.
So, to answer your last question: you do have to rewrite your "thousands of iterations" for loops.
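For the read loops, that usually means collecting the IDs first and issuing a single IN query. A small sketch (the table and column names are illustrative) of building the placeholder list:

```java
import java.util.List;
import java.util.stream.Collectors;

public class InClauseSketch {

    // One parameterized query for all IDs instead of one query
    // (and one network round trip) per ID.
    static String buildInQuery(List<Integer> ids) {
        String placeholders = ids.stream()
                .map(id -> "?")
                .collect(Collectors.joining(", "));
        return "SELECT * FROM TEST_TABLE WHERE ID IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildInQuery(List.of(1, 2, 3)));
        // SELECT * FROM TEST_TABLE WHERE ID IN (?, ?, ?)
    }
}
```

Bind each ID with setInt and iterate a single ResultSet afterwards. Note that some databases cap the size of an IN list, so very long ID lists may need to be chunked.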
Update 1
I forgot: if your for loops are write loops (update/insert), you can use batch queries. How to use them depends on the language you are using. Batches hand the database a bunch (e.g., several hundred) of inserts/updates at once before the operation takes place on the database.
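In JDBC, batching looks roughly like the following (the table and column names are made up for illustration): rows are queued client-side with addBatch() and sent in groups with executeBatch():

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchSketch {

    // Number of executeBatch() calls needed for `n` rows flushed every `batchSize`.
    static int flushes(int n, int batchSize) {
        return (n + batchSize - 1) / batchSize; // ceiling division
    }

    // Illustrative batched insert: statements are queued client-side and
    // sent to the server in groups of `batchSize`.
    static void insertAll(Connection conn, List<String> names, int batchSize)
            throws SQLException {
        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO ITEMS(NAME) VALUES(?)")) {
            int queued = 0;
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
                if (++queued % batchSize == 0) {
                    ps.executeBatch(); // one round trip for batchSize rows
                }
            }
            if (queued % batchSize != 0) {
                ps.executeBatch(); // flush the remainder
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(flushes(1000, 100)); // 10 batches
        System.out.println(flushes(1001, 100)); // 11 batches
    }
}
```

Drivers typically send each batch in one round trip, so thousands of single-row inserts collapse into a handful of network exchanges.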
I did the following test with the H2 version you mentioned.
create a test database with 500,000 records, database size 990 MB
java -cp h2-1.3.168.jar;. PerfH2 create
select 8,000 random records on the indexed column
in embedded mode -> 1.7 sec
java -cp h2-1.3.168.jar;. PerfH2 embedded
in server (localhost) mode -> 2.6 sec
java -cp h2-1.3.168.jar;. PerfH2 server
If you already face the problem when using server mode as jdbc:h2:tcp://localhost/your_database, then something about your environment, or the way you access the server mode, seems to be wrong. Try a stripped-down application and check whether the problem still exists. If you can reproduce the problem with a stripped-down version as well, please post the code.
Find the code used for the test below.
public class PerfH2 {

    public static void main(String[] args) throws SQLException {
        if (null == args[0]) {
            showUsage();
            return;
        }
        long start = System.currentTimeMillis();
        switch (args[0]) {
        case "create":
            createDatabase();
            break;
        case "embedded":
            try (Connection conn = getEmbeddedConnection()) {
                execSelects(conn);
            }
            break;
        case "server":
            try (Connection conn = getServerConnection()) {
                execSelects(conn);
            }
            break;
        default:
            showUsage();
        }
        System.out.printf("duration: %d%n", System.currentTimeMillis() - start);
    }

    private static Connection getServerConnection() throws SQLException {
        return DriverManager.getConnection("jdbc:h2:tcp://localhost/perf_test", "sa", "");
    }

    private static Connection getEmbeddedConnection() throws SQLException {
        return DriverManager.getConnection("jdbc:h2:d:/temp/perf_test", "sa", "");
    }

    private static void execSelects(final Connection conn) throws SQLException {
        Random rand = new Random(1);
        String selectSql = "SELECT * FROM TEST_TABLE WHERE ID = ?";
        PreparedStatement selectStatement = conn.prepareStatement(selectSql);
        int count = 0;
        for (int i = 0; i < 8000; i++) {
            selectStatement.setInt(1, rand.nextInt(500_000));
            ResultSet rs = selectStatement.executeQuery();
            while (rs.next()) {
                count++;
            }
        }
        System.out.printf("retrieved rows: %d%n", count);
    }

    private static void createDatabase() throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:d:/temp/perf_test", "sa", "")) {
            String createTable = "CREATE TABLE TEST_TABLE(ID INT, NAME VARCHAR(1024))";
            conn.createStatement().executeUpdate(createTable);
            String insertSql = "INSERT INTO TEST_TABLE VALUES(?, ?)";
            PreparedStatement insertStmnt = conn.prepareStatement(insertSql);
            StringBuilder sb = new StringBuilder(1024);
            for (int i = 0; i < 1024 / 10; i++) {
                sb.append("[cafebabe]");
            }
            String value = sb.toString();
            int count = 0;
            for (int i = 0; i < 500_000; i++) {
                insertStmnt.setInt(1, i);
                insertStmnt.setString(2, value);
                count += insertStmnt.executeUpdate();
            }
            System.out.printf("inserted rows: %d%n", count);
            conn.commit();
            String createIndex = "CREATE INDEX TEST_INDEX ON TEST_TABLE(ID)";
            conn.createStatement().executeUpdate(createIndex);
        }
    }

    private static void showUsage() {
        System.out.println("usage: PerfH2 [create|embedded|server]");
    }
}

Is using a SqlCacheDependency per user a good idea?

I'm thinking of caching permissions for every user on our application server. Is it a good idea to use a SqlCacheDependency for every user?
The query would look like this
SELECT PermissionId, PermissionName FROM Permissions WHERE UserId = @UserId
That way I know if any of those records change then to purge my cache for that user.
If you read how Query Notifications work, you'll see why creating many dependency requests from a single query template is good practice. For a web app, which is implied by the fact that you use SqlCacheDependency and not SqlDependency, what you plan to do should be OK. If you use Linq2Sql you can also try LinqToCache:
var queryUsers = from u in repository.Users
                 where u.UserId == currentUserId
                 select u;
var user = queryUsers.AsCached("Users:" + currentUserId.ToString());
For a fat client app it would not be OK. Not because of the query per se, but because SqlDependency in general is problematic with a large number of connected clients (it blocks a worker thread per connected app domain):
SqlDependency was designed to be used in ASP.NET or middle-tier
services where there is a relatively small number of servers having
dependencies active against the database. It was not designed for use
in client applications, where hundreds or thousands of client
computers would have SqlDependency objects set up for a single
database server.
Updated
Here is the same test that @usr did in his post. Full C# code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.SqlClient;
using DependencyMassTest.Properties;
using System.Threading.Tasks;
using System.Threading;

namespace DependencyMassTest
{
    class Program
    {
        static volatile int goal = 50000;
        static volatile int running = 0;
        static volatile int notified = 0;
        static int workers = 50;
        static SqlConnectionStringBuilder scsb;
        static AutoResetEvent done = new AutoResetEvent(false);

        static void Main(string[] args)
        {
            scsb = new SqlConnectionStringBuilder(Settings.Default.ConnString);
            scsb.AsynchronousProcessing = true;
            scsb.Pooling = true;
            try
            {
                SqlDependency.Start(scsb.ConnectionString);
                using (var conn = new SqlConnection(scsb.ConnectionString))
                {
                    conn.Open();
                    using (SqlCommand cmd = new SqlCommand(@"
if object_id('SqlDependencyTest') is not null
    drop table SqlDependencyTest
create table SqlDependencyTest (
    ID int not null identity,
    SomeValue nvarchar(400),
    primary key(ID)
)
", conn))
                    {
                        cmd.ExecuteNonQuery();
                    }
                }
                for (int i = 0; i < workers; ++i)
                {
                    Task.Factory.StartNew(() =>
                    {
                        RunTask();
                    });
                }
                done.WaitOne();
                Console.WriteLine("All dependencies subscribed. Waiting...");
                Console.ReadKey();
            }
            catch (Exception e)
            {
                Console.Error.WriteLine(e);
            }
            finally
            {
                SqlDependency.Stop(scsb.ConnectionString);
            }
        }

        static void RunTask()
        {
            Random rand = new Random();
            SqlConnection conn = new SqlConnection(scsb.ConnectionString);
            conn.Open();
            SqlCommand cmd = new SqlCommand(@"
select SomeValue
from dbo.SqlDependencyTest
where ID = @id", conn);
            cmd.Parameters.AddWithValue("@id", rand.Next(50000));
            SqlDependency dep = new SqlDependency(cmd);
            dep.OnChange += new OnChangeEventHandler((ob, qnArgs) =>
            {
                Console.WriteLine("Notified {3}: Info:{0}, Source:{1}, Type:{2}",
                    qnArgs.Info, qnArgs.Source, qnArgs.Type,
                    Interlocked.Increment(ref notified));
            });
            cmd.BeginExecuteReader((ar) =>
            {
                try
                {
                    int crt = Interlocked.Increment(ref running);
                    if (crt % 1000 == 0)
                    {
                        Console.WriteLine("{0} running...", crt);
                    }
                    using (SqlDataReader rdr = cmd.EndExecuteReader(ar))
                    {
                        while (rdr.Read())
                        {
                        }
                    }
                }
                catch (Exception e)
                {
                    Console.Error.WriteLine(e.Message);
                }
                finally
                {
                    conn.Close();
                    int left = Interlocked.Decrement(ref goal);
                    if (0 == left)
                    {
                        done.Set();
                    }
                    else if (left > 0)
                    {
                        RunTask();
                    }
                }
            }, null);
        }
    }
}
After the 50k subscriptions are set up (which takes about 5 minutes), here are the timing statistics for a single insert:
set statistics time on
insert into Test..SqlDependencyTest (SomeValue) values ('Foo');
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 16 ms.
Inserting 1000 rows takes about 7 seconds, which includes firing several hundred notifications. CPU utilization is about 11%. All this is on my ThinkPad T420s.
set nocount on;
go
begin transaction
go
insert into Test..SqlDependencyTest (SomeValue) values ('Foo');
go 1000
commit
go
the documentation says:
SqlDependency was designed to be used in ASP.NET or middle-tier
services where there is a relatively small number of servers having
dependencies active against the database. It was not designed for use
in client applications, where hundreds or thousands of client
computers would have SqlDependency objects set up for a single
database server.
It tells us not to open thousands of cache dependencies; doing so is likely to cause resource problems on the SQL Server.
There are a few alternatives:
Have a dependency per table
Have 100 dependencies per table, one for every percent of rows. This should be an acceptable number for SQL Server yet you only need to invalidate 1% of the cache.
Have a trigger output the IDs of all changed rows into a logging table. Create a dependency on that table and read the IDs. This will tell you exactly which rows have changed.
In order to find out if SqlDependency is suitable for mass usage I did a benchmark:
static void SqlDependencyMassTest()
{
    var connectionString = "Data Source=(local); Initial Catalog=Test; Integrated Security=true;";
    using (var dependencyConnection = new SqlConnection(connectionString))
    {
        dependencyConnection.EnsureIsOpen();
        dependencyConnection.ExecuteNonQuery(@"
if object_id('SqlDependencyTest') is not null
    drop table SqlDependencyTest
create table SqlDependencyTest (
    ID int not null identity,
    SomeValue nvarchar(400),
    primary key(ID)
)
--ALTER DATABASE Test SET ENABLE_BROKER with rollback immediate
");
        SqlDependency.Start(connectionString);
        for (int i = 0; i < 1000 * 1000; i++)
        {
            using (var sqlCommand = new SqlCommand(
                "select ID from dbo.SqlDependencyTest where ID = @id",
                dependencyConnection))
            {
                sqlCommand.AddCommandParameters(new { id = StaticRandom.ThreadLocal.GetInt32() });
                CreateSqlDependency(sqlCommand, args =>
                {
                });
            }
            if (i % 1000 == 0)
                Console.WriteLine(i);
        }
    }
}
You can see the number of dependencies created scroll through the console. It gets slow very quickly. I did not take a formal measurement because it was not necessary to prove the point.
Also, the execution plan for a simple insert into the table shows 99% of the cost being associated with maintaining the 50k dependencies.
Conclusion: this does not work at all for production use. After 30 minutes I had 55k dependencies created, with the machine at 100% CPU the whole time.

Logging SQL Statements in Android; fired by application

I have a small Android app, and currently I am firing a SQL statement to get the count of rows in the database for a specific WHERE clause.
Following is my sample code:
public boolean exists(Balloon balloon) {
    if (balloon != null) {
        Cursor c = null;
        String count_query = "Select count(*) from balloons where _id = ?";
        try {
            c = getReadableDatabase().rawQuery(count_query, new String[] { balloon.getId() });
            if (c.getCount() > 0)
                return true;
        } catch (SQLException e) {
            Log.e("Running Count Query", e.getMessage());
        } finally {
            if (c != null) {
                try {
                    c.close();
                } catch (SQLException e) {}
            }
        }
    }
    return false;
}
This method returns a count of 1 even when the database table is actually empty. I am not able to figure out why that happens; running the same query directly against the database gives me a count of 0.
I was wondering if it is possible to log, or see in a log, all the SQL queries that eventually get fired on the database after parameter substitution, so that I can understand what is going wrong.
Cheers
That query will always return one record (as will any SELECT COUNT). The reason is that the record it returns contains a field that holds the number of records present in the "balloons" table with that ID.
In other words, the single record (c.getCount() == 1) is NOT saying that the number of records found in balloons is one; rather, it is the record generated by SQLite, which contains a field with the result.
To find out the number of balloons, you should call c.moveToFirst() and then read numberOfBalloons = c.getInt(0).
