I have defined an entity as so:
public class Chair {
    @GenericGenerator(name = "sequencePerEntityGenerator", strategy = "org.hibernate.id.enhanced.SequenceStyleGenerator", parameters = {
            @Parameter(name = "prefer_sequence_per_entity", value = "true"),
            @Parameter(name = "sequence_per_entity_suffix", value = "_seq"),
            @Parameter(name = "initial_value", value = "5000000"),
            @Parameter(name = SequenceStyleGenerator.INCREMENT_PARAM, value = "1") })
    @GeneratedValue(strategy = GenerationType.AUTO, generator = "sequencePerEntityGenerator")
    @Id
    int id;
}
But I would like to know the names of the sequences that are created (using Chair.class), and then extract those names and use the dialect to build a nextval call of my own, so that I can query a new ID without any .persist() call being made. Is this possible? If so, how? If not, what else could I do?
My final aim is to query multiple IDs (up to millions) using a single SQL statement, as provided in this Stack Overflow question, in an existing application. (Other options for doing the same in MySQL and Postgres exist as separate answers to other questions.)
PS: One may recommend using PooledOptimizer, the HiLo optimizer, or any of the client-side optimizers already provided by Hibernate to optimize ID generation. However, given the heavy load my application is under due to other processes, it is unable to allocate enough CPU time to sequence optimizers, and the synchronized generate methods of these optimizers block threads (asynchronous persistence). Incidentally, using optimizers slows down multi-threaded persist calls on the same entity class, and is slower than NoopOptimizer (no optimizer).
My app has profile and work databases (and others) stored locally using Sembast DB.
Please look at the two examples below, which one is a better practice for asynchronous processes?
Example 1:
final profileDBPath = p.join(appDocumentDir.path, dbDirectory, 'profile.db');
final profileDB = await databaseFactoryIo.openDatabase(profileDBPath);
final workDBPath = p.join(appDocumentDir.path, dbDirectory, 'work.db');
final workDB = await databaseFactoryIo.openDatabase(workDBPath);
final workStore = stringMapStoreFactory.store('work');
final profileStore = stringMapStoreFactory.store('profile');
Example 2:
final dbPath = p.join(appDocumentDir.path, dbDirectory, 'database.db');
final database = await databaseFactoryIo.openDatabase(dbPath);
final workStore = stringMapStoreFactory.store('work');
final profileStore = stringMapStoreFactory.store('profile');
Notice that Example 1 opens two different database files, one for profile and one for work, while Example 2 uses the same database file for both.
The question is which one is better in terms of stability?
For coding simplicity I like Example 2 better, but my worry is that in an async operation Example 2 will crash when they write on the same file at the same time. Any ideas?
Thank you
"Example 2 will crash when they write on the same file at the same time"
I don't know if that is something you experienced or just an assumption. A Sembast database supports multiple concurrent readers and a single writer (single process and single isolate) and uses a kind of mutex to ensure data consistency. Concurrent writes will be serialized and should not trigger any crash. And if they do, that's a bug that you should file!
Personally, I would go for a single database: it allows cross-store transactions for data consistency, which two databases cannot provide.
Code Migration due to Performance Issues :-
SQL Server LIKE Condition ( BEFORE )
SQL Server Full Text Search --> CONTAINS ( BEFORE )
Elastic Search ( CURRENTLY )
Achieved So Far :-
We have a web page created in ASP.Net Core which has an Auto Complete Drop Down of 2.5+ million companies indexed in Elastic Search https://www.99corporates.com/
Due to performance issues we have successfully shifted our code from SQL Server Full Text Search to Elastic Search, and are using NEST v7.2.1 and Elasticsearch.Net v7.2.1 in our .Net code.
Still looking for a solution :-
If the user does not select a company from the Auto Complete list and simply enters a few characters and clicks on Go, then a list should be displayed, which we had done earlier by using SQL Server Full Text Search --> CONTAINS.
Can we call the ASP.Net Web Service which we have created using SQL CLR, with code like SELECT * FROM dbo.Table WHERE Name IN( dbo.SQLWebRequest('') )?
[System.Web.Script.Services.ScriptMethod()]
[System.Web.Services.WebMethod]
public static List<string> SearchCompany(string prefixText, int count)
{
}
Is there any better or alternate option?
While that solution (i.e. the SQL-APIConsumer SQLCLR project) "works", it is not scalable. It also requires setting the database to TRUSTWORTHY ON (a security risk), and loads a few assemblies as UNSAFE, such as Json.NET. That is risky if any of them use static variables for caching and expect each caller to be isolated in its own App Domain: SQLCLR is a single, shared App Domain, so static variables are shared across all callers, and multiple concurrent threads can cause race conditions. This is not to say that this is definitely happening, since I haven't seen the code. But if you haven't either reviewed the code or tested it with multiple concurrent threads to ensure that it doesn't pose a problem, then it's definitely a gamble with regard to stability and predictable, expected behavior.
To a slight degree I am biased given that I do sell a SQLCLR library, SQL#, in which the Full version contains a stored procedure that also does this but a) handles security properly via signatures (it does not enable TRUSTWORTHY), b) allows for handling scalability, c) does not require any UNSAFE assemblies, and d) handles more scenarios (better header handling, etc). It doesn't handle any JSON, it just returns the web service response and you can unpack that using OPENJSON or something else if you prefer. (yes, there is a Free version of SQL#, but it does not contain INET_GetWebPages).
HOWEVER, I don't think SQLCLR is a good fit for this scenario in the first place. In your first two versions of this project (using LIKE and then CONTAINS) it made sense to send the user input directly into the query. But now that you are using a web service to get a list of matching values from that user input, you are no longer confined to that approach. You can, and should, handle the web service / Elastic Search portion of this separately, in the app layer.
Rather than passing the user input into the query, only to have the query pause to get that list of 0 or more matching values, you should do the following:
Before executing any query, get the list of matching values directly in the app layer.
If no matching values are returned, you can skip the database call entirely as you already have your answer, and respond immediately to the user (much faster response time when no matches return)
If there are matches, then execute the search stored procedure, sending that list of matches as-is via Table-Valued Parameter (TVP) which becomes a table variable in the stored procedure. Use that table variable to INNER JOIN against the table rather than doing an IN list since IN lists do not scale well. Also, be sure to send the TVP values to SQL Server using the IEnumerable<SqlDataRecord> method, not the DataTable approach as that merely wastes CPU / time and memory.
For example code on how to accomplish this correctly, please see my answer to Pass Dictionary to Stored Procedure T-SQL
In C#-style pseudo-code, this would be something along the lines of the following:
List<string> companies = SearchCompany(PrefixText, Count);
if (companies.Count == 0)
{
    Response.Write("Nope");
}
else
{
    using(SqlConnection db = new SqlConnection(connectionString))
    {
        using(SqlCommand batch = db.CreateCommand())
        {
            batch.CommandType = CommandType.StoredProcedure;
            batch.CommandText = "ProcName";
            SqlParameter tvp = new SqlParameter("ParamName", SqlDbType.Structured);
            tvp.Value = MethodThatYieldReturnsList(companies);
            batch.Parameters.Add(tvp);
            db.Open();
            // execute the command (not the connection) to get the reader
            using(SqlDataReader results = batch.ExecuteReader())
            {
                if (results.HasRows)
                {
                    // deal with results
                    Response.Write(results....);
                }
            }
        }
    }
}
Done. Got the solution.
Used SQL CLR https://github.com/geral2/SQL-APIConsumer
exec [dbo].[APICaller_POST]
@URL = 'https://www.-----/SearchCompany'
,@JsonBody = '{"searchText":"GOOG","count":10}'
Let me know if there are any other / better options to achieve this.
To give a simplified example:
I have a database with one table: names, which has 1 million records each containing a common boy or girl's name, and more added every day.
I have an application server that takes as input an HTTP request from parents using my website 'Name Chooser'. With each request, I need to pick a name from the DB and return it, and then NOT give that name to another parent. The server is concurrent, so it can handle a high volume of requests, yet it has to respect "unique name per request" and still be highly available.
What are the major components and strategies for an architecture of this use case?
From what I understand, you have two operations: Adding a name and Choosing a name.
I have a couple of questions:
Question 1: Do parents only choose names, or do they also add names?
Question 2: If they add names, does that mean that when a name is added it should also be marked as already chosen?
Assuming that you don't want to make all name selection requests wait for one another (by locking or queueing them):
One solution to resolve concurrency in case of choosing a name only is to use Optimistic offline lock.
The most common implementation of this is to add a version field to your table and increment this version when you mark a name as chosen. You will need DB support for this, but most databases offer a mechanism for it. With MongoDB, an ODM such as Mongoose adds a version key (__v) to documents by default. For an RDBMS (a SQL database) you have to add this field yourself.
You haven't specified what technology you are using, so I will give an example using pseudo-code for a SQL DB. For MongoDB you can check how the DB makes these checks for you.
NameRecord {
    id,
    name,
    parentID,
    version,
    isChosen,

    function chooseForParent(parentID) {
        if(this.isChosen){
            throw Error/Exception;
        }
        this.parentID = parentID
        this.isChosen = true;
        this.version++;
    }
}

NameRecordRepository {
    function getByName(name) { ... }

    function save(record) {
        var oldVersion = record.version - 1;
        var query = "UPDATE records SET .....
                     WHERE id = {record.id} AND version = {oldVersion}";
        var rowsCount = db.execute(query);
        if(rowsCount == 0) {
            throw ConcurrencyViolation
        }
    }
}

// somewhere else in an object or module or whatever...
function chooseName(parentID, name) {
    var record = NameRecordRepository.getByName(name);
    record.chooseForParent(parentID);
    NameRecordRepository.save(record);
}
Before this object is saved to the DB, a version comparison must be performed. SQL provides a way to execute a query based on some condition and return the count of affected rows. In our case we check that the version in the database is still the same as the old one before updating. If it's not, no rows are updated, which means that someone else has updated the record in the meantime.
In this simple case you can even remove the version field and use the isChosen flag in your SQL query like this:
var query = "UPDATE records SET .....
WHERE id = {record.id} AND isChosen = false";
When adding a new name to the database you will need a unique constraint, which will solve concurrency issues.
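To make the version check concrete without a database, here is a minimal in-memory sketch in Java. It is only an illustration under assumptions: the class and method names (OptimisticNameStore, NameRecord, add, choose) are hypothetical, a ConcurrentHashMap stands in for the table, putIfAbsent plays the role of the unique constraint, and the conditional replace plays the role of "UPDATE ... WHERE version = oldVersion" returning 0 or 1 affected rows.

```java
import java.util.concurrent.ConcurrentHashMap;

final class OptimisticNameStore {
    // One row of the hypothetical "names" table. Instances are never mutated;
    // an update swaps in a new instance with a higher version.
    static final class NameRecord {
        final String name;
        final String parentId; // null while the name is still free
        final int version;

        NameRecord(String name, String parentId, int version) {
            this.name = name;
            this.parentId = parentId;
            this.version = version;
        }
    }

    private final ConcurrentHashMap<String, NameRecord> rows = new ConcurrentHashMap<>();

    // Adding a name: putIfAbsent stands in for a unique constraint on the name.
    public void add(String name) {
        if (rows.putIfAbsent(name, new NameRecord(name, null, 0)) != null) {
            throw new IllegalStateException("duplicate name: " + name);
        }
    }

    // Choosing a name: returns true if this caller won the name, false if the
    // name is unknown, already chosen, or was updated by someone else first.
    public boolean choose(String name, String parentId) {
        NameRecord current = rows.get(name);
        if (current == null || current.parentId != null) {
            return false;
        }
        NameRecord updated = new NameRecord(name, parentId, current.version + 1);
        // The swap only succeeds if the stored row is still the exact snapshot
        // we read, which mirrors "WHERE id = ? AND version = ?" in SQL.
        return rows.replace(name, current, updated);
    }
}
```

With this sketch, two parents racing for the same name both read the same snapshot, but only one replace call succeeds; the other gets false and can retry with a different name.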
When you use NHibernate to "fetch" a mapped object, it outputs a SELECT query to the database. It outputs this using parameters; so if I query a list of cars based on tenant ID and name, I get:
select Name, Location from Car where tenantID=@p0 and Name=@p1
This has the nice benefit of our database creating (and caching) a query plan based on this query and the result, so when it is run again, the query is much faster as it can load the plan from the cache.
The problem with this is that we are a multi-tenant database, and almost all of our indexes are partition aligned. Our tenants have vastly different data sets; one tenant could have 5 cars, while another could have 50,000. And so because NHibernate does this, it has the net effect of our database creating and caching a plan for the FIRST tenant that runs it. This plan is likely not efficient for subsequent tenants who run the query.
What I WANT to do is force NHibernate NOT to parameterize certain parameters; namely, the tenant ID. So I'd want the query to read:
select Name, Location from Car where tenantID=55 and Name=@p0
I can't figure out how to do this in the HBM.XML mapping. How can I dictate to NHibernate how to use parameters? Or can I just turn parameters off altogether?
OK everyone, I figured it out.
The way I did it was overriding the SqlClientDriver with my own custom driver that looks like this:
public class CustomSqlClientDriver : SqlClientDriver
{
    // matches ".TenantID=@p0" and captures the parameter name
    private static Regex _tenantIDReplacer = new Regex(@"\.TenantID=(@p0)", RegexOptions.Compiled);

    public override void AdjustCommand(IDbCommand command)
    {
        var m = _tenantIDReplacer.Match(command.CommandText);
        if (!m.Success)
            return;
        // replace the first parameter with the actual tenant ID
        var parameterName = m.Groups[1].Value;
        // find the parameter value
        var tenantID = (IDbDataParameter) command.Parameters[parameterName];
        var valueOfTenantID = tenantID.Value;
        // now replace the string (the value is concatenated as-is, so it must be numeric)
        command.CommandText = _tenantIDReplacer.Replace(command.CommandText, ".TenantID=" + valueOfTenantID);
    }
}
I override the AdjustCommand method and use a Regex to replace the tenantID. This works; not sure if there's a better way, but I really didn't want to have to open up NHibernate and start messing with core code.
You'll have to register this custom driver in the connection.driver_class property of the SessionFactory upon initialization.
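The rewriting step itself is independent of NHibernate, so here is a rough standalone sketch of it in Java, mainly to show a type check before the value is inlined. Everything in it is an assumption for illustration: the TenantInliner class, the inline method, and the parameter map are hypothetical and not part of any NHibernate or driver API.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TenantInliner {
    // Matches ".TenantID=@pN" and captures the parameter name (e.g. "@p0").
    private static final Pattern TENANT_PARAM = Pattern.compile("\\.TenantID=(@p\\d+)");

    // Returns the SQL text with the tenant ID parameter replaced by its literal
    // value; any other parameters are left untouched.
    static String inline(String sql, Map<String, Object> parameters) {
        Matcher m = TENANT_PARAM.matcher(sql);
        if (!m.find()) {
            return sql; // nothing to rewrite
        }
        Object value = parameters.get(m.group(1));
        // Only inline integer values: concatenating arbitrary strings into the
        // SQL text would reopen the door to SQL injection.
        if (!(value instanceof Integer) && !(value instanceof Long)) {
            throw new IllegalArgumentException("tenant ID must be an integer");
        }
        return m.replaceFirst(Matcher.quoteReplacement(".TenantID=" + value));
    }
}
```

The trade-off is the same as in the driver above: each tenant gets its own plan because the SQL text now differs per tenant, at the cost of more entries in the plan cache.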
Hope this helps somebody!
In my DB there is a view, "RqstLst".
I created an EF model from the DB, so now I have an entity RqstLst.
There are two variants of the same query:
public void MyMethod()
{
    context = new WaterMEntities();
    var query = context.RqstLst;
    dgRqstLst.ItemsSource = query; //dgRqstLst - DataGrid in WPF
}
and
public void MyMethod()
{
    dgRqstLst.ItemsSource = this.GetRqstLst();
}

private IEnumerable<RqstLst> GetRqstLst()
{
    context = new WaterMEntities();
    string nativeSQLQuery = "SELECT * " +
                            "FROM dbo.RqstLst ";
    ObjectResult<RqstLst> requestes =
        context.ExecuteStoreQuery<RqstLst>(nativeSQLQuery);
    return requestes;
}
Execution time for the first variant (LINQ to Entities) is 19 sec; for the second, less than 1 sec.
I checked it in SQL Server Profiler. What am I doing wrong in the first variant?
One big difference is that ExecuteStoreQuery doesn't attach the returned objects to the context (at least not the overload you are using) but your first query does (which costs time).
Try to define the same tracking behaviour in your first query like you have in the second query (= NoTracking):
context = new WaterMEntities();
context.RqstLst.MergeOption = MergeOption.NoTracking; // in System.Data.Objects
var query = context.RqstLst;
dgRqstLst.ItemsSource = query;
You didn't do anything wrong with the first option, but depending on your configuration, the query generated by the first one could be much more complex than your straight SQL execution. Have you used the profiler to see exactly what SQL the first query generates? For example, if RqstLst happens to be an abstract base class using TPT Inheritance, the generated SQL could be huge.