Deserializing only select properties of an Entity using JDOQL query string? - google-app-engine

I have a rather large class stored in the datastore, for example a User class with lots of fields (I'm using Java, and I've omitted most annotations from the example below for clarity):
@PersistenceCapable
class User {
    private String username;
    private String city;
    private String state;
    private String country;
    private String favColor;
}
For some user queries, I only need the favColor property, but right now I'm doing this:
SELECT FROM " + User.class.getName() + " WHERE username == 'bob'
which should deserialize all of the entity properties. Is it possible to do something instead like:
SELECT username, favColor FROM " + User.class.getName() + " WHERE username == 'bob'
and then in this case, all of the returned User instances will only spend time deserializing the username and favColor properties, and not the city/state/country properties? If so, then I suppose all the other properties will be null (in the case of objects) or 0 for int/long/float?
Thank you

No, this isn't possible with the App Engine datastore; your entire entity is stored in a protocol buffer and must be deserialized together.
If you have really large properties that aren't often needed, it's probably a good idea to put these in a separate model, although if you're really talking about just 3 strings it's almost certainly not worth the effort.
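A minimal sketch of the split suggested above, written as plain Java with an in-memory map standing in for the datastore. The names `UserCore`, `UserDetails`, and `DetailStore` are hypothetical; a real JDO model would carry persistence annotations and reference the second entity by a datastore key rather than a string id.

```java
import java.util.HashMap;
import java.util.Map;

// Small, frequently read entity: loaded on every user query.
class UserCore {
    final String username;
    final String favColor;
    final String detailsId; // reference to the separately stored details

    UserCore(String username, String favColor, String detailsId) {
        this.username = username;
        this.favColor = favColor;
        this.detailsId = detailsId;
    }
}

// Large, rarely needed properties live in their own entity.
class UserDetails {
    final String city;
    final String state;
    final String country;

    UserDetails(String city, String state, String country) {
        this.city = city;
        this.state = state;
        this.country = country;
    }
}

// Hypothetical stand-in for the datastore; counts how often details are fetched.
class DetailStore {
    private final Map<String, UserDetails> rows = new HashMap<>();
    int loads = 0;

    void put(String id, UserDetails details) {
        rows.put(id, details);
    }

    UserDetails load(String id) {
        loads++;
        return rows.get(id);
    }
}
```

Reading `favColor` from a `UserCore` never touches the details entity; the extra fetch happens only when code explicitly calls `load`.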

Related

Controlling NHibernate search query output regarding parameters

When you use NHibernate to "fetch" a mapped object, it outputs a SELECT query to the database. It outputs this using parameters; so if I query a list of cars based on tenant ID and name, I get:
select Name, Location from Car where tenantID=@p0 and Name=@p1
This has the nice benefit of our database creating (and caching) a query plan based on this query and the result, so when it is run again, the query is much faster as it can load the plan from the cache.
The problem with this is that we are a multi-tenant database, and almost all of our indexes are partition aligned. Our tenants have vastly different data sets; one tenant could have 5 cars, while another could have 50,000. And so because NHibernate does this, it has the net effect of our database creating and caching a plan for the FIRST tenant that runs it. This plan is likely not efficient for subsequent tenants who run the query.
What I WANT to do is force NHibernate NOT to parameterize certain parameters; namely, the tenant ID. So I'd want the query to read:
select Name, Location from Car where tenantID=55 and Name=@p0
I can't figure out how to do this in the HBM.XML mapping. How can I dictate to NHibernate how to use parameters? Or can I just turn parameters off altogether?
OK everyone, I figured it out.
The way I did it was overriding the SqlClientDriver with my own custom driver that looks like this:
using System.Data;
using System.Text.RegularExpressions;
using NHibernate.Driver;

public class CustomSqlClientDriver : SqlClientDriver
{
    // Matches the tenant ID comparison and captures its parameter name.
    private static readonly Regex _tenantIDReplacer = new Regex(@"\.TenantID=(@p0)", RegexOptions.Compiled);

    public override void AdjustCommand(IDbCommand command)
    {
        var m = _tenantIDReplacer.Match(command.CommandText);
        if (!m.Success)
            return;
        // replace the first parameter with the actual tenant ID
        var parameterName = m.Groups[1].Value;
        // find the parameter value
        var tenantID = (IDbDataParameter) command.Parameters[parameterName];
        var valueOfTenantID = tenantID.Value;
        // now replace the string
        command.CommandText = _tenantIDReplacer.Replace(command.CommandText, ".TenantID=" + valueOfTenantID);
    }
}
I override the AdjustCommand method and use a Regex to replace the tenantID. This works; not sure if there's a better way, but I really didn't want to have to open up NHibernate and start messing with core code.
You'll have to register this custom driver in the connection.driver_class property of the SessionFactory upon initialization.
Hope this helps somebody!
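The string rewrite at the heart of that driver can be illustrated on its own. This is a minimal Java sketch of the same idea, not NHibernate code: the class name `TenantInliner` and the method `inlineTenant` are hypothetical, and the pattern assumes the generated SQL contains `.TenantID=@p0` exactly as in the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TenantInliner {
    // Matches ".TenantID=@p0"; the trailing \b keeps it from matching "@p01".
    private static final Pattern TENANT_PARAM =
            Pattern.compile("\\.TenantID=(@p0)\\b");

    // Returns the SQL with the tenant parameter replaced by a numeric literal.
    // Accepting only a long keeps the inlined value from carrying SQL text.
    static String inlineTenant(String sql, long tenantId) {
        Matcher m = TENANT_PARAM.matcher(sql);
        if (!m.find()) {
            return sql; // nothing to rewrite
        }
        return m.replaceFirst(Matcher.quoteReplacement(".TenantID=" + tenantId));
    }
}
```

Only the tenant comparison becomes a literal; every other parameter stays parameterized, so the database sees a distinct query text per tenant and can cache a plan for each.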

Objectify queries are very slow (Google Datastore)

After some refactoring, we are having issues with the objectify queries that we are using in the application. The strange thing is that even if we revert to the original code the problem stays.
When the application starts, 250 books are fetched from the Datastore using Objectify. Caching is enabled and seems to be working.
The problem is that it takes around 50 to 60 seconds to get the result, and for this reason the HTTP request is sometimes killed. We never had these issues before and we can't find an answer.
When I ran a query like "select * from BookEntity order by creationDate desc limit 250" in the Google Datastore console, it took no more than 5 to 7 seconds.
Before the refactoring, the book entity looked something like this:
@Index
@Entity
@Cache
public class BookEntity {
    @Index
    public String title_name;
    @Index
    public String author_name;
    public String isbn;
    public int number_of_pages;
    public Ref<PdfEntity> book_pdf;
}
Now it's like this:
@Index
@Entity
@Cache
public class BookEntity {
    @Index
    @AlsoLoad("title_name")
    private String titleName;
    @Index
    @AlsoLoad("author_name")
    private String authorName;
    private String isbn;
    @AlsoLoad("number_of_pages")
    private int numberOfPages;
    @AlsoLoad("book_pdf")
    private Ref<PdfEntity> bookPdf;
    // getters and setters for the fields because now they are private
}
Here is just an example, but in reality it has around 20 fields.
In order to migrate the schema to the new field names, I ran a task in GAE which loaded and then re-saved all the BookEntity entities.
This example can be extended to all the entities that are used in the application, but the book is the worst performing one. Even though nothing has changed in the query, and we are talking about a basic query which fetches the newest 250 books by creationDate, it takes a lifetime to get the actual result. Any idea how I can investigate this issue further?
Problem found. We were persisting some information in the no-arg constructor of the BookEntity, so for every book fetched from the datastore, 3 save operations were made for other entities that the book references.
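The effect described above can be reproduced without a datastore. The following is a minimal sketch under stated assumptions: `SaveCounter`, `EagerBook`, `PlainBook`, and `LoaderDemo` are hypothetical names, and the loop only imitates a framework that calls the no-arg constructor once per fetched row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a persistence layer; it only counts save calls.
class SaveCounter {
    static int saves = 0;

    static void save(Object entity) {
        saves++;
    }
}

// Anti-pattern: the no-arg constructor persists related data, so every
// instance a loader materializes triggers extra writes.
class EagerBook {
    EagerBook() {
        SaveCounter.save("related entity 1");
        SaveCounter.save("related entity 2");
        SaveCounter.save("related entity 3");
    }
}

// Safer: the constructor only builds the object; saving is an explicit step.
class PlainBook {
    PlainBook() { }

    static PlainBook createAndSave() {
        PlainBook book = new PlainBook();
        SaveCounter.save(book);
        return book;
    }
}

class LoaderDemo {
    // Mimics a framework instantiating one object per fetched row and
    // reports how many saves happened as a side effect.
    static int savesWhileLoading(int rows, boolean eager) {
        SaveCounter.saves = 0;
        List<Object> loaded = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            loaded.add(eager ? new EagerBook() : new PlainBook());
        }
        return SaveCounter.saves;
    }
}
```

With 250 rows the eager variant performs 750 saves during what looks like a read-only query, which is consistent with the slowdown reported in the question.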

VB.NET - PetaPoco\NPoco - Fetch data from table with dynamic and static columns - Performance issue

I have a specific situation to which I haven't found a solution yet.
I have several databases with the same structure, each containing a table (let's say Users) with known columns such as UserID, UserName, UserMail, etc.
In the same table, I have dynamic custom columns which I only know at runtime, such as customField54, customField75, customField82, etc.
I have a screen where I must show a list of users, and there are thousands of records (it must show ALL the users, no question about it).
The Users table columns in database A look like this:
| UserID | UserName | UserMail | customField54 | customField55 |
and for the example, let's say I have another database B, where the Users table looks like this:
| UserID | UserName | UserMail | customField109 | customField211 | customField235 | customField302 |
I have a single codebase that connects to a different database each time. So it is one codebase and multiple databases, and the only difference between the databases is the custom fields of the Users table.
If I work with a DataTable, I can query:
SELECT * FROM Users
And then, dynamically I can retrieve the custom fields values, like this:
Dim customFieldsIDs() As Integer = GetCustomFieldsIDs()
Dim dt As DataTable = GetUserListData() ' All users data in a DataTable
For Each dr As DataRow In dt.Rows
    Response.Write(dr.Item("UserID"))
    Response.Write(dr.Item("UserName"))
    Response.Write(dr.Item("UserMail"))
    For Each cfID In customFieldsIDs
        Response.Write(dr.Item("customField" & cfID))
    Next
Next
My intention is not to work with DataTables. I want to work with strongly typed objects. I cannot create a POCO of Users that contains the custom fields directly, because the Users table has different customFields columns in each database, so I can't create an object with strongly typed variables for them.
Then, I decided to create a class with the known columns inside, and also a dictionary holding the customFields.
In VB.NET, the class looks as follows:
Public Class User
    Public Property UserID As Integer
    Public Property UserName As String
    Public Property UserMail As String
    Public Property customFieldsDictionary As Dictionary(Of Integer, String)
End Class
The class has the static values: UserID, UserName, etc...
Also, it has a dictionary of the customFieldIDs and their values, so I can retrieve the values in a single action (in O(1) complexity)
I use MicroORM PetaPoco\NPoco to populate the values.
The ORM allows me to fetch the Users data without me having to iterate the data by myself, by calling:
Dim userList As List(Of User) = db.Fetch(Of User)("SELECT * FROM Users")
But then the customFields dictionary is not populated.
In order to populate I have to iterate the userList and for each user retrieve the customFields data.
This is a very expensive way to fetch the data and results in very bad performance.
I'd like to know if there is a way to fetch the data into the User class using PetaPoco\NPoco with a single command and manage to populate the known values and the custom fields dictionary for every user without having to iterate through the whole collection.
I hope it is understood. It is really difficult for me to explain and a very difficult issue to find a solution to.
You could try fetching everything into a dictionary and then you could map specific keys/values to your User object properties.
EDIT:
I'm not using VB.NET anymore, but I'll try to explain.
Create the indexer similar to this one:
http://www.java2s.com/Tutorial/VB/0120__Class-Module/DefineIndexerforyourownclass.htm
In the indexer you would do something like:
If index = "FirstName" Then
    Me.FirstName = value
End If
....
If index.StartsWith("customField") Then
    Dim indexValue = Integer.Parse(index.Replace("customField", ""))
    Me.customFieldsDictionary(indexValue) = value
End If
NPoco supports materializing the data into dictionary:
var users = db.Fetch<Dictionary<string, object>>("select * from users");
You should be able to pass your class to the NPoco and force it to use the Fetch overload for dictionary.
I've used this approach before, but I can't find the source at the moment.
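To make the dictionary approach concrete, here is a minimal sketch of the mapping step in Java rather than VB.NET. It assumes each row arrives as a map from column name to value (as the NPoco dictionary fetch above would produce); the class `UserRow` and the method `fromRow` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

class UserRow {
    int userId;
    String userName;
    String userMail;
    // Custom field id -> value, filled from columns named "customField<id>".
    final Map<Integer, String> customFields = new HashMap<>();

    private static final String CUSTOM_PREFIX = "customField";

    // Builds a UserRow from one result row keyed by column name.
    static UserRow fromRow(Map<String, Object> row) {
        UserRow user = new UserRow();
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            String column = entry.getKey();
            Object value = entry.getValue();
            if (column.equals("UserID")) {
                user.userId = ((Number) value).intValue();
            } else if (column.equals("UserName")) {
                user.userName = (String) value;
            } else if (column.equals("UserMail")) {
                user.userMail = (String) value;
            } else if (column.startsWith(CUSTOM_PREFIX)) {
                int id = Integer.parseInt(column.substring(CUSTOM_PREFIX.length()));
                user.customFields.put(id, value == null ? null : value.toString());
            }
        }
        return user;
    }
}
```

Each row is still visited once, but the known columns and the custom fields are filled in the same pass over data that a single query already returned, so no extra query per user is needed.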

Adding a row to a table from the properties of a class

I have a class that represents a row of a database table. Its properties are the columns of the table. I add a new row to the table with the following code:
Public Sub AddRow(oTestRow As TestRow)
Dim sql As String
With oTestRow
sql = String.Format("INSERT INTO TestTable " &
"(ArtNr, ArtName, ArtName2, IsVal, CLenght) " &
"Values ('{0}', '{1}', '{2}', {3}, {4})",
.ArtNr, .ArtName, .ArtName2, .IsVal, .CLenght)
End With
Using oConn As New OleDbConnection(m_ConnString)
oConn.Open()
Using oInsertCmd As New OleDbCommand(sql, oConn)
oInsertCmd.ExecuteNonQuery()
End Using
End Using
End Sub
That is just an example, but my classes have around 30-40 properties, which makes for a very large and complex SQL string.
Creating, editing, or maintaining these SQL strings for many classes is error-prone.
I am wondering if there is a compact way to add a whole object instance (its properties, of course) to the table "TestTable" without writing such a large SQL string.
I created TestRow so that its properties are exactly the columns of the table "TestTable" (with the same names). But I did not find anything in ADO.NET that could be used for this.
If changing the DB system is an option, you may want to take a look at a document-based NoSQL solution like MongoDB or CouchDB or, especially for .NET, RavenDB, db4o, or Eloquera.
Here is a list of some of them.
For starters, inline queries are generally bad practice (unless circumstances demand them, e.g. the tables are already defined in the database and you don't have access to deploy stored procedures).
You have a few options. For example, instead of handwriting the classes, use Entity Framework, a better alternative to Linq2Sql.
If you want to stick with the tags in this question, I would design this to make the most of OO concepts. (This is a rough sketch, but I hope it helps.)
public class dbObject
    protected <type> ID --- This is important. If this has a value, commit will assume an update; otherwise an insert will be performed
    public property DBTableName // set the table name
    public property CommitStoredprocedure // the procedure on the database that can do commit work
    public property SelectStoredProcedure // the procedure used to retrieve the i
    public dbObject constructor (connection string or dbcontext etc)
        set dbConnection here
    end constructor
    public method commit
        reflect on this.properties available and prepare your commit string.
        If you are using a stored proc, ensure that you prepare named parameters and that the stored proc is defined with the same parameter names as your class property names. Also ensure that the stored proc will update if there is an ID value, or insert and return an ID when the ID value is not available
        Create ADO.net command and execute. (This sounds easy here, but you need to perfect this method)
    end method
end class
public class employee inherits dbObject
    // employee properties here
    public string name;
end employee
public class another inherits dbObject
    // another properties
    public bool isValid;
end another
usage:
employee e = new employee;
e.name = "John Smith";
e.commit();
console.WriteLine(e.id); // will be the id set by the commit method from the db
If you get the base class right (and well tested), this is automated and you shouldn't see errors.
You will need to extend the base class to retrieve records from the db based on an ID (if you want to instantiate objects from the db).
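The "reflect on the properties and prepare your commit string" step can be sketched as follows. This is a minimal illustration in Java rather than VB.NET, and it builds a parameterized INSERT instead of calling a stored procedure; `InsertBuilder`, `insertSql`, and `values` are hypothetical names, and the sample `TestRow` only borrows a few column names from the question.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.StringJoiner;

class InsertBuilder {
    // Returns the instance fields of a class, sorted by name so the column
    // order is deterministic (reflection does not guarantee an order).
    static List<Field> columns(Class<?> type) {
        List<Field> fields = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                f.setAccessible(true);
                fields.add(f);
            }
        }
        fields.sort(Comparator.comparing(Field::getName));
        return fields;
    }

    // Builds "INSERT INTO table (a, b) VALUES (?, ?)" from the field names.
    static String insertSql(String table, Class<?> type) {
        StringJoiner names = new StringJoiner(", ");
        StringJoiner marks = new StringJoiner(", ");
        for (Field f : columns(type)) {
            names.add(f.getName());
            marks.add("?");
        }
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + marks + ")";
    }

    // Reads the field values in the same order as the placeholders.
    static List<Object> values(Object row) {
        List<Object> out = new ArrayList<>();
        for (Field f : columns(row.getClass())) {
            try {
                out.add(f.get(row));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read " + f.getName(), e);
            }
        }
        return out;
    }
}

// Example row class; field names match the table's column names.
class TestRow {
    String ArtNr = "A1";
    String ArtName = "Widget";
    boolean IsVal = true;
}
```

The values would then be bound to the placeholders of a prepared statement in order, which also avoids formatting values directly into the SQL string as the code in the question does.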

Store strings of arbitrary length in PostgreSQL

I have a Spring application which uses JPA (Hibernate), initially created with Spring Roo. I need to store Strings of arbitrary length, so for that reason I've annotated the field with @Lob:
public class MyEntity{
    @NotNull
    @Size(min = 2)
    @Lob
    private String message;
    ...
}
The application works fine on localhost, but after I deployed it to an external server an encoding problem appeared. For that reason I'd like to check whether the data stored in the PostgreSQL database is correct. The application creates/updates the tables automatically, and for that field (message) it has created a column of type:
text NOT NULL
The problem is that after storing data, if I browse the table or just do a SELECT on that column, I don't see the text, only numbers. Those numbers seem to be identifiers pointing to "somewhere" where the information is actually stored.
Can anyone tell me exactly what these identifiers are, and whether there is any way to see the data stored in a @Lob column from pgAdmin or a SELECT statement?
Is there any better way to store Strings of arbitrary length in JPA?
Thanks.
I would recommend skipping the '@Lob' annotation and using columnDefinition like this:
@Column(columnDefinition="TEXT")
See if that helps with viewing the data while browsing the database itself.
Use the @Lob annotation; it is correct. The column stores an OID that points into the pg_largeobject table (Catalogs -> PostgreSQL -> Tables -> pg_largeobject).
The binary data is stored here efficiently and JPA will correctly get the data out and store it for you with this as an implementation detail.
Old question, but here is what I found when I encountered this:
http://www.solewing.org/blog/2015/08/hibernate-postgresql-and-lob-string/
Relevant parts below.
@Entity
@Table(name = "note")
@Access(AccessType.FIELD)
class NoteEntity {
    @Id
    private Long id;
    @Lob
    @Column(name = "note_text")
    private String noteText;
    public NoteEntity() { }
    public NoteEntity(String noteText) { this.noteText = noteText; }
}
The Hibernate PostgreSQL9Dialect stores @Lob String attribute values by explicitly creating a large object instance, and then storing the OID of the object in the column associated with the attribute.
Obviously, the text of our notes isn’t really in the column. So where is it? The answer is that Hibernate explicitly created a large object for each note, and stored the OID of the object in the column. If we use some PostgreSQL large object functions, we can retrieve the text itself.
Use this to query:
SELECT id,
       convert_from(loread(
           lo_open(note_text::int, x'40000'::int), x'40000'::int), 'UTF-8')
       AS note_text
FROM note
