I found a workaround for a problem I had, and I want to know whether it is valid. It is similar to: Grails Gorm : Object references an unsaved transient instance
Let's assume I have two domain objects (names changed to protect the guilty).
public class Shelf {
    String name
    Set<Book> books = [] as Set
    static hasMany = [books: Book]
}
and
public class Book {
    String title
    Shelf shelf
}
So this means that 1 Shelf contains 0 to many Books, and one Book can be on only one Shelf.
This Shelf is very large. And at some point, it contains 80,000 Books. All stored nicely in the DB. Of course, adding new Books is getting slower and slower.
This is done by:
Book book1 = new Book("Awesome Title")
existingShelf.addToBooks(book1)
existingShelf.save(flush: true) // super slow
This is slow. Mainly (I assume) because GORM has to confirm the other 80,000 records.
So I did this to try to work around the slow point.
Book book2 = new Book("Awesome Title 2")
book2.save(flush: true)
This gives me an "Object references an unsaved transient instance", which I guess makes sense - the "shelf" value is empty.
So I did something a little weird:
Book book3 = new Book("Awesome Title 3")
book3.shelf = new Shelf()
book3.shelf.id = <known/valid id here>
book3.save(flush: true)
This works. It saves. There are no referential errors. Further code that depends on this... works.
I just took a call that lasted minutes and reduced it to seconds.
But that seems too easy. I'm sure I worked around some Grails magic somehow, and probably broke something in the process.
Advice? Explanations?
Yes, using addTo* methods can be slow. If you look at the generated SQL you'll understand why. Doing the following:
new Book(title: "GORM Performance", shelf: grailsShelf).save()
will be faster, and there is technically nothing wrong with it. Just be aware that your instance of grailsShelf.books will not contain the new book until you've refreshed the collection from the database. This is part of what the addTo* method does for you.
Side note:
Set<Book> books = [] as Set
is unnecessary.
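The trade-off described in this answer (a cheap insert from the owning side versus an in-memory collection that goes stale) can be sketched outside GORM. The following is a minimal illustration in Python using the standard sqlite3 module, not Grails code; the table layout and the load_books helper are assumptions made up for the example.

```python
import sqlite3

# Hypothetical schema mirroring Shelf/Book; this is not what GORM generates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shelf (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, "
    "shelf_id INTEGER NOT NULL REFERENCES shelf(id))"
)
conn.execute("INSERT INTO shelf (id, name) VALUES (1, 'Grails')")
conn.executemany(
    "INSERT INTO book (title, shelf_id) VALUES (?, 1)",
    [("Book %d" % i,) for i in range(1000)],
)


def load_books(shelf_id):
    """Load the shelf's whole collection, as an ORM must to keep a Set in sync."""
    rows = conn.execute("SELECT title FROM book WHERE shelf_id = ?", (shelf_id,))
    return {title for (title,) in rows}


# A collection loaded earlier, playing the role of grailsShelf.books.
books_in_memory = load_books(1)

# Saving from the owning side: one INSERT that only needs the shelf id.
conn.execute(
    "INSERT INTO book (title, shelf_id) VALUES (?, ?)", ("GORM Performance", 1)
)

# The earlier snapshot does not know about the new row until it is reloaded.
stale = "GORM Performance" not in books_in_memory
books_in_memory = load_books(1)
fresh = "GORM Performance" in books_in_memory
print(stale, fresh)
```

The insert itself never touches the existing rows; only code that still holds the old collection needs to reload it.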
I am trying to write an app engine application for my university. What I am trying to achieve right now, is to create a method which takes in a Course name, and returns a list of all the CourseYears (think of that as being like a link table e.g. if Maths is the course, and it has Year 1, year 2 and Year 3; MathsYear1, MathsYear2 and MathsYear3 would be the names of the CourseYears).
This is the code for the module (WARNING: super dirty code below!):
@ApiMethod(name = "courseYears")
public ArrayList<CourseYear> courseYears(@Named("name") String name){
    DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
    Query.Filter keyFilter = new Query.FilterPredicate("name", Query.FilterOperator.EQUAL, name);
    Query query = new Query("Course").setFilter(keyFilter);
    PreparedQuery preparedQuery = datastore.prepare(query);
    List<Entity> resultList = preparedQuery.asList(FetchOptions.Builder.withLimit(1));
    Course course = ofy().load().type(Course.class).id(resultList.get(0).getKey().getId()).now();
    ArrayList<String> courseYearNames = course.getAllCourseYearNames();
    System.out.println(course.getName());
    ArrayList<CourseYear> courseYears = new ArrayList<CourseYear>();
    for(String courseYearName: courseYearNames){
        Query.Filter courseNameFilter = new Query.FilterPredicate("name", Query.FilterOperator.EQUAL, courseYearName);
        Query query2 = new Query("CourseYear").setFilter(courseNameFilter);
        List<Entity> resL = preparedQuery.asList(FetchOptions.Builder.withLimit(1));
        System.out.println("test");
        CourseYear courseYear = ofy().load().type(CourseYear.class).id(resL.get(0).getKey().getId()).now();
        courseYears.add(courseYear);
    }
    return courseYears;
}
It basically takes a Course name in, applies a filter on all courses to get the corresponding Course object, and then calls getAllCourseYearNames() on the course to get an array list containing all its CourseYears' names. (I would have loved to do this using Keys, but parameterised Objectify keys don't seem to be supported in this version of App Engine).
I then try and get the CourseYears by looping through the arraylist of names and applying the filter for each name. I print "test" each time to see how many times it is looping. Like I said, a super dirty way of doing it.
When I try passing a few course names as parameters, it loops the correct number of times only once or twice, and after that does not loop at all (it doesn't print "test"). I could understand it never looping, but not it working correctly once or twice and then never again. Even when it does loop, it doesn't return a list of CourseYears but rather the relevant number of NULLs; I don't know if this is relevant. I believe it retrieves the course successfully every time, as I print the name of the course after loading and this never fails.
If anyone has ANY suggestions for why this may be happening, I would be incredibly grateful to hear them!
Thanks
query2 is never used in your code. You reuse preparedQuery from your previous query, which runs on a different entity kind.
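To make the effect of that mistake concrete, here is a minimal sketch in Python against a made-up in-memory store. ENTITIES and prepare are hypothetical stand-ins, not the App Engine API. Re-running the first prepared query inside the loop keeps returning the Course entity, which would be consistent with the later CourseYear lookups coming back as nulls.

```python
# Hypothetical in-memory stand-in for a datastore; not the App Engine API.
ENTITIES = [
    {"kind": "Course", "name": "Maths"},
    {"kind": "CourseYear", "name": "MathsYear1"},
    {"kind": "CourseYear", "name": "MathsYear2"},
]


def prepare(kind, name):
    """Return a callable that runs the (kind, name) query when invoked."""
    def run():
        return [e for e in ENTITIES if e["kind"] == kind and e["name"] == name]
    return run


prepared_query = prepare("Course", "Maths")
year_names = ["MathsYear1", "MathsYear2"]

# Bug pattern: nothing new is prepared, so the Course query runs every time.
buggy = [prepared_query()[0] for _ in year_names]

# Fix: prepare and run a fresh query for each course-year name.
fixed = [prepare("CourseYear", n)()[0] for n in year_names]

print([e["kind"] for e in buggy])
print([e["kind"] for e in fixed])
```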
I have a model that looks like this:
class Report(models.Model):
    updater = models.CharField(max_length=15)
    pub_date = models.DateTimeField(auto_now_add=True)
    identifier = models.CharField(max_length=100)
    ... and so on...
There are some more fields but they are irrelevant to the question. Now the site has very simple functions - the users can see older reports and their data, and can edit them or add new ones.
However, the identifier field is actually an integer that identifies a log file that is being reported. Most of the time, each report has one log, but sometimes it has more than one. I made it a CharField because I built the site to replace an older SharePoint 2003 website, where that field was treated as simple text. In my next version I want it to be modelled the way it should be, i.e. like this:
class Report(models.Model):
    updater = models.CharField(max_length=15)
    pub_date = models.DateTimeField(auto_now_add=True)
    ... and so on...
class Log(models.Model):
    report = models.ForeignKey(Report)
    identifier = models.IntegerField()
The problem is, since in the old site that field was a CharField, people used this as they liked. Meaning, even if they updated various logs in the same report they just did it like this <logid1>, <logid2>. Sometimes they added some text <logid1> which is related to <logid2>.
So I want to change this, but I don't want to lose all the old data, and I can't fix all those edge cases (the DB contains around 22 thousand reports). I thought about adding this to report:
def disp_id(self):
    if self.pub_date < ...:  # the day I'll do the update
        return self.identifier
    else:
        return ', '.join([str(log.identifier) for log in self.log_set.all()])
But then I'm not really getting rid of the old field now am I? I'm just adding a new one and keeping the original null from a certain date.
As far as I know, what I want to do is impossible. I'm only asking because I know that maybe I'm not the first one to deal with this sort of thing and maybe there is a solution that I'm not aware of.
Hope my explanation is clear enough, thanks in advance!
class Report(models.Model):
    updater = models.CharField(max_length=15)
    pub_date = models.DateTimeField(auto_now_add=True)
    identifier = models.CharField(max_length=100, null=True)
    ... and so on...
    logs = models.ManyToManyField('Log', blank=True)
class Log(models.Model):
    identifier = models.IntegerField()
Create the models above, and then write a script along these lines:
ident_list = []
for report in Report.objects.all():
    identifiers = report.identifier.split(',')
    for ident in identifiers:
        ident = int(ident)  # raises ValueError for free-text entries; clean those up first
        if ident not in ident_list:
            log = Log.objects.create(identifier=ident)
            ident_list.append(ident)
        else:
            log = Log.objects.get(identifier=ident)
        report.logs.add(log)
Check the data before removing the identifier column from the Report table.
Does this solve your problem?
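The question mentions identifiers that contain free text as well as ids, which a plain int() conversion cannot handle. One possible way to cope, sketched here in plain Python without Django, is to pull out every run of digits. The helper name extract_log_ids is made up for this example, and it assumes that every number in the field really is a log id, which should be checked against the real data first.

```python
import re


def extract_log_ids(raw_identifier):
    """Return the distinct integer ids found in a free-text identifier, in order.

    Assumption: every run of digits in the text is a log id.
    """
    seen = []
    for match in re.findall(r"\d+", raw_identifier or ""):
        log_id = int(match)
        if log_id not in seen:
            seen.append(log_id)
    return seen


print(extract_log_ids("1041, 1042"))                     # [1041, 1042]
print(extract_log_ids("1041 which is related to 1042"))  # [1041, 1042]
print(extract_log_ids("n/a"))                            # []
```

Reports for which the helper returns an empty list could be left with only their old text identifier and reviewed by hand.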
I've been asked to set up a process which monitors the active directory, specifically certain accounts, to check that they are not locked so that should this happen, the support team can get an early warning.
I've found some code to get me started which basically sets up requests and adds them to a notification queue. This event is then assigned to a change event and has an ObjectChangedEventArgs object passed to it.
Currently, it iterates through the attributes and writes them to a text file, as so:
private static void NotifierObjectChanged(object sender,
    ObjectChangedEventArgs e)
{
    if (e.ResultEntry.Attributes.AttributeNames == null)
    {
        return;
    }
    // write the data for the user to a text file...
    using (var file = new StreamWriter(@"C:\Temp\UserDataLog.txt", true))
    {
        file.WriteLine("{0} {1}", DateTime.UtcNow.ToShortDateString(), DateTime.UtcNow.ToShortTimeString());
        foreach (string attrib in e.ResultEntry.Attributes.AttributeNames)
        {
            foreach (object item in e.ResultEntry.Attributes[attrib].GetValues(typeof(string)))
            {
                file.WriteLine("{0}: {1}", attrib, item);
            }
        }
    }
}
What I'd like is to check the object and, if a specific field such as name has a specific value, check whether the IsAccountLocked attribute is True; otherwise skip the record and wait until the next notification comes in. I'm struggling to work out how to access specific attributes of the ResultEntry without having to iterate through them all.
I hope this makes sense - please ask if I can provide any additional information.
Thanks
Martin
This could get gnarly depending upon your exact business requirements. If you want to talk in more detail ping me offline and I'm happy to help over email/phone/IM.
So the first thing I'd note is that, depending upon what the query looks like before this, this could be quite expensive or error prone (i.e. missing results). This worries me somewhat, as most sample code out there gets this wrong. :) How are you getting things that have changed? While this sounds simple, it is actually a somewhat tricky question in directory land, given the semantics supported by AD and the fact that it is a multi-master system where writes happen all over the place (and replicate in after the fact).
Other variables would be things like how often you're going to run this, how large the data set could be in AD, and so on.
AD has some APIs built to help you here (the big one that comes to mind is called DirSync) but this can be somewhat complicated if you haven't used it before. This is where the "ping me offline" part comes in.
To your exact question, I'm assuming your result is actually a SearchResultEntry (if not I can revise, tell me what you have in hand). If that is the case then you'll find an Attributes field hanging off of that guy, and from there there is AttributeNames and Values. I think you'll see how it works from there if you have Values in hand, for example:
foreach (var attr in sre.Attributes.Values)
{
    var da = (DirectoryAttribute)attr;
    Console.WriteLine(da.Name);
    foreach (var val in da.GetValues(typeof(byte[])))
    {
        // Handle a byte[] val ...
    }
}
As I said, if you have something other than a SearchResultEntry in hand, let us know and I can revise the code sample.
I'm making a Windows Phone 7.1 application, and I'm having a lot of trouble submitting changes to my database. Here is the structure of the tables in my database:
Day <-1-----*-> TrainingSession <-many-----1-> Sport
So, a single day can have many training sessions, and a training session has one sport. A single sport can naturally be in many different training sessions.
The primary keys look like this:
Day - DateTime
TrainingSession - int (DB generated)
Sport - nvarchar(200)
Sports will simply have attributes sportName, and an iconFileName.
I've set up associations by putting an EntitySet<TrainingSession> in both Day and Sport, and TrainingSession has an EntityRef<Day> and an EntityRef<Sport>. I'm not 100% sure if Sport needs the EntitySet, so please correct me if I'm wrong. For the moment, I just hard-coded some sports in my Sport class for testing, and you'll see me retrieving an ObservableCollection<SportArt> to get those out.
Here is how I am trying to create a collection of days with training sessions, each training session having different sports:
public void CreateDay(DateTime date)
{
    FitPlanDataContext calendarDatabase = new FitPlanDataContext(FitPlanDataContext.ConnectionString);
    DateTime firstDate = new DateTime(date.Year, date.Month, 1);
    DayItem dayItem = new DayItem();
    dayItem.DateTime = firstDate;
    fillTestDayItemWithRandomData(dayItem);
    calendarDatabase.DayItems.InsertOnSubmit(dayItem);
    calendarDatabase.SubmitChanges();
}
private void fillTestDayItemWithRandomData(DayItem dayItem)
{
    ObservableCollection<SportArt> sportArtCollection = SportArtController.GetAllSports();
    dayItem.TrainingSessions = new EntitySet<TrainingSession>();
    ObservableCollection<TrainingSession> trainingSessionCollection = new ObservableCollection<TrainingSession>();
    TrainingSession trainingSession1 = new TrainingSession();
    trainingSession1.DayItem = dayItem;
    trainingSession1.SportArt = sportArtCollection[1];
    trainingSessionCollection.Add(trainingSession1);
    TrainingSession trainingSession2 = new TrainingSession();
    trainingSession2.DayItem = dayItem;
    trainingSession2.SportArt = sportArtCollection[2];
    trainingSessionCollection.Add(trainingSession2);
    FitPlanDataContext calendarDatabase = new FitPlanDataContext(FitPlanDataContext.ConnectionString);
    calendarDatabase.TrainingSessions.InsertAllOnSubmit<TrainingSession>(trainingSessionCollection);
}
This code is not working for me, and it is giving me the following error:
NotSupportedException was Unhandled:
An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext. This is not supported.
Before I got this error, I was also getting NullReferenceExceptions.
I've been looking around for a solution, and I saw that some people used Detach or workarounds with Attach, but I haven't figured out how to apply that to my code. Could anyone give me a helping hand with this?
Also, I thought the NullReferenceException could be coming from the fact that I'm not saving any sports to the database, could this be so?
So I messed around with it a lot, and today I finally found the solution I was looking for.
It seems I asked the question wrong. I didn't include the query from the database, which is probably important to add. I actually omitted a lot of the code to keep things simple in my question, but looks like I omitted too much.
Anyways, it turned out the way I setup the database structure was correct, and nothing had to be changed there.
So here's what I did to get it working:
-The call to the method that fills the day with training sessions needed to go after submitting the changes for the day. This is because days have training sessions, and I can't save training sessions without the day already being in the database.
-I added using statements around the places where I need the datacontext, instead of just creating an instance of the datacontext in a local variable. This ensures that the datacontext lives only within the scope of the using statement.
(I changed the DateTime of the day to be the date given as the parameter to the method)
public void CreateDay(DateTime date)
{
    DayItem dayItem = new DayItem();
    dayItem.DateTime = date;
    using (FitPlanDataContext calendarDatabase = new FitPlanDataContext(FitPlanDataContext.ConnectionString))
    {
        calendarDatabase.DayItems.InsertOnSubmit(dayItem);
        calendarDatabase.SubmitChanges();
    }
    fillTestDayItemWithRandomData(dayItem);
}
Then, the changes to the method that fills the day with training sessions go like this:
-I open a using statement where I instantiate a new datacontext. Then I access the database to retrieve a list of all the sports, and also the day that I need to update. I find the day I need to update by dayItemParameter. (Remember that retrieving from the database will give you a collection.)
-I create my new training sessions and fill their properties. Note that the day I retrieved from the database is the value of a training session's property because the training session is a child of day, and needs to know who its parent day is.
-I removed the instantiation of EntitySet because I realized that I already instantiate it in the constructor of the DayItem class.
-Lastly, I add all the new training sessions into a collection, and save them all to the database at once using InsertAllOnSubmit(collection).
private void fillTestDayItemWithRandomData(DayItem dayItemParameter)
{
    using (FitPlanDataContext calendarDatabase = new FitPlanDataContext(FitPlanDataContext.ConnectionString))
    {
        ObservableCollection<SportArt> sportArtCollection;
        var sportArts = (from SportArt sportArt in calendarDatabase.SportArts
                         select sportArt);
        sportArtCollection = new ObservableCollection<SportArt>(sportArts);
        ObservableCollection<DayItem> dayItemCollection;
        var dayItems = (from DayItem dayItem in calendarDatabase.DayItems
                        where dayItem.DateTime == dayItemParameter.DateTime
                        select dayItem);
        dayItemCollection = new ObservableCollection<DayItem>(dayItems);
        DayItem foundDayItem = dayItemCollection[0];
        ObservableCollection<TrainingSession> trainingSessionCollection = new ObservableCollection<TrainingSession>();
        TrainingSession trainingSession1 = new TrainingSession();
        trainingSession1.DayItem = foundDayItem;
        trainingSession1.SportArt = sportArtCollection[1];
        trainingSessionCollection.Add(trainingSession1);
        TrainingSession trainingSession2 = new TrainingSession();
        trainingSession2.DayItem = foundDayItem;
        trainingSession2.SportArt = sportArtCollection[2];
        trainingSessionCollection.Add(trainingSession2);
        calendarDatabase.TrainingSessions.InsertAllOnSubmit<TrainingSession>(trainingSessionCollection);
        calendarDatabase.SubmitChanges();
    }
}
Conclusion:
The main problem I was having was that I was trying to save training sessions to a day that wasn't submitted to the database. The next big problem (that I think many others have) is that reading and updating of an entity has to be in the same datacontext. So, you can't create a datacontext to retrieve a day, then use another datacontext to add a training session to that day (even if you saved the value of the day to a local variable). You need to retrieve the day and save training sessions to it all in the same data context.
At the moment, my application is working, but it is quite sluggish. In this question I'm asking about just one day, but in my actual program I'm creating hundreds of days, which means a lot of opening and closing of the database. If anyone has suggestions for how I can optimize the process, I'm all ears.
I realize and apologize that this post got so long, but writing it helped me to understand the situation with more depth, and I really hope that it'll help others too.
In Google App Engine, I make lists of referenced properties much like this:
class Referenced(BaseModel):
    name = db.StringProperty()
class Thing(BaseModel):
    foo_keys = db.ListProperty(db.Key)
    def __getattr__(self, attrname):
        if attrname == 'foos':
            return Referenced.get(self.foo_keys)
        else:
            return BaseModel.__getattr__(self, attrname)
This way, someone can have a Thing and say thing.foos and get something legitimate out of it. The problem comes when somebody says thing.foos.append(x). This will not save the added property because the underlying list of keys remains unchanged. So I quickly wrote this solution to make it easy to append keys to a list:
class KeyBackedList(list):
    def __init__(self, key_class, key_list):
        list.__init__(self, key_class.get(key_list))
        self.key_class = key_class
        self.key_list = key_list
    def append(self, value):
        self.key_list.append(value.key())
        list.append(self, value)
class Thing(BaseModel):
    foo_keys = db.ListProperty(db.Key)
    def __getattr__(self, attrname):
        if attrname == 'foos':
            # The keys point at Referenced entities, so fetch with that class.
            return KeyBackedList(Referenced, self.foo_keys)
        else:
            return BaseModel.__getattr__(self, attrname)
This is great as a proof of concept, in that it works exactly as expected when calling append. However, I would never give this to other people, since they might mutate the list in other ways (thing.foos[1:9] = whatevs or thing.foos.sort()). Sure, I could go and define __setslice__ and whatnot, but that seems to leave me open to obnoxious bugs. Still, that is the best solution I can come up with.
Is there a better way to do what I am trying to do (something in the Python library perhaps)? Or am I going about this the wrong way and trying to make things too smooth?
If you want to modify things like this, you shouldn't be changing __getattr__ on the model; instead, you should write a custom Property class.
As you've observed, though, creating a workable 'ReferenceListProperty' is difficult and involved, and there are many subtle edge cases. I would recommend sticking with the list of keys, and fetching the referenced entities in your code when needed.
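The second recommendation (keep the key list as the only mutable state and fetch entities on demand) might look roughly like the sketch below. It is plain Python, not the App Engine db API: STORE, fetch and get_foos are hypothetical names standing in for the datastore, a batch get and the accessor. Returning a tuple is one way to make accidental mutation of the fetched entities fail loudly rather than be silently lost.

```python
# Hypothetical in-memory store standing in for the datastore: key -> entity.
STORE = {"k1": "foo-one", "k2": "foo-two", "k3": "foo-three"}


def fetch(keys):
    """Stand-in for a batch get: return the entities for the keys, in order."""
    return [STORE[k] for k in keys]


class Thing:
    def __init__(self, foo_keys=None):
        # The key list stays the single source of truth.
        self.foo_keys = list(foo_keys or [])

    def get_foos(self):
        """Fetch the referenced entities on demand as an immutable tuple."""
        return tuple(fetch(self.foo_keys))


thing = Thing(["k1", "k2"])
thing.foo_keys.append("k3")       # all mutations go through the keys
print(thing.get_foos())

try:
    thing.get_foos().append("x")  # a tuple has no append, so this raises
except AttributeError as exc:
    print("rejected:", exc)
```

Callers who want to reorder or slice work on thing.foo_keys directly, so there is no second list to keep in sync.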