Amazon SimpleDB Query to Find "Post By Friends" - database

I have been developing an iPhone app which queries a server that relays data I store in Amazon SimpleDB. I have a database table of "Submissions" by various users. I am interfacing with Facebook to retrieve Facebook Friends and wish to make a query to "Submissions" to find posts by friends - like:
SELECT * FROM submissions WHERE userID = '00123' OR userID = '00124' OR ....
(through complete list of friends)
I think this will run into an Amazon query limit with this kind of select statement -
[Maximum number of comparisons per Select expression: 20]
Can you think of a way to elegantly pull this off with SimpleDB?
I'd rather not have to do a bunch of 20 person queries.
Or, do I need to move to a different database package and then do cross-table queries?
Thanks!

There is a way to do it with SimpleDB, but it isn't elegant. It's more of a hack, since it requires you to artificially duplicate the userID attribute in your submission items.
It's based on the fact that although you can only have 20 comparisons per IN predicate, you can have 20 IN predicates if they each name different attributes. So add additional synthetic attributes to your submission items of the form:
userID='00123' userID_2='00123' userID_3='00123' userID_4='00123' ... userID_20='00123'
They all have the identical value for a given submission. Then you can fetch the submission of up to 400 friends with a single query:
SELECT * FROM submissions
WHERE userID IN('00120','00121',...,'00139') OR
`userID_2` IN('00140','00141',...,'00159') OR
`userID_3` IN('00160','00161',...,'00179') OR
`userID_4` IN('00180','00181',...,'00199') OR
...
`userID_20` IN('00300','00301',...,'00319')
You can populate the 19 extra attributes at the time the submission is created (if you have the attributes to spare) and it doesn't sound like a submission's user would ever change. Also you may want to explicitly name the attributes to be returned (instead of using * ) since you would now have 19 of them that you don't care about in the return data set.
From the data model point of view, this is clearly a hack. Having said that, it gives you exactly what you would want for users with 400 friends or fewer: a single query, so you can restrict by date or other criteria, sort by most recent, page through results, etc. Unfortunately, a capacity of 400 won't accommodate the friend lists of all Facebook users, so you may still need to implement a multi-query solution for large friend lists.
My suggestion is that if SimpleDB suits the needs of your app with the exception of this issue, then consider using the hack. But if you need to do things like this repeatedly then SimpleDB is probably not the best choice.
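As a rough sketch of how the select expression above could be assembled from a friend list, the helper below spreads the IDs over the duplicated attributes, 20 per IN predicate. The class and method names are made up for illustration, the attribute names follow the userID, userID_2, ... userID_20 pattern from this answer, and the quote-doubling escape is an assumption that should be checked against the SimpleDB documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any SimpleDB SDK.
public static class FriendQueryBuilder
{
    private const int MaxValuesPerPredicate = 20; // values per IN predicate
    private const int MaxPredicates = 20;         // userID, userID_2 ... userID_20

    // Attribute for the n-th predicate (0-based): userID, `userID_2`, `userID_3`, ...
    private static string AttributeName(int index)
    {
        return index == 0 ? "userID" : "`userID_" + (index + 1) + "`";
    }

    public static string Build(IReadOnlyList<string> friendIds)
    {
        if (friendIds.Count == 0)
            throw new ArgumentException("At least one friend ID is required.", nameof(friendIds));
        if (friendIds.Count > MaxValuesPerPredicate * MaxPredicates)
            throw new ArgumentException("Too many friend IDs for a single query.", nameof(friendIds));

        var predicates = new List<string>();
        for (int i = 0; i * MaxValuesPerPredicate < friendIds.Count; i++)
        {
            // Take the next slice of up to 20 IDs and quote each value
            // (assumption: a single quote inside a value is escaped by doubling it).
            var values = friendIds
                .Skip(i * MaxValuesPerPredicate)
                .Take(MaxValuesPerPredicate)
                .Select(id => "'" + id.Replace("'", "''") + "'");
            predicates.Add(AttributeName(i) + " IN(" + string.Join(",", values) + ")");
        }
        return "SELECT * FROM submissions WHERE " + string.Join(" OR ", predicates);
    }
}
```

Friend lists longer than 400 entries are rejected here on purpose; they would have to fall back to several queries, as noted above.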

You need either an IN clause or a join to a temp table. Unfortunately, Amazon SimpleDB has its limitations. We abandoned it on a promising project for this very reason. We went down the path of multithreading and using the NextToken functionality before we switched gears.
You could execute parallel (multithreaded) queries to SimpleDB to get submissions, each query looking for up to 20 user IDs and then merging the results into one list. Still, it's probably time to consider a switch to MySQL or SQL Server to be able to upload a list of IDs as a temp table and then do a simple join to get the results.
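The parallel approach could look roughly like the sketch below. It only shows the batching and merging; the actual SimpleDB call is left as a hypothetical `queryBatch` delegate supplied by the caller, and all names are illustrative rather than taken from any SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: splits the ID list into batches of 20 and merges the results.
public static class BatchedSelect
{
    public const int BatchSize = 20;

    // queryBatch is an assumption: it runs one select for up to 20 user IDs
    // (following NextToken internally if needed) and returns the matching items.
    public static async Task<List<TItem>> RunAsync<TItem>(
        IReadOnlyList<string> userIds,
        Func<IReadOnlyList<string>, Task<IReadOnlyList<TItem>>> queryBatch)
    {
        var batches = new List<IReadOnlyList<string>>();
        for (int offset = 0; offset < userIds.Count; offset += BatchSize)
        {
            batches.Add(userIds.Skip(offset).Take(BatchSize).ToList());
        }

        // ToList() starts every query before any of them is awaited.
        var tasks = batches.Select(queryBatch).ToList();
        IReadOnlyList<TItem>[] results = await Task.WhenAll(tasks);

        // Results come back in batch order; any global sort has to be redone here.
        return results.SelectMany(r => r).ToList();
    }
}
```

Because each batch is sorted and paged independently by the server, ordering by date or limiting the total number of results has to happen on the merged list.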

I created the Simple Savant .NET library for SimpleDB, and I happen to have some utility code lying around for splitting and running in parallel multiple select queries, while limiting the IN clause of each select to 20 values. I'll probably roll this code into the next Savant release, but here it is for anyone who finds it useful:
/// <summary>
/// Invokes select queries that use parameter lists (with IN clauses) by splitting the parameter list
/// across multiple invocations that are invoked in parallel.
/// </summary>
/// <typeparam name="T">The item type</typeparam>
/// <typeparam name="P">The select parameter type</typeparam>
/// <param name="savant">The savant instance.</param>
/// <param name="command">The command.</param>
/// <param name="paramValues">The param values.</param>
/// <param name="paramName">Name of the param.</param>
/// <returns></returns>
public static List<T> SelectWithList<T,P>(ISimpleSavantU savant, SelectCommand<T> command, List<P> paramValues, string paramName)
{
var allValues = SelectAttributesWithList(savant, command, paramValues, paramName);
var typedValues = new List<T>();
foreach (var values in allValues)
{
typedValues.Add((T)PropertyValues.CreateItem(typeof (T), values));
}
return typedValues;
}
/// <summary>
/// Invokes select queries that use parameter lists (with IN clauses) by splitting the parameter list
/// across multiple invocations that are invoked in parallel.
/// </summary>
/// <typeparam name="P">The select parameter type</typeparam>
/// <param name="savant">The savant instance.</param>
/// <param name="command">The command.</param>
/// <param name="paramValues">The param values.</param>
/// <param name="paramName">Name of the param.</param>
/// <returns></returns>
public static List<PropertyValues> SelectAttributesWithList<P>(ISimpleSavantU savant, SelectCommand command, List<P> paramValues, string paramName)
{
Arg.CheckNull("savant", savant);
Arg.CheckNull("command", command);
Arg.CheckNull("paramValues", paramValues);
Arg.CheckNullOrEmpty("paramName", paramName);
var allValues = new List<PropertyValues>();
if (paramValues.Count == 0)
{
return allValues;
}
var results = new List<IAsyncResult>();
do
{
var currentParams = paramValues.Skip(results.Count * MaxValueTestsPerSimpleDbQuery).Take(MaxValueTestsPerSimpleDbQuery).ToList();
if (!currentParams.Any())
{
break;
}
var currentCommand = Clone(command);
currentCommand.Reset();
var parameter = currentCommand.GetParameter(paramName);
parameter.Values.Clear();
parameter.Values.AddRange(currentParams.Select(e => (object)e));
var result = savant.BeginSelectAttributes(currentCommand, null, null);
results.Add(result);
} while (true);
foreach (var result in results)
{
var values = ((ISimpleSavant2)savant).EndSelectAttributes(result);
allValues.AddRange(values);
}
return allValues;
}
private static SelectCommand Clone(SelectCommand command)
{
var newParameters = new List<CommandParameter>();
foreach (var parameter in command.Parameters)
{
var newParameter = new CommandParameter(parameter.Name, parameter.PropertyName, null);
newParameter.Values.Clear();
newParameters.Add(newParameter);
}
var newCommand = new SelectCommand(command.Mapping, command.CommandText, newParameters.ToArray())
{
MaxResultPages = command.MaxResultPages
};
return newCommand;
}

Related

Entity Framework 6 code-first migration - starting with the CreateDatabaseIfNotExists initializer

I have started an EF6 project to store measurement results from analytical instruments. Each instrument has a built-in PC with its own results database.
Initially, the database initializer CreateDatabaseIfNotExists was used. On database creation, it creates an entry in the __MigrationHistory table with a non-unique MigrationId entry (the timestamp differs from instrument to instrument, e.g. 201706011336597_InitialCreate); the ContextKey is the fully qualified type name of my derived DbContext.
After a while, it was decided to add more result data to the database. Fortunately, only three new tables are required. There are no changes in the existing tables.
For that, I wanted to use the MigrateDatabaseToLatestVersion initializer. But I have to support the following two scenarios:
1. Existing database with the non-unique MigrationId, which has to be migrated to the extended version with the three new tables.
2. No database; create the database with the MigrateDatabaseToLatestVersion initializer.
How can I do this?
I have created an initial migration using the add-migration PM console command from the initial DbContext. That works well with scenario 2 (no database exists). From that starting point I can update my DbContext and create a new migration with the three new tables.
But how do I support scenario 1? The Up() method of the initial migration contains the table creation code, which is not necessary because the tables already exist. Is an empty migration (add-migration -IgnoreChanges) helpful, maybe with a later timestamp than the initial migration?
Note: I have no access from the PM console to the target database(s), only on my developer machine to a test database.
Thanks and best regards
Karsten
Update:
I have modified the created initial migration with the static flag TablesAlreadyCreated.
public partial class InitialMigraCreate : DbMigration
{
    /// <summary>
    /// Set this field to true, if the tables are already created by the
    /// CreateDatabaseIfNotExists database initializer. Then, the Up()
    /// and Down() methods do nothing, but the
    /// migration is added to the __MigrationHistory table.
    /// </summary>
    public static bool TablesAlreadyCreated = false;
    public override void Up()
    {
        if (TablesAlreadyCreated)
            return;
        // several CreateTable calls here
    }
    /// <inheritdoc/>
    public override void Down()
    {
        if (TablesAlreadyCreated)
            return;
        // several Drop... calls here
    }
}
I have also implemented a new database initializer class as follows:
public class MigrateDatabaseToLatestVersionEx<TContext, TMigrationsConfiguration> : IDatabaseInitializer<TContext>
where TContext : DbContext
where TMigrationsConfiguration : DbMigrationsConfiguration<TContext>, new()
{
...
/// <inheritdoc />
public virtual void InitializeDatabase(TContext context)
{
if (context == null)
throw new ArgumentNullException("context");
// check whether a first migration exists from the CreateDatabaseIfNotExists database initializer
var firstConfig = new ConfigurationAutoCreatedDatabase();
firstConfig.TargetDatabase = _config.TargetDatabase;
var firstMigrator = new DbMigrator(firstConfig);
var firstDbMigrations = firstMigrator.GetDatabaseMigrations();
// create the default migrator with the current configuration
var migrator = new DbMigrator(_config);
if (1 == firstDbMigrations.Count(migra => migra.EndsWith("_InitialCreate", StringComparison.InvariantCultureIgnoreCase)))
{ // This database was created using the CreateDatabaseIfNotExists database initializer.
// That's an indication whether it's an old database
// Do the custom migration here!
InitialMigraCreate.TablesAlreadyCreated = true;
migrator.Update();
}
else
{ // do the default migration the database was created with this MigrateDatabaseToLatestVersionEx initializer
InitialMigraCreate.TablesAlreadyCreated = false;
migrator.Update();
}
}
}
It checks, whether the initial migration entry is from the CreateDatabaseIfNotExists initializer and disables the table creation/drop calls in the Up()/Down() methods in that case. ConfigurationAutoCreatedDatabase is a manually created derived DbMigrationsConfiguration class:
internal sealed class ConfigurationAutoCreatedDatabase : DbMigrationsConfiguration<MyNamespace.MyDbContext>
{
/// <summary>
/// Creates a <c>ConfigurationAutoCreated</c> object (default constructor).
/// </summary>
public ConfigurationAutoCreatedDatabase()
{
this.AutomaticMigrationsEnabled = false;
this.AutomaticMigrationDataLossAllowed = false;
this.ContextKey = "MyNamespace.MyDbContext";
}
}
So, it works for both scenarios. I hope that helps others with a similar problem. It would be interesting to know whether there is an out-of-the-box EF workflow for that task.
This is a very common scenario that EF handles nicely. The non-migration initializers (CreateDatabaseIfNotExists, etc) should be used in very early development (when you don't care about the data except for stuff that is seeded).
Once you are switching to migrations you should generate a baseline migration that takes a snapshot of your current model as you indicate (add-migration MyStartPoint -IgnoreChanges). This adds a migration with no Up() code and stores the current state of your code first model so that when you change the model only those changes are reflected. You could accomplish the same thing by commenting out the items that exist from the Up() code.
Now when you run against an existing database, it will check __MigrationHistory to see which migrations have been applied. If the database does not exist, it will be created. See here and here for more info.
Not sure what you are talking about with the MigrationId. EF handles that automatically unless you change your namespace (there is a workaround for that as well).

2sxc Blog app search only index page 1

Using the 2sxc Blog app, DNN only indexes whatever is on the first page of the Blog page.
The second page onwards is not indexed, and hence doesn't show up in search results.
Can anyone help?
This looks like a good question, and it's probably something we haven't thought about yet. Google doesn't care, but the internal search would probably "respect" the paging and only pick up the first page.
I can think of a few quick-fixes but they would be tricky to explain here. Please open an issue on the blog app on github.
Thanks a lot @iJungleBoy for the help.
For anyone else encountering this issue, here's the solution:
Amend the visual query to create another stream, for example "SearchIndex".
Once that's done, amend the query within your template which gets all the list items and has paging.
@functions{
    // Prepare the data - get all categories through the pipeline
    public override void CustomizeData()
    {
    }
    /// <summary>
    /// Populate the search - ensure that each entity has an own url/page
    /// </summary>
    /// <param name="searchInfos"></param>
    /// <param name="moduleInfo"></param>
    /// <param name="startDate"></param>
    public override void CustomizeSearch(Dictionary<string, List<ISearchInfo>> searchInfos, ModuleInfo moduleInfo, DateTime startDate)
    {
        // Index every item of the unpaged "SearchIndex" stream and give each its own URL
        foreach (var si in searchInfos["SearchIndex"])
        {
            si.QueryString = "post=" + AsDynamic(si.Entity).UrlKey;
        }
    }
}

In SSDT Schema Compare how do I ignore differences in objects of type "Schema"

From the Schema Compare Options, I deselected all Object Types:
It still shows me differences in Schema objects:
I scrolled through the big list of General options, and none of them appeared to do this:
I hacked it. If you save the compare, you can add this to the file:
<PropertyElementName>
<Name>Microsoft.Data.Tools.Schema.Sql.SchemaModel.SqlSchema</Name>
<Value>ExcludedType</Value>
</PropertyElementName>
You'll see where when you open it. This setting is not in the UI, but is apparently supported.
You can set the exclude schema in code by running the below as an exe before doing the schema merge. The below code needs the Microsoft.SqlServer.DacFx nuget package to be added to your project. It takes 2 parameters, one is the .scmp file path and second is comma separated string of schemas to exclude. It will overwrite the .scmp supplied and exclude the schema names you provided.
It essentially adds XML sections to the .scmp file that are equivalent to unchecking objects in the UI and saving the file (a persisted preference).
This exe execution can be a task in your VSTS (VSO) release pipeline, if you want to exclude one schema from being merged during deployment.
using System;
using System.Linq;
using System.Collections.Generic;
using Microsoft.SqlServer.Dac.Compare;
namespace DatabaseSchemaMergeHelper
{
/// <summary>
/// Iterates through a supplied schema compare file and excludes objects belonging to a supplied list of schema
/// </summary>
class Program
{
/// <summary>
/// first argument is the scmp file to update, second argument is comma separated list of schemas to exclude
/// </summary>
/// <param name="args"></param>
static void Main(string[] args)
{
if (args.Length < 2) return; // both the .scmp path and the schema list are required
var scmpFilePath = args[0];
var listOfSchemasToExclude = args[1].Split(',').ToList();
// load comparison from Schema Compare (.scmp) file
var comparison = new SchemaComparison(scmpFilePath);
var comparisonResult = comparison.Compare();
// find changes pertaining to objects belonging to the supplied schema exclusion list
var listOfDifferencesToExclude = new List<SchemaDifference>();
// add those objects to a list
foreach (SchemaDifference difference in comparisonResult.Differences)
{
if (difference.TargetObject != null &&
difference.TargetObject.Name != null &&
difference.TargetObject.Name.HasName &&
listOfSchemasToExclude.Contains(difference.TargetObject.Name.Parts[0], StringComparer.OrdinalIgnoreCase))
{
listOfDifferencesToExclude.Add(difference);
}
}
// add the needed exclusions to the .scmp file
foreach (var diff in listOfDifferencesToExclude)
{
if (diff.SourceObject != null)
{
var SourceExclusionObject = new SchemaComparisonExcludedObjectId(diff.SourceObject.ObjectType, diff.SourceObject.Name,
diff.Parent?.SourceObject.ObjectType, diff.Parent?.SourceObject.Name);
comparison.ExcludedSourceObjects.Add(SourceExclusionObject);
}
var TargetExclusionObject = new SchemaComparisonExcludedObjectId(diff.TargetObject.ObjectType, diff.TargetObject.Name,
diff.Parent?.TargetObject.ObjectType, diff.Parent?.TargetObject.Name);
comparison.ExcludedTargetObjects.Add(TargetExclusionObject);
}
// save the file, overwrites the existing scmp.
comparison.SaveToFile(scmpFilePath, true);
}
}
}
If you right-click on the top-level nodes (Add, Change, Delete), you can choose "Exclude All" to uncheck all elements of that type. This will at least quickly get you to a state where everything is unchecked.

ADO.NET EF + CF: How to use the existing CF context and bind SQL external objects (views, stored procedures, etc.)?

I have a database, 90% created with EF 4.1 + Code First approach on a SQL Server 2012; the rest is generated by some SQL code (FUNCTIONS, COMPUTED COLUMNS, VIEWS, INDEXES, ETC.).
Now, I needed to use ObjectContext and at the same time optimize performance, so I created some SQL views directly in the db, which basically do some calculations (count, mix, sum, etc.) on the tables already generated by CF.
I'd like to use the above "external" SQL views inside my solution, possibly pointing to the same connection string as my CF context and using the same repository I created.
I succeeded in creating an ADO.NET EDM of the views (is this the right approach?), so now I have the Entity Model generated from the db.
For the reasons described above, at first I used the existing data connection and chose not to save the additional connection string in my web.config.
Now I have the edmx containing myModel and myModel.Store of the "external" views. For example, here's an extract of mymodel.Designer.cs, which seems to be the standard one I've seen in other edmx of other projects:
public partial class Entities : ObjectContext
{
#region Constructors
/// <summary>
/// Initializes a new Entities object using the connection string found in the 'Entities' section of the application configuration file.
/// </summary>
public Entities() : base("name=Entities", "Entities")
{
this.ContextOptions.LazyLoadingEnabled = true;
OnContextCreated();
}
/// <summary>
/// Initialize a new Entities object.
/// </summary>
public Entities(string connectionString) : base(connectionString, "Entities")
{
this.ContextOptions.LazyLoadingEnabled = true;
OnContextCreated();
}
/// <summary>
/// Initialize a new Entities object.
/// </summary>
public Entities(EntityConnection connection) : base(connection, "Entities")
{
this.ContextOptions.LazyLoadingEnabled = true;
OnContextCreated();
}
#endregion
.............
I'd like to query the "external" entities. I did several tests, but I did not succeed.
Could you tell me the right approach to the problem, please?
1) This is one of the tests I made. In this case I get an exception "The specified named connection is either not found in the configuration, not intended to be used with the EntityClient provider, or not valid":
public class TManagerRepository : ITManagerRepository, IDisposable
{
private TManagerContext context; // the context pointing to CF entities
private TManager.Models.Entities.SQL_Views.Entities entities; // the context pointing to the SQL views by the EDM
public TManagerRepository(TManagerContext context)
{
this.context = context;
this.entities = new TManager.Models.Entities.SQL_Views.Entities();
var test = (from d in this.entities.myview
select d);
}
2) Then I tried to make a specific connection too, but I get an exception which says "Could not find the conceptual model to validate".
Thank you very much for your precious help!
Best Regards
You cannot use the same connection string for the Code First approach and for EDMX. The required connection strings have different formats. So if you don't want to store the connection string for the EDMX context in your configuration file, you must build it manually.
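Building it manually could look something like the sketch below, which wraps a plain SQL Server connection string in the metadata/provider format that an EDMX-based ObjectContext expects. The helper name and the `modelName` parameter are made up for illustration, and the `res://*/...` resource names are an assumption: they depend on the .edmx file name and on the folder it sits in, so a "Could not find the conceptual model" error usually means these paths need adjusting.

```csharp
using System;

// Hypothetical helper for composing an EntityClient-style connection string.
public static class EdmxConnectionString
{
    // modelName: name of the .edmx file without extension (assumed to be
    // embedded in the assembly root; prefix with the folder path otherwise).
    // sqlConnectionString: the plain connection string used by the Code First context.
    public static string Build(string modelName, string sqlConnectionString)
    {
        string metadata = string.Format(
            "res://*/{0}.csdl|res://*/{0}.ssdl|res://*/{0}.msl", modelName);

        return "metadata=" + metadata +
               ";provider=System.Data.SqlClient" +
               ";provider connection string=\"" + sqlConnectionString + "\"";
    }
}
```

The result can be passed to the generated Entities(string connectionString) constructor shown in the question. The EntityConnectionStringBuilder class in System.Data.EntityClient produces the same kind of string and may be a safer choice where it is available.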

Get the Windows Phone 7 Application Title from Code

I want to access the Title value that is stored in the WMAppManifest.xml file from my ViewModel code. This is the same application title that is set through the project properties.
Is there a way to access this from code using something like App.Current?
Look at the source code for WP7DataCollector.GetAppAttribute() in the Microsoft Silverlight Analytics Framework. GetAppAttribute("Title") will do it.
/// <summary>
/// Gets an attribute from the Windows Phone App Manifest App element
/// </summary>
/// <param name="attributeName">the attribute name</param>
/// <returns>the attribute value</returns>
private static string GetAppAttribute(string attributeName)
{
    string appManifestName = "WMAppManifest.xml";
    string appNodeName = "App";
    var settings = new XmlReaderSettings();
    settings.XmlResolver = new XmlXapResolver();
    using (XmlReader rdr = XmlReader.Create(appManifestName, settings))
    {
        rdr.ReadToDescendant(appNodeName);
        if (!rdr.IsStartElement())
        {
            throw new System.FormatException(appManifestName + " is missing " + appNodeName);
        }
        return rdr.GetAttribute(attributeName);
    }
}
This last answer seems overly complicated to me; you could have simply done something like:
string name = "";
var executingAssembly = System.Reflection.Assembly.GetExecutingAssembly();
var customAttributes = executingAssembly.GetCustomAttributes(typeof(System.Reflection.AssemblyTitleAttribute), false);
// GetCustomAttributes returns an empty array (not null) when the attribute is absent
if (customAttributes != null && customAttributes.Length > 0)
{
    var titleAttribute = (System.Reflection.AssemblyTitleAttribute)customAttributes[0];
    name = titleAttribute.Title;
}
I have used Michael S. Scherotter's excellent code sample as the basis for a fully working code sample:
using System.Xml;
namespace KoenZomers.WinPhone.Samples
{
/// <summary>
/// Allows application information to be retrieved
/// </summary>
public static class ApplicationInfo
{
#region Constants
/// <summary>
/// Filename of the application manifest contained within the XAP file
/// </summary>
private const string AppManifestName = "WMAppManifest.xml";
/// <summary>
/// Name of the XML element containing the application information
/// </summary>
private const string AppNodeName = "App";
#endregion
#region Properties
/// <summary>
/// Gets the application title
/// </summary>
public static string Title
{
get { return GetAppAttribute("Title"); }
}
/// <summary>
/// Gets the application description
/// </summary>
public static string Description
{
get { return GetAppAttribute("Description"); }
}
/// <summary>
/// Gets the application version
/// </summary>
public static string Version
{
get { return GetAppAttribute("Version"); }
}
/// <summary>
/// Gets the application publisher
/// </summary>
public static string Publisher
{
get { return GetAppAttribute("Publisher"); }
}
/// <summary>
/// Gets the application author
/// </summary>
public static string Author
{
get { return GetAppAttribute("Author"); }
}
#endregion
#region Methods
/// <summary>
/// Gets an attribute from the Windows Phone App Manifest App element
/// </summary>
/// <param name="attributeName">the attribute name</param>
/// <returns>the attribute value</returns>
private static string GetAppAttribute(string attributeName)
{
var settings = new XmlReaderSettings {XmlResolver = new XmlXapResolver()};
using (var rdr = XmlReader.Create(AppManifestName, settings))
{
rdr.ReadToDescendant(AppNodeName);
// Return the value of the requested XML attribute if found or NULL if the XML element with the application information was not found in the application manifest
return !rdr.IsStartElement() ? null : rdr.GetAttribute(attributeName);
}
}
#endregion
}
}
Only the first two answers are correct in scope of the original question. And the second is certainly not over complicated. Wrapping the helper method with a class for each possible attribute is good object orientated development and exactly what Microsoft do all over the framework, e.g. settings designer files generated by Visual Studio.
I'd recommend using the first if you just want one specific property, the second if you want more. Should be part of the SDK really. We're trying to read the WMAppManifest.xml here not the AssemblyInfo so standard assembly reflection metadata is no good.
By the way, if you really want to get the product name from the assembly attributes (not WMAppManifest.xml), then the last sample was reading the wrong attribute! Use the AssemblyProductAttribute, not the AssemblyTitleAttribute. The assembly title is really the file title, by default the same as the assembly file name (e.g. MyCompany.MyProduct.WinPhone7App), whereas the product will typically be something like the properly formatted "title" of your app in the store (e.g. "My Product"). It may not even be up-to-date after using the VS properties page, so you should check that.
I use AssemblyInfo reflection for all other application types to show the official product name and build version on an about page, it's certainly correct for that. But for these special phone app types the store manifest has more importance and other attributes you may need.
The problem with all of those answers is that they have to read the file every single time it is accessed. This is bad for performance as there are battery issues to consider if you use it frequently. Koen was closer to a proper solution, but his design still went back to the file every time you wanted to access the value.
The solution below is a one-and-done read of the file. Since it is not likely to change, there is no reason to keep going back to it. The attributes are read as the static class is initialized, with minimal fuss.
I created this Gist to demonstrate.
HTH!
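The Gist itself is not reproduced here. As a rough illustration of the one-time-read idea (not the author's code), the sketch below parses the manifest once and serves every later lookup from memory. The class name and the `loadXml` callback are hypothetical; on the phone the callback would have to return the contents of WMAppManifest.xml, for example via the XmlXapResolver approach from the earlier answers. It is written against current .NET and has not been checked on the Windows Phone 7 runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical wrapper that reads the <App> attributes once and caches them.
public sealed class CachedAppManifest
{
    private readonly Lazy<Dictionary<string, string>> _attributes;

    // loadXml is an assumption: it returns the manifest XML as text.
    public CachedAppManifest(Func<string> loadXml)
    {
        // Lazy<T> runs the factory at most once, on first access.
        _attributes = new Lazy<Dictionary<string, string>>(() =>
        {
            XElement app = XDocument.Parse(loadXml()).Descendants("App").FirstOrDefault();
            if (app == null)
            {
                return new Dictionary<string, string>();
            }
            return app.Attributes()
                      .Where(a => !a.IsNamespaceDeclaration)
                      .ToDictionary(a => a.Name.LocalName, a => a.Value);
        });
    }

    // Returns the attribute value, or null if the attribute is not present.
    public string Get(string attributeName)
    {
        string value;
        return _attributes.Value.TryGetValue(attributeName, out value) ? value : null;
    }

    public string Title { get { return Get("Title"); } }
    public string Version { get { return Get("Version"); } }
}
```

Keeping a single instance in a static field gives the same effect as reading the attributes during static initialization, while deferring the file access until the first property is actually used.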

Resources