I have some inherited code that uses Dapper to map a SQL SELECT into an object. The SELECT has multiple columns with the same name (some columns are omitted for brevity).
SELECT
created_timestamp AS CreatedDate,
imported_timestamp AS CreatedDate
FROM Orders
WHERE OrderId = @OrderId
An analysis of the data shows that only one of the two CreatedDate columns is populated for each record. Running some tests revealed that Dapper seems to be picking the non-NULL CreatedDate. I couldn't find any documentation on how Dapper handles this situation. Can I rely on Dapper always picking the non-NULL value?
Dapper is a micro-ORM and should be used for database CRUD operations.
That said, your business logic should go somewhere else. The implementation is quite simple: do not create columns with duplicate names. Get the data from the database using Dapper and apply your business logic elsewhere, for example by checking which field is null.
-- Following is the query
SELECT
created_timestamp AS CreatedDate,
imported_timestamp AS ImportedDate
FROM Orders
WHERE OrderId = #OrderId
//Following is your POCO/DTO
public class Order
{
// Note: Dapper maps to properties by default, not public fields
public int OrderId { get; set; } // or Guid if that suits you
public string CreatedDate { get; set; } // or DateTime? if that suits you
public string ImportedDate { get; set; } // or DateTime? if that suits you
}
//Following is your business logic
Order order = orderService.GetOrder(orderId);
if(order.CreatedDate != null)
//Do something
else if(order.ImportedDate != null)
//Do something else
Based on your research, even if the non-null column is currently chosen by Dapper, this behavior is undocumented and may not be guaranteed in future versions.
If column names are the same, it will select the second one.
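If the "first non-NULL wins" behavior is what you actually want, it can be made explicit in the SQL itself with COALESCE, instead of relying on undocumented column-collision handling. A minimal sketch using Python's sqlite3 module purely to illustrate the query (table layout and data are invented; the same COALESCE expression works in SQL Server):

```python
import sqlite3

# Hypothetical Orders table mirroring the question's schema:
# exactly one of the two timestamp columns is populated per row.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Orders (
        OrderId INTEGER PRIMARY KEY,
        created_timestamp TEXT,
        imported_timestamp TEXT
    )
""")
conn.execute("INSERT INTO Orders VALUES (1, '2015-01-01', NULL)")
conn.execute("INSERT INTO Orders VALUES (2, NULL, '2015-02-01')")

# COALESCE picks the first non-NULL value, so the mapped value no longer
# depends on how the client library resolves duplicate column names.
rows = conn.execute("""
    SELECT OrderId,
           COALESCE(created_timestamp, imported_timestamp) AS CreatedDate
    FROM Orders
    ORDER BY OrderId
""").fetchall()
print(rows)  # [(1, '2015-01-01'), (2, '2015-02-01')]
```

With a single CreatedDate column in the result set, there is no column collision for Dapper to resolve at all.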
I'm currently putting together a schema that will be responsible for storing products, prices and margins.
The crux of the problem I'm having is how best to handle multiple scenarios.
Definitions - All these are fields in the Link (Intersection) table
Product - A widget
Margin - A data structure that represents how to alter the purchase
price to determine retail price. (complex enough to require a
separate table)
Supplier - Someone who supplies us with a Product
Authority - Someone the supplier is beholden to
Client - Someone we will retail to
ClientGroup - A collection of Clients
Some of these are optional. There will always be a Product-Margin mapping.
The other fields exist to define more specific relationships.
The rules will be applied with a hierarchy.
Examples:
Product "Foo" has a Margin of 10% (applies to all clients)
For ClientGroup "Group A" Foo has a Margin of 8%
For Client "Bob's Burgers" who is a member of "Group A" Foo has a margin of 6%
That would be covered by 3 rows, with the following fields populated (unpopulated fields are NULL):
Product-Margin
ClientGroup-Margin
Client-Margin
Rule 3 is the most specific, and so would take precedence.
Is this link table the best way to store these hierarchical relationships?
If not, what is?
What is the best way of structuring a query to take advantage of this? I've written a query using temp tables and conditional logic, but I can't help thinking I'm square-pegging SQL and there's a better way of structuring the query.
I'd like to keep as much of the logic in SQL and out of the business logic.
In other words, the app can call a stored procedure, passing in the Product and Client, plus optionally Authority and/or Supplier, and receive the appropriate Margin.
I think that in your examples 2 and 3, Product should also be populated; otherwise that margin is applied to all products for the client or client group.
The query to get results could be something like this:
SELECT TOP 1 Margin
FROM <table>
WHERE Product = @Product
AND COALESCE(Client,'') = COALESCE(@Client,Client,'')
AND COALESCE(ClientGroup,'') = COALESCE(@ClientGroup,ClientGroup,'')
ORDER BY Client DESC, ClientGroup DESC
@ denotes parameters passed to the stored procedure. I don't know if your solution will require joins instead, but you could change the WHERE conditions to joins.
This assumes Product is always passed as a parameter; the others are optional (you can add Supplier and Authority there).
Ordering by Client DESC means rows where Client is not NULL appear on top; if the Client column is NULL for all rows, the same logic applies to ClientGroup.
Or you can use the order by method suggested in the comment by James B
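To make the precedence behavior concrete, here is a minimal runnable sketch of the ORDER BY approach using Python's sqlite3 module (table layout and data are invented for illustration; NULL sorts lowest in both SQLite and SQL Server, so DESC floats the most specific match to the top):

```python
import sqlite3

# Hypothetical MarginLink table: the row with the most specific
# populated columns should win.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE MarginLink (
        Product TEXT NOT NULL,
        Client TEXT,
        ClientGroup TEXT,
        Margin REAL NOT NULL
    )
""")
conn.executemany("INSERT INTO MarginLink VALUES (?, ?, ?, ?)", [
    ("Foo", None,  None,     10.0),  # default margin for all clients
    ("Foo", None,  "GroupA",  8.0),  # override for the group
    ("Foo", "Bob", None,      6.0),  # override for one client
])

def margin(product, client, client_group):
    # NULL sorts lowest, so DESC ordering puts the most specific
    # (non-NULL) match at the top; LIMIT 1 takes the winner.
    row = conn.execute("""
        SELECT Margin FROM MarginLink
        WHERE Product = ?
          AND (Client = ? OR Client IS NULL)
          AND (ClientGroup = ? OR ClientGroup IS NULL)
        ORDER BY Client DESC, ClientGroup DESC
        LIMIT 1
    """, (product, client, client_group)).fetchone()
    return row[0] if row else None

print(margin("Foo", "Bob", "GroupA"))    # 6.0  (client-specific rule wins)
print(margin("Foo", "Alice", "GroupA"))  # 8.0  (group rule)
print(margin("Foo", "Alice", "GroupB"))  # 10.0 (product default)
```

The same shape scales to the Supplier and Authority columns by adding them to both the WHERE clause and the ORDER BY list.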
Thanks for the input, folks. The solution for the hierarchical behaviour that I wanted looks something like this:
SELECT TOP 1 *
FROM MarginLink
WHERE
(
(-- Client selection
(ClientId = @clientId)
OR
ClientGroupId IN (SELECT ClientGroupId FROM ClientGroupClient WHERE ClientId = @clientId)
)
AND
(--Product Selection
(@productID BETWEEN ProductIdFrom AND ProductIdTo)
OR
(ProductTypeId = @productTypeID)
OR
(ProductIdFrom IS NULL AND ProductIdTo IS NULL )
)
AND
(-- Supplier
(SupplierId = @supplierId)
OR
(MarginLink.SupplierId IS NULL)
)
AND
(-- Authority
(AuthorityId = @authorityId)
OR
(MarginLink.AuthorityId IS NULL)
)
)
ORDER BY ClientId DESC, ClientGroupId DESC, ProductIdFrom DESC, ProductIdTo DESC, ProductTypeId DESC, SupplierId DESC, AuthorityId DESC
I'm using SQL Server 2014. My request I believe is rather simple. I have one table containing a field holding a date value that is stored as VARCHAR, and another table containing a field holding a date value that is stored as INT.
The date value in the VARCHAR field is stored like this: 2015M01
The date value in the INT field is stored like this: 201501
I need to compare these tables against each other using EXCEPT. My thought process was to somehow extract or TRIM the "M" out of the VARCHAR value and see if it would let me compare the two. If anyone has a better idea such as using CAST to change the date formats or something feel free to suggest that as well.
I am also concerned that even extracting the "M" out of the VARCHAR may still prevent the comparison since one will still remain VARCHAR and the other is INT. If possible through a T-SQL query to convert on the fly that would be great advice as well. :)
REPLACE the 'M' out of the string and then CONVERT to an integer:
SELECT A.*, B.intField
FROM TableA A
INNER JOIN TableB B
ON CONVERT(INT, REPLACE(A.varcharField, 'M', '')) = B.intField
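To see the REPLACE-then-convert join in action, here is a small self-contained sketch using Python's sqlite3 module, with SQLite's CAST standing in for SQL Server's CONVERT (table names and data are hypothetical):

```python
import sqlite3

# Two hypothetical tables: one stores the period as VARCHAR '2015M01',
# the other as INT 201501.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (varcharField TEXT)")
conn.execute("CREATE TABLE TableB (intField INTEGER)")
conn.executemany("INSERT INTO TableA VALUES (?)", [("2015M01",), ("2015M02",)])
conn.executemany("INSERT INTO TableB VALUES (?)", [(201501,), (201503,)])

# Strip the 'M' and cast, then join on the resulting integer.
rows = conn.execute("""
    SELECT A.varcharField, B.intField
    FROM TableA A
    JOIN TableB B
      ON CAST(REPLACE(A.varcharField, 'M', '') AS INTEGER) = B.intField
""").fetchall()
print(rows)  # [('2015M01', 201501)]
```

Only the row whose converted value has a match on the other side survives the join, which is exactly the comparison the EXCEPT variant below relies on.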
Since you say you already have the query and are using EXCEPT, you can simply change the definition of that one "date" field in the query containing the VARCHAR value so that it matches the INT format of the other query. For example:
SELECT Field1, CONVERT(INT, REPLACE(VarcharDateField, 'M', '')) AS [DateField], Field3
FROM TableA
EXCEPT
SELECT Field1, IntDateField, Field3
FROM TableB
HOWEVER, while I realize that this might not be feasible, your best option, if you can make this happen, would be to change how the data in the table with the VARCHAR field is stored so that it is actually an INT in the same format as the table with the data already stored as an INT. Then you wouldn't have to worry about situations like this one.
Meaning:
Add an INT field to the table with the VARCHAR field.
Do an UPDATE of that table, setting the INT field to the string value with the M removed.
Update any INSERT and/or UPDATE stored procedures used by external services (app, ETL, etc) to do that same M removal logic on the way in. Then you don't have to change any app code that does INSERTs and UPDATEs. You don't even need to tell anyone you did this.
Update any "get" / SELECT stored procedures used by external services (app, ETL, etc) to do the opposite logic: convert the INT to VARCHAR and add the M on the way out. Then you don't have to change any app code that gets data from the DB. You don't even need to tell anyone you did this.
This is one of many reasons that having a stored procedure API to your DB is quite handy. I suppose an ORM can just be rebuilt, but you still need to recompile, even if all of the code references are automatically updated. But making a datatype change (or even moving a field to a different table, or replacing a field with a simple CASE statement) "behind the scenes", masking it so that any code outside of your control doesn't know a change happened, is not nearly as difficult as most people might think. I have done all of these operations (datatype changes, moving a field to a different table, replacing a field with simple logic, etc.), and it buys you a lot of time until the app code can be updated. That might be handled by another team, and maybe their schedule won't allow for making any changes in that area (plus testing) for 3 months. OK; it will be there waiting for them when they are ready. And if there are several areas to update, they can be done one at a time. You can even create new stored procedures to run in parallel so that updated app code can pass the proper INT datatype as the input parameter. Once all references to the VARCHAR value are gone, delete the original versions of those stored procedures.
If you want everything in the first table that is not in the second, you might consider something like this:
select t1.*
from t1
where not exists (select 1
from t2
where cast(replace(t1.varcharfield, 'M', '') as int) = t2.intfield
);
This should be close enough to except for your purposes.
I should add that you might need to include other columns in the where statement. However, the question only mentions one column, so I don't know what those are.
You could create a persisted view on the table with the char column, with a calculated column where the M is removed. Then you could JOIN the view to the table containing the INT column.
CREATE VIEW dbo.PersistedView
WITH SCHEMA_BINDING
AS
SELECT ConvertedDateCol = CONVERT(INT, REPLACE(VarcharCol, 'M', ''))
--, other columns including the PK, etc
FROM dbo.TablewithCharColumn;
CREATE CLUSTERED INDEX IX_PersistedView
ON dbo.PersistedView(<the PK column>);
SELECT *
FROM dbo.PersistedView pv
INNER JOIN dbo.TableWithIntColumn ic ON pv.ConvertedDateCol = ic.IntDateCol;
If you provide the actual details of both tables, I will edit my answer to make it clearer.
A persisted view with a computed column will perform far better on the SELECT statement where you join the two columns compared with doing the CONVERT and REPLACE every time you run the SELECT statement.
However, a persisted view will slightly slow down inserts into the underlying table(s), and will prevent you from making DDL changes to the underlying tables.
If you're looking to not persist the values via a schema-bound view, you could create a non-persisted computed column on the table itself, then create a non-clustered index on that column. If you are using the computed column in WHERE or JOIN clauses, you may see some benefit.
By way of example:
CREATE TABLE dbo.PCT
(
PCT_ID INT IDENTITY(1,1) NOT NULL
CONSTRAINT PK_PCT
PRIMARY KEY CLUSTERED
, SomeChar VARCHAR(50) NOT NULL
, SomeCharToInt AS CONVERT(INT, REPLACE(SomeChar, 'M', ''))
);
CREATE INDEX IX_PCT_SomeCharToInt
ON dbo.PCT(SomeCharToInt);
INSERT INTO dbo.PCT(SomeChar)
VALUES ('2015M08');
SELECT SomeCharToInt
FROM dbo.PCT;
Results:
201508
Does anyone know a good approach using Entity Framework for the problem described below?
I am trying, for our next release, to come up with a performant way to show the placed orders for the logged-on customer.
Of course paging is always a good technique to use when a lot of data is available, but I would like to see an answer without any paging techniques.
Here's the story: a customer places an order which gets an orderstatus = PENDING. Depending on some strategy we move that order up the chain in order to get it APPROVED.
Every change of status is logged, so we can see a trace of statuses, and maybe even an extra line of comment per status, which can provide some extra valuable information to whoever sees this order in an interface.
So an Order is linked to a Customer. One order can have multiple order statuses stored in OrderStatusHistory.
In my test scenario I am using a customer who has 100+ Orders, each with about 5 records in the OrderStatusHistory table.
For now I would like to see all orders on one page, without paging, where for each Order I show the last relevant Status and the extra comment (if there is any for this last status; both fields come from OrderStatusHistory: the record with the highest Id for the given OrderId).
There are multiple scenarios I have tried, but I would like to see any potential other solutions or comments on the things I have already tried.
Trying to do Include() when getting Orders, but this still results in multiple queries launched against the database. Each order triggers an extra query to get all of the statuses in the history table. So all statuses are queried here instead of just the last relevant one, plus 100 extra queries are launched for 100 orders. You can imagine the problem when there are 100,000+ orders in the database.
Having 2 computed columns on the database: LastStatus, LastStatusInformation and a regular Linq-Query which gets those columns which are available through the Entity-model.
The problem with this approach is the fact that those computed columns are determined using a scalar function which can not be changed without removing the formula from the computed column, etc...
In the end I am very familiar with SQL and Stored procedures, but since the rest of the data-layer uses Entity Framework I would like to stick to it as long as possible, even though I have my doubts about performance.
Using the SQL approach I would write something like this:
WITH cte (RN, OrderId, [Status], Information)
AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY Id DESC), OrderId, [Status], Information
FROM OrderStatus
)
SELECT o.Id, cte.[Status], cte.Information AS StatusInformation, o.* FROM [Order] o
INNER JOIN cte ON o.Id = cte.OrderId AND cte.RN = 1
WHERE CustomerId = @CustomerId
ORDER BY 1 DESC;
which returns all orders for the customer with the statusinformation provided by the Common Table Expression.
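For reference, the same "latest status per order" result can also be expressed without a CTE, via a correlated subquery on MAX(Id). A minimal runnable sketch using Python's sqlite3 module (schema heavily simplified, names hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER);
    CREATE TABLE OrderStatus (
        Id INTEGER PRIMARY KEY,
        OrderId INTEGER,
        Status TEXT,
        Information TEXT
    );
    INSERT INTO Orders VALUES (1, 42), (2, 42);
    INSERT INTO OrderStatus VALUES
        (1, 1, 'PENDING',  'created'),
        (2, 1, 'APPROVED', 'ok'),
        (3, 2, 'PENDING',  'created');
""")

# Pick the highest-Id status row per order, i.e. the latest entry
# in the history table for that order.
rows = conn.execute("""
    SELECT o.Id, s.Status, s.Information
    FROM Orders o
    JOIN OrderStatus s
      ON s.OrderId = o.Id
     AND s.Id = (SELECT MAX(Id) FROM OrderStatus WHERE OrderId = o.Id)
    WHERE o.CustomerId = ?
    ORDER BY o.Id DESC
""", (42,)).fetchall()
print(rows)  # [(2, 'PENDING', 'created'), (1, 'APPROVED', 'ok')]
```

This correlated-subquery shape is also closer to the SQL that an ORM typically generates for "latest child per parent" projections.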
Does anyone know a good approach using Entity Framework?
Something like this should work as you want (it makes only one DB call), but I didn't test it:
var result = from order in context.Orders
where order.CustomerId == customerId
let lastStatus = order.OrderStatusHistory.OrderByDescending(x => x.Id).FirstOrDefault() // LINQ to Entities cannot translate Last()
select new
{
//you can return the whole order if you need
//Order = order,
//or only the information you actually need to display
Number = order.Number,
Status = lastStatus.Status,
ExtraComment = lastStatus.ExtraComment,
};
This assumes your Order class looks something like this:
public class Order
{
public int Id { get; set; }
public int CustomerId { get; set; }
public string Number { get; set; }
...
public ICollection<OrderStatusHistory> OrderStatusHistory { get; set; }
}
If your Order class doesn't have something like an ICollection<OrderStatusHistory> OrderStatusHistory property then you need to do a join first. Let me know if that is the case and I will edit my answer to include the join.
I have a table Messages with columns ID (primary key, autoincrement) and Content (text).
I have a table Users with columns username (primary key, text) and Hash.
A message is sent by one Sender (user) to many recipients (user) and a recipient (user) can have many messages.
I created a table Messages_Recipients with two columns: MessageID (referring to the ID column of the Messages table and Recipient (referring to the username column in the Users table). This table represents the many to many relation between recipients and messages.
So, the question I have is this: the ID of a new message will be created after it has been stored in the database. But how can I hold a reference to the row I just added, in order to retrieve this new MessageID? I can always search the database for the last row added, of course, but that could return a different row in a multithreaded environment.
EDIT: As I understand it, for SQLite you can use SELECT last_insert_rowid(). But how do I call this statement from ADO.NET?
My Persistence code (messages and messagesRecipients are DataTables):
public void Persist(Message message)
{
pm_databaseDataSet.MessagesRow messagerow;
messagerow=messages.AddMessagesRow(message.Sender,
message.TimeSent.ToFileTime(),
message.Content,
message.TimeCreated.ToFileTime());
UpdateMessages();
var x = messagerow;//I hoped the messagerow would hold a
//reference to the new row in the Messages table, but it does not.
foreach (var recipient in message.Recipients)
{
var row = messagesRecipients.NewMessages_RecipientsRow();
row.Recipient = recipient;
//row.MessageID= How do I find this??
messagesRecipients.AddMessages_RecipientsRow(row);
UpdateMessagesRecipients();//method not shown
}
}
private void UpdateMessages()
{
messagesAdapter.Update(messages);
messagesAdapter.Fill(messages);
}
One other option is to look at the system table sqlite_sequence. Your SQLite database will have that table automatically if you created any table with an AUTOINCREMENT primary key. SQLite uses it to keep track of the autoincrement field so that it won't repeat the primary key, even after you delete some rows or after an insert fails (read more about this at http://www.sqlite.org/autoinc.html).
So with this table there is the added benefit that you can find out your newly inserted item's primary key even after you have inserted something else (into other tables, of course!). After making sure that your insert succeeded (otherwise you will get a wrong number), you simply need to do:
select seq from sqlite_sequence where name='table_name'
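A minimal runnable illustration of the sqlite_sequence lookup, using Python's sqlite3 module (the table name is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# sqlite_sequence only tracks tables declared with AUTOINCREMENT.
conn.execute("""
    CREATE TABLE Messages (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        Content TEXT
    )
""")
conn.execute("INSERT INTO Messages (Content) VALUES ('hello')")
conn.execute("INSERT INTO Messages (Content) VALUES ('world')")

# seq holds the highest autoincrement value handed out for the table.
seq = conn.execute(
    "SELECT seq FROM sqlite_sequence WHERE name = 'Messages'"
).fetchone()[0]
print(seq)  # 2
```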
With SQL Server you'd SELECT SCOPE_IDENTITY() to get the last identity value for the current process.
With SQLite, it looks like for an autoincrement you would do
SELECT last_insert_rowid()
immediately after your insert.
http://www.mail-archive.com/sqlite-users@sqlite.org/msg09429.html
In answer to your comment: with the System.Data.SQLite ADO.NET provider, you would get this value with code like:
using (SQLiteConnection conn = new SQLiteConnection(connString))
{
string sql = "SELECT last_insert_rowid()";
SQLiteCommand cmd = new SQLiteCommand(sql, conn);
conn.Open();
long lastID = (long)cmd.ExecuteScalar(); // last_insert_rowid() returns a 64-bit value
}
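For comparison, here is the same last_insert_rowid() round trip as a runnable sketch in Python's sqlite3 module (schema invented for illustration; the driver also surfaces the value as Cursor.lastrowid):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Messages (ID INTEGER PRIMARY KEY AUTOINCREMENT, Content TEXT)"
)
cur = conn.execute("INSERT INTO Messages (Content) VALUES ('hello')")

# Both of these report the rowid generated by the INSERT above,
# as seen by this same connection.
via_sql = conn.execute("SELECT last_insert_rowid()").fetchone()[0]
print(cur.lastrowid, via_sql)  # 1 1
```

Note that the value is per-connection state, which is exactly why the multithreading caveats below matter when several writers share one connection.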
I've had issues with using SELECT last_insert_rowid() in a multithreaded environment. If another thread inserts into another table that has an autoincrement, last_insert_rowid will return the autoincrement value from that other table.
Here's where they state that in the doco:
If a separate thread performs a new INSERT on the same database connection while the sqlite3_last_insert_rowid() function is running and thus changes the last insert rowid, then the value returned by sqlite3_last_insert_rowid() is unpredictable and might not equal either the old or the new last insert rowid.
That's from sqlite.org doco
According to Android Sqlite get last insert row id there is another query:
SELECT rowid from your_table_name order by ROWID DESC limit 1
Sample code from @polyglot's solution:
// conn is an open SQLiteConnection
using (SQLiteCommand sql_cmd = conn.CreateCommand())
{
sql_cmd.CommandText = "select seq from sqlite_sequence where name='myTable'";
int newId = Convert.ToInt32(sql_cmd.ExecuteScalar());
}
sqlite3_last_insert_rowid() is unsafe in a multithreaded environment (and documented as such by SQLite).
However, the good news is that you can play the odds; see below.
ID reservation is NOT implemented in SQLite. You can also avoid the internal PK entirely by using your own UNIQUE primary key, if there is something always-varying in your data.
Note:
See if the RETURNING clause won't solve your issue:
https://www.sqlite.org/lang_returning.html
As this is only available in recent versions of SQLite and may have some overhead, consider relying on the fact that it's really bad luck to have another insertion land in between your requests to SQLite.
Also see whether you absolutely need to fetch SQLite's internal PK at all; perhaps you can design your own predictable PK:
https://sqlite.org/withoutrowid.html
If you need a traditional AUTOINCREMENT PK, then yes, there is a small risk that the id you fetch may belong to another insertion. Small, but unacceptable.
A workaround is to call sqlite3_last_insert_rowid() twice:
#1 BEFORE my insert, then #2 AFTER my insert,
as in:
sqlite3_int64 IdLast = sqlite3_last_insert_rowid(m_db); // before: this id is already used
const int rc = sqlite3_exec(m_db, sql, NULL, NULL, &m_zErrMsg);
sqlite3_int64 IdEnd = sqlite3_last_insert_rowid(m_db);  // after: most probably the right one
In the vast majority of cases IdEnd == IdLast + 1. This is the "happy path", and you can rely on IdEnd being the ID you are looking for.
Otherwise, you need to do an extra SELECT, where you can use criteria based on the range IdLast to IdEnd (any additional criteria in the WHERE clause are good to add, if available).
Use ROWID (which is an SQLite keyword) to SELECT the id range that is relevant:
"SELECT my_pk_id FROM Symbols WHERE ROWID > %lld AND ROWID <= %lld;", IdLast, IdEnd
// note the strict > in ROWID > %lld: we already know that IdLast is NOT the one we are looking for
As the second call to sqlite3_last_insert_rowid() is done right after the INSERT, this SELECT generally returns only 2 or 3 rows at most.
Then search the SELECT results for the data you inserted to find the proper id.
Performance improvement: as the call to sqlite3_last_insert_rowid() is far faster than the INSERT (even if a mutex may occasionally make this wrong, it is statistically true), bet on IdEnd being the right one and walk the SELECT results from the end. In nearly every case we tested, the last row does contain the ID you are looking for.
Performance improvement: if you have an additional UNIQUE key, add it to the WHERE to get only one row.
I experimented using 3 threads doing heavy insertions; it worked as expected. The preparation and DB handling take the vast majority of CPU cycles, and the result is that the odds of a mixed-up ID are in the range of 1/1000 insertions (situations where IdEnd > IdLast + 1).
So the penalty of an additional SELECT to resolve this is rather low.
Otherwise said, the benefit of using sqlite3_last_insert_rowid() is great for the vast majority of insertions, and with some care it can even be used safely in a multithreaded environment.
Caveat: the situation is slightly more awkward in transactional mode.
Also, SQLite doesn't explicitly guarantee that IDs will be contiguous and growing (unless AUTOINCREMENT). At least I found no statement about that, though looking at the SQLite source code suggests it behaves this way.
The simplest method would be using:
SELECT MAX(id) FROM yourTableName LIMIT 1;
If you are trying to grab this last id in relation to affecting another table, for example (if an invoice is added, THEN add the item list to that invoice ID),
in this case use something like:
var cmd_result = cmd.ExecuteNonQuery(); // returns the number of affected rows
Then use cmd_result to determine whether the previous query executed successfully, something like if (cmd_result > 0), followed by your query SELECT MAX(id) FROM yourTableName LIMIT 1;, just to make sure that you are not targeting the wrong row id in case the previous command did not add any rows.
In fact, the cmd_result > 0 condition is a very necessary thing in case anything fails, especially if you are developing a serious application; you don't want your users waking up to find random items added to their invoices.
I recently came up with a solution to this problem that sacrifices some performance overhead to ensure you get the correct last inserted ID.
Let's say you have a table people. Add a column called random_bigint:
create table people (
id int primary key,
name text,
random_bigint int not null
);
Add a unique index on random_bigint:
create unique index people_random_bigint_idx
ON people(random_bigint);
In your application, generate a random bigint whenever you insert a record. There is a vanishingly small possibility that a collision will occur, so you should handle that error.
My app is in Go and the code that generates a random bigint looks like this:
func RandomPositiveBigInt() (int64, error) {
nBig, err := rand.Int(rand.Reader, big.NewInt(9223372036854775807))
if err != nil {
return 0, err
}
return nBig.Int64(), nil
}
After you've inserted the record, query the table with a where filter on the random bigint value:
select id from people where random_bigint = <put random bigint here>
The unique index will add a small amount of overhead on the insertion. The id lookup, while very fast because of the index, will also add a little overhead.
However, this method will guarantee a correct last inserted ID.
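The same insert-then-look-up-by-unique-token pattern can be sketched outside Go as well. Here is a minimal illustration using Python's sqlite3 module (schema mirrors the example above; generating the token with the standard secrets module is an assumption for this sketch):

```python
import sqlite3
import secrets

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT,
        random_bigint INTEGER NOT NULL UNIQUE
    )
""")

def insert_person(name):
    token = secrets.randbelow(2**63)  # random non-negative 63-bit value
    conn.execute(
        "INSERT INTO people (name, random_bigint) VALUES (?, ?)",
        (name, token),
    )
    # Look the row back up by its unique token, not by "last inserted":
    # this is race-free regardless of concurrent inserts.
    return conn.execute(
        "SELECT id FROM people WHERE random_bigint = ?", (token,)
    ).fetchone()[0]

print(insert_person("alice"), insert_person("bob"))  # 1 2
```

The trade-off is exactly as described above: one extra unique index to maintain on insert, in exchange for an ID lookup that cannot be confused by other writers.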
I have two tables for tracking user sessions on my site. This is a gross oversimplification btw :
Campaign:
campaignId [int]
campaignKey [varchar(20)]
description [varchar(50)]
Session:
sessionDate [datetime]
sessionGUID [uniqueidentifier]
campaignId [int]
campaignKey [varchar(20)]
I want to insert a new record into Session, using LINQ :
var s = new Session();
dbContext.Session.InsertOnSubmit(s);
s.sessionDate = DateTime.Now;
s.sessionGUID = Guid.NewGuid();
s.campaignKey = Request.Params["c"];
// dont set s.campaignId here...
dbContext.SubmitChanges();
Notice that I am not currently setting campaignId in this code.
What I want is for something to automaticaly hookup the foreign key to the 'Campaign' table, and whatever does it must first add a new row to the 'Campaign' table if the campaign passed in has never been used before.
I have a few decisions to make and would really appreciate insight on any of them :
I don't know if I should use a trigger, a stored proc or do it in LINQ manually :
Trigger: slightly icky; I've never really liked using them, but it would guarantee the campaignId was updated by the time I need it.
Stored proc: again slightly icky; my SQL is not great, and I value the consistency of being able to do everything in LINQ as much as possible.
LINQ manually: I'd have to keep a copy of the 'Campaign' table in memory and use a C# hashtable to do the lookup. I'd then have to worry about keeping that table up to date if another client added a new campaign.
My main two reasons for wanting this foreign key table: first, a more efficient index on 'Session' for 'campaignId' so that I can do grouping faster; it just seems like it ought to be a lot faster if it's just an integer column being grouped. The second reason is to give partners permissions to see only their campaigns through joins with other tables.
Before anyone asks I do NOT know the campaigns in advance, as some will be created by partners.
Most importantly: I am primarily looking for the most 'LINQ friendly' solution.
I would definitely recommend adding a nullable foreign key constraint on the Session table. Once you have that setup, it should be as simple as tacking on a method to the Session class:
public partial class Session
{
public void SetCampaignKey(string key)
{
// Use an existing campaign if one exists
Campaign campaign =
(from c in dbContext.Campaigns
where c.campaignKey == key
select c).FirstOrDefault();
// Create a new campaign if necessary
if (campaign == null)
{
campaign = new Campaign();
campaign.campaignKey = key;
campaign.description = string.Empty; // Not sure where this comes in
dbContext.Campaign.InsertOnSubmit(campaign);
}
// We can now set the reference directly
this.Campaign = campaign;
}
}
My LINQ may be a bit off, but something like this should work.
You can call SetCampaignKey() instead of manually setting the campaignKey property. When you call dbContext.SubmitChanges, the campaign will be added if necessary and the Session entry will be updated accordingly.
In this case, only the campaignId property would be set automatically. You could rely on a simple trigger to set campaignKey or do away with it. You could always retrieve the value by joining on the Campaign table.
Am I oversimplifying the problem?