Cannot Pass Null Value to Custom Aggregate - sql-server

Afternoon,
I'm writing a custom median function (without looking at existing solutions; I like the challenge), and after lots of fiddling I'm most of the way there. However, I cannot pass in a column that contains a NULL value. I'm handling this in the C# code, but SQL Server seems to stop it before it gets there.
You get this error...
Msg 6569, Level 16, State 1, Line 11 'Median' failed because parameter 1 is not allowed to be null.
C#:
namespace SQLMedianAggregate
{
[System.Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(
Microsoft.SqlServer.Server.Format.UserDefined,
IsInvariantToDuplicates = false, // duplicates may change results
IsInvariantToNulls = true, // receiving a NULL is handled later in code
IsInvariantToOrder = true, // is sorted later
IsNullIfEmpty = true, // if no values are given the result is null
MaxByteSize = -1,
Name = "Median" // name of the aggregate
)]
public struct Median : IBinarySerialize
{
public double Result { get; private set; }
public bool HasValue { get; private set; }
public DataTable DT_Values { get; private set; } //only exists for merge essentially
public static DataTable DT_Final { get; private set; } //Need a static version so its accesible within terminate
public void Init()
{
Result = double.NaN;
HasValue = false;
DT_Values = new DataTable();
DT_Values.Columns.Add("Values", typeof(double));
DT_Final = new DataTable();
DT_Final.Columns.Add("Values", typeof(double));
}
public void Accumulate(double number)
{
if (double.IsNaN(number))
{
//skip
}
else
{
//add to tables
DataRow NR = DT_Values.NewRow();
NR[0] = number;
DT_Values.Rows.Add(NR);
DataRow NR2 = DT_Final.NewRow();
NR2[0] = number;
DT_Final.Rows.Add(NR2);
HasValue = true;
}
}
public void Merge(Median group)
{
// Count the product only if the other group has values
if (group.HasValue)
{
DT_Final.Merge(group.DT_Values);
//DT_Final = DT_Values;
}
}
public double Terminate()
{
if (DT_Final.Rows.Count == 0) //Just to handle roll up so it doesn't crash (doesnt actually work
{
DataRow DR = DT_Final.NewRow();
DR[0] = 0;
DT_Final.Rows.Add(DR);
}
//Sort Results
DataView DV = DT_Final.DefaultView;
DV.Sort = "Values asc";
DataTable DTF = new DataTable();
DTF = DV.ToTable();
////Calculate median and submit result
double MiddleRow = (DT_Final.Rows.Count -1.0) / 2.0;
if (MiddleRow % 2 != 0)
{
double upper = (double)(DT_Final.Rows[Convert.ToInt32(Math.Ceiling(MiddleRow))]["Values"]);
double lower = (double)(DT_Final.Rows[Convert.ToInt32(Math.Floor(MiddleRow))]["Values"]);
Result = lower + ((upper - lower) / 2);
} else
{
Result = (double)(DT_Final.Rows[Convert.ToInt32(MiddleRow)]["Values"]);
}
return Result;
}
public void Read(BinaryReader SerializationReader)
{
//Needed to get this working for some reason
}
public void Write(BinaryWriter SerializationWriter)
{
//Needed to get this working for some reason
}
}
}
SQL:
DROP AGGREGATE dbo.Median
DROP ASSEMBLY MedianAggregate
CREATE ASSEMBLY MedianAggregate
AUTHORIZATION dbo
FROM 'C:\Users\#######\Documents\Visual Studio 2017\Projects\SQLMedianAggregate\SQLMedianAggregate\bin\Debug\SQLMedianAggregate.dll'
WITH PERMISSION_SET = UNSAFE;
CREATE AGGREGATE dbo.Median (@number FLOAT) RETURNS FLOAT
EXTERNAL NAME [MedianAggregate]."SQLMedianAggregate.Median";
Any ideas what setting or code I'm missing that would allow this? I pretty much just want it to ignore NULLs.
The SQL Server version is 2008 R2, by the way.

The problem is your datatype. You need to use the Sql* types for SQLCLR parameters, return values, and result set columns. In this case, you need to change:
Accumulate(double number)
into:
Accumulate(SqlDouble number)
Then, you access the double value using the Value property that all Sql* types have (i.e. number.Value in this case).
And then, at the beginning of the Accumulate method, you need to check for NULL using the IsNull property:
if (number.IsNull)
{
return;
}
Also, for more information on using SQLCLR in general, please see the series I am writing on this topic on SQL Server Central: Stairway to SQLCLR (free registration is required to read content on that site, but it's worth it :-).
And, since we are talking about median calculations here, please see the article I wrote (also on SQL Server Central) on the topic of UDAs and UDTs that uses Median as the example: Getting The Most Out of SQL Server 2005 UDTs and UDAs. Please keep in mind that the article was written for SQL Server 2005 which has a hard limit of 8000 bytes of memory for UDTs and UDAs. That limit was lifted in SQL Server 2008, so rather than using the compression technique shown in that article, you could simply set MaxByteSize in the SqlUserDefinedAggregate to -1 (as you are currently doing) or SqlMetaData.MaxSize (or something very close to that).
Also, DataTable is a bit heavy-handed for this type of operation. All you need is a simple List<Double> :-).
Regarding the following line of code (broken into 2 lines here to prevent the need to scroll):
public static DataTable DT_Final { get; private set; }
//Need a static version so its accesible within terminate
This is a huge misunderstanding of how UDAs and UDTs work. Please do NOT use static variables here. Static variables are shared across Sessions, hence your current approach is not thread-safe. So you would either get errors about it already being declared, or various Sessions would alter the value unbeknownst to other Sessions, as they would all share the single instance of DT_Final. And the errors and/or odd behavior (i.e. erroneous results that you can't debug) might even happen in a single session if a parallel plan is used.
UDTs and UDAs get serialized to a binary value stored in memory, and then are deserialized which keeps their state intact. This is the reason for the Read and Write methods, and why you needed to get those working.
Again, you don't need (or want) DataTables here as they are over-complicating the operation and take up more memory than is ideal. Please see the article I linked above on UDAs and UDTs to see how the Median operation (and UDAs in general) should work.

Related

What is correct to store UpdatedAt or ExpireAt timestamps to limit updates by timeout

I need to implement a function that stores a value, with updates limited to once per week.
I've implemented it in the following way:
class Example
{
//Stored in db
public int _value;
//Stored in db
public DateTime _updatedAt;
//Stored in db
public DateTime _canUpdateAfter;
//Constant in code
public TimeSpan _updateTimeout = TimeSpan.FromMinutes(1);
public void StoreValue1(int value)
{
if (DateTime.Now - _updatedAt < _updateTimeout)
{
return;
}
_value = value;
_updatedAt = DateTime.Now;
}
public void StoreValue2(int value)
{
if (_canUpdateAfter > DateTime.Now)
{
return;
}
_value = value;
_canUpdateAfter = DateTime.Now + _updateTimeout;
}
}
I have two ways of implementing it:
1. Store the time of the last update in the db and check in .NET code whether the timeout has passed.
2. Store the time at which the timeout expires in the db and compare it with the current time in .NET code.
Which should I use, and why?
Both solutions are valid.
The only difference between them is when the decision about the next allowed update is made.
With solution 1 you make the decision every time the code is evaluated; with solution 2 the decision is fixed at the time of the previous update.
I prefer solution 1; it is more flexible and easier to maintain.
Keep in mind the case where your business changes the update frequency. With solution 1, deploying new code or changing one row of a hypothetical configuration table is enough, whereas with solution 2 you will need to update all rows of the table.

SQL Server CLR UDF time-out

I have the following setup in SQL Server 2008 R2: a view that uses CROSS APPLY on a table-valued user-defined function which calls a CLR method. This CLR method converts a given BLOB to a table containing doubles.
Occasionally some queries using the view time out after around 10 seconds. The time-out does not happen consistently for any query, and when some query fails, other queries work just fine. These queries have the same structure but query different data.
No error is logged in the SQL Server Log when the problem occurs. I suspect that the time-out is related to the use of CLR but have no observation to actually verify this.
Any suggestions for further troubleshooting would be highly appreciated. Many thanks in advance.
Edit
Here is the C# code:
private class Row
{
public SqlDateTime TimeStep { get; set; }
public SqlDouble Amount { get; set; }
}
public static void TableUDF_FillRow(object tableTypeObject, out SqlDateTime timeStep, out SqlDouble amount)
{
Row tableType = (Row)tableTypeObject;
timeStep = tableType.TimeStep;
amount = tableType.Amount;
}
[SqlFunction(DataAccess = DataAccessKind.Read,
TableDefinition = "timeStep DATETIME, amount FLOAT",
FillRowMethodName = "TableUDF_FillRow")]
public static IEnumerable VarBinaryToFloatTable(byte[] blob, DateTime simulationStartDate)
{
double[] result = new double[blob.Length / sizeof(double)];
Buffer.BlockCopy(blob, 0, result, 0, blob.Length);
var list = new List<Row>(result.Length);
foreach (double d in result)
{
Row row = new Row
{
TimeStep = simulationStartDate,
Amount = d
};
list.Add(row);
simulationStartDate = simulationStartDate.AddHours(1); // assumes one hour time steps
}
return list;
}
The table-valued UDF just calls the VarBinaryToFloatTable() method.

How to search for empty strings in a text field with Entity Framework?

I'd like to know how I can search for empty strings when I'm using a text type field with Entity Framework.
I've looked at the SQL query that Entity Framework generates, and it uses LIKE for the comparison because it's searching a text type field. So when I use .Equals(""), == "", == string.Empty, .Contains(""), .Contains(string.Empty), and everything else, it returns all results because the SQL query is LIKE '', and the == operator throws an exception because it uses the = operator, which is not valid with a text type field.
When I try .Equals(null), .Contains(null), or == null, it returns nothing, because it generates a FIELD IS NULL condition.
I already tried .Length == 0, but it throws an exception...
This works for me:
public class POCO
{
public int Id { get; set; }
public string Name { get; set; }
public string Description { get; set; }
}
static void Main(string[] args)
{
var pocos = new List<POCO>
{
new POCO { Id = 1, Name = "John", Description = "basic" },
new POCO { Id = 2, Name = "Jane", Description = "" },
new POCO { Id = 3, Name = "Joey", Description = string.Empty }
};
pocos.Where(x => x.Description == string.Empty)
.ToList()
.ForEach(x => Console.WriteLine($"{x.Id} {x.Name} {x.Description}"));
}
However, the issue MAY BE that your T4-generated object is not fully populated with data you can use, if you are using Entity Framework; that is, the translation from the database is not populating the objects correctly for you to interrogate. I would just run something like this to check:
using (var context = new YOURCONTEXTNAME())
{
var persons = context.YOURDATABASEOBJECT.ToList();
persons.ForEach(x => Console.WriteLine($"{x.COLUMNINQUESTION}"));
}
If the object contains data, it should be retrieved. I would not use text if possible. Use varchar(max), nvarchar(max), or xml instead; text will be deprecated eventually, and continuing to use it at this point is bad form.
EDIT
Okay, I see. The answer is that you cannot interrogate the object until it is fully materialized when the column is text. I ran a test against my local database with a new context and, sure enough, you cannot do '== string.Empty', '== ""', or 'String.IsNullOrEmpty()' on a text column. However, you can do it once the data has been materialized into a .NET object. E.g.:
// Won't work as context does not understand type
//var persons = context.tePersons.Where(x => x.Description == string.Empty).ToList();
//Works fine as transformation got the object translated to a string in .NET
var start = context.tePersons.ToList();
var persons = start.Where(x => x.Description == String.Empty).ToList();
This obviously poses a problem, as you potentially need to fetch ALL your data before applying the predicate, which is far from ideal. Instead, you could create a SQL object (a function, stored procedure, or view) to do the filtering on the database side.

JPA2 CriteriaBuilder: Using LOB property for greaterThan comparison

My application uses SQL Server and JPA2 in the backend. The app makes use of a timestamp column (in the SQL Server sense, which is equivalent to rowversion) per entity to keep track of freshly modified entities. NB: SQL Server stores this column as binary(8).
Each entity has a respective timestamp property, mapped as @Lob, which is the way to go for binary columns:
@Lob
@Column(columnDefinition="timestamp", insertable=false, updatable=false)
public byte[] getTimestamp() {
...
The server sends incremental updates to mobile clients along with the latest database timestamp. The mobile client will then pass the old timestamp back to the server on the next refresh request so that the server knows to return only fresh data. Here's what a typical query (in JPQL) looks like:
select v from Visit v where v.timestamp > :oldTimestamp
Please note that I'm using a byte array as a query parameter and it works fine when implemented in JPQL this way.
My problems begin when trying to do the same using the Criteria API:
private void getFreshVisits(byte[] oldVersion) {
EntityManager em = getEntityManager();
CriteriaQuery<Visit> cq = cb.createQuery(Visit.class);
Root<Visit> root = cq.from(Visit.class);
Predicate tsPred = cb.gt(root.get("timestamp").as(byte[].class), oldVersion); // compiler error
cq.where(tsPred);
...
}
The above results in a compiler error, because the gt method can only be used with Number. One could instead use the greaterThan method, which only requires the arguments to be Comparable, but that results in yet another compiler error, since byte[] is not Comparable.
So to sum it up, my question is: how can I use the criteria api to add a greaterThan predicate for a byte[] property? Any help will be greatly appreciated.
PS. As to why I'm not using a regular DateTime last_modified column: because of concurrency and the way synchronization is implemented, this approach could result in lost updates. Microsoft's Sync Framework documentation recommends the former approach as well.
I know this was asked a couple of years back, but just in case anyone else stumbles upon this: in order to use a SQL Server rowver column within JPA you need to do a couple of things.
Create a type that will wrap the rowver/timestamp:
import com.fasterxml.jackson.annotation.JsonIgnore;
import javax.xml.bind.annotation.XmlTransient;
import java.io.Serializable;
import java.math.BigInteger;
import java.util.Arrays;
/**
* A RowVersion object
*/
public class RowVersion implements Serializable, Comparable<RowVersion> {
@XmlTransient
@JsonIgnore
private byte[] rowver;
public RowVersion() {
}
public RowVersion(byte[] internal) {
this.rowver = internal;
}
@XmlTransient
@JsonIgnore
public byte[] getRowver() {
return rowver;
}
public void setRowver(byte[] rowver) {
this.rowver = rowver;
}
@Override
public int compareTo(RowVersion o) {
return new BigInteger(1, rowver).compareTo(new BigInteger(1, o.getRowver()));
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
RowVersion that = (RowVersion) o;
return Arrays.equals(rowver, that.rowver);
}
@Override
public int hashCode() {
return Arrays.hashCode(rowver);
}
}
The key here is that it implements Comparable if you want to use it in comparisons (which you definitely do).
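To see why compareTo builds its BigInteger with a signum of 1, here is a minimal sketch using only the JDK. RowVersionOrder and both method names are hypothetical helpers made up for illustration, not part of the code above, and the sketch assumes the 8 bytes are ordered as an unsigned big-endian number (which is how a binary(8) value compares byte by byte).

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only: it isolates the comparison
// that RowVersion.compareTo performs on the raw bytes.
final class RowVersionOrder {

    // Compare two rowversion values as unsigned big-endian integers.
    // The signum argument 1 tells BigInteger to treat the bytes as a magnitude.
    static int compareUnsigned(byte[] a, byte[] b) {
        return new BigInteger(1, a).compareTo(new BigInteger(1, b));
    }

    // For contrast: the single-argument constructor reads two's complement,
    // so a leading byte of 0x80 or above makes the value negative.
    static int compareSigned(byte[] a, byte[] b) {
        return new BigInteger(a).compareTo(new BigInteger(b));
    }

    public static void main(String[] args) {
        byte[] older = {0, 0, 0, 0, 0, 0, 0x07, (byte) 0xD1};
        byte[] newer = {(byte) 0x80, 0, 0, 0, 0, 0, 0, 0x01};
        System.out.println(compareUnsigned(older, newer) < 0); // true
        System.out.println(compareSigned(older, newer) < 0);   // false
    }
}
```

With the signed constructor, a value whose first byte has the high bit set would sort before every smaller value, so a "changed after" filter could silently miss rows. That is presumably the reason for the two-argument constructor here.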
Next, create an AttributeConverter that converts a byte[] to the class you just made:
import javax.persistence.AttributeConverter;
import javax.persistence.Converter;
/**
* JPA converter for the RowVersion type
*/
@Converter
public class RowVersionTypeConverter implements AttributeConverter<RowVersion, byte[]> {
@Override
public byte[] convertToDatabaseColumn(RowVersion attribute) {
return attribute != null ? attribute.getRowver() : null;
}
@Override
public RowVersion convertToEntityAttribute(byte[] dbData) {
// keep a NULL column as null so compareTo never sees a null byte[]
return dbData != null ? new RowVersion(dbData) : null;
}
}
Now let's apply this RowVersion attribute/type to a real world scenario. Let's say you wanted to find all Programs that have changed on or before some point in time.
One straightforward way to solve this would be to use a DateTime field in the object and timestamp column within db. Then you would use 'where lastUpdatedDate <= :date'.
Suppose that you don't have that timestamp column or there's no guarantee that it will be updated properly when changes are made; or let's say your shop loves SQLServer and wants to use rowver instead.
What to do? There are two issues to solve: one is how to generate a rowver, and two is how to use the generated rowver to find Programs.
Since the database generates the rowver, you can either ask the db for the 'current max rowver' (a custom SQL Server thing) or you can simply save an object that has a RowVersion attribute and then use that object's generated RowVersion as the boundary for the query to find the Programs changed after that time. The latter is more portable and is the approach shown below.
The SyncPoint class snippet below is the object that is used as a 'point in time'. Once a SyncPoint is saved, the RowVersion attached to it is the db version at the time it was saved.
Here is the SyncPoint snippet. Notice the annotation to specify the custom converter (don't forget to make the column insertable = false, updatable = false):
/**
* A sample super class that uses RowVersion
*/
@MappedSuperclass
public abstract class SyncPoint {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
// type is rowver for SQLServer, blob(8) for postgresql and h2
@Column(name = "current_database_version", insertable = false, updatable = false)
@Convert(converter = RowVersionTypeConverter.class)
private RowVersion currentDatabaseVersion;
@Column(name = "created_date_utc", columnDefinition = "timestamp", nullable = false)
private DateTime createdDate;
...
Also (for this example) here is the Program object we want to find:
@Entity
@Table(name = "program_table")
public class Program {
@Id
private Integer id;
private boolean active;
// type is rowver for SQLServer, blob(8) for postgresql and h2
@Column(name = "rowver", insertable = false, updatable = false)
@Convert(converter = RowVersionTypeConverter.class)
private RowVersion currentDatabaseVersion;
@Column(name = "last_chng_dt")
private DateTime lastUpdatedDate;
...
Now you can use these fields within your JPA criteria queries just like anything else.. here is a snippet that we used inside a spring-data Specifications class:
/**
* Find Programs changed after a synchronization point
*
* @param filter that has the changedAfter sync point
* @return a specification or null
*/
public Specification<Program> changedBeforeOrEqualTo(final ProgramSearchFilter filter) {
return new Specification<Program>() {
@Override
public Predicate toPredicate(Root<Program> root, CriteriaQuery<?> query, CriteriaBuilder cb) {
if (filter != null && filter.changedAfter() != null) {
// load the SyncPoint from the db to get the rowver column populated
SyncPoint fromDb = synchronizationPersistence.reload(filter.changedBeforeOrEqualTo());
if (fromDb != null) {
// real sync point made by database
if (fromDb.getCurrentDatabaseVersion() != null) {
// use binary version
return cb.lessThanOrEqualTo(root.get(Program_.currentDatabaseVersion),
fromDb.getCurrentDatabaseVersion());
} else if (fromDb.getCreatedDate() != null) {
// use timestamp instead of binary version cause db doesn't make one
return cb.lessThanOrEqualTo(root.get(Program_.lastUpdatedDate),
fromDb.getCreatedDate());
}
}
}
return null;
}
};
}
The specification above works with both the binary current database version or a timestamp.. this way I could test my stuff and all the upstream code on a database other than SQLServer.
That's it really: a) type to wrap the byte[] b) JPA converter c) use attribute in query.

How does one make NHibernate stop using nvarchar(4000) for insert parameter strings?

I need to optimize a query that is being produced by a save (insert query) on a domain entity. I've configured NHibernate using Fluent NHibernate.
Here's the query generated by NHibernate during the insertion of a user's response to a poll:
exec sp_executesql N'INSERT INTO dbo.Response (ModifiedDate, IpAddress, CountryCode,
IsRemoteAddr, PollId) VALUES (@p0, @p1, @p2, @p3, @p4); select SCOPE_IDENTITY()',N'@p0
datetime,@p1 nvarchar(4000),@p2 nvarchar(4000),@p3 bit,@p4 int',
@p0='2001-07-08 03:59:05',@p1=N'127.0.0.1',@p2=N'US',@p3=1,@p4=2
If one looks at the input parameters for IpAddress and CountryCode, one will notice that NHibernate is using nvarchar(4000). The problem is that nvarchar(4000) is far larger than I need for either IpAddress or CountryCode and due to high traffic and hosting requirements I need to optimize the database for memory usage.
Here's the Fluent NHibernate auto-mapping overrides for those columns:
mapping.Map(x => x.IpAddress).CustomSqlType("varchar(15)");
mapping.Map(x => x.CountryCode).CustomSqlType("varchar(6)");
This isn't the only place that I see unnecessary nvarchar(4000)'s popping up.
How do I control NHibernate's usage of nvarchar(4000) for string representation?
How do I change this insert statement to use the proper sized input parameters?
Specify the Type as NHibernateUtil.AnsiString with a Length instead of using a CustomSqlType.
This issue can cause a huge performance problem in queries if it forces SQL Server to perform a table scan instead of using an index. We use varchar throughout our database so I created a convention to set the type globally:
/// <summary>
/// Convert all string properties to AnsiString (varchar). This does not work with SQL CE.
/// </summary>
public class AnsiStringConvention : IPropertyConventionAcceptance, IPropertyConvention
{
public void Accept(IAcceptanceCriteria<IPropertyInspector> criteria)
{
criteria.Expect(x => x.Property.PropertyType.Equals(typeof(string)));
}
public void Apply(IPropertyInstance instance)
{
instance.CustomType("AnsiString");
}
}
Okay, this is what we had to do: the SqlClientDriver ignores the Length property of the SqlType, so we created our own driver class inheriting from SqlClientDriver and overrode the GenerateCommand method. Something like this:
public override IDbCommand GenerateCommand(CommandType type, NHibernate.SqlCommand.SqlString sqlString, SqlType[] parameterTypes)
{
var dbCommand = base.GenerateCommand(type, sqlString, parameterTypes);
SetParameterSizes(dbCommand.Parameters, parameterTypes);
return dbCommand;
}
private static void SetParameterSizes(IDataParameterCollection parameters, SqlType[] parameterTypes)
{
for (int index = 0; index < parameters.Count; ++index)
SetVariableLengthParameterSize((IDbDataParameter)parameters[index], parameterTypes[index]);
}
private static void SetVariableLengthParameterSize(IDbDataParameter dbParam, SqlType sqlType)
{
SetDefaultParameterSize(dbParam, sqlType);
if (sqlType.LengthDefined && !IsText(dbParam, sqlType) && !IsBlob(dbParam, sqlType))
dbParam.Size = sqlType.Length;
if (!sqlType.PrecisionDefined)
return;
dbParam.Precision = sqlType.Precision;
dbParam.Scale = sqlType.Scale;
}
Here is a workaround, if you want to replace all nvarchar parameters with varchar:
public class Sql2008NoNVarCharDriver : Sql2008ClientDriver
{
public override void AdjustCommand(IDbCommand command)
{
foreach (System.Data.SqlClient.SqlParameter x in command.Parameters)
{
if (x.SqlDbType == SqlDbType.NVarChar)
{
x.SqlDbType = SqlDbType.VarChar;
}
}
base.AdjustCommand(command);
}
}
Then plug it into your config:
var cfg = Fluently.Configure()
.Database(MsSqlConfiguration.MsSql2008.ConnectionString(connectionString)
.Driver<Sql2008NoNVarCharDriver>())
...
