Parse JSON data within SQL Server Integration Services Package? - sql-server

I'm trying to set up an SSIS job that will pull a JSON-encoded mailing list from MailChimp, compare it to a list of customers in our CRM database (SQL Server), and upload via JSON any new customers not already there. I can't seem to find anything on serializing/deserializing JSON within SSIS other than writing a script task, and it seems that I can't import the .NET serialization libraries into a script. Any suggestions? Thanks in advance!

A couple of things to address here:

First, your problem with adding new libraries in the scripting component. I assume you're using VS 2008 for your SSIS development and want to use the .NET 3.5 library to do this. You go to Project, Add Reference, and you don't see any of the DLLs you need. This may be in part because you're using Windows 7 and the compact 3.5 framework. .NET 3.5.1 comes with Windows 7; you just have to enable it. Go to Control Panel, Programs and Features. In that screen you will see Turn Windows features on or off; click on that. In that window check Microsoft .NET Framework 3.5.1 (this may take a few minutes to run).

Once it finishes, look for directories similar to these: C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework.NETFramework\v3.5\Profile\Client and C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\v3.5. Between these two directories, you will find any DLL you need for serialization/deserialization of JSON. These can be added to your project by going to Project --> Add Reference --> Browse tab, then navigating to the v3.5 directory and selecting the DLLs you need (System.Web.Extensions.dll (v3.5.30729.5446) is used in this example).
To get JSON from a web service, deserialize it, and send the data to your CRM database, you will have to use a script component as a source in your data flow and add columns to your output buffer that will hold the data coming from the JSON feed (on the Inputs and Outputs screen). In the code, you will need to override the CreateNewOutputRows method. Here is an example of how to do this:
Say your JSON looked like this:

[{"CN":"ALL","IN":"Test1","CO":0,"CA":0,"AB":0},{"CN":"ALL","IN":"Test2","CO":1,"CA":1,"AB":0}]
I would first define a class to mirror the JSON feed's attributes (and the columns you defined on the Inputs and Outputs screen) that will eventually hold these values once you deserialize, as such:
class WorkGroupMetric
{
    public string CN { get; set; }
    public string IN { get; set; }
    public int CO { get; set; }
    public int CA { get; set; }
    public int AB { get; set; }
}
Now you need to call your web service and get the JSON feed using an HttpWebRequest and a Stream:
string wUrl = "YOUR WEB SERVICE URI";
string jsonString;

HttpWebRequest httpWReq = (HttpWebRequest)WebRequest.Create(wUrl);
HttpWebResponse httpWResp = (HttpWebResponse)httpWReq.GetResponse();
Stream responseStream = httpWResp.GetResponseStream();

using (StreamReader reader = new StreamReader(responseStream))
{
    jsonString = reader.ReadToEnd();
}
Now we deserialize our JSON into an array of WorkGroupMetric:
JavaScriptSerializer sr = new JavaScriptSerializer();
WorkGroupMetric[] jsonResponse = sr.Deserialize<WorkGroupMetric[]>(jsonString);
After deserializing, we can now output the rows to the output buffer:
foreach (var metric in jsonResponse)
{
    Output0Buffer.AddRow();
    Output0Buffer.CN = metric.CN;
    Output0Buffer.IN = metric.IN;
    Output0Buffer.CO = metric.CO;
    Output0Buffer.CA = metric.CA;
    Output0Buffer.AB = metric.AB;
}
Here is what all the code looks like put together:
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Net;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.IO;
using System.Web.Script.Serialization;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void CreateNewOutputRows()
    {
        string wUrl = "YOUR WEB SERVICE URI";

        try
        {
            WorkGroupMetric[] outPutMetrics = getWebServiceResult(wUrl);

            // Write one output row per deserialized JSON object.
            foreach (var metric in outPutMetrics)
            {
                Output0Buffer.AddRow();
                Output0Buffer.CN = metric.CN;
                Output0Buffer.IN = metric.IN;
                Output0Buffer.CO = metric.CO;
                Output0Buffer.CA = metric.CA;
                Output0Buffer.AB = metric.AB;
            }
        }
        catch (Exception e)
        {
            failComponent(e.ToString());
        }
    }

    private WorkGroupMetric[] getWebServiceResult(string wUrl)
    {
        HttpWebRequest httpWReq = (HttpWebRequest)WebRequest.Create(wUrl);
        HttpWebResponse httpWResp = (HttpWebResponse)httpWReq.GetResponse();
        WorkGroupMetric[] jsonResponse = null;

        try
        {
            // Only attempt to deserialize when the service returns 200 OK.
            if (httpWResp.StatusCode == HttpStatusCode.OK)
            {
                Stream responseStream = httpWResp.GetResponseStream();
                string jsonString;

                using (StreamReader reader = new StreamReader(responseStream))
                {
                    jsonString = reader.ReadToEnd();
                }

                JavaScriptSerializer sr = new JavaScriptSerializer();
                jsonResponse = sr.Deserialize<WorkGroupMetric[]>(jsonString);
            }
            else
            {
                failComponent(httpWResp.StatusCode.ToString());
            }
        }
        catch (Exception e)
        {
            failComponent(e.ToString());
        }

        return jsonResponse;
    }

    // Fail the component and surface the error message in the SSIS log.
    private void failComponent(string errorMsg)
    {
        bool fail = false;
        IDTSComponentMetaData100 compMetadata = this.ComponentMetaData;
        compMetadata.FireError(1, "Error Getting Data From Webservice!", errorMsg, "", 0, out fail);
    }
}

class WorkGroupMetric
{
    public string CN { get; set; }
    public string IN { get; set; }
    public int CO { get; set; }
    public int CA { get; set; }
    public int AB { get; set; }
}
This can now be used as an input for a data destination (your CRM database). Once there you can use SQL to compare the data and find mismatches, send the data to another script component to serialize, and send any updates you need back to the web service.
OR
You can also do everything in the script component and not output data to the output buffer. In this situation you would still need to deserialize the JSON, but put the data into some sort of collection. Then use Entity Framework and LINQ to query your database and the collection, determine what doesn't match, serialize it, and send that to the web service in the same script component.
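For illustration, here is a minimal sketch of that comparison step, assuming a hypothetical MailChimpMember DTO with an Email property; the list of existing CRM emails would come from your own query (via Entity Framework or a plain SqlCommand):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web.Script.Serialization;

// Hypothetical DTO mirroring the MailChimp member JSON; adjust to the real feed.
class MailChimpMember
{
    public string Email { get; set; }
    public string Name { get; set; }
}

static class MemberFilter
{
    // Returns the feed members whose email is not already present in the CRM.
    public static List<MailChimpMember> FindNewMembers(string jsonString, IEnumerable<string> existingCrmEmails)
    {
        var sr = new JavaScriptSerializer();
        MailChimpMember[] members = sr.Deserialize<MailChimpMember[]>(jsonString);

        var known = new HashSet<string>(existingCrmEmails, StringComparer.OrdinalIgnoreCase);
        return members.Where(m => !known.Contains(m.Email)).ToList();
    }
}

You could then serialize the resulting list with the same JavaScriptSerializer and post it back to the web service.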

Related

Azure Logic Apps - How to implement long-running tasks via Durable Functions, via the polling action pattern

I have an Azure function that takes in these parameters (source container, source filename, destination folder, and destination container) and unzips the source file into the folder in the destination container. There are several actions in the Logic App workflow following the Azure unzip action that were not completed, because the Logic App would time out after the unzipping finished due to the file size. So the Azure function was revamped into a durable function, and I am trying to implement it in my Logic App via the polling action pattern. According to this site, https://medium.com/@jeffhollan/calling-long-running-functions-from-logic-apps-6d7ba5044701, I can use the built-in Azure Functions action, but I have no idea what the actual workflow should look like. I am looking for a step-by-step graphic demonstration of how to implement the durable function via the polling action pattern in my Logic App, like the one on this page, https://yourazurecoach.com/2018/08/19/perform-long-running-logic-apps-tasks-with-durable-functions/ (which shows how to implement it via the webhook action pattern). Any help would be greatly appreciated. Thanks in advance.
After the built-in Azure Functions action executes in your Logic App, you will get a 202 status code back. Then you can handle your business in the QueueTrigger function.
You can have a look at this sample:
using System.IO;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.AspNetCore.Http;
using Microsoft.Azure.WebJobs.Host;
using Newtonsoft.Json;
using System.Threading;
using System.Net.Http;
using System;

namespace HttpToQueueWebhook
{
    public static class HttpTrigger
    {
        [FunctionName("HttpTrigger")]
        public static IActionResult Run(
            [HttpTrigger(AuthorizationLevel.Function, "post")]HttpRequest req,
            TraceWriter log,
            [Queue("process")]out ProcessRequest process)
        {
            log.Info("Webhook request from Logic Apps received.");

            string requestBody = new StreamReader(req.Body).ReadToEnd();
            dynamic data = JsonConvert.DeserializeObject(requestBody);
            string callbackUrl = data?.callbackUrl;

            // This will drop a message in a queue that QueueTrigger will pick up.
            process = new ProcessRequest { callbackUrl = callbackUrl, data = "some data" };
            return new AcceptedResult();
        }

        public static HttpClient client = new HttpClient();

        /// <summary>
        /// Queue trigger function to pick up the item and do the long work. It then
        /// invokes the callback URL to have the Logic App continue.
        /// </summary>
        [FunctionName("QueueTrigger")]
        public static void Run([QueueTrigger("process")]ProcessRequest item, TraceWriter log)
        {
            log.Info($"C# Queue trigger function processed: {item.data}");

            //Thread.Sleep(TimeSpan.FromMinutes(3));
            // Handle your long-running business logic here, then post the result back.
            ProcessResponse result = new ProcessResponse { data = "some result data" };
            client.PostAsJsonAsync<ProcessResponse>(item.callbackUrl, result);
        }
    }

    public class ProcessRequest
    {
        public string callbackUrl { get; set; }
        public string data { get; set; }
    }

    public class ProcessResponse
    {
        public string data { get; set; }
    }
}
For more details, you can refer to this answer: logic apps web hook to chalkboard API timeout error.

Redis vs SQL Server performance

Application performance is one of the main reasons for using a cache in front of a relational database. Because a cache stores data in memory as key-value pairs, we can keep frequently accessed data that does not change very often in it, and reading from the cache is much faster than reading from the database. Redis is one of the best solutions on the distributed cache market.
I was doing a performance test between Azure Cache for Redis and Azure SQL Server. I created a simple ASP.NET Core application in which I read data from the SQL Server database as well as from Redis multiple times and compared the read durations. For the database reads I used Entity Framework Core, and for the Redis reads I used 'Microsoft.Extensions.Caching.StackExchangeRedis'.
Model
using System;

namespace WebApplication2.Models
{
    [Serializable]
    public class Student
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public int Age { get; set; }
        public string Subject { get; set; }

        public Student()
        {
            Name = string.Empty;
            Subject = string.Empty;
        }
    }
}
Entity Framework Core data context.
using Microsoft.EntityFrameworkCore;
using WebApplication2.Models;

namespace WebApplication2.Data
{
    public class StudentContext : DbContext
    {
        public StudentContext(DbContextOptions<StudentContext> options)
            : base(options)
        {
        }

        public DbSet<Student>? Students { get; set; }
    }
}
Startup class
public void ConfigureServices(IServiceCollection services)
{
    services.AddControllersWithViews();

    string studentDbConnectionString = Configuration.GetConnectionString("StudentDbConnectionString");
    services.AddDbContext<StudentContext>(option => option.UseSqlServer(studentDbConnectionString));

    string redisConnectionString = Configuration.GetConnectionString("RedisConnectionString");
    services.AddStackExchangeRedisCache(options =>
    {
        options.Configuration = redisConnectionString;
    });
}
appsettings.json
{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft": "Warning",
      "Microsoft.Hosting.Lifetime": "Information"
    }
  },
  "AllowedHosts": "*",
  "ConnectionStrings": {
    "StudentDbConnectionString": "[Azure SQL Server connection string]",
    "RedisConnectionString": "[Azure Redis cache connection string]"
  }
}
Home controller
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Caching.Distributed;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Runtime.Serialization.Formatters.Binary;
using WebApplication2.Data;
using WebApplication2.Models;

namespace WebApplication2.Controllers
{
    public class HomeController : Controller
    {
        private readonly StudentContext _studentContext;
        private readonly IDistributedCache _cache;

        public HomeController(StudentContext studentContext, IDistributedCache cache)
        {
            _studentContext = studentContext;
            _cache = cache;
        }

        public IActionResult Index()
        {
            List<Student>? students = null;
            var counter = 10000;

            // Time 10,000 reads from the database.
            var sw = Stopwatch.StartNew();
            for (var i = 0; i < counter; i++)
            {
                students = _studentContext.Students.OrderBy(student => student.Id).ToList();
            }
            sw.Stop();
            ViewData["DatabaseDuration"] = $"Database: {sw.ElapsedMilliseconds}";

            if (students != null && students.Count > 0)
            {
                List<Student> studentsFromCache;
                var key = "Students";
                _cache.Set(key, ObjectToByteArray(students));

                // Time 10,000 reads from Redis (including deserialization).
                sw.Restart();
                for (var i = 0; i < counter; i++)
                {
                    studentsFromCache = (List<Student>)ByteArrayToObject(_cache.Get(key));
                }
                sw.Stop();
                ViewData["RedisDuration"] = $"Redis: {sw.ElapsedMilliseconds}";
            }

            return View();
        }

        private byte[] ObjectToByteArray(object obj)
        {
            var bf = new BinaryFormatter();
            using var ms = new MemoryStream();
            bf.Serialize(ms, obj);
            return ms.ToArray();
        }

        private object ByteArrayToObject(byte[] arrBytes)
        {
            using var memStream = new MemoryStream();
            var binForm = new BinaryFormatter();
            memStream.Write(arrBytes, 0, arrBytes.Length);
            memStream.Seek(0, SeekOrigin.Begin);
            object obj = binForm.Deserialize(memStream);
            return obj;
        }
    }
}
Home\Index.cshtml view
@{
    ViewData["Title"] = "Home Page";
}
<div class="text-center">
    <p>@ViewData["DatabaseDuration"]</p>
    <p>@ViewData["RedisDuration"]</p>
</div>
I have found SQL Server to be faster than Redis.
The ASP.NET Core application is hosted in an Azure App Service in the same location as the Azure SQL Server and Azure Redis instances.
Please let me know why Redis is slower than SQL Server.
I have used github.com/dotnet/BenchmarkDotNet to benchmark the Azure SQL Server database and Azure Cache for Redis over 10,000 reads: the SQL Server mean was 16.48 sec and the Redis mean was 29.53 sec.
I have also used JMeter to connect 100 users, each reading the SQL Server database/Redis 1,000 times. There was not much difference in the total time it took to finish reading from SQL Server versus Redis (both around 3 min 30 sec), but I saw load on the Azure SQL Server database: its DTU usage went to nearly 100% during the test.
In conclusion, I think speed is not the only reason to use a Redis cache over a SQL Server database; another reason is that a Redis cache takes a good amount of load off the database.
BTW, you don't only see the performance difference here. For caching, Redis also gives you cache invalidation logic, which you would need to build yourself on top of a SQL in-memory table. So Redis all the way when it comes to caching.
Think about what's happening here.

In SQL:
Process -> TCP -> read-optimised store (single table) -> serialisation into application models

In Redis:
Process -> check for cache hit -> TCP -> read-optimised store (single table) -> serialisation into application models

Redis is great, but don't mistake its purpose: if you are doing a read from an indexed table on a well-optimised index, then SQL is going to be quick, so why would Redis be any quicker? The power of a distributed cache comes in when your authoritative store or your process has to do some computation to produce the result, so what you are really saving by caching is CPU/disk time (be it on SQL or in-process).

If you really want to increase speed, it's an in-memory cache that you want. This, however, isn't as simple as it first sounds: the real trick is a way to invalidate the in-memory cache across a distributed cluster upon a change to the authoritative store. A rough sketch follows.
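As an illustration of that last point, here is a minimal read-through in-memory cache sketch using IMemoryCache from the Microsoft.Extensions.Caching.Memory package (the Student model is from the question; the cache key, the five-minute expiry, and the method shape are my assumptions, and the cross-cluster invalidation problem is deliberately left unsolved):

using System;
using System.Collections.Generic;
using Microsoft.Extensions.Caching.Memory;

public class StudentMemoryCache
{
    private readonly IMemoryCache _cache = new MemoryCache(new MemoryCacheOptions());

    public List<Student> GetStudents(Func<List<Student>> loadFromDatabase)
    {
        // A hit never leaves the process: no TCP hop, no deserialization.
        return _cache.GetOrCreate("Students", entry =>
        {
            // Expiry only ages entries out; it does not invalidate other nodes.
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
            return loadFromDatabase();
        });
    }
}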
Hope this helps

Reading from CosmosDB and write to Azure SQL

I have an Azure function that reads from Cosmos DB and writes to SQL. Since I am new to coding, I struggle a little to understand how to read the incoming document. I can see that documents show up in input:
public static async Task Run([CosmosDBTrigger(
    databaseName: "ToDoList",
    collectionName: "Items",
    ConnectionStringSetting = "CosmosDB",
    LeaseCollectionName = "leases")]IReadOnlyList<Document> input, ILogger log)
{
    if (input != null && input.Count > 0)
    {
        // ...
    }
}
I know that I have to read the document and deserialize it to a C# object, for which I have this code (assuming it is correct):
Record resultRecord = JsonConvert.DeserializeObject<Record>(jsonString);
I am lost on how to get the data from the JSON document and write it to the C# object; the connecting part is confusing for me.
I also have SQL code, and again I don't understand how to connect my C# object so the data can be read and written to the SQL database.
var cnnString = "sqlConnection"; // Connecting to Azure SQL Database

using (var sqlConnection = new SqlConnection(cnnString)) // Start a SQL connection with the connection string
{
    sqlConnection.Open();

    var cmd = new SqlCommand
    {
        // Insert command (used to insert data into a table)
        CommandText = @"insert into [dbo].[Player] ([User]) values (@User)",
        CommandType = CommandType.Text,
        Connection = sqlConnection,
    };

    var record = new Record();

    // Set parameters
    cmd.Parameters.Add(new System.Data.SqlClient.SqlParameter("@User", record.Email));
    await cmd.ExecuteNonQueryAsync();
}
I am not sure if this is the right way of asking a question about code, but I appreciate any help.
To get data from a JSON document, you can use the Newtonsoft.Json library to parse it.
I think you can change your code like this:
List<Info> jobInfoList = JsonConvert.DeserializeObject<List<Info>>(json);
And here is a sample of how to get data from a JSON document:
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

class Program
{
    static void Main(string[] args)
    {
        string json = @"[{'id':9527,'username':'admin'}]";
        List<Info> jobInfoList = JsonConvert.DeserializeObject<List<Info>>(json);

        foreach (Info jobInfo in jobInfoList)
        {
            Console.WriteLine("UserName:" + jobInfo.username);
        }
    }
}

public class Info
{
    public string id { get; set; }
    public string username { get; set; }
}
You can declare variables to receive the data you want from the JSON string in the foreach loop, then insert that data into your SQL database as parameters.
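As a rough sketch of how the two halves could connect inside the trigger (assuming Record has an Email property matching the document's fields and that the connection string is stored in an app setting named sqlConnection; all names here are illustrative):

// Inside the if (input != null && input.Count > 0) block of the trigger:
foreach (Document doc in input)
{
    // Document.ToString() returns the document's JSON, which maps onto Record.
    Record resultRecord = JsonConvert.DeserializeObject<Record>(doc.ToString());

    var cnnString = Environment.GetEnvironmentVariable("sqlConnection");
    using (var sqlConnection = new SqlConnection(cnnString))
    {
        await sqlConnection.OpenAsync();
        using (var cmd = new SqlCommand(
            "insert into [dbo].[Player] ([User]) values (@User)", sqlConnection))
        {
            // Pass the deserialized value as a parameter, never by string concatenation.
            cmd.Parameters.AddWithValue("@User", resultRecord.Email);
            await cmd.ExecuteNonQueryAsync();
        }
    }
}

// With Record defined to mirror the document, e.g.:
public class Record
{
    public string Email { get; set; }
}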
For how to connect to the SQL database and write data to it, see this quickstart:
Use .NET (C#) with Visual Studio to connect and query an Azure SQL database:
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-connect-query-dotnet-visual-studio#insert-code-to-query-sql-database

net core 1 (dnx 4.5.1) with enterpriselibrary 6 - setting up the connection string

I've got big problems running the Enterprise Library Data Access block with .NET Core 1 (dnx 4.5.1).
How can I set up the default connection string for EntLib?
My appsettings.json:
"ConnectionString": "Server=localhost\\sqlexpress;Initial Catalog=blind;User Id=blind;Password=blind"
Here is my problem (no default connection string):
Database db = DatabaseFactory.CreateDatabase();
How can I pass the appsettings ConnectionString to the EntLib DatabaseFactory?
Any help would be greatly appreciated.
I know it's an old question, but I have a similar setup (using .NET Core 2.0 instead) and it took me a while to figure out how to set the default database connection without using the web.config to manage it.
What I did was include the default database name and all of the connection strings in appsettings.json, then in my Startup class read appsettings.json into an object that I defined to store the default db name and the connection strings, and configure the default and named databases using DatabaseFactory.SetDatabases().
DatabaseFactory.SetDatabases() Definition
public class DataConfiguration
{
    public string DefaultDatabase { get; set; }
    public List<ConnectionStringSettings> ConnectionStrings { get; set; }
}

public class Startup
{
    // Holds the connection info read from appsettings.json (declarations added for completeness).
    public DataConfiguration DataConfig { get; set; }
    public IConfiguration Configuration { get; }

    public Startup(IConfiguration configuration)
    {
        // Get the database connections from appsettings.json
        DataConfig = configuration.Get<DataConfiguration>();
        var defaultDb = DataConfig.ConnectionStrings?.Find(c => c.Name == DataConfig.DefaultDatabase);

        DatabaseFactory.SetDatabases(() => new SqlDatabase(defaultDb.ConnectionString), GetDatabase);
        Configuration = configuration;
    }

    public Database GetDatabase(string name)
    {
        var dbInfo = DataConfig.ConnectionStrings.Find(c => c.Name == name);
        if (dbInfo.ProviderName == "System.Data.SqlClient")
        {
            return new SqlDatabase(dbInfo.ConnectionString);
        }
        return new MySqlDatabase(dbInfo.ConnectionString);
    }
}
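For reference, the appsettings.json this binds against would look roughly like this (names and values are illustrative; the shape just has to match DataConfiguration):

{
  "DefaultDatabase": "MainDb",
  "ConnectionStrings": [
    {
      "Name": "MainDb",
      "ProviderName": "System.Data.SqlClient",
      "ConnectionString": "Server=localhost\\sqlexpress;Initial Catalog=blind;User Id=blind;Password=blind"
    },
    {
      "Name": "SecondaryDb",
      "ProviderName": "MySql.Data.MySqlClient",
      "ConnectionString": "[MySQL connection string]"
    }
  ]
}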
Whenever there is documentation, I always suggest reading it, as it is usually good. This is one of those examples; check out "Getting Started with ASP.NET 5 and Entity Framework 6". There are several things that you need to do to ensure that you are correctly configured.
Set up your connection string and DI:
public class ApplicationDbContext : DbContext
{
    public ApplicationDbContext(string nameOrConnectionString)
        : base(nameOrConnectionString)
    {
    }
}
Also, notice the path in the configuration, it seems to differ from yours.
public void ConfigureServices(IServiceCollection services)
{
    services.AddScoped((_) =>
        new ApplicationDbContext(
            Configuration["Data:DefaultConnection:ConnectionString"]));

    // Configure remaining services
}

DataContract doesn't work after publish into web site

I tried to solve this by myself, but... looks like I need help from people.
I have a Business Silverlight application with WCF RIA and Entity Framework. I access the database via LINQ to Entities.
I normally load data from the database like this:
return DbContext.Customers
This code returns the full Customers table from the database. But sometimes I do not need to show all the data, and the easy way is to use LINQ filters on the client side with code like this:
public LoadInfo()
{
    ...
    var LO1 = PublicDomainContext.Load(PublicDomainContext.GetCustomersQuery());
    LO1.Completed += LO1Completed;
    ...
}

private void LO1Completed(object sender, EventArgs eventArgs)
{
    ...
    DatatViewGrid.ItemsSource = null;
    DatatViewGrid.ItemsSource = loadOperation.Entities.Where(c => c ...filtering...);
    //or PublicDomainContext.Customers.Where(c => c ...filtering...)
    ...
}
However, this way has a very important flaw: all data passing from the server to the client side via the DomainService may be viewed by applications like Fiddler. So I need to come up with another way.
The task: filter the data on the server side and return only that data.
Way #1: LINQ to Entities has a beautiful projection method:

//MSDN Example
var query =
    contacts.SelectMany(
        contact => orders.Where(order =>
            (contact.ContactID == order.Contact.ContactID)
            && order.TotalDue < totalDue)
        .Select(order => new
        {
            ContactID = contact.ContactID,
            LastName = contact.LastName,
            FirstName = contact.FirstName,
            OrderID = order.SalesOrderID,
            Total = order.TotalDue
        }));
But, unfortunately, DomainServices cannot return anonymous types, so this way won't work.
Way #2: I found the next solution: make separate DTO (Data Transfer Object) classes. I just read some samples and made the following class on the server side:
[DataContract]
public partial class CustomerDTO
{
    [DataMember]
    public int ISN { get; set; }

    [DataMember]
    public string FIO { get; set; }

    [DataMember]
    public string Listeners { get; set; }
}
And based on this class I made a set of methods which return filtered data:
[OperationContract]
public List<CustomerDTO> Customers_Common()
{
    return DbContext.Customers....Select(c => new CustomerDTO { ISN = c.ISN, FIO = c.FIO, Listeners = c.Listeners }).ToList();
}
And this works fine, all good...
But there is a strange problem: running the application locally causes no trouble, but after publishing the project to the web site, the DomainService returns an HTTP 500 error ("Not Found" exception) for every method. Of course, I cannot even log in to my application; the DomainService is dead. If I delete the last class and the new methods from the application and republish, everything works fine again, but without the special filtering...
The question: what am I doing wrong, and why does the service die with the new classes? Or tell me another way to solve my problem. Please.
UPDATE:
Hey, finally I solved this!
Here is the answer: Dynamic query with WCF RIA Services
Your best shot is to find out what is causing the error. To do that, override the OnError method on the DomainService like this:
protected override void OnError(DomainServiceErrorInfo errorInfo)
{
    /* Log the error info to a file. Don't forget inner exceptions. */
    base.OnError(errorInfo);
}
This is useful because only two exceptions will be passed to the client, so if there are a lot of nested inner exceptions, this way you can still see what actually causes the error.
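A minimal sketch of such logging, walking the inner-exception chain (the log path is illustrative):

protected override void OnError(DomainServiceErrorInfo errorInfo)
{
    var sb = new System.Text.StringBuilder();

    // Walk the whole chain; the client will only ever see the outermost two exceptions.
    for (Exception e = errorInfo.Error; e != null; e = e.InnerException)
    {
        sb.AppendLine(e.GetType().FullName + ": " + e.Message);
        sb.AppendLine(e.StackTrace);
    }

    System.IO.File.AppendAllText(@"C:\logs\DomainServiceErrors.txt", sb.ToString());
    base.OnError(errorInfo);
}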
In addition, you can inspect the error by attaching the debugger to the browser instance you are opening the site with. In VS2010 this is done via [Debug] -> [Attach to Process] in the menu bar.
