How can I export a MongoDB collection to a .csv that is readable/compatible with SQL Server?

I used mongoexport to export the collection to a CSV file, specifying the fields. However, when I tried to import the .csv into SQL Server using SSIS, I got errors, and the data shown in the preview section before executing the package was wrong. Can anyone please guide me on how to export the data so that it can be easily imported into SQL Server? I am open to minor tuning, such as adding an id column or changing data types.

So it looks like you are getting a JSON-formatted object for each row in the file. It may be different from that, and if that's the case, how you iterate through each row might change a little. But here is something to get you started:
1 - There is no out-of-the-box JSON parser in .NET, but this seems to be a popular utility: http://www.newtonsoft.com/json
2 - If you download the parser above, you'll have to go through a little pain to get it into a usable state with SSIS:
    Go to the source folder and open the solution for net40
    Sign the assembly
    Comment out the lines that cause build errors:
    //[assembly: InternalsVisibleTo("Newtonsoft.Json.Schema")]
    //[assembly: InternalsVisibleTo("Newtonsoft.Json.Tests")]
    Install the assembly to the GAC
3 - Once all that is out of the way, add a file connection manager to your package and point it to the MongoDB export file
4 - Add a data flow and then add a script component source to the data flow
5 - In the script component, configure the connection manager that you created in step 3. I gave mine the friendly name "MongoDbOutput"
6 - In the Inputs and Outputs section, go to Output0 and add a column for each field in the JSON object, setting the data types to string, as they will be int by default
7 - Open the script and add a reference to Newtonsoft.Json and System.IO
8 - The script below shows how to get the file's connection string from the connection manager, read the file with a StreamReader one line at a time, and then parse each JSON object for each address. One row is added for every address, and the name and ssn are repeated on each row.
Note also that I added a Person and an address class. Json.NET is pretty cool: it will take that JSON and push it into the object, provided all the fields match up. Good luck!
using System.IO;
using Newtonsoft.Json;

public override void CreateNewOutputRows()
{
    // The file connection manager returns the path of the exported file.
    object MongoDbPath = Connections.MongoDbOutput.AcquireConnection(null);
    string filePath = MongoDbPath.ToString();

    using (StreamReader fileContents = new StreamReader(filePath))
    {
        // Each line of the file is expected to hold one JSON document.
        while (fileContents.Peek() >= 0)
        {
            var contents = fileContents.ReadLine();
            Person person = JsonConvert.DeserializeObject<Person>(contents);

            // Emit one output row per address, repeating name and ssn.
            foreach (address address in person.addresses)
            {
                Output0Buffer.AddRow();
                Output0Buffer.name = person.name;
                Output0Buffer.ssn = person.ssn;
                Output0Buffer.City = address.city;
                Output0Buffer.Street = address.street;
                Output0Buffer.country = address.cc;
            }
        }
    }
}

// Classes whose property names match the fields of the JSON documents.
public class Person
{
    public string name { get; set; }
    public string ssn { get; set; }
    public address[] addresses { get; set; }
}

public class address
{
    public string street { get; set; }
    public string city { get; set; }
    public string cc { get; set; }
}

Related

SSIS Data Flow Task Error: Object reference not set to an instance of an object

I am trying to create a package containing several Data Flow tasks in them. The tasks are fairly similar in nature, but contain fairly important differences.
I have tried to copy the task, then change the things which need changing.
When I run the task by itself it runs fine, however when I run it with all the other tasks in the package I get the below error:
Object reference not set to an instance of an object. at
ScriptMain.Input0_ProcessInputRow(Input0Buffer Row) at
UserComponent.Input0_ProcessInput(Input0Buffer Buffer) at
UserComponent.ProcessInput(Int32 InputID, String InputName,
PipelineBuffer Buffer, OutputNameMap OutputMap) at
Microsoft.SqlServer.Dts.Pipeline.ScriptComponent.ProcessInput(Int32
InputID, PipelineBuffer buffer) at
Microsoft.SqlServer.Dts.Pipeline.ScriptComponentHost.ProcessInput(Int32
inputID, PipelineBuffer buffer)
Not the most friendly error message.
Can anyone tell me what this is and how to fix it? I assume that there is some variable or other attribute which is being repeated, but which one?
Note that many of the columns over the several data flow tasks will have the same column names.
I figured it out in the end. The reason was that the second-level objects need to be explicitly instantiated, not just declared.
I had
public class Level2
{
    public string Somevalue { get; set; }
}

public class RootAttributes
{
    public Level2 lvl2 { get; set; }
    public string Somevalue2 { get; set; }
}
It should have been
public class Level2
{
    public string Somevalue { get; set; }
}

public class RootAttributes
{
    // Create the nested object up front so it is never null.
    public Level2 lvl2 = new Level2();
    public string Somevalue2 { get; set; }
}
The weird thing was that the top method worked in several other places.
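To illustrate the difference, here is a minimal sketch. The class names RootUninitialized and RootInitialized and the helper ThrowsOnAccess are hypothetical and exist only for this demonstration. One possible explanation (an assumption, not something confirmed above) for why the first form worked elsewhere is that something else, such as a deserializer, happened to assign the nested object in those places.

```csharp
using System;

public class Level2
{
    public string Somevalue { get; set; }
}

// Nested object is left at its default value, which is null.
public class RootUninitialized
{
    public Level2 lvl2 { get; set; }
}

// Nested object is created as soon as the parent is constructed.
public class RootInitialized
{
    public Level2 lvl2 = new Level2();
}

public static class NestedObjectDemo
{
    // Returns true when the supplied read throws a NullReferenceException,
    // which is the "Object reference not set to an instance of an object" error.
    public static bool ThrowsOnAccess(Func<string> read)
    {
        try
        {
            read();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

Calling NestedObjectDemo.ThrowsOnAccess(() => new RootUninitialized().lvl2.Somevalue) reports the exception, while the same call against RootInitialized does not.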
I had this issue with my Send Mail task.
Because I was using a file connection for the body of the email, I had to create a new file connection, and then it worked fine.
Use the package run report in VS 2012 to identify which Data Flow task (DFT) in the package failed.
Use the View Code option of the package to check whether all the DFT tasks have proper inputs.

Dapper Extension Get & Update returns errors

I tried out Dapper Extensions with MS Access and succeeded to a certain extent. All the functions work properly (Insert/Count/GetList/Delete) except Get and Update. I have summarised the code below; if anybody wants, I can paste all of it here.
My Product class
public class Products
{
    public string ProductNumber { get; set; }
    public string Description { get; set; }
}
And in my main class, I tried to get the product and update it as below. The con.Get<Products> call throws an exception with the message "Sequence contains more than one element", and con.Update<Products> throws an exception with the message "At least one Key column must be defined".
using (var con = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=test.mdb"))
{
    string ProductNumber = "12";
    var product4 = con.Get<Products>(ProductNumber);
    product4.ProductNumber = "Baz";
    con.Update<Products>(product4);
    Console.ReadLine();
}
Even though con.Get<Products> fails con.GetList<Products>(predicate) works perfectly. I did follow this link for setup
If DapperExtensions can't infer a key property called ID from your class, you'll need to explicitly specify one via a class mapper. Assuming the ProductNumber is the primary key in the database, the following example class mapper should set the ProductNumber to be the primary key for DapperExtensions.
using Dapper;
using DapperExtensions;
using DapperExtensions.Mapper;

public class ProductsMapper : ClassMapper<Products>
{
    public ProductsMapper()
    {
        Map(x => x.ProductNumber).Key(KeyType.Assigned);
        AutoMap();
    }
}
This class can sit somewhere within the same assembly as the rest of your code. Dapper Extensions will automatically pick it up. If you have your classes and Dapper code in separate assemblies, you can point it to your mapper with the following line:
DapperExtensions.DapperExtensions.SetMappingAssemblies(new[] { typeof(ProductsMapper).Assembly });

How to use Dapper's SqlBuilder?

I can't find any documentation or examples I can follow to use the SqlBuilder class.
I need to generate sql queries dynamically and I found this class. Would this be the best option?
The best place to start is to check out the Dapper source code from its GitHub repo and have a look at the SqlBuilder code. The SqlBuilder class is only 200 lines or so, and you should be able to make an informed choice on whether it is right for your needs.
Another option is to build your own. I personally went down this route as it made sense. Dapper maps select queries directly to a class if you name your class properties the same as your database columns. Alternatively, add an attribute such as DisplayName to map from, and use reflection to get the property names. Put their names and values into a dictionary and you can generate SQL fairly easily from there.
Here is something to get you started.
First, an example class that you can pass to your SQL builder:
public class Foo
{
    public Foo()
    {
        TableName = "Foo";
    }

    public string TableName { get; set; }

    [DisplayName("name")]
    public string Name { get; set; }

    [SearchField("fooId")]
    public int Id { get; set; }
}
This is fairly basic. The idea behind the DisplayName attribute is that it lets you single out the properties you want to include in the auto-generation. In this case TableName does not have a DisplayName attribute, so it will not be picked up by the next method. However, you can use it manually when generating your SQL to get your table name.
// Requires: using System; using System.Collections.Generic; using System.ComponentModel;
public Dictionary<string, object> GetPropertyDictionary()
{
    var propDictionary = new Dictionary<string, object>();
    var passedType = this.GetType();

    foreach (var propertyInfo in passedType.GetProperties())
    {
        // Only properties marked with [DisplayName] take part in SQL generation.
        var isDef = Attribute.IsDefined(propertyInfo, typeof(DisplayNameAttribute));
        if (isDef)
        {
            var value = propertyInfo.GetValue(this, null);
            if (value != null)
            {
                var displayNameAttribute =
                    (DisplayNameAttribute)
                    Attribute.GetCustomAttribute(propertyInfo, typeof(DisplayNameAttribute));
                var displayName = displayNameAttribute.DisplayName;

                // Key: the display name (column name); value: the property's current value.
                propDictionary.Add(displayName, value);
            }
        }
    }

    return propDictionary;
}
This method looks at the properties of its class and, if they are not null and have a DisplayName attribute, adds them to a dictionary with the DisplayName value as the key.
This method is designed to work as part of the model class and would need to be modified to work from a separate helper class. Personally, I have it and all my other SQL generation methods in a base class that all my models inherit from.
Once you have the values in the dictionary, you can use it to dynamically generate SQL based on the model you pass in. You can also use it to populate your Dapper DynamicParameters for use with parameterized SQL.
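As a rough sketch of that last step, the snippet below turns such a dictionary into a parameterized INSERT statement. The class SqlTextBuilder and its method BuildInsert are hypothetical names used only for illustration; they are not part of Dapper. The sketch assumes the table name and the dictionary keys come from your own code (attributes on the model), never from user input, because they are concatenated into the SQL text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SqlTextBuilder
{
    // Builds "INSERT INTO table (a, b) VALUES (@a, @b)" from the dictionary keys.
    // Values are never embedded in the SQL text; they are meant to be passed
    // separately as parameters named after the keys.
    public static string BuildInsert(string tableName, IDictionary<string, object> values)
    {
        if (values.Count == 0)
            throw new ArgumentException("At least one column is required.", nameof(values));

        // Sort the keys so the generated text is deterministic.
        var keys = values.Keys.OrderBy(k => k, StringComparer.Ordinal).ToList();

        var columns = string.Join(", ", keys);
        var parameters = string.Join(", ", keys.Select(k => "@" + k));

        return "INSERT INTO " + tableName + " (" + columns + ") VALUES (" + parameters + ")";
    }
}
```

For a dictionary with the keys "name" and "fooId" and the table name "Foo", this produces INSERT INTO Foo (fooId, name) VALUES (@fooId, @name). The same dictionary can then supply the parameter values when the statement is executed.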
I hope this helps put you on the right path to solving your problems.

How to read, edit and export word documents in WPF without Microsoft office being installed?

I have a WPF application that relies heavily on manipulating documents. I want to know if there is a library that works independently of Microsoft Office Word and that provides the following features:
Reading Word documents (*.doc or RTF will be sufficient, *.docx will be perfect)
Enabling me to edit the document from my WPF app
Enabling me to export the document again into other formats (Word, Excel, PDF)
Free :)
Thanks in advance.
I will try to answer in order:
Reading: This article is good for you.
Edit & export: Maybe this library works for you.
Free: The most difficult part of your question. You can do it for free using the Interop Assemblies for Office. But free controls... Many of the controls around the net are not free.
Hope it helps.
I was faced with a similar question some years ago. I had a Windows Forms application with some 20 reports and about 100 users, and I needed to generate Word documents from the application. The application was installed on a server. My first attempt used Office interop, but it caused performance problems and all kinds of unpredictable exceptions. So I started to look for alternatives and soon landed on OpenXML.
The first idea was that our team would use the OpenXML SDK to generate and manipulate documents. It soon turned out that the learning curve was way too steep, and our management wasn't willing to pay for the extra work.
So we started to look for alternatives. We didn't find any useful free library, so we tried some commercial ones (Aspose, Docentric). Aspose gave great results, but it was too expensive. Docentric's license is cheaper and the product performed well in Word document generation, so we finally decided to purchase it.
WHAT IT TAKES TO GENERATE A DOCUMENT FROM A TEMPLATE
Install Docentric Toolkit (you can get a 30-day trial version for free)
In your Visual Studio project, add references to the 4 Docentric DLLs, which you can find in the installation folder C:\Program Files (x86)\Docentric\Toolkit\Bin
Include Entity Framework via its NuGet package if you will fill data from a SQL database into the Word document
Prepare a Word template, where you define the layout and include the fields that will be filled with data at document generation (see the online documentation for how to do it).
It doesn't take much code to prepare the data to be merged with the template. In my example I prepare orders for customer "BONAP" from the Northwind database. Orders include customer data, order details and product data. The data model also includes header and footer data.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Docentric.Word;
using System.Diagnostics;

namespace WordReporting
{
    // Report data model
    public class ReportData
    {
        public ReportData()
        { }

        public string headerReportTemplatetName { get; set; }
        public string footerDateCreated { get; set; }
        public string footerUserName { get; set; }
        public List<Order> reportDetails { get; set; }
    }

    // model extensions
    public partial class Order
    {
        public decimal TotalAmount { get; set; }
    }

    public partial class Order_Detail
    {
        public decimal Amount { get; set; }
    }

    // Main
    class Program
    {
        static void Main(string[] args)
        {
            // variable declaration
            List<Order> orderList = new List<Order>();
            string templateName = @"c:\temp\Orders_template1.docx";
            string generatedDocument = @"c:\temp\Orders_result.docx";

            // reading data from database
            using (var ctx = new NorthwindEntities1())
            {
                orderList = ctx.Orders
                    .Include("Customer")
                    .Include("Order_Details")
                    .Include("Order_Details.Product")
                    .Where(q => q.CustomerID == "BONAP").ToList();
            }

            // collecting data for the report
            ReportData repData = new ReportData();
            repData.headerReportTemplatetName = templateName;
            repData.footerUserName = "<user name comes here>";
            repData.footerDateCreated = DateTime.Now.ToString();
            repData.reportDetails = new List<Order>();

            foreach (var o in orderList)
            {
                Order tempOrder = new Order();
                tempOrder.Customer = new Customer();
                tempOrder.OrderID = o.OrderID;
                tempOrder.Customer.CompanyName = o.Customer.CompanyName;
                tempOrder.Customer.Address = o.Customer.Address;
                tempOrder.Customer.City = o.Customer.City;
                tempOrder.Customer.Country = o.Customer.Country;
                tempOrder.OrderDate = o.OrderDate;
                tempOrder.ShippedDate = o.ShippedDate;

                foreach (Order_Detail od in o.Order_Details)
                {
                    Order_Detail tempOrderDetail = new Order_Detail();
                    tempOrderDetail.Product = new Product();
                    tempOrderDetail.OrderID = od.OrderID;
                    tempOrderDetail.ProductID = od.ProductID;
                    tempOrderDetail.Product.ProductName = od.Product.ProductName;
                    tempOrderDetail.UnitPrice = od.UnitPrice;
                    tempOrderDetail.Quantity = od.Quantity;
                    tempOrderDetail.Amount = od.UnitPrice * od.Quantity;
                    tempOrder.TotalAmount = tempOrder.TotalAmount + tempOrderDetail.Amount;
                    tempOrder.Order_Details.Add(tempOrderDetail);
                }

                repData.reportDetails.Add(tempOrder);
            }

            try
            {
                // Word document generation
                DocumentGenerator dg = new DocumentGenerator(repData);
                DocumentGenerationResult result = dg.GenerateDocument(templateName, generatedDocument);

                // start MS Word and show generated document
                ProcessStartInfo startInfo = new ProcessStartInfo();
                startInfo.FileName = "WINWORD.EXE";
                startInfo.Arguments = "\"" + generatedDocument + "\"";
                Process.Start(startInfo);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
                // wait for the input to terminate the application
                Console.WriteLine("Press Enter to exit...");
                Console.ReadLine();
            }
        }
    }
}

Creating seed model from data already in DB

Is there a way to convert data in an existing database into objects that can easily be put into a seed method?
Essentially, I'd like to let some people add rows to my DB via a form and then convert those rows into objects, so that I can re-seed the DB any time I need to make changes.
The database itself is created via code-first using the following model:
public class Combo
{
    public int Id { get; set; }
    public string MainPrefix { get; set; }
    public string MainDescriptor { get; set; }
    public string MainDish { get; set; }
    public string Connector { get; set; }
    public string SecondaryDescriptor { get; set; }
    public string SecondaryDish { get; set; }
}
High-level, untested idea:
Create a new T4 template that uses either your context or direct SQL to query the database and generates either a C# static class with a method (which returns a collection of objects created from the database data) or SQL insert commands. Depending on what you choose, you will either call the static method in Seed to get all the entities you want to insert, or simply load the generated SQL script and execute it.
You can also do the same without T4 by using CodeDOM (if you want to generate C# code) and creating a new custom tool for Visual Studio.
These features can be part of your project, or the result (C# code) can be compiled into a separate external assembly.
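The same idea can be sketched without T4 or CodeDOM, using plain string building. This is only an illustration under assumptions: SeedCodeGenerator, its methods, and the generated ComboSeedData class are hypothetical names; Id is left out on the assumption that the database generates it; and the escaping only handles backslashes and double quotes, not line breaks or other control characters.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public class Combo
{
    public int Id { get; set; }
    public string MainPrefix { get; set; }
    public string MainDescriptor { get; set; }
    public string MainDish { get; set; }
    public string Connector { get; set; }
    public string SecondaryDescriptor { get; set; }
    public string SecondaryDish { get; set; }
}

public static class SeedCodeGenerator
{
    // Renders a value as a C# string literal, or as the keyword null.
    private static string Literal(string value)
    {
        if (value == null) return "null";
        return "\"" + value.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";
    }

    // Produces C# source for a static class whose GetAll() method returns
    // the given rows, ready to be called from a Seed method.
    public static string Generate(IEnumerable<Combo> rows)
    {
        var sb = new StringBuilder();
        sb.AppendLine("public static class ComboSeedData");
        sb.AppendLine("{");
        sb.AppendLine("    public static Combo[] GetAll()");
        sb.AppendLine("    {");
        sb.AppendLine("        return new Combo[]");
        sb.AppendLine("        {");
        foreach (var c in rows)
        {
            sb.AppendLine("            new Combo { " +
                "MainPrefix = " + Literal(c.MainPrefix) + ", " +
                "MainDescriptor = " + Literal(c.MainDescriptor) + ", " +
                "MainDish = " + Literal(c.MainDish) + ", " +
                "Connector = " + Literal(c.Connector) + ", " +
                "SecondaryDescriptor = " + Literal(c.SecondaryDescriptor) + ", " +
                "SecondaryDish = " + Literal(c.SecondaryDish) + " },");
        }
        sb.AppendLine("        };");
        sb.AppendLine("    }");
        sb.AppendLine("}");
        return sb.ToString();
    }
}
```

You would load the existing rows through your context, pass them to Generate, and save the returned text as a .cs file in the project.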
