I am trying to insert a .csv file into SQL Server 2008 R2.
The .csv is 300+MB from http://ipinfodb.com/ip_database.php Complete
(City), 4.0M records.
Here are the first five lines; the first line contains the column headers:
"ip_start";"country_code";"country_name";"region_code";"region_name";"city";"zipcode";"latitude";"longitude";"metrocode"
"0";"RD";"Reserved";;;;;"0";"0";
"16777216";"AU";"Australia";;;;;"-27";"133";
"17367040";"MY";"Malaysia";;;;;"2.5";"112.5";
"17435136";"AU";"Australia";;;;;"-27";"133";
I tried Import and Export Data and BULK INSERT, but haven't been able to import the data correctly yet.
Should I resort to using bcp? Can it handle stripping the double quotes, and if so, how?
Thank you very much.
EDIT: Got it; I forgot to set the Text Qualifier to the double quote (").
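For reference, stripping the quote qualifier by hand is straightforward if you ever need to pre-process such a file yourself. This is not the wizard's mechanism, just a hand-rolled sketch that splits one of the sample rows above on the semicolon delimiter and trims the quotes:

```csharp
using System;
using System.Linq;

class QuoteStripDemo
{
    static void Main()
    {
        // Second line of the sample file above.
        string row = "\"16777216\";\"AU\";\"Australia\";;;;;\"-27\";\"133\";";

        // Split on the semicolon delimiter and strip the quote qualifier.
        // This naive split is only safe because the data itself contains
        // no embedded semicolons.
        string[] fields = row.Split(';').Select(f => f.Trim('"')).ToArray();

        Console.WriteLine(fields[0]);     // 16777216
        Console.WriteLine(fields[1]);     // AU
        Console.WriteLine(fields.Length); // 10 (the trailing ; delimits an empty metrocode)
    }
}
```

Note that the trailing semicolon on each data row delimits an empty tenth field (metrocode), matching the ten column headers.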
Your data looks inconsistent, since NULL values are not enclosed in quotation marks.
I believe you can create a format file in SQL Server to match your particular CSV file and its particular terminators.
See more here:
http://lanestechblog.blogspot.com/2008/08/sql-server-bulk-insert-using-format.html
Is this a single import, or do you want to schedule a recurring import? If it's a one-time task, you should be able to use the Import and Export Wizard. The text qualifier will be the quotation mark ("); be sure to select column names in the first data row, and specify the semicolon (;) as the field delimiter.
I'm not certain the file is properly formatted - the last semicolon following each of the data rows might be a problem. If you hit any errors, simply add a new column header to the file.
EDIT: I just did a quick test, the semicolons at the end will be treated as part of the final value in that row. I would suggest adding a ;"tempheader" at the end of your header (first) row - that will cause SQL to treat the final semicolon as a delimiter and you can delete that extra column once the import is complete.
In C# you can use the following code; it works for me:
public bool CSVFileRead(string fullPathWithFileName, string fileNameModified, string tableName)
{
    var dt = new DataTable();
    using (var sr = new StreamReader(fullPathWithFileName))
    {
        // The first line holds the column headers.
        string line = sr.ReadLine();
        if (line == null) return false;
        foreach (string dc in line.Split(','))
        {
            dt.Columns.Add(new DataColumn(dc.Trim('"')));
        }
        while (!sr.EndOfStream)
        {
            string[] value = sr.ReadLine().Split(',');
            // Strip the surrounding quotation marks, if any.
            for (int i = 0; i < value.Length; i++)
            {
                value[i] = value[i].Trim('"');
            }
            // Skip malformed rows.
            if (value.Length == dt.Columns.Count)
            {
                DataRow row = dt.NewRow();
                row.ItemArray = value;
                dt.Rows.Add(row);
            }
        }
    }
    string connectionString = ConfigurationManager.AppSettings["dbConnectionString"];
    using (var bc = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.TableLock))
    {
        bc.DestinationTableName = tableName;
        bc.BatchSize = dt.Rows.Count;
        bc.WriteToServer(dt);
    }
    return true;
}
Related
I have an application that tries to extract all the data from the different tables in one database. First, I stored all the queries in a .txt file to retrieve the table names and stored them in a List.
[Here's my .txt file]
string script = File.ReadAllText(@"D:\Schooldb\School.txt");
List<string> strings = new List<string>();
strings.Add(script);

using (SqlConnection connection = new SqlConnection(constring))
{
    foreach (string x in strings)
    {
        using (SqlCommand cmd = new SqlCommand(x, connection))
        {
            using (SqlDataAdapter adapter = new SqlDataAdapter())
            {
                cmd.Connection = connection;
                adapter.SelectCommand = cmd;
                using (DataTable dt = new DataTable())
                {
                    adapter.Fill(dt);
                    string txt = string.Empty;
                    foreach (DataColumn column in dt.Columns)
                    {
                        //Add the header row for the text file.
                        txt += column.ColumnName + "\t\t";
                    }
                    //Add a new line after the column names.
                    txt += "\r\n";
                    foreach (DataRow row in dt.Rows)
                    {
                        foreach (DataColumn column in dt.Columns)
                        {
                            //Add the data rows.
                            txt += row[column.ColumnName].ToString() + "***";
                        }
                        //Add a new line.
                        txt += "\r\n";
                    }
                    int y = 0;
                    StreamWriter file = new StreamWriter($@"D:\SchoolOutput\{x}_{DateTime.Now.ToString("yyyyMMdd")}.txt");
                    file.WriteLine(txt.ToString());
                    file.Close();
                    y++;
                }
            }
        }
    }
}
Expected:
teachers_datetoday
students_datetoday
subjects_datetoday
But in reality my output is just:
datetoday txt
Can someone tell me which part I got wrong?
Thanks in advance!
There are other approaches for extracting data directly using SSMS.
In this case, your code reads the entire text file as a single string, so the loop runs only once.
Instead of reading the whole file into one string, you can treat each line as one command and read the commands like the following:
foreach (string line in System.IO.File.ReadLines(@"D:\Schooldb\School.txt"))
{
//Each line contains one command
//Write your logic here
}
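As a hedged sketch of how that might look end to end: the table-name extraction below assumes each line of the file has the form `SELECT * FROM <table>` (adjust it to match the real queries), and the derived name is what you would use when building the output file path instead of the whole script `x`:

```csharp
using System;

class TableNameDemo
{
    // Derive the output file prefix from a query of the assumed
    // form "SELECT * FROM <table>" by taking the last token.
    static string TableNameOf(string query)
    {
        int idx = query.LastIndexOf(' ');
        return query.Substring(idx + 1).Trim();
    }

    static void Main()
    {
        // Each line of School.txt would be one query.
        string[] lines =
        {
            "SELECT * FROM teachers",
            "SELECT * FROM students",
            "SELECT * FROM subjects"
        };
        foreach (string line in lines)
        {
            // In the real code, run the query here and write the result to
            // $@"D:\SchoolOutput\{TableNameOf(line)}_{DateTime.Now:yyyyMMdd}.txt"
            Console.WriteLine(TableNameOf(line) + "_datetoday");
        }
    }
}
```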
I have a WinForm where I retrieve data from a SQL DB into a DataGridView and then export it to an XML file. The data returned from the DB has column names (headers) in English; however, in my form they are in a different language, which I would like included in the exported XML file. For this purpose I iterate through the DataGridView and assign its column header texts to the DataTable, which is then exported to the XML file.
This is my code:
private void ExportToXml(DataGridView repGrid)
{
    try
    {
        System.Data.DataTable data = (System.Data.DataTable)repGrid.DataSource;
        data.TableName = "LoanPortfolioReport";
        for (int i = 0; i < repGrid.ColumnCount; i++)
        {
            data.Columns[i].ColumnName = repGrid.Columns[i].HeaderText;
        }
        //export file:
        SaveFileDialog.Filter = "XML|*.xml";
        SaveFileDialog.FilterIndex = 1;
        SaveFileDialog.FileName = "LoanPortfolio_" + loanPortfolioDatePicker.Value.ToString("yyyyMMdd") + ".xml";
        if (SaveFileDialog.ShowDialog() == DialogResult.OK)
        {
            data.WriteXml(SaveFileDialog.FileName);
        }
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
    }
}
This is my datagridview before export:
And this happens after I iterate through column headers:
If I omit the column header iteration part, the DataGridView doesn't end up empty. I would appreciate suggestions regarding this behavior.
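One hedged guess at the cause: the grid's columns are bound to the original English column names, so renaming the columns of the live DataSource breaks those bindings and the grid goes blank. A minimal, self-contained sketch of the workaround, renaming a copy of the table instead of the bound one:

```csharp
using System;
using System.Data;

class CopyRenameDemo
{
    static void Main()
    {
        // Original table, standing in for the grid's DataSource.
        var data = new DataTable("LoanPortfolioReport");
        data.Columns.Add("LoanId");
        data.Rows.Add("L-001");

        // Rename columns on a copy, not on the bound table itself;
        // the copy is what you would pass to WriteXml.
        DataTable export = data.Copy();
        export.Columns[0].ColumnName = "Kredit"; // hypothetical localized header

        // The original (bound) table keeps its column name,
        // so the DataGridView's bindings remain valid.
        Console.WriteLine(data.Columns[0].ColumnName);   // LoanId
        Console.WriteLine(export.Columns[0].ColumnName); // Kredit
    }
}
```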
As a beginner I am struggling to understand C# & Npgsql. I am following these code examples:
// Insert some data
using (var cmd = new NpgsqlCommand())
{
    cmd.Connection = conn;
    cmd.CommandText = "INSERT INTO data (some_field) VALUES (@p)";
    cmd.Parameters.AddWithValue("p", "Hello world");
    cmd.ExecuteNonQuery();
}
The syntax for batching more than one INSERT or UPDATE statement like this is clear so far:
cmd.CommandText = "INSERT INTO data (some_field) VALUES (@p);INSERT INTO data1...;INSERT INTO data2... and so on";
But what is the right solution for a loop that should run one statement repeatedly?
This does not work:
// Insert some data
using (var cmd = new NpgsqlCommand())
{
    foreach (var s in SomeStringCollectionOrWhatever)
    {
        cmd.Connection = conn;
        cmd.CommandText = "INSERT INTO data (some_field) VALUES (@p)";
        cmd.Parameters.AddWithValue("p", s);
        cmd.ExecuteNonQuery();
    }
}
It seems the parameter values get "concatenated" or remembered, and I cannot see any way to "clear" the existing cmd object.
My second option would be to wrap the whole using block inside the loop, but then every cycle creates a new object, which seems ugly to me.
So what is the best solution for my problem?
To insert lots of rows efficiently, take a look at Npgsql's bulk copy feature - the API is more suitable (and more efficient) for inserting large numbers of rows than concatenating INSERT statements into a batch like you're trying to do.
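As a minimal sketch of what that bulk copy API looks like (using the hypothetical `data (some_field)` table from the question; the connection string is a placeholder):

```csharp
using System;
using System.Collections.Generic;
using Npgsql;

class BulkCopyDemo
{
    // COPY command for the question's hypothetical table and column.
    const string CopyCommand = "COPY data (some_field) FROM STDIN (FORMAT BINARY)";

    // Binary COPY streams all rows in one operation, far faster
    // than issuing one INSERT per row.
    static void BulkInsert(NpgsqlConnection conn, IEnumerable<string> values)
    {
        using (var writer = conn.BeginBinaryImport(CopyCommand))
        {
            foreach (var s in values)
            {
                writer.StartRow();
                writer.Write(s);
            }
            writer.Complete(); // commits the import
        }
    }

    static void Main()
    {
        Console.WriteLine(CopyCommand);
        // With a reachable server one would call:
        // using (var conn = new NpgsqlConnection("Host=localhost;Database=mydb;Username=me"))
        // {
        //     conn.Open();
        //     BulkInsert(conn, new[] { "Hello", "world" });
        // }
    }
}
```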
If you want to rerun the same SQL with changing parameter values, you can do the following:
using (var cmd = new NpgsqlCommand("INSERT INTO data (some_field) VALUES (@p)", conn))
{
var p = new NpgsqlParameter("p", DbType.String); // Adjust DbType according to type
cmd.Parameters.Add(p);
cmd.Prepare(); // This is optional but will optimize the statement for repeated use
foreach(var s in SomeStringCollectionOrWhatever)
{
p.Value = s;
cmd.ExecuteNonQuery();
}
}
If you need lots of rows and performance is key, then I would recommend Npgsql's bulk copy capability, as @Shay mentioned. But if you are looking for a quick way to do this without bulk copy, I would recommend using Dapper.
Consider the example below.
Let's say you have a class called Event and a list of events to add:
List<Event> eventsToInsert = new List<Event>
{
new Event() { EventId = 1, EventName = "Bday1" },
new Event() { EventId = 2, EventName = "Bday2" },
new Event() { EventId = 3, EventName = "Bday3" }
};
The snippet that would add the list to the DB shown below.
var sqlInsert = "Insert into events( eventid, eventname ) values (@EventId, @EventName)";
using (IDbConnection conn = new NpgsqlConnection(cs))
{
conn.Open();
// Execute is an extension method supplied by Dapper
// This code will add all the entries in the eventsToInsert List and match up the values based on property name. Only caveat is that the property names of the POCO should match the placeholder names in the SQL Statement.
conn.Execute(sqlInsert, eventsToInsert);
// If we want to retrieve the data back into the list
List<Event> eventsAdded;
// This Dapper extension returns an IEnumerable, so I cast it to a List.
eventsAdded = conn.Query<Event>("Select * from events").ToList();
foreach( var row in eventsAdded)
{
Console.WriteLine($"{row.EventId} {row.EventName} was added");
}
}
-HTH
I changed the name of one of my tables; afterwards I encoded some data and then pulled it using a view, and to my surprise the data was not showing. I tried renaming the table back to its original name, with no luck: the same thing kept happening.
Then I finally tried retyping the data in one of the columns and executed the view, and the data finally showed. But now the problem arises that I need to re-encode the data in one of the columns every time a row is inserted, which is obviously not a good thing to do.
Here is the code showing how I added the data:
tblcsv.Columns.AddRange(new DataColumn[7] {
    new DataColumn("unit_name", typeof(string)),
    new DataColumn("unit", typeof(string)),
    new DataColumn("adrress", typeof(string)),
    new DataColumn("latitude", typeof(string)),
    new DataColumn("longitude", typeof(string)),
    new DataColumn("region", typeof(string)),
    new DataColumn("linkid", typeof(string))
});

string ReadCSV = File.ReadAllText(forex);
foreach (string csvRow in ReadCSV.Split('\n'))
{
    if (!string.IsNullOrEmpty(csvRow))
    {
        //Adding each row into the datatable
        tblcsv.Rows.Add();
        int count = 0;
        foreach (string FileRec in csvRow.Split(','))
        {
            tblcsv.Rows[tblcsv.Rows.Count - 1][count] = FileRec;
            if (count == 5)
            {
                tblcsv.Rows[tblcsv.Rows.Count - 1][6] = link;
            }
            count++;
        }
    }
}

string consString = ConfigurationManager.ConnectionStrings["diposlConnectionString"].ConnectionString;
using (SqlConnection con = new SqlConnection(consString))
{
    using (SqlBulkCopy sqlBulkCopy = new SqlBulkCopy(con))
    {
        //Set the database table name
        sqlBulkCopy.DestinationTableName = "dbo.FRIENDLY_FORCES";
        //[OPTIONAL]: Map the Excel columns with those of the database table
        sqlBulkCopy.ColumnMappings.Add("unit_name", "unit_name");
        sqlBulkCopy.ColumnMappings.Add("unit", "unit");
        sqlBulkCopy.ColumnMappings.Add("adrress", "adrress");
        sqlBulkCopy.ColumnMappings.Add("latitude", "latitude");
        sqlBulkCopy.ColumnMappings.Add("longitude", "longitude");
        sqlBulkCopy.ColumnMappings.Add("region", "region");
        sqlBulkCopy.ColumnMappings.Add("linkid", "linkid");
        con.Open();
        sqlBulkCopy.WriteToServer(tblcsv);
        con.Close();
    }
}
The region column is the one where I manually edited the data.
Did renaming the table do something to my data?
Or am I just missing something?
Thank you
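One thing worth checking, as a guess unrelated to the rename: `File.ReadAllText` followed by `Split('\n')` leaves a trailing `'\r'` on the last field of every row when the file has Windows line endings, and here the last field on each row is region, exactly the column that only behaves after being retyped. The stray character is invisible but makes equality comparisons in a view fail. A minimal sketch of the difference (the sample values are made up):

```csharp
using System;

class LineEndingDemo
{
    static void Main()
    {
        // Two rows with Windows (\r\n) line endings, as ReadAllText returns them.
        string csv = "Alpha,NCR\r\nBravo,NCR\r\n";

        foreach (string row in csv.Split('\n'))
        {
            if (string.IsNullOrEmpty(row)) continue;
            string[] fields = row.Split(',');
            string region = fields[1];
            // The last field still carries the carriage return.
            Console.WriteLine(region.Length);               // 4, not 3
            Console.WriteLine(region.TrimEnd('\r').Length); // 3 after trimming
        }
    }
}
```

Splitting on `"\r\n"` (or trimming each field with `TrimEnd('\r')` before adding it to the DataTable) would avoid the problem.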
I'm reading an .xlsx file as a DB to do some work.
I noticed that it reads some fields in as int and date even though I just want everything to come in as text. Is there any way to override this behavior?
Code below
(feel free to point out anything I could be doing better with my code as well):
private DataSet ExceltoDT(OpenFileDialog dialog)
{
try
{
string connst = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + dialog.FileName + ";Extended Properties=\"Excel 12.0 Xml;HDR=NO\";";
string sheet = "Sheet1$";
string strSQL = "SELECT * FROM [" + sheet + "]";
//string Table = "081710";
OleDbConnection xlsdb = new OleDbConnection(connst);
xlsdb.Open();
OleDbDataAdapter adp = new OleDbDataAdapter(strSQL, xlsdb);
DataSet ds2 = new DataSet();
adp.Fill(ds2);
adp.Dispose();
xlsdb.Close();
xlsdb.Dispose();
return ds2;
}
catch (StackOverflowException stack_ex2)
{
MessageBox.Show("(2007 Excel file) Stack Overflowed!" + "\n" + stack_ex2.Message);
return null;
}
catch (OleDbException ex_oledb2)
{
MessageBox.Show("An OleDb Error Thrown!" + "\n" + ex_oledb2.Message);
return null;
}
}
Add a ' (apostrophe) in front of every cell value. That tells Excel "treat this as text even when it looks like a number/date/whatever".
Not what you want? Then don't use the DB connector, because it's badly broken. You'll notice this when you have a column with mixed cell types: the driver looks at the first 8 rows, sets the column type to the majority of the types it finds there, and returns NULL for anything in that column that doesn't fit. You can change that sampling by hacking your registry (the TypeGuessRows setting).
Instead, use the OLE API to open the workbook and start from there, reading row by row and converting the data as you need (this long list of posts should contain about every possible way to access Excel from C#, plus all the bugs and problems you can encounter).
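If you do stay with the OLEDB route despite that, one commonly used mitigation is adding `IMEX=1` to the extended properties, which asks the driver to treat mixed-type columns as text (it does not fully override the sampling behavior, hence the registry caveat above). A sketch of the adjusted connection string from the question, with a placeholder path standing in for `dialog.FileName`:

```csharp
using System;

class ConnStringDemo
{
    static void Main()
    {
        string fileName = @"C:\data\book.xlsx"; // placeholder for dialog.FileName

        // IMEX=1 tells the ACE driver to read mixed-type columns as text.
        string connst = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName +
                        ";Extended Properties=\"Excel 12.0 Xml;HDR=NO;IMEX=1\";";

        Console.WriteLine(connst.Contains("IMEX=1")); // True
    }
}
```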