Is there a way to read and write to sqlite3 from octave?
I'm thinking something along the lines of RODBC in R or the sqlite3 package in python, but for octave.
I looked on Octave-Forge (http://octave.sourceforge.net/packages.php) but could only find the 'database' package, which only supports PostgreSQL.
Details:
OS: Ubuntu 12.04
Octave: 3.6.2
sqlite: 3.7.9
I realise this is an old question, but most answers here seem to miss the point, focusing on whether a bespoke Octave package with a formal interface exists rather than on whether it is possible to perform sqlite3 queries from within Octave at all.
Therefore I thought I'd provide a practical answer for anyone simply trying to access sqlite3 via Octave; it is in fact trivial to do so, and I have done so myself many times.
Simply do an appropriate system call to the sqlite3 command (obviously this implies you have an sqlite3 client installed on your system). I find the most convenient way to do so is to use the
sqlite3 database.sqlite < FileContainingQuery > OutputToFile
syntax for calling sqlite3.
Any sqlite3 commands modifying output can be passed together with the query to obtain the output in the desired format.
E.g. here's a toy example plotting a frequency chart from a query that returns scores and counts in CSV format (with headers and runtime stats stripped from the output).
pkg load io % required for csv2cell (used to collect results)
% Define database and Query
Database = '/absolute/path/to/database.sqlite';
Query = strcat(
% Options to sqlite3 modifying output format:
".timer off \n", % Prevents runtime stats printed at end of query
".headers off \n", % If you just want the output without headers
".mode csv \n", % Export as csv; use csv2cell to collect results
% actual query
"SELECT Scores, Counts \n",
"FROM Data; \n" % (Don't forget the semicolon!)
);
% Create temporary files to hold query and results
QueryFile = tempname() ; QueryFId = fopen( QueryFile, 'w' );
fprintf( QueryFId, '%s', Query ); fclose( QueryFId );   % '%s' guards against any '%' in the query being treated as a format spec
ResultsFile = tempname();
% Run query
Cmd = sprintf( 'sqlite3 "%s" < "%s" > "%s"', Database, QueryFile, ResultsFile );
[Status, Output] = system( Cmd );
% Confirm query succeeded and if so collect Results
% in a cell array and clean up temp files.
if Status != 0, delete( QueryFile, ResultsFile ); error("Query Failed");
else, Results = csv2cell( ResultsFile ); delete( QueryFile, ResultsFile );
end
% Process Results
Results = cell2mat( Results );
Scores = Results(:, 1); Counts = Results(:, 2);
BarChart = bar( Scores, Counts, 0.7 ); % ... etc
Et voilà!
According to Octave-Forge the answer is no.
Interface to SQL databases, currently only postgresql using libpq.
But you could write your own database bindings using the Octave C++ API together with the SQLite C API.
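For anyone curious what that might look like, here is a minimal sketch of such a binding as a hand-rolled .oct file wrapping the SQLite C API (the function name sqlite_query is made up for this example and is not part of any existing package). Compile it with: mkoctfile sqlite_query.cc -lsqlite3
#include <list>
#include <string>
#include <octave/oct.h>
#include <octave/Cell.h>
#include <sqlite3.h>
// sqlite3_exec() calls this once per result row; collect the first column.
static int
collect_row (void *data, int ncols, char **vals, char **)
{
  std::list<std::string> *rows = static_cast<std::list<std::string> *> (data);
  rows->push_back ((ncols > 0 && vals[0]) ? vals[0] : "");
  return 0;
}
DEFUN_DLD (sqlite_query, args, ,
           "sqlite_query (DBFILE, SQL): run SQL, return the first result column as a cell array")
{
  octave_value_list retval;
  if (args.length () != 2)
    {
      print_usage ();
      return retval;
    }
  std::string dbfile = args(0).string_value ();
  std::string sql    = args(1).string_value ();
  sqlite3 *db = 0;
  if (sqlite3_open (dbfile.c_str (), &db) != SQLITE_OK)
    {
      std::string msg = sqlite3_errmsg (db);
      sqlite3_close (db);
      error ("sqlite_query: cannot open %s: %s", dbfile.c_str (), msg.c_str ());
      return retval;
    }
  std::list<std::string> rows;
  char *errmsg = 0;
  if (sqlite3_exec (db, sql.c_str (), collect_row, &rows, &errmsg) != SQLITE_OK)
    {
      std::string msg = errmsg ? errmsg : "unknown error";
      sqlite3_free (errmsg);
      sqlite3_close (db);
      error ("sqlite_query: %s", msg.c_str ());
      return retval;
    }
  sqlite3_close (db);
  // Pack the collected rows into an Octave cell array.
  Cell result (rows.size (), 1);
  octave_idx_type i = 0;
  for (std::list<std::string>::const_iterator it = rows.begin (); it != rows.end (); ++it)
    result(i++) = *it;
  retval(0) = result;
  return retval;
}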
As you already found out, the new version of the database package (2.0.0) only supports PostgreSQL. However, old versions of the package also supported MySQL and SQLite (the last version with them was 1.0.4).
The problem is that those old database packages do not work with the new Octave and SWIG versions (I think the last version of Octave where the database package worked was 3.2.4). Aside from the lack of a maintainer (the package was abandoned for almost 4 years), its use of SWIG was becoming a problem since it made it more difficult for other developers to step in. Still, some users tried to fix it, and some partial fixes were made (but never released). See bug #38098 and Octave's wiki page on the database package for some reports on making it work with SQLite in Octave 3.6.2.
The new version of the package is a complete restart. It would be great if you could contribute SQLite bindings to its development.
Check out this link http://octave.1599824.n4.nabble.com/Octave-and-databases-td2402806.html which asks the same question regarding MySQL.
In particular, this reply from Martin Helm points the way to using JDBC to connect to any JDBC-supported database:
"Look at the java bindings in the octave java package (octave-forge), it is
maintained and it works. Java is very strong and easy for database handling.
Use that and jdbc driver for mysql to connect to mysql (or with the
appropriate jdbc friver everything else which you can imagine). That is what I
do when using db queries from octave. Much easier and less indirect than
invoking scripts and parsing output from databse queries.
As far as I remeber the database package is somehow broken (at least I never
was able to use it). "
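For example, here is a minimal sketch of that JDBC approach applied to the SQLite case from the original question. It assumes the java package is loaded and that a SQLite JDBC driver jar (e.g. Xerial's sqlite-jdbc) is on the Java classpath; the jar path, database path and query below are placeholders:
pkg load java                                    % Octave-Forge java package (Java support is built in from Octave 3.8 on)
javaaddpath ('/path/to/sqlite-jdbc.jar');        % placeholder path to the JDBC driver jar
% Older drivers may need explicit registration first:
% javaMethod ('forName', 'java.lang.Class', 'org.sqlite.JDBC');
conn = javaMethod ('getConnection', 'java.sql.DriverManager', ...
                   'jdbc:sqlite:/absolute/path/to/database.sqlite');
stmt = conn.createStatement ();
rs   = stmt.executeQuery ('SELECT Scores, Counts FROM Data');   % placeholder query
Scores = []; Counts = [];
while rs.next ()                                 % iterate over the JDBC result set
  Scores(end+1, 1) = rs.getDouble (1);
  Counts(end+1, 1) = rs.getDouble (2);
end
rs.close (); stmt.close (); conn.close ();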
I know this thread is pretty old, but for anybody else out there looking for a similar solution, this project seems to provide it.
https://github.com/markuman/go-sqlite
Related
I am currently using a MongoDB v3.4 in a production environment and have to insert data in bulk from another MongoDB of the same version. I am currently achieving this by using a tool which tails the oplog of the primary DB and replicates those ops in my destination DB. However, sometimes I get an error like "Write batch sizes must be between 1 and 1000. Got 2000" and the process stops. I did some digging and found the maxWriteBatchSize limit is 1000 in MongoDB v<3.5. Can I bypass or change this limit?
That limit is hardcoded at https://github.com/mongodb/mongo/blob/r3.4.24/src/mongo/s/write_ops/batched_command_request.cpp#L43
const size_t BatchedCommandRequest::kMaxWriteBatchSize = 1000;
In order to change it, you would have to recompile.
I am setting up a SQL Azure database. I need to write data into the database on a daily basis. I am using 64-bit R version 3.3.3 on Windows 10. Some of the columns contain text (more than 4000 characters). Initially, I imported some data from a csv into the SQL Azure database using Microsoft SQL Server Management Studio. I set up the text columns as ntext format, because when I tried using nvarchar the max was 4000 and some of the values got truncated even though they were only about 1100 characters long.
In order to append to the database, I first save the records in a temp table where I have predefined the varTypes:
varTypesNewFile <- c("Numeric", rep("NTEXT", ncol(newFileToAppend) - 1))
names(varTypesNewFile) <- names(newFileToAppend)
sqlSave(dbhandle, newFileToAppend, "newFileToAppendTmp", rownames = F, varTypes = varTypesNewFile, safer = F)
and then append them by using:
insert into mainTable select * from newFileToAppendTmp
If the text is not too long, the above does work. However, sometimes I get the following error during the sqlSave command:
Error in odbcUpdate(channel, query, mydata, coldata[m, ], test = test, :
'Calloc' could not allocate memory (1073741824 of 1 bytes)
My questions are:
How can I counter this issue?
Is this the format I should be using?
Additionally, even when the above works, it takes about an hour to upload about 5k records. Isn't that too long? Is this the normal amount of time it should take? If not, what could I do better?
RODBC is very old, and can be a bit flaky with NVARCHAR columns. Try using the RSQLServer package instead, which offers an alternative means to connect to SQL Server (and also provides a dplyr backend).
I like the idea of Sqlite but I'm more comfortable with PostgreSQL, Mysql, even MS Access or Oracle.
I've got something written by someone else which generates Sqlite databases that include a date/time field and I want to get those into a format that Gnuplot can understand. Both Sqliteman and Sqlite browser show the field is an integer, and it looks like a unix time_t when I just query it, except it's 3 digits longer, like 1444136564028.
It doesn't have to be done by piping sqlite3 into Gnuplot, and it doesn't have to use the unixepoch/%s time format. I just can't find any examples of converting the Sqlite time fields in a query. One example "SELECT strftime('%s','now')" works, but when I replace now with a field in a real query it doesn't work. All the examples I find seem to use immediate/literal values, not fields from queries.
And can Sqlite use a tablename.fieldname format or does it have to be select fieldname from tablename?
Unix timestamps use seconds; using milliseconds is a common Java quirk:
WITH MyLittleTable(d) AS (VALUES(1444136564028))
SELECT datetime(d / 1000, 'unixepoch') FROM MyLittleTable;
2015-10-06 13:02:44
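To address the other sub-question: SQLite does accept tablename.fieldname (or an alias) in a query, so the same conversion can be applied directly to a column from a real table. The table and column names below are placeholders:
-- 'log' and 'ts_ms' stand in for your table and millisecond timestamp column
SELECT datetime(log.ts_ms / 1000, 'unixepoch') AS ts, log.value
FROM log;
-- or, for a custom format via strftime:
SELECT strftime('%Y-%m-%d %H:%M:%S', log.ts_ms / 1000, 'unixepoch') FROM log;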
All,
I am currently using some systems that have an Informix DB on older IBM AIX based systems. I have found myself needing to use the command-line "dbaccess" tool to make some quick queries. Informix has this really annoying habit of returning output in this format:
employee -1
record_desc Update
field_id 2
value
opr_activity_date 20150831
opr_activity_time 1
employee -1
record_desc Update
field_id 2
value
opr_activity_date 20150831
opr_activity_time 1
employee -1
record_desc Update
field_id 2
value
opr_activity_date 20150831
opr_activity_time 1
MySQL, MSSQL, etc. all output something more readable, in table format:
city state zipcode
Sunnyvale CA 94086
San Francisco CA 94117
Palo Alto CA 94303
Redwood City CA 94026
Los Altos CA 94022
Mountain View CA 94063
Palo Alto CA 94304
Redwood City CA 94063
I noticed that Informix will/can output in a column/table format, but I have not figured out any rhyme or reason as to how it decides the flat versus the table format.
Any idea how I can force Informix to always display in column/table output via the command line?
Obviously, this is not an issue when I am near my computer and can use my GUI tool to query the DB...
Unfortunately, there's no way to control this behaviour in DB-Access.
If the width of the selected columns (plus a little white space) exceeds the width of the terminal, DB-Access switches to that block format, because it doesn't support sideways scrolling. That's the rhyme and reason.
You can try messing around with your terminal settings so that DB-Access is aware on start-up that the terminal width is wider than 80 characters, but I've always found there's more luck than science to that, and you'll still trigger the behaviour on some queries and not others.
When I need to do what you're describing - ad hoc, simple queries for troubleshooting etc - I tend to work within VIM rather than DB-Access, and use a macro to run the query and format the output. (This is using DBI::Shell behind the scenes.) I've also got a program that accepts either a table name or SQL statement and outputs tab-delimited, CSV or an old-school ASCII character formatted table of the results. This is also perl based. I could publish either of these if there's interest in them.
I think Jonathan Leffler's SQLCMD program can also be used in place of DB-Access to generate arbitrarily wide output.
OK.
While I found the answers RET provided to be correct, and they pretty much sum up what I have been able to find on the net, I also found some workarounds that get you what you want, albeit in a kludgy way! Thanks, Informix! :(
Open two terminal windows to your DB system. In the first, launch dbaccess, authenticate, and connect to your database.
Next, perform the following:
unload to /home/(user)/out ...the query...
Example:
unload to /home/jewettg/out select * from books_checked_in;
It will output the query results to the file and return the row count of the result.
On the second terminal, and here is the cool thing, run the following command:
column -t -s '|' /home/(user)/out
This will grab the content of the "out" file, treat the pipe character as the column delimiter, and print the content to the screen as aligned columns.
Like I said, kludgy, but it works!
You can do this by setting the DBACCESS_COLUMNS environment variable. It is supported from version 12.10.xC9.
Example:
export DBACCESS_COLUMNS=1000
What is your recommended way to import .csv files into Microsoft SQL Server 2008 R2?
I'd like something fast, as I have a directory with a lot of .csv files (>500MB spread across 500 .csv files).
I'm using SQL Server 2008 R2 on Win 7 x64.
Update: Solution
Here's how I solved the problem in the end:
I abandoned trying to use LINQ to Entities to do the job. It works, but it doesn't support bulk insert, so it's about 20x slower. Maybe the next version of LINQ to Entities will support this.
I took the advice given in this thread and used bulk insert.
I created a T-SQL stored procedure that uses bulk insert. Data goes into a staging table, is normalized then copied into the target tables.
I mapped the stored procedure into C# using the LINQ to Entities framework (there is a video on www.learnvisualstudio.net showing how to do this).
I wrote all the code to cycle through files, etc in C#.
This method eliminates the biggest bottleneck, which is reading tons of data off the drive and inserting it into the database.
The reason why this method is extremely quick at reading .csv files? Microsoft SQL Server gets to import the files directly from the hard drive straight into the database, using its own highly optimized routines. Most of the other C# based solutions require much more code, and some (like LINQ to Entities) end up having to pipe the data slowly into the database via the C#-to-SQL-server link.
Yes, I know it'd be nicer to have 100% pure C# code to do the job, but in the end:
(a) For this particular problem, using T-SQL requires much less code compared to C#, about 1/10th, especially for the logic to denormalize the data from the staging table. This is simpler and more maintainable.
(b) Using T-SQL means you can take advantage of the native bulk insert procedures, which speeds things up from a 20-minute wait to a 30-second pause.
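For illustration, a rough sketch of what such a staging procedure can look like; all object and column names here are hypothetical, and the real procedure also normalizes the data between the staging and target tables:
CREATE PROCEDURE dbo.ImportCsvFile @FilePath NVARCHAR(260)
AS
BEGIN
    TRUNCATE TABLE dbo.Staging;
    -- BULK INSERT reads the file server-side; the path cannot be a variable,
    -- hence the dynamic SQL
    DECLARE @sql NVARCHAR(MAX) =
        N'BULK INSERT dbo.Staging FROM ''' + @FilePath + N'''
          WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'', FIRSTROW = 2, TABLOCK);';
    EXEC sp_executesql @sql;
    -- normalize/copy from the staging table into the target table(s)
    INSERT INTO dbo.Target (Col1, Col2)
    SELECT Col1, Col2
    FROM dbo.Staging;
END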
Using BULK INSERT in a T-SQL script seems to be a good solution.
http://blog.sqlauthority.com/2008/02/06/sql-server-import-csv-file-into-sql-server-using-bulk-insert-load-comma-delimited-file-into-sql-server/
You can get the list of files in your directory with xp_cmdshell and the dir command (with a bit of cleanup). In the past, I tried to do something like this with sp_OAMethod and VBScript functions and had to use the dir method because I had trouble getting the list of files with the FSO object.
http://www.sqlusa.com/bestpractices2008/list-files-in-directory/
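A hedged sketch of that file-listing step and a per-file loop (the directory, table and column names are placeholders, and xp_cmdshell must be enabled on the server):
-- Collect the bare file names; xp_cmdshell appends a NULL row, which we discard.
CREATE TABLE #CsvFiles (FileName NVARCHAR(260));
INSERT INTO #CsvFiles
EXEC xp_cmdshell 'dir /b "C:\csv\*.csv"';
DELETE FROM #CsvFiles WHERE FileName IS NULL;
-- Loop over the files and BULK INSERT each one into a staging table.
DECLARE @f NVARCHAR(260), @sql NVARCHAR(MAX);
DECLARE files CURSOR FOR SELECT FileName FROM #CsvFiles;
OPEN files;
FETCH NEXT FROM files INTO @f;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'BULK INSERT dbo.Staging FROM ''C:\csv\' + @f + N'''
                WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'', FIRSTROW = 2);';
    EXEC sp_executesql @sql;
    FETCH NEXT FROM files INTO @f;
END
CLOSE files;
DEALLOCATE files;
DROP TABLE #CsvFiles;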
If you have to do anything with the data in the files other than insert it, then I would recommend using SSIS. It can not only insert and/or update, it can also clean the data for you.
The first officially supported way of importing large text files is the command-line tool "bcp" (Bulk Copy Utility), which is also very useful for huge amounts of binary data.
Please check out this link: http://msdn.microsoft.com/en-us/library/ms162802.aspx
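A typical bcp invocation looks something like this (server, database, table and file names are placeholders; -c selects character mode, -t sets the field terminator, -T uses a trusted connection, and -F 2 skips a header row):
bcp MyDatabase.dbo.MyTable in "C:\csv\data.csv" -c -t, -S MYSERVER -T -F 2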
However, in SQL Server 2008 I presume that the BULK INSERT command would be your choice number one, because, for one thing, it is part of the standard command set. If for any reason you have to maintain compatibility with older versions, I'd stick to the bcp utility, which is available for SQL Server 2000 too.
HTH :)
EDITED LATER: Googling around, I recalled that SQL Server 2000 had the BULK INSERT command too... however, there was obviously some reason I stuck with bcp.exe, and I cannot recall why... perhaps because of some limits, I guess.
I would recommend this:
using System;
using System.Data;
using Microsoft.VisualBasic.FileIO;
namespace ReadDataFromCSVFile
{
static class Program
{
static void Main()
{
string csv_file_path = @"C:\Users\Administrator\Desktop\test.csv";
DataTable csvData = GetDataTabletFromCSVFile(csv_file_path);
Console.WriteLine("Rows count:" + csvData.Rows.Count);
Console.ReadLine();
}
private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
{
DataTable csvData = new DataTable();
try
{
using(TextFieldParser csvReader = new TextFieldParser(csv_file_path))
{
csvReader.SetDelimiters(new string[] { "," });
csvReader.HasFieldsEnclosedInQuotes = true;
string[] colFields = csvReader.ReadFields();
foreach (string column in colFields)
{
DataColumn datecolumn = new DataColumn(column);
datecolumn.AllowDBNull = true;
csvData.Columns.Add(datecolumn);
}
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();
//Making empty value as null
for (int i = 0; i < fieldData.Length; i++)
{
if (fieldData[i] == "")
{
fieldData[i] = null;
}
}
csvData.Rows.Add(fieldData);
}
}
}
catch (Exception ex)
{
}
return csvData;
}
}
}
//Copy the DataTable to SQL Server using SqlBulkCopy
//(requires: using System.Data; and using System.Data.SqlClient;)
private static void InsertDataIntoSQLServerUsingSQLBulkCopy(DataTable csvData)
{
    using (SqlConnection dbConnection = new SqlConnection("Data Source=ProductHost;Initial Catalog=yourDB;Integrated Security=SSPI;"))
    {
        dbConnection.Open();
        using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
        {
            s.DestinationTableName = "Your table name";
            foreach (DataColumn column in csvData.Columns)
                s.ColumnMappings.Add(column.ToString(), column.ToString());
            s.WriteToServer(csvData);
        }
    }
}
If the structure of all your CSVs is the same, I recommend using Integration Services (SSIS) to loop over them and insert all of them into the same table.
I understand this is not exactly your question, but if you get into a situation where you use a straight insert, use TABLOCK and insert multiple rows. It depends on the row size, but I usually go for 600-800 rows at a time. If it is a load into an empty table, then dropping the indexes and recreating them after the load is sometimes faster, as is sorting the data on the clustered index before it is loaded. Use IGNORE_CONSTRAINTS and IGNORE_TRIGGERS if you can. Put the database in single-user mode if you can.
USE AdventureWorks2008R2;
GO
INSERT INTO Production.UnitMeasure with (tablock)
VALUES (N'FT2', N'Square Feet ', '20080923'), (N'Y', N'Yards', '20080923'), (N'Y3', N'Cubic Yards', '20080923');
GO