A long time ago I figured out that bcp is just a little C program that calls a special part of the Sybase client API to do mass data moving into the database. It lies, cheats, and steals, skipping check constraints, all in the name of speed.
Great, I'm all for it.
In Sybase 12 I noticed that the API was exposed in the C client library, but not the Java one.
I've been looking, but I haven't found anything that says they've implemented it in the Sybase 15 Java client library yet.
Does anybody know if this is available or not in Sybase 15?
I disagree with your comments on Java using a BCP API. Whilst I agree about the limitations of Java and ODBC/JDBC, that doesn't mean there aren't advantages to a Java BCP API. We have a system with a lot of Java, and it's not practical or very effective to shell out from Java and run the BCP command-line utility.
Running the command-line utility doesn't give very good error reporting or support for deadlock retries.
It also requires writing the data to a file, which increases the number of operations and slows down the whole process. Sometimes we can't even write a file, as the code runs on a grid which doesn't have a file system and tmp is too small.
As for speed, JBCP is slower than the native API, but it is acceptable and certainly faster than issuing repeated insert commands.
Mwillett (author of JBCP)
I'm thinking not; it may be more an issue of fitting that operation into the JDBC spec.
I do see a JBCP project out on SourceForge, but don't have any experience with it.
If you don't mind your Java program not being portable anymore, you can link to any C library via JNI. That is preferable to having to rewrite your application, or to calling a separate task to BCP the data.
I assume you'd rather not rewrite your whole application in C++ ;-)
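For illustration, here's a minimal JNI sketch (the class name BulkLoader and method sendRow are made-up names, and handing the row to the bulk routines is left as a comment; this is the general JNI pattern, not Sybase-specific code):

    // Hypothetical Java side (names are made up for illustration):
    //
    //   public class BulkLoader {
    //       static { System.loadLibrary("bulkloader"); }
    //       public native int sendRow(String row);
    //   }
    //
    // Matching C++ side, compiled into bulkloader.dll / libbulkloader.so:
    #include <jni.h>
    #include <string>

    extern "C" JNIEXPORT jint JNICALL
    Java_BulkLoader_sendRow(JNIEnv *env, jobject /*self*/, jstring row)
    {
        const char *utf = env->GetStringUTFChars(row, nullptr);
        if (utf == nullptr) return -1;      // OutOfMemoryError was already thrown
        std::string copy(utf);              // hand `copy` to the C bulk routines here
        env->ReleaseStringUTFChars(row, utf);
        return 0;                           // 0 = row accepted
    }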
The answer is NO.
But why on Earth would you want to move masses of data from Java to the server? (1) Java isn't designed for that, so it will be very slow. (2) Native bcp, or C+bcp, or Perl+bcp, or any shell command+bcp would scream circles around it, and displace it anyway. It's like wanting to run bcp via ODBC or JDBC.
We need to move away from Maslow's Hammer and Use the Right Tool for the Job.
Further detail, responding to comments:
[1] An ordinary PROGRAM that connects to the ASE server (client-server style) uses the provided Open Client Library; this is native, and moves the TDS packets efficiently. The connection is a universally available one-inch garden hose. PROGRAMS written in C, C++, COBOL, Perl, and PowerBuilder use this transport.
[2] ODBC (and JDBC, which in this context sits on top of ODBC) is a simple method of connecting to the server using a one-millimetre hose. While this is quite adequate for tasks such as using Excel to draw charts directly from ASE tables, where the data transfer speed is not relevant, it is quite inadequate for moving data of any substantial volume, or for normal app access to a data server (except where the "programmer" is ignorant of the fact that [1] is available).
Java does not have [1] and is limited to this [2] transport.
bcp is a utility PROGRAM (it exists on its own), supplied by the vendor, that connects to the server much more tightly. It is not a "special bit of the client API". There is no "lying and cheating" involved; all constraints are directed by the DBA performing the task. The connection is a two and a half inch fire hose, not generally available to the public. It is designed to move large data volumes at speed. If used on the same host as the server, since the hose is not reticulated through the network, it moves data even faster.
Much later, the vendor made the bcp capability available as a Library (API in your terms), which can therefore be invoked from any reasonably architected compiler. C, C++, COBOL, and Perl are such, and produce PROGRAMS, and therefore provide access to this library directly from your code. The connection is the same two and a half inch fire hose, but due to the additional layers, it runs at a maximum speed of a two inch fire hose.
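From memory, a Bulk-Library upload from C/C++ has roughly the following shape (treat this as an outline only: the table name is a placeholder, and the exact signatures, constants, and required error handling are in the Open Client documentation):

    #include <ctpublic.h>   /* CT-Library: connections          */
    #include <bkpublic.h>   /* Bulk-Library: the blk_* routines */
    #include <cstring>

    /* Sketch only: assumes `conn` is an already-connected CS_CONNECTION
       opened with bulk-copy enabled, and a table with one int column. */
    CS_RETCODE bulk_in(CS_CONNECTION *conn, CS_INT value)
    {
        CS_BLKDESC *blk = nullptr;
        CS_DATAFMT  fmt;
        CS_INT      len  = sizeof(value);
        CS_INT      rows = 0;
        char        table[] = "mytable";              /* placeholder name */

        blk_alloc(conn, BLK_VERSION_100, &blk);       /* get a bulk descriptor */
        blk_init(blk, CS_BLK_IN, table, CS_NULLTERM); /* start a bulk-IN       */

        std::memset(&fmt, 0, sizeof(fmt));
        fmt.datatype = CS_INT_TYPE;
        fmt.count    = 1;
        blk_bind(blk, 1, &fmt, &value, &len, nullptr); /* bind column 1        */

        blk_rowxfer(blk);                             /* ship one row          */
        blk_done(blk, CS_BLK_ALL, &rows);             /* flush and finish      */
        blk_drop(blk);
        return CS_SUCCEED;
    }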
(Note to readers who expect this to be a complete list: There are two other options which I have not detailed here, because they are server-server and not relevant to this thread).
Since Java PROGRAMS are currently limited to the one-millimetre connection to the ASE server, there is no use in providing the bcp API to Java (you would merely have a two and a half inch fire hose reticulated through the network, with a FLOW of one millimetre); it is an absurd enterprise. There is a reason why, despite the millions many organisations have poured into Java during its rather long, um, progression, none of them have spent money on providing a firehose that moves drips and drops.
You cannot obtain the speed of a greyhound from a dachshund; there is no use giving it racetrack training. You can stop whispering promises in its ear.
Second, Java cannot handle large (source) data sets efficiently; it was not designed for that. Therefore even if the JDBC strangulation were lifted (e.g. a native Open Client Library were implemented), it still could not move data as fast as C, C++, COBOL, Perl, or PB; it would move data at a trickle (a quarter inch?) in the one-inch hose. Therefore, even then, it would be absurd to provide the bcp capability to the Java library; that would be worthwhile if and when Java (which was designed with other priorities in mind) is anointed with large data transfer capability.
It may help to move out of your Java, Java, and nothing but Java mindset, and Use the Right Tool (PROGRAM) for the Job. If you are a Java "programmer", then at least you need to familiarise yourself with the capability and limitations of your programming language and the libraries available. The original question demonstrates complete ignorance of that, hence I had to supply it in my revised post.
Programmers who are not limited to Java think about where the large data source is located; minimise data transfers across networks; think about what PROGRAMS are already written and can be used (as opposed to writing their own in isolation from the rest of the planet); and use them.
Finally, for understanding, even if you did obtain the bcp capability in some Library, and implemented it, when you place the "program" in the real world, any reasonable DBA will dismiss it due to its trickle speed data transport, and use bcp with its fire hose instead.
I'll give you a bit of background, because I don't think my question is clear without it. I don't know much about servers, but I think the background will make clear what I'm actually asking.
I was/am building a small C++ program to be used by just me (a homework manager, which needs to keep track of tasks and subtasks, so it needed multiple tables, etc.), so I figured I needed a database. I quickly stumbled upon SQLite, which was perfect for my case in many ways:
- it's free
- it only uses .db files, which can be interpreted by any software
- it can be embedded
- it's simple (in terms of documentation and libraries)
and most importantly:
- it is what SQLite.org describes as 'serverless'
However, I found SQLite's dynamic type system extremely annoying ('why' is beside the point; I might make separate posts asking questions about this), and I decided to look for an RDBMS that has all the pros I mentioned above but also has static typing.
While going down this rabbit hole of looking for an RDBMS to fit my needs, I came across many terms, all related to how the RDBMS is implemented with regard to the term 'server' and the like. All the terms are very vague, and the same word doesn't mean the same thing in every instance.
I noticed all of these keywords and contrasts popping up during my search:
Stand-alone vs server/client
Embedded vs... 'not embedded', I guess(?)
Classic serverless vs neo-serverless
Serverless... but in reality cloud-based (I thought clouds were servers(?))
Server vs service
Service vs application
User vs client
As far as I know, a server is a process executing in the background, not to be used by the user directly. But other than that, all these server-related terms are throwing me off.
I want an RDBMS that has this 'serverless'ness SQLite.org speaks of. I saw many professional free SQL RDBMS providers who spoke of the ability to have 'embedded servers', which does contain the word 'embedded', but it also contains 'server'. So my question is: when a provider speaks of these 'embedded servers', what does it really mean?
Does it mean that there is one application, and when it runs it opens another application which functions as a server? And if so, is that server a service or just another normal application-like process? Or does it work exactly the same as the serverlessness SQLite mentions, that being: the libraries inside the compiled project already handle everything needed to work with only .db files? Does it need any files other than the database and the executable? Does the communication between the application and the database file come directly from the code, or is another process used?
(PS: as a side-question: could you help me clear up what all the terms in the list above precisely mean?)
I realise my question might be all over the place, but so is the vocabulary I've come across in this journey. I hope you can understand where my confusion is coming from and can help me clear these points up. Thanks in advance.
By "serverless" SQLLite just means that it's a library, and doesn't run in a seperate process. In this it's like Access/Jet and other older DMBS programs that read and write files directly. The more common term for this is "embedded database".
The more common meaning of "serverless" these days is a cloud-based capability that doesn't require you to install or manage a "server" or VM. As in "We use Azure Functions for serverless compute".
The other DMBS systems are typically called "Client-Server DBMS", where the DMBS runs in a seperate process and the client program communicates with it over a network or some RPC mechanism. Client-Server RDBMS systems can be "bundled" or "embedded" with an application, and may not be running on a seperate computer, but would still be running in a seperate process.
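To make the "library, not a separate process" point concrete, here is a minimal sketch using the SQLite C API from C++ (file and table names are arbitrary); everything happens inside your own process, which reads and writes the .db file directly:

    #include <sqlite3.h>
    #include <cstdio>

    int main()
    {
        sqlite3 *db = nullptr;
        // No server to start: this call just opens (or creates) the file.
        if (sqlite3_open("homework.db", &db) != SQLITE_OK) {
            std::fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
            return 1;
        }
        // DDL and DML run in-process; the library manipulates the file itself.
        char *err = nullptr;
        sqlite3_exec(db,
            "CREATE TABLE IF NOT EXISTS task (id INTEGER PRIMARY KEY, title TEXT);"
            "INSERT INTO task (title) VALUES ('finish homework');",
            nullptr, nullptr, &err);
        if (err) { std::fprintf(stderr, "%s\n", err); sqlite3_free(err); }
        sqlite3_close(db);   // closes the file; nothing keeps running afterwards
        return 0;
    }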
Which is better (and for what reasons) to use to connect to MS SQL, Oracle or Firebird from a Delphi Win32 application -- ADO or DBX (Database Express)?
Both allow you to connect to the major databases. I like the way ADO does it all with a connection string change and the fact that ADO and the drivers are included with Windows so nothing extra to deploy (it seems, correct me if I'm wrong).
DBX is also flexible and I can compile the drivers into my app, can I not?
I really am keen to have a single source if possible, with the ability to vary databases depending on the customer's IT department/preferences.
But which is easier to program, performs better, uses memory most efficiently? Any other things to differentiate them on?
Thanks, Richard
ADO is simple to use and is already there; you only need to make sure the corresponding client driver is installed on the client side.
I found DBX more flexible, and it is better integrated with the IDE and with other technologies like DataSnap.
For the same purpose as you, I have used DBX with third-party drivers from DevArt.
You can compile the drivers into your application if you buy the driver sources.
In the early days of Delphi, people praised its multi-DBMS support. Everyone loved the BDE (because that was the only way to do it).
But looking at customers over more than the past decade, I have seen a steady decrease of multi-DBMS support in their applications.
The cost of supporting multiple DBMS from one application is high.
Not only because you have to have knowledge of each DBMS, but also because each DBMS has its own set of peculiarities, which you have to adapt to in your data access layer. These include not only differences in syntax and underlying data types, but also optimization strategies.
Also, some DBMS work better with ADO, some better with a direct connection (like skipping your Oracle client altogether).
Finally, testing all the combinations of your software with multiple DBMS systems is very labour-intensive.
I've been involved in a few projects where we had to change the DBMS back end and/or the data access technology (e.g. from BDE to DBX, or from DBX to a direct connection). Changing the back end was always much more painful than changing the data access technology. Multi-tier approaches made them somewhat easier, but increased the degrees of freedom and therefore the testing effort.
Some of the products I do see supporting multiple DBMS are vertical-market applications where the final customer already has their own DBMS infrastructure and the application needs to adapt to that. For instance, in Dutch governmental areas, Oracle has been really strong, but SQL Server has established quite a user base as well.
So you need to think about what combinations of DBMS you want to support, not only in terms of functionality, but also in terms of cost.
If you stick to one DBMS, then it makes no sense to go for a generic data access layer like BDE, DBX or ADO: it pays to use as direct a connection as possible.
My experience has taught me that these combinations do work well:
Interbase or Firebird with FIBPlus, AnyDAC, IBO or IBX*
Oracle with AnyDAC, DOA or ODAC
Microsoft SQL Server with ADO
* IBX does not like Firebird very much.
Hope this gives you some insight in the possibilities and limitations of supporting multiple DBMS from your Delphi applications.
--jeroen
General rule: every layer of components will possibly add an additional layer of bugs. Both ADO and DBX are component wrappers around standard database functionality, thus they're both equally strong.
So the proper choice should be based on other factors, like the databases that you want to use. If you want to connect to MS-Access or SQL Server, ADO would be the better choice since it's more native for these databases. But Firebird and Oracle are more native for the DBX components.
I personally tend to use the raw ADO API's, though. Then again, I don't use data-aware components in my projects. It's less RAD, I know. But I often need to work this way because I generally write client/server applications with several layers between the database and the GUI, thus making things more complicated.
My two cents: DBX is significantly faster (on both Oracle and SQL Server), and significantly more finicky and harder to deploy.
If performance is a factor, I'd go with DBX. Otherwise, I'd just use ADO for simplicity's sake.
As others have said, DBX may have the edge in raw performance in certain cases or under specific circumstances, but ADO is the basis for a very large number of applications in the world; so although ADO's performance may be relatively poorer, that clearly does not mean "unacceptably" poor.
For myself, and informed by major projects I have worked on, the biggest "problem" with DBX is that no matter how good it may be, it is a key infrastructure technology provided by a language/tools company.
Anyone that built applications on the previous BDE technology will testify to the disruption caused when that technology is deprecated and no longer supported. Whilst no technology is immune from deprecation by its provider, ADO clearly has the edge when it comes to industry support beyond the technology provider themselves.
For that reason I myself now always use ADO. Just changing the connection string isn't always the only thing to worry about when changing from one database type to another, however. Stored procedure call syntax can vary from one ADO provider to another, and you still have to watch the SQL syntax you use if you intend to deploy against multiple different SQL engines, where the SQL support may vary from one to another.
To mitigate these issues I use my own encapsulation of the ADO object model. This encapsulation does not attempt to mutate the object model into something that doesn't resemble ADO; it simply exposes those parts of ADO that I need to use directly in a more Object Pascal-friendly (and type-safe) form (e.g. enum types and sets for constants and flags, rather than scores if not hundreds of integer constants).
My encapsulation also takes care of some of the minor variations in different provider behaviours/requirements, such as the previously mentioned differences in stored procedure call syntax.
I should also say that, like another poster, I long ago stopped using "data-aware controls", which is what opens up this approach. If you need or wish to use data-aware controls and wish to use ADO, then you cannot use ADO directly and must instead find some encapsulation that exposes ADO through the VCL dataset model.
ADO is the Microsoft world.
DBX was created at the beginning (Delphi 6) for cross-platform development and Kylix.
I'm looking for a cross-platform database engine that can handle databases of up to hundreds of millions of records without severe degradation in query performance. It needs to have a C or C++ API which will allow easy, fast construction of records and parsing of returned data.
Highly discouraged are products where data has to be translated to and from strings just to get it into the database. The technical users storing things like IP addresses don't want or need this overhead. This is a very important criterion, so if you're going to refer to products, please be explicit about how they offer such a direct API. Not wishing to be rude, but I can use Google - please assume I've found most mainstream products, and I'm asking because it's often hard to work out just what direct API they offer, rather than just a C wrapper around SQL.
It does not need to be an RDBMS - a simple ISAM record-oriented approach would be sufficient.
Whilst the primary need is for a single-user database, expansion to some kind of shared file or server operations is likely for future use.
Access to source code, either open source or via licensing, is highly desirable if the database comes from a small company. It must not be GPL or LGPL.
You might consider c-tree by FairCom - tell 'em I sent you ;-)
I'm the author of hamsterdb.
Tokyo Cabinet and Berkeley DB should work fine. hamsterdb definitely will work. It's a plain C API, open source, platform independent, very fast, and tested with databases up to several hundred GB and hundreds of millions of items.
If you are willing to evaluate it and need support, then drop me a mail (contact form on hamsterdb.com) - I will help as well as I can!
bye
Christoph
You didn't mention what platform you are on, but if Windows only is OK, take a look at the Extensible Storage Engine (previously known as Jet Blue), the embedded ISAM table engine included in Windows 2000 and later. It's used for Active Directory, Exchange, and other internal components, optimized for a small number of large tables.
It has a C interface and supports binary data types natively. It supports indexes, transactions and uses a log to ensure atomicity and durability. There is no query language; you have to work with the tables and indexes directly yourself.
ESE doesn't like to open files over a network, and doesn't support sharing a database through file sharing. You're going to be hard pressed to find any database engine that supports sharing through file sharing. The Access Jet database engine (AKA Jet Red, totally separate code base) is the only one I know of, and it's notorious for corrupting files over the network, especially if they're large (>100 MB).
Whatever engine you use, you'll most likely have to implement the shared usage functions yourself in your own network server process or use a discrete database engine.
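As a rough sketch of what the ESE C interface looks like (from memory; check the ESENT documentation for exact parameters, and note the string types depend on whether JET_UNICODE is defined):

    #include <windows.h>
    #include <esent.h>   // link with esent.lib

    int main()
    {
        JET_INSTANCE inst = 0;
        JET_SESID    ses  = 0;
        JET_DBID     db   = 0;

        // One engine instance per process (or several, with distinct names).
        JetCreateInstance(&inst, "demo");
        JetInit(&inst);                               // recovers logs, opens the instance
        JetBeginSession(inst, &ses, nullptr, nullptr);
        JetCreateDatabase(ses, "demo.edb", nullptr, &db, 0);
        // ...work with JetOpenTable / JetMakeKey / JetSeek / JetRetrieveColumn...
        JetEndSession(ses, 0);
        JetTerm(inst);                                // flushes and shuts down
        return 0;
    }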
For anyone finding this page a few years later, I'm now using LevelDB with some scaffolding on top to add the multiple indexing necessary. In particular, it's a nice fit for embedded databases on iOS. I ended up writing a book about it! (Getting Started with LevelDB, from Packt in late 2013).
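For the curious, the LevelDB C++ API is tiny; a minimal open/put/get, without the extra indexing scaffolding mentioned above, looks like this (paths and keys are arbitrary):

    #include <cassert>
    #include <string>
    #include "leveldb/db.h"

    int main()
    {
        leveldb::DB *db = nullptr;
        leveldb::Options options;
        options.create_if_missing = true;
        leveldb::Status s = leveldb::DB::Open(options, "/tmp/testdb", &db);
        assert(s.ok());

        // Keys and values are arbitrary byte strings; no SQL, no string conversion.
        s = db->Put(leveldb::WriteOptions(), "ip:10.0.0.1", "host-a");
        std::string value;
        if (s.ok()) s = db->Get(leveldb::ReadOptions(), "ip:10.0.0.1", &value);
        delete db;   // closes the database
        return 0;
    }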
One option could be Firebird. It offers both a server based product, as well as an embedded product.
It is also open source and there are a large number of providers for all types of languages.
I believe what you are looking for is BerkeleyDB:
http://www.oracle.com/technology/products/berkeley-db/db/index.html
Never mind that it's Oracle, the license is free, and it's open-source -- the only catch is that if you redistribute your software that uses BerkeleyDB, you must make your source available as well -- or buy a license.
It does not provide SQL support, but rather direct lookups (via b-tree or hash-table structure, whichever makes more sense for your needs). It's extremely reliable, fast, ACID, has built-in replication support, and so on.
Here is a small quote from the page I refer to above, that lists a few features:
Data Storage
Berkeley DB stores data quickly and easily without the overhead found in other databases. Berkeley DB is a C library that runs in the same process as your application, avoiding the interprocess communication delays of using a remote database server. Shared caches keep the most active data in memory, avoiding costly disk access.
Local, in-process data storage
Schema-neutral, application native data format
Indexed and sequential retrieval (Btree, Queue, Recno, Hash)
Multiple processes per application and multiple threads per process
Fine grained and configurable locking for highly concurrent systems
Multi-version concurrency control (MVCC)
Support for secondary indexes
In-memory, on disk or both
Online Btree compaction
Online Btree disk space reclamation
Online abandoned lock removal
On disk data encryption (AES)
Records up to 4GB and tables up to 256TB
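To give a flavour of the direct (non-SQL) access the question asked about, here is a minimal Berkeley DB C API sketch (error handling omitted; file name arbitrary) that stores a native integer key without any string conversion:

    #include <cstring>
    #include <db.h>   // Berkeley DB

    int main()
    {
        DB *dbp = nullptr;
        db_create(&dbp, nullptr, 0);
        dbp->open(dbp, nullptr, "demo.db", nullptr, DB_BTREE, DB_CREATE, 0664);

        // Keys and values are raw byte buffers (DBTs) -- no string conversion.
        unsigned int ip = 0x0A000001;           // 10.0.0.1 as a native integer
        char host[] = "host-a";
        DBT key, val;
        std::memset(&key, 0, sizeof key);  key.data = &ip;  key.size = sizeof ip;
        std::memset(&val, 0, sizeof val);  val.data = host; val.size = sizeof host;

        dbp->put(dbp, nullptr, &key, &val, 0);  // direct b-tree insert
        dbp->close(dbp, 0);
        return 0;
    }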
Update: Just ran across this project and thought of the question you posted: http://tokyocabinet.sourceforge.net/index.html. It is under the LGPL, so not compatible with your restrictions, but an interesting project to check out nonetheless.
SQLite would meet those criteria, except for the eventual shared-file scenario in the future (and actually it could probably do that too, if the network file system implements file locks correctly).
Many good solutions (such as SQLite) have been mentioned. Let me add two, since you don't require SQL:
HamsterDB: fast, simple to use, can store arbitrary binary data. No provision for shared databases.
GLib's HashTable module seems quite interesting too, and is very common, so you won't risk going down a dead end. On the other hand, I'm not sure there is an easy way to store the database on disk; it's mostly for in-memory stuff.
I've tested both on multi-million records projects.
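For the GLib option, a minimal GHashTable sketch (usable from C++ as well; note it is purely in-memory, which is the limitation mentioned above):

    #include <glib.h>
    #include <cstdio>

    int main()
    {
        // String-keyed hash table; GLib is a C library, callable from C++ too.
        GHashTable *t = g_hash_table_new(g_str_hash, g_str_equal);
        g_hash_table_insert(t, (gpointer)"ip:10.0.0.1", (gpointer)"host-a");

        const char *v = (const char *)g_hash_table_lookup(t, "ip:10.0.0.1");
        std::printf("%s\n", v ? v : "(not found)");

        g_hash_table_destroy(t);   // frees the table (not the keys/values here)
        return 0;
    }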
As you are familiar with FairCom's c-tree, you are probably also familiar with Raima RDM.
It went open source a few years ago, then dbstar claimed that they had somehow acquired the copyright. This seems debatable, though. From reading the original Raima license, this does not seem possible. Of course, it is possible to stay with the original code release. It is rather rare, but I have a copy archived away.
SQLite tends to be the first option. It doesn't store data as strings, but I think you have to build an SQL command to do the insertion, and that command will involve some string building.
BerkeleyDB is a well-engineered product, if you don't need a relational DB. I have no idea what Oracle charges for it, or whether you would need a license for your application.
Personally, I would reconsider why you have some of your requirements. Have you done testing to verify the requirement that you need direct insertion into the database? It seems like you could take a couple of hours to write a wrapper that converts from whatever API you want to SQL, and then see if SQLite, MySQL, etc. meet your speed requirements.
There used to be a product called Btrieve, but I'm not sure if source code was included; I think it has been discontinued. The only database engine I know of with an ISAM orientation is c-tree.
I am creating a desktop app in Delphi and plan to use an embedded database. I've started the project using SQLite3 with the DISQLite3 library. It works, but the documentation seems a bit light. I recently found Firebird (yes, I've been out of Windows for a while) and it seems to have some compelling features and support.
What are some pros and cons of each embedded db? Size is important as well as support and resources. What have you used and why?
I'm using Firebird 2.1 Embedded and I'm quite happy with it. I like the fact that the database size is practically unlimited (tested with > 4 GB databases and it works) and that the database file is compatible with the Firebird server, so I can use standard tools for database management and inspection. Distribution consists of dropping a few files into your exe folder.
Simultaneous access from multiple programs is not supported but simultaneous access from multiple threads is (as long as you ensure that only one 'connect' operation is in progress at any given moment).
I have used SQLite3 on a lot of projects (but from C/C++ and Objective-C). It's extremely small -- no dependencies whatsoever -- and the database is a single file.
It's the db of choice for Mac developers because it's directly supported by CoreData and on the iPhone -- so there is a big user base (not to mention all of the other users).
I've been using SQLite (via DISQLite3) in FeedDemon for several months, and I highly recommend it - it has been extremely fast and stable. As Javier said, the docs for the library may be thin, but the docs for SQLite itself are very good.
I've used DBISAM on a number of projects. It is completely embedded, without even a need for an external DLL. Unlike the others you listed, it is commercial. A lot of great features though, and very well documented and supported. They have a successor to it that I haven't tried yet.
Let's see, quick comparison:
SQLite:
dynamic typing in the database
cross-platform files
runs on Windows, Linux, Mac, etc.
public domain
supports transactions
relies on file system security, does not include own security
Firebird embedded:
strong typing in the database
not all SQL datatypes are supported
cross-platform files
Firebird embedded only runs on Windows
Files from Firebird embedded are in the same format as the full server version
Files from Firebird embedded can be copied to a non-Windows server for use
available under a modified MPL ("what's ours is ours and must remain free, what's yours is yours and you don't have to release it")
supports transactions, triggers, etc.
MySQL embedded:
support for SQL features depends on file format
(IIRC) cross-platform files
GPL unless you pay royalties
runs on Windows, Linux, Mac
incredibly popular with the open source crowd
Even embedded databases have their strengths and weaknesses. You'll need to weigh those strengths and weaknesses against what you're doing to decide.
Firebird embedded is our #1 choice because, with no code changes, a single-user Delphi app with an embedded database can be migrated to a multi-user server-based deployment without sacrificing any of the high-end features (such as stored procedures, triggers, views, etc.). And it's a TRULY free database that doesn't GPL your code in the process.
I strongly recommend using AnyDAC when working with databases and Delphi - then you can choose to target FB or SQLite seamlessly.
My preference would be for FB for embedded apps.
Tom
I use Sybase's Advantage Database Server, but I'm also the R&D Manager, so this post is biased. :)
We have native Delphi TTable and TQuery components for both WIN32 VCL and VCL.NET. Direct table access in addition to SQL support makes Advantage unique among many of the other Delphi offerings. Advantage supports large tables (only limited by the number of records, 2 billion) and has a free local engine, which is nice for development PCs and for small customer sites that don't require client/server functionality. Switch to client/server with a single connection property, no other changes.
We have a ton of clients so accessing the data outside of Delphi is also very easy (.NET data provider, ODBC, OLE DB, PHP, Perl, JDBC, etc).
Main Product Web Site: http://www.advantagedatabase.com
Developer's Web Site: http://devzone.advantagedatabase.com
It really depends on what you need. For single-user applications, Firebird Embedded or SQLite are probably the best choices (and the price is right). On the other end, if you need support for a large number of concurrent users, you should probably use regular Firebird instead of the Embedded version (the server is simple to install, so you won't have many problems there).
And if you need something in between, for a moderately multi-user application, one of the flat-file databases would be better. I found ComponentAce's Absolute Database a better choice for my needs than DBISAM, NexusDB or VistaDB.
It leaves a relatively small footprint (no DLLs), it's a single-file DB (a must for me), it supports Unicode, BLOB compression, and encryption, and its technical limits seem impressive for a flat-file database. Moreover, support was good on the few occasions when I needed it.
On the cons side, I have noticed it doesn't support nested transactions, but other than that, I have had no problems.
As for size, nothing beats SQLite.
When you refer to the lack of documentation, I guess you mean the docs for DISQLite3. The SQLite docs themselves are quite complete.
Take a look at NexusDB. Have used very successfully in the past.
The problem with (embedded) Firebird is that the database cannot reside on a network drive. Also, it is difficult to have a database on a read-only drive (CD/DVD).
For some hacks around these limitations see the Delphi Wiki:
http://delphi.wikia.com/wiki/Firebird_tipps
NexusDB offers the full range from embedded, to full client/server / remote. Also SQL2003 compliant, I believe. I'm using it on a few projects, and am very pleased so far, and the fact that it can work in such a wide range of "scales" is a big plus (not having to learn another DB for scaled-up apps, etc).
Look at this embedded database comparison: http://sql-db.cz.cc/ - it can be helpful. Most of the above-mentioned products are presented there: Advantage, DBISAM, Firebird, MS SQL Server, and more: Accuracer, Apollo, ElevateDB, NexusDB, TurboDB.
I am partial to Component Ace's Absolute DB. Although a commercial product ($), it is solid, easy to use, small footprint and well documented. If you are looking for a huge multi-user application, this is not the way to go, but if your multi-user needs are light (or non-existent) this is a solid option.
I'm using SQL Server Express and the ADO components. Works great. You can run the SQL Server Express install with commandline to hide the complexities from the users. You can also distribute a database that you load by filename. There are millions of SQL server users so solutions to any problems are easily found in the intertubes :-)
I did a websearch to find a fast database package for my Delphi Application. I wanted it to be completely contained in the executable with no external DLLs or libraries required. I originally found Accuracer by AidAim. They had posted how fast their database was and even gave comparisons with other similar packages to “prove” their point.
I wanted to believe their claims but I thought I’d search the web a bit more to find timings of other packages. I was very surprised to find a post at the Delphi discussion forums where a person asked what database to use, and there were 14 different suggestions. One of the responders had done his own timing comparisons and had found Accuracer to be quite slow compared to several others, which Accuracer had (conveniently) left out of their own comparison page.
The post, plus additional followup web research by me, led me to lean toward DISQLite3, a product based on the Open Source SQLite program, but with enhancements to work in Delphi very quickly, with very small overhead, and with command-based calls - which I like. It is actively under development and will soon have an official Delphi 2009 version, although apparently the current version will work under D2009.
Addendum: DISQLite3 Version 2.0.0, released Nov 17, supports D2009.
I know MS Access is a comparatively crap DB (and I expect to be shot down in flames here), but if only a small amount of data is needed, it may have advantages if MS Office is used anyway. For me it was a way to store program data with more flexibility than CSV files, which are a common approach for scientific code.
You can create an Access DB from Delphi code without having MS Office installed, using ADO and the ODBC driver (it might be necessary to have an initial .accdb file without tables to copy and then populate; I can't remember this detail). I'm not sure about the licensing situation when doing this.
The .accdb extension can be changed to something else, and the file can be password-protected (to a limited degree), so it's not immediately obvious to users that it's Access, if that's desired.
I know a few commercial developers who use this method, and I copied it myself. I found it easier to set up than SQLite, but maybe that's because I'd already used ADO and Access in the past.
I have used ScimoreDB. They offer it royalty-free, but it has its quirks in data types and some installation issues. This was on a C# project.
If embedded is an absolute must, look at DBISAM.
kbMemTable is a good candidate. Runs in memory, fast, multi-threading. Used to be free.
Components4Developers
I have used DBISAM and kbMemTable on different occasions.
What I like about DBISAM is that it has great features, and is usually very reliable. I have used it in large databases, full-text search, read-only mode, CGIs and many other situations.
It is fairly large compared to kbMemTable or SQLite-based components, though. And you can't have a single file per database (or even per table) - depending on the situation, that is a major disadvantage.
kbMemTable is tiny and it's great for small amounts of data. Since it runs in memory, it has to be a small amount of data, of course.
One other option I've taken in a couple of my desktop apps is dumping the data directly from/to my object hierarchy using TWriter/TReader. This is by far the smallest option, and is absurdly fast compared to using a database. The data files are tiny, too.
It has all kinds of drawbacks, though - you have to code in versioning if you might ever want to add or change fields; unless everything stays in-memory it is even more complicated; there is no multi-user support at all; etc.
Firebird embedded is our #1 choice as well. And the suite Unified Interbase v2.0 with it. A great and stable solution!
I have a database in which I have to record 5 fields of data every 20 seconds for 10 days: 3 fields are integer, 1 field is double (time), and 1 field is string[5].
I am still using Delphi 6 SP2 because of my components. Newer Delphi versions are terrible with components; I would have to spend thousands of dollars to rebuild my component library. Therefore Delphi 6 is still best for real commercial applications; newer versions of Delphi cause many problems at points such as USB or COM-port reading, and they release newer versions before the previous ones have even settled in the market.
I have set up code in Delphi 6 that appends 43,200 records to a table as a test, because I will deploy the table in the application with 43,200 records in it. I will show all the data on a DBChart.
The test results below show how long each database took to fill the table with 43,200 records via insert commands:
DBISAM = 34 sec
ElevateDB = 11 sec
AbsoluteDB = 45 sec
SQLite = 32 min
Firebird = 12 min
MSSQL12 LocalDB = 28 min
EasyTable = 8 min
BDE = blocked
I haven't tested Oracle, Blackfish, Sybase, NexusDB, etc., but it seems they would also be very slow. I connected a DBChart, and only ElevateDB and AbsoluteDB loaded the 43,200 records into the DBChart in acceptable time (around 7-10 sec); all the others took minutes. So the slower databases always need coding tricks to succeed in some real jobs.
I tested their search speed as well, using the Locate command; unfortunately the server-based databases are always slower at that.
MSSQL and SQLite3 are extremely difficult to manage from Delphi; they made me very tired.
These are my test results
In the end I decided to use AbsoluteDB, DBISAM, and ElevateDB; I have thrown the rest off the PC.
ElevateDB doesn't support a RecNo function, which requires extra code at runtime to manage and makes the database slower. Another problem with ElevateDB is autoinc fields: there is no way to reset them. Therefore I have not chosen ElevateDB, even though it is the fastest database. They advertise many good functions, but how many of them do we actually use? They left the most important functions unsupported while fixing many, many unnecessary ones, and it seems that in 8 years there has been no improvement either.
If you want to see for yourself, please just try it and see.
I am now deciding between two: AbsoluteDB or DBISAM 4.
Firebird all the way. It does pretty much everything, and so far version 2.1 is very solid.
FireBird offers the opportunity to scale up to multi-users sometime down the line, or if you need concurrency (if your application goes multi-threaded).
SQLite is quite unrivaled if you only need single-user access, no other database comes close to it on any aspect, be it performance, convenience, SQL support or stability.
Firebird is really awesome and has a small footprint, so you can use it embedded, it can be scaled upward for many users, and it does Unicode fairly well.
I use DevArt components with Delphi 2009, and FIBPlus for Delphi 6/7 (their version for 2009 and Unicode is not ready yet; too bad).
Hmmm, no one has recommended the BDE - I wonder why that is ;-)
BlackFishSQL is another possibility, although I haven't tested in depth as yet.
When it comes to embedded databases, the first question is: is it multi-user?
Actually, who needs a database that does not allow multiple connections (read and write) to it?
I have tried (intensively) all the databases mentioned and found only one that actually functions the way it should: Accuracer.
The only pity with Accuracer is that it's a three-man band with a chronic lack of proper support. It is also mostly static in development; we have seen no real new features in years. Not surprising, since only one person actually develops it. It seems they are living on old fame, and the user praise reflects that (the comments are usually 10 years old).
For a single user experience I would recommend Absolute Database.
As for the major players, I would recommend SQL Server from Microsoft. Oracle has become bloatware and is slowly dying out.
PS: What is nice about Accuracer is that its embedded database functions just like a full-blown server: it locks only the current record if it's in use, while the rest functions normally. Nice database. The only pity is that it is stagnant.
I have an experiment streaming up to 1 Mb/s of numeric data which needs to be stored for later processing.
It seems as easy to write directly into a database as to a CSV file and I would then have the ability to easily retrieve subsets or ranges.
I have experience of sqlite2 (when it only had text fields) and it seemed pretty much as fast as raw disk access.
Any opinions on the best current in-process DBMS for this application?
Sorry - I should have added that this is C++, initially on Windows, but cross-platform is nice. Ideally the DB binary file format should be cross-platform.
If you only need to read/write the data, without any checking or manipulation done in the database, then both should do it fine. Firebird's database file can be copied, as long as the systems have the same endianness (i.e. you cannot copy the file between systems with Intel and PPC processors, but Intel-to-Intel is fine).
However, if you need to ever do anything with data, which is beyond simple read/write, then go with Firebird, as it is a full SQL server with all the 'enterprise' features like triggers, views, stored procedures, temporary tables, etc.
BTW, if you decide to give Firebird a try, I highly recommend you use the IBPP library to access it. It is a very thin C++ wrapper around Firebird's C API. It has about 10 classes that encapsulate everything, and it's dead easy to use.
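From memory of the IBPP interface (the factory functions return reference-counted interface objects; double-check the exact spellings against the IBPP headers), using it looks roughly like this, with the database path, credentials, and table names as placeholder assumptions:

    #include <ibpp.h>

    int main()
    {
        // Factories return smart-pointer-like interface objects.
        IBPP::Database db = IBPP::DatabaseFactory(
            "", "experiment.fdb", "SYSDBA", "masterkey");  // "" = local server
        db->Connect();

        IBPP::Transaction tr = IBPP::TransactionFactory(db);
        tr->Start();

        IBPP::Statement st = IBPP::StatementFactory(db, tr);
        st->Execute("SELECT sample_time, value FROM samples");
        while (st->Fetch()) {          // iterate over the result rows
            double t, v;
            st->Get(1, t);
            st->Get(2, v);
        }
        tr->Commit();
        db->Disconnect();
        return 0;
    }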
If all you want to do is store the numbers and be able to easily do range queries, you can just take any standard tree data structure you have available in the STL and serialize it to disk. This may bite you in a cross-platform environment, especially if you are trying to go cross-architecture.
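For example, a minimal sketch of that idea (assuming timestamp-keyed doubles; note the caveat above: raw native-endian bytes, no versioning, not portable across architectures):

    #include <cstdint>
    #include <fstream>
    #include <map>

    // Write (timestamp -> value) pairs as raw native-endian bytes.
    void save(const std::map<std::uint64_t, double> &m, const char *path)
    {
        std::ofstream out(path, std::ios::binary);
        for (const auto &kv : m) {
            out.write(reinterpret_cast<const char *>(&kv.first),  sizeof kv.first);
            out.write(reinterpret_cast<const char *>(&kv.second), sizeof kv.second);
        }
    }

    void load(std::map<std::uint64_t, double> &m, const char *path)
    {
        std::ifstream in(path, std::ios::binary);
        std::uint64_t k; double v;
        while (in.read(reinterpret_cast<char *>(&k), sizeof k) &&
               in.read(reinterpret_cast<char *>(&v), sizeof v))
            m[k] = v;
    }
    // A range query is then just m.lower_bound(a) .. m.upper_bound(b).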
As far as more flexible/people-friendly solutions go, sqlite3 is widely used, solid, stable, and very nice all around.
BerkeleyDB has a number of good features for which one would use it, but none of them apply in this scenario, imho.
I'd say go with sqlite3 if you can accept the license agreement.
-D
It depends on what language you are using. If it's C/C++, Tcl, or PHP, SQLite is still among the best in the single-writer scenario. If you don't need SQL access, a Berkeley DB-style library might be slightly faster, like Sleepycat or gdbm. With multiple writers you could consider a separate client/server solution, but it doesn't sound like you need it. If you're using Java, hsqldb or Derby (shipped with Sun's JVM under the "JavaDB" branding) seem to be the solutions of choice.
You may also want to consider a numeric data file format that is specifically geared towards storing these types of large data sets. For example:
HDF -- the most common and well supported in many languages with free libraries. I highly recommend this.
CDF -- a similar format used by NASA (but usable by anyone).
NetCDF -- another similar format (the latest version is actually a stripped-down HDF5).
This link has some info about the differences between the above data set types:
http://nssdc.gsfc.nasa.gov/cdf/html/FAQ.html
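As a taste of the HDF5 C API (this uses the 1.8-era H5Dcreate2 signature; the file name, dataset name, and sizes are arbitrary):

    #include <hdf5.h>

    int main()
    {
        // Create a file and write a 1-D array of doubles into a dataset.
        double samples[1000] = {0};
        hsize_t dims[1] = {1000};

        hid_t file  = H5Fcreate("run.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
        hid_t space = H5Screate_simple(1, dims, nullptr);
        hid_t dset  = H5Dcreate2(file, "/samples", H5T_NATIVE_DOUBLE, space,
                                 H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

        H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, samples);

        H5Dclose(dset);
        H5Sclose(space);
        H5Fclose(file);
        return 0;
    }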
I suspect that neither database will let you write data at such a high speed. You can check this yourself to be sure. In my experience, SQLite failed to INSERT more than 1000 rows per second for a very simple table with a single integer primary key.
In case of a performance problem, I would write the files in CSV format and later load their data into the database (SQLite or Firebird) for further processing.
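For what it's worth, SQLite's insert rate is usually dominated by transaction handling: every INSERT outside an explicit transaction is its own synced transaction, which is where ceilings like 1000 rows/second tend to come from. A sketch of the usual mitigation (one transaction plus a prepared statement; the table name is assumed):

    #include <sqlite3.h>

    // Insert `n` samples inside one transaction with one prepared statement.
    // Without BEGIN/COMMIT, each INSERT pays for its own transaction and sync.
    int bulk_insert(sqlite3 *db, const double *values, int n)
    {
        sqlite3_exec(db, "BEGIN", nullptr, nullptr, nullptr);

        sqlite3_stmt *st = nullptr;
        sqlite3_prepare_v2(db, "INSERT INTO samples(value) VALUES (?)", -1, &st, nullptr);
        for (int i = 0; i < n; ++i) {
            sqlite3_bind_double(st, 1, values[i]);
            sqlite3_step(st);       // executes the insert
            sqlite3_reset(st);      // ready the statement for the next row
        }
        sqlite3_finalize(st);

        return sqlite3_exec(db, "COMMIT", nullptr, nullptr, nullptr);
    }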