save file in Lotus Notes and full text search - database

Saving file in Lotus Notes have two problems:
the full-text search is slow and inefficient
the single database cannot be very big(related to the os file system).
How can i resolve the problem and Are there any replacement?

In Lotus Notes/Domino (version 8.x) max file size for a single notes database is 64 Gb. I work with Lotus Notes/Domino for 17 years, and I did not meet the db size limit in my projects. Just split your project to different databases, according to its functional purpose.
If you want to store very big files, seems that you should choose a different platform. Lotus Notes is not a suitable tool in this case.
Regarding full-text search, if you setup the immediate indexing, then Domino server will index all new information added to the database as soon as possible. In my experience for small amount of data the indexing worked immediately, for big files it took one-two minutes.
The search itself works fast and I did not notice any slow behaviour on a hardware server that meets the Lotus Domino software requirements.

Related

Appropriate Database to store 20 GB for Delphi, Firemonkey

I don't have experience in database development, so I need your suggestions in choosing of a database that can be used in Firemonkey.
I need to store html files (without media now, but they can be with), their total size is around 20 GB (uncompressed text). A main feature must be maximally fast searching of text in the database, and it must be possible to implement human searching (like google). Plus, there can be compression (20 GB is to much to store. If compression makes searching slow it's not required).
What kind of databases are appropriate for my concern?
Thanks a lot for your suggestions!
Edited
Requirements:
Price: Free
Location: local or remote
Operating system support: Windows
System requirements: a database with a large footprint
(hopefully in exchange of better performances)
Performances: fast text searching
Concurrent users: 20
Full text indexing and searching: human (Google-like) fast
text searching is required
Manageability: doesn't matter much
I know an on-line web legal database that can search words through 100 GB of information in milliseconds. I need the same performance, and Google-like searching is required.
Delphi database access layer is separated from FireMonkey, it's the same used by VCL (although FM AFAIK relies only on LiveBindings to access data, but that's not an issue in your case).
Today 20 GB are really not much data. Almost any database will handle them without much effort if properly configured. What engine to choose depends on:
Price: how much are you going to spend for it?
Location: do you need a local database (same machine) or a remote one (LAN or WAN)?
Operating system support: which OS should it run on?
System requirements: do you need a database with a small footprint, or you can use one with a larger one (hopefully in exchange of better performances)?
Performances: what are the required performances?
Concurrent users: how much user will connect to the database concurrently?
Full text indexing and searching: not all databases offer it out of the box
Manageability: some databases may require more management than others.
There is no "one database fits all" yet.
I'm no DBA so I can't say directly, and honestly I'm not sure that any one person could give a direct answer to this question as it's one of those it just depends scenarios.
http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems
That's a good starting point to compare features and platform compatibility. I think the major thing to consider here is what hardware will be running it and how can you best utilize that to accomplish the task at hand.
If you have a server farm being sure your DB supports distribution and some sort of load balancing (most do to some degree from what I understand).
To speed up searching unless you code up a custom algorithm that searches the compressed version somehow I think you're going to want to keep the data un-compressed. Searching the compressed data actually might be faster. If you're able to use the index for the compressed file to compare against your plain text search parameters then are just looking for those keys that were matched within the index. If any are found in the index check for them within the compressed data. Without tons of custom code I haven't heard of any DB that supports this idea of searching compressed text (though I could easily be wrong on this point).
If the entire data set needs to be decompressed before doing the search it will very likely be much slower (memory is relatively cheap compared to CPU time). It looks like Firemonkey has a limited selection of DBs to use so that will help to narrow your choices down as well.
What I would suggest based on your edited question, is to write (or find) a parser or regular expression to extract all the important elements from the HTML that you would like to be searchable. Then store those in a database along with a reference for where they were found in the HTML. In terms of Google like searching, if you mean in terms of how it can correct misspellings and use synonyms, you probably need some sort of custom code to do dictionary look ups for spelling and thesaurus look ups for synonyms. I believe full text searching in any modern DB will handle the need to query with LIKE or similar statements in the where clause.
Looks like ldsandon's answer covers most of this anyhow. TLDR; if not thanks for reading.
I would recommend PostgreSQL for this task. It has good performance, and built in full text search capability for Google-like searching. And it's free and open source.
Unfortunately Delphi doesn't come with Postgres data access components out of the box. You can connect by ODBC, or you can purchase components available from, for example, Devart, DA-Soft or microOLAP.
Have you considered NoSQL databases? The Wikipedia article explains their differences to SQL databases and also mentions that they are suited as document store.
http://en.wikipedia.org/wiki/NoSQL
The article lists around twelve implementations in the document store category, many are open source. (Jackrabbit, CouchDB, MongoDB).
This question on Stackoverflow contains some pointers to Delphi clients:
Delphi and NoSQL
I would also consider caching on the application server, to speed up search. And of course a text indexing solution like Apache Lucene.
I would take Microsoft SQL Server Express Edition. I think 2008 R2 is latest stable version but there is also Denali (2011). It match all criterien you have.
You can use ADO to work with.
Try the Advantage Database Server.
It's easy to manage and configure.
Both dbase-like and SQL data management languages.
Fast indexed full text search capabilities.
Plus, unparalled support from the developers themselves.
The local server (stand-alone version, as opposed to the network based server) is free.
devzone.advantagedatabase.com
There is a Firebird version with full text search according to its documentation - http://www.red-soft.biz/en/document_21 - it uses Apache Lucene, a popular search engine

database choice: protocoling data for hundereds to million objects

I am running economic simulations. hundreds, thousands or possibly millions (but not more than 30.000.000) agents = objects have to protocol data over time, the data has always the same structure. It typically is composed from a number of booleans, floats and integers, possibly arrays/list (between 5 and 100 different variables). During the simulation the database has no read access. After the simulation the data will not be changed anymore.
For every simulation I will create a new database.
The current programming-language is python, but the choice has impact on a future project in java. Its also possible that in the future the project is run on a network.
If it matters: the objects communicate via 0mq.
What database type and implementation should I choose?
it needs to be open source
good python and java APIs are indispensable
This depends on your personal needs. Millions of rows are not that much for a database.
For a start I would try Oracle 11g Express Edition. It's free and 100% compatible to the paid version, but it's limited to 11 GB of user data and 1 GB of RAM.
If you're on the open source train, check out PostgreSQL.
However, keep in mind that (most likely) you'll be bound to any database software you choose.
Since the simulation is run on a single computer sqlite3 turned out to be a good solution. It avoids the database server which reduces complexity and increases speed.

Size of db lotus notes

I need suggestion regarding maximum size for a db lotus notes highly volatil, i.e. an application based on a db of 8+ Gb accessed by 20 users in average inserting attachments and running scripts.
tks !!
There are limits to the size of a Notes Database (sorry Ken). See the Notes Help "Table of Notes and Domino known limits" and Technote #1308379.
The most important ones are:
Database size: The maximum OS file size limit -- (up to 64GB)
Fields in a database: ~ 3000 (limited to ~ 64K total length for all field names). You can enable the database property "Allow more fields in database" to get up to 22,893 uniquely-named fields in the database.
Views in a database: No limit; however, as the number of views increases, the length of time to display other views also increases
Documents in a view: Up to the maximum size of the database
Ususally the "limiting" factors for an application are view rebuild and full text index times, as Ken suggested.
You may want to checkout Andre Guirards postings on the topic of performance as well as his white paper Performance basics for IBM Lotus Notes developers and the Domino Wiki.
I'm not sure if this answers your question, but there's theoretically no limit to the size of a Notes database. Years ago I remember hearing at Lotusphere they had tested a database at 64GBs and it worked.
That said, there will likely be some issues with view indexes growing large, and long waits for refreshing views.
In the link below you can find the limitations that concern the Lotus Notes Databases as they come from IBM and stated also from leyrer.
Limits of Lotus Notes
However, in our company we are working heavily with Lotus Notes databases and our databases our growing very fast mostly due to attachments that can be documents, spreadsheets, images, etc. The solution that we implemented in order to avoid reaching the size limits was to have normally the application and attach to it databases for the attachments. In this way the attachments are not stored to the main application, but to the attached dbs. You can create as many as you can. For example you can have the following:
MyApp.nsf - Main application
MyAppAttach1.nsf - Attached
MyAppAttach2.nsf - Attached
In this way we can also control how much the dbs will grow. I hope that this can help you in your implementation.
For large databases, it may be important to think about a strategy for archiving documents once they are no longer being actively processed. You don't mention how many documents are created/edited/deleted every day, or how large the average document is; but if it is 8GB now, how large will it be next month, or next year? Depending on the factors martin listed in his answer, this could become a concern long before you reach the 64 GB supported limit, and it is better to be prepared in advance.
First suggestion is, create an archive database and reduce the size of the main database. In the next step you should write some process in order to archive the documents weekly or monthly according to your needs.

Storing rich text documents

This is a follow-up to another question I asked earlier today. I am creating a desktop app that stores rich text documents created in WPF (in a RichTextBox control). The app uses SQL Compact, and up until now, I had planned to store each document in a binary column in the database.
I am rethinking that approach. Would it be better practice to store each rich text document in the file system, rather than saving it to the database? I figure I could put the documents in the same folder with the database, then store a relative path to each document in its database record, along with other information about the document (tags and so on).
I'd like to know some pros and cons of that approach, along with ideas of what is generally considered best practice for this sort of thing. Thanks for your help.
Personally I tend to use the filesystem
Pro DB
Can search using SQL search features (will probably be a bit wonky with RTF becuase of the control codes)
Backup the MDF file & you've backed all the documents in one place
Can easily implement versioning
Easier to keep file data & stuff that references it in sync
Pro Filesystem
Loadable by external apps (and people)
A corrupt DB kills all your documents
Searchable via filesystem tools/indexers
Less complex IO code needed
Familiar to user
The path can point anywhere (i.e on another machine/another logical drive)
More portable IO code
Not sure why it has not been mentioned before but have you looked at the FILESTREAM data type that is available in SQL Server 2008 and above?
It combines the benefits of file system storage with the benefits of DB storage. Here is a link to a MS white paper http://download.microsoft.com/download/a/c/d/acd8e043-d69b-4f09-bc9e-4168b65aaa71/SQL2008UnstructuredData.doc
Another very strong point with filestream from my point of view is that is does not eat into the size limit of the express editions of SQL Server which can be very handy

Which embedded database to use in a Delphi application?

I am creating a desktop app in Delphi and plan to use an embedded database. I've started the project using SQlite3 with the DISQLite3 library. It works but documentation seems a bit light. I recently found Firebird (yes I've been out of Windows for a while) and it seems to have some compelling features and support.
What are some pros and cons of each embedded db? Size is important as well as support and resources. What have you used and why?
I'm using Firebird 2.1 Embedded and I'm quite happy with it.I like the fact that the database size is practically unlimited (tested with > 4 GB databases and it works) and that the database file is compatible with the Firebird Server so I can use standard tools for database management and inspection. Distribution consists of dropping few files in your exe folder.
Simultaneous access from multiple programs is not supported but simultaneous access from multiple threads is (as long as you ensure that only one 'connect' operation is in progress at any given moment).
I have used SQlite3 for a lot of projects (but from C/C++ and Objective-C). It's extremely small -- no dependencies whatsoever -- database is in a single file.
It's the db of choice for Mac developers because it's directly supported by CoreData and on the iPhone -- so there is a big user base (not to mention all of the other users).
I've been using SQLite (via DISQLite3) in FeedDemon for several months, and I highly recommend it - it has been extremely fast and stable. As Javier said, the docs for the library may be thin, but the docs for SQLite itself are very good.
I've used DBISAM on a number of projects. It is completely embedded without even a need for an external DLL. Unlike the others you listed it is commercial. A lot of great features though and very well documented and supported. The have a successor to it that I haven't tried yet though.
Let's see, quick comparison:
SQLite:
dynamic typing in the database
cross-platform files
runs on Windows, Linux, Mac, etc.
public domain
supports transactions
relies on file system security, does not include own security
Firebird embedded:
strong typing in the database
not all SQL datatypes are supported
cross-platform files
Firebird embedded only runs on Windows
Files from Firebird embedded are in the same format as the full server version
Files from Firebird embedded can be copied to a non-Windows server for use
available under a modified MPL ("what's ours is ours and must remain free, what's yours is yours and you don't have to release it")
supports transactions, triggers, etc.
MySQL embedded:
support for SQL features depends on file format
(IIRC) cross-platform files
GPL unless you pay royalties
runs on Windows, Linux, Mac
incredibly popular with the open source crowd
Even embedded databases have their strengths and weaknesses. You'll need to weigh those strengths and weaknesses against what you're doing to decide.
Firebird embedded is our #1 choice because with no code changes, a single user Delphi app with embedded database can be migrated to a multi-user server based deployment without sacrificing any of the high end features (such as stored procedures, triggers, views, etc.). And its a TRUE free database and doesn't GPL your code in the process.
Strongly recommend to use AnyDAC when working with Databases and Delphi - then you can choose to target FB or SQLite seamlessingly.
My preference would be for FB for embedded apps.
Tom
I use Sybase's Advantage Database Server, but I'm also the R&D Manager, so this post is biased. :)
We have native Delphi TTable and TQuery components for both WIN32 VCL and VCL.NET. Direct table access in addition to SQL support makes Advantage unique among many of the other Delphi offerings. Advantage supports large tables (only limited by the number of records, 2 billion) and has a free local engine, which is nice for development PCs and for small customer sites that don't require client/server functionality. Switch to client/server with a single connection property, no other changes.
We have a ton of clients so accessing the data outside of Delphi is also very easy (.NET data provider, ODBC, OLE DB, PHP, Perl, JDBC, etc).
Main Product Web Site: http://www.advantagedatabase.com
Developer's Web Site: http://devzone.advantagedatabase.com
It really depends what you need. For single-user applications, Firebird Embedded or SQLite are probably best choices (and price is right). On the other end, if you need support for large number of multiple users, you should probably use regular Firebird instead of Embedded version (server is simple to install so you won't have much problems here).
And if you need something in between, for a moderate multi-user application, one of flat databases would be better. I found that ComponentAce's Absolute Database better choice for my needs than DBISAM, NexusDB or VistaDB.
It leaves relatively small footprint (no DLLs), it's a single-file db (a must for me), supports Unicode, BLOB compression, crypting, and technical limits seem impressing for a flat database. Moreover, support was good in few occasions when I needed it.
For cons, I have noticed it doesn't support nested transactions, but other than that, I had no problems.
As for size, nothing beats SQLite.
when you refer about lack of documentation, i guess it's doc for DISQLite3. The SQLite docs are quite complete
Take a look at NexusDB. Have used very successfully in the past.
The problem with (embedded) firebird is, that the database cannot reside on a network drive. Also, it is difficult to have a database on a read only drive (CD/DVD).
For some hacks around these limitations see the Delphi Wiki:
http://delphi.wikia.com/wiki/Firebird_tipps
NexusDB offers the full range from embedded, to full client/server / remote. Also SQL2003 compliant, I believe. I'm using it on a few projects, and am very pleased so far, and the fact that it can work in such a wide range of "scales" is a big plus (not having to learn another DB for scaled-up apps, etc).
Look at this embedded database comparison: http://sql-db.cz.cc/, it can be helpful. Most of abovementioned products are presented there: Advantage, DBISAM, Firebird, MS SQL Server, and much more: Accuracer, Apollo, ElevateDB, NexusDB, TurboDB.
I am partial to Component Ace's Absolute DB. Although a commercial product ($), it is solid, easy to use, small footprint and well documented. If you are looking for a huge multi-user application, this is not the way to go, but if your multi-user needs are light (or non-existent) this is a solid option.
I'm using SQL Server Express and the ADO components. Works great. You can run the SQL Server Express install with commandline to hide the complexities from the users. You can also distribute a database that you load by filename. There are millions of SQL server users so solutions to any problems are easily found in the intertubes :-)
I did a websearch to find a fast database package for my Delphi Application. I wanted it to be completely contained in the executable with no external DLLs or libraries required. I originally found Accuracer by AidAim. They had posted how fast their database was and even gave comparisons with other similar packages to “prove” their point.
I wanted to believe their claims but I thought I’d search the web a bit more to find timings of other packages. I was very surprised to find a post at the Delphi discussion forums where a person asked what database to use, and there were 14 different suggestions. One of the responders had done his own timing comparisons and had found Accuracer to be quite slow compared to several others, which Accuracer had (conveniently) left out of their own comparison page.
The post, plus additional followup web research by me, led me to lean toward DISQLite3, a product based on the Open Source SQLite program, but with enhancements to work in Delphi very quickly, with very small overhead, and with command-based calls - which I like. It is actively under development and will soon have an official Delphi 2009 version, although apparently the current version will work under D2009.
Addenum: DISQLite3 Version 2.0.0, released Nov 17, supports D2009.
I know MS access is a comparatively crap db (and expect to be shot down in flames here), but if only small data is needed it may have advantages if ms office is used anyway. For me it was a way to store program data with more flexibility than csv files which is a common approach for scientific code.
You can create an access db from delphi code without having ms office installed using ado & odbc driver (might be necesary to have an initial .accdb file without tables to copy from then populate, I can't remember this detail. not sure licensing situation doing this.
The .accdb extension can be changed to something else & the file password protected (to a limited degree) so its not immediately obvious to users its access if that's desired.
I know a few commercial developers do this method & copied it myself. Found it easier to setup than sqlite, but maybe because I'd already used ado & access in the past.
I have used ScimoreDB. It has its quirks as they give it royalty free and it has its quirks in data types and with some installation issues. This was on a C# project.
If embedded is an absolute must, look at DBISAM.
kbMemTable is a good candidate. Runs in memory, fast, multi-threadding. Used to be free.
Components4Developers
I have used DBISAM and kbMemTable on different occasions.
What I like about DBISAM is that it has great features, and is usually very reliable. I have used it in large databases, full-text search, read-only mode, CGIs and many other situations.
It is fairly large compared to kbMemTable or SQLite based components, though. And you can't have a single file per database (or even table) - depending on the situation, that is a major disadvantage.
kbMemTable is tiny and it's great for small amounts of data. Since it runs in memory, it has to be a small amount of data, of course.
One other option I've taken on a couple of my desktop apps is dumping the data directly from/to my object hierarchy using TWriter/TReader. This is by far that smallest option, and is absurdly fast compared to using a database. The data files are tiny, too.
It has all kinds of drawbacks, though - you have to code versioning in if you might want to ever add/change fields, unless it's in-memory it is even more complicated, no multi-user support at all, etc.
Firebird embedded is our #1 choice as well. And the suite Unified Interbase v2.0 with it. A great and stable solution!
I have a database that I have to record 5 field data for every 20 sec for 10 days.. 3 field are integer , 1 field is double ( time ) and 1 field is string[5].
I am still using Delphi6 srv2 because of my components. Newer delphi versions are terrible at components that I have to spend thousands of dollars of money to rebuild my component library. Therefor delphi 6 is still best for real commertial applications that never version of delphis give many problems. At many points such as USB or comport readings so on... they release newer ones before previous versions never sit on market.
I have setup a code with Delphi6 what appends 43200 records at a table for test because I will deploy the table in application while it has 43200 records. I will shown all the data on DBChart.
Test result is below databases filled the tables by insert command with 43200 records
Dbisam = 34 sec,
ElevateDb = 11 sec,
AbsoluteDB = 45 sec,
SQLlite = 32 Minute,
Firebird = 12 min,
MSSQL12 localDB = 28 Minute,
Easy table = 8 minute,
BDE = Blocked ,
I havent tested oracle , blackfish , sysbase, nexsusDb etc.. but it seems they will also very slow. I have connected with DBChart and only elevateDb and absoluteDB has loaded 43200 records on DBchart in exceptable time such as 7~10 secs. Other all taken minutes. So slower databases always needs coding tricks to succeed in some real jobs..
I have tested their search speed as well by locate command that unfortunatly the server based databases are always slower in.
MSSQL and SQLLite3 are extremely difficult to manage in to delphi that they made me very tired.
These are my test results
At the end I decided to use AbsoluteDB, Dbisam and Elevate. I have thrown the rest off the PC .
Elevate software doesnt support recno function that requires extra codes at runtime to manage. This makes the database slower Other bug is with Elevate software is autoinc fields. There is no way to reset it . Therefore I have not chosen the Elevat software even it is the fastest database. They say many good functions but how many of them we use it in fact . They just left the most important functions not supported but fixed many many unnecessary functions. and it seems since 8 years there is no any advantage either.
If you want to see with your own eye pls just try and see..
I am thinking between two now absolute DB or DBisam4
Firebird all the way. Does pretty well everything and so far version 2.1 is very solid.
FireBird offers the opportunity to scale up to multi-users sometime down the line, or if you need concurrency (if your application goes multi-threaded).
SQLite is quite unrivaled if you only need single-user access, no other database comes close to it on any aspect, be it performance, convenience, SQL support or stability.
Firebird is really awsome and has a small footprint so you can use embedded
and it can be scaled upward for many users
and does unicode faily well
I use devart components with delphi 2009
and FIB plus for delphi 6/7 (their version for 2009 and unicode is not ready yet too bad)
Hmmm, no one has recommended the BDE - I wonder why that is ;-)
BlackFishSQL is another possibility, although I haven't tested in depth as yet.
when it comes to embedded databases the first question is : is it multiuser ?
Actually,who needs a database that does not allow multiple connections (read&write) to it ?
I have tried (intensly) all mentioned databases and found only one that actually functions the way it should. And that is Accuracer.
The only pity with accuracer is that its a three man band and chronic lack of proper support. It also is mainly static in development as we have seen no real features in years.Not surprising since only one person actually develops it. It seems they are living on old fame. Users praise reflect that (usually 10 years old comments).
For a single user experience I would recommend Absolute Database.
As for major players I would recommend SQL Server from Microsoft. Oracle has become a bloatware and is slowly dying out.
ps
what is nice in accuracer is that their embedded database functions just like full blown server. It locks only current record if its in use while the rest functions normally. Nice database. Pity only it is stagnant.

Resources