I need a client-side browser database. What are my options? [closed]

I'm creating a web site that I think must have a client-side database. The other option would be to stick everything on the server at the expense of increased complexity and decreased scalability. What options do I have? Must I build a plugin? Must I wait until everybody is HTML5-compliant?
Update: There have been a lot of comments about why I would actually need this. Here are my thoughts; tell me if I'm being silly:
The clients will have a large and complex state that will require something like a database to provide the data interaction that I need. Therefore (I think) cookies are out of the picture.
This data is transient, so the client won't care if it gets erased as soon as they close a session. However, they will need to keep the data if they go to a different web page and then come back. Therefore (I think) simply holding the data in some sort of in-memory JavaScript SQL implementation will not work.
I can certainly do everything that I want to do on the server, and servers can scale to manage the load (Facebook). But (I think) I'd rather build a plugin than pay for the infrastructure to support this load. This is for a bare bones startup. (The richer the startup is, the barer my bones will be.)

Indexed Database (Can I use)
Web SQL (Can I use)
localStorage

I'm about 5 years late in answering this, but given that there are errors and outdated data in some of the existing answers, and unaddressed points in the original question, I figured I'd throw in my two cents.
First, contrary to what others have implied on here, localStorage is not a database. It is (or should be perceived as) a persistent, string-based key-value store...
...which may be perfectly fine for your needs (and brings me to my second point).
Do you need explicit or implicit relationships between your data items?
How about the ability to query over said items?
Or more than 5 MB in space?
If you answered "no" to all of the above, go with localStorage and save yourself from the headaches that are the WebSQL and IndexedDB APIs. Well, maybe just the latter headache, since the former has been deprecated.
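For illustration, here's a minimal sketch of that key-value usage, serialising structured data with JSON (the "userState" key and the object shape are made-up examples). Note that sessionStorage exposes the same API with a per-tab-session lifetime, which maps nicely onto the question's "transient" requirement:

// localStorage stores strings only, so serialise objects with JSON
var state = {cart: ["item1", "item2"], lastPage: "/checkout"};
localStorage.setItem("userState", JSON.stringify(state));

// ...later, e.g. after navigating to another page on the same origin
var restored = JSON.parse(localStorage.getItem("userState") || "{}");
console.log(restored.lastPage); // "/checkout"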
There are also several other client-side storage facilities (native and non-native) you may want to look into, some of which are deprecated* but still see support from some browsers:
userData*
The rest of webStorage (sessionStorage and globalStorage*)
HTML5 File System*
Flash Locally Shared Objects
Silverlight Isolated Storage
Check out BakedGoods if you want to utilize any of these facilities, and more, without having to write low-level storage operation code. With it, placing data in one (or more) of them, for example, is as simple as:
bakedGoods.set({
    // key-value pairs to place in storage
    data: [{key: "key1", value: "val1"}, {key: "key2", value: "val2"}],
    // the storage facilities to place the data in
    storageTypes: ["silverlight", "fileSystem", "localStorage"],
    options: optionsObj,
    // called on completion with per-storage-type results and errors
    complete: function(byStorageTypeStoredKeysObj, byStorageTypeErrorObj){}
});
Oh, and for the sake of complete transparency, BakedGoods is maintained by this guy right here :).

Use PouchDB.
PouchDB is an open-source JavaScript database inspired by Apache CouchDB that is designed to run well within the browser.
It helps you build applications that work online as well as offline.
Basically, it stores the last-fetched data in an in-browser database (using IndexedDB or WebSQL under the hood) and then syncs again when the network becomes available.
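To give a feel for it, a minimal sketch of the usual PouchDB pattern; the database name, document, and remote CouchDB URL are placeholders:

// local database, backed by IndexedDB or WebSQL depending on the browser
var db = new PouchDB("todos");

db.put({_id: "todo:1", title: "buy milk", done: false})
  .then(function() { return db.get("todo:1"); })
  .then(function(doc) { console.log(doc.title); });

// continuous two-way replication with a CouchDB-compatible server
db.sync("http://example.com:5984/todos", {live: true, retry: true})
  .on("error", function(err) { console.error(err); });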

I came across a JavaScript database, TaffyDB (http://www.taffydb.com/). I'm still trying it out myself; hope this helps.

If you are looking for a NoSQL-style db on the client you can check out http://www.forerunnerdb.com. It supports the same query language as MongoDB and has a data-binding module if you want your DOM to reflect changes to your data automatically.
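To give a rough feel for that Mongo-like API, a sketch based on the project's documented query language (the database, collection, and field names are made up; check the docs for the exact calls):

var fdb = new ForerunnerDB();
var db = fdb.db("myApp");
var users = db.collection("users");

users.insert({name: "Kate", age: 27});

// MongoDB-style query operators
var adults = users.find({age: {"$gte": 18}});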
It is also open source, is constantly being updated with new features and the community around it is growing rapidly.
Disclaimer, I'm the lead developer of the project.

If you feel like you need it then use it for the clients that support it and implement a server-side fallback for clients that don't.
An alternative is to use Flash and Local Shared Objects, which can store a lot more information than a cookie, work in all browsers with Flash (which is pretty much all browsers), and store typed data. You don't have to do the whole app in Flash; you can just write a tiny utility to read/write LSO data. This can be done using a straight ActionScript project without any framework and will give you a tiny 5-15 KB swf.
There are two APIs you'll primarily need: SharedObject.getLocal() to get access to an LSO and read/write its data, and ExternalInterface.addCallback, which you can use to register an AS3 method as a callback that JavaScript can call to read/write the LSO.
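On the JavaScript side, a method registered with ExternalInterface.addCallback becomes callable directly on the swf's object/embed element. A rough sketch; the element id and method names here are hypothetical and must match whatever the ActionScript registers:

// grab the embedded swf (id is whatever you gave the object/embed tag)
var bridge = document.getElementById("lsoBridge");

// these methods only exist if the AS3 code registered them, e.g. via
// ExternalInterface.addCallback("writeLSO", ...) and ("readLSO", ...)
bridge.writeLSO("userState", JSON.stringify({cart: ["item1"]}));
var saved = JSON.parse(bridge.readLSO("userState") || "{}");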
SharedObject
http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/net/SharedObject.html?filter_flex=4.1&filter_flashplayer=10.1&filter_air=2
ExternalInterface
http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/external/ExternalInterface.html
These links are to Flex references, but for this you can just create an ActionScript project with no need for the Flex framework, and therefore a greatly reduced swf size. There are a number of good IDEs, including free open-source ones like FlashDevelop.
FlashDevelop
http://www.flashdevelop.org/

Check out HTML5 Local Storage:
http://people.w3.org/mike/localstorage.html
You may also find this helpful:
HTML5 database storage (SQL lite) - few questions
When Windows 98 first came out, there were a lot of us still stuck on MS-DOS 6.22. Naturally, there were really cool features on the new operating system that wouldn't run in MS-DOS.
There comes a time when some things must be left behind to make room for innovation. If your application is really innovative and will offer cool new functionality that uses the latest and greatest technologies, then some older browsers will naturally need to be left behind.
The advantage that you have is that, unlike upgrading an operating system, upgrading from IE7 to Chrome 8 or Firefox 3.6 is a more reachable goal for the average user of your app, especially if you provide a link and upgrade instructions.

I would try Mozilla's localForage. https://localforage.github.io/localForage/
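localForage wraps IndexedDB/WebSQL/localStorage behind one async, promise-based key-value API, and unlike raw localStorage it can store non-string values. A minimal sketch:

// localForage picks the best available driver automatically
localforage.setItem("session", {user: "kate", items: [1, 2, 3]})
  .then(function() { return localforage.getItem("session"); })
  .then(function(value) { console.log(value.items.length); })
  .catch(function(err) { console.error(err); });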


Strategies for syncing data with server in PhoneGap [closed]

I'm starting my first PhoneGap project, using AngularJS. It's a database driven app, using a REST API as the backend. To start with, I'm not going to store data locally at all, so it won't do much without Internet.
However, I would eventually like to have it store data locally, and sync when Internet is available, since I know I personally disable the Internet connections on my phone at times (airplanes, low battery), or have no bars. I was wondering if you could point me toward some good resources for this type of syncing. Some recommended libraries? Or perhaps some discussions of the pitfalls and how to circumvent them. I've Googled a bit, but I think right now, I don't know the questions to ask.
Also, my intent to build it Internet-dependent first, and then add syncing.... Is that a good idea, or am I shooting myself in the foot? Do I need to build it syncing from the start?
I had someone suggest building the app as local-only first, rather than the Internet-only part first, which has a certain logic to it. The remote storage is kind of important to me. I know the decision there has a lot to do with my goals for the app, but from the standpoint of building this, with the eventual goal being local storage + Internet storage, and two-way syncing, what's going to be easier? Or does it even make a difference?
To start with, I'm thinking of using UUIDs, rather than sequential integer primary keys. I've also thought about assigning each device an ID that is prefixed on any keys it generates, but that seems delicate. Anyone used either technique? Thoughts?
I guess I need a good system to tell what data's been synced. On the client side, I guess any records that get created/edited can be flagged for syncing. But on the server side, you have multiple clients, so that wouldn't work. I guess you could have a last_updated timestamp, and sync everything updated since the last successful sync.
What about records edited in multiple places? If two clients edit, and then want to sync, you have some ambiguity about merging, like when merging branches in git or other version control systems. How do you handle that? I guess git does it by storing diffs of every commit. I guess you could store diffs? The more I think about this, the more complicated it sounds. Am I over-thinking it or under-thinking it?
What about client side storage? I've thought about SQLite, or the PhoneGap local storage thing (http://docs.phonegap.com/en/1.2.0/phonegap_storage_storage.md.html). Recommendations? The syncing will be over a REST API, exchanging JSON, so I was thinking something that actually stores the data as JSON, or something JSON-like that's easy to convert, would be nice. On the other hand, if I'm going to have to exchange some sort of data diff format, maybe that's what I need to be storing?
Let me answer your question based on my experience with the sync part. I don't have enough experience with PhoneGap, so I'll skip the question about PhoneGap local storage vs. SQLite.
I was wondering if you could point me toward some good resources for this type of syncing. Some recommended libraries?
There are a number of open-source projects for syncing a PhoneGap app with a remote server, but you will probably have to adjust them for your own needs or implement your own sync functionality. Below I have listed some of the open-source projects; you're probably already aware of them if you've searched the net.
PhoneGap sync plugin
Simple Offline Data Synchronization for Mobile Web and PhoneGap Applications
Synchronize a local WebSQL Db to a server
Couchbase Lite PhoneGap plugin
Additionally, you might consider the other options but that depends on your server side:
Microsoft Sync Framework Toolkit (Html5 sample is available)
OpenSync Framework - platform independent, general purpose synchronization engine
Also, my intent to build it Internet-dependent first, and then add syncing.... Is that a good idea, or am I shooting myself in the foot? Do I need to build it syncing from the start?
I believe the sync functionality is more like an additional module and shouldn't be tightly coupled with the rest of your business logic. Once you start thinking about a testing strategy for your sync, you'll realise it's easier to test if the sync facility is decoupled from the main code.
I think you can launch your app as soon as possible with the minimum required functionality without sync. But you’d better think about your architecture and the way you add the sync facility in advance.
To start with, I'm thinking of using UUIDs, rather than sequential integer primary keys. I've also thought about assigning each device an ID that is prefixed on any keys it generates, but that seems delicate. Anyone used either technique? Thoughts?
That depends on your project specifications and specifically your server side. For example, Azure mobile services allow only the integer type for primary keys. That said, unique identifiers as primary keys are pretty handy in distributed systems (with some disadvantages as well).
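If you do go with client-generated UUIDs, a common (if slightly naive) Math.random-based v4 generator looks like the sketch below; for production you'd probably prefer a vetted library:

// quick-and-dirty RFC 4122 version 4 UUID (not cryptographically strong)
function uuidv4() {
    return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, function(c) {
        var r = Math.random() * 16 | 0;
        var v = (c === "x") ? r : (r & 0x3 | 0x8);
        return v.toString(16);
    });
}

var recordId = uuidv4(); // e.g. "3b241101-e2bb-4255-8caf-4136c566a962"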
Related to assigning a device ID – I am not sure I understand the point although I don’t know your project specifics. Have a look at the sync algorithm that is used in our system (bidirectional sync using REST between multiple Android clients and central SQL Server).
What about records edited in multiple places? If two client edit, and then want to sync, you have some ambiguity about merging, like when merging branches in git or other version control systems. How do you handle that? I guess git does it by storing diffs of every commit. I guess you could store diffs? The more I think about this, the more complicated it sounds. Am I over-thinking it or under-thinking it?
This is where you need to think about how to handle the conflict resolution in your system.
If the probability of conflicts in your system is high, e.g. users will often be changing the same records, then you'd better track which fields (columns) of each record have been modified during your sync, and then, once a conflict is detected (a JavaScript sketch follows the list below):
Iterate through each modified field of the server-side record in conflict.
Compare each modified field of the server record with the relevant field of the client.
If the client field was not modified, there is no conflict, so just overwrite it with the server one.
Otherwise there is a conflict, so save both fields' contents in a temporary place for the report.
At the end of the sync, produce a report of the records in conflict.
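Here is the promised sketch of that field-level merge in JavaScript. The record shape, the modifiedFields bookkeeping, and the report format are all assumptions for illustration:

// assumed shape: {fields: {...}, modifiedFields: ["fieldName", ...]}
function mergeRecord(serverRec, clientRec, conflictReport) {
    serverRec.modifiedFields.forEach(function(field) {
        var clientChanged = clientRec.modifiedFields.indexOf(field) !== -1;
        if (!clientChanged) {
            // no conflict: take the server's value
            clientRec.fields[field] = serverRec.fields[field];
        } else {
            // conflict: keep both values for the end-of-sync report
            conflictReport.push({
                field: field,
                server: serverRec.fields[field],
                client: clientRec.fields[field]
            });
        }
    });
}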

Cross-platform, queryable, local data storage using HTML5?

I've done quite a lot of research on HTML5 now, but I am still left wondering what would be my best bet to implement local data storage that is truly cross-platform (i.e., runs on all important mobile platforms + possibly on desktop), and can easily be queried?
I want an HTML5 web application (to reach all mobile/(desktop) platforms, and for its independence of third party frameworks/libraries), but using local/offline storage to mimic performance of native applications (and do not necessarily require connectivity). It creates/alters/manages certain records for a user (up to a couple of hundred records per year). Apart from data storage, as the app doesn't need any other access to the device, I think HTML5 would be a good option.
Some requirements on the data I want to store:
the best format would be some lightweight database like SQLite (due to performance reasons, and the ability to update single records without having to write a whole file (as in the case of XML))
disadvantage: I don't see any technology available across all platforms; WebSQL is deprecated, and IndexedDB is not available in too many browsers yet
the data records shall be easily exportable/downloadable in XML format (so that the user can read/modify it on his own)
therefore, XML would be a good way to go; I assume the data size to be reasonably low for this option; 2 concerns though:
disadvantage 1: I need a query language that allows me to easily select/sort/alter specific records (something like XQuery, but available in all browsers and running locally on the client)
disadvantage 2: as far as I have seen, HTML5 FileWriter API support is nowhere near mature - therefore, how would I be able to alter/save the XML data locally on the client? (OK, I have seen examples where the whole XML file is saved as a single key/value pair in local storage; but disadvantage 1 would still apply...)
What options do I have? Is HTML5 mature enough to do what I am longing for?
If not, what alternatives would meet my requirements? A couple of loose thoughts: some third-party libraries (jQuery(?), JSON(?)), or cross-platform frameworks (a la PhoneGap - which I wanted to avoid in the first place, due to their limitations), or some server-side storage (that is synced with local storage)?
I don't know what limitations of PhoneGap you are talking about.
I would suggest your application needs to be a hybrid one.
According to your requirements, you need to use native SQLite on the different operating systems.
For that you need to use PhoneGap, where you can write your own plugin, in which JavaScript acts as the interface and the implementation is in native code.
Otherwise you can always check out lawnchair
http://westcoastlogic.com/lawnchair/
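Lawnchair is a small callback-based key-value store with pluggable adapters; usage is roughly the sketch below (double-check against its docs, since the adapter setup and options vary):

// the callback receives the store once the best available adapter is ready
new Lawnchair({name: "notes"}, function(store) {
    store.save({key: "note1", text: "remember the milk"});
    store.get("note1", function(obj) {
        console.log(obj.text);
    });
});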
Thanks
Gaurav Gupta
Paxcel
For others who happen upon this post: I am currently searching for a solution as well and ran into localForage, which seems like a pretty good choice.
https://github.com/mozilla/localForage

Disadvantages of the Force.com platform [closed]

We're currently looking at using the Force.com platform as our development platform and the sales guys and the force.com website are full of reasons why it's the best platform in the world. What I'm looking for, though, is some real disadvantages to using such a platform.
Here are 10 to get you started.
Apex is a proprietary language. Other than the force.com Eclipse plugin, there's little to no tooling available such as refactoring, code analysis, etc.
Apex was modeled on Java 5, which is considered to be lagging behind other languages, and without tooling (see #1), can be quite cumbersome.
Deployment is still fairly manual with lots of gotchas and manual steps. This situation is slowly improving over time, but you'll be disappointed if you're used to having automated deployments.
Apex lacks packages/namespaces. All of your classes, interfaces, etc. live in one folder on the server. This makes code much less organized and class/interface names necessarily long to avoid name clashes and to provide context. This is one of my biggest complaints, and I would not freely choose to build on force.com for this reason alone.
The "force.com IDE", aka force.com eclipse plugin, is incredibly slow. Saving any file, whether it be a class file, text file, etc., usually takes at least 5 seconds and sometimes up to 30 seconds depending on how many objects, data types, class files, etc. are in your org. Saving is also a blocking action, requiring not only compilation, but a full sync of your local project with the server. Orders of magnitude slower than Java or .NET.
The online developer community does not seem very healthy. I've noticed lots of forum posts go unanswered or unsolved. I think this may have something to do with the forum software salesforce.com uses, which seems to suck pretty hard.
The data access DSL in Apex leaves a lot to be desired. It's not even remotely competitive with the likes of (N)Hibernate, JPA, etc.
Developing an app on Apex/VisualForce is an exercise in governor limits engineering. Easily half of programmer time is spent trying to optimize to avoid the numerous governor limits and other gotchas like visualforce view state limits. It could be argued that if you write efficient code to begin with you won't have this problem, which is true to an extent. However there are many times that you have valid reasons to make more than x queries in a session, or loop through more than x records, etc.
The save->compile->run cycle is extremely slow, esp. when it involves zipping and uploading the entire static resource bundle just to do something like test a minor CSS or javascript change.
In general, the pain of a young, fledgling platform without the benefits of it being open source. You have no way to validate and/or fix bugs in the platform. They say to post it to their IdeaExchange. Yeah, good luck with that.
Disclaimers/disclosures: There are lots of benefits to a hosted platform such as force.com. Force.com does regularly enhance the platform. There are plenty of things about it I like. I make money building on force.com.
I see you've gotten some answers, but I would like to reiterate how much time is wasted getting around the various governor limits on the platform. As much as I like the platform on certain levels, I would very strongly, highly, emphatically recommend against it as a general application development platform. It's great as a super configurable and extensible CRM application if that's what you want. While their marketing is exceptional at pushing the idea of Force.com as a general development platform, it's not even remotely close yet.
The efficiency of having a stable platform and avoiding big performance and stability problems is easily wasted in trying to code around the limits that people refer to. There are so many limits to the platform, it becomes completely maddening. These limits are not high-end limits you'll hit once you have a lot of users, you'll hit them almost right away.
While there are usually techniques to get around them, it's very hard to figure out strategies for avoiding them while you're also trying to develop the business logic of your actual application.
To give you a simple sense of how developer un-friendly the environment is, take the "lack of debugging environment" referred to above. It's worse than that. You can only see up to 20 of the most recent requests to the server in the debug logs. So, as you're developing inside the application you have to create a "New" debug request, select your name, hit "Save", switch back to your app, refresh the page, click back to your debug tab, try to find the request that will house your debug log, hit "find" to search for the text you're looking for. It's like ten clicks to look at a debug output. While it may seem trivial, it's just an example of how little care and consideration has been given to the developer's experience.
Everything about the development platform is a grafted-on afterthought. It's remarkable for what it is, but a total PITA for the most part. If you don't know exactly what you are doing (as in you're certified and have a very intimate understanding of Apex), it will easily take you upwards of 10-20x the amount of time that it would in another environment to do something that seems like it would be ridiculously simple, if you can even succeed at all.
The governor limits are indeed that bad. You have a combination of various limits (database queries, rows returned, "script statements", future calls, callouts, etc.) and you have to know exactly what you are doing to avoid these. For example, if you have a calculated rollup "formula" field on an object and you have a trigger on a child object, it will execute the parent object triggers and count those against your limits. Things like that aren't obvious until you've gone through the painful process of trying and failing.
You'll try one thing to avoid one limit, and hit another in a never ending game of "whack a limit". In the process you'll have to drastically re-architect your entire app and approach, as well as rewrite all of your test code. You must have 75% test code coverage to deploy into production, which is actually very good thing, but combined with all of the other limits, it's very burdensome. You'll actually hit governor limits writing your test code that wouldn't come up in normal user scenarios, but that will prevent you from achieving the coverage.
That is not to mention a whole host of other issues. Packaging isn't what you expect. You can't package up your app and deliver it to users without significant user intervention and configuration on the part of the administrator of the org. The AppExchange is a total joke, and they've even started charging 5K just to get your app listed.
Importing with the data loader sucks, especially if you have any triggers. You can't export all of your data in one step that includes your relationships in such a way that it can easily be re-imported into another org in a single step (for example a dev org). You can only refresh a sandbox once a month from production, no exceptions, and you can't include your data in a refresh by default unless you have called your account executive to get that feature unlocked. You can't mass delete data in custom objects. You can't change your package names.
Certain things can take numerous days to complete after you have requested them, such as a data backup before you want to deploy an app, with no progress report along the way and not much sense of when exactly the export occurred. Given that there are synchronicity issues of data if there are relationships between the data, there are serious data integrity issues in that there is no such thing as a "transaction" that can export numerous objects in a single step. There are probably some commercial tools to facilitate some of this, but these are not within reach of normal developers who may not have a huge budget.
Everything else the other people said here is true. It can take anywhere from five seconds to a minute sometimes to save a file.
I don't mean to be so negative because the platform is very cool in some ways and they're trying to do things in a multi-tenant environment that no one else is doing. It's a very innovative environment and powerful on some levels (I actually like VisualForce a lot), but give it another year or two. They're partnering with VMware, maybe that will lead to giving developers a bit more of a playpen rather than a jail cell to work in.
Here are a few things I can give you after spending a fair bit of time developing on the platform in the last fortnight or so:
There's no RESTful API. They have a SOAP-based API that you can call, but there is no way of making true RESTful calls.
There's no simple way to take their SObjects and convert them to JSON objects.
The Visualforce pages are OK until you want to customize them, and then it's a whole world of pain.
Visualforce pages need to be bound to SObjects, otherwise there's no way to get the standard input fields like the datepicker or select list to work.
The Eclipse plugin is OK if you want to work by yourself, but if you want to work in a large team with the Eclipse plugin, forget it. It doesn't handle synchronizing to and from the server, it crashes, and it isn't really helpful at all.
THERE IS NO DEBUGGER! If you want to debug, it's literally done with System.debug statements. This is probably the biggest problem I've found.
Their "MVC" model isn't really MVC. It's a lot closer to ASP.NET Webforms. Your views are tightly coupled to not only the models but the controllers as well.
Storing a large number of documents is not feasible. We need to store over 100 GB of documents and we were quoted some ridiculous figure. We've decided to implement our document storage on Amazon's S3 infrastructure.
Even though the language is Java-based, it's not Java. You can't import any external packages or libraries. Also, the base libraries that are available are severely limited, so we've found ourselves implementing a bunch of stuff externally and then exposing those bits as services that are called by force.com.
You can call external SOAP- or REST-based services, but the message body is limited to 100 KB, so it's very restrictive in what you can call.
In all honesty, whilst there are potential benefits to developing on something like the force.com platform, for me, you couldn't use the force.com platform for true enterprise-level apps. At best you could write some basic CRUD-style applications, but once you move into anything remotely complicated I'd be avoiding it like the plague.
Wow- there's a lot here that I didn't even know were limitations - after working on the platform for a few years.
But just to add some other things...
The reason you don't have a line-by-line debugger is precisely because it's a multi-tenant platform. At least that's what SFDC says - it seems like in this age of thread-rich programming, that isn't much of an excuse, but that's apparently the reason. If you have to write code, you have "System.debug(String)" as your debugger - I remember having more sophisticated server debugging tools in Java 1.2 about 12 years ago.
Another thing I really hate about the system is version control. The Spring framework is not used for what Spring is usually used for - it's really more of a configuration tool in SFDC rather than version control. SFDC provides ZERO version control.
You can find yourself stuck for days doing something that should seem so ridiculously easy, like, say, scheduling a SFDC report to export to a CSV file and email to a list of recipients... Well, about the easiest way to do that is to create a custom object with a custom field, with a workflow rule and a Visualforce email template... and then for code you need to write a Visualforce component that streams the report data to the Visualforce email template as an attachment, and you write anonymous Apex code to schedule a field update of the custom object... For SFDC developers, this is almost a daily task... trying to put about five different technologies together to do tasks that seem so simple. And this can cause management headaches and tensions too - typically, you'd find this out after getting a suggestion to do something that doesn't work in the user community (like someone already said), and then trying many things that, after you developed them, you'd find just don't work for some odd-ball reason - like "you can't schedule a VisualForce page", or "you can't call getContent from a schedulable context", or some other arcane reason.
There are so many, many maddening little gotchas on the SFDC platform that, once you know WHY they're there, they make sense... but they're still very bad limitations that keep you from doing what you need to do. Here are some of mine:
You can't get record owner information "out of the box" on pretty much any kind of record - you have to write a trigger that links the owner on create of the record to the record you're inserting. Why? Short answer because an owner can be either a "person" or a "queue", and the two are drastically different entities... Makes sense, but it can turn a project literally upside down.
Maddening security model. Example: "Manage Public Reports" permission is vastly different from "Create and Customize Reports" and that basically goes for everything on the platform... especially folders of any kind.
As mentioned, support is basically non-existent. If you are an extremely self-sufficient individual, or have a lot of SFDC resources, or have a lot of time and/or a very forgiving manager, or are in charge of a SFDC system that's working fine, you're in pretty good shape. If you are not in any of these positions, you can find yourself in deep trouble.
SFDC is a very seductive business proposition... no equipment footprint, pretty good security, fixed price, no infrastructure, AND you get web-based CRM with batchable and schedulable processing... But as the other posters said, it is really quite a ramp-up in development learning, and if you go with consulting, I think the lowest price I've seen was $200/hour.
Salesforce tends to integrate with other things years after some technologies become commonplace - JSON and jQuery come to mind... and if you have other common infrastructure that you want to integrate with, like JIRA, expect to pay a lot extra, and the integrations can be quite buggy.
And as one of the other posters mentioned, you are constantly fighting governor limits that can just drive you nuts... an attachment can NOT be > 5MB. Period. And sometimes < 3MB (if base64 encoded). Ten HTTP callouts in a class. Period. There are dozens of published governor limits, and many that are not which you will undoubtedly find and just want to run out of your office screaming.
I really, REALLY like the platform, but trust me - it can be one really cruel mistress.
But in fairness to SFDC, I'd say this: the biggest problem I find with the platform is not the platform itself, but the gargantuan expectations that almost anyone who sees the platform, but hasn't developed on it has.... and those people tend to be in positions of great authority in business organizations; marketing, sales, management, etc. Huge disconnects occur and heads roll, or are threatened to roll daily - all because there's this great platform out there with weird gotchas and thousands of people struggling daily to get their heads around why things should just work when they just don't and won't.
EDIT:
Just to add to lomaxx's comments about the MVC: in SFDC terminology, this is closely related to what's known as the "viewstate" - and it can be really buggy, in that what is on the VF page is not what is in the controller class for the page. So, you have to go through weird gyrations to sync what's on the page with what the controller is going to write to SF when you click your "save" button (or make your HTTP callout or whatever)... man, it's annoying.
I think other people have covered the disadvantages in more depth but to me, it doesn't seem to use the MVC paradigm or support much in the way of code reuse at all. To do anything beyond simple applications is an exercise in frustration compared to developing an application using something like ASP.Net MVC.
Furthermore, the tools, the data layer and the frustration of trying to refactor code or rename fields during the development process doesn't help.
I think as a CMS it's pretty cool, but as a platform for non-CMS applications, it doesn't make sense to me.
The security model is also very very restrictive... but this isn't the worst part. You can't currently assert whether a user has the ability to perform a particular action.
You can check to see what their role is, but you can't check if that role has permissions to perform the current action.
Even worse is the response from tech support to "try the action and if there's an exception, catch it"
Considering Force.com is a "cloud" platform, its ability to act as a client to an external WSDL-defined service is pretty underwhelming. See http://force201.wordpress.com/2010/05/20/when-generate-from-wsdl-fails-hand-coding-web-service-calls/ for what you might end up having to do.
To all above: I am curious how the release of VMforce, allowing Java programmers to write code for Force.com, changes the disadvantages above?
http://www.zdnet.com/blog/saas/vmforcecom-redefines-the-paas-landscape/1071
I guess they are trying to address these issues. At Dreamforce they mentioned they were trying to drop the governor limits to only 4. I'm not sure what the details are. They have a REST API in early access, and they bought Heroku, which is Ruby development in the cloud. They split out the database with database.com, so you can do all your web development on the platform and make your db calls using database.com.
I guess they are trying to make it as agnostic as possible. But right about now these are all announcements and early access, so, as their Safe Harbor statements say, don't purchase based on what they say, only on what they currently have.

Architecture for a machine database [closed]

This might be more of a serverfault.com question but a) it doesn't exist yet and b) I need more rep for when it does :~)
My employer has a few hundred servers (all *NIX) spread across several locations. As I suspect is common we don't really know how many servers we have: more than once I've been surprised to find a server that's been up for 5 years, apparently doing nothing but elevating the earth's temperature slightly. We have a number of databases that store bits of server information -- Puppet, Cobbler, Nagios, Cacti, our load balancers, DNS, various internal spreadsheets and so on but it's all very disparate, incomplete and overlapping. Maintaining this mess costs time and money.
So, I'd like to come up with a single database which holds details of what each server is (hardware specs, role, etc.) and replaces (or at least supplies data for) the databases mentioned above. The database and web interface are likely to be a Rails app, as this is what I have most experience with. I'm more of a sysadmin than a coder.
Has this problem already been solved? I can't find any open source software that really fits the bill and I'm generally not too keen on bloaty, GUI vendor-supplied solutions.
How should I implement the device-information collection bit? For instance, it'd be great to have the database update device records when disks are added or removed, or when the server serial number changes because HP replaces the board. This information comes from many different sources: dmidecode, command-line disk tools, SNMP against the server or its onboard lights-out card, and so on. I could expose all this through custom scripts and net-snmp, or I could run a local poller that reported the information back to the central DB (maybe via a RESTful interface or something). It must be easily extensible.
Have you done this? How? Tell me your experiences, discoveries, mistakes and recommendations!
This sounds like a great LDAP problem looking for a solution. LDAP is designed for this kind of thing: a catalog of items that is optimized for data searches and retrieval (but not necessarily writes). There are many LDAP servers to choose from (OpenLDAP, Sun's OpenDS, Microsoft Active Directory, just to name a few ...), and I've seen LDAP used to catalog servers before. LDAP is very standardized and a "database" of information that is usually searched or read, but not frequently updated, is the strong-suit of LDAP.
My team has been dumping all our systems into RDF for a month or two now. We have the systems implementation people create the initial data in Excel, which is then transformed to N3 (RDF) using Perl.
We view the data in Gruff (http://www.franz.com/downloads.lhtml) and keep the resulting RDF in Allegro (a triple store from the same guys that do Gruff)
It's incredibly simple and flexible - no schema means we simply augment the data on the fly, and with a wide variety of RDF viewers and reasoning engines the presentation options are endless.
The best part for me? No coding - just create triples and throw them in the store, then view them as graphs.
The collection of detailed machine information is a very frustrating problem (many vendors want to keep it this way). Even if you can spend a large amount of money, you probably will not find a simple solution to this problem. IBM and HP offer products that achieve what you are seeking, but they are very, very expensive, and will leave a bad taste in your mouth once you realize that probably all you needed was 40-50% of the functionality they offer.
You say that you need to monitor *NIX servers... most (if not all) unices support RFC 1514 (Windows also supports this RFC as of Windows 2000). The Host MIB support defined by RFC 1514 has its drawbacks, however. Since it is SNMP-based, it requires that SNMP be enabled on the machine, which is typically not the default for Unix and Windows machines. The reason for this is that SNMP was created before the entire world was using the Internet, and thus the old, crusty nature of its security is of concern. In many environments this may not be acceptable for security reasons. However, if you are only dealing with machines behind the firewall, this might not be an issue (I suspect this is true in your case).
Several years ago, I was working on a product that monitored hundreds of Unix and Windows machines. At the time, I did extensive research into the mechanics of how to acquire detailed information from each machine - such as disk info, running processes, installed software, up-time, memory pressure, CPU and IO load (including network) - without running a custom client on each machine. This info can be collected in a centralized fashion. As of three or four years ago, the RFC 1514 Host MIB spec was the only "standard" for acquiring detailed real-time machine info without resorting to OS-specific software. Sun and Microsoft announced a WebService-based initiative many years ago to address some of this, but I suspect it never received any traction since I cannot at the moment even remember its marketing name.
I should mention that RFC 1514 is certainly no panacea. You are at the mercy of the OS-provided SNMP service, unless you have the luxury of deploying a custom info-collecting client to each machine. The RFC 1514 spec dictates that several parameters are optional, and if your target OS does not implement them, then you are back to custom code to provide the information.
I'm contemplating how to go about this myself, and I think this is one of the key pieces of infrastructure that not having around keeps us in the dark ages. Hopefully this will be a popular question on serverfault.com. :)
It's not just that you could install a single tool to collect this data, because that's not possible cheaply, but ideally you want everything from the hardware up to the applications on the network feeding into this thing.
I think the only approach that makes sense is a modular one. The range of devices and types of information is too disparate to come under a single tool. Also the collection of data needs to be as passive and asynchronous as possible - the reality of running infrastructure means that there will be interruptions and you can't rely on being able to get the data at all times.
I think the tools you've pointed out form something of an ecosystem that could work together - Cobbler can install from bare-metal and hand over to Puppet, which has support for generating Nagios configs, and storing configs in a database; for me only Cacti is a bit opaque in terms of programmatically inserting new devices, templates etc. but I know this is possible.
Ultimately you have to sit down and work out which pieces of information are important for the business you work for, and design a db schema around that. Then, work out how to get the information you need into the db, whether it's from Facter, Nagios, Cacti, or direct snmp calls.
Since you asked about collection of data, I think if you have quite disparate kit (Dell, HP etc.) then it makes sense to create a library to abstract away as much as possible the differences between them, so your scripts just make standard calls such as "checkdiskhealth". When you add new hardware you can add to the library rather than having to write a completely new script.
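That abstraction could be as simple as a dispatch table keyed by vendor; a JavaScript sketch, where the vendor-specific functions are placeholders for wrappers around tools like dmidecode or the vendors' own CLIs:

// one implementation of each standard check per hardware vendor
var adapters = {
    hp:   {checkDiskHealth: function(host) { /* wrap HP's CLI or SNMP here */ }},
    dell: {checkDiskHealth: function(host) { /* wrap Dell's CLI or SNMP here */ }}
};

// callers use the standard name; the library picks the vendor-specific code
function checkDiskHealth(host, vendor) {
    return adapters[vendor].checkDiskHealth(host);
}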
Sounds like a common problem that larger organizations would have. I know our (50 person company) sysadmin has a little access database of information about every server, license, and piece of hardware installed. He's very meticulous, but when it comes time to replace or repair hardware, he knows everything about it from his little db.
You and your organization could sponsor an open source project to get you what you need, and give back to the community so that additional features (that you may not need now) can be developed at no cost to you.
Maybe a simple web service? Just something that accepts a machine name or IP address. When the service gets input, it sticks it in a queue and kicks off a task to collect the data from the machine that notified it. The nature of the task (SNMP interrogation, remote call to a Perl script, whatever) could be stored as part of the machine information in the database. If the task fails, the machine ID stays in the queue and the machine is periodically re-polled until the information is collected. Of course, you also have to have some kind of monitor running on your servers to notice that something has changed and send the notification; hopefully this is easily accomplished with whatever server monitoring software you've already got in place.
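A hedged Node/Express-style sketch of that notify-then-poll flow; the endpoint, queue, and the collectFrom/saveToDatabase helpers are all made up for illustration:

var express = require("express");
var app = express();
var queue = [];  // machine names/IPs awaiting collection

// machines (or their monitors) POST here when something changes
app.post("/notify/:host", function(req, res) {
    queue.push(req.params.host);
    res.sendStatus(202);  // accepted; collection happens asynchronously
});

function collectFrom(host, cb) {
    // placeholder: SNMP interrogation, remote call to a script, etc.
    cb(null, {host: host, collectedAt: Date.now()});
}

function saveToDatabase(host, facts) {
    console.log("would persist", host, facts);  // placeholder persistence
}

// periodically drain the queue; failed hosts are re-queued for a later pass
setInterval(function() {
    var host = queue.shift();
    if (!host) return;
    collectFrom(host, function(err, facts) {
        if (err) queue.push(host);  // stays queued until collection succeeds
        else saveToDatabase(host, facts);
    });
}, 5000);

app.listen(8080);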
There are some solutions from the big vendors for managing monstrous sets of machines - such as some of the Tivoli stuff from IBM. That is probably, however, overkill for mere hundreds of machines.
There are some free software server database solutions but I do not know if they provide hooks to update information automatically from the machines with dmidecode or SNMP. One I heard about (but no personal experience, sorry), is GLPI.
I believe you are looking for Zabbix. It's open source, easy to install and use.
I installed it for a client a few years ago, and if I remember right it has a client application that connects to the Zabbix server to update it with the requested information.
I really recommend it: http://www.zabbix.com
Check out Machdb. It's an open-source solution to the problem you are describing.

Versioned RDF store [closed]

Let me try rephrasing this:
I am looking for a robust RDF store or library with the following features:
Named graphs, or some other form of reification.
Version tracking (probably at the named graph level).
Privacy between groups of users, either at named graph or triple level.
Human-readable data input and output, e.g. TriG parser and serialiser.
I've played with Jena, Sesame, Boca, RDFLib, Redland and one or two others some time ago but each had its problems. Have any improved in the above areas recently? Can anything else do what I want, or is RDF not yet ready for prime-time?
Reading around the subject a bit more, I've found that:
Jena, nothing further
Sesame, nothing further
Boca does not appear to be maintained any more and seems only really designed for DB2. OpenAnzo, an open-source fork, appears more promising.
RDFLib, nothing further
Redland, nothing further
Talis Platform appears to support changesets (wiki page and reference in Kniblet Tutorial Part 5) but it's a hosted-only service. Still may look into it though.
SemVersion sounded promising, but appears to be stale.
Talis is the obvious choice, but privacy may be an issue, or a perceived issue anyway, since it's a SaaS offering. I say obvious because the three emboldened features in your list are core features of their platform, IIRC.
They don't have a features list as such - which makes it hard to back up this answer, but they do say that stores of data can be individually secured. I suppose you could - at a pinch - sign up to a separate store on behalf of each of your own users.
Human-readable input is often best supported by writing custom interfaces for each user task, so you'd best be prepared to do that as needs demand.
Regarding prime-time readiness. I'd say yes for some applications but otherwise "not quite". Mostly the community needs to integrate with existing developer toolsets and write good documentation aimed at "ordinary" developers - probably OO developers using Java, .NET and Ruby/Groovy - and then I predict it will snowball.
See also Temporal Scope for RDF triples
From: http://www.semanticoverflow.com/questions/453/how-to-implement-semantic-data-versioning/748#748
I personally quite like the pragmatic approach which Freebase has adopted.
Browse and edit views for humans:
http://www.freebase.com/view/guid/9202a8c04000641f80000000041ecebd
http://www.freebase.com/edit/topic/guid/9202a8c04000641f80000000041ecebd
The data model is exposed here:
http://www.freebase.com/tools/explore/guid/9202a8c04000641f80000000041ecebd
Strictly speaking, it's not RDF (it's probably a superset of it), but part of it can be exposed as RDF:
http://rdf.freebase.com/rdf/guid.9202a8c04000641f80000000041ecebd
Since it's a community-driven website, not only do they need to track who said what and when... they are probably keeping the history as well (never delete anything):
http://www.freebase.com/history/view/guid/9202a8c04000641f80000000041ecebd
To conclude, the way I would tackle your problem is very similar and pragmatic. AFAIK, you will not find a solution which works out of the box. But you could use a "tuple" store (3 or 4 columns aren't enough to keep history at the finest granularity, i.e. triples|quads).
I would use the TDB code as a library (since it gives you B+Trees and a lot of useful things you need), and I would use a data model which allows me to count quads and to assign to each quad an owner, a timestamp, and previous/next quad(s) if available:
[ id | g | s | p | o | user | timestamp | prev | next ]
Where:
id - long (unique identifier; the same (g,s,p,o) will have different ids. This takes a lot of space, but it lets you count quads... and when you have a community-driven website (like this one), counting things is important)
g - URI (or blank node? | absent (i.e. default graph))
s - URI | blank node
p - URI
o - URI | blank node | literal
user - URI
timestamp - when the quad was created
prev - id of the previous quad (if present)
next - id of the next quad (if present)
Then, you need to think about which indexes you need and this would depend on the way you want to expose and access your data.
You do not need to expose all your internal structures/indexes to external users/people/applications. And, when (and if), RDF vocabularies or ontologies for representing versioning, etc. will emerge, you are able to quickly expose your data using them (if you want to).
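To make the model concrete, a small JavaScript sketch of appending versioned quads with the prev/next links described above; the in-memory array merely stands in for the B+Tree-backed store:

var quads = [];  // append-only log; stands in for TDB's B+Tree indexes
var nextId = 1;

// assert a new version of a statement, linking it to the version it replaces
function assertQuad(g, s, p, o, user, prevId) {
    var quad = {id: nextId++, g: g, s: s, p: p, o: o, user: user,
                timestamp: Date.now(), prev: prevId || null, next: null};
    if (prevId) quads[prevId - 1].next = quad.id;  // retire the old version
    quads.push(quad);
    return quad.id;
}

// the history of a statement is the chain of prev links, newest first
function history(id) {
    var versions = [];
    for (var q = quads[id - 1]; q; q = q.prev && quads[q.prev - 1]) {
        versions.push(q);
    }
    return versions;
}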
Be warned, this is not common practice, and if you look at it with your "semantic web glasses" it's probably wrong, bad, etc. But I am sharing the idea since I believe it's not harmful: it provides a solution to your question (it will be slower and use more space than a quad store), and part of it can be exposed to the semantic web as RDF / Linked Data.
My 2 (heretic) cents.
LMF comes with a versioning module: http://code.google.com/p/lmf/wiki/ModuleVersioning
The Linked Media Framework is an easy-to-setup server application developed in JavaEE that bundles core Semantic Web technologies to offer many advanced services.
Take a look to see if Virtuoso's RDF support meets your needs; it sounds as though it might go quite a way, and it plays nicely with XML and web services too. There's a commercial and a GPL'd version.
Mulgara/Fedora-Commons might fit the bill. I believe that privacy is currently a major project, and I understand that it supports versioning, but it might be too much in that it is an object store too.
(years later)
I think both Oracle's RDF store:
http://www.oracle.com/technetwork/database/options/semantic-tech/index.html
and the recently announced graph store in IBM's DB2 supports much of this:
http://www-01.ibm.com/software/data/db2/linux-unix-windows/graph-store.html
