Lightweight open-source shared file system over network - filesystems

We have two web servers with load balancing. We need to share some files between those servers. These would be uploaded files, session files, various files that php applications create.
We don't want to use a heavyweight, no longer maintained or a commercial solution. We're looking for some lightweight open-source software that would work as shared file system. It should be really easy to set up, must be HA available, must be very fast. It should work with RedHat Linux.
We looked at such solutions like drbd with synchronous file sharing but we can't use them because it can't work on an underlying filesystem like ext3.

OCFS may be up to snuff by now; it's worth checkout out at least. It's in the mainline linux kernel tree, http://oss.oracle.com/projects/ocfs2/ has some info on it. I've set it up before, it was pretty easy to get going.

DRBD is good for syncing over a network (direct crossover connection if at all possible), but EXT3 is not designed to be aware of changes that occur underneath it, at the block device level. For that reason you need a filesystem designed for such purposes such as the Global File System (GFS). To the best of my knowledge Red Hat has support for GFS.
The DRBD manual will give you an overview of how to use GFS with DRBD.
http://www.drbd.org/users-guide/ch-gfs.html
Don't take this as a final answer - I have not researched or used a multi-master system before, but at least this might give you something to go on.
Ideally, you would only sync the part of the data that's shared between the webservers.

Related

Is 9P obsolete?

I'm interested in studying the 9P FS, currently been reading the source available from these implementations: http://9p.cat-v.org/implementations
Is 9P obsolete? Are you using it for some application?
(also I've found this, some perfomance test between 9P and NFS: http://graverobbers.blogspot.com/2007/08/v9fs-performance-versus-nfs.html)
No, 9P isn't obsolete; I don't know of a protocol that does what it does and is clean and well defined enough to be implemented correctly in almost any language that exists.
9P is used in a variety of systems. A couple of recent uses in arm-js (an ARM emulator) and 9webdraw (a GSoC project that implements the Plan 9 /dev/draw). Both are HTML5 Javascript implementations.
Just to add a bit, both the Linux client implementation and several servers are under active development, so I'd say that's a pretty clear sign that folks still have use for it. One of the areas its seen heavy use more recently is the virtio-9P (aka virtfs) which is part of qemu/kvm and can be used for direct guest to host file access. It's also been used in several experimental operating systems projects (Libra, PROSE, FusedOS) and incorporated into other operating systems (BSD, MacOSX, Windows, Linux) and hypervisors (in addition to the KVM instance above, its also been incorporated in various ways into Xen). 9P is actually being used in supercomputing deployments (both for Plan 9 and Linux, see the diod project on Sourceforge).
I think the reason is that the protocol is quite simple, so implementations also tend to be quite simple and easy to integrate elsewhere (there are several applications both inside and outside the Plan 9 world which use 9P as an interface to the application, in much the same way that some web developers use RESTful interfaces).
The protocol has a couple of different variations including the 9P.L variant which was developed specifically to match the Linux VFS API better. It adds a bit of complexity to the protocol in the addition of operations, but removes some of the complexity of mapping Linux VFS API -> 9P and vice versa.
It is used in Erlang-on-Xen both as a storage protocol for goofs http://erlangonxen.org/blog/goofs-simple-filesystem
It is the way erlang on xen instances in other ways too, see here:
http://erlangonxen.org/more/9p2000e
Also, it's used by libvirt stuff with QEMU.
http://wiki.qemu.org/Documentation/9psetup
9p, to me, is like the Scheme of network protocols. For the most part, it is very simple, but people see need to extend it to fit their environments. Luckily this is done in ways that are often backwards compatible.
In addition to everything mentioned in the other answers, Microsoft is using 9P as part of their Windows Subsystem for Linux.
They add a 9P server to each Linux distribution that is running as a guest, so that Windows can mount the Linux filesystem over 9P, and Windows processes can transparently access the files on Linux's ext4 partition.

Is there a way to get battery info (status, plugged in, etc) without reading a proc/sys file on linux?

I want to get information about the battery in C on linux. I don't want to read or parse any file! Is there any low-level interface to acpi/the kernel or any other module to get the information I want to have?
I already searched the web, but every question results in the answer "parse /proc/foo/bar". I really don't want to do this because I think, low-level interfaces won't change as fast as Files do.
best regards.
The /proc filesystem does not exist on a disk. Instead, the kernel creates it in memory. They are generated on-demand by the kernel when accessed. As such, your concerns are invalid -- the /proc files will change as quickly as the kernel becomes aware of changes.
Check this for more info about /proc file system.
In any case, I don't believe there's any alternative interface.
You might be looking for UPower: http://upower.freedesktop.org/
This is a common need for both desktop environments and mobile devices, so there have been many solutions over time. For example, one of the oldest ones was acpid, which is pretty much obsolete now.
While I'd recommend using a light-weight abstraction like UPower for code clarity reasons, the files in /proc and (to some extent) /sys are considered part of the Linux kernel ABI, which means that changing them is generally frowned upon.

Possible to build support for a filesystem directly into an application?

I am wondering if it's possible to write an application that will access a foreign filesystem, but without needing support for that filesystem from the operating system. For example, I'd like to write an app in C that runs on Mac OS X that can browse / copy files from an ext2/ext3 formatted disk. Of course, you'd have to do all the transfers through the application (not through the system using cp or the Finder), but that would be OK for my purpose. Is this possible?
There are user space libraries that allow you to access file systems.
The Linux-NTFS library (libntfs) allows you to access NTFS file systems and there are user space programs like ntfsfix to do things to the file system.
E2fsprogs does the same for ext2, ext3 and ext4 filesystems.
As Basile mentioned, Mtools is another one that provides access to FAT partitions.
There was even a program that does exactly what you're looking for on Windows. It's called ext2explore and allows you to access ext2 partitions from Windows.
It is possible. For example the GNU mtools utility are doing that (assuming a way to access the raw device or partition) for MS-DOS FAT file systems.
However, file systems inside the kernel are usually very well tested and optimized.
Yes and No. For a regular user Application is usually not possible because access to block devices is restricted to root only. Every block device should give read/write to the needed block device for that effect. This would need at best a server/client approach where a service is started on the machine and configured to give the permissions on a per block device manner.
The somewhat easier alternative would be you to use the MacFUSE implementation.
Look here:
http://code.google.com/p/macfuse/
http://groups.google.com/group/macfuse?pli=1
The MacFuse project seems no longer mantained, but can give you a starting point for your project.
The dirty and quick approach is the following as root chmod 666 /dev/diskN
You can hijack syscalls and library calls from your application and then redirect reads/writes to anything like a KV store or a distributed DB layer (using the regular calls for the "virtual devices" that you do not support).
Then, the possibilities are boundless because you don't have to reach the physical/virtual devices when someone asks for them (resolving privilege issues).

How to implement a very simple filesystem?

I am wondering how the OS is reading/writing to the hard drive.
I would like as an exercise to implement a simple filesystem with no directories that can read and write files.
Where do I start?
Will C/C++ do the trick or do I have to go with a more low level approach?
Is it too much for one person to handle?
Take a look at FUSE: http://fuse.sourceforge.net/
This will allow you to write a filesystem without having to actually write a device driver. From there, I'd start with a single file. Basically create a file that's (for example) 100MB in length, then write your routines to read and write from that file.
Once you're happy with the results, then you can look into writing a device driver, and making your driver run against a physical disk.
The nice thing is you can use almost any language with FUSE, not just C/C++.
I found it quite easy to understand a simple filesystem while using the fat filesystem on the avr microcontroller.
http://elm-chan.org/fsw/ff/00index_e.html
Take look at the code you will figure out how fat works.
For learning the ideas of a file system it's not really necessary to use a disk i think. Just create an array of 512 byte byte-arrays. Just imagine this a your Harddisk an start to experiment a bit.
Also you may want to hava a look at some of the standard OS textbooks like http://codex.cs.yale.edu/avi/os-book/OS8/os8c/index.html
The answer to your first question, is that besides Fuse as someone else told you, you can also use Dokan that does the same for Windows, and from there is just a question of doing Reads and Writes to a physical partition (http://msdn.microsoft.com/en-us/library/aa363858%28v=vs.85%29.aspx (read particularly the section on Physical Disks and Volumes)).
Of course that in Linux or Unix besides using something like Fuse you only have to issue, a read or write call to the wanted device in /dev/xxx (if you are root), and in these terms the Unices are more friendly or more insecure depending on your point of view.
From there try to implement a simple filesystem like Fat, or something more exoteric like an tar filesystem, or even some simple filesystem based on Unix concepts like UFS or Minux, or just something that only logs the calls that are made and their arguments to a log file (and this will help you understand, the calls that are made to the filesystem driver during the regular use of your computer).
Now your second question (that is much more simple to answer), yes C/C++ will do the trick, since they are the lingua franca of system development, also a lot of your example code will be in C/C++ so you will at least read C/C++ in your development.
Now for your third question, yes, this is doable by one person, for example the ext filesystem (widely known in Linux world by it's successors as ext2 or ext3) was made by a single developer Theodore Ts'o, so don't think that these things aren't doable by a single person.
Now the final notes, remember that a real filesystem interacts with a lot of other subsystems in a regular kernel, for example, if you have a laptop and hibernate it the filesystem has to flush all changes made to the open files, if you have a pagefile on the partition or even if the pagefile has it's own filesystem, that will affect your filesystem, particularly the block sizes, since they will tend to be equal or powers of the page block size, because it's easy to just place a block from the filesystem on memory that by coincidence is equal to the page size (because that's just one transfer).
And also, security, since you will want to control the users and what files they read/write and that usually means that before opening a file, you will have to know what user is logged on, and what permissions he has for that file. And obviously without filesystem, users can't run any program or interact with the machine. Modern filesystem layers, also interact with the network subsystem due to the fact that there are network and distributed filesystems.
So if you want to go and learn about doing kernel filesystems, those are some of the things you will have to worry about (besides knowing a VFS interface)
P.S.: If you want to make Unix permissions work on Windows, you can use something like what MS uses for NFS on the server versions of windows (http://support.microsoft.com/kb/262965)

Cross Platform System Usage Metrics

Is there a cross platform C API that can be used to get system usage metrics?
I've worked with libstatgrab before. Gets you some pretty useful system statistics for the main Unix-like variants and Windows through Cygwin (supposedly - never tried). Different OS's work so differently - especially when it comes to usage metrics - it may be challenging to get what you want. Even something as simple sounding as "free memory" can be tricky to act on in a cross-platform way. Perhaps if you narrow things down a bit, maybe we can find something.
Unfortunately not.
The C standard is pretty much limited to dynamic allocation, string manipulation, math and text I/O. Once you get beyond that, you need OS APIs which by definition are OS specific and not cross platform.
Depending on the metrics you want to collect, you may want to consider looking at PCP (Performance Co-Pilot). This is an open-source performance framework, originally developed at Silicon Graphics, which collects and collates a vast number of possible metrics from a vast number of sources, and lets you monitor them from anywhere.
Basically PCP would involve adding another 'layer' into your system -- for example, you might monitor a distributed cluster of mixed-OS machines, each with PCP installed locally; a set of 'agents' collect the performance data on each machine, and your code could then use libpcp to collect those metrics as required.
It's hard to say without knowing your exact usage scenario (if you're talking about something running seamlessly on end-users' machines, PCP may not suit but if you want to monitor machines which you control, and are happy to run the PCP service on them, it's an awesome solution).
We use PCP very happily to collect metrics from Windows and Linux boxes, as well as internal metrics from our application, and log them all centrally, report on them, monitor trends, etc.
The good old SNMP provides C libraries for the client and server side. If you prefer something newer you should try Prometheus. They have Node Exporter ready to use and the client libraries for many languages including C.
Also note that PCP (mentioned in Cowan's answer) supports the OpenMetrics standard since version 4, so if you decide to use Prometheus as the metric data collector extending PCP or writing a custom Prometheus client are better solutions that SNMP which also supported by Prometheus through SNMP Exporter, but more difficult to set up due to the way it handles MIBs and authentication.

Resources