What is a good filesystem for embedded NAND drives?

I am working on an embedded application that uses NAND flash for storage.
As it looks now, we won't use Linux or any other RTOS. The application must handle unexpected power downs.
We have been looking at different file system solutions, including YAFFS2, JFFS2, and FAT+FTL, as well as solutions from HCC Embedded.
I have heard FAT+FTL is a normal choice, but I am worried about data loss in case of unexpected power-downs, as well as about performance. I would be grateful if anyone could share insights and experience on this.

FAT+FTL is a "normal choice", but not necessarily a good one.
YAFFS2 is newer than JFFS2 and seems to be faster and more scalable for large NAND devices. This presentation from the Embedded Linux Conference Europe compares the two, along with other flash file systems.
Another solution is LogFS (note: the "log" here stands for logarithmic, not log-structured). It should also be more scalable, but I have no idea how mature it is.

There's UBIFS. The only implementation I know of is in the Linux kernel, and it depends on the Linux kernel's UBI interface. However, the fundamental algorithms should be implementable without too much trouble in whatever environment you are using. As for production-readiness, Nokia uses UBIFS in their N900 smartphone, and plug computers based on the SheevaPlug have support for it, too. I have found the Linux implementation to be reliable, even on flaky hardware that likes to reset itself at random. Unlike JFFS2, UBIFS does not need to read the entire filesystem at startup.
You may want to reconsider your "no Linux" decision, since using Linux would make it a lot easier to use UBIFS.

There is a commercial vendor called Datalight which offers a robust flash file system, but of course it's not free.
They have an interesting white paper (PDF) on performance (take it with a grain of salt).

Related

Writing an EXT4 file system in C?

This may sound noobish, especially as I'm (as you may have guessed) trying to write an operating system. At the moment I'm stuck on trying to make a file system.
What I want is a file system similar to the one Linux Ubuntu uses, which is EXT4 (at least mine is), and I want to write it in C.
Any ideas on how I can go about this? And/or any tutorials you might have found that may help me (I have tried searching with no luck) :L
Thanks in advance!
Jamie.
Really smart and experienced people who have studied this problem extensively have written bugs that ate users' data. The difference between a bug in the computation layer (e.g. a kernel crash) and a bug in the storage layer is that silently eating users' data is very bad - much worse than giving wrong answers in spreadsheets (Excel is buggy yet popular) or intermittently segfaulting while preserving data on disk (which is easily mitigated by frequent autosave).
Start by studying simpler designs, like the Minix filesystems from the old operating systems book [1] (the same one Linus Torvalds started with, twenty years ago).
Like others said, ext2 without journaling, extents or ACLs is a better starting point than ext4. The source code for it is in the Linux kernel and in the e2fsprogs userspace tools package[2]. The format is well documented.
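To get a feel for how approachable the ext2 on-disk format is, here is a rough, untested sketch in C that reads an image's superblock and checks the magic number. The field offsets follow the documented ext2 superblock layout; this is just an illustrative reader (not part of e2fsprogs), and it assumes a little-endian host since the on-disk values are little-endian.

    /* sb_check.c - sketch: read an ext2 image's superblock and verify the magic.
     * Field offsets follow the documented ext2 layout; assumes a little-endian host. */
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    #define EXT2_SUPER_OFFSET 1024   /* superblock starts 1024 bytes into the image */
    #define EXT2_SUPER_MAGIC  0xEF53

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <ext2-image>\n", argv[0]);
            return 1;
        }

        FILE *f = fopen(argv[1], "rb");
        if (!f) { perror("fopen"); return 1; }

        unsigned char sb[1024];
        if (fseek(f, EXT2_SUPER_OFFSET, SEEK_SET) != 0 ||
            fread(sb, sizeof sb, 1, f) != 1) {
            fprintf(stderr, "could not read superblock\n");
            fclose(f);
            return 1;
        }
        fclose(f);

        uint32_t inodes_count, blocks_count, log_block_size;
        uint16_t magic;
        memcpy(&inodes_count,   sb + 0,  4);   /* s_inodes_count   */
        memcpy(&blocks_count,   sb + 4,  4);   /* s_blocks_count   */
        memcpy(&log_block_size, sb + 24, 4);   /* s_log_block_size */
        memcpy(&magic,          sb + 56, 2);   /* s_magic          */

        if (magic != EXT2_SUPER_MAGIC) {
            fprintf(stderr, "not an ext2 filesystem (magic 0x%04x)\n", magic);
            return 1;
        }
        printf("ext2: %u inodes, %u blocks, block size %u bytes\n",
               inodes_count, blocks_count, 1024u << log_block_size);
        return 0;
    }

Run it against a small test image (e.g. one created with mke2fs on a file) and compare the numbers against dumpe2fs output.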
As for tutorials, consider who makes them and why they spend effort on this task. Tutorials are generally made by stakeholders in platforms to bring in new people to develop using that platform, to use the network effect to grow the platform and profit from being already-established actors in a larger ecosystem.
Do you see a business model in growing the number of people who implement their own incompatible buggy[3] file systems? Only if you sell software engineering degrees. So Microsoft only writes tutorials on how to use NTFS, not on how to implement it. Same for Sun and ZFS, Red Hat and Google with EXT2/3/4, SGI with XFS, IBM with JFS, Oracle with BTRFS, etc.
If you want education instead of training, you need to read books and study the code smart people use in production, not look for tutorials.
http://en.wikipedia.org/wiki/Operating_Systems:_Design_and_Implementation
http://e2fsprogs.sourceforge.net/
How much use, in how many different usage patterns, will it see? Consider the bugs discovered in production file systems after years of use on millions of computers. It is unlikely your code will be less buggy, even if you're as smart as Matthew Dillon.
Try looking at an existing implementation, like the one in Linux.

SoX vs OpenAL performance/overhead

Both have bindings for C, and both can play various formats.
Which one is superior in terms of simplicity, performance, overhead, and memory footprint?
Also, which one is better at handling multiple streams?
I have not programmed with either of those, but I believe that OpenAL has been designed to render and output multiple-channel audio for games, with real-time performance as a requirement.
libSoX is more for input and output from audio files, as well as for format conversions. There are lots of plugins but AFAIK it has not been designed for real-time audio output. It seems significantly simpler to use, though.
You might also want to have a look at libsndfile.
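For a sense of how simple libsndfile is, here is an untested sketch of a read loop using its documented C API (sf_open / sf_readf_float); the file name handling and buffer size are just illustrative choices.

    /* Minimal libsndfile read loop - a sketch, not a complete player.
     * Build with: cc read_demo.c -lsndfile */
    #include <stdio.h>
    #include <sndfile.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <audiofile>\n", argv[0]);
            return 1;
        }

        SF_INFO info = {0};
        SNDFILE *snd = sf_open(argv[1], SFM_READ, &info);
        if (!snd) {
            fprintf(stderr, "sf_open: %s\n", sf_strerror(NULL));
            return 1;
        }
        printf("%d Hz, %d channel(s), %lld frames\n",
               info.samplerate, info.channels, (long long)info.frames);

        float buf[4096];
        sf_count_t n;
        while ((n = sf_readf_float(snd, buf, 4096 / info.channels)) > 0) {
            /* hand the interleaved frames to your processing/output stage here */
        }

        sf_close(snd);
        return 0;
    }

Note that libsndfile only decodes; actual playback still needs an output backend (ALSA, PortAudio, OpenAL, etc.).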
What exactly is it that you want to do?
While I have never used OpenAL, I've heard a lot of bad things about it that make it sound unprofessional and basically a dead end. From Wikipedia:
OpenAL was originally developed by Loki Software in order to help them in their business of porting Windows games to Linux. After the demise of Loki, the project was maintained for a time by the free software/open source community, and implemented on NVIDIA nForce sound cards and motherboards. It is now hosted (and largely developed) by Creative Technology with on-going support from Apple, Blue Ripple Sound, and free software/open source enthusiasts.
While the OpenAL charter says that there will be an "Architecture Review Board" (ARB) modeled on the OpenGL ARB, no such organization has ever been formed and the OpenAL specification is generally handled and discussed via email on its public mailing list.
Since 1.1, the implementation by Creative has turned proprietary, with the last releases in free licenses still accessible through the project's subversion. However, OpenAL Soft is a widespread Open Source alternative.
There was also an issue with it messing up the state of the calling application; I believe merely linking against it caused some global constructors to run before main was invoked, in a way that altered the program's initial environment and broke some programs (MPlayer, perhaps?). It's unclear to me whether this issue was ever fixed, but it screams "bad library", and I would be skeptical of trusting a library that has historically contained such abuses.

Why no good extN drivers for Windows?

Why are there no good drivers for Windows for reading ext2/3/4 filesystems? Googling around indicates that there are two or three out there, but all of them have problems. Is there some technical inconsistency that makes it difficult to correctly code up something that would enable me to open up My Computer and work with an extN partition just like NTFS or FAT? I thought one of the benefits of open source and standards was that problems like this would be solved fairly quickly.
Driver signing.
Microsoft's driver signing is by its very nature incompatible with the GPL, and unsigned drivers don't work anymore.
I haven't used it myself, but a coworker of mine has used Ext2 IFS for Windows without any problems.
One of the benefits of open source and standards is that problems like this can be solved fairly quickly. If no one is sufficiently motivated to work on a problem - whether that motivation comes from money, personal need, fame, whatever - then the problem is unlikely to get solved. (The closed-source world is no different.) It probably doesn't help that relatively few open source developers have experience hacking on Windows kernel-mode device drivers. Writing device drivers is a specialized skill. There are developers who understand the ext2/3/4 code very well and are very willing to work on it, but odds are that the people experienced enough at hacking on the Linux kernel to work on the ext2/3/4 drivers are primarily Linux users (and so don't much care about writing drivers for Windows).
With regards to driver signing: It's my understanding that, starting with Windows Vista, Microsoft doesn't have to sign or certify your drivers for them to be installed without warnings, but you do need a code signing certificate. These are somewhere in the neighborhood of $400 - $500 a year (see Verisign's web site, for example), and most non-commercial developers aren't interested in paying out that kind of money. There are methods for disabling driver signing requirements, but none of them are something the average user is likely to try, which would hinder the acceptance of a non-signed driver.
I don't know how the Ext2 IFS for Windows handles it; either its author got a certificate somehow, or it requires that you disable the driver signing requirements.
So, to summarize, the best ext2/3/4 developers probably don't have much need for Windows, and driver signing discourages would-be open source driver developers for Windows, and the availability of NTFS for Linux means that you can use NTFS instead of ext2/3/4 to share data between Linux and Windows. These three factors work together to remove a lot of the interest in developing ext2/3/4 for Windows.

Content for Linux Operating Systems Class

I will be a TA for an operating systems class this upcoming semester. The labs will deal specifically with the Linux kernel.
What concepts/components of the Linux kernel do you think are the most important to cover in the class?
What do you wish was covered in your studies that was left out?
Any suggestions regarding the Linux kernel or overall operating systems design would be much appreciated.
My list:
What an operating system's concerns are: Abstraction and extension of the physical machine and resource management.
How the build process works, i.e., how the architecture-specific/machine-code pieces are incorporated
How system calls work and how modules can link up
Memory management / Virtual Memory / Paging and all the rest
How processes are born, live and die in POSIX and other systems
userspace vs kernel threads and what the difference is between process/threads
Why the monolithic kernel design is growing tiresome and what the alternatives are
Scheduling (and some of the alternative / domain specific schedulers)
I/O, Driver development and how they are dynamically loaded
The early stages of booting and what the kernel does to setup the environment
Problems with clocks, mmu-less systems etc
... I could go on ...
I almost forgot: IPC and the Unix "everything is a file" design decisions
POSIX, why it exists, why it shouldn't
In the end, just get them to go through Tanenbaum's Modern Operating Systems, and also do case studies on some other kernels, like Mach/Hurd's microkernel setup and maybe some distributed and exokernel stuff.
Give a broad view past Linux too, I reckon.
For those who are super geeky, the history of operating systems and why they are the way they are.
The Virtual File System layer is an absolute must for any Linux Operating System class.
I took a similar class in college. The most frustrating but, at the same time, most helpful project was writing a small file system for the Linux operating system. Getting this to work takes ~2-3 weeks for a group of 4 people and really teaches you the ins and outs of the kernel.
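To give a sense of the scale involved, the entry point for such a project is tiny: a module that registers a filesystem type with the VFS. The following is an untested skeleton, assuming a kernel that still exposes the classic .mount interface; the names (demofs, demo_mount) are made up, and a real assignment would go on to fill in the superblock and inode operations.

    /* demofs.c - skeleton: register a (not yet mountable) filesystem with the VFS. */
    #include <linux/module.h>
    #include <linux/fs.h>
    #include <linux/err.h>
    #include <linux/errno.h>

    static struct dentry *demo_mount(struct file_system_type *fs_type,
                                     int flags, const char *dev_name, void *data)
    {
        /* A real implementation would set up the superblock and root inode here. */
        return ERR_PTR(-ENOSYS);   /* placeholder: mounting is not implemented yet */
    }

    static struct file_system_type demo_fs_type = {
        .owner   = THIS_MODULE,
        .name    = "demofs",
        .mount   = demo_mount,
        .kill_sb = kill_anon_super,
    };

    static int __init demo_init(void)
    {
        return register_filesystem(&demo_fs_type);
    }

    static void __exit demo_exit(void)
    {
        unregister_filesystem(&demo_fs_type);
    }

    module_init(demo_init);
    module_exit(demo_exit);
    MODULE_LICENSE("GPL");

After insmod, the new type shows up in /proc/filesystems, which makes a nice first milestone before any real on-disk or in-memory structures exist.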
I recently took an operating systems class, and I found the projects to be challenging, but essential in understanding the concepts in class. The projects were also fun, in that they involved us actually working with the Linux source code (version 2.6.12, or thereabouts).
Here's a list of some pretty good projects/concepts that I think should be covered in any operating systems class:
The difference between user space and kernel space
Process management (i.e. fork(), exec(), etc.)
Write a small shell that demonstrates knowledge of fork() and exec() (a sketch follows this list)
How system calls work, i.e. how do we switch from user to kernel mode
Add a simple system call to the Linux kernel, write a test application that calls the system call to demonstrate it works.
Synchronization in and out of the kernel
Implement synchronization primitives in user space
Understand how synchronization primitives work in kernel space
Understand how synchronization primitives differ between single-CPU architectures and SMP
Add a simple system call to the Linux kernel that demonstrates knowledge of how to use synchronization primitives in the kernel (i.e. something that has to acquire, say, the tasklist lock, but also something where you have to kmalloc, which can't be done while holding a lock (unless you use GFP_ATOMIC, but you really shouldn't))
Scheduling algorithms, and how scheduling takes place in the Linux kernel
Modify the Linux task scheduler by adding your own scheduling policy
What is paging? How does it work? Why do we have paging? How does it work in the Linux kernel?
Add a system call to the Linux kernel which, given an address, will tell you if that address is present or if it's been swapped out (or some other assignment involving paging).
File systems - what are they? Why do they exist? How do they work in the Linux kernel?
Disk scheduling algorithms - why do they exist? What are they?
Add a simple file system to the Linux kernel through the VFS layer
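As mentioned next to the shell project above, here is a rough sketch of the kind of minimal fork()/exec() shell that assignment asks for. It has no quoting, pipes, redirection, or job control, and all the names are my own - just enough to show the fork/exec/wait pattern.

    /* minish.c - a bare-bones shell demonstrating fork()/exec()/wait(). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/wait.h>

    #define MAX_ARGS 64

    int main(void)
    {
        char line[1024];

        for (;;) {
            printf("minish> ");
            fflush(stdout);
            if (!fgets(line, sizeof line, stdin))
                break;                      /* EOF (Ctrl-D) exits the shell */

            /* split the line on whitespace into an argv[] array */
            char *argv[MAX_ARGS];
            int argc = 0;
            for (char *tok = strtok(line, " \t\n");
                 tok && argc < MAX_ARGS - 1;
                 tok = strtok(NULL, " \t\n"))
                argv[argc++] = tok;
            argv[argc] = NULL;

            if (argc == 0)
                continue;
            if (strcmp(argv[0], "exit") == 0)
                break;

            pid_t pid = fork();
            if (pid < 0) {
                perror("fork");
            } else if (pid == 0) {
                execvp(argv[0], argv);      /* child: replace image with the command */
                perror("execvp");           /* only reached if exec failed */
                _exit(127);
            } else {
                int status;
                waitpid(pid, &status, 0);   /* parent: wait for the child to finish */
            }
        }
        return 0;
    }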
Well, I just finished my OS course this semester so I thought I'd chime in.
I was kind of upset that we didn't actually play around with the OS itself; rather, we just did systems programming. I'd recommend having the labs be on something that is in the OS itself, which is what it sounds like you want to do.
One lab that I did enjoy and found useful, however, was writing our own malloc/free routines. It was difficult, but pretty entertaining as well.
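For reference, the core of such an assignment can be surprisingly small. This is an untested sketch of a first-fit allocator on top of sbrk() - no block splitting or coalescing, alignment only as good as the header, not thread-safe, and the names are my own - just to show the shape of it.

    /* mymalloc.c - sketch of a first-fit malloc/free over sbrk(). */
    #include <stddef.h>
    #include <unistd.h>

    struct header {
        size_t size;          /* usable size of the block */
        int    free;          /* 1 if the block can be reused */
        struct header *next;  /* next block, in allocation order */
    };

    static struct header *head;

    void *my_malloc(size_t size)
    {
        /* round the request up to a multiple of 16 bytes */
        size = (size + 15) & ~(size_t)15;

        /* first fit: reuse the first free block that is large enough */
        for (struct header *h = head; h; h = h->next) {
            if (h->free && h->size >= size) {
                h->free = 0;
                return h + 1;   /* usable memory starts right after the header */
            }
        }

        /* nothing suitable: grow the heap */
        struct header *h = sbrk(sizeof *h + size);
        if (h == (void *)-1)
            return NULL;
        h->size = size;
        h->free = 0;
        h->next = head;
        head = h;
        return h + 1;
    }

    void my_free(void *p)
    {
        if (p)
            ((struct header *)p - 1)->free = 1;   /* just mark it reusable */
    }

Extending this with splitting, coalescing, and a thread-safety story is exactly where the lab gets interesting.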
Maybe also cover loading programs into memory and/or setting up the memory manager (such as paging).
For labs, one thing that may be cool is to show them actual code and discuss it: ask questions about why they think things are done that way and not another, etc.
If I were again in university, I would certainly appreciate more in-depth lessons about synchronization primitives, concurrency and so on... those are hard subjects that are difficult to approach without proper guidance. I remember I went to a talk by Paul "Rusty" Russell about spinlocks and other synchronization primitives that was absolutely rad; maybe you could find it on YouTube and borrow some ideas.
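If you do cover user-space synchronization, even a toy test-and-set spinlock built on C11 atomics makes the acquire/release ideas concrete. A sketch (not production code - a real lock should back off or fall back to futexes instead of burning CPU):

    /* spin.c - toy test-and-set spinlock using C11 atomics. */
    #include <stdatomic.h>

    typedef struct {
        atomic_flag locked;
    } spinlock_t;

    #define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

    static void spin_lock(spinlock_t *l)
    {
        /* spin until we are the thread that flips the flag from clear to set */
        while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
            ;   /* busy-wait */
    }

    static void spin_unlock(spinlock_t *l)
    {
        atomic_flag_clear_explicit(&l->locked, memory_order_release);
    }

    /* usage:
     *   static spinlock_t lock = SPINLOCK_INIT;
     *   spin_lock(&lock);  ... critical section ...  spin_unlock(&lock);
     */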
Another good topic (or possibly an exercise for the students) would be looking at virtualisation, especially Rusty Russell's "lguest", which is designed as a simple introduction to what is required to virtualise an operating system. The docs are good reading too.
I actually just took a class that perfectly fits your description (OS design using Linux) in the spring. I was very frustrated with it because I felt the teacher focused too narrowly on the projects rather than giving a broader understanding. For instance, our last project revolved around futexes. My partner and I barely learned what they were, got it working (kinda), and then turned it in. I came away with no general knowledge of anything, really, from that project. I wish one of the projects had been to write a simple device driver or something like that.
In other words, I think it's good to make sure a broad overview is presented, with as much detail as you can afford, but ultimately broad. I felt like my teacher nitpicked these tiny areas and made us focus intensely on those, while in the end I did NOT come away with that great a general understanding of the inner workings of Linux.
Another thing I'd like to note is a lot of why I didn't retain knowledge from the class was lack of organization. Topics came out of nowhere any given week, and there was no roadmap. Give the material a logical flow. Mental organization is the key to retaining the knowledge.
The networking sub-system is also quite interesting. You could follow a packet as it goes from the socket system call to the wire and the other way around.
Fun assignments could be:
create a stateful firewall using netfilter (see the hook sketch after this list)
create an HTTP load balancer
design and implement a simple tunneling protocol
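For the netfilter item above, the entry point is just registering a hook. This is an untested sketch assuming a reasonably recent kernel where nf_register_net_hook() exists (older kernels used nf_register_hook() and a different hook signature); the module and function names are my own, and a real stateful firewall would track connections (or reuse conntrack) before deciding what to return.

    /* nf_demo.c - sketch of a netfilter hook that just logs IPv4 sources. */
    #include <linux/module.h>
    #include <linux/netfilter.h>
    #include <linux/netfilter_ipv4.h>
    #include <linux/ip.h>
    #include <net/net_namespace.h>

    static unsigned int demo_hook(void *priv, struct sk_buff *skb,
                                  const struct nf_hook_state *state)
    {
        const struct iphdr *iph = ip_hdr(skb);

        if (iph)
            pr_info("nf_demo: packet from %pI4\n", &iph->saddr);

        return NF_ACCEPT;   /* let everything through; a firewall would filter here */
    }

    static struct nf_hook_ops demo_ops = {
        .hook     = demo_hook,
        .pf       = NFPROTO_IPV4,
        .hooknum  = NF_INET_PRE_ROUTING,
        .priority = NF_IP_PRI_FIRST,
    };

    static int __init demo_init(void)
    {
        return nf_register_net_hook(&init_net, &demo_ops);
    }

    static void __exit demo_exit(void)
    {
        nf_unregister_net_hook(&init_net, &demo_ops);
    }

    module_init(demo_init);
    module_exit(demo_exit);
    MODULE_LICENSE("GPL");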
Memory-mapped I/O and the 1G/3G vs. 2G/2G split between kernel address space and user-addressable space in 32-bit operating systems.
Limitations of 32-bit architectures on hard drive size and what this means for the design of file systems.
Actually, just all the pros and cons of going to 64-bit: what it means and why, as well as the history and why we aren't fully there yet.

Lightweight open-source shared file system over network

We have two web servers with load balancing. We need to share some files between those servers: uploaded files, session files, and various files that PHP applications create.
We don't want to use a heavyweight, no-longer-maintained, or commercial solution. We're looking for lightweight open-source software that would work as a shared file system. It should be really easy to set up, must be highly available, and must be very fast. It should work with Red Hat Linux.
We looked at solutions such as DRBD with synchronous replication, but we can't use them because they can't work with a filesystem like ext3 on top.
OCFS2 may be up to snuff by now; it's worth checking out at least. It's in the mainline Linux kernel tree, and http://oss.oracle.com/projects/ocfs2/ has some info on it. I've set it up before, and it was pretty easy to get going.
DRBD is good for syncing over a network (use a direct crossover connection if at all possible), but ext3 is not designed to be aware of changes that occur underneath it, at the block-device level. For that reason you need a filesystem designed for such purposes, such as the Global File System (GFS). To the best of my knowledge, Red Hat has support for GFS.
The DRBD manual will give you an overview of how to use GFS with DRBD.
http://www.drbd.org/users-guide/ch-gfs.html
Don't take this as a final answer - I have not researched or used a multi-master system before, but at least this might give you something to go on.
Ideally, you would only sync the part of the data that's shared between the webservers.
