This is my proposal to make a structural framework for this area of a UNIX kernel.
The same "basic" kind of setup under "geometry" would look like this:
Now the SCSI and WD drivers just provide access to the hardware; they don't know anything about layout. Separate "methods" handle layout, slicing and partitioning.
The red circles mark "geometry" devices. These are the points in the graph which can be accessed from /dev/something or other.
A geometry device has the following public properties:
This is basically a machine with two mirrored disks. I will use this to illustrate an important concept of "geometry": on the fly insertion.
When the machine boots, let's say from sd0, we need to find a suitable root filesystem. Since we want to be backwards compatible, the MBR and BSD methods will be self-identifying; i.e. they will examine the available devices and instantiate themselves on those devices on which they find their respective magic sectors.
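To make "self-identifying" concrete, here is a minimal sketch of how a method could probe a device for its magic sector. The `taste_fn` type, the `methods` table and `identify()` are hypothetical names invented for this illustration, not part of the proposal. Only the MBR check is shown; it relies on the well-known 0x55, 0xAA signature in the last two bytes of the boot sector. A BSD method would plug into the same table with its own check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* Hypothetical probe signature: nonzero means "this device is mine". */
typedef int (*taste_fn)(const uint8_t *sector0, size_t len);

/* An MBR boot sector ends with the bytes 0x55, 0xAA at offsets 510 and 511. */
static int mbr_taste(const uint8_t *s, size_t len)
{
    assert(s != NULL);
    return len >= SECTOR_SIZE && s[510] == 0x55 && s[511] == 0xAA;
}

/* Hypothetical registry of self-identifying methods. */
struct method {
    const char *name;
    taste_fn    taste;
};

static const struct method methods[] = {
    { "MBR", mbr_taste },
};

/* Return the name of the first method that claims the device, or NULL. */
static const char *identify(const uint8_t *sector0, size_t len)
{
    for (size_t i = 0; i < sizeof methods / sizeof methods[0]; i++)
        if (methods[i].taste(sector0, len))
            return methods[i].name;
    return NULL;
}
```

A boot-time loop would call `identify()` on the first sector of every available device and instantiate the matching method on top of it.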
So at the time when /sbin/init gets executed the picture looks like this:
So before we mount anything read/write, we want to activate the mirroring:
Now it looks like this:
Next, using the same set of conditions we enable the mirror:
Now we're back to the setup we started with:
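The walk-through above can be pictured as splicing a new node into a graph of devices. The following is only a sketch of one plausible reading: `struct gdev`, `insert_below()` and `add_lower()` are hypothetical names, and the two-step order (splice the MIRROR in over the boot disk, then attach the second disk) is an assumption about what "activate" and "enable" mean here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOWER 2

/* Hypothetical node in the device graph. */
struct gdev {
    const char  *name;
    struct gdev *lower[MAX_LOWER];  /* devices this one sits on top of */
    unsigned     nlower;
};

/* Splice `mid` between `upper` and its first lower device. */
static void insert_below(struct gdev *upper, struct gdev *mid)
{
    assert(upper->nlower >= 1 && mid->nlower == 0);
    mid->lower[0] = upper->lower[0];
    mid->nlower = 1;
    upper->lower[0] = mid;
}

/* Attach one more lower device, e.g. the second disk of a mirror. */
static void add_lower(struct gdev *d, struct gdev *lower)
{
    assert(d->nlower < MAX_LOWER);
    d->lower[d->nlower++] = lower;
}
```

The point of the sketch is that the upper device never changes identity: whatever already sits on top of it keeps working while the MIRROR appears underneath.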
There are no limits to what a method can do, really. Here is a bestiary of some of the ones I can imagine:
BSD
Understands BSD style disklabels
MBR
Understands DOS/MBR/FDISK style disklabels
MIRROR
Mirrors data over multiple lower devices.
CONCAT
Concatenates a number of lower devices into one larger device
STRIPE
Like CONCAT, but with interleaved layout.
RAID-5
RAID-5 over a number of lower devices.
INTERLEAVE
This is the opposite of STRIPE. It interleaves a number of upper devices onto one lower device. For two interleaved devices, all the even numbered sectors on the lower device will belong to the first upper device and the odd numbered ones to the other.
COW
"Copy On Write". Imagine the case where you had a nasty crash and fsck barfs badly over one of your filesystems. The temptation to just run "fsck -y" is there, but what will happen? Well, you put a "COW" on your device and tell the "COW" to use your swap partition for temporary storage. Then you run fsck -y on your COWed device. The COW module will look just like a normal device, but all the writes fsck does will be stored in the temporary storage until you tell COW to "commit". So if fsck -y looked OK, you mount the device, peek around, find nothing important missing and tell "COW" to commit; COW will copy all the blocks written by fsck from temporary storage to the "real" device and we're all happy. If, on the other hand, fsck -y removed pretty much everything on the filesystem, you will probably tell "COW" to "abandon" and take the long road home to recovery. Call it the "What if?" method if you like, but it is my favourite method.
APPLE, SUN, MVS, XENIX
These are various methods to read the disklabels as they look on various other machines and OSs.
"YOUR IDEA GOES HERE"
A method should hopefully be something very simple to write, so if you have a good idea...
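As an illustration of how little such a method needs to do, the sector arithmetic for INTERLEAVE and STRIPE from the list above fits in a few lines. This is a sketch with hypothetical function names, and it assumes a stripe unit of exactly one sector, which the description does not specify.

```c
#include <assert.h>
#include <stdint.h>

/* Where a sector ends up on a set of lower devices. */
struct lower_addr {
    unsigned dev;     /* index of the lower device */
    uint64_t sector;  /* sector number on that device */
};

/*
 * INTERLEAVE: sector `s` of upper device `k` (out of `n` upper devices)
 * lands on sector s * n + k of the single lower device. With n == 2,
 * upper device 0 gets the even sectors and upper device 1 the odd ones.
 */
static uint64_t interleave_to_lower(unsigned k, unsigned n, uint64_t s)
{
    assert(n > 0 && k < n);
    return s * n + k;
}

/*
 * STRIPE (assumed stripe unit: one sector): sector `s` of the upper
 * device lands on lower device s % n, at sector s / n.
 */
static struct lower_addr stripe_to_lower(unsigned n, uint64_t s)
{
    assert(n > 0);
    struct lower_addr a = { (unsigned)(s % n), s / n };
    return a;
}
```

The two functions are inverses of each other in spirit: one folds several devices into one, the other spreads one device over several.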
It's about providing tools, not policies really...
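The COW method described in the list above is a good example of such a tool. Here is a minimal in-memory sketch of its commit/abandon behaviour; all names are hypothetical, the "devices" are plain byte arrays, and the temporary storage is a fixed-size array rather than a swap partition.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NSECTORS    8   /* tiny device, for illustration only */

struct cow {
    uint8_t *real;                         /* the "real" device */
    uint8_t  temp[NSECTORS][SECTOR_SIZE];  /* temporary storage */
    uint8_t  dirty[NSECTORS];              /* 1 if temp[i] shadows real */
};

static void cow_init(struct cow *c, uint8_t *real)
{
    memset(c, 0, sizeof *c);
    c->real = real;
}

/* Writes never touch the real device; they go to temporary storage. */
static void cow_write(struct cow *c, unsigned sec, const uint8_t *buf)
{
    assert(sec < NSECTORS);
    memcpy(c->temp[sec], buf, SECTOR_SIZE);
    c->dirty[sec] = 1;
}

/* Reads see the pending write if there is one, else the real data. */
static void cow_read(const struct cow *c, unsigned sec, uint8_t *buf)
{
    assert(sec < NSECTORS);
    const uint8_t *src = c->dirty[sec]
        ? c->temp[sec]
        : c->real + (size_t)sec * SECTOR_SIZE;
    memcpy(buf, src, SECTOR_SIZE);
}

/* "commit": copy every pending sector to the real device. */
static void cow_commit(struct cow *c)
{
    for (unsigned i = 0; i < NSECTORS; i++) {
        if (c->dirty[i]) {
            memcpy(c->real + (size_t)i * SECTOR_SIZE, c->temp[i], SECTOR_SIZE);
            c->dirty[i] = 0;
        }
    }
}

/* "abandon": forget every pending write; the real device is untouched. */
static void cow_abandon(struct cow *c)
{
    memset(c->dirty, 0, sizeof c->dirty);
}
```

In the fsck scenario, fsck would only ever call the read and write paths, and the operator decides afterwards between `cow_commit()` and `cow_abandon()`.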
I actually had a prototype of this running, but it suffered badly from "second-system syndrome", so a fresh start should be made. I am unlikely to have the time for it unless I find a sponsor.
Poul-Henning Kamp
phk@FreeBSD.org
19990925