This document contains only my personal opinions and calls of judgement, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.
Having summarized in my own words how flash SSD storage units are structured, it may be interesting to discuss which file system to use on them. The main problems with SSDs are:
large page sizes (typically 8KiB) and erase block sizes (typically 1MiB);
write amplification (the page linked to provides further discussion of flash SSD structure and challenges).
The large transaction sizes seem addressable with file systems that allocate space in extents (ideally with an 8KiB granularity) and support parity RAID storage well, as a multi-unit RAID stripe can well have a width of 1MiB, and block sizes of 8KiB are not uncommon.
It is also important to ensure that partitions (if any) are aligned; most recent GNU/Linux and MS-Windows tools do that, and I usually align partition starts to boundaries of 256MiB up to 1GiB anyhow.
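As a minimal sketch (the device name is hypothetical), recent versions of parted can both create aligned partitions and check the alignment of existing ones:
# create a partition aligned to the reported device topology
parted -a optimal /dev/sdb mkpart primary 1MiB 100%
# check whether partition 1 is optimally aligned
parted /dev/sdb align-check optimal 1
# or inspect the start sector directly: it should be a multiple of the erase block size
cat /sys/block/sdb/sdb1/start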
As to reducing the number of erase operations, that means also reducing the number of write operations, and as to this unfortunately all journaled file-systems have the problem that they stage all updates to both the journal (first) and the filetree (second). Fortunately most file-systems with a journal write to it only metadata updates, and journal the updates as operations instead of actual data, so the amount journaled in normal operation is rather small, especially if filetrees are mounted with the noatime option.
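As a minimal sketch (device and mount point are hypothetical), the corresponding /etc/fstab line would be something like:
# mount without access-time updates to avoid gratuitous metadata writes
/dev/sda2  /home  ext4  noatime,errors=remount-ro  0  2
Recent kernels default to the milder relatime, which already avoids most atime writes.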
It is still a problem though, and for this and other reasons recent versions of the ext4 file-system have the option to disable journaling (the other reason is that on rotating storage journaling can be very expensive by creating long travel distances between the journal area and the active area of the disk).
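For example (the device name is hypothetical, and the filetree must be unmounted and checked afterwards), the journal can be removed or omitted with the standard e2fsprogs tools:
# remove the journal from an existing ext4 filetree
tune2fs -O ^has_journal /dev/sdb1
e2fsck -f /dev/sdb1
# or create the filetree without a journal in the first place
mkfs.ext4 -O ^has_journal /dev/sdb1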
The most appropriate file-systems would be those designed for MTDs, which are flash SSDs without a simulation layer, but all those available target small capacity devices because they are aimed at embedded applications.
Next most appropriate would be file-systems designed for erasable devices, or for write-intensive profiles. The main candidates are UDF and NILFS2, and unfortunately the UDF code in Linux is not wholly reliable for writing. NILFS2 instead is currently well maintained and increasingly popular. It seems particularly well suited to flash SSDs, as log-based file-systems bunch up writes, and their major weakness, that reads are then more scattered, matters little given the low access times of flash SSDs. NILFS2 has a background cleaner that bunches up blocks that are likely to be read sequentially, but that must be disabled on an SSD as it results in write amplification and is pointless.
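A minimal sketch of trying it out (device and mount point are hypothetical, and stopping the daemon is a rather blunt way of disabling the cleaner):
# nilfs-utils provides the tools and the userspace cleaner
mkfs.nilfs2 /dev/sdb1
mount -t nilfs2 /dev/sdb1 /mnt/ssd
# mount.nilfs2 starts nilfs_cleanerd, whose policy lives in /etc/nilfs_cleanerd.conf;
# on a flash SSD it can simply be stopped
pkill nilfs_cleanerd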
There are various online comparisons of NILFS2 with other file-systems on SSDs and it is sometimes a bit better and sometimes a bit worse, which is unexpected, as it should be way better given that it matches the underlying storage profile better. That other filetrees seem comparable is probably because:
the benchmarks used do not take into account a probably significant difference in write amplification;
the benchmarks only test one-shot performance on a freshly built filetree, while largely the problems with an SSD are about erase counts and performance over time.
Then there are conventional file-systems with good support for alignment and with write optimizations, for example ext4, XFS, BTRFS, OCFS2.
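For instance, both XFS and ext4 can be told about the underlying geometry at mkfs time; a minimal sketch with a hypothetical device name and values matching the 8KiB page and 1MiB erase block sizes mentioned above:
# XFS: a 1MiB stripe unit so that allocations align to erase blocks
mkfs.xfs -d su=1m,sw=1 /dev/sdb1
# ext4: the same hint expressed in 4KiB blocks (256 blocks = 1MiB)
mkfs.ext4 -b 4096 -E stride=256,stripe-width=256 /dev/sdb1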
All of the above considered, probably the best choice currently is XFS, as it has good support for clustering, supports TRIM, and is currently maintained. Many people use ext4 as it is kept very current with new features, including those relating to flash SSDs, but as usual I think that it is of an obsolete design, in particular as to the statically allocated inode list and the flat directory structure. JFS as usual would be much preferable, but it is no longer actively developed, and this means that it is lacking barriers and flash oriented features. NILFS2 seems very appropriate and robust, and actively developed, and it may be the best alternative to XFS, and I shall test it more.
BTRFS is the file-system of the future, especially because it supports copy-on-write and flash SSD friendly layouts, but like many I feel that it is not yet production ready because of the lack of a fsck. OCFS2 performs well even in non-shared mode and it is mostly well designed and is actively developed, but I feel that might not last long.
Since I am reading and thinking about SSDs in depth thanks to the holidays, I have realized that I should add to my summary of how flash SSD storage units are structured that there is another important issue: initially there are plenty of empty erase blocks, and that matters because an empty one can be written to not only without first erasing it, but without first reading it either. The erasing does not matter that much, because at some point previously it must have been erased, but avoidance of a RMW cycle is more important.
Therefore ideally the firmware on each flash SSD storage unit should aim to keep all written blocks packed into as few erase blocks as possible, to ensure that as many erase blocks as possible are fully empty.
The difficulty is that eventually all writable physical pages get written, and then no erase block can be considered empty. But the file-system contained within will usually have unused blocks; the problem is that the firmware has no idea which ones, because that is a file-system level notion. One possibility would be for the file-system to explicitly fill with zeroes each unused block and thus mark empty physical pages in that way, but usually that is considered to be too slow (and unused blocks are zeroed only when read after being allocated). This would also not work with most encryption schemes, as a block of all zeroes would encrypt to non-zeroes anyhow, the particular value depending on the encryption key and the address of the block (if the encryption layer is using some common scheme).
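A crude user-space approximation of the zero-filling idea (paths are hypothetical) is to fill the free space with zeroes and then delete the file; note that it is slow, causes extra wear, and only helps if the firmware actually recognizes all-zero pages, which is far from guaranteed:
# fill the free space with zeroes, then release it again
dd if=/dev/zero of=/mnt/ssd/zerofill bs=1M
sync
rm /mnt/ssd/zerofill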
Therefore most SSD firmwares have some special device operations to declare a sequence of logical sectors as unused or empty; these are typically the ATA TRIM and SCSI UNMAP commands, exposed at the file-system level in Linux through the FITRIM ioctl.
Ideally these would be the equivalent of commands to write, but for whatever reason in some specifications and implementations these commands are actually very slow, so they are best used periodically and not every time a block is released, even if it is possible to do the latter with the discard mount option, currently supported by the ext4, XFS, BTRFS, OCFS2 and GFS2 file-systems (in very recent versions of the Linux kernel).
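For example (the mount point is hypothetical), the periodic approach is a batched FITRIM via fstrim, while the per-release approach is a remount with discard:
# batched discard of all free space, e.g. from a weekly cron job
fstrim -v /home
# continuous discard, issued every time blocks are released
mount -o remount,discard /home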
An alternative is to periodically reset the whole unit with a security erase command and reload its contents.
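For example (the device name is hypothetical; the command is destructive and will be refused if the unit has been "frozen" by the BIOS), the ATA way of doing that with hdparm is:
# DESTRUCTIVE: set a temporary password, then issue an ATA secure erase
hdparm --user-master u --security-set-pass temppass /dev/sdX
hdparm --user-master u --security-erase temppass /dev/sdX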
Another alternative would be for the marking of unused logical sectors to the device to happen at the end of fsck, as at that point there is an exhaustive and known-good list of unused blocks. Unfortunately I don't know which file-system fsck tools do that.
I have just tried using the backported 3.0 kernel now available for ULTS 10 and I was amazed that standalone power consumption for my Toshiba U300 laptop was considerably lower, dropping from around 1200mA to around 860mA:
# acpitool -B
  Battery #1           : present
    Remaining capacity : 1661 mAh, 45.36%, 01:55:36
    Design capacity    : 4000 mAh
    Last full capacity : 3662 mAh, 91.55% of design capacity
    Capacity loss      : 8.450%
    Present rate       : 862 mA
    Charging state     : discharging
    Battery type       : rechargeable, 18087
    Model number       : 32 mAh
    Serial number      : NS2P3SZNJ4WR
That's impressively lower power consumption, and may be due to what is arguably a bug that has been added to the power management code in Linux to match an equivalent bug in the power management code in the firmware of many laptops.
Having reported an explanation of some recent SSD tests I have been considering which file-system to use on SSDs and found a relatively recent (2008) presentation on SSD technology and Linux file-systems which seems a good introduction. Summarizing the important parts, flash based SSDs are made of a collection of flash chips fairly similar to those used to store ISA PC BIOSes, and they work as a concatenation or RAID0 across them.
The single biggest performance implication is that SSD chips have positioning times negligible compared to those of rotating devices, and those times do not depend on distance.
But, just like BIOS chips, the content of SSD flash chips cannot be modified in place: it can only be erased and then written (and erasure and writing are both slower than reading), the number of erasures is limited, and the minimum amount erased is fairly large. Therefore the physical transaction size, at least for updates, is much larger than the desired logical transaction size, which is usually between 512B and 4096B, which means that at least some updates will involve RMW. Also, the minimum physical read transaction size is often rather larger than the logical sector size.
While traditionally logical sectors have been 512B, the minimum physical read size (page) and the minimum erase transaction size are currently usually 8KiB and 1024KiB, which is rather challenging, and the firmware of most SSD storage devices then aims, for marketing purposes, to simulate a traditional read-write small-sector device on a read-erase-write large-sector one.
The general solution is to have a table of erase blocks showing how many erasures they have suffered, and which parts of them contain data, and to take advantage of the low and constant access times to allocate logical sectors to erase blocks in an improved layout, by keeping a mapping table too from logical sectors to erase block sectors. The improved layout is pretty obvious, and includes for example computing a hash code for every logical sector to be written, looking it up in the table of logical sectors, and reusing the relevant physical sector if found (with a usage count).
As hinted, that is operating as a log-structured file-system would, as the latter significantly increases write and update performance and locality, where locality is important not because of access times proportional to distance as in rotating devices, but because of write times proportional to the number of erase blocks involved more than to the amount to be written.
Most recent flash SSD firmware seems to work mostly as hinted
above, with the possible exceptions of hash coding logical
sector contents (deduplication
) and
compression (which is however popular).
Also it has turned out that it is quite important for sustained performance to move logical sectors so they are held in a smaller number of fuller erase blocks, and ideally so that logically near sectors be in the same erase block. The goal is not to defragment logical sectors, but to defragment erase blocks, so that when writes occur in bursts there are empty or mostly empty erase blocks.
This processing, which is very similar to the cleaner of a log-structured file-system, has been realized both on request (with a command called TRIM) and as an automatic background operation.
As a side note to my displeasure with the increasingly skewed aspect ratios of displays, they make me even more perplexed as to the widespread practice of having a single full-screen window on the display, with a stack of hidden also full-screen windows underneath, instead of a set of overlapping windows. My usual practice on my home 24" 1920×1200 display is to use a number of somewhat overlapping windows, typically in sizes like 600×800 (80×40 Emacs text), 800×1180 (96×66 Emacs text), 660×960 (80×60 character terminal), 960×1024 or 1024×1152 (web browsers), with some occasional 1210×920 (132×43 character terminal) or 1400×1024 (web browser or image viewer) for very wide data listings or graphical content.
Those using a single full-screen visible window per display, especially if GUI elements are horizontal, end up viewing whatever content in an amazingly squat and wide way, and I have found that this is particularly unwholesome for programmers, as it encourages them to write programs with very long lines.
There are many aspect of contemporary computing that depend
on historical details (UNIX files are virtual paper tapes,
virtual terminals are one punched card wide, ...) and one of
them is related to display aspect ratios: that
GUI
toolbars
are usually grouped to the top or
bottom of displays or windows.
The early computers
(for example
Xerox Alto,
Three Rivers Perq, and Apollo DN workstations)
with a GUI tended to have (monochrome, not even grayscale)
square (1024×1024) monitors or portrait ones
(600×800, 768×1024) because they were mostly
designed as document processors
.
Color and landscape displays became popular later at first
because people who wrote spreadsheets
tended to write them with many columns and few rows (also
because spreadsheets were originally used on glass tty
devices with ratios like 24 rows and
80 columns), and also wanted to color code cells, and more
recently because most entertainment content like movies and
photographs and computer games have landscape aspect
ratios.
Amusingly some of the first computers with a GUI had a circular display (recycled WW2 radar screens or inspired by them).
As currently most displays have wide and squat landscape aspect ratios the traditional layout of most GUIs makes little sense, and as a rule I reconfigure GUI styles to have toolbars on the sides, to save scarce vertical pixel size. As to this GNOME based applications do badly, as they seem to have several elements that are not designed to work equally well horizontally and vertically, but I usually use KDE and virtually all of its GUI elements work well vertically as well as horizontally, including for example the virtual desktop panel, the task manager, and the system tray.
The sole exception I can find in the KDE SC 4.4 is that the Plasma panel's (somewhat bizarre) options sidebar is rather slippery when used vertically (and annoyingly so).
The reason why I was reminded of gamma settings issues is that for reasons that I don't know some of the LCD monitors that I have been using look rather washed out with their default settings, and this is because their default gamma seems to me usually too high.
The three examples that I have in mind are the displays of the Toshiba U300 and Toshiba L630 laptops and the display of the BenQ BL2400PT monitor. The two laptop displays seem to look a lot better (at the cost of some loss of distinction among dark shades) after:
xrandr --output LVDS1 --gamma 1.4:1.4:1.6
Where the higher setting for blue is an attempt to compensate for an overall bluish tint that most laptop displays have. For the BL2400PT the monitor's own gamma setting should be 2.2 and after:
xrandr --output VGA1 --gamma 1.2:1.2:1.2
The BL2400PT and several other monitors, notably the
Dell 2007WFP and several similar
Dell ones tend to have default settings for
sharpening that are too high by one or two notches.
Most likely this is designed to counteract the blurring of character shapes by subpixel rendering, which is the default in MS-Windows and very regrettably now in most GNU/Linux distributions. Anyhow it also makes graphical shapes including GUI elements seem to have a bizarre whitish fringe, and regardless I disable subpixel rendering, using instead bitmap or well hinted fonts and font renderers.
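For reference, a minimal sketch of how that can be turned off with FontConfig (the per-user file is ~/.fonts.conf on older setups, ~/.config/fontconfig/fonts.conf on newer ones):
cat > ~/.fonts.conf <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <!-- disable subpixel (RGB/BGR) rendering for all fonts -->
  <match target="font">
    <edit name="rgba" mode="assign"><const>none</const></edit>
  </match>
</fontconfig>
EOF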
One amazing detail is that on most monitors that default to excessive sharpening it is enabled when the input signal is digital too. This is what makes me think the default for excessive sharpening is related to the prevalence of subpixel rendering, because the original motivation for sharpening was to improve the somewhat fuzzy analogue output signal of many video cards.
Some of my least esteemed open source developers are GregKH and KeithP: the former for the appallingly opportunistic replacement of devfs with something equivalent but far bigger, more complex, and less maintainable, and similarly the latter for his appallingly opportunistic update of the X window system display model with the unfathomable misdesign of RANDR, which I have mentioned previously but two aspects of which in particular continue to vex me:
the DPI and gamma of a monitor are now properties of an output rather than of an X screen, or rather of both.
Also, the availability of setting DPI and gamma by output depends on how recent the version of RANDR is, because whichever buffoon had the idea of using outputs in addition to screens to indicate monitors did not realize until rather late that ideally most if not all the properties of screens needed to be available on outputs too.
But it still seems to me that the change from screens to outputs is the bigger and worse one, not just that the details of the change have been messed up so tastelessly, as the X model of independent screens with specific properties was one of the more elegant aspects of its architecture.
Also, in large part the output model has been a replacement instead of an addition to the screen model, because some X drivers, notably the intel driver, no longer support screens and only support outputs, which sometimes forces the use of the grotesquely inane syntax where the positions of outputs are specified in the Monitor section (which used to be a monitor type definition, not a monitor instance one).
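As an illustration of that syntax (output names and identifiers are hypothetical, and the configuration file path varies by distribution), positioning RANDR outputs through xorg.conf means binding Monitor sections to outputs in the Device section and then placing them relative to each other:
cat > /etc/X11/xorg.conf.d/10-monitors.conf <<'EOF'
Section "Device"
    Identifier "card0"
    Driver     "intel"
    Option     "Monitor-LVDS1" "panel"
    Option     "Monitor-VGA1"  "external"
EndSection
Section "Monitor"
    Identifier "panel"
EndSection
Section "Monitor"
    Identifier "external"
    Option     "RightOf" "panel"
EndSection
EOF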
Having just mentioned the special performance profile of SSDs I have also just read a review of a typical contemporary SSD storage unit and the benchmark results on this page are particularly telling. The first two tests are about 4KiB random reads and writes.
The conclusion is that the SSD must be simulating a 4KiB (or 512B) sector device on a device with a much larger (erase) block size, and that not only erases, but reads and writes have a fairly large minimum transaction size. Which seems confirmed by the next two tests, again random 4KiB reads and writes but at a rate high enough that there are on average 32 operations, or 128KiB worth of data, queued on the device at any one time.
How SSD controllers work should be fairly clear by now, and the final confirmation comes from the next two graphs, for 512KiB random reads and writes.
The final two graphs on that page report bulk sequential transfer rates, and they are typical for all types of devices, with SSD read rates typically being faster than write rates (as physical reads are faster than physical erases and writes), and with rotating storage devices having roughly the same read and write rates.
It is notable that the 3TB drive has a transfer rate of 180MB/s, rather higher than the typical 120MB/s of 1TB and 2TB units, which indicates a 50% higher recording density, typical of a recording technology generation switch.
It is also notable that the peak random and sequential transfer rates of several SSDs exceed even SATA2 rates, and by far, with for example the peak read rate for the unit under review being 529.2MB/s vs. 278.5MB/s, and 273.8MB/s vs. 228.6MB/s for writing.
Interesting article about the recent storage shortages perhaps being eased, which also mentions that SSD units are sometimes being purchased to replace hard to find rotating storage, as they are typically manufactured in areas other than those whose troubles are affecting rotating storage production.
As to SSD there is a recent list of SSD device types and form factors including the increasingly popular PCIe card ones. The list is quite useful and it has a number of very clear photographs and several basic benchmarks.
While SSD form factors are important, some of the more obscure aspects of their structure are far more important, because their performance is very highly anisotropic with load, as they are read-any, erase-many, write-once devices with a very large minimum erase size.
Discussing with someone I made a point that should be repeated here: good quality LCD monitors like my current Philips 240PW9 or even the cheaper ones I also reviewed are on a different level from most traditional monitors. I am often amazed by how good my monitor is (even if I sometimes wish it had a higher DPI or was greyscale).
I am also still very happy about my Samsung WB2000 camera and sometimes I read reviews of comparable cameras like the Nikon S9100, Canon 230HS or Fuji F550EXR (in 8MP mode, as the 16MP mode is too noisy), where they come out as being equally good in most ways, but with a longer telephoto and a less good display and user interface (the WB2000 has features that few equivalent cameras allow, like raw image capture and manual operation including focus).
As to the display of the WB2000 it is encouraging that its 3" AMOLED display is so good, as AMOLED looks like the natural evolution of monitor displays too, not just for portable devices like a camera, smartphone or tablet.
Accidentally reading an older set of blog entries I noticed that I had already mentioned the issue with display aspect ratios and using diagonals to indicate monitor sizes, as at the time there was a transition from 16:12 to 16:10 aspect ratios, while more recently there has been a transition from 16:10 to 16:9 aspect ratios.
I have just had a very bizarre moment in which I thought I had seen an email message getting lost where it really should not. My home mail arrangements have fetchmail taking messages from remote servers, and injecting them into a local exim MTA for delivery to traditional local per-user mailboxes in /var/mail/.
I then access these mailboxes via Dovecot from VM version 8.0.13 under Emacs version 23.1.1.
I ran fetchmail and before it finished I told VM to download mail. Then that froze up for a significant amount of time and I worried. I checked and at least one mail message that was received did not make it to my mailbox.
The message however was not lost: it was still in the exim queue. Most likely, since there was some interlock on the mail store mailbox, the MTA just stopped local deliveries. So no message (I think) was lost, and I just manually started the MTA to clear the small queue.
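For the record, the standard Exim options for looking at and flushing the queue are:
# list the messages currently queued
exim -bp
# force a delivery attempt for everything queued, including frozen messages
exim -qff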
It is good to know that in general my e-mail chain is fairly reliable. In the past however I have lost email either because of running out of battery power for my laptop, or kernel crashes. In theory email tools use fsync carefully to ensure that new copies are committed to disk before deleting old copies of messages, but some have windows of vulnerability, and a bad crash can damage the filetree too.
More commonly I have lost emails by deleting them inadvertently, between backups, so I could not go back and restore them.
In general I am used to the bad old days (when using for example UUCP mail forwarding) when e-mail had for various reasons appreciable delivery delays and losses, and I never consider e-mail a reliable communication medium. Unfortunately a lot of people have become used to e-mail being both nearly instantaneous and overall fairly reliable, so they use it as a kind of instant messaging system, rather than memo writing.
I was looking at various DBMS implementation issues, and I found at the Facebook MySQL discussion group this very amusing entry:
InnoDB uses a B-tree for clustered and secondary indexes. When all inserts are done in random order about 30% of the space in leaf nodes will be unused due to fragmentation. Some of the unused space can be reclaimed by defragmenting indexes. Back in the day it was only possible to defragment the primary key index for InnoDB. But with the arrival of fast index creation it is now possible to defragment all indexes of an InnoDB table. Percona has expanded the cases in which fast index creation is done.
The amusement arises because this relates directly to an ancient paper by Michael Stonebraker about the tradeoffs between static indices and dynamic indices in Ingres:
@article{Held:1978:BR:359340.359348,
  author    = {Held, Gerald and Stonebraker, Michael},
  title     = {B-trees re-examined},
  journal   = {Commun. ACM},
  volume    = {21},
  issue     = {2},
  month     = {February},
  year      = {1978},
  issn      = {0001-0782},
  pages     = {139--143},
  numpages  = {5},
  url       = {http://doi.acm.org/10.1145/359340.359348},
  doi       = {http://doi.acm.org/10.1145/359340.359348},
  acmid     = {359348},
  publisher = {ACM},
  address   = {New York, NY, USA},
  keywords  = {B-trees, directory, dynamic directory, index sequential access method, static directory},
}
The topic is a performance comparison between the two types of indices, and in particular as to number of disk operations. The first conclusion was that static indices are more compact than dynamic indices, as the latter must contain empty space (usually 30%) because of index page splits when they fill, while static indices can be built with no unused space, and this reduces disk operations if there are relatively few additions to the indices, and in most databases that's the case most of the time, as additions tend to happen in batches, and indices can be rebuilt periodically after those additions. On that basis they set static indices to be the default.
Unfortunately experience showed that overwhelmingly database
administrators did not rebuild the static indices after adding
many records to the underlying tables, and then complained
about DBMS performance getting worse and worse. Which prompted
the authors to state that they should have used instead
dynamic indices like
B-trees
and that the experience made them believers in self-tuning
systems.
The irony is that in the Facebook case the dynamic InnoDB indices are explicitly rebuilt as if they were static precisely to squeeze out the free space that makes adding new records efficient.
I have recently written some notes about displays in part because at work (and at home) I stare at a display for many hours a day (when I am not moving equipment around or suffering meetings). I have perhaps stricter requirements for displays than many other people that I have met, and for example I wrote that I prefer taller displays as I mostly work on text. But there are other considerations, some of which are practical, in the sense that there are products that satisfy them, and some are not.
I really like LCD IPS or PVA/MVA displays as they give much wider viewing angles without color or brightness or contrast changing. The color and/or brightness or contrast changes in other displays can mean that on a largish desktop display, without moving one's head, widely spaced parts of the display have different color temperatures or different contrast or brightness, and that moving one's head even a little does change them in the less bad cases.
IPS and PVA/MVA displays also have better image quality, with IPS displays often having better colors and PVA/MVA having better contrast (usually thanks to deeper black).
Until recently, though, IPS and PVA/MVA displays were used mostly in high end largish monitors, usually in the upper price band for office monitors (currently around £400 for 24in) or in the top band (currently around £900 for 24in) for graphics work monitors.
In the past couple of years some low cost variants of IPS and PVA/MVA have appeared, in particular eIPS from LG.Display, cPVA from Samsung, and A-MVA from AU Optronics (which correspond to the 3 monitors I have reviewed recently), and these result in monitors that usually cost less than half as much as previous office oriented IPS and MVA/PVA panels. This is in part due to lower panel cost, in part to lower build quality of the whole monitor, and in part to some limitations in the panel specification, as they tend to be run in 18 bit color mode plus timewise dithering rather than full 24 bit color mode.
These panel technologies are well described in this page from the very good TFTCentral display information and review site (please donate to support them, as their reviews and information are indeed very good).
I suspect that most of the impulse for the development of cheap higher quality displays based on IPS and PVA/MVA has been driven by two market segments that have grown dramatically in recent years:
Of the above I think that the most important has been the demand for high quality displays for cellphones and tablets, because of the colossal numbers involved and the education of consumers willing to buy highly profitable top end gadgets. Cellphones and tablets are also driving the adoption of AMOLED displays, as are some cameras.
I am happy that IPS and PVA/MVA displays have finally become
popular so that there is some more choice at a wider set of price points
. A few things that I would like
for at least desktop monitors and that are somewhat unlikely
to happen are:
Grayscale displays: in a color LCD each pixel is split into three colored subpixels, which subpixel rendering tries to exploit (poorly) to improve apparent resolution, and most importantly the display is a complex sandwich of three layers of electronics and polarizers. The three layer sandwich in particular causes a lot of cost and problems, and there are for special applications grayscale monitors of amazingly better visual quality (and there were also amazingly good grayscale CRT displays). OLED displays have similar issues due to having subpixels too and can have multiple layers as well, even if without the polarizers needed by LCDs.
Unfortunately the biggest markets for displays are not desktop office monitors, but entertainment purposes, such as very large televisions, and so low resolution, wide aspect ratio, landscape color monitors are almost the only option, except for rather small portable device displays.
There have been pretty good portrait page aspect ratio high
resolution grayscale displays in recent times, but only for
portable ebook
readers, even if some of
these also have web browsers.
Reading a review of the Dell XPS 14z laptop I was very amused, or perhaps not, to read that:
The 14z is small for a 14in laptop; it's actually slightly smaller than a 13in MacBook Pro.
The reason is that the Dell XPS 14z has a 1366×768
display, and the MacBook Pro a 1280×800 one, and the
diagonal of a display is sort of meaningless without an
indication of the aspect ratio
, as the more
extreme the latter, the smaller a screen is with the diagonal
being the same.
Indeed I now own a Toshiba U300 13in laptop and a Toshiba L630 14in laptop and they are essentially the same size, as the U300 is deeper and the L630 is wider.
The 13in screen measures 285mm×178mm and the 14in screen 292mm×165mm. The 13in screen at 507cm2 is actually bigger than the 14in screen at 482cm2, because it has a less skewed aspect ratio.
By switching from 16:12 (that is 4:3) to 16:10 and then to 16:9 aspect ratios display manufacturers have been able to sell smaller displays while quoting the same diagonal sizes, and many people seem to have not realized that.
I don't like the currently popular 16:9 aspect ratio as it is too skewed, and I prefer taller screens because most of my computer activity is editing text files and documents and reading mostly text based web pages, and for all these vertical context is rather more important than width.
Indeed for text long lines seem significantly less readable than shorter lines, and I think that there is a consensus that text document width should be around 60-70 characters, or in most Indo-European languages around 10-15 words.
That is part of the reason why I have a 24in 16:10 screen with a height of 1200 pixels even if I can work with one of the often cheaper 23in 16:9 screens with a height of 1080 pixels that are popular today. The latter are really comparable to 21.5in 16:10 screens because of the skewed aspect ratio.
I don't particularly need the 1920 pixel width, and I would be satisfied with a 1600×1200 pixel size, as a width of 1600 pixels still allows two text pages to be seen side-by-side. But that is the old 16:12 aspect ratio that is no longer popular.
I have tried at work 30in 2560×1440 640mmx400mm displays and they were almost too big to use, in particular from a typical desk viewing distance it was not easy to see the whole screen at once, and sometimes I had to push myself back; I have even tried for some time two of these displays side by side and I almost never looked at the second display as it was too far to the side.
Writing about some Mac OSX CUPS and Mac OSX X11 fonts issues reminded me that Mac OSX is really just a UNIX-like operating system with its own GUI and Objective-C based application libraries. This then reminded me of a recent interview with Professor Andrew Tanenbaum.
The interview has many interesting aspects, for example the
mention of his long forgotten
Amoeba kernel
and more in general of capability
system
architectures that many think were invented by Jack
Dennis in 1967.
But the relevant one is that he points out that since Mac OSX is largely based on Mach and FreeBSD in effect it is the second most popular kernel and operating system after MS-Windows NT, with an installed base several times larger than GNU/Linux:
LinuxFr.org : Do you think the Linux success is a proof he was right or is it unrelated?
Andrew Tanenbaum : No, Linux "succeeded" because BSD was frozen out of the market by AT&T at a crucial time. That's just dumb luck. Also, success is relative. I run a political website that ordinary people read. On that site statistics show that about 5% is Linux, 30% is Macintosh (which is BSD inside) and the rest is Windows. These are ordinary people, not computer geeks. I don't think of 5% as that big a success story.
Mach actually is also a microkernel, but the version of Mach used as the foundation of the Mac OSX kernel was fairly monolithic. However he also adds that the (mostly German) L4 research microkernels are deployed on several hundred million mobile phones based on some Qualcomm chipsets.
But then if embedded systems matter as to league tables QNX kernel has been a rather popular embedded microkernel for decades.
Linux as an operating system kernel is nothing new or special, incorporating almost only technology from the 1960s and 1970s, and several aspects of it seem to have been designed without any reference to decades of previous operating systems research, reinventing issues and limitations that were solved long ago.
Its main claim to fame is that like MS-Windows NT it works adequately for most purposes and is still mostly maintainable; not that it does things well or in an advanced way. Indeed I use it mostly opportunistically, because it is a better implementation of most of the UNIX architecture than MS-Windows NT, and it is popular enough that it is more widely supported.
I was asked by my friend with Mac OSX™ some time ago to look into why some programs, including its text editors, would hang during startup. To my surprise this was because they were trying to contact the print dæmon cupsd and had a fairly long timeout for a reply.
Even more surprisingly the print dæmon would not start because the relevant file /usr/sbin/cupsd was not there, even if the rest of the CUPS subsystem files were there.
The first thing was to restore the file. On a GNU system based on a nice package manager like RPM (or even a less nice one like DPKG) I could just first verify the presence and content of the relevant package files, then extract the file and put it in place. Apparently Mac OSX is somewhat MS-Windows like in that it has just installers and lacks the ability to verify installed files or to work with installation files.
But I found a shareware tool, Pacifist, that fills that gap. It then turns out that bizarrely Mac OSX is installed from a very small number (two IIRC) of very large packages, instead of the more fine grained approach used by most GNU distributions, and they are on the installation DVD.
Using Pacifist it was easy to recover /usr/sbin/cupsd and the owner of the system told me that fixed some other glitches and delays. It looks like that either the GUI libraries or most applications open a connection to the print server during startup, which is moderately strange.
It could be instead that recent versions of Mac OSX use launchd to instantiate server dæmons, and this meant that the connection to the CUPS port succeeded, because it was held open by launchd, and then there was no response from the missing cupsd and thus a long wait.
Some time ago I have been asked to look into why some programs would not start on a GNU/Linux server via NX when started from a Mac OSX client but would when started from MS-Windows. The reason was the lack of some fonts from the Mac OSX emulation of X-Windows. It turns out that the X11 emulator uses the standard OSX fonts (in /System/Library/Fonts/ and /Library/Fonts/), plus those in per-user directories ($HOME/.fonts/ and $HOME/Library/Fonts/), and is missing some of the more common MS-Windows and Linux fonts.
It is sufficient to add them in the usual ways to the relevant Mac OSX font directory (global or per user) and then to be sure to use xset fp+ to add them to the X font path, and $HOME/.fonts/fonts.conf to add them to the FontConfig font paths.
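A minimal sketch of doing that for a per-user directory (the font file name is hypothetical):
# copy the missing font into a per-user font directory
mkdir -p $HOME/.fonts && cp SomeFont.ttf $HOME/.fonts/
# index it and add it to the core X font path
mkfontscale $HOME/.fonts && mkfontdir $HOME/.fonts
xset fp+ $HOME/.fonts
xset fp rehash
# and refresh the FontConfig caches
fc-cache $HOME/.fonts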
I was discussing by email compilation techniques and in particular that both interpretation and compilation are reductions, and I made the example of the UW Lisp compiler for the UNIVAC 1100 mainframe series, which has been for a long time a source of enlightenment and admiration for me.
I found some of the printouts of its sources and, hoping against all hope, I searched for some of the lines in those sources and I was delighted to find that the full manual and source for UW Lisp for the UNIVAC 1100 have been put online by someone to whom I am grateful.
UW Lisp is notable for the considerable terseness and elegance of both its language and implementation. The assembler source of the interpreter and runtime system is 5,000 lines long, and among other modules the source of the very powerful structure editor is 120 lines long, and the full source of the compiler to machine code is 720 lines long (at least in the version I have, newer versions are a bit longer).
But then I am reminded that UNIX version 7 would support 3 interactive users in 128KiB of memory, and dozens in 1-2MiB, doing software development with VI and compiling with Make.
Having just published a long delayed entry about an excellent page about find I was reminded of a long forgotten utility which is a much better alternative to find, called rh, written over twenty years ago by Ken Stauffer.
It is ridiculously better than find. Some examples of search expressions from its manual page:
EXAMPLES
    The following are examples of rh expressions.

    (mode & 022) && (uid == $joe);
        Matches all files that have uid equal to username ``joe'' and are
        writable by other people.

    !uid && (mode & ISUID) && (mode & 02);
        Matches all files that are owned by root (uid==0) and that have
        set-uid on execution bit set, and are writable.

    (size > 10*1024) && (mode & 0111) && (atime <= NOW-24*3600);
        Finds all executable files larger than 10K that have not been
        executed in the last 24 hours.

    size < ( ("*.c") ? 4096 : 32*1024 );
        Finds C source files smaller than 4K and other files smaller than
        32K. No other files will match.

    !(size % 1024);
        Matches files that are a multiple of 1K.

    mtime >= [1982/3/1] && mtime <= [1982/3/31];
        Finds files that were modified during March, 1982.

    strlen >= 4 && strlen <= 10;
        This expression will print files whose filenames are between 4 and
        10 characters in length.

    depth > 3;
        Matches files that are at a RELATIVE depth of 3 or more.

    ( "tmp" || "bin" ) ? prune : "*.c";
        This expression does a search for all "*.c" files, however it will
        not look into any directories called "bin" or "tmp". This is
        because when such a filename is encountered the prune variable is
        evaluated, causing further searching with the current path to
        stop. The general form of this would be:

        ("baddir1" || "baddir2" || ... || "baddirn") ? prune : <search expr>
Some of its advanced examples:
ADVANCED EXAMPLES
    The following examples show the use of function definitions and other
    advanced features of Rh. Consider:

    dir() { return ( (mode & IFMT) == IFDIR ); }

    This declares a function that returns true if the current file is a
    directory and false otherwise. The function dir now may be used in
    other expressions.

    dir() && !mine();

    This matches files that are directories and are not owned by the user.
    This assumes the user has written a mine() function. Since dir and
    mine take no arguments they may be called like:

    dir && !mine;

    Also when declaring a function that takes no arguments the parenthesis
    may be omitted. For example:

    mine { return uid == $joe; }

    This declares a function mine, that evaluates true when a file is
    owned by user name 'joe'. An alternate way to write mine would be:

    mine(who) { return uid == who; }

    This would allow mine to be called with an argument, for example:

    mine( $sue ) || mine( $joe );

    This expression is true of any file owned by user name 'sue' or 'joe'.
    Since the parenthesis are optional for functions that take no
    arguments, it would be possible to define functions that can be used
    exactly like constants, or handy macros. Suppose the above definition
    of dir was placed in a users $HOME/.rhrc Then the command:

    rh -e dir

    would execute the expression 'dir' which will print out all
    directories. Rh functions can be recursive.
Some examples of canned functions in my own ancient .rhrc:
KB { return 1<<10; }
MB { return 1<<20; }
avoid(p) { return (p) ? prune : 1; }
months { return 30*days; }
ago(d) { return NOW-d; }
modin(t) { return mtime > (NOW-t); }
accin(t) { return atime > (NOW-t); }
chgin(t) { return ctime > (NOW-t); }
modaft(t) { return mtime > t; }
accaft(t) { return atime > t; }
chgaft(t) { return ctime > t; }
sg { return mode & 02000; }
su { return mode & 04000; }
si { return mode & 06000; }
w { return mode & 0222; }
x { return mode & 0111; }
t(T) { return (mode&IFMT) == T; }
f { return (mode&IFMT) == IFREG; }
s { return (mode&IFMT) == IFLNK; }
d { return (mode&IFMT) == IFDIR; }
l { return nlink > 1; }
mine { return uid == $$; }
c { return "*.c" || "*.h"; }
cxx { return "*.C" || "*.H"; }
Z { return "*.Z"; }
W { return "*.W"; }
gz { return "*.gz"; }
junk { return "core" || ("a.out" && mtime <= ago(2*days)) || "*.BAK" || "*.CKP" || "*[~#]"; }
Belatedly I have discovered that the GPG2 agent can handle SSH keys too and be a drop-in replacement (when the enable-ssh-support option is enabled) for the OpenSSH agent.
However I was a bit surprised by it because it is somewhat rough both in design and implementation and also because it works fairly differently from the OpenSSH agent:
Starting with these prompt programs the GPG2 agent has some issues (at least on version 2.0.14 on ULTS10):
Overall the GPG2 agent is a good substitute for the OpenSSH agent, and I am using it by default.
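For reference, the way I have enabled it (the usual GnuPG 2.0 paths; the key file name is hypothetical):
# enable SSH support in the agent configuration
echo enable-ssh-support >> $HOME/.gnupg/gpg-agent.conf
# start the agent so that it exports SSH_AUTH_SOCK for the session
eval $(gpg-agent --daemon --enable-ssh-support)
# add keys to it just as with the OpenSSH agent
ssh-add $HOME/.ssh/id_rsa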
Because of pressure from other engagements I have not been updating my blog or notes as frequently as I wanted, and I have accumulated quite a few draft updates, ranging from fairly complete to just pointer to interesting bits I wanted to write about.
This has made me question my blogging practices, because it means that while I have had time to jot down quick notes and drafts I haven't had the time to publish them as blog entries.
Part of the reason is that I have been striving to keep some standards in my publishing, for example in a previous entry I have remarked about the usefulness of fine tagging of hypertext and I try to enrich my published entries with numerous links to things I mention. I have realized that the latter is one of the things that takes most of my time in publishing my notes, because searching for the most opportune link takes time, even with a faster internet link and computer, in part because there is some kind of feedback that makes other publishers build more complicated pages to use up technology improvements, which is similar to what has happened in applications.
The other main reason is that sometimes I discuss non trivial issues, and these require something more involved than just typing in words, like research and thought and revisions.
While I don't want to descend to the level of the bloggers at some major sites who publish word dumps or just reblog other people's news with little comment, I have decided to try to simplify my blog entries a bit by making them less rich in hyperlinks and a bit closer to jottings than essays. This is sad because after all hypertext is made more valuable by hyperlinks, so as a compromise I'll actually write the places where I would like to put an additional hyperlink as anchors, but leave the link unwritten, to be occasionally updated later.
I think that I will continue to use fine grained tagging because I am fairly quick at it.
Another small change is in the organization, as I have realized that I must make a bigger distinction between time dependent content and content that is time independent or has an indefinite currency. The example I have in mind is reviews, which usually are fairly perishable as the products they describe disappear from the market fairly quickly thanks to constant updates; similarly for shopping notes. This means that I will structure content on my site as two blogs, one being this technical blog and another (yet to be set up) for non technical matters (very few on this site), and I will keep existing pages for the less time-dependent essays and information, for example the always popular Linux ALSA and Linux fonts notes.
Following the above I have been fleshing out for publication several items in my long list of blog entries to write, and these will appear in the site syndication feed usually with a date in the past, that being when I made the note, as that's the date of currency of the information in those entries.