This document contains only my personal opinions and calls of judgement, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.
Note: since this file has grown too big, I have switched to a file per quarter scheme. I have already modified all the links in this page to point to the new files into which it has been split.
/proc/sys/vm/page-cluster really behaves I have
had a look in the Linux kernel sources and found these
astonishing bits of code:
    /* Use a smaller cluster for small-memory machines */
    if (megs < 16)
            page_cluster = 2;
    else
            page_cluster = 3;

    int valid_swaphandles(swp_entry_t entry, unsigned long *offset)
    {
            int ret = 0, i = 1 << page_cluster;
            unsigned long toff;
            struct swap_info_struct *swapdev = swp_type(entry) + swap_info;

            if (!page_cluster)      /* no readahead */
                    return 0;
            toff = (swp_offset(entry) >> page_cluster) << page_cluster;
            if (!toff)              /* first page is swap header */
                    toff++, i--;
            *offset = toff;

            swap_device_lock(swapdev);
            do {
                    /* Don't read-ahead past the end of the swap area */
                    if (toff >= swapdev->max)
                            break;
                    /* Don't read in free or bad pages */
                    if (!swapdev->swap_map[toff])
                            break;
                    if (swapdev->swap_map[toff] == SWAP_MAP_BAD)
                            break;
                    toff++;
                    ret++;
            } while (--i);
            swap_device_unlock(swapdev);
            return ret;
    }

and both make me feel sick and depressed (why is left as an exercise to the reader :-)).
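To make the behaviour of the function quoted above easier to follow, here is a minimal Python model of it. It is only a sketch: the swap map is assumed to be a plain list of use counts, `len(swap_map)` stands in for `swapdev->max`, the value chosen for `SWAP_MAP_BAD` is just a placeholder marker, and locking is left out entirely.

```python
# Model of the readahead window computed by valid_swaphandles().
# Assumptions: swap_map is a list of use counts (0 means free),
# len(swap_map) plays the role of swapdev->max, and SWAP_MAP_BAD
# is a placeholder marker value for this sketch.
SWAP_MAP_BAD = 0x8000


def valid_swaphandles(swap_map, entry_offset, page_cluster):
    """Return (pages_to_read, first_offset) for a fault at entry_offset."""
    if page_cluster == 0:              # no readahead
        return 0, None
    remaining = 1 << page_cluster      # cluster size in pages
    # Round the faulting offset down to the start of its aligned cluster.
    toff = (entry_offset >> page_cluster) << page_cluster
    if toff == 0:                      # first page is the swap header
        toff += 1
        remaining -= 1
    first = toff
    count = 0
    while remaining > 0:
        if toff >= len(swap_map):      # past the end of the swap area
            break
        if swap_map[toff] == 0 or swap_map[toff] == SWAP_MAP_BAD:
            break                      # free or bad page ends the window
        toff += 1
        count += 1
        remaining -= 1
    return count, first


if __name__ == "__main__":
    swap_map = [0] + [1] * 15          # slot 0 is the header, 1..15 in use
    swap_map[10] = 0                   # one free slot in the second cluster
    for offset, cluster in ((5, 3), (9, 3), (11, 3), (9, 0)):
        print(offset, cluster, valid_swaphandles(swap_map, offset, cluster))
```

In this model the window always starts at the aligned beginning of the cluster and stops at the first free or bad slot, so a fault on slot 11 with slot 10 free yields a window that does not even include the faulting page.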
/proc/sys/vm/page-cluster to 0
instead of the usual default of 3.

CATALOG
and also XML CATALOG, and template headers for RSS files:
    <?xml version="1.0"?>
    <!DOCTYPE rss PUBLIC "-//IDN Netscape.com/DTD RSS 0.91//EN"
      "http://My.Netscape.com/publish/formats/rss-0.91.dtd">
    <rss version="0.91">
    </rss>

    <?xml version="1.0"?>
    <!DOCTYPE rss PUBLIC "-//IDN Silmaril.IE/DTD RSS 2.0//EN"
      "http://WWW.Silmaril.IE/software/rss2.dtd">
    <rss version="2.0">
    </rss>

The RSS 0.91 DTD seems mostly fine, but the RSS 2.0 DTD is not quite right, as it is based on the idea that

"A channel can apparently either have one or more items, or just a title, link, and description of its own"

which is not quite correct, as an authentic sample attests.
title,
description and link subelements of
item that is also quite wrong.
channel and item as follows:
<!ELEMENT channel
((title|link|description)+,
(language|copyright
|managingEditor|webMaster|pubDate|lastBuildDate
|category|generator|docs|cloud|ttl|image
|textInput|skipHours|skipDays)*,
item+)>
<!ELEMENT item
((title|link|description)+,
(author|category|comments|enclosure|guid|pubDate|source)*)>
The customary order is title, link,
description, but the definitions above leave that
unenforced as long as they precede all other subelements.
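The content models above can be tried out without a validating parser. The following is a rough sketch, not a DTD validator: it assumes that turning each content model into a regular expression over the sequence of child element names is a fair approximation, and the function names are made up for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Each content model becomes a regular expression over "name " tokens.
HEAD = "(?:title|link|description)"
CHANNEL_REST = ("(?:language|copyright|managingEditor|webMaster|pubDate"
                "|lastBuildDate|category|generator|docs|cloud|ttl|image"
                "|textInput|skipHours|skipDays)")
ITEM_REST = "(?:author|category|comments|enclosure|guid|pubDate|source)"

CHANNEL_MODEL = re.compile(rf"(?:{HEAD} )+(?:{CHANNEL_REST} )*(?:item )+")
ITEM_MODEL = re.compile(rf"(?:{HEAD} )+(?:{ITEM_REST} )*")


def child_names(elem):
    """Child element names in document order, each followed by a space."""
    return "".join(child.tag + " " for child in elem)


def channel_matches_models(rss_text):
    """True if the channel and all its items follow the models above."""
    channel = ET.fromstring(rss_text).find("channel")
    if channel is None:
        return False
    if not CHANNEL_MODEL.fullmatch(child_names(channel)):
        return False
    return all(ITEM_MODEL.fullmatch(child_names(item))
               for item in channel.findall("item"))


if __name__ == "__main__":
    sample = ("<rss version='2.0'><channel>"
              "<link>http://example.org/</link><title>t</title>"
              "<description>d</description><language>en</language>"
              "<item><title>i</title></item>"
              "</channel></rss>")
    print(channel_matches_models(sample))
```

The sample puts `link` before `title` on purpose: as noted above, the order among the three leading elements is not enforced, only that they come before everything else.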
    <?xml-stylesheet type="text/css" href="style/rss.css"?>

The RSS JavaScript helper turns the link elements into active, clickable
links (only in Mozilla and Firefox), and to enable such transformations
this element should be added as the last one in the body of the RSS file:
<script xmlns="http://www.w3.org/1999/xhtml" type="text/javascript"
src="style/rss.js"></script>
dd and using heavily a JFS
partition at the same time. This did not use to happen before
I switched to JFS, even if in normal interactive use by itself
JFS seems now
very reliable.
malloc() tunables
I also had a look at the
kernel tunables for memory allocation and swapping.
I set /proc/sys/vm/swappiness long ago to be way
lower than the default, to 40, as the buffer
cache does not work very well, because it uses LIFO policies
when most accesses are FIFO, and tragically file (and memory)
access pattern advising is not implemented. The result is good.
/proc/sys/vm/page-cluster is set to
0 because prefetching and/or large pages (even
worse) are a
very bad idea.
Well, the bad news and the good news are:
Sarge edition to allowing
Etch (that is currently
Sid) which involves some wrenching ABI transitions. One of these transitions is from GNU LIBC 2.3.2 to 2.3.5 and this causes trouble.
malloc checks, and therefore some applications
with previously undetected bugs now crash, and this happens
right at the time of the installation of updated packages.
MALLOC_CHECK_
to
0 (or 1),
which of course is a bit sad.
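As an illustration of the workaround, the variable can be set for a single misbehaving program instead of the whole session. This is a minimal sketch: the helper names are invented here, and the same mechanism is assumed to work for any other GNU LIBC `MALLOC_*_` variable passed as a keyword argument.

```python
import os
import subprocess
import sys


def malloc_env(check=0, base_env=None, **tunables):
    """Copy of the environment with MALLOC_CHECK_ (and extras) set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MALLOC_CHECK_"] = str(check)
    for name, value in tunables.items():   # e.g. MALLOC_TOP_PAD_=65536
        env[name] = str(value)
    return env


def run_with_malloc_check(argv, check=0, **tunables):
    """Run one command with the relaxed setting, leaving others alone."""
    return subprocess.run(argv, env=malloc_env(check, **tunables),
                          check=False)


if __name__ == "__main__":
    # Harmless demonstration: the child just reports what it was given.
    run_with_malloc_check(
        [sys.executable, "-c",
         "import os; print(os.environ['MALLOC_CHECK_'])"],
        check=1)
```

Limiting the setting to one command keeps the new checks active for everything else, which seems preferable to exporting it globally.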
MALLOC_TOP_PAD_: if the heap has to be grown
at its end, add this much to the allocation in bytes.

MALLOC_TRIM_THRESHOLD_: if the heap has
more than this many free bytes at its end, shrink it.

MALLOC_MMAP_MAX_: maximum number of blocks
to allocate via mmap.

MALLOC_MMAP_THRESHOLD_: blocks of this size
or larger (in bytes) are allocated via mmap.

mmap allocations
which reveals that handling larger allocations via
mmap can be very expensive if these allocations
are short lived. Similarly for reducing the heap size when its
top can be freed.

dmix plugin to use it directly.
-o noatime
crashes, which recurred, are due to
non-JFS issues.
Also, this
JFS quick patch
seems to have removed one cause of trouble.
ext3 filesystems to
JFS to see how JFS performance degrades with usage, after the
sevenfold slowdown over time
shock of ext3. It is obvious that virtually all
file system tests and benchmarks happen on a freshly loaded
filesystem, so there is very little incentive for file system
authors to reduce performance degradation over time, all that
matters is performance on a fresh load.
ext3; but
this may be because it is a freshly loaded filesystem, or
because of the switch from 4KiB to 1KiB blocks, rather than
because of better latency or performance for JFS as such.
ext3 filesystem, and I have
used that for a day. It feels better than the well used
ext3 1KiB filesystem I had originally, but my
impression is that it is not quite as good as JFS, perhaps
because some operations that take time involve
directory scanning, and I have not enabled indexed directories
under ext3, but of course all non trivially small
JFS directories are indexed.
mmap related
well a freshly loaded ext3 with 4KiB blocks seems
to behave like JFS. It can be that I am seeing things that are
not there, or that the real issue is to have filesystem block
size equal to page size, in which case perhaps the Linux
kernel does some special mmap optimization.

hdparm -t I have to set them up with a soft
readahead of 32 sectors or more; 16 or less causes a huge falloff
in the reported speed, for example for my /dev/hda (a
WD 80GB 7200 unit) from 40MiB/s to 13MiB/s, one third (with a
readahead set to 24 sectors it gets to 24MiB/s). Now that
looks related to back-to-back transfers, probably because of
the firmware in the unit, as the others are slower with a
smaller readahead but not as much.
ext3 evolution
that as of Linux 2.6.10 ext3 locking is rather
less coarse than before, which should help a lot with
scalability to
highly parallel benchmarks.
ext3. From the paper
I learned that the indexed directories in recent
ext3 versions use indexes carefully designed to be
on top of an unmodified directory data format, so even if the
index is corrupted the directory is still readable.
ext3 should be
to be itself, not to mutate into something else. But then
it may be a case of job protection: if somebody's job title is
to be an ext3 developer it may be hard to talk
oneself out of a salary by saying that things are fine as they
are; the same logic as the constant
"innovation" in marketing or pricing plans: if your job is marketing manager or pricing manager, it may be counterproductive to tell your boss that the current marketing campaign or price plan is just fine.
pmap I have checked that the anonymous
mappings do indeed shrink.
ext3 with 1KiB blocks.
I had often wondered just how on a CPU with 4KiB pages was
mmap dealing with files broken in 1KiB blocks,
and now I guess that:
mmap does not deal very well with non
default block sizes that are smaller than the size of a
page.

mmap sometimes cannot deal at all with
file blocks being smaller than pages then file IO
will fall back to read and write
via the buffer cache. Perhaps it turns out that this is
not well tuned because most people use page-sized file
system blocks.

"large" allocations into their own memory segments, and indeed
pmap shows quite a few 4KiB, 8KiB and 12KiB
anonymous mappings in existence for Konqueror (but there
is a single 59KiB anonymous mapping, presumably the main
allocator arena).

mmaped into memory on exec,
the alternative is to read them into memory
where they get reblocked into 4KiB pages that then get
swapped out. The dreadful suspicion is:

mmap'ed
executables, which means all processes running the same
executable share the same pages (minus the copy-on-write
ones), whether or not they descend from a common
ancestor;

read
on exec that creates a distinct copy.

root file system into first a newly loaded
ext3 file system with 1KiB blocks and then
4KiB blocks to be doubly sure.
ext3 with 1KiB may
not be totally reliable, because my new JFS root and its fresh
ext3 1KiB give:
| File system | JFS | ext3 1KiB | ext3 4KiB |
|---|---|---|---|
| hdc1 (8032MiB) | 7220 Used 781 Available | 6495 Used 1061 Available | 7230 Used 327 Available |
root file system
(which in my case includes /var and thus the
Squid cache, as well as the library headers etc.) contains a
very large percentage of small files, and thus the loss due to
the larger block size is greater than the gain on
metadata; this seems reasonable as ext3 with 4KiB
blocks has about the same space used as JFS which also has
4KiB blocks, but a lot less space available.
ext3 with 4KiB blocks was
40-50% faster than JFS, which is slightly
puzzling.

ext3 ones to JFS; and I got my first metadata
corruption (in the dtree of a directory) when unpacking a file from a FAT32 file system into a newly formatted JFS one, and this after the crash with
noatime a while ago. It may be it is not
the JFS code after all: it could be some dodgy code somewhere
else overwriting things where it should not, after all I am
using a bleeding edge 2.6.13 kernel.
memtest86
overnight and no problems were reported. I am also quite sure
that CPU etc. temperatures are low, and I have a superstable
550W power supply
which is wildly overspecified for my box.
df -m:
| File system | ext3 | JFS |
|---|---|---|
| hdc6 (4016MiB) | 3332 Used 441 Available | 3381 Used 620 Available |
| hdc7 (24097MiB) | 20944 Used 1839 Available | 21164 Used 2901 Available |
| hdc8 (9028MiB) | 7997 Used 558 Available | 8098 Used 900 Available |
ext allocates most statically and usually
one makes sure it is overallocated.

ext3 allocates
dynamically, the indirect blocks in the file space tree,
JFS uses extents instead.

ext3 even when the latter has a smaller block
size.

ext3 with a
larger block size might take less space than with a
smaller block size, because the internal fragmentation at
the tail is less important, and many fewer indirect blocks
are needed because the file is chunked in many fewer data
blocks, or in other words a lower number of bigger
fixed size extents.
ext3 file
system, as part of it will be taken by the metadata of
newly added files, which is mostly preallocated for
ext3.

ext3 speed, it is
likely that it also increases the available space under
JFS, as files that previously needed several extent
descriptors end up in a single extent.

ext3
should not be fully allocated, its available space should
never fall too low; indeed I think that the default 5%
reserve is way too low, considering the
sevenfold slowdown over time
possible. The same probably applies to JFS, and doubly so,
as a larger available free space reserve raises the
chances that longer contiguous extents are found, and
therefore that both speed and space occupied are
better.

fsck (there are others, but minor -- I hope).
root filesystem, which is around 7GiB in size, and has been rather thoroughly mixed up by upgrades, spool work and so on, copy it to a quiescent disc first block-by-block, then file-by-file (thus in effect optimally laying it out again), both as
ext3 with 1KiB
block size (which is also its current setup) and as JFS.
Then to apply the read/find/remove
tests and a new fsck test.
tar file on a vfat partition to a
JFS partition just got hung, and I am about to reboot. Perhaps
the combination of FAT32 and JFS transfers has not been used
much...
| File system | Repack | Find | fsck | Notes |
|---|---|---|---|---|
| used ext3 1KiB | 64m10s 81s | 06m43s 06s | 06m44s 04s | 13% non contiguous |
| new ext3 1KiB | 09m12s 74s | 03m03s 03s | 04m31s 04s | 1% non contiguous |
| new JFS 4KiB | 11m56s 64s | 02m50s 05s | 02m14s 04s | 558MiB free instead of 829MiB |
| new ReiserFS 4KiB notail | 26m53s 70s | 05m34s 06s | 02m34s 16s | 1293MiB free instead of 829MiB |
root one (around 420k between files and directories, and 6.7GiB of data), and I have copied it for each test to an otherwise quiescent disc, first with
dd to get it as-is, highly
used, and then I reformatted the partition and used
tar to copy it again file-by-file to get
a neatly laid out version. For the sake of double checking I
then rebooted into the newly created partition and reran the
same tests on the original file system, and the results were
coherent with those above (the exception is that they were
around 25% lower, as the original disc is 7200RPM vs. 5400RPM
and so on).
fsck)
the newly loaded version is roughly twice as fast; but for
reading all the data it is seven times faster. To me
this indicates that metadata (directories, inodes) is fairly
randomized even in a freshly loaded version (and indeed running
vmstat 1 shows very low transfer rates, and the
disc rattles with seeking), but data is laid out fairly
contiguously. But after repeated package upgrades and the like
the data also becomes rather randomized, and indeed this is
also borne out impressionistically by looking at the output
of vmstat 1 and the rattling of the disc (a lot less).
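The ratios mentioned above can be rechecked from the figures in the table. This sketch assumes that the first figure in each table cell is the elapsed time, and only uses the two ext3 rows; the names are chosen just for this example.

```python
import re

# Elapsed times copied from the used and freshly loaded ext3 1KiB rows.
ELAPSED = {
    "used ext3 1KiB": {"repack": "64m10s", "find": "06m43s", "fsck": "06m44s"},
    "new ext3 1KiB": {"repack": "09m12s", "find": "03m03s", "fsck": "04m31s"},
}


def seconds(elapsed):
    """Convert a string like '64m10s' to a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", elapsed)
    if match is None:
        raise ValueError(f"unexpected time format: {elapsed!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def slowdown(test):
    """How many times slower the well used filesystem is for one test."""
    used = seconds(ELAPSED["used ext3 1KiB"][test])
    new = seconds(ELAPSED["new ext3 1KiB"][test])
    return used / new


if __name__ == "__main__":
    for test in ("repack", "find", "fsck"):
        print(f"{test}: {slowdown(test):.2f}x slower when well used")
```

Reading all the data (repack) comes out at just under seven times slower, while the metadata-bound find is a little over twice as slow.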
root one to JFS, and then in a month or two check out how much it degrades after the usual frequent package install and upgrade that I do. With JFS the speed as freshly loaded is a bit slower or a bit faster than for
ext3 freshly loaded, but there is an
extra 5% of space used as JFS uses 4KiB blocks instead of 1KiB
(and there are lots of small files in a root file system).
ext3 or JFS). No
Reiser4
data out of arbitrary laziness (it still needs to be manually
patched into the kernel).
ext3
with time the layout becomes rather fragmented, with extremely
large impact on performance in at least some cases. The cost
of seeking is so large that a rise in the non-contiguous
percentage reported by fsck.ext3 from 1% to 13%
involves a sevenfold decrease in sequential reading performance.

"straightened out" by dumping them to something and then copying them back file by file.
"However, the disk hasn't actually written out its cache yet. It lied to the OS / file system and said it had, but it hasn't, it's busy doing something else. Poof, the power goes out.

Now, the journal doesn't have our data, we've already cleared it out, and the file system, which is supposed to have been coherent because we fsynced it, is not, and it is now corrupted.

I have reproduced this behavior a dozen or more times on IDE based systems. The only way to stop it is to tell the drive to stop using it's write cache."

A while ago I had mentioned similar gossip and then added
"flush the buffer cache more frequently" (the kernel one) as a possibly useful palliative.
ext3 allow tuning the
flushing frequency (which is also useful for laptops, where
one wants to make it less frequent); JFS does not, and XFS has
a policy of doing it as rarely as possible, which they call
"delayed allocation" because it raises the chances of being able to allocate a large contiguous extent, and to write it in a single block IO.
ext3 FAQ
2004-10-14.

"I've been using ReiserFS _EXCLUSIVELY_ since about 2.4.11 and I've never had a single problem. It's important to format with the defaults and not specify 'special' arguments to mkreiserfs or you can run into trouble."

which is a classic case of the
"social" way of defining that a program
works: it works if most users do not run into bugs. Usually such programs are misdesigned and misimplemented, so that they mostly do not work, and sometimes (usually only for a demo to the boss) they seem to work. Then the bugs most complained about get fixed, and that's it.

"social" definition of working, in 2.6.13 the XFS code crashes for blocksizes of 1024, the JFS code crashes if a JFS filesystem is mounted with
-o noatime, and UDF if one deletes files.
-o noatime and indeed some operations like long
searches are faster, even if not dramatically. This is as
expected, because each directory traversal and file read
generates by default an access time update, that has to be
journaled, and the journaling involves locking etc., and
-o noatime avoids all of that. Probably the benefit
is much larger on parallel systems.
ext3) that has full bad block handling.
By contrast ext3 duplicates the superblock many
times, and divides the disk into several semi independent
cylinder groups.

| Feature | ext3 | JFS | XFS |
|---|---|---|---|
| Block sizes | 1024-4096 | 4096 | 512-4096 |
| Tunable commit interval | yes | no | no |
| Supports VFS lock | yes | yes | yes |
| Has own lock/snapshot | no | no | yes |
| Small data in inodes | no | some | auto |
| fsck speed | slow | ? | fast |
| Redundant metadata | yes | yes | ? |
| Bad block handling | yes | mkfs only | no |
| Supported by GRUB | yes | yes | mostly |
| Names | 8 bit | UTF-16 or 8 bit | 8 bit |
| noatime | yes | yes | yes |
| sync | yes | no | no |
| O_DIRECT | ? | ? | yes |
| DMAPI | no | patch | option |
| Quotas | both | patch | both |
| Max fs size | 2-8TiB | 32PiB | 18EB |
| Max file size | 1-4TiB | 4PiB | 9EB |
| Max files/fs | 2^32 | 2^32 | 2^32 |
| Max files/dir | 2^32 | 2^31 | 2^32 |
| Max subdirs/dir | 2^15 | 2^16 | 2^32 |
| Number of inodes | fixed | dynamic | dynamic |
| Resize journal | offline | ? | offline |
| Journal on another partition | yes | yes | yes |
| Journals data | option | no | no |
| Journal disabling | yes | yes | no |
| Case independent | no | option | option |
| Can grow | online | online only | online only |
| Can shrink | no | no | no |
| Journals what | blocks | operations | operations |
| Journal size | fixed | fixed | grow/shrink |
| Indexed dirs | option | auto | yes |
| Quotas | both | both | both |
| EA/ACLs | both | both | both |
| Special features or misfeatures | In place convert from ext2. MS Windows drivers. | Case independent option. Low CPU usage. DCE DFS compatible. OS2 compatible. | Real time (streaming) section. IRIX compatible. Very large write behind. Superblock on block 0. |
.tar.gz of a SUSE 9.3 root filesystem
(3132712960 bytes uncompressed, 173759 entries), chosen
because it contains a lot of small files and a number of
fairly large files.

tar-ing it, finding a file based on a
non-name property, and deleting all files in the
filesystem.

| Filesystem | Code size | Free after mkfs | Restore | Free after restore | Repack | Find | Delete | Free after delete |
|---|---|---|---|---|---|---|---|---|
| ext3 1KiB | 195,163B | 3770MiB | 5m22s 0m39s | 783MiB | 3m03s 0m27s | 0m47s 0m01s | 2m12s 0m06s | 3770MiB |
| ext3 4KiB | 195,163B | 3747MiB | 5m06s 0m30s | 454MiB | 2m39s 0m25s | 0m38s 0m01s | 1m19s 0m04s | 3747MiB |
| JFS 4KiB | 189,084B | 4000MiB | 5m38s 0m31s | 683MiB | 3m46s 0m21s | 1m01s 0m03s | 2m44s 0m05s | 3988MiB |
| XFS 4KiB | 549,809B | 4007MiB | 5m05s 0m56s | 720MiB | 3m50s 0m35s | 0m44s 0m26s | 1m41s 0m27s | 3923MiB |
| UDF 2KiB | 72,157B | 4016MiB | 10m45s 1m07s | 768MiB | 2m55s 1m02s | 1m40s 0m34s | n.a. | n.a. |
cfq elevator show slightly increased
elapsed times, not surprising as anticipatory
optimizes throughput at the expense of latency, and
vice versa for cfq.

ext3 were with
data=writeback, but tests showed that with
data=ordered the restore took only a little
more time, so the latter, which is the default, is good.
With data=journal the restore took 40% more
time.

    # gunzip -dc /tmp/SUSE.tar.gz | (time tar -x -p -f -)
    # (time tar -c -f - .) | cat > /dev/null
    # (time find * -type d -links +500)
    # (time rm -rf *)

and the execution of each was preceded by unmounting, flushing the buffer cache, and remounting; for restoring, the archive to be restored was on a different drive.
mkudffs supports block sizes of 1KiB, 2KiB
and 4KiB, but the udf system module only
supports 2KiB.

ext3 for a small desktop machine seems
the best bet overall, and the choice is between a
somewhat faster 4KiB block and a rather more space
efficient 1KiB version. Considering the availability of a
lot of tools (including MS Windows drivers) for
ext[23] that are not available for
other filesystem types, this impression is reinforced.
ext3 with 1KiB
blocks (still fast, saves a fair bit of space), or to JFS for
more demanding environments or for filesystems exported with
Samba (parallelizes well, good features, low CPU usage).

ext3 for most
configurations (especially desktops, in particular if dual
booting with MS Windows), but
JFS for really large partitions
and/or for systems with several processors and RAID with
many discs, and perhaps switch to XFS for really really large numbers
of processors and disc arrays.

cfq
for desktops as it minimizes latency,
as
when throughput is more important (but is not good on
subsystems with many heads, like RAIDs), and the
deadline elevator for DBMSes (best for random
access patterns); noop is good for storage
subsystems with their own intrinsic scheduling.

cfq in particular has been useful for
me to reduce the hogging of the disc by particularly large
disc operations, like installing package or filesystem scans,
that would make most other processes rather unresponsive with
as for example.
as which favours large
sequential transfers it does 20-25MiB/s, with cfq
it does only 4-6MiB/s if it runs as 3 processes, returning to
11-12MiB/s if run as 2 processes. Pretty amazing.

ext2
usually has awesome performance for almost anything, but
does not journal, so bad news for large filesystems.

ext3
has pretty good performance across the board except that
since it uses kernel based coarse locking (particularly
essential for the journal) it does not scale well to
highly parallel hardware and process configurations
(presumably because locking the journal becomes a
bottleneck, as ext2 scales well).
fsck times
in part due to having index based directories disabled by
default (which is right, because ext3 is
designed to be simple, and index based directories are
sort of unnatural for it, even if they are now
available).

    USER PRI NI  VIRT   RES   SHR S CPU% MEM% Command
    pcg   15  0 64192 38788  8120 S  0.0  2.6 kicker [kdeinit]
    pcg   15  0 62412 38960  7076 S  0.0  2.6 konsole [kdeinit]
    pcg   15  0 60752 37108  7160 S  0.0  2.5 konsole [kdeinit]
    pcg   15  0 46908 27868 10480 S  0.0  1.9 konsole [kdeinit]

and this is what it was just after startup a few days before:

    USER PRI NI  VIRT   RES   SHR S CPU% MEM% Command
    pcg   15  0 34064 17656 13520 S  0.0  4.3 kicker [kdeinit]
    pcg   15  0 33200 17072 12508 S  0.0  4.1 konsole [kdeinit]
    pcg   15  0 32460 16124 12292 S  0.0  3.9 konsole [kdeinit]
    pcg   15  0 32452 16116 12292 S  0.0  3.9 konsole [kdeinit]

This is just ridiculous and sick: well, each
konsole process has a few tabs open, but having
grown to a resident set of several dozen megabytes is
just utterly sick (never mind the over 60MB of reserved total
memory), even if almost a dozen is shared KDE libraries. And
KDE is not as bad as others...

"What we did witness and it seems to be the case with most of the media we tested, is that they all need a couple writes/erasures in order to "settle in" after which we had lower levels and fewer errors."

This is not a big problem, because rewritable discs should usefully be fully formatted and written over before use, in part to ensure they are good, in part to initialize them even if DVD+RW can format incrementally; and for DVD-RW formatting is pretty much essential, as by default they come unformatted and in incremental sequential mode, while they should for maximum convenience be in restricted overwrite, and as the
"overwrite" says, they must first be written to become fully randomly rewritable.

    dvd+rw-format -force=full /dev/hdg

and then fully write them over with
"noise" data, that is data that has been encrypted with a random password, using a script like this:
#!/bin/bash
: "'stdin' encrypted with a random password"
case "$1" in '-h')
echo 1>&2 "usage: $0 [ARGS...]"
exit 1;;
esac
if test -x '/usr/bin/aespipe' -o -x '/usr/local/bin/aespipe'
then
NOISE_KEY="/tmp/noise$$.key"
trap "rm -f '$NOISE_KEY'" 0
dd 2>/dev/null if=/dev/random bs=1 count=40 of="$NOISE_KEY"
exec 3< "$NOISE_KEY"
aespipe -e aes256 -p 3
else
export MCRYPT_KEY MCRYPT_KEY_MODE
MCRYPT_KEY="`dd 2>/dev/null if=/dev/random bs=1 count=32 \
| od -An -tx1 | tr -d '\012\040'`"
MCRYPT_KEY_MODE='hex'
exec mcrypt -b -a twofish ${1+"$@"}
fi
"Another issue is that the new 16x DVD-RAM media do not support a high overwriting cycle, which means that the discs will perform the best before 10,000 overwrites (100,000 for the 1x, 2x, 3x media)."

What is so funny is that DVD+RW (and also DVD-RW to some extent) already supports full random access operation which was one of the two main features distinguishing DVD-RAM, and its main remaining difference with DVD-RAM was that it supported only 1,000 rewrite cycles.
gzip -2
and mcrypt and it turned out that the latter was
many times slower
than aespipe. After reading a comprehensive
article about
time and compression tradeoffs for several compressors
I decided to have a look and experimented with
lzop
which has a good reputation for being particularly suitable to
backups, being fast and still offering good compression.
tar stream of my 3.47GiB home dir (mostly text,
but lots of photos too) shows:
| | gzip -2 | lzop |
|---|---|---|
| CPU user | 372s | 126s |
| CPU system | 15s | 22s |
| output | 2.57GiB | 2.69GiB |
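To put the figures in the table in proportion, here is a small calculation of CPU cost and output size for the two compressors. It is only a sketch: it assumes the 3.47GiB input size mentioned above applies to both runs, and the names are chosen for this example.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
INPUT_BYTES = 3.47 * GIB     # size of the tar stream, assumed for both runs

# CPU seconds and output sizes copied from the table.
RUNS = {
    "gzip -2": {"user_s": 372, "system_s": 15, "output_gib": 2.57},
    "lzop": {"user_s": 126, "system_s": 22, "output_gib": 2.69},
}


def summarize(run):
    """Total CPU time, input MiB per CPU second, output/input ratio."""
    cpu_s = run["user_s"] + run["system_s"]
    return {
        "cpu_s": cpu_s,
        "mib_per_cpu_s": INPUT_BYTES / MIB / cpu_s,
        "size_ratio": run["output_gib"] * GIB / INPUT_BYTES,
    }


if __name__ == "__main__":
    for name, run in RUNS.items():
        s = summarize(run)
        print(f"{name}: {s['cpu_s']}s CPU, "
              f"{s['mib_per_cpu_s']:.1f} MiB per CPU second, "
              f"output {s['size_ratio']:.1%} of input")
```

On these numbers lzop needs well under half the total CPU time of gzip -2, while its output is only a few percent larger.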
lzop is a clear winner here,
even if there is a small, expected, increase in the size of
the compressed output. The reduction in CPU time cost allows
for higher output speed given the same CPU, as CPU time
spent compressing plus encrypting almost makes backups CPU
bound here.

libc6-2.3.5 or packages that
depend on it, with this filter expression:

    ~D(libc6~V2\.3\.5|!~D(libc6~V2\.3\.5))

which is slow but fairly impressive.
"Use uhci_hcd or ehci_hcd, but never both at the same time. ehci_hcd will work with all lo-speed ports, so uhci_hcd is then no needed."

Ahhh, that's interesting, and poses a somewhat irritating conundrum, at least for me. I have compiled all three of the USB HCD drivers (UHCI, OHCI, EHCI) in my kernel binary, not even as modules, to get universal support without the need for an
initrd. Will the EHCI driver support a pure
UHCI chipset? Time will tell.
lsof
and I noticed that Konqueror had opened and mapped the
Arial Unicode MS
TrueType
font, which is about 24MiB
large and has lots of glyphs.
mmap(2) if many processes open and map the same
font the font will be read into memory only once, just like
with a font server. But when a font is read into memory it
also involves per process data, and with a font server this
only happens once.
> Alloced memory amount grew to ~2266 Mb (from 1433 Mb
> before) but allocation speed dropped significantly
> (several times).

Were you building the i686 glibc? I.e. rpmbuild --target i686 -ba -v glibc.spec?

which seems to indicate that when compiling for the 686 architecture some instructions give a large speedup; my guess is that it is about cheap locking.
gamin
server on which KDE depends for getting notified of filesystem
changes (mostly, but not only, to refresh the lists of files
when a directory is open in Konqueror).
sbp2 module as it was in
use, but no other module was using it. So I started suspecting
various monitoring facilities, and looked at
gamin because it has a bit of a reputation for
keeping things open. So I discovered that it cannot be
disabled or removed, but one can
configure it
to choose polling or kernel notifications (polling is slower
but safer) by path, or the same or to disable it entirely on
some file system types.
stop using cmpxchg on multiproc systems, doesn't scale.. 10x slower on 4p, 100x slower on 32p.
"noise" (pseudorandom stream encrypted with a random key, which seems to be rather noise like): it turns out that AES (Rijndael-128) encryption using
mcrypt
is many times slower than using
aespipe
which uses the excellent
AES implementation by Dr. Brian Gladman.

A Poltergeist in My Plasma TV
LG's $5,000 set worked for a month. Then things got weird as the unit developed a disturbing "memory leak"
"we expect that our malloc will find more bugs in software, and this might hurt our user community in the short term. We know that what this new malloc is doing is perfectly legal, but that realistically some open source software is of such low quality that it is just not ready for these things to happen"

I admire his determination, but it is quite brave. Also, while
some open source software is of such low quality, some proprietary software is even worse, and MS Windows allegedly contains deliberate support for bugs without which many important proprietary packages no longer seem to work properly.
    -adobe-courier-medium-r-*-*-*-100-0-0-*-*-iso8859-*

that is 10 points (100 decipoints) with the horizontal and vertical DPI defaulting to those returned by the X server.
* instead
of 0 has completely different semantics, and
selects the first (more or less random) DPI actually
available.
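For those not used to reading these font names, here is a small sketch that splits one into its fourteen fields and spells out the point size and DPI fields as described above. The field names follow the usual XLFD order; the function names are made up for this illustration, and the wording for `0` and `*` simply restates the semantics given in the text.

```python
XLFD_FIELDS = (
    "foundry", "family", "weight", "slant", "setwidth", "add_style",
    "pixel_size", "point_size", "resolution_x", "resolution_y",
    "spacing", "average_width", "charset_registry", "charset_encoding",
)


def parse_xlfd(pattern):
    """Split an XLFD name or pattern into a dict of its 14 fields."""
    if not pattern.startswith("-"):
        raise ValueError("an XLFD name starts with '-'")
    parts = pattern[1:].split("-")
    if len(parts) != len(XLFD_FIELDS):
        raise ValueError(f"expected {len(XLFD_FIELDS)} fields, "
                         f"got {len(parts)}")
    return dict(zip(XLFD_FIELDS, parts))


def point_size(pattern):
    """Point size in points (the field is in decipoints), or None for '*'."""
    value = parse_xlfd(pattern)["point_size"]
    return None if value == "*" else int(value) / 10


def resolution_meaning(value):
    """Describe a DPI field using the semantics described in the text."""
    if value == "0":
        return "default returned by the X server"
    if value == "*":
        return "first DPI actually available"
    return f"{value} DPI"


if __name__ == "__main__":
    name = "-adobe-courier-medium-r-*-*-*-100-0-0-*-*-iso8859-*"
    fields = parse_xlfd(name)
    print(point_size(name), "points")
    print("x:", resolution_meaning(fields["resolution_x"]))
    print("y:", resolution_meaning(fields["resolution_y"]))
```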
Well, in the file dix/dixfonts.c, procedure
GetClientResolutions, there is this marvelous bit
of code:
    /*
     * XXX - we'll want this as long as bitmap instances are prevalent
     * so that we can match them from scalable fonts
     */
    if (res.x_resolution < 88)
        res.x_resolution = 75;
    else
        res.x_resolution = 100;

which forces (clumsily) the DPI to be either 75 or 100, regardless of the actual screen DPI, and whether there are fonts for the actual screen DPI available. Now virtually all 15" 1024x768 LCD monitors are 85DPI, and I have created a nice
font.alias
that does define 85DPI versions of all the bitmap (PCF) fonts
with the proper size (roughly). But these get ignored, and I
get the 75DPI bitmap fonts instead.
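The effect is easy to reproduce on paper. This sketch computes the DPI of a 15" 1024x768 panel from its diagonal and then applies the same threshold as the snippet from dix/dixfonts.c quoted above; the function names are invented here, and only one axis is modelled.

```python
import math


def screen_dpi(width_px, height_px, diagonal_inches):
    """Dots per inch along the diagonal of a panel with square pixels."""
    return math.hypot(width_px, height_px) / diagonal_inches


def forced_resolution(dpi):
    """Same rule as the quoted server code: below 88 becomes 75, else 100."""
    return 75 if dpi < 88 else 100


if __name__ == "__main__":
    dpi = screen_dpi(1024, 768, 15)
    print(f"actual: {dpi:.1f} DPI, used for bitmap fonts: "
          f"{forced_resolution(dpi)} DPI")
```

So a panel at roughly 85DPI is treated as a 75DPI one, which is why aliases defined for 85DPI never get a chance to match.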
freetype font module can handle
bitmap (PCF) fonts too it might be used to work around this,
but this is not possible because the bitmap
module is loaded not just forcibly but it is also loaded
first (not last, as a default), and preempts any
subsequent font modules from handling bitmap fonts, and not
vice versa; in any case the forcing of the DPI is in the server
itself, not in one of the font modules.

/etc/apt/sources.list
was changed to reference Debian editions not by level
was changed to reference Debian editions not by level
(unstable, testing,
stable plus experimental which is a
state more than a level) but by name, as I finally decided,
after many months of indecision, that tracking specific
editions is a lot safer than tracking a state (which must be
why Ubuntu does use only the edition names); especially when,
as now, the testing and unstable
levels are very incomplete and inconsistent just after the
release of a new edition of Debian, and as major ABI
transitions are in flux.
"You can't use 'Package: *' and 'Pin: version' together, it's nonsensical."

Well, in theory, the versions of two random packages are not related; but since people (including the official Debian distribution maintainers) encode edition and distribution names in the version string, in practice it can make a lot of sense, as for example in:

    Package: *
    Pin: release o=Debian,a=sarge
    Pin-Priority: 990

    Package: *
    Pin: version *sarge*
    Pin-Priority: 990

which is too bad.
"Hello world!" program:

"That simple program uses 73 shared libraries that allocate a total of 13Mb of non-shared static data."

In other words in shared objects there are substantial impure areas, which are usually about relocation and links to other libraries.

a.out shared library system the impure problem did not much exist because each shared object had to be linked at a unique, statically defined, fixed (not merely default) base address and instead of going all the way to having no default base address one could have just made the default a hint changeable at runtime.

"Current versions of Portage can handle, via prelink, the changing MD5sums and mtimes of the binaries."

but this is easier because of the source-based nature of Portage.
parts (shared objects) mapped into processes forked off running the
kdeinit
service, which already has mapped most of the KDE
libraries. This means that the child process inherits the
already mapped in and prelinked shared libraries of the
parent. Thus in effect prelinking is done at the start of
every session instead of statically, but that is good
enough, especially as few systems run multiple sessions at
a time.
-fvisibility
option to GNU CC, which especially helps with C++
code.
ldconfig.
kdeinit which remind me of the
discussion of
new_proc
in Multics, which famously had expensive process creation.
hdparm -t),
for example from about 55MB/s to about 33MB/s for a value of
32 cycles.
title Xen Fedora 5
kernel (hd1,5)/boot/xen.gz dom0mem=120000
module (hd1,5)/boot/vmlinuz-2.6.12-1.1400_FC5xen0 ro \
vga=ext reboot=warm pci=biosirq \
elevator=anticipatory root=/dev/hdb6
module (hd1,5)/boot/initrd-2.6.12-1.1400_FC5xen0.img
So far Xen works pretty well and is quite fast, even if I have
had some lockups. I suspect the bleeding edge Fedora kernel;
also the Fedora firewall seems to have some issues. It seems
quite practical to just run always under Xen, to enjoy for
example fast save and restore to disk.
sysfs and hotplug
can only get worse: the zd1201 driver requires
that the firmware for the peripheral it supports be loaded,
and what now happens is that it creates on loading a
sysfs entry to enable the loading, and after a
short timeout this entry is removed. This means that just
about the only way to load the firmware is to have
hotplug enabled.
hotplug and
sysfs.
zd1201) is incorporated in the mainline kernel
as of release 2.6.12 so that the patch mentioned
previously
is no longer necessary.
/etc/fonts/local.conf.
title hda1
root (hd0,0)
kernel (hd0,0)/boot/bzimage root=/dev/hda1

title C:
rootnoverify (hd0,2)
chainloader (hd0,2)+1
and a
mkisofs line to create a filesystem image
for booting GRUB:
mkisofs -no-emul-boot \
  -boot-info-table -c boot/boot.catalog \
  -boot-load-size 32 -b boot/grub/iso9660_stage1_5 \
  -r -J -l -o /tmp/grubboot.iso /tmp/grubboot/
If the bootloader one uses does not support boot choices/menus it is possible to construct a CD image with multiple boot image choices using the
-eltorito-alt-boot
options to mkisofs but this requires multiple boot
choice handling in the BIOS and this is not implemented in a
significant number of cases.
lib in front of the name if
it is a runtime library package, has appeared in Fedora too,
after having been dumbly adopted by Mandriva.
termcap
library is called termcap-2.0.8.tar.bz2 and is
packaged as libtermcap-2.0.8-39.src.rpm.
Looking inside the source package one also finds some silly
inconsistency in naming, with patches having different base
names:
$ rpm -qlp libtermcap-2.0.8-39.src.rpm
libtermcap-2.0.8-ia64.patch
libtermcap-aaargh.patch
libtermcap.spec
termcap-116934.patch
termcap-2.0.8-bufsize.patch
termcap-2.0.8-colon.patch
termcap-2.0.8-compat21.patch
termcap-2.0.8-fix-tc.patch
termcap-2.0.8-glibc22.patch
termcap-2.0.8-ignore-p.patch
termcap-2.0.8-instnoroot.patch
termcap-2.0.8-setuid.patch
termcap-2.0.8-shared.patch
termcap-2.0.8-xref.patch
termcap-2.0.8.tar.bz2
termcap-buffer.patch
lib
prefix, for example ncurses-5.4.tar.bz2 is
packaged quite properly:
$ rpm -qlp ncurses-5.4-17.src.rpm
ncurses-5.4-20041218.patch
ncurses-5.4-20041225.patch
ncurses-5.4-20050101.patch
ncurses-5.4-20050108.patch
ncurses-5.4-20050115.patch
ncurses-5.4-20050122.patch
ncurses-5.4-xterm-kbs.patch
ncurses-5.4.tar.bz2
ncurses-linux
ncurses-linux-m
ncurses-resetall.sh
ncurses.spec
patch-5.4-20041211.sh
still note the marvelous idea of having one of the patches called
patch.
good taste.
/usr/lib/news/bin/procbatch, to merging all those
internal components into the public base directories, like
/usr/bin/procbatch. I then submitted a bug
report, and was told to lump it. Cool, and roughly on the same
day I stopped using RedHat, because if they had people who
could gleefully do that and persist obviously things were
going downhill.
"99.9% free but with proprietary bits as limpet mines" game that RedHat have later finessed to a truly clever degree with proprietary _trademarked_ names logos and icons). Then Mandriva at one point switched to the stupid
lib prefix naming
convention, and at that demonstration of loss of good taste I
just switched to Debian. After all if one is prepared to
tolerate dumb bad taste, let's get it direct from the source,
which at least is politically correct.
ar archives containing tar.gz
archives was a good package format? Never mind the other huge
problems), the starting of daemons on package upgrade even
if they are disabled in the init runlevel
config, and the social problems that cause very
infrequent releases, and so on.
Red Hat will create the Fedora Foundation with the intent of moving Fedora project development work and copyright ownership of contributed code to the Foundation.
with no mention of trademarks, a beautiful example of corporate cleverness. Copyrights matter a lot less because those are GPL'ed anyhow, but there is no GPL for trademarks or GPL equivalent permissions for RedHat's trademarks.
Performance preferences
to preload a copy in the background. My naive expectation was
that on quitting it would because of that restart it. Fat
chance.
"hateful" limitations.
Making fine prints in your digital darkroom Monitor calibration and gamma.
Yet another gamma correction page.
"Brightness" and "Contrast" controls.
The Monitor calibration and Gamma assessment page. I particularly like the target gamma visual tests in this page, which are provided at three different intensity levels. My LG L1710B LCD monitor was tuned fairly easily. I can get a final gamma of 1.8 with X gamma set to 1.2 and contrast set to 90/100 and brightness to 75/100. Image quality is rather good. I had set it almost to those values by judgement with brightness and contrast set a tiny bit too low. I suspect that lots of people set their monitor to excessively low brightness levels, which are the default.
ORN: A lot of companies have been using OpenSSH in their products (Sun Microsystems, Cisco, Apple, GNU/Linux vendors, etc.). Did they give anything back, like donations or hardware?
Henning Brauer: Nobody ever gave us anything back. A plethora of vendors ship OpenSSH --commercial Unix vendors (basically all of them), all of the Linux distributors, and lots of hardware vendors (like HP in their switches)-- but none of them seem to care; none of them ever gave us anything back. All of them should very well know that quality software doesn't "just happen," but needs some funding. Yet, they don't help at all.
It is so ironic because the OpenBSD project is committed to the BSD license, which does not require anything other than credit or, in its new edition, nothing at all, in the way of contributions from adopters. At least the GPL requires vendors to contribute back their improvements, and this has worked really well in a number of cases.
ORN: This is the first release that includes X.Org. Why did you choose to import it instead of XFree86 4.5.0?
Matthieu Herrb: The primary reason is that the new revision 1.1 of the XFree86 license is less free than the old MIT license that had been used for years by XFree86. OpenBSD already avoided shipping the final XFree86 4.4 release that also uses the new license in 3.6. Then, as many other projects moved away from XFree86 because of the license, it became obvious that most new developments in the X window system now take place in X.Org. Having said that, projects like OpenBSD have to stay vigilant that X.Org doesn't turn into a Linux-only project (that would slowly slip to a GNU General Public License).
These people are worried by a slow switch to the GPL; so ironic too. Especially as I was once following a discussion on the
Xorg IRC channels and one of the authors said
he had quite a bit of code to do an improvement someone was
requesting, but since the server was not GPL licensed, the
code was proprietary and could not be shared. It used to
be that the development of the X reference server code was
mostly funded by major corporates, so they chose the licence
that best served their "embrace and extend" competitive advantage strategies.
sabishape
and
dokde.
unstable, using the unofficial 3.4.0 packages, the KDE Konqueror browser grows to inordinate size (like more than 200MB) and under SUSE 9.2 the KDE 3.3.2 Konqueror stays at around 50MB.
malloc-override mode
(with build option --enable-redirect-malloc) and
then to set export GC_PRINT_STATS=1, and then to
preload it with
export LD_PRELOAD=/usr/local/lib/libgc.so.1.0.2)
and it is such good fun watching what it snitches on something
like Konqueror.
sub read_vint {
my $b = ord CORE::getc($_[0]->[0]);
my $i = $b & 0x7F;
for (my $s = 7 ; ($b & 0x80) != 0 ; $s += 7) {
$b = ord CORE::getc $_[0]->[0];
$i |= ($b & 0x7F) << $s;
}
return $i;
}
brought me tears and screams because of its depth and
daring, and never mind this other splendid example of
Perl programming, whose style seems to be representative
of much other code in Plucene:
sub doc {
my ($self, $n) = @_;
$self->{index}->seek($n * 8, 0);
my $pos = $self->{index}->read_long;
$self->{fields}->seek($pos, 0);
my $doc = Plucene::Document->new();
for (1 .. $self->{fields}->read_vint) {
my $fi = $self->{field_infos}->{bynumber}->[ $self->{fields}->read_vint ];
my $bits = $self->{fields}->read_byte;
$doc->add(
bless {
name => $fi->name,
string => $self->{fields}->read_string,
is_stored => 1,
is_indexed => $fi->is_indexed,
is_tokenized => (($bits & 1) != 0) # No, really
} => 'Plucene::Document::Field'
);
}
return $doc;
}
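For readers not familiar with what `read_vint` above is decoding: it is a variable-length integer, stored seven bits per byte with the least significant group first, where a set high bit means that another byte follows. The sketch below restates that decoding in Python purely as an illustration of the encoding; it is not code from Plucene or Lucene, and the end-of-stream check is an addition that the Perl shown above does not have.

```python
import io


def read_vint(stream):
    """Decode one variable-length integer from a binary stream.

    Each byte carries 7 data bits, low-order group first; the high bit
    (0x80) is a continuation flag.
    """
    value = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            # Added for the sketch: fail loudly on a truncated value.
            raise EOFError("stream ended inside a variable-length integer")
        byte = chunk[0]
        value |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return value
        shift += 7


if __name__ == "__main__":
    # 300 = 0b10_0101100: low 7 bits 0x2C with the flag set, then 0x02.
    print(read_vint(io.BytesIO(b"\xac\x02")))
```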
Watch and learn! And these apparently are the literal
translations into Perl from Java code of equivalent
magnificence.
Laptops...beware?
Ext3 has a stellar reputation for being a rock-solid filesystem, so I was surprised to learn that quite a few laptop users were having filesystem corruption problems when they switched to ext3. [ ... ] had nothing to do with ext3 itself, but were being caused by certain laptop hard drives.
The write cache
[ ... ] Unfortunately, certain laptop hard drives now on the market have the dubious feature of ignoring any official ATA request to flush their write cache to disk. This isn't a wonderful design feature, although it has been allowed by the ATA spec up until recently [ ... ]
However, it gets worse. Some modern laptop hard drives have an even nastier habit of throwing away their write cache whenever the system is rebooted or suspended. Obviously, if a hard drive has both of these problems, it's going to regularly corrupt data, and there's nothing that Linux can do to prevent it from doing so.
The same article has interesting details about the various modes of journaling that Ext3 offers, and in particular that
data=journal can be very fast in some special
case (probably reading from the journal as it is writing to
it). Also, about making it flush the buffer cache more
frequently prevents huge write storms.
During the past year, more than 4,100 patches from Red Hat employees were integrated into the upstream 2.6 kernel. In contrast other companies boast that their offering contains the most patches on top of the community kernel.
and
Upstream - doing all our development in an open community manner. We don't sit on our technology for competitive advantage, only to spring it on the world as late as possible.
These statements are commendable, and reflect some of my own thoughts, but one can make some points:
"scratch my itch" logic.
:-(.
"discovery" that it is hard to guarantee something has actually been written to disc. This has been well known for years at least to those reading the
comp.arch newsgroup. Things also are
much subtler and more complex than apparent from this late
discovery.
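As a small illustration of how much an application has to do just to ask for durability, here is a sketch of the usual write, flush, rename, flush-the-directory sequence on a POSIX system. The function name `write_durably` and the `.tmp` suffix are made up for this example, and the caveat discussed above applies in full: if the drive ignores the flush request, none of this guarantees that the data has reached the platters.

```python
import os


def write_durably(path, data):
    """Replace the file at path with data, asking the OS to flush each step."""
    tmp_path = path + ".tmp"  # hypothetical temporary name, same directory
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file contents to the device.
        os.fsync(fd)
    finally:
        os.close(fd)
    # Atomically switch the name over to the new contents.
    os.replace(tmp_path, path)
    # Flush the directory too, so that the rename itself is recorded.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Even this only moves the problem down one level: `fsync` reports what the kernel was told by the drive, not what the drive actually did.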
The plan is to reduce the amount of memory that Gnome applications consume. Gnome is barely usable on a machine with 128 MB of RAM; contrast this with Windows XP, which is very snappy on such a configuration.
Even more amusing is the question that comes next:
Why do you want to reduce memory consumption?
to which some answers are given. Unfortunately none seems convincing to me; I reckon that there is a vanishingly small and largely powerless constituency for fixing the many and horrifying memory wastages in most GNU/Linux applications, so the answers are merely pious hopes.
"scratch my itch" principle, and many volunteers by now just have very large PCs, with at least 1GB of memory.
Looking at the top 25 contributors to the Linux kernel today, you'll discover that more than 90% of them are on the corporate payroll full-time for companies such as HP (HPQ), IBM, Intel (INTC), Novell (NOVL), Oracle, Red Hat (RHAT) and Veritas (VRTS), among many others.
and obviously these big corporations are not limiting their highly paid employees with PCs having only 128MB RAM, and I would imagine that most of them have no interest whatsoever in wasting their expensive time minimizing memory consumption in the kernel or in applications. The employees themselves are now paid large enough salaries that buying more memory even for their personal system is simply no longer an issue for them.
I spent the last month halving the memory used by these applications
I spent the last year adding to this application these snazzy transparency effects and ten more cool features as you can see in this demo
it is fine by itself, and in any case what really matters is the demo, not actual use in a loaded system.
"Hello world!" program:
That simple program uses 73 shared libraries that allocate a total of 13Mb of non-shared static data.
This can be alleviated in the following ways:
bitmap font module (for PCF format bitmap
fonts) is loaded by default and this is unfortunate because it
has a misdesign in which it forces the DPI of fonts to be
either 75 or 100 only. Omitting it from the list of modules to
load is no good.
freetype font
module (for PCF, TrueType and Type1 fonts and a few other
types too) is specified before bitmap, it
registers itself for PCF fonts and seems to preempt the
bitmap module, which seems an acceptable
workaround.
freetype font module, and the
type1 font module is not loaded. The
freetype font module uses the
FreeType library
which seems to have a fairly good Type 1 rasterizer, one
that does some decent autohinting (a side effect I guess of
FreeType having to have an autohinter for TrueType fonts),
while the original type1 module does not, and
produces fairly crude low DPI bitmaps.
April 08, 2005
Pete Zaitcev: Thomas in a cage with Firewire
http://thomas.apestaart.org/log/index.php?p=291
Yay. Thomas is about to find why Firewire is unsupported even on Fedora (let alone on RHEL).
But If someone asked me what Firewire needed, I would answer, "only a hacker with a brain". He may be able to pull it off, though I'm not too optimistic. Firewire is about as complex as USB and we all know how well that goes despite a sustained effort by Greg K-H, David-B, Stern and myself. My approach was to hide whenever Firewire came too close and wait for it to die in the marketplace (which, I suspect, is inevitable at this point).
Now, my external hard drive box also supports USB2, so I tried that. Bad news! It does not work at all with the
usb-storage driver, presumably because it uses
a chipset (ALi) that is designed to the MS USB driver
interface, not the official one. It works, very slowly, with
the slow device
ub driver, but
this anyhow only supports up to 138GB drives, and my drive has
a 160GB capacity.
mmap
efficiently.
design patterns.
"secret sauce" that makes their work tastier and less fattening at the same time.
"secret sauce" ones.
"patterns" that seem actually design oriented they tend to be trivializations of simple databased design rules.
patterns: whatever they are, they name recurrent and topical practices, whether related to design or not, and this might sometimes improve communication with and between otherwise unskilled people; which may be of benefit given the industry tendency towards employing unskilled workers.
zeroconf
a nice little utility which is like a
DHCP
client, but without the server, in that it can
configure a network interface automagically
a bit like for IPv6, but (usually) in the
IPv4 169.254.0.0/16 range reserved for that.
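To make the "DHCP client without the server" idea a bit more concrete, here is a sketch of the address selection step of IPv4 link-local autoconfiguration as described in RFC 3927: the host picks a pseudo-random address between 169.254.1.0 and 169.254.254.255 (the first and last 256 addresses of the range are reserved), using a generator seeded from its hardware address so that it tends to choose the same address each time. This is not the code of the `zeroconf` utility; the function name `candidate_addresses` is invented here, and the ARP probing that decides whether a candidate is actually free is left out.

```python
import ipaddress
import random

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")
# RFC 3927 reserves the first and last 256 addresses of the /16.
FIRST = int(ipaddress.ip_address("169.254.1.0"))
LAST = int(ipaddress.ip_address("169.254.254.255"))


def candidate_addresses(mac, attempts=10):
    """Return a repeatable sequence of link-local candidates for one interface.

    Seeding from the MAC address means the same host tries the same
    addresses in the same order after every restart.
    """
    rng = random.Random(mac.lower())
    return [
        ipaddress.ip_address(rng.randint(FIRST, LAST))
        for _ in range(attempts)
    ]


if __name__ == "__main__":
    # The MAC address below is just an example value.
    for address in candidate_addresses("00:11:22:33:44:55", attempts=3):
        print(address)
```

A real implementation would probe each candidate with ARP and move on to the next one only if somebody else answers.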
zeroconf package installs also
/etc/network/if-up.d/zeroconf-up
which is a script that runs the application every time an
interface is activated, whether or not I want it. Allegedly
autoconfiguration should always work, but there are cases
where I simply want to bring up an interface in a fully
passive way.
/etc/network/if-X.d/zeroconf-up
directories and the scripts therein, and as usual the Debian
Way is to try to do things automagically and in ways that to
me feel rather shoddy.
zeroconf (or
IPv4LL)
is definitely a nice idea, having it started by default is not
nice. But this pales compared to the horror I feel when
updating a package containing a daemon starts the daemon even
if I have disabled its activation in the runlevel
configuration.
/etc/apt/sources.list:
deb http://archive.Ubuntu.com/ubuntu/ hoary main universe restricted multiverse
deb http://archive.Ubuntu.com/ubuntu/ warty-security main universe restricted multiverse
deb http://archive.Ubuntu.com/ubuntu/ warty-updates main universe restricted multiverse
deb http://archive.Ubuntu.com/ubuntu/ warty main universe restricted multiverse
in order to have the option of installing Ubuntu-only packages or package versions. To ensure that only Debian packages are considered by default I have had to
"release-pin" packages with an
"origin" of
Ubuntu to a low
priority, by putting these lines in
/etc/apt/preferences:
Package: *
Pin: release o=Ubuntu
Pin-Priority: 90
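Why a priority of 90 has the desired effect can be seen from a simplified model of the selection rules described in apt_preferences(5): the version with the highest priority wins, ties go to the newest version, and a downgrade only happens if the priority exceeds 1000. The sketch below is such a model and nothing more. It assumes a default priority of 500 for unpinned origins and 100 for the installed version, represents versions as tuples of integers instead of using the real Debian version comparison, and all the names in it (`Candidate`, `pick`) are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    version: tuple          # simplified stand-in for a Debian version string
    origin: str             # e.g. "Debian" or "Ubuntu"
    installed: bool = False


def pick(candidates, pins, default=500):
    """Choose the version a much simplified APT would install."""

    def priority(cand):
        prio = pins.get(cand.origin, default)
        # Assumption: the installed version counts for at least 100.
        return max(prio, 100) if cand.installed else prio

    eligible = [c for c in candidates if priority(c) > 0]
    if not eligible:
        return None
    # Highest priority first, newest version as the tie-breaker.
    best = max(eligible, key=lambda c: (priority(c), c.version))
    current = next((c for c in candidates if c.installed), None)
    # Never downgrade unless the priority exceeds 1000.
    if current and best.version < current.version and priority(best) <= 1000:
        return current
    return best


if __name__ == "__main__":
    pins = {"Ubuntu": 90}
    both = [Candidate((1, 0), "Debian"), Candidate((2, 0), "Ubuntu")]
    print(pick(both, pins).origin)
```

With the pin in place the Debian version is chosen even when the Ubuntu one is newer, while a package that exists only in the Ubuntu archive can still be installed because its priority is above zero.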
initrd for
Debian and I was quite
amazed to see it is 2MB
compressed and about 5MB uncompressed. It is a pretty largish
"root" filesystem, and some mini distributions are smaller.
"helpful" mechanisms that try to automate system driver loading and configuration, starting with those in the
initrd.
just works, even the firmware loading seems good (I have put the ZyDAS firmware files in
/usr/local/lib/firmware which is the right place
for manually installed firmware files).
/sys/block/dev/queue/scheduler.
hdparm -t results under 2.6 than under 2.4
and it appears that to get the same results under 2.6
I must raise the filesystem readahead set with
hdparm -an to some large
value like 512 blocks.
| Collection | # of files | total # of lines |
|---|---|---|
| arch | 19 | 16130 |
| drivers | 150 | 365887 |
| fixes | 106 | 14786 |
| rpmify | 13 | 911 |
| suse | 157 | 236323 |
| uml | 10 | 2817 |
| xen | 14 | 36169 |
suse and drivers
collections seem to be mostly extensions, but they also
contain a lot of fixes.
LowID). So the mystery of why queues are so long and terrible and download speeds so low persists.
KANOTIX-2005-01.iso with 1.2GB,
ubcd32-full.iso with 0.9GB,
knoppix-std-0.1.iso with 0.7GB, and I haven't had
any download running for a bit.
"seeds" for downloads. This is going to become worse and worse as ISPs are now trying to switch from high monthly fixed fees to low monthly fees and then charging for bandwidth, in both directions, but the situation is ugly enough as it is.
"download" something they are interested in, and then once the download is complete, they close the connection/sharing.
"download" style usage patterns.
"seed" hosts to kickstart the temporary sharing that is what actually takes place, and these seeds are on relatively slow and overloaded connections.
"viral marketing" channels for free software installers.
"download" completes.
"flatten" file paths into file names, which is not hard.
distributions, especially the live CD ones.
$HOME/.gtkrc.
$HOME/.gtkrc-2.0.
.gtkrc-2.0 even in the 2.2 and 2.4 releases
of GTK 2.
.gtkrc-2.0
file uses the setting gtk-font-name whose
syntax is similar to, but incompatible with that
of Fontconfig/Xft2 font names, which in turn is hardly
documented, and the differences seem gratuitous. For
example, in Fontconfig font names the point size is
separated from the font name by a dash, but not in GTK 2
settings.
.gtkrc-2.0 are
actually overridden by equivalent settings in the GConf
database, which apparently is only documented in an email
announcing this patch to a mailing list; this requires
Firefox to be dependent not just on the GTK libraries, but
also on the GNOME libraries, or at least the GConf ones.
"cool half-assed demo" stage seem to me the driving forces.
fixes.
--link-dest option to
rsync
and the Perl wrapper script
rsnapshot
that uses it to automate creating backups of filesystems that
are both incremental and full, using forests of hard
links.
alsamixer with a rather
less misleading user interface, in particular for controls
that do not correspond to sound channels.
Lets face it, because of C's constraints, writing GTK code, and especially widgets, can be ridiculously slow due to all the long names and the object orientation internals that C can't hide.
C with Classes anyone?
:-)