Computing notes 2017 part two

This document contains only my personal opinions and calls of judgement, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.


2017 September

170918 Mon: Complexity and maintainability and virtualization

One of the reasons why I am skeptical about virtualization is that it raises the complexity of a system: a systems engineer has then to manage not just systems running one or more applications, but also systems running those systems, creating a cascade of dependencies that are quite difficult to investigate in case of failure.

The supreme example of this design philosophy is OpenStack, which I have worked on for a while (lots of investigating and fixing), and I have wanted to create a small OpenStack setup at home on my test system to examine it a bit more closely. The system at work was installed using MAAS and Juju, which turned out to be quite unreliable too, so I got the advice to try the now-official Ansible variant of the Kolla (1, 2, 3, 4) setup method.

Kolla itself does not do the deployment: it is a tool to build Docker deployables (images) for every OpenStack service. The two variants are then for deployment: using Ansible, or Kubernetes, the latter being another popular buzzword along with Docker and OpenStack. I chose the Ansible installer as I am familiar with Ansible, and also because I wanted to do a simple install with all relevant services on a single host system, without installing Kubernetes too.

It turns out that the documentation, as usual, is pretty awful: very long on fantastic promises of magnificence, and obfuscating the banal reality:

Note: one of the components and Dockerfiles is for kolla-toolbox, which is a new, fairly small component for internal use.

The main value of kolla and kolla-ansible is that someone has already written the 252 Dockerfiles for the 67 components, and the Ansible roles for them, and keeps maintaining them (somewhat) as OpenStack components change.
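
For reference, once the images exist the actual all-in-one deployment boils down to a handful of commands. This is a minimal sketch, not a tested recipe: the exact package names, the shipped all-in-one inventory file and the paths under /etc/kolla are assumptions based on the kolla-ansible documentation of the time.

# a sketch of an all-in-one kolla-ansible deployment; paths and options
# are assumptions, check the documentation for the release actually used
pip install kolla kolla-ansible
cp -r /usr/share/kolla-ansible/etc_examples/kolla/. /etc/kolla/
kolla-genpwd                                # fills in /etc/kolla/passwords.yml
# edit /etc/kolla/globals.yml: base distribution, network interfaces, services
kolla-build --base centos --type binary     # or pull prebuilt images instead
kolla-ansible -i all-in-one bootstrap-servers
kolla-ansible -i all-in-one prechecks
kolla-ansible -i all-in-one deploy
kolla-ansible post-deploy                   # writes an admin openrc file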

Eventually my minimal install took 2-3 half days, and resulted in building 100 Docker images taking up around 11GiB and running 29 Docker instances:

soft# docker ps --format 'table {{.Names}}\t{{.Image}}' | grep -v NAMES | sort 
cron                        kolla/centos-binary-cron:5.0.0
fluentd                     kolla/centos-binary-fluentd:5.0.0
glance_api                  kolla/centos-binary-glance-api:5.0.0
glance_registry             kolla/centos-binary-glance-registry:5.0.0
heat_api_cfn                kolla/centos-binary-heat-api-cfn:5.0.0
heat_api                    kolla/centos-binary-heat-api:5.0.0
heat_engine                 kolla/centos-binary-heat-engine:5.0.0
horizon                     kolla/centos-binary-horizon:5.0.0
keystone                    kolla/centos-binary-keystone:5.0.0
kolla_toolbox               kolla/centos-binary-kolla-toolbox:5.0.0
mariadb                     kolla/centos-binary-mariadb:5.0.0
memcached                   kolla/centos-binary-memcached:5.0.0
neutron_dhcp_agent          kolla/centos-binary-neutron-dhcp-agent:5.0.0
neutron_l3_agent            kolla/centos-binary-neutron-l3-agent:5.0.0
neutron_metadata_agent      kolla/centos-binary-neutron-metadata-agent:5.0.0
neutron_openvswitch_agent   kolla/centos-binary-neutron-openvswitch-agent:5.0.0
neutron_server              kolla/centos-binary-neutron-server:5.0.0
nova_api                    kolla/centos-binary-nova-api:5.0.0
nova_compute                kolla/centos-binary-nova-compute:5.0.0
nova_conductor              kolla/centos-binary-nova-conductor:5.0.0
nova_consoleauth            kolla/centos-binary-nova-consoleauth:5.0.0
nova_libvirt                kolla/centos-binary-nova-libvirt:5.0.0
nova_novncproxy             kolla/centos-binary-nova-novncproxy:5.0.0
nova_scheduler              kolla/centos-binary-nova-scheduler:5.0.0
nova_ssh                    kolla/centos-binary-nova-ssh:5.0.0
openvswitch_db              kolla/centos-binary-openvswitch-db-server:5.0.0
openvswitch_vswitchd        kolla/centos-binary-openvswitch-vswitchd:5.0.0
placement_api               kolla/centos-binary-nova-placement-api:5.0.0
rabbitmq                    kolla/centos-binary-rabbitmq:5.0.0

Some notes on this:

Most of the installation time went into figuring out the rather banal nature of the two components, de-obfuscating the documentation, realizing that the prebuilt images in the Kolla Docker Hub repository were both rather incomplete and old and would not do, and that I had to build more images than needed, but mostly into working around the inevitable bugs in both the tools and OpenStack.

I found that the half a dozen blocker bugs that I investigated had mostly been known for years, as often happens, and that was lucky: most of the relevant error messages were utterly opaque if not misleading, but searching for them would eventually lead to a post by someone who had already investigated the issue.

Overall, 1.5 days to set up a small OpenStack instance (without backing storage) is pretty good, considering how complicated it is, but the questions are whether that level of complexity is necessary, and how fragile the result is going to be.

170917 Sun: Some STREAM and HPL results from some not-so-new systems

On a boring evening I have run the STREAM and HPL benchmarks on some local systems, with interesting results, first for STREAM:

AMD Phenom X3 720 3 CPUs 2.8GHz
over$ ./stream.100M
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
...
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:            5407.0     0.296645     0.295911     0.299323
Scale:           4751.1     0.338047     0.336766     0.344893
Add:             5457.5     0.439951     0.439759     0.440204
Triad:           5392.4     0.445349     0.445068     0.445903
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
intel i3-370M 2-4 CPUs 2.4GHz
tree$ ./stream.100M                                                            
-------------------------------------------------------------                   
STREAM version $Revision: 5.10 $                                                
...
Function    Best Rate MB/s  Avg time     Min time     Max time                  
Copy:            5668.9     0.299895     0.282241     0.327760                  
Scale:           5856.8     0.305240     0.273185     0.363663                  
Add:             6270.9     0.404547     0.382720     0.446088                  
Triad:           6207.4     0.408687     0.386636     0.456758                  
-------------------------------------------------------------                   
Solution Validates: avg error less than 1.000000e-13 on all three arrays        
-------------------------------------------------------------
AMD FX-6100 3-6 CPUs 3.3GHz
soft$ ./stream.100M
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
...
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:            5984.0     0.268327     0.267381     0.270564
Scale:           5989.1     0.269746     0.267154     0.279534
Add:             6581.8     0.366100     0.364640     0.371339
Triad:           6520.0     0.374828     0.368098     0.419086
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
AMD FX-8370e 4-8 CPUs 3.3GHz
base$ ./stream.100M 
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
...
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:            6649.8     0.242686     0.240608     0.244452
Scale:           6427.1     0.251241     0.248944     0.257000
Add:             7444.9     0.324618     0.322367     0.327456
Triad:           7522.1     0.322253     0.319058     0.324474
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
Xeon E3-1200 4-8 CPUs at 2.4GHz (Update 170920)
virt$ ./stream.100M
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
...
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:            9961.3     0.160857     0.160621     0.160998
Scale:           9938.1     0.161200     0.160997     0.161329
Add:            11416.7     0.210518     0.210218     0.210647
Triad:          11311.7     0.212472     0.212170     0.214260
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------

The X3 720 system has DDR2 memory, the others DDR3; obviously that is not making a huge difference. Note that the X3 720 is from 2009, the i3-370M is in a laptop from 2010, and the FX-8370e is from 2014, so a span of around 5 years.
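
For reference, the stream.100M name suggests a binary built with a 100 million element array; something along these lines, where the exact compiler and flags are an assumption rather than the build actually used:

# a sketch of how such a binary is typically built from the STREAM source
gcc -O3 -fopenmp -DSTREAM_ARRAY_SIZE=100000000 stream.c -o stream.100M
OMP_NUM_THREADS=4 ./stream.100M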

The differences are bigger for HPL, run single-host with mpirun -np 4 xhpl; the last 8 tests are fairly representative, and the last column (I have modified the original printing format from %18.3e to %18.6f) is GFLOPS:

AMD Phenom X3 720 3 CPUs 2.8GHz
over$ mpirun -np 4 xhpl | grep '^W' | tail -8 | sed 's/   / /g'
WR00C2R2    35   4   4   1     0.00     0.011727
WR00C2R4    35   4   4   1     0.00     0.012422
WR00R2L2    35   4   4   1     0.00     0.011343
WR00R2L4    35   4   4   1     0.00     0.012351
WR00R2C2    35   4   4   1     0.00     0.011562
WR00R2C4    35   4   4   1     0.00     0.012483
WR00R2R2    35   4   4   1     0.00     0.011615
WR00R2R4    35   4   4   1     0.00     0.012159
intel i3-370M 2-4 CPUs 2.4GHz
tree$ mpirun -np 4 xhpl | grep '^W' | tail -8 | sed 's/   / /g'
WR00C2R2    35   4   4   1     0.00     0.084724
WR00C2R4    35   4   4   1     0.00     0.089982
WR00R2L2    35   4   4   1     0.00     0.085233
WR00R2L4    35   4   4   1     0.00     0.088178
WR00R2C2    35   4   4   1     0.00     0.085176
WR00R2C4    35   4   4   1     0.00     0.092192
WR00R2R2    35   4   4   1     0.00     0.086446
WR00R2R4    35   4   4   1     0.00     0.092728
AMD FX-6100 3-6 CPUs 3.3GHz
soft$ mpirun -np 4 xhpl | grep '^W' | tail -8 | sed 's/   / /g'
WR00C2R2    35   4   4   1     0.00     0.074744
WR00C2R4    35   4   4   1     0.00     0.074744
WR00R2L2    35   4   4   1     0.00     0.073127
WR00R2L4    35   4   4   1     0.00     0.075299
WR00R2C2    35   4   4   1     0.00     0.072952
WR00R2C4    35   4   4   1     0.00     0.076627
WR00R2R2    35   4   4   1     0.00     0.076052
WR00R2R4    35   4   4   1     0.00     0.073127
AMD FX-8370e 4-8 CPUs 3.3GHz
base$ mpirun -np 4 xhpl | grep '^W' | tail -8 | sed 's/   / /g'
WR00C2R2    35   4   4   1     0.00     0.152807
WR00C2R4    35   4   4   1     0.00     0.149934
WR00R2L2    35   4   4   1     0.00     0.150643
WR00R2L4    35   4   4   1     0.00     0.152807
WR00R2C2    35   4   4   1     0.00     0.132772
WR00R2C4    35   4   4   1     0.00     0.160093
WR00R2R2    35   4   4   1     0.00     0.163582
WR00R2R4    35   4   4   1     0.00     0.158305
Xeon E3-1200 4-8 CPUs at 2.4GHz (Update 170920)
virt$ mpirun -np 4 xhpl | grep '^W' | tail -8 | sed 's/   / /g'
WR00C2R2    35   4   4   1     0.00      0.3457
WR00C2R4    35   4   4   1     0.00      0.3343
WR00R2L2    35   4   4   1     0.00      0.3343
WR00R2L4    35   4   4   1     0.00      0.3104
WR00R2C2    35   4   4   1     0.00      0.3418
WR00R2C4    35   4   4   1     0.00      0.3622
WR00R2R2    35   4   4   1     0.00      0.3497
WR00R2R4    35   4   4   1     0.00      0.3756

Note: the X3 720 and the FX-6100 are running SL7 (64 bit), and the i3-370M and FX-8370e are running ULTS14 (64 bit). On the SL7 systems I got slightly better GFLOPS with OpenBLAS, but on the ULTS14 systems it was 100 times slower than ATLAS plus BLAS.
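
For reference, in the xhpl output lines above the columns after the test code are N, NB, P, Q, the time in seconds and the GFLOPS; the tiny N of 35 suggests these runs used the small example HPL.dat shipped with HPL, whose problem-size section looks roughly like this (quoted from memory, so an assumption):

4            # of problems sizes (N)
29 30 34 35  Ns
4            # of NBs
1 2 3 4      NBs
0            PMAP process mapping (0=Row-,1=Column-major)
3            # of process grids (P x Q)
2 1 4        Ps
2 4 1        Qs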

Obvious here is that the X3 720 was not very competitive in FLOPS even with a laptop i3-370M, and that at least on floating point intel CPUs are quite competitive. The reason is that most compilers optimize/schedule floating point operations specifically for what works best on intel floating point internal architectures.

Note: the site cpubenchmark.net, using PassMark®, rates the X3 720 at 2,692, the i3-370M at 2,022, the FX-6100 at 5,412 and the FX-8370e at 7,782; these ratings also take into account non-floating-point speed and the number of CPUs, and they seem overall fair to me.

It is however fairly impressive that the FX-8370e is still twice as fast as the FX-6100 at the same GHz, and it is pretty good on an absolute level. However I mostly use the much slower i3-370M in the laptop, and for interactive work it does not feel much slower.

170904 Fri: A developer buys a very powerful laptop

In the blog of a software developer there is a report of the new laptop he has bought to do his work:

a new laptop, a Dell XPS 15 9560 with 4k display, 32 GiBs of RAM and 1 TiB M.2 SSD drive. Quite nice specs, aren't they :-)?

That is not surprising when a smart wristwatch has a dual-CPU 1GHz chip, 768MiB of RAM, and 4GiB of flash SSD, but it has consequences for many other people: such a powerful development system means that improving the speed and memory use of the software written by that developer will not be a very high priority. It is difficult for me to see practical solutions for this unwanted consequence of hardware abundance.

2017 August

170825 Fri: M.2 storage with and without backing capacitors

Previously I mentioned the large capacitors on flash SSDs in the 2.5in format targeted at the enterprise market. In an article about recent flash SSDs in the M.2 format there are photographs and descriptions of some with mini capacitors with the same function, and a note that:

The result is a M.2 22110 drive with capacities up to 2TB, write endurance rated at 1DWPD, ... A consumer-oriented version of this drive — shortened to the M.2 2280 form factor by the removal of power loss protection capacitors and equipped with client-oriented firmware — has not been announced

The absence of the capacitors is likely to save relatively little in cost, and having to manufacture two models is likely to add some of that back, but the lack of capacitors makes the write IOPS a lot lower (because it requires write-through rather than write-back caching) and for the same reason also increases write amplification, thus creating enough differentiation to justify a much higher price for the enterprise version.
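
The effect of caching policy on writes can be seen directly on Linux: NVMe devices expose a "volatile write cache" feature, and turning it off (roughly emulating the write-through behaviour a capacitor-less drive needs in order to be safe) makes small synchronous writes much slower. A sketch with nvme-cli, where the device name is an assumption:

# query the volatile write cache feature (feature id 0x06); device name is made up
nvme get-feature /dev/nvme0 -f 0x06 -H
# disable it, forcing write-through behaviour (not something to do casually)
nvme set-feature /dev/nvme0 -f 0x06 -v 0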

170817 Thu: 50TB 3.5in flash SSD

Interesting news of a line of flash SSD products in the 3.5in form factor (instead of the more conventional M.2 and 2.5in): the Viking UHC-Silo (where UHC presumably stands for Ultra High Capacity), in 25TB and 50TB capacities, with a dual-port 6Gb/s SAS interface.

Interesting aspects: fairly conventional 500MB/s sequential read and 350MB/s sequential write rates, and not unexpectedly 16W power consumption. But note that at 350MB/s filling (or duplicating) this unit takes around 40 hours. That big flash capacity could probably sustain a much higher transfer rate, but that might involve a lot higher power draw and thus heat dissipation, so I suspect that the limited transfer rate does not depend just on the 6Gb/s channel, which after all allows for the slightly higher 500MB/s read rate.

Initial price seems to be US$37,000 for the 50TB version, or around US$740 per TB, which is more or less the price of a 1TB enterprise SAS flash SSD.
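
A couple of back-of-the-envelope checks of those numbers, assuming decimal units and bash arithmetic:

# hours to fill 50TB at 350MB/s, and price per TB
echo $(( 50 * 10**12 / (350 * 10**6) / 3600 ))   # => 39, so roughly 40 hours
echo $(( 37000 / 50 ))                           # => 740 US$ per TB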

170816 Wed: Another economic reason to virtualize

The main economic reason to virtualize has allegedly been that in many cases the hardware servers that run the host systems of applications are mostly idle: a typical data centre hosts many low-use applications that are nevertheless run on their own dedicated server (being effectively containerized on that hardware server), and therefore consolidating them onto a single hardware server can result in a huge saving; virtualization allows consolidating the hosting systems as-they-are, simply by copying them from real to virtual hardware, saving transition costs.

Note: that takes for granted that in many cases the applications are poorly written and cannot coexist in the same hosting system, which is indeed often the case. More disappointingly, it also takes for granted that increasing the impact of a hardware failure from one to many applications is acceptable, and that replication of virtual systems to avoid that is cost-free, which it isn't.

But there is another rationale for consolidation, if not virtualization, that does not depend on there being a number of mostly-idle dedicated hardware servers: the notion that the cost per capacity unit of midrange servers is significantly lower than for small servers, and that the best price/capacity ratio is with servers like:

  • 24 Cores (48 cores HT), 256GB RAM, 2* 10GBit and 12* 3TB HDD including a solid BBU. 10k EUR
  • Applications that actually fill that box are rare. We are cutting it up to sell off the part.

That by itself is only a motivation for consolidation, that is for servers that run multiple applications; in the original presentation that becomes a case for virtualization because of the goal of having a multi-tenant service with independent administration domains.

The problem with virtualization to take advantage of lower marginal cost of capacity in mid-size hardware is that it is not at all free, because of direct and indirect costs:

Having looked at the numbers involved, my guess is that there is no definite advantage in overall cost to consolidation of multiple applications: it all depends on the specific case, and usually (but not always) virtualization has costs that outweigh its advantages, especially for IO intensive workloads; this is reflected in the high costs of most virtualized hosting and storage services (1, 2).

Virtualization for consolidation has however an advantage as to the distribution of cost, as it allows IT departments to redefine downwards their costs and accountabilities: from running application hosting systems to running virtual-machine hosting systems. If 10 applications in their virtual machines can be run on a single host system, IT departments need to manage 10 times fewer systems, for example going from 500 small systems to 50.

But the number of actual systems has increased from 500 to 550, as the systems in the virtual machines have not disappeared, so the advantage for IT departments comes not so much from consolidation as from handing back the cost of managing the 500 virtual systems to the developers and users of the applications running within them, which is what DevOps usually means.

The further stage of IT department cost-shifting is to get rid of the hosting systems entirely, and outsource the hosting of the virtual machines to a more expensive cloud provider, where the higher costs are then charged directly to the developers and users of the applications, eliminating those costs from the budget of the IT department, which is left only with the cost of managing the contract with the cloud provider on behalf of the users.

Shifting hardware and systems costs out of the IT department budget into that of their users can have the advantage of boosting the career and bonuses of IT department executives by shrinking their apparent costs, even if it does not reduce overall business costs. But it can reduce aggregate organization costs when it discourages IT users from using IT resources unless there is a large return, by substantially raising the direct cost of IT spending to them, so even at the aggregate level it might be, for specific cases, ultimately a good move.

That is, a business that consolidates systems, switches IT provision from application hosting to systems hosting, and then outsources system hosting is in effect telling its component businesses that they are overusing IT and that they should scale it back, by effectively charging more for application hosting, and supporting it less.

170810 Thu: Speedy writing to a BD-R DL disc

Today, after (belatedly) upgrading the firmware of my Pioneer BDR-2209 drive to 1.33, I have seen for the first time a 50GB BD-R DL disc written at around 32MB/s average:

Current: BD-R sequential recording
Track 01: data  31707 MB        
Total size:    31707 MB (3607:41.41) = 16234457 sectors
Lout start:    31707 MB (3607:43/41) = 16234607 sectors
Starting to write CD/DVD at speed MAX in real TAO mode for single session.
Last chance to quit, starting real write in   0 seconds. Operation starts.
Waiting for reader process to fill input buffer ... input buffer ready.
Starting new track at sector: 0
Track 01: 9579 of 31707 MB written (fifo 100%) [buf 100%]   8.0x.
Track 01: Total bytes read/written: 33248166496/33248182272 (16234464 sectors).
Writing  time:  1052.512s

This was on a fairly cheap no-name disc. I sometimes also try to write to 50GB BD-RE DL discs, but that works only sometimes, and at best at 2x speed. I am tempted to try, just for the fun of it, to get a 100GB BD-RE XL disc (these have been theoretically available since 2011), but I suspect that would be wasted time.
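
For reference, the transcript above is the kind of output produced by cdrecord (or a compatible tool such as cdrskin); a minimal sketch of the sort of invocation involved, where the device and image names are made up:

# a sketch only: device and image names are assumptions
cdrecord -v dev=/dev/sr0 -tao image.iso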

2017 July

170724 Mon: Different types of M.2 slots

As another example that there are no generic products, I was looking at a PC card to hold an M.2 SSD device and interface it to the host PCIe bus, and the description carried a warning:

Note, this adapter is designed only for 'M' key M.2 PCIe x4 SSD's such as the Samsung XP941 or SM951. It will not work with a 'B' key M.2 PCIe x2 SSD or the 'B' key M.2 SATA SSD.

There are indeed several types of M.2 slots, with different keyings, widths and speeds, supporting different protocols, and this is one of the most recent and fastest variants. Indeed among the user reviews there is also a comment as to the speed achievable by an M.2 flash SSD attached to it:

I purchased LyCOM DT-120 to overcome the limit of my motherboard's M.2. slot. Installation was breeze. The SSD is immediately visible to the system, no drivers required. Now I enjoy 2500 MB/s reading and 1500 MB/s writing sequential speed. Be careful to install the device on a PCI x4 slot at least, or you will still be hindered.

Those are pretty remarkable speeds, much higher (in peak sequential transfer) than those for a memristor SSD.
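
Incidentally, whether such an adapter has actually negotiated a PCIe x4 link (rather than falling back to x1 or x2 in a narrower slot) can be checked along these lines, where the PCI address is made up:

# show the negotiated PCIe link width and speed for the device behind the adapter;
# 01:00.0 is an example address, find the real one with plain lspci first
sudo lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'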

170720 Thu: The O_PONIES summary and too much durability

So last night I was discussing the O_PONIES controversy and was asked to summarise it, which I did as follows:

There is the additional problem that available memory has grown at a much faster rate than IO speed, at least that of hard disks, and this has meant that users and application writers have been happy to let very large amounts of unwritten data accumulate in the Linux page cache, which then takes a long time to be written to persistent storage.

The comments I got on this story were entirely expected, and I was a bit disappointed, but in particular one line of comment offers me the opportunity to explain a particularly popular and delusional point of view:

the behavior I want should be indistinguishable from copying the filesystem into an infinitely large RAM, and atomically snapshotting that RAM copy back onto disk. Once that's done, we can talk about making fsync() do a subset of the snapshot faster or out of order.

right. I'm arguing that the cases where you really care are actually very rare, but there have been some "design choices" that have resulted in fsync() additions being argued for applications..and then things like dpkg get really f'n slow, which makes me want to kill somebody when all I'm doing is debootstrapping some test container

in the "echo hi > a.new; mv a.new a; echo bye > b.new; mv b.new a" case, writing a.new is only necessary if the mv b.new a doesn't complete. A filesystem could opportunistically start writing a.new if it's not already busy elsewhere. In no circumstances should the mv a.new operation block, unless the system is out of dirty buffers

you have way too much emphasis on durability

The latter point is the key and, as I understand them, the implicit arguments are:

As to that I pointed out that any decent implementation of O_PONIES is indeed based on the true and obvious point that it is pointless and wasteful to make a sequence of metadata and data updates persistent until just before a loss of non-persistent storage content, and that therefore all that was needed were two internal system functions: one returning the time interval to the next loss of non-persistent storage content, and the other returning the time interval to the end of the current semantically-coherent sequence of operations.

Note: the same reasoning of course applies to backups of persistent storage content: it is pointless and wasteful to make them until just before a loss of the contents of that persistent storage.

Given those O_PONIES functions, it would never be necessary to explicitly fsync, and in almost all cases not necessary to implicitly fsync metadata, in a sequence of operations like:

echo hi > a.new; mv a.new a; echo bye > b.new; mv b.new a

Because they would be implicitly made persistent only once it were known that a loss of non-persistent storage content would happen before that sequence would complete.

As simple as that!
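
Meanwhile, in the absence of those functions, the careful version of that sequence has to ask for durability explicitly; a minimal sketch, assuming GNU coreutils 8.24 or later (whose sync accepts file arguments):

# the "careful" version of the sequence above, explicit about durability
printf 'hi\n' > a.new
sync a.new          # make the data of a.new durable
mv a.new a          # atomically replace a
sync .              # make the directory entry (the rename) durable
printf 'bye\n' > b.new
sync b.new
mv b.new a
sync .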

Unfortunately, until someone sends patches to Linus Torvalds implementing those two simple functions, there should be way too much emphasis on durability, because:

170716c Sun: Hardware specification of "converged" mobile phone

In the same issue of Computer Shopper as the review of a memristor SSD there is also a review of a mobile phone, also from Samsung: the Galaxy S8+, which has an 8-CPU 2.3GHz chip, 4GiB of memory, and 64GiB of flash SSD built in. That is the configuration of a pretty powerful desktop or a small server.

Most notably for me it has a 6.2in 2960×1440 AMOLED display, which is both physically large and of a higher resolution than most desktop or laptop displays. There have been mobile phones with 2560×1440 displays for a few years, which is amazing in itself, but this is a further step. There are currently almost no desktop or laptop AMOLED displays, and very few laptops have resolutions larger than 1366×768, or even decent IPS displays. Some do have 1920×1080 IPS LCD displays, and only a very few have 3200×1800.

The other notable characteristic of the S8+, given that its processing power is huge, is an optional docking station that allows it to use an external monitor, keyboard and mouse (most likely the keyboard and mouse can be used anyhow, as long as they are Bluetooth ones).

This is particularly interesting as desktop-mobile convergence (1, 2, 3) was the primary goal of the Ubuntu strategy of Canonical:

Our strategic priority for Ubuntu is making the best converged operating system for phones, tablets, desktops and more.

Earlier this year that strategy was abandoned, and I now suspect that a significant motivation was that Samsung was introducing convergence themselves with DeX, for Android, and at a scale and sophistication that Canonical could not match, not being a mobile phone manufacturer itself.

170716b Sun: Hardware specification of smart watch and related

In the same Computer Shopper UK issue 255, the one with the review of a memristor SSD, there is also a review of a series of fitness-oriented smartwatches, and some of them, like the Samsung Gear S3, have 4GB of flash storage, 768MiB of RAM and a two-CPU 1GHz chip. That can run a server for a significant workload.

170716 Sun: Miracles do happen: memristor product exists

Apparently memristor storage has arrived, as Computer Shopper UK issue 255 has a practical review of the Intel 32GB Optane Memory in an M.2 form factor. That is based on the mythical 3D XPoint memristor memory brand. The price and specifications have some interesting aspects:

The specifications don't say it, but it is not a mass storage device with a SATA or SAS protocol: it is a sort of memory technology device as far as the Linux kernel is concerned, and for optimal access it requires special handling.

Overall this first memristor product is underwhelming: it is more expensive and slower than equivalent M.2 flash SSDs, even if the read and write access times are much better.

170712 Wed: Special case indentation for PSGML

To edit HTML and XML files like this one I use EMACS with the PSGML library, as it is driven by the relevant DTD, and this drives validation (which is fairly complete) and indentation. As to the latter, some HTML elements should not be indented further, because they indicate types of speech rather than parts of the document, and some do not need to be indented at all, as they indicate preformatted text.

Having looked at PSGML there is for the latter case a variable in psgml-html.el that seems relevant:

	sgml-inhibit-indent-tags     '("pre")

but it is otherwise unimplemented. So I have come up with a more complete scheme:

(defvar sgml-inhibit-indent-tags nil
  "*A list of tags whose content should not be indented at all")

(defvar sgml-same-indent-tags nil
  "*A list of tags whose content should not be indented further")

(defun sgml-mode-customise ()

  (defun sgml-indent-according-to-level (element)
    (let ((name (symbol-name (sgml-element-name element))))
      (cond
	((member name sgml-inhibit-indent-tags) 0)
	((member name sgml-same-indent-tags)
	  (* sgml-indent-step
	    (- (sgml-element-level element) 1)))
	(t
	  (* sgml-indent-step
	    (sgml-element-level element)))
      )
    )
  )
)

(when (and (fboundp 'sgml-mode) (not noninteractive))
  (if (fboundp 'eval-after-load)
     (eval-after-load "psgml" '(sgml-mode-customise))
     (sgml-mode-customise)
  )
)

(defun html-sgml-mode ()
  "Simplified activation of HTML as an application of SGML mode."
  (interactive)

  (sgml-mode)
  (html-mode)

  (make-local-variable			'sgml-default-doctype-name)
  (make-local-variable			'sgml-inhibit-indent-tags)
  (make-local-variable			'sgml-same-indent-tags)

  (setq
    sgml-default-doctype-name		"html"
    ; tags must be listed in upper case
    sgml-inhibit-indent-tags		'("PRE")
    sgml-same-indent-tags		'("EM" "TT" "I" "B" "CITE" "VAR" "CODE" "DFN" "STRONG")
  )
)

It works well enough, except that I would prefer the elements with tags listed in sgml-inhibit-indent-tags to have their start and end tags also not indented, not just their content; but PSGML indents those as content of the enclosing element, so achieving that would require more invasive modifications of the indentation code.

170704 Tue: Fast route lookup in Linux with large numbers of routes

Fascinating report with graphs on how route lookup has improved in the Linux kernel, and the very low lookup times reached:

Two scenarios are tested:

  • 500,000 routes extracted from an Internet router (half of them are /24), and
  • 500,000 host routes (/32) tightly packed in 4 distinct subnets.

As of kernel version 3.6.11 the routing lookup cost was 140ns and 40ns for the two scenarios; as of 4.1.42 it is 35ns and 25ns. Dedicated "enterprise" routers with hardware routing are probably equivalent. In a previous post the amount of memory used is given: with only 256 MiB, about 2 million routes can be stored!

As previously mentioned, once upon a time IP routing was much more expensive than Ethernet forwarding and therefore there was a speed case for both Ethernet forwarding across multiple hubs or switches, and for routing based on prefixes; despite the big problems that arise from Ethernet forwarding across multiple switches and the limitations that are consequent to subnet routing.

But it has been a long time since subnet routing became as cheap as Ethernet forwarding, and it is now pretty obvious that even per-address host routing is cheap enough, at least at the datacentre level and very likely at the campus level (500,000 addresses is huge!).
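
For example, adding and then checking a host route is just this (a sketch: the address and interface name are made up):

ip route add 192.0.2.42/32 dev eth0    # a per-address host (/32) route
ip route get 192.0.2.42                # show which route the kernel picks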

Therefore so-called host routing can result in a considerable change in design decisions, but that depends on realizing that host routing is improper terminology, and on the difference between IP addresses and Ethernet addresses:

Note: the XNS internetworking protocols used addresses formed of a 32 bit prefix as network identifier (possibly subnetted) and a 48 bit Ethernet identifier, which to me seems a very good combination.

The popularity of Ethernet and in particular of VLAN broadcast domains spanning multiple switches depends critically on Ethernet addresses being actually identifiers: an interface can be attached to any network access point on any switch at any time and be reachable without further formality.

Now so-called host routes turn IP addresses into (within The Internet) endpoint identifiers, because the endpoint can change location, from one interface to another, or one host to another, and still identify the service it represents.

170701 Sat: Fixing an Ubuntu phone after a power loss crash

So I have this Aquaris E4.5 with Ubuntu Touch, whose battery is a bit fatigued, so it sometimes runs out before I can recharge it. Recently, when it restarted, the installed Ubuntu system seemed damaged: various things had bizarre or no behaviours.

For example, when I click on System Settings > Wi-Fi I see nothing listed, except that Previous networks lists a number of them (some 2-3 times), but when I click on any of them System Settings exits; no SIM card is recognized (they work in my spare, identical E4.5), and no non-local contacts appear.

Also, in System Settings, clicking on About, Phone, Security & Privacy or Reset just makes it crash.

Previous "battery exhausted" crashes had no bad outcomes, as expected, as battery power usually runs out when the system is idle.

After some time looking into it I figured out that part of the issue was that:

$HOME/.config/com.ubuntu.address-book/AddressBookApp.conf.lock
$HOME/.config/connectivity-service/config.ini.lock

were blocking the startup of the relevant services, so removing them allowed those services to proceed.

As to System Settings crashing, it was getting a SEGV, and strace let me figure out that this was because it ran out of memory just after accessing something under $HOME/.cache/, so I just emptied that directory, and then everything worked. Some cached setting had perhaps been corrupted. I suspect that the cache needs occasional cleaning out.
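
In other words the fix amounted to roughly this (emptying all of $HOME/.cache/ is a blunt instrument, but its contents are meant to be regenerable):

# remove the stale lock files that were blocking service startup
rm -f "$HOME/.config/com.ubuntu.address-book/AddressBookApp.conf.lock"
rm -f "$HOME/.config/connectivity-service/config.ini.lock"
# empty the per-user cache that System Settings was crashing on
rm -rf "$HOME/.cache/"*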

Note: another small detail: /usr/bin/qdbus invokes /usr/lib/arm-linux-gnueabihf/qt5/bin/qdbus, which is missing, but /usr/lib/arm-linux-gnueabihf/qt4/bin/qdbus works.