Software and hardware annotations 2007 October

This document contains only my personal opinions and calls of judgement, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.


071010 Wed 50 times faster or infinitely scalable?
The CTO of Amazon has been intrigued by Michael Stonebraker's contention that existing DBMS products are too generic and therefore suffer from bloat and inefficiencies, and that a focus on specific application areas might result in DBMSes with 50 times the performance, for example for transactional use (perhaps he is thinking of dedicated IBM mainframe transactional systems).
The objection to that is that a goal of 50 times better performance is not ambitious or useful, and that a better goal would be unbounded scalability; that, I think, betrays a somewhat technical rather than business-oriented approach. Because usually, and especially for databases, what matters is cost reduction, or scalability by a small finite factor, rather than infinite scalability, as most businesses are not unboundedly scalable. Most businesses are established, and cannot even dream of growing by 50 times, but they might well want to cut their DBMS costs by that amount, or grow by a few times over a reasonable lifetime. Not all companies are Google-like startups, and even for those seamless scalability does not matter that much, as the same technology or business model is rarely usable over more than a 50-fold growth, because mistakes and midway corrections do happen.
Even more tragically, a somewhat small fixed factor of improvement (if 50 times counts as small) is targetable, designable, and even testable; conversely, unbounded scalability can come with an uncomfortably large fixed overhead and a slow rate of improvement.
To take an extreme example, a DBMS that can scale to support one trillion customers but takes 100,000 servers to support 10,000 customers is entirely pointless: there is no gain in planning for a trillion customers, and the cost at the initial part of the curve is simply impractical. Conversely, a company that initially targets 100,000 customers supportable with 50 servers, and hopes to grow to 500,000 in say 5 years, will be very happy to switch to a DBMS that can support that on 5 servers, even if that means that, were it ever to expand to one trillion customers, it would need 2 billion servers where the fully scalable solution would need only 2 million.
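The arithmetic behind the example above can be checked in a few lines of Python. The numbers are the deliberately silly illustrative ones from the text, not real benchmarks:

```python
# Illustrative arithmetic only, using the deliberately silly numbers
# from the example above.

# An "unboundedly scalable" DBMS with a huge fixed overhead:
scalable_servers = 100_000
scalable_customers = 10_000
print(scalable_servers / scalable_customers)   # 10 servers per customer: pointless

# A realistic business: 100,000 customers on 50 servers today...
current_density = 100_000 / 50     # 2,000 customers per server
# ...offered a specialised DBMS serving 500,000 customers on 5 servers:
improved_density = 500_000 / 5     # 100,000 customers per server
print(improved_density / current_density)      # a 50-fold improvement
```

The second ratio is exactly the kind of boring, bounded 50-times gain the entry argues is where the money is.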
These may seem silly numbers, but the point here is that businesses make money by becoming somewhat better than what they have been and than the competition, not by being unboundedly scalable; and while fixed-scale improvements are boring, they tend to be where the money is.
071006 Sat Selling out: upgrading to 2GB RAM and dual monitors
So I have sold out again: for someone who complains about bad programming resulting in excessive memory usage, I have now upgraded my 2GHz socket 754 Athlon 64 from 1GB to 2GB of RAM. The reason is that I now have two 1280x1024 monitors and have the window manager keep 6 virtual screens. This has resulted in my keeping a much larger number of applications open and switching between them, as a kind of active bookmarks. Actually I should have said application instances, as most of the windows are from a few applications, typically XEmacs, Konqueror, Konsole, sometimes Akregator, sometimes a large Java based network monitoring application. KDE applications do share quite a bit among instances, fortunately, but they are not always coded for low RAM or CPU usage, even if admittedly I have a usage pattern with high peaks (for example, when browsing the web or RSS feeds I open dozens of pages which I then survey quickly for the more relevant ones). Still, it is hard to imagine why having a few dozen 900x800 windows open should cost hundreds of MiBs. But such is modern technology: the 2nd GiB cost me about £40, the price to be paid for more state.
Given that all those application instances do not need to be active at the same time, I might have chosen to spend that money on something else, but I have suffered enough because of the Linux swap policies (which I think are quite abysmal) that the upgrade to 2GiB lets me keep everything memory resident and avoid swapping except in very rare cases, where I cannot avoid the slow page-in rates on offer. Since some Linux kernel and GUI developers have at least 4GiB of RAM in their PCs, that will soon become the minimum needed; but I hope that 2GiB will give me another year or two before I have to engage the Linux swap subsystem again.