Just to let you all know I have a new OpenPGP key, which is 4096-bit RSA. It is available from all the usual PGP key sites, or you can download it directly from the Contact page should you prefer, where you’ll also find the fingerprint details.
Author Archives: Gary Hawkins
Why you should never trust companies who make promises containing the word “forever”, part 4,294,967,296
Google recently announced that they would be turning off their Google Talk service and replacing it with Google Hangouts. Normally I wouldn’t care at all about this, and to a certain extent I still don’t, but by doing this they have gone back on something that Google said some time ago about using open protocols and the word “forever”. It was no surprise to me that they’d do something like this, though it may be to some people. Their whole business model relies on selling the data which you provide to them for nothing, so if they’re not making money on a certain feature, I’d imagine they have no qualms about pulling it to suit themselves, rather than you.
As you may know, Google Talk was based on a protocol called XMPP (or Jabber as it was formerly known), which is an open standard for instant messaging, defined by RFC 6120-6122 et al. The advantage of using XMPP, in addition to the fact it is an open standard, is that there are a huge variety of XMPP clients and servers you can use, whether on PCs, Macs, Linux, or tablets and phones, and they all interoperate fine, in the main.
Google’s decision to stop federation of XMPP between “third-party” (to Google) servers and their own basically means they are limiting the use of their servers to Google account holders only. I expect that in due course, Google Talk itself will be retired fully so that their own official client will no longer work. (The Android app has already been replaced with “Google Hangouts”.) This is not unlike Facebook Chat, which also uses XMPP, but at least Facebook have been clear from day one that they will not be federating with anyone else and have always operated a “closed” server.
It’s a stark reminder that when a company promises “free”, there’s usually a catch, and if they promise “forever”, that means “until we feel like discontinuing it”. People who were relying on both of those terms meaning what they say they do, have just received a nasty shock.
If you were using Google Talk to communicate with my XMPP server, then you will now need to find another XMPP account to use instead. Thankfully, there are many different servers around the world who offer such accounts, one of the best known being jabber.org but many other sites are available too, and this link has a list of some of them. You can of course, if you are able to, run your own XMPP server using one of many open-source or commercial XMPP servers available.
Once you’ve registered an account, if you need an XMPP client, there are many to choose from, and many are open source. Popular ones available for Windows, Mac and Linux include Psi, Pidgin, Adium, and many more. For Android users, Xabber is available from the Google Play store, and there are also plenty for other types of smartphones (though I have no direct experience of them). Some of these clients can also do voice calls now, something which, again due to a choice made by Google, was never available between third-party servers and Google Talk, only between their own customers.
At least Google’s decision hasn’t killed XMPP, the protocol, which is still an open standard and still has millions of users worldwide. I’d encourage everyone to continue to use a standard which isn’t going to go away because one particular (large) company says so.
(And, in case you’re wondering, my Google Hangouts app on my Android phone is still disabled, just as the Google Talk one was before it.)
Debian Wheezy (7.0) released
Official announcement here:
http://www.debian.org/News/2013/20130504
Oh, well, jessie here we come …
BT Retail quietly trials CGNAT on some of their customer connections
According to this news story on ISPreview.co.uk, BT Retail are quietly trialling Carrier Grade NAT on some of their customer connections. This is the first of the ‘Big 5’ known to be doing so, not long after the Plusnet (also owned by BT) trial took place. To me, this suggests that we are now at the point where even the big ISPs no longer have the IPv4 addresses to allocate to customers, and the article suggests that up to nine other customers will be sharing a single address with you, if you have been placed on this trial.
The worry here is that all of the Big 5 will just use this as an excuse to delay IPv6 adoption further, when what we really need is mass adoption of IPv6 followed by, if and only if necessary, CGNAT for all the “legacy” applications still left on the Internet.
BT’s official FAQ on the matter can be found here.
Custom logo on UEFI boot screen
This is just one of those things you always wish you could do on a real piece of hardware, just “cos you can”, but once you can, you never actually do it. But, now, you can actually “roll your own BIOS” when it comes to running virtual machines. There are two methods you can use with qemu/kvm. The first is the SeaBIOS package, which you can compile to produce a traditional 16-bit BIOS (though quite why you’d want to add a logo to that is beyond me, since the boot screen only appears for one or two seconds, tops). The second is OVMF, which comes as part of the TianoCore EDK II UEFI development kit, and which can (optionally) use a specially compiled version of SeaBIOS to provide a Compatibility Support Module so that it can emulate a BIOS as well as being UEFI firmware.
As it so happens, it is possible to add a custom logo to the OVMF UEFI code. I’ll assume you’re already familiar with how to roll your own UEFI here from the source.
- Create a BMP file using your favourite graphics program – use a black background. Don’t make it any bigger than the logo size itself (i.e. don’t make it full screen)
- Save it as an 8-bit BMP file with 256 colours (note: GIMP 2.8 appears to be buggy in this regard and actually saves the image with the wrong number of bits even if the palette size is only 256 – I had to import the 24-bit file into Paint.net and save it as 8-bit from that)
- Copy it to the EDK source tree into the directory $EDK_HOME/MdeModulePkg/Logo/Logo.bmp (back up the existing one first)
- Compile your OVMF UEFI firmware
- That’s it!
And here’s the result … (click the image for a better look)
Update: It’s been a long time since I’ve tried this, but it appears you can actually use 24-bit BMP files now (on the 202311 release of OVMF). According to the source code they must not be compressed, so no RLE encoding or anything like that. I’ve had success today with a 24-bit colour uncompressed BMP image, 640 pixels wide. You may or may not need imagemagick installed for this to work.
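Since the graphics program may not write the bit depth you asked for, it can save a rebuild to check the file's header first. What follows is only a minimal sketch in Python, assuming the file uses the common BITMAPINFOHEADER-style layout (a DIB header of 40 bytes or more). The function name `inspect_bmp` is my own invention for illustration and is not part of EDK II.

```python
import struct


def inspect_bmp(data: bytes) -> dict:
    """Return a few header fields from the raw bytes of a BMP file.

    Assumes a BITMAPINFOHEADER-style DIB header (40 bytes or larger);
    older header variants are rejected rather than guessed at.
    """
    if len(data) < 34 or data[:2] != b"BM":
        raise ValueError("not a BMP file (missing 'BM' signature)")

    # The DIB header starts at byte 14; its first field is its own size.
    (dib_size,) = struct.unpack_from("<I", data, 14)
    if dib_size < 40:
        raise ValueError("older BMP header variant, not handled by this sketch")

    # width, height (signed), planes, bits per pixel, compression method
    width, height, _planes, bpp, compression = struct.unpack_from("<iiHHI", data, 18)
    return {
        "width": width,
        "height": abs(height),       # negative height means top-down rows
        "bits_per_pixel": bpp,       # 8 for a 256-colour image, 24 for true colour
        "compression": compression,  # 0 means uncompressed (BI_RGB)
    }
```

To use it, read the logo file in binary mode and pass the bytes in, for example `inspect_bmp(open("Logo.bmp", "rb").read())`. For the cases described above you would hope to see `bits_per_pixel` of 8 (or 24 on newer OVMF) and `compression` of 0.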
Adventures in Ticketing
This is still a work in progress – you have been warned! (Please note I’m not necessarily recommending you code the SQL in this way either, since there are far more optimal ways of doing it in a real program, but it helps to do it this way to explain what I’m doing.)
As you may have noticed from this post, the Association of Train Operating Companies have released their fares data as a series of text files in a ZIP file under a Creative Commons licence. Not one to pass up a challenge like this, I quickly downloaded the file and then went about seeing what I could make of it. Just over a week later, I had created a working, if not entirely finished and polished, Python script to import those data files into a PostgreSQL database. (I will open source this tool and release it to the world once it’s finished, but it’s really only usable by me currently. More news on that when I’ve finished it)
So, now I have a database that is 4GiB in size and contains 58 tables, so what can I do with it? Well, the obvious answer would be to find a fare from A to B. Easy, you’d think. Wrong!
First of all, let me include a small glossary of railway fares terms for you:
- NLC – National Location Code. A four-character code, usually numeric but not always, which identifies either a particular station or other place that can issue tickets.
- RJIS – Rail Journey Information System. The big railway industry database, run by Fujitsu, that the ATOC-provided files are sourced from. Basically the same set of files are sent to the various ticket issuing systems in use, but obviously they are updated more than once every three months.
- TOC – Train Operating Company
- Flow – A journey between a given origin and destination is known as a “flow”. The fare for each flow is assigned to a particular TOC or other body permitted to set fares, such as a Passenger Transport Executive (West Yorkshire Metro, GMPTE in Manchester, etc.) or Transport for London. It is important to note that not every journey between two points on the UK rail network is individually priced.
- CRS code – A CRS code is a three-character (usually three letters) code which identifies a station. These usually follow the station name, for example LDS is Leeds and BRI is Bristol Temple Meads.
- Station Cluster – A station cluster is a group of stations which belong in a group for the purposes of pricing up a flow.
So let’s attempt to find a fare.
Let’s pick two stations at random, a good distance apart, say Taunton to Leeds. The obvious choice is to look up those two stations in the database and see what it tells us. According to the flow_f table, I need to specify an origin and a destination. These are stored in the database as NLCs, so I need to convert Taunton and Leeds into NLCs. To do that I need to query the location_l table…
SELECT description,nlc_code FROM location_l WHERE description='LEEDS' OR description='TAUNTON';
That didn’t work! Why not? Well, the answer is a bit subtle. Every record in the data files is fixed width, which means you (at the very least) have to strip spaces from it, and at worst get rid of superfluous dots at the end of it. Madness. Let’s try again…
SELECT description,nlc_code FROM location_l WHERE description LIKE 'LEEDS%' OR description LIKE 'TAUNTON%';
That’s better. But wait a minute, I’ve got multiple records? What on earth … ?
"TAUNTON. ";"3471"
"TAUNTON STN FRCT";"7434"
"LEEDS. ";"8487"
"LEEDS + BUS ";"H975"
"TAUNTON+BUS ";"J945"
"LEEDS BUS ";"K202"
"LEEDS BRD AIRBUS";"K650"
"LEEDS FEST BUS ";"K684"
OK, this isn’t ideal, but it does at least provide the information we wanted – TAUNTON. (with the dot) is NLC 3471, and LEEDS. (also with the dot) is NLC 8487. So now we can query the flow_f table with the right information. So, let’s have a go …
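The padding and trailing dots could also be tidied up before the data is ever queried. Here is a rough sketch in Python of how that might look. It is not the code from my import script. The helper name `clean_description` is made up for this example, and the 16-character field width is only a guess based on the longest description in the output above.

```python
def clean_description(raw: str) -> str:
    """Strip fixed-width padding, then any trailing dots, from a description."""
    # Caveat: a name that legitimately ends in a dot would lose it too.
    return raw.rstrip().rstrip(".").rstrip()


# Sample rows taken from the query output above, padded to an assumed
# field width of 16 characters purely for illustration.
rows = [
    ("TAUNTON.".ljust(16), "3471"),
    ("TAUNTON STN FRCT".ljust(16), "7434"),
    ("LEEDS.".ljust(16), "8487"),
    ("LEEDS + BUS".ljust(16), "H975"),
]

# Map each cleaned-up description to its NLC.
nlc_by_name = {clean_description(desc): nlc for desc, nlc in rows}
```

With the names normalised like this, an exact match on 'LEEDS' or 'TAUNTON' finds only the station itself, and the LIKE pattern (which also drags in the bus and festival entries) is no longer needed.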
SELECT * FROM flow_f, flow_t WHERE flow_f.flow_id=flow_t.flow_id AND origin_code='3471' AND destination_code='8487';
Nothing! Why ever not? The answer to this is a little less than simple. As I mentioned above, a flow is not the same as a journey. In other words, not all journeys have an individually priced flow. In the old days of the paper fares manuals, this was generally dealt with in the book by having a ‘related station’ – in other words, if you couldn’t find the fare explicitly listed from A to B in the listing for station A, you looked it up from a related station (which would be nearby). All that has gone out of the window and been replaced with something massively more complex – the idea of the Station Cluster. Hold on tight … !
What we need to do next is to find out if NLC 3471 (Taunton) and NLC 8487 (Leeds) are in any station clusters. If they are, we then need to repeat the above query replacing the station NLCs with the cluster NLCs. So let’s have a grapple around the station_clusters table to see if we can find what we want:
SELECT cluster_nlc,cluster_id FROM station_clusters WHERE cluster_nlc='3471' OR cluster_nlc='8487' ORDER BY cluster_nlc;
So let’s see what that query gave us:
"3471";"Q824"
"3471";"Q942"
"3471";"S327"
"3471";"Q903"
"3471";"R070"
"3471";"R409"
"3471";"Q391"
"3471";"Q909"
"3471";"T104"
"3471";"Q814"
"3471";"Q715"
"3471";"T204"
"3471";"R708"
"3471";"T327"
"3471";"T406"
"3471";"T511"
"8487";"Q864"
"8487";"Q865"
"8487";"Q903"
"8487";"Q931"
"8487";"Q937"
"8487";"Q938"
"8487";"Q964"
"8487";"R629"
"8487";"R639"
"8487";"R640"
"8487";"R705"
"8487";"S031"
"8487";"S205"
"8487";"T007"
"8487";"T018"
"8487";"T020"
"8487";"T027"
"8487";"T029"
"8487";"GLA1"
"8487";"T143"
"8487";"T229"
"8487";"T129"
"8487";"Q475"
"8487";"Q614"
"8487";"Q661"
"8487";"Q696"
"8487";"Q716"
"8487";"Q814"
"8487";"Q816"
"8487";"Q819"
"8487";"Q822"
"8487";"Q824"
"8487";"Q844"
"8487";"Q845"
"8487";"Q846"
"8487";"Q847"
"8487";"Q850"
Erk! What is all this lot? Well, the first column is the NLC of the station we’re looking for, and the second column is a list of all the station clusters that that NLC is in. And you’ll notice there’s loads of them…
Thankfully there’s a clever bit of SQL that can do this task for me. Try this …
SELECT <some fields> FROM flow_f, flow_t WHERE flow_f.flow_id=flow_t.flow_id AND flow_f.origin_code IN (SELECT cluster_id FROM station_clusters WHERE cluster_nlc='3471') AND flow_f.destination_code IN (SELECT cluster_id FROM station_clusters WHERE cluster_nlc='8487');
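If you want to play with the subquery approach without a 4GiB database to hand, the following is a small sketch using Python's built-in sqlite3 module and an in-memory database, rather than PostgreSQL. The table and column names come from the queries above. Everything else is an assumption for illustration only: the `fare` column, the single flow row, the pairing of cluster Q715 with cluster Q864, and the helper name `find_cluster_flows`.

```python
import sqlite3


def find_cluster_flows(conn, origin_nlc, destination_nlc):
    """Find flows priced between any cluster containing each station."""
    sql = """
        SELECT flow_f.flow_id, flow_f.origin_code,
               flow_f.destination_code, flow_t.fare
        FROM flow_f
        JOIN flow_t ON flow_f.flow_id = flow_t.flow_id
        WHERE flow_f.origin_code IN
              (SELECT cluster_id FROM station_clusters WHERE cluster_nlc = ?)
          AND flow_f.destination_code IN
              (SELECT cluster_id FROM station_clusters WHERE cluster_nlc = ?)
    """
    # Bound parameters keep the NLCs out of the SQL string itself.
    return conn.execute(sql, (origin_nlc, destination_nlc)).fetchall()


# Toy data: two real cluster IDs from the output above, but the flow
# between them and its fare value are invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE station_clusters (cluster_id TEXT, cluster_nlc TEXT);
    CREATE TABLE flow_f (flow_id INTEGER, origin_code TEXT, destination_code TEXT);
    CREATE TABLE flow_t (flow_id INTEGER, fare INTEGER);
    INSERT INTO station_clusters VALUES ('Q715', '3471'), ('Q864', '8487');
    INSERT INTO flow_f VALUES (1, 'Q715', 'Q864');
    INSERT INTO flow_t VALUES (1, 12345);
""")

rows = find_cluster_flows(conn, "3471", "8487")
```

Note that, like the query above, this only finds flows priced from one cluster to another. A flow priced from a cluster to an individual station's NLC (or the other way round) would not be matched.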
To be continued…
Open rail fares data now available from ATOC
After many years of various people trying and wrangling with ATOC, there now exists a page http://data.atoc.org/fares-data from which you can actually download the fares data without having to shell out £15 or so for the Avantix Traveller CD from TSO, or for the National Fares Manuals on paper before that.
This is great news at last though, even if it is currently only on a trial basis. Right, where’s my web browser?
Fed up of worrying about IPv4 exhaustion? Try worrying about this instead
http://samsclass.info/ipv6/exhaustion.htm
(with thanks to farnz for sending me the link)
And now the fun begins…
Today marks the firing of the starting gun in the next phase of IPv4 exhaustion. To date, running NATs on the ISP side has generally been the preserve of the mobile operators, where running servers generally isn’t something you’d want or need to do over a mobile connection. However, something new happened today – Plusnet, the Sheffield-based (and BT Group owned) ISP announced they are going to run a 3-week trial of CGNAT, or Carrier Grade NAT.
Some more info on that can be found here:
http://www.ispreview.co.uk/index.php/2013/01/isp-plusnet-trials-controversial-ipv4-address-sharing-as-ipv6-alternative.html
http://www.thinkbroadband.com/news/5658-plusnet-in-trial-of-carrier-grade-nat-to-conserve-ipv4-addresses.html
http://www.theregister.co.uk/2013/01/15/ipv4_nat/
Now, the term “carrier grade” has, to date, normally meant something big and impressive that can handle thousands or millions of whatever it does, and generally you’d expect a “carrier grade” product to be somehow better than one which is “non-carrier grade”. However, this is probably the one “carrier grade” product that generally makes things worse. A bit of explanation here – “carrier grade” in this context means that it is suitable for a “carrier”, or telco, to use. CGNAT is “carrier grade” in the sense that it can handle thousands and thousands of connections at once, and runs on slightly beefier hardware than your average home ADSL router. The net effect of running CGNAT, though, is to make the experience worse than it would otherwise have been, since you are now removing the ability for that ISP’s users to do port forwarding through the ISP’s NAT. That will almost certainly stop quite a lot of things working, for example anything that requires an incoming port (e.g. a server), or perhaps things like UPnP and even Skype. This is probably going to cause quite a lot of complaints, depending on how much it breaks, and also means that users will be getting a considerably more degraded Internet experience than they already are (since NAT is not exactly how IPv4 was intended to be used in the first place).
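One rough way to tell whether you might be behind a CGNAT is to look at the WAN address your router has been given. RFC 6598 reserves 100.64.0.0/10 as shared address space for exactly this purpose. The sketch below, in Python, simply tests an address against that range. Whether Plusnet or BT actually use this range is an assumption on my part, as the articles above don’t say, and the function name `looks_like_cgnat` is made up for this example.

```python
import ipaddress

# Shared address space reserved for carrier grade NAT by RFC 6598.
CGNAT_SHARED_SPACE = ipaddress.ip_network("100.64.0.0/10")


def looks_like_cgnat(wan_address: str) -> bool:
    """Return True if the address falls inside the RFC 6598 shared range."""
    return ipaddress.ip_address(wan_address) in CGNAT_SHARED_SPACE
```

A result of False doesn’t prove you have a public address to yourself, since an ISP could just as easily use one of the ordinary private ranges on its side of the NAT, but a True result is a fairly strong hint.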
Plusnet had an IPv6 trial, which they stopped for some unexplained reason, and so far no word on when or if it will be resurrected. Rolling out CGNAT and not also rolling out IPv6 seems very short-sighted to me, but more importantly the fact that they are even considering CGNAT at all suggests to me that there is one AS in the UK that could be running dangerously low on its allocated IPv4 addresses (with no prospect of obtaining any more from RIPE).
I shall be watching this space with interest…
Merry Christmas and Happy New Year
’Tis the season to wish all of my readers (let’s face it, three…) a Merry Christmas and a Happy New Year.