Robotic Tendencies
The personal blog of Robert McQueen

August 12, 2019

Flathub, brought to you by…

Over the past 2 years Flathub has evolved from a wild idea at a hackfest to a community of app developers and publishers making over 600 apps available to end-users on dozens of Linux-based OSes. We couldn’t have gotten anything off the ground without the support of the 20 or so generous souls who backed our initial fundraising, and to make the service a reality since then we’ve relied on the contributions of dozens of individuals and organisations such as Codethink, Endless, GNOME, KDE and Red Hat. But for our day to day operations, we depend on the continuous support and generosity of a few companies who provide the services and resources that Flathub uses 24/7 to build and deliver all of these apps. This post is about saying thank you to those companies!

Running the infrastructure

Mythic Beasts is a UK-based “no-nonsense” hosting provider who provide managed and un-managed co-location, dedicated servers, VPS and shared hosting. They are also conveniently based in Cambridge where I live, and very nice people to have a coffee or beer with, particularly if you enjoy talking about IPv6 and how many web services you can run on a rack full of Raspberry Pis. The “heart” of Flathub is a physical machine donated by them which originally ran everything in separate VMs – buildbot, frontend, repo master – and they have subsequently increased their donation with several VMs hosted elsewhere within their network. We also benefit from huge amounts of free bandwidth, backup/storage, monitoring, management and their expertise and advice at scaling up the service.

Starting with everything running on one box in 2017 we quickly ran into scaling bottlenecks as traffic started to pick up. With Mythic’s advice and a healthy donation of 100s of GB / month more of bandwidth, we set up two caching frontend servers running in virtual machines in two different London data centres to cache the commonly-accessed objects, shift the load away from the master server, and take advantage of the physical redundancy offered by the Mythic network.

As load increased and we brought a CDN online to bring the content closer to the user, we also moved the Buildbot (and its associated Postgres database) to a VM hosted at Mythic in order to offload as much IO bandwidth as possible from the repo server and keep up sustained HTTP throughput during update operations. This helped significantly but we are in discussions with them about a yet larger box with a mixture of disks and SSDs to handle the concurrent read and write load that we need.

Even after all of these changes, we keep the repo master on one, big, physical machine with directly attached storage because repo update and delta computations are hugely IO intensive operations, and our OSTree repos contain over 9 million inodes which get accessed randomly during this process. We also have a physical HSM (a YubiKey) which stores the GPG repo signing key for Flathub, and it’s really hard to plug a USB key into a cloud instance, and know where it is and that it’s physically secure.

Building the apps

Our first build workers were under Alex’s desk, in Christian’s garage, and a VM donated by Scaleway for our first year. We still have several ARM workers donated by Codethink, but within the first few months of 2018 it became pretty clear that we were not going to keep up with the growing pace of builds without some more serious iron behind the Buildbot. We also wanted to be able to offer PR and test builds, beta builds, etc – all of which multiplies the workload significantly.

Thanks to an introduction by the most excellent Jorge Castro and the approval and support of the Linux Foundation’s CNCF Infrastructure Lab, we were able to get access to an “all expenses paid” account at Packet. Packet is a “bare metal” cloud provider — like AWS except you get entire boxes and dedicated switch ports etc to yourself – at a handful of main datacenters around the world with a full range of server, storage and networking equipment, and a larger number of edge facilities for distribution/processing closer to the users. They have an API and a magical provisioning system which means that at the click of a button or one method call you can bring up all manner of machines, configure networking and storage, etc. Packet is clearly a service built by engineers for engineers – they are smart, easy to get hold of on e-mail and chat, share their roadmap publicly and set priorities based on user feedback.

We currently have 4 Huge Boxes (2 Intel, 2 ARM) from Packet which do the majority of the heavy lifting when it comes to building everything that is uploaded, and also use a few other machines there for auxiliary tasks such as caching source downloads and receiving our streamed logs from the CDN. We also used their flexibility to temporarily set up a whole separate test infrastructure (a repo, buildbot, worker and frontend on one box) while we were prototyping recent changes to the Buildbot.

A special thanks to Ed Vielmetti at Packet who has patiently supported our requests for lots of 32-bit compatible ARM machines, and for his support of other Linux desktop projects such as GNOME and the Freedesktop SDK who also benefit hugely from Packet’s resources for build and CI.

Delivering the data

Even with two redundant / load-balancing front end servers and huge amounts of bandwidth, OSTree repos have so many files that if those servers are too far away from the end users, the latency and round trips cause a serious problem with throughput. In the end you can’t distribute something like Flathub from a single physical location – you need to get closer to the users. Fortunately the OSTree repo format is very efficient to distribute via a CDN, as almost all files in the repository are immutable.

After a very speedy response to a plea for help on Twitter, Fastly – one of the world’s leading CDNs – generously agreed to donate free use of their CDN service to support Flathub. All traffic to the dl.flathub.org domain is served through the CDN, and automatically gets cached at dozens of points of presence around the world. Their service is frankly really really cool – the configuration and stats are really powerful, unlike any other CDN service I’ve used. Our configuration allows us to collect custom logs which we use to generate our Flathub stats, and to define edge logic in Varnish’s VCL which we use to allow larger files to stream to the end user while they are still being downloaded by the edge node, improving throughput. We also use their API to purge the summary file from their caches worldwide each time the repository updates, so that it can stay cached for longer between updates.
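As a rough illustration of the purge step, here is a minimal Python sketch that builds (but deliberately does not send) a single-URL purge request of the kind Fastly accepts: an HTTP PURGE to the cached URL, authenticated with a Fastly-Key header. The summary URL, the token and the helper name are assumptions for illustration only, and this is not necessarily how Flathub's own tooling does it.

```python
import urllib.request


def build_purge_request(url, api_token):
    """Build, without sending, a single-URL purge request for a cached object."""
    return urllib.request.Request(
        url,
        method="PURGE",                     # Fastly purges a URL via the PURGE method
        headers={"Fastly-Key": api_token},  # API token header for authenticated purges
    )


# Hypothetical values: the real summary path and token are not given in this post.
SUMMARY_URL = "https://dl.flathub.org/repo/summary"
request = build_purge_request(SUMMARY_URL, "example-token")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to urllib.request.urlopen after each repository update; everything else in the repo is immutable, so only this one object needs invalidating.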

To get a feel for how well this works, here are some statistics: The Flathub main repo is 929 GB, of which 73 GB are static deltas and 1.9 GB are screenshots. It contains 7280 refs for 640 apps (plus runtimes and extensions) over 4 architectures. Fastly is serving the dl.flathub.org domain fully cached, with a cache hit rate of ~98.7%. Averaging 9.8 million hits and 464 GB downloaded per hour, Flathub uses between 1-2 Gbps sustained bandwidth depending on the time of day. Here are some nice graphs produced by the Fastly management UI (the numbers are per-hour over the last month):

Graph showing the requests per hour over the past month, split by hits and misses.
Graph showing the data transferred per hour over the past month.
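The hourly averages and the sustained bandwidth figure can be cross-checked with a little arithmetic. The Python sketch below reads the hourly volume as 464 gigabytes (decimal, 10^9 bytes), which is an assumption, but it is the reading that lands inside the quoted 1-2 Gbps range; read as gigabits it would come to only about 0.13 Gbps. The function name is made up for this example.

```python
def gb_per_hour_to_gbps(gigabytes_per_hour):
    """Convert an hourly volume in gigabytes to an average rate in gigabits per second."""
    gigabits_per_hour = gigabytes_per_hour * 8  # 8 bits per byte
    return gigabits_per_hour / 3600             # 3600 seconds per hour


average_gbps = gb_per_hour_to_gbps(464)
# Average object size per request, from 464 GB and 9.8 million hits per hour.
bytes_per_hit = 464e9 / 9.8e6

print(f"{average_gbps:.2f} Gbps average, about {bytes_per_hit / 1000:.0f} kB per hit")
```

The average works out to just over 1 Gbps, which sits at the bottom of the quoted range, as you would expect if the daily peaks push towards 2 Gbps.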

To buy the scale of services and support that Flathub receives from our commercial sponsors would cost tens if not hundreds of thousands of dollars a month. Flathub could not exist without Mythic Beasts, Packet and Fastly’s support of the free and open source Linux desktop. Thank you!

posted by ramcq @ 3:31 pm

May 21, 2010

VP8 rumblings

As you all know by now, exciting moves from Google on the WebM project have led to them open-sourcing On2’s VP8 codec to provide a freely available video codec for HTML5 content. Collabora Multimedia worked with Entropy Wave to add support to GStreamer for the new codec from day 1, and I was really happy yesterday to update my Debian system and get the support installed locally too. Thanks to our and Igalia’s fine work on GStreamer HTML5 support in WebKitGTK+, Gustavo Noronha found it worked out of the box with Epiphany too.

Predictably, the MPEG-LA aren’t too pleased with this, and are no doubt winding up their PR and industry allies at the moment, as well as this opening a new front on the Apple vs Google ongoing platform battle. But if your business model is collecting money through what is essentially a protection racket and spreading FUD about patent litigation, the VP8 license implicitly creating a zero-cost zero-revenue patent pool is not going to be good news for you (from the department of Google deleting your business model). The question is now whether the allure of Google’s content will win over against the legal chest pounding of the patent trolls, and whether they start flipping switches to make YouTube only serve up WebM content after a while.

Also in amazing and incredible news, Collabora’s Telepathy/GStreamer/GNOME/Debian/general R&D guru and staunch Web 2.0 holdout Sjoerd Simons has actually now got a blog after a mere 3 years of us suggesting it to him since he joined Collabora as an intern. He’s been hacking on some RTP payloader elements for VP8 so we can use it for video calling on the free desktop. All very exciting stuff, especially in conjunction with Muji (multi-user video calls over XMPP) support heading into Telepathy thanks to NLNet’s ongoing support.

posted by ramcq @ 6:13 pm

August 27, 2009

Today’s the day

If Collabora has seemed quiet recently, it’s because we’ve been slaving away on various parts of a really awesome project, which we can finally start talking about. Maemo 5 is coming! o o/ /o/

posted by ramcq @ 11:04 am

March 13, 2009

Top acts

I’ve been very impressed several times in the past few months when I’ve discovered awesome new top-like utilities. I’m probably being slow on the uptake and everyone else but me knows about these, but in case it’s not just me that’s been stuck in the ’70s:

htop
A much-needed refresh of oldschool top, this still works on your beloved console but gives you visual bar-graphs of CPU, RAM and swap, lets you scroll through the processes and deliver signals/renicing without having to copy the PID off the moving target. It’s like the future!
iftop
One of those things I use so often now I have no idea how I even survived without it. Why is this server lagging, who’s hogging the wireless/DSL, which VM is chewing all of the upstream bandwidth? iftop shows you at a glance how much traffic is being used by which host pairs on a given interface, and you can toggle port numbers on and off with simple key-presses. Absolutely indispensable.
iotop
Does this box feel slow to anyone else? Is it swapping, or is it the database server chewing all the IO? Why does my drive keep seeking? It’s amazing… top for IO bandwidth usage!

A passing mention is deserved for apachetop too, which is pretty neat, but when a server is being hammered it’s not something I found too hard to get a feel for just by tailing the log for a while, so it’s not been as life-changing as the others. Maybe that just means my servers don’t see enough traffic.

posted by ramcq @ 10:45 pm

March 6, 2009

This is a local mail for local people, we’ll have no trouble here!

“… all programs that interact with e-mail are broken in one way or another. Please be careful.” – Lars Wirzenius

I seem to have a cunning knack of finding problems with configuring server software, particularly involving e-mail, where a) I can’t find answers in Google, b) most people I go and ask for help say they’d usually ask me such things, and c) if I go onto IRC or mailing lists I end up helping other people and not getting any help with my problem. It’s quite likely this is just because I’m something of a perfectionist, so the ridiculous crappy hacks people come up with and seem content to entrust their mail to are unacceptable to me for one reason or another. Anyway, in my ongoing quest for the perfect mail system, I’ve painted myself into a corner again.

(I’m currently running with postfix, postgrey, clamav-milter, dspam, dovecot using LDA, managesieve and the cmusieve & antispam plugins. If I can get the current incarnation working, I’ve had enough requests to write up a full HOWTO, and seen enough around with pretty questionable content, that I’ll probably do it before too long.)

I’ve got postfix’s local transport configured to hand mail to dspam over LMTP, using mailbox_transport = lmtp:unix:/dspam/dspam.sock. dspam is configured to listen here, add X-DSPAM-Result and signature headers, and then deliver the mail with dovecot’s deliver LDA (which I’ve set to 4750 root:dspam). From dspam.conf:

ServerParameters "--deliver=innocent,spam -d %u"
ServerDomainSocketPath "/var/spool/postfix/dspam/dspam.sock"
...
Preference "spamAction=tag"
Preference "signatureLocation=headers"
...
TrustedDeliveryAgent "/usr/lib/dovecot/deliver"

My dovecot configuration is pretty standard, using PAM for both passdb and userdb, and provides the auth-master socket that deliver needs. The problem I have is that postfix’s local transport is qualifying the local username with the FQDN of the machine before delivering it to dspam with LMTP (the local mail transfer protocol), even for locally-originated mail which was only addressed with a bare username! dspam doesn’t mangle it or care if the user is local or not, and then cheerfully invokes deliver -d robot101@omega.example.co.uk, which returns EX_NOUSER (addressee unknown) because my username is just robot101. From mail robot101:

Mar 6 02:25:40 omega postfix/pickup[13607]: DAA4942F41F: uid=1000 from=<robot101>
Mar 6 02:25:40 omega postfix/cleanup[13637]: DAA4942F41F: message-id=<20090306022540.DAA4942F41F@omega.example.co.uk>
Mar 6 02:25:40 omega postfix/qmgr[12552]: DAA4942F41F: from=<robot101@omega.example.co.uk>, size=339, nrcpt=1 (queue active)
Mar 6 02:25:40 omega dovecot: auth(default): passwd(robot101@omega.example.co.uk): unknown user
Mar 6 02:25:40 omega dspam[13527]: Delivery agent returned exit code 67: /usr/lib/dovecot/deliver -d robot101@omega.example.co.uk
Mar 6 02:25:40 omega postfix/lmtp[13640]: DAA4942F41F: to=<robot101@omega.example.co.uk>, orig_to=<robot101>, relay=omega.example.co.uk[/dspam/dspam.sock], delay=0.08, delays=0.05/0.01/0.01/0.03, dsn=4.3.0, status=deferred (host omega.example.co.uk[/dspam/dspam.sock] said: 421 4.3.0 <robot101@omega.example.co.uk> Delivered (in reply to end of DATA command))

So, no e-mail for me. Dearest lazyweb, which of the three components is behaving wrongly, and how can I fix it?

(And no, I’m not just going to switch to GMail. I store my data on hard drives, which are sometimes in servers, not “in the cloud”. Until about a month ago, most people I knew spoke about clouds which were made of particles of water in the sky, rather than as a data storage medium. What if it rains? 😉)

Update: The problem is fixed! Even though arguably the problem is dspam’s for not knowing which users are local or not, it’s fixable in dovecot 1.1.x using the auth_username_format = %n option. Thanks so much to Angel Marin for helping me out.

Update 2: There’s also a patch for dspam floating around which adds a StripRcptDomain option, which makes the LMTP server truncate the e-mail address at the @, so essentially assumes everyone to be a local user. The problem with both of these fixes is that they’re both blunt instruments which will break virtual users on the same host. I think the real fix would be something more like a LocalDomains option in dspam, to choose which domains should be considered local and truncated from the e-mail addresses for delivery purposes.
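To make the difference between the two blunt fixes and the proposed LocalDomains behaviour concrete, here is a small Python sketch of the address handling. It is an illustration of the idea only: the function names, the LOCAL_DOMAINS set and the virtual.example.org address are hypothetical, and none of this is dspam or dovecot code.

```python
def strip_all_domains(address):
    """Blunt approach: cut the address at the first '@', whatever the domain is.

    This mirrors what auth_username_format = %n and StripRcptDomain amount to.
    """
    return address.split("@", 1)[0]


def strip_local_domains(address, local_domains):
    """Proposed approach: drop the domain only when it is known to be local."""
    local_part, separator, domain = address.partition("@")
    if separator and domain.lower() in local_domains:
        return local_part
    return address  # bare usernames and virtual users pass through untouched


# Hypothetical configuration for the machine in the log excerpt above.
LOCAL_DOMAINS = frozenset({"omega.example.co.uk"})

for recipient in ("robot101@omega.example.co.uk", "alice@virtual.example.org"):
    print(recipient, "->",
          strip_all_domains(recipient), "|",
          strip_local_domains(recipient, LOCAL_DOMAINS))
```

With the blunt version both recipients collapse to bare usernames, so a virtual user would be looked up as a system account; the domain-aware version only rewrites the address that really belongs to the local machine.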

posted by ramcq @ 2:39 am

January 23, 2009

Auctions, Beards, Conferences and Devils

Tuz, coming soon to a Linux kernel near you

It’s the last day of the most awesome linux.conf.au 2009 conference in Hobart, Tasmania. I’ve just witnessed a room full of 500 people sit with bated breath as Linus wielded a set of clippers to shave Bdale Garbee’s beard, followed by Bdale (with a razor with 3 more blades than last time he shaved, a tiny bowl of water and a hand-mirror) trying to make it look neater. The LCA twitter feed was up on the projector, and someone rightly observed this whole event was actually pretty weird. There are already pictures on flickr too. However, well done to Bdale for being such a good sport, but it looks like his wife Karen will accompany him next year to make sure he doesn’t agree to anything else like this, and supervise the waxing of Rusty’s chest… 🙂

What’s this all in aid of? After the incredible auction for this beautiful picture from Karen, and generous donations at the Penguin Dinner on Wednesday night, the conference has now raised between AU$ 35k and 40k towards the Save the Tasmanian Devil appeal. Around AU$ 1.3k of the nonsensical winning consortium’s AU$ 10.6k bid came from the Collabora folks who were at the dinner, and AU$ 1.2k from Collabora and Collabora Multimedia directly. We were all set to place a winning AU$ 3k bid but then Matthew and Daniel came up with the Bdale shaving scheme, and then things really picked up. I’m glad we took part – the lead scientist from the project was really grateful, and I hope the money can make a real difference to their great work.

Telepathy

On more mundane matters, I also gave my talk this morning, and my slides (Telepathy slides v2.0 thanks to Marco) are online. I also made a few demos of new awesome stuff you can do with Telepathy (most of the patches are already merged upstream or well on the way):

  • Geolocation (XEP-0080) support in the XMPP backend and Empathy, using GeoClue to find your location and the libchamplain Clutter & OpenStreetMap widget to display where your contacts are. Thanks to Pierre-Luc, Alban and Daf for their work here – more details on Pierre-Luc’s blog.
  • Support for launching file transfers over link-local XMPP from Nautilus using the Empathy plugin for nautilus-sendto. This is already merged upstream but needs a patch to work with trunk Empathy. Thanks to Marco, Jonny and Guillaume for their work on this.
  • Alban also made a neat hack to Rhythmbox which allows exporting your DAAP music server to one of your contacts over a Telepathy Stream Tube. Thanks also to recent work from Marco, these tubes now go over XMPP’s SOCKS5 Bytestreams, giving much better throughput than the earlier in-band implementation, network permitting. The next step is unleashing the full might of our libnice NAT traversal library, signalling tubes with Jingle, and therefore making connections work peer to peer in up to 95% of the cases. However, this won’t affect the APIs, stuff will just go faster! Isn’t Telepathy wonderful?
  • Olivier stepped up to show off the demo from his talk about Farsight, which shows his branch using the new telepathy-farsight library to allow recording Telepathy video calls directly into the PiTiVi video editor. His network was screwed up so it didn’t work, but I did see it work in his talk yesterday! Awesome stuff, hopefully Edward and friends can pick it up and merge it in before too long.
  • Unfortunately we ran out of time for Will to show off Guillaume’s recent work on Telepathy-enabled Abiword on the desktop (rather than just Sugar’s Write activity), but I expect he’ll blog about it soon!

On that note, these were just the five that I picked to try and fit into my talk. There are a load more demos in the pipeline from the other guys in Collabora of doing stuff with Telepathy, so keep a close eye on Planet Collabora for the next cool thing.

posted by ramcq @ 4:20 am

January 21, 2009

My new font rendering technique is unstoppable

You know it’s time to call it a day and write your talk tomorrow when…

I just upgraded Gtk+, Cairo and Pango to the versions in Debian experimental while I was upgrading some Telepathy packages, and got this the next time I loaded OO.o. Magic. But seriously, anyone got any ideas what’s going on?

Update: I switched my Debian mirror to .au and downloaded OpenOffice.org 3.0.1~rc2, and installed the Gtk+ and GNOME stuff too, and not only did the fonts come back, but it no longer looks like the 80s. Score! Thanks for the tips. Back to my talk…

posted by ramcq @ 7:29 am

April 18, 2008

Lazyweb request: gadgets I would like to have

Last night I thought of a few gadgets which I’d like to have, and although I’m pretty sure you should be able to get hold of them, I had trouble finding anything that looked quite right:

  • Alarm clock which makes coffee: I can’t be the only one who finds it hard to bootstrap my days because I have to get out of bed and make the first coffee of the day before I’ve had any coffee. My parents had a machine which was an alarm clock which made tea (very noisily) at the appropriate time in the morning. Surely there should be a similar device which can display the time and make (at least passable) coffee instead? Be it an alarm clock with a sideline in making coffee, or a coffee machine with a built in timer. I’ve noticed some of Gaggia’s bean-to-cup machines claim to have 24-hour clocks, but does that mean they have a timer function? We just don’t know.
  • Decent watch with USB storage: I found some watches online last night which had USB storage built in, some with a little USB connector that folded out, some with a mini/micro USB connector on the side, with the idea I could store (maybe parts of) my GPG and SSH keys on it, and maybe a bootable Debian installer/rescue system. The thing is, I have a reasonably nice Timex Expedition watch at the moment which I quite like: it has an electro-luminescent analog display for the middle of the night, and a digital bit for the date, alarms and multiple time-zones. The USB-enabled watches I saw didn’t look that great as watches, but I might be wrong. Does anyone have a watch that features USB storage that doesn’t compromise too much on the watch functionality? Maybe I should just give up on this one and go for the rugged USB stick on the keyring approach.
  • Video output over USB: I have a reasonably new HP 2510P laptop which I also use as my main machine at work with a docking station, TFT, keyboard, mouse, etc. However, as a machine for watching DVDs or other videos on at home, it’s a bit on the small side. I have an olde-worlde big flatscreen TV at home (which is not as good as Christian’s flat-screen Blu-ray surround sound movie set-up, but I think I retain the moral high ground on taste in films), but my laptop doesn’t have any video out. Is there a USB 2.0 widget which produces composite or S-Video output which I can feed to my TV, that will work with Linux, or should I just get a scan converter of some sort so I can use the VGA output?

So, answers on a postcard…

posted by ramcq @ 11:42 am

June 25, 2006

GUADEC and Telepathy

I made it to Vilanova on Friday for GUADEC, and managed to get settled in to our chalet (I’m glad we opted for one with air conditioning!). After we got them to fix the hot water, I now think it’s pretty decent accommodation for the price, complete with wifi, swimming pool and a well-stocked shop. The only downside I can see is the distance from town. On Friday night we missed the last bus and walked in, which took over an hour and I developed a bad headache by the time we reached the town (we didn’t find the right beach, but stopped in a bar instead). My enjoyment of the walk wasn’t helped by the small children who were out on the street launching fireworks, mortars and other incendiary devices at or near us most of the way. 🙂

Yesterday we hired bicycles to get to the town center which was certainly more fun, but there’s quite a hill on the way back. Also, bus in and taxi back is pretty much cheaper than the cost of hiring bikes here anyway, so I’m not sure I can recommend it as a long-term strategy. We might do it again for the novelty, and it has the benefit of not needing to wait around for a taxi to get back.

Even before I made it to the conference venue yesterday, I’ve already met loads of cool people who hack on all sorts of cool software which I use every day, and I’ve recognised lots more people who I’ve not managed to speak to yet. I’ve also realised that we need to do a lot more work to raise the profile of the Telepathy project which I’ve been working on for almost a year now (eek!). It’s a really cool way to get IM and VoIP stuff properly integrated into the GNOME desktop, and everyone should go and check out the website, play with our releases, chat with me and come to my talk on Tuesday. Oh, and if anyone wants a Telepathy or Collabora t-shirt, grab one off me or daf. 🙂

posted by ramcq @ 10:10 am

June 1, 2006

if (n00b); warning

I wasted a non-trivial amount of time yesterday debugging code in which I’d accidentally written:

if (...);
  {
    ...
  }

Is there any situation where if (foo); can achieve something which just foo; couldn’t? Could the compiler not warn about a conditional that contained no code?
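For what it’s worth, GCC does have a warning for this, -Wempty-body, which is also switched on by -Wextra. The check itself is simple enough to sketch; the Python below is a deliberately naive illustration (the function name is made up, and it ignores comments, string literals and macros) that finds each if, walks to the matching closing parenthesis, and flags the statement when the next non-blank character is a semicolon.

```python
import re


def find_empty_if_bodies(source):
    """Return 1-based line numbers of `if (...)` conditions followed directly by `;`."""
    hits = []
    for match in re.finditer(r"\bif\s*\(", source):
        depth = 1
        pos = match.end()
        # Walk forward to the parenthesis that closes the condition.
        while pos < len(source) and depth > 0:
            if source[pos] == "(":
                depth += 1
            elif source[pos] == ")":
                depth -= 1
            pos += 1
        if depth != 0:
            continue  # unbalanced parentheses, give up on this one
        # Skip whitespace after the condition and look at what follows.
        while pos < len(source) and source[pos].isspace():
            pos += 1
        if pos < len(source) and source[pos] == ";":
            hits.append(source.count("\n", 0, match.start()) + 1)
    return hits


sample = "if (x > 0);\n  {\n    y = 1;\n  }\nif (f(a, (b)))\n  z = 2;\n"
print(find_empty_if_bodies(sample))
```

On the sample, only the first conditional is reported; the second one has a real statement as its body and is left alone.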

Aggravating lapses in competence aside, I’ve realised I’ve not blogged for months, so over the next few days I’m going to try and write a little about what I’ve been working on recently.

posted by ramcq @ 12:05 pm
