Decentralized server orchestration


Rick van de Loo
https://github.com/vdloo

DevOps @ Byte

Highly available websites on mobile phones

Moving devices are not that different from flaky containers/VMs

A phone losing wifi is conceptually the same as a freezing VM

In the cloud everything breaks and great software exists to deal with that

  • service discovery
  • load balancing
  • automated failovers

We can't just use that cloud stuff to cluster mobile phones

.. or can we?

What are the problems?

  • mobile phones move fast (Wi-Fi / 3G / offline)
  • not in one zone / routes change
  • often behind NAT (no inbound traffic)
  • Android is not GNU/Linux

I need a decentralized virtual network

  • Can't do VPN -> single point of failure
  • Can't do SSH jumphosts -> paths to nodes change
  • Solution: overlay mesh networking

CJDNS

  • DHT mesh routing
  • End-to-end encryption
  • IPv6 address derived from public key
  • TUN virtual network interface
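
A minimal sketch of how an address can be derived from a public key, assuming the scheme cjdns is generally described as using: the first 16 bytes of a double SHA-512 over the key, kept only if the result falls in fc00::/8. The function names are made up for illustration, and the candidate keys are plain counters standing in for real Curve25519 keys.

```python
import hashlib
import ipaddress


def address_from_public_key(public_key: bytes) -> ipaddress.IPv6Address:
    """First 16 bytes of SHA-512(SHA-512(key)), read as an IPv6 address."""
    digest = hashlib.sha512(hashlib.sha512(public_key).digest()).digest()
    return ipaddress.IPv6Address(digest[:16])


def first_key_in_fc_range(max_tries: int = 100_000) -> bytes:
    """Try stand-in 32-byte keys until the derived address is in fc00::/8."""
    fc_range = ipaddress.ip_network("fc00::/8")
    for counter in range(max_tries):
        # Stand-in key material, NOT a real Curve25519 public key.
        candidate = counter.to_bytes(32, "big")
        if address_from_public_key(candidate) in fc_range:
            return candidate
    raise RuntimeError("no matching key found")


key = first_key_in_fc_range()
print(address_from_public_key(key))
```

Because the address is a function of the key, a node keeps the same address no matter which physical network it is currently on.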

Finds a path transparently

The TUN network is flat, hops are abstracted out

Connect directly to any peer
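
To illustrate what "hops are abstracted out" means, here is a toy sketch of path finding over a peer graph. It uses a plain breadth-first search, which is not how CJDNS routes internally; the topology and node names are hypothetical.

```python
from collections import deque


def find_path(peers: dict, source: str, target: str):
    """Breadth-first search over a peer graph; returns the hops or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in peers.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical topology: the phone only peers with the laptop.
peers = {
    "phone": ["laptop"],
    "laptop": ["phone", "vps"],
    "vps": ["laptop", "server"],
    "server": ["vps"],
}
print(find_path(peers, "phone", "server"))
```

An application on the phone only ever sees the server's address on the TUN interface; the intermediate hops stay hidden in the overlay.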

Running GNU/Linux on Android

  • Root the phone
  • Install the Linux Deploy Android app (APK on GitHub)
  • Create a chroot (Arch Linux ARM)
  • Now you've got a shell with a full Linux userland!

Now we can just treat it like any server

Only have one real phone though

Let's find a way to virtualize more

Android-x86

AOSP port to x86

vdloo/android-x86-64-vagrant

Packer script to bootstrap an Arch Linux chroot in Android 6.0 Marshmallow and package a Vagrant box.

Dealing with outages

  • Consul by Hashicorp
  • Distributed DNS interface
  • Distributed service discovery
  • Distributed failure detection
  • Distributed key value store

Raft consensus algorithm

Also notice my perfect gif cropping skills

Leader election

Source: https://raft.github.io/

Quorum requires a majority of the n server nodes, floor(n/2)+1, to be online
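
The quorum arithmetic can be sketched in a few lines. The helper names are made up for illustration; the formula is the usual majority rule with integer division.

```python
def quorum(n: int) -> int:
    """Smallest majority of n server nodes."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many server nodes can drop out before quorum is lost."""
    return n - quorum(n)


for n in (3, 4, 5, 25):
    print(n, quorum(n), tolerated_failures(n))
```

Note that an even cluster size buys nothing: 4 nodes tolerate one failure, the same as 3.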

I'm misusing it by making all nodes 'server' nodes

Taping it all together with Python

https://github.com/vdloo/raptiformica

Framework for decentralized server provisioning

Uses Consul KV to store data about peers

vdloo/consul-kv

Python 3 client for the consul key/value store.

from consul_kv import Connection
# Point the client at the KV endpoint of the local Consul agent
conn = Connection(endpoint='http://localhost:8500/v1/kv')
# Store the dict as key/value pairs in Consul
conn.put_dict({'a_dict_key': 'some_value'})
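
As a rough sketch of what sits underneath a client like this: a nested dict can be flattened into slash-separated keys, and Consul's KV HTTP API returns values base64-encoded in a JSON list. The two helpers below are hypothetical and are not taken from consul-kv; how that library maps dicts to keys internally is an assumption here.

```python
import base64
import json


def flatten(mapping: dict, prefix: str = "") -> dict:
    """Turn a nested dict into flat 'a/b/c' -> value pairs."""
    flat = {}
    for key, value in mapping.items():
        path = f"{prefix}/{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def decode_kv_response(body: str) -> dict:
    """Decode the JSON list returned by a GET on /v1/kv/<key>?recurse."""
    return {
        entry["Key"]: base64.b64decode(entry["Value"]).decode()
        for entry in json.loads(body)
        if entry["Value"] is not None
    }


# Hypothetical peer record, flattened into KV paths.
print(flatten({"peers": {"node1": {"host": "fc00::1", "port": 22}}}))
```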

Now we have all we need to host distributed applications on mobile phones

Distributed LEMP stack on Android

Runs PHP just fine

Webshop in your pocket, no problem

Building a webapp specifically for this platform

  • Completely decentralized
  • Reverse proxy through meshnet
  • Round robin DNS with Consul
  • Use distributed KV as a database
  • Autoscaling -> more nodes online == more workers

Forked fc00.org -> public CJDNS network map (Flask)

Modified to map my private network

Rewrote backend to use consul-kv

Bind gunicorn to the IPv6 interface

All nodes use the Consul DNS server

Register the service so it is load balanced
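
A sketch of what such a registration could look like: the JSON body for Consul's agent endpoint /v1/agent/service/register, plus the DNS name the service then answers to. The service name, address, port and check interval are hypothetical values, not the ones used in the talk.

```python
import json


def service_definition(name: str, address: str, port: int) -> dict:
    """Build a service registration body with a simple HTTP health check."""
    return {
        "Name": name,
        "Address": address,
        "Port": port,
        "Check": {
            # IPv6 literals need brackets inside a URL.
            "HTTP": f"http://[{address}]:{port}/",
            "Interval": "10s",
        },
    }


def service_dns_name(name: str) -> str:
    """Name under which Consul DNS serves the healthy instances."""
    return f"{name}.service.consul"


# Hypothetical service on a node's meshnet address.
body = json.dumps(service_definition("meshmap", "fc00::1", 8000))
print(body)
print(service_dns_name("meshmap"))
```

Every node that registers the same service name shows up behind the same DNS name, and instances whose check fails drop out of the answers.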

Picking a reverse proxy

  • Using Consul with Dnsmasq to load balance
  • Nginx resolves upstream hostnames at startup by default, not per request
  • HAProxy resolves server hostnames at startup by default, not per request
  • H2O -> super lightweight and CAN resolve DNS per request!

See: https://github.com/h2o/h2o

Reverse proxy to a round robin load balanced Consul service
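
Why per-request resolution matters, as a toy sketch: a proxy that resolves once at startup keeps sending traffic to a node that has since dropped off, while one that resolves on every request follows the DNS answers. Both classes are hypothetical models, not how H2O or any other proxy is implemented.

```python
import socket


def system_resolve(name: str) -> list:
    """Resolve a name through the system resolver (e.g. Consul via Dnsmasq)."""
    return [info[4][0] for info in socket.getaddrinfo(name, None)]


class StartupResolvedProxy:
    """Resolves the backend name once, when the proxy starts."""

    def __init__(self, resolve, name: str):
        self.backends = resolve(name)

    def pick(self) -> str:
        return self.backends[0]


class PerRequestProxy:
    """Resolves the backend name again for every request."""

    def __init__(self, resolve, name: str):
        self.resolve = resolve
        self.name = name

    def pick(self) -> str:
        return self.resolve(self.name)[0]
```

Passing the resolver in as a function keeps the sketch testable without a running Consul; in a real setup `system_resolve` could be handed in instead.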

Path of a request

Result

  • Self registering and redundant
  • Distributed and decentralized
  • As long as there is a path any node can join
  • Goes through firewalls / cross zones
  • Nodes can switch from Wi-Fi to mobile data transparently

25 node network

Future

  • Move away from Consul to something with commutative properties
  • Cluster more unconventional hardware
  • Define triggers for cluster changes (less than x nodes -> do something)
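
One possible shape for such triggers, as a hedged sketch only: a list of (minimum node count, action) pairs that is evaluated whenever the cluster size changes. Nothing like this exists in the project yet as far as these slides say; the function and the actions are invented for illustration.

```python
def run_triggers(node_count: int, triggers: list) -> list:
    """Run every action whose minimum node count is no longer met."""
    results = []
    for minimum, action in triggers:
        if node_count < minimum:
            results.append(action(node_count))
    return results


# Hypothetical triggers: warn below 5 nodes, page someone below 2.
triggers = [
    (5, lambda n: f"warn: only {n} nodes online"),
    (2, lambda n: f"page: cluster nearly gone ({n} nodes)"),
]
print(run_triggers(3, triggers))
```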

That's it. Questions?


You can find these slides and the Jupyter notebook here: https://github.com/vdloo/slides