Nerve is a utility for tracking the status of machines and services. It runs locally on the boxes which make up a distributed system, and reports state information to a distributed key-value store. At Airbnb, we use Zookeeper as our key-value store. The combination of Nerve and Synapse makes service discovery in the cloud easy!
We already use Synapse to discover remote services. However, those services needed boilerplate code to register themselves in Zookeeper. Nerve simplifies underlying services, enables code reuse, and allows us to create a more composable system. It does so by factoring out the boilerplate into its own application, which independently handles monitoring and reporting.
Beyond those benefits, Nerve also acts as a general watchdog on systems. The information it reports can be used to take action from a centralized automation center: actions such as scaling distributed systems up or down, or alerting ops or engineering about downtime.
Add this line to your application's Gemfile:
gem 'nerve'
And then execute:
$ bundle
Or install it yourself as:
$ gem install nerve
Nerve depends on a single configuration file, in JSON format.
It is usually called nerve.conf.json.
An example config file is available in example/nerve.conf.json.
The config file is composed of the following main sections:
- instance_id: the name nerve will submit when registering services; makes debugging easier
- services: the hash (from service name to config) of the services nerve will be monitoring
- service_conf_dir: path to a directory in which each JSON file will be interpreted as a service, named after the basename of the file minus the .json extension

Services Config
Each service that nerve will be monitoring is specified in the services hash.
The key is the name of the service, and the value is a configuration hash telling nerve how to monitor the service.
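To make the shape of the services hash concrete, here is a minimal sketch that builds a config in Ruby and prints it as JSON. The option names are the ones this README documents; everything else (the service name, hosts, port, paths, and the `build_nerve_config` helper) is a hypothetical placeholder. The `tcp` check type is an assumption as well, so consult the lib/nerve/service_watcher directory for the types that are actually available.

```ruby
require 'json'

# Hypothetical helper: wraps a services hash in the top-level structure
# (instance_id plus services) described in this README.
def build_nerve_config(instance_id, services)
  { 'instance_id' => instance_id, 'services' => services }
end

services = {
  # Key: the service name. Value: how nerve should monitor and report it.
  'my_service' => {
    'host' => '10.0.0.5',                        # placeholder address
    'port' => 3000,                              # placeholder port
    'reporter_type' => 'zookeeper',
    'zk_hosts' => ['zk1.example.com:2181'],      # placeholder ensemble
    'zk_path' => '/nerve/services/my_service',   # placeholder znode
    'checks' => [
      # 'tcp' is assumed to be an available check type.
      { 'type' => 'tcp', 'rise' => 2, 'fall' => 3 }
    ]
  }
}

config = build_nerve_config('my-host-01', services)
puts JSON.pretty_generate(config)
```

The printed JSON could be saved as nerve.conf.json. The interval and timeout options are left out here so that the documented defaults apply.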
The configuration contains the following options:
- host: the default host on which to make service checks; you should make this your public IP to ensure your service is publicly accessible
- port: the default port for service checks; nerve will report the host:port combo via your chosen reporter
- reporter_type: the mechanism used to report up/down information; depending on the reporter you choose, additional parameters may be required. Defaults to zookeeper
- check_interval: the frequency with which service checks will be initiated; defaults to 500ms
- checks: a list of checks that nerve will perform; if all of them pass, the service will be registered; otherwise, it will be un-registered

Zookeeper Reporter
If you set your reporter_type to "zookeeper" you should also set these parameters:
- zk_hosts: a list of the zookeeper hosts comprising the ensemble that nerve will submit registration to
- zk_path: the path (or znode) where the registration will be created; nerve will create the ephemeral node that is the registration as a child of this path

Etcd Reporter
Note: Etcd support is currently experimental!
If you set your reporter_type to "etcd" you should also set these parameters:
- etcd_host: etcd host that nerve will submit registration to
- etcd_port: port to connect to etcd
- etcd_path: the path where the registration will be created; nerve will create a node with a 30s TTL that is the registration as a child of this path, and then update it every few seconds

The core of nerve is a set of service checks. Each service can define a number of checks, and all of them must pass for the service to be registered. Although the exact parameters passed to each check are different, all take a number of common arguments:
- type: (required) the kind of check; you can see available check types in the lib/nerve/service_watcher dir of this repo
- name: (optional) a descriptive, human-readable name for the check; it will be auto-generated based on the other parameters if not specified
- host: (optional) the host on which the check will be performed; defaults to the host of the service to which the check belongs
- port: (optional) the port on which the check will be performed; like host, it defaults to the port of the service
- timeout: (optional) maximum time the check can take; defaults to 100ms
- rise: (optional) how many consecutive checks must pass before the check is considered passing; defaults to 1
- fall: (optional) how many consecutive checks must fail before the check is considered failing; defaults to 1

Custom External Checks
If you would like to run a custom check but don't feel like trying to get it merged into this project, there is a mechanism for including external checks thanks to @bakins (airbnb/nerve#36).
Build your custom check as a separate gem and make sure to bundle install it on your system.
Ideally, you should name your gem "nerve-watcher-#{type}", as that is what nerve will require on boot.
However, if you have a custom name for your gem, you can specify that in the module argument to the check.
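The naming rule above can be sketched in a few lines of Ruby. This is only an illustration of the documented convention, not nerve's actual loading code: `watcher_gem_name` is a hypothetical helper, and the `redis_ping` type and `acme-nerve-checks` gem name are made-up placeholders.

```ruby
# Hypothetical helper mirroring the convention described above: use the
# check's module argument when present, otherwise fall back to
# "nerve-watcher-#{type}". This is not part of nerve's API.
def watcher_gem_name(check)
  type = check.fetch('type') # type is required for every check
  check['module'] || "nerve-watcher-#{type}"
end

# A check that relies on the default gem name.
default_check = { 'type' => 'redis_ping' }

# A check whose gem has a custom name, supplied via the module argument.
custom_check = { 'type' => 'redis_ping', 'module' => 'acme-nerve-checks' }

puts watcher_gem_name(default_check) # nerve-watcher-redis_ping
puts watcher_gem_name(custom_check)  # acme-nerve-checks
```

In a config file, the second form would correspond to a check entry that carries both a type and a module key.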