Emulab design document and manual
Emulab is a software system designed to act as a network testbed. It is built on a number of generic open-source programs as well as pieces specific to the project. The system can be understood as a set of several broad components. An Emulab installation consists of at least two essential servers and a number of testbed nodes. The servers are boss, which provides the web interface to the system and holds the essential system infrastructure, and ops, which gives users programmatic/shell access to testbed systems and manages other parts of the user experience. This document aims first to provide an overview of the relevant concepts and structure of each component, and second to provide detailed information on the function and design decisions of subcomponents.
Interface
- Apache
- PHP scripts
- ns files
- swapping-in and swapping-out
Central Management
- MySQL database schema
- pubsub
Node Management
Nodes in the standard managed-rack environment have three notable characteristics:
- They usually have two network interfaces: a control plane, used for out-of-band (with respect to an experiment) communication, and a data plane, which to nodes looks like the network topology specified in the experimental design (via the ns file). Some testbeds use a third network interface to give nodes a direct connection to the public internet. When this is not done, users typically reach their nodes by connecting through ops.
- They typically mount, via NFS, their user directories from ops. This is done over the control plane.
- They run whatever operating system the user requests in their ns file. Emulab will automatically image new operating systems onto the nodes as required by a new experiment swap-in.
There are other types of nodes/testbeds where some or all of these characteristics do not hold.
- Maintenance images - A maintenance image is a miniature operating system, usually residing on a bootable CD or USB key, or retrieved through a PXE/TFTP bootstrap, and set up by some means to boot whenever nodes are power cycled. This operating system is responsible for communicating with boss and loading whatever operating system the user has requested for the next experiment onto the hard drive of the system. Maintenance images are presently heavily customised FreeBSD images.
- OS images - An OS image is a standard operating system image used for experiments. They are much less customised than the maintenance images.
- tbheader - The tbheader is a portion of the boot block on the hard drive containing configuration information for the node. At the very least, it contains a tbheader identifier and information about what the system should boot next. Maintenance images have a modified FreeBSD bootloader that reads the tbheader, first restores the default of booting the maintenance image next time, and then follows the instructions last left there. The tbheader can contain arbitrary information beyond that, such as network information for the node. It is up to OS images to make sense of that configuration and obey it.
- tbbootconfig - tbbootconfig is a commandline utility used to query and write the tbheader. It is primarily used in maintenance images.
- bootinfo - bootinfo is a client that, early in the boot process of a maintenance image, is used to ask boss what, on a broad level, it should do (e.g. boot from the hard drive or prepare to image the hard drive). Bootinfo is primarily used in standard managed-rack environments - in some other environments (most notably remote nodes), tmcc is used for this kind of information too.
- tmcc/tmcd - tmcc is a simple client that is responsible for allowing nodes to query boss for their basic configuration. tmcd is a dæmon running on boss that is responsible for answering the queries (answers to a query are typically determined through SQL queries). tmcc is capable of layering its communication over SSL if appropriate client keys are installed. Some commands do not function unless they are layered over SSL.
- update dæmon - run on startup as well as from cron, this is responsible for applying much of the configuration data to the system. For example, it uses tmcc to query boss to ask what accounts should be installed on the system, and after getting a list, it creates any accounts not already present on the system.
- wget - wget is a standard open-source tool for retrieving things from a web server. It is heavily used by Emulab nodes as a generic way to get information from boss.
- imagezip - imagezip is a utility that reads or writes a compressed partition (or whole-disk) image from/to the system. It attempts to understand some specific partition types/filesystems in order to store information more compactly, although it can also image filesystems it does not understand. It works with images in its own format, which by Emulab convention uses the .ndz extension. It is not capable of deep filesystem manipulation and does not understand partition tables, so images must be restored to partitions of the same size as the ones they were taken from. (Most OS images can be restored without issues onto larger partitions, but the OS will not actually be able to make use of the rest of the partition.) Because of this limitation, other scripts in the OS imaging process are designed to install a standard initial tbheader along with a standard MBR (including a standard partition table) onto nodes when they are first configured, with any extra disk space on nodes going unused. Until recently, there were only two standard tbheaders/MBRs/partition tables.
- frisbee -
- openvpn - OpenVPN is used by some nodes (mostly remote nodes) to create ad-hoc VPNs.
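The tbheader behaviour described in the list above (read what to boot next, restore the maintenance default, then follow the instruction) can be illustrated with a minimal sketch. The real on-disk layout is not described in this document, so every name here (TbHeader, the "TBHDR" identifier, the next_boot field, choose_boot_target) is a hypothetical stand-in, not the actual bootloader code.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout: the real tbheader layout is not given here.
MAINTENANCE = "maintenance"  # default target: the maintenance image


@dataclass
class TbHeader:
    magic: str = "TBHDR"          # stand-in for the tbheader identifier
    next_boot: str = MAINTENANCE  # what the node should boot next
    extra: dict = field(default_factory=dict)  # arbitrary data, e.g. network info


def choose_boot_target(header: TbHeader) -> str:
    """Return the target to boot now and reset the header to the default."""
    if header.magic != "TBHDR":
        # Assumption: with no recognisable header, boot the maintenance image.
        return MAINTENANCE
    target = header.next_boot
    # Restore the default before following the stored instruction.
    header.next_boot = MAINTENANCE
    return target
```

In this sketch a header asking for a one-off boot from disk is honoured once, and the next call returns the maintenance image again unless something (for example tbbootconfig) has rewritten the header in between.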
Adding new nodes and changing a node's configuration
You will generally need to carry out all of the relevant steps below.
Wireless testbed
- Configure them to boot via PXE
- Enable wake-on-lan in the CMOS utility. If necessary, configure them to be bootable without the battery present, and configure them to automatically boot after a power cycle
- Replace the built-in miniPCI wireless adapter with an Atheros card, disconnect that from the main antenna and attach it to the digitiser
- Attach them to the serial controller and the KVM
- Remove the battery
- Place the whole system inside the rf shield in the rack
Wired (standard emulab) testbeds
- Configure the boot mechanism. Depending on the setup, this might be a CD-ROM, a USB stick, or something else
- Enable wake-on-lan in the CMOS utility as well as automatic booting after power cycles
- Attach to serial controller and KVM
Boss/Ops configuration
The following tables need entries for the node:
- nodes (new entries for this can be fairly involved)
- interfaces
- wires
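As a rough aid for the checklist above, the following sketch reports which of the three tables still lack a row for a node. It is only an illustration: it uses an in-memory SQLite database as a stand-in for the real database, and the single node_id column is a hypothetical simplification, not the actual schema.

```python
import sqlite3

REQUIRED_TABLES = ("nodes", "interfaces", "wires")


def missing_tables(conn, node_id):
    """Return the required tables that have no row for node_id."""
    missing = []
    for table in REQUIRED_TABLES:
        # Table names come from the fixed tuple above, never from user input.
        row = conn.execute(
            f"SELECT 1 FROM {table} WHERE node_id = ? LIMIT 1", (node_id,)
        ).fetchone()
        if row is None:
            missing.append(table)
    return missing


# Demo with an in-memory database; the one-column schema is an assumption.
conn = sqlite3.connect(":memory:")
for table in REQUIRED_TABLES:
    conn.execute(f"CREATE TABLE {table} (node_id TEXT)")
conn.execute("INSERT INTO nodes VALUES ('pc1')")
conn.execute("INSERT INTO interfaces VALUES ('pc1')")
print(missing_tables(conn, "pc1"))  # prints ['wires']
```

Against a real installation the query would need to be adapted to the actual column names in each table.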
You may additionally need to configure the following subsystems for the node:
- tip (serial logging)
- dhcp (for nodes on managed networks) - Note that if you're using this, it knows the mac address of your nodes and needs to be kept current.
Other configuration
On managed networks you may need to change the Cisco VLAN configuration to place the control plane connection you used into the correct VLAN. You may also need to configure any serial controllers and KVMs appropriately.
OSImages and Emulab Integration
- How to make a new OSImage given physical hardware - How to turn an existing physical system into an OSImage that Emulab will be able to load onto other systems.
- How to make a new OSImage out of an existing one - How to turn an existing OSImage that you have updated into a new one with your changes.
- Emulab node integration - How, given a node without the Emulab client software installed, you can install it so that the node acts as a proper Emulab client.
- Replacing a node without changing its name - How best to replace a retiring node with a newer system that adopts the old node's name
Software components common to nodes and boss/ops
Users
Emulab users are created through the web interface, which manages account creation on boss/ops/fs, password management, ssl/ssh keys, and synchronisation of necessary details to testbed nodes.
Accounts
- Boss/ops/fs share accounts, but most users do not have privileges to log onto boss or fs directly (implemented by not setting a valid shell for them on those systems). These three management systems share the /users partition via NFS (ops, or fs if it exists independently, exports that filesystem to the others). During an experiment's swapin, boss uses the requesting user's home directory for some validation purposes, particularly the SSL keys.
SSH/SSL keys
- SSH and "SSL" keys are generated by the system's OpenSSL component.
- SSH keys are managed by Emulab. Testbed nodes that don't share the /users filesystem with ops/fs (primarily remote nodes and non-Unix nodes) have the SSH keys for all users in the experiment's group synchronised during experiment swapin, so as to allow passwordless logins from ops.
- Emulab has SSL keys used for internal authentication purposes - these are located in a user's .ssl/ directory and used during swapin. They are normally generated during account creation (the "mkusercert" command can regenerate them if they become misplaced).
Synchronisation
- During experiment swapin, most nodes retrieve the account information to be placed onto the node during their customary reboot after imaging the requested OS. This happens over "tmcc", which is invoked by the "update" dæmon.
- Nodes that do not reboot (e.g. infrastructure nodes and ALWAYSUP nodes) have update periodically ask for a list of accounts/groups to add or remove.
- These account changes are managed by a script called "rc.accounts"
- Testbed node root passwords are not handled through this mechanism - they are sync'd by another script called by "update".
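The add/remove step described above can be sketched as a simple set comparison. This is not the logic of rc.accounts itself, only an illustration; the function name plan_account_changes and the sample account names are hypothetical, and the sketch assumes the node tracks which accounts it previously created through this mechanism, so that unrelated system accounts are never candidates for removal.

```python
def plan_account_changes(wanted, managed_present):
    """Return (to_create, to_remove) as sorted lists of account names.

    wanted:          accounts that boss says should exist on the node
    managed_present: accounts already created on the node by this mechanism
    """
    wanted = set(wanted)
    managed_present = set(managed_present)
    to_create = sorted(wanted - managed_present)
    to_remove = sorted(managed_present - wanted)
    return to_create, to_remove


# Hypothetical data: what boss reports versus what the node already has.
from_boss = ["alice", "bob", "carol"]
on_node = ["bob", "dave"]
print(plan_account_changes(from_boss, on_node))
# prints (['alice', 'carol'], ['dave'])
```

Running the same comparison on every periodic check keeps the operation idempotent: once the node matches the list from boss, both result lists are empty.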
Configuration
Some components of emulab are configured at compile-time, some are independently configured by config files in the install directory (that are not overwritten by the upgrade process), and some components are configured in the database at runtime and are relatively independent of compiles and upgrades.
Compile time
Emulab is configured and built using settings specified in the tbdefs file. This is a site-specific file containing a list of changes (VAR=VALUE) to the compile-time defaults. These settings are mostly undocumented. The configure script uses them to templatise the sources in the tree with IP addresses, people/groups to email, how federation is set up, etc.
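To make the idea of VAR=VALUE overrides and templatised sources concrete, here is a small sketch. It is not the actual configure script: the @VAR@ placeholder syntax, the '#' comment handling, and the variable names BOSS_IP and ADMIN_EMAIL are all assumptions chosen for illustration.

```python
import re


def parse_defs(text):
    """Parse VAR=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            values[name.strip()] = value.strip()
    return values


def templatise(source, defaults, overrides):
    """Substitute @VAR@ placeholders, with site overrides beating defaults."""
    settings = {**defaults, **overrides}
    # Unknown placeholders are left untouched rather than guessed.
    return re.sub(
        r"@(\w+)@", lambda m: settings.get(m.group(1), m.group(0)), source
    )


# Hypothetical variable names and values, for illustration only.
defaults = {"BOSS_IP": "192.0.2.1", "ADMIN_EMAIL": "root@example.org"}
overrides = parse_defs("# site settings\nBOSS_IP=192.0.2.10\n")
print(templatise("boss=@BOSS_IP@ mail=@ADMIN_EMAIL@", defaults, overrides))
# prints boss=192.0.2.10 mail=root@example.org
```

Keeping the defaults and the site file separate, as here, means a site only has to list the values it changes.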
Database
Other things to check out/integrate
- Orbit testbed doc - System-level, and more than anything Utah has, but be careful not to document any extensions they've made that aren't in the main testbed

