Agave Validator Operations Best Practices
After you have successfully setup and started a validator on testnet (or another cluster of your choice), you will want to become familiar with how to operate your validator on a day-to-day basis. During daily operations, you will be monitoring your server, updating software regularly (both the Solana validator software and operating system packages), and managing your vote account and identity account.
All of these skills are critical to practice. Maximizing your validator uptime is an important part of being a good operator.
Educational Workshops
The Solana validator community holds regular educational workshops. You can watch past workshops through the Solana validator educational workshops playlist.
Help with the validator command line
From within the Solana CLI, you can execute the agave-validator
command with
the --help
flag to get a better understanding of the flags and sub commands
available.
agave-validator --help
Restarting your validator
There are many operational reasons you may want to restart your validator. As a best practice, you should avoid a restart during a leader slot. A leader slot is the time when your validator is expected to produce blocks. For the health of the cluster and also for your validator's ability to earn transaction fee rewards, you do not want your validator to be offline during an opportunity to produce blocks.
To see the full leader schedule for an epoch, use the following command:
solana leader-schedule
Based on the current slot and the leader schedule, you can calculate open time windows where your validator is not expected to produce blocks.
Assuming you are ready to restart, you may use the agave-validator exit
command. The command exits your validator process when an appropriate idle time
window is reached. Assuming that you have systemd implemented for your validator
process, the validator should restart automatically after the exit. See the
below help command for details:
agave-validator exit --help
Upgrading
There are many ways to upgrade the Solana CLI software. As an operator, you will need to upgrade often, so it is important to get comfortable with this process.
Note validator nodes do not need to be offline while the newest version is being downloaded or built from source. All methods below can be done before the validator process is restarted.
Building From Source
It is a best practice to always build your Agave binaries from source. If you
build from source, you are certain that the code you are building has not been
tampered with before the binary was created. You may also be able to optimize
your agave-validator
binary to your specific hardware.
If you build from source on the validator machine (or a machine with the same
CPU), you can target your specific architecture using the -march
flag. Refer
to the following doc for
instructions on building from source.
agave-install
If you are not comfortable building from source, or you need to quickly install
a new version to test something out, you could instead try using the
agave-install
command.
Assuming you want to install Agave version 1.14.17
, you would execute the
following:
agave-install init 1.14.17
This command downloads the executable for 1.14.17
and installs it into a
.local
directory. You can also look at agave-install --help
for more
options.
Note this command only works if you already have the solana cli installed. If you do not have the cli installed, refer to install solana cli tools
Restart
For all install methods, the validator process will need to be restarted before
the newly installed version is in use. Use agave-validator exit
to restart
your validator process.
Verifying version
The best way to verify that your validator process has changed to the desired version is to grep the logs after a restart. The following grep command should show you the version that your validator restarted with:
grep -B1 'Starting validator with' <path/to/logfile>
Snapshots
Validators operators who have not experienced significant downtime (multiple hours of downtime), should avoid downloading snapshots. It is important for the health of the cluster as well as your validator history to maintain the local ledger. Therefore, you should not download a new snapshot any time your validator is offline or experiences an issue. Downloading a snapshot should only be reserved for occasions when you do not have local state. Prolonged downtime or the first install of a new validator are examples of times when you may not have state locally. In other cases such as restarts for upgrades, a snapshot download should be avoided.
To avoid downloading a snapshot on restart, add the following flag to the
agave-validator
command:
--no-snapshot-fetch
If you use this flag with the agave-validator
command, make sure that you run
solana catchup <pubkey>
after your validator starts to make sure that the
validator is catching up in a reasonable time. After some time (potentially a
few hours), if it appears that your validator continues to fall behind, then you
may have to download a new snapshot.
Downloading Snapshots
If you are starting a validator for the first time, or your validator has fallen too far behind after a restart, then you may have to download a snapshot.
To download a snapshot, you must NOT use the --no-snapshot-fetch
flag.
Without the flag, your validator will automatically download a snapshot from
your known validators that you specified with the --known-validator
flag.
If one of the known validators is downloading slowly, you can try adding the
--minimal-snapshot-download-speed
flag to your validator. This flag will
switch to another known validator if the initial download speed is below the
threshold that you set.
Manually Downloading Snapshots
In the case that there are network troubles with one or more of your known
validators, then you may have to manually download the snapshot. To manually
download a snapshot from one of your known validators, first, find the IP
address of the validator in using the solana gossip
command. In the example
below, 5D1fNXzvv5NjV1ysLjirC4WY92RNsVH18vjmcszZd8on
is the pubkey of one of my
known validators:
solana gossip | grep 5D1fNXzvv5NjV1ysLjirC4WY92RNsVH18vjmcszZd8on
The IP address of the validators is 139.178.68.207
and the open port on this
validator is 80
. You can see the IP address and port in the fifth column in
the gossip output:
139.178.68.207 | 5D1fNXzvv5NjV1ysLjirC4WY92RNsVH18vjmcszZd8on | 8001 | 8004 | 139.178.68.207:80 | 1.10.27 | 1425680972
Now that the IP and port are known, you can download a full snapshot or an incremental snapshot:
wget --trust-server-names http://139.178.68.207:80/snapshot.tar.bz2
wget --trust-server-names http://139.178.68.207:80/incremental-snapshot.tar.bz2
Now move those files into your snapshot directory. If you have not specified a snapshot directory, then you should put the files in your ledger directory.
Once you have a local snapshot, you can restart your validator with the
--no-snapshot-fetch
flag.
Regularly Check Account Balances
It is important that you do not accidentally run out of funds in your identity
account, as your node will stop voting. It is also important to note that this
account keypair is the most vulnerable of the three keypairs in a vote account
because the keypair for the identity account is stored on your validator when
running the agave-validator
software. How much SOL you should store there is
up to you. As a best practice, make sure to check the account regularly and
refill or deduct from it as needed. To check the account balance do:
solana balance validator-keypair.json
Note
agave-watchtower
can monitor for a minimum validator identity balance. See monitoring best practices for details.
Withdrawing From The Vote Account
As a reminder, your withdrawer's keypair should NEVER be stored on your server. It should be stored on a hardware wallet, paper wallet, or multisig mitigates the risk of hacking and theft of funds.
To withdraw your funds from your vote account, you will need to run
solana withdraw-from-vote-account
on a trusted computer. For example, on a
trusted computer, you could withdraw all of the funds from your vote account
(excluding the rent exempt minimum). The below example assumes you have a
separate keypair to store your funds called person-keypair.json
solana withdraw-from-vote-account \
vote-account-keypair.json \
person-keypair.json ALL \
--authorized-withdrawer authorized-withdrawer-keypair.json
To get more information on the command, use
solana withdraw-from-vote-account --help
.
For a more detailed explanation of the different keypairs and other related operations refer to vote account management.