Set Up an Agave Validator
This is a guide for getting your validator set up on the Solana testnet cluster for the first time. Testnet is a Solana cluster used for performance testing of the software before it is used on mainnet. Since testnet is stress tested daily, it is a good cluster on which to practice validator operations.
Once you have a working validator on testnet, you will want to learn about operational best practices in the next section. Although the guide is specific to testnet, it can be adapted to mainnet or devnet as well.
Refer to the Available Clusters section of the documentation to see example commands for each cluster.
Now let's get started.
Open The Terminal Program
To start this guide, you will be running commands on your trusted computer, not on the remote machine that you plan to use for validator operations. First, locate the terminal program on your trusted computer.
- On Mac, you can search for the word terminal in Spotlight.
- On Ubuntu, you can type CTRL + Alt + T.
- On Windows, you will have to open the command prompt as an Administrator.
Install The Solana CLI Locally
To create your validator vote account, you need to install the Solana command line interface on your local computer.
You can either use Solana's Install Tool section from within these docs to install the CLI, or you can build from source.
Building from source is a great option for those who want a more secure and potentially more performant executable.
Once the Solana CLI is installed, return to this document and check that the following command prints a version in your terminal:
solana --version
You should see an output that looks similar to this (note your version number may be higher):
solana-cli 1.14.17 (src:b29a37cf; feat:3488713414)
Once you have successfully installed the CLI, the next step is to change your config so that it is making requests to the testnet cluster:
solana config set --url https://api.testnet.solana.com
To verify that your config has changed, run:
solana config get
You should see a line that says: RPC URL: https://api.testnet.solana.com
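The same check can be scripted. The following is a minimal sketch rather than part of the official tooling: the helper name rpc_url_from_config is hypothetical, and it assumes the "RPC URL:" line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: extract the RPC URL from `solana config get` output
# supplied on stdin. Assumes the "RPC URL: <url>" line format shown above.
rpc_url_from_config() {
  awk '/^RPC URL:/ { print $3; exit }'
}

# Example usage (requires the Solana CLI):
#   url=$(solana config get | rpc_url_from_config)
#   [ "$url" = "https://api.testnet.solana.com" ] || echo "unexpected cluster: $url"
```

Reading the output from stdin keeps the helper independent of the CLI, so it can be tried against saved output first.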
Create Keys
On your local computer, create the 3 keypairs that you will need to run your validator (docs for reference):
NOTE: Some operators choose to make vanity keypairs for their identity and vote account using the grind subcommand (docs for reference).
solana-keygen new -o validator-keypair.json
solana-keygen new -o vote-account-keypair.json
solana-keygen new -o authorized-withdrawer-keypair.json
IMPORTANT: The authorized-withdrawer-keypair.json should be considered very sensitive information. Many operators choose to use a multisig, hardware wallet, or paper wallet for the authorized withdrawer keypair. A keypair is created on disk in this example for simplicity. Additionally, the withdrawer keypair should always be stored safely. The authorized withdrawer keypair should never be stored on the remote machine that the validator software runs on. For more information, see validator security best practices.
Create a Vote Account
Before you can create your vote account, you need to configure the Solana command line tool a bit more.
The command below sets the default keypair that the Solana CLI uses to the validator-keypair.json file that you just created in the terminal:
solana config set --keypair ./validator-keypair.json
Now verify that your account balance is 0:
solana balance
Next, you need to deposit some SOL into that keypair account in order to create a transaction (in this case, making your vote account):
solana airdrop 1
NOTE: The airdrop subcommand does not work on mainnet, so you will have to acquire SOL and transfer it into this keypair's account if you are setting up a mainnet validator.
Now, use the Solana cluster to create a vote account.
As a reminder, all commands mentioned so far should be done on your trusted computer and NOT on a server where you intend to run your validator. It is especially important that the following command is done on a trusted computer:
solana create-vote-account -ut \
--fee-payer ./validator-keypair.json \
./vote-account-keypair.json \
./validator-keypair.json \
./authorized-withdrawer-keypair.json
Note: -ut tells the CLI command that we would like to use the testnet cluster. --fee-payer specifies the keypair that will be used to pay the transaction fees. Neither flag is necessary if you configured the Solana CLI properly above, but they are useful to ensure you're using the intended cluster and keypair.
Save the Withdrawer Keypair Securely
Make sure your authorized-withdrawer-keypair.json is stored in a safe place. If you have chosen to create a keypair on disk, you should first back up the keypair and then delete it from your local machine.
IMPORTANT: If you lose your withdrawer keypair, you will lose control of your vote account. You will not be able to withdraw tokens from the vote account or update the withdrawer. Make sure to store the authorized-withdrawer-keypair.json securely before you move on.
SSH To Your Validator
Connect to your remote server. This is specific to your server but will look something like this:
ssh user@<server.hostname>
You will have to check with your server provider to get the correct user account and hostname that you will ssh into.
Update Your Ubuntu Packages
Make sure you have the latest package versions on your server:
sudo apt update
sudo apt upgrade
Sol User
Create a new Ubuntu user, named sol, for running the validator:
sudo adduser sol
It is a best practice to always run your validator as a non-root user, like the sol user we just created.
Hard Drive Setup
On your Ubuntu computer make sure that you have at least 2TB of disk space mounted. You can check disk space using the df command:
df -h
If you have a drive that is not mounted/formatted, you will have to set up the partition and mount the drive.
To see the hard disk devices that you have available, use the list block devices command:
lsblk -f
You may see some devices in the list that have a name but do not have a UUID. Any device without a UUID is unformatted.
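To pick those devices out of a long list, the output can be filtered. This is a minimal sketch: devices_without_uuid is a hypothetical helper name, and it expects the two-column output of lsblk -rno NAME,UUID on stdin. Note that a whole disk that carries partitions also has no UUID of its own, so treat the result as a list of candidates rather than a verdict.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME UUID" pairs (as printed by
# `lsblk -rno NAME,UUID`) on stdin and print the names that have no UUID.
devices_without_uuid() {
  # A line with a single field has a device name but an empty UUID column.
  awk 'NF == 1 { print $1 }'
}

# Example usage on the server:
#   lsblk -rno NAME,UUID | devices_without_uuid
```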
Drive Formatting: Ledger
Assuming you have an NVMe drive that is not formatted, you will have to format the drive and then mount it.
For example, if your computer has a device located at /dev/nvme0n1, then you can format the drive with the command:
sudo mkfs -t ext4 /dev/nvme0n1
For your computer, the device name and location may be different.
Next, check that you now have a UUID for that device:
lsblk -f
In the UUID column, next to your device name, you should see a string of letters and numbers that looks like this: 6abd1aa5-8422-4b18-8058-11f821fd3967. That is the UUID for the device.
Mounting Your Drive: Ledger
So far we have created a formatted drive, but you do not have access to it until you mount it. Make a directory for mounting your drive:
sudo mkdir -p /mnt/ledger
Next, change the ownership of the directory to your sol user:
sudo chown -R sol:sol /mnt/ledger
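Now you can mount the drive with the command below. A newly created ext4 filesystem has a root directory owned by root, so if the sol user cannot write to /mnt/ledger once the drive is mounted, run the chown command above again.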
sudo mount /dev/nvme0n1 /mnt/ledger
Formatting And Mounting Drive: AccountsDB
You will also want to mount the accounts db on a separate hard drive. The process will be similar to the ledger example above.
Assuming you have a device at /dev/nvme1n1, format the device:
sudo mkfs -t ext4 /dev/nvme1n1
Then verify the UUID for the device exists:
lsblk -f
Create a directory for mounting:
sudo mkdir -p /mnt/accounts
Change the ownership of that directory:
sudo chown -R sol:sol /mnt/accounts
And lastly, mount the drive:
sudo mount /dev/nvme1n1 /mnt/accounts
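A drive mounted with the mount command stays mounted only until the next reboot. One common way to make the mounts persistent is an /etc/fstab entry keyed on the UUID reported by lsblk -f. The sketch below only prints a candidate line; the helper name fstab_line and the mount options (defaults, 0 2) are assumptions, not settings taken from this guide.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/fstab line for an ext4 filesystem.
# Arguments: <uuid> <mount-point>. The options "defaults 0 2" are an assumption.
fstab_line() {
  printf 'UUID=%s %s ext4 defaults 0 2\n' "$1" "$2"
}

# Example with the sample UUID from this guide. Review the printed line
# before adding it to /etc/fstab yourself.
fstab_line 6abd1aa5-8422-4b18-8058-11f821fd3967 /mnt/ledger
```

Using the UUID rather than the device path keeps the entry valid if device names change between boots.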
System Tuning
Linux
Your system will need to be tuned in order to run properly. Your validator may not start without the settings below.
Optimize sysctl knobs
sudo bash -c "cat >/etc/sysctl.d/21-agave-validator.conf <<EOF
# Increase UDP buffer sizes
net.core.rmem_default = 134217728
net.core.rmem_max = 134217728
net.core.wmem_default = 134217728
net.core.wmem_max = 134217728
# Increase memory mapped files limit
vm.max_map_count = 1000000
# Increase number of allowed open file descriptors
fs.nr_open = 1000000
EOF"
sudo sysctl -p /etc/sysctl.d/21-agave-validator.conf
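To confirm that the settings took effect, the live values can be read back from /proc/sys. This is a minimal sketch under that assumption; check_min is a hypothetical helper name, and the optional third argument exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: check that a kernel setting is at least a minimum value.
# Arguments: <sysctl-key> <minimum> [root-dir]; root-dir defaults to /proc/sys.
check_min() {
  key=$1
  min=$2
  root=${3:-/proc/sys}
  # sysctl keys map to paths under /proc/sys with dots replaced by slashes.
  path="$root/$(printf '%s' "$key" | tr . /)"
  if [ ! -r "$path" ]; then
    echo "$key: cannot read $path"
    return 2
  fi
  cur=$(cat "$path")
  if [ "$cur" -ge "$min" ]; then
    echo "$key: ok ($cur)"
  else
    echo "$key: too low ($cur < $min)"
    return 1
  fi
}

# Example usage on the server, with the values from the file above:
#   check_min net.core.rmem_max 134217728
#   check_min vm.max_map_count 1000000
#   check_min fs.nr_open 1000000
```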
Increase systemd and session file limits
Add LimitNOFILE=1000000 to the [Service] section of your systemd service file, if you use one, otherwise add DefaultLimitNOFILE=1000000 to the [Manager] section of /etc/systemd/system.conf.
sudo systemctl daemon-reload
sudo bash -c "cat >/etc/security/limits.d/90-solana-nofiles.conf <<EOF
# Increase process file descriptor count limit
* - nofile 1000000
EOF"
### Close all open sessions (log out, then log in again) ###
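After logging back in, the new limit can be confirmed from the shell. This is a minimal sketch, assuming the 1000000 limit configured above; nofile_ok is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed when an open-file limit is at least 1000000.
# Argument: the limit to check, e.g. the output of `ulimit -n`.
nofile_ok() {
  case $1 in
    unlimited) return 0 ;;
    ''|*[!0-9]*) return 2 ;;  # empty or not a number
  esac
  [ "$1" -ge 1000000 ]
}

# Report on the current shell session.
if nofile_ok "$(ulimit -n)"; then
  echo "open file limit is sufficient"
else
  echo "open file limit is too low: $(ulimit -n)"
fi
```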
Copy Key Pairs
On your personal computer, not on the validator, securely copy your validator-keypair.json file and your vote-account-keypair.json file to the validator server:
scp validator-keypair.json sol@<server.hostname>:
scp vote-account-keypair.json sol@<server.hostname>:
Note: The vote-account-keypair.json does not have any function other than identifying the vote account to potential delegators. Only the public key of the vote account is important once the account is created.
Switch to the sol User
On the validator server, switch to the sol user:
su - sol
Install The Solana CLI on Remote Machine
Your remote machine will need the Solana CLI installed to run the Agave validator software. For simplicity, install the CLI as the sol user. Refer again to Solana's Install Tool or build from source. It is best for operators to build from source rather than using the prebuilt binaries.
Create A Validator Startup Script
In your sol home directory (e.g. /home/sol/), create a folder called bin. Inside that folder create a file called validator.sh and make it executable:
mkdir -p /home/sol/bin
touch /home/sol/bin/validator.sh
chmod +x /home/sol/bin/validator.sh
Next, open the validator.sh file for editing:
nano /home/sol/bin/validator.sh
Copy and paste the following contents into validator.sh, then save the file:
#!/bin/bash
exec agave-validator \
--identity /home/sol/validator-keypair.json \
--vote-account /home/sol/vote-account-keypair.json \
--known-validator 5D1fNXzvv5NjV1ysLjirC4WY92RNsVH18vjmcszZd8on \
--known-validator 7XSY3MrYnK8vq693Rju17bbPkCN3Z7KvvfvJx4kdrsSY \
--known-validator Ft5fbkqNa76vnsjYNwjDZUXoTWpP7VYm3mtsaQckQADN \
--known-validator 9QxCLckBiJc783jnMvXZubK4wH86Eqqvashtrwvcsgkv \
--only-known-rpc \
--log /home/sol/agave-validator.log \
--ledger /mnt/ledger \
--accounts /mnt/accounts \
--rpc-port 8899 \
--dynamic-port-range 8000-8020 \
--entrypoint entrypoint.testnet.solana.com:8001 \
--entrypoint entrypoint2.testnet.solana.com:8001 \
--entrypoint entrypoint3.testnet.solana.com:8001 \
--expected-genesis-hash 4uhcVJyU9pJkvQyS88uRDiswHXSCkY3zQawwpjk2NsNY \
--wal-recovery-mode skip_any_corrupted_record \
--limit-ledger-size
Refer to agave-validator --help for more information on what each flag is doing in this script. Also refer to the section on best practices for operating a validator.
This startup script is specifically intended for testnet. For more startup script examples intended for other clusters, refer to the clusters section.
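Before the first start, it can help to confirm that the paths named in the script exist. The following is a minimal sketch, not part of the Agave tooling: preflight is a hypothetical helper, and its defaults simply mirror the paths used in validator.sh above. It also flags a withdrawer keypair left on the server, since this guide says that keypair should never be stored there.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the paths used in validator.sh.
# Arguments: [home-dir] [ledger-dir] [accounts-dir]; defaults match this guide.
preflight() {
  home=${1:-/home/sol}
  ledger=${2:-/mnt/ledger}
  accounts=${3:-/mnt/accounts}
  status=0
  for f in "$home/validator-keypair.json" "$home/vote-account-keypair.json"; do
    [ -r "$f" ] || { echo "missing keypair: $f"; status=1; }
  done
  for d in "$ledger" "$accounts"; do
    [ -d "$d" ] && [ -w "$d" ] || { echo "missing or unwritable directory: $d"; status=1; }
  done
  # The withdrawer keypair must never live on the validator machine.
  if [ -e "$home/authorized-withdrawer-keypair.json" ]; then
    echo "authorized-withdrawer-keypair.json found in $home; remove it"
    status=1
  fi
  return "$status"
}

# Example usage, run as the sol user before starting the validator:
#   preflight && /home/sol/bin/validator.sh
```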
Verifying Your Validator Is Working
Test that your validator.sh file is running properly by executing the validator.sh script:
/home/sol/bin/validator.sh
The script should execute the agave-validator process. In a new terminal window, ssh into your server, then verify that the process is running:
ps aux | grep agave-validator
You should see a line in the output that includes agave-validator with all the flags that were added to your validator.sh script.
Next, we need to look at the logs to make sure everything is operating properly.
Tailing The Logs
As a spot check, you will want to make sure your validator is producing reasonable log output (warning, there will be a lot of log output).
In a new terminal window, ssh into your validator machine, switch users to the sol user and tail the logs:
su - sol
tail -f agave-validator.log
The tail command will continue to display the output of a file as the file changes. You should see a continuous stream of log output as your validator runs. Keep an eye out for any lines that say ERROR.
Assuming you do not see any error messages, exit out of the command.
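Because the log scrolls quickly, counting matching lines can be easier than watching for them. This is a minimal sketch; count_errors is a hypothetical helper name, and it simply counts lines containing the text ERROR.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that contain the text ERROR.
# Argument: path to the validator log file.
count_errors() {
  # grep -c exits non-zero when the count is 0, so ignore its exit status.
  grep -c 'ERROR' "$1" || :
}

# Example usage as the sol user:
#   count_errors /home/sol/agave-validator.log
```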
Gossip Protocol
Gossip is a protocol used in the Solana clusters to communicate between validator nodes. For more information on gossip, see Gossip Service. To verify that your validator is running properly, make sure that the validator has registered itself with the gossip network.
In a new terminal window, connect to your server via ssh. Identify your validator's pubkey:
solana-keygen pubkey ~/validator-keypair.json
The command solana gossip lists all validators that have registered with the protocol. To check that the newly set up validator is in gossip, we will grep for our pubkey in the output:
solana gossip | grep <pubkey>
After running the command, you should see a single line that looks like this:
139.178.68.207 | 5D1fNXzvv5NjV1ysLjirC4WY92RNsVH18vjmcszZd8on | 8001 | 8004 | 139.178.68.207:80 | 1.14.17 | 3488713414
If you do not see any output after grep-ing the output of gossip, your validator may be having startup problems. If that is the case, start debugging by looking through the validator log output.
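The gossip check can be wrapped so that it reports a clear result instead of empty output. This is a minimal sketch; in_gossip is a hypothetical helper name that reads solana gossip output on stdin.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the given pubkey appears in
# `solana gossip` output supplied on stdin.
in_gossip() {
  # -F treats the pubkey as a fixed string, -q suppresses output.
  grep -F -q -- "$1"
}

# Example usage on the server (requires the Solana CLI):
#   pubkey=$(solana-keygen pubkey ~/validator-keypair.json)
#   if solana gossip | in_gossip "$pubkey"; then
#     echo "validator is in gossip"
#   else
#     echo "validator is not in gossip yet; check the logs"
#   fi
```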
Solana Validators
After you have verified that your validator is in gossip, you should stake some SOL to your validator. Once the stake has activated (which happens at the start of the next epoch), you can verify that your validator is ready to be a voting participant of the network with the solana validators command. The command lists all validators in the network, but like before, we can grep the output for the validator we care about:
solana validators | grep <pubkey>
You should see a line of output that looks like this:
5D1fNXzvv5NjV1ysLjirC4WY92RNsVH18vjmcszZd8on FX6NNbS5GHc2kuzgTZetup6GZX6ReaWyki8Z8jC7rbNG 100% 197434166 ( 0) 197434133 ( 0) 2.11% 323614 1.14.17 2450110.588302720 SOL (1.74%)
Solana Catchup
The solana catchup command is a useful tool for seeing how quickly your validator is processing blocks. The Solana network has the capability to produce many transactions per second. Since your validator is new to the network, it has to ask another validator (listed as a --known-validator in your startup script) for a recent snapshot of the ledger. By the time you receive the snapshot, you may already be behind the network. Many transactions may have been processed and finalized in that time. In order for your validator to participate in consensus, it must catch up to the rest of the network by asking for the more recent transactions that it does not have.
The solana catchup command is a tool that tells you how far behind the network your validator is and how quickly you are catching up:
solana catchup <pubkey>
If you see a message about trying to connect, your validator may not be part of the network yet. Make sure to check the logs and double check solana gossip and solana validators to make sure your validator is running properly.
Once you are happy that the validator can start up without errors, the next step is to create a system service to run the validator.sh file automatically. Stop the currently running validator by pressing CTRL+C in the window where validator.sh is running.
Create a System Service
Follow these instructions for running the validator as a system service.
Make sure to implement log rotation as well. Once you have the system service configured, start your validator using the newly configured service:
sudo systemctl enable --now sol
Now verify that the validator is running properly by tailing the logs and using the commands mentioned earlier to check gossip and Solana validators:
tail -f /home/sol/agave-validator*.log
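For orientation, a unit file for the sol service used above might look like the one written by the sketch below. It is an illustration, not the content of the linked instructions: write_sol_unit is a hypothetical helper, the unit name sol.service is inferred from the systemctl command above, and the PATH entry is an assumption about where agave-validator is installed.

```shell
#!/bin/sh
# Hypothetical helper: write an example unit file for the validator to the
# given path. Review it before copying it to /etc/systemd/system/sol.service.
write_sol_unit() {
  cat > "$1" <<'EOF'
[Unit]
Description=Agave Validator
After=network.target

[Service]
Type=simple
Restart=always
RestartSec=1
User=sol
LimitNOFILE=1000000
# Assumption: agave-validator is installed in this directory; adjust as needed.
Environment="PATH=/bin:/usr/bin:/home/sol/.local/share/solana/install/active_release/bin"
ExecStart=/home/sol/bin/validator.sh

[Install]
WantedBy=multi-user.target
EOF
}

# Example usage:
#   write_sol_unit ./sol.service
```

The unit runs validator.sh as the sol user and carries the LimitNOFILE=1000000 setting described in the system tuning section.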
Monitoring
agave-watchtower is a command you can run on a separate machine to monitor your server. You can read more about handling automatic restarts and monitoring using Solana Watchtower here in the docs.
Common issues
Out of disk space
Make sure your ledger is on a drive with at least 2TB of space.
Validator not catching up
This could be a networking/hardware issue, or you may need to get the latest snapshot from another validator node.