How to configure a Data Availability Committee: deploy a Data Availability Server (DAS) as a mirror
This document is currently in public preview and may change significantly as feedback is captured from readers like you. Click the Request an update button at the top of this document or join the Arbitrum Discord to share your feedback.
AnyTrust chains rely on an external Data Availability Committee (DAC) to store data and provide it on demand instead of using the parent chain as the Data Availability (DA) layer. The members of the DAC run a Data Availability Server (DAS) to handle these operations.
In this how-to, you'll learn how to deploy a DAS as a mirror and enable a REST interface that responds to requests for stored data. For more information related to configuring a DAC, please see the Introduction.
You should be familiar with how the AnyTrust protocol works and with the role the DAC plays in it. You can find more information about the AnyTrust protocol in Inside AnyTrust. Familiarity with Kubernetes is also recommended, as the examples in this guide are based on it.
How does a Data Availability Server work?
A Data Availability Server (DAS) allows storage and retrieval of transaction data batches for an AnyTrust chain. It is the software that the members of the DAC run in order to provide the Data Availability service. It can be run in two modes: committee member or mirror.
Committee member DAS accept time-limited requests to store data batches from the sequencer of an AnyTrust chain, and return a signed certificate promising to store that data for the established period.
Mirror DAS (and, optionally, committee member servers) respond to requests to retrieve the data batches. They may also provide archived data beyond the limited time that committee member DAS are required to store the data. Mirror DAS serve two main purposes:
- Prevent the committee member servers from having to serve requests for data, allowing them to focus only on storing the data sent to them.
- Provide resiliency to the network in the case of a committee member DAS going down.
Configuration options
When setting up a DAS, there are certain options you can configure to suit your infrastructure needs:
Interfaces available in a DAS
There are two main interfaces that can be enabled in a DAS: an RPC interface to store data in the DAS, intended to be used only by the AnyTrust sequencer; and a REST interface that supports only GET operations and is intended for public use.
Committee member DA servers listen on the RPC interface for das_store RPC messages coming from the sequencer. The sequencer signs its requests and the DAS checks the signature.
Mirror DA servers (and, optionally, committee member servers if they enable this option) listen on the REST interface and respond to queries on /get-by-hash/<hex encoded data hash>. The response is always the same for a given hash.
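As a rough illustration of the REST query described above, the following Python sketch builds a /get-by-hash/ URL and fetches the raw response body. The function names, the endpoint value, and the tolerance for an optional "0x" prefix are assumptions made for this example; the guide does not specify the response encoding, so the body is returned unparsed.

```python
from urllib.request import urlopen


def get_by_hash_url(rest_endpoint: str, data_hash_hex: str) -> str:
    """Build the /get-by-hash/ URL for a hex-encoded data hash."""
    # Tolerating a leading "0x" during validation is an assumption of this sketch.
    digits = data_hash_hex[2:] if data_hash_hex.startswith("0x") else data_hash_hex
    if not digits:
        raise ValueError("empty data hash")
    bytes.fromhex(digits)  # raises ValueError if the hash is not valid hex
    return rest_endpoint.rstrip("/") + "/get-by-hash/" + data_hash_hex


def fetch_batch(rest_endpoint: str, data_hash_hex: str, timeout: float = 10.0) -> bytes:
    """Send the GET request and return the raw response body."""
    with urlopen(get_by_hash_url(rest_endpoint, data_hash_hex), timeout=timeout) as response:
        return response.read()
```

For example, `get_by_hash_url("http://localhost:9877", some_hash)` points at a mirror listening on the default REST port.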
Finally, IPFS is an alternative interface that serves batch retrieval. A mirror DAS can be configured to sync and pin batches to its local IPFS repository, then act as a node in the IPFS peer-to-peer network. The advantage of using this interface is that a Nitro node that is configured to use IPFS will use the batch hashes to find the batch data on the IPFS peer-to-peer network. Depending on the network configuration, that Nitro node may then also act as an IPFS node serving the batch data.
Storage options
A DAS can be configured to use one or more of four storage backends:
- AWS S3 bucket
- Local Badger database
- Local files
- IPFS
If more than one backend is enabled, a store request must succeed on all of them to be considered successful, while a retrieve request only needs one of them to succeed.
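The store and retrieve semantics above can be sketched in Python as follows. This is a simplified model, not the DAS implementation: the class names and the dictionary-backed backend are hypothetical, and backends are visited sequentially here, which may differ from how the real server does it.

```python
from typing import Dict, Iterable


class DictBackend:
    """Hypothetical in-memory stand-in for a storage backend."""

    def __init__(self) -> None:
        self.items: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self.items[key] = value

    def get(self, key: bytes) -> bytes:
        return self.items[key]  # raises KeyError when the batch is missing


class RedundantStorage:
    """Store to every backend; retrieve from the first one that has the data."""

    def __init__(self, backends: Iterable[DictBackend]) -> None:
        self.backends = list(backends)

    def store(self, key: bytes, value: bytes) -> None:
        # Any exception raised by a backend propagates, so the store
        # only counts as successful if every backend accepted the data.
        for backend in self.backends:
            backend.put(key, value)

    def retrieve(self, key: bytes) -> bytes:
        # A single backend holding the data is enough for success.
        for backend in self.backends:
            try:
                return backend.get(key)
            except KeyError:
                continue
        raise KeyError(key)
```

With two backends, a batch deleted from one of them can still be retrieved from the other, which is the resiliency the paragraph describes.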
If there are other storage backends you'd like us to support you can send us a message on Discord, or use the “Request an update” button on top of this page.
Caching
An in-memory cache can be enabled to avoid needing to access underlying storage for retrieve requests.
Requests sent to the REST interface (to retrieve data from the DAS) always return the same data for a given hash, so the result is cacheable. The response also contains a cache-control header specifying that the object is immutable and can be cached for up to 28 days.
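If you place a caching proxy in front of the REST endpoint, it can help to check what the header actually permits. The Python sketch below parses a cache-control value and converts its max-age directive into a duration. The exact header string the DAS sends is not given in this guide, so the value used in the usage note is an assumed example that matches the description (immutable, 28 days).

```python
from datetime import timedelta
from typing import Dict, Optional, Union


def parse_cache_control(header_value: str) -> Dict[str, Union[str, bool]]:
    """Split a Cache-Control header into directives (flags map to True)."""
    directives: Dict[str, Union[str, bool]] = {}
    for part in header_value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name] = value.strip('"') if value else True
    return directives


def cache_lifetime(header_value: str) -> Optional[timedelta]:
    """Return the max-age as a timedelta, or None if it is absent or invalid."""
    max_age = parse_cache_control(header_value).get("max-age")
    if not isinstance(max_age, str) or not max_age.isdigit():
        return None
    return timedelta(seconds=int(max_age))
```

For an assumed header value such as `public, max-age=2419200, immutable`, `cache_lifetime` returns 28 days, since 28 × 24 × 3600 = 2,419,200 seconds.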
State synchronization
DA servers also have an optional REST aggregator which, when a data batch is not found in cache or storage, requests that batch from other REST servers defined in a list and stores it upon receiving it. This is how a DAS that misses storing a batch (the AnyTrust protocol doesn't require all of them to report success in order to post the batch's certificate to the parent chain) can automatically repair gaps in the data it stores, and also how a mirror DAS can sync its data. A public list of REST endpoints is published online, which the DAS can be configured to download and use, and additional endpoints can be specified in the configuration.
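The lookup order described above (local data first, then remote REST endpoints, then store what was found) can be modeled with a short Python sketch. The function and type names are hypothetical, remote endpoints are represented as plain callables, and verification that the returned data matches the requested hash is deliberately left out because the hashing scheme is not covered in this guide.

```python
from typing import Callable, Dict, Iterable, Optional

# A fetcher returns the batch bytes, or None if that endpoint lacks the batch.
Fetcher = Callable[[str], Optional[bytes]]


def retrieve_with_fallback(
    data_hash: str,
    local_store: Dict[str, bytes],
    remote_fetchers: Iterable[Fetcher],
) -> bytes:
    """Serve from local storage, else ask remote endpoints and keep a copy."""
    if data_hash in local_store:
        return local_store[data_hash]
    for fetch in remote_fetchers:
        try:
            data = fetch(data_hash)
        except OSError:
            continue  # an unreachable endpoint is skipped, not fatal
        if data is not None:
            # A real server should verify the data against the hash here.
            local_store[data_hash] = data
            return data
    raise KeyError(data_hash)
```

After the first successful remote fetch, later requests for the same hash are served locally, which is how a gap in stored data gets repaired.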
How to deploy the DAS as a mirror
We now start the process of deploying the DAS as a mirror. Remember that you can reach out to us on Discord if you are having trouble following this process.
Step 0: Prerequisites
To set up your DAS, you'll need the following information:
- The latest Nitro Docker image: offchainlabs/nitro-node:v2.2.2-8f33fea
- An RPC endpoint for the parent chain. It is recommended to use a third-party provider RPC or run your own node to prevent being rate limited.
- The SequencerInbox contract address in the parent chain.
- URL of the list of REST endpoints of other DA servers to configure the REST aggregator.
Step 1: Set up a persistent volume
First, we'll set up a volume to store the DAS database. In k8s, we can use a configuration like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: das-mirror
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
  storageClassName: gp2
Step 2: Deploy the mirror DAS
To run the mirror DAS, we'll use the daserver tool and we'll configure the following parameters:
Parameter | Description |
---|---|
--data-availability.parent-chain-node-url | RPC endpoint of a parent chain node |
--data-availability.sequencer-inbox-address | Address of the SequencerInbox in the parent chain |
--enable-rest | Enables the REST server listening on --rest-addr and --rest-port |
--rest-addr | REST server listening interface (default "localhost") |
--rest-port | (Optional) REST server listening port (default 9877) |
--log-level | Log level: 1 - ERROR, 2 - WARN, 3 - INFO, 4 - DEBUG, 5 - TRACE (default 3) |
--data-availability.rest-aggregator.enable | Enables retrieval of sequencer batch data from a list of remote REST endpoints |
--data-availability.rest-aggregator.online-url-list | A URL to a list of URLs of REST DAS endpoints that is checked at startup. This option is additive with the urls option |
--data-availability.rest-aggregator.urls | List of URLs including 'http://' or 'https://' prefixes and port numbers to REST DAS endpoints. This option is additive with the online-url-list option |
--data-availability.rest-aggregator.sync-to-storage.check-already-exists | When using a REST aggregator, checks if the data already exists in this DAS's storage. Must be disabled for fast sync with an IPFS backend (default true) |
--data-availability.rest-aggregator.sync-to-storage.eager | When using a REST aggregator, eagerly syncs batch data to this DAS's storage from the rest endpoints, using the parent chain as the index of batch data hashes; otherwise only syncs lazily |
--data-availability.rest-aggregator.sync-to-storage.eager-lower-bound-block | When using a REST aggregator that's eagerly syncing, starts indexing forward from this block from the parent chain. Only used if there is no sync state. |
--data-availability.rest-aggregator.sync-to-storage.retention-period | When using a REST aggregator, period to retain the synced data (defaults to forever) |
--data-availability.rest-aggregator.sync-to-storage.state-dir | When using a REST aggregator, directory to store the sync state in, i.e. the block number currently synced up to, so that it doesn't sync from scratch each time |
To enable caching, you can use the following parameters:
Parameter | Description |
---|---|
--data-availability.local-cache.enable | Enables local in-memory caching of sequencer batch data |
--data-availability.local-cache.expiration | Expiration time for in-memory cached sequencer batches (default 1h0m0s) |
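To make the effect of the expiration setting concrete, here is a minimal Python sketch of an in-memory cache whose entries expire after a fixed time (one hour by default, mirroring the default above). It is only a conceptual model: the class name is hypothetical, and whether the real cache measures expiry from insertion or from last access is an assumption of this sketch (insertion is used here).

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ExpiringCache:
    """In-memory cache whose entries expire a fixed time after insertion."""

    def __init__(
        self,
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def put(self, key: str, value: bytes) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: fall back to underlying storage
            return None
        return value
```

A `None` result means the caller has to read from the configured storage backends instead.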
Finally, for the storage backends you wish to configure, use the following parameters:
AWS S3 bucket
Parameter | Description |
---|---|
--data-availability.s3-storage.enable | Enables storage/retrieval of sequencer batch data from an AWS S3 bucket |
--data-availability.s3-storage.access-key | S3 access key |
--data-availability.s3-storage.bucket | S3 bucket |
--data-availability.s3-storage.region | S3 region |
--data-availability.s3-storage.secret-key | S3 secret key |
--data-availability.s3-storage.object-prefix | Prefix to add to S3 objects |
--data-availability.s3-storage.discard-after-timeout | Whether to discard data after its expiry timeout (setting it to false activates the “archive” mode) |
Local Badger database
Parameter | Description |
---|---|
--data-availability.local-db-storage.enable | Enables storage/retrieval of sequencer batch data from a database on the local filesystem |
--data-availability.local-db-storage.data-dir | Absolute path of the directory inside the volume in which to store the database (it must exist) |
--data-availability.local-db-storage.discard-after-timeout | Whether to discard data after its expiry timeout (setting it to false activates the “archive” mode) |
Local files
Parameter | Description |
---|---|
--data-availability.local-file-storage.enable | Enables storage/retrieval of sequencer batch data from a directory of files, one per batch |
--data-availability.local-file-storage.data-dir | Absolute path of the directory inside the volume in which to store the data (it must exist) |
IPFS
Parameter | Description |
---|---|
--data-availability.ipfs-storage.enable | Enables storage/retrieval of sequencer batch data from IPFS |
--data-availability.ipfs-storage.profiles | Comma-separated list of IPFS profiles to use |
--data-availability.ipfs-storage.read-timeout | Timeout for IPFS reads, since by default it will wait forever. A timeout is treated as not found (default 1m0s) |
Here's an example daserver command for a mirror DAS that:
- Enables local cache
- Enables AWS S3 bucket storage that doesn't discard data after expiring (archive)
- Enables local Badger database storage that doesn't discard data after expiring (archive)
- Uses a local committee member DAS as part of the REST aggregator
daserver \
  --data-availability.parent-chain-node-url "<YOUR PARENT CHAIN RPC ENDPOINT>" \
  --data-availability.sequencer-inbox-address "<ADDRESS OF SEQUENCERINBOX ON PARENT CHAIN>" \
  --enable-rest \
  --rest-addr '0.0.0.0' \
  --log-level 3 \
  --data-availability.local-cache.enable \
  --data-availability.rest-aggregator.enable \
  --data-availability.rest-aggregator.urls "http://your-committee-member.svc.cluster.local:9877" \
  --data-availability.rest-aggregator.online-url-list "<URL TO LIST OF REST ENDPOINTS>" \
  --data-availability.rest-aggregator.sync-to-storage.eager \
  --data-availability.rest-aggregator.sync-to-storage.eager-lower-bound-block "<BLOCK NUMBER>" \
  --data-availability.rest-aggregator.sync-to-storage.state-dir /home/user/data/syncState \
  --data-availability.s3-storage.enable \
  --data-availability.s3-storage.access-key "<YOUR ACCESS KEY>" \
  --data-availability.s3-storage.bucket "<YOUR BUCKET>" \
  --data-availability.s3-storage.region "<YOUR REGION>" \
  --data-availability.s3-storage.secret-key "<YOUR SECRET KEY>" \
  --data-availability.s3-storage.object-prefix "<YOUR OBJECT KEY PREFIX>/" \
  --data-availability.s3-storage.discard-after-timeout=false \
  --data-availability.local-db-storage.enable \
  --data-availability.local-db-storage.data-dir /home/user/data/badgerdb \
  --data-availability.local-db-storage.discard-after-timeout=false
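A flag list this long is easy to get out of sync between environments, so you may prefer to generate it from a single mapping. The Python sketch below is one possible approach, not part of the Nitro tooling: the function name is hypothetical, and it only emits the flag forms that already appear in this guide (a bare flag for enabled booleans, `=false` for disabled booleans, and a flag followed by its value otherwise).

```python
import shlex
from typing import Dict, List, Union

OptionValue = Union[bool, int, str]


def build_daserver_args(options: Dict[str, OptionValue]) -> List[str]:
    """Turn a mapping of flag names (without leading dashes) into an argv list."""
    args = ["daserver"]
    for flag, value in options.items():
        if value is True:
            args.append(f"--{flag}")  # enabled boolean: bare flag
        elif value is False:
            args.append(f"--{flag}=false")  # disabled boolean: explicit =false
        else:
            args.extend([f"--{flag}", str(value)])  # flag followed by its value
    return args


def as_shell_command(options: Dict[str, OptionValue]) -> str:
    """Render the argv list as a single, safely quoted shell command line."""
    return shlex.join(build_daserver_args(options))
```

The resulting list can be passed to a process launcher directly, or rendered with `as_shell_command` for pasting into a manifest.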
And here's an example of how to use a k8s deployment to run that command:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: das-mirror
spec:
  replicas: 1
  selector:
    matchLabels:
      app: das-mirror
  strategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 50%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: das-mirror
    spec:
      containers:
        # Kubernetes requires every container to have a name
        - name: das-mirror
          command:
            - bash
            - -c
            - |
              mkdir -p /home/user/data/badgerdb
              mkdir -p /home/user/data/syncState
              /usr/local/bin/daserver --data-availability.parent-chain-node-url "<YOUR PARENT CHAIN RPC ENDPOINT>" --data-availability.sequencer-inbox-address "<ADDRESS OF SEQUENCERINBOX ON PARENT CHAIN>" --enable-rest --rest-addr '0.0.0.0' --log-level 3 --data-availability.local-cache.enable --data-availability.rest-aggregator.enable --data-availability.rest-aggregator.urls "http://your-committee-member.svc.cluster.local:9877" --data-availability.rest-aggregator.online-url-list "<URL TO LIST OF REST ENDPOINTS>" --data-availability.rest-aggregator.sync-to-storage.eager --data-availability.rest-aggregator.sync-to-storage.eager-lower-bound-block "<BLOCK NUMBER>" --data-availability.rest-aggregator.sync-to-storage.state-dir /home/user/data/syncState --data-availability.s3-storage.enable --data-availability.s3-storage.access-key "<YOUR ACCESS KEY>" --data-availability.s3-storage.bucket "<YOUR BUCKET>" --data-availability.s3-storage.region "<YOUR REGION>" --data-availability.s3-storage.secret-key "<YOUR SECRET KEY>" --data-availability.s3-storage.object-prefix "<YOUR OBJECT KEY PREFIX>/" --data-availability.s3-storage.discard-after-timeout=false --data-availability.local-db-storage.enable --data-availability.local-db-storage.data-dir /home/user/data/badgerdb --data-availability.local-db-storage.discard-after-timeout=false
          image: offchainlabs/nitro-node:v2.2.2-8f33fea
          imagePullPolicy: Always
          resources:
            limits:
              cpu: "4"
              memory: 10Gi
            requests:
              cpu: "4"
              memory: 10Gi
          ports:
            - containerPort: 9877
              hostPort: 9877
              protocol: TCP
          volumeMounts:
            - mountPath: /home/user/data/
              name: data
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /health/
              port: 9877
              scheme: HTTP
            initialDelaySeconds: 5
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 1
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: das-mirror
Archive DA servers (for mirror DAS)
Archive DA servers are mirror servers that don't discard any data after it expires. Each DAC should have at least one archive DAS to make sure all historical data is available.
To activate the “archive mode” in your DAS, set the parameter discard-after-timeout to false in your storage method. For example:
--data-availability.s3-storage.discard-after-timeout=false
--data-availability.local-db-storage.discard-after-timeout=false
Note that local-file-storage and ipfs-storage don't discard data after expiring, so the option discard-after-timeout is not available.
Archive servers should make use of the --data-availability.rest-aggregator.sync-to-storage options described above to pull in any data that they don't have.
Testing the DAS
Once the DAS is running, we can test if everything is working correctly using the following methods.
Test 1: REST health check
The REST interface enabled in the mirror DAS has a health check on the path /health, which returns 200 if the underlying storage is working and 503 otherwise.
Example:
curl -I <YOUR REST ENDPOINT>/health
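If you want to run the same check from a script or a monitoring job, a Python sketch along these lines may be a starting point. The function names and returned strings are made up for this example; only the /health path and the 200 and 503 status codes come from this guide.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def describe_health(status_code: int) -> str:
    """Map the health check status code to a short description."""
    if status_code == 200:
        return "healthy"
    if status_code == 503:
        return "storage unavailable"
    return f"unexpected status {status_code}"


def check_health(rest_endpoint: str, timeout: float = 5.0) -> str:
    """Query <rest_endpoint>/health and describe the result."""
    url = rest_endpoint.rstrip("/") + "/health"
    try:
        with urlopen(url, timeout=timeout) as response:
            return describe_health(response.status)
    except HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses such as 503.
        return describe_health(error.code)
```

Connection failures (for example, a closed port) are not caught here and surface as `URLError`, which is usually what you want an alerting job to see.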
Security considerations
Keep in mind the following information when running the DAS.
For a mirror DAS, using a load balancer is recommended to manage incoming traffic effectively. Additionally, as the REST interface is cacheable, consider deploying a Content Delivery Network (CDN) or caching proxy in front of your REST endpoint. The URL for the REST interface will be publicly known; ensure that it is sufficiently distinct from the RPC endpoint to prevent the latter from being easily discovered.
What to do next?
Once the DAS is deployed and tested, you'll have to communicate the following information to the chain owner, so they can update the chain parameters and configure the sequencer:
- URL of the REST endpoint
Optional parameters
Besides the parameters described in this guide, there are some more options that can be useful when running the DAS. For a comprehensive list of configuration parameters, you can run daserver --help.
Parameter | Description |
---|---|
--conf.dump | Prints out the current configuration |
--conf.file | Absolute path to the configuration file inside the volume to use instead of specifying all parameters in the command |
Metrics
The DAS comes with the option of producing Prometheus metrics. This option can be activated by using the following parameters:
Parameter | Description |
---|---|
--metrics | Enables the metrics server |
--metrics-server.addr | Metrics server address (default "127.0.0.1") |
--metrics-server.port | Metrics server port (default 6070) |
--metrics-server.update-interval | Metrics server update interval (default 3s) |
When metrics are enabled, several useful metrics are available at the configured port, at path debug/metrics or debug/metrics/prometheus.
RPC metrics
Metric | Description |
---|---|
arb_das_rpc_store_requests | Count of RPC Store calls |
arb_das_rpc_store_success | Successful RPC Store calls |
arb_das_rpc_store_failure | Failed RPC Store calls |
arb_das_rpc_store_bytes | Bytes stored with RPC Store calls |
arb_das_rpc_store_duration (p50, p75, p95, p99, p999, p9999) | Duration of RPC Store calls (ns) |
REST metrics
Metric | Description |
---|---|
arb_das_rest_getbyhash_requests | Count of REST GetByHash calls |
arb_das_rest_getbyhash_success | Successful REST GetByHash calls |
arb_das_rest_getbyhash_failure | Failed REST GetByHash calls |
arb_das_rest_getbyhash_bytes | Bytes retrieved with REST GetByHash calls |
arb_das_rest_getbyhash_duration (p50, p75, p95, p99, p999, p9999) | Duration of REST GetByHash calls (ns) |
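As one way to put these counters to use, the Python sketch below parses Prometheus text-format output and computes the share of failed REST GetByHash calls. The helper names are hypothetical and the parser is deliberately simple: it assumes one sample per line with no spaces inside label values, and keeps any label suffix as part of the metric key.

```python
from typing import Dict, Optional


def parse_prometheus_text(text: str) -> Dict[str, float]:
    """Parse 'name value' sample lines, skipping comments and blank lines."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, rest = line.partition(" ")
        try:
            # The value is the first token after the name; a timestamp may follow.
            samples[name] = float(rest.split()[0])
        except (IndexError, ValueError):
            continue
    return samples


def failure_ratio(
    samples: Dict[str, float], prefix: str = "arb_das_rest_getbyhash"
) -> Optional[float]:
    """Return failures divided by requests, or None if there were no requests."""
    requests = samples.get(f"{prefix}_requests", 0.0)
    failures = samples.get(f"{prefix}_failure", 0.0)
    return failures / requests if requests else None
```

The input text would come from the debug/metrics/prometheus path on the metrics server; passing `prefix="arb_das_rpc_store"` gives the same ratio for RPC Store calls.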