Generally, data pools (or just pools) can be described as discrete entities arranged around specific data sources. Anyone can create a pool through governance, and a pool can store any data stream. Pools are stored and operated on-chain, which makes them completely trustless. They are responsible for validating and archiving data: participants (protocol validator runners) join a pool, and the pool manages the validation process on-chain, which makes the validity of the data trustless as well. Data pools which are currently live can be found here.
A pool always has to specify the following requirements:
- One or more data sources which the pool wants to validate and archive
- A runtime which has defined how to validate the data
- A web3 storage provider where validated data is stored (for example Arweave)
If those requirements are met, protocol validators can join a pool and start validating the data.
Depending on the network, a different set of data pools is currently live and validating/archiving data. To view all pools, simply visit the web app.
Pools can also directly be queried by the following REST API endpoint:
Note: Additionally, a pool can directly be queried by its unique ID:
Keeping pools funded and therefore keeping the data flowing while at the same time keeping validators' stakes secured and incentivized is a challenge. KYVE designed pools to fulfill all those needs.
Keeping Pools Funded
To pay out protocol validators and keep them incentivized, the pool needs funds. These funds are provided by funders, who are interested in archiving the data the specific pool handles. A funder could be the project or the foundation behind a data source that wants its data to be permanently archived onto Arweave. Besides those interested in making the data permanent, anyone can become a funder. The only downside is that there are currently no rewards for becoming a funder. On the contrary, being a funder will cost you $KYVE.
Funding slots are limited, so only those who fund the highest amounts can hold one. Currently, there are 50 funding slots available per pool. While slots are still free, you only need to fund more than 0 $KYVE to claim one. Once all slots are occupied, you have to fund more than the current lowest funder, effectively outbidding them. When you outbid the current lowest funder, you take over their funding slot, and their remaining funds are automatically transferred back to them. This mechanism ensures that only those with the highest interest in archiving the data can operate as funders.
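The slot mechanism described above can be sketched in a few lines of Python. This is only an illustration, not the on-chain implementation: the `Pool` class, the `fund_pool` function and the `refunds` bookkeeping are hypothetical names, and letting an existing funder top up their amount is an assumption that the text does not confirm.

```python
from __future__ import annotations

from dataclasses import dataclass, field

MAX_FUNDERS = 50  # funding slots per pool, as stated in the text


@dataclass
class Pool:
    # Maps funder address -> remaining funding amount in ukyve.
    funders: dict[str, int] = field(default_factory=dict)
    # Amounts returned to outbid funders (stand-in for an on-chain transfer).
    refunds: dict[str, int] = field(default_factory=dict)


def fund_pool(pool: Pool, address: str, amount: int) -> bool:
    """Try to fund the pool; return True if the address holds a slot afterwards."""
    if amount <= 0:
        return False  # a slot requires more than 0 $KYVE

    # Assumption: an existing funder simply tops up their amount.
    if address in pool.funders:
        pool.funders[address] += amount
        return True

    # A free slot can be claimed with any positive amount.
    if len(pool.funders) < MAX_FUNDERS:
        pool.funders[address] = amount
        return True

    # All slots are taken: the newcomer must outbid the current lowest funder.
    lowest = min(pool.funders, key=pool.funders.get)
    if amount <= pool.funders[lowest]:
        return False

    # Return the outbid funder's remaining funds and hand over the slot.
    pool.refunds[lowest] = pool.refunds.get(lowest, 0) + pool.funders.pop(lowest)
    pool.funders[address] = amount
    return True
```

With all 50 slots occupied, a call such as `fund_pool(pool, "new_funder", amount)` only succeeds if `amount` is strictly greater than the smallest entry in `pool.funders`.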
Basic $KYVE Flow
With the funds provided by a funder the flow of $KYVE can be summarized by the diagram below:
Keeping Protocol Nodes Incentivized
Protocol validators have many tasks: they collect data, bundle it, upload it, and submit it. To reward these nodes for their work and keep them incentivized, they receive bundle rewards when they successfully propose a valid bundle. As described above, those rewards are paid for by funders. Before the uploader receives the reward, a network fee (usually 1%) is deducted and automatically transferred to the community pool. You can find more information on the calculation of the uploader reward here.
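As a rough illustration of the fee deduction, the following sketch splits a bundle reward into the community pool share and the remainder. The function name `split_bundle_reward` is hypothetical, and rounding the fee down to whole ukyve is an assumption.

```python
from __future__ import annotations


def split_bundle_reward(bundle_reward: int, network_fee_percent: int = 1) -> tuple[int, int]:
    """Return (community_pool_share, remaining_reward), both in ukyve."""
    # Assumption: the fee is rounded down to a whole number of ukyve.
    fee = bundle_reward * network_fee_percent // 100
    return fee, bundle_reward - fee


fee, remaining = split_bundle_reward(1_000_000)
print(fee, remaining)  # 10000 990000
```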
Keeping Delegators Incentivized
Delegators lend $KYVE to protocol validators to help secure the network and to help the validators earn more rewards. Delegators have to trust the protocol validators they delegate to, because a slash is also applied to delegators, proportionally to their delegation. In return for putting their capital at risk, delegators receive delegation rewards, which are also funded by funders. These rewards are a fraction of the entire bundle reward, depending on the node's commission. You can find more information about the commission here and more details about the delegation distribution here.
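To make the relationship between commission, delegation rewards and slashing more concrete, here is a minimal sketch. It is not the protocol's actual distribution logic: it assumes that the node operator keeps the `commission` fraction of the reward, that the rest is shared among delegators in proportion to their delegation, and that rounding remainders go to the operator. All function names are hypothetical.

```python
from __future__ import annotations

from fractions import Fraction


def distribute_reward(
    reward: int, commission: Fraction, delegations: dict[str, int]
) -> tuple[int, dict[str, int]]:
    """Return (operator_share, payout per delegator), all in ukyve."""
    total = sum(delegations.values())
    if total == 0:
        return reward, {}
    # Assumption: the operator keeps `commission` of the reward.
    delegator_pot = reward - int(reward * commission)
    payouts = {
        address: delegator_pot * amount // total
        for address, amount in delegations.items()
    }
    # Assumption: rounding remainders stay with the operator.
    return reward - sum(payouts.values()), payouts


def slash_delegators(delegations: dict[str, int], slash_fraction: Fraction) -> dict[str, int]:
    """Reduce every delegation by the same fraction, i.e. proportionally."""
    return {
        address: amount - int(amount * slash_fraction)
        for address, amount in delegations.items()
    }


operator, payouts = distribute_reward(990_000, Fraction(1, 10), {"a": 300, "b": 100})
print(operator, payouts)  # 99000 {'a': 668250, 'b': 222750}
```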
To make data pools as general as possible, many parameters were introduced to fit the various requirements of data streams. For each pool the following state is stored:
The unique identifier of each pool. This cannot be changed and is assigned automatically on creation.
A human readable name for the pool. Also used when searching for a pool.
The name of the runtime. For EVM this would be @kyvejs/evm, for example. It is used in the protocol validator to double-check that the node actually supports this runtime and can take part in the upload/validation process.
A link to an image file. Usually an SVG stored on Arweave.
Runtime-specific configuration in JSON format. Usually this holds the data sources and any other pool-specific configuration the runtime needs. More information on how to configure this parameter can be found in the dedicated runtime documentation.
The key the data pool should start validating from. For blockchains the starting key would be 0 because this would be the genesis block. For time-based data streams this would be the starting date. The format of the start key depends on the runtime.
The key the data pool has validated to. If a data pool has for example validated the first 1,000 blocks of a
blockchain the current key would be
The summary of the latest valid bundle which got validated. The summary of a bundle gets generated in the runtime and is used to access bundle data on-chain.
Since the keys are of type string, the data pool internally keeps track of its progress with an index. This index is just a counter; in the case of a blockchain, it corresponds to the number of blocks validated.
A counter which keeps track of how many valid bundles the pool has produced. Used for metrics.
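The two counters described above can be pictured as follows. This is a simplified sketch with hypothetical names (`Progress`, `current_index`, `total_bundles`, `record_valid_bundle`), assuming both counters are only updated when a bundle is accepted as valid.

```python
from dataclasses import dataclass


@dataclass
class Progress:
    current_index: int = 0  # number of data items (e.g. blocks) validated so far
    total_bundles: int = 0  # number of valid bundles produced, used for metrics

    def record_valid_bundle(self, item_count: int) -> None:
        """Advance both counters after a bundle has been accepted as valid."""
        self.current_index += item_count
        self.total_bundles += 1


progress = Progress()
progress.record_valid_bundle(100)
progress.record_valid_bundle(150)
print(progress)  # Progress(current_index=250, total_bundles=2)
```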
How long a bundle proposal round should be at least open for voting. Usually between one and five minutes. The unit is seconds.
The base reward for node operators who successfully proposed a valid bundle. This should cover all fixed costs a node operator has, such as server costs and transaction fees, so that they do not operate at a loss. The unit is ukyve.
The minimum delegation a data pool should have before it starts validating bundles. Used for security reasons, for example to prevent a single node operator with a delegation of only 1 $KYVE from proposing a bundle. The unit is ukyve.
The maximum amount of data items a bundle can have, otherwise it is automatically flagged as invalid. Prevents uploaders from submitting huge bundles and therefore destabilizing the bundle validation flow.
A boolean which indicates whether the pool has been temporarily disabled by governance. If a pool is disabled, it cannot validate bundles and is effectively paused. Only governance can then enable the pool again.
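Taken together, the last three parameters act as simple guards. The sketch below shows one way to express them; the function and parameter names are hypothetical and the real checks are performed on-chain.

```python
def pool_can_validate(disabled: bool, total_delegation: int, min_delegation: int) -> bool:
    """A pool validates bundles only if it is enabled and sufficiently delegated."""
    return not disabled and total_delegation >= min_delegation


def bundle_size_ok(item_count: int, max_bundle_size: int) -> bool:
    """Bundles with more data items than the maximum are flagged as invalid."""
    return item_count <= max_bundle_size


print(pool_can_validate(False, 5_000_000, 1_000_000))  # True
print(pool_can_validate(True, 5_000_000, 1_000_000))   # False
print(bundle_size_ok(101, 100))                        # False
```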
An array of entries which keeps track of the users that funded a pool with $KYVE. For each funder, the address and the current funding amount are stored.
The address of the funder.
The amount the funder has still left in the pool in ukyve.
The total amount of funds in ukyve the pool currently still has left.
An object which holds all info about the current runtime version and the available binaries for participating as a validator in this pool.
The version of the runtime. For security reasons, protocol validators compare their runtime version with the pool's version to ensure correct behaviour.
An object in JSON format containing download URLs to the protocol validator binaries. Used by KYSOR if auto download is enabled.
An object which holds all the info when a pool has a scheduled runtime upgrade.
Version is the new runtime version tag of the upgrade.
Binaries is the new object in JSON format containing the download links to the new upgrade binaries.
A UNIX timestamp of when the upgrade should be applied. If the scheduled time is in the past, the upgrade is applied immediately. Otherwise, it waits until that time is reached.
Duration is the time in seconds for which the pool should halt while the upgrade is being applied. During this time no bundles can be validated. This gives every protocol validator time to properly upgrade their binaries before the pool continues with the newer version. Usually about one day.
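The timing rules of a scheduled upgrade can be sketched as below. The names `upgrade_window` and `is_halted` are hypothetical, and the sketch assumes that the halt starts at the moment the upgrade is applied, which the text does not state explicitly.

```python
from __future__ import annotations


def upgrade_window(scheduled_at: int, duration: int, submitted_at: int) -> tuple[int, int]:
    """Return (halt_start, halt_end) as UNIX timestamps in seconds."""
    # A scheduled time in the past means the upgrade is applied immediately.
    start = max(scheduled_at, submitted_at)
    # Assumption: the pool halts for `duration` seconds from the moment of application.
    return start, start + duration


def is_halted(now: int, window: tuple[int, int]) -> bool:
    """While the pool is halted, no bundles can be validated."""
    start, end = window
    return start <= now < end


one_day = 24 * 60 * 60
print(upgrade_window(scheduled_at=1_000, duration=one_day, submitted_at=2_000))  # (2000, 88400)
```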
The ID of the storage provider which should be used. Here 1 equals Arweave and 2 equals Bundlr. If it is zero, no storage provider is used and data just gets validated and not archived.
The ID of the compression which should be used before storing data on the storage provider. If it is 1, Gzip compression is used. If it is zero, no compression is applied.
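The two IDs can be read as simple lookups. The following sketch uses Python's standard `gzip` module to illustrate the compression step; the `STORAGE_PROVIDERS` table and the `compress` function are hypothetical names, and the actual compression is performed by the protocol validator.

```python
import gzip

# Storage provider IDs as listed in the text; 0 means "validate only, do not archive".
STORAGE_PROVIDERS = {0: None, 1: "Arweave", 2: "Bundlr"}


def compress(data: bytes, compression_id: int) -> bytes:
    """Apply the compression selected by the pool before storing a bundle."""
    if compression_id == 0:
        return data  # no compression
    if compression_id == 1:
        return gzip.compress(data)
    raise ValueError(f"unknown compression id: {compression_id}")


bundle = b'[{"key": "0", "value": "genesis"}]'
stored = compress(bundle, 1)
print(gzip.decompress(stored) == bundle)  # True
```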
Below is the query result from a pool. The actual pool state can be found under 'data'. Additionally, the bundle_proposal and all protocol validators who have joined the pool are attached, along with some other information calculated on the fly.