Our Performance

Historical errors on the top collections on both Solana and Ethereum

We report our test set error rates (i.e., error on the most recent 25% of data, which we neither fit to nor tune hyperparameters on) to provide the best estimate of how accurate our price estimates will be going forward.
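As a rough sketch of what that chronological split looks like (the function and field names below are illustrative, not our actual pipeline):

```python
# Illustrative sketch of a chronological train/test split: the most recent
# 25% of sales is held out and never used for fitting or hyperparameter tuning.
def chronological_split(sales, test_fraction=0.25):
    """Split sales into (train, test), test being the most recent slice."""
    ordered = sorted(sales, key=lambda s: s["timestamp"])  # assumed field name
    cutoff = int(len(ordered) * (1 - test_fraction))
    return ordered[:cutoff], ordered[cutoff:]
```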

We encourage the community to hold us to our error rates and to benchmark us with future price estimates we generate.

ARE vs MRE

We report both ARE (average relative error) and MRE (median relative error) so that we can be compared to other price estimate APIs, but we argue that ARE is the less gameable metric of the two.

Both metrics measure how well we do across many historical price estimates. We compare each historical sale price with the price estimate we could have issued immediately before it, then take the absolute relative error across all these sales.

The lower the ARE/MRE, the better the price estimates.

For example, if a sales transaction for DeGods #7388 occurred at 12:30PM for 10 SOL, and our most recent price estimate was at 12:15PM for 9.8 SOL, then our relative error would be 2%.
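A minimal version of that computation (hypothetical function name, prices in SOL):

```python
def relative_error(sale_price, estimate):
    """Absolute relative error of an estimate against the realized sale price."""
    return abs(estimate - sale_price) / sale_price

# DeGods #7388: sold for 10 SOL at 12:30PM; our 12:15PM estimate was 9.8 SOL.
relative_error(10.0, 9.8)  # -> 0.02, i.e. 2%
```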

ARE takes the mean/average across all relative errors; MRE takes the median instead. MRE is gameable because you only need to do well on 50% of all data points and can have arbitrarily large errors on the other 50%. Most collections have >50% of transactions occurring near the floor, so you could achieve quite a low MRE by just emitting the floor price every time. ARE will punish you severely, since your errors will be very large on above-floor sales.

MRE also always tends to skew lower than ARE (errors have a heavy right tail, since it's difficult to predict above-floor NFTs well), so it is a vanity metric at best; the toy example below makes this concrete.
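A toy demonstration of the gaming and of the ARE/MRE gap (the sale prices below are invented, not real collection data):

```python
from statistics import mean, median

# Toy collection: 6 of 10 sales at the 1 SOL floor, 4 well above it.
sale_prices = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 3.0, 5.0, 8.0, 20.0]

# A "model" that games MRE by emitting the floor price for every estimate.
floor_estimates = [1.0] * len(sale_prices)

errors = [abs(est - price) / price
          for est, price in zip(floor_estimates, sale_prices)]

print(f"MRE: {median(errors):.0%}")  # 0%  -- looks great, but is gamed
print(f"ARE: {mean(errors):.0%}")    # 33% -- the above-floor misses are punished
```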

Relative error is not a perfect metric either (it is not normalized against the variance of sale prices within a collection), but alas, it is the most intuitive for us mortals.

Solana NFT Statistics

Sorted in ascending Average Relative Error (ARE) over our test set.

Ethereum NFT Statistics

Sorted in ascending Average Relative Error (ARE) over our test set.
