# Methodology

We first provide a high-level overview of how we compute our smart floor price, then dive into more detail in the subsequent sections:

1. Ingest & parse on-chain and off-chain NFT transactions from the top N marketplaces
2. Filter out obscenely low-priced sales (a proxy for wash/accidental trades)
3. Compute bottom quantile statistics over historical windows of transactions to establish lower, mid, and upper bounds (per collection)

## (1) Data Pipeline

For **Solana**, we fetch, filter and parse transaction data from a cluster of on-chain RPC nodes for 9+ top NFT marketplaces.

We tag each transaction with its corresponding type (e.g., listing, de-listing, sale) and ingest it in a normalized format for later querying.
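As a rough illustration of what a normalized, type-tagged record could look like, here is a minimal sketch. The schema, the field names (`collection`, `token`, `price`, `timestamp`), and the `normalize` helper are all hypothetical and chosen for illustration only; they are not the actual pipeline format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TxType(Enum):
    # The three transaction types named in the text.
    LISTING = "listing"
    DELISTING = "de-listing"
    SALE = "sale"


@dataclass(frozen=True)
class NormalizedTx:
    # Hypothetical shared schema; every marketplace is mapped onto it.
    marketplace: str
    collection: str
    token: str
    tx_type: TxType
    price: Optional[float]  # None when the event carries no price
    timestamp: int          # unix seconds


def normalize(raw: dict, marketplace: str) -> NormalizedTx:
    """Map one parsed raw record (hypothetical keys) onto the shared schema."""
    tx_type = TxType(raw["type"])  # raises ValueError on an unknown type
    price = float(raw["price"]) if raw.get("price") is not None else None
    return NormalizedTx(
        marketplace=marketplace,
        collection=raw["collection"],
        token=raw["token"],
        tx_type=tx_type,
        price=price,
        timestamp=int(raw["timestamp"]),
    )
```

Keeping the transaction type as an enum rather than a free-form string means an unexpected type fails loudly at ingest time instead of silently polluting later queries.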

We also fetch the on-chain NFT mint accounts and their metadata (e.g., traits) for rarity scoring.

For **Ethereum**, we fetch, filter and parse event data from OpenSea's API. We fetch traits data from both Alchemy's and OpenSea's APIs for rarity scoring.

## (2) Filtering

Without filtering out extremely low-priced sales, we expose ourselves to wash trades, which are cheap to manufacture and therefore make the floor cheap to manipulate. For example, it is not uncommon for the typical "floor" price to dip to obscenely low values, and we can easily observe this in the floor plot for a collection like **DeGods**:

Observe that the median is relatively stable: in fact, it is the most robust quantile statistic (it has the highest breakdown point, 50%). We can use it as a reference point: that is, we filter out all sales priced below 0.2x the rolling median. We observe that 0.2x the median leaves sufficient slack, based on historical data from the "blue-chip" collections we track across both Solana and Ethereum.
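The filter rule can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: the function name `filter_low_sales`, the count-based window of 50 sales, and the choice to include the current sale in its own reference median are assumptions made for the example. Only the 0.2x ratio comes from the text above.

```python
from statistics import median


def filter_low_sales(prices, window=50, ratio=0.2):
    """Drop sales priced below `ratio` times the rolling median.

    `prices` must be in chronological order. The reference median for each
    sale is taken over the last `window` raw sales, current one included
    (an assumption made for this sketch).
    """
    kept = []
    for i, price in enumerate(prices):
        start = max(0, i - window + 1)
        reference = median(prices[start:i + 1])
        if price >= ratio * reference:
            kept.append(price)
    return kept


# Two suspiciously cheap sales (1 and 15) hidden among ~100-priced sales.
sales = [100, 102, 98, 1, 101, 99, 15, 103]
print(filter_low_sales(sales, window=5))  # [100, 102, 98, 101, 99, 103]
```

Because the reference is a median over raw sales, a handful of cheap wash trades inside the window barely moves the threshold, which is what makes the filter hard to game.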

## (3) Quantile statistics

After filtering out low-priced sales, we can compute quantile statistics on the remaining sales transactions.

As we described with canonical floor prices, taking the minimum of a sample of transaction prices is usually a bad idea. The minimum breaks down with a single bad observation (a breakdown point of 1/n), which means that all it takes is one incorrect sale (read: one bad actor) to influence the statistic. To introduce some robustness or buffer room, we compute higher-order quantiles (e.g., the 5th percentile) to establish the smart floor price.
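To make the difference concrete, the following sketch compares the minimum with the 5th percentile before and after a single bad sale is injected. The nearest-rank `percentile` helper and the synthetic prices are assumptions for illustration; the actual quantile definition used in production is not specified here.

```python
import math


def percentile(values, q):
    """Nearest-rank q-th percentile, for 0 < q <= 100."""
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# 40 synthetic sales priced 100..139, then the same sample plus one bad sale.
clean = [100 + i for i in range(40)]
attacked = clean + [0.01]

print(min(clean), percentile(clean, 5))        # 100 101
print(min(attacked), percentile(attacked, 5))  # 0.01 101
```

A single manipulated sale drags the minimum down to 0.01, while the 5th percentile does not move. The percentile only starts to shift once bad sales make up roughly 5% of the sample.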

In addition to using higher-order quantiles, we also look at a **rolling window of historical data**. This gives us a larger sample size, which produces a higher-confidence statistic and ensures that blips in sales data do not have a disproportionate impact.
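A rolling-window quantile with a minimum sample size could look like the sketch below. The function name `rolling_floor`, the time-based window, the nearest-rank quantile, and the default values `q=5` and `min_samples=10` are placeholders picked for the example, not the actual tuned parameters.

```python
import math


def rolling_floor(sales, now, window_seconds, q=5, min_samples=10):
    """Nearest-rank q-th percentile of sale prices in the trailing window.

    `sales` is an iterable of (unix_timestamp, price) pairs. Returns None
    when the window holds fewer than `min_samples` sales, so that a thin
    window never produces a low-confidence estimate.
    """
    recent = sorted(p for t, p in sales if now - window_seconds < t <= now)
    if len(recent) < min_samples:
        return None
    rank = math.ceil(q * len(recent) / 100)
    return recent[max(rank, 1) - 1]


# Synthetic sales: one per second, price rising from 101 to 140.
sales = [(t, 100 + t) for t in range(1, 41)]

print(rolling_floor(sales, now=40, window_seconds=20))  # 121
print(rolling_floor(sales, now=40, window_seconds=5))   # None
```

Returning `None` for an under-sampled window pushes the decision to the caller, which can then fall back to a longer window or to the previous estimate.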

Choosing the correct quantiles, window sizes, and minimum sample size is **our secret sauce**. We've backtested our hyperparameter choices on all historical data to ensure that they hold up robustly throughout history and, we hope, into the future.

The final result is a much smoother and more robust time series for a "fair price":
