TL;DR¶

EIP 1559 is a proposed improvement for the transaction fee market. It sets a variable "base" gasprice to be paid by the user and burned by the protocol, in addition to a "tip" paid by the user to the block producer.
The base price ("basefee") adjusts upwards when demand is high, and downwards otherwise.
We observe in this notebook that in a stationary environmnent, basefee converges to a value that prices out enough users to achieve the target block size.

We introduce here the building blocks of agent-based simulations of EIP1559. This follows an earlier notebook that merely looked at the dynamics of the EIP 1559 mechanism. In the present notebook, agents decide on transactions based on the current basefee and form their transactions based on internal evaluations of their values and costs.

Huberman et al., 2019 introduced such a model and framework for the Bitcoin payment system. We adapt it here to study the dynamics of the basefee.

All the code is available in this repo, with some preliminary documentation here. You can also download the abm1559 package from PyPi and reproduce all the analysis here yourself!

The broad lines¶

We have several entities. Users come in randomly (following a Poisson process) and create and send transactions. The transactions are received by a transaction pool, from which the $x$ best valid transactions are included in a block created at fixed intervals. $x$ depends on how many valid transactions exist in the pool (e.g., how many post a gasprice exceeding the prevailing basefee in 1559 paradigm) and the block gas limit. Once transactions are included in the block, and the block is included in the chain, transactions are removed from the transaction pool.

How do users set their parameters? Users have their own internal ways of evaluating their costs. Users obtain a certain value from having their transaction included, which we call $v$. $v$ is different for every user. This value is fixed but their overall payoff decreases the longer they wait to be included. Some users have higher time preferences than others, and their payoff decreases faster than others the longer they wait. Put together, we have the following:

$$ \texttt{payoff} = \texttt{value} - \texttt{cost from waiting} - \texttt{transaction fee} $$

Users expect to wait for a certain amount of time. In this essay, we set this to a fixed value -- somewhat arbitrarily we choose 5. This can be readily understood in the following way. Users estimate what their payoff will be from getting included 5 blocks from now, assuming basefee remains constant. If this payoff is negative, they decide not to send the transaction to the pool (in queuing terminology, they balk). We'll play with this assumption later.

The scenario is set up this way to study stationarity: assuming some demand comes in from a fixed distribution at regular intervals, we must expect basefee to reach some stationary value and stay there. It is then reasonable for users, at this stationary point, to consider that 5 blocks from now basefee will still be at the same level. In the nonstationary case, when for instance a systemic change in the demand happens (e.g., the rate of Poisson arrivals increases), a user may want to hedge their bets by estimating their future payoffs in a different way, taking into account that basefee might increase instead. This strategy would probably be a good idea during the transition phase, when basefee shifts from one stationary point to a new one.

We make the assumption here that users choose their 1559 parameters based on their value alone. We set the transaction max_fee parameter to the value of the user and set the gas_premium parameter to a residual value -- 1 Gwei per unit of gas.

There is no loss of generality in assuming all users send the same transaction in (e.g., a simple transfer) and so all transactions have the same gas_used value (21,000). In 1559 paradigm, with a 20M gas limit per block, this allows at most 952 transactions to be included, although the mechanism will target half of that, around 475 here. The protocol adjusts the basefee to apply economic pressure, towards a target gas usage of 10M per block.

Simulation¶

We import a few classes from our abm1559 package.

In [1]:

%config InlineBackend.figure_format = 'svg'

import os, sys
sys.path.insert(1, os.path.realpath(os.path.pardir))
# You may remove the two lines above if you have installed abm1559 from pypi

from abm1559.utils import constants

from abm1559.txpool import TxPool

from abm1559.users import User1559

from abm1559.userpool import UserPool

from abm1559.chain import (
    Chain,
    Block1559,
)

from abm1559.simulator import (
    spawn_poisson_demand,
    update_basefee,
)

import pandas as pd

And define the main function used to simulate the fee market.

In [2]:

def simulate(demand_scenario, UserClass):
    # Instantiate a couple of things
    txpool = TxPool()
    basefee = constants["INITIAL_BASEFEE"]
    chain = Chain()
    metrics = []
    user_pool = UserPool()

    for t in range(len(demand_scenario)):
        if t % 100 == 0: print(t)

        # `env` is the "environment" of the simulation
        env = {
            "basefee": basefee,
            "current_block": t,
        }

        # We return a demand drawn from a Poisson distribution.
        # The parameter is given by `demand_scenario[t]`, and can vary
        # over time.
        users = spawn_poisson_demand(t, demand_scenario[t], UserClass)

        # We query each new user with the current basefee value
        # Users either return a transaction or None if they prefer to balk
        decided_txs = user_pool.decide_transactions(users, env)

        # New transactions are added to the transaction pool
        txpool.add_txs(decided_txs)

        # The best valid transactions are taken out of the pool for inclusion
        selected_txs = txpool.select_transactions(env)
        txpool.remove_txs([tx.tx_hash for tx in selected_txs])

        # We create a block with these transactions
        block = Block1559(txs = selected_txs, parent_hash = chain.current_head, height = t, basefee = basefee)

        # The block is added to the chain
        chain.add_block(block)

        # A couple of metrics we will use to monitor the simulation
        row_metrics = {
            "block": t,
            "basefee": basefee / (10 ** 9),
            "users": len(users),
            "decided_txs": len(decided_txs),
            "included_txs": len(selected_txs),
            "blk_avg_gas_price": block.average_gas_price(),
            "blk_avg_tip": block.average_tip(),
            "pool_length": txpool.pool_length(),
        }
        metrics.append(row_metrics)

        # Finally, basefee is updated and a new round starts
        basefee = update_basefee(block, basefee)

    return (pd.DataFrame(metrics), user_pool, chain)

As you can see, simulate takes in a demand_scenario array. Earlier we mentioned that each round, we draw the number of users wishing to send transactions from a Poisson distribution. This distribution is parameterised by the expected number of arrivals, called lambda $\lambda$. The demand_scenario array contains a sequence of such lambda's. We also provide in UserClass the type of user we would like to model (see the docs for more details).

Our users draw their value for the transaction (per unit of gas) from a uniform distribution, picking a random number between 0 and 20 (Gwei). Their cost for waiting one extra unit of time is drawn from a uniform distribution too, this time between 0 and 1 (Gwei). The closer their cost is to 1, the more impatient users are.

Say for instance that I value each unit of gas at 15 Gwei, and my cost per round is 0.5 Gwei. If I wait for 6 blocks to be included at a gas price of 10 Gwei, my payoff is $15 - 6 \times 0.5 - 10 = 2$.

The numbers above sound arbitrary, and in a sense they are! They were chosen to respect the scales we are used to (although gas prices are closer to 100 Gweis these days...). It also turns out that any distribution (uniform, Pareto, whatever floats your boat) leads to stationarity. The important part is that some users have positive value for transacting in the first place, enough to fill a block to its target size at least. The choice of sample the cost from a uniform distribution, as opposed to having all users experience the same cost per round, allows for simulating a scenario where some users are more in a hurry than others.

In [3]:

demand_scenario = [2000 for i in range(200)]
(df, user_pool, chain) = simulate(demand_scenario, User1559)

0
100

To study the stationary case, we create an array repeating $\lambda$ for as many blocks as we wish to simulate the market for. We set $\lambda$ to spawn on average 2000 users between two blocks.

Results¶

Let's print the head and tail of the data frame holding our metrics. Each row corresponds to one round of our simulation, so one block.

In [4]:

df

Out[4]:

	block	basefee	users	decided_txs	included_txs	blk_avg_gas_price	blk_avg_tip	pool_length
0	0	1.000000	2031	1562	952	2.000000	1.0	610
1	1	1.124900	1983	1525	952	2.124900	1.0	1183
2	2	1.265400	1912	1453	952	2.265400	1.0	1684
3	3	1.423448	1997	1493	952	2.423448	1.0	2225
4	4	1.601237	2001	1459	952	2.601237	1.0	2732
...	...	...	...	...	...	...	...	...
195	195	11.795094	2001	508	508	12.795094	1.0	1924
196	196	11.893583	2040	459	459	12.893583	1.0	1924
197	197	11.839914	1979	462	462	12.839914	1.0	1924
198	198	11.795810	1983	487	487	12.795810	1.0	1924
199	199	11.829281	1960	444	444	12.829281	1.0	1924

200 rows × 8 columns

At the start of the simulation we clearly see in column users a demand close to 2000 users per round. Among these 2000 or so, around 1500 decide to send their transaction in (decided_txs). The 500 who don't might have a low value or high per-round costs, meaning it is unprofitable for them to even send their transaction in. Eventually 952 of them are included (included_txs), maxing out the block gas limit. The basefee starts at 1 Gwei but steadily increases from there, reaching around 11.8 Gwei by the end.

By the end of the simulation, we note that decided_txs is always equal to included_txs. By this point, the basefee has risen enough to make it unprofitable for most users to send their transactions. This is exactly what we want! Users balk at the current prices.

In the next chart we show the evolution of basefee and tips. We define tip as the gas price minus the basefee, which is what miners receive from the transaction.

Note that tip is in general not equal to the gas premium that users set. This is particularly true when basefee plus gas premium exceeds the max fee of the user. In the graph below, the tip hovers around 1 Gwei (the premium), but is sometimes less than 1 too, especially when users see the prevailing basefee approach their posted max fees.

In [5]:

df.plot("block", ["basefee", "blk_avg_tip"])

Out[5]:

<AxesSubplot:xlabel='block'>

Notice the increase at the beginning followed by a short drop? At the very beginning, the pool fills up quickly with many users hopeful to get their transactions in with a positive resulting payoff. The basefee increases until users start balking and the pool is exhausted. Once exhausted, basefee starts decreasing again to settle at the stationary point where the pool only includes transactions that are invalid given the stationary basefee.

We can see the pool length becoming stationary in the next plot, showing the length of the pool over time.

In [6]:

df.plot("block", "pool_length")

Out[6]:

<AxesSubplot:xlabel='block'>

The remaining transactions are likely from early users who did not balk even though basefee was increasing, and who were quickly outbid by others.

Demand shock¶

We look at a stationary setting, where the new demand coming in each new round follows a fixed expected rate of arrival. Demand shocks may be of two kinds:

Same number of users, different values for transactions and costs for waiting.
Increased number of users, same values and costs.

We'll consider the second scenario here, simply running the simulation again and increasing the $\lambda$ parameter of our Poisson arrival process suddenly, from expecting 2000, to expecting 6000 users per round.

In [7]:

demand_scenario = [2000 for i in range(100)] + [6000 for i in range(100)]
(df_jump, user_pool_jump, chain_jump) = simulate(demand_scenario, User1559)

0
100

The next plot shows the number of new users each round. We note at block 100 a sudden jump from around 2000 new users to 6000.

In [8]:

df_jump.plot("block", "users")

Out[8]:

<AxesSubplot:xlabel='block'>

In [9]:

df_jump.plot("block", ["basefee", "blk_avg_tip"])

Out[9]:

<AxesSubplot:xlabel='block'>

We see a jump around block 100, when the arrival rate of users switches from 2000 to 6000. The basefee increases in response. With a block limit of 20M gas, about 950 transactions fit into each block. Targeting half of this value, the basefee increases until more or less 475 transactions are included in each block.

Since our users' values and costs are always drawn from the same distribution, when 2000 users show up, we expect to let in about 25% of them (~ 475 / 2000), the 25% with greatest expected payoff. When 6000 users come in, we now only expect the "richest" 8% (~ 475 / 6000) to get in, so we "raise the bar" for the basefee, since we need to discriminate more.

In [10]:

df_jump.plot("block", ["pool_length", "users", "decided_txs", "included_txs"])

Out[10]:

<AxesSubplot:xlabel='block'>

As we see with the graph above, for a short while after block 100, blocks include more than the usual ~475 transactions. This is the transition between the old and the new stationary points.

Since we have a lot more new users each round, more of them are willing and able to pay for their transactions above the current basefee, and so get included. This keeps happening until the basefee reaches a new stationary level.

Changing expected time¶

Up until now, users decided whether to join the transaction pool or not based on the expectation that they would be included at least 5 blocks after they join. They evaluated their payoff assuming that basefee did not change (due to stationarity) for these 5 blocks. If their value for transacting minus the cost of waiting for 5 blocks minus the cost of transacting is positive, they sent their transactions in!

$$ \texttt{payoff} = \texttt{value} - \texttt{cost from waiting 5 blocks} - \texttt{transaction fee} > 0 $$

Under a stationary demand however, users can expect to be included in the next block. So let's have user expect to be included in the next block, right after their appearance, and see what happens. We do this by subclassing our User1559 agent and overriding its expected_time method.

In [11]:

class OptimisticUser(User1559):
    def expected_time(self, env):
        return 0
    
demand_scenario = [2000 for i in range(100)] + [6000 for i in range(100)]
(df_opti, user_pool_opti, chain_opti) = simulate(demand_scenario, OptimisticUser)

0
100

In [12]:

df_opti.plot("block", ["basefee", "blk_avg_tip"])

Out[12]:

<AxesSubplot:xlabel='block'>

The plot looks the same as before. But let's look at the average basefee for the last 50 blocks in this scenario and the last.

In [13]:

df_opti[(df.block > 150)][["basefee"]].mean()

Out[13]:

basefee    17.433133
dtype: float64

In [14]:

df_jump[(df.block > 150)][["basefee"]].mean()

Out[14]:

basefee    15.030248
dtype: float64

When users expect to be included in the next block rather than wait for at least 5, the basefee increases! This makes sense if we come back to our payoff definition:

$$ \texttt{payoff} = \texttt{value} - \texttt{cost from waiting} - \texttt{transaction fee} $$

The estimated cost for waiting is lower now since users estimate they'll be included in the next block and not wait 5 blocks to get in. Previously, some users with high values but high time preferences might have been discouraged to join the pool. Now these users don't expect to wait as much, and since their values are high, they don't mind bidding for a higher basefee either. We can check indeed that on average, users included in this last scenario have higher values than users included in the previous one.

To do so, we export to pandas DataFrames the user pool (to obtain their values and costs) and the chain (to obtain the addresses of included users in the last 50 blocks).

In [15]:

user_pool_opti_df = user_pool_opti.export().rename(columns={ "pub_key": "sender" })
chain_opti_df = chain_opti.export()

Let's open these up and have a look at the data. user_pool_opti_df registers all users we spawned in our simulation.

In [16]:

user_pool_opti_df.tail()

Out[16]:

	user	sender	value	wakeup_block	user_type	cost_per_unit
799774	1559 affine user with value 11054269619 and co...	b0930fcba0677116	11.054270	199	user_1559	0.059898
799775	1559 affine user with value 4307381915 and cos...	cb5c2e2729133b44	4.307382	199	user_1559	0.719897
799776	1559 affine user with value 13494438833 and co...	d21482354208b7c3	13.494439	199	user_1559	0.219101
799777	1559 affine user with value 11267395546 and co...	555b23fb396ecab0	11.267396	199	user_1559	0.341294
799778	1559 affine user with value 18296621634 and co...	ba49c4f9090602b3	18.296622	199	user_1559	0.406582

Meanwhile, chain_opti_df lists all the transactions included in the chain.

In [17]:

chain_opti_df.tail()

Out[17]:

	block_height	tx_index	basefee	tx	start_block	sender	gas_used	tx_hash	gas_premium	max_fee	tip
106976	199	455	17.483026	1559 Transaction 8034d24a635a2e2c: max_fee 187...	199	135fa86cc8a0db19	21000	8034d24a635a2e2c	1.0	18.722951	1.0
106977	199	456	17.483026	1559 Transaction 98baeaee522c0661: max_fee 190...	199	00c9cd46792eed6d	21000	98baeaee522c0661	1.0	19.015471	1.0
106978	199	457	17.483026	1559 Transaction cbe0f5904d87c2fa: max_fee 190...	199	c276db600567649e	21000	cbe0f5904d87c2fa	1.0	19.014307	1.0
106979	199	458	17.483026	1559 Transaction 6ba3990799797b1a: max_fee 199...	199	166757b80c27432b	21000	6ba3990799797b1a	1.0	19.916943	1.0
106980	199	459	17.483026	1559 Transaction 849ea7b7070828ca: max_fee 184...	199	8932cefc77b8c395	21000	849ea7b7070828ca	1.0	18.485555	1.0

With a simple join on the sender column we can associate each user with their included transaction. We look at the average value of included users after the second stationary point.

In [18]:

chain_opti_df[(chain_opti_df.block_height >= 150)].join(
    user_pool_opti_df.set_index("sender"), on="sender"
)[["value"]].mean()

Out[18]:

value    19.210239
dtype: float64

When users expect to be included at least one block after they send their transaction, the average value of included users is around 19.2 Gwei.

In [19]:

user_pool_jump_df = user_pool_jump.export().rename(columns={ "pub_key": "sender" })
chain_jump_df = chain_jump.export()
chain_jump_df[(chain_jump_df.block_height >= 150)].join(
    user_pool_jump_df.set_index("sender"), on="sender"
)[["value"]].mean()

Out[19]:

value    18.685197
dtype: float64

But when users expect to be included at least five blocks after, the average value of included users is around 18.7 Gwei, confirming that when users expect next block inclusion, higher value users get in and raise the basefee in the process.

Conclusion¶

We've looked at 1559 when users with their own values and costs decide whether to join the pool or not based on the current basefee level. These users estimate their ultimate payoff by assuming stationarity: the demand between rounds follows the same arrival process and the same distribution of values and costs. In this stationary environment, basefee settles on some value and mostly stays there, allowing users to estimate their payoff should they wait for five or one blocks to be included.

We've again left aside some important questions. Here all users simply leave a 1 Gwei premium in their transactions. In reality, we should expect users to attempt to "game" the system by leaving higher tips to get in first. We can suppose that in a stationary environment, "gaming" is only possible until basefee reaches its stationary point (during the transition period) and exhausts the feasible demand. We will leave this question for another notebook.

(Temporary) non-stationarity is more interesting. The 5% meme during which sudden demand shocks precipitate a large influx of new, high-valued transactions should also see users try to outcompete each other based on premiums alone, until basefee catches up. The question of whether 1559 offers anything in this case or whether the whole situation would look like a first price auction may be better settled empirically, but we can intuit that 1559 would smooth the process slightly by offering a (laggy) price oracle.

And then we have the question of miner collusion, which rightfully agitates a lot of the ongoing conversation. In the simulations we do here, we instantiated one transaction pool only, which should tell you that we are looking at a "centralised", honest miner that includes transactions as much as possible, and not a collection or a cartel of miners cooperating. We can of course weaken this assumption and have several mining pools with their own behaviours and payoff evaluations, much like we modelled our users. We still would like to have a good theoretical understanding of the risks and applicability of miner collusion strategies. Onward!

(Bonus) Ex post individual rationality¶

Individual rationality is the idea that agents won't join a mechanism unless they hope to make some positive payoff out of it. I'd rather not transact if my value for transacting minus my costs is negative.

In general, we like this property and we want to make the mechanism individually rational to as many agents as possible. Yet, some mechanisms fail to satisfy ex post individual rationality: I might expect to make a positive payoff from the mechanism, but some realisation of the mechanism exists where my payoff is negative.

Take an auction. As long as my bid is lower or equal to my value for the auctioned item, the mechanism is ex post individually rational for me: I can never "overpay". If I value the item for 10 ETH and decide to bid 11 ETH, in a first-price auction where I pay for my bid if I have the highest, there is a realisation of the mechanism where I am the winner and I am asked to pay 11 ETH. My payoff is -1 ETH then.

In the transaction fee market, ex post individual rationality is not guaranteed unless I can cancel my transaction. In the simulations here, we do not offer this option to our agents. They expect to wait for inclusion for a certain amount of blocks, and evaluate whether their payoff after that wait is positive or not to decide whether to send their transaction or not. However, some agents might wait longer than their initial estimation, in particular before the mechanism reaches stationarity. Some realisations of the mechanism then yield a negative payoff for these agents, and the mechanism is not ex post individually rational.

Let's look at the agents' payoff using the transcript of transactions included in the chain. For each transaction, we want to find out what was the ultimate payoff for the agent who sent it in. If the transaction was included much later than the agent's initial estimation, this payoff is negative, and the mechanism wasn't ex post individually rational to them.

In [20]:

user_pool_df = user_pool.export().rename(columns={ "pub_key": "sender" })
chain_df = chain.export()
user_txs_df = chain_df.join(user_pool_df.set_index("sender"), on="sender")

In the next chunk we obtain the users' payoffs: their value minus the costs incurred from the transaction fee and the time they waited.

In [21]:

user_txs_df["payoff"] = user_txs_df.apply(
    lambda row: row.user.payoff({
        "current_block": row.block_height,
        "gas_price": row.tx.gas_price({
            "basefee": row.basefee * (10 ** 9) # we need basefee in wei
        })
    }) / (10 ** 9), # put payoff is in Gwei
    axis = 1
)
user_txs_df["epir"] = user_txs_df.payoff.apply(
    lambda payoff: payoff >= 0
)

Now we count the fraction of users in each block who received a positive payoff.

In [22]:

epir_df = pd.concat([
    user_txs_df[["block_height", "tx_hash"]].groupby(["block_height"]).agg(["count"]),
    user_txs_df[["block_height", "epir"]][user_txs_df.epir == True].groupby(["block_height"]).agg(["count"])
], axis = 1)
epir_df["percent_epir"] = epir_df.apply(
    lambda row: row.epir / row.tx_hash * 100,
    axis = 1
)

Let's plot it!

In [23]:

epir_df.reset_index().plot("block_height", ["percent_epir"])

Out[23]:

<AxesSubplot:xlabel='block_height'>

At the very beginning, all users (100%) have positive payoff. They have only waited for 1 block to get included. This percentage steadily drops, as basefee increases: some high value users waiting in the pool get included much later than they expected, netting a negative payoff.

Once we pass the initial instability (while basefee is looking for its stationary value), all users receive a positive payoff. This is somewhat expected: once basefee has increased enough to weed out excess demand, users are pretty much guaranteed to be included in the next block, and so the realised waiting time will always be less than their estimate.

Check out also: A recent ethresear.ch post by Onur Solmaz, on a 1559-inspired mechanism for daily gas price stabilization, with simulations.