Vitalik’s Blog: Without These Three Technologies, Ethereum Will Fail

Vitalik's Blog: The 3 Key Technologies Essential for Ethereum's Success

Author | Vitalik

Translation | @rickawsb

Original link:

https://vitalik.eth.limo/general/2023/06/09/three_transitions.html

As Ethereum transitions from a young, experimental technology to a mature technology stack that can bring an open, global, and permissionless experience to everyday users, this stack will need to go through three major technical transitions:

● The first is the layer-2 transition – everyone moves to layer-2 scaling solutions.

● The second is the wallet security transition – everyone moves to smart contract wallets.

● The third is the privacy transition – ensuring privacy-preserving transaction methods are available and ensuring all other tools being developed (social recovery, identity, reputation) preserve privacy.

The ecosystem transition triangle. You can only pick 3 out of 3.

If the first is not achieved, Ethereum fails because the cost of every transaction is $3.75 (or $82.48 if we have another bull market), and every product targeting the mass market inevitably forgets the blockchain and adopts centralized solutions for everything.

If the second is not achieved, Ethereum fails because users will not want to store their funds (and non-financial assets), and everyone will move to centralized exchanges.

If the third is not achieved, Ethereum fails because all transactions (and POAPs, etc.) are public for anyone to see. For many users this is too great a privacy sacrifice, and everyone will move to centralized solutions that at least somewhat hide your data.

These three transitions are critical for the reasons above. But they are also challenging, because resolving them properly requires intense coordination. It is not just the functionality of the protocol that needs to improve; in some cases, the way we interact with Ethereum needs to change fundamentally, which requires deep changes in applications and wallets.

The three transitions will fundamentally change the relationship between users and addresses

In a world of layer-2 scaling, users will exist on many different layer-2 networks. Are you a member of ExampleDAO, which exists on Optimism? Then you have an account on Optimism! Do you hold a CDP in a stablecoin system on ZkSync? Then you have an account on ZkSync! Have you ever tried running an application on Kakarot? Then you have an account on Kakarot! The era of users having only one address is gone forever.

According to my Brave Wallet view, I have Ethereum in four places. Yes, Arbitrum and Arbitrum Nova are different. Don’t worry, things will get even more confusing over time!

Smart contract wallets add more complexity because they make it much harder to have the same address across L1 and the various L2s. Today, most users use externally owned accounts (EOAs), whose address is literally a hash of the public key used to verify signatures, so nothing changes between L1 and L2. With a smart contract wallet, however, keeping one address becomes more difficult. A lot of work has gone into making addresses hashes of code that can be equivalent across networks (most notably CREATE2 and the ERC-2470 singleton factory), but it is difficult to make this work perfectly. Some L2s (such as “type 4 ZK-EVMs”) are not quite EVM-equivalent and often compile from Solidity or an intermediate assembly language instead, which prevents hash equivalence. And even where hash equivalence can be achieved, the possibility of wallets changing ownership through key changes creates other unintuitive consequences.
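To make the hash-equivalence point concrete, here is a minimal sketch of how a CREATE2-style address is derived from the deployer, a salt, and the hash of the init code. One loud assumption: Ethereum uses Keccak-256 for this, which is not in Python's standard library, so the sketch uses `hashlib.sha3_256` as a stand-in. The structure is the same, but the resulting values will not match real on-chain addresses. The factory address and init code bytes are hypothetical.

```python
import hashlib


def stand_in_hash(data: bytes) -> bytes:
    # ASSUMPTION: stand-in for Keccak-256. SHA3-256 uses different padding,
    # so these outputs will NOT match real Ethereum addresses.
    return hashlib.sha3_256(data).digest()


def create2_address(deployer: bytes, salt: bytes, init_code: bytes) -> bytes:
    """Last 20 bytes of H(0xff ++ deployer ++ salt ++ H(init_code))."""
    if len(deployer) != 20 or len(salt) != 32:
        raise ValueError("deployer must be 20 bytes and salt 32 bytes")
    preimage = b"\xff" + deployer + salt + stand_in_hash(init_code)
    return stand_in_hash(preimage)[12:]


factory = bytes.fromhex("11" * 20)            # hypothetical singleton factory
salt = bytes(32)                              # all-zero salt
evm_init_code = bytes.fromhex("60806040")     # hypothetical wallet init code
other_init_code = bytes.fromhex("60806041")   # same wallet, compiled differently

# No chain identifier enters the formula, so an EVM-equivalent L2 with the
# same factory address yields the same wallet address as L1.
address_on_l1 = create2_address(factory, salt, evm_init_code)
address_on_equivalent_l2 = create2_address(factory, salt, evm_init_code)

# A chain that needs different bytecode gets a different address.
address_on_other_l2 = create2_address(factory, salt, other_init_code)
```

The address depends only on the factory, the salt, and the code hash. That is why a non-equivalent L2, which cannot run the same bytecode, cannot reproduce the same address.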

Privacy requirements may mean that each user has even more addresses, and may even change the kinds of addresses we deal with. If stealth address proposals become widely adopted, users may have a new address for each transaction. Other privacy schemes, including existing ones such as Tornado Cash, change how assets are stored in a different way: many users’ funds are stored in the same smart contract (and thus at the same address). To send funds to a specific user, senders will need to rely on the privacy scheme’s own internal addressing system.

As we can see, each of the three transitions weakens the mental model of “one user = one address” in a different way, and some of these effects in turn add to the complexity of carrying out the transitions. Two particularly complex issues are:

1. If you want to make a payment to someone, how do you get the payment information?

2. If a user’s assets are stored in different locations on different chains, how do they perform key changes and social recovery?

The three transitions and on-chain payments (and identity)

I have some coins on Scroll and I want to use them to buy coffee (if “I” refers to me, the author of this article, then “coffee” of course means “green tea”). You are the person selling me the coffee, but you are only set up to receive coins on Taiko. What should we do?

Basically, there are two solutions:

1. The recipient wallet (whether a merchant or an individual) needs to work very hard to support every L2 and have some automation to asynchronously merge funds.

2. The recipient provides their L2 alongside their address, and the sender’s wallet automatically routes the funds to the target L2 via some cross-L2 bridging system.

Of course, these solutions can be used in conjunction: the recipient provides a list of L2s they are willing to accept, and the sender’s wallet figures out the payment method, which may include direct sending (if they are lucky) or sending via a cross-L2 bridging path.
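The combined approach can be sketched as a small routing function. This is only an illustration under simplifying assumptions: `PaymentRequest`, `plan_payment`, and the `bridges` set are hypothetical names, balances are plain integers, and bridge fees, liquidity, and latency are ignored.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class PaymentRequest:
    address: str
    accepted_l2s: Tuple[str, ...]  # in the recipient's order of preference


def plan_payment(
    sender_balances: Dict[str, int],
    request: PaymentRequest,
    amount: int,
    bridges: Set[Tuple[str, str]],
) -> Tuple[str, ...]:
    """Return the chains the funds pass through, e.g. ("scroll", "taiko")."""
    # Lucky case: the sender already holds enough on an accepted L2.
    for l2 in request.accepted_l2s:
        if sender_balances.get(l2, 0) >= amount:
            return (l2,)
    # Otherwise look for a funded chain with a bridge to an accepted L2.
    for source, balance in sender_balances.items():
        if balance < amount:
            continue
        for target in request.accepted_l2s:
            if (source, target) in bridges:
                return (source, target)
    raise LookupError("no direct or bridged route to an accepted L2")


# The coffee example: the sender holds coins on Scroll, the seller accepts Taiko.
request = PaymentRequest(address="0xSeller", accepted_l2s=("taiko",))
route = plan_payment({"scroll": 10}, request, 3, {("scroll", "taiko")})
print(route)
```

Note that the sender needs more than an address to run this: the list of accepted L2s has to travel with the payment request.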

But this is just one example of a key challenge that the three transitions introduce: simple operations such as paying someone start to require much more information than just a 20-byte address.

Fortunately, smart contract wallets do not place a large burden on the addressing system, but there are still technical challenges in other parts of the application stack. Wallets need to be updated so that they do not send only 21,000 gas along with a transaction, and, more importantly, the receiving side of a wallet must track not only ETH transfers from EOAs but also ETH sent by smart contract code. Applications that rely on the assumption that address ownership is immutable (e.g., NFTs that ban smart contracts in order to enforce royalties) will have to find other ways to achieve their goals. Smart contract wallets will also make some things easier: in particular, someone who receives only a non-ETH ERC20 token will be able to use an ERC-4337 paymaster to pay for gas with that token.

On the other hand, privacy once again raises major challenges that we have not really solved yet. The original Tornado Cash did not raise any of these issues because it did not support internal transfers: users could only deposit into the system and withdraw from it. Once internal transfers are possible, however, users will need to use the internal addressing scheme of the privacy system. In practice, a user’s “payment information” will need to include: (i) some kind of “payment public key”, a commitment to a secret that the recipient can use to spend, and (ii) a way for the sender to send encrypted information that only the recipient can decrypt, to help the recipient discover the payment.

Stealth address protocols rely on the concept of a meta-address, which works as follows: one part of the meta-address is a blinded version of the sender’s spending key, and the other part is the sender’s encryption key (although a minimal implementation could set these two keys to be the same).

Illustrative overview of an abstract privacy address scheme based on encryption and zero-knowledge succinct non-interactive proofs (ZK-SNARKs)

A key lesson here is that in a privacy-focused ecosystem, users will hold both a payment public key and an encryption public key, and the user’s “payment information” will have to contain both keys. There are other good reasons to expand in this direction beyond payments. For example, if we want Ethereum-based encrypted email, users will need to publicly provide some kind of encryption key. In the “EOA world,” we can reuse account keys for this, but in a secure smart contract wallet world, we will probably want more explicit functionality for it. This also helps make Ethereum-based identity more compatible with non-Ethereum decentralized privacy ecosystems, most notably PGP keys.

Three Transitions and Key Recovery

In a world where each user has multiple addresses, the default way to implement key changes and social recovery is to have users run the recovery process on each address separately. This can be done with a single click: wallets can include software that executes the recovery procedure across all of a user’s addresses at the same time. However, even with such a user experience simplification, naive multi-address recovery still has three problems:

1. Gas costs are impractical: This problem is self-explanatory.

2. Counterfactual addresses: addresses for which the smart contract has not yet been deployed (in practice, this means an account that you have not yet sent funds from). As a user, you may have a potentially unlimited number of counterfactual addresses: one or more on every L2, including L2s that do not yet exist, plus a whole other infinite set arising from stealth address schemes.

3. Privacy: If users intentionally hold multiple addresses to avoid linking them together, they certainly don’t want to publicly link all addresses together by recovering them all at the same time!

Solving these problems is difficult. Fortunately, there is a fairly elegant solution that performs well: an architecture that separates verification logic from asset holding.

Each user has a key storage contract (a keystore contract), which exists in one location (either mainnet or a specific L2). The user then has addresses on different L2s, and the verification logic of each address is a pointer to the key storage contract. Spending from these addresses requires a proof, reaching into the key storage contract, that shows the current (or, more realistically, very recent) spending public key.
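The separation of verification logic from asset holding can be illustrated with a toy model. This is a sketch, not an on-chain design: `KeystoreContract` and `PointerWallet` are hypothetical names, public keys are plain strings, and the wallet reads the keystore object directly where a real system would verify a cross-chain proof of the keystore's state.

```python
from dataclasses import dataclass


@dataclass
class KeystoreContract:
    """Lives in one place (mainnet or a specific L2) and holds the current key."""
    spending_pubkey: str

    def rotate_key(self, new_pubkey: str) -> None:
        # A key change or social recovery happens here, exactly once.
        self.spending_pubkey = new_pubkey


@dataclass
class PointerWallet:
    """An address on some L2 whose verification logic points at the keystore."""
    chain: str
    keystore: KeystoreContract

    def authorize(self, signer_pubkey: str) -> bool:
        # ASSUMPTION: direct read as a stand-in for verifying a proof
        # (Merkle branch, ZK-SNARK, KZG) of the keystore's current state.
        proven_pubkey = self.keystore.spending_pubkey
        return signer_pubkey == proven_pubkey


keystore = KeystoreContract(spending_pubkey="pubkey-v1")
wallets = [PointerWallet(chain, keystore) for chain in ("optimism", "zksync", "kakarot")]

# One rotation in the keystore takes effect for every address at once.
keystore.rotate_key("pubkey-v2")
print([wallet.authorize("pubkey-v2") for wallet in wallets])
```

The point of the design is visible in the last two lines: recovery touches one contract, and no per-address recovery transaction is needed.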

This proof can be achieved in several ways:

● Direct read-only access to L1 state inside the L2. An L2 can be modified so that it can read L1 state directly. If the key storage contract is on L1, this means that contracts inside the L2 can access the key storage contract “for free”.

● Merkle branches. Merkle branches can prove L1 state to an L2, or L2 state to L1, or the two can be combined to prove part of the state of one L2 to another L2. The main weakness of Merkle proofs is high gas cost due to proof length: a proof may take 5 kB, although this will fall to under 1 kB in the future thanks to Verkle trees.

● ZK-SNARKs. Data costs can be reduced by using a ZK-SNARK of a Merkle branch instead of the branch itself. Off-chain aggregation techniques (e.g. built on top of EIP-4337) can be used to have one single ZK-SNARK verify all cross-chain state proofs in a block.

● KZG commitments. Either an L2, or a scheme built on top of it, can introduce a sequential addressing system, which allows proofs of state inside that system to be only 48 bytes long. As with ZK-SNARKs, a multi-proof scheme can merge all of these proofs into a single proof per block.
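As an illustration of the Merkle branch option, here is a minimal binary Merkle tree with branch generation and verification. It is a simplified sketch: Ethereum's actual state tree is a hexary Merkle Patricia trie hashed with Keccak-256, while this example uses a plain binary tree over SHA-256 with a power-of-two number of leaves. The leaf contents are made up.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, from hashed leaves up to the root."""
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("this sketch needs a power-of-two number of leaves")
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_branch(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Collect the sibling hash at each level on the path from leaf to root."""
    branch = []
    for level in levels[:-1]:
        branch.append(level[index ^ 1])
        index //= 2
    return branch


def verify_branch(leaf: bytes, index: int, branch: List[bytes], root: bytes) -> bool:
    """Recompute the root from one leaf and its branch."""
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


leaves = [f"account-{i}:spending_pubkey-{i}".encode() for i in range(8)]
levels = build_levels(leaves)
root = levels[-1][0]
branch = make_branch(levels, 5)
print(verify_branch(leaves[5], 5, branch, root), len(branch) * 32, "bytes")
```

The proof grows with the depth of the tree (one 32-byte hash per level here), which is where the gas cost of long proofs over a large state comes from.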

If we want to avoid generating a proof for each transaction, we can implement a lighter scheme that requires a cross-L2 proof only for recovery. Spending from an account depends on a spending key whose corresponding public key is stored inside the account, while recovery requires a transaction that copies over the current spending public key from the key storage contract. Funds in counterfactual addresses are safe even if your old keys are not: “activating” a counterfactual address to turn it into a working contract requires a cross-L2 proof that copies over the current spending public key. This topic on the Safe forum describes a similar architecture.
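The lighter scheme differs from the pointer design mainly in where the key is read at spending time. The toy model below is a sketch under the same kind of assumptions as before: `LightWallet` is a hypothetical name, keys are strings, and the `proven_pubkey` argument stands in for a value that has already been checked by a cross-L2 proof against the key storage contract.

```python
from typing import Optional


class LightWallet:
    """Stores its own copy of the spending key; needs a proof only to sync it."""

    def __init__(self) -> None:
        # None models a counterfactual address: no contract deployed yet.
        self.spending_pubkey: Optional[str] = None

    def sync_from_keystore(self, proven_pubkey: str) -> None:
        # ASSUMPTION: the caller verified a cross-L2 proof that this is the
        # keystore's current key. Used for both activation and recovery.
        self.spending_pubkey = proven_pubkey

    def can_spend(self, signer_pubkey: str) -> bool:
        # Ordinary spending reads only local state, so no proof is needed.
        return self.spending_pubkey is not None and signer_pubkey == self.spending_pubkey


wallet = LightWallet()
old_key_works_before_activation = wallet.can_spend("pubkey-v1")

# The keystore has already rotated to pubkey-v2, so activation copies v2 over
# and a leaked pubkey-v1 never gains control of the counterfactual address.
wallet.sync_from_keystore("pubkey-v2")
print(wallet.can_spend("pubkey-v2"), wallet.can_spend("pubkey-v1"))
```

The trade-off is that each deployed address has to be synced individually after a key change, in exchange for cheaper everyday transactions.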

To add privacy to such a scheme, we only need to encrypt the pointer and do all of the proving inside ZK-SNARKs.

With more work (e.g. using this work as a starting point), we could also strip out most of the complexity of ZK-SNARKs and build a more bare-bones KZG-based scheme.

These schemes can become complex. The good news is that there are many potential synergies between them. For example, the concept of a “key storage contract” can also address the challenge of “addresses” mentioned in the previous section: if we want users to have persistent addresses that do not change when they update their keys, we can put the stealth meta-address, encryption keys, and other information into the key storage contract and use the address of the key storage contract as the user’s “address”.

A lot of secondary infrastructure needs to be updated

Using ENS is expensive. Today, in June 2023, the situation is not too bad: transaction fees are significant, but still comparable to ENS domain name fees. Registering zuzalu.eth cost about $27, of which $11 was transaction fees. But if the market booms again, fees will soar. Even if the ETH price does not rise, gas prices returning to 200 gwei would raise the transaction fee for registering a domain name to $104. Therefore, if we want people to actually use ENS, especially for use cases like decentralized social media, where users demand almost free registration (and ENS domain name fees are not an issue, as these platforms provide subdomains for users), we need ENS to work on L2.
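The two fee figures above can be related with a little arithmetic. The sketch below assumes that a registration uses the same amount of gas in both scenarios and that the ETH price is unchanged, so the fee scales linearly with the gas price. The implied June 2023 gas price is derived from the two quoted numbers and is not stated in the article.

```python
FEE_JUNE_2023_USD = 11.0     # transaction-fee share of the zuzalu.eth registration
FEE_AT_200_GWEI_USD = 104.0  # projected fee if gas prices return to 200 gwei


def registration_fee_usd(gas_price_gwei: float) -> float:
    """Linear scaling, assuming constant gas usage and constant ETH price."""
    return FEE_AT_200_GWEI_USD * gas_price_gwei / 200.0


# Gas price implied by the $11 fee under the same assumptions.
implied_gas_price_gwei = 200.0 * FEE_JUNE_2023_USD / FEE_AT_200_GWEI_USD
print(round(implied_gas_price_gwei, 1))  # 21.2
print(registration_fee_usd(50.0))        # 26.0
```

Under these assumptions, the $11 fee corresponds to a gas price of roughly 21 gwei, so a return to 200 gwei is a bit less than a tenfold increase.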

Fortunately, the ENS team has stepped up, and ENS on L2 is actually happening! ERC-3668 (also known as the “CCIP standard”) and ENSIP-10 together provide a way to have ENS subdomains on any L2 be verified automatically. The CCIP standard requires setting up a smart contract that describes a method for verifying proofs of data on L2, and a domain name (e.g., Optinames uses ecc.eth) can be placed under the control of such a contract. Once the CCIP contract controls ecc.eth on L1, accessing some subdomain.ecc.eth will automatically involve finding and verifying a proof (e.g., a Merkle branch) of the state on the L2 that actually stores that particular subdomain.

Actually fetching these proofs involves going to a list of URLs stored in the contract. This does feel like centralization, but I would argue it really is not: it is a 1-of-N trust model (invalid proofs are caught by the verification logic in the CCIP contract’s callback function, and as long as even one of the URLs returns a valid proof, that is enough). The URL list can contain dozens of URLs.
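The 1-of-N trust model can be sketched as a loop over gateways in which every response is checked and the first valid one wins. This is an illustration only: `fetch_verified_proof` and the gateway callables are hypothetical, real gateways would be HTTP endpoints, and the real check runs in the CCIP contract's callback rather than in client-side Python.

```python
from typing import Callable, Iterable, Optional

Gateway = Callable[[str], Optional[bytes]]   # name -> proof bytes (or None)
Verifier = Callable[[str, bytes], bool]      # stand-in for the contract callback


def fetch_verified_proof(name: str, gateways: Iterable[Gateway], verify: Verifier) -> bytes:
    """Return the first proof that passes verification; one honest gateway is enough."""
    for fetch in gateways:
        try:
            proof = fetch(name)
        except Exception:
            continue  # an offline or failing gateway is simply skipped
        if proof is not None and verify(name, proof):
            return proof  # invalid proofs never get this far
    raise LookupError(f"no gateway returned a valid proof for {name}")


# Hypothetical gateways: one offline, one returning garbage, one honest.
def offline_gateway(name: str) -> Optional[bytes]:
    raise ConnectionError("gateway unreachable")


def lying_gateway(name: str) -> Optional[bytes]:
    return b"not-a-proof"


def honest_gateway(name: str) -> Optional[bytes]:
    return b"proof:" + name.encode()


def toy_verify(name: str, proof: bytes) -> bool:
    # ASSUMPTION: toy check standing in for real proof verification.
    return proof == b"proof:" + name.encode()


result = fetch_verified_proof(
    "alice.ecc.eth", [offline_gateway, lying_gateway, honest_gateway], toy_verify
)
print(result)
```

A dishonest gateway can refuse to answer, but it cannot make the client accept false data, which is why a long URL list improves availability without adding trust.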

The ENS CCIP effort is a success story and should be seen as a sign that we can actually achieve the kind of fundamental reforms we need. But there are still many application-layer reforms that need to happen. Here are some examples:

● Many dapps depend on users providing off-chain signatures. This is easy for externally owned accounts (EOAs). ERC-1271 provides a standardized way to implement off-chain signatures for smart contract wallets. However, many dapps still don’t support ERC-1271 and will need to be updated.

● Dapps that use “is this an EOA?” to distinguish users from contracts (e.g. to prevent transfers or enforce royalties for NFTs) will break. Overall, I recommend not trying to find a purely technical solution here; determining whether a particular transfer of cryptographic control is a transfer of beneficial ownership is a difficult problem that may not be solvable without some off-chain community-driven mechanism. Most likely, apps will have to rely less on preventing transfers and more on techniques like Harberger taxes.

● Wallets will need to improve how they interact with spending and cryptographic keys. Currently, wallets often use deterministic signatures to generate application-specific keys: signing a standard nonce (e.g. a hash of the application name) with the EOA’s private key generates a deterministic value that cannot be generated without the private key, and so is secure in a purely technical sense. However, these techniques are “opaque” to wallets, preventing them from implementing UI-level security checks. In a more mature ecosystem, signing, encryption, and related functionality will need to be handled more explicitly by wallets.

● Light clients (e.g. Helios) will need to verify L2s, not just L1. Today, light clients primarily focus on checking L1 headers for validity (using the light client sync protocol) and verifying Merkle branches of L1 state and transactions rooted in the L1 header. Tomorrow they will also need to verify proofs of L2 state rooted in the state root stored on L1 (more advanced versions will actually look at L2 pre-confirmations).
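For the off-chain signature item, a dapp's verification path might look roughly like the sketch below. The magic value `0x1626ba7e` is the one defined by ERC-1271 for a valid signature. Everything else is a hypothetical stand-in: `get_code`, `call_is_valid_signature`, and `recover_signer` are injected callables representing node RPC calls and ECDSA recovery, not functions from any real library.

```python
from typing import Callable, Optional

ERC1271_MAGIC_VALUE = bytes.fromhex("1626ba7e")  # returned by isValidSignature on success


def is_signature_valid(
    account: str,
    message_hash: bytes,
    signature: bytes,
    get_code: Callable[[str], Optional[bytes]],
    call_is_valid_signature: Callable[[str, bytes, bytes], bytes],
    recover_signer: Callable[[bytes, bytes], str],
) -> bool:
    """Accept signatures from both smart contract wallets and EOAs."""
    if get_code(account):
        # Contract account: let the wallet's own logic decide (ERC-1271).
        return call_is_valid_signature(account, message_hash, signature) == ERC1271_MAGIC_VALUE
    # No code at the address: fall back to recovering the signing key.
    return recover_signer(message_hash, signature) == account


# Toy stand-ins so that the sketch runs without a node.
deployed_code = {"0xWallet": b"\x60\x80"}


def fake_contract_call(account: str, message_hash: bytes, signature: bytes) -> bytes:
    return ERC1271_MAGIC_VALUE if signature == b"owner-approved" else b"\xff\xff\xff\xff"


def fake_recover(message_hash: bytes, signature: bytes) -> str:
    return "0xEoa" if signature == b"eoa-signature" else "0xSomeoneElse"


digest = b"\x00" * 32
print(is_signature_valid("0xWallet", digest, b"owner-approved",
                         deployed_code.get, fake_contract_call, fake_recover))
```

Here the presence of code only selects the verification path. It is not used to decide whether the account is a "real user", which is the pattern the list above warns against.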

Wallets will need to protect assets and data

Today, the task of wallets is to protect assets. Everything is stored on the chain, and the only thing the wallet needs to protect is the private key that currently protects these assets. If you change the key, you can safely publish the previous private key on the Internet the next day. However, in the ZK world, the situation is no longer the same: wallets not only protect authentication credentials, but also save your data.

We saw the first signs of this world at Zuzalu, which used the ZK-SNARK-based identity system Zupass. Users have a private key that they use to authenticate to the system, and it can be used to generate basic proofs, such as “prove that I am a Zuzalu resident without revealing which one.” However, the Zupass system has also started to have other applications built on top of it, most notably stamps (Zupass’s version of POAPs).

One of my many Zupass stamps, confirming that I am a proud member of Team Cat

The main feature of stamps compared to POAPs is privacy: you keep the data locally, and you only provide a ZK proof of a stamp (or of some computation over your stamps) when you want to share that information with someone. But this also brings a risk: if you lose this data, you lose your stamps.

Of course, the problem of holding data can be reduced to the problem of holding a single encryption key: a third party (even the chain) can hold an encrypted copy of the data. This has the convenient advantage that the operations you take do not change the encryption key, so you do not need to interact with the system that protects your encryption key. But even so, if you lose the encryption key, you will lose everything. On the other hand, if someone sees your encryption key, they will see everything encrypted with that key.

Zupass’s solution in practice is to encourage people to store their key on multiple devices (such as a laptop and a phone), because the chance of losing all of them at the same time is small. We could go further and use secret sharing to split the key among multiple guardians.

This kind of social recovery through MPC is not sufficient as a wallet solution, because it means that not only the current guardians but also previous guardians could collude to steal your assets, which is an unacceptably high risk. But a privacy leak is usually a lower risk than a complete loss of assets, and someone with a use case that demands high privacy can always accept a higher risk of loss by not backing up the keys associated with those privacy needs.
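Splitting a key among guardians can be done with Shamir secret sharing. The sketch below is a minimal version over a prime field. The 3-of-5 threshold and the choice of the Mersenne prime 2^521 - 1 (large enough to hold a 256-bit key) are illustrative assumptions, and a production scheme would add share authentication and careful handling of key material.

```python
import secrets
from typing import List, Tuple

PRIME = 2**521 - 1  # a Mersenne prime, comfortably larger than a 256-bit key

Share = Tuple[int, int]  # (x, y) point on the secret polynomial


def split_secret(secret: int, threshold: int, num_shares: int) -> List[Share]:
    """Split `secret` so that any `threshold` shares can recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for coefficient in reversed(coefficients):  # Horner's rule
            result = (result * x + coefficient) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, num_shares + 1)]


def recover_secret(shares: List[Share]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (x_i, y_i) in enumerate(shares):
        numerator, denominator = 1, 1
        for j, (x_j, _) in enumerate(shares):
            if i != j:
                numerator = (numerator * -x_j) % PRIME
                denominator = (denominator * (x_i - x_j)) % PRIME
        secret = (secret + y_i * numerator * pow(denominator, -1, PRIME)) % PRIME
    return secret


encryption_key = secrets.randbits(256)
guardian_shares = split_secret(encryption_key, threshold=3, num_shares=5)
print(recover_secret(guardian_shares[:3]) == encryption_key)
```

Any three guardians can rebuild the key, which is exactly the collusion risk described above: shares held by former guardians stay valid unless the key itself is replaced.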

To avoid trapping users in a byzantine system of multiple recovery paths, wallets that support social recovery will likely need to manage both asset recovery and encryption key recovery.

Returning to identity issues

A common thread running through these changes is that the concept of an “address”, the cryptographic identifier you use to represent “you” on-chain, will have to change fundamentally. “Instructions for how to interact with me” will no longer be just an ETH address; in some form, they will be a combination of multiple addresses on multiple L2s, stealth meta-addresses, encryption keys, and other data.

One way to achieve this is to use ENS as your identity: your ENS record can contain all of this information, and if you give someone the name bob.eth (or bob.ecc.eth, etc.), they can look it up and see everything they need to know about how to pay and interact with you, including in the more complex cross-domain and privacy-preserving ways.
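As a rough picture of what such a record might hold, here is a small data-structure sketch. The field names, the `InteractionProfile` type, and all values are hypothetical. This is not the ENS record format, only an illustration of "instructions for how to interact with me" bundled behind one name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class InteractionProfile:
    addresses: Dict[str, str] = field(default_factory=dict)  # chain name -> address
    stealth_meta_address: Optional[str] = None
    encryption_pubkey: Optional[str] = None


def payment_options(profile: InteractionProfile, sender_chains: List[str]) -> List[Tuple[str, str]]:
    """Chains where the sender could pay the recipient directly, in sender order."""
    return [(chain, profile.addresses[chain]) for chain in sender_chains if chain in profile.addresses]


# Hypothetical name registry standing in for an ENS lookup.
records = {
    "bob.eth": InteractionProfile(
        addresses={"optimism": "0xBobOnOptimism", "taiko": "0xBobOnTaiko"},
        stealth_meta_address="bob-stealth-meta-address",
        encryption_pubkey="bob-encryption-pubkey",
    )
}

options = payment_options(records["bob.eth"], ["scroll", "taiko"])
print(options)
```

A wallet that resolves the name gets everything at once: where to pay, how to pay privately, and which key to encrypt messages to.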

However, this ENS-centric approach has two weaknesses:

● It binds too many things to your name. Your name is not you; your name is just one of your many attributes. You should be able to change your name without having to move your entire identity profile and update records in many applications.

● You cannot have trustless counterfactual names. A key user experience feature of any blockchain is the ability to send coins to someone who has not yet interacted with that chain. Without this feature, you are stuck in a catch-22: interacting with the chain requires paying transaction fees, and paying transaction fees requires already having coins. ETH addresses, including smart contract addresses with CREATE2, have this feature. ENS names do not, because if two Bobs both decide off-chain that they are bob.ecc.eth, there is no way to choose which of them gets the name.

One possible solution is to put more into the keystore contract from the architecture described earlier in this article. The keystore contract can contain the various pieces of information about you and how to interact with you (and, with CCIP, some of this information can be off-chain), and users would use their keystore contract as their primary identifier. The assets they actually receive, however, would be stored in all kinds of different places. Keystore contracts are not tied to a name, and they are counterfactual-friendly: you can generate an address that can provably only be initialized by a keystore contract with certain fixed initial parameters.

Another category of solutions involves completely abandoning the concept of user-visible addresses, similar to the Bitcoin payment protocol. One idea is to rely more on direct communication channels between the sender and the receiver; for example, the sender could send a claim link (which could be an explicit URL or a QR code), and the receiver could use that link to accept the payment in the way they prefer.

Whether the sender or the receiver moves first, relying more on wallets to generate up-to-date payment information in real time can reduce friction. However, the assumption of direct communication between sender and receiver is a thorny one in practice, so we may eventually see a combination of different techniques.

In all of these designs, it is crucial to preserve decentralization and to give users easy access to an up-to-date view of their current assets and of the messages published for them. These views should rely on open tools, not proprietary solutions. It will take hard work to keep the greater complexity of payment infrastructure from turning into an opaque “tower of abstraction” in which developers struggle to understand what is happening and to adapt it to new environments. Despite the challenges, achieving scalability, wallet security, and privacy for ordinary users is critical to the future of Ethereum. This is not only about technical feasibility, but also about actual accessibility for ordinary users. We need to work hard to meet this challenge.