Hacking, Distributed

Attacking the DeFi Ecosystem with Flash Loans for Fun and Profit

Wed, 11 Mar 2020 07:00:00 -0700

TL;DR Flash loans are a recent blockchain smart contract construct that enables the issuance of loans that are only valid within one transaction and must be repaid by the end of that transaction. This concept has led to a number of interesting attack possibilities, some of which were exploited recently (February 2020). We analyze two existing DeFi flash loan attacks with significant ROIs (beyond 500k%). We then formulate the search for flash loan-based attack parameters as an optimization problem over the state of the underlying DeFi platforms and the Ethereum blockchain. Our results show how the two previously executed attacks can be boosted to yield a profit of 829.5k USD (instead of 350k USD) and 1.1M USD (instead of 600k USD), respectively. For a light read, feel free to proceed; for all details, we refer to our paper.

Flash Loans - the New DeFi Kid on the Block

In the traditional economy, when a lender grants a loan to a borrower, there is always a risk that the borrower might not repay the debt. This brings us to the following question:

What if it were possible to offer credit, without the risk that the borrower does not pay back the debt?

A flash loan, taken from a liquidity pool, used, and repaid within one transaction.

Here, blockchain-based flash loans come into play. A flash loan is a loan that is only valid within one blockchain transaction. A flash loan fails if the borrower does not repay the debt before the end of the transaction in which the loan was taken out. This works because a blockchain transaction can be reverted during its execution if the repayment condition is not satisfied.

The assets for the flash loan are taken from a publicly funded smart contract pool. At the time of writing, some of the biggest flash loan pools are provided by Aave and dYdX, each exceeding 20M USD in value. Aave charges an interest rate of 0.09%, while dYdX’s smart contract only demands the repayment plus 1 Wei in fees.
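The all-or-nothing behaviour described above can be sketched in a few lines of Python. This is a minimal simulation under stated assumptions, not the code of Aave or dYdX: the class FlashLoanPool, the Revert exception and the callback signature are hypothetical names, and fees are modelled as a proportional part (900 parts per million, i.e. the 0.09% quoted above) plus an optional flat part (e.g. 1 Wei).

```python
from typing import Callable


class Revert(Exception):
    """Signals that the simulated transaction is rolled back."""


class FlashLoanPool:
    """Toy liquidity pool; all amounts are integers in the smallest unit (Wei)."""

    def __init__(self, liquidity: int, fee_ppm: int = 0, flat_fee: int = 0):
        self.liquidity = liquidity
        self.fee_ppm = fee_ppm    # proportional fee in parts per million (assumption)
        self.flat_fee = flat_fee  # fixed fee added to every loan (assumption)

    def fee(self, amount: int) -> int:
        return amount * self.fee_ppm // 1_000_000 + self.flat_fee

    def flash_loan(self, amount: int, use_funds: Callable[[int, int], int]) -> None:
        """Lend `amount`, run the borrower's logic, and require repayment plus fee."""
        if amount > self.liquidity:
            raise Revert("not enough liquidity")
        snapshot = self.liquidity            # state to restore on revert
        owed = amount + self.fee(amount)
        self.liquidity -= amount
        try:
            repaid = use_funds(amount, owed)  # borrower returns what it pays back
            if repaid < owed:
                raise Revert("loan not repaid in full")
        except Exception:
            self.liquidity = snapshot        # undo everything: the loan never happened
            raise
        self.liquidity += repaid


aave_like = FlashLoanPool(liquidity=20_000_000, fee_ppm=900)
aave_like.flash_loan(1_000_000, lambda amount, owed: owed)  # borrower repays in full
print(aave_like.liquidity)  # the pool has grown by the loan fee
```

A borrower that returns less than the owed amount triggers Revert, and the pool balance is left exactly as it was before the loan.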

To gauge flash loan usage, we collected flash loan data between the 8th of January 2020 and the 26th of February 2020 with a full archive Ethereum node, gathering all event logs from the Aave smart contract. Note that Aave only went live in early January 2020. We observe a total of 105 loans, and most flash loans interact with lending/exchange DeFi systems (e.g. Compound, Dai, MakerDAO, Uniswap). The transaction costs (i.e. gas) of flash loans appear significant (at times beyond 4M gas, compared to 21k gas for a regular Ether transfer). The full details can be found in Figure 5 in the accompanying paper.

Flash Loan Use Cases

But what are flash loans really good for? Based on community feedback and our own thoughts, we identified four use cases for flash loans: arbitrage, wash trading, collateral swapping and flash minting.

Arbitrage

Given flash loans, a trader can perform arbitrage across different DEXs without needing to hold a monetary position or being exposed to volatility risks. The trader can simply open a loan, perform an arbitrage trade and pay back the loan plus interest. One may argue that flash loans render arbitrage risk-free; the risks of smart contract vulnerabilities, however, remain.

Arbitrage example: On the 18th of January 2020, a flash loan borrowed 3,137.41 DAI from Aave to make an arbitrage trade on the AMM DEX Uniswap. To prepare the arbitrage, DAI is converted to 3,137.41 SAI using MakerDAO's migration contract. The arbitrage converts SAI for 18.16 ETH using SAI/ETH Uniswap, and then immediately converts 18.16 ETH back to 3,148.39 DAI using DAI/ETH Uniswap. After the arbitrage, 3,148.38 DAI is transferred back to Aave to pay the loan plus fees. This transaction costs 0.02 ETH of gas. Note that even though the transaction sender gains 3.29 DAI from the arbitrage, this particular transaction is not profitable.

Wash Trading

Another potential flash loan use case is wash trading.

The trading volume of an asset is a metric indicating its trading popularity. The most popular assets are therefore supposed to be traded the most; e.g. Bitcoin to date enjoys the highest trading volume (reported up to 50T USD per day) of all cryptocurrencies.

Malicious exchanges or traders can mislead other traders by artificially inflating the trading volume of an asset to attract interest. According to the Blockchain Transparency Institute Market Surveillance report from September 2019, 73 out of the top 100 exchanges on Coinmarketcap were wash trading over 90% of their volumes. Wash trading of securities appears illegal under U.S. law.

While wash trading on centralised exchanges may be performed at little to no cost, and possibly even without real assets, wash trading on a DEX requires wash traders to hold and use assets. Flash loans can remove this “obstacle”, reducing the costs to loan interest, trading fees, and (blockchain) transaction fees. A wash trading endeavour to increase the 24-hour volume of the ETH/DAI market of Uniswap by 50% would, for instance, cost about 1,298 USD (with a flash loan from dYdX).
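The cost structure named above (loan interest, trading fees, transaction fees) can be sketched as a back-of-the-envelope calculation. The function name wash_trading_cost, the 0.3% trading fee and all input figures below are assumptions for illustration; they are not the inputs behind the 1,298 USD estimate.

```python
def wash_trading_cost(target_volume_usd, trading_fee=0.003,
                      loan_fee_usd=0.0, gas_cost_usd=0.0):
    """Approximate cost of adding `target_volume_usd` of volume with borrowed funds.

    Every traded dollar pays the exchange's trading fee (assumed 0.3% here);
    the flash loan fee and the gas cost are passed in as lump sums.
    """
    return target_volume_usd * trading_fee + loan_fee_usd + gas_cost_usd


daily_volume_usd = 800_000             # hypothetical 24-hour market volume
extra_volume = 0.5 * daily_volume_usd  # inflate the reported volume by 50%
cost = wash_trading_cost(extra_volume, gas_cost_usd=5.0)
print(f"{cost:.2f} USD to add {extra_volume:,.0f} USD of volume")
```

Under these assumptions the trading fee dominates; the loan fee is close to zero for a pool that only charges 1 Wei.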

Wash trading example: On March 2nd, 2020, a flash loan of 0.01 ETH borrowed from dYdX performed two back-and-forth trades (first converting 0.01 ETH to 122.1898 LOOM and then converting 122.1898 LOOM back to 0.0099 ETH) on the Uniswap ETH/LOOM market. The 24-hour trading volume of the ETH/LOOM market increased by 25.8% (from 17.71 USD to 22.28 USD) as a result of the two trades.

We elaborate on two other flash loan use cases, collateral swapping and flash minting, in our paper.

Flash Loan Post-Mortem

In the following, we examine two adversarial flash loan trades: one pump and arbitrage attack, and one oracle manipulation attack.

Pump and Arbitrage

A flash loan transaction executed on the 15th of February 2020, followed by 74 transactions, yielded a profit of 1,193.69 ETH (350k USD) given a transaction fee of 132.36 USD (cumulative 50,237,867 gas, 0.5 ETH). We first discuss the details of this transaction and then explain why the adversary could have earned a profit exceeding 829.5k USD (the numbers in boxes in the figure below correspond to the optimal attack parameters we find, while the unboxed numbers are those chosen by the adversary).

The core of this trade involves a margin trade on a DEX (bZx) that increases the price of WBTC/ETH on another DEX (Uniswap) and thus creates an arbitrage opportunity. The trader then borrows WBTC using ETH as collateral (on Compound), and purchases ETH at a “cheaper” price on the distorted (Uniswap) DEX market. To maximise the profit, the adversary then uses the “cheap” ETH to purchase WBTC at a non-manipulated market price over a period of two days after the flash loan. The adversary then returns WBTC (to Compound) to redeem the ETH collateral. The following figure outlines how this trade mainly consists of two parts. For simplicity, we omit the conversion between WETH (the ERC20-tradable version of ETH) and ETH. The full details are outlined in the paper.

Flash loan pump and arbitrage attack.
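To see why a single large trade can distort a DEX price, consider a minimal sketch of a constant-product market maker. The pricing rule (reserve_in * reserve_out stays constant, with a 0.3% fee) is our assumption about how a Uniswap-style pool behaves and is not spelled out in this post; the function name swap and the pool reserves are hypothetical and unrelated to the real attack figures.

```python
def swap(reserve_in, reserve_out, amount_in, fee=0.003):
    """Constant-product swap.

    Returns (amount_out, new_reserve_in, new_reserve_out). The fee is taken
    from the input amount and stays in the pool.
    """
    effective_in = amount_in * (1 - fee)
    amount_out = reserve_out * effective_in / (reserve_in + effective_in)
    return amount_out, reserve_in + amount_in, reserve_out - amount_out


# Hypothetical WBTC/ETH pool: 2,000 ETH against 80 WBTC (25 ETH per WBTC).
eth_reserve, wbtc_reserve = 2_000.0, 80.0
price_before = eth_reserve / wbtc_reserve

# A large purchase of WBTC with ETH pushes the WBTC price up.
wbtc_bought, eth_reserve, wbtc_reserve = swap(eth_reserve, wbtc_reserve, 1_000.0)
price_after = eth_reserve / wbtc_reserve

print(f"bought {wbtc_bought:.2f} WBTC")
print(f"pool price moved from {price_before:.2f} to {price_after:.2f} ETH per WBTC")
```

Anyone who sells WBTC into the pool right after such a trade receives more ETH per WBTC than on an undistorted market, which is the arbitrage opportunity the trade above relies on.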

Oracle Manipulation

In the following, we discuss the details of a second flash loan trade, which yields a profit of 2,381.41 ETH (c. 650k USD) within a single transaction executed on the 18th of February 2020, given a transaction fee of 118.79 USD. We again find that the chosen attack parameters were sub-optimal and present attack parameters that would yield a profit of 1.1M USD instead. For this attack, the adversary involves three different exchanges for the same sUSD/ETH market pair (the Kyber-Uniswap reserve, Kyber, and Synthetix). Two of these exchanges (Kyber, Kyber-Uniswap) act as price oracles for the lending platform (bZx) from which the adversary borrows assets.

Price oracle: One of the goals of the DeFi ecosystem is to not rely on trusted third parties. This premise holds both for asset custody and for additional information, such as asset pricing. One common method to determine an asset price is hence to rely on the pricing information of an on-chain DEX (e.g. Uniswap). One drawback of this approach is the danger of a DEX price manipulation.

Attack intuition: The core of this trade is an oracle manipulation using a flash loan on the asset pair sUSD/ETH. The manipulation lowers the price of sUSD/ETH (from 268.30 sUSD/ETH to 106.05 sUSD/ETH on Uniswap and 108.44 sUSD/ETH on Kyber Reserve). In a second step, the adversary benefits from this sUSD/ETH price decrease by borrowing ETH with sUSD as collateral. The full details can again be found in the paper.

Flash loan oracle manipulation attack.
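The effect of the manipulated oracle on borrowing power can be checked with simple arithmetic. The sketch below uses the two prices quoted above (268.30 and 106.05 sUSD/ETH); the collateral amount, the function name borrowable_eth and the collateral_factor parameter are hypothetical and do not reflect the lending platform's actual rules.

```python
def borrowable_eth(collateral_susd, oracle_price, collateral_factor=1.0):
    """ETH that can be borrowed against sUSD collateral.

    `oracle_price` is expressed in sUSD per ETH, as in the text above;
    `collateral_factor` is a hypothetical haircut applied by the lender.
    """
    return collateral_susd * collateral_factor / oracle_price


collateral = 1_000_000.0  # hypothetical sUSD collateral
honest = borrowable_eth(collateral, 268.30)       # price before the manipulation
manipulated = borrowable_eth(collateral, 106.05)  # price reported after it

print(f"honest oracle:      {honest:,.1f} ETH")
print(f"manipulated oracle: {manipulated:,.1f} ETH")
print(f"borrowing power inflated by a factor of {manipulated / honest:.2f}")
```

The inflation factor is simply the ratio of the two oracle prices, independent of the collateral amount.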

Optimal DeFi Attack Parameter Generation

It’s clearly not trivial (i) to find the attack paths that the adversaries exploited, and (ii) to determine the optimal parameters to exploit the attacks to the fullest. We therefore turn to constrained optimization techniques to guide us towards optimal attack parameters.

Parametrized DeFi optimizer.

To make use of constrained optimization, we first model the different components that may engage in a DeFi attack. We quantitatively formalize every endpoint provided by DeFi platforms as a state transition function S’ = T(S; p) with the constraints C(S; p), where S is the given state, p are the parameters chosen by the adversary and S’ is the output state. The state can represent, for example, the adversarial balance or any internal status of the DeFi platform, while the constraints are set by the execution requirements of the Ethereum Virtual Machine (e.g. the Ether balance of an entity should never be a negative number) or the rules defined by the respective DeFi platform (e.g. a flash loan plus loan fees must be repaid before the transaction terminates). Note that when quantifying profits, we ignore the loan interest/fee payments and Ethereum transaction fees, which are negligible in the present DeFi attacks. The constraints are enforced on the input parameters and output states to ensure that the optimizer yields parameters that are valid for the model. We refer to the paper for the full details.
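A single endpoint in this style of model could look like the sketch below. It is an illustration under stated assumptions rather than the model from the paper: the State fields, the constant-product pricing rule without fees, and the names constraints and transition are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    """The state S: adversarial balances plus the internal state of one DEX pool."""
    adversary_eth: float
    adversary_wbtc: float
    pool_eth: float
    pool_wbtc: float


def constraints(s: State, p: float) -> bool:
    """C(S; p): the adversary cannot spend a negative amount or more ETH than it holds."""
    return 0.0 <= p <= s.adversary_eth


def transition(s: State, p: float) -> State:
    """T(S; p): sell p ETH for WBTC on a fee-free constant-product pool."""
    if not constraints(s, p):
        raise ValueError("parameters violate the constraints")
    wbtc_out = s.pool_wbtc * p / (s.pool_eth + p)
    return replace(
        s,
        adversary_eth=s.adversary_eth - p,
        adversary_wbtc=s.adversary_wbtc + wbtc_out,
        pool_eth=s.pool_eth + p,
        pool_wbtc=s.pool_wbtc - wbtc_out,
    )


s0 = State(adversary_eth=100.0, adversary_wbtc=0.0, pool_eth=1_000.0, pool_wbtc=50.0)
s1 = transition(s0, 100.0)
print(s1)
```

An attack is then a chain of such transitions, and the objective of the optimizer is a function of the final state, for example the adversarial balance.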

Optimizing the Pump and Arbitrage Attack

To solve the constrained optimization problem, we apply the Sequential Least Squares Programming (SLSQP) algorithm through the minimize function of SciPy's optimize package. Our program is evaluated on an Ubuntu 18.04.2 machine with 16 CPU cores and 32 GB RAM. We repeated our experiment 1,000 times; the optimizer took 6.1 ms on average to converge to the optimum.
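The way SLSQP is driven through scipy.optimize.minimize can be illustrated on a toy problem. The sketch below is not the attack model from the paper: it maximizes the profit of a simple two-pool arbitrage with fee-free constant-product pricing, and the pool reserves, the budget and all names are hypothetical. The toy problem has a closed-form optimum, which is used as a sanity check.

```python
import math

from scipy.optimize import minimize

# Hypothetical pools: the token is cheap on pool A and expensive on pool B.
A_ETH, A_TOK = 1_000.0, 100.0   # pool A reserves (10 ETH per token)
B_ETH, B_TOK = 1_200.0, 100.0   # pool B reserves (12 ETH per token)
BUDGET = 500.0                  # ETH available to the trader


def profit(p):
    """ETH profit of buying tokens with p ETH on pool A and selling them on pool B."""
    tokens = A_TOK * p / (A_ETH + p)
    eth_back = B_ETH * tokens / (B_TOK + tokens)
    return eth_back - p


result = minimize(
    lambda x: -profit(x[0]),     # minimize the negative profit
    x0=[10.0],
    method="SLSQP",
    bounds=[(0.0, None)],        # the traded amount cannot be negative
    constraints=[{"type": "ineq", "fun": lambda x: BUDGET - x[0]}],  # p <= BUDGET
)
best_p = result.x[0]

# Closed-form optimum of this toy problem, obtained by setting d(profit)/dp = 0.
closed_form = (math.sqrt(A_ETH * B_ETH * A_TOK * B_TOK) - A_ETH * B_TOK) / (A_TOK + B_TOK)

print(f"SLSQP: trade {best_p:.2f} ETH for a profit of {profit(best_p):.3f} ETH")
print(f"closed form: trade {closed_form:.2f} ETH")
```

In SciPy's convention, an inequality constraint of type "ineq" requires the constraint function to be non-negative, which is why the budget limit is written as BUDGET - x[0].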

For the Pump and Arbitrage attack, the optimizer provides parameters that would yield a maximum revenue of 2,778.94 ETH, while the parameters of the original attack only yield 1,171.70 ETH in our model. The optimized parameters correspond to a profit of 829.5k USD, compared with 350k USD for the attack that took place. Our figure above highlights the ideal amounts that should have been used in the attack. Note that, because our model ignores trading fees and precision differences, there is a minor discrepancy between the original attack revenue calculated with our model and the real revenue of 1,193.69 ETH.

Optimizing the Oracle Manipulation Attack

We again execute our optimizer 1,000 times on the same Ubuntu 18.04.2 machine and find an average convergence time of 12.9 ms. The optimizer discovers a setting that results in 6,323.93 ETH profit for the adversary. This results in a gain of 1.1M USD instead of about 600k USD for the attack that took place.

Discussing Flash Loans and DeFi

The current generation of DeFi has developed organically, without much scrutiny when it comes to financial security; it therefore presents an interesting security challenge to confront. DeFi, on the one hand, welcomes innovation and the advent of new protocols, such as MakerDAO, Compound, and Uniswap. On the other hand, despite a great deal of effort spent on trying to secure smart contracts and to avoid various forms of market manipulation, there has been little-to-no effort to secure entire protocols.

As new DeFi protocols join the ecosystem, they enable both exploits against the protocols themselves and multi-step attacks that utilize several protocols (see above). In a certain poignant way, this highlights the fact that DeFi, lacking a central authority that would enforce a strong security posture, is ultimately vulnerable to a multitude of attacks, effectively by design. Flash loans are merely a mechanism that accelerates these attacks. They do so by requiring no collateral (except for the minor gas costs), which, in a certain way, democratizes the attack, opening this strategy to the masses. However, it is quite likely that other mechanisms will be invented that enable further, potentially even more devastating, attacks in the near future.

Responsible disclosure

It is somewhat unclear how to perform responsible disclosure within DeFi, given that the underlying vulnerability and victim are not always perfectly clear and that there is a lack of security standards to apply. We reached out to Aave, Kyber, and Uniswap to disclose the contents of this work.

Determining what is malicious

An interesting question remains whether we can qualify the use of flash loans as clearly malicious (or clearly benign). We believe this is a difficult question to answer and prefer to withhold the value judgement. The two attacks described are clearly malicious: pump and arbitrage involves manipulating the WBTC/ETH price on Uniswap; the oracle manipulation attack involves manipulating the price oracle by lowering the price of ETH against sUSD on Kyber. However, the arbitrage mechanism in general is not malicious; it is merely a consequence of the decentralized nature of the DeFi ecosystem, where many exchanges and DEXs are allowed to exist without much coordination with each other. As such, arbitrage will continue to exist as a phenomenon, with good and bad consequences.

Does extra capital help

The main attraction of flash loans stems from them not requiring collateral that needs to be raised. One can, however, wonder whether extra capital would make the attacks we focus on more potent and the ROI greater. Based on our results, extra collateral for the two attacks presented would not increase the ROI, as the liquidity constraints of the intermediate protocols do not allow for a higher impact.

Potential defenses

Here we discuss several potential defenses. However, we would be the first to admit that these are not foolproof and come with potential downsides that would significantly hamper normal interactions.

Should DEX accept trades coming from flash loans?
Should DEX accept coins from an address if the previous block did not show those funds in the address?
Would introducing a delay make sense, e.g. in governance voting or price oracles?
When designing a DeFi protocol, a single transaction should be limited in its abilities: a DEX should not allow a single transaction triggering a slippage beyond 100%.
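The last point in the list above can be sketched as a guard inside a swap function. This is a hypothetical illustration, not a recommendation for a concrete protocol: slippage is taken here to mean the relative change of the pool's spot price caused by the trade (other definitions exist), the pool uses fee-free constant-product pricing, and the names guarded_swap and SlippageExceeded are made up.

```python
class SlippageExceeded(Exception):
    """Raised when a single trade would move the pool price too far."""


def guarded_swap(reserve_in, reserve_out, amount_in, max_slippage=1.0):
    """Constant-product swap that rejects trades moving the price by more than
    `max_slippage` (1.0 means 100%).

    Returns (amount_out, new_reserve_in, new_reserve_out).
    """
    price_before = reserve_in / reserve_out
    amount_out = reserve_out * amount_in / (reserve_in + amount_in)
    new_in, new_out = reserve_in + amount_in, reserve_out - amount_out
    price_after = new_in / new_out
    slippage = price_after / price_before - 1.0
    if slippage > max_slippage:
        raise SlippageExceeded(f"trade would move the price by {slippage:.0%}")
    return amount_out, new_in, new_out


# A moderate trade passes (the price moves by 21%) ...
print(guarded_swap(1_000.0, 100.0, 100.0))

# ... while a trade worth half of the pool's reserve is rejected (125%).
try:
    guarded_swap(1_000.0, 100.0, 500.0)
except SlippageExceeded as err:
    print("rejected:", err)
```

Such a cap does not stop an adversary from splitting a trade over several transactions, which is one of the downsides alluded to above.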

Outlook

In the future, we anticipate DeFi protocols eventually starting to comply with a higher standard of security testing, both within the protocol itself and as part of integration testing into the DeFi ecosystem. We believe that eventually, this may lead to some form of DeFi standards when it comes to financial security, similar to what is imposed on banks and other financial institutions in traditional centralized (government-controlled) finance.

We anticipate that whole-system penetration testing and an analytical approach that models the space of possibilities, as in this work, are two ways to improve future DeFi protocols.

Libra: Succinct Zero-Knowledge Proofs with Optimal Prover Computation

Hacking Distributed

Wed, 12 Feb 2020 02:00:00 -0800

This blog post is based on a paper authored by Tiancheng Xie, Jiaheng Zhang, Yupeng Zhang, Charalampos Papamanthou and Dawn Song.

TL;DR Libra is a zero-knowledge proof protocol that achieves extremely fast prover time and succinct proof size and verification time. Not only does it have good complexity in terms of asymptotics, but also its actual running time is well within the bounds of enabling realistic applications. It can be applied in areas such as blockchain technology and privacy-preserving smart contracts. It is currently being implemented by Oasis Labs.

Introduction and Motivation

Zero-knowledge proofs (ZKP) are cryptographic protocols between two parties, a prover and a verifier, in which the prover wants to convince the verifier about the validity of a statement without leaking any extra information beyond the fact that the statement is true. For example, the verifier could confirm that the prover computes some function ‘F(w) = y’ correctly even without knowing the input ‘w.’ Since they were first introduced by Goldwasser et al. [1], ZKP protocols have evolved from purely theoretical constructs to practical implementations. They have achieved proof sizes of just hundreds of bytes and verification times of a few milliseconds, regardless of the size of the statement being proved. Due to this successful transition to practice, ZKP protocols have found numerous applications, not only in the traditional computation delegation setting, but also in blockchain settings such as providing privacy of transactions in deployed cryptocurrencies (e.g., Zcash [2]).

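A ZKP protocol must meet three criteria: completeness (an honest prover can always convince the verifier of a true statement), soundness (a cheating prover cannot convince the verifier of a false statement, except with negligible probability), and zero knowledge (the verifier learns nothing beyond the validity of the statement).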

An explanation of the formal security requirements for zero-knowledge proofs.

Despite such progress in practical implementations, ZKP protocols are still notoriously hard to scale to large statements due to a particularly high overhead for generating the proof. For most protocols, this is primarily because the prover has to perform a large number of cryptographic operations, such as exponentiation in an elliptic curve group. To make things worse, the asymptotic complexity of computing the proof is typically more than linear, e.g., ‘O(C log C)’ or even ‘O(C log^2 C),’ where ‘C’ is the size of the statement. Therefore, designing ZKP protocols that enjoy linear prover time as well as succinct proof size and verification time is an open problem.

Resolving this problem has significant practical implications. For example, we could generate proofs for larger statements on a blockchain within an acceptable time bound, which could further extend the privacy of transactions or smart contracts.

Enter Libra (not affiliated with the Libra project launched by Facebook)

Originally proposed in a paper from early 2019, Libra solves the problem described above with a zero-knowledge proof protocol that has three important properties:

Optimal prover time: Libra only needs time that is linear in the statement size to generate a proof.
Succinct verification time and proof size: both the proof size and the verification time in Libra are logarithmic in the statement size.
Universal trusted setup: Libra only needs a one-time trusted setup to generate the public parameters which can be used for all statements to be proved, which explains the term “universal”.

The underlying protocols of Libra are an interactive proof protocol proposed by Goldwasser, Kalai, and Rothblum in [5] (referred to as the GKR protocol), and the verifiable polynomial delegation (VPD) scheme proposed by Zhang et al. in [6]. Libra comes with a one-time trusted setup (not a per-statement trusted setup) that depends only on the size of the input (witness) to the statement that is being proved.

In the original GKR protocol, the prover could only generate a proof of the statement without satisfying the zero-knowledge property. That means the verifier could learn secret information of the prover from the proof itself, which we want to avoid. In addition, in the GKR protocol, the computation of the statement is based on a layered arithmetic circuit (a layered circuit with only addition and multiplication gates), and the time the prover needs to generate the proof is polynomial in the number of gates in this circuit. This is slow if the circuit size is large.
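For readers unfamiliar with layered arithmetic circuits, the following sketch shows what evaluating one looks like. It only illustrates the circuit model, not the GKR protocol or Libra itself; the gate encoding, the function name evaluate and the choice of the prime field modulus 2^61 - 1 are assumptions made for this example.

```python
P = 2**61 - 1  # a prime modulus chosen for illustration


def evaluate(layers, inputs):
    """Evaluate a layered arithmetic circuit over the field of integers modulo P.

    `layers` is ordered from the input side to the output side. Each gate is a
    tuple (op, left, right), where op is "add" or "mul" and left/right index
    into the values of the previous layer.
    """
    values = [x % P for x in inputs]
    for layer in layers:
        next_values = []
        for op, left, right in layer:
            if op == "add":
                next_values.append((values[left] + values[right]) % P)
            elif op == "mul":
                next_values.append((values[left] * values[right]) % P)
            else:
                raise ValueError(f"unknown gate type: {op}")
        values = next_values
    return values


# Two layers computing (x0 * x1) + (x2 + x3).
circuit = [
    [("mul", 0, 1), ("add", 2, 3)],
    [("add", 0, 1)],
]
print(evaluate(circuit, [3, 5, 2, 7]))
```

The number of gates, summed over all layers, is the circuit size C that the prover time is measured against.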

Libra’s contribution is solving these two problems in the GKR protocol and could be summarized as follows:

GKR protocol with linear prover time. Libra features a new linear-time algorithm to generate a GKR proof. Our new algorithm does not require any specific structure in the circuit, and our result subsumes all existing improvements on the GKR prover that assume special circuit structures, such as regular circuits in [7], data-parallel circuits in [7,8], and circuits with different sub-copies in [9].
An efficient approach to turn Libra into zero-knowledge. We show a way to mask the responses of our new linear-time prover with small random polynomials so as to meet the zero-knowledge property. This new zero-knowledge variant of the protocol introduces minimal overhead on the verification time compared to the original (unmasked) GKR protocol.
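As background for the first contribution: GKR is built on the sumcheck protocol, in which a prover convinces a verifier of the sum of a polynomial over all Boolean inputs. The sketch below is not Libra's algorithm. It shows the textbook case of a single multilinear polynomial given by its table of evaluations, with an honest prover and the verifier simulated in one function; the function name sumcheck and the field modulus are our own choices. Because the table is halved in every round, the prover's total work is linear in the table size, which gives a flavour of what a linear-time prover means.

```python
import random

P = 2**61 - 1  # prime field modulus chosen for illustration


def sumcheck(table, claimed_sum, rng):
    """Simulate sumcheck for a multilinear polynomial.

    `table` holds the evaluations on {0,1}^n (its length must be a power of
    two; the first variable is the most significant index bit). Returns True
    if the verifier accepts `claimed_sum`.
    """
    claim = claimed_sum % P
    table = [v % P for v in table]
    while len(table) > 1:
        half = len(table) // 2
        low, high = table[:half], table[half:]
        g0, g1 = sum(low) % P, sum(high) % P  # prover's message: g(0) and g(1)
        if (g0 + g1) % P != claim:            # verifier's consistency check
            return False
        r = rng.randrange(P)                  # verifier's random challenge
        claim = (g0 + r * (g1 - g0)) % P      # g(r), since g has degree one
        # Prover fixes the current variable to r: one linear pass, half the size.
        table = [(a + r * (b - a)) % P for a, b in zip(low, high)]
    # Final check: the verifier evaluates the polynomial at the random point.
    return table[0] == claim


values = [3, 1, 4, 1, 5, 9, 2, 6]
print(sumcheck(values, sum(values), random.Random(0)))      # honest claim
print(sumcheck(values, sum(values) + 1, random.Random(0)))  # false claim
```

In a real protocol the prover and verifier are separate parties and the final evaluation is obtained independently of the prover's table, for example through a polynomial commitment.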

Comparison with existing ZKP protocols

Table 1 shows a detailed comparison between the asymptotic complexity of Libra and the existing ZKP protocols. A first observation is that Libra has the best prover time among all existing protocols, indicated as row P in the table. In terms of asymptotics, Libra is the only protocol that satisfies all of the following properties simultaneously: linear prover time, succinct verification, and succinct proof size for structured circuits. The only other protocol with linear prover time is Bulletproofs, whose verification time is linear, even for structured circuits. On the practical front, Bulletproofs’ prover time and verifier time are high due to the large number of cryptographic operations required for every gate of the circuit.

The proof size and verification time of Libra are also competitive with other protocols. In asymptotic terms, our proof size is only larger than that of libSNARK and Bulletproofs, and our verification is only slower than that of libSNARK and libSTARK. Compared to Hyrax, which is based on techniques similar to ours, Libra improves the performance in all aspects at the cost of a one-time trusted setup.

Comparison of Libra to existing ZKP protocols, where (G, P, V, |π|) denote the trusted setup algorithm, the prover algorithm, the verification algorithm and the proof size respectively. Also, C is the size of the circuit with depth d, and n is the size of its input.

Implementation and evaluation

Software. We implement Libra, our new zero-knowledge proof protocol, in C++. Our protocol provides an interface that takes as input a generic layered arithmetic circuit and generates a zero-knowledge proof according to the circuit and the input of the circuit. We support a class of 512-bit unsigned integers that improves on the performance of the GMP library in specific cases and use it together with GMP for large numbers and field arithmetic. We use the popular cryptographic library “ate-pairing” on a 254-bit elliptic curve for the bilinear map used in zero-knowledge VPD. We have released the code as an open-source system (https://github.com/sunblaze-ucb/Libra).

Hardware. We run all of the experiments on Amazon EC2 c5.9xlarge instances with 70GB of RAM and an Intel Xeon Platinum 8124M CPU with 3GHz virtual cores. Our current implementation is not parallelized and we only use a single CPU core in the experiments, so we expect that the reported numbers can be improved further. We report the average running time of 10 executions.

Methodology and benchmarks. We compare our GKR protocol to these variants on the benchmarks below:

Matrix multiplication: Prover P proves to the verifier V that it knows two matrices whose product equals a publicly known matrix. We evaluate on matrix sizes from 4×4 to 256×256.
Image scaling: P proves to V that it correctly computes a low-resolution image by scaling a high-resolution image using the classic Lanczos resampling method. We evaluate by fixing the window size and increasing the image size from 112×112 to 1072×1072.
Merkle tree: P proves to V that it knows the values of the leaves of a Merkle tree that hashes to a public root value. We use SHA-256 as the hash function and increase the number of leaves from 16 to 256 in the experiments.
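To make the last benchmark concrete, the sketch below computes the statement being proved in the clear: the root of a Merkle tree over SHA-256. The exact tree construction used in the benchmark (leaf encoding, padding, hashing of leaves) is not specified here, so the choices below, including the function name merkle_root, are assumptions.

```python
import hashlib
from typing import List


def merkle_root(leaves: List[bytes]) -> bytes:
    """Root of a binary Merkle tree over SHA-256.

    Assumes a power-of-two number of leaves; each leaf is hashed once, and
    every inner node is the hash of its two children concatenated.
    """
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("number of leaves must be a power of two")
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


leaves = [i.to_bytes(4, "big") for i in range(16)]  # 16 leaves, the smallest benchmark size
print(merkle_root(leaves).hex())
```

In the benchmark, the prover shows knowledge of leaves that hash to the public root without revealing them; the circuit being proved encodes exactly this hash computation.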

We report the prover time, proof size and verification time in Figure 1.

The results are as follows. As shown in Figure 1(a)(b)(c), the prover in Libra is the fastest among all systems in all three benchmarks we tested. Figure 1(d)(e)(f) show the verification time. Our verifier is much slower than those of libSNARK and libSTARK, which run in 1.8ms and 28-44ms respectively in all the benchmarks. Compared with the remaining systems, Libra's verification is faster, as it grows sub-linearly with the circuit size. We report the proof size in Figure 1(g)(h)(i). Our proof size is much bigger than that of libSNARK and Bulletproofs, but it is better than that of Aurora, Hyrax, Ligero and libSTARK.

Immediate Implementation. Tiancheng Xie and Jiaheng Zhang interned at Oasis Labs in summer 2019 and worked to implement Libra. The Oasis Team has plans to further develop it in the future.

The Next Step after Libra: Introducing Virgo!

Based on the exciting results of Libra, we set out to solve one vital limitation of our proposal, namely “the trusted setup”. In our new follow-up project called “Virgo”, we propose a transparent ZKP protocol with even better prover time and verification time. The proof size becomes larger, but it is reasonable and still works well in practice. To be specific, the prover time of Virgo is O(C + n log n), while both the proof size and the verification time are O(d log C + log^2 n) for a d-depth circuit with n inputs and C gates. Our scheme only uses lightweight cryptographic primitives such as random oracles and is post-quantum secure. Our implementation of the protocol shows that it only takes 50 seconds to generate a proof for a circuit computing a Merkle tree with 256 leaves, at least an order of magnitude faster than existing transparent schemes. The verification time is 50ms, and the proof size is 253KB, both competitive with existing transparent zero-knowledge proof protocols.

Summary. We have introduced Libra, a zero-knowledge proof protocol that achieves extremely fast prover time and succinct proof size and verification time. Not only does it have good complexity in terms of asymptotics, but also its actual running time is well within the bounds of enabling realistic applications. Our cryptographic technique can be applied in other application areas such as blockchain technology and privacy-preserving smart contracts. To learn more about Libra, you can read our full paper here. You can view the code here.

We thank Sarah Allen, IC3 Community Manager, for her help in compiling this text.

Liberating web data using DECO, a privacy-preserving oracle protocol

Hacking Distributed

Tue, 03 Sep 2019 07:00:00 -0700

Whether you are initiating a transfer on your bank's website, posting content on your social media account, or even viewing a public website including this very blog post, TLS is at work to make sure that all data sent between you and the web server is authentic and confidentially transmitted. But while TLS is excellent at convincing the protocol participants (i.e., you and the web server in the above example) that all data is authentic and secure, it turns out that TLS is useless if you want to convince someone else that you sent or received some particular data to or from a website. So while you can log into your bank account and check your balance, there's no way to use TLS to convince a third party that your bank account has a particular balance. As we will see, this is a serious limitation, and one which we address in a new paper.

Challenges

An oracle is a service that provides and vouches for the authenticity of data. In systems such as smart contract platforms that are unable to directly access the web, oracles serve as a critical link that enables these systems to make decisions based on real world data. When an oracle is tasked with providing publicly available data (e.g., stock prices), its task is relatively straightforward. But consider an oracle that is asked to provide data about a specific user: confirm that Alice, according to an online government portal, is above a certain age or that Bob's bank account balance is above a certain threshold. For these examples, the oracle needs to vouch for data from a TLS session that is only available when logged in as a particular user. The question becomes then: Considering TLS's limitations, how do you convince the oracle of the correctness and authenticity of this data?

You can of course give the oracle your credentials (e.g., your banking username and password) and have them log in to check your balance, but this would require a high level of trust in the oracle. For security reasons, you definitely don't want the oracle to have full write access to your bank account, but for privacy reasons, you probably don't even want the oracle to have full read access to your account. All that the oracle needs to know, in the banking example, is your account balance, and the critical question is: Can we allow the oracle to validate this information without giving it access to any additional data? DECO answers this question in the affirmative.

Introducing DECO

DECO is a novel privacy-preserving oracle protocol, created by students and faculty at IC3.

DECO is not the first to introduce this problem, and indeed other solutions have been proposed. But DECO is the first solution to work with modern versions of TLS that does not require the use of trusted hardware or active participation from the web server that provides the data. In DECO, anyone can serve as an oracle for any website, and strong privacy from the oracle is guaranteed. Where previous solutions have used trusted hardware, DECO relies on highly optimized multi-party computation and zero-knowledge proving techniques that allow the oracle to participate in a TLS session while only learning the exact data that it's trying to verify.

DECO has a variety of other applications as well. It turns out that even for publicly available data, DECO is useful, as it allows smart contracts to use an oracle service without even revealing to the oracle the rules that the smart contract is enforcing. In other realms, DECO is also useful for users who want to monetize their own data (and therefore prove that they are indeed providing correct data) without giving away anything but the data that they are selling.

In those cases where trusted hardware isn't available or can't be used, we think DECO is a leap forward for oracle technologies. We encourage you to read our new paper for full technical details, and check out our website.

We're also excited about Chainlink's plans for an initial PoC of DECO, with a focus on decentralized finance applications such as Mixicles.

Stay tuned!

On Stablecoins and Beauty Pageants

Hacking Distributed

Tue, 07 May 2019 02:30:00 -0700

#^On Stablecoins and Beauty Pageants

Image/photo

Beauty pageants are not a great way to get coherent answers to hard questions. Or, like, such as, easy ones.

In this post, we identify a problem that plagues many stablecoin implementations: incorporating price information.

Stablecoins have recently emerged as a new class of crypto-assets designed to have low volatility relative to some other asset, such as the US dollar. Usually, this is done by pegging the value of the cryptocurrency. In the case of a peg to the USD, the cryptocurrency is meant to trade at a price at or near $1. This typically involves some algorithmic processes that depend on how the current price of the coin differs from the targeted peg.

But how does the contract “know” what the price is? Some current implementations depend on an external price feed, known as an oracle. But oracles are considered undesirable because it’s difficult to know which feeds can be trusted, and selecting a set of trusted feeds introduces centralization. This has prompted many researchers to explore solutions where users vote on the current price.

We argue that approaches based on users voting on the current price, used commonly in various algorithmic stablecoin designs, are broken.

Background

In the proposed implementations, currency holders vote on the current price of the stablecoin. Stablecoin prices are then modulated by adjusting the supply of the coin when it moves away from $1. Typically, when the price of the stablecoin is too low, a secondary coin is auctioned and the proceeds are burned to contract the supply and raise the price. The secondary token has value because when the price is too high, coins are minted to the holders of the secondary token.

This creates problematic incentives when it comes to voting. Suppose the price of the currency is trading above $1. Participants could dutifully report the truth, and trigger the mechanism that dilutes the coin to reduce its price. But this would result in a net loss for them. Instead, they have an incentive to report a price that is lower than the truth, so there is less of the currency put into circulation. It is in the best interest of the participants to falsely claim that the price is still $1, or even lower. Such false reporting will lead to a short-term gain for the current holders, since no new coins will be printed, enabling the current holders to liquidate their coins above the targeted peg value. In the longer term, false reporting might lead to an unstable dynamic, where participants report a false value and drive up the prices, only to be followed by people realizing that there is a bubble and deciding to dump their coins. Since such a dynamic is obviously not desirable for a stablecoin, existing coins have proposed a simple mechanism to discourage this kind of gaming of the system.

The predominant solution, proposed for both Basis and Carbon, is to slash the funds of anyone who votes below the 25th or above the 75th percentile. On the surface, this may seem like a good way to encourage truth telling, but it actually reinforces the incentives to game the system. A rational, self-interested actor will not necessarily report the truth; instead, they will report whatever they think everyone else will say is the true value, because they don’t want to be penalized. This leads to a dynamic known as a Keynesian beauty pageant.

Suppose the objective true price is $1.05. As discussed above, holders of the token have an incentive to not truthfully report the price of the coin, because it will result in them being diluted. Everyone who realizes this should prefer to report a lower price. Usually, it is difficult for many independent actors to coordinate on an untruthful equilibrium. However, since this is a coin designed to trade at the price of $1, the peg itself serves as a second equilibrium that is easy to coordinate on. Furthermore, since the coin is designed to trade at $1, lazy voters should choose $1 as their response because it should always be in the vicinity of the truth. Once people realize that other people will act according to these incentives, they, too, should report an incorrect price to avoid being slashed. This is especially the case if there is a coordinating mechanism across enough users, like social media. If someone who has a lot of followers and holds a lot of the stablecoin says they will vote for an incorrect price, it’s irrational for a smaller holder to vote otherwise.
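To see how the slashing rule punishes honesty in this scenario, here is a small Python sketch. It is only an illustration: the function name slashed_voters, the unweighted one-vote-per-holder tally, and the particular way the 25th and 75th percentiles are computed are our assumptions, not the rules of Basis or Carbon.

```python
def slashed_voters(votes):
    """votes maps a voter name to a reported price in cents.

    Returns the voters whose report falls below the 25th or above the
    75th percentile of all reports (simple rank-based percentiles,
    chosen here only for illustration).
    """
    ordered = sorted(votes.values())
    low = ordered[int(0.25 * (len(ordered) - 1))]
    high = ordered[int(0.75 * (len(ordered) - 1))]
    return sorted(name for name, price in votes.items()
                  if price < low or price > high)


# The market price is 105 cents, but most holders report the 100-cent peg.
votes = {f"holder{i}": 100 for i in range(8)}
votes["honest1"] = 105
votes["honest2"] = 105

# The only voters who lose funds are the ones who told the truth.
assert slashed_voters(votes) == ["honest1", "honest2"]
```

In this toy tally both percentile cut-offs land on the peg, so reporting the true price is exactly the move that gets slashed.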

Exacerbating this problem is that crypto markets are so volatile that even if you could somehow elicit truthful answers from everyone (so everyone reported what they really thought the exchange rate is instead of voting strategically) people still might report the wrong price. People would most likely consult some price feed or exchange that may have stale prices or invented volume, and use that as the price. In this case, even with everyone voting as accurately as they can, this is strictly worse than using a price feed because there will be some lag between when prices appear on a feed and when users vote.

Wisdom of the crowds is a sensible decision making paradigm for applications that require pooled public knowledge. However, as is currently being done, crowd-sourcing of pricing data feeds is fundamentally broken for stablecoins because the incentives are misaligned. Stablecoins that rely on this fundamental mechanism should be considered broken.

Decentralize Your Secrets with CHURP

Hacking Distributed

Fri, 05 Apr 2019 02:00:00 -0700

#^Decentralize Your Secrets with CHURP

At the heart of decentralized systems today is a demoralizing irony. Vast resources---intellect, equipment, and energy---go into avoiding centralized control and creating "trustless" systems like Bitcoin. But hapless users then defeat the whole purpose of these systems by handing over their private keys to centralized entities. Or worse still, they lose their keys, sending about $14 billion in Bitcoin into a black hole (according to an estimate made two years ago).

But who can blame them? Even experts find key management hard. That's why, decades after its invention, public-key cryptography is rarely used for things as simple as e-mail encryption.

It's also why the hapless users of cryptocurrency in question aren't "they." They're also "we." Even at IC3, some of us shamefacedly use centralized exchanges instead of managing our own keys.

Wouldn't it be nice if true decentralization were within the grasp of ordinary users, if there were a system that: (1) made user key management easier and (2) was itself actually decentralized? And, even better, if it could: (3) manage keys for entities that can't store secrets, such as smart contracts?

These are the goals of CHURP (CHUrn-Robust Proactivization), a new system we've developed at IC3 and presented in a paper. CHURP is an open-source project already in plans for further development and adoption at Oasis Labs, and we hope to see it used in many other places.

Committees: Here today, gone tomorrow

Decentralized key management, like single-user key management, is easy in principle. A committee of n nodes can hold a private key SK that is distributed using (t,n)-secret sharing. An adversary must then corrupt t+1 nodes in order to steal SK. This creates a strong obstacle to compromise.

Any t+1 nodes can act in concert to access SK, ensuring high uptime. In fact, by means of threshold cryptography, nodes can perform operations with SK using individual shares, never explicitly reassembling and exposing SK. The figure below shows a (2,5)-secret sharing. Each of the five nodes (A1, A2, ..., A5) has a share (the small blue square) of the secret (the orange oval). Any three nodes can jointly compute the secret. The idea is similar to multisig addresses in cryptocurrency.

Image/photo

An example of (2, 5)-Shamir secret sharing.
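To make the (t,n) notation concrete, here is a minimal Python sketch of Shamir secret sharing over a prime field. It is an illustration only: the modulus, the function names split_secret and recover_secret, and the single trusted dealer are our assumptions for the example, and it performs the explicit reconstruction of SK that threshold cryptography is designed to avoid.

```python
import secrets

# Illustrative prime modulus (the Mersenne prime 2**127 - 1).
PRIME = 2**127 - 1


def split_secret(secret, t, n):
    """Return n points on a random degree-t polynomial whose constant term is the secret."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the shared polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


sk = 123456789
shares = split_secret(sk, t=2, n=5)           # one share per node A1..A5

assert recover_secret(shares[:3]) == sk       # any t+1 = 3 shares suffice
assert recover_secret(shares[2:5]) == sk      # a different set of 3 works too
```

With only two shares, interpolation yields a value unrelated to SK except with negligible probability, which is what makes corrupting t nodes useless to the adversary.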

A number of decentralized systems, e.g., Enigma, Calypso and Coconut, make use of such key management via committees. It's a compelling option, but there are some lurking problems.

The first problem is the risk of mobile adversaries. It may be hard for an adversary to compromise t+1 nodes in a short space of time. If a static set of nodes stores SK indefinitely, though, an adversary can attack nodes gradually, ultimately corrupting the t+1 it needs to learn SK.

The classical countermeasure to mobile adversaries, devised in the 1990s, is known as proactivization. Nodes periodically refresh their shares, i.e., perform a fresh secret-sharing of SK. If proactivization happens on, say, a daily basis, then the shares that an adversary compromised yesterday no longer count today. In our (2,5)-secret sharing example, if the adversary compromised 2 shares yesterday and 2 today, despite having 4 shares, she can't reassemble SK. Yesterday's shares aren't compatible with today's. Proactivization forces an adversary not just to compromise t+1 nodes, but to do so quickly, before a refresh happens.
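The refresh step can be sketched in a few lines of Python. This is a toy under stated assumptions: a single party picks the random mask (in a real proactive scheme the nodes generate it jointly so that nobody learns it), and the helper names share, refresh and interpolate_at_zero are ours.

```python
import secrets

PRIME = 2**127 - 1  # illustrative field modulus


def eval_poly(coeffs, x):
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % PRIME
    return y


def interpolate_at_zero(shares):
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


def share(secret, t, n):
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t)]
    return [(x, eval_poly(coeffs, x)) for x in range(1, n + 1)]


def refresh(shares, t):
    """Add a random degree-t polynomial with zero constant term to every share.

    The constant term is zero, so the shared secret does not change.
    """
    mask = [0] + [1 + secrets.randbelow(PRIME - 1) for _ in range(t)]
    return [(x, (y + eval_poly(mask, x)) % PRIME) for x, y in shares]


sk = 2019
yesterday = share(sk, t=2, n=5)
today = refresh(yesterday, t=2)

assert interpolate_at_zero(today[:3]) == sk   # fresh shares still encode SK

# Two stale shares plus one fresh share interpolate to a value that
# equals SK only with negligible probability.
mixed = interpolate_at_zero(yesterday[:2] + today[2:3])
```

The mask shifts every share along a new random polynomial, so shares from different days lie on different curves and cannot be combined.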

There's a second big problem, though---particularly in decentralized systems. Nodes may enter and leave the system. Thus the set of nodes in a committee may change over time, i.e., they're subject to churn.

Churn is a problem that existing secret-sharing schemes---even proactive ones---simply don't solve.

Churn is a Challenge

In order to understand why committee churn is not an easy problem, let's consider a naive strategy for handling it.

Image/photo

Handoff between two committees.

The figure above shows two committees---equal-sized old and new committees. Due to churn, some nodes in the old committee leave (A2 and A3), while new nodes replace them (B2 and B3). For the purpose of this example, assume that both the committees use (2,5)-secret sharing for some secret SK. (2,5)-secret sharing is meant to protect against compromise of two nodes. So let's assume that a mobile adversary can control two nodes in each of the old and new committees.

A naive strategy might directly transfer shares between the old nodes and the corresponding new ones that replace them. In particular, in the above example, node A2 could give its share to node B2 before leaving, while node A3 could give its share to node B3. But this quickly falls apart in the face of a mobile adversary. This adversary could corrupt nodes A1 and A2 in the old committee and B2 and B3 in the new committee. B2 holds the share the adversary already took from A2, but B3 hands her a new one: the share that used to belong to A3. The adversary therefore learns 3 shares in total. Since we're using a (2,5)-secret sharing, she learns SK, breaking the system. [1]

CHURP Comes to the Rescue

In a nutshell, CHURP is a proactive secret-sharing system that solves the above problem, and handles committee churn securely. It's not the first system to do this, but it's the first practical one.

The key innovation in CHURP is something called dimension-switching. Suppose, in our example above, it were somehow possible to switch temporarily from a (2,5)-sharing of SK to a (4,5)-sharing during the handoff from the old committee to a new one. Then, despite being able to learn 3 shares, the adversary would not learn SK.

Dimension-switching essentially "dilutes" the secret shares thus preventing leakage despite the adversary learning more during the handoff. CHURP uses bivariate polynomials (two dimensional polynomials) to share the secret. Switching from (2,5)-sharing to (4,5)-sharing can be achieved by switching between the two dimensions of the bivariate polynomial. For more details of our construction, please refer to the full paper.
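The following Python toy illustrates why a bivariate polynomial can offer two different thresholds for the same secret. It is not the CHURP protocol: the degrees (t in x, 2t in y), the choice of evaluating along the two axes, and all function names are our assumptions for the sketch, and a real handoff never gathers shares in one place the way this demo does.

```python
import secrets

PRIME = 2**127 - 1  # illustrative field modulus


def random_bivariate(secret, deg_x, deg_y):
    """coeffs[a][b] multiplies x**a * y**b; coeffs[0][0] is the secret."""
    coeffs = [[secrets.randbelow(PRIME) for _ in range(deg_y + 1)]
              for _ in range(deg_x + 1)]
    coeffs[0][0] = secret
    return coeffs


def evaluate(coeffs, x, y):
    total = 0
    for a, row in enumerate(coeffs):
        for b, c in enumerate(row):
            total = (total + c * pow(x, a, PRIME) * pow(y, b, PRIME)) % PRIME
    return total


def interpolate_at_zero(points):
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


t, n = 2, 5
sk = 31337
B = random_bivariate(sk, deg_x=t, deg_y=2 * t)

# Shares taken along the x-axis lie on a degree-t curve: a (2,5)-sharing.
along_x = [(i, evaluate(B, i, 0)) for i in range(1, n + 1)]
# Shares taken along the y-axis lie on a degree-2t curve: a (4,5)-sharing.
along_y = [(j, evaluate(B, 0, j)) for j in range(1, n + 1)]

assert interpolate_at_zero(along_x[:t + 1]) == sk       # 3 shares suffice
assert interpolate_at_zero(along_y[:2 * t + 1]) == sk   # all 5 are needed
```

Both sets of shares encode the same B(0,0), but an adversary holding three of the y-axis shares is still two short of the threshold.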

Another key innovation in CHURP is a tiered protocol that achieves high performance and strong robustness simultaneously. By default, CHURP uses an optimistic path. It assumes that all nodes execute the specified protocol correctly. In this case CHURP is highly efficient. If any node cheats (e.g., it sends malformed messages), however, CHURP can efficiently detect the fact and then switch to an alternative, pessimistic execution path. In this case, the protocol runs slower but is resilient to cheating players. The optimistic path in CHURP is especially communication-efficient. The best known protocol prior to CHURP [Schultz07] incurs 5GB of network bandwidth for a 100-node committee. By comparison, CHURP (optimistic path) incurs only 2MB---a 2300x improvement! In fact, even the pessimistic path of CHURP performs better than any previously known protocol.

CHURP has some other bells and whistles. For example, it uses a trusted setup phase, as required by a special commitment scheme [Kate10] that helps keep communication costs low. But if this trusted setup fails, CHURP still remains secure. The innovation here is a hedge---an additional verification step that detects compromised trusted setup and switches to a secondary pessimistic path that avoids the vulnerable commitment scheme, at the cost of some additional slowdown.

Despite the technical intricacy, using CHURP in your project is easy. At a high level, CHURP provides a concise API that enables periodic committee rotation without changing the secret. We strongly encourage you to check out the code and play with the demo.

Lots of Applications

Blockchain systems, by nature, cannot store private data. The ability of CHURP to store and manage private keys through dynamic committees enables interesting applications without introducing centralization. Below, we briefly enumerate a few of the most important potential applications of CHURP.

1) Cryptocurrency Management: Rather than relying on centralized exchanges to store private keys on behalf of users, or using hardware or software wallets, which are notoriously difficult to manage, users could instead store their private keys with committees. These committees could authenticate users and enforce access-control, resulting in the decentralized equivalent of today's exchanges.

2) Decentralized Identity: Initiatives such as the Decentralized Identity Foundation, which is backed by a number of major IT and services firms, envision an ecosystem in which users control their identities and data by means of private keys. Who will store these keys and how is an open question. The same techniques used for private key management would similarly apply to assets such as identities.

3) Smart-contract attestations: CHURP could augment smart contracts with confidential state, allowing them to, e.g., produce attestations regarding blockchain state change. Such signing would be of particular benefit in creating a simple smart-contract interface with off-chain systems. For example, control of Internet-of-Things (IoT) devices is a commonly proposed application of smart contracts (smart locks being a notable early example). If smart contracts cannot generate digital signatures, then the devices they control must monitor a blockchain, a resource intensive operation infeasible for IoT devices. A smart contract that can generate a digital signature, however, can simply issue authenticable commands to target devices.

If you are interested in learning more about CHURP, please check out our website, code, or even the paper, co-authored with Lun Wang, Andrew Low, Yupeng Zhang, and Dawn Song, all of UC Berkeley. We are excited to hear about any challenging use-cases for CHURP you might have!

[1]	There are other issues with this naive strategy, such as the assumptions that the committees are equal-sized and that all nodes stay alive until the new replacing nodes join. We don't make any such assumptions in the actual protocol.

The Old Fee Market is Broken, Long Live the New Fee Market

Hacking Distributed

Tue, 22 Jan 2019 03:31:00 -0800

#^The Old Fee Market is Broken, Long Live the New Fee Market

Almost all cryptocurrencies today require their users to attach fees to their transactions [1]. The miners then collate transactions paying the highest fees into the blockchain, and derive an income stream. This mechanism is superficially appealing, and has led some to push hard for a blockchain vision, dubbed the "fee market," driven almost entirely by such fees.

Much of the BTC/BCH split stemmed from a difference of vision around this central point, where BTC developers wanted to steer Bitcoin away from a reliance on block rewards and towards higher transaction fees. In contrast, BCH developers reacted strongly to high fees and wanted to keep the economics of miner compensation centered primarily around block rewards. Both sides make good points: block rewards are not sustainable, because the number of coins outstanding is fixed and therefore the rewards must diminish over time. At the same time, high fees lead to terrible user experiences, where some users paid as much as $55 to send transactions, while others complained bitterly of stuck transactions.

In this post, we describe why the predominant fee paradigm used in cryptocurrencies is broken. We describe the reasons why the so-called "fee market" will not lead to a stable, predictable user experience.

In addition, we provide an alternative way to charge fees that yields a sensible fee market that has more stable, and therefore more predictable, pricing. This novel mechanism also provides lower variance, and therefore higher predictability of returns, for miners.

An analysis of how our mechanism would behave in Bitcoin shows that it could have saved users over $272 million over December 2017, and could have reduced the variance of miners' fee revenues by a factor of 7.4.

Problem with the Current Fee Mechanism

Under the current fee paradigm, a user wishing to submit a transaction must figure out an appropriate fee. This seems straightforward, but it turns out to be a very difficult task.

Fees and Cognitive Load

The first problem is the cognitive load on the user: it is difficult to decide exactly how much to bid, whether to go over or under. Surely, her bids will be capped at the utility of the transaction to her (say, X mBTC). But between 0 and X, there are many options, and picking the right one depends on a lot of other factors. Exactly how important is this transaction to her right now? How full is the mempool? What are the competing bids? Given that there will be, in expectation, another 10 minutes of transactions streaming in to the miners before the next block is discovered, how low a fee can she get away with while still getting her transaction included in the blockchain? These are clearly difficult questions. Doing the right thing involves watching the chain closely and monitoring the transaction until it is included, an act that detracts from the hassle-free use of one's money. One may be tempted to just go through a middleman, such as an exchange, to handle it all, which creates centralization and simply recreates the current banking system except with unregulated exchanges as centralized custodians. This has, historically, led to a stream of exit scams and other SFYL-events [2].

Broken Attempts

But the real problem is much more fundamental, and it stems from the fact that the fee mechanism in Bitcoin and other currencies is implemented as a pay-what-you-bid, or multi-unit first-price, auction. This fee behavior will lead to "sticky" and unnecessarily high fees, followed by sudden fee collapses, just as we have seen over the course of the last few years.

To see why, imagine a universe where everyone is using a simple historical fee estimator. Specifically, imagine a fee estimator that looks back on historical transactions and suggests a fee based on what happened on average in the past. If transactions were paying an average of 100 satoshis per byte in the recent past, then a user will simply attach a fee of 100 or more satoshis per byte.

This approach is completely broken. During times of congestion, the fees to get included will naturally go up, as they should. If the transactions momentarily arrive faster than blocks are found, the fees attached will go up. But they will remain high even after the congestion has ended. If, for instance, the rate of arrival for transactions is exactly equal to the rate at which blocks clear them, the system should be able to support 0-fees. And yet this approach will force the users to pay fees as if they were operating at the height of congestion. The fee structure that arises during congestion is ensconced in the system even though the conditions have changed, an artifact of bad mechanism design.
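A toy simulation makes the stickiness visible. Everything in it is an assumption made for illustration: every user follows the same estimator (the average fee of the last six blocks), and congestion is modelled simply as users bidding 10% above the estimate in order to get in. The point is only that nothing in a look-back estimator ever pulls fees back down.

```python
def simulate(blocks, congested, start_fee=10.0, bump=1.10):
    """Return the fee (satoshis per byte) paid at each block height.

    Users bid the average of the last six observed fees, or `bump` times
    that average while the mempool is congested.
    """
    history = [start_fee]
    for height in range(blocks):
        window = history[-6:]
        estimate = sum(window) / len(window)
        bid = estimate * bump if height in congested else estimate
        history.append(bid)
    return history[1:]


fees = simulate(blocks=40, congested=range(10, 20))

assert fees[9] == 10.0        # calm before the congestion
assert fees[-1] > fees[9]     # 20 blocks after it ended, fees are still elevated
```

Once congestion ends, each new bid is an average of recent high bids, so the price level settles near its congested peak rather than returning to where it started.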

One can imagine other fee estimation heuristics, such as underbidding on purpose to explore if one can get away with paying less, that would yield better results. But it all is up to the vagaries of other people's choices of fee estimators that would determine the prices. A smart user who notices that the fee estimators are broken would have little recourse except to pay the prevailing fees. The system would provide no mechanism by which optimal choices are made. And of course, the mavericks who do try to explore paying lower fees, just to see if the entire ensemble of users could move to a lower price point, will have to deal with stuck transactions and transaction delays.

Blockchains are not the only place where similar auction mechanisms have led to poor user experiences. When ad placement in search engines was based on first price auctions, researchers observed similar patterns. Advertisers would compete with each other in order to get a better placement, driving the fees sky high. This would be followed by many participants quitting the game, which would cause a precipitous drop, only to restart the unstable cycle all over again.

A Better Fee Market

In a new paper, we propose a new mechanism to charge for transactions. This mechanism is only a slight code change away from the old mechanism, but it has the potential to yield a much more stable fee market, a much better user experience as well as a big savings in fees, and a more predictable revenue stream for miners.

Our proposed mechanism is fairly simple: transactions specify a fee, just like before, and miners place transactions in a block, just like before. Except, instead of charging each transaction the fee it bid, we charge each transaction the lowest fee charged to any transaction in that block. Any surplus fees a transaction offered are returned to that user, to a designated address they specify. Hence, a transaction in effect says "I am willing to pay up to $30 for this transaction," but is charged only $5 if the lowest fee transaction in that block paid $5. The remaining $25 are returned to the user.
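The charging rule can be sketched as follows. This is a simplified illustration, not the specification from the paper: we assume all transactions have the same size (so a bid is a flat amount rather than a per-byte rate), and the function name settle_block is ours.

```python
def settle_block(bids, capacity):
    """bids is a list of (tx_id, max_fee) pairs.

    The `capacity` highest bids are included. Every included transaction
    is charged the lowest included bid; the surplus is refunded.
    Returns (clearing_fee, refunds) where refunds maps tx_id -> refund.
    """
    included = sorted(bids, key=lambda bid: bid[1], reverse=True)[:capacity]
    if not included:
        return 0, {}
    clearing_fee = min(fee for _, fee in included)
    refunds = {tx: fee - clearing_fee for tx, fee in included}
    return clearing_fee, refunds


bids = [("a", 30), ("b", 12), ("c", 5), ("d", 2)]
clearing_fee, refunds = settle_block(bids, capacity=3)

assert clearing_fee == 5
assert refunds == {"a": 25, "b": 7, "c": 0}   # "a" offered 30, pays 5, gets 25 back
assert "d" not in refunds                     # lowest bid did not fit in the block
```

The selection step is identical to today's first-price behavior; only the amount charged changes.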

Our proposed mechanism brings insights from multi-unit second price auctions to the world of cryptocurrencies where currently multi-unit first price auctions are the norm. Whereas before, fee selection was a stressful and difficult task, with our mechanism, users can simply attach to their transaction the true maximum value they would be happy to pay. This is because they are not going to be charged that value: instead, they are charged whatever the minimum was to get into that block. In essence, the lowest paying transaction establishes just how little it took to get into that block, and everyone within that block is charged the same amount per byte. This is not only equitable and fair, but it takes away pressure to play games with fee selection. As an additional bonus, the benefit from playing such games decreases as the blockchain gets more popular, further disincentivizing strategic fee selection. Note also that it picks the highest-paying transactions, just like first-price auctions, though it charges them strictly less than what they would be willing to bear.

Our proposal couples this idea with three other mechanisms to provide a comprehensive solution that prevents malicious behavior by miners. First, if a miner fails to fill a block, they cannot charge any fees. So a miner cannot take a high-paying transaction, ignore the rest of the mempool, and collect the entirety of the fee paid by that transaction while refusing to fill a block. They are, of course, free to fill the rest of the block with their own synthetic transactions, but they will have to pay out of their own pockets for those (and the next point addresses why the miner cannot just pay those additional fees directly to himself). Second, a miner is rewarded the average fee collected from the last B blocks, not just the fees from the block they themselves mined. This ensures that miners also have little incentive to act strategically. Instead their best interests are aligned with maximizing the number of high-value transactions cleared per second. Finally, we propose that every block reserves some space, around 20%, which is exempt from this mechanism. This enables a miner to include transactions of high importance to themselves, such as those used for pool rewards, without affecting the fee mechanism and without being penalized. This addresses the miners' needs and provides a simple migration path from the current state of affairs.
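The first two safeguards can be sketched in Python. This is a rough illustration under our own simplifications: transactions are equal-sized, the roughly 20% exempt space is ignored, and the names block_fee_pool and miner_reward are hypothetical.

```python
def block_fee_pool(num_included, capacity, clearing_fee):
    """Total fees a block contributes: nothing unless the block is full."""
    if num_included < capacity:
        return 0
    return num_included * clearing_fee


def miner_reward(fee_pools, B):
    """Average of the fee pools of the last B blocks, the miner's own included."""
    window = fee_pools[-B:]
    return sum(window) / len(window)


# A miner who includes one lucrative transaction and leaves the block
# otherwise empty collects nothing from it.
assert block_fee_pool(num_included=1, capacity=4, clearing_fee=30) == 0

# Three ordinary full blocks, followed by one with an unusually high clearing fee.
pools = [block_fee_pool(4, 4, 5)] * 3 + [block_fee_pool(4, 4, 20)]
assert pools == [20, 20, 20, 80]

# With B = 4 the miner of the last block receives the average, not the 80.
assert miner_reward(pools, B=4) == 35.0
```

Because the payout is averaged, any fee a miner pays to himself to pad a block mostly flows to the miners of the other blocks in the window.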

Our mechanism brings order to the chaotic fee market of today. During the dramatic Bitcoin price increase in December 2017, we estimate that users would have saved over $272 million in transaction fees and miners would have had a much more predictable fee revenue, with their daily fee variance reduced by an average factor of 7.4. These gains are not surprising -- our mechanism enables transaction fees to correspond to the true demand that users have for block space. This will make the fee markets more predictable. As a result, users can just bid their true value and know that they are not going to be overpaying for block space. Thus, this removes a painful strategic element from simply using cryptocurrencies today.

If you are designing new cryptocurrencies or are an active user of existing currencies, we highly encourage you to push for this fee mechanism instead of unstable first-price auctions.

Footnotes

[1]	Cryptocurrencies that provide fee-free transactions for all are horribly and trivially broken, as they are open to simple, flooding-based denial-of-service attacks.

[2]	SFYL: Sorry For Your Loss.

Ron Lavi, Or Sattath, and Aviv Zohar have proposed a protocol similar to ours; however, the two differ in a few fundamental ways. Similar to our proposal, they propose a protocol in which the winning miner places transactions into a block and charges all transactions the lowest fee proposed by any transaction placed in that block. Lavi, Sattath and Zohar assume a single monopolistic miner, and strive to maximize revenue from fees at a cost of lower social welfare. In contrast, our work explicitly targets maximizing social welfare, and operates under a model with many miners. In their system, the monopolistic miner is incentivized to leave transactions offering positive fees out of the block even if there is space in the block, as including them reduces the uniform price he can charge. This, of course, maximizes miner revenue, but we believe that the first criterion for a viable protocol must be to use the blockchain efficiently, as otherwise users are discouraged from participation. Their non-manipulation result is stronger than the one we obtain from our mechanism since we only obtain declining gain from manipulation as the system grows, but it comes at a cost of lower social welfare and worse performance on desirable metrics such as transaction throughput and latency. Finally, in both our protocol and the protocol proposed by Lavi, Sattath and Zohar, users’ incentive to behave strategically vanishes as the number of users grows.

Vitalik Buterin has proposed an alternative approach based on miners estimating, and dynamically adjusting, a single fee that is charged uniformly to all transactions within a block, coupled with dynamically varying the block size to accommodate demand. This approach differs from ours in a few key ways. First, it does not aim to maximize social welfare, and instead adopts heuristics to modify two independent variables, fees and block size. The former goal, adopted by our work, will maximize transactions cleared subject to any desired block size constraint, determined by any desirable mechanism. Since block size is a primary determinant of security and centralization, we believe it is prudent to decouple its management from the fee mechanism. Second, it assumes that the demand curve is known to the protocol, though it makes no assumptions about its behavior. If the demand curve could be inferred accurately such that all transactions whose utility exceeds the block fee can always be accommodated, then Buterin's proposal would have no incentive issues. However, inferring demand curves is difficult in adversarial, Byzantine environments, which is why auction mechanisms are used. Finally, this approach has not been proven to be resistant to manipulation by users and miners. If it is not resistant to manipulation, then this mechanism will suffer from the same problem as the current first price mechanism, where users have to solve the fee selection problem all over again.

One File for the Price of Three: Catching Cheating Servers in Decentralized Storage Networks

Hacking Distributed

Mon, 06 Aug 2018 11:00:00 -0700

#^One File for the Price of Three: Catching Cheating Servers in Decentralized Storage Networks

Hundreds of millions of dollars are riding on the solution to a really hard problem. Here we offer a solution.

Decentralized Storage Networks (DSNs) such as Filecoin want to store your files on strangers' spare disk space. To do it properly, you need to store the file in multiple places since you don't trust any individual stranger's computer. But how do you differentiate between three honest servers with one copy each and three cheating servers with one copy total? Anything you ask one server about the file it can get from its collaborator.

We are pleased to serve up the world's first provably secure, practical Public Incompressible Encoding (PIE). A PIE lets you encode a file in multiple different ways such that anyone can decode it, but nobody can efficiently use one encoding to answer questions about another. Using known techniques, you can ask simple questions to see if someone is storing an encoded replica, giving you a Proof of Replication (PoRep). With a PoRep, it's just a hop, skip, and a jump to secure decentralized storage and much more. But we're excited and getting way ahead of ourselves. Let's start from the beginning.

The Pitch

The world has an insatiable hunger for storage. We're using more and more storage, and we're doing it much faster than we can build more. But tons of storage devices (laptop and desktop hard drives, standalone storage systems, flash memory on mobile phones, and more, and more, and more) are sitting idle.

Wouldn't it be nice if all of this unused storage could be tapped? And wouldn't it be nicer still if its owners could profit from it? We may never know why Ethan configured his desktop with 6 TB of space despite needing less than 500 GB, but given his grad student stipend, it would be great to see him get something out of it.

The Billion-Dollar Token: Decentralized Storage Networks (DSNs), such as Sia, Storj, MaidSafe, and the soon-to-be-released Filecoin, promise to do exactly this. They offer users tokens for renting out their own unused storage. Yes, you can turn your hard drive into an AirBnB for bits.

Of course, you need a mechanism to make sure users actually store what they say they are storing. This is fairly easy, at least in concept: to check that someone has a copy of the file, periodically ask them for small pieces of it at random.
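To make the spot-check idea concrete, here is a small back-of-the-envelope sketch. It assumes each challenge picks a piece independently and uniformly at random, and that a server missing a piece always fails a challenge for it; the function names are ours, not part of any DSN.

```python
import math

def detection_probability(missing_fraction: float, num_challenges: int) -> float:
    """Chance that at least one of the random challenges hits a missing piece."""
    return 1.0 - (1.0 - missing_fraction) ** num_challenges

def challenges_needed(missing_fraction: float, target: float) -> int:
    """Smallest number of challenges that reaches the target detection probability."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - missing_fraction))
```

Under these assumptions, a server that has discarded 10% of the pieces is caught with probability of at least 99% after 44 challenges, which is why a handful of small random queries goes a long way.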

The Billion-Dollar Problem: If you store your data with Amazon or Google or Microsoft, they promise to store your file multiple times. That way if a hard drive fails, your pictures aren't gone. Our DSN needs to do the same thing. After all, the average home computer is not reliable. To store our pictures on three computers, we need to pay each one of them, but that's a small price to pay if it makes the data safe, right? Unfortunately, you are unlikely to get what you pay for.

Once there is money involved, people take shortcuts for profits. Instead of storing one file three times at $1 per copy, a set of servers can pretend to do so and only keep one copy between all three of them. Whenever anyone checks if one of their computers has a copy, they can use the single shared copy to pass any test asked of them. As a result, they can free up space to store two more files which they also falsely claim to have replicated. This nets them 3X as much money. It also means there's only one copy of your pictures and it's probably on the least reliable computer they could find (those are cheaper, after all). The only practical way around this is to make the three copies entirely distinct.

Sia, Storj, MaidSafe, etc. do this by encrypting your pictures three times and distributing those encrypted copies. This strategy solves the problem perfectly if you're the one encrypting your photos and someone else is storing them.

The Challenge: But what happens if the data isn't your family photos? What happens if it's a public good like blockchain state? Blockchains are getting huge—Ethereum is over 75 GB and growing fast—and it's really expensive to have everyone store everything. But who encrypts the blockchain state? Everyone needs to read it. And worse, we don't trust anyone to encrypt it properly. We need something totally different that anyone can check and anyone can decode.

We need a Public Incompressible Encoding. In fact, this is needed for many settings, for example if you want some mining process that depends on miners storing arbitrary files.

Making PIEs is Hard

If anyone can decode data and we can't trust anyone to encode it properly, then how, you might ask, could we possibly prevent the attack we described above? Assume we have a malicious server, Mallory, who tries to store as little as possible. Sure, we can encode a file in three different ways, but if everything is public, our cheating storage server Mallory can still just store one copy. When we query her for an encoding, she can re-encode her one copy to whichever encoding was asked for. It seems impossible to stop.

Indeed, without secret information, we need some other way to restrict Mallory. We can do this with time. If re-encoding any piece of the file takes a few minutes, we can probably tell if we ask for the file and Mallory responds in two minutes instead of two seconds. At the same time, if we don't do the encoding piecemeal, we can make encoding an entire file reasonably efficient.

This idea isn't new. Filecoin suggested using timing, as did van Dijk et al. for some related mechanisms. So what's the problem?

Well, van Dijk et al. mix data blocks together in a configuration that makes recomputing discarded data really slow...if Mallory is pulling all of her data from the same rotational hard drive. These days that's probably not a good assumption, especially if she's cheating with custom hardware. Filecoin suggested something simpler: a long repeated chain of operations over the whole file. Unfortunately, van Dijk et al. anticipated this construction and showed that Mallory can save most of the storage and almost all of the computation by storing carefully-selected intermediate values.

So Filecoin's approach is broken and van Dijk et al. make assumptions about hardware that no longer hold. Given the ability to build ASICs, we need to be extremely careful about what assumptions we use. Indeed, many things that were billed as ASIC resistant, such as Equihash and Ethash, now have ASICs.

Rather than finding some new hardware assumption for storage, we build our PIE around some fairly slow key derivation function (KDF)—what non-cryptographers might refer to as a kind of hash function. For the prototype, we use scrypt, which exploits the fact that it appears hard to accelerate sequential memory access without advances in commodity RAM, but if that turns out to be false, any moderately slow KDF will do. To make things really slow, we force Mallory to run our KDF a large number of times in sequence.
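As a rough illustration of "run the KDF many times in sequence", the sketch below chains scrypt calls so that each call consumes the previous output. The parameters are toy values chosen so the example runs quickly, and the salt and function names are placeholders of ours rather than anything from the prototype.

```python
import hashlib

def slow_kdf(data: bytes) -> bytes:
    # Toy scrypt parameters (assumption); a real deployment would tune n, r, p to be much slower.
    return hashlib.scrypt(data, salt=b"pie-sketch", n=2**10, r=8, p=1, dklen=32)

def sequential_kdf(seed: bytes, rounds: int) -> bytes:
    """Apply the KDF `rounds` times; round i cannot start before round i-1 finishes."""
    out = seed
    for _ in range(rounds):
        out = slow_kdf(out)
    return out
```

Because every round depends on the output of the one before it, extra cores do not help: the total time is roughly `rounds` times the cost of one KDF call.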

This, of course, raises the all-important question: how do we force that work to be sequential? Enter graph theory.

Computation as a DAG: To understand data dependencies within our computation, we represent it as a directed acyclic graph (DAG). Each vertex represents an operation with two steps: first we derive a key using KDF, and then we encrypt some data using that key. Edges are either data edges, representing the inputs and outputs of the encryption, or key edges, representing the data used by KDF to derive the key. In the graph below, red dashed edges are key edges, while solid black edges are data edges.
Image/photo

We can use this formulation to track how much sequential work needs to happen. Because we need the output from one vertex in order to compute the values at any of its children, a path in the graph corresponds to inherently sequential work. Any key edges in that path require inherently sequential calls to KDF. Therefore, if the graph has long paths of key edges, encoding will require a lot of sequential work.

But we don't just care about encoding the whole file. We need Mallory to do a lot of work even if she only throws away some of the data. Worse, Filecoin's original proposal was broken by storing intermediate values, so we need to watch out for that as well. Since every piece of data in our computation is the output of a vertex in our graph representation, we can view Mallory storing data (intermediate or not) as removing vertices from the graph. We now need to check that even if Mallory removes a bunch of carefully-selected vertices, there is still a path with a lot of key edges.
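The check described here can be sketched in a few lines. The code follows the post's simplified model (a stored value is treated as a removed vertex) and assumes a small DAG given as a list of `(parent, child, is_key_edge)` triples; the representation and the name `max_key_edges` are illustrative choices of ours.

```python
from functools import lru_cache

def max_key_edges(edges, removed=frozenset()):
    """Largest number of key edges on any path that avoids the removed vertices."""
    children = {}
    for parent, child, is_key in edges:
        if parent in removed or child in removed:
            continue  # edges touching a removed vertex disappear with it
        children.setdefault(parent, []).append((child, is_key))
        children.setdefault(child, [])

    @lru_cache(maxsize=None)
    def best_from(vertex):
        # Recursion is fine for small graphs; the input must be acyclic.
        return max((int(is_key) + best_from(c) for c, is_key in children[vertex]), default=0)

    return max((best_from(v) for v in children), default=0)

# A plain chain of five vertices joined by key edges.
chain = [(i, i + 1, True) for i in range(4)]
```

On the chain, `max_key_edges(chain)` is 4, but removing the single middle vertex (`removed={2}`) drops it to 1. A plain chain is therefore a poor choice: one well-placed stored value destroys most of the sequential work, which is the weakness the next paragraphs address.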

Depth-robust graphs: A depth-robust graph (DRG) is a DAG with almost exactly this property: even if a (potentially sizable) fraction of the vertices are removed, it retains a long path. This is a very useful property. Even if Mallory stores a lot of data, she can't remove all of the long paths. Ben Fisch first suggested applying this insight to PoReps in a manner similar to how Mahmoody et al. applied it to prove inherently sequential work.

The obvious way to build a PIE using a DRG is to cut the file into one piece per vertex and then scramble it one vertex at a time. The edges of the DRG become key edges, so the output of one vertex is needed (along with the relevant file block) to compute the next. Now, if Mallory throws away too much of the file, there will be a long path left in the DRG and she'll have to do a lot of sequential work to recompute the discarded data.
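A minimal sketch of this strawman, under assumptions that are ours and not the post's: XOR with the KDF output stands in for "encrypt under the derived key", scrypt runs with toy parameters, and the graph is passed as a hypothetical `parents` list in topological order.

```python
import hashlib

def kdf(material: bytes, length: int) -> bytes:
    # Toy scrypt parameters (assumption) so the sketch runs quickly.
    return hashlib.scrypt(material, salt=b"drg-sketch", n=2**10, r=8, p=1, dklen=length)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _pad(index, encoded, parents, length):
    # The key for vertex `index` depends on the encoded outputs of its parents.
    material = index.to_bytes(8, "big") + b"".join(encoded[p] for p in parents[index])
    return kdf(material, length)

def encode(blocks, parents):
    """blocks[i] is the file piece at vertex i; parents[i] lists vertices j < i."""
    encoded = []
    for i, block in enumerate(blocks):
        encoded.append(xor_bytes(block, _pad(i, encoded, parents, len(block))))
    return encoded

def decode(encoded, parents):
    """Every pad depends only on encoded blocks, so decoding needs no sequential chain."""
    return [xor_bytes(c, _pad(i, encoded, parents, len(c))) for i, c in enumerate(encoded)]
```

Encoding must walk the graph in order because each vertex needs its parents' outputs, while decoding can handle every block independently once the encoded file is available.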

Unfortunately, this doesn't quite solve our problem. Mallory will need a lot of sequential work to recompute some of the discarded data, but not all of it. Some pieces will be easy to encode. Mallory can simply not store those pieces, instead opting to re-encode them on-the-fly. The first vertex in the path, for example, needs only one call to KDF.

In talks on possible approaches in early 2018, Fisch proposed this approach and suggested addressing this concern using erasure coding, making the file internally redundant. The idea was that at least some of the redundancy would be hard to re-encode. That approach, however, has several downsides. First, it expands the size of the file, potentially substantially. Second, it still allows cheating servers to gain some storage advantage over honest ones since they can discard some of the redundant data. This is a practical problem if storage providers are compensated for the full size of the expanded file. Finally, because each encoded block is required to recover not only its own plaintext but that of each of its children in the DRG, there was no clear technique for detecting a server that discarded enough data to make the file unrecoverable, nor a formal definition of what that would mean. [1]

We opted for a different, and we believe cleaner, approach. We want to force Mallory to do a long sequential computation for any block of data she's thrown away. Ensuring this while providing ASIC resistance is the critical challenge that any solution must confront.

Dagwood Sandwich Graphs (DSaGs)

The approach we take involves feeding the outputs of a depth-robust graph into a second type of graph called a butterfly graph, which has the property that there's a path from every input node to every output node. This graph, with its dependency of every output on every input, helps ensure, roughly speaking, that Mallory must compute every input node, and therefore cannot avoid computing along the long sequential path in the depth-robust graph. In actual fact, to ensure this dependency, we must layer a sequence of depth-robust graphs and butterfly graphs, something like this (where the Gs are depth-robust graphs and the Bs are butterfly graphs):
Image/photo

We think this construction looks like the famous Dagwood sandwiches from the cartoon strip Blondie, dramatized by the following lunch:
Image/photo

Yes, indeed. As if things weren't confusing enough, our PIE is made from a terrifying sandwich.
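The butterfly property mentioned above (a path from every input to every output) is easy to check on a small instance. The sketch below builds the standard FFT-style butterfly network on 2^k inputs; this is an assumption on our part, and the exact variant used in the paper may differ.

```python
def butterfly_edges(k: int):
    """Edges of a butterfly network with 2**k nodes per layer and k+1 layers."""
    n = 1 << k
    return [((layer, i), (layer + 1, j))
            for layer in range(k)
            for i in range(n)
            for j in (i, i ^ (1 << layer))]  # straight edge and cross edge

def outputs_reachable_from(k: int, source: int):
    """Indices of the output nodes reachable from input node `source`."""
    successors = {}
    for u, v in butterfly_edges(k):
        successors.setdefault(u, []).append(v)
    frontier = {(0, source)}
    for _ in range(k):
        frontier = {v for u in frontier for v in successors[u]}
    return {index for _, index in frontier}
```

Each layer lets a path flip one more bit of the node index, so after k layers every output index is reachable, using only 2 * k * 2^k edges rather than the 4^k a complete bipartite graph would need.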

From PIE to DSN

Given a PIE, a potentially malicious storage provider like Mallory can prove that she's honestly replicating files. But how do you build a DSN using a PIE?

Pretty much any PIE-based DSN architecture involves two steps for a given file:

(1) Prove once that the file is encoded correctly.
(2) Audit by verifying continuously that the file is intact.

Let's start with (1). Storage providers in the DSN must prove to somebody—the file owner or the network—that an encoding G of a file F is a correct PIE. Given an authenticated version of F, such as a hash stored in a trusted place, it's easy to verify that a PIE is correct. It is public, after all, so anyone can decode G to recover F. Decoding is an expensive operation, though, so it would be nice to have an efficient way to prove correctness. We propose a different approach in our work, showing how to efficiently prove that G is incorrect. This creates conditions for various "cryptoeconomic" approaches to verification, such as slashing a provider proven to be cheating. Alternatively, SNARKs offer efficient means to verify, if not prove, correctness.

As for (2), it's not much help for G to be correct if it goes missing. It's critical to continuously check that storage providers are still storing G and haven't thrown data away. There are a number of efficient, well established techniques for this purpose. A simple one used by existing DSNs is to build a Merkle tree on G whose leaves are G's blocks and publish the root. A storage provider can then be challenged to provide a random leaf and prove its correctness by furnishing a path to the published root.
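The Merkle audit described above can be sketched as follows. This is a generic construction using SHA-256 with one-byte domain-separation prefixes and duplication of the last node on odd levels; these details and all function names are assumptions of ours, not the scheme of any particular DSN.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(blocks):
    """Return all tree levels, leaves first; the root is levels[-1][0]."""
    level = [_h(b"\x00" + b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
            levels[-1] = level
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels, index):
    """Sibling hashes on the path from leaf `index` up to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path

def verify(root, block, index, path):
    """Recompute the root from the challenged block and the supplied path."""
    node = _h(b"\x00" + block)
    for sibling in path:
        node = _h(b"\x01" + node + sibling) if index % 2 == 0 else _h(b"\x01" + sibling + node)
        index //= 2
    return node == root
```

The auditor only needs the published root: it picks a random index, and the provider answers with the block plus a logarithmic number of sibling hashes.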

A blockchain can then perform the auditing. This could be an existing blockchain like Ethereum or, as with schemes like Sia, a new one whose consensus algorithm is independent of its storage goals. A particularly intriguing option, though, is to create a new blockchain where audit = mining.

The allure is an escape from the waste and environmental destruction of Proof of Work (PoW). Mining via useful storage was first proposed in 2014 in Permacoin. Permacoin required that nodes store a large, predetermined, public archive, however, and was only slightly less wasteful than conventional PoW.

PIEs completely change the game. Because they are incompressible, we eliminate any clever tricks a malicious miner could pull by specially crafting the files. Even if the file is all zeros, the miner must store the full encoded version. PIEs enable use of any files, even those owned by miners themselves, for mining = audit. This is the idea proposed in Filecoin, for example. Ultimately, PIEs may enable a secure and efficient consensus protocol whose main resource is storage devoted to useful data, rather than wasted electricity. Provably secure, efficient PIEs are a first step in this direction, and parallel work by other teams happily promises to provide a stable of complementary, evolving techniques.

Creating a provably secure and minimal-waste scheme for mining = audit is the next step. IC3 is beavering away at it. We're happy to hear from partners who'd like to help create the next generation of DSN.

A full copy of this work can be found on the IACR Cryptology ePrint Archive.

Backstory and Acknowledgements

We would like to give special thanks to Ben Fisch and Joe Bonneau for alerting us to the utility of depth-robust graphs. We had been working concurrently on the proof of replication problem, initially using an approach based on hourglass functions and butterfly graphs. We had several approaches that looked promising but, when we tried to prove them secure, turned out to be subtly but completely broken. We learned an important lesson: This problem space is deceptively hard. We therefore chose to hold off talking about results until we had a concrete construction and security proof. Only now have we gotten there.

After Ben's January 2018 talk on their approach, we had very useful discussions with both Ben and Joe about what security they could potentially achieve. This communication helped us identify the need for strong definitions that neither allowed a cheating server to discard any of the file risk-free nor required file expansion. It was not clear, though, how to adapt the construction Ben presented—effectively the strawman DRG construction given earlier in this post—to satisfy these properties. We got the idea of using depth-robust graphs in a different manner, combining them with hourglass functions. It took some time and very different definitions, but we ultimately arrived at a solution that provably achieved our desired properties.

We would also like to thank Juan Benet and Nicola Greco at Protocol Labs for helping clarify the motivation and practical requirements of the context.

Footnotes

[1]	Since we submitted this work, Fisch et al. have released three different articles with in-depth descriptions of constructions and formal theory that differs significantly from his talks. At the time of posting we had not had a chance to read and digest these.

On-Chain Vote Buying and the Rise of Dark DAOs

Hacking Distributed

Mon, 02 Jul 2018 08:22:00 -0700

#^On-Chain Vote Buying and the Rise of Dark DAOs

Blockchains seem like the perfect technology for online voting. They can act as “bulletin boards,” global ledgers that were hypothesized (but never truly realized) in decades of e-voting research. Better still, blockchains enable smart contracts, which can execute on-chain elections autonomously and do away with election authorities.

Unfortunately, smart contracts aren’t just good for running elections. They’re also good for buying them.

In this blog post, we’ll explain how and why. As an example, we’ll present a fully implemented, simple vote buying attack against the popular on-chain CarbonVote system. We’ll also discuss how trusted hardware enables even more powerful vote buying techniques that seem irresolvable even given state-of-the-art cryptographic voting protocols.

Finally, we introduce a new form of attack called a Dark DAO, not to be confused with the “Dark DAO” the same way DAOs should not be confused with The DAO.  A Dark DAO is a decentralized cartel that buys on-chain votes opaquely (“in the dark”). We present one concrete embodiment based on Intel SGX.

In such an attack, potentially nobody, not even the DAO’s creator, can determine the DAO’s number of participants, the total amount of money pledged to the attack, or the precise logic of the attack: for example, the Dark DAO can attack a currency like Tezos, covertly collecting coins until it reaches some hidden threshold, and then telling its members to short the currency.  Such a Dark DAO also has the unique ability to enforce an information asymmetry by sending out, for example, deniable short notifications: members inside the cartel would be able to verify the short signal, but themselves could generate seemingly authentic false signals to send to outsiders.

The existence of trust-minimizing vote buying and Dark DAO primitives implies that users of all on-chain votes are vulnerable to shackling, manipulation, and control by plutocrats and coercive forces. This directly implies that all on-chain voting schemes where users can generate their own keys outside of a trusted environment inherently degrade to plutocracy, a paradigm considered widely inferior to democratic models that such protocols attempt to approximate on-chain.

All of our schemes and attacks work regardless of identity controls, allowing user actions to be freely bought and sold.  This means that schemes that rely on user-generated keys bound to user identities, like uPort or Circles, are also inherently and fundamentally vulnerable to arbitrary manipulation by plutocrats.  Our schemes can also be repurposed to attack proof of stake or proof of work blockchains profitably, posing severe security implications for all blockchains.

Blockchain Voting Today

Blockchain voting schemes abound today. There’s Votem, an end-to-end verifiable voting scheme that allows voting using mobile devices and leverages the blockchain as a place to securely post and tally the election results. Remix, the popular smart contract IDE, offers an election-administering smart contract as its training example. Yet more examples can be found here (1), here (2), and here (3).

On-chain voting schemes face many challenges, privacy, latency, and scaling among them. None of these is peculiar to voting, and all will eventually be surmountable. Vote buying is a different story.

In political systems, vote buying is a pervasive and corrosive form of election fraud, with a substantial history of undermining election integrity around the world. Sometimes, the price of a vote is a glass of beer. Thankfully, as scholars have observed, normal market mechanisms usually break down in vote buying schemes, for three  reasons. First, vote buying is in most instances a crime. In the U.S., it’s punishable under federal law. Second, where secret ballots are used, compliance is hard to enforce. A voter can simply drink your beer, and cast her ballot in secret however she likes. Third, even if a voter does sell their vote, there is no guarantee the counter-party will pay.

No such obstacles arise in blockchain systems. Vote buying marketplaces can be run efficiently and effectively using the same powerful tool for administering elections: smart contracts. Pseudonymity and jurisdictional complications, as always, provide (some) cover against prosecution.

In general, electronic voting schemes are in some ways harder to secure against fraud than in-person voting, and have been the subject of general and academic interest for many years.  One of the fundamental building blocks was introduced early by David Chaum, providing anonymous mix networks for messages which could be anonymously sent by participants with receipts of inclusion.  Such end-to-end verifiable voting systems, where users can check that their votes are correctly counted without sacrificing privacy, are not just the realm of theoreticians and have actually been used for binding elections.

Later work by Benaloh and Tuinstra took issue with electronic voting schemes, noting that they offered voters a “receipt” that provided cryptographic proof of which way a given vote had been cast.  This would allow for extremely efficient vote buying and coercion, clearly undesirable properties. The authors defined a new property, receipt-freedom, to describe voting schemes where no such cryptographic proof was possible. Further work by Juels, Catalano, and Jakobsson modeled even more powerful coercive adversaries, showing that even receipt-free schemes were not sufficient to prevent coercion and vote buying.  This work defined a new security definition for voting schemes called “coercion resistance”, providing a protocol where no malicious party could successfully coerce a user in a manner that could alter election results.

In their work, Juels et al. note that “the security of our construction then relies on generation of the key pairs… by a trusted third party, or, alternatively, on an interactive, computationally secure key-generation protocol such as [24] between the players”. Such “trusted key generation”, “trusted third party”, or “trusted setup” assumptions are standard in the academic literature on coercion resistant voting schemes. Unfortunately, these requirements do not translate to the permissionless model, in which nodes can come and leave at any time without knowing each other a priori. This (somewhat) inherently means users generate their own keys in all such deployed systems, and cannot take advantage of trusted multiparty key generation or any centralized key service arbiter.

The blockchain space today, with predictable results, continues its tradition of ignoring decades of study and instead opts to implement the most naive possible form of voting: directly counting coin-weighted votes in a plutocratic fashion, stored in plain text on-chain. Unfortunately, it is not clear that anything better than such a plutocracy is achievable on-chain. We show that the permissionless model is fundamentally hostile to voting. Despite any identity or second-layer based mitigation attempts, all permissionless voting systems (or schemes that allow users to generate their own key in an untrusted environment) are vulnerable to the same style of vote buying and coercion attacks. Many vote buying attacks can also be used for coercion, shackling users to particular voting choices by force.

Image/photo

That's a nice on-chain vote you've got there...

It is worth noting that the severity of bribery attacks in such protocols was partially explored by Vitalik Buterin, though concrete mechanisms were not provided. Here we describe, at a high level, frictionless mechanisms useful for vote and identity buying, coercion, and coordination, and discuss the implications of these particular mechanisms.

Attack Flavors

Consider a very simple voting scheme: Holders of a token get one vote per token they hold and can change their votes continually until some closing block number. We’ll use this simple “EZVote” scheme to build intuition for how our attacks can work in any on-chain voting mechanism.
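For readers who prefer code, the EZVote rules above fit in a few lines. This is only an in-memory model for building intuition: balances are fixed up front, block numbers are passed in by the caller, and the class and method names are ours.

```python
class EZVote:
    """One vote per token held; votes may be changed until the closing block."""

    def __init__(self, balances, closing_block):
        self.balances = dict(balances)   # holder -> token balance
        self.closing_block = closing_block
        self.choices = {}                # holder -> latest choice

    def vote(self, holder, choice, block):
        if block > self.closing_block:
            raise ValueError("voting has closed")
        if holder not in self.balances:
            raise KeyError(holder)
        self.choices[holder] = choice    # a later vote replaces an earlier one

    def tally(self):
        totals = {}
        for holder, choice in self.choices.items():
            totals[choice] = totals.get(choice, 0) + self.balances[holder]
        return totals
```

Note that the only thing that counts is the latest choice attached to each balance. Whoever controls that choice at the closing block controls the tokens' weight, which is what the attacks below exploit.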

There are several possible escalating attack flavors of such a scheme.

Simple Smart Contracts

The simplest low-coordination attack on on-chain voting systems involves vote buying smart contracts.  Such smart contracts would simply pay users upon a provable vote for one option (or to participate in the vote, or to abstain from the vote if the vote is not anonymous).  In EZVote, the smart contract could be a simple contract that holds your ERC20 until after the end date, votes yes, and returns it to you; all guarantees in the contract could be enforced by the underlying blockchain.

Such a scheme has advantages in that it requires only the trust assumptions already inherent in the underlying system, but has substantial disadvantages as well. For one, it is likely possible to publicly tell how many votes are purchased after the election is over, as this is required to handle the flow of payments in today’s smart contract systems. Also, the in-platform nature of the bribe opens it to censorship by parties interested in preserving the health of the underlying platform/system.

Depending on the nature of the voting scheme and the underlying protocol, there may be some workarounds for these downsides. Voters could for example provide a ring signature proving to a vote buyer that they are in a list of voters who voted yes in exchange for payments. We leave the implementation details and generalizability of such schemes open.

In general, any mechanism for private smart contracts can also be used for private vote buying, solving the public nature of a smart contract based attack; cryptographically an equivalent would be the vote buyer and seller generating a secret key for funds storage via MPC together, signing two transactions: a yes vote and a transaction that released funds to the vote seller after the end of the interval.  The vote seller would move funds to this key only after possessing the transaction guaranteeing a refund and payment.

This would look similar to previous work on distributed certificate generation, adding security analysis for ensuring fairness. A naive implementation of such a scheme would encumber a user’s use of funds for other purposes during the vote (such actions are possible but require cooperation on behalf of the vote buyer; alternatively, a trusted/bonded escrow party can be used).

Trusted Hardware Buying

An even more concerning vote buying attack scheme involves the use of trusted hardware, such as Intel SGX.  Such hardware has a key feature called remote attestation. Essentially, if Alice and Bob are communicating on the Internet, the trusted computing achieved by SGX allows Alice to prove to Bob that she is running a certain piece of code.

Trusted hardware is usually seen as a way to prove that you are running code that will not be malicious: for example, it is used in DRM to prove that a user will not copy files that are only temporarily licensed to them, like movies.  Instead, we will use trusted hardware to shackle cryptocurrency users, paying or forcing them to use cryptocurrency wallets based on trusted hardware that provably restrict their space of allowed behaviors (e.g. by forcing them not to vote a certain way in an election) or allow the vote buyer trust-minimized but limited use of a user’s key (e.g. a vote buyer can force a user to sign “I Vote A”, but cannot steal or spend a user’s money).

The simplest way to deploy such technology for vote buying is to simply allow users to prove they are running a vote buyer’s malicious wallet code in exchange for a payment, secured on both sides by remote attestation technology.

In our “EZVote” example, a user would simply use a cryptocurrency wallet loaded on Intel’s SGX, running the vote buyer’s program.  SGX would guarantee to the user that the wallet could never steal the user’s money (unless Intel colludes with the vote buyer). The user can provably use the wallet for everything they can do with a normal Ethereum wallet, including moving their money out (though in this case they would not be paid).  The user runs their own wallet, and does not need to trust a third party for control or security of their funds. The user may not need to trust even Intel or the trusted hardware provisioner for security of their funds, as they can compile their own wallet!

When a predefined trigger condition occurs, such an SGX program would automatically vote on EZVote as the vote buyer commands, and send a receipt to the vote buyer. The vote buyer would itself run an SGX enclave that maintains a total of all users who claim to have voted yes, and a list of their addresses. Given trust in SGX, the vote buyer need not see the full list of member users or know the total pledged amount. At the end of the vote, the vote buyer’s enclave would pay all the users who have not moved their funds or changed their vote. This would be accomplished by the enclave periodically posting a Merkle root summarizing users to be paid on-chain, providing proof to each user that they will eventually be paid. Users can claim payment after the expiry of some period by providing a proof of inclusion in the posted Merkle history. In some particularly vulnerable vote designs, an SGX enclave can increase its efficiency by simply accumulating “yes” votes from users up-front as transactions, publishing and providing payment for them at the conclusion of the vote.

Hidden Trusted Hardware Cartels (Dark DAOs)

A more concerning attack arises when trusted hardware is combined with the idea of a DAO, spawning a trustless organization whose goal centers on manipulating cryptocurrency votes.

Image/photo

One example of a basic Dark DAO.

The figure above outlines one possible architecture. Vote buyers would support the DAO by running a network of SGX enclaves that themselves execute a consensus protocol (shown here as a dark cloud to indicate its invisibility from outside).  Users would communicate with this enclave network, and supply proof that they are running a “vote buying” (e.g.) Ethereum wallet with a current balance of X coins. This “evil wallet” attests to running the attack code a vote buyer is paying for, and the vote buyer attests that they are running code guaranteed to pay the user at the end of the attack (likely in combination with a smart contract-based protocol that cryptoeconomically enforces liveness and honesty).

The vote buyers can keep track of how many total funds are pledged to vote through the system, hiding this fact from the outside world using privacy features built into SGX.  Users can receive provable payouts for participating in such a system, achieving a property similar to all-or-nothing settlement in SGX-based decentralized exchanges.  Vote buyers can get a provable guarantee that clients will never issue votes that contradict their desired voting policy.

What makes such an organization dark is that the vote buyers need not reveal how many users are participating in the system to anybody (even potentially themselves).  The system could simply accumulate users, paying users for running the attacker’s custom wallet software, until some threshold (of e.g. coins held by such software) is reached that activates an attack; in this manner, failed attempts need not be detectable.  More damagingly, the individual incentives of any small users clearly point towards joining the system. If small users believe their vote doesn’t matter, they are likely to take the payoff with no perceived marginal downside. This is especially the case in on-chain votes, which are empirically observed to have extremely low turnout. Users that don’t vote may be ideal targets for selling their votes.

Dark DAO operators can further muddy the waters by launching attacks on choices the vote buyers actually oppose as potential false flag operations or smear campaigns; for example, Bob could run a Dark DAO working in Alice’s favor to delegitimize the outcome of an election Bob believes he is likely to lose.  The activation threshold, payout schedule, full attack strategy, number of users in the system, total amount of money pledged to the system, and more can be kept private or revealed either selectively or globally, making such DAOs ultimately tunable for structured incentive changes.

Because the organization exists off-chain, no cartel of block producers or other system participants can detect, censor, or stop the attack.

Such a dark organization has several immediate practical drawbacks.  The primary one is that for use on Intel SGX, a license would need to be granted by Intel, an unlikely event for malicious software.  Furthermore, side channel, hidden software backdoor, or platform attacks in Intel's SGX or the auditing of the Dark DAO wallet could weaken the scheme, though as trusted hardware continues to advance and develop, it is highly likely the cost of such attacks will increase substantially.  Eventually, we expect other trusted hardware to provide the remote attestation capabilities of Intel SGX, meaning that SGX will not be required for such an attack; this is why we use “SGX” interchangeably with “trusted hardware”. For example, remote attestation is achievable on some Android secure processors.  Our schemes work on any hardware device allowing for confidential data and remote attestation.

Attacks on Classic Schemes: CarbonVote & EIP999

To prove the efficacy of these vote buying strategies, we first look at  governance-critical coinvotes performed in existing cryptocurrency systems.  Perhaps the most important such vote was the DAO CarbonVote.  The operation of this vote was simple: accounts sent money to an address to vote yes, and another to vote no.  Each address was a contract that logged the vote of a given address. The CarbonVote frontend then tallied the votes, and showed the net balances of all accounts that had voted yes and/or no.  Later votes superseded earlier ones, allowing users to change their minds. At the end of the vote, a snapshot was taken of support and used to gauge community sentiment. This voting style is being reused for other controversial ecosystem issues, including EIP-186.
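
As a rough illustration of the tallying rule just described, the following Python sketch replays a vote log in order, lets each account's latest vote supersede its earlier ones, and weights the result by account balance. The `VoteTx` record, the `tally` function, and the example balances are hypothetical names and numbers chosen for illustration; this is not the CarbonVote frontend's actual code.

```python
from dataclasses import dataclass


@dataclass
class VoteTx:
    """One logged vote: which account voted, and for which side (hypothetical record)."""
    sender: str
    choice: str  # "yes" or "no"


def tally(vote_log, balances):
    """Balance-weighted tally in which an account's later vote replaces its earlier one."""
    latest = {}
    for tx in vote_log:              # vote_log is assumed to be in chronological order
        latest[tx.sender] = tx.choice
    totals = {"yes": 0, "no": 0}
    for sender, choice in latest.items():
        # Weight by the balance at snapshot time; unknown accounts count as zero.
        totals[choice] += balances.get(sender, 0)
    return totals


log = [VoteTx("alice", "yes"), VoteTx("bob", "no"), VoteTx("alice", "no")]
snapshot = {"alice": 10, "bob": 5}
print(tally(log, snapshot))  # alice changed her mind, so all 15 coins count as "no"
```

Because the final tally depends only on the last logged vote and the snapshot balance, anyone who can read the log can verify how a given account voted, which is what the vote buying schemes below rely on.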

One possible trust-minimizing vote buying smart contract in this framework involves the use of escrow; users send Ether to an ERC20 token contract that holds the Ether until the end of the vote.  For each Ether they deposit, users receive 1 VOTECOIN.

The contract is pre-programmed to vote yes at the end of the vote with 100% of the user Ether held.  After the vote ends, each VOTECOIN token becomes fully refundable for the original Ether that created it.  Users get back their original Ether, plus any bribes that vote buyers wish to pay them for this service.

We have implemented a full, open-source proof of concept of such a contract, enabling any vote buyers to contribute funds to the contract’s BRIBEPOOL.  Users can be paid out from BRIBEPOOL by temporarily locking their Ether in the contract, and can reclaim 100% of their Ether at the end of the target vote. An attack can pay vote sellers out of BRIBEPOOL upfront (once they lock the coins, the votes are guaranteed), as dividends over time, or both.

Code of the vote buying Ethereum smart contract for the DAO Carbonvote

Users can also sell their VOTECOIN after locking up their Ether, essentially making VOTECOIN a tokenized vote buying derivative.  Vote sellers can then instantly unload their exposure to any risks introduced by funds lockup to parties that are indifferent to the vote’s outcome: because each ERC20 token is programmatically guaranteed to eventually receive all original ETH, this essentially creates a one-way-only funnel from the base asset into a derivative asset dedicated to voting a predefined way.   Buyers who are uninterested in the vote's outcome should always lock their ETH if guaranteed a non-negative payoff, and essentially have an option to later unload onto other similarly uninterested buyers. If dividends from BRIBEPOOL are paid over time to VOTECOIN in addition to upfront, these derivative tokens can even be used to speculate on the success of the attack itself.
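
The escrow scheme described above can be modeled in a few lines of Python. This is only a toy simulation under stated assumptions, not the Solidity proof of concept: the class name `VoteBuyingEscrow` and its method names are hypothetical, amounts are plain integers standing in for wei, and the bribe is paid pro rata at redemption, which is just one of the payout schedules mentioned above.

```python
class VoteBuyingEscrow:
    """Toy model: lock Ether until the vote ends, receive VOTECOIN 1:1 (hypothetical API)."""

    def __init__(self):
        self.votecoin = {}      # holder -> VOTECOIN balance
        self.total_locked = 0   # Ether (in wei) held by the contract
        self.bribe_pool = 0     # BRIBEPOOL, funded by any vote buyer
        self.vote_cast = None   # set to (choice, weight) once the vote ends

    def fund_bribe_pool(self, amount):
        if self.vote_cast is not None:
            raise RuntimeError("vote already ended")
        self.bribe_pool += amount

    def deposit(self, user, amount):
        if self.vote_cast is not None:
            raise RuntimeError("vote already ended")
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.total_locked += amount
        self.votecoin[user] = self.votecoin.get(user, 0) + amount

    def transfer(self, sender, recipient, amount):
        # VOTECOIN is transferable, so sellers can unload their lockup exposure.
        if amount <= 0 or self.votecoin.get(sender, 0) < amount:
            raise ValueError("insufficient VOTECOIN")
        self.votecoin[sender] -= amount
        self.votecoin[recipient] = self.votecoin.get(recipient, 0) + amount

    def end_vote(self):
        # The contract votes "yes" with 100% of the Ether it holds.
        if self.vote_cast is not None:
            raise RuntimeError("vote already ended")
        self.vote_cast = ("yes", self.total_locked)
        return self.vote_cast

    def redeem(self, holder):
        # After the vote, each VOTECOIN is refundable for the Ether that created it,
        # plus a pro rata share of the bribe pool (rounded down).
        if self.vote_cast is None:
            raise RuntimeError("Ether is locked until the vote ends")
        tokens = self.votecoin.pop(holder, 0)
        supply = self.vote_cast[1]
        bribe = self.bribe_pool * tokens // supply if supply else 0
        return tokens, bribe


escrow = VoteBuyingEscrow()
escrow.fund_bribe_pool(30)
escrow.deposit("alice", 100)
escrow.deposit("bob", 50)
escrow.transfer("alice", "carol", 40)   # alice sells part of her position
print(escrow.end_vote())
print(escrow.redeem("carol"))
```

Note that the refund follows the token, not the depositor: whoever holds VOTECOIN at redemption collects the underlying Ether and the bribe share, which is what makes the token behave like a derivative.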

This smart contract can be simplified with the use of oracles such as Town Crier (multiple oracles, prediction markets, etc. can be combined as well).  Because the CarbonVote system publishes results including full voter logs on Etherscan, it is relatively trivial to check which way someone has voted using any external web-scraping oracle, and to pay voters if the vote included in the final snapshot agreed with the buyers’ preference.

A Dark DAO-like model can also trivially be used. Each user simply runs a wallet that, some time after each transfer transaction, also votes the desired way on the CarbonVote (in fact this may become standard behavior for many wallets).  The user is only paid if such votes are registered, so the user is incentivized to make sure this vote transaction is included on-chain. There is no way for the network to tell how many votes in a given CarbonVote are generated by such a vote buying cartel, and how many are legitimate.

Inherent in any of these schemes is the ability to minimize trust when pooling assets across multiple vote buyers; bribery smart contracts could simply allow anyone to pay into the BRIBEPOOL, and SGX networks can be architected similarly for open participation.

Some schemes, such as the EIP999 vote, have even more severe problems.  In these schemes, if a user votes twice, the later of such votes is chosen. A simple and severe attack is then to simply collect signatures on both “yes” and “no” votes from a user, spamming the chosen signature towards the end of the election period and relying on an ability to overwhelm the blockchain to ensure that most such votes persist.  Alternatively, because contract deployers are able to vote for all the funds in a given contract, another attack is to simply force a user to use a contract-based wallet for the duration of the vote that is deployed by the vote buyer, who can then control the votes of all funds locked in contracts arbitrarily without custody of these funds.
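
To make the first attack concrete, here is a minimal Python sketch of a "latest vote wins" election in which a vote buyer holds pre-signed ballots for both outcomes and broadcasts its preferred one last. Everything here is an assumption for illustration: HMAC-SHA256 stands in for real digital signatures, votes are counted per voter instead of per coin, and the names `sign` and `LastVoteWinsElection` are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, choice: str) -> bytes:
    """Stand-in for a real signature scheme (assumption: HMAC-SHA256 over the choice)."""
    return hmac.new(key, choice.encode(), hashlib.sha256).digest()


class LastVoteWinsElection:
    """Toy election where a voter's most recent valid ballot replaces earlier ones."""

    def __init__(self, voter_keys):
        self.voter_keys = voter_keys   # voter -> key used to check ballots
        self.latest = {}               # voter -> most recent choice

    def submit(self, voter, choice, signature):
        expected = sign(self.voter_keys[voter], choice)
        if not hmac.compare_digest(expected, signature):
            raise ValueError("invalid signature")
        self.latest[voter] = choice

    def result(self):
        counts = {"yes": 0, "no": 0}
        for choice in self.latest.values():
            counts[choice] += 1
        return counts


keys = {"alice": b"alice-key", "bob": b"bob-key"}
election = LastVoteWinsElection(keys)

# Each seller hands the buyer signed ballots for BOTH outcomes up front.
presigned = {v: {c: sign(k, c) for c in ("yes", "no")} for v, k in keys.items()}

# Voters may vote however they like during the election...
election.submit("alice", "no", sign(keys["alice"], "no"))

# ...but the buyer broadcasts its chosen ballots last, just before the deadline.
for voter, ballots in presigned.items():
    election.submit(voter, "yes", ballots["yes"])

print(election.result())
```

The buyer never needs custody of funds or keys; holding both signed ballots and controlling timing is enough, provided its final transactions are the ones that persist.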

Bitcoin is not immune to this problem either. Bitcoin’s community often leans on coin-votes, and similar vote buying schemes can be applied (as either Ethereum smart contracts as in this work, or in Dark DAO-style; Bitcoin itself does not provide native support for sufficiently rich contracts to buy votes).

Beyond Voting - Attacking Consensus

Astute readers may point out that all permissionless blockchains inherently rely on some form of permissionless voting, namely the consensus algorithm itself.   Every time a blockchain comes to global consensus on some attributes of state, what is taking place is essentially a vote (often coin- or PoW-weighted) in a permissionless setting.

It is perhaps no surprise that “vote buying” has seen some exploration in these contexts.  For example, smart contracts on Ethereum can be used to attack Ethereum and other blockchains through censorship, history revision, or incentivizing empty blocks.  Such attacks work directly on the proof-of-work vote itself, bribing miners according to their weighted work. There is little reason to believe that proof of stake systems would be immune to similar attacks, especially in the presence of complex delegated voting structures whose incentives may be unclear and whose formal analysis may be incomplete or nonexistent.

A disturbing concept related to our exploration of Dark DAOs for vote buying is what we term the “Fishy DAO”, named after the classic flash game.  In this (super fun!) game, you start out as a small fish. The rules are simple; you can eat smaller competitor fish, but not fish the same size as or larger than you. You get a little bit bigger after each meal, until you eventually (if you are lucky) grow to dominate the ocean.  A modern equivalent that doesn’t require Flash and adds networking is agar.io.

It’s like Fishy, but the small fish can gang up on the bigger ones too!

A Fishy DAO would use Dark DAO-like technology as described above to do the same for blockchains. Using SGX, Fishy DAO members can receive non-transferable (DAO members can verify message authenticity, but non-members cannot tell if a message is forged) notifications when an attack threshold is reached, allowing them to short currency markets shortly before such an attack. Each blockchain Fishy DAO attack brings some profit to Fishy DAO, and the ensuing publicity of even failed attacks gives Fishy DAO notoriety among participants who are profit-seeking but perhaps unethical (in some frameworks). If Fishy DAO fails to achieve required thresholds, Fishy DAO simply fades away and refunds its participants, potentially but not necessarily burning some amount of their money to incentivize them to recruit participation.

Fishy DAO requires Dark DAO technology, as if performed in the open with a smart contract, observable participation rates would provide market signals to the underlying blockchain’s price, rendering the attack unprofitable by allowing risk to be priced in. It is the cryptographically enforceable information asymmetry between DAO members and wider ecosystem participants that makes such an attack feasible.

Other Applications

Note that Dark DAOs have implications far beyond the above.  Consider for example a Dark DAO that aimed to profitably buy users’ basic income identities, paying up front at a small fee to receive a user’s regular basic income payments.  Or a Dark DAO for getting through credit checks secured on key-based identities by leasing (with trust minimized limitations) such keys from users with good credit. Or a Dark DAO that runs an evil mining pool, provably attacking an ASIC-based proof of work cryptocurrency with an unstoppable attack pool of potentially undetectable size.
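
The basic income example is, at bottom, a trade of a lump sum today for a stream of future payments. A back-of-the-envelope Python sketch of how each side might value that stream follows; the per-period discount rates, the function names, and the numbers are all assumptions made for illustration and do not come from any deployed system.

```python
def present_value(payment, periods, rate):
    """Value today of `periods` equal payments, discounted at `rate` per period."""
    return sum(payment / (1 + rate) ** t for t in range(1, periods + 1))


def mutually_beneficial_prices(payment, periods, seller_rate, buyer_rate):
    """Return (lowest price the seller accepts, highest price the buyer pays), or None.

    A deal exists only if the buyer discounts the future less than the seller does.
    """
    seller_floor = present_value(payment, periods, seller_rate)
    buyer_ceiling = present_value(payment, periods, buyer_rate)
    if buyer_ceiling <= seller_floor:
        return None
    return seller_floor, buyer_ceiling


# Hypothetical numbers: 10 units per week for a year; the seller is impatient
# (2% per week), the buyer is patient (0.1% per week).
print(mutually_beneficial_prices(10, 52, seller_rate=0.02, buyer_rate=0.001))
```

Any upfront price between the two returned values leaves both parties better off by their own accounting, which is why such a market would be expected to form whenever discount rates differ.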

One can also imagine that with identity, there may be social safeguards against buying behavior in the identity system itself.  For example, some identity systems may allow a user to show up in person to revoke or manage identities, which could socially circumvent automated technical safeguards against identity theft.  There are still ways around this: the classic solution in loans is through collateral. Potentially a "bondsman" like business could also provide social guarantees of repayment through physical/legal intimidation and contract for users who cannot afford collateral.  Payday loan and bail bond establishments would be ideally suited for that kind of business if such a permissionless basic income system were ever deployed alongside current market systems, at least in the US (in many other places there are likely even less savory institutions that could be willing to step in for an appropriate cut).

The coordination space of mechanisms in blockchains is large, and the environment hostile.  All voting or financially incentivized identity-based schemes should be very careful to consider the implications of the underlying permissionless model on long-term viability, scalability, and security.

Core Insights

Maybe you are an academic skimming this article, or maybe an interested user wondering exactly what this all means. There are a few interesting and very surprising (in the research literature) insights to be gleaned from our thought experiments above:

Permissionless e-voting *requires* trusted hardware. Perhaps the most surprising result is this one. In any model where users are able to generate their own keys (required for the "permissionless" model), low coordination bribery attacks are inherently possible using trusted hardware as described above. The only defense from this is more trusted hardware: to know a user has access to their own key material (and therefore cannot be coerced or bribed), some assurance is required that the user has seen their key. Trusted hardware can do this through either a trusted hardware token setup channel (similar to governments using electronic votes for democracy), or through an SGX-based system that guarantees that any voters have had their key material revealed to whatever operating system they are running. This inherently implements the kind of trusted setup/generation assumptions academic e-voting schemes have been using for years. Clearly, in the presence of trusted hardware, such assumptions are required for any vote, and votes can be provably bought/sold/bribed/coerced with low friction in the absence of this assumption, a surprising result with severe implications in on-chain voting.
The space of voting and coordination mechanisms is massive and extremely poorly understood. As explored through concrete examples on how to handle e.g. smart contracts voting and vote changes on Ethereum, it is clear that a wide range of design decisions fundamentally alters the incentive structures of voting mechanisms (we explore these in Appendix A below). These mechanisms are extremely complex, and can have their incentive structures altered by other coordination mechanisms like smart contracts and trusted hardware-based DAOs. The properties of these mechanisms, especially when multiple such mechanisms interact or are actively attacked by resourced actors, are extremely poorly understood. No mechanism of this kind should be used for direct on-chain decision making any time soon.
The same class of vote buying attacks works for any identity system. These attacks are not only for votes. Imagine an identity system which gives users the right to a basic income, paid weekly. I can simply pay you cash up front to buy your identity and therefore share of income for the next year, and indeed should do so if my time value of money is lower than yours (as wealth asymmetries often imply). This is the case for any system involving identity: with relatively low trust, the behavior of user identities can be constrained, and such constraints can be bought and sold on the open market. This has severe and fundamental impact on the robustness of any on-chain economic mechanism with a permissionless identity component.
On-chain voting fundamentally degrades to plutocracy. Voting and democracy fundamentally rely on secret ballot assumptions and identity infrastructure that exist only in meatspace. These assumptions do not carry over to blockchains, making the same techniques fundamentally broken in a permissionless model. External, even trusted, identity systems again do not address the issue as long as users can generate their own keys (see above).
Hard fork-based governance provides users the only exit from such plutocracy. A natural question to ask given the above is whether we've already arrived at plutocracy. The answer is "probably not". There is some evidence that the ad-hoc, informal, fork-based governance models that govern blockchains like Bitcoin and Ethereum actually provide robust user rights protection. In this model, any upgrades must offer the user an active choice, and groups of users can choose to opt out if disagreeing with rule changes. On-chain voting, on the other hand, creates a natural default that, especially when combined with inattentive or uncaring users, can create strong anti-fork inertia around staying with the coinvote.
Multiple blockchains interacting can break the incentive compatibility of all chains. Importantly and critically, the Fishy DAO style attack we've explored shows that multiple competing blockchains have the ability to fundamentally affect the internal equilibrium of all such chains. For example, in a world with only one smart contract system, Ethereum, internal incentives may lead to stable equilibria. With two players, and the underdog incentivized to launch a bribery attack to destroy their competitors, such equilibria can be disrupted, changed, and destroyed. A critical and surprisingly underexplored open area of research is modelling the macroeconomics of competition between blockchains, gaining insight into how exactly such internal equilibria can fail. We find it intuitively ~certain that critical black swan events are currently lurking in the complexity of blockchain governance and interoperability.

Obviously, these all require further exploration, tweaking, and proof. But I think we have at least provided some intuition for why we believe the above to hold in a principled analysis framework.

Conclusion

The trend of on-chain voting in blockchain is inspired by the long human tradition of voting and democracy. Unfortunately, safeguards available to us in the real world, such as enforced private/deniable voting, approximate identity controls, and attributability of widespread fraud are simply not available in the permissionless model. When public keys generated by the users themselves are used, on-chain voting cannot guarantee that these users are protected against coercion. Elaborate voting schemes do little to quell (and in many cases indeed aggravate) the problem. On-chain voting schemes further complicate incentives, creating an unstable and tangled mess of incentives that can at any time be altered by trustless smart contract or Dark DAO-style vote buying, bribery, and griefing schemes.

We encourage the community to be highly skeptical of the outcome of any on-chain vote, specifically as on-chain voting becomes an ever-important staple of decision making in blockchain systems. The space for designing mechanisms that enable new forms of abuse with lower-than-ever coordination costs supports the position that votes should be used for signals not decisions, and that a wide variety of voting mechanisms should fill such roles. Without such safeguards, it remains possible that all on-chain voting systems degenerate into plutocracy through direct vote and participation buying and even vote tokenization.

Such attacks have substantial implications for the future security of all blockchain-based voting systems.

Acknowledgements

We’d like to thank Patrick McCorry for his helpful, thorough feedback throughout the lifecycle of this post, and pioneering work in vote buying and on-chain voting systems.

We also thank Omer Shlomovits and István András Seres for their helpful comments on early access versions of this post.

Appendix A - On-chain Vote Differentiators

We notice several distinguishing factors in on-chain voting systems:

Vote-changing ability: If users cannot change their vote, trivial vote buying is possible with any method that provides a cryptographically checkable receipt. A smart contract can simply bribe users up-front for their vote, which can now never be changed. Most schemes, however, allow users to change or withdraw their votes, meaning bribery needs some continuous time component (or to be done after a snapshot of the vote is taken). Exponentially increasing payouts over time provide an interesting solution that discourages coin movement and encourages long-term signaling, and payout bonuses at vote completion are tools potential vote-buyers can use to create viable vote buying schemes when users are allowed to change votes.
Smart contract/delegated voting: Who gets to vote for funds stored by smart contracts? This is an open question that plagues existing designs; the original CarbonVote allows any contract that can call a function to vote and later change its mind. The EIP999 vote allows contract deployers to vote on behalf of contracts, a decision widely criticized as being intended to sway vote outcomes. However, neither design seems ideal. Indeed, it seems intuitively difficult for a single design to capture all the custody nuances in smart contracts fairly: funds-holding smart contracts can range from simple multisignature accounts to complex decentralized organizations with their own revenue streams and inter-contract financial relationships. Which of these coins have voting rights, and how to fairly assign these rights remains an entirely unexplored philosophical requirement for building a fair on-chain voting system. Forcing contract authors to provide explicit functionality is likely also insufficient, as the very requirements of this functionality can in the future change without backwards compatibility (through either chain voting or forks).
Deniability/provability: All of the schemes explored in this article have features which make them particularly amenable to vote buying: they provide the voter with some form of trust-minimizing cryptographic proof of their vote, either through an on-chain log, a secured web interface, or a smart contract’s state. Such schemes are particularly vulnerable to vote buying, as they make it easy for smart contract-style logic to validate votes. Some traditional e-voting schemes in academic literature provide a property known as coercion resistance. In these schemes, a user is able to change their mind post-coercion using the key they use for voting, and votes are not attributable to individual users. In general, the privacy concerns of having votes associated with any kind of long-standing identity, especially those holding coins, are severe. Such concerns would be completely disqualifying for any serious voting systems in the real world, and probably should be disqualifying in all thoughtful on-chain voting design criteria.
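
The "exponentially increasing payouts" idea from the vote-changing item above can be sketched as follows. This is a hedged illustration only: the function name, the per-period vote history, and the rule that a changed vote resets the streak are assumptions, not a description of any existing contract.

```python
def accrued_bribe(base, growth, vote_history, desired):
    """Total bribe earned when each consecutive period on the desired side pays more.

    The payout for a period is base * growth**streak, where streak counts the
    preceding consecutive periods already spent on the desired side. Switching
    away earns nothing for that period and resets the streak to zero.
    """
    total = 0
    streak = 0
    for vote in vote_history:
        if vote == desired:
            total += base * growth ** streak
            streak += 1
        else:
            streak = 0
    return total


loyal = ["yes"] * 10
wavering = ["yes"] * 5 + ["no"] + ["yes"] * 4
print(accrued_bribe(1, 2, loyal, "yes"))     # 1 + 2 + ... + 512 = 1023
print(accrued_bribe(1, 2, wavering, "yes"))  # (1 + ... + 16) + (1 + ... + 8) = 46
```

With a doubling schedule, a single change of mind costs the wavering voter most of the payout, which is how such a schedule discourages coin movement even when votes can be changed.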

Choose-Your-Own-Security-Disclosure-Adventure

Hacking Distributed

Wed, 30 May 2018 03:15:00 -0700

#^Choose-Your-Own-Security-Disclosure-Adventure

I have been thinking about bugs, responsible disclosure, crowd behaviors and ethical responsibilities. This blog is a choose-your-own-adventure game for security researchers. I want to share some simple yet common scenarios with you, and outline our various options as responsible computer scientists.

My main point will be that there are no good options: no matter what you do in the choose-your-own adventure game, everyone dies. The trolley, the fellow at the switch, the folks on the line, and the folks on the other line as well. And they die because we are, collectively, acting like morons. You and I and everyone else are complicit. This post is an attempt to ask that the crowd learn to demand the right kind of assurance from software, and the right kind of behavior from all the parties involved, because we can do better.

But what if he really did?

BTW, as you read the scenarios outlined below, I'm sure you'll be convinced that I'm criticizing some specific coin or project, except no two readers will agree on which coins. This post is not about The DAO or the-coin-which-cannot-be-named-or-else-they-conjure-a-butthurt-online-brigade and also no-not-that-one-the-other-one or even oh-my-god-they-all-do-that. It's about all of us. The entire set of scenarios is synthetic -- an amalgamation of Sorry-For-Your-Loss (SFYL) events I've seen play out in cryptocurrencies over the years. No need to make it personal, it already is.

The Setup

Imagine that there is a software project out there, developed by some group that is not you. Let's assume that they have found a way to monetize this process. Maybe they are offering consulting services, maybe they are profiting directly by building a cryptocurrency and selling some tokens, maybe they are directing users to side services that they operate to finance this process. Somehow, they are making money off of this thing.

You, of course, are not making money off of their thing. You have no connection to them, no legal obligations. This post is about your ethical obligations, the things you have to do so you can sleep well at night and occasionally visit your relatives' graves without feeling like an utter disappointment.

Now imagine that you know of a fundamental flaw in this project.

What do you do?

Keep quiet, do nothing.
Keep quiet and exploit the bug.
Speak up and say something.

Before you answer, let's fill in with some realism.

Your Backstory

Let's add some details about you, with absolutely no loss of generality. The following backstory is almost always universal.

You didn't just happen to chance upon this flaw. No one just chances upon a flaw that has eluded the main developers who have a profit motive to ferret out the bugs. You spent years of effort, building specialist expertise. When the developers were throwing a pre-pre-launch party, you were combing arcane papers by Lamport. When the dev team was out making it rain ICO-cash in the club, you were stepping through similar code with gdb. And when they were out having "pre-marital sex," as the meme goes, you were thinking about techniques to avoid this particular flaw.

But let's be realistic: you don't actually know for certain that there is a flaw in this specific system right at this time. Sure, you're an expert at this problem, sure, it certainly feels like they have it, but you have not (yet) combed through this particular system. You have a 99% suspicion that they might have this flaw. But there's a 1% chance that they have some other mechanism in place that makes it impossible to exercise the flaw.

Their Backstory

Again without loss of generality, the following is almost universal.

The developer team was working in a hot area with a lot of competition. They could have spent an extra year or two on their project to make sure it is well-executed. But the space is really competitive, ya know. Time is short, besides, the Hooha Coin has a whitepaper that sounds exactly like yours, with phrases lifted directly out of yours, plus they added some bullshit phrases that you forgot to add, so there is no time, you need to hit the market.

The developer team sought advice on business development from successful role models. This really means people who happened to be at the right place at the right time, who have now cashed out and style themselves as successful VCs. As of 2018, it was commonly accepted wisdom to "fake it 'till you make it." They literally told young people to engage in fraud.

Some people on the dev team had been through business school. The main thing they learned in business school was how to network, but the second thing they learned was how to cut corners, how to short-change people on implicit expectations and pocket the difference. There is no law to hold you back from going to market with a possibly broken protocol as long as you yourself do not actively know, for certain, that it is certainly broken.

Finally, the dev team has followed best practices, as taught in school. They used Mongo but they made the port non-public [1]. They did not invent their own crypto. At the risk of going on a side-rant: what is it with disciplines that teach people to stay away from their discipline? Why is it that cryptographers get to tell people to leave cryptography to others, but somehow it's OK to build your own consensus protocol, "eventually-consistent" NoSQL engine that is actually plain old inconsistent, or your own programming language with weird "wat?" semantics? Anyhow, the dev team did whatever it is that we teach at your average top-10 school, the end result of pointless faculty meetings and that special out-of-touchness that comes from the lack of a mandatory retirement age for faculty in the US.

We can now see what happens when you speak up, or don't. In what follows, bold indicates options, NMD means "No More Decisions" down that path, The End means the book ended and everyone lost.

A. You Keep Quiet

You watch the system get deployed with great fanfare. Someone else finds the same flaw. [Restart, from their perspective]

B. You Exploit

The weather is really nice in Thailand. You tell yourself that 2 out of 3 people would take advantage of a serious exploit instead of reporting it if they could get away with the proceeds. 1 out of 5 would use an exploit if it made them $1 more than the bounty.

The border crossings into the US to see living relatives cause a slight feeling of anxiety, but you reassure yourself with "Code is Law, bro." You occasionally get the feeling that your dead relatives are judging you hard, but the hotel staff know this and keep you well supplied with Mai Thais. [The End]

C. You Speak Up

You mention on Twitter that the protocol may be buggy. What happens next depends on your reach.

You have no following. No one pays any attention to you. The bug is exploited by an address connected to other hacks laundered on BTC-e, and everyone loses money. You are an utter disappointment. Doubly so for not exploiting the hack yourself. [The End]

You seek help from someone with a large following. He somehow read your nicely composed letter among the dozens of kooky messages he gets every day, some involving a guy who sold his heater and wants ICO recommendations for his $150, and many involving whitepapers with the exact same bug as what you disclosed. Now he knows what you know, and has a large following. [Restart and follow along from his perspective].

You have a large following. Crowds upvote your post. It makes it to the top of Hacker News. You get 100,000 viewers, approximately 10 times the scrutiny in one day that a typical, award-winning faculty member gets over his or her life's work.

That one guy who is upset at you because you previously pointed out his errors posts something snide. Crowds vote it up because they like drama, and he has numbered references that lead to nonsensical web pages and IRC logs that do not actually support whatever inanity he is spouting. Even though no one actually clicks on the links, a blue link is considered a full refutation regardless of its destination. You ignore him and all the idiots who upvoted him. [NMD]

The aura of negativity created by the previous guy has worked. A friend posts in the HN thread and joins ranks with him. She drops in a snide reference to something personal, like your ethnic origin. You ignore her, knowing that she will come before you in a few years seeking a job. You contemplate what you'll do then: will you remind her of this, or will you never speak of her behavior? You now have two ethical dilemmas. [NMD]

Ok, the developer team goes into overdrive, immediately denying that there is a flaw. They point out that you do not have an exploit. They call your post FUD, and you a shill for the perceived competition. This seems wrong, because you did not have time to short these guys or long the competition, because you were actually thinking about the exploit, while these guys seem to be spending all their time day trading coins.

The people calling you out for FUD are getting aggressive. While your inbox consists 97% of people saying "thank you for what you do," the remaining 3% is full of vitriol and kind of annoying.

[Optional Side Story] You receive an unmarked package in the mail. Not knowing if it's a bomb, you put it on a steel table, crouch, and, holding the scissors with your left hand, open it. As you do this, you curse the day you said anything in public. .... It turns out to be dessert from a fan. You are immediately relieved. It's your favorite kind, too. You gaze at it and laugh at yourself for thinking it was a bomb... You have a jolly good time for 5 minutes, after which you realize that the dessert might be poisoned. You offer it to your colleagues, wondering how much plausible deniability you have if they keel over. [NMD]

In response to FUD allegations, do you:

C1. Ignore them and spend your time on other endeavors instead of building an exploit.

C2. Spend the time building a full exploit.

C1. You Build no Exploit

Someone else writes the exploit for your flaw.

He uses it to steal everyone's money. Everyone lost. You are widely criticized for making the issue public and enabling the theft. [The End]

He does not use it to steal everyone's money. [Go to C2, from his perspective]

C2. You Build an Exploit

You already had a day job. Now you're working for people who never engaged you, paid you, or will compensate you. They will never even be grateful to you. You're basically donating your cycles to a bunch of undeserving people. The guy at the garage with the specialized OBD2 reader won't connect it to your car for less than $15, and here you are, in the highest-tech field, clocking in free hours for someone who is already upset at you for speaking up. You feel bad for being coopted into someone else's broken money making scheme.

What happens next depends on the quality of your exploit.

Your exploit does not work, the bug isn't there. It was the 1% case. Some random other feature made your bug unexploitable, so you look like a fool.

Two years later, the developers go on to build another system, but this time omit the circumstantial feature that kept the bug from being exploited. They lose everyone's money. You feel vindicated, and happy even, which only exacerbates the unease you feel during visits to the family cemetery. [The End]

The bug is present, but you realize it would take too long to develop an exploit. You have better things to do. The developers pretend that it's the 1% case. They are lying, of course.

Two years later, some no-talent security researcher rediscovers the exact same flaw you pointed out. He is unemployed and has the time to develop an exploit. He does not give you any credit. Today, when you google keywords for your own post, his posts come up as the first links. It feels weird. When you see him at a conference, he scampers away. People ask you how this daft guy knows so much about subtleties in your expertise area. [Restart at C2 from his perspective]

Your exploit works, you release it publicly. Your colleagues pull you aside and lecture to you about responsible disclosure. They tell you exactly how you should have run your life, how you have an obligation to your own following, and what they themselves would have done had they ever been in your position, but, you see, it's hard to get out of armchairs.

Someone uses your exploit to steal everyone's money. The devs, who previously called on you to develop an exploit, now attack you for developing an exploit. An FBI agent leaves a message wanting to talk to you, and you cannot tell if it's because you have interesting insights, or because you're a person of interest. Everyone lost. [The End]

Your exploit works, you give it to the developers. What happens next depends on whether the developers are hostile or not.

C2a. The devs are actively hostile, deny that it works. They start peppering you with mumbo jumbo. You have no time for this. The developers suddenly release an extensive patch, mentioning you in a byline, but also say that your exploit never actually worked. Someone who has been watching their code repository duplicates your exploit and steals money from unpatched installations. Everyone loses. [The End]

C2b. The devs are passively hostile, ignore you. After a few weeks, you release your exploit. The aftermath is exactly the same as the case where you release the exploit immediately upon writing it. There is no universally agreed-upon waiting period, nor is a single period suitable for all bugs and all projects. No matter what you do, you will be treated as if you revealed the exploit irresponsibly. [The End]

C2c. The devs are nice people. They acknowledge you, they give you a bounty, and release a patch.

[Optional side story] But at about the same time when you notified the devs of the bug, others heard your exact idea, repackaged it using different words, and now you have to split the bounty. Interestingly, you share the bounty with someone who suggested a way to rewrite the C code that does not impact the produced binary at all -- the devs who were too incompetent to fix the problem are too incompetent to tell who had a material fix and who was just making noise. [NMD]

You feel like you did the right thing. But you took a lot of risk -- there were many paths along the way where this could have turned out differently.

The bounty is a flat $10k. You figure that you put in about 30 hours just in developing the exploit itself. $333/hour is far below your consulting rate, but it seems like a reasonable figure. Yet, like an Uber driver who doesn't realize that the amount he's getting paid per hour doesn't cover the upkeep of his vehicle, you do not realize that you forgot to charge for all the time and effort required to get you to the point where you were able to recognize the error in the first place. When you factor that in, you made around $3/hour. You wonder why there are very few older folks who do what you did.

The exploit would have cost tens to hundreds of millions. You wonder if it would have made more sense to just use the exploit.

Meanwhile, someone else finds a different bug in the same system, and just uses it to steal money. The dev team is perplexed. They know they are nice people, unlike all those other dev teams who deny the existence of bugs and run PR campaigns against the people who call out problems. They know they treated you nicely. They have a Slack channel full of people, unlike everyone else who runs a Slack channel of shills and mobbing agents. They can't understand why more people don't come out and report more bugs.

Moral of the Story

The security flaw reporting game is completely broken. Empirically, we can see that we have an undeniable problem. People do not come forward, and if we think critically, there is absolutely no reason for them to come forward. There have been so many bad actors that the few people who step up ought to be sainted.

The root cause of the problem is that almost every project I know, some worth many many billions, is being run as if it's Billy Bob's Software Shack and Associated Slack Channel of Speculators and Online Mob.

The bounty programs are identical in nature and amount in cryptocurrencies and all other software. This is insane. One is meant to build something like a banking system that holds billions, the other runs on a machine in someone's den that holds Aunt Beth's pictures.

Many large projects do not even have a Chief Security Officer, whose job is to assess the severity of the flaws independently from the development team. That decoupling is essential, not only because it brings a different pair of eyes to the problem, but also because it decouples the egos from the software artifact and makes the problem scenarios less likely to play out.

Personally, I will soon actively divest from coins that lack a dedicated person in charge of security. I suggest that you all do the same. The current situation is laughable, and the scrutiny it brings to cryptocurrencies is bad for all.

And there are other things we can do collectively: (1) actively avoid projects that demand exploits before acknowledging problems, or otherwise create friction; (2) do not fan the flames of online drama involving security flaws (everyone has errors, so let people acknowledge and patch their code in peace); and (3) come up with better incentives and payout schedules for bounties, and at least use a sliding scale based on severity.

Time to start demanding better practices from multi-billion dollar software projects, and we need to start behaving better as a crowd.

[The End.]

Pisa: Arbitration Outsourcing for State Channels

Hacking Distributed

Tue, 22 May 2018 07:00:00 -0700

#^Pisa: Arbitration Outsourcing for State Channels

As deployed today, cryptocurrencies do not scale. To tackle this scaling problem, we introduce Pisa, which complements existing work on so-called "layer 2" solutions.

Pisa focuses on generic state channels, which can be used to build any application (e.g. payments, auctions, boardroom voting, gaming). The core contribution is a new protocol for hiring a third party agent called the custodian. This custodian is designed to help alleviate a new assumption in state channels, which requires every participating party to remain online (synchronised with the blockchain).

In this blog post, we provide a brief description of why a generic state channel is desirable before outlining Pisa. Our primary contribution is the introduction of a custodian that can be held accountable by a customer in the event the custodian fails to watch the channel on their behalf. Afterwards we'll highlight upcoming opportunities to get our research into practice!

Generic State Channels

The premise of a channel is simple: a small group of distrustful parties want to jointly execute a program which relies on information recorded on-chain (and ultimately may influence the state of other on-chain programs) amongst themselves and not directly on the global blockchain. This requires all parties to collectively sign every new state of the program and invalidate the previously authorised state. If all parties cooperate, then every new state is locally kept amongst themselves (i.e. off-chain), but if a single party aborts, then the "disputed" state transition must be executed on-chain. In a way, this allows parties to run a local consensus protocol and rely on the global blockchain as a root of trust to enforce the liveness (and correctness) of a program's execution.
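To make the "sign every new state, invalidate the old one" rule concrete, here is a minimal sketch in Python. It is not the Pisa or Sprites construction: the `Channel` class, its field names, and the use of HMAC as a stand-in for real digital signatures are all assumptions made for illustration. A state is accepted only if it carries a higher version number than the current one and a valid signature from every party.

```python
import hashlib
import hmac
from dataclasses import dataclass


def sign(key: bytes, message: bytes) -> bytes:
    # Assumption: HMAC stands in for a real public-key signature scheme.
    return hmac.new(key, message, hashlib.sha256).digest()


@dataclass
class Channel:
    keys: dict                  # party name -> signing key (hypothetical setup)
    version: int = 0            # monotonically increasing state counter
    state: bytes = b"genesis"

    @staticmethod
    def _digest(version: int, state: bytes) -> bytes:
        return hashlib.sha256(version.to_bytes(8, "big") + state).digest()

    def propose(self, state: bytes, signers: list) -> dict:
        """Build an update for the next version, signed by the given parties."""
        digest = self._digest(self.version + 1, state)
        sigs = {p: sign(self.keys[p], digest) for p in signers}
        return {"version": self.version + 1, "state": state, "sigs": sigs}

    def accept(self, update: dict) -> bool:
        """Accept only a newer version that every party has signed."""
        if update["version"] <= self.version:
            return False  # older states are invalidated by the version counter
        digest = self._digest(update["version"], update["state"])
        for party, key in self.keys.items():
            sig = update["sigs"].get(party)
            if sig is None or not hmac.compare_digest(sig, sign(key, digest)):
                return False
        self.version, self.state = update["version"], update["state"]
        return True


channel = Channel({"alice": b"alice-key", "bob": b"bob-key"})
one_sided = channel.propose(b"alice=10,bob=0", ["alice"])
agreed = channel.propose(b"alice=4,bob=6", ["alice", "bob"])
print(channel.accept(one_sided))  # False: Bob never signed
print(channel.accept(agreed))     # True
print(channel.accept(agreed))     # False: this version is no longer newer
```

In a real channel the same check would be performed by the on-chain contract during a dispute, which is what lets the blockchain act as the root of trust.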

In the best case, all parties can avoid the global network's latency (i.e. confirmation on the blockchain) and the network's transaction fees. On the other hand, state channels introduce a new assumption that each party remains online and synchronised with the global blockchain.

The "always online" assumption

If one party is offline for an extended period of time, then all other parties can collude and perform an execution fork via the global blockchain. As the name suggests, this irreversibly forks the smart contract's execution and effectively invalidates any execution that was authorised locally amongst every party. In the worst case, the colluding parties can steal all coins in the channel. This is problematic for real-world use as it is likely parties will want long-lived channels (perhaps on embedded devices) and cannot remain online at all times.

Leaning tower to the rescue (Pisa)

To help alleviate this new assumption, we introduce Pisa, a protocol to allow a customer to hire a third party agent called the Custodian to watch state (and payment) channels on their behalf. Unlike previous solutions [1][2], Pisa is designed for generic state channels and not just payments. Pisa has the following properties:

State privacy. The Custodian only receives a hash of the state and not the state itself. They may only learn they were hired for a particular state if it is later revealed on the global blockchain via a dispute.

O(1) storage. The Custodian is required to only store the latest appointment job they have received from the customer. As previously mentioned, this is a hash of the state and a signature from every party in the channel.

Publicly verifiable appointments. We propose a fair exchange protocol that ratifies a signed receipt of appointment for the customer upon paying the custodian. This provides cryptographic evidence that the custodian has agreed to watch a channel on the customer's behalf and they have been paid to do it!

Fair (and real-time) reward. The custodian is paid every time the customer hires them to watch a new state on their behalf. If their service is not required (i.e. there is no dispute), the custodian is still paid.

Customer recourse (and holding the Custodian to account). The Custodian may simply not do their job when the customer is offline. In Pisa, we rely on a financial deterrent such that if the customer can prove the custodian's wrongdoing (using the signed receipt), then the custodian's pre-arranged security deposit is automatically forfeited / burnt.

Together, the above properties are necessary for any Custodian protocol to be deployed in practice. The customer has evidence of an appointment, the custodian is rewarded in real-time for their service, and there is cryptographic evidence if the custodian cheats.
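The properties above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the protocol from the paper: the `Appointment`, `Receipt`, and `Custodian` names, the `resolve_complaint` helper, and the use of HMAC in place of a publicly verifiable signature are all hypothetical. The sketch models a single channel, where the custodian keeps just the latest appointment (a state hash, never the state) and loses its deposit if the customer holds a receipt for a newer state than the one a dispute settled on.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Appointment:
    channel_id: str
    version: int
    state_hash: bytes  # state privacy: only the hash is handed over


@dataclass(frozen=True)
class Receipt:
    appointment: Appointment
    custodian_sig: bytes


class Custodian:
    def __init__(self, key: bytes, deposit: int):
        self._key = key
        self.deposit = deposit                      # security deposit at stake
        self.latest: Optional[Appointment] = None   # O(1) storage
        self.earned = 0

    def _sign(self, a: Appointment) -> bytes:
        # Assumption: HMAC stands in for a signature a contract could verify.
        msg = f"{a.channel_id}|{a.version}|{a.state_hash.hex()}".encode()
        return hmac.new(self._key, msg, hashlib.sha256).digest()

    def hire(self, a: Appointment, fee: int) -> Receipt:
        """Accept a paid appointment for a newer state and issue a receipt."""
        if fee <= 0:
            raise ValueError("the custodian is paid for every appointment")
        if self.latest is not None and a.version <= self.latest.version:
            raise ValueError("appointment is not newer than the stored one")
        self.latest = a       # overwrite: older jobs are no longer needed
        self.earned += fee    # paid whether or not a dispute ever happens
        return Receipt(a, self._sign(a))

    def verify(self, r: Receipt) -> bool:
        return hmac.compare_digest(r.custodian_sig, self._sign(r.appointment))


def resolve_complaint(c: Custodian, r: Receipt, settled_version: int) -> int:
    """Return the deposit burnt if the receipt proves the custodian failed."""
    if not c.verify(r) or r.appointment.version <= settled_version:
        return 0
    burnt, c.deposit = c.deposit, 0
    return burnt


custodian = Custodian(key=b"custodian-key", deposit=100)
state_hash = hashlib.sha256(b"alice=4,bob=6").digest()
receipt = custodian.hire(Appointment("chan-1", 7, state_hash), fee=1)
print(resolve_complaint(custodian, receipt, settled_version=7))  # 0
print(resolve_complaint(custodian, receipt, settled_version=5))  # 100
```

The second call models a dispute that settled on an older state (version 5) even though the custodian was paid to watch version 7, so the receipt is enough to burn the deposit.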

Why is this work important, and what comes next?

Our paper presents a generic state channel construction (originally from Sprites) which can be used to build any application. As a bonus point, the custodian can watch any application that is built using this construction too!

Thanks to the Ethereum Foundation, we have received ~$250k to help bring Pisa from research to practical use. We'll be hiring one or two developers to help us out so please get in touch if you are interested! Our first summer project is to build an application within the generic state channel and empirically evaluate the practical difficulties involved in this scaling solution. Of course, the funding's primary goal is to implement the custodian to support the wider ecosystem.

Want to learn more about state channels? Come to our off-chain event in Berlin too!

To the moon 🚀

Acknowledgements. We’d like to thank Lefteris Karapetsas (Raiden) for bringing the monitoring problem to our attention at Devcon3 and the wider Raiden team for their feedback during the development of Pisa.

[1] WatchTower - https://scalingbitcoin.org/transcript/milan2016/unlinkable-outsourced-channel-monitoring

[2] Monitor - https://scalingbitcoin.org/transcript/milan2016/unlinkable-outsourced-channel-monitoring

Online Signature Services Are Broken

Hacking Distributed

Thu, 26 Apr 2018 03:55:00 -0700

#^Online Signature Services Are Broken

In the last few years, we have seen a proliferation of services that allow people to sign legal documents online. Essentially, they send out a link to you via email, you click on the document, write your name, they render it in a fancy font to make it look like a real signature, you click and you're supposed to be legally bound by a document. Variants of this service collect and forward legally binding documents, such as recommendation letters.

All such services are bunk. They fail at their central task, of ensuring that the person doing the signing is who they claim to be. They also fail at their secondary task, of properly documenting the basis for trust, such that, if that trust were to be broken due to fraud, the perpetrators can be prosecuted effectively.

I want to clarify what is wrong with these services because it makes an interesting case study in computer security. By the end of the article, you should be able to forge documents and get into any top CS department of your choice, including our highly ranked program at Cornell, regardless of your background, accomplishments, and previous preparation.

Central Tenets

These services fail because they violate a central tenet of user authentication: document and capture the credentials used to establish identity.

Here, you can see me use one of these services to sign a document that the recipient hopes is going to be legally binding.

Image/photo

That is not my signature. That's not my handwriting. And it could easily have been someone else doing the typing.

Authentication is the act of establishing a link between a claim to an identity and the credentials presented to establish that link. These services fail to document the basis credential.

The simple fact is that these services are performing authentication via an email address. The preparer of the documents makes a claim that "the person you know as EGS has email address el33th4x0r at gmail.com and will be issuing statements we would like to make legally binding."

The service emails a link to their service to that address. Access to that link permits anyone to sign as the user EGS. And yet, the service hides the email address it authenticated and reports the provided name instead, without establishing the veracity of the binding between the name and the email address. So, this service will happily pretend that the documents were signed by "EGS", when in reality, the credential it checked was the email address el33th4x0r. Who the heck is el33th4x0r? How does the recipient know that that's the genuine address I use? How would they know if it instead came from el33th4xor?

Some services allow me to upload a picture of my own signature, and provide that instead of the handwriting font supplied by the system. This confers no actual security. Old-school signatures, in writing, are symmetric: the provider and validator are in possession of the same credentials. This contrasts with public key cryptography, where the situation is asymmetric and the validator can never forge a signature. So anyone who ever processed a check from me or read a letter I wrote is fully capable of producing my exact signature, through the exact same process of scanning that I would use to generate it. It's just security theater to fool gullible people.

Attacks

Online signature services are broken even for the simple case where one party knows the binding between a name and its corresponding email address, for the number of failure points involved in email routing is immense. The sender is trusting BGP, DNS, SMTP+TLS, email forwarding, as well as the security of the email message at rest on email providers.

That's easily in excess of 5 million lines of code. Empirically, we see critical vulnerabilities in this code base at a constant rate, as evidenced by the number of times your software auto-updates itself due to security patches. Undoubtedly, there are operational measures to protect some particularly centralized systems; for instance, the GMail team guards its data at rest carefully following the incident when Chinese agents infiltrated the service and prosecuted some dissidents, but certainly, most institutions come nowhere near this level of diligence. Your email can be intercepted and your "signature" easily forged.

How To Get Into Any Graduate School

And the situation gets much worse when multiple parties are involved, especially when party A is entrusted to provide the binding between party B's name and corresponding email address.

Take the case of graduate school admissions. A number of companies have cropped up that automate the task of collating and forwarding graduate school applications and recommendation letters. Every single one I have seen, without fail, is broken, nothing more than smoke, mirrors, and a few fancy fonts designed to fool unsuspecting people. They all commit the basic error described above by authenticating the email but displaying the name. As a result, they admit massive fraud.

[Incidentally, these services also fail to actually automate the process and generally pose a centralized point of failure. Hackers, and secret services, can easily gain access to 90+% of all the recommendation letters written in a given year, and keep these forever. Overall, higher education should not outsource its core functions. But that's a separate rant.]

The attack is simple: you apply for graduate studies, and you claim that your letter writers are the biggest names you can find. Let's pick some current and future Turing award winners, say, Lampson, Clark, Stonebraker, and Sirer. The system then leaves it up to the applicant to establish the name-email bindings. So you can provide email addresses that you control, and write the juiciest recommendation letters known to humankind from the biggest luminaries in the field. As long as you don't go overboard in the letters, there is nothing in the system that will allow anyone to catch on, because the online signature service never displays the actual email addresses to the people who consume the signed documents. Our admissions committee will never catch on that the letter from the acclaimed Butler Lampson actually came from an email address under the control of the attacker.

At the moment, all graduate admissions are essentially done by the honor code. All vetting happens not through the online signature services, whose job is to help with this vetting, but despite them, via extraneous, social methods. In essence, if it weren't for researchers occasionally talking to each other, the entire authentication system would fall apart.

This is no way to build a modern authentication system. And the fact that we have these poor services convincing people, through their hokey fonts, that they are doing an adequate job is keeping others from entering the same space and doing a better job.

Cause for Concern

Given that most online signature services essentially run for free, it's worth thinking about their economics.

A company that handles documents worth millions or even billions of dollars should charge you something. A failure of their systems, a data loss event on their side, might well render them liable. They need to hire competent staff and run a substantial operation.

Now, one could argue that they need not charge you in proportion to the value they handle, that this is a commoditized business, that there is a lot of competition. But still, because the legal downside is non-zero, there must necessarily be some offsetting charges. Yet I see very little of that.

Leaving aside the value of the documents, there is the value to be gained from knowing what's inside the documents. How much would the US government pay to know all of the business relationships between the actors in Russia? That's exactly how much they would invest in startups that provide free online signing services. The same is true for every other secret service, with foreign agencies sponsoring these companies in target countries. The entire situation is very similar to VPN services: the entire sector seems to be a set of giant honeypots.

And of course, you can bet that every single secret service is working to get access to the data repositories of competing services. It's a grim world.

Future Outlook

Luckily, there is room for optimism despite the sad state of user authentication on the Internet.

In the US, the legal system permits the use of electronic signatures based on cryptography. So we can actually implement strong signatures based on asymmetric, public key cryptography. We can sign documents without ever worrying about the recipient turning around and forging other documents with our signature.

The rise of cryptocurrencies has forced us to build key management infrastructure. Hardware wallets, whose sole function is to carry keys securely and issue signatures, are maturing, albeit slowly.

Building a public key infrastructure is never going to be easy, but at least the right ingredients are falling into place.

Document management is no cheap task, and I don't mean to underestimate how much effort companies otherwise may end up spending to manage their signed documents. But if the alternative is to entrust the entire kit and kaboodle to be managed by an unknown third party that is known to do a poor job at their central task of authentication, and where the data resides on disk, at rest, unencrypted, then it is no alternative at all. I'm optimistic that turnkey, self-hosted solutions can be developed here that do not rely on storing everything at a central point of vulnerability.

So, I expect that the situation will improve over the next decade, because there is no reason for it not to, other than complacency and lack of awareness of just how terrible the existing services are.

In the meantime, you should do three things:

1. Demand that online signature services display the actual credential they checked. For without this, the validator has no way of evaluating the central authentication claim.

If they checked just an email address, they should display just that email address. Displaying anything else as the authenticated user name is dangerously misleading.

This transparency should pave the way for new companies that authenticate users via multiple methods, and permit the consumer of the information to make informed choices.
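As a rough sketch of what "display the actual credential" could look like, the hypothetical record below keeps the unverified name and the verified email address as separate fields and always shows the latter to the validator. The `SignatureRecord` type, its field names, and the `example.com` addresses are illustrative assumptions, not the API of any existing service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignatureRecord:
    claimed_name: str     # supplied by the document preparer, never verified
    verified_email: str   # the only credential the service actually checked
    method: str = "emailed link"

    def display(self) -> str:
        """What the consumer of the signed document should be shown."""
        return (
            f"Signed by whoever controls <{self.verified_email}> "
            f"(checked via {self.method}). The name '{self.claimed_name}' "
            f"was provided by the preparer and was NOT verified."
        )


def same_credential(a: SignatureRecord, b: SignatureRecord) -> bool:
    # Compare the authenticated credential, not the display name.
    return a.verified_email.strip().lower() == b.verified_email.strip().lower()


genuine = SignatureRecord("EGS", "el33th4x0r@example.com")
lookalike = SignatureRecord("EGS", "el33th4xor@example.com")

print(lookalike.display())
print(same_credential(genuine, lookalike))  # False: same name, other mailbox
```

With the address on display, a validator at least has a chance to notice that two documents "signed by EGS" were authenticated against different mailboxes.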

2. Refuse to incorporate insecure services into your workflow at your institution.

3. If you are at an educational institution, you have a higher burden on your shoulders. Refuse to outsource central tasks of a university to third parties. Such parties constitute central points of failure, where their failure can result in the betrayal of the core mission of a university, to protect the students' future careers.

I have written previously about the dangerous trend in higher education towards outsourcing critical information to third parties here

Paralysis Proofs: How to Prevent Your Bitcoin From Vanishing

Hacking Distributed

Thu, 18 Jan 2018 01:30:00 -0800

#^Paralysis Proofs: How to Prevent Your Bitcoin From Vanishing

From the buried gold of Treasure Island to the seven missing Fabergé eggs, lost and stolen treasure has long been the stuff of legend and romance. In Bitcoin, though, there are no princesses, dragons, or (seafaring) pirates, and not much romance. Fortunes are often lost simply because private keys are wiped from laptops, the slips of paper they’re printed on go missing, or they’re stolen by hackers.

Mundane as it may seem, key management is critically important in any cryptographic system. Cryptocurrencies like Bitcoin and Ethereum are no exception. Lost or stolen keys can be catastrophic, but handling keys well is notoriously hard. Users need to protect their keys against theft by wily hackers, without securing them so aggressively that they might be lost. Key management is especially challenging in business settings, where often no one person is trusted with complete control of resources.

A common, powerful approach to key management for cryptocurrencies is multisig transactions, a way to distribute keys across multiple users. Such key distribution is referred to more generally as secret sharing.

We have just released a paper addressing a critical problem with secret sharing in general and in cryptocurrencies in particular. We refer to this problem as access-control paralysis.

How Secret Sharing Can Induce Paralysis
A few months ago, an acquaintance approached us with a simple but intriguing problem, a good example of real-world key distribution challenges.

This person---we’ll call our whale friend Richie---shared ownership of a large stash of Bitcoin with two business partners. Richie and his partners naturally didn’t want any one partner to be able to make off with the BTC. They wanted to ensure that the BTC could only be spent if they all agreed. There’s a simple solution, right? They could use 3-out-of-3 multisig where all three would then need to sign a transaction involving the BTC. Problem solved! Or is it?

However, that’s not the whole story. Richie and his partners were also worried, naturally, about what would happen if one of their keys were lost. The device storing a key might fail, a key might be deleted by mistake, or, in some very unfortunate cases, such as a car accident, a shareholder might physically lose the ability to access her key. The result would be a complete loss of all of their BTC.

This isn’t the only bad scenario. It’s also possible that Richie and his partners have different ideas about how the money should be spent, and can’t come to an agreement. Worse still, one malicious or piggish shareholder might blackmail the others by withholding her key share until they pay her. In this case, the BTC could be lost, temporarily or permanently, to indecision or malice.

We use the term paralysis to denote any of these awkward situations where the BTC can’t be spent. Unfortunately, N-out-of-N multisig doesn’t solve the paralysis problem. In fact, it makes the problem worse, as loss of any one key is fatal.

Image/photo

Richie and his business partners.

For this reason, avoiding paralysis while meeting the goal of Richie and his partners---i.e., requiring full agreement to spend the BTC---seems impossible. Actually, it is impossible with secret sharing alone, because of a basic paradox. Suppose we have an N-out-of-N multisig scheme, which we clearly need to enforce full partner agreement for transactions. If N-1 shareholders can somehow gain access to the BTC when one share goes missing, they can simply pretend that one share has gone missing and access the fund on their own. In other words, what we had to begin with was really some kind of (N-1)-out-of-N multisig, which is a contradiction.

Richie’s problem seems to have left us in a state of paralysis...

Resolving the Paradox
Thanks to the advent of two powerful technologies, blockchain and trusted hardware---Intel SGX, in particular---it turns out that we can actually resolve this paradox. We can do so efficiently and in a very general setting---to the best of our knowledge, for the first time. Toward this end, we introduce a novel technique called a Paralysis Proof System.

As you’ll see, fairly general Paralysis Proof Systems can be realized relatively easily in Ethereum using a smart contract---no SGX required. We present an example Ethereum contract in our paper. Scripting constraints in Bitcoin, however, necessitate the use of SGX and also introduce some technical challenges. Prime among these is the fact that without significant bloat in its trusted computing base, an SGX application cannot easily sync securely with the Bitcoin blockchain.

Our approach leverages a novel combination of SGX with a blockchain that avoids the need for an SGX application to have a trustworthy view of the blockchain.

Paralysis Proof Systems: Intuition
The intuition is fairly simple. A trusted third party holds all of the keys in escrow. If one or more parties cannot or will not sign transactions, leading to the paralysis described above, the others generate a Paralysis Proof showing that this is the case. Given this proof, the third party uses the keys it holds to authorize transactions.

If we have a trusted third party, though, we’re clearly not achieving the security goal set forth by Richie and his friends. One party controls all the keys!

This is where SGX comes into play. An SGX application can behave essentially like a trusted third party with predetermined constraints. For example, it can be programmed so that it is only able to sign transactions when presented with a valid proof. (In this sense, SGX applications behave a lot like smart contracts.) Thanks to SGX, we can ensure that the BTC can only be touched by a subset of parties when provable paralysis occurs.

A few technical details
Of course, even given this magic that is SGX, we still need to ensure that Paralysis Proofs can only be generated legitimately. We don’t want Richie’s partners to be able to “accuse” him, falsely claiming that he’s dead by, say, mounting an eclipse attack against the host running the SGX application. Happily, blockchains themselves provide a robust way to transmit messages and for a party to signal that she is alive. To implement a Paralysis Proof System for Bitcoin, we take advantage of this fact, along with a few tricks. For the sake of simplicity, we’ll focus on the problem of inaccessible keys, setting aside other forms of paralysis for the time being.

A Paralysis Proof is constructed by showing that a party P is not responding in a timely way and thus appears to be unable to sign transactions. The system emits a challenge that the “accused” party must respond to with what we call a life signal. If there’s no life signal in response to a challenge for some predetermined period of time (say, 24 hours), this absence constitutes the Paralysis Proof.

For Bitcoin, a life signal for party P can take the form of a UTXO of a negligible amount of Bitcoin (e.g. 0.00001 BTC) that can be spent either by P---thereby signaling her liveness---or by pk_SGX, but only after a delay. Note that sk_SGX is known only to the SGX application.

Image/photo

Let’s take our example of three shareholders again. Let’s say each of them possesses a key pair (sk_i, pk_i). First they escrow their BTC fund---let’s suppose it’s 5000 BTC---to UTXO_0, an output spendable by either all of them or pk_SGX. Suppose now that P_2 and P_3 decide to accuse P_1. Upon receiving their request, the SGX application prepares the following two transactions and sends them to P_2 and P_3:

t_1 that creates a life signal UTXO_1 of 0.00001 BTC spendable by either pk_1 immediately or by pk_SGX after a timeout (e.g. 144 blocks, 24 hours)
t_2 that spends both UTXO_0 and the life signal UTXO_1, to an address spendable by pk_2 and pk_3 (or pk_SGX optionally, if they want to stay in the Paralysis Proof System).

The shareholders that accused P_1 should therefore broadcast t_1 to the Bitcoin network, wait until t_1 is added to the blockchain, then wait for the next 144 blocks, and then broadcast t_2 to the Bitcoin network. There are two possible outcomes:

In the case of legit accusation where P_1 is indeed incapacitated, P_2 and P_3 will obtain access once t_2 is mined. This ensures the availability of the managed BTC fund.
In the case of malicious accusation, however, the above scheme ensures that P_1 has the opportunity to appeal while these 144 blocks are being generated. To do so P_1 just spends UTXO_1 with the secret key that is known only to her (the script of t_1 does not require the CSV condition for spending with her secret key). Since t_2 takes both UTXO_0 and UTXO_1 as inputs, spending UTXO_1 renders t_2 an invalid transaction.
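The two outcomes can be traced with a toy ledger. The sketch below is a simplification written for this post, not Bitcoin script or the code from the paper: the `Ledger` class and the helper functions are hypothetical, signatures are not modelled at all, and the relative timeout is reduced to a comparison of block heights. It only captures the two facts the argument relies on, namely that t_2 needs both inputs unspent and that its UTXO_1 input matures after 144 blocks.

```python
from dataclasses import dataclass, field

TIMEOUT_BLOCKS = 144  # roughly 24 hours at one block per ten minutes


@dataclass
class Ledger:
    height: int = 0
    utxos: dict = field(default_factory=dict)  # name -> creation height

    def mine(self, blocks: int = 1) -> None:
        self.height += blocks

    def create(self, name: str) -> None:
        self.utxos[name] = self.height

    def spend(self, name: str) -> None:
        del self.utxos[name]


def broadcast_t1(ledger: Ledger) -> None:
    """The accusers publish t_1, creating the life signal UTXO_1."""
    ledger.create("UTXO_1")


def appeal(ledger: Ledger) -> bool:
    """P_1 spends UTXO_1 with her own key; no delay applies to her."""
    if "UTXO_1" not in ledger.utxos:
        return False
    ledger.spend("UTXO_1")
    return True


def broadcast_t2(ledger: Ledger) -> bool:
    """t_2 is atomic: both inputs must exist and the timeout must have passed."""
    if "UTXO_0" not in ledger.utxos or "UTXO_1" not in ledger.utxos:
        return False
    if ledger.height - ledger.utxos["UTXO_1"] < TIMEOUT_BLOCKS:
        return False
    ledger.spend("UTXO_0")
    ledger.spend("UTXO_1")
    ledger.create("FUND_FOR_P2_P3")
    return True


def run(p1_alive: bool) -> bool:
    """Return True if the accusers end up controlling the fund."""
    ledger = Ledger()
    ledger.create("UTXO_0")         # the escrowed fund
    ledger.mine()
    broadcast_t1(ledger)
    ledger.mine(10)
    if p1_alive:
        appeal(ledger)              # the life signal is spent within the window
    ledger.mine(TIMEOUT_BLOCKS - 10)
    return broadcast_t2(ledger)


print(run(p1_alive=False))  # True: legitimate accusation, fund is released
print(run(p1_alive=True))   # False: P_1 appealed, so t_2 is invalid
```

Broadcasting t_2 before the 144 blocks have elapsed also fails in this model, which mirrors the requirement that the accusers wait out the timeout.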

Security Reasoning
The security of life signals stems from the use of a relative timeout (CheckSequenceVerify) in the fresh t_1, and the atomicity of the signed transaction t_2. To elaborate, t_2 will be valid only if the witness (known as ScriptSig in Bitcoin) of each of its inputs is correct. The witness that the SGX enclave produced for spending the escrow fund is immediately valid, but the witness for spending t_1 becomes valid only after t_1 has been incorporated into a Bitcoin block that has been extended by 144 additional blocks (due to the CSV condition). Thus, setting the timeout parameter to a large value serves two purposes: (1) giving P_1 enough time to respond, and (2) making sure that it is infeasible for an attacker to create a secret chain of 144 blocks faster than the Bitcoin miners, and then broadcast this chain (in which t_2 is valid) to overtake the public blockchain.

Beyond Cryptocurrencies and Paralysis
Although we use Bitcoin as a running example, the power of Paralysis Proof Systems extends beyond cryptocurrencies and the techniques behind them support a range of interesting new access-control policies for problems other than paralysis. Some of these policies are easy to enforce in smart contract systems like Ethereum, but others aren’t because they rely on access-controlled management of private keys, which can’t be maintained on a blockchain.

For example, Paralysis Proof Systems can be applied to credentials for decryption. You can use Paralysis Proofs to create a deadman’s switch for the release of a document, allowing it to be decrypted if a person or group of people disappear. Here are some other examples of access-control policies other than paralysis that can be realized thanks to a combination of blockchains (censorship-resistant channels) plus SGX:

Daily spending limits: It is possible to ensure that no more than some pre-agreed-upon amount---say 0.5 BTC---is spent from a common pool within a 24-hour period. (There are some practical limitations discussed in our paper.)
Event-driven access control: Using an oracle, such as our Town Crier system (which is actually the first public-facing SGX-backed production application), it is possible to condition access-control policies on real-world events. For example, daily spending limits might be denominated in USD, rather than BTC, by providing a data feed on exchange rates. One could even in principle use natural language processing to respond to real-world events. For example, a document with compromising information could be decryptable by a journalist should its author be prosecuted by a federal government.
Upgrading threshold requirements: Given agreement by a predetermined set of players, it is possible to add and/or remove players from an access structure, i.e., set of rules about authorized players. E.g., it is possible to convert a k-out-of-N decryption scheme to a (k+1)-out-of-(N+1) scheme. In a regular secret sharing scheme, upgrading isn’t possible, because a group of authorized players can always reconstruct the key they hold. If an SGX application controls a decryption key, however, it can monitor a blockchain to determine if players have voted for an upgrade. Votes are immune to suppression if they’re recorded on chain.

In general, the combination of SGX and blockchains supports a fundamental revisitation of access-control policies in decentralized systems and introduces powerful new access-control capabilities---capabilities that are otherwise impossible to achieve.

To learn more, read our paper here.

Appendix
Many interesting extensions of the above are discussed in our paper. Here are a couple.

Paralysis Proofs via Covenants
As mentioned, it is the scripting constraint of Bitcoin that necessitates the use of SGX. In fact, we also present a (somewhat less efficient) approach without trusted hardware that makes use of covenants, a proposed Bitcoin feature. We refer readers to our paper for a covenant-based protocol. The bottom line is, the complexity of the covenants approach is significantly higher than that of an SGX implementation (in terms of conceptual as well as on-chain complexity). As there have been recent proposals to support stateless covenants in Ethereum, the comparative advantages of our SGX-based design may prove useful in other contexts too.

Tolerating broken SGX
In the aforementioned example, the fund can be spent by pk_SGX alone, but it’s important to note that’s not the only option. In fact, one can tune the knob between security and paralysis-tolerance to best fit one’s needs.

For example, if the three shareholders only desire to tolerate up to one missing key share, they can move the funds into a 3-out-of-4 address where the 4th player is the SGX enclave. If all of them are alive, then they can spend without SGX. If one of them is incapacitated, the enclave will release its share if the remaining two players can show a Paralysis Proof. Therefore, even if the secret of SGX is leaked via a successful side-channel attack, the attacker cannot spend the fund without colluding with two malicious players.

This is an interesting line of future research we intend to pursue.

The Social Workings of Contract

Hacking Distributed

Wed, 17 Jan 2018 05:00:00 -0800

#^The Social Workings of Contract

Recently, scholars have begun paying attention to the legal limitations of so-called smart contracts. There are several salient critiques: smart contracts may imperfectly capture obligations; they may not fully account for changed circumstances; they may still require external interpretation. Taken together, these issues may well impede the legal utility of smart contracts “in the wild.” But there’s another set of issues at play, too: in the real world, contracts have social utility, and people use them in complex, strategic ways that often don’t align with their legal rights and obligations. These social functions require flexibility—often, the very flexibility that is intentionally short-circuited by smart contracts.

In my recent paper, “Book-Smart, Not Street-Smart: Blockchain-Based Smart Contracts and The Social Workings of Law,” I describe three common contracting practices that illustrate how contracts actually “work” in the social world. People may include contract terms that they know to be legally unenforceable in order to set behavioral norms for their contracting partners; for instance, “pay-if-you-stray” infidelity clauses in prenuptial agreements are generally unenforceable in court, but still communicate expectations about how the parties will treat each other. Or, people might include contract terms that are purposefully vague; particularly in long-term relations where the parties contract with one another repeatedly, it can promote stability to leave some expectations underspecified. Finally, parties might strategically decide not to formally enforce an enforceable contract. The looming shadow of a lawsuit can be enough to encourage people to bargain between themselves, and the outcomes of these bargains are often mutually preferable to what might result from formal adjudication.

Altogether, these uses of contract suggest that contracts are social resources as much as they are legal mechanisms. But smart contracts focus on the technical form of contract to the exclusion of social context; that’s what they’re designed to do. We might think of smart contracts as book-smart, not street-smart. While they may facilitate technically perfect and seamless implementation of agreements, the social friction required to negotiate and enforce a “dumb” contract can, in some cases, be functional and desirable. This doesn’t imply that there is no role for smart contracts in some social settings, but it does suggest that attention to the setting matters a lot. In the paper, I suggest that as a matter of policy, we ought to carefully consider the social characteristics of contracting environments—the goals of the parties, the longevity of the relationship, the availability of reputational mechanisms, and the like—before deploying smart contracts into them.

Decentralization in Bitcoin and Ethereum

Hacking Distributed

Sun, 14 Jan 2018 23:37:00 -0800

#^Decentralization in Bitcoin and Ethereum


We have been conducting a longitudinal study of the state of cryptocurrency networks, including Bitcoin and Ethereum. We have just made public our results from our study spanning 2015 to 2017, in a peer-reviewed paper about to be presented at the upcoming Financial Cryptography and Data Security conference in February [1].

Here are some highlights from our findings.

Bitcoin Underutilizes Its Network
Bitcoin nodes generally have higher bandwidth allocated to them than Ethereum nodes. Compared to our previous study in 2016, we see that the median bandwidth for a Bitcoin node has increased by a factor of 1.7x. The typical Bitcoin node has much more bandwidth available to it than it did before.

Higher allocated bandwidth indicates that the maximum blocksize can be increased without impacting orphan rates, which in turn affect decentralization. If people were happy about the level of decentralization in 2016, they should be able to increase the block size by 1.7x to clear almost twice as many transactions per second while maintaining the same level of decentralization.

Some people argue that increasing the maximum block size would also prohibitively increase CPU and disk requirements. Yet these costs were trivial in the first place, especially compared to today's transaction fees, and have come down drastically. For instance, a 1TB disk cost $85 on average in 2016 and $70 in 2017 [2].

To date, we have seen no sound, quantitative arguments for any specific value of the maximum block size in Bitcoin. Arguments on this topic have consisted of vague, technical-sounding-yet-technically-unjustified argumentation, bereft of scientific justification. The dissonance between the technical-soundiness of the arguments and the actual technical facts on the ground is disconcerting for a technological endeavor [3].

Ethereum is Better Distributed Than Bitcoin
Compared to Ethereum, Bitcoin nodes tend to be more clustered together, both in terms of network latency as well as geographically. Put another way, there are more Ethereum nodes, and they are better spread out around the world. That indicates that the full node distribution for Ethereum is much more decentralized.

Part of the reason for this is that a much higher percentage of Bitcoin nodes reside in datacenters. Specifically, only 28% of Ethereum nodes can be positively identified to be in datacenters, while the same number for Bitcoin is 56%.

Nodes that reside in datacenters may indicate an increased level of corporatization. They may also be a symptom of nodes deployed to skew node counts for various different implementations (a.k.a. part of Sybil attacks to influence public opinion), a hypothesis that was floated extensively during the course of our study.

In contrast, Ethereum nodes tend to be located on a wider variety of autonomous systems.

Neither Are All That Decentralized
Both Bitcoin and Ethereum mining are very centralized, with the top four miners in Bitcoin and the top three miners in Ethereum controlling more than 50% of the hash rate.
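
The measurement behind this statement is a simple count: sort miners by hash rate share and add them up until the running total crosses 50%. Here is a minimal sketch of that count. The share lists are made-up values chosen only so that the answers come out as 4 and 3; they are not the measured data from our paper.

```python
def miners_needed_for_majority(shares, threshold=0.5):
    """Smallest number of top miners whose combined share exceeds threshold.

    shares: iterable of hash rate fractions (0.0 to 1.0), in any order.
    Returns None if even all listed miners together stay at or below threshold.
    """
    total = 0.0
    for count, share in enumerate(sorted(shares, reverse=True), start=1):
        total += share
        if total > threshold:
            return count
    return None


# Hypothetical share distributions, for illustration only.
hypothetical_chain_a = [0.20, 0.17, 0.12, 0.10, 0.08, 0.07, 0.06]
hypothetical_chain_b = [0.28, 0.15, 0.12, 0.09, 0.07, 0.05]

print(miners_needed_for_majority(hypothetical_chain_a))  # 4
print(miners_needed_for_majority(hypothetical_chain_b))  # 3
```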

The entire blockchain for both systems is determined by fewer than 20 mining entities [4]. While traditional Byzantine quorum systems operate in a different model than Bitcoin and Ethereum, a Byzantine quorum system with 20 nodes would be more decentralized than Bitcoin or Ethereum with significantly fewer resource costs. Of course, the design of a quorum protocol that provides open participation, while fairly selecting 20 nodes to sequence transactions, is non-trivial.

Thus, we see that more research is needed in this area to develop permissionless consensus protocols that are also energy efficient.

Ethereum Wastes Mining Effort That Can Be Put To Better Use
Ethereum has a much higher uncle rate than Bitcoin's pruned block rate. This is by design, as Ethereum operates its network closer to its physical limits and achieves higher throughput. As a result, however, less of Ethereum's hash power goes towards sequencing transactions than Bitcoin's. Put another way, some hash power is wasted on uncles, which do not help carry out directly useful sequencing work on the chain.

This indicates that Ethereum would greatly benefit from a relay network, such as Falcon or FIBRE for Bitcoin. Relay networks ferry blocks quickly among miners and full nodes, and help reduce wasted effort by reducing uncle and orphan rates.

Ethereum Exhibits Better Variance in Fairness, Favoring Small Miners
Fairness is an important metric: it determines whether a small miner is at a greater disadvantage compared to a larger miner. If a system were perfectly fair, there would be fewer reasons for miners to pool their resources into larger, cooperating pools that operate in unison.

To measure fairness, we looked at the proportion of blocks that miners have on the main chain divided by the proportion of their blocks that did not help advance the blockchain, namely, pruned blocks and uncles. In an ideal system, this metric would be equal to 1.
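
As a rough sketch of this ratio, the snippet below computes, for each miner, its share of main-chain blocks divided by its share of blocks that did not advance the chain. The function name, the miner names, and the block counts are all hypothetical; the exact procedure used in the study is described in the paper.

```python
def fairness(main_blocks, wasted_blocks):
    """Per-miner fairness ratio.

    main_blocks:   dict miner -> number of blocks on the main chain
    wasted_blocks: dict miner -> number of pruned blocks or uncles
    A value of 1.0 means the miner wastes blocks in proportion to its size.
    """
    total_main = sum(main_blocks.values())
    total_wasted = sum(wasted_blocks.values())
    if total_main == 0 or total_wasted == 0:
        raise ValueError("need at least one main-chain and one wasted block")
    ratios = {}
    for miner, count in main_blocks.items():
        main_share = count / total_main
        wasted_share = wasted_blocks.get(miner, 0) / total_wasted
        # A miner with no wasted blocks at all is infinitely advantaged.
        ratios[miner] = main_share / wasted_share if wasted_share else float("inf")
    return ratios


# Made-up counts: the large miner has 60% of the main chain but only
# half of the wasted blocks, so it comes out ahead of the small miner.
print(fairness({"large": 600, "small": 400}, {"large": 5, "small": 5}))
```

A value above 1 means the miner loses fewer blocks than its size would predict; a value below 1 means it loses more.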

The level of fairness in both systems is, roughly speaking, comparable. But there is a big difference in variance of fairness, with Bitcoin exhibiting high variance. That is to say, mining rewards are more unpredictable for smaller miners in Bitcoin. This is partly because the high block rate in Ethereum helps provide many more opportunities for the law of large numbers to apply in Ethereum, while Bitcoin, with its infrequent blocks, can exhibit much more uncertainty from month to month.

More
The full details, of how we measured the data and what we found in more precise terms, are in our paper.

Footnotes
    [1]    Our study examines solely the networks and the blockchain maintained by those networks. It does not examine development centralization. Balaji Srinivasan and Leland Lee have developed a metric, called the Nakamoto Coefficient, that attempts to capture centralization across different fields.
    [2]    Historical price data is notoriously difficult to find, for some reason. The specific sources we used are PC Part Picker and Camel. Our personal experience was more drastic than the industry average, closer to a 2X drop in price over the same time frame.
    [3]    Concomitantly, Bitcoin Core has adopted a narrative that it is a Store of Value, in effect making it explicit that the token is not a technological artifact meant to facilitate payments, but an investment vehicle where early adopters are compensated by latecomers.
    [4]    Of course, some of these entities are pools. And some people will claim that pools provide decentralization, because they are composed of multiple independent actors. This argument is incorrect for a few reasons: (1) we retrospectively examine the historical record, and at the time of that particular block's commitment to the blockchain, there was a de facto, undeniable agreement among the pool members to act in unison, now recorded on the blockchain, (2) perhaps the pool members would leave if the pool engaged in activities that damage the currency, but this has historically not happened, to the point where a pool exceeded 51% of the hash power, (3) even if pool members were motivated to leave their pool in the presence of unwanted behaviors (e.g. selective transaction censorship by the pool), their ability to do so depends on their ability to detect these behaviors, and most participants are not geared to detect them in the first place. In short, pools providing any level of decentralized decision making is more aspirational talk than a proven reality.

How Not To Run A Blockchain Lottery

Hacking Distributed

Mon, 25 Dec 2017 00:05:00 -0800

#^How Not To Run A Blockchain Lottery

Today's post is a cautionary tale on why running a lottery on a blockchain is so incredibly hard to get right [1].


Here's the setting: Eric Lombrozo, a Bitcoin Core developer, was in the Christmas spirit yesterday and decided to give away 1 BTC, split into 10 chunks of 0.1 BTC each, to people who re-tweeted him. This is a gift of approximately $1500 [2] each.

He wanted to make the giveaway provably fair, so he devised the algorithm described in his tweet thread. I won't go into the gory technical details at all, except to note that, in essence, he combines two block hashes at a pre-determined block height and derives a 16-bit number (H) that is coprime to the number of retweets. Coprime, also known as "relatively prime," just means that the winning lottery number H and the number of re-tweets do not have a divisor in common. He then indexes into the list of re-tweeters 10 times, rewarding every (H)th retweeter, wrapping around as necessary.

Now, the scheme is quite ornate and complicated. But the key operation that's happening underneath is simple: he is deriving a random number from two block hashes. This is a pattern I've seen in use in at least a dozen buggy Ethereum Dapps, and many of you are going to stop reading at this point thinking you understand the problem. Do read on, because there are multiple problems, and the actual bugs are not the usual, obvious ones, even though it'll seem that way at first.

Let's delve into this scheme and note the interesting observations and thoughts as we go along:

Concerned About Miner Attacks

1. Eric is worried about miner manipulation of his lottery.

Everyone deriving random numbers from block hashes should worry about attacks by miners. Recall that a miner wishing to tilt the lottery can do so by computing a block and seeing if its hash yields a good outcome for the miner. If not, the miner tosses out the block without making it public.

But Miner Attacks Are Not A Problem
2. But in this case, this worry about the miners is completely overblown. A miner would have to be insane to discard a perfectly good block to tilt this particular lottery, because they would stand to lose more than 20 BTC (~$300,000) for a gain of 0.1 BTC ($1500).

Failed Defenses Against Mythical Attacking Miners
3. Regardless, some people are overly paranoid about things that will not happen. We have heard quite a bit about the "Chinese miners" [3] and there is a subreddit dedicated to vilifying them. And Core developers keep reminding us how cautious they are as a group. Perhaps the extra paranoia is called for; at least, maybe it's harmless.

Is Paranoia A Virtue?
4. But is extra paranoia really harmless? Or as Ross Anderson has argued relentlessly for a few decades, should one make security decisions based on costs? Is computer security, at its heart, a game of tradeoffs, where quantitative reasoning should keep us from going towards unnecessarily ornate solutions, which, in their effort to guard against things that are not happening now and will not happen in the future, themselves introduce other flaws? Even if they don't introduce problems directly, they take our time and concentration away from other potential issues. While we're trying to focus on the miners, we might totally miss the boat on other, bigger problems, right?

Things get philosophical at this point, so I'll abandon this line of thought and assume that the miners are evil. There is a subreddit full of messages and memes to this effect.

Does This Distrust of the Miners Actually Work?
5. So how good is this approach at keeping the evil miners at bay? Not at all. This scheme derives a random number from the combination of two hashes, at heights h and h+5. It does not make sure that these blocks came from two separate miners. What's the point of picking two numbers if they are coming from the same entity?

And even if blocks h and h+5 were coming from different miners, how do we know that they are not colluding? This seems like an insurmountable problem.

Well, It's Broken Anyway
6. But no matter! This scheme is broken anyway, in the usual, predictable way. The miner who mines the second block knows the first one, so he is fully and solely in control of the lottery's outcome. So, the two miners don't even have to collude, the second one controls the outcome.
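
To see why the second miner is in control, here is a toy sketch of the grinding attack. Everything about the derivation is an assumption made for illustration: the real scheme's way of combining the two hashes is not reproduced here, SHA-256 over made-up byte strings stands in for block hashes, and the "bump to the next coprime value" rule is hypothetical. The point is only that whoever produces the second hash already knows the first, so they can keep trying candidates until the outcome suits them.

```python
import hashlib
from math import gcd

PRIZES = 10  # ten chunks of 0.1 BTC


def derive_h(hash_a: bytes, hash_b: bytes, n: int) -> int:
    """Hypothetical derivation of a 16-bit H that is coprime to n."""
    h = int.from_bytes(hashlib.sha256(hash_a + hash_b).digest()[:2], "big") or 1
    while gcd(h, n) != 1:  # assumption: step forward to the next coprime value
        h += 1
    return h


def winners(h: int, n: int) -> set:
    """Positions hit by striding through n retweeters in steps of h."""
    return {(k * h) % n for k in range(1, PRIZES + 1)}


def grind_second_block(first_hash: bytes, n: int, my_position: int,
                       max_tries: int = 100_000):
    """Try candidate second hashes until my_position is among the winners."""
    for nonce in range(max_tries):
        candidate = hashlib.sha256(b"candidate block" + nonce.to_bytes(8, "big")).digest()
        if my_position in winners(derive_h(first_hash, candidate, n), n):
            return nonce, candidate
    return None


first = hashlib.sha256(b"block at height h").digest()  # public before h+5 is mined
found = grind_second_block(first, n=1000, my_position=7)
print("candidates tried:", found[0] + 1 if found else "gave up")
```

In this toy every candidate is free. On the real network, each candidate is a valid block that the miner would have to find and then throw away, which is exactly the cost discussed in point 2.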

But It Doesn't Matter
7. But wait, per point #2 above, the randomness of the number does not matter, because rational miners won't launch attacks that cost more than the potential winnings, let alone attack a goodwill gesture. Now, if this was a state lottery, it would be a different story, but it isn't, so let's move on. Miner attacks are not going to happen, and all the energy spent worrying about them is wasted.

It's Broken In a Different Way
8. This brings us to the crux of a more fundamental problem. This particular scheme picks numbers that are coprime to the number of re-tweeters, and picks small multiples of that number. Picking coprime integers is a good idea when devising pseudo-random number generators (PRNGs), but this is not a PRNG and the numbers aren't picked carefully, they are picked from the blockchain and multiplied by a small series of integers.

Signs of Trouble
9. As a result, some numbers are much more likely to be chosen for H than others. A simple example illustrates why. Imagine that Eric's tweet gets 1000 retweets. There are 65534 potential numbers that can be coprime with 1000. For instance, 2, 4, 5, 6, 8, and 10 are not coprime, so they will not be the first picked number. Their multiples are not going to be chosen either.

Now, it could be that a smaller number is picked, like 1, and the first 10 numbers would then be winners.

And it could be that a large number is picked, and it wraps around and covers numbers that would otherwise not be covered.

But there may well be numbers that are neither covered by getting picked (i.e. coprime with N) nor covered by a large number wrapping around (i.e. a multiple of a coprime mod N). Let's call these unlucky numbers.

11. Are there such unlucky numbers? Yes, it turns out that, if there are 1000 tweeters, there is absolutely no way that tweeters 20, 25, 40, 50, 60, 75 would ever win!
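
This is easy to check by brute force. The sketch below is not the simulator linked later in this post; it is a minimal model that assumes the winners are the positions k*H mod N for k = 1..10, with H coprime to N. Only H mod N matters under that assumption, so it is enough to try every residue below N. Residue 0 is left out, because how it maps to a retweeter depends on indexing details this sketch does not model.

```python
from math import gcd

PRIZES = 10  # number of winners drawn


def reachable_positions(n: int) -> set:
    """All positions some coprime stride H can land on within PRIZES steps."""
    reachable = set()
    for h in range(1, n):
        if gcd(h, n) != 1:
            continue  # H must be coprime to the number of retweeters
        for k in range(1, PRIZES + 1):
            reachable.add((k * h) % n)
    return reachable


def unlucky_positions(n: int) -> list:
    """Positions in 1..n-1 that no valid H can ever select."""
    reachable = reachable_positions(n)
    return [p for p in range(1, n) if p not in reachable]


print(unlucky_positions(1000)[:6])  # [20, 25, 40, 50, 60, 75]
```

The reason is visible in the model: k*H shares with N exactly the divisors that k shares with N, and with k at most 10 that common divisor can never be 20, 25, 40 or 50.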

More Trouble
12. We don't know how many retweeters there will be. So perhaps the unlucky numbers for 1000 tweeters are different enough from the unlucky numbers for 1001 tweeters, and the probabilities cancel out?

We could do some number theoretic analysis at this point, but it's Christmas Eve and it's a lot easier to just crunch the numbers and check. What we can do is iterate over the possible numbers of re-tweeters, and see if there are any favored positions and unlucky numbers, and then we can see by how much the favored people are ahead of the unlucky ones.

13. I wrote a little simulator that examines what happens for every number of re-tweets. It's on GitHub here. Everyone can look and see for themselves if I missed something.

Oooh Baby
14. Here are the results in graphical form, for a number of re-tweets in the range 1000 to 4000.

[Figure: number of ways each retweeter position can win, for retweet counts between 1000 and 4000]

We can immediately see that re-tweeter number 10 is highly favored! He has 641486 different ways he can win. The second most favorable position is person 970, who has a respectable 632404 ways to win. Retweeters 2, 8 and 9 are also in the top 25.

Winners              Losers
Position  Ways       Position  Ways
--------  ------     --------  ------
10        641486     143       475215
970       632404     336       473269
890       632321     672       472628
830       631763     756       470375
790       631226     528       468487
730       630904     840       464022
710       630818     780       461876
670       630004     660       455016
610       629438     420       454108
590       628902     924       440952

Compare this with the losers at the tail. Retweeter 924, for instance, has only 440952 ways to win! Retweeter #10 is almost 1.5 times more likely to be chosen than retweeter #924! His compatriots at 420, 660, 780, 840, and 528 are other lacklusters who are much less likely to be chosen.

Even Bigger Error
15. That all seems very clever. It initially looked like the mechanism was manipulable by miners, but then it turned out that that was totally irrelevant and the scheme was flawed all by itself. The distribution is not uniform; it favors certain people.

But there is a much bigger, even more glaring error. I'll give you a paragraph break to think about it.

16. This lottery is actually tilted in favor of people who run social media manipulation services. If you have a bunch of Twitter sockpuppets, you can occupy many more slots than honest people. None of the fancy math above actually matters if you have 1000 sockpuppets and there are only 1000 organic retweeters. The sockpuppets' chances of winning will be 50%.

Wrapping Up
So, what did we learn from this exercise?

Deriving fair, unmanipulable randomness from a blockchain is difficult.
Combining multiple block hashes does not provide any immunity against miner attacks: the last miner is in full control of the outcome.
If one fixates on unlikely problems, one can miss much more likely ones right under one's nose.
Striding into a big list with a number that is coprime to the size of the list will not yield a uniform selection. In fact, this scheme exhibits incredible skew, and timing your tweet well can give you a huge advantage over others.
But the day belongs, as it often does in the short term, to social media trolls and astroturfers.

    [1]    Actually, as with every Bitcoin related tale these days, this discussion contains, fractally, the entirety of the block size debate within it.
    [2]    Specifically, it's $1340 at the time of this writing, minus $40 to receive the funds, minus another $40 to use them.
    [3]    The "Chinese miners" are almost always lumped together by their nationality as if all of them come from the same mold. Yes, it's inappropriate.

Parity Proposals’ Potential Problems

Hacking Distributed

Wed, 13 Dec 2017 00:00:00 -0800

#^Parity Proposals’ Potential Problems


Be careful when you resurrect the dead.

Since the second Parity Multisig hack that froze a total of 514,774 ETH (242 million USD at the time of writing), there has been an ongoing debate about possible ways to unfreeze the funds and return them to their rightful owners. Yesterday, Parity Technologies released a blog post and a draft EIP, consisting of four variants of a proposal that would allow the stuck funds to be unfrozen.

All of the Parity proposals essentially address the same issue. In general, they allow self-destructed contracts to be revived through the re-deployment of a new contract at that address. In the following discussion, we will start out by thinking about the security implications of such contract revival proposals in general terms, and then comment on the specifics of Parity’s four proposals. We conclude the post with a recommendation for how a potential unfreeze of funds should work.

A General Security Problem
The Parity proposal allows deployers of contracts to deploy new, different contract code in a dead contract’s stead. Any smart contracts that have been deployed with a self-destruct opcode now allow their creator to re-instantiate them with their choice of code after a self destruct. This allows creators of contracts with self-destruct to self-destruct and re-deploy their contract, potentially leading to loss of users’ funds.

A concrete example: let’s say you had a contract like 0x’s deposit contract, responsible for holding a number of ERC20 tokens as part of its operation. If such a contract could somehow be self-destructed, the creator can attempt to invoke the self-destruction mechanism and replace the deposit contract with a forwarder to themselves. Unwitting users may then send funds to the creators rather than the intended decentralized exchange. We discuss more concrete examples later in the post.

This example may seem contrived, but with the complex networks of interacting, custodial, and interdependent contracts we are building, any contract that relies on the code in contracts it interacts with not changing can be affected by intentionally or accidentally malicious contracts implementing this mechanism. The important security invariant that after deployment, code at address A can only change to empty is broken for any callers.

The Ethereum developers have so far been very conservative with breaking invariants, a strategy that we applaud. For instance, the adoption of EIP-86 was deferred due to concerns about breaking exactly the same invariant that this proposal modifies.

As we were writing this post, Nick Johnson published an excellent analysis that also argues against breaking invariants and looks at the proposed changes from a network-safety point of view. Our posts don’t overlap much, so it’s definitely worth reading his take.

On Semantic Compatibility
The fundamental property violated by Parity’s proposed changes is semantic compatibility. In general, at the time a contract is written, an author expects the contract to operate in a manner determined by the defined and documented semantics of the underlying language and platform (and any core existing contracts). When an auditor audits a contract, the same is true: auditors have to reason quite intimately about very specific edge cases and interactions of EVM constructs, and cannot make assumptions about the behavior of any code in isolation.

The proposed changes would introduce a major change to the semantics and security model of the EVM retroactively, potentially altering the assumptions of programmers, languages, and higher level tools made in the process of building a contract.

A big emphasis in the Ethereum community recently has been placed on high-assurance software development, including rigorous development and testing practices and use of formal models and tools. The proposed changes substantially weaken the security model for any contracts using self-destruct: in high-assurance, safety critical software development, any changes to the underlying platform are rigorously validated, tested, and re-evaluated for effects on higher level software that builds on them. Put another way, no vehicle manufacturer would upgrade its operating system in production without re-testing, re-auditing, and re-evaluating all the software that runs on this OS for bad interactions, side effects, and bugs. And for the proposed change, that includes every smart contract on the network today.

With regards to audits and formal verification, it is important to note that any such work performed on pre-fork contracts may need to be re-done post-fork, specifically if the verification of these contracts used the valid semantic assumptions of unchanging code in other contracts, or if the verified contract contained the self-destruct operation. This could set a precedent that imposes repeated ecosystem costs for an expensive process that ideally should only be required once.

Not all semantics-breaking changes are bad. In some cases, new semantics are introduced to components of the system explicitly specified as changeable (e.g. new opcodes, or new pre-compiles). It is important to not let backwards compatibility become a dogma, but for security, care and minimally invasive procedures are still required.

The Parity Proposals
We noted a number of implementation questions with the concrete code that Parity has proposed, specifically in their Proposals C and D. These are orthogonal to the higher level security issues described above, but nonetheless introduce substantial complexity into the Ethereum platform.

Hacky contract proxies: function setupProxyForContract(uint nonce, address destination) public { proxy[address(keccak256(msg.sender, nonce))] = destination; } is used to set the proxy for contracts in Proposal D. This allows the creator of the contract to set the proxy to any code they want at any time in the future (note that the setupProxyForContract guard does not have a single-call check, and can be used to swap out proxy contracts many times and on the fly; this allows contract creators to essentially hot-swap recovery contracts from under their users). Currently, contract creators hold no special power over contracts they created; this proposal would change that.
Hacky contract proxies 2: The above code does not handle cases where it was a contract that created a contract. In these cases, this creator contract must explicitly include proxy setup functionality, with old contracts that created contracts not able to participate in this proxy process. For a general solution, all cases should be handled.
Asymmetry between ether and tokens: In what is presumably an attempt to reduce the risk of revived contracts stealing funds, Proposals B, C, and D don’t mark the fallback function in the proxy contract as payable. Therefore, revived contracts can receive token payments, but not ether payments. This inconsistency introduces additional complexity in reasoning about the behaviour of revived contracts.
Conflicting mention / security properties of tx.origin and msg.sender: The proposal comment // please note the usage of tx.origin does not seem to match the code below it (same as in (1)). tx.origin is not compatible with contracts created by contracts being revived (e.g. multisigs that generically forward call data). We believe a discussion of this choice should be a part of the proposal.
Light clients: In this proposal, users can deploy code generating arbitrary logs at dead contract addresses. Especially in Proposal D, any user can generate any log event for any destructed address enabling proxying. In particular, this can allow them to fool light clients and applications that rely on logging code being consistent with the audited contracts they deploy. This is likely to pose a security threat to a wide range of applications.
Inconsistent delays: Delays are used in Proposal B, but not C or D. It is unclear to us whether these were intentionally omitted, because some problems they would help mitigate exist in all three proposals.
msg.sender “spoofing”: Revived contracts will carry the same msg.sender as their previously destructed counterparts. Users can no longer rely on the code of a contract staying consistent across calls to external systems that use msg.sender for authentication (like ERC20 functionality). Two examples of this are provided in the next section.
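The hot-swapping concern from the first item can be illustrated with a toy model. The Python sketch below is not Parity's code: `ProxyRegistry` and `derive_address` are hypothetical names, and SHA3-256 from `hashlib` merely stands in for Ethereum's Keccak-256 (the two are not byte-compatible).

```python
import hashlib


def derive_address(creator: str, nonce: int) -> str:
    """Toy stand-in for address(keccak256(msg.sender, nonce)).

    Assumption: hashlib.sha3_256 is used for convenience; it is NOT
    byte-compatible with Ethereum's Keccak-256.
    """
    digest = hashlib.sha3_256(f"{creator}:{nonce}".encode()).digest()
    return "0x" + digest[-20:].hex()


class ProxyRegistry:
    """Hypothetical model of the proxy mapping described for Proposal D."""

    def __init__(self):
        self.proxy = {}

    def setup_proxy_for_contract(self, sender: str, nonce: int, destination: str) -> str:
        # No single-call check: a later call silently overwrites the entry.
        dead_address = derive_address(sender, nonce)
        self.proxy[dead_address] = destination
        return dead_address


registry = ProxyRegistry()
addr = registry.setup_proxy_for_contract("0xcreator", 7, "0xaudited_recovery")
first = registry.proxy[addr]

# The creator later swaps the recovery contract out from under its users.
registry.setup_proxy_for_contract("0xcreator", 7, "0xmalicious_recovery")
second = registry.proxy[addr]
print(first, "->", second)
```

A guard that rejected a second call for the same address would close this particular hole, though not the other issues listed above.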

While the proposed mechanisms (“create contract proxy at any nonce for this address”) would allow victims of the second Parity multisig hack to recover their losses, it isn’t clear that they merit adoption in their own right. A proposal for a general mechanism should be convincing on its own, ignoring the amount of funds from a single hack that could be restored if the proposal were adopted.

Proposals C and D in particular would introduce significant additional technical complexity, and make the already difficult problem of reasoning about interactions between smart contracts even harder.

Vulnerable Examples
Here are a few concrete examples of contracts which may be vulnerable to a few of the problems we discussed above.

ERC20 Exchange
Bob’s full-featured exchange contract is responsible for swapping tokens on a blockchain. It’s tied to Bob’s (centralized) web UI, so it has a hard-coded self-destruct function that only Bob can call, recovering funds in case the contract is somehow compromised or to be upgraded. The contract was created by a temporary key used by Bob’s employee, Dave, who after all was just deploying a no-owner contract to the blockchain.

Sensing an incoming hack, Bob self-destructs his contract and reclaims all the users’ funds. Unfortunately, some hard-coded client contracts and exchanges still send funds to Bob’s contract, leaving them stuck there for Dave to steal (if he saved the key).

Oracle
Town Crier is an oracle service in production on Ethereum today that uses Intel's SGX. Like all good smart contract citizens, a Town Crier-like system can include self-destruct capabilities, where IC3 can destruct a contract in the event of an SGX compromise or failure. In the Town Crier security model, you are trusting IC3 for availability but not integrity: when you get answers to your queries, you know they are right. The self-destruct poses no security risk in the old EVM model, as you trust IC3 for availability regardless. The current Town Crier contract implements a similar permanent kill switch.

Such a system could also send all its query responses through a proxy contract, that checks that the transaction is really being sent by Town Crier, and charges the user a fee once the query is successfully validated. In fact, this is exactly what Town Crier does (line 182).

With a feature like this, it’s easy to see what can go wrong: IC3 maliciously self-destructs the Town Crier contract, installing its own proxy that sends fake data to Town Crier clients. Because these clients use msg.sender for authentication, they assume the real Town Crier contract is sending them a call (an assumption that, until these proposals, was founded on the EVM semantics).

This proposal makes the Town Crier paradigm impossible unless the Town Crier contract can be validated to not have self-destruct functionality. This defeats the purpose of self-destruct as a cleanup incentive, and forces all security-critical contracts to enforce and audit the absence of a self-destruct, a non-trivial task, especially if delegatecalls are used.

In general, making contract code mutable breaks the use of msg.sender for code authentication and validation, a common use-pattern in both oracles and ERC20 contracts (where, for example, users transfer tokens to an address they know is a multisig to secure them).
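To make the msg.sender problem concrete, here is a minimal Python sketch of a client that authenticates an oracle purely by caller address. `ToyChain` and `OracleClient` are hypothetical names and the "chain" is just a dictionary from address to code; nothing here models Town Crier's actual contract.

```python
class ToyChain:
    """Hypothetical chain: maps an address to the code (a callable) living there."""

    def __init__(self):
        self.code = {}

    def deploy(self, address, handler):
        self.code[address] = handler

    def self_destruct(self, address):
        del self.code[address]

    def call_from(self, address, client):
        # The callee only ever sees the caller's address (msg.sender),
        # never the code that produced the call.
        payload = self.code[address]()
        return client.receive(msg_sender=address, payload=payload)


class OracleClient:
    """Client that authenticates oracle answers by msg.sender alone."""

    def __init__(self, trusted_oracle):
        self.trusted_oracle = trusted_oracle
        self.accepted = []

    def receive(self, msg_sender, payload):
        if msg_sender != self.trusted_oracle:
            return False
        self.accepted.append(payload)
        return True


ORACLE = "0xoracle"
chain = ToyChain()
client = OracleClient(trusted_oracle=ORACLE)

chain.deploy(ORACLE, lambda: "price=100 (attested)")
chain.call_from(ORACLE, client)

# Under a revival proposal, new code can appear at the same address.
chain.self_destruct(ORACLE)
chain.deploy(ORACLE, lambda: "price=1 (fabricated)")
chain.call_from(ORACLE, client)

print(client.accepted)
```

The client's check never changes, yet the second answer comes from code it never audited.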

Bad Miner
Alice, Bob, and Carol are starting an e-cats based venture together, and have decided to open a Kitty-Backed-Token (KBT), and collect cats together in their contract. Unfortunately, a bad apple hacks their kitty tokens and begins inflating his own supply, withdrawing a higher and higher share of kitties. Luckily, the users of KittyToken can vote to self-destruct the contract before it is too late, allowing its original deployers Alice, Bob, and Carol to collect the cats and save them from certain death.

The Ethereum network has implemented Parity Proposal D, which allows users to set their own proxies for the contract. As soon as the period of time opens to create forwarders, a miner named Dave creates his own proxy and inserts it first into his block, transferring out all the rare kitties in the contract to himself. Dave need not own any KBT to do this, since any user can set their own resolver and essentially spoof their msg.sender as being the new zombie contract, the DeadKittyToken (DKT).

These kitties sit on external contracts, and again have the flaw of using msg.sender for authentication of transfers. Users assume this is OK, because msg.sender can only be the smart contract they audited, but the sneaky code change allows the evil Dave to send all the kitties to 0xdead.

Our Suggestions
A number of steps should be taken to resolve this anti-pattern for future contracts, as well as to potentially deal with previously lost funds. We recommend an alternative set of actions:

Separate funds-recovery for previously vulnerable contracts from changes that need to happen to both the EVM and its tools to avoid similar losses in future contracts, using a clean-slate approach to design the latter. EIP86 seems like the most studied candidate for introducing a redeployment mechanism to Ethereum. We recommend a modification to Solidity’s Security Considerations to cover self-destruct based antipatterns, and a static warning in the Solidity compiler for use of self-destruct, especially when calling a library containing the operation. In our view, this should be sufficient to ensure funds are not lost going forward, specifically in its ability to allow users to demand redeployable contracts.

Offer the community the choice of a fork for past contracts, which should consist of a set of target contracts curated by a community review process over a significant length of time, remediating losses in previously affected contracts. This fork should be offered in the same manner as the DAO fork, and should not be officially supported by client developers or the Ethereum Foundation (yes, including Parity).

The anti-patterns leading to these funds losses are not a severe and general enough issue to require breaking changes to established EVM behavior, and should be handled through changes to tooling and potential case-by-case remediation for contracts before these changes occurred (we propose this as an option, but do not provide an opinion on whether to recover the funds in this article).

Conclusion
We do not believe the funds loss of recent self-destruct vulnerabilities to be general enough to require a general solution. Such a general solution can be harmful, both in the practical sense of allowing potential future funds loss vectors and in the theoretical sense of setting poor precedents that have the potential to impede the process of reliable software development on Ethereum.

We advocate for separating funds-recovery for past vulnerable contracts, which should be discussed as potential contract-specific forks on their own merits, from changes that need to happen to both the EVM and its tools to avoid the vulnerability of future contracts.

Acknowledgments
Sincere thanks to Ari Juels and Emin Gün Sirer for reading and providing comments on earlier versions of this post.

Who Has Your Back in Crypto?

Hacking Distributed

Sat, 26 Aug 2017 06:55:00 -0700

#^Who Has Your Back in Crypto?

I came across a Twitter poll on which entities have their interests and priorities most closely aligned with Bitcoin users. The results, if they are to be believed, indicate an enormous misunderstanding, or else they betray the result of a successful disinformation campaign. To wit, more than 60% of the respondents believe that the devs have users' interests at heart. Only 23% trust businesses, and a mere 15% say the miners.

This is completely wrong.

In general, open-source (OSS) developers, especially second generation developers who were not present at the inception of the project, have skewed interests that are at odds with those of the users. Depending on your investment thesis, either the miners or businesses have their economic interest best aligned with users. Let's discuss why.

It is incredibly common and ordinary for second generation developers to add gratuitous complexity and bloat to a project. There are two reasons that compel them to do this.

Wanna Be Famous
One reason is to simply leave their unique mark. How else would anyone know that the developers are any good if there isn't a unique new vision? What new line do the developers get to put on their resumes, or which files do they point to in github on subsequent job interviews? "Added Schnorr signatures" sounds far more impressive than "I responded to bug reports and refactored a 5000-line main.cpp file" no matter how useless and cumbersome Schnorr signatures may be to use in practice, and how badly a refactoring was needed [1].

This means that projects often evolve by incorporating vanity features, typically by force of pure ego. Your typical primadonna dev will want to add a dessert table to the banquet; good luck getting them to slice the bread. Especially when there is turnover in a team, the next generation often change the project aesthetic.

A well-known example and ongoing fail-mobile is the systemd project, where a Linux developer named Poettering is trying to import into Linux all of Windows' problems. Like PulseAudio before it, systemd cites legitimate areas where Linux needs improving, but then breaks away from the underlying Unix aesthetic in every possible way. Poettering and his team can't be satisfied with incremental fixes that preserve the original vision -- it is essential that they redo things in their completely unique way, which totally isn't a bug, except I couldn't hear you when we had PulseAudio and I can't parse my logs now, can't install an update without rebooting my machine like a Windows pleb, and sometimes can't start up my system and it's totally not a bug.

Wanna Be Rich
The second reason is to make themselves indispensable, by increasing the complexity of the OSS project. Many open source developers are uncredentialed, young people with few other accomplishments and job prospects, looking to break into a lucrative industry. A complex code base immediately makes them an expert because they know all its warts and they know where all the skeletons are buried in the code. That's because they put them there. This immediately guarantees a consulting stream. You want to add a tweak? Well, you can't, you need to hire someone who understands the mess.

The best example for this is Asterisk [2]. An OSS project so complex, no one can do anything other than what's explicitly in the tutorials, and even then, it requires magic incantations and blood sacrifice. For those who do not know, Asterisk is a system for building phone management systems. You would think that audio management would involve the composition of nice, modular units, with uniform interfaces and clean configuration files. You'd be wrong. When I last looked, there were no abstractions in Asterisk. Things just barely worked for the few use cases, and only if you did everything the way you were prescribed. If you went off script, everything would break down. "If you do A, then B, then X, then S, you'll get the effect you want, because X relies on B's setup and leaves the state ..." you get the point. The people who have mastered this completely useless tangle of constraints ensure a steady income stream and enjoy a standing within their own community of people who do not know better of being an "expert." It's like being an expert at naming things, it sounds like a discipline, almost like science, but it's all just a bunch of does-not-matter in the end.

It is the rare person who can create something, stand behind it, put in the grueling hours to respond to bugs and errors, and then see others get rich off of it, without any trace of wanting to participate in the action. So if the codebase does not provide a direct way of compensating the developers, rest assured that the developers will find a way.

Consequences
These motivations are short-sighted and self-limiting. They drive away new devs from entering the space and strangle the project. The behaviors above are rewarded in the short term, but spell the senescence phase for a project.

What About Cryptocurrencies?
So who has your best interests at heart when it comes to cryptocurrencies? Is it the miners, the businesses, or the developers?

If you hold coins, your interests are aligned with those who hold massive amounts of coins. That's not developers. It's most definitely not second-generation developers who were not there on day one when coins were cheap.

If you believe the currency will gain value by expanding its community, you are aligned with those who desperately want the economy to expand. That's also not developers.

Miners hold massive amounts of coins. Businesses want a bigger economy, more users and fast coin turnover.

In contrast, developers have complex games they play. Sure, they want the coin value to grow, but not drastically more than your cousin Joe that you somehow convinced to go into your favorite coin with you. If they weren't there for the pre-mine, they wouldn't even have that many coins. While we all hate pre-mined coins, they do provide the right incentives for the long-term success of the coin.

Another way to analyze the situation is to look at the potential losses to be incurred by the different entities. A miner has 100s of millions of dollars' worth of hardware they have committed to the future success of the currency. A typical startup will have a few to tens of millions at risk. Collectively, VCs have poured more than a billion into the cryptocurrency space and they want to see it expand tenfold for returns.

In contrast, a typical second-generation dev has invested just their own time, which may possibly have been partially subsidized by a firm, and collected an average number of coins [3]. It's often unclear that they would have had better prospects elsewhere. They rage-quit and walk away from projects all the time. The worst that can happen to many devs, whose visions are flawed and whose bets turn out to completely tank a project, is to have to switch IRC channels, clone a different but similar git repository, and muzzle their toxicity as they integrate themselves into a different social hierarchy.

Takeaway
When evaluating statements from miners, businesses and developers, cast your lot accordingly. No one likes corporations; many large and monopolistic ones tend to eke out profits at the expense of their users. Miners, well, they tend to be predominantly Chinese these days due to abundance of excess hydro power over there, so, if someone is the sieg-heilin', statue-lovin', hitler-would-have-been-alright-if-he-had-obtained-marching-permits type [4], it's clear what one would think. And developers have the benefit of actually having a face, and for some, a 7x24 social media presence. And the thing about troll-backed deceptive narratives is that it sounds all so good, and so popular -- all those likes and upvotes from those nameless trolls in those censored forums must indicate something, right?

Yet the underlying forces are precisely the opposite of how things might seem on a naive, superficial examination.

Of course, in an ideal world, people would evaluate proposals from a scientific perspective, instead of casting their lot with a particular tribe. And the ecosystem would not have distinct roles where people with different roles have goals in opposition to others. Whoever can invent that coin, whoever can build a community based on reasoned, civil, scientific discussion, is sure to make a killing.

    [1]    Bitcoin's main.cpp file used to be 5000 lines long, and quixotically, remained so for many years. Schnorr signatures are nice, there is nothing wrong with them, and we all need them as much as we need a second prehensile tail. "Wait, we don't even have a first one," you might say. That would be the point.
    [2]    I was an avid Asterisk user for multiple years, thanks to a high pain tolerance that allows me to also work on cryptocurrencies. While it was fun to automatically screen telemarketers, to redirect calls to my nearest phone, and to play hold musac for people who called the house, the underlying software was nothing but just bloat, with no attention paid to clean interfaces and repurposable components.
    [3]    On average, people are average. One could claim that developers would more consistently buy in, comparable to more fervent currency fans, but one could also claim that they would want to diversify their risk as well. From my own personal observation, many crypto investors have personal wealth that far exceeds early developers.
    [4]    He did, in fact, have a permit.

To Sink Frontrunners, Send in the Submarines

Hacking Distributed

Thu, 24 Aug 2017 22:01:00 -0700 last edited: Sun, 27 Aug 2017 22:01:00 -0700

#^To Sink Frontrunners, Send in the Submarines

The problem: Frontrunning
Miner frontrunning is a fundamental problem in blockchain-based markets in which miners reorder, censor, and/or insert their own transactions to directly profit from markets running on blockchain economic mechanisms. Miner frontrunning takes advantage of the responsibility of miners in a blockchain system to order transactions, in an attack described in great detail by Martin Swende in a blog post, which we highly recommend for background on this important issue.

Frontrunning is not strictly theoretical: in practice, frontrunning-like tactics have been observed during large ICO releases, with increasingly sophisticated attacks anticipated as the financial incentives for gaming high-profile contracts increase and attract more sophisticated attackers.

As described in a previous article on frontrunning, "any scheme that a) provides full information to miners b) doesn't include nondeterminism and c) is vulnerable to ordering dependencies is gameable."

In this article, we will examine a potential mitigating strategy for frontrunning through the lens of a fair sealed/closed bid auction mechanism that resists miner frontrunning. This strategy can be used for fair markets, fair ICOs, fair ENS-style auctions, and much more.

We stumbled upon this problem during the creation of our upcoming billion-dollar ICO token launch, HaxorBux (HXR).

Our mitigating strategy is an idea that we call a submarine send. Submarine sends aim at a strong confidentiality property. Rather than just concealing transaction amounts (which isn’t sufficient to prevent frontrunning, as we’ll show), submarine sends conceal the very existence of a transaction. Of course, a permanently concealed transaction isn’t very useful. Submarine sends thus also permit a transaction to be surfaced by the sender at any desired time in the future, hence the term “submarine”.

Image/photo

Submarines: They can’t frontrun you if they can’t find you

We have published a proof-of-concept implementation on GitHub (still lacking some features like the Merkle-Patricia verification described later in this post). We encourage contributions and comments in the issues section there!

Why it’s hard to prevent frontrunning
A folklore method for concealing transaction values, used for everything from randomness generation to (non-blockchain) sealed-bid auctions, is called commit / reveal. The idea is simple. In a first protocol phase, every user U_i submits a cryptographic commitment C_i = commit(val_i) to the value val_i representing her input to the protocol, e.g., her bid in an auction. After all inputs have been gathered, in a second phase, every player reveals val_i by decommitting C_i. During the first (commitment) phase, no user can see any other user’s bid / value val_i, and thus no user can front-run the second phase.
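As a rough illustration of commit / reveal, the Python sketch below uses a salted hash as the commitment. This is one common instantiation, not something the protocol description prescribes; the function names `commit` and `reveal` and the choice of SHA-256 with a 32-byte random nonce are our assumptions.

```python
import hashlib
import hmac
import secrets


def commit(value: int):
    """Return (commitment, opening) for a non-negative integer bid."""
    nonce = secrets.token_bytes(32)            # blinds low-entropy bids
    opening = nonce + value.to_bytes(32, "big")
    return hashlib.sha256(opening).digest(), opening


def reveal(commitment: bytes, opening: bytes) -> int:
    """Check the opening against the commitment and return the bid."""
    if not hmac.compare_digest(hashlib.sha256(opening).digest(), commitment):
        raise ValueError("opening does not match commitment")
    return int.from_bytes(opening[32:], "big")


# Phase 1: each user publishes only the commitment.
c_alice, o_alice = commit(750)
c_bob, o_bob = commit(900)

# Phase 2: users decommit; the auction can now compare bids.
bids = {"alice": reveal(c_alice, o_alice), "bob": reveal(c_bob, o_bob)}
print(max(bids, key=bids.get))
```

The random nonce matters: without it, anyone could brute-force small bid values by hashing each candidate and comparing against the published commitment.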

This is all well and good, but val_i is a value, not actual money. So what if a user wins a sealed-bid auction conducted this way? How do we ensure that she actually pays the amount val that she bid?

In a decentralized system, where users may be untrustworthy, val is a more or less worthless IOU. The only way to ensure payment is for all users actually to commit $val, i.e., commit the money represented in their bid. In Ethereum, though, there is no (simple [1]) way to conceal the amount of Ether in an ordinary send. So when P sends $val as a commitment, she reveals her bid amount. That brings us back to square one. [2]

Suppose instead of Ethereum, we were using a hypothetical system that actually concealed transaction amounts and even the sender identity (but not whether a contract receives a transaction). Call it ZEthereum. Then, of course, we’d solve the problem. Front running isn’t possible in an auction if you don’t know your competitors’ transaction amounts. Right?

Unfortunately, even concealed transaction amounts don’t do the trick, as a simple example shows.

Example 1 (Transaction Existence): Tweedledum is bidding on a rare military helmet in an auction administered through a ZEthereum smart contract. He knows that there is only one other possible bidder, his brother Tweedledee, who loves the helmet, but has at most $1000 available to bid for it. If Tweedledum learns of the existence of a second bid, his strategy is clear: He should bid $1000.01. Otherwise, he can win with a bid of $0.01, far below the real value of the helmet. Even though bid amounts are concealed, Tweedledum can game the auction by learning whether or not a bid exists.

Image/photo

Tweedledee cheated of helmet by Tweedledum’s frontrunning

While this example is contrived, there are real systems in which the existence and also the timing of bids can leak information. For instance, in an ICO, it’s easy to imagine even naive algorithms estimating interest in a token and altering their bid based on contract buy-in statistics.

Send in the submarines
Our solution, a submarine send, is a poor-man’s solution to the problem of concealing transaction amounts and existence for a target smart contract "Contract". It doesn’t conceal transaction amounts as strongly as our hypothetical ZEthereum would, and it doesn’t definitively hide the existence of transactions. But it provides what we believe to be good enough guarantees in many cases.

The key idea is to conceal sends to Contract among other, unrelated and indistinguishable transactions.

A submarine send embeds a real transaction among a collection of cover transactions, thus achieving a form of k-anonymity. Put another way, the submarine transaction sits in an anonymity set consisting of k cover transactions. While not as strong as notions such as differential privacy or full concealment based on cryptographic hardness assumptions, k-anonymity has proven useful in many practical settings. And some low-cost enhancements can strengthen our scheme’s properties.

The basic format of a submarine send is simple: It is a send of $val from an addrP, where P is the player interacting with Contract, to a fresh address addrX. The trick is to structure address addrX such that it has two properties: (1) addrX looks random in the view of potential adversaries and (2) $val is committed to Contract in a manner that can be proven cryptographically.

Let’s illustrate with an example, showing how a system that is vulnerable to frontrunning can be rendered more secure by use of submarine sends.

Example 2 (Ash ICO): Suppose that smart contract AshContract implements sales of a new type of token called an Ash. Purchasers burn Ether to obtain tokens at a market-driven Ether-to-Ash exchange rate.

It’s easy to see that AshContract is vulnerable to frontrunning. A large buy transaction will substantially raise the market price of Ash. Thus a miner that observes such a transaction Trans can profit by squeezing in her own buy transaction before Trans and selling afterward. We can remedy this problem by enhancing AshContract as follows:

Example 2 (Ash ICO) continued: In a commit phase, a player P with address addrP makes a submarine send via transaction Trans as follows. P sends $val to address addrX for public key X, where X = H(addrP, SCN, TKEY), where SCN is a contract-specific identifier, TKEY is a transaction-specific key / witness (e.g., a randomly selected 256-bit value), and H is Ethereum-SHA-3.

In a reveal phase, to prove that she has burned $val, P sends TKEY to AshContract. AshContract then: (1) Verifies TKEY is a fresh key; (2) Computes X = H(addrP, SCN, TKEY) and addrX; and (3) Checks addrX.balance == $val. Upon successful verification, AshContract awards Ash tokens to P.
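The three checks in the reveal phase can be sketched in Python as follows. This is a toy model under stated assumptions: SHA3-256 stands in for Ethereum-SHA-3 (Keccak-256), the way addrX is derived from X is simplified to a second hash, chain balances are a plain dictionary, and tokens are awarded 1:1 rather than at a market-driven rate. The names `submarine_address` and `AshContract.reveal` are ours.

```python
import hashlib
import secrets

SCN = b"AshContract-v1"  # hypothetical contract-specific identifier


def submarine_address(addr_p: bytes, scn: bytes, tkey: bytes) -> bytes:
    """addrX derived from X = H(addrP, SCN, TKEY).

    Assumption: SHA3-256 stands in for Ethereum's Keccak-256, and the
    address is taken as the last 20 bytes of a hash of X.
    """
    x = hashlib.sha3_256(addr_p + scn + tkey).digest()
    return hashlib.sha3_256(x).digest()[-20:]


class AshContract:
    def __init__(self, balances):
        self.balances = balances      # toy view of chain balances
        self.used_keys = set()
        self.ash = {}

    def reveal(self, addr_p: bytes, tkey: bytes, val: int) -> bool:
        if tkey in self.used_keys:                      # (1) TKEY must be fresh
            return False
        addr_x = submarine_address(addr_p, SCN, tkey)   # (2) recompute addrX
        if self.balances.get(addr_x, 0) != val:         # (3) addrX.balance == $val
            return False
        self.used_keys.add(tkey)
        # Assumption: 1:1 award instead of a market-driven exchange rate.
        self.ash[addr_p] = self.ash.get(addr_p, 0) + val
        return True


balances = {}
addr_p = bytes.fromhex("11" * 20)
tkey = secrets.token_bytes(32)

# Commit: an ordinary-looking send to a never-before-seen address.
balances[submarine_address(addr_p, SCN, tkey)] = 5

contract = AshContract(balances)
first = contract.reveal(addr_p, tkey, 5)
replay = contract.reveal(addr_p, tkey, 5)
print(first, replay)
```

Binding addrP into the hash is what stops a third party from replaying someone else's TKEY to claim their tokens.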

Observe that during the commit phase, X and addrX are computed from an as-yet unrevealed key (namely TKEY). Thus the transaction Trans is indistinguishable from an ordinary send to a fresh address. [3] Assuming that many other such ordinary sends happen during the bidding period, an adversary cannot identify Trans as a purchase of Ash.

During the reveal phase, P proves to AshContract that she burned $val to purchase Ash. [4] Given the way public key X is computed, it is infeasible to compute a corresponding valid private key. Thus money sent to addrX is unrecoverable, i.e., burned.

Of course, there are cases in which we don’t want to burn the committed currency. As we now show, it is possible to implement submarine sends in which $val is recoverable by AshContract after the reveal phase.

Image/photo

The two phases of a submarine send

Submarine sends via EIP-86
It is possible to implement cheap and simple submarine sends in which $val can be recovered. For this purpose, we rely on a new feature introduced in EIP-86. EIP-86 was scheduled to go live in the upcoming Metropolis hard-fork, but unfortunately its deployment has been postponed for now. We detail a temporary, more expensive option for implementing submarine sends in the blog appendix.

The crucial change in EIP-86 is the introduction of a new CREATE2 opcode to the Ethereum Virtual Machine. Like the already existing CREATE opcode, the new opcode will also create new smart contracts on the blockchain. However, unlike CREATE, CREATE2 will compute the address of the newly created smart contract C as H(addrCreator, salt, codeC), where addrCreator is the address of the contract’s creator, salt is a 256-bit salt value chosen by the creator, and codeC is the EVM byte code of C’s initcode.

To construct submarine sends, we combine this new way of computing contract addresses with a very simple smart contract which we call Forwarder. Forwarder performs one function: Upon receiving a message-call, it checks whether the message-call was sent by Contract. If so, Forwarder sends its entire Ether balance to Contract. Otherwise, Forwarder does nothing. Here is a simple implementation of Forwarder written in Solidity:

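contract Forwarder {

    // Address of the target contract ("Contract" in the text); 0x123 is a placeholder.
    address constant addrContract = 0x123;

    // Fallback function: runs on any message-call to Forwarder.
    function () {
        // Only Contract may trigger the sweep; all other callers are ignored.
        if (msg.sender == addrContract)
            addrContract.send(this.balance);
    }

}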

To save gas, we can also implement Forwarder directly in EVM bytecode, reducing the size of the Forwarder contract by ~ 75%.

Let’s put the pieces together and examine how the commit and reveal phases work.

Commit: To commit, P computes the address A := H(Contract’s address || Data || initcode(Forwarder)), where Data contains any additional required information about P’s bid as well as a fresh nonce. The bidder P then sends $val to A. At this point in time, A is a fresh address which has never been observed on the network.
Reveal: Upon P’s revelation of Data, Contract verifies the freshness of the nonce contained in Data, computes A as described above and checks that A.balance == $val. Contract then instantiates Forwarder at address A using the CREATE2 opcode. Next, Contract message-calls into this Forwarder contract which in turn promptly transfers $val to Contract.
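The commit and reveal steps above can be simulated end to end with a toy chain in Python. Everything here is a sketch under assumptions: SHA3-256 stands in for the EVM hash, `FORWARDER_INITCODE` is a placeholder byte string, the whole of Data is treated as the freshness token, and contract creation plus the message-call into Forwarder are collapsed into a balance transfer.

```python
import hashlib

FORWARDER_INITCODE = b"<forwarder initcode placeholder>"  # hypothetical bytes


def h(*parts: bytes) -> bytes:
    """Stand-in hash (SHA3-256, not Ethereum's Keccak-256), truncated to 20 bytes."""
    return hashlib.sha3_256(b"".join(parts)).digest()[-20:]


class ToyChain:
    def __init__(self):
        self.balance = {}
        self.forwarders = set()

    def send(self, to: bytes, amount: int):
        self.balance[to] = self.balance.get(to, 0) + amount


class Contract:
    def __init__(self, chain: ToyChain, address: bytes):
        self.chain, self.address = chain, address
        self.seen_data = set()

    def commit_address(self, data: bytes) -> bytes:
        # A := H(Contract's address || Data || initcode(Forwarder))
        return h(self.address, data, FORWARDER_INITCODE)

    def reveal(self, data: bytes, val: int) -> bool:
        if data in self.seen_data:          # nonce inside Data must be fresh
            return False
        a = self.commit_address(data)
        if self.chain.balance.get(a, 0) != val:
            return False
        self.seen_data.add(data)
        self.chain.forwarders.add(a)        # models CREATE2 of Forwarder at A
        # Models the message-call: Forwarder sends its balance to Contract.
        swept = self.chain.balance.pop(a)
        self.chain.send(self.address, swept)
        return True


chain = ToyChain()
contract = Contract(chain, bytes.fromhex("22" * 20))
data = b"bid=3;nonce=8f1c"                  # hypothetical encoding of Data

chain.send(contract.commit_address(data), 3)   # commit phase
ok = contract.reveal(data, 3)                  # reveal phase
print(ok, chain.balance[contract.address])
```

Because A depends on Contract's own address and on Forwarder's initcode, only Contract can later place code at A, and that code can only pay Contract.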

While EIP-86 offers the best vehicle for implementation of submarine sends, as noted above, it is nonetheless possible to implement them in Ethereum today. For details, see the blog appendix (“Submarine sends, today”.)

The astute reader might notice that an important step is still missing in the above scheme: Contract does not check that the transaction that sent $val to A actually happened during the commit phase! This is crucial, as otherwise we face a frontrunning issue again: Upon seeing a reveal of a submarine send, a miner could compute A, observe A.balance, and include his own commit and reveal transactions after the commitment phase has already ended.

So how should we go about verifying the timeliness of the transaction to A? The conceptually simplest solution is to verify this off-chain, e.g. using Etherscan, but that sort of defeats the purpose of smart contracts, right?

A more technically involved solution would be for P to reveal, in addition to Data, a proof that a transaction of $val to A was made during the commit phase. This proof can then be verified by Contract. The proof would include a block number, all the transaction’s attributes (transaction index, nonce, to, gasprice, value, etc), a Merkle-Patricia proof of inclusion in the transaction trie, as well as a block header, and convince Contract that (1) that block was mined during the commit phase; and (2) that the relevant transaction is included in that block. However, generating and verifying such a proof for every submarine send is cumbersome and expensive.

The solution we propose follows an alternative “optimistic” approach: Instead of having every bidder prove to Contract that they followed the protocol correctly, we will allow parties to prove to Contract that some bidder cheated. As the cheating party will forfeit its bid as a result, this solution should incentivize correct behavior of the bidding parties. Bidders still reveal all transaction attributes and the corresponding block header to Contract but they do not include a proof (this is the expensive part after all). After the reveal phase is over, Contract enters a short verification phase in which parties may submit proofs that some other party’s reveal is incorrect. Analogously to the above proof of correct behavior, a proof of cheat consists in showing that the transaction attributes and block header revealed by a bidder are incorrect: i.e., that the given block is not in the blockchain, or that it does not contain the purported transaction. [5] If the proof verifies, Contract would collect the bid from the Forwarder contract, and send it to the loyal watchdog as reward. As verifying other bidders’ reveals off-chain is very simple, honest bidders have a strong incentive to check for and report a cheating party that outbid them (as long as honest bids are larger than the gas cost of proving wrongdoing).
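The incentive structure of this optimistic approach can be sketched in a few lines of Python. The model is deliberately simplified and the names (`Claim`, `OptimisticAuction`) are hypothetical: the "blockchain" is a dictionary the contract can read directly, whereas a real Contract would need the challenger to supply a Merkle-Patricia proof against a block header.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    bidder: str
    block_number: int
    tx_id: str
    amount: int


class OptimisticAuction:
    def __init__(self, chain, commit_start, commit_end):
        self.chain = chain                    # toy chain: block number -> set of tx ids
        self.window = range(commit_start, commit_end + 1)
        self.claims = {}
        self.rewards = {}

    def reveal(self, claim: Claim):
        # Accepted without proof; checking is deferred to challengers.
        self.claims[claim.bidder] = claim

    def challenge(self, watchdog: str, bidder: str) -> bool:
        claim = self.claims.get(bidder)
        if claim is None:
            return False
        in_window = claim.block_number in self.window
        included = claim.tx_id in self.chain.get(claim.block_number, set())
        if in_window and included:
            return False                      # claim was honest; challenge fails
        del self.claims[bidder]               # cheater forfeits the bid ...
        self.rewards[watchdog] = self.rewards.get(watchdog, 0) + claim.amount
        return True                           # ... which rewards the watchdog


chain = {100: {"tx-a"}, 101: {"tx-b"}, 105: {"tx-late"}}
auction = OptimisticAuction(chain, commit_start=100, commit_end=102)
auction.reveal(Claim("alice", 100, "tx-a", 10))
auction.reveal(Claim("mallory", 105, "tx-late", 12))   # committed after the window

print(auction.challenge("alice", "mallory"), auction.challenge("mallory", "alice"))
```

In the honest case nobody calls `challenge`, so the expensive verification path is never exercised.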

We believe this solution offers a good compromise: Although Contract still has to implement the rather tedious procedure of verifying the (non-) inclusion of a transaction in Ethereum’s blockchain, this procedure should only ever be called if a party actually reveals an incorrect bid. In such a case, the misbehaving party’s bid is forfeited and used to offload the gas costs of the party that revealed the wrongdoing.

Exactly what constitutes a cover transaction?
If we were to make a submarine send today, how hard would it be for a frontrunning miner to detect it? The answer to this question crucially depends on the size of the anonymity set we are trying to hide our submarine in. As a reminder, this anonymity set comprises other “normal” transactions in the Ethereum network that look the same as a submarine send.

Suppose our commit window spans blocks Bstart to Bend. During that window, a frontrunner will be looking for transactions to some address A that satisfy the following properties:

Address A had never been observed in the network prior to this commit window.
Address A is not involved in any other transactions (internal or external) during the commit window. [6]
The transaction to address A carries a nonzero amount of ether.

Such addresses A, which we refer to as “fresh addresses”, are candidates for being the destination of a submarine send and thus make up our anonymity set. Note that it is irrelevant whether these fresh addresses are involved in further transactions after the end of the commit window (i.e., in blocks after Bend). Indeed, although this could reveal that some addresses were actually not used as part of a submarine send (thus effectively reducing the anonymity set), at that point in time the danger of a frontrun is moot as the commit phase has passed.
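The three properties can be expressed as a simple filter over the transactions in a commit window. The sketch below is only an illustration: the function name, the tuple layout, and the idea of passing in a precomputed set of previously seen addresses are assumptions, not a description of the tooling used for the analysis that follows.

```python
from collections import Counter


def fresh_addresses(prior_addresses, window_txs):
    """Return destination addresses that qualify as "fresh".

    prior_addresses: set of addresses observed before the commit window.
    window_txs: list of (sender, recipient, value_in_wei) tuples covering all
                transactions (internal and external) inside the window.
    """
    # Count how often each address appears as sender or recipient in the window.
    appearances = Counter()
    for sender, recipient, _ in window_txs:
        appearances[sender] += 1
        appearances[recipient] += 1

    fresh = set()
    for _, recipient, value in window_txs:
        if (recipient not in prior_addresses      # never observed before
                and appearances[recipient] == 1   # no other transactions in window
                and value > 0):                   # carries a nonzero amount of ether
            fresh.add(recipient)
    return fresh
```

The size of the returned set is the anonymity-set size for that window.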

The choice of the commit window (the number of blocks from Bstart to Bend) should thus be primarily governed by the size of the anonymity set one might expect over that period.

Empirical analysis: Anonymity-set size
We performed a simple experiment to determine the effective size of the anonymity set for submarine sends. We selected a commit window of roughly 30 minutes between blocks 4007900 and 4008000. In that period, we recorded 7109 transactions (excluding internal transactions), of which 661 satisfy the above properties and thus make up an anonymity set for submarine sends.

As an example, consider this address, which received 5 ETH in block 4007963 and has not been involved in any other transaction since. [7]

The graph below shows how the size of the anonymity set grows throughout the commit phase.

Size of anonymity set over time

If we consider a larger commit window of a little over two hours, between blocks 4007600 and 4008000, we find that the anonymity set for submarine sends consists of 1534 “fresh” addresses.

Transaction shaping
Of course, a frontrunning miner might try to use information other than the freshness of the destination address to de-anonymize a submarine send. In particular, perhaps the miner knows that the transaction she wants to frontrun will contain a certain amount of Ether. In that case, it is important that the submarine send “blends in” with other transactions of similar value. Thus, effective cover transactions must also contain transaction amounts that are statistically similar to that in the submarine send.

Fortunately, we found that the transactions that make up the anonymity set span a diverse range of Ether amounts, with the majority of values between 0.01 and 100 ETH, and an average transaction value of about 6 ETH. Below we show the distribution of Ether values for the transactions to fresh addresses between blocks 4007900 and 4008000.

Transaction value histogram, blocks 4007900 to 4008000

Furthermore, there are a few ways to perform what we call transaction shaping in order to help prevent transaction amounts from revealing submarine sends:

Refunds: Contracts can refund full or partial submarine send amounts, providing limited further cover similar to the ENS amount hiding mechanism.
Flexible send amounts: In some cases, a sender has some flexibility in the amount of money she transmits in a submarine send. Refunds create such flexibility, but it can exist in other settings. For example, users may be willing to randomly vary their purchase amounts in AshContract to help conceal their transactions.
Fragmentation: Senders can split submarine send initial deposits into multiple transactions (potentially from a range of addresses), providing further noise for heuristics attempting to detect these sends. Of course, given the lack of transaction-graph confidentiality in Ethereum, there is a risk that multiple transactions can be traced to the same user. This risk can be mitigated by means of mixing.
Synthetic cover traffic: Senders can create their own cover traffic at minimal cost by sending money to fresh addresses they control.

Conclusion
Frontrunning and related problems are pervasive, with no good available remedies in Ethereum. Submarine sends, while imperfect, offer a powerful and practical solution. Fortunately, there is already a fairly high volume of low-to-medium-value cover transactions in Ethereum today, and we can expect it to grow as the system evolves. Our proof-of-concept works on the Ethereum network today, though it costs more gas than our ideal scheme. For submarine sends to be truly practical, all we await is the adoption of EIP-86.

Appendix
Submarine sends, today
If you don’t want to wait until EIP-86 is adopted, we have good news for you. It is already possible to use Submarine Sends in Ethereum, although it is more complex and costly than in the future EIP-86-Ethereum. To rid ourselves of the dependency on EIP-86, we need another way to construct fresh addresses that are later inhabited by a contract. Since we cannot use CREATE2, we will have to make do with CREATE.

Whenever a contract is created with CREATE, its address is computed as H(addrCreator, nonceCreator), where addrCreator is the address of the contract’s creator and nonceCreator counts how many contracts have been created by the contract’s creator. [8] Hence, the creator has some control over nonceCreator: to increment nonceCreator by one, the creator only has to spawn a new contract. Of course, incrementing nonceCreator until it encodes the output of a cryptographic hash function (say 80 bits [9]) is completely out of the question. But we can encode this value in a series of nonces, making the scheme somewhat practical.

As in the EIP-86 example, we also need a Forwarder contract. However, Forwarder isn’t quite as simple as in the EIP-86 example; beyond forwarding funds to Contract, it now also must be able to clone itself, i.e. to CREATE a new instance of itself.

Once again, we have a commit and a reveal phase:

Commit: P computes a commitment H’(Data), where Data contains the required information about P’s bid and a fresh nonce. P splits H’(Data) into k distinct 2-bit chunks [10] H’(Data)[1], H’(Data)[2], ..., H’(Data)[k]. The value of each two-bit chunk is interpreted as an integer in [1..4]. P then computes the address A := H(H(... H(Contract's address || H’(Data)[1]) ... || H’(Data)[k-1]) || H’(Data)[k]) and sends $val to A. As before, A is a fresh address which has never been observed on the network.
Reveal: Upon P’s revelation of Data, Contract verifies the freshness of the nonce contained in Data, computes A as described above, and checks that A.balance == $val. Contract then orchestrates a chain of contract creations by repeatedly calling clone() on various Forwarder instances until a Forwarder instance appears at address A: For each H’(Data)[i] (1 ≤ i ≤ k), Contract calls the clone() function of the Forwarder instance at address H(... H(Contract's address || H’(Data)[1]) ... || H’(Data)[i-1]) at least H’(Data)[i] times. At the end of this process, a Forwarder instance will have been instantiated at address A and Contract then withdraws $val from that instance.
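The address derivation used in both phases can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions: `h` uses SHA3-256 from `hashlib` as a stand-in for H, whereas Ethereum hashes an RLP encoding with Keccak-256, so the outputs are not real Ethereum addresses; the function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in for H: SHA3-256 truncated to 20 bytes. Ethereum actually uses
    # Keccak-256 over an RLP encoding, so these are NOT real addresses.
    return hashlib.sha3_256(data).digest()[-20:]


def chunks_2bit(commitment: bytes, bits: int = 80):
    """Split the first `bits` bits of the commitment into 2-bit chunks in [1..4]."""
    as_int = int.from_bytes(commitment, "big") >> (len(commitment) * 8 - bits)
    values = []
    for shift in range(bits - 2, -1, -2):
        values.append(((as_int >> shift) & 0b11) + 1)  # map 0..3 to 1..4
    return values


def derive_address(contract_address: bytes, commitment: bytes, bits: int = 80):
    """Return (A, number of clone() calls needed to reach A at reveal time)."""
    address, clones = contract_address, 0
    for value in chunks_2bit(commitment, bits):
        address = h(address + bytes([value]))  # child identified by nonce `value`
        clones += value                        # at least `value` clone() calls
    return address, clones
```

P runs `derive_address` once during the commit phase to learn where to send $val; during the reveal phase, the second return value indicates how many contract creations are needed before a Forwarder exists at A.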

We have created a prototype implementation using 80 bits for H’(Data) in which creating a series of contracts and withdrawing the funds costs ~5 million gas. [11] We believe that optimising this implementation would bring the gas cost down by roughly 50%. At a currently feasible gas price of 4 gigawei this corresponds to 0.02 ETH or 3.65 USD.
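The numbers in this paragraph and the choice of 2-bit chunks can be checked with a short calculation. The sketch below assumes a simplified cost model that counts only contract creations and treats chunk values as uniformly random; the actual analysis may account for other costs, and both function names are hypothetical.

```python
from fractions import Fraction


def expected_creations(hash_bits: int, chunk_bits: int) -> Fraction:
    """Expected number of contract creations for uniformly random chunks.

    Each chunk is an integer in [1..2**chunk_bits], so its mean is
    (2**chunk_bits + 1) / 2, and there are hash_bits / chunk_bits chunks.
    Fractional chunk counts are kept so that widths compare on equal footing.
    """
    chunks = Fraction(hash_bits, chunk_bits)
    mean_per_chunk = Fraction(2 ** chunk_bits + 1, 2)
    return chunks * mean_per_chunk


def gas_cost_eth(gas: int, gas_price_gwei: int) -> Fraction:
    """Convert a gas amount at a gas price in Gwei to ETH (1 ETH = 10**9 Gwei)."""
    return Fraction(gas * gas_price_gwei, 10 ** 9)


for width in (1, 2, 3, 4):
    print(width, float(expected_creations(80, width)))
print(float(gas_cost_eth(5_000_000, 4)))  # the prototype's ~5 million gas at 4 Gwei
```

Under this model an 80-bit commitment needs 100 creations on average with 2-bit chunks, against 120 with 1-bit or 3-bit chunks and 170 with 4-bit chunks, and 5 million gas at 4 Gwei comes to 0.02 ETH.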

Block stuffing attacks
Of course, our scheme has some limitations. Attacks still exist where an adversary can spam a large volume of high-fee transactions in an attempt to stop players from either committing or revealing, especially late in these periods. The longer the period, the more costly this becomes to the adversary. One mitigation strategy is to make the period long enough that the cost required to fill blocks exceeds the value estimated to be at stake.
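A back-of-the-envelope version of this mitigation is sketched below. It assumes the adversary must fill every block of the period at a fixed gas price, which is a simplification; the function names and the numbers in the usage example are purely hypothetical.

```python
GWEI = 10 ** 9
ETH = 10 ** 18


def stuffing_cost_wei(blocks: int, block_gas_limit: int, gas_price_wei: int) -> int:
    """Cost of filling `blocks` consecutive blocks at the given gas price."""
    return blocks * block_gas_limit * gas_price_wei


def min_period_blocks(value_at_stake_wei: int, block_gas_limit: int,
                      gas_price_wei: int) -> int:
    """Smallest period (in blocks) whose stuffing cost exceeds the value at stake."""
    cost_per_block = stuffing_cost_wei(1, block_gas_limit, gas_price_wei)
    return value_at_stake_wei // cost_per_block + 1


# Hypothetical parameters: 6M block gas limit, adversary pays 50 Gwei, 30 ETH at stake.
blocks = min_period_blocks(30 * ETH, 6_000_000, 50 * GWEI)
print(blocks, stuffing_cost_wei(blocks, 6_000_000, 50 * GWEI) / ETH)
```

Integer wei arithmetic is used throughout to avoid floating-point rounding at the boundary where the cost equals the value at stake.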

Unfortunately, an inherent tradeoff between latency and fairness may always exist in systems aiming to provide fairness guarantees by tackling frontrunning, as smaller periods of hiding information from miners leave users more vulnerable to reorganizations and consider the input of fewer miners.

    [1]    https://blog.ethereum.org/2017/01/19/update-integrating-zcash-ethereum/
    [2]    There are other approaches to addressing front-running that don’t involve concealment of transaction amounts:

(A) Players could commit small, uniformly priced deposits that they forfeit if they fail to pay. But they may still not be incentivized to pay in full.

(B) Players can break up their bids across multiple transactions and reveal their belonging to the same bid in the reveal phase. But a lack of transaction-graph confidentiality means that even bids from multiple addresses might be tied to the same user. We propose this approach as an added layer of security for submarine sends, but its limitations need to be made clear.

    [3]    Indeed, in the Random Oracle Model, the ability to identify Trans as a submarine send would imply the ability to perform signature forgery.
    [4]    This example glosses over the details of how party P convinces Contract that the burn of $val actually occurred during the commit phase and not after. We’ll come back to this important issue in the next section.
    [5]    As the Transaction Trie is sorted, showing that a transaction is not in it simply reduces to showing that either there is no transaction with the given index in the trie or that the transaction at the given transaction index has different attributes than the ones given during the reveal.
    [6]    If the address only receives transactions, it could technically still be a candidate for a submarine send as we could always split up a submarine send into multiple transactions to the same destination address.
    [7]    At the time of writing, the most recent block number is 4036411.
    [8]    This isn’t the whole story: If the creator is a contract, nonceCreator counts how many contracts have been created by the creator. Otherwise, nonceCreator counts how many transactions have been initiated by the creator. Furthermore, the inputs to the hash function are rlp-encoded before hashing.
    [9]    Since we don’t require our hash function to be resistant to collision attacks, we don’t have to worry about the birthday paradox and can get away with using a shorter hash. 80 bits should still provide ample defense against pre-image attacks, especially in light of the fact that an attacker only has a rather short amount of time in which finding a pre-image benefits her.
    [10]    Somewhat surprisingly, a theoretical analysis reveals that out of all fixed-width chunking schemes, 2-bit chunking is the cheapest on average. For example, using 2-bit chunks is 20% more efficient than using 1-bit chunks.
    [11]    Since this comes close to the block gas limit, we split the withdrawal process into multiple transactions. Note that it’s relatively cheap to compute A; the high gas cost is caused by the creation of the hundreds of contracts needed for withdrawing funds from A.

The Cost of Decentralization in 0x and EtherDelta

Hacking Distributed

Sun, 13 Aug 2017 06:45:00 -0700

#^The Cost of Decentralization in 0x and EtherDelta

This Tuesday, a decentralized exchange platform called 0x will be holding a $24 million token sale. In the midst of an ICO frenzy that has funneled $1.3 billion into fledgling projects, the 0x sale seems like a drop in the bucket. 0x, though, is at the forefront of a wave of decentralized exchange projects that also includes EtherDelta, BitShares, KyberNetwork, and Omega One.

So what are decentralized exchanges, and why does the world need them?

A centralized cryptocurrency exchange is administered by an entity that represents a single point of failure. Users deposit their funds directly with the exchange, and the exchange assumes responsibility for matching buy and sell orders, which it can do in real time. Many exchanges support the purchase and sale of cryptocurrencies, fiat currencies, and cryptocurrency tokens. Examples of centralized exchanges include Coinbase, Kraken, and ShapeShift—as well as MtGox and Bitfinex. These last two were famously compromised, resulting in the loss of roughly 650,000 BTC and 120,000 BTC respectively, hundreds of millions of dollars at the time. (ShapeShift too was hacked, but a mere $200k or so worth of funds were stolen.)

Therein lies the rub. Centralized exchanges have the serious drawback that they require users to trust the exchange with their money. A fraudulent or compromised exchange can result in theft of users’ funds. There are also other drawbacks to centralized exchanges, such as the vulnerability of the users to frontrunning by the exchange administrator, as we discuss below.

Decentralized exchanges were created specifically to address centralized exchanges’ vulnerability to theft of user funds. In a decentralized exchange, users retain a degree of control of their own funds. They do not send money directly to a wallet that is controlled by a single entity. Instead, trading orders, and thus the release of user funds, are authorized directly by users via digital signatures. In principle, therefore, user funds cannot be stolen.

“In principle” is the operative phrase here. As we show, the ability to sign off on transactions does not equate with real control. Enabling users to control their own funds seems a good thing, but it has the side effect of abandoning the real-time nature of centralized exchanges in favor of slow, on-chain trading. That in turn exposes users of decentralized exchanges to new risks of monetary loss. Consistent with 0x’s apparent design philosophy of community empowerment, there is a further dimension of decentralization in 0x’s governance scheme, which permits token holders to upgrade 0x contracts. This feature unfortunately creates a systemic risk such that users’ funds could again be potentially exposed to theft, resulting in some degree of the worst of both worlds.

Any complex new system is likely to suffer from design flaws. Decentralized exchanges have some definite advantages over their centralized counterparts, but also some distinct drawbacks. The design landscape is complex and it is essential to evaluate the merits of individual systems and the risks they pose to users of being cheated or exploited. Some of the flaws of 0x are inherent to decentralized exchanges, while others aren’t, as we now explain.

0x closely resembles EtherDelta, an existing exchange. We therefore compare 0x with EtherDelta to better understand the design choices and likely risks in 0x exchanges.

In this blog post, we present a brief overview of the general design behind today’s decentralized exchanges. Then we discuss three categories of decentralized-exchange design flaws:

General flaws affecting all decentralized exchanges, at least within the design space currently explored by the community.
0x-specific flaws introduced by particulars of the 0x architecture.
EtherDelta-specific flaws that are minor and don’t affect 0x, but teach us something about the challenges of correctly implementing exchanges.

Finally, we describe experiments we have performed in EtherDelta. These experiments highlight that the risks we describe are real and already emerging in a decentralized exchange with fairly low volume. We conclude with a brief discussion of the decentralized-exchange design space as a whole.

Decentralized exchanges: Brief overview
To support real-time trades, decentralized exchanges such as EtherDelta and 0x adopt essentially the same architecture for public (“broadcast”) orders. Users send their orders to off-chain matching services (called “Relayers” in 0x), which post them in off-chain order books. Such posting takes place in real time—much faster than if orders were posted on a decentralized blockchain.

Any user (called a “Maker” in 0x) can publish any buy or sell request (that this user signs) on the order book of the off-chain service. An order is accompanied by an exact price, i.e., limit orders aren’t supported.

EtherDelta and 0x both attempt to minimize trust in the off-chain matching service by not giving it the power to perform automatic matching of the buy and sell orders. Instead, any other user (called a “Taker” in 0x) trades against a posted order by adding a counterorder and digitally signing and sending the complete transaction, i.e., pair of orders, directly to the blockchain. The transaction is sent to a smart contract that executes the transaction, transferring assets between the buyer and seller (Maker and Taker in 0x). The lifecycle of a transaction is shown below in Figure 1.

EtherDelta and 0x both adopt this general design, but differ in two key ways. 0x allows anyone to stand up an exchange (“Relayer” in 0x), while EtherDelta has a unique one. Additionally, 0x, unlike EtherDelta, has a system of tokens used to pay transaction fees and for system governance.

Figure 1: Lifecycle of a transaction, using 0x terminology. (1) The Maker sends a digitally signed broadcast order (e.g., “Sell 100 TOK for 1 ETH”) to the Relayer, who (2) posts the order to the order book. The Taker finds the order in the order book, digitally signs it together with a counterorder (“Buy 100 TOK for 1 ETH”), and sends it to the DEX (“Decentralized EXchange”) smart contract.

General flaws in decentralized exchanges
The general design just described introduces several basic vulnerabilities:

Exposure to arbitrage: The lack of automatic matching permits in-market arbitrage, whereby stale orders are filled to the disadvantage of users unable to quickly cancel their orders in response to market fluctuations. For example, the arbitrageur can execute against a standing pair of orders (sell 1 TOK at 1 ETH and buy 1 TOK at 2 ETH) to make an immediate profit of 1 ETH. Since the only way for users to invalidate their signed orders (that they published on the off-chain service) is by sending an on-chain cancellation transaction that is explicitly processed by the exchange contract, the arbitrageur may pay a high gas fee to miners and win the race against the cancellation transaction. Therefore, users who wish to increase the probability of a successful cancellation may need to attach an excessively high fee that depends on the value of the trade, which makes the exchange platform unattractive to honest users. We show below that this problem isn’t theoretical, but already arises in practice.

Vulnerability to miner frontrunning: Order cancellations are a common feature of decentralized exchanges (after all, an exchange with no cancellation ability may not be useful in a volatile market), and their on-chain nature renders these cancellations particularly vulnerable to miner frontrunning; the miner of the next block will always have the option to execute cancelled orders with themselves as the counterparty, potentially profiting from such an order. To add injury to insult, the miner even collects gas costs from a user’s failed cancellation. This issue was noted in the Consensys 0x report, and is recognized as a limitation of on-chain cancellations in the community.

Exposure to exchange abuses: Since the off-chain matching service doesn’t perform automatic matching, it is supposed to publish all users’ orders as quickly as possible, resulting in principle in fully transparent behavior. In actual fact, though, the exchange can suppress orders, mounting a denial-of-service attack against users in order to corner a market or censor particular users’ transactions. Worse yet, it can front-run orders. Specifically, it can engage in the same kind of in-market “sandwich” arbitrage described above, especially when high-value trades are requested. The problem is that signed orders flow to the off-chain server first. The server can thus match the trade data with pseudonymous users that it controls. Both suppression and front-running by an exchange are extremely hard to detect.
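To make the in-market arbitrage concrete, the following sketch scans a set of standing orders for crossed pairs, i.e., a sell order priced below a buy order. The `Order` type and function name are hypothetical, and fees and gas costs are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    side: str     # "sell" or "buy", from the Maker's point of view
    amount: int   # quantity of TOK
    price: float  # price in ETH per TOK


def arbitrage_opportunities(orders):
    """Yield (sell, buy, profit_in_eth) for every crossed pair of standing orders."""
    sells = [o for o in orders if o.side == "sell"]
    buys = [o for o in orders if o.side == "buy"]
    for s in sells:
        for b in buys:
            if b.price > s.price:  # someone bids more than someone else asks
                qty = min(s.amount, b.amount)
                yield s, b, qty * (b.price - s.price)


# The example from the text: sell 1 TOK at 1 ETH, buy 1 TOK at 2 ETH.
book = [Order("sell", 1, 1.0), Order("buy", 1, 2.0), Order("buy", 1, 0.5)]
for sell, buy, profit in arbitrage_opportunities(book):
    print(sell, buy, profit)
```

An arbitrageur who fills both orders of a crossed pair before either Maker's cancellation is mined keeps the price difference.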

0x-specific flaws
Decentralized governance: In just two days, 0x will launch the Initial Coin Offering (ICO) of their ZRX token. The token will serve two functions: First, it will allow market participants to pay a Relayer’s fees for listing orders. Second, the token will be used for "decentralized governance" over the evolution of the protocol and the DEX contract holding market participants' assets.

Why a dedicated token should be used for Relayer fees is unclear—after all one could simply pay Relayers in ETH instead. The use of a token for decentralized governance is a more interesting use case.

Unfortunately, the 0x whitepaper does not provide any detailed information on how this governance process will work. Neither does the code in 0x's github repository. Since the governance process appears to be the only good reason for creating the ZRX token, this is all the more disappointing.

The 0x whitepaper does, however, state that non-disruptive protocol updates (i.e. changing the protocol and underlying smart contracts without requiring opt-in of individual users) are an explicit design goal of the governance scheme. This immediately raises questions about the security properties of the governance process. If, for instance, 0x were to use a simple majority voting scheme to approve updates to the DEX (which holds all user assets), an attacker could perform a 51% attack where she buys more than half of all ZRX tokens and then votes to replace the DEX with a malicious contract sending all assets to herself. Designing a secure, decentralized governance process will be difficult and involve a multitude of delicate tradeoffs. Once again, decentralization is no panacea and carries a price in terms of complexity and possibly weakened security!

Side deals: 0x's design allows for two kinds of orders, broadcast orders and point-to-point orders. Broadcast orders make use of a Relayer, who broadcasts the Maker's order to any listening Takers who can choose to fill the order by sending a signed message to the DEX. (Figure 1 shows the lifecycle of a broadcast order.) Relayers can charge fees as a reward for their broadcasting services: When an order is filled, the DEX will transfer any fees from the Taker and Maker's accounts to the Relayer's account.

In contrast, point-to-point orders do not make use of Relayers and thus avoid Relayer fees. As their name suggests, point-to-point orders allow two market participants to trade directly with each other by sending signed messages to the DEX.

This leads to a scheme that allows Makers and Takers to evade Relayer fees:

The Maker M posts an order O on a Relayer service.
A Taker T listening to the Relayer learns of the order and decides she wants to fill it. She contacts the Maker off-chain [1] with a point-to-point order O' corresponding to O: If O offers to sell 1 TOK for 1 ETH, O' will offer to buy 1 TOK from M for 1 ETH.
Once M receives the order, she sends a single Ethereum transaction to the blockchain. This transaction calls the DEX twice, first cancelling O and then immediately filling O'. Note that thanks to the atomicity of Ethereum transactions, M carries no risk of both O and O' being filled.

Since cancellations are free and O is never filled, the Relayer will not earn any fees. Systematic exploitation of this flaw could lead to a tragedy of the commons, where individual market participants would make it uneconomical to run a Relayer by always evading fees, thereby destroying the "common good" of Relayers.
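The fee-evasion scheme can be simulated with a toy model of the DEX. The class below is not 0x's contract interface: `ToyDex`, its methods, and the rule that cancelling an already filled order fails are all assumptions made for illustration, and `atomic` merely imitates the all-or-nothing behavior of an Ethereum transaction.

```python
class ToyDex:
    """Toy stand-in for the DEX contract; not 0x's actual interface."""

    def __init__(self):
        self.cancelled = set()
        self.filled = set()
        self.relayer_fees = 0

    def cancel(self, order_id):
        # Assumption: cancelling an order that was already filled fails.
        if order_id in self.filled:
            raise ValueError("order was already filled")
        self.cancelled.add(order_id)

    def fill(self, order_id, relayer_fee=0):
        if order_id in self.cancelled or order_id in self.filled:
            raise ValueError("order is no longer fillable")
        self.filled.add(order_id)
        self.relayer_fees += relayer_fee  # fees are only paid when an order fills

    def atomic(self, *calls):
        """Run several calls as one transaction: all succeed or none take effect."""
        snapshot = (set(self.cancelled), set(self.filled), self.relayer_fees)
        try:
            for call in calls:
                call()
        except Exception:
            self.cancelled, self.filled, self.relayer_fees = snapshot
            raise


dex = ToyDex()
# M cancels the broadcast order O and fills the point-to-point order O' at once.
dex.atomic(lambda: dex.cancel("O"), lambda: dex.fill("O_prime"))
print(dex.filled, dex.relayer_fees)
```

If another Taker manages to fill O first, the cancellation fails in this model and the whole transaction is rolled back, so O' is not filled as well.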

Side deals are less of a problem in EtherDelta because EtherDelta doesn’t support point-to-point orders and the fees for broadcast orders are hardcoded in the smart contract. However, 0x’s approach is not without advantages: Having a single DEX hold the assets for all point-to-point and broadcast orders and allowing multiple Relayers will likely lead to a more liquid market; furthermore, competition among Relayers may lower the fees that users pay. This tradeoff cannot be easily overcome: EtherDelta could in principle support multiple Relayers by using a separate contract for each of them, but this approach would not allow the liquidity in the order books of the Relayers to be shared.

Maker griefing: Maker griefing is an attack recognized in audits of 0x whereby an order maker moves tokens that are supposed to be involved in an order, causing it to fail in the final on-chain processing stages responsible for moving the funds. If such a failure occurs, the on-chain taker must pay gas fees to attempt execution of an order that never completes or provides any benefit to the taker, an inefficient use of taker time and money.

Repeating this attack on a large scale could potentially waste Taker gas, making Takers incur high order fees, costs, and delays in addition to the transparent Taker fees charged by the exchange. A cartel could place both legitimate and illegitimate orders, sharing information with each other out of band about which orders were legitimate. This would force outsiders to incur penalties, a potentially profitable strategy for a sufficiently powerful cartel. The recommended tooling mitigations do not entirely solve the issue, as they rely on checks of blockchain state, which could potentially change immediately before or even during the release of a new block. The potential for miner involvement as Makers in permissionless distributed markets or miner collusion with these cartels further amplifies these attacks, as miners could both collect the profit from Takers burning gas in griefing attacks and trigger the attacks in previously unseen transactions inserted in blocks before order fulfillment. Such a miner attack would allow no possibility of detection for the recommended tool-based mitigations.

The technological and economic barriers to these attacks or the formation of potential cartels mean these strategies may not surface until decentralized exchanges achieve substantial volume (and thus allow for substantial profit), dangerously providing a false sense of security and a false confidence in on-chain market architectures.

Similarity to EtherDelta: As described in the design of both exchanges, EtherDelta and 0x share a number of similarities. As EtherDelta is a full system that is currently operational and includes a number of in-production smart contracts, it is unclear where a potentially large ICO raise could be allocated beyond the development of equivalent technologies and distributed governance. The 0x roadmap provides some hints of potential strategies for differentiation, and we hope to see the development of advanced and novel R&D improvements accompany the deployment of proposed EtherDelta-like infrastructure.

EtherDelta-specific flaws
Slow cancellation: Despite its processing of cancellations with low observed latency after a cancellation order is mined on-chain, the requirement of waiting until the next mined block (or later with potentially full blocks) imposes a significant barrier to real-time exchange, potentially locking up user funds and enabling profitable miner arbitrage on larger orders through frontrunning.

Slow order processing: During the posting of our test orders on EtherDelta in the experiments we conduct below, we observed a substantial delay required to place orders in the system. This operation is not contingent on any on-chain transaction processing, and it is not clear to us why the system is imposing such a delay.

High gas costs for competing transactions: Due to the high latency of EtherDelta’s order book, some Takers may be blind to each others’ orders. This can cause a race condition where multiple Takers compete to fulfill a single order, leading to order failure with some delay and gas costs incurred by all but the winning Takers. This could potentially become problematic on attractive orders as the system increases in size, specifically with the possibility of miner participation and frontrunning.

Experiments with EtherDelta
We experimentally verified some of the flaws listed above. First, we evaluated the total theoretical arbitrage opportunity of 10 major ERC-20 Token/ETH exchanges (BQX, CDT, CVC, EOS, PAY, PLR, PPT, VERI, XRL exchange with ETH) on EtherDelta over a period of 24 hours (from 2017-8-11 00:30:00 UTC to 2017-8-12 00:30:00 UTC). Our experiments show that—assuming perfect execution, the current EtherDelta transaction fee of 0.3% of total volume, and a gas price of 20 Gwei—an arbitrageur could have gained 14.71785 ETH in the above period. At current exchange rates ($303.9 / ETH), this corresponds to $4,472.75 / day or $1.6+ million / year. Specifically, we observed 45 arbitrage opportunities during our period of study, with an average arbitrage opportunity of 0.32 ETH. Compared with centralized exchanges, EtherDelta has relatively small volume today. Were its volume to grow substantially without technical changes, arbitrage opportunities would presumably grow to be quite substantial.

Many of these arbitrage opportunities resulted from clear user error, i.e., obvious mispricing of orders. Even without such error, however, we identified approximately 6.7 ETH in arbitrage opportunities over the 24-hour period of study, corresponding to about $2,036.

Having confirmed that arbitrage opportunities exist in practice, we demonstrated experimentally that they are exploitable in practice. We designed a simple trading bot that would monitor EtherDelta's order books for arbitrage opportunities and send orders exploiting them to the blockchain. Our preliminary numbers suggest that out of 12 orders sent, 7 (or 58.3%) succeeded. The cause of failed arbitrage attempts was a failure to post our order fast enough, i.e., someone else hit the order or the maker cancelled the order before our order executed. (We did not seek to exploit all 45 observed arbitrage opportunities because this would have required roughly 80 ETH worth of tokens, which is more than we poor academic researchers possess.)

Assuming that the success rate we observed is representative, we can approximately estimate the expected net daily profit available to arbitrageurs according to the following formula:

Expected profit = (Net profit per transaction * Probability of success - Gas per transaction * (1 - Probability of success)) * Total number of opportunities

Our experiments during the period of study suggest the estimate:

Expected profit = 0.32 ETH * 58.3% * 45 - 0.004 ETH * (1 - 58.3%) * 45 = 8.32 ETH

Or about $2,500 per day. While this is not enormous, particularly if competition arises among arbitrageurs, an increase in popularity and therefore volume on decentralized exchanges would of course result in far larger potential profits.
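The estimate can be reproduced directly from the formula. In the sketch below, the function and variable names are hypothetical; the inputs are the figures quoted in the text (0.32 ETH average opportunity, 0.004 ETH gas per transaction, a 58.3% success rate, 45 opportunities, and $303.9 per ETH).

```python
def expected_profit(net_profit_per_tx, gas_per_tx, p_success, opportunities):
    """Expected profit in ETH, following the formula in the text."""
    per_opportunity = net_profit_per_tx * p_success - gas_per_tx * (1 - p_success)
    return per_opportunity * opportunities


profit_eth = expected_profit(0.32, 0.004, 0.583, 45)
profit_usd = profit_eth * 303.9  # exchange rate quoted earlier in the post
print(f"{profit_eth:.2f} ETH, about ${profit_usd:,.0f} per day")
```

Varying `p_success` shows how sensitive the estimate is to competition: gas lost on failed attempts is small next to the profit forgone when another arbitrageur wins the race.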

The decentralized-exchange design space
There is a large design space for decentralized exchanges beyond the EtherDelta/0x variety. For example, an alternative is to allow the off-chain matching service to perform automatic matching, and even require each trade to include a signature by the off-chain service. To avoid loss of funds in the case that the off-chain service disappears, users would still be able to withdraw their assets without approval of the off-chain service, but only via a slow process. This design eliminates two of the major problems that we described above: The arbitrageur won’t be able to race against users who have already cancelled their orders, and miners won’t be able to front-run users. Unfortunately, the off-chain matching service will have the same kind of power to perform in-market arbitrage as centralized exchanges. Unlike centralized exchanges, though, it won’t have the power to steal users’ deposits.

Another reason to avoid automatic matching by the off-chain service, however, is that doing without it keeps the code of the exchange contract simpler. Supporting automatic matching would naturally mean supporting functionality, such as limit orders, that doesn’t exist in EtherDelta / 0x. (Given the simplicity of the code base for which 0x are raising $24 million, code supporting automatic matching would presumably be worth several hundred million or so...)
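To make the co-signed design discussed above more concrete, here is a toy Python model of it. It is a sketch under loud assumptions, not the design of any real exchange: the class, method names and the 100-block delay are all hypothetical, a trade moves a single asset rather than swapping two, and an HMAC tag stands in for the matching service's signature (a real contract would verify a public-key signature and would never hold the signing key).

```python
import hashlib
import hmac

WITHDRAWAL_DELAY = 100  # hypothetical delay, in blocks


def matcher_approve(key, maker, taker, amount):
    """Stand-in for the off-chain service signing a matched trade."""
    message = f"{maker}:{taker}:{amount}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()


class CoSignedExchange:
    """Toy exchange contract: trades need the matcher's approval,
    withdrawals do not, but they only complete after a delay."""

    def __init__(self, matcher_key):
        self._matcher_key = matcher_key  # toy only: see the caveat above
        self.balances = {}  # user -> deposited amount
        self.pending = {}   # user -> (amount, block at which it unlocks)

    def deposit(self, user, amount):
        self.balances[user] = self.balances.get(user, 0) + amount

    def trade(self, maker, taker, amount, matcher_tag):
        """Move `amount` from maker to taker if the matcher approved it."""
        expected = matcher_approve(self._matcher_key, maker, taker, amount)
        if not hmac.compare_digest(matcher_tag, expected):
            raise PermissionError("trade lacks the matcher's approval")
        if self.balances.get(maker, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[maker] -= amount
        self.balances[taker] = self.balances.get(taker, 0) + amount

    def request_withdrawal(self, user, amount, current_block):
        """Start the slow exit; funds leave the tradable balance at once."""
        if self.balances.get(user, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[user] -= amount
        self.pending[user] = (amount, current_block + WITHDRAWAL_DELAY)

    def complete_withdrawal(self, user, current_block):
        """Finish the exit once the delay has passed; no approval needed."""
        amount, unlock_block = self.pending[user]
        if current_block < unlock_block:
            raise RuntimeError("withdrawal is still locked")
        del self.pending[user]
        return amount


key = b"matcher-secret"
exchange = CoSignedExchange(key)
exchange.deposit("alice", 10)
exchange.trade("alice", "bob", 4, matcher_approve(key, "alice", "bob", 4))
exchange.request_withdrawal("alice", 6, current_block=10)
withdrawn = exchange.complete_withdrawal("alice", current_block=110)
```

The model shows the tradeoff in miniature: no trade executes without the matching service, which is exactly what gives that service its arbitrage power, while the delayed withdrawal path lets users recover their deposits even if the service disappears.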

Conclusion
Centralized exchanges have serious drawbacks, perhaps most notably exposure of users’ funds to theft. But the wave of new decentralized exchanges that place funds under users’ own control does not fully protect those funds, and introduces new problems. It is tempting to dismiss the problems we’ve observed in EtherDelta as trivial, but we believe they will grow as decentralized exchanges do. What we’re seeing today is just a harbinger of problems to come should decentralized exchanges sweep over the cryptocurrency landscape. But since the problems that we’ve identified are exacerbated when higher-value trades take place, we conjecture that such problems will ultimately limit the popularity of decentralized exchanges.

This all is not to say that decentralized exchanges do not have a place or cannot fill a valuable market niche. We are also not claiming the superiority of centralized exchanges; notably, centralized exchanges can steal all user funds, the strongest form of unfairness. Nonetheless, it is important to remain aware of the tradeoffs made to achieve this decentralization and their potential negative effects on users of a certain exchange architecture. Someday, someone may devise a decentralized exchange architecture that doesn’t suffer from any of the limitations we’ve enumerated. This challenge remains an open problem.

As a final note, we observe that 0x and EtherDelta provide only a trading platform for tokens that circulate on a single blockchain. They don’t support trading BTC for LTC and so on. As our discussion illustrates, achieving even this modest goal with adequate performance and security is a herculean task for decentralized exchanges. Support for cross-chain trading requires far more complex architectures; see, for example, KyberNetwork and Omega One. Decentralized exchanges that enable cross-chain trading are likely to exhibit all the problems discussed in this blog post, and more.

[1] We are deliberately vague about the mechanism of off-chain communication since it isn't relevant to the scheme. A simple example would be email, with a simple directory-like smart contract associating Ethereum addresses with their owners’ email addresses.

Disclosure
In the interest of transparency, we would like to note that one of the authors of this blog post, Phil Daian, is an advisor to Swap, a P2P trading protocol that is an alternative to decentralized exchanges. Swap is not immune to many of the issues identified in this blog post, given that an Indexer, the Swap mechanism for counterparty discovery, could play a role much like that of a Relayer plus Order Book. (The Swap whitepaper provides insufficient information for detailed analysis.)