Limitations of Zilliqa’s Sharding approach

Developers
August 29, 2018


Zilliqa published a blog post on their sharding design today.

It is evident that if you don’t have sharding from day 1, your blockchain has no chance of scaling with adoption. Building sharding after the fact is extremely hard. For instance, Ethereum has one of the strongest engineering teams in the blockchain space, and yet their sharding release was pushed back yet again. In their situation, integrating sharding into the system is similar to changing a car engine while the car is driving.

Zilliqa is one of the very few protocols that promises sharding, thus we followed it closely from the beginning.


I was the engineer #1 at a database company called MemSQL. MemSQL builds a distributed analytics platform that has large clusters deployed at Goldman Sachs, Uber, Comcast, Akamai and many other enterprise companies. While at MemSQL, I was responsible for its sharding implementation. Since then, I’ve co-founded Near Protocol, which has two other early MemSQL engineers who were responsible for cross-shard transactions and complex distributed joins, as well as four ex-Google engineers.

In the past, we have built sharding that powers large clusters in production and processes millions of transactions per second per aggregator node. From our experiences, we know well how to implement sharding on complex systems and what practical issues it will have.

To come back to Zilliqa post, the essence of their message can be summarized in several bullet points:

  • Execute all the single-shard transactions in parallel;
  • Do not execute transactions that affect the same smart contract in parallel;
  • Do not execute any transaction that affects more than one shard in parallel with any other transaction.

Besides that, while not explicitly stated in the blog post, it follows that Zilliqa doesn’t shard state (this Ethereum FAQ provides the explanation of difference between sharded state and sharded processing).

Executing only single-shard transactions in parallel

Only executing in parallel transactions for which the transaction initiator and the smart contract are on the same shard might not be a big problem. In Fleta, the payments are entirely designed on the idea that shards can be treated interchangeably. It doesn’t quite work for Zilliqa, since in Fleta the shard is dictated by the sender, while in Zilliqa it is dictated by the shard of the contract, but it suggests that a similar idea might be applicable.

No state sharding

Not sharding the state makes our lives easier. For example, if the state is sharded, then even the very first example in Zilliqa’s blog post becomes obsolete: assigning the payment to the shard of the sender would not be enough, since the shard of the sender would not be able to update the state for the receiver. As a result, a task as simple as processing payments becomes very complex once the state is sharded. However, It is also worth noting that even in the absence of sharding by state, assigning payments to the sender’s shard only works if the accounts are represented as UTXO. If accounts store the accumulated amount, then two shards processing transactions with the same receiver will apply conflicting updates to the receiver’s account.

Nevertheless, not sharding by state, while simplifies the system design, imposes a huge limit on the scalability of the system. The only reason why Ethereum nodes can still store the entire state is that Ethereum only processes 14 transactions per second. Once a system processes thousands of transactions per second, the state will explode, since transactions do leave a trace on the state. Introducing sharding by the state later will be as hard as introducing sharded processing into modern non-sharded blockchain protocols.

Not executing transactions that affect the same smart contract in parallel

Similarly, not sharding smart contract processing, while making the implementation simpler, limits the scalability of a protocol. Ultimately, in any ecosystem, only a few applications dominate the usage, and as Zilliqa scales to thousands of shards, five top dApps will have to reside in five shards and be limited by both the shard’s processing power (and its storage once sharding state is introduced).

With the limitations described above and while also not processing contracts that by design affect multiple shards in parallel, Zilliqa will just make another incremental change in the landscape of scalable blockchains. They might outperform EOS, Thunder, and Algorand (or at least provide better decentralization than the former two), but are not future-proof, and such limitations will prevent them from scaling with the demand for the decentralized applications platform.

The area of research concerned with the execution of distributed transactions has a long history, and shall not be ignored in the development of sharded blockchain protocols.

For example, implementations of Map-Reduce, or generally engines that involve parallel processing, shuffles, and aggregations, have been used for parallel execution of complex transactions for more than a decade.

Why then do we not see an emergence of sharded blockchain protocols that are powered by techniques proven in the industry? The primary reason is that building distributed systems in the presence of failures is an extremely complex engineering task. The number of production-tested distributed database systems that are not coming from engineering giants such as Amazon, Microsoft, Google or Facebook, who have access to the best-distributed systems engineering talent, is very small.

From this perspective, Near Protocol, with its exceptional team of distributed engineers is uniquely positioned to build a sharded decentralized applications engine.

At this stage, we do not have our sharding technical paper finished — but we will release it soon. The way we develop our approach is more practical in nature, where we first build a prototype to test all of our hypotheses. In a field as complex as distributed systems writing a whitepaper before having a working implementation is often a rushed decision, although it seems to be a widely adopted approach for blockchain projects.

At a high level, transactions in Near are split into a series of parallel “map” steps, interleaved by “shuffle” steps, and the state and execution of each smart contract is sharded. This enables execution of arbitrarily complex programs in parallel. Near also doesn’t introduce its own programming language and instead relies on the entire ecosystem of transpilers to WebAssembly, as well as access to the state in the form of SQL queries.

Stay tuned for our sharding technical paper!

To follow our progress you can use:

https://upscri.be/633436/

Thanks to Bowen Wang, Aliaksandr Hudzilin, Mikhail Kever for helping putting together this post.


Share this:

Join the community:

Follow NEAR:

More posts