Trade-offs In ZK Design Space
Based on research sponsored by Starkware
For years, the design space of cryptocurrency seemed constrained, like everything was 2-of-2 multisigs or 2/3rds of N multisigs and an oracle. Zero Knowledge Proofs provide a major lateral breakthrough in the design space of mechanism design, now you can create verifiable checksums based on complex data while maintaining privacy about the specific inputs.
Starknet uses Cairo, a ZK-focused specialty language, but what are some alternative ZK infrastructures on Ethereum and how do they differ in key architectural choices?
But first: broadly, how do ZK proofs based on STARKs work?
They arithmetize computation. What’s that? It’s putting computation into a matrix, where each row is a step and each column is a register, in that form it can be translated into mathematical constraints that can be tested.
Eli Ben-Sasson put it very well in a recent paper:
‘Arithmetization reduces statements about computational
“I processed T = 10, 000 valid Ethereum transactions, leading to new Ethereum state S”
to completely different statements, about low degree polynomials over a finite field F, like
“I know polynomials A(X), B(X) over finite field F of degree at most T that satisfy a set of polynomial
Everything in the space does something like this, but using different virtual machines.
We are going to do a break-down of the top contenders in the Ethereum and adjunct L1s that are ZK based, Starknet vs.:
ZKSync and zkPorter
Mir (Polygon Zero)
Winterfell (formerly FB, now Polygon Miden)
To save you a lot of time parsing thicc prose, Vitalik did a shorter essay covering the spectrum of different approaches which is a good complement to this 9k word article.
SNARKs vs. STARKs
The primary design consideration that divides the different implementations of ZK proofs, are split between the Zergling-like, small and fast SNARKs, and the Protoss-like expensive but powerful STARKs. Starkware is eponymously focused on STARKs, we’ll be examining overall trade-offs between SNARKs and STARKs followed by a breakdown of differences with Polygon’s and Matter Lab’s STARK-based implementations.
1a) Scaling Power vs. Compute Time
Zero knowledge proofs rely on asymmetries in mathematics, just like older encryption. Schnorr's identity algorithm is the basis of Bitcoin transactions or PGP emails, but before that, it was based on factoring a large prime number and the asymmetry between calculating the pubkey and knowing the privkey. The verification of every transaction in crypto is a verification of a low-knowledge proof.
There was an example of a zero-knowledge proof for a 10 year old on the ZK video in Wired’s 5 Levels Series. The level 1 explanation was this: here’s a picture with a cat in it, I want to prove to you there is a Puffin bird in the picture without revealing to you where in the picture. He does this by showing the Puffin through a hole in a poster that obscures the positioning of the picture. The 10 year old looks through, sees the Puffin, but does not know what part of the picture the Puffin was nested in. She describes it as a “magic trick” but it’s not a trick, it’s based on real asymmetries in math and systemic constraints. In this simple example, the constraints were the geometry of the larger poster rectangle enclosing the geometry of the picture, the field space of possible positions of the poster was high and improbable to guess, and the proof mechanic involved looking at a single spot, which is analogous to how ZK proofs are verified by testing a limited number of times.
The solution to proving Riemann's Hypothesis most likely involves non-commutative logic. The same chaotic periodicity that gives us the unpredictable prime numbers, is reflected in the distribution of 0's in the Zeta function on the imaginary number axis, Riemann's hypothesis was that the zero would always occur on the "critical line" of 1/2. The occurrence of an x where that was true, was as sporadic and unpredictable as the frequency of prime number incidence. Computer models have tabulated these trivial zeros very far out on the imaginary number line, and there still have been no non-trivial zeros. But to actually prove that they will *never* occur, you have to base the proof on an implication rather than an assertion. To date this hasn't been done, and there may be better approaches to the problem. ZK proofs are kind of like this, resting on structural implications in mathematics.
In all of these ZK proofs, finite subsets of prime numbers are also employed, these mysterious sequences underpinning number theory keep showing up as extremely powerful tools for privacy and property protection, as well as sharper ways to keep systems accountable. The key difference between SNARKs and STARKs is that SNARKs use math involving set-up secrets, and STARKS use low-degree testing, Fast Reed Solomon Interactive codes (FRI) to make oracles capable of speeding things up, and does not rely on a set-up Tau value. However over time (notable between late 2020 and late 2021) SNARK R&D lead to SNARKS that utilize those above mechanics, reducing the differences between SNARK and STARK.
The Velcro(tm) mechanism that makes ZK-STARK - and STARK-inspired recent SNARK protocols - hook and loop into the verifier, is that the verifiers can apply a series of tests to see that the polynomial equations derived from the proof are of a sufficiently low-degree, meaning the exponents in the equations aren't too big. We want small, tidy exponents, and no funny stuff with these big exponents. What are we, taking something to the exponent of 362? It's positively indecorous, in a Bayesian sense. How about x^12, that’s more genteel.
It's an asymmetry in mathematics that it's possible to translate an execution trace into these polynomial functions and then make mathematical assertions about them via a series of tests. If you test many times your odds of getting fooled drop off, and the tests are computationally cheap and fast to do, both SNARKs and STARKs carry this property, it's fundamental to ZK proofs. But where they differ is in which fold in the geometry of mathematics they hinge their proofs on.
ZK SNARK math is based on a key generator, a prover and a verifier, ZK STARK math doesn’t require a key generator because its math twist is more generalized to properties of execution traces.
ZK SNARKs follow this:
A constraint function in this case is something that would return true or false, testing some mathematical statement about x and w. With the keys we proceed:
proof = prover(Proving_Key, x, w)
Verification = verifier(Verifier_Key, x, prf)
STARKs don’t have that relationship. Instead, we go from:
->Execution Trace is produced, a matrix of register values traces each step of computation
->Polynomial equations are factored out of the matrix
->A series of low-degree tests are applied with the help of the prover to speed things up, but in a way that can’t be cheated
If the tests show the polynomials are of sufficiently low degree (low exponents in the equation), we can have ultra-high confidence in the proof.
SNARKs are cheaper to make but top out at around 2k tps, STARKs have a logarithmic payoff on scalability and because Starkware built a Domain Specific Language (Cairo) on a customized general proving circuity/AIR, you can aggregate lots of programs into one proof for maximum scalability in a transparent way. That's what the S and T in STARKs stand for: scalable and transparent. The S and N in SNARKs stands for “succinct” and “non-interactive”. STARKs embrace interactivity was a valued-added optional service.
A quick aside about this term “non-interactive”. The prototypical ZK proofs in academia back in the 1980s were interactive, you had to sort of interview the prover to verify it. Then they invented things that work as a single payload, such as the above SNARK. The probabilistic proofs model of STARKs actually embraces interactivity again, not as a starting point, but as an optional efficiency booster.
STARKs use interactivity factored out to just a few clues shared by the prover. It’s not interactivity that can have time-outs, the prover help is received as a single payload, it’s not *highly* interactive. Why not just have SNARKs with their Lamda value? The Lamdba randomizer is a set-up parameter, but the helpful clues from the prover are vitamins, we’re trying to make it possible for lots of people to crunch big proofs by tightening computing time. STARKs don’t depend on a special set-up, it just *is* as a mathematical object with ZK proof implications. Without any prover help, testing could still be done to verify the proof, but execution time would be perhaps 60-80% slower. A big part of STARK R&D is optimizing for scale around an inherently expensive approach that can also be optimized at various levels: AIR’s, the Cairo proving circuit, Cairo code’s use of memory, and the low-level prover all involve a trade-off that pays off brilliantly at scale. There are interactive mechanics that change the compute time for verifiers and make them very quick indeed, milliseconds on retail equipment.
In order to arrive at the proving system used by Cairo, Eli Ben Sasson and his small team of research colleagues produced two handfuls of papers proposing different mathematical approaches for doing probabilistic proofs and Oracle Aided Testing. They settled on one out of all of those solutions due to its optimization potential; generating proofs can be broken up in fractal recursive bits that other parties can work on in a distributed fashion and we expect it to be CPU intensive, but verification needs to be extremely fast so network latency doesn't suffer.
Where are ZK SNARKs strong?
SNARKs are great for speed, which is why the Mina blockchain was designed to maintain a continuous 22kb size as a rolling assembly line of highly distributed recursive proofing. It was able to quickly arrive at a mainnet and a community engaged in the recursive proofing network. We'll look at Mina in more detail later.
SNARKs also suit a highly parallelized architecture for scaling Ethereum, even if lacking some economy of scale in the process. SNARKs top out around 2k tps individually, but individual markets could be parallelized to their own SNARKs and aggregated as a market selection.
Why did Starkware choose STARKs? The co-founder invented STARKs in his academic work. Why did he feel it so imperative to dedicate almost the entire 2010s to a difficult R&D rabbit hole with no guarantee of hitting the jackpot on discoveries? In one of the probabilistic proof papers linked to in the STARK Math series, the first three words are "human dignity demands" - it's a sentiment common to many entrepreneurs in this space. Also the commercial potential of technology based on these mathematical discoveries was probably in the researchers’ imaginations.
Ben Sasson and his team, building on the work of other academics, eventually found themselves rooting through polynomial theory to arrive at a variety of methods, picked the most well-rounded algorithm based on asymmetries in that math space, then optimized for performance. All of Starknet is going to eventually be a big recursive proving network for these log-efficient proofs aggregating blocks of Starknet's L2 activity into an L1 proof.
1b) Collision Resistance and Quantum-Safe Cryptography
Quantum computing is coming and classic ECDSA signatures won’t resist them. The hashing functions used in SNARK generation all have this tiny risk of collisions that won’t resist quantum decryption. The quantum safe cryptography used in STARKs is based on a random oracle model (for instance using block-time as a randomizer). It seems like a fantastic tail risk right now but it becomes highly probable within 20 years, knowing that STARK infrastructure is future proofed for the next few decades is reassuring.
1c) Polygon Miden vs. Starknet
Miden replaces a generic computability AIR like Starknet’s Cairo with a STARK VM which approaches the same challenges from a different angle: a stack machine. It utilizes recursive proofing to keep a sync with the Ethereum blockchain at a ratio of 20 sidechain blocks to one mainnet block, Starknet is more flexible with timing ratios. Miden uses Wintermute which was developed at Facebook, and is lead by the former team lead on Wintermute.
The Miden VM is an evolution from the Distaff VM with Wintermute toolings, it executes one field element operation per second. Field elements will come up a lot later on if you dig into the Cairo vs. Clarity and Cairo vs. Circom articles, but the shorthand is that a field element is a number inside a special mathematical set such as the prime numbers which are used in proving, and integers, strings etc. all must encode into some field element, or “felt”, as it occurs as a keyword in Cairo. Field elements are fundamental to programming for ZK STARK proofing. What is a Field Element? Well sir: it is an element of a field. What field? A Galois field which is modulo a large prime number. Oh, ok. The elements are numbers within that set, we count in those increments in the low-level arithmetic of the Cairo VM. Cool cool, fun for programmers who like math and adventure.
The design choices behind Miden seem to have been motivated by a compromise between isomorphism to traditional Solidity development and control over the underlying circuit translation into arithmetization. Similarly to how Cairo’s bespoke proving circuit offers an intercession between complex ZK STARK math and general business logic or complex computation in a contract, so too would Miden VM allow a medium to format one’s .sol files into proofs. However the key distinction is that Cairo is a domain-specific low-level language with lots of high-level structures to work with in a toolbox like fashion, whereas Miden choose to import the general tooling of Solidity contracts in their current form, and get developers thinking about interactions with the VM stack. Similarly to Cairo, there is a strictly functional programming environment with no RAM assumptions, unlike Cairo the memory management is relegated to the up-to 32 Stack layers in the VM. Transformation of data into new data, returning from some function, is going to have to be thought of in terms of op_codes interacting with a stack number, with a value to push (often 0).
Why does Miden deal with the abstraction that way and Cairo the other? The underlying mechanic of this stack as it goes into Winterfell is an Execution Graph. How does an Execution Graph differ from an Execution Trace that powers Cairo’s proving model? Both are models of general computation, Winterfell uses Execution Trace as its core proving model, but for more on Execution Graphs see this presentation. An Execution Trace is a model of a finite operations execution graph that is contiguous (as a matrix rather than a graph) and congruent to the underlying machine (via the register columns).
Thinking in terms of the stack can be challenging in a different way than learning Cairo is challenging. The key difference is the stack embodies rule-making for AIRs while Cairo is itself an amalgam of a core Cairo AIR (like a big Rosetta Stone) and a series of builtin modules which have custom AIRs for faster performance (usually 3x-5x). In Miden, developers may find room to optimize by thinking in terms of their own AIR rules. For more on lower-level writing of AIRs, check out Cairo vs. Circom.
However it’s important to consider that this project directly references and acknowledges its intellectual foundations with Eli Ben Sasson’s work, and the prover is functionally an open-source version of Starkware’s prover.
It’s a nice graphic to summarize how STARKs work.
1d) Polygon Hermez vs. Starknet
Hermez is a beast, for the most part it uses the same polynomial commit methodologies around stack traces as Starknet. However its VM builds SNARKs, not STARKs, giving it a different throughput trade-off compared to Miden. Like Starkware’s proofs, they constitute a Validum and not a roll-up in the technical sense. It uses a series of commits of individual trusted set-ups to sum to a more decentralized variation of the standard SNARK trusted set-up.
While Miden introduces some complexity around stack manipulation within an EVM setting as an alternative to Cairo’s general computing AIR, Hermez takes the complexity a bit further.
The way Hermez’s virtual machine approaches execution traces it is appends them with auxiliary registers and them does abstract algebra around factoring that extra column out, and other work-arounds to consider non-deterministic execution, and provides a Miden++-like stack executor for developers to work with inside a Solidity framework.
The net result of this approach to folding polynomials out of an execution trace is a Read Only Memory output in binary that can be checked using the PlookUp algorithm, which is an evolution from PLONK.
What’s interesting is they’ve designed a set of opcodes that are themselves congruent to specific Polynomial Constraints. There are 14 of them, plus the ROM opcode which is predetermined. You have to respect cleverness where you see it. Here’s a VM stack machine that analogizes specific constraints to stack operations. Is this better than Cairo? What’s the difference in feel for a developer thinking with these abstractions between the two? Probably this is a good VM for Miden developers who have gotten comfortable with that VMs stack operations and they want to try controlling execution flow in a more fine-grained way using this VMs set of opcodes. These Solidity wrapped stack operator VMs don’t involve debugging Cairo-specific memory allocation bugs that could result from a Solidity to Cairo trans compilation; the translation isn’t guaranteed to compile immediately.
Another way to put it: would you like to deal with Solidity + an Assembly-like framework, or would you like to deal with a bespoke framework that combines aspects of C++, Assembly and arguably more modern high level languages like Idris, Haskell or Scala.
1e) Polygon Zero vs. Starknet
Polygon Zero is the re-brand of the Mir protocol which pertained to be an open-source prover generic ZK solution that was built off of research done by Starkware founders. It has more in common with Starknet of all the Polygon ZK protocols. But how much so? The above two also have core structural things in common.
The principal technical innovation in Mir is the creation of Plonky2, a new take on the Plonk-related family of recursive proofing systems. From their blog:
“Plonky2 is a recursive SNARK that is 100x faster than existing alternatives and natively compatible with Ethereum. It combines PLONK and FRI for the best of STARKs, with fast proofs and no trusted setup, and the best of SNARKs, with support for recursion and low verification cost on Ethereum.”
As Vincent Vega would say, that’s a bold statement, let’s unpack it.
First it’s important to clarify that “the best of SNARKs” does not imply that the Validum produced is based on a SNARK, rather it has optimized the recursion model that Starkware uses with customized circuits that build off of PLONK. The objective is to then have a better economy of scale than Starknet at the level of Ethereum proof verification’s gas cost. Whereas other recursive models depend on Plonk+Halo and Starknet depends on FRI; Plonky2 - the sequel - depends on Plonk+FRI. More on all these in section 2.
1d) zkPorter vs. Starknet
zkPorter is more analogous to Polygon Avail or to Filecoin, Arweave or Celestia than it is to Starknet. Starknet is a recursive-proofing+Ethereum archival node solution for general computation, while these other products are all data availability oriented. Starknet is based on data-availability-free Validiums, rather than roll-ups where the proof has a map included. zkPorter complements zkSync’s roll-ups with referential data. While Arweave and Filecoin are general purpose computation-oriented, zkPorter and Polygon Avail are designed around complementing other zk-rollups.
Recursive Proofing Approaches
Proof of Work was originally envisioned as a CPU-based exercise, but due to its open ended difficulty adjustment and industrial economics, ASICs made that impossible, early PoW chains that were CPU-based became dominated by miners using botnets to steal CPU and electricity usage from unsuspecting Windows users. But in order to massively distribute the cost of calculating zero knowledge proofs, breaking proofs into recursive components is done, allowing potentially many, many nodes to massively co-ordinate. Every major protocol does this, and the demand for user full node CPU performance brings back that early 2010s dream of computers just producing value by running software.
Recursive Proofing involves choosing a key mathematical approach and algorithmic approach. Various teams driven by highly trained cryptographers, computer scientists and mathematicians working in tandem have devised ways to do both STARK and SNARK based recursive proofing. In some cases, such as the Mina protocol, this is taken to an extreme of SNARK-based speed and throughput efficiency, enabling a tiny blockchain with a 22kb footprint to maintain such size through a conveyor belt of proof work being done. In other cases, the recursion enables the scaling of proof output atl low latency back to a main chain such as Ethereum, inheriting security. Every Starknet full node will be able to participate in recursive proofing activities when mainnet goes fully active.
Halo was invented by Electric Coin Company employees in 2019. Halo, like other SNARK models adopting probabilistic oracle-assisted proofs, uses Polynomial Commits as a mechanism for abetting proof verification. Polynomial Commits and Low-Degree testing are not unique to STARKs, Halo utilized this and a contemporary protocol called Fractal had more STARK-like qualities. It uses vectorization to break down fields into smaller fields that can be used in a proof and attached to an argument about additional proofs. Or is it 50x more complicated than this contrite description? Yes. For more on that, here’s the original paper.
Because all these different systems operate in math land with these fields, and involve tracing circuitry somehow through some # of computing operations, you can break down the math into less CPU intensive chunks.
The structural design innovation with Halo was not the use of Polynomial Commits, but the use of Amortized Succinctness and with it, Nested Amortization. Spend linear time working on your novel bit, spend log-linear time verifying what the last guy passed on, and then organize this with N layers deep recursion through the network so everyone can ants’ nest the whole picnic into a feast of crumbs. To a large extent this ant metaphor applies to recursive proofing in general, and Halo’s heart-beat economic innovation was making the core prover circuit not ever have to perform linear time operations itself, it just makes proofs about proofs. You might say Halo’s innovation was adding the pheromones of enforcer drone ants to the organizational flow of the ant nest.
The first big design goal with Halo was to factor out the trusted set-up conundrum with SNARK technology, so they did that. The second goal was to achieve scale efficiency in breaking up proofs between distributed nodes, and they achieved a log-linear growth efficiency for that relative to contemporaries way back in 2019, which in ZK research is now a long time ago. They succeeded in a big way and inspired the 3rd generation of recursion protocols below.
STARKs and SNARKs both use polynomial constraints and low-degree testing and can experiment with oracle-assistance using polynomial commitments as safe hints. Nor is trusted-set-up a quintessential park of SNARKs as some of these protocols factor out. FRI is a big part of what gives STARKs their unique structure and scale properties.
FRI stands for Fast Reed—Solomon Interactive Oracle Proofs of Proximity - hey they could have acronymized it to FRSIOPoP, pronounced Frisiopop, giving it a colorful Dr. Seuss-like character, but FRI will do. First off, FRI doesn’t rely on the eppilleptic curve with prime numbers math as much as it works with other kinds of Galois Fields. Secondly FRI is introduced in another paper by Eli Ben Sasson and colleagues building off of earlier work by Solomon and Reed on Probabilistic Checkable Proofs (PCP). Reed and Solomon, like Denzel Washington in Training Day, didn’t know you liked to get wet, because they have zero knowledge about you.
What exactly did Reed and Solomon do? They invented a way of sifting Polynomials out of Finite Fields! Sounds like a convenient way of translating computer programs into other numberings then threading those into testable ZK proof functions. Sasson and co. then took the next step of associating a *proximity* in these function spaces for an oracle who can safely provide assistance to verifiers based on their privileged information.
Sasson and co. then took those math building blocks and designed a messaging protocol to amp up the throughput. Here’s an intermediate level breakdown of the protocol.
There’s a variation on Polynomial Commits for the Prover to Commit help in the form of a Galois Field configuration. Then there is a formal Query process that does a number of math tests on the Committed information to make sure it’s legit. What this dance accomplishes is a 2-part flywheel analogous to the meta-proof in Halo, and it allows the network to chain these Commits and Queriess to aggregate recursive proofs. Instead of enforcer ants with special pheromones in Halo, this Commit/Query loop of assisted proofing is more like an enlightened Canadian rather than Stalinist ant hive, each worker can share regulatory pheremones with their neighbors. Since this is a ZK proof system the Canadian ants should be able to have private and censorship resistant money in case the Canadian Ant Queen tries to embargo their bank accounts.
FRI additionally adopted an enhancement called DEEP (Domain Extension for Eliminating Pretenders) which filters down on potential for prover-assistance to be fraudulent. DEEP, like a moody teenager who thinks nobody else understands, is conceptually quite simple, it involves random sampling within a large domain to compare consistency of two polynomials.
Captain America: I understood that!
For a deeper dive on DEEP, not for the meek, read the paper.
Finally FRI achieves a de-interactivizing of these prover-help methods using a Fiat-Shamir Heuristic. You know Shamir’s secret key, where you split your key in parts? That’s the guy, co-invented RSA. And with Fiat, Shamir co-invented an early ZK proof system that wasn’t fully interactive. This is their heuristic, or rule of thumb:
You should take your probabilistic Oracle Interactive Proof and make a cryptographic hash signature of it to prove the interaction occurred. This is what was meant in the first section when STARKs are said to utilize interactivity in an asynchronous way. The paper on this is here.
2c) To PLONK or not to PLONK?
PLONK is the funnily named protocol for breaking down a program into a circuit rather than an execution trace, calculating a set of polynomials based on the implied gates, and doing some FRI-like probabilistic checking of the equations. Deep dive here. Lots of things are based on PLONK so it is seen as somewhat foundational, but it’s a parallel approach to proofing.
What makes PLONK useful for structuring recursive proofing networks is that the circuits can prove other proofs fast, while the additional proofing work to append is done in linear time. This is similar to how Halo achieves its scalability. <fact check>
Key similarity, polynomial commitments are used to frame up checking a proof probabilistically. Key difference, the circuit that factors the polynomials used for identity comparison and testing, is a discontiguous model of execution vs. an execution trace, which factors polynomials based on a contiguous array. In terms of output the key difference is that PLONK emits 90% to 99% smaller proofs, but lack the quantum-security premise of STARKs.
It’s clear that there are throughput advantages to scaling network topologies. In this article we’re semi-deep-diving and passing links for the math deep dives, to a lot of different ways of… what you might call, “folding mathematical space into ZK proofing origami”. If you block box a lot of the different math approaches in your mind, and look at their I/O properties, the key attribute other than the log-scalability of STARKs vs. SNARKs is how the smaller size advantage of SNARKs and PLONK facilitates potentially greater network flexibility and pingback.
Consider the classic problem of a Bitcoin orphaned block, Bitcoin nodes ping 8 other nodes who ping 8 others etc. and after several ping throughs of geometric growth, a packet can propagate to the mempools of >50% of nodes and especially supernodes (miners). If the interval of each ping-through is too slow, e.g. if the blocks are full at 8MB and take quadratically long to verify then there can be more orphans because of this extreme latency and other blocks show up and take their place. The network becomes obfuscated. Of course now Schnorr signatures have been activated and you could probably linear-time crunch an 8MB block on Bitcoin full of those, but hey, 2017 was a long time ago! This is what people used to debate about before ZK proofs became noted in commerce.
Now having spent a paragraph on that, it must be noted, Starknet vs. Plonk or Mina and other extreme throughput-optimized examples isn’t neatly analogous to the throughput concerns around Big Blocks and orphans. Because, Starknet’s ultimate validum publications are accretive to all the proven computation, the proofs connect in a way that the Longest Chain Rule of Bitcoin leaves the door open to weird game theory scenarios like Selfish Miner Re-org attacks. Starknet doesn’t have that problem.
Will Starknet recursive proving network throughput be slower than some of these others? Even so, STARK log-scale efficiency gains means that everything on Starknet, at some momentum scale, still can win on overall throughput, because the prover CPU economy per transaction gets better as the STARK gets bigger.
We can posit the term fecundity for the network scale output efficiency of the prover network, as opposed to the throughput of individual operations. It’s a special kind of efficiency that defines the economic utility of each marginal prover that joins the network, as a function of network scale. A network is highly fecund if it can reproduce proofs with the highest possible economic utility, which is going to be correlated to more basic throughput and timing metrics.
On a final note for the deep-divers, here’s a paper for a Plonk-inspired protocol with a new primitive called a List Polynomial Commit, which bases its reasoning on comparison between Plonk and FRI. That approach, RedShift, would be later used by Matter Labs for ZK Sync.
2d) Mir vs. Starknet - Plonky2: Revenge of the Sequel
Plonky2 (paper) comes to testnet in Q4 2022 as an innovative proving algorithm combining PLONK and FRI for ZK SNARKs.
Which immediately begs the question to the Starkware team, if low-degree testing, FRI, execution traces, oracle-assisted proofs and Validiums, can be done with SNARKs, what is the quintessential feature of STARKs that differentiates them structurally? <question to team>
The principle mechanical innovation in Plonky is encoding a witness in a 64-bit field, which would not have been possible without using FRI’s approach to finite fields of prime numbers small sub-groups, vs. the prime number fields used in older SNARK models. However to understand what Plonky2 doesn’t innovate on, we have to look at its foundations in TurboPlonk, from the Aztec team.
Like Plonky, TurboPlonk and Plonky2 both use a circuit gate model to create a TransitionGate that vectorizes all the transition constraints into a polynomial, plus a copy constraint for memory assembly. They use these to create an arithmetic circuit by combining transition gates and then Plonk circuits with an Initialization Gate to constrain the starting conditions, creating another type of virtual machine. This structure can execute a number of core lookup and mathematical functions, they are highly efficient with Pedersen hashes and Elliptic curve multiplication. The team invented a new way to crunch the Elliptic curve products, called affine point addition, which saves computation, though it uses a division between x values that are presumed to be scalar fields, and that’s expensive! So they figured out another work-around to make the net-result fast.
TurboPlonk makes a lot of design choices that weaken performance superficially, then finding work-arounds to re-balance the performance and flip it back to net-positive.
Plonky2 took inspiration from TurboPlonks technique of compounding gates together, and uses this to achieve an FRI implementation that uses *only* a trillion gates instead of a sextillion, as Fractal uses in its FRI implementation. They use a smaller field to speed-up memory bottlenecks and then an extension field where a larger field is needed for soundness. They sieve the polynomials to use from these gates by having a binary tree get one polynomial for each level of depth, and the Plonky2 algorithm will prioritize tree configurations where higher-degree polynomials occupy the lower depths near the base.
What’s interesting is that there are soundness deficiencies that are the cost of the faster speed. Plonky2 uses redundant computations to round that soundness deficiency up to an acceptable threshold of confidence. This is similar to TurboPlonk trade-offs that are then compensated for via another technique.
Plonky2 further improves Plonk by replacing TurboPlonk’s Initialization gate with a PublicInputGate, which has a similar binding mechanism, and by blinding the prover polynomials using randomized inputs before applying a power of 2 and increasing the size. Finally Plonky2 uses Poseidon hashes, a logical choice. Being hash dependent, based on Merkle trees, Plonky2 achieves recursive fecundity by optimizing hash arithmetic. It uses a Sponge to streamline the hash I/O and Substitution Boxes for security, and turns the latter boxes into one PoseidonGate. Intermediate value inputs to the S-Boxes keep the degrees of all constraint Polynomials to 7, which is a serendipitous R&D payoff.
The rest of the process, executration trace, Polynomial testing, is wider (135 columns) than Starknet which creates some small trade-offs in generation time. The low-degree test is modularized to allow multiple polynomials to be utilized for faster distributed proofing times and a variation on DEEP called DEEP-ALI is used to gauge soundness.
Probably the most brilliant innovation in Plonky2 is the return of the contiguous execution trade matrix into a discontiguous Disjointed Forest, as some of these rows of the trace are redundant under Plonky, so why bother with them.
FRI tweaks include consolidating the number of Merkle trees, a uniform 8 arities (operands, or function parameters), aggregating oracle blocks into single Merkle leaves, pruning overlapping paths, accepting low degree reductions instead of constants, and grinding. Grinding was invented by Starkware and it deals with the problem of added byte length due to decommitments following up from earlier committed polynomials. They make it harder for a malicious prover to cheat by adding a 64 bit nonce that generates a hash whose leading zeroes imply work performed; an honest prover only has to crunch this once - in that sense Grinding is a mix of PoW’s anti-sybil properties and a Verifiable Delay Function.
Plonky2 throws everything including the kitchen sink at finding efficiency boosts in realm of Plonky SNARKs that can recursively prove quickly, plus cutting FRI’s math corners, plus incorporating TurboPlonk’s custom circuitry to optimize. Furthermore, it requires an optimal configuration, like any deep strategy game, finding this may be non-trivial but the paper recommends a ⅛ codeword rate for split-second small proofs, and a 1/256th rate to get large proofs to 43kb in substantially less than one minute.
2e) Mina vs. Starknet
Mina is kind of like the Solana of ZK proof systems, if the Solana founders were small blockers. It optimizes for sheer throughput, embracing the most lightweight SNARKs to streamline small-scale participation in the network. Everyone is meant to generate ZKs regarding their own transactions and admit these into the queue for agglomeration with the rest of the network. Also, it is oppose Solana in that Solanan has a follow-the-leader approach of routing packets straight to the designated block producer of a given half second, and here we have a global queue which gets picked up and factored between various node workers.
Rather than a blockchain, the chain is itself a single 22kb proof which implies all the prior instances of itself, *kind of* like a blockchain, but there’s no Merkle or Patricia Tries structuring it, “chain history” is structurally endemic to its own proofyness. <fact check>
Mina’s core innovation in ZK methodology is to take the Plonk model and avoid the efficiency loss that comes from amping an already power of 2 polynomial by another power of 2 to achieve ZK masking, so instead they take a series of witness elements from an execution trace, build a tuple that contains those and many 0s, interpolate a Witness Polynomial, and then do some math using that. Mina also makes use of Poseidon hashes which are based on this paper.
Cairo has libraries for SHA-256 but it is a performance grinder, there is a Pedersen Has builtin and a Keccak builtin, but an algo is easy to ZK-ize or it isn’t, even if you optimize with custom AIRs or circuits. Poseidon is a really cool innovation, Cairo ought to get a library and then a builtin to utilize it.
Language and Tooling Environments
Zero Knowledge proofs are not just a strong tool for batch settling off-chain sets of transactions onto a host chain, they’re not just for doing cheap Dex trades, they are a new kind of primitive in the toolbox of mechanism, game and app designers. Starknet created a programming language called Cairo to make it intelligible for creators to think around ZK proofs. Meanwhile the CirCom language has originated in recent years as an open-source initiative with a mix of corporate sponsors. We take a deeper look at the two languages in a follow up article.
3a) Circuits in CirCom vs. the “Rosetta Circuit” in Cairo
We’ll deep dive this in “Cairo vs. CirCom” but the reason Cairo is so-named is because it is an example of a, arguably semi-baroque type of circuit design, one which intends a compile environment for generalized, Turing-complete computation within an Assembly-like configuration, that has a lot of helper keywords and specialized modules called builtins, which use *other* circuits than Cairo’s. In Circom or cousin AIR languages, you would design circuits for specific algorithms and if you were very good at it, with a team, you might try to design a Cairo competitor based on, e.g. better garbage collection in your VM. It could be JAVA all over again.
The name Cairo comes from the discovery of the Rosetta Stone in Cairo in the 1799. The stone had the same message written in 3 languages, 2 of which were known, and one of which was not, ancient Egyptian Hieroglyphics. The stone acted as a ZK proof for the syntax of that language. But what the Cairo language does is provide a syntax for carving Rosetta Stones of behavior into assembly registers, in order to produce ZK proofs.
In the other contenders the use of circuits is implied by constructions within the VM stack architectures. You have Solidity wrap key instructions and op codes pushed to the stack and it resolves contiguously as its own execution trace. These projects have arrived at alternative models of creating a generalized computing circuit.
Now that’s, 18kb
In this context the stone is actually a low-knowledge proof because there is the text in two other languages, it doesn’t translate an execution trace into polynomial constraints. But it does have constraints, the syntax of the respective languages in the translation. If you have ever been curious as to what it actually says, here is a link, it’s semi-bureaucratic and semi-poetic, encoding trade data, ceremony, edicts and a holiday.
3b) Language SDKs
Cairo has a transcompiler from Solidity to Cairo called Warp. Cairo provides 3 ways to build: a python virtual environment, Visual Studio extension and the Playground, which is in-browser. Cairo should have more .js tooling and this is something that people are working on in the community.
Matic.js is the go-to SDK for doing things on Polygon, which would then have specific modular frameworks for each Hermez, Miden and Zero VM environments. Hermez offers a sample project to learn from, editing a wallet. Miden’s repo has an examples folder. The Zero/Mir protocol’s repo for Plonky2 has no SDK, examples and requires frequent updating to participate in, as it is not yet stable, in time this will get better.
Mina has a Dockerized devnet connection set-up and a .js client-side SDK, with a full SDK coming soon.
So far Cairo’s competitive advantage comes from the relative abundance of material around its use, sample code bases, frameworks and vectors by which non-Cairo devs can contribute to a Cairo project (with at least 1 resident expert, ideally C++ or Assembly dev, to fix memory alloc bugs and so on). However this is a very competitive space so we can expect this gap to close over time.
3c) Learning Products
A testament to the sort of surprising upside you can get with more developer tooling network effects are fun products like Symbon, a gamified process of learning Cairo by making a game in Cairo.
As yet there isn’t anything like this on Mina, though with their launch of zkApps (the smart contract end of their L1 protocol) we might expect a grant to create a similar learning journey in a fun and ordered form. The same is largely true of the Polygon-sponsored protocols as their launch process is a bit slower and didn’t prioritize the same tools. Again, a gap that can be closed over the next year with some prioritization.
Validium’s vs. Roll-ups
Validiums are a coinage for what was originally a version of Plasma (remember that? From 2017?) incorporating zero knowledge proofs. Whereas a roll-up embeds some data availability, a Validium assumes the verifier has access to that data off-chain.
The duality of ZK proofs and data availability is fundamental to their math, and the decoupling is kind of the point of zero knowledge. However, they must then be recoupled under perhaps imperfect information for verifiers to contextualize what they are doing. This is why ZKPorter from Matter Labs or Polygon Avail are parts of the quiver.
In Starknet the core client is an implementation of Ethereum with this added Cairo VM. The client is able to look at Ethereum archival state to source data availability directly. However the dependence on single-threading in LevelDB may lead to bottlenecks in performance, one solution is to use another DB schematic for blockchain archiving such as VulcanDB. Another solution would be to custom-engineer compartmentalizing of archival access for Starknet’s client. Either way, high-performance “native” data availability that is self-consistent to a single L1 protocol is possible.
zkPorter vs. Starknet
Porter and Starknet both use Validiums. All the hype about Plasma back in 2017 was actually legitimate, it just needed a rebrand. Now imagine if you had something like Tendermint where you stake your tokens to earn fees, but accept the responsibility and liability of being penalized if you fail to fulfill them. In zkPorter’s case this is to ensure data availability that makes the Validiums checkable. The prize feature for Porter isn’t being a PoS Validium sidechain, it’s the infrastructural benefits of having the ZKSync roll-ups interact with Porter to delegate DA, then having the Validium as a check on that.
In Starknet’s case, there are no tokenomics around data availability, rather something close to the full validating node principle on Bitcoin. How could you run a Lightning Node that routes BTC without having a synced node plus being privy to the relevant subset of LN signed transactions comprising your routes? You need the full node, but not for staking, for operational uptime execution based on archival data. Another analogy is OP_Return layers like OmniLayer and TradeLayer where reindexing the chain history produces an archival state of the layer, in this case the layer is Starknet itself. However, unlike those systems there’s also the top-line Validum on L1 which nicely bridges everything (whereas LN needs a channel to “bridge”into the network, OP_Return layers need an atomic swap of UTXO for tokens).
To further differentiate from Porter, Starknet had its own evolution of delivering Porter-like services, in 2020 StarkEx used essentially a multisig to co-assert its state changing Validums and preserve data records. Staker slashing is a way to manage a big multisig, and this may remind of you Liquid and various other side chain models, albeit more attenuated, custody only of data not assets. This has evolved into everyone on the network using inline data from their modified Ethereum validating nodes to co-attest to everything, and the data redundancy of those few who run full archival nodes. It also supports both Roll-up and Validum modes.
Aztec vs. Starknet
We’ve already looked at TurboPlonk, but it’s important to note how Aztec’s privacy objectives stack up and what that has to do with Validiums vs. Roll-ups. Also, if TurboPlonk didn’t razz you, why not look into UltraPlonk? It rapidly Plonks the algorithms, like SHA, that even TurboPlonk grinds on, borrowing heavily from STARK technology. Plookups enable a prover circuit to crunch tables of data, a fundamental aspect of most algorithms, so that would enable some kind of Domain Specific Language to be created.
The core privacy-oriented innovation in Aztec is to leverage the efficiency gain for crunching Pedersen hashes with TurboPlonk into lots of 256-layer deep Merkle tree navigations to verify proofs of nullification. You mix the spends proofs and the nullifier proofs together and you give a ZK Monero treatment where it's impossible to analyze. Because this is made to compute efficiently, it can be packed into a rollup. They focused on Pedersen and skipped Poisedon because of its limited track record. They also improved on affine coordinate usage to do curve arithmetic more efficiently.
Starknet hasn’t been designed for privacy at an atomic level like Aztec, instead, one could make mixing and shielding apps in Cairo that would enhance the privacy features on the network.
As we’ll see with Loopring, there are ZK Roll-up, compact canonical data designs one can make that could be implemented in Cairo for ZK mode, leading to Cairo-ization/Starknet ports of specific rollup models that are optimal for different features.
Exercises in Roll-up Design: Evaluating Loopring
What if you embraced that roll-ups have data-availability laden into them, and minimize that data footprint before rolling it up? This is the objective of Loopring, whose design resonates personally with this author, to use chains of signatures, like a sophisticated queue structure for multisig, and use that to make atomic Dex liquidity drivers. Brilliant stuff, having personally worked on a protocol for 2-of-2 trading channels, the idea of a ZK aggregator that proofs in direct circuits these more basic atoms, that’s a whipper snapper. They built custom circuits to do the things, for an idea of what that code looks like here is the spot exchange circuit.
The major downside for Loopring is a slower R&D time in developing other apps, NFT games and the like. There is some work in that direction as well, but it’s limited. Maybe this elegant focus is Looprings major strength on pushing volume. The paper goes into more detail about the components for market infrastructure, scaling and cancellation handling, but the interesting one is the relayer. Like TradeLayer or a Discrete Log Contract system settled with Lightning Network sign-offs, you need relayers to collate indicators of interest and match people together.
In bi-lateral trading environments that are more “popular” on Bitcoin (it’s hard to do it in other ways due to constraints of that protocol) it is essential to have a way of organizing these handshakes, so the custom orderbook rules overlaying that are all possible. These markets can have things like preferential routing for trade privacy, variable rebates and also possibly counter-parties who waste your time and induce opportunity cost. By having this loop structure checksum you solve both the timing issue, the liquidity issue by half, and you skip to collating these bilateral trades. By analogy, you could potentially have a ZK-rollup or Validium (but a roll-up may well be appropriate for the size) that represents the collation of 2-of-2 signatures, and/or Watchtower behaviors, and attests to their lawful state transitions per some set of publicly known rules.
Overall the paper really shows you a strong contrast between 2017/’18 white papers and the math thicc papers we’ve been avoiding clicking the links to throughout this very long and also thicc essay. There’s a lot more about feature-composition than compromises with math space to improve the TPS, that compromise was already made in the form of a tight canonical trade template design, plus signature verifications, that powers the engine. There’s multiple tranches of the token on different blockchains. The 2019 batch of recursive proofing papers covered above have a very different tenor.
The first major innovation is the use of an order ring as a provable data-structure, you just have users push orders which get a soft confirmation promise and then they collate together to clear in a sort of batch-auction.
How to make the token go pump via some discounted cashflow multiple? We have a deterministic margin of revenue share in the bid/ask spread capture that relayers who “mine rings” can extract by way of processing. On the more mechanics heavy side, there is an anti-front-running mechanic that amounts to a zk verifiable multisig arrangement between the ring-miner and the user.
Overall Loopring shows us the promise of lean data+signature models for trading combined with collation of bilateral or closed-ring multilateral trades used here, with L2 rollups. In theory one might even be able to design a roll-up lean enough to fit under the 80 byte OP_Return limit on Bitcoin.
5) Conclusion: Design Choices and Why
5a) STARKs or SNARKs - the Zergling rush vs. Protoss log-payoff
Starkware thought the biggest payoff based on a contiguous computing environment (Cairo compilation) would get the biggest network effects. Because they were not trying to create an L1, such as Mina, the need to hyper-optimize throughput on the recursive proofing network is rendered secondary.
Polygon has invested broadly to diversify away the question, they have Zero (Mir) which is an innovative faster STARK, they have Miden which is Winterfell/open-sourced STARK (without the Plonky2 innovation), they have Avail to mitigate data availability needs (whereas Starknet solves this by accessing ETH archival data inline to the client) <fact check>, they have Nightfall to play up privacy features, they have Hermez to embrace a SNARK-based solution.
5b) Execution VM - Domain Specific Language (e.g. Cairo) vs. Wrapped Stack Operations vs. General Circuit Design (Circom, AIR languages)
Starkware’s primary differentiator is really the user experience of developing apps for Starknet or how that translates to quantity, quality and performance efficiency of end-user apps. Competitors like Mir/Polygon Zero/Plonky2 can save $800 on an otherwise $1100 commit that aggregates 14k transactions with a $1 fee, maybe in the future it is $0.10, so Polygon will enjoy a higher gross margin on a Plonky2 service when future extreme scale competition brings things way down. But by then, the network effects of build-out and user momentum will probably be more important, hence the focus on a bespoke environment that can attract top developer talent and garner that momentum. The Stack VMs one can wrap inside Solidity are themselves metaphors for thinking about circuit design that one would have to engage in from the bottom-up if working in something like Circom.
5c) Recursion methodologies
The succession of papers that Ben Sasson and colleagues worked on over the 2010s tracks a path that tries to leapfrog other research to arrive at a more optimally productive STARK model, which is now being applied to a network launch. There may be some throughput disadvantages in FRI recursion vs. networks where the ZK atoms are smaller but with more capped tps. We may see networks where it’s cheaper for marginal new provers to earn, vs. networks that see higher overall tps in aggregate. Not unlike playing Mario Cart as Princess Peach vs. Bowser, one is great for acceleration, the other for top-speed.
Starcraft has trade-offs that are highly balanced based on the constraints of being an oddly symmetrical multiplayer game. But being a competitive researcher/entrepreneur in the field of ZK computation is not like Starcraft. You really can find a better strategy by getting lots of heavy Protoss units in the right formation. That formation is the collection of design choices made by the Starkware team. Stack execution based zkEVM attempts? That’s like the Terrans. You really want to play the Terrans?
The moral of all this in-depth analysis: if you want to play the Terrans instead of the Protoss, maybe Cairo and Starknet development aren’t for you. But if you know that the Protoss are the best, it’s time to play with warping .sol contracts over the Cairo and understanding how to port to Starknet, you might just find it has
Cairo developers after 1 year
Thanks for reading TradeLayer’s Post-TradFi Truths! Subscribe for free to receive new posts and support my work.