Why truly private voting remains elusive

Why truly private voting remains  elusive

Private internet voting has long felt like the white whale of cryptography. Though digital voting systems exist for elections in some countries today, they generally rely on a third party to maintain the privacy of each person's vote. While this vastly simplifies things (e.g. ensuring votes are valid), cryptographers have long wondered if a better system could be devised in which votes remain hidden forever and no single party is trusted with maintaining individuals' privacy. The introduction of distributed public ledgers (aka blockchain) has renewed interest in private voting systems given their open nature and added complexity. How can we hope to move governance of corporations on-chain and build network states if individuals can't even place private votes to express their true preferences?

What makes private voting so elusive and do we have the cryptographic toolkit yet to build a "perfect" system? We'll explore the properties needed on the security/privacy side from first principles and motivate how various primitives like zero-knowledge proofs, threshold cryptography, and fully homomorphic encryption get us closer to our goals.

Keep reading if you'd like to learn how to build your own private app using SPF (including private quadratic voting!). If you can't wait, jump directly to our demos.

An abbreviated history

The origin of privacy

Before Britain's 1872 Secret Ballot Act, voters stated their choices orally in front of spectators, with complete voting records published in newspapers and pollbooks; this enabled landlords to identify which tenants voted against their interests, "with tenants being threatened with eviction if they didn’t vote as their landlords directed." In the United States, parties' distributed ballots were designed "so that they might be tracked to the ballot box by scent, as well as by size, shape, and color," allowing party agents to verify that paid voters cast their ballot for the bought candidate. Using similar ballot tracking techniques, German employers before 1905 stationed representatives at polling places "who would impose harsh sanctions on voters who used the 'wrong' ballots."

These problems were so severe that the secret ballot became standard in democracies worldwide by World War I, as it eliminated "the ability to monitor how those bribed voted, thereby eliminating candidates' incentive to bribe."

Digital voting

Digital (or electronic) voting involves the use of machines to assist with either tallying ballots or casting the votes themselves. Importantly, digital voting should be distinguished from internet voting, which involves casting votes in a legally binding manner over the internet.

Though we use machines to help us with many tasks in our daily life, digital voting is still not widespread, with a number of countries abstaining from its use in national elections (e.g. Canada, Germany, Japan, UK). In many cases, electronic voting has even been rolled back after trials, with concerns around cybersecurity, miscounting, and votes later being revealed.

Internet voting is even less common, with committees in Canada, Finland, and the Netherlands having explicitly recommended against it, suggesting that the current risks outweigh the benefits. A few countries are on the vanguard of the move to internet voting though, including Estonia and Thailand. Estonia, with its robust digital identification system, debuted internet-based general elections in 2007. Thailand was one of the first countries to use blockchain for party elections back in 2018.

Blockchain

Most blockchains (e.g. Bitcoin, Ethereum, Solana) are public by default. With the rise of onchain organizations (aka DAOs), voting is done onchain with voters being concerned about expressing their true preferences given that their votes are public, influenced by other votes, or even bribed.

What makes an "ideal" private voting scheme?

Before presenting our solution, it's worth defining what properties a private voting scheme should have. Given we work on cryptography, we'll focus on properties related to privacy, security, and verifiability over social choice theory aspects (e.g condorcet methods, median voter theorem, strategic voting) which deserve its own blog post.

Based on first principles, a good private voting scheme should achieve at minimum the following:

  1. Ballot confidentiality: Vote choices remain hidden during the voting process. While anonymity (hiding if someone participated) is a nice property, it introduces additional complexity around identity and one-person-one-vote verification. Knowing how someone voted is much more sensitive than knowing that someone voted at all.
  2. No single trusted party: A voting system where any one party is trusted to protect how individuals voted or with correct tallying of votes is ripe for abuse.
  3. Only tally result revealed: How individuals voted should never be known to anyone but the voter; only the final aggregated result is public.
  4. Verifiable: Anyone can independently verify that the ballots were tallied correctly according to the stated rules.

We call such a scheme "ideal" as (spoiler alert) we will not be able to fully achieve all four properties in any scheme we look at. Though we will get pretty close.

Cryptography toolbox

So why can't we just use standard public key cryptography to create a private voting scheme?

Let's say there is a public-secret key pair \((pk, sk)\) where \(sk\) is owned by the election coordinator.

Each voter will encrypt their vote using \(pk\); if desired, they can publish their encrypted vote somewhere publicly accessible and immutable (like a blockchain). Once the voting period is over, the coordinator will decrypt each person's vote (using their \(sk\)) and tally the result.

Unfortunately, in this model the coordinator must decrypt every ballot in order to tally the results of the election, requiring an immense amount of trust in the coordinator to handle the private ballot data properly (no data leaks, malicious use, etc). Even if the ballots are never misused, the voter must trust that the coordinator tallied the votes properly.

Providing verifiability via ZKPs

To address the latter problem, the coordinator could produce a zero-knowledge proof (ZKP) to prove that they have indeed tallied the votes correctly. Generally speaking, ZKPs can help address verifiability issues but they do not help with achieving any of the other properties we'd like a private voting scheme to have. Public key cryptography on its own only allows us to achieve a voting system with property 1; once we add ZKPs we get a voting system with property 1 and 4.

Distributing trust via threshold cryptography

To reduce the dependence on a single trusted coordinator, one could use threshold cryptography to break up the secret key \(sk\) into \(n\) shares such that \(n\) different parties hold each share and some subset \(t\) of them are needed to provide their share in order to decrypt. This improves matters as we now don't have a single trusted coordinator but have distributed trust across n parties. However, to determine the election result, these parties still have to decrypt each person's vote.

Threshold cryptography is not without its own set of challenges. Picking the values \(n\) and \(t\) are not always clear; we'll need additional tools to guarantee that the committee members have behaved honestly; depending on the construction, there may be a risk of the last party to provide their key share seeing the tally but refusing to share the result with others. We'll put aside these issues for now though. If you'd like to learn more about them, we recommend this blog post series on multi-party computation.

Layering on ZKPs with threshold cryptography gives us a voting scheme that achieves property 1, 2, and 4. How do we get all four properties though?

Solutions in the wild today

Let's look at some of the cryptographic solutions proposed and deployed in the blockchain space (from Shutter, MACI, NounsDAO to name a few). While these solutions are all a step in the right direction and improve the status quo, they are not without their own faults. Where do these fall short?

Commit-and-reveal protocols and time-lock encryption

With commit-and-reveal protocols, individuals first "commit" to their vote such that it is hidden and cannot be changed. In the second phase, each person returns to reveal what their vote was.

With time-lock encryption, individuals place votes that are only possible to decrypt after a certain period has elapsed. However, voters are not responsible for returning to reveal their individual votes. Voting systems relying on either of these cryptographic primitives achieve property 1, 2, and 4 but fail to achieve property 3.

Additively homomorphic encryption

An additively homomorphic encryption scheme supports adding ciphertexts together such that \(Enc(m_a) + Enc(m_b) = Enc(m_a + m_b)\). One such example is the ElGamal encryption scheme.

Why might additively homomorphic encryption be interesting in the context of voting? Well, a large problem with the schemes we saw so far is that even if we distribute trust across a number of committee members via threshold cryptography, these members must still decrypt each person's vote to tally the result. The goal with using additively homomorphic encryption is to get us closer to achieving property 3 because we can now add up encrypted votes to determine the result.

Say Alice, Bob, and Cathy are deciding if they should increase the DAO's budget. They each cast encrypted votes whether they agree or disagree as a tuple. Alice submits \((Enc(1), Enc(0))\) to represent that she does want the DAO budget to be increased; Bob and Cathy each submit \((Enc(0),Enc(1))\) to indicate they don't want the budget to be increased. To obtain the total tally for those who want the budget to be increased we add up the first values in each tuple \(R_{yes}=Enc(1)+Enc(0)+Enc(0)=Enc(1+0+0)\) and to determine the total tally for those who don't want to the budget to be increased we add up all the second values in each typle \(R_{no}=Enc(0)+Enc(1)+Enc(1)=Enc(0+1+1)\). Only the final values \(R_{yes}\) and \(R_{no}\) will be decrypted (using the coordinator's secret key) to find the values \(R_{yes}=1\) and \(R_{no}=2\). Unfortunately, we must trust that the coordinator will only decrypt the final tally results and not individual votes.

A more subtle issue is protecting against "bad" or invalid votes. How do we know Alice submitted an encryption of 0 or 1 if nobody ever sees the vote? For all we know, she could have submitted a vote with the value 2 or -1. To protect against this, ZKPs will be needed; each user will need to prove that they've submitted a "valid" vote.

When we combine threshold cryptography with additively homomorphic encryption, we get a voting scheme in which we distribute trust across a number of parties. Assuming there is an honest majority of committee members, individual votes are never seen by anyone but the voter (since committee members will only decrypt the final tally result and will not agree to decrypt individual votes).

As we only support addition over ciphertexts, only a limited set of voting schemes can be supported using this paradigm. For example, we can support binary voting (where people vote 0 or 1) or binary voting weighted on a public stake. Can we get closer to achieving all four properties for arbitrary voting schemes?

(Threshold) Fully homomorphic encryption meets voting

Fully homomorphic encryption (FHE) allows anyone to run arbitrary computer programs directly over encrypted data. In its standard format, there is a single public/private key, meaning only one person can decrypt and read data, whereas any party can compute over that encrypted data. While this does enable many applications, it has a fundamental flaw; messages encrypted under different secret keys cannot be used in the same program. This format is obviously no good for private voting, where we have many different parties wanting to vote and we don't want any single party capable of reading everyone's votes.

To address this need, we'll use threshold FHE (sometimes abbreviated as thFHE). In threshold FHE, we'll split the private key into \(n\) shares which are held by \(n\) different parties. To decrypt any information, a certain threshold number of parties \(t\) will need to come together. We will have to assume a majority of these parties are honest and will not decrypt information they are not supposed to; specifically, they will not decrypt anyone's individual vote, just the final tally. No single key share holder can decrypt information and we can tolerate a certain number of malicious key share holders without issue. Unfortunately, if something bad happens to befall more than \(n-t\) key share holders, we may not be able to decrypt the final result.

High level FHE-based voting workflow

We imagine a voting system implemented using threshold FHE to have the following steps:

  • Setup: Appropriate values of \(t\) and \(n\) are chosen with which to split the threshold FHE private key and distributed to the \(n\) individual parties.
  • Vote: Each voter encrypts their vote with respect to the threshold FHE public key, publishing it somewhere publicly accessible and immutable such as a blockchain. They also produce a ZKP to indicate that they have submitted a "valid" vote.
  • Tally: Once the voting period is over, the key holders tally the encrypted votes (i.e. run the appropriate voting program over the encrypted votes) to obtain the encrypted voting outcome, published to a publicly accessible and immutable source such as a blockchain.
  • Result: The key holders perform threshold decryption to obtain just the voting outcome, producing a proof that they have done this correctly (i.e. verifiable decryption).

Does this achieve our ideal properties?

So does this scheme fulfill all the desired properties of a private voting system? Let's take a look:

  1. Ballot confidentially: Each ballot is encrypted and remains so throughout the voting process.
  2. No trusted coordinator: Trust is distributed across $n$ coordinators.
  3. Only the tally result is ever revealed: Threshold FHE promises that only the output from the FHE program is ever decrypted (assuming an honest majority/supermajority of parties on the committee). Unfortunately, we can only achieve property 3 with these caveats today.
  4. Tallying is verifiable: FHE computations provide deterministic outputs for the same encrypted inputs. As such, any third party can ensure the resulting encrypted result of the vote is a stated value. ZKPs could also be used to prove that tallying was done correctly.

Enter the Secure Processing Framework

Our team built the Secure Processing Framework (SPF) to easily bring threshold FHE to any application on or off chain without developers having to provide their own infrastructure for decryption or compute. Let's look at how to use the SPF the create and deploy a private voting program.

As a developer, you just need to be comfortable writing standard C code. By adding some annotations to indicate which programs should be FHE programs and which inputs should be encrypted, we can transform your app into an FHE app that can be used on any chain (EVM-based or otherwise) or in web2. Behind the scenes, SPF determines how to best transform your program into its FHE equivalent, provides the threshold committee, as well as a host of other functionalities as needed (e.g. access control, storage). While a fully robust system would include ZKPs, we do not yet support this feature.

Creating a binary voting program with no privacy

Let's walk through a simple program for checking whether an issue on a ballot passes, sometimes referred to as binary voting. Our FHE program will need to

  • iterate over all the ballots,
  • check that each ballot is encoded properly, and
  • tally up votes for and against an issue.

The following C program implements this logic:

#include <stdint.h>
#include <stdbool.h>

void binary_vote(uint8_t *ballots,
                 uint16_t num_ballots,
                 bool *issue_passes) {
    uint16_t tally = 0;

    // Iterate through all the votes
    for (uint16_t i = 0; i < num_ballots; i++) {
        // Validate each ballot is encoded
        // properly by coercing it to a boolean.
        tally += (bool) ballots[i];
    }

    // Determine if the issue passes.
    *issue_passes = tally > (num_ballots / 2);
}

but it isn't quite an FHE program yet.

Transforming your program into its private equivalent

What else are we missing? Luckily very little; to convert our program to an FHE one, we simply need to annotate that the function should be compiled to FHE using [[clang::fhe_program]] and which inputs are encrypted using [[clang::encrypted]]:

#include <parasol.h>

[[clang::fhe_program]]
void binary_vote([[clang::encrypted]] uint8_t *ballots,
                 uint16_t num_ballots, // no annotation, not encrypted
                 [[clang::encrypted]] bool *issue_passes) {
    uint16_t tally = 0;

    // Iterate through all the votes
    for (uint16_t i = 0; i < num_ballots; i++) {
        // Validate each ballot is encoded
        // properly by coercing it to a boolean.
        tally += (bool) ballots[i];
    }

    // Determine if the issue passes.
    *issue_passes = tally > (num_ballots / 2);
}

We can then compile this program with the Sunscreen variant of clang to generate a binary file containing our optimized FHE program. This file can be uploaded to the SPF service where it can be used on and off chain as desired.

Private quadratic voting

Binary voting is a good first start, but how hard is it to implement more complex voting schemes like quadratic voting? It turns out to be pretty simple!

For quadratic voting, voters can vote n times by using n^2 of their voting credits. In this case, we will need to allow voters to express their vote as a signed integer, and then take the square root of their voting credits to know how many times they are voting. For this particular example, we will also only allow people to use 100 credits total; anything more results in the voter's ballot not being counted.

#include <parasol.h>

// Signed square root: returns sqrt(|x|) with sign of x preserved.
inline int8_t sqrt_with_sign(int8_t x) {
    int8_t magnitude = sqrt8(abs8(x));
    return iselect8(x < 0, -magnitude, magnitude);
}

[[clang::fhe_program]]
void quadratic_vote([[clang::encrypted]] int8_t *voter_credits,
                    uint16_t num_voters,
                    [[clang::encrypted]] bool *issue_passes) {
    int16_t tally = 0;
    for (uint16_t i = 0; i < num_voters; i++) {
        int8_t vote = iselect8(
            abs8(voter_credits[i]) <= 100,    // if the voter uses <= 100 credits
            sqrt_with_sign(voter_credits[i]), // apply the vote
            0                                 // otherwise, do not count the vote.
        );
        tally += vote;
    }
    *issue_passes = tally > 0;
}

This program is similar in structure to the binary vote, but with two important changes:

  1. iselect8 is used to validate that the voters are only using up to 100 credits, and if not the vote is nullified. This demonstrates how validations can be done in FHE.
  2. The voter's choice is being encoded as a signed integer, allowing them to vote for or against an issue based on the sign of their ballot.

How far can you go? We gave a presentation for PSE that demonstrated more complex voting schemes like ranked choice voting and condorcet methods. These more complex options show that we can cover the gamut of voting methods out there, enabling applications to choose the best voting system for them while keeping the vote private.

Deploying onchain with Solidity

How do we use these voting programs onchain? The answer also turns out to be pretty simple; our Secure Processing Framework (SPF) is designed to listen to blockchains for FHE events and process them. You can use use our library contracts to create events onchain that specify what program to run with what inputs, which is then picked up by the SPF and run offchain. Developers write code to request that results of programs are decrypted and posted onchain, enabling the results of a program (such as the tally in a vote) to be shared more broadly.

As demonstrated in the following diagram, there are a few steps:

  • Compile the FHE program with Sunscreen clang.
  • Upload it to the SPF service.
  • Deploy a Solidity contract to Sepolia or Monad (these are the chains currently supported today).
  • The contract collects encrypted and plaintext inputs for the program through users interacting with the contract.
  • The contract calls for the program to be run, and potentially requests that the result of the program is decrypted.

An example contract that calls the binary voting program can be found here.

How users interact with FHE enabled contracts

Now that you have successfully deployed a voting application onchain, how do you get users to interact with it?

  1. Voters encrypt their votes and upload them to the SPF service, getting an identifier in return that is used to reference that ciphertext.
  2. Voters submit encrypted votes by identifier. The contract never sees the plaintext data, and cannot decrypt it because the contract does not have the correct permissions to do so.
  3. After all the votes have been submitted, a contract administrator kicks of the tally using our library functions and requests the decrypted result be posted back to the contract.
  4. After the tally has been run, the result is decrypted and posted back onchain by the SPF service.

Let's see this in action.

Demo and build it yourself

We have built a web app demonstrating binary voting both offchain (web2) and onchain. Check it out here!.

If you would like to build your own FHE-enabled smart contract, see our SPF docs.

Towards better voting schemes

While we've focused on private voting, privacy is important for a large number of systems. SPF can be used to build privacy-preserving auctions, prediction markets, and much more.

The infrastructure exists; now it's time to build.