In this section, we construct two technical tools, a key-homomorphic masking scheme and a flat secret sharing scheme. As outlined in the technical overview (Sect. 1.1), the masking scheme is used for hiding clients’ input vectors, and the secret sharing scheme is used for sharing each client’s secret masking key.

3.1 Key-Homomorphic Masking

We first introduce the syntax of a key-homomorphic masking scheme.

  • \(\textsf{Setup}(1^{\lambda }, \ell , B_{\text {msg}}):\) takes as inputs the security parameter \(\lambda \), a message dimension \(\ell \), and a lower bound \(B_{\text {msg}}\) on the message modulus. It outputs public parameters \(\textsf{pp}\), which defines a key space \(\mathcal {K}\), a message space \(\mathbb {Z}_{p_m}^\ell \) with some modulus \(p_m \ge B_{\text {msg}}\), and a mask space \(\mathbb {Z}_q^\ell \) with some modulus q.

In our framework, we assume every client enters the setup phase (Fig. 1) with common correctly generated public parameters \(\textsf{pp}\). If the \(\textsf{Setup}\) algorithm is deterministic, or public-coin, then this assumption is simply a notational convenience, since each client can compute the common \(\textsf{pp}\) on its own, using a random oracle to derive common public randomness if necessary.

In our framework, each client \(P_i\) derives its secret masking key \(k_i\) in the setup phase, and re-uses it during all online phases. In contrast, it derives a fresh tag \(\tau \) for each online phase, using common public randomness. We only require the key-homomorphic property to hold for masks under the same tag \(\tau \). While the tags are public to all clients, the masking keys must remain secret.

  • \(\textsf{Mask}(\textsf{pp}, k, \tau , \textbf{m}):\) takes as inputs a masking key \(k \in \mathcal {K}\), a tag \(\tau \), and a message \(\textbf{m}\in \mathbb {Z}^\ell _{p_m}\), and outputs a masked message \(\textbf{c}_m\).

  • \(\textsf{UnMask}(\textsf{pp}, \textbf{c}_m, \textbf{c}_0):\) takes as inputs a masked message \(\textbf{c}_m\), and an “empty” mask \(\textbf{c}_0\) (of message \(\textbf{0}\)) under the same key and tag. It recovers a message \(\textbf{m}^*\) or \(\bot \).

The \(\textsf{UnMask}\) algorithm is a bit unusual, as it doesn’t take the masking key k or the tag \(\tau \) to recover the message. Instead, it asks the caller to first compute an empty mask \(\textbf{c}_0\) using the key k and tag \(\tau \), and then feed \(\textbf{c}_0\) to the algorithm. We define such a syntax because in our framework, the caller of \(\textsf{UnMask}\) is the server. The clients jointly help the server compute the empty mask \(\textbf{c}_0\), instead of revealing their masking keys, so that the keys remain secret during each online phase.

  • \(\textsf{Eval}(\textsf{pp}, L, \{\textbf{c}_i\})\): takes as inputs a linear function \(L\) with d integer coefficients and d masks \(\{\textbf{c}_i\}_{i \in [d]}\). It homomorphically evaluates L on the masks and outputs the result \(\textbf{c}_L\).

As mentioned earlier, the input masks \(\{\textbf{c}_i\}\) to the \(\textsf{Eval}\) algorithm should be masked under a common tag \(\tau \). Evaluating L on the masks roughly translates to evaluating L on both the masking keys, over the key space \(\mathcal {K}\), and the messages, over the message space \(\mathbb {Z}_{p_m}^\ell \). We define this property below as key-homomorphism.

Correctness. Formally, we define the correctness and the key-homomorphism requirement as follows.

Definition 1

(correctness). For all public parameters \(\textsf{pp}\), tags \(\tau \), and keys k output by \(\textsf{Setup}\), \(\textsf{TagGen}\) and \(\textsf{KeyGen}\), and for all messages \(\textbf{m} \in \mathbb {Z}^\ell _{p_m}\), the following holds.

$$ \Pr \left[ { \textsf{UnMask}(\textsf{pp}, \textbf{c}_m, \textbf{c}_0) = \textbf{m} \,\left| \, \begin{aligned} \textbf{c}_m &\leftarrow \textsf{Mask}(\textsf{pp}, k, \tau , \textbf{m}), \\ \textbf{c}_0 &\leftarrow \textsf{Mask}(\textsf{pp}, k, \tau , \textbf{0}). \end{aligned} \right. }\right] = 1. $$

Definition 2

(key-homomorphism). Consider any linear function L, represented by d integer coefficients. For all public parameters \(\textsf{pp}\), tags \(\tau \), and keys \(k_1, \ldots , k_d\) output by \(\textsf{Setup}\), \(\textsf{TagGen}\), and \(\textsf{KeyGen}\), and all messages \(\textbf{m}_1, \ldots , \textbf{m}_d \in \mathbb {Z}_{p_m}^\ell \), the following holds.

$$\begin{aligned} &\bigl \{ \tilde{\textbf{c}}_L \leftarrow \textsf{Mask}\bigl ( \textsf{pp}, L(\{k_i\}), \tau , L(\{\textbf{m}_i\}) \bigr ) \bigr \} \\ \equiv {} &\bigl \{ \textbf{c}_L \leftarrow \textsf{Eval}(\textsf{pp}, L, \{\textbf{c}_i\}) \,\left| \, \textbf{c}_i \leftarrow \textsf{Mask}(\textsf{pp}, k_i, \tau , \textbf{m}_i) \right. \bigr \}, \end{aligned}$$

where \(L(\{\textbf{m}_i\})\) is evaluated over \(\mathbb {Z}^\ell _{p_m}\) and \(L(\{k_i\})\), over the key space \(\mathcal {K}\).

The key-homomorphism definition above requires the evaluated mask \(\textbf{c}_L\) to have the same distribution as the “target” mask \(\tilde{\textbf{c}}_L\). We next introduce a relaxation to this rather strong property. Roughly, the evaluated mask \(\textbf{c}_L\) should be distributed close to the target mask \(\tilde{\textbf{c}}_L\). In other words, through homomorphic evaluation we obtain the target mask with some bounded additive noise.

Our framework requires two additional properties from an approximate key-homomorphic scheme. First, when computing \(\textsf{UnMask}\) on a masked input \(\textbf{c}_m\) and an empty mask \(\textbf{c}_0\), any additive noise in them translates to additive noise in the recovered message. Second, when computing \(\textsf{Eval}\) on noisy masks, the additive noises translate to an additive noise in the evaluated mask, with bounded magnitude. We formalize the above requirements as follows.

Definition 3

(\(\epsilon \text {-approximate}\) key-homomorphism). Consider any linear function L, with d integer coefficients whose absolute values are bounded by some \(B_L \in \mathbb {N}\).

  • Let \(\tilde{\textbf{c}}_L\), \(\textbf{c}_L\) be the “target” and the evaluated masks as defined in Definition 2. We require

    $$ \Vert \tilde{\textbf{c}}_L - \textbf{c}_L\Vert _{\infty } \le \epsilon d B_L. $$

  • Let \(\textbf{c}_m, \textbf{c}_0\) be the masked input and the empty mask as defined in Definition 1. For all integer noise vectors \(\textbf{e}_1, \textbf{e}_2 \in \mathbb {Z}^\ell \), we require

    $$ \textsf{UnMask}(\textsf{pp}, \textbf{c}_m + \textbf{e}_1, \textbf{c}_0 + \textbf{e}_2) = \textbf{m} + \textbf{e}_1 - \textbf{e}_2 \text { mod }{p_m}. $$

  • Let \(\textsf{pp}\), \(\{\textbf{c}_i\}\) be the public parameters and the masks as defined in Definition 2. For all integer noise vectors \(\{\textbf{e}_i\}\), whose entries are bounded in absolute value by some \(B_e \in \mathbb {N}\), we require

    $$ \Vert \textsf{Eval}(\textsf{pp}, L, \{\textbf{c}_i\}) - \textsf{Eval}(\textsf{pp}, L, \{\textbf{c}_i + \textbf{e}_i\}) \Vert _{\infty } \le B_e d B_L. $$

Security. For security, we require that a mask under a randomly chosen key hide its message. We further require this to hold for a polynomial number of adaptive sessions, each with a fresh tag sampled with public randomness, while reusing the same key.

Definition 4

(security). Let \(\lambda \) be the security parameter. The masking scheme is secure if for every input dimension \(\ell = \ell (\lambda ) \le {\text {poly}}({\lambda })\) and message modulus lower bound \(B_{\text {msg}}= B_{\text {msg}}(\lambda ) \le 2^{{\text {poly}}({\lambda })}\), any efficient adversary \(\mathcal {A}\) has negligible advantage in distinguishing the experiments \(\textsf{Exp}_{\textsf{Mask}}^{\mathcal {A}, b}(1^{\lambda })\) defined as follows:

  • The challenger computes \(\textsf{pp}\leftarrow \textsf{Setup}(1^{\lambda }, \ell , B_{\text {msg}})\), and samples a masking key \(k \leftarrow \textsf{KeyGen}(\textsf{pp})\). It launches \(\mathcal {A}(1^{\lambda })\), sends \(\textsf{pp}\) to \(\mathcal {A}\), and repeats the following steps until \(\mathcal {A}\) outputs a bit \(b'\).

    1.

      Run \(\tau \leftarrow \textsf{TagGen}(\textsf{pp}; r)\) using fresh randomness \(r\), and send \((\tau , r)\) to \(\mathcal {A}\). \(\mathcal {A}\) replies with a message \(\textbf{m} \in \mathbb {Z}^{\ell }_{p_m}\).

    2.

      If \(b = 1\), compute \(\textbf{c}_1 \leftarrow \textsf{Mask}(\textsf{pp}, k, \tau , \textbf{m})\). Otherwise, compute \(\textbf{c}_0 \leftarrow \textsf{Mask}(\textsf{pp}, k, \tau , \textbf{0})\). Send \(\textbf{c}_b\) to \(\mathcal {A}\).

Construction Based on LWR. We construct a \(1\text {-approximate}\) key-homomorphic masking scheme based on the learning with rounding (LWR) assumption [5]. The construction is a slight modification of the almost seed-homomorphic PRG based on LWR in [11].

Definition 5

(LWR [5]). Let \(\lambda \) be the security parameter, and \(n=n(\lambda )\), \(q=q(\lambda )\), \(p=p(\lambda )\) be integers. The \(\text {LWR}_{n,q,p}\) assumption states that for any \(m ={\text {poly}}({n})\), and uniformly random \(A\leftarrow \mathbb {Z}_q^{m\times n}\), \(\textbf{s}\leftarrow \mathbb {Z}_q^n\), \(\textbf{u}\leftarrow \mathbb {Z}^m_q\), the following indistinguishability holds:

$$ (A, \left\lfloor A\cdot \textbf{s} \right\rfloor _p) \approx ^c (A, \left\lfloor \textbf{u} \right\rfloor _p), $$

where \(\left\lfloor \cdot \right\rfloor _p\) is the rounding function defined as \( \left\lfloor \cdot \right\rfloor _p : \mathbb {Z}_q \rightarrow \mathbb {Z}_p : x \mapsto \left\lfloor (p/q) \cdot x \right\rfloor \).

Construction 1

(key-homomorphic masking by LWR).

  • \(\textsf{Setup}(1^{\lambda },\ell , p_m):\) deterministically choose a modulus \(q\) and dimension \(n\) such that \(\text {LWR}_{n,q,p_m}\) is assumed to be hard. Output \(\textsf{pp}= (\ell , p_m, q, n)\). The key space is \(\mathcal {K}= \mathbb {Z}_q^n\), and the message space is \(\mathbb {Z}_{p_m}^\ell \), which is also the mask space.

  • \(\textsf{KeyGen}(\textsf{pp}):\) sample a vector \(\textbf{s}\leftarrow \mathbb {Z}_q^n\), and output \(k = \textbf{s}\).

  • \(\textsf{TagGen}(\textsf{pp}):\) sample a matrix \(A \leftarrow \mathbb {Z}_q^{\ell \times n}\), and output \(\tau = A\).

  • \(\textsf{Mask}(\textsf{pp}, k, \tau , \textbf{m}):\) parse the key and tag as \(k, \tau = \textbf{s}, A\). Output the masked message \( \textbf{c}_m = \left\lfloor A\cdot \textbf{s} \right\rfloor _{p_m} + \textbf{m} \in \mathbb {Z}^\ell _{p_m}. \)

  • \(\textsf{UnMask}(\textsf{pp}, \textbf{c}_m, \textbf{c}_0):\) output the message \( \textbf{m}^* = \textbf{c}_m - \textbf{c}_0 \in \mathbb {Z}^\ell _{p_m} \).

  • \(\textsf{Eval}(\textsf{pp}, L, \{\textbf{c}_i\}):\) parse \(L\) as d integer coefficients \(u_1, \ldots , u_d\). Output the evaluated mask \( \textbf{c}_L = \sum _{i \in [d]} u_i \textbf{c}_i \in \mathbb {Z}^\ell _{p_m}. \)

The idea of the construction is simple. A masking key is an LWR secret \(k = \textbf{s}\), and a tag is a random LWR public matrix \(\tau = A\). Given a masking key \(\textbf{s}\), a tag A, and a message \(\textbf{m}\) as inputs, the \(\textsf{Mask}\) algorithm hides the message \(\textbf{m}\) with a fresh LWR sample \(\left\lfloor A\cdot \textbf{s} \right\rfloor _{p_m}\). We defer the proof of Lemma 1 to the full version.

Lemma 1

Construction 1 is a \(1\text {-approximate}\) key-homomorphic masking scheme under the \(\text {LWR}_{n,q, p_m}\) assumption.

Choosing Parameters q, n. It is proved in [5] that under the Learning With Errors (LWE) assumption with dimension n, modulus q, and any noise distribution bounded by B, the LWR assumption also holds with dimension n and moduli \(q, p_m\) such that \(q \ge B p_m n^{\omega (1)}\). It’s commonly believed that the LWE assumption holds for sufficiently large \(B = {\text {poly}}({n})\) and sub-exponential modulus-to-noise ratio \( \alpha = q/B \le 2^{\sqrt{n}}. \) Therefore, given a message modulus \(p_m \in \mathbb {N}\), it suffices to set \(n = (\log {p_m} + \varOmega (\lambda ))^2\) and \(q = B p_mn^{\log \lambda }\).
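The rule of thumb above can be turned into a small, purely illustrative parameter chooser. The choice \(B = n^2\) and the exact rounding below are our assumptions, not vetted parameters:

```python
import math

def choose_lwr_params(log2_p_m: int, lam: int):
    # mirrors the asymptotics in the text: n = (log p_m + Omega(lambda))^2
    # and q = B * p_m * n^{log lambda}, with B = n^2 as an assumed poly(n) bound
    n = (log2_p_m + lam) ** 2
    log2_B = 2 * math.log2(n)                                  # B = n^2 (assumption)
    log2_q = log2_B + log2_p_m + math.log2(n) * math.log2(lam)
    return n, math.ceil(log2_q)
```

For example, `choose_lwr_params(32, 128)` returns a dimension \(n = 160^2 = 25600\) together with a modulus bit-length on the order of a couple hundred bits.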

Extension to Ring LWR. The above scheme can also be instantiated using the Ring LWR assumption introduced together with LWR in [5]. We implement the more computationally efficient version with Ring LWR and present experimental data in Sect. 5.

Construction Based on DCR. Due to limited space, we defer our construction of an exact key-homomorphic masking scheme under the decisional composite residuosity (DCR) assumption to the full version.

3.2 Flat Secret Sharing

A threshold secret-sharing scheme with M parties normally has two algorithms \(\textsf{Share}, \textsf{Recon}\), and is parameterized by privacy and reconstruction thresholds \(\gamma , \rho \), where \(0 < \gamma < \rho < 1\). Running \(\textsf{Share}\) on a secret value creates M shares. Running \(\textsf{Recon}\) on any subset of more than \(\rho M\) shares recovers the secret. Any subset of less than \(\gamma M\) shares contains no information about the secret.

Secret Sharing in Our Framework. Our framework uses the scheme in an unusual way. In the setup phase, the clients run \(\textsf{Share}\) to create shares of their masking keys. In the online phase, the server runs \(\textsf{Recon}\) not over the key shares, but homomorphically over empty masks created under the key shares. As long as \(\textsf{Recon}\) is a linear function, the key-homomorphism property (Definition 2) ensures that running \(\textsf{Recon}\) over the masks translates to running it over the underlying key shares. The two thresholds \(\gamma , \rho \) guarantee that the masking keys remain hidden when at most \(\gamma M\) clients are corrupted, and that \(\textsf{Recon}\) succeeds when at least \(\rho M\) clients are online.

This approach creates a technical challenge when the masking scheme has only approximate key-homomorphism (Definition 3). Namely, evaluating \(\textsf{Recon}\) homomorphically creates an additive noise, which grows with the magnitude of the coefficients of \(\textsf{Recon}\). The noise then propagates into the aggregation result.

To help remove the noise, each client input is multiplied by a scaling factor \(\varDelta \), set larger than the noise bound. To accommodate the factor \(\varDelta \) in the clients’ inputs, the message modulus of the masking scheme is in turn increased by \(\log {\varDelta }\) bits. This overhead motivates us to construct a secret-sharing scheme with small coefficients in \(\textsf{Recon}\), which we call a flat secret sharing scheme.
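As a toy illustration of the scaling trick (our own minimal sketch, not the framework's actual encoding): inputs are scaled by \(\varDelta \), so any additive noise of magnitude below \(\varDelta /2\) is rounded away on the server side.

```python
DELTA = 1000  # scaling factor, chosen larger than the worst-case additive noise

def scale_input(x: int) -> int:
    # client side: embed x in the high-order part of the message
    return x * DELTA

def remove_noise(y: int) -> int:
    # server side: round to the nearest multiple of DELTA,
    # removing any additive noise e with |e| < DELTA // 2
    return (y + DELTA // 2) // DELTA
```

Because the aggregate is a sum of scaled inputs plus bounded noise, one rounding at the end recovers the exact sum.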

Overview of Our Scheme. Our starting point is the linear secret sharing scheme [17], which has 0/1 coefficients. However, using the scheme directly has a prohibitive overhead: the total share size scales polynomially in the population M, namely \(\varOmega (M^{5.3})\).

A first attempt at reducing the share size is to run the scheme in a small committee, sampled during the setup phase. If client corruption and dropout happen independently of the committee sampling, then the fractions of corruption and dropout in the committee roughly equal the true fractions in the population. This is true in our framework, where the set of corrupted clients and the set of potential dropout clients are decided statically at the beginning.

That is, we add a \(\textsf{Setup}\) algorithm to the scheme, which samples a committee \(Q \subseteq [M]\) at random. It can be shown that when the fractions \(0 < \gamma < \rho < 1\) have a constant gap, a committee of size \(O(\kappa )\) suffices, with an \(O(2^{-\kappa })\) statistical error.
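The concentration argument behind committee sampling is easy to check empirically; below is a minimal sketch with illustrative sizes (the concrete numbers are ours, not the paper's constants). With the corrupted set fixed before sampling, the corrupted fraction inside the committee concentrates around the population fraction.

```python
import random

def sample_committee(M: int, size: int, rng: random.Random):
    # Setup samples a uniformly random committee Q of the given size from [M]
    return set(rng.sample(range(M), size))

M = 100_000
corrupted = set(range(20_000))              # a fixed 20% of the population
Q = sample_committee(M, 2_000, random.Random(1))
frac = len(Q & corrupted) / len(Q)          # close to 0.2 with overwhelming prob.
```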

Running [17] as a blackbox with a committee of size \(O(\kappa )\) reduces the total share size from \(O(M^{5.3})\) to \(O(\kappa ^{5.3})\). But we are able to further improve it to \(O(\kappa ^2)\), with an \(O(\kappa ^2)\)-size committee, by revisiting the analysis of [26] and constructing a committee version of [17] in a non-blackbox way. We summarize the syntax of our committee-based scheme for some secret space \(\mathcal {M}\) below.

  • \(\textsf{Setup}(1^\kappa , M)\) takes as inputs the statistical security parameter \(\kappa \), and the population size M. It outputs a committee Q of share holders, and public parameters \(\textsf{pp}\).

  • \(\textsf{Share}(\textsf{pp}, s)\) outputs shares \(\{s_j\}_{j \in Q}\) computed from \(s \in \mathcal {M}\).

  • \(\textsf{Recon}(\textsf{pp}, W, \{s_j\}_{j \in W})\) takes as inputs a set W indicating which shares are received, and the set of shares \(\{s_j\}_{j \in W}\). It outputs a recovered secret \(s^*\) or \(\bot \).

Correctness and Security. Formally, we define the correctness requirements as follows.

Definition 6

(\(\rho \)-reconstruction). Let \(\kappa \) be the statistical security parameter. For all population sizes \(M \in \mathbb {N}\), secrets \(s \in \mathcal {M}\), and subsets \(T \subseteq [M]\) with size \(|T| > \rho M\), the following holds.

$$ \Pr \left[ { \textsf{Recon}(\textsf{pp}, W, \{s_j\}_{W}) = s \,\left| \, \begin{aligned} (Q, \textsf{pp}) &\leftarrow \textsf{Setup}(1^\kappa , M), \\ W &= T \cap Q, \\ \{s_j\}_{Q} &\leftarrow \textsf{Share}(\textsf{pp}, s) \end{aligned} \right. }\right] \ge 1 - {\text {negl}}({\kappa }). $$

The usual security requires that, for any corruption set \(C \subseteq [M]\) below the threshold, i.e. \(|C| \le \gamma M\), corrupted shares \(\{s_j\}_{Q \cap C}\) contain no information about the secret s.

We need a stronger property (which implies the usual one) to prove security of our framework: given corrupted shares \(\{s_j\}_{Q\cap C}\) of 0, there is an algorithm \(\textsf{Ext}\) that “extends” them to a full set of shares \(\{s_j\}_Q\) for any secret s. The shares \(\{s_j\}_Q\) are distributed statistically close to fresh shares of s. This is analogous to the property that, given a corrupted subset of Shamir’s shares, one can interpolate the rest of the shares to match any secret s. We formalize this requirement as follows.

Definition 7

(\(\gamma \)-simulation-privacy). Let \(\kappa \) be the statistical security parameter. There exists an efficient deterministic algorithm \(\textsf{Ext}\) such that for all population sizes \(M \in \mathbb {N}\), secrets \(s \in \mathcal {M}\), and subsets \(C \subseteq [M]\) with size \(|C| < \gamma M\), the following two distributions are statistically close.

They share the same public parameters \((Q, \textsf{pp}) \leftarrow \textsf{Setup}(1^\kappa , M)\).

  1.

    \(\{s_j\}_Q\) is computed normally as \(\{s_j\}_Q \leftarrow \textsf{Share}(\textsf{pp}, s)\).

  2.

    \(\{\tilde{s}_j\}_{Q} = \{s'_j\}_{Q\cap C} \cup \{\tilde{s}_j\}_{Q\cap \overline{C}}\) is computed by \(\{s'_j\}_{Q} \leftarrow \textsf{Share}(\textsf{pp}, 0)\) and \(\{\tilde{s}_j\}_{Q\cap \overline{C}} = \textsf{Ext}(\textsf{pp}, C, \{s'_j\}_{Q\cap C}, s)\).

Flatness. As explained in “Secret Sharing in Our Framework”, we require the \(\textsf{Recon}\) algorithm to have small coefficients as a linear function over the input shares. This minimizes the noise introduced by evaluating \(\textsf{Recon}\) homomorphically over empty masks. A similar situation arises in the security proof of our framework, where the simulator needs to evaluate \(\textsf{Ext}\) (Definition 7) homomorphically over noisy masks. We therefore additionally require \(\textsf{Ext}\) to have small coefficients as a linear function over the input shares and the secret. We summarize the above requirements as “flatness”.

Definition 8

(flatness). Let \(\kappa \) be the statistical security parameter. A flat secret sharing scheme satisfies the following.

  • The \(\textsf{Recon}\) algorithm, when not outputting \(\bot \), can be written as a linear function over the input shares, with integer coefficients bounded by O(1).

  • The \(\textsf{Ext}\) algorithm can be written as a linear function over the input shares and the secret, with integer coefficients bounded by \(O(\log {\kappa })\).

Construction Details. We start by recalling the results of [8] and [17], summarized in the following theorem.

Theorem 1

(formula to secret sharing [8, 17]). For secrets over \(\mathcal {M}= \mathbb {Z}_q\) for any modulus \(q\), or \(\mathcal {M}= \mathbb {Z}\), there exists an efficient algorithm that translates any monotone Boolean formula \(f :{\{{0,1}\}}^M \rightarrow {\{{0,1}\}}\), over variables \(x_1, \ldots , x_M\), of size \(d = |f|\), to a pair of secret sharing algorithms \(\textsf{Share}_f,\textsf{Recon}_f\) satisfying the following:

  • \(\textsf{Share}_f(s)\) computes d share units, each corresponding to a literal in f. For each share holder \(i \in [M]\), its share \(s_i\) consists of all units corresponding to \(x_i\). \(\textsf{Share}_f(s)\) outputs the shares \(\{s_i\}\).

    If \(\mathcal {M}= \mathbb {Z}_q\), each share unit is an element in \(\mathbb {Z}_q\). If \(\mathcal {M}= \mathbb {Z}\), with secrets bounded by \(B\), each unit is an integer bounded by \(B 2^{\kappa }\).

  • For any subset \(T \subseteq [M]\), let \(\textbf{a}_T \in {\{{0,1}\}}^{M}\) denote the assignment where \(a_i = 1\) iff \(i \in T\). For every subset of the shares \(\{s_j\}_T\), reconstruction \(\textsf{Recon}_f(T, \{s_j\}_T)\) succeeds iff \(f(\textbf{a}_T) = 1\).

    For any subset \(\{s_j\}_C\) that fails to reconstruct, there exists a simulation algorithm \(\textsf{Ext}\) defined analogously to Definition 7.

  • The algorithms \(\textsf{Share}_f, \textsf{Recon}_f\) satisfy “flatness” per Definition 8.

With Theorem 1, constructing a flat secret sharing scheme for any access structure reduces to finding a corresponding formula f:

  • \(\textsf{Setup}\) constructs a formula f as \(\textsf{pp}\), and defines the committee Q as the set of distinct literals in f.

  • \(\textsf{Share}, \textsf{Recon}\) simply run \(\textsf{Share}_{f}, \textsf{Recon}_{f}\) given by Theorem 1.

Below we first describe the result of [26], which shows the existence of a formula \(f_t\), of size \(O(M^{5.3})\), for any t-threshold function. (Note that for any \(\gamma M < t < \rho M\), \(f_t\) satisfies our requirement.)

Construction 2

(t-threshold monotone Boolean formula [26]).

In [26], \(f_t\) (over M variables) is implicitly constructed through a formulae distribution \(F_t\) satisfying the following:

$$\begin{aligned} \forall \textbf{a}\in {\{{0,1}\}}^M, \quad \Pr \left[ { f(\textbf{a}) = \textsf{Thresh}_t(\textbf{a}) \,|\, f \leftarrow F_t}\right] > 1-2^{-M}, \end{aligned}$$

(1)

where \(\textsf{Thresh}_t\) denotes the t-threshold function. Applying the union bound over all \(2^M\) values for \(\textbf{a}\), we have

$$ \Pr \left[ {\forall \textbf{a}\in {\{{0,1}\}}^M, \ f(\textbf{a}) = \textsf{Thresh}_t(\textbf{a}) \,|\, f \leftarrow F_t}\right] > 0. $$

Hence, there exists a formula \(f_t\) in \(F_t\) that computes \(\textsf{Thresh}_t\) exactly.

Further, note that for any threshold \(0 < t < M\), the function \(\textsf{Thresh}_t\) over M inputs is equivalent to \(\textsf{Thresh}_{M'/2}\) over \(M' = M + D \le 2M\) inputs, with \(D \le M\) dummy variables always set to 1 or 0, respectively for the case of \(t < M/2\) or \(t \ge M/2\). For technical reasons, we always choose \(M'\) to be odd. Therefore, it remains to construct a formulae distribution \(F_{M/2}\), for any odd M.

The construction is recursive. In the base case, \(F^{(0)}\) is defined as

[Figure: definition of the base-case formulae distribution \(F^{(0)}\), omitted here.]

For \(i \ge 1\), the formulae distribution \(F^{(i)}\) is defined inductively as

$$\begin{aligned} F^{(i)} := ( F_1^{(i-1)} \vee F_2^{(i-1)} ) \wedge ( F_3^{(i-1)} \vee F_4^{(i-1)} ), \end{aligned}$$

where \(F_1^{(i-1)}, F_2^{(i-1)}, F_3^{(i-1)}, F_4^{(i-1)}\) are distributions independent and identical to \(F^{(i-1)}\). It’s shown that after \(k = O(1) + 2.65\log {M}\) recursion steps, the distribution \(F_{M/2} = F^{(k)}\) satisfies Eq. 1.

  According to Eq. 1, we examine the probability that, for any assignment \(\textbf{a}\in {\{{0,1}\}}^M\), a sample \(f^{(i)}\leftarrow F^{(i)}\) computes an incorrect result.

  • When \(\textbf{a}\) has less than M/2 ones, \(f^{(i)}(\textbf{a})\) is supposed to output 0. Let \(p_s^{(i)}\) denote the probability that it instead (incorrectly) outputs 1. By construction, we have

    $$\begin{aligned} p_s^{(i)} = \bigl (1 - (1-p_s^{(i-1)})^2\bigr )^2. \end{aligned}$$

    (2)

  • When \(\textbf{a}\) has at least M/2 ones, let \(p_c^{(i)}\) denote the probability that \(f^{(i)}(\textbf{a})\) (incorrectly) outputs 0. Similarly, we have

    $$\begin{aligned} p_c^{(i)} = 1 - \bigl (1 - \bigl (p_c^{(i-1)}\bigr )^2\bigr )^2. \end{aligned}$$

    (3)

By the construction of \(F^{(0)}\), and the fact that M is odd, we also have

$$ p_s^{(0)} < p(\frac{1}{2} - \frac{1}{2M}), \quad p^{(0)}_c \le (1-p) + p(\frac{1}{2} - \frac{1}{2M}). $$

It remains to show that \(p_s^{(k)} , p_c^{(k)} < 2^{-M}\) for \(k = O(1) + 2.65\log {M}\), which follows from the technical claims below, taken directly from [26].

Claim 1

(phase 1). For the recurrence relations specified by Eqs. 2 and 3, with any initial values satisfying \(p_s^{(0)} < p/2 - p/(2M)\), \(p^{(0)}_c < 1-p/2 -p/(2M)\), it holds that \(p^{(k_1)}_s \le p/2 - \varOmega (1)\), and \(p^{(k_1)}_c \le 1-p/2 - \varOmega (1)\) for \(k_1 = 1.65\log {M}\).

Claim 2

(phase 2). For the recurrence relations specified by Eqs. 2 and 3, with any initial values satisfying \(p_s^{(0)} < p/2 - \varOmega (1)\) and \(p^{(0)}_c < 1-p/2 - \varOmega (1)\), it holds that \(p^{(k_2)}_s, p^{(k_2)}_c < 2^{-M}\) for \(k_2 = O(1) + \log {M}\).

Intuitively, a formula sampled from \(F^{(0)}\) fails with probability close to (but less than) p/2 and \(1-p/2\) respectively in the two cases. Each recursive step “shifts” them further away from the starting points towards 0. Claim 1 shows that it takes \(k_1 = O(\log {M})\) steps to shift from \(\varTheta (1/M)\)-away to \(\varOmega (1)\)-away from the starting points. Claim 2 shows that it takes an additional \(k_2 = O(\log {M})\) steps to shift exponentially close to 0.
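The two-phase amplification can be checked numerically. The sketch below iterates Eqs. 2 and 3; the choice \(p = 3 - \sqrt{5}\) is our illustration, placing \(p/2\) and \(1 - p/2\) exactly at the repelling fixed points of the two maps, so that starting any constant below them drives both failure probabilities doubly-exponentially fast towards 0.

```python
import math

def step_s(x):
    # Eq. 2: p_s^{(i)} = (1 - (1 - p_s^{(i-1)})^2)^2
    return (1 - (1 - x) ** 2) ** 2

def step_c(x):
    # Eq. 3: p_c^{(i)} = 1 - (1 - (p_c^{(i-1)})^2)^2
    return 1 - (1 - x ** 2) ** 2

p = 3 - math.sqrt(5)        # illustrative choice: p/2 and 1 - p/2 are the
p_s = p / 2 - 0.05          # repelling fixed points of step_s and step_c
p_c = 1 - p / 2 - 0.05      # start Omega(1) below them (i.e., within phase 2)
for _ in range(25):
    p_s, p_c = step_s(p_s), step_c(p_c)
```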

Since each recursive step multiplies the formula size by 4, after \(k = k_1 + k_2 = O(1) + 2.65\log {M}\) steps, the formulas in \(F^{(k)}\) have size \(4^{O(1)+2.65\log {M}} = O(M^{5.3})\).

  Our first observation is that instead of the formula \(f_t\), we only need a formula \(f_{\rho , \gamma }\) that 1) computes 1 if the input has more than \(\rho M\) ones, 2) computes 0 if the input has fewer than \(\gamma M\) ones, and 3) may compute either otherwise. We denote this \((\rho ,\gamma )\)-threshold function \(\textsf{Thresh}_{\rho ,\gamma }\). A similar trick reduces computing \(\textsf{Thresh}_{\rho ,\gamma }\) over M variables to \(\textsf{Thresh}_{1/2+\delta , 1/2-\delta }\) over \(M' \le 2M\) variables for some constant fraction \(\delta = (\rho - \gamma )/4\).

This observation allows us to calculate the initial failure probability for \(f^{(0)} \leftarrow F^{(0)}\) differently from above.

  • When \(\textbf{a}\) has less than \(M(1/2 - \delta )\) ones, \(f^{(0)}\) fails (i.e., computes 1) with probability \(p^{(0)}_s < p(1/2 - \delta ) < p/2 - \varOmega (1)\).

  • When \(\textbf{a}\) has more than \(M(1/2 + \delta )\) ones, \(f^{(0)}\) fails with probability \(p^{(0)}_c < (1-p) + p(1/2 + \delta ) < 1-p/2 - \varOmega (1)\).

Since the initial values \(p^{(0)}_s, p^{(0)}_c\) already satisfy the condition for Claim 2, we indeed only need \(k_2 = O(1) + \log {M}\) recursive steps! This observation lets us reduce the size of the formula from \(4^{O(1)+2.65\log {M}} = O(M^{5.3})\) to \(4^{O(1) + \log {M}} = O(M^{2})\).

Our second observation is that in the static corruption model, the sets of corrupted and reconstructing share holders \(C, T_i\) at each iteration i are fixed before the secret sharing \(\textsf{Setup}\) algorithm runs. Therefore, instead of finding an exact formula \(f_{\rho ,\gamma }\) that’s correct on all assignments, it suffices to sample, during \(\textsf{Setup}\), an \(f \leftarrow F_{\rho ,\gamma }\) that’s correct on the (\({\text {poly}}({\kappa })\) many) fixed assignments \(\textbf{a}_C\) and \(\textbf{a}_{T_i}\).

In particular, we can avoid taking the union bound over \(2^M\) values for \(\textbf{a}\), and only construct a distribution \(F_{\rho , \gamma }\) (equivalently, \(F_{1/2+\delta ,1/2-\delta }\)) such that

$$ \forall \textbf{a}\in {\{{0,1}\}}^M, \quad \Pr \left[ { f(\textbf{a}) = \textsf{Thresh}_{\rho ,\gamma }(\textbf{a}) \,|\, f \leftarrow F_{\rho ,\gamma }}\right] > 1-2^{-\kappa }. $$

By Claim 2, we now only need \(k_2' = O(1) + \log {\kappa }\) recursive steps, which further reduces the formula size to \(O(\kappa ^2)\)!

To summarize, we obtain the following lemma.

Lemma 2

(flat secret sharing). For any population size \(M \in \mathbb {N}\), constant fractions \(0 < \gamma < \rho < 1\), integer modulus q, and dimension \(\ell \), there exists a flat secret-sharing scheme \(\textsf{Setup}, \textsf{Share},\textsf{Recon}\) for the secret space \(\mathcal {M}= \mathbb {Z}^\ell _q\) or \(\mathcal {M}= \mathbb {Z}^\ell \), with privacy and reconstruction thresholds \(\gamma , \rho \). Furthermore,

  • It has committee size \(|Q| = O(\kappa ^2)\), where the constant depends on the thresholds \(\gamma , \rho \).

  • The \(\textsf{Recon}\) algorithm, when written as a linear function, has \(O(\kappa )\) non-zero coefficients, which are 1 or \(-1\).

  When sharing a secret according to a formula f, \(\textsf{Share}_f\) views f as a tree with AND, OR on the intermediate nodes, and literals \(x_i\) on the leaf nodes. It assigns a share to each node of this tree: i) Upon reaching an AND node, split the current share s into two additive shares of s, and assign them to the children. ii) Upon reaching an OR node, duplicate s and assign them to the children. iii) Upon reaching a literal \(x_i\), assign s to share holder i. Reconstruction according to f follows a similar recursive algorithm.
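The tree traversal just described can be sketched in Python (a toy sketch sharing single \(\mathbb {Z}_q\) elements; the node encoding and function names are ours). Sharing splits additively at AND nodes and duplicates at OR nodes; reconstruction walks the tree in the same order, consuming each holder's share units.

```python
import secrets

# A formula node is ("and", l, r), ("or", l, r), or ("leaf", i) for literal x_i.
def share(node, s, q, out):
    kind = node[0]
    if kind == "leaf":
        out.setdefault(node[1], []).append(s)   # assign this unit to holder i
    elif kind == "and":
        r = secrets.randbelow(q)                # additive split of s
        share(node[1], r, q, out)
        share(node[2], (s - r) % q, q, out)
    else:                                       # "or": duplicate s to both children
        share(node[1], s, q, out)
        share(node[2], s, q, out)

def recon(node, avail, q, idx=None):
    idx = {} if idx is None else idx
    kind = node[0]
    if kind == "leaf":
        i = node[1]
        idx[i] = idx.get(i, 0) + 1              # consume holder i's next unit in order
        return avail[i][idx[i] - 1] if i in avail else None
    a = recon(node[1], avail, q, idx)
    b = recon(node[2], avail, q, idx)
    if kind == "and":
        return None if a is None or b is None else (a + b) % q
    return a if a is not None else b            # "or": either copy works
```

For the majority-of-three formula \((x_1 \wedge x_2) \vee (x_1 \wedge x_3) \vee (x_2 \wedge x_3)\), any two holders' units reconstruct the secret, while a single holder sees only uniformly random units. Note the coefficients used in reconstruction are all 1, matching the flatness requirement.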