In the lecture we saw the Sparse Multiplicative Weights (SMW) IDC algorithm for releasing accurate answers to *m*-sparse queries while preserving differential privacy in the interactive setting, where the data curator acts as an intermediary and must answer an adaptively chosen stream of queries as they arrive. Here I am going to describe another algorithm (proposed by Avrim Blum and Aaron Roth) for releasing accurate answers to *m*-sparse queries that works in the non-interactive setting, in which the data curator must in one shot output a data structure that encodes the answers to every query of interest. In this algorithm the entire computation must be performed in time polynomial in *n*, and the time required to evaluate any query on the output data structure must also be polynomial.

The non-interactive mechanism releases a random projection of the database into polynomially many dimensions, together with the corresponding projection matrix. Queries are evaluated by computing their projection using the public projection matrix, and then taking the inner product of the projected query and the projected database. The difficulty arises because the projection matrix projects vectors from *|X|*-dimensional space to *poly(n)*-dimensional space, and so would normally take *|X| · poly(n)* bits to represent. The algorithm is constrained to run in time *poly(n)*, however, so we need a concise representation of the projection matrix. This is achieved by using a matrix implicitly generated by a family of limited-independence hash functions, which have concise representations. Doing so requires limited-independence versions of the Johnson–Lindenstrauss lemma and of standard concentration bounds.
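To see why this approach can work at all, here is a toy, non-private sketch of the inner-product trick. Everything in it is illustrative — the universe size, the projection dimension *T*, and especially the fully random signs (the real mechanism adds noise and replaces the signs with a concise limited-independence family):

```python
# Toy sketch: a random sign projection A[i, x] = ±1/sqrt(T) approximately
# preserves inner products, so a projected query dotted with the projected
# database approximates the true answer. Non-private and fully random signs;
# all sizes here are illustrative.
import math
import random

random.seed(0)

UNIVERSE = 1000   # stand-in for a huge universe X
T = 2000          # projection dimension, poly(n) in the real mechanism

# toy database in histogram form: 100 individuals, each a universe element
db = {random.randrange(UNIVERSE): 1 for _ in range(100)}

# a sparse query supported on 20 of the database's elements
query = {x: 1.0 for x in list(db)[:20]}

true_answer = sum(db.get(x, 0) * w for x, w in query.items())  # = 20

signs = {}
def sign(i, x):
    # fully independent random signs; the real algorithm uses a concise
    # r-wise independent hash family instead
    if (i, x) not in signs:
        signs[(i, x)] = random.choice((-1.0, 1.0))
    return signs[(i, x)]

# project both vectors with A[i, x] = sign(i, x) / sqrt(T)
proj_db = [sum(sign(i, x) * c for x, c in db.items()) / math.sqrt(T)
           for i in range(T)]
proj_q = [sum(sign(i, x) * w for x, w in query.items()) / math.sqrt(T)
          for i in range(T)]

approx = sum(a * b for a, b in zip(proj_db, proj_q))  # close to true_answer
```

Note that only the non-zero coordinates of the database and the query are ever touched, which is what makes the *poly(n)* running time plausible despite the *|X|*-dimensional ambient space.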

Before presenting the algorithm, let me recall some basic concepts. A database *D* is a multiset of elements from some (possibly infinite) abstract universe *X*. We write *|D| = n* to denote the cardinality of *D*. We can identify *D* with its histogram *x_D ∈ ℕ^|X|*, whose coordinate *(x_D)_x* counts the occurrences of the element *x* in *D*; each entry lies in *[0, n]*. A linear query is a function *Q : X → [0, 1]*; abusing notation, we write *Q(D) = Σ_{x ∈ D} Q(x)* for its answer on *D*.

Similarly to how we think of a database as a vector, we can think of a query as a vector *x_Q ∈ [0, 1]^|X|* with *(x_Q)_x = Q(x)*, so that *Q(D) = ⟨x_D, x_Q⟩*.

**Definition 1** (Sparsity). The sparsity of a linear query *Q* is the number of elements on which it takes a non-zero value, *|{x ∈ X : Q(x) ≠ 0}|*. A query is *m*-sparse if its sparsity is at most *m*. We will assume that given an *m*-sparse query we can quickly (in time polynomial in *m*) enumerate the elements *x* with *Q(x) ≠ 0*.
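A minimal sketch of the representation Definition 1 assumes: storing a query as a dictionary over only its non-zero support makes both enumeration and evaluation cost *O(m)*, independent of *|X|*. The toy universe of strings below is hypothetical:

```python
# An m-sparse query stored as {element: Q(element)} over its non-zero
# support only; evaluation touches m entries, never the whole universe X.
from collections import Counter

def evaluate(query, database):
    """Q(D) = sum over x in D of Q(x), using only Q's non-zero support."""
    counts = Counter(database)  # histogram view of the multiset D
    return sum(weight * counts[x] for x, weight in query.items())

# hypothetical toy universe of strings; this query is 3-sparse
query = {"apple": 1.0, "kiwi": 0.5, "plum": 1.0}
database = ["apple", "apple", "plum", "pear", "pear"]

answer = evaluate(query, database)  # 1.0*2 + 0.5*0 + 1.0*1 = 3.0
```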

**Definition 2** (Accuracy for a Non-Interactive Mechanism). Let *𝒬* be a class of queries and let *M* be a mechanism with range *R*. *M* is *(α, β)*-accurate for *𝒬* if there exists a function *Eval : 𝒬 × R → ℝ* such that for every database *D*, with probability at least *1 − β* over the coins of *M*, the output *r = M(D)* satisfies *|Q(D) − Eval(Q, r)| ≤ α* for every *Q ∈ 𝒬*.

**Definition 3** (Neighboring Databases). Two databases *D* and *D′* are neighbors if they differ only in the data of a single individual, i.e., if their symmetric difference satisfies *|D △ D′| ≤ 1*.

**Definition 4** (Differential Privacy). A randomized algorithm *M* acting on databases and outputting elements from some abstract range *R* is *(ε, δ)*-differentially private if for all pairs of neighboring databases *D, D′* and for all subsets *S ⊆ R*: *Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S] + δ*.

**Definition 5** (The Laplace Distribution). The Laplace distribution (centered at 0) with scale *b* is the distribution with probability density function *p(x | b) = (1/2b) · exp(−|x|/b)*. We write *Lap(b)* to denote a random variable drawn from this distribution.
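Since the algorithm below perturbs each projected coordinate with Laplace noise, here is a minimal way to sample it using only the standard library, relying on the fact that a Laplace variable is a scaled difference of two independent exponentials (`numpy.random.laplace` is the usual shortcut):

```python
# Sampling Lap(b): a Laplace(0, b) variable equals b times the difference
# of two independent Exp(1) variables.
import random

def laplace(b, rng):
    return b * (rng.expovariate(1.0) - rng.expovariate(1.0))

rng = random.Random(0)
b = 2.0
samples = [laplace(b, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)                 # close to 0
var = sum(s * s for s in samples) / len(samples)   # close to 2 * b**2 = 8
```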

**Definition 6** (Random Projection Data Structure). A random projection data structure of size *T* consists of:

1. A vector *v ∈ ℝ^T* (the noisy projection of the database), and

2. A hash function *f : [T] × X → {−1, +1}*, which implicitly represents the projection matrix *A* with entries *A_{i,x} = f(i, x)/√T*.

To evaluate a linear query *Q* on a random projection data structure *(v, f)*, we first compute the projected query *Q̂ ∈ ℝ^T* with coordinates

*Q̂_i = (1/√T) · Σ_{x : Q(x) ≠ 0} f(i, x) Q(x)*,

then we output the inner product *⟨v, Q̂⟩* as our estimate of *Q(D)*.

**ALGORITHM** (SparseProject takes as input a private database *D* of size *n*, privacy parameters *ε* and *δ*, the sparsity bound *m*, the size *k* of the target query class, and a confidence parameter *β*):

**SparseProject**(*D, ε, δ, m, k, β*):

**Let** *T* be the projection dimension and *r* the independence parameter, both polynomial in the inputs (the exact settings are derived in the paper).

**Let** *f* be a randomly chosen hash function from a family of *r*-wise independent hash functions mapping *[T] × X → {−1, +1}*.

**Let** *σ* be the scale of the Laplace noise, again set as in the paper.

**for** *i = 1* to *T* **do**

**Let** *u_i = (1/√T) · Σ_{x ∈ D} f(i, x)*. **Let** *v_i = u_i + Lap(σ)*.

**end for**

**Output** *(v, f)*.
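The loop above can be sketched in runnable form as follows. This is a hedged illustration, not the paper's mechanism: the settings of *T* and the noise scale *σ* are placeholders rather than the tuned values, and a memoized fully random sign function stands in for the *r*-wise independent family:

```python
# Hedged sketch of SparseProject: project the database with random signs,
# add Laplace noise to each coordinate, publish (v, f), and answer sparse
# queries by inner product with the projected query. T and sigma below are
# illustrative placeholders, not the paper's derived settings.
import math
import random

def laplace(b, rng):
    # Lap(0, b) as a scaled difference of two Exp(1) draws
    return b * (rng.expovariate(1.0) - rng.expovariate(1.0))

def sparse_project(db, T, sigma, rng):
    """db: list of universe elements; returns the pair (v, f)."""
    signs = {}
    def f(i, x):
        # memoized truly random signs standing in for an r-wise
        # independent hash family with a concise representation
        if (i, x) not in signs:
            signs[(i, x)] = rng.choice((-1.0, 1.0))
        return signs[(i, x)]
    v = []
    for i in range(T):
        u_i = sum(f(i, x) for x in db) / math.sqrt(T)
        v.append(u_i + laplace(sigma, rng))
    return v, f

def evaluate(query, v, f, T):
    """Answer a sparse query {x: Q(x)} from the released (v, f)."""
    q_hat = [sum(f(i, x) * w for x, w in query.items()) / math.sqrt(T)
             for i in range(T)]
    return sum(a * b for a, b in zip(v, q_hat))

rng = random.Random(1)
db = ["a"] * 60 + ["b"] * 40            # toy database, n = 100
T, sigma = 4000, 0.05                   # illustrative placeholders
v, f = sparse_project(db, T, sigma, rng)
answer = evaluate({"a": 1.0}, v, f, T)  # true Q(D) = 60
```

Note that the query evaluation never touches the database itself — only the public pair *(v, f)* — which is exactly the non-interactive guarantee.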

**Remark**: There are various ways to select a hash function from a family of *r*-wise independent hash functions with a concise representation. A standard construction picks a uniformly random polynomial of degree *r − 1* over a finite field whose size is at least that of the (encoded) domain *[T] × X*; evaluating the polynomial at an input and keeping a single bit of the result gives an *r*-wise independent hash function mapping *[T] × X → {−1, +1}* that is represented by only *r* field elements.
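The standard construction just mentioned can be sketched as follows. The prime field and the domain encoding are illustrative choices (the paper works over a field of characteristic 2, and taking the parity of a uniform element of an odd prime field is only approximately unbiased, negligibly so for a large prime):

```python
# Sketch: a uniformly random polynomial of degree r-1 over a prime field
# gives an r-wise independent family; the hash is stored as just r field
# elements. The low bit of the field output is mapped to a sign in {-1, +1}
# (approximately unbiased when the prime is large).
import random

class RWiseHash:
    def __init__(self, r, p, rng):
        self.p = p                                    # field size (prime)
        self.coeffs = [rng.randrange(p) for _ in range(r)]

    def __call__(self, x):
        # Horner evaluation of the random polynomial at x, then one bit
        acc = 0
        for c in self.coeffs:
            acc = (acc * x + c) % self.p
        return 1 if acc % 2 == 0 else -1

rng = random.Random(0)
# 4-wise independent family; the pair (i, x) from [T] x X is assumed
# encoded as the single integer i * |X| + x (here |X| = 1000 as a toy)
h = RWiseHash(r=4, p=2_147_483_647, rng=rng)
signs = [h(i * 1000 + x) for i in range(3) for x in range(5)]
```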

Below I state the basic theorems about the SparseProject algorithm. (All proofs can be found in *Fast Private Data Release Algorithms for Sparse Queries* by Avrim Blum and Aaron Roth, 2011.)

**Theorem 1** *SparseProject is (ε, δ)-differentially private.*

**Theorem 2** *For any ε, δ, β > 0, and with respect to any class of m-sparse linear queries of cardinality k, SparseProject is (α, β)-accurate for an α = Õ(·) that scales with the sparsity √m and only polylogarithmically with k, 1/β, and 1/δ, and improves as ε grows (see the paper for the precise expression), where the Õ hides a term logarithmic in (m + n).*