David Xiao
email: dxiao@liafa.univ-paris-diderot.fr
snail mail: David Xiao
Welcome to my home page. I am a CNRS researcher in the Algorithms and Complexity Group at LIAFA, focusing on complexity theory and cryptography. Prior to coming here, I obtained my Ph.D. from Princeton University under the co-supervision of Boaz Barak and Avi Wigderson (IAS).
Please note: I am currently on leave and am not responding to requests for internships or doctoral candidates. You may contact another member of our group here.
The list below is out of date. For the most complete list of my publications, please refer to DBLP.
Complexity
We show that almost all known lower bound methods for communication complexity are also lower bounds for the information complexity. In particular, we define a relaxed version of the partition bound of Jain and Klauck and prove that it lower bounds the information complexity of any function. Our relaxed partition bound subsumes all norm based methods (e.g. the factorization norm method) and rectangle-based methods (e.g. the rectangle/corruption bound, the smooth rectangle bound, and the discrepancy bound), except the partition bound.
Our result uses a new connection between rectangles and zero-communication protocols, where the players can either output a value or abort. We prove the following compression lemma: given a protocol for a function f with information complexity I, one can construct a zero-communication protocol that has non-abort probability at least $2^{-O(I)}$ and that computes f correctly with high probability conditioned on not aborting. We then show how such a zero-communication protocol relates to the relaxed partition bound.
We use our main theorem to resolve three of the open questions raised by Braverman. First, we show that the information complexity of the Vector in Subspace Problem is $\Omega(n^{1/3})$, which, in turn, implies that there exists an exponential separation between quantum communication complexity and classical information complexity. Moreover, we provide an $\Omega(n)$ lower bound on the information complexity of the Gap Hamming Distance Problem.
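To give a feel for the compression step, here is a minimal, self-contained sketch (ours, not the paper's construction) of the rejection-sampling idea behind zero-communication protocols with abort: a party targeting a distribution P can filter shared randomness drawn from a proposal Q, accepting with probability proportional to P(z)/Q(z). Conditioned on not aborting, the output is distributed exactly according to P, and the non-abort probability decays with the divergence between P and Q, mirroring the $2^{-O(I)}$ non-abort probability above. The distributions below are illustrative.

import random

# Target distribution P (think: the transcript distribution the
# compressed protocol must reproduce) and a shared proposal Q.
P = {0: 0.7, 1: 0.2, 2: 0.1}
Q = {0: 1/3, 1: 1/3, 2: 1/3}
M = max(P[z] / Q[z] for z in P)          # here M = 2.1

def zero_comm_sample():
    z = random.choice(sorted(Q))         # shared randomness: z ~ Q
    u = random.random()                  # shared randomness: u ~ U[0,1]
    if u < P[z] / (M * Q[z]):            # accept with prob P(z)/(M*Q(z))
        return z                         # output a value
    return None                          # abort

trials = [zero_comm_sample() for _ in range(100_000)]
accepted = [z for z in trials if z is not None]
print("non-abort probability:", len(accepted) / len(trials))   # ~ 1/M
for z in P:                              # conditioned on not aborting,
    print(z, accepted.count(z) / len(accepted))   # the output follows P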
We consider the randomized decision tree complexity of the recursive 3-majority function. For evaluating height-$h$ formulae, we prove a lower bound of $(1-2\delta)(5/2)^h$ on the $\delta$-two-sided-error randomized decision tree complexity, improving the lower bound of $(1-2\delta)(7/3)^h$ given by Jayram et al. (STOC '03). We also state a conjecture that would further improve the lower bound to $(1-2\delta)2.54355^h$.
In the other direction, we improve the upper bound by giving a new zero-error randomized decision tree algorithm with complexity at most $(1.007) \cdot 2.64946^h$, improving on the previous best known algorithm, which achieved $(1.004) \cdot 2.65622^h$.
Our lower bound follows from a better analysis of the base case of the recursion of Jayram et al. Our algorithm uses a novel ``interleaving'' of two recursive algorithms.
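For context, the following minimal sketch (ours; it is the classic directional recursion, not the paper's interleaved algorithm) evaluates a height-$h$ recursive 3-majority formula by querying two randomly chosen children first and the third only on disagreement. Its expected worst-case query count is roughly $(8/3)^h \approx 2.667^h$, the baseline that the bounds above tighten.

import random

def eval_maj3(leaves, counter):
    # Evaluate a complete ternary 3-MAJ formula over `leaves`.
    n = len(leaves)
    if n == 1:
        counter[0] += 1                  # one query to a leaf
        return leaves[0]
    t = n // 3
    children = [leaves[:t], leaves[t:2*t], leaves[2*t:]]
    random.shuffle(children)             # evaluate children in random order
    a = eval_maj3(children[0], counter)
    b = eval_maj3(children[1], counter)
    if a == b:                           # majority already determined
        return a
    return eval_maj3(children[2], counter)  # the third child breaks the tie

h = 6
leaves = [random.randint(0, 1) for _ in range(3 ** h)]
counter = [0]
value = eval_maj3(leaves, counter)
print(value, "queries:", counter[0], "vs (8/3)^h =", (8 / 3) ** h)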
We explore the relationship between a natural notion of "learning to create" (LTC) studied by Kearns et al. (STOC '94) and the standard PAC model of Valiant (CACM '84), which can be thought of as a formalization of "learning to appreciate". Our main theorem states that "if learning to appreciate is hard, then so is learning to create". More formally, we prove that if there exists a concept class for which PAC learning with respect to efficiently samplable input distributions is hard, then there exists another (possibly richer) concept class for which the LTC problem is hard. We also show that our theorem is tight in two senses, by proving that there exist concrete concept classes for which PAC learning is hard but LTC is easy, and by showing that it is unlikely our main theorem can be improved to the case of PAC learning with respect to unsamplable input distributions.
We investigate the question of what languages can be decided efficiently with the help of a recursive collision-finding oracle. Such an oracle can be used to break collision-resistant hash functions or, more generally, statistically hiding commitments. The oracle we consider, $\Sam_d$, where $d$ is the recursion depth, is based on the identically-named oracle defined in the work of Haitner et al. (FOCS '07). Our main result is a constant-round public-coin protocol AM-Sam that allows an efficient verifier to emulate a $\Sam_d$ oracle for any constant depth $d = O(1)$ with the help of a $\BPP^{\NP}$ prover. AM-Sam allows us to conclude that if $L$ is decidable by a $k$-adaptive randomized oracle algorithm with access to a $\Sam_{O(1)}$ oracle, then $L \in \AM[k] \cap \coAM[k]$.
The above yields the following corollary: if there exists an $O(1)$-adaptive reduction that bases constant-round statistically hiding commitment on $\NP$-hardness, then $\NP \subseteq \coAM$ and the polynomial hierarchy collapses. The same result holds for any primitive that can be broken by $\Sam_{O(1)}$, including collision-resistant hash functions and $O(1)$-round oblivious transfer where security holds statistically for one of the parties. We also obtain non-trivial (though weaker) consequences for $k$-adaptive reductions for any $k = \poly(n)$. Prior to our work, most results in this research direction applied either only to non-adaptive reductions (Bogdanov and Trevisan, SIAM J. of Comp. '06) or to primitives with special regularity properties (Brassard, FOCS '79; Akavia et al., FOCS '06).
The main technical tool we use to prove the above is a new constant-round public-coin protocol (SampleWithSize) that we believe may be interesting in its own right, and that guarantees the following. Given an efficient function $f$ on $n$ bits, let $D$ be the output distribution $D = f(U_n)$; then SampleWithSize allows an efficient verifier Arthur to use an all-powerful prover Merlin's help to sample a random $y \leftarrow D$ along with a good multiplicative approximation of the probability $p_y = \Pr_{y' \leftarrow D}[y' = y]$. The crucial feature of SampleWithSize is that it extends even to distributions of the form $D = f(U_S)$, where $U_S$ is the uniform distribution on an efficiently decidable subset $S \subseteq \{0,1\}^n$ (such $D$ are called efficiently samplable with post-selection), as long as the verifier is also given a good approximation of the value $|S|$.
We prove new results regarding the power of randomized oracle reductions. We first prove that $\BPP^{\prSZK} \subseteq \AM \cap \coAM$, where $\prSZK$ is the class of promise problems having statistical zero-knowledge proofs. This strengthens the previously known facts that $\prSZK$ is closed under $\class{NC}^1$ truth-table reductions (Sahai and Vadhan, J. ACM '03) and that $\P^{\prSZK} \subseteq \AM \cap \coAM$ (Vadhan, personal communication). Our results imply that for most cryptographically interesting lattice problems, there is a sharp threshold for the approximation factor, below which we do not know whether the problems are even in $\AM$, and above which the problems are in $\AM \cap \coAM$ not only via Karp reductions but also via randomized oracle reductions. This resolves a long-standing open question of Goldreich and Goldwasser (JCSS '98).
Then we investigate the power of randomized oracle reductions in relation to the notion of instance checking (Blum and Kannan, J. ACM '95). We observe that a theorem of Beigel implies that if any problem in $\TFNP$, such as Nash equilibrium, is $\NP$-hard under randomized oracle reductions, then $\SAT$ is checkable.
We also observe that Beigel's theorem can be extended to an average-case setting by relating checking to an average-case notion of program checking.
Learning is a central task in computer science, and there are various formalisms for capturing the notion. One important model studied in computational learning theory is the PAC model of Valiant (CACM 1984). On the other hand, in cryptography the notion of "learning nothing" is often modelled by the simulation paradigm: in an interactive protocol, a party learns nothing if it can produce a transcript of the protocol by itself that is indistinguishable from what it gets by interacting with other parties. The most famous example of this paradigm is zero knowledge proofs, introduced by Goldwasser, Micali, and Rackoff (SICOMP 1989).
Applebaum et al. (FOCS 2008) observed that a theorem of Ostrovsky and Wigderson (ISTCS 1993), combined with the transformation of one-way functions into pseudo-random functions (Håstad et al., SICOMP 1999; Goldreich et al., J. ACM 1986), implies that if there exist non-trivial languages with zero-knowledge arguments, then no efficient algorithm can PAC learn polynomial-size circuits. They also prove a weak reverse implication: if a certain non-standard learning task is hard, then zero knowledge is non-trivial. This motivates the question we explore here: can one prove that the hardness of PAC learning is equivalent to the non-triviality of zero knowledge? We prove barriers to establishing this equivalence.
Under the standard conjecture that NP is not contained in SZK, our results imply that most standard techniques do not suffice to prove the equivalence between the non-triviality of zero knowledge and the hardness of PAC learning. Our results hold even when considering non-uniform hardness of PAC learning with membership queries. In addition, our technique relies on a new kind of separating oracle that may be of independent interest.
We consider the question of whether $\P \neq \NP$ implies that there exists some concept class that is efficiently representable but is still hard to learn in the PAC model of Valiant (CACM '84). We show that unless the Polynomial Hierarchy collapses, such a statement cannot be proven via a large class of reductions, including Karp reductions, truth-table reductions, and a restricted form of non-adaptive Turing reductions. Moreover, a proof using a Turing reduction with a constant number of adaptive rounds would imply an important consequence in cryptography: a transformation from any average-case hard problem in $\NP$ to a one-way function. Our results hold even in the stronger model of agnostic learning.
These results are obtained by showing that lower bounds for improper learning are intimately related to the complexity of zero-knowledge arguments and to the existence of weak cryptographic primitives. In particular, we prove that if a language $L$ reduces to the task of improperly learning circuits, then, depending on the type of reduction used, either (1) $L$ has a statistical zero-knowledge argument system, or (2) the worst-case hardness of $L$ implies the existence of a weak variant of one-way functions defined by Ostrovsky and Wigderson (ISTCS '93). Interestingly, we observe that the converse implication also holds: if (1) or (2) holds, then the intractability of $L$ implies that improper learning is hard.
Ahlswede and Winter [IEEE Trans. Inf. Th. 2002] introduced a Chernoff bound for matrix-valued random variables, which is a non-trivial generalization of the usual Chernoff bound for real-valued random variables. We present an efficient derandomization of their bound using the method of pessimistic estimators (see Raghavan [JCSS 1988]). As a consequence, we derandomize a construction of Alon and Roichman [RSA 1994] to efficiently construct an expanding Cayley graph of logarithmic degree on any (possibly non-abelian) group. This gives an optimal solution to the homomorphism testing problem of Shpilka and Wigderson [STOC 2004]. We also apply these pessimistic estimators to the problem of solving semi-definite covering problems, thus giving a deterministic algorithm for the quantum hypergraph cover problem of Ahlswede and Winter.
The results above appear as theorems in our paper ``A randomness-efficient sampler for matrix-valued functions and applications'' [FOCS 2005, ECCC 2005], as consequences of that paper's main claim: a randomness-efficient sampler for matrix-valued functions via expander walks. However, we discovered an error in the proof of that main theorem (which we briefly describe in the appendix). The claim that the expander walk sampler is good for matrix-valued functions thus remains open. One purpose of the current paper is to show that the applications in that paper hold true despite our inability to prove the expander walk sampler theorem for matrix-valued functions.
N.B.: refer to [WX08] for details about errors in this paper.
In this paper we give a randomness-efficient sampler for matrix-valued functions. Specifically, we show that a random walk on an expander approximates the recent Chernoff-like bound for matrix-valued functions of Ahlswede and Winter [IEEE Trans. Inf. Th. '02], in a manner that depends optimally on the spectral gap. The proof uses perturbation theory, and is a generalization of Gillman's [FOCS '93] and Lezaud's [Ann. App. Prob. '98] analyses of the Ajtai-Komlós-Szemerédi sampler for real-valued functions [STOC '87].
Derandomizing our sampler yields deterministic polynomial-time algorithms for problems in which derandomizing independent sampling gives only quasi-polynomial-time deterministic algorithms. The first application (which was our original motivation) is a polynomial-time derandomization of the Alon-Roichman theorem [Rand. St. and Alg. '94]: given a group of size $n$, find $O(\log n)$ elements which generate it as an expander. This implies a second application: efficiently constructing a randomness-optimal homomorphism tester, significantly improving the previous result of Shpilka and Wigderson [FOCS '04]. The third is a ``non-commutative'' hypergraph covering problem, a natural extension of the set-cover problem which arises in quantum information theory, in which we efficiently attain the integrality gap when the fractional semi-definite relaxation cost is constant.
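As a quick empirical companion (ours, not from the papers), the following numpy snippet illustrates the Alon-Roichman phenomenon that these works derandomize: for the cyclic group $\mathbb{Z}_n$, choosing $O(\log n)$ random generators already yields a Cayley graph whose second-largest eigenvalue modulus is bounded well away from 1 with high probability.

import numpy as np

rng = np.random.default_rng(0)
n = 512
k = 8 * int(np.log2(n))                  # O(log n) random generators
gens = rng.integers(1, n, size=k)

# Normalized adjacency matrix of the Cayley multigraph Cay(Z_n, S u -S).
A = np.zeros((n, n))
for s in gens:
    for x in range(n):
        A[x, (x + s) % n] += 1.0
        A[x, (x - s) % n] += 1.0
A /= 2 * k

eig = np.sort(np.abs(np.linalg.eigvalsh(A)))
print("second largest |eigenvalue|:", eig[-2])   # well below 1 => expander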
Cryptography and Security
A Zero-Knowledge PCP (ZK-PCP) is a randomized PCP such that the view of any (possibly cheating) efficient verifier can be efficiently simulated up to small statistical distance. Kilian, Petrank, and Tardos (STOC '97) constructed ZK-PCPs for all languages in $NEXP$. Ishai, Mahmoody, and Sahai (TCC '12), motivated by cryptographic applications, revisited the possibility of efficient ZK-PCPs for all of $NP$, where the PCP is encoded as a polynomial-size circuit that, given a query $i$, returns the $i$-th symbol of the PCP. Ishai et al. showed that there is no efficient ZK-PCP for $NP$ with a non-adaptive verifier, i.e., one that prepares all of its PCP queries before seeing any answers, unless $NP \subseteq coAM$ and the polynomial-time hierarchy collapses. The question of whether adaptive verification can lead to efficient ZK-PCPs for $NP$ remained open.
In this work, we resolve this question and show that any language or promise problem with efficient ZK-PCPs must be in $SZK$ (the class of promise problems with a statistical zero-knowledge single-prover proof system). Therefore, no $NP$-complete problem can have an efficient ZK-PCP unless $NP \subseteq SZK$ (which also implies $NP \subseteq coAM$ and the polynomial-time hierarchy collapses).
We prove our result by reducing any promise problem with an efficient ZK-PCP to two instances of the "Conditional Entropy Approximation" problem defined and studied by Vadhan (FOCS '04), which is known to be complete for the class $SZK$.
Assuming $t$-round statistically hiding commitments in the stand-alone model, we build a $(t+2)$-round statistically binding commitment scheme secure against selective opening attacks under parallel composition. In particular, assuming collision-resistant hash functions, we build such commitments in 4 rounds.
We investigate the mainstream interpretation of differential privacy, which says that given a differentially private mechanism, people are likely to share their data truthfully because they are at little risk of revealing their own information. We argue that this interpretation is incomplete because people attach a concrete value to their privacy, and so they should be explicitly motivated to reveal their information. Using the notion of truthfulness from game theory, we show that in certain settings, existing differentially private mechanisms do not encourage participants to report their information truthfully.
On the positive side, we exhibit a transformation that, for games where the type space is small and the goal is to optimize social welfare, takes truthful and efficient mechanisms and transforms them into differentially private, truthful, and efficient mechanisms.
We also study what happens when an explicit numerical cost is assigned to the information leaked by a mechanism. We show that in this case, even differential privacy may not be a strong enough notion to motivate people to participate truthfully.
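For readers unfamiliar with the mechanisms in question, here is a minimal sketch of the exponential mechanism of McSherry and Talwar, a standard differentially private way to select a high-social-welfare outcome. It only illustrates the setting; it is not the transformation from the paper, and the numbers are made up.

import numpy as np

def exponential_mechanism(scores, eps, sensitivity, rng):
    # Sample outcome i with probability proportional to
    # exp(eps * scores[i] / (2 * sensitivity)); subtracting the max
    # score is for numerical stability only.
    scores = np.asarray(scores, dtype=float)
    logits = eps * (scores - scores.max()) / (2 * sensitivity)
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(scores), p=probs)

rng = np.random.default_rng(1)
welfare = [10.0, 9.5, 2.0]       # reported social welfare of 3 outcomes
picks = [exponential_mechanism(welfare, eps=0.5, sensitivity=1.0, rng=rng)
         for _ in range(1000)]
print([picks.count(i) for i in range(3)])  # near-optimal outcomes dominate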
Selective opening attacks against commitment schemes occur when the commitment scheme is repeated in parallel and an adversary can choose, depending on the commit-phase transcript, to see the values and openings of some subset of the committed bits. Commitments are secure under such attacks if one can prove that the remaining, unopened commitments stay secret.
We prove black-box constructions and black-box lower bounds for commitments secure against selective opening attacks under parallel composition.
Our lower bounds improve upon the parameters obtained by the impossibility results of Bellare et al. (EUROCRYPT '09), and are proved in a fundamentally different way, by observing that essentially all known impossibility results for black-box zero-knowledge can also be applied to the case of commitments secure against selective opening attacks.
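To make the attack model concrete, here is a toy rendition (ours) of the selective opening game; the hash-based commitment is purely illustrative and is not the scheme constructed above.

import hashlib, os

def commit(bit):
    r = os.urandom(16)                   # commitment randomness
    c = hashlib.sha256(bytes([bit]) + r).hexdigest()
    return c, (bit, r)

def verify(c, opening):
    bit, r = opening
    return hashlib.sha256(bytes([bit]) + r).hexdigest() == c

n = 8
bits = [os.urandom(1)[0] & 1 for _ in range(n)]
commitments, openings = zip(*(commit(b) for b in bits))

# The adversary sees all n commitments in parallel, then chooses a
# subset I to be opened; security demands that the bits outside I
# remain hidden even given the openings for I.
I = [1, 3, 4]
revealed = {i: openings[i] for i in I}
assert all(verify(commitments[i], revealed[i]) for i in I)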
We consider the following problem: can we construct constant-round zero-knowledge proofs (with negligible soundness error) for \NP assuming only the existence of one-way permutations? We answer the question in the negative for fully black-box constructions (using only black-box access to both the underlying primitive and the cheating verifier) that satisfy a natural restriction on the ``adaptivity'' of the simulator's queries. Specifically, we show that only languages in coAM have constant-round zero-knowledge proofs of this kind.
Edge networks connected to the Internet need effective monitoring techniques to drive routing decisions and detect violations of Service Level Agreements (SLAs). However, existing measurement tools, like ping, traceroute, and trajectory sampling, are vulnerable to attacks that can make a path look better than it really is. In this paper, we design and analyze path-quality monitoring protocols that reliably raise an alarm when the packet-loss rate and delay exceed a threshold, even when an adversary tries to bias monitoring results by selectively delaying, dropping, modifying, injecting, or preferentially treating packets.
Despite the strong threat model we consider in this paper, our protocols are efficient enough to run at line rate on high-speed routers. We present a secure sketching protocol for identifying when packet loss and delay degrade beyond a threshold. This protocol is extremely lightweight, requiring only 250–600 bytes of storage and periodic transmission of a comparably sized IP packet to monitor billions of packets. We also present secure sampling protocols that provide faster feedback and accurate round-trip delay estimates, at the expense of somewhat higher storage and communication costs. We prove that all our protocols satisfy a precise definition of secure path-quality monitoring and derive analytic expressions for the trade-off between statistical accuracy and system overhead. We also compare how our protocols perform in the client-server setting, when paths are asymmetric, and when packet marking is not permitted.
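The flavor of the sketching protocol can be conveyed with a short simulation (a simplification of ours, omitting the authenticated sketch exchange, epochs, and the exact sketch from the paper): both endpoints hash each packet into a small signed-bucket sketch under a shared secret key, so an on-path adversary cannot target the sketch, and the squared distance between the two sketches estimates the number of dropped or modified packets.

import hashlib, hmac, random

B = 64                                    # number of sketch buckets

def bucket_and_sign(key, packet):
    # Derive a bucket index and a +/-1 sign from a keyed hash, so an
    # adversary without the key cannot predict a packet's contribution.
    d = hmac.new(key, packet, hashlib.sha256).digest()
    return d[0] % B, 1 if d[1] % 2 == 0 else -1

def update(sketch, key, packet):
    i, s = bucket_and_sign(key, packet)
    sketch[i] += s

key = b"shared-secret"
sender, receiver = [0] * B, [0] * B
for j in range(20_000):
    p = f"pkt{j}".encode()
    update(sender, key, p)
    if random.random() > 0.01:            # the path drops ~1% of packets
        update(receiver, key, p)

diff2 = sum((a - b) ** 2 for a, b in zip(sender, receiver))
print("estimated number of dropped packets:", diff2)  # expectation ~ #drops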
A secure failure-localization path-quality-monitoring (FL-PQM) protocol allows a sender to localize faulty links on a single path through a network to a receiver, even when intermediate nodes on the path behave adversarially. Such protocols were proposed as tools that enable Internet service providers to select high-performance paths through the Internet, or to enforce contractual obligations. We give the first formal definitions of security for FL-PQM protocols, construct protocols that satisfy these definitions, and prove lower bounds on the resources (such as keys and cryptographic computation) that such protocols require.
Surveys and Theses
Randomness is indispensable in theoretical computer science, and has found progressively more applications in proofs in mathematics. Despite its usefulness, it is often desirable to reduce or eliminate randomness when possible. We will see that it is often possible to replace randomness with pseudorandom objects. We will look at several examples: expander graphs, extractors, pseudorandom generators, and error-correcting codes. We will also discuss the transference principle, which allows us to transfer properties from pseudo-random sets to their subsets.
Learning theory, and in particular PAC learning, was introduced by Valiant [CACM 1984] and has since become a major area of research in theoretical and applied computer science. One natural question that was posed at the very inception of the field is whether there are classes of functions that are hard to learn.
PAC learning is hard under widely held conjectures such as the existence of one-way functions, and on the other hand it is known that if PAC learning is hard then $\P \neq \NP$. We further study sufficient and necessary conditions for PAC learning to be hard, relating its hardness to the non-triviality of zero knowledge ($\ZK$) on one side and to the hardness of $\NP$ on the other, and showing that most standard techniques cannot establish an equivalence in either direction.
Here, ``standard techniques'' refers to various classes of efficient reductions. Together, these results imply that the hardness of PAC learning lies between the non-triviality of $\ZK$ on the one hand and the hardness of $\NP$ on the other hand. Furthermore, the hardness of PAC learning lies ``strictly'' between the two, in the sense that most standard techniques cannot prove equivalence with either $\ZK \neq \BPP$ or $\NP \neq \P$.
In proving these results, we show new connections between PAC learning and auxiliary-input one-way functions, which were defined by Ostrovsky and Wigderson [ISTCS 1993] to better understand $\ZK$. We also define new problems related to PAC learning that we believe are of independent interest and may be useful in future studies of the complexity of PAC learning.
A secure failure-localization (FL) protocol allows a sender to localize faulty links on a single path through a network to a receiver, even when intermediate nodes on the path behave adversarially. Such protocols were proposed as tools that enable Internet service providers to select high-performance paths through the Internet, or to enforce contractual obligations. We give the first formal definitions of security for FL protocols and prove that, for such protocols, security requires each intermediate node on the path to hold some shared secret information (e.g., keys), and that every black-box construction of a secure FL protocol from a random oracle requires each intermediate node to invoke the random oracle. This suggests that achieving this kind of security is unrealistic in practice, as intermediate nodes have little incentive to participate.
This thesis has two goals. First, we present an introduction to the line of work that began with the study of expander graphs in the non-constructive setting, which then led to the algebraic construction of expanders, and finally has recently produced combinatorial constructions. Second, we extend the work on combinatorial constructions with new analyses and new constructions.
Other
Previous work on estimating the entropy of written natural language has focused primarily on English. We expand this work by considering other natural languages, including Arabic, Chinese, French, Greek, Japanese, Korean, Russian, and Spanish. We present the results of PPM compression on machine-generated and human-generated translations of texts in various languages.
Under the assumption that languages are equally expressive, and that PPM compression does well across languages, one would expect translated documents to compress to approximately the same size. We verify this empirically on a novel corpus of translated documents. We suggest, as an application of this finding, using the size of compressed natural-language texts as a means of automatically testing translation quality.
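A toy version of the experiment (ours) is easy to run: compress parallel texts and compare sizes, here with bz2 as a convenient stand-in for a PPM compressor. Under the hypothesis above, a faithful translation should compress to roughly the size of the original, so a large discrepancy would flag a suspect translation.

import bz2

def compressed_size(text):
    return len(bz2.compress(text.encode("utf-8"), compresslevel=9))

# Parallel sentences (Universal Declaration of Human Rights, Art. 1),
# repeated to give the compressor something to work with.
english = ("All human beings are born free and equal in dignity and "
           "rights.") * 50
french = ("Tous les êtres humains naissent libres et égaux en dignité "
          "et en droits.") * 50

se, sf = compressed_size(english), compressed_size(french)
print(se, sf, "ratio:", round(max(se, sf) / min(se, sf), 2))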