- You are here:
- GT Home
- Home
- News & Events

Series: Dissertation Defense

Classical vertex coloring problems ask for the minimum number of colors needed
to color the vertices of a graph, such that adjacent vertices
use different colors. Vertex coloring does have quite a few practical applications
in
communication theory, industry engineering and computer science. Such examples can
be
found in the book of Hansen and Marcotte.
Deciding whether a graph is 3-colorable or not is a well-known NP-complete problem,
even for triangle-free graphs. Intutively, large girth may help reduce the chromatic
number.
However, in 1959, Erdos used the probabilitic method to prove that for any two
positive
integers g and k, there exist graphs of girth at least g and chromatic number at
least k.
Thus, restricting girth alone does not help bound the chromatic number. However, if
we
forbid certain tree structure in addition to girth restriction, then it is possible
to bound the
chromatic number. Randerath determined several such tree structures, and conjectured
that
if a graph is fork-free and triangle-free, then it is 3-colorable, where a fork is a
star K1,4
with two branches subdivided once.
The main result of this thesis is that Randerath's conjecture is true for graphs
with odd
girth at least 7. We also give an outline of a proof that Randerath's conjecture
holds for
graphs with maximum degree 4.

Series: Dissertation Defense

Series: Dissertation Defense

This dissertation investigates the statistical learning scenarios where a
high-dimensional parameter has to be estimated from a given sample of fixed
size, often smaller than the dimension of the problem.
The first part answers some open questions for the binary classification
problem in the framework of active learning.
Given a random couple (X,Y)\in R^d\times {\pm 1} with
unknown distribution P, the goal of binary classification is to predict a
label Y based on the observation X. The prediction rule is constructed
based on the observations (X_i,Y_i)_{i=1}^n sampled from P.
The concept of active learning can be informally characterized as follows:
on every iteration, the algorithm is allowed to request a label Y for any
instance X which it considers to be the most informative.
The contribution of this work consists of two parts: first, we provide the
minimax lower bounds for performance of the active learning methods under
certain assumptions. Second, we propose an active learning algorithm which
attains nearly optimal rates over a broad class of underlying distributions
and is adaptive with respect to the unknown parameters of the problem.
The second part of this work is related to sparse recovery in the framework
of dictionary learning.
Let (X,Y) be a random couple with unknown distribution P, with X
taking its values in some metric space S and Y - in a bounded subset of
R.
Given a collection of functions H={h_t}_{t\in \mb T}
mapping S to R, the goal of dictionary learning is to construct a
prediction rule for Y given by a linear (or convex) combination of the
elements of H.
The problem is sparse if there exists a good prediction rule that depends on
a small number of functions from H.
We propose an estimator of the unknown optimal prediction rule based on
penalized empirical risk minimization algorithm.
We show that proposed estimator is able to take advantage of the possible
sparse structure of the problem by providing probabilistic bounds for its
performance. Finally, we provide similar bounds in the density estimation
framework.

Series: Dissertation Defense

Series: Dissertation Defense

Series: Dissertation Defense

The primary objective of this thesis is to make a quantitative study of
complex biological networks. Our fundamental motivation is to obtain the
statistical dependency between modules by injecting external noise. To
accomplish this, a
deep study of stochastic dynamical systems would be essential. The first
part is about the stochastic dynamical system theory. The classical
estimation of invariant measures of Fokker-Planck equations is improved by
the level set method. Further, we develop a discrete Fokker-Planck-type
equation to study the discrete stochastic dynamical systems. In the second
part, we quantify systematic measures including degeneracy, complexity and
robustness. We also provide a series of results on their properties and the
connection between them. Then we apply our theory to the JAK-STAT signaling
pathway network.

Series: Dissertation Defense

Series: Dissertation Defense

The fields of statistical physics, discrete probability, combinatorics, and theoretical computer science have converged around efforts to understand random structures and algorithms. Recent activity in the interface of these fields has enabled tremendous breakthroughs in each domain and has supplied a new set of techniques for researchers approaching related problems. This thesis makes progress on several problems in this interface whose solutions all build on insights from multiple disciplinary perspectives.
First, we consider a dynamic growth process arising in the context of DNA-based self-assembly. The assembly process can be modeled as a simple Markov chain. We prove that the chain is rapidly mixing for large enough bias in regions of Z^d. The proof uses a geometric distance function and a variant of path coupling in order to handle distances that can be exponentially large. We also provide the first results in the case of fluctuating bias, where the bias can vary depending on the location of the tile, which arises in the nanotechnology application. Moreover, we use intuition from statistical physics to construct a choice of the biases for which the Markov chain M_{mon} requires exponential time to converge.
Second, we consider a related problem regarding the convergence rate of biased permutations that arises in the context of self-organizing lists. The Markov chain M_{nn} in this case is a nearest-neighbor chain that allows adjacent transpositions, and the rate of these exchanges is governed by various input parameters. It was conjectured that the chain is always rapidly mixing when the inversion probabilities are positively biased, i.e., we put nearest neighbor pair x < y in order with bias 1/2 <= p_{xy} <=
1 and out of order with bias 1-p_{xy}. The Markov chain M_{mon} was known to have connections to a simplified version of this biased card-shuffling. We provide new connections between M_{nn} and M_{mon} by using simple combinatorial bijections, and we prove that M_{nn} is always rapidly mixing for two general classes of positively biased {p_{xy}}. More significantly,
we also prove that the general conjecture is false by exhibiting values for the p_{xy}, with 1/2 <= p_{xy} <= 1 for all x < y, but for which the transposition chain will require exponential time to converge.
Finally, we consider a model of colloids, which are binary mixtures of molecules with one type of molecule suspended in another. It is believed that at low density typical configurations will be well-mixed throughout, while at high density they will separate into clusters. This clustering has proved elusive to verify, since all local sampling algorithms are known to be inefficient at high density, and in fact a new nonlocal algorithm was recently shown to require exponential time in some cases. We characterize the high and low density phases for a general family of discrete interfering binary mixtures by showing that they exhibit a "clustering property" at high density and not at low density. The clustering property states that there will be a region that has very high area, very small perimeter, and high density of one type of molecule. Special cases of interfering binary mixtures include the Ising model at fixed magnetization and independent sets.

Series: Dissertation Defense

This dissertation has two principal components: the dimension of
posets with planar cover graphs, and the cartesian product of posets
whose cover graphs have hamiltonian cycles that parse into symmetric
chains. Posets of height two can have arbitrarily large dimension.
In 1981, Kelly provided an infinite sequence of planar posets that
shows that the dimension of planar posets can also be arbitrarily
large. However, the height of the posets in this sequence increases
with the dimension. In 2009, Felsner, Li, and Trotter conjectured
that for each integer h \geq 2, there exists a least positive
integer c_h so that if P is a poset having a planar cover graph
(hence P is a planar poset as well) and the height of P is h,
then the dimension of P is at most c_h. In the first principal
component of this dissertation we prove this conjecture. We also give
the best known lower bound for c_h, noting that this lower bound is
far from the upper bound. In the second principal component, we
consider posets with the Hamiltonian Cycle--Symmetric Chain Partition
(HC-SCP) property. A poset of width w has this property if its cover
graph has a Hamiltonian cycle which parses into w symmetric chains.
This definition is motivated by a proof of Sperner's Theorem that uses
symmetric chains, and was intended as a possible method of attack on
the Middle Two Levels Conjecture. We show that the subset lattices
have the HC-SCP property by showing that the class of posets with the
strong HC-SCP property, a slight strengthening of the HC-SCP property,
is closed under cartesian product with a two-element chain.
Furthermore, we show that the cartesian product of any two posets from
this class has the HC-SCP property.

Series: Dissertation Defense

Advisor: Liang Peng

In 1988, Owen introduced empirical likelihood as a nonparametric
method for constructing confidence intervals and regions.
It is well known that empirical likelihood has several attractive advantages
comparing to its competitors such as bootstrap: determining the
shape of confidence regions automatically; straightforwardly incorporating
side information expressed through constraints; being Bartlett correctable.
In this talk, I will discuss some extensions of the empirical likelihood
method to several interesting and important statistical inference situations
including: the smoothed jackknife empirical likelihood method for the
receiver operating characteristic (ROC) curve, the smoothed empirical
likelihood method for the conditional Value-at-Risk with the volatility
model being an ARCH/GARCH model and a nonparametric regression respectively. Then, I will
propose a method for testing nested stochastic models with discrete and
dependent observations.