Series: Stochastics Seminar

We will introduce the Dunkl derivative and the Dunkl process, along with some of its properties. We will then treat its radial part, called the radial Dunkl process, and highlight its connection to the eigenvalues of some matrix-valued processes and to the so-called Brownian motions in Weyl chambers. Some open problems will be discussed at the end.

Series: Stochastics Seminar

So far, likelihood-based interval estimates for quantiles have not been studied in the literature for interval censored Case 2 data or partly interval-censored data, and in this context the use of smoothing has not been considered for any type of censored data. This article constructs smoothed weighted empirical likelihood ratio confidence intervals (WELRCI) for quantiles in a unified framework for various types of censored data, including right censored data, doubly censored data, interval censored data and partly interval-censored data. The 4th-order expansion of the weighted empirical log-likelihood ratio is derived, and the 'theoretical' coverage accuracy equation for the proposed WELRCI is established, which generally guarantees at least the 'first-order' accuracy. In particular, for right censored data, we show that the coverage accuracy is at least O(n^{-1/2}), and our simulation studies show that, in comparison with empirical likelihood-based methods, the smoothing used in WELRCI generally gives a shorter confidence interval with comparable coverage accuracy. For interval censored data, it is interesting to find that with an adjusted rate n^{-1/3}, the weighted empirical log-likelihood ratio has an asymptotic distribution completely different from that obtained by the empirical likelihood approach, and the resulting WELRCI perform favorably in available comparison simulation studies.

Series: Stochastics Seminar

Let X=(X_1,\ldots,X_n) be an n-dimensional random vector whose distribution has Markov structure corresponding to a junction forest, assuming functional forms for the marginal distributions associated with the cliques of the underlying graph. We propose a latent variable approach based on computing junction forests from filtrations. This methodology establishes connections between efficient algorithms from Computational Topology and Graphical Models, which lead to parametrizations for the space of decomposable graphs such that: i) the dimension grows linearly with respect to n, ii) they are convenient for MCMC sampling.

Series: Stochastics Seminar

Many context-free formalisms based on transitive properties of trees and strings have been converted to probabilistic models. We have Probabilistic Finite Automata, Probabilistic Context-Free Grammars, Probabilistic Tree Adjoining Grammars, and many other probabilistic models of grammars. Typically such formalisms employ context-free productions that are transitively closed. Context-free grammars can be represented declaratively through context-sensitive grammars that analyse or check the well-formedness of trees. When this direction is elaborated further, we obtain constraint-based representations for regular, context-free and mildly context-sensitive languages and their associated structures. Such representations can also be probabilistic, and this could be achieved by combining weighted rational operations and Dyck languages. More intuitively, the rational operations are packed into a new form of conditional rule: Generalized Restriction, or GR for short (Yli-Jyrä and Koskenniemi 2004), or a predicate logic over strings. The conditional rule, GR, is flexible and provides total contexts, which is very useful, e.g., when compiling rewriting rules for phonological alternations or for speech or text normalization. However, the total contexts of different conditional rewriting rules can overlap. This implies that the conditions of different rules are not independent and the probabilities do not combine as they do in the case of context-free derivations. The non-transitivity causes problems for the general use of probabilistic Generalized Restriction, e.g., when adding probabilities to phonological rewriting grammars that define regular relations.

Series: Stochastics Seminar

The limiting law of the length of the longest increasing subsequence, LI_n, for sequences (words) of length n arising from iid letters drawn from finite, ordered alphabets is studied using a straightforward Brownian functional approach. Building on the insights gained in both the uniform and non-uniform iid cases, this approach is then applied to iid countable alphabets. Some partial results associated with the extension to independent, growing alphabets are also given. Returning to the finite setting, and keeping the same Brownian formalism, a generalization is then made to words arising from irreducible, aperiodic, time-homogeneous Markov chains on a finite, ordered alphabet. At the same time, the probabilistic object, LI_n, is generalized to the shape of the associated Young tableau given by the well-known RSK correspondence. Our results on this limiting shape describe, in detail, precisely when the limiting shape of the Young tableau is (up to scaling) that of the iid case, thereby answering a conjecture of Kuperberg. These results are based heavily on an analysis of the covariance structure of an m-dimensional Brownian motion and the precise form of the Brownian functionals. Finally, in both the iid and more general Markovian cases, connections to the limiting laws of the spectrum of certain random matrices associated with the Gaussian Unitary Ensemble (GUE) are explored.
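As a concrete illustration of the quantity LI_n, the following minimal Python sketch computes the length of the longest increasing subsequence of a word over an ordered alphabet. It assumes that "increasing" is meant in the weak (non-decreasing) sense, which is the usual convention for words with repeated letters; the function name and the sample alphabet are hypothetical and not taken from the talk.

```python
from bisect import bisect_right
import random


def longest_increasing_subsequence_length(word):
    """Length of the longest weakly increasing subsequence of `word`.

    Assumption: "increasing" means non-decreasing, so repeated letters
    may all belong to the same subsequence.
    """
    # tails[k] is the smallest possible last letter of a weakly
    # increasing subsequence of length k + 1 seen so far.
    tails = []
    for letter in word:
        # bisect_right lets equal letters extend a subsequence.
        pos = bisect_right(tails, letter)
        if pos == len(tails):
            tails.append(letter)
        else:
            tails[pos] = letter
    return len(tails)


# Illustrative use: an iid uniform word of length n over m ordered letters.
rng = random.Random(0)
m, n = 3, 1000
word = [rng.randrange(m) for _ in range(n)]
li_n = longest_increasing_subsequence_length(word)
```

Since some letter occurs at least n/m times, li_n always lies between n/m and n; the talk concerns the fluctuations of this quantity as n grows.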

Series: Stochastics Seminar

Consider a class of multidimensional degenerate diffusion processes of the following form

X_t = x+\int_0^t b(X_s) ds+\int_0^t \sigma(X_s) dW_s,

Y_t = y+\int_0^t F(X_s)ds,

where b,\sigma, F are assumed to be smooth and b,\sigma bounded. Suppose now that \sigma\sigma^* is uniformly elliptic and that \nabla F does not degenerate. These assumptions guarantee that only one Poisson bracket is needed to span the whole space. We obtain a parametrix representation of McKean-Singer type for the density of (X_t,Y_t), from which we derive some explicit Gaussian controls that characterize the additional singularity induced by the degeneracy. This particular representation then allows us to give a local limit theorem with the usual convergence rate for an associated Markov chain approximation. The "weak" degeneracy allows us to use the local limit theorem in the Gaussian regime but also makes it difficult to define a suitable approximating process. In particular, two time scales appear. Another difficulty with respect to the standard literature on the topic, see e.g. Konakov and Mammen (2000), is the unboundedness of F.
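To make the structure of the system concrete, here is a minimal Python sketch of a plain Euler scheme for a one-dimensional version of (X_t, Y_t). It is only an illustration under stated assumptions: the abstract does not specify the approximating Markov chain, so this scheme need not coincide with the one studied in the talk, and the function name and the coefficient choices in the usage lines are hypothetical.

```python
import math
import random


def euler_degenerate_diffusion(x0, y0, b, sigma, F, T, n_steps, rng):
    """Simulate (X_T, Y_T) with a plain Euler scheme of step h = T / n_steps.

    X carries the Brownian noise; Y has no noise of its own and is driven
    only through the time integral of F(X), which is the degeneracy.
    """
    h = T / n_steps
    x, y = x0, y0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))  # Brownian increment over one step
        y = y + F(x) * h                   # uses X at the start of the step
        x = x + b(x) * h + sigma(x) * dw
    return x, y


# Hypothetical coefficients: b and sigma are smooth and bounded, sigma is
# bounded away from zero, and F is unbounded with a non-vanishing gradient.
rng = random.Random(0)
x_T, y_T = euler_degenerate_diffusion(
    x0=0.0, y0=0.0,
    b=lambda x: -math.tanh(x),
    sigma=lambda x: 1.0 + 0.5 * math.sin(x),
    F=lambda x: x,
    T=1.0, n_steps=1000, rng=rng,
)
```

In this sketch the X component fluctuates on the scale h^{1/2} per step, while Y moves only on the scale h, a simple way to see why two time scales enter the analysis.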

Series: Stochastics Seminar

In this presentation, interactions between spectra of classical Gaussian ensembles and subsequence problems are studied with the help of the powerful machinery of Young tableaux. For the random word problem, from an ordered finite alphabet, the shape of the associated Young tableaux is shown to converge to the spectrum of the (generalized) traceless GUE. Various properties of the (generalized) traceless GUE are established, such as a law of large numbers for the extreme eigenvalues and the convergence of the spectral measure towards the semicircle law. The limiting shape of the whole tableau is also obtained as a Brownian functional. The Poissonized word problem is finally tackled, and, with it, the convergence of the whole Poissonized tableaux is derived.

Series: Stochastics Seminar

Adaptive estimation of linear functionals occupies an important position in the theory of nonparametric function estimation. In this talk I will discuss an adaptation theory for estimation as well as for the construction of confidence intervals for linear functionals. A between class modulus of continuity, a geometric quantity, is shown to be instrumental in characterizing the degree of adaptability and in the construction of adaptive procedures in the same way that the usual modulus of continuity captures the minimax difficulty of estimation over a single parameter space. Our results thus "geometrize" the degree of adaptability.

Series: Stochastics Seminar

Spatial data are often more dispersed than would be expected if the points were independently placed. Such data can be modeled with repulsive point processes, where the points appear as if they are repelling one another. Various models have been created to deal with this phenomenon. Matérn created three algorithms that generate repulsive processes. Here, Matérn Type III processes are used to approximate the likelihood and posterior values for data. Perfect simulation methods are used to draw auxiliary variables for each spatial point that are part of the Type III process.

Series: Stochastics Seminar

Robustness of several nonparametric multivariate "threshold type" outlier identification procedures is studied, employing a masking breakdown point criterion subject to a fixed false positive rate. The procedures are based on four different outlyingness functions: the widely-used "Mahalanobis distance" version, a new one based on a "Mahalanobis quantile" function that we introduce, one based on the well-known "halfspace" depth, and one based on the well-known "projection" depth. In this treatment, multivariate location outlyingness functions are formulated as extensions of univariate versions using either "substitution" or "projection pursuit," and an equivalence paradigm relating multivariate depth, outlyingness, quantile, and centered rank functions is applied. Of independent interest, the new "Mahalanobis quantile" outlyingness function is not restricted to have elliptical contours, has a transformation-retransformation representation in terms of the well-known spatial outlyingness function, and corrects to full affine invariance the orthogonal invariance of that function. Here two special tools, also of independent interest, are introduced and applied: a notion of weak covariance functional, and a very general and flexible formulation of affine equivariance for multivariate quantile functions. The new Mahalanobis quantile function inherits attractive features of the spatial version, such as computational ease and a Bahadur-Kiefer representation. For the particular outlyingness functions under consideration, masking breakdown points are evaluated and compared within a contamination model. It is seen that for threshold type outlier identification the Mahalanobis distance and projection procedures are superior to the others, although all four procedures are quite suitable for robust ranking of points with respect to outlyingness. Reasons behind these differences are discussed, and directions for further study are indicated.