### Department of Mathematics and Systems Analysis

Aalto Stochastics and Statistics Seminar is organized by Kalle Kytölä, Lasse Leskelä, Pauliina Ilmonen, and Christian Webb. Feel free to contact one of us if you are interested in giving a talk. You may also earn credit points by active participation.

- Join stochastics@list.aalto.fi to receive seminar announcements and stay updated on probability and statistics at Aalto University

- 2.3. 15:15 Konstantin Avrachenkov (INRIA Sophia Antipolis): Hedonic coalitional game approach to network partitioning – M205
The traditional methods for detecting community structure in a network are based on selecting dense subgraphs inside the network. Here we propose to use the methods of coalitional game theory that highlight not only the link density but also the mechanisms of cluster formation. Specifically, we propose an approach which is based on hedonic coalitional games. This approach makes it possible to find clusters at various resolutions. Furthermore, the modularity-based approach and its generalizations as well as ratio cut and normalized cut methods can be viewed as particular cases of the hedonic games. Finally, for methods based on potential hedonic games we suggest a very efficient computational scheme using Gibbs sampling. Bio: Konstantin Avrachenkov received the master's degree in control theory from St. Petersburg State Polytechnic University in 1996, the Ph.D. degree in mathematics from the University of South Australia in 2000, and the Habilitation (Doctor of Science) degree from the University of Nice Sophia Antipolis in 2010. Currently, K. Avrachenkov is Director of Research at Inria Sophia Antipolis. His main research interests are Markov chains, Markov decision processes, stochastic games and singular perturbations. He applies these methodological tools to the modelling and control of telecommunication systems and to the design of data mining and machine learning algorithms. He has won 5 best paper awards. He is an Associate Editor of the International Journal of Performance Evaluation, Probability in the Engineering and Informational Sciences, ACM TOMPECS and Stochastic Models.

- 18.12.2019 16:15 Joni Virta (University of Turku): Fast tensorial independent component analysis – M1 (M232)
- 18.12.2019 15:15 Tom Claeys (Université Catholique de Louvain): Random growth, interacting particles, and Riemann-Hilbert problems: from KPZ to KdV – M1 (M232)
- 18.12.2019 14:00 Vesa Julin (University of Jyväskylä): The Gaussian isoperimetric problem for symmetric sets – M1 (M232)
- 18.12.2019 13:00 Jaron Sanders (TU Eindhoven): Markov chains for error accumulation in quantum circuits – M1 (M232)
- 18.12.2019 11:00 Kaie Kubjas (Aalto): Exact solutions in log-concave maximum likelihood estimation – M1 (M232)
- 18.12.2019 10:00 Teemu Pennanen (King's College London): Convex duality in nonlinear optimal transport – M1 (M232)
- 9.12.2019 14:15 Sami Helander (Aalto): On Adaptive functional data depths – Y313
Typically, in the functional context, data depth approaches heavily emphasize the location of the functions in the distribution, therefore often missing important shape or roughness features. Commonly, these depth approaches either integrate pointwise depth values to achieve a global value, or measure the expected distance from a function to the distribution. In this talk, we introduce a new class of functional depths, based on the distribution of depth values along the domain, and discuss their properties. We study the asymptotic properties of these $J$th order $k$th moment integrated depths, and illustrate their usefulness in supervised functional classification. In particular, we demonstrate the importance of receptivity to shape variations, and show that, similarly to existing depth notions, the new class of depth functions takes into account the variation in location, while remaining receptive to variations in shape and roughness.

- 2.12.2019 14:15 Paavo Raittinen (Aalto): On early detection of high-risk prostate cancer: applied discovery and validation models using genotype information – Y313
The prostate cancer incidence rate is extremely high and on the rise, with over 1.2 million new cases annually and 350 000 deaths in 2018. While the prognosis is typically good, approximately 20% of the new cases classify as high-risk prostate cancer with dire consequences. Moreover, an initial prostate cancer diagnosis always causes worry and impairs quality of life. The initial prostate cancer determination is based on a prostate-specific antigen (PSA) measurement, which cannot distinguish between low-risk and high-risk cases. After the PSA determination, the tumor state is characterized with various invasive methods such as Gleason score and T-stage classification. However, both methods display inaccuracy and put the patient at risk of infection. Our take on this challenge is to use inflammation-related gene single nucleotide polymorphisms (SNP) as predictors of high-risk prostate cancer. SNP is a low-cost, non-invasive, and stable biomarker. We have explored inflammation SNP association with high-risk prostate cancer in the genotyped part of the Finnish Randomized Screening for Prostate Cancer cohort (n = 2715) and found several statistically significant associations. Furthermore, our validation model using an unknown prostate cancer cohort collected during hospital visits (n = 888) is in concordance with our discovery model. Remarkably, a few SNPs increase early high-risk prostate cancer detection over PSA alone.

- 25.11.2019 14:15 Joona Karjalainen (Aalto): Modeling overlapping communities with random intersection graphs – Y313
Many real-life networks can be naturally modeled by assuming an underlying community structure on the nodes. When each node can belong to more than one community, we say that the communities overlap. This talk discusses the modeling of such networks with random intersection graphs. We review some of their asymptotic properties, such as subgraph counts, and discuss consistent moment-based parameter estimation in a sparse setting.
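To make the model in the abstract above concrete, here is a minimal sketch of sampling a random intersection graph. It assumes the binomial variant, in which each node joins each community independently with probability `p`; the talk may concern a different variant, and the function name and parameters are illustrative only.

```python
import random
from itertools import combinations


def random_intersection_graph(n_nodes, n_communities, p, seed=None):
    """Sample memberships and edges of a binomial random intersection graph."""
    rng = random.Random(seed)
    # Assumption: each node joins each community independently with probability p.
    memberships = [
        {c for c in range(n_communities) if rng.random() < p}
        for _ in range(n_nodes)
    ]
    # Two nodes are adjacent exactly when they share at least one community.
    edges = {
        (u, v)
        for u, v in combinations(range(n_nodes), 2)
        if memberships[u] & memberships[v]
    }
    return memberships, edges


memberships, edges = random_intersection_graph(200, 50, 0.05, seed=1)
print(len(edges), "edges among", len(memberships), "nodes")
```

Because a node may belong to several communities, the communities overlap, and the edge set is fully determined by the memberships.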

- 21.11.2019 15:15 Bas Lemmens (University of Kent): Horofunctions, fixed points, and illuminating the unit ball – M2 (M233)
A central problem in metric fixed point theory is to understand when a nonexpansive (i.e. Lipschitz with constant 1) self-map of a metric space has a fixed point. Even in the case where the metric space is a finite dimensional normed space, this is a subtle problem, as the map need not be a Lipschitz contraction and the space is not bounded, so neither the contraction mapping theorem nor the Brouwer fixed point theorem applies. In this talk I will give necessary and sufficient conditions for a nonexpansive map on a finite dimensional normed space to have a bounded non-empty fixed point set. Moreover, we will provide a procedure that can detect fixed points of such maps using sets that illuminate the unit ball of the normed space. We will see how horofunctions play a role in this problem. Time permitting I will also discuss some applications to stochastic games.

- 21.11.2019 10:15 Stanislav Nagy (Charles University): Geometry of multivariate quantiles – Y313
The halfspace depth is a tool of non-parametric statistics, whose main aim is a reasonable generalisation of quantiles to multivariate data. It was first proposed in 1975; its rigorous investigation started in the 1990s, and an abundance of open problems still stimulates research in the area. We present interesting links between the halfspace depth and some well-studied concepts from geometry. Using these relations we resolve several open problems concerning the depth, and outline perspectives for future research not only in non-parametric statistics, but also in certain areas of convex geometry. The talk is intended to be largely self-contained; no particular knowledge of probability and statistics is necessary.
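For readers unfamiliar with the notion, the empirical halfspace depth of a point is the smallest fraction of sample points lying in any closed halfspace whose boundary passes through that point. The sketch below approximates it for planar data by scanning a finite grid of directions. The function name and the grid size are illustrative assumptions, and a finite grid can only overestimate the exact depth.

```python
import math


def halfspace_depth_2d(x, sample, n_directions=360):
    """Approximate the empirical halfspace depth of the point x in a 2D sample."""
    best = 1.0
    for k in range(n_directions):
        angle = 2.0 * math.pi * k / n_directions
        u = (math.cos(angle), math.sin(angle))
        # Count sample points in the closed halfspace {y : <u, y - x> >= 0}.
        count = sum(
            1
            for y in sample
            if u[0] * (y[0] - x[0]) + u[1] * (y[1] - x[1]) >= 0.0
        )
        best = min(best, count / len(sample))
    return best


square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
print(halfspace_depth_2d((0.0, 0.0), square))    # central point
print(halfspace_depth_2d((10.0, 10.0), square))  # point far outside the sample
```

Central points receive high depth and outlying points low depth, which is what allows depth level sets to play the role of multivariate quantiles.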

- 20.11.2019 14:15 Jan Härkönen (Aalto University): Quantum Monte Carlo simulation of positron annihilation radiation in solids (MSc project presentation) – M3 (M234)
This project concentrates on simulating the momentum density of annihilating electron-positron pairs. We use the CASINO simulation program to optimize the wave function of a system and to simulate the momentum density using Quantum Monte Carlo methods. The simulations involve diamond, silicon and germanium FCC-lattices.

- 18.11.2019 14:15 Marko Voutilainen (Aalto): Modeling and estimation of multivariate strictly stationary processes – Y313
We discuss how discrete and continuous time multivariate stationary processes can be characterized by an AR(1) type of equation and Langevin equation, respectively. Under the assumption of finite second moments, this leads to quadratic matrix equations for the model parameter matrix that are known as continuous time algebraic Riccati equations (CAREs). Based on the equations, we define an estimator for the parameter that inherits consistency and the rate of convergence from autocovariance estimators of the (observed) stationary process. Furthermore, the limiting distribution is given by a linear function of the limit random variable of the autocovariance estimators.

- 11.11.2019 14:15 Hoa Ngo (Aalto): First passage percolation on mixed sparse random graphs with two types of nodes – Y313
A mixed graph is a graph consisting of both undirected edges and directed edges. This talk discusses first passage percolation on a connected mixed random graph with a given degree sequence, where an undirected edge is formed between type-1 nodes and a directed edge between type-1 and type-2 nodes. Weights on edges are assumed to be independent and exponentially distributed. We analyze the flooding time, which is the minimum time in which a uniformly chosen node reaches all other nodes. We derive an asymptotic formula for the flooding time as the number of nodes tends to infinity. As an application, we discuss continuous time information spreading on a random regular graph, where we also take into account the impact of passive nodes. Type-1 nodes can be interpreted as active message spreaders and type-2 nodes can be interpreted as passive receivers which may only receive the message. In this setting we derive an asymptotic formula for the flooding time, which is also called the broadcast time in the literature.
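The flooding time described above can be illustrated on a small example. The sketch below draws independent Exp(1) edge weights and computes the first passage times from a source with Dijkstra's algorithm; the flooding time is the largest of them. The toy graph, the unit rate, and the choice of the source among type-1 nodes are assumptions made for illustration, and the random graph construction of the talk is not reproduced.

```python
import heapq
import random


def flooding_time(n_nodes, undirected_edges, directed_edges, source, rng):
    """Time for a message started at `source` to reach every node."""
    # Adjacency lists of (neighbor, weight); weights are i.i.d. Exp(1) (assumed rate).
    adj = [[] for _ in range(n_nodes)]
    for u, v in undirected_edges:
        w = rng.expovariate(1.0)
        adj[u].append((v, w))
        adj[v].append((u, w))  # same weight in both directions
    for u, v in directed_edges:
        adj[u].append((v, rng.expovariate(1.0)))  # traversable from u to v only

    # Dijkstra's algorithm yields the first passage times from the source.
    dist = [float("inf")] * n_nodes
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    # The flooding time is the largest first passage time (inf if a node is unreachable).
    return max(dist)


# Hypothetical toy graph: nodes 0-2 are type 1 (spreaders), nodes 3-4 are type 2 (receivers).
rng = random.Random(0)
undirected = [(0, 1), (1, 2), (0, 2)]
directed = [(0, 3), (2, 4)]
source = rng.randrange(3)  # uniformly chosen type-1 node (an assumption)
t = flooding_time(5, undirected, directed, source, rng)
print(t)
```

Starting from a type-2 node would give an infinite flooding time here, since passive receivers have no outgoing edges.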

- 4.11.2019 14:15 Niko Lietzén (Aalto): Complex-valued latent variable models – Y405
In several fields of science, a generic problem consists of separating useful signals from uninteresting noise and interference. The problem can be approached by implementing latent variable models. In our approach, we aim to find latent processes, when only linear mixtures of them are observable. In this context, we provide an estimation procedure for complex-valued stochastic processes. Furthermore, we study the asymptotic behavior of the so-called unmixing estimators. We provide novel asymptotic theory for scenarios, when the estimators are not root-n consistent and the limiting distributions are not Gaussian.

- 28.10.2019 14:15 Jaakko Lehtomaa (University of Helsinki): On asymptotic independence and support detection techniques for heavy-tailed multivariate data – Y405
One of the central objectives of modern risk management is to find a set of risks where the probability of multiple simultaneous catastrophic events is negligible. That is, risks are taken only when their joint behavior seems sufficiently independent. Our objective is to provide additional tools for describing dependence structures of multiple risks when the individual risks can attain very large values. The study is performed in the setting of multivariate regular variation. We show how asymptotic independence is connected to properties of the support of the angular measure and present an asymptotically consistent estimator of the support. The estimator generalizes to any dimension greater than or equal to two and requires no prior knowledge of the support. The validity of the support estimate can be rigorously tested under mild assumptions by an asymptotically normal test statistic.

- 21.10.2019 14:15 Shinji Koshida (Chuo University): Coupling of multiple Schramm-Loewner evolution and Gaussian free field – Y405
It is known that Schramm-Loewner evolution (SLE) is coupled with the Gaussian free field (GFF) to give a solution to the flow line problem for an imaginary surface. I will give an overview of our recent work where we extended this coupling to the case of multiple SLE. There, we found that the SLE partition function that defines a multiple SLE and the boundary perturbation for GFF are determined essentially uniquely so that the associated multiple SLE and GFF are coupled with each other.

- 15.10.2019 10:15 David Adame-Carrillo (Universitat Politècnica de Catalunya): Towards extended Minimal Models in Conformal Field Theory – M2 (M233)
We present a physics approach to conformal field theory in two dimensions: the bootstrap approach. In this approach, one directly imposes conditions on correlation functions inspired by conformal symmetries. Within this framework, we give emphasis to the fusion rules of degenerate representations. Using fusion rules, we build a well-known set of simple models called Minimal Models. Finally, we propose an extension of them at central charge c=0.

- 8.10.2019 11:15 Lauri Viitasaari (Aalto): Stochastic heat equation revisited - quantitative approximation results – Y405
Partial differential equations (PDEs) describe many real-life phenomena, and they are a subject of active research. Recently, growing attention has been paid to stochastic versions of PDEs - stochastic partial differential equations, or SPDEs for short. Such equations arise naturally as a random shock may represent some external random force affecting the system, or possibly some measurement errors. However, in the study of SPDEs many classical approaches break down completely. Indeed, even the concept of differential is subtle - the solution being typically only Hölder continuous. Moreover, as there is a random force affecting the system, the solution is also a random object. Typically, analysing this randomness is very complicated. In this talk, we discuss d-dimensional stochastic heat equations driven by a Gaussian noise which is white in time and has a spatial covariance given by the Riesz kernel. Basic theory and properties of the solutions are discussed. As a main result, we present a quantitative central limit theorem stating that the spatial average of the solution over a Euclidean ball is close to a Gaussian distribution when the radius of the ball tends to infinity. Our central limit theorem is described in the total variation distance, using Malliavin calculus and Stein's method. We also provide a functional central limit theorem and analogous results in the case of space-time white noise. Extensions and further open questions are discussed.

- 23.9.2019 15:15 Dario Gasbarra (University of Helsinki): Stein operators for Gaussian polynomial random variables: an algebraic approach – Y405
For a standard Gaussian random variable $N$, integration by parts gives the Stein equation $E(N f(N) - Df(N)) = 0$. The Stein equation characterizes the distribution and is the key to proving quantitative limit theorems towards the Gaussian. Here we take the first steps in extending the methodology, and give an algorithm producing all the Stein differential operators with polynomial coefficients for target random variables of the form $X = p(N_1, \ldots, N_d)$, with Gaussian $N$ and polynomial $p$. This is joint work with Ehsan Azmoodeh (Bochum) and Robert Gaunt (Manchester).
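The integration by parts step behind the Stein equation can be written out as follows. This is a sketch under stated assumptions: $f$ is differentiable with $f\varphi$ vanishing at infinity, $Df$ denotes the derivative of $f$, and $\varphi$ denotes the standard Gaussian density (a symbol introduced here, not in the abstract). Since $\varphi'(x) = -x\varphi(x)$,

$$
E(N f(N)) = \int_{-\infty}^{\infty} x f(x) \varphi(x)\,dx = -\int_{-\infty}^{\infty} f(x) \varphi'(x)\,dx = \int_{-\infty}^{\infty} Df(x) \varphi(x)\,dx = E(Df(N)),
$$

so that $E(N f(N) - Df(N)) = 0$.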

- 16.9.2019 14:15 Jukka Kohonen (Aalto): Clustering, combinatorics and computation -- and some connections – Y405
- 29.8.2019 14:15 Francesco Spadaro: Constructing 2D Ising fermions with a geometrical-probabilistic approach – M3 (M234)
We will discuss a construction of correlations of discrete fermions for the two-dimensional critical FK-Ising and Ising models as expectations over geometrical configurations. The observable plays the role of a precursor for the free fermion in the Ising CFT, and it inspires the construction of CFT fields in the continuum case in terms of SLE/CLE measures.

- 22.8.2019 15:30 Kalle Kytölä: SLE random curves and conformal field theory – M3 (M234)
- 22.8.2019 15:00 Taha Ameen: Diagonalization of the 2D Ising model transfer matrix – M3 (M234)
- 22.8.2019 14:00 David Radnell: An introduction to the geometric structures underlying conformal field theory – M3 (M234)
- 22.8.2019 13:30 Christian Webb: On logarithmically correlated random fields – M3 (M234)
- 22.8.2019 11:30 Armando Gutiérrez: Elements of metric functional analysis – M3 (M234)
- 22.8.2019 11:00 Alex Karrila: On multiple SLE type scaling limits – M3 (M234)
- 25.7.2019 14:15 Vincent Beffara (Université Grenoble Alpes): Percolation for smooth 2D random fields – M3 (M234)
- 25.6.2019 11:15 Mihaela Mihaylova (Aalto U): Correlations studies of LDL-aggregation, LDL-lipidome and clinical data of bariatric surgery patients – M2 (M233)
Atherosclerotic cardiovascular disease (ASCVD), also known as coronary artery disease (CAD), is one of the leading causes of death in the world. [1] A consensus has been reached that the main cause of ASCVD is low-density lipoproteins (LDL). [2] ASCVD develops in the innermost layer of the coronary artery wall (intima). Once LDL particles enter the wall, they are retained, modified, and accumulate there. [3] There are several well-known risk factors of ASCVD, among them obesity, smoking, hypertension and LDL-cholesterol concentration in the plasma. [3] A novel approach to assessing the risk of ASCVD, however, suggests that not only the concentration but also the quality of LDL might be associated with ASCVD. [3] It shows that the susceptibility of LDL particles to aggregate (in the presence of the enzyme hrSMase) varies between humans and depends on the composition of the LDL particles. [3] The presence of aggregation-prone LDL in the plasma was found to be associated with future coronary artery disease (CAD) deaths. [3] This makes investigating LDL-aggregation further particularly important.
This master's thesis studies LDL aggregation of patients who underwent bariatric surgery - a procedure performed on people with obesity, for the purpose of weight loss. It focuses on four main points:
  - Creating a nonlinear mixed-effects model of LDL-aggregation and obtaining a single quantitative measure of LDL-aggregation.
  - Investigating whether there is a significant difference in LDL-aggregation in patients before and after bariatric surgery.
  - Studying correlations between LDL-aggregation and lipids from the LDL-lipidome, as well as correlations with clinical data of bariatric surgery patients.
  - Investigating whether there is a significant difference in the LDL-lipidome lipids and clinical parameters in the patients before and after the operation.
The presentation will discuss the progress made on the project. It will cover the following points:
  - Problem Overview: Theory and Data
  - Solution Plan
  - Step 1: Modelling of LDL-aggregation
    - Nonlinear Mixed-Effect Models
    - Modelling using the Bayesian approach
    - Modelling Problems
    - Possible Solutions
References:
  - [1] George, S. and Johnson, J. (2010). Atherosclerosis: Molecular and Cellular Mechanisms. Weinheim: Wiley-VCH-Verl.
  - [2] Ference, B. et al (2017). Low-density lipoproteins cause atherosclerotic cardiovascular disease. 1. Evidence from genetic, epidemiologic, and clinical studies. A consensus statement from the European Atherosclerosis Society Consensus Panel. European Heart Journal, 38(32), pp. 2459-2472.
  - [3] Ruuth, M. et al (2018). Susceptibility of low-density lipoprotein particles to aggregate depends on particle lipidome, is modifiable, and associates with future cardiovascular deaths. European Heart Journal, 39(27), pp. 2562-2573.

- 18.6.2019 14:15 Tatu Hyytiäinen: Changepoint detection in network activity measurement data (diploma thesis talk). – M2 (M233)
- 7.6.2019 12:15 Maximilien Dreveton (Inria Sophia Antipolis): Almost exact recovery in label spreading – M2 (M233)
In the semi-supervised graph clustering setting, an expert provides the cluster membership of a few nodes. This small amount of information allows one to achieve high accuracy clustering using efficient computational procedures. Our main goal is to provide a theoretical justification of why graph-based semi-supervised learning works very well. Specifically, for the Stochastic Block Model in the moderately sparse regime, we prove that popular semi-supervised clustering methods like Label Spreading achieve asymptotically almost exact recovery as long as the fraction of labeled points does not go to zero and the average degree goes to infinity.
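As a rough illustration of how a label spreading method propagates a few known labels over a graph, the sketch below iterates $F \leftarrow \alpha S F + (1 - \alpha) Y$ with the symmetrically normalized adjacency matrix $S = D^{-1/2} W D^{-1/2}$. This particular variant, the value of `alpha`, the iteration count, and the toy graph are assumptions made for illustration; the abstract does not specify which formulation the talk analyses.

```python
import math


def label_spreading(weights, labels, n_classes, alpha=0.9, n_iter=200):
    """Iterate F <- alpha * S F + (1 - alpha) * Y and return predicted classes.

    weights: symmetric nonnegative adjacency matrix (list of lists);
    labels: class index for labeled nodes, None for unlabeled ones.
    """
    n = len(weights)
    deg = [sum(row) for row in weights]
    # Symmetrically normalized adjacency S = D^{-1/2} W D^{-1/2}.
    s = [
        [
            weights[i][j] / math.sqrt(deg[i] * deg[j]) if weights[i][j] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    # One-hot encoding of the known labels; zero rows for unlabeled nodes.
    y = [
        [1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
        for i in range(n)
    ]
    f = [row[:] for row in y]
    for _ in range(n_iter):
        f = [
            [
                alpha * sum(s[i][j] * f[j][c] for j in range(n))
                + (1.0 - alpha) * y[i][c]
                for c in range(n_classes)
            ]
            for i in range(n)
        ]
    # Assign each node to the class with the largest score.
    return [max(range(n_classes), key=lambda c: f[i][c]) for i in range(n)]


# Hypothetical toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
w = [[0.0] * 6 for _ in range(6)]
for a, b in edges:
    w[a][b] = w[b][a] = 1.0
predicted = label_spreading(w, [0, None, None, None, None, 1], n_classes=2)
print(predicted)
```

Only one node per cluster is labeled in this toy example, and the remaining labels are inferred from the graph structure alone.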

Page content by: webmaster-math [at] list [dot] aalto [dot] fi