IT Technical reports
http://www.it.uu.se/research/publications/reports
Technical reports from the Department of Information Technology, Uppsala University, Sweden
2019-12-09T16:28:56Z
Department of Information Technology, Uppsala University, Sweden
Copyright © 2005 Department of Information Technology, Uppsala University, Sweden

Technical report 2019011: Fast Parallel Solver for the Space-Time IgA-DG Discretization of the Anisotropic Diffusion Equation
http://www.it.uu.se/research/publications/reports/2019011
2019-11-01
Pietro Benedusi, Paola Ferrari, Carlo Garoni, Rolf Krause, and Stefano Serra-Capizzano
<b>Abstract:</b> We consider the space-time discretization of the (linear) anisotropic diffusion equation, using an isogeometric analysis (IgA) approximation in space and a discontinuous Galerkin (DG) approximation in time. Drawing inspiration from an earlier spectral analysis, we propose for the resulting space-time linear system a new solution method that combines a suitably preconditioned GMRES (PGMRES) algorithm with a few iterations of an appropriate multigrid method. The performance of our new solution method is illustrated through numerical experiments, which show its competitiveness in terms of robustness, runtime, and parallel scaling.

Technical report 2019010: The Frisch Scheme: Time and Frequency Domain Aspects
http://www.it.uu.se/research/publications/reports/2019010
2019-11-01
Umberto Soverini and Torsten Söderström
<b>Abstract:</b> Several estimation methods have been proposed for identifying errors-in-variables systems, where both input and output measurements are corrupted by noise. One of the more interesting approaches is the Frisch scheme. The method can be applied using either time domain or frequency domain representations. This paper investigates the general mathematical and geometrical aspects of the Frisch scheme, illustrating the analogies and the differences between the time and frequency domain formulations.
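As a toy illustration of the Frisch scheme's core idea, here is the simplest static errors-in-variables case (not the paper's dynamic time/frequency-domain formulations): the admissible noise variances are those that make the noise-compensated covariance matrix singular, and each candidate input-noise variance then fixes both the output-noise variance and a model slope. All signals, variances, and names below are hypothetical.

```python
import random

# Static errors-in-variables toy problem: y0 = a * x0, but we observe
# x = x0 + input noise and y = y0 + output noise. The Frisch scheme seeks
# noise variances (lx, ly) making the compensated covariance singular.
random.seed(0)
a_true, var_x, var_y = 2.0, 0.25, 0.25
n = 100_000
x0 = [random.gauss(0.0, 1.0) for _ in range(n)]
x = [v + random.gauss(0.0, var_x ** 0.5) for v in x0]
y = [a_true * v + random.gauss(0.0, var_y ** 0.5) for v in x0]

def cov(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / len(u)

cxx, cyy, cxy = cov(x, x), cov(y, y), cov(x, y)

def frisch_point(lx):
    # Singularity of [[cxx - lx, cxy], [cxy, cyy - ly]] fixes ly and a slope:
    ly = cyy - cxy ** 2 / (cxx - lx)   # from det(C - diag(lx, ly)) = 0
    slope = cxy / (cxx - lx)
    return ly, slope

# Evaluated at the true input-noise variance, the point recovers
# (approximately) the true output-noise variance and the true slope.
ly_hat, a_hat = frisch_point(var_x)
print(a_hat, ly_hat)
```

Sweeping `lx` over its admissible range traces the locus of models compatible with the data, which is the geometric object the Frisch scheme studies.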

Technical report 2019009: Evaluation of Methods Handling Missing Data in PCA on Genotype Data: Applications for Ancient DNA
http://www.it.uu.se/research/publications/reports/2019009
2019-10-01
Kristiina Ausmees
<b>Abstract:</b> Principal Component Analysis (PCA) is a method of projecting data onto a basis that maximizes its variance, possibly revealing previously unseen patterns or features. PCA can be used to reduce the dimensionality of multivariate data, and is widely applied in the visualization of genetic information. In the field of ancient DNA, it is common to use PCA to show genetic affinities of ancient samples in the context of modern variation. Due to the low quality and sequence coverage often exhibited by ancient samples, such analysis is not straightforward, particularly when performing joint visualization of multiple individuals with non-overlapping sequence data. The PCA transform is based on variances of allele frequencies among pairs of individuals, and discrepancies in overlap may therefore have large effects on scores. As the relative distances between scores are used to infer genetic similarity, it is important to distinguish between the effects of the particular set of markers used and actual genetic affinities. This work addresses the problem of using an existing PCA model to estimate scores of new observations with missing data. We address the particular application of visualizing genotype data, and evaluate approaches commonly used in population genetic analyses as well as other methods from the literature. The methods considered are trimmed scores, projection to the model plane, performing PCA individually on samples and subsequently merging them using a Procrustes transformation, and the two least-squares-based methods trimmed score regression and known data regression. Using empirical ancient data, we demonstrate the use of the different methods, and show that discrepancies in the set of loci considered for different samples can have pronounced effects on estimated scores. We also present an evaluation of the methods based on modern data with varying levels of simulated sparsity, concluding that their relative performance is highly data-dependent.
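A minimal numerical sketch of two of the score-estimation methods compared in the abstract, for a fixed one-component PCA model. The loading vector and observation are invented for illustration, not taken from the report's data.

```python
import math

# One-component PCA model: unit-norm loading vector p, mean-centred data.
p = [0.6, 0.6, 0.4, 0.33]
norm = math.sqrt(sum(v * v for v in p))
p = [v / norm for v in p]

x_full = [1.0, 0.9, 0.7, 0.5]   # complete (hypothetical) observation
obs = [0, 1]                     # only the first two entries observed

# Trimmed scores: missing entries are treated as zero (i.e., the mean).
t_trs = sum(p[i] * x_full[i] for i in obs)

# Projection to the model plane: least squares on the observed entries only,
# t = argmin_t sum_{i in obs} (x_i - p_i * t)^2.
t_pmp = sum(p[i] * x_full[i] for i in obs) / sum(p[i] * p[i] for i in obs)

# Reference: the score computed from the complete observation.
t_ref = sum(pi * xi for pi, xi in zip(p, x_full))
print(t_trs, t_pmp, t_ref)
```

On this toy observation the trimmed score is shrunk toward zero by the missing entries, while the least-squares projection lands much closer to the full-data score, mirroring the kind of discrepancy the report evaluates.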

Technical report 2019008: An Empirical Evaluation of Genotype Imputation of Ancient DNA
http://www.it.uu.se/research/publications/reports/2019008
2019-10-01
Kristiina Ausmees, Federico Sanchez-Quinto, Mattias Jakobsson, and Carl Nettelblad
<b>Abstract:</b> With the sequencing of ancient DNA to high coverage often limited by sample quality or cost, imputation of missing genotypes offers a way to increase the power of inference as well as the cost-effectiveness of analyses of ancient data. However, the high degree of uncertainty often associated with ancient DNA poses several methodological challenges, and the performance of imputation methods in this context has not been fully explored. To gain further insight, we performed a systematic evaluation of imputation of ancient data using BEAGLE 4.0 and reference data from phase 3 of the 1000 Genomes project, investigating the effects of coverage, phased reference, and study sample size. Making use of five ancient samples with high-coverage data available, we evaluated imputed data with respect to accuracy, reference bias, and genetic affinities as captured by PCA. We obtained genotype concordance levels of over 99% for data with 1x coverage, and similar levels of accuracy and reference bias at coverages as low as 0.75x. Our findings suggest that using imputed data can be a realistic option for various population genetic analyses, even for data in coverage ranges below 1x. We also show that a large and varied phased reference set, as well as the inclusion of low- to moderate-coverage ancient samples, can increase imputation performance, particularly for rare alleles. In-depth analysis of imputed data with respect to genetic variants and allele frequencies gave further insight into the nature of errors arising during imputation, and can provide practical guidelines for post-processing and validation prior to downstream analysis.
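A hedged sketch of the genotype-concordance measure the evaluation reports, assuming genotypes are coded 0/1/2 as counts of the alternate allele; the toy genotype vectors below are invented, not the report's data.

```python
# Genotype concordance: fraction of sites where the imputed genotype
# matches the high-coverage "truth" genotype (0/1/2 coding assumed).
truth   = [0, 1, 2, 0, 1, 1, 2, 0, 0, 1]
imputed = [0, 1, 2, 0, 1, 0, 2, 0, 0, 1]

matches = sum(t == i for t, i in zip(truth, imputed))
concordance = matches / len(truth)
print(f"genotype concordance: {concordance:.0%}")  # → genotype concordance: 90%
```

The single discordant site here (truth 1, imputed 0) is also an example of reference bias: the error pulls the genotype toward the reference allele, which is one of the effects the report analyzes separately from raw accuracy.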

Technical report 2019007: Performance of an OO Compute Kernel on the JVM: Revisiting Java as a Language for Scientific Computing Applications (Extended Version)
http://www.it.uu.se/research/publications/reports/2019007
2019-09-01
Malin Källén and Tobias Wrigstad
<b>Abstract:</b> The study of Java as a programming language for scientific computing is warranted by the prospect of simpler, more extensible, and more easily maintainable code. Previous work on refactoring a C++ scientific computing code base to follow best practices of object-oriented software development revealed a coupling between such practices and considerable slowdowns due to the indirections introduced by abstractions. In this paper, we explore how Java's JIT compiler handles such abstraction-induced indirection, using a typical scientific computing compute kernel extracted from a linear solver written in C++. We find that the computation times for large workloads on one machine can be on par for C++ and Java. However, for distributed computations, a better parallelisation strategy needs to be found for non-blocking communication. We also report on the performance impact of common "gripes": garbage collection, array bounds checking, and dynamic binding.

Technical report 2019006: Frequency Domain Identification of FIR Models from Noisy Input-Output Data
http://www.it.uu.se/research/publications/reports/2019006
2019-08-01
Umberto Soverini and Torsten Söderström
<b>Abstract:</b> This paper describes a new approach for identifying FIR models from a finite number of measurements, in the presence of additive and uncorrelated white noise. In particular, two different frequency domain algorithms are proposed. The first algorithm is based on some theoretical results concerning the dynamic Frisch scheme. The second algorithm maps the FIR identification problem into a quadratic eigenvalue problem. Both methods resemble, in many respects, other identification algorithms originally developed in the time domain. The features of the proposed methods are compared with each other, and with those of some time domain algorithms, by means of Monte Carlo simulations.
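To see why ordinary least squares is not enough in this setting, here is a hypothetical 2-tap FIR system with white noise on both input and output (tap values and noise levels invented, and none of this is the paper's algorithms): a naive time-domain least-squares fit using the noisy input as regressor is biased toward zero, which is the problem the Frisch-scheme and eigenvalue-based methods address.

```python
import random

# Errors-in-variables FIR setup: y0(t) = b0*x0(t) + b1*x0(t-1), with both
# the input and the output observed in additive white noise.
random.seed(1)
b0_true, b1_true, n = 1.0, 0.5, 200_000
x0 = [random.gauss(0.0, 1.0) for _ in range(n)]              # noise-free input
x = [v + random.gauss(0.0, 0.5) for v in x0]                 # measured input
y = [0.0] + [b0_true * x0[t] + b1_true * x0[t - 1]
             + random.gauss(0.0, 0.5) for t in range(1, n)]  # measured output

# Naive least squares with the *noisy* regressors (x[t], x[t-1]):
s00 = sum(x[t] ** 2 for t in range(1, n)) / (n - 1)
s01 = sum(x[t] * x[t - 1] for t in range(1, n)) / (n - 1)
s11 = sum(x[t - 1] ** 2 for t in range(1, n)) / (n - 1)
r0 = sum(x[t] * y[t] for t in range(1, n)) / (n - 1)
r1 = sum(x[t - 1] * y[t] for t in range(1, n)) / (n - 1)
det = s00 * s11 - s01 ** 2
b0_ls = (s11 * r0 - s01 * r1) / det
b1_ls = (s00 * r1 - s01 * r0) / det

# Input-noise variance 0.25 on a unit-variance input attenuates the
# estimates by roughly 1/1.25: toward (0.8, 0.4) instead of (1.0, 0.5).
print(b0_ls, b1_ls)
```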

Technical report 2019005: Block Generalized Locally Toeplitz Sequences: Theory and Applications in the Multidimensional Case
http://www.it.uu.se/research/publications/reports/2019005
2019-07-01
Giovanni Barbarino, Carlo Garoni, and Stefano Serra-Capizzano
<b>Abstract:</b> In computational mathematics, when dealing with a large linear discrete problem (e.g., a linear system) arising from the numerical discretization of a partial differential equation (PDE), knowledge of the spectral distribution of the associated matrix has proved to be useful information for designing/analyzing appropriate solvers (especially preconditioned Krylov and multigrid solvers) for the considered problem. Actually, this spectral information is also of interest in itself whenever the eigenvalues of the aforementioned matrix represent physical quantities of interest, which is the case for several problems from engineering and the applied sciences (e.g., the study of natural vibration frequencies in an elastic material). The theory of multilevel generalized locally Toeplitz (GLT) sequences is a powerful apparatus for computing the asymptotic spectral distribution of matrices A_n arising from virtually any kind of numerical discretization of PDEs. Indeed, when the mesh-fineness parameter n tends to infinity, these matrices A_n give rise to a sequence {A_n}_n, which often turns out to be a multilevel GLT sequence or one of its "relatives", i.e., a multilevel block GLT sequence or a (multilevel) reduced GLT sequence. In particular, multilevel block GLT sequences are encountered in the discretization of systems of PDEs as well as in the higher-order finite element or discontinuous Galerkin approximation of scalar/vectorial PDEs. In this work, we systematically develop the theory of multilevel block GLT sequences as an extension of the theories of (unilevel) GLT sequences [GLTbookI], multilevel GLT sequences [GLTbookII], and block GLT sequences [bg]. We also present several emblematic applications of this theory in the context of PDE discretizations.

Technical report 2019004: Block Generalized Locally Toeplitz Sequences: Theory and Applications in the Unidimensional Case
http://www.it.uu.se/research/publications/reports/2019004
2019-07-01
Giovanni Barbarino, Carlo Garoni, and Stefano Serra-Capizzano
<b>Abstract:</b> In computational mathematics, when dealing with a large linear discrete problem (e.g., a linear system) arising from the numerical discretization of a differential equation (DE), knowledge of the spectral distribution of the associated matrix has proved to be useful information for designing/analyzing appropriate solvers (especially preconditioned Krylov and multigrid solvers) for the considered problem. Actually, this spectral information is also of interest in itself whenever the eigenvalues of the aforementioned matrix represent physical quantities of interest, which is the case for several problems from engineering and the applied sciences (e.g., the study of natural vibration frequencies in an elastic material). The theory of generalized locally Toeplitz (GLT) sequences is a powerful apparatus for computing the asymptotic spectral distribution of matrices A_n arising from virtually any kind of numerical discretization of DEs. Indeed, when the mesh-fineness parameter n tends to infinity, these matrices A_n give rise to a sequence {A_n}_n, which often turns out to be a GLT sequence or one of its "relatives", i.e., a block GLT sequence or a reduced GLT sequence. In particular, block GLT sequences are encountered in the discretization of systems of DEs as well as in the higher-order finite element or discontinuous Galerkin approximation of scalar/vectorial DEs. This work is a review, refinement, extension, and systematic exposition of the theory of block GLT sequences. It also includes several emblematic applications of this theory in the context of DE discretizations.
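The notion of asymptotic spectral distribution can be illustrated with the simplest unilevel scalar Toeplitz example, which is standard textbook material rather than anything specific to this report: the tridiagonal matrix from the second-order discretization of -u'' has eigenvalues that are exact samples of its symbol.

```python
import math

# Symbol of T_n = tridiag(-1, 2, -1): f(theta) = 2 - 2*cos(theta).
def symbol(theta):
    return 2.0 - 2.0 * math.cos(theta)

n = 1000
# Closed-form eigenvalues of tridiag(-1, 2, -1) of size n are exactly
# f evaluated at the grid points k*pi/(n+1), k = 1..n.
eigs = [symbol(k * math.pi / (n + 1)) for k in range(1, n + 1)]

# Spectral distribution statement: the proportion of eigenvalues below a
# level matches the proportion of the symbol's range below that level.
level = symbol(math.pi / 2)   # f(theta) <= f(pi/2) iff theta <= pi/2
frac_eigs = sum(e <= level for e in eigs) / n
print(frac_eigs)              # close to 0.5, half of [0, pi] maps below
```

For general discretizations the eigenvalues are no longer exact symbol samples, but the GLT machinery recovers the same kind of distributional description in the limit n → ∞.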

Technical report 2019003: Minimizing Replay under Way-Prediction
http://www.it.uu.se/research/publications/reports/2019003
2019-05-01
Ricardo Alves, Stefanos Kaxiras, and David Black-Schaffer
<b>Abstract:</b> Way-predictors are effective at reducing dynamic cache energy by reducing the number of ways accessed, but introduce additional latency for incorrect way-predictions. While previous work has studied the impact of the increased latency for incorrect way-predictions, we show that the latency variability has a far greater effect, as it forces replay of in-flight instructions on an incorrect way-prediction. To address the problem, we propose a solution that learns the confidence of the way-prediction and dynamically disables it when it is likely to mispredict. We further improve this approach by biasing the confidence to reduce latency variability further, at the cost of fewer way-predictions. Our results show that instruction replay in a way-predictor reduces IPC by 6.9% due to 10% of the instructions being replayed. Our confidence-based way-predictor degrades IPC by only 2.9% by replaying just 3.4% of the instructions, reducing the way-predictor's cache energy overhead (compared to a serial-access cache) from 8.5% to 1.9%.
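A hypothetical sketch of the kind of mechanism the abstract describes, not the paper's actual design: a small saturating confidence counter gates the way-prediction, and a misprediction penalizes confidence heavily (the "biasing"), so the predictor falls back to reading all ways in parallel (no replay risk) until confidence rebuilds. Counter width, threshold, and the last-way predictor below are all illustrative.

```python
class ConfidentWayPredictor:
    def __init__(self, max_conf=3, threshold=2):
        self.conf = max_conf          # saturating confidence counter
        self.max_conf = max_conf
        self.threshold = threshold    # below this, prediction is disabled
        self.predicted_way = 0        # simple last-way predictor

    def access(self, actual_way):
        """Return (used_prediction, replayed) for one cache access."""
        use = self.conf >= self.threshold
        correct = self.predicted_way == actual_way
        if correct:
            self.conf = min(self.max_conf, self.conf + 1)
        else:
            self.conf = 0             # heavy penalty: disable prediction
            self.predicted_way = actual_way
        # A replay is needed only when the prediction was used and was wrong.
        return use, use and not correct

pred = ConfidentWayPredictor()
trace = [0, 3, 0, 3, 0, 3, 0, 3]      # pathological alternating ways
replays = sum(pred.access(way)[1] for way in trace)
print(replays)  # → 1; an ungated last-way predictor would replay 7 times
```

On this alternating trace the confidence gate replays once and then stays disabled, trading away-prediction energy savings for stable latency, which is exactly the trade-off the abstract quantifies.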

Technical report 2019002: Block Generalized Locally Toeplitz Sequences: Theory and Applications
http://www.it.uu.se/research/publications/reports/2019002
2019-04-01
C. Garoni and S. Serra-Capizzano
<b>Abstract:</b> When dealing with a large linear system arising from the numerical discretization of a differential equation (DE), knowledge of the spectral distribution of the associated matrix has proved to be useful information for designing/analyzing appropriate solvers (especially preconditioned Krylov and multigrid solvers) for the considered system. The theory of generalized locally Toeplitz (GLT) sequences is a powerful apparatus for computing the asymptotic spectral distribution of matrices A_n arising from virtually any kind of numerical discretization of DEs. Indeed, when the mesh-fineness parameter n tends to infinity, these matrices A_n give rise to a sequence {A_n}_n, which often turns out to be a GLT sequence or one of its "relatives", i.e., a block GLT sequence or a reduced GLT sequence. In particular, block GLT sequences are encountered in the discretization of systems of DEs as well as in the higher-order finite element or discontinuous Galerkin approximation of scalar/vectorial DEs. This work is a review, refinement, extension, and systematic exposition of the theory of block GLT sequences. It also includes several emblematic applications of this theory in the context of DE discretizations.

Technical report 2018014: Delorean: Virtualized Directed Profiling for Cache Modeling in Sampled Simulation
http://www.it.uu.se/research/publications/reports/2018014
2018-12-01
Nikos Nikoleris, Erik Hagersten, and Trevor E. Carlson
<b>Abstract:</b> Current practice for accurate and efficient simulation (e.g., SMARTS and SimPoint) makes use of sampling to significantly reduce the time needed to evaluate new research ideas. By evaluating a small but representative portion of the original application, sampling can allow for both fast and accurate performance analysis. However, as the cache sizes of modern architectures grow, simulation time is dominated by warming microarchitectural state rather than by detailed simulation, reducing overall simulation efficiency. While checkpoints can significantly reduce cache warming and thereby improve efficiency, they limit the flexibility of the system under evaluation, requiring new checkpoints for software updates (such as changes to the compiler and compiler flags) and for many types of hardware modifications. An ideal solution would allow for accurate cache modeling for each simulation run without the need to generate rigid checkpointing data a priori. Enabling this new direction for fast and flexible simulation requires a combination of (1) a methodology that allows for hardware and software flexibility and (2) the ability to quickly and accurately model arbitrarily-sized caches. Current approaches that rely on checkpointing or statistical cache modeling require rigid, up-front state to be collected, which needs to be amortized over a large number of simulation runs. These earlier methodologies are insufficient for our goal of improved flexibility. In contrast, our proposed methodology, Delorean, outlines a unique solution to this problem. The Delorean simulation methodology enables both flexibility and accuracy by quickly generating a targeted cache model for the next detailed region on the fly, without the need for up-front simulation or modeling. More specifically, we propose a new, more accurate statistical cache modeling method that takes advantage of hardware virtualization to precisely determine the memory regions accessed and to minimize the time needed for data collection while maintaining accuracy. Delorean uses a multi-pass approach to understand the memory regions accessed by the next, upcoming detailed region. Our methodology collects the entire set of key memory accesses and, through fast virtualization techniques, progressively scans larger, earlier regions to learn more about these key accesses in an efficient way. Using these techniques, we demonstrate that Delorean allows for the fast evaluation of systems and their software through the generation of accurate cache models on the fly. Delorean outperforms previous proposals by an order of magnitude, with a simulation speed of 150 MIPS and a similar average CPI error (below 4%).
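This is not Delorean itself, but a minimal illustration of statistical cache modeling of the general kind such methodologies build on: one pass over a memory trace yields LRU stack distances, from which the hit count of a fully-associative LRU cache of any chosen size can be read off. The trace addresses and sizes are hypothetical.

```python
def lru_hits(trace, cache_lines):
    """Hits in a fully-associative LRU cache, via stack distances."""
    stack = []                    # most-recently-used address at the end
    hits = 0
    for addr in trace:
        if addr in stack:
            # Stack distance: distinct addresses touched since the last
            # access to addr (inclusive). A hit iff it fits in the cache.
            depth = len(stack) - stack.index(addr)
            if depth <= cache_lines:
                hits += 1
            stack.remove(addr)
        stack.append(addr)
    return hits

trace = [1, 2, 3, 1, 2, 3, 4, 1]
print(lru_hits(trace, cache_lines=3), lru_hits(trace, cache_lines=4))  # → 3 4
```

Because the stack distances themselves do not depend on the cache size, the same single pass can model arbitrarily-sized caches, which is the property the abstract's "arbitrarily-sized caches" requirement alludes to.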

Technical report 2018013: How to Extend the Application Scope of GLT-Sequences
http://www.it.uu.se/research/publications/reports/2018013
2018-11-01
Stanislav Morozov, Stefano Serra-Capizzano, and Eugene Tyrtyshnikov
<b>Abstract:</b> In this paper we address the problem of finding the distribution of eigenvalues and singular values of matrix sequences. The main focus of this paper is the spectral distribution of matrix sequences arising in the discretization of PDEs. In the last two decades, the theory of GLT-sequences has been developed to address this problem. We investigate the possibility of applying GLT-theory to discretizations of PDEs on non-rectangular domains and show that in many cases the present GLT-theory is insufficient. We also propose a generalization of GLT-sequences that enables one to cope with a wide range of PDE discretization problems defined on polygonal domains.

Technical report 2018012: Eigenvalue Isogeometric Approximations Based on B-splines: Tools and Results
http://www.it.uu.se/research/publications/reports/2018012
2018-07-01
Sven-Erik Ekström and Stefano Serra-Capizzano
<b>Abstract:</b> In this short note we consider the spectral analysis of large matrices coming from the numerical approximation of the eigenvalue problem (a(x)u'(x))' = λ b(x) u(x), x ∈ (0,1), where u(0) and u(1) are given, using isogeometric methods based on B-splines. We give precise estimates for the extremal eigenvalues and global distributional results. The techniques involve dyadic decomposition arguments, GLT analysis, and basic extrapolation methods.
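For context, and as standard Galerkin material rather than anything specific to the report's analysis: discretizing such a weighted second-order eigenvalue problem with a B-spline basis B_1, …, B_n leads to a generalized matrix eigenvalue problem whose matrix pair is what the GLT analysis studies as n grows,

```latex
\[
  K\,\mathbf{c} = \lambda\, M\,\mathbf{c},
  \qquad
  K_{ij} = \int_0^1 a(x)\, B_j'(x)\, B_i'(x)\, dx,
  \qquad
  M_{ij} = \int_0^1 b(x)\, B_j(x)\, B_i(x)\, dx,
\]
```

where K is the (weighted) stiffness matrix and M the (weighted) mass matrix assembled from the B-spline basis.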

Technical report 2018011: Nonlinear System Identification of the Dissolved Oxygen to Effluent Ammonium Dynamics in an Activated Sludge Process
http://www.it.uu.se/research/publications/reports/2018011
2018-06-01
Tatiana Chistiakova, Per Mattsson, Bengt Carlsson, and Torbjörn Wigren
<b>Abstract:</b> Aeration of biological reactors in wastewater treatment plants is important for achieving a high removal of soluble organic matter as well as for nitrification, but requires a significant amount of energy. It is hence important to control the aeration rate, for example by ammonium feedback control. The goal of this report is to model the dynamics from the set point of an existing dissolved oxygen controller to effluent ammonium, using two types of system identification methods for a Hammerstein model, including a newly developed recursive variant. The models are estimated and evaluated using noise-corrupted data from a complex mechanistic model (Activated Sludge Model no. 1). The performance of the estimated nonlinear models is compared with that of an estimated linear model, and it is shown that the nonlinear models give a significantly better fit to the data. The resulting models may be used for adaptive control (using the recursive Hammerstein variant), gain-scheduling control, L2 stability analysis, and model-based fault detection.
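A minimal sketch of the Hammerstein structure the report identifies: a static nonlinearity followed by linear dynamics. The polynomial nonlinearity and first-order dynamics below are hypothetical, not the report's estimated model.

```python
def hammerstein(u, a=0.8, c1=1.0, c2=-0.3):
    """y(t) = a*y(t-1) + f(u(t-1)), with static nonlinearity
    f(u) = c1*u + c2*u**2 feeding first-order linear dynamics."""
    y = [0.0]
    for t in range(1, len(u)):
        w = c1 * u[t - 1] + c2 * u[t - 1] ** 2   # static nonlinearity
        y.append(a * y[-1] + w)                   # linear dynamic block
    return y

# Step response of the toy model (step in the input at t = 1):
u = [0.0, 1.0, 1.0, 1.0, 1.0]
ys = hammerstein(u)
print([round(v, 3) for v in ys])  # → [0.0, 0.0, 0.7, 1.26, 1.708]
```

Because the nonlinearity is static and the dynamics linear, the model can be estimated with linear-regression-style machinery once the nonlinearity is parameterized, which is what makes recursive (adaptive) variants of the identification tractable.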