Uppsala Architecture Research Team

The Uppsala Architecture Research Team is a multi-disciplinary research group that works on a broad range of challenges in computer architecture, including microarchitecture, memory systems, compilers, security, power efficiency, simulation and modeling, runtime optimizations, co-design, and distributed systems.

Professor Stefanos Kaxiras

(PhD Wisconsin) worked at Bell Labs before coming to Uppsala. His research interests include memory consistency models, coherence, and microarchitecture with an emphasis on security and (reducing) speculation.

Professor David Black-Schaffer

(PhD Stanford) worked at Apple before coming to Uppsala. His research interests include runtime scheduling and memory system design.

Assistant Professor Yuan Yao

(PhD Royal Institute of Technology, Stockholm) has research interests in Network on Chip (NoC) and Non-Von-Neumann architectures.

Assistant Professor Chang Hyun Park

(PhD KAIST) conducts research on the virtual memory system on both the architecture and systems sides.

Professor (Emeritus) Erik Hagersten

(PhD Royal Institute of Technology, Stockholm) was the chief server architect at Sun Microsystems before coming to Uppsala. His research interests include efficient memory system designs and modeling.

Postdocs and Visiting Researchers

Associate Professor Magnus Själander
Postdoc Researcher Peter Munch

Graduate Students

Per Ekemark
Johan Janzén
Hassan Muhammad
Marina Shimchenko
Pavlos Aimoniotis
Alireza Haddadi
Rashid Aligholipour
Ahmed Nematallah
Mehmetali Semi Yenimol
Xiaoyue Chen
Shiming Li
George Stoian
Hannah Atmer

UART Research Group photo 2019

Projects

Body Operating System

Challenge: Power efficient and secure computation in an in-body processing system. Link to project page: BOS

Efficient Processors

Challenge: Making general purpose processors more efficient.
Results: Offloading instructions to simpler schedulers to reduce scheduling cost (ICCD2018, HPCA2019, DATE2019, HPCA2020); caching in the pipeline (ISCA2019).

Security and Speculation

Challenge: Building processors that are secure by design; reducing our reliance on speculation without losing its performance advantages.
Results: Understanding speculative shadows to reduce the impact of reduced speculation (ISCA2019); hiding speculative effects (CF2019); non-speculative techniques to reorder memory accesses (ISCA2017, IEEE Micro Top Picks 2018, ISCA2018, MICRO2018); compiler-orchestrated software out-of-order execution on in-order cores (PACT2016 SRC-Bronze medal, CGO2017, PLDI2018, Best of CAL 2017, TransOnComputers2018 - Featured article of the month); limited-speculation cores (ISCA2015).

Compiling for Power Efficiency

Challenge: Co-designing the hardware and compiler to maximize efficiency.
Results: Decoupling access and execute to improve DVFS (ICS2013, CGO2014, CC2016 Best Paper, HIP3ES2016, HIP3ES2017).

Smart Memory Systems

Challenge: Understanding where and when data is needed to reduce the energy consumed in moving it and the time wasted waiting for it.
Results: Direct-to-data cache designs that avoid searches (MICRO2013, ISCA2014, MICRO2015, HPCA2018); intelligent policies for placing data based on reuse for CPUs (ICCD2016, SBAC-PAD2017, ICS2019) and GPUs (IISWC2017).

Scheduling

Challenge: Matching the heterogeneous behavior of tasks and applications to heterogeneous hardware for performance.
Results: CPU and GPU task analysis and modeling (JParallelComputing2018, ISPASS2018); GPU co-execution (SBAC-PAD).

Complexity-Effective Coherence

Challenge: Creating novel coherence protocols to enable highly efficient multi/many-core systems and software shared memory implementations.
Results: Application-driven, highly efficient VIPS family of protocols (PACT2012, ISCA2013, ISCA2015, HPCA2015); ArgoDSM distributed shared memory system (HPDC2015); Racer TSO: data-race-detection coherence, transparent to software (MICRO2016, IEEE Micro Top Picks 2017 honorable mention); compiler-assisted cache coherence (IPDPS2015, TPDS2016, CGO2017, CCPR2017, TPDS2018).

Previous Projects

Modeling

Challenge: Using low-overhead profile information to quickly model memory system behavior and performance.
Results: Architecturally independent models for memory systems (CGO2012, IISWC2012) and performance (ISPASS2015); resource-sharing performance profiling (CGO2013, PACT2012).

Software Optimization for Memory Systems

Challenge: Automatic software-based cache bypassing and prefetching without hurting co-execution on multicores.
Results: Adaptive software bypassing (HPCA2013) and prefetching (PACT2015).

Startups

Eta Scale AB works to commercialize memory coherence technology for both efficient scalable hardware implementations and software distributed shared memory. (Active)

Green Cache AB took the Direct-to-Data memory system technology and worked with clients to investigate the energy-savings potential in their future mobile SoCs. (IP purchased)

Acumem AB developed the StatCache statistical memory modeling technology into the ThreadSpotter turn-key tool to help developers identify and fix memory system related issues in their software. (Sold to Rogue Wave)

Alumni (and first job)

PhD Alumni

Christos Sakalis (PhD 2021, IAR, Sweden)
Mehdi Alipour (PhD 2020, Ericsson, Sweden)
Kim-Anh Tran (PhD 2020, Google, Germany)
Ricardo Alves (PhD 2019, Intel, USA)
Nikos Nikoleris (PhD 2019, ARM, UK)
Germán Ceballos (PhD 2018, Ericsson, Sweden)
Magnus Norgren (Swedish Patent Office)
Andreas Sembrant (PhD 2017, Nvidia, USA)
Mahdad Davari (PhD 2017, Ericsson, Sweden)
Muneeb Khan (PhD 2016, Ericsson, Sweden)
Moncef Mechri (IMC, Netherlands)
Vasileios Spiliopoulos (ZeroPoint, Sweden)
Konstantinos Koukos (PhD 2016, KTH, Sweden)
Andreas Sandberg (PhD 2014, ARM, UK)
David Eklöv (PhD 2011, Samsung, USA)
Håkan Zeffer (PhD 2006, Sun Microsystems, USA)
Henrik Löf (PhD 2006, Stanford University, USA)
Erik Berg (PhD 2005, Xelerated, Sweden)
Martin Karlsson (PhD 2006, Sun Microsystems, USA)
Dan Wallin (PhD 2006, Virtutech, Sweden)
Zoran Radovic (PhD 2005, Sun Microsystems, USA)

Licentiate Alumni

Gustaf Borgström (Lic 2022, IAR, Sweden)

Postdoc Alumni

Dr. Anirban Nag (Huawei, Switzerland)
Dr. Mihail Popov (Huawei, UK)
Professor Rakesh Kumar (NTNU, Norway)
Dr. Gregory Vaumourin (Atos, France)
Dr. Andra Hugo (DDN Storage, France)
Professor Trevor Carlson (NUS, Singapore)
Professor Magnus Själander (NTNU, Norway)
Professor Alberto Ros (University of Murcia, Spain)
Dr. Nina Shariati (Uppsala University, Sweden)

Recent Publications

A First Exploration of Fine-Grain Coherence for Integrity Metadata. Per Ekemark, Alberto Ros, Konstantinos Sagonas, and Stefanos Kaxiras. In 2024 INTERNATIONAL SYMPOSIUM ON SECURE AND PRIVATE EXECUTION ENVIRONMENT DESIGN, SEED 2024, pp 62-72, IEEE Computer Society, 2024. (DOI).
JANUS: A Simple and Efficient Speculative Defense using Reinforcement Learning. Pavlos Aimoniotis and Stefanos Kaxiras. In , Hilo, HI, USA, 2024.
Mark-Scavenge: Waiting for Trash to Take Itself Out. Jonas Norlinder, Erik Osterlund, David Black-Schaffer, and Tobias Wrigstad. In Proceedings of the ACM on Programming Languages, volume 8, number OOPSLA2, Association for Computing Machinery (ACM), 2024. (DOI, Fulltext, fulltext:print).
Mutator-Driven Object Placement using Load Barriers. Jonas Norlinder, Albert Mingkun Yang, David Black-Schaffer, and Tobias Wrigstad. In MPLR 2024: Proceedings of the 21st ACM SIGPLAN International Conference on Managed Programming Languages and Runtimes, Association for Computing Machinery (ACM), 2024. (DOI, Fulltext).
Doppelganger Loads: A Safe, Complexity-Effective Optimization for Secure Speculation Schemes. Amund Bergland Kvalsvik, Pavlos Aimoniotis, Stefanos Kaxiras, and Magnus Själander. In ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture, Conference Proceedings Annual International Symposium on Computer Architecture, Association for Computing Machinery (ACM), New York, NY, 2023. (DOI, fulltext:print).
Exploring the Latency Sensitivity of Cache Replacement Policies. Ahmed Nematallah, Chang Hyun Park, and David Black-Schaffer. In IEEE Computer Architecture Letters, volume 22, number 2, pp 93-96, Institute of Electrical and Electronics Engineers (IEEE), 2023. (DOI, fulltext:postprint).
Faster Functional Warming with Cache Merging. Gustaf Borgström, Christian Rohner, and David Black-Schaffer. In PROCEEDINGS OF SYSTEM ENGINEERING FOR CONSTRAINED EMBEDDED SYSTEMS, DRONESE AND RAPIDO 2023, pp 39-47, Association for Computing Machinery (ACM), 2023. (DOI).
Game-of-Life Temperature-Aware DVFS Strategy for Tile-Based Chip Many-Core Processors. Yuan Yao. In IEEE Journal on Emerging and Selected Topics in Circuits and Systems, volume 13, number 1, pp 58-72, Institute of Electrical and Electronics Engineers (IEEE), 2023. (DOI).
How addresses are made. Xiaoyue Chen, Pavlos Aimoniotis, and Stefanos Kaxiras. In 2023 IEEE International Symposium on Workload Characterization, IISWC, International Symposium on Workload Characterization Proceedings, pp 223-225, IEEE, 2023. (DOI).
Large-scale Graph Processing on Commodity Systems: Understanding and Mitigating the Impact of Swapping. Alireza Haddadi, David Black-Schaffer, and Chang Hyun Park. In The International Symposium on Memory Systems (MEMSYS '23), pp 1-11, Association for Computing Machinery (ACM), 2023. (DOI, Fulltext, fulltext:print).
Protean: Resource-efficient Instruction Prefetching. Muhammad Hassan, Chang Hyun Park, and David Black-Schaffer. In The International Symposium on Memory Systems (MEMSYS '23), pp 1-13, Association for Computing Machinery (ACM), 2023. (DOI, Fulltext, fulltext:print).
ReCon: Efficient Detection, Management, and Use of Non-Speculative Information Leakage. Pavlos Aimoniotis, Amund Bergland Kvalsvik, Xiaoyue Chen, Magnus Själander, and Stefanos Kaxiras. In 56th IEEE/ACM International Symposium on Microarchitecture, MICRO 2023, pp 828-842, Association for Computing Machinery (ACM), 2023. (DOI, Fulltext, fulltext:print).
SE-CNN: Convolution Neural Network Acceleration via Symbolic Value Prediction. Yuan Yao. In IEEE Journal on Emerging and Selected Topics in Circuits and Systems, volume 13, number 1, pp 73-85, Institute of Electrical and Electronics Engineers (IEEE), 2023. (DOI).
Silent Stores in the Battery-less Internet of Things: A Good Idea?. Weining Song, Stefanos Kaxiras, Luca Mottola, Thiemo Voigt, and Yuan Yao. In , 2023.
Speculative inter-thread store-to-load forwarding in SMT architectures. Josue Feliu, Alberto Ros, Manuel E. Acacio, and Stefanos Kaxiras. In Journal of Parallel and Distributed Computing, volume 173, pp 94-106, Elsevier, 2023. (DOI, Fulltext).
Analysing software prefetching opportunities in hardware transactional memory. Marina Shimchenko, Rubén Titos-Gil, Ricardo Fernández-Pascual, Manuel E. Acacio, Stefanos Kaxiras, Alberto Ros, and Alexandra Jimborean. In Journal of Supercomputing, volume 78, number 1, pp 919-944, Springer Nature, 2022. (DOI).
Clueless: A Tool Characterising Values Leaking as Addresses. Xiaoyue Chen, Pavlos Aimoniotis, and Stefanos Kaxiras. In Proceedings of the 11th International Workshop on Hardware and Architectural Support for Security And Privacy, HASP 2022, pp 27-34, Association for Computing Machinery (ACM), 2022. (DOI, Fulltext, fulltext:print).
Data-Out Instruction-In (DOIN!): Leveraging Inclusive Caches to Attack Speculative Delay Schemes. Pavlos Aimoniotis, Amund Bergland Kvalsvik, Magnus Själander, and Stefanos Kaxiras. In 2022 IEEE International Symposium on Secure and Private Execution Environment Design (SEED 2022), pp 49-60, Institute of Electrical and Electronics Engineers (IEEE), 2022. (DOI).
Delay-on-Squash: Stopping Microarchitectural Replay Attacks in Their Tracks. Christos Sakalis, Stefanos Kaxiras, and Magnus Själander. In ACM Transactions on Architecture and Code Optimization (TACO), volume 20, number 1, Association for Computing Machinery (ACM), 2022. (DOI, Fulltext, fulltext:print).
Dependence-aware Slice Execution to Boost MLP in Slice-out-of-order Cores. Rakesh Kumar, Mehdi Alipour, and David Black-Schaffer. In ACM Transactions on Architecture and Code Optimization (TACO), volume 19, number 2, Association for Computing Machinery (ACM), 2022. (DOI).
Every Walk's a Hit: Making Page Walks Single-Access Cache Hits. Chang Hyun Park, Ilias Vougioukas, Andreas Sandberg, and David Black-Schaffer. In Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’22), February 28 – March 4, 2022, Lausanne, Switzerland, Association for Computing Machinery (ACM), 2022. (DOI, Fulltext, fulltext:postprint, fulltext:print).
Faster Functional Warming with Cache Merging. Gustaf Borgström, Christian Rohner, and David Black-Schaffer. 2022. (fulltext).
Free Atomics: Hardware Atomic Operations without Fences. Ashkan Asgharzadeh, Juan M. Cebrian, Arthur Perais, Stefanos Kaxiras, and Alberto Ros. In PROCEEDINGS OF THE 2022 THE 49TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA '22), Conference Proceedings Annual International Symposium on Computer Architecture, pp 14-26, Association for Computing Machinery (ACM), 2022. (DOI).
Splash-4: A Modern Benchmark Suite with Lock-Free Constructs. Eduardo José Gómez-Hernández, Juan M. Cebrian, Stefanos Kaxiras, and Alberto Ros. In 2022 IEEE International Symposium on Workload Characterization (IISWC), Proceedings of the IEEE International Symposium on Workload Characterization, pp 51-64, Institute of Electrical and Electronics Engineers (IEEE), 2022. (DOI).
Supporting Dynamic Translation Granularity for Hybrid Memory Systems. Bokyeong Kim, Soojin Hwang, Sanghoon Cha, Chang Hyun Park, Jongse Park, and Jaehyuk Huh. In 2022 IEEE 40th International Conference on Computer Design (ICCD), Proceedings IEEE International Conference on Computer Design, pp 25-32, Institute of Electrical and Electronics Engineers (IEEE), 2022. (DOI).
A Reusable Characterization of the Memory System Behavior of SPEC2017 and SPEC2006. Muhammad Hassan, Chang Hyun Park, and David Black-Schaffer. In ACM Transactions on Architecture and Code Optimization (TACO), volume 18, number 2, Association for Computing Machinery (ACM), 2021. (DOI).
Do Not Predict – Recompute!: How Value Recomputation Can Truly Boost the Performance of Invisible Speculation. Christos Sakalis, Zamshed I. Chowdhury, Shayne Wadle, Ismail Akturk, Alberto Ros, Magnus Själander, Stefanos Kaxiras, and Ulya R. Karpuzcu. In 2021 International Symposium on Secure and Private Execution Environment Design (SEED), pp 89-100, Institute of Electrical and Electronics Engineers (IEEE), 2021. (DOI).
Early Address Prediction: Efficient Pipeline Prefetch and Reuse. Ricardo Alves, Stefanos Kaxiras, and David Black-Schaffer. In ACM Transactions on Architecture and Code Optimization (TACO), volume 18, number 3, Association for Computing Machinery (ACM), 2021. (DOI, Fulltext, fulltext:print).
Efficient, Distributed, and Non-Speculative Multi-Address Atomic Operations. Eduardo Jose Gomez-Hernandez, Juan M. Cebrian, Ruben Titos-Gil, Stefanos Kaxiras, and Alberto Ros. In Proceedings of 54th Annual IEEE/ACM International Symposium on Microarchitecture, Micro 2021, International Symposium on Microarchitecture Proceedings, pp 337-349, Association for Computing Machinery (ACM), 2021. (DOI).
ITSLF: Inter-Thread Store-to-Load Forwarding in Simultaneous Multithreading. Josue Feliu, Alberto Ros, Manuel E. Acacio, and Stefanos Kaxiras. In Proceedings of 54th Annual IEEE/ACM International Symposium on Microarchitecture, Micro 2021, International Symposium on Microarchitecture Proceedings, pp 1296-1308, Association for Computing Machinery (ACM), 2021. (DOI).
Reorder Buffer Contention: A Forward Speculative Interference Attack for Speculation Invariant Instructions. Pavlos Aimoniotis, Christos Sakalis, Magnus Själander, and Stefanos Kaxiras. In IEEE COMPUTER ARCHITECTURE LETTERS, volume 20, number 2, pp 162-165, Institute of Electrical and Electronics Engineers (IEEE), 2021. (DOI).
Seeds of SEED: Preventing Priority Inversion in Instruction Scheduling to Disrupt Speculative Interference. Christos Sakalis, Magnus Själander, and Stefanos Kaxiras. In 2021 International Symposium on Secure and Private Execution Environment Design (SEED), pp 101-107, Institute of Electrical and Electronics Engineers (IEEE), 2021. (DOI).
Splash-4: Improving Scalability with Lock-Free Constructs. Eduardo Jose Gomez-Hernandez, Ruixiang Shao, Christos Sakalis, Stefanos Kaxiras, and Alberto Ros. In 2021 IEEE International Symposium On Performance Analysis Of Systems And Software (ISPASS 2021), pp 235-236, Institute of Electrical and Electronics Engineers (IEEE), 2021. (DOI).
TSOPER: Efficient Coherence-Based Strict Persistency. Per Ekemark, Yuan Yao, Alberto Ros, Konstantinos Sagonas, and Stefanos Kaxiras. In 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), International Symposium on High-Performance Computer Architecture : Proceedings, pp 125-138, Institute of Electrical and Electronics Engineers (IEEE), 2021. (DOI).

Full UART publications list.

Teaching and Recruiting

Please see Teaching and Applying for a PhD and MS Thesis Projects for more information.

Group History

The Uppsala Architecture Research Team was founded in 1999 when Professor Erik Hagersten (PhD from the Royal Institute of Technology) moved back to Sweden from his position as chief server architect at Sun Microsystems. For the first 10 years, UART did pioneering work in statistical cache modeling, leading to a successful commercialization of the technology. Professor Stefanos Kaxiras (PhD from Wisconsin) joined the group in 2010, moving from the University of Patras in Greece and bringing extensive experience in power efficiency and coherency. Professor David Black-Schaffer (PhD from Stanford) also joined in 2010, bringing heterogeneous runtime experience from his work on OpenCL at Apple. Professors Hagersten, Black-Schaffer, and Kaxiras, together with PhD student Andreas Sembrant, successfully commercialized their work in direct-to-data memory systems in the company Green Cache AB, whose IP was purchased in 2018. Associate Professor Alexandra Jimborean (PhD from University of Strasbourg) joined in 2012, bringing experience in compile-time and run-time code analysis and optimization. Since then the group has grown to include multiple PhD students and postdocs.