Publications

Conferences and Journals

HPCA 2025 (Top-tier conference)
Marching Page Walks: Batching and Concurrent Page Table Walks for Enhancing GPU Throughput
Jiwon Lee, Gun Ko, Myung Kuk Yoon, Ipoom Jeong, Yunho Oh, and Won Woo Ro
The 31st IEEE International Symposium on High-Performance Computer Architecture (HPCA-31), Las Vegas, Nevada, USA, 2025.

IEEE ESL (Major journal in embedded systems)
TLP Balancer: Predictive Thread Allocation for Multi-Tenant Inference in Embedded GPUs
Minseong Gil, Jaebeom Jeon, Junsu Kim, Sangun Choi, Gunjae Koo, Myung Kuk Yoon, and Yunho Oh
Accepted at IEEE Embedded Systems Letters.

ICPP 2024 (Major conference in computer architecture)
VitBit: Enhancing Embedded GPU Performance for AI Workloads through Register Operand Packing
Jaebeom Jeon, Junsu Kim, Jaeyong Park, Gunjae Koo, Myung Kuk Yoon, and Yunho Oh
The 53rd International Conference on Parallel Processing (ICPP 2024), Gotland, Sweden, 2024.

IEEE Access (SCIE)
SAVector: Vectored Systolic Arrays
Sangun Choi, Seongjun Park, Jaeyong Park, Jongmin Kim, Gunjae Koo, Seokin Hong, Myung Kuk Yoon, and Yunho Oh
IEEE Access, vol. 12, pp. 44446-44461, 2024.

JSA (Top journal in computer architecture)
Conflict-Aware Compiler for Hierarchical Register File on GPUs
Eunbi Jeong, Eun Seong Park, Gunjae Koo, Yunho Oh*, and Myung Kuk Yoon*
Accepted at Journal of Systems Architecture.

IEEE ESL (Major journal in embedded systems)
Adaptive Kernel Merge and Fusion for Multi-Tenant Inference in Embedded GPUs
Jaebeom Jeon, Gunjae Koo, Myung Kuk Yoon, and Yunho Oh
Accepted at IEEE Embedded Systems Letters.

MICRO 2023 (Top-tier conference)
MAD MAcce: Supporting Multiply-Add Operations for Democratizing Matrix-Multiplication Accelerator
Seunghwan Sung, Sujin Hur, Sungwoo Kim, Dongho Ha, Yunho Oh, and Won Woo Ro
The 56th IEEE/ACM International Symposium on Microarchitecture (MICRO 2023), Toronto, Canada, 2023.

ICPP 2023 (Major conference in computer architecture)
Warped-MC: An Efficient Memory Controller Scheme for Massively Parallel Processors
Jonghyun Jeong, Yunho Oh, Myung Kuk Yoon, and Gunjae Koo
The 52nd International Conference on Parallel Processing (ICPP 2023), Salt Lake City, Utah, 2023.

ISCA 2023 (Top-tier conference)
R2D2: Removing ReDunDancy Utilizing Linearity of Address Generation in GPUs
Dongho Ha, Yunho Oh, and Won Woo Ro
The 50th ACM/IEEE International Symposium on Computer Architecture (ISCA-50), Orlando, Florida, 2023.

ISCA 2023 (Top-tier conference)
Imprecise Store Exceptions
Siddharth Gupta, Yuanlong Li, Qingxuan Kang, Abhishek Bhattacharjee, Babak Falsafi, Yunho Oh, and Mathias Payer
The 50th ACM/IEEE International Symposium on Computer Architecture (ISCA-50), Orlando, Florida, 2023.

HPCA 2023 (Top-tier conference)
AstriFlash: A Flash-Based System for Online Services
Siddharth Gupta, Yunho Oh, Lei Yan, Mark Sutherland, Abhishek Bhattacharjee, Babak Falsafi, and Peter Hsu
The 29th IEEE International Symposium on High-Performance Computer Architecture (HPCA-29), Montreal, Quebec, Canada, 2023.

HPCA 2023 (Top-tier conference)
SnakeByte: A TLB Design with Adaptive and Recursive Page Merging in GPUs
Jiwon Lee, Ju Min Lee, Yunho Oh, William Song, and Won Woo Ro
The 29th IEEE International Symposium on High-Performance Computer Architecture (HPCA-29), Montreal, Quebec, Canada, 2023.

ACM TACO (Major journal in computer architecture)
Scale-Out Systolic Arrays
Ahmet Caner Yüzügüler, Canberk Sönmez, Mario Drumond, Yunho Oh, Babak Falsafi, and Pascal Frossard
ACM Transactions on Architecture and Code Optimization, vol. 20, issue 2, no. 27, pp. 1-25, Mar. 2023.

IEEE TC (Top journal in computer architecture)
FLIXR: Embedding Index into Flash Translation Layer in SSDs
Gunjae Koo*, Yunho Oh*, Hung-Wei Tseng, Won Woo Ro, and Murali Annavaram
IEEE Transactions on Computers, vol. 72, no. 3, pp. 250-263, Jan. 2023. (*Equal contribution as first authors.)

IEEE ESL (Major journal in embedded systems)
CASH-RF: A Compiler-Assisted Hierarchical Register File in GPUs
Yunho Oh, Ipoom Jeong, Won Woo Ro, and Myung Kuk Yoon
IEEE Embedded Systems Letters, vol. 14, pp. 187-190, Dec. 2022.

IEEE Access (SCIE)
Analyzing GCN Aggregation on GPU
Inje Kim, Jonghyun Jeong, Yunho Oh, Myung Kuk Yoon, and Gunjae Koo
IEEE Access, vol. 10, pp. 113046-113060, Nov. 2022.

IEEE Access (SCIE)
GhostLeg: Selective Memory Coalescing for Secure GPU Architecture
Jongmin Lee, Seungho Jung, Taewon Suh, Yunho Oh, Myung Kuk Yoon, and Gunjae Koo
IEEE Access, vol. 10, pp. 111449-111462, Nov. 2022.

IEEE Access (SCIE)
TEA-RC: Thread Context-Aware Register Cache for GPUs
Ipoom Jeong, Yunho Oh, Won Woo Ro, and Myung Kuk Yoon
IEEE Access, vol. 10, pp. 82049-82062, Aug. 2022.

ISCA 2021 (Top-tier conference)
Rebooting Virtual Memory with Midgard
Siddharth Gupta, Atri Bhattacharyya, Yunho Oh, Abhishek Bhattacharjee, Babak Falsafi, and Mathias Payer
The 48th ACM/IEEE International Symposium on Computer Architecture, virtual conference, 2021.

MICRO 2020 (Top-tier conference)
Duplo: Lifting Redundant Memory Accesses of Deep Neural Networks for GPU Tensor Cores
Hyeonjin Kim, Sungwoo Ahn, Yunho Oh, Bogil Kim, Won Woo Ro, and William J. Song
The 53rd IEEE/ACM International Symposium on Microarchitecture, virtual conference, 2020.

ISCA 2019 (Top-tier conference)
Linebacker: Preserving Victim Cache Lines in Idle Register Files of GPUs
Yunho Oh, Gunjae Koo, Murali Annavaram, and Won Woo Ro
The 46th ACM/IEEE International Symposium on Computer Architecture, Phoenix, AZ, USA, 2019.

IEEE TC (Top journal in computer architecture)
Adaptive Cooperation of Prefetching and Warp Scheduling on GPUs
Yunho Oh, Keunsoo Kim, Myung Kuk Yoon, Jong Hyun Park, Yongjun Park, Murali Annavaram, and Won Woo Ro
IEEE Transactions on Computers, vol. 68, no. 4, pp. 609-616, Apr. 2019.

MICRO 2018 (Top-tier conference)
FineReg: Augmenting GPU Throughput via Fine-Grained Register File Management
Yunho Oh, Myung Kuk Yoon, William J. Song, and Won Woo Ro
The 51st ACM/IEEE International Symposium on Microarchitecture, Fukuoka, Japan, 2018.

IEEE TC (Top journal in computer architecture)
WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs
Yunho Oh, Myung Kuk Yoon, Jong Hyun Park, Yongjun Park, and Won Woo Ro
IEEE Transactions on Computers, vol. 67, no. 9, pp. 1366-1373, Sept. 2018.

IEEE TPDS (Top journal in computer architecture)
Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs
Myung Kuk Yoon, Yunho Oh, Sangpil Lee, Seung Hun Kim, Deokho Kim, and Won Woo Ro
IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 11, pp. 3142-3156, Nov. 2017.

ISCA 2017 (Top-tier conference)
Access Pattern-Aware Cache Management for Improving Data Utilization in GPU
Gunjae Koo, Yunho Oh, Won Woo Ro, and Murali Annavaram
The 44th ACM/IEEE International Symposium on Computer Architecture, Toronto, ON, Canada, 2017.

ISCA 2016 (Top-tier conference)
APRES: Improving Cache Efficiency by Exploiting Load Characteristics on GPUs
Yunho Oh, Keunsoo Kim, Myung Kuk Yoon, Jong Hyun Park, Yongjun Park, Won Woo Ro, and Murali Annavaram
The 43rd ACM/IEEE International Symposium on Computer Architecture, Seoul, Korea, 2016.

ISPASS 2015 (Major conference)
DRAW: Investigating Benefits of Adaptive Fetch Group Size on GPU
Myung Kuk Yoon, Yunho Oh, Sangpil Lee, Seung Hun Kim, Deokho Kim, and Won Woo Ro
The 2015 IEEE International Symposium on Performance Analysis of Systems and Software, Philadelphia, PA, USA, 2015.

ITC-CSCC 2015
Improving Pipeline Utilization with Two-Level Instruction Issue on GPUs
Yunho Oh, Jong Hyun Park, and Won Woo Ro
The 30th International Technical Conference on Circuits/Systems, Computers and Communications, Seoul, Korea, 2015.

KIISE
Introduction to Researches on Performance Bottlenecks of Many-Core GPU Architectures
Yunho Oh, Myung Kuk Yoon, Jong Hyun Park, and Won Woo Ro
Communications of KIISE, vol. 32, no. 5, May 2014.

IJPP (SCIE)
GPU-Friendly Parallel Genome Matching with Tiled Access and Reduced State Transition Table
Yunho Oh, Doohwan Oh, and Won W. Ro
International Journal of Parallel Programming, vol. 41, no. 4, pp. 526-551, Aug. 2013.

ICEIC 2010
Multi-Threaded Filtered BackProjection Algorithm on Multi-Core Processors
Yun H. Oh and Won W. Ro
The 10th International Conference on Electronics, Information, and Communication, Cebu, Philippines, 2010.

ISMRM 2010 (Top-tier conference in medical imaging)
Accelerated Reconstruction Using Parallel Computing for Spiral Spectroscopic Imaging
Dong H. Kim, Yun H. Oh, Yun H. Nam, M. Gu, and Won W. Ro
International Society for Magnetic Resonance in Medicine Annual Meeting, Stockholm, Sweden, 2010.

ELEX (SCIE)
Hardware Implementation of a Tessellation Accelerator for the OpenVG Standard
Seung Hun Kim, Yunho Oh, Karam Park, and Won W. Ro
IEICE Electronics Express, vol. 7, no. 6, pp. 440-446, Mar. 2010.

Workshops

MLArchSys
Accuracy Boosters: Epoch-Driven Mixed-Mantissa Block Floating Point for DNN Training
Simla Burcu Harma, Ayan Chakraborty, Babak Falsafi, Martin Jaggi, and Yunho Oh
ML for Computer Architecture and Systems co-located with ISCA 2023.

SPMA
AstriFlash: An Online Flash-Based Memory Hierarchy
Siddharth Gupta, Yunho Oh, Lei Yan, Mark Sutherland, Abhishek Bhattacharjee, Babak Falsafi, and Peter Hsu
The 10th Workshop on Systems for Post-Moore Architectures co-located with EuroSys 2020.

HENND
Accelerating Neural Network with Selective Thread-Level Parallelism Regulation and Cache Bypassing on GPUs
Kwanghee Chang, Yunho Oh, Myung Kuk Yoon, and Won Woo Ro
International Workshop on Highly Efficient Neural Networks Design in conjunction with ESWEEK 2017.

Patents

Central processing unit, GPU simulation method thereof, and computing system including the same
Won Woo Ro, Karam Park, Yunho Oh, Sangpil Lee, and Minwoo Kim
US Patent 9,378,533.

Operation device of convolutional neural network, operation method of convolutional neural network and computer program stored in a recording medium to execute the method thereof
William Jinho Song, Won Woo Ro, Hyeonjin Kim, Sungwoo Ahn, Yunho Oh, and Bogil Kim
US Patent 17/752,235.