Optimize byte swapping routines in memory_generic.cc · xenia-project/xenia#308

(0 comments) (0 reactions) (0 assignees) C++ (7,418 stars) (1,077 forks) batch import

cpu, good first issue

Description

AVX intrinsics and unrolled loops could help swap large chunks of memory much faster.

Tech stack: cpp
Domain: performance
Issue type: performance
Difficulty: 3
Estimated time: 1-3 hours
Activity status: stale
Clarity: clear
Prerequisites: C++ programming, SIMD intrinsics (AVX), understanding of byte swapping
Beginner friendliness: 30
Research directions: Investigate the current byte swapping implementation in memory_generic.cc. Research AVX intrinsics for 128/256-bit byte swaps. Look into unrolled loop patterns to improve throughput. Test the optimized version for correctness and performance.
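As a starting point for the research directions above, here is a minimal sketch of a vectorized copy-and-swap for 32-bit words. It is not taken from memory_generic.cc: the names `copy_and_swap_32_sketch` and `swap32_scalar` are hypothetical, and the routine's signature is an assumption. Note that the 256-bit byte shuffle `_mm256_shuffle_epi8` requires AVX2 rather than plain AVX (the 128-bit `_mm_shuffle_epi8` requires SSSE3), so the vector path is guarded by `__AVX2__` and the code falls back to an unrolled scalar loop when that macro is not defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

// Scalar reference: reverse the four bytes of one 32-bit word.
static inline uint32_t swap32_scalar(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Hypothetical helper (not xenia's API): copies `count` 32-bit words from
// `src` to `dest`, byte-swapping each word. Buffers must not overlap.
void copy_and_swap_32_sketch(uint32_t* dest, const uint32_t* src,
                             size_t count) {
  assert(count == 0 || (dest != nullptr && src != nullptr));
  size_t i = 0;
#if defined(__AVX2__)
  // _mm256_shuffle_epi8 shuffles bytes independently within each 128-bit
  // half, so the same 16-byte pattern is repeated for both halves. Each
  // group of four indices reverses the bytes of one 32-bit word.
  const __m256i mask = _mm256_setr_epi8(
      3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12,
      3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12);
  // Process 8 words (32 bytes) per iteration with unaligned loads/stores.
  for (; i + 8 <= count; i += 8) {
    __m256i v =
        _mm256_loadu_si256(reinterpret_cast<const __m256i*>(src + i));
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(dest + i),
                        _mm256_shuffle_epi8(v, mask));
  }
#endif
  // Scalar loop unrolled by 4; handles everything when AVX2 is unavailable.
  for (; i + 4 <= count; i += 4) {
    dest[i + 0] = swap32_scalar(src[i + 0]);
    dest[i + 1] = swap32_scalar(src[i + 1]);
    dest[i + 2] = swap32_scalar(src[i + 2]);
    dest[i + 3] = swap32_scalar(src[i + 3]);
  }
  // Tail: remaining 0-3 words.
  for (; i < count; ++i) {
    dest[i] = swap32_scalar(src[i]);
  }
}
```

The unaligned load/store intrinsics keep the sketch safe for any pointer alignment; if the buffers are known to be 32-byte aligned, the aligned variants may be worth measuring. Comparing the output against `swap32_scalar` word by word is a simple correctness check before benchmarking.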