avcodec/mips: [loongson] optimize put_hevc_qpel_hv_8 with mmi.

Optimize put_hevc_qpel_hv_8 with mmi for widths 4/8/12/16/24/32/48/64.
This optimization improves HEVC decoding performance by about 11% (1.81x to 2.01x, tested on Loongson 3A3000).

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Shiyou Yin
2019-01-21 18:10:24 +08:00
committed by Michael Niedermayer
parent 8133921ad2
commit 6d19164811
4 changed files with 240 additions and 10 deletions

@@ -250,6 +250,15 @@
: "memory" \
);
/**
* brief: Transpose 2X2 word packaged data.
* fr_i0, fr_i1: src
* fr_o0, fr_o1: dst
*/
#define TRANSPOSE_2W(fr_i0, fr_i1, fr_o0, fr_o1) \
"punpcklwd "#fr_o0", "#fr_i0", "#fr_i1" \n\t" \
"punpckhwd "#fr_o1", "#fr_i0", "#fr_i1" \n\t"
/**
* brief: Transpose 4X4 half word packaged data.
* fr_i0, fr_i1, fr_i2, fr_i3: src & dst