h264/aarch64: optimize neon loop filter
author Janne Grunau <janne-libav@jannau.net>
Tue, 1 Jan 2019 21:37:11 +0000 (22:37 +0100)
committer Janne Grunau <janne-libav@jannau.net>
Sat, 26 Jan 2019 11:05:10 +0000 (12:05 +0100)
commit 846c3d6aca5484904e60946c4fe8b8833bc07f92
tree 45d1953156d38d627bb0328a41725cd238526ec3
parent d7f4f5c4a18a0c9e62635cfa6fe8a9302b413c01
h264/aarch64: optimize neon loop filter

Exit as soon as possible if no filtering will be done.
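
The early-exit idea can be sketched in plain C. This is an illustration only, not the NEON code from this commit: the function name h_loop_filter_luma_sketch, its return value, and the helpers are hypothetical, and the filter arithmetic is reduced to the p0/q0 update of the normal H.264 luma filter (the p1/q1 updates are omitted). It assumes pix points at q0 of the first of 16 rows and that tc0 holds one value per group of four rows, with a negative value meaning "do not filter".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Clamp v to the inclusive range [lo, hi]. */
static int clip(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* H.264 sample tests for one row; pix points at q0, p0 is pix[-1]. */
static int edge_needs_filter(const uint8_t *pix, int alpha, int beta)
{
    int p1 = pix[-2], p0 = pix[-1], q0 = pix[0], q1 = pix[1];
    return abs(p0 - q0) < alpha && abs(p1 - p0) < beta && abs(q1 - q0) < beta;
}

/* Hypothetical sketch: filters a vertical edge over 16 rows and
 * returns the number of rows modified (0 means an early exit). */
int h_loop_filter_luma_sketch(uint8_t *pix, ptrdiff_t stride,
                              int alpha, int beta, const int8_t *tc0)
{
    int y, any = 0, filtered = 0;

    assert(pix != NULL && tc0 != NULL);

    /* Early exit 1: the thresholds or tc0 values rule out filtering.
     * The AND of four values is negative only if all four are negative. */
    if (alpha == 0 || beta == 0)
        return 0;
    if ((tc0[0] & tc0[1] & tc0[2] & tc0[3]) < 0)
        return 0;

    /* Early exit 2: no row passes the sample tests, so skip the
     * filter arithmetic and the stores entirely. */
    for (y = 0; y < 16 && !any; y++)
        any = tc0[y >> 2] >= 0 &&
              edge_needs_filter(pix + y * stride, alpha, beta);
    if (!any)
        return 0;

    for (y = 0; y < 16; y++, pix += stride) {
        int p2, p1, p0, q0, q1, q2, tc, delta;

        if (tc0[y >> 2] < 0 || !edge_needs_filter(pix, alpha, beta))
            continue;

        p2 = pix[-3]; p1 = pix[-2]; p0 = pix[-1];
        q0 = pix[0];  q1 = pix[1];  q2 = pix[2];

        /* Simplified: only p0 and q0 are updated here.
         * Assumes >> on a negative int is an arithmetic shift. */
        tc    = tc0[y >> 2] + (abs(p2 - p0) < beta) + (abs(q2 - q0) < beta);
        delta = clip(((q0 - p0) * 4 + (p1 - q1) + 4) >> 3, -tc, tc);
        pix[-1] = (uint8_t)clip(p0 + delta, 0, 255);
        pix[0]  = (uint8_t)clip(q0 - delta, 0, 255);
        filtered++;
    }
    return filtered;
}
```

In a SIMD version the second check would presumably operate on the combined comparison mask for all rows at once rather than on a scalar loop, so a single test of the mask decides whether the rest of the function runs.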

Improves the checkasm --bench cycle count on a Snapdragon 820e (before -> after):
h264_h_loop_filter_luma_8bpp_c:      72.4 ->  72.5
h264_h_loop_filter_luma_8bpp_neon:   97.1 ->  56.3
h264_v_loop_filter_luma_8bpp_c:     174.0 -> 173.5
h264_v_loop_filter_luma_8bpp_neon:   62.9 ->  60.9
h264_h_loop_filter_chroma_8bpp_c:    30.2 ->  30.3
h264_h_loop_filter_chroma_8bpp_neon: 51.6 ->  25.7
h264_v_loop_filter_chroma_8bpp_c:    57.3 ->  57.3
h264_v_loop_filter_chroma_8bpp_neon: 28.0 ->  24.0
libavcodec/aarch64/h264dsp_neon.S