1839 Commits

Author SHA1 Message Date
Michael Niedermayer
27f8d38682 avcodec/x86/mpegvideodsp: Fix signedness bug in need_emu
Fixes: out of array read
Fixes: 3516/attachment-311488.dat

Found-by: Insu Yun, Georgia Tech.
Tested-by: wuninsu@gmail.com
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 58cf31cee7a456057f337b3102a03206d833d5e8)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-16 02:31:20 +01:00
Michael Niedermayer
c5e5599010 avcodec/me_cmp: Fix crashes on ARM due to misalignment
Adds a diff_pixels_unaligned()

Fixes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=872503

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit bc488ec28aec4bc91ba47283c49c9f7f25696eaa)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-08-23 13:15:18 +02:00
Ronald S. Bultje
c277b24173 videodsp: fix 1-byte overread in top/bottom READ_NUM_BYTES iterations.
This can overread (either before start or beyond end) of the buffer in
Nx1 (i.e. height=1) images.

Fixes mozilla bug 1240080.

(cherry picked from commit 0f88b3f82fafd536979993aeaafcb11a22266dbd)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-08-23 13:15:16 +02:00
Michael Niedermayer
71fc26403f avcodec/x86/sbrdsp: Fix using uninitialized upper 32bit of noise
Fixes crash
Fixes: flicker-1.scout3d21443372922.28.m4a

Found-by: Dale Curtis <dalecurtis@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 1b82b934a166e60f64e966eaa97512ba9dcb615b)

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-06 12:40:49 +01:00
Ronald S. Bultje
d3af86c867 videodsp: don't overread edges in vfix3 emu_edge.
Fixes trac ticket 3226. Also see Andreas' analysis in
https://bugs.debian.org/801745, which was very helpful.
(cherry picked from commit 52f84d82bdf1851ecfcc412c1719e5f6f3396209)
2015-10-25 01:20:42 +02:00
Michael Niedermayer
ff02eeafd8 avcodec/x86/h264_weight: handle weight1=128
Fix ticket4596

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
(cherry picked from commit e1009665759d4a3938dd2dd07b7e84d8bc9c5290)
2015-06-19 11:25:38 +02:00
Michael Niedermayer
95cf5e83a7 Merge commit '4dc0fbb13c33b4e5bdb766652f4daf900ccc952f' into release/2.4
* commit '4dc0fbb13c33b4e5bdb766652f4daf900ccc952f':
  x86: cavs: Remove an unneeded scratch buffer

Conflicts:
	libavcodec/x86/cavsdsp.c

See: d79f7bf0d63a81ee66026ee92a6946a7303d04bd
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-28 22:40:53 +02:00
Michael Niedermayer
e4e64f2fea avcodec/x86/cavsdsp: remove unneeded tmp
This is faster and simpler as well

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
(cherry picked from commit d79f7bf0d63a81ee66026ee92a6946a7303d04bd)

Conflicts:

	libavcodec/x86/cavsdsp.c
2015-05-28 22:40:23 +02:00
Michael Niedermayer
4dc0fbb13c x86: cavs: Remove an unneeded scratch buffer
Simplifies the code and makes it build on certain compilers
running out of registers on x86.

CC: libav-stable@libav.org
Reported-By: mudler
(cherry picked from commit e4610300de6869bd6b3b00e76cfeabb6d7653dcd)
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-05-28 18:42:30 +02:00
Michael Niedermayer
b8d3c3ea86 Merge commit '2af720fe5f0418612a8fc26b0147a0e10414fcbe' into release/2.4
* commit '2af720fe5f0418612a8fc26b0147a0e10414fcbe':
  x86: Put COPY3_IF_LT under HAVE_6REGS

Conflicts:
	libavcodec/x86/mathops.h

See: b38910c9790253b362839042a17e13252c1d4b90
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-05-19 20:39:54 +02:00
Luca Barbato
2af720fe5f x86: Put COPY3_IF_LT under HAVE_6REGS
It uses 6 registers, unbreaks building on hardened x86 system.

Bug-Id: gentoo/541930
CC: libav-stable@libav.org
2015-05-19 12:04:41 +01:00
Michael Niedermayer
88c06ca251 avcodec/x86/mlpdsp_init: Simplify mlp_filter_channel_x86()
Based on patch by Francisco Blas Izquierdo Riera
Commit message partly taken from carl

fixes a compilation
error in mlpdsp_init.c with -fstack-check and some gcc compilers (I
reproduced the issue with gcc 4.7.3) by simplifying the code.

See also https://bugs.gentoo.org/show_bug.cgi?id=471756

$ make libavcodec/x86/mlpdsp_init.o
libavcodec/x86/mlpdsp_init.c: In function ‘mlp_filter_channel_x86’:
libavcodec/x86/mlpdsp_init.c:142:5: error: can’t find a register in
class ‘GENERAL_REGS’ while reloading ‘asm’
libavcodec/x86/mlpdsp_init.c:142:5: error: ‘asm’ operand has impossible
constraints

4551 -> 4509 dezicycles

Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
(cherry picked from commit 03f39fbb2a558153a3c464edec1378d637a755fe)

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-29 03:34:21 +02:00
Carl Eugen Hoyos
2d1309c352 hevc_deblock: Fix compilation with nasm
CC: libav-stable@libav.org
Bug-Id: 795
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2015-02-22 23:46:55 +00:00
Michael Niedermayer
d694ab846c avcodec/x86/vp9lpf: Always include x86util.asm
Fixes executable stack

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
(cherry picked from commit 41d82b85ab0ee8bb2931c1f783e30c38c2fb5206)

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-09-18 01:11:33 +02:00
James Almer
52ec81c67d x86/hevc_res_add: add missing guards to hevc_transform_add32_8_avx2
Should fix compilation with old Yasm/Nasm versions.

Signed-off-by: James Almer <jamrial@gmail.com>
2014-09-04 23:34:01 -03:00
James Almer
c3d2426cca x86/hevc_res_add: add ff_hevc_transform_add32_8_avx2
~20% faster than AVX.

Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2014-09-04 20:21:29 -03:00
James Darnley
46ef45ab59 lavc/x86/v210: give cpuflag to INIT macro
This lets the cglobal macro automatically append a suffix to the function name.
This means that INIT_XMM avx must be used rather than INIT_AVX.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-09-05 00:35:07 +02:00
Michael Niedermayer
5b58d79a99 Merge commit '7a1d6ddd2c6b2d66fbc1afa584cf506930a26453'
* commit '7a1d6ddd2c6b2d66fbc1afa584cf506930a26453':
  xvid: Add C IDCT

Conflicts:
	libavcodec/dct-test.c
	libavcodec/xvididct.c

See: 298b3b6c1f8f66b9bc6de53a7b51d3de745d946b
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-09-03 04:09:38 +02:00
Michael Niedermayer
5db23c07a3 Merge commit '95c0cec03acec0a80cc1c7db48f3b2355d9e767b'
* commit '95c0cec03acec0a80cc1c7db48f3b2355d9e767b':
  idctdsp: Add global function pointers for {add|put}_pixels_clamped functions

Conflicts:
	libavcodec/arm/idctdsp_init_arm.c
	libavcodec/dct.h
	libavcodec/idctdsp.c
	libavcodec/jrevdct.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-09-03 03:19:40 +02:00
Pascal Massimino
7a1d6ddd2c xvid: Add C IDCT
Thanks to Pascal Massimino and Michael Militzer for relicensing as LGPL.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-09-02 14:41:13 -07:00
Diego Biurrun
95c0cec03a idctdsp: Add global function pointers for {add|put}_pixels_clamped functions
These function pointers already existed in the ARM code. Adding them globally
allows calls to the function pointers to access arch-optimized versions of the
functions transparently.
2014-09-02 14:41:13 -07:00
Reimar Döffinger
d9e2aceb7f Add missing "const" all over the place.
Only "./configure --enable-gpl" on x86 was tested.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2014-08-29 18:57:25 +02:00
Michael Niedermayer
5403a288a7 Merge commit '8d27bf1cff35be406b0fd89d832e1852d4c573bc'
* commit '8d27bf1cff35be406b0fd89d832e1852d4c573bc':
  x86: xvid: K&R formatting cosmetics

Conflicts:
	libavcodec/x86/xvididct_sse2.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-27 21:20:39 +02:00
Michael Niedermayer
b3b05a11d3 Merge commit 'dcb7c868ec7af7d3a138b3254ef2e08f074d8ec5'
* commit 'dcb7c868ec7af7d3a138b3254ef2e08f074d8ec5':
  cosmetics: Make naming scheme of Xvid IDCT consistent with other IDCTs

Conflicts:
	libavcodec/mpeg4videodec.c
	libavcodec/x86/Makefile
	libavcodec/x86/dct-test.c
	libavcodec/x86/xvididct_sse2.c
	libavcodec/xvididct.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-27 21:09:30 +02:00
Michael Niedermayer
3ff5ca89fc Merge commit '1f156af4274dc72d588620f6bedb4e9e66023c92'
* commit '1f156af4274dc72d588620f6bedb4e9e66023c92':
  x86: xvid_idct: Drop unused definitions

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-27 21:01:54 +02:00
Diego Biurrun
8d27bf1cff x86: xvid: K&R formatting cosmetics 2014-08-27 05:58:04 -07:00
Diego Biurrun
dcb7c868ec cosmetics: Make naming scheme of Xvid IDCT consistent with other IDCTs 2014-08-27 04:54:05 -07:00
Diego Biurrun
1f156af427 x86: xvid_idct: Drop unused definitions 2014-08-27 04:36:41 -07:00
Christophe Gisquet
3e892b2bcd x86: hevc_mc: split differently calls
In some cases, 2 or 3 calls are performed to functions for unusual
widths. Instead, perform 2 calls for different widths to split the
workload.

The 8+16 and 4+8 widths for respectively 8 and more than 8 bits can't
be processed that way without modifications: some calls use unaligned
buffers, and having branches to handle this was resulting in no
micro-benchmark benefit.

For block_w == 12 (around 1% of the pixels of the sequence):
Before:
12758 decicycles in epel_uni, 4093 runs, 3 skips
19389 decicycles in qpel_uni, 8187 runs, 5 skips
22699 decicycles in epel_bi, 32743 runs, 25 skips
34736 decicycles in qpel_bi, 32733 runs, 35 skips

After:
11929 decicycles in epel_uni, 4096 runs, 0 skips
18131 decicycles in qpel_uni, 8184 runs, 8 skips
20065 decicycles in epel_bi, 32750 runs, 18 skips
31458 decicycles in qpel_bi, 32753 runs, 15 skips

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-24 12:05:33 +02:00
Christophe Gisquet
38e2aa3759 x86: hevc_mc: correct unneeded use of SSE4 code
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-24 11:43:33 +02:00
Christophe Gisquet
2346f2b5db x86: hevcdsp: use compilation-time-fixed constant
The stride for some buffers is known.

Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 16:26:30 +02:00
Christophe Gisquet
dad7f15567 hevcdsp: remove more instances of compile-time-fixed parameters
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 15:22:42 +02:00
Christophe Gisquet
d4f44b66d3 hevcdsp: remove compilation-time-fixed parameter
The dststride parameter is always MAX_PB_SIZE.

Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 14:57:37 +02:00
Christophe Gisquet
fb1a98ec5b x86: hevc_mc: assume 2nd source stride is 64
Reviewed-by: Mickaël Raulet <mraulet@gmail.com
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 13:21:37 +02:00
James Almer
54ca4dd43b x86/hevc_res_add: refactor ff_hevc_transform_add{16,32}_8
* Reduced xmm register count to 7 (As such they are now enabled for x86_32).
* Removed four movdqa (affects the sse2 version only).
* pxor is now used to clear m0 only once.

~5% faster.

Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2014-08-21 15:01:33 -03:00
James Almer
76a99d467f x86/hecv_res_add: add ff_hevc_transform_add{8,16,32}_8_avx
~15% faster than sse2

Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2014-08-20 16:54:52 -03:00
James Almer
9f498f4e6f x86/hevc_res_add: fix register count in hevc_transform_add{16,32}_10_avx2
Signed-off-by: James Almer <jamrial@gmail.com>
2014-08-19 21:34:52 -03:00
Pierre Edouard Lepere
a6af4bf64d x86: hevc: adding transform_add
Reviewed-by: James Almer <jamrial@gmail.com>
Approved-by: Ronald S. Bultje
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-20 01:28:56 +02:00
Michael Niedermayer
3bb2297351 Merge commit 'efd26bedec9a345a5960dbfcbaec888418f2d4e6'
* commit 'efd26bedec9a345a5960dbfcbaec888418f2d4e6':
  build: Add explanatory comments to (optimization) blocks in the Makefiles

Conflicts:
	libavcodec/ppc/Makefile
	libavcodec/x86/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-15 20:25:12 +02:00
Michael Niedermayer
c1df467d73 Merge commit '835f798c7d20bca89eb4f3593846251ad0d84e4b'
* commit '835f798c7d20bca89eb4f3593846251ad0d84e4b':
  mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes

Conflicts:
	libavcodec/h261dec.c
	libavcodec/intrax8.c
	libavcodec/mjpegenc.c
	libavcodec/mpeg12dec.c
	libavcodec/mpeg12enc.c
	libavcodec/mpeg4videoenc.c
	libavcodec/mpegvideo.c
	libavcodec/mpegvideo.h
	libavcodec/mpegvideo_enc.c
	libavcodec/rv10.c
	libavcodec/x86/mpegvideoenc.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-15 20:11:56 +02:00
Diego Biurrun
efd26bedec build: Add explanatory comments to (optimization) blocks in the Makefiles 2014-08-15 02:55:21 -07:00
Diego Biurrun
835f798c7d mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes 2014-08-15 01:26:33 -07:00
James Darnley
54a51d3840 lavc/flacenc: partially unroll loop in flac_enc_lpc_16
It now does 12 samples per iteration, up from 4.

From 1.8 to 3.2 times faster again.  3.6 to 5.7 times faster overall.
Runtime is reduced by a further 2 to 18%.  Overall runtime reduced by
4 to 50%.

Same conditions as before apply.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-13 03:09:26 +02:00
James Darnley
0081a14e7d lavc/flacenc: add sse4 version of the 16-bit lpc encoder
From 1.8 to 2.4 times faster.  Runtime is reduced by 2 to 39%.  The
speed-up generally increases with compression_level.

This lpc encoder is not used with levels < 3 so it provides no speed-up
in these cases.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-13 01:14:47 +02:00
Ronald S. Bultje
45bed0ab30 vp9/x86: fix bug in intra_pred_hd_32x32.
Fixes mismatch in first keyframe in sample
ffvp9_fails_where_libvpx.succeeds.webm from ticket 3849. There's still
a second mismatch a few frames into the sample.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-12 13:11:21 +02:00
James Almer
c97870d1a1 x86/dca: remove unused header
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-12 12:46:53 +02:00
James Almer
e20ff251a6 x86/ttadsp: remove an unnecessary mova
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-12 12:29:05 +02:00
Michael Niedermayer
3841f2ae66 Merge commit 'd35b94fbabd8beb5d566c0b5d01688aff62c3b36'
* commit 'd35b94fbabd8beb5d566c0b5d01688aff62c3b36':
  avcodec: Rename xvidmmx IDCT to xvid

Conflicts:
	doc/APIchanges
	libavcodec/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-09 12:11:13 +02:00
Michael Niedermayer
0dcebb9f63 Merge commit '84d173d3de97c753234ab0c0b50551d51413d663'
* commit '84d173d3de97c753234ab0c0b50551d51413d663':
  xvididct: Ensure that the scantable permutation is always set correctly

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-08 22:17:04 +02:00
Diego Biurrun
d35b94fbab avcodec: Rename xvidmmx IDCT to xvid
The Xvid IDCT is not MMX-specific.
2014-08-08 11:13:30 -07:00