    The official guide to swscale for confused developers.
   ========================================================

Current (simplified) Architecture:
----------------------------------
                        Input
                          v
                   _______OR_________
                 /                   \
               /                       \
       special converter     [Input to YUV converter]
              |                         |
              |          (8bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:0:0 )
              |                         |
              |                         v
              |                  Horizontal scaler
              |                         |
              |      (15bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:1:1 / 4:0:0 )
              |                         |
              |                         v
              |          Vertical scaler and output converter
              |                         |
              v                         v
                         output


Swscale has 2 scaler paths. Each path must be capable of handling
slices, that is, consecutive non-overlapping rectangles of dimension
(0,slice_top) - (picture_width, slice_bottom).

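As an illustration of the slice definition above, the following C sketch
splits a picture into consecutive, non-overlapping slices. The Slice struct
and split_into_slices() are hypothetical names used only for this example;
they are not part of swscale.

```c
#include <assert.h>

/* Hypothetical slice descriptor: rows [top, bottom) of the picture,
 * always spanning the full picture width. */
typedef struct Slice {
    int top;
    int bottom;
} Slice;

/* Split picture_height rows into slices of at most slice_height rows.
 * Returns the number of slices written to out (at most max_slices). */
int split_into_slices(int picture_height, int slice_height,
                      Slice *out, int max_slices)
{
    int n = 0;

    assert(picture_height >= 0 && slice_height > 0);
    for (int top = 0; top < picture_height && n < max_slices;
         top += slice_height) {
        int bottom = top + slice_height;

        if (bottom > picture_height)
            bottom = picture_height;   /* last slice may be shorter */
        out[n].top    = top;
        out[n].bottom = bottom;
        n++;
    }
    return n;
}
```

Each slice starts where the previous one ended, so the slices are
consecutive and never overlap.
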
Special converter
    These are generally unscaled converters of common
    formats, like YUV 4:2:0/4:2:2 -> RGB15/16/24/32. In principle, this path
    could also contain scalers optimized for specific common cases.

Main path
    The main path is used when no special converter can be used. The code
    is designed as a destination line pull architecture. That is, for each
    output line the vertical scaler pulls lines from a ring buffer. When
    the ring buffer does not contain the wanted line, that line is pulled from
    the input slice through the input converter and horizontal scaler.
    The result is also stored in the ring buffer to serve future vertical
    scaler requests.
    When no more output can be generated because lines from a future slice
    would be needed, all remaining lines in the current slice are
    converted, horizontally scaled and put in the ring buffer.
    [This is done for luma and chroma, each with possibly different numbers
    of lines per picture.]

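    The pull scheme described above can be sketched in C as follows. This is
    a toy model, not the swscale code: each "line" is a single int, the
    names LineRing, hscale_line(), ring_get() and vscale_line() are
    hypothetical, the ring size is an arbitrary assumption, and slice
    boundaries are ignored.

```c
#include <assert.h>

#define RING_LINES 4              /* hypothetical size, >= vertical taps */

typedef struct LineRing {
    int line[RING_LINES];         /* stand-in for one scaled line each */
    int next_src;                 /* first input line not yet in the ring */
    int produced;                 /* lines sent through the horizontal stage */
} LineRing;

/* Stand-in for "input converter + horizontal scaler" on one input line. */
int hscale_line(int src_y)
{
    return src_y * 10;
}

/* Return input line src_y from the ring. Lines that are not there yet
 * are pulled through the horizontal stage and stored for later reuse. */
int ring_get(LineRing *r, int src_y)
{
    /* Lines older than the ring size have been overwritten. */
    assert(src_y >= 0 && src_y >= r->next_src - RING_LINES);
    while (r->next_src <= src_y) {
        r->line[r->next_src % RING_LINES] = hscale_line(r->next_src);
        r->next_src++;
        r->produced++;
    }
    return r->line[src_y % RING_LINES];
}

/* Vertical stage: output line dst_y averages input lines dst_y and
 * dst_y + 1 (a 2-tap filter chosen only for this example). */
int vscale_line(LineRing *r, int dst_y)
{
    int a = ring_get(r, dst_y);
    int b = ring_get(r, dst_y + 1);

    return (a + b) / 2;
}
```

    Requesting output lines 0 and 1 in order sends three input lines through
    the horizontal stage, not four, because input line 1 is served from the
    ring buffer the second time it is needed.
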
Input to YUV Converter
    When the input to the main path is not planar 8 bits per component YUV or
    8-bit gray, it is converted to planar 8-bit YUV. Two sets of converters
    exist for this currently: one performs horizontal downscaling by 2
    before the conversion, the other leaves the full chroma resolution,
    but is slightly slower. The scaler will try to preserve full chroma
    when the output uses it. It is possible to force full chroma with
    SWS_FULL_CHR_H_INP even for cases where the scaler thinks it is useless.

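    To make the difference between the two converter sets concrete, here is
    a minimal C sketch for packed RGB24 input that produces only the chroma
    planes. The function names are hypothetical, and the coefficients are a
    common BT.601 limited-range integer approximation chosen for this
    example; swscale's own converters may use different constants and
    precision.

```c
#include <assert.h>
#include <stdint.h>

/* BT.601 limited-range approximation (an assumption of this sketch).
 * The +128 offset is added before the shift so the sum stays positive. */
static uint8_t rgb_to_u(int r, int g, int b)
{
    return (uint8_t)((-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8);
}

static uint8_t rgb_to_v(int r, int g, int b)
{
    return (uint8_t)((112 * r - 94 * g - 18 * b + 128 + (128 << 8)) >> 8);
}

/* Full chroma: one U/V sample per input pixel. */
void rgb24_to_uv_full(const uint8_t *rgb, int width, uint8_t *u, uint8_t *v)
{
    for (int x = 0; x < width; x++) {
        const uint8_t *p = rgb + 3 * x;

        u[x] = rgb_to_u(p[0], p[1], p[2]);
        v[x] = rgb_to_v(p[0], p[1], p[2]);
    }
}

/* Half chroma: average each horizontal pixel pair first, then convert,
 * so only width / 2 conversions are needed. */
void rgb24_to_uv_half(const uint8_t *rgb, int width, uint8_t *u, uint8_t *v)
{
    assert(width % 2 == 0);
    for (int x = 0; x < width / 2; x++) {
        const uint8_t *p = rgb + 6 * x;
        int r = (p[0] + p[3] + 1) >> 1;
        int g = (p[1] + p[4] + 1) >> 1;
        int b = (p[2] + p[5] + 1) >> 1;

        u[x] = rgb_to_u(r, g, b);
        v[x] = rgb_to_v(r, g, b);
    }
}
```

    The half-chroma variant does half as many conversions per line, which is
    where the speed difference mentioned above comes from in this sketch.
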
Horizontal scaler
    There are several horizontal scalers. A special case worth mentioning is
    the fast bilinear scaler that is made of runtime-generated MMX2 code
    using specially tuned pshufw instructions.
    The remaining scalers are specially tuned for various filter lengths.
    They scale 8-bit unsigned planar data to 16-bit signed planar data.
    Supporting inputs with more than 8 bits per component will require a new
    horizontal scaler that preserves the input precision.

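    A generic horizontal scaler of the kind described above can be sketched
    in plain C as a per-pixel FIR filter. The function name and parameter
    layout are hypothetical. The sketch assumes coefficients with a 1.0
    point at 1 << 14 (as stated in the filter coefficients section) and a
    final shift by 7, so that 8-bit input yields the 15-bit intermediate
    shown in the architecture diagram.

```c
#include <assert.h>
#include <stdint.h>

/* dst[i] = sum over j of src[filter_pos[i] + j] * filter[i * taps + j].
 * 8-bit input times 14-bit coefficients gives about 22 bits; the shift
 * by 7 brings that down to 15 bits stored in a signed 16-bit value.
 * Negative sums rely on arithmetic right shift, as most compilers do. */
void hscale_8_to_15(int16_t *dst, int dst_w, const uint8_t *src,
                    const int16_t *filter, const int *filter_pos, int taps)
{
    assert(taps > 0);
    for (int i = 0; i < dst_w; i++) {
        int val = 0;

        for (int j = 0; j < taps; j++)
            val += src[filter_pos[i] + j] * filter[i * taps + j];
        val >>= 7;
        /* Filters with overshoot could exceed 15 bits, so clamp. */
        dst[i] = (int16_t)(val > 32767 ? 32767 : val);
    }
}
```

    With a 2-tap filter of {8192, 8192}, input samples 100 and 200 produce
    150 * 128, that is, the 8-bit average scaled up by 7 bits.
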
Vertical scaler and output converter
    There is a large number of combined vertical scalers + output converters.
    Some are:
    * unscaled output converters
    * unscaled output converters that average 2 chroma lines
    * bilinear converters                (C, MMX and accurate MMX)
    * arbitrary filter length converters (C, MMX and accurate MMX)
    And
    * Plain C  8-bit 4:2:2 YUV -> RGB converters using LUTs
    * Plain C 17-bit 4:4:4 YUV -> RGB converters using multiplies
    * MMX     11-bit 4:2:2 YUV -> RGB converters
    * Plain C 16-bit Y -> 16-bit gray
      ...

    RGB with less than 8 bits per component uses dither to improve the
    subjective quality and low-frequency accuracy.


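    As a rough illustration of an arbitrary filter length vertical scaler
    with planar 8-bit output, consider the C sketch below. The function
    names are hypothetical. It assumes 15-bit intermediate lines, vertical
    coefficients with a 1.0 point at 1 << 12 (see the filter coefficients
    section), and therefore a final shift by 12 + 7 = 19; it does not
    reproduce any specific swscale routine.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t clip_uint8(int v)
{
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* lines[j] is the j-th input line needed for this output line.
 * Negative sums rely on arithmetic right shift before clipping. */
void vscale_15_to_8(uint8_t *dst, int width, const int16_t *const *lines,
                    const int16_t *filter, int taps)
{
    assert(taps > 0);
    for (int i = 0; i < width; i++) {
        int val = 1 << 18;             /* rounding: half of 1 << 19 */

        for (int j = 0; j < taps; j++)
            val += lines[j][i] * filter[j];
        dst[i] = clip_uint8(val >> 19);
    }
}
```

    Feeding it two lines with a {2048, 2048} filter averages them and
    returns to the 8-bit range: 150 * 128 and 50 * 128 give 100.
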
Filter coefficients:
--------------------
There are several different scalers (bilinear, bicubic, lanczos, area,
sinc, ...). Their coefficients are calculated in initFilter().
Horizontal filter coefficients have a 1.0 point at 1 << 14, vertical ones at
1 << 12. The 1.0 points have been chosen to maximize precision while leaving
a little headroom for convolutional filters like sharpening filters and
minimizing SIMD instructions needed to apply them.
It would be trivial to use a different 1.0 point if some specific scaler
would benefit from it.
Also, as already hinted at, initFilter() accepts an optional convolutional
filter as input that can be used for contrast, saturation, blur, sharpening,
shift, chroma vs. luma shift, ...

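To show what a 1.0 point means in practice, the following C sketch converts
real-valued filter weights into integer coefficients whose sum equals the
chosen 1.0 point. It is not the code of initFilter(); quantize_filter() and
round_to_int() are hypothetical helpers, and carrying the rounding error to
the next tap is one possible way to keep the sum exact.

```c
#include <assert.h>

static int round_to_int(double x)
{
    return x >= 0.0 ? (int)(x + 0.5) : -(int)(-x + 0.5);
}

/* Quantize taps weights so that they sum to one (for example 1 << 14
 * for horizontal or 1 << 12 for vertical coefficients). The rounding
 * error of each tap is carried over to the next one. */
void quantize_filter(const double *w, int taps, int one, int *out)
{
    double sum = 0.0, error = 0.0;

    assert(taps > 0 && one > 0);
    for (int j = 0; j < taps; j++)
        sum += w[j];
    assert(sum != 0.0);

    for (int j = 0; j < taps; j++) {
        double v = w[j] / sum * one + error;

        out[j] = round_to_int(v);
        error  = v - out[j];
    }
}
```

For a bilinear filter with weights 0.25 and 0.75 and a 1.0 point of 1 << 14,
this yields the coefficients 4096 and 12288.
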