JPEG XL

Info

rules 57
github 35276
reddit 647

JPEG XL

tools 4225
website 1655
adoption 20712
image-compression-forum 0

General chat

welcome 3810
introduce-yourself 291
color 1414
photography 3435
other-codecs 23765
on-topic 24923
off-topic 22701

Voice Channels

General 2147

Archived

bot-spam 4380

Specification issues

Traneptora
2022-11-08 09:51:49
since things like incorrect parsing can cause values to be invalid
2022-11-08 09:51:53
and it catches that
_wb_
2022-11-08 09:52:07
yeah throwing errors is certainly the best thing to do, even besides the debugging usefulness
2022-11-08 09:53:18
the best decoder is one that decodes all compliant bitstreams and refuses to decode any non-compliant one — we are probably missing some of those cases in libjxl atm btw
dds
_wb_ currently the spec says "All referenced samples are within the bounds of the reference frame." so this is an invalid bitstream atm
2022-11-08 09:54:47
TIL how ISO specification idiom works: In this case "All referenced samples are within the bounds of the reference frame" makes me think "well isn't that nice" but it's intended to convey "If any sample is outside the bounds of the reference frame then the bitstream is invalid so continue at your own risk".
2022-11-08 09:55:21
silly ISO
_wb_
2022-11-08 09:55:36
yeah I agree the ISO language is confusing
lonjil
2022-11-08 09:56:07
I have heard from many people that ISO standards tend to be incredibly detailed and yet written such that it's hard to tell exactly what you're supposed to do, leading to interoperability problems.
_wb_
2022-11-08 09:56:23
if we say "they are within the bounds" it means "if they're not, this is not a valid bitstream"
2022-11-08 09:57:57
if we say "they must be within the bounds" it means "no need to check, it's physically impossible that they are not within the bounds, e.g. it's a consequence of the bitstream syntax that they have to be"
dds
2022-11-08 09:58:08
oh what
2022-11-08 09:59:04
ok i can only find one occurance of 'must' and it's not confusing
_wb_
2022-11-08 09:59:33
> Levels are defined in such a way that if a decoder supports level N, it must also support lower levels.
2022-11-08 10:00:14
yeah this statement is a consequence of the way the levels are defined, so we can use "must" there
dds
2022-11-08 10:01:04
it feels like the ISO language was designed for countries to agree on common standards for traffic signs, not for nitpicky geeks trying to level up the internet
_wb_
2022-11-08 10:01:47
most of the normative statements are done with "shall" in ISO, but we avoided making all sentences sound biblical and having hundreds of "shall"s by leveraging this sentence in clause 7: > These specifications are normative in the sense that they are defining an output that alternative implementations shall duplicate.
2022-11-08 10:02:46
that one single "shall" makes basically all the rest of the document have implicit shalls
2022-11-08 10:04:38
half the job of an ISO staff editor is making sure the use of shall, should, must etc is done correctly, and it's very confusing if you're not used to it so I imagine most first-time spec editors get it wrong all the time
dds
2022-11-08 10:08:38
so ISO speak is a bit like legal speak in that it mostly reads like normal English but the words don't always mean what you think they do
improver
2022-11-08 11:03:02
they mostly mean what is specified in long intros many skip
2022-11-08 11:04:20
but it was pretty clear at least to me that anything not specified was not part of the spec
2022-11-08 11:05:18
it's just kinda how you're used to approaching specifications
lonjil
2022-11-09 01:15:58
Is that supposed to be a plus between 18 and u(6), rather than a new line?
Traneptora
2022-11-09 04:49:48
It is
_wb_
2022-11-09 05:31:03
Yeah I also just saw that, will be fixed.
Traneptora
2022-11-09 05:13:01
Section K.5.2, definition of SplitMix64, has `0×` instead of `0x` in front of the hexadecimal numbers
2022-11-09 06:12:03
> The strength of the noise is computed as follows. Given the X and Y samples of the input pixel, let InR = Y + X and InG = Y − X. Let InscaledR = InR * 6 and InscaledG = InG * 6. These values are split into integer parts in_intR = floor(InscaledR) and in_intG = floor(InscaledG) and fractional parts in_fracR = InscaledR − in_intR and in_fracG = InscaledG − in_intG. If the value InScaledR or InScaledG is 8 or more, the corresponding integer and fractional parts are clamped back to 7 and 1 respectively. > > Using the LUT values decoded as described in K.5.1, the strengths SR and SG are defined as SR = LUT[in_intR] * (1 − in_fracR) + LUT[in_intR + 1] * in_fracR and SG = LUT[in_intG] * (1 − in_fracG) + LUT[in_intG + 1] * in_fracG. Each strength is clamped between 0 and 1.
2022-11-09 06:12:12
What if `InR` or `InG` is negative?
2022-11-09 06:12:16
it very well could be the case
2022-11-09 06:20:21
looks like libjxl clamps to 0
2022-11-09 06:20:38
should say that in the spec tho
2022-11-09 06:23:29
Also
2022-11-09 06:23:30
> If the value InScaledR or InScaledG is 8 or more, the corresponding integer and fractional parts are clamped back to 7 and 1 respectively.
2022-11-09 06:23:43
This will overflow `LUT` at 7
2022-11-09 06:23:49
because LUT is an array of size 8
2022-11-09 06:24:00
so LUT[7 + 1] will fail
veluca
Traneptora Section K.5.2, definition of SplitMix64, has `0×` instead of `0x` in front of the hexadecimal numbers
2022-11-09 06:36:49
how did you even notice 🤣
Traneptora
2022-11-09 06:37:06
because I copied and pasted and got a syntax error <:KEKW:643601031040729099>
veluca
Traneptora looks like libjxl clamps to 0
2022-11-09 06:37:39
it should
Traneptora This will overflow `LUT` at 7
2022-11-09 06:38:10
ahhhhh you have no idea how many times I ended up changing the number for that clamping
2022-11-09 06:38:27
I thought I got it right ... what does libjxl clamp to? 6?
Traneptora
2022-11-09 06:39:33
I double checked the LUT overflow, apparently libjxl does the following ```cpp float kScale = 6f; if (in_scaled >= kScale) in_int = kScale - 1; in_frac = 1f; } else { in_int = floor(in_scaled); in_frac = in_scaled - in_int; } ```
2022-11-09 06:39:35
Yup
2022-11-09 06:39:46
`in_int` is capped at 5, because of this
2022-11-09 06:39:52
which means LUT[7] is never used
2022-11-09 06:40:08
although I believe this is a bug in libjxl
2022-11-09 06:40:18
where it should do the following:
2022-11-09 06:40:49
```cpp float kScale = 6f; if (in_scaled >= 7f) in_int = 6; in_frac = 1f; } else { in_int = floor(in_scaled); in_frac = in_scaled - in_int; } ```
2022-11-09 06:41:09
since `float lut[8]` is sealed in stone
2022-11-09 06:41:47
so either Scale needs to change to 7, or the clamping should change, cause otherwise something is weird
2022-11-09 06:45:41
in either case the spec causes an overflow as the spec is written
_wb_
2022-11-09 06:53:38
So bug in both libjxl and spec? Ugh
Traneptora
2022-11-09 06:54:08
perhaps
2022-11-09 06:54:21
unclear if libjxl is bugged, but probably is, given that lut[7] is not used
2022-11-09 06:54:40
spec is definitely bugged as it will cause an overrun of the buffer
_wb_
2022-11-09 06:55:32
So spec should probably say: > If the value InScaledR or InScaledG is 7 or more, the corresponding integer and fractional parts are clamped back to 6 and 1 respectively.
2022-11-09 06:55:57
And libjxl should say what you wrote above
Traneptora
2022-11-09 06:56:37
or equivalent in HIghway-ese
2022-11-09 06:56:46
this is pseudocode
_wb_ So spec should probably say: > If the value InScaledR or InScaledG is 7 or more, the corresponding integer and fractional parts are clamped back to 6 and 1 respectively.
2022-11-09 06:57:13
it should also mention that they're clamped to 0 if negative
veluca
2022-11-09 07:38:59
I hate that code 🤣 it's too hard for me
Traneptora
2022-11-09 11:34:57
I'm confused about noise groups, and how they interact with upsampling
veluca
2022-11-10 01:01:31
~~that's two of us~~
2022-11-10 01:02:08
iirc the way that works is that noise is never generated in tiles bigger than 256x256, even if upsampling happens
Traneptora
2022-11-10 04:24:59
the spec and libjxl differ in randomness seed behavior
2022-11-10 04:26:30
libjxl generates one batch of 16 numbers for every 16 pixels wide, and then it generates an extra batch for the remaining stragglers
2022-11-10 04:26:39
it does this even if the size of the remaining stragglers is zero
2022-11-10 04:26:59
i.e. if the width of the group is a multiple of 16, which it often will be, it generates an extra batch and does nothing with it
2022-11-10 04:30:00
this isn't in the spec
2022-11-10 04:36:25
I don't know if this is just a libjxl bug, or if it should be changed in the spec
veluca
Traneptora i.e. if the width of the group is a multiple of 16, which it often will be, it generates an extra batch and does nothing with it
2022-11-10 04:43:20
uh, I don't think it's supposed to do that
2022-11-10 04:43:28
at the very least it's not intentional...
Traneptora
2022-11-10 04:43:51
dec_noise.cc, line 72
2022-11-10 04:43:57
``` size_t x = 0; // Only entire batches (avoids exceeding the image padding). for (; x + kFloatsPerBatch <= xsize; x += kFloatsPerBatch) { rng->Fill(batch); for (size_t i = 0; i < kFloatsPerBatch; i += Lanes(df)) { BitsToFloat(reinterpret_cast<const uint32_t*>(batch) + i, row + x + i); } } // Any remaining pixels, rounded up to vectors (safe due to padding). rng->Fill(batch); size_t batch_pos = 0; // < kFloatsPerBatch for (; x < xsize; x += N) { BitsToFloat(reinterpret_cast<const uint32_t*>(batch) + batch_pos, row + x); batch_pos += N; } ```
2022-11-10 04:44:05
`x + kFloatsPerBatch <= xsize`
2022-11-10 04:44:11
I think this should probably be a strict inequality
2022-11-10 04:44:26
then, the remaining batch gets filled for 1-16, rather than for 0-15
2022-11-10 04:44:58
notice that if `xSize` is a multiple of `kFloatsPerBatch` (i.e. 16)
2022-11-10 04:45:15
then we have `x + 16 <= xSize` as the limiting condition, which is the same thing as `x < xSize`
2022-11-10 04:46:02
so this fills the entire row
2022-11-10 04:46:17
and then `rng->Fill(batch);` is called again, unconditionally.
veluca at the very least it's not intentional...
2022-11-10 04:46:40
in that case, i'll code to spec and assume this is a libjxl bug
veluca
2022-11-10 04:52:57
can you file an issue?
Traneptora
veluca can you file an issue?
2022-11-10 05:06:46
https://github.com/libjxl/libjxl/issues/1888
veluca
2022-11-10 05:09:57
thanks!
_wb_
2022-11-10 07:38:00
I propose we indeed treat this as a libjxl bug and change some noise on existing images
Traneptora
2022-11-10 07:47:20
what about the other discrepancy, with the scale factor?
2022-11-10 07:47:51
where libjxl maps `scaled >= 6f` to `intPart = 5` and `fracPart = 1f` (causing `lut[7]` to be unused)
2022-11-10 07:48:37
or the spec maps `scaled >= 8f` to `intPart = 7` and `fracPart = 1f` (causing `lut[8]` to blow up)
2022-11-10 07:48:59
IMO it should map `scaled >= 7f` to `intPart = 6` and `fracPart = 1f`
veluca
2022-11-10 07:53:01
we also should fix, and ideally for the last time as we already "fixed" it way too many times
2022-11-10 07:53:09
can you also open a bug to libjxl for this?
Traneptora
2022-11-10 08:04:21
yup, there's one more spec issue with noise, the spec's missing a 0.5f multiplier somewhere
2022-11-10 08:04:23
lemme dig it up
2022-11-10 08:25:40
I gotta go do some real lifing but I'll try to open an issue about that soon
spider-mario
2022-11-10 08:25:52
we discussed that LUT thing a while ago, I believe
2022-11-10 08:26:02
2022-11-10 08:26:09
it’s that code, isn’t it?
Traneptora
2022-11-10 08:27:08
yes, but that's not the issue
2022-11-10 08:27:18
the current code isn't sawtoothed in either the spec or libjxl, the issue is the integer part threshold
2022-11-10 08:27:32
not the fractional part
spider-mario
2022-11-10 08:27:48
right, but I thought we had specifically fixed that too
2022-11-10 08:28:14
e.g. https://github.com/libjxl/libjxl/pull/1238/files
Traneptora
2022-11-10 08:29:25
somehow this change isn't in git main libjxl though?
2022-11-10 08:29:31
well at least the one in `dec_noise.cc`
2022-11-10 08:30:22
now that code is in `stage_noise.cc`
2022-11-10 08:30:22
https://github.com/libjxl/libjxl/blob/main/lib/jxl/render_pipeline/stage_noise.cc#L71
2022-11-10 08:30:33
and it appears to use the older version
spider-mario
2022-11-10 08:31:21
oh no
2022-11-10 08:31:26
thanks for spotting this
Traneptora
2022-11-10 08:32:09
yw :)
2022-11-10 08:34:18
the seeding change did make it into main though
2022-11-10 11:03:13
Last spec discrepancy relating to noise synthesis, btw:
2022-11-10 11:03:38
> `Given the X and Y samples of the input pixel, let InR = Y + X and InG = Y − X.`
2022-11-10 11:03:49
it's actually `(Y+X)/2` and `(Y-X)/2`
2022-11-10 11:04:54
https://github.com/libjxl/libjxl/blob/main/lib/jxl/render_pipeline/stage_noise.cc#L192
2022-11-10 11:05:21
notice that you're multiplying by `0.5` first
2022-11-10 11:05:40
here `half` is just `const auto half = Set(d, 0.5f);`
2022-11-10 11:07:40
I assume this is supposed to happen and a spec bug, not a libjxl bug
2022-11-10 11:08:01
so I'm not going to report it on the issue tracker. but in either case spec and libjxl don't match, so at least one of the two is not correct
2022-11-11 03:21:02
> The dequantization matrices are computed for each possible value of the DctSelect field as the inverse of the weights matrix.
2022-11-11 03:21:13
How am I supposed to compute the inverse of the weights matrix if they are not square?
2022-11-11 03:23:33
unless you mean reciprocols taken element-wise
2022-11-11 03:23:39
in which case, can I assume they are nonzero?
2022-11-11 03:45:06
```cpp GetDCTQuantWeights(params) { bands(0) = params[0]; for (i = 1; i < len; i++) { bands(i) = bands(i − 1) * Mult(params[i]); /* bands[i] >= 0 */ for (y = 0; y < Y; y++) for (x = 0; x < X ; x++) { dx = x / (X − 1); dy = y / (Y − 1); distance = sqrt(dx*dx + dy*dy); weight = Interpolate(distance, sqrt(2) + 1e−6, bands, num_bands); weights(x, y) = weight; } } } ```
2022-11-11 03:45:09
this doesn't make sense
2022-11-11 03:45:29
you're overwriting `weights[x][y]` each iteration of params
2022-11-11 03:45:42
and it's not referenced again
2022-11-11 03:45:48
is this supposed to be `+=`?
2022-11-11 03:46:49
or is it not supposed to be a nested for loop?
2022-11-11 03:58:37
> For encoding mode DCT2, params(c, i) are copied in position (x, y) as follows, where rectangles are defined by their top right and bottom left corners
2022-11-11 03:58:45
Is this supposed to say "top left and bottom right"?
2022-11-11 04:08:32
> For encoding mode DCT4×8, the weights matrix is obtained by copying into position (x, y) the value in position (x, y Idiv 2) in the 8 × 4 matrix
2022-11-11 04:08:39
and 8x4 matrix has 8 rows and 4 columns. it's taller than it is wide
2022-11-11 04:08:46
should this say 4x8?
2022-11-11 04:09:17
using the mathematical convention of Rows by Columns, not the convention of width by height
2022-11-11 04:36:31
DCT2 leaves weight[0][0] undefined
2022-11-11 04:36:35
does it deafult to 1?
veluca
Traneptora in which case, can I assume they are nonzero?
2022-11-11 07:07:59
Yup (and yes, it is element by element)
Traneptora or is it not supposed to be a nested for loop?
2022-11-11 07:09:01
*facepalm* no it is not supposed to be nestes
Traneptora does it deafult to 1?
2022-11-11 07:09:44
Doesn't matter, weight 0 0 is never used
Traneptora
2022-11-11 12:26:34
I.6.5:The pairs (x, y) in the LLF vector is sorted in ascending order according to the value y * bwidth + x.
2022-11-11 12:26:47
I assume this is supposed to say `y * (bwidth / 8) + x`
2022-11-11 12:26:54
otherwise it overflows the size of LLF
_wb_
2022-11-11 12:57:04
Isn't bwidth the width in blocks, so already /8?
Traneptora
2022-11-11 01:05:29
it's the matrix width for a specific block
2022-11-11 01:13:05
> Let nb_block_ctx be equal to max(block_ctx_map) + 1. The decoder reads a histogram with `495 * num_hf_presets * nb_block_ctx` pre-clustered distributions D from the codestream as specified in C.1.
2022-11-11 01:13:13
Where's this 495 coming from? I can't find it in libjxl code.
veluca
Traneptora I assume this is supposed to say `y * (bwidth / 8) + x`
2022-11-11 01:15:52
does it actually matter as long as it's `y * <something greater than any x> + x`?
Traneptora
2022-11-11 01:16:10
yes because otherwise you end up with zero padding and overflow
veluca
2022-11-11 01:16:16
no?
2022-11-11 01:16:26
it just says how they are sorted
Traneptora
2022-11-11 01:16:30
well, I suppose not, it's an order
veluca
2022-11-11 01:16:31
not what their positions are
Traneptora
2022-11-11 01:16:32
not an index
veluca
2022-11-11 01:17:03
in practice they do end up at index `y*bw/8+x`
Traneptora > Let nb_block_ctx be equal to max(block_ctx_map) + 1. The decoder reads a histogram with `495 * num_hf_presets * nb_block_ctx` pre-clustered distributions D from the codestream as specified in C.1.
2022-11-11 01:17:33
that's a good question, let me check
Traneptora
2022-11-11 01:18:03
upon further inspection, it appears to be 458 + 37
2022-11-11 01:18:10
where 37 is kNonZeroBuckets
2022-11-11 01:18:17
and 458 is kZeroDensityContextCount
veluca
2022-11-11 01:18:53
yep
Traneptora
2022-11-11 03:26:13
> Finally, for k in the range [num_blocks, size), the decoder reads an integer ucoeff from the codestream
2022-11-11 03:26:19
`size` is not defined here
2022-11-11 03:27:24
unless it's the natural coeffient order size way back in hfpass
2022-11-11 03:27:30
if that's what it is, this is very confusing
2022-11-11 04:24:51
that spec doesn't say what happens if it reads non_zeroes directly and it's immediately zero
2022-11-11 04:25:12
spec doesn't disallow this
veluca
2022-11-12 12:25:35
doesn't that just make all of AC 0?
Traneptora
2022-11-12 03:52:49
spec doesn't say that
2022-11-12 03:53:32
Typo in section I.3.2: ```c BlockContext() { idx = (c < 2 ? c ^ 1 : 2) * 13 + s; idx *= (qf_thresholds.size() + 1); for (t : qf_thresholds) if (qf > t) idx++; for (i = 0; i < 3; i++) idx *= (lf_thresholds[i].size() + 1); lf_idx = 0; for (t : lf_thresholds[0]) if (qdc[0] > t) lf_idx++; lf_idx *= (lf_thresholds[2].size() + 1); for (t : lf_thresholds[2]) if (qdc[2] > t) lf_idx++; lf_idx *= (lf_thresholds[0].size() + 1); // <- TYPO HERE for (t : lf_thresholds[1]) if (qdc[1] > t) lf_idx++; return block_ctx_map[idx + lf_idx]; } ```
2022-11-12 03:53:45
I believe that line should say `lf_thresholds[1].size() + 1`
2022-11-12 03:53:55
not `lf_thresholds[0]`
2022-11-12 03:54:22
Typo in section I.3.2: ``` CoeffNumNonzeroContext[64] = { 0, 0, 31, 62, 62, 93, 93, 93, 93, 23, 123, 123, 123, 152, 152, 152, 152, 152, 152, 152, 152, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206}; ```
2022-11-12 03:54:36
The `23` in the top row, 4th from the right should be 123
2022-11-12 03:54:51
also, libjxl starts these arrays with `0xBAD` instead of zero
2022-11-12 03:55:15
I don't know if that's because it can be anything, or if matters
2022-11-12 03:55:55
typo in section I.3.2:
2022-11-12 03:56:17
``` if (k == num_blocks) { if (non_zeros > size / 16) prev = 1; else prev = 0; } ``` this condition should be `prev = 0; else prev = 1`
2022-11-12 03:56:44
libjxl dec_group.cc line 486 has this: `size_t prev = (nzeros > size / 16 ? 0 : 1);`
veluca
Traneptora also, libjxl starts these arrays with `0xBAD` instead of zero
2022-11-12 02:19:12
that one should never be used
Traneptora
2022-11-12 06:50:22
Section I.2.2: there's a +1 missing in the QF Thresholds decoding
2022-11-12 06:50:34
``` for (i = 0; i < nb_qf_thr; i++) qf_thresholds.push_back(U32(u(2), 4 + u(3), 12 + u(5), 44 + u(8)))); ```
2022-11-12 06:50:38
This is in the pseudocode
2022-11-12 06:51:13
`entropy_coder.cc`, line 52
2022-11-12 06:51:18
```cpp for (uint32_t& i : qft) { i = U32Coder::Read(kQFThresholdDist, br) + 1; } ```
2022-11-12 06:51:24
notice there's a +1 there.
2022-11-12 11:27:41
Section I.2.4 has this code:
2022-11-12 11:27:48
```cpp if (encoding_mode == RAW) { params.denominator = F16(); ZeroPadToByte(); params = /* read a 3-channel image from a modular sub-bitstream of the same shape as the required quant matrix */; } ```
2022-11-12 11:28:34
That `ZeroPadToByte()` should not be there. If you do it, you'll fail to verify the submodular bitstream for example with ants.jxl
2022-11-12 11:29:10
2022-11-12 11:46:41
Also, something that appears to missing from Annex I (VarDCT) is an overview. Annex H (modular) had an overview of the general process of modular
2022-11-12 11:46:47
VarDCT's annex appears to be missing it
2022-11-12 11:46:55
It explains how to do each step, but doesn't explain when to do each step
2022-11-12 11:47:28
I now have Quantized LF coefficients and Quantized HF coefficients, but I'm unsure when I dequent them, when I detransform them (via DCT or some other method), and when I decorrelate them.
2022-11-12 11:47:39
I know *how* to do those things, but not when.
2022-11-12 11:49:46
The spec also appears to be lacking guidance on how to handle edge cases, like if image dimensions are odd, or not multiples of 8, etc.
2022-11-12 11:51:11
if the image is a multiple of 8x8 it's not hard, but what if it isn't?
veluca
2022-11-13 06:18:41
then you pretend it is and crop at the very end, I'm sure it's written *somewhere*
Traneptora
2022-11-13 08:36:35
maybe it's in the preamble, but I didn't see it in annex I
2022-11-13 08:37:05
in either case, a general overview of VarDCT like there is for modular would be helpful
Fraetor
2022-11-13 12:17:15
Minor issue, but in section 4.3: Operators, for and and or if is misspell as iff.
_wb_
2022-11-13 12:21:54
"iff" means "if and only if", but you're right, it should be spelled out because not everyone will be familiar with that abbreviation
Fraetor
2022-11-13 02:05:40
Ah, well I learned something new.
2022-11-13 02:11:20
Also, the examples are really helpful.
veluca
2022-11-13 02:25:42
that's my fault for being a mathematician 😛
Traneptora
2022-11-13 03:48:29
same, mathematicians ^_^
veluca then you pretend it is and crop at the very end, I'm sure it's written *somewhere*
2022-11-13 10:27:30
well it's not necessary for modular images so it'd have to be in VarDCT's section
2022-11-13 10:27:36
and I didn't see it although I might have glossed over it
veluca
2022-11-13 11:01:42
I can't see that either...
2022-11-13 11:02:18
(and it's especially tricky how it is supposed to behave especially as you mix in extra channels and upsampling, so it really ought to be mentioned somewhere)
Traneptora
2022-11-13 11:08:30
extra channels are done in modular so they don't have that crop factor right?
2022-11-13 11:08:48
VarDCT is specifically about the exactly three color channels iirc
veluca
2022-11-13 11:27:16
correct
Traneptora
2022-11-14 01:09:54
Section I.2.1, Quantizer Bundle, is missing a factor of `1 << 16` in global_scale
2022-11-14 01:10:03
it has `mXDC = m_x_lf_unscaled / (global_scale * quant_lf);`
2022-11-14 01:10:14
but it should have `mXDC = (1 << 16) * m_x_lf_unscaled / (global_scale * quant_lf);`
2022-11-14 01:10:29
and same for Y and B
Moritz Firsching
2022-11-14 01:40:32
..fixing this now
2022-11-14 01:42:25
also watch out for taking the reciprocals of `m_foo_lf_unscaled` https://github.com/lifthrasiir/j40/blob/acfe38647a0614e0562f739d4a40d2fc8a397e7f/j40.h#L5136
_wb_
2022-11-14 02:26:26
<@604964375924834314> I'm fixing noise in the spec, can you fix it in the render pipeline code in libjxl?
spider-mario
2022-11-14 05:07:10
yep, I suppose it will mostly boil down to making sure that the previous fixes are really applied
2022-11-14 05:07:26
I was afraid of exactly that when we started copying code to the rendering pipeline
2022-11-14 05:07:33
which makes me wonder how I didn’t catch this
Traneptora
2022-11-14 05:12:27
> `ScaleF(N, n) { return 1 / (cos(pi / (2 * N)) * cos(pi / N) * cos(pi / (N / 2))); }`
2022-11-14 05:12:30
Section I.6.6
2022-11-14 05:12:41
This doesn't make sense as there's no little `n` here
2022-11-14 05:12:50
I tried to trace it back to libjxl but it just has a bunch of precomputed LUTs
2022-11-14 05:13:02
so I can't figure out what it did
spider-mario
_wb_ <@604964375924834314> I'm fixing noise in the spec, can you fix it in the render pipeline code in libjxl?
2022-11-14 05:17:52
it was actually already fixed in #12, wasn’t it?
2022-11-14 05:17:55
(the spec)
Traneptora
2022-11-14 05:18:22
I can't see #12, but i can say that the spec as written is off by 1 from libjxl in the other direction
2022-11-14 05:18:36
it clamps to 7, not 6, so adding 1 a bit later will overflow
_wb_
spider-mario it was actually already fixed in #12, wasn’t it?
2022-11-14 05:27:52
Looks like the spec was correct but libjxl was off by -1, then we fixed libjxl but also did a +1 in the spec, and then the render pipeline code regressed to the off by -1 version so we ended up having -1 in libjxl and +1 in the spec.
Traneptora
2022-11-14 05:45:38
For DctSelect types IDENTIFY, DCT2×2, DCT4×4, DCT8×4, DCT4×8, AFV0, AFV1, AFV2, AFV3, the output is equal to the input.
2022-11-14 05:45:41
What's Identify?
_wb_
2022-11-14 06:01:18
A typo for IDENTITY, which was renamed to Hornuss between DIS and FDIS because someone remarked that it is not at all an identity transform 🙂
2022-11-14 06:01:36
Good catch!
Traneptora
2022-11-14 06:21:45
I see, and a search and replace missed it cause it was a typo <:KEKW:643601031040729099>
Moritz Firsching
2022-11-15 09:50:19
I fixed the IDENTIFY/Hornuss typo in the spec draft...
Traneptora
2022-11-15 12:58:39
for RAW quant weight decoding, the spec says that you use `params / denominator`
2022-11-15 12:58:49
but it's actually `1 / (params * denominator)`
2022-11-15 12:59:00
(this is before you take the reciprocols of all the weights)
2022-11-15 03:04:57
LLF coefficient ScaleF is missing a multiple of `n`
2022-11-15 03:05:07
```cpp ScaleF(N, n) { return 1 / (cos(pi / (2 * N)) * cos(pi / N) * cos(pi / (N / 2))); } ```
2022-11-15 03:05:11
It *should* say this:
2022-11-15 03:05:25
```c++ ScaleF(N, n) { return 1 / (cos(n * pi / (2 * N)) * cos(n * pi / N) * cos(n * pi / (N / 2))); } ```
2022-11-15 03:05:53
alternatively it could be relabeled for clarity
2022-11-15 03:09:53
```cpp ScaleF(c, b) { inverse_scale_f = cos(b * pi / (2 * c)) * cos(b * pi / c) * cos(2 * b * pi / c); return 1 / inverse_scale_f; } ```
2022-11-15 03:10:24
I chose `b` and `c` here because the values that are fed into it lower in that section are `ScaleF(cy, bheight)`
2022-11-15 03:10:26
which matches up
2022-11-15 03:10:52
I don't like mixing `N` and `n` in the same function, that could get confusing
2022-11-15 03:10:58
(section I.6.6, btw)
2022-11-15 03:22:20
I think this is what it's supposed to be, at least
2022-11-15 03:52:21
it's hard to tell because libjxl just has LUTs
2022-11-15 07:18:21
The default value for `x_qm_scale` in the Frame Header is wrong
2022-11-15 07:18:46
in the spec it says the default is 3
2022-11-15 07:19:04
with a signalling condition of `!all_default and metadata.xyb_encoded and encoding == kVarDCT`
2022-11-15 07:19:28
the actual default is `3` iff `metadata.xyb_encoded and encoding == kVarDCT`
2022-11-15 07:19:46
and defaults to `2` iff `!metadata.xyb_encoded || encoding == kModular`
2022-11-15 07:20:40
so if, for example, you have a reconstructed JPEG with `!all_default && !metadata.xyb_encoded` it should default to 2, not to 3
2022-11-15 07:22:51
libjxl has this (converted to pseudocode) ```cpp if (metadata.xyb_encoded && encoding == kVarDCT) { visitor->visit_fields(&x_qm_scale, 3, /*default=*/all_default) visitor->visit_fields(&b_qm_scale, 2, /*default=*/all_default) } else { x_qm_scale = 2; b_qm_scale = 2; } ```
_wb_
2022-11-15 07:28:23
Oh, wow that's confusing
Traneptora
2022-11-15 07:28:46
tripped me up when debugging CFL
Moritz Firsching also watch out for taking the reciprocals of `m_foo_lf_unscaled` https://github.com/lifthrasiir/j40/blob/acfe38647a0614e0562f739d4a40d2fc8a397e7f/j40.h#L5136
2022-11-15 08:46:10
already figured that one out since it's used in modular
Traneptora but it's actually `1 / (params * denominator)`
2022-11-15 08:47:14
https://github.com/libjxl/libjxl/blob/main/lib/jxl/quant_weights.cc#L238
2022-11-15 11:57:29
> Horizontal (vertical) upsampling is defined as follows: the channel width (height) is multiplied by two (but limited to the image dimensions), and every subsampled pixel B with left (top) neighbour A and right (bottom) neighbour C is replaced by two upsampled pixels with values 0.25 * A + 0.75 * B and 0.75 * B + 0.25 * C.
2022-11-15 11:57:48
What happens in edge cases? i.e. what happens if hshift == 1 and x=0
_wb_
2022-11-16 10:10:22
I assume mirroring, i.e. A=B for the leftmost pixels and C=B for the rightmost ones. <@179701849576833024> is that how we do it? We should add that to the spec then...
veluca
2022-11-16 10:12:34
Somewhere the spec says that whenever we access out of bound pixels they are mirrored
_wb_
2022-11-16 10:30:15
I guess that's true, but in many places we explicitly say so too. It's a bit confusing if we sometimes say it explicitly and sometimes we rely on the generic statement.
2022-11-16 10:54:53
2022-11-16 10:55:16
This should be coefficients, not samples, right?
Traneptora
2022-11-16 12:27:51
that's correct, those are dequantized coefficients
_wb_
2022-11-16 12:35:36
I'll change that then to 'coefficient' instead of 'sample', since this could easily be misinterpreted as doing things in the pixel domain, not the dct domain
2022-11-16 12:37:20
Also going to reorganize things so LLF goes earlier
Traneptora
2022-11-16 12:58:18
Thanks. Putting things in the order you do them is ideal
2022-11-16 12:59:37
Namely Decode HF -> Dequant HF -> Recorrelate HF -> Overlay LF to LLF atop HF -> Inverse DCT
2022-11-16 01:00:26
considering that LLF is not recorrellated it should go after CfL
_wb_
2022-11-16 01:02:34
Yeah, I'm going to first define DCT, then LLF (since it uses DCT_2D and it's better to avoid forward refs imo), then Coefficients to samples
2022-11-16 05:33:15
Updated draft:
2022-11-16 05:46:40
oh nice, there's a nicer looking html output
Traneptora
2022-11-16 09:26:51
way better than Cambria :)
2022-11-16 09:33:03
There's a typo in section I.9.6: `coeffs_8x4(0, 0)` should be `coeffs_4x8(0, 0)`
2022-11-16 09:33:22
as the variable is called 4x8 even in the DCT8x4 transform
2022-11-16 10:18:19
End of the AFV transform definition
2022-11-16 10:18:23
``` samples_4x8 = IDCT_2D(coeffs_4x8); for (iy = 0; iy < 4; iy++) for (ix = 0; ix < 8; ix++) samples(ix, (flip_y == 1 ? 0 : 4) + iy) = samples_4x4(ix, iy); ```
2022-11-16 10:18:27
I believe that last samples should say samples4x8
2022-11-17 06:28:54
AFV transform definition
2022-11-17 06:28:58
> `coeffs_4x4(0, 0) = coefficients(0, 0) − coefficients(0, 1) + coefficients(1, 0);`
2022-11-17 06:29:11
I believe this should be `coeffs_4x4(0, 0) = coefficients(0, 0) + coefficients(0, 1) - coefficients(1, 0);`
2022-11-17 06:29:24
I haven't double checked against libjxl, but making that change fixes a number of visual artifacts with the corner of AFV
yurume
2022-11-17 06:34:35
that matches J40: https://github.com/lifthrasiir/j40/blob/216f83ba7a616685f847268f009e8fcb85106f76/j40.h#L6147
Traneptora
2022-11-17 06:35:58
good catch, I missed the second half
2022-11-18 08:38:09
Section J.4.1 appears to have some links to the wrong sections
2022-11-18 08:38:43
section J.4.2 specifies how to calculate distance as a function of `(x, y, cx, cy)`
2022-11-18 08:38:56
and section J.4.3 specifies how to calculate weight as a function of distance and sigma
2022-11-18 08:40:11
but I think the linked sections are wrong, since the third step specifies the L1 distance in J.4.3
2022-11-18 08:40:36
was this something that got shuffled, and the linked sections didn't get updated?
2022-11-18 08:43:04
Section F.2, between tables F.6 and F.7, definition of `downsample[i]`:
2022-11-18 08:43:15
something got messed up here, idk what
2022-11-18 08:44:05
Top of Annex F, link on the right hand side to F.3 doesn't work
2022-11-18 08:44:34
It links to `#toc` but the anchor for F.3 is `#toc45`
2022-11-18 08:44:51
dunno how that happened
2022-11-18 08:45:31
it looks like there *is* a div with an ID of "toc" but it doesn't work, and idk why
2022-11-18 08:45:54
using firefox
_wb_
2022-11-18 09:24:00
looks like it's a reserved keyword or something that metanorma is already using for the document toc
2022-11-18 09:24:06
renaming it fixed it
Traneptora
2022-11-18 09:24:52
2022-11-18 09:25:06
Why is sigma being clamped to 1e-4 ?
2022-11-18 09:25:22
if it was less than 1e-4, now it is equal to 1e-4, and it's still gonna be less than 0.3
_wb_
2022-11-18 09:26:28
yeah that seems redundant. <@604964375924834314> can we just remove that `sigma = max(1e−4, sigma)` line?
Traneptora
2022-11-18 09:26:46
in theory epf_sigma_for_modular could be less than 0.3 but that would just disable epf for a modular frame
2022-11-18 09:26:57
so there'd be no point
2022-11-18 09:27:16
when you could just disable epf
spider-mario
2022-11-18 09:28:05
I’m not the best person to ask about EPF
2022-11-18 09:28:17
I think <@179701849576833024> knows more about this
2022-11-18 09:28:48
hm, okay, reading the context a bit more, it does seem redundant
_wb_
2022-11-18 09:29:11
oh right I saw a sigma and was thinking splines, but this is epf 🙂
veluca
2022-11-18 09:29:17
It is entirely possible for things to be redundant
Traneptora
2022-11-18 09:34:45
> `inv_sigma = step_multiplier[step] * 4 * (sqrt(0.5) − 1) / sigma;`
2022-11-18 09:34:54
just to double check, is this supposed to be negative?
2022-11-18 09:35:07
` v = 1 − scaled_distance * inv_sigma;` will always be positive
2022-11-18 09:35:23
and there's a check for `(v <= 0)`
_wb_
2022-11-18 09:46:13
EPF section now:
Traneptora
2022-11-18 09:46:58
2022-11-18 09:47:03
yea, it's specified in J.4.2
_wb_
2022-11-18 09:47:05
What I would propose to change it to:
Traneptora
2022-11-18 09:47:09
ah
2022-11-18 09:47:34
I think that's more clear
_wb_
2022-11-18 09:50:46
rewrote the downscale[i] thing a bit, there was some misparsing of code blocks going on there but I also didn't like the parentheses and duplication of the range of i
veluca It is entirely possible for things to be redundant
2022-11-18 09:55:25
I guess the question is: does `If sigma < 0.3 for a given varblock, the decoder skips all steps on the pixels of that block` also apply in the modular case? If no, then the clamping does make sense and we should clarify that the skipping only happens in the vardct case. If yes, then we should remove the redundant clamping and maybe clarify that "a given varblock" means the whole frame in the modular case, since there is no per-varblock sigma adjustment.
Traneptora
_wb_ I guess the question is: does `If sigma < 0.3 for a given varblock, the decoder skips all steps on the pixels of that block` also apply in the modular case? If no, then the clamping does make sense and we should clarify that the skipping only happens in the vardct case. If yes, then we should remove the redundant clamping and maybe clarify that "a given varblock" means the whole frame in the modular case, since there is no per-varblock sigma adjustment.
2022-11-18 09:55:57
epf_sigma_for_modular is constant and signalled in the frame header
2022-11-18 09:56:15
so if epf_sigma_for_modular < 0.3, that's identical to the frame header just disabling epf
2022-11-18 09:56:41
it wouldn't make sense for epf_sigma_for_modular to be < 0.3 and have epf enabled
2022-11-18 09:56:57
it might even make sense to forbid it
2022-11-18 04:58:56
DCT4x4Params doesn't match the spec compared to libjxl
2022-11-18 04:59:00
``` // DCT4 (quant_kind 3) static constexpr const QuantEncodingInternal DCT4X4() { return QuantEncodingInternal::DCT4(DctQuantWeightParams({{{{ V(2200.0), V(0.0), V(0.0), V(0.0), }}, {{ V(392.0), V(0.0), V(0.0), V(0.0), }}, {{ V(112.0), V(-0.25), V(-0.25), V(-0.5), }}}}, 4),```
2022-11-18 04:59:02
libjxl
2022-11-18 04:59:28
spec
2022-11-18 10:09:17
``` The resulting quant is then multiplied by a per-block multiplier, the value of HfMul at the coordinates of the 8 × 8 rectangle containing the current sample, and, for the X and B channels, by pow(0.8, frame_header.x_qm_scale − 2) and pow(0.8, frame_header.b_qm_scale — 2), respectively. ```
2022-11-18 10:09:19
This is not correct
2022-11-18 10:09:42
It is multiplied by `(1 << 16) * quantizer.global_scale / HfMul`
2022-11-18 10:09:51
it is not multiplied by `HfMul`
2022-11-18 10:15:18
see line 145 of dec_group.cc
2022-11-19 04:06:52
Section E.4.2:
2022-11-19 04:06:56
``` value = 0; shift = 0; while (1) { b = u(8); value += (b & 127) << shift; if (b <= 127) break; shift += 7; /* shift < 63 */; } ```
2022-11-19 04:07:06
I'm looking at this comment `/* shift < 63 */`
2022-11-19 04:07:15
it's not clear to me if this is a requirement before or after adding 7
2022-11-19 04:09:29
because it's on the same line
2022-11-19 04:10:28
it says earlier that the final integer is at at most 63 bits, so I think it's still technically specified, but it might be clearer if the pseudocode reflected that
_wb_
2022-11-19 04:27:29
basically value can use uint64_t
2022-11-19 04:29:39
Maybe better to write `/* shift <= 56 */` in the beginning of the while body...
Traneptora
2022-11-19 04:54:30
that would be more obvious, yea
2022-11-19 05:14:16
Section E.4.4
2022-11-19 05:14:18
> `/* output nothing, stop reading the tag list, and proceed to ~\ref{iccmaincontent}~ */`
2022-11-19 05:14:30
looks like you left some `\LaTeX` in there
_wb_
2022-11-19 05:20:09
oops, lol
Traneptora
2022-11-19 05:42:58
``` f (tagcode == 1) { // the tag string is set to 4 custom bytes read from the data tag = u(4 * 8) from data stream; } else if (tagcode == 2) { tag = “rRTC”; } else if (tagcode == 3) { tag = “rXYZ”; ```
2022-11-19 05:43:07
is `rRTC` supposed to be `rTRC`?
_wb_
2022-11-19 05:47:36
Uh, best to check libjxl code
Traneptora
2022-11-19 05:48:11
later on it has the following
2022-11-19 05:48:14
``` /* append the following 3 groups of 4 bytes to the output: tag, tagstart, tagsize */ if (tagcode == 2) { /* append the following additional 6 groups of 4 bytes to the output: “gTRC”, tagstart, tagsize, “bTRC”, tagstart, tagsize */ ```
2022-11-19 05:48:22
I'm guessing TRC stands for Transfer Characteristics
2022-11-19 05:48:27
and rRTC is probably a typo
_wb_
2022-11-19 05:51:20
https://github.com/libjxl/libjxl/blob/9d960611774b5b14a46f33a20768cde2cd0c219f/lib/jxl/icc_codec_common.h#L48
Traneptora
2022-11-19 05:51:33
looks like it is a typo
_wb_
2022-11-19 05:51:53
Good catch. Indeed a typo for rTRC
Traneptora
2022-11-19 05:53:43
> `In this subclause, Shuffle(bytes, width) denotes the following operation: Bytes are inserted in raster order (row by row from left to right, starting with the top row) into a matrix with width rows, and as many columns as needed.`
2022-11-19 05:53:55
Why is it called `width` if the number of columns is arbitrary, and there's that many rows?
2022-11-19 05:54:06
usually the word `height` is used to describe the number of scanlines, i.e. the number of rows.
2022-11-19 05:55:15
How am I supposed to insert bytes in raster order into a matrix with as many columns as needed?
2022-11-19 05:56:52
oh, I see. I create a matrix with `width` rows and `height` columns (yes, this is backward), because I then transpose it later
2022-11-19 05:56:57
but that's very confusing
2022-11-19 05:57:47
and `height` in this case is `CeilDiv(bytes.size(), width)`
2022-11-19 08:55:21
Section E.4.5
2022-11-19 08:55:24
> ` val = (bytes[i + j] + (p >> (width − 1 − j))) & 255;`
2022-11-19 08:55:49
`p >> (width - 1 - j)` should be `p >> (8 * (width - 1 - j))`
2022-11-19 08:55:55
as `width` and `j` are in units of bytes, not bits
2022-11-20 09:18:37
Section J.4.2:
2022-11-20 09:18:51
``` DistanceStep2(x, y, cx, cy) { dist = 0; for (c = 0; c < 3; c++) { dist += abs(sample(x + ix, y + iy, c) − sample(x + cx + ix, y + cy + iy, c)) * rf.epf_channel_scale[c]; } return dist; } ``` `iy` and `ix` are not defined here
2022-11-20 09:18:55
is this a copy/paste typo?
2022-11-20 09:42:05
Section I.8, ScaleF function
2022-11-20 09:42:13
this is my fault, I flipped `c` and `b`
2022-11-20 09:42:51
`c` should be in the numerator, and `b` in the denominator
2022-11-20 09:51:12
Section I.6, Chroma From Luma:
2022-11-20 09:51:20
```ytox_dc_ = static_cast<int>(br->ReadFixedBits<kBitsPerByte>()) + std::numeric_limits<int8_t>::min();```
2022-11-20 09:51:31
spec says -127, but this is -128
2022-11-20 09:52:15
since `signed char` goes from -128 to +127
2022-11-20 09:56:32
this makes a noticeable difference if the codestream encodes 128
2022-11-20 09:56:37
since it goes from 1 to 0
2022-11-21 04:26:33
Section G.3:
2022-11-21 04:26:36
These are in the wrong order
2022-11-21 04:27:14
the correct order is `lf_coeff` *then* `mlf_group` *then* `hf_meta`
2022-11-21 04:27:53
See `dec_frame.cc` line 301:
2022-11-21 04:27:56
```cpp Status FrameDecoder::ProcessDCGroup(size_t dc_group_id, BitReader* br) { PROFILER_FUNC; const size_t gx = dc_group_id % frame_dim_.xsize_dc_groups; const size_t gy = dc_group_id / frame_dim_.xsize_dc_groups; const LoopFilter& lf = dec_state_->shared->frame_header.loop_filter; if (frame_header_.encoding == FrameEncoding::kVarDCT && !(frame_header_.flags & FrameHeader::kUseDcFrame)) { JXL_RETURN_IF_ERROR( modular_frame_decoder_.DecodeVarDCTDC(dc_group_id, br, dec_state_)); } const Rect mrect(gx * frame_dim_.dc_group_dim, gy * frame_dim_.dc_group_dim, frame_dim_.dc_group_dim, frame_dim_.dc_group_dim); JXL_RETURN_IF_ERROR(modular_frame_decoder_.DecodeGroup( mrect, br, 3, 1000, ModularStreamId::ModularDC(dc_group_id), /*zerofill=*/false, nullptr, nullptr, /*allow_truncated=*/false)); if (frame_header_.encoding == FrameEncoding::kVarDCT) { JXL_RETURN_IF_ERROR( modular_frame_decoder_.DecodeAcMetadata(dc_group_id, br, dec_state_)); } else if (lf.epf_iters > 0) { FillImage(kInvSigmaNum / lf.epf_sigma_for_modular, &dec_state_->sigma); } decoded_dc_groups_[dc_group_id] = uint8_t{true}; return true; } ```
2022-11-21 04:28:33
notice that DecodeVarDCTDC is called before DecodeGroup
_wb_
Traneptora ``` DistanceStep2(x, y, cx, cy) { dist = 0; for (c = 0; c < 3; c++) { dist += abs(sample(x + ix, y + iy, c) − sample(x + cx + ix, y + cy + iy, c)) * rf.epf_channel_scale[c]; } return dist; } ``` `iy` and `ix` are not defined here
2022-11-21 05:04:44
<@179701849576833024> I suppose there is something missing there? In Step0/1 the ix,iy iterate over 5 values making a cross, I assume it's not the same in Step2 because then why would there be two different functions, but is it just ix=iy=0 or is it also iterating over some set of offsets?
veluca
2022-11-21 05:05:02
No just 0 0
_wb_
_wb_ I guess the question is: does `If sigma < 0.3 for a given varblock, the decoder skips all steps on the pixels of that block` also apply in the modular case? If no, then the clamping does make sense and we should clarify that the skipping only happens in the vardct case. If yes, then we should remove the redundant clamping and maybe clarify that "a given varblock" means the whole frame in the modular case, since there is no per-varblock sigma adjustment.
2022-11-21 05:05:20
also ping <@179701849576833024> on this one
veluca
2022-11-21 05:06:26
It does apply to modular too
_wb_
2022-11-21 05:16:14
Latest draft:
Traneptora
2022-11-21 05:32:40
oh, nice
2022-11-21 09:51:47
Section I.2.3, the default values for `x_factor_lf` and `b_factor_lf` are 128, not 127
Traneptora ```ytox_dc_ = static_cast<int>(br->ReadFixedBits<kBitsPerByte>()) + std::numeric_limits<int8_t>::min();```
2022-11-21 09:55:10
same line of code as here, default coded value is 0
_wb_
2022-11-22 03:48:53
Today's draft:
2022-11-23 03:05:12
updated draft:
2022-11-23 03:06:03
I changed the signalling of `save_before_ct` quite a bit, I hope it is correct and easier to understand now
Traneptora
2022-11-23 04:29:48
much easier to read now
jjido
2022-11-25 10:12:49
Just want to say than you for the excellent work <@853026420792360980> .
Traneptora
2022-11-26 08:59:24
yw :)
_wb_
2022-11-28 03:04:29
<@853026420792360980> and <@268284145820631040> you could do me a big favor by checking if all the bugs you found are fixed now (and pointing me to the ones that still need fixing or where the fix is wrong/not good enough)
Traneptora
2022-11-28 03:15:57
I can take a look
_wb_
2022-11-29 11:50:23
Current draft: (clarified bit depth, save_before_ct, and hshift/vshift in modular)
Traneptora
2022-11-29 04:35:34
Section G.2.2: > If the kUseLfFrame flag in frame_header is set, this subclause is skipped and the samples from LFFrame[frame_header.lf_frame] are used instead of the values that would be computed by this subclause and I.5.2.
2022-11-29 04:35:45
`frame_header.lf_frame` is not defined in the FrameHeader bundle
2022-11-29 04:48:37
is this the same `lf_level` used?
2022-11-29 04:49:17
does that mean `kUseLFFrame != 0` and `lf_level > 1` means you have an LF Frame which itself uses an LF Frame for its LF Coefficients?
2022-11-29 04:53:46
Section I.4:
2022-11-29 04:53:48
> To compute context, the decoder uses the procedures and constants in the following code, where c is the current channel (with 0=X, 1=Y, 2=B), s is the Order ID (see Table I.7) of the DctSelect value and qf is the HfMul value for the current varblock, and **qdc[3] are the quantized LF values** of (the top-left 8×8 block within) the current varblock (taking into account chroma subsampling if needed). The lists of thresholds qf_thresholds and lf_thresholds[3], and block_ctx_map are as decoded in LfGlobal ( G.1.1 and I.2.2).
2022-11-29 04:53:58
specifically, these are the *quantized* LF Values
2022-11-29 04:54:09
what happens if you have an LF Frame? which skips dequantization
2022-11-29 04:56:33
especially considering the LF Frame values are floats
2022-11-29 04:56:38
and the quantized LF Values are ints
_wb_
2022-11-29 05:07:59
that's a really good question — <@179701849576833024> any idea what libjxl will do here? just have all zeroes there or something? we need to put it in the spec...
Traneptora
2022-11-29 05:11:58
I tried populating the quantized values and I got some incorrect values
2022-11-29 05:12:04
I tried populating zeroes and got other incorrect values
2022-11-29 05:12:24
lenna-correct.jxl
2022-11-29 05:12:33
(converted to png with libjxl for discord)
2022-11-29 05:12:52
using the dequantized LF values produced this
2022-11-29 05:13:02
using zeroes produced this
2022-11-29 05:13:11
those are varblock boundaries, btw
2022-11-29 05:13:18
those seams
2022-11-29 05:13:22
2022-11-29 05:14:33
weirdly enough, both ANS streams verified
2022-11-29 05:14:38
which I'm somewhat surprised about
2022-11-29 05:14:51
I would have expected one to not verify and for that to just give it away
_wb_
Traneptora does that mean `kUseLFFrame != 0` and `lf_level > 1` means you have an LF Frame which itself uses an LF Frame for its LF Coefficients?
2022-11-29 05:17:21
Yes, you can have recursive LF frames, up to 4 levels deep. So if you have a max-size image of 2^30 by 2^30 pixels (about one gigapixel by one gigapixel or one exapixel), you can have its 2^27 x 2^27 DC in LFFrame[0] which is a 1:8 image, which can have _its_ 2^24 x 2^24 DC in LFFrame[1] which is a 1:64 image, which can have _its_ 2^21 x 2^21 DC in LFFrame[2], which is a 1:512 image, which can have _its_ 2^18 x 2^18 DC in LFFrame[3], which is a 1:4096 image. That image needs to be modular or VarDCT with its own DC — in the latter case its DC would be a 2^15 x 2^15 downscale at 1:32768 of the full image, which would still be a 1 gigapixel image that would be encoded in 16384 DC groups.
Traneptora
2022-11-29 05:17:53
I see, so `lf_level > 0` is strictly for LF Frames themselves using LF Frames
2022-11-29 05:18:18
this was not clear to me
2022-11-29 05:18:38
either way, the `lf_frame` is not defined, it should be `lf_level`
_wb_
Traneptora I tried populating the quantized values and I got some incorrect values
2022-11-29 05:19:28
wait how did you get a bitstream that uses qdc in combination with LF frames?
Traneptora
_wb_ wait how did you get a bitstream that uses qdc in combination with LF frames?
2022-11-29 05:19:41
because it's used in the HF Coefficient decoding
_wb_
2022-11-29 05:20:14
yeah but libjxl only uses that when recompressing jpegs and then it doesn't produce lf frames
Traneptora
2022-11-29 05:20:27
that's not what the spec says
_wb_
2022-11-29 05:20:32
I mean
Traneptora
2022-11-29 05:20:33
and also not what it does
2022-11-29 05:20:43
see section I.4
_wb_
2022-11-29 05:20:58
it sets the thresholds so that it effectively doesn't use that
Traneptora
2022-11-29 05:21:13
`BlockContext()` uses `qdc`
2022-11-29 05:21:38
and `blockContext()` is needed to determine which context to read the HF coefficient from
_wb_
2022-11-29 05:22:32
libjxl will set `nb_lf_thr` to zero for each channel when doing vardct-from-pixels
2022-11-29 05:23:03
so it effectively doesn't matter what the value of qdc is to determine the context
2022-11-29 05:23:28
that's probably why both ANS streams verified
Traneptora
2022-11-29 05:24:15
but they produced different results
2022-11-29 05:24:16
which is odd
_wb_
2022-11-29 05:24:49
yeah that is odd, could be another spec bug or a jxlatte bug
Traneptora
2022-11-29 05:27:11
oh yea, the number of LF Thresholds is empty
_wb_ yeah that is odd, could be another spec bug or a jxlatte bug
2022-11-29 05:30:33
I see. I have a race condition somewhere because I'm getting different results on subsequent executions
2022-11-29 05:30:36
I have to debug
_wb_
Traneptora oh yea, the number of LF Thresholds is empty
2022-11-29 05:32:56
yeah and the only time they will not be empty is with recompressed jpegs — we needed some extra context there to help with compression, and quantized DC was somewhat useful; for 'full' VarDCT images we can use the adaptive quant weights which are a much more useful piece of context (but not for recompressed jpegs since those obviously don't do real adaptive quantization)
2022-11-29 05:33:40
but we never made lf frames work for recompressed jpegs so we never encountered the issue of what to do with `qdc` in that case
veluca
_wb_ that's a really good question — <@179701849576833024> any idea what libjxl will do here? just have all zeroes there or something? we need to put it in the spec...
2022-11-29 05:36:17
The real answer is that I don't remember, but I thought all 0...
_wb_
2022-11-29 05:36:47
https://github.com/libjxl/libjxl/blob/89926120c9e7aea0483c17be70570a7448609da5/lib/jxl/passes_state.cc#L60
2022-11-29 05:36:54
just checked, all zero indeed
Traneptora
_wb_ libjxl will set `nb_lf_thr` to zero for each channel when doing vardct-from-pixels
2022-11-29 05:37:20
sure, but the spec doesn't explicitly say that the number of LF Thresholds must be zero if `kUseLFFrame != 0`
2022-11-29 05:37:37
if this is always the case it should say it
_wb_
2022-11-29 05:38:24
yeah either we do that or we just say that quantized lf coeffs are to be considered zero if an lf frame was used
Traneptora
2022-11-29 05:38:36
that works too
_wb_
2022-11-29 05:51:35
Wait, what libjxl has there are the indices it will use to add to the block ctx. Setting those to zero is effectively the same thing as pretending all qdc values are -inf
2022-11-29 05:52:32
(so they are always in the first bucket of every lf threshold bucket, regardless of what the threshold values are)
2022-11-29 05:53:10
Probably better to just say the nb of thresholds is zero in this case (and check it in libjxl too, I guess)
Traneptora
2022-11-29 06:47:21
turns out the bug was in my inverse vertical squeeze
2022-11-29 06:47:51
I was iterating over each row, and then each column, as usual, rather than iterating over each column and then each row
2022-11-29 06:48:17
which matters as each vertical strip references the previous pixel in the column
2022-11-29 06:48:20
i.e. at higher rows
2022-11-29 06:48:28
so you need to do the rows in order and parallelize over the columns
2022-11-29 06:48:45
after that, it's working properly
_wb_
2022-11-29 06:49:59
Nice, and yes, vertical unsqueeze is a bit tricky to parallelize
Traneptora
2022-11-29 06:50:14
in my case I just swapped the order of the y and x
2022-11-29 06:50:36
basically I have a function which runs one pixel and then I have a loop that iterates over the row
2022-11-29 06:50:43
and then I run that row loop on the pool
2022-11-29 06:51:03
in this case I just changed it so I iterate over the x coordinate on the outside, then run on pool, and then iterate over y coordinate on the inside
_wb_
2022-11-29 06:52:09
Yeah that works, you can simd it too but that's where it gets trickier
Traneptora
2022-11-29 06:54:00
it's decoding correctly now except for this artifact
2022-11-29 06:54:01
2022-11-29 06:54:13
which looks like some sort of X channel overflow. but that's likely a jxlatte bug, not a spec bug.
veluca
Traneptora in this case I just changed it so I iterate over the x coordinate on the outside, then run on pool, and then iterate over y coordinate on the inside
2022-11-29 10:41:26
that works but destroys your cache/memory bandwidth efficiency
Traneptora
veluca that works but destroys your cache/memory bandwidth efficiency
2022-11-29 10:41:46
I don't really have any in this case either way
2022-11-29 10:42:01
I'd have to transpose, squeeze horizontally, and transpose back for maximum efficiency
2022-11-29 10:42:19
but I agree that vertical squeeze is hard to optimize
_wb_
2022-11-30 03:54:59
Today's spec draft:
Traneptora
2022-11-30 03:58:07
I'll take a look at it this afternoon
2022-12-02 04:08:38
DCT-II is the forward DCT and DCT-III is the inverse DCT. Is this intentional?
2022-12-02 04:08:52
I mean the pseudocode/formulas specified for the DCTs works but are these names accurate?
veluca
2022-12-02 04:10:13
probably not
2022-12-02 04:10:35
I don't remember when is the last time we checked those definitions, but a long time ago
Traneptora
2022-12-02 04:11:13
Section J.2, JPEG Upsampling, doesn't specify what happens in corner cases. I'm guessing it uses `Mirror` (section 5.2) like every other restoration/upsampling filter, but this isn't explicit.
2022-12-02 04:12:30
Section C.1: libjxl disallows specific nested-nested-nested etc. distributions from using LZ77 because it can become a zip bomb
2022-12-02 04:12:44
the spec does not
2022-12-02 04:12:54
I don't know if the spec should be disallowing these as well