JPEG XL

Specification issues

2001 - 2110

1501 - 2000

Traneptora

2022-11-08 09:51:49	since things like incorrect parsing can cause values to be invalid
2022-11-08 09:51:53	and it catches that

_wb_

2022-11-08 09:52:07	yeah throwing errors is certainly the best thing to do, even besides the debugging usefulness
2022-11-08 09:53:18	the best decoder is one that decodes all compliant bitstreams and refuses to decode any non-compliant one — we are probably missing some of those cases in libjxl atm btw

dds

	_wb_ currently the spec says "All referenced samples are within the bounds of the reference frame." so this is an invalid bitstream atm
2022-11-08 09:54:47	TIL how ISO specification idiom works: In this case "All referenced samples are within the bounds of the reference frame" makes me think "well isn't that nice" but it's intended to convey "If any sample is outside the bounds of the reference frame then the bitstream is invalid so continue at your own risk".
2022-11-08 09:55:21	silly ISO

_wb_

2022-11-08 09:55:36

yeah I agree the ISO language is confusing

lonjil

2022-11-08 09:56:07

I have heard from many people that ISO standards tend to be incredibly detailed and yet written such that it's hard to tell exactly what you're supposed to do, leading to interoperability problems.

_wb_

2022-11-08 09:56:23	if we say "they are within the bounds" it means "if they're not, this is not a valid bitstream"
2022-11-08 09:57:57	if we say "they must be within the bounds" it means "no need to check, it's physically impossible that they are not within the bounds, e.g. it's a consequence of the bitstream syntax that they have to be"

dds

2022-11-08 09:58:08	oh what
2022-11-08 09:59:04	ok i can only find one occurance of 'must' and it's not confusing

_wb_

2022-11-08 09:59:33	> Levels are defined in such a way that if a decoder supports level N, it must also support lower levels.
2022-11-08 10:00:14	yeah this statement is a consequence of the way the levels are defined, so we can use "must" there

dds

2022-11-08 10:01:04

it feels like the ISO language was designed for countries to agree on common standards for traffic signs, not for nitpicky geeks trying to level up the internet

_wb_

2022-11-08 10:01:47	most of the normative statements are done with "shall" in ISO, but we avoided making all sentences sound biblical and having hundreds of "shall"s by leveraging this sentence in clause 7: > These specifications are normative in the sense that they are defining an output that alternative implementations shall duplicate.
2022-11-08 10:02:46	that one single "shall" makes basically all the rest of the document have implicit shalls
2022-11-08 10:04:38	half the job of an ISO staff editor is making sure the use of shall, should, must etc is done correctly, and it's very confusing if you're not used to it so I imagine most first-time spec editors get it wrong all the time

dds

2022-11-08 10:08:38

so ISO speak is a bit like legal speak in that it mostly reads like normal English but the words don't always mean what you think they do

improver

2022-11-08 11:03:02	they mostly mean what is specified in long intros many skip
2022-11-08 11:04:20	but it was pretty clear at least to me that anything not specified was not part of the spec
2022-11-08 11:05:18	it's just kinda how you're used to approaching specifications

lonjil

2022-11-09 01:15:58

Is that supposed to be a plus between 18 and u(6), rather than a new line?

Traneptora

2022-11-09 04:49:48

It is

_wb_

2022-11-09 05:31:03

Yeah I also just saw that, will be fixed.

Traneptora

2022-11-09 05:13:01	Section K.5.2, definition of SplitMix64, has `0×` instead of `0x` in front of the hexadecimal numbers
2022-11-09 06:12:03	> The strength of the noise is computed as follows. Given the X and Y samples of the input pixel, let InR = Y + X and InG = Y − X. Let InscaledR = InR * 6 and InscaledG = InG * 6. These values are split into integer parts in_intR = floor(InscaledR) and in_intG = floor(InscaledG) and fractional parts in_fracR = InscaledR − in_intR and in_fracG = InscaledG − in_intG. If the value InScaledR or InScaledG is 8 or more, the corresponding integer and fractional parts are clamped back to 7 and 1 respectively. > > Using the LUT values decoded as described in K.5.1, the strengths SR and SG are defined as SR = LUT[in_intR] * (1 − in_fracR) + LUT[in_intR + 1] * in_fracR and SG = LUT[in_intG] * (1 − in_fracG) + LUT[in_intG + 1] * in_fracG. Each strength is clamped between 0 and 1.
2022-11-09 06:12:12	What if `InR` or `InG` is negative?
2022-11-09 06:12:16	it very well could be the case
2022-11-09 06:20:21	looks like libjxl clamps to 0
2022-11-09 06:20:38	should say that in the spec tho
2022-11-09 06:23:29	Also
2022-11-09 06:23:30	> If the value InScaledR or InScaledG is 8 or more, the corresponding integer and fractional parts are clamped back to 7 and 1 respectively.
2022-11-09 06:23:43	This will overflow `LUT` at 7
2022-11-09 06:23:49	because LUT is an array of size 8
2022-11-09 06:24:00	so LUT[7 + 1] will fail

veluca

	Traneptora Section K.5.2, definition of SplitMix64, has `0×` instead of `0x` in front of the hexadecimal numbers
2022-11-09 06:36:49	how did you even notice 🤣

Traneptora

2022-11-09 06:37:06

because I copied and pasted and got a syntax error <:KEKW:643601031040729099>

veluca

	Traneptora looks like libjxl clamps to 0
2022-11-09 06:37:39	it should
	Traneptora This will overflow `LUT` at 7
2022-11-09 06:38:10	ahhhhh you have no idea how many times I ended up changing the number for that clamping
2022-11-09 06:38:27	I thought I got it right ... what does libjxl clamp to? 6?

Traneptora

2022-11-09 06:39:33	I double checked the LUT overflow, apparently libjxl does the following ```cpp float kScale = 6f; if (in_scaled >= kScale) in_int = kScale - 1; in_frac = 1f; } else { in_int = floor(in_scaled); in_frac = in_scaled - in_int; } ```
2022-11-09 06:39:35	Yup
2022-11-09 06:39:46	`in_int` is capped at 5, because of this
2022-11-09 06:39:52	which means LUT[7] is never used
2022-11-09 06:40:08	although I believe this is a bug in libjxl
2022-11-09 06:40:18	where it should do the following:
2022-11-09 06:40:49	```cpp float kScale = 6f; if (in_scaled >= 7f) in_int = 6; in_frac = 1f; } else { in_int = floor(in_scaled); in_frac = in_scaled - in_int; } ```
2022-11-09 06:41:09	since `float lut[8]` is sealed in stone
2022-11-09 06:41:47	so either Scale needs to change to 7, or the clamping should change, cause otherwise something is weird
2022-11-09 06:45:41	in either case the spec causes an overflow as the spec is written

_wb_

2022-11-09 06:53:38

So bug in both libjxl and spec? Ugh

Traneptora

2022-11-09 06:54:08	perhaps
2022-11-09 06:54:21	unclear if libjxl is bugged, but probably is, given that lut[7] is not used
2022-11-09 06:54:40	spec is definitely bugged as it will cause an overrun of the buffer

_wb_

2022-11-09 06:55:32	So spec should probably say: > If the value InScaledR or InScaledG is 7 or more, the corresponding integer and fractional parts are clamped back to 6 and 1 respectively.
2022-11-09 06:55:57	And libjxl should say what you wrote above

Traneptora

2022-11-09 06:56:37	or equivalent in HIghway-ese
2022-11-09 06:56:46	this is pseudocode
	_wb_ So spec should probably say: > If the value InScaledR or InScaledG is 7 or more, the corresponding integer and fractional parts are clamped back to 6 and 1 respectively.
2022-11-09 06:57:13	it should also mention that they're clamped to 0 if negative

veluca

2022-11-09 07:38:59

I hate that code 🤣 it's too hard for me

Traneptora

2022-11-09 11:34:57

I'm confused about noise groups, and how they interact with upsampling

veluca

2022-11-10 01:01:31	~~that's two of us~~
2022-11-10 01:02:08	iirc the way that works is that noise is never generated in tiles bigger than 256x256, even if upsampling happens

Traneptora

2022-11-10 04:24:59	the spec and libjxl differ in randomness seed behavior
2022-11-10 04:26:30	libjxl generates one batch of 16 numbers for every 16 pixels wide, and then it generates an extra batch for the remaining stragglers
2022-11-10 04:26:39	it does this even if the size of the remaining stragglers is zero
2022-11-10 04:26:59	i.e. if the width of the group is a multiple of 16, which it often will be, it generates an extra batch and does nothing with it
2022-11-10 04:30:00	this isn't in the spec
2022-11-10 04:36:25	I don't know if this is just a libjxl bug, or if it should be changed in the spec

veluca

	Traneptora i.e. if the width of the group is a multiple of 16, which it often will be, it generates an extra batch and does nothing with it
2022-11-10 04:43:20	uh, I don't think it's supposed to do that
2022-11-10 04:43:28	at the very least it's not intentional...

Traneptora

2022-11-10 04:43:51	dec_noise.cc, line 72
2022-11-10 04:43:57	``` size_t x = 0; // Only entire batches (avoids exceeding the image padding). for (; x + kFloatsPerBatch <= xsize; x += kFloatsPerBatch) { rng->Fill(batch); for (size_t i = 0; i < kFloatsPerBatch; i += Lanes(df)) { BitsToFloat(reinterpret_cast<const uint32_t>(batch) + i, row + x + i); } } // Any remaining pixels, rounded up to vectors (safe due to padding). rng->Fill(batch); size_t batch_pos = 0; // < kFloatsPerBatch for (; x < xsize; x += N) { BitsToFloat(reinterpret_cast<const uint32_t>(batch) + batch_pos, row + x); batch_pos += N; } ```
2022-11-10 04:44:05	`x + kFloatsPerBatch <= xsize`
2022-11-10 04:44:11	I think this should probably be a strict inequality
2022-11-10 04:44:26	then, the remaining batch gets filled for 1-16, rather than for 0-15
2022-11-10 04:44:58	notice that if `xSize` is a multiple of `kFloatsPerBatch` (i.e. 16)
2022-11-10 04:45:15	then we have `x + 16 <= xSize` as the limiting condition, which is the same thing as `x < xSize`
2022-11-10 04:46:02	so this fills the entire row
2022-11-10 04:46:17	and then `rng->Fill(batch);` is called again, unconditionally.
	veluca at the very least it's not intentional...
2022-11-10 04:46:40	in that case, i'll code to spec and assume this is a libjxl bug

veluca

2022-11-10 04:52:57

can you file an issue?

Traneptora

	veluca can you file an issue?
2022-11-10 05:06:46	https://github.com/libjxl/libjxl/issues/1888

veluca

2022-11-10 05:09:57

thanks!

_wb_

2022-11-10 07:38:00

I propose we indeed treat this as a libjxl bug and change some noise on existing images

Traneptora

2022-11-10 07:47:20	what about the other discrepancy, with the scale factor?
2022-11-10 07:47:51	where libjxl maps `scaled >= 6f` to `intPart = 5` and `fracPart = 1f` (causing `lut[7]` to be unused)
2022-11-10 07:48:37	or the spec maps `scaled >= 8f` to `intPart = 7` and `fracPart = 1f` (causing `lut[8]` to blow up)
2022-11-10 07:48:59	IMO it should map `scaled >= 7f` to `intPart = 6` and `fracPart = 1f`

veluca

2022-11-10 07:53:01	we also should fix, and ideally for the last time as we already "fixed" it way too many times
2022-11-10 07:53:09	can you also open a bug to libjxl for this?

Traneptora

2022-11-10 08:04:21	yup, there's one more spec issue with noise, the spec's missing a 0.5f multiplier somewhere
2022-11-10 08:04:23	lemme dig it up
2022-11-10 08:25:40	I gotta go do some real lifing but I'll try to open an issue about that soon

spider-mario

2022-11-10 08:25:52	we discussed that LUT thing a while ago, I believe
2022-11-10 08:26:02
2022-11-10 08:26:09	it’s that code, isn’t it?

Traneptora

2022-11-10 08:27:08	yes, but that's not the issue
2022-11-10 08:27:18	the current code isn't sawtoothed in either the spec or libjxl, the issue is the integer part threshold
2022-11-10 08:27:32	not the fractional part

spider-mario

2022-11-10 08:27:48	right, but I thought we had specifically fixed that too
2022-11-10 08:28:14	e.g. https://github.com/libjxl/libjxl/pull/1238/files

Traneptora

2022-11-10 08:29:25	somehow this change isn't in git main libjxl though?
2022-11-10 08:29:31	well at least the one in `dec_noise.cc`
2022-11-10 08:30:22	now that code is in `stage_noise.cc`
2022-11-10 08:30:22	https://github.com/libjxl/libjxl/blob/main/lib/jxl/render_pipeline/stage_noise.cc#L71
2022-11-10 08:30:33	and it appears to use the older version

spider-mario

2022-11-10 08:31:21	oh no
2022-11-10 08:31:26	thanks for spotting this

Traneptora

2022-11-10 08:32:09	yw :)
2022-11-10 08:34:18	the seeding change did make it into main though
2022-11-10 11:03:13	Last spec discrepancy relating to noise synthesis, btw:
2022-11-10 11:03:38	> `Given the X and Y samples of the input pixel, let InR = Y + X and InG = Y − X.`
2022-11-10 11:03:49	it's actually `(Y+X)/2` and `(Y-X)/2`
2022-11-10 11:04:54	https://github.com/libjxl/libjxl/blob/main/lib/jxl/render_pipeline/stage_noise.cc#L192
2022-11-10 11:05:21	notice that you're multiplying by `0.5` first
2022-11-10 11:05:40	here `half` is just `const auto half = Set(d, 0.5f);`
2022-11-10 11:07:40	I assume this is supposed to happen and a spec bug, not a libjxl bug
2022-11-10 11:08:01	so I'm not going to report it on the issue tracker. but in either case spec and libjxl don't match, so at least one of the two is not correct
2022-11-11 03:21:02	> The dequantization matrices are computed for each possible value of the DctSelect field as the inverse of the weights matrix.
2022-11-11 03:21:13	How am I supposed to compute the inverse of the weights matrix if they are not square?
2022-11-11 03:23:33	unless you mean reciprocols taken element-wise
2022-11-11 03:23:39	in which case, can I assume they are nonzero?
2022-11-11 03:45:06	```cpp GetDCTQuantWeights(params) { bands(0) = params[0]; for (i = 1; i < len; i++) { bands(i) = bands(i − 1) * Mult(params[i]); /* bands[i] >= 0 / for (y = 0; y < Y; y++) for (x = 0; x < X ; x++) { dx = x / (X − 1); dy = y / (Y − 1); distance = sqrt(dxdx + dy*dy); weight = Interpolate(distance, sqrt(2) + 1e−6, bands, num_bands); weights(x, y) = weight; } } } ```
2022-11-11 03:45:09	this doesn't make sense
2022-11-11 03:45:29	you're overwriting `weights[x][y]` each iteration of params
2022-11-11 03:45:42	and it's not referenced again
2022-11-11 03:45:48	is this supposed to be `+=`?
2022-11-11 03:46:49	or is it not supposed to be a nested for loop?
2022-11-11 03:58:37	> For encoding mode DCT2, params(c, i) are copied in position (x, y) as follows, where rectangles are defined by their top right and bottom left corners
2022-11-11 03:58:45	Is this supposed to say "top left and bottom right"?
2022-11-11 04:08:32	> For encoding mode DCT4×8, the weights matrix is obtained by copying into position (x, y) the value in position (x, y Idiv 2) in the 8 × 4 matrix
2022-11-11 04:08:39	and 8x4 matrix has 8 rows and 4 columns. it's taller than it is wide
2022-11-11 04:08:46	should this say 4x8?
2022-11-11 04:09:17	using the mathematical convention of Rows by Columns, not the convention of width by height
2022-11-11 04:36:31	DCT2 leaves weight[0][0] undefined
2022-11-11 04:36:35	does it deafult to 1?

veluca

	Traneptora in which case, can I assume they are nonzero?
2022-11-11 07:07:59	Yup (and yes, it is element by element)
	Traneptora or is it not supposed to be a nested for loop?
2022-11-11 07:09:01	facepalm no it is not supposed to be nestes
	Traneptora does it deafult to 1?
2022-11-11 07:09:44	Doesn't matter, weight 0 0 is never used

Traneptora

2022-11-11 12:26:34	I.6.5:The pairs (x, y) in the LLF vector is sorted in ascending order according to the value y * bwidth + x.
2022-11-11 12:26:47	I assume this is supposed to say `y * (bwidth / 8) + x`
2022-11-11 12:26:54	otherwise it overflows the size of LLF

_wb_

2022-11-11 12:57:04

Isn't bwidth the width in blocks, so already /8?

Traneptora

2022-11-11 01:05:29	it's the matrix width for a specific block
2022-11-11 01:13:05	> Let nb_block_ctx be equal to max(block_ctx_map) + 1. The decoder reads a histogram with `495 * num_hf_presets * nb_block_ctx` pre-clustered distributions D from the codestream as specified in C.1.
2022-11-11 01:13:13	Where's this 495 coming from? I can't find it in libjxl code.

veluca

	Traneptora I assume this is supposed to say `y * (bwidth / 8) + x`
2022-11-11 01:15:52	does it actually matter as long as it's `y * <something greater than any x> + x`?

Traneptora

2022-11-11 01:16:10

yes because otherwise you end up with zero padding and overflow

veluca

2022-11-11 01:16:16	no?
2022-11-11 01:16:26	it just says how they are sorted

Traneptora

2022-11-11 01:16:30

well, I suppose not, it's an order

veluca

2022-11-11 01:16:31

not what their positions are

Traneptora

2022-11-11 01:16:32

not an index

veluca

2022-11-11 01:17:03	in practice they do end up at index `y*bw/8+x`
	Traneptora > Let nb_block_ctx be equal to max(block_ctx_map) + 1. The decoder reads a histogram with `495 * num_hf_presets * nb_block_ctx` pre-clustered distributions D from the codestream as specified in C.1.
2022-11-11 01:17:33	that's a good question, let me check

Traneptora

2022-11-11 01:18:03	upon further inspection, it appears to be 458 + 37
2022-11-11 01:18:10	where 37 is kNonZeroBuckets
2022-11-11 01:18:17	and 458 is kZeroDensityContextCount

veluca

2022-11-11 01:18:53

yep

Traneptora

2022-11-11 03:26:13	> Finally, for k in the range [num_blocks, size), the decoder reads an integer ucoeff from the codestream
2022-11-11 03:26:19	`size` is not defined here
2022-11-11 03:27:24	unless it's the natural coeffient order size way back in hfpass
2022-11-11 03:27:30	if that's what it is, this is very confusing
2022-11-11 04:24:51	that spec doesn't say what happens if it reads non_zeroes directly and it's immediately zero
2022-11-11 04:25:12	spec doesn't disallow this

veluca

2022-11-12 12:25:35

doesn't that just make all of AC 0?

Traneptora

2022-11-12 03:52:49	spec doesn't say that
2022-11-12 03:53:32	Typo in section I.3.2: ```c BlockContext() { idx = (c < 2 ? c ^ 1 : 2) * 13 + s; idx = (qf_thresholds.size() + 1); for (t : qf_thresholds) if (qf > t) idx++; for (i = 0; i < 3; i++) idx = (lf_thresholds[i].size() + 1); lf_idx = 0; for (t : lf_thresholds[0]) if (qdc[0] > t) lf_idx++; lf_idx = (lf_thresholds[2].size() + 1); for (t : lf_thresholds[2]) if (qdc[2] > t) lf_idx++; lf_idx = (lf_thresholds[0].size() + 1); // <- TYPO HERE for (t : lf_thresholds[1]) if (qdc[1] > t) lf_idx++; return block_ctx_map[idx + lf_idx]; } ```
2022-11-12 03:53:45	I believe that line should say `lf_thresholds[1].size() + 1`
2022-11-12 03:53:55	not `lf_thresholds[0]`
2022-11-12 03:54:22	Typo in section I.3.2: ``` CoeffNumNonzeroContext[64] = { 0, 0, 31, 62, 62, 93, 93, 93, 93, 23, 123, 123, 123, 152, 152, 152, 152, 152, 152, 152, 152, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206}; ```
2022-11-12 03:54:36	The `23` in the top row, 4th from the right should be 123
2022-11-12 03:54:51	also, libjxl starts these arrays with `0xBAD` instead of zero
2022-11-12 03:55:15	I don't know if that's because it can be anything, or if matters
2022-11-12 03:55:55	typo in section I.3.2:
2022-11-12 03:56:17	``` if (k == num_blocks) { if (non_zeros > size / 16) prev = 1; else prev = 0; } ``` this condition should be `prev = 0; else prev = 1`
2022-11-12 03:56:44	libjxl dec_group.cc line 486 has this: `size_t prev = (nzeros > size / 16 ? 0 : 1);`

veluca

	Traneptora also, libjxl starts these arrays with `0xBAD` instead of zero
2022-11-12 02:19:12	that one should never be used

Traneptora

2022-11-12 06:50:22	Section I.2.2: there's a +1 missing in the QF Thresholds decoding
2022-11-12 06:50:34	``` for (i = 0; i < nb_qf_thr; i++) qf_thresholds.push_back(U32(u(2), 4 + u(3), 12 + u(5), 44 + u(8)))); ```
2022-11-12 06:50:38	This is in the pseudocode
2022-11-12 06:51:13	`entropy_coder.cc`, line 52
2022-11-12 06:51:18	```cpp for (uint32_t& i : qft) { i = U32Coder::Read(kQFThresholdDist, br) + 1; } ```
2022-11-12 06:51:24	notice there's a +1 there.
2022-11-12 11:27:41	Section I.2.4 has this code:
2022-11-12 11:27:48	```cpp if (encoding_mode == RAW) { params.denominator = F16(); ZeroPadToByte(); params = /* read a 3-channel image from a modular sub-bitstream of the same shape as the required quant matrix */; } ```
2022-11-12 11:28:34	That `ZeroPadToByte()` should not be there. If you do it, you'll fail to verify the submodular bitstream for example with ants.jxl
2022-11-12 11:29:10
2022-11-12 11:46:41	Also, something that appears to missing from Annex I (VarDCT) is an overview. Annex H (modular) had an overview of the general process of modular
2022-11-12 11:46:47	VarDCT's annex appears to be missing it
2022-11-12 11:46:55	It explains how to do each step, but doesn't explain when to do each step
2022-11-12 11:47:28	I now have Quantized LF coefficients and Quantized HF coefficients, but I'm unsure when I dequent them, when I detransform them (via DCT or some other method), and when I decorrelate them.
2022-11-12 11:47:39	I know how to do those things, but not when.
2022-11-12 11:49:46	The spec also appears to be lacking guidance on how to handle edge cases, like if image dimensions are odd, or not multiples of 8, etc.
2022-11-12 11:51:11	if the image is a multiple of 8x8 it's not hard, but what if it isn't?

veluca

2022-11-13 06:18:41

then you pretend it is and crop at the very end, I'm sure it's written *somewhere*

Traneptora

2022-11-13 08:36:35	maybe it's in the preamble, but I didn't see it in annex I
2022-11-13 08:37:05	in either case, a general overview of VarDCT like there is for modular would be helpful

Fraetor

2022-11-13 12:17:15

Minor issue, but in section 4.3: Operators, for and and or if is misspell as iff.

_wb_

2022-11-13 12:21:54

"iff" means "if and only if", but you're right, it should be spelled out because not everyone will be familiar with that abbreviation

Fraetor

2022-11-13 02:05:40	Ah, well I learned something new.
2022-11-13 02:11:20	Also, the examples are really helpful.

veluca

2022-11-13 02:25:42

that's my fault for being a mathematician 😛

Traneptora

2022-11-13 03:48:29	same, mathematicians ^_^
	veluca then you pretend it is and crop at the very end, I'm sure it's written somewhere
2022-11-13 10:27:30	well it's not necessary for modular images so it'd have to be in VarDCT's section
2022-11-13 10:27:36	and I didn't see it although I might have glossed over it

veluca

2022-11-13 11:01:42	I can't see that either...
2022-11-13 11:02:18	(and it's especially tricky how it is supposed to behave especially as you mix in extra channels and upsampling, so it really ought to be mentioned somewhere)

Traneptora

2022-11-13 11:08:30	extra channels are done in modular so they don't have that crop factor right?
2022-11-13 11:08:48	VarDCT is specifically about the exactly three color channels iirc

veluca

2022-11-13 11:27:16

correct

Traneptora

2022-11-14 01:09:54	Section I.2.1, Quantizer Bundle, is missing a factor of `1 << 16` in global_scale
2022-11-14 01:10:03	it has `mXDC = m_x_lf_unscaled / (global_scale * quant_lf);`
2022-11-14 01:10:14	but it should have `mXDC = (1 << 16) * m_x_lf_unscaled / (global_scale * quant_lf);`
2022-11-14 01:10:29	and same for Y and B

Moritz Firsching

2022-11-14 01:40:32	..fixing this now
2022-11-14 01:42:25	also watch out for taking the reciprocals of `m_foo_lf_unscaled` https://github.com/lifthrasiir/j40/blob/acfe38647a0614e0562f739d4a40d2fc8a397e7f/j40.h#L5136

_wb_

2022-11-14 02:26:26

<@604964375924834314> I'm fixing noise in the spec, can you fix it in the render pipeline code in libjxl?

spider-mario

2022-11-14 05:07:10	yep, I suppose it will mostly boil down to making sure that the previous fixes are really applied
2022-11-14 05:07:26	I was afraid of exactly that when we started copying code to the rendering pipeline
2022-11-14 05:07:33	which makes me wonder how I didn’t catch this

Traneptora

2022-11-14 05:12:27	> `ScaleF(N, n) { return 1 / (cos(pi / (2 * N)) * cos(pi / N) * cos(pi / (N / 2))); }`
2022-11-14 05:12:30	Section I.6.6
2022-11-14 05:12:41	This doesn't make sense as there's no little `n` here
2022-11-14 05:12:50	I tried to trace it back to libjxl but it just has a bunch of precomputed LUTs
2022-11-14 05:13:02	so I can't figure out what it did

spider-mario

	_wb_ <@604964375924834314> I'm fixing noise in the spec, can you fix it in the render pipeline code in libjxl?
2022-11-14 05:17:52	it was actually already fixed in #12, wasn’t it?
2022-11-14 05:17:55	(the spec)

Traneptora

2022-11-14 05:18:22	I can't see #12, but i can say that the spec as written is off by 1 from libjxl in the other direction
2022-11-14 05:18:36	it clamps to 7, not 6, so adding 1 a bit later will overflow

_wb_

	spider-mario it was actually already fixed in #12, wasn’t it?
2022-11-14 05:27:52	Looks like the spec was correct but libjxl was off by -1, then we fixed libjxl but also did a +1 in the spec, and then the render pipeline code regressed to the off by -1 version so we ended up having -1 in libjxl and +1 in the spec.

Traneptora

2022-11-14 05:45:38	For DctSelect types IDENTIFY, DCT2×2, DCT4×4, DCT8×4, DCT4×8, AFV0, AFV1, AFV2, AFV3, the output is equal to the input.
2022-11-14 05:45:41	What's Identify?

_wb_

2022-11-14 06:01:18	A typo for IDENTITY, which was renamed to Hornuss between DIS and FDIS because someone remarked that it is not at all an identity transform 🙂
2022-11-14 06:01:36	Good catch!

Traneptora

2022-11-14 06:21:45

I see, and a search and replace missed it cause it was a typo <:KEKW:643601031040729099>

Moritz Firsching

2022-11-15 09:50:19

I fixed the IDENTIFY/Hornuss typo in the spec draft...

Traneptora

2022-11-15 12:58:39	for RAW quant weight decoding, the spec says that you use `params / denominator`
2022-11-15 12:58:49	but it's actually `1 / (params * denominator)`
2022-11-15 12:59:00	(this is before you take the reciprocols of all the weights)
2022-11-15 03:04:57	LLF coefficient ScaleF is missing a multiple of `n`
2022-11-15 03:05:07	```cpp ScaleF(N, n) { return 1 / (cos(pi / (2 * N)) * cos(pi / N) * cos(pi / (N / 2))); } ```
2022-11-15 03:05:11	It should say this:
2022-11-15 03:05:25	```c++ ScaleF(N, n) { return 1 / (cos(n * pi / (2 * N)) * cos(n * pi / N) * cos(n * pi / (N / 2))); } ```
2022-11-15 03:05:53	alternatively it could be relabeled for clarity
2022-11-15 03:09:53	```cpp ScaleF(c, b) { inverse_scale_f = cos(b * pi / (2 * c)) * cos(b * pi / c) * cos(2 * b * pi / c); return 1 / inverse_scale_f; } ```
2022-11-15 03:10:24	I chose `b` and `c` here because the values that are fed into it lower in that section are `ScaleF(cy, bheight)`
2022-11-15 03:10:26	which matches up
2022-11-15 03:10:52	I don't like mixing `N` and `n` in the same function, that could get confusing
2022-11-15 03:10:58	(section I.6.6, btw)
2022-11-15 03:22:20	I think this is what it's supposed to be, at least
2022-11-15 03:52:21	it's hard to tell because libjxl just has LUTs
2022-11-15 07:18:21	The default value for `x_qm_scale` in the Frame Header is wrong
2022-11-15 07:18:46	in the spec it says the default is 3
2022-11-15 07:19:04	with a signalling condition of `!all_default and metadata.xyb_encoded and encoding == kVarDCT`
2022-11-15 07:19:28	the actual default is `3` iff `metadata.xyb_encoded and encoding == kVarDCT`
2022-11-15 07:19:46	and defaults to `2` iff `!metadata.xyb_encoded \|\| encoding == kModular`
2022-11-15 07:20:40	so if, for example, you have a reconstructed JPEG with `!all_default && !metadata.xyb_encoded` it should default to 2, not to 3
2022-11-15 07:22:51	libjxl has this (converted to pseudocode) ```cpp if (metadata.xyb_encoded && encoding == kVarDCT) { visitor->visit_fields(&x_qm_scale, 3, /default=/all_default) visitor->visit_fields(&b_qm_scale, 2, /default=/all_default) } else { x_qm_scale = 2; b_qm_scale = 2; } ```

_wb_

2022-11-15 07:28:23

Oh, wow that's confusing

Traneptora

2022-11-15 07:28:46	tripped me up when debugging CFL
	Moritz Firsching also watch out for taking the reciprocals of `m_foo_lf_unscaled` https://github.com/lifthrasiir/j40/blob/acfe38647a0614e0562f739d4a40d2fc8a397e7f/j40.h#L5136
2022-11-15 08:46:10	already figured that one out since it's used in modular
	Traneptora but it's actually `1 / (params * denominator)`
2022-11-15 08:47:14	https://github.com/libjxl/libjxl/blob/main/lib/jxl/quant_weights.cc#L238
2022-11-15 11:57:29	> Horizontal (vertical) upsampling is defined as follows: the channel width (height) is multiplied by two (but limited to the image dimensions), and every subsampled pixel B with left (top) neighbour A and right (bottom) neighbour C is replaced by two upsampled pixels with values 0.25 * A + 0.75 * B and 0.75 * B + 0.25 * C.
2022-11-15 11:57:48	What happens in edge cases? i.e. what happens if hshift == 1 and x=0

_wb_

2022-11-16 10:10:22

I assume mirroring, i.e. A=B for the leftmost pixels and C=B for the rightmost ones. <@179701849576833024> is that how we do it? We should add that to the spec then...

veluca

2022-11-16 10:12:34

Somewhere the spec says that whenever we access out of bound pixels they are mirrored

_wb_

2022-11-16 10:30:15	I guess that's true, but in many places we explicitly say so too. It's a bit confusing if we sometimes say it explicitly and sometimes we rely on the generic statement.
2022-11-16 10:54:53
2022-11-16 10:55:16	This should be coefficients, not samples, right?

Traneptora

2022-11-16 12:27:51

that's correct, those are dequantized coefficients

_wb_

2022-11-16 12:35:36	I'll change that then to 'coefficient' instead of 'sample', since this could easily be misinterpreted as doing things in the pixel domain, not the dct domain
2022-11-16 12:37:20	Also going to reorganize things so LLF goes earlier

Traneptora

2022-11-16 12:58:18	Thanks. Putting things in the order you do them is ideal
2022-11-16 12:59:37	Namely Decode HF -> Dequant HF -> Recorrelate HF -> Overlay LF to LLF atop HF -> Inverse DCT
2022-11-16 01:00:26	considering that LLF is not recorrellated it should go after CfL

_wb_

2022-11-16 01:02:34	Yeah, I'm going to first define DCT, then LLF (since it uses DCT_2D and it's better to avoid forward refs imo), then Coefficients to samples
2022-11-16 05:33:15	Updated draft:
2022-11-16 05:46:40	oh nice, there's a nicer looking html output

Traneptora

2022-11-16 09:26:51	way better than Cambria :)
2022-11-16 09:33:03	There's a typo in section I.9.6: `coeffs_8x4(0, 0)` should be `coeffs_4x8(0, 0)`
2022-11-16 09:33:22	as the variable is called 4x8 even in the DCT8x4 transform
2022-11-16 10:18:19	End of the AFV transform definition
2022-11-16 10:18:23	``` samples_4x8 = IDCT_2D(coeffs_4x8); for (iy = 0; iy < 4; iy++) for (ix = 0; ix < 8; ix++) samples(ix, (flip_y == 1 ? 0 : 4) + iy) = samples_4x4(ix, iy); ```
2022-11-16 10:18:27	I believe that last samples should say samples4x8
2022-11-17 06:28:54	AFV transform definition
2022-11-17 06:28:58	> `coeffs_4x4(0, 0) = coefficients(0, 0) − coefficients(0, 1) + coefficients(1, 0);`
2022-11-17 06:29:11	I believe this should be `coeffs_4x4(0, 0) = coefficients(0, 0) + coefficients(0, 1) - coefficients(1, 0);`
2022-11-17 06:29:24	I haven't double checked against libjxl, but making that change fixes a number of visual artifacts with the corner of AFV

yurume

2022-11-17 06:34:35

that matches J40: https://github.com/lifthrasiir/j40/blob/216f83ba7a616685f847268f009e8fcb85106f76/j40.h#L6147

Traneptora

2022-11-17 06:35:58	good catch, I missed the second half
2022-11-18 08:38:09	Section J.4.1 appears to have some links to the wrong sections
2022-11-18 08:38:43	section J.4.2 specifies how to calculate distance as a function of `(x, y, cx, cy)`
2022-11-18 08:38:56	and section J.4.3 specifies how to calculate weight as a function of distance and sigma
2022-11-18 08:40:11	but I think the linked sections are wrong, since the third step specifies the L1 distance in J.4.3
2022-11-18 08:40:36	was this something that got shuffled, and the linked sections didn't get updated?
2022-11-18 08:43:04	Section F.2, between tables F.6 and F.7, definition of `downsample[i]`:
2022-11-18 08:43:15	something got messed up here, idk what
2022-11-18 08:44:05	Top of Annex F, link on the right hand side to F.3 doesn't work
2022-11-18 08:44:34	It links to `#toc` but the anchor for F.3 is `#toc45`
2022-11-18 08:44:51	dunno how that happened
2022-11-18 08:45:31	it looks like there is a div with an ID of "toc" but it doesn't work, and idk why
2022-11-18 08:45:54	using firefox

_wb_

2022-11-18 09:24:00	looks like it's a reserved keyword or something that metanorma is already using for the document toc
2022-11-18 09:24:06	renaming it fixed it

Traneptora

2022-11-18 09:24:52
2022-11-18 09:25:06	Why is sigma being clamped to 1e-4 ?
2022-11-18 09:25:22	if it was less than 1e-4, now it is equal to 1e-4, and it's still gonna be less than 0.3

_wb_

2022-11-18 09:26:28

yeah that seems redundant. <@604964375924834314> can we just remove that `sigma = max(1e−4, sigma)` line?

Traneptora

2022-11-18 09:26:46	in theory epf_sigma_for_modular could be less than 0.3 but that would just disable epf for a modular frame
2022-11-18 09:26:57	so there'd be no point
2022-11-18 09:27:16	when you could just disable epf

spider-mario

2022-11-18 09:28:05	I’m not the best person to ask about EPF
2022-11-18 09:28:17	I think <@179701849576833024> knows more about this
2022-11-18 09:28:48	hm, okay, reading the context a bit more, it does seem redundant

_wb_

2022-11-18 09:29:11

oh right I saw a sigma and was thinking splines, but this is epf 🙂

veluca

2022-11-18 09:29:17

It is entirely possible for things to be redundant

Traneptora

2022-11-18 09:34:45	> `inv_sigma = step_multiplier[step] * 4 * (sqrt(0.5) − 1) / sigma;`
2022-11-18 09:34:54	just to double check, is this supposed to be negative?
2022-11-18 09:35:07	` v = 1 − scaled_distance * inv_sigma;` will always be positive
2022-11-18 09:35:23	and there's a check for `(v <= 0)`

_wb_

2022-11-18 09:46:13

EPF section now:

Traneptora

2022-11-18 09:46:58
2022-11-18 09:47:03	yea, it's specified in J.4.2

_wb_

2022-11-18 09:47:05

What I would propose to change it to:

Traneptora

2022-11-18 09:47:09	ah
2022-11-18 09:47:34	I think that's more clear

_wb_

2022-11-18 09:50:46	rewrote the downscale[i] thing a bit, there was some misparsing of code blocks going on there but I also didn't like the parentheses and duplication of the range of i
	veluca It is entirely possible for things to be redundant
2022-11-18 09:55:25	I guess the question is: does `If sigma < 0.3 for a given varblock, the decoder skips all steps on the pixels of that block` also apply in the modular case? If no, then the clamping does make sense and we should clarify that the skipping only happens in the vardct case. If yes, then we should remove the redundant clamping and maybe clarify that "a given varblock" means the whole frame in the modular case, since there is no per-varblock sigma adjustment.

Traneptora

	_wb_ I guess the question is: does `If sigma < 0.3 for a given varblock, the decoder skips all steps on the pixels of that block` also apply in the modular case? If no, then the clamping does make sense and we should clarify that the skipping only happens in the vardct case. If yes, then we should remove the redundant clamping and maybe clarify that "a given varblock" means the whole frame in the modular case, since there is no per-varblock sigma adjustment.
2022-11-18 09:55:57	epf_sigma_for_modular is constant and signalled in the frame header
2022-11-18 09:56:15	so if epf_sigma_for_modular < 0.3, that's identical to the frame header just disabling epf
2022-11-18 09:56:41	it wouldn't make sense for epf_sigma_for_modular to be < 0.3 and have epf enabled
2022-11-18 09:56:57	it might even make sense to forbid it
2022-11-18 04:58:56	DCT4x4Params doesn't match the spec compared to libjxl
2022-11-18 04:59:00	``` // DCT4 (quant_kind 3) static constexpr const QuantEncodingInternal DCT4X4() { return QuantEncodingInternal::DCT4(DctQuantWeightParams({{{{ V(2200.0), V(0.0), V(0.0), V(0.0), }}, {{ V(392.0), V(0.0), V(0.0), V(0.0), }}, {{ V(112.0), V(-0.25), V(-0.25), V(-0.5), }}}}, 4),```
2022-11-18 04:59:02	libjxl
2022-11-18 04:59:28	spec
2022-11-18 10:09:17	``` The resulting quant is then multiplied by a per-block multiplier, the value of HfMul at the coordinates of the 8 × 8 rectangle containing the current sample, and, for the X and B channels, by pow(0.8, frame_header.x_qm_scale − 2) and pow(0.8, frame_header.b_qm_scale — 2), respectively. ```
2022-11-18 10:09:19	This is not correct
2022-11-18 10:09:42	It is multiplied by `(1 << 16) * quantizer.global_scale / HfMul`
2022-11-18 10:09:51	it is not multiplied by `HfMul`
2022-11-18 10:15:18	see line 145 of dec_group.cc
2022-11-19 04:06:52	Section E.4.2:
2022-11-19 04:06:56	``` value = 0; shift = 0; while (1) { b = u(8); value += (b & 127) << shift; if (b <= 127) break; shift += 7; /* shift < 63 */; } ```
2022-11-19 04:07:06	I'm looking at this comment `/* shift < 63 */`
2022-11-19 04:07:15	it's not clear to me if this is a requirement before or after adding 7
2022-11-19 04:09:29	because it's on the same line
2022-11-19 04:10:28	it says earlier that the final integer is at at most 63 bits, so I think it's still technically specified, but it might be clearer if the pseudocode reflected that

_wb_

2022-11-19 04:27:29	basically value can use uint64_t
2022-11-19 04:29:39	Maybe better to write `/* shift <= 56 */` in the beginning of the while body...

Traneptora

2022-11-19 04:54:30	that would be more obvious, yea
2022-11-19 05:14:16	Section E.4.4
2022-11-19 05:14:18	> `/* output nothing, stop reading the tag list, and proceed to ~\ref{iccmaincontent}~ */`
2022-11-19 05:14:30	looks like you left some `\LaTeX` in there

_wb_

2022-11-19 05:20:09

oops, lol

Traneptora

2022-11-19 05:42:58	``` f (tagcode == 1) { // the tag string is set to 4 custom bytes read from the data tag = u(4 * 8) from data stream; } else if (tagcode == 2) { tag = “rRTC”; } else if (tagcode == 3) { tag = “rXYZ”; ```
2022-11-19 05:43:07	is `rRTC` supposed to be `rTRC`?

_wb_

2022-11-19 05:47:36

Uh, best to check libjxl code

Traneptora

2022-11-19 05:48:11	later on it has the following
2022-11-19 05:48:14	``` /* append the following 3 groups of 4 bytes to the output: tag, tagstart, tagsize / if (tagcode == 2) { / append the following additional 6 groups of 4 bytes to the output: “gTRC”, tagstart, tagsize, “bTRC”, tagstart, tagsize */ ```
2022-11-19 05:48:22	I'm guessing TRC stands for Transfer Characteristics
2022-11-19 05:48:27	and rRTC is probably a typo

_wb_

2022-11-19 05:51:20

https://github.com/libjxl/libjxl/blob/9d960611774b5b14a46f33a20768cde2cd0c219f/lib/jxl/icc_codec_common.h#L48

Traneptora

2022-11-19 05:51:33

looks like it is a typo

_wb_

2022-11-19 05:51:53

Good catch. Indeed a typo for rTRC

Traneptora

2022-11-19 05:53:43	> `In this subclause, Shuffle(bytes, width) denotes the following operation: Bytes are inserted in raster order (row by row from left to right, starting with the top row) into a matrix with width rows, and as many columns as needed.`
2022-11-19 05:53:55	Why is it called `width` if the number of columns is arbitrary, and there's that many rows?
2022-11-19 05:54:06	usually the word `height` is used to describe the number of scanlines, i.e. the number of rows.
2022-11-19 05:55:15	How am I supposed to insert bytes in raster order into a matrix with as many columns as needed?
2022-11-19 05:56:52	oh, I see. I create a matrix with `width` rows and `height` columns (yes, this is backward), because I then transpose it later
2022-11-19 05:56:57	but that's very confusing
2022-11-19 05:57:47	and `height` in this case is `CeilDiv(bytes.size(), width)`
2022-11-19 08:55:21	Section E.4.5
2022-11-19 08:55:24	> ` val = (bytes[i + j] + (p >> (width − 1 − j))) & 255;`
2022-11-19 08:55:49	`p >> (width - 1 - j)` should be `p >> (8 * (width - 1 - j))`
2022-11-19 08:55:55	as `width` and `j` are in units of bytes, not bits
2022-11-20 09:18:37	Section J.4.2:
2022-11-20 09:18:51	``` DistanceStep2(x, y, cx, cy) { dist = 0; for (c = 0; c < 3; c++) { dist += abs(sample(x + ix, y + iy, c) − sample(x + cx + ix, y + cy + iy, c)) * rf.epf_channel_scale[c]; } return dist; } ``` `iy` and `ix` are not defined here
2022-11-20 09:18:55	is this a copy/paste typo?
2022-11-20 09:42:05	Section I.8, ScaleF function
2022-11-20 09:42:13	this is my fault, I flipped `c` and `b`
2022-11-20 09:42:51	`c` should be in the numerator, and `b` in the denominator
2022-11-20 09:51:12	Section I.6, Chroma From Luma:
2022-11-20 09:51:20	```ytox_dc_ = static_cast<int>(br->ReadFixedBits<kBitsPerByte>()) + std::numeric_limits<int8_t>::min();```
2022-11-20 09:51:31	spec says -127, but this is -128
2022-11-20 09:52:15	since `signed char` goes from -128 to +127
2022-11-20 09:56:32	this makes a noticeable difference if the codestream encodes 128
2022-11-20 09:56:37	since it goes from 1 to 0
2022-11-21 04:26:33	Section G.3:
2022-11-21 04:26:36	These are in the wrong order
2022-11-21 04:27:14	the correct order is `lf_coeff` then `mlf_group` then `hf_meta`
2022-11-21 04:27:53	See `dec_frame.cc` line 301:
2022-11-21 04:27:56	```cpp Status FrameDecoder::ProcessDCGroup(size_t dc_group_id, BitReader* br) { PROFILER_FUNC; const size_t gx = dc_group_id % frame_dim_.xsize_dc_groups; const size_t gy = dc_group_id / frame_dim_.xsize_dc_groups; const LoopFilter& lf = dec_state_->shared->frame_header.loop_filter; if (frame_header_.encoding == FrameEncoding::kVarDCT && !(frame_header_.flags & FrameHeader::kUseDcFrame)) { JXL_RETURN_IF_ERROR( modular_frame_decoder_.DecodeVarDCTDC(dc_group_id, br, dec_state_)); } const Rect mrect(gx * frame_dim_.dc_group_dim, gy * frame_dim_.dc_group_dim, frame_dim_.dc_group_dim, frame_dim_.dc_group_dim); JXL_RETURN_IF_ERROR(modular_frame_decoder_.DecodeGroup( mrect, br, 3, 1000, ModularStreamId::ModularDC(dc_group_id), /zerofill=/false, nullptr, nullptr, /allow_truncated=/false)); if (frame_header_.encoding == FrameEncoding::kVarDCT) { JXL_RETURN_IF_ERROR( modular_frame_decoder_.DecodeAcMetadata(dc_group_id, br, dec_state_)); } else if (lf.epf_iters > 0) { FillImage(kInvSigmaNum / lf.epf_sigma_for_modular, &dec_state_->sigma); } decoded_dc_groups_[dc_group_id] = uint8_t{true}; return true; } ```
2022-11-21 04:28:33	notice that DecodeVarDCTDC is called before DecodeGroup

_wb_

	Traneptora ``` DistanceStep2(x, y, cx, cy) { dist = 0; for (c = 0; c < 3; c++) { dist += abs(sample(x + ix, y + iy, c) − sample(x + cx + ix, y + cy + iy, c)) * rf.epf_channel_scale[c]; } return dist; } ``` `iy` and `ix` are not defined here
2022-11-21 05:04:44	<@179701849576833024> I suppose there is something missing there? In Step0/1 the ix,iy iterate over 5 values making a cross, I assume it's not the same in Step2 because then why would there be two different functions, but is it just ix=iy=0 or is it also iterating over some set of offsets?

veluca

2022-11-21 05:05:02

No just 0 0

_wb_

_wb_ I guess the question is: does `If sigma < 0.3 for a given varblock, the decoder skips all steps on the pixels of that block` also apply in the modular case? If no, then the clamping does make sense and we should clarify that the skipping only happens in the vardct case. If yes, then we should remove the redundant clamping and maybe clarify that "a given varblock" means the whole frame in the modular case, since there is no per-varblock sigma adjustment.

2022-11-21 05:05:20

also ping <@179701849576833024> on this one

veluca

2022-11-21 05:06:26

It does apply to modular too

_wb_

2022-11-21 05:16:14

Latest draft:

Traneptora

2022-11-21 05:32:40	oh, nice
2022-11-21 09:51:47	Section I.2.3, the default values for `x_factor_lf` and `b_factor_lf` are 128, not 127
	Traneptora ```ytox_dc_ = static_cast<int>(br->ReadFixedBits<kBitsPerByte>()) + std::numeric_limits<int8_t>::min();```
2022-11-21 09:55:10	same line of code as here, default coded value is 0

_wb_

2022-11-22 03:48:53	Today's draft:
2022-11-23 03:05:12	updated draft:
2022-11-23 03:06:03	I changed the signalling of `save_before_ct` quite a bit, I hope it is correct and easier to understand now

Traneptora

2022-11-23 04:29:48

much easier to read now

jjido

2022-11-25 10:12:49

Just want to say than you for the excellent work <@853026420792360980> .

Traneptora

2022-11-26 08:59:24

yw :)

_wb_

2022-11-28 03:04:29

<@853026420792360980> and <@268284145820631040> you could do me a big favor by checking if all the bugs you found are fixed now (and pointing me to the ones that still need fixing or where the fix is wrong/not good enough)

Traneptora

2022-11-28 03:15:57

I can take a look

_wb_

2022-11-29 11:50:23

Current draft: (clarified bit depth, save_before_ct, and hshift/vshift in modular)

Traneptora

2022-11-29 04:35:34	Section G.2.2: > If the kUseLfFrame flag in frame_header is set, this subclause is skipped and the samples from LFFrame[frame_header.lf_frame] are used instead of the values that would be computed by this subclause and I.5.2.
2022-11-29 04:35:45	`frame_header.lf_frame` is not defined in the FrameHeader bundle
2022-11-29 04:48:37	is this the same `lf_level` used?
2022-11-29 04:49:17	does that mean `kUseLFFrame != 0` and `lf_level > 1` means you have an LF Frame which itself uses an LF Frame for its LF Coefficients?
2022-11-29 04:53:46	Section I.4:
2022-11-29 04:53:48	> To compute context, the decoder uses the procedures and constants in the following code, where c is the current channel (with 0=X, 1=Y, 2=B), s is the Order ID (see Table I.7) of the DctSelect value and qf is the HfMul value for the current varblock, and qdc[3] are the quantized LF values of (the top-left 8×8 block within) the current varblock (taking into account chroma subsampling if needed). The lists of thresholds qf_thresholds and lf_thresholds[3], and block_ctx_map are as decoded in LfGlobal ( G.1.1 and I.2.2).
2022-11-29 04:53:58	specifically, these are the quantized LF Values
2022-11-29 04:54:09	what happens if you have an LF Frame? which skips dequantization
2022-11-29 04:56:33	especially considering the LF Frame values are floats
2022-11-29 04:56:38	and the quantized LF Values are ints

_wb_

2022-11-29 05:07:59

that's a really good question — <@179701849576833024> any idea what libjxl will do here? just have all zeroes there or something? we need to put it in the spec...

Traneptora

2022-11-29 05:11:58	I tried populating the quantized values and I got some incorrect values
2022-11-29 05:12:04	I tried populating zeroes and got other incorrect values
2022-11-29 05:12:24	lenna-correct.jxl
2022-11-29 05:12:33	(converted to png with libjxl for discord)
2022-11-29 05:12:52	using the dequantized LF values produced this
2022-11-29 05:13:02	using zeroes produced this
2022-11-29 05:13:11	those are varblock boundaries, btw
2022-11-29 05:13:18	those seams
2022-11-29 05:13:22
2022-11-29 05:14:33	weirdly enough, both ANS streams verified
2022-11-29 05:14:38	which I'm somewhat surprised about
2022-11-29 05:14:51	I would have expected one to not verify and for that to just give it away

_wb_

Traneptora does that mean `kUseLFFrame != 0` and `lf_level > 1` means you have an LF Frame which itself uses an LF Frame for its LF Coefficients?

2022-11-29 05:17:21

Yes, you can have recursive LF frames, up to 4 levels deep. So if you have a max-size image of 2^30 by 2^30 pixels (about one gigapixel by one gigapixel or one exapixel), you can have its 2^27 x 2^27 DC in LFFrame[0] which is a 1:8 image, which can have _its_ 2^24 x 2^24 DC in LFFrame[1] which is a 1:64 image, which can have _its_ 2^21 x 2^21 DC in LFFrame[2], which is a 1:512 image, which can have _its_ 2^18 x 2^18 DC in LFFrame[3], which is a 1:4096 image. That image needs to be modular or VarDCT with its own DC — in the latter case its DC would be a 2^15 x 2^15 downscale at 1:32768 of the full image, which would still be a 1 gigapixel image that would be encoded in 16384 DC groups.

Traneptora

2022-11-29 05:17:53	I see, so `lf_level > 0` is strictly for LF Frames themselves using LF Frames
2022-11-29 05:18:18	this was not clear to me
2022-11-29 05:18:38	either way, the `lf_frame` is not defined, it should be `lf_level`

_wb_

	Traneptora I tried populating the quantized values and I got some incorrect values
2022-11-29 05:19:28	wait how did you get a bitstream that uses qdc in combination with LF frames?

Traneptora

	_wb_ wait how did you get a bitstream that uses qdc in combination with LF frames?
2022-11-29 05:19:41	because it's used in the HF Coefficient decoding

_wb_

2022-11-29 05:20:14

yeah but libjxl only uses that when recompressing jpegs and then it doesn't produce lf frames

Traneptora

2022-11-29 05:20:27

that's not what the spec says

_wb_

2022-11-29 05:20:32

I mean

Traneptora

2022-11-29 05:20:33	and also not what it does
2022-11-29 05:20:43	see section I.4

_wb_

2022-11-29 05:20:58

it sets the thresholds so that it effectively doesn't use that

Traneptora

2022-11-29 05:21:13	`BlockContext()` uses `qdc`
2022-11-29 05:21:38	and `blockContext()` is needed to determine which context to read the HF coefficient from

_wb_

2022-11-29 05:22:32	libjxl will set `nb_lf_thr` to zero for each channel when doing vardct-from-pixels
2022-11-29 05:23:03	so it effectively doesn't matter what the value of qdc is to determine the context
2022-11-29 05:23:28	that's probably why both ANS streams verified

Traneptora

2022-11-29 05:24:15	but they produced different results
2022-11-29 05:24:16	which is odd

_wb_

2022-11-29 05:24:49

yeah that is odd, could be another spec bug or a jxlatte bug

Traneptora

2022-11-29 05:27:11	oh yea, the number of LF Thresholds is empty
	_wb_ yeah that is odd, could be another spec bug or a jxlatte bug
2022-11-29 05:30:33	I see. I have a race condition somewhere because I'm getting different results on subsequent executions
2022-11-29 05:30:36	I have to debug

_wb_

	Traneptora oh yea, the number of LF Thresholds is empty
2022-11-29 05:32:56	yeah and the only time they will not be empty is with recompressed jpegs — we needed some extra context there to help with compression, and quantized DC was somewhat useful; for 'full' VarDCT images we can use the adaptive quant weights which are a much more useful piece of context (but not for recompressed jpegs since those obviously don't do real adaptive quantization)
2022-11-29 05:33:40	but we never made lf frames work for recompressed jpegs so we never encountered the issue of what to do with `qdc` in that case

veluca

	_wb_ that's a really good question — <@179701849576833024> any idea what libjxl will do here? just have all zeroes there or something? we need to put it in the spec...
2022-11-29 05:36:17	The real answer is that I don't remember, but I thought all 0...

_wb_

2022-11-29 05:36:47	https://github.com/libjxl/libjxl/blob/89926120c9e7aea0483c17be70570a7448609da5/lib/jxl/passes_state.cc#L60
2022-11-29 05:36:54	just checked, all zero indeed

Traneptora

	_wb_ libjxl will set `nb_lf_thr` to zero for each channel when doing vardct-from-pixels
2022-11-29 05:37:20	sure, but the spec doesn't explicitly say that the number of LF Thresholds must be zero if `kUseLFFrame != 0`
2022-11-29 05:37:37	if this is always the case it should say it

_wb_

2022-11-29 05:38:24

yeah either we do that or we just say that quantized lf coeffs are to be considered zero if an lf frame was used

Traneptora

2022-11-29 05:38:36

that works too

_wb_

2022-11-29 05:51:35	Wait, what libjxl has there are the indices it will use to add to the block ctx. Setting those to zero is effectively the same thing as pretending all qdc values are -inf
2022-11-29 05:52:32	(so they are always in the first bucket of every lf threshold bucket, regardless of what the threshold values are)
2022-11-29 05:53:10	Probably better to just say the nb of thresholds is zero in this case (and check it in libjxl too, I guess)

Traneptora

2022-11-29 06:47:21	turns out the bug was in my inverse vertical squeeze
2022-11-29 06:47:51	I was iterating over each row, and then each column, as usual, rather than iterating over each column and then each row
2022-11-29 06:48:17	which matters as each vertical strip references the previous pixel in the column
2022-11-29 06:48:20	i.e. at higher rows
2022-11-29 06:48:28	so you need to do the rows in order and parallelize over the columns
2022-11-29 06:48:45	after that, it's working properly

_wb_

2022-11-29 06:49:59

Nice, and yes, vertical unsqueeze is a bit tricky to parallelize

Traneptora

2022-11-29 06:50:14	in my case I just swapped the order of the y and x
2022-11-29 06:50:36	basically I have a function which runs one pixel and then I have a loop that iterates over the row
2022-11-29 06:50:43	and then I run that row loop on the pool
2022-11-29 06:51:03	in this case I just changed it so I iterate over the x coordinate on the outside, then run on pool, and then iterate over y coordinate on the inside

_wb_

2022-11-29 06:52:09

Yeah that works, you can simd it too but that's where it gets trickier

Traneptora

2022-11-29 06:54:00	it's decoding correctly now except for this artifact
2022-11-29 06:54:01
2022-11-29 06:54:13	which looks like some sort of X channel overflow. but that's likely a jxlatte bug, not a spec bug.

veluca

	Traneptora in this case I just changed it so I iterate over the x coordinate on the outside, then run on pool, and then iterate over y coordinate on the inside
2022-11-29 10:41:26	that works but destroys your cache/memory bandwidth efficiency

Traneptora

	veluca that works but destroys your cache/memory bandwidth efficiency
2022-11-29 10:41:46	I don't really have any in this case either way
2022-11-29 10:42:01	I'd have to transpose, squeeze horizontally, and transpose back for maximum efficiency
2022-11-29 10:42:19	but I agree that vertical squeeze is hard to optimize

_wb_

2022-11-30 03:54:59

Today's spec draft:

Traneptora

2022-11-30 03:58:07	I'll take a look at it this afternoon
2022-12-02 04:08:38	DCT-II is the forward DCT and DCT-III is the inverse DCT. Is this intentional?
2022-12-02 04:08:52	I mean the pseudocode/formulas specified for the DCTs works but are these names accurate?

veluca

2022-12-02 04:10:13	probably not
2022-12-02 04:10:35	I don't remember when is the last time we checked those definitions, but a long time ago

Traneptora

2022-12-02 04:11:13	Section J.2, JPEG Upsampling, doesn't specify what happens in corner cases. I'm guessing it uses `Mirror` (section 5.2) like every other restoration/upsampling filter, but this isn't explicit.
2022-12-02 04:12:30	Section C.1: libjxl disallows specific nested-nested-nested etc. distributions from using LZ77 because it can become a zip bomb
2022-12-02 04:12:44	the spec does not
2022-12-02 04:12:54	I don't know if the spec should be disallowing these as well

Info

JPEG XL

General chat

Voice Channels

Archived

Specification issues