|
Traneptora
|
2022-11-08 09:51:49
|
since things like incorrect parsing can cause values to be invalid
|
|
2022-11-08 09:51:53
|
and it catches that
|
|
|
_wb_
|
2022-11-08 09:52:07
|
yeah throwing errors is certainly the best thing to do, even besides the debugging usefulness
|
|
2022-11-08 09:53:18
|
the best decoder is one that decodes all compliant bitstreams and refuses to decode any non-compliant one — we are probably missing some of those cases in libjxl atm btw
|
|
|
|
dds
|
|
_wb_
currently the spec says "All referenced samples are within the bounds of the reference frame." so this is an invalid bitstream atm
|
|
2022-11-08 09:54:47
|
TIL how ISO specification idiom works: In this case "All referenced samples are within the bounds of the reference frame" makes me think "well isn't that nice" but it's intended to convey "If any sample is outside the bounds of the reference frame then the bitstream is invalid so continue at your own risk".
|
|
2022-11-08 09:55:21
|
silly ISO
|
|
|
_wb_
|
2022-11-08 09:55:36
|
yeah I agree the ISO language is confusing
|
|
|
lonjil
|
2022-11-08 09:56:07
|
I have heard from many people that ISO standards tend to be incredibly detailed and yet written such that it's hard to tell exactly what you're supposed to do, leading to interoperability problems.
|
|
|
_wb_
|
2022-11-08 09:56:23
|
if we say "they are within the bounds" it means "if they're not, this is not a valid bitstream"
|
|
2022-11-08 09:57:57
|
if we say "they must be within the bounds" it means "no need to check, it's physically impossible that they are not within the bounds, e.g. it's a consequence of the bitstream syntax that they have to be"
|
|
|
|
dds
|
2022-11-08 09:58:08
|
oh what
|
|
2022-11-08 09:59:04
|
ok i can only find one occurance of 'must' and it's not confusing
|
|
|
_wb_
|
2022-11-08 09:59:33
|
> Levels are defined in such a way that if a decoder supports level N, it must also support lower levels.
|
|
2022-11-08 10:00:14
|
yeah this statement is a consequence of the way the levels are defined, so we can use "must" there
|
|
|
|
dds
|
2022-11-08 10:01:04
|
it feels like the ISO language was designed for countries to agree on common standards for traffic signs, not for nitpicky geeks trying to level up the internet
|
|
|
_wb_
|
2022-11-08 10:01:47
|
most of the normative statements are done with "shall" in ISO, but we avoided making all sentences sound biblical and having hundreds of "shall"s by leveraging this sentence in clause 7:
> These specifications are normative in the sense that they are defining an output that alternative implementations shall duplicate.
|
|
2022-11-08 10:02:46
|
that one single "shall" makes basically all the rest of the document have implicit shalls
|
|
2022-11-08 10:04:38
|
half the job of an ISO staff editor is making sure the use of shall, should, must etc is done correctly, and it's very confusing if you're not used to it so I imagine most first-time spec editors get it wrong all the time
|
|
|
|
dds
|
2022-11-08 10:08:38
|
so ISO speak is a bit like legal speak in that it mostly reads like normal English but the words don't always mean what you think they do
|
|
|
improver
|
2022-11-08 11:03:02
|
they mostly mean what is specified in long intros many skip
|
|
2022-11-08 11:04:20
|
but it was pretty clear at least to me that anything not specified was not part of the spec
|
|
2022-11-08 11:05:18
|
it's just kinda how you're used to approaching specifications
|
|
|
lonjil
|
2022-11-09 01:15:58
|
Is that supposed to be a plus between 18 and u(6), rather than a new line?
|
|
|
Traneptora
|
2022-11-09 04:49:48
|
It is
|
|
|
_wb_
|
2022-11-09 05:31:03
|
Yeah I also just saw that, will be fixed.
|
|
|
Traneptora
|
2022-11-09 05:13:01
|
Section K.5.2, definition of SplitMix64, has `0×` instead of `0x` in front of the hexadecimal numbers
|
|
2022-11-09 06:12:03
|
> The strength of the noise is computed as follows. Given the X and Y samples of the input pixel, let InR = Y + X and InG = Y − X. Let InscaledR = InR * 6 and InscaledG = InG * 6. These values are split into integer parts in_intR = floor(InscaledR) and in_intG = floor(InscaledG) and fractional parts in_fracR = InscaledR − in_intR and in_fracG = InscaledG − in_intG. If the value InScaledR or InScaledG is 8 or more, the corresponding integer and fractional parts are clamped back to 7 and 1 respectively.
>
> Using the LUT values decoded as described in K.5.1, the strengths SR and SG are defined as SR = LUT[in_intR] * (1 − in_fracR) + LUT[in_intR + 1] * in_fracR and SG = LUT[in_intG] * (1 − in_fracG) + LUT[in_intG + 1] * in_fracG. Each strength is clamped between 0 and 1.
|
|
2022-11-09 06:12:12
|
What if `InR` or `InG` is negative?
|
|
2022-11-09 06:12:16
|
it very well could be the case
|
|
2022-11-09 06:20:21
|
looks like libjxl clamps to 0
|
|
2022-11-09 06:20:38
|
should say that in the spec tho
|
|
2022-11-09 06:23:29
|
Also
|
|
2022-11-09 06:23:30
|
> If the value InScaledR or InScaledG is 8 or more, the corresponding integer and fractional parts are clamped back to 7 and 1 respectively.
|
|
2022-11-09 06:23:43
|
This will overflow `LUT` at 7
|
|
2022-11-09 06:23:49
|
because LUT is an array of size 8
|
|
2022-11-09 06:24:00
|
so LUT[7 + 1] will fail
|
|
|
|
veluca
|
|
Traneptora
Section K.5.2, definition of SplitMix64, has `0×` instead of `0x` in front of the hexadecimal numbers
|
|
2022-11-09 06:36:49
|
how did you even notice 🤣
|
|
|
Traneptora
|
2022-11-09 06:37:06
|
because I copied and pasted and got a syntax error <:KEKW:643601031040729099>
|
|
|
|
veluca
|
|
Traneptora
looks like libjxl clamps to 0
|
|
2022-11-09 06:37:39
|
it should
|
|
|
Traneptora
This will overflow `LUT` at 7
|
|
2022-11-09 06:38:10
|
ahhhhh you have no idea how many times I ended up changing the number for that clamping
|
|
2022-11-09 06:38:27
|
I thought I got it right ... what does libjxl clamp to? 6?
|
|
|
Traneptora
|
2022-11-09 06:39:33
|
I double checked the LUT overflow, apparently libjxl does the following
```cpp
float kScale = 6f;
if (in_scaled >= kScale)
in_int = kScale - 1;
in_frac = 1f;
} else {
in_int = floor(in_scaled);
in_frac = in_scaled - in_int;
}
```
|
|
2022-11-09 06:39:35
|
Yup
|
|
2022-11-09 06:39:46
|
`in_int` is capped at 5, because of this
|
|
2022-11-09 06:39:52
|
which means LUT[7] is never used
|
|
2022-11-09 06:40:08
|
although I believe this is a bug in libjxl
|
|
2022-11-09 06:40:18
|
where it should do the following:
|
|
2022-11-09 06:40:49
|
```cpp
float kScale = 6f;
if (in_scaled >= 7f)
in_int = 6;
in_frac = 1f;
} else {
in_int = floor(in_scaled);
in_frac = in_scaled - in_int;
}
```
|
|
2022-11-09 06:41:09
|
since `float lut[8]` is sealed in stone
|
|
2022-11-09 06:41:47
|
so either Scale needs to change to 7, or the clamping should change, cause otherwise something is weird
|
|
2022-11-09 06:45:41
|
in either case the spec causes an overflow as the spec is written
|
|
|
_wb_
|
2022-11-09 06:53:38
|
So bug in both libjxl and spec? Ugh
|
|
|
Traneptora
|
2022-11-09 06:54:08
|
perhaps
|
|
2022-11-09 06:54:21
|
unclear if libjxl is bugged, but probably is, given that lut[7] is not used
|
|
2022-11-09 06:54:40
|
spec is definitely bugged as it will cause an overrun of the buffer
|
|
|
_wb_
|
2022-11-09 06:55:32
|
So spec should probably say:
> If the value InScaledR or InScaledG is 7 or more, the corresponding integer and fractional parts are clamped back to 6 and 1 respectively.
|
|
2022-11-09 06:55:57
|
And libjxl should say what you wrote above
|
|
|
Traneptora
|
2022-11-09 06:56:37
|
or equivalent in HIghway-ese
|
|
2022-11-09 06:56:46
|
this is pseudocode
|
|
|
_wb_
So spec should probably say:
> If the value InScaledR or InScaledG is 7 or more, the corresponding integer and fractional parts are clamped back to 6 and 1 respectively.
|
|
2022-11-09 06:57:13
|
it should also mention that they're clamped to 0 if negative
|
|
|
|
veluca
|
2022-11-09 07:38:59
|
I hate that code 🤣 it's too hard for me
|
|
|
Traneptora
|
2022-11-09 11:34:57
|
I'm confused about noise groups, and how they interact with upsampling
|
|
|
|
veluca
|
2022-11-10 01:01:31
|
~~that's two of us~~
|
|
2022-11-10 01:02:08
|
iirc the way that works is that noise is never generated in tiles bigger than 256x256, even if upsampling happens
|
|
|
Traneptora
|
2022-11-10 04:24:59
|
the spec and libjxl differ in randomness seed behavior
|
|
2022-11-10 04:26:30
|
libjxl generates one batch of 16 numbers for every 16 pixels wide, and then it generates an extra batch for the remaining stragglers
|
|
2022-11-10 04:26:39
|
it does this even if the size of the remaining stragglers is zero
|
|
2022-11-10 04:26:59
|
i.e. if the width of the group is a multiple of 16, which it often will be, it generates an extra batch and does nothing with it
|
|
2022-11-10 04:30:00
|
this isn't in the spec
|
|
2022-11-10 04:36:25
|
I don't know if this is just a libjxl bug, or if it should be changed in the spec
|
|
|
|
veluca
|
|
Traneptora
i.e. if the width of the group is a multiple of 16, which it often will be, it generates an extra batch and does nothing with it
|
|
2022-11-10 04:43:20
|
uh, I don't think it's supposed to do that
|
|
2022-11-10 04:43:28
|
at the very least it's not intentional...
|
|
|
Traneptora
|
2022-11-10 04:43:51
|
dec_noise.cc, line 72
|
|
2022-11-10 04:43:57
|
```
size_t x = 0;
// Only entire batches (avoids exceeding the image padding).
for (; x + kFloatsPerBatch <= xsize; x += kFloatsPerBatch) {
rng->Fill(batch);
for (size_t i = 0; i < kFloatsPerBatch; i += Lanes(df)) {
BitsToFloat(reinterpret_cast<const uint32_t*>(batch) + i, row + x + i);
}
}
// Any remaining pixels, rounded up to vectors (safe due to padding).
rng->Fill(batch);
size_t batch_pos = 0; // < kFloatsPerBatch
for (; x < xsize; x += N) {
BitsToFloat(reinterpret_cast<const uint32_t*>(batch) + batch_pos,
row + x);
batch_pos += N;
}
```
|
|
2022-11-10 04:44:05
|
`x + kFloatsPerBatch <= xsize`
|
|
2022-11-10 04:44:11
|
I think this should probably be a strict inequality
|
|
2022-11-10 04:44:26
|
then, the remaining batch gets filled for 1-16, rather than for 0-15
|
|
2022-11-10 04:44:58
|
notice that if `xSize` is a multiple of `kFloatsPerBatch` (i.e. 16)
|
|
2022-11-10 04:45:15
|
then we have `x + 16 <= xSize` as the limiting condition, which is the same thing as `x < xSize`
|
|
2022-11-10 04:46:02
|
so this fills the entire row
|
|
2022-11-10 04:46:17
|
and then `rng->Fill(batch);` is called again, unconditionally.
|
|
|
veluca
at the very least it's not intentional...
|
|
2022-11-10 04:46:40
|
in that case, i'll code to spec and assume this is a libjxl bug
|
|
|
|
veluca
|
2022-11-10 04:52:57
|
can you file an issue?
|
|
|
Traneptora
|
|
veluca
can you file an issue?
|
|
2022-11-10 05:06:46
|
https://github.com/libjxl/libjxl/issues/1888
|
|
|
|
veluca
|
2022-11-10 05:09:57
|
thanks!
|
|
|
_wb_
|
2022-11-10 07:38:00
|
I propose we indeed treat this as a libjxl bug and change some noise on existing images
|
|
|
Traneptora
|
2022-11-10 07:47:20
|
what about the other discrepancy, with the scale factor?
|
|
2022-11-10 07:47:51
|
where libjxl maps `scaled >= 6f` to `intPart = 5` and `fracPart = 1f` (causing `lut[7]` to be unused)
|
|
2022-11-10 07:48:37
|
or the spec maps `scaled >= 8f` to `intPart = 7` and `fracPart = 1f` (causing `lut[8]` to blow up)
|
|
2022-11-10 07:48:59
|
IMO it should map `scaled >= 7f` to `intPart = 6` and `fracPart = 1f`
|
|
|
|
veluca
|
2022-11-10 07:53:01
|
we also should fix, and ideally for the last time as we already "fixed" it way too many times
|
|
2022-11-10 07:53:09
|
can you also open a bug to libjxl for this?
|
|
|
Traneptora
|
2022-11-10 08:04:21
|
yup, there's one more spec issue with noise, the spec's missing a 0.5f multiplier somewhere
|
|
2022-11-10 08:04:23
|
lemme dig it up
|
|
2022-11-10 08:25:40
|
I gotta go do some real lifing but I'll try to open an issue about that soon
|
|
|
spider-mario
|
2022-11-10 08:25:52
|
we discussed that LUT thing a while ago, I believe
|
|
2022-11-10 08:26:02
|
|
|
2022-11-10 08:26:09
|
it’s that code, isn’t it?
|
|
|
Traneptora
|
2022-11-10 08:27:08
|
yes, but that's not the issue
|
|
2022-11-10 08:27:18
|
the current code isn't sawtoothed in either the spec or libjxl, the issue is the integer part threshold
|
|
2022-11-10 08:27:32
|
not the fractional part
|
|
|
spider-mario
|
2022-11-10 08:27:48
|
right, but I thought we had specifically fixed that too
|
|
2022-11-10 08:28:14
|
e.g. https://github.com/libjxl/libjxl/pull/1238/files
|
|
|
Traneptora
|
2022-11-10 08:29:25
|
somehow this change isn't in git main libjxl though?
|
|
2022-11-10 08:29:31
|
well at least the one in `dec_noise.cc`
|
|
2022-11-10 08:30:22
|
now that code is in `stage_noise.cc`
|
|
2022-11-10 08:30:22
|
https://github.com/libjxl/libjxl/blob/main/lib/jxl/render_pipeline/stage_noise.cc#L71
|
|
2022-11-10 08:30:33
|
and it appears to use the older version
|
|
|
spider-mario
|
2022-11-10 08:31:21
|
oh no
|
|
2022-11-10 08:31:26
|
thanks for spotting this
|
|
|
Traneptora
|
2022-11-10 08:32:09
|
yw :)
|
|
2022-11-10 08:34:18
|
the seeding change did make it into main though
|
|
2022-11-10 11:03:13
|
Last spec discrepancy relating to noise synthesis, btw:
|
|
2022-11-10 11:03:38
|
> `Given the X and Y samples of the input pixel, let InR = Y + X and InG = Y − X.`
|
|
2022-11-10 11:03:49
|
it's actually `(Y+X)/2` and `(Y-X)/2`
|
|
2022-11-10 11:04:54
|
https://github.com/libjxl/libjxl/blob/main/lib/jxl/render_pipeline/stage_noise.cc#L192
|
|
2022-11-10 11:05:21
|
notice that you're multiplying by `0.5` first
|
|
2022-11-10 11:05:40
|
here `half` is just `const auto half = Set(d, 0.5f);`
|
|
2022-11-10 11:07:40
|
I assume this is supposed to happen and a spec bug, not a libjxl bug
|
|
2022-11-10 11:08:01
|
so I'm not going to report it on the issue tracker. but in either case spec and libjxl don't match, so at least one of the two is not correct
|
|
2022-11-11 03:21:02
|
> The dequantization matrices are computed for each possible value of the DctSelect field as the inverse of the weights matrix.
|
|
2022-11-11 03:21:13
|
How am I supposed to compute the inverse of the weights matrix if they are not square?
|
|
2022-11-11 03:23:33
|
unless you mean reciprocols taken element-wise
|
|
2022-11-11 03:23:39
|
in which case, can I assume they are nonzero?
|
|
2022-11-11 03:45:06
|
```cpp
GetDCTQuantWeights(params) {
bands(0) = params[0];
for (i = 1; i < len; i++) {
bands(i) = bands(i − 1) * Mult(params[i]);
/* bands[i] >= 0 */
for (y = 0; y < Y; y++)
for (x = 0; x < X ; x++) {
dx = x / (X − 1);
dy = y / (Y − 1);
distance = sqrt(dx*dx + dy*dy);
weight = Interpolate(distance, sqrt(2) + 1e−6, bands, num_bands);
weights(x, y) = weight;
}
}
}
```
|
|
2022-11-11 03:45:09
|
this doesn't make sense
|
|
2022-11-11 03:45:29
|
you're overwriting `weights[x][y]` each iteration of params
|
|
2022-11-11 03:45:42
|
and it's not referenced again
|
|
2022-11-11 03:45:48
|
is this supposed to be `+=`?
|
|
2022-11-11 03:46:49
|
or is it not supposed to be a nested for loop?
|
|
2022-11-11 03:58:37
|
> For encoding mode DCT2, params(c, i) are copied in position (x, y) as follows, where rectangles are defined by their top right and bottom left corners
|
|
2022-11-11 03:58:45
|
Is this supposed to say "top left and bottom right"?
|
|
2022-11-11 04:08:32
|
> For encoding mode DCT4×8, the weights matrix is obtained by copying into position (x, y) the value in position (x, y Idiv 2) in the 8 × 4 matrix
|
|
2022-11-11 04:08:39
|
and 8x4 matrix has 8 rows and 4 columns. it's taller than it is wide
|
|
2022-11-11 04:08:46
|
should this say 4x8?
|
|
2022-11-11 04:09:17
|
using the mathematical convention of Rows by Columns, not the convention of width by height
|
|
2022-11-11 04:36:31
|
DCT2 leaves weight[0][0] undefined
|
|
2022-11-11 04:36:35
|
does it deafult to 1?
|
|
|
|
veluca
|
|
Traneptora
in which case, can I assume they are nonzero?
|
|
2022-11-11 07:07:59
|
Yup (and yes, it is element by element)
|
|
|
Traneptora
or is it not supposed to be a nested for loop?
|
|
2022-11-11 07:09:01
|
*facepalm* no it is not supposed to be nestes
|
|
|
Traneptora
does it deafult to 1?
|
|
2022-11-11 07:09:44
|
Doesn't matter, weight 0 0 is never used
|
|
|
Traneptora
|
2022-11-11 12:26:34
|
I.6.5:The pairs (x, y) in the LLF vector is sorted in ascending order according to the value y * bwidth + x.
|
|
2022-11-11 12:26:47
|
I assume this is supposed to say `y * (bwidth / 8) + x`
|
|
2022-11-11 12:26:54
|
otherwise it overflows the size of LLF
|
|
|
_wb_
|
2022-11-11 12:57:04
|
Isn't bwidth the width in blocks, so already /8?
|
|
|
Traneptora
|
2022-11-11 01:05:29
|
it's the matrix width for a specific block
|
|
2022-11-11 01:13:05
|
> Let nb_block_ctx be equal to max(block_ctx_map) + 1. The decoder reads a histogram with `495 * num_hf_presets * nb_block_ctx` pre-clustered distributions D from the codestream as specified in C.1.
|
|
2022-11-11 01:13:13
|
Where's this 495 coming from? I can't find it in libjxl code.
|
|
|
|
veluca
|
|
Traneptora
I assume this is supposed to say `y * (bwidth / 8) + x`
|
|
2022-11-11 01:15:52
|
does it actually matter as long as it's `y * <something greater than any x> + x`?
|
|
|
Traneptora
|
2022-11-11 01:16:10
|
yes because otherwise you end up with zero padding and overflow
|
|
|
|
veluca
|
2022-11-11 01:16:16
|
no?
|
|
2022-11-11 01:16:26
|
it just says how they are sorted
|
|
|
Traneptora
|
2022-11-11 01:16:30
|
well, I suppose not, it's an order
|
|
|
|
veluca
|
2022-11-11 01:16:31
|
not what their positions are
|
|
|
Traneptora
|
2022-11-11 01:16:32
|
not an index
|
|
|
|
veluca
|
2022-11-11 01:17:03
|
in practice they do end up at index `y*bw/8+x`
|
|
|
Traneptora
> Let nb_block_ctx be equal to max(block_ctx_map) + 1. The decoder reads a histogram with `495 * num_hf_presets * nb_block_ctx` pre-clustered distributions D from the codestream as specified in C.1.
|
|
2022-11-11 01:17:33
|
that's a good question, let me check
|
|
|
Traneptora
|
2022-11-11 01:18:03
|
upon further inspection, it appears to be 458 + 37
|
|
2022-11-11 01:18:10
|
where 37 is kNonZeroBuckets
|
|
2022-11-11 01:18:17
|
and 458 is kZeroDensityContextCount
|
|
|
|
veluca
|
|
Traneptora
|
2022-11-11 03:26:13
|
> Finally, for k in the range [num_blocks, size), the decoder reads an integer ucoeff from the codestream
|
|
2022-11-11 03:26:19
|
`size` is not defined here
|
|
2022-11-11 03:27:24
|
unless it's the natural coeffient order size way back in hfpass
|
|
2022-11-11 03:27:30
|
if that's what it is, this is very confusing
|
|
2022-11-11 04:24:51
|
that spec doesn't say what happens if it reads non_zeroes directly and it's immediately zero
|
|
2022-11-11 04:25:12
|
spec doesn't disallow this
|
|
|
|
veluca
|
2022-11-12 12:25:35
|
doesn't that just make all of AC 0?
|
|
|
Traneptora
|
2022-11-12 03:52:49
|
spec doesn't say that
|
|
2022-11-12 03:53:32
|
Typo in section I.3.2:
```c
BlockContext() {
idx = (c < 2 ? c ^ 1 : 2) * 13 + s;
idx *= (qf_thresholds.size() + 1);
for (t : qf_thresholds) if (qf > t) idx++;
for (i = 0; i < 3; i++) idx *= (lf_thresholds[i].size() + 1);
lf_idx = 0;
for (t : lf_thresholds[0]) if (qdc[0] > t) lf_idx++;
lf_idx *= (lf_thresholds[2].size() + 1);
for (t : lf_thresholds[2]) if (qdc[2] > t) lf_idx++;
lf_idx *= (lf_thresholds[0].size() + 1); // <- TYPO HERE
for (t : lf_thresholds[1]) if (qdc[1] > t) lf_idx++;
return block_ctx_map[idx + lf_idx];
}
```
|
|
2022-11-12 03:53:45
|
I believe that line should say `lf_thresholds[1].size() + 1`
|
|
2022-11-12 03:53:55
|
not `lf_thresholds[0]`
|
|
2022-11-12 03:54:22
|
Typo in section I.3.2:
```
CoeffNumNonzeroContext[64] = {
0, 0, 31, 62, 62, 93, 93, 93, 93, 23, 123, 123, 123,
152, 152, 152, 152, 152, 152, 152, 152, 180, 180, 180, 180, 180,
180, 180, 180, 180, 180, 180, 180, 206, 206, 206, 206, 206, 206,
206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206,
206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206, 206};
```
|
|
2022-11-12 03:54:36
|
The `23` in the top row, 4th from the right should be 123
|
|
2022-11-12 03:54:51
|
also, libjxl starts these arrays with `0xBAD` instead of zero
|
|
2022-11-12 03:55:15
|
I don't know if that's because it can be anything, or if matters
|
|
2022-11-12 03:55:55
|
typo in section I.3.2:
|
|
2022-11-12 03:56:17
|
```
if (k == num_blocks) {
if (non_zeros > size / 16) prev = 1;
else prev = 0;
}
```
this condition should be `prev = 0; else prev = 1`
|
|
2022-11-12 03:56:44
|
libjxl dec_group.cc line 486 has this: `size_t prev = (nzeros > size / 16 ? 0 : 1);`
|
|
|
|
veluca
|
|
Traneptora
also, libjxl starts these arrays with `0xBAD` instead of zero
|
|
2022-11-12 02:19:12
|
that one should never be used
|
|
|
Traneptora
|
2022-11-12 06:50:22
|
Section I.2.2: there's a +1 missing in the QF Thresholds decoding
|
|
2022-11-12 06:50:34
|
```
for (i = 0; i < nb_qf_thr; i++)
qf_thresholds.push_back(U32(u(2), 4 + u(3), 12 + u(5), 44 + u(8))));
```
|
|
2022-11-12 06:50:38
|
This is in the pseudocode
|
|
2022-11-12 06:51:13
|
`entropy_coder.cc`, line 52
|
|
2022-11-12 06:51:18
|
```cpp
for (uint32_t& i : qft) {
i = U32Coder::Read(kQFThresholdDist, br) + 1;
}
```
|
|
2022-11-12 06:51:24
|
notice there's a +1 there.
|
|
2022-11-12 11:27:41
|
Section I.2.4 has this code:
|
|
2022-11-12 11:27:48
|
```cpp
if (encoding_mode == RAW) {
params.denominator = F16();
ZeroPadToByte();
params = /* read a 3-channel image from a modular sub-bitstream
of the same shape as the required quant matrix */;
}
```
|
|
2022-11-12 11:28:34
|
That `ZeroPadToByte()` should not be there. If you do it, you'll fail to verify the submodular bitstream for example with ants.jxl
|
|
2022-11-12 11:29:10
|
|
|
2022-11-12 11:46:41
|
Also, something that appears to missing from Annex I (VarDCT) is an overview. Annex H (modular) had an overview of the general process of modular
|
|
2022-11-12 11:46:47
|
VarDCT's annex appears to be missing it
|
|
2022-11-12 11:46:55
|
It explains how to do each step, but doesn't explain when to do each step
|
|
2022-11-12 11:47:28
|
I now have Quantized LF coefficients and Quantized HF coefficients, but I'm unsure when I dequent them, when I detransform them (via DCT or some other method), and when I decorrelate them.
|
|
2022-11-12 11:47:39
|
I know *how* to do those things, but not when.
|
|
2022-11-12 11:49:46
|
The spec also appears to be lacking guidance on how to handle edge cases, like if image dimensions are odd, or not multiples of 8, etc.
|
|
2022-11-12 11:51:11
|
if the image is a multiple of 8x8 it's not hard, but what if it isn't?
|
|
|
|
veluca
|
2022-11-13 06:18:41
|
then you pretend it is and crop at the very end, I'm sure it's written *somewhere*
|
|
|
Traneptora
|
2022-11-13 08:36:35
|
maybe it's in the preamble, but I didn't see it in annex I
|
|
2022-11-13 08:37:05
|
in either case, a general overview of VarDCT like there is for modular would be helpful
|
|
|
Fraetor
|
2022-11-13 12:17:15
|
Minor issue, but in section 4.3: Operators, for and and or if is misspell as iff.
|
|
|
_wb_
|
2022-11-13 12:21:54
|
"iff" means "if and only if", but you're right, it should be spelled out because not everyone will be familiar with that abbreviation
|
|
|
Fraetor
|
2022-11-13 02:05:40
|
Ah, well I learned something new.
|
|
2022-11-13 02:11:20
|
Also, the examples are really helpful.
|
|
|
|
veluca
|
2022-11-13 02:25:42
|
that's my fault for being a mathematician 😛
|
|
|
Traneptora
|
2022-11-13 03:48:29
|
same, mathematicians ^_^
|
|
|
veluca
then you pretend it is and crop at the very end, I'm sure it's written *somewhere*
|
|
2022-11-13 10:27:30
|
well it's not necessary for modular images so it'd have to be in VarDCT's section
|
|
2022-11-13 10:27:36
|
and I didn't see it although I might have glossed over it
|
|
|
|
veluca
|
2022-11-13 11:01:42
|
I can't see that either...
|
|
2022-11-13 11:02:18
|
(and it's especially tricky how it is supposed to behave especially as you mix in extra channels and upsampling, so it really ought to be mentioned somewhere)
|
|
|
Traneptora
|
2022-11-13 11:08:30
|
extra channels are done in modular so they don't have that crop factor right?
|
|
2022-11-13 11:08:48
|
VarDCT is specifically about the exactly three color channels iirc
|
|
|
|
veluca
|
2022-11-13 11:27:16
|
correct
|
|
|
Traneptora
|
2022-11-14 01:09:54
|
Section I.2.1, Quantizer Bundle, is missing a factor of `1 << 16` in global_scale
|
|
2022-11-14 01:10:03
|
it has `mXDC = m_x_lf_unscaled / (global_scale * quant_lf);`
|
|
2022-11-14 01:10:14
|
but it should have `mXDC = (1 << 16) * m_x_lf_unscaled / (global_scale * quant_lf);`
|
|
2022-11-14 01:10:29
|
and same for Y and B
|
|
|
Moritz Firsching
|
2022-11-14 01:40:32
|
..fixing this now
|
|
2022-11-14 01:42:25
|
also watch out for taking the reciprocals of `m_foo_lf_unscaled` https://github.com/lifthrasiir/j40/blob/acfe38647a0614e0562f739d4a40d2fc8a397e7f/j40.h#L5136
|
|
|
_wb_
|
2022-11-14 02:26:26
|
<@604964375924834314> I'm fixing noise in the spec, can you fix it in the render pipeline code in libjxl?
|
|
|
spider-mario
|
2022-11-14 05:07:10
|
yep, I suppose it will mostly boil down to making sure that the previous fixes are really applied
|
|
2022-11-14 05:07:26
|
I was afraid of exactly that when we started copying code to the rendering pipeline
|
|
2022-11-14 05:07:33
|
which makes me wonder how I didn’t catch this
|
|
|
Traneptora
|
2022-11-14 05:12:27
|
> `ScaleF(N, n) { return 1 / (cos(pi / (2 * N)) * cos(pi / N) * cos(pi / (N / 2))); }`
|
|
2022-11-14 05:12:30
|
Section I.6.6
|
|
2022-11-14 05:12:41
|
This doesn't make sense as there's no little `n` here
|
|
2022-11-14 05:12:50
|
I tried to trace it back to libjxl but it just has a bunch of precomputed LUTs
|
|
2022-11-14 05:13:02
|
so I can't figure out what it did
|
|
|
spider-mario
|
|
_wb_
<@604964375924834314> I'm fixing noise in the spec, can you fix it in the render pipeline code in libjxl?
|
|
2022-11-14 05:17:52
|
it was actually already fixed in #12, wasn’t it?
|
|
2022-11-14 05:17:55
|
(the spec)
|
|
|
Traneptora
|
2022-11-14 05:18:22
|
I can't see #12, but i can say that the spec as written is off by 1 from libjxl in the other direction
|
|
2022-11-14 05:18:36
|
it clamps to 7, not 6, so adding 1 a bit later will overflow
|
|
|
_wb_
|
|
spider-mario
it was actually already fixed in #12, wasn’t it?
|
|
2022-11-14 05:27:52
|
Looks like the spec was correct but libjxl was off by -1, then we fixed libjxl but also did a +1 in the spec, and then the render pipeline code regressed to the off by -1 version so we ended up having -1 in libjxl and +1 in the spec.
|
|
|
Traneptora
|
2022-11-14 05:45:38
|
For DctSelect types IDENTIFY, DCT2×2, DCT4×4, DCT8×4, DCT4×8, AFV0, AFV1, AFV2, AFV3, the output is equal to the input.
|
|
2022-11-14 05:45:41
|
What's Identify?
|
|
|
_wb_
|
2022-11-14 06:01:18
|
A typo for IDENTITY, which was renamed to Hornuss between DIS and FDIS because someone remarked that it is not at all an identity transform 🙂
|
|
2022-11-14 06:01:36
|
Good catch!
|
|
|
Traneptora
|
2022-11-14 06:21:45
|
I see, and a search and replace missed it cause it was a typo <:KEKW:643601031040729099>
|
|
|
Moritz Firsching
|
2022-11-15 09:50:19
|
I fixed the IDENTIFY/Hornuss typo in the spec draft...
|
|
|
Traneptora
|
2022-11-15 12:58:39
|
for RAW quant weight decoding, the spec says that you use `params / denominator`
|
|
2022-11-15 12:58:49
|
but it's actually `1 / (params * denominator)`
|
|
2022-11-15 12:59:00
|
(this is before you take the reciprocols of all the weights)
|
|
2022-11-15 03:04:57
|
LLF coefficient ScaleF is missing a multiple of `n`
|
|
2022-11-15 03:05:07
|
```cpp
ScaleF(N, n) { return 1 / (cos(pi / (2 * N)) * cos(pi / N) * cos(pi / (N / 2))); }
```
|
|
2022-11-15 03:05:11
|
It *should* say this:
|
|
2022-11-15 03:05:25
|
```c++
ScaleF(N, n) { return 1 / (cos(n * pi / (2 * N)) * cos(n * pi / N) * cos(n * pi / (N / 2))); }
```
|
|
2022-11-15 03:05:53
|
alternatively it could be relabeled for clarity
|
|
2022-11-15 03:09:53
|
```cpp
ScaleF(c, b) {
inverse_scale_f = cos(b * pi / (2 * c)) * cos(b * pi / c) * cos(2 * b * pi / c);
return 1 / inverse_scale_f;
}
```
|
|
2022-11-15 03:10:24
|
I chose `b` and `c` here because the values that are fed into it lower in that section are `ScaleF(cy, bheight)`
|
|
2022-11-15 03:10:26
|
which matches up
|
|
2022-11-15 03:10:52
|
I don't like mixing `N` and `n` in the same function, that could get confusing
|
|
2022-11-15 03:10:58
|
(section I.6.6, btw)
|
|
2022-11-15 03:22:20
|
I think this is what it's supposed to be, at least
|
|
2022-11-15 03:52:21
|
it's hard to tell because libjxl just has LUTs
|
|
2022-11-15 07:18:21
|
The default value for `x_qm_scale` in the Frame Header is wrong
|
|
2022-11-15 07:18:46
|
in the spec it says the default is 3
|
|
2022-11-15 07:19:04
|
with a signalling condition of `!all_default and metadata.xyb_encoded and encoding == kVarDCT`
|
|
2022-11-15 07:19:28
|
the actual default is `3` iff `metadata.xyb_encoded and encoding == kVarDCT`
|
|
2022-11-15 07:19:46
|
and defaults to `2` iff `!metadata.xyb_encoded || encoding == kModular`
|
|
2022-11-15 07:20:40
|
so if, for example, you have a reconstructed JPEG with `!all_default && !metadata.xyb_encoded` it should default to 2, not to 3
|
|
2022-11-15 07:22:51
|
libjxl has this (converted to pseudocode)
```cpp
if (metadata.xyb_encoded && encoding == kVarDCT) {
visitor->visit_fields(&x_qm_scale, 3, /*default=*/all_default)
visitor->visit_fields(&b_qm_scale, 2, /*default=*/all_default)
} else {
x_qm_scale = 2;
b_qm_scale = 2;
}
```
|
|
|
_wb_
|
2022-11-15 07:28:23
|
Oh, wow that's confusing
|
|
|
Traneptora
|
2022-11-15 07:28:46
|
tripped me up when debugging CFL
|
|
|
Moritz Firsching
also watch out for taking the reciprocals of `m_foo_lf_unscaled` https://github.com/lifthrasiir/j40/blob/acfe38647a0614e0562f739d4a40d2fc8a397e7f/j40.h#L5136
|
|
2022-11-15 08:46:10
|
already figured that one out since it's used in modular
|
|
|
Traneptora
but it's actually `1 / (params * denominator)`
|
|
2022-11-15 08:47:14
|
https://github.com/libjxl/libjxl/blob/main/lib/jxl/quant_weights.cc#L238
|
|
2022-11-15 11:57:29
|
> Horizontal (vertical) upsampling is defined as follows: the channel width (height) is multiplied by two (but limited to the image dimensions), and every subsampled pixel B with left (top) neighbour A and right (bottom) neighbour C is replaced by two upsampled pixels with values 0.25 * A + 0.75 * B and 0.75 * B + 0.25 * C.
|
|
2022-11-15 11:57:48
|
What happens in edge cases? i.e. what happens if hshift == 1 and x=0
|
|
|
_wb_
|
2022-11-16 10:10:22
|
I assume mirroring, i.e. A=B for the leftmost pixels and C=B for the rightmost ones. <@179701849576833024> is that how we do it? We should add that to the spec then...
|
|
|
|
veluca
|
2022-11-16 10:12:34
|
Somewhere the spec says that whenever we access out of bound pixels they are mirrored
|
|
|
_wb_
|
2022-11-16 10:30:15
|
I guess that's true, but in many places we explicitly say so too. It's a bit confusing if we sometimes say it explicitly and sometimes we rely on the generic statement.
|
|
2022-11-16 10:54:53
|
|
|
2022-11-16 10:55:16
|
This should be coefficients, not samples, right?
|
|
|
Traneptora
|
2022-11-16 12:27:51
|
that's correct, those are dequantized coefficients
|
|
|
_wb_
|
2022-11-16 12:35:36
|
I'll change that then to 'coefficient' instead of 'sample', since this could easily be misinterpreted as doing things in the pixel domain, not the dct domain
|
|
2022-11-16 12:37:20
|
Also going to reorganize things so LLF goes earlier
|
|
|
Traneptora
|
2022-11-16 12:58:18
|
Thanks. Putting things in the order you do them is ideal
|
|
2022-11-16 12:59:37
|
Namely Decode HF -> Dequant HF -> Recorrelate HF -> Overlay LF to LLF atop HF -> Inverse DCT
|
|
2022-11-16 01:00:26
|
considering that LLF is not recorrellated it should go after CfL
|
|
|
_wb_
|
2022-11-16 01:02:34
|
Yeah, I'm going to first define DCT, then LLF (since it uses DCT_2D and it's better to avoid forward refs imo), then Coefficients to samples
|
|
2022-11-16 05:33:15
|
Updated draft:
|
|
2022-11-16 05:46:40
|
oh nice, there's a nicer looking html output
|
|
|
Traneptora
|
2022-11-16 09:26:51
|
way better than Cambria :)
|
|
2022-11-16 09:33:03
|
There's a typo in section I.9.6: `coeffs_8x4(0, 0)` should be `coeffs_4x8(0, 0)`
|
|
2022-11-16 09:33:22
|
as the variable is called 4x8 even in the DCT8x4 transform
|
|
2022-11-16 10:18:19
|
End of the AFV transform definition
|
|
2022-11-16 10:18:23
|
```
samples_4x8 = IDCT_2D(coeffs_4x8);
for (iy = 0; iy < 4; iy++)
for (ix = 0; ix < 8; ix++)
samples(ix, (flip_y == 1 ? 0 : 4) + iy) = samples_4x4(ix, iy);
```
|
|
2022-11-16 10:18:27
|
I believe that last samples should say samples4x8
|
|
2022-11-17 06:28:54
|
AFV transform definition
|
|
2022-11-17 06:28:58
|
> `coeffs_4x4(0, 0) = coefficients(0, 0) − coefficients(0, 1) + coefficients(1, 0);`
|
|
2022-11-17 06:29:11
|
I believe this should be `coeffs_4x4(0, 0) = coefficients(0, 0) + coefficients(0, 1) - coefficients(1, 0);`
|
|
2022-11-17 06:29:24
|
I haven't double checked against libjxl, but making that change fixes a number of visual artifacts with the corner of AFV
|
|
|
yurume
|
2022-11-17 06:34:35
|
that matches J40: https://github.com/lifthrasiir/j40/blob/216f83ba7a616685f847268f009e8fcb85106f76/j40.h#L6147
|
|
|
Traneptora
|
2022-11-17 06:35:58
|
good catch, I missed the second half
|
|
2022-11-18 08:38:09
|
Section J.4.1 appears to have some links to the wrong sections
|
|
2022-11-18 08:38:43
|
section J.4.2 specifies how to calculate distance as a function of `(x, y, cx, cy)`
|
|
2022-11-18 08:38:56
|
and section J.4.3 specifies how to calculate weight as a function of distance and sigma
|
|
2022-11-18 08:40:11
|
but I think the linked sections are wrong, since the third step specifies the L1 distance in J.4.3
|
|
2022-11-18 08:40:36
|
was this something that got shuffled, and the linked sections didn't get updated?
|
|
2022-11-18 08:43:04
|
Section F.2, between tables F.6 and F.7, definition of `downsample[i]`:
|
|
2022-11-18 08:43:15
|
something got messed up here, idk what
|
|
2022-11-18 08:44:05
|
Top of Annex F, link on the right hand side to F.3 doesn't work
|
|
2022-11-18 08:44:34
|
It links to `#toc` but the anchor for F.3 is `#toc45`
|
|
2022-11-18 08:44:51
|
dunno how that happened
|
|
2022-11-18 08:45:31
|
it looks like there *is* a div with an ID of "toc" but it doesn't work, and idk why
|
|
2022-11-18 08:45:54
|
using firefox
|
|
|
_wb_
|
2022-11-18 09:24:00
|
looks like it's a reserved keyword or something that metanorma is already using for the document toc
|
|
2022-11-18 09:24:06
|
renaming it fixed it
|
|
|
Traneptora
|
2022-11-18 09:24:52
|
|
|
2022-11-18 09:25:06
|
Why is sigma being clamped to 1e-4 ?
|
|
2022-11-18 09:25:22
|
if it was less than 1e-4, now it is equal to 1e-4, and it's still gonna be less than 0.3
|
|
|
_wb_
|
2022-11-18 09:26:28
|
yeah that seems redundant. <@604964375924834314> can we just remove that `sigma = max(1e−4, sigma)` line?
|
|
|
Traneptora
|
2022-11-18 09:26:46
|
in theory epf_sigma_for_modular could be less than 0.3 but that would just disable epf for a modular frame
|
|
2022-11-18 09:26:57
|
so there'd be no point
|
|
2022-11-18 09:27:16
|
when you could just disable epf
|
|
|
spider-mario
|
2022-11-18 09:28:05
|
I’m not the best person to ask about EPF
|
|
2022-11-18 09:28:17
|
I think <@179701849576833024> knows more about this
|
|
2022-11-18 09:28:48
|
hm, okay, reading the context a bit more, it does seem redundant
|
|
|
_wb_
|
2022-11-18 09:29:11
|
oh right I saw a sigma and was thinking splines, but this is epf 🙂
|
|
|
|
veluca
|
2022-11-18 09:29:17
|
It is entirely possible for things to be redundant
|
|
|
Traneptora
|
2022-11-18 09:34:45
|
> `inv_sigma = step_multiplier[step] * 4 * (sqrt(0.5) − 1) / sigma;`
|
|
2022-11-18 09:34:54
|
just to double check, is this supposed to be negative?
|
|
2022-11-18 09:35:07
|
` v = 1 − scaled_distance * inv_sigma;`
will always be positive
|
|
2022-11-18 09:35:23
|
and there's a check for `(v <= 0)`
|
|
|
_wb_
|
2022-11-18 09:46:13
|
EPF section now:
|
|
|
Traneptora
|
2022-11-18 09:46:58
|
|
|
2022-11-18 09:47:03
|
yea, it's specified in J.4.2
|
|
|
_wb_
|
2022-11-18 09:47:05
|
What I would propose to change it to:
|
|
|
Traneptora
|
2022-11-18 09:47:09
|
ah
|
|
2022-11-18 09:47:34
|
I think that's more clear
|
|
|
_wb_
|
2022-11-18 09:50:46
|
rewrote the downscale[i] thing a bit, there was some misparsing of code blocks going on there but I also didn't like the parentheses and duplication of the range of i
|
|
|
veluca
It is entirely possible for things to be redundant
|
|
2022-11-18 09:55:25
|
I guess the question is: does `If sigma < 0.3 for a given varblock, the decoder skips all steps on the pixels of that block` also apply in the modular case? If no, then the clamping does make sense and we should clarify that the skipping only happens in the vardct case. If yes, then we should remove the redundant clamping and maybe clarify that "a given varblock" means the whole frame in the modular case, since there is no per-varblock sigma adjustment.
|
|
|
Traneptora
|
|
_wb_
I guess the question is: does `If sigma < 0.3 for a given varblock, the decoder skips all steps on the pixels of that block` also apply in the modular case? If no, then the clamping does make sense and we should clarify that the skipping only happens in the vardct case. If yes, then we should remove the redundant clamping and maybe clarify that "a given varblock" means the whole frame in the modular case, since there is no per-varblock sigma adjustment.
|
|
2022-11-18 09:55:57
|
epf_sigma_for_modular is constant and signalled in the frame header
|
|
2022-11-18 09:56:15
|
so if epf_sigma_for_modular < 0.3, that's identical to the frame header just disabling epf
|
|
2022-11-18 09:56:41
|
it wouldn't make sense for epf_sigma_for_modular to be < 0.3 and have epf enabled
|
|
2022-11-18 09:56:57
|
it might even make sense to forbid it
|
|
2022-11-18 04:58:56
|
DCT4x4Params doesn't match the spec compared to libjxl
|
|
2022-11-18 04:59:00
|
```
// DCT4 (quant_kind 3)
static constexpr const QuantEncodingInternal DCT4X4() {
return QuantEncodingInternal::DCT4(DctQuantWeightParams({{{{
V(2200.0),
V(0.0),
V(0.0),
V(0.0),
}},
{{
V(392.0),
V(0.0),
V(0.0),
V(0.0),
}},
{{
V(112.0),
V(-0.25),
V(-0.25),
V(-0.5),
}}}},
4),```
|
|
2022-11-18 04:59:02
|
libjxl
|
|
2022-11-18 04:59:28
|
spec
|
|
2022-11-18 10:09:17
|
```
The resulting quant is then multiplied by a per-block multiplier, the value of HfMul at the coordinates of the 8 × 8 rectangle containing the current sample, and, for the X and B channels, by pow(0.8, frame_header.x_qm_scale − 2) and pow(0.8, frame_header.b_qm_scale — 2), respectively.
```
|
|
2022-11-18 10:09:19
|
This is not correct
|
|
2022-11-18 10:09:42
|
It is multiplied by `(1 << 16) * quantizer.global_scale / HfMul`
|
|
2022-11-18 10:09:51
|
it is not multiplied by `HfMul`
|
|
2022-11-18 10:15:18
|
see line 145 of dec_group.cc
|
|
2022-11-19 04:06:52
|
Section E.4.2:
|
|
2022-11-19 04:06:56
|
```
value = 0; shift = 0;
while (1) {
b = u(8);
value += (b & 127) << shift;
if (b <= 127) break;
shift += 7; /* shift < 63 */;
}
```
|
|
2022-11-19 04:07:06
|
I'm looking at this comment `/* shift < 63 */`
|
|
2022-11-19 04:07:15
|
it's not clear to me if this is a requirement before or after adding 7
|
|
2022-11-19 04:09:29
|
because it's on the same line
|
|
2022-11-19 04:10:28
|
it says earlier that the final integer is at at most 63 bits, so I think it's still technically specified, but it might be clearer if the pseudocode reflected that
|
|
|
_wb_
|
2022-11-19 04:27:29
|
basically value can use uint64_t
|
|
2022-11-19 04:29:39
|
Maybe better to write `/* shift <= 56 */` in the beginning of the while body...
|
|
|
Traneptora
|
2022-11-19 04:54:30
|
that would be more obvious, yea
|
|
2022-11-19 05:14:16
|
Section E.4.4
|
|
2022-11-19 05:14:18
|
> `/* output nothing, stop reading the tag list, and proceed to ~\ref{iccmaincontent}~ */`
|
|
2022-11-19 05:14:30
|
looks like you left some `\LaTeX` in there
|
|
|
_wb_
|
2022-11-19 05:20:09
|
oops, lol
|
|
|
Traneptora
|
2022-11-19 05:42:58
|
```
f (tagcode == 1) {
// the tag string is set to 4 custom bytes read from the data
tag = u(4 * 8) from data stream;
} else if (tagcode == 2) {
tag = “rRTC”;
} else if (tagcode == 3) {
tag = “rXYZ”;
```
|
|
2022-11-19 05:43:07
|
is `rRTC` supposed to be `rTRC`?
|
|
|
_wb_
|
2022-11-19 05:47:36
|
Uh, best to check libjxl code
|
|
|
Traneptora
|
2022-11-19 05:48:11
|
later on it has the following
|
|
2022-11-19 05:48:14
|
```
/* append the following 3 groups of 4 bytes to the output: tag, tagstart, tagsize */
if (tagcode == 2) {
/* append the following additional 6 groups of 4 bytes to the output:
“gTRC”, tagstart, tagsize, “bTRC”, tagstart, tagsize */
```
|
|
2022-11-19 05:48:22
|
I'm guessing TRC stands for Transfer Characteristics
|
|
2022-11-19 05:48:27
|
and rRTC is probably a typo
|
|
|
_wb_
|
2022-11-19 05:51:20
|
https://github.com/libjxl/libjxl/blob/9d960611774b5b14a46f33a20768cde2cd0c219f/lib/jxl/icc_codec_common.h#L48
|
|
|
Traneptora
|
2022-11-19 05:51:33
|
looks like it is a typo
|
|
|
_wb_
|
2022-11-19 05:51:53
|
Good catch. Indeed a typo for rTRC
|
|
|
Traneptora
|
2022-11-19 05:53:43
|
> `In this subclause, Shuffle(bytes, width) denotes the following operation: Bytes are inserted in raster order (row by row from left to right, starting with the top row) into a matrix with width rows, and as many columns as needed.`
|
|
2022-11-19 05:53:55
|
Why is it called `width` if the number of columns is arbitrary, and there's that many rows?
|
|
2022-11-19 05:54:06
|
usually the word `height` is used to describe the number of scanlines, i.e. the number of rows.
|
|
2022-11-19 05:55:15
|
How am I supposed to insert bytes in raster order into a matrix with as many columns as needed?
|
|
2022-11-19 05:56:52
|
oh, I see. I create a matrix with `width` rows and `height` columns (yes, this is backward), because I then transpose it later
|
|
2022-11-19 05:56:57
|
but that's very confusing
|
|
2022-11-19 05:57:47
|
and `height` in this case is `CeilDiv(bytes.size(), width)`
|
|
2022-11-19 08:55:21
|
Section E.4.5
|
|
2022-11-19 08:55:24
|
> ` val = (bytes[i + j] + (p >> (width − 1 − j))) & 255;`
|
|
2022-11-19 08:55:49
|
`p >> (width - 1 - j)` should be `p >> (8 * (width - 1 - j))`
|
|
2022-11-19 08:55:55
|
as `width` and `j` are in units of bytes, not bits
|
|
2022-11-20 09:18:37
|
Section J.4.2:
|
|
2022-11-20 09:18:51
|
```
DistanceStep2(x, y, cx, cy) {
dist = 0;
for (c = 0; c < 3; c++) {
dist += abs(sample(x + ix, y + iy, c) −
sample(x + cx + ix, y + cy + iy, c)) * rf.epf_channel_scale[c];
}
return dist;
}
```
`iy` and `ix` are not defined here
|
|
2022-11-20 09:18:55
|
is this a copy/paste typo?
|
|
2022-11-20 09:42:05
|
Section I.8, ScaleF function
|
|
2022-11-20 09:42:13
|
this is my fault, I flipped `c` and `b`
|
|
2022-11-20 09:42:51
|
`c` should be in the numerator, and `b` in the denominator
|
|
2022-11-20 09:51:12
|
Section I.6, Chroma From Luma:
|
|
2022-11-20 09:51:20
|
```ytox_dc_ = static_cast<int>(br->ReadFixedBits<kBitsPerByte>()) +
std::numeric_limits<int8_t>::min();```
|
|
2022-11-20 09:51:31
|
spec says -127, but this is -128
|
|
2022-11-20 09:52:15
|
since `signed char` goes from -128 to +127
|
|
2022-11-20 09:56:32
|
this makes a noticeable difference if the codestream encodes 128
|
|
2022-11-20 09:56:37
|
since it goes from 1 to 0
|
|
2022-11-21 04:26:33
|
Section G.3:
|
|
2022-11-21 04:26:36
|
These are in the wrong order
|
|
2022-11-21 04:27:14
|
the correct order is `lf_coeff` *then* `mlf_group` *then* `hf_meta`
|
|
2022-11-21 04:27:53
|
See `dec_frame.cc` line 301:
|
|
2022-11-21 04:27:56
|
```cpp
Status FrameDecoder::ProcessDCGroup(size_t dc_group_id, BitReader* br) {
PROFILER_FUNC;
const size_t gx = dc_group_id % frame_dim_.xsize_dc_groups;
const size_t gy = dc_group_id / frame_dim_.xsize_dc_groups;
const LoopFilter& lf = dec_state_->shared->frame_header.loop_filter;
if (frame_header_.encoding == FrameEncoding::kVarDCT &&
!(frame_header_.flags & FrameHeader::kUseDcFrame)) {
JXL_RETURN_IF_ERROR(
modular_frame_decoder_.DecodeVarDCTDC(dc_group_id, br, dec_state_));
}
const Rect mrect(gx * frame_dim_.dc_group_dim, gy * frame_dim_.dc_group_dim,
frame_dim_.dc_group_dim, frame_dim_.dc_group_dim);
JXL_RETURN_IF_ERROR(modular_frame_decoder_.DecodeGroup(
mrect, br, 3, 1000, ModularStreamId::ModularDC(dc_group_id),
/*zerofill=*/false, nullptr, nullptr,
/*allow_truncated=*/false));
if (frame_header_.encoding == FrameEncoding::kVarDCT) {
JXL_RETURN_IF_ERROR(
modular_frame_decoder_.DecodeAcMetadata(dc_group_id, br, dec_state_));
} else if (lf.epf_iters > 0) {
FillImage(kInvSigmaNum / lf.epf_sigma_for_modular, &dec_state_->sigma);
}
decoded_dc_groups_[dc_group_id] = uint8_t{true};
return true;
}
```
|
|
2022-11-21 04:28:33
|
notice that DecodeVarDCTDC is called before DecodeGroup
|
|
|
_wb_
|
|
Traneptora
```
DistanceStep2(x, y, cx, cy) {
dist = 0;
for (c = 0; c < 3; c++) {
dist += abs(sample(x + ix, y + iy, c) −
sample(x + cx + ix, y + cy + iy, c)) * rf.epf_channel_scale[c];
}
return dist;
}
```
`iy` and `ix` are not defined here
|
|
2022-11-21 05:04:44
|
<@179701849576833024> I suppose there is something missing there? In Step0/1 the ix,iy iterate over 5 values making a cross, I assume it's not the same in Step2 because then why would there be two different functions, but is it just ix=iy=0 or is it also iterating over some set of offsets?
|
|
|
|
veluca
|
2022-11-21 05:05:02
|
No just 0 0
|
|
|
_wb_
|
|
_wb_
I guess the question is: does `If sigma < 0.3 for a given varblock, the decoder skips all steps on the pixels of that block` also apply in the modular case? If no, then the clamping does make sense and we should clarify that the skipping only happens in the vardct case. If yes, then we should remove the redundant clamping and maybe clarify that "a given varblock" means the whole frame in the modular case, since there is no per-varblock sigma adjustment.
|
|
2022-11-21 05:05:20
|
also ping <@179701849576833024> on this one
|
|
|
|
veluca
|
2022-11-21 05:06:26
|
It does apply to modular too
|
|
|
_wb_
|
2022-11-21 05:16:14
|
Latest draft:
|
|
|
Traneptora
|
2022-11-21 05:32:40
|
oh, nice
|
|
2022-11-21 09:51:47
|
Section I.2.3, the default values for `x_factor_lf` and `b_factor_lf` are 128, not 127
|
|
|
Traneptora
```ytox_dc_ = static_cast<int>(br->ReadFixedBits<kBitsPerByte>()) +
std::numeric_limits<int8_t>::min();```
|
|
2022-11-21 09:55:10
|
same line of code as here, default coded value is 0
|
|
|
_wb_
|
2022-11-22 03:48:53
|
Today's draft:
|
|
2022-11-23 03:05:12
|
updated draft:
|
|
2022-11-23 03:06:03
|
I changed the signalling of `save_before_ct` quite a bit, I hope it is correct and easier to understand now
|
|
|
Traneptora
|
2022-11-23 04:29:48
|
much easier to read now
|
|
|
|
jjido
|
2022-11-25 10:12:49
|
Just want to say than you for the excellent work <@853026420792360980> .
|
|
|
Traneptora
|
2022-11-26 08:59:24
|
yw :)
|
|
|
_wb_
|
2022-11-28 03:04:29
|
<@853026420792360980> and <@268284145820631040> you could do me a big favor by checking if all the bugs you found are fixed now (and pointing me to the ones that still need fixing or where the fix is wrong/not good enough)
|
|
|
Traneptora
|
2022-11-28 03:15:57
|
I can take a look
|
|
|
_wb_
|
2022-11-29 11:50:23
|
Current draft: (clarified bit depth, save_before_ct, and hshift/vshift in modular)
|
|
|
Traneptora
|
2022-11-29 04:35:34
|
Section G.2.2:
> If the kUseLfFrame flag in frame_header is set, this subclause is skipped and the samples from LFFrame[frame_header.lf_frame] are used instead of the values that would be computed by this subclause and I.5.2.
|
|
2022-11-29 04:35:45
|
`frame_header.lf_frame` is not defined in the FrameHeader bundle
|
|
2022-11-29 04:48:37
|
is this the same `lf_level` used?
|
|
2022-11-29 04:49:17
|
does that mean `kUseLFFrame != 0` and `lf_level > 1` means you have an LF Frame which itself uses an LF Frame for its LF Coefficients?
|
|
2022-11-29 04:53:46
|
Section I.4:
|
|
2022-11-29 04:53:48
|
> To compute context, the decoder uses the procedures and constants in the following code, where c is the current channel (with 0=X, 1=Y, 2=B), s is the Order ID (see Table I.7) of the DctSelect value and qf is the HfMul value for the current varblock, and **qdc[3] are the quantized LF values** of (the top-left 8×8 block within) the current varblock (taking into account chroma subsampling if needed). The lists of thresholds qf_thresholds and lf_thresholds[3], and block_ctx_map are as decoded in LfGlobal ( G.1.1 and I.2.2).
|
|
2022-11-29 04:53:58
|
specifically, these are the *quantized* LF Values
|
|
2022-11-29 04:54:09
|
what happens if you have an LF Frame? which skips dequantization
|
|
2022-11-29 04:56:33
|
especially considering the LF Frame values are floats
|
|
2022-11-29 04:56:38
|
and the quantized LF Values are ints
|
|
|
_wb_
|
2022-11-29 05:07:59
|
that's a really good question — <@179701849576833024> any idea what libjxl will do here? just have all zeroes there or something? we need to put it in the spec...
|
|
|
Traneptora
|
2022-11-29 05:11:58
|
I tried populating the quantized values and I got some incorrect values
|
|
2022-11-29 05:12:04
|
I tried populating zeroes and got other incorrect values
|
|
2022-11-29 05:12:24
|
lenna-correct.jxl
|
|
2022-11-29 05:12:33
|
(converted to png with libjxl for discord)
|
|
2022-11-29 05:12:52
|
using the dequantized LF values produced this
|
|
2022-11-29 05:13:02
|
using zeroes produced this
|
|
2022-11-29 05:13:11
|
those are varblock boundaries, btw
|
|
2022-11-29 05:13:18
|
those seams
|
|
2022-11-29 05:13:22
|
|
|
2022-11-29 05:14:33
|
weirdly enough, both ANS streams verified
|
|
2022-11-29 05:14:38
|
which I'm somewhat surprised about
|
|
2022-11-29 05:14:51
|
I would have expected one to not verify and for that to just give it away
|
|
|
_wb_
|
|
Traneptora
does that mean `kUseLFFrame != 0` and `lf_level > 1` means you have an LF Frame which itself uses an LF Frame for its LF Coefficients?
|
|
2022-11-29 05:17:21
|
Yes, you can have recursive LF frames, up to 4 levels deep. So if you have a max-size image of 2^30 by 2^30 pixels (about one gigapixel by one gigapixel or one exapixel), you can have its 2^27 x 2^27 DC in LFFrame[0] which is a 1:8 image, which can have _its_ 2^24 x 2^24 DC in LFFrame[1] which is a 1:64 image, which can have _its_ 2^21 x 2^21 DC in LFFrame[2], which is a 1:512 image, which can have _its_ 2^18 x 2^18 DC in LFFrame[3], which is a 1:4096 image. That image needs to be modular or VarDCT with its own DC — in the latter case its DC would be a 2^15 x 2^15 downscale at 1:32768 of the full image, which would still be a 1 gigapixel image that would be encoded in 16384 DC groups.
|
|
|
Traneptora
|
2022-11-29 05:17:53
|
I see, so `lf_level > 0` is strictly for LF Frames themselves using LF Frames
|
|
2022-11-29 05:18:18
|
this was not clear to me
|
|
2022-11-29 05:18:38
|
either way, the `lf_frame` is not defined, it should be `lf_level`
|
|
|
_wb_
|
|
Traneptora
I tried populating the quantized values and I got some incorrect values
|
|
2022-11-29 05:19:28
|
wait how did you get a bitstream that uses qdc in combination with LF frames?
|
|
|
Traneptora
|
|
_wb_
wait how did you get a bitstream that uses qdc in combination with LF frames?
|
|
2022-11-29 05:19:41
|
because it's used in the HF Coefficient decoding
|
|
|
_wb_
|
2022-11-29 05:20:14
|
yeah but libjxl only uses that when recompressing jpegs and then it doesn't produce lf frames
|
|
|
Traneptora
|
2022-11-29 05:20:27
|
that's not what the spec says
|
|
|
_wb_
|
2022-11-29 05:20:32
|
I mean
|
|
|
Traneptora
|
2022-11-29 05:20:33
|
and also not what it does
|
|
2022-11-29 05:20:43
|
see section I.4
|
|
|
_wb_
|
2022-11-29 05:20:58
|
it sets the thresholds so that it effectively doesn't use that
|
|
|
Traneptora
|
2022-11-29 05:21:13
|
`BlockContext()` uses `qdc`
|
|
2022-11-29 05:21:38
|
and `blockContext()` is needed to determine which context to read the HF coefficient from
|
|
|
_wb_
|
2022-11-29 05:22:32
|
libjxl will set `nb_lf_thr` to zero for each channel when doing vardct-from-pixels
|
|
2022-11-29 05:23:03
|
so it effectively doesn't matter what the value of qdc is to determine the context
|
|
2022-11-29 05:23:28
|
that's probably why both ANS streams verified
|
|
|
Traneptora
|
2022-11-29 05:24:15
|
but they produced different results
|
|
2022-11-29 05:24:16
|
which is odd
|
|
|
_wb_
|
2022-11-29 05:24:49
|
yeah that is odd, could be another spec bug or a jxlatte bug
|
|
|
Traneptora
|
2022-11-29 05:27:11
|
oh yea, the number of LF Thresholds is empty
|
|
|
_wb_
yeah that is odd, could be another spec bug or a jxlatte bug
|
|
2022-11-29 05:30:33
|
I see. I have a race condition somewhere because I'm getting different results on subsequent executions
|
|
2022-11-29 05:30:36
|
I have to debug
|
|
|
_wb_
|
|
Traneptora
oh yea, the number of LF Thresholds is empty
|
|
2022-11-29 05:32:56
|
yeah and the only time they will not be empty is with recompressed jpegs — we needed some extra context there to help with compression, and quantized DC was somewhat useful; for 'full' VarDCT images we can use the adaptive quant weights which are a much more useful piece of context (but not for recompressed jpegs since those obviously don't do real adaptive quantization)
|
|
2022-11-29 05:33:40
|
but we never made lf frames work for recompressed jpegs so we never encountered the issue of what to do with `qdc` in that case
|
|
|
|
veluca
|
|
_wb_
that's a really good question — <@179701849576833024> any idea what libjxl will do here? just have all zeroes there or something? we need to put it in the spec...
|
|
2022-11-29 05:36:17
|
The real answer is that I don't remember, but I thought all 0...
|
|
|
_wb_
|
2022-11-29 05:36:47
|
https://github.com/libjxl/libjxl/blob/89926120c9e7aea0483c17be70570a7448609da5/lib/jxl/passes_state.cc#L60
|
|
2022-11-29 05:36:54
|
just checked, all zero indeed
|
|
|
Traneptora
|
|
_wb_
libjxl will set `nb_lf_thr` to zero for each channel when doing vardct-from-pixels
|
|
2022-11-29 05:37:20
|
sure, but the spec doesn't explicitly say that the number of LF Thresholds must be zero if `kUseLFFrame != 0`
|
|
2022-11-29 05:37:37
|
if this is always the case it should say it
|
|
|
_wb_
|
2022-11-29 05:38:24
|
yeah either we do that or we just say that quantized lf coeffs are to be considered zero if an lf frame was used
|
|
|
Traneptora
|
2022-11-29 05:38:36
|
that works too
|
|
|
_wb_
|
2022-11-29 05:51:35
|
Wait, what libjxl has there are the indices it will use to add to the block ctx. Setting those to zero is effectively the same thing as pretending all qdc values are -inf
|
|
2022-11-29 05:52:32
|
(so they are always in the first bucket of every lf threshold bucket, regardless of what the threshold values are)
|
|
2022-11-29 05:53:10
|
Probably better to just say the nb of thresholds is zero in this case (and check it in libjxl too, I guess)
|
|
|
Traneptora
|
2022-11-29 06:47:21
|
turns out the bug was in my inverse vertical squeeze
|
|
2022-11-29 06:47:51
|
I was iterating over each row, and then each column, as usual, rather than iterating over each column and then each row
|
|
2022-11-29 06:48:17
|
which matters as each vertical strip references the previous pixel in the column
|
|
2022-11-29 06:48:20
|
i.e. at higher rows
|
|
2022-11-29 06:48:28
|
so you need to do the rows in order and parallelize over the columns
|
|
2022-11-29 06:48:45
|
after that, it's working properly
|
|
|
_wb_
|
2022-11-29 06:49:59
|
Nice, and yes, vertical unsqueeze is a bit tricky to parallelize
|
|
|
Traneptora
|
2022-11-29 06:50:14
|
in my case I just swapped the order of the y and x
|
|
2022-11-29 06:50:36
|
basically I have a function which runs one pixel and then I have a loop that iterates over the row
|
|
2022-11-29 06:50:43
|
and then I run that row loop on the pool
|
|
2022-11-29 06:51:03
|
in this case I just changed it so I iterate over the x coordinate on the outside, then run on pool, and then iterate over y coordinate on the inside
|
|
|
_wb_
|
2022-11-29 06:52:09
|
Yeah that works, you can simd it too but that's where it gets trickier
|
|
|
Traneptora
|
2022-11-29 06:54:00
|
it's decoding correctly now except for this artifact
|
|
2022-11-29 06:54:01
|
|
|
2022-11-29 06:54:13
|
which looks like some sort of X channel overflow. but that's likely a jxlatte bug, not a spec bug.
|
|
|
|
veluca
|
|
Traneptora
in this case I just changed it so I iterate over the x coordinate on the outside, then run on pool, and then iterate over y coordinate on the inside
|
|
2022-11-29 10:41:26
|
that works but destroys your cache/memory bandwidth efficiency
|
|
|
Traneptora
|
|
veluca
that works but destroys your cache/memory bandwidth efficiency
|
|
2022-11-29 10:41:46
|
I don't really have any in this case either way
|
|
2022-11-29 10:42:01
|
I'd have to transpose, squeeze horizontally, and transpose back for maximum efficiency
|
|
2022-11-29 10:42:19
|
but I agree that vertical squeeze is hard to optimize
|
|
|
_wb_
|
2022-11-30 03:54:59
|
Today's spec draft:
|
|
|
Traneptora
|
2022-11-30 03:58:07
|
I'll take a look at it this afternoon
|
|
2022-12-02 04:08:38
|
DCT-II is the forward DCT and DCT-III is the inverse DCT. Is this intentional?
|
|
2022-12-02 04:08:52
|
I mean the pseudocode/formulas specified for the DCTs works but are these names accurate?
|
|
|
|
veluca
|
2022-12-02 04:10:13
|
probably not
|
|
2022-12-02 04:10:35
|
I don't remember when is the last time we checked those definitions, but a long time ago
|
|
|
Traneptora
|
2022-12-02 04:11:13
|
Section J.2, JPEG Upsampling, doesn't specify what happens in corner cases. I'm guessing it uses `Mirror` (section 5.2) like every other restoration/upsampling filter, but this isn't explicit.
|
|
2022-12-02 04:12:30
|
Section C.1: libjxl disallows specific nested-nested-nested etc. distributions from using LZ77 because it can become a zip bomb
|
|
2022-12-02 04:12:44
|
the spec does not
|
|
2022-12-02 04:12:54
|
I don't know if the spec should be disallowing these as well
|
|