JPEG XL

Info

rules 57
github 35276
reddit 647

JPEG XL

tools 4225
website 1655
adoption 20712
image-compression-forum 0

General chat

welcome 3810
introduce-yourself 291
color 1414
photography 3435
other-codecs 23765
on-topic 24923
off-topic 22701

Voice Channels

General 2147

Archived

bot-spam 4380

Specification issues

veluca
Traneptora Section J.2, JPEG Upsampling, doesn't specify what happens in corner cases. I'm guessing it uses `Mirror` (section 5.2) like every other restoration/upsampling filter, but this isn't explicit.
2022-12-02 04:14:43
yep
Traneptora I don't know if the spec should be disallowing these as well
2022-12-02 04:14:55
I think also yep
Traneptora
2022-12-02 04:15:41
Note 4 below Table E.6 contains some `\LaTeX`
2022-12-02 04:16:37
I wonder if it's worth mentioning that PQ is also specified in SMPTE ST 2084
2022-12-02 04:16:59
and that 709, PQ, sRGB, are also specified in ITU-T H.273
2022-12-02 04:24:51
You use `&` as the bitwise-and operator, but don't define it in Section 4.3
2022-12-02 04:25:13
I assume most people reading this will know what it does, but it's still a good idea to be explicit
2022-12-02 04:26:08
it's not in the precedence table either
2022-12-02 04:28:50
`CoeffNumNonzeroContext[64] = {` there's an extra space in the first row, 123, 4th from the right. it's not incorrect, but the formatting is slightly off
Lines N Shapes
2022-12-03 07:37:40
In section B.2.3 the '+' is missing in what should be "If s == 1, value is 1 + u(4)"
yurume
2022-12-08 03:24:54
G.4.1 has the following sentence: > The dimensions correspond to `width`, except for extra channels with `dim_shift > 0` (as specified in Annex H) which have dimensions `ceil(width / (1 << dim_shift))`.
2022-12-08 03:25:56
I've been told that this should also say about `height`, as it's currently implicit (though obvious enough)
2022-12-08 03:28:51
C.2.6: the second loop `for (i = 0; i < table_size; i++)` should loop until `alphabet_size`, otherwise when `alphabet_size != table_size`, `underfull` might be initialized with more than `alphabet_size` elements
_wb_
veluca I think also yep
2022-12-08 03:51:32
What exactly should be changed in the spec for this?
veluca
2022-12-08 03:52:18
good question, I don't remember where this thing is
_wb_
Traneptora it's not in the precedence table either
2022-12-08 04:16:00
I'll add it to the precedence table, but I don't think we need to explicitly define it, since we start by saying > This document uses the operators defined by the C++ programming language (ISO/IEC 14882), with the following differences:
Traneptora
2022-12-08 04:16:15
Yea, that's fair
_wb_
2022-12-08 04:16:28
in the case of bitwise and, we use it without differences from how it is used in C++
2022-12-08 04:19:23
we also use bitwise or and xor so I'll add those to the precedence table too — though I'm not sure why we actually have that table, since it's just the same one as in C++
Tirr
_wb_ What exactly should be changed in the spec for this?
2022-12-08 04:23:33
in C.2.2: > Otherwise, if is_simple is false, the decoder reads a Bool() use_mtf. The decoder then initializes a symbol decoder using a single distribution D, as specified in C.2.1. spec should forbid using lz77 here
Traneptora
2022-12-08 04:33:02
currently libjxl allows lz77 here
2022-12-08 04:33:30
it only doesn't allow it on the child distribution if the parent already has <= 2 dists
2022-12-08 04:34:07
I ran into this issue when developing jxlatte, if you disallow lz77 for all nesteds you'll reject otherwise valid bitstreams
Tirr
2022-12-08 04:38:40
aah that's right, i thought the condition was just `true`
_wb_
2022-12-08 05:11:11
So I should add something like this to C.2.1 then, below table C.1 or so: > If `num_dist <= 2`, then `lz77.enabled` is `false`. Does that look OK?
2022-12-08 05:18:05
Or is it rather in C.2.2, where there is the recursive reference to C.2.1, that it should say that if `num_dist <= 2`, then in the nested distribution, `lz77.enabled` is false?
Tirr
2022-12-08 05:21:41
for me, it sounds like one should skip reading `lz77.enabled` when `num_dist <= 2`
2022-12-08 05:23:49
maybe i'm not familiar with normative specifications
2022-12-08 05:46:32
or before reading `LZ77Params`, add something like "If the decoder is being initialized by another decoder, and the parent decoder has `num_dist <= 2`, then the decoder reads a Bool() which is `false`, and sets `lz77.enabled` to `false`"?
_wb_
2022-12-08 05:47:29
I don't think that would be correct. I think the field needs to be read anyway, but it has to be `false` in a valid bitstream (so a decoder is free to explode into flames, or just politely refuse to decode, if it's not the case)
2022-12-08 05:48:14
Yes, something like that, except "being initialized by another decoder" and "parent decoder" are not really rigorous enough
Tirr
2022-12-08 05:51:15
yeah that was just an idea
_wb_ So I should add something like this to C.2.1 then, below table C.1 or so: > If `num_dist <= 2`, then `lz77.enabled` is `false`. Does that look OK?
2022-12-08 05:54:50
now I feel this is also ok
veluca
_wb_ Or is it rather in C.2.2, where there is the recursive reference to C.2.1, that it should say that if `num_dist <= 2`, then in the nested distribution, `lz77.enabled` is false?
2022-12-08 05:59:42
seems about right (not sure about the section)
_wb_
2022-12-18 08:45:13
We don't currently have a notion of "multi-page" like TIFF. I propose we add to the spec that if the frame duration field has the maximum value, it is to be interpreted as "end of page".
2022-12-18 08:52:18
So conceptually you can have multiple pages (max duration frames), each possibly consisting of multiple animation frames (nonzero duration frames), each possibly consisting of multiple layers (zero duration frames), each possibly internally using more than one frame to be encoded (a layer could have data in kReferenceOnly or LF frames, which are not semantically part of the decoded object).
2022-12-18 08:57:18
Multi-page with animations on evey page is maybe a bit unusual, but it could have use cases, e.g. for a set of cinemagraphs. We only have a loop count at the image header level so I propose it would then apply to every page.
2022-12-18 09:03:09
Multi-page still images could be quite useful in principle, to represent a set of related images that are not to be displayed as an animation, but as an ordered collection, with manual navigation. Comic books or rasterized slide decks, for example.
2022-12-18 09:06:33
Besides potential compression benefits (compared to storing these as separate files, one per page), having them as a single file object is also just more convenient.
spider-mario
2022-12-18 09:07:21
the thought of having the animation be one “level” higher (and therefore possibly having a different number of pages per “frame”) briefly entered my mind but it quickly left it again
2022-12-18 09:07:32
it would be terrible
_wb_
2022-12-18 09:12:56
How I propose it would gracefully degrade in applications that don't know about it, you just see the first page and if the animation playback has buttons for seeking or going to next/previous frame, those could be used for basic navigation. It's of course better if they actually know it's multi-page and not just a very long animation...
Fraetor
2022-12-18 11:09:39
~~I'd personally argue that paging is better handled at a higher level (by PDF or CBZ, et al.) but if there are significant space gains then it might be worth it.~~ My mind has been changed.
_wb_
2022-12-18 12:36:51
I don't see a significant difference between animation and multi-page, to me it's just that the first is autoplaying at the speed chosen by the author, while the second is manually navigated frame by frame.
improver
2022-12-18 01:32:49
you gotta need new metadata value telling that it's multi page. & even then i would say that stuff like CBZ are easier to use still because of how its being handled already in most apps
_wb_
2022-12-18 01:56:07
Already implemented things that work already are of course easier to use than new things. This seems a simple enough spec change though (just assigning mildly new semantics to existing syntax), applications can choose if they want to provide a specific user interface for it or not, but mentioning it explicitly in the spec can help to encourage viewers to interpret it as a multi-page file and handle it like how they would handle cbz or pdf or whatever.
Fraetor
2022-12-18 02:04:42
TBF, as long as falling back to a really slow animation is an option, it is probably a good idea. Multipage is a useful feature of TIF.
2022-12-18 02:09:56
Would you ba able to have differently size pages?
_wb_
2022-12-18 02:31:33
Frames don't have to be canvas-sized, they can be larger or smaller. But there is only one canvas size per image, and presumably what is shown (at least in an animation) is always canvas-sized (i.e. if a frame is smaller, it only updates part of the canvas, and if it is larger, you don't see what's outside the canvas)
2022-12-18 02:35:21
If you would want to mix landscape and portrait pages, probably the best way is make a square canvas, set the pages to not be saved as reference, and set the blend source of every page to an uninitialized reference frame, which has the effect of introducing transparent black padding (implicit all zeroes).
Traneptora
2022-12-25 05:18:23
This isn't quite a bug but it's not very clear
2022-12-25 05:18:47
in section K.4.1, you read the splines themselves one at a time
2022-12-25 05:19:08
it's not obvious to me that you read the control points for spline 0, then you read the coefficients for spline 0, then each for spline 1, etc.
2022-12-25 05:19:16
It's specified but it could be clearer
_wb_
2023-01-11 08:10:48
Today I'm flying to Sydney for the 98th JPEG meeting, finally an in-person meeting again after 3 years of virtual meetings due to covid. In the weekend we'll be preparing the final CD texts of the second editions of 18181-1 and 18181-2, which will then be discussed during the JPEG meeting next week and hopefully approved on Friday next week. If there are any more spec bugs or unclear wordings, feel free to mention them here so we can address all the remaining issues. We'd really like to have a very high-quality committee draft this time, so the delta between the CD and the final IS can be as small as possible. Up to the CD stage we are more or less free to (discretely) share the document for feedback; once it goes to DIS stage there's the ISO copyright thing and things unfortunately go behind the paywall. If we can have a CD that's good enough, then effectively in practice the paywall isn't a real obstacle. For the first edition this was impossible since we were still very actively defining the bitstream back then, so the delta between CD and IS was huge, but for the second edition I hope we can have a CD that is basically the same as the IS.
Tirr
2023-01-11 09:31:29
in H.5.2, minor typo: `W` is not in the list of "corresponding samples"
_wb_
veluca seems about right (not sure about the section)
2023-01-13 03:30:10
> Otherwise, if is_simple is false, the decoder reads a Bool() use_mtf. The decoder then initializes a symbol decoder using a single distribution D, as specified in C.2.1**, where `lz77.enabled` is `false`**. For each pre-clustered distribution i, the decoder reads an integer as specified in C.3.3. Finally, if use_mtf, the decoder applies an inverse move-to-front transform to the cluster mapping (see code below).
2023-01-13 03:31:29
is that the correct fix? In this recursive case, it is reading 'a single distribution' which means `num_dist == 1` there, right?
veluca
2023-01-13 04:02:23
I ... think so?
_wb_
2023-01-13 04:18:22
the entropy decoding wording in the spec is still not quite as clear as I would prefer it to be — it should be a lot better now than in the first edition, but things are still a bit too vague for my taste, as demonstrated by how tricky it seems to be to locate exactly how to fix this zip bomb thingie
Traneptora
_wb_ > Otherwise, if is_simple is false, the decoder reads a Bool() use_mtf. The decoder then initializes a symbol decoder using a single distribution D, as specified in C.2.1**, where `lz77.enabled` is `false`**. For each pre-clustered distribution i, the decoder reads an integer as specified in C.3.3. Finally, if use_mtf, the decoder applies an inverse move-to-front transform to the cluster mapping (see code below).
2023-01-13 04:59:40
this is not always the case, no. it's based on the number of distributions in the parent, not in the child
2023-01-13 04:59:53
if `num_dist <= 2` in the *parent* then it sets disallow lz77
_wb_
2023-01-13 05:00:44
so if `num_dist == 2`, right? since if it's 1 then there is no recursive call anyway
Traneptora
2023-01-13 05:01:06
according to libjxl code, yes
_wb_ so if `num_dist == 2`, right? since if it's 1 then there is no recursive call anyway
2023-01-13 05:03:51
if you set it to always disallow for nested then certain files don't decode, such as bench.jxl
2023-01-13 05:04:31
2023-01-13 05:04:44
I believe this is the same file as in the conformance testing repo but I've uploaded it here anyway
_wb_
2023-01-13 05:06:26
right. so if num_dist > 2, then the recursive distribution can still use lz77, but then that one can have num_dist == 1+1 (initially 1 but due to lz77.enabled == true it becomes 2), and if it is recursively reading a distribution, then that one cannot use lz77
2023-01-13 05:08:25
so at most there are 3 levels (grandparent with num_dist > 2, parent with num_dist == 1+1 = 2, child with num_dist = 1 and not allowed to get the +1 from lz77.enabled)
2023-01-13 05:09:32
Here is my current version of both parts of the spec:
Traneptora
2023-01-13 05:09:45
https://github.com/libjxl/libjxl/blob/58fff0336ec59fa3080a239ba89e0ed24d75d050/lib/jxl/dec_context_map.cc#L83
2023-01-13 05:09:49
Here's the relevant line in libjxl
2023-01-13 05:10:20
I have to sleep (it's midnight in my timezone) but I can try to take another scan through the specs tomorrow
_wb_
2023-01-13 05:11:47
ok I think what I wrote in the spec is the same thing as what libjxl does (I wrote num_dist == 2 instead of <= 2 but that boils down to the same thing because we're in a part of the spec where num_dist > 1)
2023-01-14 07:25:45
Current spec draft:
2023-01-14 07:26:53
one thing that was added: we now specify multi-page images to exist: > If duration has the value 0xFFFFFFFF, the decoder presents the next frame as the next page in a multi-page image.
Fraetor
2023-01-14 03:10:13
Finally, a format to rival TIFF.
improver
2023-01-14 03:12:33
i can't imagine web browsers making use of this though (too much UX changes necessary). in that case they should display just first image?
2023-01-14 03:13:19
or i guess interpret that value literally and do very long delay
2023-01-14 03:13:33
will there be any libjxl changes related to this
Fraetor
2023-01-14 03:36:33
Yeah, any decoders that don't implement this will just show them as really long duration animations.
2023-01-14 03:45:13
Which at the duration that is reprisentable with the least bits (100 ticks per second), is 1.3 years per frame.
2023-01-14 03:52:49
(Though you can get that down to 49 days in the same number if bits if you like, or beyond a millennium.)
Tirr
2023-01-15 05:42:32
In C.2.5: In the case of simple distribution (when the first Bool() is true), `alphabet_size` is set to 1 or 2, but this might not cover `v1` and/or `v2`
2023-01-15 05:43:08
`alphabet_size` is referred in alias mapping (C.2.6) so this might be confusing
2023-01-15 05:43:41
iirc libjxl sets `alphabet_size` to `max(v1, v2) + 1`
2023-01-15 05:57:30
In Table F.7, minor typo: backticks didn't make the text render as code
andrew
_wb_ If you would want to mix landscape and portrait pages, probably the best way is make a square canvas, set the pages to not be saved as reference, and set the blend source of every page to an uninitialized reference frame, which has the effect of introducing transparent black padding (implicit all zeroes).
2023-01-15 08:52:35
I still think there ought to be a clean way to express a multi-page document with differing page sizes like TIFF
_wb_
2023-01-15 08:57:22
I think it's clean enough: frames can already have arbitrary dimensions that can be different from the image dimensions. The only difference with TIFF (I think) is that you also have the image dimensions, defining a canvas, and frame offsets defining where to show each frame (and possibly what to crop when showing the frame). I think that's a feature: it means it's well defined how viewers should (by default) display a series of pages with differing dimensions.
2023-01-15 08:59:32
But applications can of course also just get the individual frames and display them in a different way (e.g. not cropping the outside-canvas parts of the frame, resized to fit viewport width or to fit the screen, or whatever)
andrew
2023-01-15 08:59:57
that canvas size issue still makes it feel a bit hacky
2023-01-15 09:03:23
perhaps it should be specified that if the duration is 0xFFFFFFFF, the canvas dimension should be ignored for subsequent pages?
_wb_
2023-01-15 09:04:21
How so? Generally I'm assuming viewers will display multi-page images like a slideshow where the canvas is rescaled to fit the screen/viewer window, not like a scrolling document — it's not like PDF where scrolling makes sense since the document likely involves text that is read top to bottom and the bottom of one page continues at the top of the next; generally I expect jxl to be used for photos and illustrations, which don't really have a particular 'reading direction'
2023-01-15 09:04:52
having it well-defined how to position pages on the canvas does make sense to me
2023-01-15 09:05:28
if applications choose to, they are still free to ignore the canvas, get all the individual pages, and display them in whatever way they want
andrew
2023-01-15 09:06:28
in my mind multi-page images (like in TIFF) are used to store scanned documents
2023-01-15 09:06:44
like if I put a .tif into Adobe Acrobat, it will create a PDF from each page in the image
_wb_
2023-01-15 09:07:43
Right. For that use case it would indeed make sense to display in a scrolling way and the canvas thing is not useful.
2023-01-15 09:08:32
(btw I kind of regret that Orientation is an image header field and not a frame header field, in this use case it would have been useful if it can be adjusted per frame)
andrew
2023-01-15 09:09:14
alternatively, you could make a valid argument that multi-page documents is not the job of JPEG-XL and should be handled by a container format like CBZ
_wb_
2023-01-15 09:11:31
Yes. But there are advantages to doing it at this level. E.g. if you have repeating elements like shared headers/footers/logos in a multi-page image, encoding it in a single jxl codestream allows encoding the repeated elements just once, while if it's handled by a container, that is not possible.
andrew
2023-01-15 09:13:34
here's how Windows Photo Viewer handles a multi-page TIF btw
2023-01-15 09:16:37
if you're going to go all-in on multi-page JXL, there should be frame-level metadata (exif, physical dimensions, etc.) such that it's possible to (semi-)losslessly merge/unmerge multiple JXL files
_wb_
2023-01-15 09:58:43
There can be multiple Exif boxes in a jxl file, I think we just need to define what that means — I would suggest Exif box *n* corresponds to frame *n*, and if there are fewer Exif boxes than frames, the last Exif gets repeated for the remaining frames.
sklwmp
2023-01-16 12:54:14
The Rec. 709 specification does not actually state a "display" function, but seems to recommend BT.1886 (gamma 2.4), while the current implementation uses gamma 1/0.45 (inverse of Rec. 709)
Traneptora
sklwmp The Rec. 709 specification does not actually state a "display" function, but seems to recommend BT.1886 (gamma 2.4), while the current implementation uses gamma 1/0.45 (inverse of Rec. 709)
2023-01-16 01:58:23
except H.273 does specify the gamma value
2023-01-16 01:58:49
https://www.itu.int/rec/T-REC-H.273-202107-I/en
2023-01-16 01:59:22
look at section 8.2
2023-01-16 02:00:01
it specifies `V = 4.500 * Lc` for `Lc < beta` and specifies `V = alpha * Lc^0.45 - (alpha - 1)` for `Lc >= beta`
2023-01-16 02:00:33
It's not a good idea to use BT.1886 because it's not actually the inverse of BT.709
2023-01-16 02:00:35
(really)
2023-01-16 02:01:17
generally speaking you should use the inverse of the function in H.273 for any purpose that is not targeting a display
2023-01-16 02:01:38
and if targeting a display you should invert BT.709 with BT.1886
2023-01-16 02:01:57
the reason these functions aren't inverses of each other can be boiled down to "historical crap"
2023-01-16 02:03:04
but the 709 transfer function looks similar to the sRGB one in that it has a linear segment and a gamma segment
2023-01-16 02:07:15
either way, for libjxl it should be inverting BT.709 to do that
sklwmp
Traneptora except H.273 does specify the gamma value
2023-01-16 02:47:29
It does, but for the opto-electrical transfer function, which converts *from* light to electricity. Nothing for the other way around (which is what most decoders are concerned about). Neither H.273 nor ITU-R BT.709-6 specify an electro-optical transfer function, but BT.709 refers to BT.1886 as the "reference decoding function" for a "reference monitor".
Traneptora and if targeting a display you should invert BT.709 with BT.1886
2023-01-16 02:48:31
should libjxl do this? or does it introduce complications because libjxl is an image format
2023-01-16 02:48:58
most video decoding software uses BT.1886 to display BT.709 video, it seems
2023-01-16 02:49:22
also, these ITU links seem to 404 for me?
Traneptora
sklwmp should libjxl do this? or does it introduce complications because libjxl is an image format
2023-01-16 03:08:36
IMO no, since libjxl doesn't target displays
2023-01-16 03:08:43
it should use the inverse of what's specified in H.273
2023-01-16 03:09:16
do note that libjxl uses a CMS to encode, which means it's likely littlecms or skcms that is actually inverting BT.709, not libjxl native code
sklwmp
2023-01-16 03:19:20
although, just to clarify, the Rec 709 standard *was* designed with displays in mind, right? are color calculations the only situation wherein you use the inverse OETF instead the EOTF?
Traneptora
2023-01-16 03:26:48
any time you have to convert something to linear light
spider-mario
sklwmp most video decoding software uses BT.1886 to display BT.709 video, it seems
2023-01-16 12:04:30
anecdotally, I recall grading a video and letting DaVinci Resolve encode it with a 2.4 gamma and a Rec. 709 tag, uploading the result to YouTube and Facebook, and the result looking totally washed out because the end-to-end encoding/playback chain of YouTube and Facebook did not actually seem to apply such a high gamma
2023-01-16 12:06:54
it seems that my videos look most like what they are supposed to if I just encode them with the 709 OETF
Tirr
2023-01-18 06:32:55
In I.3.2, minor typo: there is broken formatting with backticks
2023-01-18 12:20:23
I.2.4 has similar formatting error after the second psuedocode block
2023-01-18 01:16:55
In I.2.4, the first pseudocode: in the double for loop in `encoding_mode == AFV`, I think `vals` should be `params` instead
2023-01-18 01:18:50
I haven't checked the corresponding part of libjxl yet
2023-01-19 05:30:37
In I.2.4, when computing weights matrix: weights(0, 0) is not defined for encoding mode DCT2. I guess weights(0, 0) = 1.0, is this correct?
yurume
2023-01-19 05:34:08
not just undefined, but it is not used at all. so its value doesn't actually matter (libjxl had `-0xBAD` in its place AFAIK) but the current wording of the spec obscures that fact.
Tirr
2023-01-19 05:35:38
hmm then I'll go with +0.0
yurume
2023-01-19 05:35:51
it is not used because it is entirely covered by LLF coefficients, the top-left 1/8 x 1/8 portion of coefficients, which are derived from LF images
2023-01-19 05:36:06
yeah anything will work
2023-01-19 05:36:15
I briefly considered to put NaN there
Tirr
2023-01-19 05:38:04
Hornuss mode defines weight(0, 0) as 1.0 though, is it actually used?
yurume
2023-01-19 05:38:14
again, nope
Tirr
2023-01-19 05:40:16
so basically we don't care the area (0, 0) to (w/8, h/8) of the dequant matrix
yurume
2023-01-19 05:42:37
correct
2023-01-19 05:43:20
if you look at the coefficient decoding process, it even doesn't bother to read those them, it skips immediately to the first non-LLF coefficient
_wb_
2023-01-19 07:33:37
Unless anyone finds another issue, this will be the CD text we'll be approving tomorrow at the final plenary of the 98th JPEG meeting. There's only one known issue left, the references in the pseudocode in E.4.4 but that's due to a bug in Metanorma (https://github.com/metanorma/metanorma-iso/issues/909) and I'll fix that manually in the Word document.
2023-01-19 11:27:01
One of the trickier things was to define some limits on splines so there's a bound on the decode effort it takes to render them that can be computed without actually rendering them or forcing a specific rendering implementation, so especially in level 5 you cannot make an image that decodes crazy slow by using many/large splines, but also so things are not restricted too much. <@853026420792360980> I think you're the only one who already did an independent implementation of splines, so please take a look to see if how we did it makes sense to you.
Tirr
2023-01-19 12:16:59
In I.2.4, when computing weight matrix: in the pseudocode for AFV encoding mode, `Interpolate` is called with fourth argument `4`. What does this mean?
_wb_
2023-01-19 01:23:46
looks to me like a spurious argument that just shouldn't be there
veluca
2023-01-19 01:26:54
it means that the spec/that function was modified a lot of times
2023-01-19 01:26:55
😛
Tirr
2023-01-19 01:55:42
maybe that fourth argument was some kind of multiplier?
_wb_
2023-01-19 01:58:52
no it was bands.size() basically, iiuc
2023-01-19 10:26:36
Traneptora
_wb_ One of the trickier things was to define some limits on splines so there's a bound on the decode effort it takes to render them that can be computed without actually rendering them or forcing a specific rendering implementation, so especially in level 5 you cannot make an image that decodes crazy slow by using many/large splines, but also so things are not restricted too much. <@853026420792360980> I think you're the only one who already did an independent implementation of splines, so please take a look to see if how we did it makes sense to you.
2023-01-19 11:23:35
what are fwidth and fheight here?
2023-01-19 11:23:38
frame width and frame height?
_wb_
2023-01-19 11:26:03
yes
2023-01-20 03:21:42
The new spline limits will not go in the CD yet, since we couldn't get consensus to just do that already without following the formal process of NB comments, so I suppose they will be officially added only at the DIS stage.
2023-01-20 03:23:41
So what's above is basically what we will approve to be the CD, except it has those new two rows in the levels table that are not part of the official CD and that we will request to be added, as a ballot comment on the CD ballot.
veluca
2023-01-20 07:01:42
since when do you need comments for changes to a WD?
_wb_
2023-01-20 07:16:41
Since it's not an editorial change but a technical change in the profiles and levels, and Fernando claims profile/levels definitions cannot be changed without following the process. In fact he also wants us to provide an input document for the next meeting to explain the change and discuss the implications on interoperability.
2023-01-20 07:18:40
Which makes sense in general and on an abstract level. In this particular case I would argue that the interop implications are minimal, but he has a point in principle that these implications do need to be considered.
2023-01-20 07:22:30
I will make some slides for the next meeting to convey the point that we're essentially only defining the playground for a coding tool that is currently only used frivolously in jxlart, and although we do want to have a well-defined restriction on it in case there are future, more serious applications that use this, and also to ensure that the browser use case (level 5) has some amount of protection against excessive resource use. Since we expect a future hardware profile to not include splines at all, the implications for that will be zero.
2023-01-20 07:39:12
In the meantime, libjxl will just define the de-facto limits, and I suppose other software implementations can either also use those limits, or choose to not check level conformance at all, which is always also a valid approach to make a conforming decoder.
Traneptora
2023-01-25 09:41:20
^ jxlatte doesn't reject streams that otherwise would be conformant but break level rules
2023-01-25 09:41:29
for most things
2023-01-25 09:41:32
for some things it does
2023-01-25 09:41:55
I check the size header for raw codestreams to make sure I'm not getting a false positive
_wb_
2023-02-06 07:29:09
I wonder which of the 3 independent decoders will be the first to implement a full decoder 😛
Tirr
2023-02-12 07:17:17
I.5.3: when computing `Mul` it should be divided by `quantizer.global_scale`, not multiplied
_wb_
2023-02-12 07:20:50
🤦
Tirr
2023-02-12 07:21:01
I.8: `cx` should be `max(bwidth, bheight) / 8`, `cy` should be `min(bwidth, bheight) / 8`, `ScaleF` arguments should be `(y, cy)` and `(x, cx)`
2023-02-13 09:32:24
I.9.8: - each occurence of `coefficients(1, 0)` and `coefficients(0, 1)` should be `(0, 1)` and `(1, 0)`, respectively - in the last for loop, `samples_4x4` should be `samples_4x8`
_wb_
2023-02-13 12:31:20
This is all VarDCT stuff, not really what I know best. Can the implementors of libjxl, j40 and/or jxlatte double-check these findings? (some of these things might be spec bugs I accidentally introduced when fixing other spec bugs, so I want to be sure)
Tirr
2023-02-13 12:41:10
I'm not 100% sure about those (especially I.8). jxl-oxide still seems to have a bug related to LF that makes decoded image look darker and have blocky artifacts
2023-02-13 12:41:15
double-checking would be very helpful
Traneptora
_wb_ This is all VarDCT stuff, not really what I know best. Can the implementors of libjxl, j40 and/or jxlatte double-check these findings? (some of these things might be spec bugs I accidentally introduced when fixing other spec bugs, so I want to be sure)
2023-02-14 10:02:45
I was the one who reported those originally, but I later mentioned that I got something backwards
2023-02-14 10:03:00
I don't know if it was ever changed
2023-02-14 10:05:06
https://discord.com/channels/794206087879852103/1021189485960114198/1043823541138763817
Traneptora I believe that last samples should say samples4x8
2023-02-14 10:08:42
I reported this already here, so I can verify it
Tirr I.5.3: when computing `Mul` it should be divided by `quantizer.global_scale`, not multiplied
2023-02-14 10:13:47
not sure about this one, compare it here: https://discord.com/channels/794206087879852103/1021189485960114198/1043286802934550643
Tirr
2023-02-19 10:39:43
I.2.4: for encoding mode RAW, computed weights matrix should be used as is, reciprocals should not be taken
2023-02-19 10:39:51
(it fixed my losslessly transcoded image)
Traneptora It is multiplied by `(1 << 16) * quantizer.global_scale / HfMul`
2023-02-19 10:47:31
I think it's `(1 << 16) / (global_scale * HfMul)`, to be consistent with LF dequant where `m{X,Y,B}DC` is computed as `(1 << 16) * m_{x,y,b}_lf_unscaled / (global_scale * quant_lf)`
2023-02-19 10:47:35
it worked for me at least
Traneptora
Tirr I.2.4: for encoding mode RAW, computed weights matrix should be used as is, reciprocals should not be taken
2023-02-19 12:31:54
I confirm this as I reported it too
yoochan
2023-03-03 03:58:19
Annex L 2.2 ```Lmix = (pow(Lgamma − cbrt(oim.opsin_bias0)),3) + oim.opsin_bias0) * itscale; Mmix = (pow(Mgamma − cbrt(oim.opsin_bias1)),3) + oim.opsin_bias1) * itscale; Smix = (pow(Sgamma − cbrt(oim.opsin_bias2)),3) + oim.opsin_bias2) * itscale;``` on thoses 3 lines, there is one more closing parenthesis than opening 😄
2023-03-03 04:03:41
and even if it is perfectly clear what is what, the symbol B is used twice, both for the YXB and RGB third channel...
2023-03-03 04:04:14
I'm perhaps dumbreading too much here 😅
_wb_
2023-03-03 04:33:07
Leonard from Adobe just pointed me to this: https://github.com/pdf-association/pdf-issues
2023-03-03 04:33:35
maybe we should do something similar, creating a public repo just for opening spec issues
2023-03-03 04:33:43
wdyt, <@795684063032901642> ?
yoochan
2023-03-03 07:17:10
I guess having the html under revision control is not an acceptable option ?
2023-03-03 07:18:22
the html is perhaps not the original... is it generated from the pdf or the other way around ?
_wb_
2023-03-03 08:33:59
Html is generated from metanorma sources
2023-03-03 08:34:10
Which we have on a private git repo
2023-03-03 08:34:23
I want to make it public but ISO does not allow that
yoochan
2023-03-04 05:58:05
I understand... It seems difficult to open public issues on a private document. But it could be issues tagged "spec" on the main libjxl repo
Fraetor
2023-03-04 10:52:15
I think having a separate repo for with just a README.md would work. It gives a place to publicly track spec issues.
Traneptora
2023-03-10 06:56:03
Section G.1.2, Table G.2, typo in definition of `m_b_lf`
2023-03-10 06:56:08
it's listed as `m_b_l`
2023-03-11 09:41:44
<@794205442175402004> dyou have an updated spec draft? so I can use the latest version when working on libhydrium
_wb_
2023-03-11 09:43:22
Haven't made any updates yet since the version above, iirc. Still need to get the more recently reported issues fixed...
Traneptora
2023-03-11 09:43:39
I see, thanks
yoochan
2023-03-12 10:31:39
Libhydrium?
sklwmp
2023-03-12 10:39:56
i believe it is this: https://github.com/thebombzen/hydrium
yoochan
2023-03-12 11:08:20
thx ! I couldn't find it 🙂
Traneptora
2023-03-12 11:16:14
very WIP
yoochan
2023-03-12 08:40:37
Every journey start with a first step
improver
2023-03-13 07:52:44
imo these numbers are more "expected", similar how people keep using 16:10 & not 8:5, it makes remembering them easier when u can stack them alongside other ratio numbers
2023-03-13 07:57:08
though its not very common as a ratio anyway...
_wb_
2023-03-13 08:12:11
yes, I suppose in particular the 2:1 is to emphasize that it's an aspect ratio, but then maybe we should also have done `× 1 Idiv 1` in the first one 🙂
2023-03-13 08:13:09
maybe we should make it a table again instead of pseudocode...
improver
2023-03-13 08:26:30
what is 12:10 used for btw? i can't seem to find it as being common for anything on the net, though its position does make sense in this list so its prob not a typo
_wb_
2023-03-13 08:33:52
No idea how that list of aspect ratios was decided, it was already there in pik.
2023-03-13 08:42:38
480x576 is described in BT.1358 so there's that, I guess
2023-03-13 08:43:22
it seems a reasonable intermediate aspect ratio between 5:4 and square
2023-03-13 08:43:36
but I agree it seems a bit exotic
Tirr
2023-03-13 09:19:42
I once had a 16:10 monitor
spider-mario
Tirr I once had a 16:10 monitor
2023-03-13 09:42:22
afaik, current macbooks are still 16:10
2023-03-13 09:42:36
the Pro that have the notch are 16:10 even after you remove the height of the notch (so even a bit taller with the notch)
Fraetor
2023-03-13 07:39:05
And many other laptops are too.
jonnyawsom3
2023-03-13 11:42:40
I know 16:10 is getting popular-ish since it gives you a full 1080p with the taskbar underneath on windows as opposed to covering it up partially, plus more room for editing timelines and such
Traneptora
2023-03-15 08:49:50
Annex O says the initial state should be 0×130000
2023-03-15 08:49:55
I believe this should say 0x130000
2023-03-15 08:50:06
as this is a hex literal, not multiplication
_wb_
2023-03-15 08:58:14
Yep. It's a metanorma thing, it auto-converts things that look like number x number to use the times symbol. We should put all hex literals in `monospace` to avoid that...
Tirr
2023-03-17 05:13:07
I.6: chroma from luma for LF should be skipped when kUseLfFrame is set (which seems obvious, but still)
Traneptora
2023-03-18 11:51:12
update: I was wrong
2023-03-18 11:51:30
gab_custom is *not* signalled if RestorationFilter all_default = 1
2023-03-18 11:51:47
it didn't matter 7 times out of 8 but I found and 8th time where it mattered
2023-03-18 11:53:04
2023-03-18 11:53:06
here's a sample
Tirr
2023-04-11 05:42:48
Table I.6: the second element of SeqB should be `-0.3633036457487539`
2023-04-11 05:43:20
(the sign is missing)
_wb_
2023-04-11 05:45:54
Oops!
Tirr I.8: `cx` should be `max(bwidth, bheight) / 8`, `cy` should be `min(bwidth, bheight) / 8`, `ScaleF` arguments should be `(y, cy)` and `(x, cx)`
2023-04-12 02:13:41
can this particular change be confirmed by now?
Tirr
2023-04-12 02:14:22
I'm still using modified one at least
Traneptora
_wb_ can this particular change be confirmed by now?
2023-04-12 02:15:37
https://discord.com/channels/794206087879852103/1021189485960114198/1043823541138763817
_wb_
2023-04-12 02:17:31
ok, looks to me like that change makes sense
2023-04-12 02:23:19
this is the draft as I have it now
2023-04-12 02:23:59
(preparing for the upcoming JPEG meeting where we will try to move to DIS stage)
Eugene Vert
2023-04-16 10:34:53
The draft still seems to have extra arclengths in K.4.2 `Sigma = ContinuousIDCT(dctSigma, 31 * arclength_from_start / arclength);` `value = ContinuousIDCT(dctC, 31 * arclength_from_start / arclength);`
2023-04-16 10:35:07
https://discord.com/channels/794206087879852103/794206170445119489/1039469299045240892
_wb_
2023-04-17 12:42:13
thanks for noticing that, I only looked at spec bugs reported here, not in <#794206170445119489> , so I missed that one!
2023-04-17 12:42:33
Is this still an issue https://discord.com/channels/794206087879852103/794206170445119489/1039467250803036160 ? <@853026420792360980>
Traneptora
_wb_ Is this still an issue https://discord.com/channels/794206087879852103/794206170445119489/1039467250803036160 ? <@853026420792360980>
2023-04-17 02:15:27
I can double check in a bit
_wb_
Traneptora I can double check in a bit
2023-04-18 06:45:06
that would be great
Eugene Vert
_wb_ Is this still an issue https://discord.com/channels/794206087879852103/794206170445119489/1039467250803036160 ? <@853026420792360980>
2023-04-18 10:08:30
I'm getting very strange results using the version from the specification
2023-04-18 10:11:09
For example: (libjxl is 1st picture)
2023-04-18 10:12:28
Or this:
2023-04-18 10:49:29
With ```c++ diffX = x - p.x; diffY = y - p.y; distance = sqrt(diffX * diffX + diffY * diffY); factor = erf((0.5 * distance + SQRT_0125) * inverseSigma); factor -= erf((0.5 * distance - SQRT_0125) * inverseSigma); /* the sample at column x and row y of channel C */ += 0.25 * value * intensity * sigma * factor * factor; ``` as in `splines.cc:67 void DrawSegment` line 79 and tweaked ```c++ xend = min(image.height − 1, floor(p.y + maximum_distance + 1.5)); yend = min(image.height − 1, floor(p.y + maximum_distance + 1.5)); ``` as in `splines.cc:92 void DrawSegment` line 98, the splines render normally and match the libjxl version
2023-04-18 11:27:04
Also ```c++ maximum_distance = −2 * log(0.1) * Sigma * Sigma; ``` is ```c++ maximum_distance = sqrt(-2 * sigma * sigma * (log(0.1) * 3.0 - log(max_color))); ``` as in splines.cc:108 void ComputeSegments line 135
Traneptora
2023-04-18 04:43:55
ah, looks like you double checked
2023-04-18 04:43:58
thanks
_wb_
2023-04-18 04:46:46
ugh what is `max_color`?
Traneptora
_wb_ ugh what is `max_color`?
2023-04-18 04:52:02
maximum value of colorX, colorY, and colorB
2023-04-18 04:52:22
this is what I have for it in jxlatte
2023-04-18 04:52:38
```java values[0] = fourierICT(coeffX, t) * arc.arcLength; values[1] = fourierICT(coeffY, t) * arc.arcLength; values[2] = fourierICT(coeffB, t) * arc.arcLength; float maxColor = MathHelper.max(0.01f, values[0], values[1], values[2]); ```
2023-04-18 04:53:19
where `t \in [0, 31]`
2023-04-18 04:53:27
proportional to progress along arc
2023-04-18 04:54:35
in libjxl it's this
2023-04-18 04:54:38
```c++ // We cap from below colors to at least 0.01. float max_color = 0.01f; for (size_t c = 0; c < 3; c++) { max_color = std::max(max_color, std::abs(color[c] * intensity)); } ```
_wb_
2023-04-18 05:03:29
<@604964375924834314> would you mind taking a look at the splines rendering in the spec and fixing the above spec bugs in the spec directly? I'll review if you make a PR on the spec repo
spider-mario
2023-04-18 05:04:24
I’m sick at the moment but I’ll try to have a look later in the week
_wb_
2023-04-18 06:19:33
Oh, no rush, get better soon!
Eugene Vert Also ```c++ maximum_distance = −2 * log(0.1) * Sigma * Sigma; ``` is ```c++ maximum_distance = sqrt(-2 * sigma * sigma * (log(0.1) * 3.0 - log(max_color))); ``` as in splines.cc:108 void ComputeSegments line 135
2023-04-21 10:38:33
as far as I understand, the maximum_distance in libjxl is just a tighter estimate than in the spec, to avoid making the bounding box larger than it should be, for speed reasons. But having a bigger number in the spec shouldn't really make a difference. So not sure if this needs to change in the spec, it's just an optimization of the implementation. But it looks like you also found a bug in the spec that makes things blurrier than they should be, and that should be fixed of course.
2023-04-21 10:38:51
<@179701849576833024> could you take a look too?
veluca
2023-04-21 10:51:14
Maybe next week
_wb_
2023-04-21 11:34:13
`maximum_distance` only influences the size of the rectangle that gets repainted, where you don't want to make those rectangles too large since there is one of those for every sample point along the spline, and if they're too large you're effectively adding a lot of zeroes (or values extremely close to zero) most of the time, which isn't making any visible difference anyway
2023-04-21 11:36:01
but as far as the spec goes (which doesn't have to care about computational complexity, only about correctness), I think we could probably even drop `maximum_distance` completely and always theoretically update all pixels (even if most of those updates will be + 0)
2023-04-21 02:47:32
updated draft:
2023-04-21 02:48:08
simplifying/fixing the spline rendering to this:
2023-04-21 02:51:03
(old version looked like this, which wasn't very correct and also included details on how to pick xbegin/xend/ybegin/yend which were both unnecessary and wrong)
2023-04-21 02:53:59
<@736666062879195236> <@604964375924834314> <@179701849576833024> <@853026420792360980> please double-check if the stuff I'm putting in the spec is now actually correct, and if the NOTE makes sense.
2023-04-21 02:55:17
if there are no more bugs found, this will be the version that will become the DIS we submit at the end of next week.
2023-04-21 02:56:48
(so please if you find more spec bugs, find them before ~Wednesday April 26th, so we can still fix them in time)
2023-04-21 02:58:39
once it goes to DIS stage, any change will have to be made twice (once in the Word file the ISO editor will give us, and once in the metanorma source repo) and it would be best if no more technical changes need to be made so we can skip FDIS and go straight from DIS to IS
Traneptora
_wb_ <@736666062879195236> <@604964375924834314> <@179701849576833024> <@853026420792360980> please double-check if the stuff I'm putting in the spec is now actually correct, and if the NOTE makes sense.
2023-04-21 03:04:04
not sure I like abstracting away max_distance and just iterating over the whole image since in practice that will be absurdly slow. in other places where something was specified and an implementation suggestion existed, it could safely be ignored for naive implementations. e.g. in the WP section of the modular spec, it says "you can implement X / (1 << 24) with a lookup-table" but if you just do the integer division, it's not *that* bad. but in this case I'd probably suggest some heuristic for the size of that rectangle rather than just say "it exists"
2023-04-21 03:04:49
but mathematically speaking it appears what you have written is correct, from my first look
2023-04-21 03:07:01
someone doing a naive implementation-directly-from-spec should be able to do so IMO without a lot of extra math
Tirr
2023-04-21 03:08:43
this is unrelated to splines and more like a question. it seems that VarDCT RGB is possible. do we read coeffs in GRB order when this is happening? the spec only says for XYB (in YXB order) and it feels a bit unclear to me
2023-04-21 03:09:50
for YCbCr images coeffs are in YCbCr order, but modular images are in CbYCr order
2023-04-21 03:10:18
(this confused me when implementing vardct for the first time)
_wb_
2023-04-21 03:10:22
makes sense to put something a bit better than the super naive thing there, as long as it's safe and not too complicated. Adding the thing with `max_color` and everything kind of complicates things quite a bit, but maybe we can just have some simpler bound that is not as tight but also helps implementers. Then again there are other things like e.g. the DCT itself where we don't really offer any implementation advice in the spec, so I think it might be better to let those who want inspiration for optimizing things just look at existing implementations for that
Traneptora
Tirr this is unrelated to splines and more like a question. it seems that VarDCT RGB is possible. do we read coeffs in GRB order when this is happening? the spec only says for XYB (in YXB order) and it feels a bit unclear to me
2023-04-21 03:10:32
at any point in the spec if it references X, Y, B for images that aren't XYB-encoded, you drop-in X, Y, B as R, G, B instead.
2023-04-21 03:11:03
since you read the varblocks in Y, X, B order, that means likewise you read non-XYB varblocks in G, R, B order
_wb_
Tirr this is unrelated to splines and more like a question. it seems that VarDCT RGB is possible. do we read coeffs in GRB order when this is happening? the spec only says for XYB (in YXB order) and it feels a bit unclear to me
2023-04-21 03:11:19
I think coeffs would be in GRB order then, yes. Best to check, take an RGB JPEG, recompress it with cjxl and see what happens 🙂
Traneptora
Tirr for YCbCr images coeffs are in YCbCr order, but modular images are in CbYCr order
2023-04-21 03:11:59
modular images are in Cb, Y, Cr order because they're in YXB order, iirc.
2023-04-21 03:12:13
if you use xyb modular it will by in Y, X, B order (with B subbed in for B-Y)
2023-04-21 03:17:51
X, Y, B gets replaced with "first, second, third" components
2023-04-21 03:17:59
which are either R, G, B or Y, Cb, Cr, depending on context
spider-mario
_wb_ <@736666062879195236> <@604964375924834314> <@179701849576833024> <@853026420792360980> please double-check if the stuff I'm putting in the spec is now actually correct, and if the NOTE makes sense.
2023-04-21 03:18:14
how does that interact with the limits on the number of pixels affected?
Tirr
Traneptora which are either R, G, B or Y, Cb, Cr, depending on context
2023-04-21 03:21:05
well I'm decoding LF images and HF coeffs in YCbCr order, and headers signal subsampling info in CbYCr order, so I thought XYB matches CbYCr
2023-04-21 03:23:08
as far as I understood: basically X, Y, B matches with R, G, B or Cb, Y, Cr (or C, M, Y I guess?) so for VarDCT images they are in GRB and YCbCr order. but for modular images, the spec explicitly says (R, G, B) / (Y', X', B') / (Cb, Y, Cr) so we decode images in that order
Traneptora
2023-04-21 03:26:40
ah, then yea
Tirr
2023-04-21 03:29:04
ah the spec has that: > In the rest of this document, the three main colour components are referred to as X, Y, B, even in the case where the XYB colour space is not used for the stored image; in that case the actual components are either R, G, B or Cb, Y, Cr or C, M, Y.
Traneptora
2023-04-21 03:29:25
ah, there it is
_wb_
spider-mario how does that interact with the limits on the number of pixels affected?
2023-04-21 03:37:00
it doesn't, the estimated pixel area is computed in integer math straight from the quantized splines and isn't related to the actual spline rendering
Tirr
2023-04-23 03:45:18
Table F.8 (and Table K.1): expressions of the blending operations don't match those of libjxl. - alpha clamping is done only on `new_alpha` (the image to blend), not for `old_alpha` - kMulAdd: `new_alpha` is always clamped, blending uses `new_alpha` instead of mixed `alpha`, the resulting alpha value is always `old_alpha` (for kMulAddBelow it would be `new_alpha`) - kMul: `clamp` is signalled, but it doesn't take any alpha channel; instead `new_sample` should be clamped
2023-04-23 03:50:29
found while testing `blendmodes` from conformance test
_wb_
2023-04-24 06:53:55
<@179701849576833024> <@532010383041363969> fix spec to match libjxl, or fix libjxl to match spec on these ones?
2023-04-24 06:55:30
fixing spec is easier, but we can go both ways imo since I don't think any encoder is using funky blend modes in a way that would break existing images if we change any of the above 3 mismatches (except of course the conformance test)
2023-04-24 06:57:09
if we don't want to have to update conformance tests, we should sync the spec to make it match libjxl
2023-04-24 06:59:33
kMulAdd always doing clamping seems like a libjxl bug: why bother signalling `clamp` if we're just always clamping anyway?
2023-04-24 07:04:10
kMul did not make sense in the spec since there is no alpha_channel to clamp in this case; libjxl indeed clamps all foreground samples; I suppose we can change the spec to match libjxl here
2023-04-24 07:06:38
alpha clamping being done on `new_alpha`, not `old_alpha` : is this a spec bug or a libjxl bug? can go either way, I guess. I would treat it as a spec bug
2023-04-24 07:08:29
kMulAdd has `alpha = old_alpha` instead of the kBlend formula for alpha: I think that's a spec bug, it doesn't really make sense to do that
2023-04-24 07:10:51
https://github.com/libjxl/libjxl/blob/main/lib/jxl/alpha.cc#L71 this code ignoring the value of `clamp` seems like a libjxl bug, right? or am I missing something?
2023-04-24 07:53:23
Resolving it in this way, see https://github.com/libjxl/libjxl/pull/2418 for the libjxl fix and the rest we can fix in the spec
2023-04-24 07:54:23
Updated spec draft:
2023-04-24 07:55:01
Uploading this one as input study text for the JPEG meeting that will start in 3 hours
Jyrki Alakuijala
_wb_ <@179701849576833024> <@532010383041363969> fix spec to match libjxl, or fix libjxl to match spec on these ones?
2023-04-24 09:21:33
not immediately sure what to do -- this needs a bit of thinking and discussing
_wb_
2023-04-24 09:23:21
I proposed something above — the meeting is starting in 98 minutes and I need to upload _something_ as the input document. But we have all week to discuss and update the draft as needed before we produce the output document which is to be approved on Friday.
Jyrki Alakuijala
2023-04-24 09:24:33
depends on your own stress level probably more than anything else -- if you can deal with excitement, you can make a last minute change
2023-04-24 09:25:03
my experience is that 30 % of such last minute changes introduce an unforeseen side effect
2023-04-24 09:25:27
but seems reasonable to try to fix this
_wb_
2023-04-24 09:40:49
we can discuss about what's the best way to fix it, but by the end of the week I want the spec and libjxl to define blending in the same way 🙂
Tirr
2023-04-24 10:28:48
J.4.3: `quantization_width` is not `HfMul`, it's `Mul` defined in Section I.5.3
2023-04-24 10:29:11
(please double-check this, jxl-oxide is still broken after fixing this)
2023-04-24 10:29:57
relevant code in libjxl: https://github.com/libjxl/libjxl/blob/0d8651b09e42bec33a676e02459cf573d3905d87/lib/jxl/epf.cc#L82
2023-04-24 11:43:28
J.4.3: we need to add `scaled_distance * inv_sigma` when computing `v` (as `inv_sigma` is negative)
2023-04-24 11:44:09
these two changes fixed EPF of jxl-oxide
_wb_
Tirr J.4.3: we need to add `scaled_distance * inv_sigma` when computing `v` (as `inv_sigma` is negative)
2023-04-24 12:10:18
does something need to be changed in the spec or is the spec ok for this one?
Tirr J.4.3: `quantization_width` is not `HfMul`, it's `Mul` defined in Section I.5.3
2023-04-24 12:10:37
can you confirm now that it's indeed Mul and not HfMuL?
Tirr
_wb_ does something need to be changed in the spec or is the spec ok for this one?
2023-04-24 12:24:10
I can't check the spec right now, but iirc the spec computes negative `inv_sigma` because of the `sqrt(0.5) - 1` term, so yeah, I think the spec needs to be changed
_wb_
2023-04-24 12:24:59
ah right, otherwise v is always >= 1
Tirr
_wb_ can you confirm now that it's indeed Mul and not HfMuL?
2023-04-24 12:26:36
I didn't modify this code after I changed it to Mul, but I'll double-check when I have access to my dev machine
_wb_
2023-04-24 12:27:04
perhaps makes more sense to change it to `1 - sqrt(0.5)` in the spec so `inv_sigma` is positive
Tirr
2023-04-24 12:27:18
yeah It would be better
_wb_
2023-04-24 12:37:04
Thank you so much for finding all these spec bugs, <@268284145820631040> , <@853026420792360980> , <@206628065147748352> and <@736666062879195236>! I wish I could add a "thank you" section to the spec, but ISO doesn't do that kind of thing (they never put any names or credits in their standards)
yoochan
2023-04-24 12:38:51
what are the opportunities to amend the spec AFTER the official publication ? do you have to pay for it ?
_wb_
2023-04-24 12:40:08
one of the slides I'll be presenting later today 🙂
yoochan what are the opportunities to amend the spec AFTER the official publication ? do you have to pay for it ?
2023-04-24 12:41:07
after a standard is published (IS stage), it can be changed by making an AMD (amendment), CORR (corrigendum) or by making a new edition
2023-04-24 12:41:26
all of these options are mostly just a lot of bureaucracy 🙂
2023-04-24 12:43:26
if you want to make an amendment, corrigendum or new edition, first you need to submit a NWIP to ISO (new work item proposal). This needs to be approved by national bodies before anything can happen. Then it goes from working draft to committee draft to DIS to FDIS to IS, where starting at the CD each stage goes to ballot.
yoochan
2023-04-24 12:45:34
yum 😄
2023-04-24 12:45:34
are we in a novel of douglas adams ?
_wb_
2023-04-24 12:46:01
I made a very generic comment myself (officially it is the Belgian national body making this comment) to the CD, which then "allows me" as an editor to make the changes I want to make — because officially changes have to be motivated by national bodies requesting them
2023-04-24 12:47:10
so basically I need to with one hat on make a request to the Belgian national body to please request changes so with the other hat on I can make those changes
2023-04-24 12:47:44
this is the kind of antiquated procedures you need to navigate when working as an ISO working group
yoochan
2023-04-24 12:47:56
amazing how rules always have a backdoor 😄
_wb_
2023-04-24 12:48:26
it's not really a backdoor, it's their way to document changes that happen
2023-04-24 12:48:44
(a git log of the spec repo is more practical though imo)
yoochan
2023-04-24 01:37:53
almost no link with the issue but I once dreamed of a democratic process where laws are voted through merge requests made by MPs 😄 it would be so clean to have the book of law patched instead of the horrendous existing "modifying laws"
Tirr
_wb_ can you confirm now that it's indeed Mul and not HfMuL?
2023-04-24 03:12:49
yeah I can confirm this; the complete formula I'm using is `quant_mul * (65536.0 / (global_scale * hf_mul)) * sharp_lut[sharpness]` so it's `Mul`
2023-04-24 03:13:56
(it results in the same sigma value with libjxl)
_wb_
2023-04-24 03:25:17
cool, thanks!
Traneptora
_wb_ Updated spec draft:
2023-04-25 03:15:17
it says `stage` in lowercase right before Warning for WDs and CDs
2023-04-25 03:15:20
I assume that's a typo
2023-04-25 03:15:34
2023-04-25 03:15:49
or a placeholder
_wb_
2023-04-25 05:20:32
Yeah I dunno why it doesn't put DIS there, maybe a metanorma bug
spider-mario
2023-04-25 09:01:32
_Système de codage de noyau_ 😂
2023-04-25 09:02:57
“Core coding system” is supposed to be “Core {coding system}” (“core” applies to “system”), but instead they’ve translated it as “{kernel coding} system”
2023-04-25 09:03:07
a system for coding kernels
yoochan
2023-04-25 09:12:07
c'est peut-être pour les olives
_wb_
2023-04-25 10:21:40
I don't know who did that translation but I'm not going to try to figure out how to request ISO to change the french title of the standard
Tirr
2023-04-25 05:41:57
I.5.2: CfL should be done before LF smoothing
_wb_
2023-04-25 06:37:30
ah good point, that is indeed completely not clear from the spec
2023-04-27 09:09:14
Here's the version that we will submit as DIS (pending approval tomorrow at the final plenary of the 99th JPEG meeting)
2023-04-27 09:09:31
(that is, the Word version of that)
2023-04-27 09:11:46
Notable change: the definition of `total_estimated_area_reached` was adjusted (both how it is computed and what the profile limits are).
yoochan
2023-04-27 10:49:16
congratulations ! but what was published in march https://discord.com/channels/794206087879852103/803379415106584626/959072434571583488 ?
Tirr
2023-04-27 10:52:15
I guess it's the first edition published last year
_wb_
2023-04-27 11:24:28
Yes that was the first edition. Now it's the second edition. No major changes, just tons of bugfixes and clarifications.
yoochan
2023-04-27 11:59:23
nice ! there is a new edition planned every year ?
Traneptora
2023-04-27 12:01:27
there's not going to be a reason to unless more extensions keep being added
2023-04-27 12:01:52
second edition's primary purpose is to fix errors in first edition, rather than add new features
2023-04-27 12:02:19
ideally this will iron out most or all of them so there won't be a need to keep doing more editions
Tirr
2023-04-27 12:09:09
if `xyb_encoded && want_icc`, are the samples after inverse XYB (defined in the spec) in the correct gamut and transfer curve defined in the ICC profile?
2023-04-27 12:10:09
or do I need to convert samples from linear sRGB according to the profile?
Traneptora
2023-04-27 12:10:30
if the image is XYB encoded, then inverting XYB will always produce linear sRGB
2023-04-27 12:10:57
in the case where the image is XYB-encoded, colorspace info is a suggestion for decoders, and it provides information on what the original colorspace was
2023-04-27 12:11:39
you can convert that linear sRGB into any space you wish, be it the tagged space, or the display space, or whatever.
Tirr
2023-04-27 12:12:14
okay, thanks for clarifying
Eugene Vert
2023-04-27 12:57:48
`E.1` uses "grayscale" while `1`, `3.2.10` and `E.2` use "greyscale"
yoochan
2023-04-27 01:09:10
I propose græyscale
_wb_
2023-04-27 03:05:08
there is no plan for more editions. At some point in the future we might make an amendment to add a new profile (one that only allows some subset that can be implemented conveniently in hardware), and we might find more errors to fix, so there might be a 3rd edition at some point, but we certainly have no goal to make more editions
2023-04-27 03:11:23
ISO standards are supposed to use the Oxford English spelling, so greyscale would be correct
Traneptora
Tirr okay, thanks for clarifying
2023-04-27 08:14:45
do note that inverting XYB involves a cubic-like gamma transfer, followed by a linear transform by the OpsinInverseMatrix
2023-04-27 08:15:10
and converting from linear sRGB to another linear space (e.g. BT.2100 or P3) is a linear transformation
2023-04-27 08:15:32
so you can actually take the sRGB -> BT2100 matrix and multiply it by the Opsin Inverse Matrix
2023-04-27 08:15:48
to get a new OpsinInverseMatrix that maps into a different space
Eugene Vert
2023-04-27 09:10:08
In `K1` it is says that out of image features, upsampling happens first, but should the upsampling of splines work like in jxl-art?
2023-04-27 09:13:37
Not sure, but from `26 dec_cache.cc:Status PassesDecoderState::PreparePipeline` it seems like that the order in libjxl is: patches; splines; upsampling; noise
_wb_
2023-05-05 06:45:50
So it looks like there are some `Umod`s missing in the spec since libjxl is using `int32_t` for some things that might not fit
2023-05-05 06:47:29
Table H.4 (modular properties) : all of these have to be cast to `int32_t`
2023-05-05 06:50:33
Section H.5 (self-correcting predictor): some of these variables also have to be cast to `int32_t`
Tirr
2023-05-07 04:40:50
L.3: the bias applied to Y (0.5) is a little bit off to the actual constant used in libjxl (128 / 255 = 0.50196...), which can make the Level 10 conformance tests to fail
veluca
2023-05-07 08:32:47
(I hate that constant so much...)
_wb_
2023-05-07 08:43:52
Is this a spec bug or libjxl bug?
veluca
2023-05-07 10:46:41
who knows
2023-05-07 10:46:57
I'd consider it a spec bug and save ourselves a lot of pain
Tirr
2023-05-07 10:56:14
I think it should match the result of decoding jpeg image at least, as ycbcr images mainly come from recompressing jpegs
2023-05-07 10:57:58
but is it specified in the jpeg standard?
2023-05-07 10:59:30
libjxl cites full-range BT.601 (`stage_ycbcr.cc` L32)
2023-05-07 11:00:32
I guess it's a spec bug
veluca
2023-05-07 11:44:05
what offset jxl uses for YCbCr files is kinda immaterial as long as the encoder properly corrects for it
_wb_
2023-05-07 12:21:45
for jpeg recompression the encoder does not have a choice though
veluca
2023-05-07 12:30:48
how so?
2023-05-07 12:30:58
the encoder can very easily add an offset to DC
2023-05-07 12:31:04
I am in fact quite sure it does
2023-05-07 12:31:17
ah, no, I see what you mean
2023-05-07 12:31:34
what libjxl does matches what libjpeg does here
_wb_
2023-05-07 12:37:43
yes, I think that was the goal. basically all components including Y are treated like int8_t, with zero in the middle (which is why all-zeroes is gray in jpeg, not black or green)
Eugene Vert
2023-06-04 12:48:51
I have a question about `dim_shift` in `ExtraChannelInfo` In spec there is: > The value `1 << dim_shift` does not exceed the `group_dim` of any frame but in libjxl's it's (`image_metadata.cc`) ```c if ((1U << dim_shift) > 8) { return JXL_FAILURE("dim_shift %u too large", dim_shift); } ``` Isn't `group_dim` is something like `128 << group_size_shift`, why is it `8` in the code?
Traneptora
2023-06-04 02:28:30
well `group_dim` defaults to 256 for most images, so `1 << dim_shift` not exceeding `group_dim` would imply that `dim_shift > 8` should be the check, not `1 << dim_shift > 8`
2023-06-04 02:28:42
so something is wrong here
Eugene Vert
2023-06-06 10:45:29
(frame_header.cc:253) ```c++ for (size_t i = 0; i < extra_channels.size(); ++i) { uint32_t dim_shift = nonserialized_metadata->m.extra_channel_info[i].dim_shift; uint32_t& ec_upsampling = extra_channel_upsampling[i]; ec_upsampling >>= dim_shift; JXL_QUIET_RETURN_IF_ERROR( visitor->U32(Val(1), Val(2), Val(4), Val(8), 1, &ec_upsampling)); ec_upsampling <<= dim_shift; if (ec_upsampling < upsampling) { return JXL_FAILURE( "EC upsampling (%u) < color upsampling (%u), which is invalid.", ec_upsampling, upsampling); } if (ec_upsampling > 8) { return JXL_FAILURE("EC upsampling too large (%u)", ec_upsampling); } } ``` So the maximum cumulative upsampling (`ec_upsampling << dim_shift`) is no more than **8** and ```c++ if ((1U << dim_shift) > 8) { return JXL_FAILURE("dim_shift %u too large", dim_shift); } ``` is kind of early check?
2023-06-06 10:50:11
But `L.4` describes > first 8x upsampling is applied dim_shift Idiv 3 times implying that there may be multiple 8x upsamplings?
Tirr
2023-06-13 06:19:13
how do I render spot color channels on grayscale images?
_wb_
2023-06-13 06:42:27
Good question
2023-06-13 06:42:45
How would you want to render them?
2023-06-13 06:42:58
Does libjxl do anything that makes sense?
2023-06-13 06:43:38
The spec is not clear on what to do, so I think we should either explicitly not allow it, or clarify what to do
Tirr
2023-06-13 06:52:43
- If `r == g && g == b`, render them as grayscale - Else, the bitstream is invalid (we can't define the meaning of RGB with grayscale colorspace) or, just don't allow them at all.
2023-06-13 06:57:05
hmm iirc G in RGB corresponds to Y in XYB, maybe just take G in spot color and render it
Eugene Vert
2023-06-13 07:05:55
(But taking R as spot color is easier to programm)
_wb_
2023-06-13 08:13:33
I mean, it could potentially be useful to have some meaning for gray + non-gray color, e.g. in print this actually happens (if you only need black plus one extra color, the usual 4-color CMYK process can be more expensive than a 2-color process, and results will be not as nice)
2023-06-13 08:15:05
> At the final step in the decode process (when the inverse colour transform has been applied), the decoded image is in the colour space defined by the image metadata.
2023-06-13 08:16:22
so I guess you are supposed to do the spot color blending in that tagged space, not in linear sRGB (which would be the output of the inverse XYB as it is in the spec)
2023-06-13 08:16:40
in that case, if the tagged space is grayscale then of course RGB has no meaning
2023-06-13 08:17:55
ok seems to be too tricky to have gray + non-gray spot color, if you want to do that you can always just make it an RGB image that happens to use only gray, or even an empty image with two spot colors (one for black and one for the other color)
2023-06-13 08:18:20
I don't really see a use for grayscale with a spotcolor that is gray too
2023-06-13 08:19:46
so I would just say spot colors are either not allowed in the grayscale case, or a decoder can ignore them
Tirr
2023-06-13 08:20:54
if the tagged colorspace is grayscale then the embedded ICC profile should also be grayscale, so we can't explain what the spot color is supposed to look like (am I getting it right?)
_wb_
2023-06-13 08:23:55
yes, exactly. Unless we say spot colors are always specified in linear sRGB... but then a decoder needs to be able to convert the image to that before rendering spot colors, and back...
2023-06-13 08:27:31
for CMYK images as it is now, you have to specify spot colors using only CMY, and the decoder can do the spot color rendering directly on the decoded CMY channels (or not, and return the spot color channel separately). The resulting CMYK image is well-defined and can be converted to RGB for display.
2023-06-13 08:29:51
if we would specify spot colors in a fixed absolute space like linear sRGB (or XYB for that matter), then in the CMYK case a decoder would first need to convert the image to that absolute space, then do the spot color rendering, and if it's just for display then all is good but if you want to get a resulting CMYK image, you need to do a conversion from RGB to CMYK which is not well-defined (information was lost when converting CMYK to RGB and it cannot really be restored)
2023-06-13 08:33:22
meh, I'd say we just keep it as it is now, and just add that if the color space is kGrey then spot colors are not rendered by the decoder
2023-06-13 08:34:12
(applications can still have them as extra channels if they want to, you just don't get an approximate preview of what it would look like)
Lucas Chollet
2023-06-20 09:19:45
Hey, I started implementing a JPEG-XL decoder for SerenityOS and as far as I can tell, I found two small issues in the spec: In H.4.1, inside the code snippet: ```cpp rW = (x > 0 ? channel[j](x − 1, r) : 0); ``` `r` is used without being defined The same problem exists with `w` in H.3: ```cpp NEE = (x + 2 < w and y > 0 ? channel[i](x + 2, y − 1) : NE); ``` Hope that it's useful 😃
Traneptora
Lucas Chollet Hey, I started implementing a JPEG-XL decoder for SerenityOS and as far as I can tell, I found two small issues in the spec: In H.4.1, inside the code snippet: ```cpp rW = (x > 0 ? channel[j](x − 1, r) : 0); ``` `r` is used without being defined The same problem exists with `w` in H.3: ```cpp NEE = (x + 2 < w and y > 0 ? channel[i](x + 2, y − 1) : NE); ``` Hope that it's useful 😃
2023-06-20 10:39:59
looks like a typo for y
Lucas Chollet
2023-06-20 10:44:30
wdym?
yurume
2023-06-21 01:10:06
i.e. `(x - 1, r)` should have been `(x - 1, y)`
Lucas Chollet
Traneptora looks like a typo for y
2023-06-21 01:29:15
I misunderstand that message that haha, I just realized that `y` wasn't an abbreviation for "you"
yurume i.e. `(x - 1, r)` should have been `(x - 1, y)`
2023-06-21 01:29:47
Sure, that's what I did
2023-06-21 01:31:18
I only make that message so that the spec can be corrected I don't even know if that's useful for you guys as I guess that it's complicated to modify a spec
yurume
2023-06-21 01:35:15
yeah, that's definitely a typo that should be fixed, and I think `y` is indeed correct
Traneptora
2023-06-21 05:27:09
Is there a version more up-to-date than this one?
2023-06-21 05:27:10
https://discord.com/channels/794206087879852103/1021189485960114198/1101072554837409853
_wb_
2023-06-22 04:22:26
No, haven't made updates yet after submitting DIS...
Lucas Chollet
2023-06-23 08:49:32
<@794205442175402004> (I hope the ping don't bother you) Should I continue to report small spec bug and inaccuracy? I don't want to bother you with that especially if it's useless down the road, but the good citizen in me wants to report every thing that seems a bit off
_wb_
2023-06-23 08:54:21
Sure, please report stuff! We can still make changes after DIS, and even after IS of the second edition it will be useful to fix stuff for a potential third edition 🙂
Lucas Chollet
2023-06-25 05:29:10
It seems that there is no specified behavior to traverse the MA tree in case of a decision node with a property index greater than the current number of properties. This problem has been solved with this commit in `jxl-oxide`: <https://github.com/tirr-c/jxl-oxide/commit/7ca38cfa28b91fbfaad1346debbe0344a1994f26>
2023-06-25 05:30:35
(So it default to the left child node)
Tirr
Lucas Chollet It seems that there is no specified behavior to traverse the MA tree in case of a decision node with a property index greater than the current number of properties. This problem has been solved with this commit in `jxl-oxide`: <https://github.com/tirr-c/jxl-oxide/commit/7ca38cfa28b91fbfaad1346debbe0344a1994f26>
2023-06-25 05:36:04
it's specified as returning zero for those properties: > The function `GetProperties(i, x, y)` returns an array of properties as defined above, where property values that are not set by the above table and code are zero.
Lucas Chollet
2023-06-25 05:41:26
Not in the old version I used when I wrote that code 😅 I should have check in new version, my bad But I'm glad that it has been fixed!
Traneptora
Lucas Chollet It seems that there is no specified behavior to traverse the MA tree in case of a decision node with a property index greater than the current number of properties. This problem has been solved with this commit in `jxl-oxide`: <https://github.com/tirr-c/jxl-oxide/commit/7ca38cfa28b91fbfaad1346debbe0344a1994f26>
2023-06-26 04:03:30
in general invalid bitstreams are unspecified
2023-06-26 04:03:52
it specifies what is valid and how to handle valid bitstreams
2023-06-26 04:04:04
but decoders can do whatever with invalid ones
Arcane
2023-07-02 07:17:04
Traneptora
2023-07-08 03:14:49
Section E.4.1:
2023-07-08 03:14:51
> The decoder reads enc_size as U64(). The decoder reads 41 pre-clustered distributions as specified in C.1. The decoder then reads enc_size integers as specified in C.3.3 to obtain decompressed bytes, using DecodeHybridVarLenUint(IccContext(index, prev_byte, prev_prev_byte)) where index is the current byte index, prev_byte and prev_prev_byte are respectively the previous and second-previous bytes or 0 if they do not exist yet, and the IccContext() function is defined in the code below. The resulting decompressed byte-based data is the encoded ICC stream. The decoder reconstructs the ICC profile from this encoded ICC stream as described in the subsequent subclauses of this annex.
2023-07-08 03:15:06
It calls them "bytes" but are they actually guaranteed to be <256? they're being read as hybrid integers
2023-07-08 03:15:23
if they are, is it illegal (aka unspecified) behavior if they are at least 256
2023-07-08 03:15:51
or is "bytes" the wrong term?
_wb_
2023-07-08 03:34:02
I assume it has to be < 256, would indeed be good to make that explicit...
Traneptora
2023-07-17 05:05:21
It looks like Adaptive LF Smoothing has to occur *after* CFL
2023-07-17 05:06:40
otherwise the low B-channel coeffs will be smoothed out
2023-07-17 05:06:47
the spec doesn't say this
Tirr
2023-07-17 05:17:57
ah yeah, it's reported before https://discord.com/channels/794206087879852103/1021189485960114198/1100476812557041757
Traneptora
2023-07-17 05:18:28
I just figured that out when debugging jxlatte <:kek:857018203640561677>
Tirr
2023-07-17 05:21:58
is it added in the spec? I think I didn't check that
Traneptora
2023-07-17 05:22:55
hasn't been a new draft recently, but idk if I'm using an older one or the current one
2023-07-17 01:11:41
Spec doesn't say whether you are supposed to apply Gaborish to extra channels
2023-07-17 01:11:44
and if so, what weights to use
2023-07-17 01:12:56
same with EPF
Tirr
2023-07-24 02:30:16
Table M.1: it says `nb_channels` corresponds to the number of modular channels, but libjxl adds color channels even when it's VarDCT frame
2023-07-24 02:30:43
(related jxl-oxide issue: <https://github.com/tirr-c/jxl-oxide/issues/65>)
_wb_
2023-07-24 03:18:25
right — I think libjxl's behavior makes sense since even in VarDCT frames there is still modular-encoded LF data where it can be useful to have nontrivial MA trees. But this definition in the spec is then indeed at best ambiguous (and at worst just wrong)
2023-07-25 02:06:23
here's a draft of the 2nd edition of 18181-3 (conformance testing)
Tirr
2023-07-25 02:40:16
first time to see 18181-3
2023-07-25 02:41:25
.npy format used for conformance is described in the spec, great
2023-07-25 02:41:31
maybe I'll implement that
Fraetor
2023-07-29 10:34:50
Ah, its just a NumPy binary. Nice.
Lucas Chollet
2023-08-10 04:37:58
In "H.5.2 Prediction", between the two first code snippets, there is this sentence "The weights weight[i] for each of the 4 sub-predictions and `mxe` are computed as specified by the following code [...]". It's the only occurrence of "mxe" in the whole document, so I guess that it needs some clarification/clean-up. Also in "C.3.3 Hybrid integer decoding", the penultimate line (`window[(num_decoded++) & 0xFFFFF] = r;`) of `DecodeHybridVarLenUint(ctx)` uses the `window` variable. However, this variable is only defined if `lz77.enabled` so I think that it should be behind a if: ```cpp if (lz77.enabled) window[(num_decoded++) & 0xFFFFF] = r; ```
Traneptora
2023-09-08 07:56:48
I'm confused about section I.2.4:
2023-09-08 07:56:49
> params = /* read 3 x 9 matrix in raster order */; > for (i = 0; i < 3; i++) for (j = 0; j < 6; j++) params(i, j) *= 64;
2023-09-08 07:57:01
``` params = /* read 3 x 9 matrix in raster order */; for (i = 0; i < 3; i++) for (j = 0; j < 6; j++) params(i, j) *= 64; ```
2023-09-08 07:57:20
this looks like the matrix should be width=9 and height=3
2023-09-08 07:57:43
but `params(i, j)` will overflow in that case
2023-09-08 07:57:54
unless it's supposed to be `params(y, x)`
_wb_
2023-09-09 03:44:46
I hate this never ending (row,column) vs (x,y) confusion
2023-09-09 03:45:25
Thanks for noticing the inconsistency
yoochan
2023-09-09 05:34:56
Instead of i and j indices, in my codes I use r and c... And still make mistakes but they are faster to spot 😅
_wb_
2023-09-25 07:28:07
The second editions of 18181-1 and 18181-2 are now in DIS stage and the ballot deadline is approaching — we need to make a ballot comment for any change.
yoochan
2023-09-25 07:45:25
what is the scope of changes ? typos ? clarifications ?
_wb_
Eugene Vert Not sure, but from `26 dec_cache.cc:Status PassesDecoderState::PreparePipeline` it seems like that the order in libjxl is: patches; splines; upsampling; noise
2023-09-25 07:48:03
<@179701849576833024> It indeed looks like the spec is not super clear on the order of the steps. If I understand correctly, there are two cases: - extra channels need more upsampling than main channels: in this case the order is (upsampling, patches, splines, noise) so all channels have the same dimensions before applying patches - all channels need the same amount of upsampling: in this case the order is (patches, splines, upsampling, noise) Is that correct?
Tirr
2023-09-25 07:48:27
is this addressed? https://discord.com/channels/794206087879852103/1021189485960114198/1130364598202150912
2023-09-25 07:48:32
and this https://discord.com/channels/794206087879852103/1021189485960114198/1103935571316965407
veluca
_wb_ <@179701849576833024> It indeed looks like the spec is not super clear on the order of the steps. If I understand correctly, there are two cases: - extra channels need more upsampling than main channels: in this case the order is (upsampling, patches, splines, noise) so all channels have the same dimensions before applying patches - all channels need the same amount of upsampling: in this case the order is (patches, splines, upsampling, noise) Is that correct?
2023-09-25 07:49:38
I *think* it is that if the extra channels need to be brought on the same size as color channels, that happens before patches, then all of that is upsampled to final size after patches
2023-09-25 07:49:40
lemme check
2023-09-25 07:50:57
ok apparently not
2023-09-25 07:51:47
I think you are right, and the combination in which there's different EC upsampling from color channel upsampling and color channels have non-1 upsampling is just not supported if there are also patches
Tirr
2023-09-25 08:05:22
F.2 says: > After decoding subsampled channels, they are upsampled as specified in J.2 (in the case of `jpeg_upsampling`) and K.2 (in the case of `upsampling` and `ec_upsampling`). and L.4 says: > first 8x upsampling is applied `dim_shift Idiv 3` times, followed by no, 2x, or 4x upsampling if `dim_shift Umod 3` is equal to 0, 1 or 2, respectively. but patches require extra channel upsampling factor equals color channel, so I guess it's (color & extra channel upsampling, ec upsampling, patches, splines, noise)?
veluca
_wb_ <@179701849576833024> It indeed looks like the spec is not super clear on the order of the steps. If I understand correctly, there are two cases: - extra channels need more upsampling than main channels: in this case the order is (upsampling, patches, splines, noise) so all channels have the same dimensions before applying patches - all channels need the same amount of upsampling: in this case the order is (patches, splines, upsampling, noise) Is that correct?
2023-09-25 08:08:52
I think it is supposed to be how Jon says here
jonnyawsom3
_wb_ <@179701849576833024> It indeed looks like the spec is not super clear on the order of the steps. If I understand correctly, there are two cases: - extra channels need more upsampling than main channels: in this case the order is (upsampling, patches, splines, noise) so all channels have the same dimensions before applying patches - all channels need the same amount of upsampling: in this case the order is (patches, splines, upsampling, noise) Is that correct?
2023-09-25 09:01:09
Might not be related, but upsampling patches is currently broken, presumably because of the ordering in libjxl https://github.com/libjxl/libjxl/issues/2737
2023-09-25 09:02:00
May be worth checking what order is needed in that instance too, unless it's simply not supported
_wb_
Eugene Vert But `L.4` describes > first 8x upsampling is applied dim_shift Idiv 3 times implying that there may be multiple 8x upsamplings?
2023-09-25 09:42:07
I don't think libjxl does upsampling with a factor > 8, right <@179701849576833024> ? Should we just spec it to be at most 8x cumulative upsampling, and drop that sentence about how to do upsampling with a factor > 8? Or should we keep that and fix libjxl to also do bigger upsamplings?
2023-09-25 09:48:24
I think having a limit of 8x is probably not really limiting and keeps things simpler, but I'm fine either way — we just need to decide whether to put the libjxl limit in the spec or to remove the limit from libjxl.
2023-09-25 09:52:58
(note: this only appplies to extra channels, where dim_shift can actually be signalled to be as large as 8, which means 256x upsampling, so combined with the ec_upsampling from the frame header, you could in principle have up to 2048x upsampling)