JPEG XL

Specification issues

2001 - 2110

1501 - 2000

veluca

	Traneptora Section J.2, JPEG Upsampling, doesn't specify what happens in corner cases. I'm guessing it uses `Mirror` (section 5.2) like every other restoration/upsampling filter, but this isn't explicit.
2022-12-02 04:14:43	yep
	Traneptora I don't know if the spec should be disallowing these as well
2022-12-02 04:14:55	I think also yep

Traneptora

2022-12-02 04:15:41	Note 4 below Table E.6 contains some `\LaTeX`
2022-12-02 04:16:37	I wonder if it's worth mentioning that PQ is also specified in SMPTE ST 2084
2022-12-02 04:16:59	and that 709, PQ, sRGB, are also specified in ITU-T H.273
2022-12-02 04:24:51	You use `&` as the bitwise-and operator, but don't define it in Section 4.3
2022-12-02 04:25:13	I assume most people reading this will know what it does, but it's still a good idea to be explicit
2022-12-02 04:26:08	it's not in the precedence table either
2022-12-02 04:28:50	`CoeffNumNonzeroContext[64] = {` there's an extra space in the first row, 123, 4th from the right. it's not incorrect, but the formatting is slightly off

Lines N Shapes

2022-12-03 07:37:40

In section B.2.3 the '+' is missing in what should be "If s == 1, value is 1 + u(4)"

yurume

2022-12-08 03:24:54	G.4.1 has the following sentence: > The dimensions correspond to `width`, except for extra channels with `dim_shift > 0` (as specified in Annex H) which have dimensions `ceil(width / (1 << dim_shift))`.
2022-12-08 03:25:56	I've been told that this should also say about `height`, as it's currently implicit (though obvious enough)
2022-12-08 03:28:51	C.2.6: the second loop `for (i = 0; i < table_size; i++)` should loop until `alphabet_size`, otherwise when `alphabet_size != table_size`, `underfull` might be initialized with more than `alphabet_size` elements

_wb_

	veluca I think also yep
2022-12-08 03:51:32	What exactly should be changed in the spec for this?

veluca

2022-12-08 03:52:18

good question, I don't remember where this thing is

_wb_

	Traneptora it's not in the precedence table either
2022-12-08 04:16:00	I'll add it to the precedence table, but I don't think we need to explicitly define it, since we start by saying > This document uses the operators defined by the C++ programming language (ISO/IEC 14882), with the following differences:

Traneptora

2022-12-08 04:16:15

Yea, that's fair

_wb_

2022-12-08 04:16:28	in the case of bitwise and, we use it without differences from how it is used in C++
2022-12-08 04:19:23	we also use bitwise or and xor so I'll add those to the precedence table too — though I'm not sure why we actually have that table, since it's just the same one as in C++

Tirr

	_wb_ What exactly should be changed in the spec for this?
2022-12-08 04:23:33	in C.2.2: > Otherwise, if is_simple is false, the decoder reads a Bool() use_mtf. The decoder then initializes a symbol decoder using a single distribution D, as specified in C.2.1. spec should forbid using lz77 here

Traneptora

2022-12-08 04:33:02	currently libjxl allows lz77 here
2022-12-08 04:33:30	it only doesn't allow it on the child distribution if the parent already has <= 2 dists
2022-12-08 04:34:07	I ran into this issue when developing jxlatte, if you disallow lz77 for all nesteds you'll reject otherwise valid bitstreams

Tirr

2022-12-08 04:38:40

aah that's right, i thought the condition was just `true`

_wb_

2022-12-08 05:11:11	So I should add something like this to C.2.1 then, below table C.1 or so: > If `num_dist <= 2`, then `lz77.enabled` is `false`. Does that look OK?
2022-12-08 05:18:05	Or is it rather in C.2.2, where there is the recursive reference to C.2.1, that it should say that if `num_dist <= 2`, then in the nested distribution, `lz77.enabled` is false?

Tirr

2022-12-08 05:21:41	for me, it sounds like one should skip reading `lz77.enabled` when `num_dist <= 2`
2022-12-08 05:23:49	maybe i'm not familiar with normative specifications
2022-12-08 05:46:32	or before reading `LZ77Params`, add something like "If the decoder is being initialized by another decoder, and the parent decoder has `num_dist <= 2`, then the decoder reads a Bool() which is `false`, and sets `lz77.enabled` to `false`"?

_wb_

2022-12-08 05:47:29	I don't think that would be correct. I think the field needs to be read anyway, but it has to be `false` in a valid bitstream (so a decoder is free to explode into flames, or just politely refuse to decode, if it's not the case)
2022-12-08 05:48:14	Yes, something like that, except "being initialized by another decoder" and "parent decoder" are not really rigorous enough

Tirr

2022-12-08 05:51:15	yeah that was just an idea
	_wb_ So I should add something like this to C.2.1 then, below table C.1 or so: > If `num_dist <= 2`, then `lz77.enabled` is `false`. Does that look OK?
2022-12-08 05:54:50	now I feel this is also ok

veluca

	_wb_ Or is it rather in C.2.2, where there is the recursive reference to C.2.1, that it should say that if `num_dist <= 2`, then in the nested distribution, `lz77.enabled` is false?
2022-12-08 05:59:42	seems about right (not sure about the section)

_wb_

2022-12-18 08:45:13	We don't currently have a notion of "multi-page" like TIFF. I propose we add to the spec that if the frame duration field has the maximum value, it is to be interpreted as "end of page".
2022-12-18 08:52:18	So conceptually you can have multiple pages (max duration frames), each possibly consisting of multiple animation frames (nonzero duration frames), each possibly consisting of multiple layers (zero duration frames), each possibly internally using more than one frame to be encoded (a layer could have data in kReferenceOnly or LF frames, which are not semantically part of the decoded object).
2022-12-18 08:57:18	Multi-page with animations on evey page is maybe a bit unusual, but it could have use cases, e.g. for a set of cinemagraphs. We only have a loop count at the image header level so I propose it would then apply to every page.
2022-12-18 09:03:09	Multi-page still images could be quite useful in principle, to represent a set of related images that are not to be displayed as an animation, but as an ordered collection, with manual navigation. Comic books or rasterized slide decks, for example.
2022-12-18 09:06:33	Besides potential compression benefits (compared to storing these as separate files, one per page), having them as a single file object is also just more convenient.

spider-mario

2022-12-18 09:07:21	the thought of having the animation be one “level” higher (and therefore possibly having a different number of pages per “frame”) briefly entered my mind but it quickly left it again
2022-12-18 09:07:32	it would be terrible

_wb_

2022-12-18 09:12:56

How I propose it would gracefully degrade in applications that don't know about it, you just see the first page and if the animation playback has buttons for seeking or going to next/previous frame, those could be used for basic navigation. It's of course better if they actually know it's multi-page and not just a very long animation...

Fraetor

2022-12-18 11:09:39

~~I'd personally argue that paging is better handled at a higher level (by PDF or CBZ, et al.) but if there are significant space gains then it might be worth it.~~ My mind has been changed.

_wb_

2022-12-18 12:36:51

I don't see a significant difference between animation and multi-page, to me it's just that the first is autoplaying at the speed chosen by the author, while the second is manually navigated frame by frame.

improver

2022-12-18 01:32:49

you gotta need new metadata value telling that it's multi page. & even then i would say that stuff like CBZ are easier to use still because of how its being handled already in most apps

_wb_

2022-12-18 01:56:07

Already implemented things that work already are of course easier to use than new things. This seems a simple enough spec change though (just assigning mildly new semantics to existing syntax), applications can choose if they want to provide a specific user interface for it or not, but mentioning it explicitly in the spec can help to encourage viewers to interpret it as a multi-page file and handle it like how they would handle cbz or pdf or whatever.

Fraetor

2022-12-18 02:04:42	TBF, as long as falling back to a really slow animation is an option, it is probably a good idea. Multipage is a useful feature of TIF.
2022-12-18 02:09:56	Would you ba able to have differently size pages?

_wb_

2022-12-18 02:31:33	Frames don't have to be canvas-sized, they can be larger or smaller. But there is only one canvas size per image, and presumably what is shown (at least in an animation) is always canvas-sized (i.e. if a frame is smaller, it only updates part of the canvas, and if it is larger, you don't see what's outside the canvas)
2022-12-18 02:35:21	If you would want to mix landscape and portrait pages, probably the best way is make a square canvas, set the pages to not be saved as reference, and set the blend source of every page to an uninitialized reference frame, which has the effect of introducing transparent black padding (implicit all zeroes).

Traneptora

2022-12-25 05:18:23	This isn't quite a bug but it's not very clear
2022-12-25 05:18:47	in section K.4.1, you read the splines themselves one at a time
2022-12-25 05:19:08	it's not obvious to me that you read the control points for spline 0, then you read the coefficients for spline 0, then each for spline 1, etc.
2022-12-25 05:19:16	It's specified but it could be clearer

_wb_

2023-01-11 08:10:48

Today I'm flying to Sydney for the 98th JPEG meeting, finally an in-person meeting again after 3 years of virtual meetings due to covid. In the weekend we'll be preparing the final CD texts of the second editions of 18181-1 and 18181-2, which will then be discussed during the JPEG meeting next week and hopefully approved on Friday next week. If there are any more spec bugs or unclear wordings, feel free to mention them here so we can address all the remaining issues. We'd really like to have a very high-quality committee draft this time, so the delta between the CD and the final IS can be as small as possible. Up to the CD stage we are more or less free to (discretely) share the document for feedback; once it goes to DIS stage there's the ISO copyright thing and things unfortunately go behind the paywall. If we can have a CD that's good enough, then effectively in practice the paywall isn't a real obstacle. For the first edition this was impossible since we were still very actively defining the bitstream back then, so the delta between CD and IS was huge, but for the second edition I hope we can have a CD that is basically the same as the IS.

Tirr

2023-01-11 09:31:29

in H.5.2, minor typo: `W` is not in the list of "corresponding samples"

_wb_

	veluca seems about right (not sure about the section)
2023-01-13 03:30:10	> Otherwise, if is_simple is false, the decoder reads a Bool() use_mtf. The decoder then initializes a symbol decoder using a single distribution D, as specified in C.2.1, where `lz77.enabled` is `false`. For each pre-clustered distribution i, the decoder reads an integer as specified in C.3.3. Finally, if use_mtf, the decoder applies an inverse move-to-front transform to the cluster mapping (see code below).
2023-01-13 03:31:29	is that the correct fix? In this recursive case, it is reading 'a single distribution' which means `num_dist == 1` there, right?

veluca

2023-01-13 04:02:23

I ... think so?

_wb_

2023-01-13 04:18:22

the entropy decoding wording in the spec is still not quite as clear as I would prefer it to be — it should be a lot better now than in the first edition, but things are still a bit too vague for my taste, as demonstrated by how tricky it seems to be to locate exactly how to fix this zip bomb thingie

Traneptora

	_wb_ > Otherwise, if is_simple is false, the decoder reads a Bool() use_mtf. The decoder then initializes a symbol decoder using a single distribution D, as specified in C.2.1, where `lz77.enabled` is `false`. For each pre-clustered distribution i, the decoder reads an integer as specified in C.3.3. Finally, if use_mtf, the decoder applies an inverse move-to-front transform to the cluster mapping (see code below).
2023-01-13 04:59:40	this is not always the case, no. it's based on the number of distributions in the parent, not in the child
2023-01-13 04:59:53	if `num_dist <= 2` in the parent then it sets disallow lz77

_wb_

2023-01-13 05:00:44

so if `num_dist == 2`, right? since if it's 1 then there is no recursive call anyway

Traneptora

2023-01-13 05:01:06	according to libjxl code, yes
	_wb_ so if `num_dist == 2`, right? since if it's 1 then there is no recursive call anyway
2023-01-13 05:03:51	if you set it to always disallow for nested then certain files don't decode, such as bench.jxl
2023-01-13 05:04:31
2023-01-13 05:04:44	I believe this is the same file as in the conformance testing repo but I've uploaded it here anyway

_wb_

2023-01-13 05:06:26	right. so if num_dist > 2, then the recursive distribution can still use lz77, but then that one can have num_dist == 1+1 (initially 1 but due to lz77.enabled == true it becomes 2), and if it is recursively reading a distribution, then that one cannot use lz77
2023-01-13 05:08:25	so at most there are 3 levels (grandparent with num_dist > 2, parent with num_dist == 1+1 = 2, child with num_dist = 1 and not allowed to get the +1 from lz77.enabled)
2023-01-13 05:09:32	Here is my current version of both parts of the spec:

Traneptora

2023-01-13 05:09:45	https://github.com/libjxl/libjxl/blob/58fff0336ec59fa3080a239ba89e0ed24d75d050/lib/jxl/dec_context_map.cc#L83
2023-01-13 05:09:49	Here's the relevant line in libjxl
2023-01-13 05:10:20	I have to sleep (it's midnight in my timezone) but I can try to take another scan through the specs tomorrow

_wb_

2023-01-13 05:11:47	ok I think what I wrote in the spec is the same thing as what libjxl does (I wrote num_dist == 2 instead of <= 2 but that boils down to the same thing because we're in a part of the spec where num_dist > 1)
2023-01-14 07:25:45	Current spec draft:
2023-01-14 07:26:53	one thing that was added: we now specify multi-page images to exist: > If duration has the value 0xFFFFFFFF, the decoder presents the next frame as the next page in a multi-page image.

Fraetor

2023-01-14 03:10:13

Finally, a format to rival TIFF.

improver

2023-01-14 03:12:33	i can't imagine web browsers making use of this though (too much UX changes necessary). in that case they should display just first image?
2023-01-14 03:13:19	or i guess interpret that value literally and do very long delay
2023-01-14 03:13:33	will there be any libjxl changes related to this

Fraetor

2023-01-14 03:36:33	Yeah, any decoders that don't implement this will just show them as really long duration animations.
2023-01-14 03:45:13	Which at the duration that is reprisentable with the least bits (100 ticks per second), is 1.3 years per frame.
2023-01-14 03:52:49	(Though you can get that down to 49 days in the same number if bits if you like, or beyond a millennium.)

Tirr

2023-01-15 05:42:32	In C.2.5: In the case of simple distribution (when the first Bool() is true), `alphabet_size` is set to 1 or 2, but this might not cover `v1` and/or `v2`
2023-01-15 05:43:08	`alphabet_size` is referred in alias mapping (C.2.6) so this might be confusing
2023-01-15 05:43:41	iirc libjxl sets `alphabet_size` to `max(v1, v2) + 1`
2023-01-15 05:57:30	In Table F.7, minor typo: backticks didn't make the text render as code

andrew

	_wb_ If you would want to mix landscape and portrait pages, probably the best way is make a square canvas, set the pages to not be saved as reference, and set the blend source of every page to an uninitialized reference frame, which has the effect of introducing transparent black padding (implicit all zeroes).
2023-01-15 08:52:35	I still think there ought to be a clean way to express a multi-page document with differing page sizes like TIFF

_wb_

2023-01-15 08:57:22

I think it's clean enough: frames can already have arbitrary dimensions that can be different from the image dimensions. The only difference with TIFF (I think) is that you also have the image dimensions, defining a canvas, and frame offsets defining where to show each frame (and possibly what to crop when showing the frame). I think that's a feature: it means it's well defined how viewers should (by default) display a series of pages with differing dimensions.

2023-01-15 08:59:32

But applications can of course also just get the individual frames and display them in a different way (e.g. not cropping the outside-canvas parts of the frame, resized to fit viewport width or to fit the screen, or whatever)

andrew

2023-01-15 08:59:57	that canvas size issue still makes it feel a bit hacky
2023-01-15 09:03:23	perhaps it should be specified that if the duration is 0xFFFFFFFF, the canvas dimension should be ignored for subsequent pages?

_wb_

2023-01-15 09:04:21	How so? Generally I'm assuming viewers will display multi-page images like a slideshow where the canvas is rescaled to fit the screen/viewer window, not like a scrolling document — it's not like PDF where scrolling makes sense since the document likely involves text that is read top to bottom and the bottom of one page continues at the top of the next; generally I expect jxl to be used for photos and illustrations, which don't really have a particular 'reading direction'
2023-01-15 09:04:52	having it well-defined how to position pages on the canvas does make sense to me
2023-01-15 09:05:28	if applications choose to, they are still free to ignore the canvas, get all the individual pages, and display them in whatever way they want

andrew

2023-01-15 09:06:28	in my mind multi-page images (like in TIFF) are used to store scanned documents
2023-01-15 09:06:44	like if I put a .tif into Adobe Acrobat, it will create a PDF from each page in the image

_wb_

2023-01-15 09:07:43	Right. For that use case it would indeed make sense to display in a scrolling way and the canvas thing is not useful.
2023-01-15 09:08:32	(btw I kind of regret that Orientation is an image header field and not a frame header field, in this use case it would have been useful if it can be adjusted per frame)

andrew

2023-01-15 09:09:14

alternatively, you could make a valid argument that multi-page documents is not the job of JPEG-XL and should be handled by a container format like CBZ

_wb_

2023-01-15 09:11:31

Yes. But there are advantages to doing it at this level. E.g. if you have repeating elements like shared headers/footers/logos in a multi-page image, encoding it in a single jxl codestream allows encoding the repeated elements just once, while if it's handled by a container, that is not possible.

andrew

2023-01-15 09:13:34	here's how Windows Photo Viewer handles a multi-page TIF btw
2023-01-15 09:16:37	if you're going to go all-in on multi-page JXL, there should be frame-level metadata (exif, physical dimensions, etc.) such that it's possible to (semi-)losslessly merge/unmerge multiple JXL files

_wb_

2023-01-15 09:58:43

There can be multiple Exif boxes in a jxl file, I think we just need to define what that means — I would suggest Exif box *n* corresponds to frame *n*, and if there are fewer Exif boxes than frames, the last Exif gets repeated for the remaining frames.

sklwmp

2023-01-16 12:54:14

The Rec. 709 specification does not actually state a "display" function, but seems to recommend BT.1886 (gamma 2.4), while the current implementation uses gamma 1/0.45 (inverse of Rec. 709)

Traneptora

	sklwmp The Rec. 709 specification does not actually state a "display" function, but seems to recommend BT.1886 (gamma 2.4), while the current implementation uses gamma 1/0.45 (inverse of Rec. 709)
2023-01-16 01:58:23	except H.273 does specify the gamma value
2023-01-16 01:58:49	https://www.itu.int/rec/T-REC-H.273-202107-I/en
2023-01-16 01:59:22	look at section 8.2
2023-01-16 02:00:01	it specifies `V = 4.500 * Lc` for `Lc < beta` and specifies `V = alpha * Lc^0.45 - (alpha - 1)` for `Lc >= beta`
2023-01-16 02:00:33	It's not a good idea to use BT.1886 because it's not actually the inverse of BT.709
2023-01-16 02:00:35	(really)
2023-01-16 02:01:17	generally speaking you should use the inverse of the function in H.273 for any purpose that is not targeting a display
2023-01-16 02:01:38	and if targeting a display you should invert BT.709 with BT.1886
2023-01-16 02:01:57	the reason these functions aren't inverses of each other can be boiled down to "historical crap"
2023-01-16 02:03:04	but the 709 transfer function looks similar to the sRGB one in that it has a linear segment and a gamma segment
2023-01-16 02:07:15	either way, for libjxl it should be inverting BT.709 to do that

sklwmp

	Traneptora except H.273 does specify the gamma value
2023-01-16 02:47:29	It does, but for the opto-electrical transfer function, which converts from light to electricity. Nothing for the other way around (which is what most decoders are concerned about). Neither H.273 nor ITU-R BT.709-6 specify an electro-optical transfer function, but BT.709 refers to BT.1886 as the "reference decoding function" for a "reference monitor".
	Traneptora and if targeting a display you should invert BT.709 with BT.1886
2023-01-16 02:48:31	should libjxl do this? or does it introduce complications because libjxl is an image format
2023-01-16 02:48:58	most video decoding software uses BT.1886 to display BT.709 video, it seems
2023-01-16 02:49:22	also, these ITU links seem to 404 for me?

Traneptora

	sklwmp should libjxl do this? or does it introduce complications because libjxl is an image format
2023-01-16 03:08:36	IMO no, since libjxl doesn't target displays
2023-01-16 03:08:43	it should use the inverse of what's specified in H.273
2023-01-16 03:09:16	do note that libjxl uses a CMS to encode, which means it's likely littlecms or skcms that is actually inverting BT.709, not libjxl native code

sklwmp

2023-01-16 03:19:20

although, just to clarify, the Rec 709 standard *was* designed with displays in mind, right? are color calculations the only situation wherein you use the inverse OETF instead the EOTF?

Traneptora

2023-01-16 03:26:48

any time you have to convert something to linear light

spider-mario

	sklwmp most video decoding software uses BT.1886 to display BT.709 video, it seems
2023-01-16 12:04:30	anecdotally, I recall grading a video and letting DaVinci Resolve encode it with a 2.4 gamma and a Rec. 709 tag, uploading the result to YouTube and Facebook, and the result looking totally washed out because the end-to-end encoding/playback chain of YouTube and Facebook did not actually seem to apply such a high gamma
2023-01-16 12:06:54	it seems that my videos look most like what they are supposed to if I just encode them with the 709 OETF

Tirr

2023-01-18 06:32:55	In I.3.2, minor typo: there is broken formatting with backticks
2023-01-18 12:20:23	I.2.4 has similar formatting error after the second psuedocode block
2023-01-18 01:16:55	In I.2.4, the first pseudocode: in the double for loop in `encoding_mode == AFV`, I think `vals` should be `params` instead
2023-01-18 01:18:50	I haven't checked the corresponding part of libjxl yet
2023-01-19 05:30:37	In I.2.4, when computing weights matrix: weights(0, 0) is not defined for encoding mode DCT2. I guess weights(0, 0) = 1.0, is this correct?

yurume

2023-01-19 05:34:08

not just undefined, but it is not used at all. so its value doesn't actually matter (libjxl had `-0xBAD` in its place AFAIK) but the current wording of the spec obscures that fact.

Tirr

2023-01-19 05:35:38

hmm then I'll go with +0.0

yurume

2023-01-19 05:35:51	it is not used because it is entirely covered by LLF coefficients, the top-left 1/8 x 1/8 portion of coefficients, which are derived from LF images
2023-01-19 05:36:06	yeah anything will work
2023-01-19 05:36:15	I briefly considered to put NaN there

Tirr

2023-01-19 05:38:04

Hornuss mode defines weight(0, 0) as 1.0 though, is it actually used?

yurume

2023-01-19 05:38:14

again, nope

Tirr

2023-01-19 05:40:16

so basically we don't care the area (0, 0) to (w/8, h/8) of the dequant matrix

yurume

2023-01-19 05:42:37	correct
2023-01-19 05:43:20	if you look at the coefficient decoding process, it even doesn't bother to read those them, it skips immediately to the first non-LLF coefficient

_wb_

2023-01-19 07:33:37

Unless anyone finds another issue, this will be the CD text we'll be approving tomorrow at the final plenary of the 98th JPEG meeting. There's only one known issue left, the references in the pseudocode in E.4.4 but that's due to a bug in Metanorma (https://github.com/metanorma/metanorma-iso/issues/909) and I'll fix that manually in the Word document.

2023-01-19 11:27:01

One of the trickier things was to define some limits on splines so there's a bound on the decode effort it takes to render them that can be computed without actually rendering them or forcing a specific rendering implementation, so especially in level 5 you cannot make an image that decodes crazy slow by using many/large splines, but also so things are not restricted too much. <@853026420792360980> I think you're the only one who already did an independent implementation of splines, so please take a look to see if how we did it makes sense to you.

Tirr

2023-01-19 12:16:59

In I.2.4, when computing weight matrix: in the pseudocode for AFV encoding mode, `Interpolate` is called with fourth argument `4`. What does this mean?

_wb_

2023-01-19 01:23:46

looks to me like a spurious argument that just shouldn't be there

veluca

2023-01-19 01:26:54	it means that the spec/that function was modified a lot of times
2023-01-19 01:26:55	😛

Tirr

2023-01-19 01:55:42

maybe that fourth argument was some kind of multiplier?

_wb_

2023-01-19 01:58:52	no it was bands.size() basically, iiuc
2023-01-19 10:26:36

Traneptora

	_wb_ One of the trickier things was to define some limits on splines so there's a bound on the decode effort it takes to render them that can be computed without actually rendering them or forcing a specific rendering implementation, so especially in level 5 you cannot make an image that decodes crazy slow by using many/large splines, but also so things are not restricted too much. <@853026420792360980> I think you're the only one who already did an independent implementation of splines, so please take a look to see if how we did it makes sense to you.
2023-01-19 11:23:35	what are fwidth and fheight here?
2023-01-19 11:23:38	frame width and frame height?

_wb_

2023-01-19 11:26:03	yes
2023-01-20 03:21:42	The new spline limits will not go in the CD yet, since we couldn't get consensus to just do that already without following the formal process of NB comments, so I suppose they will be officially added only at the DIS stage.
2023-01-20 03:23:41	So what's above is basically what we will approve to be the CD, except it has those new two rows in the levels table that are not part of the official CD and that we will request to be added, as a ballot comment on the CD ballot.

veluca

2023-01-20 07:01:42

since when do you need comments for changes to a WD?

_wb_

2023-01-20 07:16:41	Since it's not an editorial change but a technical change in the profiles and levels, and Fernando claims profile/levels definitions cannot be changed without following the process. In fact he also wants us to provide an input document for the next meeting to explain the change and discuss the implications on interoperability.
2023-01-20 07:18:40	Which makes sense in general and on an abstract level. In this particular case I would argue that the interop implications are minimal, but he has a point in principle that these implications do need to be considered.
2023-01-20 07:22:30	I will make some slides for the next meeting to convey the point that we're essentially only defining the playground for a coding tool that is currently only used frivolously in jxlart, and although we do want to have a well-defined restriction on it in case there are future, more serious applications that use this, and also to ensure that the browser use case (level 5) has some amount of protection against excessive resource use. Since we expect a future hardware profile to not include splines at all, the implications for that will be zero.
2023-01-20 07:39:12	In the meantime, libjxl will just define the de-facto limits, and I suppose other software implementations can either also use those limits, or choose to not check level conformance at all, which is always also a valid approach to make a conforming decoder.

Traneptora

2023-01-25 09:41:20	^ jxlatte doesn't reject streams that otherwise would be conformant but break level rules
2023-01-25 09:41:29	for most things
2023-01-25 09:41:32	for some things it does
2023-01-25 09:41:55	I check the size header for raw codestreams to make sure I'm not getting a false positive

_wb_

2023-02-06 07:29:09

I wonder which of the 3 independent decoders will be the first to implement a full decoder 😛

Tirr

2023-02-12 07:17:17

I.5.3: when computing `Mul` it should be divided by `quantizer.global_scale`, not multiplied

_wb_

2023-02-12 07:20:50

🤦

Tirr

2023-02-12 07:21:01	I.8: `cx` should be `max(bwidth, bheight) / 8`, `cy` should be `min(bwidth, bheight) / 8`, `ScaleF` arguments should be `(y, cy)` and `(x, cx)`
2023-02-13 09:32:24	I.9.8: - each occurence of `coefficients(1, 0)` and `coefficients(0, 1)` should be `(0, 1)` and `(1, 0)`, respectively - in the last for loop, `samples_4x4` should be `samples_4x8`

_wb_

2023-02-13 12:31:20

This is all VarDCT stuff, not really what I know best. Can the implementors of libjxl, j40 and/or jxlatte double-check these findings? (some of these things might be spec bugs I accidentally introduced when fixing other spec bugs, so I want to be sure)

Tirr

2023-02-13 12:41:10	I'm not 100% sure about those (especially I.8). jxl-oxide still seems to have a bug related to LF that makes decoded image look darker and have blocky artifacts
2023-02-13 12:41:15	double-checking would be very helpful

Traneptora

	_wb_ This is all VarDCT stuff, not really what I know best. Can the implementors of libjxl, j40 and/or jxlatte double-check these findings? (some of these things might be spec bugs I accidentally introduced when fixing other spec bugs, so I want to be sure)
2023-02-14 10:02:45	I was the one who reported those originally, but I later mentioned that I got something backwards
2023-02-14 10:03:00	I don't know if it was ever changed
2023-02-14 10:05:06	https://discord.com/channels/794206087879852103/1021189485960114198/1043823541138763817
	Traneptora I believe that last samples should say samples4x8
2023-02-14 10:08:42	I reported this already here, so I can verify it
	Tirr I.5.3: when computing `Mul` it should be divided by `quantizer.global_scale`, not multiplied
2023-02-14 10:13:47	not sure about this one, compare it here: https://discord.com/channels/794206087879852103/1021189485960114198/1043286802934550643

Tirr

2023-02-19 10:39:43	I.2.4: for encoding mode RAW, computed weights matrix should be used as is, reciprocals should not be taken
2023-02-19 10:39:51	(it fixed my losslessly transcoded image)
	Traneptora It is multiplied by `(1 << 16) * quantizer.global_scale / HfMul`
2023-02-19 10:47:31	I think it's `(1 << 16) / (global_scale * HfMul)`, to be consistent with LF dequant where `m{X,Y,B}DC` is computed as `(1 << 16) * m_{x,y,b}_lf_unscaled / (global_scale * quant_lf)`
2023-02-19 10:47:35	it worked for me at least

Traneptora

	Tirr I.2.4: for encoding mode RAW, computed weights matrix should be used as is, reciprocals should not be taken
2023-02-19 12:31:54	I confirm this as I reported it too

yoochan

2023-03-03 03:58:19	Annex L 2.2 ```Lmix = (pow(Lgamma − cbrt(oim.opsin_bias0)),3) + oim.opsin_bias0) * itscale; Mmix = (pow(Mgamma − cbrt(oim.opsin_bias1)),3) + oim.opsin_bias1) * itscale; Smix = (pow(Sgamma − cbrt(oim.opsin_bias2)),3) + oim.opsin_bias2) * itscale;``` on thoses 3 lines, there is one more closing parenthesis than opening 😄
2023-03-03 04:03:41	and even if it is perfectly clear what is what, the symbol B is used twice, both for the YXB and RGB third channel...
2023-03-03 04:04:14	I'm perhaps dumbreading too much here 😅

_wb_

2023-03-03 04:33:07	Leonard from Adobe just pointed me to this: https://github.com/pdf-association/pdf-issues
2023-03-03 04:33:35	maybe we should do something similar, creating a public repo just for opening spec issues
2023-03-03 04:33:43	wdyt, <@795684063032901642> ?

yoochan

2023-03-03 07:17:10	I guess having the html under revision control is not an acceptable option ?
2023-03-03 07:18:22	the html is perhaps not the original... is it generated from the pdf or the other way around ?

_wb_

2023-03-03 08:33:59	Html is generated from metanorma sources
2023-03-03 08:34:10	Which we have on a private git repo
2023-03-03 08:34:23	I want to make it public but ISO does not allow that

yoochan

2023-03-04 05:58:05

I understand... It seems difficult to open public issues on a private document. But it could be issues tagged "spec" on the main libjxl repo

Fraetor

2023-03-04 10:52:15

I think having a separate repo for with just a README.md would work. It gives a place to publicly track spec issues.

Traneptora

2023-03-10 06:56:03	Section G.1.2, Table G.2, typo in definition of `m_b_lf`
2023-03-10 06:56:08	it's listed as `m_b_l`
2023-03-11 09:41:44	<@794205442175402004> dyou have an updated spec draft? so I can use the latest version when working on libhydrium

_wb_

2023-03-11 09:43:22

Haven't made any updates yet since the version above, iirc. Still need to get the more recently reported issues fixed...

Traneptora

2023-03-11 09:43:39

I see, thanks

yoochan

2023-03-12 10:31:39

Libhydrium?

sklwmp

2023-03-12 10:39:56

i believe it is this: https://github.com/thebombzen/hydrium

yoochan

2023-03-12 11:08:20

thx ! I couldn't find it 🙂

Traneptora

2023-03-12 11:16:14

very WIP

yoochan

2023-03-12 08:40:37

Every journey start with a first step

improver

2023-03-13 07:52:44	imo these numbers are more "expected", similar how people keep using 16:10 & not 8:5, it makes remembering them easier when u can stack them alongside other ratio numbers
2023-03-13 07:57:08	though its not very common as a ratio anyway...

_wb_

2023-03-13 08:12:11	yes, I suppose in particular the 2:1 is to emphasize that it's an aspect ratio, but then maybe we should also have done `× 1 Idiv 1` in the first one 🙂
2023-03-13 08:13:09	maybe we should make it a table again instead of pseudocode...

improver

2023-03-13 08:26:30

what is 12:10 used for btw? i can't seem to find it as being common for anything on the net, though its position does make sense in this list so its prob not a typo

_wb_

2023-03-13 08:33:52	No idea how that list of aspect ratios was decided, it was already there in pik.
2023-03-13 08:42:38	480x576 is described in BT.1358 so there's that, I guess
2023-03-13 08:43:22	it seems a reasonable intermediate aspect ratio between 5:4 and square
2023-03-13 08:43:36	but I agree it seems a bit exotic

Tirr

2023-03-13 09:19:42

I once had a 16:10 monitor

spider-mario

	Tirr I once had a 16:10 monitor
2023-03-13 09:42:22	afaik, current macbooks are still 16:10
2023-03-13 09:42:36	the Pro that have the notch are 16:10 even after you remove the height of the notch (so even a bit taller with the notch)

Fraetor

2023-03-13 07:39:05

And many other laptops are too.

jonnyawsom3

2023-03-13 11:42:40

I know 16:10 is getting popular-ish since it gives you a full 1080p with the taskbar underneath on windows as opposed to covering it up partially, plus more room for editing timelines and such

Traneptora

2023-03-15 08:49:50	Annex O says the initial state should be 0×130000
2023-03-15 08:49:55	I believe this should say 0x130000
2023-03-15 08:50:06	as this is a hex literal, not multiplication

_wb_

2023-03-15 08:58:14

Yep. It's a metanorma thing, it auto-converts things that look like number x number to use the times symbol. We should put all hex literals in `monospace` to avoid that...

Tirr

2023-03-17 05:13:07

I.6: chroma from luma for LF should be skipped when kUseLfFrame is set (which seems obvious, but still)

Traneptora

2023-03-18 11:51:12	update: I was wrong
2023-03-18 11:51:30	gab_custom is not signalled if RestorationFilter all_default = 1
2023-03-18 11:51:47	it didn't matter 7 times out of 8 but I found and 8th time where it mattered
2023-03-18 11:53:04
2023-03-18 11:53:06	here's a sample

Tirr

2023-04-11 05:42:48	Table I.6: the second element of SeqB should be `-0.3633036457487539`
2023-04-11 05:43:20	(the sign is missing)

_wb_

2023-04-11 05:45:54	Oops!
	Tirr I.8: `cx` should be `max(bwidth, bheight) / 8`, `cy` should be `min(bwidth, bheight) / 8`, `ScaleF` arguments should be `(y, cy)` and `(x, cx)`
2023-04-12 02:13:41	can this particular change be confirmed by now?

Tirr

2023-04-12 02:14:22

I'm still using modified one at least

Traneptora

	_wb_ can this particular change be confirmed by now?
2023-04-12 02:15:37	https://discord.com/channels/794206087879852103/1021189485960114198/1043823541138763817

_wb_

2023-04-12 02:17:31	ok, looks to me like that change makes sense
2023-04-12 02:23:19	this is the draft as I have it now
2023-04-12 02:23:59	(preparing for the upcoming JPEG meeting where we will try to move to DIS stage)

Eugene Vert

2023-04-16 10:34:53	The draft still seems to have extra arclengths in K.4.2 `Sigma = ContinuousIDCT(dctSigma, 31 * arclength_from_start / arclength);` `value = ContinuousIDCT(dctC, 31 * arclength_from_start / arclength);`
2023-04-16 10:35:07	https://discord.com/channels/794206087879852103/794206170445119489/1039469299045240892

_wb_

2023-04-17 12:42:13	thanks for noticing that, I only looked at spec bugs reported here, not in <#794206170445119489> , so I missed that one!
2023-04-17 12:42:33	Is this still an issue https://discord.com/channels/794206087879852103/794206170445119489/1039467250803036160 ? <@853026420792360980>

Traneptora

	_wb_ Is this still an issue https://discord.com/channels/794206087879852103/794206170445119489/1039467250803036160 ? <@853026420792360980>
2023-04-17 02:15:27	I can double check in a bit

_wb_

	Traneptora I can double check in a bit
2023-04-18 06:45:06	that would be great

Eugene Vert

	_wb_ Is this still an issue https://discord.com/channels/794206087879852103/794206170445119489/1039467250803036160 ? <@853026420792360980>
2023-04-18 10:08:30	I'm getting very strange results using the version from the specification
2023-04-18 10:11:09	For example: (libjxl is 1st picture)
2023-04-18 10:12:28	Or this:
2023-04-18 10:49:29	With ```c++ diffX = x - p.x; diffY = y - p.y; distance = sqrt(diffX * diffX + diffY * diffY); factor = erf((0.5 * distance + SQRT_0125) * inverseSigma); factor -= erf((0.5 * distance - SQRT_0125) * inverseSigma); /* the sample at column x and row y of channel C / += 0.25 value * intensity * sigma * factor * factor; ``` as in `splines.cc:67 void DrawSegment` line 79 and tweaked ```c++ xend = min(image.height − 1, floor(p.y + maximum_distance + 1.5)); yend = min(image.height − 1, floor(p.y + maximum_distance + 1.5)); ``` as in `splines.cc:92 void DrawSegment` line 98, the splines render normally and match the libjxl version
2023-04-18 11:27:04	Also ```c++ maximum_distance = −2 * log(0.1) * Sigma * Sigma; ``` is ```c++ maximum_distance = sqrt(-2 * sigma * sigma * (log(0.1) * 3.0 - log(max_color))); ``` as in splines.cc:108 void ComputeSegments line 135

Traneptora

2023-04-18 04:43:55	ah, looks like you double checked
2023-04-18 04:43:58	thanks

_wb_

2023-04-18 04:46:46

ugh what is `max_color`?

Traneptora

	_wb_ ugh what is `max_color`?
2023-04-18 04:52:02	maximum value of colorX, colorY, and colorB
2023-04-18 04:52:22	this is what I have for it in jxlatte
2023-04-18 04:52:38	```java values[0] = fourierICT(coeffX, t) * arc.arcLength; values[1] = fourierICT(coeffY, t) * arc.arcLength; values[2] = fourierICT(coeffB, t) * arc.arcLength; float maxColor = MathHelper.max(0.01f, values[0], values[1], values[2]); ```
2023-04-18 04:53:19	where `t \in [0, 31]`
2023-04-18 04:53:27	proportional to progress along arc
2023-04-18 04:54:35	in libjxl it's this
2023-04-18 04:54:38	```c++ // We cap from below colors to at least 0.01. float max_color = 0.01f; for (size_t c = 0; c < 3; c++) { max_color = std::max(max_color, std::abs(color[c] * intensity)); } ```

_wb_

2023-04-18 05:03:29

<@604964375924834314> would you mind taking a look at the splines rendering in the spec and fixing the above spec bugs in the spec directly? I'll review if you make a PR on the spec repo

spider-mario

2023-04-18 05:04:24

I’m sick at the moment but I’ll try to have a look later in the week

_wb_

2023-04-18 06:19:33	Oh, no rush, get better soon!
	Eugene Vert Also ```c++ maximum_distance = −2 * log(0.1) * Sigma * Sigma; ``` is ```c++ maximum_distance = sqrt(-2 * sigma * sigma * (log(0.1) * 3.0 - log(max_color))); ``` as in splines.cc:108 void ComputeSegments line 135
2023-04-21 10:38:33	as far as I understand, the maximum_distance in libjxl is just a tighter estimate than in the spec, to avoid making the bounding box larger than it should be, for speed reasons. But having a bigger number in the spec shouldn't really make a difference. So not sure if this needs to change in the spec, it's just an optimization of the implementation. But it looks like you also found a bug in the spec that makes things blurrier than they should be, and that should be fixed of course.
2023-04-21 10:38:51	<@179701849576833024> could you take a look too?

veluca

2023-04-21 10:51:14

Maybe next week

_wb_

2023-04-21 11:34:13	`maximum_distance` only influences the size of the rectangle that gets repainted, where you don't want to make those rectangles too large since there is one of those for every sample point along the spline, and if they're too large you're effectively adding a lot of zeroes (or values extremely close to zero) most of the time, which isn't making any visible difference anyway
2023-04-21 11:36:01	but as far as the spec goes (which doesn't have to care about computational complexity, only about correctness), I think we could probably even drop `maximum_distance` completely and always theoretically update all pixels (even if most of those updates will be + 0)
2023-04-21 02:47:32	updated draft:
2023-04-21 02:48:08	simplifying/fixing the spline rendering to this:
2023-04-21 02:51:03	(old version looked like this, which wasn't very correct and also included details on how to pick xbegin/xend/ybegin/yend which were both unnecessary and wrong)
2023-04-21 02:53:59	<@736666062879195236> <@604964375924834314> <@179701849576833024> <@853026420792360980> please double-check if the stuff I'm putting in the spec is now actually correct, and if the NOTE makes sense.
2023-04-21 02:55:17	if there are no more bugs found, this will be the version that will become the DIS we submit at the end of next week.
2023-04-21 02:56:48	(so please if you find more spec bugs, find them before ~Wednesday April 26th, so we can still fix them in time)
2023-04-21 02:58:39	once it goes to DIS stage, any change will have to be made twice (once in the Word file the ISO editor will give us, and once in the metanorma source repo) and it would be best if no more technical changes need to be made so we can skip FDIS and go straight from DIS to IS

Traneptora

	_wb_ <@736666062879195236> <@604964375924834314> <@179701849576833024> <@853026420792360980> please double-check if the stuff I'm putting in the spec is now actually correct, and if the NOTE makes sense.
2023-04-21 03:04:04	not sure I like abstracting away max_distance and just iterating over the whole image since in practice that will be absurdly slow. in other places where something was specified and an implementation suggestion existed, it could safely be ignored for naive implementations. e.g. in the WP section of the modular spec, it says "you can implement X / (1 << 24) with a lookup-table" but if you just do the integer division, it's not that bad. but in this case I'd probably suggest some heuristic for the size of that rectangle rather than just say "it exists"
2023-04-21 03:04:49	but mathematically speaking it appears what you have written is correct, from my first look
2023-04-21 03:07:01	someone doing a naive implementation-directly-from-spec should be able to do so IMO without a lot of extra math

Tirr

2023-04-21 03:08:43	this is unrelated to splines and more like a question. it seems that VarDCT RGB is possible. do we read coeffs in GRB order when this is happening? the spec only says for XYB (in YXB order) and it feels a bit unclear to me
2023-04-21 03:09:50	for YCbCr images coeffs are in YCbCr order, but modular images are in CbYCr order
2023-04-21 03:10:18	(this confused me when implementing vardct for the first time)

_wb_

2023-04-21 03:10:22

makes sense to put something a bit better than the super naive thing there, as long as it's safe and not too complicated. Adding the thing with `max_color` and everything kind of complicates things quite a bit, but maybe we can just have some simpler bound that is not as tight but also helps implementers. Then again there are other things like e.g. the DCT itself where we don't really offer any implementation advice in the spec, so I think it might be better to let those who want inspiration for optimizing things just look at existing implementations for that

Traneptora

	Tirr this is unrelated to splines and more like a question. it seems that VarDCT RGB is possible. do we read coeffs in GRB order when this is happening? the spec only says for XYB (in YXB order) and it feels a bit unclear to me
2023-04-21 03:10:32	at any point in the spec if it references X, Y, B for images that aren't XYB-encoded, you drop-in X, Y, B as R, G, B instead.
2023-04-21 03:11:03	since you read the varblocks in Y, X, B order, that means likewise you read non-XYB varblocks in G, R, B order

_wb_

	Tirr this is unrelated to splines and more like a question. it seems that VarDCT RGB is possible. do we read coeffs in GRB order when this is happening? the spec only says for XYB (in YXB order) and it feels a bit unclear to me
2023-04-21 03:11:19	I think coeffs would be in GRB order then, yes. Best to check, take an RGB JPEG, recompress it with cjxl and see what happens 🙂

Traneptora

	Tirr for YCbCr images coeffs are in YCbCr order, but modular images are in CbYCr order
2023-04-21 03:11:59	modular images are in Cb, Y, Cr order because they're in YXB order, iirc.
2023-04-21 03:12:13	if you use xyb modular it will by in Y, X, B order (with B subbed in for B-Y)
2023-04-21 03:17:51	X, Y, B gets replaced with "first, second, third" components
2023-04-21 03:17:59	which are either R, G, B or Y, Cb, Cr, depending on context

spider-mario

	_wb_ <@736666062879195236> <@604964375924834314> <@179701849576833024> <@853026420792360980> please double-check if the stuff I'm putting in the spec is now actually correct, and if the NOTE makes sense.
2023-04-21 03:18:14	how does that interact with the limits on the number of pixels affected?

Tirr

	Traneptora which are either R, G, B or Y, Cb, Cr, depending on context
2023-04-21 03:21:05	well I'm decoding LF images and HF coeffs in YCbCr order, and headers signal subsampling info in CbYCr order, so I thought XYB matches CbYCr
2023-04-21 03:23:08	as far as I understood: basically X, Y, B matches with R, G, B or Cb, Y, Cr (or C, M, Y I guess?) so for VarDCT images they are in GRB and YCbCr order. but for modular images, the spec explicitly says (R, G, B) / (Y', X', B') / (Cb, Y, Cr) so we decode images in that order

Traneptora

2023-04-21 03:26:40

ah, then yea

Tirr

2023-04-21 03:29:04

ah the spec has that: > In the rest of this document, the three main colour components are referred to as X, Y, B, even in the case where the XYB colour space is not used for the stored image; in that case the actual components are either R, G, B or Cb, Y, Cr or C, M, Y.

Traneptora

2023-04-21 03:29:25

ah, there it is

_wb_

	spider-mario how does that interact with the limits on the number of pixels affected?
2023-04-21 03:37:00	it doesn't, the estimated pixel area is computed in integer math straight from the quantized splines and isn't related to the actual spline rendering

Tirr

2023-04-23 03:45:18

Table F.8 (and Table K.1): expressions of the blending operations don't match those of libjxl. - alpha clamping is done only on `new_alpha` (the image to blend), not for `old_alpha` - kMulAdd: `new_alpha` is always clamped, blending uses `new_alpha` instead of mixed `alpha`, the resulting alpha value is always `old_alpha` (for kMulAddBelow it would be `new_alpha`) - kMul: `clamp` is signalled, but it doesn't take any alpha channel; instead `new_sample` should be clamped

2023-04-23 03:50:29

found while testing `blendmodes` from conformance test

_wb_

2023-04-24 06:53:55	<@179701849576833024> <@532010383041363969> fix spec to match libjxl, or fix libjxl to match spec on these ones?
2023-04-24 06:55:30	fixing spec is easier, but we can go both ways imo since I don't think any encoder is using funky blend modes in a way that would break existing images if we change any of the above 3 mismatches (except of course the conformance test)
2023-04-24 06:57:09	if we don't want to have to update conformance tests, we should sync the spec to make it match libjxl
2023-04-24 06:59:33	kMulAdd always doing clamping seems like a libjxl bug: why bother signalling `clamp` if we're just always clamping anyway?
2023-04-24 07:04:10	kMul did not make sense in the spec since there is no alpha_channel to clamp in this case; libjxl indeed clamps all foreground samples; I suppose we can change the spec to match libjxl here
2023-04-24 07:06:38	alpha clamping being done on `new_alpha`, not `old_alpha` : is this a spec bug or a libjxl bug? can go either way, I guess. I would treat it as a spec bug
2023-04-24 07:08:29	kMulAdd has `alpha = old_alpha` instead of the kBlend formula for alpha: I think that's a spec bug, it doesn't really make sense to do that
2023-04-24 07:10:51	https://github.com/libjxl/libjxl/blob/main/lib/jxl/alpha.cc#L71 this code ignoring the value of `clamp` seems like a libjxl bug, right? or am I missing something?
2023-04-24 07:53:23	Resolving it in this way, see https://github.com/libjxl/libjxl/pull/2418 for the libjxl fix and the rest we can fix in the spec
2023-04-24 07:54:23	Updated spec draft:
2023-04-24 07:55:01	Uploading this one as input study text for the JPEG meeting that will start in 3 hours

Jyrki Alakuijala

	_wb_ <@179701849576833024> <@532010383041363969> fix spec to match libjxl, or fix libjxl to match spec on these ones?
2023-04-24 09:21:33	not immediately sure what to do -- this needs a bit of thinking and discussing

_wb_

2023-04-24 09:23:21

I proposed something above — the meeting is starting in 98 minutes and I need to upload _something_ as the input document. But we have all week to discuss and update the draft as needed before we produce the output document which is to be approved on Friday.

Jyrki Alakuijala

2023-04-24 09:24:33	depends on your own stress level probably more than anything else -- if you can deal with excitement, you can make a last minute change
2023-04-24 09:25:03	my experience is that 30 % of such last minute changes introduce an unforeseen side effect
2023-04-24 09:25:27	but seems reasonable to try to fix this

_wb_

2023-04-24 09:40:49

we can discuss about what's the best way to fix it, but by the end of the week I want the spec and libjxl to define blending in the same way 🙂

Tirr

2023-04-24 10:28:48	J.4.3: `quantization_width` is not `HfMul`, it's `Mul` defined in Section I.5.3
2023-04-24 10:29:11	(please double-check this, jxl-oxide is still broken after fixing this)
2023-04-24 10:29:57	relevant code in libjxl: https://github.com/libjxl/libjxl/blob/0d8651b09e42bec33a676e02459cf573d3905d87/lib/jxl/epf.cc#L82
2023-04-24 11:43:28	J.4.3: we need to add `scaled_distance * inv_sigma` when computing `v` (as `inv_sigma` is negative)
2023-04-24 11:44:09	these two changes fixed EPF of jxl-oxide

_wb_

	Tirr J.4.3: we need to add `scaled_distance * inv_sigma` when computing `v` (as `inv_sigma` is negative)
2023-04-24 12:10:18	does something need to be changed in the spec or is the spec ok for this one?
	Tirr J.4.3: `quantization_width` is not `HfMul`, it's `Mul` defined in Section I.5.3
2023-04-24 12:10:37	can you confirm now that it's indeed Mul and not HfMuL?

Tirr

	_wb_ does something need to be changed in the spec or is the spec ok for this one?
2023-04-24 12:24:10	I can't check the spec right now, but iirc the spec computes negative `inv_sigma` because of the `sqrt(0.5) - 1` term, so yeah, I think the spec needs to be changed

_wb_

2023-04-24 12:24:59

ah right, otherwise v is always >= 1

Tirr

	_wb_ can you confirm now that it's indeed Mul and not HfMuL?
2023-04-24 12:26:36	I didn't modify this code after I changed it to Mul, but I'll double-check when I have access to my dev machine

_wb_

2023-04-24 12:27:04

perhaps makes more sense to change it to `1 - sqrt(0.5)` in the spec so `inv_sigma` is positive

Tirr

2023-04-24 12:27:18

yeah It would be better

_wb_

2023-04-24 12:37:04

Thank you so much for finding all these spec bugs, <@268284145820631040> , <@853026420792360980> , <@206628065147748352> and <@736666062879195236>! I wish I could add a "thank you" section to the spec, but ISO doesn't do that kind of thing (they never put any names or credits in their standards)

yoochan

2023-04-24 12:38:51

what are the opportunities to amend the spec AFTER the official publication ? do you have to pay for it ?

_wb_

2023-04-24 12:40:08	one of the slides I'll be presenting later today 🙂
	yoochan what are the opportunities to amend the spec AFTER the official publication ? do you have to pay for it ?
2023-04-24 12:41:07	after a standard is published (IS stage), it can be changed by making an AMD (amendment), CORR (corrigendum) or by making a new edition
2023-04-24 12:41:26	all of these options are mostly just a lot of bureaucracy 🙂
2023-04-24 12:43:26	if you want to make an amendment, corrigendum or new edition, first you need to submit a NWIP to ISO (new work item proposal). This needs to be approved by national bodies before anything can happen. Then it goes from working draft to committee draft to DIS to FDIS to IS, where starting at the CD each stage goes to ballot.

yoochan

2023-04-24 12:45:34	yum 😄
2023-04-24 12:45:34	are we in a novel of douglas adams ?

_wb_

2023-04-24 12:46:01	I made a very generic comment myself (officially it is the Belgian national body making this comment) to the CD, which then "allows me" as an editor to make the changes I want to make — because officially changes have to be motivated by national bodies requesting them
2023-04-24 12:47:10	so basically I need to with one hat on make a request to the Belgian national body to please request changes so with the other hat on I can make those changes
2023-04-24 12:47:44	this is the kind of antiquated procedures you need to navigate when working as an ISO working group

yoochan

2023-04-24 12:47:56

amazing how rules always have a backdoor 😄

_wb_

2023-04-24 12:48:26	it's not really a backdoor, it's their way to document changes that happen
2023-04-24 12:48:44	(a git log of the spec repo is more practical though imo)

yoochan

2023-04-24 01:37:53

almost no link with the issue but I once dreamed of a democratic process where laws are voted through merge requests made by MPs 😄 it would be so clean to have the book of law patched instead of the horrendous existing "modifying laws"

Tirr

	_wb_ can you confirm now that it's indeed Mul and not HfMuL?
2023-04-24 03:12:49	yeah I can confirm this; the complete formula I'm using is `quant_mul * (65536.0 / (global_scale * hf_mul)) * sharp_lut[sharpness]` so it's `Mul`
2023-04-24 03:13:56	(it results in the same sigma value with libjxl)

_wb_

2023-04-24 03:25:17

cool, thanks!

Traneptora

	_wb_ Updated spec draft:
2023-04-25 03:15:17	it says `stage` in lowercase right before Warning for WDs and CDs
2023-04-25 03:15:20	I assume that's a typo
2023-04-25 03:15:34
2023-04-25 03:15:49	or a placeholder

_wb_

2023-04-25 05:20:32

Yeah I dunno why it doesn't put DIS there, maybe a metanorma bug

spider-mario

2023-04-25 09:01:32	_Système de codage de noyau_ 😂
2023-04-25 09:02:57	“Core coding system” is supposed to be “Core {coding system}” (“core” applies to “system”), but instead they’ve translated it as “{kernel coding} system”
2023-04-25 09:03:07	a system for coding kernels

yoochan

2023-04-25 09:12:07

c'est peut-être pour les olives

_wb_

2023-04-25 10:21:40

I don't know who did that translation but I'm not going to try to figure out how to request ISO to change the french title of the standard

Tirr

2023-04-25 05:41:57

I.5.2: CfL should be done before LF smoothing

_wb_

2023-04-25 06:37:30	ah good point, that is indeed completely not clear from the spec
2023-04-27 09:09:14	Here's the version that we will submit as DIS (pending approval tomorrow at the final plenary of the 99th JPEG meeting)
2023-04-27 09:09:31	(that is, the Word version of that)
2023-04-27 09:11:46	Notable change: the definition of `total_estimated_area_reached` was adjusted (both how it is computed and what the profile limits are).

yoochan

2023-04-27 10:49:16

congratulations ! but what was published in march https://discord.com/channels/794206087879852103/803379415106584626/959072434571583488 ?

Tirr

2023-04-27 10:52:15

I guess it's the first edition published last year

_wb_

2023-04-27 11:24:28

Yes that was the first edition. Now it's the second edition. No major changes, just tons of bugfixes and clarifications.

yoochan

2023-04-27 11:59:23

nice ! there is a new edition planned every year ?

Traneptora

2023-04-27 12:01:27	there's not going to be a reason to unless more extensions keep being added
2023-04-27 12:01:52	second edition's primary purpose is to fix errors in first edition, rather than add new features
2023-04-27 12:02:19	ideally this will iron out most or all of them so there won't be a need to keep doing more editions

Tirr

2023-04-27 12:09:09	if `xyb_encoded && want_icc`, are the samples after inverse XYB (defined in the spec) in the correct gamut and transfer curve defined in the ICC profile?
2023-04-27 12:10:09	or do I need to convert samples from linear sRGB according to the profile?

Traneptora

2023-04-27 12:10:30	if the image is XYB encoded, then inverting XYB will always produce linear sRGB
2023-04-27 12:10:57	in the case where the image is XYB-encoded, colorspace info is a suggestion for decoders, and it provides information on what the original colorspace was
2023-04-27 12:11:39	you can convert that linear sRGB into any space you wish, be it the tagged space, or the display space, or whatever.

Tirr

2023-04-27 12:12:14

okay, thanks for clarifying

Eugene Vert

2023-04-27 12:57:48

`E.1` uses "grayscale" while `1`, `3.2.10` and `E.2` use "greyscale"

yoochan

2023-04-27 01:09:10

I propose græyscale

_wb_

2023-04-27 03:05:08	there is no plan for more editions. At some point in the future we might make an amendment to add a new profile (one that only allows some subset that can be implemented conveniently in hardware), and we might find more errors to fix, so there might be a 3rd edition at some point, but we certainly have no goal to make more editions
2023-04-27 03:11:23	ISO standards are supposed to use the Oxford English spelling, so greyscale would be correct

Traneptora

	Tirr okay, thanks for clarifying
2023-04-27 08:14:45	do note that inverting XYB involves a cubic-like gamma transfer, followed by a linear transform by the OpsinInverseMatrix
2023-04-27 08:15:10	and converting from linear sRGB to another linear space (e.g. BT.2100 or P3) is a linear transformation
2023-04-27 08:15:32	so you can actually take the sRGB -> BT2100 matrix and multiply it by the Opsin Inverse Matrix
2023-04-27 08:15:48	to get a new OpsinInverseMatrix that maps into a different space

Eugene Vert

2023-04-27 09:10:08	In `K1` it is says that out of image features, upsampling happens first, but should the upsampling of splines work like in jxl-art?
2023-04-27 09:13:37	Not sure, but from `26 dec_cache.cc:Status PassesDecoderState::PreparePipeline` it seems like that the order in libjxl is: patches; splines; upsampling; noise

_wb_

2023-05-05 06:45:50	So it looks like there are some `Umod`s missing in the spec since libjxl is using `int32_t` for some things that might not fit
2023-05-05 06:47:29	Table H.4 (modular properties) : all of these have to be cast to `int32_t`
2023-05-05 06:50:33	Section H.5 (self-correcting predictor): some of these variables also have to be cast to `int32_t`

Tirr

2023-05-07 04:40:50

L.3: the bias applied to Y (0.5) is a little bit off to the actual constant used in libjxl (128 / 255 = 0.50196...), which can make the Level 10 conformance tests to fail

veluca

2023-05-07 08:32:47

(I hate that constant so much...)

_wb_

2023-05-07 08:43:52

Is this a spec bug or libjxl bug?

veluca

2023-05-07 10:46:41	who knows
2023-05-07 10:46:57	I'd consider it a spec bug and save ourselves a lot of pain

Tirr

2023-05-07 10:56:14	I think it should match the result of decoding jpeg image at least, as ycbcr images mainly come from recompressing jpegs
2023-05-07 10:57:58	but is it specified in the jpeg standard?
2023-05-07 10:59:30	libjxl cites full-range BT.601 (`stage_ycbcr.cc` L32)
2023-05-07 11:00:32	I guess it's a spec bug

veluca

2023-05-07 11:44:05

what offset jxl uses for YCbCr files is kinda immaterial as long as the encoder properly corrects for it

_wb_

2023-05-07 12:21:45

for jpeg recompression the encoder does not have a choice though

veluca

2023-05-07 12:30:48	how so?
2023-05-07 12:30:58	the encoder can very easily add an offset to DC
2023-05-07 12:31:04	I am in fact quite sure it does
2023-05-07 12:31:17	ah, no, I see what you mean
2023-05-07 12:31:34	what libjxl does matches what libjpeg does here

_wb_

2023-05-07 12:37:43

yes, I think that was the goal. basically all components including Y are treated like int8_t, with zero in the middle (which is why all-zeroes is gray in jpeg, not black or green)

Eugene Vert

2023-06-04 12:48:51

I have a question about `dim_shift` in `ExtraChannelInfo` In spec there is: > The value `1 << dim_shift` does not exceed the `group_dim` of any frame but in libjxl's it's (`image_metadata.cc`) ```c if ((1U << dim_shift) > 8) { return JXL_FAILURE("dim_shift %u too large", dim_shift); } ``` Isn't `group_dim` is something like `128 << group_size_shift`, why is it `8` in the code?

Traneptora

2023-06-04 02:28:30	well `group_dim` defaults to 256 for most images, so `1 << dim_shift` not exceeding `group_dim` would imply that `dim_shift > 8` should be the check, not `1 << dim_shift > 8`
2023-06-04 02:28:42	so something is wrong here

Eugene Vert

2023-06-06 10:45:29

(frame_header.cc:253) ```c++ for (size_t i = 0; i < extra_channels.size(); ++i) { uint32_t dim_shift = nonserialized_metadata->m.extra_channel_info[i].dim_shift; uint32_t& ec_upsampling = extra_channel_upsampling[i]; ec_upsampling >>= dim_shift; JXL_QUIET_RETURN_IF_ERROR( visitor->U32(Val(1), Val(2), Val(4), Val(8), 1, &ec_upsampling)); ec_upsampling <<= dim_shift; if (ec_upsampling < upsampling) { return JXL_FAILURE( "EC upsampling (%u) < color upsampling (%u), which is invalid.", ec_upsampling, upsampling); } if (ec_upsampling > 8) { return JXL_FAILURE("EC upsampling too large (%u)", ec_upsampling); } } ``` So the maximum cumulative upsampling (`ec_upsampling << dim_shift`) is no more than **8** and ```c++ if ((1U << dim_shift) > 8) { return JXL_FAILURE("dim_shift %u too large", dim_shift); } ``` is kind of early check?

2023-06-06 10:50:11

But `L.4` describes > first 8x upsampling is applied dim_shift Idiv 3 times implying that there may be multiple 8x upsamplings?

Tirr

2023-06-13 06:19:13

how do I render spot color channels on grayscale images?

_wb_

2023-06-13 06:42:27	Good question
2023-06-13 06:42:45	How would you want to render them?
2023-06-13 06:42:58	Does libjxl do anything that makes sense?
2023-06-13 06:43:38	The spec is not clear on what to do, so I think we should either explicitly not allow it, or clarify what to do

Tirr

2023-06-13 06:52:43	- If `r == g && g == b`, render them as grayscale - Else, the bitstream is invalid (we can't define the meaning of RGB with grayscale colorspace) or, just don't allow them at all.
2023-06-13 06:57:05	hmm iirc G in RGB corresponds to Y in XYB, maybe just take G in spot color and render it

Eugene Vert

2023-06-13 07:05:55

(But taking R as spot color is easier to programm)

_wb_

2023-06-13 08:13:33	I mean, it could potentially be useful to have some meaning for gray + non-gray color, e.g. in print this actually happens (if you only need black plus one extra color, the usual 4-color CMYK process can be more expensive than a 2-color process, and results will be not as nice)
2023-06-13 08:15:05	> At the final step in the decode process (when the inverse colour transform has been applied), the decoded image is in the colour space defined by the image metadata.
2023-06-13 08:16:22	so I guess you are supposed to do the spot color blending in that tagged space, not in linear sRGB (which would be the output of the inverse XYB as it is in the spec)
2023-06-13 08:16:40	in that case, if the tagged space is grayscale then of course RGB has no meaning
2023-06-13 08:17:55	ok seems to be too tricky to have gray + non-gray spot color, if you want to do that you can always just make it an RGB image that happens to use only gray, or even an empty image with two spot colors (one for black and one for the other color)
2023-06-13 08:18:20	I don't really see a use for grayscale with a spotcolor that is gray too
2023-06-13 08:19:46	so I would just say spot colors are either not allowed in the grayscale case, or a decoder can ignore them

Tirr

2023-06-13 08:20:54

if the tagged colorspace is grayscale then the embedded ICC profile should also be grayscale, so we can't explain what the spot color is supposed to look like (am I getting it right?)

_wb_

2023-06-13 08:23:55	yes, exactly. Unless we say spot colors are always specified in linear sRGB... but then a decoder needs to be able to convert the image to that before rendering spot colors, and back...
2023-06-13 08:27:31	for CMYK images as it is now, you have to specify spot colors using only CMY, and the decoder can do the spot color rendering directly on the decoded CMY channels (or not, and return the spot color channel separately). The resulting CMYK image is well-defined and can be converted to RGB for display.
2023-06-13 08:29:51	if we would specify spot colors in a fixed absolute space like linear sRGB (or XYB for that matter), then in the CMYK case a decoder would first need to convert the image to that absolute space, then do the spot color rendering, and if it's just for display then all is good but if you want to get a resulting CMYK image, you need to do a conversion from RGB to CMYK which is not well-defined (information was lost when converting CMYK to RGB and it cannot really be restored)
2023-06-13 08:33:22	meh, I'd say we just keep it as it is now, and just add that if the color space is kGrey then spot colors are not rendered by the decoder
2023-06-13 08:34:12	(applications can still have them as extra channels if they want to, you just don't get an approximate preview of what it would look like)

Lucas Chollet

2023-06-20 09:19:45

Hey, I started implementing a JPEG-XL decoder for SerenityOS and as far as I can tell, I found two small issues in the spec: In H.4.1, inside the code snippet: ```cpp rW = (x > 0 ? channel[j](x − 1, r) : 0); ``` `r` is used without being defined The same problem exists with `w` in H.3: ```cpp NEE = (x + 2 < w and y > 0 ? channel[i](x + 2, y − 1) : NE); ``` Hope that it's useful 😃

Traneptora

Lucas Chollet Hey, I started implementing a JPEG-XL decoder for SerenityOS and as far as I can tell, I found two small issues in the spec: In H.4.1, inside the code snippet: ```cpp rW = (x > 0 ? channel[j](x − 1, r) : 0); ``` `r` is used without being defined The same problem exists with `w` in H.3: ```cpp NEE = (x + 2 < w and y > 0 ? channel[i](x + 2, y − 1) : NE); ``` Hope that it's useful 😃

2023-06-20 10:39:59

looks like a typo for y

Lucas Chollet

2023-06-20 10:44:30

wdym?

yurume

2023-06-21 01:10:06

i.e. `(x - 1, r)` should have been `(x - 1, y)`

Lucas Chollet

	Traneptora looks like a typo for y
2023-06-21 01:29:15	I misunderstand that message that haha, I just realized that `y` wasn't an abbreviation for "you"
	yurume i.e. `(x - 1, r)` should have been `(x - 1, y)`
2023-06-21 01:29:47	Sure, that's what I did
2023-06-21 01:31:18	I only make that message so that the spec can be corrected I don't even know if that's useful for you guys as I guess that it's complicated to modify a spec

yurume

2023-06-21 01:35:15

yeah, that's definitely a typo that should be fixed, and I think `y` is indeed correct

Traneptora

2023-06-21 05:27:09	Is there a version more up-to-date than this one?
2023-06-21 05:27:10	https://discord.com/channels/794206087879852103/1021189485960114198/1101072554837409853

_wb_

2023-06-22 04:22:26

No, haven't made updates yet after submitting DIS...

Lucas Chollet

2023-06-23 08:49:32

<@794205442175402004> (I hope the ping don't bother you) Should I continue to report small spec bug and inaccuracy? I don't want to bother you with that especially if it's useless down the road, but the good citizen in me wants to report every thing that seems a bit off

_wb_

2023-06-23 08:54:21

Sure, please report stuff! We can still make changes after DIS, and even after IS of the second edition it will be useful to fix stuff for a potential third edition 🙂

Lucas Chollet

2023-06-25 05:29:10	It seems that there is no specified behavior to traverse the MA tree in case of a decision node with a property index greater than the current number of properties. This problem has been solved with this commit in `jxl-oxide`: <https://github.com/tirr-c/jxl-oxide/commit/7ca38cfa28b91fbfaad1346debbe0344a1994f26>
2023-06-25 05:30:35	(So it default to the left child node)

Tirr

	Lucas Chollet It seems that there is no specified behavior to traverse the MA tree in case of a decision node with a property index greater than the current number of properties. This problem has been solved with this commit in `jxl-oxide`: <https://github.com/tirr-c/jxl-oxide/commit/7ca38cfa28b91fbfaad1346debbe0344a1994f26>
2023-06-25 05:36:04	it's specified as returning zero for those properties: > The function `GetProperties(i, x, y)` returns an array of properties as defined above, where property values that are not set by the above table and code are zero.

Lucas Chollet

2023-06-25 05:41:26

Not in the old version I used when I wrote that code 😅 I should have check in new version, my bad But I'm glad that it has been fixed!

Traneptora

	Lucas Chollet It seems that there is no specified behavior to traverse the MA tree in case of a decision node with a property index greater than the current number of properties. This problem has been solved with this commit in `jxl-oxide`: <https://github.com/tirr-c/jxl-oxide/commit/7ca38cfa28b91fbfaad1346debbe0344a1994f26>
2023-06-26 04:03:30	in general invalid bitstreams are unspecified
2023-06-26 04:03:52	it specifies what is valid and how to handle valid bitstreams
2023-06-26 04:04:04	but decoders can do whatever with invalid ones

Arcane

2023-07-02 07:17:04

Traneptora

2023-07-08 03:14:49	Section E.4.1:
2023-07-08 03:14:51	> The decoder reads enc_size as U64(). The decoder reads 41 pre-clustered distributions as specified in C.1. The decoder then reads enc_size integers as specified in C.3.3 to obtain decompressed bytes, using DecodeHybridVarLenUint(IccContext(index, prev_byte, prev_prev_byte)) where index is the current byte index, prev_byte and prev_prev_byte are respectively the previous and second-previous bytes or 0 if they do not exist yet, and the IccContext() function is defined in the code below. The resulting decompressed byte-based data is the encoded ICC stream. The decoder reconstructs the ICC profile from this encoded ICC stream as described in the subsequent subclauses of this annex.
2023-07-08 03:15:06	It calls them "bytes" but are they actually guaranteed to be <256? they're being read as hybrid integers
2023-07-08 03:15:23	if they are, is it illegal (aka unspecified) behavior if they are at least 256
2023-07-08 03:15:51	or is "bytes" the wrong term?

_wb_

2023-07-08 03:34:02

I assume it has to be < 256, would indeed be good to make that explicit...

Traneptora

2023-07-17 05:05:21	It looks like Adaptive LF Smoothing has to occur after CFL
2023-07-17 05:06:40	otherwise the low B-channel coeffs will be smoothed out
2023-07-17 05:06:47	the spec doesn't say this

Tirr

2023-07-17 05:17:57

ah yeah, it's reported before https://discord.com/channels/794206087879852103/1021189485960114198/1100476812557041757

Traneptora

2023-07-17 05:18:28

I just figured that out when debugging jxlatte <:kek:857018203640561677>

Tirr

2023-07-17 05:21:58

is it added in the spec? I think I didn't check that

Traneptora

2023-07-17 05:22:55	hasn't been a new draft recently, but idk if I'm using an older one or the current one
2023-07-17 01:11:41	Spec doesn't say whether you are supposed to apply Gaborish to extra channels
2023-07-17 01:11:44	and if so, what weights to use
2023-07-17 01:12:56	same with EPF

Tirr

2023-07-24 02:30:16	Table M.1: it says `nb_channels` corresponds to the number of modular channels, but libjxl adds color channels even when it's VarDCT frame
2023-07-24 02:30:43	(related jxl-oxide issue: <https://github.com/tirr-c/jxl-oxide/issues/65>)

_wb_

2023-07-24 03:18:25	right — I think libjxl's behavior makes sense since even in VarDCT frames there is still modular-encoded LF data where it can be useful to have nontrivial MA trees. But this definition in the spec is then indeed at best ambiguous (and at worst just wrong)
2023-07-25 02:06:23	here's a draft of the 2nd edition of 18181-3 (conformance testing)

Tirr

2023-07-25 02:40:16	first time to see 18181-3
2023-07-25 02:41:25	.npy format used for conformance is described in the spec, great
2023-07-25 02:41:31	maybe I'll implement that

Fraetor

2023-07-29 10:34:50

Ah, its just a NumPy binary. Nice.

Lucas Chollet

2023-08-10 04:37:58

In "H.5.2 Prediction", between the two first code snippets, there is this sentence "The weights weight[i] for each of the 4 sub-predictions and `mxe` are computed as specified by the following code [...]". It's the only occurrence of "mxe" in the whole document, so I guess that it needs some clarification/clean-up. Also in "C.3.3 Hybrid integer decoding", the penultimate line (`window[(num_decoded++) & 0xFFFFF] = r;`) of `DecodeHybridVarLenUint(ctx)` uses the `window` variable. However, this variable is only defined if `lz77.enabled` so I think that it should be behind a if: ```cpp if (lz77.enabled) window[(num_decoded++) & 0xFFFFF] = r; ```

Traneptora

2023-09-08 07:56:48	I'm confused about section I.2.4:
2023-09-08 07:56:49	> params = /* read 3 x 9 matrix in raster order /; > for (i = 0; i < 3; i++) for (j = 0; j < 6; j++) params(i, j) = 64;
2023-09-08 07:57:01	``` params = /* read 3 x 9 matrix in raster order /; for (i = 0; i < 3; i++) for (j = 0; j < 6; j++) params(i, j) = 64; ```
2023-09-08 07:57:20	this looks like the matrix should be width=9 and height=3
2023-09-08 07:57:43	but `params(i, j)` will overflow in that case
2023-09-08 07:57:54	unless it's supposed to be `params(y, x)`

_wb_

2023-09-09 03:44:46	I hate this never ending (row,column) vs (x,y) confusion
2023-09-09 03:45:25	Thanks for noticing the inconsistency

yoochan

2023-09-09 05:34:56

Instead of i and j indices, in my codes I use r and c... And still make mistakes but they are faster to spot 😅

_wb_

2023-09-25 07:28:07

The second editions of 18181-1 and 18181-2 are now in DIS stage and the ballot deadline is approaching — we need to make a ballot comment for any change.

yoochan

2023-09-25 07:45:25

what is the scope of changes ? typos ? clarifications ?

_wb_

Eugene Vert Not sure, but from `26 dec_cache.cc:Status PassesDecoderState::PreparePipeline` it seems like that the order in libjxl is: patches; splines; upsampling; noise

2023-09-25 07:48:03

<@179701849576833024> It indeed looks like the spec is not super clear on the order of the steps. If I understand correctly, there are two cases: - extra channels need more upsampling than main channels: in this case the order is (upsampling, patches, splines, noise) so all channels have the same dimensions before applying patches - all channels need the same amount of upsampling: in this case the order is (patches, splines, upsampling, noise) Is that correct?

Tirr

2023-09-25 07:48:27	is this addressed? https://discord.com/channels/794206087879852103/1021189485960114198/1130364598202150912
2023-09-25 07:48:32	and this https://discord.com/channels/794206087879852103/1021189485960114198/1103935571316965407

veluca

	_wb_ <@179701849576833024> It indeed looks like the spec is not super clear on the order of the steps. If I understand correctly, there are two cases: - extra channels need more upsampling than main channels: in this case the order is (upsampling, patches, splines, noise) so all channels have the same dimensions before applying patches - all channels need the same amount of upsampling: in this case the order is (patches, splines, upsampling, noise) Is that correct?
2023-09-25 07:49:38	I think it is that if the extra channels need to be brought on the same size as color channels, that happens before patches, then all of that is upsampled to final size after patches
2023-09-25 07:49:40	lemme check
2023-09-25 07:50:57	ok apparently not
2023-09-25 07:51:47	I think you are right, and the combination in which there's different EC upsampling from color channel upsampling and color channels have non-1 upsampling is just not supported if there are also patches

Tirr

2023-09-25 08:05:22

F.2 says: > After decoding subsampled channels, they are upsampled as specified in J.2 (in the case of `jpeg_upsampling`) and K.2 (in the case of `upsampling` and `ec_upsampling`). and L.4 says: > first 8x upsampling is applied `dim_shift Idiv 3` times, followed by no, 2x, or 4x upsampling if `dim_shift Umod 3` is equal to 0, 1 or 2, respectively. but patches require extra channel upsampling factor equals color channel, so I guess it's (color & extra channel upsampling, ec upsampling, patches, splines, noise)?

veluca

_wb_ <@179701849576833024> It indeed looks like the spec is not super clear on the order of the steps. If I understand correctly, there are two cases: - extra channels need more upsampling than main channels: in this case the order is (upsampling, patches, splines, noise) so all channels have the same dimensions before applying patches - all channels need the same amount of upsampling: in this case the order is (patches, splines, upsampling, noise) Is that correct?

2023-09-25 08:08:52

I think it is supposed to be how Jon says here

jonnyawsom3

	_wb_ <@179701849576833024> It indeed looks like the spec is not super clear on the order of the steps. If I understand correctly, there are two cases: - extra channels need more upsampling than main channels: in this case the order is (upsampling, patches, splines, noise) so all channels have the same dimensions before applying patches - all channels need the same amount of upsampling: in this case the order is (patches, splines, upsampling, noise) Is that correct?
2023-09-25 09:01:09	Might not be related, but upsampling patches is currently broken, presumably because of the ordering in libjxl https://github.com/libjxl/libjxl/issues/2737
2023-09-25 09:02:00	May be worth checking what order is needed in that instance too, unless it's simply not supported

_wb_

	Eugene Vert But `L.4` describes > first 8x upsampling is applied dim_shift Idiv 3 times implying that there may be multiple 8x upsamplings?
2023-09-25 09:42:07	I don't think libjxl does upsampling with a factor > 8, right <@179701849576833024> ? Should we just spec it to be at most 8x cumulative upsampling, and drop that sentence about how to do upsampling with a factor > 8? Or should we keep that and fix libjxl to also do bigger upsamplings?
2023-09-25 09:48:24	I think having a limit of 8x is probably not really limiting and keeps things simpler, but I'm fine either way — we just need to decide whether to put the libjxl limit in the spec or to remove the limit from libjxl.
2023-09-25 09:52:58	(note: this only appplies to extra channels, where dim_shift can actually be signalled to be as large as 8, which means 256x upsampling, so combined with the ec_upsampling from the frame header, you could in principle have up to 2048x upsampling)

Info

JPEG XL

General chat

Voice Channels

Archived

Specification issues