JPEG XL

2022-11-11 02:52:46	but the spec only shows placing varblocks in the LFGroup decoding
2022-11-11 02:53:04	oh wait nvm, I see, blocks are always at least 8x8

_wb_

2022-11-11 02:54:32	Yes, dct4x8 etc are always filling an 8x8 region
2022-11-11 02:56:09	That simplifies alignment, DC, and avoids too much signaling cost for the small blocks, at the cost of course that you cannot have arbitrary layouts for the small ones

Traneptora

2022-11-11 03:50:03	Interesting, now I'm running into some bug decoding the HF Passes
2022-11-11 03:51:08	the order stream is verifying successfully, but the histogram after that is blowing up
2022-11-11 04:18:30	ah, I see, the last bits of a prefix code hit the buffer end
2022-11-11 04:18:35	adding some padding fixed it
2022-11-11 05:03:53	what if the blocks in DCTSelect overlap between groups?

_wb_

2022-11-11 05:07:41

Not allowed

Traneptora

2022-11-11 05:08:01

so then my block placement algorithm is faulty

_wb_

2022-11-11 05:08:06

Spec should already say that somewhere

Traneptora

2022-11-11 05:08:20	since I'm getting a DC16 placed at x=12, y=31 in DCTSelect
2022-11-11 05:08:30	which overflows out of the group
2022-11-11 05:13:45	out of curiosity, if it would overflow the group, do you just consider it to "not fit"? or is that explicitly disallowed

_wb_

2022-11-11 05:21:03

Explicitly disallowed, invalid bitstream.
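
The rule stated here can be sketched as a bounds check. This is only an illustration: `fits_in_group` is a hypothetical helper, coordinates are in 8x8-block units as in DctSelect, and the default of 32 blocks per group side assumes the usual 256x256-pixel group size rather than a value read from a frame header.

```python
def fits_in_group(x, y, blocks_w, blocks_h, group_dim_blocks=32):
    """Return True if a varblock at block coords (x, y) stays inside one group.

    All values are in 8x8-block units. group_dim_blocks=32 is an assumption
    (256-pixel groups); a real decoder would derive it from the frame header.
    """
    same_col = x // group_dim_blocks == (x + blocks_w - 1) // group_dim_blocks
    same_row = y // group_dim_blocks == (y + blocks_h - 1) // group_dim_blocks
    return same_col and same_row


# A 16x16 varblock covers 2x2 cells; at y=31 its second row lands in the next group.
print(fits_in_group(12, 31, 2, 2))
print(fits_in_group(12, 30, 2, 2))
```

A decoder that hits the first case has either misplaced an earlier block or is reading an invalid bitstream.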

Traneptora

2022-11-11 05:21:17	I'm just trying to figure out why I'm placing a block on a group boundary then
2022-11-11 05:21:25	the ANS state is verifying so I'm decoding it correctly
2022-11-11 05:21:48	but I'm probably misplacing it then
2022-11-11 06:01:48	yup
2022-11-11 06:10:53
2022-11-11 06:11:06	apparently I'm unable to place a 16x16 -> 04 block now
2022-11-11 06:11:40	it clearly doesn't fit, but the question is "how did we get here"
2022-11-11 06:14:39	is this the correct order to place blocks, given the block sizes?
2022-11-11 06:14:40
2022-11-11 06:14:49	(mod100 for readability)
2022-11-11 06:16:10	it appears it's what's happening though
2022-11-11 06:17:06
2022-11-11 06:17:13	right around here, block 47 doesn't fit into block 48's slot

_wb_

2022-11-11 06:29:39

If it helps: current jxl encoder does not cross 64x64 boundaries either

Traneptora

2022-11-11 06:30:05	also, extending past the bottom of the image is against the rules, right?
2022-11-11 06:31:08	the modular channel is verifying though so I clearly am messing up the placement somehow
2022-11-11 06:31:36	does anything here look out of place?
2022-11-11 06:32:16	like a block that, assuming its size is correct, should not have been placed where it was?

_wb_

2022-11-11 06:36:43

Well I think it should never happen that a gap gets created and filled later

Traneptora

2022-11-11 09:21:43	and, I figured it out!
2022-11-11 09:21:52	it is a bug in libjxl that will now have to become part of the spec
2022-11-11 09:23:20	either that or dimensions are just very confusing
2022-11-11 09:23:21	but
2022-11-11 09:23:28	`ACStrategy::covered_blocks_x()` and `ACStrategy::covered_blocks_y()` have swapped LUTs compared to what the spec seems to say
2022-11-11 09:23:42	for example, ID7 is DCT8x16, which actually is wider than it is tall
2022-11-11 09:24:07	but maybe this is a confusion of Width x Height with Rows x Columns
2022-11-11 09:26:11	and ID6 is DCT16x8, which is actually taller than it is wide
2022-11-11 09:27:47	and there we go, a beautiful block map

_wb_

2022-11-11 09:31:52

What's <@268284145820631040> 's take on this? Seems like a big thing if rows and columns got swapped...

Traneptora

2022-11-11 09:32:37	in either case, once I started laying 8x16 blocks Horizontally and 16x8 blocks Vertically, it worked
2022-11-11 09:32:47	so they might need to be relabeled in the spec
2022-11-11 09:33:11	either that or the spec needs to be more clear about what AxB means

yurume

2022-11-11 10:25:33	if you are saying DCT8x16 should be horizontal, I believe the specification does say that it's named as DCTRxC and since it has 8 rows and 16 columns it is obviously horizontal (i.e. landscape)
2022-11-11 10:27:56	I agree it is deeply confusing that while DCTRxC are consistently named, the specification's coordinate orders are much less consistent, in some cases a single section even refers to both notations
2022-11-11 10:30:39	and it doesn't help that varblocks are optionally transposed to make it use only one direction...
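
The DCTRxC reading described above (rows first, then columns) can be sketched as follows. `covered_blocks` is a hypothetical helper for illustration, not the libjxl `ACStrategy` API, and it only handles plain `DCTRxC` names.

```python
import re


def covered_blocks(name):
    """Parse a 'DCTRxC' name into (blocks_x, blocks_y) in 8x8-block units."""
    match = re.fullmatch(r"DCT(\d+)x(\d+)", name)
    if match is None:
        raise ValueError(f"not a DCTRxC name: {name!r}")
    rows, cols = int(match.group(1)), int(match.group(2))
    # Varblocks always occupy whole 8x8 cells, so round up (DCT4x8 covers 1x1).
    blocks_x = -(-cols // 8)  # horizontal extent comes from the column count
    blocks_y = -(-rows // 8)  # vertical extent comes from the row count
    return blocks_x, blocks_y


print(covered_blocks("DCT8x16"))  # 8 rows, 16 columns: wider than tall
print(covered_blocks("DCT16x8"))  # 16 rows, 8 columns: taller than wide
```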

veluca

2022-11-12 12:23:39

imagine how confusing this was while writing it down...

★コピペ

2022-11-12 02:33:44

wow <@853026420792360980> how do i get output like that

Traneptora

2022-11-12 03:35:34

I wrote some code to dump the block array as ascii art for debugging
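
The actual dumper is not shown in the conversation. A minimal sketch of the idea, assuming blocks are given as `(x, y, w, h)` tuples in 8x8-block units and labelled by placement index mod 100 (as mentioned earlier for readability), could look like this:

```python
def dump_block_map(width, height, blocks):
    """Render placed varblocks as text, one two-digit cell per 8x8 block.

    blocks is a list of (x, y, w, h) tuples in placement order. Each cell
    shows the block index mod 100; '..' marks a cell nothing has covered.
    """
    grid = [[".."] * width for _ in range(height)]
    for index, (x, y, w, h) in enumerate(blocks):
        label = f"{index % 100:02d}"
        for row in range(y, y + h):
            for col in range(x, x + w):
                grid[row][col] = label
    return "\n".join(" ".join(row) for row in grid)


art = dump_block_map(4, 2, [(0, 0, 2, 2), (2, 0, 2, 1), (2, 1, 1, 1)])
print(art)
```

Gaps show up as `..`, which makes a misplaced or mis-sized block easy to spot.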

_wb_

2022-11-12 06:43:14

(y,x) versus (x,y) is one of the most confusing things in images. Both make total sense: memory-layout-wise, (row, column) makes most sense, while the common math convention is (horizontal,vertical).

DZgas Ж

2022-11-12 09:07:51

:BlobYay:

Fraetor

2022-11-12 02:45:41

Is a JXL codestream always zero padded to a byte boundary, or is it only when it is in a codestream box?

Traneptora

2022-11-12 02:48:24

raw codestreams are defined as a sequence of bytes

Fraetor

2022-11-12 03:08:15

Brilliant, thanks.

Traneptora

2022-11-12 03:09:21	keep in mind that raw JXL codestreams are just files on a filesystem
2022-11-12 03:09:28	which are byte sequences, like all other files

Fraetor

2022-11-12 03:11:03

That is true. I was sort of wondering if it was left unspecified so you could save the extra 7 bits when transmitting over the network, but that is probably not worth the extra complexity, given that it only saves 7% on the absolute smallest possible codestream.

Traneptora

2022-11-12 03:11:31	well 7 bits at the end of it can't be unused
2022-11-12 03:11:55	the Frames themselves are all byte-aligned
2022-11-12 03:12:27	same with TOC entries
2022-11-12 03:13:09	in theory the last 7 bits of the last section of the last frame might be unused, but frame sections (well, TOC entries) are all byte-aligned

Fraetor

2022-11-12 03:13:21

Another question, is there any significance to the contents of the DBox for the JXL Signature box, or was it just a random number chosen to be unique? Its contents are "0xD 0xA 0x87 0xA", which is 218793738 as an integer.

Traneptora

2022-11-12 03:15:03	it's used for every JPEG-family member
2022-11-12 03:15:11	for example JPEG 2000 uses that same sequence
2022-11-12 03:16:14	why was it chosen then? I don't know
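
The integer quoted in the question can be reproduced by reading the four signature bytes as one big-endian value:

```python
payload = bytes([0x0D, 0x0A, 0x87, 0x0A])
value = int.from_bytes(payload, "big")  # most significant byte first
print(value, hex(value))
```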

Fraetor

2022-11-12 03:25:15

Do the boxes have to appear in any particular order? I assume the signature box has to appear at the start, but what about the others such as the file type box or level box?

Traneptora

2022-11-12 03:29:29	signature must be first, and ftyp must be second
2022-11-12 03:29:41	jxll must occur before the codestream start
2022-11-12 03:31:21	otherwise there aren't a ton of restrictions
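
The ordering rules above can be sketched as a small box walker. Several things here are assumptions rather than statements from the conversation: the boxes are taken to use the usual ISOBMFF layout (32-bit big-endian size, 4-byte type, size 1 meaning a 64-bit size follows, size 0 meaning "to end of file"), the codestream is assumed to live in `jxlc` or `jxlp` boxes, and `iter_boxes`, `check_box_order` and `make_box` are hypothetical names.

```python
import struct


def iter_boxes(data):
    """Yield (box_type, payload) for each top-level box in the container."""
    pos = 0
    while pos < len(data):
        if pos + 8 > len(data):
            raise ValueError("truncated box header")
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:  # 64-bit extended size follows the type field
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:  # box runs to the end of the file
            size = len(data) - pos
        if size < header or pos + size > len(data):
            raise ValueError("malformed box size")
        yield box_type, data[pos + header:pos + size]
        pos += size


def check_box_order(data):
    """Check the ordering constraints mentioned above; return the box types."""
    types = [box_type for box_type, _ in iter_boxes(data)]
    if types[:2] != [b"JXL ", b"ftyp"]:
        raise ValueError("signature and ftyp boxes must come first")
    codestream = [i for i, t in enumerate(types) if t in (b"jxlc", b"jxlp")]
    if b"jxll" in types and codestream and types.index(b"jxll") > codestream[0]:
        raise ValueError("jxll must precede the codestream")
    return types


def make_box(box_type, payload):
    """Build a box with a plain 32-bit size field (test helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


sample = (make_box(b"JXL ", bytes([0x0D, 0x0A, 0x87, 0x0A]))
          + make_box(b"ftyp", b"jxl \x00\x00\x00\x00jxl ")
          + make_box(b"jxll", b"\x0a")
          + make_box(b"jxlc", b"\xff\x0a"))
print(check_box_order(sample))
```

Only the first two boxes sit at fixed offsets; everything after `ftyp` has to be found by walking the box sizes.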

Fraetor

2022-11-12 03:34:30

Okay, so I'm fine hardcoding offsets for signature and ftyp, but not for anything else.

Traneptora

2022-11-12 03:35:24	correct
2022-11-12 03:35:52	what are you writing?

Fraetor

2022-11-12 03:36:35

***Very*** early stages, but a JPEG XL decoder: https://github.com/Fraetor/pyjxl

Traneptora

2022-11-12 04:22:41

huh

DZgas Ж

2022-11-12 04:27:44	https://tenor.com/view/huh-verne-over-the-hedge-turtle-disbelief-gif-24744867
2022-11-12 04:48:57	:PepeSad:

Fraetor

2022-11-12 05:17:34	Like I said, very early stages.
2022-11-12 05:17:59	(Also I haven't packaged it yet. :)

Traneptora

2022-11-12 07:59:16	this has been so useful for debugging ngl
2022-11-12 07:59:17
2022-11-12 07:59:32	this dumb ascii art block dumper
2022-11-12 08:01:33	I search for the block that crashes it
2022-11-12 08:01:59	oh, 475 overruns a buffer, why? turns out it's a horizontal block but the order type of the block is vertical, and I didn't transpose it
2022-11-12 08:02:16	why wasn't this caught before? cause this one occurs on a group boundary, so it actually overruns the buffer
2022-11-12 08:03:00	either way, HF Coefficients are now decoding
2022-11-12 08:03:11	which means I'm now reading all the stuff in all the passgroups
2022-11-12 08:03:21	I just now have to actually run the inverse DCT
2022-11-12 08:19:34	but it appears I now have successfully parsed all frame data sections
2022-11-12 08:19:53	so I just need to actually compute the VarDCT Image

_wb_

2022-11-12 08:23:54

https://tenor.com/view/good-luck-sign-of-the-cross-best-luck-gif-15229225

Traneptora

2022-11-12 08:30:32	that's probably a task for some other time though
2022-11-12 08:30:39	at this point I'm going to start cleaning up the code I just wrote
2022-11-12 08:30:58	I still haven't tested image widths and heights that aren't a multiple of 8
2022-11-12 08:32:38	or images with more than one LFGroup
2022-11-12 09:10:06	or images with JpegUpsampling ._.
2022-11-12 09:10:07	that's a pain
2022-11-12 09:20:23	> The decoder proceeds by decoding varblocks in raster order; for each varblock it reads channels Y, X, then B; if a channel is subsampled, its varblocks are skipped unless the varblock corresponds to the top-left corner of a non-subsampled varblock.
2022-11-12 09:20:32	Emphasis mine
2022-11-12 09:22:35	does this mean that for 4:2:0 subsampling I basically skip all X/B varblocks unless the block x coordinate and y coordinate are even?
2022-11-12 09:22:53	coordinate in DctSelect
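
The interpretation asked about here can be written down as a predicate. This is a sketch of the question's reading only, with 8x8 varblocks as in a JPEG recompression; `channel_is_coded` and the shift parameters are hypothetical names, where a shift of 1 means the channel is halved in that direction.

```python
def channel_is_coded(bx, by, hshift, vshift):
    """True if a subsampled channel has coefficients at block (bx, by).

    bx, by are block coordinates in DctSelect; hshift/vshift are the
    channel's horizontal and vertical subsampling shifts.
    """
    return bx % (1 << hshift) == 0 and by % (1 << vshift) == 0


# 4:2:0 chroma (both shifts 1): only even/even block positions carry X/B data.
for by in range(2):
    print([channel_is_coded(bx, by, 1, 1) for bx in range(4)])
```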

fab

2022-11-12 09:36:22

https://it.wikipedia.org/w/index.php?title=JPEG_XL&diff=prev&oldid=130445753

veluca

2022-11-12 09:37:10

yep

Traneptora

2022-11-12 09:37:20

<#1037323113643384844>

fab

2022-11-12 09:37:39

veluca is this change great

Traneptora

2022-11-12 09:37:53

does veluca speak italian?

veluca

2022-11-12 09:38:11

I do

Traneptora

2022-11-12 11:10:22	and I figured it out
2022-11-12 11:10:43	now successfully decoding the HFCoefficients for a jpeg subsampled image
2022-11-12 11:13:49	all I gotta do is work on what happens if the width and height are not multiples of 8
2022-11-12 11:19:05	Consider a 500x606 image
2022-11-12 11:19:14	which has one LF group for the whole image
2022-11-12 11:19:30	would the dimensions of the LF Coefficient subbitstream be 32x76?
2022-11-12 11:20:17	if it's subsampled at 4:2:2
2022-11-12 11:21:33	or would it be 31x76?
2022-11-12 11:22:02	since if you do ceildiv by 8 you get 63x76
2022-11-12 11:22:19	and if it's subsampled, would you then do 31x76, or 32x76?
2022-11-12 11:22:25	it matters which order you shift and then ceildiv, is why I ask
2022-11-12 11:32:49	(in either case I'm failing to verify the modular subbitstream, but it's nice to know how it's supposed to work)

veluca

2022-11-13 08:48:59

IIRC shift is rounding up and done after ceildiv by 8
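
Applied to the 500x606 example, the order described here (ceildiv by 8 first, then a shift that rounds up) works out as below. `lf_dims` is a hypothetical helper, and 4:2:2 is assumed to mean a horizontal shift of 1 and no vertical shift.

```python
def ceil_div(a, b):
    """Integer division rounding up, without going through floats."""
    return -(-a // b)


def lf_dims(width, height, hshift, vshift):
    """LF channel size: ceildiv by 8, then shift with rounding up."""
    blocks_w = ceil_div(width, 8)
    blocks_h = ceil_div(height, 8)
    return ceil_div(blocks_w, 1 << hshift), ceil_div(blocks_h, 1 << vshift)


print(ceil_div(500, 8), ceil_div(606, 8))  # unsubsampled block counts
print(lf_dims(500, 606, 1, 0))             # horizontally subsampled channel
```

The 31 in the question comes from rounding 63 / 2 down instead of up.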

Fraetor

2022-11-13 02:06:02	what does this mean? > The resulting value is `(offset + v) Umod (1 << 32)`.
2022-11-13 02:06:30	Specifically I'm not familiar with the Umod operator.

_wb_

2022-11-13 02:29:55

Basically it means "use uint32_t overflow semantics on this one"
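
Since Python integers do not overflow, the wraparound has to be applied by hand. A minimal sketch, with `umod32` as a hypothetical helper name:

```python
def umod32(offset, v):
    """(offset + v) Umod (1 << 32): keep only the low 32 bits, like uint32_t."""
    return (offset + v) & 0xFFFFFFFF


print(umod32(0xFFFFFFFF, 2))  # wraps around past 2**32
print(umod32(10, 5))          # small values are unchanged
```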

Fraetor

2022-11-13 03:04:32	What does it mean to have a ratio of 0 in the SizeHeader?
2022-11-13 03:06:03	And am I correct in assuming that !ratio is true when ratio is zero, and false when ratio > 0?
2022-11-13 03:07:00	Ah, is a ratio of zero where one isn't set, and you instead have an explicit width?

yurume

2022-11-13 03:17:19

yes.

Fraetor

2022-11-13 03:19:11

Is there a reason that a ratio of 2 is 12 by 10 rather than 6 by 5?

yurume

2022-11-13 03:24:41	I believe there is no particular reason, considering that in the spec everything has an arbitrary precision.
2022-11-13 03:25:06	probably cause libjxl also had that line?

Traneptora

2022-11-13 03:35:34	these are C semantics
2022-11-13 03:35:52	in C, `if (foo)` is the same as `if (foo != 0)`
2022-11-13 03:36:25	so `if (!ratio)` means `if (! (ratio != 0))` i.e. `if (ratio == 0)`
2022-11-13 03:37:23	I don't believe there's any particular reason
2022-11-13 03:37:59	multiplying by 12 and then dividing by 10 will always yield the same result as multiplying by 6 and then dividing by 5
2022-11-13 03:40:02	more specifically, it's false when `ratio != 0`, not when `ratio > 0`
2022-11-13 03:40:09	in this case it can't be negative so it's the same
2022-11-13 03:40:13	but it's not always the same, in case they are signed
2022-11-13 03:40:30	in python, C, etc. most languages it's the `%` sign
2022-11-13 03:41:04	```python
$ python3
Python 3.10.8 (main, Nov 1 2022, 14:18:21) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 13 % 5
3
>>>
```
2022-11-13 03:41:21	it's the remainder when you integer-divide
2022-11-13 03:42:52	indeed, that's correct. `ratio == 0` means 'read it' `ratio != 0` means "calculate it from height"
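
The `ratio` logic discussed above can be sketched like this. Only the 12:10 entry for ratio 2 is confirmed in this conversation; the other table entries are my recollection of the spec and should be checked against it. `width_from_header` is a hypothetical helper that takes already-decoded values rather than reading bits.

```python
# (numerator, denominator) per ratio value. ASSUMPTION: only ratio 2 = 12:10
# is confirmed above; verify the remaining entries against the specification.
RATIOS = {1: (1, 1), 2: (12, 10), 3: (4, 3), 4: (3, 2),
          5: (16, 9), 6: (5, 4), 7: (2, 1)}


def width_from_header(height, ratio, explicit_width=None):
    """ratio == 0: the width was read explicitly; otherwise derive it."""
    if ratio == 0:
        if explicit_width is None:
            raise ValueError("ratio 0 requires an explicitly coded width")
        return explicit_width
    num, den = RATIOS[ratio]
    return height * num // den  # integer arithmetic, no floats


print(width_from_header(256, 0, explicit_width=512))
print(width_from_header(100, 2))
# 12/10 and 6/5 give identical results under integer division:
print(all(h * 12 // 10 == h * 6 // 5 for h in range(1, 100000)))
```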

Fraetor

2022-11-13 03:44:54

I've got output! It is only a little bit wrong.
```
$ jxlinfo minimal.jxl
JPEG XL image, 512x256, lossy, 8-bit RGB
Color space: RGB, D65, sRGB primaries, sRGB transfer function, rendering intent: Relative
$ python3 -m pyjxl minimal.jxl
Decoding minimal.jxl and printing output...
Width: 535
Height: 357
```

Traneptora

2022-11-13 03:45:35	did you remember to skip past `0xFF 0x0A` at the beginning of the codestream?
2022-11-13 03:46:21	all codestreams start with that as a signature

Fraetor

2022-11-13 03:46:32

Yep. I am assuming that the SizeHeader is a maximum of 9 bytes, but I think that should always be the case.

Traneptora

2022-11-13 03:46:39	it is not always the case
2022-11-13 03:46:44	wait 9 bytes? yea
2022-11-13 03:46:50	not 9 bits
2022-11-13 03:46:56	it's a good idea to think in bits
2022-11-13 03:47:24	first thing to check, are you reading the small size header bit correctly?

Fraetor

2022-11-13 03:47:25

Yeah, I'm realising that the headers are not byte aligned.

Traneptora

2022-11-13 03:47:30	indeed
2022-11-13 03:47:37	nothing is byte-aligned unless it specifically says it is

Fraetor

2022-11-13 03:47:41

Time to add some more prints.

Traneptora

2022-11-13 03:47:45	the first step is going to brew some sort of bitreader
2022-11-13 03:47:58	that can read bits
2022-11-13 03:48:02	rather than bytes

Fraetor

2022-11-13 03:49:07	Yeah, I've gotten a simple one. I do have some initial logic that handles the code stream in bytes, to skip past the signature, but I should probably remove that if nothing else is going to be byte aligned.
2022-11-13 03:53:43	Removing the byte handling has changed my size to different, but still wrong values.

_wb_

2022-11-13 03:57:43

Make sure to read bits in the right order, it is reading bits from lsb to msb

Fraetor

2022-11-13 03:59:23	Yep, that seems to have fixed it. I was reading the bytes big endian.
2022-11-13 04:00:28	The minimal case works now, but not a different image.
2022-11-13 04:15:57	I think my non div8 code is working, but the div8 stuff returns the wrong values.
2022-11-13 04:18:32	I wasn't advancing my bit pointer when reading a div8 width or height.
2022-11-13 04:21:14	I think (🙏) it is working now.
2022-11-13 04:37:08	Well, I'm going to call it there for today, but I think that is some progress.

Traneptora

2022-11-13 07:08:59	might want to roll up a class for that, or at least a function that automatically does that
2022-11-13 07:09:02	it'll get pretty tedious
2022-11-13 07:11:09	also since this is in python, don't forget that `/` is floating point division and `//` is integer division
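
A short illustration of why the distinction matters in a decoder: `/` always goes through a float, which silently loses precision for large integers, while `//` stays in exact integer arithmetic.

```python
width = 500
print(width / 8)   # true division, always a float
print(width // 8)  # floor division, stays an int

big = (1 << 53) + 1          # not exactly representable as a double
print(int(big / 1) == big)   # the float round trip drops the low bit
print(big // 1 == big)       # integer division is exact
```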

Fraetor

2022-11-13 08:27:30

The class is a good shout. Saves me from adding a `shift += n` after every read.
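
A minimal sketch of such a class, assuming the LSB-first bit order mentioned earlier (within each byte, the least significant bit is read first, and the first bit read becomes the least significant bit of the result). This is an illustration, not the pyjxl or jxlatte implementation, and it reads one bit at a time for clarity rather than speed.

```python
class BitReader:
    """Read unsigned integers from a byte string, least significant bit first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read(self, n):
        """Read n bits and return them as an unsigned integer."""
        value = 0
        for i in range(n):
            byte_index, bit_index = divmod(self.pos, 8)
            if byte_index >= len(self.data):
                raise EOFError("read past the end of the codestream")
            bit = (self.data[byte_index] >> bit_index) & 1
            value |= bit << i  # first bit read is the lowest bit of the result
            self.pos += 1
        return value

    def read_bool(self):
        return self.read(1) == 1


reader = BitReader(bytes([0xFF, 0x0A, 0b10110100]))
print(hex(reader.read(16)))  # the FF 0A signature bytes, read as one value
print(reader.read(3), reader.read(5), reader.pos)
```

Because the position advances inside `read`, callers no longer need a manual `shift += n` after every field.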

yurume

2022-11-13 09:13:23	you will need the class a lot, so make sure to have good tests 😉
2022-11-13 09:13:54	(it is so annoying when I track down some obscure bug to a silly and easily testable bug in fundamental routines...)

Traneptora

2022-11-13 09:41:48	yup, hit a few of those
2022-11-13 09:51:48	ngl, writing a jxl decoder is probably biting off more than you can chew right now
2022-11-13 09:52:17	I don't mean that disparagingly but this is a big project

Fraetor

2022-11-13 09:52:59

Fair, but it is an interesting challenge to learn new things. How far I get is something to see.

Traneptora

2022-11-13 09:53:17	I'm already at ~8000 lines of code into jxlatte
2022-11-13 11:10:27	alright, dequantization is done (but not tested)
2022-11-13 11:10:35	up next is to program decorrelation
2022-11-13 11:10:38	and then finally IDCT
2022-11-14 12:40:49	I'm not sure how % interacts with negative integers
2022-11-14 12:40:56	I think it's UB
2022-11-14 12:41:07	same with right shifting negative integers

yurume

2022-11-14 06:13:09

to prevent the accidental interpretation as C behavior, where `%` isn't fully defined

veluca

2022-11-14 10:48:05	% is fully defined for negative integers
2022-11-14 10:48:18	just not in a way that usually people find useful xD (i.e. -1 % 2 == -1)

lonjil

2022-11-14 11:05:45

It's defined in C99 and later, but not in C89 (technically speaking, % has the same definition in all versions of C, and it's actually / that has changed, which of course impacts %)
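
For anyone porting between the two languages: Python's `%` follows floor division, while C99's follows division truncated toward zero, so the results differ when the operands have opposite signs. A sketch that emulates the C99 behaviour in Python (`c_div` and `c_rem` are hypothetical helper names):

```python
def c_div(a, b):
    """Integer division truncating toward zero, as C99 defines `/`."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def c_rem(a, b):
    """Remainder matching C99 `%`: satisfies a == b * c_div(a, b) + c_rem(a, b)."""
    return a - b * c_div(a, b)


print(-1 % 2)        # Python: result takes the sign of the divisor
print(c_rem(-1, 2))  # C99: result takes the sign of the dividend
```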

Traneptora

2022-11-14 12:54:08	I'm trying to figure out where to place the LF Coefficients and the HF Coefficients before taking the IDCT
2022-11-14 12:54:37	I have a single LF group and it appears every LF Coefficient is 0.8457765667574931
2022-11-14 12:54:51	and every HF coefficient is zero
2022-11-14 12:55:09	but since I have 64x64 DCTs, if I were to populate one out of every 8 pixels with the LF coefficients, it wouldn't end up giving me what I want
2022-11-14 12:55:25	it instead gives me this
2022-11-14 12:55:33	do I put the LF coefficients in the upper-left 8x8-block?

_wb_

2022-11-14 01:05:37

You have to convert the LF data to low-freq HF coeffs (LLF) as described in I.6.6

Traneptora

2022-11-14 01:06:17	oh, I missed that
2022-11-14 01:06:18	thanks
2022-11-14 01:07:19	ngl the fact that it's way at the end tripped me up

_wb_

2022-11-14 01:08:43

yep, looks to me like that's not an optimal presentation order

veluca

2022-11-14 02:17:17	ah yeah that doesn't help
2022-11-14 02:17:27	(also in general I'd suggest starting with 8x8 DCT)

Traneptora

2022-11-14 05:41:45

and jxlatte correctly decoded its first VarDCT image!

diskorduser

2022-11-14 05:47:17	Cool.
2022-11-14 05:52:21	Can it decode the images in <#824000991891554375>?

Traneptora

2022-11-14 05:52:39	Yes, those are modular
2022-11-14 05:52:45	I already finished that
2022-11-14 06:57:57	what does this look like to you guys?
2022-11-14 06:58:08	Does it look like I simply didn't run gab and epf, or is there something else going on there?

yurume

2022-11-14 07:00:07

to my knowledge the lack of gab and epf can't produce that much effect for typical cases

Traneptora

2022-11-14 07:01:49	might be an issue with the HF coefficients
2022-11-14 07:02:18	also since this is a jpeg reconstruction they're both probably false

_wb_

2022-11-14 07:02:53

Maybe dequant is wrong

Traneptora

2022-11-14 07:03:24

most likely. the numerators and denominators get so mixed up

_wb_

2022-11-14 07:05:05

The image looks like the shape of the hf stuff is right but the amplitude is very wrong

Traneptora

2022-11-14 07:05:47	also, I think it's channels-specific
2022-11-14 07:05:52	here's a lenna example
2022-11-14 07:06:31	it looks like the dequant might be swapped between Cb and Y?

_wb_

2022-11-14 07:06:49	That bench has a silly icc profile so no idea what channel is doing what there
2022-11-14 07:08:53	I think the channel order is Cb,Y,Cr (but most things are encoded in order 1,0,2 so Y,Cb,Cr)

Traneptora

2022-11-14 07:09:17	hm, I already account for that, and swapping the decoding didn't change the problem
2022-11-14 07:09:53	this is a JPEG reconstruction, so it's using RAW quant weights

_wb_

2022-11-14 07:10:48

Somehow looks like the sign of the coeffs got flipped

Traneptora

2022-11-14 07:12:04	flipping all the signs didn't change much
2022-11-14 07:12:27	magnitude is still probably wrong

Jyrki Alakuijala

2022-11-14 10:45:26	green-bench.png
2022-11-14 10:46:07	I'm happy that you use my test images 😄
2022-11-14 10:47:07	no, gab and epf are more 'subtle'
2022-11-14 10:49:40	you should be proud nonetheless!!

daniilmaks

2022-11-14 11:24:39	is there a command for changing the luma vs chroma bias somehow?
2022-11-14 11:25:33	finally started experimenting with jxl and found the chroma blocking atrocious
2022-11-14 11:26:57	not as bad as jpeg xyb tho
2022-11-14 11:27:48	it's exactly the issue displayed here https://discord.com/channels/794206087879852103/794206087879852106/1038517128610992159
2022-11-14 11:28:43	transitions between light skin tones and dark tones are specially troublesome
2022-11-14 11:32:01	both blues and greens will often staircase along the edge of fingers and such

_wb_

2022-11-14 11:38:37

not afaik, though it should be not too hard to add that. What kind of quality/distance are you looking at?

daniilmaks

2022-11-14 11:39:06	no idea. all default
2022-11-14 11:39:33	d1.0 I think that's the quality or distance whatever
2022-11-14 11:40:02	png input

_wb_

2022-11-14 11:40:27

can you give an example of an input image where default gives atrocious chroma blocking?

daniilmaks

2022-11-14 11:40:27	(I know jpegs are treated special)
2022-11-14 11:41:44	I'm testing on real, personal material so let me look for something I can share
2022-11-14 11:44:06	I'll just grab something from unsplash
2022-11-14 11:45:22	actually no... everything's jpeg
2022-11-14 11:47:37	I mean it's exactly like this, so that's bad as it is, I dunno how you don't notice
2022-11-14 11:50:36	(except the luma looks good by comparison)

_wb_

2022-11-14 11:51:47

Usually I don't zoom in to see what's happening at the pixel level

daniilmaks

2022-11-14 11:54:23	usually modern codecs perform better than older ones
2022-11-14 11:55:25	imma try epf 3

lonjil

2022-11-14 11:58:19

In my subjective opinion, color artefacts are more noticeable zoomed out than many other kinds of artefacts

daniilmaks

2022-11-14 11:59:40

I have very low tolerance for chroma artifacts, which is why in the beginning I had a poor view on jxl

monad

2022-11-15 12:00:49

Just target a lower distance?

daniilmaks

2022-11-15 12:07:52	I want to sacrifice luma for chroma tho. it looks way too good for what the chroma is doing
2022-11-15 12:08:25
2022-11-15 12:09:35	in practice yes I would be using slightly lower distance but I don't want to be restricted to high bpp

Traneptora

2022-11-15 12:10:15

what horrible chroma artifacts are you referring to?

daniilmaks

2022-11-15 12:11:37	you can see it along the top edges of the thumbs and the bokeh bubble at the left
2022-11-15 12:12:36	it's a lot more visible in higher resolution pics but I don't have samples that I can share

Traneptora

2022-11-15 12:12:41

I see a little green blip on the bubble but I'm not sure what you're referring to on the thumbs

daniilmaks

2022-11-15 12:13:16	the green and pink chroma staircase
2022-11-15 12:14:37	I have to go for the moment, will try to expand on this later with better samples

Cool Doggo

2022-11-15 12:17:59

Traneptora

2022-11-15 12:37:11	oh, that
2022-11-15 12:37:50	that looks like a legacy JPEG issue. How are you encoding these images?
2022-11-15 12:38:45	actually not only is the magnitude correct, but the resulting float pixels are correct
2022-11-15 12:38:52	so there's something happening afterward
2022-11-15 12:43:37	not sure what it is
2022-11-15 12:43:49	atm I'm just taking the output of the DCT and directly rendering it as a float
2022-11-15 12:44:15	well after going from YCbCr -> RGB that is
2022-11-15 12:46:07	there's also only one pass, so it's not that
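
For reference, the YCbCr -> RGB step mentioned above can be sketched with the usual JPEG (JFIF, BT.601 full-range) coefficients. This assumes float samples with chroma centred on 0; the exact offsets and scaling JPEG XL applies are not stated in this conversation and should be taken from the spec. `ycbcr_to_rgb` is a hypothetical helper.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one float sample using JFIF coefficients.

    ASSUMPTION: cb and cr are centred on 0 and y is already in the output
    range; any offset handling is left to the caller.
    """
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return r, g, b


print(ycbcr_to_rgb(0.5, 0.0, 0.0))  # zero chroma leaves a neutral grey
```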

daniilmaks

2022-11-15 01:15:34	the conversion is from png because I am aware jpeg gets special treatment by default
2022-11-15 01:19:41	at least effectively speaking. for transparency reasons the process here was: high res jpeg source (only natural camera noise in trouble areas) > 25% size png > jxl epf3. If you believe it's an issue that the source was technically jpeg and that it would've had a significant effect on the outcome after the heavy downscale, I will welcome the possibility if you can expand on it.
2022-11-15 01:20:34	https://unsplash.com/photos/MvXOE3e0LDs

Cool Doggo

2022-11-15 02:30:52

btw, what do you consider "high bpp"?

daniilmaks

2022-11-15 02:33:47	I don't use hard numbers to qualify that so I wouldn't know the threshold
2022-11-15 02:34:12	at least d0.5 probably
2022-11-15 02:35:02	or like fab likes to do... d0.49786452874767466907825

190n

2022-11-15 02:36:17

-d pi/10

Traneptora

2022-11-15 04:36:01	I meant, what encoding params, just `cjxl -d 1 --epf 3 input.png output.jxl`?
2022-11-15 05:58:40	Okay, upon further inspection, my HF coefficients are not correct, and some of them appear to be out of order
2022-11-15 05:59:13	what's weird is that the first Varblock is decoding perfectly, and the next varblock is transposed
2022-11-15 05:59:15	I can't figure out why
2022-11-15 06:01:11	this is before IDCT
2022-11-15 06:03:11	the HF coefficient in (0, 9) appears to be decoded to (1, 8) instead
2022-11-15 06:03:21	but the first varblock doesn't have this issue
2022-11-15 06:04:49	why would some varblocks be transposed and not others?

daniilmaks

2022-11-15 07:01:55	oh, those. yea, thats it, correct.
2022-11-15 07:11:22	(d and epf at end of command)

_wb_

2022-11-15 07:35:07	They should all be the same order
2022-11-15 07:35:52	Epf 3 is meant for way lower quality settings

veluca

2022-11-15 08:59:45	are those all the same size of blocks?
2022-11-15 08:59:54	because you can change coefficient order per block type

daniilmaks

2022-11-15 10:19:28

isn't it meant to reduce blocking at edges?

_wb_

2022-11-15 10:21:07	sure, but it shouldn't be necessary at d1 to reduce blocking that much, just the default (gaborish only iirc) should be enough
2022-11-15 10:21:57	<@532010383041363969> do you know where in the code to adjust the balance luma-chroma (Y vs X and B)?
2022-11-15 10:24:26	does this look like a chroma DC or AC issue?

daniilmaks

2022-11-15 10:27:00	I was quite literally about to ask in that front fr
2022-11-15 10:27:24	I mean not the code but about having such option

_wb_

2022-11-15 10:33:28

btw do you get better results when using `-e 8`?

daniilmaks

2022-11-15 10:34:40	Imma check
2022-11-15 10:38:23	8 and 9 are practically the same

_wb_

2022-11-15 10:39:04

that's expected, but is e8 any better than the default e7?

daniilmaks

2022-11-15 10:40:03	I mean 8 and 9 vs 7 are practically the same
2022-11-15 10:40:34	luma improves more than chroma, that I can tell
2022-11-15 10:44:12	the chroma stairs appear much more on pictures with low overall saturation so at least it does weight that somewhat.
2022-11-15 10:45:31	then again I don't want to make people look like tomatoes

Jyrki Alakuijala

2022-11-15 10:52:03	quantization matrices, then there are quick adjustments for B and X quantizations, two bits each (the idea is that we can quantize them a bit more for high quality)
2022-11-15 10:53:30	if we are seeing color quantization artefacts in zoomed use cases and that is a common problem, we could expose a 'more colors' mode that would likely make compression 2 % worse but be safer for red-green and blue-yellow
2022-11-15 10:53:45	it needs to be done at encoding time naturally
2022-11-15 10:54:18	it would impact the high quality (low distance) images more

daniilmaks

2022-11-15 10:56:05	yeah I'd like to be able to decide what to sacrifice and how much
2022-11-15 10:59:04	edge chroma stability is one of the weak areas of jxl vs the competition™️ so if it is possible to make that area better you'll have all the more strength of reason in your favor
2022-11-15 10:59:37	at least at mid to low fidelity that's it

_wb_

2022-11-15 11:01:42

an encode option to manually override the relative B/X precision could be useful here. Or maybe we should just change the defaults if they aren't good enough.

daniilmaks

2022-11-15 11:04:08

gtg rn

_wb_

2022-11-15 11:08:47

To what extent are these artifacts still visible without extreme zooming? I don't think we should aim for visually lossless at 8x zoom, at least not by default — perhaps what we need is an encode option to express the expected amount of zooming that will be used. E.g. browsers don't let you zoom more than 5x or so, so for web delivery there is no point to take extreme zooming into account (and if significant zooming is expected, imo you should provide higher-res images rather than getting the pixels perfect).

daniilmaks

2022-11-15 11:31:51	I saw it as an issue before zooming, this is not about pixel peeping
2022-11-15 11:32:51	i.e. noticeable at fullscreen
2022-11-15 11:56:40	I agree that these may sound like high fidelity requirements, but I'm not here to pat everyone's back at how good jxl is at high bpp. I'm here to make sure mid to lows are in fact at the limit of what can be achieved within specs.
2022-11-15 11:58:28	if much older codecs performed better in some regards, that leaves things to be desired.

_wb_

2022-11-15 12:13:12	Sure, thanks for the feedback! Subtle chroma issues are notoriously hard to get right by only looking at metrics, so we'll need to make sure we have some set of examples where this is currently noticeable and adjust the balance to make things better.
2022-11-15 12:16:23	To be sure: which version of libjxl are you currently using? I did make some changes to the DC/AC balance last month and those are not in the 0.7 release; it could be this change made things better or worse for these artifacts.
2022-11-15 12:17:30	If you have time, could you compare the results of v0.7 with those of the current git version to see if there's a difference?

daniilmaks

2022-11-15 12:18:26	`cjxl v0.8.0 d1fdb1f`
2022-11-15 12:20:11	thank you for all your work on it. I only hope for the best to the format, we needed a jpeg upgrade asap.

_wb_

2022-11-15 12:21:56

ok so that's a version from after my dc/ac adjustment, could you check if the v0.7 release is better?

daniilmaks

2022-11-15 12:22:29	sure
2022-11-15 12:32:22	they seem to trade blows both luma and chroma but let me test a bit more
2022-11-15 12:32:48	initial opinion is that there is probably no regression
2022-11-15 12:42:26	chroma stairs just shift around mostly
2022-11-15 12:43:32	I will have to leave it at that for the moment. I'll have more time at the end of the week
2022-11-15 12:50:19	My initial worry was that there might have been a bug that was causing all this but if it's supposed to be weighted like this then I will limit my request to being able to adjust bias and perhaps double check if there's a weight combo that doesn't give out blocky color transitions below high bpp usecases

Traneptora

2022-11-15 12:51:34

yea, they're all 8x8. this is a jpeg reconstruction so every varblock is DCT8x8

veluca

2022-11-15 01:11:45

Then they all should be permuted the same way

Traneptora

2022-11-15 01:15:09	then I have to investigate more
2022-11-15 04:13:09	okay, I'm especially confused now, lol
2022-11-15 04:13:31	I added a check to dump every varblock decoded with libjxl in DecodeACVarBlock
2022-11-15 04:13:36	and that matches up with my own code
2022-11-15 04:13:52	but then, I added the same dump code after LoadBlock
2022-11-15 04:13:59	and it's not matching up
2022-11-15 04:14:08	which means something is happening in the meantime and I don't know why
2022-11-15 06:04:17	and we have a success!
2022-11-15 06:14:01	hm, we have some errors though, I wonder if this is an issue with the channel correlation
2022-11-15 06:14:34	looks like Cr is being bumped upward somehow

_wb_

2022-11-15 06:21:13

Chroma from luma maybe?

Traneptora

2022-11-15 06:23:18	possibly
2022-11-15 06:23:48	Chroma from Luma is still performed on YCbCr images that aren't subsampled, right?

_wb_

2022-11-15 06:25:19

Yes

Traneptora

2022-11-15 06:29:57	disabling chroma from luma fixes many of the artifacts but also leaves some blocks unchromatized
2022-11-15 06:30:01	I wonder if the factor is wrong
2022-11-15 06:30:51	with chroma from luma
2022-11-15 06:30:58
2022-11-15 06:30:59	without
2022-11-15 06:31:20	looks like both Cb and Cr are being magnified

_wb_

2022-11-15 06:34:38

Maybe the sign is wrong, or it's the reciprocal or something

Traneptora

2022-11-15 06:55:04	never happened with an image with one group, so I suspected I must have been taking CFL from the wrong group
2022-11-15 06:55:05	turns out I was
2022-11-15 06:55:10	now it works!
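
A minimal sketch of how a chroma-from-luma (CfL) lookup can go wrong only when an image has more than one group. It assumes the default 256x256 group size, one set of CfL factors per 64x64 tile, and a factor map indexed in frame-wide tile coordinates. All function names and the map layout are hypothetical and are not taken from jxlatte or libjxl; this is one way such a bug can arise, not necessarily the one fixed above.

```python
GROUP_DIM = 256  # assumed default group size in pixels
TILE_DIM = 64    # assumed size of one chroma-from-luma tile in pixels
BLOCK_DIM = 8    # block positions are counted in 8x8 units


def cfl_tile_for_block(group_x, group_y, block_x, block_y):
    """Return (tile_x, tile_y) in a frame-wide factor map (hypothetical layout).

    group_x, group_y: index of the group within the frame.
    block_x, block_y: block position inside that group, in 8x8 units.
    """
    pixel_x = group_x * GROUP_DIM + block_x * BLOCK_DIM
    pixel_y = group_y * GROUP_DIM + block_y * BLOCK_DIM
    return pixel_x // TILE_DIM, pixel_y // TILE_DIM


def buggy_tile_for_block(block_x, block_y):
    """Group-local lookup: ignores which group the block belongs to."""
    return (block_x * BLOCK_DIM) // TILE_DIM, (block_y * BLOCK_DIM) // TILE_DIM


def apply_cfl(chroma, luma, factor):
    """Add the luma-predicted part back onto decoded chroma coefficients."""
    return [c + factor * y for c, y in zip(chroma, luma)]
```

For group (0, 0) both lookups agree, which matches the observation that single-group images decoded fine; for any other group the local lookup returns a tile that belongs to the first group.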

_wb_

2022-11-15 06:58:33

Yay!

Traneptora

2022-11-15 06:58:37	comparing lenna.jxl and lenna.png with imagemagick, I have an RMSE of 0.008 in the blue channel, and a peak absolute error of 2 pixels
2022-11-15 06:58:43	I think this might not be within tolerance
2022-11-15 06:59:01	it's 0.004 in the red and green channels, and peak absolute error of 1 pixel there

_wb_

2022-11-15 06:59:29

What are you comparing? Libjxl decode vs jxlatte decode?

Traneptora

2022-11-15 06:59:34	yup
2022-11-15 07:00:00	I'm wondering if libjxl fast decode heuristics are possibly out of tolerance instead of jxlatte
2022-11-15 07:00:12	it's also possible that I didn't clamp something to an integer that libjxl did
2022-11-15 07:00:35	for example I think libjxl might do some integer clamping with CFL that I didn't? idk

_wb_

2022-11-15 07:00:45

This is a recompressed jpeg right, so no xyb?

Traneptora

2022-11-15 07:00:50	correct, YCbCr
2022-11-15 07:01:14	I tried that first, because it meant I only had to implement DCT and no other transforms

_wb_

2022-11-15 07:01:29

Accuracy of idct should be quite good for both implementations, I assume

Traneptora

2022-11-15 07:01:38	IDCT accuracy to me is in double precision
2022-11-15 07:02:06	I suspect there might be some clamping to integers happening in CFL that matters

_wb_

2022-11-15 07:02:12

That's more precise than libjxl, but both are way more precise than libjpeg-turbo

Traneptora

2022-11-15 07:02:25	yea, but unlike JPEG, JXL decoding is fully specified
2022-11-15 07:02:36	so I feel responsible for getting within tolerance to the reference decoder
2022-11-15 07:03:06	turbojpeg is way off of the libjxl decode but that's to be expected

_wb_

2022-11-15 07:03:20

Decode side cfl should not round things to integers

Traneptora

2022-11-15 07:03:32

ah, I saw that code in libjxl so I wasn't sure

_wb_

2022-11-15 07:04:48

Iirc we had to do some things carefully to make sure cfl doesn't mess with being able to recover the original quantized coeffs

Traneptora

2022-11-15 07:04:52	according to the table I'm within the tolerance
2022-11-15 07:04:53
2022-11-15 07:05:12	I wonder if I'm actually being more accurate than libjxl, but libjxl is taking shortcuts and is still within tolerance?
2022-11-15 07:05:46	this is a level 5 image

_wb_

2022-11-15 07:06:09

We came up with the tolerances using a hacked libjxl that uses double precision instead of single

Traneptora

2022-11-15 07:06:23	RMS of 0.008 in the blue and 0.004 in rg is much less than 0.02
2022-11-15 07:06:40	and 2 pixels out of 255 is a peak error of less than 0.01
2022-11-15 07:06:55	is it possible that I'm more accurate here than libjxl?

_wb_

2022-11-15 07:07:03	Sure
2022-11-15 07:07:27	If you do things with doubles instead of floats, you'll be more accurate
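
A small sketch of the float-versus-double point, assuming a naive unnormalised 8-point inverse DCT (not the scaling or the fast algorithm any real decoder uses). Single precision is emulated by rounding after every arithmetic step; `f32` and `idct8` are hypothetical helper names.

```python
import math
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def idct8(coeffs, rnd=lambda v: v):
    """Naive 8-point inverse DCT; rnd is applied after every arithmetic step."""
    out = []
    for n in range(8):
        acc = rnd(coeffs[0] * 0.5)
        for k in range(1, 8):
            basis = rnd(math.cos(math.pi * k * (2 * n + 1) / 16))
            acc = rnd(acc + rnd(coeffs[k] * basis))
        out.append(acc)
    return out


coeffs = [0.8, -0.31, 0.12, 0.05, -0.07, 0.02, 0.01, -0.003]
exact = idct8(coeffs)                               # double precision
single = idct8([f32(c) for c in coeffs], rnd=f32)   # emulated single precision
max_diff = max(abs(a - b) for a, b in zip(exact, single))
```

For this input the two results differ, but `max_diff` stays below 1e-5, so a toy transform like this only shows that the precision gap exists, not how large it gets across a full decode pipeline.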

Traneptora

2022-11-15 07:07:56	but I might also have a bug which is causing different results
2022-11-15 07:08:05	I wonder if I wrap the codestream in a level 10 header, does stuff become more accurate in libjxl?
2022-11-15 07:08:34	since my peak error and rmse is not within the tolerance for level 10

_wb_

2022-11-15 07:08:56

Libjxl does the same thing regardless of level

Traneptora

2022-11-15 07:09:16

so if this codestream were level 10, either I'm not compliant or libjxl is not compliant

_wb_

2022-11-15 07:09:38

Yes. Or our tolerances are too strict and we need to bump them up 🙂

Traneptora

2022-11-15 07:09:42

perhaps

_wb_

2022-11-15 07:10:16

Have you compared pixel values before converting them to ints?

Traneptora

2022-11-15 07:10:33	not yet, I'm comparing 8-bit PNG output
2022-11-15 07:10:45	I could try comparing 16-bit PNG output

_wb_

2022-11-15 07:10:49

Can be useful to compare pfm output

Traneptora

2022-11-15 07:11:00	currently I can't output PFM, but I can output 16-bit PNG
2022-11-15 07:11:09	what is PFM?
2022-11-15 07:11:20	portable float map, but what's the format?
2022-11-15 07:11:42	if it's simple enough I could throw together a PFM output

_wb_

2022-11-15 07:12:38

You can just dump a float buffer and add a header in front, it can do both endiannesses so you can just use little endian like a float array would be

Traneptora

2022-11-15 07:13:21

ah so it's pretty simple

_wb_

2022-11-15 07:13:22	Maybe check 16-bit png first
2022-11-15 07:13:50	And maybe look at the signed errors
2022-11-15 07:14:03	Could be some silly rounding direction thing or something
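
A minimal sketch of the comparison being discussed: RMSE, peak absolute error, and the mean signed error between two decodes, with samples normalised to [0, 1]. The function name and the flat-list input are assumptions for illustration; a consistently positive or negative mean signed error would hint at a rounding-direction problem rather than random noise.

```python
import math


def error_stats(reference, candidate):
    """Compare two equal-length sequences of samples normalised to [0, 1].

    Returns (rmse, peak_abs_error, mean_signed_error).
    """
    if len(reference) != len(candidate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [c - r for r, c in zip(reference, candidate)]
    n = len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    peak = max(abs(d) for d in diffs)
    mean_signed = sum(diffs) / n
    return rmse, peak, mean_signed


# Example with 8-bit samples scaled by 1/255: one sample is off by 1, one by 2.
ref = [v / 255 for v in (10, 20, 30, 40)]
cand = [v / 255 for v in (10, 21, 30, 42)]
rmse, peak, mean_signed = error_stats(ref, cand)
```

Here the peak error is 2/255, the same "2 out of 255" case mentioned earlier, which is below 0.01.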

Traneptora

2022-11-15 07:30:11	peak error of `0.00601205`, or 394 in 16-bit space
2022-11-15 07:30:35	rmse of `0.00228393`
2022-11-15 07:31:46	I convert from float to integer by multiplying by max, and then rounding to nearest integer
2022-11-15 07:32:42	I tried multiplying by max+1 and then truncating, but that produced different results from libjxl for lossless images

_wb_

2022-11-15 07:33:07

Yeah no multiplying by max and rounding to nearest is correct

Traneptora

2022-11-15 07:33:16	it's "correct"
2022-11-15 07:33:47	multiplying by max+1 and then truncating creates 256 evenly sized bins in [0, 1]
2022-11-15 07:34:16	if you multiply by max and round, you end up with the lowest and highest bins as half-size
2022-11-15 07:34:50	so it's correct in the sense that it perfectly inverts the inverse transform, but both are viable options

_wb_

2022-11-15 07:34:57

Yeah but in principle values can be out of range so the lowest and highest bins are big enough:)
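
A sketch of the two float-to-integer conversions compared above, for 8-bit output. It counts how many evenly spaced samples in [0, 1] land in each output value; the sample grid is chosen so that no sample sits on a bin boundary. Function names are hypothetical, and the clamping reflects the remark that values can be out of range.

```python
import math
from collections import Counter


def quantize_round(x, maxval=255):
    """Multiply by max and round to nearest, clamping out-of-range input."""
    return min(maxval, max(0, math.floor(x * maxval + 0.5)))


def quantize_truncate(x, maxval=255):
    """Multiply by max+1 and truncate, clamping out-of-range input."""
    return min(maxval, max(0, math.floor(x * (maxval + 1))))


# Midpoints of N equal cells in [0, 1]; N is a multiple of both 255 and 256.
N = 2 * 255 * 256
samples = [(2 * i + 1) / (2 * N) for i in range(N)]
round_bins = Counter(quantize_round(x) for x in samples)
trunc_bins = Counter(quantize_truncate(x) for x in samples)
```

With rounding, output values 0 and 255 each receive 256 samples while every other value receives 512 (the half-size end bins); with truncation all 256 values receive 510. Rounding also maps `v / 255` back to `v` for every 8-bit `v`.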

Traneptora

2022-11-15 07:35:38	either way, I wonder what happens if I output to pfm
2022-11-15 07:35:45	does git main djxl support pfm output?

_wb_

2022-11-15 07:35:57

It should, yes

Traneptora

2022-11-15 07:36:41

what's the PFM header format?

_wb_

2022-11-15 07:37:04

Just produce one with djxl and copy it 🙂

Traneptora

2022-11-15 07:37:38	it might be binary for all I know
2022-11-15 07:37:47	but a quick google search showed that it's extraordinarily simple
2022-11-15 08:26:17	peak error of `0.00544693` in float, rmse of `0.00198064`
2022-11-15 08:26:38	still not 10^-4
2022-11-15 08:37:19	but well within the level5 tolerance
2022-11-15 08:37:37	already brewed up a PFM writer :KEKW:
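
A minimal sketch of what such a PFM writer can look like, following the "header plus raw float buffer" description above. It assumes the common PFM conventions: a `PF` magic for three-channel data, a negative scale value to signal little-endian samples, and scanlines stored bottom to top. The function name and the row-list input are hypothetical and unrelated to the jxlatte implementation.

```python
import struct


def pfm_bytes(width, height, rows):
    """Serialise RGB float samples as a little-endian PFM image.

    rows: scanlines from top to bottom, each a flat list of
    width * 3 floats (R, G, B interleaved).
    """
    if len(rows) != height or any(len(row) != width * 3 for row in rows):
        raise ValueError("pixel data does not match width/height")
    # "PF" = 3-channel colour; a negative scale marks little-endian data.
    header = b"PF\n%d %d\n-1.0\n" % (width, height)
    # PFM stores scanlines bottom-to-top, so reverse the row order.
    body = b"".join(
        struct.pack("<%df" % (width * 3), *row) for row in reversed(rows)
    )
    return header + body


# Usage: a 2x1 image; write the result with open(path, "wb").write(data).
data = pfm_bytes(2, 1, [[0.0, 0.5, 1.0, 1.0, 0.5, 0.0]])
```

Comparing against a file produced by another decoder, as suggested above, is a quick way to confirm the header and row order match.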

jox

2022-11-15 09:39:37

Does anyone know which options on https://jpegxl.io/ correspond with which cjxl options? Using that website to convert my test jpg (7.2 MB), it creates a jxl that is only 1 MB. How is this possible? Just comparing the photos, they look identical and the transcode option is checked. Converting the same image and using the default options in cjxl 0.7, creates a jxl that is 5.9 MB.

Traneptora

2022-11-15 09:43:08	likely it's a bug with the transcode option, as 7.2M -> 1MB won't happen for most JPG -> JXL transcodes
2022-11-15 09:45:44	just checked, indeed, it's not doing a jpeg transcode
2022-11-15 09:45:54	I tested my own file and it includes an 8x4 DCT
2022-11-15 09:46:01	all jpeg transcodes are 8x8 only

jox

2022-11-15 09:47:13

Thanks for the quick reply @Traneptora! Do you know which options could produce a lossy image like that? No matter which -d or -q option I use in cjxl, the file seems to be the same size.

Traneptora

2022-11-15 09:47:26	`cjxl input.jpg output.jxl` should do it by default
2022-11-15 09:47:36	oh, lossy?

jox

2022-11-15 09:47:44	Yeah
2022-11-15 09:48:20	I was really impressed with how good the image looked so I wanted to try and generate a similar jxl using cjxl

_wb_

2022-11-15 09:48:43	try `cjxl --lossless_jpeg=0`
2022-11-15 09:49:51	Maybe we should look at the jpeg quantization tables to estimate quality and if it looks like a very high quality one (say something straight from camera), we should do lossy transcode by default
2022-11-15 09:50:04	though that's maybe a bit confusing

Traneptora

2022-11-15 09:50:05	nah, I think that's a bad idea
2022-11-15 09:50:19	I think lossless transcode for jpeg input by default makes sense to me
2022-11-15 09:50:37	as it avoids the SURPRISE! for people who expect a lossless transcode, since that's what it usually does

jox

2022-11-15 09:51:09

Turns out it actually increased the file size by 1.1 MB

_wb_

2022-11-15 09:51:48

what? oh try adding -d 1

jox

2022-11-15 09:56:37

That seems to be it! The image is now even a bit smaller than the one from jpegxl.io

Traneptora

2022-11-15 10:01:06	interestingly, jxlinfo is reporting "possibly lossless"
2022-11-15 10:01:15	even though it's VarDCT
2022-11-15 10:01:17	why would that be?
2022-11-15 10:02:59	``` Encoding: VarDCT Type: Regular Size: 3264 x 2448 Origin: 0 x 0 YCbCr: false ```
2022-11-15 10:03:04	this to me says obviously lossy

_wb_

2022-11-15 10:03:39

it says possibly lossless when it's not xyb

Traneptora

2022-11-15 10:03:53

so it doesn't read the frame header

_wb_

2022-11-15 10:04:14

do we even expose the frame encoding in the api?

Traneptora

2022-11-15 10:04:29	oh, it's API based
2022-11-15 10:04:31	I see

_wb_

2022-11-15 10:06:04

if it's a recompressed jpeg it should find the jbrd and say something about it, so you can tell it's only "lossless" in the sense that it's a losslessly recompressed jpeg

Traneptora

2022-11-15 10:06:17

this is a raw codestream

_wb_

2022-11-15 10:06:43

ah, yes then it cannot tell except by looking at the frame header

Traneptora

2022-11-15 10:21:01

how useful is upsampling outside of jxl art?

_wb_

2022-11-15 10:31:07	well, we also use the 8x upsampling to show the DC when doing progressive, so it's useful for that
2022-11-15 10:31:43	it can be useful if for some silly reason you need to reach extremely low bpp
2022-11-15 10:32:50	there's a hypothesis that it might be useful for a layered encoder that sends e.g. a 1:4 image first and then fixes it up in a 1:1 layer
2022-11-15 10:34:41	for extra channels that are for some reason subsampled (e.g. depth and thermal often are), it could be a nicer way to upsample them than some simpler upsampling method (and it's nice that it's well-defined how to do it)

Traneptora

2022-11-15 11:19:56	but I meant, for most use cases wouldn't you rather just allow the client to upsample themselves?
2022-11-15 11:20:18	since there's plenty of strong optimized upsampling algorithms that can be done on the GPU, for example

_wb_

2022-11-15 11:39:30

Well the nonseparable upsampling we have in jxl is quite a bit better than most other (classical) upsampling methods. Better for diagonal lines, for example.

Traneptora

2022-11-15 11:39:56	it is, but it's also very slow
2022-11-15 11:40:20	and you can use high-quality upsamplers in software too, like EwaLanczos
2022-11-15 11:40:33	but they can also be done in hardware

daniilmaks

2022-11-15 11:40:47

does jxl upsampler even have a name?

Traneptora

2022-11-15 11:41:02	also is it nonseparable? it's a linear combination of two separable filters
2022-11-15 11:41:41	-4I + 0.16B, where B is the box filter and I is the identity

daniilmaks

2022-11-15 11:42:38

elon approved naming scheme I see

Traneptora

2022-11-15 11:42:49

that's the formula for it, not its name

daniilmaks

2022-11-15 11:44:51

understood

Traneptora

2022-11-15 11:49:24	I lied, that's the filter for noise, it's not the upsampling kernel
2022-11-15 11:50:25	the upsampling in JXL doesn't have an easy way to describe it

daniilmaks

2022-11-15 11:50:36

I understood that it is a formula, not that I understood the formula

Jyrki Alakuijala

2022-11-15 11:58:54	I didn't give it a name -- the inspiration was from using a median filter on a 4x4 nearest upsampled image and then light smoothing
2022-11-16 12:01:04	I'm not sure if it deserves a name, but if we need one, I'd propose JinkaFilter -- Jinka was a kind dog in my childhood home
2022-11-16 12:01:58	(she has been dead for 40 years already, but deserves to be remembered 🙂)

daniilmaks

2022-11-16 12:08:12	if it's part of the official jxl documentation it deserves a name
2022-11-16 12:11:20	I'd just call it jinka

Traneptora

2022-11-16 12:12:24

only issue with the name Jinka is possible conflation with Jinc

daniilmaks

2022-11-16 12:15:35

the sombrero function?

Traneptora

2022-11-16 12:58:07	hm, I wonder what's happening here
2022-11-16 01:02:08	it looks like chroma is possibly being pulled to the right by 2x
2022-11-16 01:02:14	but why does it reset every 8 varblocks?
2022-11-16 01:03:34	thinking about it numerically, that's 4 times per group, which doesn't quite make sense
2022-11-16 01:04:19	oh wait this image is much higher res than I thought, those are not 8 varblocks
2022-11-16 01:04:35	those big stripes are groups
2022-11-16 01:04:45	resetting every group makes sense
