JPEG XL


benchmarks

Traneptora
2025-04-26 12:32:13
15360x2048x3 is 94371840
2025-04-26 12:32:25
so 94M of that is used by the frontend to buffer tiles
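That arithmetic checks out (a quick sanity check on the numbers above, assuming one byte per sample; not from the chat):

```python
# Sanity check: a 15360x2048 strip of tiles with 3 channels,
# assuming 1 byte per sample.
width, strip_height, channels = 15360, 2048, 3
strip_bytes = width * strip_height * channels
print(strip_bytes)              # 94371840
print(strip_bytes / (1 << 20))  # 90.0 MiB (the "94M" counts decimal megabytes)
```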
RaveSteel cjxl: `Memory usage summary: heap total: 13113988389, heap peak: 1889278942, stack peak: 34032` hydrium: `Memory usage summary: heap total: 2709580194, heap peak: 425811326, stack peak: 4944`
2025-04-26 12:32:39
I wonder what the memory usage is with `--tile-size=3`?
RaveSteel
2025-04-26 12:33:00
I had still run the last result with `--tile-size=3`
Traneptora
2025-04-26 12:33:04
ah okay
RaveSteel
2025-04-26 12:33:07
One moment, let me repeat without it
Traneptora
2025-04-26 12:33:30
it'll be a bit more with `--one-frame` and a lot less with the default of `--tile-size=0`
RaveSteel
2025-04-26 12:34:31
`--tile-size=3`:
```
Memory usage summary: heap total: 4304122022, heap peak: 37249742, stack peak: 4944
```
`--one-frame`:
```
Memory usage summary: heap total: 3217604941, heap peak: 2546414918, stack peak: 4944
```
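For scale, comparing the `--tile-size=3` heap peak with the cjxl heap peak quoted earlier (plain arithmetic on the quoted figures, nothing new measured):

```python
# Heap peaks quoted in this thread (bytes): cjxl vs hydrium --tile-size=3.
cjxl_peak = 1_889_278_942
hydrium_peak = 37_249_742
ratio = cjxl_peak / hydrium_peak
print(f"cjxl ~{cjxl_peak / 1e6:.0f} MB vs hydrium ~{hydrium_peak / 1e6:.0f} MB, about {ratio:.0f}x lower")
```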
Traneptora
2025-04-26 12:34:57
Yea, for massive images it looks like it's not preferable
2025-04-26 12:35:02
to even libjxl
2025-04-26 12:35:07
to do `--one-frame`
2025-04-26 12:35:29
but 37M to encode that image, including buffering the decoded PNG, is pretty nice, ngl
RaveSteel
2025-04-26 12:36:13
Lossy with this specific image is much larger than lossless, always a bit silly when it happens
2025-04-26 12:36:23
Lossless JXL is 2.1MB
Traneptora
2025-04-26 12:36:35
hydrium has no psychovisual optim
2025-04-26 12:36:41
so it won't matter much for hydrium
RaveSteel
2025-04-26 12:37:37
Some images just work best in lossless
Traneptora
2025-04-26 12:37:47
I have other plans too, like one option is to reduce the memory footprint of the large tile size stuff
2025-04-26 12:37:55
by having the client send tiles in groups anyway
2025-04-26 12:40:05
running the forward DCT on the group rather than the LF group, computing the coeffs, and then discarding the XYB buffer is probably better
jonnyawsom3
RaveSteel Well, could be that it takes some time, but I waited for 5 minutes and it didn't decode, so I assume it will not even if I wait longer
2025-04-26 12:50:31
Default Hydrium takes so long because it encodes the image a tile at a time in separate layers/frames. So it can't do any threading for decoding, has to buffer all layers/frames at once, and then has to merge them all into a single image too
RaveSteel
2025-04-26 12:50:59
Ah, that explains the long decode
jonnyawsom3
2025-04-26 12:51:18
Add in that libjxl only decodes 1 frame/layer at a time, and you get an order of magnitude or two of slowdown
Traneptora
2025-04-26 01:05:33
ye
2025-04-26 01:05:46
it's really not pragmatic
2025-04-26 01:06:04
I want to try to make one-frame mode more useful
2025-04-26 01:07:24
it may be helpful to use some heuristics for HF coeff histograms
2025-04-26 01:07:39
so I don't have to buffer the whole stream
jonnyawsom3
RaveSteel
2025-04-26 01:14:55
A throwback https://discord.com/channels/794206087879852103/803645746661425173/1284124900943990816
RaveSteel
2025-04-26 01:15:47
true xddd
jonnyawsom3
2025-04-26 01:16:34
Maybe one day we can have JXL powered screenshots in game. It's a perfect target, tile and sprite based chunked encoding
RaveSteel
2025-04-26 01:17:53
I still hope that Valve will add JXL screenshots to gamescope - right after they add proper HDR AVIF screenshots to the steamdeck
2025-04-26 01:18:08
Because the deck will always only create tonemapped SDR PNGs
jonnyawsom3
2025-04-26 01:25:36
I think it'll come down to someone making a PR
RaveSteel
2025-04-26 01:47:47
If I were capable I would
2025-04-26 01:47:55
Sadly I am not
jonnyawsom3
RaveSteel Much faster on my 7950x
```
main
$ time cjxl -d 0 -p OpenTTD_16K.png OpenTTD_16K_main.jxl
JPEG XL encoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 52390.6 kB (3.158 bpp).
15360 x 8640, 2.924 MP/s [2.92, 2.92], , 1 reps, 32 threads.

real	0m46,823s
user	1m3,303s
sys	0m5,618s
-------------------
PR
time cjxl -d 0 -p OpenTTD_16K.png OpenTTD_16K_pr.jxl
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 24022.1 kB (1.448 bpp).
15360 x 8640, 8.916 MP/s [8.92, 8.92], , 1 reps, 32 threads.

real	0m16,344s
user	0m54,789s
sys	0m5,524s
```
PR cjxl also improves the loading times
```
[Loader::qt] OpenTTD_16K_main.jxl "image/jxl" QImage::Format_RGBX64 "sRGB" transform:none 1279ms
[Loader::qt] OpenTTD_16K_pr.jxl "image/jxl" QImage::Format_RGBX64 "sRGB" transform:none 595ms
```
2025-04-26 12:53:16
Oh, I should've asked this sooner, but could you try lossless faster decoding piped into djxl too? The new levels seem slightly hardware/image dependent, but it's hard to tell
RaveSteel
2025-04-26 12:55:31
with progressive enabled?
jonnyawsom3
2025-04-26 12:58:50
Disabled, but it should technically work with both
RaveSteel
2025-04-26 01:10:42
For your build:
FD0
```
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
JPEG XL decoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 2192.1 kB (0.132 bpp).
15360 x 8640, 64.808 MP/s [64.81, 64.81], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 401.843 MP/s [401.84, 401.84], , 1 reps, 32 threads.

real	0m3,721s
user	0m48,255s
sys	0m10,650s
```
FD1
```
JPEG XL decoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 2286.5 kB (0.138 bpp).
15360 x 8640, 74.659 MP/s [74.66, 74.66], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 519.315 MP/s [519.31, 519.31], , 1 reps, 32 threads.

real	0m3,374s
user	0m38,087s
sys	0m11,628s
```
FD2
```
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
JPEG XL decoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 3481.0 kB (0.210 bpp).
15360 x 8640, 60.551 MP/s [60.55, 60.55], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 729.392 MP/s [729.39, 729.39], , 1 reps, 32 threads.

real	0m3,708s
user	0m37,167s
sys	0m19,042s
```
FD3
```
JPEG XL decoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 3920.9 kB (0.236 bpp).
15360 x 8640, 57.620 MP/s [57.62, 57.62], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 753.847 MP/s [753.85, 753.85], , 1 reps, 32 threads.

real	0m3,823s
user	0m39,584s
sys	0m19,405s
```
FD4
```
JPEG XL decoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 3903.0 kB (0.235 bpp).
15360 x 8640, 61.082 MP/s [61.08, 61.08], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 776.508 MP/s [776.51, 776.51], , 1 reps, 32 threads.

real	0m3,686s
user	0m36,774s
sys	0m18,932s
```
2025-04-26 01:10:58
Should I do main as well?
jonnyawsom3
2025-04-26 01:13:12
Okay, damn. Never seen speeds that high xD
2025-04-26 01:13:35
Safe to say it works. You can do main too, then we have more comparisons
RaveSteel
2025-04-26 01:16:17
For main:
FD0
```
JPEG XL encoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
JPEG XL decoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 2183.8 kB (0.132 bpp).
15360 x 8640, 47.976 MP/s [47.98, 47.98], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 257.206 MP/s [257.21, 257.21], , 1 reps, 32 threads.

real	0m4,653s
user	1m9,990s
sys	0m10,278s
```
FD1
```
JPEG XL decoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
JPEG XL encoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 68627.5 kB (4.137 bpp).
15360 x 8640, 85.531 MP/s [85.53, 85.53], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 530.164 MP/s [530.16, 530.16], , 1 reps, 32 threads.

real	0m3,223s
user	0m31,261s
sys	0m11,704s
```
FD2
```
JPEG XL decoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
JPEG XL encoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 68627.5 kB (4.137 bpp).
15360 x 8640, 85.807 MP/s [85.81, 85.81], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 522.203 MP/s [522.20, 522.20], , 1 reps, 32 threads.

real	0m3,211s
user	0m31,330s
sys	0m11,908s
```
FD3
```
JPEG XL encoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
JPEG XL decoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 2266.5 kB (0.137 bpp).
15360 x 8640, 67.566 MP/s [67.57, 67.57], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 511.467 MP/s [511.47, 511.47], , 1 reps, 32 threads.

real	0m3,604s
user	0m42,086s
sys	0m11,909s
```
FD4
```
JPEG XL decoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
JPEG XL encoder v0.11.1 794a5dcf [AVX2,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 2266.5 kB (0.137 bpp).
15360 x 8640, 67.030 MP/s [67.03, 67.03], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 511.677 MP/s [511.68, 511.68], , 1 reps, 32 threads.

real	0m3,624s
user	0m41,931s
sys	0m12,216s
```
2025-04-26 01:17:37
File size diffs for FD1 and FD2 are crazy
jonnyawsom3
2025-04-26 01:23:20
Yeah, that's what we originally set out to fix. Then we realised even when it *was* working, it was fundamentally flawed and was sometimes slower than normal. So we completely replaced all 4 levels, with new parameters that actually take advantage of the decoder's fast paths
2025-04-26 01:29:13
Oxide is even faster at FD4 IIRC, certainly for progressive
RaveSteel
2025-04-26 01:36:10
You certainly managed to improve it quite a lot! Anything else you'd like me to test?
jonnyawsom3
2025-04-26 01:46:47
I think the only other thing I did was fix delta palette, but that encodes at a fixed 0.04MP/s, so I wouldn't recommend it on the OpenTTD image xD
Quackdoc
2025-04-26 01:50:05
I can't wait to test in olive, but sadly olive doesn't build against recent ffmpeg so I gotta wait for the new re-write. Using the patch in an NLE is gonna be super nice though, and it'll probably beat prores in decode perf at this point
jonnyawsom3
2025-04-26 01:55:20
Nice thing is, the decode speed isn't dependent on effort with the new setup either. So you can use e9 and get closer to standard density (I think)
A homosapien
RaveSteel You certainly managed to improve it quite a lot! Anything else you'd like me to test?
2025-04-26 09:24:59
Decoding lossless progressive JXLs with our PR compared to main. Should be around 10% faster
jonnyawsom3
2025-04-26 09:29:47
Using FD4 too, around 30% faster
Paultimate
2025-04-26 10:20:22
What's the diff between the version of jxl here and now? https://github.com/WangXuan95/Image-Compression-Benchmark
jonnyawsom3
Paultimate What's the diff between the version of jxl here and now? https://github.com/WangXuan95/Image-Compression-Benchmark
2025-04-26 10:56:45
That uses v0.9, the major difference is that v0.10 added chunked encoding, massively lowering memory usage and allowing multithreaded encoding. Though, for some reason they disable multithreading, severely limiting JPEG XL compared to the others
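To put rough numbers on why chunked encoding lowers memory so much (my own back-of-envelope sketch with an assumed buffer layout, not figures from that benchmark):

```python
# Rough illustration: buffering the whole 15360x8640 canvas vs. one
# 256-row strip. Assumes 3 channels at 16 bits per sample; actual
# libjxl buffer layouts differ.
w, h, channels, bytes_per_sample = 15360, 8640, 3, 2
whole_image = w * h * channels * bytes_per_sample
strip = w * 256 * channels * bytes_per_sample
print(f"whole image: {whole_image / 1e6:.0f} MB")
print(f"one strip:   {strip / 1e6:.0f} MB")
```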
A homosapien
2025-04-26 11:06:08
Also it's limited to 8-bit only, not really making it future-proof
Orum
2025-04-26 11:11:59
that reminds me I really need to update my benchmarks
2025-04-26 11:13:00
was really hoping they'd release an official version with a fix for `-d 0 -e 1` first though
jonnyawsom3
Orum was really hoping they'd release an official version with a fix for `-d 0 -e 1` first though
2025-04-27 09:52:46
Did I miss something?
Orum
2025-04-27 05:01:54
https://github.com/libjxl/libjxl/issues/4026
jonnyawsom3
2025-04-27 08:53:10
Ah, yeah I'm surprised that wasn't put in a patch immediately
Orum
2025-04-27 09:16:16
I mean, it *is* in a "patch", but it's not in a release version of libjxl yet
jonnyawsom3
2025-04-27 09:47:02
That's what I meant, a patch release
Orum
2025-04-27 10:03:19
ahh 👌
CrushedAsian255
2025-04-27 10:10:02
It seems like an emergency feature
jonnyawsom3
RaveSteel For your build: FD0 to FD4 (full benchmark results quoted above)
2025-04-28 10:55:04
If my napkin maths work out, that should be 4K at roughly 45 fps. <@853026420792360980> I don't suppose you could expose `-faster_decoding` as an option for FFMPEG? With the new [PR](<https://github.com/libjxl/libjxl/pull/4201>) it can actually be viable as a lossless video format
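The napkin maths, spelled out (my own arithmetic, taking the FD0 and FD4 decode figures from the benchmarks above):

```python
# Decode throughput (MP/s) -> achievable frame rate at 4K (3840x2160).
mp_per_frame = 3840 * 2160 / 1e6  # ~8.29 MP per 4K frame
for label, mps in [("FD0", 401.8), ("FD4", 776.5)]:
    print(f"{label}: {mps / mp_per_frame:.0f} fps")
# FD0 lands near the "roughly 45 fps" estimate; FD4 roughly doubles it.
```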
Traneptora
If my napkin maths work out, that should be 4K at roughly 45 fps. <@853026420792360980> I don't suppose you could expose `-faster_decoding` as an option for FFMPEG? With the new [PR](<https://github.com/libjxl/libjxl/pull/4201>) it can actually be viable as a lossless video format
2025-04-29 01:19:44
It wouldn't be too hard
2025-04-29 01:19:54
I can try this week
2025-04-29 01:32:52
currently working on improving hydrium
2025-04-29 01:33:06
got a 15% speedup so far
2025-04-29 01:33:25
currently trying to reduce the number of malloc calls
_wb_
2025-04-30 12:48:18
Some decode speedup with a simple change: https://github.com/libjxl/libjxl/pull/4221
2025-04-30 01:05:09
Actually it is a bigger speedup than what I would expect also for other effort settings — possibly Huffman coding ends up getting used more often than I realized
jonnyawsom3
2025-04-30 02:30:42
I tried to find when huffman is used, but I only saw it for effort 1 and ICC profiles. It used to be enabled for faster decoding 2 and 3, but we disabled that in our PR since it had little impact
_wb_ Actually it is a bigger speedup than what I would expect also for other effort settings — possibly Huffman coding ends up getting used more often than I realized
2025-04-30 02:40:31
Has profiling ever been done to see what's using the most time? I feel like huffman being 20-30% would've shown up quite clearly
2025-04-30 02:45:53
I know there's no fast path for single node MA trees, since that's why Oxide is faster than libjxl at the new faster decoding 4 and progressive lossless, if I recall. The testing was a few weeks ago, so my memory is a bit fuzzy
_wb_
2025-04-30 02:54:07
IIRC even if Huffman coding is not forced (like it is in e1), it may still be used depending on the contents: there's some more signaling overhead for ANS so it is not necessarily worth using it
Jarek Duda
_wb_ IIRC even if Huffman coding is not forced (like it is in e1), it may still be used depending on the contents: there's some more signaling overhead for ANS so it is not necessarily worth using it
2025-04-30 04:09:48
while with Huffman we can only encode powers of 2, with accurate entropy coding we have a larger space in which to optimize the quantization of the probability distribution - we could also just use powers of 2, but we can do better. Here is the best I have found: https://arxiv.org/pdf/2106.06438
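A small numeric illustration of that point (my own toy example, not from the paper): Huffman code lengths imply power-of-two probabilities, while an ANS-style coder can quantize to a finer grid and land closer to the entropy, at the cost of signaling the distribution:

```python
import math

# Toy distribution (hypothetical, for illustration only).
p = [0.45, 0.30, 0.15, 0.10]
entropy = -sum(q * math.log2(q) for q in p)

# Huffman for this distribution yields code lengths 1, 2, 3, 3,
# i.e. implied probabilities 1/2, 1/4, 1/8, 1/8.
huff_cost = sum(q * l for q, l in zip(p, [1, 2, 3, 3]))

# ANS-style: quantize probabilities to multiples of 1/64.
counts = [max(1, round(q * 64)) for q in p]  # sums to 64 here
total = sum(counts)
ans_cost = sum(q * -math.log2(c / total) for q, c in zip(p, counts))

print(f"entropy {entropy:.4f}, Huffman {huff_cost:.4f}, ANS-ish {ans_cost:.4f}")
# The ANS-style cost sits between the entropy and the Huffman cost; the
# signaling overhead mentioned above is what this sketch leaves out.
```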
_wb_
2025-04-30 04:51:36
Maybe the speedup at higher effort is not caused by my PR, I was lazy and compared against v0.11.1 instead of comparing to the actual "before" version, and maybe some other recent changes improved decode speed too
jonnyawsom3
_wb_ Maybe the speedup at higher effort is not caused by my PR, I was lazy and compared against v0.11.1 instead of comparing to the actual "before" version, and maybe some other recent changes improved decode speed too
2025-04-30 05:32:09
Do we know the PR did anything then? Maybe effort 1 was already faster
_wb_
2025-04-30 05:33:13
Speeds of effort 1 were similar to v0.11.1 before the PR
jonnyawsom3
2025-04-30 05:33:24
Ah, good
CrushedAsian255
_wb_ Some decode speedup with a simple change: https://github.com/libjxl/libjxl/pull/4221
2025-04-30 10:43:03
Just wondering, why are there 2 calls to Consume() that just ignore the output?
2025-04-30 10:44:37
Or is that equivalent to just walking the read pointer forward?
jonnyawsom3
2025-05-01 02:17:22
<@1028567873007927297> An unforeseen benefit to RGB JPEG is that JXL transcoding is far more effective too. 55% smaller
embed
<@1028567873007927297> An unforeseen benefit to RGB JPEG is that JXL transcoding is far more effective too. 55% smaller
2025-05-01 02:17:44
https://embed.moe/https://cdn.discordapp.com/attachments/803645746661425173/1367323987843092571/RGB-JPEG.jxl?ex=68142b32&is=6812d9b2&hm=cebb549695418e279ae657f4c35d1020f14110a32414acde87ac94f1c9cb4570&
jonnyawsom3
2025-05-01 02:18:23
And naturally the bot is using a YCbCr JPEG xD
Demiurge
2025-05-01 02:18:32
Well, for an image like that, it's not surprising of course, lol
2025-05-01 02:18:54
Not unforeseen
2025-05-01 02:19:12
For a more natural image then it would be surprising
jonnyawsom3
2025-05-01 02:21:08
YCbCr manages 32%
Demiurge
2025-05-01 02:24:14
They're RGB gradients. Of course they'll be more efficient to compress in RGB.
jonnyawsom3
2025-05-01 02:25:35
Well, the original JPEG is almost twice the size in RGB, so I was surprised to see the JXL only be 4% larger
Fab
2025-05-01 09:58:28
Progress encoding JPEG
jonnyawsom3
Orum was really hoping they'd release an official version with a fix for `-d 0 -e 1` first though
2025-05-01 05:58:04
Thinking about this again https://github.com/libjxl/libjxl/issues/4026 and what else could be done before the next release
2025-05-01 05:59:01
I think I know a workaround for https://github.com/libjxl/libjxl/issues/3823 at the cost of higher memory usage, but I'll need to test it
2025-05-01 06:00:15
We've already fixed progressive lossless and faster decoding. Nearly done with a jpegli overhaul too which would be nice to get in
๐•ฐ๐–’๐–—๐–Š For butter 3-norm the splitting point where 444 better is: ``` cjpegli "${ref}" "${output}" -q "84" -p "2" --xyb --chroma_subsampling=420 cjpegli "${ref}" "${output}" -q "21" -p "2" --xyb --chroma_subsampling=444 ```
2025-05-01 07:00:09
I know it's been quite a while, but if it's not too much hassle, do you think you could test **non-XYB** 444 vs 420, and 444 Adaptive Quant vs No Adaptive Quant? The PR we're working on has thresholds set to enable 420 at low quality and to disable AQ at high quality, but I just realised the results I used were your XYB tests, so it may not apply to YCbCr
2025-05-01 07:01:48
And Fab, please try to stick to <#805176455658733570> or <#806898911091753051>
Fab
I know it's been quite a while, but if it's not too much hassle, do you think you could test **non-XYB** 444 vs 420, and 444 Adaptive Quant vs No Adaptive Quant? The PR we're working on has thresholds set to enable 420 at low quality and to disable AQ at high quality, but I just realised the results I used were your XYB tests, so it may not apply to YCbCr
2025-05-01 07:02:45
Temu encoder is meant for High fidelity
2025-05-01 07:03:33
It's like x264
Orum
Thinking about this again https://github.com/libjxl/libjxl/issues/4026 and what else could be done before the next release
2025-05-01 07:04:23
Well, I'm sure a lot, but the question is really, "What makes sense to include before the next release?"
jonnyawsom3
2025-05-01 07:07:18
I suppose what I meant is, what are low hanging fruit/important things we could do relatively quickly
2025-05-01 07:07:58
Progressive should be a one line fix if I'm right
2025-05-01 07:08:25
Or rather, a one line workaround. We encountered a similar issue for progressive lossless
𝕰𝖒𝖗𝖊
I know it's been quite a while, but if it's not too much hassle, do you think you could test **non-XYB** 444 vs 420, and 444 Adaptive Quant vs No Adaptive Quant? The PR we're working on has thresholds set to enable 420 at low quality and to disable AQ at high quality, but I just realised the results I used were your XYB tests, so it may not apply to YCbCr
2025-05-01 07:10:19
Why not, I can do it if it helps to improve anything. Should I recompile from source? (any new patches / updates I need to know about)
jonnyawsom3
๐•ฐ๐–’๐–—๐–Š Why not, I can do it if it helps to improve anything. Should I recompile from source? (any new patches / updates I need to know about)
2025-05-01 07:12:43
Shouldn't be anything new, thanks. I thought it makes more sense to ask you, since you already have the workflow made, than spend a day remaking the scripts myself
2025-05-01 08:01:45
Oh, and remember to take note of the crossover quality/distance between them, since that's what we need for the threshold haha
𝕰𝖒𝖗𝖊
Oh, and remember to take note of the crossover quality/distance between them, since what's what we need for the threshold haha
2025-05-01 09:08:10
To be clear: All of them should be **non-XYB** right? So I can directly compare these three in a single test:
- Non-XYB | Adaptive 444
- Non-XYB | Adaptive 420
- Non-XYB | Non-Adaptive 444
jonnyawsom3
๐•ฐ๐–’๐–—๐–Š To be clear: All of them should be **non-XYB** right? So I can directly compare these three in a single test: - Non-XYB | Adaptive 444 - Non-XYB | Adaptive 420 - Non-XYB | Non-Adaptive 444
2025-05-01 09:14:36
Yeah, though we're interested in the distance/quality crossover point of 444 vs 444 No-AQ, and 444 vs 420. Could also do 444 XYB vs 444 XYB No-AQ, then we can set a different threshold if it's a big difference to YCbCr, but it's not vital
2025-05-02 12:07:31
<@171262335468568576> one last thing, I can't remember if you test the full range from quality 0 to 100. These differences will most likely be around the 90 range or higher, so we'll want to make sure your benchmarks go all the way
𝕰𝖒𝖗𝖊
2025-05-02 12:08:25
Yeah, full range
<@171262335468568576> one last thing, I can't remember if you test the full range from quality 0 to 100. These differences will most likely be around the 90 range or higher, so we'll want to make sure your benchmarks go all the way
2025-05-02 03:08:26
coming soon
jonnyawsom3
Progressive should be a one line fix if I'm right
2025-05-02 05:10:21
Welp, I was right https://github.com/libjxl/libjxl/pull/4223
2025-05-02 05:10:44
Though formatting made it 5 :P
𝕰𝖒𝖗𝖊
2025-05-02 05:24:51
2025-05-02 05:25:17
non-XYB is not so clear <@238552565619359744>
2025-05-02 05:26:13
```
cjpegli -q "${q}" -p "2" -v -x icc_pathname=/to/the/path.icc
```
2025-05-02 05:31:30
The gap is closer with non-XYB:
```
# Crossovers
420 q=86 → 444 q=82
```
2025-05-02 05:31:38
and after 88, no-adapt 444 takes over
2025-05-02 05:32:00
With xyb the difference was extremely weird
2025-05-02 05:32:38
From the older test with XYB
jonnyawsom3
2025-05-02 05:32:48
I only realise now, but the previous XYB 420 tests were actually subsampling Luma, causing the significantly better scores at the very low quality range
2025-05-02 05:33:48
Specifying nothing was *actually* 420, subsampling the B channel instead of Y
𝕰𝖒𝖗𝖊
2025-05-02 05:35:26
In the previous test with XYB, up until q 84, the quality stayed in the lower range but it still managed to become more efficient
2025-05-02 05:36:07
specifying 420 with XYB is weird. This option could be disabled or mapped to another option
2025-05-02 05:36:47
It doesn't matter if it's efficient within the low quality range. The 0 to 25 quality range is not a very important use case
A homosapien
2025-05-02 05:39:31
Any higher than a distance of 5 I personally think is unusable
𝕰𝖒𝖗𝖊
2025-05-02 05:41:02
forcing or encouraging users to create higher quality images can also be a better tradeoff
jonnyawsom3
๐•ฐ๐–’๐–—๐–Š and after 88, no-adapt 444 takes over
2025-05-02 05:50:47
Don't suppose you have results for XYB 444 AQ vs XYB 444 No-AQ? Considering how varied XYB has been, it might need different values
𝕰𝖒𝖗𝖊
Don't suppose you have results for XYB 444 AQ vs XYB 444 No-AQ? Considering how varied XYB has been, it might need different values
2025-05-02 05:51:16
Can test it
jonnyawsom3
2025-05-02 05:51:21
The results so far have helped a lot by the way, thank you
𝕰𝖒𝖗𝖊
2025-05-02 03:26:19
Took some time: I can segmentize this to create a better view. This is with XYB.
2025-05-02 03:27:44
2025-05-02 03:28:05
2025-05-02 03:28:12
<@238552565619359744>
2025-05-02 03:32:15
There is a SSIMU2 crossover here on:
`420 q=74 → not-specified q=22`
`420 q=76 → 444 q=21`
`420 q=80 → not-specified-noadapt q=22`
`420 q=81 → 444-noadapt q=20`
Butter crossovers:
`420 q=69 → not-specified q=18`
`420 q=73 → 444 q=19`
2025-05-02 03:32:32
And there are many crossovers for others. Again, I can segmentize the graphs for better view if needed.
2025-05-02 03:37:15
The wisest approach is to use:
```Bash
cjpegli "${ref}" "${output}" -q "${q}" -p "2" -v --xyb -x "icc_pathname=/path/to/profile.icc" --chroma_subsampling=444
```
Where `q` is between `70` and `100`.
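For what it's worth, the crossover points being traded back and forth here can be computed rather than eyeballed. A sketch with made-up curves (not the actual benchmark data), finding where the score difference between two settings changes sign:

```python
# Hypothetical quality -> score curves for two encoder settings.
qualities = list(range(60, 101, 5))
score_444 = [70, 74, 78, 82, 86, 90, 93, 96, 99]
score_420 = [72, 76, 79, 83, 85, 88, 91, 93, 95]

def crossover(xs, a, b):
    """Return the x where curve a overtakes curve b (linear interpolation)."""
    for i in range(len(xs) - 1):
        d0, d1 = a[i] - b[i], a[i + 1] - b[i + 1]
        if d0 == 0:
            return xs[i]
        if d0 * d1 < 0:
            t = d0 / (d0 - d1)
            return xs[i] + t * (xs[i + 1] - xs[i])
    return None

print(crossover(qualities, score_444, score_420))  # 77.5 for this fake data
```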
jonnyawsom3
๐•ฐ๐–’๐–—๐–Š Took some time: I can segmentize this to create a better view. This is with XYB.
2025-05-02 06:09:37
Interesting, so Adaptive Quantization is always better for XYB, looking at the graphs. Unfortunately that's all I can use from the results, due to 420 being broken in main and only 444 being used for XYB by default in our fork for JXL compatibility
𝕰𝖒𝖗𝖊
Interesting, so Adaptive Quantization is always better for XYB, looking at the graphs. Unfortunately that's all I can use from the results, due to 420 being broken in main and only 444 being used for XYB by default in our fork for JXL compatibility
2025-05-02 06:21:21
yes, 444 AQ and 70+ quality
2025-05-03 09:08:11
**Command:**
```Bash
cjpegli -p "2" --chroma_subsampling=444 -v --xyb -x icc_pathname=/path/to/prof.icc
avifenc -a tune=iq -d "10" -y "444" -s "0" --ignore-xmp --ignore-exif
cjxl --lossless_jpeg="0" --allow_jpeg_reconstruction="0" -e "10" -x strip="all" -x icc_pathname="/path/to/prof.icc" --keep_invisible="0"
# For q100 JXL
cjxl --lossless_jpeg="1" --allow_jpeg_reconstruction="1" -e "10"
```
- **Source:** `134.5 MP (13385x10264)`
- **Filesize:** `101 MB`
- **Camera:** `Phase One IQ4`
- **Link to source:** https://flic.kr/p/2kXUoje
```Bash
$ file reference.jpg
reference.jpg: JPEG image data, Exif standard: [TIFF image data, little-endian, direntries=16, width=14204, height=10652, bps=206, compression=none, PhotometricInterpretation=RGB, manufacturer=Phase One, model=IQ4 150MP, orientation=upper-left], baseline, precision 8, 13385x10264, components 3
```
2025-05-03 09:08:33
<@238552565619359744> Should I have overridden the bit-depth to 10 for JXL?
2025-05-03 09:09:20
It's confusing how AVIF can be better than lossless JPG to JXL conversion here
jonnyawsom3
๐•ฐ๐–’๐–—๐–Š It's confusing how AVIF can be better than lossless JPG to JXL conversion here
2025-05-03 09:18:42
Is the SSIMULACRA2 chart showing JPEG Transcoding for JXL? Because that's lossless by definition, so not even hitting a score of 90 is very strange...
𝕰𝖒𝖗𝖊
2025-05-03 09:18:51
Yes
2025-05-03 09:19:06
the highest datapoint for JXL:
```
cjxl --lossless_jpeg="1" --allow_jpeg_reconstruction="1" -e "10"
```
2025-05-03 09:19:33
This feels like an error in metrics
2025-05-03 09:19:53
and avif loses some quality in metrics from png decoding
jonnyawsom3
2025-05-03 09:20:04
Could it be color management issues again? What if you try decoding with djxl to PNG and run SSIMULACRA2 again
𝕰𝖒𝖗𝖊
2025-05-03 09:26:24
```
ssimulacra2 reference.jpg jxl_q100.png
88.01649899
```
jonnyawsom3
2025-05-03 09:31:13
Is it within the expected 20% size reduction? Now I'm wondering if something bugged and it's not actually a transcode
𝕰𝖒𝖗𝖊
Is it within the expected 20% size reduction? Now I'm wondering if something bugged and it's not actually a transcode
2025-05-03 09:34:18
yes
```
du -b reference.jpg jxl_q100.jxl
105993211	reference.jpg
85950001	jxl_q100.jxl
```
2025-05-03 09:34:59
**18.91%** reduction
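That figure is consistent with the sizes quoted just above (plain arithmetic):

```python
# Sizes from the du output above (bytes).
ref, jxl = 105_993_211, 85_950_001
reduction = (ref - jxl) / ref * 100
print(f"{reduction:.2f}% smaller")  # 18.91% smaller
```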
2025-05-03 09:37:51
I will try to recompile `cjxl` to make sure if the binary works correctly
jonnyawsom3
2025-05-03 09:38:12
Hmm, ran a small test myself; even decoding the original JPEG to PNG only scores 91 when compared. Seems like it's interpreting the image differently. Tried decoding the JXL with both djxl and jxl-oxide to test dithering vs no dithering too, still 91
𝕰𝖒𝖗𝖊
2025-05-03 09:39:11
default intensity heatmap
jonnyawsom3
2025-05-03 09:40:08
<@794205442175402004> any idea why SSIMULACRA2 is capping at 90 for JPEG originals? I know the decoding differs, but I would've expected at least 95
๐•ฐ๐–’๐–—๐–Š It's confusing how AVIF can be better than lossless JPG to JXL conversion here
2025-05-03 09:43:05
Is the highest lossless AVIF?
𝕰𝖒𝖗𝖊
2025-05-03 09:43:10
no
2025-05-03 09:43:26
q93
2025-05-03 09:43:38
I stopped when it became bigger than the source image
jonnyawsom3
2025-05-03 09:43:44
Worth checking, the mystery deepens
𝕰𝖒𝖗𝖊
2025-05-03 09:45:24
To add to that:
```Bash
du -b reference.jpg jxl_q100.jxl avif_q93.avif
105993211	reference.jpg
85950001	jxl_q100.jxl
85905967	avif_q93.avif
```
jonnyawsom3
2025-05-03 09:54:11
Can you do a test encode of `cjxl reference.jpg Test.jxl -j 0 -d 0.1 -e 9` then try SSIMULACRA2?
𝕰𝖒𝖗𝖊
2025-05-03 09:54:29
yes
2025-05-03 09:57:26
```Bash
ssimulacra2 reference.jpg test.jxl
89.49191733

butteraugli_main "reference.jpg" "test.jxl" --intensity_target 203
6.0229372978
3-norm: 1.288902

butteraugli_main "reference.jpg" "test.jxl"
2.4833838940
3-norm: 0.693338
```
jonnyawsom3
2025-05-03 10:00:34
Hmm. Maybe something about that image in particular, or the huge resolution, is causing the metric to break down
2025-05-03 10:00:59
I was hitting 95 on a 1080p photo
𝕰𝖒𝖗𝖊
2025-05-03 10:02:10
```
ssimulacra2 1.jpg test.jxl
89.9177500
```
Another image
2025-05-03 10:02:40
1440*1799
2025-05-03 10:03:07
I'll recompile and test again
2025-05-03 10:20:43
```Bash
100% tests passed, 0 tests failed out of 8137

[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from GaussBlurTest
[ RUN      ] GaussBlurTest.DISABLED_SlowTestDirac1D
Max abs err: 7.82543862e-03
[       OK ] GaussBlurTest.DISABLED_SlowTestDirac1D (26702 ms)
[----------] 1 test from GaussBlurTest (26702 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (26702 ms total)
[  PASSED  ] 1 test.
```
Trying again
2025-05-03 10:25:56
```Bash
ssimulacra2 aa.jpg aa.jxl
92.65919079
# E11 without jpg reconstruction (3x size of the original)
ssimulacra2 aa.jpg aa.jxl
100.00000000
```
Another random image.
2025-05-03 10:26:04
Now, I'll try the actual image again
2025-05-03 10:28:23
```Bash
cjxl --lossless_jpeg="1" --allow_jpeg_reconstruction="1" -e "10" reference.jpg test.jxl
JPEG XL encoder v0.12.0 6bd5a2ce [_AVX2_,SSE4,SSE2]
Encoding [JPEG, lossless transcode, effort: 10]
Compressed to 85950.8 kB including container

ssimulacra2 reference.jpg test.jxl
88.39863123
```
2025-05-03 10:28:35
<@238552565619359744> No luck ๐Ÿ˜
jonnyawsom3
2025-05-03 10:44:07
Very strange, at least Butter shows the sudden quality jump from being lossless, but somehow still worse than AVIF... Definitely adds up to something being broken
๐•ฐ๐–’๐–—๐–Š
2025-05-03 11:47:02
yeah, even for maxnorm; avif seems so stable and linear
A homosapien
๐•ฐ๐–’๐–—๐–Š Now, I'll try the actual image again
2025-05-03 11:55:02
https://fileditchfiles.me/file.php?f=/s22/xkrWBFoRzAToymUeKT.png Try this version of the image. I converted it to PNG, slightly downscaled, and properly converted to sRGB.
๐•ฐ๐–’๐–—๐–Š
A homosapien https://fileditchfiles.me/file.php?f=/s22/xkrWBFoRzAToymUeKT.png Try this version of the image. I converted it to PNG, slightly downscaled, and properly converted to sRGB.
2025-05-04 12:15:50
worse:
```Bash
cjxl --lossless_jpeg="1" --allow_jpeg_reconstruction="1" -e "10" a.png test.jxl
ssimulacra2 a.png test.jxl
83.66310444
```
2025-05-04 12:16:21
oh it can't be lossless jpg, it's png ๐Ÿ˜
2025-05-04 12:16:26
This is the default quality, sorry
A homosapien https://fileditchfiles.me/file.php?f=/s22/xkrWBFoRzAToymUeKT.png Try this version of the image. I converted it to PNG, slightly downscaled, and properly converted to sRGB.
2025-05-04 12:17:36
but this doesn't have the ICC profile
2025-05-04 12:18:13
the original image has a huge ICC profile that is 222kb
2025-05-04 12:18:22
that's the point
2025-05-04 12:20:15
There is a perceivable difference even without zooming
_wb_
<@794205442175402004> any idea why SSIMULACRA2 is capping at 90 for JPEG originals? I know the decoding differs, but I would've expected at least 95
2025-05-04 05:58:11
Must be that the difference between libjpeg-turbo decoding and libjxl's jpeg decoding is larger than the artifacts introduced by very high quality lossy encoding
2025-05-04 05:59:43
Maybe we shouldn't be using libjpeg-turbo to decode jpegs since it's sacrificing precision for speed
2025-05-04 06:00:03
Then again, it is how jpegs are typically decoded in practice
jonnyawsom3
2025-05-04 06:05:27
Seeing as jpegli is already built with libjxl, it does make sense to remove the other jpeg libraries if possible. Though I see your point, if SSIMULACRA reports higher scores than people see in practice. Even so, hardly hitting the 'visually lossless' mark just by using different JPEG decoders feels... Off...
_wb_
2025-05-04 06:13:00
It's very sensitive to small differences, even converting to a higher bit depth causes the score to go down
2025-05-04 06:17:59
It should be recalibrated with more data on images below 1 JND. As it is now, basically 90-99 all just means "below 0.5 JND or so, and not lossless" but you cannot really say that 97 is better than 93
jonnyawsom3
2025-05-04 06:25:32
Right, makes sense given AIC part 3 was only recently finished(?) which focuses on that range
_wb_
2025-05-04 07:58:20
Exactly. Most older datasets cover the range from ssimu2 -300 to +60 or so, with CID22 we collected data in the range from 30 to 90, and AIC-3 will allow us to get good data for the range from 50 to 100. Roughly speaking.
jonnyawsom3
2025-05-04 08:06:00
-300... I never even considered how low it could go
Traneptora
2025-05-04 08:07:22
-300 always looked to me like "error"
2025-05-04 08:07:41
like, someone messed up and these aren't even the same image
_wb_
2025-05-04 09:18:29
Negative ssimu2 scores are qualities so low nobody would want to go there, but in some existing datasets like TID2013 or KADID10k you will find distortion levels that high...
Orum
2025-05-04 09:29:25
well we need to have some values low enough for Netflix
๐•ฐ๐–’๐–—๐–Š
2025-05-04 04:33:47
Kind of true but I believe they especially optimize for certain metrics.
Orum well we need to have some values low enough for Netflix
2025-05-04 04:34:16
Their bitrate to metric ratio must be high enough but their preferred bitrate is too low for most of their content.
2025-05-04 04:34:27
Understandable because of the bandwidth + storage they need to use
jonnyawsom3
2025-05-05 04:14:26
That was a fun regression to find, **25,000x** larger since v0.11
```
JPEG XL encoder v0.12.0 5e1e5530 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 12583.2 kB (6.000 bpp).
4096 x 4096, 6.958 MP/s [6.96, 6.96], , 1 reps, 16 threads.
```
```
JPEG XL encoder v0.12.0 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 479 bytes (0.000 bpp).
4096 x 4096, 7.064 MP/s [7.06, 7.06], , 1 reps, 16 threads.
```
2025-05-05 04:22:23
https://github.com/libjxl/libjxl/pull/4225
Kupitman
2025-05-05 01:46:34
Awesome, re-encoding archive again.
jonnyawsom3
RaveSteel For your build:
FD0
```
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
JPEG XL decoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 2192.1 kB (0.132 bpp).
15360 x 8640, 64.808 MP/s [64.81, 64.81], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 401.843 MP/s [401.84, 401.84], , 1 reps, 32 threads.

real 0m3,721s
user 0m48,255s
sys 0m10,650s
```
FD1
```
JPEG XL decoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 2286.5 kB (0.138 bpp).
15360 x 8640, 74.659 MP/s [74.66, 74.66], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 519.315 MP/s [519.31, 519.31], , 1 reps, 32 threads.

real 0m3,374s
user 0m38,087s
sys 0m11,628s
```
FD2
```
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
JPEG XL decoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 3481.0 kB (0.210 bpp).
15360 x 8640, 60.551 MP/s [60.55, 60.55], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 729.392 MP/s [729.39, 729.39], , 1 reps, 32 threads.

real 0m3,708s
user 0m37,167s
sys 0m19,042s
```
FD3
```
JPEG XL decoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 3920.9 kB (0.236 bpp).
15360 x 8640, 57.620 MP/s [57.62, 57.62], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 753.847 MP/s [753.85, 753.85], , 1 reps, 32 threads.

real 0m3,823s
user 0m39,584s
sys 0m19,405s
```
FD4
```
JPEG XL decoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
JPEG XL encoder v0.12.0 2695d26 [_AVX2_,SSE4,SSE2]
Encoding [Modular, lossless, effort: 7]
Compressed to 3903.0 kB (0.235 bpp).
15360 x 8640, 61.082 MP/s [61.08, 61.08], , 1 reps, 32 threads.
Decoded to pixels.
15360 x 8640, 776.508 MP/s [776.51, 776.51], , 1 reps, 32 threads.

real 0m3,686s
user 0m36,774s
sys 0m18,932s
```
2025-05-05 01:48:49
In more usual scenarios, it ranges from a hundred bytes to a few kilobytes, so nothing groundbreaking, but it avoids the worst-case scenario now
2025-05-05 01:49:24
I... Don't know how it replied to that message, but sure Discord
RaveSteel
2025-05-05 01:49:34
It happens
2025-05-05 01:49:35
smh
2025-05-05 02:26:58
Continuing here from <#805176455658733570> I rendered this animation a few months ago, it plays fine on my phone. 480x480@50, e1. No fast decode, because that was broken at the time
2025-05-05 02:29:16
Would be interesting to have some people here try without a reencode with faster decode and report here if it plays fine or not on their phones
Kupitman
RaveSteel Would be interesting to have some people here try without a reencode with faster decode and report here if it plays fine or not on their phones
2025-05-05 07:57:40
phones? how
RaveSteel
Kupitman phones? how
2025-05-05 08:00:59
https://github.com/oupson/jxlviewer
Quackdoc
2025-05-05 08:03:38
termux [av1_chad](https://cdn.discordapp.com/emojis/862625638238257183.webp?size=48&name=av1_chad)
RaveSteel
2025-05-05 08:06:33
Do you have a full proot environment?
Kupitman
RaveSteel Continuing here from <#805176455658733570> I rendered this animation a few months ago, it plays fine on my phone. 480x480@50, e1. No fast decode, because that was broken at the time
2025-05-05 08:07:54
Works awesome, MediaTek 8100
Quackdoc
RaveSteel Do you have a full proot environment?
2025-05-05 08:46:59
nah, i don't bother with that anymore, enough stuff is packaged that I don't really need it much anymore
RaveSteel
2025-05-05 08:47:39
Which package do you use to display JXLs? Or was it just a joke?
Quackdoc
2025-05-05 08:49:27
I just use a compiled Oculante for X11
RaveSteel Continuing here from <#805176455658733570> I rendered this animation a few months ago, it plays fine on my phone. 480x480@50, e1. No fast decode, because that was broken at the time
2025-05-05 08:56:08
chafa can't open this at all!?!?!?!?!?!?
2025-05-05 08:56:23
```
chafa -f symbols tux_d0_re.jxl
chafa: Failed to open 'tux_d0_re.jxl': Unknown file format
```
RaveSteel
2025-05-05 08:58:59
lol
Quackdoc
2025-05-05 08:59:02
<@280274692487643148> seems like you were the one who contributed chafa support, any ideas?
2025-05-05 09:00:11
other jxl images are working
TheBigBadBoy - ๐™ธ๐š›
2025-05-05 09:12:32
I wish Fossify gallery would support animated JXL
Quackdoc
2025-05-05 09:13:31
timg doesn't open the video either
TheBigBadBoy - ๐™ธ๐š› I wish Fossify gallery would support animated JXL
2025-05-05 09:14:01
sadly Fossify would either need to overhaul its image loader or wait for oupson's fork to be ready
RaveSteel
TheBigBadBoy - ๐™ธ๐š› I wish Fossify gallery would support animated JXL
2025-05-05 09:14:08
I wish it would support still JXLs properly
2025-05-05 09:16:08
<@184373105588699137> does this d1 JXL created with FFmpeg open for you?
Quackdoc
2025-05-05 09:17:47
no 0.0
2025-05-05 09:17:58
neither timg nor chafa can open either
RaveSteel
2025-05-05 09:18:00
weird
2025-05-05 09:18:20
at least oupson's app plays them properly
Quackdoc
2025-05-05 09:20:00
```
timg tux_d1_ffmpeg.jxl --verbose --upscale=i
Terminal cells: 56x28 cell-pixels: 25x50
Active Geometry: 54x26
Effective pixelation: Using quarter block.
Background color for transparency 'auto'; effective RGB #000000
Alpha-channel merging with background color done by timg.
1 file (0 successful); 0.0 Bytes written (0.0 Bytes/s) 0 frames
Environment variables
TIMG_PIXELATION (not set)
TIMG_DEFAULT_TITLE (not set)
TIMG_ALLOW_FRAME_SKIP (not set)
TIMG_USE_UPPER_BLOCK (not set)
TIMG_FONT_WIDTH_CORRECT (not set)
```
RaveSteel
2025-05-05 09:22:06
Can you try `lsix` if you have it? It doesn't play the animation but should display the frames in the terminal
Quackdoc
2025-05-05 09:25:09
no sixel support on termux
RaveSteel
2025-05-05 09:25:18
darn
oupson
Quackdoc <@280274692487643148> seems like you were the one who contributed chafa support, any ideas?
2025-05-05 10:19:42
Seems to be related to JXL containers. I will look at it when I have some time.
Quackdoc
2025-05-06 12:21:12
[av1_perfect](https://cdn.discordapp.com/emojis/842099338904600599.webp?size=48&name=av1_perfect)
jonnyawsom3
RaveSteel Would be interesting to have some people here try without a reencode with faster decode and report here if it plays fine or not on their phones
2025-05-06 01:02:20
It takes a bit to get to full speed, looks like 30fps until it's fully buffered, but surprisingly works on an 8 year old Huawei Mate 10 Pro
2025-05-06 01:03:17
Using ImageToolbox's Preview feature
oupson
oupson Seams to be related to jxl containers. I will look at it when I will have some times.
2025-05-06 05:31:04
found a fix, will do a pr once the code is clean
Quackdoc
2025-05-06 03:07:44
yay
jonnyawsom3
2025-05-15 06:30:17
Continuing from https://discord.com/channels/794206087879852103/822105409312653333/1372638713745772745, faster decoding level 4
```
Encoding                    kPixels  Bytes    BPP         E MP/s  D MP/s   Max norm   SSIMULACRA2   PSNR   pnorm       BPP*pnorm       QABPP   Bugs
--------------------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:1:faster_decoding=4  1084     1380528  10.1827623  79.338   68.478  -nan(ind)  100.00000000  99.99  0.00000000  0.000000000000  10.183  0
jxl:d0:2:faster_decoding=4  1084     1274680   9.4020284   9.383   77.332  -nan(ind)  100.00000000  99.99  0.00000000  0.000000000000   9.402  0
jxl:d0:3:faster_decoding=4  1084     1128803   8.3260409   8.255   40.048  -nan(ind)  100.00000000  99.99  0.00000000  0.000000000000   8.326  0
jxl:d0:4:faster_decoding=4  1084     1363411  10.0565075   3.755  109.784  -nan(ind)  100.00000000  99.99  0.00000000  0.000000000000  10.057  0
jxl:d0:7:faster_decoding=4  1084     1352151   9.9734538   0.567  107.682  -nan(ind)  100.00000000  99.99  0.00000000  0.000000000000   9.973  0
jxl:d0:9:faster_decoding=4  1084     1351613   9.9694855   0.436  112.949  -nan(ind)  100.00000000  99.99  0.00000000  0.000000000000   9.969  0
```
veluca
2025-05-15 06:39:06
`-e 1` defends itself pretty well 😛 what's -e 9 -fd 4 doing to decode so fast?
jonnyawsom3
veluca `-e 1` defends itself pretty well ๐Ÿ˜› what's -e 9 -fd 4 doing to decode so fast?
2025-05-15 07:09:19
4 is LZ77 and Gradient predictor only, so maybe better LZ77 matches?
veluca
2025-05-15 07:09:42
yeah that would work
jonnyawsom3
2025-05-15 07:10:55
It's also why effort 1 nearly matches them, it's practically what it does already. Effort 3 forces Weighted, so it can't benefit
2025-05-15 07:12:20
Hmm, actually maybe I should have effort 3 be overridden to 2 when faster decoding is used. Otherwise it does almost nothing
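For reference, the Gradient predictor being discussed is (going by my reading of libjxl's `ClampedGradient`) the classic W + N − NW estimate, clamped to the range spanned by the west and north neighbours; a minimal sketch:

```python
def gradient_predict(w: int, n: int, nw: int) -> int:
    """Clamped-gradient prediction of a pixel from its west (w), north (n),
    and north-west (nw) neighbours, as in JPEG XL's modular Gradient predictor."""
    grad = w + n - nw                 # plane extrapolation
    lo, hi = min(w, n), max(w, n)    # clamping keeps predictions sane at edges
    return max(lo, min(hi, grad))

print(gradient_predict(100, 102, 101))  # smooth ramp: raw gradient 101 is kept
print(gradient_predict(10, 12, 8))      # raw gradient 14 is clamped to 12
```

On smooth regions this leaves long runs of small or zero residuals, which would line up with the better-LZ77-matches theory above.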
Mine18
2025-05-15 10:51:35
<@238552565619359744> is there a benefit to using fast decode over a lower effort
jonnyawsom3
Mine18 <@238552565619359744> is there a benefit to using fast decode over a lower effort
2025-05-15 10:53:32
Faster decoding or higher density :P
Mine18
2025-05-15 10:54:23
like, what's the decoding performance and bpp for effort 2 vs effort 9 with fd 3
2025-05-15 10:54:56
fd should have a noticeable impact on iq, yeah?
jonnyawsom3
2025-05-15 11:43:28
Oh right you mean lossy, I honestly forget to check that every time
A homosapien
Mine18 fd should have a noticeable impact on iq, yeah?
2025-05-17 12:09:43
For lossy, lower efforts don't actually increase decode speed, only `--faster_decoding` does that. The impact on iq is notable, but it might be worth the 20-50% decode speed gains depending on your use case.
```
Encoding                    kPixels  Bytes   BPP        E MP/s  D MP/s   Max norm    SSIMULACRA2  PSNR   pnorm       BPP*pnorm       QABPP  Bugs
------------------------------------------------------------------------------------------------------------------------------------------------
jxl:d1:2                    1084     202932  1.4968246  37.831   85.426  1.67419874  85.40978047  43.50  0.66288362  0.992220533458  2.506  0
jxl:d1:4                    1084     191673  1.4137784  28.114   79.960  1.67328796  85.79362731  43.53  0.66184587  0.935703356848  2.366  0
jxl:d1:7                    1084     183906  1.3564890   7.358   85.123  1.83441732  84.95991517  43.20  0.68523169  0.929509269309  2.488  0
jxl:d1:7:faster_decoding=1  1084     188288  1.3888106   8.235  103.549  1.85359479  84.94919133  43.20  0.68538727  0.951873123220  2.574  0
jxl:d1:7:faster_decoding=2  1084     186760  1.3775401  14.232  127.056  1.85366226  84.97907592  43.18  0.68678502  0.946073909335  2.553  0
jxl:d1:7:faster_decoding=3  1084     186760  1.3775401  14.527  122.803  1.85366226  84.97907592  43.18  0.68678502  0.946073909335  2.553  0
jxl:d1:7:faster_decoding=4  1084     187445  1.3825927  15.878  134.474  2.05895733  84.45922173  42.23  0.73173478  1.011691139910  2.847  0
```
Mine18
2025-05-17 07:41:01
good to know <:FeelsReadingMan:808827102278451241>
John
2025-05-20 03:37:01
Does anyone know if SSIMULACRA2 would be a good metric for video? As in, measuring it frame-by-frame? Or is it more of an image only metric.
RaveSteel
2025-05-20 03:44:26
Is your goal measuring frame similarity or do you want to check if frames are the same?
_wb_
2025-05-20 03:52:57
frame-by-frame ssimu2 can be used as a video metric, though of course it ignores any temporal aspects. according to https://videoprocessing.ai/benchmarks/full-reference-video-quality-metrics.html it still gets quite close to VMAF in terms of correlation, even though VMAF of course does include temporal aspects
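One practical detail with frame-by-frame scoring: the per-frame numbers still need aggregating, and a plain mean hides short quality dips, so a low percentile is worth tracking alongside it. A small sketch with hypothetical per-frame scores:

```python
from statistics import mean

def summarize(frame_scores):
    """Mean plus 5th-percentile of per-frame SSIMULACRA2 scores; the
    percentile surfaces worst-case frames that the mean averages away."""
    ordered = sorted(frame_scores)
    p5 = ordered[int(0.05 * (len(ordered) - 1))]
    return round(mean(frame_scores), 2), p5

scores = [78.2, 81.5, 80.1, 62.3, 79.8]  # hypothetical frame-by-frame scores
print(summarize(scores))  # the dip to 62.3 barely moves the mean
```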
Quackdoc
John Does anyone know if SSIMULACRA2 would be a good metric for video? As in, measuring it frame-by-frame? Or is it more of an image only metric.
2025-05-20 05:17:53
we use it a lot, it's practically the best right now
2025-05-20 05:18:10
if you graph it, you get a good idea
John
2025-05-20 05:19:32
Excellent. Thank you. Working on a data science project and figured Iโ€™d ask the experts lol
Quackdoc
2025-05-20 05:21:02
butteraugli is also pretty good, but its slow
2025-05-20 05:22:26
ssimu2 gives a good blend of accuracy without falling flat on its face like vmaf and other benchmarks, while also being reasonably usable
_wb_
2025-05-20 05:38:49
The biggest issue with vmaf is that it's not a fidelity metric but an appeal metric (i.e. compressed can be better than original), which makes it easy to cheat; e.g. AI-based codecs have no problem finding ways to fool vmaf. For codec comparisons this is a big issue. They tried solving this with vmaf-neg, but their approach was basically to take regular vmaf and try to subtract the "enhancement gain" aspect, which works a little bit, but it's still like scotch-taping some wings on a pig, cutting off its front legs, and calling it an eagle.
2025-05-20 05:39:52
(at least the above is my impression, I don't mean to disrespect the folks who worked on vmaf)
Quackdoc
_wb_ The biggest issue with vmaf is that it's not a fidelity metric but an appeal metric (i.e. compressed can be better than original), which makes it easy to cheat โ€” e.g. AI-based codecs have no problem finding ways to fool vmaf. For codec comparisons this is a big issue. They tried solving this with vmaf-neg, but their approach was basically to take regular vmaf and try to subtract the "enhancement gain" aspect, which works a little bit but it's still like scotch taping some wings on a pig and cutting of its front legs and calling it an eagle.
2025-05-20 05:52:25
it's also just not a very reliable appeal metric. Sometimes the metric literally falls flat on its face; for instance, it really doesn't handle luminance awareness well.
_wb_
2025-05-20 06:02:02
It also messes up badly regarding banding, somehow it completely misses banding when it gets very noticeable. So yeah it's not very reliable as an appeal metric. But my point was just that it isn't designed to measure fidelity, it will by design give higher scores to an image where you e.g. did some sharpening or contrast/saturation boosting than to the original, which makes it easy to cheat.
Quackdoc
2025-05-20 06:03:55
I do understand penalizing banding less than blocking, however. People are "simply used to banding", so to speak
damian101
_wb_ The biggest issue with vmaf is that it's not a fidelity metric but an appeal metric (i.e. compressed can be better than original), which makes it easy to cheat โ€” e.g. AI-based codecs have no problem finding ways to fool vmaf. For codec comparisons this is a big issue. They tried solving this with vmaf-neg, but their approach was basically to take regular vmaf and try to subtract the "enhancement gain" aspect, which works a little bit but it's still like scotch taping some wings on a pig and cutting of its front legs and calling it an eagle.
2025-05-20 06:16:23
The biggest issue with vmaf is that it's trained exclusively on x264 preset medium encodes...
_wb_
Quackdoc I do understand penalizing banding less then blocking however. People are "simply used to banding" so to speak
2025-05-20 06:17:27
It's also quite tolerant to blocky banding
damian101
2025-05-20 06:18:08
Yes, it's almost blind to all low-contrast blocking and banding.
Quackdoc
2025-05-20 06:18:09
yeah, thats no good [av1_dogelol](https://cdn.discordapp.com/emojis/867794291652558888.webp?size=48&name=av1_dogelol)
_wb_
2025-05-20 06:21:28
https://jon-cld.s3.amazonaws.com/test/ahall_of_fshame_SSIMULACRA_2_modelD.html even some pretty high contrast banding there where it doesn't seem to care much
2025-05-20 06:26:32
https://jon-cld.s3.amazonaws.com/test/ahall_of_fshame_VMAF-NEG.html or I guess this one is more interesting, it's where vmaf disagrees with all the others
2025-05-20 06:29:26
Pretty old page at this point, but still a nice source of embarrassing examples for any metric: https://jon-cld.s3.amazonaws.com/test/index.html
juliobbv
_wb_ https://jon-cld.s3.amazonaws.com/test/ahall_of_fshame_VMAF-NEG.html or I guess this one is more interesting, it's where vmaf disagrees with all the others
2025-05-20 06:39:17
yeah, vmaf just doesn't penalize blocking enough to my 👀
2025-05-20 06:41:58
IME, I've found it easy to unintentionally inflate scores by doing things that most often end up hurting quality
2025-05-20 06:43:03
usually around tweaking deblocking filter strength and dc coefficient quants
2025-05-20 06:44:01
SSIMU2 seems to be more robust against artificial inflation -- the only major thing I've had trouble with is that SSIMU2 likes steeper QMs than I do, but that's it
2025-05-20 06:44:36
and for some reason, SSIMU2 gets confused with 4:2:2 content?
damian101
juliobbv usually around tweaking deblocking filter strength and dc coefficient quants
2025-05-20 06:53:11
Yeah, you absolutely don't want to blindly optimize towards vmaf, it's not a robust metric at all.
2025-05-20 06:54:13
Btw, the 4K model is way worse than the standard model in that aspect even. Because unlike the standard model it wasn't trained on upscaled low resolution content in addition to the native 1080p one.
2025-05-20 06:55:01
Well, I assume that's the biggest contributor anyway.
_wb_
2025-05-20 07:02:15
Ssimu2 really dislikes banding and likes super precise low frequency so it probably encourages putting too many bits in the DC, I think
2025-05-20 07:05:23
What does it do with that? I think it was tuned only using 444 and 420 images but I would assume 422 should still do something reasonable. Maybe not though.
damian101
_wb_ What does it do with that? I think it was tuned only using 444 and 420 images but I would assume 422 should still do something reasonable. Maybe not though.
2025-05-20 07:16:51
Standard chroma subsampling introduces some significant chroma shift and desaturation. So I would expect this to be heavily punished by a metric like ssimu2.
2025-05-20 07:17:30
Even more so butteraugli, of course.
_wb_
2025-05-20 07:32:56
Yeah it does punish chroma subsampling, at least when it causes trouble. But I would expect it punishes 422 less than 420 and doesn't get confused by it?
2025-05-20 07:33:56
SSIM got an Emmy award in 2015, VMAF got an Emmy award in 2020. I wonder if another metric is going to get an Emmy this year 🙂
juliobbv
_wb_ What does it do with that? I think it was tuned only using 444 and 420 images but I would assume 422 should still do something reasonable. Maybe not though.
2025-05-20 07:34:26
IIRC with my testing, SSIMU2 doesn't suggest using a chroma delta q value that's somewhere "in between" 444 and 420
2025-05-20 07:35:43
if I tried to naively optimize for SSIMU2 scores, the resulting 444 and 420 looked similar to each other, but 422 chroma looked significantly worse
_wb_
2025-05-20 07:35:57
Interesting
juliobbv
2025-05-20 07:36:06
maybe something about 422's anisotropy could confuse SSIMU2?
_wb_
2025-05-20 07:36:23
Could be, yes.
juliobbv
2025-05-20 07:37:06
in the end, I ended up overriding its suggestion with a "good enough" in between value for tune=iq libaom
_wb_ SSIM got an Emmy award in 2015, VMAF got an Emmy award in 2020. I wonder if another metric is going to get an Emmy this year ๐Ÿ™‚
2025-05-20 07:38:41
if you do win one, you should upload a pic here!
_wb_
2025-05-20 07:41:56
422 is kind of weird, I mean of course it is convenient from an implementation pov to subsample only horizontally, and kind of makes sense for analog signal where the horizontal dimension doesn't really have a clear resolution anyway, or in combination with interlaced video where you basically subsample vertically already due to interlacing. But it doesn't really make any sense perceptually to subsample one dimension and not the other, creating wide rectangular chroma pixels instead of square ones...
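The wide-rectangle geometry is easy to see from the chroma plane sizes; a quick sketch (the per-axis factor table is my own shorthand for the J:a:b notation):

```python
from math import ceil

# (horizontal, vertical) chroma subsampling factors per mode
FACTORS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}

def chroma_plane(width, height, mode):
    """Chroma plane dimensions for a given luma resolution and mode."""
    h, v = FACTORS[mode]
    return ceil(width / h), ceil(height / v)

print(chroma_plane(1920, 1080, "4:2:2"))  # (960, 1080): 2x1 rectangular chroma pixels
print(chroma_plane(1920, 1080, "4:2:0"))  # (960, 540): 2x2 square chroma pixels
```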
juliobbv
2025-05-20 07:42:40
also, doesn't 422 have the additional complexity that chroma location for odd chroma rows could be different than even chroma rows?
2025-05-20 07:43:48
I'm not super familiar with this myself
_wb_
2025-05-20 07:44:04
Uh, I didn't know that was a thing, I assumed 422 always has centered chroma sample positions
2025-05-20 07:47:24
420 does have various options though, the weirdest one being the most common MPEG way of doing it (the h264 way) which is centered vertically and cosited with the left luma sample horizontally.
2025-05-20 07:48:44
Anyway, ssimu2 only sees RGB so the complications of how chroma subsampling/upsampling is done are something it doesn't care about
juliobbv
_wb_ Uh, I didn't know that was a thing, I assumed 422 always has centered chroma sample positions
2025-05-20 08:00:21
maybe I'm confusing Bayer pattern layout with 422 chroma layout... yeah, that's prob it
_wb_
2025-05-20 08:04:42
Ah, yes, Bayer patterns are kind of weird subsampling since if you consider G to be more or less Y and B,R to be more or less Cb, Cr, then basically the Y is horizontally 2x subsampled with alternating sampling positions, and the Cb and Cr are like in 4:2:0 but with weird sample positions.
2025-05-20 08:15:53
I always felt that camera vendors are being misleading when they report some number of megapixels when actually that's the Bayer data resolution so it is really "megasubpixels" and the actual amount of samples is half for G and one quarter for R and B, so basically 24 megapixels is only really 8 RGB megapixels worth of data.
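The arithmetic behind that claim, spelled out:

```python
photosites = 24_000_000       # an advertised "24 MP" Bayer sensor
green = photosites // 2       # half the photosites are green
red = blue = photosites // 4  # a quarter each for red and blue

# One full RGB pixel needs three samples, so in terms of complete
# RGB pixels the sensor only captures:
rgb_equivalent = photosites // 3

print(green, red, blue)   # 12000000 6000000 6000000
print(rgb_equivalent)     # 8000000
```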
spider-mario
2025-05-20 08:55:34
image quality metrics react to extreme blocking and banding
Mine18
2025-05-21 05:45:21
isnt ssim the same as ssimulacra
Meow
Mine18 isnt ssim the same as ssimulacra
2025-05-21 06:02:00
https://en.wikipedia.org/wiki/Structural_similarity_index_measure
Mine18
2025-05-21 06:02:26
huh
_wb_
2025-05-21 06:14:45
ssimulacra is based on ssim, ssim is from 2001
damian101
juliobbv IIRC with my testing, SSIMU2 doesn't suggest using a chroma delta q value that's somewhere "in between" 444 and 420
2025-05-21 08:48:49
Btw, I think you should have optimized the chroma q offsets for chroma subsampling done with the -sharpyuv option... A benefit of it would be that it definitely confuses fidelity metrics much less than standard chroma subsampling in YCbCr.
_wb_ Uh, I didn't know that was a thing, I assumed 422 always has centered chroma sample positions
2025-05-21 08:51:29
Oh, that's incorrect. In fact, 4:2:2 is never center position in any standard, but always left/colocated, which are the same for 4:2:2.
_wb_
2025-05-21 08:52:05
even in 422 jpeg?
damian101
2025-05-21 08:52:24
wait...
2025-05-21 08:52:41
forgot that exists 😅
_wb_ even in 422 jpeg?
2025-05-21 08:52:54
should be center there, actually
_wb_
2025-05-21 08:57:01
I think JPEG always uses center positions in both directions, while MPEG went from center (in h261) to left (in h264) to topleft (in h265).
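The three conventions differ only in where a chroma sample sits on the luma grid. A sketch of the 4:2:0 positions as listed above, in luma-pixel coordinates (the siting rules here are from memory, so double-check against the actual specs):

```python
def chroma_position_420(cx, cy, siting):
    """Luma-grid coordinates of chroma sample (cx, cy) in 4:2:0."""
    if siting == "center":    # JPEG: centered in both directions
        return (2 * cx + 0.5, 2 * cy + 0.5)
    if siting == "left":      # h264: cosited with left luma column, centered vertically
        return (2 * cx, 2 * cy + 0.5)
    if siting == "topleft":   # h265 / BT.2020: cosited with the top-left luma sample
        return (2 * cx, 2 * cy)
    raise ValueError(siting)

print(chroma_position_420(0, 0, "center"))   # (0.5, 0.5)
print(chroma_position_420(0, 0, "left"))     # (0, 0.5)
print(chroma_position_420(1, 1, "topleft"))  # (2, 2)
```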
damian101
2025-05-21 08:58:33
I think H.265 is still interpreted as left by default. Top-left has to be signalled, but it's recommended or mandated in some standards.
2025-05-21 08:59:38
UHD Blu-rays should be top-left, but only are half of the time, despite being almost always flagged with top-left regardless.
_wb_
2025-05-21 09:03:17
BT.2020 also says top-left, dunno if that is actually respected
2025-05-21 09:09:42
it's pretty messy and it looks like people are not really learning
2025-05-21 09:10:45
I don't have the final version but e.g. the ISO 21496 gain maps spec, as far as I know, doesn't really specify the sampling position normatively and doesn't say anything about the upsampling to be used
2025-05-21 09:11:07
Maybe this is fixed in the final version but the version of that spec that I saw only has this
damian101
2025-05-21 09:11:20
well, sample position should always be signalled, especially if it's not left position...
2025-05-21 09:11:32
in the video world anyway
_wb_
2025-05-21 09:13:51
it's not signaled in the gain maps spec, which would be fine if they then normatively define what it is, but they don't, it's just a note expressing a preference but that doesn't have any normative impact.
2025-05-21 09:15:26
also in the case of JPEG with gain maps, it's pretty counterintuitive to use a different sample position for the gain map than in the JPEG chroma subsampling; the natural thing for an implementation to do is to just upsample the gain map with the same code that upsamples chroma...
damian101
2025-05-21 09:16:32
still images should always use center position
_wb_
2025-05-21 09:17:26
well the note says top-left is "preferred", for some reason.
damian101
2025-05-21 09:17:53
hmm...
2025-05-21 09:18:18
right
2025-05-21 09:20:07
that's some poor decision, really...
juliobbv
Btw, I think you should have optimized the chroma q offsets for chroma subsampling done with the -sharpyuv option... A benefit of it would be that it definitely confuses fidelity metrics much less than standard chroma subsampling in YCbCr.
2025-05-21 05:11:02
Maybe in the future sharpyuv downsampled images can be factored into the tuning. Keep in mind that tune=iq is an libaom concept (not libavif), so it's meant to be used with all libraries that oftentimes don't have sharpyuv support, and even then libavif's sharpyuv support is optional. This means we need to optimize for the more common aggregated case first.
damian101
juliobbv Maybe in the future sharpyuv downsampled images can be factored into the tuning. Keep in mind that tune=iq is an libaom concept (not libavif), so it's meant to be used with all libraries that oftentimes don't have sharpyuv support, and even then libavif's sharpyuv support is optional. This means we need to optimize for the more common aggregated case first.
2025-05-21 06:10:20
well, downscaling in linear light for chroma subsampling can be done outside libavif, but I get your point of course. It just feels wrong to adapt an encoder to artifacts from what is essentially incorrect downscaling, especially since it punishes the very small minority who do it correctly 😅
_wb_
2025-05-21 06:36:30
Is downscaling chroma in linear light more correct? The upsampling is not going to change...
2025-05-21 06:49:52
Although I guess upsampling in nonlinear light from downsampled in linear light is going to be better than doing both in nonlinear. Less darkening will happen.
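The darkening is easy to demonstrate: averaging two gamma-encoded sRGB values (which is effectively what naive subsampling does) lands well below the average computed in linear light. A toy example with an adjacent black and white sample:

```python
def srgb_to_linear(v):
    """IEC 61966-2-1 sRGB decoding, values in [0, 1]."""
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

def linear_to_srgb(v):
    """IEC 61966-2-1 sRGB encoding, values in [0, 1]."""
    return 12.92 * v if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055

black, white = 0.0, 1.0                 # neighbouring samples, sRGB-encoded

gamma_avg = (black + white) / 2         # naive average in gamma space
linear_avg = linear_to_srgb((srgb_to_linear(black) + srgb_to_linear(white)) / 2)

print(gamma_avg)              # 0.5 -- noticeably too dark
print(round(linear_avg, 3))   # ~0.735 -- the perceptually correct mix
```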
damian101
_wb_ Is downscaling chroma in linear light more correct? The upsampling is not going to change...
2025-05-21 07:02:56
way more correct, gets you about half to 4:4:4, really
_wb_ Although I guess upsampling in nonlinear light from downsampled in linear light is going to be better than doing both in nonlinear. Less darkening will happen.
2025-05-21 07:04:13
Upsampling generally has a lot less impact on that in my experience, I guess because it's not really doing convolution like downsampling does.
2025-05-21 07:05:55
That's why for upsampling, sigmoidal light is often used, instead of linear light, as it reduces haloing artifacts.
_wb_ Is downscaling chroma in linear light more correct? The upsampling is not going to change...
2025-05-21 07:06:14
I'll send you an example.
_wb_
2025-05-21 07:07:20
I suppose jpegli is not doing this yet when doing 420...
2025-05-21 07:08:01
Or libjxl when doing downsampling when encoding at very low quality
2025-05-21 07:08:47
Some room for improvement since linear downsampling is just better
2025-05-21 07:09:02
I just didn't realize it makes sense to apply it to Cb,Cr too
way more correct, gets you about halfway to 4:4:4, really
2025-05-21 07:10:03
At least to avoid the typical darkening of textured reds and blues it should help quite a bit
damian101
2025-05-21 07:15:06
Yes!
_wb_
2025-05-21 07:19:01
High frequency chroma detail is still going to get blurry (say, small colored text), but the overall (low-freq) tone should change less, which I suppose for photographic images is the main problem with chroma subsampling...
damian101
way more correct, gets you about halfway to 4:4:4, really
2025-05-21 08:11:21
well, maybe not quite half 😆 I think I did an unfair comparison back then
_wb_ Or libjxl when doing downsampling when encoding at very low quality
2025-05-21 08:18:06
if you're in XYB, it probably matters way less than when you're in YCbCr
if you're in XYB, it probably matters way less than when you're in YCbCr
2025-05-22 04:49:27
It's probably better to downscale chroma in XYB than in linear light, actually.
2025-05-22 04:51:34
Especially if you're encoding in XYB so the upscaling happens in XYB as well. Also much faster.
_wb_
2025-05-22 05:41:48
Usually the image gets converted to linear RGB before converting to XYB, so for full-image downscaling doing it in linear is slightly faster since there's less to convert to XYB then...
damian101
_wb_ Usually the image gets converted to linear RGB before converting to XYB, so for full-image downscaling doing it in linear is slightly faster since there's less to convert to XYB then...
2025-05-22 05:57:55
Well, but linear RGB to XYB conversion then needs to happen twice at two different resolutions.
2025-05-22 06:01:46
And the downscaling needs to happen on full RGB instead of simply the specific XYB planes where you want lower resolution.
2025-05-22 06:04:12
2025-05-22 06:07:05
Chroma subsampling in linear light here looks almost the same as the source, while there is a huge degradation when doing it simply on the chroma planes.
2025-05-22 06:08:38
I linearized from Gamma 2.2, not sRGB, btw...
2025-05-22 06:12:03
another example
2025-05-22 06:13:19
this sample punishes all kinds of chroma subsampling
2025-05-22 06:17:50
relevant part of the Vapoursynth script I used:
```py
transfer = 4  # sRGB = 13, Gamma 2.2 = 4, Gamma 2.4 = 1
linear = True
c = core.ffms2.Source(source)
c = core.std.SetFrameProps(c, _Transfer=transfer, _Primaries=1)
if linear == True:
    chromasub = core.resize.Point(c, transfer=8, format=vs.RGBS)
else:
    chromasub = core.resize.Point(c, matrix=1, format=vs.YUV444PS)
chromasub = core.fmtc.resample(chromasub, kernel="Box", scale=0.5)
chromasub = core.resize.Point(chromasub, transfer=transfer, matrix=1, format=vs.YUV444PS)
c = core.resize.Point(c, transfer=transfer, matrix=1, format=vs.GRAYS)
c = core.std.ShufflePlanes([c, chromasub, chromasub], planes=[0, 1, 2], colorfamily=vs.YUV)
c = core.std.SetFrameProps(c, _ChromaLocation=1)
c = core.resize.Bicubic(c, dither_type="error_diffusion", format=vs.RGB24)
```
_wb_
And the downscaling needs to happen on full RGB instead of simply the specific XYB planes where you want lower resolution.
2025-05-22 06:20:40
I was talking about the resampling where the whole image (all components) is lower res, what we do at d>20 or whatever the threshold currently is.
damian101
2025-05-22 06:21:09
yes, then it's only a matter of quality, not speed
A homosapien
_wb_ I was talking about the resampling where the whole image (all components) are lower res, what we do at d>20 or whatever the threshold currently is.
2025-05-22 06:57:44
It's currently d 10. Thanks to <@238552565619359744>'s changes.
jonnyawsom3
2025-05-22 01:33:56
Also using a different downsampling method than previously, slightly lower quality but an order of magnitude faster
damian101
Also using a different downsampling method than previously, slightly lower quality but an order of magnitude faster
2025-05-22 04:02:03
Simple Box scaler is both fastest and highest quality for 1/n scaling.
jonnyawsom3
Simple Box scaler is both fastest and highest quality for 1/n scaling.
2025-05-22 04:27:26
Not sure what libjxl is using, but I don't think it's that
damian101
Not sure what libjxl is using, but I don't think it's that
2025-05-22 04:31:09
In the case of a 1/n scale factor, it simply averages the pixel contents of an n*n group. Gives some very good results when done in linear light, especially regarding fidelity.
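For integer factors that box filter is just a block mean; a minimal NumPy version (illustrative only, not what libjxl actually does):

```python
import numpy as np

def box_downscale(img: np.ndarray, n: int) -> np.ndarray:
    """Downscale a (H, W) or (H, W, C) array by integer factor n,
    averaging each n*n block. Run on linear-light values for
    physically meaningful results."""
    h, w = img.shape[:2]
    assert h % n == 0 and w % n == 0, "dimensions must be divisible by n"
    # Split each spatial axis into (blocks, n), then average over each n*n block.
    shape = (h // n, n, w // n, n) + img.shape[2:]
    return img.reshape(shape).mean(axis=(1, 3))

# 4x4 checkerboard -> every 2x2 block averages to a flat 0.5
img = (np.indices((4, 4)).sum(axis=0) % 2).astype(float)
print(box_downscale(img, 2))
```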
jonnyawsom3
In case of 1/n scale factor it simply averages the pixel contents of an n*n group. Gives some very good results when done in linear light. Especially regarding fidelity.
2025-05-22 05:18:14
The old method averaged 12*12 pixels with sharpening. Or something, it was labelled as 'iterative' and caused some ringing
damian101
2025-05-22 05:20:06
12*12 pixels??
jonnyawsom3
12*12 pixels??
2025-05-22 05:27:25
https://github.com/libjxl/libjxl/pull/471
damian101
2025-05-22 05:41:35
Box scaling does have a problem with jagged edges...
https://github.com/libjxl/libjxl/pull/471
2025-05-22 05:41:46
hey, that uses box sampling
2025-05-22 05:41:51
with some extra magic...
gb82
Btw, I think you should have optimized the chroma q offsets for chroma subsampling done with the -sharpyuv option... A benefit of it would be that it definitely confuses fidelity metrics much less than standard chroma subsampling in YCbCr.
2025-05-30 04:49:51
have you seen good results with sharp YUV? I haven't generally
A homosapien
2025-05-30 04:51:54
With WebP I've seen some good results, idk about AVIF
damian101
A homosapien With WebP I've seen some good results, idk about AVIF
2025-05-30 05:03:09
libavif uses libwebp for sharpyuv
gb82 have you seen good results with sharp YUV? I haven't generally
2025-05-30 05:03:23
yes...
gb82 have you seen good results with sharp YUV? I haven't generally
2025-05-30 05:05:18
my own implementation, but the libwebp sharpyuv shouldn't be much different: https://discord.com/channels/794206087879852103/803645746661425173/1374991219079385158
gb82
2025-05-30 05:07:17
interesting – and do metrics respond well here?
A homosapien
gb82 interesting – and do metrics respond well here?
2025-05-30 05:10:15
Yes, in my testing dssim, ssimu2, and butter respond well
gb82
2025-05-30 05:11:04
do u have graphs or anything?
A homosapien
2025-05-30 05:14:35
Unfortunately I don't, I remember emre giving me a python script to make nice graphs, but I don't know how to set it up
gb82
2025-05-30 05:21:14
i'm seeing it at ~0.2% worse according to SSIMU2 and Butter
jonnyawsom3
2025-05-30 05:35:20
Metrics aren't everything. At least for WebP, sharp YUV is a significant improvement for me
damian101
gb82 interesting – and do metrics respond well here?
2025-05-30 05:38:50
Yes, any metric that cares about color accuracy responds very well overall. On the specific sample above, butter and ssimu2 scores for the "sharpyuv" version are closer to 4:4:4 than to standard 4:2:0 even.
gb82
2025-05-30 05:39:28
gotcha, interesting
damian101
2025-05-30 05:40:51
On most samples it doesn't matter much.
2025-05-30 05:41:41
But when it matters, "sharpyuv" is definitely preferable in terms of fidelity.
2025-05-30 05:44:42
libwebp sharpyuv uses bilinear downsampling in linear light, according to some documentation I've read. However, I don't know if that's 2x2 bilinear (which would be good here) or 4x4 bilinear (which is what most image processing does for standard bilinear/triangle scaling, but it would be unnecessarily blurry here).
2025-05-30 05:44:59
actually, I can easily check when I'm home
A homosapien
libwebp sharpyuv uses bilinear downsampling in linear light according to some documentation I've read However, I don't know if that's 2x2 bilinear (which would be good here), or 4x4 bilinear (which would be what most image processing does for standard bilinear/triangle scaling, but it would be unnecessarily blurry here).
2025-05-30 05:48:36
The chroma is quite sharp around thin edges, it has to be the 2x2 downscaling
2025-05-30 05:48:39
I think
gb82 i'm seeing it at ~0.2% worse according to SSIMU2 and Butter
2025-05-30 06:08:34
Which is fine, but there are some images (artwork especially) with thin colored lines that really do suffer from 420 subsampling. Sharp yuv fixes those issues and in my experience makes them look a lot better (and score higher on metrics as well).
2025-05-30 06:09:50
I was seeing like a 3-4 point bump in ssimu2 scores for WebP, I can't say the same for AVIF since I didn't test it on my images
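The thin-colored-lines failure mode is easy to reproduce numerically; here's a rough NumPy sketch (the BT.601 full-range math is my own assumption for illustration, not anything from libwebp) of how plain 2x2 chroma averaging dilutes a 1-pixel red line:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    # BT.601 full-range luma/chroma (assumed for illustration)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (b - y) * 0.564
    cr = (r - y) * 0.713
    return y, cb, cr

def sub420(c):
    # average each 2x2 chroma block, then nearest-neighbor upsample
    c = c.reshape(2, 2, 2, 2).mean(axis=(1, 3))
    return c.repeat(2, axis=0).repeat(2, axis=1)

img = np.zeros((4, 4, 3))
img[:, 1, 0] = 1.0                  # 1-px-wide pure red vertical line

y, cb, cr = rgb_to_ycbcr(img)
r_rec = y + sub420(cr) / 0.713      # reconstructed red channel

print(img[0, :, 0])                 # original:      [0. 1. 0. 0.]
print(r_rec[0].round(2))            # the line is dimmed and smeared
```

The reconstructed line keeps its luma but bleeds roughly a third of its red intensity into the neighboring column, which is exactly the desaturation sharp YUV is designed to compensate for.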
damian101
A homosapien The chroma is quite sharp around thin edges, it has to be the 2x2 downscaling
2025-05-30 12:17:16
it is indeed
2025-05-30 12:30:48
however, it looks quite different, better than mine, I'm wondering if they're doing colocated instead of center position? or maybe I'm doing something wrong?
gb82
A homosapien I was seeing like a 3-4 point bump in ssimu2 scores for WebP, I can't say the same for AVIF since I didn't test it on my images
2025-05-30 05:56:50
woah, that's good to know ... maybe the dataset I was testing on wasn't diverse enough to show the benefits. I was using my photographic dataset
2025-05-30 05:57:13
AV1 benefits from CfL and also supports 444 so it may be less relevant there
A homosapien
gb82 AV1 benefits from CfL and also supports 444 so it may be less relevant there
2025-05-30 08:24:23
Yeah, 444 is usually the better option for chroma fidelity compared to sharp yuv. WebP doesn't have that luxury so sharp yuv is a godsend for lossy compression.
damian101
however, it looks quite different, better than mine, I'm wondering if they're doing colocated instead of center position? or maybe I'm doing something wrong?
2025-05-30 08:33:34
I think it's just the different upscaling they use
however, it looks quite different, better than mine, I'm wondering if they're doing colocated instead of center position? or maybe I'm doing something wrong?
2025-06-03 09:52:48
avifenc uses center chroma sample position
afed
2025-06-19 10:36:06
https://www.phoronix.com/review/ryzen-ai-max-390-windows-linux/4
2025-06-19 10:36:09
2025-06-19 10:36:44
veluca
2025-06-19 10:59:55
is that in line with other benchmarks?
2025-06-19 11:00:06
(in particular multithreaded ones)
afed
2025-06-19 11:02:20
<:Thonk:805904896879493180>
veluca
2025-06-19 11:09:16
looks like a yes xD
jonnyawsom3
afed
2025-06-19 11:14:02
Weird that they did 2 lossy tests instead of a lossy and lossless
afed
2025-06-19 11:18:44
yeah, phoronix settings for jxl tests aren't particularly good, and some even don't make sense
jonnyawsom3
2025-06-19 11:24:35
https://openbenchmarking.org/test/pts/jpegxl
2025-06-19 11:25:41
They have specific tests for JPEG and PNG input... But then they disable transcoding so it makes no difference anyway
2025-06-19 11:27:08
Last I checked, quality 80 wasn't twice as fast either... Some odd decisions
Quackdoc
Weird that they did 2 lossy tests instead of a lossy and lossless
2025-06-19 11:31:11
Michael's tests are just for comparing hardware to one another, not for actual settings or anything
jonnyawsom3
Quackdoc Michaels tests are just for comparing hardware another not for actual settings or anything
2025-06-19 11:36:33
Yeah, but lossy and lossless hit different parts of the CPU, so it would've been a good way to compare them
Quackdoc
2025-06-19 11:37:17
well, it's all open source so someone could make a PR
afed
2025-06-19 11:41:03
`Closed` <:KekDog:805390049033191445> https://github.com/phoronix-test-suite/test-profiles/pull/187
Quackdoc
2025-06-19 11:42:36
xD
2025-06-19 11:42:42
well, guess that answers that
jonnyawsom3
They have specific tests for JPEG and PNG input... But then they disable transcoding so it makes no difference anyway
2025-06-19 11:47:40
https://github.com/phoronix-test-suite/test-profiles/issues/186#issuecomment-777902924
2025-06-19 11:47:56
> veluca93 > > I noticed that the current test profile tests lossless JPEG recompression - while this is an interesting usecase, I'd expect it not to be the principal one. Is there a possibility to edit the test profile? Should I just open a merge request?
2025-06-19 11:48:11
That answers that too xD
2025-06-20 10:14:30
```
Github
--output_format ppm nul
23390 x 13016, 304.916 MP/s [304.92, 304.92], , 1 reps, 16 threads.
PeakWorkingSetSize: 1.728 GiB
PeakPagefileUsage: 1.812 GiB
Wall time: 0 days, 00:00:01.391 (1.39 seconds)
User time: 0 days, 00:00:01.343 (1.34 seconds)
Kernel time: 0 days, 00:00:13.109 (13.11 seconds)

Clang
--output_format ppm nul
23390 x 13016, 437.180 MP/s [437.18, 437.18], , 1 reps, 16 threads.
PeakWorkingSetSize: 1.729 GiB
PeakPagefileUsage: 1.813 GiB
Wall time: 0 days, 00:00:01.169 (1.17 seconds)
User time: 0 days, 00:00:01.265 (1.27 seconds)
Kernel time: 0 days, 00:00:09.593 (9.59 seconds)

Clang
--disable_output
23390 x 13016, 237.159 MP/s [237.16, 237.16], , 1 reps, 16 threads.
PeakWorkingSetSize: 3.531 GiB
PeakPagefileUsage: 6.927 GiB
Wall time: 0 days, 00:00:01.624 (1.62 seconds)
User time: 0 days, 00:00:04.296 (4.30 seconds)
Kernel time: 0 days, 00:00:12.062 (12.06 seconds)
```
So, Clang builds still make quite the difference, and `--disable_output` is still a bit broken, giving lower speeds at significantly more memory use than dumping to null
_wb_
2025-06-20 10:48:15
hm, I wonder what causes that. Maybe ppm output produces uint buffers and disable_output produces float buffers, or something like that
2025-06-20 10:48:36
(question then is, what do you want it to measure?)
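The float-buffer hypothesis lines up with some back-of-the-envelope arithmetic for the 23390x13016 image (my own numbers, not libjxl internals):

```python
# Raw interleaved RGB buffer sizes for a 23390x13016 image.
GIB = 1024 ** 3

pixels = 23390 * 13016
u8_buffer = pixels * 3 * 1      # 8-bit samples, as for ppm output
f32_buffer = pixels * 3 * 4     # 32-bit float samples

print(f"uint8 buffer:   {u8_buffer / GIB:.2f} GiB")   # ~0.85 GiB
print(f"float32 buffer: {f32_buffer / GIB:.2f} GiB")  # ~3.40 GiB
```

The ~3.4 GiB float figure is close to the ~3.5 GiB peak working set seen with `--disable_output`, so a retained float output buffer would explain most of the gap.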
jonnyawsom3
2025-06-20 11:19:36
That may explain the memory, but losing half the speed seems a bit extreme for that
2025-06-20 11:20:56
Unfortunately I just turned my computer off, otherwise I'd have tried pfm into nul
2025-06-21 02:24:37
```
--disable_output
7680 x 4320, geomean: 108.626 MP/s [106.19, 122.20], , 5 reps, 16 threads.
PeakWorkingSetSize: 617.6 MiB
PeakPagefileUsage: 799.7 MiB
Wall time: 0 days, 00:00:01.580 (1.58 seconds)
User time: 0 days, 00:00:02.625 (2.62 seconds)
Kernel time: 0 days, 00:00:10.125 (10.12 seconds)

--output_format ppm nul
7680 x 4320, geomean: 137.334 MP/s [129.70, 145.51], , 5 reps, 16 threads.
PeakWorkingSetSize: 327.1 MiB
PeakPagefileUsage: 330.6 MiB
Wall time: 0 days, 00:00:01.264 (1.26 seconds)
User time: 0 days, 00:00:01.218 (1.22 seconds)
Kernel time: 0 days, 00:00:07.687 (7.69 seconds)

--output_format pfm nul
7680 x 4320, geomean: 107.749 MP/s [102.92, 126.93], , 5 reps, 16 threads.
PeakWorkingSetSize: 774.7 MiB
PeakPagefileUsage: 799.2 MiB
Wall time: 0 days, 00:00:01.681 (1.68 seconds)
User time: 0 days, 00:00:02.953 (2.95 seconds)
Kernel time: 0 days, 00:00:10.046 (10.05 seconds)
```
Seems you were spot on. It's a shame `--disable_output` doesn't work with `--output_format` or `--bits_per_sample`, otherwise I'd say it should default to 8-bit int, ideally checking what bit depth is in the JXL header and using that. Also <@207980494892040194> since I know you found this a while back
damian101
BlueSwordM It might be that CAMBI tries to remove dithering to see the actual banding.
2025-06-21 04:50:45
removing dither noise does not create banding if you do it in high precision
gb82
2025-06-23 11:30:14
Kodak true color image set, all set to use 1 thread; avifenc had `-d 10 -y 444` set as well
A homosapien
gb82 kodak true color image set all set to use 1 thread avifenc had `-d 10 -y 444` set as well
2025-06-23 11:37:09
What's the baseline at 0%?
gb82
2025-06-23 11:55:40
tune iq speed 8
A homosapien
2025-06-24 12:01:43
libjxl has some catching up to do 🥴
gb82
2025-06-24 12:19:05
i think it'd probably do much better on a higher resolution dataset in the high fidelity only range
2025-06-24 12:19:21
the kodak images are pretty small after all
afed
2025-06-24 12:21:55
gb82
2025-06-24 12:36:00
we're testing avif 4:4:4 here, but I can test another dataset to show you that the results are pretty much identical if you're interested
jonnyawsom3
gb82 kodak true color image set all set to use 1 thread avifenc had `-d 10 -y 444` set as well
2025-06-24 01:00:59
If it's not too hard, what about trying libjxl v0.8 too? Would be good to see a comparison to main
gb82
2025-06-24 01:01:13
Yeah I can take a look
_wb_
gb82 we're testing avif 4:4:4 here, but I can test another dataset to show you that the results are pretty much identical if you're interested
2025-06-24 06:19:15
Doing 4:4:4 on pixels that have been 4:2:2 is still giving a benefit to YCbCr-based codecs, since they can take advantage of the fact that the high freq horizontal coeffs are basically zero, while in XYB that effect will not really be noticeable.
2025-06-24 06:20:34
What ssimu2 range is the BD rate computed over? Since often the curves tend to cross...
juliobbv
2025-06-24 12:04:27
regarding datasets: https://structpku.github.io/LIU4K_Dataset/LIU4K_v2.html
2025-06-24 12:04:37
this might be an interesting one to try out
gb82
_wb_ What ssimu2 range is the BD rate computed over? Since often the curves tend to cross...
2025-06-24 03:47:20
0 to 80
_wb_ Doing 4:4:4 on pixels that have been 4:2:2 is still giving a benefit to YCbCr-based codecs since they can take advantage of the high freq horizontal coeffs are basically zero, while in XYB that effect will not really be noticeable.
2025-06-24 03:47:50
Ah ok good to know. I can rebalance these numbers by trying on subset1 with ssimu2 50-90
_wb_
2025-06-24 04:48:42
Yeah that would make more sense
2025-06-24 04:49:24
If you take the range 0 to 80, then what happens from 0 to 40 has as much impact as what happens from 40 to 80
2025-06-24 05:02:54
(while I think the more important range is something like ssimu2 60-80)
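To make the range-weighting point concrete, here's a rough BD-rate sketch (using piecewise-linear interpolation of log bitrate vs. quality; real tooling fits cubics, and the data points below are hypothetical):

```python
import math

def bd_rate(rd_a, rd_b, q_lo, q_hi, steps=200):
    """Average bitrate difference (%) of codec B vs. A over [q_lo, q_hi].
    rd_a, rd_b: sorted (quality, bitrate) points for each codec."""
    def log_rate_at(points, q):
        for (q0, r0), (q1, r1) in zip(points, points[1:]):
            if q0 <= q <= q1:
                t = (q - q0) / (q1 - q0)
                return (1 - t) * math.log(r0) + t * math.log(r1)
        raise ValueError("quality out of interpolation range")

    # Numerically average the log-bitrate difference over the quality
    # range; every quality interval contributes equally, which is why
    # including 0-40 gives low quality as much weight as 40-80.
    total = sum(log_rate_at(rd_b, q_lo + (q_hi - q_lo) * i / steps)
                - log_rate_at(rd_a, q_lo + (q_hi - q_lo) * i / steps)
                for i in range(steps + 1))
    return (math.exp(total / (steps + 1)) - 1) * 100

# Hypothetical rate points (ssimu2-like quality, bits per pixel):
a = [(0, 0.05), (40, 0.2), (80, 1.0)]
b = [(0, 0.06), (40, 0.2), (80, 0.9)]
print(f"BD-rate 0-80:  {bd_rate(a, b, 0, 80):+.1f}%")
print(f"BD-rate 60-80: {bd_rate(a, b, 60, 80):+.1f}%")
```

With these made-up curves, codec B scores a few percent worse over 0-80 but several percent better over 60-80: the same pair of codecs ranks opposite ways depending purely on the chosen range.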
gb82
_wb_ (while I think the more important range is something like ssimu2 60-80)
2025-06-24 05:56:23
I can try that too
A homosapien
2025-07-07 01:50:59
2025-07-07 01:51:02
Comparison of the accuracy of various JPEG decoders.
spider-mario
2025-07-07 11:01:37
would be interesting to compare a jpegli-encoded JPEG as well
afed
2025-07-07 11:04:45
2025-07-07 11:04:56
2025-07-07 11:07:03
although this is an old jpegli version, there have been some changes since then, and some unmerged PRs may also be useful
jonnyawsom3
A homosapien
2025-07-07 11:43:05
I had done some testing here including visual examples https://github.com/RawTherapee/RawTherapee/issues/7125#issuecomment-2534693119
juliobbv
2025-07-20 05:38:52
Hi all, I just wanted to report an instance of a pair of images with similar encoded perceived quality, yet very wildly different SSIMU2 scores. I know that SSIMU2 doesn't correlate well in the lower-end range, but I wanted to share with y'all a rather extreme example.
2025-07-20 05:42:39
the image on the left gets a SSIMU2 score of 62.24, while the image on the right gets a -1.99
2025-07-20 05:43:16
source images can be found here: https://discord.com/channels/992019264959676448/992029418644054086/1396359457365299241
2025-07-20 05:44:50
I'd even venture to say that the peacock actually looks a bit more faithful to the original than the mountains
Lumen
2025-07-20 08:49:25
The mountain is dark, so no matter the quality, it tends to be very permissive
2025-07-20 08:49:38
We'd need some target intensity like for butter, ideally
2025-07-20 08:50:33
What does butter say?
2025-07-20 08:51:11
Also the right one has very fine texture that probably gets the same weakness as grain in metrics
Foxtrot
2025-07-20 09:09:16
Avif vs jxl comparison benchmark https://www.rachelplusplus.me.uk/blog/2025/07/a-better-image-compression-comparison/
2025-07-20 09:12:36
Relevant quote:
> when using only the stock settings, JPEG-XL handily beats all of the other encoders at all speed settings. But with optimized settings, both libaom and SVT-AV1 can outperform it, with libaom giving the best results in most cases. Which just goes to show how critically important encoder tuning can be!
2025-07-20 09:13:56
Reddit discussion: https://www.reddit.com/r/AV1/s/GGCABdGKld
A homosapien
juliobbv the image on the left gets a SSIMU2 score of 62.24, while the image on the right gets a -1.99
2025-07-20 09:58:13
Is it just me or does the Peacock image look a tiny bit darker than the original?
juliobbv
A homosapien It is just me or does the Peacock image looks a tiny bit darker than the original?
2025-07-20 02:58:23
it's definitely a bit darker on my end, but that's because the bright lines are significantly softer
Lumen What does butter say?
2025-07-20 03:07:06
Mountain: 7.7043237686, 3-norm: 1.804515
Peacock: 22.5073051453, 3-norm: 6.188558
2025-07-20 03:07:48
With intensity target 203:
Mountain: 11.5731630325, 3-norm: 2.497548
Peacock: 29.0675411224, 3-norm: 7.991876
afed
2025-07-20 03:16:15
perhaps also because the lines and some areas are reconstructed by predictors or something modern video encoders use, so it introduces a lot of distortion, but to human perception it looks generally similar, especially when not viewing a very complex picture up close
damian101
Lumen We d need some target intensity like for butter ideally
2025-07-20 03:25:43
That's not the main issue here. You could dim your monitor a lot, and ssimu2 and your perception would still disagree on these images.
2025-07-20 03:28:53
(intensity target setting in ssimulacra2 would still be nice, although I think it would be more of a bandaid as the way XYB maps luminance just generally doesn't give enough importance to dark areas imo)
juliobbv
2025-07-20 03:59:38
apparently cvvdp doesn't get fooled, and scores both images similarly
_wb_
2025-07-20 04:30:13
Interesting pair of images. Yes, in my experience ssimu2 is not so great at inter-content consistency and the vdp metrics are better at that, while for inter-codec consistency, it's the other way around.
2025-07-20 04:32:03
Here the image content is about as different as it gets (subtle shades of mostly monochromatic stuff, versus something very busy in both luma and chroma).
๐•ฐ๐–’๐–—๐–Š
2025-07-20 06:54:14
``` PEACOCK: cvvdp: 8.0830 MOUNTAIN: cvvdp: 8.6750 ``` If anyone wonders
juliobbv
๐•ฐ๐–’๐–—๐–Š ``` PEACOCK: cvvdp: 8.0830 MOUNTAIN: cvvdp: 8.6750 ``` If anyone wonders
2025-07-20 08:06:25
this seems to correlate with my perception
2025-07-20 08:06:41
mountain is a bit worse than peacock
2025-07-20 08:07:50
some outlines in mountain get completely wiped off into outright blocks
_wb_ Interesting pair of images. Yes, in my experience ssimu2 is not so great at inter-content consistency and the vdp metrics are better at that, while for inter-codec consistency, it's the other way around.
2025-07-20 08:18:04
yeah, this is the most extreme pair I've found so far
2025-07-20 08:19:05
what's weird is that ssimu2 is barely penalizing blocking in the mountain image, despite having a map for it
2025-07-20 08:19:46
it might be worth a look for a ssimu2.2 release
2025-07-20 08:20:38
maybe the underlying algo has to be tuned or something
Lumen
2025-07-20 08:20:40
will update vship for this if it happens 👀
juliobbv
Foxtrot Avif vs jxl comparison benchmark https://www.rachelplusplus.me.uk/blog/2025/07/a-better-image-compression-comparison/
2025-07-20 09:25:45
interesting comparison, it's worth mentioning that the author runs a multi-res convex hull over every encoded image and quality setting, and tries to find the optimal size that maximizes ssimu2 scores
2025-07-20 09:28:17
so her charts might look different when encoding at native res only
๐•ฐ๐–’๐–—๐–Š
2025-08-01 03:19:23
2025-08-01 03:19:49
Image dimension: 13385x10264 Reconstructed Jpeg Ryzen 9 9950x
jonnyawsom3
2025-08-01 03:36:38
What size were the PNGs? The encoding could've been taking a lot of time
๐•ฐ๐–’๐–—๐–Š
What size were the PNGs? The encoding could've been taking a lot of time
2025-08-01 04:55:53
the original JPG (processed camera output) is 106MB
2025-08-01 04:58:46
Quackdoc
2025-08-01 04:59:34
well, you could test no output if you didn't care about png output, but it's interesting to see regardless
afed
2025-08-01 04:59:42
> .png
some of the time (sometimes significant) is spent on png encoding (so this is also a png encoding speed comparison in some way); it is better to use ppm or nul
jonnyawsom3
2025-08-01 04:59:49
Hmm, have you tried decoding with output disabled? (remove `-o enc.png` from Oxide and change djxl to `djxl enc.jxl nul --output_format ppm --bits_per_sample 8`)
afed
2025-08-01 05:02:32
or
```
--disable_output
    No output file will be written (for benchmarking)
```
jonnyawsom3
2025-08-01 05:02:58
That isn't comparable to Oxide due to decoding to 32-bit float