JPEG XL

benchmarks

juliobbv
2025-10-09 02:18:17
yeah, a lot of stuff doesn't need to be initialized if the encoder is informed input is only a single frame
lonjil
2025-10-09 02:04:32
I wonder how a theoretical lossless "motion jxl" would compare to FFV1 in density and speed.
jonnyawsom3
2025-10-09 02:06:28
Oh that reminds me, <@853026420792360980> do you think you could add the libjxl faster decoding flag to FFMPEG during the weekend?
Traneptora
2025-10-09 02:25:12
maybe. i got a lot of stuff
jonnyawsom3
2025-10-09 02:31:38
No rush
Orum
2025-10-09 09:09:41
ffv1 is slow and not the best compression on cleaner sources, or really anything that can leverage inter well
lonjil
2025-10-09 09:12:36
how's lossless av?
2025-10-09 09:13:26
1
Orum
2025-10-09 09:15:07
oh, also too slow to be worthwhile, and rarely better compressed than h.264
juliobbv
lonjil how's lossless av?
2025-10-09 09:15:25
the weakest part of the standard IMO
lonjil
Orum oh, also too slow to be worthwhile, and rarely better compressed than h.264
2025-10-09 09:15:46
wew
juliobbv
2025-10-09 09:15:56
I'd avoid it unless AV1 is the only decoder you have available
lonjil
2025-10-09 09:15:58
not sure I've ever seen lossless h.264 🤔
2025-10-09 09:16:49
actually here's a better question: what's the best choice for lossless video, if we can take advantage of inter coding?
Orum
2025-10-09 09:17:54
there are really only 3 worthwhile lossless video codecs: h.264, utvideo, and if utvideo doesn't support your pixel format, ffvhuff
2025-10-09 09:18:26
for inter the only real option is h.264
RaveSteel
2025-10-09 09:35:21
utvideo is an excellent codec for recording lossless. It uses almost no resources, you just need fast enough storage
2025-10-09 09:36:41
I have tried converting utvideo to lossless JXL in MKV via FFmpeg, which was much smaller, but barely usable due to inadequate decoding speed lol
2025-10-09 09:37:49
FFV1 is of course the better choice for archival, despite the lower density, since it supports CRC for frames, which JXL does not
veluca
juliobbv the weakest part of the standard IMO
2025-10-09 10:05:05
I'd be surprised if av2 were any better than av1 at lossless image/video compression 😄
2025-10-09 10:05:27
_especially_ for nonphoto
juliobbv
veluca I'd be surprised if av2 were any better than av1 at lossless image/video compression 😄
2025-10-09 10:06:29
there were some improvements, but IIRC they were limited to additional transforms, not better prediction methods
2025-10-09 10:06:55
so better than AV1? yes.
2025-10-09 10:07:10
better than even AVC? still up in the air
veluca
2025-10-09 10:08:15
I'd be happy with "better than PNG" but I have my doubts 😛
juliobbv
2025-10-09 10:29:41
yeah, I guess we'll see once comparisons are released
Exorcist
lonjil how's lossless av?
2025-10-10 05:18:37
The compression ratio for low quality
The compression ratio for high quality
The compression ratio for near lossless / true lossless
For almost all codecs, the three ratios are not consistent
2025-10-10 05:20:31
So, a single value of "BD-rate" is meaningless
jonnyawsom3
2025-10-10 06:11:37
Yeah that's.... How compression works
2025-10-17 08:15:43
Lossless
```3840 x 2160, geomean: 43.478 MP/s [41.029, 45.202], 10 reps, 16 threads. PeakWorkingSetSize: 67.23 MiB Wall time: 0 days, 00:00:01.939 (1.94 seconds) User time: 0 days, 00:00:00.687 (0.69 seconds) Kernel time: 0 days, 00:00:26.156 (26.16 seconds)```
```2025-10-17T20:06:58.555968Z INFO jxl_oxide_cli::decode: Image dimension: 3840x2160 2025-10-17T20:06:58.556135Z INFO jxl_oxide_cli::decode: Running 10 repetitions 2025-10-17T20:07:00.996065Z INFO jxl_oxide_cli::decode: Geomean: 241.998 ms (34.275 MP/s) 2025-10-17T20:07:00.996236Z INFO jxl_oxide_cli::decode: Range: [221.915 ms, 342.733 ms] ([24.201 MP/s, 37.377 MP/s]) PeakWorkingSetSize: 121.1 MiB Wall time: 0 days, 00:00:02.464 (2.46 seconds) User time: 0 days, 00:00:00.281 (0.28 seconds) Kernel time: 0 days, 00:00:31.000 (31.00 seconds)```
Progressive Lossless
```3840 x 2160, geomean: 30.852 MP/s [28.883, 32.886], 10 reps, 16 threads. PeakWorkingSetSize: 162.8 MiB Wall time: 0 days, 00:00:02.710 (2.71 seconds) User time: 0 days, 00:00:13.390 (13.39 seconds) Kernel time: 0 days, 00:00:14.578 (14.58 seconds)```
```2025-10-17T20:07:08.829117Z INFO jxl_oxide_cli::decode: Image dimension: 3840x2160 2025-10-17T20:07:08.829271Z INFO jxl_oxide_cli::decode: Running 10 repetitions 2025-10-17T20:07:10.297453Z INFO jxl_oxide_cli::decode: Geomean: 146.626 ms (56.568 MP/s) 2025-10-17T20:07:10.297659Z INFO jxl_oxide_cli::decode: Range: [140.175 ms, 167.710 ms] ([49.457 MP/s, 59.172 MP/s]) PeakWorkingSetSize: 139.2 MiB Wall time: 0 days, 00:00:01.497 (1.50 seconds) User time: 0 days, 00:00:00.218 (0.22 seconds) Kernel time: 0 days, 00:00:18.140 (18.14 seconds)```
Turns out Oxide decodes progressive lossless even faster than normal lossless, and twice as fast as libjxl
A homosapien
2025-10-17 08:41:37
Doesn't jxl-oxide have certain optimizations/fast-paths that libjxl doesn't?
jonnyawsom3
2025-10-17 08:45:59
Yeah
2025-10-19 09:37:20
2025-10-19 09:37:27
<@493769349882970148> Group size 1 matches the pattern, but then has more groups signalling near-identical MA trees, versus 4 trees for the entire image (I think, it might be doing global MA on v0.8 and that's unrelated)
ignaloidas
2025-10-19 09:38:51
ah, if it's not using a global tree then yeah, would make sense, but with a global tree I'd expect -g 1 to be better
A homosapien
2025-10-19 09:39:18
I guess the more context the better
jonnyawsom3
A homosapien I guess the more context the better
2025-10-19 09:40:11
Well, we could find out... Don't we still have that patch to dump the MA tree? Could just examine it ourselves (Albeit for only a single group, but still)
A homosapien
2025-10-19 09:40:40
I think so, it's in our DM's somewhere
jonnyawsom3
2025-10-19 09:41:14
2025-10-19 09:41:56
Also I'm surprised by `-P 14`, that's just the default below max effort. I would've expected `-P 5` since it's almost entirely a gradient, or `-P 15`
A homosapien
2025-10-19 09:42:28
I'm not sure that's the smallest it can go, I just guessed it would compress well
Also I'm surprised by `-P 14`, that's just the default below max effort. I would've expected `-P 5` since it's almost entirely a gradient, or `-P 15`
2025-10-19 09:45:56
Your guess is better``` cjxl 16million-pschmidt.png 16million-pschmidt1.jxl -d 0 -e 9 -g 3 -E 4 -I 100 -P 5 JPEG XL encoder v0.8.4 aa2df301 [AVX2] Read 4096x4096 image, 57757 bytes, 554.0 MP/s Encoding [Modular, lossless, effort: 9], Compressed to 700 bytes (0.000 bpp). 4096 x 4096, 0.30 MP/s [0.30, 0.30], 1 reps, 12 threads. ```
jonnyawsom3
A homosapien Your guess is better``` cjxl 16million-pschmidt.png 16million-pschmidt1.jxl -d 0 -e 9 -g 3 -E 4 -I 100 -P 5 JPEG XL encoder v0.8.4 aa2df301 [AVX2] Read 4096x4096 image, 57757 bytes, 554.0 MP/s Encoding [Modular, lossless, effort: 9], Compressed to 700 bytes (0.000 bpp). 4096 x 4096, 0.30 MP/s [0.30, 0.30], 1 reps, 12 threads. ```
2025-10-19 09:54:07
Interesting, maybe you were right about palette sorting, but my instincts tell me something else is up ```cjxl 16million-pschmidt.png 16million-pschmidt.jxl -d 0 -e 10 -g 3 -E 11 -I 100 -P 5 --patches 0 JPEG XL encoder v0.12.0 029cec42 [_AVX2_] {Clang 20.1.8} Encoding [Modular, lossless, effort: 10] Compressed to 2831 bytes (0.001 bpp). 4096 x 4096, 0.213 MP/s, 16 threads.```
A homosapien
2025-10-19 09:55:00
Another win for 0.8 <:KekDog:805390049033191445>
jonnyawsom3
2025-10-19 09:56:01
Time for us to get another 75% density improvement in a PR <:Stonks:806137886726553651>
2025-10-19 09:59:07
I'm trying a pure LZ77 encode, just to see what happens (So far, it's extremely slow)
A homosapien
2025-10-19 10:17:21
`-P 15` got the smallest result
2025-10-19 10:17:53
479 bytes
veluca
A homosapien `-P 15` got the smallest result
2025-10-19 10:31:06
that makes sense
A homosapien
2025-10-19 10:32:24
2025-10-19 10:32:25
JPEG XL performs so well it even surprises the devs <:Stonks:806137886726553651>
jonnyawsom3
I'm trying a pure LZ77 encode, just to see what happens (So far, it's extremely slow)
2025-10-19 10:49:31
Memory usage is sat at 30MB and a single core... Still running though
2025-10-19 10:54:20
Tried doing an effort 9 encode too... Locked up the system so no clue what went wrong there
2025-10-19 10:54:53
`cjxl 16million-pschmidt.png 16million-pschmidt.jxl -d 0 -g 3 -I 0 -e 9 -C 0 --patches 0 -P 0`
2025-10-19 11:05:50
Gave up on LZ77 only, seems to break something at effort 9 with that image and effort 8 isn't anywhere close
A homosapien 479 bytes
2025-10-19 11:07:07
*Definitely* gotta see if we can extract that MA tree, might give us some ideas for new JXL art
A homosapien I think so, it's in our DM's somewhere
2025-10-19 11:08:08
https://discord.com/channels/794206087879852103/822105409312653333/1346452090465030204
2025-10-19 11:08:30
Might need tweaking for v0.8, but then we can compare trees against main. Ideally we could just use a debug build and it would tell us what it's doing
2025-10-20 01:37:49
Made an image that jpegli isn't happy with. Heavy quantization on the right side in YCbCr and too much B in XYB (Have to compare locally due to Discord)
2025-10-20 01:38:38
For reproducing and future testing
2025-10-20 01:39:44
```cjpegli IMG_20240928_204850.png YCbCr.jpg Read 3960x2968 image, 16330595 bytes. Encoding [YUV d1.000 AQ p2 OPT] Compressed to 2213812 bytes (1.507 bpp). 3960 x 2968, 30.018 MP/s [30.02, 30.02], , 1 reps, 1 threads.``` ```cjpegli IMG_20240928_204850.png XYB.jpg --xyb Read 3960x2968 image, 16330595 bytes. Encoding [XYB d1.000 AQ p2 OPT] Compressed to 1875098 bytes (1.276 bpp). 3960 x 2968, 3.507 MP/s [3.51, 3.51], , 1 reps, 1 threads.``` It also made the encoder 10x slower for some reason
2025-10-20 01:51:06
Right, it's color profiles and color management... It always is <:NotLikeThis:805132742819053610>
A homosapien 479 bytes
2025-10-20 06:46:17
We figured out there's been a slow regression since v0.8, and a large one in v0.12. All encoded with `-d 0 -e 9 -g 3 -E 4 -I 100 -P 15` ``` v0.8 Compressed to 479 bytes (0.000 bpp). 4096 x 4096, 0.05 MP/s [0.05, 0.05], 1 reps, 16 threads.``` ``` v0.9 Compressed to 504 bytes (0.000 bpp). 4096 x 4096, 0.033 MP/s [0.03, 0.03], 1 reps, 16 threads.``` ``` v0.10 v0.11 Patches 1 Compressed to 603 bytes (0.000 bpp). 4096 x 4096, 0.038 MP/s [0.04, 0.04], , 1 reps, 16 threads.``` ``` v0.12 Patches 1 Compressed to 2640 bytes (0.001 bpp). 4096 x 4096, 0.055 MP/s, 16 threads.``` ``` v0.12 Effort 10 Compressed to 2643 bytes (0.001 bpp). 4096 x 4096, 0.056 MP/s, 16 threads.``` For some reason effort 10 adds 5 bytes to LfGlobal and removes 2 bytes from the first group compared to effort 9 with patches enabled to force global trees
A homosapien
2025-10-20 06:52:43
I should really set up a script with a local corpus that tests stuff like this
jonnyawsom3
2025-10-20 07:40:36
I mean, we have tests on the github repo, but apparently they're not thorough enough
A homosapien
We figured out there's been a slow regression since v0.8, and a large one in v0.12. All encoded with `-d 0 -e 9 -g 3 -E 4 -I 100 -P 15` ``` v0.8 Compressed to 479 bytes (0.000 bpp). 4096 x 4096, 0.05 MP/s [0.05, 0.05], 1 reps, 16 threads.``` ``` v0.9 Compressed to 504 bytes (0.000 bpp). 4096 x 4096, 0.033 MP/s [0.03, 0.03], 1 reps, 16 threads.``` ``` v0.10 v0.11 Patches 1 Compressed to 603 bytes (0.000 bpp). 4096 x 4096, 0.038 MP/s [0.04, 0.04], , 1 reps, 16 threads.``` ``` v0.12 Patches 1 Compressed to 2640 bytes (0.001 bpp). 4096 x 4096, 0.055 MP/s, 16 threads.``` ``` v0.12 Effort 10 Compressed to 2643 bytes (0.001 bpp). 4096 x 4096, 0.056 MP/s, 16 threads.``` For some reason effort 10 adds 5 bytes to LfGlobal and removes 2 bytes from the first group compared to effort 9 with patches enabled to force global trees
2025-10-20 10:34:38
Caused by https://github.com/libjxl/libjxl/pull/4154
2025-10-20 10:35:04
I've pointed this out before, it causes other lossless size regressions as well
2025-10-20 10:44:16
Jon's [fix](https://github.com/libjxl/libjxl/pull/4236) doesn't work either.
_wb_
2025-10-21 08:19:12
I wonder what it is doing differently, it's a rather pathological case though, more like jxlart than like a typical image
lonjil
2025-10-21 09:06:33
wow jxl really doesn't like this image ``` Encoding [Modular, lossless, effort: 1] Compressed to 149874.3 kB (30.702 bpp). 7216 x 5412, 104.448 MP/s, 4 threads. Encoding [Modular, lossless, effort: 6] Compressed to 141625.3 kB (29.012 bpp). 7216 x 5412, 3.242 MP/s, 4 threads. ```
jonnyawsom3
2025-10-21 09:07:11
I'm guessing it's float?
lonjil
2025-10-21 09:07:49
big_building.ppm from the 16 bit linear set from imagecompression.info
jonnyawsom3
2025-10-21 09:17:46
Huh, interesting. I'm running some effort 11 encodes right now so can't test myself, but very strange it's 3x higher than PNG (According to the website)
lonjil
2025-10-21 09:22:42
hm, where are you seeing that?
jonnyawsom3
lonjil hm, where are you seeing that?
2025-10-21 09:24:37
https://imagecompression.info/lossless/
lonjil
2025-10-21 09:26:04
I don't see png there
2025-10-21 09:26:28
but it's weird that it's worse than jpeg-ls, jpeg2000, and jpeg xr
veluca
2025-10-21 09:32:35
I don't remember that image being particularly problematic
lonjil
2025-10-21 09:33:24
I'm just doing `cjxl -d -e $x big_building.ppm $out` with the latest main commit.
jonnyawsom3
lonjil I don't see png there
2025-10-21 09:34:54
I may have gone slightly insane, but yeah. 10bpp on that graph
2025-10-21 09:36:36
Unrelated, but a slightly slow encode ```JPEG XL encoder v0.10.2 4451ce9 [AVX2,SSE2] Encoding [Modular, lossless, effort: 11] Compressed to 774.9 kB (5.912 bpp). 1024 x 1024, 0.000 MP/s [0.00, 0.00], , 1 reps, 8 threads. PageFaultCount: 1610910657 PeakWorkingSetSize: 1.676 GiB QuotaPeakPagedPoolUsage: 35.42 KiB QuotaPeakNonPagedPoolUsage: 60.29 KiB PeakPagefileUsage: 1.704 GiB Creation time 2025/10/21 21:30:16.122 Exit time 2025/10/21 22:21:13.678 Wall time: 0 days, 00:50:57.555 (3057.56 seconds) User time: 0 days, 00:42:47.953 (2567.95 seconds) Kernel time: 0 days, 05:47:30.031 (20850.03 seconds)```
2025-10-21 09:37:48
Conclusion was: We can make progressive lossless around 10% smaller. Also v0.11 is 20x faster than v0.10 for effort 11 with the same result, so don't use that
lonjil but it's weird that it's worse than jpeg-ls, jpeg2000, and jpeg xr
2025-10-21 09:53:08
I tried encoding my own JPEG-LS file, and it came out to `147,397,615 bytes` while `cjxl -d 0 -e 3` gives `140,226,751 bytes`
2025-10-21 09:53:36
Seems like the 16 bit Linear file is just extremely hard for encoders, but JXL does squeeze past the others
lonjil
2025-10-21 09:53:50
interesting
2025-10-21 09:55:05
I wonder why the chart shows all those as 10 bpp 🤔
jonnyawsom3
https://imagecompression.info/lossless/
2025-10-21 09:55:44
Not sure why that graph says it's 10bpp, but unless I'm encoding the LS file wrong, it's definitely 30bpp
2025-10-21 09:57:26
```-ls mode : encode in JPEG LS mode, where 0 is scan-interleaved, 1 is line interleaved and 2 is sample interleaved. Use -c to bypass the YCbCr color transformation for true lossless```
2025-10-21 09:57:48
Also, an effort 2 JXL actually opens faster than the PPM for me
taliaskul
2025-10-22 01:59:00
hi guys this is 1.4giB working set swiffer quickjet compression
2025-10-22 01:59:23
2025-10-22 02:07:20
well guys then apparently xzip is corrupt this is only 630mb
jonnyawsom3
taliaskul hi guys this is 1.4giB working set swiffer quickjet compression
2025-10-22 02:10:21
What is this meant to be used for?
taliaskul
2025-10-22 02:13:39
sell prod
jonnyawsom3
2025-10-22 02:14:17
https://tenor.com/view/the-rock-gif-23501265
screwball
taliaskul sell prod
2025-10-22 02:14:29
price?
taliaskul
2025-10-22 02:14:29
2025-10-22 02:14:44
its for playing robber baron vlc
2025-10-22 02:14:56
oh theres no price im f2p
screwball
taliaskul
2025-10-22 02:18:54
whats your name
_wb_
https://imagecompression.info/lossless/
2025-10-22 06:25:35
I assume the vertical axis is bits per sample here, not bits per pixel. The numbers for Gray 16 bit linear are roughly the same, around 10 bpp for big_building.
2025-10-22 06:29:20
it's a relatively noisy image (both in terms of image content and photon noise) so I'm not surprised that 48 bits can only be compressed down to 30
2025-10-22 06:32:00
looks like all the charts on https://imagecompression.info/lossless/ are labeled as "bits per pixel" but it's actually bits per sample so in the plots for RGB you have to multiply it by 3
taliaskul
2025-10-22 09:03:18
this is my improvement 632mb
2025-10-22 09:03:35
hi im fee bee
jonnyawsom3
taliaskul this is my improvment 632mb
2025-10-22 09:04:45
What compression are you benchmarking? This server is about image formats and JPEG XL specifically
taliaskul
2025-10-22 02:31:38
hi this is n=np compression
lonjil
2025-10-22 03:35:56
What do you mean by non-entropy bound?
2025-10-22 03:40:21
Which kinds of files can your method compress?
2025-10-22 03:41:20
Actually impossible
2025-10-22 03:44:17
there are 2^512 possible bit strings of length 512. any shorter length has fewer possible bit strings. so not all 512-bit inputs can be mapped to smaller outputs. if some become smaller, necessarily some must become larger.
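A minimal sketch of that counting argument in Python:
```
# Count every bit string strictly shorter than 512 bits.
shorter_outputs = sum(2**k for k in range(512))   # 2^0 + 2^1 + ... + 2^511
assert shorter_outputs == 2**512 - 1              # one fewer than the number of 512-bit inputs
# So a "compressor" that shrinks every 512-bit input must send at least two
# different inputs to the same output, and then it cannot be decoded losslessly.
```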
Exorcist
2025-10-22 03:44:21
https://stackoverflow.com/questions/21220151/is-there-an-algorithm-for-perfect-compression
lonjil
2025-10-22 03:48:46
if you don't have a one to one mapping, you can't reverse the compression without loss
Exorcist
2025-10-22 04:02:28
is your string random generated?
2025-10-22 04:04:17
3024 / 4096 = 0.73828125
That compression ratio means your string is not random enough; does it represent something?
2025-10-22 04:07:56
> produced random 1024 hex digit
How did you produce it? What you claim is "random" is not really random
2025-10-22 04:13:21
I assume you don't want to share your source code (and it is not necessary for the discussion about theory)
2025-10-22 04:19:36
This is how to test a pseudorandom number generator: https://csrc.nist.gov/projects/random-bit-generation/documentation-and-software/guide-to-the-statistical-tests
Why is this about compression? Think of it this way: compression is about finding patterns, randomness is about avoiding patterns
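A minimal illustration of that pattern-vs-randomness point, feeding cryptographically random bytes to a general-purpose compressor (the compressed output typically comes out a few bytes larger than the input):
```
import os
import zlib

data = os.urandom(4096)          # 4 KiB of (pseudo)random bytes
packed = zlib.compress(data, 9)
print(len(data), len(packed))    # typically prints 4096 and a slightly *larger* number
```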
_wb_
2025-10-22 04:21:30
The pigeonhole principle is not something that can be circumvented with some clever new idea. It is even more impossible than a perpetuum mobile machine.
2025-10-22 04:26:27
If unconditional lossless compression with guaranteed nonzero size reduction were possible, then you could apply it recursively and reduce all possible files to one bit
2025-10-22 04:26:56
What would your system decode this one bit to? It can only decode it to two different files
Exorcist
Exorcist https://stackoverflow.com/questions/21220151/is-there-an-algorithm-for-perfect-compression
2025-10-22 04:27:36
You need to read this again: The Shannon limit only applies if you treat a set of symbols "individually". As soon as you try to "combine them", or find any correlations between the symbols, you are "modeling". And this is the territory of Kolmogorov Complexity, which is not computable.
Orum
_wb_ What would your system decode this one bit to? It can only decode it to two different files
2025-10-22 04:32:17
the classic qubit problem
2025-10-22 04:32:52
infinite amounts of data can be stored, but only a 1 or 0 can be retrieved 😆
_wb_
2025-10-22 04:34:08
Anyway, compression/decompression speed is only relevant if the method actually works. To benchmark lossless compression, you need some representative corpus of uncompressed data and then you can see how well it compresses.
Exorcist
2025-10-22 04:34:50
b594c7c5e5b5bfb7523b29dacf67fcd760197c3ea74f04f40817c77a5ac4fc27de20b5e3a91f2dc104ddffb817955fb26a2619e316dd370d87ef2680423c022e2fe4060538f5ee1e687dc8c30395de97c900f5db468777dab0cd3efe28dc5259926f528a3568803179beef7c265820951a6457626314f1e4662a8d1006b4c5f0e35f4f56b6a204bab62ba4fef59fb3e085bd9a039c80426acb338a426311de2c08c89a4cf56d264bd56ad9504bce687096c582676cd00cd9dfcc8d8e1bae3df53440c32764817d9be017bd23638efd42ab932bbe9ec706d6568ca239d4ca21e530fb9af807531e6b1b95a2e34ee9656aab3864b704692b57bc18057e85402b2ed987604f4fe2e0fff859fa231320ad1686cd8e1b1988b69e8a1a4dac3aa4bf2cf66f896efe7f9f0008c435d5cbf75e4e181cbc011f3bf0d7bb334a541d680ea8451fda5b2fa1dbf4c8833db6cb4526856b75f6fc2f5740dd38c93756f7496b7c60cc12b612087bebb7a6b43eb3b7306a5a4e35e4700c5cc8d0d97ac524b3b6d02af50f31afd4e893dc2da6513b2553d7c0fd153fbfddfd7974643dffda5232996359b4baf1311d5828776e7a5cc81f13812bb41ab2c306463f2a50b8d37742f2297aaadf59aae532183c2924f44da29878e248cecf4c8f5061ec4e95922701a05e74718a1f127366b11344acabf1741ab7d0035b4fd5a15c9e286ec04d0a3151
_wb_
2025-10-22 04:38:45
``` $ dd if=/dev/random of=/dev/stdout count=512 bs=1 status=none | xxd -p 1e9ff4499cf12cc23a1a1a73ecfa1f13ac868bfed8e65088f62c6f0e6f5c 3e425ca4baf783fbf5415a8050bf4d4a4f9aaca44e393e7393b875ce402e 61c06b9dc5ae1e14e53b5d76783181d7f01904e51e1cfce0720f6d91682a 67db2c1a4b6b6eba807b56627484a89392d75bd41b61112f007df23a57cc 23be1166511fb4b6d7646d548282f7b48281ae964f0eb49ed286e125affa c8443bd9ef177010d6759978b4947416c5251a3da5bf664749e98da5d60c 3f6f9281d304d9b4b03f332759746d71dcdbc20897261e31d7038ffd07f8 362b5b506205ea9961a60aee3f285a3291f857b25866b815d18236c0a5fe 3efb8895fd93a346248f3f571215acf25f2f7de16480bf2b20ee7d8d0d74 ef34a28c456b298a9eb33384aade9ba5597cb8b5a93ab47c115f5a2eb068 c9f7bea78a32132e4cb11d914c5cf05e7e1b9b1d0eb811eadfaa5f697797 8b5db298f10456ca24ffd5f2540500e9c82ac33c1a8186871af3217fa458 2f45c5ea8109e7cd4648a574d6131b4db16dc0893464e6263ce22eb15aeb 11a1adb9b7a8974ea129eaa6ba15c23159f4f1eb247da0e20bef7ff0d006 b7b5a9405170efab2109e0af85354e96426aec3eb7aed6d1a71d5e532116 628a3d4bb9e27347d40c4f8ba7ae51a514258922a68e4bcc4782332e1962 4e301eb2a2131a5bd885713b35e8594ef8e341eda0f685d9137c458f7168 6a21 ```
2025-10-22 04:40:01
storage size 392 bytes -> but can you actually create a 392 byte file that is completely sufficient to reconstruct the input?
ignaloidas
2025-10-22 04:41:17
are you sure you're not smuggling data through some other means?
Exorcist
2025-10-22 04:42:07
Unless they think rainbow table is a kind of compression
_wb_
2025-10-22 04:43:08
make your encoder produce a file, and then make a decoder that decodes the file, and show me two separate command line invocations and a call to `cmp` showing the roundtrip is lossless.
ignaloidas
2025-10-22 04:43:42
so for example, you could send me the program that can decompress, and then if I send you some random data, you can send me ~300 bytes just that I could decompress back into the same random data?
Orum
2025-10-22 04:43:43
why is the output redacted?
Exorcist
_wb_ make your encoder produce a file, and then make a decoder that decodes the file, and show me two separate command line invocations and a call to `cmp` showing the roundtrip is lossless.
2025-10-22 04:44:12
This is meaningless, since they don't share the source code, they may fake the log
ignaloidas
2025-10-22 04:44:13
I'm down to have a go at that exact experiment, to see where exactly it breaks 🙂
_wb_
Exorcist This is meaningless, since they don't share the source code, they may fake the log
2025-10-22 04:45:35
sure, but I'm assuming this person isn't trying to deceive us, he is just deceiving himself somehow. Doing the test this way should help to find out that it doesn't work, after all.
lonjil
2025-10-22 04:45:55
> Yes, but you would need a code book like hoffman and I seriously doubt you would be able to get below 4096 bits (as starting size).
ok, but what if I start with a 1 GiB file? If what you claim is true, you should be able to compress a random 1 GiB file to 1 MiB.
> And, the code book would add overhead so eventually it would not be worth it.
what is this code book?
_wb_
2025-10-22 04:46:35
the code book is the 120 bytes of side info he needs to compress the 512 bytes to 392 bytes 🙂
lonjil
2025-10-22 04:47:43
what is the code book, and why would there be any difference?
2025-10-22 04:48:51
if your program takes a 512 byte input and produces a smaller output, you can just run it separately for each 512 byte chunk of the original file, combine the results, then repeat the process.
2025-10-22 04:49:21
If you can compress *any* 512 byte string, it doesn't matter whether it was produced by compressing another string.
2025-10-22 04:49:40
If you can't recompress, then your method is not in fact capable of compressing any 512 byte string.
2025-10-22 04:50:50
yes
2025-10-22 04:50:56
but that is not very much information
jonnyawsom3
2025-10-22 04:53:45
What they mean is, let's say I turn `data.dat` into `data.thunk` with your compressor. Then I feed `data.thunk` to make `data.thunk.thunk`. I should be able to decode it back from `data.thunk.thunk` to `data.thunk`, and then decode that back to `data.dat` (Beautiful naming, I know)
2025-10-22 04:55:31
If it's lossless and not entropy based, it won't care if it's a double compressed file, it will just make the second pass even smaller
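A minimal sketch of that round-trip/recompression test; `compress` and `decompress` here are hypothetical placeholders for the claimed tool, not a real API:
```
def recompress_test(data, compress, decompress, rounds=3):
    # data -> data.thunk -> data.thunk.thunk -> ..., as described above.
    stages = [data]
    for _ in range(rounds):
        stages.append(compress(stages[-1]))
    # Walk back: every stage must decode exactly to the previous one.
    for i in range(len(stages) - 1, 0, -1):
        assert decompress(stages[i]) == stages[i - 1], "round trip failed"
    # If the claim held, these sizes would keep shrinking.
    return [len(s) for s in stages]
```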
Orum
2025-10-22 05:11:08
what was the question?
Exorcist
2025-10-22 05:13:09
First, you must verify your algorithm is correct
monad
2025-10-22 05:16:24
this is the canonical place to post about magical compressors: <https://encode.su/forums/19-Random-Compression>
Exorcist
2025-10-22 05:27:12
verify correctness before ratio and speed
2025-10-22 05:29:21
Reading the source would make verification trivial, so I keep playing the puzzle
Orum
2025-10-22 05:31:41
compression size is size of the compressed data + size of the decompressor
2025-10-22 05:33:36
where do they show the size of the decompressor?
2025-10-22 05:34:48
I only see the size of the compressed data
2025-10-22 05:38:40
yes, none of that is the size of the decompressor
Exorcist
2025-10-22 05:39:46
Let's assume you don't hardcode the data in the source (that would be an outright cheat)
Orum
2025-10-22 05:40:01
how large the binary (and any libraries it uses) is to decompress it
Exorcist
2025-10-22 05:42:08
I suggest you write a separate encoder & decoder, so that your round trip doesn't share global variables
Orum
2025-10-22 05:42:36
well I doubt that, but anyway, we need to know the size of the decompressor
2025-10-22 05:44:24
so you compressed a 4KB file to "something less than a megabyte"
2025-10-22 05:44:33
that's not terribly impressive
2025-10-22 05:45:53
I suggest you look at the Hutter prize rules and then come back here
lonjil
Orum how large the binary (and any libraries it uses) is to decompress it
2025-10-22 05:46:11
That's not really relevant. No decompressor, no matter how large, is capable of doing what he claims.
Orum
2025-10-22 05:47:36
it's rather clearly stated in the rules
2025-10-22 05:48:03
in any case if you're not willing to release the decompressor you aren't even eligible
2025-10-22 05:49:11
then you better beat the current record by a lot
2025-10-22 05:49:44
also it's 5000 euros
2025-10-22 05:50:22
the prize is just the prize, it's not a pool of money that is being exhausted
2025-10-22 05:52:25
well if you get it to 1% that would easily make you the record holder as long as you meet the execution and memory requirements
2025-10-22 05:55:12
compressing random data isn't terribly interesting or useful, even
Exorcist
2025-10-22 05:56:32
But the author claims to "compress random" data, so it must not be correct
Orum
2025-10-22 05:59:03
then break enwik9 into 4kbit chunks, compress them, and claim the prize
jonnyawsom3
2025-10-22 06:10:39
<@991011772381163581> keeping on topic to the server, what happens if you compress this? And then could you upload the decoded version after
2025-10-22 06:17:15
```FF 0A FA 7F 01 90 08 06 01 00 48 01 00 00 40 00 04 40 00 04 40 00 04 40 00 04 40 00 04 40 00 04 40 00 04 40 00 04 00 0B 44 38 A3 A0 AE D1 D1 51 F1 22 6B E1 8A 00 08 22 54 CE CB C0 C6 BF CA B0 04 BC 7E 7E 4F B1 52 48 EC 0F 37 DB 77 F5 00 E0 22 16 85 98 8B EB 4F 8C 8D 4B BF F0 47 2E BD F9 FC 7E C5 B5 7A C3 E4 97 3B A2 54 E6 4C B1 5E A7 B1 1C 12 A0 95 04 07 78 01 03 03 03 03 03 03 03 03 03 03 03 03 03 03 03 03```
2025-10-22 06:27:33
Interesting
2025-10-22 06:32:42
> This is the llm explanation.
Forgive my curiosity, but was the code AI-assisted too by any chance?
_wb_
_wb_ ``` $ dd if=/dev/random of=/dev/stdout count=512 bs=1 status=none | xxd -p 1e9ff4499cf12cc23a1a1a73ecfa1f13ac868bfed8e65088f62c6f0e6f5c 3e425ca4baf783fbf5415a8050bf4d4a4f9aaca44e393e7393b875ce402e 61c06b9dc5ae1e14e53b5d76783181d7f01904e51e1cfce0720f6d91682a 67db2c1a4b6b6eba807b56627484a89392d75bd41b61112f007df23a57cc 23be1166511fb4b6d7646d548282f7b48281ae964f0eb49ed286e125affa c8443bd9ef177010d6759978b4947416c5251a3da5bf664749e98da5d60c 3f6f9281d304d9b4b03f332759746d71dcdbc20897261e31d7038ffd07f8 362b5b506205ea9961a60aee3f285a3291f857b25866b815d18236c0a5fe 3efb8895fd93a346248f3f571215acf25f2f7de16480bf2b20ee7d8d0d74 ef34a28c456b298a9eb33384aade9ba5597cb8b5a93ab47c115f5a2eb068 c9f7bea78a32132e4cb11d914c5cf05e7e1b9b1d0eb811eadfaa5f697797 8b5db298f10456ca24ffd5f2540500e9c82ac33c1a8186871af3217fa458 2f45c5ea8109e7cd4648a574d6131b4db16dc0893464e6263ce22eb15aeb 11a1adb9b7a8974ea129eaa6ba15c23159f4f1eb247da0e20bef7ff0d006 b7b5a9405170efab2109e0af85354e96426aec3eb7aed6d1a71d5e532116 628a3d4bb9e27347d40c4f8ba7ae51a514258922a68e4bcc4782332e1962 4e301eb2a2131a5bd885713b35e8594ef8e341eda0f685d9137c458f7168 6a21 ```
2025-10-22 06:48:41
what size does this compress to?
2025-10-22 06:53:12
or try this if you like something not completely random: ``` $ xxd -p -s 200000 -l 512 some_random_image.jxl 66b3685b1d3123a20c919c486c4a291b673d89aa64feba9b5634306ace75 33b0f3408f832959da1f1faf54b00798ffd95465e4437b0b9167e1a64ee1 ac9b76059c0ab80b95edbbb1268a3d3dc3c4a6b31fa3a307ac7a1729e965 6979007cde5b25d9cdee2185e219821793905a9f114d9beb6ea525bd107c 7c05cdde0db96e5b10015c75980dc5ced5cd15f1649e16e2384fe524dc4f 629ba0a0a48c8ff7b8acd4cca78f75cf791459f4a316dfd59ae759e82cec a0349275b6d992dfdbc29a0efb6ca10c01c77b3f123716234b0fe97d31d9 cd310697d460f0e3ac0c7d1a19a34fca2aa1ffcc734a61a092224dc2b3ee 756fe21f46f3167cd68d06b2b0f8481be45faae7f99fafdb99d7282b7ad9 d8b4194ce27a363f30b3d35d3dbe46522ba82aee833821ec97763c3cfc17 2e6db6c6f2d3884374e58af5a49326fd343a1bd05a7e644a2db3c235c857 65a8bf889b9c426fadfc42eff678adde82e0b7a58756670f0fdce7f1a5e4 59c2178f2ffa42d4aabc829d058d44e051e5495391a125df9571bf885cf8 93c402aa250c7c8d05e2c36ba757a8438e9c97da13d0157f097f3e896caf 65e74b4f87370080258ac4781d134f44902ff12a65a3a34b3a123dbc90db 295bf77994ef36e33d9fa262b64c8cdcae7aad9e86ea9a2fb2089a3edaed b2dd58d72bf51530a4979cb7b168fbc6ea0fbc27ed2e76464920352a885b 5a4e ```
spider-mario
2025-10-22 06:58:50
31-35 kB/s is squarely in "slow" territory; compression ratio on one small input doesn't tell much
2025-10-22 07:00:04
you can see how various compression methods fare on various files here: https://quixdb.github.io/squash-benchmark/
Exorcist
2025-10-22 07:00:29
Did you write the Huffman tree **inside** the compressed data?
spider-mario
2025-10-22 07:02:01
for example, on the file "kennedy.xls" (1MB) and machine "s-desktop", brotli achieves anywhere from a compression ratio of 8.18 at 137.5MiB/s to a ratio of 15.7 at 267KiB/s
_wb_
2025-10-22 07:02:19
Speed does not matter if correctness is not established first. It is very much impossible that your method can reduce any random 512 byte sequence into less than 400 bytes or whatever.
spider-mario
2025-10-22 07:02:20
(decompression speed around 350-400MiB/s for both)
_wb_
2025-10-22 07:03:23
No matter if it has 10 billion years to do that or one nanosecond, it is just impossible to map 2^512 inputs to 2^400 outputs in a reversible way.
spider-mario
2025-10-22 07:18:03
it's faster to test, though 😁
A homosapien
2025-10-22 07:25:33
The "fail fast" philosophy
juliobbv
_wb_ No matter if it has 10 billion years to do that or one nanosecond, it is just impossible to map 2^512 inputs to 2^400 outputs in a reversible way.
2025-10-22 09:11:03
tell that to the "AI face enhancer" people
2025-10-22 09:11:51
taliaskul
2025-10-23 05:41:19
so look one of these zips 50% compression many of these zips 25% compression
2025-10-23 05:48:08
*one of these bamboo-as-bix bookshelf
jonnyawsom3
2025-10-23 06:00:36
... What?
A homosapien
2025-10-23 07:20:08
We've got ourselves a new Fab <:monkaMega:809252622900789269>
monad
2025-10-23 07:30:33
There are other areas under **General chat** in the left sidebar to collect discussions unrelated to JXL.
jonnyawsom3
2025-10-23 08:17:17
Namely <#805176455658733570>
taliaskul
2025-10-23 08:24:46
2025-10-23 09:39:34
hi guys i have you exclusive access the latest crypto Basic Keyboard on my doge -feebee
2025-10-23 09:42:56
so look for hf detail thats just example (theres infinite of these in the op)
2025-10-23 09:43:26
yes thats not sped up
2025-10-23 09:44:01
well see you guys fee bee luk lik 4
jonnyawsom3
2025-10-23 09:45:25
<@794205442175402004> maybe it would be best to delete all the random half-gig files they've uploaded... We tried to investigate and they seem to contain highly nested subdirectories with randomly generated filenames, all ending in thousands of `.bat` files
2025-10-23 11:40:27
A random thought, could an MA tree compress the B channel of normal maps? I know it can use values from previous channels for decisions, but I don't think it can use those values for any predictions and certainly not the square root needed for full reconstruction from R and G
veluca
2025-10-23 11:49:29
what's the difference between a prediction and a decision? 🙂
2025-10-23 11:50:13
depending on the size of your normal map, it might make sense to have some more-or-less accurate map from all the possible (R, G) values to B
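Assuming the normal map stores unit-length tangent-space vectors with each component mapped from [-1, 1] onto the integer range (an assumption about the input, not something guaranteed above), such an (R, G) → B map is fully determined by B = sqrt(1 - x*x - y*y). A rough Python sketch of that LUT, as an illustration rather than anything in libjxl:
```
import math

def build_b_lut(bits=8):
    # Assumes components are stored as (v + 1) / 2 scaled to [0, 2^bits - 1],
    # the usual normal-map convention; illustration only, not libjxl code.
    maxv = (1 << bits) - 1
    lut = [[0] * (maxv + 1) for _ in range(maxv + 1)]
    for r in range(maxv + 1):
        x = r / maxv * 2.0 - 1.0
        for g in range(maxv + 1):
            y = g / maxv * 2.0 - 1.0
            z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
            lut[r][g] = round((z + 1.0) / 2.0 * maxv)
    return lut   # 256x256 entries at 8 bits; 16-bit inputs would need a coarser map
```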
jonnyawsom3
2025-10-23 11:59:25
Prediction is the end of an MA branch, can be applied to any values that match the branch. A decision is the yes/no chain that forms the branch. Though now you mention it, I suppose you could make a tree that uses `set` for every possible value and it would still be relatively small compared to storing the B channel normally
veluca
2025-10-24 12:02:04
> Prediction is the end of an MA branch, can be applied to any values that match the branch. A decision is the yes/no chain that forms the branch.
yeah, I understood that, I guess I did not make it clear enough that it was a leading question 😛
jonnyawsom3
2025-10-24 12:02:35
...I may be stupid, of course you know, you helped make it xD
2025-10-24 12:06:06
I ran a test using the hacked together MA exporting build, on a cropped 256x256 chunk of a normal set to effort 11. It only used 16 contexts and Weighted Predictor
2025-10-24 12:08:09
Aaaand I just beat effort 11 with the first set of parameters I tried xD
veluca
2025-10-24 12:12:02
not sure I understood what you did for the tree
jonnyawsom3
2025-10-24 12:18:57
```cjxl -d 0 Normal.png Normal.jxl --allow_expert_options -e 11 JPEG XL encoder v0.12.0 [_AVX2_,SSE4,SSE2] Encoding [Modular, lossless, effort: 11] Compressed to 95944 bytes (11.712 bpp). 256 x 256, 0.002 MP/s [0.00, 0.00], , 1 reps, 16 threads.``` ```cjxl -d 0 Normal.png Normal.jxl -e 10 --patches 0 -I 100 -E 2 -C 0 -P 15 JPEG XL encoder v0.12.0 [_AVX2_,SSE4,SSE2] Encoding [Modular, lossless, effort: 10] Compressed to 92333 bytes (11.271 bpp). 256 x 256, 0.008 MP/s [0.01, 0.01], , 1 reps, 16 threads.```
2025-10-24 12:22:34
To test a custom tree we'd have to hardcode it into a build, but the possibility is there
2025-10-24 01:02:38
Also here's the normal for future testing. I don't think the current MA builder in libjxl uses `set` nodes at all, so it can't do the LUT style B reconstruction
pixyc
taliaskul hi guys i have you exclusive access the latest crypto Basic Keyboard on my doge -feebee
2025-10-24 01:18:14
what the fuck
_wb_
2025-10-24 06:21:57
I sometimes wonder if we shouldn't have been more generic in the design of RCT and basically allow something like `B += (a*R + b*G + c) >> d` to be done
2025-10-24 06:24:04
(it would not be enough for this case, but it would be more flexible than the set of RCTs we have now)
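A minimal sketch of that generalized lifting step; it stays losslessly invertible because the update to B depends only on the other channels and on constants the bitstream would signal (a, b, c, d here are placeholders):
```
def generic_rct_encode(R, G, B, a, b, c, d):
    # Replace B with its residual against a fixed-point predictor from R and G.
    return R, G, B - ((a * R + b * G + c) >> d)

def generic_rct_decode(R, G, B_res, a, b, c, d):
    # Exact inverse: recompute the same predictor and add it back,
    # i.e. B += (a*R + b*G + c) >> d as in the message above.
    return R, G, B_res + ((a * R + b * G + c) >> d)
```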
jonnyawsom3
2025-10-24 06:47:58
In the past week I've come to realise, I think the MA tree can handle most things, the issue is how to encode it. For example the LUT idea Veluca had could work perfectly for an 8bit map, probably only kilobytes to save megabytes, but 16bit normals are very common too. Reducing the entropy back down to 8 instead of 16bit by using the same size MA LUT should still help a lot. All hypothetical for now, who knows what impact it would have on decode time, if it works at all. But a novel idea to save 1/3 of the file
_wb_
2025-10-24 08:56:10
structured MA trees that correspond to a LUT could be detected at decode time and implemented as a LUT instead of the generic branchy tree traversal codepath. You could even go further and do some kind of JIT compilation of MA trees (and prediction) to whatever is the most efficient functionally equivalent implementation. Obviously these kinds of optimizations come with potential security risks and it's also pretty complicated, but it is an interesting direction. It's basically like how a browser can implement javascript with a simple interpreter or do JIT compilation, except it's not a Turing-complete language in the case of jxl so the risks should be a bit lower (but then again the benefits are also lower since this only applies to modular mode)
spider-mario
2025-10-24 09:23:28
it is quite tempting to use something like https://cranelift.dev/ for this, isn't it
veluca
2025-10-24 09:50:09
~~hand-written machine code~~
jonnyawsom3
2025-10-27 10:49:22
Hmm, v0.8 wins yet again
2025-10-27 10:49:28
Exorcist
2025-10-30 04:11:23
Where is the compressed file size?
2025-10-30 04:16:04
Well, where is the compressed data size?
_wb_
2025-10-30 04:41:40
If you run it 1000 times on bits coming from /dev/random, does it produce something that roundtrips every time? And also compresses on average?
2025-10-30 05:35:26
Please do such a benchmark and report the results.
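A minimal sketch of such a harness, with `compress` and `decompress` as hypothetical stand-ins for the claimed tool:
```
import os

def random_roundtrip_benchmark(compress, decompress, runs=1000, size=512):
    total_in = total_out = failures = 0
    for _ in range(runs):
        data = os.urandom(size)          # stand-in for /dev/random
        packed = compress(data)
        if decompress(packed) != data:   # does it round-trip every time?
            failures += 1
        total_in += len(data)
        total_out += len(packed)
    # ...and does it compress on average?
    print(f"failures: {failures}/{runs}, average ratio: {total_out / total_in:.3f}")
```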
monad
2025-10-30 10:45:00
Please post to an appropriate channel or make a thread in the general forum.
2025-10-30 11:08:25
I believe this channel should remain related to JPEG XL benchmarks. That would be the default expectation given the server topic.
Exorcist
2025-10-30 11:16:27
I think we can stop talking about the "magic" (euphemism) compressor.
_wb_
2025-10-31 09:53:25
Right, let's stick here to benchmarks related to image compression and relevant for jxl. General data compression and non-image stuff like audio is more something for the general chat, e.g. in <#794206087879852106> or possibly <#805176455658733570> .
Demiurge
2025-11-04 09:14:33
Thank you so much for finding things like this to help fix regressions and make important improvements to fidelity and visibility of details in the shadows.
2025-11-04 09:14:53
Deep, detailed shadows really contribute to making images "pop" with depth.
2025-11-04 09:15:40
And I really hate to see libjxl struggle with something so important and impactful
_wb_
juliobbv btw, libaom's tune iq in 3.13 is no longer optimized for just ssimu2, that'd be `tune=ssimulacra2`
2025-11-05 10:14:32
what other metrics is it optimized for?
juliobbv
_wb_ what other metrics is it optimized for?
2025-11-05 11:02:05
Butter 3-norm and DSSIM (and <@703028154431832094> and <@297955493698076672>'s 👀)
2025-11-05 11:04:36
Butter max-norm also gets some degree of improvement too
jonnyawsom3
2025-11-05 11:29:56
We do love some 👀 tuning, though I feel like mine are gonna fall out my head after staring at the same DCT blocks for an hour
spider-mario
We do love some 👀 tuning, though I feel like mine are gonna fall out my head after staring at the same DCT blocks for an hour
2025-11-05 12:06:39
how I imagine it:
juliobbv
We do love some 👀 tuning, though I feel like mine are gonna fall out my head after staring at the same DCT blocks for an hour
2025-11-05 12:07:55
it's kind of like the tetris effect, but the underlying blocks can be of different sizes
2025-11-05 12:08:29
2025-11-05 12:12:16
me and the aomanalyzer have spent so much time together it could be considered a proper long-term relationship
AccessViolation_
2025-11-05 12:12:49
I wish we had something like aomanalyzer for jxl
2025-11-05 12:13:02
jxlatte can do some of it
juliobbv
2025-11-05 12:13:13
oh yeah I was going to suggest that
2025-11-05 12:13:29
but aomanalyzer is surprisingly complete for what it is
2025-11-05 12:14:11
the only things that bother me are frame decoding speed and the frame presentation order view being broken
AccessViolation_
2025-11-05 12:14:31
I think a tool like that would be relatively easy to make if you keep it simple. because it's more akin to reading out the data structures than actually decoding to a complete image
jonnyawsom3
juliobbv it's kind of like the tetris effect, but the underlying blocks can be of different sizes
2025-11-05 12:15:18
To make the encoder quality better, I have to sacrifice my tolerance to low quality images on the web... It was already at a point I'd mention what codec Discord was using based on the video artifacts, or tell when an image was subsampled
juliobbv
2025-11-05 12:16:31
yeah, it's a mindset switch from fidelity toward artifact prevention (mostly "appeal")
jonnyawsom3
AccessViolation_ I wish we had something like aomanalyzer for jxl
2025-11-05 12:19:50
I'd not heard of aomanalyzer, I was only aware of this thing (Which I wanted for years) <https://www.elecard.com/products/video-analysis/streameye> We do *have* the debugging visualisations in the code... The issue is I don't think they've worked for years, so I've been using jxlatte (When it doesn't error) <https://github.com/libjxl/libjxl/blob/3a4f18bbc74939ddc4481a0395e1f6599a68bf44/lib/jxl/enc_ac_strategy.cc#L68>
juliobbv
I'd not heard of aomanalyzer, I was only aware of this thing (Which I wanted for years) <https://www.elecard.com/products/video-analysis/streameye> We do *have* the debugging visualisations in the code... The issue is I don't think they've worked for years, so I've been using jxlatte (When it doesn't error) <https://github.com/libjxl/libjxl/blob/3a4f18bbc74939ddc4481a0395e1f6599a68bf44/lib/jxl/enc_ac_strategy.cc#L68>
2025-11-05 12:29:31
aomanalyzer: https://pengbins.github.io/aomanalyzer.io/
username
I'd not heard of aomanalyzer, I was only aware of this thing (Which I wanted for years) <https://www.elecard.com/products/video-analysis/streameye> We do *have* the debugging visualisations in the code... The issue is I don't think they've worked for years, so I've been using jxlatte (When it doesn't error) <https://github.com/libjxl/libjxl/blob/3a4f18bbc74939ddc4481a0395e1f6599a68bf44/lib/jxl/enc_ac_strategy.cc#L68>
2025-11-05 12:39:38
> jxlatte (When it doesn't error) It got some new commits a week or two ago. I wonder if they would make anything work better in your case?
jonnyawsom3
juliobbv aomanalyzer: https://pengbins.github.io/aomanalyzer.io/
2025-11-05 12:43:04
Well for some reason I can only load the first frame, but it's a real treat when it does
username > jxlatte (When it doesn't error) It got some new commits a week or two ago. I wonder if they would make anything work better in your case?
2025-11-05 12:44:07
I'm pretty sure most bugs were fixed already, but I only have an old binary
username
2025-11-05 12:44:56
I thought you were given a newer one at some point?
2025-11-05 12:46:41
checking and yeah, you should have a jar with all of the changes/commits from before the latest 3 that were done
juliobbv
Well for some reason I can only load the first frame, but it's a real treat when it does
2025-11-05 01:01:32
are you loading an avif, or an ivf?
2025-11-05 01:01:56
the analyzer only supports one frame from avif files iirc
jonnyawsom3
2025-11-05 01:02:10
AV1 encoded MP4
juliobbv
2025-11-05 01:02:44
oh I'd remux to either webm or ivf
jonnyawsom3
2025-11-05 01:06:04
Ah yeah, WebM worked. Never heard of IVF before
username checking and yeah, you should have a jar with all of the changes/commits from before the latest 3 that were done
2025-11-05 01:06:56
Actually, that build was before all of these
username
Actually, that build was before all of these
2025-11-05 01:12:05
what's the hash of the jar you have?
jonnyawsom3
username what's the hash the jar you have?
2025-11-05 01:17:52
`90c75b191a4beb64957ee8f6e7197a210da4caafa37dc6ff9f0453d0a2d6ec27` `17/9/2024`
username
2025-11-05 01:26:25
<@238552565619359744> here's a copy on my PC that seems to be from ¿early 2025?
2025-11-05 01:26:39
I thought this was the one you switched to
2025-11-05 01:32:45
seems like these are the only commits that are different from the build you have compared to the one I have:
jonnyawsom3
username <@238552565619359744> here's some copy on my PC that seems to be from ยฟearly 2025?
2025-11-05 01:42:26
Yeah, that fixes the error I've been getting
juliobbv yeah, it's a mindset switch from fidelity toward artifact prevention (mostly "appeal")
2025-11-05 01:46:54
Actually it's the other way around in this case. The encoder is oversmoothing and using bigger block sizes than it should, so we're walking it back to get more noise in the HF areas and only smoothing in the LF areas. But because Gaborish and EPF handle almost all of the DCT noise, we can get away with a lot more localised block sizing
2025-11-05 01:51:40
At least, that's the theory I'm running with right now. All I know is that v0.8 was much sharper and smaller, with the main difference being it used smaller blocks
monad
I'd not heard of aomanalyzer, I was only aware of this thing (Which I wanted for years) <https://www.elecard.com/products/video-analysis/streameye> We do *have* the debugging visualisations in the code... The issue is I don't think they've worked for years, so I've been using jxlatte (When it doesn't error) <https://github.com/libjxl/libjxl/blob/3a4f18bbc74939ddc4481a0395e1f6599a68bf44/lib/jxl/enc_ac_strategy.cc#L68>
2025-11-05 03:21:54
still works on a proper OS when passing the define to cmake
jonnyawsom3
2025-11-05 03:24:41
Hmm, maybe it was only the Modular code that broke
monad
2025-11-05 03:28:37
The MA tree graph and related dot file don't work. Other maps from even earlier versions are gone. Patches-related output works by default, AC strategy works assuming you compile the code.
jonnyawsom3
2025-11-05 03:38:28
Thanks for checking them, the MA tree is what we tried before so that explains it
I'd not heard of aomanalyzer, I was only aware of this thing (Which I wanted for years) <https://www.elecard.com/products/video-analysis/streameye> We do *have* the debugging visualisations in the code... The issue is I don't think they've worked for years, so I've been using jxlatte (When it doesn't error) <https://github.com/libjxl/libjxl/blob/3a4f18bbc74939ddc4481a0395e1f6599a68bf44/lib/jxl/enc_ac_strategy.cc#L68>
2025-11-06 08:33:00
Ah right, that's why... https://github.com/libjxl/libjxl/issues/2287
RaveSteel
2025-11-06 10:10:29
Ah, so that's what you meant when I asked a few weeks ago on how to see patches
2025-11-06 10:10:50
I wondered, because it worked fine for me on linux
monad
2025-11-07 07:34:25
"a proper OS"
jonnyawsom3
2025-11-11 06:44:41
2025-11-11 06:45:05
<@384009621519597581> what about lossy modular instead?
AccessViolation_
2025-11-11 06:45:48
I tried that briefly, I couldn't really get that below 1 kB
2025-11-11 06:45:52
I'll try again
2025-11-11 06:47:38
oh actually
2025-11-11 06:47:49
idk what I did wrong the first time but this is promising
lonjil
2025-11-11 06:48:01
I doubt libjxl encoder is particularly optimized for this use case either
2025-11-11 06:48:26
So a jxl encoder designed for sub 200 byte previews might work pretty well if tuned for that
jonnyawsom3
2025-11-11 06:49:28
Probably a JXL art generator more than an encoder
lonjil
2025-11-11 06:49:44
still a kind of encoder, even if very odd
jonnyawsom3
2025-11-11 06:50:36
IIRC 2IM basically brute forces all combinations and picks the closest, JXL art targeting a low size could do the same (With a faster decoder)
Tirr
2025-11-11 06:52:24
cram everything into DCT256 blocks and heavily quantize all of those
AccessViolation_
2025-11-11 06:52:57
if only it could fix up the color a little more. 262 bytes `cjxl scaled.png scaled-lossy-modular.jxl -m 1 --resampling=8 --photon_noise_iso=30000000 -q 5 --gaborish=0 --epf=1 -e 10`
2025-11-11 06:53:25
but if you allow 300-400 bytes it becomes significantly better
2025-11-11 06:53:35
disproportionally so compared to vardct of the same size
2025-11-11 06:53:43
from brief testing
Tirr
2025-11-11 06:53:51
looks very much like it's missing squeeze residuals
AccessViolation_
2025-11-11 06:56:45
`--epf=3` is really nice for blending awful color artifacts
veluca
2025-11-11 06:57:19
what are you even doing xD
AccessViolation_
2025-11-11 06:57:48
trying to best the low res polygon shenanigans webp2 does :D
2025-11-11 06:57:56
https://skal65535.github.io/triangle/index.html
veluca
2025-11-11 06:57:59
where's the original?
lonjil
2025-11-11 06:58:13
veluca
2025-11-11 06:58:17
ah found it
AccessViolation_
2025-11-11 06:58:21
2025-11-11 06:58:22
<@179701849576833024>
AccessViolation_ `--epf=3` is really nice for blending awful color artifacts
2025-11-11 07:00:35
`--epf=[0, 1, 2, 3]` in order:
2025-11-11 07:01:41
all of them around 350-360 bytes
2025-11-11 07:02:41
gaborish is...weird, at these distances
2025-11-11 07:03:06
especially in vardct, because then you get *massive* blocks. I don't know why it thinks it can get away with them
2025-11-11 07:04:22
like "I didn't know leonardo da vinci made abstract art" levels of massive blocks
2025-11-11 07:05:18
also in vardct, encoder effort below 5 does better for the same reason, above that it starts assuming it can use massive blocks while it really shouldn't
2025-11-11 07:06:21
(I'm aware libjxl isn't tuned for these distances)
jonnyawsom3
Tirr cram everything into DCT256 blocks and heavily quantize all of those
2025-11-11 07:09:42
Unfortunately 128 and 256 aren't implemented in libjxl yet, a shame since it would do well for backgrounds and skies while I'm tuning the weights of the rest
2025-11-11 07:10:02
Though, at Resampling 8, it should be fitting in 1 DCT block anyway
AccessViolation_
2025-11-11 07:10:33
one thing I was thinking that could be good at these very low distance scenarios, is abusing adaptive LF smoothing to achieve less jarring transitions between blocks
jonnyawsom3
2025-11-11 07:11:33
Actually, I thought of something even worse
AccessViolation_ if only it could fix up the color a little more. 262 bytes `cjxl scaled.png scaled-lossy-modular.jxl -m 1 --resampling=8 --photon_noise_iso=30000000 -q 5 --gaborish=0 --epf=1 -e 10`
2025-11-11 07:11:43
Run it with `-R 0`
Tirr
Unfortunately 128 and 256 aren't implemented in libjxl yet, a shame since it would do well for backgrounds and skies while I'm tuning the weights of the rest
2025-11-11 07:13:11
you'll need to hack libjxl (or write another encoder) to do dct256 of course
jonnyawsom3
AccessViolation_ especially in vardct, because then you get *massive* blocks. I don't know why it thinks it can get away with them
2025-11-11 07:13:56
The image is only 60 pixels tall with the resampling, so even 8x8 is 1/8th of the image
AccessViolation_
2025-11-11 07:14:18
ah, true
Run it with `-R 0`
2025-11-11 07:16:18
this produces a larger file and also is pretty high 'resolution'
jonnyawsom3
2025-11-11 07:16:44
It disables squeeze, making lossy modular only do quantization
AccessViolation_
2025-11-11 07:16:56
`cjxl scaled.png scaled-lossy-modular.jxl -m 1 --resampling=8 -d 25 -e 10 --epf=0 -R 0` 781 bytes
2025-11-11 07:17:23
what if we force it to use a palette :3c
jonnyawsom3
2025-11-11 07:18:22
Unfortunately delta palette doesn't have quality options, otherwise that would be fun
afed
2025-11-11 07:20:18
what if use `--override_bitdepth`
jonnyawsom3
AccessViolation_ if only it could fix up the color a little more. 262 bytes `cjxl scaled.png scaled-lossy-modular.jxl -m 1 --resampling=8 --photon_noise_iso=30000000 -q 5 --gaborish=0 --epf=1 -e 10`
2025-11-11 07:20:52
Tried encoding the same size with my WIP block retuning
AccessViolation_
2025-11-11 07:22:26
oh that's an improvement for sure
2025-11-11 07:22:44
throw an `--epf=3` on that and it might look even better
jonnyawsom3
2025-11-11 07:23:15
2025-11-11 07:23:37
We're just remaking AVIF at this rate
AccessViolation_
afed what if use `--override_bitdepth`
2025-11-11 07:24:01
I don't think this does anything to the way data is encoded in VarDCT mode. the concept of bit depth only exists during decoding as far as I'm aware. internally, it's always 32-bit floating point per channel
afed
2025-11-11 07:25:40
at least for lossy modular, it should do something, as far as I remember
jonnyawsom3
2025-11-11 07:26:02
Doesn't do anything for me
2025-11-11 07:36:48
It's 1,803 Bytes because I can't enter other parameters, but there's the debug view for Distance 25
2025-11-11 07:37:40
Internally it's 2x resampled, so I scaled up the blocks to match
2025-11-11 07:41:37
Main and retune, bearing in mind I've only been focusing on distance 1 so far
2025-11-11 07:43:00
Still lots to do, but slow and steady progress
veluca
2025-11-11 08:02:31
y'all successfully nerdsniped me
2025-11-11 08:02:39
2025-11-11 08:03:24
best I managed to do
jonnyawsom3
2025-11-11 08:04:15
By the way, shouldn't libjxl do the same for accurate comparisons between them? Assuming it doesn't already https://github.com/libjxl/jxl-rs/pull/474
veluca
2025-11-11 08:04:28
or this if you like more noise
By the way, shouldn't libjxl do the same for accurate comparisons between them? Assuming it doesn't already https://github.com/libjxl/jxl-rs/pull/474
2025-11-11 08:04:38
it doesn't deinterleave anything so...
jonnyawsom3
2025-11-11 08:05:04
That would certainly do it
veluca
veluca or this if you like more noise
2025-11-11 08:05:32
(if you're wondering, I modified the encoder for this)
jonnyawsom3
2025-11-11 08:05:52
In what way?
veluca
2025-11-11 08:07:09
biggest thing is changing squeeze_xyb_qtable, then I added some logic to not store quantization factors for channels that are all 0, and set EPF to basically the maximum I could
2025-11-11 08:07:39
2025-11-11 08:07:42
AccessViolation_ if only it could fix up the color a little more. 262 bytes `cjxl scaled.png scaled-lossy-modular.jxl -m 1 --resampling=8 --photon_noise_iso=30000000 -q 5 --gaborish=0 --epf=1 -e 10`
2025-11-11 08:08:23
definitely an improvement over this ๐Ÿ˜›
jonnyawsom3
2025-11-11 08:10:27
Ahh, I've been meaning to try adjusting the table for a few months at this point, just never got round to it. The logic seems like a no-brainer, assuming it's up to spec. Do they apply specifically to this image, or do you think there's a PR around the corner haha
lonjil
2025-11-11 08:11:14
do y'all think blurhash-like results at blurhash-like sizes would be feasible with the right encoding method?
jonnyawsom3
2025-11-11 08:13:33
I don't see why not, blurhash is basically just upsampled 4x3 images (by default on their demo site). Could either store a JXL that size or do a JXL art representation
AccessViolation_
2025-11-11 08:13:40
I was just about to joke about that. congrats on adopting blurhash into the bitstream!
veluca
2025-11-11 08:15:57
this is actually really good. without directly comparing with the original right now, this passes as a blurred and noised-up version of the original
veluca
Ahh, I've been meaning to try adjusting the table for a few months at this point, just never got round to it. The logic seems like a no-brainer, assuming it's up to spec. Do they apply specifically to this image, or do you think there's a PR around the corner haha
2025-11-11 08:16:42
oh definitely tweaked ad-hoc
AccessViolation_
2025-11-11 08:16:52
how does it look without noise?
veluca
2025-11-11 08:16:53
(but the spec doesn't know anything about that table)
2025-11-11 08:18:21
not even too bad
AccessViolation_
2025-11-11 08:18:30
I wonder if in the future, we can optimize something about the way the first stage(s) look to achieve something that looks good when 300 bytes have been loaded, while still eventually progressing to the full image
veluca
2025-11-11 08:18:32
(-11 bytes)
AccessViolation_
AccessViolation_ I wonder if in the future, we can optimize something about the way the first stage(s) look to achieve something that looks good when 300 bytes have been loaded, while still eventually progressing to the full image
2025-11-11 08:19:17
basically: just progressive decode, but tuned specifically for the first <500 bytes as well, since getting something like this is a lot better than getting the abstract art
veluca not even too bad
2025-11-11 08:19:46
that's not bad either
2025-11-11 08:20:55
I think the noise doesn't really help here anyway. I suspect that the reason WebP2 needs noise is because noise hides the structure of the polygons. these, however, look smooth and blurry, and so there isn't any unintended structure the noise would need to distract you from
jonnyawsom3
2025-11-11 08:23:59
The noise helps with banding, so polygons of solid color would naturally have a lot of it
lonjil do y'all think blurhash-like results are blurhash-like sizes would be feasible with the right encoding method?
2025-11-11 08:26:12
After very brief testing I realised an issue: the blurhash is just raw data without any header, while JXL has to do the signalling to be in spec. Around 100 bytes it's pretty good, but hitting the 30 bytes of a blur/thumbhash is a struggle due to overhead, without a dedicated encoder/MA tree builder for it
AccessViolation_
2025-11-11 08:28:51
on second thought, some noise does help I think... it makes it look less like it's just *supposed to be* a blurred image, because images that were blurred are never also noisy
AccessViolation_ I wonder if in the future, we can optimize something about the way the first stage(s) look to achieve something that looks good when 300 bytes have been loaded, while still eventually progressing to the full image
2025-11-11 08:30:03
if this was to be attempted though, you could never use noise because it would persist when the image is fully loaded as well
jonnyawsom3
AccessViolation_ if this was to be attempted though, you could never use noise because it would persist when the image is fully loaded as well
2025-11-11 08:30:45
https://github.com/libjxl/libjxl/issues/4368
2025-11-11 08:30:54
It's already happening accidentally
lonjil
2025-11-11 08:32:48
hm, a lot of people are using base64'd SVGs for LQIP instead of blurhash, and those tend to be a decent bit bigger, with the benefit of not needing a special blurhash decoder on your page
2025-11-11 08:33:03
so maybe that'd be a better point of comparison, trying to do a better job than LQIP SVGs
Oleksii Matiash
2025-11-11 08:37:41
> because images that were blurred are never also noisy
If blurred intentionally, but think about an out-of-focus photo. It can, and in the case of a film photo should, be both noisy and blurred
AccessViolation_
2025-11-11 08:40:06
that's why I intentionally changed it to say "were blurred" just before posting ^^
_wb_
2025-11-11 08:41:00
At these jxl-art kind of filesizes, it is probably worth checking in detail where the actual bits are going. Perhaps hardcoding some tree specifically for this stuff could help. Something simple like
```
if c > [last channel with nonzero coeffs]
- Set 0
if c > 0
- Set 0
- Gradient 0
```
might work, i.e. Clampedgradient for the 8x8 or whatever first pass, Zero for all the residuals, and a separate context for all the zero-only channels.
AccessViolation_
2025-11-11 08:44:00
I was gonna suggest we reserve `-q 0` for this in the encoder, since I mistakenly thought quality zero currently errors, but quality 0 already works
_wb_
2025-11-11 08:44:39
in FLIF and FUIF, I made it so that even negative qualities would work
2025-11-11 08:45:30
q 0 was just some point where I was pretty sure nobody would want to go that low, but if you wanted a quality -123 image, why not
lonjil
2025-11-11 08:45:38
did anyone ever make a tool to dump information about a jxl file (useful to investigate in this case)
jonnyawsom3
2025-11-11 08:46:32
I've got debug builds for VarDCT blocks and Lossless MA trees, but with the former I can't use most parameters and the latter only works on a single group in a single channel
veluca
_wb_ At these jxl-art kind of filesizes, probably it is worth it to check in detail where the actual bits are going to. Perhaps hardcoding some tree specifically for this stuff could help. Something simple like ``` if c > [last channel with nonzero coeffs] - Set 0 if c > 0 - Set 0 - Gradient 0 ``` might work, i.e. Clampedgradient for the 8x8 or whatever first pass, Zero for all the residuals, and a separate context for all the zero-only channels.
2025-11-11 08:48:17
the tree we ended up with actually has 3 clusters, and they're genuinely useful (not clustering is massively worse)
2025-11-11 08:48:30
(it's basically one node per channel)
_wb_
lonjil did anyone ever make a tool to dump information about a jxl file (useful to investigate in this case)
2025-11-11 08:49:02
probably if you make a build of libjxl with all the verbosity and debug stuff compiled in, it can already tell you a lot. though the quality of those verbose logs is not great
veluca
2025-11-11 08:49:22
I do wonder if the default squeeze script is the best we could do (I imagine a couple of squeeze steps followed by something akin to delta palette is better)
jonnyawsom3
2025-11-11 08:50:50
Hmm, good idea. Delta palette residuals with Squeeze smoothing the result
AccessViolation_
veluca
2025-11-11 08:52:22
now is a good time to remind everyone that the ✨amazing✨ 250 byte image I tried to beat looked like this! and we didn't even have to invent a specific coding tool to beat it
2025-11-11 08:54:51
webp2 retains more detail, which I guess was the goal, but the JXL is more usable
2025-11-11 08:55:04
(imo)
spider-mario
AccessViolation_ now is a good time to remind everyone that the โœจamazingโœจ 250 byte image I tried to beat looked like this! and we didn't even have to invent a specific coding tool to beat it
2025-11-11 08:55:30
yeah, to me, this is in the realm of “a nice technical/intellectual achievement, but who is ever going to actually use it?”
AccessViolation_
2025-11-11 08:55:36
exactly
jonnyawsom3
2025-11-11 08:55:52
If we wanted detail, we could just make it greyscale (Or Y)
AccessViolation_
AccessViolation_ now is a good time to remind everyone that the โœจamazingโœจ 250 byte image I tried to beat looked like this! and we didn't even have to invent a specific coding tool to beat it
2025-11-11 09:01:07
(slight correction, this is the one without noise - I forgot I disabled that)
veluca
2025-11-11 09:02:56
2025-11-11 09:03:04
just a quick try, I'm sure one could do better
2025-11-11 09:03:19
(say with some patches)
jonnyawsom3
2025-11-11 09:05:54
Run dot detection and just give her a :-)
AccessViolation_
2025-11-11 09:16:49
favorite things I have learned today:
1. JPEG XL is a feasible format for super small placeholders
2. someone decided to touch up *the* Mona Lisa image on the Wikipedia article in 2011 because they didn't like the way the colors looked??
A homosapien
2025-11-11 09:19:53
2im in 256 bytes, not a brute force encode, just good settings
2025-11-11 09:22:19
At thumbnail size it looks really convincing, better than blurhash or thumbhash imo
2025-11-11 09:25:31
128 bytes
2025-11-11 09:27:28
no weird false colors or ringing, just a Picassofication of the image
AccessViolation_
2025-11-11 09:34:50
these look better than what webp2 tries to do I think
A homosapien
2025-11-11 09:35:18
colors are much more accurate, at the expense of detail
AccessViolation_
2025-11-11 09:36:19
I've looked at some more webp2 images and they seem to not care about color borders that much. a polygon will happily intersect with an area that has a completely different color, and because webp2 uses a small palette of colors it's not like that intersecting problematic polygon can be given an average color either: it's a spike that distorts the structure of the image
2025-11-11 09:38:00
2025-11-11 09:38:53
they're from this paper: https://arxiv.org/abs/1809.02257
A homosapien
It's 1,803 Bytes because I can't enter other parameters, but there's the debug view for Distance 25
2025-11-11 09:51:38
1801 bytes
veluca
2025-11-11 09:56:28
pretty good
2025-11-11 09:56:40
can't it do gradients?
A homosapien
2025-11-11 10:01:03
Doesn't seem like it, just polygonal shapes
veluca
2025-11-11 10:09:27
sad 😛
A homosapien
2025-11-11 10:51:30
<@811568887577444363> can twim do gradients? If not, what was the rationale behind just shapes? Simplicity?
_wb_
2025-11-12 07:46:59
I wonder how far we would get with 8x upsampling and using just colors from the default palette to represent the small image, or even just from the first color cube in the default palette.
TheBigBadBoy - ๐™ธ๐š›
A homosapien 2im in 256 bytes, not a brute force encode, just good settings
2025-11-12 08:25:34
how did you generate this one? 😳
A homosapien
2025-11-12 08:27:41
veluca
2025-11-12 08:56:42
ah that's not a svg
2025-11-12 08:56:47
I thought it was
AccessViolation_
2025-11-12 10:07:35
<@179701849576833024> I got the webp2 code working. here it is on the same scaled source we used. 228 bytes
veluca
2025-11-12 10:08:14
Yeah I like the jxl one better 😛
AccessViolation_
2025-11-12 10:08:18
me too
veluca
2025-11-12 10:08:46
But tbh you probably should just upscale it in HTML beyond what jxl supports
AccessViolation_
2025-11-12 10:10:24
can you take an 8x upscaled image, patch copy it to another layer, then upscale that layer independently?
2025-11-12 10:12:31
(in terms of possibility, this isn't a request for you to do that to the benchmark image 😅 )
veluca
2025-11-12 10:16:40
IIRC no
lonjil
2025-11-12 10:51:51
Maybe you could upscale with an SVG filter
AccessViolation_
2025-11-12 10:55:44
I wonder if splines can be abused to create big splotches of color for use cases like this, better than VarDCT or lossy modular can
_wb_
2025-11-12 10:57:39
they can but they come at a relatively large signaling cost
AccessViolation_
2025-11-12 11:06:12
ah interesting
jonnyawsom3
AccessViolation_ can you take an 8x upscaled image, patch copy it to another layer, then upscale that layer independently?
2025-11-12 01:32:22
Theoretically you could do squeeze in a special way to get a similar effect
AccessViolation_ I wonder if splines can be abused to create big splotches of color for use cases like this, better than VarDCT or lossy modular can
2025-11-12 01:33:38
Dots could work, depending on signalling cost
AccessViolation_
2025-11-12 02:09:35
I thought dots weren't added to the bitstream because patches could achieve something similar
2025-11-12 02:12:30
no mention of dots in the paper
Exorcist
2025-11-12 02:13:21
I remember someone claiming "a 4x4 block is enough to encode a dot"
AccessViolation_
2025-11-12 02:21:51
if the dot is nice and in the center. if it's in the corner, you're going to have four 4x4 blocks that need to retain all of their high frequency components ^^"
2025-11-12 02:24:09
JXL does have these, so in that case the four corners with the dot can use these separately coded darker pixels so it doesn't require you to retain all high frequency components for the rest of the blocks
veluca
2025-11-12 02:24:25
jxl has dot detection, they get coded as patches
2025-11-12 02:24:50
they had custom syntax at some point, got rid of that
_wb_
2025-11-12 02:38:49
Exactly. The custom syntax for dots was kind of funky, with quad trees to indicate positions at subpixel resolution, quantized ellipses and iirc even its own entropy coder, so it complicated the spec quite a bit, while dots are mostly useful when the diameter is like 0-1.5 px, so the affected area is 2x2 pixels or maybe 3x3; when the dots are thicker, it's less problematic for the DCT. So just encoding them as patches is not really more expensive in signaling cost, and saves the spec complexity of having another coding tool.
AccessViolation_
2025-11-12 02:44:58
speaking of, what's the imagined usage scenario for DCT2x2? which types of patterns benefit from it
2025-11-12 02:45:59
I feel like it'd almost have to be noise given how small those are
2025-11-12 02:46:11
so like, textures of concrete for example? sand?
veluca
2025-11-12 02:46:15
most things with sharp edges benefit I believe
2025-11-12 02:46:36
(unclear to me when to use this and when to use IDENTITY/HORNUSS/whatever)
jonnyawsom3
2025-11-12 02:47:57
All I know is that the encoder is *very* eager to use them. I've had to weight them lower by a lot so the other block types have a chance
2025-11-12 02:49:44
The other day I wondered about making a DCT test image, with an area where each block type should be best. Would make getting a baseline for further tuning easier
AccessViolation_
2025-11-12 02:53:13
do we have a corpus that changes to the encoder can be tested against?
2025-11-12 02:53:25
I know there's the QOI corpus that has some interesting images
2025-11-12 02:54:40
https://qoiformat.org/benchmark/
2025-11-12 03:12:51
going through the image set, I recognize a lot of these from all sorts of JXL benchmarks I've seen so clearly this isn't new information haha
jonnyawsom3
2025-11-15 08:32:14
I did a small test of the various encoding types, seeing how each holds up in the different decoders
2025-11-15 08:35:00
2025-11-15 08:35:14
The faster decoding levels mostly rely on fast paths, so it's not surprising they're still slow. Weighted is nearly there though and Default isn't far off either
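For context, a sketch of how such variants are typically produced with cjxl; the assumption that FD1-FD4 in the table below map to the `--faster_decoding` levels is mine:
```
# Lossless encodes at increasing faster-decoding levels (1-4); higher levels
# trade some density for streams that are simpler to decode.
for fd in 1 2 3 4; do
  cjxl input.png "fd${fd}.jxl" -d 0 -e 7 --faster_decoding="$fd"
done
```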
Tirr
2025-11-15 08:36:33
I'm writing a fast-lossless path for jxl-rs, stay tuned 😉
jonnyawsom3
2025-11-15 08:37:10
I was about to say, I have no doubt that table will be useless by the end of next week haha
Tirr
2025-11-15 08:38:32
and the RCT transform isn't even SIMDified yet
jonnyawsom3
2025-11-15 08:38:55
At least I can just swap the numbers out now that I've made the table, it's even formatted for Github if we want to do some fancy before and afters
Tirr I'm writing fast-lossless path for jxl-rs, stay tuned ๐Ÿ˜‰
2025-11-15 08:41:04
Ahh right, I thought this *was* a fast path, I need to read more carefully heh https://github.com/libjxl/jxl-rs/pull/481
Tirr
2025-11-15 08:41:46
it's about halfway there
veluca
I did a small test of the various encoding types, seeing how each holds up in the different decoders
2025-11-15 08:54:31
should measure lossy too, that's the part that we actually optimized for now 😛
jonnyawsom3
2025-11-15 08:55:19
*Just* as I closed my terminal xD Gimme a min and I'll add it
veluca should measure lossy too, that's the part that we actually optimized for now ๐Ÿ˜›
2025-11-15 09:05:01
Lossless Effort 7 (Modular refers to Lossy Modular)
1 Thread, 5 Reps, 3840 x 2160, 8-bit
```
| MP/s     | jxl-rs    | libjxl     | jxl-oxide |
|----------|-----------|------------|-----------|
| Default  | 2.65      | 4.51       | 3.72      |
| Effort 1 | 6.54      | 20.89      | 18.82     |
| Gradient | 4.77      | 19.08      | 18.96     |
| Weighted | 5.10      | 7.15       | 5.92      |
| FD1      | 3.97      | 7.56       | 5.95      |
| FD2      | 4.67      | 9.18       | 7.18      |
| FD3      | 6.79      | 21.75      | 16.20     |
| FD4      | 8.59      | 25.88      | 27.15     |
| Progres- | 4.59      | 8.35       | 7.51      |
| Modular  | 5.62      | 10.24      | 9.96      |
| VarDCT   | 14.66     | 30.88      | 18.88     |
```
```
| CPU Secs | jxl-rs    | libjxl     | jxl-oxide |
|----------|-----------|------------|-----------|
| Default  | 15.42     | 8.81       | 11.08     |
| Effort 1 | 5.70      | 1.56       | 2.14      |
| Gradient | 8.39      | 1.89       | 2.11      |
| Weighted | 7.98      | 5.48       | 6.92      |
| FD1      | 10.19     | 5.00       | 6.89      |
| FD2      | 8.77      | 4.27       | 5.69      |
| FD3      | 5.94      | 1.78       | 2.50      |
| FD4      | 4.67      | 1.36       | 1.47      |
| Progres- | 8.50      | 4.28       | 5.47      |
| Modular  | 7.03      | 3.50       | 3.95      |
| VarDCT   | 2.42      | 1.14       | 1.98      |
```
2025-11-15 09:05:24
I threw in Lossy Modular and Progressive Lossless too because why not, but VarDCT is certainly catching up
Orum
2025-11-15 09:06:43
1 thread <:WhatThe:806133036059197491>
jonnyawsom3
2025-11-15 09:06:55
Because jxl-rs hasn't been multithreaded yet
Orum
2025-11-15 09:07:39
that's no reason to kneecap libjxl <:KekDog:805390049033191445>
jonnyawsom3
2025-11-15 09:07:50
Or Oxide, but this is just for relative comparison as more PRs get merged
veluca
2025-11-15 10:04:26
huh, the gap with libjxl is definitely smaller on my machine/image, what are you running this on? (image and cpu)
jonnyawsom3
veluca huh, the gap with libjxl is definitely smaller on my machine/image, what are you running this on? (image and cpu)
2025-11-15 10:07:11
A 4K (3840 x 2160) game screenshot and a Ryzen 7 1700
veluca
2025-11-15 10:17:45
Can you share the jxl file for vardct?
AccessViolation_
2025-11-15 10:40:14
I didn't know jxl-oxide was so fast 😮 I wasn't expecting it to be competitive at all, I'm impressed
jonnyawsom3
veluca Can you share the jxl file for vardct?
2025-11-15 10:45:54
The image was actually taken to celebrate decoding some JXLs via modding the game, so it felt fitting to test with it too
2025-11-15 10:45:57
Here's the full commands and output too
```
wintime -- djxl --disable_output --num_reps 5 --num_threads 0 Test.jxl
JPEG XL decoder v0.12.0 6efa0f5a [_AVX2_] {Clang 20.1.8}
Decoded to pixels.
3840 x 2160, geomean: 31.785 MP/s [30.003, 32.747], 5 reps, 0 threads.
PageFaultCount: 158461
PeakWorkingSetSize: 123.4 MiB
QuotaPeakPagedPoolUsage: 36.41 KiB
QuotaPeakNonPagedPoolUsage: 5.969 KiB
PeakPagefileUsage: 126.2 MiB
Creation time 2025/11/15 10:43:33.896
Exit time 2025/11/15 10:43:35.239
Wall time: 0 days, 00:00:01.342 (1.34 seconds)
User time: 0 days, 00:00:00.171 (0.17 seconds)
Kernel time: 0 days, 00:00:01.140 (1.14 seconds)
```
```
wintime -- jxl_cli -s -n 5 Test.jxl
Decoded 41472000 pixels in 2.9917646 seconds: 13862053.184264563 pixels/s
PageFaultCount: 398554
PeakWorkingSetSize: 214.8 MiB
QuotaPeakPagedPoolUsage: 33.11 KiB
QuotaPeakNonPagedPoolUsage: 5.57 KiB
PeakPagefileUsage: 228.8 MiB
Creation time 2025/11/15 10:43:45.048
Exit time 2025/11/15 10:43:48.121
Wall time: 0 days, 00:00:03.073 (3.07 seconds)
User time: 0 days, 00:00:00.593 (0.59 seconds)
Kernel time: 0 days, 00:00:02.484 (2.48 seconds)
```
veluca
2025-11-15 10:49:41
```
   Compiling jxl v0.1.1 (/home/luca/jxl-rs/jxl)
   Compiling jxl_cli v0.1.0 (/home/luca/jxl-rs/jxl_cli)
    Finished `release` profile [optimized + debuginfo] target(s) in 8.43s
Decoded 8294400 pixels in 0.172853886 seconds: 47985036.33294076 pixels/s
luca@desktop ~/jxl-rs main $ djxl Test.jxl --disable_output --num_reps 10 --num_threads=0
JPEG XL decoder v0.11.1 794a5dcf [AVX3_DL,AVX3,AVX2,SSE4,SSE2]
Decoded to pixels.
3840 x 2160, geomean: 79.635 MP/s [77.39, 80.75], 10 reps, 0 threads.
```
2025-11-15 10:49:43
mhhh
2025-11-15 10:50:04
(but I guess you're testing on windows which probably changes some things)
AccessViolation_
2025-11-15 10:52:12
presumably `--target-cpu=native` during the build process has little benefit in jxl-rs since it uses dynamic feature detection and hand-written SIMD right?
Tirr
2025-11-15 10:53:45
it does turn off some of the dynamic feature detection but I guess so
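For reference, the usual way to try this on a cargo project (generic cargo/rustc usage, not a jxl-rs-specific recipe):
```
# Build for the host CPU's full instruction set; with hand-written SIMD plus
# runtime feature detection, this is expected to change little, as noted above.
RUSTFLAGS="-C target-cpu=native" cargo build --release
```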
jonnyawsom3
veluca ``` Compiling jxl v0.1.1 (/home/luca/jxl-rs/jxl) Compiling jxl_cli v0.1.0 (/home/luca/jxl-rs/jxl_cli) Finished `release` profile [optimized + debuginfo] target(s) in 8.43s Decoded 8294400 pixels in 0.172853886 seconds: 47985036.33294076 pixels/s luca@desktop ~/jxl-rs ๎‚ main $ djxl Test.jxl --disable_output --num_reps 10 --num_threads=0 JPEG XL decoder v0.11.1 794a5dcf [AVX3_DL,AVX3,AVX2,SSE4,SSE2] Decoded to pixels. 3840 x 2160, geomean: 79.635 MP/s [77.39, 80.75], , 10 reps, 0 threads. ```
2025-11-15 10:58:40
libjxl is 2.6x faster than me and jxl-rs is 3.4x faster, seems like quite the jump for singlethreaded. Either my CPU is just that bad, or something strange is going on
veluca
AccessViolation_ presumably `--target-cpu=native` during the build process has little benefit in jxl-rs since it uses dynamic feature detection and hand-written SIMD right?
2025-11-15 11:12:25
correct
libjxl is 2.6x faster than me and jxl-rs is 3.4x faster, seems like quite the jump for singlethreaded. Either my CPU is just that bad, or something strange is going on
2025-11-15 11:13:06
linux and threadripper 7950x (which among other things has avx512)
jonnyawsom3
2025-11-15 11:21:18
You're making me think of Gigapixel fast-lossless encodes again. We never tried 512 with it
Tirr
2025-11-15 11:25:31
and jxl-rs got a fast-lossless path just now
2025-11-15 11:26:33
it does get significantly faster, though a bit slower than I expected
jonnyawsom3
2025-11-15 11:26:59
Drats, just as I turned my computer off
Tirr
2025-11-15 11:28:07
you can check it again with more optimizations later 🙂
jonnyawsom3
2025-11-15 11:30:26
Yeah, I'll probably give it a week and then make an updated chart
AccessViolation_
You're making me think of Gigapixel fast-lossless encodes again. We never tried 512 with it
2025-11-15 12:57:54
I just saw a gigapixel image of the moon on reddit a few minutes ago
2025-11-15 12:58:37
iirc I tried gigapixel but that was back when chunked lossy encoding didn't work or I didn't know how to do that. I don't remember
2025-11-15 01:02:53
https://petapixel.com/2023/05/12/photographers-incredible-gigamoon-image-is-made-from-280000-photos/
2025-11-15 01:13:07
it's not available for free but still cool to look at in the dynamic zoom tool <https://www.easyzoom.com/imageaccess/40d920f226e9451cba72a74430be5fd2>
ignaloidas
2025-11-15 11:57:31
There are a bunch of quite big images (though not quite gigapixel scale, because of stitching software limits) on https://siliconprawn.org e.g. https://siliconprawn.org/map/amd/n8c0c186/single/amd_n8c0c186_ryancor_mz_20x_overview.jpg (though it's all JPEG because of the size of them, even pre-stitching is JPEG to make it a "workable size")
jonnyawsom3
AccessViolation_ iirc I tried gigapixel but that was back when chunked lossy encoding didn't work or I didn't know how to do that. I don't remember
2025-11-16 12:03:27
I meant Mario getting 11 GP/s for standalone fast lossless. AVX512 on a threadripper or server chip could push it even further
veluca
libjxl is 2.6x faster than me and jxl-rs is 3.4x faster, seems like quite the jump for singlethreaded. Either my CPU is just that bad, or something strange is going on
2025-11-16 09:36:46
eh
2025-11-16 09:37:00
could buy that avx512 makes the gap bigger
lonjil
AccessViolation_ I wonder if in the future, we can optimize something about the way the first stage(s) look to achieve something that looks good when 300 bytes have been loaded, while still eventually progressing to the full image
2025-11-22 12:46:23
Been thinking about this. Since the DC frame is modular encoded, would it be possible to encode it such that an early squeeze stage would look similar to what veluca achieved above within the first few hundred bytes of an otherwise normal image?
_wb_
2025-11-22 12:49:45
If it's a DC frame, yes. Otherwise, only if the image fits in a DC group (2048x2048), since you can't apply squeeze across DC groups in that case.
jonnyawsom3
2025-11-22 02:15:32
Huh, doesn't progressive_dc merge them into a single squeezed DC or something similar?
_wb_
2025-11-22 02:25:11
Yes, a DC frame with squeeze
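A rough sketch of how one might poke at this today with existing cjxl flags (whether the early squeeze passes of that DC frame look acceptable within a few hundred bytes is exactly the open question here):
```
# Request a separate, squeezed DC frame plus progressive AC passes.
cjxl input.png progressive.jxl --progressive_dc=1 --progressive
# Truncate to the first few hundred bytes to inspect what an early stage carries
# (viewing this requires a decoder that tolerates partial input).
head -c 500 progressive.jxl > first500.jxl
```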
AccessViolation_
lonjil Been thinking about this. Since the DC frame is modular encoded, would it be possible to encode it such that an early squeeze stage would look similar to what veluca achieved above within the first few hundred bytes of an otherwise normal image?
2025-11-24 06:06:18
hm, I suppose a browser implementation could do this out-of-spec manually: the very very early stages get blurred to the point of a blurhash, and the blur strength lessens as more components get loaded. the native blur is not nearly enough for those very large blocks
2025-11-24 06:07:41
that might be worth considering for the firefox implementation (on the firefox side, not jxl-rs to be clear)