JPEG XL


libjxl

_wb_
2024-09-25 09:10:55
What is missing in libjxl to make it work directly without external tiling? We do have chunked encode and pixel callback for decoding, so both can be done in a streaming / limited memory way...
tufty
2024-09-25 09:11:26
I've been looking at the chunked mode, but I think it won't work for streaming write
2024-09-25 09:12:33
there are no guarantees about what order data_at() will fetch pixels, if pixels will be fetched more than once, etc., so it needs to be backed by the whole source image in memory
2024-09-25 09:13:02
it would need to (eg.) only ask for pixels top to bottom in small regular chunks, and (maybe?) never ask for the same pixel twice
_wb_
2024-09-25 09:13:07
It doesn't, but maybe we need to make the api contract tighter
2024-09-25 09:13:18
<@179701849576833024> any thoughts?
Traneptora
2024-09-25 09:14:55
hydrium currently can handle massive images without storing the whole image in memory
2024-09-25 09:15:36
My experience has been that decoding hydrium encodes is very slow
2024-09-25 09:15:52
and ROI decoding is nearly impossible
2024-09-25 09:16:09
this is without `--one-frame`
veluca
_wb_ <@179701849576833024> any thoughts?
2024-09-25 09:16:34
we should talk about it -- a single jxl file is a much better solution for many reasons, and ideally we'd get the API in a state where it can be used for this
Traneptora
2024-09-25 09:16:58
But the upside is it works well at conserving memory
tufty
2024-09-25 09:18:12
my 2p, fwiw, would be for JXL to add a tiled mode: write the tiles as a set of frames with a little extra metadata to say how they should be ordered, plus an index at the end
veluca
2024-09-25 09:18:37
a jxl file is already tiled
2024-09-25 09:18:47
and already has an index
Traneptora
2024-09-25 09:19:05
that's what hydrium does, but frame tiling can't have an index
tufty
2024-09-25 09:19:35
ok, then guarantees about chunked write data_at() ordering should be possible, I'd think
Traneptora
2024-09-25 09:20:45
<@310374889540550660> Are you looking for an API like the one at https://github.com/Traneptora/hydrium/blob/a278f0ae4b4259181a3c34469682aea733024e8d/src/include/libhydrium/libhydrium.h#L273
veluca
2024-09-25 09:20:58
something like adding a flag that guarantees it only requests access in scanline order for 2k x 2k tiles?
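Concretely, the guarantee being discussed would let a writer rely on a request pattern like this (an illustrative sketch with made-up names, not libjxl's actual group geometry — the real groups also carry a few border pixels, as the vips log later in this thread shows):

```python
def scanline_order_tiles(width, height, tile=2048):
    """Yield (left, top, w, h) requests in strict row-major order.

    Illustrates the proposed contract: `top` never decreases between
    successive requests, so a writer only ever needs about `tile` rows
    of source pixels in memory at once.
    """
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield (left, top,
                   min(tile, width - left),
                   min(tile, height - top))
```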
Traneptora
2024-09-25 09:21:07
If so, that seems reasonable to add to libjxl
2024-09-25 09:21:16
as an api
tufty
2024-09-25 09:21:49
actually, it seems to do that now. ok to copy-paste a shortish log?
veluca
2024-09-25 09:21:58
sure
2024-09-25 09:22:05
I am also pretty sure it will happen in practice
2024-09-25 09:22:36
but we might want not to *guarantee* it unless you ask explicitly
2024-09-25 09:23:10
for example, if you do "middle out" mode for progression you probably want non-scanline
tufty
2024-09-25 09:23:20
```
john@banana ~/pics $ vips copy wtc.jpg x.jxl[chunked]
vips_foreign_save_jxl_pixel_format:
vips_foreign_save_jxl_pixel_format:
vips_foreign_save_jxl_data_at: left = 0, top = 0, width = 2056, height = 2056
vips_foreign_save_jxl_pixel_format:
vips_foreign_save_jxl_data_at: left = 2040, top = 0, width = 2064, height = 2056
vips_foreign_save_jxl_pixel_format:
vips_foreign_save_jxl_data_at: left = 4088, top = 0, width = 2064, height = 2056
vips_foreign_save_jxl_pixel_format:
vips_foreign_save_jxl_data_at: left = 6136, top = 0, width = 2064, height = 2056
vips_foreign_save_jxl_pixel_format:
vips_foreign_save_jxl_data_at: left = 8184, top = 0, width = 1188, height = 2056
vips_foreign_save_jxl_pixel_format:
vips_foreign_save_jxl_data_at: left = 0, top = 2040, width = 2056, height = 2064
vips_foreign_save_jxl_pixel_format:
vips_foreign_save_jxl_data_at: left = 2040, top = 2040, width = 2064, height = 2064
vips_foreign_save_jxl_pixel_format:
...
```
2024-09-25 09:23:51
I added some prints so you can see the rects it's asking for, and they seem to be top to bottom, phew!
2024-09-25 09:24:01
with libjxl v0.11
2024-09-25 09:24:14
yes, maybe just add a guarantee about this to the docs?
veluca
2024-09-25 09:24:43
yep, perhaps conditional on setting some flag
2024-09-25 09:28:21
<@310374889540550660> can you open an issue?
tufty
2024-09-25 09:29:58
sure, I'll do it now
2024-09-25 09:30:15
I'll upload a sample image too! hehe
Traneptora
2024-09-25 09:30:27
What happens if the image is permuted
2024-09-25 09:30:35
permuted toc
veluca
2024-09-25 09:30:53
this is on the encoder side no?
2024-09-25 09:31:28
if you ask for a permuted toc, you don't get asked input in scanline order 😛
tufty
2024-09-25 09:42:09
I made an issue https://github.com/libjxl/libjxl/issues/3853
2024-09-25 09:44:50
will there ever be parallel calls to data_at()? I guess not, though the docs don't say (I think)
veluca
2024-09-25 09:45:53
there might be
2024-09-25 09:46:06
well, you need to enable threads for it
Traneptora
2024-09-25 09:47:28
I'd imagine that pixel callbacks can happen in parallel if threads are enabled
veluca this is on the encoder side no?
2024-09-25 09:48:01
yea, true ig
2024-09-25 09:48:08
you just have to pass data in scanline order
tufty
2024-09-25 09:48:17
could that break ordering? eg. a thread asks for tile 0x0 and stalls, other threads ask for later tiles and continue down the image, the tile 0x0 thread finally finishes, then the next tile is out of order
veluca
2024-09-25 09:48:51
I believe (but I'd need to check) that threads will ask in scanline order
Traneptora
2024-09-25 09:49:00
well, but they may finish out of scanline order
2024-09-25 09:49:09
what happens in that scenario?
tufty
2024-09-25 09:49:22
libvips has this problem 😦
2024-09-25 09:49:46
it has a complicated mechanism to keep requests in scanline order even if individual threads stall
Traneptora
2024-09-25 09:49:46
you could always have libjxl consider this as a possibility and use permuted TOC to fix it
2024-09-25 09:50:10
jpeg xl has the lucky feature where you can calculate the tile permutation
2024-09-25 09:50:20
so if the tiles end up out of order you just change how you send the TOC
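The fix-up Traneptora describes can be modeled abstractly: write groups in whatever order threads finish them, record where each one landed, and signal that in the TOC (a toy model of the idea, not the actual jxl bitstream encoding):

```python
def build_toc(num_sections, completion_order):
    """Map each section to the file slot it was written into.

    Sections are written in `completion_order` as worker threads
    finish; the returned table lets a decoder find section i at slot
    toc[i], so out-of-order completion never blocks the writer.
    """
    slot_of = {sec: slot for slot, sec in enumerate(completion_order)}
    return [slot_of[sec] for sec in range(num_sections)]
```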
tufty
2024-09-25 09:50:30
that's nice
2024-09-25 09:51:19
ok, i'll ignore the possibility of stalls breaking tile ordering
veluca
2024-09-25 09:51:25
what do you care about exactly?
Traneptora
2024-09-25 09:52:12
it sounds like the issue is that the API doesn't guarantee that it requests tiles in scanline order
2024-09-25 09:52:28
it may do this anyway, but if it's not a guarantee, tufty has extra buffering concerns to worry about
veluca
2024-09-25 09:52:41
yes but with multiple threads what would you *want* to have?
Traneptora
2024-09-25 09:52:43
if the API promises that it won't request tiles outside of scanline order, then buffering is much easier on the clientside
veluca
2024-09-25 09:52:58
does the order of requesting or of releasing matter?
2024-09-25 09:53:39
I believe with multiple threads libjxl still *asks* in scanline order (unless you give a weird parallel runner)
2024-09-25 09:53:51
but the time at which they get released might be somewhat shuffled
Traneptora
2024-09-25 09:54:37
but that's handled gracefully by libjxl, is it not?
2024-09-25 09:54:56
i.e. even if the threads release in a weird order, the jxl bitstream won't be shuffled
tufty
2024-09-25 09:57:12
I think it'll work something like this:
- I start a background write thread, that thread calls JxlEncoderAddChunkedFrame()
- it calls data_at() and asks for scanline 0, I stall it while I prepare the pixels
- I generate the first 2100 scanlines (2048 plus a safety margin)
- I let data_at() return for the first set of tiles and wait
- as soon as data_at() asks for scanline 2040, I stall it again
- I move my scanline window down, copy over any excess, and generate the next few thousand lines
- I let data_at() continue again and wait
veluca
2024-09-25 09:57:58
I don't see `release_buffer` in here
tufty
2024-09-25 09:57:59
so I need to know that as soon as data_at() asks for scanline 2040, no more requests for anything in scanline 0 will come in
2024-09-25 09:59:37
yes, I'd need to do something about release_buffer too
2024-09-25 10:00:40
needing a couple of semaphores and some threads is a bit painful, it'd be simpler with a push API, of course
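tufty's stall-and-refill plan can be sketched with a condition variable standing in for the semaphores (all names here are illustrative; `data_at` is a stand-in for the libjxl callback, and this only works if requests really do arrive top-to-bottom):

```python
import threading

class ScanlineWindow:
    """Serve top-to-bottom tile requests from a sliding window of rows."""

    def __init__(self, height, window=2048, margin=52):
        self.height = height
        self.step = window + margin   # 2048 rows plus a safety margin
        self.filled = 0               # scanlines generated so far
        self.wanted = 0               # highest row the encoder asked for
        self.cv = threading.Condition()

    def producer(self):
        """Background pixel generator: refill the window on demand."""
        while True:
            with self.cv:
                if self.filled >= self.height:
                    return
                self.filled = min(self.filled + self.step, self.height)
                self.cv.notify_all()          # wake a stalled data_at()
                # stall until the encoder asks past the current window
                self.cv.wait_for(lambda: self.wanted > self.filled
                                 or self.filled >= self.height)

    def data_at(self, top, h):
        """Stand-in for the libjxl callback: block until rows exist."""
        with self.cv:
            self.wanted = max(self.wanted, min(top + h, self.height))
            self.cv.notify_all()              # maybe wake the producer
            self.cv.wait_for(
                lambda: self.filled >= min(top + h, self.height))
        return (top, h)
```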
monad
DZgas Ж Why can not **-I 10000** <:Stonks:806137886726553651>
2024-09-26 07:37:57
it turns out the practical limit is 1000
DZgas Ж
monad it turns out the practical limit is 1000
2024-09-26 08:43:48
I see that there is some kind of 0.00001% ..... it's worth trying for 10 hours <:Stonks:806137886726553651> However, thanks for the graphs
monad
2024-09-26 10:18:44
above 100 doesn't actually get slower because this only affects part of the algorithm which normally samples at 10%
2024-09-26 10:21:43
related to the observed plateau in density, there is not linearly more calculation happening from 1-1000
2024-09-26 10:34:18
and while -I 100 is probably rarely best, it's at least on the good side of the tendency for photo-like content. -I 1000 is even more dubious, but it is possible that values over 100 would perform better than any lower value.
jonnyawsom3
2024-09-26 10:38:56
I remember seeing extreme differences when doing multiples of 5 and 10 compared to incrementing by 1
_wb_
2024-09-27 09:26:35
```
// TODO(firsching): add a function JxlDecoderSetDefaultCms() for setting a
// default in case libjxl is build with a CMS.
```
<@795684063032901642> this should probably be done before v1.0, it should be easy enough, right? Or maybe we don't even need that function but just have that as the default unless/until the application sets a different cms?
2024-09-27 09:43:13
I think it would also make sense to have some sensible default output color space when an application does not call `JxlDecoderSetOutputColorProfile` but a (default) cms is available, to make it less likely for viewers/applications to mess up badly. For example:
- if the image space is CMYK: set the default output space to Display P3
- if the image space is HDR (PQ/HLG): if uint8 buffers are used and no desired intensity target is set, set the default output space to Display P3 with some reasonable intensity target (255?), otherwise set the default output space to the signaled space
- if the image space is anything wider gamut than sRGB (Adobe98, ProPhoto etc): set the default output space to Display P3
- otherwise set the default output space to sRGB.

The above should make things look OK for "dumb" viewers that assume everything is 8-bit RGB.
2024-09-27 09:44:29
WDYT <@604964375924834314> <@795684063032901642> ?
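_wb_'s heuristic, written out as a decision table (the string tags are simplified stand-ins for libjxl color encodings, not actual enum values):

```python
def default_output_space(image_space, uint8_buffers=False,
                         intensity_target_set=False):
    """Pick a default decoder output space per the heuristic above."""
    if image_space == "CMYK":
        return "Display P3"
    if image_space in ("PQ", "HLG"):            # HDR transfer functions
        if uint8_buffers and not intensity_target_set:
            return "Display P3"                 # plus a modest intensity target
        return image_space                      # keep the signaled space
    if image_space in ("Adobe98", "ProPhoto"):  # wider gamut than sRGB
        return "Display P3"
    return "sRGB"
```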
Moritz Firsching
_wb_ ``` // TODO(firsching): add a function JxlDecoderSetDefaultCms() ... ``` […]
2024-09-27 10:16:29
right, let me revive https://github.com/libjxl/libjxl/pull/3062
_wb_ I think it would also make sense to have some sensible default output color space when an application does not call `JxlDecoderSetOutputColorProfile` […]
2024-09-27 10:18:32
Sure, sounds good to me
CrushedAsian255
2024-09-28 08:53:27
Could you JIT compile the MA trees to improve modular decoding speed?
2024-09-28 08:53:41
although that might end up with more possible security bugs
Tirr
2024-09-28 08:57:06
https://discord.com/channels/794206087879852103/1065165415598272582/1238399364561506315
CrushedAsian255
2024-09-28 09:06:52
Maybe weighted could be improved with a Lookup table?
2024-09-28 09:07:04
What exactly is weighted again?
Tirr
2024-09-28 09:10:30
self-correcting predictor
CrushedAsian255
2024-09-28 09:13:45
What's the formula?
_wb_
2024-09-28 09:18:04
... It's complicated
Tirr
2024-09-28 09:20:00
it's the most complex beast in the entire Modular decoding process imo
_wb_
2024-09-28 10:02:16
Alexander Rhatushnyak invented the self-correcting predictor. Luca and I then made it even more complicated to make it more expressive and faster.
CrushedAsian255
2024-09-28 10:51:24
Oh geez, no wonder it's not particularly performant
2024-09-28 10:54:00
Only thing is for image 3, sometimes the integer division can be faster than a lookup table due to cache issues
_wb_
2024-09-28 10:55:00
It performs well in terms of compression for photographic images. But speed-wise, you will see a difference between decoding e2 (which uses only the simple ClampedGradient predictor) and e3 (which uses the self-correcting predictor) images.
CrushedAsian255
2024-09-28 10:55:32
I'm guessing it's one of the first things cut for faster_decoding?
_wb_
CrushedAsian255 Only thing is for image 3, sometimes the integer division can be faster than a lookup table due to cache issues
2024-09-28 10:56:16
A lookup table with 64 entries should be faster than integer division on pretty much any platform. Integer division is one of the worst operations out there...
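The general trick being referred to (a sketch — the exact table size and fixed-point shift in libjxl's weighted predictor may differ): precompute reciprocals so a division by a small divisor becomes one multiply and one shift.

```python
SHIFT = 24
# Ceiling reciprocals for divisors 1..64, so the product never undershoots.
RECIP = [-(-(1 << SHIFT) // d) for d in range(1, 65)]

def div_lookup(y, x):
    """y // x for 1 <= x <= 64 via the table.

    Exact for 0 <= y < 2**18 with this shift; wider operands need a
    bigger SHIFT. Avoids hardware integer division entirely.
    """
    return (y * RECIP[x - 1]) >> SHIFT
```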
CrushedAsian255
2024-09-28 10:56:45
Huh, missed the fact it's only 64 entries
2024-09-28 10:57:12
I misread the numerator to be (1<<x) where x <= 24
2024-09-28 03:05:34
Does libjxl use SIMD optimisations?
2024-09-28 03:06:11
Doesn't Modular effectively forbid usage of SIMD?
Demiurge
2024-09-28 03:12:04
libjxl uses hwy, a google C++ simd library.
2024-09-28 03:12:36
it vectorizes the perf-critical stuff.
2024-09-28 03:13:13
modular does not prevent vectorization.
2024-09-28 03:14:22
to my knowledge anyways...
CrushedAsian255
2024-09-28 03:39:38
Add this?
2024-09-28 03:39:40
https://youtu.be/J0fF4thhhe4?si=Qw847xAGIp0bo73f
tufty
2024-09-29 02:32:34
I've got libvips using the new chunked save API, phew. with a 30k x 30k test image I see:
```
$ /usr/bin/time -f %M:%e vips copy st-francis.jpg x.jxl
5719092:586.74
$ /usr/bin/time -f %M:%e vips copy st-francis.jpg x.jxl[chunked]
854360:573.48
```
so very slightly faster (libvips will overlap JPG load with JXL save), but almost 7x lower memory use, woo!
2024-09-29 02:35:18
libjxl v0.11 never seems to run `data_at()` in parallel, which is interesting. I was expecting to see several run at once, though I don't think it would help speed, in this case anyway
_wb_
2024-09-29 03:27:16
I guess you can keep 64 threads busy with one 2048x2048 chunk, so not much need to work on more than one of those at the same time, unless you have lots of cores...
Kleis Auke
2024-09-29 03:42:44
For those interested, the `st-francis.jpg` test image can be found here: <https://commons.wikimedia.org/wiki/File:Giovanni_Bellini_-_Saint_Francis_in_the_Desert_-_Google_Art_Project.jpg>.
tufty
2024-09-29 04:04:57
I tried making data_at() very slow:
```
vips gaussblur st-francis.jpg x.jxl[chunked] 100 --vips-progress
```
ie. a sigma 100 gaussian blur. now data_at() is taking many seconds, but libjxl v0.11 doesn't seem to run more than one data_at() at once, so utilisation drops to 100% (single threaded) for large parts of the save
_wb_
2024-09-29 04:15:12
I assume libjxl expects data_at() to return pretty much instantly, but I guess in your example you want it to work on multiple chunks at the same time?
veluca
_wb_ I guess you can keep 64 threads busy with one 2048x2048 chunk, so not much need to work on more than one of those at the same time, unless you have lots of cores...
2024-09-29 04:33:21
fair point
_wb_ I assume libjxl expects data_at() to return pretty much instantly, but I guess in your example you want it to work on multiple chunks at the same time?
2024-09-29 04:33:47
well, I'd expect to do internal parallelization for that since it's running on a 4mp chunk, no?
tufty
2024-09-29 05:40:34
yes, libvips could start up a threadpool to generate these pixels, though there would be some overhead (of course)
2024-09-29 05:41:15
from the API docs I wondered if libjxl would start to run data_at() in parallel if it saw it was idle a lot of the time
veluca
2024-09-29 05:41:38
nah, not happening
tufty
2024-09-29 05:41:39
will data_at() always be single threaded? I could remove a few locks if that's guaranteed
veluca
2024-09-29 05:41:47
also not guaranteed
tufty
2024-09-29 05:41:53
hehe ah well
veluca
2024-09-29 05:42:18
I can imagine us wanting to run it in parallel if you ever run this on a 7990x or some other insane cpu like that
tufty
2024-09-29 05:43:06
ok, I'll subdivide libjxl tiles into eg 128x128 tiles and generate them in parallel in data_at()
2024-09-29 05:43:10
thanks for the advice!
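The subdivision tufty plans could look roughly like this (a sketch; `render` is a hypothetical per-tile pixel generator, not a libvips function):

```python
from concurrent.futures import ThreadPoolExecutor

def fill_rect_parallel(left, top, width, height, render, sub=128):
    """Split one data_at() request into sub x sub tiles and render them
    on a thread pool, so the single callback thread still gets the
    benefit of many cores."""
    tiles = [(l, t,
              min(sub, left + width - l),
              min(sub, top + height - t))
             for t in range(top, top + height, sub)
             for l in range(left, left + width, sub)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda r: render(*r), tiles))
```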
Demiurge
2024-09-29 06:03:05
I notice cjxl has really poor multicore CPU utilization.
2024-09-29 06:08:43
Wall time performance does not improve linearly with extra threads sharing the work, even for a huge image at effort=3
2024-09-29 06:09:09
Most of the cores are idle
2024-09-29 06:10:08
`30000 x 26319, 19.929 MP/s [19.93, 19.93], , 1 reps, 1 threads.`
`30000 x 26319, 26.455 MP/s [26.45, 26.45], , 1 reps, 2 threads.`
`30000 x 26319, 31.931 MP/s [31.93, 31.93], , 1 reps, 4 threads.`
`30000 x 26319, 33.858 MP/s [33.86, 33.86], , 1 reps, 8 threads.`
`30000 x 26319, 36.542 MP/s [36.54, 36.54], , 1 reps, 16 threads.`
2024-09-29 06:10:22
That's pretty abysmal scaling.
2024-09-29 06:11:51
not even a 2x improvement going from 1 to 16 cores.
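A back-of-envelope way to read those numbers: fit Amdahl's law to the 1-thread and 16-thread throughputs to estimate how much of the e3 work parallelizes at all (ignoring the turbo and affinity effects tufty raises later in the thread):

```python
def parallel_fraction(speedup, n):
    """Invert Amdahl's law, S = 1 / ((1 - p) + p / n), for p."""
    return (1 - 1 / speedup) * n / (n - 1)

# 19.929 MP/s on 1 thread vs 36.542 MP/s on 16 threads (numbers above)
p = parallel_fraction(36.542 / 19.929, 16)
# roughly 0.48: only about half the e3 encode parallelizes at all
```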
2024-09-29 06:16:23
Also, there's a superfluous comma, and I am not sure what the numbers in the brackets are supposed to represent.
_wb_
2024-09-29 06:16:26
That's weird behavior, what platform is this?
2024-09-29 06:16:47
What's the input format?
Demiurge
2024-09-29 06:16:57
`JPEG XL encoder v0.11.0 4df1e9e [AVX2,SSE2]` Windows 64
2024-09-29 06:17:26
The input format is JPEG, with `-j 0`
2024-09-29 06:17:56
It's the Francis painting
2024-09-29 06:18:29
But I have noticed bad multi-CPU usage in cjxl for a long long time. And very bad sublinear scaling too
2024-09-29 06:18:43
This just seems like a really good demonstration
_wb_
2024-09-29 06:54:03
Can you open an issue for this? I get very different behavior (much better scaling) so there might be something wrong with the platform-specific stuff
tufty
2024-09-29 07:22:43
it's very difficult to benchmark stuff like that
2024-09-29 07:23:00
you need to be careful around eg turbo boost, core affinity, etc
2024-09-29 07:23:36
so for example, is the 1 thread case running with a 30% clock boost because only one core is being used?
2024-09-29 07:24:41
is the two core case really running across two completely separate cores, or perhaps two cores that share some resources?
Demiurge
_wb_ Can you open an issue for this? I get very different behavior (much better scaling) so there might be something wrong with the platform-specific stuff
2024-09-29 07:24:53
effort=7 has better scaling. Maybe that's why you noticed better scaling?
`30000 x 26319, 1.586 MP/s [1.59, 1.59], , 1 reps, 2 threads.`
`30000 x 26319, 7.676 MP/s [7.68, 7.68], , 1 reps, 16 threads.`
tufty
2024-09-29 07:25:18
are threads moving between cores during execution etc etc
Demiurge
tufty are threads moving between cores during execution etc etc
2024-09-29 07:25:40
Well, that almost always happens due to OS CPU scheduling
tufty
2024-09-29 07:26:29
sure, but it can influence benchmark results. disabling turbo-boost is the main thing that's bitten me on stuff like that
2024-09-29 07:26:48
and disabling hyper threading of course
Demiurge
2024-09-29 07:27:21
But... a program that fully utilizes the CPU will have 100% CPU usage, and you can measure the CPU time and see whether it increases with the number of threads. If the CPU time is not increasing, then it's not using the extra CPUs
2024-09-29 07:27:58
If the CPUs are not being used then you know there's a problem that isn't caused by the OS or hyper threading
2024-09-29 07:28:25
Because the OS and the CPU firmware will always try to give the program as much CPU as it asks for
tufty
2024-09-29 07:28:26
ah that's true
Demiurge
2024-09-29 07:29:20
I mean, it could still be caused by the platform specific thread library
2024-09-29 07:29:58
either a bug there or more likely a bug in the application
2024-09-29 07:30:38
not a serious bug, just a lot of performance left on the table :)
tufty
2024-09-29 07:31:57
what's the exact cjxl command you're running <@1028567873007927297> ?
Demiurge
2024-09-29 07:33:06
`cjxl -j 0 -d 1 -e 7`
2024-09-29 07:33:40
and `cjxl -j 0 -d 1 -e 3`
tufty
2024-09-29 07:34:05
thanks! I'll experiment
Demiurge
2024-09-29 07:34:27
sometimes with `--num_threads=2` etc
2024-09-29 07:35:41
if you omit `--num_threads` it detects the number of available logical cores and uses that number.
2024-09-29 10:19:51
vips only requires just over a gig of memory to resize that image by the way.
2024-09-29 10:22:08
~~a lot less memory than cjxl even when using jxlsave :)~~ actually, never mind that, cjxl and vips both use just over 5gb to convert it without resizing.
jonnyawsom3
Demiurge Also, there's a superfluous comma, and I am not sure what the numbers in the brackets are supposed to represent.
2024-09-29 10:28:15
I'm a bit late, but that's min and max MP/s for when --num_reps is used
Demiurge
2024-09-29 10:36:08
Ah. And the empty comma field?
2024-09-29 10:36:48
Maybe the output could be formatted a little better. Also maybe cjxl could use an option parsing library.
2024-09-29 10:37:02
like if not getopt then something similar to getopt
jonnyawsom3
2024-09-29 11:08:09
I *think* it's meant to be the variability between min and max? `value_tendency, unit, value_min, value_max, variability);`
2024-09-29 11:08:38
<https://github.com/libjxl/libjxl/blob/7e178a51b0a59fcc8e58c78ce88058a9b97abf5b/tools/speed_stats.cc#L90C12-L90C69>
Demiurge
2024-09-29 11:35:21
```cc
std::string mbs_stats = SummaryStat(file_size_ * 1e-6, "MB", s);
```
2024-09-29 11:35:26
It's supposed to be this
2024-09-29 11:37:14
The empty comma field is supposed to be MB/s, but for some reason `mbs_stats.c_str()` is an empty string.
jonnyawsom3
Demiurge effort=7 has better scaling. Maybe that's why you noticed better scaling? `30000 x 26319, 1.586 MP/s [1.59, 1.59], , 1 reps, 2 threads.` `30000 x 26319, 7.676 MP/s [7.68, 7.68], , 1 reps, 16 threads.`
2024-09-30 12:52:29
Yes, the legibility is awful, I might try and reformat it tomorrow, but here's some data for me across all efforts. All lossless:
```
        Seconds, 1 Thread / 2 Thread
Effort   Real Time          CPU Time
  1        0.68 /   0.52      0.66 /   0.70
  2        1.66 /   1.05      1.48 /   1.38
  3        2.92 /   1.74      2.61 /   2.84
  4        8.10 /   4.39      7.86 /   8.12
  5       13.40 /   7.14     13.23 /  13.50
  6       18.23 /   9.65     17.86 /  18.30
  7       26.81 /  13.91     26.47 /  26.84
  8       66.98 /  34.78     66.16 /  67.84
  9      110.59 /  58.82    109.75 / 115.12
 10      532.16 / 495.80    520.50 / 500.95
```
2024-09-30 12:53:50
I can add more threads later on too, probably
Demiurge
2024-09-30 01:03:59
That can't be right, CPU time is less with 2 threads?
2024-09-30 01:04:08
Did you divide the CPU time by the number of CPUs?
2024-09-30 01:05:31
Hmm, no, I think I see...
jonnyawsom3
2024-09-30 01:05:32
I assume you mean effort 10
Demiurge
2024-09-30 01:06:08
The total CPU time should be the same or more, but not less, with additional threads...
2024-09-30 01:06:19
Why would more CPUs = less total work?
jonnyawsom3
2024-09-30 01:06:49
e10 is almost entirely singlethreaded, so *something* in there must be getting skipped when multithreaded
Demiurge
2024-09-30 01:08:14
The amount of work should be unchanged regardless of thread count. Or there should be some additional work overhead with additional threads.
2024-09-30 01:08:32
I didn't expect the total amount of work to decrease
jonnyawsom3
2024-09-30 01:09:51
Oh actually, it's effort 10 and effort 2... Strange
Demiurge
2024-09-30 01:10:12
But that's only the case with e=10, and e=2
2024-09-30 01:10:22
Yeah, kinda weird
2024-09-30 01:10:28
ah well :)
jonnyawsom3 Yes, the legibility is awful, I might try and reformat it tomorrow, but here's some data for me across all efforts. […]
2024-09-30 01:10:45
That's good scaling
2024-09-30 01:11:10
I guess e=1 is single threaded too
2024-09-30 01:12:38
Now do an 8 thread test... 😂
jonnyawsom3
2024-09-30 01:12:54
Tried another image, e2 has the expected 2x CPU time increase going to 2 threads, e10 is still lower by 0.06 seconds while still being faster
Demiurge Now do an 8 thread test... ๐Ÿ˜‚
2024-09-30 01:14:26
Now e2 has the same CPU time as singlethreaded, but e10 is *slightly* more CPU time
Demiurge
2024-09-30 01:15:14
image dimensions?
2024-09-30 01:15:27
is it a big image?
jonnyawsom3
2024-09-30 01:18:18
The last few results were 910 x 461, the table above was 3495 x 2458
Demiurge
2024-09-30 03:57:24
bigger is better :)
_wb_
2024-09-30 06:10:53
Did you average over many runs? Timings can fluctuate quite a bit...
tufty
Demiurge ~~a lot less memory than cjxl even when using jxlsave :)~~ actually, never mind that, cjxl and vips both use just over 5gb to convert it without resizing.
2024-09-30 08:20:56
libvips chunked mode support is in this draft PR https://github.com/libvips/libvips/pull/4174/
2024-09-30 08:21:41
it brings peak memuse for copy jpg->jxl on that big image down from 5.5gb to 800mb (and it's slightly faster too)
Demiurge
2024-09-30 08:55:54
nice
2024-09-30 08:56:05
soon the vips will rule the galaxy
tufty
2024-09-30 09:05:46
I'm hoping to get a new libvips done before xmas with chunked mode libjxl support, and JXL HDR too
Meow
2024-09-30 10:33:19
Does this only happen with v0.11.0 for converting JPEG to lossy JXL?
> Must not set non-zero distance in combination with --lossless_jpeg=1, which is set by default.
2024-09-30 10:33:58
I didn't mention --lossless_jpeg=1 however
2024-09-30 10:36:20
It's required to add --lossless_jpeg=0 as a workaround now
Demiurge
2024-10-01 07:36:16
I think it should be more difficult for people to accidentally re-encode a JPEG from pixels
Meow
2024-10-01 08:06:59
It's an error that pops up if I attempt to do it without --lossless_jpeg=0
Demiurge
2024-10-01 11:10:38
That's good
Meow
2024-10-02 04:14:36
That shouldn't happen when -d isn't 0
Demiurge
2024-10-02 09:46:06
It should... or it should trim the file without decoding to pixels if possible. It's good if it's less likely someone accidentally re-encodes a JPEG from pixels since that's usually dumb and unwanted behavior
2024-10-02 09:46:53
Forcing a -j 0 to confirm before doing something stupid is a good thing
Meow
2024-10-02 03:54:38
Maybe it's better to notify users to add -j 0 if they really want to convert to lossy JXL
Demiurge
2024-10-02 07:07:45
A "you're about to do something stupid" warning. "Are you sure?"
2024-10-02 07:08:50
Ideally there should never be a reason to decode to pixels and libjxl should be able to trim the file without decoding it
2024-10-02 07:09:52
Imagine being able to use lossy compression but never having to worry that the output will be larger file size than the jpeg source
CrushedAsian255
2024-10-03 02:21:32
We could silently ignore non-zero distance and just have -j 0 as an expert option
Meow
Demiurge A "youโ€™re about to do something stupid" warning. "Are you sure?"
2024-10-03 04:34:26
I don't know why converting a higher quality JPEG to a lower quality JXL is a "stupid" decision
Demiurge
CrushedAsian255 We could silently ignore non zero distance and just have j 0 as an expert option
2024-10-03 04:35:40
silently not doing what the user asked is bad behavior. Better to tell the user "you're dumb"
Meow
2024-10-03 04:35:44
Maybe that's not your use case, but it's quite necessary for handling JPEGs that are unreasonably stored at quality 100 for sharing or storing
Demiurge
Meow I don't know why converting a higher quality JPEG to a lower quality JXL is a "stupid" decision
2024-10-03 04:36:51
Because the way it works right now, libjxl inflates the file and then recodes it less efficiently rather than re-using and trimming the DCT data.
2024-10-03 04:37:15
That's definitely the least optimal way of doing it
Meow
2024-10-03 04:37:49
The resulting file sizes matter more
2024-10-03 04:44:13
I can convert one JPEG to -d 0 and -d [I want] and then compare which is smaller
damian101
2024-10-03 05:03:29
A lot of JPEGs on web pages are huge and inefficient, as they went directly from camera to webpage, or were exported with 100 quality setting.
Meow
2024-10-03 05:14:29
My own procedure is converting a bunch of JPEG to both -d 0 and -d 0.5. Larger -d 0.5 ones are removed as their original JPEG should be at a significantly lower quality
Demiurge
Meow My own procedure is converting a bunch of JPEG to both -d 0 and -d 0.5. Larger -d 0.5 ones are removed as their original JPEG should be at a significantly lower quality
2024-10-03 06:20:06
As long as you're encoding from a high quality source it should be fine.
2024-10-03 06:20:39
But there absolutely should be a stupid-alert when doing -j 0 since it's kinda a footgun
2024-10-03 06:21:56
unless the source JPEG is q100 or close to it, then libjxl is not going to be very efficient
_wb_
2024-10-03 06:23:13
In Cloudinary, if you do q_auto on an existing JPEG (without any modifications like downscaling), it looks at the quantization tables, estimates JPEG quality from those, and does one of these three things:
- JPEG was low quality already (e.g. q60): don't touch it, just do lossless jpegtran optimization
- JPEG is high quality: requantize it a bit in the DCT domain, e.g. bring a q93 quant table down to q85 or so, only touching coeffs where the factors get higher (different encoders use differently shaped quant matrices)
- JPEG is very high quality (q96+): just decode it to pixels and do the usual q_auto encoding
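As a decision table (the q96 cutoff is from the message above; the boundary between "low" and "high" is my guess, since only q60 and q93 are given as examples):

```python
def q_auto_strategy(est_quality, high_cutoff=85, very_high_cutoff=96):
    """Three-way policy from the message above, for JPEG-in/JPEG-out.

    `est_quality` is a libjpeg-style quality estimated from the quant
    tables; `high_cutoff` is an assumed boundary, not a Cloudinary
    constant.
    """
    if est_quality >= very_high_cutoff:
        return "decode to pixels, normal q_auto encode"
    if est_quality >= high_cutoff:
        return "requantize in the DCT domain"
    return "lossless jpegtran optimization only"
```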
Demiurge
2024-10-03 06:23:19
ideally, libjxl would be able to lossy requantize JPEG without ballooning it first.
_wb_
2024-10-03 06:27:00
In principle we could do something similar in libjxl, though currently the encode API is not really designed for that...
Demiurge
2024-10-03 07:46:07
Is the lossless jpeg transcoding exposed in libjxl or is it only a cjxl thing?
_wb_
2024-10-03 11:03:12
cjxl cannot do anything that libjxl cannot do. There are two ways in libjxl to encode a frame: one is by providing pixel data, the other is by providing a jpeg bitstream and telling it to losslessly transcode it. I suppose we could add an option or an additional API for providing a jpeg bitstream that doesn't necessarily apply lossless transcoding, but compares the estimated quality of the jpeg to a target distance and then either does lossless transcoding or re-encodes from pixels (or potentially does a lossy transcoding with requantization in the DCT domain, though I'm not sure if we want to go there)
CrushedAsian255
_wb_ cjxl cannot do anything that libjxl cannot do. There are two ways in libjxl to encode a frame: one is by providing pixel data, the other is by providing a jpeg bitstream and telling it to losslessly transcode it. I suppose we could add an option or an additional API for providing a jpeg bitstream that doesn't necessarily apply lossless transcoding, but compares the estimated quality of the jpeg to a target distance and then either does lossless transcoding or re-encodes from pixels (or potentially does a lossy transcoding with requantization in the DCT domain, though I'm not sure if we want to go there)
2024-10-03 11:20:34
If the JPEG is sufficiently high quality does it make sense to go to pixels? If so what's the approx cutoff?
_wb_ In Cloudinary, if you do q_auto on an existing JPEG (without any modifications like downscaling), it looks at the quantization tables, estimates JPEG quality from those, and does one of these three things: - JPEG was low quality already (e.g. q60): don't touch it, just do lossless jpegtran optimization - JPEG is high quality: requantize it a bit in the DCT domain, e.g. bring a q93 quant table down to q85 or so, only touching coeffs where the factors get higher (different encoders use differently shaped quant matrices) - JPEG is very high quality (q96+): just decode it to pixels and do the usual q_auto encoding
2024-10-03 11:22:29
Does Cloudinary not do JPEG to jxl ?
2024-10-03 11:22:57
Also what estimation algorithm are you using? Is it like the one in Exiftool?
_wb_
CrushedAsian255 If the JPEG is sufficiently high quality does it make sense to go to pixels? If so what's the approx cutoff?
2024-10-03 12:12:33
I guess the cutoff depends on the ratio between input JPEG quality and desired target distance. If you want d1, probably you need libjpeg-turbo quality 95+, while if you want d3, probably libjpeg-turbo q85+ suffices.
CrushedAsian255 Does Cloudinary not do JPEG to jxl ?
2024-10-03 12:13:24
I was talking about what we do when both input and output format are JPEG.
CrushedAsian255
2024-10-03 12:13:35
Does quality = quantisation weights ?
_wb_
CrushedAsian255 Also what estimation algorithm are you using? Is it like the one in Exiftool?
2024-10-03 12:15:31
what I'm using as a proxy is just some weighted sum of the quantization factors, but there are probably better ways to do it. In particular, you can have encoders like guetzli and jpegli that use low quantization factors with variable deadzone quantization, and then just looking at the quantization tables kind of overestimates the actual quality...
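One common proxy, sketched here, inverts libjpeg's quality-to-scaling rule: compare the quant table to the Annex K base table to get an average scaling percentage, then map that back to a 0-100 quality. The uniform averaging and the base-table comparison are assumptions for illustration, and as noted above, deadzone-heavy encoders like guetzli and jpegli will be overestimated by any table-only method:

```python
def scale_to_quality(scale_percent: float) -> float:
    # Invert libjpeg's mapping: quality >= 50 -> scale = 200 - 2*quality,
    # quality < 50 -> scale = 5000 / quality.
    if scale_percent <= 100:
        return (200 - scale_percent) / 2
    return 5000 / scale_percent

def estimate_quality(quant_table, base_table):
    # Average per-coefficient scale relative to the base table (a crude
    # weighted sum with uniform weights), then map back to a quality.
    scales = [100.0 * q / b for q, b in zip(quant_table, base_table)]
    return scale_to_quality(sum(scales) / len(scales))
```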
CrushedAsian255
2024-10-03 12:16:45
Wait since when did JPEG have adaptive quantisation ?
_wb_
2024-10-03 12:19:48
it doesn't, but an encoder can choose to zero out coefficients even if they wouldn't actually quantize to zero. Doing that in a smart way (selectively, only when it doesn't hurt visually) is basically the main 'trick' jpegli uses to achieve better compression
CrushedAsian255
_wb_ it doesn't, but an encoder can choose to zero out coefficients even if they wouldn't actually quantize to zero. Doing that in a smart way (selectively, only when it doesn't hurt visually) is basically the main 'trick' jpegli uses to achieve better compression
2024-10-03 01:15:45
Do zeroes entropy-code better?
_wb_
2024-10-03 02:20:34
yes, the goal of most coding tools (prediction, frequency transforms, RCTs etc) is basically to get more zeroes (or lower-amplitude numbers) since those will be cheap after entropy coding. In particular for DCT coefficients, both JPEG and JXL additionally have cheap ways to signal "all the other coefficients in this block are zeroes"
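A toy illustration of why skewing coefficients toward zero helps: the Shannon entropy (a lower bound on bits per symbol for an ideal entropy coder) of a mostly-zero block is far lower than that of a flat distribution. The example blocks below are made up:

```python
import math
from collections import Counter

def entropy_bits(symbols):
    # Shannon entropy in bits per symbol of the empirical distribution.
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

sparse = [0] * 56 + [3, -2, 1, 5, -1, 2, 1, 4]  # mostly-zero 8x8 block
dense = list(range(-32, 32))                    # every value distinct
```

With an ideal coder the sparse block needs under 1 bit per coefficient, versus 6 bits for the flat one.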
Demiurge
_wb_ cjxl cannot do anything that libjxl cannot do. There are two ways in libjxl to encode a frame: one is by providing pixel data, the other is by providing a jpeg bitstream and telling it to losslessly transcode it. I suppose we could add an option or an additional API for providing a jpeg bitstream that doesn't necessarily apply lossless transcoding, but compares the estimated quality of the jpeg to a target distance and then either does lossless transcoding or re-encodes from pixels (or potentially does a lossy transcoding with requantization in the DCT domain, though I'm not sure if we want to go there)
2024-10-03 08:51:51
That would be awesome
2024-10-03 09:07:37
After a certain point the quantization noise is so small it won't really matter when it goes through vardct. But only at really high settings where there is very little noise like q97+
2024-10-03 09:08:21
Otherwise the encoder tries to preserve that noise
Enhex
2024-10-04 11:04:23
is there an example for how to decode a region of interest/cropped decoding?
jonnyawsom3
Enhex is there an example for how to decode a region of interest/cropped decoding?
2024-10-05 08:30:28
It's on the roadmap for 1.0 but not currently implemented in libjxl. JXL-Oxide has experimental cropped decoding if you'd like to try that
Enhex
It's on the roadmap for 1.0 but not currently implemented in libjxl. JXL-Oxide has experimental cropped decoding if you'd like to try that
2024-10-05 03:36:18
thanks I'll look into it
tufty
2024-10-06 11:52:20
hello all, libvips dev here, I'm trying to improve JXL HDR support in libvips. It's mostly working (I think?) https://github.com/libvips/libvips/pull/4174#issuecomment-2395009764
2024-10-06 11:53:54
1. I think I need a HDR JXL image with very high dynamic range for testing, something that will decode to luminances well outside SDR. Does anyone have a good test image like that?
2024-10-06 11:56:25
2. I think I have some kind of D50/D65 mixup, my images are rather yellowish 😦
Oleksii Matiash
2024-10-06 11:59:18
AdobeRGB png -> lossy jxl -> png produces png with profile named "RGB_D65_0.639997;0.329997;0.210005;0.710005;0.149998;0.060004_Per_g0.454707". Is it expected?
_wb_
tufty hello all, libvips dev here, I'm trying to improve JXL HDR support in libvips it's mostly working (I think?) https://github.com/libvips/libvips/pull/4174#issuecomment-2395009764
2024-10-06 12:03:35
https://sneyers.info/hdrtest/
Demiurge
Oleksii Matiash AdobeRGB png -> lossy jxl -> png produces png with profile named "RGB_D65_0.639997;0.329997;0.210005;0.710005;0.149998;0.060004_Per_g0.454707". Is it expected?
2024-10-06 12:13:42
Yes. There's no enum or shortcut for adobe rgb (also known as opRGB) so it always spells out the gamma and primary coordinates
Oleksii Matiash
Demiurge Yes. There's no enum or shortcut for adobe rgb (also known as opRGB) so it always spells out the gamma and primary coordinates
2024-10-06 12:15:15
Thank you. Not a problem, just looks weird in ps
Demiurge
2024-10-06 12:17:25
Yeah. It would be nice if there was a shortcut/shorthand for opRGB, at least in libjxl's color_description.cc
2024-10-06 12:22:20
0.454707 is the INVERSE of the gamma though
2024-10-06 12:22:29
Is that expected?
2024-10-06 12:22:40
??
Tirr
2024-10-06 12:24:52
jxl saves inverse gamma internally
2024-10-06 12:26:00
up to 7 digits below decimal point iirc
Demiurge
2024-10-06 12:26:49
Ok... and sRGB gamma is a special case because of the linear segment I guess?
Tirr
2024-10-06 12:27:13
yeah it's different from pure gamma
Demiurge
2024-10-06 12:29:13
opRGB primaries are 0.64,0.33;0.21,0.71;0.15,0.06, d65 illuminant, gamma = 563/256 (or the inverse 0.454707 I guess) so it would be cool if there was a shortcut for requesting these settings since it's used relatively often in photos and printing
2024-10-06 12:30:28
It doesn't need to be specially added to the on disk format but just a shortcut in the library would be mildly cool and neat
2024-10-08 07:55:25
Maybe this is a really dumb question, but why does djxl produce such a dim and dull looking image? Is this a bug? `djxl --color_space=sRGB --bits_per_sample=8 ./jpegxl-home.jxl ./jpegxl-home-srgb.png` `djxl --color_space=DisplayP3 --display_nits=300 --bits_per_sample=16 .\jpegxl-home.jxl .\jpegxl-home-p3.png`
2024-10-08 07:56:38
I don't understand what the point of display_nits and intensity_target are
2024-10-08 07:59:25
the intensity_target in jxlinfo says 10000 nits, but if that's the case then shouldn't the decoded image not be so ridiculously dark?
_wb_
2024-10-08 11:48:36
<@604964375924834314> this does look like a bug, right?
Demiurge
2024-10-08 12:59:36
Waterfox also shows the jxl as an extremely dark image just like here. So does jxl-oxide as used by jxl-winthumb. To me it seems like maybe intensity_target and display_nits affect the decoded image in a counter-intuitive way, and maybe the default value of the latter is equal to the former, and I'm not sure what they are or how they are even supposed to be useful.
2024-10-08 01:04:25
Maybe I'm just misunderstanding the problem of converting an HDR image to SDR
jonnyawsom3
2024-10-08 01:05:33
I think you got the images backwards... The bright one has a P3 ICC, the dark one is sRGB
Demiurge
2024-10-08 01:06:53
Yeah, when I decoded it to 8 bit sRGB it's really dark and dull
2024-10-08 01:07:32
Which I don't think makes sense but maybe I'm not understanding something
jonnyawsom3
2024-10-08 01:07:54
That's standard for HDR to SDR without proper tone mapping. It's how I see all HDR JXLs
Demiurge
2024-10-08 01:08:23
I guess, but ... why?
2024-10-08 01:09:07
You would think it would be super bright and washed out with lots of clipping or something
2024-10-08 01:09:38
Or just remapped to whatever range the display could show
2024-10-08 01:10:14
Or just closer to the bright image
jonnyawsom3
2024-10-08 01:10:34
The default as far as I understand it is to scale everything down until the peak brightness matches what SDR can output
Demiurge
2024-10-08 01:11:12
I don't see any fully bright pixels in the dark image though
jonnyawsom3
2024-10-08 01:12:01
That's because it's 16bit, cramming the values down to 8 means the entire range is used. But this is all speculation and observations on my behalf
Demiurge
2024-10-08 01:12:27
Bit depth won't change the brightness
2024-10-08 01:20:02
Mostly I don't understand if the nits/intensity knobs even make sense or serve a useful purpose.
Quackdoc
Demiurge Maybe this is a really dumb question, but why does djxl produce such a dim and dull looking image? Is this a bug? `djxl --color_space=sRGB --bits_per_sample=8 ./jpegxl-home.jxl ./jpegxl-home-srgb.png` `djxl --color_space=DisplayP3 --display_nits=300 --bits_per_sample=16 .\jpegxl-home.jxl .\jpegxl-home-p3.png`
2024-10-08 01:20:31
can you share the jxl?
Demiurge
2024-10-08 01:20:45
It's on the jpeg website 😂
Quackdoc
2024-10-08 01:21:25
ill find it eventually I guess
Demiurge
2024-10-08 01:21:32
https://jpeg.org/jpegxl/
2024-10-08 01:21:52
It looks great on my iPhone actually
Quackdoc
2024-10-08 01:21:56
[dorime](https://cdn.discordapp.com/emojis/979397787970072596.webp?size=48&quality=lossless&name=dorime)
2024-10-08 01:22:25
>`https://jpeg.org/images/jpegxl-home.avif`
2024-10-08 01:22:27
heresy
2024-10-08 01:22:33
absolute heresy
Demiurge
2024-10-08 01:23:02
It looks almost exactly like the p3.png I made
2024-10-08 01:23:13
On my iPhone
2024-10-08 01:24:30
I dunno, maybe I'm stupid, but... I fail to understand why the image looks so different when I just decode it without --display_nits
Quackdoc
2024-10-08 01:24:45
yeah using --display_nits=203 on the srgb decode makes it look fine
2024-10-08 01:26:39
this should be more or less expected behavior
Demiurge
2024-10-08 01:29:05
I dunno, is it wrong of me to expect the image to have better tone mapping? Even HDR displays require tone mapping to view images, since no display exists that can faithfully represent the full possible range of intensities a camera can capture.
2024-10-08 01:30:29
Is it wrong of me to expect an image to look not so completely and obviously different from what it's intended to look like?
Quackdoc
2024-10-08 01:30:45
you aren't doing real tonemapping rather you are creating "an HDR image with an SDR transfer"
Demiurge
2024-10-08 01:32:03
I'm not sure what I'm doing because I have no idea what the nitknobs do
2024-10-08 01:32:21
Or why they are even necessary
Quackdoc
2024-10-08 01:32:49
the display_nits is the tonemapping, when it's not set it's not doing "tonemapping", that being said, libjxl should likely implicitly tonemap when doing PQ->sRGB
Demiurge
2024-10-08 01:32:58
So either I'm dumb or they're dumb or both 😂
Quackdoc
2024-10-08 01:33:19
basically you are creating a 10k nit sRGB which is way out of spec
Demiurge
2024-10-08 01:33:51
It looks like the default result is what would happen if I set display_nits=intensity_target=10000
Quackdoc
2024-10-08 01:33:54
note you get the exact same behavior with mpv `mpv --loop jpegxl-home.jxl --target-peak=10000`
2024-10-08 01:34:39
the image is getting decoded with its target nit value, which is... not really correct?
Tirr
2024-10-08 01:35:41
``` jxl-oxide jpegxl-home.jxl --target-colorspace srgb -o jpegxl-home.png ```
Demiurge
2024-10-08 01:35:46
All images require some form of tone mapping to be displayed on any display
Quackdoc
2024-10-08 01:35:57
I wouldn't consider this a "bug" but it is for sure not desired behavior
Demiurge
2024-10-08 01:36:13
Because displays can't represent any arbitrary brightness value
jonnyawsom3
2024-10-08 01:36:40
Considering most software still doesn't even have color management...
Quackdoc
Tirr ``` jxl-oxide jpegxl-home.jxl --target-colorspace srgb -o jpegxl-home.png ```
2024-10-08 01:36:54
iirc jxl-oxide tonemaps implicitly to 80 nits when using sRGB right?
Tirr
2024-10-08 01:37:34
uh let me check
Demiurge
Tirr ``` jxl-oxide jpegxl-home.jxl --target-colorspace srgb -o jpegxl-home.png ```
2024-10-08 01:37:40
Sadly jxl-winthumb renders it super dark
2024-10-08 01:38:05
80 is pretty low...
Quackdoc
2024-10-08 01:38:12
80 is per sRGB spec
Tirr
2024-10-08 01:38:17
it targets 255 nits
Quackdoc
2024-10-08 01:38:19
which yes, is very low
Tirr it targets 255 nits
2024-10-08 01:38:29
really? that looks a little blown out for that
2024-10-08 01:39:06
oh no, it lines up with djxl
Demiurge
2024-10-08 01:39:12
No, that's pretty much exactly how it looks on my iphone
Quackdoc
2024-10-08 01:40:13
well, it's not like tonemapping is an exact art sadly
2024-10-08 01:41:07
I like 2446a with this image myself
Demiurge
2024-10-08 01:41:27
The only difference between tirr's image and the original on safari is, the original is slightly deeper blue in the iris. And the edges of the iris are darker too
Tirr
2024-10-08 01:41:36
(jxl-oxide uses rec2408)
Quackdoc
Demiurge Sadly jxl-winthumb renders it super dark
2024-10-08 01:42:07
yeah, winthumb should be converting it to sRGB ideally
Tirr
2024-10-08 01:42:27
and iirc libjxl does some local tone mapping
Demiurge
2024-10-08 01:42:40
In tirr's image the skin looks more saturated and the iris looks more washed out
2024-10-08 01:42:59
But they are almost identical
Quackdoc
2024-10-08 01:43:05
one day, I will create a proper rust cms with a bunch of tonemapping stuff available [av1_dogelol](https://cdn.discordapp.com/emojis/867794291652558888.webp?size=48&quality=lossless&name=av1_dogelol)
Demiurge
2024-10-08 01:44:53
Much better than a super dark and obviously defective image... unless "obviously defective" is the look we're intending to go for but I don't think that's what people usually intend when they want to just view an image and expect it to look as close as possible across displays
Quackdoc
2024-10-08 01:45:03
oculante just displays
Demiurge
2024-10-08 01:46:28
That's basically just as bad as the dark image
2024-10-08 01:46:52
Instead of dark it's extremely dull
w
2024-10-08 01:46:54
80 nits isn't even that dim
Demiurge
w 80 nits isnt even that dim
2024-10-08 01:48:07
I guess not since brightness is not a linear thing
w
2024-10-08 01:48:58
if you have a pixel 7, it's 66% brightness
2024-10-08 01:49:33
on samsung it's about 25%
Quackdoc
2024-10-08 01:49:56
gonna compile oculante requesting an sRGB image
2024-10-08 02:05:01
2024-10-08 02:05:05
https://tenor.com/view/easy-button-that-was-easy-red-button-press-the-button-gif-17005111
KKT
2024-10-08 06:34:29
BTW, Affinity Photo 2 opens it extremely dark as well.
spider-mario
Demiurge the intensity_target in jxlinfo says 10000 nits, but if that's the case then shouldn't the decoded image not be so ridiculously dark?
2024-10-08 08:44:50
`intensity_target` only has to be an upper bound on the brightness found in the image, it's allowed for it not to be the smallest possible upper bound
2024-10-08 08:44:58
as it happens, it seems to not actually reach 10 000 nits
2024-10-08 08:45:44
and most of the image is considerably below that
2024-10-08 08:47:55
without `--display_nits`, `djxl` currently won't perform any tone mapping and will therefore map 0-`intensity_target` to 0-1 in the output
2024-10-08 08:48:30
whereas `--display_nits=N` will cause it to tone-map from 0-`intensity_target` to 0-`N`, and then map 0-`N` to 0-1 in the output
2024-10-08 08:49:39
(iirc, there is actually even code to tone-map to a non-0 lower bound, but it's not exposed in `djxl`)
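The arithmetic described in the messages above can be sketched like this; the hard clip is only a stand-in for whatever tone-mapping curve the decoder actually applies, and is not djxl's real implementation:

```python
def encode_to_unit(nits, intensity_target, display_nits=None):
    # Without --display_nits: linear map 0..intensity_target -> 0..1.
    # A mostly-sub-1000-nit image with intensity_target=10000 then only
    # uses the bottom tenth of the output range, hence the dark result.
    if display_nits is None:
        return nits / intensity_target
    # With --display_nits=N: tone-map 0..intensity_target down to 0..N
    # (hard clip here as a stand-in for the real curve), then 0..N -> 0..1.
    return min(nits, display_nits) / display_nits
```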
Demiurge
2024-10-08 09:37:56
I think when you set display_nits, it tries to approximate the appearance of a target_intensity screen on a display_nits screen
spider-mario without `--display_nits`, `djxl` currently won't perform any tone mapping and will therefore map 0-`intensity_target` to 0-1 in the output
2024-10-08 09:42:11
This seems wrong. Why does the image look normal on an HDR display even though most HDR displays don't get anywhere near 10000 nits? Even my iphone on low brightness shows a normal looking image.
2024-10-08 09:42:54
Isn't there a better way to show a more consistent looking picture across different displays?
spider-mario
2024-10-08 09:43:15
because HDR displays will receive a PQ signal and they know how to display that
2024-10-08 09:43:28
including their own tone mapping, usually
Demiurge
2024-10-08 09:44:00
So the tone mapping has to be implemented in software if the hardware doesn't support it?
spider-mario
2024-10-08 09:44:55
yes (e.g. a MacBook Pro set to the reference "HDR video" mode, which leaves everything under 1000 cd/m² unchanged and just hard-clips at 1000, instead of the usual tone mapping)
_wb_
spider-mario `intensity_target` only has to be an upper bound on the brightness found in the image, it's allowed for it not to be the smallest possible upper bound
2024-10-08 09:46:51
Would it be more correct to encode a PQ image that only goes up to 4000 nits with intensity_target 4000 than to keep it at the default? So it will go all the way to 1 when tone mapping instead of keeping some range for intensities that don't actually occur?
2024-10-08 09:47:55
Because then maybe we should change the default encode behavior from "10000 if the input is PQ" to "find actual max and signal that", no?
2024-10-08 09:48:32
Since many PQ images will not go brighter than 4000 nits, or even 1000 nits...
Demiurge
2024-10-08 09:54:48
Sounds more logical to me, assuming it doesn't break anything on HDR displays...
_wb_
2024-10-08 10:07:43
On HDR displays it shouldn't change anything iirc, it would just affect the default tone mapping.
Demiurge
2024-10-08 10:18:58
What even is the point of target_intensity otherwise if it doesn't contain any entropy. Isn't it part of the on disk format?
dogelition
spider-mario `intensity_target` only has to be an upper bound on the brightness found in the image, it's allowed for it not to be the smallest possible upper bound
2024-10-08 10:20:21
does it actually *have* to be an upper bound? the png spec suggests using the approach from this paper: <https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9508136> which, for still images, just means taking the 99.99% percentile of the maximum light level ( = `max(R, G, B)`) value
2024-10-08 10:21:05
but i agree that it would be good default behavior to calculate it automatically
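The percentile approach dogelition describes can be sketched in a few lines; the function name and the flat pixel-list representation are illustrative assumptions:

```python
def estimate_intensity_target(pixels_nits, percentile=99.99):
    # Sort per-pixel max(R, G, B) (the "maximum light level") and take
    # the given percentile, per the approach the PNG spec suggests.
    mll = sorted(max(px) for px in pixels_nits)
    idx = min(len(mll) - 1, int(len(mll) * percentile / 100))
    return mll[idx]
```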
Quackdoc
_wb_ Since many PQ images will not go brighter than 4000 nits, or even 1000 nits...
2024-10-08 11:04:35
shouldn't intensity target be handled similarly to HDR10's mastering display nits? EDIT: when working with PQ
dogelition
Quackdoc shouldn't intensity target be handled similarly to HDR10's mastering display nits? EDIT: when working with PQ
2024-10-08 11:20:12
can you elaborate on what you mean by that?
Quackdoc
2024-10-08 11:26:39
PQ has an "absolute nit scale" but that doesn't mean displays can always render it, for instance say you have a 1k nit display and a 2k nit picture, what does the display do? It tone maps it down, instead of clipping. so intensity_target=1000 can signal "Ok this image was made on a 1knit display, so signals that are above 1000nits, are most likely not actually above 1000 nits. EDIT: or rather it's more like, this is what stuff was tonemapped to, so take this into consideration
Demiurge
2024-10-08 11:26:48
I can imagine sometime in the future someone takes a picture of the sun, saves it in jpeg xl, then sends it to someone to decode... and their computer explodes :)
Quackdoc
2024-10-08 11:28:32
as far as I can tell, there is no real single defined way on how to handle HDR10 metadata
Demiurge
2024-10-08 11:29:59
I thought PQ is just a transfer function. Why can't a PQ image just be decoded to linear and re-encoded with a different transfer function?
Quackdoc
2024-10-08 11:34:46
I mean, you *can* do that, PQ is just a transfer function, but it's one that defines luminance absolutely
2024-10-08 11:35:04
1.0 will always == 10000 nits with PQ
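That absolute mapping is the SMPTE ST 2084 EOTF, which can be written directly from the spec constants:

```python
# SMPTE ST 2084 (PQ) EOTF: nonlinear signal in [0, 1] -> absolute
# luminance in nits. A signal of 1.0 is exactly 10000 nits.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_eotf(signal: float) -> float:
    p = signal ** (1 / M2)
    return 10000 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)
```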
Demiurge
2024-10-08 11:35:20
What's so special and complicated about it? And why does it feel like, whenever reading about HDR, they reinvent everything to be way more fragile and needlessly complicated instead of reusing and logically extending existing technology?
Quackdoc
2024-10-08 11:35:26
well reference nits
Demiurge
2024-10-08 11:36:32
I see... so the problem is that other transfer curves don't absolutely define luminance
2024-10-08 11:37:13
Or at least, a SMALL part of the problem is that
Quackdoc
2024-10-08 11:38:13
well kinda, but you will always have an issue of mapping intended luminance. this is the reason old sRGB images look bad on modern displays
Demiurge
2024-10-08 11:40:16
Really it shouldn't even be a problem... and compared to the behavior of actual HDR display hardware, simply decoding a PQ image that way without any tone mapping seems incorrect and inconsistent compared to how an HDR screen would behave
Quackdoc
2024-10-08 11:41:15
inconsistent yes, incorrect no
2024-10-08 11:41:23
well, it depends on what you want to do with it
Demiurge
2024-10-08 11:42:10
If you're displaying it on a non HDR monitor or converting it to sRGB then it should mimic what an HDR screen would do and tone map it in a similar way
Quackdoc
2024-10-08 11:43:12
well we have a lot more sophisticated and better tonemappers
Demiurge
2024-10-08 11:43:35
Well anything is better than the current super dark output 😂
Quackdoc
2024-10-08 11:43:39
mpv's spline isn't bad, bt.2446a, bt.2390, hable, marty, etc
2024-10-08 11:43:59
oh yeah, libjxl should imo for sure do implicit tonemapping when converting to an SDR transfer
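As a sketch of what such an implicit step could look like, here is an extended-Reinhard curve. It is just one illustrative choice (the curves named above, like BT.2446a, BT.2390, and Hable, are generally considered better), and the 203-nit SDR peak is an assumption:

```python
def tone_map(nits, peak_in=10000.0, peak_out=203.0):
    # Extended Reinhard: maps peak_in exactly onto peak_out and leaves
    # low luminances nearly untouched, compressing highlights smoothly
    # instead of clipping them.
    x = nits / peak_out
    x_max = peak_in / peak_out
    return peak_out * x * (1 + x / (x_max * x_max)) / (1 + x)
```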
Demiurge
2024-10-09 02:48:04
Well, yeah, because tonemapping is always implicitly happening when an HDR image is being displayed properly on an HDR display, it should be no different when re-encoding/remapping to an SDR transfer curve...
2024-10-09 02:48:34
It would only be consistent
2024-10-09 02:48:46
In my opinion...
CrushedAsian255
2024-10-09 02:48:49
is there any way to tell if a jxl image is hdr ?
Quackdoc
CrushedAsian255 is there any way to tell if a jxl image is hdr ?
2024-10-09 02:49:07
checking intensity target is probably the easiest way
2024-10-09 02:49:12
that or the intended transfer
Demiurge
2024-10-09 02:49:39
A transfer curve based on absolute luminance?
Quackdoc
Demiurge It would only be consistent
2024-10-09 02:49:52
consistent but not necessarily accurate, you can for sure convert between colorspaces without doing any tonemapping, and there may be cases for that for sure
2024-10-09 02:50:37
but ofc, like I said, I do agree that it should be the default
Demiurge
Quackdoc consistent but not necessarily accurate, you can for sure convert between colorspaces without doing any tonemapping, and there may be cases for that for sure
2024-10-09 02:50:58
But there's no such thing as accurate when it comes to displaying an HDR image on any screen, even HDR screen, that can't reproduce the brightest pixel
Quackdoc
2024-10-09 02:52:24
that;s not true, this is why we have reference nits
2024-10-09 02:53:01
we're working in reference, which is the "theoretical best case", so we have a common ground, and everything revolves around them
CrushedAsian255
2024-10-09 02:53:28
are there actually any 10k nit displays?
2024-10-09 02:53:30
or is that stupid
Quackdoc
2024-10-09 02:53:56
for example, the "common ground" for mastering right now is 1000 reference nits, but some folk are pushing towards 2k or even 3k I think it is now
CrushedAsian255 are there actually any 10k nit displays?
2024-10-09 02:54:16
I know of like 1 or 2 that have been shown off at CES before
CrushedAsian255
2024-10-09 02:54:33
with HDR, is the point that the display always shows the pixels at the same nits?
Quackdoc
2024-10-09 02:54:35
TCL and LG iirc
CrushedAsian255
2024-10-09 02:54:40
how does that interact with screen brightness controls?
Quackdoc
CrushedAsian255 with HDR is the point that the display always shows the pixels at the same nits
2024-10-09 02:54:41
not true
2024-10-09 02:54:47
this is in fact almost never true
CrushedAsian255
2024-10-09 02:55:01
then how is things defined with absolute nits?
Quackdoc
2024-10-09 02:55:27
with absolute nits, you have display nits vs reference nits, by default displays will almost always tonemap, but you can stop this behavior with calibration
2024-10-09 02:57:17
actually I think most HDR1000 displays track PQ, then tonemap when needed now
2024-10-09 02:58:01
not like it matters because you get retarded BS like this anyways lmao
Demiurge
2024-10-09 02:58:36
As far as I know, content is always adapted/tone-mapped to the screen it's displayed on.
Quackdoc
2024-10-09 02:59:00
"we can track PQ in hardware, but you will never be able to do this because we in firmware put a power savings mode on that you can't disable unless you get into manufacturer control mode"
CrushedAsian255
Quackdoc "we can track PQ in hardware, but you will never be able to do this because we in firmware put a power savings mode on that you can't disable unless you get into manufacturer control mode"
2024-10-09 02:59:19
oh dear
Demiurge
2024-10-09 02:59:20
And so ideally images should look as consistent as possible regardless of the transfer curve encoding or the type of monitor
2024-10-09 03:00:10
And there shouldn't be inconsistent rendering depending on things like that
Quackdoc
2024-10-09 03:00:25
Rtings actually do PQ EOTF tracking tests iirc, so you would want to check them before you buy a HDR display
2024-10-09 03:01:20
https://www.rtings.com/tv/tests/picture-quality/pq-eotf
2024-10-09 03:01:41
as long as you have good tracking within 1knit you are probably fine
Demiurge
Quackdoc not like it matters because you get retarded BS like this anyways lmao
2024-10-09 03:01:41
NPC...
Quackdoc
2024-10-09 03:02:42
4knit tracking is great for people with a lot of cash to burn, and it's probably nice? I don't actually have experience with displays that high end, but I would assume it's nice
2024-10-09 03:03:33
realistically for a budget display, you want good PQ tracking to 600, because past 600 nits you are mostly into the "highlights" sections unless you are watching terribly mastered content
2024-10-09 03:03:53
or content that just really wants to push PQ
Demiurge
2024-10-09 03:04:16
The brighter something is the less precisely your eyes care how bright it is
Quackdoc
2024-10-09 03:07:32
but yeah, a good display will track whatever nit range it reliably can, then tonemap the rest. so if you have a "1000 nit mastering display" (HDR10 metadata) you can assume that you roughly have 1000 nits of properly graded content, then content above that has likely been tonemapped on the mastering display. This means that you only *really* need to care about properly tonemapping that 1000 nit range to your mediocre, presumably {8,6,4}00-nit display
2024-10-09 03:08:13
I dunno how to handle HDR10+ btw, I know there is a dedicated tonemapping spec, but iirc it's paywalled
Demiurge
2024-10-09 04:14:56
If there's no headroom then there's no tone-mapping, just clipping.
2024-10-09 04:15:17
"map" implies headroom/space to map it
w
2024-10-09 04:27:48
there's always tonemapping
Demiurge
2024-10-09 05:18:54
Within a range the display is accurately able to "track?"
_wb_
Demiurge If there's no headroom then there's no tone-mapping, just clipping.
2024-10-09 05:34:11
No, especially if there is no HDR headroom, tone mapping is needed
Demiurge
Quackdoc but yeah, a good display will track whatever nit range it reliably can, then tonemap the rest. so if you have a "1000 nit mastering display" (HDR10 metada) you can assume that you roughly have 1000 nits of properly graded content, then content above that has likely been tonemapped on the mastering display. This means that you only *really* need to care about properly tonemapping that 1000 nit range to your mediocre display assumedly {8,6,4}00 nits display
2024-10-09 05:36:46
I was responding to this. If a display is capable of tracking up to 1000 nits, it needs to start tonemapping before 1000 nits.
2024-10-09 05:38:06
If it doesn't start tonemapping before it runs out of headroom then it will clip
Quackdoc
2024-10-09 06:09:53
yes, that's why good displays will try to track PQ as far as possible before having a very strong tonemap for everything else
w
2024-10-09 06:42:49
they will try to track
2024-10-09 06:42:51
by tonemapping
2024-10-09 06:43:40
everything tonemaps that's how displays works
Quackdoc
2024-10-09 06:45:58
?
2024-10-09 06:46:24
the entire point of what is being said is that displays will try to maintain 1:1 with PQ as far as they reliably can before the tonemapping kicks in
2024-10-09 06:47:25
this means if you have a PQ video that never goes higher than, say, 600 nits, and your TV can reliably display 1000 nits, the signal will be untouched; the display's tonemapper will pretty much just be a no-op
w
2024-10-09 06:54:42
nuh uh
lonjil
2024-10-09 07:07:40
1:1 makes zero sense since the viewing environment matters a lot. Or does PQ mandate a particular viewing environment for mastering, making it really not all that different from sRGB?
Demiurge
2024-10-09 07:45:10
Honestly very little about the theory of HDR makes sense to me :D
2024-10-09 07:47:42
You would think it would be as simple and obvious as it is in audio with changing the bit depth. But noooo. There's a whole bunch of extra assumptions and technologies attached to it and the separation between different technologies is muddled together rather than kept logically separated.
2024-10-09 07:50:06
audio has nonlinear "transfer functions" too, like mu-law encoding. It's called "companding" in that domain.
2024-10-09 07:51:33
They don't introduce a whole bunch of extra assumptions and unexpected complexity because completely different technologies and techniques are kept separate and distinct and are easy to understand and work with and mix/match.
2024-10-09 07:52:19
I wish image processing was as simple and logical as audio processing 😂
2024-10-09 07:53:28
Do I just not "get it?"
w
2024-10-09 08:24:47
audio isnt simple either
Quackdoc
lonjil 1:1 makes zero sense since the viewing environment matters a lot. Or does PQ mandate a particular viewing environment for mastering, making it really not all that different from sRGB?
2024-10-09 08:56:10
1:1 in the sense of, "If you put a calibration tool to your display, in a reference viewing environment, send a PQ signal of 0.35, the display will output the nits equivalent of that"
lonjil 1:1 makes zero sense since the viewing environment matters a lot. Or does PQ mandate a particular viewing environment for mastering, making it really not all that different from sRGB?
2024-10-09 08:56:18
they all do
2024-10-09 08:56:35
you can't make a "reference" without controlling the environment
2024-10-09 08:56:55
well, all in the sense of stuff that has a "reference"
lonjil
2024-10-09 08:57:28
So sRGB and PQ are the same and supposed nit levels are entirely ignorable
Quackdoc
2024-10-09 08:57:53
what? no
2024-10-09 08:58:15
PQ has an absolute reference nit value
2024-10-09 08:58:35
ofc reference is reference. but the "absolute intent" is there
dogelition
CrushedAsian255 how does that interact with screen brightness controls?
2024-10-09 09:36:18
can only speak for LG TVs: oled light and contrast set to the default 100 gives you accurate pq tracking, lowering either one makes it undertrack
Quackdoc not like it matters because you get retarded BS like this anyways lmao
2024-10-09 09:37:59
this is caused by ABL, which is just based on the average picture level (APL). tuning the ABL curve for very high brightness in tiny windows necessarily makes it start dimming/undertracking at a rather low APL

2024-10-09 09:38:47
afaik every consumer panel that currently exists just uses APL for the ABL implementation
Quackdoc
2024-10-09 09:38:55
[Hmm](https://cdn.discordapp.com/emojis/1113499891314991275.webp?size=48&quality=lossless&name=Hmm)
2024-10-09 09:39:14
last I checked you often needed manufacturer mode to disable energy efficiency stuff
dogelition
2024-10-09 09:39:24
e.g. full field gray/white in hdr on my lg c1 oled starts undertracking at around 60 nits
2024-10-09 09:39:53
despite the panel being able to show like 120 nits full field
w
2024-10-09 09:40:41
also because oled is really unstable at tracking and you can tell the difference in 0-100 nits
dogelition
2024-10-09 09:41:36
the math is fairly simple: assuming you have a panel that does 1000 nits in a 3% window (and starts dimming at a >3% window size), that means ABL kicks in at 30 nits APL
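That back-of-the-envelope calculation can be written down directly: the APL where ABL must start dimming is just the peak-window power budget spread over the whole screen. A toy sketch of that arithmetic (not any panel's actual firmware logic):

```python
def abl_onset_apl(peak_nits: float, window_fraction: float) -> float:
    """Rough APL (average picture level) at which ABL has to kick in:
    the power of a full-brightness small window, averaged over the full screen."""
    return peak_nits * window_fraction

# A 1000-nit peak sustained only in a 3% window means dimming
# must start once the average picture level reaches ~30 nits.
onset = abl_onset_apl(1000.0, 0.03)
```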
CrushedAsian255
dogelition can only speak for LG TVs: oled light and contrast set to the default 100 gives you accurate pq tracking, lowering either one makes it undertrack
2024-10-09 09:50:36
Undertrack meaning just divide brightness by x?
dogelition
2024-10-09 09:51:03
should be just that, yes
2024-10-09 09:53:09
oled light controls the "backlight" of the panel, while contrast is like a multiplier applied to the digital signal sent to the panel, but because the panel is actually running in 2.2 gamma, both operations should be equivalent to dividing the luminance
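The reason a signal multiplier ends up as a plain luminance divide is that a pure power law turns a constant signal scale into a constant luminance scale: (k·s)^2.2 = k^2.2 · s^2.2 at every signal level. A toy model (idealized 2.2-gamma panel, not a real display characterization) showing the ratio is constant:

```python
def panel_luminance(signal: float, gamma: float = 2.2, peak: float = 100.0) -> float:
    """Toy display model: normalized signal -> nits via a pure power law."""
    return peak * signal ** gamma

# Halving the "contrast" multiplier divides luminance by the same
# factor (0.5**2.2) at every signal level -- a uniform dimming.
k = 0.5
ratios = [panel_luminance(k * s) / panel_luminance(s) for s in (0.2, 0.5, 0.9)]
```

So lowering either control should track PQ shape-wise, just scaled down, which matches the "undertrack = divide brightness by x" description above.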
Meow
2024-10-09 10:35:10
How should I make this image smaller with -d >0? Now I could make it smaller only with -d 0
CrushedAsian255
2024-10-09 10:36:56
-d 7.2 works
Meow
2024-10-09 10:41:03
That's only 2.5% smaller than -d 0
CrushedAsian255
2024-10-09 10:41:50
i think lossless works better because the alpha channel can be coupled with the image data
Meow
2024-10-09 10:41:52
It's also interesting that -d 7.2 resulted in ssimulacra2 at 90.20265327
CrushedAsian255
2024-10-09 10:42:08
it's because of the alpha
Demiurge
w audio isnt simple either
2024-10-09 11:06:13
It is extremely straightforward... Definitely way more obvious and logical... I never felt overwhelmed with terminology and multiple different ideas being mixed together and conflated with one another the way they are in images/video
w
2024-10-09 11:07:16
yeah but have you heard about loudness
Demiurge
2024-10-09 11:07:33
Sure, it's often measured in decibels...
w
2024-10-09 11:07:55
that's for sound
2024-10-09 11:08:12
<:clueless:1186118225046540309>
Demiurge
2024-10-09 11:08:46
It's never been a big deal or a crazy mess like it is in video
lonjil
2024-10-09 11:08:48
most digital audio is only measured as a sort of relative loudness
w
2024-10-09 11:10:46
loudness curve is like tonemapping
Demiurge
2024-10-09 11:11:19
It can get complicated, but when it does it's usually because of one specific thing at a time and not because of multiple completely unrelated things interacting with each other badly all at the same time
jonnyawsom3
Meow How should I make this image smaller with -d >0? Now I could make it smaller only with -d 0
2024-10-09 11:23:16
8bit palette with only 46 colors, so even with Alpha aside it's much more suited for lossless
Demiurge
2024-10-09 11:24:11
And the completely unrelated things are never conflated or assumed to be related
Meow
8bit palette with only 46 colors, so even with Alpha aside it's much more suited for lossless
2024-10-09 12:18:44
Now PNG is even slightly smaller than JXL <:FeelsAmazingMan:808826295768449054>
jonnyawsom3
2024-10-09 12:20:06
Not surprised, the palette coding still needs a lot of work to even beat GIF most of the time
Traneptora
w audio isnt simple either
2024-10-09 12:21:01
audio isn't simple? clearly never heard of "just use opus" :D
2024-10-09 12:21:08
(/s)
yoochan
2024-10-09 01:38:59
just use opus ! 😄
CrushedAsian255
2024-10-09 01:57:15
just use mp3 /s
Enhex
2024-10-09 09:57:20
im curious about https://github.com/libjxl/libjxl/discussions/3634
2024-10-09 09:58:02
is there compression between frames?
veluca
2024-10-09 09:58:59
the encoder doesn't really do it, but there could be
jonnyawsom3
Enhex im curious about https://github.com/libjxl/libjxl/discussions/3634
2024-10-09 10:18:53
Reading that post, is this lossless or high quality lossy? Because lossless uses square groups with 128, 256, 512 and 1024 pixel sizes, so that might be why compression didn't increase with merged images. We've had discussions about satellite imagery with similar ideas, but they ended up using extra channels which could then reference each other to save space; I'm not sure if there's anything to do the same with frames yet. One idea to test is using APNG as an intermediary run through optipng to make use of identical areas, as cjxl can then reuse the cropped frames/layered transparency during encoding.