JPEG XL

jxl

Anything JPEG XL related

dogelition
dogelition chromium's tone mapping also looks nice, see here https://discord.com/channels/794206087879852103/824000991891554375/1283499159332196403 (assuming you're using the discord website/desktop app and hdr is disabled)
2024-09-12 09:40:07
it doesn't look nice when you open the png in firefox though, which afaict uses the 3d lut built into the icc profile instead of doing some tone mapping by itself
Demiurge
_wb_ Normally I would add some dark beer to the batter but the only bottles I had available were all too good to be used for that. Used sparkling water instead, that also works.
2024-09-12 10:16:01
literally no such thing as good beer
2024-09-12 10:17:36
Not even the dark stuff. Although the dark stuff is slightly more drinkable than the watery stuff.
yoochan
Demiurge literally no such thing as good beer
2024-09-12 10:28:22
if you like beer and are from Belgium, there is
Demiurge
2024-09-12 10:35:58
that's a big IF
TheBigBadBoy - 𝙸𝚛
2024-09-12 10:38:52
then "if you come to Belgium, there is" <:CatSmile:805382488293244929>
_wb_
2024-09-12 10:43:08
Good beer most definitely exists. But obviously, de gustibus non disputandum est.
Traneptora
2024-09-12 07:23:16
how does libjxl convert XYB-grayscale back to single-channel? does it invert the XYB and grab the green channel?
2024-09-12 07:23:44
or does it average the three channels?
_wb_
2024-09-12 07:27:17
XYB grayscale just has X and B all-zeroes. Converting that to RGB will produce R=G=B. So it doesn't really matter how you convert that to grayscale 🙂
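As a quick illustration of the point above (a sketch, not libjxl code; numpy is used for convenience and the array is a stand-in for decoded pixels):

```python
import numpy as np

# XYB grayscale decodes to R = G = B, so grabbing the green channel and
# averaging the three channels produce the same single-channel image.
rgb = np.full((4, 4, 3), 0.5)      # stand-in for decoded R=G=B pixels
gray_green = rgb[..., 1]           # grab the green channel
gray_mean = rgb.mean(axis=-1)      # or average the three channels
assert np.allclose(gray_green, gray_mean)
```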
Traneptora
2024-09-12 07:28:07
ah. atm I'm just grabbing the green channel. but I'm failing conformance on grayscale_public_university so I need to figure out why
_wb_
2024-09-12 07:38:35
That's not the reason then.
Traneptora
2024-09-12 07:39:49
I wonder if it's EPF again. EPF is performed in XYB space so maybe it's that?
Fraetor
monad That's not me, it's <@176735987907559425>. I thought you were looking for sample code, and that is a very readable one.
2024-09-12 08:42:27
I should really get back to writing that, it was really fun to make, and I'm a better developer now too.
2024-09-12 08:53:12
It's quite cathartic actually, as unlike all my work projects, the requirements for a decoder are perfectly clear, and even have a nice specification.
CrushedAsian255
2024-09-13 02:15:42
`-d 1 -e 7` is default right?
jonnyawsom3
2024-09-13 02:37:30
Unless it's a JPEG or GIF input, yes
Oleksii Matiash
2024-09-13 06:18:53
Just curious: why does lossless compression of an 8-bit file produce a containerless file, while the same file in 16-bit forces container usage? There is no metadata, so that can't be the reason
CrushedAsian255
2024-09-13 06:19:56
Can you send the files?
_wb_
Oleksii Matiash Just curious, why lossless compression of 8-bit file produces containerless file, and the same file in 16 bit forces container usage? There is no metadata, so it is not the reason
2024-09-13 06:23:25
The reason is that containerless is implicitly Main Profile, Level 5. For any other profile/level, you have to use a container and indicate what it is with a `jxll` box.
2024-09-13 06:24:37
For 16-bit lossless, after RCTs etc (even before, actually), things don't fit in int16 buffers, which is a requirement for level 5.
2024-09-13 06:25:44
You can only go up to 14-bit or so while staying within Level 5. For lossless that is — for lossy, bitdepth is just a suggestion and doesn't matter.
Oleksii Matiash
2024-09-13 06:26:39
Thank you
_wb_
2024-09-13 06:29:54
You _can_ strip the container in this case and it will work in all current decoders. But technically it's invalid, because the codestream is doing things it is not allowed to do within Main Profile, Level 5. In theory, decoders could refuse to decode such invalid bitstreams. Libjxl does not want to create invalid bitstreams so it automatically does the right thing. Files produced by jxl_from_tree are always containerless and can be technically invalid files though.
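For reference, a minimal sketch of telling the two flavors apart by signature (the bare codestream starts with the bytes `FF 0A`, the container with a 12-byte `JXL ` signature box; the function name is illustrative):

```python
CODESTREAM_SIG = b"\xff\x0a"
CONTAINER_SIG = bytes.fromhex("0000000C4A584C200D0A870A")

def jxl_kind(path):
    with open(path, "rb") as f:
        head = f.read(12)
    if head.startswith(CONTAINER_SIG):
        return "container (profile/level can be signalled in a jxll box)"
    if head.startswith(CODESTREAM_SIG):
        return "bare codestream (implicitly Main profile, Level 5)"
    return "not a JPEG XL file"
```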
Oleksii Matiash
2024-09-13 06:31:27
Thank you again 🙂
CrushedAsian255
_wb_ You _can_ strip the container in this case and it will work in all current decoders. But technically it's invalid, because the codestream is doing things it is not allowed to do within Main Profile, Level 5. In theory, decoders could refuse to decode such invalid bitstreams. Libjxl does not want to create invalid bitstreams so it automatically does the right thing. Files produced by jxl_from_tree are always containerless and can be technically invalid files though.
2024-09-13 08:24:33
I'm guessing Level 5 is more web-targeted?
_wb_
2024-09-13 08:26:04
Exactly. From the spec:
> To promote interoperability, a single profile, named “Main” profile, is defined. This profile is intended for use (among others) in mobile phones, web browsers, and image editors. It includes all of the coding tools in this document.
>
> The Main profile has two levels. Level 5 is suitable for end-user image delivery, including web browsers and mobile apps. Level 10 corresponds to a broad range of use cases such as image authoring workflows, print, scientific applications, satellite imagery, etc.
>
> Levels are defined in such a way that if a decoder supports level N, it also supports lower levels.
>
> Unless signalled otherwise, a JPEG XL codestream is assumed to be conforming to the Main profile, level 5.
CrushedAsian255
2024-09-13 08:26:48
so using Level 10 images may cause incompatibilities with that Rust JXL decoder that Firefox might add
lonjil
2024-09-13 08:29:01
seems unlikely
CrushedAsian255
2024-09-13 08:29:56
they probably don't need any features in level 10 and it would lower the chance of malicious image files crashing / hanging the system / process
_wb_
2024-09-13 08:30:43
I don't think so, probably it will just decode anything. But e.g. CMYK is not in level 5 and browsers generally don't want to bother with implementing proper color management for it, so that would be one thing in Level 10 that probably will not be properly supported on the web. Also for some of the limits regarding spline sizes and number of channels etc, a browser decoder might want to refuse decoding things outside Level 5.
CrushedAsian255
_wb_ I don't think so, probably it will just decode anything. But e.g. CMYK is not in level 5 and browsers generally don't want to bother with implementing proper color management for it, so that would be one thing in Level 10 that probably will not be properly supported on the web. Also for some of the limits regarding spline sizes and number of channels etc, a browser decoder might want to refuse decoding things outside Level 5.
2024-09-13 08:31:45
so they will end up supporting level 7.5 or something?
2024-09-13 08:33:44
(made up name)
_wb_
2024-09-13 08:37:49
We haven't defined intermediate levels but yeah, in practice a browser implementation would only guarantee to conform to Level 5 (and even that only in a "best effort" way in the sense that if you want to see a huge image, you'll need enough memory for it), but it will also decode many Level 10 files. If you want to be sure it will not be rejected, you should stick to Level 5, but most of the limits of Level 5 will likely not be hard-enforced.
jonnyawsom3
2024-09-13 08:38:18
Level 5 supported, higher not guaranteed
CrushedAsian255
2024-09-13 08:39:05
so it's basically "We can do Level 5 (as long as you're not being stupid with resolutions) and can probably load some/most level 10 images, but don't count on it"?
Oleksii Matiash
2024-09-13 08:41:53
Do I understand correctly that 16 bpp lossless files are not guaranteed to be decoded by browsers? I'm just curious; 16 bpp lossless is obviously not the most used format on the internet
Tirr
2024-09-13 08:43:19
browsers would fully implement Level 5, while Level 10 is not strictly required, but they may implement some of it. The implementation might not fully conform to Level 10 though
CrushedAsian255
2024-09-13 08:43:48
is there a way to tell `djxl` to reject certain images?
_wb_
CrushedAsian255 so it's basically "We can do Level 5 (as long as you're not being stupid with resolutions) and can probably load some/most level 10 images, but don't count on it"?
2024-09-13 08:43:54
yes, that's the idea. In particular, if you're going to have very big splines that require a lot of effort to render, probably it's going to be rejected right away if it's outside Level 5, and if you use CMYK, you'll probably not get correct color management. Also animations faster than 120 fps will be slowed down to 120 fps, and maybe some other things.
CrushedAsian255
2024-09-13 08:44:38
the idea is that things in level 10 are probably not going to be particularly useful on the web, right? who needs 120+fps GIFs
Tirr
2024-09-13 08:44:44
especially since Level 10 requires lossless f32 to be bit-exact
_wb_
CrushedAsian255 is there a way to tell `djxl` to reject certain images?
2024-09-13 08:45:27
currently I don't think so, probably it would be useful to have some options to let it check the Levels limits and to put limits on image dimensions / nb of channels / nb of frames/layers
CrushedAsian255
2024-09-13 08:45:41
also with `modular_16_bit_buffers` can I still use 32 bit buffers?
_wb_
Oleksii Matiash Do I understand correctly that 16 bpp lossless files are not guaranteed to be decoded by browser? I'm just curious, 16 bpp ll obviously not the most used format over the internet
2024-09-13 08:47:15
That's right, but this is probably one of the things where browsers will allow a bit more than what Level 5 says, e.g. still allow up to 16-bit (but maybe not all the way to 32-bit).
Oleksii Matiash
2024-09-13 08:47:32
Thank you
_wb_
CrushedAsian255 also with `modular_16_bit_buffers` can I still use 32 bit buffers?
2024-09-13 08:54:09
If this header field is true (like it has to be in Level 5), then you can in principle use int16 buffers in the implementation of all modular decoding, while otherwise you have to use twice the memory and use int32 buffers. Currently libjxl ignores the field and always uses int32 buffers anyway (we haven't bothered with writing specialized/optimized code paths yet). For lossy, even at pretty high quality, int16 buffers do suffice since the modular data only contains quantized coefficients — after dequantization you'll need float32 to get enough precision. All of this is unrelated to what kind of buffers the application uses at the libjxl API level to pass images to the encoder or to get images from the decoder.
2024-09-13 09:01:44
In practice, there is not a lot of lossless data that actually has a full 16 bits of precision. A lot of things end up just being scaled to 16-bit because PNG, TIFF, PPM etc are byte-based so it's either 8-bit or 16-bit. But the actual precision of cameras is more like 14-bit, and for Rec2100 PQ/HLG, 12-bit suffices (and most displays only support 10-bit precision). This is why we considered this an acceptable limitation for Level 5.
CrushedAsian255
2024-09-13 09:03:09
Can Level 5 do 12 bpp?
_wb_
2024-09-13 09:05:34
Up to 15-bit is possible (one bit is lost because int16 is signed and image data is usually unsigned) but not very effective since then you can't do RCTs. Up to 14-bit is possible with effective compression.
2024-09-13 09:06:57
You are also allowed to make a lossy file that is marked as being 16-bit (though not higher than that, in Level 5).
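A sketch of why 16-bit lossless overflows int16 once an RCT is applied (this uses a YCoCg-R-style reversible transform purely for illustration; JPEG XL defines its own family of RCTs):

```python
def ycocg_r(r, g, b):
    co = r - b             # residual needs n+1 signed bits for n-bit input
    tmp = b + (co >> 1)
    cg = g - tmp           # likewise n+1 signed bits
    y = tmp + (cg >> 1)
    return y, co, cg

# 16-bit input: max sample value is 65535
y, co, cg = ycocg_r(65535, 0, 0)
print(co)   # 65535 -> needs 17 signed bits, so it no longer fits in int16
```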
CrushedAsian255
2024-09-13 09:07:03
Is JXL’s max int32/uint31?
Tirr
2024-09-13 09:08:36
int24 and float32
CrushedAsian255
2024-09-13 09:09:48
How does float32 fit in an int32?
_wb_
2024-09-13 09:09:58
just bit casting
CrushedAsian255
2024-09-13 09:10:11
Then wouldn’t negative numbers go all wonky?
2024-09-13 09:10:20
Or are negative floats inverted?
_wb_
2024-09-13 09:10:27
negative floats just become negative ints that way
CrushedAsian255
2024-09-13 09:10:43
But they go the other way compared to 2's complement?
_wb_
2024-09-13 09:11:10
no it's the same
2024-09-13 09:11:20
sign bit 0 means positive in both cases
CrushedAsian255
2024-09-13 09:12:02
No, as in 0xffffffff is the negative 2's complement number closest to 0
2024-09-13 09:12:22
But the closest negative float to 0 is 0x80000001
_wb_
2024-09-13 09:16:30
right, in that sense they go the other way
CrushedAsian255
2024-09-13 09:17:10
Sorry if my wording was confusing
_wb_
2024-09-13 09:18:25
typical float data is in the 0..1 range though, so the first two bits are zeroes and it makes sense when cast to int32 in the sense that order is preserved (if a < b when interpreted as float, also a < b when bitcast to int32)
2024-09-13 09:20:01
in any case, predictors get a bit wonky when they get float-bitcast-to-int data, but still make some amount of sense and help with compression
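A small sketch of the bitcast behaviour being discussed (pure Python, using `struct` to reinterpret bits):

```python
import struct

def bitcast_i32(f):
    # reinterpret the bits of a float32 as a signed int32
    return struct.unpack('<i', struct.pack('<f', f))[0]

# Non-negative floats keep their order when bitcast:
assert bitcast_i32(0.25) < bitcast_i32(0.5) < bitcast_i32(1.0)

# Negative floats become negative ints, but their order is reversed:
# the negative float closest to zero maps to 0x80000001 (near INT32_MIN),
# unlike two's complement, where the int closest to zero is 0xFFFFFFFF (-1).
assert bitcast_i32(-1.4e-45) < bitcast_i32(-3.4e38)
```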
CrushedAsian255
2024-09-13 09:20:39
Float bitcast to int is kinda a piecewise-linear exponential function?
_wb_
2024-09-13 09:20:53
yeah that's basically what it does
2024-09-13 09:22:06
which is kind of OKish if the data itself is linear light, so the bitcasting is basically similar to applying a log transfer curve
2024-09-13 09:23:54
but lossless compression of float32 is somewhat limited, e.g. you cannot do squeeze or RCTs (if the data can be arbitrary floats) since that would require more than int32 for the residuals.
2024-09-13 09:24:35
our assumption was that if you want to do lossless float32, you don't really care too much about compression 🙂
CrushedAsian255
2024-09-13 09:25:49
when i was saying "the other way" i meant like this, where NaNs are in red
2024-09-13 09:26:02
where if the left side was bit flipped it would look more like this
2024-09-13 09:26:16
2024-09-13 09:26:44
which matches more like how int works
2024-09-13 09:26:51
_wb_
2024-09-13 09:28:12
hmyeah probably we should have done that, I hadn't given it much thought since the float image data I could find didn't go negative
2024-09-13 09:28:20
too late for that now though
CrushedAsian255
2024-09-13 09:29:13
i guess float data is usually not going negative
2024-09-13 09:29:25
and when it is, it's probably an anomaly so doesn't matter about compression
2024-09-13 09:29:53
and also what you said earlier
2024-09-13 09:29:59
> our assumption was that if you want to do lossless float32, you don't really care too much about compression 🙂
_wb_ too late for that now though
2024-09-13 09:32:18
extensions? idk lol
2024-09-13 09:32:27
don't quite understand what they do
username
2024-09-13 09:35:33
what is the resolution limit set as in level 5?
CrushedAsian255
2024-09-13 09:36:16
Max width/height : 256k
2024-09-13 09:36:23
Can’t be more than 256 MPx
2024-09-13 09:36:40
So maximum of 256k by 1024
2024-09-13 09:36:44
If I’m reading the spec right
2024-09-13 09:36:49
Or other way round
username
2024-09-13 09:37:46
just to be clear you mean something like 256,XXX x 256,XXX?
CrushedAsian255
2024-09-13 09:39:24
Maximum is
username
2024-09-13 09:40:14
I have a headache at the moment and just want it in basic terms lol
2024-09-13 09:41:15
I might just go calculate it when my headache goes away
CrushedAsian255
2024-09-13 09:41:28
Maximum width or height by itself is 262144
2024-09-13 09:41:47
Multiplied together the total amount of pixels can’t be more than 256 million
_wb_
2024-09-13 09:42:26
So basically the limit is 256 megapixels, with an aspect ratio that is not too extreme
2024-09-13 09:43:38
(between 256:1 and 1:256)
2024-09-13 09:44:07
That should be enough for the web
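A sketch of those limits as discussed here (assuming max dimension 2^18 and a 2^28 pixel budget, the values quoted in this thread; check the spec for the normative numbers):

```python
MAX_DIM = 1 << 18       # 262144
MAX_PIXELS = 1 << 28    # ~268 million pixels

def fits_level5(width, height):
    return (width <= MAX_DIM and height <= MAX_DIM
            and width * height <= MAX_PIXELS)

print(fits_level5(262144, 1024))   # True: 2^18 * 2^10 = 2^28
print(fits_level5(262144, 2048))   # False: exceeds the pixel budget
```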
username
2024-09-13 09:47:23
around 4 times the max dimension of JPEG 1, so yeah, it should be plenty and also allows some breathing room for the future
_wb_
2024-09-13 09:48:21
On an 8k screen, that is 16 screenfuls. If you want bigger than that, it's probably a good idea to split it up into several images.
CrushedAsian255
2024-09-13 09:48:58
Or maybe don’t do that on the web
_wb_
2024-09-13 09:50:04
We can always define a Level 6 if it would become a real limitation at some point, but I don't expect that to happen anytime soon or even ever.
CrushedAsian255
2024-09-13 09:50:41
So that’s why there’s a gap?
2024-09-13 09:50:45
Like BASIC lines?
2024-09-13 09:51:08
There’s no plan for a level 4 for low-spec use?
username
CrushedAsian255 They’re no plan of level 4 for low specification ?
2024-09-13 09:57:22
Should a need arise, there might be a level 4 or whatever, but there's really no reason to define one at the moment. Keep in mind that there's a real danger in artificially slapping on limitations. For example, we could have had native 12-bit JPEGs, but no one ever implemented support because it was seen as unneeded at the time, and now because of that we are stuck with 8-bit JPEGs, since using 12-bit ones would immediately break compatibility with 90% of software and hardware.
CrushedAsian255
2024-09-13 09:59:03
Same with arithmetic coding?
username
username Should a need arise then there might be a level 4 or whatever however there's really no reason to define one at the moment and keep in mind that there's a real danger in artificially slapping on limitations. For example we could have had native 12-bit JPEGs but no one ever implemented support because it was seen as un-needed at the time and now because of that we are stuck with 8-bit JPEGs because using 12-bit ones would immediately break compatibility with 90% of software and hardware.
2024-09-13 09:59:22
of course, spec-defined levels/profiles are a bit different than what happened with JPEG 1, but eh, better safe than sorry
CrushedAsian255
2024-09-13 09:59:25
Actually, that was because of patterns wasn’t it?
2024-09-13 09:59:29
Patents*
username
2024-09-13 09:59:37
yeah
CrushedAsian255
2024-09-13 10:00:19
Why don’t people use JNG for transparent photographic images?
username
CrushedAsian255 Why don’t people use JNG for transparent photographic ?
2024-09-13 10:01:04
what is that again? is it that format that uses JPEG for the base image and PNG for alpha?
2024-09-13 10:02:36
because if so then well iOS apps (don't know about modern ones) and Adobe Flash both used it so that's something
CrushedAsian255
2024-09-13 10:02:42
http://www.libpng.org/pub/mng/spec/jng.html
username because if so then well iOS apps (don't know about modern ones) and Adobe Flash both used it so that's something
2024-09-13 10:02:55
They did?
2024-09-13 10:02:57
Never knew
username
2024-09-13 10:06:00
don't know if they *specifically/exactly* used JNG or just independently invented/re-invented the same idea/thing but yeah they used combined JPEG and PNG data for storing assets
CrushedAsian255
2024-09-13 10:08:37
Interesting
2024-09-13 10:08:39
Actually, a JNG may contain two separate JNG JPEG datastreams (one eight-bit and one twelve-bit), each contained in a series of JDAT chunks, and separated by a JSEP chunk (see the JSEP chunk specification below, Paragraph 1.1.5). Decoders that are unable to (or do not wish to) handle twelve-bit datastreams are allowed to display the eight-bit datastream instead, if one is present.
username
2024-09-13 10:08:52
whenever I go to extract SWF files the assets come out like that and also I looked at an old iOS app and it was storing the main color data as JPEG and the alpha as separate PNG files
2024-09-13 10:09:09
so maybe not exactly JNG but just the same basic idea
CrushedAsian255
username whenever I go to extract SWF files the assets come out like that and also I looked at an old iOS app and it was storing the main color data as JPEG and the alpha as separate PNG files
2024-09-13 10:09:17
Both probably “reinvented” it as it’s a relatively straightforward and obvious idea
username
2024-09-13 10:09:33
yeah
_wb_
CrushedAsian255 Like BASIC lines?
2024-09-13 10:27:37
Yes, exactly like BASIC line numbers. We want to keep the invariant that supporting a higher level implies supporting any lower level, so leaving room for potential new levels, should the need arise, seemed wise.
2024-09-13 10:36:45
We _could_ at some point define a lower level with stricter limits, in case we want to define use cases where conformance is guaranteed in a hard way, i.e. for systems where you actually want to guarantee to be able to decode the image in a given time budget and with a given amount of available memory. For now, that need has not yet arisen.

We will probably at some point define a lower **profile**, which is not just putting limits on the sizes of things, but also putting limits on which coding tools are allowed / have to be implemented. For the camera use case, you don't need/want extra channels, patches, splines, big blocks, etc. When defining a subset of coding tools, it's not just a levels thing but a different profile. Probably we'll make the level numbering global though, so e.g. this lower profile might have levels 1 and 3, for example.

There might at some point also be extensions that will require defining a _higher_ profile, but for now, that need has not yet arisen. It is something we prefer to avoid. In general we really want to avoid having different profiles. For interoperability it is best if all decoders (at least the software ones) can just decode anything. This is why for now, there is only one profile ("Main") and conformance requires implementing everything.
CrushedAsian255
2024-09-13 11:44:33
So you might have “capture” profile that doesn’t have the fancy tools?
Oleksii Matiash
CrushedAsian255 So you might have “capture” profile that doesn’t have the fancy tools?
2024-09-13 11:47:55
We already have it: 8-bit JPEG. I mean, it is a very dangerous approach to allow limiting decoders to a level that is just enough for cameras
CrushedAsian255
2024-09-13 11:48:19
theoretically a camera could just take an 8 bit jpeg and then lossless jxl it
_wb_
2024-09-13 11:51:03
It would be a profile intended basically only for on-device viewing of the pictures you just took.
CrushedAsian255
2024-09-13 11:53:52
would it basically be lossy `-e 5`?
_wb_
2024-09-13 11:56:25
more like the output of libjxl-tiny
CrushedAsian255
_wb_ more like the output of libjxl-tiny
2024-09-13 12:24:23
What coding features does it use?
_wb_
2024-09-13 12:51:55
basically vardct with smaller block sizes only (up to 16x16 iirc)
2024-09-13 12:54:31
for d < 1 and on photographic images, it's pretty much almost as good as default libjxl; for images that benefit from patches (like screenshots with text) or at low quality, it's more like somewhere between jpegli and libjxl. But the main use case would be cameras, so low quality and non-photographic images don't really matter
CrushedAsian255
2024-09-13 12:57:52
I’m guessing no fancy MA trees?
_wb_
2024-09-13 01:16:54
yeah, we haven't yet decided on what limits to impose on that but likely the shape of the MA trees will be heavily constrained. It's only used for a small amount of data anyway (the DC and some control fields). Same with the context model size for the AC. Allowing arbitrary MA trees for a hardware decoder would be pretty inconvenient 🙂
jonnyawsom3
2024-09-13 01:20:06
A fixed MA tree like effort 3?
_wb_
2024-09-13 01:26:50
probably something like that — anything that can be done without branches or large lookup tables.
KKT
2024-09-13 06:37:12
OK, I've got a bit of a weird one. My kid's school sent out a link for all the photos from last year in Google Photos (many thousands of them). Downloaded them all and ran them through with: `parallel cjxl --num_threads=0 -j 0 -d 1 -e 9 {} {.}.jxl ::: ./*.jpg` Obviously quality is already degraded, so they don't have to be awesome. Most were taken with an iPhone. I'm getting a shift in the HDR – JPEG's highlights are noticeably brighter. Attached is a good example. They're close to the same in Preview, but not exactly. Quicklook doesn't do HDR for JXL files at all. Preview shows the JPEG as 10 bits. ExifTool shows `Profile Connection Space: XYZ`. So these are Jpegli compressed?
embed
2024-09-13 06:38:07
https://embed.moe/https://cdn.discordapp.com/attachments/794206170445119489/1284221553160097862/IMG_0015.jxl?ex=66e5d805&is=66e48685&hm=619195ab9dde0cefa8be2d18d38edf488ea1c3ac8b37879ee3374d51ce740655&
KKT
2024-09-13 06:39:16
Ugh. Wrong image for the JXL
2024-09-13 06:39:30
This one.
spider-mario
KKT OK, I've got a bit of a weird one. My kid's school sent out a link for all the photos from last year in Google Photos (many thousands of them). Downloaded them all and ran them through with: `parallel cjxl --num_threads=0 -j 0 -d 1 -e 9 {} {.}.jxl ::: ./*.jpg` Obviously quality is already degraded, so they don't have to be awesome. Most were taken with an iPhone. I'm getting a shift in the HDR – JPEG's highlights are noticeably brighter. Attached is a good example. They're close to the same in Preview, but not exactly. Quicklook doesn't do HDR for JXL files at all. Preview shows the JPEG as 10 bits. ExifTool shows `Profile Connection Space: XYZ`. So these are Jpegli compressed?
2024-09-13 06:42:47
note that XYB ≠ XYZ
2024-09-13 06:42:55
XYZ is the 1931 colorspace from the CIE
KKT
2024-09-13 06:42:57
Ohh, misread that
jonnyawsom3
2024-09-13 07:04:21
I keep making that mistake and getting excited at random ICC tags
RaveSteel
2024-09-13 07:09:44
Same
KKT
2024-09-13 07:50:51
The 10-bit JPEG got me pointed in the wrong direction too.
Demiurge
_wb_ We _could_ at some point define a lower level with stricter limits, in case we want to define use cases where conformance is guaranteed in a hard way, i.e. for systems where you actually want to guarantee to be able to decode the image in a given time budget and with a given amount of available memory. For now, that need has not yet arisen. We will probably at some point define a lower **profile**, which is not just putting limits on the sizes of things, but also putting limits on which coding tools are allowed / have to be implemented. For the camera use case, you don't need/want extra channels, patches, splines, big blocks, etc. When defining a subset of coding tools, it's not just a levels thing but it's a different profile. Probably we'll make the level numbering global though, so e.g. this lower profile might have levels 1 and 3, for example. There might at some point also be extensions that will require defining a _higher_ profile, but for now, that need has not yet arisen. It is something we prefer to avoid. In general we really want to avoid having different profiles. For interoperability it is best if all decoders (at least the software ones) can just decode anything. This is why for now, there is only one profile ("Main") and conformance requires implementing everything.
2024-09-14 03:17:29
Sounds kinda like PIK mode
2024-09-14 03:21:10
I'm interested in what Jyrki said about "frequency lifting" and I wonder what that was
BabylonAS
2024-09-14 09:26:41
Can JXL use indexed colors? I often deal with images that have less than 256 colors
spider-mario
2024-09-14 10:07:11
yes, in lossless mode
_wb_
2024-09-14 10:20:47
JXL can do any palette size up to 70k colors.
2024-09-14 10:21:35
Plus delta palette and default palette colors.
CrushedAsian255
_wb_ JXL can do any palette size up to 70k colors.
2024-09-15 12:28:40
70k?
2024-09-15 01:14:52
is there jpeg xl merch?
monad
2024-09-15 01:46:29
New fundraiser for Luca's Rust decoder.
CrushedAsian255
monad New fundraiser for Luca's Rust decoder.
2024-09-15 01:48:04
is this rust decoder going to be open source?
monad
2024-09-15 01:49:49
I would assume so, given the primary impetus is Firefox integration.
CrushedAsian255
2024-09-15 06:06:21
is 16 bit enough precision for lossless yuv422p10->rgb?
_wb_
2024-09-15 06:31:55
I would assume it is, but doesn't hurt to check. At least for 10-bit yuv444 you can just check all values and see if they roundtrip. Maybe the chroma subsampling makes it a bit harder to roundtrip though.
CrushedAsian255
2024-09-15 06:33:04
It’s not really a big deal if one pixel’s off by one value, I was just wondering
_wb_
2024-09-15 07:06:04
It's mostly rgb -> yuv that is losing information (because you're mapping a color cube to a volume that only uses about 1/4th of the coordinate space), the other way around is less problematic.
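A Monte Carlo sketch of that "about 1/4th" figure (a BT.709-style full-range matrix is assumed here; the exact fraction depends on the matrix):

```python
import random

def yuv_to_rgb(y, u, v):
    r = y + 1.5748 * v
    b = y + 1.8556 * u
    g = (y - 0.2126 * r - 0.0722 * b) / 0.7152
    return r, g, b

# Sample the YUV box uniformly; count points that land inside the RGB cube.
n, inside = 100_000, 0
for _ in range(n):
    y, u, v = random.random(), random.random() - 0.5, random.random() - 0.5
    if all(0.0 <= c <= 1.0 for c in yuv_to_rgb(y, u, v)):
        inside += 1
print(inside / n)   # roughly 0.25
```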
CrushedAsian255
2024-09-15 11:46:13
???
2024-09-15 11:46:18
scam?
BabylonAS
2024-09-15 11:46:44
There was a spam attack
CrushedAsian255
2024-09-15 11:48:07
oop
2024-09-15 11:48:09
banned?
jonnyawsom3
2024-09-15 12:04:44
Already done
CrushedAsian255
2024-09-15 12:05:01
👍
AccessViolation_
2024-09-15 02:17:13
So, I've been wondering... with floating point color precision and support for many layers as gain maps or whatever else you want, would it be possible to create an image that covers...basically the full true luminance of everything in the shot? One scenario I imagined was a hacked-together camera with a couple of solar filters that can be mechanically shifted before the lens. You could then take many exposures, one normal image that shows like a landscape, one with boosted EV to capture darker areas, and then a couple of shots that add the solar filters. Then you would use the knowledge of the solar filters and exposure settings and whatnot to create a JXL image that contains the sun in the shot at its *actual* luminance compared to all the other stuff. This could also potentially be cool for astrophotography. I recognize that this isn't compatible with HDR standards of any kind and you would probably need specialized software to view select luminance ranges unless you also want to create a display capable of giving you sunburn, but anyway. How could a file like this theoretically look in JPEG XL?
CrushedAsian255
2024-09-15 02:18:16
Technically you could, it supports arbitrary float data
yoochan
2024-09-15 02:18:39
There is a guy here who used jxl to store altitude data iirc
w
2024-09-15 02:18:56
that's what cameras already do
CrushedAsian255
2024-09-15 02:19:04
That’s what the extra channels are for..?
w
2024-09-15 02:19:15
capturing x stops of dynamic range and fitting into a range of values
AccessViolation_
2024-09-15 02:21:19
Do you think just a single 32-bit precision layer would suffice for this, or would you also need one (or more) gain maps to capture the real luminance of something like the sun in the same image as something dimly lit?
w
2024-09-15 02:21:34
existing hdr formats already do that
CrushedAsian255
2024-09-15 02:21:44
32 bit float goes up to like 10^308 or something
w
2024-09-15 02:22:32
even non "hdr" formats can do that
CrushedAsian255
CrushedAsian255 32 bit float goes up to like 10^308 or something
2024-09-15 02:23:54
Wait no, that’s for doubles I think
w
2024-09-15 02:24:32
oh i guess the issue with PQ is that it maxes out at 10000 nits
CrushedAsian255
2024-09-15 02:24:44
I guess you can use linear?
AccessViolation_
2024-09-15 02:25:23
I wasn't sure if that extra precision was all mapped into a more...realistic range of values. If we can theoretically properly expose the sun and a dark room in the same image while preserving the actual real-life luminance difference between them in a 32-bit float image that would suffice. I just didn't expect that to be the case, because for basically every photo ever taken that means you're wasting a lot of the 32-bit range on values you'll never get
w
2024-09-15 02:26:08
that's always arbitrary and subjective
CrushedAsian255
2024-09-15 02:26:16
Due to how floats work you’re wasting 3/4 of the space
2024-09-15 02:26:29
As 0..1 gets the same allocation as 1..inf
2024-09-15 02:26:42
And then negatives are the other half
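That allocation claim is easy to check by counting float32 bit patterns:

```python
import struct

def bits_u32(f):
    # reinterpret the bits of a float32 as an unsigned int32
    return struct.unpack('<I', struct.pack('<f', f))[0]

below_one = bits_u32(1.0) - bits_u32(0.0)            # patterns in [0, 1)
one_to_inf = bits_u32(float('inf')) - bits_u32(1.0)  # patterns in [1, inf)
print(below_one, one_to_inf)   # 1065353216 vs 1073741824 -- nearly equal
```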
AccessViolation_
2024-09-15 02:26:58
Hmm
w
2024-09-15 02:27:30
https://en.wikipedia.org/wiki/Transfer_functions_in_imaging#List_of_transfer_functions
AccessViolation_
2024-09-15 02:31:19
I suppose one thing you could do to make sure you can represent the landscape with a good range and the details of the surface of the sun in a good range, is to try to get the luminance of everything in the base image layer close to 0 so you can represent the tones properly, and then have different gain maps for different orders of magnitude of brightness. Otherwise you might run into issues where you can represent the landscape in nice detail but the surface of the sun would suffer from lack of precision because it's very far away from 0 in the float range
2024-09-15 02:33:02
This is very cursed and definitely something I will explore further for astrophotography
w
2024-09-15 02:33:09
it's something camera raw formats try to solve
AccessViolation_
2024-09-15 02:37:31
Surely a RAW image couldn't store this large a range if you had a sensor or wacky device (like mentioned above) capable of capturing this level of contrast?
_wb_
2024-09-15 02:37:42
PQ maxes at 10000 nits but you can in principle make a JXL file that just uses a linear transfer function and stores the values as lossless floats. Then you can set the intensity_target field to what you want the intensity of 1.0 to be, and just use values higher than 1.0 for anything brighter.
CrushedAsian255
2024-09-15 02:38:18
Is intensity target the nominal peak ?
AccessViolation_ Surely a RAW image couldn't store this large a range if you had a sensor or wacky device (like mentioned above) capable of capturing this level of contrast?
2024-09-15 02:38:48
RAW formats can store whatever the heck you want, that’s the point of RAW
w
2024-09-15 02:38:54
the range is arbitrary
_wb_
2024-09-15 02:39:37
intensity target is the brightness in nits of the color (1.0, 1.0, 1.0)
w
2024-09-15 02:39:55
whiter than white
_wb_
2024-09-15 02:39:59
it is 255 nits by default, but you can signal it to be anything up to 65k nits
2024-09-15 02:40:18
but then there is nothing stopping you from putting values above 1.0 in a jxl file
CrushedAsian255
2024-09-15 02:40:23
So if I set intensity target to 1, then the floats store raw nit values?
_wb_
2024-09-15 02:41:04
yep
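A trivial sketch of that mapping (linear transfer function assumed, per the discussion above):

```python
def to_nits(sample, intensity_target=255.0):
    # a stored linear value of 1.0 corresponds to intensity_target nits
    return sample * intensity_target

print(to_nits(1.0))           # 255.0 -- the default
print(to_nits(400.0, 1.0))    # with intensity_target=1, samples are raw nits
```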
CrushedAsian255
2024-09-15 02:41:07
RGB values of (12345, NaN, -Infinity)
2024-09-15 02:41:13
Why not lmao
_wb_
2024-09-15 02:41:24
largest float32 is 3.4028234664 × 10^38
CrushedAsian255
2024-09-15 02:41:56
Yea I was seeing double
Oleksii Matiash
CrushedAsian255 RAW formats can store whatever the heck you want, that’s the point of RAW
2024-09-15 02:42:02
Only if you create a raw format that uses 32 bpp; afaik no such format exists
CrushedAsian255
Oleksii Matiash Only if you create raw format that uses 32 bpp, afaik there is no such existing
2024-09-15 02:42:25
You can make your own raw format
2024-09-15 02:42:44
Nothing will support it, but you can
jonnyawsom3
2024-09-15 02:45:31
Just swap out the JXL in a DNG for one encoded at 32f, what could go wrong :P
_wb_
2024-09-15 02:47:03
jxl can represent basically anything if you want — float32 precision where the brightness of 1.0 is adjustable means there is effectively no limit to the precision if you use it directly as a raw format. Note that in DNG they limit the precision to 16-bit (either uint16 or float16) since that is plenty of precision with current camera technology.
AccessViolation_
2024-09-15 02:48:37
Woah, for astrophotography you could also store other frequencies beyond the visible spectrum as modular mode channels, right?
Quackdoc
2024-09-15 02:48:37
this is really common, maybe not to the extent you are talking about, but I actually talked about making a demo related to this in the website channel. But yeah, you would just use a linear transfer, and then in your application use exposure adjustment to change the stops you see. I'm using EXR in this example because olive builds against a version of ffmpeg that has float broken, but the same concept applies
AccessViolation_
2024-09-15 02:48:46
(sorry to derail)
Quackdoc
2024-09-15 02:49:03
this is a video comparing exr vs png, but s/exr/jxl in supported apps
jonnyawsom3
AccessViolation_ Woah for astrophotography you could also store other radio frequencies beyond the visible spectrum as modular mode channels, right?
2024-09-15 02:51:01
There's actually been a lot of discussion here before for satellite imagery, weather radar and other multispectral things that could be useful for this too. https://discord.com/channels/794206087879852103/794206087879852106/1244203123648626739 https://github.com/openclimatefix/Satip/issues/67 https://github.com/libjxl/libjxl/issues/1245 Although, if you're using float32 I feel obliged to bring this up for the 500th time (Even though I only encountered it once when trying to compress a .HDR file) https://github.com/libjxl/libjxl/issues/3511
2024-09-15 02:51:29
So filesizes might not be representative of what JXL can achieve
Quackdoc
2024-09-15 02:52:43
jxl needs fp64, imagine everything you could store in it [pepehands](https://cdn.discordapp.com/emojis/1075509930502664302.webp?size=48&quality=lossless&name=pepehands)
CrushedAsian255
2024-09-15 02:53:07
Just have 2 f32 channels
2024-09-15 02:53:19
And concatenate them
Quackdoc
2024-09-15 02:53:48
we aren't gpus here, we have standards
2024-09-15 02:53:49
[av1_dogelol](https://cdn.discordapp.com/emojis/867794291652558888.webp?size=48&quality=lossless&name=av1_dogelol)
CrushedAsian255
2024-09-15 02:54:17
They’re paywalled 😦
Quackdoc
2024-09-15 02:54:23
[av1_omegalul](https://cdn.discordapp.com/emojis/885026577618980904.webp?size=48&quality=lossless&name=av1_omegalul)
2024-09-15 02:55:27
but yeah, storing wide range data in jxl is quite nice since it's good for not only photography, but also renders and the like
lonjil
2024-09-15 02:56:01
As I recall (though I may be misremembering), there's nothing inherent that stops JXL from supporting things bigger than fp32; it's just the limit they decided to set. The actual encoding (again, as I recall; I may be wrong) can handle bigger values just fine.
Quackdoc
2024-09-15 02:56:28
we need a JXL+
CrushedAsian255
2024-09-15 02:56:35
Maybe eventually they could make a JXL64 extension?
2024-09-15 02:56:41
Like a new profile?
2024-09-15 02:57:00
The problem is that then it ruins backwards compatibility and it becomes another new format
2024-09-15 02:57:15
Although most places that would use JXL wouldn’t need them
Quackdoc
2024-09-15 03:00:05
i dunno what one would actually use 64bit depth for lol
2024-09-15 03:00:08
scientific im sure
jonnyawsom3
lonjil As I recall, though I may be misremembering, there's nothing inherently that stops JXL from supporting things bigger than fp32, it's just what they decided to set as the limit. The actual encoding, again as I recall I may be wrong, can handle bigger values just fine.
2024-09-15 03:00:51
https://discord.com/channels/794206087879852103/804324493420920833/1209240294747406386
spider-mario
CrushedAsian255 Is intensity target the nominal peak ?
2024-09-15 03:21:42
it’s supposed to be the peak luminance found in the image, but it’s not strictly enforced
_wb_
2024-09-15 03:42:00
It kind of serves different purposes at the same time, doesn't it? It's a scaling factor for converting XYB to linear RGB too, right?
spider-mario
2024-09-15 03:44:15
yes (except when the target is PQ)
2024-09-15 03:45:11
I guess it does raise a bit of a dilemma for SDR images that don’t go all the way to SDR white
2024-09-15 03:45:38
although the spec does say that it just has to be _an_ upper bound, not necessarily the smallest possible upper bound
2024-09-15 03:46:01
so that scanning the image for the actual maximum isn’t mandatory
AccessViolation_
2024-09-15 06:20:43
I don't know where I read this, but I remember something about a feature where when a very high resolution image is to be displayed on a webpage, but it's small enough in the viewport that you don't need that much detail, the browser can stop loading the image at some point where loading more of it wouldn't improve the visual quality. Did I somehow make this up or is this a real feature? I haven't heard about it since
2024-09-15 06:21:30
Or maybe it was something they *could* do utilizing progressive decoding, without it being a real feature?
jonnyawsom3
2024-09-15 06:24:51
You probably saw this https://discord.com/channels/794206087879852103/794206170445119489/1283547974936432760 Which points to this discussion https://discord.com/channels/794206087879852103/794206170445119489/1279372576946393180 Just an idea based on the progressive loading for now, although there should actually be a few ways to do it
BabylonAS
2024-09-15 06:26:48
I knew it had to be spoken somewhere on this very server
AccessViolation_
2024-09-15 06:27:37
I only just joined and I remember thinking about this like... a year ago or more, so I might've come up with it and misremembered it as a feature, or read it on some blog post then
2024-09-15 06:28:16
Cool to hear that it's possible tho!
jonnyawsom3
2024-09-15 06:29:24
I wouldn't be surprised if people come up with the idea on their own, makes sense once you've thought of it, but the issue is figuring out who's meant to decide. Does the server cut off the connection, the client, should libjxl have an option to stop at x progressive scan? Lots of possibilities
AccessViolation_
2024-09-15 06:33:03
For sure. The nice thing tho is that none of this needs to be standardized. I think all you'd need is a stream per image (which HTTP/3 and the older TCP hack, I think HTTP/2.1? should support) and progressive decode until you get the desired quality per display pixel and terminate the stream. The server should then stop sending data. I don't think this would break spec
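As a sketch of that client-side idea (hypothetical URL and byte budget; `requests` used purely for illustration):

```python
import requests

# Fetch only a prefix of a progressive JXL, then drop the stream.
resp = requests.get("https://example.com/big-image.jxl", stream=True)
data = bytearray()
for chunk in resp.iter_content(chunk_size=16384):
    data.extend(chunk)
    if len(data) >= 100_000:   # e.g. enough for a 16x-downsampled pass
        break
resp.close()   # stop the transfer; decode `data` progressively
```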
2024-09-15 08:40:44
I'm planning to send this in the Pop!_OS dev channel, thought I'd ask here if there is anything I should bring up/change/rephrase: > It may be a bit early to bring this up, but making a DE from scratch might make for a good opportunity to introduce the JPEG XL image format into the desktop. It's an image format which outperforms all other image formats in terms of image quality and compression ratio (in both the lossless and lossy modes), and many other factors. > > It's FOSS, patent free, and currently supported in popular creative software like Darktable, Adobe and Affinity's products, etc, but is seeing slower adoption on the web and in generic software like image viewers and other platforms that deal with images. macOS already has full support. I think it would be nice if COSMIC got support as well. > > Google Research is currently working on a decoder implementation in Rust which could be added to libcosmic/iced [COSMIC's GUI toolkit] in the future, and it's already possible to make desktop software relying on the C++ libjxl reference decoder. Personally I think a JPEG XL export option in the screenshot tool would be a great start, and it would be nice to see it in the future COSMIC image viewer as well. > > What do you think?
2024-09-15 08:55:46
I sent it and I'll see what happens. JXL support in COSMIC would be awesome
jonnyawsom3
2024-09-15 09:56:56
There is also <#1065165415598272582> already, if you don't mind a slight speed hit
monad
AccessViolation_ I don't know where I read this, but I remember something about a feature where when a very high resolution image is to be displayed on a webpage, but it's small enough in the viewport that you don't need that much detail, the browser can stop loading the image at some point where loading more of it wouldn't improve the visual quality. Did I somehow make this up or is this a real feature? I haven't heard about it since
2024-09-16 12:49:09
There's the classic FLIF demonstration. <https://flif.info/example.html>
Quackdoc
AccessViolation_ I'm planning to send this in the Pop!_OS dev channel, thought I'd ask here if there is anything I should bring up/change/rephrase: > It may be a bit early to bring this up, but making a DE from scratch might make for a good opportunity to introduce the JPEG XL image format into the desktop. It's an image format which outperforms all other image formats in terms of image quality and compression ratio (in both the lossless and lossy modes), and many other factors. > > It's FOSS, patent free, and currently supported in popular creative software like Darktable, Adobe and Affinity's products, etc, but is seeing slower adoption on the web and in generic software like image viewers and other platforms that deal with images. macOS already has full support. I think it would be nice if COSMIC got support as well. > > Google Research is currently working on a decoder implementation in Rust which could be added to libcosmic/iced [COSMIC's GUI toolkit] in the future, and it's already possible to make desktop software relying on the C++ libjxl reference decoder. Personally I think a JPEG XL export option in the screenshot tool would be a great start, and it would be nice to see it in the future COSMIC image viewer as well. > > What do you think?
2024-09-16 06:03:45
no need
2024-09-16 06:03:52
cosmic is hard limited by image-rs
2024-09-16 06:04:18
I plan on making a PR to either add zune, or jxl-oxide directly when I find the time
2024-09-16 06:08:54
unfortunately image-rs has made some... odd decisions which means it will be a while until it's supported. I might be making a new image abstraction crate in the future if my hands can hold up, but it's seeming unlikely
2024-09-16 06:10:15
as for libcosmic, it doesn't make sense to add it to that since it's rust and rust crates already exist
AccessViolation_
2024-09-16 07:49:49
Yeh I meant adding it to the relevant part, like image-rs, not necessarily directly adding it to libcosmic/iced itself
Quackdoc
2024-09-16 08:03:14
image-rs has declined it
CrushedAsian255
2024-09-16 08:03:29
Did they state a reason?
Quackdoc
2024-09-16 08:04:30
because the spec is paywalled; I don't understand why that would be a blocker myself, but it is what it is.
2024-09-16 08:05:52
in the end, image-rs has become non-viable for all sorts of programs, which is a shame and in general a massive setback to rust
CrushedAsian255
2024-09-16 08:06:14
Can’t you just link JXL oxide?
Quackdoc
2024-09-16 08:06:54
if you want to maintain your own fork
CrushedAsian255
2024-09-16 08:07:14
Fork that
Quackdoc
2024-09-16 08:07:49
in the end, unless you fork it, and convince people to use your fork, it won't actually change the ecosystem
2024-09-16 08:08:27
it would be nice if zune-image would support more formats, I did ask about it, but didn't get a hard answer
_wb_
2024-09-16 09:27:09
The AVIF spec references MIAF (ISO/IEC 23000-22) 14 times and HEIF (ISO/IEC 23008-12) 16 times, in normative ways, so you really need to get these other specs if you want to be able to implement AVIF from spec. It also references ISOBMFF (ISO/IEC 14496-12), directly and indirectly via MIAF and HEIF. MIAF costs CHF 151, HEIF is CHF 216, ISOBMFF is CHF 216. For JPEG XL you need ISO/IEC 18181-1 (CHF 216) and ISO/IEC 18181-2 (CHF 96).
Total cost for AVIF: CHF 583
Total cost for JPEG XL: CHF 312
BabylonAS
2024-09-16 09:31:38
CHF?
CrushedAsian255
2024-09-16 09:31:47
Why do you need -2 as well if you are decoding just the code stream
_wb_
2024-09-16 09:31:48
Swiss Francs
CrushedAsian255
BabylonAS CHF?
2024-09-16 09:31:55
Swiss money
BabylonAS
2024-09-16 09:32:03
huh
_wb_
2024-09-16 09:32:27
You don't really need -2 as long as you can figure out where the codestream is
CrushedAsian255
2024-09-16 09:32:33
-2 is just the container right?
_wb_
2024-09-16 09:33:06
Yep, and the jpeg bitstream reconstruction procedure
CrushedAsian255
2024-09-16 09:33:13
Also isn’t JPEG XL container ISOBMFF
2024-09-16 09:33:35
Or is all the required normative stuff redefined
_wb_
2024-09-16 09:33:43
It uses the same syntax but it is described in a self-contained way in -2
2024-09-16 09:34:03
We only have normative references to stuff that is not behind a paywall
2024-09-16 09:34:19
Well, except for JUMBF
CrushedAsian255
2024-09-16 09:34:19
Like the brotli spec?
2024-09-16 09:34:36
What’s JUMBF again?
_wb_
2024-09-16 09:34:49
Yes, brotli, ICC, and some ITU specs.
2024-09-16 09:35:19
The extensible metadata thing that you can use for 360 images and C2PA and stuff
CrushedAsian255
2024-09-16 09:35:37
So just metadata?
2024-09-16 09:35:46
Not required for image decoding?
_wb_
2024-09-16 09:36:04
But you can just treat it as a blob, it is not needed if you just want to decode an image
2024-09-16 09:38:44
Anyway, I think all paywalls are bad and I wish we could have the jxl spec without paywall, but it's out of our control, it is high level ISO policy.
2024-09-16 09:40:46
What I don't understand is why AVIF chose to base itself on a paywalled spec, they could have done what we did in 18181-2 and define things in a self-contained way in the spec they can make publicly available.
CrushedAsian255
2024-09-16 09:44:08
Or use their own spec
_wb_ Anyway, I think all paywalls are bad and I wish we could have the jxl spec without paywall, but it's out of our control, it is high level ISO policy.
2024-09-16 09:44:29
Once I fully understand the format I might write my own version of the spec
2024-09-16 09:44:53
Or we can all help contribute to <@206628065147748352>’s experimental spec
2024-09-16 09:45:30
Definitely think if there is an independent specification it will help ease people about the whole paywalled spec thing
HCrikki
2024-09-16 12:19:20
cosmic can patch image-rs in their repository/desktop. Reportedly it already works fine; it's just image-rs that declined to make the integration upstream. Upstreaming is only *ideal*, not required, and adding a single extra patch does not make your builds forks or derivatives
Quackdoc
2024-09-16 12:20:29
yeah, but that still requires you to maintain your own fork
2024-09-16 12:22:21
I was going to finish the work that <@736666062879195236> had done, but I'm not about to maintain a forked version of image-rs
2024-09-16 12:22:42
that being said, if anyone wanted to, rebasing the existing patches is easy
2024-09-16 12:24:49
realistically, we need a different abstraction crate anyway for any application that wants to take itself seriously, because the policy is going to break lots of workflows; but when you rely on a project whose maintainers create new rules and arbitrarily enforce them, it's risky at best
AccessViolation_
2024-09-16 01:21:08
The whole of libcosmic is basically a reimplementation of iced, with many non-cosmic-specific changes upstreamed. Based on that, I don't think they'd mind maintaining a fork or reimplementation of image-rs to add JXL support if they do want it, but that's just a guess
Tirr
2024-09-16 02:34:48
progressive rendering animation of https://discord.com/channels/794206087879852103/1065165415598272582/1099994961530851409 (d1 e6 progressive_dc=1). this should be HDR, but youtube seems to take long time to fully process HDR videos... https://www.youtube.com/watch?v=q0Pg2BvupbU
2024-09-16 02:45:10
total image size is ~3.1 MiB (3250045 bytes).
- at 0.15% it shows... something
- 16x downsampled image is shown at 3% (~100kB), followed by 8x downsampled image in groups of 2048x2048 until ~14%
- HF data of the sky part of the image loads very quickly, since it's mostly smooth gradients and there's not much data to decode
Quackdoc
AccessViolation_ The whole of libcosmic is basically a reimplementation of iced, with many non-cosmic-specific changes upstreamed. Based on that, I don't think they'd mind maintaining a fork or reimplementation of image-rs to add JXL support if they do want it, but that's just a guess
2024-09-16 02:45:31
possibly, but that would be on them
jadamas
2024-09-16 02:45:51
hello
2024-09-16 02:46:46
I need an image of Phineas and Ferb working CCTV security cameras please
_wb_
2024-09-16 02:47:52
?
afed
Tirr progressive rendering animation of https://discord.com/channels/794206087879852103/1065165415598272582/1099994961530851409 (d1 e6 progressive_dc=1). this should be HDR, but youtube seems to take long time to fully process HDR videos... https://www.youtube.com/watch?v=q0Pg2BvupbU
2024-09-16 02:49:10
what if `group_order=1`
Tirr
2024-09-16 02:51:40
started encoding video, wait a minute
afed what if `group_order=1`
2024-09-16 03:02:53
https://www.youtube.com/watch?v=mgt4C0NDja8
2024-09-16 03:04:16
filesize is 3250331 bytes, mostly the same except the image is loaded center-first
_wb_
2024-09-16 03:04:36
probably doesn't make a huge difference here since the sky is 'easy'
afed
2024-09-16 03:07:24
yeah, not that noticeable, also looks like hdr tonemapping is different
Tirr
2024-09-16 03:08:10
my code is moving quickly, maybe something has changed between the two encodes
AccessViolation_
2024-09-16 03:13:10
It says `group_order=1` is center first, is that the one where it spirals outward like the example from the whitepaper? It doesn't look like that's what it's doing, it looks more like a scanline pattern on three columns, doing the center one first, but it's hard to tell
jonnyawsom3
Tirr total image size is ~3.1 MiB (3250045 bytes). - at 0.15% it shows... something - 16x downsampled image is shown at 3% (~100kB), followed by 8x downsampled image in groups of 2048x2048 until ~14% - HF data of the sky part of the image loads very quickly, since it's mostly smooth gradients and there's not much data to decode
2024-09-16 03:14:06
Very much a nit-pick, but maybe you could put info like that in the descriptions in future, for easy reference while watching
AccessViolation_ It says `group_order=1` is center first, is that the one where it spirals outward like the example from the whitepaper? It doesn't look like that's what it's doing, it looks more like a scanline pattern on three columns, doing the center one first, but it's hard to tell
2024-09-16 03:22:58
That's because of the 'easy' sky, it spirals but the upper half is done almost instantly so you only notice it on the way down
AccessViolation_
2024-09-16 03:24:10
Ah, makes sense
jonnyawsom3
2024-09-16 03:31:24
Wonder if the bytes/frame could change when it hits decode thresholds instead of just based on time, then regardless of the file the timing is consistent for loading the LF and HF groups, etc (Probably still a power of 2)
Tirr progressive rendering animation of https://discord.com/channels/794206087879852103/1065165415598272582/1099994961530851409 (d1 e6 progressive_dc=1). this should be HDR, but youtube seems to take long time to fully process HDR videos... https://www.youtube.com/watch?v=q0Pg2BvupbU
2024-09-16 03:36:00
Oh, also any reason for using e6?
_wb_
2024-09-16 03:45:56
<@384009621519597581> (I know Monad beat me to it but I didn't scroll for nothing :P)
Tirr
Oh, also any reason for using e6?
2024-09-16 03:47:55
I wasn't sure about what features each effort level enables, and I just wanted to avoid patches
afed
2024-09-16 03:48:47
`patches=0`
AccessViolation_
2024-09-16 03:49:27
Lossy doesn't do patches yet does it?
monad
2024-09-16 03:49:40
patches are enabled at e5-e9 on small images, and at e10
2024-09-16 03:50:31
but surely there's no chance of catching anything here
afed
2024-09-16 03:51:46
for lossless, but not for lossy
monad
AccessViolation_ Lossy doesn't do patches yet does it?
2024-09-16 03:52:44
lossy does use patches
2024-09-16 03:53:08
but okay, my efforts may be off, I am always in lossless space ..
afed
2024-09-16 03:54:13
the docs also need some corrections for the new changed efforts https://github.com/libjxl/libjxl/blob/main/doc/encode_effort.md
jonnyawsom3
Tirr I wasn't sure about what features each effort level enables, and I just wanted to avoid patches
2024-09-16 03:58:25
Since 0.10, now only effort 10 uses patches
AccessViolation_
monad lossy does use patches
2024-09-16 04:00:57
Ah, I read on the subreddit that the encoder doesn't do lossy patches yet but that was probably outdated then
jonnyawsom3
afed also need some corrections for new changed efforts https://github.com/libjxl/libjxl/blob/main/doc/encode_effort.md
2024-09-16 04:02:35
Streamed encoding has to be disabled to use them, so they were moved to effort 10 for both lossy and lossless unless explicitly enabled
2024-09-16 04:03:03
Same reason progressive is currently broken without the DC command
monad
2024-09-16 04:06:56
as long as both x and y image dimensions are less than 2048px, it will search for patches
jonnyawsom3
2024-09-16 04:18:23
Huh, so you're right... Now I'm wondering why it never showed in my tests considering I'm on a 1080p monitor and 90% of my images are screenshots
2024-09-16 04:19:54
So the old effort chart, but above 4 MP it's effort 10 only
monad
2024-09-16 04:22:37
Not even 4 MP, just one image dimension needs to be >2048px.
jonnyawsom3
2024-09-16 04:23:49
Right
AccessViolation_
2024-09-16 04:44:12
Thinking about this: > As a rule, AVIF takes the lead in low-fidelity, high-appeal compression while JPEG XL excels in medium to high fidelity. It’s unclear to what extent that’s an inherent property of the two image formats, and to what extent it’s a matter of encoder engineering focus. In any case, they’re both miles ahead of the old JPEG. - <https://cloudinary.com/blog/time_for_next_gen_codecs_to_dethrone_jpeg#compression> I've also noticed AVIF beating JXL at quality per byte at very high compression levels, and I'm wondering if at this point we know more about it. Like, do we know whether AVIF having an edge is because of inherent properties of the image format or is this something JXL could technically address with changes to how the encoder behaves in these situations?
jonnyawsom3
2024-09-16 04:48:18
JXL could definitely be improved; right now the 256×256 VarDCT blocks aren't used, only going up to 32×32 or 64×64 if I recall. But in the end AVIF was made for video, running at extremely low quality with fast-paced content, so I think it will always have an edge there. The question is if you actually want that outside of... Well, a video
AccessViolation_
2024-09-16 04:52:23
It's definitely not something you'd want *that bad* as a feature since even if you'd be looking at a slightly less shitty image, it's still a shitty image, so fair point. Better is still better, though. I wonder if this could maybe improve the looks of the early results of progressive decoding? I don't know how it works exactly but progressive decoding to me looks more like it works in a downscaling kind of way rather than a quality setting kind of way, but idk
lonjil
2024-09-16 04:52:29
<@532010383041363969> has made the point that the entropy coding used in AV1 doesn't scale well to larger values. Higher quality images will inherently have larger numbers that need to be entropy coded, so AV1 has an inherent disadvantage.
HCrikki
2024-09-16 04:53:50
it's not some issue with the format, just how the reference library as released by the libjxl authors works: preserving more detail at all resolutions. one could do a rebuild that discards some detail for lower visual quality
lonjil
2024-09-16 04:54:20
Presumably, the converse ought to be true: if AV1 was well designed, then the sacrifice of having poor entropy coding for larger values would make the entropy coding of smaller values a bit more efficient. Though, my mathematical intuition tells me that it would only be a small advantage.
_wb_
2024-09-16 04:58:20
There are definitely some inherent aspects of av1 codec design vs jxl codec design that cause av1 to be more suitable for low quality and jxl more suitable for high quality. Entropy coding, granularity of filter strength signaling, aggressiveness of filters, expressivity of jxl patches vs avif palette blocks, and the hard limit of 12-bit precision in av1 are some of the things that point in that direction.
2024-09-16 04:59:55
But there's also a difference in encoder dev focus, where basically most jxl devs don't really care about d>4 while most av1 devs don't really care about d<4, so to speak 🙂
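To make the entropy-coding point concrete, here's a toy token-plus-raw-bits split, loosely in the spirit of JPEG XL's hybrid integers (the function and the split value are illustrative, not the actual spec):
```python
def hybrid_token(value: int, split_exponent: int = 4):
    """Split a non-negative integer into an entropy-coded token plus raw
    bits. Values below 2**split_exponent get their own token; larger
    values share one token per power-of-two magnitude class and pay the
    rest as cheap raw bits, so the symbol alphabet stays small no matter
    how large the coefficients get."""
    if value < (1 << split_exponent):
        return value, 0, 0                 # (token, n_raw_bits, raw_bits)
    n = value.bit_length() - 1             # magnitude class of the value
    token = (1 << split_exponent) + (n - split_exponent)
    return token, n, value - (1 << n)      # raw bits below the leading 1

assert hybrid_token(5) == (5, 0, 0)            # small value: one cheap token
assert hybrid_token(50000) == (27, 15, 17232)  # one token + 15 raw bits
```
A coder whose alphabet has to cover coefficient values directly pays more and more as magnitudes grow, which is the disadvantage described above; the converse cost is that reserving tokens for magnitude classes leaves slightly less modeling precision for the small values.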
A homosapien
2024-09-16 06:22:20
Not to mention there are improvements to be made for high-fidelity images as well. I remember there was a discussion where vibrant colors became a little desaturated and blue hues were shifted around. Idk if those issues were fixed in the new release. Regardless, there is *a lot* of untapped potential still left in jxl. That's for sure.
AccessViolation_
2024-09-16 06:28:45
Thanks for the clarifications :)
2024-09-16 06:30:31
> Regardless, there is a lot of untapped potential still left in jxl
Personally I'm really curious what benefits from splines we will see in the future
2024-09-16 06:35:54
When I see an image like this I can only think "how well could splines help compressing this fur"
A homosapien
2024-09-16 06:38:10
Same with manga/line art, I can only imagine the insane gains for such images.
jonnyawsom3
spider-mario it was indeed for features such as hair that are harder to encode with DCT; I’ll see if I can dig up the hand-crafted examples later but they seemed relatively promising
2024-09-16 06:44:39
https://discord.com/channels/794206087879852103/794206170445119489/1217510081155956758 (Link to image)
AccessViolation_ > Regardless, there is a lot of untapped potential still left in jxl Personally I'm really curious what benefits from splines we will see in the future
2024-09-16 06:44:45
Took a while, but I found both a discussion around it and an example
AccessViolation_
2024-09-16 06:55:21
Woah that's really good actually
2024-09-16 06:55:33
I know it's hand crafted, but still
Fox Wizard
2024-09-16 07:01:49
Wonder if spline will mess up the fur on that image
_wb_
2024-09-16 07:04:59
my gut feeling is that for the fur, splines are not that useful (regular DCT will be ok), and splines are more useful for those whiskers
spider-mario
2024-09-16 07:05:12
I was going to say pretty much the same thing
_wb_
2024-09-16 07:06:09
funny how we have gut feelings about the effectiveness of coding tools we don't even have an encoder for yet 🙂
spider-mario
2024-09-16 07:06:34
I did act as a human encoder for one of them, though 😁
_wb_
2024-09-16 07:07:32
I guess I kind of did too, in some jxl art 🙂
AccessViolation_
2024-09-16 07:09:00
From the discussion, I don't really understand what's actually supposed to happen to the pixel data under the spline. My initial idea was that the splines would be fitted first, then the pixels they're supposed to replace are removed or deemed unimportant to the lossy compressor thing somehow, so they're 'ignored' and I imagine would be replaced by the average color of nearby pixels/continuing patterns, but I'm seeing different ideas.
2024-09-16 07:09:59
Forgive me using phrases like "lossy compressor thing", I'm still learning :)
jonnyawsom3
2024-09-16 07:12:14
I trust the experts will pitch in soon, but if I recall it's done in a way that whatever gets predicted under the spline is left alone with no residuals/entropy, and then the spline is layered on top using the prediction as its base color, etc. (So the spline encodes the differences but with a perfect edge) As soon as I hit send ;P
spider-mario
2024-09-16 07:12:41
splines in jxl are additive by default (in XYB space), so the spline would be subtracted from the original pixels
2024-09-16 07:13:10
it’s also possible to make them occlusive but it gets potentially tricky when you consider the semi-transparent edges
afed
2024-09-16 07:13:16
i think splines are useful for any sharp edges on some flat area where artifacts are especially visible
jonnyawsom3
2024-09-16 07:14:50
Thin objects like hair, whiskers, wires, etc., but also possibly the edges of all objects, to hide the blur of VarDCT for example
AccessViolation_
2024-09-16 07:17:17
> if I recall it's done in a way that whatever gets predicted under the spline is left alone with no residuals/entropy
So if I understand, this would effectively be "predict what would go here, but unlike what you would normally do, don't store bytes telling it it's wrong if it is", which makes sense to me as a data saving mechanism. But I don't understand how this:
> splines in jxl are additive by default (in XYB space), so the spline would be subtracted from the original pixels
results in less data being used in the underlying pixel layer. Unless I'm misunderstanding what subtraction means in this context
2024-09-16 07:19:03
Since splines aren't occlusive in this case you would still need those pixels there, even if you subtract a different value from them, no?
spider-mario
2024-09-16 07:23:23
if the pixel would normally be (242, 128, 137), and the value added by the spline would be (125, 0, 0) (it’s a red line on a grey-ish background), you would encode (117, 128, 137) with VarDCT (which would encode well if the spline is accurate and the subtraction leaves behind pretty much just the background)
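A toy numeric version of that example (one RGB pixel with integer values for readability; real splines are rendered in XYB with soft edges):
```python
import numpy as np

original = np.array([242, 128, 137])  # red line over a grey-ish background
spline   = np.array([125,   0,   0])  # contribution rendered from the spline

residual = original - spline          # what VarDCT actually has to encode
assert (residual == [117, 128, 137]).all()  # close to the plain background

decoded = spline + residual           # decoder renders the spline, adds it back
assert (decoded == original).all()    # additive splines are exactly invertible
```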
afed
2024-09-16 07:24:15
for fur and hair, like for each individual hair, I think it would highly overcomplicate the decoding and detection because it would need a lot of splines, but for the most prominent or boundary ones it could make a significant improvement
2024-09-16 07:28:05
sort of like cel-shading methods, but on top, without texture flattening <https://en.wikipedia.org/wiki/Cel_shading>
_wb_
spider-mario it’s also possible to make them occlusive but it gets potentially tricky when you consider the semi-transparent edges
2024-09-16 07:30:54
I don't think we have a way to do "kReplace"-like splines, only "kAdd", right?
spider-mario
2024-09-16 07:32:35
from what I recall, we had an alpha blending option, or did we remove it?
2024-09-16 07:32:46
now that I think about it, we might have
_wb_
2024-09-16 07:33:27
splines only operate on XYB (or whatever the main color channels are)
spider-mario
2024-09-16 07:34:19
```
Thu May 16 16:53:59 2019 +0200
Remove alpha splines, leaving only additive splines
```
2024-09-16 07:34:23
RIP alpha splines
_wb_
2024-09-16 07:34:24
patches have different blend modes, and so do frames, so I guess you could put a spline on a patch frame and then put that on your frame with any blend mode you like
2024-09-16 07:34:54
though you can't make the alpha channel follow the shape of the spline
spider-mario
2024-09-16 07:35:49
```diff
- // Nothing is currently done to "subtract" alpha-blended splines from the
- // encoded image.
- if (!add && spline.mode == Spline::RenderingMode::kAlphaBlending) continue;
```
2024-09-16 07:37:01
I don’t quite remember all the details of how they worked but I do remember that they were much slower to render
_wb_
2024-09-16 07:37:31
yeah I can imagine that
2024-09-16 07:38:18
also seems even harder to make an encoder for that
spider-mario
2024-09-16 07:38:51
I see a commit message with the following numbers:
```
- decompression speed without any spline: ~180MP/s (unchanged)
- decompression speed with three kAdditive splines: ~160MP/s
- decompression speed with three kAlphaBlending splines: ~60MP/s
```
_wb_
2024-09-16 07:39:03
(just like we only have an encoder for kAdd patches at the moment, we're not using any of the other blend modes)
AccessViolation_
spider-mario if the pixel would normally be (242, 128, 137), and the value added by the spline would be (125, 0, 0) (it’s a red line on a grey-ish background), you would encode (117, 128, 137) with VarDCT (which would encode well if the spline is accurate and the subtraction leaves behind pretty much just the background)
2024-09-16 07:51:01
is it like you're basically trying to remove the element that's going to become the spline by continuing the surrounding background color, and then making the spline the color you would need to recreate it when you apply the operation? Effectively removing the object, giving VarDCT a more homogeneous area, leading to better data savings? I thought about your message but I don't get how it's able to achieve anything without also looking at the colors surrounding the spline rather than just the spline candidate pixels themselves
spider-mario
2024-09-16 07:51:50
yes, ideally, the spline should be such that what remains after subtraction is smooth and easy to encode
2024-09-16 07:51:59
but we currently don’t have an algorithm that actually achieves that
AccessViolation_
2024-09-16 07:52:15
Ahh, okay, then I understand. Thanks!
2024-09-16 07:52:22
Interesting stuff
spider-mario
2024-09-16 07:52:29
so I basically constructed the spline data by hand for the example above
2024-09-16 07:52:46
“let’s see… if I make the spline go through these points, it’s almost there, but it needs to be slightly thicker here”
AccessViolation_
2024-09-16 07:53:35
I saw, it looked pretty impressive. How long did it take you?
spider-mario
2024-09-16 07:53:49
I don’t remember exactly, it was in 2019
2024-09-16 07:54:19
to think that 2019 was 5 years ago :S
_wb_
2024-09-16 07:54:20
we should look at some of the old vectorization algorithms or something. it's a tricky problem but shouldn't be unsolvable
AccessViolation_
2024-09-16 08:00:55
just some edge detection, and maybe adobe will let you borrow their content aware fill for the pixels under the splines <:PepeOK:805388754545934396>
2024-09-16 08:01:17
ez
jonnyawsom3
2024-09-16 08:03:16
Considering what Adobe did with the JXL DNGs, I'm not sure I'd trust them with it Dx
lonjil
spider-mario ``` Thu May 16 16:53:59 2019 +0200 Remove alpha splines, leaving only additive splines ```
2024-09-16 08:03:26
dang that's crazy, I've been saying on this server for ages that alpha blended splines would've been great
2024-09-16 08:04:54
the solid part of the spline (if it exists) would've had an alpha of 1.0, fully occluding the underlying pixels so that they could be unconstrained, and the semi-transparent edges could've done a nice blending with the surrounding pixels
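The blending being described is ordinary alpha compositing; a minimal sketch of the removed mode (names and values are illustrative, this is not the libjxl API):
```python
import numpy as np

def alpha_spline_blend(under: np.ndarray, stroke: np.ndarray, alpha: float) -> np.ndarray:
    # Standard "over" compositing: at alpha = 1.0 the stroke fully occludes
    # the pixels underneath (so the encoder could leave them unconstrained),
    # while fractional alpha at the soft edge blends the stroke into the
    # surrounding pixels.
    return alpha * stroke + (1.0 - alpha) * under

background = np.array([0.40, 0.40, 0.40])  # cheap arbitrary fill under the core
stroke     = np.array([0.90, 0.10, 0.10])

core = alpha_spline_blend(background, stroke, 1.0)  # == stroke exactly
edge = alpha_spline_blend(background, stroke, 0.5)  # halfway blend at the edge
```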
AccessViolation_
Considering what Adobe did with the JXL DNGs, I'm not sure I'd trust them with it Dx
2024-09-16 08:05:45
I was very much joking, but what did they do? 👀
jonnyawsom3
2024-09-16 08:08:13
Instead of using any of the features JXL has to offer, they just swapped out the old jpeg for it and that was it. So it still uses TIFF tiles, at tile sizes that don't match group sizes, without using the ColorFilterArray extra channel specifically meant for RAW files, and don't get me started on the DNGConverter command line tool...
2024-09-16 08:09:42
They set the lossy quality so low that the lossless files are actually smaller, and set `faster_decoding=4` even though it has some *issues* with lossless that bloat filesize massively, although I think they dodged that bullet by not letting it compress properly in the first place thanks to the tiles
AccessViolation_
2024-09-16 08:10:31
<:galaxybrain:821831336372338729>
jonnyawsom3
2024-09-16 08:10:34
Jyrki says they know some people at Adobe and that they must have a method to the madness, but it just seems insane to me
AccessViolation_
2024-09-16 08:13:12
And presumably this is all set in stone for 1.7 already
lonjil
2024-09-16 08:13:42
I think DNG is pretty flexible in how stuff is stored
_wb_
2024-09-16 08:14:15
yes, I think you could use it in a way that is basically just a wrapper around a jxl codestream plus some obligatory DNG metadata
lonjil
2024-09-16 08:14:19
Certainly the libjxl encoding setting choices are not in the spec at all, and can be chosen arbitrarily by the encoder.
AccessViolation_
2024-09-16 08:16:52
If those two digital camera models that shoot DNG 1.7 found out about this they would be very upset
AccessViolation_
2024-09-16 08:17:11
Oh, and iPhone 16, which is pretty huge
_wb_
2024-09-16 08:17:43
they did put some fields in DNG 1.7 which are pretty specific, kind of assuming that libjxl is the encoder 🙂
lonjil
2024-09-16 08:17:56
that's funny
_wb_
2024-09-16 08:19:16
I mean, I guess another jxl encoder could also use these fields to give some indication of what it is doing
lonjil
AccessViolation_ Oh, and iPhone 16, which is pretty huge
2024-09-16 08:19:38
There was some code in iOS 18 relating to non-debayered and tiled jxl in dng, but I think that for DNGs produced by Apple's own camera app, they will store single 3-channel images of debayered data. And hopefully not with weird tiling. (does anyone know what kind of tiling ProRaw files use today?)
_wb_
2024-09-16 08:20:14
I bet they're going to do tiling, for HEIC they use 512x512 tiles
2024-09-16 08:20:56
for current ProRaw it doesn't really matter since they use lossless, but for lossy it's kind of a bad idea to use DNG-level tiles
2024-09-16 08:22:02
not that it will matter much if they're going to use very high quality settings, which seems to be the case if those estimated file sizes are correct
lonjil
2024-09-16 08:22:15
But HEIC tiling must be to make more efficient use of the hardware encoder and decoder. I mean I wouldn't be surprised if they did tiling for JXL ProRaw, but I don't think it is influenced by the same factors as HEIC. Unless they're planning to HW accelerate it in the future?
_wb_
2024-09-16 08:23:01
I'll be curious to see the first iPhone 16 Pro bitstream samples
lonjil
_wb_ not that it will matter much if they're going to use very high quality settings, which seems to be the case if those estimated file sizes are correct
2024-09-16 08:23:03
yeah, I mean that looks like d0.1 or something. I couldn't even tell the difference between lossless and d0.1 when zoomed in when I tried it.
_wb_
2024-09-16 08:23:54
I think it's probably a good idea, if you want to make the concept of "lossy raw" digestible for "pro" photographers, to use an insanely high quality like that
AccessViolation_
2024-09-16 08:24:56
Are there even reasonable savings with such a low distance?
_wb_
2024-09-16 08:25:29
well it should still be smaller than lossless
2024-09-16 08:25:46
but not by a huge factor
2024-09-16 08:26:41
it would have been nice if they would expose some control over it — so you can do d0.1 - d1 or whatever
lonjil
AccessViolation_ Are there even reasonable savings with such a low distance?
2024-09-16 08:27:47
For my photographs (not necessarily accurate for Apple's 12-bit debayered images)
```
1.1G uncompressed
672M jpegll
385M jxld0e1
186M jxld0.1e1
170M jxl-tiny-d0.1
```
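Reading those numbers against the earlier question about savings: at d0.1 the lossy file lands at roughly half the size of lossless JXL here.
```python
print(f"{186 / 385:.0%} of the lossless jxl size")  # ~48%: far from a 1% shave
```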
AccessViolation_
2024-09-16 08:28:03
If it shaves off 1% I don't think it'd be worth thinking about; if it shaves off 10% (like maybe being able to sacrifice the tiniest amount of data *at all* suddenly opens the door to way better compression while still being effectively-but-not-really lossless) I could see it being worthwhile
lonjil For my photographs (not necessarily accurate for Apple's 12-bit debayered images) ``` 1.1G uncompressed 672M jpegll 385M jxld0e1 186M jxld0.1e1 170M jxl-tiny-d0.1 ```
2024-09-16 08:28:38
damn, okay
2024-09-16 08:28:41
nvm then XD
lonjil
lonjil yeah, I mean that looks like d0.1 or something. I couldn't even tell the difference between lossless and d0.1 when zoomed in when I tried it.
2024-09-16 08:28:43
ok, at 8x zoom looking around for a while, I found a single pixel that changed enough for me to notice
AccessViolation_
2024-09-16 08:32:24
I've only recently been dealing with raw files. I heard somewhere that even 20 year old raw files can be developed into really nice wide gamut HDR images today, which was motivation to sacrifice a bit of storage and make my phone capture RAW as well for particularly nice photos I'm taking
lonjil
2024-09-16 08:33:09
My DSLR is I think more than 15 years old at this point and it blows any phone camera out of the water
2024-09-16 08:33:19
though I'm not very good at the mastering stuff
AccessViolation_
2024-09-16 08:34:48
I've been toying with Darktable but I haven't been able to really match my phone's default JPEG output in appeal. Phones do a crazy lot of computational photography especially in dark scenes, so that's sort of expected. At least until I get better at doing this
spider-mario
2024-09-16 08:35:18
the “Local contrast” module can probably help
2024-09-16 08:36:12
it’s been a while since I’ve used darktable, but from what I remember, what I liked to do was temporarily set it to an absurdly large intensity, just to help me find the radius that works best for the image
2024-09-16 08:36:17
then dial it back
2024-09-16 08:36:43
(it’s still the strategy I use in DaVinci Resolve)
AccessViolation_
2024-09-16 08:37:19
Thanks, I'll try that. I've mostly been using saturation, exposure and denoise to make things look "okay"
2024-09-16 08:38:39
I'm probably doing something wrong in the saturation department because RAW images come out of my phone almost looking sepia and I need to get that slider pretty close to max for it to look alright
spider-mario
2024-09-16 08:38:41
ah, it seems the “local laplacian filter” mode doesn’t _have_ a radius setting
2024-09-16 08:38:48
I guess I used the “bilateral grid” mode then
AccessViolation_
2024-09-16 08:48:37
I'll probably try following a starter guide tomorrow
spider-mario
2024-09-16 08:50:27
at the time, it was also quite a good thing to understand the filmic module, but not sure how it is now
lonjil
2024-09-16 08:51:06
someone sent me an iOS 15 12MP ProRaw file, this is what exiftool reports:
```
322 TileWidth : 504
323 TileLength : 504
```
2024-09-16 08:51:52
So if they just use JXL as a drop in replacement, I guess those would be the tile dimensions they'd use.
CrushedAsian255
2024-09-16 08:58:50
it's better than Adobe
2024-09-16 08:59:16
a 256x256, 256x248, 248x256, 248x248
2024-09-16 08:59:29
their HEVC decoder probably works in 512x512 blocks already
_wb_
2024-09-16 09:01:25
why 504 and not 512
CrushedAsian255
2024-09-16 09:03:03
probably to prevent edge artifacting
2024-09-16 09:03:20
it might give the encoder an 8x8 strip of the previous tile so it can use it to predict off of
2024-09-16 09:03:31
wait no, that doesn't work
2024-09-16 09:03:44
someone should check the block sizes of a heic image
AccessViolation_
2024-09-16 09:07:49
Maybe the full height and width are multiples of 504, making it so you only have full tiles and not small slices of tiles on the bottom and right side of the image
CrushedAsian255
2024-09-16 09:08:18
Probably
2024-09-16 09:08:30
4032/504 = 8.0
2024-09-16 09:08:51
3024/504 = 6.0
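A quick check of that guess, using the sensor dimensions from the messages above:
```python
width, height = 4032, 3024  # 12 MP iPhone output
print(width % 504, height % 504)  # 0 0 -> only full 504x504 tiles
print(divmod(width, 512), divmod(height, 512))  # (7, 448) (5, 464): ragged edge tiles with 512
```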
jonnyawsom3
CrushedAsian255 it's better than Adobe
2024-09-16 11:51:51
`672x752` https://discord.com/channels/794206087879852103/822105409312653333/1252968284702380083
Tirr
Tirr https://www.youtube.com/watch?v=mgt4C0NDja8
2024-09-17 06:42:36
this is now available as hdr
2024-09-17 06:44:56
seems to work well, the sun is brighter than the light-mode youtube background
CrushedAsian255
2024-09-17 07:00:49
Confirmed; hdr works on iPhone
2024-09-17 07:01:40
Can’t comment on the progressive loading though; my vision makes it so 10% looks fully loaded
2024-09-17 07:01:47
Also could be YT compression
Tirr
2024-09-17 07:02:44
I can see HF progressive with my macbook air so it's ok
Arcane
2024-09-17 10:27:19
CrushedAsian255
2024-09-17 10:27:27
Oops sorry misclick
jonnyawsom3
AccessViolation_ It says `group_order=1` is center first, is that the one where it spirals outward like the example from the whitepaper? It doesn't look like that's what it's doing, it looks more like a scanline pattern on three columns, doing the center one first, but it's hard to tell
2024-09-17 10:42:15
I see what you mean now, and I think I know why. That happens during the DC/LF part of the image, which uses 2048px groups instead of 256px. So it probably is a spiral, but there are only 3 blocks, so it just goes middle, left, right (probably)
2024-09-17 11:59:26
I do wonder if centre-first should be the default group order. It would set it apart from the other formats, and usually the focus is in the centre anyway
Traneptora
2024-09-17 11:59:45
wouldn't be too hard to do with permuted TOC
jonnyawsom3
2024-09-17 12:10:09
It's already an option in cjxl, but scanline is default currently
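For anyone who wants to experiment, a minimal sketch of what a centre-first permutation could look like (just the obvious distance-from-centre ordering over the 256 px group grid; `center_first_order` is illustrative, not the cjxl implementation):
```python
import math

def center_first_order(width: int, height: int, group: int = 256) -> list[int]:
    # Sort group indices by each group's distance from the image centre;
    # the resulting permutation is what a permuted TOC would carry so the
    # decoder receives central groups first.
    gx = (width + group - 1) // group   # groups per row
    gy = (height + group - 1) // group  # groups per column
    cx, cy = (gx - 1) / 2, (gy - 1) / 2
    order = list(range(gx * gy))
    order.sort(key=lambda i: math.hypot(i % gx - cx, i // gx - cy))
    return order

print(center_first_order(1024, 768))  # 4x3 grid: the two middle groups come first
```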
AccessViolation_
2024-09-17 12:33:42
Hmm, I wonder if there's some simple heuristic to determine a better custom group order without using the fancy machine learning model Google proposed. For example, if you have a picture of an ocean underneath a clear sky, people will probably look at the horizon first. So in that sense you want high coarse-contrast areas decoded first. You'd also want to avoid being tricked into decoding larger noisy areas first: if there's a rough concrete wall, it would go "oh, contrasty features, probably important", which it shouldn't do. It should however care about the section where the wall ends or changes color, as that's an eye-catching feature. So to summarize: large, coarse color/luminance differences? Just a theory
CrushedAsian255
2024-09-17 12:34:25
Maybe areas with lots of defined edges?
AccessViolation_
2024-09-17 12:36:27
Simple repeating patterns of stripes would throw that method off, similarly to how an otherwise boring feature that happens to be pretty noisy, like concrete, would throw off a simple contrast approach. Maybe you really just want to look at the very low frequency data for this. You can then maybe do some edge detection on *that*
CrushedAsian255
2024-09-17 12:37:28
What about faces?
2024-09-17 12:37:34
That’s usually where people look first
AccessViolation_
2024-09-17 12:40:19
For faces, Google's approach already knows to care about the eyes and mouth first. You can probs also do simple face detection (which would probably also count as ML) as a special pass on top of our edge detection / coarse contrast detection
jonnyawsom3
2024-09-17 12:40:37
Feels like something to do along with spline encoding, since that would be edge detection and borders of objects too
CrushedAsian255
AccessViolation_ For faces Google's approach already knows to care about the eyes and mouth first. You can probs also do simple face detection (which would probably also count as ML) as a special pass on top of our edge detection / coarse contrasts detection
2024-09-17 12:41:24
A really simple ml model that only activates at higher efforts shouldn’t be a problem
AccessViolation_
2024-09-17 12:42:01
Without a special module for face detection, if the face is large, like with a portrait, it would probably end up caring about the edges, where the neck meets the shirt, and the eyes/mouth. That's probably what would happen
CrushedAsian255
2024-09-17 12:42:32
Eye/mouth would probably be fine tbh
AccessViolation_
2024-09-17 12:52:42
Actual image, image prepared for feature detection, detected features
2024-09-17 12:53:18
Something like this maybe?
2024-09-17 12:53:41
The idea is that tiles containing the red splines would get decoding priority
2024-09-17 12:56:36
By operating on a blurred/averaged version you eliminate the problem of someone with a striped shirt being prioritized because there's a lot of contrast there, for example
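Sketching that idea (purely illustrative, nothing cjxl actually does; `group_priority` and the block-average stand-in for "operate on a blurred version" are my own assumptions):
```python
import numpy as np

def group_priority(gray: np.ndarray, group: int = 256, blur: int = 8) -> np.ndarray:
    # Crude stand-in for "operate on a blurred/averaged version": downsample
    # by block-averaging so fine texture and noise (striped shirts, concrete
    # grain) stop registering as contrast.
    h, w = gray.shape
    low = gray[: h - h % blur, : w - w % blur]
    low = low.reshape(h // blur, blur, w // blur, blur).mean(axis=(1, 3))

    # Coarse contrast: gradient magnitude of the low-frequency image.
    gy, gx = np.gradient(low)
    edges = np.hypot(gx, gy)

    # Sum edge energy per group (group size in low-res units); sorting the
    # flattened scores in descending order would give the decode order.
    g = group // blur
    gh, gw = edges.shape[0] // g, edges.shape[1] // g
    return edges[: gh * g, : gw * g].reshape(gh, g, gw, g).sum(axis=(1, 3))
```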
CrushedAsian255
2024-09-17 12:57:31
Does this still detect faces?
AccessViolation_
2024-09-17 12:58:45
No special logic for faces, no. Face detection could be a pass that happens on the unblurred image, and its results could also be added to the list of features
2024-09-17 01:00:36
I'm not sure if splines like these are the best method for representing features; if you used a low-resolution matrix containing the priority for every location in the image, then you wouldn't have to use splines for everything, since splines are really only good if you care about rough edges, not faces or other features
2024-09-17 01:02:04
And with that, specialized encoders could decide themselves what they care about. E.g. one for encoding comics could detect and prioritize text