|
yoochan
|
2023-04-19 01:53:08
|
yes it can replace them all ! and better, but mandatory xkcd nevertheless : https://xkcd.com/927/
|
|
|
skalt711#4276
|
2023-04-19 01:56:24
|
Currently I don't have much use for JPEG XL, but I hope that it'll become a big thing nevertheless
|
|
|
_wb_
|
|
skalt711#4276
Can JPEG XL work as an all-around replacement for PNG, JPEG, TIFF and possibly other formats I haven't heard about for high quality pictures?
|
|
2023-04-19 02:40:54
|
PNG, JPEG, TIFF, GIF, BMP, PNM, PFM, WebP, EXR: yes, JXL can work as a replacement for all of those. In fact it was part of the design rationale that JXL should feature-wise be a superset of these formats. It can also cover a nice chunk of the common denominator between image authoring formats (PSD, XCF, KRA etc.): that is, it can do named layers with alpha blending and selection masks. Of course it's not as expressive and open-ended as actual authoring formats (which have tons of features that change with every new version of the software), but for interchange of layered images I think it has a lot of potential; currently only TIFF can do that, and TIFF is basically uncompressed (or at least only weakly compressed).
|
|
|
novomesk
|
2023-04-19 03:04:38
|
I started to implement JXL write support in the gdk-pixbuf plugin.
I'll submit a PR when it's ready and finished with tests.
|
|
|
Traneptora
|
2023-04-19 03:05:18
|
what typical predictors does libjxl use for the LF coefficient modular stream?
|
|
2023-04-19 03:05:37
|
atm I'm just using gradient, and it's not particularly effective
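For context, the "gradient" predictor being discussed predicts each sample from its west, north and north-west neighbours. A minimal sketch, assuming the clamped form used in JXL's modular mode (this is illustrative, not libjxl code):

```python
def gradient_predict(n: int, w: int, nw: int) -> int:
    """Predict a sample from its north (n), west (w) and north-west (nw)
    neighbours: n + w - nw, clamped to the range spanned by n and w."""
    lo, hi = min(n, w), max(n, w)
    return max(lo, min(hi, n + w - nw))

# On a smooth ramp the prediction lands between the neighbours:
assert gradient_predict(12, 10, 11) == 11
# On an edge, clamping keeps the prediction inside the neighbour range:
assert gradient_predict(100, 10, 100) == 10
```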
|
|
|
_wb_
|
2023-04-19 03:19:11
|
<@179701849576833024> do you remember? iirc you did something like near-lossless gradient with a fixed tree, right?
|
|
|
|
veluca
|
2023-04-19 03:21:01
|
that's about what I remember too 🙂
|
|
|
Traneptora
atm I'm just using gradient, and it's not particularly effective
|
|
2023-04-19 03:21:14
|
what does "effective" mean?
|
|
|
_wb_
<@179701849576833024> do you remember? iirc you did something like near-lossless gradient with a fixed tree, right?
|
|
2023-04-19 03:21:40
|
for most qs, it's actually "lossless" except for quantization
|
|
|
_wb_
|
2023-04-19 03:41:59
|
right but you do the quantization after prediction, not before, right? that makes a small but significant difference iirc
|
|
|
Traneptora
|
|
veluca
what does "effective" mean?
|
|
2023-04-19 03:42:20
|
I mean I use gradient as the only predictor here, and I get large residuals
|
|
|
|
veluca
|
|
_wb_
right but you do the quantization after prediction, not before, right? that makes a small but significant difference iirc
|
|
2023-04-19 03:42:51
|
it does not with gradient and no mult 🙂
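This point can be checked on a toy 1-D signal: with uniform quantization (no per-context multiplier) and a predictor driven by previously *reconstructed* samples (a stand-in for gradient here), quantizing residuals after prediction reconstructs exactly the same values as quantizing each sample before prediction, because quantize(x - k*q) == quantize(x) - k*q. A hedged sketch, not libjxl's actual code:

```python
def quantize(x: int, q: int) -> int:
    # uniform round-to-nearest quantization to a multiple of q
    return q * ((x + q // 2) // q)

samples = [10, 23, 37, 49]   # a made-up 1-D signal
q = 4

# (a) quantize the residual *after* predicting from reconstructed samples
recon_after, prev = [], 0
for s in samples:
    prev += quantize(s - prev, q)
    recon_after.append(prev)

# (b) quantize each sample *before* (i.e. independently of) prediction
recon_before = [quantize(s, q) for s in samples]

# Identical reconstructions: since prev is always a multiple of q,
# the order doesn't matter when there is no multiplier.
assert recon_after == recon_before == [12, 24, 36, 48]
```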
|
|
|
_wb_
|
2023-04-19 03:43:37
|
large residuals are inevitable for LF; unless you have large regions with slow gradients, an LF image does tend to have high entropy per pixel
|
|
|
Traneptora
|
2023-04-19 06:58:22
|
is it generally worth it to squeeze LF?
|
|
2023-04-19 06:58:37
|
or does that not realistically save you much
|
|
|
_wb_
|
2023-04-19 07:41:06
|
I don't think it helps much; the main advantage of that is being able to do 1:16 or 1:32 previews
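For readers following along: the Squeeze transform repeatedly halves resolution into an average band plus a residual band, so four applications of it give a 1:16 preview. A plain Haar-like sketch of one step on a row (JXL's real Squeeze adds a smoothing "tendency" term, so this is only an approximation):

```python
def squeeze_step(row):
    """Split a 1-D row into half-resolution averages plus residuals."""
    avgs = [(a + b) // 2 for a, b in zip(row[0::2], row[1::2])]
    residuals = [a - b for a, b in zip(row[0::2], row[1::2])]
    return avgs, residuals

def unsqueeze_step(avgs, residuals):
    """Exact inverse of squeeze_step (lossless round trip)."""
    out = []
    for m, r in zip(avgs, residuals):
        a = m + ((r + 1) >> 1)   # recover the first sample of the pair
        out += [a, a - r]
    return out

row = [10, 12, 20, 22, 30, 28, 40, 44]
avgs, res = squeeze_step(row)
assert avgs == [11, 21, 29, 42]          # a 1:2 preview of the row
assert unsqueeze_step(avgs, res) == row  # and nothing is lost
```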
|
|
|
Traneptora
|
2023-04-19 08:07:45
|
wouldn't LF-frames with a higher LF level be better for that?
|
|
|
_wb_
|
2023-04-19 08:09:56
|
If the LF frame doesn't use Squeeze, it doesn't give you a 1:16 preview
|
|
|
Traneptora
|
2023-04-19 08:47:08
|
ah right
|
|
2023-04-19 08:47:26
|
That's only 1:64, which is what I was thinking of
|
|
|
Jyrki Alakuijala
|
|
_wb_
Avif was not really rejected. Actually the conclusion of the 81st JPEG meeting was that AVIF could be a good starting point for JXL. Pik, fuif and avif were considered the 3 best proposals. The idea was to then start a collaborative phase where the best elements of the proposals would be combined. However, when it became clear that JPEG was not just going to take AVIF and rubberstamp it into JPEG XL without changes, the AVIF proposal was retracted, since they weren't prepared to make any changes to the actual bitstream (which makes sense, av1 hardware was already designed at this point...)
|
|
2023-04-21 09:25:00
|
After AVIF was withdrawn, the JPEG committee came to us (Pik) to ask if we wanted to take over being the platform. I proposed a collaboration of two platforms (Pik+Fuif) for JPEG XL, and while it is an unusual approach, the proposal was accepted by the committee. I consider it my best contribution to the whole process; it has been fantastic to work with Jon all these years.
|
|
2023-04-21 09:28:12
|
Jon and I were competitors from the WebP lossless/FLIF era, and I had followed him closely enough to know already back then that he can bring a lot to the table.
|
|
2023-04-21 09:28:37
|
better to have the high-caliber people on your team instead of competing 🙂
|
|
|
yoochan
|
2023-04-21 09:30:04
|
nice mindset !
|
|
|
Jyrki Alakuijala
|
2023-04-21 09:30:18
|
The JPEG committee had a requirement to compress arbitrary photographs to 0.06 bpp -- the AVIF guys from Netflix/Mozilla were driving the original call for proposals and had written the requirements to match what they already had
|
|
2023-04-21 09:32:34
|
Also, pik was an absolute-colorspace system with XYB, and the requirements were written for a relative-colorspace system -- that created quite some difficulties for us -- but I didn't want to go to a relative colorspace nonetheless, since I consider that future improvements in images/video require an absolute color space internally
|
|
2023-04-21 09:33:06
|
those two created quite a bit of difficulty for us, the pik team, in our submission -- and we didn't show our best in the initial submission
|
|
2023-04-21 09:34:16
|
at 1.6 bpp and above pik was quite nice, possibly still better than libjxl delivers today
|
|
2023-04-21 09:36:49
|
Jan, Jon, Luca, Lode, Moritz and I managed to lead the standardization process to a rather nice solution despite these tensions
|
|
|
_wb_
|
2023-04-21 09:50:10
|
At the 81st JPEG meeting 7 proposals were investigated, and it was quite clear that 3 were considered the most promising: avif, pik and fuif. The original decision was to take AVIF as the starting point for JPEG XL (bring the best elements of pik and fuif into it, and make whatever other changes would make it better as a new still image format), but then AVIF withdrew; I assume mostly because they didn't want to make any changes to the bitstream, which was already finalized at that point. That led to the decision to take pik+fuif as the starting point for JPEG XL instead. I am really happy with how we managed to merge those two codecs into a single codec that is really better than the sum of its parts; this was very nontrivial and required significant effort from everyone involved, but it led to something really nice.
|
|
|
yoochan
|
|
Jyrki Alakuijala
at 1.6 bpp and above pik was quite nice, possibly still better than libjxl delivers today
|
|
2023-04-21 09:50:44
|
what was lost from pik to jpegxl which made it slightly worse ?
|
|
|
username
|
|
yoochan
what was lost from pik to jpegxl which made it slightly worse ?
|
|
2023-04-21 09:52:03
|
I assume it's just current encoders, as "better than libjxl delivers today" implies.
|
|
|
_wb_
|
2023-04-21 09:52:34
|
I don't think anything was lost, but the attention focus for libjxl went to a broader range of qualities than what pik was designed for, so it could be the case that libjxl is not as optimized for d0.5 anymore as pik was
|
|
|
Jyrki Alakuijala
|
2023-04-21 09:52:43
|
pik was 100% focused on 8x8 DCT, and on the worst quality occurring anywhere in the image at distance 1.0
|
|
2023-04-21 09:53:11
|
every distance 1.0 compression looked the same as lossless, no surprises
|
|
|
_wb_
|
2023-04-21 09:53:14
|
(but at d2-d3 libjxl is much, much better than pik was)
|
|
|
Jyrki Alakuijala
|
2023-04-21 09:53:35
|
yes, even at d1.5 libjxl is much better
|
|
|
_wb_
|
2023-04-21 09:54:03
|
bitstream-wise, jxl can do everything pik can though, unless I'm missing something
|
|
|
Jyrki Alakuijala
|
2023-04-21 09:54:14
|
if we refocused on distance 1.0 and optimized a dedicated encoder for that, we could make it 5-10% better with JPEG XL than with pik
|
|
2023-04-21 09:54:25
|
correct
|
|
2023-04-21 09:55:00
|
decoding speed might be slightly better with pik, but with fewer threading options
|
|
2023-04-21 09:55:21
|
tiling didn't exist yet in pik
|
|
2023-04-21 09:55:47
|
the compressed decoder size for libjxl is about 180 kB, but could have been 25 kB for pik
|
|
|
_wb_
|
2023-04-21 09:56:05
|
for the web, d2 is imo the sweet spot (with some web applications requiring more like d1-1.5 and others more like d2.5-3, but d2 is what most web devs would prefer to use)
|
|
|
Jyrki Alakuijala
|
2023-04-21 09:56:10
|
simpler might have been nice for wasm implementations
|
|
2023-04-21 09:56:15
|
I like d1 🙂
|
|
2023-04-21 09:56:41
|
I hope eventually the internet will be fast enough for d1 for everyone
|
|
|
_wb_
|
2023-04-21 09:56:46
|
for cameras, d0.3-d1 is a more relevant range
|
|
|
|
veluca
|
|
Jyrki Alakuijala
tiling didn't exist yet in pik
|
|
2023-04-21 09:59:30
|
oh I remember implementing that... that was an absolutely massive pain
|
|
|
_wb_
|
2023-04-21 10:00:01
|
I doubt "d1 for everyone" will happen as long as bandwidth has a nontrivial cost for content providers, but I do think "d1 when it matters, d2 for anything else" will happen (where "when it matters" will mostly be decided by marketeers doing A/B testing to see what fidelity target is most profitable)
|
|
2023-04-21 10:00:52
|
tiling didn't exist yet in fuif either, and it was also a massive pain to make that work (in particular in combination with progressive / squeeze)
|
|
|
Jyrki Alakuijala
|
2023-04-21 10:00:53
|
human time costs 1000x more than computer time; it is just a matter of time before capitalism figures that out
|
|
2023-04-21 10:01:59
|
I believe avif didn't solve tiling properly; they just combine things at the final level, i.e., noise synthesis and filtering don't extend between the largest tiles
|
|
2023-04-21 10:03:11
|
we still have interesting interactions between CfL tiles, integral transforms, and 256x256 tiles -- in practice we use 64x64 groups of integral transforms to avoid difficulties
|
|
2023-04-21 10:03:39
|
if we dealt with those difficulties we could get 1-2% more density
|
|
2023-04-21 10:03:59
|
but I don't think we will find smart enough people to ever do it 🙂
|
|
|
jonnyawsom3
|
|
_wb_
for the web, d2 is imo the sweet spot (with some web applications requiring more like d1-1.5 and others more like d2.5-3, but d2 is what most web devs would prefer to use)
|
|
2023-04-21 10:05:17
|
Can't forget it's very good for offline use too. When d0 isn't a good option, d0.3 usually retains all the pixels and only shifts the colors by a shade or two on high-resolution images for me, with a filesize between lossless and d1 (if my memory serves me right from my tests... I really should've written them down somewhere; probably got a lot of data across a few months)
Although, you already mentioned cameras by the time I typed all that out haha
|
|
|
Jyrki Alakuijala
|
2023-04-21 10:06:20
|
I'm currently working on improving quality of libjxl in difficult situations
|
|
|
_wb_
|
2023-04-21 10:06:48
|
avif has tiling at the heif level and tiling at the av1 level. At the heif level it's just doing a collage without any integration with the coding tools like filters, which causes tile seam artifacts. At the av1 level I am not sure how it works regarding filters and noise synthesis, but it certainly does interfere with directional prediction and causes compression results to be somewhat worse when using tiles (i.e. multi-threaded encode) than when not using them, unlike libjxl, where the number of threads used in encoding does not affect the output bitstream at all.
|
|
|
Jyrki Alakuijala
|
2023-04-21 10:06:55
|
running optimizations related to adjusting adaptive quantization for pixelised graphics
|
|
2023-04-21 10:11:53
|
lots of looking at images required unfortunately -- the heuristics have quite a few manually tweaked things and just looking at objective metrics has become less helpful
|
|
|
_wb_
|
2023-04-21 10:12:56
|
detecting non-photo regions that are bad for dct and extracting them to use modular patches (possibly using delta palette) would probably help a lot for currently-difficult images, but the main challenge there is to do it in a way that causes no surprises (no heuristics that cause very nonlinear behavior where something completely different happens depending on a small difference in the input), and in a way that is effective (both quality and density improve)
|
|
2023-04-21 10:13:28
|
detecting splines would also probably help quite a bit
|
|
|
Jyrki Alakuijala
|
2023-04-21 10:13:30
|
delta palette encoding is still a rather experimental, kilopixels-per-second thingy, but it has huge potential
|
|
2023-04-21 10:14:25
|
but before doing this it would be nice if VarDCT was stable in quality, even if it means high bpp sometimes
|
|
|
jonnyawsom3
|
2023-04-21 10:15:20
|
Certainly gave yourselves plenty of wiggle room for rather vast improvements in future
|
|
|
username
|
2023-04-21 11:47:31
|
It would be interesting to have a list of things that didn't make it into jpeg xl from pik, flif, fuif and in-development jxl
|
|
|
|
veluca
|
2023-04-21 11:50:19
|
The biggest ones I can think of are about 5 different entropy coders 🤣
|
|
2023-04-21 11:50:55
|
Also modular mode was rewritten a fair bit, some things went away and some were replaced
|
|
|
username
|
2023-04-21 11:56:31
|
maybe in the future a proper list could be made since while there is some nice info in this server it's pretty scattered around the place
|
|
2023-04-21 11:57:40
|
stuff like dots being removed to simplify the spec, and FLIF's progressive abilities being removed since they sounded like a memory nightmare of some sort
|
|
2023-04-21 11:58:18
|
there was something else I heard about in this server that sounded interesting but I saw it being talked about months ago and I don't remember any details
|
|
2023-04-21 12:06:21
|
I heard in here that in the future some of y'all core devs might write a book about JPEG XL; if so, having a part go into detail about some of the things dropped could be pretty interesting
|
|
|
Kampidh
|
2023-04-21 12:16:17
|
I found an interesting bug(?) in Thorium (and later also confirmed on older chromium by <@245794734788837387> ) where some JXL pics aren't progressively decoded while doing a slow load, but it works fine on progressive demo and Waterfox
|
|
|
spider-mario
|
2023-04-21 12:16:31
|
at some point, we had a feature called "gradient map" but it also did not survive
|
|
2023-04-21 12:17:10
|
https://github.com/google/pik/blob/be30e6e06c10b7830b0d5843b6b25666654033cd/pik/gradient_test.cc#L148-L153
|
|
|
|
veluca
|
|
spider-mario
at some point, we had a feature called "gradient map" but it also did not survive
|
|
2023-04-21 12:18:16
|
oh I had forgotten about that
|
|
2023-04-21 12:18:22
|
also we had low-frequency predictions
|
|
|
username
|
|
Kampidh
I found an interesting bug(?) in Thorium (and later also confirmed on older chromium by <@245794734788837387> ) where some JXL pics aren't progressively decoded while doing a slow load, but it works fine on progressive demo and Waterfox
|
|
2023-04-21 12:25:57
|
maybe <@795684063032901642> would know what the problem is? also I would like to note that the issue happens to the jxl images on the gallery pages on this website https://saklistudio.com/
|
|
2023-04-21 12:26:53
|
so anything under "Artwork Gallery" or "Photography Gallery" from what it seems
|
|
|
spider-mario
at some point, we had a feature called "gradient map" but it also did not survive
|
|
2023-04-21 12:28:27
|
a way of representing a gradient in an image that goes from one color to another, I'm guessing?
|
|
|
spider-mario
|
2023-04-21 12:28:45
|
something like that, from what I recall
|
|
2023-04-21 12:28:48
|
(i.e. not much)
|
|
|
Kampidh
|
|
username
maybe <@795684063032901642> would know what the problem is? also I would like to note that the issue happens to the jxl images on the gallery pages on this website https://saklistudio.com/
|
|
2023-04-21 12:32:58
|
Since the images there are quite mixed (libjxl version difference and some even had it at modular), I'd like to point this one in particular for the test https://saklistudio.com/artworks/personal.php?20 (center progressive, libjxl 0.7, vardct, distance 1)
|
|
|
Moritz Firsching
|
2023-04-21 12:33:57
|
without looking at the images: progressive does not work when the image has alpha
|
|
2023-04-21 12:34:17
|
even if the alpha is fully opaque
|
|
|
username
|
2023-04-21 12:35:21
|
ohhh that makes sense, I didn't notice that jxlinfo said the images had alpha when I put them in
|
|
2023-04-21 12:35:38
|
(the last file in that cmd image was something that didn't have the issue)
|
|
|
_wb_
|
|
username
It would be interesting to have a list of things that didn't make it into jpeg xl from pik, flif, fuif and in-development jxl
|
|
2023-04-21 12:35:43
|
I think most of the stuff we ended up removing was removed for a good reason: either because it didn't bring any real advantage, or because there was already another way to do it. We unified and simplified stuff, but I don't think there's anything we removed that is really worth bringing back in some future codec or jxl extension.
|
|
|
Kampidh
|
2023-04-21 12:35:44
|
ooohhh yup, makes sense since krita always includes alpha
|
|
|
_wb_
|
|
Moritz Firsching
without looking at the images: progressive does not work when the image has alpha
|
|
2023-04-21 12:37:04
|
we need to fix that; bitstream-wise, jxl can do progressive alpha just fine, as long as squeeze is used (which libjxl does by default when doing lossy)
|
|
|
Moritz Firsching
|
2023-04-21 12:38:25
|
yeah, it is only the chrome decoder that doesn't do progressive in that case, not a limitation of the bitstream
|
|
|
username
|
|
_wb_
I think most of the stuff we ended up removing was removed for a good reason: either because it didn't bring any real advantage, or because there was already another way to do it. We unified and simplified stuff, but I don't think there's anything we removed that is really worth bringing back in some future codec or jxl extension.
|
|
2023-04-21 12:39:17
|
oh yeah, I'm sure they were all removed for good reasons; I just find that some of them were quite interesting ideas, even if in practice they either didn't work well or just caused spec or code bloat for little gain
|
|
|
_wb_
|
2023-04-21 12:40:37
|
we just need to implement showing previews in this case, which requires either unsqueezing with zeroes for every residual not yet decoded, or doing a partial unsqueeze to 1:8 resolution and using the fancy 8x upsampling from there. The latter is probably both more efficient and nicer looking, but it requires a bit of implementation
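The first option ("unsqueezing with zeroes") can be pictured on a Haar-like toy model of squeeze: inverting one squeeze step with the residual band assumed zero turns every low-resolution sample into a pair of equal full-resolution samples, i.e. a pixel-doubled preview. A sketch, not libjxl's code:

```python
def unsqueeze_zero_residuals(avgs):
    """Invert a Haar-like squeeze step before the residuals have arrived:
    with residuals taken as 0, both samples of a pair equal the average."""
    out = []
    for m in avgs:
        out += [m, m]
    return out

# A decoded 1:2 average band becomes a pixel-doubled full-resolution preview.
assert unsqueeze_zero_residuals([11, 21, 29, 42]) == [11, 11, 21, 21, 29, 29, 42, 42]
```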
|
|
2023-04-21 12:41:43
|
(iirc there is already code for doing the "unsqueezing with zero residuals for everything not yet seen", so maybe we should start by just making that work in chrome)
|
|
|
username
|
2023-04-21 12:44:39
|
progressively decoding images with alpha (at least the ones produced by libjxl with default-ish settings) seems to work fine; it's just that I think the chromium decoding code is told to not even attempt progressive decoding if it sees an alpha channel
|
|
|
_wb_
|
2023-04-21 12:47:16
|
also, we should probably drop trivial (all-opaque) alpha channels by default for single-frame inputs, at least when doing lossy (when doing lossless, arguably the existence of an alpha channel, even a trivial one, is in itself something that needs to be preserved)
|
|
2023-04-21 12:47:44
|
having a trivial alpha channel will just slow down decoding for no good reason
|
|
|
username
|
|
username
progressively decoding images with alpha (at least the ones produced by libjxl with default-ish settings) seems to work fine; it's just that I think the chromium decoding code is told to not even attempt progressive decoding if it sees an alpha channel
|
|
2023-04-21 12:49:01
|
example of chromium alpha progressive decoding seeming to work if forced:
|
|
|
_wb_
|
2023-04-21 01:04:26
|
https://cloudinary.com/labs/cid22 yay!
|
|
|
Foxtrot
|
2023-04-21 01:19:33
|
nice, let it beat the AVIF team over the head 🙂
|
|
|
_wb_
|
2023-04-21 01:27:31
|
I think the results for jxl are quite good, but the main conclusion I draw is that it is very image-dependent which of the two codecs performs best (for some images it's about the same, for some jxl is better, for some avif is better), with jxl performing a bit better than avif on average but certainly not on every single image. In terms of encoder consistency, jxl is doing a lot better (as you can see by comparing the jxl-avif gap at p50 vs at p5), and if there can be only one format, it should probably be jxl, but I think our data indicates that there's room for both avif and jxl and having both is better than having only one of them.
|
|
|
username
|
|
_wb_
https://cloudinary.com/labs/cid22 yay!
|
|
2023-04-21 01:49:57
|
what downscaling algorithm was used for the reference images in the dataset? I don't really know anything about datasets so maybe there is a standard way to do downscaling for this type of stuff?
|
|
|
_wb_
|
2023-04-21 01:50:56
|
I think it was just default imagemagick -resize (which is not very good but it does the job)
|
|
2023-04-21 01:51:42
|
most of the source images are high-res pictures from pexels, cropped to a square aspect ratio and then downscaled to 512x512
|
|
2023-04-21 01:52:55
|
most of the source images would be 6x or 8x as large and lossy but quite high quality, so we just assume that after downscaling it's effectively a pristine image
|
|
|
username
|
2023-04-21 01:56:33
|
yeah I have noticed that taking a large somewhat lightly lossy image and downscaling it a bunch does result in something you wouldn't be able to tell was originally lossy
|
|
|
_wb_
|
2023-04-21 01:57:30
|
we could probably have used a better downscaling method, but whatever, the images are what they are and I think they're pristine enough
|
|
2023-04-21 01:58:19
|
(default imagemagick -resize uses a decent Lanczos3, but it does it in gamma-compressed color, which is wrong and makes things look darker than they should)
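The gamma issue is easy to demonstrate: averaging gamma-compressed sRGB values biases results dark, while converting to linear light first does not. A minimal sketch of a 2x box downscale done in linear light (numpy, illustrative only; real resizers use better filters than a box):

```python
import numpy as np

def srgb_to_linear(u):
    return np.where(u <= 0.04045, u / 12.92, ((u + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(v):
    return np.where(v <= 0.0031308, v * 12.92, 1.055 * v ** (1 / 2.4) - 0.055)

def box_downscale_2x(img):
    """2x box downscale in linear light; img is float sRGB in [0, 1]."""
    lin = srgb_to_linear(img)
    lin = (lin[0::2, 0::2] + lin[0::2, 1::2] +
           lin[1::2, 0::2] + lin[1::2, 1::2]) / 4
    return linear_to_srgb(lin)

# A black/white checkerboard averages to mid grey in linear light, which is
# about 0.735 in sRGB; naive gamma-space averaging would give 0.5 (darker).
checker = (np.indices((4, 4)).sum(axis=0) % 2).astype(float)
out = box_downscale_2x(checker)
```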
|
|
|
Traneptora
|
2023-04-21 01:59:52
|
libplacebo scales in linear light by default if you're looking for methods
|
|
|
username
|
2023-04-21 02:01:27
|
the dataset is already published but that could be useful for future datasets I guess
|
|
|
Traneptora
|
2023-04-21 02:07:36
|
yea, true
|
|
|
username
|
|
_wb_
https://cloudinary.com/labs/cid22 yay!
|
|
2023-04-21 02:08:04
|
would it be possible to have an extra download with the full set of non-distorted reference/pristine images? 7.2 GB is way too large for my internet to handle in a timely manner.
|
|
|
_wb_
|
2023-04-21 02:09:08
|
ah sure that's a good idea. might take a while
|
|
2023-04-21 02:09:55
|
(i can't update the website myself)
|
|
|
Jim
|
|
_wb_
we could probably have used a better downscaling method, but whatever, the images are what they are and I think they're pristine enough
|
|
2023-04-21 02:12:16
|
It's probably better that way. Most people use some horrible way to downscale images and so do a lot of websites, so it is likely a better comparison to what you see on the typical web rather than trying to get the most pristine downscaling you can get.
|
|
|
username
|
2023-04-21 02:12:39
|
<@794205442175402004> the information about how the downscaling was done would probably be nice to have somewhere, if it's not there already; maybe that could also be added to the website when you submit an update for it?
|
|
|
|
afed
|
2023-04-21 02:13:32
|
modern imagemagick uses Robidoux for downscaling, if I'm not mistaken (maybe just for -distort resize); it's pretty decent
|
|
|
w
|
2023-04-21 02:15:11
|
is it at least downsampling in linear light
|
|
2023-04-21 02:18:22
|
oh you said it's not
|
|
|
sklwmp
|
2023-04-21 02:18:38
|
what metric is this using?
|
|
2023-04-21 02:18:48
|
either it's not indicated anywhere or i'm being dumb
|
|
2023-04-21 02:19:41
|
i'm guessing it's ssimulacra2?
|
|
|
|
afed
|
2023-04-21 02:20:38
|
also some interesting old thread about resampling <https://forum.luminous-landscape.com/index.php?topic=91754.460>
|
|
|
w
|
2023-04-21 02:21:43
|
i find the gamma issue the most important <http://www.ericbrasseur.org/gamma.html>
|
|
|
_wb_
|
|
sklwmp
what metric is this using?
|
|
2023-04-21 02:22:01
|
that's not an objective metric, those are subjective scores
|
|
|
sklwmp
|
2023-04-21 02:22:55
|
ah, that makes sense
|
|
2023-04-21 02:23:07
|
it should probably be indicated somewhere though
|
|
|
Traneptora
|
2023-04-21 02:23:10
|
I'd probably put that in the header, like "the following plot shows bitrate/distortion curves aggregated over subjective scores of the entire CID22 dataset" or something similar to that
|
|
2023-04-21 02:23:31
|
I assumed it was ssimulacra2 tbh
|
|
|
w
|
2023-04-21 02:24:28
|
i'd put "this plot compares how good they are"
|
|
|
_wb_
|
2023-04-21 02:24:47
|
it's a dataset based on 1.4 million opinions collected in 46k crowd-sourced test sessions (where single participants could do up to 4 sessions, so this represents something between 10 and 40k different humans)
|
|
|
Traneptora
|
2023-04-21 02:25:09
|
or at least, I'd put on the vertical axis "aggregate subjective score"
|
|
2023-04-21 02:25:18
|
like how you have the horizontal axis labeled average bpp
|
|
|
|
afed
|
2023-04-21 02:25:38
|
it would be nice to add results for 2.1 and jxl with the latest quality changes (also maybe jpegli)
<https://jon-cld.s3.amazonaws.com/test/index.html>
|
|
|
Traneptora
|
2023-04-21 02:26:28
|
I don't believe it's possible to change the results, as these were subjectively gathered from 1.4 million opinions
|
|
2023-04-21 02:26:47
|
to test libjxl again would require you to ask those people again
|
|
|
w
|
2023-04-21 02:27:39
|
how are people picked to participate in those tests
|
|
|
|
afed
|
2023-04-21 02:32:27
|
there are some services, also with some anti-cheating validation
|
|
|
afed
it would be nice to add results for 2.1 and jxl with the latest quality changes (also maybe jpegli)
<https://jon-cld.s3.amazonaws.com/test/index.html>
|
|
2023-04-21 02:32:48
|
I mean purely metric scores, without subjective evaluations; though libjxl hasn't improved much on metrics, unlike visual quality
|
|
|
_wb_
|
|
w
how are people picked to participate in those tests
|
|
2023-04-21 02:37:20
|
we used https://subjectify.us/
|
|
2023-04-21 02:37:51
|
I don't know how they recruit participants
|
|
2023-04-21 02:38:24
|
could be they internally use Amazon Mechanical Turk or something similar, no idea
|
|
2023-04-21 02:39:56
|
subjective testing takes time and it's not cheap, so it's not something you can do for every new encoder version. But you can use the subjective data to make objective metrics better; that's exactly what I did to make ssimulacra 2
|
|
|
w
|
|
_wb_
|
|
afed
it would be nice to add results for 2.1 and jxl with the latest quality changes (also maybe jpegli)
<https://jon-cld.s3.amazonaws.com/test/index.html>
|
|
2023-04-21 02:41:23
|
yes, i'll update those benchmarks at some point, might take a while before I get to it though
|
|
|
username
|
2023-04-21 02:52:14
|
something somewhat interesting I would like to note is that MozJPEG does trellis quantization by default while cWebP only does it if you change the "method"/"-m" to something above the default value of "4"
|
|
|
Traneptora
|
2023-04-21 07:12:12
|
Can I use jpegli as a library atm or is it limited to benchmark_xl?
|
|
|
_wb_
|
2023-04-21 07:13:42
|
Afaik it should be usable as a libjpeg-turbo dropin replacement already, if you don't use exotic libjpeg api features
|
|
|
Traneptora
|
2023-04-21 09:43:11
|
how? Does libjxl.so expose the symbols or is it separate?
|
|
|
spider-mario
|
2023-04-21 10:56:25
|
it's built separately
|
|
2023-04-21 10:58:56
|
if you build a current libjxl on linux, you should end up with a `jpegli/libjpeg.so` in your build directory if I am not mistaken
|
|
|
Traneptora
|
2023-04-22 06:18:47
|
does it come with an equivalent of cjpeg too, btw?
|
|
2023-04-22 06:19:00
|
I've only used it with benchmark xl
|
|
|
_wb_
|
2023-04-22 06:22:46
|
I think if you take the cjpeg from libjpeg-turbo and LD_PRELOAD the jpegli libjpeg.so, it might work
|
|
|
sklwmp
|
2023-04-22 07:02:57
|
Nope, not yet.
|
|
2023-04-22 07:22:43
|
i usually just end up using cjpeg_hdr from the libjxl repo
|
|
|
sklwmp
Nope, not yet.
|
|
2023-04-22 07:29:20
|
even hacking in these symbols when compiling cjpeg and manually linking to jpegli, i still get the same error as before:
`lib/jpegli/encode.cc:567: jpegli_compress_struct has wrong size.`
|
|
|
_wb_
|
2023-04-22 08:14:06
|
Hm, probably needs some more work then I suppose.
|
|
|
spider-mario
|
2023-04-22 09:45:47
|
I believe `cjpeg_hdr` is indeed the current recommended approach outside of `benchmark_xl`
|
|
2023-04-22 09:46:26
|
though I don't think we've added a flag to use XYB instead of the input's embedded profile
|
|
2023-04-22 09:47:06
|
yeah, no
|
|
2023-04-22 09:47:09
|
no XYB with `cjpeg_hdr`
|
|
|
monad
|
|
_wb_
https://cloudinary.com/labs/cid22 yay!
|
|
2023-04-22 07:33:40
|
cured my depression for the day, thanks
|
|
|
OkyDooky
|
2023-04-22 09:32:44
|
Before I discovered JXL, I had an idea for an image format that could effectively compress batches of images into one convenient file, mainly to handle variants but also to bundle albums in a single viewable package (which is always inconvenient to do with ZIP-style archives), similar to how .mkv/.mka can be a single-file playlist that you extract individual tracks from, if you wish.
The use cases would be photo sets of similar photos and variant sets of illustrations, including anything utilizing these (eg. websites that allow hosting multiple images per entry, like Pixiv and Newgrounds, as well as Visual Novels, which use variants as their bread and butter), as well as being able to share a group of unrelated images that could be viewed in a gallery app without having to be "unzipped" or doing any extra steps by the recipient.
But, seeing <@794205442175402004> talking about layers and such a bit earlier, I figured I should ask if the JXL format could be used for this kind of application, first.
Does, or can, JXL compress elements *between* layers, saving space when layers are similar? If yes, then the "album" features could maybe be done through metadata and adding appropriate handling into viewing software?
Thanks for all your works, guys.
|
|
|
jonnyawsom3
|
2023-04-22 10:00:49
|
If I recall JXL does have reference frames for this kind of thing, but I've never seen them used yet
|
|
|
Foxtrot
|
2023-04-22 10:15:10
|
I see so many uses for JPEG XL here that it seems to me that single .jxl extension will not be enough
|
|
|
OkyDooky
|
2023-04-22 10:53:36
|
Hmm. I'll have to look at that. Any URL pointers?
My thought is, say, with visual novels, for instance, those will usually use one character illustration and have dozens of variations that just change something simple, like expressions, and 90% of the image is the same, so shouldn't there be a convenient, even lossless, way to basically just describe those differences and shrink the image set's size by a huge amount, depending on how many images are being described this way?
I use VNs as a peak example, since they usually have the most similarities between variations, but some artist illustration sets and photo sets come pretty close.
(<@238552565619359744>)
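The core idea (store one base frame, then only what changed per variant) can be sketched in a few lines: find the bounding box of the pixels that differ from the base and keep just that patch plus its offset. Conceptually this is what JXL reference frames and patches enable; whether and how tooling exposes it is a separate question, so treat this as an illustration only:

```python
import numpy as np

def diff_patch(base, variant):
    """Return ((y, x) offset, patch) covering all pixels where the
    variant differs from the base frame, or None if they're identical."""
    changed = np.any(base != variant, axis=-1)   # per-pixel change mask
    if not changed.any():
        return None
    ys, xs = np.nonzero(changed)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    return (y0, x0), variant[y0:y1, x0:x1]

base = np.zeros((64, 64, 3), dtype=np.uint8)
variant = base.copy()
variant[10:14, 20:30] = 255                      # a changed "expression"
offset, patch = diff_patch(base, variant)
# only a 4x10 patch needs storing instead of the full 64x64 image
```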
|
|
2023-04-22 11:01:43
|
I agree. Maybe metadata can be used to help software not only distinguish internally but also clue users in to the features a particular file utilizes. Kind of like how some gallery apps have mini icons to indicate an image is a "motion photo" or a proper panorama, etc. Maybe some could indicate layers, this "album" function, or any others.
But, if my idea could be done via JXL, then that could be a great thing to market for some web services, especially the ones I mentioned (Newgrounds and Pixiv). Pixiv, especially, since they have TONS of entries that have dozens or even 100+ images each. This feature could possibly reduce their total load by 40% or more. Actually, if you include converting all those bloated JPGs and PNGS they have to JXL, then the total server storage savings could go much closer to 80%, by my non-scientific estimating.
(<@100707002262503424>)
|
|
|
jonnyawsom3
|
|
Hmm. I'll have to look at that. Any URL pointers?
My thought is, say, with visual novels, for instance, those will usually use one character illustration and have dozens of variations that just change something simple, like expressions, and 90% of the image is the same, so shouldn't there be a convenient, even lossless way, to basically just describe those differences and shrink the image set's size by a huge amount, depending on how many images are being described this way?
I use VNs as a peak example, since they usually have the most similarities between variations, but some artist illustration sets and photo sets come pretty close.
(<@238552565619359744>)
|
|
2023-04-22 11:04:14
|
There's no direct mention of it I can find, but it seems related to patches and there is mention of it here
https://github.com/libjxl/libjxl/blob/a741606839f6b5118dba7be0b67d2f496e225543/lib/jxl/enc_frame.h#L40
|
|
|
OkyDooky
|
2023-04-22 11:08:36
|
Thanks. It doesn't mean much to me. And I wonder if the methods of inter-frame prediction or whatever is being used would be ideal for this. I'm sure it could work, but maybe there would be more preferable approaches. Idk.
|
|
|
jonnyawsom3
|
|
I agree. Maybe metadata can be used to help software not only distinguish internally but also clue in users to the features a particular file utilizes. Kind of like how some gallery apps have mini icons to indicate an image is a "motion photo" or a proper panorama, etc. Maybe some could indicate layers, this "album" function, or any others.
But, if my idea could be done via JXL, then that could be a great thing to market for some web services, especially the ones I mentioned (Newgrounds and Pixiv). Pixiv, especially, since they have TONS of entries that have dozens or even 100+ images each. This feature could possibly reduce their total load by 40% or more. Actually, if you include converting all those bloated JPGs and PNGs they have to JXL, then the total server storage savings could go much closer to 80%, by my non-scientific estimating.
(<@100707002262503424>)
|
|
2023-04-22 11:09:19
|
Having two bytes to mark the file as lossless, lossy or mixed (if the modular patches gets merged). Then single, multi-frame or animated (has timing info) could help a lot for very little increase in size (others can probably think of more to make the most use of the bytes)
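As a purely hypothetical illustration of this suggestion (none of this exists in the JPEG XL format; the constants and helpers are invented), the two marker bytes could be packed and read like this:

```python
# Toy sketch of the two-byte marker suggested above. Entirely hypothetical:
# this is NOT part of the JPEG XL format, just an illustration of the idea.
LOSSLESS, LOSSY, MIXED = 0, 1, 2          # compression kind
SINGLE, MULTI_FRAME, ANIMATED = 0, 1, 2   # frame kind (ANIMATED = has timing info)

def pack_marker(compression, frames):
    """Pack the two flags into two bytes."""
    return bytes([compression, frames])

def unpack_marker(marker):
    """Read the two flags back out of the marker bytes."""
    return marker[0], marker[1]
```

A viewer could read these two bytes cheaply without parsing frame headers, which is the "very little increase in size" trade-off described above.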
|
|
|
_wb_
|
2023-04-23 06:13:39
|
The headers will tell if the image is single-layer, multi-layer or animated. No need to add additional metadata for that imo.
|
|
2023-04-23 06:16:15
|
Patches could indeed be used for visual novels if there are duplicated regions (especially if they're identical at the pixel level). Currently libjxl only uses patches within a single frame (to deduplicate letters of text), but the bitstream allows reusing patches across frames. So it is just a matter of making an encoder that actually uses this bitstream feature.
|
|
|
Traneptora
|
|
_wb_
The headers will tell if the image is single-layer, multi-layer or animated. No need to add additional metadata for that imo.
|
|
2023-04-23 03:35:08
|
how will the headers tell you if the image is single-vs-multi-layer?
|
|
2023-04-23 03:35:11
|
those are the Frame headers iirc
|
|
|
skalt711#4276
|
|
_wb_
Patches could indeed be used for visual novels if there are duplicated regions (especially if they're identical at the pixel level). Currently libjxl only uses patches within a single frame (to deduplicate letters of text), but the bitstream allows reusing patches across frames. So it is just a matter of making an encoder that actually uses this bitstream feature.
|
|
2023-04-23 04:14:03
|
It's an awesome feature for multi-layered images where you need to pinpoint something!
|
|
|
_wb_
|
|
Traneptora
how will the headers tell you if the image is single-vs-multi-layer?
|
|
2023-04-23 04:27:37
|
If it's a single layer image then the first frame will have is_last set to true. But yes, that's technically Frame header, not Image header
|
|
|
Traneptora
|
|
_wb_
If it's a single layer image then the first frame will have is_last set to true. But yes, that's technically Frame header, not Image header
|
|
2023-04-23 04:30:45
|
the first frame that is regular or skip progressive
|
|
2023-04-23 04:31:00
|
if the first frame is an LF frame or a patches frame you have to skip it
|
|
|
_wb_
|
2023-04-23 07:00:37
|
Right, in those cases you would need to seek to figure it out
|
|
|
OkyDooky
|
2023-04-23 07:10:31
|
Okay, so "patches" is what I should be keeping an eye on, rather than "reference frame(s)?"
So, just to clarify, something like these:
|
|
2023-04-23 07:10:46
|
jjba7su__students_visual_novel_sprite_sample_2_by_mikotonui-dbzku30.png
|
|
2023-04-23 07:10:50
|
visual_novel_character_sprite_sample_by_woolfcub-d7wmoeo.jpg
|
|
2023-04-23 07:12:22
|
...could be compressed way below the size of doing them one-by-one if patches are applied and assuming each one is on a separate layer/frame (not sure how differently those are treated)?
|
|
2023-04-23 07:14:19
|
If I were wanting to save space with an existing product, I could (assuming I were one of the developers or publishers) take all the existing sprites and group them accordingly to get good compression, then?
|
|
|
Fraetor
|
2023-04-23 07:14:48
|
Yeah, you could have the bulk of each image in a separate frame, and then have a frame for each sprite that uses a big patch for most of it.
|
|
|
OkyDooky
|
2023-04-23 07:16:00
|
From an authoring workflow, I'm guessing it would be best to save any base elements on their own layer/frame (whichever is more correct) and then any changing elements on separate ones, then have the game's decoder selectively pull each appropriate layer/frame and display them in the proper draw order?
|
|
2023-04-23 07:18:52
|
Awesome! I'm still not sure how to think of layers/frames, in this instance, since I just think of the general workflow for each (layers=GIMP/Photoshop, frames=video editing). Is it basically semantics to the format? Or are they fundamentally and functionally different things?
(<@176735987907559425>)
|
|
|
_wb_
|
2023-04-23 07:27:28
|
It doesn't matter if the repetition is in different frames or within the same frame. You can make a "sprite sheet" frame (which we call kReferenceOnly, it is a frame that isn't shown), tell the jxl decoder to remember it, and then reference it in subsequent frames, basically by issuing commands of the form "take a rectangle of size 200x100 at position 123,456 in reference frame 0, and put it at position 789,234 of the current frame, alpha blending it on top of what was already there because of the normal frame decoding"
|
|
2023-04-23 07:30:16
|
There are 4 "save slots" for reference frames, you can take Patches from any of these slots - this can be either an invisible previous frame (kReferenceOnly) or a regular previous frame (but it can be from a long time ago, it is signaled whether or not to save a frame to a save slot and which slot to use for that)
|
|
2023-04-23 07:30:54
|
There are various blend modes (alpha blend over, alpha blend under, just replacing, adding, etc)
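The patch mechanism described above can be sketched like this (a hypothetical helper, not the libjxl API; the frame layout and function names are invented for illustration, showing just the "replace" and alpha-blend-over modes):

```python
# Hypothetical sketch (NOT the libjxl API) of what a patch command does:
# copy a rectangle out of a saved reference frame and blend it onto the
# current frame. Frames here are dicts of row-major 2D lists:
#   {"rgb": [[(r, g, b), ...], ...], "a": [[alpha, ...], ...]}

def apply_patch(ref, cur, src_x, src_y, dst_x, dst_y, w, h, mode="over"):
    """Blend a w*h rectangle of `ref` at (src_x, src_y) onto `cur` at (dst_x, dst_y)."""
    for dy in range(h):
        for dx in range(w):
            p = ref["rgb"][src_y + dy][src_x + dx]
            pa = ref["a"][src_y + dy][src_x + dx]
            q = cur["rgb"][dst_y + dy][dst_x + dx]
            qa = cur["a"][dst_y + dy][dst_x + dx]
            if mode == "replace":
                out, out_a = p, pa
            elif mode == "over":  # straight-alpha "source over" blending
                out_a = pa + qa * (1.0 - pa)
                if out_a == 0.0:
                    out = (0.0, 0.0, 0.0)
                else:
                    out = tuple((pc * pa + qc * qa * (1.0 - pa)) / out_a
                                for pc, qc in zip(p, q))
            else:
                raise ValueError(mode)
            cur["rgb"][dst_y + dy][dst_x + dx] = out
            cur["a"][dst_y + dy][dst_x + dx] = out_a
```

The example above, "take a 200x100 rectangle at 123,456 in reference frame 0 and put it at 789,234", would then correspond to something like `apply_patch(slot[0], frame, 123, 456, 789, 234, 200, 100)`.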
|
|
2023-04-23 07:33:31
|
So it offers a lot of bitstream expressivity, while for the decoder it's not that hard to do, it just needs memory to store previously decoded frames if needed, and do blending of one region on top of another (but most of that code can be shared with what is needed for frame blending anyway)
|
|
2023-04-23 07:34:58
|
We haven't tried to make an encoder yet that really tries to use this to its full potential, mostly because searching for good matches is in general computationally hard.
|
|
2023-04-23 07:39:12
|
We have some use of Patches already in the current libjxl encoder, but it's quite limited, and doesn't try to do anything for multi-frame.
|
|
2023-04-23 07:40:59
|
In general, we tried to make the jxl bitstream as expressive as possible without making the decoder too complicated, so we can have a lot of room for future encoder improvements so jxl can be a format that can last 3 decades like the old jpeg did.
|
|
2023-04-23 07:44:35
|
Splines and patches are imo two of those "future proofing" bitstream features: it's only a few extra pages of spec / a few hundred extra lines of decoder code, but they bring a ton of possibilities for future improvements.
|
|
2023-04-23 07:47:34
|
So we designed things with a quite different time horizon in mind than how video codecs are designed: there they design codecs for the needs of the moment, not really looking more than 5-10 years ahead since by then the next video codec will be there anyway...
|
|
|
jonnyawsom3
|
2023-04-24 01:20:56
|
Ahh, that'd explain why it tends to struggle compressing GIFs, I was never quite sure if the reference frames were 'enabled' since it *could* lower the filesize but only on higher efforts where patches should be enabled. Obviously it could still do a lot better, then
|
|
|
OkyDooky
|
2023-04-24 01:28:53
|
I'm glad you all decided to design the way you did.
I'm not going to pretend I get the full implications of how the reference frames and the slots work. But, it's sounding like you're basically saying: "yes, JXL can do the things that you were wanting a new file format to do. It's just not there yet to do it at an ideal level."
|
|
2023-04-24 01:38:12
|
I think the use cases I described above are very valid and haven't had a good tool to fill that niche. Like, being able to share a cluster of images as one file and having a gallery be able to open it would be very useful for general sharing and social purposes. And lots of games and such could benefit from using the patches feature for any number of kinds of sprites to potentially shrink their size by a lot. And, I mentioned Pixiv, because they're the only ones that I know that use the "multiple images per post" feature that heavily, but I think Flickr and others offer something similar (I haven't checked ArtStation in years, though).
So, I think these features would very much be worth investing in and putting on the roadmap. More *actual* use cases, in theory, means better adoption. And the savings this could bring in those areas, I think, would be worth the extra computational costs. I mean, if people are using AV1, despite its "poor" optimization, then I wouldn't say spending an extra minute or few to export a file like this would be too off-putting.
|
|
|
jonnyawsom3
|
2023-04-24 01:47:41
|
In most cases where sprites are used they tend to be pixel art or no larger than 2K, so doing it 'the hard way' might not be as slow as expected for its proper use cases. Obviously working smarter rather than harder is still a goal though
|
|
|
OkyDooky
|
2023-04-24 02:49:26
|
True, which is the other reason why I specified visual novels, since they tend to live off of variations (aside from event CG scenes), but also because those assets tend to be of high resolution. Even the lower-bit, lower-res assets in older VNs focus on describing higher fidelity images than your usual "pixel art." Interestingly, I just remembered, the older ones also tended to feature more animations in their assets than more contemporary entries. I guess some might use some kind of Live2D feature. But, once Armory3D is able to reimplement support for Blender's Grease Pencil objects, virtually all limitations with the above approaches should be null. But, I digress.
|
|
|
Jyrki Alakuijala
|
2023-04-24 09:27:08
|
I'm having quite a bit of success in adjusting the initial quantization field -- at around d1 I seem to be getting 2 % more compression or so, but with ~5 % slowdown of the default speed encoder
|
|
|
_wb_
|
2023-04-24 09:43:09
|
is the initial quant field what ends up influencing the ac strategy (block type selection)? are you testing at e7 or at e8/e9 or both?
|
|
|
Jyrki Alakuijala
|
2023-04-24 11:14:10
|
e7 only, it does change ac strategy so I tune it to have roughly similar composition of transforms -- so that I don't see fake improvements of just the system moving to larger transforms
|
|
|
_wb_
|
2023-04-24 12:17:36
|
I mean if the initial quant field is better it might lead to actually better ac strategy choices. If I understand correctly, at e8+ this would be the main advantage of a better initial quant field (since at e8+ the actual quant field will change anyway)
|
|
|
Jyrki Alakuijala
|
2023-04-24 12:21:35
|
it is complex, even the e8 e9 quant fields are constrained by the initial guess
|
|
|
_wb_
|
2023-04-24 12:22:42
|
right, it will not deviate arbitrarily much from the initial guess
|
|
|
Foxtrot
|
2023-04-24 01:16:20
|
can I find somewhere precompiled ssimulacra for windows? thanks
|
|
|
monad
|
|
I think the use cases I described above are very valid and haven't had a good tool to fill that niche. Like, being able to share a cluster of images as one file and having a gallery be able to open it would be very useful for general sharing and social purposes. And lots of games and such could benefit from using the patches feature for any number of kinds of sprites to potentially shrink their size by a lot. And, I mentioned Pixiv, because they're the only ones that I know that use the "multiple images per post" feature that heavily, but I think Flickr and others offer something similar (I haven't checked ArtStation in years, though).
So, I think these features would very much be worth investing in and putting on the roadmap. More *actual* use cases, in theory, means better adoption. And the savings this could bring in those areas, I think, would be worth the extra computational costs. I mean, if people are using AV1, despite its "poor" optimization, then I wouldn't say spending an extra minute or few to export a file like this would be too off-putting.
|
|
2023-04-24 03:40:56
|
Games already save space reducing redundancy, whether within a sprite sheet or using separate images (e.g. one image for body, multiple images for facial features). For this use case, I doubt there would be much gained storing these images as layered JXL over traditional layouts. For human viewing, there is certainly an advantage, but I don't really imagine online image-hosting services taking UGC and converting it to layered JXL automatically. Rather, if some convention is adopted for treating layers as pages, I could imagine websites supporting those images already encoded that way.
|
|
2023-04-24 03:44:02
|
Some games use layered PSDs for 3D textures, and while there are certainly better techniques already, maybe there is some niche for JXL here.
|
|
|
_wb_
|
2023-04-24 03:45:40
|
for games it could be useful as a convenience feature, if the jxl decoder does the recomposition instead of having to write game code to handle that
|
|
|
jonnyawsom3
|
2023-04-24 04:22:52
|
I know recently I got back into NeosVR and was wondering what JXL would add to it, since it already has a pretty robust asset system and heavily uses webp for screenshots or world previews, etc. Unfortunately development has stalled due to a legal battle, otherwise I could've just talked to the developer about it
|
|
|
username
|
2023-04-24 04:26:47
|
something like what is done in second life or the id tech 5 engine could maybe be done?
|
|
2023-04-24 04:27:28
|
where texture assets are stored as images and then converted on the fly to something the gpu can read
|
|
2023-04-24 04:27:53
|
I think second life uses JPEG 2000 and id tech 5 uses JPEG XR
|
|
|
jonnyawsom3
|
2023-04-24 04:31:29
|
Neos uses PNG or WebP as far as I'm aware, depending on the source file. If I hadn't just got in bed I could look, since all 'personal' assets are stored locally
|
|
2023-04-24 04:32:17
|
Thinking about it that also means I could directly test the improvement with JXL instead
|
|
|
username
|
2023-04-24 04:33:48
|
from the sounds of it their system could be massively improved with or without JXL
|
|
|
jonnyawsom3
|
2023-04-24 04:41:00
|
I'm sure I'm misremembering and forgetting a lot, been about a year since I last delved into the technical details
|
|
|
username
|
2023-04-24 04:41:00
|
well maybe not "massively" it depends on how they currently have stuff setup
|
|
|
jonnyawsom3
|
2023-04-24 04:43:25
|
I do know they run it pretty efficiently all things considered, asset deduplication across all users, asset variant management. But considering textures and screenshots are lossless, JXL could probably cut it down by quite a bit
|
|
|
username
|
2023-04-24 04:44:20
|
from light testing it seems like WebP lossless is closer to JXL lossless than it is to PNG
|
|
|
jonnyawsom3
|
2023-04-24 04:44:43
|
Recently someone modded FFMPEG support into it, but it just converts filetypes or edits videos rather than pixel data or such
|
|
2023-04-24 04:45:33
|
As I said, I can do some testing when I'm on the computer again since I have my assets locally
|
|
2023-04-24 04:46:31
|
Considering it allows .blend file import, I can't imagine JXL would be out of the question down the road in one form or another
(Sorry for nerding out)
|
|
|
Jyrki Alakuijala
|
2023-04-24 04:54:54
|
https://github.com/libjxl/libjxl/pull/2422 0.5 % improvement on quality (mostly affects difficult images and around d1)
|
|
|
Traneptora
|
2023-04-24 06:17:15
|
to extract the LF coefficients from the 16x16 forward DCT, do I have to run the "inverse DCT" on the 2x2 LLF coeffs?
|
|
2023-04-24 06:18:15
|
I'm considering switching hydrium from 8x8 to 16x16 blocks and I just want to make sure I get the math behind it
|
|
|
_wb_
|
2023-04-24 06:18:57
|
effectively the LF coefficients are defined so that you get a 1:8 image regardless of what DCT block selection you use
|
|
|
Traneptora
|
2023-04-24 06:19:29
|
so if I take a 16x16 block, the lowest frequency 2x2 block of coefficients would have to be "inverted" back to the LF coeffs, right?
|
|
|
_wb_
|
2023-04-24 06:19:34
|
so for the LF, it doesn't matter if you use only 8x8 blocks or if you use 256x256 blocks, the LF image is the same
|
|
|
Traneptora
|
2023-04-24 06:20:30
|
ye, I'm thinking about the encode-stage here
|
|
|
_wb_
|
2023-04-24 06:20:32
|
encode-side, it boils down to doing IDCT on the lowest freq coeffs after FDCT, yes
|
|
|
Traneptora
|
2023-04-24 06:20:40
|
ah, that's what I wanted to double check, ty
|
|
|
_wb_
|
2023-04-24 06:21:38
|
but you can also just compute a 1:8 image (say as if it's all dct8x8) and throw away the lowest freq coeffs after FDCT
|
|
2023-04-24 06:22:02
|
(they are skipped in the HF encoding anyway)
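To make the relationship concrete, here is a naive sketch of that encode-side step (illustrative only: it uses a plain orthonormal DCT-II rather than the scaled fast transforms the codec actually uses, and the function names are invented). It forward-transforms an NxN block, keeps the lowest (N/8)x(N/8) coefficients, and inverse-transforms those to get the 1:8 LF samples:

```python
# Sketch of encode-side LF extraction: FDCT the block, keep the lowest
# (N/8)x(N/8) coefficients, IDCT those to get the 1:8 LF samples.
# Naive orthonormal DCT-II, not the codec's scaled fast transforms.
import math

def dct2d(block, inverse=False):
    """Naive 2D orthonormal DCT-II (or its inverse) of a square block."""
    n = len(block)
    def c(u): return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    if inverse:
                        s += c(x) * c(y) * block[x][y] * \
                             math.cos((2 * u + 1) * x * math.pi / (2 * n)) * \
                             math.cos((2 * v + 1) * y * math.pi / (2 * n))
                    else:
                        s += block[x][y] * \
                             math.cos((2 * x + 1) * u * math.pi / (2 * n)) * \
                             math.cos((2 * y + 1) * v * math.pi / (2 * n))
            out[u][v] = s if inverse else c(u) * c(v) * s
    return out

def lf_from_block(block):
    """1:8 LF samples of an NxN block via IDCT of the lowest-freq coefficients."""
    n = len(block)
    m = n // 8                       # 1:8 downscale -> m*m LF samples
    coeffs = dct2d(block)
    low = [[coeffs[u][v] for v in range(m)] for u in range(m)]
    lf = dct2d(low, inverse=True)
    scale = m / n                    # renormalize between transform sizes
    return [[lf[i][j] * scale for j in range(m)] for i in range(m)]
```

With this normalization an 8x8 block reduces to a single LF sample equal to the block mean, and a 16x16 block yields a 2x2 LF patch, consistent with the point that the LF image is the same regardless of block selection.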
|
|
|
Traneptora
|
2023-04-24 06:22:09
|
I'm wondering if there's value in me doing multiple modes for hydrium where I can have it be "ultra-low-memory" mode with one frame per 256x256 group (what it does now)
|
|
2023-04-24 06:22:30
|
and have a "shift" option
|
|
2023-04-24 06:22:37
|
for 2x2, 4x4, and 8x8 groups per LF group
|
|
|
_wb_
|
2023-04-24 06:22:59
|
one frame per group is just to avoid buffering output bitstream or seeking in the output to write the TOC, right?
|
|
|
Traneptora
|
2023-04-24 06:23:15
|
yea, with it I get ~2-3MB max allocated by the library
|
|
2023-04-24 06:23:19
|
plus whatever from the PNG decoder in the frontend
|
|
2023-04-24 06:23:36
|
I have to buffer the full Frame but that's it
|
|
2023-04-24 06:24:19
|
I use more memory with ANS than prefix codes but I get a better ratio, for the HF coeffs
|
|
|
_wb_
|
2023-04-24 06:25:42
|
I would consider buffering 2048x2048 input at a time and buffering output bitstream (or alternatively write to output stream and seek once at the end to write the toc)
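The seek-back approach can be sketched like this (a deliberately simplified fixed-size table: the real JXL TOC is a variable-length encoding, so an actual encoder would have to reserve a worst-case size or buffer the bitstream instead):

```python
# Sketch of "write sections first, seek back once to fill in the TOC".
# Simplified: a fixed-size table of 32-bit section sizes, NOT the real
# variable-length JXL TOC encoding.
import io
import struct

def write_with_toc(out, sections, max_sections=16):
    toc_pos = out.tell()
    out.write(b"\x00" * (4 * max_sections))    # placeholder TOC
    sizes = []
    for data in sections:
        out.write(data)                        # stream each section out
        sizes.append(len(data))
    end = out.tell()
    out.seek(toc_pos)
    for s in sizes:                            # one seek, fill in real sizes
        out.write(struct.pack("<I", s))
    out.seek(end)

buf = io.BytesIO()
write_with_toc(buf, [b"abc", b"de"])
```

The trade-off discussed above is exactly this: seeking avoids buffering the compressed output, but it rules out purely non-seekable sinks like pipes.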
|
|
|
Traneptora
|
2023-04-24 06:25:42
|
ultra-low-memory is nice but if you can spare 10 MiB instead of 2 MiB I'd think you'd prefer better ratio or faster if you actually have the RAM to spare, which is why I'm considering a different codepath there
|
|
2023-04-24 06:25:55
|
well I was thinking of making it configurable
|
|
|
_wb_
|
2023-04-24 06:26:24
|
buffering a full uncompressed frame is a bigger memory requirement than buffering output bitstream
|
|
|
Traneptora
|
2023-04-24 06:26:25
|
like, by default it does 1x1 groups per Frame, but I could have an option to let it to 2x2, 4x4, or 8x8 groups per frame
|
|
2023-04-24 06:26:40
|
ye, the idea is that you wouldn't have to buffer the full uncompressed frame
|
|
2023-04-24 06:27:16
|
for example, massive massive PNGs can currently be encoded by hydrium as it progressively reads 256 scanlines, encodes them, and then repeats
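That strip-based pattern can be sketched generically like this (`read_rows` and `encode_strip` are stand-in callbacks, not spng or hydrium APIs):

```python
# Generic sketch of strip-based streaming: pull 256 scanlines at a time,
# hand each strip to the encoder, never hold the whole image in memory.
# `read_rows` / `encode_strip` are hypothetical stand-ins for a progressive
# PNG decoder and a group-at-a-time encoder.

STRIP_ROWS = 256

def iter_strips(read_rows, height):
    """Yield (first_row, rows) strips of at most STRIP_ROWS scanlines."""
    y = 0
    while y < height:
        n = min(STRIP_ROWS, height - y)
        yield y, [read_rows(y + i) for i in range(n)]
        y += n

def encode_streaming(read_rows, height, encode_strip):
    for first_row, rows in iter_strips(read_rows, height):
        encode_strip(first_row, rows)
```

Peak memory is then bounded by one strip (plus encoder state), which is what makes very large PNGs feasible in a few MB.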
|
|
|
_wb_
|
2023-04-24 06:27:20
|
libpng can decode row by row so you could do strips of 256 rows at a time
|
|
|
Traneptora
|
2023-04-24 06:27:28
|
I currently do that in the frontend with spng
|
|
2023-04-24 06:27:38
|
libpng is weird, it uses setjmp which I hate
|
|
2023-04-24 06:27:45
|
and it's kind of nonportable
|
|
2023-04-24 06:27:48
|
spng is portable C
|
|
2023-04-24 06:28:15
|
it's like lodepng but it can do progressive, which lodepng cannot do as far as I'm aware
|
|
2023-04-24 06:28:37
|
if lode adds progressive decoding to lodepng I'd probably switch back as lodepng is faster
|
|
|
_wb_
|
2023-04-24 06:28:47
|
it would be nice to be able to encode a frame as a single frame though, doing it as many layers feels a bit like a hack and it feels semantically a bit different to me (it's a layered image, not a single-layer image)
|
|
|
Traneptora
|
2023-04-24 06:29:05
|
it is, but that doesn't work for non-seekable output
|
|
2023-04-24 06:29:37
|
and yea it is kind of a hack
|
|
2023-04-24 06:30:30
|
my to-do list involves
(1) add larger frame sizes as a configurable option
(2) add one-frame mode, which writes to seekable output
|
|
2023-04-24 06:30:53
|
I'm not entirely sure how to do that in a portable way without reallocs, since mmap is a posix-specific thing iirc
|
|
2023-04-24 06:38:01
|
tho it appears windows has it
|
|
2023-04-24 06:38:08
|
it's just MapViewOfFile
|
|
|
OkyDooky
|
2023-04-24 08:38:03
|
It sounds like I wasn't too clear on the actual benefits I was thinking of.
For games, I was specifically referring to ones that use a ton of variations of existing assets, like VNs. In theory, you would only have to save the base asset and then all variations could be reduced by up to or above 90% of their original size, either because of using patches (useful for converting existing assets) or using layers to composite specific variations onto the base layer.
For websites, I was thinking of sites like Pixiv (and any others that do multiple images per post) using patches to save space on any redundant elements or similar images in a post.
For the "use JXL as a single-file album," I was thinking more of sharing between devices than websites. Sites could use that feature, too. But, the main convenience is in sharing between people, either for work or casually.
(<@263300458888691714>)
|
|
|
w
|
2023-04-24 08:45:38
|
that sounds very complicated and inconvenient
|
|
|
Traneptora
|
2023-04-24 09:56:01
|
Anyway before I start breaking the API (which I can do because 0.X semver) here's another release of hydrium:
|
|
2023-04-24 09:56:02
|
https://github.com/Traneptora/hydrium/releases/tag/v0.2.3
|
|
|
spider-mario
|
|
Traneptora
it's just MapViewOfFile
|
|
2023-04-24 10:05:52
|
we use it to read the ICC profile in the GUI tools, perhaps a little overkill https://github.com/libjxl/libjxl/blob/f8c76c0adebff3c23acb28b782105e6538010750/tools/icc_detect/icc_detect_win32.cc#L40-L62
|
|
2023-04-24 10:07:11
|
```c++
LARGE_INTEGER profile_size;
```
|
|
2023-04-24 10:07:33
|
let's hope not
|
|
|
monad
|
|
It sounds like I wasn't too clear on the actual benefits I was thinking of.
For games, I was specifically referring to ones that use a ton of variations of existing assets, like VNs. In theory, you would only have to save the base asset and then all variations could be reduced by up to or above 90% of their original size, either because of using patches (useful for converting existing assets) or using layers to composite specific variations onto the base layer.
For websites, I was thinking of sites like Pixiv (and any others that do multiple images per post) using patches to save space on any redundant elements or similar images in a post.
For the "use JXL as a single-file album," I was thinking more of sharing between devices than websites. Sites could use that feature, too. But, the main convenience is in sharing between people, either for work or casually.
(<@263300458888691714>)
|
|
2023-04-25 08:04:00
|
Is this not like what you have in mind? It's common to reduce asset weight this way. Outside games, I imagine JXL patches being practical at this scale when directly supported by image editing software, so artistic tools can inform the encoding process. Writing software to detect redundancy with the sophistication of human judgment would be challenging™.
|
|
|
Jyrki Alakuijala
|
|
Traneptora
I'm considering switching hydrium from 8x8 to 16x16 blocks and I just want to make sure I get the math behind it
|
|
2023-04-25 08:21:40
|
I get more benefit from a mix of 8x8 and 16x8s than a mix of 8x8 and 16x16 -- if you only choose one transform, perhaps choosing 16x8 and 8x16 could be more interesting than 16x16 for quality
|
|
|
Traneptora
|
2023-04-25 08:33:57
|
huh
|
|
2023-04-25 08:34:39
|
although 16x8 and 8x16 together are two transform types
|
|
2023-04-25 08:34:44
|
unless you meant one or the other only
|
|
|
Jyrki Alakuijala
|
2023-04-25 08:44:32
|
they are two
|
|
2023-04-25 08:44:50
|
but there is some symmetry between them
|
|
2023-04-25 08:45:06
|
in libjxl-tiny we use 8x8, 8x16, 16x8
|
|
|
jonnyawsom3
|
|
Thinking about it that also means I could directly test the improvement with JXL instead
|
|
2023-04-25 10:52:38
|
Varied from 6:1 to only one or two MB on a 50MB file, I did discover they apparently convert most/all imported images into PNG since I found a jpeg I had uploaded sitting there as a 60MB PNG. The WebP screenshots didn't seem to compress very well, but I was only using -e 4 due to the size of the files so it probably could've done better
|
|
|
Foxtrot
|
2023-04-25 01:48:36
|
just thinking, why is avif better in high compression/low quality than jxl? is it just because they focused more on optimizing this in development or is there some fundamental difference in architecture that enables avif to be better?
|
|
|
jonnyawsom3
|
2023-04-25 01:51:18
|
I'd say a bit of both, avif is based on a video codec, so it's made to run fast and output the smallest files it can
|
|
|
Jyrki Alakuijala
|
2023-04-25 03:37:01
|
Before/After for https://github.com/libjxl/libjxl/pull/2424
|
|
|
Foxtrot
just thinking, why is avif better in high compression/low quality than jxl? is it just because they focused more on optimizing this in development or is there some fundamental difference in architecture that enables avif to be better?
|
|
2023-04-25 03:41:41
|
Fundamental difference: AVIF is using prediction and aggressive axis-separable filtering, JPEG XL more on context modeling and more refined filtering ('non-axis-separable'). Prediction is better if you need to do low quality, context modeling is better for high quality.
|
|
|
Foxtrot
|
2023-04-25 03:50:33
|
i am just looking at a low quality image and it really looks like when there is some sharp line or long edge avif can do it really well
|
|
|
Jyrki Alakuijala
|
2023-04-25 03:51:18
|
they have a mode where they can use 8 colors in a locality
|
|
2023-04-25 03:51:28
|
if there is 9 colors there, too bad, something will happen
|
|
2023-04-25 03:51:41
|
'palette transform'
|
|
|
Foxtrot
|
2023-04-25 03:51:42
|
thanks for explanation
|
|
|
Jyrki Alakuijala
|
2023-04-25 03:52:08
|
that improves graphics quite a bit, but you can almost never use that transform in medium/high quality
|
|
2023-04-25 03:52:26
|
so the improvement is for limited scope
|
|
2023-04-25 03:52:45
|
in video it doesn't matter much -- you get it approximately right at first, then refine it in further frames
|
|
2023-04-25 03:53:29
|
in photos such approaches do not work in my opinion and I didn't include such in JPEG XL
|
|
2023-04-25 03:53:44
|
we tried three different ways to do linear prediction
|
|
2023-04-25 03:54:02
|
it never gave any gains in mid/high quality, just slowed down things
|
|
2023-04-25 03:54:34
|
the palette transform we didn't try
|
|
2023-04-25 03:55:11
|
but since there is a nice palette mode it might be possible to be done by layering a palette and vardct modes
|
|
2023-04-25 03:55:31
|
but it is difficult to do it right so that it helps at mid/high quality
|
|
|
Foxtrot
|
2023-04-25 03:56:10
|
i agree that it's only good for low quality, is it possible to only apply it on low quality where it helps?
|
|
|
Jyrki Alakuijala
|
2023-04-25 03:59:24
|
it requires some effort and will have some long-tail of maintenance to it -- possibly new encoders for lower quality will emerge once the ecosystem matures
|
|
|
lithium
|
|
Jyrki Alakuijala
Before/After for https://github.com/libjxl/libjxl/pull/2424
|
|
2023-04-25 04:03:20
|
That quality improvement is amazing!
Thank you for your hard work
|
|
|
Jyrki Alakuijala
|
2023-04-25 04:03:53
|
I was quite surprised to see 1+ % from such a simple thing -- all these improvements are driven by the community coming up with examples where libjxl doesn't do very well
|
|
2023-04-25 04:12:51
|
I still consider that libjxl is failing on the colored chessboard test image -- some more work to do
|
|
|
_wb_
|
|
Foxtrot
i am just looking on low quality image and it really looks like when there is some sharp line or long edge avif can do it really good
|
|
2023-04-25 04:13:16
|
Directional prediction helps for that, if you don't mind that angles get rounded to the nearest available direction mode
|
|
|
Jyrki Alakuijala
|
2023-04-25 04:13:50
|
it is not the directional prediction as much as it is the palette mode -- you can turn these features on and off one by one
|
|
2023-04-25 04:15:00
|
this is d1 with the chessboard patterns -- some tiles still have pretty bad low frequency artefacts
|
|
2023-04-25 04:16:26
|
ok, it doesn't look like that -- discord spoiled it for us
|
|
|
lithium
|
|
Jyrki Alakuijala
it is not the directional prediction as much as it is the palette mode -- you can turn these features on and off one by one
|
|
2023-04-25 04:31:42
|
I remember getting some tiny artifact issues in smooth areas of non-photo images (-d 1.0~0.5).
After this AC strategy improvement PR,
can we get more balanced quality at lower target distances,
similar to WebP near-lossless or pik quality?
|
|
2023-04-25 04:37:56
|
libavif can disable palette prediction:
> -a enable-palette=0
|
|
|
Fraetor
|
|
It sounds like I wasn't too clear on the actual benefits I was thinking of.
For games, I was specifically referring to ones that use a ton of variations of existing assets, like VNs. In theory, you would only have to save the base asset and then all variations could be reduced up to or above 90% their original size, either because of using patches (useful for converting existing assets) or using layers to composite specific variations onto the base layer.
For websites, I was thinking of sites like Pixiv (and any others that do multiple images per post) using patches to save space on any redundant elements or similar images in a post.
For the "use JXL as a single-file album," I was thinking more of sharing between devices than websites. Sites could use that feature, too. But, the main convenience is in sharing between people, either for work or casually.
(<@263300458888691714>)
|
|
2023-04-25 04:43:59
|
Multiple frames as a way of creating an album is IIRC already in the spec. Specifically, if the frame duration on animated frames is the maximum value, it means that the intent is for viewers to provide controls for the users to navigate. Though no viewer implements it so far...
|
|
|
monad
|
2023-04-25 09:32:34
|
It's not part of the spec, but it's been suggested here that such a convention could emerge.
|
|
|
Traneptora
|
2023-04-25 09:47:53
|
this one comes to mind
|
|
|
OkyDooky
|
2023-04-25 10:47:50
|
Yes, that is what I had in mind. I've actually only unpacked assets from a few VNs (since I don't play too many) and I don't remember them using that structure. Plus, all the other CG I've seen is hosted on 3rd party gallery sites, so now I'm assuming they composited the final assets before uploading them. Using JXL here would still improve both total file size and convenience, I think.
For said sites, I think using patches at any stage of development (after a stable implementation) would yield practical benefits, even if they erred on the side of preserving detail accuracy over maximum compression. But, I agree that JXL needs to be supported more by software workflows than just by an encoder, especially since it is designed for that.
(<@263300458888691714>)
|
|
2023-04-25 10:54:16
|
I think it could have a use and wouldn't be too unreasonable to implement as a practice (eg for gallery apps and the general sharing/usage process), seeing as "motion photos" are a thing and I still don't quite understand why. Lol
I'm just psyched that my ideas for a whole new format likely won't actually require a whole new format (at least, in addition to JPEG XL).
Thanks for being patient with me on this.
(<@263300458888691714>)
|
|
|
_wb_
|
|
monad
It's not part of the spec, but it's been suggested here that such a convention could emerge.
|
|
2023-04-26 05:03:47
|
We're adding it to the spec in the 2nd edition
|
|
|
DZgas ะ
|
2023-04-26 05:34:55
|
what is the maximum frame duration during animation?
|
|
|
_wb_
|
2023-04-26 05:57:26
|
The maximum duration is 2^32 - 1 ticks
|
|
2023-04-26 05:59:31
|
The maximum tick duration would be 1024 seconds (tps_numerator = 1, tps_denominator = 1024, so 1/1024 ticks per second, i.e. 1024 seconds per tick)
|
|
2023-04-26 06:01:04
|
so the maximum frame duration of a jxl animation is a bit over 139365 years
|
|
2023-04-26 06:02:58
|
it used to be ~139365.68402 years, but with this spec change it becomes ~139365.68398 years because the maximum duration is now reserved to mean "multi-page" instead of "animation"
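The arithmetic above checks out; a quick sketch (taking a year as 365.25 days):

```python
# Maximum JXL frame duration, using the limits discussed above.
MAX_TICKS = 2**32 - 1        # duration field is a 32-bit tick count
MAX_TICK_SECONDS = 1024      # tps_numerator = 1, tps_denominator = 1024
SECONDS_PER_YEAR = 365.25 * 86400

max_duration_years = MAX_TICKS * MAX_TICK_SECONDS / SECONDS_PER_YEAR
print(int(max_duration_years))  # → 139365 (and change)
```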
|
|
|
DZgas ะ
|
2023-04-26 06:05:04
|
And why weren't restrictions necessary? Such tick values make it possible to hide an image inside a JPEG XL file, deep behind the timer
|
|
2023-04-26 06:07:54
|
I would even say that it gives the ability to make a gigabyte collection of anything. It's also a big bomb. When you create a giant image, the animation will show the first frame, which is displayed immediately. But there can be several gigabytes of hidden images inside the file, which can use up anyone's traffic limit
|
|
2023-04-26 06:09:58
|
maybe the browsers themselves will have to limit the loading of JPEG XL if they see that it is an "animation" larger than a certain size
|
|
2023-04-26 06:11:36
|
I don't know how it's done with APNG or GIFs. But in Telegram, GIFs (MP4) bigger than 10 megabytes or with a height of more than 1024 pixels stop playing and are treated by the server as video files
|
|
|
monad
|
|
_wb_
We're adding it to the spec in the 2nd edition
|
|
2023-04-26 06:57:14
|
Oh, sweet. Latest version I had was from last October.
|
|
|
yoochan
|
|
_wb_
it used to be ~139365.68402 years, but with this spec change it becomes ~139365.68398 years because the maximum duration is now reserved to mean "multi-page" instead of "animation"
|
|
2023-04-26 07:29:19
|
you chose the highest possible duration for multi-page; does a duration of 0 (zero) have a special meaning?
|
|
|
Tirr
|
2023-04-26 07:58:43
|
zero-tick frame means "blend this frame with the next one"
|
|
2023-04-26 08:05:38
|
technically, this means zero-tick frames are not shown to the end user at all
|
|
2023-04-26 08:05:55
|
(unless it's the last frame)
|
|
|
lithium
|
|
lithium
I remember getting some tiny artifact issues in smooth areas of non-photo images (-d 1.0~0.5).
After this AC strategy improvement PR,
can we probably get more balanced quality at lower target distances,
similar to WebP near-lossless or pik quality?
|
|
2023-04-26 08:43:58
|
oh, I get it: those quality improvement PRs are for difficult images, like DZgas's example images and the chessboard images,
so those difficult areas can also benefit from this improvement.
|
|
|
_wb_
|
2023-04-26 08:58:56
|
yes, zero-duration has no special meaning except that it literally means zero duration, decoders have to immediately go to the next frame and not even paint the intermediate canvas state. A layered still image is the same thing as an animation where all frames have zero duration (well except it isn't signaled to be an animation so there is no tick duration signaled and no loopcount and viewers will hopefully not treat it like an animation)
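The display rule can be sketched in a few lines (hypothetical viewer logic, not the libjxl API):

```python
# Zero-duration frames are blended into the canvas but never shown
# on their own; only the last frame is always presented.

def frames_to_display(durations):
    """Indices of frames a viewer would actually present."""
    shown = []
    for i, duration in enumerate(durations):
        is_last = (i == len(durations) - 1)
        if duration > 0 or is_last:
            shown.append(i)
    return shown

# A layered still image: three zero-duration layers, one visible result.
print(frames_to_display([0, 0, 0]))   # → [2]
# A normal animation: every frame is shown.
print(frames_to_display([100, 100]))  # → [0, 1]
```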
|
|
|
Jyrki Alakuijala
|
|
_wb_
so the maximum frame duration of a jxl animation is a bit over 139365 years
|
|
2023-04-26 10:14:50
|
if this becomes a problem there is sufficient time to fix it in a further edition
|
|
|
lithium
I remember getting some tiny artifact issues in smooth areas of non-photo images (-d 1.0~0.5).
After this AC strategy improvement PR,
can we probably get more balanced quality at lower target distances,
similar to WebP near-lossless or pik quality?
|
|
2023-04-26 10:23:11
|
this improvement is not a dramatic one -- consider that it will fix 5 % of the remaining problems, not all. Still more work to do; luckily I still have ideas
|
|
2023-04-26 10:32:18
|
next up is to fix the colorful chessboard more effectively
|
|
2023-04-26 10:35:30
|
I'm thinking of comparing the sum of absolute error in the dct to the sum of absolute values -- bucketed in 2-4 frequency bands, and if their ratio is too high in any frequency band, then ramp up the quant -- it works very well in my thinking, but let's see if the code/benchmark agrees
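That heuristic might look roughly like this (band boundaries, threshold, and the boost factor are made-up illustration values, not libjxl code):

```python
# Compare per-band sums of |quantization error| against sums of
# |DCT coefficient|; if any band's ratio is too high, ramp up the
# quant field (i.e. quantize more finely) for that block.

def band_of(u, v):
    """Bucket an 8x8 DCT coefficient into one of 3 frequency bands."""
    f = u + v
    return 0 if f <= 2 else (1 if f <= 7 else 2)

def quant_boost(coeffs, errors, threshold=0.25):
    """coeffs/errors: {(u, v): value} for one block; returns a multiplier."""
    abs_val = [0.0] * 3
    abs_err = [0.0] * 3
    for (u, v), c in coeffs.items():
        b = band_of(u, v)
        abs_val[b] += abs(c)
        abs_err[b] += abs(errors[(u, v)])
    worst = max(e / v for e, v in zip(abs_err, abs_val) if v > 0)
    return 2.0 if worst > threshold else 1.0

# High relative error in the high-frequency band triggers a boost.
print(quant_boost({(0, 0): 100.0, (7, 7): 10.0},
                  {(0, 0): 1.0, (7, 7): 8.0}))  # → 2.0
```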
|
|
|
_wb_
|
2023-04-26 11:00:24
|
such a heuristic could eventually also be useful to decide to use 'palette blocks' instead (i.e. modular patches)
|
|
|
derberg๐
|
|
_wb_
so the maximum frame duration of a jxl animation is a bit over 139365 years
|
|
2023-04-26 12:33:12
|
Now if we had some more control in the format, e.g. starting frame depending on the opening time of the file, some silly but interesting things could be done <:KekDog:805390049033191445>
|
|
2023-04-26 12:34:42
|
"Today is not the first of April, come back to this file when it is that day"
|
|
|
Traneptora
|
2023-04-26 03:47:54
|
oh god
|
|
2023-04-26 03:47:59
|
I'm very glad there's no scripting in JXL
|
|
2023-04-26 03:48:36
|
Out of curiosity, I have an image that has clear DCT-style artifacts
|
|
2023-04-26 03:49:41
|
it looks like it's been JPEGed before at a low quality, but now it's a lossy WebP (or maybe the artifacts were introduced in lossy webp? who knows)
|
|
2023-04-26 03:50:46
|
does anyone have any suggestion on how to remove DCT-style artifacts caused by improperly quantized HF coefficients?
|
|
2023-04-26 03:50:50
|
without just blurring the image
|
|
2023-04-26 03:51:23
|
this is the image upscaled with nearest-neighbor (yes it's a meme emoji)
|
|
2023-04-26 03:52:20
|
in particular in this region there's vertical DCT artifacts
|
|
|
Jyrki Alakuijala
|
|
Traneptora
does anyone have any suggestion on how to remove DCT-style artifacts caused by improperly quantized HF coefficients?
|
|
2023-04-26 03:52:50
|
if it is a single image, fix it in gimp
|
|
|
|
veluca
|
2023-04-26 03:53:14
|
write some code to ask Stable Diffusion to do it for you
|
|
|
Traneptora
|
2023-04-26 03:53:14
|
I suppose that's probably the best solution
|
|
|
veluca
write some code to ask Stable Diffusion to do it for you
|
|
2023-04-26 03:53:34
|
there's already NN-tools like Waifu2x to handle that, but they tend to overblur the image
|
|
|
Jyrki Alakuijala
|
2023-04-26 03:53:54
|
stable diffusion will make the fox a wolf?
|
|
|
|
veluca
|
2023-04-26 03:53:57
|
but that's for mostly-realtime upscaling of anime-ish content, no? pretty different
|
|
|
Traneptora
|
2023-04-26 03:54:24
|
it does both
|
|
2023-04-26 03:54:33
|
the model attempts to reconstruct what the original vector-based image would have looked like
|
|
2023-04-26 03:54:45
|
which can be applied both to upscaling and to removing JPEG ringing
|
|
2023-04-26 03:54:54
|
it's surprisingly effective at that too, if the image is actually lineart
|
|
2023-04-26 03:55:53
|
https://cdn.discordapp.com/attachments/235844249201934337/860389977095143464/temp.jpg
|
|
2023-04-26 03:56:03
|
Here's one example, it's a JPEG compressed at low quality
|
|
2023-04-26 03:56:39
|
here's what happens when you tell it to "denoise but not scale"
|
|
2023-04-26 03:57:43
|
view the diff here: https://traneptora.com/Orange-Diff/?imagea=https%3A%2F%2Fcdn.discordapp.com%2Fattachments%2F235844249201934337%2F860389977095143464%2Ftemp.jpg&imageb=https%3A%2F%2Fcdn.discordapp.com%2Fattachments%2F235844249201934337%2F860390218443915274%2Ftemp.png
|
|
|
lithium
|
2023-04-26 04:08:40
|
~~I think `jpeg-quantsmooth` or `jpeg2png` probably can handle this?~~
If the original format isn't JPEG, I guess FBCNN can probably handle this?
> https://github.com/jiaxi-jiang/FBCNN
|
|
|
monad
|
2023-04-26 04:12:37
|
It is not a JPEG (I guess it could be converted, but doubt results would be great ...)
|
|
|
jonnyawsom3
|
|
veluca
but that's for mostly-realtime upscaling of anime-ish content, no? pretty different
|
|
2023-04-26 04:13:43
|
Not sure I'd call 2 minutes a frame realtime, but it certainly works well on anime and most art in my case
|
|
|
monad
|
2023-04-26 04:20:57
|
I sometimes use ESRGAN+(pick the most appealing model), even though I kinda hate the general loss of integrity from ML "restoration" methods
|
|
|
lithium
|
|
monad
I sometimes use ESRGAN+(pick the most appealing model), even though I kinda hate the general loss of integrity from ML "restoration" methods
|
|
2023-04-26 04:40:25
|
Could you tell me which ESRGAN model is more suitable for slight denoising?
I don't want to lose too much detail.
The material is Japanese-style manga and anime, JPEG quality 75~80 and 80~90.
|
|
|
w
|
2023-04-26 04:47:05
|
just use dpir
|
|
|
lithium
|
2023-04-26 04:54:12
|
Cool, I like this DPIR parameter `noise_level_img`
|
|
|
w
just use dpir
|
|
2023-04-26 04:58:49
|
Thank you for your help. I will try this later.
|
|
|
w
|
2023-04-26 04:59:09
|
use dpir deblock
|
|
2023-04-26 05:00:46
|
but i was referring to the greater conversation
|
|
|
monad
|
|
lithium
Could you tell me which ESRGAN model is more suitable for slight denoising?
I don't want to lose too much detail.
The material is Japanese-style manga and anime, JPEG quality 75~80 and 80~90.
|
|
2023-04-26 07:26:31
|
In my experience, Alsa's models are best for integrity, but you can still expect slight color shift.
|
|
|
lithium
|
|
monad
In my experience, Alsa's models are best for integrity, but you can still expect slight color shift.
|
|
2023-04-27 10:24:24
|
I understand, Thank you very much.
|
|
|
Traneptora
|
2023-04-27 10:27:25
|
Well that's odd
|
|
2023-04-27 10:27:37
|
I just finished tweaking hydrium to have configurable LF Group sizes
|
|
2023-04-27 10:28:03
|
so you can set each LF Group to be 1, 2, 4, or 8 Groups in each direction
|
|
2023-04-27 10:28:27
|
apparently using 1 group per Frame is producing *smaller* files than using one Frame for the whole image
|
|
2023-04-27 10:28:30
|
but approximately identical
|
|
2023-04-27 10:28:37
|
very very similar though
|
|
2023-04-27 10:28:50
|
|
|
2023-04-27 10:29:43
|
george-insect1.jxl has 256x256 Frames, with one Group/LF Group per Frame, and is ~256 KiB
|
|
2023-04-27 10:29:58
|
george-insect3.jxl has one Frame for the whole 1080x720 image, and is ~258 KiB
|
|
2023-04-27 10:30:13
|
both decode to identical pixel data
|
|
2023-04-27 10:30:26
|
Why would this be?
|
|
2023-04-27 10:38:13
|
(though libjxl decodes the one-frame version about 4x faster)
|
|
|
|
veluca
|
2023-04-28 08:38:54
|
probably this ends up using per-group histograms
|
|
|
Traneptora
|
|
veluca
probably this ends up using per-group histograms
|
|
2023-04-28 10:35:49
|
hm?
|
|
|
|
veluca
|
2023-04-28 10:36:54
|
For the entropy coding, by default you use one set of histograms per frame
|
|
|
Traneptora
|
2023-04-28 10:37:15
|
That's true
|
|
2023-04-28 10:37:21
|
it's in the HF Global section
|
|
2023-04-28 10:37:43
|
are you saying that with one group per frame, I get more accurate histograms, reducing the filesize?
|
|
|
veluca
For the entropy coding, by default you use one set of histograms per frame
|
|
2023-04-28 11:26:09
|
is there any way to leverage per-group histograms without doing one-group-per-frame?
|
|
|
|
veluca
|
|
Traneptora
are you saying that with one group per frame, I get more accurate histograms, reducing the filesize?
|
|
2023-04-28 11:26:59
|
Depends on the image, in some cases the histograms are similar enough
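The trade-off is easy to see with plain Shannon entropy: when two groups have different symbol statistics, separate histograms cost fewer bits than one shared histogram (a toy illustration of the principle, ignoring the cost of signaling the histograms themselves, and not the actual ANS coding):

```python
from collections import Counter
from math import log2

def ideal_bits(symbols, hist=None):
    """Ideal code length of `symbols` under `hist` (default: its own)."""
    hist = hist or Counter(symbols)
    total = sum(hist.values())
    return sum(-log2(hist[s] / total) for s in symbols)

group_a = [0] * 90 + [1] * 10   # mostly zeros
group_b = [1] * 90 + [0] * 10   # mostly ones

shared = Counter(group_a + group_b)
shared_cost = ideal_bits(group_a, shared) + ideal_bits(group_b, shared)
separate_cost = ideal_bits(group_a) + ideal_bits(group_b)

print(shared_cost > separate_cost)  # → True, by a wide margin here
```

In practice each extra histogram must itself be signaled, which is why per-group histograms aren't always a win.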
|
|
|
Traneptora
is there any way to leverage per-group histograms without doing one-group-per-frame?
|
|
2023-04-28 11:27:32
|
There are ways, but I never managed to get it to work well enough to be worth the compute; never tried too hard though
|
|
|
Traneptora
|
2023-04-28 11:28:01
|
it just feels weird that one-group-per-frame gives me better coding efficiency than an entire lf group
|
|
2023-04-28 11:28:11
|
I feel like that shouldn't happen
|
|
2023-04-28 11:28:53
|
also I got libhydrium down to 1.5 MB of heap memory <:poggies:853085814032302122>
|
|
2023-04-28 11:29:08
|
for a typical image
|
|
2023-04-28 02:19:45
|
and now it's down to <1 MB <:poggies:853085814032302122>
|
|
2023-04-28 02:47:27
|
I need the front-end to accept PPM input so I can really profile it
|
|
2023-04-28 02:47:39
|
since atm I have to decode a PNG each test run
|
|
2023-04-28 02:51:46
|
```
$ time build/hydrium ants.png ants1.jxl
libhydrium version 0.3.0
Total libhydrium heap memory: 11607068 bytes
Max libhydrium heap memory: 1012268 bytes
real 0m0.547s
user 0m0.533s
sys 0m0.010s
```
|
|
2023-04-28 02:51:51
|
not bad for 1 MB of heap memory
|
|
|
yoochan
|
2023-04-29 10:41:29
|
I was thinking, couldn't Chrome rely on WIC or gdk-pixbuf to display images? Aren't those libraries exactly meant to delegate the image decoding process?
|
|
|
spider-mario
|
2023-04-29 10:56:41
|
I doubt that they would want to rely on system libraries for that
|
|
|
_wb_
|
2023-04-29 12:43:33
|
Also I suppose most frameworks don't deal with streaming decoding / progressive rendering
|
|
2023-04-29 12:44:13
|
And also HDR would probably get tricky
|
|
|
Eugene Vert
|
2023-05-01 06:21:53
|
While implementing noise in jxl-oxide, I came across an interesting bug. Conformance test `noise.jxl` looks like `ref.png` in libjxl (qt-jpegxl-image-plugin), but the noise pattern is different with `djxl noise.jxl out.png`
UPD: It turned out that I was using an outdated Qt plugin from kf5-kimageformats, and `ref.png` was also outdated.
|
|
2023-05-01 06:22:25
|
(libjxl / djxl)
|
|
|
Traneptora
|
2023-05-01 08:00:19
|
different in libjxl vs djxl?
|
|
2023-05-01 08:00:29
|
huh
|
|
|
Eugene Vert
|
2023-05-01 08:03:57
|
Yep, very strange, considering that djxl uses the libjxl API
|
|
|
Traneptora
|
2023-05-01 08:04:29
|
how different?
|
|
2023-05-01 08:04:41
|
might be thread related
|
|
2023-05-01 08:05:01
|
what if you use numthreads 1
|
|
|
Eugene Vert
|
2023-05-01 08:07:33
|
TBH, i don't know how to force qt-jpegxl-image-plugin to use only one thread
|
|
2023-05-01 08:08:17
|
But strangely enough, my implementation seems to match the djxl version.
|
|
|
Traneptora
|
2023-05-01 08:08:36
|
compare it to jxlatte
|
|
|
Eugene Vert
|
2023-05-01 08:10:01
|
The output looks like neither :c
|
|
|
Traneptora
|
2023-05-01 08:13:02
|
huh
|
|
2023-05-01 08:18:23
|
I'd open an issue
|
|
|
Eugene Vert
|
2023-05-01 08:19:02
|
https://github.com/libjxl/libjxl/issues/2447
|
|
|
_wb_
|
2023-05-02 07:11:28
|
Could it be caused by orientation?
|
|
2023-05-02 07:12:22
|
Or maybe pixel callback not doing the same thing as full buffer
|
|
|
Jyrki Alakuijala
|
2023-05-02 09:59:20
|
djxl looks better than libjxl -- the noise boundary seems to have more 'blue noise' characteristics there
|
|
2023-05-02 09:59:54
|
possibly the noise resynthesis is seeded wrong in libjxl use (x position)
|
|
|
Traneptora
|
2023-05-02 09:27:37
|
hydrium can now encode a single frame! it uses way more memory though
|
|
2023-05-02 09:27:46
|
as I have to buffer the entire output codestream
|
|
2023-05-02 09:28:14
|
does shave off about 10% of the filesize though, which is really nice
|
|
2023-05-02 09:33:05
|
does appear to be a bit buggy atm
|
|
2023-05-02 09:33:54
|
as in, the HF coefficients are in the wrong order
|
|
2023-05-02 09:34:01
|
but it's almost done, at least
|
|
2023-05-02 09:47:41
|
~~fix it by writing a TOC permutation~~
|
|
2023-05-02 09:47:54
|
easier to just write it in a different order ngl
|
|
|
|
veluca
|
|
Traneptora
as I have to buffer the entire output codestream
|
|
2023-05-02 10:04:07
|
can you also make it write to file and patch the TOC afterwards?
|
|
|
Traneptora
|
|
veluca
can you also make it write to file and patch the TOC afterwards?
|
|
2023-05-02 10:06:00
|
possibly, but then I have to accept a seekable FILE*, which atm the library doesn't do; it has no I/O
|
|
2023-05-02 10:07:18
|
possible tho
|
|
|
|
veluca
|
2023-05-02 10:07:51
|
yeah that's why people usually have some output abstraction implementation
|
|
|
Traneptora
|
2023-05-02 10:08:20
|
hm?
|
|
2023-05-02 10:08:37
|
wdym
|
|
2023-05-02 10:09:28
|
also fwrite can insert in the middle of a file, right?
|
|
2023-05-02 10:11:04
|
another issue with my current implementation is that my HF histograms are generated from the entire sequence of HF coefficients
|
|
2023-05-02 10:11:35
|
which requires me to buffer them all
|
|
2023-05-02 10:11:51
|
I need a better heuristic to gen HF coeffs histogram
|
|
2023-05-02 10:12:40
|
I could do it for one LF group only
|
|
2023-05-02 10:12:51
|
but that feels wrong
|
|
|
|
veluca
|
|
Traneptora
also fwrite can insert in the middle of a file, right?
|
|
2023-05-02 10:17:08
|
not insert no
|
|
2023-05-02 10:17:12
|
only overwrite
|
|
2023-05-02 10:17:16
|
(I think, at least)
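Right: positioned writes overwrite in place rather than insert, which is exactly what makes backpatching a TOC possible. The same semantics, shown with an in-memory file:

```python
import io

buf = io.BytesIO()
buf.write(b"abcdef")

# Seek back and write: the new bytes REPLACE 'cd', nothing is inserted.
buf.seek(2)
buf.write(b"XY")

print(buf.getvalue())  # → b'abXYef' (same length as before)
```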
|
|
|
Traneptora
|
|
veluca
(I think, at least)
|
|
2023-05-02 10:24:37
|
then how do you go back and add the TOC?
|
|
|
|
veluca
|
2023-05-02 10:25:48
|
that's a good question
|
|
2023-05-02 10:25:59
|
I don't remember what was supposed to be the solution
|
|
2023-05-02 10:26:06
|
how big is the first group usually?
|
|
|
Traneptora
|
|
veluca
how big is the first group usually?
|
|
2023-05-02 10:41:02
|
either 2048x2048 or smaller, depending on image size
|
|
2023-05-02 10:41:24
|
well, LF group
|
|
2023-05-02 10:41:24
|
but yea
|
|
|
|
veluca
|
2023-05-02 10:41:27
|
ah sorry, how big (in bytes) is the first *section*?
|
|
|
Traneptora
|
2023-05-02 10:41:31
|
4
|
|
2023-05-02 10:41:38
|
lfglobal is 4 bytes, constant
|
|
2023-05-02 10:41:46
|
I don't use a global tree
|
|
|
|
veluca
|
2023-05-02 10:42:30
|
ok, so there are 2 solutions
|
|
2023-05-02 10:42:36
|
that I can think of
|
|
2023-05-02 10:42:48
|
the first one is perhaps a bit less reliable though
|
|
2023-05-02 10:43:52
|
option 1: decide in advance if you'll need 10, 14, 22 or 30 bits for each section's TOC entry, write the TOC assuming the smallest representation that fits, and patch the TOC afterwards
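For reference, a TOC entry is a 2-bit selector plus a 10-, 14-, 22- or 30-bit value, with offsets chosen so the four ranges tile; a sketch of picking the representation for a given section size (constants as recalled from the spec's U32 distribution, not authoritative):

```python
# Pick the TOC entry representation for a section size.
# Each range starts where the previous one ends.
_RANGES = [
    (0,         10),  # selector 0: sizes 0 .. 1023
    (1024,      14),  # selector 1: sizes 1024 .. 17407
    (17408,     22),  # selector 2: sizes 17408 .. 4211711
    (4_211_712, 30),  # selector 3: sizes up to ~1 GiB
]

def toc_entry_bits(size):
    """Total bits (selector + value) to code one section size."""
    for offset, nbits in _RANGES:
        if size < offset + (1 << nbits):
            return 2 + nbits
    raise ValueError("section too large for a TOC entry")

print(toc_entry_bits(4))          # → 12 bits (a tiny LFGlobal)
print(toc_entry_bits(4_211_712))  # → 32 bits ("4 MB and change")
```

The 12-bit minimum and 32-bit maximum are where the "1.5N to 4N bytes" range for an N-entry TOC comes from.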
|
|
|
Traneptora
|
2023-05-02 10:45:04
|
is LFGlobal permitted to be bigger than it needs to be? i.e. am I allowed to allocate a larger ToC section for it?
|
|
2023-05-02 10:45:20
|
even if it would be finished reading sooner than the end of its TOC boundary?
|
|
|
|
veluca
|
2023-05-02 10:45:21
|
option 2: always reserve (lfglobal_bytes + 4 * num_section) bytes for the TOC + lfglobal, then at the end compute the actual TOC, write it, write LFGlobal, and fill the rest with 0s (saying lfglobal is bigger)
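Option 2 amounts to reserving the worst-case space up front, then backpatching and letting the zero padding be absorbed into LFGlobal's declared size. A toy sketch (fixed 4-byte TOC entries and a fixed slack; nothing here is the real bitstream layout):

```python
import io

def write_frame(out, lfglobal, sections, slack=8):
    """Reserve TOC + LFGlobal + slack, write sections, then backpatch."""
    toc_bytes = 4 * (1 + len(sections))  # one entry per section + LFGlobal
    reserved = toc_bytes + len(lfglobal) + slack
    out.write(b"\0" * reserved)          # placeholder region
    for s in sections:
        out.write(s)
    # Backpatch: real TOC first; LFGlobal's size absorbs the padding.
    out.seek(0)
    sizes = [reserved - toc_bytes] + [len(s) for s in sections]
    out.write(b"".join(n.to_bytes(4, "little") for n in sizes))
    out.write(lfglobal)                  # trailing slack stays zero

buf = io.BytesIO()
write_frame(buf, b"LFG", [b"sectionA", b"sectB"])
data = buf.getvalue()
print(data.index(b"sectionA"))  # → 23: sections start after the reserved region
```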
|
|
2023-05-02 10:45:40
|
yeah any section can do that
|
|
|
Traneptora
|
2023-05-02 10:45:45
|
ah, that makes the most sense then
|
|
|
|
veluca
|
2023-05-02 10:45:59
|
and IIRC we don't even check that the padding bytes are 0, although we really should
|
|
2023-05-02 10:46:57
|
(in case you are wondering, the fact that this can be done is, I believe, a complete accident - although a happy one)
|
|
|
Traneptora
|
2023-05-02 10:47:32
|
there's a catch-22 tho
|
|
|
|
veluca
|
|
Traneptora
|
2023-05-02 10:47:43
|
the LFGlobal section is the first entry in the TOC
|
|
2023-05-02 10:48:01
|
so I have to write its size to calculate the TOC size, and I have to calculate the TOC size to write the LF Global size
|
|
|
|
veluca
|
2023-05-02 10:48:07
|
the good news is that you only need to know its *approximate* size
|
|
|
Traneptora
|
2023-05-02 10:48:36
|
alternatively I could just assume that the LF Global part of the TOC takes up 4 bytes
|
|
2023-05-02 10:48:39
|
exactly
|
|
|
|
veluca
|
2023-05-02 10:49:02
|
the bad news is that I could imagine some image sizes causing the padded-lfglobal size to be 1024-or-1025 or some other sizes
|
|
|
Traneptora
alternatively I could just assume that the LF Global part of the TOC takes up 4 bytes
|
|
2023-05-02 10:49:30
|
unfortunately 4 bytes implies it should be *at least* 4mb and change
|
|
|
Traneptora
|
2023-05-02 10:49:40
|
oh right, cause of the offset
|
|
2023-05-02 10:49:40
|
hm
|
|
|
|
veluca
|
2023-05-02 10:50:23
|
I think this might be solvable by never allocating a lf+toc size that's precisely within some margin of each section size boundary
|
|
2023-05-02 10:51:33
|
ah wait, a section entry can *also* be 12 bits total
|
|
2023-05-02 10:51:35
|
ugh
|
|
2023-05-02 10:51:54
|
oh well
|
|
|
Traneptora
|
|
veluca
ah wait, a section entry can *also* be 12 bits total
|
|
2023-05-02 10:53:11
|
I can always use trial-and-error
|
|
|
|
veluca
|
2023-05-02 10:53:25
|
if you have N sections, the toc can take from 1.5N to 4N bytes, so if N is between 256 and 683 or so, you could have issues
|
|
|
Traneptora
|
2023-05-02 10:53:32
|
well I'm thinking more along the lines of like
|
|
2023-05-02 10:53:53
|
write all TOC except LFGlobal into a bitbuffer
|
|
2023-05-02 10:54:18
|
then assume LF Global TOC entry is 10 bits
|
|
2023-05-02 10:54:47
|
then check if it works
|
|
2023-05-02 10:54:51
|
if it doesn't, assume it's 12-bits
|
|
2023-05-02 10:54:52
|
try again
|
|
|
|
veluca
|
2023-05-02 10:54:59
|
so the issue is that there are some TOC+lfglobal total lengths that are not possible at all for a given set of lengths of the other sections (at least, I think so)
|
|
|
Traneptora
|
2023-05-02 10:57:29
|
what are the problem children then?
|
|
2023-05-02 10:57:42
|
because I can always add crap to the frame header like Extensions
|
|