|
HCrikki
|
2024-05-02 01:51:47
|
what model? it could've had h264 available but not exposed and defaulted to mjpeg. many cams default to what's hardware-accelerated (hw-accelerated encoders suck and can't be updated, but consume fewer cycles)
|
|
2024-05-02 01:54:31
|
some niches also require keyframes for more precise seeking in hot footage that isn't archived yet, so keyframe-only capture is seen as a plus. I'd have expected mjpeg2000 to be used for that though, to accommodate extra metadata with individual frames
|
|
|
username
|
|
Crite Spranberry
idk I've just used paint.net
|
|
2024-05-02 02:45:08
|
this plugin does exist for times where you just wanna quickly export something out of Paint.NET and not mess with any other programs: https://forums.getpaint.net/topic/118213-mozjpeg-filetype-2022-08-28/
|
|
|
Demiurge
|
|
lonjil
it still beats jpegli below like, q=75? idk the exact number.
|
|
2024-05-02 02:50:45
|
the JPEG format isn't suitable for very low fidelity; it breaks down rather suddenly below a certain threshold, for technical reasons inherent to the bitstream format that I can't recall right now
|
|
2024-05-02 02:51:24
|
But most people don't want to degrade their images that much
|
|
|
LMP88959
|
2024-05-02 02:51:34
|
I thought jpeg at low bitrates is bad because all the AC coefficients pretty much disappear
|
|
|
Demiurge
|
2024-05-02 02:51:45
|
Most people expect an image to look the same, before and after saving it.
|
|
|
LMP88959
|
2024-05-02 02:52:06
|
Plus they quantize the DC coefficients too so it ends up giving you totally wrong colors
|
|
|
Demiurge
|
|
LMP88959
I thought jpeg at low bitrates is bad because all the AC coefficients pretty much disappear
|
|
2024-05-02 02:52:55
|
It just looks like 8x8 blocks with 8 unique colors after a certain point
|
|
|
LMP88959
|
2024-05-02 02:53:39
|
Yeah cuz at that point the block has only a nonzero DC value and that DC value is quantized
|
|
2024-05-02 02:53:54
|
Equivalent to applying a posterization effect to an rgb image
|
|
2024-05-02 02:54:19
|
Ideally you wouldn't touch the DC value
|
|
2024-05-02 02:54:27
|
Idk why jpeg touches it
|
|
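A toy sketch of what's being described above (Python with numpy/scipy; an illustration only, not the real JPEG pipeline or its quantization tables): with a coarse enough quantizer every AC coefficient of an 8x8 block rounds to zero, and the coarsely quantized DC turns each block into a single flat color.
```python
import numpy as np
from scipy.fft import dctn, idctn

def crush_block(block, q_dc=64, q_ac=255):
    # forward DCT, heavy quantization, dequantization, inverse DCT
    coef = dctn(block - 128.0, norm="ortho")
    q = np.full((8, 8), float(q_ac))
    q[0, 0] = q_dc                      # DC gets its own (still coarse) step
    coef = np.round(coef / q) * q       # nearly all AC rounds to zero
    return idctn(coef, norm="ortho") + 128.0

img = np.add.outer(np.arange(64.0), np.arange(64.0))  # smooth diagonal ramp
out = np.block([[crush_block(img[y:y+8, x:x+8])
                 for x in range(0, 64, 8)] for y in range(0, 64, 8)])
print("distinct values before:", np.unique(img).size)            # 127
print("distinct values after: ", np.unique(np.round(out)).size)  # a handful
```
The output is exactly the posterization described: each block collapses to one of a few quantized DC levels.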
|
Demiurge
|
2024-05-02 03:22:52
|
Neither do I... Unless you want to progressively compress the DC value, but I don't think JPEG can do that
|
|
|
_wb_
|
2024-05-02 01:40:15
|
<@532010383041363969> can probably explain it better than me, but basically gaborish is just doing some mild smoothing using a 3x3 blur kernel after decoding the image, which is useful to hide block boundaries and dct artifacts. To avoid making the image blurry, the encoder does the opposite before encoding the image: it applies a 5x5 sharpening kernel in such a way that the overall result of sharpening -> lossy compression -> decoding -> blurring is as close to the original as possible. Effectively this is an alternative to lapped transforms (another approach to avoid block artifacts, e.g. used in JPEG XR).
|
|
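A toy version of that pairing (assuming a simple 3x3 binomial blur and a first-order inverse; these are NOT libjxl's actual gaborish kernels):
```python
import numpy as np
from scipy.signal import convolve2d

blur = np.array([[1., 2., 1.],
                 [2., 4., 2.],
                 [1., 2., 1.]]) / 16.0        # decoder-side smoothing
ident = np.zeros((3, 3)); ident[1, 1] = 1.0
sharpen = 2 * ident - blur                    # (I + A)^-1 ~ I - A for mild A

x = np.linspace(0, np.pi, 64)
img = np.outer(np.sin(x), np.cos(2 * x))      # smooth test image
pre = convolve2d(img, sharpen, mode="same", boundary="symm")   # encoder side
out = convolve2d(pre, blur, mode="same", boundary="symm")      # decoder side
print("max round-trip error:", np.abs(out - img).max())        # small
```
This first-order inverse is only accurate on smooth content, which is why the real encoder instead solves for a 5x5 kernel so that the whole sharpen -> compress -> blur chain stays close to the original.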
2024-05-02 01:43:50
|
The problem is that currently the encode-side sharpening is probably too strong, which is not really a problem for single-generation encoding, but when doing many generations the error accumulates and you end up getting pretty bad generation loss, like this: https://discord.com/channels/794206087879852103/803645746661425173/1230065259784572928
|
|
2024-05-02 01:44:33
|
(note: this is not an inherent property of JPEG XL, it is a property of what the libjxl encoder is currently doing by default)
|
|
|
LMP88959
|
2024-05-02 01:46:34
|
how does the deblocking filter compare?
|
|
|
Traneptora
|
|
lonjil
yep, mjpeg 😬
|
|
2024-05-02 01:49:13
|
because the cpu power needed to encode high-quality mjpeg on the fly
|
|
2024-05-02 01:49:19
|
is very low
|
|
|
|
afed
|
2024-05-02 01:51:57
|
knusperli uses something like gaborish
https://github.com/google/knusperli
|
|
|
_wb_
|
2024-05-02 02:21:51
|
knusperli is doing something else: instead of doing simple dequantization (just multiplying the coefficient with the quantization factor), which restores the coefficient to the center of the quantization bucket, it does a context-aware dequantization that assigns values within the quantization bucket in such a way that discontinuities at block boundaries are avoided (i.e. it assumes the original was smooth, not blocky)
|
|
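A 1D toy of that idea (my own DC-only simplification, not knusperli's actual algorithm): pick values inside each quantization bucket that minimize the jump at block boundaries, instead of always taking the bucket center.
```python
import numpy as np

q = 16.0
true = np.array([70.0, 74.0, 78.0, 82.0])   # smooth per-block "DC" values
idx = np.round(true / q)                    # what the bitstream stores
center = idx * q                            # naive dequantization -> steps

smooth = center.copy()
for i in range(len(smooth) - 1):
    # pull both neighbors toward their average, but stay inside the buckets
    target = (smooth[i] + smooth[i + 1]) / 2
    smooth[i]     = np.clip(target, (idx[i] - .5) * q,     (idx[i] + .5) * q)
    smooth[i + 1] = np.clip(target, (idx[i + 1] - .5) * q, (idx[i + 1] + .5) * q)

print("centers: ", center)   # [64. 80. 80. 80.] -> visible block steps
print("smoothed:", smooth)   # [72. 76. 78. 78.] -> closer to the original
```
Both reconstructions are equally consistent with the bitstream; the smooth one just assumes the original wasn't blocky.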
|
Meow
|
|
lonjil
lossless webp is pretty darn good
|
|
2024-05-02 04:44:10
|
People simply forgot that lossless WebP is the early masterpiece by <@532010383041363969>
|
|
|
Crite Spranberry
|
2024-05-02 04:47:21
|
png
|
|
2024-05-02 04:47:26
|
gif <:t_:1088450292095389706>
|
|
2024-05-02 04:47:41
|
Is 100 quality jpeg lossy?
|
|
2024-05-02 04:48:01
|
I've never thought about it as I've had no case where 100 quality jpeg is better than png
|
|
|
Crite Spranberry
Is 100 quality jpeg lossy?
|
|
2024-05-02 04:49:32
|
If it is it doesn't show
|
|
2024-05-02 04:49:56
|
4:2:0 shows a bit of loss though
|
|
2024-05-02 04:50:27
|
But 100 quality 4:4:4 seems equal to png in a random unigine heaven screenshot i found
|
|
|
Crite Spranberry
4:2:0 shows a bit of loss though
|
|
2024-05-02 04:50:59
|
|
|
|
Meow
|
|
Crite Spranberry
Is 100 quality jpeg lossy?
|
|
2024-05-02 05:00:57
|
Of course it is lossy
|
|
2024-05-02 05:02:06
|
Some software switches to lossless when exporting at quality 100 for AVIF or WebP
|
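It's easy to verify this empirically (a sketch using Pillow; at quality 100 libjpeg's quantization steps all become 1, but rounding the DCT coefficients to integers still loses data):
```python
import io
import numpy as np
from PIL import Image

rng = np.random.default_rng(0)
orig = Image.fromarray(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
buf = io.BytesIO()
orig.save(buf, "JPEG", quality=100, subsampling=0)   # 0 = 4:4:4, no chroma loss
back = Image.open(buf)
diff = np.abs(np.asarray(orig, dtype=int) - np.asarray(back, dtype=int))
print("max per-channel error:", diff.max())          # nonzero -> still lossy
```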
|
|
Demiurge
|
2024-05-02 11:22:14
|
Why does my shiny new iPhone not support color profiles on JPEG?
|
|
|
a goat
|
|
Demiurge
Why does my shiny new iPhone not support color profiles on JPEG?
|
|
2024-05-02 11:24:27
|
Forcing people into having to use DCI P3 maybe?
|
|
|
Demiurge
|
2024-05-02 11:27:38
|
Well if an image is tagged it's supposed to convert it to the display profile, not completely misinterpret it somehow
|
|
2024-05-02 11:29:44
|
|
|
2024-05-02 11:30:37
|
It's even worse than if they completely ignored/stripped the profile.
|
|
|
LMP88959
|
2024-05-03 08:18:53
|
i used imagemagick's JXL implementation
|
|
2024-05-03 08:19:06
|
```convert -quality 10 orig.png out.jxl```
|
|
2024-05-03 08:19:15
|
is it a good impl
|
|
2024-05-03 08:19:23
|
|
|
2024-05-03 08:19:29
|
because im confused why it looks so bad
|
|
2024-05-03 08:19:36
|
it's a 64x64 icon
|
|
2024-05-03 08:20:08
|
JXL on the left (1,700 bytes), mine in the middle (1,000 bytes), original on the right
|
|
2024-05-03 08:21:42
|
|
|
2024-05-03 08:21:43
|
here is the original
|
|
|
lonjil
|
2024-05-03 08:28:22
|
here is a 889 byte one I just encoded
|
|
|
LMP88959
|
2024-05-03 08:28:31
|
what did you use to encode
|
|
2024-05-03 08:28:56
|
i just noticed imagemagick inserts a ton of useless metadata into the file which is why the size is so large
|
|
|
lonjil
|
2024-05-03 08:28:58
|
`cjxl -d 4 -e 10 orig.png d4_e10.jxl`
|
|
|
LMP88959
|
2024-05-03 08:31:54
|
ok sweet, thanks. i will avoid imagemagick from now on
|
|
|
lonjil
|
2024-05-03 08:32:39
|
I wonder what it even maps `-quality 10` to
|
|
|
LMP88959
|
2024-05-03 08:34:27
|
cjxl has a -quality param too
|
|
2024-05-03 08:34:32
|
so maybe that?
|
|
2024-05-03 08:35:36
|
|
|
2024-05-03 08:35:44
|
jxl 600 bytes, mine 600 bytes
|
|
2024-05-03 08:35:56
|
pretty cool seeing the differences
|
|
|
Quackdoc
|
|
LMP88959
cjxl has a -quality param too
|
|
2024-05-03 09:06:56
|
magick can do some weird things sometimes
|
|
|
LMP88959
|
2024-05-03 09:07:13
|
yeah.. it's a shame
|
|
|
HCrikki
|
2024-05-03 09:13:02
|
d4 looks overkill for avatar-like images (smaller than 256x256)
|
|
2024-05-03 09:14:32
|
are you using a current libjxl? some workflows with env paths can end up using old versions even though you keep updating
just recently someone experienced such an issue with their setup and erroneously assumed jxl had barely improved since 0.7
|
|
|
LMP88959
|
2024-05-03 09:15:27
|
JPEG XL encoder v0.10.2 0.10.2 [AVX2,SSE4,SSE2]
|
|
|
jonnyawsom3
|
|
LMP88959
|
|
2024-05-03 09:23:15
|
There's also VarDCT and Modular
|
|
2024-05-03 09:23:26
|
About 600 bytes too
|
|
|
LMP88959
|
2024-05-03 09:24:07
|
holy moly
|
|
2024-05-03 09:24:16
|
so much to learn about the knobs and parameters
|
|
2024-05-03 09:25:23
|
it looks like JXL is losing a lot of chroma
|
|
2024-05-03 09:25:30
|
is there subsampling going on?
|
|
|
HCrikki
|
|
LMP88959
JPEG XL encoder v0.10.2 0.10.2 [AVX2,SSE4,SSE2]
|
|
2024-05-03 09:36:45
|
is that what ver reports in terminal?
|
|
|
lonjil
|
2024-05-03 09:36:45
|
no, but in the current release it quantizes chroma probably too aggressively
|
|
2024-05-03 09:37:09
|
Jyrki posted recently about increasing the amount of chroma info by 20% (and reducing luma by 2% IIRC)
|
|
|
LMP88959
|
|
HCrikki
is that what ver reports in terminal?
|
|
2024-05-03 09:38:20
|
yeah
|
|
|
Demiurge
|
|
LMP88959
i just noticed imagemagick inserts a ton of useless metadata into the file which is why the size is so large
|
|
2024-05-04 06:59:52
|
I wonder if graphicsmagick does the same thing
|
|
|
LMP88959
is there subsampling going on?
|
|
2024-05-04 07:00:41
|
No but v0.10 mutilates colors badly I noticed when using q<90 or d>1
|
|
2024-05-04 07:03:52
|
If you use lower quality settings than that, then the color is so mangled that it's practically useless output
|
|
|
yoochan
|
2024-05-04 07:04:14
|
What about lossless encoding using resampling? I can't test at the moment
|
|
|
Demiurge
|
2024-05-04 07:05:02
|
Lossless and resampling are mutually exclusive, no? I wouldn't use cjxl as a resampling tool.
|
|
2024-05-04 07:06:54
|
cjxl is pretty good at lossless but sometimes there are bugs in the decoder with the colorspace profile it generates and the output format.
|
|
2024-05-04 07:07:35
|
But I think some of those bugs only affect lossy
|
|
2024-05-04 07:08:59
|
If you try to encode an ambiguous file that does not have color tags, then it will create one and I think assume it's sRGB
|
|
2024-05-04 07:09:21
|
which is correct behavior
|
|
2024-05-04 07:11:08
|
But if you compare an untagged ambiguous file with a file tagged with non-ambiguous color info, they can look different even if the data is exactly the same, so that can sometimes cause images to change in appearance after lossless encoding an untagged file to JXL
|
|
2024-05-04 07:11:20
|
because in JXL, there is no such thing as "untagged"
|
|
|
yoochan
|
2024-05-04 07:17:37
|
For pixel art you can reduce the pixel count during encoding and upscale it at decoding; it works really well. I'll try to find the exact command
|
|
|
damian101
|
2024-05-04 09:31:31
|
|
|
2024-05-04 09:33:12
|
|
|
|
w
|
2024-05-04 10:02:49
|
just use webp
|
|
2024-05-04 10:03:06
|
654 bytes
|
|
|
yoochan
|
2024-05-04 11:20:15
|
3905 bytes with lossless 🙂
|
|
|
Demiurge
|
|
yoochan
For pixel art you can reduce the pixel count during encoding and upscale it at decoding it works really well. I'll try to find the exact command
|
|
2024-05-04 11:23:36
|
you mean like nearest neighbor resizing?
|
|
|
yoochan
|
2024-05-04 11:27:30
|
yes, I first thought it was pixel art but the image is pixel-level drawing, my mistake
|
|
2024-05-04 11:30:00
|
we spoke about this here : https://discord.com/channels/794206087879852103/794206170445119489/1222214319198961714
|
|
|
TheBigBadBoy - 𝙸𝚛
|
2024-05-04 11:48:13
|
what's that tool again to convert JPG to PNG (and "remove" compression artifacts)?
|
|
|
username
|
|
TheBigBadBoy - 𝙸𝚛
what's that tool again to convert JPG to PNG (and "remove" compression artifacts) ?
|
|
2024-05-04 11:49:20
|
https://github.com/victorvde/jpeg2png
https://github.com/ilyakurdyukov/jpeg-quantsmooth
|
|
|
TheBigBadBoy - 𝙸𝚛
|
2024-05-04 11:49:35
|
thanks
|
|
2024-05-04 11:49:37
|
[⠀](https://cdn.discordapp.com/emojis/853506500088692747.webp?size=48&quality=lossless&name=pepelove)
|
|
2024-05-04 11:52:36
|
9 years ago [⠀](https://cdn.discordapp.com/emojis/852007419474608208.webp?size=48&quality=lossless&name=av1_woag)
|
|
|
damian101
|
|
w
654 bytes
|
|
2024-05-04 12:08:10
|
that's wild...
|
|
2024-05-04 12:08:19
|
|
|
2024-05-04 12:08:36
|
But ssimulacra2 prefers AVIF
|
|
|
Demiurge
|
2024-05-04 12:11:25
|
metrics are useless insofar as they disagree with your own eyes
|
|
2024-05-04 12:12:27
|
the goal of these metrics is to predict what a human would say
|
|
2024-05-04 12:12:57
|
So if you say it looks like shit while the metric says it's great then it's not you that's wrong, it's the metric
|
|
2024-05-04 12:13:16
|
since the metric is supposed to be designed to predict what you would say
|
|
2024-05-04 12:13:56
|
incidentally, all existing metrics are a very bad match to actual human vision
|
|
2024-05-04 12:15:06
|
But it requires time and fatiguing effort for a human to stare at and compare images
|
|
2024-05-04 12:15:13
|
that's why metrics are developed and used
|
|
2024-05-04 12:15:30
|
not because they are superior to humans but because they are faster and more convenient to use
|
|
2024-05-04 12:15:55
|
the goal of a lossy codec is to also closely match human vision
|
|
2024-05-04 12:16:09
|
therefore actual humans need to be the guide and the judge
|
|
2024-05-04 12:17:15
|
until metrics get much, much better at predicting what humans see
|
|
|
damian101
|
2024-05-04 12:32:06
|
At larger viewing distance, I definitely prefer AVIF, though
|
|
2024-05-04 12:32:58
|
WebP preserves more detail, but moves it around a lot.
|
|
2024-05-04 12:33:03
|
And chroma is also worse.
|
|
|
Meow
|
2024-05-04 01:14:41
|
Are those better than the denoising tools built into various software?
|
|
|
_wb_
|
2024-05-04 01:44:53
|
I think it's safe to say that most encoders are not tuned for doing lossy on a 64x64 image — for such small images the overhead of various constant-sized signaling becomes relevant, but this is often not really something that gets taken into account when tuning an encoder
|
|
|
LMP88959
|
|
w
just use webp
|
|
2024-05-04 02:09:18
|
that looks fantastic
|
|
|
username
|
|
w
just use webp
|
|
2024-05-04 05:23:21
|
here's what I got with WebP, file size is 650 bytes
|
|
|
LMP88959
|
2024-05-04 05:41:43
|
so webp is better for lossy icon sized pixel art?
|
|
|
damian101
|
|
LMP88959
so webp is better for lossy icon sized pixel art?
|
|
2024-05-04 07:52:21
|
webp preserves (only) high-contrast detail well, and doesn't try hard to prevent spatial distortion
|
|
|
LMP88959
|
2024-05-04 08:25:06
|
Ah i see
|
|
|
damian101
|
2024-05-04 09:26:19
|
so, yes, it's quite good appeal-wise for highly lossy low resolution content...
|
|
2024-05-04 09:26:54
|
But at such small resolutions you usually don't want to be that lossy...
|
|
|
LMP88959
|
2024-05-04 09:28:06
|
Yeah that is a good point. Data savings are pretty negligible with such small images
|
|
|
w
|
2024-05-04 09:30:13
|
webp often wins at pixel art at any resolution
|
|
2024-05-04 09:30:21
|
even for lossless
|
|
|
Demiurge
|
|
Demiurge
|
|
2024-05-05 03:28:25
|
Maybe this has something to do with this bug... https://github.com/libjxl/libjxl/issues/3512
|
|
2024-05-05 03:28:37
|
idk how to add an app14 tag to an existing jpeg to test
|
|
|
TheBigBadBoy - 𝙸𝚛
|
|
Demiurge
idk how to add an app14 tag to an existing jpeg to test
|
|
2024-05-05 04:54:08
|
https://encode.su/threads/2489-jpegultrascan-an-exhaustive-JPEG-scan-optimizer?p=82584&viewfull=1#post82584
the `jpegultrascan.pl` script adds it automatically, use `-b 0 -t $(nproc)` for fastest result
|
|
|
Demiurge
|
2024-05-05 09:53:21
|
Perl is cool
|
|
|
yoochan
|
2024-05-06 07:04:34
|
perl 5 or perl 6 ?
|
|
|
spider-mario
|
2024-05-06 09:12:51
|
both, in their own respective ways (but perl 6 is called raku now)
|
|
2024-05-07 06:20:53
|
https://gtmetrix.com/efficiently-encode-images.html
what kind of crappy heuristic is that
|
|
2024-05-07 06:21:04
|
what does “85% of their original quality” even mean
|
|
2024-05-07 06:21:47
|
degrading quality by 15% (whatever that means) for a 4 kB gain doesn’t sound like a worthy trade-off?
|
|
|
username
|
2024-05-07 06:22:49
|
"BMP" what who even uses BMPs anymore‽‽
|
|
|
spider-mario
|
2024-05-07 06:46:52
|
oh no https://developer.chrome.com/docs/lighthouse/performance/uses-optimized-images#how_lighthouse_flags_images_as_optimizable
|
|
2024-05-07 06:47:32
|
well, that almost looks more reasonable
|
|
2024-05-07 06:47:47
|
maybe that’s what the GTmetrix page meant to imply
|
|
|
Oleksii Matiash
|
2024-05-07 07:06:28
|
I'm confused, bmp quality 85? <:WhatThe:806133036059197491>
|
|
|
Eugene Vert
|
2024-05-07 08:40:49
|
It's a bit weird that they don't use quantization table analysis for jpeg, like in `magick identify -format '%Q' 1.JPG` or https://gist.github.com/atr000/559324
|
|
|
a goat
|
2024-05-08 07:06:35
|
Are there any GPU accelerated difference metrics with comparable results to butteraugli for 4MP and up images?
|
|
|
Quackdoc
|
2024-05-08 07:16:33
|
the only semi-decent gpu-accelerated metric I know of is a port of ssimu2rs to onnx that someone made
|
|
|
a goat
|
|
Quackdoc
the only semi decent gpu accelerated metric I know of is someone ported ssimu2rs to onnx
|
|
2024-05-08 07:23:46
|
Oh? Is there a repo?
|
|
|
Quackdoc
|
2024-05-08 07:24:14
|
I don't think so, the source was just a zip release iirc
|
|
2024-05-08 07:24:38
|
ill see if I can dig it up
|
|
|
a goat
Oh? Is there a repo?
|
|
2024-05-08 07:27:28
|
https://cdn.discordapp.com/attachments/1042536514783023124/1118195477188444230/ssimulacra2_bin-gpu.zip?ex=663ce730&is=663b95b0&hm=5275ab283f34843656fe6580495da77789a4535437f62aa886adcba19ded4963&
|
|
2024-05-08 07:27:32
|
oh crap
|
|
2024-05-08 07:27:40
|
actually no that worked
|
|
|
a goat
|
|
Quackdoc
https://cdn.discordapp.com/attachments/1042536514783023124/1118195477188444230/ssimulacra2_bin-gpu.zip?ex=663ce730&is=663b95b0&hm=5275ab283f34843656fe6580495da77789a4535437f62aa886adcba19ded4963&
|
|
2024-05-08 08:44:55
|
Thanks! This will come in handy
|
|
|
Demiurge
|
|
Jyrki Alakuijala
we could do this -- now we are opting towards smoothness, but to a lesser degree than most other codecs -- we used to do this a lot more, but Jon changed it (with a 2-3 % objective score improvement) back in the day, perhaps three years ago -- we could have modes for this
|
|
2024-05-08 11:46:47
|
Is it possible to do this in a way that looks acceptable for non-photographic graphics as well?
|
|
2024-05-08 11:47:44
|
Grainy, bluenoise-like lossy artifacts I mean
|
|
2024-05-08 11:58:38
|
Certain types of dithering look good for non-photo content. So maybe it's possible to produce lossy compression artifacts similar to that, in a way that looks natural and less disruptive to the human eye for both photo and non-photo. Metrics be damned.
|
|
|
|
JendaLinda
|
2024-05-11 08:32:36
|
It's fun to create image files so small that they fit inside the NTFS file entry itself.
|
|
|
lonjil
|
2024-05-11 08:45:42
|
what size is that?
|
|
2024-05-11 08:45:47
|
on zfs it's 112 bytes
|
|
|
|
JendaLinda
|
2024-05-11 08:47:39
|
Around 600 bytes. It depends on the file attributes and length of the file name.
|
|
|
HCrikki
|
2024-05-11 08:47:59
|
quick question, can that be made to generate QR codes?
|
|
|
|
JendaLinda
|
2024-05-11 08:50:17
|
Small files could be turned to QR codes as well.
|
|
2024-05-11 08:53:15
|
Anyway, Windows stores the payload of very small files inside their file entries automatically.
|
|
|
jonnyawsom3
|
2024-05-11 09:19:50
|
500 bytes is the rough limit, probably meant to be 512 but filename, etc....
|
|
|
JendaLinda
Small files could be turned to QR codes as well.
|
|
2024-05-11 09:22:44
|
I tried to do that, but there's no way to open the file upon scan. `file:///` doesn't default to a browser and just shows as text, so I had to resort to a usual link with a redirect to a website hosting the file
|
|
|
|
JendaLinda
|
2024-05-11 09:28:45
|
I have a 627 bytes file that still fits.
|
|
|
I tried to do that, but there's no way to open the file upon scan. `file:///` doesn't default to a browser and just shows as text, so I had to resort to a usual link with a redirect to a website hosting the file
|
|
2024-05-11 09:30:08
|
You need QR code software that can encode and decode binary files. QR codes can store any data.
|
|
|
lonjil
|
2024-05-11 09:31:57
|
fun fact: "binary" QR is universally interpreted as UTF-8, so when actual arbitrary binary data needs to be stored, people use text-mode QR with the data in base45 encoding.
|
|
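For reference, Base45 (RFC 9285) is tiny: two bytes become one 16-bit number written as three base-45 digits from the QR alphanumeric charset, least significant digit first. A minimal encoder sketch:
```python
# Base45 alphabet = QR's alphanumeric character set (RFC 9285)
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

def b45encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data) - 1, 2):
        n = data[i] * 256 + data[i + 1]            # two bytes -> 0..65535
        out += [ALPHABET[n % 45], ALPHABET[n // 45 % 45], ALPHABET[n // 2025]]
    if len(data) % 2:                              # lone trailing byte
        n = data[-1]
        out += [ALPHABET[n % 45], ALPHABET[n // 45]]
    return "".join(out)

print(b45encode(b"AB"))      # 'BB8', matching the RFC test vector
```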
|
|
JendaLinda
|
2024-05-11 09:38:50
|
That only depends on how the implementation interprets the bytes.
|
|
|
jonnyawsom3
|
|
JendaLinda
You need a QR code software that can encode and decode binary files. QR codes can store any data.
|
|
2024-05-11 09:39:51
|
Not too useful when trying to make it work for the average person unfortunately
|
|
|
|
JendaLinda
|
2024-05-11 09:41:08
|
Encoding binary data in QR is usually application specific.
|
|
|
Meow
|
2024-05-11 10:12:54
|
Good, I should experiment with QR codes in various codecs later
|
|
|
jonnyawsom3
|
2024-05-11 11:12:15
|
I've posted this in this channel before, but it's a unique way of explaining colors and perceptual color spaces using Minecraft in an accessible form https://youtu.be/e0HM_vfSuDw
|
|
|
yoochan
|
|
lonjil
fun fact: "binary" QR is universally interpreted as UTF-8, so when actual arbitrary binary data needs to be stored, people use text-mode QR with the data in base45 encoding.
|
|
2024-05-11 01:35:43
|
What a waste. IIRC, Shift JIS is also part of the standard. A pre-Unicode relic
|
|
|
Meow
|
2024-05-11 03:10:43
|
UTF-8 is a must for CJKV characters
|
|
|
yoochan
|
2024-05-11 03:29:58
|
Only when you want to mix them... Before Unicode, Japanese, Chinese, and Russian users used language-specific codepages, which were much better for each language but a nightmare to mix
|
|
|
|
JendaLinda
|
2024-05-11 04:01:47
|
And different operating systems were using different codepages for the same language. It's a good thing there is now finally a standard everybody agreed on.
|
|
|
Meow
|
|
yoochan
Only when you want to mix them... Before unicode Japaneses, Chineses, Russians used language specific codepages which were much better for each language but a nightmare to mix
|
|
2024-05-11 04:04:03
|
A worse nightmare even before Unicode
|
|
2024-05-11 04:04:21
|
If you lived in such environment
|
|
|
|
JendaLinda
|
2024-05-11 04:06:34
|
Just for my native language, there exist about 6 different codepages.
|
|
|
Meow
|
2024-05-11 04:37:01
|
We can also experiment with QR code version 40 in JXL
|
|
|
jonnyawsom3
|
|
I tried to do that, but there's no way to open the file upon scan. `file:///` doesn't default to a browser and just shows as text, so I had to resort to a usual link with a redirect to a website hosting the file
|
|
2024-05-11 04:55:52
|
When I was messing with QR codes, I tried to make JXL art of a QR code which then scanned as itself, but naturally every change to the image changes what the code looks like, so I was slowly turning into Sisyphus
|
|
|
Crite Spranberry
|
|
Meow
|
2024-05-11 05:20:56
|
There are some utilities that can produce QR codes blended with an image
|
|
|
spider-mario
|
|
JendaLinda
Just for my native language, there exist about 6 different codepages.
|
|
2024-05-11 05:25:50
|
may I ask which language that is?
|
|
|
Oleksii Matiash
|
2024-05-11 05:47:10
|
I know about 3 CP for my language, but 6 🤯
|
|
|
|
JendaLinda
|
|
spider-mario
may I ask which language that is?
|
|
2024-05-11 05:49:39
|
Czech
|
|
2024-05-11 05:51:17
|
There are ISO 8859-2, Windows-1250, Mac OS CE, CP 852, Kamenický, KOI8-CS
|
|
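Four of those are in Python's stdlib codec registry, so the differences are easy to see on one Czech word (Kamenický and KOI8-CS have no stdlib codec; mac_latin2 is Python's name for the Mac OS Central European set):
```python
word = "žluťoučký"
for cp in ("iso8859-2", "cp1250", "mac_latin2", "cp852"):
    print(f"{cp:11} {word.encode(cp).hex(' ')}")
print(f"{'utf-8':11} {word.encode('utf-8').hex(' ')}")
```
Same letters, four different byte sequences, which is exactly the mixing nightmare described above.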
|
Oleksii Matiash
|
2024-05-11 06:10:14
|
Ah, yes, ISO, I forgot about it. ISO 8859-5, Windows-CP1251, CP866, KOI8-U
|
|
|
Meow
|
2024-05-12 03:04:25
|
QR code version 40
|
|
2024-05-12 03:05:16
|
The original image
|
|
2024-05-12 03:10:19
|
Comparing to PNG
QOI 4770%
AVIF 284%
HEIC 10118%
JXL 222%
WebP 104%
All lossless
|
|
|
jonnyawsom3
|
2024-05-12 03:25:19
|
Had to reduce the size slightly since it was 1:9 instead of 1:8, but close to the original
|
|
2024-05-12 03:27:26
|
Did get it slightly under 10 KB completely lossless too though
|
|
2024-05-12 03:30:25
|
45% `cjxl "QR code Pixels.png" "QR code.jxl" -d 0 -e 10 -g 3 -I 0 -P 0 --resampling=8 --already_downsampled --upsampling_mode=0`
159% `cjxl "QR code.png" "QR code 10KB.jxl" -d 0 -e 10 -g 3 -I 0 -P 0`
|
|
2024-05-12 03:33:06
|
Oh, and here's the 'Pixels' input
|
|
|
Meow
|
2024-05-12 03:45:01
|
Yes still scannable
|
|
|
jonnyawsom3
|
2024-05-12 03:53:00
|
I do wonder if an LZ77 only JXL file would be possible or even worth it...
|
|
|
Jyrki Alakuijala
|
|
Oh, and here's the 'Pixels' input
|
|
2024-05-13 09:02:37
|
for that image it is critical that the LZ77 will find the 4-line pattern and favor the LZ77 copy that is exactly 4 lines above -- this can be rare for images in general, basic heuristics in WebP lossless try combos of previous-line, previous-pixel and all LZ77 -- I don't know what is favored in LZ77 in JPEG XL's current encoder (Luca wrote that)
|
|
2024-05-13 09:04:26
|
often faster LZ77 tries to find matches with hashing -- and hashing is looking at a few symbols only, 3 or 4
|
|
2024-05-13 09:05:36
|
with PNG and WebP lossless there is a step that first collapses 8 binary symbols into a single symbol, so hashing looks at 16 (in WebP I think -- as hashing there is based on two pixels IIRC) or 24 (in PNG/zlib) bits at once (zlib hashing three bytes)
|
|
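A quick way to see why the match distance matters (using zlib as a stand-in LZ77 coder, not the PNG/WebP/JXL entropy stages): a buffer that repeats every 4 rows collapses once the matcher finds the copy 4 rows back, while noise doesn't.
```python
import os
import zlib

row = 100                                                      # bytes per row
pattern = b"".join(bytes([i]) * row for i in range(4)) * 150   # 4-row period
noise = os.urandom(len(pattern))
print("4-row pattern:", len(zlib.compress(pattern, 9)), "bytes")
print("random noise: ", len(zlib.compress(noise, 9)), "bytes")
```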
|
jonnyawsom3
|
2024-05-13 09:49:14
|
Since that image is a 1-bit palette, I assume as many pixels get hashed as there are bits
|
|
|
|
veluca
|
|
Jyrki Alakuijala
for that image it is critical that the LZ77 will find the 4-line pattern and favor the LZ77 copy that is exactly 4 lines above -- this can be rare for images in general, basic heuristics in WebP lossless try combos of previous-line, previous-pixel and all LZ77 -- I don't know what is favored in LZ77 in JPEG XL's current encoder (Luca wrote that)
|
|
2024-05-13 10:01:02
|
I think special distances are favoured but perhaps not
|
|
|
_wb_
|
2024-05-13 03:06:05
|
we don't have bit packing for low bitdepth images in jxl (unlike png which packs eight 1-bit pixels in a byte, or four 2-bit pixels, or two 4-bit pixels)
|
|
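The packing in question, illustrated with numpy (generic bit packing, not jxl code): eight 1-bit pixels share one byte, so a 1-bit row costs width/8 bytes in PNG.
```python
import numpy as np

row = np.array([1, 0, 1, 1, 0, 0, 1, 0,  1, 1, 1, 1, 0, 0, 0, 0], dtype=np.uint8)
packed = np.packbits(row)                    # 16 pixels -> 2 bytes
print([f"{b:08b}" for b in packed])          # ['10110010', '11110000']
assert (np.unpackbits(packed) == row).all()  # round-trips exactly
```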
2024-05-13 03:06:39
|
but we could approximate it by doing some horizontal squeeze steps and then doing palette on the result
|
|
2024-05-13 03:07:52
|
if you have e.g. a single-channel 1-bit 800x600 image, then with some squeezing you can turn it into eight channels that are 100x600
|
|
2024-05-13 03:08:26
|
then we can do an 8-channel palette on that
|
|
2024-05-13 03:10:05
|
it will probably have more than 256 'colors' though, due to the squeeze tendency thing, but more or less it turns 8 pixels into one pixel with a bigger range
|
|
|
jonnyawsom3
|
2024-05-13 03:20:41
|
Riiight, I remember you mentioning the squeeze as a replacement before
|
|
2024-05-13 03:21:27
|
I know I've already seen it work, by turning on squeeze for lossless on a single color image
|
|
|
_wb_
|
2024-05-13 04:38:59
|
for something where png's bitpacking happens to be very beneficial (like that qr code), doing a custom squeeze script + palette could be useful, currently libjxl has no way to do that though
|
|
2024-05-13 04:39:34
|
possibly this could be a strategy worth trying for images that are 1 bit (or even 2 or 3 bit)
|
|
2024-05-13 04:40:58
|
for now we haven't really tried to optimize compression for such relatively niche images — but at some point we should look into it, and make sure we consistently beat png on such images too
|
|
|
jonnyawsom3
|
2024-05-13 04:57:50
|
Squeezed and upsampled QR codes... Just think of how many more tricks there are to find
|
|
|
Meow
|
|
_wb_
for now we haven't really tried to optimize compression for such relatively niche images — but at some point we should look into it, and make sure we consistently beat png on such images too
|
|
2024-05-13 06:22:42
|
QR codes aren't niche
|
|
|
|
JendaLinda
|
2024-05-13 06:41:42
|
Scanned documents are often 1 bit.
|
|
|
_wb_
|
2024-05-13 07:31:51
|
It's quite oldschool imo to scan in 1 bit, and not grayscale. Reminds me of fax machines. But yes, there is certainly legacy content like that, and some things still do that.
|
|
|
Meow
QR codes aren't niche
|
|
2024-05-13 07:32:42
|
A more efficient way to store a QR code is to store the actual content and generate the QR code from that if you need it visually 🙂
|
|
|
190n
|
2024-05-13 07:36:52
|
now could you express qr encoding logic in jxl_from_tree
|
|
|
jonnyawsom3
|
2024-05-13 07:49:50
|
I did try to make a QR code using splines, got about a quarter of the way before realising my insanity
|
|
|
|
veluca
|
2024-05-13 07:54:15
|
QR codes can be quite resilient
|
|
2024-05-13 07:56:00
|
my phone could scan that QR without any issues, despite it being hand-drawn
|
|
2024-05-13 07:56:06
|
(don't ask why)
|
|
|
TheBigBadBoy - 𝙸𝚛
|
2024-05-13 08:43:17
|
what about stable diffusion QR codes <:KekDog:805390049033191445>
https://cdn.arstechnica.net/wp-content/uploads/2023/06/qr_code_lady-800x450.jpg
|
|
2024-05-13 08:43:40
|
mine can hardly get it after a few tries
|
|
|
190n
|
2024-05-13 08:44:49
|
for me binary eye gets it very quickly but google camera has trouble
|
|
|
jonnyawsom3
|
|
TheBigBadBoy - 𝙸𝚛
what about stable diffusion QR codes <:KekDog:805390049033191445>
https://cdn.arstechnica.net/wp-content/uploads/2023/06/qr_code_lady-800x450.jpg
|
|
2024-05-13 08:51:28
|
WebP strikes again
|
|
|
TheBigBadBoy - 𝙸𝚛
|
|
|
JendaLinda
|
|
_wb_
It's quite oldschool imo to scan in 1 bit, and not grayscale. Reminds me of fax machines. But yes, there is certainly legacy content like that, and some things still do that.
|
|
2024-05-13 08:53:46
|
I've seen that more often than I like, unfortunately. 1 bit scans are usually hard to read.
|
|
|
spider-mario
|
|
veluca
QR codes can be quite resilient
|
|
2024-05-13 08:53:56
|
when covid certificates were still a thing, I tried to order temporary tattoos of mine
|
|
2024-05-13 08:54:06
|
sadly, once applied to the skin, it didn’t read properly
|
|
|
jonnyawsom3
|
2024-05-13 08:56:25
|
Should've gone with RFID implants clearly
|
|
|
TheBigBadBoy - 𝙸𝚛
what about stable diffusion QR codes <:KekDog:805390049033191445>
https://cdn.arstechnica.net/wp-content/uploads/2023/06/qr_code_lady-800x450.jpg
|
|
2024-05-13 08:58:07
|
Still 1 bit ;P
|
|
|
Demiurge
|
|
spider-mario
when covid certificates were still a thing, I tried to order temporary tattoos of mine
|
|
2024-05-14 09:02:42
|
lmao...
|
|
|
|
JendaLinda
|
2024-05-14 09:30:31
|
I would consider QR codes displayed using block characters in VGA text mode. Using half block characters, it's possible to display 80x50 pixels.
|
|
|
lonjil
|
2024-05-14 09:39:08
|
```
█████████████████████████████████
█████████████████████████████████
████ ▄▄▄▄▄ █ █▄▀▀▄▄█ ▄▄▄▄▄ ████
████ █ █ █ ▀▄ █▀ █ █ █ ████
████ █▄▄▄█ █▀██▀▀█▄▄▀█ █▄▄▄█ ████
████▄▄▄▄▄▄▄█▄▀▄█ █ ▀▄█▄▄▄▄▄▄▄████
████ ▄▀█ ▄▀█▀▀▄▀▀█▀▀██▄▀ ▄ ▄████
████▀▀▄ █▄▄▀ █▀ ▄█ ▄▄▀ ▀ ▄▀██████
████▀ ▀▄ ▄█▀▀▄█▄▀ ▀ ▄▄▄▀▀▄▄▄████
██████ ▀▄▀▀▀██▀██ ▀▄█ █ ▀▀████
████▄▄█▄▄▄▄█▀ █▄▀▀█▄ ▄▄▄ ▄▄▄█████
████ ▄▄▄▄▄ █▀█▀ ▄█ █ █▄█ ▀█▀████
████ █ █ █▄█▄█▄▀█▀ ▄ ▄█▄ ████
████ █▄▄▄█ █▀ ▄█▀██▀▄ ▄ ▀▄ █████
████▄▄▄▄▄▄▄█▄█▄▄▄▄▄▄██▄█▄█▄▄▄████
█████████████████████████████████
█████████████████████████████████
```
|
|
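For reference, a terminal QR like the one above takes only a few lines (this assumes the third-party `qrcode` package; dark modules become spaces so it reads correctly on a dark terminal):
```python
import qrcode

qr = qrcode.QRCode(border=2)
qr.add_data("https://example.com")
qr.make()
grid = qr.get_matrix()                      # rows of booleans, True = dark
if len(grid) % 2:
    grid.append([False] * len(grid[0]))     # pad to an even number of rows
glyph = {(0, 0): "█", (0, 1): "▀", (1, 0): "▄", (1, 1): " "}
for top, bot in zip(grid[::2], grid[1::2]): # one text row = two module rows
    print("".join(glyph[int(a), int(b)] for a, b in zip(top, bot)))
```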
|
Demiurge
|
2024-05-14 09:54:36
|
Is UTF8 considered a codepage? lol... or are codepages not referring to multi-byte encoding?
|
|
|
jonnyawsom3
|
|
lonjil
```
█████████████████████████████████
█████████████████████████████████
████ ▄▄▄▄▄ █ █▄▀▀▄▄█ ▄▄▄▄▄ ████
████ █ █ █ ▀▄ █▀ █ █ █ ████
████ █▄▄▄█ █▀██▀▀█▄▄▀█ █▄▄▄█ ████
████▄▄▄▄▄▄▄█▄▀▄█ █ ▀▄█▄▄▄▄▄▄▄████
████ ▄▀█ ▄▀█▀▀▄▀▀█▀▀██▄▀ ▄ ▄████
████▀▀▄ █▄▄▀ █▀ ▄█ ▄▄▀ ▀ ▄▀██████
████▀ ▀▄ ▄█▀▀▄█▄▀ ▀ ▄▄▄▀▀▄▄▄████
██████ ▀▄▀▀▀██▀██ ▀▄█ █ ▀▀████
████▄▄█▄▄▄▄█▀ █▄▀▀█▄ ▄▄▄ ▄▄▄█████
████ ▄▄▄▄▄ █▀█▀ ▄█ █ █▄█ ▀█▀████
████ █ █ █▄█▄█▄▀█▀ ▄ ▄█▄ ████
████ █▄▄▄█ █▀ ▄█▀██▀▄ ▄ ▀▄ █████
████▄▄▄▄▄▄▄█▄█▄▄▄▄▄▄██▄█▄█▄▄▄████
█████████████████████████████████
█████████████████████████████████
```
|
|
2024-05-14 10:25:32
|
Jesus christ, even after reading the block character message and thinking "Huh, good idea". I just spent a full minute tapping on your message thinking it was just an underexposed printout
|
|
|
lonjil
|
|
Meow
|
2024-05-14 11:16:11
|
I tried to scan it immediately
|
|
|
|
JendaLinda
|
|
Demiurge
Is UTF8 considered a codepage? lol... or are codepages not referring to multi-byte encoding?
|
|
2024-05-14 11:55:59
|
UTF-8 is not a codepage, UTF-8 is one of the methods to encode Unicode. Unicode is technically a codepage. The one codepage that rules them all.
|
|
|
yoochan
|
2024-05-14 11:57:18
|
a bit like TRON: https://en.wikipedia.org/wiki/TRON_(encoding) 😄
|
|
|
|
JendaLinda
|
2024-05-14 12:06:03
|
Another commonly used Unicode encoding is UTF-16. It's the same idea as UTF-8, but UTF-16 uses 16-bit words rather than bytes as its code units. Both UTF-8 and UTF-16 sequences encode the same Unicode code points.
|
|
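The relationship is easy to inspect from Python (an illustrative check): the same three code points in the three encodings discussed here, with three different byte counts.
```python
s = "A€😀"   # U+0041, U+20AC, U+1F600; the emoji needs a UTF-16 surrogate pair
print([hex(ord(c)) for c in s])
for enc in ("utf-8", "utf-16-le", "utf-32-le"):
    b = s.encode(enc)
    print(f"{enc:9} {len(b):2} bytes  {b.hex(' ')}")
```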
|
yoochan
|
2024-05-14 12:06:39
|
and UTF-32, rarely used
|
|
|
|
JendaLinda
|
2024-05-14 12:09:17
|
I'd say UTF-32 is actually the raw Unicode encoding, as it encodes the code points directly.
|
|
|
yoochan
|
2024-05-14 12:10:06
|
indeed, but the name exists...
|
|
|
|
JendaLinda
|
2024-05-14 12:12:09
|
It does, to specify the exact encoding, so there's no confusion.
|
|
|
spider-mario
|
2024-05-14 12:17:17
|
UTF-16 is the worst of both worlds
|
|
2024-05-14 12:17:52
|
more bloated than UTF-8 and byte-order-sensitive but without the constant-width property of UTF-32
|
|
2024-05-14 12:18:13
|
it originates from when it was thought that 16 bits would be enough to be constant-width (UCS-2)
|
|
2024-05-14 12:18:21
|
later on, “oops, actually not”
|
|
|
|
JendaLinda
|
2024-05-14 12:20:48
|
Microsoft also believed 64k code points would be enough for everybody, so they chose UTF-16 as the default Unicode encoding in Windows.
|
|
|
spider-mario
|
2024-05-14 12:21:22
|
https://utf8everywhere.org/#facts
|
|
2024-05-14 12:21:38
|
> UTF-16 is the worst of both worlds, being both variable length and too wide. It exists only for historical reasons and creates a lot of confusion. We hope that its usage will further decline.
|
|
|
yoochan
|
2024-05-14 12:22:17
|
Still waiting for the UTF4, nibble based
|
|
2024-05-14 12:23:33
|
and for Unicode to accept a "decimal separator" as a codepoint, to do away with the comma-vs-dot issues in numbers around the world and hence avoid WW3
|
|
|
spider-mario
|
2024-05-14 12:24:01
|
(arguably, even codepoints being a constant number of code units is overrated, since graphemes can consist of several codepoints anyway)
|
|
2024-05-14 12:24:15
|
this is e + combining ´: é
|
|
|
|
JendaLinda
|
2024-05-14 12:24:22
|
Others have chosen UTF-8 for its flawless backward compatibility with ASCII. So Microsoft is alone with UTF-16.
|
|
|
spider-mario
|
2024-05-14 12:25:12
|
```python
>>> [ord(c) for c in 'é']
[101, 769]
>>> [ord(c) for c in 'é']
[233]
```
|
|
|
|
JendaLinda
|
2024-05-14 12:26:53
|
It's not perfect. There are incremental additions and new ideas but the old stuff couldn't be changed.
|
|
|
spider-mario
|
2024-05-14 12:27:57
|
```shell
$ perl -Mutf8 -E 'binmode *STDOUT, ":encoding(UTF-8)"; while ("énervé" =~ /(\X)/g) { say "Grapheme: $1"; for my $c (split "", $1) { say " codepoint: ", ord($c) } }'
Grapheme: é
codepoint: 101
codepoint: 769
Grapheme: n
codepoint: 110
Grapheme: e
codepoint: 101
Grapheme: r
codepoint: 114
Grapheme: v
codepoint: 118
Grapheme: é
codepoint: 101
codepoint: 769
```
|
|
2024-05-14 12:29:19
|
```shell
$ perldoc perlre
[…]
\X [4] Match Unicode "eXtended grapheme cluster"
```
|
|
2024-05-14 12:29:58
|
```shell
$ perldoc perlrebackslash
[…]
\X This matches a Unicode *extended grapheme cluster*.
"\X" matches quite well what normal (non-Unicode-programmer) usage
would consider a single character. As an example, consider a G with
some sort of diacritic mark, such as an arrow. There is no such
single character in Unicode, but one can be composed by using a G
followed by a Unicode "COMBINING UPWARDS ARROW BELOW", and would be
displayed by Unicode-aware software as if it were a single
character.
The match is greedy and non-backtracking, so that the cluster is
never broken up into smaller components.
See also "\b{gcb}".
Mnemonic: e*X*tended Unicode character.
```
|
|
|
lonjil
|
|
spider-mario
it originates from when it was thought that 16 bits would be enough to be constant-width (UCS-2)
|
|
2024-05-14 12:31:33
|
Note that the international community insisted that 16 bit wasn't enough, but early Unicode, composed entirely of American tech companies, thought they knew better.
|
|
2024-05-14 12:32:21
|
And it only took like, a year after Unicode 1.0 for the mistake to become a problem.
|
|
|
spider-mario
|
2024-05-14 12:46:59
|
I love UTF-8, it’s a work of beauty
|
|
2024-05-14 12:47:06
|
UTF-32 is, eh, very niche, but “why not”
|
|
2024-05-14 12:47:16
|
UTF-16 is a nope from me
|
|
2024-05-14 12:47:31
|
pointless, no pun intended
|
|
|
lonjil
|
2024-05-14 12:49:43
|
I think ISO wanted 32 bits
|
|
|
|
JendaLinda
|
2024-05-14 12:50:39
|
Windows uses three character encodings at once, for different purposes.
1) UTF-16, so-called "UNICODE", the default for the OS and applications compiled with Unicode support.
2) Legacy 8 bit Windows encoding, depending on the country and language settings, so-called "ANSI" encoding, used by legacy graphical Windows applications without Unicode support, usually applications written for Win9x.
3) 8 bit MS-DOS encoding, also depending on the country and language settings, so-called "OEM" encoding, used by console applications that don't support the quirky Windows Unicode implementation. The Windows terminal itself supports Unicode, but programs will use the MS-DOS encoding by default; this includes cjxl, djxl and other tools.
|
|
|
spider-mario
|
2024-05-14 01:06:38
|
oh, I thought we’d be using ANSI by default
|
|
2024-05-14 01:07:34
|
wasn’t there a PR that did the manifest thing that makes ANSI=UTF-8? does it not do what was hoped of it, then?
|
|
2024-05-14 01:08:28
|
why is MS-DOS even still relevant at all? Windows has dropped NTVDM
|
|
|
|
JendaLinda
|
2024-05-14 01:30:22
|
MS-DOS encoding was the default in the MS-DOS command line in Win9x and there were also 32 bit "console applications" running in Windows 9x so these had to use the MS-DOS codepage as well.
|
|
2024-05-14 01:35:18
|
Curiously, batch/cmd scripts are also assumed to be encoded in MS-DOS codepage by default.
|
|
2024-05-15 01:07:55
|
I wonder if any application actually used the default VGA palette.
|
|
2024-05-15 01:09:30
|
It seems to be based on the HSL model and has poor coverage of the RGB color space, so it's not very useful. The default 256-color palette was there as a placeholder because everybody was using custom palettes anyway.
|
|
|
_wb_
|
2024-05-15 02:05:02
|
Can someone explain why so many color spaces have a transfer function that has a linear segment near black? sRGB, Rec709, Rec2020, ProPhoto: they all do that. What is the purpose of this? I've seen many vague references to "near black issues" but I still don't understand what the problem is and how this solves it.
|
|
|
lonjil
|
2024-05-15 02:08:53
|
I assume it has something to do with the derivative being 0 at 0
|
|
|
|
afed
|
2024-05-15 02:08:57
|
maybe for that
<https://poynton.ca/notes/colour_and_gamma/GammaFAQ.html>
|
|
|
_wb_
|
2024-05-15 02:10:02
|
There are plenty of other color spaces / transfer functions (Adobe98, PQ, HLG, DCI-P3) that don't have such a linear segment near black, so I don't think there's a math reason for it
|
|
2024-05-15 02:11:07
|
"minimizes the effect of sensor noise"? I don't understand what this means.
|
|
|
|
afed
|
|
afed
maybe for that
<https://poynton.ca/notes/colour_and_gamma/GammaFAQ.html>
|
|
2024-05-15 02:13:37
|
|
|
|
_wb_
|
2024-05-15 02:15:14
|
When I select "ProPhoto" in photoshop, it uses something with a pure gamma curve, while the official ProPhoto definition does have a linear segment near black. Maybe it doesn't matter enough for Adobe to care about it.
|
|
2024-05-15 02:18:32
|
It's curious that all the "old" transfer functions have a linear segment near black, and then at some point this was no longer fashionable — any "new" transfer functions don't do such special near-black segments anymore.
|
|
|
|
JendaLinda
|
2024-05-15 05:02:02
|
I guess IBM didn't take any of that into consideration when they designed VGA.
|
|
|
kkourin
|
|
_wb_
"minimizes the effect of sensor noise"? I don't understand what this means.
|
|
2024-05-15 05:23:32
|
I guess the idea is to dedicate less bits to the noise floor area?
|
|
|
Quackdoc
|
|
_wb_
Can someone explain why so many color spaces have a transfer function that has a linear segment near black? sRGB, Rec709, Rec2020, ProPhoto: they all do that. What is the purpose of this? I've seen many vague references to "near black issues" but I still don't understand what the problem is and how this solves it.
|
|
2024-05-15 09:13:33
|
it's a retro detail from when CRTs needed built-in flare compensation and stuff
|
|
2024-05-15 09:13:50
|
it's outdated and should have died long ago, but people refuse to give it up
|
|
2024-05-15 10:07:21
|
actually, correction: it turns out it has different intents depending on the specification? I found a comment from Jack Holm stating this <@456226577798135808>
> The straight line part of the sRGB EOTF was to avoid extreme slope, which caused problems in some color management systems. It also partly addresses the difference between the expected black point of 0.1 nit for 1886 and 0.2 nit for sRGB (with more veiling glare because of the higher ambient).
https://community.acescentral.com/t/srgb-piece-wise-eotf-vs-pure-gamma/4024/27
|
|
2024-05-15 10:07:38
|
ah man <@794205442175402004> meant to tag you instead
|
|
2024-05-15 10:08:32
|
so it handles the glare partially, but also has other uses
|
|
|
spider-mario
|
2024-05-15 10:48:39
|
glare, the bane of my existence
|
|
|
dogelition
|
2024-05-15 11:21:28
|
BT.709 and BT.2020 are kinda weird in that they used to not specify an EOTF at all, and now both refer to BT.1886, which is essentially a pure 2.4 gamma EOTF on a reference monitor
so the piecewise transfer function (OETF) specified in BT.709/BT.2020 only applies to cameras and is irrelevant for how content should be displayed, i.e. areas like display calibration and color management
|
|
|
_wb_
|
2024-05-16 07:30:12
|
I now think the linear part is there just so you can do conversions from linear to sRGB and back in limited precision, fixed-point arithmetic (e.g. uint16), without running into trouble. If you make it a pure gamma curve, you cannot do that accurately in fixed-point arithmetic. In float32 this all doesn't matter since that has plenty of precision, or even float16 is probably OK (the important thing is to have floating point and not fixed point, since you need more precision near zero), but in uint16 I guess it does make a difference, and I suspect much of the early CMS implementations couldn't afford floats.
Since the displays and viewing conditions sRGB was designed for had substantial amounts of glare anyway, I guess they tried to kill two birds with one stone and defined things so that when viewing an sRGB image on a display that actually renders the pixels as gamma 2.2, it results in some glare compensation.
|
|
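For concreteness, here is the sRGB pair being discussed, with its standard constants; the point is the finite 12.92 slope at zero, where a pure gamma curve has an unbounded one:
```python
def srgb_encode(x: float) -> float:          # linear light -> sRGB value
    return 12.92 * x if x <= 0.0031308 else 1.055 * x ** (1 / 2.4) - 0.055

def srgb_decode(v: float) -> float:          # sRGB value -> linear light
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

eps = 1e-6
print("sRGB slope near 0:      ", srgb_encode(eps) / eps)   # 12.92
print("pure gamma slope near 0:", eps ** (1 / 2.2) / eps)   # ~1870, blows up
print("round trip of 0.5:      ", srgb_decode(srgb_encode(0.5)))
```
The bounded slope is what keeps fixed-point round trips near black well-conditioned, which matches the "extreme slope caused problems in some color management systems" quote above.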
|
Quackdoc
|
2024-05-16 07:47:20
|
yeah, the thread was quite... hehe illuminating
|
|
|
jonnyawsom3
|
2024-05-16 10:02:32
|
Mentioning float32 reminded me
https://github.com/libjxl/libjxl/issues/3511
|
|
|
Jyrki Alakuijala
|
|
_wb_
Can someone explain why so many color spaces have a transfer function that has a linear segment near black? sRGB, Rec709, Rec2020, ProPhoto: they all do that. What is the purpose of this? I've seen many vague references to "near black issues" but I still don't understand what the problem is and how this solves it.
|
|
2024-05-16 01:46:55
|
spontaneous isomerization of opsin chemicals creates a linear response at very low intensity values in the eye -- I used biasing of the log-function for this (in butteraugli xyb)
|
|
2024-05-16 01:48:29
|
or not quite sure if it is called isomerization -- spontaneous excitation nonetheless
|
|
|
KKT
|
2024-05-16 11:24:30
|
This may seem slightly off-topic, but bear with me – I need the hive mind. If you were creating a website that was open for the community to edit, what would you do it in? Any frontend devs that use tools or frameworks that would make sense in this context? Straight up HTML & CSS would be a bit painful if the site becomes at all complex… Gohugo? Tailwind?
|
|
|
_wb_
|
2024-05-17 05:12:34
|
This is very much on-topic since the question is relevant for jpegxl.info too 😉
|
|
|
yoochan
|
|
KKT
This may seem slightly off-topic, but bear with me – I need the hive mind. If you were creating a website that was open for the community to edit, what would you do it in? Any frontend devs that use tools or frameworks that would make sense in this context? Straight up HTML & CSS would be a bit painful if the site becomes at all complex… Gohugo? Tailwind?
|
|
2024-05-17 05:25:25
|
Where the content is open to edit? Wiki like?
|
|
2024-05-17 05:28:51
|
Jxlinfo content is managed as a git repository if I remember correctly, which delegates the collaborative coordination to a tool made for it
|
|
|
_wb_
|
2024-05-17 05:40:41
|
It might be nice to put some static page generation thing between the editable source and the final html. So we can have things like multiple pages with nice menus and all, without having to manually keep it all consistent.
|
|
2024-05-17 05:43:25
|
That can be done in GitHub too, just have to do the page regeneration as a GitHub Action or something. I imagine there are several frameworks out there to make that easy, but I wouldn't know which ones are the nicest / most convenient
|
|
|
yoochan
|
2024-05-17 06:00:45
|
We use(d) hugo at work. It requires users to be familiar with the geeky side of editing (blindly), and committing or serving locally.
|
|
2024-05-17 06:05:22
|
The editor experience was not very smooth, and a correct end-user experience required a minimal setup
|
|
|
|
veluca
|
2024-05-17 06:10:00
|
I've been using zola for a similar purpose (I'm even adding jxl support to it :P)
|
|
|
yoochan
|
2024-05-17 06:12:48
|
Emile Zola, Victor Hugo... I see a pattern here
|
|
|
spider-mario
|
2024-05-17 07:46:40
|
I use hugo for https://sami.boo/; it's all right
|
|
|
KKT
|
2024-05-17 08:33:31
|
K, we'll investigate Hugo a bit more since it came up in our independent research as well. Thanks for the input.
|
|
|
spider-mario
|
2024-05-19 07:30:46
|
https://www.tomshardware.com/software/windows/enthusiast-gets-windows-xp-running-on-an-i486
|
|
|
|
JendaLinda
|
2024-05-19 09:59:50
|
So they replaced all the opcodes unsupported by the 486 in the binaries. Interesting, I thought it would be easier to make some hybrid of Win2000 and WinXP. Win2000 can run on a 486 quite well, it's just very slow.
|
|
|
a goat
|
2024-05-21 07:45:26
|
What would be a good metric for determining how distracting the artifacts are in a transformed video compared to the source? While I am somewhat concerned about perceptible similarity to the source, I'm more concerned about separating out the most visible artifacts and measuring their ability to add motion specific noise
|
|
2024-05-21 07:50:59
|
Specifically, I'm quantizing colors and I'm trying to give a number to the degree to which samples have more obvious temporal errors like flickering or improperly moving banding
|
|
|
_wb_
|
2024-05-22 10:31:31
|
What is the most recent "important" software/platform that doesn't yet support icc v4 but only icc v2?
|
|
|
HCrikki
|
2024-05-22 11:19:13
|
since v104 firefox seems to misrepresent its icc v4 support, basing its claim on an obsolete or flawed test it never re-evaluated
|
|
2024-05-22 11:21:50
|
on windows, what's most common is that image viewers have color management disabled out of the box as a legacy decision that has not been changed since
|
|
|
username
|
2024-05-22 11:24:18
|
doesn't Firefox's v4 icc support work in most cases though? like I know there's a v4 test that Firefox fails, but I don't think I have seen any images that naturally have a v4 profile that fail
|
|
2024-05-22 11:25:46
|
also wb's question is about what supports color management with icc v2 but not icc v4
|
|
|
HCrikki
|
2024-05-22 11:26:47
|
full support is necessary, partial is pointless if not WIP
|
|
2024-05-22 11:27:32
|
images encoded with xyb using jpegli show severe color distortion on firefox. almost everything that renders using webkit or blink shows correct colors, while software based on firefox or gecko fails (including derivatives like waterfox and floorp). clearly some deficiency here in need of admitting and fixing - proper color management is necessary for HDR
|
|
2024-05-22 11:30:20
|
*Servo* is working towards implementation but still lagging btw - last checked a month ago
|
|
2024-05-22 11:37:14
|
safari for windows (nightly snapshot from may) has the full support firefox lacks, despite no longer being released there
|
|
|
Meow
|
|
HCrikki
safari for windows (nightly snapshot from may) has full support firefox lacks despite no longer released there
|
|
2024-05-22 11:40:19
|
Wasn't it discontinued long ago?
|
|
|
HCrikki
|
2024-05-22 11:42:05
|
extract 2 zip archives in the same folder and you can use it to test sites without a mac. supports jxl btw
h**ps://dev.to/dustinbrett/running-the-latest-safari-webkit-on-windows-33pb
|
|
|
lonjil
|
2024-05-22 11:57:17
|
I don't think Safari supports ICCv4 fully either
|
|
|
Nyao-chan
|
|
JendaLinda
It's fun to create image files so small so they fit inside the NTFS file entry itself.
|
|
2024-05-22 12:10:13
|
reminds me of https://github.com/nanochess/bootRogue
510 byte roguelike that fits in a boot sector
|
|
|
|
JendaLinda
|
|
Nyao-chan
reminds me of https://github.com/nanochess/bootRogue
510 byte roguelike that fits in a boot sector
|
|
2024-05-22 01:20:24
|
Creating a working game that fits in such a tiny space is a great achievement. There is also a demoscene around tiny executables.
|
|
|
jonnyawsom3
|
2024-05-22 01:27:06
|
Good ol' Kkrieger
|
|
|
w
|
2024-05-22 09:39:06
|
everyone hates iccv4 anyway
|
|
2024-05-22 09:39:21
|
it's worse than iccv2
|
|
|
_wb_
|
2024-05-23 07:53:17
|
how so? It's significantly more compact (and correct!) if you can e.g. express the sRGB transfer function as a parametric function instead of having to use a tabulated approximation like in iccv2
|
|
|
w
|
2024-05-23 08:35:39
|
argyllcms guy hates iccv4 so I also hate it
|
|
|
_wb_
|
2024-05-23 09:27:15
|
you're not confusing it with iccv5, are you?
|
|
|
spider-mario
|
2024-05-23 09:28:40
|
as in iccMAX?
|
|
2024-05-23 09:28:42
|
(https://discuss.pixls.us/t/wayland-color-management/10804/304 )
|
|
|
Quackdoc
|
|
spider-mario
(https://discuss.pixls.us/t/wayland-color-management/10804/304 )
|
|
2024-05-23 09:34:32
|
this thread... this has given me an immense headache
|
|
|
w
|
2024-05-23 09:38:16
|
https://hub.displaycal.net/forums/topic/how-to-create-icc-v-4-files/
|
|
2024-05-23 09:38:18
|
https://www.argyllcms.com/doc/iccgamutmapping.html
|
|
2024-05-23 09:42:23
|
the same Florian and Gill
|
|
2024-05-23 09:46:34
|
but it sounds like neither iccmax nor iccv4 is needed
|
|
|
Quackdoc
|
2024-05-23 09:56:50
|
I don't know the specifics of what ICCMax adds over iccv4, but I know a lot of people complain that iccv4 is wholly unsuitable for HDR content. This is a quote from Troy Sobotka, author of The Hitchhiker's Guide to Digital Colour, specifically in regards to Wayland compositors, but the point still stands
> _If_ one is serious about fixing it, the wiser approach would be to use the non-domain bound ICCMax tags.
> And in the CMS, which would ideally be written from scratch for display work, leverage the appropriate ICCMax tags.
> At least if I were tasked with designing an appropriate path, that’s where I would certainly start. curv and para tags are explicitly domain bound in ICC, and scaling or other hacks do not seem prudent.
> And it would help unfuck the glaring fuckup in V4 regarding achromatic axis.
|
|
|
w
|
2024-05-23 09:59:30
|
hdr is fine in icc v2 🤷
|
|
|
Quackdoc
|
2024-05-23 10:00:48
|
evidently not
|
|
2024-05-23 10:01:10
|
I've seen many people complain about it, and it's pretty much never used in any mastering workflow
|
|
|
yoochan
|
2024-05-23 01:30:52
|
If I wanted to output the minimal size at which I can truncate a file and still have a full 1/8th preview, would it be complex? Would it work regardless of the encoding process? The container? The use of modular? Lossless? Lossy?
|
|
|
_wb_
|
2024-05-23 03:49:13
|
Default lossless is without progressive preview. You can have lossless + progressive, but it comes at a cost in compression. For lossy, both vardct and default lossy modular are progressive. The API doesn't have a function yet to tell you what offset you'd need (maybe we should add such a function!) — you can figure it out manually via the libjxl API by doing a progressive decode and recording the bytes read at every pass, but that's far from ideal/efficient.
|
|
|
yoochan
|
|
_wb_
Default lossless is without progressive preview. You can have lossless + progressive, but it comes at a cost in compression. For lossy, both vardct and default lossy modular are progressive. The API doesn't have a function yet to tell you what offset you'd need (maybe we should add such a function!) — you can figure it out manually via the libjxl API by doing a progressive decode and recording the bytes read at every pass, but that's far from ideal/efficient.
|
|
2024-05-23 06:59:06
|
Thank you. What would happen if I progressively load a lossless file without preview? Will it fill the frame from top to bottom or will I have passes?
|
|
2024-05-23 06:59:13
|
I'll test that
|
|
|
_wb_
|
2024-05-23 07:26:42
|
You will get it one group at a time
|
|
|
Quackdoc
|
2024-05-23 07:39:02
|
an example
|
|
|
yoochan
|
2024-05-23 07:40:49
|
neat! thank you!
|
|
|
a goat
|
2024-05-24 10:11:09
|
<@794205442175402004> What would be the better format for compressing lossless (or near lossless) floating point elevation data stored inside of a large number of images: WebP or JXL? I know the standard is TIFF for this sort of thing, but I'm still curious. I know the container for JXL is pretty small, which makes me lean towards it, but people speak highly of lossless WebP as well
|
|
|
_wb_
|
2024-05-24 11:08:07
|
Lossless webp is not an option, it can only do uint8
|
|
|
a goat
|
2024-05-25 12:19:23
|
Ah gotcha
|
|
|
jjrv
|
2024-05-25 06:17:29
|
Many web mapping tools use PNG and store up to 24 bits of integer elevation in RGB channels. You can do that in webp too. I think jxl is better though. The RGB channel abuse won't compress too well.
|
|
|
yoochan
|
|
jjrv
Many web mapping tools use PNG and store up to 24 bits of integer elevation in RGB channels. You can do that in webp too. I think jxl is better though. The RGB channel abuse won't compress too well.
|
|
2024-05-25 09:24:38
|
interesting! with an additional 8-bit alpha channel you could encode 32-bit floats... directly... seems messy as hell, I have to try
|
|
|
jjrv
|
2024-05-25 09:44:10
|
I recommend JXL because it has lossy and lossless floating point, so you don't have to do a transformation first that hurts compression.
BUT if you want to really dive into this, it's possible to store altitude, temperature or other mostly smooth things as an integer modulo 256. Then when decompressing, if the (absolute) difference between neighbors is over +-128, assume it's actually less than that and the more significant bit dropped by the modulo got flipped. Then if the gradient for some reason actually is that large or more, create a second image with only such problematic parts, or drop precision and signal that by adding chroma to an otherwise grayscale image.
I got pretty deep into that stuff, you can do lossy or lossless compression of floating point data even using video codecs. But in my experience JXL finally just made that pain go away.
|
|
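A one-dimensional sketch of that modulo trick (my own minimal version; it assumes neighboring samples differ by less than half the wrap range):
```python
import numpy as np

elev = np.array([900, 940, 1010, 1100, 1180, 1230])  # meters, smooth terrain
stored = elev % 256                                   # the 8-bit image payload

recon = stored.astype(int)
recon[0] = elev[0]                  # absolute anchor kept as image metadata
for i in range(1, len(recon)):
    # choose the representative of stored[i] closest to the previous sample
    k = int(round((recon[i - 1] - stored[i]) / 256))
    recon[i] = stored[i] + 256 * k

print(recon)                        # [ 900  940 1010 1100 1180 1230]
```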
2024-05-25 09:48:31
|
If you take, let's say, LiDAR altitudes and actually use 24 bits (millimeter precision), you will have one color channel of pure white noise and it will hardly compress at all. JXL understands it's a single number with more bits and is able to do something with it, though it'd be better to round off the noise or compress lossy.
|
|
|
Jyrki Alakuijala
|
|
_wb_
Lossless webp is not an option, it can only do uint8
|
|
2024-05-25 11:04:00
|
You could store a 32 bit floating point number in argb (4x 8 bits) and store with --exact so that a=0 doesn't mess up things.
|
|
|
_wb_
|
2024-05-25 11:06:30
|
Sure but a standard webp will show something weird when displaying such an image. You can do the same with PNG.
|
|
|
|
JendaLinda
|
2024-05-25 05:12:13
|
Instead of hacking image formats, it would be better to just use pfm and compress it using any file archiver.
|
|
|
jjrv
|
2024-05-26 07:51:07
|
Using image formats as a generic archiver like that makes no sense, but using a more suitable input transformation or image format saves a lot of space. Like for elevations on this planet at city-wide scale or larger, a single 16-bit integer grayscale channel is plenty. If you include Everest and the Mariana Trench, that's still 33cm of vertical precision. If you focus on just the land or the ocean and add 8 more bits, it's difficult for those bits to be anything other than noise.
As mentioned, you can go with just 8 bits storing height modulo 256 as long as neighboring elevations differ by under 42 meters, and treat the exceptions as special cases somehow. The top-left corner's full elevation is then image metadata. Why not store deltas (gradients) instead? Because you can't compress those lossily at all, and otherwise they have all the same problems (with lossy you do need to decrease that 42 m threshold to account for compression artifacts).
|
|
2024-05-26 08:19:00
|
There's so much more potential in jxl tooling. ECMWF distributes the world's leading weather forecasts as JPEG 2000, a nightmare format that is slow to decompress. Each forecast has dozens of things like temperature, wind u/v, humidity... Each is stored as a separate single-channel image. But everything is available for over 20 pressure levels and over 100 time steps — essentially a 4D cube. If you could put either time steps or pressure levels into different channels of the same image and take advantage of their correlation, that would surely save a lot of space. And pretty much any weather forecast you see anywhere had to transfer a lot of that data as part of the pipeline of getting it from ECMWF or NOAA to your eyes.
ESA / NOAA multispectral satellite imagery can be processed into a 4D cube as well, with about 20 spectral channels (unlike the 3 of RGB) and any number of time steps.
There are specialist formats for that kind of 4D scientific data, like ZFP. But they cannot compete even with JPEG 2000. ECMWF didn't just pick that one randomly. ESA also uses it for Sentinel 2.
|
|
2024-05-26 08:21:41
|
JPEG XL would deserve to become the standard format for earth observation, climate science, etc.
It already compresses a bit smaller and way faster than JPEG 2000. It could also compress way smaller if multiple channels fit in a single file more efficiently. Way faster way smaller might be enough of a combo to overcome some inertia and get it adopted
|
|
2024-05-26 09:12:40
|
We're talking a terabyte a day of compressed data transferred and archived in operational use.
I'm sure ICEYE could produce a petabyte a day of meaningful data if it were economical to store and transfer, so as speeds and capacities increase compression will always remain relevant.
|
|
|
CrushedAsian255
|
2024-05-28 11:00:22
|
does JPEG XL support 4D image data?
|
|
|
_wb_
|
2024-05-28 11:44:49
|
Not really. It supports 2D images. If you consider multi-frame as another dimension, that gives you 3D. If you consider multi-channel as yet another dimension, you could call it 4D, but the number of channels is limited to 4K. But the main compression/decorrelation/prediction is happening for the two image dimensions, since in the end everything is stored in a planar way. There are some options to decorrelate frames (e.g. kAdd blend mode, cropped frames) and to decorrelate channels (color transforms), but the current libjxl encoder is mostly not using those except for decorrelating the first three channels (RGB).
|
|
|
jjrv
|
2024-05-28 01:14:19
|
And even in its current state, where you can take advantage of the correlation between 3 values by stashing them into RGB channels and otherwise use separate image files, it already hands-down beats everything else out there for the use cases mentioned. So, I'm extremely happy with it.
|
|
|
Demiurge
|
|
jjrv
JPEG XL would deserve to become the standard format for earth observation, climate science, etc.
It already compresses a bit smaller and way faster than JPEG 2000. It could also compress way smaller if multiple channels fit in a single file more efficiently. Way faster way smaller might be enough of a combo to overcome some inertia and get it adopted
|
|
2024-05-28 11:46:00
|
The problem is the current decoder will run out of memory and crash, and I believe it also lacks an ROI decoding API and resize-on-load
|
|
2024-05-28 11:47:33
|
Also, instead of backing out and returning an error, it just executes an illegal instruction and forces the caller to abort, so you can't recover from errors
|
|
2024-05-28 11:47:48
|
:(
|
|
|
yoochan
|
2024-05-29 05:42:15
|
the current libjxl implementation lacks features and may have bugs today. That doesn't contradict the fact that it deserves to be a standard format for science
|
|
|
jjrv
|
2024-05-29 05:40:38
|
<@794205442175402004> I tried with latest release and then compiled cjxl from your pam_extrachans branch but it keeps throwing Invalid float description. Maybe it thinks the extra channels should be floats?
|
|
2024-05-29 05:43:33
|
Tried dropping the channel count:
```
P7
WIDTH 4865
HEIGHT 4091
DEPTH 9
MAXVAL 65535
TUPLTYPE RGB
TUPLTYPE Optional
TUPLTYPE Optional
TUPLTYPE Optional
TUPLTYPE Optional
TUPLTYPE Optional
TUPLTYPE Optional
ENDHDR
```
And running:
`cjxl radiance.pam radiance.jxl -d 0 -e 9 -E 5`
Prints:
```
JPEG XL encoder v0.10.2 f7e65ca1 [AVX2,SSE4,SSE2]
Encoding [Modular, lossless, effort: 9]
./lib/jxl/encode.cc:591: Invalid float description
JxlEncoderSetExtraChannelInfo() failed.
EncodeImageJXL() failed.
```
Tested that 3 channels does work with that header format and current pipeline.
|
|
2024-05-29 05:52:56
|
Huh, switched those from Optional to Thermal (that immediately failed with 21 channels but now 12 didn't) and now it's doing something. Let's see what happens, then.
|
|
2024-05-29 05:56:20
|
Inputs were Oa01_radiance.jxl - Oa10_radiance.jxl, total 214058510 bytes. Output is:
https://reakt.io/jxl/radiance.jxl
177305253 bytes. Not bad at all.
OK, maybe when I tested with 21 channels, it was with the release and not the branch, because now that did something too:
https://reakt.io/jxl/radiance-21.jxl
312569879 bytes (down from 381512264 in 7 .jxl files so 18% less)
Excellent result, just that it took 2 minutes on 48 cores and I've got to run it on 200k images so it would be great to re-use the MA tree and use a lower effort? OTOH I guess I *can* run it like this, since it's less than a year of constant crunching, not like decades.
I guess I'd like to take a random sampling of up to 1000 images, optimize the compression on those, and then use those settings for the whole archive?
|
|
|
DZgas Ж
|
2024-05-29 05:57:49
|
bruh
|
|
2024-05-29 06:06:58
|
my small project of converting images to RGB Place (1500 images with 1% damping)
|
|
2024-05-29 06:10:43
|
count of pixels
|
|
|
monad
|
|
jjrv
Inputs were Oa01_radiance.jxl - Oa10_radiance.jxl, total 214058510 bytes. Output is:
https://reakt.io/jxl/radiance.jxl
177305253 bytes. Not bad at all.
OK, maybe when I tested with 21 channels, it was with the release and not the branch, because now that did something too:
https://reakt.io/jxl/radiance-21.jxl
312569879 bytes (down from 381512264 in 7 .jxl files so 18% less)
Excellent result, just that it took 2 minutes on 48 cores and I've got to run it on 200k images so it would be great to re-use the MA tree and use a lower effort? OTOH I guess I *can* run it like this, since it's less than a year of constant crunching, not like decades.
I guess I'd like to take a random sampling of up to 1000 images, optimize the compression on those, and then use those settings for the whole archive?
|
|
2024-05-29 11:02:00
|
Not sure how e9 could possibly be worth the compute. e7 with g3 incorporated is probably way more efficient.
|
|
|
jjrv
|
2024-05-30 07:33:16
|
Definitely don't want to use e9 but that's what wb wanted me to use first, to test compressing 21 channels.
|
|
2024-05-30 07:40:31
|
How is the quality / distance value unit and channel type related anyway? What is the meaning of a depth vs thermal vs something else channel in terms of compressing arbitrary data?
I've been compressing thermal data as RGB to get 3 time steps in a single image. `uses_original_profile` and `JXL_TRANSFER_FUNCTION_LINEAR` seem to affect what the Butteraugli distance does. I don't even care about Butteraugli in this context, would prefer some measure related to the input numbers. So if my input is in Kelvin, what kind of relative errors can I expect with different distances?
I know this format isn't meant for science, but it's an unreasonably good fit.
Latest confusion: 2 RGB images compressed lossy total about 400 kB. When compressing losslessly, there were nice further savings from combining all the frames into a single 6-channel image. Now when testing lossy, the result bloated to 2 megabytes, 5 times the expected size. Maybe these aren't the same unit anymore, when there are RGB channels and the extra channels are thermal?
```
JxlEncoderSetFrameDistance(frame, 0.05)
JxlEncoderSetExtraChannelDistance(frame, num, 0.05)
```
Or maybe I screwed up something else 😬
|
|
2024-05-30 07:44:51
|
For reference, I'm working on this:
https://reakt.io/temp/
The globe spins. For me JPEG XL is the only practical way to get that thing working at all. I've been mucking around with ffmpeg and video codecs, trying to get temperatures from -80 to 60 Celsius encoded at 0.1 Celsius resolution, then decoded in a web app. It has been a nightmare, and now, 3 days after picking up JPEG XL, suddenly it works.
|
|
2024-05-30 07:53:44
|
I want it to also show optical imagery, 300m resolution globally and select areas (ones in the news lately) at 10m resolution, day per frame over the past several years. If I win the lottery, then 10m globally. So that's a whole different dataset to also compress, the 21 channel case is related to that side, because the optical imagery is way more than just RGB. I'll use JPEG XL for lossless archival and probably have to re-visit the latest video codecs every couple years.
|
|
|
_wb_
|
2024-05-30 08:35:24
|
When doing lossy, the RGB channels are treated differently: by default they're converted to XYB and encoded with VarDCT — which is good if it's visual data intended for human viewing. VarDCT is hardcoded for 3 channels; any additional channels always use Modular.
|
|
2024-05-30 08:37:23
|
When doing modular lossy, very different coding tools are used than when doing the usual vardct lossy, so results in both quality and size can be quite different.
|
|
2024-05-30 08:41:57
|
For lossy, I expect no real benefit from putting all 21 channels together, since they will be compressed more or less independently (prevchannel context does not work well in combination with squeeze)
|
|
|
yoochan
|
2024-05-30 08:47:00
|
what is the reason behind the choice to forbid varDCT for all other channels ?
|
|
|
_wb_
|
2024-05-30 09:31:31
|
Only having to deal with the case of 3 channels makes it easier to have an efficient implementation, and we assumed that DCT-based lossy compression mostly makes sense for perceptual image data, not so much for other channels like Alpha or the K of CMYK. It was a design trade-off between keeping things simple conceptually and in terms of implementation (generalizing varDCT to an arbitrary number of channels would complicate quite a few things, like chroma from luma etc) on the one hand, and having an expressive / feature-rich image codec on the other hand (having a varDCT option would give an encoder more possibilities for extra channel encoding).
|
|
|
Oleksii Matiash
|
2024-05-30 09:42:36
|
Well, this is the only thing in jxl that I can't really agree with. For me it would be much more logical to allow any channel beyond the first 3 to be encoded either with modular or vardct, just without the special tricks used for the first 3 channels in vardct mode. Probably I'm missing something critical that led to this decision
|
|
|
_wb_
|
2024-05-30 09:57:09
|
It wouldn't be impossible to define it like that, but it would complicate quite a few things.
|
|
2024-05-30 10:00:51
|
One other way to represent 21 channels in a single image is to do it as an RGB image with 7 layers/frames. Frames are encoded mostly independently, but you can use the kAdd blend mode and subtract one frame from another frame, which can be beneficial, especially for lossy compression, if the data is similar.
|
|
|
Oleksii Matiash
|
2024-05-30 10:13:25
|
Yes, I know that it is possible to work around this limitation, I just believe the limitation is not very clever. For 99.99% of cases, yes, 3 channels is enough, but disallowing it entirely... I'd rather split it into 'levels': "first 3 channels vardct only" for base-level encoders/decoders, which would be enough for almost everything, but if you need to encode/decode something more complex, you'd be allowed to do it
|
|
2024-05-30 10:13:46
|
But it's just thoughts
|
|
|
Tirr
|
2024-05-30 10:20:16
|
I guess it might cause decoder fragmentation. As the author of jxl-oxide I can see such features would complicate the decoder in various ways, and if it were an optional feature I'd rather defer implementing it, if not skip it altogether
|
|
|
Oleksii Matiash
|
2024-05-30 10:51:00
|
Yes, I agree, sure. And I'd personally never use more than 3 channels, it's just thoughts about ideal world 🙂
|
|
|
_wb_
|
2024-05-30 12:59:10
|
In the old JPEG, you can have between 1 and 4 channels. In the JXL design, we could have done something similar (but with a higher maximum number of channels), in fact that's what I did in FUIF (which evolved into JXL's Modular mode): everything is defined for an arbitrary number of channels, and everything is encoded channel per channel anyway so it does not complicate implementations.
In PIK (which evolved into JXL's VarDCT mode) a different choice was made: they had more of a focus on enc/dec speed, and everything is hardcoded for exactly 3 channels (even grayscale is done as 3 channels where the two chroma channels just happen to be all-zeroes), so everything can be specialized for the 3-channel case, e.g. making it possible to do everything from inverse DCT to XYB-to-RGB in one go. Also all signaling of things like quant tables, filter weights, etc is made simpler and can have nice default values by assuming 3 channels.
|
|
|
yoochan
|
2024-05-30 01:10:07
|
interesting 👍
|
|
2024-05-30 01:11:34
|
it sounds like the varDCT decode function is not even implemented once in a single layer fashion
|
|
|
jjrv
|
2024-05-30 05:17:37
|
Very good to know! Then I'll plan to always use 3 channels when compressing lossy, and more only when lossless. It sounds like being able to re-use an MA tree is the main potential new thing on the roadmap relevant to this area.
|
|
|
_wb_
|
|
jjrv
Inputs were Oa01_radiance.jxl - Oa10_radiance.jxl, total 214058510 bytes. Output is:
https://reakt.io/jxl/radiance.jxl
177305253 bytes. Not bad at all.
OK, maybe when I tested with 21 channels, it was with the release and not the branch, because now that did something too:
https://reakt.io/jxl/radiance-21.jxl
312569879 bytes (down from 381512264 in 7 .jxl files so 18% less)
Excellent result, just that it took 2 minutes on 48 cores and I've got to run it on 200k images so it would be great to re-use the MA tree and use a lower effort? OTOH I guess I *can* run it like this, since it's less than a year of constant crunching, not like decades.
I guess I'd like to take a random sampling of up to 1000 images, optimize the compression on those, and then use those settings for the whole archive?
|
|
2024-06-02 12:34:22
|
I made some small tweaks to e2 that are useful in general for 16-bit images: https://github.com/libjxl/libjxl/pull/3622
|
|
2024-06-02 12:35:21
|
Also added something to make `cjxl -e 2 -E 1` actually do something different than default `-e 2`, and it seems to work quite well on this 21-channel test image.
|
|
2024-06-02 12:37:57
|
Not quite as small as the e9 E5 image of course, but should be quite a bit faster to encode and decode.
|
|
2024-06-02 12:40:21
|
Before:
e2: 396.7 MB
e2E1: same as just e2
e9E5: 312.6 MB
After:
e2: 387.1 MB
e2E1: 343.8 MB
|
|
2024-06-02 12:47:53
|
At some point we should add some functionality to encode with a manually specified tree (in `jxl_from_tree` syntax, for example), and to extract trees from a jxl bitstream. That should open up some possibilities for making your own custom fast encoder that is tuned for the specific kind of image content you have. No idea how to add such a thing in a nice way to the libjxl API though...
|
|
2024-06-03 08:42:36
|
Made some tweaks, it's now 341.0 MB at e2E1 🙂
|
|
|
CrushedAsian255
|
|
_wb_
Made some tweaks, it's now 341.0 MB at e2E1 🙂
|
|
2024-06-03 08:45:08
|
maybe `cjxl` should give information about the size / performance tradeoffs of the options
|
|
|
_wb_
|
2024-06-03 08:45:59
|
it now says this:
```
-e EFFORT, --effort=EFFORT
Encoder effort setting. Range: 1 .. 10.
Default: 7. Higher numbers allow more computation at the expense of time.
For lossless, generally it will produce smaller files.
For lossy, higher effort should more accurately reach the target quality.
```
|
|
|
CrushedAsian255
|
|
_wb_
it now says this:
```
-e EFFORT, --effort=EFFORT
Encoder effort setting. Range: 1 .. 10.
Default: 7. Higher numbers allow more computation at the expense of time.
For lossless, generally it will produce smaller files.
For lossy, higher effort should more accurately reach the target quality.
```
|
|
2024-06-03 08:47:02
|
i was thinking more like for the more advanced options, like it's not directly obvious what changing the group size does for performance
|
|
|
_wb_
|
2024-06-03 08:48:55
|
Hard to say what it actually does for performance. Changing the group size to something different may improve or worsen the compression density, and may be good or bad for speed depending on the image content and the number of threads.
|
|
|
CrushedAsian255
|
|
_wb_
Hard to say what it actually does for performance. Changing the group size to something different may improve or worsen the compression density, and may be good or bad for speed depending on the image content and the number of threads.
|
|
2024-06-03 08:49:48
|
so it's more complex than just "smaller group = faster but worse quality" or something, and you kinda just need advanced understanding of the format to get an intuition of the performance/speed tradeoff
|
|
|
_wb_
|
2024-06-03 08:53:12
|
for group size it's tricky — larger groups can be good for lz77 and to have fewer poorly-predicted group edges, smaller groups can be good for finer granularity in local RCTs and local palettes; when the number of available threads is large and the image is not so large, smaller groups allow more parallelization so it's faster, but if the number of threads is smaller than the number of megapixels, even the largest group size will allow using all threads so it won't make much of a speed difference.
|
|
|
|
salrit
|
2024-06-03 09:07:54
|
Hey, I was curious how the predictor for lossless works here. Like it's said in jxl's git, "An adaptive predictor computes 4 from the NW, N, NE and W pixels and combines them with weights based on previous errors", and on Krita's website, "Fraction of pixels used for the Meta-Adaptive Context tree. The MA tree is a way of analyzing the pixels surrounding the current pixel, and depending on the context choose a given predictor for this pixel". What is the MA tree used for here? How is 'context modelling' used here?
And the other thing: what exactly in WebP lossless (in terms of compression ratio, not rendering speed etc.) does JPEG XL's lossless try to fix? What were the issues with WebP that JPEG XL overcomes?
|
|
|
monad
|
|
CrushedAsian255
so it's more complex than just "smaller group = faster but worse quality" or something, and you kinda just need advanced understanding of the format to get an intuition of the performance/speed tradeoff
|
|
2024-06-03 09:20:09
|
generally: larger is denser (correlated with increasing image size), smaller is slower single-threaded but can be faster multi-threaded
|
|
2024-06-03 09:22:07
|
so doing things single-threaded, it's safe to just use the largest group size
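For reference, the knob being discussed is cjxl's modular group size flag (name and value mapping as I recall them from a recent cjxl --help; verify on your build):
```
# -g / --modular_group_size: 0..3 select 128/256/512/1024-pixel groups;
# -1 (the default) lets the encoder choose. Largest groups, lossless:
cjxl input.png output.jxl -d 0 -e 7 -g 3
```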
|
|
|
_wb_
|
|
salrit
Hey, I was curious how the predictor for lossless works here. Like it's said in jxl's git, "An adaptive predictor computes 4 from the NW, N, NE and W pixels and combines them with weights based on previous errors", and on Krita's website, "Fraction of pixels used for the Meta-Adaptive Context tree. The MA tree is a way of analyzing the pixels surrounding the current pixel, and depending on the context choose a given predictor for this pixel". What is the MA tree used for here? How is 'context modelling' used here?
And the other thing: what exactly in WebP lossless (in terms of compression ratio, not rendering speed etc.) does JPEG XL's lossless try to fix? What were the issues with WebP that JPEG XL overcomes?
|
|
2024-06-03 09:22:44
|
jxl has various predictors, the self-correcting one (aka Weighted predictor) is only one of them.
The MA tree determines what predictor to use for what pixels, and it also determines the entropy coding context (i.e. the distribution of symbols that will be used in the ANS or Huffman coding). See also: https://qon.github.io/jxl-art/wtf.html and https://qon.github.io/jxl-art/
|
|
2024-06-03 09:26:12
|
The main difference between lossless webp and jxl is that lossless webp is hardcoded for 8-bit RGBA, while lossless jxl can handle arbitrary bit depths and number of channels. Hardcoding for 8-bit RGBA does allow webp to be fast, but it also seriously limits its applicability since in many use cases (especially HDR or medical/scientific) you need more precision and/or channels.
|
|
2024-06-03 09:29:29
|
Another difference is that lossless webp is encoding full rows of pixels (like PNG), while jxl is tiled (by default it does groups of 256x256 pixels at a time). Doing full rows has some advantages for compression (e.g. you can have longer runs in RLE, there are fewer border pixels with poor prediction), but it also means it cannot be parallelized, while in jxl both encoding and decoding can use multiple threads effectively, making it more suitable for larger images and current and future processors (which tend to become not really faster but mostly just have more cores).
|
|
2024-06-03 09:40:26
|
(webp cannot handle very large images anyway since dimensions are limited to 16k by 16k in the header syntax, but even if larger dimensions would be allowed, it would be limited by its non-tiled bitstream structure; e.g. jxl can do chunked encode/decode which is essential for handling huge images; webp and png can do a line-by-line decode but that's not as good — e.g. if you have a 100k by 10k image and you want to show a 2k by 1k viewport, with just line-by-line decode you'll have to decode basically everything if the viewport is close to the bottom, and even if it's close to the top you'll need to decode 100k columns while you need only 2k columns, while in jxl it can be done with way less overhead wherever the viewport is)
|
|
|
|
salrit
|
2024-06-03 09:57:18
|
<@794205442175402004> thanks for the detailed answer. BTW, is the rolling-window scheme for LZ77 implemented here too? And yeah, is the compression-ratio advantage of JXL lossless over WebP lossless down to:
1.) the adaptive predictor in JXL lossless, whereas WebP, although it has 14 choices, chooses block-wise and fixes one predictor for all the pixels in the block?
2.) the entropy coding here using (context modelling and ANS) + LZ77, while WebP uses Huffman + LZ77?
|
|
|
_wb_
|
2024-06-03 10:16:23
|
Oops I see you also emailed me about this, I missed that email but I guess I can answer here now 🙂
|
|
2024-06-03 10:21:43
|
Yes, jxl also supports lz77, though libjxl doesn't make as much use of it as libwebp does — webp is closer to PNG in that respect. Part of the reason is another difference between jxl and png/webp: jxl encodes samples in a planar way (RRRR.... GGGG.... BBBB....) while png/webp encode in an interleaved way (RGBRGBRGBRGB....). For lz77, interleaved is better since you can match entire pixel sequences while in planar you can only match sample sequences of one channel. But in general, planar is often better for compression, and it is also easier to generalize to handle an arbitrary number of channels.
|
|
2024-06-03 10:26:30
|
1) Yes, webp chooses a predictor per block while in jxl the choice depends on the MA tree, which is more expressive (you could make an MA tree that mimicks what WebP or PNG does, but you can also do much more)
2) I think WebP also does some kind of context modeling but it's certainly more limited and less expressive. JXL allows both ANS and Huffman as the 'back-end' for the entropy coding; usually ANS is better but for fast encoding Huffman is better.
|
|
|
monad
|
2024-06-03 10:28:14
|
more thoroughly
|
|
|
yoochan
|
2024-06-03 10:29:21
|
did jyrki participate a bit in the design of the lossless encoder? or was it only your ideas _wb_? were some webp innovations incorporated in jxl?
|
|
|
|
salrit
|
|
_wb_
1) Yes, webp chooses a predictor per block while in jxl the choice depends on the MA tree, which is more expressive (you could make an MA tree that mimicks what WebP or PNG does, but you can also do much more)
2) I think WebP also does some kind of context modeling but it's certainly more limited and less expressive. JXL allows both ANS and Huffman as the 'back-end' for the entropy coding; usually ANS is better but for fast encoding Huffman is better.
|
|
2024-06-03 10:32:02
|
Thanks, and yeah, WebP uses dedicated entropy codes for regions sharing similar statistical properties — kind of context modelling — although, as you said, the one in JXL is stronger.
And there are a few more things...:
1.) I guess the color transform (the XYB one) doesn't take place in JXL in the lossless case; pixels are read in ARGB format?
2.) There was something mentioned about image feature extraction; is it involved in lossless too?
|
|
|
_wb_
|
2024-06-03 11:01:16
|
1) XYB is only used for lossy since it is not reversible in integer arithmetic; RCTs like YCoCg are used (can be different RCT per group)
2) Yes, there is a coding tool called Patches that can be used to draw a rectangle taken from a previous frame; this is effectively a form of 2D lz77 that can be used e.g. to encode repetitive elements like letters of text only once (in a kind of sprite sheet hidden frame) and reuse them multiple times with cheap signaling.
|
|
2024-06-03 11:02:30
|
There's also a coding tool called Splines that can be used in principle in both lossy and lossless but we don't currently have an encoder that can use this tool effectively, so for now it is only something to play with in <#824000991891554375> 🙂
|
|
|
|
salrit
|
|
_wb_
1) XYB is only used for lossy since it is not reversible in integer arithmetic; RCTs like YCoCg are used (can be different RCT per group)
2) Yes, there is a coding tool called Patches that can be used to draw a rectangle taken from a previous frame; this is effectively a form of 2D lz77 that can be used e.g. to encode repetitive elements like letters of text only once (in a kind of sprite sheet hidden frame) and reuse them multiple times with cheap signaling.
|
|
2024-06-03 11:11:21
|
Thanks 🙌
|
|
|
_wb_
Yes, jxl also supports lz77, though libjxl doesn't make as much use of it as libwebp does — webp is closer to PNG in that respect. Part of the reason is another difference between jxl and png/webp: jxl encodes samples in a planar way (RRRR.... GGGG.... BBBB....) while png/webp encode in an interleaved way (RGBRGBRGBRGB....). For lz77, interleaved is better since you can match entire pixel sequences while in planar you can only match sample sequences of one channel. But in general, planar is often better for compression, and it is also easier to generalize to handle an arbitrary number of channels.
|
|
2024-06-03 02:58:26
|
<@794205442175402004> Couldn't matching sample sequences in a planar way be more beneficial than the interleaved way? Like, within a plane, pixels might be correlated more 'area-wise' than across the three planes combined? Could that be a basis for generalizing that planar encoding > interleaved encoding?
|
|
|
_wb_
|
2024-06-03 05:07:38
|
Planar is better for decorrelation. Extreme example: imagine an 8-bit grayscale image encoded as 8-bit RGBA: after color transforms (YCoCg or SubtractGreen or whatever), it is basically one byte of image data followed by three bytes that are always the same (0 for the two chroma channels, 255 for the alpha channel). Doing that interleaved is not so great, doing it planar means you can encode the main data and then just a big bunch of repeating values that will compress down to nothing (either by doing RLE or by using entropy coding with a singleton distribution which assigns a 0 bit code to represent a symbol if there is only one symbol).
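A toy layout making that concrete — four grayscale pixels (sample values arbitrary) as 8-bit RGBA after the color transform:
```
#include <stdint.h>

// Each pixel: one real sample, two zero chroma residuals, constant alpha.
static const uint8_t interleaved[16] = {10, 0, 0, 255,  20, 0, 0, 255,
                                        30, 0, 0, 255,  40, 0, 0, 255};
static const uint8_t planar[16] = {10,  20,  30,  40,    // image data
                                    0,   0,   0,   0,    // chroma: all zero
                                    0,   0,   0,   0,    // chroma: all zero
                                   255, 255, 255, 255};  // alpha: constant
// Planar leaves three constant planes that RLE or a singleton entropy code
// collapses to nearly nothing; interleaving weaves those constants between
// the image samples and breaks up the runs.
```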
|
|
2024-06-03 05:12:25
|
But interleaved is somewhat better if you have something that works well with lz77, say an image that has the exact same pixel values repeating, like a screenshot of this discord chat where avatars, emojis, letters of text etc are causing repetitive data that can be matched with lz77. If it's interleaved, you only need to encode one (distance, length) pair per horizontal run of repeated pixels, while if it's planar, you need to encode 3 such pairs (you'll need to repeat it in every channel).
|
|
2024-06-03 05:14:46
|
(Patches is even better than lz77 on an interleaved image, but we don't have a very exhaustive Patches detector yet; it's a harder thing to write a good encoder for than for finding lz77 matches, which is also not trivial but it has been around for a long time now)
|
|
|
jonnyawsom3
|
2024-06-03 05:19:58
|
I recall I think Veluca(?) had the idea of using lz77 to find matches to then use patches instead, or something along those lines
|
|
|
|
salrit
|
|
_wb_
(Patches is even better than lz77 on an interleaved image, but we don't have a very exhaustive Patches detector yet; it's a harder thing to write a good encoder for than for finding lz77 matches, which is also not trivial but it has been around for a long time now)
|
|
2024-06-03 05:22:12
|
So something like patch substitution on an interleaved image is the best option? But the best again depends: if it's a synthetic image with repetitive values, then great, but for a natural image maybe the planar one suits better... just trying to reason about why planar was chosen in JXL...
|
|
|
_wb_
|
2024-06-03 05:55:10
|
Yes, for natural images planar is better, and for many synthetic images too. In general it is part of why jxl is compressing better. But there are some cases where interleaved is better, especially when combined with full row encoding, like if you put two copies of a photo side by side horizontally.
|
|
|
jjrv
|
2024-06-03 06:06:03
|
Wonder what would be the best way to encode 32-bit ints? I guess the format supports them as such, but libjxl doesn't yet? Not asking for it to be fixed, just wondering what the best workaround is. 16-bit grayscale for the 16 least significant bits and an extra channel for the 16 most significant bits? It sounds questionable to use two color channels and transform to some other color space? Guess I'll do some benchmarking.
|
|
2024-06-03 06:08:22
|
These are longitude-latitude pairs in the range -180000000 to 180000000. Just planning to use the same codec for these as for the corresponding color data in a separate file. Actually there's 28 bits of latitude, 29 bits of longitude and 16 bits of altitude. I'll cram them into a single file somehow, just have to test different channel ordering strategies. These shouldn't really be correlated at all, I think.
|
|
|
jonnyawsom3
|
2024-06-03 06:28:03
|
I assume it's more multispectral color data? Otherwise you could store it in RGB and then use entirely extra channels for the location data to still get a useful preview image
|
|
2024-06-03 06:28:55
|
I could also mention the napkin maths we did a long time ago: the entire Earth at 1m resolution could fit in under 2 JXL files
|
|
|
jjrv
|
2024-06-03 06:48:24
|
Ooh, putting the coordinates in the same file makes sense, somehow didn't think of that. It's entirely uncorrelated with color data, but who cares. Could just use 5 extra channels. Altitude might even be correlated somehow.
|
|
|
jonnyawsom3
|
2024-06-03 06:51:35
|
Presumably, using `-E 3` would mean the 3 colour channels would be compressed together, then the 3 location channels, followed by whatever else. Although I forget the ordering of the Extra MA learning so that might be wrong
|
|
|
jjrv
|
2024-06-03 06:52:56
|
Going to be 21 related color channels and 5 channels of metadata garbage basically, at least the 32-bit lon/lat make no sense as image but are needed to reproject it. Just wanting to handle decoding it all in a single pipeline. All of them as extra channels makes the most sense, then.
|
|
|
jonnyawsom3
|
2024-06-03 07:03:53
|
Actually... I wonder if they could just be stored in a brotli metadata box
Edit: Nevermind, since you want coordinates per pixel anyway it's probably best to use extra channels
|
|
|
_wb_
|
2024-06-03 07:04:34
|
jxl does not allow 32-bit uint in the current spec, only up to 24-bit uint. It does support 32-bit float though (also losslessly). The idea is that an implementation can use float32 for everything, and float32 has 1 sign bit, 8 exponent bits, and 23 mantissa bits (or 24 if you count the implicit one).
|
|
|
jjrv
|
2024-06-03 07:16:14
|
Makes sense. These lon/lat coordinates need the full 29 bits because they're coordinate inputs for a vertex shader, all others are color data for the fragment shader. Anyway I'll hack it somehow. Very unusual use case.
|
|
|
_wb_
|
2024-06-03 07:52:52
|
You can have 29 bits if you stuff them in a float somehow 🙂 — e.g. make the most significant bits 001 and then just put your uint29 in the 29 least significant bits
|
|
2024-06-03 07:53:46
|
that maps the number to something between 0.f and 2.f, though most numbers will be very small and close to zero 🙂
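A minimal sketch of that bit trick, assuming IEEE-754 float32 and that lossless float coding round-trips the bits exactly (function names are mine):
```
#include <stdint.h>
#include <string.h>

// Top three bits 001 => sign 0, biased exponent in [64,127], so every
// value is a normal, finite float in (0, 2); v = 0 maps to 2^-63, not 0.
static float pack_u29(uint32_t v) {
  uint32_t bits = 0x20000000u | (v & 0x1FFFFFFFu);
  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}

static uint32_t unpack_u29(float f) {
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);
  return bits & 0x1FFFFFFFu;
}
```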
|
|
|
|
salrit
|
|
_wb_
Planar is better for decorrelation. Extreme example: imagine an 8-bit grayscale image encoded as 8-bit RGBA: after color transforms (YCoCg or SubtractGreen or whatever), it is basically one byte of image data followed by three bytes that are always the same (0 for the two chroma channels, 255 for the alpha channel). Doing that interleaved is not so great, doing it planar means you can encode the main data and then just a big bunch of repeating values that will compress down to nothing (either by doing RLE or by using entropy coding with a singleton distribution which assigns a 0 bit code to represent a symbol if there is only one symbol).
|
|
2024-06-03 08:04:52
|
I think the case of 8-bit image data using the interleaved encoding in WebP lossless is taken care of: for grey images, after applying the RCT (subtract green), [A,R,G,B] -> [255,0,G,0]; then, after the predictor transform is applied, each of the color planes and the alpha plane is Huffman + LZ coded individually, and for bi-level images it's just substitution of color indices from a palette.
|
|
|
CrushedAsian255
|
|
_wb_
1) Yes, webp chooses a predictor per block while in jxl the choice depends on the MA tree, which is more expressive (you could make an MA tree that mimicks what WebP or PNG does, but you can also do much more)
2) I think WebP also does some kind of context modeling but it's certainly more limited and less expressive. JXL allows both ANS and Huffman as the 'back-end' for the entropy coding; usually ANS is better but for fast encoding Huffman is better.
|
|
2024-06-04 01:13:51
|
is there a way to do some kind of tree-from-jxl or something? to see the MA tree generated by an encoder?
|
|
|
monad
|
|
CrushedAsian255
is there a way to do some kind of tree-from-jxl or something? to see the MA tree generated by an encoder?
|
|
2024-06-04 01:57:55
|
go back to libjxl 0.7, call benchmark_xl with --debug_image_dir
|
|
|
CrushedAsian255
|
|
monad
go back to libjxl 0.7, call benchmark_xl with --debug_image_dir
|
|
2024-06-04 01:58:37
|
is there a way to have both HEAD and 0.7 installed at the same time?
|
|
|
monad
|
2024-06-04 01:59:18
|
just install 0.7 to whatever directory you want
|
|
|
|
salrit
|
|
_wb_
Planar is better for decorrelation. Extreme example: imagine an 8-bit grayscale image encoded as 8-bit RGBA: after color transforms (YCoCg or SubtractGreen or whatever), it is basically one byte of image data followed by three bytes that are always the same (0 for the two chroma channels, 255 for the alpha channel). Doing that interleaved is not so great, doing it planar means you can encode the main data and then just a big bunch of repeating values that will compress down to nothing (either by doing RLE or by using entropy coding with a singleton distribution which assigns a 0 bit code to represent a symbol if there is only one symbol).
|
|
2024-06-04 07:00:55
|
For single-intensity 8-bit images....
In WebP lossless: an RCT is applied (subtract green), resulting in ARGB -> A0G0. Then the predictor transform is applied, with A0G0A0G0... taken into consideration for the predictor. Then the residue A0G0A0G0... is LZ coded, and each of the channels is Huffman coded individually.
In JXL lossless: an RCT is chosen per group. Here the individual channels are taken for applying the predictor transform. Each channel is LZ coded and then entropy coded individually.
Will it make a difference, or did I get the procedure wrong 😅? I tested a few 512*512 natural grayscale images and found JXL's compression ratios >> WebP's. Is the reason here just the more flexible predictors and context-based entropy coding?
|
|
|
_wb_
|
2024-06-04 07:10:13
|
At least for JXL you got it right, for WebP I think you're right too but I don't know it that well (others here like <@532010383041363969> <@768090355546587137> <@987371399867945100> were involved in designing lossless webp so they will know much better).
In JXL, LZ coding is barely used (at default effort it is not used at all, iirc), while in WebP/PNG it is a major coding tool. For photographic images, LZ does not help much since there will be few good (long) matches — that's why WebP and PNG perform well mostly for non-photographic images.
For photo, what makes jxl better than png/webp are the better predictors (in particular the self-correcting predictor), slightly better RCTs, better entropy coding (ANS instead of Huffman), but most of all, way better context modeling.
|
|
|
|
salrit
|
2024-06-04 07:23:22
|
Thanks! I have been in touch with <@532010383041363969> (over email) and he has helped a lot in understanding the lossless algorithm (the empirical designs etc.) 🙌 ... although much is left to understand..
Yeah, a few things more: is there just a single kind of scan-line ordering in JXL (talking in terms of the predictor transform here), or is something like Adam-infinity (as in FLIF) implemented here too?
You said something about 'Patches', the 2D-like LZ coding — is it only between frames of animations, or can rectangles from adjacent blocks within a single frame be compared and used?
Apart from subtract green and YCoCg, what other RCTs are used? (I should probably look this up myself, sorry for being lazy here.. 😅)
|
|
|
_wb_
|
2024-06-04 07:46:35
|
Patches are also useful for single frame. You cannot reference the current frame, but you can insert an invisible auxiliary frame that contains a sprite sheet of patches, and then use it to remove repetitive elements from the actual frame. The current libjxl encoder does this in both lossy and lossless mode, it has some heuristics to detect things like letters or icons on a solid background and will use Patches to encode them only once.
|
|
|
|
salrit
|
|
_wb_
Patches are also useful for single frame. You cannot reference the current frame, but you can insert an invisible auxiliary frame that contains a sprite sheet of patches, and then use it to remove repetitive elements from the actual frame. The current libjxl encoder does this in both lossy and lossless mode, it has some heuristics to detect things like letters or icons on a solid background and will use Patches to encode them only once.
|
|
2024-06-04 07:47:27
|
I see, kind of like the Soft Pattern Matching in JBIG2..
|
|
|
_wb_
|
2024-06-04 07:49:26
|
Scan-line ordering: it's only raster order (within the group), no more Adam-inf scanline order since that is quite bad for memory locality (so bad for decode speed). But there is the Squeeze transform that can be used to achieve a similar but better progressive decoding, where the low-res previews are based on averaging sample values rather than just nearest-neighbor sampling.
|
|
|
lonjil
|
2024-06-04 07:51:10
|
example of patches. I took a screenshot of this chat, then encoded with VarDCT with patches enabled, then decoded the 1/8th size preview. However, since patches are stored *before*, they are also included even when stopping before the full resolution data, and so we can see their impact.
|
|
|
_wb_
|
|
salrit
I see, kind of like the Soft Pattern Matching in JBIG2..
|
|
2024-06-04 07:51:24
|
Yes, exactly like that, except of course it works for color images, not just 1-bit black&white. And also a major difference is that residuals are still encoded: patches just get subtracted from the main image but do not replace it, so if a lossy patch does not match exactly, there is still a chance to correct it.
|
|
|
lonjil
example of patches. I took a screenshot of this chat, then encoded with VarDCT with patches enabled, then decoded the 1/8th size preview. However, since patches are stored *before*, they are also included even when stopping before the full resolution data, and so we can see their impact.
|
|
2024-06-04 07:53:18
|
as you can see, the heuristic does manage to find a lot of letters and encodes them via patches (which also means they get modular encoded instead of vardct, which is a good idea anyway for text, though not so much for my avatar which also became a patch). It does not find everything though, there is still room for improvement in those heuristics...
|
|
2024-06-04 07:58:40
|
RCTs: they always take 3 subsequent channels as input, permute them in any of the 6 ways, and then either do YCoCg or any combination of these primitives:
- subtract first channel from third, e.g. RGB -> RG(B-R)
- subtract first channel from second, e.g. RGB -> R(G-R)B
- subtract avg of first and third from second, e.g. RGB -> R(G - (R+B)/2)B
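A toy per-pixel version of those primitives with their inverses (names are mine, not libjxl's; using `>>` as floor division assumes an arithmetic right shift):
```
#include <stdint.h>

// RGB -> RG(B-R)
static void fwd_sub13(int32_t px[3]) { px[2] -= px[0]; }
static void inv_sub13(int32_t px[3]) { px[2] += px[0]; }
// RGB -> R(G-R)B
static void fwd_sub12(int32_t px[3]) { px[1] -= px[0]; }
static void inv_sub12(int32_t px[3]) { px[1] += px[0]; }
// RGB -> R(G - (R+B)/2)B — reversible because R and B are untouched
static void fwd_avg13(int32_t px[3]) { px[1] -= (px[0] + px[2]) >> 1; }
static void inv_avg13(int32_t px[3]) { px[1] += (px[0] + px[2]) >> 1; }
```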
|
|
2024-06-04 07:59:14
|
RCTs can be stacked, multiple RCTs can be done one after the other
|
|
2024-06-04 08:00:36
|
So in principle you can do something like RGBA -> RABG -> ARGB -> AYCoCg via two permute-only RCTs and one YCoCg
|
|
2024-06-04 08:01:29
|
Channel order does not matter that much since planar encoding is used anyway, but it can make a difference when using PrevChannel properties in the MA context tree
|
|
2024-06-04 08:03:29
|
The current libjxl encoder is only trying RCTs on the first three channels though, we haven't explored images with more channels very much yet. For something like those 21-channel multispectral images, the modular design of the transforms in modular mode might be quite useful though.
|
|
|
|
salrit
|
2024-06-04 08:05:29
|
🙌 thanks <@167023260574154752> and <@794205442175402004> ..
|
|
|
Demiurge
|
2024-06-04 09:14:23
|
Is there a difference between XYB and XYZ? And is there an important practical advantage over LAB?
|
|
|
_wb_
|
2024-06-04 10:05:47
|
<@446428281630097408> after that pull request lands, for your 21-channel test image I got good results when doing this:
```
cjxl radiance-21.jxl radiance-21.jxl.jxl -d 0 -e 2 -E 1 -C 0
JPEG XL encoder v0.10.2 67cb0e82 [NEON]
lib/extras/dec/apng.cc:795: JXL_FAILURE: PNG signature mismatch
Encoding [Modular, lossless, effort: 2]
Compressed to 336647.9 kB (135.317 bpp).
4865 x 4091, 9.625 MP/s [9.62, 9.62], , 1 reps, 12 threads.
```
That's a setting with a reasonable enc/dec speed (unlike e9 E5 which compresses better but is very slow to encode and decode). The `-C 0` is to disable YCoCg, which is not very useful here, especially in combination with using PrevChannel context where it becomes counterproductive. You get a 336 MB file while the uncompressed data (i.e. the PAM file) is 835 MB. On my laptop, it takes under 8 seconds to do the encode, under 3 seconds to do the decode. For comparison, running default gzip on the PAM file produces a 560 MB file in a bit over 14 seconds.
|
|
|
a goat
|
2024-06-04 12:32:53
|
<@794205442175402004> Is there any particular reason why the Burrows-Wheeler Transform isn't used in image compression much?
|
|
|
CrushedAsian255
|
|
a goat
<@794205442175402004> Is there any particular reason why the Burrows-Wheeler Transform isn't used in image compression much?
|
|
2024-06-04 12:52:31
|
I’m just guessing, but probably for similar reasons to why raw RLE isn't great for images (excluding flat logos): lots of colours that are similar aren't the same, so any kind of RLE (even with BWT) is not going to work amazingly
|
|
|
_wb_
|
2024-06-04 12:58:08
|
It's also not clear to me how you can combine it with prediction. BWT is a 1D thing, if you would e.g. just apply it to all rows then you end up garbling up the image and while there will be more runs in every row, any predictor that uses rows above will be messed up...
|
|
|
|
veluca
|
2024-06-04 01:01:57
|
I mean, why not BWT after prediction?
|
|
2024-06-04 01:02:23
|
I'm not convinced it would be better than plain old lz77, but maybe
|
|
|
_wb_
|
2024-06-04 01:15:14
|
yes, you could do first prediction and then BWT, but what about context?
|
|
|
|
veluca
|
2024-06-04 01:18:25
|
*shrug*
|
|
2024-06-04 01:18:43
|
some version of jxl-lossless encoded separate streams for each context
|
|
2024-06-04 01:18:53
|
if you did that, it would work to BWT each stream
|
|
|
_wb_
|
2024-06-04 01:40:00
|
that would work. Probably not so good for enc/dec speed/memory, but could be good for compression
|
|
2024-06-04 01:42:06
|
could also try BWT on AC coeffs (say, encoding one coefficient position at a time)
|
|
|
|
veluca
|
|
_wb_
that would work. Probably not so good for enc/dec speed/memory, but could be good for compression
|
|
2024-06-04 01:46:15
|
you're BWTing already anyway, that's not going to be fast 😛
|
|
|
jjrv
|
|
_wb_
<@446428281630097408> after that pull request lands, for your 21-channel test image I got good results when doing this:
```
cjxl radiance-21.jxl radiance-21.jxl.jxl -d 0 -e 2 -E 1 -C 0
JPEG XL encoder v0.10.2 67cb0e82 [NEON]
lib/extras/dec/apng.cc:795: JXL_FAILURE: PNG signature mismatch
Encoding [Modular, lossless, effort: 2]
Compressed to 336647.9 kB (135.317 bpp).
4865 x 4091, 9.625 MP/s [9.62, 9.62], , 1 reps, 12 threads.
```
That's a setting with a reasonable enc/dec speed (unlike e9 E5 which compresses better but is very slow to encode and decode). The `-C 0` is to disable YCoCg, which is not very useful here, especially in combination with using PrevChannel context where it becomes counterproductive. You get a 336 MB file while the uncompressed data (i.e. the PAM file) is 835 MB. On my laptop, it takes under 8 seconds to do the encode, under 3 seconds to do the decode. For comparison, running default gzip on the PAM file produces a 560 MB file in a bit over 14 seconds.
|
|
2024-06-04 06:51:10
|
Works great! Switched to the branch version already while working on tooling. BTW I'm also targeting Wasm and compiling with Zig instead of Emscripten. Works a treat.
|
|
|
a goat
|
|
_wb_
It's also not clear to me how you can combine it with prediction. BWT is a 1D thing, if you would e.g. just apply it to all rows then you end up garbling up the image and while there will be more runs in every row, any predictor that uses rows above will be messed up...
|
|
2024-06-05 06:50:27
|
What about a Hilbert curve?
|
|
|
spider-mario
|
2024-06-05 08:54:03
|
I have the impression that while the British spelling of “color” is “colour”, “colourspace” is nevertheless rarely used over “colorspace” (and “colourimeter” even less)
|
|
2024-06-05 08:54:07
|
is that correct?
|
|
|
yoochan
|
2024-06-05 08:56:20
|
They'll never admit their incoherence
|
|
|
Quackdoc
|
|
spider-mario
I have the impression that while the British spelling of “color” is “colour”, “colourspace” is nevertheless rarely used over “colorspace” (and “colourimeter” even less)
|
|
2024-06-05 09:05:27
|
I use color and colour equally, same with colourspace and colorspace when talking, but always colorspace in code, and I've never seen colourimeter
|
|
|
_wb_
|
|
a goat
What about a hilbert curve?
|
|
2024-06-05 09:42:10
|
might work but will mess up speed even more...
|
|
|
KKT
|
2024-06-05 06:16:47
|
Hey all, I've started to put together an FAQ and a Glossary for the jpegxl.info site. The FAQ is divided into **General**, **Usage** & **Technical**. I've only scraped the surface on the glossary. As an experiment, I'm opening the Google doc up for everyone to edit, so be gentle! Thanks for the help!
https://docs.google.com/document/d/1tn0YNAeOCfDVGy6v0olsG9de3y3goehjoBT_G3I6Xmo/edit?usp=sharing
|
|
|
yoochan
|
2024-06-05 06:55:12
|
Nice! I started something similar some time ago and failed to reach a mature enough point to share it 😅 excess of shyness
|
|
|
HCrikki
|
2024-06-05 06:56:33
|
about low bitrates, the final result is not a shortcoming of the format itself but a natural consequence of prioritizing detail preservation for any given target filesize. video-based formats discard a lot of information, and low filesizes for either format would perpetuate the issue the web faced with low-quality, low-filesize jpegs
|
|
2024-06-05 07:06:32
|
perhaps add benchmark numbers for cases not usually or ever tested, like jpg->jxl conversions
|
|
2024-06-05 07:07:49
|
jpg to any other format results in either a massive *lossless* new file or a degraded *lossy* new file that still doesn't guarantee a lower filesize unless more detail is sacrificed — unlike jpg->jxl conversions, which are completely lossless, guarantee 20% less storage/bandwidth and take less than 50ms for sub-20mp images
|
|
|
|
salrit
|
2024-06-05 10:00:50
|
Is the JPEG XL art the only place to understand the MA Tree? Can I get any reference/documentation to understand it please?
|
|
|
monad
|
|
salrit
Is the JPEG XL art the only place to understand the MA Tree? Can I get any reference/documentation to understand it please?
|
|
2024-06-06 12:14:33
|
read the spec? https://discord.com/channels/794206087879852103/1021189485960114198
|
|
|
|
salrit
|
|
monad
read the spec? https://discord.com/channels/794206087879852103/1021189485960114198
|
|
2024-06-06 08:10:45
|
Will go through it...Thanks
|
|
|
jjrv
|
2024-06-06 09:59:11
|
Somehow it seems JxlEncoderAddImageFrame wants a buffer large enough to contain the extra channels as well, and yet they must also be passed to JxlEncoderSetExtraChannelBuffer. Should the num_channels field in JxlPixelFormat include the extra channel count or not?
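For what it's worth, my reading of the libjxl headers is that `num_channels` is meant to cover only the color channels (plus alpha if interleaved), with each remaining extra channel delivered as its own plane — but whether a given version also wants room for extras in the main buffer is exactly the question here, so treat this sketch as the intended shape rather than an answer:
```
#include <jxl/encode.h>

// Hypothetical helper: 3-channel color buffer plus one extra channel plane.
void add_frame_with_extras(JxlEncoderFrameSettings *settings,
                           const float *rgb, size_t rgb_bytes,
                           const float *extra, size_t extra_bytes,
                           uint32_t extra_index) {
  JxlPixelFormat color_fmt = {3, JXL_TYPE_FLOAT, JXL_NATIVE_ENDIAN, 0};
  JxlEncoderAddImageFrame(settings, &color_fmt, rgb, rgb_bytes);
  // One single-channel plane per extra channel, indexed separately.
  // (Call order relative to AddImageFrame: check your libjxl version.)
  JxlPixelFormat extra_fmt = {1, JXL_TYPE_FLOAT, JXL_NATIVE_ENDIAN, 0};
  JxlEncoderSetExtraChannelBuffer(settings, &extra_fmt, extra, extra_bytes,
                                  extra_index);
}
```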
|
|