|
HCrikki
|
2024-05-02 01:51:47
|
what model? it could've had h264 available but not exposed and defaulted to mjpeg. many cams default to what's hardware-accelerated (hw-accelerated encoders suck and can't be updated, but consume fewer cycles)
|
|
2024-05-02 01:54:31
|
some niches also require keyframes for more precise seeking in hot footage that isn't archived yet, so keyframe-only capture is seen as a plus. I'd have expected mjpeg2000 to be used for that though, to accommodate extra metadata with individual frames
|
|
|
username
|
|
Crite Spranberry
idk I've just used paint.net
|
|
2024-05-02 02:45:08
|
this plugin does exist for times where you just wanna quickly export something out of Paint.NET and not mess with any other programs: https://forums.getpaint.net/topic/118213-mozjpeg-filetype-2022-08-28/
|
|
|
Demiurge
|
|
lonjil
it still beats jpegli below like, q=75? idk the exact number.
|
|
2024-05-02 02:50:45
|
the JPEG format isn't suitable for very low fidelity; it breaks down rather suddenly below a certain threshold, for technical reasons inherent to the bitstream format that I can't recall right now
|
|
2024-05-02 02:51:24
|
But most people don't want to degrade their images that much
|
|
|
LMP88959
|
2024-05-02 02:51:34
|
I thought jpeg at low bitrates is bad because all the AC coefficients pretty much disappear
|
|
|
Demiurge
|
2024-05-02 02:51:45
|
Most people expect an image to look the same, before and after saving it.
|
|
|
LMP88959
|
2024-05-02 02:52:06
|
Plus they quantize the DC coefficients too so it ends up giving you totally wrong colors
|
|
|
Demiurge
|
|
LMP88959
I thought jpeg at low bitrates is bad because all the AC coefficients pretty much disappear
|
|
2024-05-02 02:52:55
|
It just looks like 8x8 blocks with 8 unique colors after a certain point
|
|
|
LMP88959
|
2024-05-02 02:53:39
|
Yeah cuz at that point the block has only a nonzero DC value and that DC value is quantized
|
|
2024-05-02 02:53:54
|
Equivalent to applying a posterization effect to an rgb image
|
|
2024-05-02 02:54:19
|
Ideally you wouldn't touch the DC value
|
|
2024-05-02 02:54:27
|
Idk why jpeg touches it
|
|
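A toy sketch of what's being described above (Python with numpy/scipy; an illustration only, not the real JPEG pipeline or its quantization tables): with a coarse enough quantizer every AC coefficient of an 8x8 block rounds to zero, and the coarsely quantized DC turns each block into a single flat color.
```python
import numpy as np
from scipy.fft import dctn, idctn

def crush_block(block, q_dc=64, q_ac=255):
    # forward DCT, heavy quantization, dequantization, inverse DCT
    coef = dctn(block - 128.0, norm="ortho")
    q = np.full((8, 8), float(q_ac))
    q[0, 0] = q_dc                      # DC gets its own (still coarse) step
    coef = np.round(coef / q) * q       # nearly all AC rounds to zero
    return idctn(coef, norm="ortho") + 128.0

img = np.add.outer(np.arange(64.0), np.arange(64.0))  # smooth diagonal ramp
out = np.block([[crush_block(img[y:y+8, x:x+8])
                 for x in range(0, 64, 8)] for y in range(0, 64, 8)])
print("distinct values before:", np.unique(img).size)            # 127
print("distinct values after: ", np.unique(np.round(out)).size)  # a handful
```
The output is exactly the posterization described: each block collapses to one of a few quantized DC levels.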
|
Demiurge
|
2024-05-02 03:22:52
|
Neither do I... Unless you want to progressively compress the DC value, but I don't think JPEG can do that
|
|
|
_wb_
|
2024-05-02 01:40:15
|
<@532010383041363969> can probably explain it better than me, but basically gaborish is just doing some mild smoothing using a 3x3 blur kernel after decoding the image, which is useful to hide block boundaries and dct artifacts. To avoid making the image blurry, the encoder does the opposite before encoding the image: it applies a 5x5 sharpening kernel in such a way that the overall result of sharpening -> lossy compression -> decoding -> blurring is as close to the original as possible. Effectively this is an alternative to lapped transforms (another approach to avoid block artifacts, e.g. used in JPEG XR).
|
|
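A toy version of that pairing (assuming a simple 3x3 binomial blur and a first-order inverse; these are NOT libjxl's actual gaborish kernels):
```python
import numpy as np
from scipy.signal import convolve2d

blur = np.array([[1., 2., 1.],
                 [2., 4., 2.],
                 [1., 2., 1.]]) / 16.0        # decoder-side smoothing
ident = np.zeros((3, 3)); ident[1, 1] = 1.0
sharpen = 2 * ident - blur                    # (I + A)^-1 ~ I - A for mild A

x = np.linspace(0, np.pi, 64)
img = np.outer(np.sin(x), np.cos(2 * x))      # smooth test image
pre = convolve2d(img, sharpen, mode="same", boundary="symm")   # encoder side
out = convolve2d(pre, blur, mode="same", boundary="symm")      # decoder side
print("max round-trip error:", np.abs(out - img).max())        # small
```
This first-order inverse is only accurate on smooth content, which is why the real encoder instead solves for a 5x5 kernel so that the whole sharpen -> compress -> blur chain stays close to the original.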
2024-05-02 01:43:50
|
The problem is that currently the encode-side sharpening is probably too strong, which is not really a problem for single-generation encoding, but when doing many generations the error accumulates and you end up getting pretty bad generation loss, like this: https://discord.com/channels/794206087879852103/803645746661425173/1230065259784572928
|
|
2024-05-02 01:44:33
|
(note: this is not an inherent property of JPEG XL, it is a property of what the libjxl encoder is currently doing by default)
|
|
|
LMP88959
|
2024-05-02 01:46:34
|
how does the deblocking filter compare?
|
|
|
Traneptora
|
|
lonjil
yep, mjpeg 😬
|
|
2024-05-02 01:49:13
|
because the cpu power needed to encode high-quality mjpeg on the fly
|
|
2024-05-02 01:49:19
|
is very low
|
|
|
|
afed
|
2024-05-02 01:51:57
|
knusperli uses something like gaborish
https://github.com/google/knusperli
|
|
|
_wb_
|
2024-05-02 02:21:51
|
knusperli is doing something else: instead of doing simple dequantization (just multiplying the coefficient with the quantization factor), which restores the coefficient to the center of the quantization bucket, it does a context-aware dequantization that assigns values within the quantization bucket in such a way that discontinuities at block boundaries are avoided (i.e. it assumes the original was smooth, not blocky)
|
|
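A 1D toy of that idea (my own DC-only simplification, not knusperli's actual algorithm): pick values inside each quantization bucket that minimize the jump at block boundaries, instead of always taking the bucket center.
```python
import numpy as np

q = 16.0
true = np.array([70.0, 74.0, 78.0, 82.0])   # smooth per-block "DC" values
idx = np.round(true / q)                    # what the bitstream stores
center = idx * q                            # naive dequantization -> steps

smooth = center.copy()
for i in range(len(smooth) - 1):
    # pull both neighbors toward their average, but stay inside the buckets
    target = (smooth[i] + smooth[i + 1]) / 2
    smooth[i]     = np.clip(target, (idx[i] - .5) * q,     (idx[i] + .5) * q)
    smooth[i + 1] = np.clip(target, (idx[i + 1] - .5) * q, (idx[i + 1] + .5) * q)

print("centers: ", center)   # [64. 80. 80. 80.] -> visible block steps
print("smoothed:", smooth)   # [72. 76. 78. 78.] -> closer to the original
```
Both reconstructions are equally consistent with the bitstream; the smooth one just assumes the original wasn't blocky.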
|
Meow
|
|
lonjil
lossless webp is pretty darn good
|
|
2024-05-02 04:44:10
|
People simply forgot that lossless WebP is the early masterpiece by <@532010383041363969>
|
|
|
Crite Spranberry
|
2024-05-02 04:47:21
|
png
|
|
2024-05-02 04:47:26
|
gif <:t_:1088450292095389706>
|
|
2024-05-02 04:47:41
|
Is 100 quality jpeg lossy?
|
|
2024-05-02 04:48:01
|
I've never thought about it as I've had no case where 100 quality jpeg is better than png
|
|
|
Crite Spranberry
Is 100 quality jpeg lossy?
|
|
2024-05-02 04:49:32
|
If it is it doesn't show
|
|
2024-05-02 04:49:56
|
4:2:0 shows a bit of loss though
|
|
2024-05-02 04:50:27
|
But 100 quality 4:4:4 seems equal to png in a random unigine heaven screenshot i found
|
|
|
Crite Spranberry
4:2:0 shows a bit of loss though
|
|
2024-05-02 04:50:59
|
|
|
|
Meow
|
|
Crite Spranberry
Is 100 quality jpeg lossy?
|
|
2024-05-02 05:00:57
|
Of course it is lossy
|
|
2024-05-02 05:02:06
|
Some software switches to lossless when exporting at quality 100 for AVIF or WebP
|
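It's easy to verify this empirically (a sketch using Pillow; at quality 100 libjpeg's quantization steps all become 1, but rounding the DCT coefficients to integers still loses data):
```python
import io
import numpy as np
from PIL import Image

rng = np.random.default_rng(0)
orig = Image.fromarray(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
buf = io.BytesIO()
orig.save(buf, "JPEG", quality=100, subsampling=0)   # 0 = 4:4:4, no chroma loss
back = Image.open(buf)
diff = np.abs(np.asarray(orig, dtype=int) - np.asarray(back, dtype=int))
print("max per-channel error:", diff.max())          # nonzero -> still lossy
```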
|
|
Demiurge
|
2024-05-02 11:22:14
|
Why does my shiny new iPhone not support color profiles on JPEG?
|
|
|
a goat
|
|
Demiurge
Why does my shiny new iPhone not support color profiles on JPEG?
|
|
2024-05-02 11:24:27
|
Forcing people into having to use DCI P3 maybe?
|
|
|
Demiurge
|
2024-05-02 11:27:38
|
Well if an image is tagged it's supposed to convert it to the display profile, not completely misinterpret it somehow
|
|
2024-05-02 11:29:44
|
|
|
2024-05-02 11:30:37
|
It's even worse than if they completely ignored/stripped the profile.
|
|
|
LMP88959
|
2024-05-03 08:18:53
|
i used imagemagick's JXL implementation
|
|
2024-05-03 08:19:06
|
```convert -quality 10 orig.png out.jxl```
|
|
2024-05-03 08:19:15
|
is it a good impl
|
|
2024-05-03 08:19:23
|
|
|
2024-05-03 08:19:29
|
because im confused why it looks so bad
|
|
2024-05-03 08:19:36
|
it's a 64x64 icon
|
|
2024-05-03 08:20:08
|
JXL on the left (1,700 bytes), mine in the middle (1,000 bytes), original on the right
|
|
2024-05-03 08:21:42
|
|
|
2024-05-03 08:21:43
|
here is the original
|
|
|
lonjil
|
2024-05-03 08:28:22
|
here is a 889 byte one I just encoded
|
|
|
LMP88959
|
2024-05-03 08:28:31
|
what did you use to encode
|
|
2024-05-03 08:28:56
|
i just noticed imagemagick inserts a ton of useless metadata into the file which is why the size is so large
|
|
|
lonjil
|
2024-05-03 08:28:58
|
`cjxl -d 4 -e 10 orig.png d4_e10.jxl`
|
|
|
LMP88959
|
2024-05-03 08:31:54
|
ok sweet, thanks. i will avoid imagemagick from now on
|
|
|
lonjil
|
2024-05-03 08:32:39
|
I wonder what it even maps `-quality 10` to
|
|
|
LMP88959
|
2024-05-03 08:34:27
|
cjxl has a -quality param too
|
|
2024-05-03 08:34:32
|
so maybe that?
|
|
2024-05-03 08:35:36
|
|
|
2024-05-03 08:35:44
|
jxl 600 bytes, mine 600 bytes
|
|
2024-05-03 08:35:56
|
pretty cool seeing the differences
|
|
|
Quackdoc
|
|
LMP88959
cjxl has a -quality param too
|
|
2024-05-03 09:06:56
|
magick can do some weird things sometimes
|
|
|
LMP88959
|
2024-05-03 09:07:13
|
yeah.. it's a shame
|
|
|
HCrikki
|
2024-05-03 09:13:02
|
d4 looks overkill for avatar-like images (smaller than 256x256)
|
|
2024-05-03 09:14:32
|
are you using a current libjxl? some workflows with env paths can end up using old versions even though you keep updating
just recently someone experienced such an issue with their setup and erroneously assumed jxl had barely improved since 0.7
|
|
|
LMP88959
|
2024-05-03 09:15:27
|
JPEG XL encoder v0.10.2 0.10.2 [AVX2,SSE4,SSE2]
|
|
|
jonnyawsom3
|
|
LMP88959
|
|
2024-05-03 09:23:15
|
There's also VarDCT and Modular
|
|
2024-05-03 09:23:26
|
About 600 bytes too
|
|
|
LMP88959
|
2024-05-03 09:24:07
|
holy moly
|
|
2024-05-03 09:24:16
|
so much to learn about the knobs and parameters
|
|
2024-05-03 09:25:23
|
it looks like JXL is losing a lot of chroma
|
|
2024-05-03 09:25:30
|
is there subsampling going on?
|
|
|
HCrikki
|
|
LMP88959
JPEG XL encoder v0.10.2 0.10.2 [AVX2,SSE4,SSE2]
|
|
2024-05-03 09:36:45
|
is that what ver reports in terminal?
|
|
|
lonjil
|
2024-05-03 09:36:45
|
no, but in the current release it quantizes chroma probably too aggressively
|
|
2024-05-03 09:37:09
|
Jyrki posted recently about increasing the amount of chroma info by 20% (and reducing luma by 2% IIRC)
|
|
|
LMP88959
|
|
HCrikki
is that what ver reports in terminal?
|
|
2024-05-03 09:38:20
|
yeah
|
|
|
Demiurge
|
|
LMP88959
i just noticed imagemagick inserts a ton of useless metadata into the file which is why the size is so large
|
|
2024-05-04 06:59:52
|
I wonder if graphicsmagick does the same thing
|
|
|
LMP88959
is there subsampling going on?
|
|
2024-05-04 07:00:41
|
No but v0.10 mutilates colors badly I noticed when using q<90 or d>1
|
|
2024-05-04 07:03:52
|
If you use lower quality settings than that, then the color is so mangled that it's practically useless output
|
|
|
yoochan
|
2024-05-04 07:04:14
|
What about lossless encoding using resampling? I can't test at the moment
|
|
|
Demiurge
|
2024-05-04 07:05:02
|
Lossless and resampling are mutually exclusive, no? I wouldn't use cjxl as a resampling tool.
|
|
2024-05-04 07:06:54
|
cjxl is pretty good at lossless but sometimes there are bugs in the decoder with the colorspace profile it generates and the output format.
|
|
2024-05-04 07:07:35
|
But I think some of those bugs only affect lossy
|
|
2024-05-04 07:08:59
|
If you try to encode an ambiguous file that does not have color tags, then it will create one and I think assume it's sRGB
|
|
2024-05-04 07:09:21
|
which is correct behavior
|
|
2024-05-04 07:11:08
|
But if you compare an untagged ambiguous file with a file tagged with non-ambiguous color info, they can look different even if the data is exactly the same, so that can sometimes cause images to change in appearance after lossless encoding an untagged file to JXL
|
|
2024-05-04 07:11:20
|
because in JXL, there is no such thing as "untagged"
|
|
|
yoochan
|
2024-05-04 07:17:37
|
For pixel art you can reduce the pixel count during encoding and upscale it at decoding; it works really well. I'll try to find the exact command
|
|
|
damian101
|
2024-05-04 09:31:31
|
|
|
2024-05-04 09:33:12
|
|
|
|
w
|
2024-05-04 10:02:49
|
just use webp
|
|
2024-05-04 10:03:06
|
654 bytes
|
|
|
yoochan
|
2024-05-04 11:20:15
|
3905 bytes with lossless 🙂
|
|
|
Demiurge
|
|
yoochan
For pixel art you can reduce the pixel count during encoding and upscale it at decoding it works really well. I'll try to find the exact command
|
|
2024-05-04 11:23:36
|
you mean like nearest neighbor resizing?
|
|
|
yoochan
|
2024-05-04 11:27:30
|
yes, I first thought it was pixel art but the image is pixel-level drawing, my mistake
|
|
2024-05-04 11:30:00
|
we spoke about this here : https://discord.com/channels/794206087879852103/794206170445119489/1222214319198961714
|
|
|
TheBigBadBoy - 𝙸𝚛
|
2024-05-04 11:48:13
|
what's that tool again to convert JPG to PNG (and "remove" compression artifacts)?
|
|
|
username
|
|
TheBigBadBoy - 𝙸𝚛
what's that tool again to convert JPG to PNG (and "remove" compression artifacts) ?
|
|
2024-05-04 11:49:20
|
https://github.com/victorvde/jpeg2png
https://github.com/ilyakurdyukov/jpeg-quantsmooth
|
|
|
TheBigBadBoy - 𝙸𝚛
|
2024-05-04 11:49:35
|
thanks
|
|
2024-05-04 11:49:37
|
[⠀](https://cdn.discordapp.com/emojis/853506500088692747.webp?size=48&quality=lossless&name=pepelove)
|
|
2024-05-04 11:52:36
|
9 years ago [⠀](https://cdn.discordapp.com/emojis/852007419474608208.webp?size=48&quality=lossless&name=av1_woag)
|
|
|
damian101
|
|
w
654 bytes
|
|
2024-05-04 12:08:10
|
that's wild...
|
|
2024-05-04 12:08:19
|
|
|
2024-05-04 12:08:36
|
But ssimulacra2 prefers AVIF
|
|
|
Demiurge
|
2024-05-04 12:11:25
|
metrics are useless insofar as they disagree with your own eyes
|
|
2024-05-04 12:12:27
|
the goal of these metrics is to predict what a human would say
|
|
2024-05-04 12:12:57
|
So if you say it looks like shit while the metric says it's great then it's not you that's wrong, it's the metric
|
|
2024-05-04 12:13:16
|
since the metric is supposed to be designed to predict what you would say
|
|
2024-05-04 12:13:56
|
incidentally, all existing metrics are a very bad match to actual human vision
|
|
2024-05-04 12:15:06
|
But it requires time and fatiguing effort for a human to stare at and compare images
|
|
2024-05-04 12:15:13
|
that's why metrics are developed and used
|
|
2024-05-04 12:15:30
|
not because they are superior to humans but because they are faster and more convenient to use
|
|
2024-05-04 12:15:55
|
the goal of a lossy codec is to also closely match human vision
|
|
2024-05-04 12:16:09
|
therefore actual humans need to be the guide and the judge
|
|
2024-05-04 12:17:15
|
until metrics get much, much better at predicting what humans see
|
|
|
damian101
|
2024-05-04 12:32:06
|
At larger viewing distance, I definitely prefer AVIF, though
|
|
2024-05-04 12:32:58
|
WebP preserves more detail, but moves it around a lot.
|
|
2024-05-04 12:33:03
|
And chroma is also worse.
|
|
|
Meow
|
2024-05-04 01:14:41
|
Are those better than the denoising tools built into various software?
|
|
|
_wb_
|
2024-05-04 01:44:53
|
I think it's safe to say that most encoders are not tuned for doing lossy on a 64x64 image — for such small images the overhead of various constant-sized signaling becomes relevant, but this is often not really something that gets taken into account when tuning an encoder
|
|
|
LMP88959
|
|
w
just use webp
|
|
2024-05-04 02:09:18
|
that looks fantastic
|
|
|
username
|
|
w
just use webp
|
|
2024-05-04 05:23:21
|
here's what I got with WebP, file size is 650 bytes
|
|
|
LMP88959
|
2024-05-04 05:41:43
|
so webp is better for lossy icon sized pixel art?
|
|
|
damian101
|
|
LMP88959
so webp is better for lossy icon sized pixel art?
|
|
2024-05-04 07:52:21
|
webp preserves (only) high-contrast detail well, and doesn't try hard to prevent spatial distortion
|
|
|
LMP88959
|
2024-05-04 08:25:06
|
Ah i see
|
|
|
damian101
|
2024-05-04 09:26:19
|
so, yes, it's quite good appeal-wise for highly lossy low resolution content...
|
|
2024-05-04 09:26:54
|
But at such small resolutions you usually don't want to be that lossy...
|
|
|
LMP88959
|
2024-05-04 09:28:06
|
Yeah that is a good point. Data savings are pretty negligible with such small images
|
|
|
w
|
2024-05-04 09:30:13
|
webp often wins at pixel art at any resolution
|
|
2024-05-04 09:30:21
|
even for lossless
|
|
|
Demiurge
|
|
Demiurge
|
|
2024-05-05 03:28:25
|
Maybe this has something to do with this bug... https://github.com/libjxl/libjxl/issues/3512
|
|
2024-05-05 03:28:37
|
idk how to add an app14 tag to an existing jpeg to test
|
|
|
TheBigBadBoy - 𝙸𝚛
|
|
Demiurge
idk how to add an app14 tag to an existing jpeg to test
|
|
2024-05-05 04:54:08
|
https://encode.su/threads/2489-jpegultrascan-an-exhaustive-JPEG-scan-optimizer?p=82584&viewfull=1#post82584
the `jpegultrascan.pl` script adds it automatically, use `-b 0 -t $(nproc)` for fastest result
|
|
|
Demiurge
|
2024-05-05 09:53:21
|
Perl is cool
|
|
|
yoochan
|
2024-05-06 07:04:34
|
perl 5 or perl 6 ?
|
|
|
spider-mario
|
2024-05-06 09:12:51
|
both, in their own respective ways (but perl 6 is called raku now)
|
|
2024-05-07 06:20:53
|
https://gtmetrix.com/efficiently-encode-images.html
what kind of crappy heuristic is that
|
|
2024-05-07 06:21:04
|
what does “85% of their original quality” even mean
|
|
2024-05-07 06:21:47
|
degrading quality by 15% (whatever that means) for a 4 kB gain doesn’t sound like a worthy trade-off?
|
|
|
username
|
2024-05-07 06:22:49
|
"BMP" what who even uses BMPs anymore‽‽
|
|
|
spider-mario
|
2024-05-07 06:46:52
|
oh no https://developer.chrome.com/docs/lighthouse/performance/uses-optimized-images#how_lighthouse_flags_images_as_optimizable
|
|
2024-05-07 06:47:32
|
well, that almost looks more reasonable
|
|
2024-05-07 06:47:47
|
maybe that’s what the GTmetrix page meant to imply
|
|
|
Oleksii Matiash
|
2024-05-07 07:06:28
|
I'm confused, bmp quality 85? <:WhatThe:806133036059197491>
|
|
|
Eugene Vert
|
2024-05-07 08:40:49
|
It's a bit weird that they don't use quantization table analysis for jpeg, like in `magick identify -format '%Q' 1.JPG` or https://gist.github.com/atr000/559324
|
|
|
a goat
|
2024-05-08 07:06:35
|
Are there any GPU accelerated difference metrics with comparable results to butteraugli for 4MP and up images?
|
|
|
Quackdoc
|
2024-05-08 07:16:33
|
the only semi-decent gpu-accelerated metric I know of is a port of ssimu2rs to onnx that someone made
|
|
|
a goat
|
|
Quackdoc
the only semi decent gpu accelerated metric I know of is someone ported ssimu2rs to onnx
|
|
2024-05-08 07:23:46
|
Oh? Is there a repo?
|
|
|
Quackdoc
|
2024-05-08 07:24:14
|
I don't think so, the source was just a zip release iirc
|
|
2024-05-08 07:24:38
|
ill see if I can dig it up
|
|
|
a goat
Oh? Is there a repo?
|
|
2024-05-08 07:27:28
|
https://cdn.discordapp.com/attachments/1042536514783023124/1118195477188444230/ssimulacra2_bin-gpu.zip?ex=663ce730&is=663b95b0&hm=5275ab283f34843656fe6580495da77789a4535437f62aa886adcba19ded4963&
|
|
2024-05-08 07:27:32
|
oh crap
|
|
2024-05-08 07:27:40
|
actually no that worked
|
|
|
a goat
|
|
Quackdoc
https://cdn.discordapp.com/attachments/1042536514783023124/1118195477188444230/ssimulacra2_bin-gpu.zip?ex=663ce730&is=663b95b0&hm=5275ab283f34843656fe6580495da77789a4535437f62aa886adcba19ded4963&
|
|
2024-05-08 08:44:55
|
Thanks! This will come in handy
|
|
|
Demiurge
|
|
Jyrki Alakuijala
we could do this -- now we are opting towards smoothness, but to a lesser degree than most other codecs -- we used to do this a lot more, but Jon changed it (with a 2-3 % objective score improvement) back in the day, perhaps three years ago -- we could have modes for this
|
|
2024-05-08 11:46:47
|
Is it possible to do this in a way that looks acceptable for non-photographic graphics as well?
|
|
2024-05-08 11:47:44
|
Grainy, bluenoise-like lossy artifacts I mean
|
|
2024-05-08 11:58:38
|
Certain types of dithering look good for non-photo content. So maybe it's possible to produce lossy compression artifacts similar to that, in a way that looks natural and less disruptive to the human eye for both photo and non-photo. Metrics be damned.
|
|
|
|
JendaLinda
|
2024-05-11 08:32:36
|
It's fun to create image files so small that they fit inside the NTFS file entry itself.
|
|
|
lonjil
|
2024-05-11 08:45:42
|
what size is that?
|
|
2024-05-11 08:45:47
|
on zfs it's 112 bytes
|
|
|
|
JendaLinda
|
2024-05-11 08:47:39
|
Around 600 bytes. It depends on the file attributes and length of the file name.
|
|
|
HCrikki
|
2024-05-11 08:47:59
|
quick question, can that be made to generate QR codes?
|
|
|
|
JendaLinda
|
2024-05-11 08:50:17
|
Small files could be turned to QR codes as well.
|
|
2024-05-11 08:53:15
|
Anyway, Windows stores the payload of very small files inside their file entries automatically.
|
|
|
jonnyawsom3
|
2024-05-11 09:19:50
|
500 bytes is the rough limit, probably meant to be 512 but filename, etc....
|
|
|
JendaLinda
Small files could be turned to QR codes as well.
|
|
2024-05-11 09:22:44
|
I tried to do that, but there's no way to open the file upon scan. `file:///` doesn't default to a browser and just shows as text, so I had to resort to a usual link with a redirect to a website hosting the file
|
|
|
|
JendaLinda
|
2024-05-11 09:28:45
|
I have a 627 bytes file that still fits.
|
|
|
I tried to do that, but there's no way to open the file upon scan. `file:///` doesn't default to a browser and just shows as text, so I had to resort to a usual link with a redirect to a website hosting the file
|
|
2024-05-11 09:30:08
|
You need QR code software that can encode and decode binary files. QR codes can store any data.
|
|
|
lonjil
|
2024-05-11 09:31:57
|
fun fact: "binary" QR is universally interpreted as UTF-8, so when actual arbitrary binary data needs to be stored, people use text-mode QR with the data in base45 encoding.
|
|
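For reference, Base45 (RFC 9285) is tiny: two bytes become one 16-bit number written as three base-45 digits from the QR alphanumeric charset, least significant digit first. A minimal encoder sketch:
```python
# Base45 alphabet = QR's alphanumeric character set (RFC 9285)
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

def b45encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data) - 1, 2):
        n = data[i] * 256 + data[i + 1]            # two bytes -> 0..65535
        out += [ALPHABET[n % 45], ALPHABET[n // 45 % 45], ALPHABET[n // 2025]]
    if len(data) % 2:                              # lone trailing byte
        n = data[-1]
        out += [ALPHABET[n % 45], ALPHABET[n // 45]]
    return "".join(out)

print(b45encode(b"AB"))      # 'BB8', matching the RFC test vector
```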
|
|
JendaLinda
|
2024-05-11 09:38:50
|
That only depends on how the implementation interprets the bytes.
|
|
|
jonnyawsom3
|
|
JendaLinda
You need a QR code software that can encode and decode binary files. QR codes can store any data.
|
|
2024-05-11 09:39:51
|
Not too useful when trying to make it work for the average person unfortunately
|
|
|
|
JendaLinda
|
2024-05-11 09:41:08
|
Encoding binary data in QR is usually application specific.
|
|
|
Meow
|
2024-05-11 10:12:54
|
Good, I should experiment with QR codes in various codecs later
|
|
|
jonnyawsom3
|
2024-05-11 11:12:15
|
I've posted this in this channel before, but it's a unique way of explaining colors and perceptual color spaces using Minecraft in an accessible form https://youtu.be/e0HM_vfSuDw
|
|
|
yoochan
|
|
lonjil
fun fact: "binary" QR is universally interpreted as UTF-8, so when actual arbitrary binary data needs to be stored, people use text-mode QR with the data in base45 encoding.
|
|
2024-05-11 01:35:43
|
What a waste. IIRC, Shift JIS is also part of the standard. A pre-Unicode relic
|
|
|
Meow
|
2024-05-11 03:10:43
|
UTF-8 is a must for CJKV characters
|
|
|
yoochan
|
2024-05-11 03:29:58
|
Only when you want to mix them... Before Unicode, Japanese, Chinese, and Russian users used language-specific codepages, which were much better for each language but a nightmare to mix
|
|
|
|
JendaLinda
|
2024-05-11 04:01:47
|
And different operating systems were using different codepages for the same language. It's a good thing there is now finally a standard everybody agreed on.
|
|
|
Meow
|
|
yoochan
Only when you want to mix them... Before unicode Japaneses, Chineses, Russians used language specific codepages which were much better for each language but a nightmare to mix
|
|
2024-05-11 04:04:03
|
A worse nightmare even before Unicode
|
|
2024-05-11 04:04:21
|
If you lived in such environment
|
|
|
|
JendaLinda
|
2024-05-11 04:06:34
|
Just for my native language, there exist about 6 different codepages.
|
|
|
Meow
|
2024-05-11 04:37:01
|
We can also experiment with QR code version 40 in JXL
|
|
|
jonnyawsom3
|
|
I tried to do that, but there's no way to open the file upon scan. `file:///` doesn't default to a browser and just shows as text, so I had to resort to a usual link with a redirect to a website hosting the file
|
|
2024-05-11 04:55:52
|
When I was messing with QR codes, I tried to make JXL art of a QR code which then scanned as itself, but naturally every change to the image changes what the code looks like, so I was slowly turning into Sisyphus
|
|
|
Crite Spranberry
|
|
Meow
|
2024-05-11 05:20:56
|
There are some utilities that can produce QR codes blended with an image
|
|
|
spider-mario
|
|
JendaLinda
Just for my native language, there exist about 6 different codepages.
|
|
2024-05-11 05:25:50
|
may I ask which language that is?
|
|
|
Oleksii Matiash
|
2024-05-11 05:47:10
|
I know about 3 CP for my language, but 6 🤯
|
|
|
|
JendaLinda
|
|
spider-mario
may I ask which language that is?
|
|
2024-05-11 05:49:39
|
Czech
|
|
2024-05-11 05:51:17
|
There are ISO 8859-2, Windows-1250, Mac OS CE, CP 852, Kamenický, KOI8-CS
|
|
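Four of those are in Python's stdlib codec registry, so the differences are easy to see on one Czech word (Kamenický and KOI8-CS have no stdlib codec; mac_latin2 is Python's name for the Mac OS Central European set):
```python
word = "žluťoučký"
for cp in ("iso8859-2", "cp1250", "mac_latin2", "cp852"):
    print(f"{cp:11} {word.encode(cp).hex(' ')}")
print(f"{'utf-8':11} {word.encode('utf-8').hex(' ')}")
```
Same letters, four different byte sequences, which is exactly the mixing nightmare described above.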
|
Oleksii Matiash
|
2024-05-11 06:10:14
|
Ah, yes, ISO, I forgot about it. ISO 8859-5, Windows-CP1251, CP866, KOI8-U
|
|
|
Meow
|
2024-05-12 03:04:25
|
QR code version 40
|
|
2024-05-12 03:05:16
|
The original image
|
|
2024-05-12 03:10:19
|
Comparing to PNG
QOI 4770%
AVIF 284%
HEIC 10118%
JXL 222%
WebP 104%
All lossless
|
|
|
jonnyawsom3
|
2024-05-12 03:25:19
|
Had to reduce the size slightly since it was 1:9 instead of 1:8, but close to the original
|
|
2024-05-12 03:27:26
|
Did get it slightly under 10 KB completely lossless too though
|
|
2024-05-12 03:30:25
|
45% `cjxl "QR code Pixels.png" "QR code.jxl" -d 0 -e 10 -g 3 -I 0 -P 0 --resampling=8 --already_downsampled --upsampling_mode=0`
159% `cjxl "QR code.png" "QR code 10KB.jxl" -d 0 -e 10 -g 3 -I 0 -P 0`
|
|
2024-05-12 03:33:06
|
Oh, and here's the 'Pixels' input
|
|
|
Meow
|
2024-05-12 03:45:01
|
Yes still scannable
|
|
|
jonnyawsom3
|
2024-05-12 03:53:00
|
I do wonder if an LZ77 only JXL file would be possible or even worth it...
|
|
|
Jyrki Alakuijala
|
|
Oh, and here's the 'Pixels' input
|
|
2024-05-13 09:02:37
|
for that image it is critical that the LZ77 will find the 4-line pattern and favor the LZ77 copy that is exactly 4 lines above -- this can be rare for images in general, basic heuristics in WebP lossless try combos of previous-line, previous-pixel and all LZ77 -- I don't know what is favored in LZ77 in JPEG XL's current encoder (Luca wrote that)
|
|
2024-05-13 09:04:26
|
often faster LZ77 tries to find matches with hashing -- and hashing is looking at a few symbols only, 3 or 4
|
|
2024-05-13 09:05:36
|
with PNG and WebP lossless there is a step that first collapses 8 binary symbols into a single symbol, so hashing looks at 16 (in WebP I think -- as hashing there is based on two pixels IIRC) or 24 (in PNG/zlib) bits at once (zlib hashing three bytes)
|
|
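A quick way to see why the match distance matters (using zlib as a stand-in LZ77 coder, not the PNG/WebP/JXL entropy stages): a buffer that repeats every 4 rows collapses once the matcher finds the copy 4 rows back, while noise doesn't.
```python
import os
import zlib

row = 100                                                      # bytes per row
pattern = b"".join(bytes([i]) * row for i in range(4)) * 150   # 4-row period
noise = os.urandom(len(pattern))
print("4-row pattern:", len(zlib.compress(pattern, 9)), "bytes")
print("random noise: ", len(zlib.compress(noise, 9)), "bytes")
```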
|
jonnyawsom3
|
2024-05-13 09:49:14
|
Since that image is a 1-bit palette, I assume as many pixels get hashed as there are bits
|
|
|
|
veluca
|
|
Jyrki Alakuijala
for that image it is critical that the LZ77 will find the 4-line pattern and favor the LZ77 copy that is exactly 4 lines above -- this can be rare for images in general, basic heuristics in WebP lossless try combos of previous-line, previous-pixel and all LZ77 -- I don't know what is favored in LZ77 in JPEG XL's current encoder (Luca wrote that)
|
|
2024-05-13 10:01:02
|
I think special distances are favoured but perhaps not
|
|
|
_wb_
|
2024-05-13 03:06:05
|
we don't have bit packing for low bitdepth images in jxl (unlike png which packs eight 1-bit pixels in a byte, or four 2-bit pixels, or two 4-bit pixels)
|
|
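The packing in question, illustrated with numpy (generic bit packing, not jxl code): eight 1-bit pixels share one byte, so a 1-bit row costs width/8 bytes in PNG.
```python
import numpy as np

row = np.array([1, 0, 1, 1, 0, 0, 1, 0,  1, 1, 1, 1, 0, 0, 0, 0], dtype=np.uint8)
packed = np.packbits(row)                    # 16 pixels -> 2 bytes
print([f"{b:08b}" for b in packed])          # ['10110010', '11110000']
assert (np.unpackbits(packed) == row).all()  # round-trips exactly
```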
2024-05-13 03:06:39
|
but we could approximate it by doing some horizontal squeeze steps and then doing palette on the result
|
|
2024-05-13 03:07:52
|
if you have e.g. a single-channel 1-bit 800x600 image, then with some squeezing you can turn it into eight channels that are 100x600
|
|
2024-05-13 03:08:26
|
then we can do an 8-channel palette on that
|
|
2024-05-13 03:10:05
|
it will probably have more than 256 'colors' though, due to the squeeze tendency thing, but more or less it turns 8 pixels into one pixel with a bigger range
|
|
|
jonnyawsom3
|
2024-05-13 03:20:41
|
Riiight, I remember you mentioning the squeeze as a replacement before
|
|
2024-05-13 03:21:27
|
I know I've already seen it work, by turning on squeeze for lossless on a single color image
|
|
|
_wb_
|
2024-05-13 04:38:59
|
for something where png's bitpacking happens to be very beneficial (like that qr code), doing a custom squeeze script + palette could be useful, currently libjxl has no way to do that though
|
|
2024-05-13 04:39:34
|
possibly this could be a strategy worth trying for images that are 1 bit (or even 2 or 3 bit)
|
|
2024-05-13 04:40:58
|
for now we haven't really tried to optimize compression for such relatively niche images — but at some point we should look into it, and make sure we consistently beat png on such images too
|
|
|
jonnyawsom3
|
2024-05-13 04:57:50
|
Squeezed and upsampled QR codes... Just think of how many more tricks there are to find
|
|
|
Meow
|
|
_wb_
for now we haven't really tried to optimize compression for such relatively niche images — but at some point we should look into it, and make sure we consistently beat png on such images too
|
|
2024-05-13 06:22:42
|
QR codes aren't niche
|
|
|
|
JendaLinda
|
2024-05-13 06:41:42
|
Scanned documents are often 1 bit.
|
|
|
_wb_
|
2024-05-13 07:31:51
|
It's quite oldschool imo to scan in 1 bit, and not grayscale. Reminds me of fax machines. But yes, there is certainly legacy content like that, and some things still do that.
|
|
|
Meow
QR codes aren't niche
|
|
2024-05-13 07:32:42
|
A more efficient way to store a QR code is to store the actual content and generate the QR code from that if you need it visually 🙂
|
|
|
190n
|
2024-05-13 07:36:52
|
now could you express qr encoding logic in jxl_from_tree
|
|
|
jonnyawsom3
|
2024-05-13 07:49:50
|
I did try to make a QR code using splines, got about a quarter of the way before realising my insanity
|
|
|
|
veluca
|
2024-05-13 07:54:15
|
QR codes can be quite resilient
|
|
2024-05-13 07:56:00
|
my phone could scan that QR without any issues, despite it being hand-drawn
|
|
2024-05-13 07:56:06
|
(don't ask why)
|
|
|
TheBigBadBoy - 𝙸𝚛
|
2024-05-13 08:43:17
|
what about stable diffusion QR codes <:KekDog:805390049033191445>
https://cdn.arstechnica.net/wp-content/uploads/2023/06/qr_code_lady-800x450.jpg
|
|
2024-05-13 08:43:40
|
mine can hardly get it after a few tries
|
|
|
190n
|
2024-05-13 08:44:49
|
for me binary eye gets it very quickly but google camera has trouble
|
|
|
jonnyawsom3
|
|
TheBigBadBoy - 𝙸𝚛
what about stable diffusion QR codes <:KekDog:805390049033191445>
https://cdn.arstechnica.net/wp-content/uploads/2023/06/qr_code_lady-800x450.jpg
|
|
2024-05-13 08:51:28
|
WebP strikes again
|
|
|
TheBigBadBoy - 𝙸𝚛
|
|
|
JendaLinda
|
|
_wb_
It's quite oldschool imo to scan in 1 bit, and not grayscale. Reminds me of fax machines. But yes, there is certainly legacy content like that, and some things still do that.
|
|
2024-05-13 08:53:46
|
I've seen that more often than I like, unfortunately. 1 bit scans are usually hard to read.
|
|
|
spider-mario
|
|
veluca
QR codes can be quite resilient
|
|
2024-05-13 08:53:56
|
when covid certificates were still a thing, I tried to order temporary tattoos of mine
|
|
2024-05-13 08:54:06
|
sadly, once applied to the skin, it didn’t read properly
|
|
|
jonnyawsom3
|
2024-05-13 08:56:25
|
Should've gone with RFID implants clearly
|
|
|
TheBigBadBoy - 𝙸𝚛
what about stable diffusion QR codes <:KekDog:805390049033191445>
https://cdn.arstechnica.net/wp-content/uploads/2023/06/qr_code_lady-800x450.jpg
|
|
2024-05-13 08:58:07
|
Still 1 bit ;P
|
|
|
Demiurge
|
|
spider-mario
when covid certificates were still a thing, I tried to order temporary tattoos of mine
|
|
2024-05-14 09:02:42
|
lmao...
|
|
|
|
JendaLinda
|
2024-05-14 09:30:31
|
I would consider QR codes displayed using block characters in VGA text mode. Using half block characters, it's possible to display 80x50 pixels.
|
|
|
lonjil
|
2024-05-14 09:39:08
|
```
█████████████████████████████████
█████████████████████████████████
████ ▄▄▄▄▄ █ █▄▀▀▄▄█ ▄▄▄▄▄ ████
████ █ █ █ ▀▄ █▀ █ █ █ ████
████ █▄▄▄█ █▀██▀▀█▄▄▀█ █▄▄▄█ ████
████▄▄▄▄▄▄▄█▄▀▄█ █ ▀▄█▄▄▄▄▄▄▄████
████ ▄▀█ ▄▀█▀▀▄▀▀█▀▀██▄▀ ▄ ▄████
████▀▀▄ █▄▄▀ █▀ ▄█ ▄▄▀ ▀ ▄▀██████
████▀ ▀▄ ▄█▀▀▄█▄▀ ▀ ▄▄▄▀▀▄▄▄████
██████ ▀▄▀▀▀██▀██ ▀▄█ █ ▀▀████
████▄▄█▄▄▄▄█▀ █▄▀▀█▄ ▄▄▄ ▄▄▄█████
████ ▄▄▄▄▄ █▀█▀ ▄█ █ █▄█ ▀█▀████
████ █ █ █▄█▄█▄▀█▀ ▄ ▄█▄ ████
████ █▄▄▄█ █▀ ▄█▀██▀▄ ▄ ▀▄ █████
████▄▄▄▄▄▄▄█▄█▄▄▄▄▄▄██▄█▄█▄▄▄████
█████████████████████████████████
█████████████████████████████████
```
|
|
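For reference, a terminal QR like the one above takes only a few lines (this assumes the third-party `qrcode` package; dark modules become spaces so it reads correctly on a dark terminal):
```python
import qrcode

qr = qrcode.QRCode(border=2)
qr.add_data("https://example.com")
qr.make()
grid = qr.get_matrix()                      # rows of booleans, True = dark
if len(grid) % 2:
    grid.append([False] * len(grid[0]))     # pad to an even number of rows
glyph = {(0, 0): "█", (0, 1): "▀", (1, 0): "▄", (1, 1): " "}
for top, bot in zip(grid[::2], grid[1::2]): # one text row = two module rows
    print("".join(glyph[int(a), int(b)] for a, b in zip(top, bot)))
```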
|
Demiurge
|
2024-05-14 09:54:36
|
Is UTF8 considered a codepage? lol... or are codepages not referring to multi-byte encoding?
|
|
|
jonnyawsom3
|
|
lonjil
```
█████████████████████████████████
█████████████████████████████████
████ ▄▄▄▄▄ █ █▄▀▀▄▄█ ▄▄▄▄▄ ████
████ █ █ █ ▀▄ █▀ █ █ █ ████
████ █▄▄▄█ █▀██▀▀█▄▄▀█ █▄▄▄█ ████
████▄▄▄▄▄▄▄█▄▀▄█ █ ▀▄█▄▄▄▄▄▄▄████
████ ▄▀█ ▄▀█▀▀▄▀▀█▀▀██▄▀ ▄ ▄████
████▀▀▄ █▄▄▀ █▀ ▄█ ▄▄▀ ▀ ▄▀██████
████▀ ▀▄ ▄█▀▀▄█▄▀ ▀ ▄▄▄▀▀▄▄▄████
██████ ▀▄▀▀▀██▀██ ▀▄█ █ ▀▀████
████▄▄█▄▄▄▄█▀ █▄▀▀█▄ ▄▄▄ ▄▄▄█████
████ ▄▄▄▄▄ █▀█▀ ▄█ █ █▄█ ▀█▀████
████ █ █ █▄█▄█▄▀█▀ ▄ ▄█▄ ████
████ █▄▄▄█ █▀ ▄█▀██▀▄ ▄ ▀▄ █████
████▄▄▄▄▄▄▄█▄█▄▄▄▄▄▄██▄█▄█▄▄▄████
█████████████████████████████████
█████████████████████████████████
```
|
|
2024-05-14 10:25:32
|
Jesus christ, even after reading the block character message and thinking "Huh, good idea". I just spent a full minute tapping on your message thinking it was just an underexposed printout
|
|
|
lonjil
|
|
Meow
|
2024-05-14 11:16:11
|
I tried to scan it immediately
|
|
|
|
JendaLinda
|
|
Demiurge
Is UTF8 considered a codepage? lol... or are codepages not referring to multi-byte encoding?
|
|
2024-05-14 11:55:59
|
UTF-8 is not a codepage, UTF-8 is one of the methods to encode Unicode. Unicode is technically a codepage. The one codepage that rules them all.
|
|
|
yoochan
|
2024-05-14 11:57:18
|
a bit like TRON: https://en.wikipedia.org/wiki/TRON_(encoding) 😄
|
|
|
|
JendaLinda
|
2024-05-14 12:06:03
|
Another commonly used Unicode encoding is UTF-16. It's the same idea as UTF-8, but UTF-16 uses 16-bit words rather than bytes as its code units. Both UTF-8 and UTF-16 sequences encode the same Unicode code points.
|
|
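The relationship is easy to inspect from Python (an illustrative check): the same three code points in the three encodings discussed here, with three different byte counts.
```python
s = "A€😀"   # U+0041, U+20AC, U+1F600; the emoji needs a UTF-16 surrogate pair
print([hex(ord(c)) for c in s])
for enc in ("utf-8", "utf-16-le", "utf-32-le"):
    b = s.encode(enc)
    print(f"{enc:9} {len(b):2} bytes  {b.hex(' ')}")
```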
|
yoochan
|
2024-05-14 12:06:39
|
and UTF-32, rarely used
|
|
|
|
JendaLinda
|
2024-05-14 12:09:17
|
I'd say UTF-32 is actually the raw Unicode encoding, as it encodes the code points directly.
|
|
|
yoochan
|
2024-05-14 12:10:06
|
indeed, but the name exists...
|
|
|
|
JendaLinda
|
2024-05-14 12:12:09
|
It does, to specify the exact encoding, so there's no confusion.
|
|
|
spider-mario
|
2024-05-14 12:17:17
|
UTF-16 is the worst of both worlds
|
|
2024-05-14 12:17:52
|
more bloated than UTF-8 and byte-order-sensitive but without the constant-width property of UTF-32
|
|
2024-05-14 12:18:13
|
it originates from when it was thought that 16 bits would be enough to be constant-width (UCS-2)
|
|
2024-05-14 12:18:21
|
later on, “oops, actually not”
|
|
|
|
JendaLinda
|
2024-05-14 12:20:48
|
Microsoft also believed 64k code points would be enough for everybody, so they chose UTF-16 as the default Unicode encoding in Windows.
|
|
|
spider-mario
|
2024-05-14 12:21:22
|
https://utf8everywhere.org/#facts
|
|
2024-05-14 12:21:38
|
> UTF-16 is the worst of both worlds, being both variable length and too wide. It exists only for historical reasons and creates a lot of confusion. We hope that its usage will further decline.
|
|
|
yoochan
|
2024-05-14 12:22:17
|
Still waiting for the UTF4, nibble based
|
|
2024-05-14 12:23:33
|
and for Unicode to accept a "decimal separator" as a codepoint, to do away with the comma-vs-dot issues in numbers around the world and hence avoid WW3
|
|
|
spider-mario
|
2024-05-14 12:24:01
|
(arguably, even codepoints being a constant number of code units is overrated, since graphemes can consist of several codepoints anyway)
|
|
2024-05-14 12:24:15
|
this is e + combining ´: é
|
|
|
|
JendaLinda
|
2024-05-14 12:24:22
|
Others have chosen UTF-8 for its flawless backward compatibility with ASCII. So Microsoft is alone with UTF-16.
|
|
|
spider-mario
|
2024-05-14 12:25:12
|
```python
>>> [ord(c) for c in 'é']
[101, 769]
>>> [ord(c) for c in 'é']
[233]
```
|
|
|
|
JendaLinda
|
2024-05-14 12:26:53
|
It's not perfect. There are incremental additions and new ideas but the old stuff couldn't be changed.
|
|
|
spider-mario
|
2024-05-14 12:27:57
|
```shell
$ perl -Mutf8 -E 'binmode *STDOUT, ":encoding(UTF-8)"; while ("énervé" =~ /(\X)/g) { say "Grapheme: $1"; for my $c (split "", $1) { say " codepoint: ", ord($c) } }'
Grapheme: é
codepoint: 101
codepoint: 769
Grapheme: n
codepoint: 110
Grapheme: e
codepoint: 101
Grapheme: r
codepoint: 114
Grapheme: v
codepoint: 118
Grapheme: é
codepoint: 101
codepoint: 769
```
|
|
2024-05-14 12:29:19
|
```shell
$ perldoc perlre
[…]
\X [4] Match Unicode "eXtended grapheme cluster"
```
|
|
2024-05-14 12:29:58
|
```shell
$ perldoc perlrebackslash
[…]
\X This matches a Unicode *extended grapheme cluster*.
"\X" matches quite well what normal (non-Unicode-programmer) usage
would consider a single character. As an example, consider a G with
some sort of diacritic mark, such as an arrow. There is no such
single character in Unicode, but one can be composed by using a G
followed by a Unicode "COMBINING UPWARDS ARROW BELOW", and would be
displayed by Unicode-aware software as if it were a single
character.
The match is greedy and non-backtracking, so that the cluster is
never broken up into smaller components.
See also "\b{gcb}".
Mnemonic: e*X*tended Unicode character.
```
|
|
|
lonjil
|
|
spider-mario
it originates from when it was thought that 16 bits would be enough to be constant-width (UCS-2)
|
|
2024-05-14 12:31:33
|
Note that the international community insisted that 16 bit wasn't enough, but early Unicode, composed entirely of American tech companies, thought they knew better.
|
|
2024-05-14 12:32:21
|
And it only took like, a year after Unicode 1.0 for the mistake to become a problem.
|
|
|
spider-mario
|
2024-05-14 12:46:59
|
I love UTF-8, it’s a work of beauty
|
|
2024-05-14 12:47:06
|
UTF-32 is, eh, very niche, but “why not”
|
|
2024-05-14 12:47:16
|
UTF-16 is a nope from me
|
|
2024-05-14 12:47:31
|
pointless, no pun intended
|
|
|
lonjil
|
2024-05-14 12:49:43
|
I think ISO wanted 32 bits
|
|
|
|
JendaLinda
|
2024-05-14 12:50:39
|
Windows uses three character encodings at once, for different purposes.
1) UTF-16, so-called "UNICODE", the default for the OS and applications compiled with Unicode support.
2) Legacy 8 bit Windows encoding, depending on the country and language settings, so-called "ANSI" encoding, used by legacy graphical Windows applications without Unicode support, usually applications written for Win9x.
3) 8 bit MS-DOS encoding, also depending on the country and language settings, so-called "OEM" encoding, used by console applications that don't support the quirky Windows Unicode implementation. The Windows terminal itself supports Unicode, but programs will use the MS-DOS encoding by default; this includes cjxl, djxl and other tools.
|
|
|
spider-mario
|
2024-05-14 01:06:38
|
oh, I thought we’d be using ANSI by default
|
|
2024-05-14 01:07:34
|
wasn’t there a PR that did the manifest thing that makes ANSI=UTF-8? does it not do what was hoped of it, then?
|
|
2024-05-14 01:08:28
|
why is MS-DOS even still relevant at all? Windows has dropped NTVDM
|
|
|
|
JendaLinda
|
2024-05-14 01:30:22
|
MS-DOS encoding was the default in the MS-DOS command line in Win9x and there were also 32 bit "console applications" running in Windows 9x so these had to use the MS-DOS codepage as well.
|
|
2024-05-14 01:35:18
|
Curiously, batch/cmd scripts are also assumed to be encoded in MS-DOS codepage by default.
|
|
2024-05-15 01:07:55
|
I wonder if any application actually used the default VGA palette.
|
|
2024-05-15 01:09:30
|
It seems to be based on the HSL model and has poor coverage of the RGB color space, so it's not very useful. The default 256-color palette was there as a placeholder because everybody was using custom palettes anyway.
|
|
|
_wb_
|
2024-05-15 02:05:02
|
Can someone explain why so many color spaces have a transfer function that has a linear segment near black? sRGB, Rec709, Rec2020, ProPhoto: they all do that. What is the purpose of this? I've seen many vague references to "near black issues" but I still don't understand what the problem is and how this solves it.
|
|
|
lonjil
|
2024-05-15 02:08:53
|
I assume it has something to do with the derivative being 0 at 0
|
|
|
|
afed
|
2024-05-15 02:08:57
|
maybe for that
<https://poynton.ca/notes/colour_and_gamma/GammaFAQ.html>
|
|
|
_wb_
|
2024-05-15 02:10:02
|
There are plenty of other color spaces / transfer functions (Adobe98, PQ, HLG, DCI-P3) that don't have such a linear segment near black, so I don't think there's a math reason for it
|
|
2024-05-15 02:11:07
|
"minimizes the effect of sensor noise"? I don't understand what this means.
|
|
|
|
afed
|
|
afed
maybe for that
<https://poynton.ca/notes/colour_and_gamma/GammaFAQ.html>
|
|
2024-05-15 02:13:37
|
|
|
|
_wb_
|
2024-05-15 02:15:14
|
When I select "ProPhoto" in photoshop, it uses something with a pure gamma curve, while the official ProPhoto definition does have a linear segment near black. Maybe it doesn't matter enough for Adobe to care about it.
|
|
2024-05-15 02:18:32
|
It's curious that all the "old" transfer functions have a linear segment near black, and then at some point this was no longer fashionable — any "new" transfer functions don't do such special near-black segments anymore.
|
|
|
|
JendaLinda
|
2024-05-15 05:02:02
|
I guess IBM didn't take any of that into consideration when they designed VGA.
|
|
|
kkourin
|
|
_wb_
"minimizes the effect of sensor noise"? I don't understand what this means.
|
|
2024-05-15 05:23:32
|
I guess the idea is to dedicate less bits to the noise floor area?
|
|
|
Quackdoc
|
|
_wb_
Can someone explain why so many color spaces have a transfer function that has a linear segment near black? sRGB, Rec709, Rec2020, ProPhoto: they all do that. What is the purpose of this? I've seen many vague references to "near black issues" but I still don't understand what the problem is and how this solves it.
|
|
2024-05-15 09:13:33
|
it's a retro detail from when CRTs needed built-in flare compensation and stuff
|
|
2024-05-15 09:13:50
|
it's outdated and should have died long ago, but people refuse to give it up
|
|
2024-05-15 10:07:21
|
actually, correction: it turns out it has different intents depending on the specification? I found a comment from Jack Holm stating this <@456226577798135808>
> The straight line part of the sRGB EOTF was to avoid extreme slope, which caused problems in some color management systems. It also partly addresses the difference between the expected black point of 0.1 nit for 1886 and 0.2 nit for sRGB (with more veiling glare because of the higher ambient).
https://community.acescentral.com/t/srgb-piece-wise-eotf-vs-pure-gamma/4024/27
|
|
2024-05-15 10:07:38
|
ah man <@794205442175402004> meant to tag you instead
|
|
2024-05-15 10:08:32
|
so it handles the glare partially, but also has other uses
|
|
|
spider-mario
|
2024-05-15 10:48:39
|
glare, the bane of my existence
|
|
|
dogelition
|
2024-05-15 11:21:28
|
BT.709 and BT.2020 are kinda weird in that they used to not specify an EOTF at all, and now both refer to BT.1886, which is essentially a pure 2.4 gamma EOTF on a reference monitor
so the piecewise transfer function (OETF) specified in BT.709/BT.2020 only applies to cameras and is irrelevant for how content should be displayed, i.e. areas like display calibration and color management
|
|
|
_wb_
|
2024-05-16 07:30:12
|
I now think the linear part is there just so you can do conversions from linear to sRGB and back in limited precision, fixed-point arithmetic (e.g. uint16), without running into trouble. If you make it a pure gamma curve, you cannot do that accurately in fixed-point arithmetic. In float32 this all doesn't matter since that has plenty of precision, or even float16 is probably OK (the important thing is to have floating point and not fixed point, since you need more precision near zero), but in uint16 I guess it does make a difference, and I suspect much of the early CMS implementations couldn't afford floats.
Since the displays and viewing conditions sRGB was designed for had substantial amounts of glare anyway, I guess they tried to kill two birds with one stone and defined things so that when viewing an sRGB image on a display that actually renders the pixels as gamma 2.2, it results in some glare compensation.
|
|
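For concreteness, here is the sRGB pair being discussed, with its standard constants; the point is the finite 12.92 slope at zero, where a pure gamma curve has an unbounded one:
```python
def srgb_encode(x: float) -> float:          # linear light -> sRGB value
    return 12.92 * x if x <= 0.0031308 else 1.055 * x ** (1 / 2.4) - 0.055

def srgb_decode(v: float) -> float:          # sRGB value -> linear light
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

eps = 1e-6
print("sRGB slope near 0:      ", srgb_encode(eps) / eps)   # 12.92
print("pure gamma slope near 0:", eps ** (1 / 2.2) / eps)   # ~1870, blows up
print("round trip of 0.5:      ", srgb_decode(srgb_encode(0.5)))
```
The bounded slope is what keeps fixed-point round trips near black well-conditioned, which matches the "extreme slope caused problems in some color management systems" quote above.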
|
Quackdoc
|
2024-05-16 07:47:20
|
yeah, the thread was quite... hehe illuminating
|
|
|
jonnyawsom3
|
2024-05-16 10:02:32
|
Mentioning float32 reminded me
https://github.com/libjxl/libjxl/issues/3511
|
|
|
Jyrki Alakuijala
|
|
_wb_
Can someone explain why so many color spaces have a transfer function that has a linear segment near black? sRGB, Rec709, Rec2020, ProPhoto: they all do that. What is the purpose of this? I've seen many vague references to "near black issues" but I still don't understand what the problem is and how this solves it.
|
|
2024-05-16 01:46:55
|
spontaneous isomerization of opsin chemicals creates a linear response at very low intensity values in the eye -- I used biasing of the log-function for this (in butteraugli xyb)
|
|
2024-05-16 01:48:29
|
or not quite sure if it is called isomerization -- spontaneous excitation nonetheless
|
|
|
KKT
|
2024-05-16 11:24:30
|
This may seem slightly off-topic, but bear with me – I need the hive mind. If you were creating a website that was open for the community to edit, what would you do it in? Any frontend devs that use tools or frameworks that would make sense in this context? Straight up HTML & CSS would be a bit painful if the site becomes at all complex… Gohugo? Tailwind?
|
|
|
_wb_
|
2024-05-17 05:12:34
|
This is very much on-topic since the question is relevant for jpegxl.info too 😉
|
|
|
yoochan
|
|
KKT
This may seem slightly off-topic, but bear with me – I need the hive mind. If you were creating a website that was open for the community to edit, what would you do it in? Any frontend devs that use tools or frameworks that would make sense in this context? Straight up HTML & CSS would be a bit painful if the site becomes at all complex… Gohugo? Tailwind?
|
|
2024-05-17 05:25:25
|
Where the content is open to edit? Wiki like?
|
|
2024-05-17 05:28:51
|
Jxlinfo content is managed as a git repository if I remember correctly, which delegates the collaborative coordination to a tool made for it
|
|
|
_wb_
|
2024-05-17 05:40:41
|
It might be nice to put some static page generation thing between the editable source and the final html. So we can have things like multiple pages with nice menus and all, without having to manually keep it all consistent.
|
|
2024-05-17 05:43:25
|
That can be done in GitHub too, just have to do the page regeneration as a GitHub Action or something. I imagine there are several frameworks out there to make that easy, but I wouldn't know which ones are the nicest / most convenient
|
|
|
yoochan
|
2024-05-17 06:00:45
|
We use(d) hugo at work. It requires users to be familiar with the geeky side of editing (blindly), and committing or serving locally.
|
|
2024-05-17 06:05:22
|
The editor experience was not very smooth, and a correct end-user experience required a minimal setup
|
|
|
|
veluca
|
2024-05-17 06:10:00
|
I've been using zola for a similar purpose (I'm even adding jxl support to it :P)
|
|
|
yoochan
|
2024-05-17 06:12:48
|
Emile Zola, Victor Hugo... I see a pattern here
|
|
|
spider-mario
|
2024-05-17 07:46:40
|
I use hugo for https://sami.boo/; it's all right
|
|
|
KKT
|
2024-05-17 08:33:31
|
K, we'll investigate Hugo a bit more since it came up in our independent research as well. Thanks for the input.
|
|
|
spider-mario
|
2024-05-19 07:30:46
|
https://www.tomshardware.com/software/windows/enthusiast-gets-windows-xp-running-on-an-i486
|
|
|
|
JendaLinda
|
2024-05-19 09:59:50
|
So they replaced all the opcodes unsupported by the 486 in the binaries. Interesting, I thought it would be easier to make some hybrid of Win2000 and WinXP. Win2000 can run on a 486 quite well, it's just very slow.
|
|
|
a goat
|
2024-05-21 07:45:26
|
What would be a good metric for determining how distracting the artifacts are in a transformed video compared to the source? While I am somewhat concerned about perceptible similarity to the source, I'm more concerned about separating out the most visible artifacts and measuring their ability to add motion specific noise
|
|
2024-05-21 07:50:59
|
Specifically, I'm quantizing colors and I'm trying to give a number to the degree to which samples have more obvious temporal errors like flickering or improperly moving banding
|
|
|
_wb_
|
2024-05-22 10:31:31
|
What is the most recent "important" software/platform that doesn't yet support icc v4 but only icc v2?
|
|
|
HCrikki
|
2024-05-22 11:19:13
|
since v104 firefox seems to misrepresent its icc v4 support, basing its claim on an obsolete or flawed test it never re-evaluated
|
|
2024-05-22 11:21:50
|
on windows, what's most common is that image viewers have color management disabled out of the box as a legacy decision that has not been changed since
|
|
|
username
|
2024-05-22 11:24:18
|
doesn't Firefox's v4 icc support work in most cases though? like I know there's a v4 test that Firefox fails, but I don't think I have seen any images that naturally have a v4 profile that fail
|
|
2024-05-22 11:25:46
|
also wb's question is about what supports color management with icc v2 but not icc v4
|
|
|
HCrikki
|
2024-05-22 11:26:47
|
full support is necessary, partial is pointless if not WIP
|
|
2024-05-22 11:27:32
|
images encoded with xyb using jpegli show severe color distortion on firefox. almost everything that renders using webkit or blink shows correct colors, while software based on firefox or gecko fails (including derivatives like waterfox and floorp). clearly some deficiency here in need of admitting and fixing - proper color management is necessary for HDR
|
|
2024-05-22 11:30:20
|
*Servo* is working towards implementation but still lagging btw - last checked a month ago
|
|
2024-05-22 11:37:14
|
safari for windows (nightly snapshot from may) has the full support firefox lacks, despite no longer being released there
|
|
|
Meow
|
|
HCrikki
safari for windows (nightly snapshot from may) has full support firefox lacks despite no longer released there
|
|
2024-05-22 11:40:19
|
Wasn't it discontinued long ago?
|
|
|
HCrikki
|
2024-05-22 11:42:05
|
extract 2 zip archives in the same folder and you can use it to test sites without a mac. supports jxl btw
h**ps://dev.to/dustinbrett/running-the-latest-safari-webkit-on-windows-33pb
|
|
|
lonjil
|
2024-05-22 11:57:17
|
I don't think Safari supports ICCv4 fully either
|
|
|
Nyao-chan
|
|
JendaLinda
It's fun to create image files so small so they fit inside the NTFS file entry itself.
|
|
2024-05-22 12:10:13
|
reminds me of https://github.com/nanochess/bootRogue
510 byte roguelike that fits in a boot sector
|
|
|
|
JendaLinda
|
|
Nyao-chan
reminds me of https://github.com/nanochess/bootRogue
510 byte roguelike that fits in a boot sector
|
|
2024-05-22 01:20:24
|
Creating a working game that fits in such a tiny space is a great achievement. There is also a demoscene around tiny executables.
|
|
|
jonnyawsom3
|
2024-05-22 01:27:06
|
Good ol' Kkrieger
|
|
|
w
|
2024-05-22 09:39:06
|
everyone hates iccv4 anyway
|
|
2024-05-22 09:39:21
|
it's worse than iccv2
|
|
|
_wb_
|
2024-05-23 07:53:17
|
how so? It's significantly more compact (and correct!) if you can e.g. express the sRGB transfer function as a parametric function instead of having to use a tabulated approximation like in iccv2
|
|
|
w
|
2024-05-23 08:35:39
|
argyllcms guy hates iccv4 so I also hate it
|
|
|
_wb_
|
2024-05-23 09:27:15
|
you're not confusing it with iccv5, are you?
|
|
|
spider-mario
|
2024-05-23 09:28:40
|
as in iccMAX?
|
|
2024-05-23 09:28:42
|
(https://discuss.pixls.us/t/wayland-color-management/10804/304 )
|
|
|
Quackdoc
|
|
spider-mario
(https://discuss.pixls.us/t/wayland-color-management/10804/304 )
|
|
2024-05-23 09:34:32
|
this thread... this has given me an immense headache
|
|
|
w
|
2024-05-23 09:38:16
|
https://hub.displaycal.net/forums/topic/how-to-create-icc-v-4-files/
|
|
2024-05-23 09:38:18
|
https://www.argyllcms.com/doc/iccgamutmapping.html
|
|
2024-05-23 09:42:23
|
the same Florian and Gill
|
|
2024-05-23 09:46:34
|
but it sounds like neither iccmax nor iccv4 is needed
|
|
|
Quackdoc
|
2024-05-23 09:56:50
|
I don't know the specifics of what ICCMax adds over iccv4, but I know a lot of people complain that iccv4 is wholly unsuitable for HDR content. This is a quote from Troy Sobotka, author of The Hitchhiker's Guide to Digital Colour, specifically in regards to Wayland compositors, but the point still stands
> _If_ one is serious about fixing it, the wiser approach would be to use the non-domain bound ICCMax tags.
> And in the CMS, which would ideally be written from scratch for display work, leverage the appropriate ICCMax tags.
> At least if I were tasked with designing an appropriate path, that’s where I would certainly start. curv and para tags are explicitly domain bound in ICC, and scaling or other hacks do not seem prudent.
> And it would help unfuck the glaring fuckup in V4 regarding achromatic axis.
|
|
|
w
|
2024-05-23 09:59:30
|
hdr is fine in icc v2 🤷
|
|
|
Quackdoc
|
2024-05-23 10:00:48
|
evidently not
|
|
2024-05-23 10:01:10
|
I've seen many people complain about it, and it's pretty much never used in any mastering workflow
|
|
|
yoochan
|
2024-05-23 01:30:52
|
If I wanted to output the minimal size at which I can truncate a file and still have a full 1/8th preview, would it be complex? Would it work regardless of the encoding process? The container? The use of modular? Lossless? Lossy?
|
|
|
_wb_
|
2024-05-23 03:49:13
|
Default lossless is without progressive preview. You can have lossless + progressive, but it comes at a cost in compression. For lossy, both vardct and default lossy modular are progressive. The API doesn't have a function yet to tell you what offset you'd need (maybe we should add such a function!) — you can figure it out manually via the libjxl API by doing a progressive decode and recording the bytes read at every pass, but that's far from ideal/efficient.
|
|
|
yoochan
|
|
_wb_
Default lossless is without progressive preview. You can have lossless + progressive, but it comes at a cost in compression. For lossy, both vardct and default lossy modular are progressive. The API doesn't have a function yet to tell you what offset you'd need (maybe we should add such a function!) — you can figure it out manually via the libjxl API by doing a progressive decode and recording the bytes read at every pass, but that's far from ideal/efficient.
|
|
2024-05-23 06:59:06
|
Thank you. What would happen if I progressively load a lossless file without preview? Will it fill the frame from top to bottom or will I have passes?
|
|
2024-05-23 06:59:13
|
I'll test that
|
|
|
_wb_
|
2024-05-23 07:26:42
|
You will get it one group at a time
|
|
|
Quackdoc
|
2024-05-23 07:39:02
|
an example
|
|
|
yoochan
|
2024-05-23 07:40:49
|
neat! thank you!
|
|
|
a goat
|
2024-05-24 10:11:09
|
<@794205442175402004> What would be the better format for compressing lossless (or near lossless) floating point elevation data stored inside of a large number of images: WebP or JXL? I know the standard is TIFF for this sort of thing, but I'm still curious. I know the container for JXL is pretty small, which makes me lean towards it, but people speak highly of lossless WebP as well
|
|
|
_wb_
|
2024-05-24 11:08:07
|
Lossless webp is not an option, it can only do uint8
|
|
|
a goat
|
2024-05-25 12:19:23
|
Ah gotcha
|
|
|
jjrv
|
2024-05-25 06:17:29
|
Many web mapping tools use PNG and store up to 24 bits of integer elevation in RGB channels. You can do that in webp too. I think jxl is better though. The RGB channel abuse won't compress too well.
|
|
|
yoochan
|
|
jjrv
Many web mapping tools use PNG and store up to 24 bits of integer elevation in RGB channels. You can do that in webp too. I think jxl is better though. The RGB channel abuse won't compress too well.
|
|
2024-05-25 09:24:38
|
interesting! with an additional 8-bit alpha channel you could encode 32-bit floats... directly... seems messy as hell, I have to try
|
|
|
jjrv
|
2024-05-25 09:44:10
|
I recommend JXL because it has lossy and lossless floating point, so you don't have to do a transformation first that hurts compression.
BUT if you want to really dive into this, it's possible to store altitude, temperature or other mostly smooth things as an integer modulo 256. Then when decompressing, if the (absolute) difference between neighbors is over +-128, assume it's actually less than that and the more significant bit dropped by the modulo got flipped. Then if the gradient for some reason actually is that large or more, create a second image with only such problematic parts, or drop precision and signal that by adding chroma to an otherwise grayscale image.
I got pretty deep into that stuff, you can do lossy or lossless compression of floating point data even using video codecs. But in my experience JXL finally just made that pain go away.
|
|
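A one-dimensional sketch of that modulo trick (my own minimal version; it assumes neighboring samples differ by less than half the wrap range):
```python
import numpy as np

elev = np.array([900, 940, 1010, 1100, 1180, 1230])  # meters, smooth terrain
stored = elev % 256                                   # the 8-bit image payload

recon = stored.astype(int)
recon[0] = elev[0]                  # absolute anchor kept as image metadata
for i in range(1, len(recon)):
    # choose the representative of stored[i] closest to the previous sample
    k = int(round((recon[i - 1] - stored[i]) / 256))
    recon[i] = stored[i] + 256 * k

print(recon)                        # [ 900  940 1010 1100 1180 1230]
```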
2024-05-25 09:48:31
|
If you take, let's say, LiDAR altitudes and actually use 24 bits (millimeter precision), you will have one color channel of pure white noise and it will hardly compress at all. JXL understands it's a single number with more bits and is able to do something with it, though it'd be better to round off the noise or compress lossy.
|
|
|
Jyrki Alakuijala
|
|
_wb_
Lossless webp is not an option, it can only do uint8
|
|
2024-05-25 11:04:00
|
You could store a 32 bit floating point number in argb (4x 8 bits) and store with --exact so that a=0 doesn't mess up things.
|
|
|
_wb_
|
2024-05-25 11:06:30
|
Sure but a standard webp will show something weird when displaying such an image. You can do the same with PNG.
|
|
|
|
JendaLinda
|
2024-05-25 05:12:13
|
Instead of hacking image formats, it would be better to just use pfm and compress it using any file archiver.
|
|
|
jjrv
|
2024-05-26 07:51:07
|
Using image formats as a generic archiver like that makes no sense, but using a more suitable input transformation or image format saves a lot of space. Like for elevations on this planet at city-wide scale or larger, a single 16-bit integer grayscale channel is plenty. If you include Everest and the Mariana Trench, that's still 33cm of vertical precision. If you focus on just the land or the ocean and add 8 more bits, it's difficult for those bits to be anything other than noise.
As mentioned, you can go with just 8 bits storing height modulo 256 as long as neighboring elevations differ by under 42 meters, and treat the exceptions as special cases somehow. The top-left corner's full elevation is then image metadata. Why not store deltas (gradients) instead? Because you can't compress those lossily at all, and otherwise they have all the same problems (with lossy you do need to decrease that 42 m threshold to account for compression artifacts).
|
|
2024-05-26 08:19:00
|
There's so much more potential in jxl tooling. ECMWF distributes the world's leading weather forecasts as JPEG 2000, a nightmare format that is slow to decompress. Each forecast has dozens of things like temperature, wind u/v, humidity... Each is stored as a separate single-channel image. But everything is available for over 20 pressure levels and over 100 time steps — essentially a 4D cube. If you could put either time steps or pressure levels into different channels of the same image and take advantage of their correlation, that would surely save a lot of space. And pretty much any weather forecast you see anywhere had to transfer a lot of that data as part of the pipeline of getting it from ECMWF or NOAA to your eyes.
ESA / NOAA multispectral satellite imagery can be processed into a 4D cube as well, with about 20 spectral channels (unlike the 3 of RGB) and any number of time steps.
There are specialist formats for that kind of 4D scientific data, like ZFP. But they cannot compete even with JPEG 2000. ECMWF didn't just pick that one randomly. ESA also uses it for Sentinel 2.
|
|
2024-05-26 08:21:41
|
JPEG XL would deserve to become the standard format for earth observation, climate science, etc.
It already compresses a bit smaller and way faster than JPEG 2000. It could also compress way smaller if multiple channels fit in a single file more efficiently. Way faster way smaller might be enough of a combo to overcome some inertia and get it adopted
|
|
2024-05-26 09:12:40
|
We're talking a terabyte a day of compressed data transferred and archived in operational use.
I'm sure ICEYE could produce a petabyte a day of meaningful data if it were economical to store and transfer, so as speeds and capacities increase compression will always remain relevant.
|
|
|
CrushedAsian255
|
2024-05-28 11:00:22
|
does JPEG XL support 4D image data?
|
|
|
_wb_
|
2024-05-28 11:44:49
|
Not really. It supports 2D images. If you consider multi-frame as another dimension, that gives you 3D. If you consider multi-channel as yet another dimension, you could call it 4D, but the number of channels is limited to 4K. But the main compression/decorrelation/prediction is happening for the two image dimensions, since in the end everything is stored in a planar way. There are some options to decorrelate frames (e.g. kAdd blend mode, cropped frames) and to decorrelate channels (color transforms), but the current libjxl encoder is mostly not using those except for decorrelating the first three channels (RGB).
|
|
|
jjrv
|
2024-05-28 01:14:19
|
And even in its current state, where you can take advantage of the correlation between 3 values by stashing them into RGB channels and otherwise use separate image files, it already hands-down beats everything else out there for the use cases mentioned. So, I'm extremely happy with it.
|
|
|
Demiurge
|
|
jjrv
JPEG XL would deserve to become the standard format for earth observation, climate science, etc.
It already compresses a bit smaller and way faster than JPEG 2000. It could also compress way smaller if multiple channels fit in a single file more efficiently. Way faster way smaller might be enough of a combo to overcome some inertia and get it adopted
|
|
2024-05-28 11:46:00
|
The problem is the current decoder will run out of memory and crash, and I believe it also lacks an ROI decoding API and resize-on-load
|
|
2024-05-28 11:47:33
|
Also, instead of backing out and returning an error, it just executes an illegal instruction and forces the caller to abort, so you can't recover from errors
|
|
2024-05-28 11:47:48
|
:(
|
|
|
yoochan
|
2024-05-29 05:42:15
|
the current libjxl implementation lacks features and may have bugs today. That doesn't contradict the fact that it deserves to be a standard format for science
|
|
|
jjrv
|
2024-05-29 05:40:38
|
<@794205442175402004> I tried with latest release and then compiled cjxl from your pam_extrachans branch but it keeps throwing Invalid float description. Maybe it thinks the extra channels should be floats?
|
|
2024-05-29 05:43:33
|
Tried dropping the channel count:
```
P7
WIDTH 4865
HEIGHT 4091
DEPTH 9
MAXVAL 65535
TUPLTYPE RGB
TUPLTYPE Optional
TUPLTYPE Optional
TUPLTYPE Optional
TUPLTYPE Optional
TUPLTYPE Optional
TUPLTYPE Optional
ENDHDR
```
And running:
`cjxl radiance.pam radiance.jxl -d 0 -e 9 -E 5`
Prints:
```
JPEG XL encoder v0.10.2 f7e65ca1 [AVX2,SSE4,SSE2]
Encoding [Modular, lossless, effort: 9]
./lib/jxl/encode.cc:591: Invalid float description
JxlEncoderSetExtraChannelInfo() failed.
EncodeImageJXL() failed.
```
Tested that 3 channels does work with that header format and current pipeline.
|
|
2024-05-29 05:52:56
|
Huh, switched those from Optional to Thermal (that immediately failed with 21 channels but now 12 didn't) and now it's doing something. Let's see what happens, then.
|
|
2024-05-29 05:56:20
|
Inputs were Oa01_radiance.jxl - Oa10_radiance.jxl, total 214058510 bytes. Output is:
https://reakt.io/jxl/radiance.jxl
177305253 bytes. Not bad at all.
OK, maybe when I tested with 21 channels, it was with the release and not the branch, because now that did something too:
https://reakt.io/jxl/radiance-21.jxl
312569879 bytes (down from 381512264 in 7 .jxl files so 18% less)
Excellent result, just that it took 2 minutes on 48 cores and I've got to run it on 200k images so it would be great to re-use the MA tree and use a lower effort? OTOH I guess I *can* run it like this, since it's less than a year of constant crunching, not like decades.
I guess I'd like to take a random sampling of up to 1000 images, optimize the compression on those, and then use those settings for the whole archive?
|
|
|
DZgas Ж
|
2024-05-29 05:57:49
|
bruh
|
|
2024-05-29 06:06:58
|
my small project of converting images to RGB Place (1500 images with 1% damping)
|
|
2024-05-29 06:10:43
|
count of pixels
|
|
|
monad
|
|
jjrv
Inputs were Oa01_radiance.jxl - Oa10_radiance.jxl, total 214058510 bytes. Output is:
https://reakt.io/jxl/radiance.jxl
177305253 bytes. Not bad at all.
OK, maybe when I tested with 21 channels, it was with the release and not the branch, because now that did something too:
https://reakt.io/jxl/radiance-21.jxl
312569879 bytes (down from 381512264 in 7 .jxl files so 18% less)
Excellent result, just that it took 2 minutes on 48 cores and I've got to run it on 200k images so it would be great to re-use the MA tree and use a lower effort? OTOH I guess I *can* run it like this, since it's less than a year of constant crunching, not like decades.
I guess I'd like to take a random sampling of up to 1000 images, optimize the compression on those, and then use those settings for the whole archive?
|
|
2024-05-29 11:02:00
|
Not sure how e9 could possibly be worth the compute. e7 with g3 incorporated is probably way more efficient.
|
|
|
jjrv
|
2024-05-30 07:33:16
|
Definitely don't want to use e9 but that's what wb wanted me to use first, to test compressing 21 channels.
|
|
2024-05-30 07:40:31
|
How is the quality / distance value unit and channel type related anyway? What is the meaning of a depth vs thermal vs something else channel in terms of compressing arbitrary data?
I've been compressing thermal data as RGB to get 3 time steps in a single image. `uses_original_profile` and `JXL_TRANSFER_FUNCTION_LINEAR` seem to affect what the Butteraugli distance does. I don't even care about Butteraugli in this context, would prefer some measure related to the input numbers. So if my input is in Kelvin, what kind of relative errors can I expect with different distances?
I know this format isn't meant for science, but it's an unreasonably good fit.
Latest confusion: 2 RGB images compressed lossy total about 400 kB. When compressing losslessly, there were nice further savings from combining all the frames into a single 6-channel image. Now when testing lossy, the result bloated to 2 megabytes, 5 times the expected size. Maybe these aren't the same unit anymore, when there are RGB channels and the extra channels are thermal?
```
JxlEncoderSetFrameDistance(frame, 0.05)
JxlEncoderSetExtraChannelDistance(frame, num, 0.05)
```
Or maybe I screwed up something else 😬
|
|
2024-05-30 07:44:51
|
For reference, I'm working on this:
https://reakt.io/temp/
The globe spins. For me JPEG XL is the only practical way to get that thing working at all. I've been mucking around with ffmpeg and video codecs, trying to get temperatures from -80 to 60 Celsius encoded at 0.1 Celsius resolution, then decoded in a web app. It has been a nightmare, and now, 3 days after picking up JPEG XL, suddenly it works.
|
|
2024-05-30 07:53:44
|
I want it to also show optical imagery, 300m resolution globally and select areas (ones in the news lately) at 10m resolution, day per frame over the past several years. If I win the lottery, then 10m globally. So that's a whole different dataset to also compress, the 21 channel case is related to that side, because the optical imagery is way more than just RGB. I'll use JPEG XL for lossless archival and probably have to re-visit the latest video codecs every couple years.
|
|
|
_wb_
|
2024-05-30 08:35:24
|
When doing lossy, the RGB channels are treated differently: by default they're converted to XYB and encoded with VarDCT — which is good if it's visual data intended for human viewing. VarDCT is hardcoded for 3 channels; any additional channels always use Modular.
|
|
2024-05-30 08:37:23
|
When doing modular lossy, very different coding tools are used than when doing the usual vardct lossy, so results in both quality and size can be quite different.
|
|
2024-05-30 08:41:57
|
For lossy, I expect no real benefit from putting all 21 channels together, since they will be compressed more or less independently (prevchannel context does not work well in combination with squeeze)
|
|
|
yoochan
|
2024-05-30 08:47:00
|
what is the reason behind the choice to forbid varDCT for all other channels ?
|
|
|
_wb_
|
2024-05-30 09:31:31
|
Only having to deal with the case of 3 channels makes it easier to have an efficient implementation, and we assumed that DCT-based lossy compression mostly makes sense for perceptual image data, not so much for other channels like Alpha or the K of CMYK. It was a design trade-off between keeping things simple conceptually and in terms of implementation (generalizing varDCT to an arbitrary number of channels would complicate quite a few things, like chroma from luma etc) on the one hand, and having an expressive / feature-rich image codec on the other hand (having a varDCT option would give an encoder more possibilities for extra channel encoding).
|
|
|
Oleksii Matiash
|
2024-05-30 09:42:36
|
Well, this is the only thing in jxl that I can't really agree with. For me it would be much more logical to allow any channel beyond the first 3 to be encoded either with modular or vardct, just without the special tricks used for the first 3 channels in vardct mode. Probably I'm missing something critical that led to this decision
|
|
|
_wb_
|
2024-05-30 09:57:09
|
It wouldn't be impossible to define it like that, but it would complicate quite a few things.
|
|
2024-05-30 10:00:51
|
One other way to represent 21 channels in a single image is to do it as an RGB image with 7 layers/frames. Frames are encoded mostly independently, but you can use the kAdd blend mode and subtract one frame from another frame, which can be beneficial, especially for lossy compression, if the data is similar.
|
|
|
Oleksii Matiash
|
2024-05-30 10:13:25
|
Yes, I know that it is possible to work around this limitation, I just believe the limitation is not very clever. For 99.99% of cases, yes, 3 channels is enough, but disallowing it entirely... I'd rather split it into 'levels': "first 3 channels vardct only" for base-level encoders/decoders, which would be enough for almost everything, but if you need to encode/decode something more complex, you'd be allowed to do it
|
|
2024-05-30 10:13:46
|
But it's just thoughts
|
|
|
Tirr
|
2024-05-30 10:20:16
|
I guess it might cause decoder fragmentation. As the author of jxl-oxide I can see such features would complicate the decoder in various ways, and if it were an optional feature I'd rather defer implementing it, if not skip it altogether
|
|
|
Oleksii Matiash
|
2024-05-30 10:51:00
|
Yes, I agree, sure. And I'd personally never use more than 3 channels, it's just thoughts about ideal world 🙂
|
|
|
_wb_
|
2024-05-30 12:59:10
|
In the old JPEG, you can have between 1 and 4 channels. In the JXL design, we could have done something similar (but with a higher maximum number of channels), in fact that's what I did in FUIF (which evolved into JXL's Modular mode): everything is defined for an arbitrary number of channels, and everything is encoded channel per channel anyway so it does not complicate implementations.
In PIK (which evolved into JXL's VarDCT mode) a different choice was made: they had more of a focus on enc/dec speed, and everything is hardcoded for exactly 3 channels (even grayscale is done as 3 channels where the two chroma channels just happen to be all-zeroes), so everything can be specialized for the 3-channel case, e.g. making it possible to do everything from inverse DCT to XYB-to-RGB in one go. Also all signaling of things like quant tables, filter weights, etc is made simpler and can have nice default values by assuming 3 channels.
|
|
|
yoochan
|
2024-05-30 01:10:07
|
interesting 👍
|
|
2024-05-30 01:11:34
|
it sounds like the varDCT decode function is not even implemented once in a single layer fashion
|
|
|
jjrv
|
2024-05-30 05:17:37
|
Very good to know! Then I'll plan to always use 3 channels when compressing lossy, and more only when lossless. It sounds like being able to re-use an MA tree is the main potential new thing on the roadmap relevant to this area.
|
|
|
_wb_
|
|
jjrv
Inputs were Oa01_radiance.jxl - Oa10_radiance.jxl, total 214058510 bytes. Output is:
https://reakt.io/jxl/radiance.jxl
177305253 bytes. Not bad at all.
OK, maybe when I tested with 21 channels, it was with the release and not the branch, because now that did something too:
https://reakt.io/jxl/radiance-21.jxl
312569879 bytes (down from 381512264 in 7 .jxl files so 18% less)
Excellent result, just that it took 2 minutes on 48 cores and I've got to run it on 200k images so it would be great to re-use the MA tree and use a lower effort? OTOH I guess I *can* run it like this, since it's less than a year of constant crunching, not like decades.
I guess I'd like to take a random sampling of up to 1000 images, optimize the compression on those, and then use those settings for the whole archive?
|
|
2024-06-02 12:34:22
|
I made some small tweaks to e2 that are useful in general for 16-bit images: https://github.com/libjxl/libjxl/pull/3622
|
|
2024-06-02 12:35:21
|
Also added something to make `cjxl -e 2 -E 1` actually do something different than default `-e 2`, and it seems to work quite well on this 21-channel test image.
|
|
2024-06-02 12:37:57
|
Not quite as small as the e9 E5 image of course, but should be quite a bit faster to encode and decode.
|
|
2024-06-02 12:40:21
|
Before:
e2: 396.7 MB
e2E1: same as just e2
e9E5: 312.6 MB
After:
e2: 387.1 MB
e2E1: 343.8 MB
|
|
2024-06-02 12:47:53
|
At some point we should add some functionality to encode with a manually specified tree (in `jxl_from_tree` syntax, for example), and to extract trees from a jxl bitstream. That should open up some possibilities for making your own custom fast encoder that is tuned for the specific kind of image content you have. No idea how to add such a thing in a nice way to the libjxl API though...
|
|
2024-06-03 08:42:36
|
Made some tweaks, it's now 341.0 MB at e2E1 🙂
|
|
|
CrushedAsian255
|
|
_wb_
Made some tweaks, it's now 341.0 MB at e2E1 🙂
|
|
2024-06-03 08:45:08
|
maybe `cjxl` should give information about the size / performance tradeoffs of the options
|
|
|
_wb_
|
2024-06-03 08:45:59
|
it now says this:
```
-e EFFORT, --effort=EFFORT
Encoder effort setting. Range: 1 .. 10.
Default: 7. Higher numbers allow more computation at the expense of time.
For lossless, generally it will produce smaller files.
For lossy, higher effort should more accurately reach the target quality.
```
|
|
|
CrushedAsian255
|
|
_wb_
it now says this:
```
-e EFFORT, --effort=EFFORT
Encoder effort setting. Range: 1 .. 10.
Default: 7. Higher numbers allow more computation at the expense of time.
For lossless, generally it will produce smaller files.
For lossy, higher effort should more accurately reach the target quality.
```
|
|
2024-06-03 08:47:02
|
i was thinking more like for the more advanced options, like it's not directly obvious what changing the group size does for performance
|
|
|
_wb_
|
2024-06-03 08:48:55
|
Hard to say what it actually does for performance. Changing the group size to something different may improve or worsen the compression density, and may be good or bad for speed depending on the image content and the number of threads.
|
|
|
CrushedAsian255
|
|
_wb_
Hard to say what it actually does for performance. Changing the group size to something different may improve or worsen the compression density, and may be good or bad for speed depending on the image content and the number of threads.
|
|
2024-06-03 08:49:48
|
so it's more complex than just "smaller group = faster but worse quality" or something, and you kinda just need advanced understanding of the format to get an intuition of the performance/speed tradeoff
|
|
|
_wb_
|
2024-06-03 08:53:12
|
for group size it's tricky — larger groups can be good for lz77 and to have fewer poorly-predicted group edges, smaller groups can be good for finer granularity in local RCTs and local palettes; when the number of available threads is large and the image is not so large, smaller groups allow more parallelization so it's faster, but if the number of threads is smaller than the number of megapixels, even the largest group size will allow using all threads so it won't make much of a speed difference.
|
|
|
|
salrit
|
2024-06-03 09:07:54
|
Hey, I was curious how the predictor for lossless works here. Like it's said in jxl's git, "An adaptive predictor computes 4 from the NW, N, NE and W pixels and combines them with weights based on previous errors", and on Krita's website, "Fraction of pixels used for the Meta-Adaptive Context tree. The MA tree is a way of analyzing the pixels surrounding the current pixel, and depending on the context choose a given predictor for this pixel". What is the MA tree used for here? How is 'context modelling' used here?
And the other thing: what exactly in WebP lossless (in terms of compression ratio, not rendering speed etc.) does JPEG XL's lossless try to fix? What were the issues with WebP that JPEG XL overcomes?
|
|
|
monad
|
|
CrushedAsian255
so it's more complex than just "smaller group = faster but worse quality" or something, and you kinda just need advanced understanding of the format to get an intuition of the performance/speed tradeoff
|
|
2024-06-03 09:20:09
|
generally: larger is denser (correlated with increasing image size), smaller is slower single-threaded but can be faster multi-threaded
|
|
2024-06-03 09:22:07
|
so doing things single-threaded, it's safe to just use the largest group size
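For reference, the knob being discussed is cjxl's modular group size flag (name and value mapping as I recall them from a recent cjxl --help; verify on your build):
```
# -g / --modular_group_size: 0..3 select 128/256/512/1024-pixel groups;
# -1 (the default) lets the encoder choose. Largest groups, lossless:
cjxl input.png output.jxl -d 0 -e 7 -g 3
```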
|
|
|
_wb_
|
|
salrit
Hey, I was curious how the predictor for lossless works here. Like it's said in jxl's git, "An adaptive predictor computes 4 from the NW, N, NE and W pixels and combines them with weights based on previous errors", and on Krita's website, "Fraction of pixels used for the Meta-Adaptive Context tree. The MA tree is a way of analyzing the pixels surrounding the current pixel, and depending on the context choose a given predictor for this pixel". What is the MA tree used for here? How is 'context modelling' used here?
And the other thing: what exactly in WebP lossless (in terms of compression ratio, not rendering speed etc.) does JPEG XL's lossless try to fix? What were the issues with WebP that JPEG XL overcomes?
|
|
2024-06-03 09:22:44
|
jxl has various predictors, the self-correcting one (aka Weighted predictor) is only one of them.
The MA tree determines what predictor to use for what pixels, and it also determines the entropy coding context (i.e. the distribution of symbols that will be used in the ANS or Huffman coding). See also: https://qon.github.io/jxl-art/wtf.html and https://qon.github.io/jxl-art/
|
|
2024-06-03 09:26:12
|
The main difference between lossless webp and jxl is that lossless webp is hardcoded for 8-bit RGBA, while lossless jxl can handle arbitrary bit depths and number of channels. Hardcoding for 8-bit RGBA does allow webp to be fast, but it also seriously limits its applicability since in many use cases (especially HDR or medical/scientific) you need more precision and/or channels.
|
|
2024-06-03 09:29:29
|
Another difference is that lossless webp is encoding full rows of pixels (like PNG), while jxl is tiled (by default it does groups of 256x256 pixels at a time). Doing full rows has some advantages for compression (e.g. you can have longer runs in RLE, there are fewer border pixels with poor prediction), but it also means it cannot be parallelized, while in jxl both encoding and decoding can use multiple threads effectively, making it more suitable for larger images and current and future processors (which tend to become not really faster but mostly just have more cores).
|
|
2024-06-03 09:40:26
|
(webp cannot handle very large images anyway since dimensions are limited to 16k by 16k in the header syntax, but even if larger dimensions would be allowed, it would be limited by its non-tiled bitstream structure; e.g. jxl can do chunked encode/decode which is essential for handling huge images; webp and png can do a line-by-line decode but that's not as good — e.g. if you have a 100k by 10k image and you want to show a 2k by 1k viewport, with just line-by-line decode you'll have to decode basically everything if the viewport is close to the bottom, and even if it's close to the top you'll need to decode 100k columns while you need only 2k columns, while in jxl it can be done with way less overhead wherever the viewport is)
|
|
|
|
salrit
|
2024-06-03 09:57:18
|
<@794205442175402004> thanks for the detailed answer. BTW, is the rolling-window scheme for LZ77 implemented here too? And yeah, is the compression-ratio advantage of JXL lossless over WebP lossless down to:
1.) the adaptive predictor in JXL lossless, whereas WebP, although it has 14 choices, chooses block-wise and fixes one predictor for all the pixels in the block?
2.) the entropy coding here using (context modelling and ANS) + LZ77, while WebP uses Huffman + LZ77?
|
|
|
_wb_
|
2024-06-03 10:16:23
|
Oops I see you also emailed me about this, I missed that email but I guess I can answer here now 🙂
|
|
2024-06-03 10:21:43
|
Yes, jxl also supports lz77, though libjxl doesn't make as much use of it as libwebp does — webp is closer to PNG in that respect. Part of the reason is another difference between jxl and png/webp: jxl encodes samples in a planar way (RRRR.... GGGG.... BBBB....) while png/webp encode in an interleaved way (RGBRGBRGBRGB....). For lz77, interleaved is better since you can match entire pixel sequences while in planar you can only match sample sequences of one channel. But in general, planar is often better for compression, and it is also easier to generalize to handle an arbitrary number of channels.
|
|
2024-06-03 10:26:30
|
1) Yes, webp chooses a predictor per block while in jxl the choice depends on the MA tree, which is more expressive (you could make an MA tree that mimicks what WebP or PNG does, but you can also do much more)
2) I think WebP also does some kind of context modeling but it's certainly more limited and less expressive. JXL allows both ANS and Huffman as the 'back-end' for the entropy coding; usually ANS is better but for fast encoding Huffman is better.
|
|
|
monad
|
2024-06-03 10:28:14
|
more thoroughly
|
|
|
yoochan
|
2024-06-03 10:29:21
|
did jyrki participate a bit in the design of the lossless encoder? or was it only your ideas _wb_? were some webp innovations incorporated in jxl?
|
|
|
|
salrit
|
|
_wb_
1) Yes, webp chooses a predictor per block while in jxl the choice depends on the MA tree, which is more expressive (you could make an MA tree that mimicks what WebP or PNG does, but you can also do much more)
2) I think WebP also does some kind of context modeling but it's certainly more limited and less expressive. JXL allows both ANS and Huffman as the 'back-end' for the entropy coding; usually ANS is better but for fast encoding Huffman is better.
|
|
2024-06-03 10:32:02
|
Thanks, and yeah, WebP uses dedicated entropy codes for regions sharing similar statistical properties — kind of context modelling — although, as you said, the one in JXL is stronger.
And there are a few more things...:
1.) I guess the color transform (the XYB one) doesn't take place in JXL in the lossless case; pixels are read in ARGB format?
2.) There was something mentioned about image feature extraction; is it involved in lossless too?
|
|
|
_wb_
|
2024-06-03 11:01:16
|
1) XYB is only used for lossy since it is not reversible in integer arithmetic; RCTs like YCoCg are used (can be different RCT per group)
2) Yes, there is a coding tool called Patches that can be used to draw a rectangle taken from a previous frame; this is effectively a form of 2D lz77 that can be used e.g. to encode repetitive elements like letters of text only once (in a kind of sprite sheet hidden frame) and reuse them multiple times with cheap signaling.
|
|
2024-06-03 11:02:30
|
There's also a coding tool called Splines that can be used in principle in both lossy and lossless but we don't currently have an encoder that can use this tool effectively, so for now it is only something to play with in <#824000991891554375> 🙂
|
|
|
|
salrit
|
|
_wb_
1) XYB is only used for lossy since it is not reversible in integer arithmetic; RCTs like YCoCg are used (can be different RCT per group)
2) Yes, there is a coding tool called Patches that can be used to draw a rectangle taken from a previous frame; this is effectively a form of 2D lz77 that can be used e.g. to encode repetitive elements like letters of text only once (in a kind of sprite sheet hidden frame) and reuse them multiple times with cheap signaling.
|
|
2024-06-03 11:11:21
|
Thanks 🙌
|
|
|
_wb_
Yes, jxl also supports lz77, though libjxl doesn't make as much use of it as libwebp does — webp is closer to PNG in that respect. Part of the reason is another difference between jxl and png/webp: jxl encodes samples in a planar way (RRRR.... GGGG.... BBBB....) while png/webp encode in an interleaved way (RGBRGBRGBRGB....). For lz77, interleaved is better since you can match entire pixel sequences while in planar you can only match sample sequences of one channel. But in general, planar is often better for compression, and it is also easier to generalize to handle an arbitrary number of channels.
|
|
2024-06-03 02:58:26
|
<@794205442175402004> Couldn't matching sample sequences in a planar way be more beneficial than the interleaved way? Like, within a plane, pixels might be correlated more 'area-wise' than across the three planes combined? Could that be a basis for generalizing that planar encoding > interleaved encoding?
|
|
|
_wb_
|
2024-06-03 05:07:38
|
Planar is better for decorrelation. Extreme example: imagine an 8-bit grayscale image encoded as 8-bit RGBA: after color transforms (YCoCg or SubtractGreen or whatever), it is basically one byte of image data followed by three bytes that are always the same (0 for the two chroma channels, 255 for the alpha channel). Doing that interleaved is not so great, doing it planar means you can encode the main data and then just a big bunch of repeating values that will compress down to nothing (either by doing RLE or by using entropy coding with a singleton distribution which assigns a 0 bit code to represent a symbol if there is only one symbol).
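A toy layout making that concrete — four grayscale pixels (sample values arbitrary) as 8-bit RGBA after the color transform:
```
#include <stdint.h>

// Each pixel: one real sample, two zero chroma residuals, constant alpha.
static const uint8_t interleaved[16] = {10, 0, 0, 255,  20, 0, 0, 255,
                                        30, 0, 0, 255,  40, 0, 0, 255};
static const uint8_t planar[16] = {10,  20,  30,  40,    // image data
                                    0,   0,   0,   0,    // chroma: all zero
                                    0,   0,   0,   0,    // chroma: all zero
                                   255, 255, 255, 255};  // alpha: constant
// Planar leaves three constant planes that RLE or a singleton entropy code
// collapses to nearly nothing; interleaving weaves those constants between
// the image samples and breaks up the runs.
```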
|
|
2024-06-03 05:12:25
|
But interleaved is somewhat better if you have something that works well with lz77, say an image that has the exact same pixel values repeating, like a screenshot of this discord chat where avatars, emojis, letters of text etc are causing repetitive data that can be matched with lz77. If it's interleaved, you only need to encode one (distance, length) pair per horizontal run of repeated pixels, while if it's planar, you need to encode 3 such pairs (you'll need to repeat it in every channel).
|
|
2024-06-03 05:14:46
|
(Patches is even better than lz77 on an interleaved image, but we don't have a very exhaustive Patches detector yet; it's a harder thing to write a good encoder for than for finding lz77 matches, which is also not trivial but it has been around for a long time now)
|
|
|
jonnyawsom3
|
2024-06-03 05:19:58
|
I recall I think Veluca(?) had the idea of using lz77 to find matches to then use patches instead, or something along those lines
|
|
|
|
salrit
|
|
_wb_
(Patches is even better than lz77 on an interleaved image, but we don't have a very exhaustive Patches detector yet; it's a harder thing to write a good encoder for than for finding lz77 matches, which is also not trivial but it has been around for a long time now)
|
|
2024-06-03 05:22:12
|
So something like patch substitution on an interleaved image is the best option? But the best again depends: if it's a synthetic image with repetitive values, then great, but for a natural image maybe the planar one suits better... just trying to reason about why planar was chosen in JXL...
|
|
|
_wb_
|
2024-06-03 05:55:10
|
Yes, for natural images planar is better, and for many synthetic images too. In general it is part of why jxl is compressing better. But there are some cases where interleaved is better, especially when combined with full row encoding, like if you put two copies of a photo side by side horizontally.
|
|
|
jjrv
|
2024-06-03 06:06:03
|
Wonder what would be the best way to encode 32-bit ints? I guess the format supports them as such, but libjxl doesn't yet? Not asking for it to be fixed, just wondering what the best workaround is. 16-bit grayscale for the 16 least significant bits and an extra channel for the 16 most significant bits? It sounds questionable to use two color channels and transform to some other color space? Guess I'll do some benchmarking.
|
|
2024-06-03 06:08:22
|
These are longitude-latitude pairs in the range -180000000 to 180000000. Just planning to use the same codec for these as for the corresponding color data in a separate file. Actually there's 28 bits of latitude, 29 bits of longitude and 16 bits of altitude. I'll cram them into a single file somehow, just have to test different channel ordering strategies. These shouldn't really be correlated at all, I think.
|
|
|
jonnyawsom3
|
2024-06-03 06:28:03
|
I assume it's more multispectral color data? Otherwise you could store it in RGB and then use entirely extra channels for the location data to still get a useful preview image
|
|
2024-06-03 06:28:55
|
I could also mention the napkin maths we did a long time ago: the entire Earth at 1m resolution could fit in under 2 JXL files
|
|
|
jjrv
|
2024-06-03 06:48:24
|
Ooh, putting the coordinates in the same file makes sense, somehow didn't think of that. It's entirely uncorrelated with color data, but who cares. Could just use 5 extra channels. Altitude might even be correlated somehow.
|
|
|
jonnyawsom3
|
2024-06-03 06:51:35
|
Presumably, using `-E 3` would mean the 3 colour channels would be compressed together, then the 3 location channels, followed by whatever else. Although I forget the ordering of the Extra MA learning so that might be wrong
|
|
|
jjrv
|
2024-06-03 06:52:56
|
Going to be 21 related color channels and 5 channels of metadata garbage basically, at least the 32-bit lon/lat make no sense as image but are needed to reproject it. Just wanting to handle decoding it all in a single pipeline. All of them as extra channels makes the most sense, then.
|
|
|
jonnyawsom3
|
2024-06-03 07:03:53
|
Actually... I wonder if they could just be stored in a brotli metadata box
Edit: Nevermind, since you want coordinates per pixel anyway it's probably best to use extra channels
|
|
|
_wb_
|
2024-06-03 07:04:34
|
jxl does not allow 32-bit uint in the current spec, only up to 24-bit uint. It does support 32-bit float though (also losslessly). The idea is that an implementation can use float32 for everything, and float32 has 1 sign bit, 8 exponent bits, and 23 mantissa bits (or 24 if you count the implicit one).
|
|
|
jjrv
|
2024-06-03 07:16:14
|
Makes sense. These lon/lat coordinates need the full 29 bits because they're coordinate inputs for a vertex shader, all others are color data for the fragment shader. Anyway I'll hack it somehow. Very unusual use case.
|
|
|
_wb_
|
2024-06-03 07:52:52
|
You can have 29 bits if you stuff them in a float somehow 🙂 — e.g. make the most significant bits 001 and then just put your uint29 in the 29 least significant bits
|
|
2024-06-03 07:53:46
|
that maps the number to something between 0.f and 2.f, though most numbers will be very small and close to zero 🙂
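A minimal sketch of that bit trick, assuming IEEE-754 float32 and that lossless float coding round-trips the bits exactly (function names are mine):
```
#include <stdint.h>
#include <string.h>

// Top three bits 001 => sign 0, biased exponent in [64,127], so every
// value is a normal, finite float in (0, 2); v = 0 maps to 2^-63, not 0.
static float pack_u29(uint32_t v) {
  uint32_t bits = 0x20000000u | (v & 0x1FFFFFFFu);
  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}

static uint32_t unpack_u29(float f) {
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);
  return bits & 0x1FFFFFFFu;
}
```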
|
|
|
|
salrit
|
|
_wb_
Planar is better for decorrelation. Extreme example: imagine an 8-bit grayscale image encoded as 8-bit RGBA: after color transforms (YCoCg or SubtractGreen or whatever), it is basically one byte of image data followed by three bytes that are always the same (0 for the two chroma channels, 255 for the alpha channel). Doing that interleaved is not so great, doing it planar means you can encode the main data and then just a big bunch of repeating values that will compress down to nothing (either by doing RLE or by using entropy coding with a singleton distribution which assigns a 0 bit code to represent a symbol if there is only one symbol).
|
|
2024-06-03 08:04:52
|
I think the case of 8-bit image data using the interleaved encoding in WebP lossless is taken care of: for grey images, after applying the RCT (subtract green), [A,R,G,B] -> [255,0,G,0]; then, after the predictor transform is applied, each of the color planes and the alpha plane is Huffman + LZ coded individually, and for bi-level images it's just substitution of color indices from a palette.
|
|
|
CrushedAsian255
|
|
_wb_
1) Yes, webp chooses a predictor per block while in jxl the choice depends on the MA tree, which is more expressive (you could make an MA tree that mimicks what WebP or PNG does, but you can also do much more)
2) I think WebP also does some kind of context modeling but it's certainly more limited and less expressive. JXL allows both ANS and Huffman as the 'back-end' for the entropy coding; usually ANS is better but for fast encoding Huffman is better.
|
|
2024-06-04 01:13:51
|
is there a way to do some kind of tree-from-jxl or something? to see the MA tree generated by an encoder?
|
|
|
monad
|
|
CrushedAsian255
is there a way to do some kind of tree-from-jxl or something? to see the MA tree generated by an encoder?
|
|
2024-06-04 01:57:55
|
go back to libjxl 0.7, call benchmark_xl with --debug_image_dir
|
|
|
CrushedAsian255
|
|
monad
go back to libjxl 0.7, call benchmark_xl with --debug_image_dir
|
|
2024-06-04 01:58:37
|
is there a way to have both HEAD and 0.7 installed at the same time?
|
|
|
monad
|
2024-06-04 01:59:18
|
just install 0.7 to whatever directory you want
|
|
|
|
salrit
|
|
_wb_
Planar is better for decorrelation. Extreme example: imagine an 8-bit grayscale image encoded as 8-bit RGBA: after color transforms (YCoCg or SubtractGreen or whatever), it is basically one byte of image data followed by three bytes that are always the same (0 for the two chroma channels, 255 for the alpha channel). Doing that interleaved is not so great, doing it planar means you can encode the main data and then just a big bunch of repeating values that will compress down to nothing (either by doing RLE or by using entropy coding with a singleton distribution which assigns a 0 bit code to represent a symbol if there is only one symbol).
|
|
2024-06-04 07:00:55
|
For single-intensity 8-bit images....
In WebP lossless: an RCT is applied (subtract green), resulting in ARGB -> A0G0. Then the predictor transform is applied, with A0G0A0G0... taken into consideration for the predictor. Then the residue A0G0A0G0... is LZ coded, and each of the channels is Huffman coded individually.
In JXL lossless: an RCT is chosen per group. Here the individual channels are taken for applying the predictor transform. Each channel is LZ coded and then entropy coded individually.
Will it make a difference, or did I get the procedure wrong 😅? I tested a few 512*512 natural grayscale images and found JXL's compression ratios >> WebP's. Is the reason here just the more flexible predictors and context-based entropy coding?
|
|
|
_wb_
|
2024-06-04 07:10:13
|
At least for JXL you got it right, for WebP I think you're right too but I don't know it that well (others here like <@532010383041363969> <@768090355546587137> <@987371399867945100> were involved in designing lossless webp so they will know much better).
In JXL, LZ coding is barely used (at default effort it is not used at all, iirc), while in WebP/PNG it is a major coding tool. For photographic images, LZ does not help much since there will be few good (long) matches — that's why WebP and PNG perform well mostly for non-photographic images.
For photo, what makes jxl better than png/webp are the better predictors (in particular the self-correcting predictor), slightly better RCTs, better entropy coding (ANS instead of Huffman), but most of all, way better context modeling.
|
|
|
|
salrit
|
2024-06-04 07:23:22
|
Thanks! I have been in touch with <@532010383041363969> (over email) and he has helped a lot in understanding the lossless algorithm (the empirical designs etc.) 🙌 ... although much is left to understand..
Yeah, a few things more: is there just a single kind of scan-line ordering in JXL (talking in terms of the predictor transform here), or is something like Adam-infinity (as in FLIF) implemented here too?
You said something about 'Patches', the 2D-like LZ coding — is it only between frames of animations, or can rectangles from adjacent blocks within a single frame be compared and used?
Apart from subtract green and YCoCg, what other RCTs are used? (I should probably look this up myself, sorry for being lazy here.. 😅)
|
|
|
_wb_
|
2024-06-04 07:46:35
|
Patches are also useful for single frame. You cannot reference the current frame, but you can insert an invisible auxiliary frame that contains a sprite sheet of patches, and then use it to remove repetitive elements from the actual frame. The current libjxl encoder does this in both lossy and lossless mode, it has some heuristics to detect things like letters or icons on a solid background and will use Patches to encode them only once.
|
|
|
|
salrit
|
|
_wb_
Patches are also useful for single frame. You cannot reference the current frame, but you can insert an invisible auxiliary frame that contains a sprite sheet of patches, and then use it to remove repetitive elements from the actual frame. The current libjxl encoder does this in both lossy and lossless mode, it has some heuristics to detect things like letters or icons on a solid background and will use Patches to encode them only once.
|
|
2024-06-04 07:47:27
|
I see, kind of like the Soft Pattern Matching in JBIG2..
|
|
|
_wb_
|
2024-06-04 07:49:26
|
Scan-line ordering: it's only raster order (within the group), no more Adam-inf scanline order since that is quite bad for memory locality (so bad for decode speed). But there is the Squeeze transform that can be used to achieve a similar but better progressive decoding, where the low-res previews are based on averaging sample values rather than just nearest-neighbor sampling.
|
|
|
lonjil
|
2024-06-04 07:51:10
|
example of patches. I took a screenshot of this chat, then encoded with VarDCT with patches enabled, then decoded the 1/8th size preview. However, since patches are stored *before*, they are also included even when stopping before the full resolution data, and so we can see their impact.
|
|
|
_wb_
|
|
salrit
I see, kind of like the Soft Pattern Matching in JBIG2..
|
|
2024-06-04 07:51:24
|
Yes, exactly like that, except of course it works for color images, not just 1-bit black&white. And also a major difference is that residuals are still encoded: patches just get subtracted from the main image but do not replace it, so if a lossy patch does not match exactly, there is still a chance to correct it.
|
|
|
lonjil
example of patches. I took a screenshot of this chat, then encoded with VarDCT with patches enabled, then decoded the 1/8th size preview. However, since patches are stored *before*, they are also included even when stopping before the full resolution data, and so we can see their impact.
|
|
2024-06-04 07:53:18
|
as you can see, the heuristic does manage to find a lot of letters and encodes them via patches (which also means they get modular encoded instead of vardct, which is a good idea anyway for text, though not so much for my avatar which also became a patch). It does not find everything though, there is still room for improvement in those heuristics...
|
|
2024-06-04 07:58:40
|
RCTs: they always take 3 subsequent channels as input, permute them in any of the 6 ways, and then either do YCoCg or any combination of these primitives:
- subtract first channel from third, e.g. RGB -> RG(B-R)
- subtract first channel from second, e.g. RGB -> R(G-R)B
- subtract avg of first and third from second, e.g. RGB -> R(G - (R+B)/2)B
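A toy per-pixel version of those primitives with their inverses (names are mine, not libjxl's; using `>>` as floor division assumes an arithmetic right shift):
```
#include <stdint.h>

// RGB -> RG(B-R)
static void fwd_sub13(int32_t px[3]) { px[2] -= px[0]; }
static void inv_sub13(int32_t px[3]) { px[2] += px[0]; }
// RGB -> R(G-R)B
static void fwd_sub12(int32_t px[3]) { px[1] -= px[0]; }
static void inv_sub12(int32_t px[3]) { px[1] += px[0]; }
// RGB -> R(G - (R+B)/2)B — reversible because R and B are untouched
static void fwd_avg13(int32_t px[3]) { px[1] -= (px[0] + px[2]) >> 1; }
static void inv_avg13(int32_t px[3]) { px[1] += (px[0] + px[2]) >> 1; }
```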
|
|
2024-06-04 07:59:14
|
RCTs can be stacked, multiple RCTs can be done one after the other
|
|
2024-06-04 08:00:36
|
So in principle you can do something like RGBA -> RABG -> ARGB -> AYCoCg via two permute-only RCTs and one YCoCg
|
|
2024-06-04 08:01:29
|
Channel order does not matter that much since planar encoding is used anyway, but it can make a difference when using PrevChannel properties in the MA context tree
|
|
2024-06-04 08:03:29
|
The current libjxl encoder is only trying RCTs on the first three channels though, we haven't explored images with more channels very much yet. For something like those 21-channel multispectral images, the modular design of the transforms in modular mode might be quite useful though.
|
|
|
|
salrit
|
2024-06-04 08:05:29
|
🙌 thanks <@167023260574154752> and <@794205442175402004> ..
|
|
|
Demiurge
|
2024-06-04 09:14:23
|
Is there a difference between XYB and XYZ? And is there an important practical advantage over LAB?
|
|
|
_wb_
|
2024-06-04 10:05:47
|
<@446428281630097408> after that pull request lands, for your 21-channel test image I got good results when doing this:
```
cjxl radiance-21.jxl radiance-21.jxl.jxl -d 0 -e 2 -E 1 -C 0
JPEG XL encoder v0.10.2 67cb0e82 [NEON]
lib/extras/dec/apng.cc:795: JXL_FAILURE: PNG signature mismatch
Encoding [Modular, lossless, effort: 2]
Compressed to 336647.9 kB (135.317 bpp).
4865 x 4091, 9.625 MP/s [9.62, 9.62], , 1 reps, 12 threads.
```
That's a setting with a reasonable enc/dec speed (unlike e9 E5 which compresses better but is very slow to encode and decode). The `-C 0` is to disable YCoCg, which is not very useful here, especially in combination with using PrevChannel context where it becomes counterproductive. You get a 336 MB file while the uncompressed data (i.e. the PAM file) is 835 MB. On my laptop, it takes under 8 seconds to do the encode, under 3 seconds to do the decode. For comparison, running default gzip on the PAM file produces a 560 MB file in a bit over 14 seconds.
|
|
|
a goat
|
2024-06-04 12:32:53
|
<@794205442175402004> Is there any particular reason why the Burrows-Wheeler Transform isn't used in image compression much?
|
|
|
CrushedAsian255
|
|
a goat
<@794205442175402004> Is there any particular reason why the Burrows-Wheeler Transform isn't used in image compression much?
|
|
2024-06-04 12:52:31
|
I’m just guessing, but probably for similar reasons to why raw RLE isn't great for images (excluding flat logos): lots of colours that are similar aren't the same, so any kind of RLE (even with BWT) is not going to work amazingly
|
|
|
_wb_
|
2024-06-04 12:58:08
|
It's also not clear to me how you can combine it with prediction. BWT is a 1D thing, if you would e.g. just apply it to all rows then you end up garbling up the image and while there will be more runs in every row, any predictor that uses rows above will be messed up...
|
|
|
|
veluca
|
2024-06-04 01:01:57
|
I mean, why not BWT after prediction?
|
|
2024-06-04 01:02:23
|
I'm not convinced it would be better than plain old lz77, but maybe
|
|
|
_wb_
|
2024-06-04 01:15:14
|
yes, you could do first prediction and then BWT, but what about context?
|
|
|
|
veluca
|
2024-06-04 01:18:25
|
*shrug*
|
|
2024-06-04 01:18:43
|
some version of jxl-lossless encoded separate streams for each context
|
|
2024-06-04 01:18:53
|
if you did that, it would work to BWT each stream
|
|
|
_wb_
|
2024-06-04 01:40:00
|
that would work. Probably not so good for enc/dec speed/memory, but could be good for compression
|
|
2024-06-04 01:42:06
|
could also try BWT on AC coeffs (say, encoding one coefficient position at a time)
|
|
|
|
veluca
|
|
_wb_
that would work. Probably not so good for enc/dec speed/memory, but could be good for compression
|
|
2024-06-04 01:46:15
|
you're BWTing already anyway, that's not going to be fast 😛
|
|
|
jjrv
|
|
_wb_
<@446428281630097408> after that pull request lands, for your 21-channel test image I got good results when doing this:
```
cjxl radiance-21.jxl radiance-21.jxl.jxl -d 0 -e 2 -E 1 -C 0
JPEG XL encoder v0.10.2 67cb0e82 [NEON]
lib/extras/dec/apng.cc:795: JXL_FAILURE: PNG signature mismatch
Encoding [Modular, lossless, effort: 2]
Compressed to 336647.9 kB (135.317 bpp).
4865 x 4091, 9.625 MP/s [9.62, 9.62], , 1 reps, 12 threads.
```
That's a setting with a reasonable enc/dec speed (unlike e9 E5 which compresses better but is very slow to encode and decode). The `-C 0` is to disable YCoCg, which is not very useful here, especially in combination with using PrevChannel context where it becomes counterproductive. You get a 336 MB file while the uncompressed data (i.e. the PAM file) is 835 MB. On my laptop, it takes under 8 seconds to do the encode, under 3 seconds to do the decode. For comparison, running default gzip on the PAM file produces a 560 MB file in a bit over 14 seconds.
|
|
2024-06-04 06:51:10
|
Works great! Switched to the branch version already while working on tooling. BTW I'm also targeting Wasm and compiling with Zig instead of Emscripten. Works a treat.
|
|
|
a goat
|
|
_wb_
It's also not clear to me how you can combine it with prediction. BWT is a 1D thing, if you would e.g. just apply it to all rows then you end up garbling up the image and while there will be more runs in every row, any predictor that uses rows above will be messed up...
|
|
2024-06-05 06:50:27
|
What about a Hilbert curve?
|
|
|
spider-mario
|
2024-06-05 08:54:03
|
I have the impression that while the British spelling of “color” is “colour”, “colourspace” is nevertheless rarely used over “colorspace” (and “colourimeter” even less)
|
|
2024-06-05 08:54:07
|
is that correct?
|
|
|
yoochan
|
2024-06-05 08:56:20
|
They'll never admit their incoherence
|
|
|
Quackdoc
|
|
spider-mario
I have the impression that while the British spelling of “color” is “colour”, “colourspace” is nevertheless rarely used over “colorspace” (and “colourimeter” even less)
|
|
2024-06-05 09:05:27
|
I use color and colour equally, same with colourspace and colorspace when talking, but always colorspace in code, and I've never seen colourimeter
|
|
|
_wb_
|
|
a goat
What about a hilbert curve?
|
|
2024-06-05 09:42:10
|
might work but will mess up speed even more...
|
|
|
KKT
|
2024-06-05 06:16:47
|
Hey all, I've started to put together an FAQ and a Glossary for the jpegxl.info site. The FAQ is divided into **General**, **Usage** & **Technical**. I've only scraped the surface on the glossary. As an experiment, I'm opening the Google doc up for everyone to edit, so be gentle! Thanks for the help!
https://docs.google.com/document/d/1tn0YNAeOCfDVGy6v0olsG9de3y3goehjoBT_G3I6Xmo/edit?usp=sharing
|
|
|
yoochan
|
2024-06-05 06:55:12
|
Nice! I started something similar some time ago and failed to reach a mature enough point to share it 😅 excess of shyness
|
|
|
HCrikki
|
2024-06-05 06:56:33
|
about low bitrates, the final result is not a shortcoming of the format itself but a natural consequence of prioritizing detail preservation for any given target filesize. video-based formats discard a lot of information, and low filesizes for either format would perpetuate the issue the web faced with low-quality, low-filesize jpegs
|
|
2024-06-05 07:06:32
|
perhaps add benchmark numbers for cases not usually or ever tested, like jpg->jxl conversions
|
|
2024-06-05 07:07:49
|
jpg to any other format results in either a massive *lossless* new file or a degraded *lossy* new file that still doesn't guarantee a lower filesize unless more detail is sacrificed — unlike jpg->jxl conversions, which are completely lossless, guarantee 20% less storage/bandwidth and take less than 50ms for sub-20mp images
|
|
|
|
salrit
|
2024-06-05 10:00:50
|
Is the JPEG XL art the only place to understand the MA Tree? Can I get any reference/documentation to understand it please?
|
|
|
monad
|
|
salrit
Is the JPEG XL art the only place to understand the MA Tree? Can I get any reference/documentation to understand it please?
|
|
2024-06-06 12:14:33
|
read the spec? https://discord.com/channels/794206087879852103/1021189485960114198
|
|
|
|
salrit
|
|
monad
read the spec? https://discord.com/channels/794206087879852103/1021189485960114198
|
|
2024-06-06 08:10:45
|
Will go through it...Thanks
|
|
|
jjrv
|
2024-06-06 09:59:11
|
Somehow it seems JxlEncoderAddImageFrame wants a buffer large enough to contain the extra channels as well, and yet they must also be passed to JxlEncoderSetExtraChannelBuffer. Should the num_channels field in JxlPixelFormat include the extra channel count or not?
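For what it's worth, my reading of the libjxl headers is that `num_channels` is meant to cover only the color channels (plus alpha if interleaved), with each remaining extra channel delivered as its own plane — but whether a given version also wants room for extras in the main buffer is exactly the question here, so treat this sketch as the intended shape rather than an answer:
```
#include <jxl/encode.h>

// Hypothetical helper: 3-channel color buffer plus one extra channel plane.
void add_frame_with_extras(JxlEncoderFrameSettings *settings,
                           const float *rgb, size_t rgb_bytes,
                           const float *extra, size_t extra_bytes,
                           uint32_t extra_index) {
  JxlPixelFormat color_fmt = {3, JXL_TYPE_FLOAT, JXL_NATIVE_ENDIAN, 0};
  JxlEncoderAddImageFrame(settings, &color_fmt, rgb, rgb_bytes);
  // One single-channel plane per extra channel, indexed separately.
  // (Call order relative to AddImageFrame: check your libjxl version.)
  JxlPixelFormat extra_fmt = {1, JXL_TYPE_FLOAT, JXL_NATIVE_ENDIAN, 0};
  JxlEncoderSetExtraChannelBuffer(settings, &extra_fmt, extra, extra_bytes,
                                  extra_index);
}
```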
|
|