|
Orum
|
2025-11-03 03:54:48
|
I adjust the exposure for the subject matter
|
|
|
Exorcist
|
|
Modular mode and AV1 instead of VarDCT
|
|
2025-11-04 07:46:40
|
In VarDCT mode, is there any intra prediction mode that AV1 can do, but JXL can't?
|
|
|
jonnyawsom3
|
2025-11-04 10:03:33
|
Maybe ask one of the AV1 guys, but I'd assume so considering it's a video codec
|
|
|
AccessViolation_
|
|
That gave me a cursed idea of translating AVIF coefficients into a JXL to prove we can hit the same density/quality, but I have had a few to drink so maybe it's just hypothetical
|
|
2025-11-04 10:50:50
|
this sounds interesting, have you thought more about this?
|
|
|
Exorcist
In VarDCT mode, is there any intra prediction mode that AV1 can do, but JXL can't?
|
|
2025-11-04 10:53:44
|
iirc AV1 has some coding tools specifically for *very low quality* images, mostly aimed at suppressing artifacts. JXL has some coding tools which could achieve similar reductions of artifacts in very low quality images, but unlike those in AV1 they weren't made specifically for super low quality, they were made to be generally useful across a wider quality range
|
|
2025-11-04 10:55:53
|
I should NOT have looked up "av1 deblurring" with safe search off when trying to find out more about that
|
|
|
jonnyawsom3
|
|
AccessViolation_
this sounds interesting, have you thought more about this?
|
|
2025-11-04 11:07:57
|
I have no idea if it's even feasible, the idea came to me at the bottom of a bottle of cider
|
|
|
AccessViolation_
|
2025-11-04 12:30:43
|
> The AFV transforms are composed of a
> horizontal DCT4x8, a square DCT4x4, and a variant of
> DCT4x4 with one "corner cut". Effectively three pixels in a
> corner (the corner pixel itself and its two nearest neighbors)
> are coded separately while a DCT transform is applied
> covering the remaining 13 pixels of the 4 × 4 block.
what's the reasoning behind these? I'm assuming these are for cases where some of those three corner pixels are very different from the pixels in the rest of the block which would cause bad ringing when DCT encoded? but that's my guess, they always seemed oddly specific
|
|
|
_wb_
|
2025-11-04 01:01:39
|
Yes, to me it also feels oddly specific, but then again it probably isn't _that_ rare for hard edges to appear in a corner of a block, and that's pretty much a worst-case scenario for the DCT. If the edge cuts through the middle, or spans a complete edge of the block, I assume it's not as bad as when it hits only the corner, where you basically need every single DCT coefficient to be kept intact if you want to not get bad ringing.
But I wasn't there when this was invented, it existed already in PIK iirc. <@532010383041363969> <@179701849576833024> are the A and V of AFV, not sure if Thomas Fischbacher (the F of AFV) is on this discord. I would also be interested in what the reasoning was and what caused them to come up with this "corner cut" DCT variant.
|
|
|
|
veluca
|
2025-11-04 01:15:58
|
from my memory, I would say you are correct on that being the reasoning
|
|
|
AccessViolation_
|
2025-11-04 01:18:38
|
also, I don't know if they were displayed incorrectly in the paper or if they were encoded incorrectly, but they seem to be used in exactly the wrong way, in that example image with the tennis player. at least in this spot
|
|
2025-11-04 01:21:34
|
I found another spot in that image where the corners are actually oriented towards the larger flat area instead of the different corner area
|
|
|
|
veluca
|
2025-11-04 01:21:38
|
I believe in the figure being correct
|
|
2025-11-04 01:21:54
|
but maybe it is not 🙂
|
|
2025-11-04 01:22:30
|
(btw, in this image I'd see a 8x4 as a better idea, but could be wrong)
|
|
|
AccessViolation_
|
2025-11-04 01:24:56
|
hmm I might have an idea why this is happening. those AFV blocks effectively contain an 8x4 block at either the top or bottom, where the cut corner isn't. so where an 8x4 does pretty well, an AFV with the 8x4 in the same spot would do pretty well too
|
|
2025-11-04 01:25:54
|
except then it presumably wastes some bits for the whole corner thingy
|
|
|
|
veluca
|
2025-11-04 01:26:58
|
yeah that seems credible
|
|
|
AccessViolation_
|
2025-11-04 01:35:12
|
they're used like this all over the place. I guess I'll create an issue. I have no clue how changing the tuning regarding these would affect density though
|
|
|
_wb_
|
2025-11-04 01:40:03
|
not sure how much you pay in AFV for a corner that is similar to the surrounding pixels, maybe not much, so it's not too different from 8x4 / 4x8.
But it is worth pointing out that block selection in libjxl is, afaik, all based completely on heuristics and not on actual RD optimization (mostly because it is pretty much impossible to estimate the signaling cost of a block well locally, entropy coding effects are pretty nonlocal).
|
|
|
jonnyawsom3
|
2025-11-04 01:43:06
|
I could be misinterpreting it, but I *think* AFV is being penalised higher than 8x4 in the logic here <https://github.com/libjxl/libjxl/blob/main/lib/jxl/enc_ac_strategy.cc#L547>
|
|
2025-11-04 01:45:03
|
We're actively working on improving sharpness by reducing the amount of large blocks, but it's a struggle to wrap my head around with so many variables feeding into each other
|
|
2025-11-04 01:47:17
|
AKA, we're trying to fix https://discord.com/channels/794206087879852103/1278292301038227489
|
|
|
AccessViolation_
|
|
I could be misinterpreting it, but I *think* AFV is being penalised higher than 8x4 in the logic here <https://github.com/libjxl/libjxl/blob/main/lib/jxl/enc_ac_strategy.cc#L547>
|
|
2025-11-04 01:50:31
|
in terms of entropy mul it seems to be the opposite? a lower entropy is 'better':
```c
if (entropy < best) {
best_tx = tx.type;
best = entropy;
}
```
but the 4x8 blocks have a multiplier of 0.85, while the AFV blocks have a multiplier of 0.81
|
|
|
jonnyawsom3
|
2025-11-04 01:51:41
|
The way the larger blocks work is that entropy_mul is a multiplier for the measured entropy, so a higher number means it gets more entropy and less chance to be selected
|
|
2025-11-04 01:58:20
|
<https://github.com/libjxl/libjxl/blob/main/lib/jxl/enc_ac_strategy.cc#L508>
|
|
|
AccessViolation_
|
2025-11-04 01:59:39
|
so doesn't that mean that 8x4 blocks are less likely to be selected than AFV blocks?
|
|
2025-11-04 02:00:18
|
which we don't want in these instances
|
|
2025-11-04 02:00:22
|
or am I misunderstanding
|
|
2025-11-04 02:03:27
|
I suppose I could swap their entropy_mul values and just see what happens on some test images :3
|
|
|
jonnyawsom3
|
|
AccessViolation_
hmm I might have an idea why this is happening. those AFV blocks effectively contain an 8x4 block at either the top or bottom, where the cut corner isn't. so where an 8x4 does pretty well, an AFV with the 8x4 in the same spot would do pretty well too
|
|
2025-11-04 02:03:54
|
Assuming both are doing well, they should have similar entropy, but the corner will add or remove a little bit. The difference in the multiplier means the entropy for AFV is multiplied to a lower value than 8x4, likely more than the difference in entropy
|
|
|
Prower
|
2025-11-04 08:42:30
|
Do we have a simple diagram like this for the JXL encoding path? It would be cool to see the comparison.
|
|
|
_wb_
|
2025-11-04 08:48:44
|
slides 20 and 21 are doing something somewhat like that: https://docs.google.com/presentation/d/1LlmUR0Uoh4dgT3DjanLjhlXrk_5W2nJBDqDAMbhe8v8/edit?slide=id.g910bfb2ea8_45_0#slide=id.g910bfb2ea8_45_0
|
|
|
Jyrki Alakuijala
|
|
AccessViolation_
> The AFV transforms are composed of a
> horizontal DCT4x8, a square DCT4x4, and a variant of
> DCT4x4 with one "corner cut". Effectively three pixels in a
> corner (the corner pixel itself and its two nearest neighbors)
> are coded separately while a DCT transform is applied
> covering the remaining 13 pixels of the 4 × 4 block.
what's the reasoning behind these? I'm assuming these are for cases where some of those three corner pixels are very different from the pixels in the rest of the block which would cause bad ringing when DCT encoded? but that's my guess, they always seemed oddly specific
|
|
2025-11-04 09:38:53
|
They are for cutting corners 🙂. When there are two slabs of matter, one in front, one in back, a relatively large number of blocks covering the edge will have only a small number of pixels from one of the slabs. AFV tries to make that easy. We had grandiose plans for similar ideas, but this was a small and practical step in this direction.
|
|
|
AccessViolation_
|
2025-11-04 09:45:41
|
I think I understand that
|
|
|
_Broken sΜΈyΜ΄mΜ΄mΜ΅ΜΏeΜ΄ΝΝtΜΈrΜ΅ΜΜΏyΜ΄Ν Ν
|
|
_wb_
slides 20 and 21 are doing something somewhat like that: https://docs.google.com/presentation/d/1LlmUR0Uoh4dgT3DjanLjhlXrk_5W2nJBDqDAMbhe8v8/edit?slide=id.g910bfb2ea8_45_0#slide=id.g910bfb2ea8_45_0
|
|
2025-11-04 09:46:33
|
Just flipped through this;
Love the humour.
|
|
|
AccessViolation_
|
|
Jyrki Alakuijala
They are for cutting corners 🙂. When there are two slabs of matter, one in front, one in back, a relatively large number of blocks covering the edge will have only a small number of pixels from one of the slabs. AFV tries to make that easy. We had grandiose plans for similar ideas, but this was a small and practical step in this direction.
|
|
2025-11-04 09:47:08
|
so even if more than those three corner pixels are a discrepancy, it still does well? like a vertical column of different pixels on the very left or right of the block, for example?
|
|
2025-11-04 09:47:35
|
a general coding tool for avoiding ringing I suppose?
|
|
2025-11-04 09:49:22
|
I think that's what you meant with two slabs, one in front and one in the back, but I'm not sure
|
|
|
AccessViolation_
also, I don't know if they were displayed incorrectly in the paper or if they were encoded incorrectly, but they seem to be used in exactly the wrong way, in that example image with the tennis player. at least in this spot
|
|
2025-11-04 09:49:37
|
like this I suppose, the white in front of the blue
|
|
|
Jyrki Alakuijala
|
2025-11-04 09:51:07
|
There are not many miracles in jpeg xl. The corner cutting saves perhaps 0.1 % or less
|
|
2025-11-04 09:51:56
|
At some bit rates it helped to maintain edge continuity
|
|
2025-11-04 09:52:32
|
AVIF does this with wedge blocks and aggressive filtering
|
|
|
AccessViolation_
|
2025-11-04 09:53:10
|
yeah I saw the directional filtering of AV1, pretty interesting too
|
|
|
Jyrki Alakuijala
|
2025-11-04 09:53:11
|
I didn't properly study wedge blocks
|
|
2025-11-04 09:53:39
|
Directional filtering sounds cool but doesn't work that well
|
|
|
AccessViolation_
|
2025-11-04 09:53:39
|
I also saw wedge block but didn't really look into them
|
|
|
Jyrki Alakuijala
|
2025-11-04 09:53:58
|
Wedge blocks and palette blocks do the work
|
|
|
AccessViolation_
|
|
Jyrki Alakuijala
Directional filtering sounds cool but doesn't work that well
|
|
2025-11-04 09:54:48
|
use cases are limited yeah, from what I can tell it's mostly for *very very* low bitrates where it's better than not doing it, where the ringing would be unbearable
|
|
|
Jyrki Alakuijala
|
2025-11-04 09:55:17
|
In jpeg xl I didn't want palette blocks because they operate only in a limited quality range
|
|
|
Exorcist
|
2025-11-04 09:56:28
|
What is the difference between CDEF in AV1 and the edge-preserving filter in JXL?
|
|
|
Jyrki Alakuijala
|
2025-11-04 09:56:41
|
Day and night
|
|
2025-11-04 09:56:58
|
Epf is not directional
|
|
|
AccessViolation_
|
|
AccessViolation_
also, I don't know if they were displayed incorrectly in the paper or if they were encoded incorrectly, but they seem to be used in exactly the wrong way, in that example image with the tennis player. at least in this spot
|
|
2025-11-04 09:57:31
|
what I don't quite understand is why using AFV here is better than doing 8x4, especially because it's not utilizing the corner pixels feature. is there some special behavior in AFV that works differently from DCT (aside from the corner pixel feature) that makes it do better in this use case?
|
|
|
Jyrki Alakuijala
|
2025-11-04 09:58:12
|
Epf adjusts its operation to each 8x8 block and to the adaptive quant level
|
|
|
Exorcist
|
2025-11-04 09:58:13
|
AV1 allows DCT and DST and no-transform
|
|
|
Jyrki Alakuijala
|
2025-11-04 09:58:49
|
DST is nice perhaps, a little worse but more continuous
|
|
2025-11-04 09:59:46
|
CDEF makes photos look a bit psychedelic, I'm not a fan
|
|
2025-11-04 10:00:15
|
It is a ±1 % thing
|
|
|
Exorcist
|
2025-11-04 10:00:48
|
And, AV1 can choose a different transform for each axis in one block: for example, horizontal DCT + vertical DST...
https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/Appendix-TX-Search.md
|
|
|
Jyrki Alakuijala
|
2025-11-04 10:01:10
|
I didn't try if that is good
|
|
|
|
veluca
|
2025-11-04 10:01:24
|
I don't know why but I remember convincing myself that DST is mostly meant for motion prediction residuals
|
|
|
Jyrki Alakuijala
|
2025-11-04 10:01:25
|
Or not seriously
|
|
2025-11-04 10:01:48
|
It didn't feel right
|
|
|
|
veluca
|
2025-11-04 10:01:49
|
not even sure you can get the encoder to use it in an image
|
|
|
AccessViolation_
|
2025-11-04 10:08:14
|
earlier today I found a message here that said AFV didn't use DCT for the non-corner data, but I can't find that message anymore
|
|
2025-11-04 10:08:43
|
does it do something special aside from having separately coded corner pixels?
|
|
2025-11-04 10:10:03
|
oh I found it
|
|
|
|
veluca
|
2025-11-04 10:11:07
|
I mean, it's basically a DCT
|
|
2025-11-04 10:11:12
|
just on a fun grid
|
|
2025-11-04 10:12:17
|
(this if you buy the story that the DCT is the transform that diagonalizes the shift operator on the 1xN vector, or something like that, IDR, it was a long time ago and I forgot exactly how we derived the corner transform)
|
|
|
AccessViolation_
|
2025-11-04 10:15:15
|
I think I'm missing some background knowledge to understand why that's useful 🙂
|
|
2025-11-04 10:15:41
|
that's okay, I appreciate it nonetheless
|
|
|
jonnyawsom3
|
2025-11-04 10:18:31
|
It's a good time for all the DCT talk, if I can understand the code then I might try to weight the different types better. Issue is all the variables feed into each other so it's hard to tell what value I'm meant to be changing
|
|
|
|
veluca
|
2025-11-04 10:24:25
|
I doubt anybody *really* knows what the encoder heuristics do, fwiw 🙂
|
|
2025-11-04 10:24:48
|
I do have some ideas of things to try in the future though
|
|
|
AccessViolation_
|
2025-11-04 10:25:59
|
I can't imagine how overwhelming optimizing/tuning an encoder must be. I bet I'd constantly get the urge to burn it all and start over
|
|
|
|
veluca
|
2025-11-04 10:26:56
|
as well as the interesting observation that the problem of choosing a perfect split in 8x8, 16x8 and 8x16 blocks can be done by min-cost matching (and I think if you add any larger transform it gets NP-complete, but perhaps one could proceed hierarchically)... that's most likely slow though, and still needs you to have *some* metric to evaluate how good a specific block is in a certain place
|
|
|
jonnyawsom3
|
|
AccessViolation_
I can't imagine how overwhelming optimizing/tuning an encoder must be. I bet I'd constantly get the urge to burn it all and start over
|
|
2025-11-04 10:37:55
|
As a rough idea, right now I'm focusing on these values
|
|
2025-11-04 10:38:01
|
Which use these values...
|
|
2025-11-04 10:38:28
|
In this....
|
|
2025-11-04 10:38:43
|
So still a few things for me to figure out 🙂
|
|
2025-11-04 10:40:01
|
I think there must be about 6 multiplications going on before the variable I change actually reaches the block selection code
|
|
|
veluca
I doubt anybody *really* knows what the encoder heuristics do, fwiw 🙂
|
|
2025-11-04 10:43:48
|
It is a little intimidating changing things when the variables have 17 decimal places haha
|
|
|
veluca
as well as the interesting observation that the problem of choosing a perfect split in 8x8, 16x8 and 8x16 blocks can be done by min-cost matching (and I think if you add any larger transform it gets NP-complete, but perhaps one could proceed hierarchically)... that's most likely slow though, and still needs you to have *some* metric to evaluate how good a specific block is in a certain place
|
|
2025-11-04 10:47:56
|
I mean, right now the encoder is already 2x slower than v0.8 was (Our baseline for our changes now), so that might be relative
|
|
|
|
runr855
|
2025-11-05 12:21:06
|
How has distance 1.0 been determined to be visually lossless? Cause it's clearly not for me, degradation is easy to spot when I compare to the original image
|
|
|
jonnyawsom3
|
2025-11-05 12:28:05
|
Visually lossless at 1x or 100% zoom sat 2ft from the display (or something along those lines, I'm sure I'll be corrected)
|
|
|
lonjil
|
2025-11-05 12:53:41
|
I think originally, it was supposed to be at a viewing distance of 500 pixels
|
|
|
jonnyawsom3
|
2025-11-05 12:57:51
|
A distance of 500 pixels? On a phone that would be like 2 inches
|
|
|
lonjil
|
|
A distance of 500 pixels? On a phone that would be like 2 inches
|
|
2025-11-05 01:17:15
|
image pixels. if you're displaying at 1:1 on a phone you'd have to be pretty close, yeah
|
|
|
monad
|
2025-11-05 04:43:31
|
1000 pixels given particular lighting conditions. but whether or not the encoder was ever very accurate, the distance 1 fidelity has slipped a bit with updates.
|
|
2025-11-05 04:48:14
|
recall that what's described as "visually lossless" is also described as having one unit of just noticeable distortion
|
|
|
_wb_
|
2025-11-05 01:45:33
|
I think it was 900 pixels, but I don't know how exactly the viewing distance was controlled (pretty sure no chin rests were used), and also no idea what display was used (display brightness makes a big difference too).
|
|
|
jonnyawsom3
|
2025-11-05 01:49:05
|
I vaguely recall a discussion around 2 of the testing centres using different monitors, one reference and the other generic, but I could be mistaken
|
|
|
_wb_
|
2025-11-05 01:50:12
|
"Visually lossless" is not something very exact, it can mean anything from "no human can see any difference under any viewing conditions" (which basically requires fully lossless) to "1 JND in some typical viewing conditions and using side-by-side comparison" (which can be a pretty low quality, since 1 JND means half of the people _can_ see a difference, typical viewing conditions are not very sensitive (e.g. lots of ambient light, relatively large viewing distance, etc), and side-by-side is way less sensitive than in-place flipping).
|
|
2025-11-05 01:56:09
|
Basically the threshold for "visually lossless" can be just about anything from `cjxl -d 0` to `cjxl -d 3` depending on who is defining it. I've even heard video codec people refer to something like typical `cjxl -d 6` output as "impractically high quality well above what is needed for a visually lossless image". And then there are other people who consider `cjxl -d 0.3` to be too low quality because they can spot the difference (when zooming in 10x and flickering).
|
|
|
|
runr855
|
2025-11-05 03:39:54
|
Do other codecs have anything similar that passes similar tests, or is JPEG XL unique in defining a 'visually lossless' quality threshold?
|
|
|
A homosapien
|
2025-11-05 05:42:10
|
I'm pretty sure JPEG XL is unique in this way
|
|
|
_wb_
|
2025-11-05 05:53:21
|
I think it's the first codec that explicitly had the goal of having an encoder that is aiming at achieving a consistent perceptual result.
But this is more a property of an encoder implementation than an intrinsic property of a codec. Although the codec design can help or complicate this, e.g. an approach where colorspace is considered an external application-level concern makes it hard for an encoder to do anything perceptual since it will typically not be able to know what the data actually represents.
|
|
|
Orum
|
2025-11-05 07:55:50
|
should we start taking bets on whether or not we'll get a new stable <:JXL:805850130203934781> this year?
|
|
|
jonnyawsom3
|
2025-11-05 08:00:23
|
I keep flipping between wanting the new release soon and hoping it's a little longer, depending on what PRs are pending haha
|
|
|
|
veluca
|
2025-11-05 08:05:14
|
should I take bets on whether libjxl 1.0 is out before or after jxl-rs is ready? 🙂
|
|
|
screwball
|
|
I keep flipping between wanting the new release soon and hoping it's a little longer, depending on what PRs are pending haha
|
|
2025-11-05 08:09:30
|
I hope a fix comes along for the progressive photon noise issue but I won't hold my breath
|
|
2025-11-05 08:09:40
|
It's not vital
|
|
|
jonnyawsom3
|
|
veluca
should I take bets on whether libjxl 1.0 is out before or after jxl-rs is ready? 🙂
|
|
2025-11-05 08:10:38
|
Something tells me you're already working as fast as you can without gambling incentives xD
|
|
|
|
veluca
|
2025-11-05 08:12:12
|
probably true! doesn't help that I have a million things to do
|
|
|
A homosapien
|
2025-11-05 08:24:50
|
what other cool projects are you also working on? unless you are not at liberty to say
|
|
|
|
veluca
|
2025-11-05 08:33:31
|
I will not comment on work stuff, but organizing https://egoi2026.it/ (and a bunch of other olympiads, although for those at least I am not the main organizer :P) has been taking a huge chunk of my non-work time
|
|
|
lonjil
|
|
AccessViolation_
|
|
veluca
I will not comment on work stuff, but organizing https://egoi2026.it/ (and a bunch of other olympiads, although for those at least I am not the main organizer :P) has been taking a huge chunk of my non-work time
|
|
2025-11-05 11:12:28
|
really nice, this is a great initiative
|
|
|
monad
|
2025-11-06 07:21:56
|
libjxl 1.0 is imaginary
|
|
|
spider-mario
|
2025-11-06 07:50:53
|
you're thinking of libjxl i.0
|
|
|
_wb_
|
2025-11-06 07:54:09
|
libjxl -e^iπ
|
|
|
[ ]
|
2025-11-06 10:08:39
|
Hey, I'm currently looking to render raw photos I have from DNG into JXL files and I'd rather like to use a colorspace wider than sRGB, making better use of HDR when support for displaying such images eventually matures.
Imagemagick seems to do a good job at converting from the DNG files I have, but I am struggling to find a way to actually set a transfer function or colorspace in the generated JXL file. I have also considered using `cjxl` directly, though even there I can't seem to find any options to configure this, nor am I sure which intermediate format to use since DNG files are seemingly not supported as input. As a last resort I could probably write a program interfacing directly with libjxl or similar I suppose, but I figured I'd ask here before attempting something like that.
Would appreciate any pointers anyone can provide – I'm well aware the land of colorspaces is vast and confusing at the best of times so do let me know if I'm fundamentally misunderstanding anything.
|
|
|
AccessViolation_
|
|
[ ]
Hey, I'm currently looking to render raw photos I have from DNG into JXL files and I'd rather like to use a colorspace wider than sRGB, making better use of HDR when support for displaying such images eventually matures.
Imagemagick seems to do a good job at converting from the DNG files I have, but I am struggling to find a way to actually set a transfer function or colorspace in the generated JXL file. I have also considered using `cjxl` directly, though even there I can't seem to find any options to configure this, nor am I sure which intermediate format to use since DNG files are seemingly not supported as input. As a last resort I could probably write a program interfacing directly with libjxl or similar I suppose, but I figured I'd ask here before attempting something like that.
Would appreciate any pointers anyone can provide – I'm well aware the land of colorspaces is vast and confusing at the best of times so do let me know if I'm fundamentally misunderstanding anything.
|
|
2025-11-06 10:14:53
|
would you consider using free software specifically for developing raws? Darktable is quite nice for developing your raw files. then you can configure color spaces, HDR, etc, and export directly to a JPEG XL file
|
|
|
[ ]
|
2025-11-06 10:17:11
|
I did try several (including darktable) before settling on this solution: the numerous UX issues each gives me aside, I'd rather be able to automate the process even if it requires coming up with some less than ideal heuristics honestly.
|
|
|
AccessViolation_
|
2025-11-06 10:18:20
|
I seem to remember a discussion here about automating darktable (or rawtherapee?) with scripts, but I don't remember exploring that further
|
|
|
[ ]
|
2025-11-06 10:19:23
|
If it requires direct library integration I'd definitely rather write my own software for it – I prefer to keep things minimal and the raw editing programs I've seen so far don't quite tick that box I'd say.
|
|
|
RaveSteel
|
|
[ ]
I did try several (including darktable) before settling on this solution: the numerous UX issues each gives me aside, I'd rather be able to automate the process even if it requires coming up with some less than ideal heuristics honestly.
|
|
2025-11-06 10:26:45
|
you can automate using darktable-cli, bypassing the UI entirely
if that works for you
|
|
|
[ ]
|
2025-11-06 10:27:38
|
I will have a look at that, thanks for the suggestion
|
|
|
_wb_
|
2025-11-06 10:49:54
|
Easiest workflow is to produce 16-bit PNG files with an ICC profile corresponding to the colorspace you're developing your DNG files to, e.g. you could use Rec2100 PQ to make sure you're not clamping any gamut or dynamic range. You can then pass such PNG files directly to cjxl and it will handle them appropriately. You can also view the PNG files in Chrome and it should render them correctly in HDR if you're in Windows or MacOS. Sadly in Linux proper HDR rendering is still something that is not easily possible afaik.
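For reference, that pipeline might look roughly like this on the command line (a sketch only: the filenames are placeholders, and darktable-cli's export options vary by version, so check `darktable-cli --help` for how to force a 16-bit PNG in the colorspace you want):

```shell
# Develop the raw to a 16-bit PNG; configure the export colorspace
# (e.g. Rec2100 PQ) via the style/preset or export options you apply.
darktable-cli photo.dng photo.png

# Encode the PNG to JPEG XL; cjxl picks up the PNG's embedded ICC
# profile and carries the colorspace through. -d 1 is the default
# visual-quality target.
cjxl photo.png photo.jxl -d 1
```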
|
|
|
[ ]
|
2025-11-06 10:51:45
|
Best I've managed for HDR rendering on Linux has been mpv so far 🙂
I'll consider that approach too: getting the right icc profile could definitely help here, though if I'm being honest I'm not sure what the best way to generate/obtain one would be.
|
|
|
_wb_
|
2025-11-06 11:20:23
|
in the libjxl repo tools there is `icc_simplify` which you can also use to generate icc profiles from a string, e.g. `icc_simplify Rec2100PQ rec2100pq.icc` should work iirc.
|
|
|
RaveSteel
|
|
[ ]
Best I've managed for HDR rendering on Linux has been mpv so far 🙂
I'll consider that approach too: getting the right icc profile could definitely help here, though if I'm being honest I'm not sure what the best way to generate/obtain one would be.
|
|
2025-11-06 11:21:13
|
Take a look at tev, which is a very nice image viewer under Linux with proper HDR support
|
|
2025-11-06 11:21:34
|
https://github.com/Tom94/tev
|
|
|
[ ]
|
2025-11-06 11:24:20
|
I've resigned myself to just waiting for gnome/gtk to get hdr support integrated in their apps, but in the meantime this might be worth looking at for quick debugging instead of opening in mpv
|
|
|
Quackdoc
|
|
RaveSteel
Take a look at tev, which is a very nice image viewer under Linux with proper HDR support
|
|
2025-11-06 01:34:00
|
does it support jxl now?
|
|
|
RaveSteel
|
|
Quackdoc
does it support jxl now?
|
|
2025-11-06 01:45:18
|
yes, for a few months now
|
|
|
Quackdoc
|
2025-11-06 01:45:35
|
nice, haven't had the chance to use it for a while
|
|
|
RaveSteel
|
2025-11-06 01:45:38
|
I think I posted the initial release with JXL support in adoption back then
|
|
|
[ ]
|
2025-11-06 03:13:24
|
Seems like I've got a solution working with darktable-cli, so thanks all for the help and suggestions. As a side note: the UI issues I was having were related to xwayland and running darktable natively on wayland resolved them.
|
|
|
_wb_
|
2025-11-06 04:38:07
|
https://www.reddit.com/r/jpegxl/comments/1oq1vjv/what_is_jpeg_xl/
Why are people using reddit like it's a search engine or llm?
|
|
|
monad
|
2025-11-06 04:43:20
|
This is the reality of any forum.
|
|
|
HCrikki
|
2025-11-06 06:01:19
|
forums compile permanent public knowledge. on reddit and lemmy, there's no decent stickied or sidebared information anywhere, like a discord join link, and it's a miracle release threads even got pinned
|
|
|
lonjil
|
|
HCrikki
forums compile permanent public knowledge. on reddit and lemmy, there's no decent stickied or sidebared information anywhere, like a discord join link, and it's a miracle release threads even got pinned
|
|
2025-11-06 06:23:46
|
my experience with forums is that I google something, and all the forum threads just have replies like "this has already been explained, use google before asking questions"
|
|
|
monad
|
2025-11-06 06:27:34
|
It's great when you have five threads on the first page all asking the same question and someone feels obliged to make a sixth.
|
|
2025-11-06 06:29:32
|
I just link to my prior response. Then at some point to drive the message home harder I started linking to the last link.
|
|
|
jonnyawsom3
|
|
_wb_
https://www.reddit.com/r/jpegxl/comments/1oq1vjv/what_is_jpeg_xl/
Why are people using reddit like it's a search engine or llm?
|
|
2025-11-06 07:36:24
|
https://discord.com/channels/794206087879852103/806898911091753051/1434491997984653323
|
|
|
|
afed
|
2025-11-06 07:51:27
|
https://discord.com/channels/794206087879852103/794206170445119489/1199165201459716126
|
|
|
Jyrki Alakuijala
|
|
lonjil
I think originally, it was supposed to be at a viewing distance of 500 pixels
|
|
2025-11-07 03:39:47
|
I used a viewing distance of 900 pixels in most experiments leading to compromises in butteraugli, xyb, and JPEG XL. Later I made adjustments that make even more zooming fail more gracefully.
|
|
|
_wb_
I think it was 900 pixels, but I don't know how exact the viewing distance was controlled (pretty sure no chin rests were used), and also no idea what display was used (display brightness makes a big difference too).
|
|
2025-11-07 03:44:23
|
This happened at my desk. If I had others verifying my work, I would control their head position. Sometimes it would be 850 pixels, sometimes 1000; I didn't think it was particularly important to make it exactly 900.
|
|
2025-11-07 03:45:51
|
I used an expensive photoediting monitor, and made each pixel 2x2 physical pixels.
|
|
|
DZgas Π
|
|
monad
libjxl 1.0 is imaginary
|
|
2025-11-08 08:06:43
|
<:KekDog:805390049033191445> never, next version 0.11.1.1
|
|
|
_wb_
|
2025-11-08 08:15:01
|
I like the version numbering of TeX
|
|
|
AccessViolation_
|
|
DZgas Π
<:KekDog:805390049033191445> never, next version 0.11.1.1
|
|
2025-11-08 08:39:49
|
finally --JXL Secure DNS
|
|
|
qdwang
|
2025-11-08 09:06:40
|
anyone has the backup samples from https://people.csail.mit.edu/ericchan/hdr/hdr-jxl.php ? the website is down...
|
|
2025-11-08 09:07:29
|
I remember there was a tree sample with PQ transfer.
|
|
|
spider-mario
|
|
anyone has the backup samples from https://people.csail.mit.edu/ericchan/hdr/hdr-jxl.php ? the website is down...
|
|
2025-11-08 10:39:02
|
it prints the PHP source instead of executing it, but that means you can see the name of the directory the images are taken from, and going there lists the files https://people.csail.mit.edu/ericchan/hdr/jxl_images/
|
|
|
derbergπ
|
|
lonjil
my experience with forums is that I google something, and all the forum threads just have replies like "this has already been explained, use google before asking questions"
|
|
2025-11-09 11:22:26
|
There are spammy forums, good forums, and then there are forums with threads containing actually helpful info in the second-last reply, followed by a post from a forum moderator saying something along the lines of "You necroposted!!!111elf", basically asking people not to solve long-standing problems because the OP might not have the issue anymore, as if people finding threads via search engines is not a thing?!
|
|
2025-11-09 11:25:42
|
In fact, most solutions from the english Arch forum that I found were from necroposters.
|
|
2025-11-09 11:31:10
|
But yeah, the occasional "you can find this with an easy search via a search engine" is also a thing. The best is when you then find three threads with such a reply but no solution.
|
|
|
monad
I just link to my prior response. Then at some point to drive the message home harder I started linking to the last link.
|
|
2025-11-09 11:34:58
|
Yeah, that is the best practice, and in case a new thread somewhat differs, an OP can still reply (unless the new thread got locked immediately after such a reply)
|
|
2025-11-09 11:39:05
|
On the other end there are SO mods that may even hide perfectly valid and well-explained questions and mark stuff as duplicate even when it is pretty obvious it is only tangentially related, if at all.
|
|
2025-11-09 11:42:20
|
Has been a while and I only posted a few questions so I can't be that specific about it.
|
|
2025-11-09 11:47:20
|
Just logged in again after years and apparently the inbox overview doesn't include years but only months and days. Wtf?
|
|
|
qdwang
|
|
spider-mario
it prints the PHP source instead of executing it, but that means you can see the name of the directory the images are taken from, and going there lists the files https://people.csail.mit.edu/ericchan/hdr/jxl_images/
|
|
2025-11-09 04:08:54
|
thank you so much<:BlobYay:806132268186861619>
|
|
|
jonnyawsom3
|
2025-11-09 10:06:10
|
Oh, they've advanced to actually uploading the images instead of embedding them now. <@&807636211489177661>
|
|
|
AccessViolation_
|
2025-11-10 12:12:40
|
I wonder, were other forms of lossy compression considered for JXL? I know modular has some, but I'm talking about things like using [2D Gaussians](https://arxiv.org/abs/2407.01866), or the thing that uses polygons with gradients and added noise (I can't find this anymore), in addition to using DCT?
|
|
2025-11-10 12:14:38
|
I guess this would've had to be done during the "call for proposals" phase?
|
|
|
lonjil
|
2025-11-10 12:18:03
|
yes
|
|
2025-11-10 12:19:36
|
Pik was already an 8x8 DCT codec when JPEG put out the call for proposals, and FUIF was already a modular-mode like thing with trees and IIRC DCT support for lossless JPEG conversion.
|
|
|
AccessViolation_
|
2025-11-10 03:08:49
|
gotcha
|
|
2025-11-10 03:17:02
|
I recently found out about [*Paged Out!*](https://pagedout.institute/) which is a free computer magazine where every article is one page. I thought it might be neat to submit an article about JXL art? π
|
|
|
Jyrki Alakuijala
|
|
lonjil
Pik was already an 8x8 DCT codec when JPEG put out the call for proposals, and FUIF was already a modular-mode like thing with trees and IIRC DCT support for lossless JPEG conversion.
|
|
2025-11-10 05:24:37
|
Lossless jpeg recompression came from the pik devs (brunsli first, a predecessor of pik), later unified with vardct
|
|
|
lonjil
|
2025-11-10 05:25:19
|
I know you had that, I just seem to recall it being a FUIF feature too, even if that version wasn't brought into JXL.
|
|
|
Jyrki Alakuijala
|
2025-11-10 05:25:58
|
Brunsli was developed in 2014, and pik was started as a combination of guetzli and brunsli
|
|
2025-11-10 05:26:56
|
I don't think fuif had jpeg recompression
|
|
2025-11-10 05:27:19
|
Fuif context modeling was very expensive
|
|
2025-11-10 05:28:17
|
It was only when Luca copied the brotli/webp-lossless-style entropy clustering that modular gained some decoding speed
|
|
2025-11-10 05:29:17
|
Entropy clustering allows complex computation to happen at encode time so that decoding can be faster
|
|
2025-11-10 05:30:36
|
I drove the addition of lossless jpeg recompression into the requirements of jpeg xl; it wasn't requested by the committee earlier
|
|
2025-11-10 05:31:20
|
Touradj accepted the idea; he understood the appeal
|
|
2025-11-10 05:32:54
|
I wanted to do it with brunsli originally, but two of our engineers made a revolution π and wanted it to work with more modern coding (vardct)
|
|
2025-11-10 05:33:43
|
But for about 10 months brunsli was in this role
|
|
2025-11-10 05:34:12
|
Before it was happily deleted
|
|
2025-11-10 05:34:45
|
The new code compresses about a percent worse but has tiles and 20 fewer pages of spec
|
|
|
_wb_
|
2025-11-10 05:36:04
|
FUIF did have JPEG recompression too, at least dct coefficient recompression (not jpeg bitstream reconstruction). There was a transform in fuif corresponding to 8x8 dct that turned each channel into 64 channels, one per coefficient. It worked, compression was comparable to brunsli, but it was slower and not better so we quickly removed it early on in the fuif-pik combination effort.
|
|
|
Jyrki Alakuijala
|
2025-11-10 05:37:44
|
You added it to compare that approach with brunsli? Or it was there from day 1?
|
|
|
_wb_
|
2025-11-10 05:38:17
|
JPEG recompression was indeed not in the original requirements for JPEG XL, this was one of the things we had agreement on quickly since both our proposals had this feature despite it not being a requirement π
|
|
2025-11-10 05:38:34
|
It was there from day 1 in fuif
|
|
2025-11-10 05:38:48
|
Well from before it was submitted at least
|
|
|
Jyrki Alakuijala
|
2025-11-10 05:38:58
|
With quantization matrices etc?
|
|
|
_wb_
|
2025-11-10 05:39:34
|
Yeah, it would have the quant table as a metachannel and then turn every channel into 64 channels that are 8x smaller in both dimensions
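The transform _wb_ describes can be sketched like this: split a channel into 8x8 blocks, DCT each one, then give each of the 64 coefficient positions its own channel that is 8x smaller in both dimensions. This is a rough sketch with hypothetical names and an orthonormal DCT, not the actual FUIF code:

```python
import numpy as np

def dct_mat(n=8):
    # orthonormal DCT-II basis matrix
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] /= np.sqrt(2)
    return m * np.sqrt(2 / n)

def to_coeff_channels(chan):
    # chan: (H, W) array with H and W multiples of 8
    h, w = chan.shape
    m = dct_mat()
    # gather 8x8 blocks: shape (H/8, W/8, 8, 8)
    blocks = chan.reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3)
    # 2D DCT of every block: M @ block @ M^T
    coeffs = np.einsum('ij,abjk,lk->abil', m, blocks, m)
    # one (H/8, W/8) channel per coefficient position -> 64 channels
    return coeffs.transpose(2, 3, 0, 1).reshape(64, h // 8, w // 8)
```

For a constant image, only channel 0 (the DC coefficients) is nonzero, which is what makes this layout compress well: each coefficient channel is smooth and self-similar.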
|
|
|
Jyrki Alakuijala
|
2025-11-10 05:40:04
|
Nice, I didn't know π
or don't remember
|
|
|
_wb_
|
2025-11-10 05:40:27
|
It was a "modular transform" (though then it wasn't called "modular" yet, that name came later)
|
|
2025-11-10 05:41:31
|
https://github.com/cloudinary/fuif/blob/master/transform/dct.h
|
|
|
Jyrki Alakuijala
|
2025-11-10 05:41:57
|
I wanted to add brunsli because there was a big corpus of brunsli jpegs, it would have made politics a bit easier perhaps, but the team wanted cleanliness of design over politics-optimized design
|
|
|
_wb_
Yeah, it would have the quant table as a metachannel and then turn every channel into 64 channels that are 8x smaller in both dimensions
|
|
2025-11-10 05:42:54
|
Nice! Great minds think alike!
|
|
|
_wb_
|
2025-11-10 05:49:29
|
It was a big spec simplification to get rid of brunsli as a separate mode. In the long run it was a good decision.
|
|
|
|
runr855
|
2025-11-10 11:47:37
|
A question of curiosity, how much effort has been put into making sure patches in JPEG XL won't end up making mistakes similar to what JBIG2 can do? Does the spec contain any constraints at all to limit possibilities of "dangerous" mistakes? Repeating a huge block of color and patterns seems fine, but things like character substitution sound a bit sketchy
|
|
2025-11-10 11:48:11
|
Or would something like this be entirely up to the implementation of encoders with no rules in the spec?
|
|
|
username
|
|
runr855
Or would something like this be entirely up to the implementation of encoders with no rules in the spec?
|
|
2025-11-10 11:52:25
|
AFAIK yeah it's up to the encoder. The reference encoder currently only uses them in a lossless capacity IIRC
|
|
2025-11-10 11:53:04
|
I'm not really sure if there would be a way to express such a thing spec-side besides recommendations of what you should and shouldn't do
|
|
|
lonjil
|
|
username
AFAIK yeah it's up to the encoder. The reference encoder currently only uses them in a lossless capacity IIRC
|
|
2025-11-10 11:53:06
|
Not lossless, but it goes by peak error rather than average error
|
|
|
|
veluca
|
2025-11-10 11:58:33
|
the spec cannot really say what encoders can do
|
|
|
|
runr855
|
2025-11-11 12:04:26
|
Of course, the worst that could be done is declaring such an encoder non-compliant with the spec
|
|
|
Jyrki Alakuijala
|
|
runr855
A question of curiosity, how much effort has been put into making sure patches in JPEG XL won't end up making mistakes similar to what JBIG2 can do? Does the spec contain any constraints at all to limit possibilities of "dangerous" mistakes? Repeating a huge block of color and patterns seem fine, but things like character substitution sound a bit sketchy
|
|
2025-11-11 12:10:59
|
It is an encoder decision, not a format decision
|
|
|
|
runr855
|
2025-11-11 01:30:40
|
Makes sense, the JPEG XL spec defines a contract of possibilities and nothing more. I guess that would be a correct statement?
|
|
|
A homosapien
|
2025-11-11 02:18:14
|
I think of it as specifying a bitstream, a decoder basically. You are given free rein to use whatever encoding techniques you want as long as it can be decoded.
|
|
|
|
runr855
|
2025-11-11 02:25:45
|
That's much more specific, thank you
|
|
|
_wb_
|
2025-11-11 07:40:56
|
Anyone could make an encoder that ignores the input completely and produces a rickroll image instead, and technically that would be a conforming encoder according to the spec since the only thing a conforming encoder has to do is produce bitstreams that are decodable by a conforming decoder.
|
|
2025-11-11 07:45:29
|
That said, in libjxl we did make sure it is not possible to get JBIG2 style results. We use patches with kAdd blending so the residuals are never just removed (as you would get when using kReplace blending), and the criterion for considering patches "similar enough" is based on max error, not average error, so even if we would use kReplace blending it wouldn't cause catastrophic substitutions like what happened in those Xerox scanners doing lossy jbig2.
|
|
|
Orum
|
|
_wb_
Anyone could make an encoder that ignores the input completely and produces a rickroll image instead, and technically that would be a conforming encoder according to the spec since the only thing a conforming encoder has to do is produce bitstreams that are decodable by a conforming decoder.
|
|
2025-11-11 08:15:02
|
you jest, but Stable Diffusion was doing this for content they considered inappropriate (or at least they were at one time; not sure if they still are)
|
|
|
_wb_
|
2025-11-11 08:30:40
|
In Cloudinary you can select a "default image" to be returned instead of a 404. It is usually set to a single transparent pixel gif. Better to put something invisible on a page than a broken image icon, typically.
|
|
|
AccessViolation_
|
|
runr855
A question of curiosity, how much effort has been put into making sure patches in JPEG XL won't end up making mistakes similar to what JBIG2 can do? Does the spec contain any constraints at all to limit possibilities of "dangerous" mistakes? Repeating a huge block of color and patterns seem fine, but things like character substitution sound a bit sketchy
|
|
2025-11-11 09:47:29
|
to elaborate on what people mean with libjxl using max error instead of average error: whether a patch candidate is considered too different is based on a per-pixel error threshold, instead of the average error of all pixels. for example, if the lowercase `L` is just a pixel taller than an uppercase `I` in some font, the average error is very low, but there will be one or more pixels where the difference is very large, and those alone prevent that candidate from being considered.
when the condition is met and patching does happen, the patch does not *replace* the original pixels; instead it is *subtracted* from the original pixels. when decoding the image, the patch is *added* to what's left over after that subtraction, meaning you always get the exact original pixels back
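Both properties can be sketched in a few lines (hypothetical names here, not libjxl's actual API):

```python
import numpy as np

def patch_ok(candidate, target, max_err):
    # peak-error criterion (sketch): one very different pixel vetoes the
    # match, whereas an average-error criterion could let it hide
    return np.max(np.abs(candidate - target)) <= max_err

def encode_with_patch(original, patch):
    # kAdd-style blending (sketch): the patch is subtracted at encode time,
    # so the residual information is never simply thrown away
    return original - patch

def decode_with_patch(residual, patch):
    # at decode time the patch is added back, recovering the exact pixels
    return residual + patch
```

With a kReplace-style scheme the residual would be discarded entirely, which is what made the Xerox/JBIG2 substitutions catastrophic; with kAdd the worst a bad patch can do is make the residual more expensive to code.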
|
|
|
RaveSteel
|
2025-11-11 01:44:00
|
<@&807636211489177661>
|
|
|
gggol
|
|
AccessViolation_
I wonder, were other forms of lossy compression considered for JXL? I know modular has some, but I'm talking about things like using [2D Gaussians](https://arxiv.org/abs/2407.01866), or the thing that uses polygons with gradients and added noise (I can't find this anymore), in addition to using DCT?
|
|
2025-11-11 04:20:38
|
Polygons with gradients? While raster based formats are nearly universal, remember that they aren't everything. Polygons with gradients are easy to do in a vector format, like in PDF and SVG. I have not noticed support for vector in any of the JPEG work, though maybe JPEG AI will have that implicitly.
But then, the P in JPEG does stand for "photographic", so maybe vector is beyond their scope. Can't quite see JPEG supporting the scaling of fonts like PDF does. It sure is annoying the way businesses are too apt to print out and scan in a text document, thus converting it to a JPEG image. For them, paper is still an easier way to sign a document than trying to figure out digital signatures.
|
|
|
lonjil
|
2025-11-11 04:23:41
|
JXL has some limited vector stuff.
|
|
|
AccessViolation_
|
|
gggol
Polygons with gradients? While raster based formats are nearly universal, remember that they aren't everything. Polygons with gradients are easy to do in a vector format, like in PDF and SVG. I have not noticed support for vector in any of the JPEG work, though maybe JPEG AI will have that implicitly.
But then, the P in JPEG does stand for "photographic", so maybe vector is beyond their scope. Can't quite see JPEG supporting the scaling of fonts like PDF does. It sure is annoying the way businesses are too apt to print out and scan in a text document, thus converting it to a JPEG image. For them, paper is still an easier way to sign a document than trying to figure out digital signatures.
|
|
2025-11-11 04:37:16
|
can you give me a recipe for mushroom risotto?
|
|
2025-11-11 04:38:40
|
what you said sounded very AI generated :p
|
|
2025-11-11 04:42:50
|
what I meant was using polygons with gradients to approximate a photographic image, rendering a low-resolution preview with very few bytes
|
|
|
jonnyawsom3
|
2025-11-11 05:02:12
|
AKA WebP2
|
|
|
AccessViolation_
|
2025-11-11 05:09:18
|
ah that was the one then I guess. I was playing around with a demo but I couldn't find it again
|
|
2025-11-11 05:10:06
|
https://skal65535.github.io/triangle/index.html yep
|
|
|
A homosapien
|
|
AccessViolation_
what I meant was using polygons with gradients to approximate a photographic image, rendering a low-resolution preview with very few bytes
|
|
2025-11-11 06:33:31
|
See https://github.com/eustas/2im and here's a visual comparison https://eustas.github.io/2im/twim-vs-sqip.html
|
|
|
AccessViolation_
|
2025-11-11 06:36:25
|
I've been tinkering away at this one
|
|
2025-11-11 06:36:26
|
WebP2 (250 bytes)
|
|
2025-11-11 06:37:45
|
JPEG XL (259 bytes)
(`cjxl scaled.png scaled-lossy.jxl -q 10 --resampling=8 --gaborish=1 --epf=3 --photon_noise_iso=30000000 -e 4`)
|
|
2025-11-11 06:38:10
|
original (source for the JXL, idk what the source for the webp was) (scaled by me to match the webp2 size)
|
|
|
A homosapien
|
2025-11-11 06:41:55
|
I have a working 2im binary, I'll give it a shot
|
|
|
AccessViolation_
|
2025-11-11 06:42:27
|
wanna move over to <#803645746661425173> btw?
|
|
|
[ ]
|
2025-11-12 04:05:11
|
While exporting images in darktable I noticed that jxl outputs can't seem to output actual black, just something close (GIMP says RGB of 3×5.5). This only seems to be an issue affecting outputs using the PQ transfer function: linear and HLG seem unaffected.
Does anyone know what might be going on here? The PQ test pattern from <https://jpegxl.info/resources/hdr-test-page.html> has no such issue and I have checked with multiple image viewers and editors; they all agree with my assessment.
|
|
|
_wb_
|
2025-11-12 07:48:12
|
sounds like a darktable issue...
|
|
|
[ ]
|
2025-11-12 08:07:19
|
That is what I figured. I was poking around in their jxl encoding code for an unrelated thing and didn't see anything that could cause that though: from what I recall they just pass the float/int values to libjxl.
|
|
2025-11-12 10:30:01
|
On further testing, it seems darktable is indeed passing different values based on the colorspace. I'll have a closer look at that to try and work out what's going on.
|
|
|
Adrian The Frog
|
2025-11-12 09:06:27
|
How bad would it be to compress 300 frames of video as a 900 channel jxl
|
|
2025-11-12 09:07:36
|
Would it be able to save a lot of space on stationary background elements just by the default compression systems?
|
|
|
_wb_
|
2025-11-12 09:20:48
|
uhm, probably not, at least not with default settings
|
|
2025-11-12 09:21:10
|
with -E 10 it can use the previous 10 channels as context, that will probably help a bit
|
|
2025-11-12 09:21:41
|
but to really leverage inter-channel correlation, you'd need to apply RCTs on those 900 channels, which is something libjxl currently doesn't do
|
|
2025-11-12 09:22:02
|
(it does on the first 3 channels but doesn't try doing it on the rest)
|
|
2025-11-12 09:22:27
|
also it will be a memory nightmare
|
|
2025-11-12 09:22:45
|
and no viewer will play that as an animation of course
|
|
2025-11-12 09:23:58
|
so probably best not to use channels for other things than actual additional per-pixel data like alpha or depth
|
|
|
Adrian The Frog
|
2025-11-12 10:44:53
|
I feel like a 3d dct would make sense for spectral images once the spectral resolution is high enough
What's the SOTA for lossy compression of volumes?
|
|
|
jonnyawsom3
|
|
_wb_
but to really leverage inter-channel correlation, you'd need to apply RCTs on those 900 channels, which is something libjxl currently doesn't do
|
|
2025-11-12 11:49:23
|
Huh, I thought it did.. I wonder if that's related to RCT and Squeeze not using the right selection
|
|
|
_wb_
|
2025-11-13 08:06:49
|
RCTs are just always operating on the first three channels in current libjxl. Probably if there are more channels it is useful to apply RCTs on those too, but this wasn't really explored yet.
|
|
2025-11-13 08:10:01
|
For sure there are RGBA images where Alpha correlates with the rest so it could be useful to do something like YCoCg (A-Y) or something, which can be done but it's a bit tricky since you'll have to use some permutation-only RCTs and do something like RGBA -> YCoCgA -> CoCgYA -> CoCgY(A-Y) -> YCoCg(A-Y), since RCT can only operate on three subsequent channels
|
|
|
jonnyawsom3
|
2025-11-13 04:54:24
|
<@794205442175402004> as a follow up to that, do you know if there's an easy way to tell if the current channel is a squeeze step during encoding?
Currently we're just setting the predictor to None for progressive lossless, because it does best on the smallest half of the steps. If we could only disable it for that half, the predictors should help for the first few steps where there's still plenty of context
|
|
|
_wb_
|
2025-11-14 08:09:37
|
uhm, you could look at the `hshift` and `vshift` of the modular channel, early squeeze steps will have higher values for those variables
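As a sketch of how the shifts map to channel sizes, assuming each unit of shift halves that dimension with ceil rounding (an assumption; check the spec for the exact convention):

```python
def squeeze_channel_size(width, height, hshift, vshift):
    # hypothetical helper: a shift of k means the dimension was halved
    # k times, rounding up at each step (equivalent to one ceil-divide)
    return ((width + (1 << hshift) - 1) >> hshift,
            (height + (1 << vshift) - 1) >> vshift)
```

So a channel with `hshift == vshift == 1` would be the half-resolution squeeze residual, `2` the quarter-resolution one, and so on.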
|
|
|
jonnyawsom3
|
2025-11-14 08:12:31
|
So would a `hshift` and `vshift` of 1 correspond to an image of half resolution, then 2 for 1/4 res, etc?
|
|
|
screwball
|
2025-11-14 08:54:25
|
in cjxl, is there a way to manipulate the bit depth and potentially even chroma subsampling of an output jxl?
i see there is a way to change color space information which is good
|
|
|
jonnyawsom3
|
2025-11-14 09:04:32
|
Bitdepth yes `--override_bitdepth`
Chroma subsampling no, as that doesn't exist. It's only in the spec for JPEG transcoding
|
|
|
screwball
|
|
Bitdepth yes `--override_bitdepth`
Chroma subsampling no, as that doesn't exist. It's only in the spec for JPEG transcoding
|
|
2025-11-14 09:05:20
|
thanks
right after asking that, i found this blog post
this is crazy and i had no idea about it
https://www.fractionalxperience.com/ux-ui-graphic-design-blog/why-jpeg-xl-ignoring-bit-depth-is-genius
|
|
|
AccessViolation_
|
|
screwball
thanks
right after asking that, i found this blog post
this is crazy and i had no idea about it
https://www.fractionalxperience.com/ux-ui-graphic-design-blog/why-jpeg-xl-ignoring-bit-depth-is-genius
|
|
2025-11-14 10:35:16
|
the person who wrote this blog post is in this server btw
|
|
2025-11-14 10:35:47
|
|
|
|
screwball
|
|
AccessViolation_
the person who wrote this blog post is in this server btw
|
|
2025-11-14 10:37:37
|
thats dope
is everything in that blog post really true?
it sounds almost too good to be true
|
|
|
AccessViolation_
|
2025-11-14 10:46:11
|
it seems it was well received by the core devs in this server so I have to assume it is. I'm not at all familiar enough with the AVIF side of things, but as for claims made about JXL the story seems to hold
|
|
2025-11-14 10:48:11
|
I will say, as I've learned more about image and video formats, especially since my time here, I've learned so many things that seem too *bad* to be true about other formats and standards, so compared to those JXL really is a huge step up
|
|
|
screwball
|
2025-11-15 01:47:29
|
is `photon_noise_iso` the only way to get dithering when displaying a higher bit depth image on an 8 bit display?
|
|
2025-11-15 01:51:06
|
it technically gets the job done, but i dont even know if this is the intended solution
|
|
|
|
ignaloidas
|
2025-11-15 01:52:01
|
libjxl should apply dithering automatically when decoding a higher bit depth image to a lower bit depth?
|
|
|
jonnyawsom3
|
2025-11-15 01:52:13
|
Yeah, dithering to lower bitdepth is enabled by default, photon noise is separate
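The idea of decode-time dithering can be sketched like this: add sub-LSB noise before rounding so smooth gradients don't collapse into bands. This is a white-noise sketch with hypothetical names; libjxl's built-in dither is blue-noise based, which looks better at 1:1 zoom:

```python
import numpy as np

def dither_to_8bit(x, rng=None):
    # x: float array in [0, 1]; quantize to 8 bits with noise injected
    # before rounding, trading banding for fine-grained noise
    rng = rng if rng is not None else np.random.default_rng(0)
    noise = rng.uniform(-0.5, 0.5, size=x.shape)
    return np.clip(np.round(x * 255.0 + noise), 0, 255).astype(np.uint8)
```

A mid-gray value of 0.5 (= 127.5/255, exactly between two 8-bit codes) comes out as a roughly even mix of 127s and 128s instead of snapping entirely to one band.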
|
|
|
screwball
|
|
ignaloidas
libjxl should apply dithering automatically when decoding a higher bit depth image to a lower bit depth?
|
|
2025-11-15 01:52:29
|
ive used 2 different image viewers on different operating systems
neither of them do this
i dont know what im doing wrong lol
|
|
|
username
|
|
screwball
ive used 2 different image viewers on different operating systems
neither of them do this
i dont know what im doing wrong lol
|
|
2025-11-15 01:53:15
|
they might be using some old version of libjxl. try djxl and see what happens maybe?
|
|
|
jonnyawsom3
|
2025-11-15 01:53:17
|
Could be using old libjxl versions, requesting the full bitdepth so libjxl isn't outputting 8bit, or could have disabled the dithering intentionally
|
|
|
|
ignaloidas
|
2025-11-15 01:53:20
|
might be those image viewers misconfiguring libjxl decodes and as such taking float32 data and doing the conversion to lower bit depth themselves?
|
|
|
screwball
|
|
username
they might be using some old version of libjxl. try djxl and see what happens maybe?
|
|
2025-11-15 01:56:12
|
how do i tell djxl that im targeting an 8 bit png?
|
|
2025-11-15 01:56:19
|
it outputs a 16 bit png
|
|
2025-11-15 01:56:29
|
which i know does not dither
|
|
2025-11-15 01:56:34
|
so that doesnt tell me much
|
|
|
ignaloidas
might be those image viewers misconfiguring libjxl decodes and as such taking float32 data and doing the conversion to lower bit depth themselves?
|
|
2025-11-15 01:56:41
|
oooof
|
|
|
Could be using old libjxl versions, requesting the full bitdepth so libjxl isn't outputting 8bit, or could have disabled the dithering intentionally
|
|
2025-11-15 01:57:34
|
the old libjxl thing is weird to me
one of the image viewers ive tested on is Gwenview (linux)
and it literally *just* got jpeg xl support very recently
|
|
|
|
ignaloidas
|
|
screwball
how do i tell djxl that im targeting an 8 bit png?
|
|
2025-11-15 01:57:42
|
add `--bits_per_sample=8`
|
|
|
screwball
|
|
ignaloidas
add `--bits_per_sample=8`
|
|
2025-11-15 01:58:45
|
ok yeah there is some dithering
its not enough dithering but its definitely better
|
|
|
RaveSteel
|
|
screwball
the old libjxl thing is weird to me
one of the image viewers ive tested on is Gwenview (linux)
and it literally *just* got jpeg xl support very recently
|
|
2025-11-15 01:58:54
|
what distro are you on? gwenview has had support for at least 1-2 years now via the kimageformats package
|
|
|
screwball
|
|
RaveSteel
what distro are you on? gwenview has had support for at least 1-2 years now via the kimageformats package
|
|
2025-11-15 01:59:41
|
Cachy OS which is basically Arch with fewer steps
maybe my memory is trolling me but i tried to do jpeg xl stuff not that long ago and it just did not work at all
now it works just like it did on Windows 11
|
|
2025-11-15 01:59:49
|
(though Windows 11 didnt have dithering either)
|
|
|
RaveSteel
|
2025-11-15 02:00:35
|
I checked git, support for JXL was added to gwenview in 2021 even
|
|
|
screwball
|
2025-11-15 02:01:03
|
maybe im delusional then
|
|
|
RaveSteel
|
2025-11-15 02:01:07
|
maybe you had an invalid file
|
|
|
username
|
|
screwball
ok yeah there is some dithering
its not enough dithering but its definitely better
|
|
2025-11-15 02:05:36
|
hmm if possible could you show the file? because I wonder if this is related to this issue: https://github.com/libjxl/libjxl/pull/4516
|
|
|
|
ignaloidas
|
2025-11-15 02:06:14
|
I'm pretty sure that kimageformats ends up always decoding lossy images to float32 because of this? https://github.com/KDE/kimageformats/blob/master/src/imageformats/jxl.cpp#L296
|
|
2025-11-15 02:06:50
|
(I don't see anywhere you'd pass the target bitdepth in the API so I think libjxl just doesn't get an opportunity to dither)
|
|
|
screwball
|
|
username
hmm if possible could you show the file? because I wonder if this is related to this issue: https://github.com/libjxl/libjxl/pull/4516
|
|
2025-11-15 02:07:46
|
im doing a dark gradient test
this is a jxl thats been encoded from a 16 bit png
it *should* have no visible banding
and this is the 8 bit png from djxl
|
|
2025-11-15 02:08:02
|
notice the png still has visible banding
|
|
2025-11-15 02:08:07
|
BUT there is a slight dither
|
|
2025-11-15 02:08:10
|
its just not enough dithering
|
|
|
RaveSteel
|
2025-11-15 02:11:39
|
nomacs and gwenview show the same amount of dithering, both use kimageformats
tev has some banding
mpv has no banding
Adobe's Gain Map Demo App (lol) also shows some banding
|
|
|
screwball
|
|
RaveSteel
nomacs and gwenview show the same amount of dithering, both use kimageformats
tev has some banding
mpv has no banding
Adobe's Gain Map Demo App (lol) also shows some banding
|
|
2025-11-15 02:12:27
|
when you say "some banding", do you mean theres no dithering? or do you mean theres insufficient dithering?
|
|
2025-11-15 02:13:28
|
i should also mention the cjxl/djxl versions im using are master builds
|
|
2025-11-15 02:13:32
|
not 0.11
|
|
|
RaveSteel
|
2025-11-15 02:14:07
|
probably insufficient dithering? But I am not sure, apologies
|
|
|
screwball
|
2025-11-15 02:14:57
|
no worries, its hard to see for me as well
|
|
|
jonnyawsom3
|
|
screwball
im doing a dark gradient test
this is a jxl thats been encoded from a 16 bit png
it *should* have no visible banding
and this is the 8 bit png from djxl
|
|
2025-11-15 02:15:05
|
This is what a modern libjxl decode gives (Discord preview will be ass so remember to click on it twice)
|
|
|
screwball
|
|
This is what a modern libjxl decode gives (Discord preview will be ass so remember to click on it twice)
|
|
2025-11-15 02:17:02
|
yeah this is basically identical to the result i got from djxl
weirdly, depending on the scaling of the image the dithering can appear sufficient
but from most viewpoints it still leaves banding
the sampling/filtering of the image viewer inconsistently hides and reveals banding when you zoom in or out
|
|
|
jonnyawsom3
|
|
screwball
im doing a dark gradient test
this is a jxl thats been encoded from a 16 bit png
it *should* have no visible banding
and this is the 8 bit png from djxl
|
|
2025-11-15 02:19:29
|
That one uses bayer, mine uses blue noise
|
|
2025-11-15 02:19:58
|
Should avoid a lot of the moire at different zooms
|
|
|
screwball
|
|
Should avoid a lot of the moire at different zooms
|
|
2025-11-15 02:20:23
|
ah you know what you're right
its more robust than mine
|
|
2025-11-15 02:20:41
|
i feel like the blue noise needs to be like ever so slightly stronger
|
|
2025-11-15 02:20:49
|
theres still banding visible close up
|
|
2025-11-15 02:20:57
|
but its almost perfect
|
|
|
jonnyawsom3
|
2025-11-15 02:22:33
|
The JXL is lossy isn't it? So the rings of banding between the dithering could just be quantization
|
|
2025-11-15 02:23:29
|
It looks smooth to me until 500% zoom, at which point I can see the individual pixels anyway
|
|
|
screwball
|
|
The JXL is lossy isn't it? So the rings of banding between the dithering could just be quantization
|
|
2025-11-15 02:24:14
|
im not sure what you mean
yes the jxl is lossy, but the original source is 16 bit, so its virtually bandless
when i use photon noise i can push the dithering to the point where the bands are never visible
|
|
|
The JXL is lossy isn't it? So the rings of banding between the dithering could just be quantization
|
|
2025-11-15 02:27:28
|
this is what i mean
i have photon noise at a level where the bands are never visible
|
|
|
lonjil
|
|
screwball
yeah this is basically identical to the result i got from djxl
weirdly, depending on the scaling of the image the dithering can appear sufficient
but from most viewpoints it still leaves banding
the sampling/filtering of the image viewer inconsistently hides and reveals banding when you zoom in or out
|
|
2025-11-15 03:03:49
|
Yyyeah scaling is pretty bad for dither. libjxl's dither can only do so much if an image viewer rescales without its own dither
|
|
2025-11-15 03:04:17
|
Really image viewers should always decode to fp32 and then dither as the last display step
|
|
|
Inner Hollow
|
2025-11-15 08:20:37
|
i just downloaded a sample image from a Hasselblad H6D 100C (403MP pixel-shift image, 17400x23200 res.) and it went from 1.1GB to 6MB, but it took a long time to convert even on effort 1 and quality 70, and it takes ages to open on my old ass pc. still extremely impressive, it still looks amazing
|
|
2025-11-15 08:22:06
|
|
|
2025-11-15 08:24:17
|
if i had a medium format camera (and a good pc) with a 102MP sensor and pixel-shift ability (400MP image output), then i'd just spend time using JXL to optimize these images to perfection
|
|
|
jonnyawsom3
|
2025-11-15 08:32:41
|
At those settings you probably could've just used jpegli and got similar results
|
|
|
<@794205442175402004> as a follow up to that, do you know if there's an easy way to tell if the current channel is a squeeze step during encoding?
Currently we're just setting the predictor to None for progressive lossless, because it does best on the smallest half of the steps. If we could only disable it for that half, the predictors should help for the first few steps where there's still plenty of context
|
|
2025-11-15 11:24:22
|
Ah, that puts a spanner in the works
|
|
|
screwball
|
2025-11-16 04:23:29
|
i am working on a PR
|
|
|
Quackdoc
|
|
Orum
|
2025-11-16 04:55:06
|
hasn't blender had a PR to add <:JXL:805850130203934781> for a long time now?
|
|
2025-11-16 04:55:15
|
like, just waiting for approval?
|
|
2025-11-16 04:56:04
|
maybe that was for reading only though and not writing
|
|
|
jonnyawsom3
|
|
Orum
hasn't blender had a PR to add <:JXL:805850130203934781> for a long time now?
|
|
2025-11-16 05:10:58
|
It had 4 actually, and the devs got so annoyed they threatened to ignore any future PRs
https://projects.blender.org/blender/blender/pulls/143313#issuecomment-1650599
|
|
|
Orum
|
2025-11-16 05:16:40
|
TBF it seems like none of them were good
|
|
|
screwball
|
|
Orum
TBF it seems like none of them were good
|
|
2025-11-16 05:34:51
|
Yeah they're horribly broken and not feature complete
|
|
|
AccessViolation_
|
|
AccessViolation_
hmm I might have an idea why this is happening. those AFV blocks effectively contain an 8x4 block at either the top or bottom, where the cut corner isn't. so where an 8x4 does pretty well, an AFV with the 8x4 in the same spot would do pretty well too
|
|
2025-11-17 03:01:08
|
I tried swapping the `entropy_mul` value of the AFV and DCT8x4 blocks, and indeed the encoder now seems to more properly use the 8x4 and 4x8 instead of abusing the 8x4 sub-block of the AFV blocks. however, the encoded jxl is effectively the same size for both. it's possible there is a quality difference, I haven't tested that as I don't have SSIMULACRA 2 set up currently
(source, before, after)
|
|
2025-11-17 03:02:36
|
I would expect AFV blocks to be more expensive because they encode the three corner pixels separately, in addition to being built up of three sub-blocks, whereas the 8x4 is just two sub-blocks
|
|
2025-11-17 03:03:29
|
pinging <@238552565619359744> since you were working on tuning block selection, maybe these are interesting results
|
|
2025-11-17 03:06:42
|
I might create a build that reads these weights from a TOML file so you can change them on the fly without having to rebuild libjxl
|
|
|
jonnyawsom3
|
2025-11-17 03:07:27
|
For changing blocks, don't trust metrics. They tend to prefer smoothing, even when an area is meant to be sharp
|
|
|
AccessViolation_
I might create a build that reads these weights from a TOML file so you can change them on the fly without having to rebuild libjxl
|
|
2025-11-17 03:08:10
|
And that would be smart... I've basically been doing trial and error, examining results, tweaking decimals then committing and repeating
|
|
|
AccessViolation_
|
2025-11-17 03:12:18
|
I'm curious how these `entropy_mul` values were picked/produced
|
|
|
jonnyawsom3
|
2025-11-17 03:25:59
|
The better the block match to an area, the less entropy. But if the entropy estimate was correct, we wouldn't need multipliers to weight them differently
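As a toy sketch of the idea (hypothetical names and values, not libjxl's actual code), the multiplier just scales the estimated bit cost before the cheapest candidate is picked:

```python
# Hedged sketch, not libjxl's actual code: each candidate transform
# gets an estimated entropy (bits), and a hand-tuned multiplier
# compensates where that estimate is systematically off.
def pick_transform(candidates):
    """candidates: list of (name, estimated_bits, entropy_mul) tuples."""
    return min(candidates, key=lambda c: c[1] * c[2])[0]

# With perfect entropy estimates every multiplier would be 1.0; here an
# invented penalty > 1 on AFV flips the choice to DCT8x4 even though
# AFV's raw estimate is lower.
print(pick_transform([("DCT8x4", 100.0, 0.9), ("AFV", 95.0, 1.2)]))
```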
|
|
|
AccessViolation_
|
2025-11-17 03:27:02
|
gotcha gotcha
|
|
|
jonnyawsom3
|
2025-11-17 03:27:31
|
So I wonder how much is related to wrong multipliers, and how much is wrong entropy estimation
|
|
|
AccessViolation_
|
2025-11-17 03:27:57
|
let me set them both to 1 to see what happens
|
|
|
jonnyawsom3
|
2025-11-17 03:28:42
|
In my experience, a whole lot of 2x2 blocks
|
|
2025-11-17 03:28:56
|
Oh right, you mean AFV and 8x4
|
|
|
AccessViolation_
|
2025-11-17 03:36:33
|
I feel like I should have averaged them instead of setting them both to 1 because the penalty is too high now and it doesn't use either, it uses DCT8x16 blocks instead
|
|
|
jonnyawsom3
|
2025-11-17 03:37:31
|
Yeahh, the problem is each weight affects every other weight, including larger blocks. So usually you end up with none of one and way too much of another
|
|
|
AccessViolation_
|
2025-11-17 03:38:13
|
|
|
2025-11-17 03:38:43
|
chaos
|
|
|
Yeahh, the problem is each weight affects every other weight, including larger blocks. So usually you end up with none of one and way too much of another
|
|
2025-11-17 03:41:39
|
do they actually affect each other like that, or is it just that changing one weight naturally affects whether others will fill its place? for example, I might not have touched the weight of the 8x16 block, but it will naturally be more likely to be chosen now that the other two are less likely to be chosen
|
|
2025-11-17 03:42:26
|
if they're actually a web of interdependent weights that sounds like hell. time to throw in some machine learning? <:KekDog:805390049033191445>
|
|
|
jonnyawsom3
|
2025-11-17 03:43:46
|
Naturally... I think... It hurt my brain last time I tried to figure out the entropy code
|
|
|
AccessViolation_
|
2025-11-17 03:46:42
|
I see
|
|
2025-11-17 03:48:30
|
alright, I'll create a branch that reads these weights (and other useful parameters, let me know which) from a TOML file. then maybe after that I can script together a pipeline that lets you test every change against a corpus using a quality metric
|
|
|
jonnyawsom3
|
|
AccessViolation_
alright, I'll create a branch that reads these weights (and other useful parameters, let me know which) from a TOML file. then maybe after that I can script together a pipeline that lets you test every change against a corpus using a quality metric
|
|
2025-11-17 03:55:29
|
You want the 8x8s, the bigger blocks, info_loss_multiplier and the ones under it, maybe kPow1, kPow2 and kPow3 too. Some of those might not need changing, but worth having the options
|
|
2025-11-17 03:55:46
|
You can see most of them here <https://github.com/libjxl/libjxl/pull/4506/files>
|
|
|
AccessViolation_
|
2025-11-17 04:05:09
|
`cargo add serde_toml` oh wait this is c, I better go purchase some punch cards
|
|
|
Quackdoc
|
|
AccessViolation_
|
|
AccessViolation_
`cargo add serde_toml` oh wait this is c, I better go purchase some punch cards
|
|
2025-11-18 08:32:56
|
it only took me 6 hours of figuring out how C, libjxl and its selection of build systems work to add another library
|
|
2025-11-18 08:33:16
|
I have never wanted a rust-based encoder more in my life
|
|
2025-11-18 08:33:28
|
luca, name your price, any price
|
|
|
|
veluca
|
|
AccessViolation_
luca, name your price, any price
|
|
2025-11-18 08:45:57
|
a timeturner
|
|
2025-11-18 08:46:23
|
I'll get to it eventually π
|
|
|
AccessViolation_
|
|
veluca
a timeturner
|
|
2025-11-18 08:47:44
|
I'll see what I can do
|
|
|
lonjil
|
2025-11-18 08:50:25
|
Nothing stopping us discord randos from starting the work π
|
|
|
AccessViolation_
|
2025-11-18 08:51:33
|
that's not a bad idea
|
|
2025-11-18 08:53:03
|
me personally, I'm not confident enough to set a good design for the foundations in stone, but I'm looking forward to tinkering with it once the foundations are there ^^
|
|
|
|
veluca
|
2025-11-18 09:06:54
|
tbh at the moment I have enough to do that I'm pretty sure splitting my attention to also keep an eye on an encoder would be a bad idea
|
|
2025-11-18 09:08:18
|
but maybe we should spend a bit of time to create a list of smallish things that still need to be done in the decoder and are roughly independent of each other, for people that might want to contribute -- <@206628065147748352> has been a great help so far π
|
|
|
TheBigBadBoy - πΈπ
|
2025-11-18 09:57:22
|
so JXL does not support DPI info (like the pHYs chunk in PNG), and someone on another server really wanted to retain this info.
I presented this solution: basically use `exiftool` to extract the pHYs info, write it to an external .txt file, `cjxl` all the PNGs, and to recover the DPI info simply decode back to PNG and reuse exiftool:
```sh
exiftool -p '$PixelsPerUnitX $PixelsPerUnitY $PixelUnits' input.png > pHYs.txt
# now you can convert to jxl without worrying about lost DPI info
# now, decode JXL->PNG then apply this
set $(<pHYs.txt) # put stuff in variables $1 $2 $3
exiftool -overwrite_original -PixelsPerUnitX=$1 -PixelsPerUnitY=$2 -PixelUnits=$3 decoded.png
```
Is there a better/easier way to retain DPI info in JXL?
|
|
|
Laserhosen
|
2025-11-18 10:39:35
|
Could store the whole exif data in an Exif box?
|
|
2025-11-18 10:40:51
|
In fact doesn't cjxl do that by default? And djxl should preserve it.
|
|
2025-11-18 10:43:14
|
Oh, it's a pHYs chunk, not in exif.
|
|
2025-11-18 10:43:29
|
But maybe it could be turned into an exif box manually
|
|
|
TheBigBadBoy - πΈπ
|
2025-11-18 10:46:05
|
would setting `XResolution, YResolution, ResolutionUnit` in Exif be enough?
|
|
|
Laserhosen
|
2025-11-18 10:50:30
|
Relevant: https://github.com/libjxl/libjxl/issues/2641
|
|
|
TheBigBadBoy - πΈπ
|
2025-11-18 10:54:45
|
thanks
|
|
2025-11-18 10:54:59
|
yeah seems like these 3 Exif tags should be enough <https://pmt.sourceforge.io/exif/drafts/history/d009.html>
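one wrinkle, assuming I've got the units right: pHYs counts pixels per metre, while those Exif resolution tags conventionally use inches, so the values need converting when mapping one to the other:

```python
# pHYs stores pixels per *metre* (when PixelUnits is metric), while
# Exif XResolution/YResolution conventionally use pixels per *inch*
# (ResolutionUnit = inches). 1 inch = 0.0254 m.
def ppm_to_dpi(pixels_per_metre):
    return round(pixels_per_metre * 0.0254)

print(ppm_to_dpi(11811))  # the pHYs value a 300 DPI PNG typically carries
```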
|
|
|
jonnyawsom3
|
2025-11-18 11:52:31
|
Specifically https://github.com/libjxl/libjxl/issues/817
|
|
|
Jyrki Alakuijala
|
|
In my experience, a whole lot of 2x2 blocks
|
|
2025-11-19 10:51:52
|
2x2 is not bad for high quality, it just doesn't have the right feel to it because of its name/simplicity
|
|
2025-11-19 10:53:30
|
All those block selection heuristics I chose by eyeballing, no metric was useful there ... because of that, AC strategy development eventually became mind-numbing
|
|
|
AccessViolation_
|
2025-11-19 10:58:11
|
I turned an entire image into 2x2 blocks out of curiosity and it didn't get *that* much larger to my surprise
|
|
2025-11-19 11:02:36
|
|
|
|
Jyrki Alakuijala
All those block selection heuristics I chose by eyeballing, no metric was useful there ... because of it ac strategy development become mind numbing eventually
|
|
2025-11-19 11:04:41
|
why were metrics not useful?
|
|
2025-11-19 11:09:47
|
I wanted to experiment with a machine learning approach to arrive at good values for these, using a large image corpus and SSIMULACRA 2, but if metrics weren't useful that might not work...
|
|
|
jonnyawsom3
|
2025-11-19 11:13:15
|
The issue is we want to avoid over smoothing, which metrics tend to prefer
|
|
|
AccessViolation_
|
|
A homosapien
|
2025-11-19 11:21:09
|
we need some kind of high-frequency bias
|
|
|
jonnyawsom3
|
2025-11-19 11:22:17
|
0.8 used to have one, with the blurring/noise caused by each block type allowing you to adjust the result
|
|
|
Lumen
|
|
A homosapien
we need some kind of high-frequency bias
|
|
2025-11-19 11:26:28
|
there is one in butteraugli
|
|
2025-11-19 11:26:39
|
there is an "hf_asymetry" parameter
|
|
2025-11-19 11:26:49
|
in vship it isn't exposed at runtime but you could modify the code easily
|
|
|
A homosapien
|
2025-11-19 11:27:07
|
I'm assuming it's related to max norm error?
|
|
2025-11-19 11:27:19
|
Or is it something else entirely?
|
|
|
Lumen
|
|
A homosapien
I'm assuming it's related to max norm error?
|
|
2025-11-19 11:28:03
|
it is different
|
|
2025-11-19 11:28:05
|
it is more internal
|
|
2025-11-19 11:28:17
|
it balances the importance of the higher-frequency plane measures
|
|
2025-11-19 11:28:34
|
butteraugli splits the initial planes into 3-4 frequencies (B only gets 3)
|
|
2025-11-19 11:28:57
|
and each frequency sort of has its own pipeline
|
|
2025-11-19 11:29:10
|
hf_asymetry balances the importance of these pipelines, I believe
|
|
|
A homosapien
|
2025-11-19 11:32:42
|
hmm
|
|
2025-11-19 11:33:20
|
I wonder if this parameter is in libjxl's encoding internals
|
|
|
Lumen
|
2025-11-19 11:34:45
|
vship - libjxl
|
|
|
A homosapien
|
2025-11-19 11:41:00
|
I think this could be used at higher effort levels <:Thonk:805904896879493180>
|
|
2025-11-19 11:43:26
|
Since butteraugli iterations tend to make the image blurrier
|
|
|
Jyrki Alakuijala
|
|
Lumen
hf_asymetry balance the importance of these pipeline I believe
|
|
2025-11-19 11:55:15
|
I never could get that working as well as I wanted. It was like a 20% solution to a 110% problem.
|
|
|
Lumen
|
|
Jyrki Alakuijala
I never could get that working as well as i wanted. It was like a 20% solution to a 110% problem.
|
|
2025-11-19 11:55:31
|
I see, I never tried it so I didn't know
|
|
|
Jyrki Alakuijala
|
|
A homosapien
we need some kind of high-frequency bias
|
|
2025-11-19 11:56:52
|
Correct. But it is a lesser sin to blur than to create artefacts.
|
|
|
AccessViolation_
why were metrics not useful?
|
|
2025-11-19 12:01:04
|
Metrics were not useful because they converge to overuse of 32x32 etc with loads of ringing
|
|
|
A homosapien
|
|
Jyrki Alakuijala
Correct. But it is a lesser sinn to blur than to create artefacts.
|
|
2025-11-19 12:03:09
|
Libjxl is already too blurry π
|
|
|
AccessViolation_
|
2025-11-19 12:04:54
|
would it help if the code had some sort of internal metric that
- checked for added contrast in local low contrast areas (ringing) within a block
- checked for reduced contrast in local high-contrast areas (smoothing) within a block
that it could reference in its heuristics
this should also not care as much about ringing in areas that are already high contrast which should be less noticeable anyway. might also be fairly cheap?
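in toy form, per block, something like this (all thresholds and scoring invented, just to make the idea concrete):

```python
# Very rough sketch of the idea above: flag contrast that *appeared* in
# a flat area (ringing) or contrast that *vanished* from a busy area
# (smoothing). flat_thresh and the scoring are invented.
def block_artifact_score(orig, coded, flat_thresh=4):
    contrast = lambda px: max(px) - min(px)
    c0, c1 = contrast(orig), contrast(coded)
    if c0 < flat_thresh and c1 > c0:
        return c1 - c0  # ringing: contrast added where there was none
    if c0 >= flat_thresh and c1 < c0:
        return c0 - c1  # smoothing: contrast lost where it mattered
    return 0

# A nearly flat strip that came back with ripples scores high:
print(block_artifact_score([10, 10, 11, 10], [10, 3, 18, 10]))
```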
|
|
2025-11-19 12:05:31
|
just thinking aloud
|
|
|
Jyrki Alakuijala
|
|
AccessViolation_
would it help if the code had some sort of internal metric that
- checked for added contrast in local low contrast areas (ringing) within a block
- checked for reduced contrast in local high-contrast areas (smoothing) within a block
that it could reference in its heuristics
this should also not care as much about ringing in areas that are already high contrast which should be less noticeable anyway. might also be fairly cheap?
|
|
2025-11-19 12:32:11
|
Yes, that's what i tried adding in butteraugli. It turned out that scale and direction of masking started to matter for making reasonable decisions. Otherwise the benefit was small.
|
|
2025-11-19 12:34:04
|
An edge can be smooth in one direction and sharp in another. Just measuring the sharpest edge will allow compression to break the smoothness along the edge. This didn't look good and was difficult to control.
|
|
|
Lumen
|
2025-11-19 12:34:32
|
butteraugli is sadly already the heaviest metric we got
|
|
2025-11-19 12:34:43
|
now even CVVDP is 1.5x faster than Butteraugli
|
|
2025-11-19 12:35:06
|
(both on GPU)
|
|
|
Jyrki Alakuijala
|
2025-11-19 12:39:28
|
On masking, scale matters a lot. Features/noise of size L masks other features roughly 0.5L to L, but does nothing for 4L sized features. There is no scale specific masking in butteraugli β another compromise between time and quality.
|
|
|
AccessViolation_
|
2025-11-19 12:52:49
|
if a good but slow metric is feasible, it might be nice to have for running against a large image corpus once, and using the resulting good parameters in the encoder, rather than using the metric itself in the encoder. isn't that also how quantization tables were generated for JPEG 1?
|
|
|
Lumen
|
2025-11-19 12:57:08
|
if you were to modify butteraugli, don't hesitate to tell me, I ll update Vship too
|
|
|
jonnyawsom3
|
|
Jyrki Alakuijala
Metrics were not useful because they converge to overuse of 32x32 etc with loads of ringing
|
|
2025-11-19 01:05:51
|
I was considering retuning the DCT weights from scratch, as the 8x8 types have vastly different entropy costs. Even when one might fit perfectly, another still gets chosen. While still using your parameters as a base, I think I've solved a lot of the 8x8s now, but I still need to give less weight to the larger merges. On Main, libjxl tends to do either 8x8 or 64x64 with not much in-between
|
|
2025-11-19 02:06:41
|
Just had a random thought. In cjxl there's a lesser known option to resample extra channels such as Alpha, so I gave it a shot to see if it could be useful as a default for lossy encoding. Unfortunately, it blurs too much for any hard edges. *But* <@794205442175402004> added resampling weights for NN upsampling a while back
For lossy, if we could default to 2x NN resampling on Alpha, I *think* we could get away with some nice savings without much image degradation
The image I'm testing with is 20% smaller with Alpha resampled, but `--upsampling_mode` only works with `--already_downsampled` currently
|
|
2025-11-19 02:09:09
|
Though, most Alpha tends to be blurry anyway, so probably best if we just made it an option with `--upsampling_mode` and maybe `--downsampling_mode` so we can do NN entirely in cjxl/libjxl
|
|
|
_wb_
|
2025-11-19 02:30:15
|
if it's blurry, NN will make it pixelated, default upsample should look nicer
|
|
|
jonnyawsom3
|
2025-11-19 02:47:01
|
In this instance my image has sharp Alpha cutoff around the edge of objects, so default is blurring the edge and causing more color to become transparent, while also removing small areas of transparency entirely due to only being a pixel or two thick. Strangely `--ec_resampling 8` makes the image entirely black, and I have no idea why
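The difference in toy form (1D sketch, not libjxl's actual filters): 2x nearest-neighbour just repeats samples, while a smooth upsampler invents intermediate alpha values along the cutoff

```python
# Sketch of why NN suits hard alpha cutoffs: NN keeps the edge binary,
# a linear filter produces half-transparent fringe pixels.
def upsample_nn_2x(row):
    return [v for v in row for _ in range(2)]

def upsample_linear_2x(row):
    out = []
    for a, b in zip(row, row[1:] + row[-1:]):  # clamp at the right edge
        out += [a, (a + b) // 2]
    return out

hard_edge = [0, 0, 255, 255]
print(upsample_nn_2x(hard_edge))      # edge stays binary
print(upsample_linear_2x(hard_edge))  # a 127 fringe value appears
```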
|
|
|
_wb_
|
2025-11-19 04:17:13
|
probably if you have sharp alpha edges, it's best to not downsample it at all...
|
|
|
spider-mario
|
|
_wb_
if it's blurry, NN will make it pixelated, default upsample should look nicer
|
|
2025-11-19 06:59:44
|
maybe we can downsample with NN during encoding but leave the upsampling to the default?
|
|
2025-11-19 07:01:13
|
ah, I guess having two separate options already encompasses this
|
|
|
jonnyawsom3
|
2025-11-20 04:14:12
|
Yeah, just fun ideas. Controlling both downsampling and upsampling in cjxl would mean more flexibility for different content types
|
|
|
Jyrki Alakuijala
|
|
I was considering retuning the DCT weights from scratch, as the 8x8 types have vastly different entropy costs. Even when one might fit perfectly, another still gets chosen. While still using your parameters as a base, I think I've solved a lot of the 8x8s now, but I still need to give less weight to the larger merges. On Main, libjxl tends to do either 8x8 or 64x64 with not much in-between
|
|
2025-11-20 08:39:57
|
It is my fault, or possibly even slightly intentional.
|
|
|
jonnyawsom3
|
2025-11-20 08:41:58
|
It can always be tuned with time, but my changes should bring it in-line with 0.8 results if I'm accurate enough. Worst case, we may need help to reimplement some of the lost parameters
|
|
|
Jyrki Alakuijala
|
|
It can always be tuned with time, but my changes should bring it in-line with 0.8 results if I'm accurate enough. Worst case, we may need help to reimplement some of the lost parameters
|
|
2025-11-20 08:43:23
|
Wonderful!
|
|
|
AccessViolation_
|
2025-11-20 10:12:23
|
[Improvement of JPEG XL Lossy Image Coding Using Region Adaptive DCT Block Partitioning Structure](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9505599)
|
|
|
AccessViolation_
[Improvement of JPEG XL Lossy Image Coding Using Region Adaptive DCT Block Partitioning Structure](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9505599)
|
|
2025-11-20 10:15:06
|
|
|
|
Quackdoc
|
2025-11-20 11:36:49
|
interesting
|
|
|
jonnyawsom3
|
|
In this instance my image has sharp Alpha cutoff around the edge of objects, so default is blurring the edge and causing more color to become transparent, while also removing small areas of transparency entirely due to only being a pixel or two thick. Strangely `--ec_resampling 8` makes the image entirely black, and I have no idea why
|
|
2025-11-21 07:22:03
|
Okay yeah, it didn't work. NN just turns everything into squares, but non-separable blurs too large of an area. Maybe a single squeeze step would do the trick
|
|
|
AccessViolation_
[Improvement of JPEG XL Lossy Image Coding Using Region Adaptive DCT Block Partitioning Structure](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9505599)
|
|
2025-11-21 07:58:11
|
We were wondering why it was never submitted to libjxl. Seems like it was developed for a paid product
https://doublebench.com/doublebench-software/#db-software-tabs|0
|
|
|
_wb_
|
2025-11-21 08:20:44
|
Oh interesting, I hadn't seen that. Oh-Jin Kwon is a regular at JPEG meetings, he did at some point (years ago) present these block partitioning ideas in the JXL adhoc group, but he never mentioned they are making a proprietary jxl encoder.
|
|
|
jonnyawsom3
|
2025-11-21 08:43:09
|
There's a patent listed, but it doesn't seem relevant to the block selection. If it could still be implemented, it would be roughly a 5x encoding speedup, but tuning would be needed for larger block sizes
|
|
|
username
|
|
AccessViolation_
[Improvement of JPEG XL Lossy Image Coding Using Region Adaptive DCT Block Partitioning Structure](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9505599)
|
|
2025-11-21 08:47:38
|
I distinctly remember seeing and reading this a few years back, but Discord says no one else has posted this URL in this server before. And since I opened the link in my browser, the page's "last viewed" date for me got updated, so now I can't even check when I last saw it
|
|
2025-11-21 08:56:50
|
ah here we go I managed to scavenge the date out of an old backup. It was September of 2024 when I last viewed that paper
|
|
|
AccessViolation_
|
|
username
I distinctly remember seeing and reading this a few years back but Discord says no one else has posted this URL in this server before and since I opened the link in my browser the page's "last viewed" date for me got updated so now I can't even check when I last saw it now
|
|
2025-11-21 09:27:18
|
I've posted it before
|
|
2025-11-21 09:27:35
|
though that might have been from a different source hence why you can't find the link twice
|
|
2025-11-21 09:28:06
|
I posted it here now because we were talking about it in vc
|
|
2025-11-21 09:29:44
|
|
|
2025-11-21 09:39:43
|
from my understanding libjxl first creates only 8x8 tiles and sees if there are any that can be merged, meaning you can end up with large blocks offset by any number of 8x8 blocks in both directions, whereas this paper starts with large blocks and subdivides them in a quad tree-like fashion, which is less optimal because then large blocks have to stay aligned. hopefully we can retain the merging behavior of libjxl rather than the partitioning behavior of this paper, but still use the metrics that this paper uses somehow
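in toy form, the alignment difference looks like this (ignoring the HF-group boundary constraint; just illustrating the origins each approach can produce):

```python
# Sketch of the alignment difference described above: bottom-up merging
# of 8x8 tiles can yield a large block at any 8-px-aligned origin,
# while top-down quad-tree splitting only yields origins aligned to the
# block's own size.
def quadtree_origin_ok(x, y, size):
    return x % size == 0 and y % size == 0

def merge_origin_ok(x, y, size):
    return x % 8 == 0 and y % 8 == 0

# A 32x32 block starting at x=8 is fine for merging, impossible for
# the paper's quad-tree partitioning:
print(merge_origin_ok(8, 0, 32), quadtree_origin_ok(8, 0, 32))
```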
|
|
|
There's a patent listed, but it doesn't seem relevant to the block selection. If it could still be implemented, it would be roughly a 5x encoding speedup, but tuning would be needed for larger block sizes
|
|
2025-11-21 09:41:55
|
if this turns out to be a problem, I propose we patent it too and simply disagree with the rejection <:Stonks:806137886726553651>
|
|
|
jonnyawsom3
|
|
AccessViolation_
from my understanding libjxl first creates only 8x8 tiles and sees if there are any that can be merged, meaning you can end up with large blocks offset by any number of 8x8 blocks in both directions, whereas this paper starts with large blocks and subdivides them in a quad tree-like fashion, which is less optimal because then large blocks have to stay aligned. hopefully we can retain the merging behavior of libjxl rather than the partitioning behavior of this paper, but still use the metrics that this paper uses somehow
|
|
2025-11-21 09:45:47
|
If it were expanded up to the 256x256 maximum though, the blocks would have to be aligned anyway, no?
|
|
|
AccessViolation_
|
2025-11-21 09:50:07
|
yes that's true
|
|
2025-11-21 09:51:30
|
but as it stands, the logic in that paper will never create two blocks of the same size that aren't aligned, like here
|
|
2025-11-21 10:00:50
|
they have to stay aligned to a multiple of their own dimensions, basically it means blocks can never cross these boundaries (except the smaller internal boundaries, by just not subdividing at all)
|
|
2025-11-21 10:15:21
|
actually I'm not sure if blocks are allowed to be 8 px aligned, I've only found examples of 16 px aligned blocks, so I'm reading up on that
|
|
|
|
veluca
|
2025-11-21 10:27:45
|
can be 8px aligned
|
|
2025-11-21 10:27:53
|
but not sure the encoder tries to do that
|
|
|
AccessViolation_
|
2025-11-21 10:29:52
|
> Any segmentation is allowed that satisfies the following
> constraints: the whole frame region is covered by blocks,
> none of the blocks overlap, and none of the blocks cross a
> boundary between HF groups.
|
|
2025-11-21 10:29:58
|
ah you were faster
|
|
|
|
veluca
|
2025-11-21 10:30:34
|
tbf I just remember it π
|
|
|
AccessViolation_
|
2025-11-21 10:31:02
|
I assumed so haha
|
|
2025-11-21 10:32:19
|
this is me with jxl
https://tenor.com/view/adventure-time-demon-cat-i-have-approximate-knowledge-of-many-things-wise-wisdom-gif-4593239
|
|
2025-11-21 10:43:33
|
if blocks were allowed to cross the bottom and right group boundaries, that could've made for some interesting optimizations. if a 64x64 block is the best fit for a 64x56 section at the bottom right of the image, then that 8x64 out-of-bounds strip could be ignored during the transform - treating those samples as "don't care", if that's possible. they'd resolve to pixels, but it doesn't matter which pixels, and the decoder would not render them
|
|
2025-11-21 10:45:02
|
that would've allowed for even more refined alignment of blocks to image features, because then an encoder wouldn't have to worry about "if I put a large block *here* then that would prevent me from using a large block *here*," which sounds like a computationally hard problem
|
|
|
|
veluca
|
2025-11-21 10:57:40
|
would also have made the life of decoder writers much harder π
|
|
|
AccessViolation_
|
2025-11-21 11:02:52
|
hmm how so? the logic for decoding the blocks themselves wouldn't change, and the signaling of blocks wouldn't change other than loosening restrictions. you could decode all the out-of-bounds pixel data as if you needed it, like normal, and crop the group back to 256x256 at the end
unless there are many subtle complexities that I'm not aware of (which there probably are)
|
|
|
jonnyawsom3
|
|
AccessViolation_
[Improvement of JPEG XL Lossy Image Coding Using Region Adaptive DCT Block Partitioning Structure](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9505599)
|
|
2025-11-21 12:40:43
|
Maybe I was thinking of this when I said I saw another DCT paper in the code? <https://github.com/libjxl/libjxl/blob/a85895838affa3c52bf362d0f0078af54ece8a7a/lib/jxl/dct-inl.h#L42>
|
|
|
AccessViolation_
|
|
veluca
would also have made the life of decoder writes much harder π
|
|
2025-11-21 04:10:42
|
oop I didn't take LF into account
|
|
2025-11-21 04:11:08
|
I should read before I yap sometimes
|
|
|
pshufb
|
2025-11-21 06:51:35
|
Is there any sort of coding trick that might let a patch be applied in a mirrored or flipped fashion? E.g. presenting them in an additional frame with a somehow modified orientation?
|
|
|
jonnyawsom3
|
2025-11-21 07:09:13
|
Patches can't do rotation or transformation to keep decoders simple, just location and blend modes
|
|
|
pshufb
|
|
Patches can't do rotation or transformation to keep decoders simple, just location and blend modes
|
|
2025-11-21 07:12:27
|
Unfortunate, but thank you. Any suggestions for something that might be able to approximate a rotated patch as a hack? The specific use-cases I have in mind are very simple, symmetrical images. Think e.g. 🇬🇧 or (for something much scarier because the "patches" are rotated by what should be ~72 degrees) 🇭🇰 .
|
|
|
Jyrki Alakuijala
|
|
AccessViolation_
|
|
2025-11-21 08:58:56
|
where is the PR π
|
|
|
AccessViolation_
|
2025-11-21 08:59:32
|
in a proprietary commercial encoder, unfortunately :(
|
|
2025-11-21 09:00:09
|
|
|
|
juliobbv
|
|
Jyrki Alakuijala
where is the PR π
|
|
2025-11-21 09:00:29
|
I skimmed the paper, and it's def not a silver bullet
|
|
2025-11-21 09:00:43
|
it makes some textures look too smooth
|
|
|
jonnyawsom3
|
2025-11-21 09:01:04
|
I was going to say, maybe Jon could ask them about rebasing it on modern libjxl next time they meet, but doublebench might put a spanner in the works. Along with not handling higher DCT sizes or offset blocks like Access mentioned earlier
|
|
|
Jyrki Alakuijala
|
2025-11-21 09:01:21
|
it is very easy to get a ~10% butteraugli improvement in libjxl, it just looks smooth/artefacty
|
|
|
juliobbv
|
2025-11-21 09:01:26
|
|
|
2025-11-21 09:02:17
|
I'm sure the community can cook up something better π
|
|
|
Jyrki Alakuijala
|
2025-11-21 09:02:23
|
larger dcts are not great when there is a bit of flat area in the block (ringing propagating to the flat area)
|
|
|
juliobbv
|
|
Jyrki Alakuijala
|
2025-11-21 09:02:38
|
but larger dcts interpolate geometry better
|
|
2025-11-21 09:02:59
|
a combo of both large and small dcts would work -- unfortunately we were not able to build such an encoder yet
|
|
2025-11-21 09:03:39
|
for example a combo of 8x8 and 64x64, with entropy minimization
|
|
|
AccessViolation_
|
2025-11-21 09:04:12
|
come to think of it I have no clue how I came across this paper originally by the way. I might have just been looking up random research about JXL at the time
|
|
2025-11-21 09:04:37
|
because this was before The Paper, so there wasn't a lot of written information except the spec
|
|
2025-11-21 09:05:50
|
I wonder if there's more publications about compression techniques that we don't know about either
|
|
|
jonnyawsom3
|
2025-11-21 09:08:06
|
Still very much a WIP, but while on the topic, a very quick demo of my DCT tuning so far. A lot less ringing, better preserved details (in some areas, still needs work) and smaller size
Original, Main, WIP
|
|
|
AccessViolation_
|
|
juliobbv
|
|
2025-11-21 09:08:49
|
the top-right one looks a bit like what you get when you signal that the image should be 2x upsampled
|
|
|
juliobbv
|
|
AccessViolation_
the top-right one looks a bit like what you get when you signal that the image should be 2x upsampled
|
|
2025-11-21 09:09:27
|
yeah, I think my preference would go: proposed, original, original
|
|
|
jonnyawsom3
|
2025-11-21 09:09:50
|
At 0.5 bpp, there's a chance it was. I can't remember what the threshold used to be, but a while back I made it so that >= Distance 10 enables resampling 2
|
|
|
AccessViolation_
|
|
Still very much a WIP, but while on the topic, a very quick demo of my DCT tuning so far. A lot less ringing, better preserved details (in some areas, still needs work) and smaller size
Original, Main, WIP
|
|
2025-11-21 09:16:52
|
definitely an improvement with it not removing basically all detail to the left of the pillar
|
|
2025-11-21 09:18:36
|
both of them are fine completely removing any evidence of this bolt in the panel though <:KekDog:805390049033191445>
|
|
|
jonnyawsom3
|
2025-11-21 09:21:06
|
I could be hallucinating it after staring at DCT blocks all day but... Is that green more vibrant in the tweaked version?
|
|
|
spider-mario
|
2025-11-21 09:24:01
|
I'm not sure which is which but the one on the left seems slightly brighter
|
|
2025-11-21 09:24:21
|
(opening them in libjxl's comparison tool)
|
|
|
jonnyawsom3
|
2025-11-21 09:24:45
|
Would help if I posted the original too
|
|
|
spider-mario
|
2025-11-21 09:26:09
|
yeah, left one closer
|
|
|
jonnyawsom3
|
2025-11-21 09:27:28
|
I'll have to look into that later
|
|
|
spider-mario
|
2025-11-21 09:27:39
|
which one was the tweaked version?
|
|
|
jonnyawsom3
|
2025-11-21 09:28:14
|
The right
|
|
|
AccessViolation_
|
2025-11-21 09:31:25
|
that brightest part of the line that's two (in-game) pixels across, it's fairly homogeneous in the first one, but in the second one the lower row of pixels is a bit darker than the row above it
|
|
2025-11-21 09:31:51
|
I honestly only see the difference in a flicker test or pixel peeping
|
|
|
jonnyawsom3
|
2025-11-21 09:35:02
|
Probably ringing from it choosing an 8x8 instead of an 8x4 or something similar
|
|
|
monad
|
|
Still very much a WIP, but while on the topic, a very quick demo of my DCT tuning so far. A lot less ringing, better preserved details (in some areas, still needs work) and smaller size
Original, Main, WIP
|
|
2025-11-21 09:52:41
|
smaller size + lower fidelity isn't revelatory, but I wonder if such high distances can inform much of practical value anyway
|
|
|
AccessViolation_
|
2025-11-21 09:53:32
|
wonder if the green bar happens to be aligned such that it fits neatly in a single DCT8x32 :)
|
|
2025-11-21 09:54:21
|
it looks to be a bit too long sadly
|
|
|
monad
|
2025-11-21 09:59:46
|
oh, looks like the image is scaled up
|
|
2025-11-21 10:01:01
|
so just lower fidelity in that region, don't know about the rest of the image
|
|
|
jonnyawsom3
|
|
monad
smaller size + lower fidelity isn't revelatory, but I wonder if such high distances can inform much of practical value anyway
|
|
2025-11-21 10:02:23
|
That was distance 1, and yes the image is scaled up to make it easier to see. It's a very fine balance between the blocks, so 2 steps forward, 1 step back. Just need a lot more time to figure out which block type fits best and why it's not being used, etc.
|
|
|
monad
|
|
I could be hallucinating it after staring at DCT blocks all day but... Is that green more vibrant in the tweaked version?
|
|
2025-11-21 10:27:26
|
hm, again main is much better here. not seeing the claimed improvement in the demos, but still wishing you luck
|
|
|
AccessViolation_
|
2025-11-21 10:32:23
|
the pressed train button has borders that have some color in the original which the tuned version retains and main makes gray
|
|
2025-11-21 10:37:11
|
main, tweaked, original
|
|
|
monad
|
2025-11-21 10:48:38
|
yes, and the orange wire looks better in the tweaked. but essentially anywhere else you can inspect the color and geometry is better retained by main. I don't expect these improved pixels are perceptible at practical scale anyway
|
|
|
Orum
|
2025-11-22 05:08:58
|
is there a way to make animated JXLs? <:Thonk:805904896879493180>
|
|
|
jonnyawsom3
|
2025-11-22 05:10:33
|
APNG and cjxl or FFMPEG
|
|
|
Orum
|
2025-11-22 05:11:14
|
the only way to do it with cjxl is apng input? π©
|
|
|
_wb_
|
2025-11-22 05:18:48
|
Or gif
|
|
|
Orum
|
2025-11-22 05:22:36
|
it takes forever just to make an apng
|
|
|
AccessViolation_
|
2025-11-22 05:36:42
|
256 colors are more than enough
|
|
|
Orum
|
2025-11-22 05:41:37
|
man, ffmpeg does *not* like apng input
|
|