|
Scope
|
|
**Patches Tuning Corpus**
<https://docs.google.com/spreadsheets/d/14dIfETkUkocEuDOhGQI24tiKnnpTMKSp1qlTtmXOD74/>
`PatchesCorpus.7z`
<https://drive.google.com/file/d/1o48eAuEaGn-EO5kiTJvJ-iFzHQ8KcqRd/>
|
|
2021-08-14 11:49:05
|
|
|
2021-08-14 11:53:01
|
-
```
6,305 - s9 E3 I1 (master build)
5,987 - s9 E3 I1 (improved patches)
4,204 - s9 E3 I1 patches=0
```
https://i.redd.it/qu0dqaxauoh41.png
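
To put the three sizes above in proportion, here is a small arithmetic sketch. It assumes the figures are file sizes in the same unit (the chat does not state the unit), and the dictionary keys are labels made up for this example.

```python
# Sizes quoted above; the unit is not stated in the chat (assumed identical for all three).
sizes = {
    "master": 6305,            # s9 E3 I1, master build
    "improved_patches": 5987,  # s9 E3 I1, improved patches heuristic
    "patches_off": 4204,       # s9 E3 I1, patches=0
}


def overhead(size, baseline):
    """Relative size increase over the baseline, as a fraction."""
    return (size - baseline) / baseline


base = sizes["patches_off"]
master_overhead = overhead(sizes["master"], base)
improved_overhead = overhead(sizes["improved_patches"], base)

# Share of the master-vs-no-patches gap that the improved heuristic removes.
gap_closed = (sizes["master"] - sizes["improved_patches"]) / (sizes["master"] - base)

print(f"master vs patches=0:   +{master_overhead:.1%}")
print(f"improved vs patches=0: +{improved_overhead:.1%}")
print(f"gap closed:            {gap_closed:.1%}")
```

For this one image, both builds with patches stay well above the size with patches disabled, and the improved heuristic removes only a small part of that gap.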
|
|
2021-08-14 11:55:40
|
Lots of repetitive elements, but the patches are still inefficient
|
|
|
|
veluca
|
2021-08-14 11:57:26
|
I suspect that's because of many unnecessary patches
|
|
2021-08-14 11:57:44
|
I really need a better heuristic...
|
|
|
Scope
|
2021-08-14 12:00:47
|
Also, in this large image, small patches/dots like these are for some reason mostly not compressed effectively (at least in lossless)
https://i.redd.it/sbtsy7cxujg21.png
|
|
|
|
veluca
|
2021-08-14 12:01:53
|
I suspect there should at least be two different heuristics for lossless and lossy
|
|
|
Scope
|
2021-08-14 12:02:19
|
Looks like
|
|
|
|
veluca
|
2021-08-14 12:04:51
|
tbh patches for lossless never got any real thought put into them - I wrote the lossy heuristic, we said "oh, we could use patches for lossless too!" and then we stuck with the same heuristic
|
|
|
Scope
|
2021-08-14 12:08:13
|
So far, in my test images, lossless settings without patches seem to compress better (when compared against the sizes of other formats, for example WebP)
Patches can be useful for something like text, and even there not always
|
|
|
|
veluca
|
2021-08-14 12:08:50
|
you mean with LZ77, right?
|
|
2021-08-14 12:09:01
|
and/or at -s 9
|
|
|
Scope
|
2021-08-14 12:10:45
|
For example:
https://i.redd.it/uo9u57i7opj41.png
|
|
2021-08-14 12:10:49
|
|
|
|
|
veluca
|
2021-08-14 12:12:24
|
ah that's one of those with dithering patterns - I know patches are very bad there xD
|
|
|
Scope
|
2021-08-14 12:12:51
|
-
https://i.redd.it/ailftxk03ed41.png
|
|
2021-08-14 12:12:55
|
|
|
|
|
veluca
|
2021-08-14 12:13:42
|
also dithering...
|
|
|
Scope
|
2021-08-14 12:15:08
|
Seems like it should be avoided somehow, at least for lossless:
https://i.redd.it/v8gnvrfxh1p31.png
|
|
2021-08-14 12:15:13
|
|
|
|
|
veluca
|
2021-08-14 12:15:59
|
I have no idea what will happen with that image and lossy... but I suspect nothing good
|
|
|
Scope
|
2021-08-14 12:20:45
|
Also, maybe larger group sizes (-g 3) are more useful for patches
Or are they independent?
https://i.redd.it/x58a481h2xe31.png
|
|
2021-08-14 12:20:48
|
|
|
|
|
veluca
|
2021-08-14 12:21:21
|
the two things should be completely orthogonal, i.e. patches don't "see" groups
|
|
2021-08-14 12:21:47
|
although larger groups make lz77 more useful and thus patches less
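
A toy illustration of that last point. It assumes, as a simplification, that LZ77 matches cannot cross group boundaries. The greedy parser, the function names, and the 1-D test data are all made up for this sketch and are not libjxl code.

```python
def lz77_literals(data, min_match=3):
    """Count the symbols that a greedy LZ77 parse has to emit as literals."""
    i, literals = 0, 0
    while i < len(data):
        best = 0
        # Try every earlier start position; overlapping matches are allowed.
        for start in range(i):
            length = 0
            while (i + length < len(data)
                   and data[start + length] == data[i + length]):
                length += 1
            best = max(best, length)
        if best >= min_match:
            i += best       # covered by a back-reference
        else:
            literals += 1   # no usable match: emit one literal
            i += 1
    return literals


def literals_with_groups(data, group_size):
    """Parse each group independently, so matches never cross a group boundary."""
    return sum(lz77_literals(data[g:g + group_size])
               for g in range(0, len(data), group_size))


data = b"abcdefgh" * 8  # one element repeated eight times
for group_size in (8, 16, 64):
    print(group_size, literals_with_groups(data, group_size))
```

With the small group size every repeat lands in its own group and nothing can be matched; as the group grows, more repeats become reachable by back-references, which leaves less for patches to win.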
|
|
|
_wb_
|
2021-08-14 12:23:42
|
Why do dithering patterns cause bad patches? Does it make many small patches that cost much to signal? Does it ruin the regularity, making the residuals have more entropy than without patches?
|
|
|
Scope
|
2021-08-14 12:23:54
|
So, as I thought before, in most places where patches can be useful (where there are a lot of repeating elements), something like lz77 with a large group size may be more efficient
|
|
|
|
veluca
|
2021-08-14 12:24:02
|
many small patches
|
|
2021-08-14 12:24:19
|
(we added patches before adding lz77 :P)
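
A rough sketch of why many small patches can cost more than they save. This is a toy cost model with invented constants: exact repeats, a fixed signalling cost per occurrence, a flat bits-per-pixel cost. It is not how libjxl estimates anything.

```python
def patch_gain_bits(area_px, occurrences, bits_per_px=2.0,
                    signal_bits_per_occurrence=40.0):
    """Estimated bits saved by coding a repeated element as a patch.

    Toy model: the element is stored once as a reference, and every
    occurrence then costs a fixed signalling overhead instead of its pixels.
    All constants are illustrative assumptions, not libjxl values.
    """
    without_patches = occurrences * area_px * bits_per_px
    with_patches = (area_px * bits_per_px
                    + occurrences * signal_bits_per_occurrence)
    return without_patches - with_patches


# A 2x2 dither dot versus a larger glyph, each repeated 500 times.
dither_dot = patch_gain_bits(area_px=4, occurrences=500)
glyph = patch_gain_bits(area_px=120, occurrences=500)
print(dither_dot, glyph)
```

Under these assumptions the tiny dot loses bits because the per-occurrence signalling outweighs its few pixels, while the larger glyph gains. A real heuristic would also have to account for how cheaply the other coding tools handle the same repeats.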
|
|
|
Scope
|
2021-08-14 12:26:40
|
Also, patches can make encoding noticeably slower, especially if there are a lot of them?
|
|
|
|
veluca
|
2021-08-14 12:27:19
|
yeah there's quadratic stuff in there
|
|
2021-08-14 12:27:35
|
as I said... not the best heuristic ever
|
|
|
Scope
|
2021-08-14 12:33:39
|
So, perhaps for lossless they should be used much less often and more aggressively disabled on very small elements (like dots), because in such images they very rarely help
|
|
|
|
veluca
|
2021-08-14 12:34:30
|
well, that's what my patch tries to do, but I suspect there are better ways
|
|
2021-08-14 12:34:35
|
(pun intended)
|
|
|
Scope
|
2021-08-14 12:36:54
|
Yep and it improved the results (as can be seen in the spreadsheet above), but it looks like it should be done even more aggressively
|
|
2021-08-14 12:39:36
|
For lossy mode many of the patches can be useful to improve visual quality, but for lossless it seems that a different strategy is needed
|
|
|
|
veluca
|
2021-08-14 12:40:29
|
which column is the old code there?
|
|
|
Scope
|
2021-08-14 12:40:54
|
|
|
2021-08-14 12:41:11
|
New code
|
|
|
|
veluca
|
2021-08-14 12:41:11
|
I see the gap is down to ~1% between new and no patches
|
|
2021-08-14 12:41:28
|
ah, so it's about 2/3 of the way
|
|
2021-08-14 12:41:34
|
kinda
|
|
|
Scope
|
2021-08-14 12:42:12
|
Yes, but only for some images (besides, patches increase encoding time)
|
|
|
_wb_
|
2021-08-14 12:43:19
|
It's a tricky thing that there are basically 3 orthogonal ways in which we compress repetitive things: patches, lz77, and learning the pattern in the MA tree.
|
|
|
Scope
|
2021-08-14 12:45:44
|
Also, where patches help, larger group sizes or WebP are even more effective
|
|
|
_wb_
|
2021-08-14 01:20:35
|
Green means worse here?
|
|
|
Scope
|
2021-08-14 01:23:35
|
Yes, bigger size
|
|
2021-08-14 01:24:15
|
Good patches:
|
|
2021-08-14 01:24:16
|
|
|
2021-08-14 01:24:16
|
|
|
2021-08-14 01:24:16
|
|
|
2021-08-14 01:24:16
|
|
|
2021-08-14 01:24:17
|
|
|
2021-08-14 01:24:17
|
|
|
2021-08-14 01:24:18
|
|
|
2021-08-14 01:24:18
|
|
|
2021-08-14 01:24:18
|
|
|
2021-08-14 01:34:55
|
Bad Patches:
|
|
2021-08-14 01:34:55
|
|
|
2021-08-14 01:34:56
|
|
|
2021-08-14 01:34:56
|
|
|
2021-08-14 01:34:56
|
|
|
2021-08-14 01:34:57
|
|
|
2021-08-14 01:34:57
|
|
|
2021-08-14 01:34:57
|
|
|
2021-08-14 01:34:58
|
|
|
2021-08-14 01:34:58
|
|
|
2021-08-14 01:35:07
|
|
|
2021-08-14 01:35:10
|
|
|
2021-08-14 01:35:11
|
|
|
2021-08-14 01:35:13
|
|
|
2021-08-14 01:35:14
|
|
|
2021-08-14 01:35:17
|
|
|
2021-08-14 01:35:19
|
|
|
2021-08-14 01:35:19
|
|
|
2021-08-14 01:35:20
|
|
|
2021-08-14 01:35:21
|
|
|
2021-08-14 01:35:22
|
|
|
2021-08-14 01:35:22
|
|
|
2021-08-14 01:35:24
|
|
|
2021-08-14 01:35:24
|
|
|
2021-08-14 01:35:26
|
|
|
2021-08-14 01:35:27
|
|
|
2021-08-14 01:35:29
|
|
|
2021-08-14 01:35:30
|
|
|
2021-08-14 01:35:31
|
|
|
2021-08-14 01:35:32
|
|
|
2021-08-14 01:35:32
|
|
|
2021-08-14 01:35:33
|
|
|
2021-08-14 01:35:34
|
|
|
2021-08-14 01:35:35
|
|
|
2021-08-14 01:35:35
|
|
|
2021-08-14 01:37:52
|
In many examples it seems that they could be effective, but for some reason they are not
|
|
2021-08-14 01:43:19
|
|
|
2021-08-14 01:43:19
|
Also bad:
|
|
2021-08-14 01:43:20
|
|
|
2021-08-14 01:43:21
|
|
|
2021-08-14 01:43:21
|
|
|
2021-08-14 01:43:21
|
|
|
|
|
veluca
|
2021-08-14 01:59:44
|
I see an asymmetry here xD
|
|
|
Scope
|
2021-08-14 02:47:54
|
Also, the patches in this image are selected quite strangely: there are many other similar elements that are not picked (and the selected ones only worsen the efficiency)
|
|
2021-08-14 02:48:00
|
|
|
2021-08-14 03:40:43
|
With transparency
|
|
2021-08-14 03:40:48
|
|
|
2021-08-14 03:42:26
|
And some more bad examples
|
|
2021-08-14 03:42:27
|
|
|
2021-08-14 03:42:27
|
|
|
2021-08-14 03:42:31
|
|
|
2021-08-14 04:33:50
|
|
|
2021-08-14 09:41:32
|
Also patches:
https://github.com/libjxl/libjxl/issues/426
|
|
2021-08-14 09:41:44
|
https://user-images.githubusercontent.com/3265248/128751712-e2564f0b-043e-40c2-87c5-be30bc81041f.png
|
|
2021-08-14 09:42:17
|
|
|
|
|
veluca
|
2021-08-14 09:42:49
|
likely broken background detection
|
|
2021-08-14 09:42:51
|
sigh...
|
|
|
Scope
|
2021-08-14 09:46:25
|
|
|
2021-08-14 09:46:31
|
|
|
2021-08-14 09:59:13
|
For lossy I think it's better to detect more characters or numbers in the patches, but, for lossless, disabling them improves compression for such images
|
|
2021-08-15 12:53:17
|
More bad (for compression) examples:
|
|
2021-08-15 12:53:22
|
|
|
2021-08-15 12:55:31
|
|
|
2021-08-15 12:58:48
|
|
|
2021-08-15 12:58:55
|
|
|
2021-08-15 12:59:41
|
|
|
2021-08-15 01:01:40
|
-
|
|
2021-08-15 01:01:45
|
|
|
2021-08-15 01:04:22
|
|
|
2021-08-15 01:04:28
|
|
|
2021-08-15 01:06:52
|
|
|
2021-08-15 01:06:57
|
|
|
2021-08-15 01:08:08
|
|
|
2021-08-15 01:10:10
|
However, `-g 0` is even smaller
|
|
2021-08-15 01:53:14
|
-
Good patches, but WebP (and BMF) still compress noticeably better (even with `--premultiply`)
|
|
2021-08-15 01:53:19
|
|
|
2021-08-15 01:53:38
|
|
|
2021-08-15 02:15:53
|
The corpus and spreadsheet have been updated
https://discord.com/channels/794206087879852103/803645746661425173/876069698725376000
|
|