Port Cube

Posted: Sun Jun 18, 2023 6:52 pm
by Levi

Port Cube

Posted: Tue Jun 20, 2023 12:22 pm
by Sherman
The malcontents should learn to code. It's so easy!

Port Cube

Posted: Tue Jun 20, 2023 12:41 pm
by Curly
Easy 4 u 2 say.

Port Cube

Posted: Tue Jun 20, 2023 3:43 pm
by Baltasar
This worked for me.


Port Cube

Posted: Wed Jun 21, 2023 3:46 pm
by Rocky
There are ample resources for learning CUDA. We published sample source code for a CUDA filter, and documented our full dialog with nVidia during the development of DGDecNV. Everything you need is in the nVidia SDKs, API documentation, and developer forum. Focus, persistence, and attention to detail!

The zimg/avsresize authors...maybe they'd be willing to work with us to port their stuff to CUDA. I'd be more than happy to help. I could add some CUDASynth magic to eliminate extra PCIe transfers, etc. Their filters can remain standalone and fully in their control. We'll help for free. That seems like the most likely path to reach the promised land.

Port Cube

Posted: Sun Jul 02, 2023 7:55 am
by Rocky
Guest 2 wrote:
Wed Aug 10, 2022 10:08 am
P.S: Will you add tetrahedral to AVSCube too?
sekrit-twc has added tetrahedral to timecube so after it is ported to AVS we can ditch DGCube and stop all the associated nonsense. ;)

Port Cube

Posted: Sun Jul 02, 2023 4:32 pm
by Guest 2
Rocky wrote:
Sun Jul 02, 2023 7:55 am
Then we can ditch DGCube and stop all the associated nonsense. ;)
Not everybody has a 56-core CPU. Some of us still rely on the GPU (me).

Port Cube

Posted: Sun Jul 02, 2023 4:34 pm
by Rocky
Oh OK. We'll keep it then.

Port Cube

Posted: Tue Jul 04, 2023 3:52 am
by Guest 2
Rocky wrote:
Sun Jul 02, 2023 4:34 pm
Oh OK. We'll keep it then.
https://github.com/rigaya/NVEnc/blob/ma ... ram2value2

lut3d=<string>
Apply a 3D LUT to an input video. Currently supports .cube file only.

lut3d_interp=<string>
nearest, trilinear, tetrahedral, pyramid, prism
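
For anyone wondering what the tetrahedral mode in that list actually does, here is a minimal pure-Python sketch. It is not NVEnc's, timecube's, or DGCube's code. The function names and the lut[r][g][b] layout are assumptions made for illustration only. The idea: find the LUT cell containing the pixel, then blend only 4 of its 8 corners, chosen by sorting the fractional offsets.

```python
def tetrahedral_lookup(lut, size, r, g, b):
    """Interpolate an RGB triple (each in [0, 1]) through a size^3 LUT.

    lut[ri][gi][bi] is an (R, G, B) tuple. This indexing is an assumption
    of the sketch and is not the on-disk order of a .cube file.
    """
    # Position in grid units, clamped to the LUT domain.
    pos = [min(max(c, 0.0), 1.0) * (size - 1) for c in (r, g, b)]
    # Lower corner of the enclosing cell and the offset inside it.
    base = [min(int(p), size - 2) for p in pos]
    frac = [p - i for p, i in zip(pos, base)]

    # Visit the axes from the largest to the smallest fractional offset.
    order = sorted(range(3), key=lambda axis: frac[axis], reverse=True)
    f = [frac[axis] for axis in order]
    # Barycentric weights of the 4 tetrahedron vertices; they sum to 1.
    weights = [1.0 - f[0], f[0] - f[1], f[1] - f[2], f[2]]

    corner = list(base)
    out = [0.0, 0.0, 0.0]
    for step, weight in enumerate(weights):
        if step > 0:
            # Step one grid unit along the next-largest axis.
            corner[order[step - 1]] += 1
        node = lut[corner[0]][corner[1]][corner[2]]
        for ch in range(3):
            out[ch] += weight * node[ch]
    return tuple(out)


def identity_lut(size):
    """Build a LUT that maps every color to itself (handy for testing)."""
    scale = size - 1
    return [[[(ri / scale, gi / scale, bi / scale)
              for bi in range(size)]
             for gi in range(size)]
            for ri in range(size)]
```

Passing a color through identity_lut(n) should return the same color, which makes a quick sanity check. Trilinear would blend all 8 corners of the cell instead of 4.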

I think this could ease my pain.

I could use NVEnc to encode to a lossless intermediate and then proceed with x265.

Do you think it is now feasible to use the source code to implement it in DGCube? :)

Port Cube

Posted: Tue Jul 04, 2023 7:42 am
by Rocky
Guest 2 wrote:
Tue Jul 04, 2023 3:52 am
Do you think it is now feasible to use the source code to implement it in DGCube? :)
Implement what? Please be specific and precise.

And what is your pain exactly? You cannot implement some desired processing? Or you can but it doesn't run as fast as you'd like?

Port Cube

Posted: Tue Jul 04, 2023 12:55 pm
by Guest 2
Rocky wrote:
Tue Jul 04, 2023 7:42 am
Implement what? Please be specific and precise.
Nothing, Rocky. I think I won't bug you again about DGCube.

I have eased my pains by applying the LUT with NVEnc to an intermediate lossless HEVC file and then encoding that with standard x265. Easy peasy and unbelievably fast.

Thanks again for having implemented HEVC 4:4:4 decoding :)

Port Cube

Posted: Tue Jul 04, 2023 1:28 pm
by Rocky
Guest 2 wrote:
Tue Jul 04, 2023 12:55 pm
I have eased my pains...
Won't you have the grace to explain what your pains are, after I explicitly asked and after all I've done for you over the years?

Port Cube

Posted: Tue Jul 04, 2023 2:05 pm
by Guest 2
Rocky wrote:
Tue Jul 04, 2023 1:28 pm
Won't you have the grace to explain what your pains are, after I explicitly asked and after all I've done for you over the years?
As I told many times, my CPU is too old and slow to comfortably apply zimg conversion and have a correct PQ to HLG transformation, using DGCube.

I have squeezed my brain to find a workaround and I am testing NVEnc to do the whole job except the final encode.

It does everything in HW, fast and clean, tetrahedral included.

1080p SDR to HLG: 160.65 fps
2160p PQ to HLG: 42.90 fps

The only issue is storage requirements but I can cope with that.

Port Cube

Posted: Tue Jul 04, 2023 2:14 pm
by Rocky
It's not just storage, it's the time to write out massive lossless streams and then read them again. And having to encode twice. That's not fast and it's certainly not clean. Nevertheless I'm happy you have what you consider to be an adequate workaround for your pains.
Guest 2 wrote:
As I told many times
That feels rude and unfriendly. And who knows what "comfortable" means for you?
Guest 2 wrote:
my CPU is too old and slow
When you upgrade your HW for HEVC lossless, think too about upgrading your CPU.

Port Cube

Posted: Fri Jul 07, 2023 10:05 am
by Rocky
Hehe, I have successfully ported sekrit-twc's latest vscube to AVS+. I still have to fix up some loose ends, but it's running fine with tetrahedral and all CPU modes. It took just less than 3 hours to port.

Port Cube

Posted: Sat Jul 08, 2023 12:36 pm
by Rocky
Here is a test release of the AVS+ support for sekrit-twc's latest vscube. Refer to the user manual for syntax and examples. Your test results will be greatly appreciated. My testing shows the AVS+ version with prefetch(6) to be faster than the VapourSynth version.

https://rationalqm.us/cube/AVSCube_test.rar

Say thank you.

Port Cube

Posted: Sat Jul 08, 2023 12:59 pm
by Sherman
Rocky wrote:
Fri Jul 07, 2023 10:05 am
It took just less than 3 hours to port.
You're slipping, Rocky. Do you need to get some young blood involved?

Port Cube

Posted: Sat Jul 08, 2023 1:00 pm
by Natasha
Sherman wrote:
Sat Jul 08, 2023 12:59 pm
young blood
The best kind!

Port Cube

Posted: Tue Jul 11, 2023 10:05 am
by Rocky
The timecube+AVS support test build was relocated:

https://rationalqm.us/cube/

All the cube stuff is now together in directory cube.

Port Cube

Posted: Wed Jul 12, 2023 5:05 am
by Rocky
Here is an updated test build for AVS+ support for sekrit-twc's vscube. It includes sekrit-twc's bug fix for the AVX2 support of tetrahedral mode.

https://rationalqm.us/cube/AVSCube_test.rar

Port Cube

Posted: Fri Apr 05, 2024 4:56 pm
by Rocky
Guest 2 wrote:
Wed Aug 31, 2022 12:18 pm
I see tiny discrepancies between the product of external (identical to AVSCube) and of internal processing (look at graphs, mostly).
Well guys, I finally wrapped my rodent brain around this stuff and discovered the reasons for this. My matrices are off. I did research and now fully understand how to generate the matrices for any combination of:

8 vs. 16 bits
limited vs. full range for input and output
601 vs. 709 vs. 2020 space
constant vs. non-constant luminance

Working through that for DGHDRtoSDR(), I saw that the equations I was using (can't even remember where I got them) were off by enough to account for discrepancies.

So I will fix that and, more importantly, I will fix DGCube's internal conversions and properly extend them to support all needed conversions. This will eliminate the need for external conversions using zimg, greatly improving performance.
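
To make the matrix generation concrete, here is a minimal sketch of the standard derivation. It is not DGCube's or DGHDRtoSDR's actual code, and the function names are made up for illustration. It covers the non-constant luminance case only. The Kr/Kb values are the published BT.601, BT.709, and BT.2020 luma coefficients. Range and bit depth enter only at the quantization step: limited range uses the 219/224 excursions with 16/128 offsets scaled by 2^(bits-8), and full range is assumed here to scale by 2^bits - 1.

```python
import math

# (Kr, Kb) luma coefficients from the BT.601, BT.709 and BT.2020 specs.
LUMA_COEFFS = {
    "601": (0.299, 0.114),
    "709": (0.2126, 0.0722),
    "2020": (0.2627, 0.0593),
}


def rgb_to_ycbcr_matrix(standard):
    """3x3 matrix taking gamma-encoded R'G'B' in [0, 1] to Y' in [0, 1]
    and Cb/Cr in [-0.5, 0.5] (non-constant luminance only)."""
    kr, kb = LUMA_COEFFS[standard]
    kg = 1.0 - kr - kb
    cb_scale = 2.0 * (1.0 - kb)  # normalizes (B' - Y') to [-0.5, 0.5]
    cr_scale = 2.0 * (1.0 - kr)  # normalizes (R' - Y') to [-0.5, 0.5]
    return [
        [kr, kg, kb],
        [-kr / cb_scale, -kg / cb_scale, (1.0 - kb) / cb_scale],
        [(1.0 - kr) / cr_scale, -kg / cr_scale, -kb / cr_scale],
    ]


def rgb_to_codes(rgb, standard="709", bits=8, full_range=False):
    """Convert normalized R'G'B' to integer Y'CbCr code values."""
    matrix = rgb_to_ycbcr_matrix(standard)
    y, cb, cr = (sum(c * v for c, v in zip(row, rgb)) for row in matrix)
    peak = (1 << bits) - 1
    mid = 1 << (bits - 1)
    if full_range:
        raw = (y * peak, cb * peak + mid, cr * peak + mid)
    else:
        step = 1 << (bits - 8)  # limited-range levels scale with bit depth
        raw = ((219.0 * y + 16.0) * step,
               (224.0 * cb + 128.0) * step,
               (224.0 * cr + 128.0) * step)
    # Round half up, then clip to the legal code range.
    return tuple(min(max(int(math.floor(v + 0.5)), 0), peak) for v in raw)
```

As a sanity check, rgb_to_codes((1.0, 0.0, 0.0)) gives the familiar 8-bit limited-range 709 red of (63, 102, 240). The reverse YUV->RGB direction is the inverse of the same matrix after undoing the offsets and scaling.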

Port Cube

Posted: Thu Apr 11, 2024 1:23 pm
by Rocky
Well guys, my optimism was premature. I was under the impression that all gamma-related stuff would be implemented in the LUT. But looking at the script (the one that revealed discrepancies) shows that the specified gamma inverse is being applied to create linear RGB to be passed to the script. So it is not enough for me to fix the coefficients in the YUV->RGB->YUV conversions. I also have to implement all the gamma stuff. And who knows, maybe also primaries stuff. So it's back to having to recreate the whole of z_ConvertFormat() if we are to have everything on the GPU. I'm not going to do that as it is a massive undertaking with zero benefit for me.

If you are wondering about DGHDRtoSDR(), everything is fine, as it does the needed gamma processing for PQ/HLG->709.
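
As an illustration of the kind of "gamma stuff" involved, here is a minimal sketch of the PQ transfer function pair, using the constants published in SMPTE ST 2084 / BT.2100. It is not DGCube's, DGHDRtoSDR's, or zimg's code, and the function names are made up for illustration. A full conversion would apply this per channel between the matrix step and the LUT, and HLG needs its own, different curve.

```python
# PQ constants as published in SMPTE ST 2084 / ITU-R BT.2100.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0


def pq_eotf(signal):
    """Map a PQ-encoded value in [0, 1] to linear light in cd/m^2."""
    p = max(signal, 0.0) ** (1.0 / M2)
    return PEAK_NITS * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)


def pq_inverse_eotf(nits):
    """Map linear light in cd/m^2 back to a PQ-encoded value in [0, 1]."""
    y = (max(nits, 0.0) / PEAK_NITS) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2
```

For example, 100 cd/m^2 encodes to a PQ value of roughly 0.508, and a PQ value of 1.0 decodes to the 10000 cd/m^2 peak.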

Port Cube

Posted: Sat Apr 13, 2024 9:06 am
by hydra3333
OK and thanks for looking into it. :salute:

At a guess, I suppose it also means no GPU HDRAGC? Or even a hybrid?

Port Cube

Posted: Sat Apr 13, 2024 9:54 am
by Rocky
No m8 it has no relevance for HDRAGC and curves-type stuff. I am still developing my own curves filter.

Port Cube

Posted: Sun Apr 14, 2024 3:38 am
by hydra3333
Beaut, thanks bloke.