Port Cube

These CUDA filters are packaged into DGDecodeNV, which is part of DGDecNV.
tormento
DG Approved/Curly Approved/Moose Approved
Posts: 718
Joined: Mon Sep 20, 2010 2:18 pm

Post by tormento »

I was reading https://github.com/WolframRhodium/VapourSynth-BM3DCUDA and I noticed:

fast:

Multi-threaded copy between CPU and GPU at the expense of 4x memory consumption.

Default True.
Perhaps you can find some trick in the source to increase transfer speed.

P.S.: I would really like to see a native AVS+ port of it from your talented hands.
Sherman
Moose Approved
Posts: 377
Joined: Mon Jan 06, 2020 10:19 pm

Post by Sherman »

And now, it's time for a message from our sponsor.

Britney
Curly Approved
Posts: 89
Joined: Sun Aug 09, 2020 3:24 pm

Post by Britney »

Baltasar
DG Approved
Posts: 28
Joined: Tue Nov 02, 2021 9:51 am

Post by Baltasar »

Albert
Moose Approved
Posts: 21
Joined: Thu Oct 15, 2020 1:20 pm

Post by Albert »

Boris
Posts: 70
Joined: Sun Nov 10, 2019 2:55 pm

Post by Boris »

DG
Curly Approved
Posts: 68
Joined: Thu Dec 31, 2020 9:55 am

Post by DG »

Ha ha, Boris! Very timely.

Stunning visuals in that Rocket Man video, Balti. Great find.

This is how I'm feeling these days. Take my word, I'm a madman, don't you know?

Curly
Moose Approved
Posts: 203
Joined: Sun Mar 15, 2020 11:05 am

Post by Curly »

Image

Post by Sherman »

And now we return you to our regularly scheduled program.
hydra3333
DG Approved/Moose Approved
Posts: 281
Joined: Wed Oct 06, 2010 3:34 am

Post by hydra3333 »

Curly wrote:
Tue Oct 11, 2022 7:47 pm
Image
Not wishing to intrude and not understanding what's going on around me,
I have watched this a few dozen times over the last year or so and admired the song.
Rocky
Moose Approved
Posts: 2503
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Great one, hydra3333. Awesome tribute to Vincent, the tortured soul.

OK, so the z code returns values of Kr and Kb that I can inspect. Kg is 1 - (Kr + Kb). Then it all goes into AVX512 etc. And the stupid STL and object-oriented filter graph crapola makes it impenetrable. No comments in the code either. Not trying to ding z. I'm sure it's just great for genius-level humans, but I'm a primitive rodent and so use only procedural code without all the crapola. Look at the thdmerge source to see the kind of code I like. However, there is a standard way to convert Kr/Kb/Kg to actual matrix equations. I'll do that and see how the results compare vis-à-vis my equations.
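For reference, the standard Kr/Kb/Kg construction can be sketched as follows. This is only an illustration, not the z code or the DGDecNV code: the function names are made up for the example, the matrices are for normalized full-range values (Y in [0, 1], Cb and Cr in [-0.5, 0.5]) with no limited-range scaling or offsets, and the BT.709 coefficients Kr = 0.2126, Kb = 0.0722 are used as the sample input.

```python
def rgb_to_ycbcr_matrix(kr, kb):
    """Forward matrix (hypothetical helper): rows are Y, Cb, Cr; columns are R, G, B."""
    kg = 1.0 - (kr + kb)
    return [
        [kr, kg, kb],                                               # Y  = Kr*R + Kg*G + Kb*B
        [-kr / (2.0 * (1.0 - kb)), -kg / (2.0 * (1.0 - kb)), 0.5],  # Cb = (B - Y) / (2*(1 - Kb))
        [0.5, -kg / (2.0 * (1.0 - kr)), -kb / (2.0 * (1.0 - kr))],  # Cr = (R - Y) / (2*(1 - Kr))
    ]


def ycbcr_to_rgb_matrix(kr, kb):
    """Inverse matrix (hypothetical helper): rows are R, G, B; columns are Y, Cb, Cr."""
    kg = 1.0 - (kr + kb)
    return [
        [1.0, 0.0, 2.0 * (1.0 - kr)],
        [1.0, -2.0 * kb * (1.0 - kb) / kg, -2.0 * kr * (1.0 - kr) / kg],
        [1.0, 2.0 * (1.0 - kb), 0.0],
    ]


def matmul3(a, b):
    """Plain 3x3 matrix product, used only to check the round trip."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


if __name__ == "__main__":
    KR, KB = 0.2126, 0.0722  # BT.709 luma coefficients
    for row in rgb_to_ycbcr_matrix(KR, KB):
        print(["%.6f" % v for v in row])
    for row in ycbcr_to_rgb_matrix(KR, KB):
        print(["%.6f" % v for v in row])
```

Multiplying the two matrices together should give the identity, which is a quick sanity check before comparing against hand-derived equations.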

@tormento

What is the claimed speedup for the 'fast' mode you described? I'm just not grokking how multithreading can make a significant difference. The bottleneck is simply the amount of data versus the available PCIe bandwidth. Kernel launch, etc., is insignificant compared to that. I keep saying this but nobody gives up the magic!
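To put rough numbers on the bandwidth argument, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the helper names are made up, the frames are 8-bit YUV 4:2:0, each frame is uploaded once and downloaded once with no overlap, and the 12 GB/s figure is a placeholder for whatever effective PCIe throughput a given system actually achieves, not a measurement.

```python
def frame_bytes(width, height, bytes_per_sample=1):
    """Bytes in one YUV 4:2:0 frame: full-size luma plus two quarter-size chroma planes."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return (luma + chroma) * bytes_per_sample


def transfer_ms(nbytes, gb_per_s):
    """Milliseconds needed to move nbytes at the given effective bandwidth (GB/s, 1e9 bytes)."""
    return nbytes / (gb_per_s * 1e9) * 1000.0


def max_fps(width, height, gb_per_s, bytes_per_sample=1):
    """Frame-rate ceiling if every frame is uploaded once and downloaded once, not overlapped."""
    per_frame_s = 2 * frame_bytes(width, height, bytes_per_sample) / (gb_per_s * 1e9)
    return 1.0 / per_frame_s


if __name__ == "__main__":
    ASSUMED_GB_PER_S = 12.0  # placeholder effective bandwidth, not a measured value
    for name, w, h in [("1080p", 1920, 1080), ("2160p", 3840, 2160)]:
        size = frame_bytes(w, h)
        print(name, size, "bytes,",
              "%.4f ms one way," % transfer_ms(size, ASSUMED_GB_PER_S),
              "%.0f fps ceiling" % max_fps(w, h, ASSUMED_GB_PER_S))
```

Under these assumptions the ceiling depends only on bytes moved and bandwidth, so extra copy threads could help only to the extent that a single thread is not already saturating the link.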

Post by tormento »

Rocky wrote:
Wed Oct 12, 2022 11:31 am
What is the claimed speedup for the 'fast' mode you described?
Dunno! Anyway, I am forwarding you every possible solution that has source code available.

Please, if you find it a loss of time, just tell me. ;)

Post by Rocky »

Not a loss of time. Keep all ideas coming.

Post by tormento »

Perhaps you already know:

https://github.com/FranceBB/LinearTransformation

They reverse-engineered DoVi too.