System: Windows 10 Home x64 20H2
DGMPGDec version: 2.0.01
System Text Encoding Name : Unicode (UTF-8)
System CodePage : 65001
On some systems, DGIndex.exe run from the CLI creates an invalid d2v file encoded in UTF-8 with a BOM, whereas the GUI saves a valid d2v file encoded in UTF-8 without a BOM.
This is a critical issue for software like StaxRip, which uses a d2v file generated via the CLI.
These are the errors users encounter when such a situation occurs:
AviSynth
VapourSynth
I posted the details of this issue in the StaxRip GitHub issue tracker.
I'm wondering if this is related to the system codepage because some users do not experience this issue.
(I guess this can be easily resolved if you make DGIndex.exe and DGDecode.dll accept UTF-8 with BOM encoding as well.)
Anyway, it'd be greatly appreciated if this problem is fixed.
DGIndex creates a d2v file in UTF-8 with BOM in CLI
Re: DGIndex creates a d2v file in UTF-8 with BOM in CLI
Welcome to the forum, JKyle! Already Moose Approved? How did you pull that off?
I wasn't able to get a BOM with CLI or GUI. Both use the same code to open the D2V file, so I am naturally skeptical. Nevertheless, please try this:
http://rationalqm.us/misc/DGIndex_jkyle.exe
If it is still broken, then please attach the D2Vs, one made with GUI and one made with CLI. Do not open them or edit them after they are created, just attach them here. Also, use DGIndex directly, not through staxrip (to minimize possible confounding factors).
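For anyone who wants to check a D2V without opening it in an editor (which might itself add or strip a BOM), here is a minimal Python sketch that only reads the first three bytes. The function name `has_utf8_bom` is made up for illustration and is not part of DGMPGDec or StaxRip.

```python
import codecs
import sys


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    # Open in binary mode so nothing is decoded or rewritten.
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


if __name__ == "__main__":
    # Usage: python check_bom.py file1.d2v file2.d2v ...
    for name in sys.argv[1:]:
        status = "BOM" if has_utf8_bom(name) else "no BOM"
        print(f"{name}: {status}")
```

Running it on the GUI-made and CLI-made files side by side should show quickly whether the two really differ at the byte level.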
The goal is to never write a BOM. There is no explicit code in DGIndex to write a BOM. However, I was using open mode w+ for no apparent reason. Maybe that is triggering some nonsense in Windows. The test build I linked uses just w. BTW, DGIndexNV uses just w also.
Better not to write the BOM to keep compatibility with legacy third-party stuff that works with D2V files. I could however add code to DGDecode to skip a BOM if it exists. Before that, though, I want to find out where the BOM is coming from.
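To illustrate what "skip a BOM if it exists" could look like on the reading side, here is a rough Python sketch. It is not DGDecode's code (DGDecode is not written in Python), and `read_d2v_lines` is a hypothetical helper name. It only shows the idea: drop the three BOM bytes when present, otherwise read the file unchanged.

```python
import codecs


def read_d2v_lines(path):
    """Read a text file as UTF-8 lines, ignoring a leading BOM if there is one."""
    with open(path, "rb") as f:
        data = f.read()
    # Strip the BOM only when it is actually there; files without it pass through.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8").splitlines()
```

With this approach the first line compares equal whether or not the writer emitted a BOM, so a reader that checks the first line would accept both kinds of file.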
How are you setting the UTF-8 code page? Do you use the system locale "Beta: ..." option of Win10? Or some other way? If you are not using the "Beta: ..." option, then please try using it. I and others have been running fine with that. It effectively adds Unicode support for free (no code changes required in the application, which continues to use char text processing). Still, we don't want BOMs in the D2V files.
BTW, it's a bit rude to run around claiming supposed bugs. We're just now adjusting to new Windows functionality, so it's unfair to call something that worked fine for decades a bug just because Windows changed something. And we're not even sure it is DGIndex making the BOM you see. For example, does staxrip edit the D2V file? Or maybe you opened it in an editor that saves it with a BOM.
Re: DGIndex creates a d2v file in UTF-8 with BOM in CLI
1. I was Moose Approved by videoh on Doom9.
2. Yes, I'm already setting the UTF-8 code page via the system locale "Beta: ..." option of Win10.
3. After more intensive (sort of... ) testing, I found out that the bug lies with StaxRip, not with DGIndex. Sorry about the confusion.
- Calling DGIndex.exe directly on CMD or PowerShell does not produce BOM.
Code:
D:\Tmp\dgmpgdec2001\DGindex.exe -i "D:\Tmp\test.vob" -ia 6 -fo 0 -yr 1 -tn 1 -om 2 -drc 2 -dsd 0 -dsa 0 -o "D:\Tmp\test_DGIndex_CLI_CMD" -hide -exit
- Calling DGIndex.exe on PowerShell via CMD call also has NO problem.
Code:
cmd /s /c --% "D:\Tmp\dgmpgdec2001\DGindex.exe -i "D:\Tmp\test.vob" -ia 6 -fo 0 -yr 1 -tn 1 -om 2 -drc 2 -dsd 0 -dsa 0 -o "D:\Tmp\test_DGIndex_CLI_PS" -hide -exit"
- Only calling DGIndex.exe from StaxRip as a preprocessor has the BOM issue.
test_DGIndex_StaxRip-preprocessor.d2v
This is pretty weird because calling DGIndexNV.exe from StaxRip does NOT have this BOM issue.
Anyway, thank you for your help.
I owe you a big apology for making a hasty conclusion.
I will report this issue in the StaxRip community.
Re: DGIndex creates a d2v file in UTF-8 with BOM in CLI
Gotta find out who this videoh guy is. No members here with that handle. Has to be either me, admin, or Bullwinkle, because only one of them could have changed your rank. DG is not an administrator or moderator. I'm guessing that Bullwinkle is videoh.
Thank you for the update. That's what I guessed must be happening. Somehow the staxrip preprocessor must be editing the D2V file as there is no way the cited invocation could cause DGIndex to generate a BOM. There is simply no code for that in DGIndex. Maybe staxrip edits the D2V to add the option for demuxing? Anyway, just check if staxrip is set up to write files with BOMs.
Keep us informed, I am curious.
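As a sketch of how a wrapper that rewrites a text file can introduce a BOM without the original writer ever emitting one, here is a small Python example. This is an assumption for illustration only, not StaxRip's actual code, and `rewrite_text_file` is a hypothetical name. In Python the `utf-8-sig` codec writes a BOM and plain `utf-8` does not.

```python
def rewrite_text_file(path, encoding):
    """Read a UTF-8 text file and save it again with the given encoding."""
    # 'utf-8-sig' on read accepts input with or without a BOM.
    # newline="" keeps the original line endings untouched.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        text = f.read()
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(text)
```

Calling `rewrite_text_file(path, "utf-8-sig")` on a clean file leaves the text identical but prepends EF BB BF, while `rewrite_text_file(path, "utf-8")` saves it without a BOM. A tool that picks its output encoding from the system code page could end up on the first path only on some machines.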
Re: DGIndex creates a d2v file in UTF-8 with BOM in CLI
StaxRip fixed this issue in 2.1.7.1 Beta today: the internal code had missed the case where the user's code page was explicitly set to UTF-8.
On top of that, StaxRip now supports DGDecode natively for both AviSynth and VapourSynth.
Re: DGIndex creates a d2v file in UTF-8 with BOM in CLI
Great to hear and thank you for your update.
DGIndex creates a d2v file in UTF-8 with BOM in CLI
Remember this post
Rocky wrote: ↑Sun Jan 10, 2021 9:03 am
Gotta find out who this videoh guy is. No members here with that handle. Has to be either me, admin, or Bullwinkle, because only one of them could have changed your rank. DG is not an administrator or moderator. I'm guessing that Bullwinkle is videoh.
Thank you for the update. That's what I guessed must be happening. Somehow the staxrip preprocessor must be editing the D2V file as there is no way the cited invocation could cause DGIndex to generate a BOM. There is simply no code for that in DGIndex. Maybe staxrip edits the D2V to add the option for demuxing? Anyway, just check if staxrip is set up to write files with BOMs.
Keep us informed, I am curious.
Well, look at this from another location. You gotta figure out how he does it.
DGIndex creates a d2v file in UTF-8 with BOM in CLI
No mystery. We are all spokespersons for DG.