good_reason
08-18-2004, 02:55 AM
Bug of Unicode ID3v2 in dMC r11 beta2
Program: dMC r11 beta2
OS: win2000 sp4
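I found that dMC r11 beta2 does not write the (Unicode) ID3v2 tag correctly.
It writes the characters that should be Unicode (little endian) as Unicode Big Endian, while still putting the little-endian byte order mark FF FE in front.
ERROR of dMC r11 beta2 -- Unicode Big Endian was written
For example (the string "Testing"):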
ANSI : 54 65 73 74 69 6E 67
Unicode : FF FE 00 54 00 65 00 73 00 74 00 69 00 6E 00 67
--------------------------------
a correct sample : Testing
ANSI : 54 65 73 74 69 6E 67
Unicode : FF FE 54 00 65 00 73 00 74 00 69 00 6E 00 67 00
Unicode Big Endian : FE FF 00 54 00 65 00 73 00 74 00 69 00 6E 00 67
UTF-8 : EF BB BF 54 65 73 74 69 6E 67
-------------
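The byte sequences in the post above can be reproduced with a short Python sketch. It is only an illustration of the encodings, not dMC code; the helper name `hex_bytes` and the variable names are made up for this example.

```python
import codecs


def hex_bytes(data: bytes) -> str:
    """Format bytes the way the post does: upper-case hex pairs separated by spaces."""
    return " ".join(f"{b:02X}" for b in data)


text = "Testing"

# Correct forms: the byte order mark (BOM) matches the byte order of the data after it.
utf16_le = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
utf16_be = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
utf8_bom = codecs.BOM_UTF8 + text.encode("utf-8")

# What the post reports for beta2: little-endian BOM followed by big-endian data.
mismatched = codecs.BOM_UTF16_LE + text.encode("utf-16-be")

for label, data in [
    ("Unicode", utf16_le),
    ("Unicode Big Endian", utf16_be),
    ("UTF-8", utf8_bom),
    ("beta2 (as reported)", mismatched),
]:
    print(f"{label:<20}: {hex_bytes(data)}")

# A reader that trusts the BOM decodes the mismatched bytes into different characters.
print(ascii(mismatched.decode("utf-16")))
```

A tag reader that honors the BOM has no way to know the data is byte-swapped, which is why the mismatch shows up as wrong characters rather than as an error.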
Spoon
08-18-2004, 03:45 AM
Well spotted, I came across it also last week and it has been fixed, just need to upload a new dMC tonight.
good_reason
08-18-2004, 05:36 PM
Bug of Unicode ID3v2 in dMC r11 beta5
Program: dMC r11beta5
OS: win2000sp4 (system locale: Chinese PRC)
dMC r11beta5 fixed the Unicode problem for English characters.
However, it introduces a NEW bug for DBCS characters (e.g. Simplified Chinese in GB2312 encoding).
Some byte(s) of these characters are incorrectly filled with FF.
Note: the wrong byte(s) were shown in red and the correct ones in blue.
Sample1:
Chinese Character(GB2312) : 咻咻咻
Unicode (error,dMC11 beta2): FF FE 54 BB 54 BB 54 BB
Unicode (error,dMC11 beta5): FF FE BB FF BB FF BB FF
Unicode (correct) : FF FE BB 54 BB 54 BB 54
Sample2:
Chinese Character(GB2312) : 日期
Unicode (error,dMC11 beta2): FF FE 65 E5 67 1F
Unicode (error,dMC11 beta5): FF FE E5 FF 1F 67
Unicode (correct) : FF FE E5 65 1F 67
Spoon
08-19-2004, 05:35 AM
If you are trying to enter that information using dMC: currently dMC is not able to enter such characters (say from Edit Tag), they will turn into ?? ??? ???. But dMC should be able to convert and carry the tag across (i.e. converting that mp3 to Monkeys, the tag should become UTF-8).
Hopefully R12 of dMC will be fully Unicode compatible (reading files with Unicode filenames, and entering Unicode information in CD Input and Edit Tag). The problem is that dMC supports Win 95/98/ME, which are not very Unicode capable.
Spoon
08-19-2004, 05:36 AM
Oh yes, a quick way to check that dMC can read the Unicode tag correctly is to hold the mouse over the file; it should show those Chinese chars.
good_reason
08-19-2004, 09:05 AM
(so sorry for my broken English :D )
The bug is in the TAG, not in the filename.
In the case above, all filenames are in English ... ;)
(Actually, Chinese filenames are not a problem, because I haven't met any bad filename yet.
Of course, it would be nice if dMC can/will fully support Chinese filenames. ;) )
I converted Monkeys 3.97 / Monkeys 3.99 / MP3 (with ID3v1) files to MP3 files (with both ID3v1 and (Unicode) ID3v2.3)
using dMC r11beta5 and got the results in my previous post.
Some bytes of the Unicode characters are wrong: they were replaced by "FF".
By contrast, dMC r11beta2 wrote the right byte values, although in Unicode Big Endian order.
Please take a look at the example below (and notice the byte(s) that differ, shown in BOLD in the post):
Unicode (error,dMC11 beta5): FF FE E5 FF 1F 67 // Unicode
Unicode (correct) : FF FE E5 65 1F 67 // Unicode
Unicode (error,dMC11 beta2): FF FE 65 E5 67 1F // Should be Unicode, but Unicode Big Endian WAS Written
good_reason
08-19-2004, 09:07 AM
Yes, it can show the Unicode tag correctly, but it cannot store all Unicode characters correctly.
And I found another NEW bug:
if the filename is in Chinese, none of the Chinese tags (MP3 ID3v1/ID3v2 and APETAG2) are shown correctly in the Windows tooltip.
BTW, the "Edit Tag" function of dMC11beta5 reads those Unicode/UTF-8 tags faithfully:
since E5 FF is different from E5 65, the Chinese character it shows is obviously a totally different one.
I guess the wrong Unicode tag is produced during the conversion process,
and I believe the store step of the "Edit Tag" function has the same Unicode BUG (because if I input the same Chinese word, it stores the same wrong tag).
Spoon
08-20-2004, 07:58 AM
Edit tag cannot handle unicode characters, and dMC cannot work with any files with unicode characters in the name (they cannot be converted, or the tag shown).
Look in the dMC Configuration, it has an option to write unicode for ID3v2 (it is off by default). If still no go, take a Monkey file with those chinese chars and convert to mp3 id3v2, are they correct?
good_reason
08-20-2004, 09:49 PM
(maybe my English is too broken to understand :D)
No, the tag result is not correct, just like I told you in my previous posts.
The "write unicode for ID3v2" option was enabled.
Hmm... well, I'll try my best to show you the problem.
What we call "Unicode" on Win2000/WinXP systems is UTF-16.
Unicode tags in Monkey 3.99r4 use UTF-8.
Unicode tags in ID3v2 use UTF-16.
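As a rough sketch of what converting a UTF-8 APE tag value into a Unicode ID3v2.3 text frame involves: the value is decoded from UTF-8 and re-encoded as UTF-16 with a BOM, after an encoding byte of 01. The function name `id3v23_text_frame` and the choice of the TIT2 (title) frame are assumptions made for this illustration; this is not how dMC or id3lib is actually written.

```python
import struct


def id3v23_text_frame(frame_id: str, text: str) -> bytes:
    """Build one ID3v2.3 text frame using encoding 01 (Unicode with BOM)."""
    # Body: encoding byte 01, then FF FE (little-endian BOM), then UTF-16LE text.
    body = b"\x01" + b"\xff\xfe" + text.encode("utf-16-le")
    # Header: 4-character frame ID, 32-bit big-endian body size, 2 flag bytes.
    header = frame_id.encode("ascii") + struct.pack(">I", len(body)) + b"\x00\x00"
    return header + body


# The value as it would sit in a UTF-8 APE tag (these bytes are UTF-8 for 日期).
ape_value = bytes.fromhex("E6 97 A5 E6 9C 9F")

title = ape_value.decode("utf-8")          # UTF-8 -> abstract characters
frame = id3v23_text_frame("TIT2", title)   # characters -> UTF-16 inside the frame

print(" ".join(f"{b:02X}" for b in frame))
```

The last six bytes of the frame are the FF FE E5 65 1F 67 sequence that the posts in this thread call the correct result.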
Let's assume that all tags in the source files are correct.
All filenames are in English.
OS: win2000sp4 English (System Locale: Chinese (PRC), Default input language: Chinese (PRC))
Program: dMC11 beta5.
So,
<Source> APE       --> <Target>           == <tag result in the target>
Monkey 3.97        --> ID3v1              == (correct)
Monkey 3.99 (UTF8) --> ID3v1              == (correct)
Monkey 3.97        --> ID3v2 (Unicode)    == (UTF16 error)
Monkey 3.99 (UTF8) --> ID3v2 (Unicode)    == (UTF16 error)
Monkey 3.97        --> Monkey 3.97        == (correct)
Monkey 3.99 (UTF8) --> Monkey 3.99 (UTF8) == (UTF8 correct)
Monkey 3.97        --> WMA9               == (correct)
Monkey 3.99 (UTF8) --> WMA9               == (correct)
----------------
<Source> MP3       --> <Target>
ID3v1              --> ID3v2 (Unicode)    == (UTF16 error)
ID3v1              --> ID3v2              == (correct)
ID3v1              --> ID3v1              == (correct)
ID3v1              --> Monkey 3.99 (UTF8) == (UTF8 correct)
ID3v1              --> Monkey 3.97        == (correct)
ID3v1              --> WMA9               == (correct)
ID3v2              --> ID3v2 (Unicode)    == (UTF16 error)
ID3v2              --> ID3v2              == (correct)
ID3v2              --> ID3v1              == (correct)
ID3v2              --> Monkey 3.99 (UTF8) == (UTF8 correct)
ID3v2              --> Monkey 3.97        == (correct)
ID3v2              --> WMA9               == (correct)
ID3v2 (Unicode)    --> ID3v2 (Unicode)    == (UTF16 error)
ID3v2 (Unicode)    --> Monkey 3.99 (UTF8) == (UTF8 correct)
ID3v2 (Unicode)    --> Monkey 3.97        == (correct)
ID3v2 (Unicode)    --> WMA9               == (correct)
----------------
<Source> WMA9      --> <Target>
WMA9               --> ID3v1              == (correct)
WMA9               --> ID3v2              == (correct)
WMA9               --> ID3v2 (Unicode)    == (UTF16 error)
WMA9               --> Monkey 3.99 (UTF8) == (UTF8 correct)
----------------
I think everything is clear enough, right??
If you need them, I can send you some samples.
Summary:
If tags are written as Unicode ID3v2, some byte(s) of the Unicode (UTF-16) characters in the ID3v2 tag are corrupted: they are overwritten with FF.
I believe it is a new bug in dMC11beta5, because dMC11beta2 did not write those FF byte(s).
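One pattern in the reported bytes: the corrupted code units (U+54BB, U+65E5) are exactly the ones whose low byte is 0x80 or above, while U+671F and plain English text come out fine. That is what would happen if the low byte were handled as a signed 8-bit value and sign-extended before being combined with the high byte. This is only a guess that happens to fit the samples; the thread never states the actual cause, and the function below is a hypothetical model, not code from dMC or id3lib.

```python
def sign_extend_8(value: int) -> int:
    """Interpret an 8-bit value as a signed char (0x80..0xFF become negative)."""
    return value - 0x100 if value & 0x80 else value


def hypothetical_buggy_utf16le(text: str) -> bytes:
    """Model of the guessed bug: the low byte is sign-extended before being OR-ed in."""
    out = bytearray(b"\xff\xfe")
    for ch in text:
        unit = ord(ch)  # assumes characters in the Basic Multilingual Plane
        high, low = unit >> 8, unit & 0xFF
        # A negative low byte sets every upper bit, so the high byte becomes FF.
        broken = ((high << 8) | sign_extend_8(low)) & 0xFFFF
        out += broken.to_bytes(2, "little")
    return bytes(out)


for sample in ("咻咻咻", "日期", "Testing"):
    data = hypothetical_buggy_utf16le(sample)
    print(" ".join(f"{b:02X}" for b in data))
```

The model reproduces both beta5 samples byte for byte and leaves "Testing" intact, which matches the report that English characters were fixed in beta5.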
good_reason
08-20-2004, 09:50 PM
In my case, dMC DOES work with files with Chinese characters in the name
(they CAN be converted, and the tag shown (except the tag in the tooltips :D)).
Maybe your system setting is different from mine.
In order to convert correctly between non-Unicode characters and Unicode characters,
you have to set your Windows to the correct "System Locale" ;)
Spoon
08-21-2004, 04:52 AM
If you could send a small Ape (with UTF8) and Mp3 (with correct ID3v2 UTF-16) to:
http://www.dbpoweramp.com/email.htm
I can try it myself.
good_reason
08-21-2004, 08:30 PM
The file was sent: 1 attachment (smaller than 200 kB) with 5 music files inside.
Please view the viewme.png inside the attachment.
Spoon
08-22-2004, 08:38 AM
Excellent, will look ASAP.
good_reason
08-23-2004, 02:48 PM
Hi, I found another new BUG with Unicode tags :D Please check your email and view viewme2.png.
As we know, a Unicode tag should be shown correctly on "any" Unicode system, no matter which "System Locale" is used.
But the "Edit Tag" function of dMC11beta5 MAY NOT show a Unicode tag correctly on every Unicode system:
some Unicode characters might be shown as "?" (question mark).
For example,
OS: win2000sp4
Files: APE_UTF8.ape and ID3v2_UTF16.mp3
They are shown correctly if I set the "System Locale" to Chinese (PRC).
But they may NOT be shown correctly if I change the "System Locale" to Chinese (Taiwan).
P.S.: I think I can help you if you have problems setting up the system. Just tell me which OS you use. ;)
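The "?" symptom described above is what one would expect when Unicode text is pushed through the ANSI code page of the current System Locale before being displayed. A minimal sketch, assuming Python's `gbk` and `big5` codecs as stand-ins for the Chinese (PRC) and Chinese (Taiwan) code pages, and a made-up helper name `through_ansi`:

```python
def through_ansi(text: str, ansi_codec: str) -> str:
    """Round-trip text through an ANSI code page, as a non-Unicode program would."""
    # Characters the code page cannot represent are replaced by "?" on the way in.
    return text.encode(ansi_codec, errors="replace").decode(ansi_codec)


simplified = "我们"  # the second character exists only in Simplified Chinese sets
shared = "日期"      # both characters exist in the PRC and Taiwan code pages

for codec in ("gbk", "big5"):
    # ascii() keeps the output printable on any console.
    print(codec, ascii(through_ansi(simplified, codec)), ascii(through_ansi(shared, codec)))
```

Text that stays in UTF-16 or UTF-8 from the file to the screen never takes this detour, which is why a fully Unicode program shows the same tag correctly under any System Locale.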
Spoon
08-24-2004, 08:28 AM
dMC R11 will not be able to edit the tags for unicode audio files, R12 should (in many months).
good_reason
08-24-2004, 05:15 PM
To me, it helps a lot that dMC can show and convert Unicode tags correctly,
and it's good news that dMC will fully support Unicode tags in the future.
Cheers...
Spoon
08-26-2004, 08:28 AM
Now fixed, it was a bug in id3lib (get the latest dMC beta). Thanks good_reason for your help.