Discussion:
[RCD] GB2312, ISO-2022-KR
Vladimir Gorpenko
2016-09-26 20:41:48 UTC
I have managed to clarify a few things, which may be of interest.

As you may remember, I have two servers; I will call them W and T. T
displays messages in these encodings correctly, while W shows something
resembling Arabic characters.

1. To convert between encodings, RC first tries iconv. Since PHP's iconv
is an interface to the Linux iconv utility, it doesn't work on server W,
where chroot is used. On server T, iconv converts both of these
encodings correctly.

2. As a second step, RC tries mbstring for the conversion. Mbstring
apparently doesn't understand GB2312 (GBK), but it does understand
ISO-2022-KR. Nevertheless, mbstring doesn't work on server W either. I
think the reason is again somehow connected to chroot.

I suppose RC has problems with various encodings when running under
chroot. I still hope to get iconv and mbstring working, though.
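
The two-step order described above (iconv first, mbstring second) can be
sketched roughly as follows. This is only an illustration and not
Roundcube's actual code: the function name convert_with_fallback is
hypothetical, and the sketch assumes either extension may be missing or
may reject the charset.

```php
<?php
// Hypothetical helper, not part of Roundcube: try iconv first, then mbstring.
function convert_with_fallback($str, $from, $to = 'UTF-8')
{
    if (function_exists('iconv')) {
        // @ hides the message iconv raises for unknown charsets or bad input;
        // //IGNORE asks iconv to skip characters it cannot convert.
        $out = @iconv($from, $to . '//IGNORE', $str);
        if ($out !== false) {
            return $out;
        }
    }

    if (function_exists('mb_convert_encoding')) {
        try {
            $out = @mb_convert_encoding($str, $to, $from);
        } catch (\ValueError $e) {
            // PHP 8 throws for an unknown encoding name instead of warning
            $out = false;
        }
        if ($out !== false) {
            return $out;
        }
    }

    return false; // neither extension could convert the string
}

// Usage: a Latin-1 string converted to UTF-8
var_dump(convert_with_fallback("caf\xE9", 'ISO-8859-1'));
```

Returning false when both paths fail lets the caller decide whether to
show the raw bytes or report the problem.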
--
Best regards,
Vladimir Gorpenko
Kyle Francis
2016-09-26 22:27:43 UTC
_______________________________________________
Roundcube Development discussion mailing list
***@lists.roundcube.net
http://lists.roundcube.net/mailman/listinfo/dev
Vladimir Gorpenko
2016-09-27 08:52:27 UTC
Many thanks to all.

The problem was indeed solved by copying /usr/lib64/gconv into /chroot.
I don't know why mbstring didn't work or whether it works now, but iconv
clearly works.
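
For anyone hitting the same symptom, a small check run from inside the
chrooted PHP can show whether iconv is able to load its converters. The
sketch below is an illustration under assumptions: the function name is
hypothetical, and the default directory list simply repeats the two
paths mentioned in this thread; other systems may keep the gconv modules
elsewhere.

```php
<?php
// Hypothetical diagnostic: report which gconv directories are visible to PHP
// and whether iconv can actually convert from the given charset.
function iconv_chroot_report($charset, array $dirs = array('/usr/lib64/gconv', '/usr/lib/gconv'))
{
    $report = array('dirs' => array(), 'iconv_ok' => false);

    foreach ($dirs as $dir) {
        // Inside a chroot these paths are resolved relative to the chroot root
        $report['dirs'][$dir] = @is_dir($dir);
    }

    if (function_exists('iconv')) {
        // Plain ASCII is valid input for ASCII-compatible charsets such as
        // GB2312, so a false result suggests a missing converter, not bad data.
        $report['iconv_ok'] = @iconv($charset, 'UTF-8', 'test') !== false;
    }

    return $report;
}

print_r(iconv_chroot_report('GB2312'));
```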

---
Best regards,
Vladimir Gorpenko
Post by Kyle Francis
Vladimir,
Is it possible to add iconv to your chroot? Are you permitted to, given that this is a production machine? If so, that seems to be your best option. It looks as though you will also need to copy the /usr/lib/gconv directory into your chroot root.
See https://bugs.php.net/bug.php?id=44096 towards the end of the page.
Kyle
Rimas Kudelis
2016-09-27 06:21:58 UTC
Hi,
Post by Vladimir Gorpenko
1. In case of conversion of the codings RC at first tries to use
iconv. As iconv(php) is an interface to the linux utility iconv, it
doesn't work at the server W where chroot is used. On the T server
iconv transforms both of these codings normally.
Small correction: I think it's actually an interface to a _library_, not
_utility_. I don't think you need /usr/bin/iconv available in your chroot.

Regards,
Rimas
Vladimir Gorpenko
2016-09-27 09:14:27 UTC
Hi!

Thanks for the note. I really was about to copy /usr/bin/iconv into the
chroot, and was worried about whether it would run under Apache's
chroot.

---
Best regards,
Vladimir Gorpenko
A.L.E.C
2016-09-27 09:40:22 UTC
Post by Vladimir Gorpenko
2. In the second queue RC tries to apply mbstring to conversion of
codings. Mbstring, apparently, doesn't understand GB2312 (GBK), but
understands ISO-2022-KR. Nevertheless, mbstring also doesn't work at W
server. I think, the reason is also somehow connected to chroot.
I'm curious whether this patch would fix the GB2312 issue for the mbstring path.

--- a/program/lib/Roundcube/rcube_charset.php
+++ b/program/lib/Roundcube/rcube_charset.php
@@ -39,8 +39,8 @@ class rcube_charset
         'UNKNOWN' => 'ISO-8859-15',
         'USERDEFINED' => 'ISO-8859-15',
         'KSC56011987' => 'EUC-KR',
-        'GB2312' => 'GBK',
-        'GB231280' => 'GBK',
+        'GB2312' => 'GB18030',
+        'GB231280' => 'GB18030',
         'UNICODE' => 'UTF-8',
         'UTF7IMAP' => 'UTF7-IMAP',
         'TIS620' => 'WINDOWS-874',
@@ -51,7 +51,7 @@ class rcube_charset
         '128' => 'SHIFT-JIS',
         '129' => 'CP949',
         '130' => 'CP1361',
-        '134' => 'GBK',
+        '134' => 'GB18030',
         '136' => 'BIG5',
         '161' => 'WINDOWS-1253',
         '162' => 'WINDOWS-1254',
--
Aleksander 'A.L.E.C' Machniak
Kolab Groupware Developer [http://kolab.org]
Roundcube Webmail Developer [http://roundcube.net]
----------------------------------------------------
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com
Vladimir Gorpenko
2016-09-27 11:30:19 UTC
Yes, it works.

I ran the tests on server T, which has no chroot.

I commented out the statements that call iconv and applied these
changes. GB2312 was then converted correctly.

It seems that mb_check_encoding returns false.

I then uncommented iconv and confirmed that iconv works correctly with
both GB18030 and ISO-2022-KR.

Should the following line also be added?
'GBK' => 'GB18030',
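
To make the question concrete: the keys in the patched table appear to
be charset names with punctuation removed (for example 'GB231280'), so a
lookup could work roughly like the sketch below. This is not the real
rcube_charset code; the function name and the reduced table are
assumptions, and the 'GBK' entry is the addition being asked about, not
something confirmed in this thread. GB18030 was designed to be backward
compatible with GBK and GB2312, which is why mapping all three names to
it is plausible.

```php
<?php
// Hypothetical, reduced alias table; keys are upper-cased names with
// everything except letters and digits removed.
function normalize_charset($name)
{
    static $aliases = array(
        'GB2312'   => 'GB18030',
        'GB231280' => 'GB18030',
        'GBK'      => 'GB18030', // the addition asked about above
        '134'      => 'GB18030',
    );

    $key = preg_replace('/[^A-Z0-9]/', '', strtoupper($name));

    // Unknown names are passed through in upper case
    return isset($aliases[$key]) ? $aliases[$key] : strtoupper($name);
}

echo normalize_charset('GB_2312-80'), "\n";
echo normalize_charset('gbk'), "\n";
```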

---
Best regards,
Vladimir Gorpenko
A.L.E.C
2016-09-27 11:39:07 UTC
Post by Vladimir Gorpenko
Yes, it works.
Operations were carried out on T server on which there is no chroot.
I commented out the operators calling iconv and made these corrections.
GB2312 fulfilled normally.
It seems mb_check_encoding returns false.
I uncommented iconv and was convinced that iconv normally works both
with GB18030, and with ISO-2022-KR.
'GBK' => 'GB18030',
?
It might be, indeed. Could you provide samples that fail without the
patch for both GB2312 and ISO-2022-KR, so I could investigate more?
Vladimir Gorpenko
2016-09-27 12:04:58 UTC
I can send an example of an ISO-2022-KR message that mbstring can't
process. I will send it to your address in a separate email.

As for GB2312, I suppose there is nothing to investigate. Iconv converts
it correctly; mbstring doesn't support an encoding by that name, but
after the renaming it also converts it perfectly well.

---
Best regards,
Vladimir Gorpenko
A.L.E.C
2016-09-27 12:10:10 UTC
Post by Vladimir Gorpenko
I can send an example of ISO-2022-KR which mbstring can't process.
I send it to your address the separate letter.
In case of GB2312, I suppose, there is nothing to investigate. Iconv
converts it normally, mbstring of such coding doesn't support, and with
renaming also converts absolutely normally.
I found some sources saying that it supports the "GBK" name, but we
probably should not compare the encoding name with the
mb_list_encodings() result, as it looks like it does not return all
supported encodings. So, I'm just looking for the most universal solution.
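
One possible explanation is that mb_list_encodings() reports only
canonical names, while mbstring also accepts aliases, which
mb_encoding_aliases() lists per encoding. The sketch below checks a name
against both; the function name is hypothetical, and what it reports for
'GBK' or 'GB2312' depends on the PHP build, so no result is claimed here.

```php
<?php
// Hypothetical helper: check a charset name against mbstring's canonical
// names and their aliases, case-insensitively.
function mb_knows_encoding($name)
{
    foreach (mb_list_encodings() as $canonical) {
        if (strcasecmp($canonical, $name) === 0) {
            return true;
        }

        try {
            // Alternative names mbstring accepts for this encoding
            $aliases = @mb_encoding_aliases($canonical);
        } catch (\ValueError $e) {
            $aliases = false;
        }

        foreach ($aliases ? $aliases : array() as $alias) {
            if (strcasecmp($alias, $name) === 0) {
                return true;
            }
        }
    }

    return false;
}

var_dump(mb_knows_encoding('GBK'), mb_knows_encoding('GB2312'));
```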
A.L.E.C
2016-09-27 15:07:43 UTC
Post by Vladimir Gorpenko
I can send an example of ISO-2022-KR which mbstring can't process.
I send it to your address the separate letter.
I commented out the iconv code path and wasn't able to reproduce the
issue. I'm using PHP 7.
Post by Vladimir Gorpenko
In case of GB2312, I suppose, there is nothing to investigate. Iconv
converts it normally, mbstring of such coding doesn't support, and with
renaming also converts absolutely normally.
Could you confirm that it works with
https://github.com/roundcube/roundcubemail/commit/42ddfe5ec9f0294bb3c44b6f7a9a0b205e951c45
instead of the previous patch?
Vladimir Gorpenko
2016-09-27 15:37:47 UTC
Good. I use PHP 5.6. Evidently, this error has been fixed in PHP 7.

Unfortunately, I can't run the test you are asking for. I tried to apply
these changes, but in my version of RC the place your commit shows as
lines 244-247 looks different:


    // return if encoding found, string matches encoding and convert succeeded
    if (in_array($mb_from, $mbstring_list) && in_array($mb_to, $mbstring_list)) {
        if (mb_check_encoding($str, $mb_from)) {
            // Do the same as //IGNORE with iconv
            mb_substitute_character('none');
            $out = mb_convert_encoding($str, $mb_to, $mb_from);
            mb_substitute_character($mbstring_sch);

            if ($out !== false) {
                return $out;
            }
        }
    }
I'm reluctant to adapt your fix to my version of rcube_charset.

---
Best regards,
Vladimir Gorpenko
A.L.E.C
2016-09-27 15:49:09 UTC
Post by Vladimir Gorpenko
// return if encoding found, string matches encoding and
convert succeeded
if (in_array($mb_from, $mbstring_list) && in_array($mb_to,
$mbstring_list)) {
if (mb_check_encoding($str, $mb_from)) {
// Do the same as //IGNORE with iconv
mb_substitute_character('none');
$out = mb_convert_encoding($str, $mb_to, $mb_from);
mb_substitute_character($mbstring_sch);
if ($out !== false) {
return $out;
}
}
}
I don't decide to adapt your fix to my rcube_charset version.
In general, mb_list_encodings() and mb_check_encoding() are not used now.

I did some more tests, and indeed mb_check_encoding() does not work with
'GBK', but mb_convert_encoding() does (at least with the sample text I've
got). So I assume the current git-master code will work for you as well.
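
The behaviour described above suggests calling mb_convert_encoding()
directly and judging by its result, without asking mb_check_encoding()
first. A minimal sketch of that idea follows; it illustrates the
approach and is not a copy of the git-master code, and the function name
is hypothetical.

```php
<?php
// Hypothetical helper: convert with mbstring and rely on the result only.
function mb_convert_unchecked($str, $to, $from)
{
    $previous = mb_substitute_character(); // remember the current setting
    mb_substitute_character('none');       // drop unconvertible characters

    try {
        $out = @mb_convert_encoding($str, $to, $from);
    } catch (\ValueError $e) {
        $out = false; // PHP 8 throws on an unknown encoding name
    } finally {
        mb_substitute_character($previous); // always restore the setting
    }

    return $out;
}

// Four bytes of GBK-encoded sample text
var_dump(mb_convert_unchecked("\xD6\xD0\xCE\xC4", 'UTF-8', 'GBK'));
```

Restoring the substitute character in a finally block keeps the global
mbstring setting unchanged for the rest of the request even when the
conversion fails.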
Vladimir Gorpenko
2016-09-27 16:01:16 UTC
I probably didn't explain myself well.

When I tried to apply the fix to the code I use (1.1.4), I couldn't do
it. In 1.1.4, the lines that the fix replaces look significantly
different. In particular, there is an additional if statement.

But I understood the idea, thanks. If I continue working on this issue,
I will keep your advice in mind.

---
Best regards,
Vladimir Gorpenko