Problems with special character e.g. ä (Bugs)

by Micha ⌂, Thursday, March 01, 2018, 21:34 (2248 days ago)

Hi,

Today I installed mlf2 on a new server. If I use German.lang, special characters aren't displayed correctly, e.g. März is shown as MÃ¤rz. Wasn't this problem solved already? What was the solution?

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Problems with special character e.g. ä

by Micha ⌂, Thursday, March 01, 2018, 21:37 (2248 days ago) @ Micha

... hmm, I found a solution. I changed

locale_charset = iso-8859-1

to

locale_charset = utf-8

but is it the right place?

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Problems with special character e.g. ä

by Auge ⌂, Friday, March 02, 2018, 08:46 (2247 days ago) @ Micha

Hello

Today I installed mlf2 on a new server. If I use German.lang, special characters aren't displayed correctly, e.g. März is shown as MÃ¤rz. Wasn't this problem solved already? What was the solution?

We discussed this back in 2016, when you pointed to one of your postings from 2008 and to a discussion, including a solution, in the testing forum, which no longer exists.

In that thread in the testing forum, Alex or you provided the solution of setting locale_charset to ISO-8859-1 or ISO-8859-15. But that always seemed a bit crude to me. We use UTF-8, and then we set the locale charset to ISO-8859-*? It is a setting that seems to apply only to script-generated strings like the date. On the other hand, I didn't find a date function in the PHP docs that takes a charset value into account at all, only a locale like de_DE for German as used in Germany (set with setlocale, see the examples for strftime).

What's the benefit of locale_charset in the script, especially when its value is not UTF-8, which is in use everywhere else in the scripts?

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Tags:
locale, charset, utf-8

Problems with special character e.g. ä

by Micha ⌂, Friday, March 02, 2018, 09:03 (2247 days ago) @ Auge

Hi,

But that always seemed a bit crude to me. We use UTF-8, and then we set the locale charset to ISO-8859-*?

Okay, what happens if you change locale_charset in your installation?

If I change it to UTF-8 at derletztekick.com, the date is not shown.

What's the benefit of locale_charset in the script, especially when its value is not UTF-8, which is in use everywhere else in the scripts?

I don't know, sorry. The intention to set it to ISO... or UTF is somewhat educated guessing. ;-)

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Problems with special character e.g. ä

by Auge ⌂, Friday, March 02, 2018, 09:52 (2247 days ago) @ Micha

Hello

But that always seemed a bit crude to me. We use UTF-8, and then we set the locale charset to ISO-8859-*?

Okay, what happens if you change locale_charset in your installation?

To tell the truth, I never tried that (and actually I can't, I am at work).

If I change it to UTF-8 at derletztekick.com, the date is not shown.

Aha.

What's the benefit of locale_charset in the script, especially when its value is not UTF-8, which is in use everywhere else in the scripts?


I don't know, sorry. The intention to set it to ISO... or UTF is somewhat educated guessing. ;-)

:-)

In index.php (line 55 ff.), the program defines two constants. One is CHARSET; the second, LOCALE_CHARSET, is only defined if $lang['locale_charset'] differs from $lang['charset']. That's the case in the German language file (line 8 ff.). In the function format_time in functions.inc.php, a time/date string is normally formatted with strftime alone; if the constant LOCALE_CHARSET exists, the result is additionally passed through iconv. In the latter case, iconv converts the time string from LOCALE_CHARSET to CHARSET, which in our case is a conversion from ISO-8859-1 to UTF-8.

But if I'm right, the time/date string is already UTF-8 from the start ("Ã¤" is what the UTF-8 bytes of "ä" look like when they are misread as Latin-1 (ISO-8859-1)). So the string gets double-encoded in UTF-8.
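
To make the suspected mechanism concrete, here is a minimal Python sketch of that conversion step. It is not the forum's PHP code: format_time below is a hypothetical stand-in, and the byte strings only assume what strftime would return under a UTF-8 locale and under a Latin-1 locale.

```python
from typing import Optional


def format_time(raw: bytes, charset: str = "utf-8",
                locale_charset: Optional[str] = None) -> str:
    """Hypothetical stand-in for the forum's format_time().

    `raw` plays the role of the bytes strftime returns under the
    active locale; the real PHP function is not reproduced here.
    """
    if locale_charset and locale_charset.lower() != charset.lower():
        # Rough equivalent of iconv(LOCALE_CHARSET, CHARSET, ...)
        raw = raw.decode(locale_charset).encode(charset)
    return raw.decode(charset)


utf8_month = "März".encode("utf-8")         # assumed output of a UTF-8 locale
latin1_month = "März".encode("iso-8859-1")  # assumed output of a Latin-1 locale

# UTF-8 bytes converted "from ISO-8859-1" once more: double encoding
double_encoded = format_time(utf8_month, locale_charset="iso-8859-1")

# Same charset on both sides: no conversion, the month name stays intact
unchanged = format_time(utf8_month, locale_charset="utf-8")

# Latin-1 bytes converted from ISO-8859-1: here the conversion is needed
converted = format_time(latin1_month, locale_charset="iso-8859-1")
```

With these inputs, double_encoded comes out as "MÃ¤rz", while unchanged and converted are both "März". If a server's locale really delivers Latin-1 bytes, iso-8859-1 would be the correct value there, which might be why one value does not fit every installation; that part is a guess.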

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Problems with special character e.g. ä

by Micha ⌂, Friday, March 02, 2018, 15:43 (2247 days ago) @ Auge

Hi,

But if I'm right, [...] the string gets double-encoded in UTF-8.

Hmm, sounds good. How can we solve it?

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

UTF-8 double encoding

by Mardor ⌂, Monday, March 12, 2018, 20:56 (2237 days ago) @ Micha

I updated today all the way up from a near-prehistoric 1.4.6b up to 2.4.8. All entries were double encoded afterwards. I solved the problem with a little Python script.

1. Export the database as SQL via phpMyAdmin. Rename it to x.sql.
2. Start this Python3 script:

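# Undo the double encoding: turn each character back into its Latin-1 byte, then decode the bytes as UTF-8.
# Note: errors="ignore" silently drops any byte sequence that is not valid UTF-8.
with open("x.sql", "r", encoding="utf-8") as x, open("y.sql", "w", encoding="utf-8") as y:
    y.write(x.read().encode("raw_unicode_escape").decode("utf-8", errors="ignore"))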


3. Delete all database tables and import y.sql
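
A more cautious variant of the same idea, as a sketch only: it repairs text line by line and leaves a line untouched when it does not look double-encoded, instead of dropping bytes. The function names are made up for this example, and it assumes the damage came from UTF-8 bytes being read as ISO-8859-1 (not Windows-1252).

```python
def undo_double_encoding(text: str) -> str:
    """Reverse one round of 'UTF-8 read as Latin-1' damage, if present."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not (purely) double-encoded: keep the text as it is.
        return text


def repair_lines(lines):
    """Apply the repair to each line independently."""
    return [undo_double_encoding(line) for line in lines]


sample = ["M\u00c3\u00a4rz", "März", "plain ASCII"]
repaired = repair_lines(sample)
```

Here the first entry is repaired to "März" and the other two stay as they are. A line that mixes damaged and intact non-ASCII text would be left unchanged, so it is worth checking the output before importing it.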

UTF-8 double encoding

by Micha ⌂, Tuesday, March 13, 2018, 09:36 (2236 days ago) @ Mardor

Hi,

I updated today all the way up from a near-prehistoric 1.4.6b up to 2.4.8. All entries were double encoded afterwards. I solved the problem with a little Python script.

If one switches the language of the forum, the database itself is not changed. Thus, it isn't an SQL issue.

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

UTF-8 double encoding

by Mardor ⌂, Tuesday, March 13, 2018, 10:34 (2236 days ago) @ Micha

My posting was a bit too short, sorry. Years ago I changed the forum encoding of the old version 1.4.6 to UTF-8, and at some point in the update process there must have been an additional UTF-8 conversion, so that I ended up with the double encoding Milo mentioned.

No problems with special character e.g. ä

by Auge ⌂, Tuesday, March 13, 2018, 12:25 (2236 days ago) @ Micha

Hello

But if I'm right, [...] the string gets double-encoded in UTF-8.


Hmm, sounds good. How can we solve it?

Exactly as you proposed and did. I applied your idea in my forum, found no side effects, and was afterwards a bit astonished when I saw the value utf-8 for locale_charset in german.lang in the repository. :-)

Case closed?

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

No problems with special character e.g. ä

by Micha ⌂, Tuesday, March 13, 2018, 15:18 (2236 days ago) @ Auge

Hi,

Exactly as you proposed and did.

For my forum at derletztekick.com, utf8 doesn't work. This may be a misconfiguration there, but I believe it is better to set this value to utf8.

I applied your idea in my forum, found no side effects, and was afterwards a bit astonished when I saw the value utf-8 for locale_charset in german.lang in the repository.

I added the new paragraphs to the language files and checked them on another platform. Here, I must set UTF8 to get the right date string. Maybe a better fix is to change the date format, i.e. from xx. März to xx.03.

Case closed?

I believe it is not the final solution... but yes.

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

No problems with special character e.g. ä

by Auge ⌂, Tuesday, March 13, 2018, 15:41 (2236 days ago) @ Micha

Hello

For my forum at derletztekick.com, utf8 doesn't work. This may be a misconfiguration there, but I believe it is better to set this value to utf8.

Seeing the block in the repo, it is perplexing.

charset =                         utf-8
locale =                          de_DE.utf8
//...
locale_charset =                  utf-8

charset has the value utf-8, the locale writes it as utf8, and you set utf-8 for locale_charset, which does not seem to work in every case (as you wrote).
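
Whether the hyphen or the case matters depends on the library that resolves the name. Purely as a point of comparison, this is how Python's codec registry treats the spellings; it says nothing definite about PHP's iconv or the system's locale database, which may behave differently from platform to platform.

```python
import codecs

# Different spellings of the same charset resolve to one canonical codec name.
spellings = ["utf-8", "UTF-8", "utf8", "UTF8"]
canonical = {codecs.lookup(name).name for name in spellings}

# The Latin-1 aliases resolve to a single codec as well.
latin1_is_alias = codecs.lookup("ISO-8859-1").name == codecs.lookup("latin1").name
```

In Python all four spellings end up as the single name "utf-8", so neither case nor the hyphen makes a difference there.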

Here, I must set UTF8 to get the right date string.

Explicitly in upper case?

Maybe a better fix is to change the date format, i.e. from xx. März to xx.03.

It's only a prevention strategy but it would ... ahem ... prevent the issue. :-)
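
A small Python sketch of why the numeric format sidesteps the problem: the directives used below produce digits only, so no locale-specific month name and therefore no non-ASCII character can appear. The format string is an assumption for illustration, not the forum's actual setting.

```python
from datetime import datetime

posted = datetime(2018, 3, 1, 21, 34)

# %d, %m, %Y, %H and %M yield digits only, so the result is plain ASCII.
short_date = posted.strftime("%d.%m.%Y, %H:%M")

# Every character fits in ASCII, so charset conversion cannot garble it.
is_ascii = all(ord(ch) < 128 for ch in short_date)
```

Here short_date is "01.03.2018, 21:34", which reads the same in UTF-8 and in ISO-8859-1.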

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

No problems with special character e.g. ä

by Micha ⌂, Tuesday, March 13, 2018, 15:49 (2236 days ago) @ Auge

Hi,

which does not seem to work in every case (as you wrote).

At derletztekick.com, only locale_charset = iso-8859-1 works as expected. In any other case, no date is shown.

Here, I must set UTF8 to get the right date string.

Explicitly in upper case?

No, I'm sorry.

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

No problems with special character e.g. ä

by Auge ⌂, Sunday, March 18, 2018, 19:55 (2231 days ago) @ Micha

Hello

Maybe a better fix is to change the date format, i.e. from xx. März to xx.03.

I found only two occurrences of the long date format (month as a name): the posting time in the entry and thread views of postings, and the bookmark listing table. Every other place seems to use the short date format, with the month as a number with a leading zero. I have never seen anybody using any of the supported languages ask to extend the use of month names in place of numbers. So I assume that "killing" the last two occurrences of month names would not harm anybody.

So, long story short, I support your proposal.

Tschö, Auge

PS: Leave the key in the language file. Maybe it's useful in other cases.

--
Trenne niemals Müll, denn er hat nur eine Silbe!

No problems with special character e.g. ä

by Micha ⌂, Sunday, March 18, 2018, 21:11 (2231 days ago) @ Auge

Hi,

So, long story short, I support your proposal.
PS: Leave the key in the language file. Maybe it's useful in other cases.

I add the changes only to the German language file, because this is afaik the only trouble spot, isn't it? I don't remove the key.

regards
Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

No problems with special character e.g. ä

by Auge ⌂, Sunday, March 18, 2018, 21:44 (2231 days ago) @ Micha

Hello

I add the changes only to the German language file, because this is afaik the only trouble spot, isn't it? I don't remove the key.

It's a nice and proper solution. I like it.

I would have used the hammer instead of the scalpel.

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!
