[Bf-committers] Proposal for handling string encoding in blender.

Elia Sarti vekoon at gmail.com
Fri Aug 13 12:47:01 CEST 2010


The point is that different systems use different encodings. UTF-8 is 
just one way to encode non-ASCII characters; UTF-16 is another, for 
instance (and there are hundreds of others).

This means that if you save "numéro" in your .blend on an OS using UTF-8 
and someone opens it on one using UTF-16, the string will not decode 
correctly.
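
To make the mismatch concrete, here is a small Python 3 sketch (plain 
Python, nothing Blender-specific; the variable names are only for 
illustration). It shows that the same name produces different bytes under 
each encoding, and that bytes written as UTF-16 do not come back as the 
same name when read as UTF-8:

```python
name = "numéro"

utf8_bytes = name.encode("utf-8")       # é becomes the two bytes c3 a9
utf16_bytes = name.encode("utf-16-le")  # each character here takes two bytes

print(utf8_bytes.hex())
print(utf16_bytes.hex())

# Reading the UTF-16 bytes back as if they were UTF-8 does not give the
# original name; for this string the strict decoder rejects them.
try:
    roundtrip = utf16_bytes.decode("utf-8")
except UnicodeDecodeError:
    roundtrip = None

print(roundtrip == name)
```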

I say +1 to this, with an addendum.
To some extent an encoding can be detected and thus converted. Would it be 
hard to do so for strings in the .blend? Of course only for a limited 
set of encodings; I'd say UTF-8 <-> UTF-16 would probably suffice, as I 
believe many Linux distros use UTF-8 while Windows and Mac use UTF-16, so 
this would cover the majority of cases.
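
As a rough idea of what such a detection could look like, here is a 
minimal Python sketch. The function name guess_decode is made up for this 
example, and the fallback to little-endian UTF-16 is purely an assumption 
on my part, not something Blender does:

```python
import codecs


def guess_decode(raw: bytes) -> str:
    """Hypothetical helper: decode bytes that are either UTF-8 or UTF-16."""
    # A byte-order mark is the only really reliable sign of UTF-16.
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")
    try:
        # Strict UTF-8 decoding rejects most data written in another encoding.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # No BOM and not valid UTF-8: guess little-endian UTF-16 (assumption).
        return raw.decode("utf-16-le", errors="replace")


print(guess_decode("numéro".encode("utf-8")))
print(guess_decode("numéro".encode("utf-16")))
```

This is only a heuristic: a pure-ASCII name stored as UTF-16 without a 
BOM still passes the UTF-8 check (with embedded NUL characters), so a real 
implementation would need more than this.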


Remo Pini wrote, on 08/13/2010 10:56 AM:
> Maybe I'm dense, but why would the letter "é" not be UTF-8 compliant? According to my sources that would be é = c3 a9 (LATIN SMALL LETTER E WITH ACUTE) which is perfectly fine...
>
> So mesh.name = "numéro" should NOT raise an error IMHO if the system is truly UTF-8.
>
> Cheers
>
> Remo
>
>    
>> -----Original Message-----
>> From: bf-committers-bounces at blender.org [mailto:bf-committers-
>> bounces at blender.org] On Behalf Of Campbell Barton
>> Sent: Freitag, 13. August 2010 11:40
>> To: bf-blender developers
>> Subject: [Bf-committers] Proposal for handling string encoding in blender.
>>
>> At the moment we have a problem with decoding strings in blender,
>> caused by blend files not having any encoding information.
>> We have a number of reports about this in the tracker, e.g.
>> https://projects.blender.org/tracker/index.php?func=detail&aid=23285&group_id=9&atid=498
>> This also gave us trouble with models given to us for the durian
>> sprint; I ended up having to manually rename objects so scripts would
>> work.
>>
>> In practice this means the following can raise an error:
>>   fn = bpy.data.filename # the file path may not be utf8 compatible
>>   # the person who made this file may have cyrillic characters
>>   # which blender lets them enter.
>>   print(bpy.context.object.name)
>>
>> If you're not into scripting, this means simple things like importing
>> a file from your home directory can be impossible if your name isn't
>> utf8 compliant, so I don't think this is a problem we can ignore.
>>
>> The stupid/simple solution is not to use strings, just use byte arrays
>> all over; then you never have any encoding problems.
>> Normally I like stupid solutions, but it means every string needs to
>> have a 'b' prefix, e.g. b"Some String", and I think this is too
>> annoying & ugly.
>>
>> We could just enforce one encoding for all blend files, except that, as
>> hinted at earlier, this won't work for people whose filepaths are not
>> utf8 compatible.
>>
>> ---
>>
>> So here's my proposed solution:
>> (in brief: strings: utf8, except for filepaths: fs-native)
>>
>> * Enforce UTF8 for all of blender's internal strings. This can be
>> handled at the UI & python level so that you are not allowed to set
>> utf8-incompatible strings.
>>   - This means that if you enter a non-utf8 compatible character in an
>> object name it will reject the name.
>>   - If you try to do: mesh.name = "numéro" # an error will be raised.
>>
>> * Filenames can't have this limitation imposed because blender needs
>> to be able to reference paths on the user's system, which we have no
>> control over. However, we have a FILENAME type in RNA, so we can exempt
>> these strings from the utf8 check; instead these need to follow the
>> filesystem's encoding.
>>   - Python can handle this with - Py_FileSystemDefaultEncoding
>>   - This means the string encoding for a file path and an object name
>> for instance may differ.
>>
>> The flaw in this solution is that someone may create a blend file with
>> an image in //numéro/foo.png, then give it to someone else who can
>> open the file, but gets a python error when they try to export it
>> as an OBJ.
>>
>> I think this is an acceptable limitation. We can just tell users that
>> if they want to share their projects they should use ascii filenames.
>> People already need to use relative paths if they share projects; in
>> that case the name of their home directory won't matter.
>> It's a lot better than the current state, which stops people from
>> exporting a file to their own home directory (under certain
>> conditions).
>>
>> If this is ok I can go ahead with this before the next release. It's
>> not really all that much work, but since this limits mesh/object/bone
>> names and the string input field, it's not just the python api that's
>> affected.
>>
>> --
>> - Campbell
>> _______________________________________________
>> Bf-committers mailing list
>> Bf-committers at blender.org
>> http://lists.blender.org/mailman/listinfo/bf-committers
>>      
>
>    
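
For reference, the split in the quoted proposal (UTF-8 for names, the 
filesystem encoding for paths) might look roughly like this at the Python 
level. This is only a sketch: check_name, path_to_str and path_to_bytes 
are hypothetical helpers, not part of bpy, and os.fsdecode/os.fsencode 
assume Python 3.2 or later:

```python
import os
import sys


def check_name(raw: bytes) -> str:
    """Hypothetical check: accept a name only if its bytes are valid UTF-8."""
    return raw.decode("utf-8")  # raises UnicodeDecodeError otherwise


def path_to_str(raw: bytes) -> str:
    """Decode a path with the filesystem encoding instead of forcing UTF-8."""
    return os.fsdecode(raw)


def path_to_bytes(path: str) -> bytes:
    """Encode a path back to bytes with the filesystem encoding."""
    return os.fsencode(path)


print(sys.getfilesystemencoding())
print(check_name(b"num\xc3\xa9ro"))
print(path_to_bytes(path_to_str(b"foo.png")))
```

With this split an object name and a file path may end up using 
different encodings, which is exactly the trade-off described in the 
proposal.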

