[Bf-committers] Proposal for handling string encoding in blender.

Campbell Barton ideasman42 at gmail.com
Fri Aug 13 11:39:48 CEST 2010


At the moment we have a problem with decoding strings in blender
which is caused by blend files not having any encoding information.
We have a number of reports about this in the tracker - eg
https://projects.blender.org/tracker/index.php?func=detail&aid=23285&group_id=9&atid=498.
This also gave us trouble for models given to us for the durian
sprint, I ended up having to manually rename objects so scripts would
work.

In practice this means the following can raise an error:
 fn = bpy.data.filename  # the file path may not be utf8 compatible
 # the person who made this file may have used cyrillic characters,
 # which blender lets them enter.
 print(bpy.context.object.name)
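
To make the failure concrete, here is a minimal sketch in plain Python
(no bpy needed). It assumes the name was typed on a system using a
Latin-1 style encoding and stored as raw bytes; raw_name and
decode_name are made up for illustration and are not blender API.

```python
# Hypothetical stand-in for the bytes of an object name stored in a blend file.
# "numéro" encoded as Latin-1: the é is the single byte 0xE9.
raw_name = "numéro".encode("latin-1")


def decode_name(raw):
    """Decode bytes as strict UTF-8, returning None when they are not valid."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


print(decode_name(b"Cube"))    # plain ASCII is valid UTF-8
print(decode_name(raw_name))   # 0xE9 starts a multi-byte sequence that is never completed
```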

If you're not into scripting this means simple things like importing a
file from your home directory can be impossible if your name isn't utf8
compliant, so I don't think this is a problem we can ignore.

The stupid/simple solution is not to use strings, just use byte arrays
all over - then you never have any encoding problems.
Normally I like stupid solutions but it means every string needs to
have a 'b' prefix. eg:  b"Some String", and I think this is too
annoying & ugly.
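
A rough illustration of why bytes everywhere gets awkward in Python 3.
This is only a sketch in plain Python, the names are invented and
nothing here is blender API:

```python
# Hypothetical object names as they would look if the API returned bytes.
names = [b"Cube", b"Lamp", b"Camera"]

# Forgetting the b prefix fails silently: bytes never compare equal to str.
found_str = "Cube" in names
found_bytes = b"Cube" in names

# Building a message needs an explicit decode (or bytes on both sides).
label = "Object: " + names[0].decode("ascii")
print(found_str, found_bytes, label)
```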

We could just enforce one encoding for all blend files, except as
hinted at earlier this won't work for people whose filepaths are not
utf8 compatible.

---

So here's my proposed solution:
(in brief.  strings: utf8, except for filepaths: fs-native)

* Enforce UTF8 for all of blender's internal strings, this can be
handled at the UI & python level so that you are not allowed to set
utf8-incompatible strings.
 - This means that if you enter a non-utf8 compatible character in an
object name it will reject the name.
 - If you try to do: mesh.name = "numéro" # an error will be raised.

* Filenames can't have this limitation imposed because blender needs
to be able to reference paths on the user's system which we have no
control over. However we have a FILENAME type in RNA, so we can exempt
these strings from the utf8 check; instead these need to follow the
filesystem's encoding.
 - Python can handle this with - Py_FileSystemDefaultEncoding
 - This means the string encoding for a file path and an object name
for instance may differ.
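
A minimal sketch of the two rules above in plain Python. The helper
names check_name, path_from_bytes and path_to_bytes are hypothetical
and not part of the blender API; os.fsdecode and os.fsencode are the
standard library wrappers around the filesystem encoding (Python 3.2
or later).

```python
import os


def check_name(raw):
    """Accept an ID name only if its bytes are valid UTF-8, else raise ValueError."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("name is not valid UTF-8: %r" % raw) from exc


def path_from_bytes(raw):
    """Decode a file path with the filesystem encoding instead of strict UTF-8."""
    return os.fsdecode(raw)


def path_to_bytes(path):
    """Encode a path back to the bytes the operating system expects."""
    return os.fsencode(path)


# ASCII paths round-trip whatever the filesystem encoding is.
raw_path = b"/tmp/foo.png"
assert path_to_bytes(path_from_bytes(raw_path)) == raw_path
```

The point of keeping two code paths is that a name is rejected up
front, while a path is passed through untouched even if it is not utf8.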

The flaw in this solution is that someone may create a blend file with
an image in //numéro/foo.png, then give it to someone else who can
open the file, but gets a python error when they try to export it as
an OBJ.
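
This can be reproduced outside blender. The sketch below assumes the
receiving system uses a UTF-8 filesystem encoding with Python's
surrogateescape error handler (the usual POSIX setup); the encoding is
spelled out explicitly instead of calling os.fsdecode so the result
does not depend on the machine running it. can_write_as_utf8 is a
made-up helper standing in for an exporter writing a text file.

```python
# Path as saved on the author's system, where é is the Latin-1 byte 0xE9.
raw_path = "//numéro/foo.png".encode("latin-1")

# Opening the file: undecodable bytes are smuggled through as lone surrogates,
# so decoding succeeds and the original bytes can be recovered exactly.
path = raw_path.decode("utf-8", "surrogateescape")
assert path.encode("utf-8", "surrogateescape") == raw_path


def can_write_as_utf8(text):
    """Return True if text can be written to a strict UTF-8 file."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


print(can_write_as_utf8("//numero/foo.png"))  # ASCII path is fine
print(can_write_as_utf8(path))                # lone surrogate cannot be encoded
```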

I think this is an acceptable limitation, we can just tell users that
if they want to share their projects they should use ascii filenames.
People already need to use relative paths if they share projects, in
that case the name of their home directory won't matter.
It's a lot better than the current state which stops people from
exporting a file to their own home directory (under certain
conditions).

If this is ok I can go ahead with this before the next release. It's
not really all that much work, but since this limits mesh/object/bone
names and the string input field, it's not just the python api that's
affected.

-- 
- Campbell

