Discussion:
question about convert_unicode
Vasily Sulatskov
2006-04-17 09:47:30 UTC
Permalink
Hello

I have a question about the convert_unicode engine option.

The documentation says:

convert_unicode=False : if set to True, all String/character based types will
convert Unicode values to raw byte values going into the database, and all
raw byte values to Python Unicode coming out in result sets. This is an
engine-wide method to provide unicode across the board. For unicode
conversion on a column-by-column level, use the Unicode column type
instead.

But when convert_unicode is set to True, it converts Unicode objects to strings
and leaves String objects unchanged, which can lead to problems.

Here is a simple example:
# -*- coding: cp1251 -*-

import sqlalchemy

db = sqlalchemy.create_engine('sqlite://', echo=True, echo_uow=False,
                              convert_unicode=True)

# a table to store companies
companies = sqlalchemy.Table('companies', db,
    sqlalchemy.Column('company_id', sqlalchemy.Integer, primary_key=True),
    sqlalchemy.Column('name', sqlalchemy.String(50)))

class Company(object):
    pass

sqlalchemy.assign_mapper(Company, companies)

companies.create()

# Company(name=u'Some text in cp1251 encoding')
# This line works perfectly: the unicode object is automatically encoded to
# utf8 before going to the database
Company(name=u'Какой-то текст в кодировке cp1251')

# This line still works fine:
# it goes to the database as is, i.e. as a string, and when decoded
# it is valid utf8 that can be converted to unicode without
# problems
Company(name='Some text in ascii')

# And this line causes problems:
# it goes to the database as is, i.e. as a string
Company(name='Какой-то текст в кодировке cp1251')

sqlalchemy.objectstore.commit()

sqlalchemy.objectstore.clear()


c = Company.get(1)
print type(c.name)


c = Company.get(2)
# Now we get something funny. We specified name as a string during
# object creation and get it out of database as Unicode.
print type(c.name)

# And this line will crash the interpreter because sqlalchemy tries to convert
# the name to Unicode as if it were utf8, which it is not. It is still in
# cp1251 encoding
c2 = Company.get(3)


So is this the intended behaviour for sqlalchemy, or is it a bug?

In my opinion it's a bug, and the behaviour should be changed to something
like this:
1. If the object is unicode, convert it to the engine-specified encoding (like
utf8), as happens now.
2. If it's a string, convert it to unicode using another specified
encoding (it should be added to the engine parameters). This encoding specifies
the client-side encoding. It's often handy to have different encodings in the
database and on client machines (at least for people with "alternate languages" :-)

If this is indeed a problem with sqlalchemy and not with my expectations of what
sqlalchemy should be, then I can perhaps make those changes to sqlalchemy.
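
The failure in the third case can be reproduced without a database. The following is a minimal sketch written for Python 3, where the old `str` corresponds to `bytes` and the old `unicode` corresponds to `str`. The variable names are illustrative only and are not part of SQLAlchemy.

```python
# Byte strings as a cp1251-encoded source file would produce them.
text = u'Какой-то текст в кодировке cp1251'
cp1251_bytes = text.encode('cp1251')   # what a plain (non-u'') literal holds
ascii_bytes = b'Some text in ascii'

# ASCII bytes are also valid UTF-8, so decoding them on the way out works.
assert ascii_bytes.decode('utf-8') == u'Some text in ascii'

# cp1251 bytes stored as-is are not valid UTF-8, so decoding them fails.
try:
    cp1251_bytes.decode('utf-8')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(decode_failed)
```

The same bytes decode cleanly once the right encoding is named, which is the information the engine does not have for plain strings.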


-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid0944&bid$1720&dat1642
Michael Bayer
2006-04-17 13:24:42 UTC
Permalink
Post by Vasily Sulatskov
In my opinion that's a bug and that behaviour should be changed to something
1. If object is unicode then convert it to engine specified
encoding (like
utf8) as it happens now
2. If it's a string then convert it to unicode using some another specifed
encoding (it should be added to engine parameters). This encoding specifies
client-side encoding. It's often handy to have different encodings in database
and on client machines (at least for people with "alternate
languages" :-)
there already is an encoding parameter for the engine.

http://www.sqlalchemy.org/docs/dbengine.myt#database_options

does that solve your problem ?





Vasily Sulatskov
2006-04-17 15:11:49 UTC
Permalink
Hello Michael,

I know there's a database engine parameter "encoding". It tells
sqlalchemy in which encoding Unicode objects should be saved to
database.

I suggest adding another encoding, let's say "client_encoding", which
will be used when convert_unicode is True and the user assigns a string
object to an object attribute. Currently, even if convert_unicode is set
to True, strings go to the database as-is, bypassing conversion to unicode.

This option will allow assigning strings in national/platform-specific
encodings, like cp1251, straight to object attributes, and they
will be properly converted to the database encoding (engine.encoding).


See, the encoding on the client machine may be different from the encoding in
the database. You can see the changes that I suggest in the attached diff.

The suggested changes can make life for users of
multilingual/multi-encoding environments a little easier while not
affecting all other users of SQLAlchemy.
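
The proposed conversion can be sketched roughly as follows. This is a Python 3 illustration (`bytes` stands in for the old `str`), and `to_db_bytes` is a hypothetical helper written for this example, not a function from SQLAlchemy or from the attached diff.

```python
def to_db_bytes(value, encoding='utf-8', client_encoding='cp1251'):
    """Hypothetical helper: normalise a bind value to bytes in the DB encoding."""
    if value is None:
        return None
    if isinstance(value, bytes):
        # A byte string is assumed to be in the client encoding: decode it first.
        value = value.decode(client_encoding)
    # At this point the value is unicode text; encode it for the database.
    return value.encode(encoding)


from_bytes = to_db_bytes(u'текст'.encode('cp1251'))
from_text = to_db_bytes(u'текст')
print(from_bytes == from_text)
```

Both kinds of input end up as the same bytes in the database encoding, so reading them back with `engine.encoding` no longer depends on how the value was assigned.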
--
Best regards,
Vasily mailto:***@mail.ru
Michael Bayer
2006-04-17 18:46:27 UTC
Permalink
OK, i think im getting this now...i think between the myghty list and here
one can begin to see my lack of unicode awareness...

so basically, since your python file has the -*- coding attribute, you
dont really have to put the u'' around strings that contain multibyte
characters, since the multibyte encoding is implicit throughout the file.
so the client_encoding pretty much is designed to match up with a python
script that has a -*- declaration, is that accurate ?

ill add the patch to my list. code says it all for me ....
Vasily Sulatskov
2006-04-18 08:54:24 UTC
Permalink
Hello Michael,

Here's a perhaps better version of that patch. I added a default value,
client_encoding='ascii'. For people using the ascii encoding it will make no
difference, but for people with 8-bit encodings it will produce an error if they
try to pass a regular string to sqlalchemy when convert_unicode=True.
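
The effect of an 'ascii' default can be illustrated with a small sketch. It is written for Python 3 (`bytes` plays the role of the old `str`), and `decode_client_string` is a hypothetical name used only here, not part of the patch.

```python
def decode_client_string(raw, client_encoding='ascii'):
    """Hypothetical helper: decode a client byte string, 'ascii' by default."""
    return raw.decode(client_encoding)


# Pure ASCII input is unaffected by the default.
ok = decode_client_string(b'Some text in ascii')

# An 8-bit string fails loudly instead of silently reaching the database.
try:
    decode_client_string(u'текст'.encode('cp1251'))
    raised = False
except UnicodeDecodeError:
    raised = True

# Naming the real client encoding makes the same input work.
fixed = decode_client_string(u'текст'.encode('cp1251'), client_encoding='cp1251')
print(ok, raised, fixed)
```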
Post by Michael Bayer
OK, i think im getting this now...i think between the myghty list and here
one can begin to see my lack of unicode awareness...
so basically, since your python file has the -*- coding attribute, you
dont really have to put the u'' around strings that contain multibyte
characters, since the multibyte encoding is implicit throughout the file.
so the client_encoding pretty much is designed to match up with a python
script that has a -*- declaration, is that accurate ?
ill add the patch to my list. code says it all for me ....
Not exactly. See this example:

# -*- coding: cp1251 -*-
s1 = 'текст в кодировке cp1251'
s2 = unicode('текст в кодировке cp1251', 'cp1251')
s3 = u'текст в кодировке cp1251'

print 's1:', type(s1)
# prints: s1: <type 'str'>

print 's2:', type(s2)
# prints: s2: <type 'unicode'>

print 's3:', type(s3)
# prints: s3: <type 'unicode'>

s1 is a regular python string, i.e. a sequence of bytes. It cannot be
converted to unicode or to another encoding without knowing its encoding.

s2 is a unicode object. It is converted to unicode from a regular string in
the constructor because we specified the correct encoding.

s3 is a unicode object. Creating s3 is just a shorthand for s2.
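
For comparison, here is a sketch of the same three values in Python 3, where the byte string type is `bytes` and the unicode type is `str`. The last two lines are an added illustration of why the encoding of s1 has to be known, with koi8-r chosen arbitrarily as a wrong guess.

```python
s1 = u'текст в кодировке cp1251'.encode('cp1251')  # bytes, like the old 'str'
s2 = str(s1, 'cp1251')                             # like unicode(s1, 'cp1251')
s3 = u'текст в кодировке cp1251'                   # unicode literal

print('s1:', type(s1).__name__)
print('s2:', type(s2).__name__)
print('s3:', type(s3).__name__)

# Decoding s1 with the wrong encoding does not fail here, it just yields
# different (wrong) text.
wrong = s1.decode('koi8-r')
print(wrong == s3)
```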

If you are interested, there is a good article about unicode. It's modestly
named "The Absolute Minimum Every Software Developer Absolutely, Positively
Must Know About Unicode and Character Sets (No Excuses!)" :-)
http://www.joelonsoftware.com/articles/Unicode.html
_______________________________________________
Sqlalchemy-users mailing list
https://lists.sourceforge.net/lists/listinfo/sqlalchemy-users
Michael Bayer
2006-04-18 16:53:50 UTC
Permalink
Post by Vasily Sulatskov
If you are interested, there is a good article about unicode, it's modestly
named "
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know
About Unicode and Character Sets (No Excuses!)" :-)
http://www.joelonsoftware.com/articles/Unicode.html
yes im familiar....that article is actually why I understand any of this
at all ! :)


Qvx
2006-04-19 06:54:45 UTC
Permalink
Hello Vasily,

I'm also one of the unfortunate ones who have to use encodings other than ascii.
I'm sure that your patch helps, but I'm not sure that this is the "right way".

The thing that I learned from dealing with unicode and string encodings
is: always use unicode. What I mean is, when you write your source:
* make all your data (variables, literals) unicode
* put the -*- coding: -*- directive so that the interpreter knows how to convert
your u"" strings

Those two rules lead to the following:

# -*- coding: cp1251 -*-

import sqlalchemy

# note that there is no convert_unicode flag, but there is encoding flag
db = sqlalchemy.create_engine('sqlite://', encoding='cp1251')

# note a change in type of "name" column from String to Unicode
companies = sqlalchemy.Table('companies', db,
    sqlalchemy.Column('company_id', sqlalchemy.Integer, primary_key=True),
    sqlalchemy.Column('name', sqlalchemy.Unicode(50)))

# ....

# OK, unicode
Company(name=u'Какой-то текст в кодировке cp1251')

# Avoid plain strings
Company(name='Some text in ascii')


This becomes a necessity if you have, for example, more than one database driver
using a different encoding. You get back unicode strings which you can combine
and copy from one database to another without worrying.

db1 = sqlalchemy.create_engine('mysql://', encoding='latin2')
db2 = sqlalchemy.create_engine('oracle://', encoding='windows-1250')

ob1 = db1_mapper.select(...)
ob2 = db2_mapper.select(...)

ob1.name = ob1.name + ob2.name # All unicode, no problems
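
Why unicode makes this safe can be shown without any database. The sketch below is Python 3 and uses made-up sample values. The raw byte strings stand in for what two drivers with different encodings might return.

```python
# Hypothetical raw values from two databases with different encodings.
name1_raw = u'Šibenik'.encode('latin2')        # as stored in the first database
name2_raw = u'Žminj'.encode('windows-1250')    # as stored in the second database

# Decode each value with its own database encoding, then combine as unicode.
name1 = name1_raw.decode('latin2')
name2 = name2_raw.decode('windows-1250')
combined = name1 + u' ' + name2

# Concatenating the raw bytes instead mixes two encodings in one string,
# so no single encoding decodes it back to the intended text.
mixed = name1_raw + b' ' + name2_raw
print(combined, mixed.decode('latin2') == combined)
```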
Qvx
2006-04-19 10:11:51 UTC
Permalink
I didn't look at your patch. I just gave a few general observations.

I'm not sure that I would set 'ascii' as the default value. I would set it to
None (meaning "avoid using it" or "inherit the value from the encoding param").

I guess that a flag called "client_encoding" could make things work more
explicitly in SA if you *must* use plain strings instead of unicode. But
after looking at types.py I'm not sure that the String class is correct, and
adding client_encoding into the mix makes it even more obscure. Although it
has the potential to actually make it better.

My observations of types.py by looking at code:

Unicode:
- good
- unicode on client side (bind params and column values),
- explicit conversion to encoded string when talking to engine

String:
- strange beast
- it can be unicode as well as string on client side (bind params and
column values) depending on convert_unicode param
- it uses both unicode and strings when talking to engine depending on
convert_unicode param
- or, in other words: pass unchanged data (be it unicode or string) if
there is no convert_unicode param

Your additions could make it into a better thing if done differently:

String:
- string on client side (bind params and column values), no unicode in
sight
- talk to database in expected encoding
- use encoding / client_encoding pair to do conversions between client /
db side
- remove convert_unicode param (If you want to use unicode there is
Unicode class)

I'm not sure what else would break, or what other use case I'm breaking with
this proposal, but the current String (with or without your additions)
leaves a bad taste in my mouth.

I would do it like this (not tested):

Index: lib/sqlalchemy/types.py
===================================================================
--- lib/sqlalchemy/types.py (revision 1294)
+++ lib/sqlalchemy/types.py (working copy)
@@ -96,15 +96,24 @@
     def get_constructor_args(self):
         return {'length':self.length}
     def convert_bind_param(self, value, engine):
-        if not engine.convert_unicode or value is None or not isinstance(value, unicode):
+        if value is None:
+            return None
+        elif isinstance(value, unicode):
+            return value.encode(engine.encoding)
+            # or even raise exception (but I wouldn't go that far)
+        elif engine.client_encoding != engine.encoding:
+            return unicode(value, engine.client_encoding).encode(engine.encoding)
+        else:
             return value
+    def convert_result_value(self, value, engine):
+        if value is None:
+            return None
+        elif isinstance(value, unicode):
+            return value.encode(engine.client_encoding)
+        elif engine.client_encoding != engine.encoding:
+            return unicode(value, engine.encoding).encode(engine.client_encoding)
         else:
-            return value.encode(engine.encoding)
-    def convert_result_value(self, value, engine):
-        if not engine.convert_unicode or value is None or isinstance(value, unicode):
             return value
-        else:
-            return value.decode(engine.encoding)
     def adapt_args(self):
         if self.length is None:
             return TEXT()
Index: lib/sqlalchemy/engine.py
===================================================================
--- lib/sqlalchemy/engine.py (revision 1294)
+++ lib/sqlalchemy/engine.py (working copy)
@@ -227,7 +227,7 @@
     SQLEngines are constructed via the create_engine() function inside this package.
     """

-    def __init__(self, pool=None, echo=False, logger=None, default_ordering=False, echo_pool=False, echo_uow=False, convert_unicode=False, encoding='utf-8', **params):
+    def __init__(self, pool=None, echo=False, logger=None, default_ordering=False, echo_pool=False, echo_uow=False, encoding='utf-8', client_encoding=None, **params):
         """constructs a new SQLEngine. SQLEngines should be constructed via the create_engine()
         function which will construct the appropriate subclass of SQLEngine."""
         # get a handle on the connection pool via the connect arguments
@@ -246,8 +246,8 @@
         self.default_ordering=default_ordering
         self.echo = echo
         self.echo_uow = echo_uow
-        self.convert_unicode = convert_unicode
         self.encoding = encoding
+        self.client_encoding = client_encoding or encoding
         self.context = util.ThreadLocal()
         self._ischema = None
         self._figure_paramstyle()
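
The logic of the untested diff above can be tried out in isolation. What follows is a rough Python 3 transcription (`bytes` for the old `str`, `str` for the old `unicode`). The `Engine` class is a stand-in that models only the two encoding attributes and is an assumption of this sketch, not the real SQLEngine.

```python
class Engine(object):
    """Stand-in for the engine: only the two encoding attributes are modelled."""
    def __init__(self, encoding='utf-8', client_encoding=None):
        self.encoding = encoding
        self.client_encoding = client_encoding or encoding


def convert_bind_param(value, engine):
    # Client side -> database: always hand over bytes in engine.encoding.
    if value is None:
        return None
    if isinstance(value, str):
        return value.encode(engine.encoding)
    if engine.client_encoding != engine.encoding:
        return value.decode(engine.client_encoding).encode(engine.encoding)
    return value


def convert_result_value(value, engine):
    # Database -> client side: always hand back bytes in engine.client_encoding.
    if value is None:
        return None
    if isinstance(value, str):
        return value.encode(engine.client_encoding)
    if engine.client_encoding != engine.encoding:
        return value.decode(engine.encoding).encode(engine.client_encoding)
    return value


engine = Engine(encoding='utf-8', client_encoding='cp1251')
client_bytes = u'текст'.encode('cp1251')
db_bytes = convert_bind_param(client_bytes, engine)
round_trip = convert_result_value(db_bytes, engine)
print(round_trip == client_bytes)
```

With this pair the client only ever sees byte strings in its own encoding, which is the "no unicode in sight" behaviour described for String above.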


Kind regards,
Tvrtko
Post by Vasily Sulatskov
Hello Qvx,
Well, perhaps you are right. But let's then define what the "right way" is.
The second version of the patch that I submitted included a default value of
"ascii" for the new engine parameter "client_encoding". It works in the
following way: if the user specifies convert_unicode=True and doesn't specify
client_encoding, it will be ascii, and the new types.String will try to convert
regular strings to unicode using the specified client_encoding. If it is unable
to convert to unicode, it will produce an exception during construction of the
unicode object.
That guarantees that any string going to the database will get converted to the
proper encoding. But I don't say that it's the best or even the "right way".
I also think that the more strictly you enforce unicode usage the better, but
unfortunately there are many places in python where regular strings are used
(like the str() function etc.), so for some time we have to live with regular
strings.
How do you think it should work in sqlalchemy?
Vasily Sulatskov
2006-04-19 10:46:29 UTC
Permalink
Hello Qvx,

As far as I understand, your proposal will always convert String objects to
regular python strings when receiving data from the database. But the
documentation says:

convert_unicode=False : if set to True, all String/character based types will
convert Unicode values to raw byte values going into the database, and all
raw byte values to Python Unicode coming out in result sets. This is an
engine-wide method to provide unicode across the board. For unicode
conversion on a column-by-column level, use the Unicode column type instead.

I understand that paragraph this way: if you set convert_unicode to True, then
all strings will get converted to unicode on their way out of the database. And
Unicode columns are supposed to be used for "column-by-column level" unicode
control.

The patch that I suggested was supposed to work like the documentation says. But
your patch changes that behaviour.

Perhaps the documentation should be changed to describe different behaviour.

There's another aspect of this problem: the table autoload option. I don't know
whether it will generate Unicode type columns or String type columns. If it
generates Unicode type columns then everything is fine, but if not, that will be
bad because you will always get string objects from autoloaded tables.

But again, I am not sure about table autoloading; I just feel some uncertainty
in this area.

So perhaps we can discuss this problem and find a solution that will satisfy all
users of sqlalchemy.
Post by Qvx
I didn't look at your patch. I just gave a few general observations.
I'm not sure that I would set 'ascii' as default value. I would set it to
None (meening "avoid using it" or "inherit value from encoding param").
I guess that flag called "client_encoding" could make things work more
explicitely in SA if you *must* use plain strings instead of unicode. But
after looking at types.py I'm not sure that String class is correct, and
adding client_encoding into the mix makes it even more obscure. Although,
it has a potential of actually making it better.
- good
- unicode on client side (bind params and column values),
- explicit conversion to encoded string when talking to engine
- strange beast
- it can be unicode as well as string on client side (bind params and
column values) depending on convert_unicode param
- it uses both unicode and strings when talking to engine depending on
convert_unicode param
- or, in other words: pass unchanged data (be it unicode or string) if
there is no convert_unicode param
- string on client side (bind params and column values), no unicode in
sight
- talk to database in expected encoding
- use encoding / client_encoding pair to do conversions between client /
db side
- remove convert_unicode param (If you want to use unicode there is
Unicode class)
I'm not sure what else would break, or what other use case I'm braking with
this proposal, but the current String (with or without your additions)
leaves a bad taste in my mouth.
Index: lib/sqlalchemy/types.py
===================================================================
--- lib/sqlalchemy/types.py (revision 1294)
+++ lib/sqlalchemy/types.py (working copy)
@@ -96,15 +96,24 @@
return {'length':self.length}
- if not engine.convert_unicode or value is None or not
+ return None
+ return value.encode(engine.encoding)
+ # or even raise exception (but I wouldn't go that far)
+ return unicode(value, engine.client_encoding).encode(
engine.encoding)
return value
+ return None
+ return value.encode(engine.client_encoding)
+ return unicode(value, engine.encoding).encode(
engine.client_encoding)
- return value.encode(engine.encoding)
- if not engine.convert_unicode or value is None or
return value
- return value.decode(engine.encoding)
return TEXT()
Index: lib/sqlalchemy/engine.py
===================================================================
--- lib/sqlalchemy/engine.py (revision 1294)
+++ lib/sqlalchemy/engine.py (working copy)
@@ -227,7 +227,7 @@
SQLEngines are constructed via the create_engine() function inside
this package.
"""
- def __init__(self, pool=None, echo=False, logger=None,
default_ordering=False, echo_pool=False, echo_uow=False,
+ def __init__(self, pool=None, echo=False, logger=None,
default_ordering=False, echo_pool=False, echo_uow=False, encoding='utf-8',
"""constructs a new SQLEngine. SQLEngines should be constructed
via the create_engine()
function which will construct the appropriate subclass of
SQLEngine."""
# get a handle on the connection pool via the connect arguments
@@ -246,8 +246,8 @@
self.default_ordering=default_ordering
self.echo = echo
self.echo_uow = echo_uow
- self.convert_unicode = convert_unicode
self.encoding = encoding
+ self.client_encoding = client_encoding or encoding
self.context = util.ThreadLocal()
self._ischema = None
self._figure_paramstyle()
Kind regards,
Tvrtko
Hello Qvx,
Well, perhaps you are right. But let's then define what the "right way"
is.
Second version of patch that I submitted included default value "ascii"
for
new engine parameter "client_encoding" it works in the following way: If
user
specifies conver_unicode=True, and doesn't specify client_encoding it
will be
ascii, and new types.Sring will try to convert regular strings to unicode
using specifed client_encoding if it unable to convert to unicode it will
produce exception during construction of unicode object.
That guarantee's that any string going to database will get converted to
proper encoding. But I dont't say that it's the best or even "right way".
I also think that the more strictly you enforce unicode usage the better,
but
unfortunately there are many places in python where regular string is
used (like str() function e.t.c) so for some time we have to live with
regular strings.
What do you think how it should be in sqlalchemy?
Post by Qvx
I'm also the unfortunate one who has to use encodings other than ascii.
I'm sure that your patch helps, but I'm not sure that this is the "right
way".
The thing that I learned from my dealing with unicode and string
encodings:
* make all your data (variables, literals) as unicode
* put the -*- coding: -*- directive so that interpreter knows how to
convert your u"" strings
# -*- coding: cp1251 -*-
import sqlalchemy
# note that there is no convert_unicode flag, but there is encoding flag
db = sqlalchemy.create_engine('sqlite://', encoding='cp1251')
# note a change in type of "name" column from String to Unicode
companies = sqlalchemy.Table('companies', db,
    sqlalchemy.Column('company_id', sqlalchemy.Integer, primary_key=True),
    sqlalchemy.Column('name', sqlalchemy.Unicode(50)))
# ....
# OK, unicode
Company(name=u'Какой-то текст в кодировке cp1251')
# Avoid plain strings
Company(name='Some text in ascii')
This becomes a necessity if you have, for example, more than one database
driver using different encodings. You get back unicode strings which you
can combine and copy from one database to another without worrying.
db1 = sqlalchemy.create_engine('mysql://', encoding='latin2')
db2 = sqlalchemy.create_engine('oracle://', encoding='windows-1250')
ob1 = db1_mapper.select(...)
ob2 = db2_mapper.select(...)
ob1.name = ob1.name + ob2.name # All unicode, no problems
Post by Vasily Sulatskov
Hello Michael,
I know there's a database engine parameter "encoding". It
tells sqlalchemy in which encoding Unicode objects should be
saved to the database.
I suggest adding another encoding, let's say "client_encoding",
which will be used when convert_unicode is True and the user assigns
a string object to an object attribute. Currently, even if
convert_unicode is set to True, strings go to the database as-is,
bypassing conversion to unicode.
This option will allow assigning strings in national/platform-specific
encodings, like cp1251, straight to object attributes, and they will be
properly converted to the database encoding (engine.encoding).
See, the encoding on the client machine may be different from the encoding
in the database. You can see the changes that I suggest in the attached diff.
The suggested changes can make life a little easier for users of
multilingual/multi-encoding environments while not affecting all other
users of SQLAlchemy.
Post by Vasily Sulatskov
In my opinion that's a bug and that behaviour should be changed to
something like:
1. If the object is unicode, then convert it to the engine-specified
encoding (like utf8), as happens now.
2. If it's a string, then convert it to unicode using another specified
encoding (it should be added to the engine parameters). This encoding
specifies the client-side encoding. It's often handy to have different
encodings in the database and on client machines (at least for people
with "alternate languages" :-)
MB> there already is an encoding parameter for the engine.
MB> http://www.sqlalchemy.org/docs/dbengine.myt#database_options
MB> does that solve your problem ?
--
Best regards,
Qvx
2006-04-19 11:33:35 UTC
You are correct. My proposal changes the way String works. I felt that
the way it currently works does not justify the name String, especially when
we have a Unicode class. I felt that Unicode columns should work with unicode
data only, and String columns should work with regular Python strings only.
The documentation correctly describes what happens, but that is, in my
opinion, the wrong way to do it. I never gave much thought to String because
once I saw how it works I just started using Unicode (after my patch about the
encoding param was accepted). But also, after seeing your proposal, I saw the
opportunity to make the String class behave the way I think it should behave.

As for the autoload, I'm not sure what to do. If *I* had to do it I would
return Unicode columns everywhere. A more flexible solution would, I guess,
allow the developer to intervene in some way (via kw param).

It seems to me that plain strings, in general, are used for two main
reasons: lack of proper unicode support and laziness/lack of knowledge. Only
after those two reasons would come all other valid reasons from knowledgeable
developers. I don't give much thought to those other reasons if I can use
unicode. More often than not, I must use strings because of lack of unicode
support, so I'm happy that SA has it. I don't consider myself a unicode
expert; just an unfortunate fellow who has to work with latin2 and
windows-1250 encodings and somehow manage my way through. If there is
somebody else here who knows more about unicode I think now would be the
right time to say something...

Tvrtko

P.S. My previous post was too long so it got rejected.
Vasily Sulatskov
2006-04-20 15:03:24 UTC
Hello Qvx,
Post by Qvx
As for the autoload, I'm not sure what to do. If *I* had to do it I would
return Unicode columns everywhere. A more flexible solution would, I guess,
allow the developer to intervene in some way (via kw param).
I did some testing on the sqlalchemy autoload feature, and it seems that
sqlalchemy assigns string-like columns types like
sqlalchemy.databases.mysql.MSString, and there are no types like
sqlalchemy.databases.mysql.MSUnicode in sqlalchemy. So I think it will
require lots of code changes to make sqlalchemy behave the "right way",
i.e. where string is string and unicode is unicode.

Hence, in the current situation, it's very good that one can specify
convert_unicode=True and, even with autoload=True, still get unicode
objects from the database.

And with the patch that I suggested, sqlalchemy will even protect its
users from the hardest pitfalls, like putting data in an incorrect encoding
into the database. (Sqlalchemy will try to convert the supplied string to
unicode with the ascii codec and raise an exception if the conversion fails,
and it WILL fail if the user is using a national encoding in his strings.)

Until the autoload behaviour is changed, I think it would be better
not to make Strings always behave like strings.


I think that sqlalchemy in a perfect world should behave like this:

The user controls sqlalchemy behaviour with three engine parameters:

1. engine.server_encoding - encoding used for storing data in the database,
defaults to 'ascii'. When I say 'ascii' I actually mean 'ascii' or
some other encoding common to most sqlalchemy users.

2. engine.client_encoding - encoding for client-side strings, i.e.
strings that the user feeds to sqlalchemy or gets from it. Defaults to
'ascii', or some other encoding common to most sqlalchemy users.

3. engine.autoload_unicode, defaults to False - parameter that tells
sqlalchemy whether it should create columns of string type or unicode
type when autoloading tables, or perhaps some other way to hint column
types when autoloading.

String column types always return strings to the user but also accept
unicode objects on assignment (unicode objects can always be converted
to a string of known encoding).

Unicode column types always return unicode objects. They accept only
unicode objects. (Perhaps they should also accept strings and treat
them as strings in the engine.client_encoding encoding.)

For string column types, if engine.client_encoding doesn't match
engine.server_encoding, automatic string encoding conversion takes place.
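A rough sketch of the two column behaviours described above, written in Python 3 terms (bytes stands in for the thread's "string", str for "unicode"). The class names EncodingSettings, StringColumn and UnicodeColumn and their to_db/from_db methods are hypothetical and are not part of sqlalchemy; they only illustrate the proposal.

```python
from dataclasses import dataclass


@dataclass
class EncodingSettings:
    """Hypothetical container for the proposed engine parameters."""
    server_encoding: str = "ascii"
    client_encoding: str = "ascii"


class StringColumn:
    """Sketch of a String type: hands client-encoded byte strings to the user."""

    def __init__(self, settings):
        self.settings = settings

    def to_db(self, value):
        # Accept text as well as client-encoded byte strings.
        if isinstance(value, bytes):
            value = value.decode(self.settings.client_encoding)
        return value.encode(self.settings.server_encoding)

    def from_db(self, raw):
        # Transcode server bytes into client bytes.
        text = raw.decode(self.settings.server_encoding)
        return text.encode(self.settings.client_encoding)


class UnicodeColumn:
    """Sketch of a Unicode type: text in, text out."""

    def __init__(self, settings):
        self.settings = settings

    def to_db(self, value):
        if not isinstance(value, str):
            raise TypeError("Unicode columns accept text only")
        return value.encode(self.settings.server_encoding)

    def from_db(self, raw):
        return raw.decode(self.settings.server_encoding)


settings = EncodingSettings(server_encoding="utf-8", client_encoding="cp1251")
name_as_string = StringColumn(settings)
name_as_unicode = UnicodeColumn(settings)
```

When both encodings are equal, StringColumn degenerates to a validity check: the bytes are decoded and re-encoded with the same codec.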

In that situation most users of sqlalchemy will just happily use the default
parameters.

And unfortunate users of national encodings will set the engine
parameters to something like this:
engine.server_encoding = 'utf8'
engine.client_encoding = 'cp1251'
engine.autoload_unicode = True
or even
engine.server_encoding = 'utf8'
engine.client_encoding = 'cp1251'
engine.autoload_unicode = False

And so, everyone will be happy.
1. Ascii users work as they are used to, not knowing about the horrors of
encodings and unicode.

2. National encoding users work using their marginal encodings
without data loss.

3. Language purists enjoy that string is string and unicode is unicode
:-)

Any thoughts, comments?
Post by Qvx
It seems to me that plain strings, in general, are used for two main
reasons: lack of proper unicode support and laziness/lack of knowledge. Only
after those two reasons would come all other valid reasons from knowledgeable
developers. I don't give much thought to those other reasons if I can use
unicode. More often than not, I must use strings because of lack of unicode
support, so I'm happy that SA has it. I don't consider myself a unicode
expert; just an unfortunate fellow who has to work with latin2 and
windows-1250 encodings and somehow manage my way through. If there is
somebody else here who knows more about unicode I think now would be the
right time to say something...
Same deal. I am not a unicode expert, but I suspect that all people
from countries where non-ascii encodings are used possess innate unicode
knowledge :-)
--
Best regards,
Vasily mailto:***@mail.ru




Michael Bayer
2006-04-20 17:06:37 UTC
a couple of things:

- where's the patch ? I thought I had put it in but it seems not.
let's put in a Trac ticket for it.

- may I suggest that, since this issue is decided completely within
the source code for types.String, various implementations of
String, corresponding to different user preferences with regard to
Unicode treatment, be provided as mods which simply patch themselves
into the types package....that way, whatever type you put there gets
loaded in from an autoload scenario automatically.
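To make the "mod" idea concrete, here is a minimal sketch of replacing a class in a module's namespace at runtime. Everything in it is a stand-in: fake_types is a throwaway module object, not sqlalchemy's types package, and String, UnicodeString and reflect_column are hypothetical names chosen for illustration.

```python
import types as pytypes

# Stand-in for a library's "types" module; not sqlalchemy's real module.
fake_types = pytypes.ModuleType("fake_types")


class String:
    """Default type: returns database values unchanged."""

    def result_value(self, raw):
        return raw


class UnicodeString(String):
    """Replacement type: decodes database bytes to text."""

    def result_value(self, raw):
        return raw.decode("utf-8")


fake_types.String = String


def reflect_column(sql_type):
    # Looks the class up on the module at call time, as reflection code would.
    if sql_type == "VARCHAR":
        return fake_types.String()
    raise ValueError("unsupported type: %s" % sql_type)


before = reflect_column("VARCHAR")

# The "mod": patch the replacement class into the module namespace.
fake_types.String = UnicodeString

after = reflect_column("VARCHAR")
```

Note that this only affects code that looks up the attribute on the module at call time; code that imported the class by name before the patch was applied keeps its reference to the original class.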
Michael Bayer
2006-04-20 18:41:28 UTC
Post by Michael Bayer
- may I suggest, that since this issue is decided completely within
the source code for types.String, that various implementations of
String, corresponding to different user preferences with regards to
Unicode treatment, be provided as mods which simply patch themselves
into the types package....that way, whatever type you put there gets
loaded in from an autoload scenario automatically.
Wow. Great idea. I never even thought of replacing a class in a library I
use at runtime. Thanks a lot
Well I also think people should tell me what they think of
that....it's essentially more monkeypatching. Python makes it easy
to just replace the namespace within a module, which is total heresy
in "proper OO land"....would people prefer I use a more verbose
"dependency injection" approach, such as this thing
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/413268 ? or do we
think "dependency injection" of any kind is just wrong, creating
scattered application flow that's impossible to follow (which I don't
agree with, stack traces say it all usually) ?


Jonathan Ellis
2006-04-23 22:45:22 UTC
Post by Michael Bayer
Post by Michael Bayer
- may I suggest, that since this issue is decided completely within
the source code for types.String, that various implementations of
String, corresponding to different user preferences with regards to
Unicode treatment, be provided as mods which simply patch themselves
into the types package....that way, whatever type you put there gets
loaded in from an autoload scenario automatically.
Wow. Great idea. I never even thought of replacing a class in a library I
use at runtime. Thanks a lot
Well I also think people should tell me what they think of
that....its essentially more monkeypatching.
Well, it comes down to the usual trade-offs, right? How hard is it to
accomplish w/o monkeypatching, and how likely is it to screw someone up who
isn't expecting your patch in a stdlib object?

Unless I'm missing something, the answers here seem to be "not very hard"
and "fairly likely." But, maybe I _am_ missing something. :)

WRT Vasily's three aspects of unicode,
Post by Vasily Sulatskov
1. engine.server_encoding - encoding used for storing data in the database,
defaults to 'ascii'. When I say 'ascii' I actually mean 'ascii' or
some other encoding common to most sqlalchemy users.
This should default to what the db says it is. :)
Post by Vasily Sulatskov
2. engine.client_encoding - encoding for client-side strings, i.e.
strings that the user feeds to sqlalchemy or gets from it. Defaults to
'ascii', or some other encoding common to most sqlalchemy users.
Unnecessary to have a separate SA setting. That's what the locale module is
for; use that instead.
Post by Vasily Sulatskov
3. engine.autoload_unicode, defaults to False - parameter that tells
sqlalchemy whether it should create columns of string type or unicode
type when autoloading tables, or perhaps some other way to hint column
types when autoloading.
If you're not happy with what you set at the engine level, you should just
create an explicit class for that table instead of piling on the autoload
options.

Post by Michael Bayer
Python makes it easy
to just replace the namespace within a module, which is total heresy
in "proper OO land"....would people prefer I use a more verbose
"dependency injection" approach, such as this thing
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/413268 ? or do we
think "dependency injection" of any kind is just wrong, creating
scattered application flow thats impossible to follow (which I dont
agree with, stack traces say it all usually) ?
My gut reaction is that the above cookbook recipe is very ugly and not at
all Pythonic. But then I think the Java equivalents are ugly
solutions-in-search-of-problems too. :)

--
Jonathan Ellis
http://spyced.blogspot.com
Michael Bayer
2006-04-25 01:42:30 UTC
Post by Michael Bayer
Well I also think people should tell me what they think of
that....its essentially more monkeypatching.
Well, it comes down to the usual trade-offs, right? How hard is
it to accomplish w/o monkeypatching, and how likely is it to screw
someone up who isn't expecting your patch in a stdlib object?
Unless I'm missing something, the answers here seem to be "not very
hard" and "fairly likely." But, maybe I _am_ missing something. :)
OK, here is the architectural situation.

We have an object called "Unicode" which is used to represent a
String type in a database table that will do some unicode conversion
on data going in and out of the table.

The system has a set of "engine" APIs designed for different kinds of
databases, and each supports a function "reflecttable" which knows
how to go into the database, query some rows, and return a Table
object which represents the database's returned information about a
specific Table. Each database API, when it sees something like
VARCHAR or whatever, pulls up a new String object and sticks it in
the appropriate column in the Table object.

These guys want to write their own Unicode object, and make it so
that whenever those database libraries do "reflecttable" and see
VARCHAR, they pull up their custom Unicode object, not the default
String object.

so the most braindead way to do this is just to monkeypatch the
"String" with a "Unicode" class.

the more elaborate way is to have the "engine" take some map of
arguments that relates some kind of constant to the Unicode class, or
a callable, or some interface called a TableTypeFactory, what have
you. every option I can think of is either a. some arbitrary
thing, which is messy since we have lots of other places where we
need to plug things in and we'd have an explosion of ***Factory
objects or arbitrary dictionaries of classes and stuff, or b.
something totally genericised like the "Dependency Injection" example,
which also seems pretty overblown.

So how would you approach this ?





Jonathan Ellis
2006-04-26 17:46:40 UTC
Post by Michael Bayer
So how would you approach this ?
Well, it seems clear to me that data coming out of the db should be
interpreted according to what the db says its encoding is. (I said before
that we could allow SA to override what the db says the encoding is and use
something else instead but I can't think of a use case for this. It would
just confuse people more.)

There's no need to re-encode the data to a "client encoding"; just hand out
raw unicode objects, which is usually more useful; if the client wants to
encode to his locale (or another) leave that to him.

For data going into the db, same thing, encode it the way the db expects.

DB drivers may already help with this.
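As a sketch of this unicode-only approach (Python 3 terms, where str plays the role of the unicode object): one converter per database, bound to whatever encoding the database reports. The DatabaseCodec class and its bind/result method names are hypothetical and only illustrate the idea.

```python
class DatabaseCodec:
    """Hypothetical converter bound to the encoding a database reports."""

    def __init__(self, db_encoding):
        self.db_encoding = db_encoding

    def bind(self, text):
        # Data going into the db: encode it the way the db expects.
        return text.encode(self.db_encoding)

    def result(self, raw):
        # Data coming out of the db: hand out text, not re-encoded bytes.
        return raw.decode(self.db_encoding)


# Two databases with different encodings, as in the latin2 / windows-1250
# case mentioned earlier in the thread.
db1 = DatabaseCodec("latin2")
db2 = DatabaseCodec("windows-1250")

text = "šuma"
from_db1 = db1.result(db1.bind(text))

# Copying between databases only ever passes text across the boundary.
copied_to_db2 = db2.bind(from_db1)
```

The application never needs a client encoding in this scheme; if it wants bytes in a particular encoding, it encodes the text itself.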

--
Jonathan Ellis
http://spyced.blogspot.com
Vasily Sulatskov
2006-04-25 17:07:05 UTC
Hello Jonathan,
Post by Jonathan Ellis
Post by Vasily Sulatskov
2. engine.client_encoding - encoding for client side strings, i.e.
string that user feeds to sqlalchemy or gets from it. Defaults to
'ascii', or some other encoding common to most of
sqlalchemy users
Unnecessary to have a separate SA setting. That's what the locale module is
for; use that instead.
Can you be more specific, please?
I can't find anything in the locale module that will help in this case.
There is a sys.setdefaultencoding() function, but I actually never
managed to get it to work...
Post by Jonathan Ellis
Post by Vasily Sulatskov
3. engine.autoload_unicode, defaults to False - parameter that tells
sqlalchemy whether it should create columns of string type or unicode
type when autoloading tables, or perhaps some other way to hint column
types when autoloading.
If you're not happy with what you set at the engine level, you should just
create an explicit class for that table instead of piling on the autoload
options.
If I understand you correctly, you suggest explicit specification of
table columns. Right?
But isn't the autoload option exactly for avoiding explicit column
specification?
--
Best regards,
Vasily mailto:***@mail.ru




Qvx
2006-04-25 17:50:06 UTC
Post by Vasily Sulatskov
Hello Jonathan,
Post by Jonathan Ellis
Post by Vasily Sulatskov
2. engine.client_encoding - encoding for client side strings, i.e.
string that user feeds to sqlalchemy or gets from it. Defaults to
'ascii', or some other encoding common to most of
sqlalchemy users
Unnecessary to have a separate SA setting. That's what the locale
module is for; use that instead.
Can you be more specific, please?
I can't find anything in the locale module that will help in this case.
There is a sys.setdefaultencoding() function, but I actually never
managed to get it to work...
I used to use setdefaultencoding, but now I consider it dangerous.
Maybe the problem was that I was using it from sitecustomize.py (how else
can you use it :) and it affected all programs. Later, when I learned better
(to place it inside my app dir), I had already stopped using sitecustomize.

I'm also interested in how the locale module can help.
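For what it is worth, one possible reading of the locale suggestion (an assumption, since it is not spelled out in the thread) is that the client encoding could be taken from locale.getpreferredencoding() rather than from a separate engine setting. A minimal sketch, with the helper name guess_client_encoding being hypothetical:

```python
import codecs
import locale


def guess_client_encoding(default="ascii"):
    """Hypothetical helper: derive a client encoding from the locale settings."""
    # Passing False avoids changing the process-wide locale as a side effect.
    name = locale.getpreferredencoding(False)
    try:
        # Normalise to the codec's canonical name and check that it exists.
        return codecs.lookup(name).name
    except LookupError:
        return default


client_encoding = guess_client_encoding()
```

The result depends entirely on how the machine is configured, so it describes the user's environment and not necessarily the encoding of the data the program handles.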

Tvrtko
Jonathan Ellis
2006-04-26 17:48:24 UTC
Post by Vasily Sulatskov
If I understand you correctly, you suggest explicit specification of
table columns. Right?
But isn't the autoload option exactly for avoiding explicit column
specification?
Sure, but obviously there are limits to how much you can expect SA to
"guess" for you. Instead of a forest of hint options, I'd rather say: if
you don't like the defaults, you need to create explicit models yourself.

--
Jonathan Ellis
http://spyced.blogspot.com
Michael Bayer
2006-04-26 18:28:31 UTC
Post by Vasily Sulatskov
If I understand you correctly, you suggest explicit specification of
table columns. Right?
But isn't the autoload option exactly for avoiding explicit column
specification?
Sure, but obviously there are limits to how much you can expect SA
to "guess" for you. Instead of a forest of hint options, I'd
rather say: if you don't like the defaults, you need to create
explicit models yourself.
+1 ! i might add "no forests of hint options" to the core
philosophy. maybe "bring me a shrubbery of options, but not a forest !"
Michael Bayer
2006-04-26 19:39:51 UTC
Post by Michael Bayer
+1 ! i might add "no forests of hint options" to the core
philosophy. maybe "bring me a shrubbery of options, but not a forest !"
Kind of a dictionary of types for mapping database columns to
mapper types and back?
I actually prefer Ellis' unicode-only for the API — which makes
Post by Michael Bayer
MetaData(types=dict(VARCHAR=types.Unicode)) # kinda like this
yeah +1 on that too (as opposed to monkeypatch), (and I'd put it in
the Dialect), since it uses a dictionary of types like that anyway.
regardless of unicode approach, you still should have ways to use
whatever Types you want when coming from reflected values or generic
types specified in tables.
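A sketch of what such a per-dialect type dictionary could look like. The Dialect class, its types argument and the type_for method below are hypothetical stand-ins written for illustration, not sqlalchemy's actual API.

```python
class String:
    """Stand-in for a plain string column type."""


class Unicode(String):
    """Stand-in for a unicode-converting column type."""


# Default mapping from database type names to column type classes.
DEFAULT_TYPES = {"VARCHAR": String, "CHAR": String}


class Dialect:
    """Hypothetical dialect holding the SQL-name-to-type map."""

    def __init__(self, types=None):
        # Copy the defaults so per-dialect overrides stay local.
        self.types = dict(DEFAULT_TYPES)
        self.types.update(types or {})

    def type_for(self, sql_name):
        # Reflection would call this for each column it discovers.
        return self.types[sql_name]()


default_dialect = Dialect()
unicode_dialect = Dialect(types={"VARCHAR": Unicode})
```

Compared with monkeypatching, the override is scoped to one dialect instance, so other code in the same process that uses the default mapping is unaffected.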
