Discussion:
Postgresql trim-function
Zwieberl
2012-12-13 10:03:42 UTC
Permalink
Hello,

I am very happy with using sqlalchemy 0.8 so far.
But now I stumpled upon a problem I could not solve:

I have strings like 'POINT(-141.1643 32.42112)' and want to extract the
numbers between the brackets.
Note1: I can not use python to manipulate the string, since the string gets
created from a Postgresql-function-call within the select-statement, and
the substring has to be available in the same select.
Note2: the length of the numbers is not constant, therefore a simple
substring()-call is not possible, since I dont know the end-value.

Now, since "POINT(" is always the same I could use
trim(leading "POINT(" from x), as well as
trim(trailing ")" from x)

to get rid of everything outside the brackets.

But how can I use the trim-function in sql-Alchemy?

I tried sqlalchemy.func.trim('leading', '"POINT("' ,x)
-> error: function pg_catalog.btrim(unknown, unknown, text) does not exist

and sqlalchemy.func.trim('leading "POINT(" from ' + x)
-> returns just the whole string 'leading "POINT(" from POINT(32.233 4.42)'

Does anyone know how to use trim in sqlalchemy correctly?

Any help would be appreciated! (Also if you have a different approach
towards extracting the substring from the string)

Thank you all!
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/VepaVySkkGMJ.
To post to this group, send email to ***@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Audrius Kažukauskas
2012-12-13 11:30:52 UTC
Permalink
Post by Zwieberl
I have strings like 'POINT(-141.1643 32.42112)' and want to extract the
numbers between the brackets.
Note1: I can not use python to manipulate the string, since the string gets
created from a Postgresql-function-call within the select-statement, and
the substring has to be available in the same select.
Note2: the length of the numbers is not constant, therefore a simple
substring()-call is not possible, since I dont know the end-value.
How about this:

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

db_engine = create_engine('postgresql://localhost')
Session = sessionmaker(bind=db_engine)
db_session = Session()

Base = declarative_base()

class Point(Base):
__tablename__ = 'points'
id = Column(Integer, primary_key=True)
data = Column(Text)

Base.metadata.create_all(bind=db_engine)
p1 = Point(data=u'POINT(-141.1643 32.42112)')
p2 = Point(data=u'POINT(-42.1337 99.99999)')
db_session.add_all([p1, p2])
db_session.commit()

q = db_session.query(func.regexp_split_to_array(
func.substring(Point.data, r'^POINT\((.*)\)$'), r'\s'))
for row in q:
print row

q = db_session.query(cast(func.regexp_split_to_table(
func.substring(Point.data, r'^POINT\((.*)\)$'), r'\s'), Float))
for row in q:
print row
Post by Zwieberl
Any help would be appreciated! (Also if you have a different approach
towards extracting the substring from the string)
Although I'm not familiar with this stuff, but perhaps this would be
better suited for PostGIS and GeoAlchemy?
--
Audrius KaÅŸukauskas
http://neutrino.lt/
Zwieberl
2012-12-13 11:58:22 UTC
Permalink
Hi Audrius,
Post by Audrius Kažukauskas
q = db_session.query(func.regexp_split_to_array(
func.substring(Point.data, r'^POINT\((.*)\)$'), r'\s'))
print row
q = db_session.query(cast(func.regexp_split_to_table(
func.substring(Point.data, r'^POINT\((.*)\)$'), r'\s'), Float))
print row
That works! Thank you very much!
But somehow I am a little bit worried about the performance. Wouldn't
trim() be much faster than regex? (I need to do this quite a lot of times)
Post by Audrius Kažukauskas
Although I'm not familiar with this stuff, but perhaps this would be
better suited for PostGIS and GeoAlchemy?
Actually, I am using PostGIS+GeoAlchemy2, but due to my data-format there
is no function available which can extract the coordinates from the object.
Only the String-representation, which is of the format shown above (Yes,
this is VERY annoying.)

Cheers
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/KCFzSi_afJwJ.
To post to this group, send email to ***@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Audrius Kažukauskas
2012-12-13 12:16:29 UTC
Permalink
Post by Zwieberl
But somehow I am a little bit worried about the performance. Wouldn't
trim() be much faster than regex? (I need to do this quite a lot of times)
Here's version which uses trim (actually, ltrim and rtrim), but the best
way to tell what's faster is to try both versions with your data:

func.rtrim(func.ltrim(Point.data, 'POINT('), ')')
--
Audrius KaÅŸukauskas
http://neutrino.lt/
Zwieberl
2012-12-13 12:29:00 UTC
Permalink
Post by Audrius Kažukauskas
Here's version which uses trim (actually, ltrim and rtrim), but the best
func.rtrim(func.ltrim(Point.data, 'POINT('), ')')
That is awesome! Thank you very much!
Now I understand why he tried to call btrim() in my faulty version above.
(I wondered about the 'b')

Stuff like this should be part of the documentation.
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/23bPLpfAqTkJ.
To post to this group, send email to ***@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Zwieberl
2012-12-13 13:00:54 UTC
Permalink
but the best way to tell what's faster is to try both versions with your
Just as additional information:

# explain analyze select trim(trailing ')' from trim(leading 'POINT(' from
ST_AsText(geo))) from profiles;
------------------------------------------------------------------------------------------------------------------
(rows=17420)
Total runtime: 150.843 ms

# explain analyze select substring(ST_AsText(geo) from '^POINT\((.*)\)$')
from profiles;
------------------------------------------------------------------------------------------------------------------
(rows=17420)
Total runtime: 381.579 ms

So trim is, at least in pure Postgresql, indeed more than twice as fast. I
doubt that this tendency changes when used from within sqlalchemy.

Thanks again!
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/_D6p1-rUfvkJ.
To post to this group, send email to ***@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Loading...