Discussion:
[sqlalchemy] idle in transaction
Ed Rahn
2015-02-14 03:13:24 UTC
Permalink
I have several programs that are Multi Process and long running, they
open up 30 or so connections and do selects periodically.
For a select query, the default behaviour is to begin a transaction if
not currently in one, but not commit afterwards. This leaves a large
number of postgresql processes with a status of "idle in transaction".
Which means they "lock" tables, so you can not drop them or alter them
including add and drop indexes. I have also seen some problems were
connections do not get closed if the connecting process exits, although
I haven’t verified this is the cause.

Is this a problem others have had in the past or am I just being overly
worried?

If it is a problem is there any other way to fix it beside commit()'ing
after each select query? I tried add an event handler after queries get
run, but I didn't see a way to get to the current transaction.

thanks
Ed
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+***@googlegroups.com.
To post to this group, send email to ***@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
Michael Bayer
2015-02-14 04:30:19 UTC
Permalink
I have several programs that are Multi Process and long running, they open
up 30 or so connections and do selects periodically. For a select query,
the default behaviour is to begin a transaction if not currently in one,
but not commit afterwards. This leaves a large number of postgresql
processes with a status of "idle in transaction". Which means they “lock”
tables, so you can not drop them or alter them including add and drop
indexes. I have also seen some problems were connections do not get closed
if the connecting process exits, although I haven’t verified this is the
cause.
Is this a problem others have had in the past or am I just being overly
worried?
yeah that’s kind of bad. you want the connections to be “idle”, but
definitely not “idle in transaction”. that will cause problems.
If it is a problem is there any other way to fix it beside commit()’ing
after each select query? I tried add an event handler after queries get
run, but I didn't see a way to get to the current transaction.
Well at least on the connection itself, if its used in non-autocommit mode
(by which I refer to psycopg2’s autocommit flag), the “idle in transaction”
will remain until either commit() or rollback() is called on that
connection.

So if you have a Session, and want to stay at that level, your options are
to commit() it, to rollback() it, to close() it which returns the connection
to the pool which does a connection-level rollback, or you could use the
Session in autocommit=True mode, which means after each query it returns the
connection to the pool for the same effect.

If you want to turn off transactions completely with the DBAPI connection
itself, even though this overhead is very minimal for Postgresql you could
set it to isolation level of AUTOCOMMIT which for psycopg2 sets the
“autocommit” flag on the connection. The commit()/ rollback() calls from
SQLAlchemy would have no effect.
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+***@googlegroups.com.
To post to this group, send email to ***@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
Ed Rahn
2015-02-15 02:58:21 UTC
Permalink
Post by Michael Bayer
I have several programs that are Multi Process and long running, they open
up 30 or so connections and do selects periodically. For a select query,
the default behaviour is to begin a transaction if not currently in one,
but not commit afterwards. This leaves a large number of postgresql
processes with a status of "idle in transaction". Which means they “lock”
tables, so you can not drop them or alter them including add and drop
indexes. I have also seen some problems were connections do not get closed
if the connecting process exits, although I haven’t verified this is the
cause.
Is this a problem others have had in the past or am I just being overly
worried?
yeah that’s kind of bad. you want the connections to be “idle”, but
definitely not “idle in transaction”. that will cause problems.
If it is a problem is there any other way to fix it beside commit()’ing
after each select query? I tried add an event handler after queries get
run, but I didn't see a way to get to the current transaction.
Well at least on the connection itself, if its used in non-autocommit mode
(by which I refer to psycopg2’s autocommit flag), the “idle in transaction”
will remain until either commit() or rollback() is called on that
connection.
So if you have a Session, and want to stay at that level, your options are
to commit() it, to rollback() it, to close() it which returns the connection
to the pool which does a connection-level rollback, or you could use the
Session in autocommit=True mode, which means after each query it returns the
connection to the pool for the same effect.
I'd really like to not use autocommit mode. There are parts of the code
that I need to maintain DB consistency with transactions.
And I need to keep the objects attached to a session so automatically
closing it isn't an option.
So I guess my only option is to commit after each select, which seems
like a lot of work as the code base is fairly large.

This seems like a fairly common use case, do people just not care about
it or how do they handle it?
Post by Michael Bayer
If you want to turn off transactions completely with the DBAPI connection
itself, even though this overhead is very minimal for Postgresql you could
set it to isolation level of AUTOCOMMIT which for psycopg2 sets the
“autocommit” flag on the connection. The commit()/ rollback() calls from
SQLAlchemy would have no effect.
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+***@googlegroups.com.
To post to this group, send email to ***@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
Michael Bayer
2015-02-15 21:09:58 UTC
Permalink
Post by Michael Bayer
I have several programs that are Multi Process and long running, they open
up 30 or so connections and do selects periodically. For a select query,
the default behaviour is to begin a transaction if not currently in one,
but not commit afterwards. This leaves a large number of postgresql
processes with a status of "idle in transaction". Which means they “lock”
tables, so you can not drop them or alter them including add and drop
indexes. I have also seen some problems were connections do not get closed
if the connecting process exits, although I haven’t verified this is the
cause.
Is this a problem others have had in the past or am I just being overly
worried?
yeah that’s kind of bad. you want the connections to be “idle”, but
definitely not “idle in transaction”. that will cause problems.
If it is a problem is there any other way to fix it beside commit()’ing
after each select query? I tried add an event handler after queries get
run, but I didn't see a way to get to the current transaction.
Well at least on the connection itself, if its used in non-autocommit mode
(by which I refer to psycopg2’s autocommit flag), the “idle in transaction”
will remain until either commit() or rollback() is called on that
connection.
So if you have a Session, and want to stay at that level, your options are
to commit() it, to rollback() it, to close() it which returns the connection
to the pool which does a connection-level rollback, or you could use the
Session in autocommit=True mode, which means after each query it returns the
connection to the pool for the same effect.
I'd really like to not use autocommit mode. There are parts of the code that I need to maintain DB consistency with transactions.
And I need to keep the objects attached to a session so automatically closing it isn't an option.
So I guess my only option is to commit after each select, which seems like a lot of work as the code base is fairly large.
This seems like a fairly common use case, do people just not care about it or how do they handle it?
Usually the application is written such that the start/end of how a Session
is used is just in one place in the app. Web applications can do this, and
also approaches like using decorators or custom context managers may be
used. Often, people just use the built in context manager of the Session,
such as:

with session.transaction:
# do stuff


I talk about this a lot in this section:
http://docs.sqlalchemy.org/en/rel_0_9/orm/session_basics.html#when-do-i-construct-a-session-when-do-i-commit-it-and-when-do-i-close-it
at the end is an example of how to build a context manager to do what you
need.
Post by Michael Bayer
If you want to turn off transactions completely with the DBAPI connection
itself, even though this overhead is very minimal for Postgresql you could
set it to isolation level of AUTOCOMMIT which for psycopg2 sets the
“autocommit” flag on the connection. The commit()/ rollback() calls from
SQLAlchemy would have no effect.
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+***@googlegroups.com.
To post to this group, send email to ***@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
Jonathan Vanasco
2015-02-15 22:21:10 UTC
Permalink
Post by Ed Rahn
This seems like a fairly common use case, do people just not care about
it or how do they handle it?
This isn't common, and looks like an anti-pattern.

Consider these lines from your emails - they're not really compatible with
one another :

* I have several programs that are Multi Process and long running... they
open up 30 or so connections and do selects periodically.
* the default behaviour is to begin a transaction if not currently in one,
but not commit afterwards.
* I need to maintain DB consistency with transactions

The best way to maintain DB consistency with transactions is to keep the
transactions/units-of-work as small as possible. If you have long-running
transactions across multiple processes... you're creating a situation that
is conducive to integrity errors or locking tables -- and even creating
deadlocks.

You may be better off just pulling all the data on startup - or
periodically - and treating it like a cache. if you have any write
operations, use fresh selects -- not the cached data.
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+***@googlegroups.com.
To post to this group, send email to ***@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
Jonathan Vanasco
2015-02-14 15:13:14 UTC
Permalink
Just to add to Michael's reply -- keep in mind that how you handle the
session (commit/rollback/close) will impact the state of any objects the
session loaded if you're using the ORM and that can have implications in
your app.

The default behavior of sessions is to have `expire_on_commit=True` - so if
there are any writes and you `commit`, that will force a database resync
when you access the data again. If you didn't eagerload necessary data,
accessing unloaded attributes after a `commit` will again lock tables in
the new transaction. If you call `close` instead of commit, the object
state will become detached from the session and unloaded attributes will
raise an error (unless you merge the object into a new session).
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+***@googlegroups.com.
To post to this group, send email to ***@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
Continue reading on narkive:
Loading...