Maximilian Roos
2016-01-22 01:43:47 UTC
We're using celery, a job distribution package. On a single machine, there
are 20+ celery workers running, each with their own Python process. We had
some issues with the processes attempting to use the same SQLAlchemy
connections (I think because they are referenced by file descriptors, and
so can cross process boundaries?), and so we're now using a NullPool.
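For context, this is roughly what we're doing now — one engine per worker process with pooling disabled (the DSN here is a stand-in, not our real one):

```python
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# One engine per celery worker process, pooling disabled: NullPool opens
# a fresh DBAPI connection on every checkout and closes it on release,
# so no connection ever outlives a single checkout or crosses a process.
engine = create_engine(
    "sqlite://",  # stand-in DSN for illustration; ours is a real network database
    poolclass=NullPool,
)
```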
But is there a way of doing this more efficiently than a NullPool?
There are two suggestions in the docs
<http://docs.sqlalchemy.org/en/latest/core/pooling.html#using-connection-pools-with-multiprocessing>:
- Dispose the connection as soon as the other process receives it. This
wouldn't work with celery, since the workers are started on their own, not
forked from another Python process.
- Implement a listener that invalidates connections created in another
process. I'm unclear how this works and whether that would be practical in
this context. Specifically, given the number of processes we have running
on each machine, is the pool that a process is selecting & testing from
shared between all processes on that machine? If a pool is shared across 20
processes, the chances that a connection chosen at random was created in
the current process are fairly low...
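To make sure I'm reading the second suggestion right, this is (roughly) the recipe from that docs page, with the engine swapped for an in-memory SQLite one for illustration — a "connect" listener stamps each connection record with the pid that created it, and a "checkout" listener raises DisconnectionError when the record's pid differs from the current one, which makes the pool discard that connection and retry:

```python
import os

from sqlalchemy import create_engine, event, exc

engine = create_engine("sqlite://")  # stand-in engine for illustration


@event.listens_for(engine, "connect")
def connect(dbapi_connection, connection_record):
    # Remember which process created this DBAPI connection.
    connection_record.info["pid"] = os.getpid()


@event.listens_for(engine, "checkout")
def checkout(dbapi_connection, connection_record, connection_proxy):
    pid = os.getpid()
    if connection_record.info["pid"] != pid:
        # Connection was created in a different process (e.g. inherited
        # over a fork); detach it and make the pool try another one.
        connection_record.connection = connection_proxy.connection = None
        raise exc.DisconnectionError(
            "Connection record belongs to pid %s, "
            "attempting to check out in pid %s"
            % (connection_record.info["pid"], pid)
        )
```

If the pool object itself is just per-process Python state, and only connections inherited over a fork ever cross the boundary, then the checkout test would only fire for those inherited connections — is that the right way to think about it?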
It's also possible we're woefully misunderstanding how this works;
corrections appreciated.
Many thanks,
Max
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+***@googlegroups.com.
To post to this group, send email to ***@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.