Since version 2.2 TurboGears has basic support for Master/Slave load balancing and provides a set of utilities to use it.
TurboGears permits to declare a master server and any number of slave servers, all the writes will automatically redirected to the master node, while the other calls will be dispatched randomly to the slave nodes.
All the queries executed outside of TurboGears controllers will run only on the master node, those include the queries performed by the authentication stack to initially look up an already logged in user, its groups and permissions.
To enable Master Slave load Balancing you just need to edit your model/__init__.py
making the sessionmaker
use the TurboGears BalancedSession:
from tg.configuration.sqla.balanced_session import BalancedSession
maker = sessionmaker(autoflush=True, autocommit=False,
class_=BalancedSession,
extension=ZopeTransactionExtension())
Doing this by itself will suffice to make load balancing work, but still
as there is only the standard database configuration the BalancedSession
will just be redirecting all the queries to the only available serve.
To let load balancing work we must specify at least a master and slave server inside our application configuration. The master server can be specified using the sqlalchemy.master set of options, while any number of slaves can be configured using the sqlalchemy.slaves options:
sqlalchemy.master.url = mysql://username:password@masterhost:port/databasename
sqlalchemy.master.pool_recycle = 3600
sqlalchemy.slaves.slave1.url = mysql://username:password@slavehost:port/databasename
sqlalchemy.slaves.slave1.pool_recycle = 3600
The master node can be configured also to be a slave, this is usually the case when we want the master to also handle some read queries.
TurboGears provides a set of utilities to let you change the default behavior of the load balancer. Those include the @with_engine(engine_name) decorator and the DBSession().using_engine(engine_name) context.
The with_engine
decorator permits to force a controller method to
run on a specific node. It is a great tool for ensuring that some
actions take place on the master node, like controllers that edit
content.
from tg import with_engine
@expose('myproj.templates.about')
@with_engine('master')
def about(self):
DBSession.query(model.User).all()
return dict(page='about')
The previous query will be executed on the master node, if the @with_engine decorator is removed it will get execute on any random slave.
The with_engine
decorator can also be used to force turbogears
to use the master node when some parameters are passed by url:
@expose('myproj.templates.index')
@with_engine(master_params=['m'])
def index(self):
DBSession.query(model.User).all()
return dict(page='index')
In this case calling http://localhost:8080/index will result in queries performed on a slave node, while calling http://localhost:8080/index?m=1 will force the queries to be executed on the master node.
Pay attention that the m=1 parameter can actually have any value, it just has to be there. This is especially useful when redirecting after an action that just created a new item to a page that has to show the new item. Using a parameter specified in master_params we can force TurboGears to fetch the items from the master node so to avoid odd results due to data propagation delay.
By default parameters specified in with_engine
master_params will be
popped from the controller params. This is to avoid messing with validators
or controller code that doesn’t expect the parameter to exist.
If the controller actually needs to access the parameter a dictionary can be passed to @with_engine instead of a list. The dictionary keys will be the parameters, while the value will be if to pop it from the parameters or not.
@expose('myproj.templates.index')
@with_engine(master_params={'m':False})
def index(self, m=None):
DBSession.query(model.User).all()
return dict(page='index', m=m)
Single queries can be forced to execute on a specific node using the
using_engine
method of the BalancedSession
. This method
returns a context manager, until queries are executed inside this
context they are run on the constrained engine:
with DBSession().using_engine('master'):
DBSession.query(model.User).all()
DBSession.query(model.Permission).all()
DBSession.query(model.Group).all()
In the previous example the Users and the Permissions will be fetched from the master node, while the Groups will be fetched from a random slave node.
Setting the root logger of your application to DEBUG will let
you see which node has been choose by the BalancedSession
to perform a specific query.