6.1. Sessions

Sessions are objects linked to an authenticated user. The Session.new_cnx method returns a new Connection linked to that session.

6.2. Connections

Connections provide the .execute method to query the data sources, along with .commit and .rollback methods for transaction management.

6.2.1. Kinds of connections

There are two kinds of connections.

  • normal connections are the most common: they are related to users and carry security checks coming with user credentials
  • internal connections have full privileges; they are used only in the few situations where you do not already have an adequate session at hand, such as user authentication or data synchronisation in multi-source contexts

Normal connections are typically named _cw in most appobjects or sometimes just session.

Internal connections are available from the Repository object and are to be used like this:

with self.repo.internal_cnx() as cnx:
    do_stuff_with(cnx)
    cnx.commit()

Connections should always be used as context managers, to avoid leaks.

6.2.1.1. Python/RQL API

The Python API developed to interface with RQL is inspired by the standard db-api, but since execute returns its results directly, there is no cursor concept.

execute(rqlstring, args=None, build_descr=True)
rqlstring: the RQL query to execute (unicode)
args: if the query contains substitutions, a dictionary containing the values to use

The Connection object provides the commit and rollback methods. You should never need to call them while developing the web interface of a CubicWeb application, since the framework decides how to end the transaction depending on whether query execution succeeded. They are however useful in other contexts such as tests or custom controllers.

Note

If a query generates an error related to security (Unauthorized) or to integrity (ValidationError), the transaction can still continue but you will not be able to commit it: a rollback is necessary before a new transaction can start.

Also, a rollback is automatically done if an error occurs during commit.
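
The rule described in this note can be pictured with a toy model. This is a minimal sketch and not CubicWeb code: ToyConnection, ToyValidationError and the "negative values are invalid" constraint are hypothetical stand-ins that only mimic the behaviour described above (an error marks the transaction as uncommittable until a rollback is done).

```python
class ToyValidationError(Exception):
    """Hypothetical stand-in for an integrity error raised by a query."""


class ToyConnection(object):
    """Toy model of the transaction rule (not the CubicWeb class)."""

    def __init__(self):
        self.committed = []         # values made persistent by commit()
        self.pending = []           # values written in the current transaction
        self.uncommittable = False  # set once a query has failed

    def execute(self, value):
        # Assumption for the sketch: negative values violate a constraint.
        if value < 0:
            self.uncommittable = True
            raise ToyValidationError('negative value: %r' % value)
        self.pending.append(value)

    def commit(self):
        if self.uncommittable:
            # an error during commit triggers an automatic rollback
            self.rollback()
            raise ToyValidationError('transaction must be rolled back')
        self.committed.extend(self.pending)
        self.pending = []

    def rollback(self):
        # discard pending work and start a fresh transaction
        self.pending = []
        self.uncommittable = False


def store_all(cnx, values):
    """Write all values in one transaction; roll back on failure."""
    try:
        for value in values:
            cnx.execute(value)
        cnx.commit()
        return True
    except ToyValidationError:
        cnx.rollback()
        return False
```

With this model, a failed batch leaves previously committed data untouched, and the same connection can be reused once the rollback has been done.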

Note

A ValidationError has an entity attribute. In CubicWeb, this attribute is set to the entity’s eid (not a reference to the entity itself).

6.2.1.2. Executing RQL queries from a view or a hook

When you’re within code of the web interface, the Connection is handled by the request object. You should not have to access it directly, but use the execute method directly available on the request, eg:

rset = self._cw.execute(rqlstring, kwargs)

On the server side (eg in hooks), by contrast, there is no request object (since you’re directly inside the data-server), so you’ll have to use the execute method of the Connection object.

6.2.1.3. Proper usage of .execute

Let’s say you want to get T which is in configuration C, this translates to:

self._cw.execute('Any T WHERE T in_conf C, C eid %s' % entity.eid)

But it must be written in a syntax that will benefit from the use of a cache on the RQL server side:

self._cw.execute('Any T WHERE T in_conf C, C eid %(x)s', {'x': entity.eid})

The syntax tree is built once for the “generic” RQL and can be re-used with a number of different eids. The rql IN operator is an exception to this rule.

self._cw.execute('Any T WHERE T in_conf C, C name IN (%s)'
                 % ','.join('"%s"' % name for name in ['foo', 'bar']))
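
The caching benefit of the %(x)s form can be illustrated with a toy cache keyed on the query string. This is only a sketch under assumptions: ToyQueryCache and its parse counter are hypothetical and do not reflect how the RQL server actually stores its syntax trees; the point is simply that a constant query string is parsed once, while interpolated strings differ for every eid.

```python
class ToyQueryCache(object):
    """Toy cache keyed on the query string (hypothetical, for illustration)."""

    def __init__(self):
        self.trees = {}
        self.parse_count = 0

    def syntax_tree(self, rql, args=None):
        # Parse only when this exact string has not been seen before;
        # args are deliberately not part of the cache key.
        if rql not in self.trees:
            self.parse_count += 1
            self.trees[rql] = ('parsed', rql)  # placeholder for a real tree
        return self.trees[rql]


def run_interpolated(cache, eids):
    """Each eid produces a different string, hence a new parse."""
    for eid in eids:
        cache.syntax_tree('Any T WHERE T in_conf C, C eid %s' % eid)


def run_parametrized(cache, eids):
    """The string stays constant; the eid travels in the args dictionary."""
    for eid in eids:
        cache.syntax_tree('Any T WHERE T in_conf C, C eid %(x)s', {'x': eid})
```

Running both helpers over the same three eids gives three parses for the interpolated form and a single parse for the parametrized one.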

Alternatively, some of the common data related to an entity can be obtained from the entity.related() method (which is used under the hood by the ORM when you use attribute access notation on an entity to get a relation). The initial request would then be translated to:

entity.related('in_conf', 'object')

Additionally this benefits from the fetch_attrs policy (see Loaded attributes and default sorting management) optionally defined on the class element, which says which attributes must be also loaded when the entity is loaded through the ORM.

6.2.1.4. The ResultSet API

ResultSet instances are very commonly manipulated objects. They have a rich API, as seen below, but a few methods are particularly useful in day-to-day practice:

  • __str__() (applied by print) gives a very useful overview of both the underlying RQL expression and the data inside; indispensable for debugging purposes
  • printable_rql() returns a well formed RQL expression as a string; it is very useful to build views
  • entities() returns a generator on all entities of the result set
  • get_entity(row, col) gets the entity at row, col coordinates; one of the most used result set methods
class cubicweb.rset.ResultSet(results, rql, args=None, description=None, rqlst=None)

A result set wraps an RQL query result. This object partially implements the list protocol to allow direct use as a list of result rows.

Parameters:
  • rowcount (int) – number of rows in the result
  • rows (list) – list of rows of result
  • description (list) – result’s description, using the same structure as the result itself
  • rql (str or unicode) – the original RQL query string
column_types(*args, **kwargs)

return the list of different types in the column with the given col

Parameters: col (int) – the index of the desired column
Return type: list
Returns: the different entity types found in the column
complete_entity(row, col=0, skip_bytes=True)

shortcut to get a completed entity instance for a particular row (all the instance’s attributes have been fetched)

description_struct(*args, **kwargs)

return a list describing sequences of results with the same description, e.g.:

  • [[0, 4, ('Bug',)]]
  • [[0, 4, ('Bug',)], [5, 8, ('Story',)]]
  • [[0, 3, ('Project', 'Version',)]]

entities(col=0)

iterate over the entities whose eid is in the col column of the result set

filtered_rset(filtercb, col=0)

filter the result set according to a given filtercb

Parameters:
  • filtercb (callable(entity)) – a callable which should take an entity as argument and return False if it should be skipped, else True
  • col (int) – the column index
Return type: ResultSet

get_entity(*args, **kwargs)

convenience method for query retrieving a single entity, returns a partially initialized Entity instance.

Warning

Due to the cache wrapping this function, you should NEVER give row as a named parameter (i.e. rset.get_entity(0, 1) is OK but rset.get_entity(row=0, col=1) isn’t)

Parameters: row, col (int, int) – row and col numbers localizing the entity among the result’s table
Returns: the partially initialized Entity instance
iter_rows_with_entities()

iterate over rows; in each row, eids are converted to entities

limit(limit, offset=0, inplace=False)

limit the result set to the given number of rows, optionally starting from an index other than 0

Parameters:
  • limit (int) – the maximum number of results
  • offset (int) – the offset index
  • inplace (bool) – if true, the result set is modified in place, else a new result set is returned and the original is left unmodified
Return type: ResultSet

limited_rql()

returns a printable rql for the result set associated to the object, with limit/offset correctly set according to maximum page size and currently displayed page when necessary

one(col=0)

Retrieve exactly one entity from the query.

If the result set is empty, raises NoResultError. If the result set has more than one row, raises MultipleResultsError.

Parameters: col (int) – the column localising the entity in the unique row
Returns: the partially initialized Entity instance
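
The contract of one() can be sketched on a plain list of rows. This is an illustration only: the function below works on raw rows rather than on a ResultSet, returns the raw cell instead of an entity, and the two exception classes are local stand-ins defined for the example.

```python
class ToyNoResultError(Exception):
    """Local stand-in for the error raised on an empty result."""


class ToyMultipleResultsError(Exception):
    """Local stand-in for the error raised when several rows are found."""


def one(rows, col=0):
    """Return the cell at column col of the unique row, or raise."""
    if not rows:
        raise ToyNoResultError('no row in the result')
    if len(rows) > 1:
        raise ToyMultipleResultsError('%d rows in the result' % len(rows))
    return rows[0][col]
```

Such a strict accessor is convenient when a query is expected to match exactly one entity, since both the "missing" and the "ambiguous" cases fail loudly instead of silently picking a row.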
printable_rql(encoded=<nullobject>)

return the result set’s origin rql as a string, with arguments substituted

related_entity(*args, **kwargs)

given a cell of the result set, try to return an (entity, relation name) tuple to which this cell is linked.

This is especially useful when the cell is an attribute of an entity, to get the entity to which this attribute belongs.

searched_text(*args, **kwargs)

returns the searched text in case of full-text search

Returns: searched text or None if the query is not a full-text query
sorted_rset(keyfunc, reverse=False, col=0)

sorts the result set according to a given keyfunc

Parameters:
  • keyfunc (callable(entity)) – a callable which should take an entity as argument and return the value used to compare and sort
  • reverse (bool) – if the result should be reversed
  • col (int) – the column index. If col = -1, the whole row is used
Return type: ResultSet

split_rset(keyfunc=None, col=0, return_dict=False)

splits the result set into multiple result sets according to a given key

Parameters:
  • keyfunc (callable(entity or FinalType)) – a callable which should take a value of the rset as argument and return the value used to group it. If not defined, the raw value of the specified column is used.
  • col (int) – the column index. If col = -1, the whole row is used
  • return_dict (Boolean) – if true, the function returns a mapping (key -> rset) instead of a list of rsets
Return type: list of ResultSet or mapping of ResultSet
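
The grouping performed by split_rset can be sketched on plain rows. This is a simplified illustration, not the library code: split_rows is a hypothetical helper that groups lists of raw values, whereas the real method builds ResultSet objects and passes entities to keyfunc for entity columns.

```python
def split_rows(rows, keyfunc=None, col=0, return_dict=False):
    """Group rows by the key computed from column col (or the whole row)."""
    groups = {}
    order = []  # remember the order in which keys first appear
    for row in rows:
        # col = -1 means "use the whole row"; a tuple keeps it hashable
        value = tuple(row) if col == -1 else row[col]
        key = keyfunc(value) if keyfunc is not None else value
        if key not in groups:
            groups[key] = []
            order.append(key)
        groups[key].append(row)
    if return_dict:
        return groups
    return [groups[key] for key in order]
```

For instance, splitting rows of (eid, type name) on column 1 yields one group per type, which is the usual way to render heterogeneous results in separate blocks.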

syntax_tree(*args, **kwargs)

return the syntax tree (rql.stmts.Union) for the originating query. You can expect it to have solutions computed and it will be properly annotated.

transformed_rset(transformcb)

transforms the result set by applying transformcb to each row and its type description

Parameters:
  • transformcb – a callable which should take a row and its type description as parameters, and return the transformed row and type description.
  • col (int) – the column index
Return type: ResultSet

6.2.2. Authentication and management of sessions

The authentication process is a ballet involving a few dancers:

  • through its get_session method the top-level application object (the CubicWebPublisher) will open a session whenever a web request comes in; it asks the session manager to open a session (giving the web request object as context) using open_session
    • the session manager asks its authentication manager (which is a component) to authenticate the request (using authenticate)
      • the authentication manager asks each of its authentication information retrievers, in order, for a login and an opaque object containing other credential elements (calling authentication_information), giving the request object each time
        • the default retriever (named LoginPasswordRetriever) will in turn defer login and password fetching to the request object (which, depending on the authentication mode (cookie or http), will do the appropriate things and return a login and a password)
      • the authentication manager, on success, asks the Repository object to connect with the found credentials (using connect)
        • the repository object asks every source that supports the CWUser entity to authenticate the given credentials; when successful it can build the cwuser entity, from which a regular Session object is made; it returns the session id
          • the source in turn will delegate work to an authentifier class that defines the ultimate authenticate method (for instance the native source will query the database against the provided credentials)
      • the authentication manager, on success, will call back _all_ retrievers with authenticated and return its authentication data (on failure, it will try the anonymous login or, if the configuration forbids it, raise an AuthenticationError)
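
The sequence above can be sketched with toy objects. Everything here is hypothetical and simplified: the classes only borrow the method names used in this section (authentication_information, authenticated, connect), the request is a plain dictionary, the repository checks a password table, and the anonymous-login fallback is left out.

```python
class ToyAuthenticationError(Exception):
    """Hypothetical stand-in for AuthenticationError."""


class ToyNoAuthInfo(Exception):
    """Raised by a retriever that finds nothing usable in the request."""


class ToyRetriever(object):
    """Base retriever: reads one request key and records callbacks."""
    order = 0
    key = None

    def __init__(self):
        self.notified = []

    def authentication_information(self, req):
        login = req.get(self.key)
        if not login:
            raise ToyNoAuthInfo()
        return login, self.credentials(req)

    def credentials(self, req):
        return {}

    def authenticated(self, retriever, req, session, login, authinfo):
        # called on every retriever once a session has been opened
        self.notified.append((login, retriever is self))


class HeaderRetriever(ToyRetriever):
    order = 0
    key = 'x-foo-user'

    def credentials(self, req):
        return {'foo': True}  # pre-authenticated by some middleware


class FormRetriever(ToyRetriever):
    order = 10
    key = 'login'

    def credentials(self, req):
        return {'password': req.get('password')}


class ToyRepository(object):
    def __init__(self, passwords):
        self.passwords = passwords  # login -> password

    def connect(self, login, **authinfo):
        if login not in self.passwords:
            raise ToyAuthenticationError(login)
        trusted = authinfo.get('foo')
        if not trusted and authinfo.get('password') != self.passwords[login]:
            raise ToyAuthenticationError(login)
        return 'session-for-%s' % login


def authenticate(req, retrievers, repo):
    """Try retrievers in order, connect, then call back all retrievers."""
    ordered = sorted(retrievers, key=lambda retriever: retriever.order)
    for retriever in ordered:
        try:
            login, authinfo = retriever.authentication_information(req)
            session = repo.connect(login, **authinfo)
        except (ToyNoAuthInfo, ToyAuthenticationError):
            continue  # let the next retriever have a go
        for other in ordered:
            other.authenticated(retriever, req, session, login, authinfo)
        return session, login
    raise ToyAuthenticationError('no retriever could authenticate the request')
```

The order attribute decides which retriever is consulted first, which is why a plugin that must take precedence over login/password retrieval uses a low value.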

6.2.3. Writing authentication plugins

Sometimes CubicWeb’s out-of-the-box authentication schemes (cookie and http) are not sufficient. Nowadays there is a plethora of such schemes and the framework cannot provide them all, but as the sequence above shows, it is extensible.

Two levels have to be considered when writing an authentication plugin: the web client and the repository.

We invented a scenario where it makes sense to have a new plugin on each side: some middleware will do pre-authentication and, under the right circumstances, add a new HTTP x-foo-user header to the request before it reaches the CubicWeb instance. For a concrete example of this, see the trustedauth cube.

6.2.3.1. Repository authentication plugins

On the repository side, it is possible to register a source authentifier using the following kind of code:

from cubicweb import AuthenticationError
from cubicweb.server.sources import native

class FooAuthentifier(native.LoginPasswordAuthentifier):
    """ a source authentifier plugin
    if 'foo' in authentication information, no need to check
    password
    """
    auth_rql = 'Any X WHERE X is CWUser, X login %(login)s'

    def authenticate(self, session, login, **kwargs):
        """return CWUser eid for the given login
        if this account is defined in this source,
        else raise `AuthenticationError`
        """
        session.debug('authentication by %s', self.__class__.__name__)
        if 'foo' not in kwargs:
            # not pre-authenticated: fall back to the login/password check
            return super(FooAuthentifier, self).authenticate(session, login, **kwargs)
        try:
            rset = session.execute(self.auth_rql, {'login': login})
            return rset[0][0]
        except Exception as exc:
            # also reached when the result set is empty (IndexError)
            session.debug('authentication failure (%s)', exc)
        raise AuthenticationError('foo user is unknown to us')

Since repository authentifiers are not appobjects, we have to register them through a server_startup hook.

from cubicweb.server import hook

class ServerStartupHook(hook.Hook):
    """ register the foo authenticator """
    __regid__ = 'fooauthenticatorregisterer'
    events = ('server_startup',)

    def __call__(self):
        self.debug('registering foo authentifier')
        self.repo.system_source.add_authentifier(FooAuthentifier())

6.2.3.2. Web authentication plugins

from cubicweb.web.views import authentication

class XFooUserRetriever(authentication.LoginPasswordRetriever):
    """ authenticate by the x-foo-user http header
    or just do normal login/password authentication
    """
    __regid__ = 'x-foo-user'
    order = 0

    def authentication_information(self, req):
        """retrieve authentication information from the given request, raise
        NoAuthInfo if expected information is not found
        """
        self.debug('web authenticator building auth info')
        try:
            login = req.get_header('x-foo-user')
            if login:
                return login, {'foo': True}
            else:
                # no header: fall back to regular login/password retrieval
                return super(XFooUserRetriever, self).authentication_information(req)
        except Exception as exc:
            self.debug('web authenticator failed (%s)', exc)
        raise authentication.NoAuthInfo()

    def authenticated(self, retriever, req, cnx, login, authinfo):
        """callback called when the returned authentication information has
        opened a repository connection successfully. Take care: req has no
        session attached yet, hence req.execute isn't available.

        Here we set a flag on the request to indicate that the user is
        foo-authenticated. Can be used by a selector
        """
        self.debug('web authenticator running post authentication callback')
        cnx.foo_user = authinfo.get('foo')

In the authenticated method we add (in an admittedly slightly hackish way) an attribute to the connection object. This, in turn, can be used to build a selector dispatching on whether or not the user was preauthenticated.

@objectify_selector
def foo_authenticated(cls, req, rset=None, **kwargs):
    if hasattr(req.cnx, 'foo_user') and req.cnx.foo_user:
        return 1
    return 0

6.2.3.3. Full Session and Connection API

class cubicweb.server.session.Session(user, repo, _id=None)

Repository user session

This ties all together:
  • session id,
  • user,
  • other session data.
class cubicweb.server.session.Connection(session)

Repository Connection

Holds all connection related data

Database connection resources:

hooks_in_progress, boolean flag telling if the executing query is coming from a repoapi connection or is a query from within the repository (e.g. started by hooks)

cnxset, the connections set to use to execute queries on sources. If the transaction is read only, the connection set may be freed between actual queries. This allows multiple connections with a reasonably low connection set pool size. Control mechanism is detailed below.

set_cnxset(*args, **kwargs)
free_cnxset(*args, **kwargs)

mode, string telling the connections set handling mode, may be one of ‘read’ (connections set may be freed), ‘write’ (some write was done in the connections set, it can’t be freed before end of the transaction), ‘transaction’ (we want to keep the connections set during all the transaction, with or without writing)

Shared data:

data is a dictionary bound to the underlying session, which remains available for the lifetime of the session. This may be useful for web clients that rely on the server for managing bits of session-scoped data.

transaction_data is a dictionary cleared at the end of the transaction. Hooks and operations may put arbitrary data in there.

Internal state:

pending_operations, ordered list of operations to be processed on commit/rollback

commit_state, describing the transaction commit state, may be one of None (not yet committing), ‘precommit’ (calling precommit event on operations), ‘postcommit’ (calling postcommit event on operations), ‘uncommitable’ (some ValidationError or Unauthorized error has been raised during the transaction and so it must be rolled back).

Hooks controls:

hooks_mode, may be either HOOKS_ALLOW_ALL or HOOKS_DENY_ALL.

enabled_hook_cats, when hooks_mode is HOOKS_DENY_ALL, this set contains hooks categories that are enabled.

disabled_hook_cats, when hooks_mode is HOOKS_ALLOW_ALL, this set contains hooks categories that are disabled.
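
How the three attributes combine can be sketched as a small predicate. This is an assumption-laden illustration: the function name, its signature and the constant values below are hypothetical; only the decision rule comes from the descriptions above.

```python
# Hypothetical constant values, chosen for the sketch only.
HOOKS_ALLOW_ALL = 'allow_all'
HOOKS_DENY_ALL = 'deny_all'


def is_hook_category_activated(hooks_mode, category,
                               enabled_hook_cats=(), disabled_hook_cats=()):
    """Tell whether hooks of the given category should run."""
    if hooks_mode == HOOKS_DENY_ALL:
        # everything is denied except explicitly enabled categories
        return category in enabled_hook_cats
    # everything is allowed except explicitly disabled categories
    return category not in disabled_hook_cats
```

In other words, each mode consults only one of the two sets: enabled_hook_cats is a whitelist in deny-all mode and disabled_hook_cats is a blacklist in allow-all mode.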

Security level Management:

read_security and write_security, boolean flags telling if read/write security is currently activated.