PEP: 311
Title: Simplified Global Interpreter Lock Acquisition for Extensions
Version: $Revision$
Last-Modified: $Date$
Author: Mark Hammond <mhammond@skippinet.com.au>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 05-Feb-2003
Post-History: 05-Feb-2003, 14-Feb-2003, 19-Apr-2003


Abstract
========

This PEP proposes a simplified API for access to the Global
Interpreter Lock (GIL) for Python extension modules.
Specifically, it provides a solution for authors of complex
multi-threaded extensions, where the current state of Python
(i.e., the state of the GIL is unknown.

This PEP proposes a new API, for platforms built with threading
support, to manage the Python thread state.  An implementation
strategy is proposed, along with an initial, platform independent
implementation.


Rationale
=========

The current Python interpreter state API is suitable for simple,
single-threaded extensions, but quickly becomes incredibly complex
for non-trivial, multi-threaded extensions.

Currently Python provides two mechanisms for dealing with the GIL:

- ``Py_BEGIN_ALLOW_THREADS`` and ``Py_END_ALLOW_THREADS`` macros.
  These macros are provided primarily to allow a simple Python
  extension that already owns the GIL to temporarily release it
  while making an "external" (ie, non-Python), generally
  expensive, call.  Any existing Python threads that are blocked
  waiting for the GIL are then free to run.  While this is fine
  for extensions making calls from Python into the outside world,
  it is no help for extensions that need to make calls into Python
  when the thread state is unknown.

- ``PyThreadState`` and ``PyInterpreterState`` APIs.
  These API functions allow an extension/embedded application to
  acquire the GIL, but suffer from a serious boot-strapping
  problem - they require you to know the state of the Python
  interpreter and of the GIL before they can be used.  One
  particular problem is for extension authors that need to deal
  with threads never before seen by Python, but need to call
  Python from this thread.  It is very difficult, delicate and
  error prone to author an extension where these "new" threads
  always know the exact state of the GIL, and therefore can
  reliably interact with this API.

For these reasons, the question of how such extensions should
interact with Python is quickly becoming a FAQ.  The main impetus
for this PEP, a thread on python-dev [1]_, immediately identified
the following projects with this exact issue:

- The win32all extensions
- Boost
- ctypes
- Python-GTK bindings
- Uno
- PyObjC
- Mac toolbox
- PyXPCOM

Currently, there is no reasonable, portable solution to this
problem, forcing each extension author to implement their own
hand-rolled version.  Further, the problem is complex, meaning
many implementations are likely to be incorrect, leading to a
variety of problems that will often manifest simply as "Python has
hung".

While the biggest problem in the existing thread-state API is the
lack of the ability to query the current state of the lock, it is
felt that a more complete, simplified solution should be offered
to extension authors.  Such a solution should encourage authors to
provide error-free, complex extension modules that take full
advantage of Python's threading mechanisms.


Limitations and Exclusions
==========================

This proposal identifies a solution for extension authors with
complex multi-threaded requirements, but that only require a
single "PyInterpreterState".  There is no attempt to cater for
extensions that require multiple interpreter states.  At the time
of writing, no extension has been identified that requires
multiple PyInterpreterStates, and indeed it is not clear if that
facility works correctly in Python itself.

This API will not perform automatic initialization of Python, or
initialize Python for multi-threaded operation.  Extension authors
must continue to call ``Py_Initialize()``, and for multi-threaded
applications, ``PyEval_InitThreads()``.  The reason for this is that
the first thread to call ``PyEval_InitThreads()`` is nominated as the
"main thread" by Python, and so forcing the extension author to
specify the main thread (by requiring them to make this first call)
removes ambiguity.  As ``Py_Initialize()`` must be called before
``PyEval_InitThreads()``, and as both of these functions currently
support being called multiple times, the burden this places on
extension authors is considered reasonable.

It is intended that this API be all that is necessary to acquire
the Python GIL.  Apart from the existing, standard
``Py_BEGIN_ALLOW_THREADS`` and ``Py_END_ALLOW_THREADS`` macros, it is
assumed that no additional thread state API functions will be used
by the extension.  Extensions with such complicated requirements
are free to continue to use the existing thread state API.


Proposal
========

This proposal recommends a new API be added to Python to simplify
the management of the GIL.  This API will be available on all
platforms built with ``WITH_THREAD`` defined.

The intent is that assuming Python has correctly been initialized,
an extension author be able to use a small, well-defined "prologue
dance", at any time and on any thread, which will ensure Python
is ready to be used on that thread.  After the extension has
finished with Python, it must also perform an "epilogue dance" to
release any resources previously acquired.  Ideally, these dances
can be expressed in a single line.

Specifically, the following new APIs are proposed::

   /* Ensure that the current thread is ready to call the Python
      C API, regardless of the current state of Python, or of its
      thread lock.  This may be called as many times as desired
      by a thread so long as each call is matched with a call to
      PyGILState_Release().  In general, other thread-state APIs may
      be used between _Ensure() and _Release() calls, so long as the
      thread-state is restored to its previous state before the Release().
      For example, normal use of the Py_BEGIN_ALLOW_THREADS/
      Py_END_ALLOW_THREADS macros are acceptable.

      The return value is an opaque "handle" to the thread state when
      PyGILState_Acquire() was called, and must be passed to
      PyGILState_Release() to ensure Python is left in the same state. Even
      though recursive calls are allowed, these handles can *not* be
      shared - each unique call to PyGILState_Ensure must save the handle
      for its call to PyGILState_Release.

      When the function returns, the current thread will hold the GIL.

      Failure is a fatal error.
   */
   PyAPI_FUNC(PyGILState_STATE) PyGILState_Ensure(void);

   /* Release any resources previously acquired.  After this call, Python's
      state will be the same as it was prior to the corresponding
      PyGILState_Acquire call (but generally this state will be unknown to
      the caller, hence the use of the GILState API.)

      Every call to PyGILState_Ensure must be matched by a call to
      PyGILState_Release on the same thread.
   */
   PyAPI_FUNC(void) PyGILState_Release(PyGILState_STATE);

Common usage will be::

   void SomeCFunction(void)
   {
       /* ensure we hold the lock */
       PyGILState_STATE state = PyGILState_Ensure();
       /* Use the Python API */
       ...
       /* Restore the state of Python */
       PyGILState_Release(state);
   }


Design and Implementation
=========================

The general operation of ``PyGILState_Ensure()`` will be:

- assert Python is initialized.

- Get a ``PyThreadState`` for the current thread, creating and saving
  if necessary.

- remember the current state of the lock (owned/not owned)

- If the current state does not own the GIL, acquire it.

- Increment a counter for how many calls to ``PyGILState_Ensure`` have been
  made on the current thread.

- return

The general operation of ``PyGILState_Release()`` will be:

- assert our thread currently holds the lock.

- If old state indicates lock was previously unlocked, release GIL.

- Decrement the ``PyGILState_Ensure`` counter for the thread.

- If counter == 0:

  - release and delete the ``PyThreadState``.

  - forget the ``ThreadState`` as being owned by the thread.

- return

It is assumed that it is an error if two discrete ``PyThreadStates``
are used for a single thread.  Comments in ``pystate.h`` ("State
unique per thread") support this view, although it is never
directly stated.  Thus, this will require some implementation of
Thread Local Storage.  Fortunately, a platform independent
implementation of Thread Local Storage already exists in the
Python source tree, in the SGI threading port.  This code will be
integrated into the platform independent Python core, but in such
a way that platforms can provide a more optimal implementation if
desired.


Implementation
==============

An implementation of this proposal can be found at
https://bugs.python.org/issue684256


References
==========

.. [1] David Abrahams, Extension modules, Threading, and the GIL
       https://mail.python.org/pipermail/python-dev/2002-December/031424.html


Copyright
=========

This document has been placed in the public domain.



..
  Local Variables:
  mode: indented-text
  indent-tabs-mode: nil
  sentence-end-double-space: t
  fill-column: 70
  End: