[Glass] unexpected semantics of #doTransaction: with multithreading (used in WAGemStoneServiceTask)

Johan Brichau johan at yesplan.be
Wed Jan 29 11:33:59 PST 2014

Oh, I did not realise this was part of Grease. I always thought it was some part of the GLASS codebase.

This code has been running in production with us for almost a year now, and all troubles with the service vm have stopped.
So yes, let's include it. 

Actually, Grease 1.1 should be ported properly, and I will do that once I get to pick up the Seaside 3.1 work for GemStone.


On 29 Jan 2014, at 19:37, Paul DeBruicker <pdebruic at gmail.com> wrote:

> It appears that Johan's changes haven't been adopted in Grease for GemStone.  
> Are these changes just for those transactions in the ServiceVM?   From
> reading the thread it seems like you both agreed it was a good change. 
> I'm just not sure where to put it in my stone.  
> Thanks
> Paul
> Johan Brichau-2 wrote
>> Hi Dale,
>> Good to know ;-)
>> I noticed a typo in the code I posted. This is the correct one:
>> 	self transactionMutex critical: [
>> 		"Get the transactionMutex, and perform the transaction."
>> 		System inTransaction
>> 			ifTrue: [ "We already are in a transaction, so just evaluate the block"
>> 					aBlock value.
>> 					^ true]
>> 			ifFalse:[ 
>> 					[
>> 						self doBeginTransaction.
>> 						aBlock value
>> 					] ensure: [ ^self doCommitTransaction]]]
>> On 11 Feb 2013, at 18:39, Dale Henrichs wrote:
>>> Johan,
>>> Your changes are correct ... the only time that it makes sense to perform
>>> aBlock when already in transaction without acquiring the mutex is when
>>> you are in a _process_ that has already acquired the mutex and that is
>>> handled in the #critical: implementation of TransientRecursionLock ...
>>> Good catch!
>>> Dale
>>> | 
>>> | Hi folks, Dale,
>>> | 
>>> | I have been chasing down regular occurrences of 'blocked' service VMs
>>> | (i.e. they were no longer processing tasks and had to be restarted). I
>>> | ultimately tracked it down to (for me) unexpected semantics of the
>>> | GRGemStonePlatform>>doTransaction: implementation in the context of
>>> | multiple processes in the same vm (as is the case in the service vm).
>>> | 
>>> | In a WAGemStoneServiceTaskVM, up to 100 concurrent WAGemStoneServiceTask
>>> | instances can be executed at the same time. Each of these processes
>>> | (running in the same service vm) starts transactions using the
>>> | #doTransaction: method, which is implemented as follows:
>>> | 
>>> | 	System inTransaction
>>> | 		ifTrue: [ "We are already in a transaction, so just evaluate the block"
>>> | 			aBlock value.
>>> | 			^true].
>>> | 	self transactionMutex critical: [
>>> | 		"Get the transactionMutex, and perform the transaction."
>>> | 		[
>>> | 			self doBeginTransaction.
>>> | 			aBlock value.
>>> | 		] ensure: [
>>> | 			^self doCommitTransaction]].
>>> | 
>>> | 
>>> | If the VM is in a Tx, the task block will be executed right away and
>>> | always return true. If not, only then do we wait for the tx-mutex and
>>> | execute a transaction.
>>> | So it seems that nested calls of #doTransaction: should always work, but
>>> | that is not true when multiple processes are running.
>>> | 
>>> | The way we are using the service tasks is that inside their task block,
>>> | they also use the #doTransaction: method _and_ they are making external
>>> | (http socket) calls while inside that transaction block.
>>> | The result is that the scheduler will interleave processes while they
>>> | are executing the #doTransaction: blocks. So another process could abort
>>> | or commit a partial result of another process, etc...
>>> | Some of the executed blocks will get committed, some not, purely at
>>> | random, depending on how the scheduler interleaved the blocks.
>>> | 
>>> | I therefore changed the implementation of #doTransaction: as follows:
>>> | you always need to get the tx-mutex, whether you are already in a tx or
>>> | not, because otherwise you might be screwing with another process's
>>> | tx-block.
>>> | This means tx-blocks are mutually exclusive for all processes, which I
>>> | think they were meant to be?
>>> | 
>>> | 
>>> | 	self transactionMutex critical: [
>>> | 		"Get the transactionMutex, and perform the transaction."
>>> | 		System inTransaction
>>> | 			ifTrue: [ "We already are in a transaction, so just evaluate the block"
>>> | 					aBlock value.
>>> | 					^ true]
>>> | 			ifFalse:[
>>> | 				self doBeginTransaction.
>>> | 				[aBlock value] ensure: [ ^self doCommitTransaction]]]
>>> | 
>>> | 
>>> | What are your thoughts?
>>> | thx
>>> | Johan
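
The difference between the two orderings discussed in this thread (test `System inTransaction` first and then take the mutex, versus take the mutex first and then test) can be reproduced outside GemStone with a small model. The following is only a sketch in Python, not the Grease code: `Session`, `do_transaction_check_first`, `do_transaction_lock_first` and `run` are hypothetical names, `threading.RLock` stands in for the re-entrant behaviour attributed to TransientRecursionLock above, and the 0.5 second wait stands in for the external HTTP call made inside the transaction block.

```python
import threading


class Session:
    """Hypothetical stand-in for one VM session: all threads share ONE transaction."""

    def __init__(self):
        self.in_transaction = False
        self.pending = []   # writes made since the last begin
        self.commits = []   # one entry per commit: the writes it contained
        self.mutex = threading.RLock()  # re-entrant, so nested calls in one thread pass

    def begin(self):
        self.in_transaction = True
        self.pending = []

    def commit(self):
        self.commits.append(self.pending)
        self.pending = []
        self.in_transaction = False
        return True

    def do_transaction_check_first(self, block):
        """Original ordering: test the flag first, take the mutex only if it is clear."""
        if self.in_transaction:
            block()
            return True
        with self.mutex:
            try:
                self.begin()
                block()
            finally:
                committed = self.commit()
            return committed

    def do_transaction_lock_first(self, block):
        """Changed ordering: take the mutex first, then test the flag."""
        with self.mutex:
            if self.in_transaction:
                block()
                return True
            try:
                self.begin()
                block()
            finally:
                committed = self.commit()
            return committed


def run(do_transaction):
    """Run two tasks on one session; task B starts while task A is mid-transaction."""
    session = Session()
    a_inside = threading.Event()
    b_done = threading.Event()

    def task_a():
        def block():
            session.pending.append("a1")
            a_inside.set()
            b_done.wait(timeout=0.5)  # stands in for the slow external HTTP call
            session.pending.append("a2")
        do_transaction(session, block)

    def task_b():
        a_inside.wait()
        do_transaction(session, lambda: session.pending.append("b"))
        b_done.set()

    threads = [threading.Thread(target=task_a), threading.Thread(target=task_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return session.commits


if __name__ == "__main__":
    print(run(Session.do_transaction_check_first))
    print(run(Session.do_transaction_lock_first))
```

With the check-first ordering, task B sees the transaction flag set by task A, skips the mutex, and its write ends up inside A's commit (`[['a1', 'b', 'a2']]`). With the lock-first ordering, B waits for the mutex and each task commits on its own (`[['a1', 'a2'], ['b']]`). A nested call from the same thread still passes straight through in the lock-first version, because the lock is re-entrant.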
