Home

A customer today reported a problem where version history was not what you would expect between the 1.0 and the 1.1 versions in Alfresco Share. A bit of background history to start with:

As soon as content is dropped into a site, it will get its type changed to rl:customType. This type adds a new properties such as rl:documentNumber and others. The rl:documentNumber is generated for all new content added to the site. All this is done in a behaviour listening at onCreateNode on Transaction commit.

1
2
3
4
5
6
7
8
9
10
@Override
public void afterPropertiesSet() throws Exception {
    super.afterPropertiesSet();
    Assert.notNull(policyComponent);
    if (!isInitialized()) {
      if (LOG.isTraceEnabled())
        LOG.trace("Initialized " + this.getClass().getName());
      policyComponent.bindClassBehaviour(OnCreateNodePolicy.QNAME, ContentModel.TYPE_CONTENT, new JavaBehaviour(this, "onCreateNode", NotificationFrequency.TRANSACTION_COMMIT));
    }
}

The problem that appeared when a version 1.1 was created through editing the uploaded document. When browsing the version history you could see that the 1.0 version did not have a rl:documentNumber set, however, the 1.1 version did. So the conclusion was that the version 1.0 was actually created before transaction commit in the version store. If this is a feature or a bug, I don’t know, but the solution is simple. In your behaviour, bind on the first event instead of transaction commit as this will correctly make sure that everything written to the node in the behaviour also ends up in the versioning history.

1
2
3
4
5
6
7
8
9
10
@Override
public void afterPropertiesSet() throws Exception {
    super.afterPropertiesSet();
    Assert.notNull(policyComponent);
    if (!isInitialized()) {
      if (LOG.isTraceEnabled())
        LOG.trace("Initialized " + this.getClass().getName());
      policyComponent.bindClassBehaviour(OnCreateNodePolicy.QNAME, ContentModel.TYPE_CONTENT, new JavaBehaviour(this, "onCreateNode", NotificationFrequency.FIRST_EVENT));
    }
}
This entry was posted in Alfresco, Behaviours, Version History. Bookmark the permalink.

5 Responses to Specialize type upon content creation within behaviour can result in invalid version history

  1. Axel Faust says:

    Hello Marcus,

    this is neither a feature nor a bug, just a misunderstanding of how auto-versioning works. Auto-versioning IS a TRANSACTION_COMMIT level behaviour and you can never guarantee that your custom TRANSACTION_COMMIT level behaviour will be called before the auto-versioning behaviour.
    In my point-of-view, TRANSACTION_COMMIT should be a highly discouraged notification frequency. We almost never use it in our projects, because there are too many pitfalls that developers simply do not consider (e.g. runAs and permission). For me, only EVERY_EVENT is a valid frequency and FIRST_EVENT is virtually useless – for OnCreateNodePolicy EVERY_EVENT and FIRST_EVENT are completely identical because a node can only be created once anyway.

    Regards, Axel

  2. Marcus Svensson says:

    Hi Axel,

    Great input, I had actually not looked into how auto-versioning is implemented.

    Transaction commit is as you say obviously a pitfall when it comes to auto versioning. To bind on transaction commit is used quite commonly among alfresco developers, and this case shows that you cannot blindly bind on transaction commit without understanding the underlying mechanics.

    Regards,
    Marcus

  3. Peter Monks says:

    Axel, the problem with EVERY_EVENT is that it can be called numerous times within a single transaction (a lot more than you might expect), so unless your behaviour logic is fast+in-repository-only (no RPCs, no external data access, etc.) or asynchronous, it’s easy to seriously impact Alfresco’s performance. Compounding this is that some of Alfresco’s access mechanisms (e.g. CIFS) can result in a lot of small Alfresco transactions for every “user transaction”. This quickly becomes a multiplicative problem.

    Personally speaking, I’ve found TRANSACTION_COMMIT behaviours the best compromise between the “uselessness” of FIRST_EVENT and the almost-certain performance penalties of EVERY_EVENT. You then just have to face up to the realities of auto-versioning (if present).

    It’s also worth pointing out that these characteristics of auto-versioning aren’t specific to custom behaviours – there are other ways that these kinds of corner cases can occur (e.g. CMIS).

  4. Axel Faust says:

    Peter:
    I require our devs to include the necessary “early check and bail out”, e.g. checking the affected store and actual change (if a property value has actually been modified or the correct aspect been applied). If you do that, then there is very limited chance of noticable performance penalty, because the actual state transitions that interest you should not occur multiple times in a transaction and also not without user interaction (which would exclude crazy stuff from CIFS et al).
    Unfortunately, I see a lot of people in the community (via their code) that have implemented their behaviours in a very naiive way, assuming only the use case they are working on will trigger the behaviour. Behaviours implemented that way will definitely be obliterated by performance penalties when switching to EVERY_EVENT.

    Behaviours in my point of view should / must almost never include any logic that is reliant on remote data, either external data sources or indices. That should be reserved for jobs or asynchronous actions that may merely be triggered by the behaviour. I have seen few use cases that really require transaction-scoped external data access/import with real-time data consistency.

    My preference for EVERY_EVENT is rooted in the experience that 4 out of 5 behaviour problems in past projects or found in reviews of customer code were due to TRANSACTION_COMMIT and could have easily been avoided. Also 1 in 10 workarounds/patches we have had to code to deal with Alfresco out-of-the-box features in the past were due to Alfresco default behaviours being TRANSACTION_COMMIT – most of the time, this meant auto-versioning (e.g. patches to SPP, versioning hacks for workflow runAsSystem logic).

    • Peter Monks says:

      Yeah early bail is a good strategy in all custom behaviours regardless of notification frequency, although of course those checks still have non-zero costs that get multiplied by the number of times an EVERY_EVENT custom behaviour is triggered. Accessing the node that the event is being triggered for (a common way that bail out logic is implemented) is usuallyok (due to caching), but pretty much any repository operation can result in database I/O (especially when Alfresco is under load), so I usually err on the side of caution and assume that even that will be expensive (milliseconds).

      I’ve rarely seen the kinds of issues you’re describing with TRANSACTION_COMMIT, and when I have it’s almost always been due to assumptions about the order in which behaviours are invoked (which are always wrong – behaviour invocation order is non-deterministic). Ensuring that custom behaviours function correctly no matter what state the data is in at the time of invocation goes a long way towards avoiding these problems. This won’t fix issues in Alfresco’s own logic (such as the auto-versioning behaviour), but a support case should always be the first response when that kind of thing is observed.

      I’d suggest that the moral of this story is that the custom behaviours extension point has more subtle complexities than might meet the inexperienced eye, and they should therefore be used sparingly…

Publicités

Laisser un commentaire

Choisissez une méthode de connexion pour poster votre commentaire:

Logo WordPress.com

Vous commentez à l'aide de votre compte WordPress.com. Déconnexion / Changer )

Image Twitter

Vous commentez à l'aide de votre compte Twitter. Déconnexion / Changer )

Photo Facebook

Vous commentez à l'aide de votre compte Facebook. Déconnexion / Changer )

Photo Google+

Vous commentez à l'aide de votre compte Google+. Déconnexion / Changer )

Connexion à %s