We discussed metadata extraction in Alfresco in this article. Alfresco supports many different content types, and each content type has its own set of properties specific to that kind of content.

For instance, all images have common properties such as “width”, “height”, “exposure-time”, and “focalLength” (part of the exif aspect in the Alfresco model), in addition to basic properties such as name, title, and description.

Emails have a different set of properties, such as “Addresses”, “subjectline”, and “sentdate” (part of the Emailed aspect in the Alfresco model).

Alfresco has a metadata extractor specific to each of these content types, which reads the content, extracts its properties, and maps them to the Alfresco model. It works smoothly out of the box, and you face no issues in that process.
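To make the idea of "extract and map" concrete, here is a minimal sketch in Python. It is not Alfresco's extractor API: the function name, the raw metadata keys, and the prefixed model property names are all assumptions chosen for illustration, loosely based on the properties mentioned above.

```python
# Toy sketch only; this is NOT Alfresco's extractor API.
# Raw keys (left) and model property names (right) are illustrative assumptions.
EXTRACTOR_MAPPINGS = {
    "image/jpeg": {
        "Image Width": "exif:width",
        "Image Height": "exif:height",
        "Exposure Time": "exif:exposure-time",
        "Focal Length": "exif:focalLength",
    },
    "message/rfc822": {
        "To": "cm:addresses",
        "Subject": "cm:subjectline",
        "Date": "cm:sentdate",
    },
}


def extract_properties(mimetype, raw_metadata):
    """Map raw metadata keys to model properties for the given content type.

    Keys that have no mapping for this content type are ignored.
    """
    mapping = EXTRACTOR_MAPPINGS.get(mimetype, {})
    return {
        model_name: raw_metadata[raw_key]
        for raw_key, model_name in mapping.items()
        if raw_key in raw_metadata
    }
```

For example, calling `extract_properties("message/rfc822", {"Subject": "Hi", "X-Mailer": "demo"})` keeps only the subject, because the other key has no mapping for emails in this sketch.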

Issue with Bulk Import

The real issue comes up when you try a bulk import. We discussed Alfresco's bulk import capabilities in this blog. Whenever we import images or emails in bulk through a zip import, Alfresco creates all the content and sets the basic properties properly. The only issue is with the type-specific properties: Alfresco is not able to extract them for those content types. After a bulk import of emails, you will find that “Addresses” and “subjectline” are blank, even though that information is present in the emails you imported.

Cause of Issue:

The main cause of this issue is that Alfresco does not invoke the specific metadata extractor for these special types, so all the special properties remain unextracted. When we import individual files, Alfresco invokes the specific extractor that is bound to that event, but in the case of a bulk import this does not happen.
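The difference between the two paths can be pictured with a small toy model. This is not Alfresco code; the function names and the simplified email parsing are hypothetical, and the model only mirrors the behaviour described above.

```python
# Toy model of the behaviour described above; NOT Alfresco code.
# All function names here are hypothetical.

def extract_email_metadata(content):
    """Hypothetical extractor: pull the subject line out of raw email text."""
    for line in content.splitlines():
        if line.lower().startswith("subject:"):
            return {"subjectline": line.split(":", 1)[1].strip()}
    return {}


def upload_single(name, content):
    """Single-file path: the extractor bound to the event runs."""
    node = {"name": name, "properties": {}}
    node["properties"].update(extract_email_metadata(content))
    return node


def bulk_import(files):
    """Bulk path: nodes are created, but no type-specific extractor runs."""
    return [{"name": name, "properties": {}} for name in files]
```

With the same email content, `upload_single` fills in `subjectline`, while `bulk_import` leaves the type-specific properties empty.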

Solution to Issue:

To resolve this issue, create a rule on the space under which you plan to do the bulk import. The rule is simple and available out of the box; all you need to do is follow these steps.

  1. Start the Alfresco server.
  2. Log in as admin in Alfresco Share (http://localhost:8080/share).
  3. Navigate to the details of the space under which you want to do the bulk import.
  4. Click the “Manage Rules” link in the action links panel on the right-hand side. (Figure: Manage Rule in Alfresco)
  5. Click “Create” and set the action to “Extract common metadata”. (Figure: Manage Rule dialog for the metadata extraction rule)
  6. Click “Create” and you are done.
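If you prefer to script this instead of clicking through Share, the steps above can be sketched as a REST request. This is a hedged sketch, not a confirmed recipe: it assumes a newer Alfresco release that exposes a rules REST API, and the endpoint path, the field names, and the `extract-metadata` action id are assumptions to verify against the API documentation for your version. The helper names are hypothetical, and the code only builds the request; it does not send it.

```python
import json

# Assumption: endpoint path and field names follow the rules REST API of
# newer Alfresco releases. Verify them against your version before use.
RULES_ENDPOINT = (
    "/alfresco/api/-default-/public/alfresco/versions/1"
    "/nodes/{folder_id}/rule-sets/-default-/rules"
)


def build_extract_metadata_rule(name="Extract metadata on import"):
    """Build a rule body that runs metadata extraction on incoming items."""
    return {
        "name": name,
        "isEnabled": True,
        # "inbound" is assumed to mean: items created in or entering the folder.
        "triggers": ["inbound"],
        # Assumed id of the action that Share shows as "Extract common metadata".
        "actions": [{"actionDefinitionId": "extract-metadata"}],
    }


def rule_request(base_url, folder_id):
    """Return the (url, json_body) pair for creating the rule on a folder."""
    url = base_url.rstrip("/") + RULES_ENDPOINT.format(folder_id=folder_id)
    body = json.dumps(build_extract_metadata_rule())
    return url, body
```

You would then POST the body to the URL with admin credentials using the HTTP client of your choice, where `folder_id` is the node id of the space that receives the bulk import.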

Now if you import a zip file with emails or images, the respective metadata extractor will be invoked in the backend by this new rule, and all the properties that were not extracted earlier will be extracted automatically at the time of import.

