NLA eClips Service Incident - Report

Problem:

Loading of The Daily Mail into the NLA database failed on Tuesday 4th November 2008. As a result, the feeds distributed to NLA clients did not contain 1st edition Daily Mail content by the target time of 01:00. Loading and distribution of other titles were unaffected.

 

Cause:

A momentary connectivity failure between the server running the loading module and the storage device to which loading takes place caused a single thread of the loading module to loop erroneously.

 

Solution:

NLA engineers restarted the loading module, after which all the 1st edition Daily Mail content was loaded and distributed by 01:15.

Analysis of the loading module's source code has identified areas where modifications can be made to prevent a similar incident in future. These modifications will be scheduled soon.

Posted on Tuesday, November 4, 2008 at 17:55 by Incident Management Team

NLA eClips Service Incident - Report

Problem:

Certain eClips customers had intermittent access to NLA web and FTP services from 7:35am to 9:00am and from 9:19am to 9:33am on Saturday 18th October 2008.

 

Cause:

The owners of NLA's London hosting facility were carrying out the first phase of a planned annual power-down exercise on Saturday 18th October. This involved disabling one of the two power feeds that supply the NLA infrastructure. The infrastructure can usually tolerate the removal of one power feed because it has a dual-fed, clustered architecture. In this instance, the automated failover of one clustered network component did not complete successfully.

 

Solution:

The failover process for the affected network component required manual intervention by engineers, who ensured that it completed successfully. The engineers also made some configuration changes to the cluster which should reduce the risk of a similar event occurring in future.

Posted on Saturday, October 18, 2008 at 22:01 by Incident Management Team

NLA eClips Service Incident Report

Problem:

At approximately 9:30 this morning an incident occurred that affected eClips service delivery. The incident was resolved at 10:00. During the incident the service was intermittently unavailable, which affected clients' ability to view eClips content.

Cause:

The root cause of the incident is still under investigation by NLA engineers. Early indications are that the eClips database license checking process became less responsive while serving a higher than normal number of requests. This process is being investigated as a potential area for optimization.

Solution:

NLA engineers are now reviewing the eClips core code related to this aspect of the service, with the aim of discovering the root cause and optimizing the code to prevent a recurrence.

The NLA engineering team is also preparing to deploy a new database architecture which will be more resilient and scalable. This should also help to prevent similar incidents.

NLA Service Operations Management

Posted on Monday, October 6, 2008 at 13:19 by Incident Management Team

CANCELLED: NLA maintenance - Saturday 4th October 2008 at 13:00

NLA engineers will be carrying out configuration changes to the NLA eClips environment on Saturday 4th October at 13:00. Web and FTP services will be intermittently unavailable for up to three hours while the work takes place.

Notifications will be sent before the work begins and as soon as it is complete.

We apologise for any inconvenience this may cause.

Posted on Thursday, October 2, 2008 at 11:36 by Incident Management Team

NLA eClips Email Issues

Yesterday the NLA implemented changes to its email system to improve resilience and fault tolerance. Unfortunately, during the implementation, new rules and restrictions within the system blocked some emails from being passed to third-party recipients. In particular, this issue affected client requests sent to the reprocessing@nla.co.uk alias. Requests sent to this alias yesterday were not routed to Ninestars, so the requested reprocessing was not performed.

The NLA apologises for any impact this may have had on PCA eClips production workflow. The issue has now been resolved and all emails are being routed appropriately.

Posted on Tuesday, September 23, 2008 at 13:02 by Incident Management Team