Updates from May, 2013

  • ezidstatus 10:12 am on May 23, 2013 Permalink  

    Information about April DataCite Outage 

    The following has been provided by the DataCite Tech Team:

    We are currently in the process of renewing our servers. This includes moving MDS to a new machine, which also serves http://www.datacite.org and schema.datacite.org.

    After a long period of testing, we switched to the new machine on May 11th. The switch was absolutely smooth. There was no service disruption at all!

    But four days later some connection timeouts for MDS occurred, and shortly afterwards MDS became completely unreachable. Unfortunately, due to a configuration error, our monitoring system did not notice this. (This is of course fixed now!) This was the main reason for the long duration of the outage.

    After noticing the problem the day after, we immediately switched back to the old machine. Everything was back to normal and we had time to investigate.

    So what caused the outage? The connection between two key server components (Tomcat and Apache with proxy_ajp) broke down. The reason for this is unclear. Unfortunately we were also unable to reproduce the error, no matter how hard we hit MDS. Without a reproducible error, it is very hard to find a fix that would make us confident enough to try this setup again.

    So after some discussion we decided to sidestep any potential root causes of the problem. We migrated to a more modern and scalable web server (nginx). This took us a while, but the setup is now in place and we switched to it on Sunday. We are very confident that we now have a modern and reliable system.

    However this switch was not as smooth as the one before. Two problems occurred:

    1. We had to install a new SSL certificate because the old one was expiring. However, we failed to include the intermediate certificate. This might have broken your API clients. Due to browser caching, this might have affected only a minority of UI users. This was fixed immediately after we learned of it on Tuesday.

    2. We also enabled HTTPS on schema.datacite.org and http://www.datacite.org. This caused a problem in MDS when reading the schema needed for validation: MDS was rejecting all metadata uploads. This is also fixed now.
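
    To illustrate the second problem: the error EZID users saw ("Saw '301 Moved Permanently'") suggests that a redirect page was being read as if it were the schema. The sketch below is not the MDS fix. It is a minimal Python check, under that assumption, that a fetched document really is an XML Schema before it is handed to a validator. The function name is_xml_schema and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace that every XML Schema (XSD) root element must belong to.
XSD_NS = "http://www.w3.org/2001/XMLSchema"


def is_xml_schema(body: bytes) -> bool:
    """Return True only if body is well-formed XML whose root is xs:schema."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not well-formed XML at all (for example, a typical HTML error page).
        return False
    return root.tag == f"{{{XSD_NS}}}schema"


# Hypothetical sample inputs: a redirect page and a minimal, empty schema.
redirect_page = b"<html><body><h1>301 Moved Permanently</h1></body></html>"
minimal_xsd = b'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>'

print(is_xml_schema(redirect_page))  # False
print(is_xml_schema(minimal_xsd))    # True
```

    A check of this kind lets a client fail early with a clear message, rather than passing a redirect page to the schema parser.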

    Both problems were hard to detect at the time of the switch or beforehand because, due to caching, neither occurred immediately.

    We are very sorry for all the inconvenience. We have learned from these issues and, for example, improved our monitoring system. We are very confident that MDS is now stable again, and that all future server migrations will be smooth.
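
    A missing intermediate certificate is usually easier to see from a client that keeps no certificate cache. The following is a rough Python sketch of such a probe, not a description of DataCite's monitoring. It assumes the standard library ssl module with its default trust store, which typically does not fetch missing intermediates on its own. The function name check_tls_chain is hypothetical.

```python
import socket
import ssl
from typing import Tuple


def check_tls_chain(host: str, port: int = 443, timeout: float = 5.0) -> Tuple[bool, str]:
    """Attempt a fully verified TLS handshake and report whether it succeeded."""
    # The default context verifies the certificate chain and the hostname.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True, "certificate chain verified"
    except ssl.SSLCertVerificationError as exc:
        # An incomplete chain typically ends up here.
        return False, f"verification failed: {exc.verify_message}"
    except OSError as exc:
        # DNS failures, refused connections, timeouts, other TLS errors.
        return False, f"connection failed: {exc}"
```

    Calling check_tls_chain("schema.datacite.org") from a machine that has never visited the site exercises the same path an API client would take, where cached intermediates in a browser cannot mask the gap.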

    DataCite Tech Team

     
  • ezidstatus 9:16 am on May 22, 2013 Permalink  

    Information about May 21st outage 

    The outage affecting the EZID service yesterday was part of a CDL-wide outage. In other words, all CDL services hosted in our organization’s data center were unavailable during this time (from 3:00 pm to 4:10 pm Pacific). The root cause is not known at this time. The data center staff are currently working to determine the cause.

    We apologize for this inconvenience, and we will provide you with further information when it is available.

     
  • ezidstatus 4:18 pm on May 21, 2013 Permalink  

    Network outage over 

    The network outage appears to be over.  All systems are accessible again.

     
  • ezidstatus 3:12 pm on May 21, 2013 Permalink  

    Network outage 

    CDL appears to be suffering some type of network outage.  All services are inaccessible, including EZID and ARK identifier resolution.

     
  • ezidstatus 6:20 am on May 21, 2013 Permalink  

    Metadata problem resolved 

    DataCite has resolved the problem.  DOIs can be created and updated again.

     
  • ezidstatus 5:21 pm on May 20, 2013 Permalink  

    DOI metadata problems 

    EZID has just begun encountering a problem in which DataCite XML metadata records that have always been accepted by DataCite are now being rejected by DataCite with the error,

    “[xml] xml error: s4s-elt-character: Non-whitespace characters are not allowed in schema elements other than ‘xs:appinfo’ and ‘xs:documentation’. Saw ‘301 Moved Permanently’.”

    Although DOI resolution is not affected, you may be unable to create or update DOIs while this problem persists.

    We have contacted DataCite technical support, and hopefully the problem will be resolved shortly.

     
  • ezidstatus 12:52 pm on May 14, 2013 Permalink  

    Brief network outage 

    CDL recently experienced a network outage lasting approximately 15 minutes.  All appears fine now.

     