And We're UP! Server Down Time Update, Status and Information.

Dear ModelCentro partners and friends,

This is an update on the server-related down time that ModelCentro and sites powered by ModelCentro have been experiencing today. Your websites are currently up and functional for your users. Your admin is also working, although some functionality – like uploading content – is limited while we continue working to restore the system to its normal state. We will keep updating this thread with progress notifications.

The reason, as we have mentioned in our Twitter reports, is that the Data Center has been having a major server overload. At the current stage of the investigation, we can also add that the cause turned out to be a severe storage hardware malfunction.

We will investigate further, learn from this, and improve our operations to ensure that we are doing everything possible to avoid down times.

Comments

  • First downtime in 2 years. I was already wondering how you do it, but hey, they got you :P 10k models was too much for the server :D Keep the good work going and, like you say, learn from it!!!

  • Still down for me. Not able to log in. Been down pretty much all day.

  • The site has still been down most of the day. When the admin panel is up, it's not showing my members or transactions. My site itself is down as I type this.

  • Well my one is running

  • Hey folks! Latest update on what's been done so far and what's going on:

    The server overload has been brought down by disabling some of the less vital processes in the system (like content processing). Our attempts to bring it back to full speed caused some intermittent down times (the ones reported here and on Twitter), so we have stopped trying to re-enable those processes for now. This is why you've noticed the robots again. We are now focusing on maintaining website stability for your fans as the top priority, doing our best to avoid more "robots".

    The admin panel is accessible, but has a number of features limited and content processing has been disabled entirely.

    As to the cause of the matter and some technical back end information: the server overload caused an unexpected malfunction in the Isilon cluster, which has still not been resolved or conclusively identified at this time, making it difficult to give you a reliable ETA on when this will be completely over. Currently any addition to the load of the server will cause another down time, so we are holding off on re-enabling any more back end features and looking for ways to temporarily reduce the load. As part of the solution, the DC team is working with Isilon's support on their product.

    Thank you very much for your patience and understanding. You have every right to be thoroughly outraged, so if you'd like to yell at anyone, let me know and I'll give you my phone number. Not that it'd change anything, but if it'll make you feel better, I'm here for you. I myself am looking at plane tickets to the head office of our Data Center.

    P.S. My "educated" opinion is that the darn storage units were cursed... and I'm starting to no longer smile when I think that.

  • Will our scheduled material still appear? I have new movies dropping every week, I get it that I can't upload anything new right now, but what about the stuff we've already uploaded for scheduled release?

  • Do you know when it will be possible again to add new content?

  • Hey there @busty_von_tease and @monasummers!

    The system is now back to normal, and our team is ironing out any hiccups that do appear. Please report any issues here or via live chat as it will help us fix things faster.

    Content that was meant to publish during this time is being restored and published. No data was lost, so there is no need to re-process anything.

    I'll be back in a bit with a more detailed report on what happened and why.

  • Mine's not back up. I have a client wanting a refund after saying none of the videos work, so I checked them in the admin area, and I still can't edit my videos.

  • @TheNatalieK please try clearing cache/cookies and re-loading the video management page.

  • You might still occasionally get corrupted images as the content encoding mechanisms of the platform are restoring normal functions. The content you upload is processing normally, so it's just a matter of some thumbnails acting up. This should gradually stop happening over the next few hours, depending on where you are in the world.

  • @TheNatalieK, as to the user - we are able to play videos normally, so please ask him to reach out to Technical Support via the footer of your website and we will help him get this resolved.

  • edited June 2016

    Official causes report and important information.

    As you probably know, this weekend the ModelCentro platform experienced unprecedented down time and technical issues as a result of a severe server hardware malfunction.

    We are happy to let you know that the platform is now entirely back up to its full capacity, and all features are available and working as expected. There was no data loss on the server, so all content is safe.

    On Saturday, in order to restore service from a complete down time while repairs on the server were in progress, the team at MC had to limit some data-heavy features – like uploading and managing content. This gave us the option of restoring your websites for your fans and getting rid of the “Maintenance” notifications. Some other features were also not performing normally, including fans lists, account approvals etc. Limiting admin features was the only way to bring back the websites and stop the global down time. As soon as the data center reported that the malfunction was resolved (for which they even had to enlist the help of the hardware manufacturer), we were able to carefully start restoring features of your admin and bringing the platform back to its regular performance.

    Here is a report from the DC, for those of you who want to know what exactly happened:

    “On Saturday 06/25 at 04:00 AM CEST (09:00 PM CDT), one of the cloud storage nodes has experienced severe hardware fault with internal NVRAM module. Fault has caused the filesystem journal to become completely corrupt, both in NVRAM and in a spare copy on mechanical drives. We have replaced bad NVRAM module and reformatted the node and it's been participating in cloud operations since.

    However, this kind of malfunction has caused 80% of the remaining cloud storage nodes to have indeterminate file transactions, incomplete and non-committed by default. Initially, huge directory issues were suspected, but that was false alarm and the root cause was in fact the dead storage node that had leftover file transactions. We have been escalating this matter and working on this during Sunday 06/26 and together with vendor team we have been able to manually replay where possible and/or cancel those file transfers where needed. The storage is stable since Monday 06/27 01:00 AM CEST (Sunday 6/26 18:00 CDT).”

    To recap what happened on the server and the explanation above: first, a piece of hardware got “fried”, which was quickly fixed. However, the malfunction corrupted logs and created faulty file transfers throughout the system, which resulted in the overload and the subsequent lengthy search for the cause and repairs.

    We want to sincerely apologize for any trouble this down time has caused you. The magnitude and time span of this issue were an all-time record for us, many times over. We will now focus on ensuring we are doing everything humanly possible to avoid such situations in the future. This dreadful situation, although it caused so much grief and cost us all so many nerve cells, will become a useful learning tool for us to improve and strive for the better.

    A few important notices:

    - If you had content that was scheduled to publish during the down time, it has been published now, so there is no need to re-schedule or publish it again.

    - If your website or admin are still displaying any problems, please post in this community thread, email support@modelcentro.com or report them via live chat in your admin panel.

    The MC team is grateful for the support, patience and understanding which you have displayed during this weekend's hardest moments. We are thrilled to be working with such awesome people as you.

  • edited June 2016

    It is still not working properly for me. I have cleared my cache, cookies and browser data. I'm trying to upload my newest video and the upload itself went fine. However, the thumbnail section is missing. When I hit publish, it publishes to the website but with no cover image, just a link to the video.

    I've tried both my Chrome and Firefox browsers. It makes no difference.

  • Just to add to my previous comment, my website is displaying the "We are sorry, site undergoing maintenance" page.

  • edited June 2016

    I'm having problems uploading content. I just tried to upload a picture and it's just black.

    *update*

    It eventually showed up, but only an hour after I added it.

  • Hey there @EllaApples @busty_von_tease! We were able to find the reason for the intermittent pop ups of maintenance messages and issues with uploads (they were geo-based, as different parts of the CDN were applying and syncing with the latest fixes). The fix that was done should have completely gotten rid of this. If anyone still has issues, please do let us know.

  • Hi there,

    Thanks for getting the sites back up as quickly as you did. Just wanted to let you know I think there is still an issue with the Blogs, as I can't seem to write anything in a new post. When you press the button to add writing, nothing happens.

    Thanks

  • Just wanted to let you know that my scheduled video update did not post to my twitter account this morning

  • Thanks ladies, looking into both issues, support will reach out to you if they can't reproduce on their own.

  • My Wednesday Scheduled movie release published ok, and posted to my Twitter and Tumblr accounts.

  • Woot woot! Yes, we're back to normal, people... gosh it feels good :)

  • Thank you @Natalie they have reached out and are looking into it. You all work so hard and are really great. I appreciate everything you all do so much.
