> hwvps1.schmolie.com storage adapter failure, repairs completed
andy
Posted: Oct 4 2017, 01:56 AM
Group: Advantagecom Staff
Posts: 4,310
Member No.: 9
Joined: 12-July 02



hwvps1.schmolie.com is exhibiting the symptoms of a failed storage adapter.

We're in the process of diagnosing which one it is (there are several).

Once we know which storage adapter is having trouble, we'll begin repairs.

At present this is causing difficulty for the hwVPS on that node as well as webpro1.speedingbits.com and manage.speedingbits.com (the front end for the HSPC control panel). The redundancy seems to have kicked in and taken over, but the node is still struggling to reliably use the storage system.

The ETA for repairs is 2 hours, give or take an hour depending on how difficult the diagnosis and repair turn out to be.

More information will be posted on this thread when it is known.


--------------------
Sincerely,
Andrew Kinney
CTO, Advantagecom Networks

Please do not private message me. My regular management duties preclude responding to every customer that sends me a support issue. Instead, post on the forum or contact tech support.
andy
Posted: Oct 4 2017, 02:06 AM

The failover to the redundant storage adapter succeeded, but it was too late for some hwVPS to continue functioning properly.

We're restarting each of the hwVPS on this node to restore service.


andy
Posted: Oct 4 2017, 02:54 AM

A couple of hwVPS and webpro1.speedingbits.com are still doing disk checks due to ungraceful restarts and/or more than 180 days since last disk check. We're monitoring each to completion to ensure they finish successfully.

Time to complete depends on how many files are on the filesystem being checked. More data = more time for filesystem checks.


andy
Posted: Oct 4 2017, 03:09 AM

All hwVPS are running and have completed their file system checks.

Only webpro1.speedingbits.com is still doing a disk check and it is almost complete.


andy
Posted: Oct 4 2017, 03:14 AM

webpro1.speedingbits.com has now completed its disk check as well.

We're going to resume diagnosing which storage adapter needs to be physically replaced.


andy
Posted: Oct 4 2017, 04:06 AM

We've diagnosed which storage adapter needed to be replaced.

We've reviewed our parts inventory and it looks like we need to order a replacement. While waiting for the replacement, the system should run fine over the redundant controller.


andy
Posted: Oct 4 2017, 04:53 AM

The failed controller is intermittently being retried by the system, so we're going to remove it to force the system to only use the working controller.

At this time, the trouble with the hwVPS (and webpro1.speedingbits.com) on this node has resumed.

It will take us about 20 minutes to get on-site, another 10 minutes to physically remove the storage adapter, another 5 minutes to boot the node, and 10 to 30 minutes to get all the hwVPS running again once the node is up. In total, that's 45 minutes to an hour or so before we have this resolved.
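For anyone who wants to check the arithmetic behind that estimate, here is a minimal sketch that adds up the steps listed above. The step labels and the `steps` dictionary are just illustrative names; the minute figures are the ones quoted in this post, and only the hwVPS restart step has a range.

```python
# Step estimates in minutes as (low, high), taken from the figures in this post.
# The labels are illustrative; only the last step was given as a range.
steps = {
    "travel on-site": (20, 20),
    "remove storage adapter": (10, 10),
    "boot node": (5, 5),
    "restart hwVPS": (10, 30),
}

# Sum the best-case and worst-case durations separately.
low = sum(lo for lo, _ in steps.values())
high = sum(hi for _, hi in steps.values())

print(f"Estimated time to resolution: {low} to {high} minutes")
```

The low end assumes the hwVPS come up without long disk checks; the high end allows for the slower restarts.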


andy
Posted: Oct 4 2017, 06:31 AM

The failed storage adapter has been removed and the node is currently booting.

We'll be starting all the hwVPS on this node as soon as it finishes booting.


andy
Posted: Oct 4 2017, 07:15 AM

All of the hwVPS and webpro1.speedingbits.com are running. Several had to do a disk check, which added some time to the boot process for those hwVPS.

