* * * * *
                                        
                        Yeah, I kind of saw this coming
                                        
Yet more reasons I hate control panels.

I had to move a site last night from one server, hades, to another server,
marion. Both servers are running Insipid, which has a backup and restore
feature, so the plan was to back up the site on hades, then restore it on
marion.

It took several attempts on my part to get this process to work. One has to
realize that when Insipid says the operation was “successful,” all that was
“successful” was your request for the operation; the operation itself is a
separate process that will notify you of its actual success (or failure)
via email.

Hate hate hate.

Now, one aspect of this site is that it has its own IP (Internet Protocol)
address. And lo, once I got the site on marion, marion was listening on
the IP address.

Good.

But hades was still listening in on the site as well.

Bad.

Not wanting to actually remove the site from hades until I knew this was
working, I decided to manually remove the IP address (not knowing how one
even approaches this using Insipid).

> ip addr del XXXXXXXXXXXXXX/24 dev eth0
> 

Try to view the site, and the request is going to hades.

Okay, the switch it's on is probably still sending traffic to hades. Clear
the ARP (Address Resolution Protocol) cache on the switch.

Try to view the site, and the request is going to hades.

Check hades and see that it really wants to hold onto that IP address. Use
both ip and ifconfig to nuke the IP from hades.

Try to view the site, and the request is going to hades.

Clear the ARP cache on the switch.

Try to view the site, and the request is going to hades.

Okay, shut down the port that hades is plugged into on the switch, clear the
ARP cache.

Try to view the site, and the request is now going to marion.

Good.

Now, that was late last night (between 3:00 and 4:00 am, during which I
stupidly answered the phone and took a tech support call around 3:30 am
dealing with an email issue—sigh).

This morning, requests for said site are now going to hades.

Hate hate hate hate hate.

Okay, [DELETED-nuke-DELETED] delete the site from hades, make sure it doesn't
have the IP address, shut down the port it's plugged into on the switch, clear
the ARP cache on the switch, and okay, requests are now going to marion.

I even double-check to make absolutely sure that no other sites are on this
IP address. There aren't.

Hate hate hate hate hate.

I know why I hate control panels: I don't feel in control. And when something
breaks, I have no idea how to fix it. Oh, I typically know what's wrong, and
how one could fix it if one weren't running a control panel. How to fix it
within the context of the control panel? That, I don't know. Oh, I suppose
one could dive into the internals of the control panel, but a) that kind of
defeats the purpose of a control panel, which supposedly makes Unix
administration easy, and b) we use three or four different control panels,
which all work differently, which means we need to become experts in all of
them, which again kind of defeats the purpose of a control panel. Either
that, or I'm bitter that all my experience in administering a Linux system
is no longer applicable and that I have to relearn all this crap four more
times just to administer a Linux system.

Hate hate hate hate hate.

> Subject: No subject
> Posted-by: Sean (Staff)
> Date: 06-14-2006 3:25am EDT
> 
> Moved site to marion. It's disabled on hades and hopefully, hades won't try
> to reassert the IP address.
> 
> “Response to trouble ticket last night after moving the site”
> 

I fully expect that in a few hours, I'll have to revisit this situation.

An hour and a half later …

hades took control again. This time, we found that a process, ntpd (Network
Time Protocol, which keeps the clocks on all the servers in sync), had
explicitly bound to each IP address on hades, and that Apache [1] was
apparently still configured for the site in question (then what the XXXXXXX
XXXX good is Insipid if it doesn't restart Apache?).
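
For what it's worth, here is a rough sketch of how one might spot daemons
that have bound explicitly to a given address. It is not what we actually
ran. It assumes a modern Linux box where `ss -tulnp` is available and prints
the usual column layout; the sample text, the 192.0.2.10 address, and the
listeners_on() helper are all made up for illustration.

```python
def listeners_on(ss_output, address):
    """Return (protocol, port, process) for sockets bound explicitly to address.

    ss_output is text in the column layout assumed for `ss -tulnp`:
    Netid State Recv-Q Send-Q Local-Address:Port Peer-Address:Port Process
    """
    found = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and the header row.
        if len(fields) < 5 or fields[0] == "Netid":
            continue
        # Split "host:port" on the last colon only.
        host, _, port = fields[4].rpartition(":")
        if host == address:
            process = fields[6] if len(fields) > 6 else ""
            found.append((fields[0], port, process))
    return found


# Hypothetical sample; real output would come from running `ss -tulnp` as root.
SAMPLE = """\
Netid State  Recv-Q Send-Q Local Address:Port Peer Address:Port Process
udp   UNCONN 0      0      192.0.2.10:123     0.0.0.0:*         users:(("ntpd",pid=812,fd=19))
udp   UNCONN 0      0      0.0.0.0:123        0.0.0.0:*         users:(("ntpd",pid=812,fd=16))
tcp   LISTEN 0      128    192.0.2.10:80      0.0.0.0:*         users:(("httpd",pid=901,fd=4))
"""

for proto, port, process in listeners_on(SAMPLE, "192.0.2.10"):
    print(proto, port, process)
```

Note that this only reports explicit binds. A socket on the wildcard address
(0.0.0.0) will also answer on any address the machine holds, so it is
deliberately left out of the results here.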

Okay, maybe now hades won't take over the address.


Update three hours after the previous update

It's happened twice since the last update. I can't think of what else might
be causing hades to respond to an IP address it's no longer configured to
respond to; short of rebooting it the next time it happens, I'm out of ideas.


Update about half an hour later

Found the script buried in /etc that had the IP address and nuked it. That
seems to have taken care of the problem.

For now.
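
In hindsight, hunting for that script could have been done up front. A
minimal sketch, assuming all one wants is a list of files under some
directory that mention the address as plain text. The files_mentioning()
helper is hypothetical, and it does a simple substring match, so searching
for 192.0.2.1 would also hit 192.0.2.10.

```python
import os


def files_mentioning(root, needle):
    """Yield paths of regular files under root whose contents include needle."""
    target = needle.encode()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip sockets, FIFOs, and dangling symlinks.
            if not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    if target in handle.read():
                        yield path
            except OSError:
                # Unreadable file (permissions, vanished mid-walk): move on.
                continue
```

Something like `list(files_mentioning("/etc", "192.0.2.10"))`, run as root,
would then list every candidate to inspect before nuking anything.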


[1] http://httpd.apache.org/

Email author at sean@conman.org