Tuesday, October 25, 2011

(Mis)Adventures in Networking

EDIT:  Whoops! I never published this blog post.

Wow, it's been a while since my last post.  Anyway, if you were looking for an example of Murphy's law, I've got one for you.

We had 2 simple goals:  upgrade our internet service and bring up our new WAN connections.  This adventure starts on Tuesday, April 19, 2011.

Background:

In the city where I work, we are really locked into our choice of providers.  Because of the city's infrastructure, we have 2 options: the incumbent telecom provider and the local cable company, which has no interest in providing service to us.

This install is an upgrade to our existing service and also a relocation to another building. It was originally supposed to take place last November. As we were ramping up for the install, we discovered that my former boss hadn't placed the order. (This after I received a couple of phone calls asking why we were requesting service at 2 sites during the summer.) After a couple of angry phone calls to the telco rep, the order was placed, and we were given an install date in early February. A week before the date arrived, I sent my weekly "pester the project manager" email and was told that they were having "unforeseen issues and would be unable to meet the date." (According to the rep, they ran out of capacity.) The new date they gave me was in June. I called the rep, explained that this was unacceptable, and told him that I was going to recommend that we take action to terminate our existing contracts. He was then able to get us a March date. As we got close to the March date, we were told that it was only for our end of the connection and that it would take a few more weeks to get the other end installed.

Finally, in late March, we were told that everything was in place, and we looked at the calendar to find a suitable date that wouldn't cause too much disruption. We decided on spring break week, since the majority of our employees were off anyway.

Tuesday Afternoon:
I met with an engineer from my ISP and my integrator to go over the connections and site prep prior to the installation. The ISP engineer checked the configuration on the router as it was shipped and made some changes/corrections. He made a call to try to bring up the connection, even though it wouldn't be routing any traffic at that point - just to make sure everything was good on the telco end. After spending some time on hold and talking to several people, he was told that our appointment wasn't until 9:00 the next morning and there was nothing that could be done until then. I found this rather entertaining, since he works for the same company. While he was doing this, the integrator and I busied ourselves with racking switches, connecting cables, checking connections, etc. A couple of hours later we went over our game plan for installation and left for the day.

Wednesday Morning:
We met at the appointed time and set out to make magic!! Our enthusiasm was soon dashed when we couldn't get a link to the demark. We did the usual: switching the polarity of the cables, trying new/different cables, trying different SFP modules. Nothing worked. At some point during all this, we noticed that we didn't have a link light on the WAN router either - but the connection was up and passing traffic. The ISP engineer continued to work with the telco side to try to troubleshoot the issue.

During a trip to the demark to bang my head against the wall, I looked at the fiber box that went to the MDF - it read "SM Fiber to MDF". I thought: SM - as in Single Mode Fiber? This should be MultiMode Fiber. In disbelief, I opened the box and looked inside. Sure enough, it was Single Mode. WTF! (Bang head on wall.) Funny thing is, the vendor that installed the fiber was hired by the telco, on the premise that they wouldn't install the line unless dedicated fiber was in place. Now, the demark happens to also be in one of the IDFs in the building. Couldn't we just use that? We were told no. After we put our heads together for a few minutes, I said "F it, this is the only opening we have for over a month that we can do this during the day." I then connected to the fiber that was in the IDF, ran upstairs, moved the cross-connects and bingo, the WAN came up with a link light.

While this was going on, the telco engineer had moved the router to the demark to try to get a connection there. After a while, he was able to get the line up and brought the router back upstairs. We re-racked the router and fired it up. He was unable to pass any traffic across the connection. He got on the phone again, called me over after about 15 minutes and put his cell on speaker. The person on the other end of the line said, "Oh, you want to switch your internet service to that line?" I asked him why the heck he thought we were going through all of this. His answer (I wish I was making this up): "Oh, I thought you were just testing the line." As Bill Engvall says, "I didn't start out the day wanting to be a jackass, but you just pushed my jackass button." Needless to say, we had internet in a couple of minutes. Elapsed time: 3.5 hours.

I wish this was the end of the story, but it's not.

While we did have internet, traffic was sporadic. Our initial troubleshooting led us to believe it was a DNS issue. And, indeed, we were having trouble with DNS resolution. But when we tried to use a public DNS server, it was only marginally better. It worked fine from the DMZ and from outside. We made a few changes to the routers and firewall and were able to get access at first. After about 10 minutes, it stopped working again. We undid the router and firewall changes and it worked again for about 10 minutes.
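For anyone chasing a similar "is it really DNS?" question, here is a rough sketch of the kind of side-by-side check that helps. This is not what we ran that day; the function name `check_resolvers` and the resolver addresses in the example are made up for illustration, and it assumes `dig` is installed.

```shell
#!/bin/sh
# Hypothetical helper (not from the original troubleshooting session):
# ask several resolvers for the same name and report which ones answer.

check_resolvers() {
    name=$1
    shift
    for resolver in "$@"; do
        # +short prints only the answer; +time/+tries keep failures quick.
        # dig writes timeout notices to stdout as ";;" lines, so drop them.
        answer=$(dig +short +time=2 +tries=1 "@$resolver" "$name" A 2>/dev/null \
            | grep -v '^;' | head -n 1)
        if [ -n "$answer" ]; then
            echo "$resolver OK $answer"
        else
            echo "$resolver FAIL"
        fi
    done
}

# Example (addresses are assumptions: one internal resolver, one public):
# check_resolvers www.example.com 10.0.0.53 8.8.8.8
```

Running it from an inside workstation, the DMZ, and an outside host at the same moment shows whether the failures follow the resolver or the network path. If the public resolver fails from inside but works from the DMZ, DNS itself probably isn't the culprit.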

Hours of troubleshooting later, we were able to find the issue, but not a clear solution. When we started routing traffic through the new WAN, it introduced routing loops at almost every point in the network.  Considering we have 22 locations, the problem was quite massive.
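A routing loop usually shows up in a traceroute as the same router address appearing on more than one hop. As a minimal sketch (again, not the tooling we used; `find_repeated_hops` is a name I made up, and it assumes numeric output from `traceroute -n`), something like this flags the repeats:

```shell
#!/bin/sh
# Hypothetical helper: read `traceroute -n` output on stdin and print each
# hop address that shows up on more than one hop, a common sign of a loop.

find_repeated_hops() {
    awk '
        # Hop lines start with the hop number; with -n, field 2 is the
        # responding address, or "*" when the probe timed out.
        $1 ~ /^[0-9]+$/ && $2 != "*" {
            # Print an address once, the second time it is seen.
            if (seen[$2]++ == 1) print $2
        }
    '
}

# Example (destination address is an assumption):
# traceroute -n 10.20.0.1 | find_repeated_hops
```

With 22 locations, looping this over one address per site gives a quick list of which paths are bouncing between the same pair of routers.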

Making corrections after deployment (or whoops!)

Like clockwork, almost as soon as we deployed the new workstations, we began receiving complaints about things that weren't right.  Most of the initial complaints were "I need the administrator password so that I can install software and make changes."  This was met with a resounding "NO!"  It's been an uphill battle and we admittedly have to do some good work to repair our department's reputation. It was most entertaining to get a phone call with a complaint and to be able to tell them "Look at the computer, I'm logged into it right now. And, that program that you said doesn't work does."

The last week or so has been pretty quiet.  After they began to see printers and software magically appear on their computers, they stopped complaining.

There was one item that slipped by us: Some of the laptops would be going home with teachers and administrators and they would need to be able to connect to their home wireless networks.  We had this locked down to prevent students from changing to an open network that they can see from the school.

After hours of digging and false starts, I found the preference file that needed to be changed:

/Library/Preferences/SystemConfiguration/preferences.plist

My first reaction was: it's a plist, it should be easy to script the change.  Wrong! It doesn't act like any of the other plist files.  Using a "defaults write" command would put the setting into the wrong place in the file.  I tried to open it in several plist editors, and they only found some of the settings, none of which I needed to change.

After I got home I spent a couple of hours trying to use sed and awk to change the file, but I couldn't get the regex correct.  I was about to give up, when I decided to give Google another try and found my answer:

/usr/libexec/airportd

I ran it with the -h flag to see the options available and I found what I needed:


RequireAdmin (Boolean)
RequireAdminIBSS (Boolean)
RequireAdminNetworkChange (Boolean)
RequireAdminPowerToggle (Boolean)

I could use "/usr/libexec/airportd en1 prefs RequireAdmin=NO" to change all of the settings at once, or I could use the others to get more granular if I chose. I wrapped this in a quick shell script and tested - success!! Here's the code that I created.


#!/bin/sh
#
#  Script to allow Users to change the wireless network without requiring a password
#
#

# Turn off the admin requirement for all of the wireless settings at once
/usr/libexec/airportd en1 prefs RequireAdmin=NO

# For more granular control, set individual options instead
# (append YES or NO after the "="):
#/usr/libexec/airportd en1 prefs RequireAdminIBSS=
#/usr/libexec/airportd en1 prefs RequireAdminNetworkChange=
#/usr/libexec/airportd en1 prefs RequireAdminPowerToggle=

exit 0
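One weakness of the script above is the hard-coded en1, which is only right when the wireless card happens to be the second interface. Here is a hedged sketch of looking the device up instead. The `wifi_device` function is hypothetical, and it assumes the output layout of `networksetup -listallhardwareports`, where a "Hardware Port:" line is followed by a "Device:" line. As far as I know the port is labeled "AirPort" on older OS X releases and "Wi-Fi" from Lion on, so it matches both.

```shell
#!/bin/sh
# Hypothetical helper: read `networksetup -listallhardwareports` output on
# stdin and print the device name (e.g. en1) of the wireless port.

wifi_device() {
    awk '
        # The wireless port is assumed to be labeled "AirPort" or "Wi-Fi".
        /^Hardware Port: (Wi-Fi|AirPort)$/ { found = 1; next }
        # The matching device name is on the "Device:" line that follows.
        found && /^Device: / { print $2; exit }
    '
}

# Example use on a Mac (left commented out so nothing changes by accident):
# dev=$(networksetup -listallhardwareports | wifi_device)
# [ -n "$dev" ] && /usr/libexec/airportd "$dev" prefs RequireAdmin=NO
```

If no wireless port is found, the function prints nothing, so the guarded airportd call in the example is simply skipped.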


Tomorrow I get to see if it works with Lion.



Adventures in Managed Computing

Over the last month we've been working on deploying over 700 new workstations and laptops.  This time around, I was able to get support from the upper administration to make them fully managed.  I was also able to convince them that we needed to pay for installation. After a few months of negotiating with Apple and my local VAR, we were able to settle on an SOW for the project.

The VAR would:

  1. Install the management and imaging software on a brand new server.
  2. Install and configure the pile of servers that I had in my office.
  3. Image, deliver and deploy the new computers.
  4. Ensure that the computers were integrated into the management system and our online tracking system.

We chose to use Casper from JAMF Software to manage the computers.  Casper has a pretty good reputation in the community and we decided to give it a whirl. After this is settled, we'll begin an SCCM deployment for the PC side of the house.

I'll post about my experiences with Casper in a later post.

That is all for now.  Later.